SQL DISTINCT

The DISTINCT keyword tells a SELECT statement to return only unique rows in the result set. Two rows count as duplicates when they have the same values in every column listed in the SELECT statement, and only one copy of each is kept.

 

Basic Syntax of DISTINCT

SELECT DISTINCT column1, column2, ...
FROM table_name;
  • column1, column2, ...: The columns for which you want to return distinct (unique) values.
  • If multiple columns are listed, uniqueness is judged on the combination of values across all of them, so each combination appears only once in the result.

 

Key Features of DISTINCT

1. Removing Duplicate Values from One Column

Using DISTINCT with a single column will return only unique values from that column.

SELECT DISTINCT city
FROM customers;

This query will return a list of unique cities from the customers table.
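To see the effect, the query can be run against a small in-memory SQLite database from Python. This is only a sketch: the customers table layout and the sample rows are made up for illustration, and an ORDER BY is added because DISTINCT by itself does not guarantee any row order.

```python
import sqlite3

# Hypothetical sample data: the table layout and rows are assumptions for this sketch.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (name TEXT, city TEXT, country TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?)",
    [
        ("Ana", "Paris", "France"),
        ("Ben", "Paris", "France"),
        ("Cho", "Lima", "Peru"),
        ("Dee", "Paris", "France"),
    ],
)

# Without DISTINCT, every row contributes its city, duplicates included.
all_cities = [row[0] for row in conn.execute(
    "SELECT city FROM customers ORDER BY city")]

# With DISTINCT, each city is returned once.
unique_cities = [row[0] for row in conn.execute(
    "SELECT DISTINCT city FROM customers ORDER BY city")]

print(all_cities)     # ['Lima', 'Paris', 'Paris', 'Paris']
print(unique_cities)  # ['Lima', 'Paris']
conn.close()
```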

2. Removing Duplicate Rows Based on Multiple Columns

You can use DISTINCT with multiple columns to eliminate rows where the combination of values across the columns is the same.

SELECT DISTINCT city, country
FROM customers;

This will return unique pairs of city and country values, excluding any duplicate combinations.
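A short sketch of how the combination rule plays out, again using Python's sqlite3 module with invented sample rows. Note that the same city can appear more than once in the output as long as it is paired with a different country.

```python
import sqlite3

# Hypothetical sample data: two different cities named Paris, plus a repeated row.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (city TEXT, country TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [
        ("Paris", "France"),
        ("Paris", "USA"),
        ("Paris", "France"),  # exact duplicate of the first pair
        ("Lima", "Peru"),
    ],
)

# DISTINCT compares the (city, country) pair as a whole.
pairs = conn.execute(
    "SELECT DISTINCT city, country FROM customers ORDER BY city, country"
).fetchall()

print(pairs)  # [('Lima', 'Peru'), ('Paris', 'France'), ('Paris', 'USA')]
conn.close()
```

Only the repeated ('Paris', 'France') row is dropped; ('Paris', 'USA') stays because the pair differs.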

3. DISTINCT in Aggregate Functions

DISTINCT can also be used inside aggregate functions such as COUNT(), SUM(), and AVG(), so that the function is applied only to the unique values of the column.

SELECT COUNT(DISTINCT city)
FROM customers;

This query will count the number of unique cities in the customers table. Rows where city is NULL are not counted.
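The difference between the counting variants is easiest to see side by side. The sketch below uses Python's sqlite3 module with made-up rows, including one customer whose city is NULL, and compares COUNT(*), COUNT(city), and COUNT(DISTINCT city).

```python
import sqlite3

# Hypothetical sample data: four customers, one with no city recorded.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (name TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [("Ana", "Paris"), ("Ben", "Paris"), ("Cho", "Lima"), ("Dee", None)],
)

total_rows, non_null_cities, unique_cities = conn.execute(
    "SELECT COUNT(*), COUNT(city), COUNT(DISTINCT city) FROM customers"
).fetchone()

print(total_rows)       # 4: every row
print(non_null_cities)  # 3: rows where city is not NULL
print(unique_cities)    # 2: Paris and Lima
conn.close()
```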

 

Example Queries

1. Get Unique Job Titles from the Employees Table

SELECT DISTINCT job_title
FROM employees;

This will return all unique job titles from the employees table, removing any duplicates.

2. Get Unique City and Country Combinations

SELECT DISTINCT city, country
FROM customers;

This query will return each unique combination of city and country from the customers table.

3. Count the Number of Unique Products Sold

SELECT COUNT(DISTINCT product_name)
FROM sales;

This query will count the number of different product names that appear in the sales table, no matter how many times each product was sold.
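As a rough illustration, the count can be checked against a tiny sales table in SQLite. The region column and the sample rows are assumptions added for this sketch, and the second query goes one step beyond the example above by combining COUNT(DISTINCT ...) with GROUP BY to count unique products per region.

```python
import sqlite3

# Hypothetical sample data: the region column is an assumption for this sketch.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (product_name TEXT, region TEXT)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [
        ("Widget", "North"),
        ("Widget", "North"),
        ("Gadget", "North"),
        ("Widget", "South"),
        ("Widget", "South"),
    ],
)

# Five sales rows, but only two different products overall.
unique_products = conn.execute(
    "SELECT COUNT(DISTINCT product_name) FROM sales"
).fetchone()[0]

# The same count, computed separately for each region.
per_region = conn.execute(
    "SELECT region, COUNT(DISTINCT product_name) "
    "FROM sales GROUP BY region ORDER BY region"
).fetchall()

print(unique_products)  # 2
print(per_region)       # [('North', 2), ('South', 1)]
conn.close()
```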

4. Get Unique Product Categories from the Products Table

SELECT DISTINCT category
FROM products;

This query will return a list of unique product categories from the products table.

 

Important Notes

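  • DISTINCT applies to the entire selected row, not to a single column. When several columns are listed, a row is removed only if every listed column matches another row.
  • DISTINCT can slow down queries on large tables, because the database has to do extra work (typically sorting or hashing the rows) to find and discard duplicates. Use it only when duplicates can actually occur in the result.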
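The whole-row behaviour described in the first note can be sketched as follows, using Python's sqlite3 module and invented rows. Adding a column to the select list changes what counts as a duplicate. The last query shows that, in this simple case, GROUP BY on the same column returns the same rows as DISTINCT; it is included only as a comparison.

```python
import sqlite3

# Hypothetical sample data for this sketch.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (name TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [("Ana", "Paris"), ("Ben", "Paris"), ("Cho", "Lima")],
)

# One column: the two Paris rows collapse into one.
by_city = conn.execute(
    "SELECT DISTINCT city FROM customers ORDER BY city"
).fetchall()

# Two columns: every (city, name) pair is already unique, so nothing is removed.
by_city_and_name = conn.execute(
    "SELECT DISTINCT city, name FROM customers ORDER BY city, name"
).fetchall()

# GROUP BY on the same column yields the same rows as DISTINCT here.
grouped = conn.execute(
    "SELECT city FROM customers GROUP BY city ORDER BY city"
).fetchall()

print(by_city)           # [('Lima',), ('Paris',)]
print(by_city_and_name)  # [('Lima', 'Cho'), ('Paris', 'Ana'), ('Paris', 'Ben')]
print(grouped == by_city)  # True
conn.close()
```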

 

Summary

The DISTINCT keyword in SQL removes duplicate rows from the result set, making it easy to retrieve only unique rows or values from a table. It can be applied to one or multiple columns, or used inside aggregate functions, and is particularly useful for eliminating redundancy in query results.