SQL DISTINCT
In a real-world database, tables often contain duplicate values across various rows. For example, a customers table might have hundreds of entries from the same city. The DISTINCT keyword is used in SQL to filter these results and return only unique values, effectively "de-duplicating" your output.
Think of DISTINCT as a uniqueness filter for your result set. It doesn't change the data in your table; it only changes which rows appear in your query results.
Basic Syntax of DISTINCT
SELECT DISTINCT column1, column2, ...
FROM table_name;
column1, column2, ...: These are the columns where you want to identify unique values.
- If you specify only one column, SQL returns only the unique entries for that specific field.
- If multiple columns are used, SQL evaluates the combination of those columns. A row is only considered a duplicate if every single column value in the SELECT list matches another row.
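A common mistake is trying to apply DISTINCT to only one column while selecting several: SELECT DISTINCT(city), name FROM customers. In SQL, DISTINCT applies to the entire row of the result set, not just the first column. The parentheses around city are ordinary expression grouping and change nothing.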
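The whole-row behaviour can be checked with a small sketch. The table customers_demo and its rows are hypothetical, made up for illustration only; the GROUP BY query at the end is one possible way to get a single row per city, not the only one.

```sql
-- Hypothetical sample table; the name and rows are assumptions for illustration.
CREATE TABLE customers_demo (
    id   INTEGER PRIMARY KEY,
    name VARCHAR(50),
    city VARCHAR(50)
);

INSERT INTO customers_demo (id, name, city) VALUES
    (1, 'Alice', 'Paris'),
    (2, 'Bob',   'Paris'),
    (3, 'Alice', 'Paris'),
    (4, 'Carol', 'Berlin');

-- The parentheses only group the expression; DISTINCT still compares
-- the whole (city, name) row.
SELECT DISTINCT (city), name
FROM customers_demo;
-- Three rows: (Paris, Alice), (Paris, Bob), (Berlin, Carol).
-- Paris appears twice because the names differ.

-- One way to get a single row per city: group by city and pick one name.
SELECT city, MIN(name) AS sample_name
FROM customers_demo
GROUP BY city;
-- Two rows: (Berlin, Carol) and (Paris, Alice); row order is not guaranteed.
```

Which name to keep per city is a design decision; MIN(name) is used here only because it is deterministic.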
Key Features of DISTINCT Clause
1. Removing Duplicate Values from One Column
When you need a quick list of categories, locations, or status types, using DISTINCT on a single column is usually the simplest approach.
SELECT DISTINCT city
FROM customers;
This query will scan the customers table and return a clean list of unique cities, no matter how many customers live in each one.
2. Removing Duplicate Rows Based on Multiple Columns
You can use DISTINCT with multiple columns to find unique pairings or groups. This is helpful when you want to see all unique locations (City + Country) without seeing the same city listed multiple times for the same country.
SELECT DISTINCT city, country
FROM customers;
In this case, "Springfield, USA" and "Springfield, UK" would both appear because their combinations are unique, even though the city name is the same.
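The Springfield case can be reproduced with a few rows. This is a minimal sketch: the table customer_locations_demo and its data are assumptions chosen to match the example, not part of any real schema.

```sql
-- Hypothetical sample table; the name and rows are assumptions for illustration.
CREATE TABLE customer_locations_demo (
    id      INTEGER PRIMARY KEY,
    city    VARCHAR(50),
    country VARCHAR(50)
);

INSERT INTO customer_locations_demo (id, city, country) VALUES
    (1, 'Springfield', 'USA'),
    (2, 'Springfield', 'USA'),
    (3, 'Springfield', 'UK'),
    (4, 'London',      'UK');

-- Duplicates are detected on the (city, country) pair, not on city alone.
SELECT DISTINCT city, country
FROM customer_locations_demo
ORDER BY country, city;
-- Three rows: (London, UK), (Springfield, UK), (Springfield, USA).
-- The repeated (Springfield, USA) row is collapsed into one.
```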
Use DISTINCT when cleaning data for reports, such as identifying unique shipping routes or unique product-supplier pairings.
3. DISTINCT in Aggregate Functions
DISTINCT is incredibly powerful when used inside aggregate functions like COUNT(). This allows you to count how many unique items exist rather than just counting the total number of rows.
SELECT COUNT(DISTINCT city)
FROM customers;
Instead of getting the total number of customers, this query tells you how many different cities your customers are located in. Rows where city is NULL are not counted.
NULL handling depends on the context. SELECT DISTINCT treats all NULLs as a single value: if you have 10 rows with NULL in the city column, SELECT DISTINCT city will return one NULL row. Aggregate functions such as COUNT(DISTINCT city), by contrast, skip NULLs entirely.
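The two NULL behaviours are easy to compare side by side. The sketch below assumes a hypothetical table customers_null_demo with made-up rows.

```sql
-- Hypothetical sample table; the name and rows are assumptions for illustration.
CREATE TABLE customers_null_demo (
    id   INTEGER PRIMARY KEY,
    city VARCHAR(50)
);

INSERT INTO customers_null_demo (id, city) VALUES
    (1, 'Paris'),
    (2, 'Paris'),
    (3, NULL),
    (4, NULL),
    (5, 'Berlin');

-- SELECT DISTINCT collapses all NULLs into a single row.
SELECT DISTINCT city
FROM customers_null_demo;
-- Three rows: Paris, Berlin, and one NULL (row order is not guaranteed).

-- COUNT(DISTINCT ...) skips NULLs, while COUNT(*) counts every row.
SELECT COUNT(DISTINCT city) AS distinct_cities,
       COUNT(*)             AS total_rows
FROM customers_null_demo;
-- distinct_cities = 2, total_rows = 5
```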
Example Queries
1. Get Unique Job Titles from the Employees Table
Imagine a company with 500 employees, but only 10 actual roles (Developer, Manager, HR, etc.). To see the list of roles:
SELECT DISTINCT job_title
FROM employees;
2. Get Unique City and Country Combinations
Useful for logistics and mapping applications where you need to know every unique geographical area your business serves.
SELECT DISTINCT city, country
FROM customers;
3. Count the Number of Unique Products Sold
In a sales ledger, the same product ID might appear thousands of times. To find out how many different types of products actually moved today:
SELECT COUNT(DISTINCT product_id)
FROM sales
WHERE sale_date = '2023-10-01';
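To make the difference between rows and distinct products concrete, here is a small sketch. The table sales_demo, its columns, and its rows are assumptions for illustration; a real sales ledger would look different.

```sql
-- Hypothetical sample table; the name and rows are assumptions for illustration.
CREATE TABLE sales_demo (
    sale_id    INTEGER PRIMARY KEY,
    product_id INTEGER,
    sale_date  DATE
);

INSERT INTO sales_demo (sale_id, product_id, sale_date) VALUES
    (1, 101, '2023-10-01'),
    (2, 101, '2023-10-01'),
    (3, 102, '2023-10-01'),
    (4, 103, '2023-09-30');

-- Compare the number of sale rows with the number of distinct products
-- sold on the same day.
SELECT COUNT(*)                   AS sale_rows,
       COUNT(DISTINCT product_id) AS distinct_products
FROM sales_demo
WHERE sale_date = '2023-10-01';
-- sale_rows = 3, distinct_products = 2
```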
4. Get Unique Product Categories from the Products Table
Perfect for populating a dropdown menu in a web application's user interface.
SELECT DISTINCT category
FROM products;
Important Notes
- Performance Cost:
DISTINCT requires the database to sort or hash the data to find duplicates. On tables with millions of rows, this can lead to slower query performance. Always check if you can use a WHERE clause to limit the data first.
- The "All Columns" Effect: If you use SELECT DISTINCT *, SQL looks for rows where every single value in every column is identical. If your table has a unique primary key (like id), SELECT DISTINCT * will essentially do nothing because the ID makes every row unique anyway.
- NULL Handling: SQL treats all NULL values as identical for the purpose of the DISTINCT keyword.
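The "All Columns" effect and the advice to filter first can be seen in a short sketch. The table orders_demo and its rows are hypothetical and exist only to illustrate these notes.

```sql
-- Hypothetical sample table; the name and rows are assumptions for illustration.
CREATE TABLE orders_demo (
    id         INTEGER PRIMARY KEY,
    product_id INTEGER,
    status     VARCHAR(20)
);

INSERT INTO orders_demo (id, product_id, status) VALUES
    (1, 10, 'shipped'),
    (2, 10, 'shipped'),
    (3, 20, 'pending'),
    (4, 10, 'pending');

-- Every row has a different id, so DISTINCT * removes nothing.
SELECT DISTINCT *
FROM orders_demo;
-- Four rows, the same as SELECT * would return.

-- Leaving the key out of the SELECT list lets duplicates collapse.
SELECT DISTINCT product_id, status
FROM orders_demo;
-- Three rows: (10, shipped), (20, pending), (10, pending).

-- Filtering with WHERE first reduces the rows DISTINCT has to compare.
SELECT DISTINCT product_id
FROM orders_demo
WHERE status = 'shipped';
-- One row: 10
```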
Summary
The DISTINCT keyword is an essential tool for any developer's SQL toolkit. It simplifies data analysis by stripping away redundancy, allowing you to focus on unique data points. Whether you are generating a list of unique categories for a UI or performing complex counts for a business report, DISTINCT ensures your results are concise and accurate.