SQL INTERSECT

The INTERSECT operator in SQL is a set operator used to find the "common ground" between two datasets. It returns only the rows that are present in the result sets of both SELECT queries. Think of it as a mathematical intersection in a Venn diagram: only the overlapping area where both circles meet is returned.

While a UNION combines everything and a EXCEPT (or MINUS) finds differences, INTERSECT is your go-to tool when you need to identify records that satisfy two different sets of criteria simultaneously across different tables or queries.

Developer Tip: Use INTERSECT when you need to perform data validation or find overlapping records between a production table and a backup/archive table.

 

Syntax of SQL INTERSECT

SELECT column1, column2, ...
FROM table1
WHERE condition
INTERSECT
SELECT column1, column2, ...
FROM table2
WHERE condition;
  • SELECT statements: These are the standard queries you want to compare. You can include WHERE clauses, GROUP BY, and joins within these individual statements.
  • INTERSECT: This keyword tells the database engine to compare the results of the top query with the bottom query and only keep the duplicates found in both.
  • columns: For INTERSECT to work, both SELECT statements must have the exact same number of columns in the same order.
Best Practice: Always ensure the data types of corresponding columns match. If the first column in Query A is an INT, the first column in Query B must also be an INT (or a type that can be implicitly converted).

 

Key Points

  • Automatic Deduplication: By default, INTERSECT returns unique rows. If a row appears multiple times in both tables, it will only appear once in the final result.
  • Column Alignment: The names of the columns in the final result set are usually taken from the first SELECT statement.
  • Performance: Large intersections can be resource-intensive. Ensure the columns you are intersecting are indexed if you are working with millions of rows.
  • Comparison: Unlike a JOIN, which combines columns horizontally, INTERSECT compares rows vertically.
Watch Out: Not all database systems support INTERSECT. For example, MySQL and MariaDB do not have a built-in INTERSECT operator. In those systems, you would achieve the same result using an INNER JOIN or an IN clause.

 

Example of SQL INTERSECT

Imagine you are managing a large organization and you have two different lists of employees: those who worked on "Project A" and those who worked on "Project B." You want to find out which "rockstar" employees contributed to both projects.

Table 1: Employees (Department A)

employee_id name department
1 John HR
2 Alice HR
3 Bob HR

Table 2: Employees (Department B)

employee_id name department
2 Alice HR
3 Bob HR
4 Sarah IT

To find the employees present in both lists, we use the following query:

SELECT name, department
FROM employees_1
INTERSECT
SELECT name, department
FROM employees_2;

Output:

name department
Alice HR
Bob HR

Explanation:

  • The database looks at the first set (John, Alice, Bob) and the second set (Alice, Bob, Sarah).
  • It identifies that only **Alice** and **Bob** exist in both results.
  • **John** is excluded because he is only in Department A.
  • **Sarah** is excluded because she is only in Department B.
Common Mistake: Beginners often assume that the column names in both SELECT statements must be identical. They don't have to be! Only the data types and the order of columns matter.

 

SQL INTERSECT with Conditions

In real-world applications, you rarely want to intersect entire tables. Usually, you are looking for a specific subset. For instance, you might want to find customers who are both "Subscribed to the Newsletter" AND "Have made a purchase in the last 30 days."

SELECT email
FROM newsletter_subscribers
WHERE status = 'active'
INTERSECT
SELECT email
FROM orders
WHERE order_date >= '2023-10-01';

Here is another example using our employee tables, focusing specifically on the HR department:

SELECT name, department
FROM employees_1
WHERE department = 'HR'
INTERSECT
SELECT name, department
FROM employees_2
WHERE department = 'HR';

Output:

name department
Alice HR
Bob HR
Developer Tip: If you need to include duplicates in your intersection (though rare), check if your database supports INTERSECT ALL. Standard INTERSECT performs an implicit DISTINCT.

 

Summary

The INTERSECT operator is a powerful way to filter data by finding matches across two different queries. It simplifies complex logic that would otherwise require nested joins or subqueries. Remember these three Golden Rules for using INTERSECT:

  1. The number of columns must match in both SELECT statements.
  2. The data types of those columns must be compatible.
  3. The final result will automatically remove duplicate rows.