- SQL Basics
- SQL Introduction
- SQL Syntax
- SQL Sample Database
- SQL SELECT
- SQL WHERE
- SQL ORDER BY
- SQL DISTINCT
- SQL LIMIT
- SQL FETCH
- SQL AND
- SQL OR
- SQL BETWEEN
- SQL IN
- SQL LIKE
- SQL IS NULL
- SQL Comparison Operators
- SQL Logical Operators
- SQL Alias
- SQL CASE
- Joins and Subqueries
- SQL INNER JOIN
- SQL LEFT JOIN
- SQL RIGHT JOIN
- SQL FULL OUTER JOIN
- SQL SELF JOIN
- SQL CROSS JOIN
- SQL Subquery
- SQL Correlated Subquery
- SQL UNION
- SQL INTERSECT
- SQL EXCEPT
- Aggregate Functions
- SQL AVG
- SQL COUNT
- SQL MAX
- SQL MIN
- SQL SUM
- SQL GROUP BY
- SQL HAVING
- SQL ROLLUP
- SQL CUBE
- SQL GROUPING SETS
- Database Management
- SQL CREATE DATABASE
- SQL ALTER DATABASE
- SQL DROP DATABASE
- SQL BACKUP DATABASE
- SQL SHOW DATABASES
- SQL SELECT DATABASE
- Table Management
- SQL CREATE TABLE
- SQL ALTER TABLE
- SQL ADD COLUMN
- SQL DROP COLUMN
- SQL DROP TABLE
- SQL TRUNCATE TABLE
- SQL SHOW TABLES
- SQL RENAME TABLE
- SQL Constraints
- SQL Primary Key
- SQL Foreign Key
- SQL UNIQUE Constraint
- SQL CHECK Constraint
- SQL NOT NULL Constraint
SQL Correlated Subquery
In standard SQL, a subquery is a query nested inside another statement. A correlated subquery is a specific type of subquery that relies on the outer query for its values. Think of it like a "nested loop" in traditional programming: for every single row processed by the outer query, the inner subquery runs once.
Unlike a non-correlated subquery (which runs once, finishes, and passes its result to the outer query), a correlated subquery is executed repeatedly. This deep connection between the two queries allows you to perform complex row-by-row comparisons that wouldn't be possible with a simple join.
Syntax of Correlated Subquery
SELECT column1, column2, ...
FROM outer_table o
WHERE column_name operator (SELECT column_name
FROM inner_table i
WHERE i.reference_column = o.reference_column);
- outer_table: The primary table being queried. We usually give it an alias (like
oore) so the subquery can "reach out" and grab its values. - inner_table: The table used inside the subquery to perform the check.
- operator: Comparison operators like
=,>,<, or logical operators likeEXISTSandIN.
employees e) when writing correlated subqueries. This prevents "ambiguous column" errors and makes it clear which column belongs to the outer query versus the inner query.
Example of Correlated Subquery
Let's look at a classic real-world scenario. Imagine you are managing a payroll system and need to find high-performers who are paid significantly more than their immediate peers.
Employees Table:
Departments Table:
Correlated Subquery Example
Goal: Find employees whose salary is higher than the average salary within their own department.
SELECT e.name, e.salary
FROM employees e
WHERE e.salary > (SELECT AVG(e2.salary)
FROM employees e2
WHERE e2.department_id = e.department_id);
Explanation:
- The outer query looks at the first employee (John).
- The inner query takes John's
department_id(101) and calculates the average salary for everyone in department 101. - The outer query then checks if John's salary is greater than that average.
- This process repeats for Jane, Michael, and Sarah.
Output:
How Correlated Subqueries Work
- Row Selection: The outer query fetches a candidate row.
- Variable Passing: Values from that candidate row are passed into the inner subquery.
- Subquery Execution: The subquery executes using those external values.
- Evaluation: The outer query uses the subquery's result to decide whether to include the candidate row in the final output.
- Iteration: The process moves to the next row and repeats until all rows are checked.
WHERE e2.department_id = e.department_id, the subquery will return the average salary of the entire company for every row, turning it into a non-correlated subquery.
Correlated Subquery with IN
You can also use correlated subqueries to filter data based on relationships in other tables. While this can often be done with a JOIN, using a subquery is sometimes more readable when you just need to check for existence.
SELECT e.name
FROM employees e
WHERE e.department_id IN (SELECT d.department_id
FROM departments d
WHERE d.department_name = 'IT'
AND d.department_id = e.department_id);
Explanation:
- For every employee, the subquery checks if their
department_idbelongs to a department named "IT". - If the match is found, the employee's name is returned.
EXISTS is usually preferred over IN for correlated subqueries because EXISTS stops searching as soon as it finds the first match, making it slightly more efficient.
Key Points
- A correlated subquery is "correlated" because it depends on the outer query's data.
- It acts like a row-by-row filter.
- It is highly flexible and great for "Top N per group" or "Compare against group average" problems.
- Performance is the main trade-off; use them wisely on large datasets.
Summary
Correlated subqueries are a powerful tool in a developer's SQL toolkit. They allow for sophisticated logic by linking the inner and outer queries together. While they are intuitive to write for complex filtering, always keep an eye on execution plans to ensure your database stays performant as your data grows.