SQL Correlated Subquery

A correlated subquery is a type of subquery that references columns from the outer query. Unlike a non-correlated subquery, which can be executed independently, a correlated subquery must be executed for each row processed by the outer query. This means that for every row in the outer query, the subquery is re-executed, using values from the outer query's row.

 

Syntax of Correlated Subquery

SELECT column1, column2, ...
FROM outer_table
WHERE column_name operator (SELECT column_name 
                             FROM inner_table 
                             WHERE outer_table.column_name = inner_table.column_name);
  • outer_table: The main table in the outer query.
  • inner_table: The table in the inner (correlated) query.
  • operator: Comparison operators such as =, >, <, IN, etc.

 

Example of Correlated Subquery

Consider the following tables:

Employees Table:

employee_id name department_id salary
1 John 101 50000
2 Jane 102 60000
3 Michael 101 45000
4 Sarah 102 70000

Departments Table:

department_id department_name
101 HR
102 IT

 

Correlated Subquery Example

Find the employees whose salary is greater than the average salary of employees in the same department.

SELECT e.name, e.salary
FROM employees e
WHERE e.salary > (SELECT AVG(e2.salary) 
                  FROM employees e2 
                  WHERE e2.department_id = e.department_id);

Explanation:

  • The inner query calculates the average salary for each department. It uses e.department_id from the outer query to find the department for each employee.
  • The outer query retrieves the employees whose salary is greater than the average salary of their own department.

Output:

name salary
Jane 60000
Sarah 70000

 

How Correlated Subqueries Work

  1. The outer query processes one row at a time.
  2. For each row in the outer query, the correlated subquery is executed, using values from that specific row.
  3. The result of the subquery is then used by the outer query to filter the data.

 

Correlated Subquery with IN

Another common use of correlated subqueries is with the IN operator to match rows from the outer query with a list of values produced by the inner query.

SELECT e.name
FROM employees e
WHERE e.department_id IN (SELECT d.department_id 
                           FROM departments d 
                           WHERE d.department_name = 'IT');

Explanation:

  • The inner query selects the department IDs of departments that are named "IT."
  • The outer query selects all employees whose department_id matches one of those department IDs.

 

Key Points

  • A correlated subquery references columns from the outer query.
  • It must be executed once for each row of the outer query.
  • Common operators used in correlated subqueries include =, IN, ANY, ALL, etc.
  • The subquery is evaluated for each row, making it more computationally expensive than a non-correlated subquery.

 

Summary

A correlated subquery is a subquery that depends on values from the outer query for its execution. It is evaluated for each row processed by the outer query, which makes it suitable for operations where comparison between rows of the outer query and related values in the subquery is required.