SQL Self Join

A Self Join is a join where a table is joined with itself. It is used to compare rows within the same table or to find relationships within the table, such as hierarchical data or paired records.

Syntax of Self Join

SELECT a.column1, b.column2
FROM table_name a
JOIN table_name b
ON a.common_column = b.common_column;
  • table_name: The table is referenced twice using table aliases (e.g., a and b).
  • common_column: The column on which the join condition is applied.

 

Example

Consider the following Employees table:

employee_id name manager_id
1 Alice NULL
2 Bob 1
3 Charlie 1
4 David 2

Here:

  • manager_id is a reference to employee_id within the same table, indicating that employees report to a manager.

 

Self Join Query

To find employees and their managers:

SELECT e1.name AS employee, e2.name AS manager
FROM employees e1
LEFT JOIN employees e2
ON e1.manager_id = e2.employee_id;

Output:

employee manager
Alice NULL
Bob Alice
Charlie Alice
David Bob

Explanation:

  • Alice has no manager (manager_id = NULL), so manager is NULL.
  • Bob and Charlie report to Alice (manager_id = 1).
  • David reports to Bob (manager_id = 2).

 

Self Join for Hierarchical Data

Self Joins can also be used to identify relationships like subordinates under managers in a hierarchy.

Query for Direct Subordinates

SELECT e1.name AS manager, e2.name AS subordinate
FROM employees e1
JOIN employees e2
ON e1.employee_id = e2.manager_id;

Output:

manager subordinate
Alice Bob
Alice Charlie
Bob David

 

Self Join with Additional Conditions

You can add more conditions to refine the results. For example, to find employees reporting to Alice:

SELECT e1.name AS employee, e2.name AS manager
FROM employees e1
JOIN employees e2
ON e1.manager_id = e2.employee_id
WHERE e2.name = 'Alice';

Output:

employee manager
Bob Alice
Charlie Alice

 

Key Points

  • A Self Join uses table aliases to differentiate between the two instances of the same table.
  • Common use cases:
    • Finding hierarchical relationships (e.g., managers and subordinates).
    • Comparing rows within the same table (e.g., pairs of employees with the same attributes).
  • Use a LEFT JOIN if you want to include rows without a match (e.g., employees without managers).

 

Summary

The Self Join is a powerful tool for querying hierarchical or relational data within the same table. It allows you to uncover insights by comparing rows of a single dataset in a structured way.