SQL GROUP BY

The GROUP BY statement in SQL is used to arrange identical data into groups. It is often used with aggregate functions such as COUNT(), SUM(), AVG(), MAX(), and MIN() to perform calculations on each group of data.

 

Syntax of SQL GROUP BY

SELECT column_name, aggregate_function(column_name)  
FROM table_name  
WHERE condition  
GROUP BY column_name;  
  • column_name: The column(s) by which the data will be grouped.
  • aggregate_function(): An aggregate function applied to each group.
  • table_name: The table containing the data.
  • condition: Optional filter to apply before grouping.

 

Key Points

  • The GROUP BY clause follows the WHERE clause but precedes the ORDER BY clause.
  • Each column in the SELECT statement must be included in either the GROUP BY clause or an aggregate function.
  • It groups rows that have the same values in specified columns.

 

Example of SQL GROUP BY

Consider the following table:

Table: Orders

order_id customer_name product_name quantity
1 Alice Apple 3
2 Bob Apple 2
3 Alice Banana 5
4 Charlie Banana 7
5 Bob Cherry 4

To calculate the total quantity ordered by each customer:

SELECT customer_name, SUM(quantity) AS total_quantity  
FROM orders  
GROUP BY customer_name;  

Output:

customer_nametotal_quantity
Alice8
Bob6
Charlie7

Explanation:

  • The GROUP BY customer_name groups rows based on the customer_name column, and the SUM(quantity) calculates the total quantity for each customer.

 

SQL GROUP BY with Multiple Columns

You can group data by multiple columns to create more specific groups.

Example: Total quantity ordered by each customer for each product:

SELECT customer_name, product_name, SUM(quantity) AS total_quantity  
FROM orders  
GROUP BY customer_name, product_name;  

Output:

customer_nameproduct_nametotal_quantity
AliceApple3
AliceBanana5
BobApple2
BobCherry4
CharlieBanana7

Explanation:

  • The GROUP BY customer_name, product_name creates groups for each combination of customer_name and product_name.

 

SQL GROUP BY with HAVING Clause

The HAVING clause is used to filter grouped data, similar to how the WHERE clause filters rows.

Example: Show customers who ordered more than 5 items in total:

SELECT customer_name, SUM(quantity) AS total_quantity  
FROM orders  
GROUP BY customer_name  
HAVING SUM(quantity) > 5;  

Output:

customer_nametotal_quantity
Alice8
Charlie7

Explanation:

  • The HAVING SUM(quantity) > 5 filters out groups where the total quantity is 5 or less.

 

SQL GROUP BY with Aggregate Functions

GROUP BY is commonly used with aggregate functions to summarize data.

Example: Total quantity and average quantity ordered for each product:

SELECT product_name, SUM(quantity) AS total_quantity, AVG(quantity) AS avg_quantity  
FROM orders  
GROUP BY product_name;  

Output:

product_nametotal_quantityavg_quantity
Apple52.5
Banana126.0
Cherry44.0

 

Summary

The GROUP BY clause is a powerful feature in SQL for organizing data into meaningful groups. It allows you to perform aggregate calculations for each group, and combined with the HAVING clause, it enables filtering based on group conditions. Use it to derive insights and summaries from large datasets.