In SQL, the
GROUP BY statement is used to group the result coming from a
SELECT clause, based on one or more columns in the resultant table.
GROUP BY is often used with aggregate functions (MAX, MIN, SUM, COUNT, AVG) to group the resulting set by one or more columns.
GROUP BY statements come after the
WHERE clause in the query. Column names are taken from the tables specified in the
SELECT columnName(s) FROM tableName(s) WHERE condition(s) GROUP BY columnName(s);
Consider the following table,
Persons. Given below are some queries that demonstrate different ways to use the
GROUP BY clause:
GROUP BY clause grouped all the duplicate countries (United States, Egypt) together and returned only a single row for them. So only 8 out of original 10 rows are selected since there are only 8 unique countries in the
MultipleColumns: There are two queries in this snippet:
Language column entries are selected without using the
GROUP BY clause, it returns 2 rows that are duplicate.
On the other hand, just as was the case with a SingleColumn, when multiple columns are passed to
GROUP BY, it returns a single row.
GROUP BY only treats two rows as duplicates if all the column values in both the rows are the same. If even a single column value in either of the row is non-matching, they are treated as unique.
GROUP BYcan also be used to count the number of occurrences of a specific value in a column. The given query selects all the values in
Countrycolumn, groups them up, and returns the
COUNTalong with the country for each country.
After specific columns have been selected, one can choose to group them up under specific conditions. This is done using the
HAVING clause. The given query only gives
COUNT of the countries as specified in the
GROUP BY clause is used when selecting MIN/MAX/AVG of a column. The given query groups by the
Country column and gives the average value for each country. Since the United States and Egypt had 2 rows each, the
AVG(Salary) for these countries is the average of both the values, as opposed to Germany with only one row.