Joins with Aggregate Functions

Learn to use aggregate functions.

Let’s start answering some more complex queries.

How many customers do we have with orders?

Suppose that we had a lot of customers with lots of orders. Some customers are referred to as leads because they haven’t yet placed an order. So, we have customers that have orders and customers that don’t have an order.

How can we count the customers that have orders?

We tried the following query earlier:

mysql> select * from customers join orders on orders.customer_id = customers.id;
+----+-----------+----------+----+--------------+---------------------+-------------+
| id | name      | identity | id | order_number | ordered_at          | customer_id |
+----+-----------+----------+----+--------------+---------------------+-------------+
|  4 | John Woo  | JW001    |  2 | ABC001       | 2016-09-12 21:36:56 |           4 |
|  5 | Maria Foo | MF001    |  3 | ABC002       | 2016-09-12 21:40:20 |           5 |
|  5 | Maria Foo | MF001    |  4 | ABC003       | 2016-09-12 21:44:34 |           5 |
|  6 | Jim Papas | JP001    |  5 | ABC004       | 2016-09-12 21:46:38 |           6 |
|  6 | Jim Papas | JP001    |  6 | ABC005       | 2016-09-12 21:47:41 |           6 |
|  6 | Jim Papas | JP001    |  7 | ABC006       | 2016-09-12 21:49:31 |           6 |
+----+-----------+----------+----+--------------+---------------------+-------------+
6 rows in set (0.01 sec)

This joins the customers with the orders and won’t include any customer that doesn’t have an order. This is a start to the answer to our question, but it isn’t the full answer. It shows six rows. Let’s look at the following:

mysql> select count(*) from customers join orders on orders.customer_id = customers.id;
+----------+
| count(*) |
+----------+
|        6 |
+----------+
1 row in set (0.01 sec)

We get 6, which isn’t the correct number of customers that have placed at least one order. The correct number should be 3. The above query repeats the same customer for every order that the customer has. That’s why we don’t get the correct result.

Use of distinct

One correct SQL statement to count the customers that have placed an order is the following:

mysql> select count(distinct customers.id) from customers join orders on orders.customer_id = customers.id;
+------------------------------+
| count(distinct customers.id) |
+------------------------------+
|                            3 |
+------------------------------+
1 row in set (0.00 sec)

So, instead of count(*), we tell MySQL that we want to count the distinct customers.id values. The customers.id values are 4, 5, 5, 6, 6, and 6. The distinct, unique set is 4, 5, and 6, which means that the count would return 3. This is the correct result.

Remember that we can alias a column header with a label of our own to make the header of the column read easier:

mysql> select count(distinct customers.id) as number_of_customers_with_orders from customers join orders on orders.customer_id = customers.id;
+---------------------------------+ 
| number_of_customers_with_orders | 
+---------------------------------+ 
|                               3 |
+---------------------------------+ 
1 row in set (0.00 sec)

Another easier way to count the number of customers that have placed orders is to use only the order’s customer and count the distinct orders.customer_id values instead:

mysql> select count(distinct orders.customer_id) from orders;
+------------------------------------+
| count(distinct orders.customer_id) |
+------------------------------------+
|                                  3 |
+------------------------------------+
1 row in set (0.00 sec)

Since every customer that has placed an order appears inside the orders table, there’s no reason to use the customers table. The SQL statement above consumes fewer resources and is faster than the count that uses both the customers and orders tables since it doesn’t have to join information from different tables. Joins are costly. We should avoid them if we can. Practice our queries in the code widget below:

Get hands-on with 1200+ tech skills courses.