SQL Group by vs Distinct

From WikiOD
Revision as of 11:47, 15 June 2021 by Admin (talk | contribs) (Text replacement - "{{note| This article is an extract of the original Stack Overflow Documentation created by contributors and released under [https://creativecommons.org/licenses/by-sa/3.0/ CC BY-SA 3.0]. This website is not affiliated with Stack Overflow }}" to "{{note| Credit:Stack_Overflow_Documentation }}")
(diff) ← Older revision | Latest revision (diff) | Newer revision → (diff)

Difference between GROUP BY and DISTINCT[edit | edit source]

GROUP BY is used in combination with aggregation functions. Consider the following table:

orderId userId storeName orderValue orderDate
1 43 Store A 25 20-03-2016
2 57 Store B 50 22-03-2016
3 43 Store A 30 25-03-2016
4 82 Store C 10 26-03-2016
5 21 Store A 45 29-03-2016

The query below uses GROUP BY to perform aggregated calculations.

SELECT
    storeName,
    COUNT(*) AS total_nr_orders,
    COUNT(DISTINCT userId) AS nr_unique_customers,
    AVG(orderValue) AS average_order_value,
    MIN(orderDate) AS first_order,
    MAX(orderDate) AS lastOrder
FROM
    orders
GROUP BY
    storeName;

and will return the following information

storeName total_nr_orders nr_unique_customers average_order_value first_order lastOrder
Store A 3 2 33.3 20-03-2016 29-03-2016
Store B 1 1 50 22-03-2016 22-03-2016
Store C 1 1 10 26-03-2016 26-03-2016

While DISTINCT is used to list a unique combination of distinct values for the specified columns.

SELECT DISTINCT
    storeName,
    userId
FROM
    orders;
storeName userId
Store A 43
Store B 57
Store C 82
Store A 21

Credit:Stack_Overflow_Documentation