The world's most popular open source database
MySQL extends the use of GROUP BY so that you
can use non-aggregated columns or calculations in the
SELECT list that do not appear in
the GROUP BY clause. You can use this feature
to get better performance by avoiding unnecessary column sorting
and grouping. For example, you do not need to group on
customer.name in the following query:
SELECT order.custid, customer.name, MAX(payments) FROM order,customer WHERE order.custid = customer.custid GROUP BY order.custid;
In standard SQL, you would have to add
customer.name to the GROUP
BY clause. In MySQL, the name is redundant.
Do not use this feature if the columns you
omit from the GROUP BY part are not constant
in the group. The server is free to return any value from the
group, so the results are indeterminate unless all values are
the same.
A similar MySQL extension applies to the
HAVING clause. The SQL standard does not
allow the HAVING clause to name any column
that is not found in the GROUP BY clause if
it is not enclosed in an aggregate function. MySQL allows the
use of such columns to simplify calculations. This extension
assumes that the non-grouped columns will have the same
group-wise values. Otherwise, the result is indeterminate.
If the ONLY_FULL_GROUP_BY SQL mode is
enabled, the MySQL extension to GROUP BY does
not apply to the SELECT. That is,
columns not named in the GROUP BY clause
cannot be used in the SELECT list
if not used in an aggregate function.
The select list extension also applies to ORDER
BY. That is, you can use non-aggregated columns or
calculations in the ORDER BY clause that do
not appear in the GROUP BY clause. This
extension does not apply if the
ONLY_FULL_GROUP_BY SQL mode is enabled.
In some cases, you can use MIN()
and MAX() to obtain a specific
column value even if it isn't unique. The following gives the
value of column from the row containing the
smallest value in the sort column:
SUBSTR(MIN(CONCAT(RPAD(sort,6,' '),column)),7)
See Section 3.6.4, “The Rows Holding the Group-wise Maximum of a Certain Field”.
Note that if you are using MySQL 3.22 (or earlier) or if you are
trying to follow standard SQL, you cannot use expressions in
GROUP BY or ORDER BY
clauses. You can work around this limitation by using an alias
for the expression:
mysql>SELECT id,FLOOR(value/100) AS val->FROM->tbl_nameGROUP BY id, val ORDER BY val;
In MySQL 3.23 and up, aliases are unnecessary. You can use
expressions in GROUP BY and ORDER
BY clauses. For example:
mysql> SELECT id, FLOOR(value/100) FROM tbl_name ORDER BY RAND();
Note that if you are using MySQL 3.22 (or earlier) or if you are
trying to follow standard SQL, you can't use expressions in
GROUP BY clauses. You can work around this
limitation by using an alias for the expression:
SELECT id,FLOOR(value/100) AS val
FROM tbl_name
GROUP BY id, val;
In MySQL 3.23 and up, aliases are unnecessary and MySQL does
allow expressions in GROUP BY clauses. For
example:
SELECT id,FLOOR(value/100)
FROM tbl_name
GROUP BY id, FLOOR(value/100);
Before MySQL 3.23, MySQL also requires use of aliases to refer
to expressions in ORDER BY clauses.


User Comments
Here's a way to count subsets within a group:
SELECT
shoeStyle, count(color) as Count,
count(color = 'red' OR NULL) as redCount,
count(color = 'green' OR NULL) as greenCount,
count(color = 'blue' OR NULL) as blueCount,
FROM
bowlingShoes
GROUP BY
shoeStyle
Not sure why, but in MySQL 4.0 the "OR NULL" is necessary, or you will just get a count of all rows in the group.
BTW, the following does NOT work in a GROUP BY in 4.0 with or without the OR NULLs:
SELECT
shoeStyle, SUM(price) AS priceSUM,
SUM((color = 'red')*price) AS redPriceSum,
SUM((color = 'blue')*price) AS bluePriceSum
FROM
bowlingShoes
GROUP BY
shoeStyle
<BTW, the following does NOT work in a GROUP BY in 4.0 with or without the OR NULLs:
<SELECT
<shoeStyle, SUM(price) AS priceSUM,
<SUM((color = 'red')*price) AS redPriceSum,
<SUM((color = 'blue')*price) AS bluePriceSum
<FROM
<bowlingShoes
<GROUP BY
<shoeStyle
I would remove the extra parentheses from around the SUM. You only want to SUM the number of "red" or "blue," then you can multiply by the price and still call it the redPriceSum as follows:
SELECT
shoeStyle, SUM(price) AS priceSUM,
SUM(color = 'red' OR NULL)*price AS redPriceSum,
SUM(color = 'blue' OR NULL)*price AS bluePriceSum
FROM
bowlingShoes
GROUP BY
shoeStyle
In the end, the priceSUM should be equal to the redPriceSum plus the bluePriceSum (unless there are other colors in the table)
I believe an example to "obtain a specific column value even if it isn't unique":
SUBSTR(MIN(CONCAT(RPAD(sort,6,' '),column)),7)
isn't that great, because it won't sort fields correctly that have more that 6 (or N) letters coinciding.
I'd suppose this one:
SUBSTR(MIN(CONCAT(sort,'@',column)), LENGTH(MIN(sort))+2)
Th only limitation is to choose a distinct character like '@', that would surely not be met in 'column' column, so be aware.
The statement "Do not use this feature if the columns you omit from the GROUP BY part are not constant in the group" is probably a bit alarmist. The consequence of not heeding that advice is that "The server is free to return any value from the group"; that consequence could well be perfectly acceptable, and even a valuable feature.
>Not sure why, but in MySQL 4.0 the "OR NULL" is necessary, or you will just get a count of all rows in the group.
If you take a look at COUNT(expr) function, you'll see it says "Returns a count of the number of non-NULL values in the rows retrieved by a SELECT statement". That's why the "OR NULL" is necessary.
You could have also written your query as:
SELECT ... sum(color = 'red') as redCount ...
Thats because the expression evaluates to 1 when color is red, or 0 otherwise.
Oh man this is a stupid non standard feature, I just spent a while debugging a stored proc by someone who used this feature. I recommend putting all columns in your GROUP BY and avoiding this mess of a feature. In my case a guy was selecting about 12 columns, grouping by only 2. But, there was a 3rd column he was depending on being selected based on the fact that it was the "lowest value" for that column. for the most part it worked but then MySql started picking other columns for the 3rd instead of the lowest one, i managed to use an ORDER BY MIN(Col_3) and that seems to fix it. (This stored proc is massive and i dont want to spend weeks rewriting it). There are better more standard ways to achieve this effect. I recommend not using it.
Add your own comment.