The world's most popular open source database
Task: For each article, find the dealer or dealers with the most expensive price.
In standard SQL (and as of MySQL 4.1), the problem can be solved with a subquery like this:
SELECT article, dealer, price
FROM shop s1
WHERE price=(SELECT MAX(s2.price)
FROM shop s2
WHERE s1.article = s2.article);
The preceding example uses a correlated subquery, which can be
inefficient (see Section 12.2.8.7, “Correlated Subqueries”). Other
possibilities for solving the problem are to use a uncorrelated
subquery in the FROM clause or a
LEFT JOIN:
SELECT s1.article, dealer, s1.price FROM shop s1 JOIN ( SELECT article, MAX(price) AS price FROM shop GROUP BY article) AS s2 ON s1.article = s2.article AND s1.price = s2.price; SELECT s1.article, s1.dealer, s1.price FROM shop s1 LEFT JOIN shop s2 ON s1.article = s2.article AND s1.price < s2.price WHERE s2.article IS NULL;
The LEFT JOIN works on the basis that when
s1.price is at its maximum value, there is no
s2.price with a greater value and the
s2 rows values will be
NULL. See Section 12.2.7.1, “JOIN Syntax”.
Before MySQL 4.1, subqueries are unavailable. Another approach is to solve the problem in several steps:
Get the list of (article,maxprice) pairs.
For each article, get the corresponding rows that have the stored maximum price.
This can easily be done with a temporary table and a join:
CREATE TEMPORARY TABLE tmp (
article INT(4) UNSIGNED ZEROFILL DEFAULT '0000' NOT NULL,
price DOUBLE(16,2) DEFAULT '0.00' NOT NULL);
LOCK TABLES shop READ;
INSERT INTO tmp SELECT article, MAX(price) FROM shop GROUP BY article;
SELECT shop.article, dealer, shop.price FROM shop, tmp
WHERE shop.article=tmp.article AND shop.price=tmp.price;
UNLOCK TABLES;
DROP TABLE tmp;
If you don't use a TEMPORARY table, you must
also lock the tmp table.
“Can it be done with a single query?”
Yes, but only by using a quite inefficient trick called the “MAX-CONCAT trick”:
SELECT article,
SUBSTRING( MAX( CONCAT(LPAD(price,6,'0'),dealer) ), 7) AS dealer,
0.00+LEFT( MAX( CONCAT(LPAD(price,6,'0'),dealer) ), 6) AS price
FROM shop
GROUP BY article;
+---------+--------+-------+
| article | dealer | price |
+---------+--------+-------+
| 0001 | B | 3.99 |
| 0002 | A | 10.99 |
| 0003 | C | 1.69 |
| 0004 | D | 19.95 |
+---------+--------+-------+
The last example can be made a bit more efficient by doing the splitting of the concatenated column in the client.


User Comments
OK, Some BAD News about performance:
If you're accessing your MySQL via PHP, (and possibly other programming languages) which many of us are, each SQL statement has to be run separately (i.e. using mysql_query($qry,$db);).
This means that the temporary table method requires *5* separate calls to 'mysql_query()', whereas Csaba's method only takes one!
I've performance tested these methods on my own CMS (TC! http://top-cat.com) using XDebug (http://www.xdebug.org/) and found these results for running a function which displays the most recent version of a list of items:
TEMP TABLE METHOD: 12.8439433575 s
CSABA'S METHOD: 11.8751540184 s
What's WORSE is this is the result when I use a separate PHP function to get the max version:
SEPARATE FUNCTION METHOD: 7.6197450161 s
... This means you're better off creating a separate PHP function to get the max value, THEN building it into the query... Whichever way you do it - it's slow!
If you can change your coding to set a separate column like 'status' to 'C'urrent or 'M'ax in each row - this is the quickest way to get the data (although setting it obviously becomes slower). If I get the first version of all the items, this is the timing I get (it may be quicker if you index the column too):
NO-JOIN METHOD: 6.4667682648 s
MAX CONCAT METHOD: ... I'm afraid I couldn't get this to work at all.
It is important to note how multiple dealers tied for the best price are treated differently. In the sub-select, all dealers with the best price are selected. But using the max-concat, only one will be, namely the one that wins alphabetically.
another variation of the self-join (which, after all, is what the database engine executes for a correlated subquery, no?)
this self-join uses GROUP BY, thus allowing aggregate functions
example using the shop table:
select s1.article
, s1.dealer
, s1.price
, count(*) as articles
from shop s1
inner
join shop s2
on s1.dealer = s2.dealer
group
by s1.article
, s1.dealer
, s1.price
having s1.price = max(s2.price)
results:
article dealer price articles
1 B 3.99 2
2 A 10.99 2
3 C 1.69 1
4 D 19.95 2
LEFT JOIN first appears here in the tutorial with little explanation.
groupwise MAX
19 rows in set (0.22 sec)<quoted-from-site>:
A good solution to "Get row with the group-wise maximum" Getting just the maximum for the group is simple, getting the full row which is belonging to the maximum is the interesting step.
</quoted-from-site>
The solution was found at the link below. I thought it will be helpful to others like it was to me after spending couple days to get working (I am a noob). It is a great read (must read) to anyone and has multiple (10) different solutions. I will check all of them to find the fastest and add more info. to this post later.
http://jan.kneschke.de/projects/mysql/groupwise-max
Here is my elaborate example:
To list all records without filtering for the MAX and verify the final result only:
NOTICE HOW THE 2ND TABLE HAS THE RECORDS WITH MAXIMUM RevNo ONLY
SELECT DrawingID,DrawingNo,SheetNo,RevNo FROM Drawings WHERE (`DrawingNo` LIKE 'to-10-D%');
Here is the statement that got me the good result I was expecting. I did the test with different search criteria to make sure and all worked perfectly.
SELECT D1.DrawingID,D1.DrawingNo,D1.SheetNo,D1.RevNo
FROM Drawings AS D1, (SELECT DrawingID,DrawingNo,SheetNo,MAX(RevNo) AS MaxRevNo
FROM Drawings WHERE (`DrawingNo` LIKE 'to-10-D%') GROUP BY DrawingNo,SheetNo) AS D2
WHERE D2.DrawingNo=D1.DrawingNo AND D2.SheetNo=D1.SheetNo AND D1.RevNo=D2.MaxRevNo;
16 rows in set (0.81 sec)
To Find the Nth Maximum Value we can use these querys
for ex: to find the 3rd max value we can use these querys
Changing the numbers we can find other max values.
select * from tbl_name a where 3=(Select COUNT(DISTINCT(b.field_name)) from tbl_name b where a.field_name<=b.field_name);
select MIN(b.field_name) from (select a.field_name from tbl_name a order by a.field_name desc limit 3 ) as b
Add your own comment.