Given the following table archived
:
CREATE TABLE archived (
sid serial -- row id
showurl text -- can be: btg, tmwwt, gcaf, emda
title text
reference text -- title id number
);
Sample Input:
sid | showurl | title | reference
----+---------+-------------+-----------
1 | btg | 01-Preface | 01
2 | btg | 01-Preface | 01
3 | emda | 26-Friction | 26
4 | tmwwt | 13-Helper | 13
5 | btg | 05-Finale | 05
6 | gcaf | 07-Help | 07
Desired Output:
1st selection: showurl: 'btg'
2nd selection: reference: '01'
3rd and final selection: title: '01-Preface'
I am trying to determine from table archived
which showurl
has the greatest number of rows.
Then, for that 'showurl', I need to identify which reference
has the greatest number of rows. Obviously, that 'reference' will be duplicated, but the 'sid' associated with each duplicated 'reference' will be unique.
Once that is determined, I wish to select/display its title
.
My initial attempt (using php, postgresql 9.2: queried each showurl
, then using max to identify the showurl
with the greatest number of hits.):
$query = "SELECT * FROM archived WHERE showurl = 'btg' order by sid ASC";
$query = "SELECT * FROM archived WHERE showurl = 'tmwwt' order by sid ASC";
$query = "SELECT * FROM archived WHERE showurl = 'gcaf' order by sid ASC";
$query = "SELECT * FROM archived WHERE showurl = 'emda' order by sid ASC";
$max = max($numrows1,$numrows2,$numrows3,$numrows4);
Using the sample data above, $max
would be 3, but showurl
, title
, reference
, and sid
are unavailable for further processing.
I have tried using pg_numrows()
but I cannot find a way to refer back either to 'title' or 'reference' for the max rows.
I gather some form of UNION
would be in order, but have no idea how to implement this.
Best Answer
Depending on your exact requirements, something along these lines should work:
Subqueries to determine most common attributes
I added
DISTINCT
to the outerSELECT
to fold duplicate titles. It's undefined what you need exactly.Shorter with window functions
A more modern / sophisticated / elegant approach would be with two window functions in the
ORDER BY
clause, allowing for a very terse query:Not necessarily faster than the first solution, though. Again, guessing what you might want exactly, I only retrieve a single winning title.
The trick here is to partition the rows by
showurl
andreference
, run a count over each partition and order descending by that count. This way, you can determine a winner in a single query level.This will not work, if there is no row with the most common
showurl
and the most commonreference
at the same time. Again, exact requirements would be needed ...-> SQLfiddle demo.