There are several ways to pass the IN / LIMIT
limitation and the UNION / ORDER BY
syntax requirements.
Wrap the Union in a derived table:
SELECT * FROM comment AS tbl WHERE sscm_id IN
( SELECT sscm_id
FROM
( ( SELECT sscm_id FROM comment AS t1 WHERE t1.sscm_oid = 4
ORDER BY t1.sscm_id DESC LIMIT 3)
UNION ALL
( SELECT sscm_id FROM comment AS t2 WHERE t2.sscm_oid = 3
ORDER BY t2.sscm_id DESC LIMIT 3)
) AS tmp
) ;
Use JOIN
instead of WHERE / IN
:
UNION
is required here instead of UNION ALL
to avoid duplicates
SELECT tbl.*
FROM comment AS tbl
JOIN
( ( SELECT sscm_id FROM comment AS t1 WHERE t1.sscm_oid = 4
ORDER BY t1.sscm_id DESC LIMIT 3)
UNION
( SELECT sscm_id FROM comment AS t2 WHERE t2.sscm_oid = 3
ORDER BY t2.sscm_id DESC LIMIT 3)
) AS tmp
ON tmp.sscm_id = tbl.sscm_id ;
You can have window functions on the result of aggregate functions in a single query level.
This would all work nicely after a few modifications - except that it fails for the standard deviation on mathematical principal. The involved calculations are not linear, so you cannot simply combine standard deviations of sub-populations.
SELECT perm
,combo
,avg(value) AS perm_average_value
,sum(avg(value) * count(*)) OVER w_combo /
sum(count(*)) OVER w_combo AS combo_average_value
,stddev_pop(value) AS perm_stddev
,0 AS combo_stddev -- doesn't work!
,count(*) AS perm_count
,sum(count(*)) OVER w_combo AS combo_count
FROM foo
GROUP BY perm, combo
WINDOW w_combo AS (PARTITION BY combo);
For combo_average_value
you would need this expression
sum(avg(value) * count(*)) OVER w_combo / sum(count(*)) OVER w_combo
Since you need a weighted average. (The average of a group with 10 members weighs more than the average of a group with just 2 members!)
This works:
SELECT DISTINCT ON (perm, combo)
perm
,combo
,avg(value) OVER wpc AS perm_average_value
,avg(value) OVER wc AS combo_average_value
,stddev_pop(value) OVER wpc AS perm_stddev
,stddev_pop(value) OVER wc AS combo_stddev
,count(*) OVER wpc AS perm_count
,count(*) OVER wc AS combo_count
FROM foo
WINDOW wc AS (PARTITION BY combo)
,wpc AS (PARTITION BY perm, combo);
I am using two different windows here, and reduce the rows with DISTINCT
which is applied even after window functions.
But I seriously doubt it will be faster than your original query. I am pretty sure it isn't.
Better performance with altered table layout
Arrays have an overhead of 24 bytes (slight variations depending on type). Also, you seem to have quite a few items per array and many repetitions. For a huge table like yours it would pay to normalize the schema. Example layout:
CREATE TABLE combo (
combo_id serial PRIMARY KEY
,combo int[] NOT NULL
);
CREATE TABLE perm (
perm_id serial PRIMARY KEY
,perm int[] NOT NULL
);
CREATE TABLE value (
perm_id int REFERENCES perm(perm_id)
,combo_id int REFERENCES combo(combo_id)
,value numeric NOT NULL DEFAULT 0
);
If you don't need referential integrity you can omit the foreign key constraints.
The connection to combo_id
could also be placed in the table perm
, but in this scenario I would store it (slightly de-normalized) in value
for better performance.
This would result in a row size of 32 bytes (tuple header + padding: 24 bytes, 2 x int (8 byte), no padding), plus the unknown size of your numeric
column. (If you don't need extreme precision, a double precision
or even a real
column might do, too.)
More on physical storage in this related answer on SO or here:
Configuring PostgreSQL for read performance
Anyway, that's only a fraction of what you have now and would make your query a lot faster by size alone. Grouping and sorting on simple integers is also a lot faster.
You would first aggregate in a subquery and then join to perm
and combo
for best performance.
Best Answer
The best solution depends on where and in what form the values come from.
unnest()
Use the set returning function
unnest()
with an array in theFROM
clause. That's shortest for long lists, but it's a Postgres specific solution.Which is short syntax for:
VALUES
Alternatively, use a
VALUES
expression as subquery. This is standard SQL.UNION ALL
You can do the same with a much more verbose chain of
SELECT ... UNION ALL SELECT ...
. Also standard SQL.Use
UNION ALL
, not justUNION
, which would fold duplicates (which you did not specify!) and is therefore also considerably more expensive.SQL Fiddle.
Note that all numeric literals in the example are cast to
integer
automatically (just digits). For other data types you may have to add explicit type casts.avg()
returns (per documentation):