Postgresql – How to speed up an ASC sort on a column that only holds an integer between 0 and 9 across multiple millions of rows

order-bypostgresql

CREATE TABLE foo (
  id  TEXT    NOT NULL,
  bar INTEGER NOT NULL DEFAULT 0,
  PRIMARY KEY (id)
);

CREATE INDEX foo_bar_idx ON foo(bar) WHERE NOT bar = 0;

I can sort on the bar column in DESC order about 3 times faster than I can sort on the bar column in ASC order. I assume an index on the column isn't helping with the sort since, relatively speaking w/ millions of rows, most of the values are the same.

The default value for the bar column is 0, but I only care about rows that have a value between 1 and 9, so my query includes a where clause for that. Not sure if the partial index is helping.

SELECT id FROM foo WHERE bar > 0 ORDER BY bar DESC;
SELECT id FROM foo WHERE bar > 0 ORDER BY bar ASC;

Across 9M rows, explain analize shows the ORDER BY DESC taking less than a second, and ORDER BY ASC taking a little over 3 seconds. Approximately 117K of the 9M rows get filtered by the WHERE clause.

Is the ORDER BY DESC faster because it doesn't necessarily have to worry about the 0 values? Any way to speed up the ORDER BY ASC query? Or… should I just be happy w/ a 3 second query time for a table w/ 9M rows in it? I'm continually adding rows, so I'm worried about the query time increasing over time…

Best Answer

I would recommend using the same syntax for all of your WHERE clauses (your index would build for negative values of bar, but you never SELECT those)

CREATE INDEX has an implicit order of ASC, so it unlikely that not specifying the indexing order alone is the source of your problem (as you stated the DESC sort is faster).

Naively, I'd also recommend that you include the id column in your index, but this may negatively affect insert/index rebuild performance, as well as overall disk and memory usage.

CREATE INDEX foo_bar_idx
  ON foo(bar ASC, id ASC) 
  WHERE bar > 0

You could try converting all zeros to NULL (and setting NULL as the default), then changing your WHERE clauses to WHERE bar IS NOT NULL. So your create table code would look like

CREATE TABLE foo (
  id TEXT NOT NULL,
  bar INTEGER NULL DEFAULT NULL,
  PRIMARY KEY (id)
);

Once you have NULLs established, you could alternatively play around with the NULLS FIRST / LAST parameters in your index, rather than using a WHERE clause, but that is unlikely to improve performance.