I have been having a hard time with this query, I have a webpage which is supposed to run this query a dozen of times, and with an execution time up to 40 – 50 seconds per query, you can see how this is becoming quite problematic as the database keeps growing.
Query:
select distinct(scanid)
from misspelled
where word = ?
knowing that the table 'misspelled' can grow up to a dozen million rows, is there any way to make this work ?
Thanks, here are more details:
Query:
select distinct(scanid)
from misspelled
where word = ?
Explain Analyze:
HashAggregate (cost=76018.68..76018.79 rows=11 width=8) (actual time=38359.639..38359.823 rows=1071 loops=1)
Group Key: scanid
-> Bitmap Heap Scan on misspelled (cost=1014.12..75918.78 rows=39960 width=8) (actual time=49.946..38319.196 rows=37993 loops=1)
Recheck Cond: (word = 'Facebook*'::text)"
Heap Blocks: exact=28920
-> Bitmap Index Scan on test_scanid (cost=0.00..1004.13 rows=39960 width=0) (actual time=40.541..40.541 rows=37993 loops=1)
Index Cond: (word = 'Facebook*'::text)
Planning time: 197.561 ms
Execution time: 38360.440 ms
Also available at: http://explain.depesz.com/s/7ttn
Table definition:
CREATE TABLE misspelled
(
word text,
url text,
occnum BIGINT,
proper BOOLEAN,
scanid BIGINT,
CONSTRAINT misspelled_scanid_fkey FOREIGN KEY (scanid)
REFERENCES scans (scanid) MATCH SIMPLE
ON UPDATE NO ACTION ON DELETE NO ACTION,
CONSTRAINT misspelled_word_url_scanid_key UNIQUE (word, url, scanid)
);
CREATE INDEX test_scanid
ON misspelled
USING btree
(word COLLATE pg_catalog."default", scanid);
Version 9.4.0
If you need anything else please let me know.
Thanks
Best Answer
You have almost 40000 rows where
word='Facebook*'
. I suppose that these rows are somewhat randomly scattered on the whole table. This means that Postgres has to read tens of thousands of random blocks from a disk, which might be slow.You can try to cluster rows that have the same word together. You can use the command
cluster misspelled using test_scanid;
. This would take some time and block the table until done, so plan accordingly.