Postgresql – Postgres DB is extremely fast with a query with a certain parameter, but extremely slow for another

postgresql query

I have a table called stats, where I store every action on a certain website (basically when the user clicks a link, data is saved in the db). The data is modeled with PostgreSQL as

(
id integer NOT NULL DEFAULT nextval('stats_id_seq'::regclass),
company integer NOT NULL,
layout integer NOT NULL,
browsername character varying(50) COLLATE pg_catalog."default",
lang character varying(5) COLLATE pg_catalog."default" NOT NULL,
...)

There are other fields that I don't think are relevant. Basically what I want to do is see which are the most used browsers for a certain company, by counting how many pages are visited by that browser and how many users there are for a given browser.

My query is:

select browsername, count(DISTINCT phpsessid) as visitors, count(*) as pages
from public.stats
where company=1 and createdate>'2017-01-01' and createdate<'2018-12-12' 
group by browsername order by visitors desc;

The stats table contains around 80 million rows. Now, this query is extremely fast if I set company=1 or company=222 (it's the company ID): it takes less than half a second to fetch around 20k rows. However, it is EXTREMELY slow if I set company=13549, for instance (we're talking about literal hours here). Obviously something is wrong, either in the data modeling or in the way I query.

How come there's such a difference for different companies? The DB was not done by me, so I apologize if I left something useful out, and feel free to ask.

The indexes are:

CREATE INDEX stats_new_company_13549_index
ON public.stats USING btree
(company, createdate)
TABLESPACE pg_default
WHERE company = 13549;

CREATE INDEX stats_new_company_14863_index
ON public.stats USING btree
(company, createdate)
TABLESPACE pg_default
WHERE company = 14863;

CREATE INDEX stats_new_company_createdate_cet_index
ON public.stats USING btree
(company, date(timezone('CET'::text, createdate)))
TABLESPACE pg_default;

CREATE INDEX stats_new_company_createdate_index_1
ON public.stats USING btree
(company, createdate)
TABLESPACE pg_default;

Here's the plan for the fast query: https://explain.depesz.com/s/uKmJ

And here's the plan for the slow one: https://explain.depesz.com/s/wysA

Just by looking at the plans you can see that the second one took several minutes even to run the EXPLAIN.

I also noticed that after running the EXPLAIN a couple of times, it finished in a reasonable amount of time on the third run. From that moment onward, the query itself also ran drastically faster, going from hours to a mere 4 seconds.
This is the second time it has happened, and I swear I'm not crazy. If I change the company ID once again, the query takes hours again. I'm at a loss here: is there some index problem?

Best Answer

The fast query retrieves and sorts 26229 rows. For that small number of rows the sort can be done in memory, so it is going to be quick: retrieving the data takes only ~500 ms and the sort then takes about 50 ms.

The slow query retrieves 560135 rows (about 20 times as many as the first query), but the time it took - 149939 ms - seems quite slow. Maybe your table (or index) is bloated: the number of blocks that had to be read for that number of rows is way too high, I think.

You can run vacuum full analyze public.stats; and see if performance gets better after that.
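Before reaching for VACUUM FULL, which rewrites the whole table and holds an exclusive lock on it while it runs, it may be worth checking whether bloat is plausible at all. A minimal sketch, assuming the table is public.stats as in the question; the statistics view and size functions are standard PostgreSQL, but what counts as "too many" dead tuples is a judgment call:

```sql
-- Live vs. dead tuples and the last (auto)vacuum / analyze times for the table.
SELECT relname, n_live_tup, n_dead_tup,
       last_vacuum, last_autovacuum, last_analyze
FROM   pg_stat_user_tables
WHERE  schemaname = 'public'
AND    relname = 'stats';

-- On-disk size of the table itself and of all its indexes together.
SELECT pg_size_pretty(pg_table_size('public.stats'))   AS table_size,
       pg_size_pretty(pg_indexes_size('public.stats')) AS indexes_size;
```

A large n_dead_tup relative to n_live_tup, or a size far beyond what 80 million rows of this shape should need, would support the bloat theory.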

Or maybe you simply have a very slow hard disk.

The sorting was done on disk, but that only added another 3 seconds to the total runtime.

The question of why the same query is sometimes slow and sometimes fast (especially when run a second time) is more often than not answered with: caching effects. When you re-run the second query you will probably see a lot of the "shared read=318147" information turn into "shared hit=...", which means those blocks were already in the cache.
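One way to check the caching explanation is to run the slow query twice in a row with buffer statistics enabled and compare the two plans. This is only a sketch that reuses the query from the question with company=13549:

```sql
-- Run this twice back to back and compare the "Buffers:" lines.
-- "shared read" = blocks fetched from disk (or the OS cache),
-- "shared hit"  = blocks already present in PostgreSQL's shared buffers.
EXPLAIN (ANALYZE, BUFFERS)
SELECT browsername, count(DISTINCT phpsessid) AS visitors, count(*) AS pages
FROM   public.stats
WHERE  company = 13549
AND    createdate > '2017-01-01'
AND    createdate < '2018-12-12'
GROUP  BY browsername
ORDER  BY visitors DESC;
```

If the second run shows mostly "shared hit" and a much lower runtime, the difference between runs comes from the cache rather than from the plan.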

Related Solutions

Postgresql – Postgres Index scan forward vs backward = speed difference of 357X slower

Since I like replacing aggregate functions by old-fashioned self-joins and NOT EXISTS clauses, here is my attempt:

SET search_path='tmp';

DROP TABLE IF EXISTS tmp.changes CASCADE;
CREATE TABLE tmp.changes
        ( id integer NOT NULL PRIMARY KEY
        , fullname varchar
        , issuer varchar
        , rsymbol varchar
        , industry varchar
        , activity INTEGER NOT NULL
        , shareschange FLOAT
        , sharespchange FLOAT
        , mfiled FLOAT
        );

        -- lacking information from the OP
        -- I can only presume a flat distribution.
INSERT INTO tmp.changes(id, activity, shareschange,sharespchange,mfiled )
SELECT nm.*
        , (random() *20)::integer -- activity
        , random() *10000         -- shareschange
        , random() *100           -- sharespchange
        , random() *100000        -- mfiled
FROM generate_series(1,1000000) nm
        ;

ALTER TABLE tmp.changes
        ALTER shareschange
        SET STATISTICS 1000
        ;
ALTER TABLE tmp.changes
        ALTER mfiled
        SET STATISTICS 1000
        ;

VACUUM ANALYZE tmp.changes
        ;


CREATE INDEX changes_mfiled_shareschange
    ON tmp.changes(mfiled,shareschange)
        ;

EXPLAIN ANALYZE
SELECT initcap(ch.fullname) AS some_name1
     , initcap(ch.issuer) AS some_name2
     , upper(ch.rsymbol) AS some_name3
     , initcap(ch.industry) AS some_name4
     , ch.activity
     , to_char(ch.shareschange,'FM9,999,999,999,999,999') AS some_name5
     , ch.sharespchange || '%' AS some_name6
FROM   changes ch
WHERE  ch.activity IN (4,5)
        -- NOTE: the subquery is *not* correlated.
        -- [I had expected a subselect of nx.activity IN (4,5)
        -- like in the main query. ]
AND    NOT EXISTS (SELECT * FROM changes nx
        WHERE nx.mfiled > ch.mfiled
        )
ORDER  BY ch.shareschange ASC
LIMIT  15
        ;
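For comparison, the aggregate form that the NOT EXISTS clause stands in for would presumably look like the sketch below. It runs against the same tmp.changes test table; the shortened select list is my own simplification, not part of the original query.

```sql
-- Aggregate variant: keep only rows whose mfiled equals the overall maximum.
EXPLAIN ANALYZE
SELECT ch.id
     , ch.activity
     , ch.shareschange
FROM   changes ch
WHERE  ch.activity IN (4,5)
AND    ch.mfiled = (SELECT max(nx.mfiled) FROM changes nx)
ORDER  BY ch.shareschange ASC
LIMIT  15
        ;
```

The two forms agree as long as mfiled is never NULL, which holds for the generated test data. With NULLs present, the NOT EXISTS form would also return rows whose mfiled is NULL, while the max() form would not.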

Postgresql – Postgres multiple joins slow query, how to store default child record

You write:

Each customer can have multiple sites, but only one should be displayed in this list.

Yet, your query retrieves all rows. That would be a point to optimize. But you also do not define which site is to be picked.

Either way, it does not matter much here. Your EXPLAIN shows only 5026 rows for the site scan (5018 for the customer scan). So hardly any customer actually has more than one site. Did you ANALYZE your tables before running EXPLAIN?

From the numbers I see in your EXPLAIN, indexes will give you nothing for this query. Sequential table scans will be the fastest possible way. Half a second is rather slow for 5000 rows, though. Maybe your database needs some general performance tuning?

Maybe the query itself is faster, but "half a second" includes network transfer? EXPLAIN ANALYZE would tell us more.

If this query is your bottleneck, I would suggest you implement a materialized view.

After you provided more information I find that my diagnosis pretty much holds.

The query itself needs 27 ms. Not much of a problem there. "Half a second" was the kind of misunderstanding I had suspected. The slow part is the network transfer (plus ssh encoding / decoding, possibly rendering). You should only retrieve 100 rows at a time; that would solve most of it, even if it means executing the whole query every time.

If you go the route with a materialized view like I proposed, you could add a serial number without gaps to the table, plus an index on it, by adding a column row_number() OVER (<your sort criteria here>) AS mv_id.

Then you can query:

SELECT *
FROM   materialized_view
WHERE  mv_id >= 2700
AND    mv_id <  2800;

This will perform very fast. LIMIT / OFFSET cannot compete: it needs to compute the whole table before it can sort and pick 100 rows.
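Building such a view could look roughly like the following. This is a sketch, not the original setup: it assumes PostgreSQL 9.3 or later (where CREATE MATERIALIZED VIEW exists), and the table customer with columns name and customer_id is hypothetical. On older versions a plain table filled with CREATE TABLE ... AS serves the same purpose.

```sql
-- Hypothetical source table and sort columns; substitute the real query.
CREATE MATERIALIZED VIEW materialized_view AS
SELECT row_number() OVER (ORDER BY c.name, c.customer_id) AS mv_id
     , c.*
FROM   customer c;

-- Index that makes the mv_id range lookups cheap.
CREATE UNIQUE INDEX materialized_view_mv_id_idx
    ON materialized_view (mv_id);

-- Re-run after the underlying data changes; mv_id is renumbered without gaps.
REFRESH MATERIALIZED VIEW materialized_view;
```

The trade-off is that the view is only as fresh as its last refresh.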

pgAdmin timing

When you execute a query from the query tool, the message pane shows something like:

Total query runtime: 62 ms.

And the status line shows the same time. I quote pgAdmin help about that:

The status line will show how long the last query took to complete. If a dataset was returned, not only the elapsed time for server execution is displayed, but also the time to retrieve the data from the server to the Data Output page.

If you want to see the time on the server, you need to use SQL EXPLAIN ANALYZE, the built-in Shift + F7 keyboard shortcut, or Query -> Explain analyze. Then, at the bottom of the explain output you get something like this:

Total runtime: 0.269 ms
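The same server-side figure can be obtained from any client by prefixing the statement with EXPLAIN ANALYZE. A minimal sketch, where some_table and its id column are hypothetical placeholders:

```sql
-- EXPLAIN ANALYZE executes the statement on the server and reports the
-- measured time at the end of the plan; the result rows themselves are
-- not sent to the client, so transfer and rendering time are excluded.
EXPLAIN ANALYZE
SELECT *
FROM   some_table   -- hypothetical table name
WHERE  id = 1;
```

Note that the statement really is executed, so a data-modifying statement should be wrapped in a transaction that is rolled back afterwards.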
