PostgreSQL – Index size 20 times bigger than the table itself and very slow queries

postgresql

The table I want to query has only about 3000 rows with 3 columns:

int/varchar(100)/varchar(100)

yet it takes ~30 sec to query them all!

Then I saw that the table size is 1331MB but the indexes size is 25GB!

The table has only one index and that is the pkey.

VACUUM FULL runs every 24 hours, and the table should be pretty static anyway.

My question is: why are the indexes so big (I guess that's why the queries are so slow), and what can I do to fix that?

I hope you have some ideas, because I have none and I haven't found anything on Google.

EDIT: Postgres Version is 8.1.18

Best Answer

Pre-9.0 `VACUUM FULL`

You're on PostgreSQL 8.4 or older, where VACUUM FULL tends to bloat indexes. See this wiki page for details.

Don't run VACUUM FULL as a periodic maintenance task. It's unnecessary and inefficient. This remains true on current versions; it's just not as bad on 9.0 and above. If you feel the need to run VACUUM FULL regularly, then you probably don't have autovacuum turned up far enough and are having table bloat issues. In fact, unless you've changed the FILLFACTOR on the table from its default of 100, a VACUUM FULL is quite counter-productive: it'll compact away all the free space in the table, so subsequent UPDATEs will have to extend the table.

Table extensions are currently one of the poorer performing operations in PostgreSQL, as they're controlled by a single global lock. So if you have tables that fluctuate in size, you really want to avoid constantly compacting and truncating them only to extend them again.

On some unusual workloads it can be worth running a periodic CLUSTER, which orders the table based on an index and effectively REINDEXes it. If you do many UPDATEs on the table, you should set a lower FILLFACTOR for efficiency.
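A minimal sketch of lowering the table fill factor, assuming a hypothetical table `my_table` with primary-key index `my_table_pkey`. The `fillfactor` storage parameter only exists from PostgreSQL 8.2 on and the `CLUSTER ... USING` spelling from 8.3 on, so none of this applies to 8.1. The value 90 is a placeholder, not a recommendation:

```sql
-- Hypothetical names: my_table, my_table_pkey.
-- Ask PostgreSQL to leave roughly 10% of each heap page free,
-- so updated row versions can stay on the same page.
ALTER TABLE my_table SET (fillfactor = 90);

-- The setting only affects pages written from now on.
-- CLUSTER rewrites the table in index order and rebuilds its indexes,
-- so the existing data is repacked as well.
CLUSTER my_table USING my_table_pkey;
```

CLUSTER takes an exclusive lock on the table for the duration of the rewrite, so it is something to schedule, not to run casually.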

If this table is being emptied and re-populated regularly, you should generally use TRUNCATE followed by COPY to fill it back up. If it's big, drop the indexes before the COPY and re-create them afterwards; that speeds up the data load and produces indexes that are more compact and faster.
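The reload pattern could look roughly like this. The table name `my_table`, key column `id`, constraint name `my_table_pkey` and the file path are all made up for illustration, and the sketch assumes no foreign keys reference the table:

```sql
-- Hypothetical names and path; the file must be readable by the server process.
BEGIN;

TRUNCATE my_table;

-- Drop the primary key so COPY does not have to maintain its index row by row.
ALTER TABLE my_table DROP CONSTRAINT my_table_pkey;

COPY my_table FROM '/path/to/data.csv' WITH CSV;

-- Re-create the key; the index is built in one pass over the loaded data.
ALTER TABLE my_table ADD CONSTRAINT my_table_pkey PRIMARY KEY (id);

COMMIT;
```

For a table of a few thousand rows the drop and re-create steps are not worth the trouble; plain TRUNCATE plus COPY is enough.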

For one-off mitigation, CLUSTER the table or REINDEX it.
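As a sketch of the one-off fix, assuming a hypothetical table `my_table` with primary-key index `my_table_pkey`: check the sizes, rebuild, and check again. `pg_relation_size` and `pg_size_pretty` are available from 8.1 on.

```sql
-- Hypothetical names: my_table, my_table_pkey.
-- Heap and index size before the rebuild.
SELECT pg_size_pretty(pg_relation_size('my_table'))      AS heap_size,
       pg_size_pretty(pg_relation_size('my_table_pkey')) AS pkey_size;

-- Rebuild every index on the table from scratch.
-- Writes to the table are blocked while this runs.
REINDEX TABLE my_table;

-- Same query again to see what the rebuild reclaimed.
SELECT pg_size_pretty(pg_relation_size('my_table'))      AS heap_size,
       pg_size_pretty(pg_relation_size('my_table_pkey')) AS pkey_size;
```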

8.1?!?!

After the edit added the version: Holy bleepazoids, batman. 8.1.18? Forget what I said about autovacuum; autovacuum in 8.1 was way too ineffective. Upgrade to a sane version ASAP. You're not even on the current point release of 8.1, 8.1.23, from December 2010. 8.1.18 was released in September 2009! You need to begin your upgrade planning ... well, about two years ago, preferably. Read the release notes for every .0 version between 8.1 and the current release, focusing on the upgrade notes and compatibility notes. Then plan and execute your upgrade. If you don't feel up to managing that on your own, there are people who'll help you with it (I work for one of them), but honestly, the release notes and docs are quite sufficient for most people to do an upgrade themselves without undue pain.

Moving from 8.1 to 8.3 or newer will be your biggest pain point, as PostgreSQL 8.3 dropped a whole bunch of implicit casts that lots of potentially buggy SQL relied on. You'll need to test your application carefully on the newer version. Other changes to be aware of are:

The removal of implicit FROM and in later versions removal of the backwards compatibility parameter for it;
UTF-8 validation improvements in newer versions that can cause older dumps to fail to load until the data is corrected;
The change to standard_conforming_strings by default;
The change of bytea_output to hex

BRIN index

Available since Postgres 9.5 and probably just what you are looking for. Much faster index creation, much smaller index. But queries are typically not as fast. The manual:

BRIN stands for Block Range Index. BRIN is designed for handling very large tables in which certain columns have some natural correlation with their physical location within the table. A block range is a group of pages that are physically adjacent in the table; for each block range, some summary info is stored by the index.

Read on, there is more.
Depesz ran a preliminary test.

The optimum for your case: If you can write rows clustered on run_id, your index becomes very small and creation much cheaper.

CREATE INDEX foo ON run.perception USING brin (run_id, frame)
WHERE run_id >= 266 AND run_id <= 270;

You might even just index the whole table.

Table layout

Whatever else you do, you can save 8 bytes per row that are lost to padding due to alignment requirements by ordering columns like this:

CREATE TABLE run.perception(
  id               bigint NOT NULL PRIMARY KEY
, run_id           bigint NOT NULL
, frame            bigint NOT NULL
, by_anyone        bigint NOT NULL
, by_me            bigint NOT NULL
, owning_p_id      bigint NOT NULL
, subj_id          bigint NOT NULL
, subj_state_frame bigint NOT NULL
, obj_type_set     bigint
, by_s_id          integer
, seq              integer
, by               varchar(45) NOT NULL -- or just use type text
);

Makes your table 79 GB smaller if none of the columns has NULL values. Details:

Configuring PostgreSQL for read performance

Also, you only have three columns that can be NULL. The NULL bitmap occupies 8 bytes for 9 - 72 columns. If only one integer column is NULL, there is a corner case for a storage paradox: it would be cheaper to use a dummy value instead: 4 bytes wasted but 8 bytes saved by not needing a NULL bitmap for the row. More details here:

How do completely empty columns in a large table affect performance?
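To see where padding can occur in the current column order, the catalogs can be queried for each column's length and alignment. This is only an inspection aid, assuming the table is `run.perception` as above:

```sql
-- List columns in physical order with their storage length and alignment.
-- typlen = -1 means variable length (e.g. varchar);
-- typalign: c = 1 byte, s = 2 bytes, i = 4 bytes, d = 8 bytes.
SELECT a.attnum,
       a.attname,
       t.typname,
       t.typlen,
       t.typalign,
       a.attnotnull
FROM   pg_attribute a
JOIN   pg_type t ON t.oid = a.atttypid
WHERE  a.attrelid = 'run.perception'::regclass
AND    a.attnum > 0           -- skip system columns
AND    NOT a.attisdropped
ORDER  BY a.attnum;
```

A 4-byte `integer` followed by an 8-byte-aligned `bigint` is the typical spot where 4 bytes of padding get inserted.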

Partial indexes

Depending on your actual queries it might be more efficient to have these five partial indices instead of the one above:

CREATE INDEX perception_run_id266_idx ON run.perception(frame) WHERE run_id = 266;
CREATE INDEX perception_run_id267_idx ON run.perception(frame) WHERE run_id = 267;
CREATE INDEX perception_run_id268_idx ON run.perception(frame) WHERE run_id = 268;
CREATE INDEX perception_run_id269_idx ON run.perception(frame) WHERE run_id = 269;
CREATE INDEX perception_run_id270_idx ON run.perception(frame) WHERE run_id = 270;

Run one transaction for each.

Removing run_id as index column this way saves 8 bytes per index entry - 32 instead of 40 bytes per row. Each index is also cheaper to create, but creating five instead of just one takes substantially longer for a table that's too big to stay in cache (like @Jürgen and @Chris commented). So that may or may not be useful for you.
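A partial index is only considered when the query's WHERE clause implies the index predicate. A quick way to check is EXPLAIN; the frame range below is a made-up example value:

```sql
-- The condition run_id = 266 matches the predicate of the partial index
-- for that run, and frame is its indexed column, so the planner can use it.
EXPLAIN
SELECT *
FROM   run.perception
WHERE  run_id = 266
AND    frame BETWEEN 1000 AND 2000;  -- made-up range

-- A run_id outside 266..270 matches none of the five predicates,
-- so none of these partial indexes can be used for it.
EXPLAIN
SELECT *
FROM   run.perception
WHERE  run_id = 271
AND    frame BETWEEN 1000 AND 2000;
```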

Partitioning

Based on inheritance - the only option up to Postgres 9.6.
(The new declarative partitioning in Postgres 11 or, preferably, 12 is smarter.)

The manual:

All constraints on all children of the parent table are examined during constraint exclusion, so large numbers of partitions are likely to increase query planning time considerably. So the legacy inheritance based partitioning will work well with up to perhaps a hundred partitions; don't try to use many thousands of partitions.

Bold emphasis mine. Consequently, estimating 1000 different values for run_id, you would make partitions spanning around 10 values each.
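One such inheritance-based partition might be sketched as follows. The child table name, the index name and the range 260 to 269 are made up for illustration, and routing new rows into the right child (via a trigger or in the application) is left out:

```sql
-- Hypothetical child table holding the rows with run_id 260 .. 269.
-- The CHECK constraint is what lets the planner skip this child.
CREATE TABLE run.perception_run_0260_0269 (
   CHECK (run_id >= 260 AND run_id < 270)
) INHERITS (run.perception);

-- Indexes are not inherited; each child needs its own.
CREATE INDEX perception_run_0260_0269_run_frame_idx
   ON run.perception_run_0260_0269 (run_id, frame);

-- Constraint exclusion must be active for inheritance children
-- ('partition' is the default since 8.4).
SET constraint_exclusion = partition;

-- The plan should only touch children whose CHECK allows run_id = 266.
EXPLAIN
SELECT * FROM run.perception WHERE run_id = 266;
```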

`maintenance_work_mem`

I missed that you are already adjusting for maintenance_work_mem in my first read. I'll leave quote and advice in my answer for reference. Per documentation:

maintenance_work_mem (integer)

Specifies the maximum amount of memory to be used by maintenance operations, such as VACUUM, CREATE INDEX, and ALTER TABLE ADD FOREIGN KEY. It defaults to 64 megabytes (64MB). Since only one of these operations can be executed at a time by a database session, and an installation normally doesn't have many of them running concurrently, it's safe to set this value significantly larger than work_mem. Larger settings might improve performance for vacuuming and for restoring database dumps.

Note that when autovacuum runs, up to autovacuum_max_workers times this memory may be allocated, so be careful not to set the default value too high. It may be useful to control for this by separately setting autovacuum_work_mem.

I would only set it as high as needed - which depends on the unknown (to us) index size. And only locally for the executing session. As the quote explains, a too-high general setting can starve the server otherwise, because autovacuum may claim more RAM, too. Also, don't set it much higher than needed, even in the executing session: free RAM might be put to good use in caching data.

It could look like this:

BEGIN;

SET LOCAL maintenance_work_mem = '10GB';  -- quote values with units; depends on resulting index size

CREATE INDEX perception_run_frame_idx_run_266_thru_270 ON run.perception(run_id, frame)
WHERE run_id >= 266 AND run_id <= 270;

COMMIT;

About SET LOCAL:

The effects of SET LOCAL last only till the end of the current transaction, whether committed or not.
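The scoping can be checked with SHOW. The value 1GB here is an arbitrary example:

```sql
SHOW maintenance_work_mem;   -- the session's current setting

BEGIN;
SET LOCAL maintenance_work_mem = '1GB';
SHOW maintenance_work_mem;   -- the raised value, visible only in this transaction
COMMIT;

SHOW maintenance_work_mem;   -- back to the setting from before BEGIN
```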

To measure object sizes:

Measure the size of a PostgreSQL table row

The server should generally be configured reasonably otherwise, obviously.

PostgreSQL 9.3 – Why Indexes Are Bigger Than Their Tables

Possible reasons:

Numerous and probably overlapping indexes on the table; have a look with \d
Bloat due to high update churn can sometimes affect indexes more than tables, depending on update patterns. Examine the size of each individual index to see if it makes sense.
GiST indexes, if used, can be quite large
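To examine the size of each individual index, a catalog query along these lines can help; `my_table` is a placeholder for the table in question:

```sql
-- One row per index on the table, largest first.
SELECT i.indexrelid::regclass                          AS index_name,
       pg_size_pretty(pg_relation_size(i.indexrelid))  AS index_size
FROM   pg_index i
WHERE  i.indrelid = 'my_table'::regclass   -- placeholder table name
ORDER  BY pg_relation_size(i.indexrelid) DESC;
```

Indexes that cover the same leading columns, or that are much larger than the data they index would suggest, are the first candidates to look at.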

Unlike what I originally thought, this is not an issue with TOAST out-of-line storage not being counted, since pg_table_size includes TOAST tables.

Note that if you're concerned about index bloat and decide to REINDEX some or all of the involved indexes, consider setting a non-default FILLFACTOR first if the table is subject to lots of updates (or inserts+deletes). Otherwise you'll take a write performance hit, because the index doesn't have any space to insert new values, so it'll force lots of page splits and be less efficiently structured.
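Setting the index fill factor before rebuilding might look like this. The index name `my_table_some_idx` and the value 70 are placeholders; the B-tree default is 90:

```sql
-- Placeholder index name and value.
-- Leave about 30% of each leaf page free for future entries.
ALTER INDEX my_table_some_idx SET (fillfactor = 70);

-- The new fill factor only takes effect when the index is rebuilt.
REINDEX INDEX my_table_some_idx;
```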

Related Solutions

PostgreSQL Performance – Speeding Up Creation of Partial Index