Playing a bit with pg_buffercache, I could get answers to some of your questions.
- This is quite obvious, but the results for (5) also show that the answer is YES.
- I have yet to set up a good example for this; for now it is more yes than no :) (See my edit below - the answer is NO.)
- Since the planner is what decides whether to use an index or not, we can say YES, it decides caching (but this is more complicated).
- The exact details of caching could be derived from the source code. I couldn't find too much on this topic, except this one (see the author's answer, too). However, I'm pretty sure that this again is far more complicated than a simple yes or no. (Again, my edit gives some idea - since the cache size is limited, those 'sensible' indexes compete for the available space. If there are too many of them, they will keep evicting each other from the cache - so the answer is rather NO.)
- As a simple query with `pg_buffercache` shows, the answer is a definitive YES. It is worth noting that temporary table data are not cached here.
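To get a feel for the space those indexes compete for, you can check the cache size and how much of it is currently unassigned (a quick sketch using `pg_buffercache`; unused buffers show a NULL `relfilenode`):

```sql
-- total size of the shared buffer cache the indexes compete for
SHOW shared_buffers;

-- buffers not currently assigned to any relation
SELECT count(*) AS free_buffers
FROM pg_buffercache
WHERE relfilenode IS NULL;
```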
EDIT
I've found Jeremiah Peschka's terrific article about table and index storage. With the information from there, I could answer (2) as well. I set up a small test so you can check these things yourself.
-- we will need two extensions
CREATE EXTENSION pg_buffercache;
CREATE EXTENSION pageinspect;
-- a very simple test table
CREATE TABLE index_cache_test (
id serial
, blah text
);
-- I am a bit megalomaniac here, but I will use this for other purposes as well
INSERT INTO index_cache_test
SELECT i, i::text || 'a'
FROM generate_series(1, 1000000) a(i);
-- let's create the index to be cached
CREATE INDEX idx_cache_test ON index_cache_test (id);
-- now we can have a look at what is cached
SELECT c.relname,count(*) AS buffers
FROM
pg_class c
INNER JOIN pg_buffercache b ON b.relfilenode = c.relfilenode
INNER JOIN pg_database d ON (b.reldatabase = d.oid AND d.datname = current_database())
GROUP BY c.relname
ORDER BY 2 DESC LIMIT 10;
relname | buffers
----------------------------------+---------
index_cache_test | 2747
pg_statistic_relid_att_inh_index | 4
pg_operator_oprname_l_r_n_index | 4
... (others are all pg_something, which are not interesting now)
-- this shows that the whole table is cached and our index is not in use yet
-- now we can check which row is where in our index
-- in the ctid column, the first number shows the page, so
-- all rows starting with the same number are stored in the same page
SELECT * FROM bt_page_items('idx_cache_test', 1);
itemoffset | ctid | itemlen | nulls | vars | data
------------+---------+---------+-------+------+-------------------------
1 | (1,164) | 16 | f | f | 6f 01 00 00 00 00 00 00
2 | (0,1) | 16 | f | f | 01 00 00 00 00 00 00 00
3 | (0,2) | 16 | f | f | 02 00 00 00 00 00 00 00
4 | (0,3) | 16 | f | f | 03 00 00 00 00 00 00 00
5 | (0,4) | 16 | f | f | 04 00 00 00 00 00 00 00
6 | (0,5) | 16 | f | f | 05 00 00 00 00 00 00 00
...
64 | (0,63) | 16 | f | f | 3f 00 00 00 00 00 00 00
65 | (0,64) | 16 | f | f | 40 00 00 00 00 00 00 00
-- with the information obtained, we can write a query which is supposed to
-- touch only a single page of the index
EXPLAIN (ANALYZE, BUFFERS)
SELECT id
FROM index_cache_test
WHERE id BETWEEN 10 AND 20 ORDER BY id
;
Index Scan using idx_cache_test on index_cache_test (cost=0.00..8.54 rows=9 width=4) (actual time=0.031..0.042 rows=11 loops=1)
Index Cond: ((id >= 10) AND (id <= 20))
Buffers: shared hit=4
Total runtime: 0.094 ms
(4 rows)
-- let's have a look at the cache again (the query remains the same as above)
relname | buffers
----------------------------------+---------
index_cache_test | 2747
idx_cache_test                   |       4
...
-- and compare it to a bigger index scan:
EXPLAIN (ANALYZE, BUFFERS)
SELECT id
FROM index_cache_test
WHERE id <= 20000 ORDER BY id
;
Index Scan using idx_cache_test on index_cache_test (cost=0.00..666.43 rows=19490 width=4) (actual time=0.072..19.921 rows=20000 loops=1)
Index Cond: (id <= 20000)
Buffers: shared hit=4 read=162
Total runtime: 24.967 ms
(4 rows)
-- this already shows that something was in the cache and further pages were read from disk
-- but to be sure, a final glance at cache contents:
relname | buffers
----------------------------------+---------
index_cache_test | 2691
idx_cache_test                   |      58
-- note that some of the table pages have disappeared (evicted to make room)
-- but, more importantly, a bigger part of our index is now cached
All in all, this shows that indexes and tables can be cached page by page, so the answer to (2) is NO.
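You can also see the page-by-page behaviour from the other direction: on newer PostgreSQL versions (9.4+), the `pg_prewarm` extension lets you load an explicit block range of a relation into the cache. A sketch, reusing the index from above:

```sql
CREATE EXTENSION pg_prewarm;

-- load only blocks 0..9 of the index into shared buffers
SELECT pg_prewarm('idx_cache_test', 'buffer', 'main', 0, 9);
```

Checking `pg_buffercache` afterwards should show exactly those pages cached.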
And a final one to illustrate temporary tables not being cached in shared buffers:
CREATE TEMPORARY TABLE tmp_cache_test AS
SELECT * FROM index_cache_test ORDER BY id FETCH FIRST 20000 ROWS ONLY;
EXPLAIN (ANALYZE, BUFFERS) SELECT id FROM tmp_cache_test ORDER BY id;
-- checking the buffer cache now shows no sign of the temp table
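The reason is that temporary tables are cached in each backend's private `temp_buffers`, not in `shared_buffers`, and `pg_buffercache` only sees the shared cache. You can check (and, per session, raise) that private cache size:

```sql
SHOW temp_buffers;

-- can only be changed before the first use of a temp table in the session
SET temp_buffers = '64MB';
```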
If I were you, I would run a trace specific to hits on that table. It shouldn't be overly intensive, since you are restricting it to just queries against that table from your application. Capture just the minimum set of events needed by the DTA (Database Tuning Advisor). Run it for a day here and a day there, making sure you get some end-of-week days and some end-of-month days. Then run the whole lot through the DTA.
Here is why: I'm willing to bet that you have specific combinations of columns that come up more often than others. You can create more complex indexes based on that information. You might also find that you can create some correlated statistics - that is, statistics covering more than one column. For example, creating a statistic on City and State together may improve queries against those two columns.
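A multi-column statistic is a one-liner; the table and column names below are just placeholders for whatever your trace turns up:

```sql
-- hypothetical table/column names; adjust to your schema
CREATE STATISTICS Stat_City_State
ON dbo.CustomerAddress (City, State);
```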
However, make sure you don't create too many indexes. On a table that large, I'm guessing you do a fair number of writes, and every additional index will slow them down. Of course, you may do most of your writes during a batch process.
Also, make sure you set up an automatic process to update your statistics periodically. With that many rows, the statistics aren't going to update on their own very often - by default, only once 500 + 20% of the rows have changed, and at 500 million rows that is a LOT.
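A scheduled job along these lines (the table name is hypothetical) keeps them fresh:

```sql
-- refresh statistics for one big table with a full scan
UPDATE STATISTICS dbo.BigTable WITH FULLSCAN;

-- or, more coarsely, for every table in the database
EXEC sp_updatestats;
```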
Best Answer
(Note: I'm not a deep expert in the Pg guts. This is only my understanding. You should really read the mvcc and internals sections of the manual).
Yes, initially the row with `id = 1` is still in the index, despite the index's `WHERE` clause excluding it now that it has been updated with `archived = true`. At some point down the track the index entry will be purged by `VACUUM` or autovacuum, freeing the space.

The reason for this is PostgreSQL's MVCC design. Other concurrent transactions might still be able to see the old version of the row with `id = 1, archived = false` - either `READ COMMITTED` transactions with a long-running statement, or `SERIALIZABLE` transactions that don't see changes committed after they began. If the index entry were removed when the row was updated, those transactions wouldn't find the row in index scans and would produce the wrong result.

Technically, what's actually happening is that the old row still exists; it has just had an `xmax` set so newer transactions ignore it when they see it. After the update, the index points to that row, same as before. A new row with `archived = true` has been inserted with an `xmin` above the old row's `xmax`, so any given transaction sees either the old or the new version, never both. That new row is not added to the index.

Eventually there will be no open transaction that can still see the old row, so `VACUUM` will come by and remove the old row and its index entry, freeing the space for re-use.

Note that the index itself has no row visibility information. You can - and often do - have many versions of a row, each with index entries. When a transaction looks for matching index entries, it reads all of the versions (unless it can tell from the visibility map that it doesn't need to) and ignores all but the one whose `xmin` and `xmax` make it visible to that transaction.

This is one of many good reasons to ensure that autovacuum runs frequently. It reduces the number of dead index entries and dead rows you have lying around wasting space and I/O.