PostgreSQL – Efficient key-value store in Postgres

google-cloud-sql | hashing | json | postgresql | upsert

I want to use Postgres (11 on Cloud SQL) as an efficient key-value store. I have about 200 GB of dictionaries (average size 10 kB; the structure varies and can be nested). I'm thinking about leveraging the improved hash indexes. Here is the schema:

        CREATE EXTENSION IF NOT EXISTS "uuid-ossp";

        CREATE TABLE IF NOT EXISTS key_val (
            id uuid DEFAULT uuid_generate_v4(),
            value jsonb,
            EXCLUDE using hash (id with =)
        );

        CREATE INDEX IF NOT EXISTS idx_key_val ON key_val USING hash (id);

Get, update and insert are quite simple, but I'm not sure how to implement an efficient upsert.

INSERT INTO key_val VALUES ($1, $2) ON CONFLICT ON CONSTRAINT key_val_id_excl DO UPDATE SET value = ($2)

results in WrongObjectTypeError: ON CONFLICT DO UPDATE not supported with exclusion constraints.

Possible solutions:

  1. Upsert https://stackoverflow.com/a/8702291/336186 (a sketch follows this list)
  2. Use hstore, but I'd lose the jsonb query improvements, and a single dictionary can exceed 20 kB
  3. Use a btree – lookups would be slower (my estimate is 2x-3x) and the index would be much larger
  4. Another option
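
A minimal sketch of option 1 (the retry-loop upsert from the linked answer) adapted to this schema; the function name is illustrative, and note that the hash exclusion constraint raises exclusion_violation rather than unique_violation:

        CREATE OR REPLACE FUNCTION upsert_key_val(k uuid, v jsonb) RETURNS void AS
        $$
        BEGIN
            LOOP
                -- first try to update an existing row
                UPDATE key_val SET value = v WHERE id = k;
                IF FOUND THEN
                    RETURN;
                END IF;
                -- not there, so try to insert; a concurrent insert of the
                -- same id would violate the exclusion constraint
                BEGIN
                    INSERT INTO key_val (id, value) VALUES (k, v);
                    RETURN;
                EXCEPTION WHEN exclusion_violation THEN
                    NULL;  -- loop back and try the UPDATE again
                END;
            END LOOP;
        END;
        $$ LANGUAGE plpgsql;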

EDIT

Info:
Mac, 2.3 GHz Intel Core i9; 16 GB RAM
PostgreSQL 11.4

\d+
public | user_profiles | table | postgres | 16 GB

# number of records
SELECT COUNT(*) FROM user_profiles;
3 095 348

# hash index size
SELECT pg_table_size('idx_user_profiles');
87 334 912

# btree index size
SELECT pg_table_size('idx_user_profiles_btree');
97 705 984

For the btree index:

postgres=# \d user_profiles
                Table "public.user_profiles"
 Column | Type  | Collation | Nullable |      Default
--------+-------+-----------+----------+--------------------
 key    | uuid  |           | not null | uuid_generate_v4()
 value  | jsonb |           |          |
Indexes:
    "user_profiles_pkey" PRIMARY KEY, btree (key)


postgres=# SELECT * FROM user_profiles WHERE key = '2cfc4dbf-a1b9-46b3-8c15-a03f51dde890';
Time: 3.126 ms

INSERT INTO user_profiles (value) VALUES ('{"type": "_app_retail","user_id": "a628.......    0  ]}');
INSERT 0 1
Time: 4.496 ms # with standard btree index, no optimization

For the hash index:

\d+ user_profiles
                                    Table "public.user_profiles"
 Column | Type  | Collation | Nullable |      Default       | Storage  | Stats target | Description
--------+-------+-----------+----------+--------------------+----------+--------------+-------------
 key    | uuid  |           | not null | uuid_generate_v4() | plain    |              |
 value  | jsonb |           |          |                    | extended |              |
Indexes:
    "idx_user_profiles" hash (key)


INSERT INTO user_profiles...);
INSERT 0 1
Time: 1.690 ms
# key doesn't exist
SELECT * FROM user_profiles WHERE key = '2cfc4dbf-a1b9-46b3-8c15-a03f51dde891'; 
Time: 0.514 ms
# key exists
postgres=# SELECT * FROM user_profiles WHERE key = '2cfc4dbf-a1b9-46b3-8c15-a03f51dde890';
Time: 1.747 ms

To confirm this, I wrote a Python script using asyncpg.

import asyncio
import json
import os
import uuid
from time import time

import asyncpg
# pip(env) install asyncpg


TABLE_NAME = "user_profiles"
EXAMPLE_PROFILE = {
    "type": "whatever",
    "user_id": "8378c54f-3a39-41af-b4ab-5a514aa1b941",
    "same_action_count": 8,
    "active_time": 156.36,
    "is_premium": 0.0,
    "early_premium": 0.0,
    "referral_type": "empty",
    "age": 200,
    "age_is_minor": 0,
    "age_is_student": 0,
    "age_is_young_adult": 0,
    "age_is_mid": 1,
    "age_is_senior": 0,
    "integral_balance_w": 0.0,
    "average_balance": 0.0,
    "integral_balance": 0.0,
    "sum_total_in": 0.0,
    "third_party_company": [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
}

async def create_connection():
    conn = await asyncpg.connect(
        user="postgres",
        password=os.environ.get("SECRET_DB_PASSWORD", "postgres"),
        database="postgres",
        host=os.environ.get("DB_HOST", "127.0.0.1"),
    )
    # register JSON codecs for the json type (the jsonb column itself is
    # encoded to/from str by asyncpg's default codec)
    await conn.set_type_codec("json", encoder=json.dumps, decoder=json.loads, schema="pg_catalog")
    return conn


async def create_table(conn, table_name):
    await conn.execute(
        f"""
        CREATE EXTENSION IF NOT EXISTS "uuid-ossp";

        CREATE TABLE IF NOT EXISTS {table_name} (
            id uuid DEFAULT uuid_generate_v4(),
            value jsonb,
            EXCLUDE using hash (id with =)
        );

        CREATE INDEX IF NOT EXISTS idx_{table_name} ON {table_name} USING hash (id);
    """
    )


async def run():
    conn = await create_connection()
    await create_table(conn, TABLE_NAME)
    profile = EXAMPLE_PROFILE.copy()
    for i in range(100):
        profile["user_id"] = str(uuid.uuid4())
        hash_id = str(uuid.uuid4())
        str_profile = json.dumps(profile)
        tic = time()
        _ = await conn.execute(f"INSERT INTO {TABLE_NAME} VALUES ($1, $2)", hash_id, str_profile)
        toc = time()
        print(i, (toc - tic) * 1000)

    await conn.close()


if __name__ == "__main__":
    asyncio.run(run())

The hash index averages 1.2 ms per insert, where the btree is about 3 ms.

We can see here that the hash index performs better than the btree index, and the performance difference is in the range of 10 to 22%. In some other workloads we have seen better performance, such as with hash indexes on varchar columns, and it has even been reported in the community that there is a performance improvement in the range of 40-60% when hash indexes are used for unique index columns.

http://amitkapila16.blogspot.com/2017/03/hash-indexes-are-faster-than-btree.html

Here is where hash indexes start to shine: a simple table with two columns, a serial and a text column, and 319,894,674 records; the table size is 23 GB, the serial column is the primary key (no good reason, just added to it), and the size of the PK is 6852 MB. Without an index on the sha1 column, the execution time is 4 minutes (thanks to the parallel workers).
The size of the B-tree index: 20 GB. The size of the hash index: 8192 MB (8 GB), less than half of the B-tree 🙂, and the execution time is roughly the same as with the btree.
Another advantage of smaller indexes is that they fit better in memory, which means fewer disk reads: "Buffers: shared hit=2" vs "Buffers: shared hit=6".

https://medium.com/@jorsol/postgresql-10-features-hash-indexes-484f319db281

Best Answer

  • Number one: PostgreSQL will never be great as a key-value store if you have many UPDATEs. Workloads with many UPDATEs are just hard for PostgreSQL's architecture.

    Make sure that you create your table with a fillfactor way below 100, so that you can make use of HOT updates. This is the only way you can survive workloads with many UPDATEs. Make sure that there is no index on value, otherwise HOT updates won't work.

    Also, since there will be considerable churn on the TOAST table of that table, make sure to set autovacuum_work_mem high and toast.autovacuum_vacuum_cost_delay = 0 on that table, so that autovacuum has a chance to keep up.
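
    For example (a sketch: the fillfactor of 70 and the 1GB value are assumptions to be tuned, and on Cloud SQL autovacuum_work_mem is set as a database flag rather than via ALTER SYSTEM):

        -- leave free space in each page so updated row versions can stay
        -- on the same page (HOT); 70 is a starting point, tune to taste
        ALTER TABLE key_val SET (fillfactor = 70);
        -- let autovacuum process the TOAST table without cost-based throttling
        ALTER TABLE key_val SET (toast.autovacuum_vacuum_cost_delay = 0);
        -- autovacuum_work_mem is an instance-wide setting, not per-table
        ALTER SYSTEM SET autovacuum_work_mem = '1GB';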

  • Number two: Don't use hash indexes. I don't buy for a second that they are two or three times as fast. You'd have to benchmark this to convince me.

    With a B-tree index, your problem would simply go away: just put a primary key constraint on id.
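
    A sketch of that schema and the upsert it enables (same placeholders as in the question):

        CREATE TABLE IF NOT EXISTS key_val (
            id    uuid PRIMARY KEY DEFAULT uuid_generate_v4(),
            value jsonb
        );

        INSERT INTO key_val (id, value) VALUES ($1, $2)
        ON CONFLICT (id) DO UPDATE SET value = EXCLUDED.value;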

  • Number three: storing the dictionaries as one large jsonb means that you have to store a new copy of the whole document whenever you update even a single value in it.

    You might be better off modeling this in a relational way, as sketched below. Of course, that is only a good option if there are no nested structures in the jsonb.
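
    For instance, a sketch using the flat keys from EXAMPLE_PROFILE in the question (the table name and column types are assumptions inferred from the sample values):

        CREATE TABLE user_profiles_rel (
            user_id             uuid PRIMARY KEY,
            type                text,
            same_action_count   integer,
            active_time         real,
            is_premium          real,
            age                 smallint,
            third_party_company integer[]
            -- ...one column per dictionary key
        );

        -- updating one attribute now rewrites a single narrow row,
        -- not a 10 kB jsonb document
        UPDATE user_profiles_rel SET active_time = 156.36 WHERE user_id = $1;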