PostgreSQL Indexing – Creating Composite Index on Bigints

database-designindexpostgresqlschema

I have a User Relations table, which basically stores all the one-directional relations of users.
For example , User A follows user B ; User X blocks User Y

User ids:

the id of users are bigints.
It's not like I think that using ints would really end up in a bottleneck(unless my app grows to more than 2 billion users that is 🙂 ), but as far as a I understand, if I integrate social networks signups, Facebook uses bigints for their user ids, hence I have to keep the id of the users as bigint.

Indexing :

At first I thought about making a composite index of both user (the relating and the related), but since both indexes are bigints, meaning 8 bytes each, wouldn't it be a waste to store such a monstrous index?
I ended up leaving only one column index, but not sure whether it's the wisest decision in this case.

I also thought of a variant with covering index – but in this case the relation_type(follow, block) will not be part of indexing/filtering, so maybe I should include that inside the composite index, but then I would have 8+8+4 bytes of index.
The tradeoffs between space and time are not clear to me in this case.
Would appreciate any thoughts on the matter.

First variant (one-column index):

CREATE TABLE user_relations (
    relating_user_id bigint NOT NULL,
    related_user_id bigint NOT NULL,
    relation_type smallint NOT NULL,
    created_at default current_timestamp,

    PRIMARY KEY (relating_user_id),
    FOREIGN KEY (relation_type) references user_relations_types (id)

    -- will having only one index affect the performance really ? 
    -- should I include the type into the composite type ?
);

Second variant (using covering index and omitting PK) :

    CREATE TABLE user_relations (
        relating_user_id bigint NOT NULL,
        related_user_id bigint NOT NULL,
        relation_type smallint NOT NULL,
        created_at default current_timestamp,

        FOREIGN KEY (relation_type) references user_relations_types (id),
        CREATE UNIQUE INDEX relating_user_related_user_relation_type_idx ON user_relations 
                    (relating_user_id, related_user_id) INCLUDE (relation_type);

        -- should I include the type into the composite type ?
    );

Best Answer

Indexes use space, and a 16 byte key is nothing to worry about.

So you should define the primary key on user_relations on (relating_user_id, related_user_id). If you need to search by relation_type, it won't help to put the column into the INCLUDE list, since such columns cannot be used as filter for an index scan.

I see two options:

  1. In addition to the primary key suggested above, have additional indexes on (relating_user_id, relation_type) and (related_user_id, relation_type).

  2. If you want to have fewer indexes, define the primary key on (relating_user_id, relation_type, related_user_id) and define only one additional index. Perhaps you can live with it if the database allows several different relationships between the same users.

The deciding factor which indexes to define should be the queries you will have to serve frequently and efficiently. Design these queries, and the question will become easier.