PostgreSQL – Custom Hash Function and C Type Implementation

database-internalspostgresqlpostgresql-extensions

I am creating a custom type for PostgreSQL. The type is as follow:

typedef struct {
    unsigned long prefix; 
    unsigned long long id; 
} CustomType;

I have build all the in and out functions. Comparison functions and such, but I don't understand how to build the hash function. Since I will need it for hash joins. I had a look at the hash_numeric function in https://doxygen.postgresql.org/backend_2utils_2adt_2numeric_8c.html#a1358689e8be944cad3a3fad87eb237f1 and don't quite understand it.

How does a hash function works and what is its purpose?

Best Answer

The hash function in this context is used to tranform the set of all possible type values into a substantially smaller set of their hash values. The hash join works by separating values it needs to compare for equality into buckets based on their hash values. Values producing the same hash and therefore falling in the same bucket have a chance to be equal, while values in different buckets cannot be equal and thus can be removed from consideration.

One trivial example of a 10-bucket hash for integer numbers would be a function returning the least significant digit of its parameter. You'll probably be able to use the built-in hash_any function in your case, but read the caveats about HASHES in the manual.