PostgreSQL – Prevent timing attacks in Postgres

Tags: postgresql, security

I'm looking into accepting an API token from a user, and I would like to prevent timing attacks. The conventional wisdom seems to be that one should hash the token server-side, so that whatever the comparison leaks is information about the hash rather than the token itself. If you have some other identifier, you can look the record up by that and then do a constant-time in-memory comparison, as with Rails' secure_compare. In my case, though, I was planning on using the token itself to look up the user. I'm wondering if, by any chance, Postgres has some facility for doing constant-time comparisons when looking up records. Something that might look like:

SELECT * FROM users WHERE secure_compare(token, 'abcdef')

Best Answer

Damn, my sleep() idea was oversimplistic!

SELECT * FROM users WHERE secure_compare(token, 'abcdef')
or...
SELECT * FROM users WHERE token = 'abcdef'

You'll want to index that column to make it fast, which means timing information can leak not just from the string comparison but also from the btree index walk. Can we actually measure that?

-- Test table: 100k hex SHA-256 digests; substr(...,3) strips the
-- leading '\x' from the bytea-to-text cast.
CREATE UNLOGGED TABLE foo (t TEXT PRIMARY KEY);
INSERT INTO foo SELECT substr(sha256(n::TEXT::BYTEA)::TEXT,3)
FROM generate_series(100000,199999) n;

...and some Python to time the queries:

import time

import matplotlib.pyplot as plt
import psycopg2

# Connection details are illustrative; point this at your own database.
conn = psycopg2.connect("dbname=test")
cur = conn.cursor()

def gettime(sql, N=300):
    """Average execution time of `sql` over N runs."""
    t = time.time()
    for _ in range(N):
        cur.execute(sql)
    return (time.time() - t) / N

for sql in ("select * from foo where t='3000000000000000000000000000000000000000000000000000000000000000';",
            "select * from foo where t='z000000000000000000000000000000000000000000000000000000000000000';",
            "select * from foo where t='3bb78535cc9555ff19fe3556aaa41c78a0a45c64d49ba2bc564507648a000000';"):
    print(sql)
    gettime(sql)  # warm-up run, result discarded
    r = [gettime(sql) for _ in range(1000)]
    plt.hist(r, bins=30, label=sql)
plt.legend()
plt.show()

Oh yes, definitely a timing difference: about a microsecond between the case with no rows (hexadecimal hashes never start with a 'z') and the one starting with '3', which shares a prefix with plenty of rows in the table. I forgot to label the x axis as "time". Since this is most likely due to the path taken walking the btree index, which depends on the value of the hash, using a secure comparison function won't do a thing.

[Image: overlaid histograms of the per-query timings for the three btree lookups; x axis is time]

Let's try with a hash index instead of the btree:
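That means rebuilding the test table without the PRIMARY KEY (which implies a btree) and indexing it with a hash index instead. A minimal sketch, since the exact statements aren't shown above:

DROP TABLE IF EXISTS foo;
CREATE UNLOGGED TABLE foo (t TEXT);
INSERT INTO foo SELECT substr(sha256(n::TEXT::BYTEA)::TEXT,3)
FROM generate_series(100000,199999) n;
CREATE INDEX foo_t_hash ON foo USING hash (t);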

[Image: overlaid, semi-transparent histograms of the per-query timings with the hash index]

Much more similar timings; in fact I had to make the histograms transparent since they overlap each other so much. Averaged over a much larger number of queries, there would probably still be a detectable difference. You could try it out...

What I'm saying is: you're focusing on the string comparison you can see in your query, but what about the rest of the Postgres code? As seen above, the btree index timing definitely leaks information. The hash index doesn't seem to: it only stores references to tuples whose values share the same hash, and then Postgres goes and checks the tuples themselves. So perhaps, with a bit of hacking, even the hash index could leak timing that reveals whether there are rows with the same hash in the bucket, depending on how long it spends checking tuples. And the Postgres hash function isn't designed to be cryptographically secure, so who knows. Besides, it will still use the equality operator, which is a string comparison. And if you want to index your own operator, you'll have to write a lot of C code.

What you could do to avoid too much hassle would be to index a BIGINT instead, containing some 64-bit hash of your API key; BIGINT comparisons should be constant time. Then index that with a hash index to make it fast, so you only need to run secure_compare on the row matched by the index. That lets you use a super dumb version of secure_compare, like a bit of plpgsql that compares every character in the string and does not exit at the first character that doesn't match. Hell, you could even use

-- counts differing characters (0 means equal); the cast is needed
-- since sum() doesn't accept booleans
SELECT sum((substr(str1,n,1) != substr(str2,n,1))::int)
FROM generate_series(1,your_string_length) n;
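
Putting the pieces together, a rough sketch of the whole scheme. The names are illustrative, and hashtextextended() is Postgres's built-in 64-bit text hash, which, again, is not designed to be cryptographically secure:

-- A deliberately dumb constant-time comparison: it always walks the
-- whole string instead of bailing out at the first mismatch.
CREATE OR REPLACE FUNCTION secure_compare(str1 TEXT, str2 TEXT)
RETURNS BOOLEAN LANGUAGE sql AS $$
    SELECT length(str1) = length(str2)
       AND 0 = (SELECT sum((substr(str1,n,1) != substr(str2,n,1))::int)
                FROM generate_series(1,length(str1)) n);
$$;

CREATE TABLE users (
    id BIGSERIAL PRIMARY KEY,
    token TEXT NOT NULL,
    token_hash BIGINT NOT NULL   -- e.g. hashtextextended(token, 0)
);
CREATE INDEX users_token_hash ON users USING hash (token_hash);

-- The hash index narrows things down to (usually) one row, and
-- secure_compare confirms the match without leaking where the strings differ.
SELECT * FROM users
WHERE token_hash = hashtextextended('abcdef', 0)
  AND secure_compare(token, 'abcdef');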

But that doesn't guarantee that no part of postgres code will leak info.

I'd still hack it with a sleep(). But instead of being random, I'd take the timestamp T_start at the beginning of the HTTP request, do all the work, and then sleep until, say, T_start + 100ms. That way all requests take about 100ms no matter what actually happens, even in code you didn't look at.
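
If you'd rather do the padding inside the database than at the HTTP layer, here's a minimal plpgsql sketch of the same idea, reusing the illustrative table and function above (the 100ms budget is arbitrary):

CREATE OR REPLACE FUNCTION lookup_user_padded(p_token TEXT)
RETURNS SETOF users LANGUAGE plpgsql AS $$
DECLARE
    t_start TIMESTAMPTZ := clock_timestamp();
BEGIN
    RETURN QUERY
        SELECT * FROM users
        WHERE token_hash = hashtextextended(p_token, 0)
          AND secure_compare(token, p_token);
    -- Sleep away whatever is left of the 100ms budget, so every call
    -- takes roughly the same wall-clock time.
    PERFORM pg_sleep(GREATEST(0, 0.1 - EXTRACT(EPOCH FROM clock_timestamp() - t_start)));
END;
$$;

Padding at the HTTP layer is cheaper, though, since sleeping inside the function keeps a backend busy for the whole 100ms.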