I'm building an low utilization time-series database to capture yearly data points for a set of items fewer than 100,000.
My question has to do with storing arrays of data in a way that is easily queried later.
Right now the yearly_visits
table looks something like:
visitID MEDIUMINT primary key
userID SMALLINT id of individual submitting yearly data
weight SMALLINT weight of individual
The intake form also contains a checkbox with a list of favorite colors (via numeric value) from a lookup table. Users can select one or more colors.
Should colors be stored in a separate visit_colors
table that looks something like:
visitID MEDIUMINT
colorID SMALLINT
Or is there a better way of storing arrays of data in a time series? I haven't written any code yet, so I want to design this in a way that doesn't bite me down the road when I'm asked to query against the color data down the road.
Best Answer
I've read about (but not implemented) Timescale, a time-series database over Postgres. It claims to support all the PostreSQL native datatypes, including arrays. In that product the array type is supported within the query language. I don't imagine Postres disappearing any time soon so a solution based on it would be somewhat future-proof.