I have a pretty simple 3-column table in PostgreSQL 11. It's time-series data and the table can contain many billions of rows. I'm concerned about my table size and total size, and want to optimize my design to improve bytes/row.
I've found a couple of really useful questions and answers on the subject already:
Measure the size of a PostgreSQL table row
Configuring PostgreSQL for read performance
From running some of the queries shown in those discussions, I believe there's room for improvement, but I don't understand enough to make those improvements 🙂
My create script is as follows:
-- table
CREATE TABLE public.vector_events
(
vector_stream_id integer NOT NULL,
event_time timestamp without time zone NOT NULL,
event_data0 real NOT NULL
)
WITH (
OIDS = FALSE
)
TABLESPACE pg_default;
-- index
CREATE INDEX vector_events_stream_id_event_time_index
ON public.vector_events USING btree
(vector_stream_id, event_time DESC)
TABLESPACE pg_default;
I believe my column widths are optimal – vector_stream_id could exceed 100,000, event_time needs millisecond precision, and our data fits within a 4-byte float.
I chose the index because our queries will only ever be of the form:
SELECT event_time, event_data0
FROM vector_events
WHERE vector_stream_id = @streamId
AND event_time >= @lowerBound
-- (optionally with upper bound) AND event_time <= @upperBound
ORDER BY event_time DESC -- (sometimes ASC)
It's essential that the above query is performant when the table has at least a million rows (probably hundreds of millions). TBH the choice of B-tree index was a bit of a best guess.
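One way to sanity-check the index choice (a sketch, with placeholder literal values standing in for the @streamId and bound parameters) is to run the query under EXPLAIN and confirm the planner uses a range scan on the composite index rather than a sequential scan:

```sql
-- Placeholder values: stream 42, one day of data.
EXPLAIN (ANALYZE, BUFFERS)
SELECT event_time, event_data0
FROM vector_events
WHERE vector_stream_id = 42
  AND event_time >= '2019-01-01 00:00:00'
  AND event_time <  '2019-01-02 00:00:00'
ORDER BY event_time DESC;
```

With the (vector_stream_id, event_time DESC) index, the plan should show an Index Scan with no separate Sort node, since the index already returns rows in the requested order for a single stream.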
If I use Erwin Brandstetter 's query to check table size:
(query omitted for brevity, but found here: Answer to 'Measure the size of a PostgreSQL table row' )
I get the following (this is from a smaller sample table):
metric                             | bytes/ct | bytes_pretty | bytes_per_row
-----------------------------------+----------+--------------+--------------
core_relation_size                 |  9076736 | 8864 kB      |  52
visibility_map                     |     8192 | 8192 bytes   |   0
free_space_map                     |    24576 | 24 kB        |   0
table_size_incl_toast              |  9109504 | 8896 kB      |  52
indexes_size                       |  9256960 | 9040 kB      |  53
total_size_incl_toast_and_indexes  | 18366464 | 18 MB        | 106
live_rows_in_text_representation   |  5685353 | 5552 kB      |  32
------------------------------
row_count    172800
live_tuples  172800
dead_tuples  0
A naive view of the table would say that I have an int (4 bytes), a timestamp without tz (8 bytes) and a float (4 bytes) so 16 bytes of actual data.
I understand it's not quite that simple, but a 52 byte table size seems excessive.
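For reference, the 52 bytes can be accounted for (assuming a 64-bit build with 8-byte MAXALIGN and 8 kB pages): a 23-byte tuple header rounded up to 24, then 4 bytes for the integer, 4 bytes of alignment padding before the 8-byte timestamp, 4 bytes for the real – a data area of 20 bytes rounded up to 24 – plus a 4-byte line pointer in the page header, giving 24 + 24 + 4 = 52. The per-column and per-row sizes can be checked directly:

```sql
-- Per-column on-disk sizes; the whole-row version also includes
-- the tuple header and alignment padding.
SELECT pg_column_size(vector_stream_id) AS stream_id_bytes,  -- 4
       pg_column_size(event_time)       AS event_time_bytes, -- 8
       pg_column_size(event_data0)      AS event_data_bytes, -- 4
       pg_column_size(v.*)              AS row_bytes
FROM vector_events AS v
LIMIT 1;
```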
Furthermore, the index size is even larger at 53 bytes (and that is just the index, it doesn't include the event data, right?)
So I have a total size of 105 bytes for each row – surely there must be something I can do to improve this?
I seem to be able to save a few bytes (about 8) by applying the 'column tetris' technique (putting the wider columns first), changing my column order to event_time, vector_stream_id, event_data0.
Still, how can I get this down below 97 bytes? What size should I expect for a well-designed table and index?
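The reordered DDL would look like this (a sketch): with the 8-byte timestamp first, the two 4-byte columns pack behind it with no padding, so the tuple's data area shrinks from 24 to 16 bytes, giving 24 (header) + 16 (data) + 4 (line pointer) = 44 bytes/row instead of 52:

```sql
-- Same columns, widest first, to eliminate alignment padding.
CREATE TABLE public.vector_events
(
    event_time       timestamp without time zone NOT NULL, -- 8 bytes, 8-byte aligned
    vector_stream_id integer NOT NULL,                     -- 4 bytes
    event_data0      real NOT NULL                         -- 4 bytes
);
```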
Notes:
- Currently I'm using PostgreSQL 11 on Windows; I'm in the process of provisioning a Linux box for comparison.
- My 'real' database is using TimescaleDB, but I see the same pattern of table size/index size in a plain PostgreSQL table, so I believe the cause of the excessive table size is in my PostgreSQL schema or index design. (Timescale will split my billions of events into chunk tables each containing several million rows, but my choice of schema and index is still essential to efficient disk use and performance.)
- I expect I can also make improvements to server configuration, but firstly I just want to get the best table size.
My 3 considerations right now are (in order of importance):
- Read performance, when getting tens of thousands of rows from a table containing many millions. Also aggregate queries.
- Disk usage; this becomes prohibitively expensive as the total number of events gets into the billions.
- Write performance, normally in chronological order for any stream, although some streams may lag behind others, and occasionally we might backfill data.
Best Answer
The best thing to do with questions like this is to measure: the arithmetic (a 24-byte tuple header, 24 bytes of aligned column data and a 4-byte line pointer) matches your measurement, so the 52 bytes per row are pretty much spot on.
About the index:
That seems pretty normal to me.
You can expect the data to take up more space eventually if you have DELETEs and UPDATEs in your workload, because these cause a certain internal fragmentation (bloat); indexes in particular can become two or three times as big.
To answer your questions:
Your index is perfect for your query, and it does not matter whether you declare it ASC or DESC, so access speed should be optimal.
As you said, you can save a few bytes of alignment padding per row by making event_time the first (or last) column. That's the limit of what is possible.
For good write performance, have fast disks and set max_wal_size high.
You will need a primary key index for the table. The cheapest way would be to use your index for that (if it can be made UNIQUE), but then you'd have to get rid of the DESC.
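One possible way to do that (a sketch; the index and constraint names are illustrative) is to create a unique ASC index and then promote it to the primary key:

```sql
-- Only valid if (vector_stream_id, event_time) is actually unique,
-- i.e. a stream never has two events at the same timestamp.
CREATE UNIQUE INDEX vector_events_pkey_idx
    ON public.vector_events (vector_stream_id, event_time);

ALTER TABLE public.vector_events
    ADD CONSTRAINT vector_events_pkey
    PRIMARY KEY USING INDEX vector_events_pkey_idx;
```

This reuses the existing index structure for the constraint, so you avoid paying for a second, redundant index on the same columns.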