PostgreSQL – Prepared statement overhead

performance, postgresql, prepared-statement

I am using libpqxx to communicate with a PostgreSQL 9.3 server.

For multiple table inserts, if they are split into individual prepared statements, they cost approximately 2ms each. If they are all wrapped up into a single CTE, the entire prepared statement costs approximately 2ms.
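
Roughly, the contrast looks like this, shown with SQL-level PREPARE rather than the libpqxx calls, and with invented table and column names:

    -- Shape 1: separate prepared statements, roughly 2ms apiece
    PREPARE ins_order (int, numeric) AS
        INSERT INTO orders (customer_id, total) VALUES ($1, $2);
    PREPARE ins_audit (int, text) AS
        INSERT INTO audit_log (customer_id, note) VALUES ($1, $2);

    -- Shape 2: the same writes wrapped into one prepared data-modifying CTE,
    -- roughly 2ms for the whole statement
    PREPARE ins_both (int, numeric, text) AS
        WITH new_order AS (
            INSERT INTO orders (customer_id, total)
            VALUES ($1, $2)
            RETURNING id
        )
        INSERT INTO audit_log (customer_id, order_id, note)
        SELECT $1, id, $3 FROM new_order;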

In fact, nearly all individual prepared statements cost approximately 2ms, 1ms for unlogged tables.

What explains this exactly?

Edit

I first folded my most expensive set of 14 prepared statements into two CTEs rather than one, to stay clear of the single-row write rule (the sub-statements of a single query cannot modify the same row twice).

This dropped the average time consumed from just under 40ms to approximately 10ms.

To get around the single-row write rule another way, I tried a function with input parameters. Despite its convenience, this was a disaster, driving the time consumed to approximately 500ms.
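
For context, the function version looked roughly like this; the names are made up, but the shape is the same. Because a function runs its statements one after another, the single-row write rule does not bite, which is what made it tempting:

    -- Hypothetical sketch of the function approach: convenient to call,
    -- but in my case it averaged around 500ms instead of ~10ms
    CREATE FUNCTION move_rows(p_customer_id int, p_total numeric, p_note text)
    RETURNS void
    LANGUAGE sql AS $$
        INSERT INTO orders (customer_id, total) VALUES (p_customer_id, p_total);
        INSERT INTO audit_log (customer_id, note) VALUES (p_customer_id, p_note);
    $$;

    SELECT move_rows(42, 99.95, 'imported');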

I have just completed the single CTE, and so far I have found no difference between the two-CTE split and the single one, though I think there may be some caveats.

To merge the two CTEs, I had to work out a variant of this solution because of how I move data between tables: copying from one table to another, deleting the source data, and rebuilding it.

Keys that are not present in the new data are deleted, keys present in both are updated if they carry new data, and keys not present in the old data are inserted, which turns a plain delete+insert into a delete+update+insert.
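
A minimal sketch of that merge, with a hypothetical staging table feeding a target table; each key ends up touched by exactly one writing sub-statement, which is what allows it all to sit inside a single statement:

    -- note: every sub-statement sees the same pre-statement snapshot of target
    WITH removed AS (
        -- keys missing from the new data are deleted
        DELETE FROM target t
        WHERE NOT EXISTS (SELECT 1 FROM staging s WHERE s.key = t.key)
        RETURNING t.key
    ), changed AS (
        -- keys present in both are updated when the payload differs
        UPDATE target t
        SET payload = s.payload
        FROM staging s
        WHERE s.key = t.key
          AND s.payload IS DISTINCT FROM t.payload
        RETURNING t.key
    )
    -- keys present only in the new data are inserted
    INSERT INTO target (key, payload)
    SELECT s.key, s.payload
    FROM staging s
    WHERE NOT EXISTS (SELECT 1 FROM target t WHERE t.key = s.key);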

This prepared CTE is now almost 25 sub-statements long versus the original 14 statements; however, very few rows are being moved at the moment. In the future, many more rows will be.

It seems strange that the single CTE would consume as much time as the two-CTE version, since all of the sub-statements run more or less concurrently and fewer rows are being altered. My guess is that, because this CTE is much larger, the server needs more executions to come up with an optimal query plan.

Single CTE now only consumes 5ms on average

As I ran the query more, the average kept dropping; it is currently down to about 5ms. This is on an i7 laptop. I still haven't run the query that many times, so there may be room for further improvement.

Since I have no knowledge of the PostgreSQL source code, and can't determine how much of the cost was round trips between the database and the application even though they are on the same box, my best guess is that nearly all of the performance increase comes from the efficiency of PostgreSQL CTEs, given that the sub-statements and the main query execute more or less concurrently by default.

Hopefully an expert can give a full explanation, but as far as performance goes, I will keep trying to cram every process into a single CTE.

Best Answer

The next-to-last major step was finding out that, unfortunately for MySQL users, it is really only MySQL that joins slowly, so these write-intensive tables were all given unique integer identifiers to join on. That accounted for about a third of the remaining reduction in average time consumed.
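
Concretely, that only meant giving each of these tables a compact surrogate key to join on, along these lines (invented table name):

    -- hypothetical: add an integer identifier and join on it instead of the wider natural key
    ALTER TABLE events ADD COLUMN id serial;
    CREATE UNIQUE INDEX events_id_idx ON events (id);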

The last step was taking CTE optimization fencing into account, which accounted for the remaining two thirds of the reduction.
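
For anyone who hasn't hit it: on 9.x a WITH query is an optimization fence, planned on its own, so predicates outside it are not pushed in. Where the filter sits therefore matters (hypothetical table again):

    -- the outer WHERE is not pushed into the CTE on 9.x, so all of orders is read first
    WITH o AS (
        SELECT id, customer_id, total FROM orders
    )
    SELECT * FROM o WHERE customer_id = 42;

    -- filtering inside the CTE lets the planner use an index on customer_id
    WITH o AS (
        SELECT id, customer_id, total FROM orders WHERE customer_id = 42
    )
    SELECT * FROM o;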

In the case of the above set of queries, the average time consumed is now between 1 and 2ms on an i7 laptop tuned for SSDs, even though I'm running on its built-in hard drive. On a server with a single SSD, the time consumed averages less than 1ms.

To get the last performance boost, almost all queries were condensed; for example, if a table needs several updates, it is best to do them all in one query, no matter how strange or inefficient that single query might look.
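
As a trivial illustration of what condensing means (invented table), two updates against the same table fold into one statement:

    -- instead of two statements touching the same table...
    UPDATE accounts SET balance = balance - 10 WHERE id = 1;
    UPDATE accounts SET balance = balance + 10 WHERE id = 2;

    -- ...do it in one, however odd the CASE looks
    UPDATE accounts
    SET balance = balance + CASE id WHEN 1 THEN -10 WHEN 2 THEN 10 END
    WHERE id IN (1, 2);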

Reading to assemble the relevant primary key values required more testing. If the keys came from a combination of RETURNING clauses, it was best to let each sub-statement recalculate them rather than depend on another sub-statement to do the calculation. If the read had to go to disk, even a small one, it was best to do it once and reference it across the multiple writing sub-statements.
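
Roughly what "read once, reference everywhere" looks like: the one read that has to hit the table sits in its own sub-statement, and every writing sub-statement joins against it (invented names):

    WITH affected AS (
        -- the single read that goes to disk
        SELECT id FROM customers WHERE region = 'EU'
    ), upd AS (
        UPDATE orders o SET flagged = true
        FROM affected a WHERE o.customer_id = a.id
        RETURNING o.id
    )
    INSERT INTO audit_log (customer_id, note)
    SELECT a.id, 'flagged EU orders' FROM affected a;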

Copying was another grey area. If a table required inserts, updates, and deletes based on data from another table, it was slower to pre-select that data; it was faster to simply reference the source table's primary key data directly.
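
These are the two copying shapes I compared (invented tables); the second, which just points at the source table directly in the writing query, was the faster one for me:

    -- slower for me: stage the source rows in their own sub-statement first
    WITH src AS (
        SELECT key, payload FROM staging
    )
    UPDATE target t SET payload = src.payload
    FROM src WHERE t.key = src.key;

    -- faster for me: reference the source table directly
    UPDATE target t SET payload = s.payload
    FROM staging s WHERE t.key = s.key;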

In general, aside from reads that go to disk to assemble primary key values for reference, it is best to compress a CTE as much as possible and to keep its chains of sub-statements as short and narrow as possible, even though the result may look strange.

Transactions made of multiple prepared statements that previously consumed tens of milliseconds in my application now consume at most 5ms.

This approach should probably be limited to small amounts of redundant data.