Postgresql – Why are the Postgresql sessions writing directly on disk

postgresqlpostgresql-9.4

I have a write-heavy application on a PostgreSQL 9.4 on CentOS 6 that I'm trying to tune.

Using iotop, I see that my session processers are doing all the heavy writing on disk when (to my understanding) only the checkpointer and writer processes should be doing it.

I thought that the sessions might be writing temp files due to lack of workmem but when I query the pg_stat_database, the columns "temp_files" and "temp_bytes" are zero. I'm trying to log this kind of usage via system parameter "log_temp_files" but nothing appears on the log files.

Can somebody help me? Are there any kind of data that the sessions write directly on disk besides temp files?

EDIT: Added an iotop example

19156 be/4 postgres    0.00 B/s  382.33 K/s  0.00 %  0.00 % postgres: checkpointer process
19167 be/4 postgres    0.00 B/s  146.96 K/s  0.00 %  3.03 % postgres: example example 10.18.14.1(59536) idle
19190 be/4 postgres    0.00 B/s  146.16 K/s  0.00 %  2.89 % postgres: example example 10.18.14.1(59551) idle
19164 be/4 postgres    0.00 B/s  144.17 K/s  0.00 %  2.20 % postgres: example example 10.18.14.1(59533) idle
19172 be/4 postgres    0.00 B/s  128.64 K/s  0.00 %  1.98 % postgres: example example 10.18.14.1(59541) idle
19161 be/4 postgres    0.00 B/s  122.27 K/s  0.00 %  1.88 % postgres: example example 10.18.14.1(59525) idle
19174 be/4 postgres    0.00 B/s  119.08 K/s  0.00 %  2.08 % postgres: example example 10.18.14.1(59543) idle
19170 be/4 postgres    0.00 B/s  116.29 K/s  0.00 %  2.62 % postgres: example example 10.18.14.1(59539) idle
19169 be/4 postgres    0.00 B/s  115.50 K/s  0.00 %  1.99 % postgres: example example 10.18.14.1(59538) idle
 1304 be/4 root        0.00 B/s  115.50 K/s  0.00 %  0.92 % [flush-253:0]
19173 be/4 postgres    0.00 B/s  113.90 K/s  0.00 %  1.97 % postgres: example example 10.18.14.1(59542) SELECT
19175 be/4 postgres    0.00 B/s  113.50 K/s  0.00 %  1.98 % postgres: example example 10.18.14.1(59544) idle
19476 be/4 postgres    0.00 B/s  111.91 K/s  0.00 %  1.91 % postgres: example example 10.18.14.1(59618) idle
19434 be/4 postgres    0.00 B/s  110.32 K/s  0.00 %  1.77 % postgres: example example 10.18.14.1(59606) DISCARD ALL
19168 be/4 postgres    0.00 B/s  101.16 K/s  0.00 %  1.82 % postgres: example example 10.18.14.1(59537) SELECT
19166 be/4 postgres    0.00 B/s   99.57 K/s  0.00 %  2.82 % postgres: example example 10.18.14.1(59535) idle
19171 be/4 postgres    0.00 B/s   98.77 K/s  0.00 %  1.78 % postgres: example example 10.18.14.1(59540) idle
19179 be/4 postgres    0.00 B/s   98.37 K/s  0.00 %  2.22 % postgres: example example 10.18.14.1(59547) idle
19163 be/4 postgres    0.00 B/s   96.38 K/s  0.00 %  1.72 % postgres: example example 10.18.14.1(59532) idle
19189 be/4 postgres    0.00 B/s   94.39 K/s  0.00 %  2.26 % postgres: example example 10.18.14.1(59550) idle

Thanks in advance.

Best Answer

Write-heavy operations like COPY or INSERT INTO ... SELECT FROM will use a non-default buffer access strategy which intentionally causes the write-heavy process to write its own buffers.

This is so that, if you write a billion rows into a table, that table growth doesn't shove the entire rest of the database out of the cache.

Even if your write heavy operations do not use a buffer access strategy:

  1. There is no reason for checkpointing to be doing most of the writing unless our checkpoint_segments are set too small.

  2. If your bgwriter is not aggressive enough, it might not get enough data written and so the user backends will do it themselves. You could try to make bgwriter more aggressive. But ever since version 8.3.20 or so, the bgwriter is not very useful anyway, and having the backends do writes themselves is rarely a problem anymore.