We have three instances of MongoDB 3.2.12 configured as a replica set. Journaling is enabled and the storage engine is WiredTiger. Most of the time our application puts a very low load on the database server and everything works well, but occasionally we launch a background maintenance job that increases the load 10x to 100x, mostly due to insert operations: ~10,000 inserts/sec.
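For context, each node is started roughly like this (a minimal sketch: the rs0 set name comes from the mongostat output below, while the dbpath and port are illustrative placeholders):

    # Minimal sketch of one replica-set member's startup; journaling is on
    # by default for 64-bit builds, shown explicitly here only for clarity.
    mongod --replSet rs0 --storageEngine wiredTiger --journal \
           --dbpath /var/lib/mongodb --port 27017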
Today, during such a maintenance run, we noticed that our application's response time spiked briefly every minute. We suspected this was related to the MongoDB server and found the following facts:
mongostat output example:
insert query update delete getmore command % dirty % used flushes vsize res qr|qw ar|aw netIn netOut conn set repl time
891 936 235 14 388 2620|0 4.6 80.0 0 33.8G 26.7G 0|0 1|0 2.02m 2.38m 208 rs0 PRI 14:55:49+03:00
1098 861 3 12 107 452|0 4.6 80.0 0 33.8G 26.7G 0|0 1|1 1.07m 5.39m 208 rs0 PRI 14:55:50+03:00
14602 725 11 14 259 727|0 4.7 80.0 0 33.8G 26.7G 0|0 0|0 7.57m 8.44m 208 rs0 PRI 14:55:51+03:00
4 527 260 1 23 100|0 4.7 80.0 0 33.8G 26.7G 0|0 0|0 383k 3.60m 208 rs0 PRI 14:55:52+03:00
8698 121 671 11 186 585|0 4.8 80.0 0 33.8G 26.7G 0|0 2|0 5.61m 6.30m 208 rs0 PRI 14:55:53+03:00
5539 645 21 9 199 623|0 4.9 80.0 0 33.8G 26.7G 0|0 1|0 5.19m 5.77m 208 rs0 PRI 14:55:54+03:00
2599 602 14 9 99 370|0 4.9 80.0 0 33.8G 26.7G 0|0 1|0 1.48m 2.37m 208 rs0 PRI 14:55:55+03:00
3507 485 22 3 139 415|0 4.9 80.0 0 33.8G 26.7G 0|0 0|0 4.90m 5.33m 208 rs0 PRI 14:55:56+03:00
3627 369 4019 9 133 415|0 4.3 80.0 1 33.8G 26.7G 0|0 4|3 4.59m 4.18m 208 rs0 PRI 14:55:57+03:00
3 17 20 1 11 63|0 2.9 80.0 0 33.8G 26.7G 0|0 13|12 29.8k 56.3k 208 rs0 PRI 14:55:58+03:00
5 24 4 1 18 88|0 2.1 80.0 0 33.8G 26.7G 0|0 18|16 39.1k 270k 208 rs0 PRI 14:55:59+03:00
2 9 2 3 9 54|0 0.5 80.0 0 33.8G 26.7G 0|0 22|20 20.2k 88.9k 208 rs0 PRI 14:56:00+03:00
2 47 *0 3 2 21|0 0.3 80.0 0 33.8G 26.7G 0|0 24|23 18.0k 39.6k 208 rs0 PRI 14:56:01+03:00
26635 991 5475 35 410 1189|0 0.8 80.0 0 33.8G 26.7G 0|1 2|0 16.4m 21.0m 207 rs0 PRI 14:56:02+03:00
- Moments when the application is slow match exactly the moments when mongostat reports flushes = 1.
- MongoDB handles a very low number of requests for about 4 seconds after this event.
- The "transaction checkpoint most recent time (msecs)" value from serverStatus is also ~4000 msec (see the snippet after this list).
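Here is roughly how we read that counter (a minimal one-liner against the mongo shell; the field path is the one from the wiredTiger section of serverStatus):

    # Print how long the most recent WiredTiger checkpoint took, in msecs.
    mongo --quiet --eval '
      var t = db.serverStatus().wiredTiger.transaction;
      print(t["transaction checkpoint most recent time (msecs)"]);
    '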
We launched iostat -x -d 1 and found the following pattern:
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
sda 36.00 311.00 33.00 61.00 276.00 1384.00 35.32 0.04 0.39 0.36 0.41 0.34 3.20
sdb 60.00 311.00 21.00 61.00 324.00 1384.00 41.66 0.03 0.38 0.38 0.38 0.34 2.80
sdc 36.00 311.00 33.00 63.00 276.00 1392.00 34.75 0.04 0.43 0.33 0.48 0.38 3.60
sdd 57.00 311.00 20.00 63.00 308.00 1392.00 40.96 0.04 0.46 0.30 0.51 0.42 3.50
md126 0.00 0.00 296.00 695.00 1184.00 2776.00 7.99 0.00 0.00 0.00 0.00 0.00 0.00
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
sda 0.00 0.00 7.00 21.00 28.00 36.00 4.57 0.01 0.43 0.29 0.48 0.32 0.90
sdb 0.00 0.00 4.00 21.00 16.00 36.00 4.16 0.01 0.40 0.25 0.43 0.28 0.70
sdc 0.00 0.00 6.00 21.00 24.00 36.00 4.44 0.01 0.26 0.33 0.24 0.19 0.50
sdd 0.00 0.00 4.00 21.00 16.00 36.00 4.16 0.01 0.24 0.25 0.24 0.16 0.40
md126 0.00 0.00 21.00 21.00 84.00 72.00 7.43 0.00 0.00 0.00 0.00 0.00 0.00
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
sda 0.00 0.00 0.00 4.00 0.00 8.00 4.00 0.00 0.50 0.00 0.50 0.50 0.20
sdb 0.00 0.00 0.00 4.00 0.00 8.00 4.00 0.00 0.50 0.00 0.50 0.50 0.20
sdc 0.00 0.00 0.00 2.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
sdd 0.00 0.00 0.00 2.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
md126 0.00 0.00 0.00 4.00 0.00 8.00 4.00 0.00 0.00 0.00 0.00 0.00 0.00
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
sda 0.00 17275.00 1.00 2824.00 4.00 79688.00 56.42 10.25 3.63 0.00 3.63 0.14 39.40
sdb 0.00 17275.00 1.00 2824.00 4.00 79688.00 56.42 10.65 3.77 0.00 3.77 0.14 39.60
sdc 0.00 17284.00 1.00 2799.00 4.00 79624.00 56.88 10.70 3.82 0.00 3.82 0.14 39.20
sdd 0.00 17284.00 1.00 2799.00 4.00 79624.00 56.88 11.40 4.07 0.00 4.07 0.15 40.70
md126 0.00 0.00 4.00 39867.00 16.00 159316.00 7.99 0.00 0.00 0.00 0.00 0.00 0.00
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
sda 53.00 19426.00 64.00 4645.00 468.00 95132.00 40.60 18.43 3.91 0.31 3.96 0.11 52.80
sdb 28.00 19426.00 29.00 4645.00 228.00 95132.00 40.80 18.68 4.00 0.97 4.02 0.12 56.80
sdc 62.00 19463.00 61.00 4660.00 492.00 95340.00 40.60 24.42 5.17 1.00 5.23 0.12 57.00
sdd 18.00 19463.00 39.00 4660.00 228.00 95340.00 40.68 19.39 4.13 0.15 4.16 0.11 53.70
md126 0.00 0.00 357.00 47782.00 1428.00 190468.00 7.97 0.00 0.00 0.00 0.00 0.00 0.00
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
sda 24.00 1261.00 275.00 127.00 1196.00 5348.00 32.56 0.10 0.26 0.11 0.59 0.24 9.80
sdb 55.00 1261.00 33.00 127.00 352.00 5348.00 71.25 0.09 0.54 0.30 0.61 0.52 8.30
sdc 61.00 1261.00 281.00 128.00 1368.00 5352.00 32.86 0.13 0.31 0.14 0.68 0.28 11.50
sdd 11.00 1261.00 38.00 128.00 196.00 5352.00 66.84 0.09 0.54 0.16 0.66 0.53 8.80
md126 0.00 0.00 775.00 2675.00 3100.00 10700.00 8.00 0.00 0.00 0.00 0.00 0.00 0.00
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
sda 75.00 1161.00 59.00 165.00 536.00 5040.00 49.79 0.11 0.51 0.29 0.59 0.48 10.70
sdb 4.00 1161.00 37.00 165.00 164.00 5040.00 51.52 0.11 0.52 0.14 0.61 0.52 10.60
sdc 58.00 1151.00 58.00 165.00 464.00 5000.00 49.00 0.12 0.54 0.31 0.62 0.50 11.20
sdd 23.00 1151.00 33.00 165.00 224.00 5000.00 52.77 0.10 0.50 0.18 0.56 0.50 9.90
md126 0.00 0.00 347.00 2510.00 1388.00 10040.00 8.00 0.00 0.00 0.00 0.00 0.00 0.00
First, all I/O activity metrics dropped to nearly zero (which I can't explain at all); then a burst of writes drove avgqu-sz up to ~20; and then all metrics returned to typical values.
So I suspect the performance degradation we experienced is related to the WiredTiger checkpoint, which is written every 60 seconds (per the docs). Writing that amount of data understandably takes some time, but why doesn't MongoDB handle requests while the checkpoint runs? I thought checkpointing happened in the background and shouldn't affect currently executing operations.
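As an aside, the 60-second cadence is WiredTiger's default checkpoint trigger; it can be changed at runtime through the wiredTigerEngineRuntimeConfig server parameter. This is only an illustration of the knob, not something we ended up doing:

    # Illustration only: stretch the checkpoint interval to 120 seconds.
    # The string is passed straight through to WiredTiger's reconfigure API.
    mongo --quiet --eval '
      db.adminCommand({
        setParameter: 1,
        wiredTigerEngineRuntimeConfig: "checkpoint=(wait=120)"
      })
    '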
Best Answer
We upgraded MongoDB and the issue disappeared. Most likely it was related to this bug: https://jira.mongodb.org/browse/WT-3362
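For anyone checking whether they are running an affected build, the server version is quick to confirm:

    # Print the running server version.
    mongo --quiet --eval 'print(db.version())'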