mysql will open and flush data/log files as below when innodb_flush_method=O_DSYNC
Open log Flush log Open datafile Flush data
O_SYNC NONE NONE fsync ()
My First question is ,why do we need to Use O_SYNC
to open log instead of O_DSYNC
?
With O_DSYNC
, each transaction require one write, but with O_SYNC
it requires two writes. I think that would make a big difference.
And I found another option in percona server ALL_O_DIRECT
: use O_DIRECT
to open both data and log files, and use fsync() to flush the data files but not the log files.
Tested this option use sysbench OLTP , single thread
response time changes from 20.11ms to 8.3 ms
Test option O_DIRECT_NO_FSYNC on mysql 5.7, it still call fsync to flush log files
strace -p `pgrep -x mysqld` -ff -e trace=desc >strace.out 2>&1 &
tail -f strace.out |grep "(8\|(9"
[pid 12258] pwrite(9, "\200\10K\376\2\0\0N\0\0\0L.\0\4\0\2018\0300\2018\2\0\2018\0304\226\240\4\0"..., 1024, 26740736) = 1024
[pid 12258] fsync(9) = 0
[pid 12258] pwrite(9, "\200\10K\377\2\0\0\25\0\0\0L\10\215\247\3\4\0\0\0\1\247\f\240\37\0\6\0\1\200\4\200"..., 1024, 26741248) = 1024
[pid 12258] fsync(9) = 0
Second Question ,beside use ALL_O_DIRECT, any other method could stop innodb from writing metadata of log file into storage device ?
Best Answer
Yes, this
O_DSYNC
(the option) is fundamentally useless because MySQL/InnoDB engineers saw that it was dangerous in certain cases (corruption), and disabled it in many cases, making it a synonym toO_SYNC
. it also does anfsync()
for the same reason.The main thing you want to tune is
O_DIRECT
vs. default (flushed), and maybeALL_O_DIRECT
on Percona.In your case, I suppose you have good latency to the filesystem (where
O_DIRECT
shines). Sometimes, having the transaction logs on the filesystem cache can be hurtful (although such a difference seem a bit too much for me - did you compare it withO_DIRECT
, available on all versions of MySQL, which only avoids data file caching?).Also be careful, because sysbench may not be a good representative of your real-world load. Response times can vary a lot depending on
flush_log_at_trx_commit
, I assume you have that as 1.According to Dimitri, from Oracle-MySQL, the solution upstream comes in 5.7. I like it for now in Percona, as you can choose.