Mysql – ib_logfile opened with O_SYNC when innodb_flush_method=O_DSYNC

innodbMySQL

mysql will open and flush data/log files as below when innodb_flush_method=O_DSYNC

Open log    Flush log   Open datafile   Flush data

O_SYNC       NONE           NONE         fsync ()                    

My First question is ,why do we need to Use O_SYNC to open log instead of O_DSYNC ?

With O_DSYNC, each transaction require one write, but with O_SYNC it requires two writes. I think that would make a big difference.

And I found another option in percona server ALL_O_DIRECT: use O_DIRECT to open both data and log files, and use fsync() to flush the data files but not the log files.

Tested this option use sysbench OLTP , single thread

response time changes from 20.11ms to 8.3 ms

Test option O_DIRECT_NO_FSYNC on mysql 5.7, it still call fsync to flush log files

  strace -p `pgrep -x mysqld`  -ff -e trace=desc >strace.out 2>&1 &
  tail -f strace.out |grep "(8\|(9"

  [pid 12258] pwrite(9, "\200\10K\376\2\0\0N\0\0\0L.\0\4\0\2018\0300\2018\2\0\2018\0304\226\240\4\0"..., 1024, 26740736) = 1024
  [pid 12258] fsync(9)                    = 0
  [pid 12258] pwrite(9, "\200\10K\377\2\0\0\25\0\0\0L\10\215\247\3\4\0\0\0\1\247\f\240\37\0\6\0\1\200\4\200"..., 1024, 26741248) = 1024
  [pid 12258] fsync(9)                    = 0

Second Question ,beside use ALL_O_DIRECT, any other method could stop innodb from writing metadata of log file into storage device ?

Best Answer

Yes, this O_DSYNC (the option) is fundamentally useless because MySQL/InnoDB engineers saw that it was dangerous in certain cases (corruption), and disabled it in many cases, making it a synonym to O_SYNC. it also does an fsync() for the same reason.

The main thing you want to tune is O_DIRECT vs. default (flushed), and maybe ALL_O_DIRECT on Percona.

In your case, I suppose you have good latency to the filesystem (where O_DIRECT shines). Sometimes, having the transaction logs on the filesystem cache can be hurtful (although such a difference seem a bit too much for me - did you compare it with O_DIRECT, available on all versions of MySQL, which only avoids data file caching?).

Also be careful, because sysbench may not be a good representative of your real-world load. Response times can vary a lot depending on flush_log_at_trx_commit, I assume you have that as 1.

According to Dimitri, from Oracle-MySQL, the solution upstream comes in 5.7. I like it for now in Percona, as you can choose.