I want to understand the I/O pattern a database writes to disk so that I can decide how many disks to use for best performance. To analyse the I/O pattern I want to use blktrace, but I have to grok it first. This is my attempt.
I have a USB stick that I attach to my computer and it becomes /dev/sdd. Now I start
dd if=/dev/sdd of=/dev/null
and in a separate window I start
blktrace -d /dev/sdd -o - | blkparse -i -
and expect to see read (R) operations that get merged (M) and put into the queue (Q). That works, but as far as I can tell the block size is always 8:
8,48 6 15257 2.157995037 2470 M R 816696 + 8 [dd]
8,48 6 15258 2.157996273 2470 Q R 816704 + 8 [dd]
8,48 6 15259 2.157996520 2470 M R 816704 + 8 [dd]
8,48 6 15260 2.157997794 2470 Q R 816712 + 8 [dd]
Now I stop everything and tell the system to read only one byte:
dd if=/dev/sdd of=/dev/null count=1 bs=1
1+0 records in
1+0 records out
1 byte (1 B) copied, 0.00325544 s, 0.3 kB/s
This shows up on the blkparse console like this:
8,48 6 1 17.220316681 2543 G N [dd]
8,48 6 2 17.220317209 2543 I N 0 (00 ..) [dd]
8,48 6 3 17.220317707 2543 D N 0 (00 ..) [dd]
8,48 6 4 17.220787473 2543 Q R 0 + 8 [dd]
8,48 6 5 17.220790545 2543 G R 0 + 8 [dd]
8,48 6 6 17.220791330 2543 P N [dd]
8,48 6 7 17.220793515 2543 Q R 8 + 8 [dd]
8,48 6 8 17.220794597 2543 M R 8 + 8 [dd]
8,48 6 9 17.220796134 2543 Q R 16 + 8 [dd]
8,48 6 10 17.220796419 2543 M R 16 + 8 [dd]
8,48 6 11 17.220797695 2543 Q R 24 + 8 [dd]
8,48 6 12 17.220797943 2543 M R 24 + 8 [dd]
8,48 6 13 17.220798862 2543 I R 0 + 32 [dd]
What's going on here? Why does a read of one byte show up as 3 "R" requests, each with a Q and an M action? Why does it "seem to" read 32 or 24 bytes? Where is docutainment to educate me further?
Best Answer
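Because you are doing buffered I/O, and the page cache works in whole pages, which are 4 KiB on PCs, or eight 512-byte sectors. The numbers blkparse prints after the "+" are counted in those 512-byte sectors, not in bytes, so "+ 8" is one page and "+ 32" is four pages (16 KiB). The kernel readahead mechanism also reads a bit more on the assumption that dd will continue reading, which is why your one-byte read turns into four page-sized requests that get merged into a single one. If you want to avoid this, you need to use direct I/O by passing dd the iflag=direct option, but you won't be able to have it read a single byte that way: direct I/O must be aligned to, and a whole multiple of, the sector size.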
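To make the unit conversion concrete, here is a minimal sketch that turns the sector counts from the blkparse output into bytes. It assumes blkparse's 512-byte sector unit; the function name sectors_to_bytes is made up for illustration and is not part of blktrace.

```shell
#!/bin/sh
# blkparse prints "<start sector> + <sector count>", both in 512-byte units.
SECTOR_BYTES=512

# Hypothetical helper: convert a sector count to bytes.
sectors_to_bytes() {
    echo $(( $1 * SECTOR_BYTES ))
}

# One page-cache page, as in "Q R 0 + 8"
echo "8 sectors  = $(sectors_to_bytes 8) bytes"
# The merged request, as in "I R 0 + 32"
echo "32 sectors = $(sectors_to_bytes 32) bytes"
```

This prints 4096 and 16384 bytes. To watch a single request without page-cache readahead, something like dd if=/dev/sdd of=/dev/null bs=4096 count=1 iflag=direct should show up as one read; bs=4096 is only an example value chosen because it is a whole multiple of the sector size.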