‘seek’ argument in command dd

Can someone explain to me what is happening in the following line?

dd if=/dev/urandom bs=4096 seek=7 count=2 of=file_with_holes

The seek part in particular is not clear.

The man page says:

 seek=BLOCKS
              skip BLOCKS obs-sized blocks at start of output

What is an obs-sized block?

Best Answer

dd is designed to copy blocks of data from an input file to an output file. The dd block size options are as follows, from the man page:

ibs=expr
    Specify the input block size, in bytes, by expr (default is 512).
obs=expr
    Specify the output block size, in bytes, by expr (default is 512).
bs=expr
    Set both input and output block sizes to expr bytes, superseding ibs= and obs=.

The dd seek option is similar to the UNIX lseek() system call [1]. It moves the read/write pointer within the file. From the man page:

seek=n
    Skip n blocks (using the specified output block size) from the beginning of the output file before copying.

Ordinary files in UNIX have the convenient property that you do not have to read or write them starting at the beginning; you can seek anywhere and read or write starting from there. So bs=4096 seek=7 means to move to a position 7*4096 bytes from the beginning of the output file and start writing from there. It won't write to the portion of the file that is between 0 and 7*4096 bytes.
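As a rough check of that arithmetic, the sketch below runs the command from the question in a scratch directory and compares the resulting size with (7 + 2) * 4096 bytes. The temporary directory and the variable names are only illustrative.

```shell
# Work in a throwaway directory (illustrative scratch location).
tmpdir=$(mktemp -d)
out="$tmpdir/file_with_holes"

# Same command as in the question; dd's status report is silenced.
dd if=/dev/urandom bs=4096 seek=7 count=2 of="$out" 2>/dev/null

# Byte offset where writing starts, and the size the file should end up with.
start=$((7 * 4096))
expected=$(( (7 + 2) * 4096 ))
actual=$(( $(wc -c < "$out") ))

echo "writing starts at byte $start"
echo "expected size $expected, actual size $actual"
```

The first 7 blocks are skipped and 2 blocks are written, so the file ends at block 9.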

Areas of ordinary files that are never written to at all aren't even allocated by the underlying filesystem. These areas are called holes and the files are called sparse files. In your example, file_with_holes will have a 7*4096-byte hole at the beginning. (h/t @frostschutz for pointing out that dd truncates the output file by default.)

It is OK to read these unallocated areas; you get a bunch of zeroes.
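One way to see this for yourself is to read back only the skipped region and count the bytes that are not zero. This is a minimal sketch that recreates the file from the question in a scratch directory. Whether the hole really occupies no disk space depends on the filesystem, so the du figure is shown but not relied on.

```shell
tmpdir=$(mktemp -d)
out="$tmpdir/file_with_holes"
dd if=/dev/urandom bs=4096 seek=7 count=2 of="$out" 2>/dev/null

# Read just the first 7 blocks (the hole), delete every NUL byte,
# and count whatever is left. Nothing should remain.
nonzero=$(( $(dd if="$out" bs=4096 count=7 2>/dev/null | tr -d '\000' | wc -c) ))
echo "non-zero bytes in the hole: $nonzero"

# Apparent size versus space actually allocated (filesystem-dependent).
ls -l "$out"
du -k "$out"
```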

[1] Back when dd was written, the analogous system call was seek().

Related Solutions

Passing dd skip|seek offset as hexadecimal

Why does the second command output a different value?

For historical reasons, dd considers x to be a multiplication operator. So 0x3 is evaluated to be 0.
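The effect is easy to reproduce. The sketch below assumes GNU dd; other implementations may parse numbers differently, and some accept hexadecimal directly. It counts how many bytes are copied when count is written as 2x3 and as 0x3.

```shell
# Assumes GNU dd: "x" between two numbers multiplies them.

# 2x3 is read as 2 * 3, so six one-byte blocks are copied.
six=$(( $(dd if=/dev/zero bs=1 count=2x3 2>/dev/null | wc -c) ))

# 0x3 is read as 0 * 3, so nothing is copied at all.
zero=$(( $(dd if=/dev/zero bs=1 count=0x3 2>/dev/null | wc -c) ))

echo "count=2x3 copied $six bytes"
echo "count=0x3 copied $zero bytes"
```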

Is it possible to pass the skip|seek offset to dd as a hexadecimal value?

Not directly, as far as I know. As well as multiplication using the operator x, you can suffix any number with b to mean "multiply by 512" (0x200) and with K to mean "multiply by 1024" (0x400). With GNU dd you can also use suffixes M, G, T, P, E, Z and Y to mean multiply by 2 to the power of 20, 30, 40, 50, 60, 70 or 80, respectively, and you can use upper or lower case except for the b suffix. (There are many other possible suffixes. For example, EB means "multiply by 10¹⁸" and PiB means "multiply by 2⁵⁰". See info coreutils "block size" for more information, if you have a GNU installation.)

You might find the above arcane, anachronistic, and geeky to the point of absurdity. Not to worry: you are not alone. Fortunately, you can just ignore it all and use your shell's arithmetic expansion instead (bash and other POSIX-compliant shells will work, as well as some non-POSIX shells). The shell does understand hexadecimal numbers, and it allows a full range of arithmetic operators written in the normal way. You just need to surround the expression with $((...)):

# dd if=2013-Aug-uptime.csv bs=1 count=$((0x2B * 1024)) skip=$((0x37))
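To see what dd actually receives, you can print the expanded values first. The sketch below does that for the two expressions above, then applies the same idea to a short made-up string rather than the CSV file.

```shell
# The shell expands these before dd ever sees them.
count=$((0x2B * 1024))   # 43 * 1024
skip=$((0x37))           # 55
echo "count=$count skip=$skip"

# Illustrative sample input: skip 0xA (10) bytes, then copy 0x3 (3) bytes.
picked=$(printf '0123456789abcdefghij' | dd bs=1 skip=$((0xA)) count=$((0x3)) 2>/dev/null)
echo "$picked"
```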

The difference between ‘bs’, ‘count’ and ‘seek’ in dd command

I really don't know how to explain this better than the manpage does.

bs= sets the blocksize, for example bs=1M would be 1MiB blocksize.

count= copies only this number of blocks (the default is for dd to keep going forever or until the input runs out). Ideally blocks are of bs= size but there may be incomplete reads, so if you use count= in order to copy a specific amount of data (count*bs), you should also supply iflag=fullblock.

seek= seeks this number of blocks in the output, instead of writing to the very beginning of the output device.

So, for example, this copies 1MiB worth of y\n to position 8MiB of the outputfile. So the total filesize will be 9MiB.

$ yes | dd bs=1M count=1 seek=8 iflag=fullblock of=outputfile
$ ls -alh outputfile
9.0M Jun  3 21:02 outputfile
$ hexdump -C outputfile
00000000  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00800000  79 0a 79 0a 79 0a 79 0a  79 0a 79 0a 79 0a 79 0a  |y.y.y.y.y.y.y.y.|
*
00900000
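The same result can be checked without reading a hexdump. This sketch assumes GNU dd (iflag=fullblock is a GNU extension), writes to a scratch directory, and compares the file size with seek*bs + count*bs.

```shell
tmpdir=$(mktemp -d)
out="$tmpdir/outputfile"

# yes is killed by SIGPIPE once dd stops reading; "|| true" only keeps that
# from aborting a script that runs under "set -o pipefail".
yes | dd bs=1M count=1 seek=8 iflag=fullblock of="$out" 2>/dev/null || true

expected=$(( 8 * 1048576 + 1 * 1048576 ))
size=$(( $(wc -c < "$out") ))
echo "expected $expected bytes, got $size bytes"

# The last two bytes written should be "y" and a newline (79 0a).
tail -c 2 "$out" | od -An -tx1
```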

Since you mention /dev/random and overwriting partitions... it will take forever since /dev/random (as well as /dev/urandom) is just too slow. You could just use shred -v -n 1 instead, that's fast and usually available anywhere.
