Linux – Detailed sparse file information on Linux

linux sparse-files

I have a sparse file, in which only some blocks are allocated:

~% du -h --apparent-size example
100K    example
~% du -h example
52K     example

I would like to know which blocks of the file are actually allocated. Is there a system call or kernel interface that could be used to get a list of either the allocated ranges or the holes of a file?

Simply checking for a long enough run of zeros (the approach used by GNU cp, rsync, etc.) does not work correctly:

~% cp example example1  
~% du -h example1 
32K     example1

The copy is smaller than the original because cp treated other runs of zeros, which were actually allocated in the source file, as holes.

Best Answer

There is a similar question on SO. The currently accepted answer by @ephemient suggests using an ioctl called fiemap which is documented in linux/Documentation/filesystems/fiemap.txt. Quoting from that file:

The fiemap ioctl is an efficient method for userspace to get file extent mappings. Instead of block-by-block mapping (such as bmap), fiemap returns a list of extents.

That sounds like the kind of information you're looking for. Note that filesystem support is optional:

File systems wishing to support fiemap must implement a ->fiemap callback on their inode_operations structure.
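
As a rough illustration of what calling fiemap from userspace can look like, here is a minimal Python sketch. It is not taken from the kernel documentation: the request number and the two struct layouts are my reading of linux/fs.h and linux/fiemap.h, and the helper names parse_extents and file_extents are hypothetical. On a filesystem without a ->fiemap callback the ioctl is expected to fail with OSError.

```python
import fcntl
import struct

# Constants as I read them from <linux/fs.h> and <linux/fiemap.h> (assumption).
FS_IOC_FIEMAP = 0xC020660B    # _IOWR('f', 11, struct fiemap)
FIEMAP_FLAG_SYNC = 0x0001     # ask the kernel to sync the file before mapping
FIEMAP_EXTENT_LAST = 0x0001   # set on the last extent of the file

# struct fiemap header: fm_start, fm_length (u64), then fm_flags,
# fm_mapped_extents, fm_extent_count, fm_reserved (u32).
HEADER = struct.Struct("=QQIIII")
# struct fiemap_extent: fe_logical, fe_physical, fe_length,
# fe_reserved64[2] (u64), then fe_flags, fe_reserved[3] (u32).
EXTENT = struct.Struct("=QQQQQIIII")


def parse_extents(buf):
    """Decode a buffer filled in by the ioctl into (logical, length, flags) tuples."""
    mapped = HEADER.unpack_from(buf, 0)[3]  # fm_mapped_extents
    extents = []
    for i in range(mapped):
        fields = EXTENT.unpack_from(buf, HEADER.size + i * EXTENT.size)
        extents.append((fields[0], fields[2], fields[5]))
    return extents


def file_extents(path, batch=128):
    """Return all extents of `path`, asking for `batch` extents per ioctl call."""
    result = []
    start = 0
    with open(path, "rb") as f:
        while True:
            buf = bytearray(HEADER.size + batch * EXTENT.size)
            # Map from `start` to the largest possible offset.
            HEADER.pack_into(buf, 0, start, 0xFFFFFFFFFFFFFFFF - start,
                             FIEMAP_FLAG_SYNC, 0, batch, 0)
            fcntl.ioctl(f.fileno(), FS_IOC_FIEMAP, buf)
            chunk = parse_extents(buf)
            result.extend(chunk)
            if not chunk or chunk[-1][2] & FIEMAP_EXTENT_LAST:
                return result
            # Continue right after the last extent returned so far.
            start = chunk[-1][0] + chunk[-1][1]
```

Calling file_extents("example") would then yield one (logical offset, length, flags) tuple per extent. Any byte range of the file not covered by a tuple is a hole.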

Support for the SEEK_DATA and SEEK_HOLE arguments to lseek, which you mentioned from Solaris, was added in Linux 3.1 according to the man page, so you could use that as well. The fiemap ioctl is older (it was merged in Linux 2.6.28), so for now it may be more portable across Linux versions, whereas lseek may be more portable across operating systems, provided Solaris implements the same interface.
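
The lseek approach can be sketched in a few lines of Python, which exposes the two constants as os.SEEK_DATA and os.SEEK_HOLE on platforms that define them. The helper name data_segments is hypothetical. Treat the result as the filesystem's own view: a filesystem that does not track holes may report the whole file as a single data segment, and the granularity of the reported ranges is up to the filesystem.

```python
import errno
import os


def data_segments(path):
    """Return [(start, end), ...] byte ranges that lseek reports as data."""
    segments = []
    fd = os.open(path, os.O_RDONLY)
    try:
        size = os.fstat(fd).st_size
        offset = 0
        while offset < size:
            try:
                # Next offset at or after `offset` that contains data.
                start = os.lseek(fd, offset, os.SEEK_DATA)
            except OSError as exc:
                if exc.errno == errno.ENXIO:  # nothing but a hole until EOF
                    break
                raise
            # End of that data run; end-of-file counts as an implicit hole.
            end = os.lseek(fd, start, os.SEEK_HOLE)
            if end <= start:  # defensive: file changed underneath us
                break
            segments.append((start, end))
            offset = end
        return segments
    finally:
        os.close(fd)
```

For the file from the question, data_segments("example") would list the allocated ranges, and the gaps between consecutive segments are the holes.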
