I've just found out what sparse files are and wanted to conduct some experiments on them. On wiki you can read that the files can get fragmented easily. I wanted to check how bad that is. I created a file in the following way:
# truncate -s 10G sparse-file
# mkfs.ext4 -m 0 -L sparse ./sparse-file
I mounted the sparse file, and I put a 600M file in it. The fragmentation level looks like this:
# filefrag -v "/media/Grafi/sparse-file"
Filesystem type is: ef53
File size of /media/Grafi/sparse-file is 10737418240 (2621440 blocks of 4096 bytes)
ext: logical_offset: physical_offset: length: expected: flags:
0: 0.. 1032: 36864.. 37896: 1033:
1: 1043.. 1043: 37907.. 37907: 1:
2: 1059.. 1059: 37923.. 37923: 1:
3: 9251.. 9256: 46115.. 46120: 6:
4: 32768.. 32770: 51200.. 51202: 3: 69632:
5: 34816.. 55295: 77824.. 98303: 20480: 53248:
6: 55296.. 57343: 114688.. 116735: 2048: 98304:
7: 57344.. 69631: 120832.. 133119: 12288: 116736:
8: 69632.. 81919: 102400.. 114687: 12288: 133120:
9: 81920.. 98303: 135168.. 151551: 16384: 114688:
10: 98304.. 98306: 57344.. 57346: 3: 151552:
11: 100352.. 112639: 151552.. 163839: 12288: 59392:
12: 112640.. 145407: 165888.. 198655: 32768: 163840:
13: 145408.. 163839: 198656.. 217087: 18432:
14: 163840.. 163842: 40960.. 40962: 3: 217088:
15: 165888.. 178175: 217088.. 229375: 12288: 43008:
16: 178176.. 202751: 231424.. 255999: 24576: 229376:
17: 202752.. 206847: 258048.. 262143: 4096: 256000:
18: 206848.. 216756: 276480.. 286388: 9909: 262144:
19: 229376.. 229378: 43008.. 43010: 3: 299008:
20: 294912.. 294914: 53248.. 53250: 3: 108544:
21: 524288.. 524288: 55296.. 55296: 1: 282624:
22: 819200.. 819202: 61440.. 61442: 3: 350208:
23: 884736.. 884738: 63488.. 63490: 3: 126976:
24: 1048576.. 1048577: 67584.. 67585: 2: 227328:
25: 1081344.. 1081391: 69632.. 69679: 48: 100352:
26: 1572864.. 1572864: 71680.. 71680: 1: 561152:
27: 1605632.. 1605634: 73728.. 73730: 3: 104448:
28: 2097152.. 2097152: 75776.. 75776: 1: 565248:
29: 2097167.. 2097167: 75791.. 75791: 1: last
/media/Grafi/sparse-file: 25 extents found
I thought it was because of the "sparse" feature, but it looks like all files that have other filesystems in them get fragmented in this way. Take a look at the following example:
Create a file full of zeroes:
# dd if=/dev/zero of=./zero bs=1M count=2048
Check its fragmentation level:
# filefrag -v /media/Grafi/zero
Filesystem type is: ef53
File size of /media/Grafi/zero is 2147483648 (524288 blocks of 4096 bytes)
ext: logical_offset: physical_offset: length: expected: flags:
0: 0.. 32767: 6172672.. 6205439: 32768:
1: 32768.. 65535: 6205440.. 6238207: 32768:
2: 65536.. 98303: 6238208.. 6270975: 32768:
3: 98304.. 118783: 6270976.. 6291455: 20480:
4: 118784.. 151551: 6324224.. 6356991: 32768: 6291456:
5: 151552.. 184319: 6356992.. 6389759: 32768:
6: 184320.. 217087: 6389760.. 6422527: 32768:
7: 217088.. 249855: 6422528.. 6455295: 32768:
8: 249856.. 282623: 6455296.. 6488063: 32768:
9: 282624.. 315391: 6488064.. 6520831: 32768:
10: 315392.. 348159: 6520832.. 6553599: 32768:
11: 348160.. 380927: 6553600.. 6586367: 32768:
12: 380928.. 413695: 6586368.. 6619135: 32768:
13: 413696.. 446463: 6619136.. 6651903: 32768:
14: 446464.. 479231: 6651904.. 6684671: 32768:
15: 479232.. 511999: 6684672.. 6717439: 32768:
16: 512000.. 524287: 6717440.. 6729727: 12288: last,eof
/media/Grafi/zero: 2 extents found
So basically, this file has 17 extents, but from the human perspective, the file has two chunks
Now create a filesystem in this file:
# mkfs.ext4 -m 0 -L ext /media/Grafi/zero
Check its fragmentation again:
# filefrag -v /media/Grafi/zero
Filesystem type is: ef53
File size of /media/Grafi/zero is 2147483648 (524288 blocks of 4096 bytes)
ext: logical_offset: physical_offset: length: expected: flags:
0: 0.. 257: 5505024.. 5505281: 258:
1: 265.. 265: 5505289.. 5505289: 1:
2: 272.. 273: 5505296.. 5505297: 2:
3: 289.. 289: 5505313.. 5505313: 1:
4: 8481.. 8486: 5507361.. 5507366: 6: 5513505:
5: 32768.. 32769: 5509120.. 5509121: 2: 5531648:
6: 98304.. 98305: 5511168.. 5511169: 2: 5574656:
7: 163840.. 163841: 5513216.. 5513217: 2: 5576704:
8: 229376.. 229377: 5515264.. 5515265: 2: 5578752:
9: 262144.. 262144: 5517312.. 5517312: 1: 5548032:
10: 294912.. 294913: 5519360.. 5519361: 2: 5550080: last
/media/Grafi/zero: 8 extents found
Does anyone know what actually happened here? Why the file got fragmented by creating a filesystem on it? What happened to the length
?
Added:
The mkfs.ext4
parameter -Enodiscard
doesn't work. With this option I can see the structure of the file in filefrag
(the zeroed blocks). But after creating the filesystem in this way, the file becomes fragmented for some reason no matter what. Maybe it's because of the filesystem metadata that is written, and it does something to the zeroed file. I don't know. But when I watch the output of filefrag
, I can see that there is always +6 extents (in the case of 2G file). Maybe it's because of the superblock and its 5 copies? But this still doesn't explain why the whole file is fragmented — it's still the same file.
There's another thing. When I recreate the filesystem in this file:
# mkfs.ext4 -Enodiscard /media/Grafi/zero
mke2fs 1.43 (17-May-2016)
/media/Grafi/zero contains a ext4 file system
created on Thu Jun 2 13:02:28 2016
Proceed anyway? (y,n) y
Creating filesystem with 524288 4k blocks and 131072 inodes
Filesystem UUID: 6d58dddc-439b-4175-9af6-8628f0d2a278
Superblock backups stored on blocks:
32768, 98304, 163840, 229376, 294912
Allocating group tables: done
Writing inode tables: done
Creating journal (16384 blocks): done
Writing superblocks and filesystem accounting information: done
The added extents magically disappear.
Best Answer
It looks like this was a bug in
mke2fs
that caused it to usefallocate(fd, PUNCH_HOLE, ...)
instead offallocate(fd, DISCARD_ZERO, ...)
when zeroing out the space in the inode tables (even when-E nodiscard
was used).I submitted a bug report to the upstream
linux-ext4@vger.kernel.org
mailing list after verifying this behaviour locally, and got a patch within an hour, subject:They should be included into the e2fsprogs-1.45 release, and likely the 1.44.x maintenance release. If you want them in a vendor
e2fsprogs
release, I'd recommend to patch+build your e2fsprogs to verify this is working for you, report success tolinux-ext4
so that the patches will land sooner, then submit a bug report to your distro of choice so they pull the upstream patches into their releases.