How to change the extent size in the ext4 file system


As you can read here, the ext4 file system has an extent feature that groups blocks into extents. Each extent can cover up to 128MiB of contiguous space. When running e4defrag, there are lines similar to the following:

[325842/327069]/file:  100%  extents: 100 -> 10   [ OK ]

The size of the file is around 150MiB. So according to the wiki page, there should be 2 extents instead of 10.

Does anyone know why the extents are 15MiB instead of 128MiB?
Is there a tool that can check the exact extent size?
How can I change the size so it could be 128MiB?
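
The arithmetic behind these numbers can be sketched in a few lines of shell. This is only an illustration: the helper names are made up here, and the 128MiB limit is taken from the wiki page as quoted above.

```shell
# Assumption: an extent covers at most 128 MiB, as stated on the wiki page.
MAX_EXTENT_MIB=128

# Smallest number of extents a file of the given size (in MiB) could occupy.
min_extents() {
  echo $(( ($1 + MAX_EXTENT_MIB - 1) / MAX_EXTENT_MIB ))
}

# Average extent size in MiB for a file of $1 MiB split into $2 extents.
avg_extent_mib() {
  echo $(( $1 / $2 ))
}

min_extents 150         # prints 2: the best case for a ~150 MiB file
avg_extent_mib 150 10   # prints 15: what e4defrag reported after defragmenting
```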

Best Answer

I think I know how it works.

I connected another disk to my machine because it has a big, almost empty partition (~458G). I checked its free space with e2freefrag:

HISTOGRAM OF FREE EXTENT SIZES:
Extent Size Range :  Free extents   Free Blocks  Percent
   64M...  128M-  :             6        146233    0.12%
  128M...  256M-  :             5        322555    0.27%
  256M...  512M-  :             3        263897    0.22%
  512M... 1024M-  :             6       1159100    0.98%
    1G...    2G-  :           228     116312183   98.40%

Each "free extent" here is just a run of contiguous free blocks. Because the partition is almost empty, there is lots of free space, and it includes 228 chunks of 1-2G.

I placed a big 2.5G file on the partition, and the table above changed a little:

HISTOGRAM OF FREE EXTENT SIZES:
Extent Size Range :  Free extents   Free Blocks  Percent
    2M...    4M-  :             5          5114    0.00%
   64M...  128M-  :             7        170777    0.14%
  128M...  256M-  :             1         64511    0.05%
  256M...  512M-  :             4        361579    0.31%
  512M... 1024M-  :             5        930749    0.79%
    1G...    2G-  :           227     116025495   98.16%

This doesn't say anything about the allocated extents, but it gave me some ideas. When I looked at the file with e4defrag, I got something like this:

# e4defrag -cv file
<File>
[ext 1]:        start 34816:    logical 0:      len 32768
[ext 2]:        start 67584:    logical 32768:  len 30720
[ext 3]:        start 100352:   logical 63488:  len 32768
[ext 4]:        start 133120:   logical 96256:  len 30720
[ext 5]:        start 165888:   logical 126976: len 32768
[ext 6]:        start 198656:   logical 159744: len 30720
[ext 7]:        start 231424:   logical 190464: len 32768
[ext 8]:        start 264192:   logical 223232: len 30720
[ext 9]:        start 296960:   logical 253952: len 32768
[ext 10]:       start 329728:   logical 286720: len 32768
[ext 11]:       start 362496:   logical 319488: len 32768
[ext 12]:       start 395264:   logical 352256: len 32768
[ext 13]:       start 428032:   logical 385024: len 32768
[ext 14]:       start 460800:   logical 417792: len 32768
[ext 15]:       start 493568:   logical 450560: len 30720
[ext 16]:       start 557056:   logical 481280: len 32768
[ext 17]:       start 589824:   logical 514048: len 32768
[ext 18]:       start 622592:   logical 546816: len 32768
[ext 19]:       start 655360:   logical 579584: len 32768
[ext 20]:       start 688128:   logical 612352: len 32768
[ext 21]:       start 720896:   logical 645120: len 622

The len value is a count of blocks (4K each), so 32768 blocks equals 128MiB. Some extents have fewer blocks, and I don't know why: the filesystem is almost empty, so I would expect every extent to have 32768 blocks.
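
To double-check the conversion, here is a small sketch. The function names are hypothetical, and the 4K block size is an assumption that matches this filesystem; pass a second argument if yours differs.

```shell
# Convert a length in filesystem blocks to MiB (rounded down).
# Assumption: 4 KiB blocks unless a block size in bytes is given as $2.
blocks_to_mib() {
  echo $(( $1 * ${2:-4096} / 1048576 ))
}

# Read `e4defrag -cv` extent lines on stdin and print the total block count.
# The extent length is the last field on each "[ext N]:" line.
total_blocks() {
  awk '/^\[ext [0-9]+\]:/ { sum += $NF } END { print sum + 0 }'
}

blocks_to_mib 32768   # prints 128
blocks_to_mib 30720   # prints 120
```

Piping the extent list above into total_blocks gives the file's length in blocks, which can then be passed to blocks_to_mib.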

Anyway I checked the main partition to see its free space, and there was something like this:

HISTOGRAM OF FREE EXTENT SIZES:
Extent Size Range :  Free extents   Free Blocks  Percent
    4K...    8K-  :          3955          3955    0.06%
    8K...   16K-  :          3495          8194    0.13%
   16K...   32K-  :          2601         13165    0.20%
   32K...   64K-  :          2622         28991    0.45%
   64K...  128K-  :          2565         58267    0.90%
  128K...  256K-  :          1576         71371    1.11%
  256K...  512K-  :          1331        118346    1.83%
  512K... 1024K-  :          1058        190532    2.95%
    1M...    2M-  :          1202        444210    6.89%
    2M...    4M-  :          1211        884489   13.71%
    4M...    8M-  :          1249       1803998   27.97%
    8M...   16M-  :           622       1643226   25.48%
   16M...   32M-  :           198       1024999   15.89%
   32M...   64M-  :            16        163082    2.53%

As you can see, there are no runs of contiguous free blocks that could provide 128M (or more) of space, and that's why the wiki says you can have extents "up to" 128M.
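
The same check can be done mechanically. The sketch below is only a rough helper (its name is made up here) and assumes the histogram layout shown above, where the first column looks like "64M..." and the units are K, M or G.

```shell
# Succeeds if the e2freefrag histogram on stdin has a range whose lower
# bound is at least $1 MiB. Assumes the "64M...  128M-  :" layout above.
has_free_range_at_least_mib() {
  awk -v want="$1" '
    /\.\.\./ {
      low = $1
      sub(/\.\.\..*/, "", low)                 # "64M..." becomes "64M"
      unit = substr(low, length(low))          # K, M or G
      num = substr(low, 1, length(low) - 1) + 0
      mib = (unit == "K") ? num / 1024 : (unit == "G") ? num * 1024 : num
      if (mib >= want) found = 1
    }
    END { exit found ? 0 : 1 }'
}
```

For example, `e2freefrag /dev/sdXN | has_free_range_at_least_mib 128 && echo "room for a full-size extent"`, where /dev/sdXN is a placeholder for the actual device.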

I'm not sure why the file in question ends up with 10 extents, because there are still 16 free chunks of at least 32M each.

Related Solutions

Sparse Files – Understanding File Holes and Block Size

Ext4 can use 1kB, 2kB or 4kB as the block size; as far as I know the default on Ubuntu is 4kB. Note that here, a block is the unit in which file data is allocated, and its size is constant for a given filesystem. The file you describe has two blocks that are not all zeroes: the one containing hello (surrounded by zeroes: 3616 bytes before and 474 after), and the one containing there (preceded by zeroes, and holding only 3148 bytes, after which the end of the file is reached). The total is two blocks of 4kB.

In the ls output, blocks are an arbitrary unit chosen by the ls command and defaulting to 1kB. There are 2 blocks of 4kB each allocated to contain file data, therefore the allocated size for the file is 8kB.

Your confusion may be due to two things. First, the figure of 2048 bytes for a block is possible, but it's not the default value under Ubuntu (or most modern distributions), and it's apparently not the value on your system. You can check the block size by running tune2fs -l /dev/sdz42 (use the actual path to your filesystem device).

Second, sparse files work by not storing blocks that consist entirely of zeroes. If a block (which is necessarily aligned on a block-size boundary, at least for most filesystems including ext4) contains zeroes and other data, the full block is stored on disk. Thus, in that 40012-byte file (how did you get 40013, by the way?), there are 4 all-zero blocks that are not stored, then one stored block containing hello surrounded by zeroes, then 4 more all-zero blocks that are not stored, and a final partial block containing zeroes and there.

Note that your utility can be written in terms of standard shell commands:

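n=20000   # number of bytes to skip before each input line
while IFS= read -r line; do
  # Seek forward $n bytes in the output without writing anything, leaving a hole
  dd bs=1 seek="$n" </dev/null
  echo "$line"
done >testfile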
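
To see the difference between apparent and allocated size for yourself, something like the following should work. It is a sketch that assumes GNU dd and GNU stat; the function names and the scratch file name are made up for the example.

```shell
# Build the 40012-byte example: "hello" at offset 20000, "there" at 40006.
make_sparse_demo() {
  rm -f "$1"
  # seek= skips ahead without writing; conv=notrunc keeps what is already there
  printf 'hello\n' | dd of="$1" bs=1 seek=20000 conv=notrunc 2>/dev/null
  printf 'there\n' | dd of="$1" bs=1 seek=40006 conv=notrunc 2>/dev/null
}

# %s = apparent size in bytes, %b = allocated blocks, %B = bytes per %b block
show_allocation() {
  stat -c 'size=%s blocks=%b unit=%B' "$1"
}

demo="$(mktemp -d)/sparse-demo"
make_sparse_demo "$demo"
show_allocation "$demo"
```

The size field should read 40012. On an ext4 filesystem with 4kB blocks you would expect 16 blocks of 512 bytes, i.e. the two stored 4kB blocks; other filesystems may report something different.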

Stat, Blocks and Sector size – ext4

The links you give explicitly state:

The st_blocks field indicates the number of blocks allocated to the file, 512-byte units.

So they're always in units of 512-byte blocks, regardless of what underlying device is used. The stat command simply displays what the stat system call returns. The 512-byte block is a historic thing, defined in POSIX. Compare for example these:

$ ls -s smallfile.txt
4 smallfile.txt
$ env POSIXLY_CORRECT=1 ls -s smallfile.txt
8 smallfile.txt

GNU ls displays blocks by default in 1kB blocks, but when forced to comply with POSIX it shows 512-byte blocks.
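
The two numbers above describe the same allocation in different units. A quick sketch of the conversion (the helper names are invented for this example):

```shell
# st_blocks is counted in 512-byte units, whatever the filesystem block size.
st_blocks_to_bytes() {
  echo $(( $1 * 512 ))
}

# The same allocation in the 1 KiB units that GNU ls -s uses by default.
st_blocks_to_ls_kib() {
  echo $(( $1 * 512 / 1024 ))
}

st_blocks_to_bytes 8    # prints 4096
st_blocks_to_ls_kib 8   # prints 4, matching the default ls -s output
```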
