Difference between block size and cluster size

block-deviceext3filesystemsterminology

I've got a question concerning the block size and cluster size. Regarding to what I have read about that I assume the following:

  • The block size is the physical size of a block, mostly 512 bytes. There is no way to change this.
  • The cluster size is the minimal size of a block that is read and writable by the OS. If I create a new filesystem e.g. ext3, I can specify this minimal block size with the switch -b. Almost all programs like dumpe2fs, mke2fs use block size as an name for cluster size.

If I have got the following output:

$ stat test
File: `test'
Size: 13            Blocks: 4          IO Block: 2048   regular file
Device: 700h/1792d  Inode: 15          Links: 1

Is it correct that the size is the actual space in bytes, blocks are the physically used blocks (512 bytes for each) and IO block relates to the block size specified when creating the FS?

Best Answer

I think you're confused, possibly because you've read several documents that use different terminology. Terms like “block size” and “cluster size” don't have a universal meaning, even within the context of filesystem literature.

Filesystems

For ext2 or ext3, the situation is relatively simple: each file occupies a certain number of blocks. All blocks on a given filesystem have the same size, usually one of 1024, 2048 or 4096 bytes. A file¹ whose size is between N blocks plus one byte and N+1 blocks occupies N+1 blocks. That block size is what you specify with mke2fs -b. There is no separate notion of clusters.

The FAT filesystem used in particular by MS-DOS and early versions of Windows has a similarly simple space allocation. What ext2 calls blocks, FAT calls clusters; the concept is the same.

Some filesystems have a more sophisticated allocation scheme: they have fixed-size blocks, but can use the same block to store the last few bytes of more than one file. This is known as block suballocation; Reiserfs and Btrfs do it, but not ext3 or even ext4.

Utilities

Unix utilities often use the word “block” to mean an arbitrarily-sized unit, typically 512 bytes or 1kB. This usage is unrelated to any particular filesystem or disk hardware. Historically, the 512B block did come about because disks and filesystems at the time often operated in 512B chunks, but the modern usage is just arbitrary. Traditional unix utilities and interfaces still use 512B blocks sometimes, though 1kB blocks are now often preferred. You need to check the documentation of each utility to know what size of block it's using (some have a switch, e.g. du -B or df -B on Linux).

In the GNU/Linux stat utility, the blocks figure is the number of 512B blocks used by the file. The IO Block figure is the preferred size for file input-output, which is in principle unrelated but usually an indication of the underlying filesystem's block size (or cluster size if that's what you want to call it). Here, you have a 13-byte file, which is occupying one block on the ext3 filesystem with a block size of 2048; therefore the file occupies 4 512-byte units (called “blocks” by stat).

Disks

Most disks present an interface that shows the disk as a bunch of sectors. The disk can only write or read a whole sector, not individual bits or bytes. Most hard disks have 512-byte sectors, though 4kB-sector disks started appearing a couple of years ago.

The disk sector size is not directly related to the filesystem block size, but having a block be a whole number of sectors is better for performance.

¹ Exception: sparse files save space .

Related Question