What makes fsck so slow on big filesystems

ffs · filesystems · fsck · openbsd

I have over a dozen filesystems on my OpenBSD server, which has 12 GB of DDR3 RAM and several 1.5 TB HDDs. The filesystems themselves are generally between 8 GB and 64 GB in size.

I've noticed that even though I follow the best practice of keeping them small, fsck is still very slow on reboot.

What makes fsck so slow? The raw filesystem size? The total number of inodes (iused + ifree)? The number of used inodes? Something else entirely? Is there any easy way to improve fsck times even further?

Best Answer

The purpose of running fsck is to find inconsistencies. Doing so means walking the filesystem and looking at each directory entry (directory/file) as well as the data behind it, to verify, for example, that the size recorded in the directory entry matches the actual size of the data. This process has always been slow. In the old days we didn't notice, since filesystems were much smaller, contained far fewer files, and computers took longer to boot anyway (services were started sequentially). Since the speed of rotating disks isn't increasing the way their capacity is, running a full filesystem check during system start is becoming less and less feasible.
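To make the scaling concrete, here is a toy sketch in Python of what such a full consistency pass has to do. It is an illustration of the idea only, not how fsck_ffs is actually implemented: the point is that the cost grows with the number of entries that must be visited, plus the data that must be read behind them.

```python
# Toy model of why a full consistency check scales with the number of
# used inodes: every entry must be visited and cross-checked, so doubling
# the file count roughly doubles the check time regardless of disk size.
# Illustrative sketch only, not how fsck_ffs is actually implemented.

from dataclasses import dataclass

@dataclass
class Inode:
    recorded_size: int   # size stored in the inode/directory entry
    data: bytes          # the blocks actually on disk

def full_check(filesystem: dict[str, Inode]) -> list[str]:
    """Walk every entry and verify metadata against the data behind it."""
    errors = []
    for path, inode in filesystem.items():          # O(number of used inodes)
        if inode.recorded_size != len(inode.data):  # plus reading the data itself
            errors.append(f"{path}: size mismatch "
                          f"({inode.recorded_size} recorded, {len(inode.data)} actual)")
    return errors

# A tiny "filesystem" with one inconsistency, as fsck might find after a crash.
fs = {
    "/etc/passwd": Inode(recorded_size=5, data=b"root:"),
    "/var/log/messages": Inode(recorded_size=100, data=b"truncated by crash"),
}
print(full_check(fs))   # -> ['/var/log/messages: size mismatch (100 recorded, 18 actual)']
```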

That's why most reasonably modern filesystems (ext3, ext4, reiserfs, XFS, ...) no longer do a full filesystem check on reboot. Instead they use a journal for bookkeeping. Before a change is written to disk, it is written to the journal. Once the change is complete, the outstanding transaction is marked as complete in the journal. Should the system die before a transaction completes, the filesystem knows which transactions were underway and can "replay" them to bring the filesystem back into a consistent state. This tends to be much faster than running a full filesystem check. Modern filesystems use a number of clever tricks to reduce the overhead of maintaining the journal; in practice you often don't notice the difference.
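As a rough illustration of the idea (my own toy model, not the on-disk format of ext3/ext4/XFS or any real journal): the crucial point is that after a crash only the journal has to be replayed, not the whole filesystem. Real journals decide between redoing and discarding an interrupted transaction based on whether its commit record made it to disk; this sketch simply redoes it.

```python
# Minimal write-ahead journaling sketch: every change is recorded in the
# journal before it touches the "disk", so recovery only needs to walk the
# journal instead of scanning the whole filesystem.

journal = []   # ordered list of per-transaction records
disk = {}      # the filesystem proper

def write(txid, key, value, crash_before_commit=False):
    journal.append({"tx": txid, "key": key, "value": value, "done": False})
    if crash_before_commit:
        return                      # simulate power loss mid-transaction
    disk[key] = value               # apply the change to the filesystem
    journal[-1]["done"] = True      # mark the transaction complete

def replay():
    """Recovery: redo only the transactions that were not marked complete."""
    for rec in journal:
        if not rec["done"]:
            disk[rec["key"]] = rec["value"]   # redo the interrupted change
            rec["done"] = True

write(1, "fileA", "new contents")
write(2, "fileB", "half-written", crash_before_commit=True)
replay()                            # cost is proportional to the journal, not the disk
print(disk)                         # {'fileA': 'new contents', 'fileB': 'half-written'}
```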

The latest generation of filesystems (btrfs, ZFS, ...) use copy-on-write techniques, which means a transaction that modifies a file or metadata never overwrites existing data. Instead, the new data is written to separate blocks, and once the new copy is complete the filesystem atomically switches over to it. This also effectively prevents the filesystem from becoming inconsistent (and it has some other advantages as well).
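Again as a hedged toy model (not btrfs or ZFS internals), the core trick is that the new copy is written out in full first and a single pointer switch makes it live, so a crash at any point leaves either the old or the new version intact, never a half-written mix:

```python
# Rough copy-on-write sketch: new data goes to fresh blocks, and only an
# atomic pointer switch makes it visible. Illustrative model only.

blocks = {"block0": "old contents"}     # allocated data blocks
root = "block0"                         # the "superblock" pointer currently in use

def cow_update(new_data, crash_before_switch=False):
    global root
    new_block = f"block{len(blocks)}"   # never overwrite existing data
    blocks[new_block] = new_data        # write the full new copy first
    if crash_before_switch:
        return                          # old version still referenced via root
    root = new_block                    # atomic switch to the new copy

cow_update("new contents", crash_before_switch=True)
print(blocks[root])                     # -> 'old contents' (still consistent)
cow_update("new contents")
print(blocks[root])                     # -> 'new contents'
```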

Consider using a journaling filesystem or a copy-on-write filesystem if you want your system to start quickly.
