I'm having a problem with my ext4 system partition. I'm running 17.04 (upgraded from 16.10), but the problem was already present in 16.10.
Occasionally (most commonly after waking the system from suspend) the system crashes with a bunch of ext4 filesystem errors.
Now when checking the filesystem with
sudo fsck -n /dev/nvme0n1p2
fsck claims that the filesystem is clean:
fsck from util-linux 2.29
e2fsck 1.43.4 (31-Jan-2017)
Warning! /dev/nvme0n1p2 is mounted.
Warning: skipping journal recovery because doing a read-only filesystem check.
/dev/nvme0n1p2: clean, 434755/15089664 files, 46490132/60347136 blocks
However, if I force a check, I get a whole bunch of errors:
sudo fsck -nf /dev/nvme0n1p2
fsck from util-linux 2.29
e2fsck 1.43.4 (31-Jan-2017)
Warning! /dev/nvme0n1p2 is mounted.
Warning: skipping journal recovery because doing a read-only filesystem check.
Pass 1: Checking inodes, blocks, and sizes
Inodes that were part of a corrupted orphan linked list found. Fix? no
Inode 131275 was part of the orphaned inode list. IGNORED.
Inode 135221 was part of the orphaned inode list. IGNORED.
Inode 135244 was part of the orphaned inode list. IGNORED.
Inode 135260 was part of the orphaned inode list. IGNORED.
Inode 135263 was part of the orphaned inode list. IGNORED.
Inode 135268 was part of the orphaned inode list. IGNORED.
Deleted inode 135272 has zero dtime. Fix? no
Inode 135274 was part of the orphaned inode list. IGNORED.
Inode 135384 was part of the orphaned inode list. IGNORED.
Inode 135396 was part of the orphaned inode list. IGNORED.
Inode 135697 was part of the orphaned inode list. IGNORED.
Inode 135711 was part of the orphaned inode list. IGNORED.
Inode 135713 was part of the orphaned inode list. IGNORED.
Inode 12059086 was part of the orphaned inode list. IGNORED.
Inode 12061077 was part of the orphaned inode list. IGNORED.
Inode 12062594 was part of the orphaned inode list. IGNORED.
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information
Block bitmap differences: -(40927357--40927367) -(40927563--40927569) -(40940652--40940653) -(40940676--40940681) -(48296964--48296970) -(48296978--48296984) -(48304145--48304165) -(48304315--48304321) -(48326677--48326690) -(48326733--48326739) -(48326775--48326781)
Fix? no
Free blocks count wrong (13857004, counted=13856542).
Fix? no
Inode bitmap differences: -131275 -135221 -135244 -135260 -135263 -135268 -135272 -135274 -135384 -135396 -135697 -135711 -135713 -12059086 -12061077 -12062594
Fix? no
Free inodes count wrong (14654909, counted=14654758).
Fix? no
/dev/nvme0n1p2: ********** WARNING: Filesystem still has errors **********
/dev/nvme0n1p2: 434755/15089664 files (0.3% non-contiguous), 46490132/60347136 blocks
Now my problem is that I cannot fix those errors, since it is my system partition, which I cannot unmount.
But when I boot from an external drive or in recovery mode, fsck does not report any errors, even with -f.
After rebooting my system, however, the errors persist. I'm currently at a loss as to how I might fix the filesystem. Any help would be greatly appreciated.
Best Answer
If you force a file system check with fsck -nf <filesystem> on an ext4 file system that is currently mounted read-write, you will typically get error messages like the ones you posted (corrupted orphan linked list, Deleted inode ... has zero dtime). The fact that fsck -n <filesystem> reports it as clean is a bit misleading here: without -f, e2fsck sees that the superblock marks the file system as clean and skips the full check.

The reason you are not seeing these errors in recovery mode or when booted from an external drive is simply that, in those cases, the file system in question is not mounted read-write, or not mounted at all.
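To make the point above concrete, here is a minimal sketch of a check you could run before trusting the output of a forced fsck. The function name mount_state is made up for illustration, and it assumes the device appears in /proc/mounts under the same name you pass in (the root file system is sometimes listed under a different name such as /dev/root, in which case this sketch would wrongly report it as unmounted).

```shell
#!/bin/sh
# mount_state DEVICE [MOUNTS_FILE]
# Hypothetical helper: prints "rw", "ro" or "unmounted" for DEVICE.
# MOUNTS_FILE defaults to /proc/mounts; its fields are:
#   device mountpoint fstype options dump pass
mount_state() {
    awk -v dev="$1" '
        $1 == dev {
            found = 1
            # The fourth field is a comma-separated list of mount options.
            n = split($4, opts, ",")
            for (i = 1; i <= n; i++)
                if (opts[i] == "rw") rw = 1
        }
        END {
            if (!found)  print "unmounted"
            else if (rw) print "rw"
            else         print "ro"
        }
    ' "${2:-/proc/mounts}"
}
```

For example, mount_state /dev/nvme0n1p2 on the running system from the question should print rw, which is the situation in which the errors from fsck -nf cannot be trusted. Only a result of unmounted means a forced check gives a reliable picture.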
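The manual page for e2fsck also explicitly states that it is in general not safe to run e2fsck on a mounted file system, and that even when it is safe to do so (with -n), the results it prints are not valid while the file system is mounted.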
Conclusion: If you intend to use the -f flag for fsck, make sure you understand exactly what it does. In particular, using it on a mounted file system is usually not what you want.

Why you are getting ext4 errors when waking from suspend is an entirely different problem, which you seem to have already solved. For reference, I will include the links you posted yourself in a comment here, as they were helpful in solving your original problem: