I have a directory tree that contains many small files, and a small number of larger files. The average size of a file is about 1 kilobyte. There are 210158 files and directories in the tree (this number was obtained by running find | wc -l).
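For anyone who wants to reproduce these figures on their own tree, here is a minimal Python sketch. The function name tree_stats is made up for illustration. It counts entries the way find | wc -l does (the root itself is included) and averages the size of regular files only, which is an assumption about how the 1 kilobyte figure was obtained.

```python
import os
import stat

def tree_stats(root):
    """Return (entries, regular_files, mean_file_size_in_bytes) for a tree."""
    entries = 1          # find also prints the starting directory itself
    files = 0
    total_bytes = 0
    for dirpath, dirnames, filenames in os.walk(root):
        entries += len(dirnames) + len(filenames)
        for name in filenames:
            try:
                st = os.lstat(os.path.join(dirpath, name))
            except OSError:
                continue  # entry vanished while walking
            if stat.S_ISREG(st.st_mode):
                files += 1
                total_bytes += st.st_size
    mean = total_bytes / files if files else 0.0
    return entries, files, mean

if __name__ == "__main__":
    print(tree_stats("."))
```

The entry count can differ from find | wc -l if file names contain newlines, since wc counts lines rather than entries.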
A small percentage of files gets added/deleted/rewritten several times per week. This applies to the small files, as well as to the (small number of) larger files.
The filesystems that I tried (ext4, btrfs) have problems with the positioning of files on disk. Over time, the physical positions of files on the disk (rotating media, not a solid state disk) become more randomly distributed. The negative consequence of this random distribution is that the filesystem gets slower (for example, 4 times slower than a fresh filesystem).
Is there a Linux filesystem (or a method of filesystem maintenance) that does not suffer from this performance degradation and is able to maintain a stable performance profile on rotating media? The filesystem may run on FUSE, but it needs to be reliable.
Best Answer
Performance
I wrote a small benchmark (source) to find out which file system performs best with hundreds of thousands of small files:
delete all files
sync and drop cache after every step
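The linked benchmark source is not reproduced here, so the following is only a rough Python sketch of what such a test could look like, not the author's script. The create and read steps, the file count, the file size and the directory layout are all assumptions. Only "delete all files" and "sync and drop cache after every step" come from the list above. Dropping the page cache writes to /proc/sys/vm/drop_caches, which needs root on Linux; without it the sketch still runs, but the timings include cache effects.

```python
import os
import time

def sync_and_drop_caches():
    """Flush dirty pages, then ask the kernel to drop its caches (root only)."""
    os.sync()
    try:
        with open("/proc/sys/vm/drop_caches", "w") as f:
            f.write("3\n")
        return True
    except OSError:
        return False  # not root, or not Linux: caches stay warm

def timed(step):
    """Run step() and return elapsed seconds, including the final sync."""
    start = time.perf_counter()
    step()
    os.sync()
    return time.perf_counter() - start

def run_benchmark(root, count=1000, size=1024):
    """Time create, read and delete of `count` files of `size` bytes each."""
    payload = b"x" * size
    # Spread the files over 100 subdirectories (hypothetical layout).
    paths = [os.path.join(root, f"d{i % 100:02d}", f"f{i:06d}")
             for i in range(count)]

    def create():
        for p in paths:
            os.makedirs(os.path.dirname(p), exist_ok=True)
            with open(p, "wb") as f:
                f.write(payload)

    def read():
        for p in paths:
            with open(p, "rb") as f:
                f.read()

    def delete():
        for p in paths:
            os.remove(p)

    results = {}
    for name, step in (("create", create), ("read", read), ("delete", delete)):
        results[name] = timed(step)
        sync_and_drop_caches()  # cold cache before the next step
    return results
```

To compare file systems, run_benchmark would be pointed at a directory on each freshly created file system and the results averaged over several runs.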
Results (average time in seconds, lower = better):
Result:
While Ext4 had good overall performance, ReiserFS was extremely fast at reading files sequentially. It turned out that XFS is slow with many small files, so you should not use it for this use case.
Fragmentation issue
The only way to prevent a file system from distributing files all over the drive is to keep the partition only as big as you really need. Take care not to make the partition too small, though, because that leads to fragmentation within individual files. Using LVM can be very helpful here, since a logical volume can be grown later if it turns out to be too small.
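As a rough aid for picking a size, the sketch below sums the space currently allocated to a tree and adds some headroom. The function names, the 25% headroom and the rounding to 4 MiB (the default LVM physical extent size) are assumptions for illustration, not recommendations from the benchmark. It also assumes Linux, where st_blocks is reported in 512-byte units, and it does not account for file system metadata or hard links.

```python
import os

def allocated_bytes(root):
    """Sum the disk space allocated to root and everything below it."""
    total = os.lstat(root).st_blocks * 512
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            try:
                st = os.lstat(os.path.join(dirpath, name))
            except OSError:
                continue  # entry vanished while walking
            total += st.st_blocks * 512  # 512-byte units on Linux
    return total

def suggested_partition_bytes(root, headroom=0.25, extent=4 * 1024 * 1024):
    """Allocated size plus headroom, rounded up to whole extents."""
    needed = int(allocated_bytes(root) * (1 + headroom))
    return -(-needed // extent) * extent  # ceiling division

if __name__ == "__main__":
    size = suggested_partition_bytes(".")
    print(f"{size / 2**20:.0f} MiB")
```

How much headroom is appropriate depends on how quickly the tree grows and on the file system's own overhead, so the result is a starting point rather than a final answer.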
Further reading
The Arch Wiki has some great articles dealing with file system performance:
https://wiki.archlinux.org/index.php/Beginner%27s_Guide#Filesystem_types
https://wiki.archlinux.org/index.php/Maximizing_Performance#Storage_devices