Performance
I wrote a small benchmark (source) to find out which file system performs best with hundreds of thousands of small files; a rough sketch of such a test is shown after the results below:
Results (average time in seconds, lower = better):
Using Linux kernel version 3.1.7
Btrfs:
create: 53 s
rewrite: 6 s
read sq: 4 s
read rn: 312 s
delete: 373 s
ext4:
create: 46 s
rewrite: 18 s
read sq: 29 s
read rn: 272 s
delete: 12 s
ReiserFS:
create: 62 s
rewrite: 321 s
read sq: 6 s
read rn: 246 s
delete: 41 s
XFS:
create: 68 s
rewrite: 430 s
read sq: 37 s
read rn: 367 s
delete: 36 s
Result:
While ext4 had good overall performance, ReiserFS was extremely fast at reading sequential files. XFS turned out to be slow with many small files; you should not use it for this use case.
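The linked benchmark source is not reproduced here, but a rough sketch of this kind of test could look like the script below; the file count, file size and target directory are assumptions, not the original parameters.

    #!/bin/bash
    # Rough sketch of a small-file benchmark; parameters are assumptions, not
    # those of the original benchmark.
    DIR=/mnt/test/bench    # mount point of the filesystem under test
    COUNT=100000           # number of small files
    SIZE=4k                # size of each file

    create() {
        for ((i = 0; i < COUNT; i++)); do
            dd if=/dev/urandom of="$DIR/file$i" bs="$SIZE" count=1 2>/dev/null
        done
    }

    mkdir -p "$DIR"
    echo "create:";  time create
    sync && echo 3 > /proc/sys/vm/drop_caches    # drop the page cache before reading (needs root)
    echo "read sq:"; time find "$DIR" -type f -exec cat {} + > /dev/null
    echo "delete:";  time rm -rf "$DIR"

Run it once per filesystem on the same mount point to get comparable numbers; rewrite and random-read phases would be added the same way.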
Fragmentation issue
The only way to prevent file systems from scattering files across the drive is to keep the partition only as large as you really need, but be careful not to make it too small, or you will cause intra-file fragmentation instead. Using LVM can be very helpful here.
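For example, with LVM you could start with a modest logical volume and grow it (and the filesystem on it) only when it actually fills up; the device and volume names below are placeholders:

    pvcreate /dev/sdb1
    vgcreate data /dev/sdb1
    lvcreate -L 20G -n files data       # only as large as currently needed
    mkfs.ext4 /dev/data/files

    # later, when the volume fills up
    lvextend -L +10G /dev/data/files
    resize2fs /dev/data/files           # ext4 can be grown while mounted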
Further reading
The Arch Wiki has some great articles dealing with file system performance:
https://wiki.archlinux.org/index.php/Beginner%27s_Guide#Filesystem_types
https://wiki.archlinux.org/index.php/Maximizing_Performance#Storage_devices
All three data journaling modes should leave the filesystem itself fully intact after a power failure, so it should always mount without errors. The difference is only in the data in your files: data=writeback mode may leave stale data (i.e., what was stored in the disk sectors before the writes your app did); data=ordered and data=journal should not do this.
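The mode is chosen per mount, for example in /etc/fstab; the device and mount point below are only placeholders:

    # /etc/fstab (example entry)
    /dev/sda2   /srv/data   ext4   defaults,data=ordered   0   2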
Most likely what you're seeing is that I/O barriers aren't working on your setup. First, make sure you're not mounting with barrier=0 or nobarrier. That boosts performance, but will cause corruption on power failure.
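To check, look at the active mount options and your fstab; the mount point here is just an example:

    # look for barrier=0 or nobarrier in the active mount options
    grep ' /srv/data ' /proc/mounts
    # ...and in the static configuration
    grep -E 'barrier=0|nobarrier' /etc/fstab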
If I/O barriers are on, it's also possible you're passing through a storage layer that doesn't support them. On older releases, neither LVM nor various mdraid levels passed barriers through. (This was fixed in Linux 2.6.33, so it only matters if you're still running Lucid.)
Finally, it's possible your disks are telling lies. Disks have write caches. Especially with NCQ, they're supposed to only tell the OS they've written data when they've actually done so, but they've been known to tell the OS it's written when it's only in the disk's write cache. That increases performance, at least as long as the power stays on. You can try disabling the write cache on the disks, though you'll take a performance hit for this.
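On SATA/IDE disks this can be done with hdparm; the device name is a placeholder:

    hdparm -W /dev/sda      # show the current write-cache setting
    hdparm -W 0 /dev/sda    # disable the volatile write cache (costs write performance)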
Note also that flash-memory disks have a lot of work to do under the hood, and many of them don't handle power failure well. (For example, wear leveling sometimes requires that a full flash block of data be moved. If the power fails in the middle, bad things happen on some flash disks.)
Finally... have you considered a UPS?
Best Answer
Do the random files include any font files, or files whose filename suffix matches that of a font file type? And does your desktop environment include a tool or library that would produce a preview of font files, or a custom icon for them?
fontconfig creates .uuid files in directories like ~/.fontconfig. I guess some sort of font previewer might be doing its job by invoking fontconfig with a custom directory, thus causing the .uuid files to be dropped into directories where possible font files exist.
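To narrow that down, you could list where the .uuid files actually show up and watch when one reappears after you delete it; the paths are examples, and inotifywait comes from the inotify-tools package:

    # where are the stray .uuid files?
    find ~ -name '.uuid' 2>/dev/null

    # watch a suspect directory and see when a .uuid file is recreated
    inotifywait -m -r --format '%w%f %e' ~/Downloads | grep '\.uuid'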