Linux – File system that never breaks (data loss acceptable)

Tags: embedded, filesystems, linux, sd card

There are several existing topics revolving around this issue, but what I'm looking for is slightly different. I have an SD card on an embedded Linux system and it suffers from power loss. I might be able to modify the hardware at some point (shut down properly and so on), but right now I'd just like to find a file system that survives power loss without fuss. Data loss is acceptable. I'd prefer not to lose more than the file I'm currently writing, but I'd still rather lose it all than face an 'unable to mount', a 'wait 10 minutes for fsck', or an 'unable to create new file due to this inode something something' error.
The program MUST go on!

I'm putting a lot of effort into ensuring this. I'm using industrial grade components; I've got hardware watchdogs, software watchdogs, internal and external ones, init restarting the programs, daemons constantly checking memory, file descriptors and whatnot. I've got watchdogs watching my watchdogs, which in turn are watched by other watchdogs…
But I can't seem to guarantee that the SD card will always mount and function.

My best bet right now is to use JFS on the SD card and include fsck and fsck.jfs in my installation (adding 600 kB+ that eats my RAM and my flash, which is bad), and to run fsck at every startup (possibly adding a lot of boot time, which is somewhat bad).
It seems a bit sad though.

Does anyone know of a better way or a better file system?

UPDATE:
e2fsprogs-libs (a dependency of jfsutils) seems to be hellishly difficult to compile in my distribution. I'll look into ZFS (it's not native to my distribution though, and it seems to do a lot that I don't need).

UPDATE2:
Some more info about my system and my tests: the SD card storage is a secondary, optional storage. The SD cards are 2 GB–8 GB industrial grade microSD. The SD card is mounted from my rc with a mount -t command, using the 'noatime' option but not 'sync'. My distribution is a custom Analog Devices flavored uClinux, with a 3.10 kernel and a 1.21 BusyBox.
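For reference, the relevant rc line looks roughly like this (the device node and mount point are placeholders, not my actual paths):

    # BusyBox mount from rc; ext2 at the moment, noatime to cut down on writes
    mount -t ext2 -o noatime /dev/mmcblk0p1 /mnt/sdcard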
My primary storage is a SPI flash with jffs2. I've never had any issues with that; I don't even know if there is a fsck.jffs2 available. NAND flash on the other hand … but that's a different story. The purpose of the SD card is to store measurement data. The 'monitor' program appends results to a file and has strategic sync placements. When the file grows above a given size, a new one is created. When a given number of files has been reached, the oldest one is deleted. If the current measurement file is lost due to power loss, it's no disaster. The files are usually 50–100 kB and one result is usually 1 kB.
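To make the write pattern concrete, here's a rough shell sketch of the append/rotate cycle (the real 'monitor' is a separate program; the directory, size and count values below are illustrative, not my real configuration):

    DATADIR=/mnt/sdcard/results
    MAXSIZE=102400      # rotate when the current file exceeds ~100 kB
    MAXFILES=500        # keep at most 500 files

    append_result() {   # $1 = one ~1 kB measurement record
        printf '%s\n' "$1" >> "$DATADIR/current.dat"
        sync                        # strategic sync after each result
        if [ "$(wc -c < "$DATADIR/current.dat")" -gt "$MAXSIZE" ]; then
            mv "$DATADIR/current.dat" "$DATADIR/$(date +%s).dat"
            sync
            # drop the oldest files once the count limit is exceeded
            while [ "$(ls "$DATADIR" | wc -l)" -gt "$MAXFILES" ]; do
                rm "$DATADIR/$(ls -tr "$DATADIR" | head -n 1)"
            done
        fi
    }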
This is just the initial development phase; nothing is fixed yet. This is the first time I've dealt with non-flash filesystems in embedded systems. (I've got ext4 on my x86 servers.)

I started out with vfat, the default filesystem. (I figured the factories might have a reason for choosing it, and if things work I don't really care that much.) I've never seen any power loss issues in my embedded vfat devices, though I've experienced issues with FAT in WinCE. However, when my 'monitor' program reached 100-200 files it refused to create any more. It turns out FAT has a hard file limit in the root directory (FAT12/16 reserve a fixed number of root-directory entries, typically 512, and VFAT long filenames consume several entries per file) and a higher one in subdirectories. I need to be able to create 500-1000 files in one dir, so vfat won't do.

Then I switched to ext2, without running a fsck at startup (I didn't know I had to). Within a day my 'monitor' program was unable to create more files due to an 'inode something something' error. Disaster!

My current solution is ext2 with an 'e2fsck -y' at startup. So far it seems promising, but e2fsck and the whole concept of 'fsck at startup' is nagging at me. e2fsck by itself takes up more than 350 kB of my primary flash and RAM (when it's not running), which makes it my biggest program. It's bigger than BusyBox. It's almost rivaling my kernel.
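The startup sequence currently amounts to something like this in my rc (the device path, as above, is a placeholder):

    # Repair non-interactively before mounting; -y answers yes to every prompt
    e2fsck -y /dev/mmcblk0p1
    mount -t ext2 -o noatime /dev/mmcblk0p1 /mnt/sdcard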

I've been considering ext3. It has journalled metadata, which wouldn't hurt, though I'm in doubt as to how much it will help; with my small files and controlled syncs I should be covered, I think? By default it uses an ordered write sequence, meaning data is also somewhat journalled. That can lead to non-deterministic lags, which is bad in my situation (though it's probably not an issue here). It also commits the journal on a schedule, e.g. every 5 seconds, which I think interferes with my own syncs, and too many writes are bad for SD cards, even industrial ones. I can't find a way to disable that outright; as far as I can tell the closest thing is stretching the interval with the 'commit=' mount option. And ext3 still requires fsck to be run at every startup! But ext3 is still a possibility.
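If I do go that way, something like this is what I'd try (the 60-second interval is a guess on my part, not tested):

    # data=ordered is the ext3 default; commit= stretches the periodic
    # journal commit from the default 5 s to 60 s to reduce SD card writes
    mount -t ext3 -o noatime,data=ordered,commit=60 /dev/mmcblk0p1 /mnt/sdcard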

Ext4 fixes a lot of the performance issues of ext3, but I don't really need performance. And my distribution doesn't seem to have a built-in mkfs.ext4 and fsck.ext4. Perhaps that's not a problem, but it might be; e.g. e2fsprogs-libs (a dependency of jfsutils) seems to have a lot of compile issues.

JFS, XFS, BTRFS: all supported by my kernel, but currently not included in my user space toolbox. They all seem to be rather big, complex systems, and they all seem to require a 'fsck' equivalent at startup?

I've also considered rolling my own filesystem: always write two copies of the file table; when traversing, pick the one with a correct CRC and the newest sequence number. Use a two-stage write sequence: allocate temporarily, fix up at commit. No fsck needed.
I'm afraid that it might be a bit naive though. A rough sketch of the idea follows.
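This is roughly what I have in mind, expressed in shell for brevity (the file names, the use of 'cksum' as the CRC, and the slot scheme are all illustrative; the real thing would live inside the filesystem code):

    # Two copies of the file table; each carries a sequence number on its
    # first line plus a checksum sidecar. Readers take the newest valid copy.
    pick_table() {
        best=""; best_seq=-1
        for t in table.0 table.1; do
            [ -f "$t" ] && [ -f "$t.crc" ] || continue
            [ "$(cksum < "$t")" = "$(cat "$t.crc")" ] || continue
            seq=$(head -n 1 "$t")
            [ "$seq" -gt "$best_seq" ] && { best=$t; best_seq=$seq; }
        done
        echo "$best"    # empty if both copies are corrupt
    }

    # Overwrite the stale slot in full, then publish it atomically with mv.
    # A power cut mid-write leaves the other slot's valid generation intact.
    write_table() {     # $1 = stale slot name, $2 = new seq no, $3 = contents
        printf '%s\n%s\n' "$2" "$3" > "$1.tmp"
        cksum < "$1.tmp" > "$1.crc.tmp"
        sync
        mv "$1.tmp" "$1"
        mv "$1.crc.tmp" "$1.crc"
        sync
    }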

UPDATE3: By the way, the nature of embedded systems (this one at least) is that they're autonomous, unattended, out of reach, and have to run for years. Programs like fsck that may require human interaction creep me out.

Best Answer

There's a bit of an inconsistency, or at least an ambiguity, in your story here:

I'd still rather lose it all than face an 'unable to mount', a 'wait 10 minutes for fsck'

This implies (although you don't actually say it) that this is a problem you are actually experiencing. But then:

e2fsprogs-libs (a dependency of jfsutils) seems to be hellishly difficult to compile in my distribution.

That means you don't have any fsck at all, since e2fsprogs-libs is a dependency of e2fsprogs, which provides e2fsck. So perhaps you are still in the planning stage and have not even tested the system with, e.g., ext4, but instead jumped to the conclusion that you should start with JFS? Is there any particular reason for that?

I've noticed on the Raspberry Pi exchange (the Pi's primary storage is also an SD card) that a significant number of users seem to be very frustrated by problems of this sort, even though the majority (including myself) have never had them at all. At first I assumed these were people ignorant of the fact that the system should be cleanly shut down, but that is not a hard point to grasp when explained, and there are people who report it even though the system HAS been shut down properly.

You've already said you need this to tolerate power cuts (which is fair enough), but I mention it because it implies there are some Pis, or some SD cards, or some combination of both, that are simply prone to corrupting the filesystem due to some event (a surge?) that occurs regularly either when the plug is pulled or when it is put back in. I also have NOT seen -- and there's been plenty of time for plenty of people to try -- ANY reports of someone saying they've switched to btrfs or jfs or whatever and now the problem is solved.

The other mysterious thing is that even if people are yanking the cord, this should not regularly result in an unusable filesystem. Certainly I've done it a bunch of times with the Pi, and scores if not hundreds of times with a regular Linux box (the power was cut, the system had become unresponsive, I was exhausted and angry, etc.), and while I've seen minor data loss, I've never seen a filesystem corrupted to the point of being unusable after a quick fsck.

Again, presuming all these reports are true (I don't see why numbers of people would lie about it), there's something much more going on than just not cleanly unmounting, but it seems to only affect a small percentage of users, implying again some kind of common hardware defect.

On the Pi I write '-y' to /forcefsck in a boot script, so that on the next boot fsck is run automatically and any problems are fixed, whether or not that appears to be necessary. On a 700 MHz single core this takes ~10 seconds for a 12 GB filesystem containing ~4 GB of data. So '10 minutes' sounds like an incredibly long time, especially for the small, write-mostly filesystem you describe.
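Concretely, something like this (note: on many init systems the mere existence of /forcefsck is what forces the check; whether its contents are also passed to fsck as options depends on your boot scripts):

    # Force a filesystem check on the next boot
    echo "-y" > /forcefsck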

You might also consider calling sync at regular intervals.
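Even something as simple as a background loop in your rc would do (the interval is arbitrary):

    # Flush dirty blocks every 30 seconds
    while :; do sync; sleep 30; done &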

Finally, you should update the question with more factual, specific details about the problems you have actually encountered, and less hyperbole. Otherwise it looks too much like a premature XY problem, and it will likely get skipped over by people with a lot of experience and potentially useful advice for you.