Linux – Safely use SD cards when power can go out at any time

Tags: embedded, filesystems, linux, sd card

We're working on a small embedded Linux system (2.6.35-ish) with a smallish internal NAND device (250-500 MB) for the OS and applications, and an 8 GB SDHC card for data.

The unit's power can be cut at any time.

The system must store data to SD cards. This data is pretty important… it's the whole purpose of the system. The systems are usually entirely disconnected from any network in remote locations and data is retrieved via sneakernet every 4-8 weeks.

Currently, we've simply got VFAT on the SD cards. That was mainly so the first test clients could easily copy data off manually onto their Win7 laptops.

However, I'm now worried that it's only a matter of time until a power outage at the wrong time causes data loss.

What's the best way to configure such a system to prevent data loss? JFFS2 sounds like what I would want in terms of how it writes data (and the performance needs are not high at all), but it sounds fairly kludgy to use block2mtd, etc. I'm also not sure how the card's wear leveling will interact with it.

What's the best way to do this?

EDIT

I'm now thinking of leaving the filesystem as VFAT and preallocating one day-sized file at a time, filled with 0xFF. That should greatly limit the exposure to power-cycle failures. I could then append records only within those precreated blocks, and hopefully the SD cards aren't so stupid that they'd erase or wear-level writes to 0xFF regions.
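For illustration, here is a minimal Python sketch of that preallocation idea. The fixed `RECORD_SIZE`, the zero padding, and the function names are all hypothetical choices, not anything the real system uses. It only shows that the file's length and cluster allocation stay fixed after creation. It does not stop VFAT from updating the modified time, and it says nothing about what the card's controller does internally.

```python
import os

RECORD_SIZE = 64        # hypothetical fixed record size in bytes
ERASED = b"\xff"        # fill byte for slots that have not been written yet


def preallocate(path, size):
    """Create a file of `size` bytes filled with 0xFF and push it to the medium."""
    with open(path, "wb") as f:
        f.write(ERASED * size)
        f.flush()
        os.fsync(f.fileno())


def first_free_slot(path):
    """Return the index of the first slot that is still all 0xFF, or None if full."""
    blank = ERASED * RECORD_SIZE
    with open(path, "rb") as f:
        index = 0
        while True:
            chunk = f.read(RECORD_SIZE)
            if len(chunk) < RECORD_SIZE:
                return None
            if chunk == blank:
                return index
            index += 1


def append_record(path, payload):
    """Overwrite the first blank slot in place; the file length never changes."""
    if len(payload) >= RECORD_SIZE:
        # Keeping at least one 0x00 pad byte means a used slot never looks blank.
        raise ValueError("payload must be shorter than RECORD_SIZE")
    slot = first_free_slot(path)
    if slot is None:
        raise IOError("preallocated file is full")
    with open(path, "r+b") as f:
        f.seek(slot * RECORD_SIZE)
        f.write(payload.ljust(RECORD_SIZE, b"\x00"))
        f.flush()
        os.fsync(f.fileno())
    return slot
```

After a power cut, `first_free_slot` finds where to resume; a half-written slot would need a per-record checksum to detect, which this sketch leaves out.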

I can use noatime, but is there a VFAT nomtime equivalent to prevent writes to the modified time field? I'd need some way to prevent any metadata updates at all until a new day's file is created.

EDIT 2

Someone on the electronics stack exchange reminded me there's also ECC data on NAND, so there's no way to prevent the need for an erase.

So, would JFFS2 via block2mtd be appropriate in this situation?

EDIT 3

It's worse than I thought. The SD cards I have will erase the data blocks even if you write the exact same contents to disk. The eraseblocks are 64KB, and that's too large to entirely delay writes for. I'll store up to 128KB of data in NAND flash (which I can control the write behavior of), in a kind of journal, and then write 128KB blocks to a 128KB-aligned file in a VFAT partition on the SD card (in case other SD cards have 128KB eraseblocks).

Best Answer

Well, the way you can fix this is to fix the "power can be cut at any time" problem. Is it impossible to add even a minute of battery power?

Alternatively, maybe you could use two SD cards. Write the data to one card, sync, write to the other. Each block of your data would need a checksum and block number, but then even with some pretty unlucky power failures, one of the cards should be right.
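To make the two-card idea concrete, here is a minimal Python sketch. The record layout (block number, length, payload, CRC32) and the names `write_mirrored` and `recover` are my own invention for illustration, and the two paths stand in for log files on two separately mounted cards. It is a sketch of the technique under those assumptions, not a tested design.

```python
import os
import struct
import zlib

PREFIX = struct.Struct("<QI")   # block number, payload length
CRC = struct.Struct("<I")       # CRC32 over prefix + payload


def encode_block(number, payload):
    body = PREFIX.pack(number, len(payload)) + payload
    return body + CRC.pack(zlib.crc32(body))


def decode_blocks(raw):
    """Return {block number: payload} for every block up to the first bad one."""
    blocks = {}
    pos = 0
    while pos + PREFIX.size <= len(raw):
        number, length = PREFIX.unpack_from(raw, pos)
        end = pos + PREFIX.size + length
        if end + CRC.size > len(raw):
            break                       # torn write at the tail
        (crc,) = CRC.unpack_from(raw, end)
        if zlib.crc32(raw[pos:end]) != crc:
            break                       # damaged block
        blocks[number] = raw[pos + PREFIX.size:end]
        pos = end + CRC.size
    return blocks


def write_mirrored(paths, number, payload):
    """Append the block to each card in turn, syncing before moving on."""
    record = encode_block(number, payload)
    for path in paths:
        with open(path, "ab") as f:
            f.write(record)
            f.flush()
            os.fsync(f.fileno())


def recover(paths):
    """Merge the intact blocks from all cards, ordered by block number."""
    merged = {}
    for path in paths:
        if os.path.exists(path):
            with open(path, "rb") as f:
                merged.update(decode_blocks(f.read()))
    return [merged[n] for n in sorted(merged)]
```

Stopping at the first bad block is deliberately conservative: anything after it on that card is ignored and has to come from the other card. Fixed-size blocks would let a reader skip over damage instead.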

Your basic problem is going to be wear leveling on the SD cards, which AFAIK depends on the card vendor (and maybe even the batch, they can change it whenever). It probably doesn't handle power outage correctly. And depending on what it does, that may not just mean corrupting the block you're writing to.

  1. Assume a trivially small card: 3 (flash) blocks. Block 1 has received more writes than 2 or 3. I'll call the physical blocks by number, and logical blocks A, B, C by letter. Right now, A=1, B=2, C=3.
  2. You issue a write to block A. The SD card is like, aha! We need wear leveling here, else block 1 is going to wear out before 2 and 3. It decides to swap blocks 1 and 2.
  3. It reads block 1 into RAM position i (on the SD card, not system RAM). It updates the part you wanted to change.
  4. It reads block 2 into RAM position ii
  5. It erases block 1
  6. It writes RAM position ii to block 1.
  7. It updates the mapping table to say B=1
  8. It erases block 2.
  9. It writes RAM position i to block 2.
  10. It updates the mapping table to say A=2

Of course, "updates the mapping table" isn't always trivial. And the order of steps 5–10 could be different (if they all complete, it doesn't matter, though each erase has to come before the matching write, of course). But if a power failure happens partway through, you could wind up with not only A corrupted (as you expect) but B as well. Or, if the power failure happens during a mapping update, who knows what kind of corruption that'll cause.
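The failure modes are easy to see with a toy model of the steps above. This Python sketch is purely illustrative: it models the hypothetical three-block card from the list, not any real controller, and `simulate`, `order`, and `completed` are names I made up. `completed` is how many of steps 5–10 finish before power is cut, and the result is what each logical block reads back afterwards.

```python
ERASED = None   # marker for a physical block that has been erased


def simulate(order=(5, 6, 7, 8, 9, 10), completed=6):
    """Run the first `completed` steps of `order`, then report logical contents."""
    physical = {1: "A-old", 2: "B", 3: "C"}
    mapping = {"A": 1, "B": 2, "C": 3}
    ram_i = "A-new"         # step 3: block 1 read into card RAM and updated
    ram_ii = physical[2]    # step 4: block 2 read into card RAM

    def step(n):
        if n == 5:
            physical[1] = ERASED
        elif n == 6:
            physical[1] = ram_ii
        elif n == 7:
            mapping["B"] = 1
        elif n == 8:
            physical[2] = ERASED
        elif n == 9:
            physical[2] = ram_i
        elif n == 10:
            mapping["A"] = 2

    for n in order[:completed]:
        step(n)
    return {logical: physical[block] for logical, block in mapping.items()}


if __name__ == "__main__":
    for cut in range(7):
        print(cut, simulate(completed=cut))
```

In the order listed, a cut right after step 5 leaves A reading as erased, with both the old and the new contents gone from flash. With an order that does both erases first, such as `(5, 8, 6, 9, 7, 10)`, a cut after the second erase leaves A and B both erased, even though you only ever wrote to A.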