If a program detects that a power loss will occur in a few seconds, what can it do to avoid data corruption?

embedded filesystems shutdown

A program that needs to read from and write to the filesystem can also communicate with an external sensor, so it is able to know when a power loss is imminent. The warning I get comes only a few (around 3 to 5) seconds ahead, which is not enough to perform a full shutdown. A few precious seconds could be added, but that would increase hardware costs.

Writing to a file does not guarantee that the OS performs the write immediately, and as far as I know even closing the file may leave the OS free to finish the job later if no one else tries to access it. So how can I guarantee that:

  1. All the writes I perform now will be saved. (After the warning is received, my program will write a few kilobytes to disk, but larger amounts of data may have been written before the warning, and the OS may or may not have finalized them yet.)
  2. No other corruption occurs because of an improper shutdown of the OS.

Note: by design, the loss of power might be a regular occurrence. Also by design, no other "user applications" will run on the system (so we don't have to worry, for example, about a music player, a development environment, a spreadsheet editor and GIMP all running and accessing the filesystem).

Best Answer

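The main thing you need to do is issue a sync system call. There's a sync utility that does just that. On Linux, when the sync system call returns, every filesystem write operation (on any mounted filesystem) that was issued before the sync has completed. POSIX itself only requires that sync schedules the writes, so on other systems the call may return before the data has reached the disk.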
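As a rough illustration of what this looks like from inside a program, here is a minimal Python sketch. The function names `write_durably` and `flush_everything` are made up for this example, and it assumes a POSIX system where a directory can be opened and passed to `fsync`. It flushes one file explicitly with `fsync`, then shows the whole-system equivalent of the sync utility.

```python
import os


def write_durably(path: str, data: bytes) -> None:
    """Write data to path and block until the kernel reports it flushed."""
    with open(path, "wb") as f:
        f.write(data)
        f.flush()             # move Python's userspace buffer into the kernel
        os.fsync(f.fileno())  # ask the kernel to flush this file to the device

    # Flush the containing directory too, so that the directory entry for a
    # newly created file is also on disk (assumption: POSIX semantics).
    dir_fd = os.open(os.path.dirname(os.path.abspath(path)), os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)


def flush_everything() -> None:
    """Same effect as running the sync utility: flush all filesystems."""
    os.sync()
```

Calling `fsync` on the few files written after the warning is usually cheaper than a global `sync`, but only the global call also covers data written earlier through other file descriptors.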

It's up to your application design to ensure that if this happens in the middle of a sequence of write operations, the data is left in a usable state. However, if you have the luxury of a guaranteed warning period before power loss, you can be sloppier in your application design, as long as you guarantee timely response to the power loss notice (which is hard).
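One common way to keep a file usable even if power fails mid-update is to write a temporary file and rename it over the original. This is a general technique rather than anything specific to this answer, and the sketch below makes some assumptions: the helper name `atomic_write` is hypothetical, the temporary file lives in the same directory (and therefore the same filesystem) as the target, and the filesystem provides POSIX rename semantics.

```python
import os
import tempfile


def atomic_write(path: str, data: bytes) -> None:
    """Replace path with data so readers see the old or the new content,
    never a half-written file."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temporary file must be on the same filesystem as the target,
    # otherwise the rename below cannot be atomic.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())   # new content is on disk before the rename
        os.replace(tmp_path, path)   # atomic rename over the old file
    except BaseException:
        # Something failed before the rename: remove the leftover temp file.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise

    # Flush the directory so the rename itself survives a power cut.
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

If power is lost before the rename, the old file is still intact; if it is lost afterwards, the new file is complete.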

With journaling filesystems such as ext4, if you sync and then turn off power, you won't get an fsck on reboot. However, if something causes a write after the sync, I think it's possible, but rare, that fsck could be needed. If you want to be absolutely sure, unmount all read-write filesystems before the power loss, or at least remount them as read-only. Normally, you can't do that if there are files open for writing.

If your system runs Linux, you can use its magic sysrq feature (the kernel must be built with sysrq support; writes to /proc/sysrq-trigger by root work regardless of the kernel.sysrq setting, which only restricts the keyboard combination). This can be invoked programmatically by writing a character to /proc/sysrq-trigger: echo u >/proc/sysrq-trigger force-remounts all filesystems as read-only (this includes the effect of sync). You can also use this interface to reboot (b) or power off (o), if that's useful in your setup.

If the power loss notice might be cancelled, call sync only: it has no ill effect on anything but performance. A forced read-only remount, on the other hand, is not recoverable in general, so do that only when you've committed to rebooting.

For most setups that match your description, this is a reasonable reaction to a power loss notice:

  1. Send a custom signal (e.g. SIGUSR1 or SIGPWR) to the affected processes, instructing them to quickly commit or abort any ongoing transaction, if it can help make the recovery on the next boot easier.
  2. Wait for part of the delay before the expected power loss. Calibrate that to leave enough time for the remaining operations.
  3. Write a log message.
  4. echo u >/proc/sysrq-trigger
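
The four steps above could be sketched roughly as follows in Python. This is only an illustration under stated assumptions: the function name `handle_power_loss_warning`, its parameters, and the choice of SIGUSR1 are invented for the example. The code must run as root on Linux for the final write to succeed, and `grace_seconds` has to be calibrated for the actual hardware.

```python
import logging
import os
import signal
import time

# Linux-only interface; writing to it requires root.
SYSRQ_TRIGGER = "/proc/sysrq-trigger"


def handle_power_loss_warning(pids, grace_seconds, trigger_path=SYSRQ_TRIGGER):
    """React to a power-loss notice. Do not call this unless the system
    is committed to going down: the remount is not reversible in general."""
    # 1. Tell the affected processes to commit or abort their transactions.
    for pid in pids:
        try:
            os.kill(pid, signal.SIGUSR1)
        except ProcessLookupError:
            pass  # the process has already exited

    # 2. Give them part of the remaining time to finish.
    time.sleep(grace_seconds)

    # 3. Log while the filesystems are still writable.
    logging.warning("power loss imminent, remounting filesystems read-only")

    # 4. Same effect as: echo u >/proc/sysrq-trigger
    with open(trigger_path, "w") as trigger:
        trigger.write("u")
```

The `trigger_path` parameter exists only so the sequence can be exercised against an ordinary file during development without actually remounting anything.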