Linux Hardware – Stress Testing SD Cards

hardware · linux · sd card

I got into a little debate with someone yesterday regarding the logic and/or veracity of my answer here, viz., that logging and maintaining filesystem metadata on a decent-sized (GB+) SD card could never be significant enough to wear the card out in a reasonable amount of time (years and years). The gist of the counter-argument seemed to be that I must be wrong, since there are so many stories online of people wearing out SD cards.

Since I do have devices with SD cards in them containing rw root filesystems that are left on 24/7, I had tested the premise before to my own satisfaction. I've tweaked this test a bit, repeated it (using the same card, in fact) and am presenting it here. The two central questions I have are:

  1. Is the method I used to attempt to wreck the card viable, keeping in mind it's intended to reproduce the effects of continuously re-writing small amounts of data?
  2. Is the method I used to verify the card was still okay viable?

I'm putting the question here rather than S.O. or SuperUser because an objection to the first part would probably have to assert that my test didn't really write to the card the way I'm sure it does, and asserting that would require some special knowledge of Linux.

[It could also be that SD cards use some kind of smart buffering or cache, such that repeated writes to the same place would be buffered/cached somewhere less prone to wear. I haven't found any indication of this anywhere, but I am asking about that on S.U.]

The idea behind the test is to write to the same small block on the card millions of times. This is well beyond any claim of how many write cycles such devices can sustain, but presuming wear leveling is effective, if the card is of a decent size, millions of such writes still shouldn't matter much, as "the same block" would not literally be the same physical block. To do this, I needed to make sure every write was truly flushed to the hardware, and to the same apparent place.

For flushing to hardware, I relied on the POSIX library call fdatasync():

#include <stdio.h>
#include <string.h>
#include <fcntl.h>
#include <errno.h>
#include <unistd.h>
#include <stdlib.h>

// Compile with -std=gnu99

#define BLOCK (1 << 16)

int main (void) {
    int in = open ("/dev/urandom", O_RDONLY);
    if (in < 0) {
        fprintf(stderr,"open in: %s\n", strerror(errno));
        exit(1);
    }

    int out = open("/dev/sdb1", O_WRONLY);
    if (out < 0) {
        fprintf(stderr,"open out: %s\n", strerror(errno));
        exit(1);
    }

    fprintf(stderr,"BEGIN\n");

    char buffer[BLOCK];
    unsigned int count = 0;
    int thousands = 0;
    for (unsigned int i = 1; i != 0; i++) {
        ssize_t r = read(in, buffer, BLOCK);
        ssize_t w = write(out, buffer, BLOCK);
        if (r != w) {
            fprintf(stderr, "r %zd w %zd\n", r, w);
            if (errno) {
                fprintf(stderr,"%s\n", strerror(errno));
                break;
            }
        }
        if (fdatasync(out) != 0) {
            fprintf(stderr,"Sync failed: %s\n", strerror(errno));
            break;
        }
        count++;
        if (!(count % 1000)) {
            thousands++;
            fprintf(stderr,"%d000...\n", thousands);
        }
        lseek(out, 0, SEEK_SET);
    }
    fprintf(stderr,"TOTAL %u\n", count);
    close(in);
    close(out);

    return 0;
}                                 

I ran this for ~8 hours, until I had accumulated 2 million+ writes to the beginning of the /dev/sdb1 partition.1 I could just as easily have used /dev/sdb (the raw device and not the partition), but I cannot see what difference this would make.
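
For reference, building and launching the program amounts to something like the following (the source and log file names are assumptions; the target device is hard-coded as /dev/sdb1 in the code above and must not be mounted):

    # Build with GNU99 extensions, as noted in the source
    gcc -std=gnu99 -o sdwrite sdwrite.c

    # Run as root so /dev/sdb1 can be opened directly; progress goes to stderr
    sudo ./sdwrite 2> sdwrite.log &

    # Watch the running count of completed writes
    tail -f sdwrite.log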

I then checked the card by trying to create and mount a filesystem on /dev/sdb1. This worked, indicating the specific block I had been writing to all night was still usable. However, it does not rule out the possibility that some regions of the card had been worn out and remapped by wear levelling while remaining accessible.
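
Concretely, that check amounts to something like this (the filesystem type, mount point, and test file are assumptions, chosen only to illustrate the shape of the check):

    # Make a fresh filesystem on the partition that was hammered overnight
    sudo mkfs.ext2 /dev/sdb1

    # Mount it, write a small file, read it back, then unmount
    sudo mount /dev/sdb1 /mnt
    echo "still alive" | sudo tee /mnt/check.txt
    sudo cat /mnt/check.txt
    sudo umount /mnt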

To test for that, I used badblocks -v -w on the partition. This is a destructive read-write test, but wear levelling or not, it should be a strong indication of the viability of the card, since it must still provide space for each rolling write. In other words, it is the literal equivalent of filling the card completely and then checking that all of it was okay; several times over, in fact, since I let badblocks work through a few patterns.
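
The invocation was along these lines; the second form is a sketch of the -b/-c variant mentioned below, with block size and count values that are assumptions rather than the exact figures used:

    # Destructive read-write test of the whole partition, verbose output;
    # badblocks writes a pattern everywhere, reads it back, then repeats
    # with further patterns
    sudo badblocks -v -w /dev/sdb1

    # Same test with an explicit block size (-b, bytes) and blocks-per-pass (-c):
    # here 64 KiB x 64 = 4 MiB is written to hardware and verified at a time
    sudo badblocks -v -w -b 65536 -c 64 /dev/sdb1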

[Contra Jason C's comments below, there is nothing wrong or false about using badblocks this way. While it would not be useful for actually identifying bad blocks, due to the nature of SD cards, it is fine for doing destructive read-write tests of an arbitrary size using the -b and -c switches, which is where the revised test went (see my own answer). No amount of magic or caching by the card's controller can fool a test whereby several megabytes of data are written to hardware and read back again correctly. Jason's other comments seem based on a misreading (IMO an intentional one), which is why I have not bothered to argue. With that heads-up, I leave it to the reader to decide what makes sense and what does not.]

1 The card was an old 4 GB SanDisk card (it has no "class" number on it) which I've barely used. Once again, keep in mind that this is not 2 million writes to literally the same physical place; due to wear leveling, the "first block" will have been moved constantly by the controller during the test to, as the term states, level out the wear.

Best Answer

I think stress testing an SD card is problematic in general, given two things:

  1. Wear leveling. There are no guarantees that successive writes are actually exercising the same physical locations on the SD. Remember that most SD systems in place actively take a block as we know it and move the physical location that backs it around, based on the perceived "wear" that each location has been subjected to.

  2. Different technologies (MLC vs. SLC). The other issue that I see with this is the difference in technologies. I would expect SLC types of SSD to have a far longer life than the MLC variety. Also, MLC has to work within much tighter tolerances that SLCs just don't have to deal with, or at least SLCs are much more tolerant of failing in this way.

    • MLC - Multi Level Cell
    • SLC - Single Level Cell

The trouble with MLC is that a given cell can store multiple values: the bits are essentially stacked, encoded as intermediate voltage levels rather than simply a physical +5V or 0V, for example. This can lead to a much higher failure-rate potential than their SLC equivalent.

Life expectancy

I found this link that discusses a bit about how long the hardware can last. It's titled: Know Your SSDs - SLC vs. MLC.

SLC

SLC SSDs can be calculated, for the most part, to live anywhere between 49 and 149 years, on average, by the best estimates. The Memoright testing can validate the 128 GB SSD as having a write-endurance lifespan in excess of 200 years with an average write of 100 GB per day.

MLC

This is where the MLC design falls short. No figures have been released as of yet. Nobody has really examined what kind of life expectancy is assured with MLC, except that it will be considerably lower. I have received several different estimates which average out to a 10-to-1 lifespan in favour of the SLC design. A conservative guess is that most lifespan estimates will come in between 7 and 10 years, depending on the advancement of ‘wear leveling algorithms’ within the controllers of each manufacturer.

Comparisons

To draw a comparison by way of write cycles, an SLC would have a lifetime of 100,000 complete write cycles, in comparison to MLC, which has a lifetime of 10,000 write cycles. This could increase significantly depending on the design of the ‘wear leveling’ utilized.
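
To connect those cycle counts back to the test in the question, a rough back-of-the-envelope calculation (all figures assumed: a 4 GB card, 10,000-cycle MLC flash, and the 2 million 64 KiB writes from the test) might look like this:

    # Total write budget once wear leveling spreads writes across the card:
    # roughly capacity (GiB) times cycles per cell
    echo $(( 4 * 10000 ))                    # ~40,000 GiB of raw endurance

    # Data actually written by the test: 2 million blocks of 64 KiB
    echo $(( 2000000 * 64 / 1024 / 1024 ))   # ~122 GiB, well under 1% of the budget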