Broken Corrupted Raspberry Pi SD Card

Published: 09-01-2015 | Author: Remy van Elst


One of my Raspberry Pis would not boot up after a reboot. The SD card was corrupted, sadly beyond repair. This article walks you through the steps I took to try to fix the SD card, including fsck, badblocks and other filesystem utilities. It also has tips to reduce writes on the Raspberry Pi, to save SD cards from some amount of wear and thus possible corruption.

The machine was running as an FM transmitter so that my regular sound system could play podcasts (please note that it is probably illegal to transmit without having your HAM license). It would sync up my feeds and, when turned on, start playing them.

However, after the last reboot it would not start up again. I rigged up the UART to my machine and found out why: there were files that the system could not read, so init would not start.

fsck

The file system consistency check utility is able to find and recover damaged filesystems most of the time. My systems do a filesystem check at every boot, and they get a reboot regularly. The SD card has an ext4 filesystem and is 16 GB in size.

Since most of the time you can fix broken filesystems with an fsck, I hooked up the card to my machine and started the fsck:

# fsck /dev/mmcblk0p2

And it failed right away:

fsck from util-linux 2.25.2
e2fsck 1.42.12 (29-Aug-2014)
fsck.ext4: Filesystem revision too high while trying to open /dev/mmcblk0p2
The filesystem revision is apparently too high for this version of e2fsck.
(Or the filesystem superblock is corrupt)
The superblock could not be read or does not describe a valid ext2/ext3/ext4
filesystem.  If the device is valid and it really contains an ext2/ext3/ext4
filesystem (and not swap or ufs or something else), then the superblock
is corrupt, and you might try running e2fsck with an alternate superblock:
    e2fsck -b 8193 <device>
 or
    e2fsck -b 32768 <device>

The superblock contains information about the file system such as the file system type, size, status, information about other metadata structures, block counts, inode counts, supported features, maintenance information, and more. It is an important piece of information for the filesystem.

So the card is so broken that the first superblock is not readable. An ext4 filesystem has multiple copies of its superblock, so let's find them and use them to try the fsck again. To find the other superblocks we can use the following command:

mke2fs -n /dev/mmcblk0p2

Output

mke2fs 1.42.12 (29-Aug-2014)
/dev/mmcblk0p2 contains a ext4 file system
Proceed anyway? (y,n) y
Creating filesystem with 3794688 4k blocks and 950272 inodes
Filesystem UUID: a56c8a06-9907-41a2-86a0-dd601212880b
Superblock backups stored on blocks:
        32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632, 2654208

We also need the block size. The above command shows it as 4k, which is 4096 bytes. With this information we can retry the fsck with an alternate superblock:

fsck -b 163840 -B 4096 /dev/mmcblk0p2

Where -b is the alternative superblock and -B is the blocksize.
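
Picking backup superblocks out of the mke2fs output by hand gets tedious if you want to try several of them. Below is a minimal sketch that does it with standard text tools. The function name list_backup_superblocks is made up for this example, and it assumes the output looks like the sample above. It only parses text and never touches the device.

```shell
#!/bin/sh
# list_backup_superblocks (hypothetical helper): read `mke2fs -n` output on
# stdin and print one backup superblock number per line.
list_backup_superblocks() {
    # Keep everything from the "Superblock backups" header onwards, drop the
    # header itself, then split the comma-separated list into single numbers.
    sed -n '/Superblock backups stored on blocks:/,$p' \
        | sed '1d' \
        | tr -s ', \t' '\n' \
        | grep -E '^[0-9]+$'
}

# Demo on the two relevant lines of the output quoted in this article.
sample='Superblock backups stored on blocks:
        32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632, 2654208'
printf '%s\n' "$sample" | list_backup_superblocks
```

Each number it prints can be passed to fsck -b in turn, together with -B 4096, until one of the backups turns out to be readable.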

It didn't help much. After a lot of questions I decided to let it automatically fix everything it found:

fsck from util-linux 2.25.2
e2fsck 1.42.12 (29-Aug-2014)
Superblock has an invalid journal (inode 8).
Clear<y>? yes
*** ext3 journal has been deleted - filesystem is now ext2 only ***
Resize inode not valid.  Recreate<y>? yes
Pass 1: Checking inodes, blocks, and sizes
yInode 1 has EXTENTS_FL flag set on filesystem without extents support.
Clear<y>? yes
Root inode has dtime set (probably due to old mke2fs).  Fix<y>? yes
Quota inode is not in use, but contains data.  Clear<y>? yes
Quota inode is not in use, but contains data.  Clear<y>? yes
Inode 5, i_size is 2305843009213693952, should be 0.  Fix<y>? yes
Inode 5, i_blocks is 131072, should be 0.  Fix<y>? yes
Reserved inode 6 (<The undelete directory inode>) has invalid mode.  Clear<y>? yes
Inode 6 has a bad extended attribute block 2064.  Clear<y>? yes
Inode 6, i_size is 666532745924706320, should be 0.  Fix<y>? yes
Journal inode is not in use, but contains data.  Clear<y>? yes
Reserved inode 9 (<Reserved inode 9>) has invalid mode.  Clear<y>? yes
Reserved inode 10 (<Reserved inode 10>) has invalid mode.  Clear<y>?
Recreate journal<y>? cancelled!
/dev/mmcblk0p2: e2fsck canceled.
/dev/mmcblk0p2: ***** FILE SYSTEM WAS MODIFIED *****

Adding the -y option to the command will automatically answer yes to everything:

fsck -y -b 163840 -B 4096 /dev/mmcblk0p2

Scratch Files

This goes well for a while, but fails after some time with the following error:

Error storing directory block information (inode=5542, block=0, num=33754683): Memory allocation failed
/dev/mmcblk0p2: ***** FILE SYSTEM WAS MODIFIED *****
Recreate journal? yes
Creating journal (32768 blocks):  Done.
*** journal has been re-created - filesystem is now ext3 again ***
e2fsck: aborted

Searching around on the web results in multiple topics suggesting to add the following:

[scratch_files]
directory = /var/cache/e2fsck

To the file /etc/e2fsck.conf. The man page describes it like so:

   [scratch_files]
      This  stanza  controls  when  e2fsck will attempt to use scratch
      files to reduce the need for memory.

Ted Ts'o explains what this option does in this mailing list topic:

This will cause e2fsck to store certain data structures which grow large with backup servers that have a vast number of hard-linked files in /var/cache/e2fsck instead of in memory. This will slow down e2fsck by approximately 25%, but for large filesystems where you couldn't otherwise get e2fsck to complete because you're exhausting the 2GB VM per-process limitation for 32-bit systems, it should allow you to run through to completion.

You have to create the /var/cache/e2fsck folder, if it does not exist yet, before running the fsck:

mkdir -p /var/cache/e2fsck

The scratch_files stanza has more options, two of which might be important: dirinfo and icount.

Set dirinfo to false if your filesystem has a large number of individual files and not that many directories. Set icount to false if the reverse is true. Otherwise, leave both set to true.
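
To make the options concrete, here is a small sketch that writes a complete scratch_files stanza for the "many files, few directories" case. It writes to a throwaway path by default so nothing under /etc changes by accident. The CONF and SCRATCH_DIR variable names and the example file name are my own choices, not something e2fsck requires.

```shell
#!/bin/sh
# Write an example [scratch_files] stanza. CONF defaults to a scratch path
# (an assumption for this demo); set CONF=/etc/e2fsck.conf and run as root
# once the contents look right.
CONF="${CONF:-${TMPDIR:-/tmp}/e2fsck.conf.example}"
SCRATCH_DIR="${SCRATCH_DIR:-/var/cache/e2fsck}"

# dirinfo = false keeps directory information in memory, icount = true keeps
# the inode counts in scratch files: the "many files, few directories" case.
cat > "$CONF" <<EOF
[scratch_files]
directory = $SCRATCH_DIR
dirinfo = false
icount = true
EOF

# Show the result for review.
cat "$CONF"
```

Swap the two boolean values for a filesystem with many directories and few files.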

There was also a good explanation on Stack Exchange of where the out of memory error comes from; it might not actually be a lack of memory.

The scratch_files setting did not improve the check. It still gave the same out of memory error.

Even after playing with the dirinfo settings, or adding a 200 GB swapfile, the fsck would still fail. dmesg was also full of these kinds of log lines:

[12343.5678] end_request: I/O error, dev mmcblk0, sector 50944
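
A quick way to see how bad it is, is to count those lines per device. The sketch below does that. The helper name count_io_errors is hypothetical, and it deliberately matches only the "I/O error, dev NAME," part of the line, on the assumption that the prefix in front of it may differ between kernel versions.

```shell
#!/bin/sh
# count_io_errors DEVICE (hypothetical helper): read kernel log lines on
# stdin and print how many of them report an I/O error for DEVICE.
count_io_errors() {
    # -F treats the pattern as a fixed string, -c prints the match count.
    grep -cF "I/O error, dev $1,"
}

# Demo: the log line from this article plus one made-up unrelated line.
sample='[12343.5678] end_request: I/O error, dev mmcblk0, sector 50944
an unrelated kernel message, made up for this demo'
printf '%s\n' "$sample" | count_io_errors mmcblk0
```

On the real machine you would feed it the kernel log instead, for example with dmesg | count_io_errors mmcblk0.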

Spinrite

My last guess was to try Spinrite. Spinrite is a hard drive recovery and maintenance utility written by Steve Gibson from Gibson Research Corporation. I configured a KVM VM with the card reader device as a second block device and tried to run Spinrite on Level 2. However, even Spinrite failed.

Booting Spinrite on a spare laptop with a card reader and this SD card gave the same error. It sees the device, but fails to run on it.

Spinrite has helped me recover drives multiple times, but this one might be too much for it.

Badblocks

Badblocks is a *NIX utility to search for bad blocks on a device. It is not useful for creating a list of bad blocks on the SD card, because SD cards do not report actual physical addresses (because of wear levelling), but it does tell us whether the card is broken or not.

The following badblocks command will scan and report bad blocks for the device. It is a destructive write operation: you will lose your data.

badblocks -o ./badblocks.list -w -s -v -b 4096 -c 16 /dev/mmcblk0

-o writes the bad blocks list to the file ./badblocks.list, -w selects the write operation, -s shows progress, -v is verbose, -b 4096 sets the block size to 4K and -c 16 tests 16 blocks at once (default is 64).

It showed me that a lot of writes were failing:

Checking for bad blocks in read-write mode
From block 0 to 3799039
Testing with pattern 0xaa: 0.01% done, 1:57 elapsed. (0/292/0 errors)
^C
Interrupted at block 294

The error output format means the following:

number of read errors/number of write errors/number of corruption errors

So most of the writes (292 of 294) failed, meaning the SD card is broken, beyond repair I guess.
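
The same reading of the counters can be scripted. This is only a sketch: the helper name badblocks_errors is made up, it assumes a progress line in exactly the format shown above, and the 294 comes from the "Interrupted at block 294" line of my run.

```shell
#!/bin/sh
# badblocks_errors (hypothetical helper): read a badblocks progress line on
# stdin and print the three counters as "read write corruption".
badblocks_errors() {
    sed -n 's|.*(\([0-9][0-9]*\)/\([0-9][0-9]*\)/\([0-9][0-9]*\) errors).*|\1 \2 \3|p'
}

line='Testing with pattern 0xaa: 0.01% done, 1:57 elapsed. (0/292/0 errors)'

# Split the three numbers into $1, $2 and $3.
set -- $(printf '%s\n' "$line" | badblocks_errors)
echo "read=$1 write=$2 corruption=$3"

# Share of failed writes among the 294 blocks tested (integer division).
echo "write failures: $(( $2 * 100 / 294 ))%"
```

For this card it prints read=0 write=292 corruption=0 and a write failure share of 99%.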

Less writing

This SD card is a lost cause. To prevent your Raspberry Pis from writing a lot of data, and thus wearing out the SD card, you can do a couple of things.

tmpfs

The first one is to mount a few folders in RAM as tmpfs. These are the folders where temp files and logs are written. This means that your syslog files will not survive a reboot, but most of the time that is not a problem.

Edit /etc/fstab and add the following:

none        /var/run        tmpfs   size=1M,noatime         0 0
none        /var/log        tmpfs   size=1M,noatime         0 0
none        /var/tmp        tmpfs   size=1M,noatime         0 0
none        /tmp            tmpfs   size=1M,noatime         0 0

This will mount the above folders in RAM, with a max size of 1 megabyte. The noatime option means that the access time of a file is not updated, saving a lot of writes as well. You should also add the noatime option to your other partitions, for example on a standard Raspbian:

proc            /proc           proc    defaults          0       0
/dev/mmcblk0p1  /boot           vfat    ro,noatime        0       2
/dev/mmcblk0p2  /               ext4    defaults,noatime  0       1

Here the /boot partition is also mounted read only (ro), and the noatime option is added to both partitions.

Issue a mount -a command or reboot the machine to make this active.
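
To check that the folders really ended up in RAM, you can look up their filesystem type in /proc/mounts. The sketch below assumes the usual "device mountpoint type options" column order of that file. The helper name fstype_of and the two sample lines are made up for the demo.

```shell
#!/bin/sh
# fstype_of MOUNTPOINT (hypothetical helper): read /proc/mounts-style lines
# on stdin and print the filesystem type of the last entry for MOUNTPOINT.
fstype_of() {
    awk -v mp="$1" '$2 == mp { t = $3 } END { if (t != "") print t }'
}

# Illustrative sample in /proc/mounts layout, not copied from a real system.
sample='/dev/mmcblk0p2 / ext4 rw,noatime 0 0
none /tmp tmpfs rw,noatime,size=1024k 0 0'

printf '%s\n' "$sample" | fstype_of /tmp
printf '%s\n' "$sample" | fstype_of /
```

On the Pi itself, fstype_of /var/log < /proc/mounts should print tmpfs once the new fstab is active.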

Disable swap

Linux divides its physical RAM (random access memory) into chunks of memory called pages. Swapping is the process whereby a page of memory is copied to the preconfigured space on the hard disk, called swap space, to free up that page of memory. The combined size of the physical memory and the swap space is the amount of virtual memory available.

Swapping causes a lot of writes to the SD card, so you will want to turn it off to save writes. The downside is that when there is not enough RAM available, the Linux OOM killer will kill processes to free up RAM.

Raspbian by default has a swap file, dynamically managed by the dphys-swapfile utility. You can turn off this utility by issuing the following commands:

dphys-swapfile swapoff
dphys-swapfile uninstall
update-rc.d dphys-swapfile remove

After a reboot the swap will be gone, which you can check with the free -m command:

             total       used       free     shared    buffers     cached
Mem:           484        243        241          0         42        162
-/+ buffers/cache:         38        446
Swap:            0          0          0
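
If you want the same check in a script, for example on a few Pis at once, the SwapTotal line in /proc/meminfo is easier to parse than the free output. A minimal sketch follows. The helper name swap_total_kb is hypothetical and the sample values are illustrative.

```shell
#!/bin/sh
# swap_total_kb (hypothetical helper): read /proc/meminfo-style lines on
# stdin and print the SwapTotal value in kB.
swap_total_kb() {
    awk '$1 == "SwapTotal:" { print $2 }'
}

# Illustrative sample in /proc/meminfo layout.
sample='MemTotal:         495616 kB
SwapTotal:             0 kB'

if [ "$(printf '%s\n' "$sample" | swap_total_kb)" = "0" ]; then
    echo "swap is off"
else
    echo "swap is still configured"
fi
```

On a real system, replace the sample with the file itself: swap_total_kb < /proc/meminfo.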

fsck at every boot

My Raspberry Pis have a cronjob which reboots them once every seven days. This is to apply kernel updates, and it is generally good practice to see if everything still works after a reboot. By default, fsck checks a filesystem every 30 boots (counted individually for each partition). I decided to change this to every boot, so problems will be found and possibly fixed earlier.

To set up an fsck at every boot, execute the following command:

tune2fs -c 1 /dev/sda1

Where /dev/sda1 is the device or partition.
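
To confirm the change, tune2fs -l lists the superblock fields, including a "Maximum mount count" line. The sketch below pulls that value out. The helper name max_mount_count is made up, and the two sample lines are illustrative rather than copied from my card.

```shell
#!/bin/sh
# max_mount_count (hypothetical helper): read `tune2fs -l` output on stdin
# and print the value of the "Maximum mount count" field.
max_mount_count() {
    sed -n 's/^Maximum mount count:[[:space:]]*//p'
}

# Illustrative sample of the two relevant lines.
sample='Mount count:              3
Maximum mount count:      1'

printf '%s\n' "$sample" | max_mount_count
```

With a real device the call would look like tune2fs -l /dev/sda1 | max_mount_count, and after the command above it should print 1.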

Remove the GUI

I don't use the Raspbian GUI so I decided to remove everything GUI related. The best way to do this is to remove libx11 and all its dependencies, that being all the GUI applications:

apt-get purge --auto-remove 'libx11-.*'

This lowers the amount of disk space used, the number of packages updated and the number of services running on the machine, saving both RAM and disk space.
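
Because the pattern matches a lot of packages, it can be worth previewing the removal first. apt-get has a -s (simulate) option that prints the planned actions without performing them. The sketch below counts the removal lines in such output. The helper name count_removals is hypothetical, the sample lines use placeholder versions, and I am assuming removals show up as lines starting with Remv or Purg.

```shell
#!/bin/sh
# count_removals (hypothetical helper): read `apt-get -s` output on stdin
# and print how many packages would be removed or purged.
count_removals() {
    grep -cE '^(Remv|Purg) '
}

# Illustrative sample with placeholder versions, not real apt output.
sample='Purg libx11-6 [placeholder-version]
Purg libx11-data [placeholder-version]
Conf unrelated-package (placeholder-version)'

printf '%s\n' "$sample" | count_removals
```

The real preview would be apt-get -s purge --auto-remove 'libx11-.*' | count_removals, run before the actual purge.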

Conclusion

This SD card was so corrupt I threw it away. The Raspberry Pi was reinstalled and the pifm software was set up again. The music came from another machine, so that did not have to be copied again.

With the tips above you might be able to save your SD card if it ever gets corrupted. These tips also work on regular disks and SSDs.

Tags: badblocks, blog, ext4, filesystem, fsck, mkfs, raspberry-pi, sd, swap