Linux software raid, rebuilding broken raid 1
Published: 14-04-2014 | Author: Remy van Elst
Last week Nagios alerted me about a broken disk in one of my client's testing servers. There is a best-effort SLA on the thing, and there were spare drives of the same type and size in the datacenter. Lucky me. This particular datacenter is within biking distance, so I enjoyed a sunny ride there.
Simply put, I needed to replace the disk and rebuild the RAID 1 array. This server is a simple Ubuntu 12.04 LTS server with two disks running in RAID 1, no spare. The client has a tight budget, and with a best-effort SLA on a server that is not in production, that is fine with me. Consultant tip: make sure you have those things signed.
The _ in the output of cat /proc/mdstat tells me the second disk in the array is down:
Personalities : [raid1] [raid6] [raid5] [raid4]
md0 : active raid1 sda1 sdb1
      129596288 blocks [2/2] [U_]
U means up, _ means down [source]
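The U/_ check above can also be scripted, for example for a monitoring hook. The following is a minimal sketch, not what Nagios runs: the function name degraded_arrays and its optional file argument are made up for illustration, and it assumes the usual mdstat layout where the status field follows the line that names the array.

```shell
#!/bin/sh
# Print the name of every md array whose status field (for example [U_])
# contains an underscore, meaning at least one member is down.
# degraded_arrays and its optional file argument are illustrative names.
degraded_arrays() {
    mdstat_file="${1:-/proc/mdstat}"
    awk '
        # Remember the array named on lines such as "md0 : active raid1 ..."
        /^md[0-9]+ :/ { array = $1 }
        # A status field made only of U and _ with at least one _ is degraded
        array != "" && /\[[U_]*_[U_]*\]/ { print array; array = "" }
    ' "$mdstat_file"
}
```

Calling degraded_arrays without arguments reads /proc/mdstat; an empty result means every member of every array is up.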
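First we remove the disk from the RAID array. If the kernel has not already marked the disk as faulty, mark it failed first with mdadm --manage /dev/md0 --fail /dev/sdb1, because an active member cannot be removed: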
mdadm --manage /dev/md0 --remove /dev/sdb1
Make sure the server can boot from a degraded RAID array:
grep BOOT_DEGRADED /etc/initramfs-tools/conf.d/mdadm
If it says true, continue on. If not, add or change it and rebuild the initramfs using the following command:
update-initramfs -u
(Thank you Karssen)
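If you do this on more than one machine, the grep check can be wrapped in a small helper. This is only a sketch: the function name boot_degraded_enabled is hypothetical, and it assumes the setting is written as a plain BOOT_DEGRADED=true line (optionally quoted) in the file shown above.

```shell
#!/bin/sh
# Succeed only when the config file contains an uncommented
# BOOT_DEGRADED=true line. The function name is illustrative; the default
# path is the one used in this article.
boot_degraded_enabled() {
    conf_file="${1:-/etc/initramfs-tools/conf.d/mdadm}"
    # A missing or unreadable file counts as "not enabled"
    [ -r "$conf_file" ] || return 1
    grep -Eq '^[[:space:]]*BOOT_DEGRADED="?true"?[[:space:]]*$' "$conf_file"
}
```

A typical use would be: boot_degraded_enabled || echo "BOOT_DEGRADED is not true, fix it before shutting down".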
We can now safely shut down the server:
shutdown -h 10
Replacing the disk was an issue in itself. It is a Supermicro 512L-260B chassis where the disks are not in a drive bay; rather, they are screwed in from the bottom. Therefore the whole server needs to be removed from the rack (no rails...) when replacing a disk.
Normally I would replace a disk while the server is on, but this server has no hot-swap disks, so that would be kind of an issue in a full rack.
After that, boot the server from the first disk (via the BIOS/UEFI). Make sure you boot into recovery mode. Select the root shell and mount the disk read/write:
mount -o remount,rw /
Now copy the partition table to the new (in my case, empty) disk:
sfdisk -d /dev/sda | sfdisk /dev/sdb
This will erase data on the new disk.
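Because this step is destructive, it can help to keep a copy of the dump and to guard against mixing up the two disks. The sketch below is one possible wrapper, not part of the original procedure: copy_partition_table, the DRY_RUN variable and the default backup path are all assumptions made for illustration. It relies only on sfdisk -d to dump a table and on sfdisk reading a dump from standard input.

```shell
#!/bin/sh
# Copy the partition table from a healthy disk to its replacement and keep
# the dump as a backup file. With DRY_RUN=1 the commands are only printed.
# copy_partition_table, DRY_RUN and the backup path are illustrative.
copy_partition_table() {
    src="$1"
    dst="$2"
    backup="${3:-/root/partition-table.dump}"
    # Refuse missing arguments and a source that equals the target
    if [ -z "$src" ] || [ -z "$dst" ] || [ "$src" = "$dst" ]; then
        echo "usage: copy_partition_table SOURCE_DISK TARGET_DISK [BACKUP_FILE]" >&2
        return 1
    fi
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "sfdisk -d $src > $backup"
        echo "sfdisk $dst < $backup"
        return 0
    fi
    # Dump the source table to the backup file, then write it to the target
    sfdisk -d "$src" > "$backup" && sfdisk "$dst" < "$backup"
}
```

Running DRY_RUN=1 copy_partition_table /dev/sda /dev/sdb first shows exactly which disk will be overwritten before anything is written.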
Add the disk to the RAID array and wait for the rebuilding to be complete:
mdadm --manage /dev/md0 --add /dev/sdb1
This is a nice progress command:
watch cat /proc/mdstat
It will take a while on large disks:
Personalities : [raid1] [raid6] [raid5] [raid4]
md0 : active raid1 sda1 sdb1
      129596288 blocks [2/2] [U_]
      [=>...................]  recovery = 2.6% (343392/129596288) finish=67min speed=98840K/sec

unused devices: <none>
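If you would rather log the progress than stare at watch, the percentage can be pulled out of that output. This is a rough sketch under the assumption that the progress line looks like the one above, with "recovery", "=" and the percentage as the second to fourth fields; recovery_percent is a made-up helper name.

```shell
#!/bin/sh
# Print the recovery percentage (without the % sign) for one array, or
# nothing when that array has no recovery line. recovery_percent and its
# optional file argument are illustrative names.
recovery_percent() {
    array="$1"
    mdstat_file="${2:-/proc/mdstat}"
    awk -v array="$array" '
        # Track which array the following indented lines belong to
        /^md[0-9]+ :/ { current = $1 }
        # Progress line: "[=>...]  recovery = 2.6% (done/total) finish=..."
        current == array && $2 == "recovery" && $3 == "=" {
            sub(/%$/, "", $4)
            print $4
            exit
        }
    ' "$mdstat_file"
}
```

To simply block until the rebuild is finished, mdadm --wait /dev/md0 should also do the job.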