This isn’t really about the Commodore Pet I recently built and have been fetishizing for a few months. This is about my ongoing love-hate relationship with mdadm. You see, after years of not using it, I’m using it again. Partly because it’s useful, partly because I was feeling abstruse and decided to use it when I could have just bought a 10TB drive and called it a day. Sometimes we like to create problems so we can solve them.
I recycled a few old disks from work, and it turned out those disks were already part of an ancient mdadm array. I unwittingly created a new array over these disks, and absentmindedly built it on the partitions /dev/sda1, /dev/sdab1 and /dev/sdac1 instead of the full disks (yep, because I was typing stuff from a tutorial, because I can’t ever remember this stuff). And boy, did it cause a fun problem!
After many MANY hours of copying old backup data to the new array, which is attached to the Commodore Pet Mini (aka RPi), I rebooted the Pi, as you do when you’re running RetroPie on it and get stuck in some weird emulator. It hung after I SSHed in and issued a reboot command. When I rebooted, things got weird:
First, mdadm automatically tried to assemble the OLD raid. It probably did this because there were RAID superblocks on the raw devices, and maybe it favors the raw device over a partition. Then the machine refused to boot, because the md0 array was listed in fstab and never got assembled.
I did lots of digging and found some others with this problem. It turns out lots of people unwittingly create raids on top of raids this way. Today I Learned: always check for, and delete, old superblocks before creating a new raid. I tried all sorts of things to bring this array back, but mdadm refused to forget about the ancient array referenced in the raw superblocks. Never mind that the data was all wiped. Never mind that this machine had no knowledge of that array. The superblocks weren’t having it.
In fact, everything was harder because the disks were in use… they were busy because mdadm would read the old superblocks on boot (or disk connection) and create a weird temporary array device called md127.
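For the record, the check I skipped would have looked roughly like the sketch below. The device names are placeholders (swap in your own), and the little `run` wrapper and `DRY_RUN` switch are my own invention for safety, not anything mdadm provides: by default the script only prints the commands it would run.

```shell
#!/bin/sh
# Sketch: look for leftover md superblocks BEFORE creating a new array.
# DRY_RUN=1 (the default) only prints the commands; set DRY_RUN=0 to run them.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

check_old_superblocks() {
    for disk in "$@"; do
        # --examine prints any md superblock found on the device.
        # "No md superblock detected" is the answer you want to see here.
        run sudo mdadm --examine "$disk"
    done
}

# Placeholder device names: check the raw disks, not just the partitions.
check_old_superblocks /dev/sda /dev/sdb /dev/sdc
```

If `--examine` does report a superblock on a disk you thought was blank, that is the moment to deal with it, before any data goes on the new array.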
I did lots of things like:
sudo mdadm --assemble /dev/md0 --force
And lots of other stuff. Got lots of weird errors. Tried to recreate the array (heard that works sometimes).
Turns out, I didn’t have to do ANYTHING to repair the new array. It was there all along, with all of its many terabytes of precious old pictures and files.
Here’s what finally worked:
- Run “cat /proc/mdstat” and figure out the (ancient, wrong, useless) array name, in my case it was md127, and stop it with something like “sudo mdadm --stop /dev/md127”
- Remove the offending array device with something like “sudo mdadm --remove /dev/md127”
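- Remove the old superblocks that were throwing you for a loop (don’t do this unless you’re really sure this is what happened) with something like “sudo mdadm --zero-superblock /dev/sda”. Point it at the raw disk that carries the stale superblock, not at the partition holding your new array.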
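Strung together, the steps above look something like this sketch. The array name and disk names are just what I saw on my machine, so treat them as placeholders. The `run` wrapper and `DRY_RUN` switch are my own additions, not part of mdadm: left at the default, the script only prints what it would do, which is a good idea given that the last step is destructive.

```shell
#!/bin/sh
# Sketch of the cleanup steps above.
# DRY_RUN=1 (the default) only prints the commands; set DRY_RUN=0 to run them.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# Usage: forget_old_array <stale md device> <raw disk>...
forget_old_array() {
    old_array=$1
    shift
    # Step 1: see which array got auto-assembled, then stop it.
    run cat /proc/mdstat
    run sudo mdadm --stop "$old_array"
    # Step 2: remove the offending array device, as in the list above.
    run sudo mdadm --remove "$old_array"
    # Step 3: wipe the stale superblock from each RAW disk (not the partitions).
    for disk in "$@"; do
        run sudo mdadm --zero-superblock "$disk"
    done
}

# Placeholder names: md127 was the stale array in my case.
forget_old_array /dev/md127 /dev/sda
```

Read the dry-run output carefully before flipping `DRY_RUN=0`. Zeroing a superblock on the wrong device is exactly the sort of thing that makes mdadm whisper that your data is gone.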
I’ve been here before with mdadm, with much higher stakes. She’s a harsh mistress who can whisper that all of your data is GONE… but many times, I’ve been able to type the right things to magically get it back.
Those are happy times.