Recovering from a failed RAID 5

So, you’ve ignored SAM and gone and run on a RAID 5. Then, through some elder gods’ handiwork, two of your disks drop off the RAID at the same time rendering your RAID 5 inaccessible. Your skin pales as you realise your last backup was *coughcough* months ago and close to useless. What to do? WHAT TO DO!?!

First things first – stop. Stop everything. Well, keep breathing, but stop everything else. Slow your mind and calm down.

Two disks dropping out of a RAID 5 simultaneously is odd. It’s unlikely to be a hardware issue on both drives, so it could be a software issue.. possibly an incorrectly referenced or corrupt RAID config. If so, things may not be as bad as they seem and indeed, you may be able to recover everything.

This How-To addresses recovering data from failed RAID 5 arrays on Desktop PCs, a situation we all dread (and should avoid by running OBR 10, i.e. One Big RAID 10). It is generic, and where assumptions are made they are stated.

Failing to have a recent backup is going to be expensive. Hopefully you already have the necessary hardware available to recover but if not, you’ll need to make a few purchases.

Depending on how the following steps pan out, you may not be successful in data recovery and you should accept that is a possibility. A lot of variables have to align to allow you to recover, but before admitting defeat stick your chin up, take a deep breath, and … switch your PC off.
1.
Shut it off!

First things first, shut your PC down.
Reason: your existing PC is a plague to your RAIDed disks at the moment. The controller can likely still access your disks and thinks they’re in the old RAID config. While it’s like this, there is the potential for extra writes to occur on those disks, and this risk needs to be eliminated.
Keep it shut off. Don’t switch it back on until you can pass the next step. Where possible, the entire RAID recovery needs to be performed in one long period without interruption.

2.
Here.. just take my card.

To enable you to recover the data, you’ll need to have the necessary hardware.

1. A NAS or spare PC with a sufficiently large network storage share.
Reason: you need an equal amount of space available somewhere else. If you had 3 x 1TB disks in RAID 5, you will need at least 3TB of storage elsewhere. Likewise, if you had 5 x 3TB drives, you’ll need 15TB of storage available on the network. This How To assumes you have a NAS with enough spare storage available that is accessible on the network and ready for writing. Ka-ching!

2. An alternative disk connection. This can be a spare PC with spare disk ports, or an external HDD mount. This How To assumes you use a USB HDD dock.
Reason: Since the RAID controller is the interface between your OS and the disks, your OS can only see the corrupted RAID, and may not even see the individual disks depending on the RAID corruption. Your disks will need to be removed from the existing controller and attached via an alternate means to a bootable Windows OS. If you’re plugging them into a different PC, ensure the disk controller isn’t the same model as the one controlling your failed RAID (which basically means ensure it’s a significantly different motherboard model). An external USB3 HDD dock is a safe bet, and not too expensive.

3. UPS me up, Scotty!
Reason: You’ll need enough UPSes for all components during recovery. Between them they need to be able to power a PC, a NAS, a switch/router and a USB HDD dock through a decent power failure. You know it’ll happen during this process (damn you elder gods!) so best protect against it.

4. R-Studio Network Edition. Prepare to buy it, but not yet. Get the trial for now.
site: http://www.r-tt.com/
download: http://www.data-recovery-software.net/Data_Recovery_Download.shtml
store: https://secure.r-tt.com/cgi-bin/Store?id=1
Reason: It’s the most stable, assured, and fastest RAID reconstruction/recovery tool I’ve encountered (from recent experience, Jan 2013). There are other tools such as Zero Assumption Recovery that will probably suffice, but my personal recommendation is R-Studio. Since this How-To assumes you’re recovering to and from a NAS or network share, the Network Edition is required. If you have uber space locally, the standard version is fine.
If you make it through to seeing your files later in this process, you’ll need to then purchase R-Studio, but in the meantime just download the trial. Install it.

5. ReclaiMe Free RAID Recovery
site: http://www.freeraidrecovery.com/
download: http://www.freeraidrecovery.com/download.aspx
Reason: This automates finding the correct parameters for reconstructing a virtual RAID, and gives clear and concise R-studio instructions when finished. And it’s free! If you’re handy with a hex editor, you can ignore this program and follow the R-Studio tutorials on finding RAID parameters (http://www.r-tt.com/Articles/Finding_RAID_parameters/). This How To assumes you’re using this program, so install it.
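
The storage sum from item 1 above can be checked before you buy anything. This is a minimal Python sketch, assuming one uncompressed byte-to-byte image per disk; the function names and the share path are made up for illustration. It uses the standard library's shutil.disk_usage to read the free space on the share.

```python
import shutil

TB = 10**12  # drive vendors count a terabyte as 10^12 bytes


def space_needed_bytes(disk_sizes_bytes):
    """One byte-to-byte image per disk, so the total is the sum of raw sizes."""
    return sum(disk_sizes_bytes)


def has_room_for_images(share_path, disk_sizes_bytes):
    """True if the share at share_path has enough free space for all images."""
    free = shutil.disk_usage(share_path).free
    return free >= space_needed_bytes(disk_sizes_bytes)


if __name__ == "__main__":
    # The two examples from the text: 3 x 1TB and 5 x 3TB.
    print(space_needed_bytes([1 * TB] * 3) // TB, "TB needed for 3 x 1TB")
    print(space_needed_bytes([3 * TB] * 5) // TB, "TB needed for 5 x 3TB")
    # Hypothetical usage once the share is mapped:
    # has_room_for_images("S:\\", [2 * TB] * 4)
```

Remember this only covers the disk images. Anything you recover to the same share later needs room on top of that.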

3.
Disassemble number 5!

You’re about to pull your PC apart and take those failed drives out. At this point, I recommend documenting everything and labeling cables and drive placements. It’s not entirely necessary but it’s good to know what connected to what and where if you ever want to escalate this to a forensic Data Recovery Center. The more info the better.. and the cheaper.
Once documented, anti-static yourself and carefully remove your failed RAID drives.

4.
Check your integrity at the door

One by one you’ll need to verify the physical integrity of your drives by running your vendor’s checking utilities.. e.g:
IBM/Hitachi — Drive Fitness Test
Seagate — SeaTools
Western Digital — WD Diagnostics/Lifeguard
So, plug in each drive to the dock and run the tests. DO NOT ATTEMPT ANY REPAIRS, run this in read-only mode ONLY. A random write to the disk could wipe out your chances of recovery. We just want to make sure your drives are physically OK.
If two or more disks fail the tests because of a mechanical issue or your drives are dead, unfortunately you’re hosed. Nothing to do except take it to a Data Recovery Centre. You’re probably looking at around $1,000 per recovered disk to get the experts to do it. Sorry, but this is by and large your only shot from here.
If one disk fails and the rest are fine you still have a shot, since a RAID 5 can be reconstructed with one member missing. Continue.
If none fail and all your disks are fine, great! Continue!
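
Why is one dead disk survivable but two not? RAID 5 keeps one parity block per stripe, and that parity is the XOR of the data blocks in the stripe. Here is a small Python sketch of the idea, with toy 4-byte blocks standing in for real 64 KB ones (the function name is mine, not from any tool):

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR a list of equal-length byte blocks together, byte by byte."""
    length = len(blocks[0])
    if any(len(block) != length for block in blocks):
        raise ValueError("all blocks must be the same length")
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


if __name__ == "__main__":
    disk_a = b"RAID"
    disk_b = b"five"
    parity = xor_blocks([disk_a, disk_b])  # what the third disk holds

    # Lose disk_b: XOR of everything that survived gives it back.
    rebuilt = xor_blocks([disk_a, parity])
    print(rebuilt == disk_b)
```

With two blocks missing from the same stripe there is only one equation for two unknowns, which is why two physically dead disks means a trip to the recovery centre.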

5.
Snap that image

One by one, assuming your disks are physically fine, you need to use R-studio to take disk images of your failed RAID drives. If any disks failed the integrity check in Step 4, don’t image those as it can corrupt the overall integrity of the virtual RAID later on – only image your physically OK disks.
To do this:
– ensure you have a Mapped Drive to your NAS or network storage share. This How To will assume you labeled it S: (R-Studio didn’t like using UNC paths to take images)
– attach a HDD to the USB dock and ensure it’s powered up and visible in Windows (even if not readable).
– open R-Studio
– in the drive list on the left, right-click your failed drive and select “Create Image”
– in the popup window, Main Tab, select Byte-to-Byte image, and for the Image Filename, rename it to be unique and on the mapped network share (e.g. “S:\disk-01.dsk”)
– click OK and wait until finished (could be hours to days)
– repeat for all disks, uniquely naming each image file on the file share.
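
If you’re wondering what a byte-to-byte image actually is, the Python sketch below shows the idea: read the source in chunks, never write to it, and copy every byte to the image file. This is an illustration only and not how R-Studio works internally. It has no bad-sector handling, and the function name and paths are hypothetical. Use R-Studio for the real imaging.

```python
import hashlib


def copy_byte_for_byte(source_path, image_path, chunk_size=1024 * 1024):
    """Copy source to image in chunks; return (bytes copied, SHA-256 hex digest).

    The source is opened read-only ("rb"), so nothing is ever written to it.
    """
    digest = hashlib.sha256()
    copied = 0
    with open(source_path, "rb") as source, open(image_path, "wb") as image:
        while True:
            chunk = source.read(chunk_size)
            if not chunk:  # end of source reached
                break
            image.write(chunk)
            digest.update(chunk)
            copied += len(chunk)
    return copied, digest.hexdigest()
```

The digest is a bonus: note it down per image, and you can later confirm the file on the NAS hasn’t changed by hashing it again.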
6.
ReclaiM your RAID parameters

So, you’ve physically checked your disks and they’re OK. You’ve successfully created images of those disks and they’re ready on the network. Now the clincher – you need to find if the RAID is able to be deciphered.
Close R-Studio
Open ReclaiMe Free RAID Recovery.
– Click on the drop-down arrow on the Disks icon up the top and select “Open disk image”
– Navigate to your NAS share and select all your uniquely named .dsk files you just made in R-Studio.
– back at the main screen, tick the checkboxes next to each of your disk images.
– click the big green Start RAID 5 icon up the top.
– let it do its thing. This step can take hours or days, or even weeks. For 4 x 2TB images it took 2 days on my system. Let it be and try to keep anything away from the PC and NAS and router/switch while it runs.

There is a “Confidence” meter shown during analysis. If this gets to 100% before the “Progress” meter then things are looking good. If the Progress meter gets to 100% while Confidence is low, the amount of data you can recover is diminished, if any. For RAID data recovery, as in life, high Confidence is what you’re after.
When finished, ReclaiMe will give you instructions for R-studio. Copy it into a document and save/print it for reference later.
7.
Peek a boo, I see you

Using the instructions given to you by ReclaiMe, follow them exactly. And by exactly, I mean exactly.
Here’s an example of what you’re likely to have:
——————————————————-
These instructions are provided for R-Studio version 5.1
1. Launch R-Studio
2. On the toolbar, click “Open image”. Enter “S:ST2000DM001-9YN164CC4C-Disk3.dsk” as the file name, click “Open”
3. On the toolbar, click “Open image”. Enter “S:ST2000DM001-9YN164CC4C-Disk1.dsk” as the file name, click “Open”
4. On the toolbar, click “Open image”. Enter “S:ST2000DM001-9YN164CC4C-Disk2.dsk” as the file name, click “Open”
5. On the toolbar, click “Create Virtual RAID”. Then, select “Create Virtual Block RAID” from the dropdown menu.
6. Right click the disk list on the right, select “Add S:ST2000DM001-9YN164CC4C-Disk3.dsk” from the pop-up menu.
7. Right click the disk list on the right, select “Add S:ST2000DM001-9YN164CC4C-Disk1.dsk” from the pop-up menu.
8. Right click the disk list on the right, select “Add S:ST2000DM001-9YN164CC4C-Disk2.dsk” from the pop-up menu.
9. On the right side of the R-Studio window, set “RAID type” to “RAID5”.
10. Below that, set “Block size” to “64 KB”.
11. Below that, set “Block order” to “Left Asynchronous”.
12. In the “Parents” table, enter “0 Sectors” as “Offset” in all rows.
13. Below the RAID diagram, click “Apply”.
14. On the left panel, “Virtual Block RAID 1” is the newly created RAID. Double click it to start recovery.
Generated by ReclaiMe Free RAID Recovery build 889, www.FreeRaidRecovery.com
——————————————————-
Notice that the order in which you load drive images is not sequential, nor is it when you add the images to the Virtual Block RAID. Perform the steps as instructed in the order it advises.

Also, the last step (14) is slightly misleading. Double-click on the “Basic data partition” listed beneath the root Virtual Block RAID (rather than the “Virtual Block RAID 1” itself), and that will begin recovery.
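
To see why the disk order and block order matter so much, here is a Python sketch of where each data block lands in a RAID 5. It assumes that R-Studio’s “Left Asynchronous” is the layout commonly called left-asymmetric: parity starts on the last disk and moves one disk to the left each row, while data fills the remaining disks left to right. That mapping is my assumption, and the function names are made up. Disk index 0 is the first image you add to the virtual RAID.

```python
def parity_disk(stripe_row, disk_count):
    """Parity sits on the last disk in row 0 and moves left each row."""
    return (disk_count - 1) - (stripe_row % disk_count)


def locate_block(logical_block, disk_count):
    """Return (disk_index, stripe_row) for a logical data block number."""
    if disk_count < 3:
        raise ValueError("RAID 5 needs at least 3 disks")
    stripe_row, position = divmod(logical_block, disk_count - 1)
    p = parity_disk(stripe_row, disk_count)
    # Data fills disks left to right, skipping over the parity disk.
    disk_index = position if position < p else position + 1
    return disk_index, stripe_row


def layout_rows(disk_count, rows):
    """Build a table of rows, e.g. ['D0', 'D1', 'P'], for printing."""
    table = []
    for row in range(rows):
        cells = ["P"] * disk_count
        for position in range(disk_count - 1):
            block = row * (disk_count - 1) + position
            disk_index, _ = locate_block(block, disk_count)
            cells[disk_index] = f"D{block}"
        table.append(cells)
    return table


if __name__ == "__main__":
    for cells in layout_rows(disk_count=3, rows=3):
        print("  ".join(f"{cell:>3}" for cell in cells))
```

Swap two disks in the list, or pick the wrong block order, and every block number points at the wrong place. That is the reason to follow ReclaiMe’s order to the letter.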

The process of mapping the Virtual Block RAID is quite fast in comparison to what you’ve done so far. Just a few minutes after beginning you’ll be presented with a data recovery screen that lists all the folders/files that it could find.
8.
Test recovery

In the folder list on the left of the R-Studio “File View” window, expand the Basic Data Partition, then expand the Root.
This lists all the folders/files that it could find. Hopefully for you, it lists a lot.. maybe even everything!
Time to test a single file recovery:
– navigate your folder list and find a file you’d really like but that isn’t too big,
– select it by ticking the checkbox beside it
– click the “Recover Marked” icon up the top and follow the prompts as to where you’d like to save the file.
Recovering files can take some time. I recovered about 1.5TB per day across the network, but it could differ for you.
Test the recovered file – if it’s complete and satisfactory, make sure you’ve saved the instructions from ReclaiMe on how to set up the Virtual Block RAID for R-Studio.
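
Opening the test file in its usual program is the real check, but a quick first look can be scripted. This Python sketch checks that the recovered file isn’t empty and, for a few common types, that it starts with the well-known signature bytes for that format. The function name and the short signature table are mine. A pass only means the start of the file looks right, not that the whole file is intact.

```python
import os

# Well-known leading bytes ("magic numbers") for a few common file types.
MAGIC_NUMBERS = {
    ".jpg": b"\xff\xd8\xff",
    ".jpeg": b"\xff\xd8\xff",
    ".png": b"\x89PNG\r\n\x1a\n",
    ".pdf": b"%PDF",
    ".zip": b"PK\x03\x04",
}


def looks_intact(path):
    """Rough sanity check of a recovered file based on its first bytes."""
    with open(path, "rb") as recovered:
        header = recovered.read(16)
    if not header:
        return False  # empty file
    expected = MAGIC_NUMBERS.get(os.path.splitext(path)[1].lower())
    if expected is None:
        # Unknown type: at least make sure it isn't just zeros.
        return any(header)
    return header.startswith(expected)
```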

If successful, it’s time for you to purchase R-Studio Network Edition.
store: https://secure.r-tt.com/cgi-bin/Store?id=1

It’s also time for some semi-drastic, but highly recommended action….
9.
Lose your FakeRAID, lose your RAID 5. We have the technology.

You’re ready to recover your data from the NAS to your PC, so you’ll need your disks back in your PC.

So to start off, close everything and shut down cleanly.

At this stage you should reconsider using a RAID 5 – it is inherently fragile for current drives (read up on some of the posts on this by Spiceworks’ SAM). Where possible grab yourself an even number of disks, buy another one if you have an odd amount, and rebuild your RAID as a RAID 10.
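
One thing to check before you commit: a RAID 10 gives you less usable space than a RAID 5 on the same disks, so make sure your recovered data will still fit. Here is a quick Python sketch of the sums, assuming equal-sized disks and a conventional RAID 10 built from mirrored pairs (the function names are mine):

```python
def raid5_usable_tb(disk_count, disk_size_tb):
    """RAID 5 gives up one disk's worth of space to parity."""
    if disk_count < 3:
        raise ValueError("RAID 5 needs at least 3 disks")
    return (disk_count - 1) * disk_size_tb


def raid10_usable_tb(disk_count, disk_size_tb):
    """RAID 10 mirrors every disk, so half the raw space is usable."""
    if disk_count < 4 or disk_count % 2:
        raise ValueError("RAID 10 needs an even number of disks, at least 4")
    return (disk_count // 2) * disk_size_tb


if __name__ == "__main__":
    # Example: the same 4 x 2TB disks in each layout.
    print("RAID 5 :", raid5_usable_tb(4, 2.0), "TB usable")
    print("RAID 10:", raid10_usable_tb(4, 2.0), "TB usable")
```

If your old RAID 5 was nearly full, that gap is another reason to buy extra disks.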

So, take all your disks to the original PC, plug them all back in (I do it in the order that I took them out, but it’s of no consequence now), and start ‘er up.

Jump into your RAID config and remove all existing RAID settings and create a new RAID 10 with all drives. RAID 10 is really the way to go. It’s highly recommended.

This will destroy all data on your disks so just in case, keep your NAS running to minimise risk of corruption on restarts.

Also, if you’ve been using FakeRAID (RAID run by the motherboard controller) consider getting yourself a proper RAID card and running your drives from there. If the controller goes on the fritz, you could be paddling without a .. paddle ..

Wait until the new RAID 10 is configured and initialised, then boot into Windows.

Following Step 7, get back into R-Studio and rebuild the Virtual Block RAID so that you can see all your files again.

Restore at will to your PC, back to whence they came.

10.
Three rules of storage – backup, backup, backup

Now that you have all the data you want back, time to do some housecleaning.

If you have the space, perform a backup now. If you need to, you can delete the images of your disks off the NAS first, but only if absolutely necessary and you’re confident that you won’t lose anything.

Get a backup system in place, preferably automated. That means if you lose a RAID again it’s of no consequence due to having recent backups. Check your backups often to ensure they’re being taken successfully.
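
Checking that backups are still being taken is easy to automate too. This is a minimal Python sketch, assuming your backup job writes files into a single folder; the function names and the one-day threshold are placeholders to adjust. It only looks at modification times, so it proves a backup was written recently, not that it can be restored. Test a restore now and then as well.

```python
import time
from pathlib import Path

SECONDS_PER_DAY = 86400


def newest_backup_age_days(backup_dir, now=None):
    """Age in days of the most recently modified file, or None if no files."""
    files = [p for p in Path(backup_dir).rglob("*") if p.is_file()]
    if not files:
        return None
    newest = max(p.stat().st_mtime for p in files)
    now = time.time() if now is None else now
    return (now - newest) / SECONDS_PER_DAY


def backup_is_fresh(backup_dir, max_age_days=1.0, now=None):
    """True if something in backup_dir was written within max_age_days."""
    age = newest_backup_age_days(backup_dir, now)
    return age is not None and age <= max_age_days
```

Run something like this from a scheduled task and have it nag you when it returns False.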

You may once have thought you didn’t need backups. I hope you’ve learned this is not the case.

So, in the wise words of every moderator in Spiceworks, “Backup, backup, backup”
“And … backup”

Conclusion

As you’ve seen, if you’ve been unlucky enough to encounter two failed drives in a RAID 5 configuration, and the drives haven’t been written to, and the drives are physically functioning correctly, and the moons align with Saturn, and your tongue is held half to the left, recovering data from a failed RAID 5 is possible. It takes time and patience. But if your data isn’t backed up and it is important, the above steps will lead you down the right path and hopefully to a full recovery.

If you have any further tips or amendments, feel free to comment!

And thanks for reading!

by Glen Bodor c/- Spiceworks