Back in July 2010 I built a file server with RAID 1 which I used to store every single bit of digital media that was important to me, including over 200 gigabytes of photos dating all the way back to 2003. The RAID was made up of two 2TB Western Digital Green hard drives that, as of August 2017, had reached almost four years of continuous uptime. I didn't have any other backup solutions for this data because I figured, what were the odds that both drives would fail at the exact same time? As long as one of the drives lived I should be able to rebuild the array with a new drive and all the data would be recovered, right? I'm sure most of you can see where this is going.
Cut to the morning of Sunday, August 13, 2017. I decided to build a new file server because my existing one was getting old and had absolutely no room for expansion since it was in a home theatre PC style case. The server had not even been turned on in months and was tucked away deep in my closet. I pulled it out, plugged it in, turned it on and BAM. "Serious issues were found while checking the disk drive. Press M for manual recovery." Uh oh.
I proceeded to spend the next five hours of my Sunday trying to understand what the heck went wrong. After reading thousands of lines of man pages and trying dozens of tools I finally came to the realization that both of the hard drives were fucked. I could not get either of them to mount and only one of them was actually recognized by Ubuntu as having a valid file system.
I quickly switched from "I am a computer science major. I can fix this." to "Oh fuck, I am in over my head. Who can I pay to fix this?" I typed "sf data recovery" into Google and the top result was Lazarus Data Recovery. Their Yelp reviews filled me with confidence so I gave them a call. Keep in mind this was 3pm Pacific time on a Sunday and someone picked up within seconds. I talked to William, who listened to my problem and told me that if I showed up the next morning at 10am he would be able to help me out.
I bought a 4TB Western Digital Black hard drive and an external drive enclosure, packaged up my corrupted RAID and showed up at Lazarus Data Recovery at exactly 10am. I cannot say enough good things about William. He was absolutely amazing from beginning to end. He explained to me how he hoped to recover my data and gave me daily (sometimes twice a day) updates on his progress. By Wednesday night I heard back that he had successfully recovered my data and was in the process of transferring it to the drive I provided. By Thursday afternoon I had all of the recovered data in my hands. Not only did William recover my data but he made a duplicate copy of my photos just in case anything went wrong during the first copy attempt (spoiler alert: it did). He also apologized profusely that he could not save the drives and that both of them were dead and not worth using again. William was incredibly empathetic and never stopped thinking about the customer.
After getting the drive home I hooked it up to my MacBook and transferred both copies of the photos off of it. After diffing the two directories I found that one of them had a single corrupted image file and the other had a single corrupted video file, but between the two copies every file had been recovered perfectly.
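For anyone who wants to run a similar comparison, here is a minimal sketch in Python. It is not necessarily how the diff above was done; the function names are made up for illustration. It hashes every file in two directory trees and reports which files exist on only one side and which have different contents.

```python
import hashlib
from pathlib import Path
from typing import Dict, List


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def tree_digests(root: Path) -> Dict[str, str]:
    """Map each file's path (relative to root) to its content digest."""
    return {
        p.relative_to(root).as_posix(): file_digest(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def compare_trees(a: Path, b: Path) -> Dict[str, List[str]]:
    """Report files missing from either tree and files whose contents differ."""
    da, db = tree_digests(a), tree_digests(b)
    return {
        "only_in_a": sorted(da.keys() - db.keys()),
        "only_in_b": sorted(db.keys() - da.keys()),
        "differ": sorted(k for k in da.keys() & db.keys() if da[k] != db[k]),
    }
```

Calling `compare_trees(Path("copy1"), Path("copy2"))` (placeholder paths) returns the three lists. A mismatch only says that the two versions of a file differ, not which one is the corrupted one, so each file in `differ` still has to be opened and checked by hand.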
After recovering all of my data I moved on to revamping my data backup solutions. Going forward I plan to do the following:
- Set up a new file server with RAID 1. Mirroring the data across two drives is still a solid starting point.
- Every year or two, remove one of the drives in the RAID and replace it with a new drive. This should prevent what I just went through by ensuring the drives in my RAID are of different ages. The removed drive can also be stored somewhere, so in the worst case I still have a drive with all of the data up until a certain date and can't lose everything.
- Set up periodic backups of the RAID to another hard drive, either an external one or possibly an SSD inside the file server. If the RAID fails I should have a functional snapshot of it that is not that old, and because the drive is not constantly being accessed it should live longer than the RAID (especially if it is an SSD).
- Copy important photos/videos to iCloud Photo Library. I pay for iCloud storage so I may as well make use of it. This will also allow me to easily share photos with family and friends which is something I barely do right now.
- Set up a cloud backup solution, like Backblaze, to store the entire RAID offsite so if my place burns down or some other catastrophic event occurs the data should be stored safely in the cloud.
- I currently have all of my photos on my MacBook Pro. Continue to store them there before copying them to my RAID. This way I can easily use Time Machine to back them up.
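The periodic-backup item in the list above could be sketched roughly like this in Python. This is only an illustration under my own assumptions, not a finished tool: the function names and the two-second timestamp tolerance are made up, and the script would still need something like cron to run it on a schedule. It copies a file only when it is missing from the backup or looks changed, and it never deletes anything from the backup, so a file accidentally removed from the RAID stays recoverable.

```python
import shutil
from pathlib import Path
from typing import List

# Assumed tolerance for filesystems that store coarse timestamps.
MTIME_SLACK_SECONDS = 2


def needs_copy(src: Path, dst: Path) -> bool:
    """True if dst is missing, has a different size, or is clearly older."""
    if not dst.exists():
        return True
    s, d = src.stat(), dst.stat()
    return s.st_size != d.st_size or s.st_mtime > d.st_mtime + MTIME_SLACK_SECONDS


def backup_tree(source: Path, backup: Path) -> List[str]:
    """Copy new or changed files from source into backup.

    Returns the relative paths that were copied. Files that exist only
    in the backup are left alone.
    """
    copied = []
    for src in sorted(source.rglob("*")):
        if not src.is_file():
            continue
        rel = src.relative_to(source)
        dst = backup / rel
        if needs_copy(src, dst):
            dst.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(src, dst)  # copy2 also carries over the modification time
            copied.append(rel.as_posix())
    return copied
```

One known gap in this sketch: an edit that keeps the file size the same and lands within the tolerance window would be missed. Comparing content hashes instead would close that gap at the cost of reading every file on every run.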
With all of these in place my photos should be stored on at least six different hard drives in at least three different locations. I think that much redundancy should ensure that I never lose my data again.
So let this be a lesson to anyone out there who thinks they have enough backup solutions for their data. Odds are that you don't. Start thinking about the doomsday scenarios and figure out how confident you are that your data is going to survive.