There and back and there and back again

Well, the old server is up with the new hard drives. I ran into quite a few snags yesterday. It’s annoying when a problem shows up through something that seems harmless, like standing a computer upright instead of leaving it on its side. One would think that was a safe thing to do. The reality was quite different, and I was left with a hosed RAID 10 array. That’s right, 2 of 4 disks bombed.

This wasn’t the expected result, of course. I gained some much-needed experience in this area about a year ago, when my array was hosed and I overreacted. This time I didn’t make a single hasty decision, and my data survived. There are a few lessons to take from this:

When an array is listed as failed in the disk controller but no drive failures are listed, try re-creating the array. On a lot of controllers, re-creating the array without formatting will sometimes allow access to the data again. This isn’t a guarantee, but it is better than losing data outright. It worked in my case, and I was able to boot the greatest rescue CD there is and repair the file system with fsck.

Another nice thing came from using VMs. All I needed off the filesystem was one file, not a bunch of files scattered across different partitions. The file copied back and forth and ran on the other system with very little hassle.

Cables are touchier than they should be. Never discount them in troubleshooting. They ended up being the issue in this case. Also, try to avoid putting crucial disks on the same power cable chain. In a RAID 5 this may not be possible, but on the RAID 10 I was able to stagger the power connections so that any one of the cables can go bad without losing the array.
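To make the staggering idea concrete, here is a rough Python sketch of the check I did in my head. It assumes a conventional RAID 10 built from two mirrored pairs, where the array survives as long as each pair keeps at least one working disk. The disk names, chain names, and the chains_that_kill_array helper are all made up for illustration; your controller may pair the disks differently, so confirm the real pairing before trusting a layout.

```python
def chains_that_kill_array(mirror_pairs, power_chains):
    """Return the power chains whose failure would take out a whole mirror pair.

    Assumption: the array is lost only when both disks of some mirror pair
    lose power at the same time (conventional striped-mirror RAID 10).
    """
    fatal = []
    for chain_name, disks_on_chain in power_chains.items():
        powered = set(disks_on_chain)
        for pair in mirror_pairs:
            # Both halves of a mirror on one chain means one bad cable is fatal.
            if set(pair) <= powered:
                fatal.append(chain_name)
                break
    return fatal


# Hypothetical four-disk layout: two mirrored pairs striped together.
mirror_pairs = [("disk0", "disk1"), ("disk2", "disk3")]

# Naive wiring: each power chain feeds both disks of one mirror.
naive = {"chain_a": ["disk0", "disk1"], "chain_b": ["disk2", "disk3"]}

# Staggered wiring: each power chain feeds one disk from each mirror.
staggered = {"chain_a": ["disk0", "disk2"], "chain_b": ["disk1", "disk3"]}

print(chains_that_kill_array(mirror_pairs, naive))      # ['chain_a', 'chain_b']
print(chains_that_kill_array(mirror_pairs, staggered))  # []
```

With the naive wiring either cable going bad kills the array, while the staggered wiring leaves one disk of every mirror powered no matter which single cable fails.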

Data is harder to lose than your controller card would have you think. Exhaust all possibilities before calling it quits.
