[SunRescue] Excuse my gross oversimplification of RAID

Christopher Byrne rescue at sunhelp.org
Sun Dec 3 14:46:32 CST 2000


>Behalf Of Paul Theodoropoulos
>Sent: Sunday, December 03, 2000 10:40
<snip>
>An additional observation, the issue of 'hot spares' is not even
>touched upon in the RAID specification, however its use is
>probably one of the most significant steps one can take in assuring
>the safety of one's data in a RAID scheme.

I couldn't agree more, though a lot of vendors seem to have a braindead
implementation of hot spares in their lower end product lines. To my mind a
hot spare should take over and be rebuilt onto automatically, without any
user intervention; if it doesn't, it's really a warm spare. Too many
implementations out there are really warm spares.
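
To make the hot vs. warm distinction concrete, here's a rough sketch in
Python. It is a toy model, not any vendor's firmware: the Array class, the
auto_rebuild flag and the drive names are all made up for illustration. The
only point is that a hot spare is swapped in with no operator action, while
a warm spare sits in the pool until somebody assigns it.

```python
# Toy model only: class, field and drive names are hypothetical.
from dataclasses import dataclass, field


@dataclass
class Array:
    active: list                                  # drive IDs holding data, by slot
    spares: list = field(default_factory=list)    # idle drives in the spare pool
    auto_rebuild: bool = True                     # True = hot spare, False = "warm" spare
    degraded: list = field(default_factory=list)  # slots waiting on an operator

    def fail(self, drive):
        """Handle a drive failure; return the replacement drive ID, or None."""
        slot = self.active.index(drive)
        if self.auto_rebuild and self.spares:
            # Hot spare: taken from the pool and rebuilt onto immediately.
            replacement = self.spares.pop(0)
            self.active[slot] = replacement
            return replacement
        # Warm spare (or empty pool): the slot stays degraded until a human acts.
        self.active[slot] = None
        self.degraded.append(slot)
        return None


hot = Array(active=["d0", "d1", "d2"], spares=["s0", "s1"])
hot_result = hot.fail("d1")      # spare takes over on its own

warm = Array(active=["d0", "d1", "d2"], spares=["s0"], auto_rebuild=False)
warm_result = warm.fail("d1")    # spare is present but nothing happens
```

In the warm case the spare is still sitting in the pool after the failure,
which is exactly the window where a second dead drive hurts.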

<snip>

>if a disk fails in the
>middle of the night when nobody is around, it's fabulous to be able
>to turn off the beeping pager, sound in the knowledge that you can
>swap out the bad disk in the morning - at your leisure, rather than
>rushing in a panic to the data center to swap it out ASAP, for fear
>that another disk may die in that window.

Oh my, yes, that is a comforting thought. A recent employer was, as I said, a
storage company, and one of our vendors shipped a batch of defective drives.
Unfortunately, by the time we found out the lot was bad, the drives were
already installed in half a dozen locations. To further complicate matters,
these systems were in production, the vendor wasn't sure exactly which
drives were failure-prone, and our record keeping on which drives had gone
into which bay in each array wasn't the greatest... What a joy.
Thankfully the arrays contained a good-sized pool of hot spares, so whenever
a drive fizzled out a hot spare would take over and we'd get a field
engineer out there with a known-good drive... then call up and bitch to the
vendor again. Not the greatest method of fixing the problem, but it worked,
especially since the array in question has eight hot spares per cabinet and
the most we ever had fail was three drives.

>hmm, does it sound like i'm speaking from experience? ;^)

Nahhhhh, not at all. No one has ever had any problem with anything ever when
you were on call... really, especially not on holidays or vacations ;-)

I wrote an article for Slashdot about two years ago called "The High Tech
Sweatshop", touching on administrators' mission criticality. If you're
interested, here's the link from my personal site:
http://www.chrisbyrne.com/writing/other/sweatshop.html

You can still find it on Slashdot, but it's in their old story archive, which
can take forever to load.

Chris Byrne
=======================================
The eyes may be the windows on the soul
But the word is the doorway to the mind
=======================================



