[geeks] Murphy, instantiated

Jonathan Groll lists at groll.co.za
Tue Jun 2 08:28:15 CDT 2009


On Tue, Jun 02, 2009 at 08:11:00AM -0400, Phil Stracchino wrote:
>Some after-the-fact forensics made it pretty clear what happened in this
>case.  It's all used hardware, so all the disks were unknown quantities.
>A network-wide full backup started at 03:10, putting heavy load on the
>array.  At 04:29:55, c1t7d0, evidently the weakest disk, buckled under
>the load, increasing the load on the remaining disks.  Then at 08:49:29,
>c1t6d0 folded as well, and the array went into fully degraded mode,
>increasing the load on the remaining disks even further.  Eight minutes
>later at 08:57:06, c1t4d0 gave up and the entire array went down.  The
>fact that it was three drives on the same controller is coincidence, I
>think.

Isn't a more likely hypothesis that the controller was corrupting disk
writes, though? (My first thought was to ask whether the failed disks
were on different controllers.) How many controllers do you have in
this box?
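
For what it's worth, here is a rough sketch of how one could tally
failed disks per controller from the device names alone. It assumes the
names follow the Solaris-style cXtYdZ convention (controller, target,
disk); the helper name controller_of is made up for illustration and is
not from any tool.

```python
import re
from collections import Counter

# Device names of the three failed disks, as given in the quoted message.
failed = ["c1t7d0", "c1t6d0", "c1t4d0"]

# Assumption: Solaris-style naming, c<controller>t<target>d<disk>.
DEVICE_RE = re.compile(r"^c(\d+)t(\d+)d(\d+)$")

def controller_of(device):
    """Return the controller number encoded in a cXtYdZ device name."""
    match = DEVICE_RE.match(device)
    if match is None:
        raise ValueError(f"unrecognised device name: {device!r}")
    return int(match.group(1))

# Count how many of the failed disks sit behind each controller.
per_controller = Counter(controller_of(d) for d in failed)
print(per_controller)  # Counter({1: 3})
```

All three land on controller 1, which proves nothing by itself if the
box only has one controller, hence the question.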

Regards,
Jonathan


