[geeks] Solaris 10 / OpenSolaris bits to be in next version of OSX

Phil Stracchino phil.stracchino at speakeasy.net
Wed Aug 9 18:58:24 CDT 2006


Jonathan C. Patschke wrote:
> For instance, AIX has been doing things like this for years.  -Every-
> slice lives inside a volume group managed by the LVM.  If you have a
> volume group stating that there will exist more than one copy of each
> block, AIX will -tell you- if/where a bad block comes into play, return
> data from the good block, and deactivate the bad block.  This isn't new,
> and AIX doesn't have separate maintenance tools, mount tables, and
> methodologies for dealing with filesystems guarded by the LVM in this
> fashion.
> 
> The added bonus to the AIX approach is that you get this for -all-
> slices, including paging slices.  It's nice when you can replace a device
> holding your page slice without rebooting the system.

But it appears that to do so, you have to tell the system to allocate
redundant copies of every block.  ZFS seems to use the per-stripe
checksums in its metadata to avoid that requirement: it can
unambiguously determine which stripe is bad and reconstruct its data.
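The idea is simple enough to sketch.  Here's a toy mirror in Python (all
names and structures are mine, not ZFS's): each write records a checksum
alongside the data, and a read verifies each copy against its checksum, so
the system knows *which* copy returned bad data rather than just noticing
that two copies disagree.

```python
import hashlib

def write_block(copies, data):
    """Store the data plus its checksum in each redundant copy (mirror-style)."""
    checksum = hashlib.sha256(data).hexdigest()
    for copy in copies:
        copy["data"] = data
        copy["checksum"] = checksum

def read_block(copies):
    """Return data from the first copy whose checksum verifies,
    plus the indices of any copies that silently returned bad data."""
    bad = []
    for i, copy in enumerate(copies):
        if hashlib.sha256(copy["data"]).hexdigest() == copy["checksum"]:
            return copy["data"], bad
        bad.append(i)  # checksum mismatch: this copy is corrupt
    raise IOError("all copies corrupt")

copies = [{}, {}]
write_block(copies, b"important payload")
copies[0]["data"] = b"important pay1oad"   # simulate silent bit rot
data, bad = read_block(copies)
# data comes from the good copy; bad identifies the corrupt one
```

A plain two-way mirror without checksums can only tell you the copies
differ; the checksum is what breaks the tie.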


> The more sinister side of this is how Sun is selling it.  "Silent data
> corruption"?  What in the world is that supposed to mean?

You've never had a shaky disk start throwing random read errors?

> Data corruption does not happen unless there is a bug in the stack of
> code somewhere between fread() and the SCSI transaction or unless the
> hardware is defective.  There's nothing silent about it.  Either
> hardware or software has failed; this is the general case for a failure
> in a RAID metadevice.

I've had several disks over time that silently failed in WORN mode --
write once, read never.  You could write to the disk all you wanted, and
it'd sit there cheerfully saying everything was fine.  "Oh, now you want
to read that back?  Uh, hold on a minute, I'll get it for you ... just a
minute ... I know I had it here somewhere ...  Here, will this do?  MOST
of the bits are right ... I think ..."

>  So Sun's RAID implementation can handle that
> scenario?  Well, uh, great.  That's rather why we have RAIDs in the
> first place, isn't it?
> 
> If Sun is trying to say that their RAID software can handle errors not
> directly derived from hardware failures, they might want to rethink the
> implications of that statement before waving it about on a banner.

I think what they're saying is that RAID-Z can automatically detect, and
transparently handle, read errors on the media that don't involve
outright hardware failure.  In short, anyone's RAID can handle a failed
disk; ZFS can handle a disk that is just starting to go bad and dropping
bits, and can identify exactly which disk block returned bad data even
when the disk's own ECC didn't catch or flag it.
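That "identify exactly which block" part is what plain parity can't do on
its own: if one block in a stripe is silently wrong, the parity tells you
the stripe is inconsistent but not which member lied.  A per-block
checksum resolves the ambiguity.  A heavily simplified sketch (single XOR
parity, fixed stripe width -- real RAID-Z is considerably more involved):

```python
import hashlib
from functools import reduce

def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

def make_stripe(blocks):
    """Data blocks plus one XOR parity block, with a checksum per block."""
    parity = reduce(xor, blocks)
    sums = [hashlib.sha256(b).digest() for b in blocks]
    return blocks, parity, sums

def read_stripe(blocks, parity, sums):
    """Verify every block; reconstruct a single bad block from the
    remaining blocks plus parity, reporting exactly which one was bad."""
    for i, (block, s) in enumerate(zip(blocks, sums)):
        if hashlib.sha256(block).digest() != s:
            others = [b for j, b in enumerate(blocks) if j != i]
            blocks[i] = reduce(xor, others + [parity])
            return blocks, i   # i = the block that returned bad data
    return blocks, None

blocks, parity, sums = make_stripe([b"AAAA", b"BBBB", b"CCCC"])
blocks[1] = b"BBbB"            # simulate a silently flipped bit
fixed, bad = read_stripe(blocks, parity, sums)
# fixed[1] is rebuilt as b"BBBB"; bad == 1 names the lying block
```

The checksum pinpoints the bad member, and XOR of the surviving members
and parity regenerates its contents, with no complete disk failure ever
having been reported.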


-- 
 Phil Stracchino                     Landline: 603-886-3518
 phil.stracchino at speakeasy.net         Mobile: 603-216-7037
 Renaissance Man, Unix generalist, Perl hacker, Free Stater


