[geeks] Solaris 10 / OpenSolaris bits to be in next version of OSX

Charles Shannon Hendrix shannon at widomaker.com
Wed Aug 9 19:35:17 CDT 2006


Wed, 09 Aug 2006 @ 19:58 -0400, Phil Stracchino said:


> But it appears that to do so, you have to tell the system to allocate
> redundant copies of every block.  It appears ZFS uses the per-stripe
> checksums in the metadata to avoid having to do this, as it can
> unambiguously determine which stripe is bad and reconstruct its data.

But what about this required integration of the two abstractions, the
filesystem and the volume manager?

These features should just be part of the communication between them.
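
For what it's worth, here is the claimed trick as I understand it, as a
toy sketch in Python -- hypothetical names and a toy XOR parity, nothing
from the actual ZFS source.  The filesystem keeps a checksum next to
each block pointer, and on a mismatch it can try every single-block
reconstruction the parity allows until one verifies:

    # Toy sketch of checksum-directed reconstruction -- hypothetical
    # code, not anything from ZFS.  One stripe = N data blocks plus an
    # XOR parity block.  The filesystem stores a checksum of the
    # stripe's data in the block pointer, so it can tell *which*
    # reconstruction is right.
    import hashlib
    from functools import reduce

    def xor(a, b):
        return bytes(x ^ y for x, y in zip(a, b))

    def checksum(data):
        return hashlib.sha256(data).digest()

    def read_stripe(blocks, parity, want_sum):
        data = b"".join(blocks)
        if checksum(data) == want_sum:
            return data                     # fast path: stripe verifies
        # Assume one block silently lied; rebuild each candidate from
        # the parity of the others and re-check the stored checksum.
        for i in range(len(blocks)):
            others = [b for j, b in enumerate(blocks) if j != i]
            rebuilt = reduce(xor, others, parity)
            fixed = blocks[:i] + [rebuilt] + blocks[i+1:]
            if checksum(b"".join(fixed)) == want_sum:
                return b"".join(fixed)      # block i was the liar
        raise IOError("more damage than single parity can repair")

    blocks = [b"aaaa", b"bbbb", b"cccc"]
    parity = reduce(xor, blocks)
    want = checksum(b"".join(blocks))
    blocks[1] = b"bXbb"                     # bad data, no I/O error
    assert read_stripe(blocks, parity, want) == b"aaaabbbbcccc"

A plain RAID layer can't run that loop, because it has no checksum to
test the candidates against.  I assume that's their argument for fusing
the layers, but I'd still rather see it done as a protocol between them.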

> I've had several disks over time that silently failed in WORN mode --
> write once, read never.  You could write to the disk all you wanted, and
> it'd sit there cheerfully saying everything was fine.  "Oh, now you want
> to read that back?  Uh, hold on a minute, I'll get it for you ... just a
> minute ... I know I had it here somewhere ...  Here, will this do?  MOST
> of the bits are right ... I think ..."

Well, it sounds to me like you had bad disks.

I've *never* had a good drive that did that.  If a write fails, it
knows, and it tells me.

I *HAVE* seen several crappy drives, Maxtors among them, that do that.
It happens because their firmware simply doesn't report a whole host of
errors, and even lies about some settings, like whether sync writes are
honored.
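
You can catch the sync-write lie yourself with a crude timing test.  A
rough sketch in Python, using rule-of-thumb numbers: a 7200 RPM disk
can't honestly complete much more than ~120 synchronous writes a second
to the same sector, because each one costs a platter rotation, so
seeing thousands a second means the drive is acking from its cache:

    # Rough cache-lie detector, nothing rigorous.  Point it at a file
    # on the disk under test, not at a RAM-backed /tmp.
    import os, time

    def synced_writes_per_second(path, count=200):
        fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o600)
        try:
            start = time.time()
            for _ in range(count):
                os.pwrite(fd, b"x" * 512, 0)  # rewrite the same sector
                os.fsync(fd)                  # demand stable storage
            return count / (time.time() - start)
        finally:
            os.close(fd)

    # made-up path; substitute a file on the drive you suspect
    rate = synced_writes_per_second("/home/you/syncprobe")
    print("%.0f synced writes/sec" % rate)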

I guess the question being asked here is exactly what Sun means by
silent data corruption and how they solve it.

If all they do is notice errors and correct for them: YAWN.

I am willing to accept that they might have a smoother implementation
than what has been available for *UNIX* systems so far, but in the
industry as a whole, this is all old technology.

Or, is there something else that really is new magic in ZFS?

> I think what they're saying is that their RAID-Z can automatically
> detect, and transparently handle, read errors on the media that do not
> actually involve complete hardware failure.  In short, anyone's RAID can
> handle a failed disk, but ZFS can handle it when the disk is just
> starting to go bad and is dropping bits, and identify exactly which disk
> block it was that returned bad data even if the disk's ECC didn't catch
> or flag it.

If this is true, then I definitely would not trust ZFS, because you just
described them trying to handle a situation that should not be handled.

When the drive starts having errors, a properly working RAID should kick
it out immediately.

Modern drives automatically remap bad sectors.  By the time you start
seeing errors, the drive has run out of spares and is toast.

The entire purpose of SMART, especially under LVM and RAID, is to detect
failures early and swap the drive before it runs out of remapping
ability, loses its spindle, etc.
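
That kind of early warning is easy to poll for with smartmontools'
smartctl.  A rough sketch in Python; the attribute names vary by
vendor, and treating any nonzero count as a replacement signal is my
own paranoia, not a vendor limit:

    # Rough SMART early-warning poll using smartmontools' smartctl.
    import subprocess

    WATCHED = ("Reallocated_Sector_Ct", "Current_Pending_Sector")

    def smart_warnings(device):
        out = subprocess.run(["smartctl", "-A", device],
                             capture_output=True, text=True).stdout
        hits = []
        for line in out.splitlines():
            f = line.split()
            # attribute rows: ID# NAME FLAG VALUE ... RAW_VALUE
            if len(f) >= 10 and f[1] in WATCHED and f[9].isdigit():
                if int(f[9]) > 0:
                    hits.append("%s = %s on %s" % (f[1], f[9], device))
        return hits

    for w in smart_warnings("/dev/sda"):    # pick your own device
        print("replace candidate:", w)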

-- 
shannon "AT" widomaker.com -- ["The grieving lords take ship.  With these
our very souls pass overseas." -- Exile]


