[geeks] Weird MacOS issue

Jonathan C. Patschke jp at celestrion.net
Thu Dec 25 14:44:42 CST 2008


On Wed, 24 Dec 2008, Bill Bradford wrote:

> It also depends on what you're *doing* with zfs.  Jonathan Patschke has
> some horror stories of it just completely shitting the bed.

The best one doesn't come from me, but it does come from work this month.

$ork has a Rather Large ZFS installation which holds our online backups.
For a variety of reasons, we need to keep -everything- available (as well
as archived to tape) for restoration.

I don't recall the specifics (since I've heard it through our admins--I'm
no longer in IS), but I believe the current configuration is a half-rack
of eight 16-disk shelves, each of which is a raidz2 with two hotspares.
The trays are then striped together.  Each tray has a 4Gb/s link to each
of two FC switches, and each of the 4Gb/s switches has a 4Gb/s link to the
host box.
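
As a rough sketch of that layout (not $ork's actual command), the script
below assembles the kind of `zpool create` invocation that would produce it:
one raidz2 group per shelf, with the groups striped together by virtue of
being top-level vdevs in one pool.  The pool name `backup` and the cNtMd0
device names are hypothetical, and the split of 14 raidz2 members plus 2
spares per 16-disk shelf is an assumption read off the description above.

```python
# Sketch: build a `zpool create` command line for the described layout.
# Pool name and device names are made up for illustration.
SHELVES = 8
DISKS_PER_SHELF = 16
SPARES_PER_SHELF = 2  # assumption: 14 raidz2 members + 2 spares per shelf


def shelf_disks(shelf):
    """Hypothetical device names: controller = shelf, target = disk slot."""
    return [f"c{shelf}t{slot}d0" for slot in range(DISKS_PER_SHELF)]


def zpool_create_args(pool="backup"):
    args = ["zpool", "create", pool]
    spares = []
    for shelf in range(1, SHELVES + 1):
        disks = shelf_disks(shelf)
        split = DISKS_PER_SHELF - SPARES_PER_SHELF
        # Each shelf becomes its own raidz2 top-level vdev; ZFS stripes
        # writes across all top-level vdevs in the pool.
        args += ["raidz2"] + disks[:split]
        spares += disks[split:]
    # Hot spares belong to the pool as a whole, so they are listed once.
    args += ["spare"] + spares
    return args


if __name__ == "__main__":
    print(" ".join(zpool_create_args()))
```

The script only prints the command; it doesn't touch any disks.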

So, 12TB/shelf x 8 x (0.93) = 89TB usable space.
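
The arithmetic behind that figure can be spelled out as a small sketch.
The 1TB disk size is an assumption (it's what makes 12 data disks come out
to 12TB/shelf), as is the 14-member raidz2 with 2 spares per shelf; the 0.93
factor is taken from the line above as-is, since the post doesn't say what
it accounts for.

```python
# Sketch: reproduce the usable-capacity estimate from the post.
SHELVES = 8
DISKS_PER_SHELF = 16
SPARES_PER_SHELF = 2
PARITY_DISKS = 2         # raidz2 gives up two disks' worth of space to parity
DISK_TB = 1.0            # assumption: 1TB disks
OVERHEAD_FACTOR = 0.93   # from the post; its meaning isn't stated


def data_disks_per_shelf():
    return DISKS_PER_SHELF - SPARES_PER_SHELF - PARITY_DISKS


def usable_tb():
    per_shelf_tb = data_disks_per_shelf() * DISK_TB
    return per_shelf_tb * SHELVES * OVERHEAD_FACTOR


if __name__ == "__main__":
    print(f"{data_disks_per_shelf()} data disks/shelf, "
          f"about {usable_tb():.0f}TB usable")
```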

Even if you hate PCs, this is a Damn Studly Backup Box, and it even has a
Sun logo on it.

These systems were running Solaris 10u5 and whatever ancient version of
ZFS shipped with that (either 6 or 9, I think).  $admin upgraded the
system to Solaris 10u6, which includes a more recent build of ZFS (either
11 or 13--I don't know, as I run Nevada for my R&D stuff).  After he had
the OS upgraded, he followed Sun's procedure for migrating the filesystem
from v[69] to v1[13] (including quiescing the FS and flushing pending
writes).

Bam!  Uncorrectable data errors in -every- raidz2 volume.  All the
hotspares are immediately rushed into service, and the system grinds itself
to a halt with resilvering and kernel panics due to out-of-memory problems
(on a 128GB box).  When it comes up, huge swaths of the backup hive are
missing, none of it is write-accessible, and zpool reports that
resilvering will be complete sometime in April.

So, a week before the Christmas holidays, with Sun support stretched
thin, $ork has no active backups and is looking at reimporting a few tens
of TBs of backup archives.  Go ZFS!  At least the IS folks won't be bored
on their "days off" for the holiday season.

ZFS is a fine idea, and it's even pretty solid for research work and
casual-to-heavy use (ZFS 14 and Nevada b104 are very close to shippable
quality).  Just don't push it to the limits Sun claims it can handle; it's
not quite there yet.

-- 
Jonathan Patschke < "There is great satisfaction in building good tools
Elgin, TX          > for other people to use."
USA               <                                     --Freeman Dyson


