[geeks] Impressive...

Jonathan C. Patschke jp at celestrion.net
Mon Mar 9 15:03:54 CDT 2009


On Mon, 9 Mar 2009, Joshua Boyd wrote:

>> You can, so long as you're not using anything other than mirroring or
>> striping.  The math to make a parity-based or Reed-Solomon based
>> redundancy system grow transparently is somewhere between complicated
>> and impossible.
>
> So, how do you think that Drobo is pulling off that trick?  They appear
> (if their file system space estimator is to be trusted) to go from a
> mirror to a raid5 transparently,

That isn't particularly difficult.  You have all the data there, so you
just calculate the parity data for each stripe, and mark the rest of the
space as unused.
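
To make the parity step concrete, here is a minimal Python sketch.  It
assumes plain XOR parity over equal-sized blocks (RAID5-style); the names
xor_parity and rebuild are made up for illustration, and nothing here is
known to be what Drobo actually does.

```python
def xor_parity(blocks):
    """Return the byte-wise XOR of a list of equal-length blocks."""
    if not blocks:
        raise ValueError("need at least one block")
    size = len(blocks[0])
    if any(len(block) != size for block in blocks):
        raise ValueError("blocks must all be the same length")
    parity = bytearray(size)
    for block in blocks:
        for i, byte in enumerate(block):
            parity[i] ^= byte
    return bytes(parity)


def rebuild(surviving_blocks, parity):
    """Recover the single missing data block of a stripe.

    XOR is its own inverse, so XOR-ing the parity with every surviving
    data block leaves exactly the block that was lost.
    """
    return xor_parity(list(surviving_blocks) + [parity])
```

Since the mirror already holds every data block, the conversion only has
to compute xor_parity once per stripe and write the result to the disk
that holds that stripe's parity.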

> and claim to be able to grow or shrink the raid5 seamlessly.

I would be very surprised if there aren't very stringent requirements for
scratch space.  You can do the conversion if you have enough space by
marching a window from one end of the logical volume to the other and
recalculating the parity data in-place, but this would not be a fast
process.
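
A rough sketch of the marching-window idea, in Python, for growing a
parity set by one disk.  Everything in it is an assumption made for
illustration: disks are lists of blocks held in memory, parity rotates by
row number, and the helper names (write_row, read_row, grow) are invented.
The window lives in RAM here; a real device would have to keep it on
stable scratch space, and this sketch makes no attempt at crash safety.

```python
from collections import deque


def xor_blocks(blocks):
    """Byte-wise XOR of equal-length blocks."""
    out = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            out[i] ^= byte
    return bytes(out)


def write_row(disks, row, width, data):
    """Store width-1 data blocks plus one parity block in a row."""
    parity_disk = row % width          # assumed rotation scheme
    remaining = iter(data)
    for d in range(width):
        if d == parity_disk:
            disks[d][row] = xor_blocks(data)
        else:
            disks[d][row] = next(remaining)


def read_row(disks, row, width):
    """Return the data blocks of a row, skipping its parity block."""
    parity_disk = row % width
    return [disks[d][row] for d in range(width) if d != parity_disk]


def grow(disks, old_width, used_rows, block_size):
    """Restripe rows 0..used_rows-1 in place from old_width to old_width+1.

    Returns the number of rows in use afterwards.  A new row holds more
    data than an old one, so the write position always trails the read
    position and never overwrites a row that has not been read yet.
    """
    new_width = old_width + 1
    window = deque()                   # the scratch space
    read_at = write_at = 0
    while read_at < used_rows or window:
        while len(window) < new_width - 1 and read_at < used_rows:
            window.extend(read_row(disks, read_at, old_width))
            read_at += 1
        # Pad the final row with zero blocks if the data runs out.
        data = [window.popleft() if window else bytes(block_size)
                for _ in range(new_width - 1)]
        write_row(disks, write_at, new_width, data)
        write_at += 1
    return write_at
```

The rows past the returned count still hold stale blocks and would simply
be marked as unused.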

> I've been trying to figure out how they do it.

One of the things the user's guide says is "just replace the smallest
drive" to upgrade.  From that, I would guess that they divide each
physical volume into compartments LVM-style and do their calculations over
"physical partitions", or whatever the generic form of that IBM-specific
term is.

Allowing expansion in that manner is significantly easier than treating
each disk as a whole unit within a RAID set.  Given that the examples in
the user's guide always show the largest disk as being unavailable in
terms of calculating usable storage space, it would make sense that
they're not only doing this, but using the new disk as scratch space
for the parity recalculations, as well.
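
Here is a small Python sketch of how a compartment-based layout could
produce those numbers.  It is a guess at a plausible scheme, not Drobo's
documented behaviour: each disk is cut into equal extents, extents at the
same offset form one redundancy group across every disk tall enough to
reach that offset, and each group gives up one member's worth of space to
parity (or to the mirror copy, when only two disks reach it).  The name
usable_extents is made up.

```python
def usable_extents(disk_extents):
    """Usable extents for disks of the given sizes (in extents).

    Hypothetical layered layout: walk the disks from smallest to
    largest.  Each layer spans the disks that are at least that tall,
    and a layer across k disks stores k-1 extents of data per row
    (k == 2 is a mirror, k == 1 has no redundancy and is not counted).
    """
    usable = 0
    previous = 0
    sizes = sorted(disk_extents)
    for index, size in enumerate(sizes):
        members = len(sizes) - index   # disks reaching this layer
        height = size - previous       # rows in this layer
        usable += (members - 1) * height
        previous = size
    return usable
```

Under this layout the total always comes to the sum of all the disks
minus the largest one, which matches the user's guide examples, and for
N equal disks it reduces to (N - 1) * size.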

Drobo is a specific case, as well.  The number of disks is fixed.  They
don't let you grow/shrink arbitrarily.  This makes the problem almost
trivial.  In fact, if one were clever, this system could be used to
reconfigure storage without downtime.  The general case of
growing/shrinking RAID sets with an arbitrary number of members (which ZFS
would need to support in order to toss any number of new disks into an
existing raidz and get the expected ((N - 1) * size) capacity) is a much
harder problem to solve.

Still, now that I've spent an hour or two thinking on it, working at it on
a stripe-size basis could probably make it doable.  However, you'd have to
be really careful to ensure that no block of data is ever left in a state
where there exists only one copy of it.  Doing that while keeping the
storage available and not ending up with a suboptimal arrangement of
data/parity stripes after the fact is a bit sticky.

-- 
Jonathan Patschke ( "They don't have the right to read a book out loud."
Elgin, TX         (                  --Paul Aiken
USA               (                    Executive Director, Authors Guild
