[SunRescue] Info on Sun Multipack and disk arrays in general (long)

Christopher Byrne rescue at sunhelp.org
Sat Dec 2 05:26:48 CST 2000


All,

There has been a huge amount of interest in the Sun MultiPack storage array
I listed here a few days ago. Unfortunately, most of that interest was
somewhat uninformed as to what the MultiPack is, what it does, and how much
it is worth. I'm going to be a little long-winded on this because it seems
that people are interested in the subject. Up until a few weeks ago I was
the Sr. Security Architect for an enterprise storage company, so while I'm
not a storage expert per se, I do speak from a position of knowledge and
experience.

There are four basic types of external storage. The simplest and least
expensive is an external drive housing. This basically acts as a connector
and power supply for a drive inserted into your SCSI chain. There are no
benefits associated with this except the ability to add storage without
cracking open your case. These systems are almost never hot swappable or hot
pluggable and provide no redundancy, fault tolerance, or performance
increase.

Next up in the chain (bad SCSI humor) is the JBOD array (stands for Just a
Bunch Of Disks, another bad industry joke). JBOD arrays are basically large
boxes with their own internal SCSI bus, power supply, fans, etc. They are
the most common type of disk array. For the most part they are similar to
external housings, and generally don't support hot swap or have any
hardware RAID support. JBOD boxes sell for anywhere from a few hundred to a
few thousand dollars (no drives) depending on size and quality.

As an extension to that, you have E-JBOD (Enhanced JBOD) systems which have
more features. Often they will have more than one internal SCSI bus and can
offer such features as RAID awareness or direct RAID support, hot swapping,
and on board cache.

The Sun StorEdge MultiPack is an enhanced JBOD system with two internal SCSI
wiring buses (but a single SCSI chain for SCSI ID purposes), a heavy
duty power supply with redundant fans, and slots for up to twelve hot
swappable drives, from F/W SCSI-2 all the way up to Ultra 160 or FC-AL. You
can change the media type supported by switching parts available from Sun.
Remember that U160 devices are backward compatible through all LVD SCSI
standards, all the way to F/W SCSI-2. With the proper controller and volume
management the MultiPack supports RAID 0, 0+1, 1, 5, 10, 50, and 100. It is
also supported in Sun's enterprise volume management software as a
manageable RAID device, as well as Veritas Volume Manager (they are the same
software with a different front end). JBOD boxes are usually priced based on
size and features. The largest of them allow 14 or 15 drives because of SCSI
device limitations, and sell for up to about 10 grand bare, or upwards of 50
grand fully loaded with the largest available disks (currently 73 gigabyte).

Sun sells the six disk StorEdge MultiPack new with two 36 gig drives in it
for about $5000, and the drives at about $1700 apiece. Or you can get the
same drive direct from the manufacturer for about $750 if you don't care
about your Sun warranty. Sun no longer sells the 12 disk array, which is the
one I'm offering, because it conflicts in the market space with their lower
end enterprise storage servers. If they still sold it, it would probably
sell for something around $7500 new with 2 drives in it. The six drive
version maxed out sells for 10 grand, and that's still only half the
capacity of the twelve slot.

The MultiPack doesn't officially support the new 73 gig drives (big bucks),
but they do work. Of course finding a 1" high 73 gig drive with an SCA plug
for less than 2k is pretty much impossible, if you can get one at all. The
more common 1.6" or 2" high drives are not compatible. If you fully
populated the box with twelve 73 gig drives you would have an almost 900
gigabyte array. Throw in the slots in your Sun box itself and add volume
management software, and you can build a more than 1 terabyte storage array
in two small boxes for less than 30 grand. Just to put it into perspective,
a terabyte is about 333,000 MP3s, 1600 full CDs worth of music, or about
200 full screen high quality surround sound movies.
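
For anyone who wants to check the arithmetic, here is a rough sketch in
Python. The per-file sizes (about 3 MB per MP3, about 650 MB per full CD)
are my own assumptions for illustration, not figures from Sun or anyone
else, and everything is in decimal units the way drive makers count.

```python
# Back-of-the-envelope capacity figures, decimal units.
GB = 10**9
TB = 10**12

drive_size = 73 * GB   # the 73 gig drives discussed above
slots = 12             # drive slots in the twelve-slot MultiPack

# Raw capacity of a fully populated box, before any RAID overhead.
array_capacity = slots * drive_size
print(array_capacity / GB)   # 876.0, i.e. "almost 900 gigabytes"

# Assumed average sizes, chosen only to illustrate the comparison.
mp3_size = 3 * 10**6         # roughly 3 MB per MP3
cd_size = 650 * 10**6        # roughly 650 MB per full CD

mp3s_per_tb = TB // mp3_size
cds_per_tb = TB // cd_size
print(mp3s_per_tb)           # 333333
print(cds_per_tb)            # 1538
```

With those assumptions the MP3 count matches the figure above and the CD
count lands in the same ballpark.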

The final type of external storage is the enterprise storage array, as
typified by the EMC Symmetrix or Hitachi Freedom series. These are large,
extremely expensive cabinets with hundreds of hot swappable disks, built in
hardware RAID controllers, large quantities of cache, and enterprise level
management tools. The current leader in terms of performance in these is the
Hitachi Freedom 9900, which can hold up to 28 terabytes of data, and has up
to 32 gigabytes of cache. That particular box runs about three million bucks
fully populated with drives and cache.

If you need a storage system like this you don't really care how much it
costs as long as it gives you the high performance, highly available storage
that your business depends on. Companies like Lucasfilm, Macromedia, and
Pixar, who make vast quantities of digital media content, and companies like
Fidelity and Liberty Mutual, who have HUGE databases (as in single
databases of more than 1 terabyte), are the main customers for these types
of devices.

Below I've written out an explanation of RAID for those of you who may not
be familiar with it. I go over all of the common levels here, so if you have
any confusion about what RAID 0+1 is and how it's different from 10, please
read on.

----------------------------------------------------------------------------

RAID: Redundant Array of Inexpensive Disks (the industry changed
"inexpensive" to "independent" a few years ago because it sounds better)

RAID 0: Striping. A group of drives is logically combined, and data is
written across all of the drives simultaneously in stripes or "chunks". This
radically improves performance because the system can read (or write)
chunks of data from more than one disk at a time. It also seriously reduces
reliability, because if any single disk in a stripe set fails all of the
data can be lost. Striping also presents a single very large logical volume
to the operating system, which allows you to create filesystems far larger
than a single disk could support. Technically this isn't really RAID
because it isn't redundant, but it's part of the standards. Also, drives in
a stripe set are not hot swappable.

RAID 1: Mirroring. All data from one drive is simultaneously written to
another drive. 100% redundancy, decrease in write performance, modest
increase in read performance. The problem with this is that you need to
double the number of drives for the same capacity. Mirrors can also be made
hot swappable. This is the most common type of RAID.
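
To make the striping and mirroring layouts described above a bit more
concrete, here is a minimal sketch in Python. The function names and the
simple round-robin layout are my own illustration, not how any particular
controller or volume manager actually lays out data.

```python
def locate_chunk(logical_chunk, num_disks):
    """Striping: map a logical chunk number to (disk index, offset on disk).

    Chunks are dealt out round-robin, so consecutive chunks land on
    different disks and can be read or written in parallel.
    """
    disk = logical_chunk % num_disks
    offset = logical_chunk // num_disks
    return disk, offset


def mirror_targets(logical_chunk):
    """Mirroring: every chunk is written to the same offset on both disks."""
    return [(0, logical_chunk), (1, logical_chunk)]


# Eight consecutive chunks striped over a four disk set.
layout = [locate_chunk(c, 4) for c in range(8)]
print(layout)
# [(0, 0), (1, 0), (2, 0), (3, 0), (0, 1), (1, 1), (2, 1), (3, 1)]

# The same chunk written to a two disk mirror.
print(mirror_targets(5))
# [(0, 5), (1, 5)]
```

Notice that in the striped layout every disk holds a piece of the volume,
which is why losing any one of them takes the whole thing down.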

RAID 5: Striping with parity. As I mentioned under striping, if a single
drive fails in a plain stripe set then all of the data can be lost. To get
around this while keeping the benefits of striping, RAID 5 adds something
called parity. Just as the value of a missing bit in a byte can be worked
out from the remaining bits plus a parity bit, the contents of a missing
drive can be worked out from the remaining drives plus the parity data.
This uses a mathematical operation known as a 'bitwise exclusive or', or
'XOR'. Basically what that means is that if one drive fails, the data left
on the other drives can be used to reconstruct the missing data. This is a
pretty good solution for a lot of things, but it has some important
limitations. Parity takes up between 12.5% and 33% of the raw capacity
depending on how many disks there are per RAID group (one drive in eight
down to one drive in three), which is much less than the 50% lost to
mirroring, and a RAID 5 group retains much of the performance of a plain
stripe if a good controller is used. The limitations are as follows. First,
you lose one drive's worth of capacity to parity data. Second, a RAID 5
array must have a minimum of 3 disks, and generally a maximum of 8 disks.
These disks do not have to be the same size, but if they aren't, each one
is treated as if it were the same size as the RAID group's smallest member.
If you need more space you can configure multiple RAID groups. Finally, the
XOR calculation is resource intensive, so your disk controller has to do a
lot more work. This is the second most common type of RAID.
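
The XOR trick is easier to believe once you see it work. Below is a minimal
sketch in Python, assuming three data chunks and one parity chunk; the
helper name and the sample data are made up for illustration, and a real
RAID 5 implementation also rotates the parity chunk across the disks, which
this sketch leaves out.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# One stripe: three data chunks, each living on a different disk.
data = [b"SUN ", b"MULT", b"IPAC"]

# The parity chunk, stored on a fourth disk, is the XOR of the data chunks.
parity = xor_blocks(data)

# Pretend the disk holding data[1] has failed. XORing everything that
# survived (the other data chunks plus parity) gives back the lost chunk.
survivors = [data[0], data[2], parity]
rebuilt = xor_blocks(survivors)

print(rebuilt)               # b'MULT'
print(rebuilt == data[1])    # True
```

The same calculation works no matter which single chunk is missing, but
lose two chunks from the same stripe and there is nothing left to solve for.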

RAID 0+1: Mirroring of a stripe. In this configuration, a stripe set as
described above is mirrored onto a second, identical stripe set. This
provides 100% redundancy for the stripe set, but it only protects against
the failure of drives on one side of the mirror. If a drive on one side and
a drive on the other side of the mirror both fail, the data is lost. This is
unlikely in small arrays, but for large arrays the likelihood of a
multi-disk failure increases in proportion to the number of disks in the
array. This is not an official RAID level, but a common industry term for
the combination of two RAID levels.

RAID 10: Striping of mirrors. In this configuration all the drives are set
up in mirrored pairs. Each pair is then added to a stripe set. This provides
all of the performance benefits of striping without the reliability
penalties. It allows for up to half of the drives in the array to fail
before data is lost, as long as no two failed drives belong to the same
pair. It also makes hot swapping possible because when you remove a drive,
its mirror takes over. The limitations are that you lose 50% of your raw
capacity, and only one drive per mirrored pair can fail before you lose
data. If an entire mirrored pair fails you lose your data. This is not an
official RAID level, but a common industry term for the combination of two
RAID levels.
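
To see the difference between mirroring a stripe and striping mirrors, you
can simply count which multi-disk failures each layout survives. This is a
rough sketch in Python under my own simplified model (2n disks, every disk
equally likely to fail, no rebuild in progress); the function names are
made up for the example.

```python
from itertools import combinations


def survives_raid01(failed, n):
    """Mirror of two n-disk stripes: one whole side must be untouched."""
    side_a = set(range(n))
    side_b = set(range(n, 2 * n))
    return not (failed & side_a) or not (failed & side_b)


def survives_raid10(failed, n):
    """Stripe of n mirrored pairs: no pair may lose both of its disks."""
    return all(not {2 * i, 2 * i + 1} <= failed for i in range(n))


def count_survivable(survives, n, k):
    """Count the k-disk failure combinations that the layout survives."""
    return sum(survives(set(f), n) for f in combinations(range(2 * n), k))


# Six disks (n = 3), two failing at once: 15 possible combinations.
print(count_survivable(survives_raid01, 3, 2))   # 6
print(count_survivable(survives_raid10, 3, 2))   # 12
```

So with six disks, a mirrored stripe survives 6 of the 15 possible two-disk
failures while striped mirrors survive 12 of them, using exactly the same
number of drives.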

RAID 50: Mirroring of a stripe with parity. In this configuration a RAID 5
parity stripe set is mirrored in a similar fashion to RAID 0+1. This
configuration is pretty much pointless since it doesn't provide anything
extra in the way of redundancy, while wasting more storage than any other
configuration. This is not an official RAID level, but a common industry
term for the combination of several RAID levels.


RAID 100: Striping with parity of mirrors. In this configuration all drives
are configured in mirrored pairs. The mirrored pairs are then made into a
stripe set with parity. This striping does introduce the overhead of RAID 5
into the array, but it allows you to have one complete mirrored pair fail in
your array and still be able to recover the data. This provides you with
better than 100% redundancy with almost all of the performance of a RAID 10
configuration. This is not yet commonly used, however I expect it to gain
popularity for mid level enterprise storage systems. This is not an official
RAID level, but a common industry term for the combination of several RAID
levels.



RAID, as my wife calls it, Viagra for your hard drives


Chris Byrne
=======================================
The eyes may be the windows on the soul
But the word is the doorway to the mind
=======================================




