[geeks] backup architecture

Phil Stracchino alaric at metrocast.net
Wed Jan 20 13:20:17 CST 2010


On 01/20/10 11:19, der Mouse wrote:
>>> It is, however, a good point that if the drive streams 40MB/sec, you
>>> need more than 100Mb/sec incoming bandwidth to keep it busy.  Four
>>> 100Mb interfaces might do, if they're shared in sufficiently clever
>>> ways, but unless there's some reason to avoid a gigabit-capable
>>> switch, you'd probably be better off with gigabit.
>> Yup.  Only budget for the switch and NICs is keeping me from it.
> 
> I think that counts as a reason.
> 
> But 100Mb switches are cheap and plentiful.  So are 100Mb cards.
> Perhaps four (or five or six) 100Mb cards would do?  Figuring out how
> to spread the load could be interesting, but even if you can't quite
> hit that 40MB figure, you might come close enough to improve things.

babylon4 already has two gigabit interfaces, one of which is doing
nothing except periodically carry rsync data to the "disaster fallback"
array on minbar.  minbar has a 100Mbit regular network interface and a
PCI-X gigabit NIC for the other end of the PTP gigabit link.  So if I
hooked the LTO2 drive up to minbar (which would necessitate moving
minbar indoors from the deckhouse), getting traffic from babylon4 over
to minbar would be trivial.

Actually, I just had an idea, though I have no experience with the
technology (yet) and don't know whether it would work.  What I might be
able to do is put the LTO2 on minbar, publish it as an iSCSI target over
the gigabit link, and attach it to the SD on babylon4.  That would allow
the LTO2 drive to be inside in a more controlled environment, but still
be controlled by babylon4's SD.  Then babylon4 can spool to its SATA
array and write out over the wire almost directly to the LTO2, and with
the LTO2 on the same SD as the disk pool, I can use job migration.
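For the target side, something like the Linux tgt stack can export a
SCSI device in passthrough mode.  Purely as a sketch (I haven't tried
this; the IQN and /dev/sg path below are invented, and the syntax on a
Solaris box would be entirely different):

```
# /etc/tgt/targets.conf on the tape host -- illustrative only;
# the IQN and the /dev/sg node are assumptions, not real values.
<target iqn.2010-01.net.example:minbar-lto2>
    # Export the tape drive's SCSI generic node in passthrough mode,
    # so the initiator sees a real sequential-access device.
    backing-store /dev/sg3
    device-type pt
    bs-type sg
</target>
```

The passthrough part matters: exporting the drive as a plain block
device wouldn't preserve tape semantics.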

Of course, I'd rather use a newer machine with a better CPU/watt ratio
than minbar (250MHz U30) for that job.
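Once the initiator login makes the tape show up as a local device on
babylon4, both the disk pool and the LTO2 could live in the one
bacula-sd.conf.  Roughly like this (names and device paths are
placeholders, not my actual config):

```
# bacula-sd.conf on babylon4 -- sketch only; names and paths invented.
Device {
  Name = DiskPool-Dev
  Media Type = File
  Archive Device = /pool/bacula      # spool area on the SATA array
  Random Access = yes
  Removable Media = no
  Automatic Mount = yes
  Label Media = yes
}

Device {
  Name = LTO2-Dev
  Media Type = LTO-2
  Archive Device = /dev/rmt/0cbn     # the iSCSI-attached tape drive
  Random Access = no
  Removable Media = yes
  Automatic Mount = yes
}
```

With both devices under one SD, migration between them becomes an
internal copy rather than a network transfer.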

>> It's the monthly full backups that are the problem,
> 
> Hmm.  Is there any particular reason you have to do all the fulls at
> the same time?

Not really, but the way it works out, 80% to 90% of the full backup set
is the main server (babylon4), so spreading out the full backups
wouldn't really gain me anything.

>> because until I can spare the money for a complete set of new disks
>> for the server, I only have enough disk space free on it to have a
>> single full backup to disk in existence at any one time.
> 
> You can't just pop in a SATA card and plunk one of those 931G drives on
> it?

I have two eight-port SATA controllers in babylon4 already, driving a
12-disk ZFS RAIDZ2.  The same budget that has no room in it for a
gigabit switch right now has no room for new disks other than emergency
replacements either.

> I don't entirely understand the explanation about why you couldn't dump
> to disk and then write it to tape later, but I suspect that's because
> Bacula concepts I don't know are involved (I have a general knowledge
> of what Bacula is, but I've never worked with it and thus don't know
> details).

Briefly:
Bacula's architecture basically involves three components:  file daemons
that handle sending data from individual clients, storage daemons that
handle writing it to storage pools, and the director that schedules
jobs, coordinates everything else, and writes metadata to the SQL
catalog.  A storage daemon can have multiple pools of media volumes on a
single physical device, and can control multiple physical storage
devices on a single machine.  The Director can control multiple storage
daemons on different machines and apportion data among them[1].  One of the
more recently implemented features allows jobs written to one physical
storage device to be copied or moved, internally to the storage daemon,
to a different physical storage device.  However, Bacula cannot YET copy
or migrate jobs directly to a different SD on a different machine.
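For concreteness, the Director side of a migration setup looks roughly
like this: a disk pool whose Next Pool points at the tape pool, plus a
job of Type = Migrate.  (All the resource names here are placeholders.)

```
# bacula-dir.conf -- sketch; every name below is invented.
Pool {
  Name = Disk-Pool
  Pool Type = Backup
  Storage = File-Storage
  Next Pool = Tape-Pool        # where migrated jobs end up
}

Pool {
  Name = Tape-Pool
  Pool Type = Backup
  Storage = LTO2-Storage
}

Job {
  Name = "migrate-to-tape"
  Type = Migrate
  Pool = Disk-Pool             # source pool to migrate from
  Selection Type = PoolOccupancy
  Client = babylon4-fd
  FileSet = "Full Set"
  Messages = Standard
}
```

The point of Selection Type = PoolOccupancy is that migration kicks in
as the disk pool fills, which is exactly the constraint of having room
for only one full backup on disk at a time.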

> Is there some reason it has to be done entirely within Bacula, though?
> Maybe tell Bacula to dump to disk and then handle moving things between
> disk and tape with other tools, so Bacula always deals with disks?
> Seems to me that might make the migration-between-hosts issue you were
> talking about irrelevant, if it would work.

Well, I considered using, say, tar or afio to directly copy the
backed-up disk volumes to tape over the network.  But if I do that, then
in order to restore, I've first got to copy the entire restore volume
back to disk across the wire, which would be nightmarish.



[1] Technically, you could run multiple SDs on a single machine; they'd
just have to communicate on different ports.  However, I can't think of
a real-world scenario in which this would be a good idea.


-- 
  Phil Stracchino, CDK#2     DoD#299792458     ICBM: 43.5607, -71.355
  alaric at caerllewys.net   alaric at metrocast.net   phil at co.ordinate.org
         Renaissance Man, Unix ronin, Perl hacker, Free Stater
                 It's not the years, it's the mileage.


