[rescue] advice on rescuing an e10k

Charles Shannon Hendrix shannon at widomaker.com
Tue Oct 24 15:25:12 CDT 2006


On Mon, 23 Oct 2006 14:12:47 -0500 (CDT)
"Jonathan C. Patschke" <jp at celestrion.net> wrote:

> On Mon, 23 Oct 2006, Andrew Gaylard wrote:
> 
> > Is Solaris really that much faster on this hardware? And what do
> > you mean by "maximum use of the hardware" and "scalability"?
> 
> Well, first off, Linux doesn't support that hardware.

No, but kernel 2.6 laid the foundation for it.

> In general, open source support of large Sun hardware is a dicey
> prospect because the systems don't look like typical workstation
> hardware.  Instead of a processor-to-memory bus multiplexed by a
> "north bridge" and a processor-to-I/O bus multiplexed by a "south
> bridge" and a system controller to manage cache coherency, you tend to
> have a far more complicated system when you get past 4 CPUs or 2 CPU
> boards in RISC hardware.

Your information is quite outdated.

Yes, Solaris will run better on high-end hardware, but Linux isn't as
bad as it used to be either.

Keep in mind that Solaris has never been known as a speed demon and
only recently has been able to scale really large.  It's a really hard
problem to solve with UNIX because UNIX was created mostly for small
systems.

Linux has not been workstation/PC-centric for almost 3 years now.  In
kernel 2.5 they moved to a sub-architecture organization.

Linux 2.6 was focused on big improvements on big iron systems.  Earlier
Linux performance, especially in I/O and the scheduler, was terrible on
large systems, especially as the number of processes or threads got
really high.

Linux now has an O(1) scheduler, preemptible kernel, new I/O subsystem,
greatly reduced latency in scheduling, task switching, and system
calls, and in general scales far better.
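
To illustrate why an O(1) scheduler matters as the number of runnable
tasks grows, here is a rough sketch of the priority-array idea: one FIFO
per priority level plus a bitmap of non-empty levels, so picking the next
task never scans the task list.  This is not kernel code; the class and
method names are made up for illustration, and details such as the
active/expired array swap and timeslices are left out.

```python
from collections import deque

NUM_PRIOS = 140  # number of priority levels used by the 2.6 O(1) scheduler


class PrioArray:
    """Hypothetical run-queue array: one FIFO per priority plus a bitmap."""

    def __init__(self):
        self.queues = [deque() for _ in range(NUM_PRIOS)]
        self.bitmap = 0  # bit p is set while queues[p] is non-empty

    def enqueue(self, task, prio):
        # Append to the FIFO for this priority and mark the level as busy.
        self.queues[prio].append(task)
        self.bitmap |= 1 << prio

    def pick_next(self):
        """Pop a task from the lowest-numbered (best) busy priority, or None."""
        if self.bitmap == 0:
            return None
        # Index of the lowest set bit; the bitmap has a fixed width, so the
        # cost does not depend on how many tasks are queued.
        prio = (self.bitmap & -self.bitmap).bit_length() - 1
        queue = self.queues[prio]
        task = queue.popleft()
        if not queue:
            self.bitmap &= ~(1 << prio)  # level is now empty
        return task
```

Whether 10 or 10,000 tasks are queued, pick_next() does the same amount
of work, which is the property that the older scheduler lacked.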

Linux even has NUMA support now, and some cluster foundations are in.

2.6 is really just the first foundation step to bigger things.  It was
a painful but necessary rewrite of much of 2.4's code base to make way
for needed changes.

Just for example, Linux is the OS for the Cray XD1 supercomputer,
supporting 144 CPUs in a clustered architecture layered on top of
Infiniband and HyperTransport.

It's certainly not limited to northbridge PC systems for good
performance.

Perhaps more exciting for the future, 2.6 has laid the basic
foundations for moving to "single-instance-on-cluster" operation.  Not
only will this be faster than the high-overhead clusters we have now,
but it should make it a lot easier to manage them.

Think of how Irix runs on Origin, and that's where Linux is headed.

I keep hoping that BSD will get pushed like that too, but Linux is
sucking up all the attention.  NetBSD is languishing, OpenBSD really
doesn't care, and FreeBSD only recently came out of the 5.x dark ages.

FreeBSD 6.1, warts and all, does seem to be a good performer, though
I've never had the opportunity to use it on a big system, and don't
know if it has any support at all for non-PC(ish) architectures.
Haven't paid attention.

> E-series Suns (and, I would presume, Sun Fire servers) look more like
> a cluster running in lock-step with a shared clock.  Linux doesn't
> support many systems with an architecture that looks like this
> (large-ish AXP and IA64 systems excepted).

True, but most of that is handled by system hardware, not the OS.

Solaris was also not originally designed for a clustered system.

In fact, UNIX in general was never designed for big iron, and for many
years fundamental design issues have kept it from running as well as
some of the old legacy OSes on large systems.

That's changing though, so I suppose it eventually won't matter.  UNIX
keeps changing rather than admit defeat.

> Well, Sun's compiler beats GCC in terms of both code size and code
> efficiency.  

gcc isn't too good on any non-x86 CPU.

I wonder what impact Apple moving to Intel will have on gcc?  For a while
they were helping non-Intel code generation.

> A lot of what "faster" means depends on your point of view.  Solaris
> is "fast" on large Sun hardware not because any one task runs faster,
> but because the SunOS scheduler is optimised for Very Large hardware
> (much the same reason Solaris is so much slower than Linux on tiny
> SPARC systems).  Solaris can get more done at once than Linux above a
> certain hardware threshold (that used to be nearly any 4-way SMP
> system)

Well, now it is nearly any 144-CPU supercomputer... :)
 


-- 
shannon "AT" widomaker.com -- ["Star Wars Moral Number 17: Teddy bears
are dangerous in herds."]


