[rescue] Cray J90s

Tue Jun 12 18:16:46 CDT 2001

On Tue, 12 Jun 2001, Dave McGuire wrote:

>   Vector processors, on the other hand, operate on vectors as atomic
> values.  They can perform the same operation on whole lists of numbers
> (vectors) in one operation...one vector instruction.

Interestingly enough, there are parallels here between vector supers and
the IBM mainframes of yore.  Core (RAM) was such an expensive and
relatively slow commodity that a great deal of functionality was packed
into the microcode for certain instructions.  It's possible, for example,
to XOR a pair of blocks of up to 256 bytes each on an S/370 or above in a
single assembler instruction (though internally the microcode expands it
to a hardwired loop), much more rapidly than explicitly coding the loop.  The
SIMD instructions in many modern micros (MMX, VIS, AltiVec, etc.) attempt
to gain more speed through "in-chip" vectorization.  The gains can be
significant, but none of them has as much versatility as a true vector
super (where just about any operation can be vectorized).
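
To make the block-XOR comparison concrete, here is a rough sketch in
Python (purely for illustration; it is not S/370 assembler and says nothing
about real timings).  The names xor_loop, xor_block and BLOCK_LEN are made
up for this example.  The first function is the explicit loop; the second
treats each block as one value and XORs it in a single operation, which is
only a loose analogy to a single storage-to-storage instruction.

```python
BLOCK_LEN = 255  # illustrative block size (an assumption for this sketch)


def xor_loop(a: bytes, b: bytes) -> bytes:
    """Explicit byte-at-a-time loop, like hand-coding the XOR loop."""
    out = bytearray(len(a))
    for i in range(len(a)):
        out[i] = a[i] ^ b[i]
    return bytes(out)


def xor_block(a: bytes, b: bytes) -> bytes:
    """Whole-block XOR in one operation (blocks must be the same length)."""
    n = len(a)
    whole = int.from_bytes(a, "big") ^ int.from_bytes(b, "big")
    return whole.to_bytes(n, "big")


a = bytes(range(BLOCK_LEN))
b = bytes([0xFF]) * BLOCK_LEN
# Both approaches must produce the same result.
assert xor_loop(a, b) == xor_block(a, b)
```

The point is only the shape of the code: one operation over the whole
block versus one operation per byte.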

>   This may not seem like such a huge win at first glance...but when
> you consider the fact that most Crays (all except the YMP-C90) operate
> on 64-bit numbers and have vector registers that are 64 elements deep
> (i.e. the maximum length of a single vector is 64 numbers), the
> usefulness and raw power of this form of operational parallelism
> becomes clear.

And it's a _huge_ win for vector (in the geometry sense) math and matrix
manipulation, something that supers do a lot of (and the reason why we
have scientific libs like BLAS around).
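
As a rough illustration of how a BLAS-style operation maps onto 64-element
vector registers, here is a sketch of an AXPY (y = alpha*x + y) processed
in strips of at most 64 elements.  It is plain Python, not Cray code; the
names saxpy_strips and VLEN are made up for the example, and the strip
length is simply the 64-element figure quoted above.

```python
VLEN = 64  # assumed vector register length (elements per strip)


def saxpy_strips(alpha, x, y, vlen=VLEN):
    """Return alpha*x + y, computed in strips of at most vlen elements."""
    n = len(x)
    out = [0.0] * n
    for start in range(0, n, vlen):
        stop = min(start + vlen, n)
        # On a vector machine each strip would be a few vector instructions
        # (load, multiply, add, store) rather than a scalar loop.
        out[start:stop] = [
            alpha * xi + yi for xi, yi in zip(x[start:stop], y[start:stop])
        ]
    return out


x = [float(i) for i in range(150)]  # 150 elements: strips of 64, 64 and 22
y = [1.0] * 150
result = saxpy_strips(2.0, x, y)
assert result[149] == 2.0 * 149 + 1.0
```

The last strip is shorter than the rest, which is the usual case when the
data length is not a multiple of the register length.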

-James