[geeks] examples of vector processors (vs scalar processors)

Mon Aug 5 15:41:18 CDT 2002

On Mon, Aug 05, 2002 at 10:23:38PM +0200, William S. wrote:
> Sometimes I think my browser has a mind of its own.
> Anyways, my brain has locked in on trying to better understand
> the differences between vector and scalar processors.
> 
> What I was looking for now are some examples of vector
> processors (does the G4 qualify with it's Altivec engine?).
> Are there any vector processors found on small machines?

Some people argue over whether the G4 counts as a vector processor or
not.  Altivec is definitely a vector processing unit though.  To a
certain extent, VIS (from the UltraSPARCs), 3DNow!, and MMX/SSE2/other
Intel stuff are all vector units, even if the main CPU doesn't count
as a vector processor.

I'm hard pressed to really comment though since my knowledge of Cray's
instruction set is very limited.  For instance, do Crays have
non-vector FP registers, or non-vector integer registers?

The short explanation of the difference between a vector unit and a
scalar unit is that if you have a few small arrays, like say:
       float f[4], g[4], h[4];
       //...
       //code to fill f and g goes here
then a vector unit would allow you to say something like
     h = f + g
while a scalar unit would require this:
     h[0]=f[0]+g[0];
     h[1]=f[1]+g[1];
     h[2]=f[2]+g[2];
     h[3]=f[3]+g[3];

Note, this isn't exactly how it would look on any platform.  In
reality, the vector add instruction is going to be issued from
assembly code (or compiler-specific intrinsics), not plain C code.  If
you are allowed to write h = f + g, that probably means you are using
a vector class that overloads the + operator (so the datatype most
likely isn't float[4] anymore) and uses some sort of embedded assembly
code (either inlined or linked in externally) to perform the add.
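
As a rough sketch of the two styles side by side, here is one way it
could look in C.  This assumes GCC or Clang, since vector_size is a
compiler extension and not standard C.  The names float4, add_scalar
and add_vector are made up for illustration.

```c
/* Assumption: GCC or Clang.  vector_size is a compiler extension. */
/* float4 is a hypothetical name for a 16-byte vector of 4 floats. */
typedef float float4 __attribute__((vector_size(16)));

/* Scalar style: one add per element, written out explicitly. */
void add_scalar(const float *f, const float *g, float *h)
{
    h[0] = f[0] + g[0];
    h[1] = f[1] + g[1];
    h[2] = f[2] + g[2];
    h[3] = f[3] + g[3];
}

/* Vector style: the + applies to all 4 lanes at once. */
float4 add_vector(float4 f, float4 g)
{
    return f + g;
}
```

Whether add_vector really compiles down to a single vector add
depends on the target and the compiler flags.  On a machine without a
suitable vector unit the compiler falls back to scalar code.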

Now, Altivec allows adding 2 sets of 4 floats in one instruction like
above (this part I know personally from reading the docs and
reviewing OS X code that used the unit).  Crays (and here is where I
get to the hearsay, from listening to others rather than personal
experience) can add 2 sets of 4 floats together like that in one
instruction, but they can also add 2 sets of a few hundred floats
together in one instruction as well.
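
To make the difference in vector length concrete: with a 4-wide unit,
adding longer arrays means looping over them 4 elements at a time and
cleaning up the leftovers with scalar code.  The sketch below assumes
GCC or Clang (vector_size is a compiler extension), and float4 and
add_arrays are hypothetical names.

```c
#include <stddef.h>
#include <string.h>

/* Assumption: GCC or Clang.  float4 is a hypothetical name. */
typedef float float4 __attribute__((vector_size(16)));

/* Adds n floats, 4 at a time, then handles any leftover elements. */
void add_arrays(const float *f, const float *g, float *h, size_t n)
{
    size_t i = 0;

    /* Main loop: one 4-wide add per iteration. */
    for (; i + 4 <= n; i += 4) {
        float4 a, b, c;
        memcpy(&a, f + i, sizeof a);   /* load 4 floats from f */
        memcpy(&b, g + i, sizeof b);   /* load 4 floats from g */
        c = a + b;
        memcpy(h + i, &c, sizeof c);   /* store 4 results to h */
    }

    /* Tail loop: fewer than 4 elements remain, so add them one by one. */
    for (; i < n; i++)
        h[i] = f[i] + g[i];
}
```

For a few hundred floats this main loop runs many times on a 4-wide
unit.  A machine with much longer vector registers would need far
fewer vector instructions for the same job.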

Does this at least clarify a little?

Some people would argue that any processor with any vector unit,
however weak, is a vector processor.  I find it laughable to think of
a Pentium MMX as a vector processor, with that stupid 64-bit,
ints-only vector unit.  The latest revisions of the whole MMX/SSE
thing are supposed to be better (I believe that SSE now works on
single precision floats), but I haven't really paid attention.
Personally, I really thought that 3DNow! was pretty slick.  But that's
probably just me getting worked up over little.  AMD Athlons are
definitely nowhere near as nice as Motorola G4s in my book.

-- 
Joshua D. Boyd