[geeks] examples of vector processors (vs scalar

Mon Aug 5 16:55:48 CDT 2002

On August 5, jdboyd at cs.millersville.edu wrote:
>Now (and this is from listening to others, rather than personal
>experience), Altivec allows adding 2 sets of 4 floats in one instruction
>like above (this part I know personally from reading the docs, and
>reviewing OS X code that used the units).  Crays (and here is where I
>get to the hearsay) can add 2 sets of 4 floats together like that in one
>instruction, but they can also add 2 sets of a few hundred floats
>together in one instruction as well.

  Cray YMP-architecture machines (which covers most Cray vector
machines except the Cray 1/2/3, X/MP, and SV[12], and their variants)
usually have 64-element vector registers.  The YMP-C90 and T90 both
have 128-element vector registers.  The word width is 64 bits, and the
width of a single-precision floating point number is 64 bits (not 32
like most everything else).  Cray floats are non-IEEE, though the most
recent machines do support IEEE.

  In the YMP architecture, a single instruction can perform a single
operation on a 64-element (or 128-element on the C90 and T90) vector
stored in one of eight vector registers (V0-V7).  There is a register
called VL (vector length) which tells the processor how many elements
to process with the next vector instruction.  Vector operations in
these machines are always register to register, except of course for
vector loads and vector stores, which transfer vectors between vector
registers and main memory.  Some systems (the CDC Cyber205 and its
ancestor the STAR come to mind) don't have vector registers, but
execute their vector operations memory-to-memory.
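
  To make the role of VL concrete, here is a rough sketch in C of how
a loop over an arbitrary-length array might be cut into register-sized
strips.  This is only a model, not Cray code: the vreg type and the
vload/vadd/vstore helpers are hypothetical stand-ins for the vector
registers and the vector load, add, and store instructions, and MAX_VL
assumes the 64-element case.  Each helper call stands for one vector
instruction that processes vl elements.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_VL 64   /* assumed: elements per vector register (128 on C90/T90) */

/* Hypothetical model of one vector register: MAX_VL 64-bit elements. */
typedef struct { double e[MAX_VL]; } vreg;

/* Stand-in for a vector load: memory -> register, vl elements. */
static void vload(vreg *v, const double *mem, size_t vl)
{
    assert(vl <= MAX_VL);
    for (size_t i = 0; i < vl; i++)
        v->e[i] = mem[i];
}

/* Stand-in for a vector store: register -> memory, vl elements. */
static void vstore(double *mem, const vreg *v, size_t vl)
{
    assert(vl <= MAX_VL);
    for (size_t i = 0; i < vl; i++)
        mem[i] = v->e[i];
}

/* Stand-in for a register-to-register vector add of vl elements. */
static void vadd(vreg *dst, const vreg *a, const vreg *b, size_t vl)
{
    assert(vl <= MAX_VL);
    for (size_t i = 0; i < vl; i++)
        dst->e[i] = a->e[i] + b->e[i];
}

/* c[i] = a[i] + b[i] for n elements, one strip of at most MAX_VL
 * elements at a time.  Returns the number of vector adds issued. */
size_t add_arrays(double *c, const double *a, const double *b, size_t n)
{
    vreg v0, v1, v2;
    size_t adds = 0;

    for (size_t i = 0; i < n; i += MAX_VL) {
        /* Set VL: a full register, or whatever is left at the tail. */
        size_t vl = (n - i < MAX_VL) ? (n - i) : MAX_VL;

        vload(&v0, a + i, vl);      /* memory -> V0   */
        vload(&v1, b + i, vl);      /* memory -> V1   */
        vadd(&v2, &v0, &v1, vl);    /* V2 = V0 + V1   */
        vstore(c + i, &v2, vl);     /* V2 -> memory   */
        adds++;
    }
    return adds;
}
```

  With n = 150 this issues three vector adds, with VL set to 64, 64,
and 22.  A memory-to-memory design would skip the explicit loads and
stores and have the add name its memory operands directly.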

            -Dave

-- 
Dave McGuire                     "I haven't worn pants in 14 months!"
St. Petersburg, FL                                   -Pete Wargo