[rescue] Perverse Question

Sat Jun 14 22:29:40 CDT 2003

> I would like to ask you, since you study computer architecture, why
> not take a bunch of R3k's, bump up the MHz and put them on a 2 or 3
> million transistor die and have an instant SMP box?  You could put the
> cache-coherency and memory controller on the same die.

Some groups have done that; it is called OCP (On-Chip Parallelism) or
CLP (Chip-Level Parallelism). Usually you have a fixed budget of
transistors, so the question is how to use them: do you build one very
complex serial pipeline, or do you put a bunch of simpler pipelines in
the same space? The problem with putting an SMP on a single chip is that
you still have to deal with an SMP design :). Meaning you either have
code that is already very parallel at the instruction level (ILP), or
you have to put extra effort into threading your code to take advantage
of those cores.
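
To make the "extra effort" concrete, here is a minimal sketch in Python
(chosen only for illustration). The function names and the n_workers
parameter are hypothetical; n_workers stands in for the number of
on-chip cores. The point is that the programmer has to partition the
work and combine the partial results by hand, whereas the serial version
is just a loop.

```python
from concurrent.futures import ThreadPoolExecutor


def serial_sum_squares(values):
    """Single pipeline: one loop walks the whole input."""
    total = 0
    for v in values:
        total += v * v
    return total


def split_chunks(values, n_workers):
    """Cut the input into at most n_workers contiguous slices."""
    size = max(1, (len(values) + n_workers - 1) // n_workers)
    return [values[i:i + size] for i in range(0, len(values), size)]


def threaded_sum_squares(values, n_workers=4):
    """Same result as the serial loop, with the work split per worker.

    n_workers is a hypothetical stand-in for the number of cores.
    """
    chunks = split_chunks(values, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        # Each worker reduces its own slice independently.
        partials = list(pool.map(serial_sum_squares, chunks))
    # The partial sums still have to be combined serially at the end.
    return sum(partials)
```

Note that in standard CPython builds the global interpreter lock keeps
these threads from running Python code truly in parallel, so this shows
the restructuring a threaded program needs, not a real speedup.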

Also remember that you have to feed all those cores, so you need a
significant amount of instruction and data bandwidth into the chip, and
pins are very, very costly in computer design. Pins are usually the
limiting factor for most technologies. (I mean pin in the I/O port
sense, not an actual physical pin, since most parts are surface mounted
in some way.) Plus, the cache coherency circuitry is not THAT trivial.
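
A back-of-the-envelope way to see the bandwidth problem, as a rough
sketch: the function and every number below are hypothetical, and the
model assumes each core misses in its cache independently, with no
sharing and no contention. Under those assumptions, off-chip traffic
simply scales with the number of cores, while the pin count does not.

```python
def offchip_bandwidth_mb_s(cores, clock_mhz, accesses_per_cycle,
                           miss_rate, line_bytes):
    """Estimate sustained off-chip traffic in MB/s (1 MB = 10**6 bytes).

    Assumes every cache miss fetches one full line from outside the
    chip and that cores do not share data (a simplifying assumption).
    """
    accesses_per_second = cores * clock_mhz * 1e6 * accesses_per_cycle
    misses_per_second = accesses_per_second * miss_rate
    return misses_per_second * line_bytes / 1e6


# Hypothetical figures: 100 MHz cores, one memory access per cycle,
# 5% miss rate, 32-byte cache lines.
one_core = offchip_bandwidth_mb_s(1, 100, 1.0, 0.05, 32)
four_cores = offchip_bandwidth_mb_s(4, 100, 1.0, 0.05, 32)
```

With these made-up figures, one core needs about 160 MB/s and four
cores need about 640 MB/s through the same package.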

I am not at school, so I do not have access to some reference papers, but
the idea you have expressed has been tried before, even commercially. I
believe the alpha xx364 was indeed 2 x264 cores (with enhancements) put
together on the same die. I believe that instead of having the chip
operate as a straight SMP machine, they were more focused on
simultaneous multithreading (SMT) behavior. Multithreading is my area of
research BTW :)