[rescue] speaking of SGI's - IP25 woes

Thomas Gallaway rescue at port11.net
Thu Mar 18 14:12:33 CST 2004


Geoff Koehler wrote:

>Hi all,
>
>Speaking of SGI Challenges, I have a problem with my XL that I'm wondering if anyone has seen before.  About half an hour after being powered on, one of the CPUs dies.  I've taken out the board and looked at it, but it doesn't "look" any different.  The heatsink on the CPU in question does not seem loose or unseated.  It strikes me as a heat-related failure, but I've moved it to a different slot and it doesn't make any difference.  After a few hours with the thing turned off, it comes back, only to fail again half an hour later.  Any ideas?
>
>Cheers, Geoff
>  
>
I have had exactly the same thing on an SGI Onyx with quad R10Ks. One of 
the CPUs was bad in the first place and I had to disable it, and another 
one would crash the entire system after about 20-30 minutes, leaving me 
with a core dump. I had to disable that CPU entirely to get the system 
to work.

Smells like a bad CPU board to me. Resoldering the contacts might work, 
but that is a lot of work, and you may simply have a dead CPU. It is 
probably less time-consuming to just buy a new IP25 board.

-- Thomas


