[SunHELP] unknown machine problems

Brian Hechinger sunhelp at sunhelp.org
Wed Dec 6 11:58:17 CST 2000


we have two Resilience boxes running Raptor firewall version blah.  in the
last couple of days they have both crashed several times, each time panicking
and dropping to the ok prompt.  since we have remote serial console on these
boxes we don't have them set to auto-reboot.

does anyone have any experience with these two things?  both companies have
been less than useful so far (Axent doesn't seem to care that we have outages)
and this is really starting to annoy us.  we are in the middle of installing
Raptor on a non-Resilience box to see if it is the hardware or the software.

i'm just looking for any ideas anyone may have since we are all stumped.

also, going through the crash dump, i don't see a trap() call anywhere when
i do a $c in adb.  the instructions i have for dumps involve doing a $<regs
lookup on the second argument of the trap call.

on another, possibly unrelated issue, another admin scribbled down the
following from the console before resetting the machine:

fast data access MMU Miss

this looks like a hardware issue to me, but i could be wrong.

and on another completely unrelated issue, while all this was going on i was
reinstalling Solaris 8 on my ss20 workstation.  when i walked back to it, it
had panicked, couldn't fsck /var and dropped into single-user.  i fixed it up,
rebooted and checked the logs, and here is what i found:

Dec  6 11:21:21 dev8 savecore: [ID 570001 auth.error] reboot after panic:
BAD TRAP: type=9 (Data fault) rp=fbebea94 addr=70 mmu_fsr=126 rw=1 occurred in
module "krtld" due to a NULL pointer dereference

(wrapped here to fit 80 columns; it is all on one line in the log, of course)
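
in case it helps anyone compare panics across machines, here is a minimal
sketch in Python that pulls the key=value fields and the module name out of a
message like the one above.  the function name parse_bad_trap is made up for
illustration, and it assumes the message keeps this exact "BAD TRAP" layout,
which i have only seen in this one log entry.

```python
import re

def parse_bad_trap(message):
    """Return the key=value fields of a 'BAD TRAP' panic message as a dict.

    Hypothetical helper: assumes the layout seen in the single log
    entry quoted in this post.
    """
    # undo any wrapping so the message is one line again
    joined = " ".join(message.split())
    # collect every key=value pair (type, rp, addr, mmu_fsr, rw)
    fields = dict(re.findall(r"(\w+)=(\w+)", joined))
    # the module name is quoted rather than written as key=value
    module = re.search(r'module "([^"]+)"', joined)
    if module:
        fields["module"] = module.group(1)
    return fields

sample = """BAD TRAP: type=9 (Data fault) rp=fbebea94 addr=70 mmu_fsr=126 rw=1 occurred in
module "krtld" due to a NULL pointer dereference"""

fields = parse_bad_trap(sample)
for key, value in fields.items():
    print(key, value)
```

the values come back as strings exactly as logged, so rp and addr still need
to be read as hex by whoever uses them.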

this looks like a software issue, but what may have caused something like this?
stray gamma rays??  dumb luck??  i just don't want my desktop machine crashing
all the time, makes it hard to work (this has only happened once keep in mind,
it may never happen again for all i know)

here's a bit of the crash dump, but look how useless this one is:

panicsys(0xf0258050,0xfbebe958,0xf007d370,0xf5901000,0xfbebee60,0xf024fbb8) + 44
vpanic(0xf007d370,0xfbebe958,0x0,0xf5ff6888,0x0,0x0) + a4
panic(0xf007d370,0x9,0xf024c800,0xfbebea94,0x70,0x126) + 1c
die(0x9,0xf024c9a4,0xf024c800,0x126,0x1,0x126) + b8
trap(0x0,0x1,0xf0000000,0xf0258248,0x0,0x0) + 774

0xf0000000$<regs  
0xf0000000:     psr             pc              npc

data address not found

bah.  this is fun. :)
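
since the dump instructions hinge on picking one argument out of the trap()
frame, here is a rough Python sketch that splits the $c output above into
per-frame argument lists.  frame_args is a made-up helper name, and the sketch
only assumes the "func(arg,arg,...) + offset" layout shown in the trace.  it
does not decide which argument is the register pointer, it just makes the
arguments easy to look at.

```python
import re

# matches lines like "trap(0x0,0x1,...) + 774" from the adb $c output
FRAME = re.compile(r"^(\w+)\(([^)]*)\)")

def frame_args(trace, func):
    """Return the arguments of the first frame named func, or None.

    Hypothetical helper: assumes the $c layout quoted in this post.
    """
    for line in trace.splitlines():
        match = FRAME.match(line.strip())
        if match and match.group(1) == func:
            # arguments are printed as hex with a 0x prefix
            return [int(arg, 16) for arg in match.group(2).split(",") if arg]
    return None

trace = """\
panicsys(0xf0258050,0xfbebe958,0xf007d370,0xf5901000,0xfbebee60,0xf024fbb8) + 44
vpanic(0xf007d370,0xfbebe958,0x0,0xf5ff6888,0x0,0x0) + a4
panic(0xf007d370,0x9,0xf024c800,0xfbebea94,0x70,0x126) + 1c
die(0x9,0xf024c9a4,0xf024c800,0x126,0x1,0x126) + b8
trap(0x0,0x1,0xf0000000,0xf0258248,0x0,0x0) + 774
"""

args = frame_args(trace, "trap")
for position, value in enumerate(args, start=1):
    print(position, hex(value))
```

a frame that is not in the trace comes back as None, which matches the
Resilience dumps where no trap() call shows up at all.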

if anyone has any ideas at all for me to try, or information to get, please
let me know.  we need to get this fixed, and like i said, we have gotten zero
help from our vendors.

thanks,

-brian


