[SunHELP] My E3500 re-booted.. /var/adm/message output is confusing..

Joe Pampel joe at ardsley.com
Thu Aug 8 15:24:52 CDT 2002


Hi -

My E3500 rebooted itself this morning for some reason. The best record of
this I've found so far is in my /var/adm/messages file.
I first noticed it when our accounting system was down despite the machine
being "up". None of the components were running,
so I started trying to figure out who had shut it down, and that's when I
looked at the messages file. I can run prtdiag and it says everything
is hunky-dory: both processors show up, as does all the RAM. Drives look
good, plenty of disk space.
Can anyone make heads or tails out of the messages below?

It had been chugging along for months on end and then suddenly...

Aug  8 09:25:45 M5 SUNW,UltraSPARC-II: [ID 217111 kern.warning] WARNING:
[AFT1] WP event on CPU14, errID 0x00065815.00e88d68
Aug  8 09:25:45 M5     AFSR 0x00000000.00800100<WP> AFAR 0x00000000.00200000
Aug  8 09:25:45 M5     AFSR.PSYND 0x0100(Score 95) AFSR.ETS 0x00 Fault_PC
0x1009421c
Aug  8 09:25:45 M5     UDBH 0x0000 UDBH.ESYND 0x00 UDBL 0x0000 UDBL.ESYND
0x00
Aug  8 09:25:58 M5 SUNW,UltraSPARC-II: [ID 686001 kern.warning] WARNING:
[AFT1] Uncorrectable Memory Error on CPU6 Data access at TL=0, errID
0x00065818.3e7c9e9b
Aug  8 09:25:58 M5     AFSR 0x00000000.80200000<PRIV,UE> AFAR
0x00000001.713ff2e0
Aug  8 09:25:58 M5     AFSR.PSYND 0x0000(Score 05) AFSR.ETS 0x00 Fault_PC
0x10023c14
Aug  8 09:25:58 M5     UDBH 0x0203<UE> UDBH.ESYND 0x03 UDBL 0x0000 UDBL.ESYND
0x00
Aug  8 09:25:58 M5     UDBH Syndrome 0x3 Memory Module Board 7 J3100 J3200
J3300 J3400 J3500 J3600 J3700 J3800
Aug  8 09:25:58 M5 SUNW,UltraSPARC-II: [ID 527159 kern.warning] WARNING:
[AFT1] errID 0x00065818.3e7c9e9b Syndrome 0x3 indicates that this may not be a
memory module problem
Aug  8 09:25:58 M5 SUNW,UltraSPARC-II: [ID 944252 kern.info] [AFT2] errID
0x00065818.3e7c9e9b PA=0x00000001.713ff2e0
Aug  8 09:25:58 M5     E$tag 0x00000000.1ec02e27 E$State: Exclusive E$parity
0x0f
Aug  8 09:25:58 M5 SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data
(0x00): 0x00000310.04fff2a0
Aug  8 09:25:58 M5 SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data
(0x08): 0x00000310.04fff2a0
Aug  8 09:25:58 M5 SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data
(0x10): 0x006001f6.7c3f0000
Aug  8 09:25:58 M5 SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data
(0x18): 0x00000000.00000000
Aug  8 09:25:58 M5 SUNW,UltraSPARC-II: [ID 989652 kern.info] [AFT2] E$Data
(0x20): 0x00000000.00000040 *Bad* PSYND=0xff00
Aug  8 09:25:58 M5 SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data
(0x28): 0x00000000.00000000
Aug  8 09:25:58 M5 SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data
(0x30): 0x0004aa87.03020000
Aug  8 09:25:58 M5 SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data
(0x38): 0x00000000.00000000
Aug  8 09:25:58 M5 SUNW,UltraSPARC-II: [ID 836497 kern.info] [AFT3] errID
0x00065818.3e7c9e9b: cannot schedule clearing of error on page
0x00000001.713fe000; page not in VM system
Aug  8 09:25:58 M5 SUNW,UltraSPARC-II: [ID 706291 kern.info] [AFT3] errID
0x00065818.3e7c9e9b Above Error detected by protected Kernel code
Aug  8 09:25:58 M5     that will try to clear error from system
Aug  8 09:25:59 M5 SUNW,UltraSPARC-II: [ID 318182 kern.warning] WARNING:
[AFT1] Uncorrectable Memory Error on CPU6 Data access at TL=0, errID
0x00065818.414a7ea4
Aug  8 09:25:59 M5     AFSR 0x00000000.80200000<PRIV,UE> AFAR
0x00000001.713ff2e0
Aug  8 09:25:59 M5     AFSR.PSYND 0x0000(Score 05) AFSR.ETS 0x00 Fault_PC
0x10023c14
Aug  8 09:25:59 M5     UDBH 0x0203<UE> UDBH.ESYND 0x03 UDBL 0x0000 UDBL.ESYND
0x00
Aug  8 09:25:59 M5     UDBH Syndrome 0x3 Memory Module Board 7 J3100 J3200
J3300 J3400 J3500 J3600 J3700 J3800
Aug  8 09:25:59 M5 SUNW,UltraSPARC-II: [ID 299860 kern.warning] WARNING:
[AFT1] errID 0x00065818.414a7ea4 Syndrome 0x3 indicates that this may not be a
memory module problem
Aug  8 09:25:59 M5 SUNW,UltraSPARC-II: [ID 234293 kern.info] [AFT2] errID
0x00065818.414a7ea4 PA=0x00000001.713ff2e0
Aug  8 09:25:59 M5     E$tag 0x00000000.1ec02e27 E$State: Exclusive E$parity
0x0f
Aug  8 09:25:59 M5 SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data
(0x00): 0x00000310.04fff2a0
Aug  8 09:25:59 M5 SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data
(0x08): 0x00000310.04fff2a0
Aug  8 09:25:59 M5 SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data
(0x10): 0x006001f6.7c3f0000
Aug  8 09:25:59 M5 SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data
(0x18): 0x00000000.00000000
Aug  8 09:25:59 M5 SUNW,UltraSPARC-II: [ID 989652 kern.info] [AFT2] E$Data
(0x20): 0x00000000.00000040 *Bad* PSYND=0xff00
Aug  8 09:25:59 M5 SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data
(0x28): 0x00000000.00000000
Aug  8 09:25:59 M5 SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data
(0x30): 0x0004aa87.03020000
Aug  8 09:25:59 M5 SUNW,UltraSPARC-II: [ID 359263 kern.info] [AFT2] E$Data
(0x38): 0x00000000.00000000
Aug  8 09:25:59 M5 SUNW,UltraSPARC-II: [ID 529467 kern.info] [AFT3] errID
0x00065818.414a7ea4: cannot schedule clearing of error on page
0x00000001.713fe000; page not in VM system
Aug  8 09:25:59 M5 SUNW,UltraSPARC-II: [ID 935734 kern.info] [AFT3] errID
0x00065818.414a7ea4 Above Error detected by protected Kernel code
Aug  8 09:25:59 M5     that will try to clear error from system
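For anyone reading the AFSR lines above, here is a minimal sketch of how the bracketed flags line up with the register value. The bit positions (31 = PRIV, 23 = WP, 21 = UE) and the low 16 bits as PSYND are inferred from the log output itself, not taken from a processor manual, so treat them as assumptions. The function names are made up for illustration.

```python
# Bit positions inferred from the log above (assumption, not from a manual):
#   0x00000000.00800100<WP>       -> bit 23 set
#   0x00000000.80200000<PRIV,UE>  -> bits 31 and 21 set
AFSR_FLAGS = {31: "PRIV", 23: "WP", 21: "UE"}


def parse_reg(text):
    """Turn the kernel's '0xHHHHHHHH.LLLLLLLL' notation into one integer."""
    # Both halves are 8 hex digits, so dropping the dot yields the 64-bit value.
    return int(text.replace(".", ""), 16)


def decode_afsr(text):
    """Return (flag names, PSYND) for an AFSR value as printed in the log."""
    value = parse_reg(text)
    flags = [
        name
        for bit, name in sorted(AFSR_FLAGS.items(), reverse=True)
        if (value >> bit) & 1
    ]
    psynd = value & 0xFFFF  # assumed: PSYND is the low 16 bits
    return flags, psynd


if __name__ == "__main__":
    for afsr in ("0x00000000.00800100", "0x00000000.80200000"):
        flags, psynd = decode_afsr(afsr)
        print(afsr, flags, hex(psynd))
```

Run against the two AFSR values in the log, this reproduces the `<WP>` and `<PRIV,UE>` annotations and the PSYND values 0x0100 and 0x0000 that the kernel printed, which suggests the inferred layout is at least consistent with these messages.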

...and then it apparently reboots:

Aug  8 09:26:03 M5 hme: [ID 786680 kern.notice] SUNW,hme1 : No response from
Ethernet network : Link down -- cable problem?
Aug  8 09:29:15 M5 genunix: [ID 540533 kern.notice] ^MSunOS Release 5.8
Version Generic_108528-12 64-bit
Aug  8 09:29:15 M5 genunix: [ID 913631 kern.notice] Copyright 1983-2001 Sun
Microsystems, Inc.  All rights reserved.
Aug  8 09:29:15 M5 genunix: [ID 678236 kern.info] Ethernet address =
8:0:20:92:c7:a4
Aug  8 09:29:15 M5 unix: [ID 597320 kern.info] NOTICE: DR Kernel Cage is
DISABLED
Aug  8 09:29:15 M5 unix: [ID 389951 kern.info] mem = 6291456K (0x180000000)
Aug  8 09:29:15 M5 unix: [ID 930857 kern.info] avail mem = 6162849792
Aug  8 09:29:15 M5 rootnex: [ID 466748 kern.info] root nexus = 5-slot Sun
Enterprise E3500
Aug  8 09:29:15 M5 rootnex: [ID 349649 kern.info] sbus0 at root: UPA 0x2 0x0
...
Aug  8 09:29:15 M5 genunix: [ID 936769 kern.info] sbus0 is /sbus at 2,0

etc. etc.
What worries me, looking at this in retrospect, is that "DR Kernel Cage is
DISABLED" message. Is that normal? It sounds bad.
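As a sanity check on the boot messages, the "mem =" and "avail mem =" lines can be cross-checked with a little arithmetic. This is only a sketch, assuming that "K" means 1024 bytes and that the "avail mem" figure is in bytes. The variable names are illustrative.

```python
# Figures copied from the boot messages above.
mem_kib = 6291456          # "mem = 6291456K"
mem_hex = 0x180000000      # "(0x180000000)"
avail_bytes = 6162849792   # "avail mem = 6162849792"

# Assumption: K = 1024 bytes, and avail mem is reported in bytes.
total_bytes = mem_kib * 1024
not_avail_bytes = total_bytes - avail_bytes

print("K figure matches hex figure:", total_bytes == mem_hex)
print("installed:", total_bytes / 2**30, "GiB")
print("not counted as available:", round(not_avail_bytes / 2**20, 1), "MiB")
```

If the two "mem" figures agree and the total matches what is installed, the reboot at least did not come back up with a bank of RAM missing.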

Any advice at all would be welcome. If I can figure something out I'll post it
here. Thanks, Joe

