[SunHELP] massive IO wait with kernel using most of the CPU

Dale Ghent daleg at elemental.org
Mon Mar 4 18:43:26 CST 2002


On Mon, 4 Mar 2002, Michelle Loranger wrote:

| If anyone has any ideas as to how I can track down this mysterious IO
| wait problem, it would help me a ton.  It seems that I have many
| machines on my network experiencing this ridiculous IO wait, but not for
| any processes that I can pin down.  It doesn't appear to be platform
| specific, as I am seeing it on servers as well as workstations.  The only
| common ground I can pin down (but am not allowed to test) is that all
| these machines have Sun Grid Engine and Orcallator running.  Below are the
| results of top:
|
|   PID USERNAME THR PRI NICE  SIZE   RES STATE   TIME    CPU COMMAND
|  5718 root       1  -5    0 5880K 1464K sleep  30:09 12.49% bptm
|  1514 root       1 -25    0 1600K 1464K cpu/1   0:15  6.25% top
|  5717 root       1  34    0 6104K 1872K sleep  95:30  2.00% bpbkar

Let's see... Grid network computing, Orcallator writing logs and stats, a
busy mountd process, indicating some sort of NFS activity, and last but
not least, bpbkar is running, indicating that the server is undergoing
backups using Veritas NetBackup.

So, why are you asking what the cause of this system stress is, again, as
if it isn't clear already? :)

Grid can eat network resources. NFS can eat up both network and disk - if
this particular box is a server... but most definitely, NetBackup is
causing your woes, esp. if all machines suffer these symptoms that you
describe at the same time.

Shouldn't those fancy Orcallator stats be showing you this?
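
If you want to line the suspects up mechanically, here is a minimal Python
sketch that parses the top rows quoted above and ranks them by CPU. The
function names (parse_top, rank_by_cpu) and the NETBACKUP_COMMANDS set are
made up for illustration, and the CPU column is only a rough proxy: it does
not measure IO wait itself, so treat the result as a hint about who is busy,
not as proof.

```python
# Rows copied from the top output quoted earlier in this message.
TOP_OUTPUT = """\
  PID USERNAME THR PRI NICE  SIZE   RES STATE   TIME    CPU COMMAND
 5718 root       1  -5    0 5880K 1464K sleep  30:09 12.49% bptm
 1514 root       1 -25    0 1600K 1464K cpu/1   0:15  6.25% top
 5717 root       1  34    0 6104K 1872K sleep  95:30  2.00% bpbkar
"""

# Hypothetical grouping for this example: the two NetBackup processes
# that appear in the listing above.
NETBACKUP_COMMANDS = {"bptm", "bpbkar"}


def parse_top(text):
    """Return (command, cpu_percent, pid) tuples from top's process table."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        # Skip the header and anything that is not an 11-column process row.
        if len(fields) != 11 or not fields[0].isdigit():
            continue
        cpu_percent = float(fields[9].rstrip("%"))
        rows.append((fields[10], cpu_percent, int(fields[0])))
    return rows


def rank_by_cpu(rows):
    """Sort process rows from highest to lowest CPU percentage."""
    return sorted(rows, key=lambda row: row[1], reverse=True)


ranked = rank_by_cpu(parse_top(TOP_OUTPUT))
netbackup_cpu = sum(cpu for cmd, cpu, _ in ranked if cmd in NETBACKUP_COMMANDS)

for command, cpu, pid in ranked:
    print(f"{command:<10} pid={pid:<6} cpu={cpu:.2f}%")
print(f"NetBackup processes combined: {netbackup_cpu:.2f}%")
```

Running the same parse on snapshots taken from several machines during and
outside the backup window would show whether the NetBackup processes turn up
every time the symptoms do.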

/dale


