[Sunhelp] Could this be a race condition or what?

trebor sreyb tsreyb at my-dejanews.com
Fri May 14 09:31:32 CDT 1999


A third-party application now hangs indefinitely (I have waited up to 12 hours with no sign of it waking up). It
used to run to completion in a matter of seconds.
                   
% uname -a
SunOS flotsam 5.5.1 Generic_103640-24 sun4u sparc SUNW,Ultra-Enterprise 
                   
I do not have access to the source.
                   
I ran truss on the process and found that its last action was a getmsg() on a file descriptor that had been opened on
/dev/ticotsord.
                   
The last lines of the truss output read as follows.
[Note the first and last lines of this truss output: they show that the process is sleeping while waiting for something to come back from ticotsord.]

open("/dev/ticotsord", O_RDWR)                  = 9
...snip...
putmsg(9, 0xEFFFDBB4, 0xEFFFDB18, 0)            = 0
getmsg(9, 0xEFFFDAE8, 0xEFFFDAF4, 0xEFFFDAD4)  = 2
getmsg(9, 0xEFFFDAE8, 0xEFFFDAF4, 0xEFFFDAD4)  = 0
lseek(7, 0, SEEK_SET)                          = 0
fcntl(7, 14, 0xEFFFDBDC)                        = 0
fcntl(7, F_SETLKW, 0xEFFFDC3C)                  = 0
lseek(7, 0, SEEK_SET)                          = 0
read(7, "\007\001 o p t l m\0\0\0".., 486)      = 486
lseek(7, 0, SEEK_SET)                          = 0
write(7, "\007\001 o p t l m\0\0\0".., 486)    = 486
dup(7)                                          = 10
close(10)                                      = 0
lseek(7, 0, SEEK_SET)                          = 0
fcntl(7, F_SETLK, 0xEFFFDBDC)                  = 0
putmsg(9, 0xEFFFDBB4, 0xEFFFDB18, 0)            = 0
putmsg(9, 0xEFFFDC74, 0xEFFFDBD8, 0)            = 0
chmod("/opt/lnms/db/7278.log", 0666)            Err#2 ENOENT 
chmod("/opt/lnms/db/optivity.taf", 0666)        = 0
putmsg(9, 0xEFFFE254, 0xEFFFE1B8, 0)            = 0
getmsg(9, 0xEFFFE188, 0xEFFFE194, 0xEFFFE174) (sleeping...)
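
To illustrate the pattern at the end of the trace (a request is sent, then the process blocks forever waiting for a reply that never arrives), here is a rough Python sketch. It is only an analogy: a local socket pair stands in for the descriptor opened on /dev/ticotsord, sendall()/recv() stand in for putmsg()/getmsg(), and the helper name request_with_timeout is made up for this example. Nothing here is taken from the application itself.

```python
import socket


def request_with_timeout(sock, payload, timeout_s):
    """Send a request and wait up to timeout_s seconds for a reply.

    Returns the reply bytes, or None if nothing arrived in time.
    (Hypothetical helper, for illustration only.)
    """
    sock.sendall(payload)        # plays the role of putmsg() in the trace
    sock.settimeout(timeout_s)   # without a timeout, recv() would block indefinitely
    try:
        return sock.recv(4096)   # plays the role of the getmsg() that sleeps
    except socket.timeout:
        return None


# A connected pair of local endpoints stands in for the local transport.
client, server = socket.socketpair()

# Case 1: the peer never answers, so the wait ends with a timeout
# instead of sleeping forever.
silent = request_with_timeout(client, b"request", 0.2)

# Case 2: a reply is already queued by the peer, so it comes straight back.
server.sendall(b"reply")
answered = request_with_timeout(client, b"request", 0.2)

client.close()
server.close()
```

If the real situation matches this sketch, the thing to find is whichever process is supposed to answer on the other end, since the sleeping getmsg() would only be a symptom of that peer being absent or stuck.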
                   
My questions/thoughts are:
                   
=> Maybe it is some sort of race condition?
                   
=> Maybe some other process has a hold of /dev/ticotsord?
                   
=> Could this be an application error? Or maybe something funky at the system level (e.g., an OS bug or a HW failure)?
                   
=> Need an OS patch?
                   
=> What is /dev/ticotsord? And why would this application want to use it? The app is a Network Management app that uses Raima as its database tool.
                   
Thanks in advance,
-TR

