[SunHELP] for loop changes nohup?
Richard Russell
richard.russell at db.com
Fri Nov 19 12:12:14 CST 2004
Hi sunhelp list,
I am experiencing a problem that has baffled me and my colleagues for
some time now. We've experimented, researched, and thought, but can't find
any documentation that explains what we are seeing. It seems that running
a command within a for loop is quite different from running it directly, at
least with regard to how nohup works. The best way of explaining what I'm
seeing is to show an example. I have two terminals open on the same
machine, both as the same user. Both have PS1="`tty` $ " set, so you can
clearly see which terminal I am typing into.
Note that I've obfuscated a few details, to satisfy my paranoia and the
paranoia of the Risk Management team here:
$USER -- my username
$HOME -- my homedir
$SOMEPATH -- the path to a specific install of the JVM that I use. It's a
stock standard 1.3.1_07 JVM, and isn't incredibly relevant anyway.
For legacy reasons, I have one script that uses nohup to call another
script, which in turn uses nohup to start a simple Java process in the
background (with '&'). The (at least partially) simplified version of
what's happening is:
--------------------------------------
/dev/pts/27 $ more start start2 Sleep.java
::::::::::::::
start
::::::::::::::
#!/bin/sh -x
nohup $HOME/start2
::::::::::::::
start2
::::::::::::::
#!/bin/ksh -x
nohup $SOMEPATH/java/bin/java -cp . Sleep &
::::::::::::::
Sleep.java
::::::::::::::
class Sleep {
public static void main(String[] args) {
while (true) {
try {
Thread.sleep( 1000 );
} catch ( InterruptedException e ) {
System.out.println( "awakened prematurely" );
};
};
};
};
/dev/pts/27 $ javac Sleep.java
/dev/pts/27 $ ls -l start start2 Sleep.java Sleep.class
-rw-r--r-- 1 $USER <group> 552 Nov 19 17:11 Sleep.class
-rw-r--r-- 1 $USER <group> 221 Nov 19 17:07 Sleep.java
-rwxr-xr-x 1 $USER <group> 32 Nov 19 17:07 start
-rwxr-xr-x 1 $USER <group> 250 Nov 19 17:09 start2
--------------------------------------
Never mind my Java skills -- This is a trivial program designed simply to
execute indefinitely without wasting CPU cycles.
You can also see that I'm not running anything but ksh on two sessions:
--------------------------------------
/dev/pts/27 $ /usr/ucb/ps auxww |grep ^$USER
$USER 8393 0.0 0.0 992 720 pts/27 S 17:13:08 0:00 grep ^$USER
$USER 1679 0.0 0.0 1904 1408 pts/21 S 17:10:13 0:00 -ksh
$USER 24835 0.0 0.0 1904 1408 pts/27 S 17:07:03 0:00 -ksh
--------------------------------------
Now, I run ./start on the pts/21 session:
--------------------------------------
/dev/pts/21 $ ./start
+ nohup $HOME/start2
Sending output to nohup.out
/dev/pts/21 $ more nohup.out
+ nohup $SOMEPATH/java/bin/java -cp . Sleep
--------------------------------------
As expected, I can see the process (running on pts/21) from the pts/27
session:
--------------------------------------
/dev/pts/27 $ /usr/ucb/ps auxww |grep ^$USER
$USER 9985 0.1 0.1 30744 11408 pts/21 S 17:13:42 0:00
$SOMEPATH/java/bin/../bin/sparc/native_threads/java -cp . Sleep
$USER 11356 0.0 0.0 992 720 pts/27 S 17:14:10 0:00 grep ^$USER
$USER 24835 0.0 0.0 1904 1408 pts/27 S 17:07:03 0:00 -ksh
$USER 1679 0.0 0.0 1904 1408 pts/21 S 17:10:13 0:00 -ksh
--------------------------------------
When I quit the terminal that ran ./start (pts/21) and re-run the ps
command on the other terminal (pts/27), I see that the ksh has gone but my
java process is still running, with "?" for the tty, as expected:
--------------------------------------
/dev/pts/27 $ /usr/ucb/ps auxww |grep ^$USER
$USER 24835 0.0 0.0 1904 1408 pts/27 S 17:07:03 0:00 -ksh
$USER 13833 0.0 0.0 992 720 pts/27 S 17:15:04 0:00 grep ^$USER
$USER 9985 0.0 0.1 30744 11408 ? S 17:13:42 0:00
$SOMEPATH/java/bin/../bin/sparc/native_threads/java -cp . Sleep
--------------------------------------
OK, so now I kill that process, open another terminal (pts/24 this time),
and we're back where we started from:
--------------------------------------
/dev/pts/27 $ kill 9985
/dev/pts/27 $ /usr/ucb/ps auxww |grep ^$USER
$USER 15246 0.1 0.0 1904 1408 pts/24 S 17:15:55 0:00 -ksh
$USER 24835 0.0 0.0 1904 1408 pts/27 S 17:07:03 0:00 -ksh
$USER 16074 0.0 0.0 992 720 pts/27 S 17:16:12 0:00 grep ^$USER
--------------------------------------
This time, I'll run ./start from a for loop in this new terminal (pts/24):
--------------------------------------
/dev/pts/24 $ for i in 1; do ./start; done
+ nohup $HOME/start2
Sending output to nohup.out
/dev/pts/24 $ more nohup.out
+ nohup $SOMEPATH/java/bin/java -cp . Sleep
+ nohup $SOMEPATH/java/bin/java -cp . Sleep
--------------------------------------
Looks good -- you can see that it's appended the same line to nohup.out.
I'll check the process from the other terminal (pts/27):
--------------------------------------
/dev/pts/27 $ /usr/ucb/ps auxww |grep ^$USER
$USER 17101 0.1 0.1 30736 11400 pts/24 S 17:16:36 0:00
$SOMEPATH/java/bin/../bin/sparc/native_threads/java -cp . Sleep
$USER 17866 0.0 0.0 992 720 pts/27 S 17:17:06 0:00 grep ^$USER
$USER 24835 0.0 0.0 1904 1408 pts/27 S 17:07:03 0:00 -ksh
$USER 15246 0.0 0.0 1904 1408 pts/24 S 17:15:55 0:00 -ksh
--------------------------------------
OK, so now, as before, I exit the terminal (pts/24), and re-check the
process from the other terminal (pts/27):
--------------------------------------
/dev/pts/27 $ /usr/ucb/ps auxww |grep ^$USER
$USER 18934 0.0 0.0 992 720 pts/27 S 17:17:35 0:00 grep ^$USER
$USER 24835 0.0 0.0 1904 1408 pts/27 S 17:07:03 0:00 -ksh
--------------------------------------
Where has it gone?
BTW: I also had the same behaviour with the following perl script instead
of the Java program, but for some reason, can't repeat it any more.
Logically speaking, I must have changed *something*, but don't know what:
--------------------------------------
#!/usr/bin/perl -w
while (1) {
sleep 10;
};
--------------------------------------
I can find workarounds for this situation: I can run java with -Xrs, I
can put a nohup in the for loop, I can avoid the for loop altogether, and
so on. I could even do this without Java. As I mentioned, I have
experienced the same behaviour with a perl script (though I can't reliably
repeat it).
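A minimal sketch of the "nohup in the for loop" workaround, under some loud assumptions: `sleep` stands in for ./start so the fragment is self-contained, `start_detached` is a hypothetical helper name that does not appear in the original scripts, and output is discarded rather than sent to nohup.out. It assumes a POSIX-style shell such as ksh and has not been checked on Solaris 8.

```shell
#!/bin/sh
# start_detached cmd [args...]: hypothetical helper, not one of the
# original scripts. Starts cmd under nohup in the background and prints
# its PID. All three standard streams are detached from the terminal.
start_detached() {
    nohup "$@" >/dev/null 2>&1 </dev/null &
    echo $!
}

for i in 1; do
    # nohup is applied inside the loop body; sleep stands in for ./start
    pid=$(start_detached sleep 5)
    echo "started $pid"
done

# kill -0 sends no signal; it only tests that the process still exists
if kill -0 "$pid" 2>/dev/null; then
    echo "still running: $pid"
    kill "$pid"    # clean up the stand-in process
fi
```

This only shows the shape of the workaround. It does not explain why the loop makes a difference in the first place.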
However, what's driving me insane is not the fact that the process dies,
per se, but that there seems to be *something* different about running
commands from within a for loop (or a while loop), compared to running
them separately, and I don't know what that is. I had assumed that apart
from a few CPU cycles of overhead, the following four commands should be
completely identical:
./start
if true; then ./start; fi
for i in one; do ./start; done
while true; do ./start; break; done
The if construct does appear to be identical to the command on its own,
whereas the for and while constructs seem to have the effect of negating
the nohup. If anyone can explain this to me, I shall buy them a vBeer --
it's been bugging me and my cohorts for quite literally months.
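One way to narrow this down might be to check whether the "ignore SIGHUP" disposition set by nohup actually reaches a child under each of the four constructs. The sketch below relies on the POSIX rule that a non-interactive shell cannot trap a signal that was already ignored when it started. `hup_state` and `probe` are hypothetical names introduced here, and the behaviour of the Solaris 8 /bin/sh and ksh88 in this respect is assumed, not verified.

```shell
#!/bin/sh
# Probe run in a child shell. If SIGHUP was ignored on entry, the trap
# cannot be installed, the kill is ignored, and "ignored" is printed.
# Otherwise the trap fires and prints "default".
probe='trap "echo default; exit 0" HUP; kill -HUP $$; echo ignored'

# hup_state [wrapper...]: hypothetical helper. Runs the probe through an
# optional wrapper (for example nohup) and prints "ignored" or "default".
hup_state() {
    # "| cat" keeps stdout a pipe, so nohup does not divert it to nohup.out
    "$@" sh -c "$probe" 2>/dev/null | cat
}

# The same four constructs as above, each launching the probe under nohup
echo "direct: $(hup_state nohup)"
if true; then echo "if:     $(hup_state nohup)"; fi
for i in one; do echo "for:    $(hup_state nohup)"; done
while true; do echo "while:  $(hup_state nohup)"; break; done
```

The probe only reports the disposition a child inherits at startup. It says nothing about a program that installs its own SIGHUP handler afterwards, so four "ignored" lines would suggest looking elsewhere for the difference.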
For information, I'm running this on:
----
/dev/pts/27 $ uname -X
System = SunOS
Node = <servername>
Release = 5.8
KernelID = Generic_117350-08
Machine = sun4u
<snip>
NumCPU = 12
----
ksh Version M-11/16/88i
----
/dev/pts/27 $ which nohup
/usr/bin/nohup
/dev/pts/27 $ sh
$ which nohup
/usr/bin/nohup
----
/dev/pts/27 $ which javac
/usr/bin/javac
/dev/pts/27 $ which java
/usr/bin/java
/dev/pts/27 $ java -version
java version "1.2.2"
Solaris VM (build Solaris_JDK_1.2.2_10, native threads, sunwjit)
/dev/pts/27 $ $SOMEPATH/java/bin/java -version
java version "1.3.1_07"
Java(TM) 2 Runtime Environment, Standard Edition (build 1.3.1_07-b02)
Java HotSpot(TM) Client VM (build 1.3.1_07-b02, mixed mode)
----
(Yes, I know that I compiled and executed with two different versions of
Java, but I am 99% confident that this is caused by the behaviour of ksh
or sh, not by the behaviour of Java.)
Cheers all.
Richard Russell
Deutsche Bank AG London
Global Markets Customer Solutions
Office: +44 (0)20 7545 8060
Mobile: +44 (0)79 0661 2237