can't kill some processes under SunOS

Kevin Thompson kthompso at ptolemy.arc.nasa.gov
Wed Nov 22 05:47:32 AEST 1989


[SunOS 4.0.1, 3/60, usually tcsh but have tried with csh and sh also]

Under what conditions does kill -9 not kill a process?  I have some processes
that are completely un-killable.  I've tried /bin/kill and the built-in kill
with just about every accepted signal, including -9 and -TERM, with no effect
at all.  I don't want to reboot (though I'm coming close).  And yes, I RTFM'ed;
there is no indication that -9 should ever fail.

If it matters, I started these processes with the shell script:

========================================
#!/bin/csh
# Usage: labother machine file1 [file2] ...

set machine = $argv[1]
shift
rsh $machine -n "(cd $cwd; labtests $argv)"
========================================

Where labtests is another shell script, whose meat is a command of the form:

echo '(test-lab "'$file'")' | nice labyrinth -batch > /dev/null

where labyrinth is a Lisp dump.  My guesses (and no, I'm no Unix wizard):

   -- it could have something to do with the way I call rsh.  I've since put
      in I/O redirections to /dev/null, but it's too early to tell whether
      that's more robust, since this problem is sporadic.

   -- something to do with the Franz Lisp dump???  Dubious, since I can't
      kill the csh process either.

   -- something to do with the process being nice'd.

   -- something to do with the process having its output sent to /dev/null.

   -- something weird about our network set-up; our support group is 'in
      transition'.
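
For reference, a redirected variant of labother might look roughly like the
sketch below.  It is not the exact script in use: it is written for sh rather
than csh, build_remote_cmd is a helper name made up for illustration, and the
remote ">&" redirection assumes the login shell on the other machine is csh
(rsh hands the command string to that shell).

```shell
#!/bin/sh
# Sketch only: a Bourne-shell take on labother with all I/O discarded.
# Usage: labother machine file1 [file2] ...

# Build the command line to run on the remote side.
# ">&" is csh syntax for "stdout and stderr together"; this ASSUMES the
# remote login shell is csh.
build_remote_cmd() {
    dir=$1
    shift
    printf '%s\n' "(cd $dir; labtests $*) >& /dev/null"
}

labother() {
    machine=$1
    shift
    here=`pwd`
    cmd=`build_remote_cmd "$here" "$@"`
    # -n takes rsh's stdin from /dev/null; the local redirections
    # discard anything that still comes back over the connection.
    rsh "$machine" -n "$cmd" > /dev/null 2>&1
}
```

It would be called the same way as the original, e.g. "labother machine file1
file2", after the two functions have been defined.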

Ok, I'm grasping.  At any rate, ps -ux now shows this:

========================================
USER       PID %CPU %MEM   SZ  RSS TT STAT START  TIME COMMAND
kthompso 11847 16.6  5.8   64  440 p2 S    10:39   0:00 /bin/csh -c ps -ux
kthompso 11555  0.0  3.1   64  232 ?  D    19:41   0:00 csh /usr/kthompso
kthompso 11568  0.0  1.9   64  144 ?  D    19:56   0:00 csh /usr/kthompso
kthompso 11580  0.0  1.9   64  144 ?  D    19:56   0:00 csh /usr/kthompso
kthompso 11579  0.0  0.0    0    0 ?  Z    Nov  7  0:00 <defunct>
kthompso 11854  0.0  5.5  136  416 p2 R    10:39   0:00 ps -ux
kthompso 11554  0.0  0.0    0    0 ?  Z    Nov  7  0:00 <defunct>
kthompso 11845  0.0 12.1  424  912 p2 S    10:33   0:10 emacs
kthompso 11536  0.0  0.0    0    0 ?  Z    Nov  7  0:00 <defunct>
kthompso 11561  0.0  0.0    0    0 ?  Z    Nov  7  0:00 <defunct>
kthompso 11543  0.0  3.1   64  232 ?  D    19:41   0:00 csh /usr/kthompso
kthompso 11797  0.0  3.6  128  272 p2 I    09:46   0:04 -tcsh (tcsh)
========================================

Processes 11555, 11568, 11580 and 11543 are completely un-killable.  The one
feature they have in common is 'STAT=D' (disk wait), so maybe that's the
problem.  I had no problem killing the 'in.rshd' process.
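
To pull just the 'D' processes out of a listing like the one above, a small
filter along these lines should do.  This is only a sketch: list_d_state is a
name made up here, and it assumes ps prints a header line containing PID and
STAT columns, and that the fields before STAT contain no embedded blanks, as
in the output shown.

```shell
#!/bin/sh
# Sketch only: print the PIDs of processes whose STAT field starts with D.
# Reads ps output (header line included) on standard input.
list_d_state() {
    awk '
        # Header line: remember which fields hold PID and STAT.
        NR == 1 {
            for (i = 1; i <= NF; i++) {
                if ($i == "PID")  pidcol  = i
                if ($i == "STAT") statcol = i
            }
            next
        }
        # Data lines: print the PID when the state begins with D.
        pidcol && statcol && $statcol ~ /^D/ { print $pidcol }
    '
}
```

Usage would be something like "ps -ux | list_d_state".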

Sorry for the length; I was trying to be complete.  Posted or mailed replies
are equally welcome.  I hope this isn't a common one, but I've never seen any
discussion of it.

Kevin Thompson
-- 
kthompso at ptolemy.arc.nasa.gov     Sterling Software/Nasa-Ames Research Center


