tracing system calls (long)

Sun Sep 4 17:55:41 AEST 1988

In article <66624 at sun.uucp> brent%terra at Sun.COM (Brent Callaghan) writes:
>In article <2040 at cuuxb.ATT.COM>, dlm at cuuxb.ATT.COM (Dennis L. Mumaugh) writes:
>> 1).  We have a system call trace program that reports on each and
>> every system call a process makes -- useful for support to figure
>> out what a program REALLY is doing.

[ Dennis was referring to AT&T's as yet unreleased truss(1) command. ]

>And how!  Actually any SunOS 4.0 user can do that now with
>the trace(1) command.

[ output of 'trace date' omitted ]

Great minds run in the same paths, with some variations.
AT&T's truss(1) command was developed without any knowledge of Sun's trace(1)
command's actual or planned existence.  I presume the reverse is also true.

I wish to point out some of the ways in which AT&T's truss(1) is superior to
Sun's trace(1).  I do not especially want to put Sun down (though some could
read it that way), only to indicate some shortcomings in what they have done
and to impress on your minds some of the delicacy of debugger interfaces.

Sun does get credit for being first; trace(1) is already available on SunOS
4.0 while truss(1) is planned for a future release of System V from AT&T.
And no, it is not available in SVR3.2 as of now; it does exist as an add-on
package for SVR3.1 and SVR3.2 on the 3B2, but it has not yet been released
to the outside world.  Complain to AT&T, not to me.

First and foremost, it must be observed that trace(1) is based on Sun's
enhanced ptrace(2) system call while truss(1) is based on AT&T's proc(4)
process filesystem, invented by Tom Killian of Bell Labs research and
extended and implemented for System V by Ron Gomes, with significant
input from me.  The deficiencies in trace(1) are largely due to the
deficiencies in ptrace(2) as compared to proc(4).

Ron Gomes did proc(4), I did truss(1); the credit (or blame) goes to us.

1. truss(1) can follow children created by fork(2).  You can trace a shell
   script of arbitrary complexity.  My favorite is spell(1), which runs
   an 8-member pipeline.  trace(1) can't do this because the ptrace(2)ed
   condition is not inherited; proc(4) tracing flags can be inherited.

2. Both trace(1) and truss(1) can grab existing processes.  However,
   truss(1) will grab an arbitrary number while trace(1) will grab only
   one.  Also, there is a bug in ptrace(2):  If a process terminates
   while being traced, its termination status is delivered (via wait(2))
   to the controlling process, not to the process's parent.  If a process
   is grabbed by trace(1) and then dies on a signal, the process's parent
   is not informed of the termination; to it, the process just vanished.
   (Terminating via exit(2) works OK because trace(1) lets go in time.)

3. truss(1) allows you to specify which system calls you wish to trace
   or exclude.  trace(1) traces all syscalls regardless.  proc(4) accepts
   a bit-mask to specify which syscalls to stop on; ptrace(2) stops on
   all syscalls.  Untraced syscalls incur no overhead with proc(4).

4. truss(1) does symbolic interpretation of syscall arguments, using
   #define names from relevant system header files.  trace(1) shows
   arguments only in decimal, octal, or hexadecimal.  truss(1) has
   an option to turn off symbolic interpretation, for unredeemed
   hackers like me who must see the raw bits to be happy.

5. truss(1) (verbose option) shows the contents of structures passed by
   address to specified system calls.  The contents are shown on output;
   values passed back from the operating system (like the stat structure
   from stat(2)) are displayed properly.  trace(1) doesn't do this.

6. truss(1) shows all characters of any filename argument; trace(1)
   shows only the first 32.  This is related to the next item.

7. trace(1) uses a heuristic based on the number of printable characters
   in the first 32 bytes of the I/O buffer for a read(2) or write(2) to
   decide whether or not to print the first 32 bytes of the buffer as a
   string (ambiguously, since '\' may or may not be an actual character
   in the I/O buffer).  truss(1) always prints the first 16 bytes in an
   unambiguous format.  Also, truss(1) accepts an option to print the
   entire contents of the I/O buffer for read()s or write()s on specified
   file descriptors.  This feature came only after I had an opportunity
   to play directly with trace(1); kudos to Sun, this is very useful.

8. truss(1) optionally prints the argument and environment strings passed
   in each exec(2) system call.  trace(1) could do this too, but it is
   useful mostly when following children, which trace(1) can't do.

9. Both truss(1) and trace(1) accept an option to count system calls rather
   than showing them line-by-line.  truss(1) only counts those syscalls which
   are being traced; child process syscalls may be included in the counts.

10.truss(1) reports sleeping system calls as "sleeping ..." if they remain
   asleep for more than 2 seconds.  trace(1) can't do this because of the
   ptrace(2) interface.

11.Both truss(1) and trace(1) report the receipt of signals.  Neither
   reports a signal that has been sent but is still blocked.  truss(1),
   by virtue of the proc(4) interface, reports any machine fault which
   the process incurs when it is incurred, even if the associated signal
   is blocked; trace(1) cannot do this with ptrace(2).

12.truss(1) accepts options to trace or exclude specified signals or
   machine faults.  proc(4) accepts a bit-mask of signals or faults
   to stop upon; ptrace(2) stops on all signals but no faults.

13.When truss(1) encounters an exec(2) of a set-uid or set-gid object
   (a process tracing security violation), proc(4) forces it to give up
   and allow the process to continue unmolested.  When trace(1) encounters
   such an exec(2), ptrace(2) silently ignores the set-uid or set-gid
   bit and trace(1) continues to trace the process.  The process
   will eventually fail because it doesn't have correct permissions.
   The proc(4) interface does a proper job of enforcing security without
   changing process behavior; ptrace(2) just botches it (and always has).
   If truss(1) is run as super-user, set-uid and set-gid processes can
   be traced with no problem.  Running trace(1) as super-user helps somewhat,
   but the problem remains for grabbed processes not owned by the super-user.

14.The ptrace(2) mechanism is intimately intertwined with the signal
   mechanism.  In particular, stopping on syscalls involves sending
   SIGTRAP.  If a process uses SIGTRAP for interprocess communication
   (I would call such a process terminally brain-damaged, but nothing in
   the system prevents such things), it will fail when trace(1) is applied
   to it.  The proc(4) mechanism is independent of the signal mechanism and
   does not suffer from this sort of problem.  A program using proc(4) can
   choose to trace signals or not.  A signal is just one of the events a
   process can stop on; the others are machine faults and syscalls.  A
   process can be stopped without sending SIGSTOP.  Provisions exist for
   cooperating with job-control stop/start signals and ptrace(2) as well.

15.ptrace(2) causes a traced process to die when its controlling process
   dies.  If a process is grabbed by trace(1) and trace(1) is killed with
   'kill -9', then the traced process also dies.  trace(1) catches all
   other signals in order to let go of the traced process before exiting.
   truss(1) doesn't have this problem; when it is killed with 'kill -9',
   the traced process continues unmolested.

16.There is a serious bug in SunOS 4.0 involving the interaction of
   job-control stop signals (SIGSTOP and its relatives) and ptrace(2).
   If a process is stopped by sending it a job-control stop signal and
   trace(1) is applied to it while it is so stopped, then trace(1) hangs
   and becomes unkillable, even with 'kill -9'.  The whole ptrace(2)
   mechanism is then locked out and any instance of dbx also becomes
   hung and unkillable.  The only recourse is a reboot.
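
The bit-mask of item 3 is easy to picture in code.  What follows is only a
minimal sketch in C, not the proc(4) interface: the type syscall_set, the
functions set_clear_all, set_fill_all, set_add, set_del and set_has, the
bound MAX_SYSCALL and the EX_* call numbers are all hypothetical names
chosen for illustration.

```c
#include <assert.h>
#include <limits.h>
#include <string.h>

/* Hypothetical names throughout; this is NOT the real proc(4) interface. */
#define MAX_SYSCALL 256                              /* assumed upper bound */
#define WORD_BITS   (sizeof(unsigned long) * CHAR_BIT)
#define SET_WORDS   ((MAX_SYSCALL + WORD_BITS - 1) / WORD_BITS)

/* Illustrative syscall numbers only; real values are system-specific. */
enum { EX_READ = 3, EX_WRITE = 4, EX_OPEN = 5 };

/* One bit per syscall number: bit set means "stop on this call". */
typedef struct {
    unsigned long word[SET_WORDS];
} syscall_set;

/* Empty the set: stop on no syscalls. */
void set_clear_all(syscall_set *s)
{
    memset(s, 0, sizeof *s);
}

/* Fill the set: stop on every syscall. */
void set_fill_all(syscall_set *s)
{
    memset(s, 0xff, sizeof *s);
}

/* Add one syscall to the set. */
void set_add(syscall_set *s, unsigned call)
{
    assert(call < MAX_SYSCALL);
    s->word[call / WORD_BITS] |= 1UL << (call % WORD_BITS);
}

/* Remove one syscall from the set. */
void set_del(syscall_set *s, unsigned call)
{
    assert(call < MAX_SYSCALL);
    s->word[call / WORD_BITS] &= ~(1UL << (call % WORD_BITS));
}

/* Return 1 if the syscall is in the set, 0 otherwise. */
int set_has(const syscall_set *s, unsigned call)
{
    assert(call < MAX_SYSCALL);
    return (int)((s->word[call / WORD_BITS] >> (call % WORD_BITS)) & 1UL);
}
```

To ask for "everything except read", a tracer built this way would call
set_fill_all() and then set_del() with EX_READ, and hand the finished mask
to the kernel once.  A call whose bit is clear never stops the process,
which is the sense in which untraced syscalls incur no overhead.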

				Roger A. Faulkner
				allegra!raf