1-second resolution of process accounting times

Jeff Stearns jeff at fluke.UUCP
Fri Mar 29 07:11:27 AEST 1985


(The following applies to 4.2BSD on a VAX....)

Do you ever run lastcomm(1) or sa(8) and wonder about all those processes which
consumed zero seconds of CPU time?  For example:

printenv         tod      tty22       0 secs Thu Mar 28 10:26
stty             klis     tty25       0 secs Thu Mar 28 10:26
sh               gms      tty17       0 secs Thu Mar 28 10:26
more             gms      tty17       0 secs Thu Mar 28 10:26
ls               don      tty11       0 secs Thu Mar 28 10:26
sendmail    F    root     __          0 secs Thu Mar 28 10:25
csh              joe      ttyp1       2 secs Thu Mar 28 10:25
whereis          joe      ttyp1       0 secs Thu Mar 28 10:25
mail       S     daemon   __          0 secs Thu Mar 28 10:25
mail             owens    tty08       2 secs Thu Mar 28 10:12
comsat      F    root     __          0 secs Thu Mar 28 10:25

C'mon now, even the simplest command takes *some* time:
	% time date
	Thu Mar 28 11:18:35 PST 1985
	0.2u 0.3s 0:01 62% 120+24k 1+3io 4pf+0w

So what's the story here?  Why do I get such crummy resolution from lastcomm?

Well, it turns out that the accounting file /usr/adm/acct comprises a series of
records, one per process.  A record contains various interesting data about
the process, including user and system CPU time.  These times are stored in a
cute little 16-bit floating point format with a dynamic range from 0 to 4.58E6
seconds.  (Luckily, I don't run that many processes which consume more than 53
CPU days; 4.58E6 seconds works out to about 53 days.)
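
For readers who want to poke at the format, here is a minimal sketch of how
such a 16-bit value could be packed and unpacked.  The 13-bit mantissa and
3-bit exponent come from the description in this article; the assumption that
each exponent step scales the mantissa by 8, and the function names, are
illustrative only and are not taken from the kernel source.

```python
MANTISSA_BITS = 13   # from the article
EXPONENT_BITS = 3    # from the article
EXP_STEP_SHIFT = 3   # ASSUMPTION: each exponent step multiplies by 8 (2**3)


def compress(seconds):
    """Pack whole seconds into 16 bits: exponent in the top 3 bits,
    mantissa in the low 13 bits.  Hypothetical helper, not kernel code."""
    if seconds < 0:
        raise ValueError("time must be non-negative")
    mantissa = int(seconds)
    exponent = 0
    # Shift the value down until it fits in the mantissa field.
    # This sketch simply drops the low bits at each step.
    while mantissa >= (1 << MANTISSA_BITS):
        mantissa >>= EXP_STEP_SHIFT
        exponent += 1
    if exponent >= (1 << EXPONENT_BITS):
        raise OverflowError("value does not fit in the exponent field")
    return (exponent << MANTISSA_BITS) | mantissa


def expand(packed):
    """Recover the (approximate) number of seconds from a packed value."""
    mantissa = packed & ((1 << MANTISSA_BITS) - 1)
    exponent = packed >> MANTISSA_BITS
    return mantissa << (EXP_STEP_SHIFT * exponent)


if __name__ == "__main__":
    for secs in (0, 5, 8191, 8192, 100000):
        packed = compress(secs)
        print(secs, hex(packed), expand(packed))
```

Small values survive the round trip exactly; large ones lose their low-order
bits, which is harmless.  The trouble is at the other end of the scale, since
the smallest nonzero value this layout can hold is one whole second.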

But the time is recorded in *seconds*, and is TRUNCATED by the kernel (rather
than rounded) when it is written.

So most times recorded in the accounting file are wildly in error.  A check of
yesterday's accounting data shows that 10,000 out of the total 12,000 processes
were recorded as using zero time!  I know that DEC says a VAX is fast, but...

This makes the output of sa(8) very untrustworthy.
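
To make the effect of truncation concrete, here is a small sketch.  The first
two sample values are the user and system times from the `time date` run
above (0.2 and 0.3 seconds); the other two are made up for illustration, and
the helper names are hypothetical.

```python
def truncate_seconds(cpu_seconds):
    """Drop the fractional part, as the article says the kernel does."""
    return int(cpu_seconds)


def round_seconds(cpu_seconds):
    """Round half up to the nearest whole second, for comparison."""
    return int(cpu_seconds + 0.5)


# 0.2 and 0.3 come from the `time date` output; 0.7 and 1.9 are made up.
sample = [0.2, 0.3, 0.7, 1.9]

truncated = [truncate_seconds(t) for t in sample]
rounded = [round_seconds(t) for t in sample]

# Fraction of the samples that end up recorded as zero seconds.
zero_fraction = truncated.count(0) / len(truncated)

if __name__ == "__main__":
    print("truncated:", truncated)   # [0, 0, 0, 1]
    print("rounded:  ", rounded)     # [0, 0, 1, 2]
    print("recorded as zero:", zero_fraction)
```

Note that rounding alone would not rescue the `date` example: 0.2 and 0.3
seconds still come out as zero.  The unit itself is too coarse.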

I would like to see the CPU time data recorded in a form which resolves to
milliseconds at the low end of the scale.  It seems to me that the designer
of the current scheme went overboard with 13 bits in the mantissa and cut
himself short on the exponent (only 3 bits).  How about using a few more bits of
exponent, and recording the time in milliseconds?  This would still give us a
couple of decimal digits of precision - and values which are meaningful for
those other 10,000 processes which slipped under the rug.
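
As a rough sketch of what I have in mind, here is one possible layout.  The
particular split (5 bits of binary exponent, 11 bits of mantissa, time in
milliseconds) is only an assumption picked for illustration; the names are
hypothetical and nothing here is an existing kernel interface.

```python
EXP_BITS = 5     # ASSUMPTION: "a few more bits" of exponent, power-of-2 steps
MANT_BITS = 11   # ASSUMPTION: what is left of the 16 bits for the mantissa


def pack_ms(milliseconds):
    """Pack a millisecond count into 16 bits (hypothetical layout)."""
    if milliseconds < 0:
        raise ValueError("time must be non-negative")
    mantissa = int(milliseconds)
    exponent = 0
    # Halve the value until it fits in the mantissa field.
    while mantissa >= (1 << MANT_BITS):
        mantissa >>= 1
        exponent += 1
    if exponent >= (1 << EXP_BITS):
        raise OverflowError("value does not fit in the exponent field")
    return (exponent << MANT_BITS) | mantissa


def unpack_ms(packed):
    """Recover the (approximate) millisecond count."""
    mantissa = packed & ((1 << MANT_BITS) - 1)
    exponent = packed >> MANT_BITS
    return mantissa << exponent


if __name__ == "__main__":
    for ms in (200, 300, 500, 3_600_000):
        packed = pack_ms(ms)
        print(ms, hex(packed), unpack_ms(packed))
```

With this split, the 0.2 and 0.3 second figures from the `date` example are
stored exactly as 200 and 300 ms, and an hour of CPU time comes back within
about one part in a thousand.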
-- 
	Jeff Stearns       (206) 356-5064
	John Fluke Mfg. Co.
	P.O. Box C9090  Everett WA  98043  
	{uw-beaver,decvax!microsof,ucbvax!lbl-csam,allegra,ssc-vax}!fluke!jeff


