1-second resolution of process accounting times

Pete Cottrell pete at umcp-cs.UUCP
Wed Apr 3 16:32:15 AEST 1985


>>>Subject: 1-second resolution of process accounting times

>>>(The following applies to 4.2BSD on a VAX....)

>>>Well, it turns out that the accounting file /usr/adm/acct comprises a series
>>>of records, one per process.  A record contains various interesting data 
>>>about the process, including user and system CPU time.  These times are 
>>>stored in a cute little 16-bit floating point format with a dynamic range 
>>>from 0 to 4.58E6 seconds.  (Luckily, I don't run that many processes which 
>>>consume more than 5.3 CPU days.)
>>>But the time is recorded in *seconds*, and is TRUNCATED by the kernel (rather
>>>than rounded) when it is written.

The range of the floating point format is a lot higher. See below.

>>>So most times recorded in the accounting file are wildly in error.  A check 
>>>of yesterday's accounting data shows that 10,000 out of the total 12,000 
>>>processes were recorded as using zero time!  I know that DEC says a VAX is 
>>>fast, but...

Yeah, I've found that about 87% to 93% of all commands show 0 seconds for
either system or user CPU time, and about 75% to 85% of all commands have NO
time reported at all. These numbers are derived from our systems, which are an
11/780 and an 11/750. I've also found that only about 75% to 80% of the CPU
time actually used gets reported when it is recorded in whole seconds, as
described above.

>>>I would like to see the CPU time data recorded in a form which resolves to
>>>milliseconds at the low end of the scale.  It seems to me that the designer
>>>of the current scheme went overboard with 13 bits in the mantissa and cut
>>>himself short on exponent (only 3 bits).  How about using a few more bits of
>>>exponent, and recording the time in milliseconds?  This would still give us a
>>>couple of decimal digits of precision - and values which are meaningful for
>>>those other 10,000 processes which slipped under the rug.

The present format should work fine; the 13-bit mantissa holds values up to
8191, and the 3-bit exponent lets you left-shift this 3 places up to 7 times.
So, the largest number representable (8191 * 8^7, about 1.7E10) is larger than
anything that fits in the long variable that's handed to the compress routine,
and even in milliseconds, a long covers a long time (quick calculation yields
nearly 25 days for a signed 32-bit long, or over 49 days if it were unsigned).
Make the following change to kern_acct.c and change your accounting programs,
and you're in business.

93,94c93,96
< 	ap->ac_utime = compress((long)u.u_ru.ru_utime.tv_sec);
< 	ap->ac_stime = compress((long)u.u_ru.ru_stime.tv_sec);
---
> 	ap->ac_utime = compress((long)(u.u_ru.ru_utime.tv_sec * 1000 +
> 				       (u.u_ru.ru_utime.tv_usec / 1000)));
> 	ap->ac_stime = compress((long)(u.u_ru.ru_stime.tv_sec * 1000 +
> 				       (u.u_ru.ru_stime.tv_usec / 1000)));
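
For anyone who wants to check the range claim or needs a starting point for the
accounting-program side, here is a minimal sketch of an encoder and decoder for
a 16-bit value with a 13-bit mantissa and a 3-bit base-8 exponent. It follows
the layout described above rather than quoting the kernel source, and the names
comp16, comp_encode and comp_decode are made up for illustration. It also
assumes the mantissa sits in the low 13 bits and that the encoder rounds to
nearest and saturates on overflow.

```c
#include <assert.h>

/* Assumed layout: low 13 bits = mantissa, high 3 bits = base-8 exponent. */
#define MANT_BITS  13
#define MANT_LIMIT (1L << MANT_BITS)   /* 8192: first value that needs a shift */
#define EXP_MAX    7                   /* largest value of a 3-bit exponent */

typedef unsigned short comp16;         /* hypothetical name for the 16-bit field */

/* Encode a non-negative count (e.g. milliseconds) into the 16-bit format. */
comp16 comp_encode(long t)
{
    int exp = 0;
    int round = 0;

    assert(t >= 0);
    while (t >= MANT_LIMIT) {
        round = (int)(t & 04);         /* top bit of the 3 bits being dropped */
        t >>= 3;
        exp++;
    }
    if (round) {                       /* round to nearest instead of truncating */
        t++;
        if (t >= MANT_LIMIT) {         /* rounding carried out of the mantissa */
            t >>= 3;
            exp++;
        }
    }
    if (exp > EXP_MAX)                 /* only reachable when long is wider than 32 bits */
        return (comp16)0xFFFF;         /* saturate at the largest encodable value */
    return (comp16)((exp << MANT_BITS) + t);
}

/* Decode back to a plain count; the result can exceed 32 bits. */
unsigned long long comp_decode(comp16 c)
{
    unsigned long long mant = c & (MANT_LIMIT - 1);
    int exp = (c >> MANT_BITS) & EXP_MAX;

    return mant << (3 * exp);
}
```

With milliseconds as the unit, anything up to 8191 ms comes back exactly, and
larger values keep 13 bits of precision. The largest encodable value,
comp_decode(0xFFFF), is 8191 * 8^7 = 17177772032 ms, or roughly 199 days, so
the long passed in is the limiting factor, not the format. An accounting
program would divide the decoded value by 1000.0 to get seconds.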
-- 
Call-Me:   Pete Cottrell, Univ. of Md. Comp. Sci. Dept.
UUCP:	   {seismo,allegra,brl-bmd}!umcp-cs!pete
CSNet:	   pete at umcp-cs
ARPA:	   pete at maryland



More information about the Comp.unix.wizards mailing list