Bit counting

Rich Neitzel thor at stout.ucar.edu
Thu Jan 12 02:38:10 AEST 1989


As a followup to my posting of the "pair addition" algorithm for bit
counting, I received the following mail message from dunc at Sun.COM
(duncs home):

>That bitcounting algorithm you posted is really pretty; it's a technique I'll
>keep in mind, as it's no doubt adaptable to other things.  It turns out that
>on my machine the 8 bit table scheme does win, but not by much:

>int alternate(word) /* assumes char's are 8 bits and int's are 32 bits */
>    int word;
>{
>    static char bits[] = {
>        0,1,1,2,1,2,2,3,1,2,2,3,2,3,3,4,1,2,2,3,2,3,3,4,2,3,3,4,3,4,4,5,
>        1,2,2,3,2,3,3,4,2,3,3,4,3,4,4,5,2,3,3,4,3,4,4,5,3,4,4,5,4,5,5,6,
>        1,2,2,3,2,3,3,4,2,3,3,4,3,4,4,5,2,3,3,4,3,4,4,5,3,4,4,5,4,5,5,6,
>        2,3,3,4,3,4,4,5,3,4,4,5,4,5,5,6,3,4,4,5,4,5,5,6,4,5,5,6,5,6,6,7,
>        1,2,2,3,2,3,3,4,2,3,3,4,3,4,4,5,2,3,3,4,3,4,4,5,3,4,4,5,4,5,5,6,
>        2,3,3,4,3,4,4,5,3,4,4,5,4,5,5,6,3,4,4,5,4,5,5,6,4,5,5,6,5,6,6,7,
>        2,3,3,4,3,4,4,5,3,4,4,5,4,5,5,6,3,4,4,5,4,5,5,6,4,5,5,6,5,6,6,7,
>        3,4,4,5,4,5,5,6,4,5,5,6,5,6,6,7,4,5,5,6,5,6,6,7,5,6,6,7,6,7,7,8
>    };
>    register int result;
>    register unsigned char *p = (unsigned char *)&word;
>    result = bits[*p++];
>    result += bits[*p++];
>    result += bits[*p++];
>    result += bits[*p];
>}
>
>index  %time    self descendents  called+self    name           index
>[5]     14.2    2.04        0.00  100000         _l_bitcnt [5]
>[6]     12.0    1.72        0.00  100000         _alternate [6]

I tested this on my VxWorks system, after making the following
changes:

	add if (!word)
		return(0);

	add return(result);
This made it perform like the pair routine and gave back the result. I
tried both a 16 and 32 bit version. I would have tested them under RSX
but I don't have a PDP anymore. The results are:

	Shift	Nibble lut	Byte lut	Pair add.
RSX	12.3	7.4		n/a		4.6
VxWorks1 25	12		8		8
VxWorks2 51	25		11		11

As before RSX times are seconds/65k iterations, while VxWorks times
are microseconds/iteration. The times marked 1 are for 16 bit words
and times marked 2 are for 32 bit words. 

I too find no real difference between the two algorithms, so I stand
corrected. For general purpose use, I would probably recommend the
byte lut approach. However, the pair addition method was originally
for use on small memory machines and embedded systems with limited
memory. In that case the difference in memory size may weigh heavily
in favor of the pair addition method. On my VxWorks system the memory
use was:
	byte lut - 372 bytes
	pair addition - 144 bytes
both for 32 bit words.

-------------------------------------------------------------------------------

			Richard Neitzel
			National Center For Atmospheric Research
			Box 3000
			Boulder, CO 80307-3000
			303-497-2057

			thor at thor.ucar.edu

    	Torren med sitt skjegg		Thor with the beard
    	lokkar borni under sole-vegg	calls the children to the sunny wall
    	Gjo'i med sitt shinn		Gjo with the pelts
    	jagar borni inn.		chases the children in.



-------------------------------------------------------------------------------



More information about the Comp.lang.c mailing list