BIT Counting: C programming problem

Jerry Leichter leichter at cs.yale.edu
Thu Jan 12 00:52:35 AEST 1989


You might find the following amusing.  It works ONLY on 32-bit quantities
(though an analogous algorithm can be derived for any word size).  It's not
a particularly good way to count bits at run-time (though I suppose it might
lead to an optimally-short program on some architectures).  What the "best"
algorithm is depends on what you want, but along most measures of space and
speed using a 256-entry table of "bits per byte" count and then adding up the
counts for each byte is usually good.  This algorithm's claim to fame (other
than obscurity!) is that it can be evaluated completely at compile time by
C's limited macro pre-processor (plus a constant folder) if the argument is a
constant.  There are simpler algorithms with this property - you could always
write a sum of 32 expressions testing each bit, or (better) a recursive sub-
division into halves until you use such an expression on small pieces - but
they produce much larger macro expansions, often expansions so large that
some macro processors die.  (If they don't die, they run for a LONG time.)

							-- Jerry

/*
 * Macros to compute a bit value from a bit number, bit number from bit value,
 * and the bit count (number of bits on) for 32-bit quantities.  The results
 * are compile-time constants if the arguments are.
 *
 * The definition of BITCOUNT is based on an algorithm condensed from one
 * extracted from a graph coloring register allocator which manipulates
 * bitvectors.  It was inspired by an analogous algorithm for counting bits in
 * 36 bit words in MIT's HAKMEM, designed by Chris Smith, and written by Jane
 * Hoffman.  The original conversion to a form suitable for C on a VAX was by
 * Brad Merrill; that code was fixed and re-written by Jerry Leichter.  It
 * assumes a 32-bit word.  Note that the argument is evaluated multiple times,
 * so should not have side-effects.  A full explanation of the algorithm is
 * too long to include here; in summary, BX_(x) replaces each 4-bit sequence
 * in x by the number of bits in that sequence (which will, of course, fit in
 * 4 bits).  BITCOUNT folds the 4-bit sequences into 8-bit sequences.  The
 * 8-bit sequences are then added up by recognizing that they can be viewed as
 * a base 256 number, and the sum of the digits of a base b number is the same
 * mod b-1 as the number mod b-1.
 *
 * The original implementation produced results in groups of 3, then 6 bits.
 * While natural for a PDP-10's 36-bit word, this requires special case
 * handling of the sign bit on a VAX, since the top 3-bit sequence is really
 * only 2 bits long. 
 */
#define BITVAL(n)	(1 << (n))
#define BITNUM(b)	BITCOUNT((b)-1)
#define BITCOUNT(x)	(((BX_(x)+(BX_(x)>>4)) & 0x0F0F0F0F) % 255)
#define  BX_(x)		((x) - (((x)>>1)&0x77777777)			\
			     - (((x)>>2)&0x33333333)			\
			     - (((x)>>3)&0x11111111))



More information about the Comp.lang.c mailing list