count of bits in a long

Gene H. Olson gene at zeno.mn.org
Sat Oct 13 01:41:37 AEST 1990


peter at neccan.oz (Peter Miller) writes:
>/*
> * A while back I posted a solution to the "count of bits set in a
> * long" question, suggesting that HACKMEM 169 was a very fast way to do it.
> * This program demonstrates the technique.
> * 
> * The HACKMEM 169 method uses a modulo to sum the digits.
> * A number of people have asked me "how can a division be faster than a loop?"
> * This program demonstrates how.  The modulo is a special case,
> * and can be unwound into a tight loop if your machine does not
> * have a fast divide opcode (although many modern machines have a 1 cycle
> * divide opcode, except for we-really-believe-our-press-hype risc
> * manufacturers).
> * 
> * This program was run on a 386/20 under unix with the following results:
> *	Func	sec/iter	scaled
> *	nbits	0.00018310547	 1.000	hackmem 169
> *	nbits1	0.0006980896	 3.812	hackmem 169 no % operator
> *	nbits2	0.0016822815	 9.188	count lsb's loop
> *	nbits3	0.0037384033	20.417
> *	nbits4	0.0034370422	18.771
> *	nbits5	0.0037765503	20.625
> *	nbits6	0.0035247803	19.250
> */
> .....  A program follows this.

To this program I have added a bug fix (the function table
was wrong) and another function (nbit7) which is MY FAVORITE
WAY of counting bits.   The patches are at the end of this
posting....

Anyway, I got much better results on both the 386 (which
has a multiply) and on a SPARC (which doesn't) than any of
the functions in the original program.

The results were:

            386                               SPARC
------------------------------      ------------------------------
nbits   0.00020980835    1.774      nbits   0.00096893311   15.875
nbits1  0.00038909912    3.290      nbits1  0.00012969971    2.125
nbits2  0.00075149536    6.355      nbits2  0.00031280518    5.125
nbits3  0.0015411377    13.032      nbits3  0.00067138672   11.000
nbits4  0.0016479492    13.935      nbits4  0.00065231323   10.688
nbits5  0.0015525818    13.129      nbits5  0.00067520142   11.062
nbits6  0.0021286011    18.000      nbits6  0.00086212158   14.125
nbits7  0.00011825562    1.000      nbits7  6.1035156e-05    1.000

The nbit7 technique is a simple routine which shifts and sums
bits in place until all bits have summed in the lower 8 bits
of the word.  The implementation here requires 32 bit longs,
but it is easily extensible to 64 bits with a mask change and
the additional shift shown.

Enjoy,

------------- Remainder of Posting is a Cdiff Patchfile ----------

*** orig.c	Fri Oct 12 08:56:22 1990
--- gene.c	Fri Oct 12 09:23:51 1990
***************
*** 215,220 ****
--- 215,236 ----
  	return count;
  }
  
+ int
+ nbits7(n)
+ 	register unsigned long n ;
+ {
+ 	n = ((n >> 1) & 0x55555555) + (n & 0x55555555) ;
+ 	n = ((n >> 2) & 0x33333333) + (n & 0x33333333) ;
+ 	n = ((n >> 4) + n) & 0x0F0F0F0F ;
+ 	n = ((n >> 8) + n) ;
+ #if LONG64
+ 	n = (n >> 16) + n ;
+ 	return ((n >> 32) + n) & 0xFF ;
+ #else
+ 	return ((n >> 16) + n) & 0xFF ;
+ #endif
+ }
+ 
  /*
   * build a long random number.
   * The standard rand() returns a 15bit number.
***************
*** 275,282 ****
  	{ "nbits2", 	nbits2 },
  	{ "nbits3", 	nbits3 },
  	{ "nbits4", 	nbits4 },
! 	{ "nbits5", 	nbits3 },
! 	{ "nbits6", 	nbits4 },
  };
  
  main()
--- 291,299 ----
  	{ "nbits2", 	nbits2 },
  	{ "nbits3", 	nbits3 },
  	{ "nbits4", 	nbits4 },
! 	{ "nbits5", 	nbits5 },
! 	{ "nbits6", 	nbits6 },
! 	{ "nbits7", 	nbits7 },
  };
  
  main()
_________________________________________________________________________
   __ 
  /  )                 Gene H. Olson             uunet!digibd!zeno!gene
 / __  _ __  _         Smartware Consulting
(__/ _(/_//_(/_        Minneapolis, MN           (612) 824-9108



More information about the Comp.lang.c mailing list