hash function for text in C

Ron Irvine ron at sco.COM
Sat Oct 20 06:54:08 AEST 1990


I have used many hash functions.  Each one has advantages
and disadvantages.  Here is a powerful but simple one.
It simply does a crc16 on the character string.  The crc16 is
a great hash function since it is designed to produce a
unique code for different character strings.  So for "reasonable"
text it gives a good distribution of hash numbers (a big problem
to solve in general).  The crc16 function I list below uses nibbles
to build the crc from each byte, this is faster than a bit by bit and
less room than a 256 entry table needed for a byte by byte (fits
in cache).  So, it is fast (no multiply, just shifts and masks),
well behaved and simple.  If you have a VAX you could even use the
"crc" instruction instead of the crc16 function - what a CISC.
To use, do a crc16 on the string and the truncate the 16 bit
result to the size you need (this should be a power of 2).


/*	crc16 - crc16 routine
 *
 *	R.K. Irvine
 *
 *	This routine returns the crc16 for the string pointed
 *	to by "in".
 *	crc16 is given by:  x^16 + x^15 + x^2 + 1
 */
unsigned short
crc16(register char *in) {
	register unsigned int	n, crc;

	static unsigned short crc16l[] = {
		0x0000,0xc0c1,0xc181,0x0140,
		0xc301,0x03c0,0x0280,0xc241,
		0xc601,0x06c0,0x0780,0xc741,
		0x0500,0xc5c1,0xc481,0x0440,
	};

	static unsigned short crc16h[] = {
		0x0000,0xcc01,0xd801,0x1400,
		0xf001,0x3c00,0x2800,0xe401,
		0xa001,0x6c00,0x7800,0xb401,
		0x5000,0x9c01,0x8801,0x4400,
	};

	crc = 0;
	while (n = (unsigned char)(*in++)) {	
		n ^= crc;
		crc = crc16l[n&0x0f] ^ crc16h[(n>>4)&0x0f] ^ (crc>>8);
	}
	return(crc);
}



More information about the Comp.lang.c mailing list