soundex algorithm wanted

BALDWIN mike at whuxl.UUCP
Thu Sep 4 08:06:54 AEST 1986


> > I would like any info pertaining to soundex search algorithms
> > (phonetic grep).  Source to a nifty, efficient algorithm would
> > be great, but I'll take anything.  Thanx in advance.
> > 
> 
>  /*********************************************************\
>  * This program exemplifies the soundex algorithm.	  *
>  *							  *
>  * You type in a word and it spits out the soundex string  *
>  * that was produced for that word.			  *
>  \*********************************************************/

Unfortunately, it doesn't generate correct Soundex codes.
The algorithm is actually pretty tricky, and I've seen
lots of implementations that mishandle names like Lloyd
(doubled first letter) and Manning (a doubled letter,
then the same code again after a vowel).  Here's one
that I believe is correct:
-----

#include <ctype.h>
#include <string.h>

#define	SDXLEN	4

char *
soundex(name)
char	*name;
{
	static char	buf[SDXLEN+1];
	register char	c, lc, prev = '0';
	register int	i;

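	strcpy(buf, "a000");	/* default result; also the zero padding */

	for (i = 0; *name && i < SDXLEN; name++)
		if (isalpha((unsigned char)*name)) {
			lc = tolower((unsigned char)*name);
			/* digit for each of a..z; '0' = vowels, h, w, y */
			c = "01230120022455012623010202" [lc-'a'];
			/* first letter is stored (lowercased); after that,
			 * skip uncoded letters and repeats of the last code */
			if (i == 0 || (c != '0' && c != prev)) {
				buf[i] = i ? c : lc;
				i++;
			}
			prev = c;	/* updated even for skipped letters */
		}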

	return buf;
}

-----
And a little driver for it:
-----

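#include <stdio.h>

int
main()
{
	char	line[64];

	/* fgets() bounds the read; soundex() skips the newline it keeps */
	while (fgets(line, sizeof line, stdin))
		puts(soundex(line));
	return 0;
}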
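
-----
For anyone who wants to check the two names mentioned above, or
use this for the "phonetic grep" that was asked about, here is a
rough sketch of the same table-driven approach with a
caller-supplied buffer instead of the static one.  The names
soundex_demo, sounds_alike and SDX_DEMO_LEN are made up for
illustration, and the range check assumes an ASCII-style
alphabet.  Like the routine above, it keeps the first letter in
lower case.

```c
#include <ctype.h>
#include <string.h>

#define SDX_DEMO_LEN 4	/* hypothetical; mirrors SDXLEN above */

/*
 * soundex_demo: illustrative variant of soundex() that writes the
 * code into out[] (at least SDX_DEMO_LEN+1 bytes) and returns out.
 */
char *
soundex_demo(const char *name, char *out)
{
	/* digit for each of a..z; '0' marks letters that are not coded */
	static const char table[] = "01230120022455012623010202";
	char c, lc, prev = '0';
	int i = 0;

	strcpy(out, "a000");	/* same default and padding as above */

	for (; *name && i < SDX_DEMO_LEN; name++) {
		lc = (char)tolower((unsigned char)*name);
		if (lc < 'a' || lc > 'z')	/* assumes contiguous a..z */
			continue;
		c = table[lc - 'a'];
		/* store the first letter; then only new, nonzero codes */
		if (i == 0 || (c != '0' && c != prev)) {
			out[i] = i ? c : lc;
			i++;
		}
		prev = c;	/* a vowel in between resets the repeat check */
	}
	return out;
}

/* sounds_alike: nonzero if both names produce the same code. */
int
sounds_alike(const char *a, const char *b)
{
	char ca[SDX_DEMO_LEN + 1], cb[SDX_DEMO_LEN + 1];

	return strcmp(soundex_demo(a, ca), soundex_demo(b, cb)) == 0;
}
```

With this sketch, "Lloyd" comes out as l300 and "Manning" as
m552, and sounds_alike("Lloyd", "Loyd") is true.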
-- 
						Michael Baldwin
			(not the opinions of)	AT&T Bell Laboratories
						{at&t}!whuxl!mike


