Faster bcopy using Duff's device (source)

stergios marinopoulos stergios at Jessica.stanford.edu
Fri Sep 8 10:52:37 AEST 1989


I wanted a faster bcopy, so I used Duff's device as a basis for it.  In
addition, it copies ints at a time instead of chars, and the loop is
unrolled a little too.  It's been working well for me today, so it has
to be perfect, right?

I have been seeing 4X speedups, so I thought I would pass it along.

A potential problem is the char pointers not being aligned on an int
boundary, but I have not run into it.  Also, this probably will not copy
strings smaller than 32 bytes (no problem for me, I wanted to copy
megs-o-stuff.)
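
One way to guard against the alignment problem would be to test both
pointers before taking the int-at-a-time path, and to fall back to a plain
byte loop when the test fails.  The sketch below is only an illustration:
int_aligned is a hypothetical helper that is not part of the posted code,
it relies on uintptr_t from <stdint.h> (which is newer than this post), and
it assumes sizeof(int) is a power of two and that the low bits of a pointer
converted to an integer reflect its address.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of the posted code.
   Returns 1 when both pointers sit on an int boundary, 0 otherwise.
   Assumes sizeof(int) is a power of two. */
static int int_aligned(const void *a, const void *b)
{
	uintptr_t mask = (uintptr_t)(sizeof(int) - 1) ;

	/* OR the two addresses together: any low bit set in either one
	   means at least one pointer is not int-aligned */
	return (((uintptr_t)a | (uintptr_t)b) & mask) == 0 ;
}
```

A caller could check int_aligned(chardest, charsrc) first and use the
int copy only when it returns 1.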

Let me know what you think, of the code or anything else for that
matter.

sm

**********************************************************************


#define IFACTOR 4

dcopy(chardest, charsrc, size)
	char *chardest, *charsrc ;
	int size ;
{
	register int *src, *dest, intcount ;
	int startcharcpy, intoffset, numints2cpy, i ;

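	numints2cpy = size / (int) sizeof(int) ;	/* whole ints to copy */
	startcharcpy = numints2cpy * (int) sizeof(int) ;	/* first leftover byte */

	if (numints2cpy > 0) {
		/* ints in the partial last block (0 means a full block) */
		intoffset = numints2cpy & (IFACTOR-1) ;
		/* index of the first int of the last block */
		intcount = (numints2cpy - 1) & ~(IFACTOR-1) ;

		src = (int *) charsrc + intcount ;
		dest = (int *) chardest + intcount ;

		/* copy the ints, last block first; the switch jumps into the
		   unrolled loop so a partial last block is handled too */
		switch(intoffset)
			do {
			case 0: dest[3] = src[3] ;
			case 3: dest[2] = src[2] ;
			case 2: dest[1] = src[1] ;
			case 1: dest[0] = src[0] ;
				/* stop before stepping back past the start */
				if ((intcount -= IFACTOR) < 0)
					break ;
				dest -= IFACTOR ;
				src -= IFACTOR ;
			} while (1) ;
	}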

	/* copy the chars left over by the int copy at the end */
	for(i=startcharcpy ; i<size ; i++)
		chardest[i] = charsrc[i] ;
}
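
Since "working well today" is not proof, a small checker can help.  The
sketch below is an assumption-laden illustration rather than part of the
original post: check_copy, bytecopy, overcopy, GUARD and FILL are all
hypothetical names.  It runs a copy routine with the same argument order as
dcopy over every size from 0 to maxsize, compares the result with the
source, and looks for stray writes in the guard bytes after the requested
size.

```c
#include <stdlib.h>
#include <string.h>

#define GUARD 32	/* extra bytes kept after the largest copy */
#define FILL  0x5A	/* pattern the copy must leave untouched */

/* Reference byte copy, same argument order as dcopy. */
static void bytecopy(char *dest, char *src, int size)
{
	int i ;

	for (i = 0 ; i < size ; i++)
		dest[i] = src[i] ;
}

/* Deliberately faulty copy (one byte too many), only here to show
   that the checker notices an overrun. */
static void overcopy(char *dest, char *src, int size)
{
	bytecopy(dest, src, size + 1) ;
}

/* Returns -1 if every size from 0 to maxsize copies correctly, the
   first failing size otherwise, or -2 if memory could not be
   allocated. */
static int check_copy(void (*copy)(char *, char *, int), int maxsize)
{
	int size, i, bad = -1 ;
	int total = maxsize + GUARD ;
	char *src = malloc(total) ;
	char *dest = malloc(total) ;

	if (src == NULL || dest == NULL) {
		free(src) ;
		free(dest) ;
		return -2 ;
	}

	/* source values stay in 0..63, so they never equal FILL */
	for (i = 0 ; i < total ; i++)
		src[i] = (char)(i % 64) ;

	for (size = 0 ; size <= maxsize && bad < 0 ; size++) {
		memset(dest, FILL, total) ;
		copy(dest, src, size) ;
		if (memcmp(dest, src, size) != 0)
			bad = size ;		/* wrong data copied */
		for (i = size ; i < total ; i++)
			if (dest[i] != FILL)
				bad = size ;	/* wrote past the end */
	}

	free(src) ;
	free(dest) ;
	return bad ;
}
```

Here check_copy(bytecopy, 200) returns -1, while check_copy(overcopy, 200)
returns 0 because the very first size already writes one byte too many.
Any routine with a matching signature can be passed in the same way.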



More information about the Alt.sources mailing list