fastest way to copy hunks of memory

Wed May 2 16:17:33 AEST 1990

In article <5531 at helios.ee.lbl.gov> tierney at ux1.lbl.gov (Brian Tierney) writes:
>Which method is fastest?
>1.  char *p1, *p2;
>    for (j = 0; j < size; j++)
>	    *p1++ = *p2++;
>2.  bcopy(p2,p1,size);
>3.  memcpy(p1,p2,size);
>In general, system calls are slower, right? (i.e., 1 faster than 2 and 3)
>BTW, what's the difference between bcopy and memcpy anyway??

bcopy() and memcpy() are never (to my knowledge) system calls; they are
ordinary functions in the system's C library.  While there is certainly
considerable overhead in making a system call, there is much less
overhead in making a library function call.

bcopy() is provided on 4BSD-based systems, while memcpy() is provided
on System V-based systems.  Note that they take their pointer arguments
in opposite order: bcopy(src, dst, len) versus memcpy(dst, src, len).
The ANSI C standard requires support for memcpy() AND memmove(); the
difference is that memmove() is guaranteed to "do the right thing" when
the source and destination buffers overlap, whereas memcpy() makes no
such promise (which sometimes allows it to be slightly faster).
bcopy()'s behavior in such circumstances is not as well defined,
although there seems to be some sentiment that it is supposed to "do
the right thing" also.
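
To make the overlap point concrete, here is a minimal sketch in ANSI C.
The function names copy_disjoint() and shift_up_one() are made up for
illustration; only memcpy() and memmove() come from the standard
library.

```c
#include <string.h>

/* Copy n bytes between buffers that are known NOT to overlap.
   memcpy() takes (destination, source, count); the BSD call
   bcopy(src, dst, n) takes source and destination the other way round. */
void copy_disjoint(char *dst, const char *src, size_t n)
{
    memcpy(dst, src, n);
}

/* Move bytes buf[0..len-2] up one position to buf[1..len-1], in place.
   Source and destination overlap here, so memmove() is the call whose
   result the standard defines; memcpy() gives no such guarantee. */
void shift_up_one(char *buf, size_t len)
{
    if (len > 1)
        memmove(buf + 1, buf, len - 1);
}
```

Calling shift_up_one() on the six characters "abcdef" leaves "aabcde"
in the buffer.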

Generally, bcopy() and memcpy() are implemented to exploit whatever
"block move" instructions the processor might support, which can make
them much faster when `size' is fairly large.  For small values of
`size', the in-line loop code might be faster, but unless it's in a
bottleneck section of code, why bother?  Indeed, instead of copying
one byte at a time, you might copy a word at a time, and change the
loop test to be against 0, and use do..while so that a
subtract-one-and-branch-if-nonzero instruction is generated, and ...
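
For the curious, the hand-tuned loop described above might look
something like the sketch below.  The name copy_words() is
hypothetical, and the code assumes that both pointers are properly
aligned for long, that the buffers do not overlap, and that the length
is a whole number of words.  Whether the compiler really emits a
decrement-and-branch instruction for it depends on the compiler and the
machine.

```c
#include <stddef.h>

/* Copy nwords objects of type long from src to dst, one word per
   iteration instead of one byte.  Assumes aligned, non-overlapping
   buffers (see the assumptions stated above). */
void copy_words(long *dst, const long *src, size_t nwords)
{
    if (nwords == 0)
        return;                  /* a do..while body runs at least once */
    do {
        *dst++ = *src++;
    } while (--nwords != 0);     /* count down so the test is against 0 */
}
```

The up-front check for zero is the price of using do..while, and any
leftover bytes that do not fill a whole word would still need a
byte-at-a-time loop of their own.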

Since this is neither a UNIX-specific nor a Wizardly question, I've
directed follow-ups to comp.lang.c (INFO-C).