fast file copying (was questions about a backup program ...)

Steve Summit scs@athena.mit.edu
Sat May 5 14:35:03 AEST 1990


In article <24164@mimsy.umd.edu> chris@mimsy.umd.edu (Chris Torek) writes:
>In article <12578@wpi.wpi.edu> jhallen@wpi.wpi.edu (Joseph H Allen) writes:
>>Interestingly, this aspect of the copy program [reading and writing very
>>large blocks] is one place where I think DOS is sometimes faster than
>>UNIX.  I suspect that many UNIX versions of 'cp' use block-sized buffers.
>>Doing so makes overly pessimistic assumptions about the amount of
>>physical memory you're likely to get.  
>
>...`big gulp' style copying is not always, and
>indeed not often, the best way to go about things...  Unix systems
>use write-behind (also known as delayed write) schemes to help out here;
>writers need use only block-sized buffers to avoid user-to-kernel copy
>inefficiencies.

Indeed.  The DOS implementation of cp is only apparently "better"
because it is doing something explicitly which the Unix program
has no need for.  Unaided DOS has no write-behind or read-ahead
(and very little caching), and programs that do 512-byte or 1K
reads and writes (including, tragically, most programs using
stdio) run abysmally slowly.
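(For concreteness, here is a rough sketch, in portable C rather than
anything DOS-specific, of the sort of "big gulp" copy loop being
discussed.  The names big_copy and BUFSIZE are mine, and the 64K
figure is an arbitrary stand-in for "as much memory as you can get.")

```c
/* Sketch of a big-buffer copy loop: read and write in large chunks
 * rather than pokey 512-byte blocks.  BUFSIZE is arbitrary; a DOS
 * program would grab as much memory as is actually available. */
#include <sys/types.h>
#include <fcntl.h>
#include <unistd.h>
#include <stdlib.h>

#define BUFSIZE (64 * 1024)

/* Copy src to dst; returns 0 on success, -1 on error. */
int big_copy(const char *src, const char *dst)
{
    int in = open(src, O_RDONLY);
    if (in < 0)
        return -1;
    int out = open(dst, O_WRONLY | O_CREAT | O_TRUNC, 0666);
    if (out < 0) {
        close(in);
        return -1;
    }

    char *buf = malloc(BUFSIZE);
    if (buf == NULL) {
        close(in);
        close(out);
        return -1;
    }

    ssize_t n;
    int status = 0;
    while ((n = read(in, buf, BUFSIZE)) > 0) {
        if (write(out, buf, (size_t)n) != n) {
            status = -1;
            break;
        }
    }
    if (n < 0)
        status = -1;

    free(buf);
    close(in);
    if (close(out) < 0)
        status = -1;
    return status;
}
```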

Using stdio is supposed to be the "right" thing to do; the stdio
implementation should worry about things like correct block
sizes, leaving these unnecessary system-related details out of
application programs.  (Indeed, BSD stdio does an fstat to pick a
buffer size matching the block size of the underlying
filesystem.)  If a measly little not-really-an-operating-system
like DOS must be used at all, a better place to patch over its
miserably simpleminded I/O "architecture" would be inside stdio,
which (on DOS) should use large buffers (up around 10K) if it can
get them, certainly not 512 bytes or 1K.  Otherwise, every
program (not just backup or cp) potentially needs to be making
explicit, system-dependent blocksize choices.  cat, grep, wc,
cmp, sum, strings, compress, etc., etc., etc. all want to be able
to read large files fast.  (The versions of these programs that I
have for DOS all run unnecessarily slowly, because the stdio
package they are written in terms of is doing pokey little 512
byte reads.  I refuse to sully all of those programs with
explicit blocksize notions.  Sooner or later I have to stop using
the vendor's stdio implementation and start using my own, so I
can have it do I/O in bigger chunks.)
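(In the meantime, an individual program stuck with a stock stdio can
at least hand its streams a bigger buffer with the standard setvbuf
routine.  Here is a sketch, using the 10K figure from above; the name
fopen_bigbuf is mine.  Of course the whole point is that this belongs
inside stdio, not repeated in every application.)

```c
/* Enlarge stdio's buffer for one stream without teaching the program
 * anything about filesystem block sizes: setvbuf is standard C, and
 * must be called after fopen but before any I/O on the stream. */
#include <stdio.h>

#define BIGBUF (10 * 1024)

/* fopen a file for reading with a large, fully-buffered stdio
 * buffer.  Returns NULL on failure. */
FILE *fopen_bigbuf(const char *name)
{
    FILE *fp = fopen(name, "rb");
    if (fp == NULL)
        return NULL;
    /* NULL buffer: let stdio allocate BIGBUF bytes itself. */
    if (setvbuf(fp, NULL, _IOFBF, BIGBUF) != 0) {
        fclose(fp);
        return NULL;
    }
    return fp;
}
```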


In article <12642@wpi.wpi.edu> jhallen@wpi.wpi.edu (Joseph H Allen) writes:
>Also, for copying to tapes and raw disks, 'cp' is usually very bad.  I think
>dd can be used to transfer large sets of blocks.  On one system I know of, if
>you 'cp' between two raw floppy devices, the floppy lights will blink on and
>off for each sector.

A certain amount of "flip-flopping" like this is inevitable under
vanilla Unix, at least when using raw devices, since there is no
notion of asynchronous I/O: the system call for reading is active
until it completes, during which time the write call is inactive,
and the reader is similarly idle while writing is going on.
Graham Ross once proposed a clever "double-buffered" device copy
program which forked, resulting in two processes, sharing input
and output file descriptors, and synchronized through a semaphore
so that one was always actively reading while the other one was
writing.  (This trick is analogous to the fork cu used to do to
have two non-blocking reads pending.)  It was amazing to watch a
tape-to-tape copy via this program between high-throughput, 6250
bpi tape drives: both tapes would spin continuously, without
pausing.  (cp or dd under the same circumstances resulted in
block-at-a-time pauses while the writing drive waited for the
reader and vice versa.)

                                            Steve Summit
                                            scs@adam.mit.edu



More information about the Comp.lang.c mailing list