4.2bsd eof flag in stdio

Thu Nov 29 15:51:44 AEST 1984

>Try this program on your favorite version of stdio:
>
>#include <stdio.h>
>
>char    buf[256];
>
>main()
>{
>        register int n;
>
>        while (n = fread(buf, 1, sizeof buf, stdin))
>                fwrite(buf, 1, n, stdout);
>        printf("got EOF\n");
>}
>
>Run it and type (e.g.):
>
>testing 1 2 3
>^D
>another test

fread() is not read().  A read() from a terminal is delimited by the newline
character, so an EOF there is always indicated by a read() that returns 0.
No such guarantee is offered by fread(); show me the manual page for fread()
that says that 0 is returned upon EOF!  Had you used an fgets() or getc()
loop, the documentation states that NULL (fgets) or EOF (getc) indicates EOF
on the stream, and you can depend on that.  All you can depend on with
fread() is feof().  Thus your program is wrong, and rather than fix it you
broke the library.
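
For illustration, here is a minimal sketch of a copy loop that depends only
on feof() (and ferror()) to decide that the stream is finished, rather than
on fread() returning 0.  The name copy_stream() is made up for this example,
and the code assumes an ANSI-style compiler rather than the K&R dialect of
the quoted program.

```c
#include <stdio.h>

/*
 * Copy `in` to `out` until feof() or ferror() reports that the input
 * stream is finished.  A short or zero count from fread() is not, by
 * itself, treated as end-of-file.  Returns the number of bytes copied,
 * or -1 on a read or write error.
 * (copy_stream is a hypothetical name used only for this sketch.)
 */
long copy_stream(FILE *in, FILE *out)
{
    char buf[256];
    long total = 0;
    size_t n;

    for (;;) {
        n = fread(buf, 1, sizeof buf, in);
        if (n > 0) {
            /* Pass on whatever arrived, even if it is a short count. */
            if (fwrite(buf, 1, n, out) != n)
                return -1;
            total += (long)n;
        }
        if (ferror(in))
            return -1;
        if (feof(in))       /* the only EOF test the loop relies on */
            break;
    }
    return total;
}
```

A caller would use copy_stream(stdin, stdout) in place of the quoted
while loop and print "got EOF" once it returns.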

>It seemed like the right thing to
>do; the incompatibility was unfortunate.

That is a pretty clear statement of BSD philosophy; it causes some problems.

>>fread() returns 0 if there are 0 characters left in the terminal
>>input queue when the ^D is typed.  What would you have it do?
>The problem is if you type 'foo^D' with no newline.  You would expect
>that this would terminate input reading, but it does not -- you must
>type another ^D to finish it off.

As an experienced UNIX user who has read tty(4) [termio(7) in SysV],
I certainly would not expect that.

>>Contrary to popular misconception, ^D is NOT an "EOF" character;
>>rather, it marks a delimiter for input canonicalization.  If all
>>previous input has been consumed and a ^D is typed, then read()
>>returns a count of 0.  This is often interpreted as EOF.  If there
>>is some uncanonicalized input and ^D is typed, it acts much like
>>NEWLINE except of course no \n is appended.
>>
>This is, of course, a matter of opinion, but all the documentation
>states that ^D is the *end-of-file* character.  Perhaps the
>documentation (unchanged since my memory) is "buggy"?

It is, of course, *not* a matter of opinion, and while the documentation
calls ^D the EOF character, the formal behavior described in the documentation
is less naive than the name:

EOF     (Control-d or ASCII EOT) may be used to generate an end-of-file from
	a terminal.  When received, all the characters waiting to be read are
	immediately passed to the program, without waiting for a new-line, and
	the EOF is discarded.  Thus, if there are no characters waiting, which
	is to say *the EOF occurred at the beginning of a line*, zero
	characters will be passed back, which is the standard end-of-file
	indication.

(That is the >=SysIII text; the BSD text merely says that a newline or ^D
terminates a line being read in cooked mode.  Nothing anywhere says
that simply entering a ^D will cause an end-of-file indication.)
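
The difference is easy to observe at the read() level.  The following is a
rough sketch, not anything taken from the manuals: trace_reads() is a
hypothetical helper that records what each read() call returns until one
returns 0 or fails.

```c
#include <unistd.h>

/*
 * Call read() on fd until it returns 0 or -1.  The return value of each
 * call is stored in counts[] (up to `max` entries).  Returns the number
 * of read() calls made, including the final one.
 * (trace_reads is a hypothetical name used only for this sketch.)
 */
int trace_reads(int fd, long counts[], int max)
{
    char buf[256];
    int calls = 0;
    ssize_t n;

    do {
        n = read(fd, buf, sizeof buf);
        if (calls < max)
            counts[calls] = (long)n;
        calls++;
    } while (n > 0);
    return calls;
}
```

Going by the text quoted above, running this on a cooked-mode terminal and
typing foo^D should record a count of 3 and then wait for more input; only
a ^D typed with nothing pending should record the 0.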

When discussing fine points of documentation, it is more accurate and less
embarrassing to use your eyeballs, not your memory.  When something is
claimed to be a popular misconception, you should not be so arrogant as
to assume that you are not subject to such misconceptions without
verifying it.

>>If the 4.2BSD fread() was buggy, it should have been fixed rather
>>than introducing a significant incompatibility with other STDIOs.
>This bug is in ALL versions of fread (and getchar, and ...) *except*
>4.2.

Do you consider it a bug to be able to read() from a terminal after getting
an end-of-file indication?  The behavior of fread was consistent with the
documentation.  Changing it, whether desirable or not, is a change in
functionality.  A change can only be considered a bug fix if it brings into
line behavior previously out of line with the documentation.

-- Jim Balter, INTERACTIVE Systems (ima!jim)