File descriptors and streams and co

Lloyd Kremer kremer at cs.odu.edu
Wed Apr 19 14:45:04 AEST 1989



In article <207600018 at s.cs.uiuc.edu> carroll at s.cs.uiuc.edu (Alan M. Carroll)
writes:
>I must be missing something - given that
>FILE *my_file;
>has been properly set up (with fopen(), no errors, etc.), why can't you
>switch stdin by having another variable
>FILE *tmp;
>and doing
>tmp = stdin; stdin = my_file;


There are several concepts missing here.  Although any discussion of I/O
is, strictly speaking, not relevant to the C language, in practice almost
every C program does some I/O, and hopefully the commonly used interfaces are
sufficiently consistent across operating systems, at least conceptually, to
make a discussion here useful.  When I say "conceptually," I mean that even
if it isn't really implemented in this way, if you assume that it is, your
program will work properly in all cases.

In virtually any system that has a UNIX ancestry or that attempts to emulate
the UNIX I/O methodology, the following should be conceptually correct.  The
names of the various internal objects may vary or may not be defined.

Low level I/O works through integer file descriptors, which are returned
from low level I/O calls such as open, creat, and dup.  A process can have
only a limited number of them open at once (often 20); in some systems the
symbol _NFILE is #define'd as this maximum number of open files.  High level
I/O consists of a low level file combined with associated buffering.  The
buffering avoids the necessity of a system I/O call to process every
character.
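
To make the difference between the two levels concrete, here is a minimal
sketch, assuming a POSIX system.  The function names are made up for
illustration.  Both functions count the bytes in a file; the first issues one
read() call per byte on a bare descriptor, while the second goes through a
FILE and lets the stdio buffer absorb most of the calls.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Low level (hypothetical helper): one read() system call per byte,
   made directly on an integer file descriptor.
   Returns the byte count, or -1 on error. */
long count_bytes_unbuffered(const char *path)
{
    int fd = open(path, O_RDONLY);
    char c;
    long n = 0;
    ssize_t got;

    if (fd < 0)
        return -1;
    while ((got = read(fd, &c, 1)) == 1)
        n++;
    close(fd);
    return got < 0 ? -1 : n;
}

/* High level (hypothetical helper): getc() normally hands back bytes
   from the FILE's buffer and only falls through to a system call when
   that buffer is empty.  Returns the byte count, or -1 on error. */
long count_bytes_buffered(const char *path)
{
    FILE *fp = fopen(path, "r");
    long n = 0;

    if (fp == NULL)
        return -1;
    while (getc(fp) != EOF)
        n++;
    fclose(fp);
    return n;
}
```

Both functions return the same count for the same file; the difference is
in how many times the operating system is entered to get there.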

A FILE is typedef'd or #define'd as a struct containing a low level file
descriptor and a few other members pertaining to the buffering (type of
buffering, pointers to the buffer, count of characters in the buffer,
read/write capability, error flags, etc.).  The first three file descriptors
(0, 1, and 2) are normally inherited from the parent process, and the first
three FILEs are provided pre-opened on them.  They are open to the same
things and in the same modes as they were in the parent.
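
The descriptor inside a FILE can be inspected with fileno(), which is a POSIX
function rather than part of the C language proper.  The following is a small
sketch under that assumption; the helper names are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>

/* Hypothetical helper: returns 1 when the three pre-opened streams wrap
   descriptors 0, 1, and 2, which is the usual arrangement. */
int standard_streams_use_fds_0_1_2(void)
{
    return fileno(stdin) == 0 &&
           fileno(stdout) == 1 &&
           fileno(stderr) == 2;
}

/* Hypothetical helper: opens path as a stream and reports which low
   level descriptor the new FILE wraps, or -1 if the open fails. */
int descriptor_of_new_stream(const char *path)
{
    FILE *fp = fopen(path, "r");
    int fd;

    if (fp == NULL)
        return -1;
    fd = fileno(fp);
    fclose(fp);
    return fd;
}
```

In a program started normally from a shell, a stream opened this way will
usually report descriptor 3 or higher, since 0, 1, and 2 are already taken.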

There is an array of _NFILE FILEs often called _iob or some similar name.

The names stdin, stdout, and stderr are usually #define'd as the addresses
of the first three of these FILEs (structs).

	#define stdin  (&_iob[0])
	#define stdout (&_iob[1])
	#define stderr (&_iob[2])

Hence stdin cannot, in general, be used as an lvalue.
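
A toy model may help here.  The sketch below is not any real stdio; the
struct, the array, and every name beginning with my_ or MY_ are invented for
illustration, and real FILE layouts differ.  It only shows why a macro of
this shape yields something that cannot be assigned to, while the structure
it points at can still be modified.

```c
#include <stddef.h>

/* Hypothetical stand-in for an old-style FILE structure. */
struct my_file {
    int   fd;     /* low level file descriptor */
    int   count;  /* characters remaining in the buffer */
    char *ptr;    /* next character in the buffer */
    int   flags;  /* read/write mode, error and EOF bits */
};

#define MY_NFILE 20

/* The fixed table of streams; the first three are pre-assigned. */
struct my_file my_iob[MY_NFILE] = {
    { 0, 0, NULL, 0 },
    { 1, 0, NULL, 0 },
    { 2, 0, NULL, 0 }
};

#define my_stdin  (&my_iob[0])
#define my_stdout (&my_iob[1])
#define my_stderr (&my_iob[2])

/* my_stdin = &my_iob[5];   would not compile: &my_iob[0] is an address
   value, not an object that can appear on the left of an assignment. */

/* What can change is the contents of the struct my_stdin refers to. */
void set_my_stdin_fd(int new_fd)
{
    my_stdin->fd = new_fd;
    my_stdin->count = 0;     /* discard whatever was buffered */
    my_stdin->ptr = NULL;
}
```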

This is the reason that "changing stdin" is, in general, non-trivial.
Stdin itself cannot be changed; it's the address of an absolute location in
memory; it's immovable.  When we speak of "changing stdin", we mean changing
the *contents* of the structure referenced by stdin.  This involves clearing
out the previous contents properly, flushing as needed to preclude any data
loss, and then opening the new file such that *stdin (_iob[0]) is selected
as its FILE structure.  That FILE structure must contain file descriptor 0
(this is not automatic; it must be arranged).  The file descriptor must be
validly open, and open for reading; the FILE must be set for reading ("r");
and all the other structure members must be properly and consistently
initialized for the new stream.
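
One way to get most of this done without touching the structure by hand is
the standard library function freopen(), which closes whatever a stream was
attached to and opens a new file on the same FILE object.  The sketch below
assumes that approach; the wrapper name redirect_stdin is made up.  Whether
the new file ends up on descriptor 0 is up to the implementation, so a
program that depends on that should check it rather than assume it.

```c
#include <stdio.h>

/* Hypothetical wrapper: re-point stdin at the file named by path.
   Returns 0 on success and -1 on failure. */
int redirect_stdin(const char *path)
{
    /* freopen() reuses the existing FILE object, so the value of stdin
       does not change; only the structure behind it is refilled.
       On failure the old stream has already been closed, so stdin
       should not be used again until it is reopened successfully. */
    if (freopen(path, "r", stdin) == NULL)
        return -1;
    return 0;
}
```

After a successful call, ordinary input functions such as getchar() and
fgets(buf, sizeof buf, stdin) read from the new file.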

I have found that programmers who perform surgery on stdio without due regard
for these considerations produce programs whose I/O sort of works most of
the time, but suffers occasional lost data, misdirected data, invalid file
descriptor problems, and memory errors.  Moral: be sure you understand both
low level and high level I/O, and the relationships between them, before you
start rewiring them in the middle of an executable.

-- 

					Lloyd Kremer
					Brooks Financial Systems
					...!uunet!xanth!brooks!lloyd
					Have terminal...will hack!


