pipes in unix

Sat May 11 03:31:12 AEST 1991

>What I wanna know is how does this work. I've looked up 'dup' and discovered
>that it duplicates a descriptor which will be carried across an 'exec' but
>dont understand how the recipient program knows its got a pipe. Does it know?
>Is it simple that dup replaces stdin and stdout if so why? because it doesn't
>seem to do that in the above code?

>I know I'm missing something here...would some kind person please tell me what
>it is?

To understand how your sample program works, you first need to understand
how fork(), pipe() and dup() work.

When fork() is invoked, the UNIX kernel "clones" the parent process to create
the child process. I am not going to bore you with the details of cloning,
but one important data structure that is copied to the child process is
the file descriptor table. Each process in UNIX has a file descriptor table
associated with it. In UNIX System V, the file descriptor table can have up to
25 entries. In BSD UNIX, it can have up to 99 entries. (The exact limit depends
on the system and how it is configured.) Therefore, in System V, file
descriptor table entries can be a "precious" commodity, and you must use them
carefully to avoid running out of file descriptors.

This cloning allows all the file descriptor table entries established in the
parent process to be inherited by the child process. When the parent process
invokes pipe() and then fork(), the child process also inherits the file
descriptors that the pipe() system call returned to the parent.

One misconception about pipe() is that it allows bidirectional communication
between two processes. That is simply untrue. One process must be the "reader"
and the other must be the "writer", with both communicating through the pair of
file descriptors returned from the pipe() system call. Say you invoke pipe()
with the parameter pfd, where pfd is declared as an integer array of 2
elements. The writer process will write to pfd[1] while the reader process
will read from pfd[0]. Now, you may ask what happens to the pfd[0] that belongs
to the writer. Well, it is simply not used and is usually released through
close(). Ditto for the pfd[1] that belongs to the reader. If you want
bidirectional communication, you need to use another pair of file descriptors
returned from a second pipe() system call. But who wants to do that? Sockets
offer a much better way to do bidirectional communication between processes.
That's another topic.

Each process typically uses up the first 3 entries of the file descriptor
table: 0 for stdin, 1 for stdout and 2 for stderr. If you can manipulate these
entries, you can set up a communication channel between the parent and the
child process. That is where the dup() system call comes in. The dup() system
call duplicates a file descriptor and returns another file descriptor
that points to the same file. It is much like opening the same file
twice, except that the two descriptors share the same file offset. There is
also a twist to it: dup() returns the lowest-numbered file descriptor that is
NOT currently in use. If you close stdin, the UNIX kernel frees up file
descriptor entry 0. By doing a dup() right after a close(0), you can be sure
that the file descriptor returned by dup() will be 0. (UNIX Version 7 has a
dup2() system call which takes care of both the close() and the dup() in one
system call.)

A revised copy of your sample program with comments is listed below to show you
"the proper way" of linking two processes to communicate with each other.
I am not going to get into error checking and all that jazz. I assume you 
know how to do that.

#include <sys/types.h>
#include <unistd.h>

int main(void)
{
    pid_t pid;
    int pipefd[2];

    /* invoke pipe() to get the two file descriptors */

    pipe(pipefd);

    if ((pid = fork()) == (pid_t)0)
    {
        /* I am the child process */
        /* close stdout to free up file descriptor 1 */

        close(1);

        /* invoke dup() to associate file descriptor 1 (stdout) */
        /* with the write end of the pipe                       */

        dup(pipefd[1]);

        /* free up the rest of the unnecessary file descriptors */

        close(pipefd[0]);
        close(pipefd[1]);
        close(0);

        /* ls inherits descriptor 1, which now refers to the pipe */

        execlp("ls", "ls", (char *)0);
    }
    else if (pid > (pid_t)0)
    {
        /* I am the parent */
        /* close stdin to free up file descriptor 0 */

        close(0);

        /* invoke dup() to associate file descriptor 0 (stdin) */
        /* with the read end of the pipe                       */

        dup(pipefd[0]);

        /* close all the unnecessary file descriptors.  */
        /* you don't want to close stdout & stderr,     */
        /* otherwise the output has nowhere to go       */

        close(pipefd[0]);
        close(pipefd[1]);

        /* sort inherits descriptor 0, which now refers to the pipe */

        execlp("sort", "sort", (char *)0);
    }

    /* reached only if fork() or execlp() failed */
    return 1;
}

An excellent book for getting started with UNIX system programming is
"Advanced UNIX Programming" by Marc J. Rochkind. The publisher is
Prentice-Hall. It puts more emphasis on the standard UNIX System V stuff.

I hope I have answered most of your questions.

P.S. Does anyone out there know a good BSD system programming book other than
"The Design and Implementation of the 4.3BSD UNIX Operating System"?

------------------------------------------------------------------------------
 Doug Yip             Hewlett Packard - Manufacturing Productivity Operation
 Santa Clara, CA      Internet: dougy at hpsciz.hp.com
 (408) 553-3622  
 Mailstop: 51U-91
------------------------------------------------------------------------------