qfork() (The Spawn of spawn())

Guido van Rossum guido at cwi.nl
Thu Jan 3 20:42:04 AEST 1991


Submitted-by: guido at cwi.nl (Guido van Rossum)

peter at ficc.ferranti.com (Peter da Silva) writes:

>Yes, fork() is a cleaner method of creating new processes. Yes, it takes
>a fairly complex calling sequence to get spawn() to have anything like
>the functionality of fork()...exec(). But I think it'd be worthwhile to
>let a little heresy in in exchange for making POSIX more palatable to
>folks in poorer environments.

I know of precedents even in OSes that support fork(): Amoeba and
Topaz support a variant of what you call spawn().  (Note that the
spawn() functions found in Microsoft C for MS-DOS emulate either just
exec() or fork()+exec()+wait(), which is much less powerful, but is
all that MS-DOS can support, last time I looked.)

Amoeba's UNIX emulation supports fork(), but since Amoeba has no virtual
memory (yet), it is fairly expensive.  An alternative function is
provided, "newproc()", which creates a child process running a
different program (and, because it is Amoeba, also running on a
different processor, in the average case) just like fork()+exec() would
do, only much cheaper since the parent's address space never gets
copied.

Amoeba's newproc() lets you change perhaps the two most important
bits of "kernel state" that programs fiddle with between fork() and
exec(): the set of signals to be ignored and the set of open file
descriptors.
The interface lets you specify a bitmask of signals that are to be
ignored in the child (or -1 to inherit the parent's ignored signals)
and an array of file descriptors which provides a mapping between file
descriptors in the parent and in the child (also with an option to
inherit all file descriptors from the parent).
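
To make the two parameters concrete, here is a small model (in Python,
not Amoeba's actual C interface) of what the child ends up with.  The
encodings are assumptions made for illustration only: fdmap[i] names
the parent descriptor that becomes descriptor i in the child, -1 in
the array leaves that slot closed, None means "inherit everything",
and bit n-1 of the mask stands for signal n.

```python
def child_fds(parent_fds, fdmap=None):
    """Model the child's descriptor table.

    parent_fds maps fd number -> whatever is open on it in the parent.
    fdmap[i] is the parent fd that becomes fd i in the child (assumed
    encoding); -1 leaves slot i closed; None inherits the whole table.
    """
    if fdmap is None:
        return dict(parent_fds)
    return {child: parent_fds[parent]
            for child, parent in enumerate(fdmap)
            if parent >= 0}


def child_ignored(mask, parent_ignored):
    """Model the child's set of ignored signals.

    mask == -1 inherits the parent's set; otherwise bit n-1 set means
    signal n is ignored (assumed bit numbering).
    """
    if mask == -1:
        return set(parent_ignored)
    return {n for n in range(1, 33) if mask & (1 << (n - 1))}
```

For example, child_fds({0: "tty", 1: "tty", 3: "pipe"}, [3, 1, 1])
gives the child the pipe on its standard input and the tty on both
standard output and standard error.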

Amoeba's library functions popen() and system() have been changed to
use newproc(), and the shell uses newproc() for most simple program
invocations (environment manipulations and a few other things make it
fall back on fork()).  The performance gain was well worth the hacking.

The newproc() interface could also be implemented on UNIX using
[v]fork() and exec(), although extreme cases of file descriptor
permutations could fail if not enough spare file descriptors were
available.
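
A minimal sketch of such an emulation, in Python on top of os.fork()
and os.execvp(), might look as follows.  The function name, argument
names and the fdmap encoding (fdmap[i] is the parent descriptor that
becomes descriptor i in the child, -1 meaning closed) are assumptions,
not Amoeba's real signature.  Every source descriptor is first
duplicated above the target range so that permutations such as
swapping 0 and 1 cannot clobber a descriptor that is still needed;
that duplication step is where a process short on spare descriptors
would fail.

```python
import fcntl
import os
import signal


def newproc(argv, fdmap=None, ignore=None):
    """Start argv in a child process and return the child's pid.

    fdmap:  list with fdmap[child_fd] = parent_fd, -1 = closed,
            or None to inherit the descriptors as they are.
    ignore: iterable of signal numbers to ignore in the child,
            or None to inherit the parent's dispositions.
    """
    pid = os.fork()
    if pid != 0:
        return pid                      # parent: nothing else to do
    try:
        if ignore is not None:
            for sig in ignore:
                signal.signal(sig, signal.SIG_IGN)  # survives exec
        if fdmap is not None:
            floor = len(fdmap)
            spare = {}
            # Step 1: copy each source out of the target range.  This
            # needs one spare descriptor per distinct source and
            # raises OSError (EMFILE) if none are left.
            for src in set(fdmap):
                if src >= 0:
                    spare[src] = fcntl.fcntl(src, fcntl.F_DUPFD, floor)
            # Step 2: install the copies in their final slots.
            for dst, src in enumerate(fdmap):
                if src >= 0:
                    os.dup2(spare[src], dst)
                else:
                    try:
                        os.close(dst)
                    except OSError:
                        pass            # slot was not open anyway
            # Step 3: drop the temporary copies.
            for fd in spare.values():
                os.close(fd)
        os.execvp(argv[0], argv)
    finally:
        os._exit(127)                   # reached only if setup or exec failed
```

With a pipe (r, w) and a descriptor devnull open on /dev/null, calling
newproc(["sh", "-c", "echo hi"], fdmap=[devnull, w, devnull]) lets the
parent read the child's output from r after closing w.  Unlike
Amoeba's newproc(), this version still pays for the fork().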

The Topaz operating system (an Ultrix clone for Firefly multiprocessors
developed at DEC's Systems Research Center in Palo Alto) has a similar
but more complete feature in its Modula-2+ (and now Modula-3?) version
of the OS interface, not because Topaz doesn't have virtual memory (it
does), but because the average Modula-2+ binary is several megabytes.

In Topaz, you create a descriptor for the new process, which represents
its relevant kernel state.  The descriptor is initialized to inherit
all state from the parent, and you can call library functions that
modify various parts of the descriptor; this is the equivalent of what
you would do between fork() and exec() in real UNIX.  Finally you make
a system call that presents the descriptor to the kernel for creation.
Yes, it's a bit more tedious, but it has all the required
functionality, unlike (it seems to me) the proposed qfork() with its
not-well-understood restrictions on modifying memory.
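
The descriptor style can be sketched in Python as below.  This is not
the Topaz interface: the class and method names are hypothetical, and
the sketch leans on os.posix_spawnp() (Python 3.8 or later, POSIX
systems only) as a stand-in for the final "present the descriptor to
the kernel" call.  It covers only the environment and file
descriptors; the real thing would carry more state.

```python
import os


class ProcDescriptor:
    """Collects the state a new process should start with."""

    def __init__(self, program, argv):
        self.program = program
        self.argv = list(argv)
        self.env = dict(os.environ)     # default: inherit from parent
        self.file_actions = []          # default: no descriptor changes

    def set_env(self, name, value):
        """Override one environment variable for the child."""
        self.env[name] = value

    def dup2(self, parent_fd, child_fd):
        """Make parent_fd appear as child_fd in the child."""
        self.file_actions.append(
            (os.POSIX_SPAWN_DUP2, parent_fd, child_fd))

    def close(self, child_fd):
        """Start the child with child_fd closed."""
        self.file_actions.append((os.POSIX_SPAWN_CLOSE, child_fd))

    def create(self):
        """Hand the collected state over and return the child's pid."""
        return os.posix_spawnp(self.program, self.argv, self.env,
                               file_actions=self.file_actions)
```

A caller would build the descriptor, adjust it with calls such as
d.set_env("GREETING", "hello") and d.dup2(w, 1), and only then call
d.create().  Nothing here runs in a half-built child process, so the
question of what memory may be modified never comes up.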

--Guido

--
Guido van Rossum, CWI, Amsterdam <guido at cwi.nl>
"Well I'm a plumber.  I can't act."

Volume-Number: Volume 22, Number 55