Is System V.4 fork reliable?

Jim Rosenberg jr at oglvee.UUCP
Thu Aug 2 10:16:23 AEST 1990


In <18478 at rpp386.cactus.org> jfh at rpp386.cactus.org (John F. Haugh II) writes:
>>In article <573 at oglvee.UUCP> jr at oglvee.UUCP (Jim Rosenberg) writes:
>>> The larger question is *why can't the kernel sleep* when it needs more memory
>>> for a fork???
>It isn't that the kernel =can't= sleep, but rather that someone decided
>[ for some totally random reason, I suspect ... ] that the kernel
>=shouldn't= sleep.

I agree with the sentiments 100%, obviously, but I fear life is not so simple.

>In the past, the kernel swapped the process if malloc() didn't return
>with space.  So, this is a change in function, not some defect in
>application code.

But that was in swapping days, before V.3's hideous virtual memory.
Historically the kernel has never multitasked internally.  But somewhere along
the way, System V acquired "kernel daemons" -- in spite of lacking generalized
internal primitives for synchronizing true threads.  As I understand it, to
this day the only ways to synchronize flow of execution in the kernel are
sleep/wakeup and spl (disabling interrupts).  These are *not good enough* for
generalized thread support.  Wakeups can be lost, as opposed to ups on a
semaphore.  Somebody decided that the dirty work of moving reclaimable memory
pages to swap space could be handled by an asynchronous "kernel daemon".  Now
just exactly HOW, without generalized thread support, is this daemon to be
synchronized with the part of the kernel that needs the memory when you fork?
The answer, apparently, is that they are synchronized only by that arcane
black magic that has come to surround the kernel.  You just have to "know"
what you can get away with to avoid deadlock and race conditions, and when you
can't avoid them you throw out the door "the functionality that used to be
there" to paraphrase John.
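
To make the lost-wakeup point concrete, here is a toy, single-threaded C
model.  It is NOT kernel code: `chan`, `toy_sleep`, `toy_wakeup`, `sema`,
`sema_up` and `sema_try_down` are names made up for illustration.  All it
shows is the bookkeeping difference: a wakeup on a channel with no sleepers
leaves no trace, while an up on a semaphore is remembered in its count.

```c
/* Toy model only: contrasts an event-style wakeup with a counting semaphore.
 * All names here are hypothetical; nothing below is a real kernel interface. */

struct chan { int sleepers; };   /* processes currently asleep on the channel */
struct sema { int count; };      /* ups not yet consumed by a down */

/* Record that a process has gone to sleep on the channel. */
void toy_sleep(struct chan *c)
{
    c->sleepers++;
}

/* Wake everyone asleep on the channel; returns how many were woken.
 * With no sleepers, nothing is recorded: the event is simply lost. */
int toy_wakeup(struct chan *c)
{
    int woken = c->sleepers;
    c->sleepers = 0;
    return woken;
}

/* An up is remembered even if nobody is waiting yet. */
void sema_up(struct sema *s)
{
    s->count++;
}

/* Non-blocking down: returns 1 if an earlier up was consumed, 0 otherwise. */
int sema_try_down(struct sema *s)
{
    if (s->count > 0) {
        s->count--;
        return 1;
    }
    return 0;
}
```

In this model, a toy_wakeup() issued before the matching toy_sleep() wakes
nobody, and the process that sleeps afterwards stays asleep.  The same
ordering with sema_up() followed by sema_try_down() succeeds, because the
count carried the event across the gap.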

*THIS* is the reason for rewriting the kernel:

In <13435 at smoke.BRL.MIL> gwyn at smoke.BRL.MIL (Doug Gwyn) writes:
> Oh, good grief.  It is SILLY to say that the kernel should be redesigned
> to compensate for bugs in application programs.

Lighten up, Doug.  Some mighty heavy folks thought the kernel *needed*
threads.  I'd love to see some of the folks who've been lecturing me on KISS
and policy and whatnot say those things to the people who *were working* on
V.5 -- which *did have* kernel threads.  The V.5 project was not stopped (was
it stopped? :-)) for technical reasons but for political reasons.  Are the
technical reasons that got it started suddenly wrong?

I am truly mortified that **NO ONE** has posted an explanation of just what
the deadlock would be if the kernel did allow sleep in allocating memory for a
fork.  I fear no one understands it any more.  Dammit, Doug, it's not that I'm
too lazy to read a man page or to rewrite my fork code to do the right thing.
The point I'm making is that this functionality which we've lost since
swapping days is but the tip of a future iceberg.  The kernel *can* sleep on
the availability of a block in the buffer cache.  (Yeah, I know, there is no
more buffer cache in V.4, VM does it all ...)  If the page stealing daemon and
the "main part" of the kernel could communicate properly, using something
better than sleep/wakeup, then perhaps the kernel *could sleep* on the
availability of a memory page.  John & I & other folks are saying that we
thought this was the sort of thing kernels were supposed to be able to do.
They could once.  (BSD still can?)  If the System V kernel could, it would be
a better kernel.
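
For reference, the application-side workaround under discussion ("rewrite my
fork code to do the right thing") might look roughly like the sketch below.
It assumes that fork() reports a transient resource shortage as EAGAIN; check
the man page for your system, since some also use ENOMEM.  The helper name
`fork_retry`, the retry limit, and the backoff values are all hypothetical.

```c
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: call fork(), and if it fails with EAGAIN, sleep and
 * try again, up to max_tries attempts in total.  Returns what fork() returns:
 * 0 in the child, the child's pid in the parent, -1 on failure. */
pid_t fork_retry(int max_tries)
{
    unsigned int delay = 1;      /* seconds to wait before the next attempt */
    pid_t pid = -1;
    int i;

    for (i = 0; i < max_tries; i++) {
        pid = fork();
        if (pid >= 0 || errno != EAGAIN)
            return pid;          /* success, or an error not worth retrying */
        if (i + 1 < max_tries) {
            sleep(delay);        /* user-level stand-in for the kernel sleeping */
            if (delay < 16)
                delay *= 2;      /* back off, capped at 16 seconds */
        }
    }
    errno = EAGAIN;              /* sleep() is not guaranteed to preserve errno */
    return pid;
}
```

This is polling, not sleeping on an event: the process wakes up on a timer
whether or not any memory has been freed, which is exactly the kind of thing
a kernel that could sleep on page availability would make unnecessary.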

And now folks are talking about bolting symmetric multiprocessing onto this
ballooning kernel.  If we can't get a clear explanation of just what the issue
is as to why the kernel can't sleep on more memory for a fork, what's going to
happen when all this is running on multiple processors?  Since I've provoked a
good deal of theological spouting, I can't resist asking what happened to the
old idea that UNIX was supposed to be *understandable*?

Meantime, if architects think they can get away with kernel threads without
full support for them, and we consumers of UNIX point out that by golly they
didn't really quite get away with it, hey, it's our bucks, we're entitled.
When I found that V.4, like V.3, uses an asynchronous page-stealing daemon, it
made me nervous.  I'm still nervous.
-- 
Jim Rosenberg             #include <disclaimer.h>      --cgh!amanue!oglvee!jr
Oglevee Computer Systems                                        /      /
151 Oglevee Lane, Connellsville, PA 15425                    pitt!  ditka!
INTERNET:  cgh!amanue!oglvee!jr at dsi.com                      /      /


