Standards Update, IEEE 1003.4: Real-time Extensions

Sat Sep 8 10:01:00 AEST 1990

From:  swart at src.dec.com (Garret Swart)

I believe in putting lots of interesting stuff in the file system name
space but I don't believe that semaphores belong there.  The reason
I don't want to put semaphores in the name space is the same reason
I don't want to put my program variables in the name space:  I want
to have lots of them, I want to create and destroy them very quickly
and I want to operate on them even more quickly.  In other words, the
granularity is wrong.

The purpose of a semaphore is to synchronize actions on an object.
What kinds of objects might one want to synchronize?  Generally the
objects are either OS supplied like devices or files, or user defined
data structures.  The typical way of synchronizing files and devices
is to use advisory locks or the "exclusive use" mode on the device.
The more difficult case, and the one for which semaphores were invented
and later added to Unix, is that of synchronizing user data structures.

In Unix, user data structures may live either in a process's private
memory or in a shared memory segment.  In both cases there are probably
many different data structures in that memory and many of these data
structures may need to be synchronized.  For maximum concurrency the
programmer may wish to synchronize each data structure with its own
semaphore.  In many applications these data structures may come and
go very quickly and the expense of creating a semaphore to synchronize
the data can be an important factor in the performance of the application.

It thus seems more natural to allow semaphores to be efficiently
allocated along with the data that they are designed to synchronize.
That is, allow them to be allocated in a process's private address
space or in a mapped shared memory segment.  A shared memory segment
is a much larger grained object:  creating, destroying and mapping one
can be much more expensive than creating, destroying or using a
semaphore.  These segments are also generally important enough to the
application to have sensible names.  Thus putting a shared memory
segment in the name space seems reasonable.
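
A minimal sketch of a semaphore allocated along with the data it
guards, here in private memory.  It assumes the unnamed semaphore calls
(sem_init, sem_wait, sem_post, sem_destroy) that later POSIX realtime
standards define; the interface under discussion in 1003.4 at the time
may differ.  The struct and function names are hypothetical.

```c
#include <errno.h>
#include <semaphore.h>
#include <stdlib.h>

/* Hypothetical record: the semaphore lives inside the data it guards. */
struct counter {
    sem_t lock;   /* binary semaphore used as a mutex */
    long  value;
};

/* Allocate the data and its semaphore together; no name is involved. */
struct counter *counter_create(void)
{
    struct counter *c = malloc(sizeof *c);
    if (c == NULL)
        return NULL;
    if (sem_init(&c->lock, 0, 1) != 0) {   /* pshared = 0: this process only */
        free(c);
        return NULL;
    }
    c->value = 0;
    return c;
}

/* Add delta under the semaphore and return the new value. */
long counter_add(struct counter *c, long delta)
{
    long result;
    while (sem_wait(&c->lock) != 0 && errno == EINTR)
        ;                                  /* retry if interrupted by a signal */
    c->value += delta;
    result = c->value;
    sem_post(&c->lock);
    return result;
}

/* Destroy the semaphore along with the data. */
void counter_destroy(struct counter *c)
{
    sem_destroy(&c->lock);
    free(c);
}
```

Creating or destroying such an object costs one malloc or free plus a
semaphore initialization, and nothing is entered in the name space.
On some systems the program must be linked with the threads library.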

For example, a data base library may use a shared memory segment named
/usr/local/lib/dbm/personnel/bufpool to hold the buffer pool for the
personnel department's data base.  The data base library would map
the buffer pool into each client's address space allowing many data
base client programs to efficiently access the data base.  Each page
in the buffer pool and each transaction would have its own set of
semaphores used to synchronize access to the page in the pool or the
state of a transaction.  Giving the buffer pool a name is no problem,
but giving each semaphore a name is much more of a hassle.
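
The buffer pool arrangement might be sketched as follows: one named
segment, mapped by every client, with one unnamed semaphore per page
stored inside the segment.  This again assumes the later POSIX
sem_init interface (with pshared set to 1) and an ordinary file mapped
with mmap; the sizes, the struct layout and the function names are
illustrative assumptions, not anything a real data base library is
known to use.

```c
#include <errno.h>
#include <fcntl.h>
#include <semaphore.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

#define POOL_PAGES 8        /* illustrative sizes, not from the post */
#define PAGE_BYTES 4096

struct page {
    sem_t lock;             /* unnamed semaphore stored beside its page */
    char  data[PAGE_BYTES];
};

struct bufpool {
    struct page pages[POOL_PAGES];
};

/* Map the named segment.  Only the creator initializes the semaphores. */
struct bufpool *bufpool_map(const char *path, int create)
{
    int fd = open(path, create ? (O_RDWR | O_CREAT) : O_RDWR, 0600);
    if (fd < 0)
        return NULL;
    if (create && ftruncate(fd, (off_t)sizeof(struct bufpool)) != 0) {
        close(fd);
        return NULL;
    }
    void *p = mmap(NULL, sizeof(struct bufpool), PROT_READ | PROT_WRITE,
                   MAP_SHARED, fd, 0);
    close(fd);              /* the mapping stays valid after close */
    if (p == MAP_FAILED)
        return NULL;

    struct bufpool *pool = p;
    if (create) {
        for (size_t i = 0; i < POOL_PAGES; i++) {
            /* pshared = 1: usable by every process mapping the segment */
            if (sem_init(&pool->pages[i].lock, 1, 1) != 0) {
                munmap(p, sizeof(struct bufpool));
                return NULL;
            }
        }
    }
    return pool;
}

/* Lock page n; returns 0 on success, -1 on error. */
int page_lock(struct bufpool *pool, size_t n)
{
    int r;
    if (n >= POOL_PAGES)
        return -1;
    do {
        r = sem_wait(&pool->pages[n].lock);
    } while (r != 0 && errno == EINTR);
    return r;
}

/* Unlock page n; returns 0 on success, -1 on error. */
int page_unlock(struct bufpool *pool, size_t n)
{
    if (n >= POOL_PAGES)
        return -1;
    return sem_post(&pool->pages[n].lock);
}
```

Only the segment has a name; the semaphores are found by their offset
within it.  The sketch leaves out the security and failure handling
that a real library would need, including what happens if a client
maps the segment before the creator has finished initializing it.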

[Aside:  Another way of structuring such a data base library is as
an RPC style multi-threaded server.  This allows access to the data
base from remote machines and allows easier solutions to the security
and failure problems inherent in the shared memory approach.  However
the shared memory approach has a major performance advantage for systems
that do not support ultra-fast RPCs.  Another approach is to run the
library in an inner mode.  (Unix has one inner mode called the kernel,
VMS has 3, Multics had many.)  This solves the security and failure
problems of the shared segments but it is generally difficult for mere
mortals to write their own inner mode libraries.]

One other issue that may cause one to want to unify all objects in
the file system, at least at the level of using file descriptors to
refer to all objects if not going so far as to put all objects in the
name space, is the fact that single threaded programming is much nicer
if there is a single primitive that will wait for ANY event that the
process may be interested in (e.g. the 4.2BSD select call.)  This call
is useful if one is to write a single threaded program that doesn't
busy wait when it has nothing to do but also won't block when an event
of interest has occurred.  With the advent of multi-threaded programming
the single multi-way wait primitive is no longer needed, since one can
instead create a separate thread for each event of interest, each
blocking until its event occurs and then processing it.  Multi-way
waiting remains a problem, though, if single threaded programs are
to get maximum use out of the facility.
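
For reference, a multi-way wait with select looks roughly like the
sketch below.  The helper name wait_any is hypothetical.  It shows
why file descriptors matter to a single threaded program:  only
objects that have a descriptor can be placed in the set, and a
semaphore that lives in memory has none.

```c
#include <sys/select.h>
#include <unistd.h>

/*
 * Block until at least one of the descriptors is readable.
 * Returns the index of the first readable one, or -1 on error.
 */
int wait_any(const int *fds, int nfds)
{
    fd_set readable;
    int maxfd = -1;

    FD_ZERO(&readable);
    for (int i = 0; i < nfds; i++) {
        FD_SET(fds[i], &readable);
        if (fds[i] > maxfd)
            maxfd = fds[i];
    }

    /* NULL timeout: sleep, without busy waiting, until an event occurs. */
    if (select(maxfd + 1, &readable, NULL, NULL, NULL) < 0)
        return -1;

    for (int i = 0; i < nfds; i++)
        if (FD_ISSET(fds[i], &readable))
            return i;
    return -1;
}
```

A multi-threaded program would instead give each descriptor, and each
semaphore, its own thread that blocks in read or in a semaphore wait.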

I've spoken to a number of people in 1003.4 about these ideas.  I am
not sure whether they played any part in their decision.

Just to prove that I am a pro-name space kind of guy, I am currently
working on and using an experimental file system called Echo that
integrates the Internet Domain name service for access to global names,
our internal higher performance name service for highly available
naming of arbitrary objects, our experimental fault tolerant, log based,
distributed file service with read/write consistency and universal
write back for file storage, and auto-mounting NFS for accessing other
systems.

Objects that are named in our name space currently include:

   hosts, users, groups, network servers, network services (a fault
   tolerant network service is generally provided by several servers),
   and every version of any source or object file known by our source
   code control system

Some of these objects are represented in the name space as a directory
with auxiliary information, mount points or files stored underneath.
This subsumes much of the use of special files like /etc/passwd,
/etc/services and the like in traditional Unix.  Processes are not
currently in the name space, but they will/should be.  (Just a "simple
matter of programming.")

For example /-/com/dec/src/user/swart/home/.draft/6.draft is the name
of the file I am currently typing, /-/com/dec/src/user/swart/shell
is a symbolic link to my shell, and /-/com/dec/prl/perle/nfs/bin/ls is
the name of the "ls" program on a vanilla Ultrix machine at DEC's Paris
Research Lab.

[Yes, I know we are using "/-/" as the name of the super root and not
either "/../" or "//" as POSIX mandates, but those other strings are
so uhhgly and /../ is especially misleading in a system with multiple
levels of super root, e.g. on my machine "cd /; pwd" prints
/-/com/dec/src.]

Things that we don't put in the name space are objects that are passed
within or between processes by 'handle' rather than by name.  For
example, pipes created with the pipe(2) call need not be in the name
space.  [At a further extreme, pipes for intra-process communication
don't even involve calling the kernel.]
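
As a small illustration of passing an object by handle, the sketch
below creates a pipe with pipe(2) and lets a forked child inherit the
descriptors; at no point does the pipe acquire a name.  The function
name is hypothetical and the error handling is kept to a minimum.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Send msg from a forked child to the parent over an unnamed pipe.
 * Returns the number of bytes the parent read, or -1 on error.
 */
ssize_t echo_through_child(const char *msg, size_t len,
                           char *out, size_t outlen)
{
    int fd[2];
    if (pipe(fd) != 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        close(fd[0]);
        close(fd[1]);
        return -1;
    }
    if (pid == 0) {
        /* Child: it holds the pipe only through inherited descriptors. */
        close(fd[0]);
        ssize_t w = write(fd[1], msg, len);
        _exit(w == (ssize_t)len ? 0 : 1);
    }

    /* Parent: read what the child wrote, then reap the child. */
    close(fd[1]);
    ssize_t n = read(fd[0], out, outlen);
    close(fd[0]);
    waitpid(pid, NULL, 0);
    return n;
}
```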

I personally don't believe in overloading file system operations on
objects for which the meaning is tenuous (e.g. "unlink" => "kill -TERM"
on objects of type process); we tend to define new operations for
manipulating objects of a new type.  But that is even more of a
digression than I wanted to get into!

Sorry for the length of this message, I seem to have gotten carried
away.

Happy trails,

Garret Swart
DEC Systems Research Center
130 Lytton Avenue
Palo Alto, CA 94301
(415) 853-2220
decwrl!swart.UUCP or swart at src.dec.com

Volume-Number: Volume 21, Number 91