Why use U* over VMS

Thu Nov 1 07:52:23 AEST 1990

Note: I apologize for leaving large portions of the previous messages
in this text, but I felt that they were needed to form the context
for my replies.

In article <8355 at tekgvs.LABS.TEK.COM>, terryl at sail.LABS.TEK.COM writes:
> In article <1809.272c3135 at dcs.simpact.com> kquick at dcs.simpact.com (Kevin Quick, Simpact Assoc., Inc.) writes:
> +Drivers:
> +--------
> +
> +Because the OS's are considerably different, driver writing is as well.  A
> +driver is, by definition, a three-way bridge between the operating system,
> +a device, and the application program.  If you are writing a device driver,
> +there are several significant differences to be aware of.  My general
> +impression is that the Unix environment for a driver is much simpler and
> +therefore easier to write to, whereas the VMS environment is more
> +complicated, but provides better tools.
> +
> +1. VMS has the concept of allocatable virtual memory, which may be obtained
> +   by calling system routines; most VMS device drivers use this technique
> +   for buffering data, etc.
> +
> +   Unix (usually) also has the concept of allocatable virtual memory (implying
> +   non-pageable, kernel space), but few Unix drivers actually use this
> +   technique.  The Unix drivers (that I've seen) usually pre-allocate a large
> +   chunk of memory at compile time and use that memory as needed.
> +
> +   The problem arises in that, while VMS returns an error indicating when
> +   no memory is available, Unix simply "sleeps".  This is effectively a
> +   suspending of the current process until memory becomes available.
> +   Unfortunately, the VMS driver does not depend on process context to
> +   execute, which brings us to point 2:
> +
> +2. In Unix, when an application issues a driver request, the PC is transferred
> +   to the appropriate driver routine in kernel and remains there, in execution,
> +   until the request completes, at which time the routine exits and the user
> +   level code resumes.  There is no Unix implementation of "no-wait",
> +   "asynchronous",  or "background" IO.
> +
> +   In VMS, the application issues a driver request.  That process is then
> +   placed in a wait state after a VMS structure is initialized to describe
> +   the request.  That structure is then passed through several parts of the
> +   device driver in several stages and interrupt levels to accomplish the
> +   request.  Each intermediary routine is free to do its work and then exit
> +   which returns to the OS.  When the user's request is finally completed,
> +   a specific "wakeup" is issued to the process with an output status.
>
>     Actually, no, the Unix driver does NOT "remain in execution" in the driver
> until the request completes. For disk drivers, as an example, what happens is
> that a request for a transfer is queued, and then the higher level OS code
> will wait for the transfer to complete, thus giving up the processor to another
> process so it may run. While it is true that some device drivers may do a busy
> wait, waiting for some command to complete while in the driver, these are
> usually commands that are known to complete in a very short amount of time,
> but they are usually the exception, and not the rule (like clearing error
> conditions).
>
>      As for the "no-wait", "asynchronous",  or "background" IO, at the user
> level, yes, that is true, but at the kernel level, it is possible to do this.
>

Actually, I should have clarified my point a little better.  I did not intend
to imply that the *host processor* continues to execute the driver code for
that request until the request completes, but that the *process's code thread*
does so.  This means that, as Terry stated, the process makes a request, the
driver pre-processes that request, the driver signals the device, and then
the driver tells Unix to suspend the current process and go handle someone
else until the device interrupt routine issues a wakeup, at which time the
requesting process is restarted from its suspended state and continues at
the next instruction to do post-processing and exit.
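
To make that flow concrete, here is a minimal user-space sketch in Python.
It is not kernel code: a thread stands in for the requesting process, a
timer stands in for the device interrupt, and a threading.Event stands in
for the sleep/wakeup channel.  The names Device and driver_write are
hypothetical and exist only for this illustration.

```python
import threading

class Device:
    """Hypothetical device: a timer thread plays the hardware interrupt."""
    def __init__(self):
        self.done = threading.Event()   # stands in for the sleep channel
        self.result = None

    def start(self, data):
        # Kick the device and return at once; the "interrupt" fires later.
        threading.Timer(0.05, self.interrupt, args=(data,)).start()

    def interrupt(self, data):
        # Interrupt routine: record the outcome, then wake the sleeper.
        self.result = data.upper()
        self.done.set()                 # the wakeup

def driver_write(dev, data):
    """Runs start to finish in the caller's thread, like a Unix driver entry."""
    request = data.strip()              # pre-process the request
    dev.start(request)                  # signal the device
    dev.done.wait()                     # sleep; other threads run meanwhile
    return "ok:" + dev.result           # post-process, then return

status = driver_write(Device(), " hello ")
```

The caller does not get control back until driver_write returns, yet the
processor is free for other work while the thread is blocked in wait().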

The difference between this scheme and VMS is that under Unix, the driver
routine must be very careful not to exit via a return statement until the
I/O request has been completely handled; if it needs to wait for the device
then it issues the "sleep" request to suspend the current process until the
device interrupts and restarts this process.  Under VMS, the OS automatically
suspends the process (assuming wait I/O) after the descriptive structure is
generated.  When driver code needs to wait for something, it tells VMS where
to resume when the event occurs and then simply exits -- back to the VMS OS.
When the specified event occurs, the specified routines are activated to resume
processing the I/O; the user process is not restarted until a driver routine
explicitly tells VMS that the I/O has been completed.
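
The VMS-style flow can be sketched the same way, again as a user-space
Python simulation and not as real driver code.  The Request class is a
hypothetical stand-in for the structure that describes the I/O, and
start_io, after_interrupt, complete_io and deliver_event are made-up names.
Each driver routine does its work, records where to resume, and returns.

```python
from collections import deque

class Request:
    """Hypothetical stand-in for the structure describing one I/O request."""
    def __init__(self, data):
        self.data = data
        self.status = None
        self.resume_at = None     # routine to run when the awaited event occurs
        self.completed = False

pending = deque()                 # requests waiting for a device event
trace = []                        # order in which driver routines ran

def start_io(req):
    trace.append("start_io")
    req.resume_at = after_interrupt   # tell the "OS" where to resume
    pending.append(req)               # then simply return to the "OS"

def after_interrupt(req):
    trace.append("after_interrupt")
    req.status = "ok:" + req.data.upper()
    req.resume_at = None
    complete_io(req)

def complete_io(req):
    trace.append("complete_io")
    req.completed = True              # only now may the requester be restarted

def deliver_event():
    """The "OS" side: an event occurred, so resume the oldest waiting request."""
    req = pending.popleft()
    req.resume_at(req)

req = Request("hello")
start_io(req)
early = req.completed     # False: the routine has returned, the I/O is pending
deliver_event()
```

No thread is ever blocked inside the driver here; everything the later
stages need travels in the Request object, not on a process's stack.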

> +3. Everything except interrupt handlers in Unix are written in "user"
> +   context, whereas only the preliminary portion of a VMS driver (the
> +   FDT routines) are in user context.
> +
> +   This means that all buffer validation and copying from the user space
> +   must be done from the FDT routines in VMS; copyout of data after the
> +   request completes is done by VMS based on addresses and flags in the
> +   structure describing the request.
> +
> +   This also means that, whereas a Unix driver has a relatively straight-
> +   forward code flow, with perhaps a sleep to await later rescheduling,
> +   the VMS environment is more versatile at  the cost of a much more
> +   complex code flow.
> +
> +   Care must be taken in a VMS driver not to access user level data and
> +   such from most of the driver, whereas a Unix driver must insure user
> +   context for kernel memory allocation, sleep requests, and much more.
>     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>
>      I'm a little confused here on what you call "user" context. To me, a
> "user" context consists of a transfer direction (eg a read or a write), a
> transfer count, a starting block number, and a buffer address (to use my
> previous disk analogy). That's it; nothing more, nothing less. Also, your
> comment about Unix "must insure user context for kernel memory allocation,
> sleep requests, and much more" is a little cryptic. All the driver has to
> do is validate the user's buffer address, and that the transfer is valid
> with respect to the disk geometry. Now Unix does provide both direct-from-
> user-space I/O, and also from the internal kernel buffers into the user-
> provided buffer address, but I'm still not sure what you mean by the above
> quoted remarks. For transfers into/out of the internal kernel buffers men-
> tioned above, the user context doesn't even come into play. It is taken care
> of at a higher level. For transfers directly into/out of the user's buffer
> address, again, most of that is taken care of at a higher level. By the time
> it gets down to the driver level, all the driver sees is a transfer direction,
> a transfer count, a starting block, and a buffer address. As far as the driver
> is concerned, there isn't much of a distinction between a kernel buffer address
> and a user buffer address.

More clarification:

"User" context, as I've used it, is intended to mean that you are not
executing on the interrupt stack, and that the user process that made
the I/O request is the "current" process as far as the scheduler (and
the pager, etc.) are concerned.  The better Unix driver documentation
will warn you not to use the kernel memory allocation routines from
interrupt level, since these routines usually "sleep" the current process
if no memory is available at that time.  When you are in interrupt
context, who knows what process is currently scheduled that you may be
suspending with this sleep function?  Or better yet, your interrupt
level is above that of the scheduler, so your sleep hangs the system
because it can't pass control to the scheduler to activate the next
process in the run queue!  Also, and for similar reasons, you should
never be copying data to or from the user's memory region at interrupt
level (unless you have previously, in user context, locked that memory
into the physical working set, and are accessing that physical location
in your interrupt routine), since you don't know what process is current
and what that process is keeping in that memory location.
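
The copying rule can be illustrated with a small Python simulation.  This
is only a sketch under assumptions: Kernel, Process, copy_to_user and the
handlers are hypothetical names, and the "kernel" simply refuses any copy
attempted at interrupt level or while some other process is current.

```python
class Process:
    def __init__(self, name):
        self.name = name
        self.buffer = None            # stands in for the process's user memory

class Kernel:
    def __init__(self):
        self.current = None           # process the scheduler considers current
        self.in_interrupt = False     # True while an interrupt routine runs

    def copy_to_user(self, proc, data):
        # Only legal in the user context of the process that owns the buffer.
        if self.in_interrupt or self.current is not proc:
            raise RuntimeError("not in the user context of " + proc.name)
        proc.buffer = data

kernel = Kernel()
requester, bystander = Process("requester"), Process("bystander")
kernel_buffer = {}                    # kernel-side staging area

def bad_interrupt_handler(proc, data):
    kernel.in_interrupt = True
    try:
        kernel.copy_to_user(proc, data)   # wrong: whose memory is mapped now?
    finally:
        kernel.in_interrupt = False

def interrupt_handler(data):
    kernel.in_interrupt = True
    try:
        kernel_buffer["data"] = data      # safe: touches kernel memory only
    finally:
        kernel.in_interrupt = False

def finish_read(proc):
    # Runs after the wakeup, back in the requester's own context.
    kernel.copy_to_user(proc, kernel_buffer.pop("data"))

kernel.current = bystander            # an unrelated process is running
try:
    bad_interrupt_handler(requester, b"block 7")
    rejected = False
except RuntimeError:
    rejected = True

interrupt_handler(b"block 7")         # stage the data instead
kernel.current = requester            # the scheduler restarts the requester
finish_read(requester)
```

A real kernel gives no such polite error; the check only makes visible
the mistake that would otherwise scribble on the wrong process's memory.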

The reason I have compared this to VMS is that all parts of a VMS driver
except the FDT routines may be considered to be in "interrupt context",
since they have been asynchronously scheduled independent of what the
current process on the system is, and therefore may not access "user"
space or other user data structures.

All of the above is doubly important when writing an asynchronous multiplexed
driver which can handle multiple user requests simultaneously; one user's
request may need to wait for some event, but unless you use "sleep" for
this scheduling, how can you tell which user context that code is in after
the event has occurred?
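
As a rough illustration of why "sleep" answers that question, consider the
following Python simulation.  It is a sketch, not a driver: Mux is a
hypothetical multiplexed driver with one threading.Event per channel, and
threads stand in for user processes.  Each suspended thread keeps its own
context on its own stack, so nothing has to be looked up after the wakeup.

```python
import threading

class Mux:
    """Hypothetical multiplexed driver with one wait channel per request."""
    def __init__(self, channels):
        self.events = {ch: threading.Event() for ch in channels}
        self.replies = {}

    def request(self, channel, user):
        self.events[channel].wait()       # sleep; `user` stays on this stack
        return user, self.replies[channel]

    def interrupt(self, channel, reply):
        self.replies[channel] = reply     # store the reply first
        self.events[channel].set()        # then wake that channel's sleeper

mux = Mux(["a", "b"])
results = {}

def user_process(channel, user):
    results[channel] = mux.request(channel, user)

threads = [threading.Thread(target=user_process, args=(ch, user))
           for ch, user in (("a", "alice"), ("b", "bob"))]
for t in threads:
    t.start()

# Completions arrive in the opposite order from the requests.
mux.interrupt("b", "reply-b")
mux.interrupt("a", "reply-a")
for t in threads:
    t.join()
```

Without the sleep, the driver would have to carry `user` (or its
equivalent) in an explicit per-request record and find it again when the
event arrives.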

--
	-- Kevin Quick,  Simpact Associates, Inc.,  San Diego, CA.
	   Internet: simpact!kquick at crash.cts.com