NFS, hung processes

Thu Aug 3 11:43:28 AEST 1989

In article <658 at skye.ed.ac.uk> richard at aiai.UUCP (Richard Tobin) writes:
>Another solution is to mount the filesystems in, say, /nfs, and have symbolic
>links to them from the places people actually refer to.  Then you can
>remove the symbolic links if the server is down.
>
>Even better, you can have a program do it.  Here's one I wrote recently.
>We've only just started using it, so it may not be bug-free.

     We considered implementing something like this.  Unfortunately, for our
usage of NFS, simply switching symbolic links is not enough.  Most of our
workstations NFS-mount their /usr partition.  All of the processes that are
referencing the original server remain hung even after the necessary symbolic
links are changed.
     I solved this problem by implementing NFS kernel changes on the client
workstations.  When an NFS request times out, the client consults a
user-specified list of equivalent servers to find an alternate server.  The NFS
request is automatically converted by the client workstation to be sent to the
alternate server.  Running programs do not even realize that a switch in
servers has been made.  The code works great for read-only NFS partitions
(such as /usr).  It is useless for handling remotely mounted home directories.
     I also made changes to the way NFS handles interruptible file systems
so that a user can interrupt an NFS request in a realistic amount of time.
(Current NFS implementations can take several minutes to act upon an interrupt
request.)
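
     The timeout-driven failover described above can be illustrated with a
rough user-space sketch in Python.  This is not the kernel code itself; the
names RequestTimeout, FailoverClient, equivalents and send are hypothetical,
and the transport is assumed to be a function that either returns a reply or
raises on timeout.

```python
class RequestTimeout(Exception):
    """Raised by the (assumed) transport when a server does not answer in time."""


class FailoverClient:
    """Illustrative only: retries a timed-out request on an equivalent server."""

    def __init__(self, equivalents, send):
        # equivalents: the user-specified list, as {original_server: [alternates]}
        # send: hypothetical transport, send(server, request) -> reply,
        #       raising RequestTimeout if the server does not respond
        self.equivalents = equivalents
        self.send = send
        self.current = {}  # original server -> server currently answering for it

    def call(self, server, request):
        # Start with whichever server last answered for this mount, then try
        # the original server and the remaining alternates in listed order.
        active = self.current.get(server, server)
        others = [server] + list(self.equivalents.get(server, []))
        candidates = [active] + [s for s in others if s != active]

        for candidate in candidates:
            try:
                reply = self.send(candidate, request)
            except RequestTimeout:
                continue  # this server is not answering; try the next one
            # Remember the switch so later requests go straight to it.
            # The caller only ever names the original server.
            self.current[server] = candidate
            return reply

        raise RequestTimeout("no equivalent server answered for " + server)
```

     The caller keeps naming the original server and never sees which machine
actually replied, which mirrors the point that running programs do not notice
the switch.  Blindly resending a request to another machine only makes sense
when every server holds the same read-only data, as with /usr.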

--
Bruce Cole
Computer Sciences Dept.
U. of Wisconsin - Madison