Setting up Home dirs...

Keith Moore moore at betelgeuse.cs.utk.edu
Fri Sep 21 22:28:22 AEST 1990


In article <1990Sep20.141822.26387 at cs.utk.edu> Dave Sill <de5 at ornl.gov> writes:
>In article <1990Sep20.053541.18081 at cs.utk.edu>, moore at betelgeuse.cs.utk.edu (Keith Moore) writes:
>>
>>We use amd instead of Sun's automount, for several reasons -- but mainly
>>because it's more flexible, more robust, and it runs on all of our machines.
>
>Where can one obtain and/or learn about amd?  What does it do?  Who
>wrote it?  What are the security issues?

1.  Amd was posted to comp.sources.unix some time ago.  Look in the
archives, or anonymous ftp to usc.edu and look in directory pub/amd.
TeX documentation is included in the package, with a much better
description than I can give here.
2.  Amd "mounts" itself on a directory (we use /amd), listens for NFS
RPC requests, and mounts other file systems as needed to simulate a
file system hierarchy under its mount point.  Periodically it polls
all of its file servers to see whether they are still up and running;
if not, amd marks the server as down.  Subsequent attempts by user
programs to open files on that server will fail.  If the file system
had been mounted with ordinary NFS hard mounts, the open attempt would
cause the user process to wait (perhaps forever) until the file system
came back to life.  Amd is very similar in function to Sun's
automount program, which comes with SunOS 4.x, Ultrix 4.x (I believe),
AIX 3.x, and probably other systems.
3.  Amd was written by Jan-Simon Pendry at Imperial College in London.
4.  ``Security considerations are not addressed in this memo.''  :-)

>>Our users' home directories (in the passwd file) are all of the form
>>/$color/homes/$user.  We don't imbed the name of the machine that does
>>the file service...because we want to have the freedom to move users around
>>between machines to balance load and disk usage between groups of users.
>>We use colors as partition names precisely because they are arbitrary.
>>Each machine has a symlink for each color from /$color/homes -> /amd/$color,
>>and the amd map associates a machine and disk partition with the particular
>>color.
>
>I'm easily confused, I guess.  Could you give a simple example with a
>couple servers and a couple clients?

Okay.  My home directory is (currently) /gold/homes/moore as defined
by the YP passwd database.  /gold/homes is a symlink (on every
machine) to /amd/gold.  The amd.map file (which defines how the space
under the /amd mount point is laid out) associates "gold" with server
alphard, and mount point /gold/homes on alphard.  On the other hand,
/ruby/homes is a symlink to /amd/ruby, and the amd.map file says to
mount /amd/ruby from cs.utk.edu:/ruby/homes.  We currently have 21
filesystems full of users' directory trees mounted in this way.

Our amd.map files are the same for all clients, but they do contain
conditional code so that, for example, amd doesn't try to mount
alphard:/gold/homes from alphard.
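
To make that concrete, the map entries would look something along
these lines (syntax per the amd documentation; the exact selectors and
options here are illustrative, not a copy of our real map):

```
# Hypothetical amd.map fragment for the /amd mount point.
/defaults	type:=nfs;opts:=rw,intr

# "gold" lives on alphard:/gold/homes.  The first location wins when
# amd evaluates the map on alphard itself, so alphard uses a local
# symlink instead of NFS-mounting its own disk; every other host
# falls through to the NFS mount.
gold	host==alphard;type:=link;fs:=/gold/homes \
	type:=nfs;rhost:=alphard;rfs:=/gold/homes

ruby	host==cs;type:=link;fs:=/ruby/homes \
	type:=nfs;rhost:=cs.utk.edu;rfs:=/ruby/homes
```

Moving /gold/homes to another server is then a one-line change to the
"gold" entry, which is exactly the flexibility described below.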

Having the directory name dissociated from the machine name gives
us lots of flexibility.  For instance, if alphard breaks for some
reason, we can unplug the SCSI disk that contains /gold/homes, plug it
in at another machine, change the amd map, and /gold/homes will
magically re-appear on all of the client machines.  We can also move
/gold/homes to a larger disk if space gets too tight, with minimal
impact on users.

When this gets reorganized (probably next summer), we will probably
rename everybody's home directory to /homes/$color, and then won't
need all of the symlinks.

>>(The ".../homes/..." part is an anachronism from the days when these were 
>>hard NFS mounts in /etc/fstab and the system would hang if you typed `pwd'
>>and any directory in any ancestor of your current directory happened to be 
>>an NFS mount point on a unreachable file server....Yuk!
>
>This is something that amd/automounter fixes?

Yep.  In fact, I'd guess that's why it was written.  Having the system
hang just because you try to type 'pwd' or 'df', or not being able to
log in just because the /var/spool/mail file server is down, are
*very* inconvenient.  Amd solves these problems rather neatly.
Programs still hang if they have files open on a file server that
goes down (perhaps to resume when the server becomes available
again), but file opens, stats, and other operations that require a
directory lookup fail instead of hanging.

You can think of amd as an ugly wart that tries to correct a major
design flaw of NFS.  NFS tries to pretend that accessing files across
a network introduces no failure modes beyond those of accessing files
on a local disk, so it doesn't let user processes detect and handle
those kinds of failures.  If your setup is
a handful of diskless (or dataless) workstations around a single file
server, it doesn't matter much -- you can't get anything done when the
server fails anyway.  On the other hand, if you have a dozen file
servers and a couple of hundred workstations, the failure of any
single machine had better not keep everyone from getting work done.

>>This scheme actually works remarkably well, but there are lots of little
>>things we've had to learn about.  The biggest problems we have found 
>>have been with mail -- sendmail isn't prepared to deal with the kinds of 
>>failure modes you run into in a distributed file system.   (e.g. What if 
>>a user's .forward file is missing because the file server that contains 
>>his home area is down?)  I've managed to solve these problems without 
>>patching sendmail by replacing the "local", "prog", and "file" mailers 
>>with small programs or shell scripts that do some error checking before
>>actually delivering the mail.  
>
>What are ``the "local", "prog", and "file" mailers''?

These are defined within sendmail.  The "local" mailer delivers mail
to local mailboxes in /var/spool/mail/whatever.  Usually it just calls
/bin/mail.  The "prog" mailer is invoked when a user forwards mail to
a program (like vacation) for further processing.  Usually it just
calls /bin/sh.  The "file" mailer is used when saving mail to a file
-- it's used for those situations where you want to log a copy of
every message sent to some particular user.
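
For reference, the stock mailer definitions in sendmail.cf look
roughly like this (flags and rulesets vary between vendors, and the
replacement path below is hypothetical, not our actual script name):

```
# Stock definitions: "local" hands off to /bin/mail, "prog" to /bin/sh.
Mlocal, P=/bin/mail, F=rlsDFMmn, S=10, R=20, A=mail -d $u
Mprog,  P=/bin/sh,   F=lsDFMe,   S=10, R=20, A=sh -c $u

# Replacing a mailer just means pointing P= (and A=) at a wrapper
# that does the error checking before real delivery:
Mlocal, P=/usr/local/etc/safe-local-mail, F=rlsDFMmn, S=10, R=20,
	A=safe-local-mail -d $u
```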

My replacements for these programs detect the condition that one or
more files needed to complete the task are not available.  The "local"
mailer requires that a user's home directory be present so it can
check the presence of the user's .forward file.  The "prog" mailer
makes sure that the program to be run is present.  The "file" mailer
checks to make sure that the log file exists.  All of these return the
exit code EX_TEMPFAIL if they cannot complete delivery -- this causes
sendmail to queue the mail and retry delivery later.
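
The core of such a wrapper is tiny.  Here is a minimal sketch of the
"local" mailer check (the function name and paths are illustrative,
not the actual cs.utk.edu script):

```shell
#!/bin/sh
# Sketch of a "local" mailer wrapper.  sendmail invokes this instead
# of /bin/mail; if the user's home directory is unreachable (e.g. the
# file server holding it is down), exit EX_TEMPFAIL so sendmail
# queues the message and retries delivery later.

EX_OK=0
EX_TEMPFAIL=75		# from <sysexits.h>

deliver_local() {
	user=$1
	home=$2
	# If the home directory is missing we can't safely check for
	# a .forward file, so ask sendmail to try again later.
	if [ ! -d "$home" ]; then
		return $EX_TEMPFAIL
	fi
	# Home directory is present; a real script would now
	# exec /bin/mail -d "$user" to do the actual delivery.
	return $EX_OK
}
```

The "prog" and "file" wrappers are the same idea with a different
test: -x on the program to be run, or -f on the log file.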

>>Other problems have been due to NFS mapping root->nobody on remote mounts.
>>Most recent NFS server implementations provide a way around this, but we
>>still have a few machines that don't fix this problem.  We therefore have
>>a special version of "calendar" that does an "su" to the owner of the
>>calendar file in order to read it, in case it's not readable by "nobody".
>
>Don't use no double negatives, Keith.  :-)
>
>>This version of calendar also does "ypcat passwd" instead of reading
>>the /etc/passwd file, so it scans directories for every user in the entire
>>passwd map...we have to make sure that only one system in the entire 
>>"cluster" runs calendar, else things slow down to a crawl.  We run it
>>on our mail server, since the mail that calendar generates will end up
>>there anyway.
>
>Again, I show my ignorance.  What's this "calendar" program?  And how
>can you use ypcat if you aren't running YP/NIS?

1.  Calendar is an ancient program that scans every user's home directory
looking for files named "calendar".  When it finds one, it looks for
any line which has today's date, tomorrow's date, or any date through
the coming weekend if today is a Friday.  If any lines from the
calendar file match these criteria, they are mailed to the user as a
reminder.
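
The per-user scan amounts to little more than a grep.  A rough sketch
(illustrative only; the real calendar program understands many date
formats and handles the Friday-through-weekend case):

```shell
#!/bin/sh
# Sketch of the per-user step of calendar(1): print every line of a
# user's calendar file that mentions the given date.  The date is
# passed in rather than computed, to keep the sketch simple.

scan_calendar() {
	file=$1
	today=$2	# e.g. "Sep 21"
	grep "$today" "$file" 2>/dev/null
}
```

The full program would run this (plus the tomorrow/weekend dates)
over every home directory, then mail any matching lines to the user.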

2.  We do run YP...but we don't use it for much except passwd and
group lookups.  We'd much rather use a real distributed database, but
we would have to get source code for every system we use and recompile
every program that calls getpwXXX() or getgrXXX() or whatever else
uses YP.  (The technical effort required is nothing compared to the
amount of time our lawyers would spend negotiating the source license
agreements with the various vendors.)  Anyway, now that our Suns have
shared libraries, we can simply write replacements for the functions
that do use YP, and link them in...  so we don't need source as much
as we used to.  We'd still have to have a YP server interface for all
of the oddball machines lying around.

Oh yes, the reason I mentioned YP at all in my earlier message is that
amd lets you distribute its map file via YP if you want to...but we
don't want to.

Hope I answered all of your questions (whew!)


Keith Moore			Internet: moore at cs.utk.edu
University of Tenn. CS Dept.	BITNET: moore at utkvx
107 Ayres Hall, UT Campus	Telephone: +1 615 974 0822
Knoxville Tennessee 37996-1301	``Friends don't let friends use YP (or NIS)''


