Sun-Spots Digest, v6n276

William LeFebvre Sun-Spots-Request at Rice.edu
Mon Oct 31 04:13:00 AEST 1988


SUN-SPOTS DIGEST         Sunday, 30 October 1988      Volume 6 : Issue 276

Today's Topics:
                          Re: NFS mount mail (2)
                           Re: weird NFS hangup
                        Re: Some Benchmark Results
                            NAMED - YP Feature
                     netmask, yp service, and booting
                       route hangs, portmap crashes
                           Wren IV system hang
                             SCSI + SUN3/50?

Send contributions to:  sun-spots at rice.edu
Send subscription add/delete requests to:  sun-spots-request at rice.edu
Bitnet readers can subscribe directly with the CMS command:
    TELL LISTSERV AT RICE SUBSCRIBE SUNSPOTS My Full Name
Recent backissues are available via anonymous FTP from "titan.rice.edu".
For volume X, issue Y, "get sun-spots/vXnY".  They are also accessible
through the archive server:  mail the request "send sun-spots vXnY" to
"archive-server at rice.edu" or mail the word "help" to the same address
for more information.

----------------------------------------------------------------------

Date:    23 Oct 88 01:48:19 GMT
From:    hedrick at athos.rutgers.edu (Charles Hedrick)
Subject: Re: NFS mount mail (1)
Reference: v6n268

We've used mail over NFS for a long time, with little trouble.  However we
make sure all of the mail software is set up to do old-style locking (via
.lock files) rather than flock or lockf.  As far as I can tell from the
source, Berkeley Mail as shipped with 3.2 does in fact use .lock files.
You'd also want to make sure that your version of sendmail does.  (It had
better, since having Mail and sendmail use different locking mechanisms
would be trouble in any case.)  The Berkeley lock primitive, flock, works
only locally.  That is, it will protect the file against access by other
processes on the same machine, but not elsewhere on the network.  The
System V one, lockf, is supposed to work across the network.  However we
don't yet trust it.  Its state for the last year or two has always been
"yeah, we had problems with it, but there's this patch from Sun that is
claimed to fix it.  We don't yet know whether it works."  This status
seems to have continued in release 4.0.  So we think using .lock files is
safer.

We did make one change to Mail for network use.  There is code to handle
the case where you find a .lock file.  If it is more than 2 min. old you
figure the system crashed with the file locked, and just remove it.  On a
network, not all machines have the same time.  So we go through a little
dance to make sure that the comparison is done using only times with
respect to the server.  This was done while we were trying to track down a
problem, and we later found that the cause was something else.  However we
do recommend that you try to keep times in sync if you're using Mail over
NFS.

We've also done some security-related changes, so that we don't have to
leave /usr/spool/mail set so that the world can write it.  We leave it
group writable only, and make Mail setgid.  This requires making Mail put
the user's group back when he does certain things (like push to an
inferior shell).  Finally, we found a case where Mail exits without
removing a .lock file.  This causes Mail to hang the next time it is
invoked when the user exits.  This is not specific to use over NFS.

I have to confess that mail in general has been a major hassle for us.  We
try to support Berkeley Mail, RMAIL mode in Gnu Emacs, and Columbia's MM.
Getting them all to obey locks correctly, to not have any security holes,
and not to lose the user's mail when they are over disk quota or the disk
is full is almost a full-time job.  You'd think something as basic as mail
would be mature.  In fact there is very little Unix software that is
really safe when disk quotas are in use or when file systems can fill.
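
[A minimal sketch of the .lock-file scheme described above, not the
actual Mail source: the lock is created with O_EXCL, and the stale-lock
age is measured entirely in the server's time by stat-ing a scratch file
created next to the mailbox, so the client's clock never enters into the
comparison.]

    #include <stdio.h>
    #include <errno.h>
    #include <fcntl.h>
    #include <unistd.h>
    #include <sys/stat.h>

    #define STALE_SECS 120              /* the "2 min." mentioned above */

    /* Returns 0 once we hold mailbox.lock, -1 if we give up. */
    int lock_mailbox(const char *mailbox)
    {
        char lockname[1024], probe[1024];
        struct stat lock_st, probe_st;
        int fd, attempt;

        snprintf(lockname, sizeof(lockname), "%s.lock", mailbox);
        snprintf(probe, sizeof(probe), "%s.%d", mailbox, (int)getpid());

        for (attempt = 0; attempt < 10; attempt++) {
            /* (Real mailers often create a uniquely named file and
             * link() it to the .lock name instead, since O_EXCL is not
             * reliably atomic over older NFS.) */
            fd = open(lockname, O_WRONLY | O_CREAT | O_EXCL, 0444);
            if (fd >= 0) {
                close(fd);
                return 0;               /* we hold the lock */
            }
            if (errno != EEXIST)
                return -1;

            /* Lock already there: find out what time the *server*
             * thinks it is by creating a scratch file on the same
             * filesystem and looking at its mtime. */
            fd = open(probe, O_WRONLY | O_CREAT | O_TRUNC, 0444);
            if (fd < 0 || fstat(fd, &probe_st) < 0) {
                if (fd >= 0)
                    close(fd);
                return -1;
            }
            close(fd);
            unlink(probe);

            if (stat(lockname, &lock_st) == 0 &&
                probe_st.st_mtime - lock_st.st_mtime > STALE_SECS)
                unlink(lockname);       /* stale: holder probably crashed */
            else
                sleep(5);               /* genuinely locked; wait, retry */
        }
        return -1;
    }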

------------------------------

Date:    Mon, 24 Oct 88 10:08:14 EDT
From:    karl at triceratops.cis.ohio-state.edu (Karl Kleinpaste)
Subject: Re: NFS mount mail (2)

> From:    beck at svax.cs.cornell.edu (Micah Beck)
> 
> Can anyone out there provide an authoritative answer: is it safe to read
> mail using NFS in this fashion?  Is there a potential problem when mail is
> read while new mail is delivered?

In theory, I would say the answers are No, it is not safe, and Yes, there
is a potential problem.

In practice, I am extremely confident saying Yes, it is quite safe, and
No, there is no problem, because we've been doing precisely that for over
a year with only 2 reported instances of NFS-induced mail file garbling.

We have a single /usr/spool/mail partition which is NFS-mounted among 4
Pyramids (it lives physically on one of those), 200+ Suns and a dozen HP
9000s.  No two of these system types deliver mail via /bin/mail in the
same way, as regards locking.  Nonetheless, I get in the vicinity of 200
pieces of mail every day and have yet to be bitten by any problems.

Try it for a week when things are relatively calm (e.g., between
semesters/quarters) to see how it goes initially.  Then try
leaving it in place as the next heavy-load period comes.  I suspect you'll
just leave it there forever after.

--Karl
postmaster at cis.ohio-state.edu

------------------------------

Date:    23 Oct 88 02:31:52 GMT
From:    hedrick at athos.rutgers.edu (Charles Hedrick)
Subject: Re: weird NFS hangup
Reference: v6n268

Re the question about NFS hangups.  This is probably not your problem, but
I thought I'd mention it anyway.  In order to get enough performance to
page over Ethernet, NFS does some things that make great demands on the
Ethernet subsystem.  NFS sends bursts of 6 packets, with very little
spacing between them.  If you are running the default 4 copies of biod, it
may do 4 of these bursts at almost the same time, making up to 24 packets.
I'm not sure whether it's limitations in the controller chips, the device
drivers, or what, but most systems (not just Suns) find it very hard to
absorb these.

We have seen a number of cases where you get better performance by running
only one or two biod's on a workstation.  Normally the users complain of
lots of "NFS server ... not responding" with more or less immediate "now
OK".  Lowering the number of biod's causes it to go away.  We don't even
try 3/60's with 4 biod's, but I've had to go down to one biod on a 3/50
also.  (This was a machine whose user is paranoid.  He has emacs tuned so
it writes checkpoint files very often.  Thus there was a lot more write
activity on this client than on a normal one.)

I also saw a situation a couple of days ago where ie1 on a 3/280 suddenly
decided it didn't like even 6-packet sequences.  Attempts to write files
via NFS were failing solidly.  In order to make NFS work, I had to insert
a 2 millisecond pause between the packets.  (The traffic was going through
a cisco gateway using the new MCI card.  It is so fast that it has to have
a way to slow it down on request.  Thus I can control packet spacing.)
After the next reboot, the problem went away.  Don't ask me....

------------------------------

Date:    Sun, 23 Oct 88 19:51:40 PDT
From:    aoki at sun.com (Chris Aoki)
Subject: Re: Some Benchmark Results

Actually, the cache conflict occurs because of the placement of the string
compare function (strcmp=0xf77040b0) and the dynamic call linkage area
(__DYNAMIC=0x4000).  The strlen function is never used in dhrystone.

>What is interesting to note is that a Sun 4/260 running 3.2 is
>significantly faster than the same hardware running 4.0.  Anyone have any
>thoughts as to why this should be so?

What you're seeing is a cache effect caused by a change in the placement
of text and data for shared libraries.   The difference between the
"dhrystone" times seen in the 3.2 and 4.0 versions may be explained by
looking at the memory layouts of the two versions.

First, the 3.2 version (compiled cc -O3 on a sun4/260 running 3.2):

	00002020 t crt0.o
	00002020 T start
	00002090 T _main
	00002090 t dry.3.2.o
	000020b0 T _Proc0
	00002354 T _Proc1
	00002434 T _Proc2
	00002480 T _Proc3
	000024d4 T _Proc4
	000024ec T _Proc5
	0000250c T _Proc6
	000025b8 T _Proc7
	000025d0 T _Proc8
	000026a4 T _Func1
	000026d4 T _Func2
	00002750 T _Func3
	...			(dhrystone code ends here; libc code follows)
	000028c8 T _strcpy
	000028c8 t strcpy.o
	00002be8 T _strcmp
	00002be8 t strcmp.o
	...
	0000f3e8 B _end		(end of process image, before use of malloc)

Note that the whole process image fits in well under 128KB, including
storage allocated by malloc, though I don't show it here.  What this means
is that under 3.2, with cc -O3, the dhrystone benchmark executes entirely
from the cache.

In 4.0, the cc command gives you shared libraries by default.  This
implies that the C library will be mapped somewhere in the high end of
your address space; for dhrystone compiled cc -O3 on a sun4/260 running
4.0, with the default value of 'limit stacksize', this gives the following
memory map:

	00002020 t crt0.o
	00002020 T start
	00002288 T _main
	00002288 t dry.4.0.o
	000022a4 T _Proc0
	00002548 T _Proc1
	00002614 T _Proc2
	00002648 T _Proc3
	00002698 T _Proc4
	000026ac T _Proc5
	000026c8 T _Proc6
	00002770 T _Proc7
	00002784 T _Proc8
	00002854 T _Func1
	0000287c T _Func2
	000028f0 T _Func3
	00002d90 T _etext
	00004000 d __DYNAMIC
	....			(dhrystone code ends here; libc follows,
	....			 after a LARGE gap whose size is known
	....			 only at run time)
	f7702530 T _strlen
	f7702530 t strlen.o
	f770255c t s1algn
	f7702568 t s2algn	(addresses of strlen and friends determined
	f7702594 t s3algn	 at run time using adb; note that the addresses
	f77025a0 t nowalgn	 you get may differ).
	f77025e0 t done
	f77025f4 t done1
	f7702608 t done2
	f7702618 t done3
	....

What's important here is the fact that with shared libraries, the
dhrystone code must now share cache lines with the C library routine
"strlen", so that whenever one is executed, the other must be flushed.
Hence the difference.   If you compile dhrystone in 4.0 with the -Bstatic
option, you'll see the same times that you saw in 3.2, for what that's
worth.
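
[To make the collisions concrete: the sketch below assumes the Sun-4/260
cache is 128KB and direct-mapped, so two virtual addresses fall on the
same cache lines when they are (nearly) equal modulo the cache size.  The
cache organization is assumed here; the addresses are the ones quoted in
this message.]

    #include <stdio.h>

    #define CACHE_SIZE 0x20000UL        /* 128KB, assumed direct-mapped */

    static void show(const char *name, unsigned long addr)
    {
        printf("%-10s addr 0x%08lx  cache index 0x%05lx\n",
               name, addr, addr % CACHE_SIZE);
    }

    int main(void)
    {
        /* The pair named at the top of the message: strcmp's text maps
         * onto the dynamic call linkage area. */
        show("__DYNAMIC", 0x00004000UL);
        show("_strcmp",   0xf77040b0UL);    /* index 0x040b0 */

        /* The pair named just above: strlen's text maps into the
         * dhrystone text (0x2020-0x2d90). */
        show("_Proc0",    0x000022a4UL);
        show("_strlen",   0xf7702530UL);    /* index 0x02530 */
        return 0;
    }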

[Note that Dhrystone by itself is a poor predictor of system performance;
for example, it doesn't show the effects of shared libraries in real
workloads with multiple concurrent applications.  However, it does provide
a reasonably good predictor of performance on string copies of 30 or more
characters.]

------------------------------

Date:    Sun, 23 Oct 88 15:04:35 EDT
From:    jjb at zeus.cs.wayne.edu (Jon J. Brewster)
Subject: NAMED - YP Feature

We're running ypserv.share from the nameserver kit on a couple of 3/180's
under 4.0, and the nameserver software along with that.  One of the
3/180's is the yp master, and the other is a slave yp server.  I don't
seem to be able to get the slave server to query the nameserver if the
hostname isn't found in the local table.  The map on the master (as seen
with makedbm -u) contains an extra entry, "YP_INTERDOMAIN", which the
slave's map does not have.

I've found a workaround which is to run "ypmake hosts" on the slave, after
copying the master's hosts file to the slave.  That gets me the
YP_INTERDOMAIN entry and the desired functionality.  This seems to defeat
the purpose of having a single yp master though...  Is there something
I've overlooked in setting up YP on the two machines?  Assuming that the
missing YP_INTERDOMAIN entry is the real reason for the problem on our
slave, is there a good reason why yppush doesn't copy it to the slave
server?

------------------------------

Date:    Sun, 23 Oct 88 16:37:36 PDT
From:    lsf at moissac.ucsb.edu (Sam Finn)
Subject: netmask, yp service, and booting

We are running a 3/280 server with approximately ten 3/50 clients and one
4/110 client (the numbers are increasing almost daily). Our network is
part of
a larger, subnetted network; so, we need to run a netmask (class c on a
class b network).  This leads to a problem involving yellow pages and
server/client booting. Below I will outline what we are doing, and then
describe what effects it seems to have (and then end with a plea for
help).

To set the netmask, we make the appropriate entry in the /etc/netmasks
file on the server, and remake the appropriate yp database (cd /var/yp;
make netmasks). At boot time, /etc/rc.boot does an ifconfig to bring up
the network interface; however, it does not set the netmask until after
the yp services are brought up in the /etc/rc.local file. Thus, this
should take care of setting the netmask during a reboot on both server and
client. Now comes the problem. The reboot of the server hangs during the
ifconfig that sets the netmask --- the problem being that ifconfig, noting
that yp services are up, is making a yp inquiry for the netmask and yp is
not responding. This same problem occurs for all services that call yp
throughout the remainder of the /etc/rc.local file. We can boot the server
by commenting out the ifconfig line that sets the netmask, and set it
manually once we have finished booting; however, then clients won't boot.
One solution we have found is to set the netmask in rc.boot on the server,
leaving the ifconfig line commented out in rc.local. This seems to permit
both server and clients to boot. (Note that in all the preceding, only
the server rc* files have been diddled --- the clients have been left
alone). 
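
[For readers not used to subnetting, the sketch below shows what the
class-C-style mask on a class B network is doing, using hypothetical
addresses rather than this site's: with a 255.255.255.0 netmask the third
octet of the address selects the subnet, and both the subnet number and
its broadcast address fall out of the mask.]

    #include <stdio.h>
    #include <arpa/inet.h>
    #include <netinet/in.h>

    int main(void)
    {
        struct in_addr a;
        in_addr_t host    = inet_addr("128.111.72.15");  /* hypothetical */
        in_addr_t netmask = inet_addr("255.255.255.0");  /* "class C" mask */

        a.s_addr = host & netmask;              /* subnet number    */
        printf("subnet:    %s\n", inet_ntoa(a));
        a.s_addr = (host & netmask) | ~netmask; /* subnet broadcast */
        printf("broadcast: %s\n", inet_ntoa(a));
        return 0;
    }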

So, a couple of questions: first and foremost, are we missing something;
i.e., is there something that we should have done but have not? Second, has
anyone run across this problem before, and (if so) have they found a
satisfactory understanding of it and a solution? Third, is our temp fix of
setting the netmask in the ifconfig of rc.boot likely to mess anything up,
or does it qualify as a safe fix?

Finally, an open comment to Sun Microsystems: You really need to do
something about your 1-800-USA-4SUN service --- waiting 5+ days for the
return of a service call is unacceptable  for a service sold with the
promise of 2-4 hour turn-around on trouble calls. This is not an isolated
occurrence, but has been going on now for several weeks. The release of a
major version of the operating system and considerable expansion are
explanations for the lack of service, but they are not excuses. 

Well, 'nuff said. Respond directly to me, and I will summarize for
Sun-Spots. I recommend that you respond to the e-mail address
U8VY at CORNELLF.tn.cornell.edu, as the address I am sending this from is not
well known to the world at large.

Thanks,
Sam Finn

------------------------------

Date:    Sat, 22 Oct 88 17:23:37 CDT
From:    slevy at uf.msc.umn.edu (Stuart Levy)
Subject: route hangs, portmap crashes

We've run into problems like these, maybe they have the same causes as
yours.

We've seen route (and other programs) hang when they were trying to do
name->address or address->name translation, AND we were using the ypserv
-i YP-to-name-server linkage AND the root name servers were inaccessible.
There are two causes.

One, most 4.2-derived programs including route don't check their
parameters for numeric addresses until they've first tried gethostbyname.
The standard hosts.byname code in ypserv doesn't check for numeric
addresses being looked up as host names, and so sends every such request
to the name servers.  If the servers don't reply, the request is retried
*forever*.

Fortunately you can fix this by installing the ypserv from the Sun Name
Server Kit, which checks for numeric addresses.  It's nice enough to
construct fake host records for them (with the proper addresses) so even
commands that aren't prepared for numerics, e.g.  /etc/arp, will magically
start accepting them if you use that copy of ypserv.  This works at least
for the 3.x ypserv, not sure about the 4.0 copy also in the kit.  Sun
(Bill Nowicki in particular) did a nice job here.
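
[The idea, sketched below with made-up code rather than the actual ypserv
source: recognize a dotted-quad "host name" before it is ever forwarded
to the name servers, and hand back a fabricated host entry built from the
number itself.]

    #include <string.h>
    #include <netdb.h>
    #include <sys/socket.h>
    #include <netinet/in.h>
    #include <arpa/inet.h>

    /* Return a fake hostent if "key" is a numeric address, else NULL so
     * the caller falls through to the normal hosts.byname / DNS path. */
    static struct hostent *numeric_lookup(const char *key)
    {
        static struct hostent he;
        static struct in_addr addr;
        static char *addr_list[2], *no_aliases[1];
        static char name[64];

        if (inet_aton(key, &addr) == 0)         /* not a dotted quad */
            return NULL;

        strncpy(name, key, sizeof(name) - 1);
        name[sizeof(name) - 1] = '\0';
        addr_list[0] = (char *)&addr;
        addr_list[1] = NULL;
        no_aliases[0] = NULL;

        he.h_name      = name;          /* canonical name is the number */
        he.h_aliases   = no_aliases;
        he.h_addrtype  = AF_INET;
        he.h_length    = sizeof(addr);
        he.h_addr_list = addr_list;
        return &he;
    }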

Two, the route command demands to do address->name translation on its
arguments so it can print their canonical names.  This is again OK if
you're not using the domain server linkage, but if you are and the servers
don't respond, you're totally stuck.  I don't think the Name Server Kit
fixes this.

We dealt with this by changing 'route' so it never attempts the
address->name translation, just prints the numbers.  Fortunately we had
source and could do this; it sounds like you also do, but people without
source might just have to rewrite route.
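
[A minimal sketch of that change, not the actual route.c: print the
address numerically and never call gethostbyaddr(), so route cannot sit
forever waiting for name servers that will not answer.]

    #include <sys/socket.h>
    #include <netinet/in.h>
    #include <arpa/inet.h>

    /* Used where route would normally look up and print a canonical
     * host name for a destination or gateway. */
    static const char *route_name(const struct sockaddr_in *sin)
    {
        /* The stock code tries gethostbyaddr() here, which blocks until
         * the resolver gives up; just return the dotted quad instead. */
        return inet_ntoa(sin->sin_addr);
    }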

With portmap we too encountered sporadic crashes.  It turned out they
occurred when the portmapper was handling a proxy call (e.g. broadcast RPC
services such as rusers) whose results overran their incorrectly-sized
buffer.  It suddenly started happening when we got enough users that
"rusers -l" yielded more than 2K of output.

The fix is to enlarge "buf" in portmap.c/callit() to be ARGSIZE bytes
long, which is the limit the corresponding XDR routine checks against.
This was broken at least in 3.3, I don't know about later releases.
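
[Roughly, the change amounts to the following; the surrounding code is
invented for illustration and the ARGSIZE value shown is a placeholder
for whatever portmap.c actually defines.  The one essential point is that
the reply buffer is ARGSIZE bytes, the same limit the XDR routine
enforces, rather than some smaller ad-hoc size.]

    #include <string.h>

    #define ARGSIZE 9000        /* placeholder; use portmap.c's own value */

    /* Stand-in for the reply handling inside callit(). */
    static void handle_reply(const char *reply, size_t len)
    {
        char buf[ARGSIZE];      /* previously smaller than ARGSIZE, so
                                 * big broadcast replies overran it */

        if (len > sizeof(buf))  /* cheap insurance on top of the resize */
            len = sizeof(buf);
        memcpy(buf, reply, len);
        (void)buf;              /* real code forwards buf from here */
    }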

	Stuart Levy, Minnesota Supercomputer Center
	slevy at uc.msc.umn.edu

------------------------------

Date:    Sun, 23 Oct 88 10:54:45 PDT
From:    blia.UUCP!blipyramid!mike at cgl.ucsf.edu (Mike Ubell)
Subject: Wren IV system hang

We installed a CDC Wren IV 300MB drive on a 3/60 running SunOS 3.4.
Using the formatting information provided in sun-spots seemed to work, but
when the system is used it will hang after about 5 - 15 minutes.
Sometimes it is stone dead and sometimes it will echo and ping as if it
were up but will not execute anything.  Does anyone have any suggestions
on diagnosing the situation?  I suspect our home-made SCSI cable may be
flaky, but I would expect that we should see disk errors if the sd driver
is at all awake (does it implement timeouts on commands?).

P.S.  As far as I can figure out, the sectors/track and cylinders
recommended in sun-spots are not correct, but it should not matter given
that SCSI doesn't really care so long as you don't indicate that there are
more sectors on the drive than are really there.
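
[The arithmetic behind that remark, with made-up numbers rather than the
Wren IV's real geometry: the label is harmless as long as the sectors it
claims, ncyl * nhead * nsect, do not exceed what the drive actually has.]

    #include <stdio.h>

    int main(void)
    {
        long ncyl = 1540, nhead = 9, nsect = 46;    /* hypothetical label */
        long drive_sectors = 640000;                /* hypothetical drive */
        long label_sectors = ncyl * nhead * nsect;

        printf("label claims %ld sectors, drive has %ld: %s\n",
               label_sectors, drive_sectors,
               label_sectors <= drive_sectors ? "ok" : "too many");
        return 0;
    }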

P.P.S.  Does anyone know a good source of Sun-style SCSI cables that
adapt to ribbon cable edge connectors?

------------------------------

Date:    Thu, 20 Oct 88 14:05:00 BST
From:    mcvax!ux.cs.man.ac.uk!ian at uunet.uu.net
Subject: SCSI + SUN3/50?

Would anyone care to share their experiences of interfacing a SCSI disc to
a Sun 3/50 to make a cheap(ish) stand-alone system?  I know you can buy
this config. from Sun; I am interested in the possibility of buying the
SCSI disc from another (cheaper) vendor.

Thanks
 Ian Cottam

------------------------------

End of SUN-Spots Digest
***********************


