Bug in two-node SunOS 4.0.3c + DNI 6.0 net

FAUCONNE at FRSIM51.BITNET
Mon Dec 11 10:18:06 AEST 1989


I recently posted a question about a problem with my configuration,
a SUN 4/110 serving a diskless SPARCstation 1, both running SunOS 4.0.3c
and DNI 6.0. It seems that my posting never made it to the list, and the
problem has since been identified. I think some people may find this
useful, so I'm answering my own question...

If you have a configuration similar to the above and no other TCP/IP
traffic on your Ethernet, then the SPARCstation 1 may hang when DNI
tries to change the local Ethernet address with:

    ifconfig le0 ether aa-xx-xx-xx-xx-xx

in /etc/rc.boot
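
For reference, the aa-... address in that ifconfig line is not arbitrary.
Assuming DNI follows the standard DECnet Phase IV convention (an
assumption; nothing above confirms it), the Ethernet address is
AA-00-04-00 followed by the 16-bit value area*1024 + node, low byte
first. A small sketch of that arithmetic for a POSIX shell; the function
name decnet_mac is made up for illustration:

```shell
#!/bin/sh
# Sketch only: derive a DECnet Phase IV style Ethernet address from an
# "area.node" DECnet address. decnet_mac is a hypothetical helper name.
decnet_mac() {
    area=${1%%.*}                       # text before the dot
    node=${1#*.}                        # text after the dot
    addr=$((area * 1024 + node))        # 16-bit DECnet address
    # Low-order byte goes first, then the high-order byte.
    printf 'aa-00-04-00-%02x-%02x\n' $((addr % 256)) $((addr / 256))
}

# decnet_mac 1.1 prints aa-00-04-00-01-04
```

So a node at DECnet address 1.1 would end up with an Ethernet address
different from the one in its PROM, which is why the address has to be
changed at boot time.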

The SPARCstation 1 then hangs with the message "NFS server node_name not
responding, still trying".

We worked hard on this one with the local Sun folks, and it turns out that
there seems to be an "ARP protocol deadlock" between the two stations in
this case. If you have other TCP/IP nodes on your Ethernet you will not
notice the deadlock, because ARP broadcasts from those nodes break it.
In our case, we have only DECnet and LAT traffic on the Ethernet, so the
SS1 was completely hung. (Actually, it sometimes restarted several
minutes later for no apparent reason.)

We found a workaround for that situation, but it's a real kludge: start a
background process on the *server* running the following script:

#! /bin/sh
#
# Kludge for the ARP deadlock: have the server ping itself every 10 seconds.
hostname=`/usr/bin/hostname`
while true
do
        # Ping myself, then wait a while
        /usr/etc/ping $hostname >/dev/null
        /bin/sleep 10
done

    (Please no flames, I'm a real novice at Un*x)

    It worked for me.
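
    The same loop can also be written as a function with the interval
    and the probe command as parameters, which makes it easier to try
    out. This is only a sketch: the names keepalive and PROBE are made up
    for illustration, and it assumes, as the script above does, that the
    probe command returns on its own after one check. On systems where
    ping keeps running until interrupted, PROBE would have to be set to
    something that sends a single probe.

```shell
#!/bin/sh
# Sketch only: parameterized form of the self-ping loop.
# keepalive and PROBE are hypothetical names, not part of SunOS or DNI.

# Command used to probe the host; defaults to the path used in the post.
PROBE=${PROBE:-/usr/etc/ping}

# keepalive HOST [INTERVAL] [COUNT]
#   Probe HOST every INTERVAL seconds (default 10).
#   COUNT limits the number of rounds; 0 (the default) means forever.
keepalive() {
    host=$1
    interval=${2:-10}
    count=${3:-0}
    rounds=0
    while [ "$count" -eq 0 ] || [ "$rounds" -lt "$count" ]
    do
        # Output is discarded; only the traffic matters here.
        $PROBE "$host" >/dev/null 2>&1
        rounds=`expr $rounds + 1`
        sleep "$interval"
    done
}

# Example (runs forever in the background, like the script above):
#   keepalive `hostname` 10 &
```

    Passing a COUNT makes it possible to check the loop by hand before
    leaving it running on the server.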
                                        -- Alain
                                           <fauconne at frsim51.bitnet>


