ethernet interface crash

Sat Jul 9 02:13:36 AEST 1988

In article <736 at cunixc.columbia.edu> cck at cunixc.columbia.edu (Charlie C. Kim) writes:
>In article <3422 at ut-emx.UUCP> boerner at ut-emx.UUCP (Brendan B. Boerner) writes:
>>
>>Has anyone out there received the following message:
>>  ae0:  overflow NIC reset failed
>>  ae6_intr:  Receive overflow warning.
>>....
>
>Yes, I've been waiting to see if anyone else had this problem.  This
>happens every time I leave my mac booted for any period of time.  I
>believe it happens as a result of many closely spaced packets causing the
>board to go into a bad hardware state that the driver cannot reset...
>

I started having this problem when our network was re-arranged here, and it
was so bad that I couldn't do any networking.  The new configuration had
placed me on a very busy portion of the network at CMU.

Part of the problem was traced to broadcasts from my A/UX machine, which
hundreds of machines on campus were answering.  The steps below don't
completely solve the problem, but they resulted in a big improvement.

1) In /etc/inittab I turned off nfs0 (the release notes tell you to turn
this on even if you're not running NFS).  I haven't noticed any loss of
functionality after turning it off.

2) I created a file /etc/resolv.conf, listing three domain name servers, so
my machine doesn't broadcast domain name resolution requests.  See the
manual entry for resolver(4).

3) I changed my broadcast address from 128.2.0.0 to 128.2.255.255 (most of the
machines in the CS department here are still using 128.2.0.0, but the plan
is to move to 128.2.255.255).  This is a temporary fix, relying on the fact
that fewer machines are currently responding to broadcasts on the new
address.
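
To illustrate the two broadcast conventions in step 3, here is a minimal
sketch in Python.  It assumes 128.2 is treated as a single unsubnetted
class B network (a /16); that prefix length is my assumption, inferred
from the two addresses above, and the variable names are only for
illustration.

```python
import ipaddress

# Assumption: 128.2.0.0 is one unsubnetted class B network (/16).
net = ipaddress.ip_network("128.2.0.0/16")

# Older convention: host part all zeros, same as the network address.
zeros_broadcast = net.network_address

# Newer convention: host part all ones.
ones_broadcast = net.broadcast_address

print("all-zeros broadcast:", zeros_broadcast)
print("all-ones broadcast: ", ones_broadcast)
```

A machine configured for only one of these forms will not treat the other
as a broadcast, which is why the two groups of hosts react differently.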

So now, my machine does less broadcasting, and because of the
change of broadcast address, receives fewer replies when it does broadcast.

The only remaining problem, which happens much less frequently, is that
100+ other machines on the network don't recognize the 255.255 broadcast
address.  When they receive such a broadcast (from my machine or others),
they respond by sending ARP requests, and this flood of ARPs still brings
my networking down.

The fact remains that the hardware and low-level software should be able to
handle this level of traffic.  Does anybody know if the acknowledged
"defect" in the EtherTalk boards could manifest itself in this way?

John Pane
Department of Computer Science
Carnegie Mellon University
(412)268-5884

pane at cs.cmu.edu