zs[0-3]: silo overflow

Sundar Narasimhan sundar at AI.MIT.EDU
Fri Jul 14 06:25:09 AEST 1989


Hi: This is a complicated problem, but so far we have had NO help
from Sun in figuring it out.

Our hardware consists of a Sun 3/280 system hooked up to a slave VME
system through an HVE 2000 VME-VME adaptor. On the slave VME we have a
number of 68020 single-board CPUs made by Ironics (usually 2 to 4).
Until recently this configuration worked fine. We could run programs on
the slave CPUs with no problem.

Recently, however, our Sun CPU board was upgraded to rev level 15. Since
then, whenever a slave processor tries to access memory on another slave
processor, the Sun freezes, reports a lot of zs[0-2] silo overflows, and
then crashes (usually with a bus error). Replacing the 3/280 with a 3/180,
or with a board at a lower rev level, seems to fix the problem, which leads
us to suspect the later rev changes and/or the 3/280 bus interface design.

We have determined that it is NOT a problem with our particular CPU board:
all the Sun CPU boards at rev level 15 or later exhibit the same problem. I
have spoken with the person who designed the VME-VME adaptor, and he
mentioned that Sun-4s and Sun-3Es had a problem with ignoring bus
requests, i.e. if they didn't get the bus within 40 microseconds, they
would just punt. This may not be the source of our problem, but I would
like to know:

a. Why does the kernel give us these silo overflow errors?

b. If the Sun cannot get the bus for an extended period of time (our slave
CPUs are set to Release-When-Done), what does it do? Why does the Sun need
the bus in the first place? (We have no other devices on the bus.)

c. Is there a way of hacking the hardware on the boards (by increasing the
timeout period, for example) that would solve this problem?

Anyone have any ideas?
Thanks,
Sundar
