UDA50/RA81 problems....

dave at RIACS.ARPA
Tue Aug 14 04:03:00 AEST 1984


From:  "David L. Gehrt" <dave at RIACS.ARPA>

The distributed Berkeley drivers I have seen are buggy.  We noticed
poor[er than was reasonable] throughput on our 81's, and another site
here at Ames was having a serious random offline problem.  We took a
dynamic look at some of the data structures in the driver, and
discovered that under heavy load, the controller was being flooded with
Get Unit Status (M_OP_GTUNT) commands.  If you look into the driver you
will see a block of code in udstart() which begins with an "if ((i =
ubasetup(..." and ends by sending a Get Unit Status command.  The
effect of this block of code is the flooding to which I referred above.
Removing this behavior clears up a lot of the problems if not all, but
we made enough changes that the context diffs are about the same size
as the driver.
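
To make the flooding concrete, here is a toy model in C.  It is not
the Berkeley driver and not our fix: every name in it (toy_ctlr,
toy_run, and so on) is made up for illustration, and it assumes the
block in question fires whenever the resource setup fails under load,
which may not match the real code exactly.  It only compares how many
Get Unit Status commands reach the controller when each failed attempt
sends one, against a version that allows one outstanding command at a
time.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: none of these names come from the Berkeley driver. */
struct toy_ctlr {
    int  status_cmds_sent;  /* Get Unit Status commands issued so far */
    bool status_pending;    /* one has been sent and not yet answered */
};

/* Stand-in for resource setup; assumed to fail while under load. */
bool toy_setup_ok(bool under_load)
{
    return !under_load;
}

/* Flooding behavior: every failed setup sends another status command. */
void toy_start_flooding(struct toy_ctlr *c, bool under_load)
{
    if (!toy_setup_ok(under_load))
        c->status_cmds_sent++;
}

/* Guarded behavior: at most one status command outstanding at a time. */
void toy_start_guarded(struct toy_ctlr *c, bool under_load)
{
    if (!toy_setup_ok(under_load) && !c->status_pending) {
        c->status_pending = true;
        c->status_cmds_sent++;
    }
}

/* The controller answered the outstanding status command. */
void toy_status_done(struct toy_ctlr *c)
{
    c->status_pending = false;
}

/*
 * Make `attempts` start calls under load.  The controller answers after
 * every `reply_every` attempts (0 means it never answers).  Returns the
 * number of status commands that were sent.
 */
int toy_run(int attempts, bool guarded, int reply_every)
{
    struct toy_ctlr c = { 0, false };
    int i;

    assert(attempts >= 0 && reply_every >= 0);
    for (i = 0; i < attempts; i++) {
        if (guarded)
            toy_start_guarded(&c, true);
        else
            toy_start_flooding(&c, true);
        if (reply_every > 0 && (i + 1) % reply_every == 0)
            toy_status_done(&c);
    }
    return c.status_cmds_sent;
}
```

In this model toy_run(100, false, 10) sends 100 commands, while
toy_run(100, true, 10) sends 10.  The real driver has more state to
worry about than a single flag, so treat this as a picture of the
problem rather than a patch.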

The driver we are currently running works just fine, and has dumpcode
(which the distributed code lacked). We haven't gotten around to adding
support for more than one device type at a time so our ra60 is not yet
installed. The other site here started running the driver and its
serious random offline problem went away.  There are a number of sites
which have picked up the code for our driver and none have reported
back any problems as of this writing.  Neither of our sites had any
microcode upgrades, but the legend is that early versions of the
microcode caused all sorts of problems. We have seen a number of modified drivers
all of which look like they would solve the problem.

We have had plans to add bad block forwarding to our driver for six
months, and have received some code which will advance that effort.
I'll report any successes in this location.  The problem with the
effort has been lack of time and lack of a source for reliable
information in support of the activity, which brings me to a...

Minor Flame:  After all the time in the field with this hardware (we
have had our ra81s for almost a year), I am more than a little
disappointed at the small amount of reliable information on the
uda50/ra?? combination in the hands of the DEC field service folks and
the users, and with the large amount of misinformation and legend we
all seem to be given. Here are a couple of legends I think are or were
widespread and false:

	1.  "UN*X (TM) scribbles all over the rct (replacement and
	caching table) used for bad block forwarding."  [Not in *any*
	UN*X driver I have seen.]
	
	2.  "The controller forwards bad blocks automatically."  [I have
	seen nothing that indicates that this is true, and lots of bad
	block reports to indicate that the controller is not forwarding
	them.  In VMS, for example, the host seems to initiate all
	bad block forwarding.]

Flame off.  

Because the devices are new and, except for a couple of little
problems, have been reliable and quick, the fact that the BSD
distributed drivers I have seen are not correct is very troublesome.
Also, it is beginning to look like the users of these devices need to
establish their own communications path to disseminate information on
the devices and their drivers.  DEC has a clear interest in not
disclosing too much about the protocol used and other technical details
to keep out the competition, but judging from the number of pieces of
mail here which start with some variation on the theme "Help with
UDA50/RA81 problems!!!" it is clear that there is a need to improve the
information flow. So here is a start.  I have a 4.2 driver which works
fine.  [There is no way I know of to determine if it is completely
correct, or the most efficient implementation.]  Also, I know of a site
with a working 4.1 driver, and I will try to get a copy of the diffs
for redistribution if there is sufficient interest.

I now relinquish the soap box, but I do feel better.

dave
----------


