Microport '386 Unix

Steve Nuchia steve at nuchat.UUCP
Fri Feb 26 00:15:10 AEST 1988


From article <4280 at b-tech.UUCP>, by zeeff at b-tech.UUCP (Jon Zeeff):
> I don't think the drive numbers are in the messages.  There do seem to be a
> large number of non repeatable disk errors.  Does anyone else get frequent
> disk errors? 

I've been getting them since day one, December 86.  Running my machine
at 6 MHz I was getting them on a single ST4096 so bad that I could only
use half of it.  At 10 MHz, single drive, they went away.  Added a
second drive and they came back.

Sometimes the errors are really soft; other times the formatting
of a sector gets screwed up and you get a bad spot there until you
reformat the track.

Needless to say, the microjerks assured me they had never heard
of the problem before, that I was the only person reporting it,
that it was probably real bad spots on the disk that weren't on
the defect list, and that my controller (all three of them) was
at fault.

The best single piece of advice came from a very helpful and
pleasant person at Western Digital who suggested that I might
be overloading my power supply.  Sadly, that was not the case,
but it was a first class guess.

I've got my ST4096 as drive 0 with root, swap, and "files" on
it (in that order) and a ST251 as drive 1 with usr taking up
the whole thing.  My news spool is on /files and normal operations 
involve a lot of copying between /usr and /files.  Interesting
thing about it is that 99.9% of my errors land in the /files
filesystem.  I have to shut down at least once a week to fsck
/files, often finding a lot of problems (trashed iblocks - my
favorite!) but /usr and / seldom have anything serious wrong
with them.

I've started getting I/O errors on the swap partition lately,
which is a real drag (panic!).  The recentness of this development
correlates with the installation of the ramdisk driver - I never
swapped at all before that.

My best guess is that this is a timing foobar triggered when
switching from one drive to another and could probably be
fixed quite easily - like adding one line to the driver somewhere.
The problem I had before (single drive, 6MHz clock) correlated
with very long seeks and is probably coincidental rather than
related in any strong fashion.
-- 
Steve Nuchia	    | [...] but the machine would probably be allowed no mercy.
uunet!nuchat!steve  | In other words then, if a machine is expected to be
(713) 334 6720	    | infallible, it cannot be intelligent.  - Alan Turing, 1947


