adding bad blocks using 386/ix

Sun Mar 4 07:53:37 AEST 1990

In article <511211 at nstar.UUCP> larry at nstar.UUCP (Larry Snyder) writes:

   I tried adding the errors for the SCSI drive using mkpart -A <abs sector>
   and I get an error - No root partition.  I also tried "mkpart -A <sec> -f
   /etc/partitions" - likewise the same error, no the bad blocks were not added
   to /etc/partitions.  

This is completely wrong. You must first build the VTOC, with 'mkpart
-p', and then the 'rsrvd' and 'alts' partitionlets.

   Finally, I added the bad blocks manually to /etc/partitions, unmounted and
   remounted the device (I assume that /etc/partitions is read at mount time).

This is also completely wrong. Look at the 'last accessed time' of
/etc/partitions using 'ls -ltur'; mounting does not touch the file.

Bad block handling works like this: the bad block table is an
array of block numbers in the 'rsrvd' partitionlet on the disc. When the
disc is mounted, the table is read into memory by the driver.
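
To make the lookup concrete, here is a toy model in Python (not driver
code). It assumes, purely for illustration, that the i-th entry of the
table is replaced by the i-th block of the 'alts' partitionlet; the names
BadBlockTable, alts_base and resolve are made up for this sketch.

```python
class BadBlockTable:
    """Toy in-memory copy of the bad block table held in 'rsrvd'."""

    def __init__(self, alts_base, bad_blocks=()):
        # alts_base: first block of the 'alts' partitionlet (assumed layout)
        self.alts_base = alts_base
        # Plain array of bad block numbers, as read from the disc at mount.
        self.bad = list(bad_blocks)

    def resolve(self, block):
        """Return the block that should really be accessed."""
        try:
            # Assumption: entry i is replaced by spare number i.
            return self.alts_base + self.bad.index(block)
        except ValueError:
            # Not in the table: the block is used as-is.
            return block
```

With a table built as BadBlockTable(90000, [1042, 77310]), a request for
block 1042 would be redirected to 90000, while a block that is not listed
passes through unchanged.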

The on-disc table is *initialized* by mkpart, which reads the initial
map off the /etc/partitions file; after this, the contents of
/etc/partitions are ignored, unless you rebuild the VTOC.

By the way, you can print the current contents of the on-disc partition
table with 'mkpart -ta'.

The kernel, on encountering an IO error, whether soft or hard, will
automatically find a spare block (from the 'alts' partitionlet), update
the in-core bad block table, and write it back to the 'rsrvd'
partitionlet.

If you want to add a bad block to the list manually, an ioctl, used
by 'mkpart -A', allows you to trigger the mechanism yourself.

The original System V.3.2 from AT&T had a catastrophic bug, documented
in the utilities release notes: adding a bad block number manually
only updates the in-core kernel bad block table; the
on-disc bad block table is _not_ updated. This means that you will get
into big, big trouble, because the revectoring information will be lost
on every boot.
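
The difference between the two paths can be sketched with a toy model in
Python. This is only an illustration of the behaviour described above,
not the kernel's code; the classes Disc and Driver and their method names
are invented for the sketch.

```python
class Disc:
    """Holds the only state that survives a reboot: the table in 'rsrvd'."""

    def __init__(self, bad_blocks=()):
        self.rsrvd_table = list(bad_blocks)


class Driver:
    """Toy driver; creating one stands for mounting the disc after a boot."""

    def __init__(self, disc):
        self.disc = disc
        # The on-disc table is read into memory at mount time.
        self.in_core = list(disc.rsrvd_table)

    def revector_on_error(self, block):
        # Automatic path: update the in-core table, then write it back.
        if block not in self.in_core:
            self.in_core.append(block)
        self.disc.rsrvd_table = list(self.in_core)

    def add_manually(self, block):
        # Manual path with the bug as described: no write-back to 'rsrvd'.
        if block not in self.in_core:
            self.in_core.append(block)
```

In this model, a block added with add_manually() is present in in_core
until the next boot, but a fresh Driver(disc) no longer knows about it,
whereas a block handled by revector_on_error() is still there.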

I seem to remember that ISC, like most other vendors, have not corrected
this bug; it may still be there in 2.01, but I don't know for sure, as 
I am not familiar with ISC's latest releases.

This bug makes manual bad block assignment virtually impossible, or
dangerous (the only workaround is to keep a file with the bad block
information and reinvoke 'mkpart -A' for each of them at every boot up;
woe befall you if you fail to update this file correctly).
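
A minimal sketch of that workaround, in Python, might look like the
following. It only parses a list of sector numbers and builds the
'mkpart -A' command lines; it does not run them. The file format (one
absolute sector number per line, '#' comments) and the trailing device
name argument are assumptions of this sketch, so check them against your
own mkpart manual page.

```python
def parse_bad_blocks(text):
    """Return the sector numbers listed in text, in order, without repeats."""
    sectors = []
    for lineno, line in enumerate(text.splitlines(), 1):
        entry = line.split("#", 1)[0].strip()  # drop comments and blanks
        if not entry:
            continue
        # Refuse anything that is not a plain decimal number: a silently
        # skipped or misread entry is exactly the failure to avoid here.
        if not (entry.isascii() and entry.isdigit()):
            raise ValueError(f"line {lineno}: not a sector number: {entry!r}")
        sector = int(entry)
        if sector not in sectors:
            sectors.append(sector)
    return sectors


def mkpart_commands(sectors, device):
    """Build one 'mkpart -A <sector> <device>' argument list per sector."""
    # 'device' as a trailing argument is an assumption, not taken from
    # the article.
    return [["mkpart", "-A", str(s), device] for s in sectors]
```

A boot-time script would read the file, call parse_bad_blocks() on its
contents and hand each resulting argument list to something like
subprocess.run(); a malformed line stops the whole run rather than being
ignored.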

This bug, and the automatic revectoring of soft errors, mean that System
V.3.2 bad block handling is badly broken; most soft errors should not
be revectored, because they are transient (vibration, etc.). A better
bad block handling system would only _log_ IO errors, leaving the system
administrator to revector blocks that exhibit either hard errors or
repeated soft errors.
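
As a sketch of such a policy (in Python, and not a description of any
existing system), errors are only recorded, and a block becomes a
candidate for revectoring after one hard error or a number of soft
errors. The class name ErrorLog and the default threshold of three soft
errors are arbitrary choices for the example.

```python
from collections import Counter


class ErrorLog:
    """Records IO errors per block; never revectors anything by itself."""

    def __init__(self, soft_threshold=3):
        # Number of soft errors after which a block is considered suspect
        # (arbitrary value, chosen for illustration).
        self.soft_threshold = soft_threshold
        self.soft = Counter()
        self.hard = set()

    def record(self, block, hard):
        """Log one IO error on block; hard is True for a hard error."""
        if hard:
            self.hard.add(block)
        else:
            self.soft[block] += 1

    def candidates(self):
        """Blocks the administrator should consider revectoring."""
        repeated = {b for b, n in self.soft.items()
                    if n >= self.soft_threshold}
        return sorted(self.hard | repeated)
```

A single soft error on a block leaves it out of candidates(); the
decision to revector stays with whoever reads the list.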

As to what is giving you all those errors on the disc, I cannot help
you a lot. I am not familiar with your device. I will observe however
that many SCSI discs will, regrettably (but in a particular case
usefully), automatically revector bad blocks to give the illusion of
a defect-free volume. Probably your disc/controller will allow you to
format the disc reserving a sector per track as spare.
--
Piercarlo "Peter" Grandi           | ARPA: pcg%cs.aber.ac.uk at nsfnet-relay.ac.uk
Dept of CS, UCW Aberystwyth        | UUCP: ...!mcvax!ukc!aber-cs!pcg
Penglais, Aberystwyth SY23 3BZ, UK | INET: pcg at cs.aber.ac.uk