The 3B1 and the Bad Block

Jeffery Small jeff at cjsa.wa.com
Sat Jan 12 12:45:24 AEST 1991


I guess I need the help of some of you experts!

Environment:	3B1, 2Mb RAM, (1) 67-Mb internal disk,
		(2) RS232 expansion cards, unix-3.51m, WD2010

I have been getting a number of the following HDERR reports in
/usr/adm/unix.log (long lines have been wrapped for readability):


    HDERR ST:51 EF:40 CL:E3 CH:3 SN:7 SC:1 SDH:22 DMACNT:FFFF DCRREG:9A
    MCRREG:8F00 Thu Dec 27 11:11:00 1990

    WD2010 ST=/Sekg/Err/ EF=/CRC/ cy=995. sc=7. hd=2. dr#=0. MCR2:0x0
    Thu Dec 27 11:11:04 1990
    
    drv:0 part:2 blk:58635 rpts:1 Fri Jan 11 07:19:14 1991


So, I used Brant Cheikes' great "bf" program to determine that block number
58635 was allocated to inode #2583.  ncheck then told me that this inode was
currently assigned to my 1Mb-sized Cnews "history" file.  Next, I tried to
copy this file, and after about 8 attempts the data block finally read
successfully.  I renamed the bad file and installed the good copy as
the "history" file.

Although I have three of these machines (one dating back to 1985), I have
never had a repeatable HD error before (just lucky, I guess), so I was now
ready to add my first block to the Bad Block Table.  I booted the revised
(WD2010) s4diag diskette, selected test 3 "Enter Bad Blocks" and when
prompted, selected option #3 to specify by logical block number.  At the
prompt for the block number I entered:  58635.  The diagnostic routine then
responded with [approximately]:

    Added bad block: Cylinder 916, Track 1, Sector 6.
    Used Track 7329 as the alternative.

Cylinder 916?  Shouldn't this be 995, as in the WD2010 report?  At the next
prompt (Add, Delete or Ignore) I entered an "I" and then thought I would try
this procedure again.  I re-ran the test and re-entered block #58635;
everything occurred as described above, but now the report read:

    Added bad block: Cylinder 916, Track 1, Sector 7.
    Used Track 7329 as the alternative.

Running the expert subtest #6,12 to display the BBT, I now saw that I had
16 BBT entries with the last two being the 916,1,6 and 916,1,7 as reported
above.
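
For what it is worth, the diagnostic's answer can be reproduced with plain
arithmetic.  The sketch below is only that: the geometry (8 heads, 16
512-byte sectors per track, 2 sectors per 1K logical block) is an assumption
inferred from the numbers in this post, not taken from any documentation,
and the function names are made up for illustration.  It treats 58635 as an
absolute block number counted from the start of the disk, and also converts
the cy/hd/sc values from the WD2010 report back to a block number for
comparison.

```python
# Assumed geometry, inferred from the figures in this post (NOT documented):
HEADS = 8               # tracks per cylinder (assumption)
SECTORS_PER_TRACK = 16  # 512-byte sectors per track (assumption)
SECTORS_PER_BLOCK = 2   # one 1K logical block = two 512-byte sectors (assumption)


def block_to_chs(block):
    """Map an absolute 1K block number to the (cyl, head, sector) of each
    512-byte sector it covers, under the assumed geometry."""
    first = block * SECTORS_PER_BLOCK
    result = []
    for abs_sector in range(first, first + SECTORS_PER_BLOCK):
        track, sector = divmod(abs_sector, SECTORS_PER_TRACK)
        cyl, head = divmod(track, HEADS)
        result.append((cyl, head, sector))
    return result


def chs_to_block(cyl, head, sector):
    """Inverse mapping: (cyl, head, sector) to an absolute 1K block number."""
    track = cyl * HEADS + head
    return (track * SECTORS_PER_TRACK + sector) // SECTORS_PER_BLOCK


# Block number entered into the diagnostic, treated as absolute.
diag = block_to_chs(58635)
# Location given in the WD2010 report (cy=995, hd=2, sc=7).
logged = chs_to_block(995, 2, 7)

print(diag)                    # [(916, 1, 6), (916, 1, 7)]
print(logged, logged - 58635)  # 63699 5064
```

Under these assumptions the two diagnostic reports (916,1,6 and 916,1,7) are
simply the two halves of absolute block 58635, and 916 * 8 + 1 = 7329 is the
same track number quoted as the "alternative".  The WD2010 location works out
to block 63699, which is 5064 blocks further in.  That would be consistent
with the "part:2 blk:58635" figure being counted from the start of partition
2 rather than from the start of the disk, but this is unconfirmed, and the
HDERR and blk lines carry different dates, so they may not even describe the
same event.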

Not sure what to do next, I rebooted the machine.  As you might expect, the 
problem was not resolved.  The bad file still contained the bad block and
"cat file" yielded additional error reports to unix.log.

So my questions are:

1:  Why didn't the block number in the error report (58635) work?  What
    (probably obvious) idea am I missing and how should I properly fix this
    problem?

2:  Did I just (potentially) hose some disk files by entering the two good
    sectors into the BBT or did the contents of these sectors get copied to
    the alternate track by the diagnostic routine?  If the data was not
    copied, is there a way that I could determine the files (if any) which
    were damaged?

3:  Can I use the same diagnostic routine to recover the use of these two
    sectors by "Deleting" them from the BBT?  If not, what is the "Delete"
    option for?


Any help would be greatly appreciated.  Thanks.
--
Jeff Small                     C. Jeffery Small & Associates    (206) 232-3338
uunet!nwnexus!cjsa!jeff        7000 E Mercer Way,  Mercer Island, WA     98040


