What is 'SMALL DISK ERROR' ?

Alan's Home for Wayward Notes File. alan at shodha.dec.com
Fri Mar 2 03:15:36 AEST 1990


In article <2490.25e7e086 at esat.kuleuven.ac.be>, elsen at esat.kuleuven.ac.be writes:
> 
>             Recently I have been getting the following errorlog entries :
>             ( It happens a couple of times each day )

	[ A "Small Disk Error" follows. ]
> 
>     What kind of action should I undertake with respect to this ?

	Step 1: Get the "full" listing of the error from uerf(8).
	The option to do this is:

		uerf -o full [ other options you might want ]

>     What does 'SMALL DISK ERROR' mean ? Does it mean that a correctable
>     ECC error has occurred ?

	Once upon a time, somebody told me the difference between
	a "small disk" error and a normal disk error.  It may be
	something as simple as the error being from a small disk.
	The full uerf(8) listing should tell you exactly what the
	error was.

>     Or is it possible that an uncorrectable ECC with Bad Block Replacement
>     has occurred ? (I prefer the first one...)

	If this were the case you'd either be seeing Forced Errors
	on the LBN, or you would have a good block and the error
	would go away.
> 
>     This leads me to further questions (since I am relatively new with
>     respect to Ultrix) :
>     Does Ultrix (3.1) support online Bad Block Replacement ?
>     If so then why is the program 'radisk' still needed ?

	Yes, ULTRIX V3.1 (and every version since V2.0) has had
	dynamic BBR.  The operations performed by radisk are:

		-s	scan
		-c	clear forced errors
		-r	replace

	The scan could be done just by using dd(1); if a bad
	block is encountered, the BBR code of the host or
	controller will be invoked (the RQDX3 does BBR itself).
	Of course dd(1) will probably fail at that point and you'll
	have to start it over.  The conv=noerror switch may be enough
	to avoid this.  The "scan" switch of radisk, on the other
	hand, merely tells the controller to read the entire disk
	without transferring the data back to the host.  This ends
	up being faster than dd(1).
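
	A minimal sketch of the dd(1) scan described above, assuming a
	POSIX-style dd.  The function name scan_disk and the device
	path in the usage comment are placeholders, not anything from
	radisk(8) or the ULTRIX documentation, and this has not been
	tried on ULTRIX itself.

```shell
#!/bin/sh
# Sketch: read every block of a device and throw the data away, so that
# any bad block gets touched and the host or controller BBR code has a
# chance to run.  scan_disk is a hypothetical helper name.
scan_disk() {
    dev=$1    # raw device to read; the caller supplies the real path

    # conv=noerror: continue after a read error instead of stopping.
    # bs=512:       a read error then costs one block, not a large chunk.
    # of=/dev/null: only the act of reading matters, not the data.
    dd if="$dev" of=/dev/null bs=512 conv=noerror
}

# Usage (device name is an assumption, substitute your own):
#   scan_disk /dev/rra0c
```

	dd(1) reports its record counts on standard error, so any read
	errors it skipped over remain visible on the terminal.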

	You can clear a forced error just by writing over the
	block.  Of course, you can't read the block first, because
	the forced error will cause an input error.  radisk(8)
	will at least preserve the corrupted contents, which may be
	enough to reconstruct the rest of the block.
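
	For illustration only, overwriting a single block with dd(1)
	might look like the sketch below.  The name overwrite_block is
	hypothetical, /dev/zero is assumed to exist on the system, and
	the block number is counted in 512-byte blocks from the start
	of whatever device or partition is opened, which is not
	necessarily the LBN shown in the error log.  Unlike radisk(8),
	this throws away whatever was in the block.

```shell
#!/bin/sh
# Sketch: overwrite one 512-byte block in place with zeros.
# WARNING: destructive.  The old contents of the block are lost.
# overwrite_block is a hypothetical helper name.
overwrite_block() {
    dev=$1    # target device or file; the caller supplies the real path
    blk=$2    # block number, in 512-byte units from the start of $dev

    # seek=$blk:     skip that many output blocks before writing.
    # count=1:       write exactly one block.
    # conv=notrunc:  do not truncate the target if it is a regular file.
    dd if=/dev/zero of="$dev" bs=512 seek="$blk" count=1 conv=notrunc 2>/dev/null
}

# Usage (device name and block number are assumptions):
#   overwrite_block /dev/rra0c 12345
```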

	The replace could be done with a simple program that reads
	the block over and over until the read fails or the program
	gives up (a built-in threshold).  All of the functions of
	radisk(8) could be provided by other programs, but it has the
	convenience of putting them all in one place.
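
	A rough sketch of such a read-until-failure loop, again using
	dd(1).  The name reread_block, the default limit of 100 and
	the messages are all made up for the example.  The loop does
	no replacement itself; it only provokes the read error and
	leaves the rest to the BBR code in the host or controller.
	Reading through the raw device is probably wise, so that
	repeated reads are not satisfied from the buffer cache.

```shell
#!/bin/sh
# Sketch: read one block repeatedly until a read fails or a retry
# limit (the "built-in threshold") is reached.
# reread_block is a hypothetical helper name.
reread_block() {
    dev=$1            # device or file to read; caller supplies the path
    blk=$2            # block number, in 512-byte units from start of $dev
    limit=${3:-100}   # maximum number of attempts before giving up

    n=0
    while [ "$n" -lt "$limit" ]; do
        n=$((n + 1))
        # skip=$blk positions the read; count=1 reads a single block.
        if ! dd if="$dev" of=/dev/null bs=512 skip="$blk" count=1 2>/dev/null
        then
            echo "read failed on attempt $n"
            return 1
        fi
    done
    echo "no failure in $limit attempts"
    return 0
}

# Usage (device name and block number are assumptions):
#   reread_block /dev/rra0c 12345 50
```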
> 
> -- 
>   Marc Elsen (System Manager/Software Engineer)
-- 
Alan Rollow				alan at nabeth.enet.dec.com


