A Nasty Bug in the 68020 4.0.3 Upgrade Standalone Copy Program

Katy Kislitzin ktk at spam.istc.sri.com
Fri Jun 30 06:25:06 AEST 1989


QUICK VERSION:

stand copy overwrote the d,e,f partitions of my scsi disk while
trying to copy the 4.0.3 Upgrade miniunix into my swap partition.
goto: DESCRIPTION OF BUG

================== THE LONG SOB STORY

Last night I brought my 140M scsi disk into work to install 4.0.3 on it so
I could use the refurbished 3/50 I just bought for home.  The only useful
scsi tape i could find had a scsi disk which was in use attached to it.
That disk is running 3.5 and is attached to a 3/110.  I hooked up my disk
as sd2, got my 4.0 release tapes and did the installation onto the sd2
disk.  

Everything went *really* smoothly.  The only problem i had was that i
wanted to copy the miniunix into sd2b, the swap partition of the 140M disk.
this way i could avoid touching the other disk.  unfortunately, when i
followed the 4.0.3 manual and used sd(,8,1) to refer to the second scsi
disk, it failed.  with hindsight, i realize that the first place that
mentions doing that is the 4.0.3 manual and pre-4.0.3 releases probably
don't support it.  i didn't try with the 4.0.3 tape because "it didn't
work before so it won't work now".

================== DESCRIPTION OF BUG

After the 4.0 full install i went to do the 4.0.3 upgrade, the first step
of which is copying in the miniroot.  this is where the bug showed up.
The first step, as usual, is to boot the copy program and copy mini-unix
into the swap area (sd0b).  So i executed the following sequence:

b st(,,)
Boot:st(,,2)
From:st(,,3)
To:sd(,,1)

At this point the machine got busy for an inordinately long time.
Eventually i aborted it, as i couldn't see *any* possible way for it to be
taking this long and it didn't appear that the tape was moving.  At that
point i needed to reset the sun (L1-a k2) before i could continue, as the
scsi bus was OTL.  After that i tried again and the upgrade completed
normally on sd2.

HOWEVER, when i went to boot, the problems with the sd0 disk partitions
showed up.

================== CONTINUE SOB STORY

When I came up in single user mode, fscked the root partition and rebooted, I
couldn't do anything, not even ls!  I soon figured out this was because
pub wasn't mounted and SunOS root is worthless!  all I had was /bin/sh,
and etc ( thank god it was a 3.X machine ).  I tried fsck'ing the other
partitions and found that d, e, and f were missing their superblocks,
although g was fine.  This is where i got VERY unhappy.  the machine i was
using is not dumped because its owners decided they didn't need to.
(ALWAYS DUMP!)  I hadn't dumped it before i started because i KNEW i was
only going to be using the b (swap) partition.  (ALWAYS DUMP!)  I didn't
really care that I had wiped out pub and user, but I also wiped out one
partition's worth of moderately valuable stuff.  The machine was a clone
of testbed machines, and a clone of the development machine, *except* not
quite.

What I believe happened is that the scsi controller got an error, probably
from the tape, which it passed on to stand copy.  Unfortunately, copy
ignored the error and kept writing out to disk.  it never reached the end
of the tape file so it just kept writing junk out to disk forever.  Of
course, the particularly nasty thing is that it also ignored the disk
partitions and just kept writing.
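
To make the suspected failure concrete, here is a minimal sketch of a copy loop that does the two things stand copy apparently did not: stop on the first read error, and refuse to write past the end of the target partition. This is NOT the Sun standalone copy source, just an illustration in Python; bounded_copy, make_reader, CopyError and the block-count limit are all hypothetical names made up for the example.

```python
class CopyError(Exception):
    """Raised when the copy must stop instead of writing further."""


def bounded_copy(read_block, write_block, partition_blocks):
    """Copy blocks until end of file, never passing the partition end.

    read_block() returns bytes, b"" at end of file, or raises OSError.
    write_block(index, data) stores one block at that partition offset.
    (Hypothetical interface, assumed only for this sketch.)
    """
    written = 0
    while True:
        try:
            data = read_block()
        except OSError as err:
            # Stop at the first source error rather than writing junk.
            raise CopyError(f"read failed after {written} blocks: {err}")
        if not data:
            return written  # clean end of the tape file
        if written >= partition_blocks:
            # The next write would land in the neighbouring partition.
            raise CopyError(f"source exceeds {partition_blocks}-block partition")
        write_block(written, data)
        written += 1


def make_reader(blocks, fail_at=None):
    """Simulated tape: yields blocks in order, optionally failing at one index."""
    state = {"i": 0}

    def read_block():
        i = state["i"]
        if fail_at is not None and i == fail_at:
            raise OSError("simulated tape error")
        if i >= len(blocks):
            return b""
        state["i"] = i + 1
        return blocks[i]

    return read_block
```

With a dict standing in for the swap partition, bounded_copy(make_reader(blocks), disk.__setitem__, 4) raises CopyError as soon as the simulated tape errors out or a fifth block shows up, so nothing is ever written outside the four blocks of the pretend partition.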

Needless to say, my boss does not look kindly towards me or anyone using
company resources for personal projects right now ;-)  

The only good part is that the people whose disk it is have all been very
reasonable and the stuff on the disk 

a) needed to be put back in sync with the field machines anyway
b) was not *crucial*, although its loss will have some impact

The final blow to the evening was getting home around 2 am and finding
that my house had been broken into ;-(

I have reported the bug to my sun sales rep and to the last person I
talked to at the sun support number via e-mail.  If someone could be
kind enough to send me the bugs address...

Also, if this has happened to anyone else, I would like to hear about
it.

--ktk at spam.istc.sri.com


