slow boot-time fsck -y -D, invoked by mount(!)

Sun Oct 28 18:34:38 AEST 1990

We have an SGI 4D220 with 64MB of main memory running Irix 3.2 with a
built-in ESDI disk and 4 SMD disks (3 Fujitsu Eagles and 1 Swallow) on a
Xylogics 754 (yer basic adequate computer).  The SMD drives come from an old
Sun 3/160, where the Eagles were on a Xylogics 451 instead.  One might
naively expect that the boot-time fscks would run faster on the SGI due to
having two, much faster, processors and a faster SMD controller, yet the SGI
takes 45 minutes (elapsed time) to check these drives, while the Sun took
about 15 minutes.  The obvious difference is that the Sun used /etc/preen,
which in turn invokes one "fsck -p" per drive and runs all the fscks in
parallel, whereas the SGI fscks each filesystem separately, one after the
other with no parallelism at all (truly Unparalleled Performance (TM)!).  I
sought to change this suboptimal state of affairs.

My first thought was to explore dfsck.  Don't bother.  It's full of
unchecked system calls, notably read(2)(!), and can run at most two fscks
simultaneously, which might arguably be sufficient for us with a mere two
processors, but is inadequate for a number of sites hereabout with four or
eight processors.

My next thought was to find the fsck invocation in the rc startup subtree
and replace it with an invocation of a modified preen (lacking "fsck -p",
the output would be somewhat messy, but it scrolls off the non-hardcopy
console quickly anyway).  With the help of some friends, I eventually found
the boot-time fsck invocation, in /etc/mount(!).

It turns out that the rc scripts just try to mount each efs filesystem named
in /etc/fstab, and when mount (actually /etc/fsstat) determines that one is
dirty, it runs "/etc/fsck -y -D /dev/rfs".  (Incidentally, someone at SGI
should lint and fix mount.c; in 3.2 at least mount_efs() doesn't return a
value in all its return statements, even though its return value is used,
and the fix in 3.3.1 looks wrong.)  This is certainly surprising behaviour
for those of us accustomed to non-System-V Unixes and is rather nasty when
one wants to change the options to fsck.  Fortunately, one can avoid the
magic fsck invocation in mount by explicitly fscking the filesystems (in
parallel) in the rc scripts.  Here is what we are about to use (it treats the
fstab "pass" number as a "drive" number, so file systems sharing a number go
to one fsck and are checked serially); does anyone, especially an ex-4BSD or
ex-Research Unix user, have anything better?

#! /bin/sh
# preen - run fsck -y -D, keeping each drive as busy as possible,
#	on all efs file systems in /etc/fstab, except the first one (root).
#	avoid using /tmp or /usr/bin/sort, as they may not be mounted yet.
# N.B.: /bin/awk must exist (it does on real Unixes, but not on Irix).
PATH=/bin:/etc; export PATH
fsckbad=/etc/tmp.preen.bad
fsck=fsck			# fsck or echo

# $fsck -p /dev/root
# case $? in 0) ;; *) exit $? ;; esac
rm -f $fsckbad

# collapse fstab into bins of file systems on the same drive by pass # ($6)
for args in `awk '$1 ~ /^\/dev\// && $3 == "efs" {
	if (count++ > 0) {	# non-root file systems, root already checked
		# assumes two-component names: /dev/NAME -> raw device /dev/rNAME
		fields = split($1, name, "/")
		fss[$6] = fss[$6] "~/" name[2] "/r" name[3]	# use raw device
	}
}
END {
	if (count > 1)			# file systems other than root?
		for (drive in fss)
			print fss[drive]
}' /etc/fstab `				# spin off an fsck per drive
do
	if eval $fsck -y -D "`echo $args | sed 's/~/ /g' `"
	then : far out
	else >>$fsckbad	# note fsck failure
	fi &
done
wait				# for all the fsck's to finish

# return fsck-like exit status
if test -f $fsckbad
then exit 8
else exit 0
fi
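
To see what the awk binning step hands to each fsck without touching a real
disk, here is a minimal dry run against a made-up fstab.  The device names,
mount points and the "bins" helper are invented for illustration and are not
taken from our machine; sort appears only to make the demo output stable
(preen itself avoids it).

```shell
#! /bin/sh
# Dry run of the binning step from preen, outside the boot environment.
# The fstab contents and device names below are hypothetical.

# bins FSTAB: print one line per pass number, listing the raw devices
# that would go to a single fsck.  Same selection and grouping as in
# preen, except that "~" is turned back into a space right away.
bins() {
	awk '$1 ~ /^\/dev\// && $3 == "efs" {
		if (count++ > 0) {	# skip the first efs entry (root)
			split($1, name, "/")
			fss[$6] = fss[$6] "~/" name[2] "/r" name[3]
		}
	}
	END {
		for (drive in fss)
			print fss[drive]
	}' "$1" | sed -e 's/~/ /g' -e 's/^ //' | sort
}

demo_fstab=${TMPDIR:-/tmp}/fstab.demo.$$
cat >"$demo_fstab" <<'EOF'
/dev/root	/	efs	rw	0	0
/dev/usr	/usr	efs	rw	0	1
/dev/fs1	/a	efs	rw	0	2
/dev/fs2	/b	efs	rw	0	2
/dev/fs3	/c	efs	rw	0	3
EOF

# show the command line each background job would run
bins "$demo_fstab" | while read devices
do
	echo "fsck -y -D $devices"
done
rm -f "$demo_fstab"
```

With this sample fstab the dry run prints "fsck -y -D /dev/rfs1 /dev/rfs2",
"fsck -y -D /dev/rfs3" and "fsck -y -D /dev/rusr": the two file systems that
share pass 2 go to one fsck, while the others each get their own.  In preen
proper the "~" separator is kept until the last moment so that each bin stays
a single word for the "for" loop.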

-- 
Geoff Collyer		utzoo!utstat!geoff, geoff at utstat.toronto.edu