fix for 4.2BSD kernel bug that trashes file systems

Sat Feb 25 05:16:00 AEST 1984

From:  Jeff Mogul <mogul at coyote>

A few months ago, I sent a request to this list for help with a bug
that was quietly trashing files and directories.  I knew that the problem
was a bad reference count on a file struct; I just wasn't sure how it
got like that.  Berkeley responded to me, with a fix that works fine.
However, every few days I get a message from someone else who has the
same problem, and since Berkeley has not publicized this fix, I have to do
so to keep my sanity.

My guess is that there are oodles of apparently bizarre problems that
will be solved by installing this fix.  Of course, I take no responsibility
if it doesn't work for you!

------- Forwarded Message

Received: from UCB-VAX.ARPA by Navajo with TCP; Tue, 13 Dec 83 16:06:50 pst
Received: from ucbmonet.ARPA by UCB-VAX.ARPA (4.22/4.16)
	id AA13058; Tue, 13 Dec 83 16:04:52 pst
Received: by ucbmonet.ARPA (4.22/4.14)
	id AA00424; Tue, 13 Dec 83 16:06:54 pst
From: karels%ucbmonet at Berkeley (Mike Karels)
Message-Id: <8312140006.AA00424 at ucbmonet.ARPA>
Date: 13 Dec 1983 1606-PST (Tuesday)
To: Jeff Mogul <mogul at navajo>
Subject: Re: Serious 4.2 kernel bug causes files and directories to be mangled

You are right about the race in ino_close/closef; the problem can occur
whenever the device close routine blocks for output to flush.  We haven't
seen the problem here (strangely), but it was discovered by Robert Elz.
The changes that we have made follow; they have been running for a week
or two on several machines without any problems, so I don't expect any
trouble.  There are actually two changes; the first guarantees
that closef will be done only once, even if interrupted, and the second
catches interrupts in ino_close, which will then always return to closef.
f_count can then be cleared exactly once.  By the way, the ordering
becomes more similar to that in 4.1.

		Mike

Nov 18 10:06 1983  SCCS/s.kern_descrip.c: -r6.2 vs. -r6.3 Page 1

246,247d245
< 	closef(fp);
< 	/* WHAT IF u.u_error ? */
249a248,249
> 	closef(fp);
> 	/* WHAT IF u.u_error ? */

Nov 18 10:06 1983  SCCS/s.sys_inode.c: -r6.1 vs. -r6.2 Page 1

294c294
< 	struct file *fp;
---
> 	register struct file *fp;
296a297
> 	register struct file *ffp;
309d309
< 	fp->f_count = 0;			/* XXX Should catch */
336,337c336,337
< 	for (fp = file; fp < fileNFILE; fp++) {
< 		if (fp->f_type == DTYPE_SOCKET)		/* XXX */
---
> 	for (ffp = file; ffp < fileNFILE; ffp++) {
> 		if (ffp == fp)
339c339,341
< 		if (fp->f_count && (ip = (struct inode *)fp->f_data) &&
---
> 		if (ffp->f_type == DTYPE_SOCKET)		/* XXX */
> 			continue;
> 		if (ffp->f_count && (ip = (struct inode *)ffp->f_data) &&
352c354,363
< 	(*cfunc)(dev, flag, fp);
---
> 	if (setjmp(&u.u_qsave)) {
> 		/*
> 		 * If device close routine is interrupted,
> 		 * must return so closef can clean up.
> 		 */
> 		if (u.u_error == 0)
> 			u.u_error = EINTR;	/* ??? */
> 		return;
> 	}
> 	(*cfunc)(dev, flag);

------- End of Forwarded Message
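
For anyone who wants to see the shape of the fix outside the kernel, here
is a rough user-space sketch of the idea in the second change: the close
routine catches the interrupt with setjmp so that control always returns
to the caller, which then clears the reference count in exactly one place.
Every name ending in _sketch, along with qsave and u_error, is a made-up
stand-in for illustration; this is not the 4.2BSD source, and a longjmp
plays the role of the interrupted device close.

```c
#include <assert.h>
#include <errno.h>
#include <setjmp.h>

/* Hypothetical stand-in for struct file; only the count matters here. */
struct file_sketch {
	int f_count;
};

jmp_buf qsave;		/* plays the role of u.u_qsave */
int u_error;		/* plays the role of u.u_error */

/* Stand-in device close routine; "interrupted" simulates a signal
 * arriving while the routine is blocked waiting for output to flush. */
void dev_close_sketch(int interrupted)
{
	if (interrupted)
		longjmp(qsave, 1);
}

/* Analogue of ino_close after the fix: it never touches f_count and
 * always returns to its caller, even if the device close is interrupted. */
void ino_close_sketch(int interrupted)
{
	if (setjmp(qsave)) {
		/* Interrupted: record an error and return so the
		 * caller can clean up. */
		if (u_error == 0)
			u_error = EINTR;
		return;
	}
	dev_close_sketch(interrupted);
}

/* Analogue of closef: the only place the count is cleared. */
void closef_sketch(struct file_sketch *fp, int interrupted)
{
	if (fp->f_count < 1)
		return;			/* nothing to close */
	if (fp->f_count > 1) {
		fp->f_count--;		/* other references remain */
		return;
	}
	ino_close_sketch(interrupted);
	fp->f_count = 0;		/* cleared here, and only here */
}
```

Calling closef_sketch() on an entry whose count is 1 leaves the count at 0
whether or not the simulated close is interrupted; in the interrupted case
u_error is left set to EINTR.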

By the way, I strongly recommend adding the following to iput() in
sys/ufs_inode.c, before the first line of code:

	if (ip->i_count < 1)
		panic("iput: starting count < 1");

This will save you from similar sorts of trashing in the future (i.e.,
your system will crash, but your files will not be randomly trashed).
For those of you who remember a similar bug in the 4.1BSD mpx code,
this same panic would also have caught the problem.
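
To illustrate why the check helps, here is a small user-space sketch of a
guarded release routine.  The names inode_sketch, panic_sketch and
iput_sketch are hypothetical, and unlike the kernel's panic() the stand-in
below only records that it was called and lets the caller carry on, so
that the behavior can be observed in an ordinary program.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-in for struct inode; only the count matters here. */
struct inode_sketch {
	int i_count;
};

int panicked;	/* set when the guard fires */

/* Stand-in for panic(): the real one halts the system; this one
 * just prints the message and sets a flag. */
void panic_sketch(const char *msg)
{
	fprintf(stderr, "panic: %s\n", msg);
	panicked = 1;
}

/* Release one reference.  Returns 0 on success, or -1 if the count
 * was already below 1, in which case the count is left untouched
 * rather than being driven negative. */
int iput_sketch(struct inode_sketch *ip)
{
	if (ip->i_count < 1) {
		panic_sketch("iput: starting count < 1");
		return -1;
	}
	ip->i_count--;
	return 0;
}
```

Releasing a structure that holds one reference succeeds and leaves the
count at 0; a second release trips the guard instead of quietly pushing
the count negative, which is the kind of silent damage the panic is meant
to stop.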

-Jeff