Ever seen nondeterministic a.out execution from some filesystems?

Chris Torek chris at mimsy.UUCP
Mon Oct 9 01:20:24 AEST 1989


In article <11827 at watcgl.waterloo.edu> idallen at watcgl.waterloo.edu writes:
>File system /tmp on our 4.3BSD vax8600's has a block size equal to its
>frag size equal to 8192. ... If I compile ... ten times in a row, half
>the time the resulting a.out won't run.  Copying the a.out to another
>file in /tmp often fixes the problem.  Copying the file to another file
>system and running it from there always fixes the problem. ... If I run
>a faulting a.out under adb, it will fault and when I examine instructions
>near where it faults I see zeroes!

Sounds like munhash() is either not being called properly, or not doing
its job.  There was a small change to realloccg() between 4.3BSD and
4.3BSD-tahoe, along the following lines:

[old]
	count = roundup(osize, CLBYTES / DEV_BSIZE);
	for (i = 0; i < count; i += CLBYTES / DEV_BSIZE)
		... munhash(..., bn + i);

[new]
	count = roundup(osize, CLBYTES);
	for (i = 0; i < count; i++)
		... munhash(..., bn + i * CLBYTES / DEV_BSIZE);

As far as I can tell, this change has no actual effect on either the
Vax or the Tahoe.  Also, with fsize==bsize, realloccg() should not be
called at all, since there are no fragments.
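
To make the "no actual effect" claim concrete, here is a minimal sketch
that lists the block offsets each loop would hand to munhash().  It is
not the kernel code.  It assumes the count is a ceiling division (written
here as a howmany() macro, where the excerpts above say roundup), and it
assumes DEV_BSIZE = 512 and CLBYTES = 1024, which I believe are the usual
Vax values.  The names old_offsets, new_offsets and same_offsets are made
up for illustration.

```c
#include <stddef.h>

/* Assumed values, not taken from the post. */
#define DEV_BSIZE	512
#define CLBYTES		1024
#define howmany(x, y)	(((x) + ((y) - 1)) / (y))

/* Offsets from bn (in DEV_BSIZE units) visited by the [old] loop. */
size_t old_offsets(long osize, long *out, size_t max)
{
	size_t n = 0;
	long i, count = howmany(osize, DEV_BSIZE);

	for (i = 0; i < count && n < max; i += CLBYTES / DEV_BSIZE)
		out[n++] = i;
	return n;
}

/* Offsets from bn (in DEV_BSIZE units) visited by the [new] loop. */
size_t new_offsets(long osize, long *out, size_t max)
{
	size_t n = 0;
	long i, count = howmany(osize, CLBYTES);

	for (i = 0; i < count && n < max; i++)
		out[n++] = i * CLBYTES / DEV_BSIZE;
	return n;
}

/* Returns 1 if both loops visit exactly the same offsets, else 0. */
int same_offsets(long osize)
{
	long a[64], b[64];
	size_t k, na, nb;

	na = old_offsets(osize, a, 64);
	nb = new_offsets(osize, b, 64);
	if (na != nb)
		return 0;
	for (k = 0; k < na; k++)
		if (a[k] != b[k])
			return 0;
	return 1;
}
```

Under those assumptions same_offsets(8192) returns 1: both loops visit
offsets 0, 2, ..., 14 from bn.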

The other likely possibility is the buffer size-changing code, for
which a fix was posted from Berkeley.  Here is a version of that fix.

*** /tmp/,RCSt1003260	Sun Oct  8 11:19:12 1989
--- ufs_bio.c	Tue Nov  8 00:19:24 1988
***************
*** 4,8 ****
   * specifies the terms and conditions for redistribution.
   *
!  *	@(#)ufs_bio.c	7.1 (Berkeley) 6/5/86
   */
  
--- 4,8 ----
   * specifies the terms and conditions for redistribution.
   *
!  *	@(#)ufs_bio.c	7.3 (Berkeley) 11/12/87
   */
  
***************
*** 34,38 ****
  		panic("bread: size 0");
  	bp = getblk(dev, blkno, size);
! 	if (bp->b_flags&B_DONE) {
  		trace(TR_BREADHIT, pack(dev, size), blkno);
  		return (bp);
--- 34,38 ----
  		panic("bread: size 0");
  	bp = getblk(dev, blkno, size);
! 	if (bp->b_flags&(B_DONE|B_DELWRI)) {
  		trace(TR_BREADHIT, pack(dev, size), blkno);
  		return (bp);
***************
*** 68,72 ****
  	if (!incore(dev, blkno)) {
  		bp = getblk(dev, blkno, size);
! 		if ((bp->b_flags&B_DONE) == 0) {
  			bp->b_flags |= B_READ;
  			if (bp->b_bcount > bp->b_bufsize)
--- 68,72 ----
  	if (!incore(dev, blkno)) {
  		bp = getblk(dev, blkno, size);
! 		if ((bp->b_flags&(B_DONE|B_DELWRI)) == 0) {
  			bp->b_flags |= B_READ;
  			if (bp->b_bcount > bp->b_bufsize)
***************
*** 85,89 ****
  	if (rablkno && !incore(dev, rablkno)) {
  		rabp = getblk(dev, rablkno, rabsize);
! 		if (rabp->b_flags & B_DONE) {
  			brelse(rabp);
  			trace(TR_BREADHITRA, pack(dev, rabsize), blkno);
--- 85,89 ----
  	if (rablkno && !incore(dev, rablkno)) {
  		rabp = getblk(dev, rablkno, rabsize);
! 		if (rabp->b_flags & (B_DONE|B_DELWRI)) {
  			brelse(rabp);
  			trace(TR_BREADHITRA, pack(dev, rabsize), blkno);
***************
*** 150,159 ****
  	register struct buf *bp;
  {
- 	register int flags;
  
  	if ((bp->b_flags&B_DELWRI) == 0)
  		u.u_ru.ru_oublock++;		/* noone paid yet */
! 	flags = bdevsw[major(bp->b_dev)].d_flags;
! 	if(flags & B_TAPE)
  		bawrite(bp);
  	else {
--- 150,157 ----
  	register struct buf *bp;
  {
  
  	if ((bp->b_flags&B_DELWRI) == 0)
  		u.u_ru.ru_oublock++;		/* noone paid yet */
! 	if (bdevsw[major(bp->b_dev)].d_flags & B_TAPE)
  		bawrite(bp);
  	else {
***************
*** 261,264 ****
--- 259,269 ----
   * for the oldest non-busy buffer and reassign it.
   *
+  * If we find the buffer, but it is dirty (marked DELWRI) and
+  * its size is changing, we must write it out first. When the
+  * buffer is shrinking, the write is done by brealloc to avoid
+  * losing the unwritten data. When the buffer is growing, the
+  * write is done by getblk, so that bread will not read stale
+  * disk data over the modified data in the buffer.
+  *
   * We use splx here because this routine may be called
   * on the interrupt stack during a dump, and we don't
***************
*** 306,309 ****
--- 311,323 ----
  		splx(s);
  		notavail(bp);
+ 		if (bp->b_bcount != size) {
+ 			if (bp->b_bcount < size && (bp->b_flags&B_DELWRI)) {
+ 				bp->b_flags &= ~B_ASYNC;
+ 				bwrite(bp);
+ 				goto loop;
+ 			}
+ 			if (brealloc(bp, size) == 0)
+ 				goto loop;
+ 		}
  		if (bp->b_bcount != size && brealloc(bp, size) == 0)
  			goto loop;
***************
*** 365,369 ****
  
  	/*
! 	 * First need to make sure that all overlaping previous I/O
  	 * is dispatched with.
  	 */
--- 379,383 ----
  
  	/*
! 	 * First need to make sure that all overlapping previous I/O
  	 * is dispatched with.
  	 */
***************
*** 502,505 ****
--- 516,522 ----
  /*
   * Insure that no part of a specified block is in an incore buffer.
+ #ifdef SECSIZE
+  * "size" is given in device blocks (the units of b_blkno).
+ #endif SECSIZE
   */
  blkflush(dev, blkno, size)
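
The size-change handling that the patch above adds to getblk() boils
down to a three-way decision.  Here is a minimal sketch of just that
decision, pulled out of the buffer cache.  The enum and the function
name size_change_action are hypothetical, and the B_DELWRI test is
reduced to a plain int argument.

```c
/* What getblk() does with a buffer it found in the cache (sketch only). */
enum getblk_action {
	USE_AS_IS,		/* b_bcount == size: nothing to do */
	WRITE_AND_RETRY,	/* growing a dirty buffer: bwrite(), goto loop */
	CALL_BREALLOC		/* shrinking, or growing a clean buffer */
};

/*
 * bcount: the buffer's current b_bcount
 * size:   the size the caller asked for
 * delwri: nonzero if the buffer is marked B_DELWRI
 */
enum getblk_action size_change_action(long bcount, long size, int delwri)
{
	if (bcount == size)
		return USE_AS_IS;
	if (bcount < size && delwri)
		return WRITE_AND_RETRY;
	return CALL_BREALLOC;
}
```

A dirty buffer growing from 4096 to 8192 bytes gives WRITE_AND_RETRY.
The same buffer shrinking gives CALL_BREALLOC, and according to the
comment in the patch the write then happens inside brealloc.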
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163)
Domain:	chris at cs.umd.edu	Path:	uunet!mimsy!chris


