File system limit in 4.2 BSD

Joseph Moran mojo at sun.uucp
Mon Apr 1 05:14:08 AEST 1985


In article <681 at rayssd.UUCP> dhb at rayssd.UUCP writes:
>Has anyone ever successfully gotten more than 15 file systems on a 4.2 BSD
>system?  After many long delays, we are finally going to convert from 4.1
>to 4.2, and we need to be able to mount more than 15 file systems.  I tried
>making the same changes that I made in 4.1 (increase the size of mdev in the
>cmap structure, increase NMOUNT and NSWAPX in param.h, fix mount/umount) but
>it doesn't seem to work.  I even talked to Mike Karrels in Dallas and he
>indicated that that was all I had to do.  The problem we are experiencing
>is that random processes dump core at random times.  This can be very
>annoying if the shell core dumps, and it can be disastrous if "init" core
>dumps.  The behaviour seems to indicate some kind of swapping error.  At
>first I didn't even associate this problem with the changes to the coremap
>structure but in a final act of desperation I backed off the change and
>now the system runs fine.  We have been trying to track what we thought
>was a weird swapping error for three months (tues and wed eve.) and have
>now been running smoothly WITHOUT the coremap changes for over two weeks.
>
>	...

Your problem is the "Fastreclaim" code in vax/locore.s.  This code is
an optimization added in 4.2, and it knows the layout of the cmap
structure.  If you change anything in the cmap structure without
rewriting this code, you are bound to get bad paging problems.  As it
turns out, you can take out the call to Fastreclaim, since it is
simply an optimization; in the long run you'll want to rewrite the
code for your new cmap structure.  This code also knows a few other
magic numbers without using the right symbols to reference them
(UPAGES, for example).  That second problem can be avoided by figuring
out what the magic numbers in the code stand for and replacing each
with an expression that uses the right symbols.
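
To make the failure mode concrete, here is a minimal C sketch of how
widening one field shifts everything a hand-written assembly routine
has hard-coded.  The two structs, their field names, and the ASM_*
numbers are hypothetical stand-ins for illustration; they are not the
real 4.2 BSD cmap layout or the actual constants in Fastreclaim.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layouts for illustration only; NOT the real struct cmap. */
struct cmap_old {
    uint16_t c_next;
    uint16_t c_prev;
    uint8_t  c_mdev;    /* narrow index into the mount table */
    uint8_t  c_flags;
    uint16_t c_hlink;
};

struct cmap_new {
    uint16_t c_next;
    uint16_t c_prev;
    uint16_t c_mdev;    /* widened to allow more mounted file systems */
    uint8_t  c_flags;
    uint16_t c_hlink;
};

/* Numbers an assembly routine might have baked in (assumed values). */
enum { ASM_CMAP_SIZE = 8, ASM_FLAGS_OFF = 5, ASM_HLINK_OFF = 6 };

/* True only if the C layout still agrees with the hard-coded numbers. */
#define LAYOUT_MATCHES(type)                        \
    (sizeof(type) == ASM_CMAP_SIZE &&               \
     offsetof(type, c_flags) == ASM_FLAGS_OFF &&    \
     offsetof(type, c_hlink) == ASM_HLINK_OFF)

int old_layout_matches(void) { return LAYOUT_MATCHES(struct cmap_old); }
int new_layout_matches(void) { return LAYOUT_MATCHES(struct cmap_new); }
```

In this sketch old_layout_matches() is true and new_layout_matches()
is false: the C code recompiles cleanly against the new header, but
any assembly that still uses the old size and offsets now reads the
wrong bytes.  A check like this, run at build or boot time, is one
possible way to catch the mismatch early.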

It turns out that we were bitten by this same problem twice here at
Sun.  We changed the cmap structure for use with NFS (the network file
system).  We had a hard time figuring out why random pages got paged in
incorrectly and processes were dying when we ran the NFS kernel, until
it was tracked down to Fastreclaim.  Later we were playing with
changing UPAGES and got bitten by Fastreclaim again.  Sometimes
changing .h files doesn't do everything that needs to be done.  Hats
off to Bill Shannon for finding both of these.
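
The UPAGES case can be sketched in C as well.  The point is only that
a literal derived from UPAGES goes stale when the header changes, while
an expression written in terms of the symbol does not.  The function
names and the value 8 used for UPAGES are assumptions for illustration,
not the actual numbers used by Fastreclaim.

```c
#include <assert.h>

#define NBPG 512        /* bytes per VAX hardware page */

/* A literal that was correct only while UPAGES happened to be 8. */
unsigned uarea_bytes_magic(void)
{
    return 4096;
}

/* The same quantity written in terms of the symbol it depends on. */
unsigned uarea_bytes_symbolic(unsigned upages)
{
    return upages * NBPG;
}

/* Nonzero if the hard-coded literal still agrees with the header value. */
int magic_still_valid(unsigned upages)
{
    return uarea_bytes_magic() == uarea_bytes_symbolic(upages);
}
```

With upages set to 8 the two agree; change it to 10 and only the
symbolic version follows along, which is the kind of silent
disagreement described above.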

			Joe Moran
			sun!mojo


