Sparcstations using 16M or more under SunOS

Tue May 8 12:37:14 AEST 1990

Now that we have figured out how to put lots of memory in a Sparcstation 1,
the next problem is trying to use it.  SunOS tends to have problems
handling anything more than around 16M on a Sparcstation, as described
below.

We have been trying to run a Sparcstation 1 with 48M of memory, and have
been having problems getting SunOS 4.0.3 to make full use of the memory.
Based on our experiences with SunOS 4.1 on a Sun 4/330 we suspect that the
same problem persists under SunOS 4.1 on a Sparcstation.

The problem we encounter is that with this much memory a very large number
of address translation faults occur, up to 3000 per second.  With each
fault taking around 300 microseconds to handle, this means that around 90%
of the total CPU time is spent in system mode handling these faults.
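
As a rough check of these figures, the fraction of CPU time lost is simply
the fault rate multiplied by the cost per fault.  The sketch below assumes
a handling cost of 300 microseconds per fault, which is the value
consistent with the 90% figure; the function name is ours and purely
illustrative.

```python
def fault_cpu_fraction(faults_per_second, seconds_per_fault):
    """Fraction of one CPU consumed by handling faults."""
    return faults_per_second * seconds_per_fault

# 3000 faults per second, 300 microseconds each (assumed unit)
fraction = fault_cpu_fraction(3000, 300e-6)
print("CPU time spent handling faults: %.0f%%" % (fraction * 100))
```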

Looking into the kernel, it appears that SunOS allocates a fixed size area
to keep track of mappings between virtual and physical addresses.  On a
Sparcstation this area can hold up to 128 page management groups.  Each
page management group maps a contiguous range of addresses up to 256k in
size.  One page management group will be needed to map each of the text,
data, and stack segments of a process, and one appears to be used by the
operating system for the process.  Thus a minimum of four page management
groups is used by a process, and more are used if any of its segments
exceeds 256k.
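
The accounting above can be sketched as follows.  This is only a lower
bound under simplifying assumptions: each segment is taken to need one
page management group per 256k or part thereof, plus one for the operating
system, and we ignore how segments are aligned and any extra mappings such
as shared libraries.  The names PMG_BYTES and min_pmgs are hypothetical.

```python
PMG_BYTES = 256 * 1024  # address range mapped by one page management group

def min_pmgs(text_bytes, data_bytes, stack_bytes):
    """Lower bound on page management groups needed by one process."""
    def groups(size):
        # round up to whole groups, with at least one group per segment
        return max(1, -(-size // PMG_BYTES))
    # one extra group appears to be used by the operating system
    return groups(text_bytes) + groups(data_bytes) + groups(stack_bytes) + 1

small = min_pmgs(40 * 1024, 16 * 1024, 8 * 1024)    # every segment under 256k
large = min_pmgs(300 * 1024, 100 * 1024, 8 * 1024)  # text exceeds 256k
print(small, large)
```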

When a process needs to access a page in a page management group that is
not in the kernel table, and no free page management groups exist, it
steals a page management group from another process.  When the other
process next goes to access a page in the stolen page management group it
will get a fault, and will in turn have to steal a page management group
from another process.  Once a process has got a page management group
back, all the page table entries associated with that page management
group are marked invalid, and thus the process will receive an address
translation fault when it goes to access each of the 64 4k pages
associated with the page management group.

The problem is compounded by SunOS swapping out processes whose resident
set size is zero.  If all the page management groups belonging to a
process get stolen from it the kernel determines that the process's
resident set size is zero, and promptly swaps the process out.
Fortunately this swapping only involves moving all of the process's pages
onto the free list, and not to disk.  But the CPU load associated with
doing this is significant.  We are finding that around five processes per
second get swapped to the free list.

With 128 page management groups it will be possible to map up to 32M of
virtual memory, although many processes have page management groups that
map less than their full 256k, so that a total virtual address space of
around 16M will be more typical.  Thus the problem described is probably
apparent to some extent on any Sparcstation with a total active virtual
address space exceeding 16M, and sufficient physical memory to hold the
entire virtual address space without paging.  Because shared libraries and
text segments are shared between processes, the actual physical memory on
the machine could be less than 16M and these problems could still occur.
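
The capacity figures can be checked with a few lines of arithmetic.  The
constant names are ours, and the 50% average fill is only an assumption
inferred from the 16M "typical" figure, not something measured.

```python
NPMGRPS = 128            # page management groups on a Sparcstation
PMG_BYTES = 256 * 1024   # address range mapped by one group
PAGE_BYTES = 4 * 1024    # page size
MEGABYTE = 1024 * 1024

pages_per_pmg = PMG_BYTES // PAGE_BYTES
max_mapped_mb = NPMGRPS * PMG_BYTES // MEGABYTE

# Assumed: groups map on average about half of their 256k range.
assumed_fill = 0.5
typical_mapped_mb = max_mapped_mb * assumed_fill

print(pages_per_pmg, max_mapped_mb, typical_mapped_mb)
```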

To get a feel for the cost of these problems you can have a look at the
hatcnt data structure in the kernel.

# nm /vmunix | grep hatcnt

f80cbf00 B _hatcnt

# od -D /dev/kmem +0xf80cbf00 | head -2

f80cbf00  0002129059 0002034884 0019942909 0003173659
f80cbf10  0002685512 0000000000 0000000000 0000000000

The 4th word is the total number of page management group allocations
(3173659), and the 5th word is the number of those allocations that stole
a page management group from another process (2685512).
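
To pull these two counters out of the od output without counting columns
by hand, something like the following sketch may help.  It assumes the
layout shown above (an address followed by decimal words on each line);
parse_od_words is a hypothetical helper, and the sample text is just the
output quoted in this post.

```python
def parse_od_words(text):
    """Return the decimal words from 'od -D' style output, in order."""
    words = []
    for line in text.strip().splitlines():
        fields = line.split()
        # the first field is the address, the rest are decimal words
        words.extend(int(field) for field in fields[1:])
    return words

sample = """
f80cbf00  0002129059 0002034884 0019942909 0003173659
f80cbf10  0002685512 0000000000 0000000000 0000000000
"""

words = parse_od_words(sample)
allocations = words[3]   # 4th word: total allocations
steals = words[4]        # 5th word: allocations that stole a group
steal_ratio = steals / allocations
print("%d of %d allocations were steals (%.1f%%)"
      % (steals, allocations, steal_ratio * 100))
```

On the figures quoted here, well over four fifths of all allocations had
to steal a page management group from another process.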

Figuring 300 microseconds per fault and around 32 faults for each page
management group that has been stolen from another process, a total of
around 7 hours has been spent handling these faults on a machine that has
been running for 2 days, and is idle at night.
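
The estimate works out as follows.  The sketch assumes 300 microseconds
per fault, which is the unit that reproduces the 7 hour total; the 32
faults per stolen group is the rough figure used above, and the variable
names are ours.

```python
stolen_groups = 2685512      # 5th word of hatcnt
faults_per_group = 32        # rough figure used in the estimate
seconds_per_fault = 300e-6   # assumed: 300 microseconds

total_seconds = stolen_groups * faults_per_group * seconds_per_fault
total_hours = total_seconds / 3600.0
print("time spent handling faults: %.1f hours" % total_hours)
```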

We have made one unsuccessful attempt to patch /vmunix so as to increase
the number of page management groups, which is set by the npmgrps variable
and controlled by the NPMGRPS_60 and MNPMGRPS constants.  Has anybody
managed to increase the number of page management groups under SunOS (on
any Sun 4 machine, not necessarily a Sparcstation), either with or without
recompiling the sources, and if so, what is the secret?

					  Gordon Irlam
					  Adelaide University, Australia
					  (gordoni at cs.ua.oz.au)