Memory models?

Charles Hedrick hedrick at athos.rutgers.edu
Fri Feb 26 19:57:22 AEST 1988


randy at umn-cs.UUCP (Randy Orrison) expressed surprise that people 
were talking about memory models on Unix.  He had hoped that under
Unix, 64K would no longer be a magic number.  Sorry to break it
to you, but...

The only way to get around the silly segments (as you put it) is to
get a machine that uses a decent chip.  I haven't played with the 386,
so I can't comment on it, but the 286 isn't one.  The 386 does allow
arbitrarily large segments, so 64K is no longer a problem there, but
on the 286, it certainly is.  Changing OS's from MS-DOS to Unix can't
fix that.  Unix runs in protected mode.  The problem is that the
address space in protected mode contains 3 bits of access mode right
in the middle of the address (the low 3 bits of the 16-bit segment
selector).  So the address after 2fffff is 370000, not 300000.  (One
wonders why they didn't put the magic bits at the other
end of the word.  Were they *trying* to make the machine unusable?)
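
To make the jump concrete, here is a minimal sketch in C of stepping a selector:offset pair across the end of a segment. The function names are made up for illustration, and it assumes consecutive 64K chunks were given consecutive descriptor indexes, which nothing in the hardware requires.

```c
#include <assert.h>
#include <stdint.h>

/* 286 protected-mode selector: bits 0-1 = requested privilege level,
   bit 2 = table indicator, bits 3-15 = descriptor index. */
#define ACCESS_BITS 3u

/* Pack a selector and a 16-bit offset into one 32-bit value,
   selector in the high word (hypothetical helper). */
static uint32_t make_far(uint16_t selector, uint16_t offset)
{
    return ((uint32_t)selector << 16) | offset;
}

/* Address of the byte after `p`, ASSUMING the next 64K chunk uses the
   next descriptor index (an assumption, not a hardware rule). */
uint32_t next_byte(uint32_t p)
{
    uint16_t selector = (uint16_t)(p >> 16);
    uint16_t offset = (uint16_t)(p & 0xFFFFu);

    if (offset != 0xFFFFu)
        return make_far(selector, (uint16_t)(offset + 1u));

    /* Offset wraps: bump the index field by one, which means adding 8
       to the selector so the three access bits stay as they were. */
    assert((unsigned)selector <= 0xFFFFu - (1u << ACCESS_BITS));
    return make_far((uint16_t)(selector + (1u << ACCESS_BITS)), 0);
}
```

With selector 2f and offset ffff this yields selector 37, offset 0, which is the jump described above.
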
There's no way to get a big linear address space.

Microport supports
small and large models.  Small model is 64K each for code and data.
Large model lets both code and data be as big as you like.  But no one
object (subroutine, array, struct, or malloc'ed block) can cross a 64K
segment boundary, which means none can be bigger than 64K.  Microport doesn't
support the huge model.  The huge model emulates a linear address
space.  I'm not sure quite how it would be done.  Probably you'd store
addresses with the high-order 16 bits right-shifted by 3 bits, so that
you had a continuous address.  Then when you needed a real machine
pointer, you'd shift the high-order word left and supply the dratted
access mode bits.

Fortunately none of the software I've tried to port
so far has really needed more than a 64K block of memory.  But if I
were trying to do big Fortran things, or tried to port the real (Gnu)
Emacs, I'd be really in trouble.  Fortunately, for a great majority of
C code, there is no problem.  I used to think a 286 would be lousy for
Lisp, but there should be no problem doing a Lisp for the 286, since
Lisp typically has lots of fairly little blocks of memory.

What
protected mode does give you is in effect an MMU.  Your programs don't
have to know where they are in memory.  In MS-DOS, the segment
registers have to be set up by the OS based on your physical location,
and if you change them you can write over somebody else's memory.
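
For comparison, real-mode address formation is plain arithmetic on the segment value, which is why a wrong segment register lands in someone else's memory. A small sketch in C follows; the function names are hypothetical, and the 20-bit wrap is the 8086 behaviour.

```c
#include <assert.h>
#include <stdint.h>

/* What a DOS-style loader has to do: derive the segment value from the
   place the program was physically loaded (16-byte aligned, below 1 MB). */
uint16_t segment_for_load_address(uint32_t physical)
{
    assert(physical % 16u == 0 && physical < 0x100000u);
    return (uint16_t)(physical >> 4);
}

/* Real-mode address formation: physical = segment * 16 + offset,
   wrapped to the 8086's 20 address lines.  Nothing checks who owns
   the result. */
uint32_t real_mode_physical(uint16_t segment, uint16_t offset)
{
    return (((uint32_t)segment << 4) + offset) & 0xFFFFFu;
}
```
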
Also, your program has to be contiguous in memory.  In Unix (or any
system that uses protected memory), each 64K chunk can be anywhere in
physical memory, and the OS can move them without your program knowing
anything.  But the fact that there are 64K chunks is still visible.
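
The huge-model conversion guessed at earlier is one way to hide those chunks from pointer arithmetic. This is only a sketch of that guess in C, not how any real compiler is known to do it. The function names are invented, and the access bits are assumed to be 7 (local table, privilege level 3) for every user pointer.

```c
#include <assert.h>
#include <stdint.h>

#define ACCESS_BITS 3u
#define USER_ACCESS 0x7u  /* ASSUMED: local-table selector, privilege level 3 */

/* Machine pointer (selector:offset) -> continuous "huge" address:
   drop the three access bits so the index sits right above the offset. */
uint32_t huge_from_machine(uint32_t far_ptr)
{
    uint32_t index = (far_ptr >> 16) >> ACCESS_BITS;
    return (index << 16) | (far_ptr & 0xFFFFu);
}

/* Continuous address -> machine pointer: shift the index back up and
   supply the access bits again. */
uint32_t machine_from_huge(uint32_t huge)
{
    uint32_t index = huge >> 16;

    assert(index < (1u << 13));  /* a selector has only 13 index bits */
    return (((index << ACCESS_BITS) | USER_ACCESS) << 16) | (huge & 0xFFFFu);
}
```

Ordinary addition then works on the continuous form: machine address 2fffff becomes 5ffff, adding 1 gives 60000, and converting back gives 370000. The price is a shift and an OR on every dereference.
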
(Actually, they needn't be 64K.  64K is a maximum.  They can be
smaller.  If you have some reason to do so, you can use just 512 bytes
in each segment, and have something like 8000 segments.  Microport
allocates physical memory in units of 512 bytes.  As far as I can
tell, you can even allocate segments 1, 23, and 7000.  For languages
such as Lisp, this can be quite handy.)

Also, int's are 16 bits in C
on the 286.  This can be a portability problem since a lot of code
these days is written for 32-bit machines.  Note that the large model
is potentially a portability problem.  The small model is more or less
just like a PDP-11 with separate I&D space.  This is one of the
classic machines that Unix supports, so lots of code will work that
way.  But the large model gives you 16 bit int's and 32 bit pointers.
This is really an inconsistency.  The language spec assumes implicitly
that a pointer will fit into an int, and some existing C code relies
on this.  I've ported several C programs and so far I haven't run into
any problems for this reason, but one could imagine programs that
would work with the small model (which while limiting at least has
self-consistent semantics), but not in the large model.  I'm inclined
to think there should be a compiler option that defines int as 32
bits, even though it would be slow.
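
The kind of breakage meant here can be modelled on any machine with fixed-width types. The sketch below is in C, and the typedef and function names are stand-ins invented for the example: `int286` plays the 286's 16-bit int (unsigned, to keep the conversions well defined) and `far286` plays a 32-bit large-model pointer.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-ins, since int is not 16 bits on most current compilers. */
typedef uint16_t int286;  /* models the 16-bit int */
typedef uint32_t far286;  /* models a selector:offset pointer */

/* Returns 1 if a pointer is unchanged after being stored in an int,
   which is what code written for the PDP-11 or a 32-bit machine may
   do without thinking about it. */
int survives_int(far286 p)
{
    int286 stored = (int286)(p & 0xFFFFu);  /* selector half is dropped */
    far286 back = stored;

    assert((back & 0xFFFFu) == (p & 0xFFFFu));  /* the offset always survives */
    return back == p;
}
```

A small-model pointer is only an offset, modelled here with a zero high word, so it survives. A large-model pointer with a real selector in the high word does not.
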

Unix buys you a number of things, including multiple processes and a
clean separation between processes, kernel, devices, etc.  But it
can't create a silk purse from a sow's ear.  And the 286 is a sow's
ear.  (It's still a lot better than the 8086 or 186, since the 286
adds protected mode.)  The trade press is singing the praises of the
386 these days.  Does anybody know if it is really true?  If the 386
is a 286 extended to 32 bits, there is a potential problem.
680x0-based Unix systems all have MMU's.  These typically break your
address space up into pages, which the OS can put anywhere in memory
it likes.  If you want a large linear address space in the 386,
do you have to make your whole program one big segment?  If so,
doesn't that mean it all has to be placed in physical memory in one
big contiguous piece?  Or are machines using external MMU's in
addition to the 386's segment-mapping mechanism?  (The problem with
requiring a program to be in memory contiguously is that when programs
ask for more memory you normally have to swap something out or
reshuffle memory, which involves copying programs around in memory.)



More information about the Comp.unix.microport mailing list