why is free() a void?

Martin Weitzel martin at mwtech.UUCP
Fri Oct 26 09:10:15 AEST 1990


In article <1749 at meaddata.meaddata.com> rob at pmserv.meaddata.com writes:
>
[Q: why free() doesn't return something "useful"?]
>
>From K&R 2, page 252:
>
>    "void free(void *p)
>        ... p MUST be a pointer to space previously allocated by calloc,
>        malloc, or realloc."   (Emphasis on MUST by me.)
>
>What happens if p WASN'T allocated as it should've been??  How do we know
>there was a problem??  Is there really a problem??

As the ANSI-Standard tells us in 4.10.3.2, with a NULL pointer you have
a no-op, otherwise UNDEFINED BEHAVIOR. As some others (esp. Chris Torek)
never seem to get tired to invent scenarios to illustrate undefined
behavior, I'll also try to add one: Undefined behavior *can* mean that
your machine becomes inoperatable until you have typed the following
sentence 100 times at the console terminal:

  "The argument handed to free must be NULL or a pointer to
  space previously allocated by calloc, malloc, or realloc"

But let's get more serious. K&R 2 contains in section 8.7 a sample
implementation of a storage allocator; the code for the free()-function
is on page 188. As this allocator IMHO served for long years as a
modell for real implementations, looking at the effects of free()ing
some arbitrary pointer with this allocator will serve as a good
realistic example for undefined behavior. (And it will give you
enough reason to avoid such behavior, even if someone ensures you
that "undefined behavior" will neither make your equipment explode,
nor result in rude mail beeing sent to your boss ...)

The basic idea of K&Rs allocator (the example is essentially the
same in K&R I and II), is to keep the free()ed parts of dynamically
allocated memory in a linked list. For that purpose each allocation
requests some additional bytes, just enough for storing a number
and a pointer(%1). So, after you have allocated, say 20, bytes the
memory *may* look like follows (lower adresses to the left):

Bytes:	   4  +  4                   20                    4
	+-----+------+-------------------------------------------+
	| XXX |  4   |        20 bytes allocated       |         |
	+-----+------+-------------------------------------------+
	    HEADER   ^
	             | pointer returned be malloc, calloc or realloc

(Note that the byte counts above are only given to make the example a
bit more realistic - they don't mean that pointers or int-s allways
occupy 4 Bytes or are equal in size on all systems!)

In the above example the allocator had "really" reserved 32 Bytes. But
this "waste" shouldn't be of much interest for the programmer: He or she
can count on at least 20 bytes beeing available above the pointer that
was returned. Immediatly below the returned pointer there is some very
important information stored: The size of the "really" allocated area,
meassured in units of the size of the header (8 in the example). These
four bytes are the *sole* information for a later free() to know the
size of the allocated chunk of memory.

So we should have a look at free() now. Its basic work is to walk thru
a linked list of previosly freed chunks, which are chained by the first
field in the header (XXX above). The list is kept in the order of
increasing adresses (for reasons we'll see in a second). After free() has
determined where the new part fits in, it links it either into the list,
or combines it with the immediatly adjacent chunks. (Clearly, this
strategy should keep fragmentation low.)

Now let's come to the question what may happen if you try the following:

	char *greet = "hello, world";
	....
	free (greet);

(Again please note that in the following scenario I make some assumptions,
e.g. that static data is located together with string constants in the lower
parts of the memory, below the heap; these assumptions are quite realistic,
but there are certainly machines and implementations for which this doesn't
hold).

If free() follows the K&R implementation, it will first notice that the
given adress is below the lowest adress of the previously free()ed parts
of dynamic memory(%2). Now some bytes immediatly *below* the adress where
the character 'h' from the "hello, world"-string is stored, will be
taken as int which gives the size of the chunk free() is about to link
into the list. NOTE THAT THIS IS TOTALLY UNRELATED TO THE LENGHT OF
THE "hello, world"-STRING! Now free() decides that the bytes which are
just beeing returned to the allocator cannot be combined with something
else, so it links them as an alement into the list, WHICH CHANGES SOME
BYTES AT SOME DISTANCE BELOW THE ADRESS OF THE CHARACTER 'h'. Besides
that, so far not much damage has happened, and there is a fair chance
that the changed bytes belong to some static variable (or other string)
which is not any more needed until the program ends.

Let's consider the next request to the allocator now (malloc, calloc or
realloc). It steps thru the list, treating the bytes above the link-pointer
as size information to decide if a given chunk may be used to fullfill the
request. If we are lucky, these bytes (again: the contents of some totally
unrelated static variable, or some other string) tell that it is too small,
and nothing will happen. With less luck, the space of the "hello, world"-
string will now be reused - something the programmer who originally free()ed
it might have had in mind :-) - but not only that: Remember that the "how
much space is available"-information is IN NO WAY RELATED to the length of
the "hello, world"-string. If less than the "available" space was requested,
some space at the "end" of the chunk under consideration is returned (and
of course the size information is properly adjusted). I think some graphic
will help to show what this really means:

Bytes:	        4     4    ---------16----------   4     4     8
	+------------------------------------------------------------+
	|      XXX   nnn   hello, world\0       | YYY | mmm | ZZZ    |
	+------------------------------------------------------------+
	                   ^                                 ^
	                   | pointer handed to free          | malloc(6)

- XXX are the bytes overwritten by the erreneous free();
- nnn are some bytes that may accidentially form the value 5 (if treated
  as an int) when the request to malloc() starts beeing processed;
  nnn will be changed to 3 before malloc() returns;
- mmm are some bytes that will be treated as an int and set to the value
  2 before malloc() returns;
- ZZZ are the bytes that will be overwritten, if the "space" to which
  malloc() returns a pointer gets used;
- YYY are some bytes that may be used as a link pointer if the space just
  allocated will ever be returned (It is left as an exercise to the reader
  (:-)) to realize that this will not happen if the free follows immediatly
  and that also one more malloc() with a specific size could result in
  changing YYY :-)

I hope everyone who has followed so far will be convinced now that
things become more and more unpredictable and that undefined behavior
is the last thing which one would wish for some program. At least *I*
definetly wouldn't want to debug such a program, as it may be very very
hard to trace back from the point the error shows up to the point where
things started go wrong(%3).
----------------------------------------------------------------------
%1: Note that a more "space efficient" implementation would need only
    the space for the size, but not for the pointer, as the latter is
    only required as long as the chunks are "free". The K&R implementation
    was choosen to implement the obscure AND UNPORTABLE feature that you
    can use free()ed space until the next malloc() -- a thread about this
    topic appeared a short time ago in this group.
%2: Interestingly enough, though K&R II specifies in the appendix about
    the library that free(NULL) does nothing, they fail to implement
    this behavior in their "own" function. What can we learn from this?
    (No, I have no answer, take it more as a fact I observed.)
%3: For debugging techniques which might help you out of such situations,
    you may want to read Robert Ward's book "Debugging C" or the Article
    "Controlling the Malloc Heap" by Eric White in the CUJ Volume 7,
    Number 2 (Feb 1989).
-- 
Martin Weitzel, email: martin at mwtech.UUCP, voice: 49-(0)6151-6 56 83



More information about the Comp.lang.c mailing list