"(char *) NULL", and the like ["Re: SIZEOF"]

Sat Jan 26 09:34:56 AEST 1985

The problem of casting null pointers was discussed in this group sometime
last year.  Let's not start it again.  For the benefit of those who didn't
see them then, I'm reposting in my own words the authoritative answers.

I'm sorry about the length of this item, but I thought it was important
to cover all the major points clearly in one article.  And I wanted to
do it myself without waiting for more responses because the articles
I'm replying to originated from here in Toronto, and this one hopefully
won't be too far behind them on the net.

Peter Curran (ecr!peterc) writes:

> Of course, C SHOULD be defined to allow sizeof(int) != sizeof(int *).
> However, due to one point in the Reference Manual, and K&R...
> ... they are actually required to
> be equal.  The problem is that "0" is defined to be the Null pointer
> constant.  When "0" is passed as a parameter to a function, the compiler
> cannot tell whether an int or an int * is intended.

This is not quite true.  What the manual actually says is:  "The assignment of the
constant 0 to a pointer will produce a null pointer".  Notice the word
assignment; it doesn't say anything about function calls.  I'll get to
them later, but first I want to answer some other remarks.

Dennis Smith (ecr!quenton) writes [edited for brevity]:

> The problem of passing 0 for a null pointer (as a parameter), and
> the solution ["#define NULL ((char *)0)"], as pointed out by P.Curran,
> is valid.  However, [this,] although portable, will cause many compilers
> to complain about differing pointer types, and will also cause lint to
> generate many additional useless messages. The only generally usable solution
> that I know of is -
>   #define NULL 0	/** when sizeof(xxx *) == sizeof(int) **/
>   #define NULL 0L       /** when sizeof(xxx *) == sizeof(long) **/
> This unfortunately means that the "define" must be changed whenever
> the target machine/compiler/environment changes.

This objection is correct.  Furthermore, the complaints by lint and
the C compiler are also correct, and not useless.  Smith goes on to
give the reason himself:

> It might also be noted ... compilers for certain ... computers, generate
> pointers of differing sizes.  This occurs when the machine is not
> byte addressable, so that ...
>   sizeof(char *) != sizeof(int *)

Precisely.  You can't assume that pointers to different types are the same
size, nor can you assume they're the size of an int.

Now, about function calls.  Because functions may be compiled separately,
all C compilers (that I know of) pass arguments to them without any type
checking.  If you call

	Funk (0);

then sizeof(0), i.e. sizeof(int), bytes go on the stack.  And if it's declared

	Funk (a)
	long a;

then it reads sizeof(long) bytes from the stack, and you're in trouble if
you're using a machine where ints are shorter than longs.  Well, the case

	Funk (a)
	int *a;

is NO DIFFERENT.  Just as you would have to pass a long-valued expression
to the first version, such as Funk(0L); or Funk((long)0);, in the second
case you have to call

	Funk ((int *)0);
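
A minimal sketch of the size argument, assuming a present-day C compiler.
The helper names are invented for illustration and come from none of the
quoted articles; each one reports how many bytes a given spelling of zero
occupies as an argument expression.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helpers (names invented for this sketch): each returns the
   size of one way of writing "zero" as a function argument. */
size_t size_of_plain_zero(void)   { return sizeof(0); }         /* an int */
size_t size_of_long_zero(void)    { return sizeof((long) 0); }  /* a long */
size_t size_of_int_ptr_zero(void) { return sizeof((int *) 0); } /* an int * */

/* Usage sketch: the type of the expression, not its value, decides how
   many bytes are passed. */
void check_zero_sizes(void)
{
    assert(size_of_plain_zero() == sizeof(int));
    assert(size_of_long_zero() == sizeof(long));
    assert(size_of_int_ptr_zero() == sizeof(int *));
}
```

On a machine where these three sizes differ, only the cast forms pass the
number of bytes that the called function will read.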

Now, you CAN, if you want, declare a separate null pointer of each type:

	#define	NULLCHARP	((char *) 0)
	#define	NULLINTP	((int *) 0)
	#define	NULLSXPP	((struct x **) 0)	/* yuck! */
and call
	Funk (NULLINTP);

but you can see how silly this gets.  What's much better is to simply declare

	#define	NULL	0
and call
	Funk ((int *) NULL);

Incidentally, the manuals for some versions of UNIX contain inadequate
or incorrect descriptions of certain functions that are commonly passed
null pointers.  The last argument of execl, for instance, is sometimes
given as 0.  It should be (char *)0, since all the other arguments to
the function are char *'s and the number varies.
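
A rough sketch of why the terminator matters, assuming a present-day
compiler with <stdarg.h> (which the manuals under discussion predate).
count_strings is a hypothetical stand-in for execl, not its real
implementation: it fetches every argument as a char * until it finds a
null one, so the terminator has to be passed as a char * as well.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>

/* Hypothetical execl-like function: counts its char * arguments up to,
   but not including, the terminating null pointer. */
int count_strings(const char *first, ...)
{
    va_list ap;
    const char *s = first;
    int n = 0;

    va_start(ap, first);
    while (s != NULL) {
        n++;
        s = va_arg(ap, char *);   /* each argument is read as a char * */
    }
    va_end(ap);
    return n;
}

/* Usage sketch: the list is closed with (char *) 0, not a bare 0. */
void check_count_strings(void)
{
    assert(count_strings("ls", "-l", "/tmp", (char *) 0) == 3);
    assert(count_strings((char *) 0) == 0);
}
```

With a bare 0 the callee would try to read a char * where only an int was
passed, which is the Funk problem over again.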

Similarly, the &buffer arguments to fread and fwrite should be cast to
(char *) since the function can't tell in advance what kind of pointer to
expect and has to assume something.  The call "fread (&x, sizeof x, n, stdin);"
is correct only if x is of type char; otherwise the first argument
should be "(char *) &x".
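
The cast can be sketched as follows, assuming a hosted implementation on
which tmpfile() is able to create a scratch file.  int_round_trips is a
hypothetical helper, not part of stdio.  On a present-day compiler the
declarations in <stdio.h> make these casts redundant, though harmless.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: write one int to a temporary file and read it
   back, casting the buffer address to (char *) at both calls.  Returns 1
   when the value survives the round trip, 0 when it does not or when any
   I/O step fails. */
int int_round_trips(int value)
{
    int copy = 0;
    int ok = 0;
    FILE *fp = tmpfile();

    if (fp == NULL)
        return 0;
    if (fwrite((char *) &value, sizeof value, 1, fp) == 1) {
        rewind(fp);
        if (fread((char *) &copy, sizeof copy, 1, fp) == 1)
            ok = (copy == value);
    }
    fclose(fp);
    return ok;
}

/* Usage sketch; assumes the temporary file could be created. */
void check_int_round_trips(void)
{
    assert(int_round_trips(12345) == 1);
}
```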

It can also be argued that it's a bug for stdio to #define NULL 0; it should
be #define NULL (FILE *)0, since functions like fopen() [whose type is FILE *]
are claimed to return NULL on error.  However, since comparisons with 0,
like assigning 0, are guaranteed to work, I think what they really meant
to say was that the functions will return a value that compares equal to NULL,
and thus #define NULL 0 is correct here also.
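
A small sketch of the comparison guarantee, assuming a present-day
compiler; is_null_stream is a hypothetical name.  Comparing a FILE * with
the plain constant 0 needs no cast, because the compiler can see the
pointer type on the other side of the ==.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: reports whether a stream pointer is null by
   comparing it with the plain constant 0. */
int is_null_stream(FILE *fp)
{
    return fp == 0;
}

/* Usage sketch: a null FILE * compares equal to 0; stdin does not. */
void check_null_stream(void)
{
    assert(is_null_stream((FILE *) 0) == 1);
    assert(is_null_stream(stdin) == 0);
}
```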

To return to Curran:

> The real solution, of course, would be to introduce a new keyword, say "null",
> which represents the Null address constant, with an implementation-
> defined value.  However, I doubt that that will ever come about.

That won't help, because there are different kinds of null pointers, one
for each pointer type, and they needn't all be the same size.
I understand that the REAL real solution is embodied in the forthcoming
C standard.  The new standard C will let you declare the types of the
arguments a function expects in the code that calls it.  By declaring
fread, for instance, this way:

	int fread(char *, int, int, FILE *);

you can make that "(char *)" in the call unnecessary.  And similarly for
the Funk example.  However, I don't think there's any hope for execl,
since its argument list varies in length too.
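
For the Funk example, a minimal sketch, assuming a compiler that accepts
declarations of this form.  The body given to Funk here is hypothetical,
chosen only so that the result can be checked.

```c
#include <assert.h>
#include <stddef.h>

/* Declaration in the new style: callers now know the argument is an int *. */
int Funk(int *a);

/* With the declaration in scope, a plain 0 is converted to a null int *
   before the call, so no cast is needed. */
int call_funk_with_plain_zero(void)
{
    return Funk(0);
}

/* Hypothetical body: returns 1 for a null pointer, 0 otherwise. */
int Funk(int *a)
{
    return a == NULL;
}

/* Usage sketch. */
void check_funk(void)
{
    int x = 0;

    assert(call_funk_with_plain_zero() == 1);
    assert(Funk(&x) == 0);
}
```

Nothing comparable is possible for the variable part of execl's argument
list, since a declaration cannot give a type to arguments it doesn't name.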

{ allegra | decvax | duke | ihnp4 | linus | watmath | ... } !utzoo!lsuc!msb
Mark Brader                                 also uw-beaver!utcsrgv!lsuc!msb

"I don't cook.  I don't sew.  I don't clean houses.
 That's what God gave us other people for."			-- Night Court