Pointers and Arrays

Mon Jul 21 20:13:02 AEST 1986

> All right, tell me:  What is the type of the address of an array?
> 
> That is, suppose I write:
> 
> 	int a[10];
> 
> What type is &a?  Don't tell me it's "pointer to integer" because
> that is the type of &a[0], and a and a[0] are different things.

No, it's "pointer to array of 10 'int's", a type that is already in C;
consider

	int a[10][10];

and ask what the type of "&a[5]" is.  PCC, at least, even handles this sort
of type correctly; you can declare such a pointer, assign a value to it
(even though it's only possible to construct such a value by taking the
address of a subarray, at least in C as she currently is spoke), and select
an element from the array that it points to by the obvious method.
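
As a minimal sketch of that declare/assign/select sequence (the names
"row" and "sixth_row_elem" are mine, invented purely for illustration, and
this assumes a compiler that, like PCC as described above, accepts "&"
applied to a subarray):

```c
int a[10][10];

/* Return element "col" of the sixth subarray of "a", going through a
   pointer to that subarray rather than through "a" directly. */
int sixth_row_elem(int col)
{
    int (*row)[10];     /* pointer to array of 10 "int"s */

    row = &a[5];        /* address of a subarray; assumed to be accepted */
    return (*row)[col]; /* select an element by the obvious method */
}
```

The parentheses in the declaration matter: "int *row[10]" would declare an
array of 10 pointers to "int" instead.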

This brings up an interesting problem.  The ANSI C draft I have (Aug 11, 1985)
says

	C.2.2.1 Arrays, functions, and pointers

	   Except when used as the operand of the "sizeof" operator or
	when a string literal is used to initialize an array of "char"s,
	an expression that has type "array of 'type'" is converted to
	an expression that has type "pointer to 'type'" and that points
	to the initial member of the array object.

This is a generalization of what "The C Reference Manual" says, which is:

	7.1 Primary expressions

	...

	An identifier is a primary expression, provided that it has
	been suitably declared as discussed below.  Its type is specified
	by its declaration.  If the type of the identifier is
	"array of ...", however, then the value of the identifier-expression
	is a pointer to the first object in the array, and the type of
	the expression is "pointer to ...".

This is silent about "array-valued expressions", except that it implies that
the name of an array is not an array-valued expression.  It later (in 8.7
Type names) acknowledges the existence of the type "pointer to array of
...", but doesn't indicate what happens if it encounters an expression of
that type.

The ANSI C statement seems to be the obvious way of correcting this
omission.  However, it now makes it harder to construct a value of this
type.
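
To make the draft's rule concrete, here is a small sketch; the function
names are hypothetical, made up only for this illustration.  The operand of
"sizeof" keeps its array type, while in any other expression the array name
is converted to a pointer to its initial member:

```c
#include <stddef.h>

int a[10];

/* Operand of "sizeof": no conversion, so this is the size of all 10 ints. */
size_t whole_array_size(void)
{
    return sizeof a;
}

/* In "a + 0" the name is converted first, so this is the size of an "int *". */
size_t converted_size(void)
{
    return sizeof (a + 0);
}

/* As an ordinary expression, "a" is a pointer to the initial member. */
int *initial_member(void)
{
    return a;
}
```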

Neither K&R C nor ANSI C allows you to construct a pointer to an array that
is not a member of another array (if you declare "int a[10]", "&a" is
illegal and "a" is a pointer to the first member of the array, not to the
array itself).  However, K&R C does not explicitly *forbid* putting an "&"
in front of an expression that is a member of an array of arrays.  E.g., if
you declare "int a[10][10]", it doesn't forbid "&a[3]".  (Our PCC, and
probably most, if not all PCCs, *will* complain about this; I don't know if
this is plugging a loophole in the rules, or just an accident of the
implementation.)

ANSI C, however, says that *any* expression of type "array of 'type'" is
converted to a pointer to the first element of that array (hence of type
"pointer to 'type'").  This means that the expression "&a[3]" is invalid,
since "a[3]" is an array-valued expression referring to the fourth member of
"a", and this is converted to a pointer to the first member of the fourth
member of "a"; this expression cannot have its address taken.

You can get a pointer to the *first* member of "a"; the expression "a" is
converted to such a pointer.  You can then get a pointer to other members
with pointer arithmetic; i.e., "a + 1" is a pointer to the second member of
"a" (which is another 10-element array of "int").  Unfortunately, this means
that something that works for arrays of types that are not arrays won't work
for arrays of types that are.  If you have "int a[10]", "&a[5]" is a pointer
to "a"'s sixth element; if you have "int a[10][10]", however, "&a[5]" is
illegal.

This is a bit of a rough spot in C's type system.  It would be preferable if
the operand of the "&" operator, like the operand of the "sizeof" operator,
were not converted from "array of 'type'" to "pointer to first element of
array of 'type'".  If this were the case, "&a" would be legal, regardless of
the type of "a" (except, possibly, if "a" were of type "function returning
'type'", and perhaps even that could be allowed).  This would make pointers
to arrays more useful, and would permit a routine that took a pointer to an
array to be written as such, instead of using the subterfuge of declaring
the argument in question to be a pointer to an element of such an array.
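
A sketch of the two styles, side by side.  All the names here are
hypothetical, and the second call assumes the rule proposed above, under
which "&a" yields a pointer to the whole array:

```c
int a[10] = { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 };

/* The subterfuge: the routine really wants an array of 10 ints, but its
   argument is declared as a pointer to an element of such an array. */
int sum_by_element(int *p)
{
    int i, total = 0;

    for (i = 0; i < 10; i++)
        total += p[i];
    return total;
}

/* Written "as such": the argument is a pointer to the array itself. */
int sum_by_array(int (*ap)[10])
{
    int i, total = 0;

    for (i = 0; i < 10; i++)
        total += (*ap)[i];
    return total;
}

int demo_element(void)
{
    return sum_by_element(a);   /* "a" converted to pointer to first member */
}

int demo_array(void)
{
    return sum_by_array(&a);    /* assumes "&a" is allowed, as proposed */
}
```

With the second form the declaration itself says how many elements the
routine expects, rather than leaving that to a comment.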

One would presumably be allowed to declare a pointer of type "pointer to
array of 'type' of unspecified size", thus permitting a function to take
arrays of arbitrary size as arguments.  The Aug 11, 1985 ANSI C draft seems
to forbid this; an array size specifier "must be present, except that the
first size may be omitted when an array is being declared as a formal
parameter of a function, or when the array declaration has storage-class
specifier 'extern' and the definition that actually allocates storage is
given elsewhere."  One is currently allowed to do so by PCC, at least; K&R
doesn't forbid it, although the only contexts in which it discusses such
array specifiers are the two mentioned by the ANSI C draft.

If "&" is to be changed to work like "sizeof", the rules for type specifiers
should also be changed in this fashion.  Yes, there will be a problem with
pointer arithmetic involving pointers to "arrays of unspecified size"; this
will have to be forbidden.  However, ANSI C already has objects like this;
consider a pointer to a structure of unspecified shape.  One can declare
such pointers - this is needed to deal with mutually-recursive structures,
where an object of type "struct a" contains a pointer to an object of type
"struct b", and *vice versa* - and the language must somehow forbid pointer
arithmetic on such pointers, at least until the structure's shape is
declared.
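
A sketch of both cases.  The names "sum_unsized" and "link_pair" are
hypothetical, and the array half assumes the changes proposed above: that
"int (*ap)[]" may be declared, and that "&" applied to an array yields a
pointer to the array:

```c
struct b;                       /* shape not yet specified */

struct a {
    struct b *other;            /* pointer to a structure of unknown shape */
    int tag;
};

struct b {                      /* now the shape is declared */
    struct a *other;
    int tag;
};

/* Make "x" and "y" point at each other. */
void link_pair(struct a *x, struct b *y)
{
    x->other = y;
    y->other = x;
}

/* Sum the "n" members of an array of "int"s of unspecified size. */
int sum_unsized(int (*ap)[], int n)
{
    int i, total = 0;

    for (i = 0; i < n; i++)
        total += (*ap)[i];      /* indexing the array pointed to is fine */
    /* "ap + 1" would have to be rejected: the size of "*ap" is unknown */
    return total;
}
```

A caller with "int v[4]" would write "sum_unsized(&v, 4)", passing the
size separately since the pointer's type no longer carries it.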

If anyone wonders why the type "pointer to array of 'type'" would be useful,
and is not swayed by arguments involving the completeness of type systems or
the relative merits of using "pointer to array of 'type'" to point to
something of type "array of 'type'" rather than using "pointer to 'type'",
consider a program stepping through an array of "vectors", defined as arrays
of three "double"s, computing the norm of each one.  It *could* do so by
stepping an index of integral type, but one reason why C pointers work the
way they do is so you can step through an array by stepping a pointer into
that array!  (These arguments sound somewhat similar to the arguments about
taking the address of a "jmp_buf" using "&", since forbidding "&" to be
applied to an array forces a programmer to know whether "jmp_buf" is
implemented as an array or a structure.  In both cases, one is forced to
treat arrays differently from other sorts of objects, and it seems
unnecessary to require this.)
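
A sketch of the vector example, stepping a pointer of type "pointer to
array of 3 'double's" through the array.  The function name is
hypothetical, and to keep the example free of library dependencies it
computes the *squared* Euclidean norm; apply "sqrt" from <math.h> to each
result if the norm proper is wanted:

```c
/* Store the squared norm of each of the "n" vectors in "out[0..n-1]". */
void squared_norms(double (*vecs)[3], int n, double *out)
{
    double (*vp)[3];            /* steps from one vector to the next */

    for (vp = vecs; vp < vecs + n; vp++)
        *out++ = (*vp)[0] * (*vp)[0]
               + (*vp)[1] * (*vp)[1]
               + (*vp)[2] * (*vp)[2];
}
```

Given "double v[2][3]", the call is "squared_norms(v, 2, result)"; "v" is
converted to a pointer to its first member, which is itself an array of
three "double"s, so "vp++" advances by one whole vector.
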
-- 
	Guy Harris
	{ihnp4, decvax, seismo, decwrl, ...}!sun!guy
	guy at sun.com (or guy at sun.arpa)