casting to a union

Wed Dec 13 17:33:27 AEST 1989

In article <3455 at ccncsu.ColoState.EDU> wendt at handel.cs.colostate.edu
(alan l wendt) writes:
>I have a function that accepts either char pointers or ints as
>arguments, so I declare it with unions,

(probably a mistake---have two different functions, or make it a variadic
function; it will make life easier)
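
A minimal sketch of the variadic route, assuming a hypothetical leading
`kind' argument that tells foo which type follows.  The tag names
ARG_INT and ARG_STR and the return value are invented for illustration;
none of them come from the original question.

```c
#include <stdarg.h>
#include <string.h>

/* Hypothetical tags saying what the second argument is. */
enum { ARG_INT, ARG_STR };

/*
 * Illustrative stand-in for foo: returns the int itself, or the
 * length of the string, so the caller can see which branch ran.
 * `kind' is a plain int so that va_start is given a parameter
 * whose type is unchanged by the default argument promotions.
 */
static long foo(int kind, ...)
{
    va_list ap;
    long result = 0;

    va_start(ap, kind);
    switch (kind) {
    case ARG_INT:
        result = va_arg(ap, int);
        break;
    case ARG_STR:
        result = (long)strlen(va_arg(ap, char *));
        break;
    }
    va_end(ap);
    return result;
}
```

The calls are then foo(ARG_INT, 7) and foo(ARG_STR, "abc"); no union
and no cast are needed.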

>union numstr {
>    int i;
>    char *s;
>    };
>
>void foo( union numstr ns ) { ... }
>
>Why can't I call it as follows:
>
>	foo((union numstr)7);

No casts to or from aggregate types are allowed in C.  (GCC allows
some such casts; this is an extension.)  In particular, there is a
large semantic problem: which element(s) of the union get the value?
The simplistic answer (the union element that has the same type as
the argument) is not sufficient.  A better answer is to declare
that a cast to some type is semantically equivalent to declaring a
variable of that type, initialising it with the value(s) being
cast, and returning the (rvalue of the) variable.  This would allow
aggregate casts and give them well-defined meanings:

	struct foo { int i; char c; short *s; double d; };
	... (struct foo) { 1, 'a', &myshort, 3.14159265 } ...

Unfortunately, K&R 1 disallows all aggregate initialisers except
in static (compile-time constant) contexts, and most useful places
for casts are not such.  The proposed ANSI standard allows more,
but says that union initialisers are for the first element only.
Hence

	(union numstr) expr

would (under this proposal) set only the first element of the
temporary `union numstr' `variable'.
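
The `declare a variable, initialise it, return its rvalue' reading can
be written out by hand without any language change.  The following is
only a sketch: the helper names numstr_from_int and numstr_from_str are
invented for illustration and appear nowhere in the original question.

```c
union numstr {
    int i;
    char *s;
};

/* Hypothetical helper: a temporary union numstr with `i' set. */
static union numstr numstr_from_int(int i)
{
    union numstr ns;

    ns.i = i;
    return ns;          /* the rvalue of the temporary */
}

/* Hypothetical helper: a temporary union numstr with `s' set. */
static union numstr numstr_from_str(char *s)
{
    union numstr ns;

    ns.s = s;
    return ns;
}
```

With these, the call becomes foo(numstr_from_int(7)), and the member
is chosen by name rather than guessed from the argument's type.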

>Probably if this were allowed in, the backwards conversion would
>also be allowed, i.e. from (union numstr) to (int).

Here the semantic gap is even worse.  What is the value of

	struct foo { /* as above */ } x;
	(char *)x

?  What about

	union numstr ns;
	(void *) ns

?  Which element did you want?

The usual answer at this point (yes, this is one of the many repeating
comp.lang.c topics) is that the element of the union which should be
chosen for the cast is the one which has the same type as the thing
being cast.  This has two problems: (a) there may be no same type; (b)
there may be more than one `same' type, differing only in object types,
not in value types.  For instance:

	union junk { int i; short s; };
	char c;
	f ( (union junk)c );

`c' is a char; its rvalue type is `int', so we could set either i or s,
both of which will hold all possible `char' values.  It does make a
difference which is chosen: if `s' is set, some parts of `i' may be
uninitialised (full of junk).  Now take a harder case:
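
To see the difference on a particular machine, one can fill the union
with a known byte pattern, store through `s', and count how many of the
remaining bytes still hold the pattern.  This is a sketch only: the
function name and the 0xAA fill are arbitrary choices, and the language
does not promise what the bytes outside `s' contain, so the count shows
merely what one implementation happens to do.

```c
#include <stddef.h>
#include <string.h>

union junk { int i; short s; };

/*
 * Fill the union with 0xAA, store `c' into the short member, and
 * return how many bytes beyond the short still hold 0xAA.
 */
static size_t bytes_untouched_by_s(char c)
{
    union junk u;
    unsigned char *p = (unsigned char *)&u;
    size_t k, n = 0;

    memset(&u, 0xAA, sizeof u);
    u.s = c;
    for (k = sizeof u.s; k < sizeof u; k++)
        if (p[k] == 0xAA)
            n++;
    return n;
}
```

Any nonzero result means that reading `i' afterwards would pick up
bytes that the store through `s' never wrote.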

	union junk { struct foo st; double d[2]; };
	f ( (union junk){ 1, 2.5 } );

Is this putting the 1 and 2.5 into st.i (an int) and st.c (a char), or
is it putting the 1 and 2.5 into d[0] and d[1]?  They all fit (that is,
they are all assignment compatible), but neither choice matches the
types exactly, and it is far from clear which is meant.
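
For what it is worth, the later C99 standard adopted a syntax of
essentially this shape (compound literals), together with designated
initialisers that let the programmer name the union member outright,
which removes the guesswork.  A brief sketch, assuming a C99 compiler;
the functions sum_doubles, first_int and the two demo wrappers are
hypothetical and exist only to show the two readings side by side.

```c
struct foo { int i; char c; short *s; double d; };
union junk { struct foo st; double d[2]; };

/* Hypothetical consumers, one per union member. */
static double sum_doubles(union junk u) { return u.d[0] + u.d[1]; }
static int first_int(union junk u) { return u.st.i; }

/* The values go into d[0] and d[1] because `.d' says so. */
static double demo_doubles(void)
{
    return sum_doubles((union junk){ .d = { 1, 2.5 } });
}

/* The values go into st.i and st.c because `.st' says so. */
static int demo_struct(void)
{
    return first_int((union junk){ .st = { 1, 'a' } });
}
```

Without a designator, a C99 compound literal follows the same rule as
any other union initialiser and sets the first member only.
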
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163)
Domain:	chris at cs.umd.edu	Path:	uunet!mimsy!chris