Character types in ANSI C (long, but the meat is at the top)

msb at sq.UUCP
Fri Feb 20 04:29:26 AEST 1987


> What exactly are the compatibility rules for character types in ANSI C?
> I.e. which of the following [pointer assignments] are legal:
> 
>     char *p1; unsigned char *p2; signed char *p3;
>     p1 = p2; p1 = p3; p2 = p3;

They are all illegal.  The draft specifies three different "char" types,
though in any particular implementation two of them are treated similarly.

To avoid confusion, let me add that the *character* assignments

	*p1 = *p2; *p1 = *p3; *p2 = *p3;

are all legal, and the *explicit* pointer conversions

	p1 = (char *) p2; p2 = (unsigned char *) p3;

are also legal.
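
As a minimal sketch of the legal forms just listed (the function name
char_type_demo and the variables a, b and c are hypothetical, chosen only
for illustration), the following should compile without any diagnostic
about pointer types:

```c
#include <assert.h>

/* Hypothetical helper: exercises only the forms described as legal. */
int char_type_demo(void)
{
    char a = 'x';
    unsigned char b = 'y';
    signed char c = 'z';

    char *p1 = &a;
    unsigned char *p2 = &b;
    signed char *p3 = &c;

    /* Character assignments: the values are converted, so the three
       distinct pointer types never meet. */
    *p1 = *p2;                  /* a becomes 'y' */
    *p1 = *p3;                  /* a becomes 'z' */
    *p2 = *p3;                  /* b becomes 'z' */

    /* Explicit pointer conversions: accepted because of the casts. */
    p1 = (char *) p2;           /* p1 now points at b */
    p2 = (unsigned char *) p3;  /* p2 now points at c */
    assert(p1 == (char *) &b);

    /* Writing "p1 = p2;" with no cast is the form called illegal above. */

    return a == 'z' && *p1 == 'z' && *p2 == 'z';
}
```

Calling char_type_demo() should return 1; uncommenting an uncast pointer
assignment is the easiest way to see what a given compiler says about it.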

Furthermore, the treatment of "signed" in conjunction with "char" is
different from its treatment in conjunction with "int" or "long".
In the latter cases, "signed" is a noise word.  Thus if "char" in
the original example were changed to "int", then p1=p3; would be legal.
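
A short sketch of the "int" case, under the same caveats as before (the
name signed_int_demo is hypothetical): the assignment from a "signed int"
pointer to an "int" pointer needs no cast, while the unsigned pointer
still does.

```c
#include <assert.h>

/* Hypothetical helper: "signed int" and "int" name one type. */
int signed_int_demo(void)
{
    int n = 42;
    int *p1;
    unsigned int *p2;
    signed int *p3 = &n;

    p1 = p3;                    /* no cast needed: same type */
    p2 = (unsigned int *) p3;   /* "unsigned int" is a different type,
                                   so the cast is still required */
    assert(p1 == &n);

    /* A nonnegative value has the same representation in both types. */
    return *p1 == 42 && *p2 == 42u;
}
```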

In my formal submission, which was too long to post to this group,
I suggested that most of #3.1.2.5 needed editorial improvement, and
provided the following suggested text.  I believe it conveys the same
facts that the existing draft is supposed to, but more understandably.
It is based on a close reading of the draft and mail conversations
with Larry Rosler.  Any errors are mine.

				---

   The following are always *signed integral types*:  "signed char",
   "short int", "int", and "long int".  For the last three types listed,
   the set of values of each type is a superset of the set of values of
   the preceding listed type.
   
   An object declared as "signed char" is large enough to store any
   member of the execution character set, and if any member of the
   required source character set enumerated in #2.2.1 is stored in the
   object, its value is guaranteed to be positive.  The size of an object
   declared "int" is a natural size suggested by the architecture of the
   execution environment.
   
   Corresponding respectively to the above four types are the *unsigned
   integral types*:  "unsigned char", "unsigned short int", "unsigned
   int", and "unsigned long int".  In each case an object of unsigned
   integral type utilizes the same amount of storage as does an object of
   the corresponding signed integral type, including its sign.  The set
   of nonnegative values of a signed integral type is a subset of that of
   the corresponding unsigned integral type, and the representation of
   the same value in each type is the same.  A computation in an unsigned
   integral type can never overflow, because a result that cannot be
   represented in the type is reduced modulo the largest number that can
   be represented in the type plus one.
   
   The type "char" is either a signed integral type with the same set of
   values as "signed char", or an unsigned integral type with the same
   set of values as "unsigned char"; which of the two applies is
   implementation-dependent.
   
   Even if the implementation defines two or more types of integers to
   have the same set of values, they are nevertheless different types.**

 **Thus even if "char" is a signed integral type, "signed char" is a
   different type.  On the other hand, as explained in #3.5.2, "signed
   int" is merely an alternate way of specifying the type "int".

				---
The reference to #3.5.2 is to the following text, which I would put there:
				---

   The keyword "signed" has no effect when specified in conjunction with
   "int" or in a construction where "int" is implied.**

 **Thus "signed" alone is equivalent to "int" alone.

				---
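
Two points in the suggested #3.1.2.5 text above can be checked directly
on a given implementation.  This is only a sketch, and the function names
plain_char_is_signed and wrap_add are hypothetical; the limit macros come
from <limits.h>.

```c
#include <assert.h>
#include <limits.h>

/* Returns 1 if plain "char" has the values of "signed char",
   0 if it has the values of "unsigned char".  Which one applies is
   up to the implementation; either way "char" stays a distinct type. */
int plain_char_is_signed(void)
{
    if (CHAR_MIN < 0) {
        assert(CHAR_MIN == SCHAR_MIN && CHAR_MAX == SCHAR_MAX);
        return 1;
    }
    assert(CHAR_MIN == 0 && CHAR_MAX == UCHAR_MAX);
    return 0;
}

/* Unsigned arithmetic cannot overflow: the result is reduced modulo
   (largest representable value + 1), i.e. modulo UINT_MAX + 1 here. */
unsigned int wrap_add(unsigned int a, unsigned int b)
{
    return a + b;
}
```

For instance, wrap_add(UINT_MAX, 1u) yields 0 on any implementation,
whereas the result of plain_char_is_signed() varies from one to another.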
Mark Brader, utzoo!sq!msb
#define	MSB(type)	(~(((unsigned type)-1)>>1))


