Multibyte characters

Doug Gwyn gwyn at smoke.BRL.MIL
Thu Jul 5 13:33:17 AEST 1990


In article <1467 at inset.UUCP> mikeb at inset.co.uk (Mike Banahan) writes:
>Presumably the integral constant '@' is a three-byte constant, no matter
>what it may look like?

No, the value of such a multibyte character constant is implementation-
defined.  The type of the constant is int.
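
As a small illustration of the type rule (a sketch only; the function
names are made up for this example), sizeof applied to a character
constant gives the size of an int, not of a char:

```c
#include <assert.h>
#include <stddef.h>

/* Size, in bytes, of a character constant as the compiler sees it. */
size_t char_constant_size(void)
{
    return sizeof 'a';
}

/* Size of a char object, which is 1 by definition. */
size_t char_object_size(void)
{
    char c = 'a';
    return sizeof c;
}

/* Demonstrates the difference; applies to C, not C++. */
void demo_sizes(void)
{
    assert(char_constant_size() == sizeof(int));
    assert(char_object_size() == 1);
}
```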

>An alternative interpretation is that it violates the constraint in
>2.2.1.2 `a .. character constant .. shall begin and end in the initial
>shift state', but presumably I can expect my implementation to do the
>necessary good deeds and put a shift-out in there too.

No, you had better put the shift-out in there too or the final ' may not
be recognized by the compiler.

>Since it is a three-byte constant (assuming I'm right), then can I be
>sure that I do not get overflow when I assign it to a char variable?

It is most unlikely that the implementation definition will assign a
value less than 256 to '@'.  Therefore (assuming that chars are
represented in 8 bits, as is usually the case these days), information
will be lost if you assign that character constant to a char variable.
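
A rough sketch of that loss, assuming 8-bit chars.  SAMPLE_VALUE is a
hypothetical stand-in for whatever value an implementation might give
such a constant; the real value is implementation-defined.  Unsigned
char is used because the result of converting an out-of-range value to
a signed char is itself implementation-defined.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical value for a multibyte character constant;
   chosen only for illustration. */
#define SAMPLE_VALUE 0x3042

/* Nonzero if value can be stored in an unsigned char unchanged. */
int fits_in_uchar(int value)
{
    return value >= 0 && value <= UCHAR_MAX;
}

/* Conversion to unsigned char reduces the value modulo UCHAR_MAX + 1. */
unsigned char narrow(int value)
{
    return (unsigned char)value;
}

void demo_narrowing(void)
{
    assert(!fits_in_uchar(SAMPLE_VALUE));          /* assumes 8-bit chars */
    assert(narrow(SAMPLE_VALUE) != SAMPLE_VALUE);  /* high-order bits lost */
}
```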

Situations like this are best dealt with by explicit use of the wchar_t
type, which is required to be able to represent every member of the
largest extended character set among the supported locales.
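
For instance, a multibyte string can be converted to wide characters
with the standard mbstowcs function.  This is only a sketch: the helper
name and the buffer size are arbitrary choices for the example, and it
runs in whatever locale is current (the "C" locale unless setlocale has
been called).

```c
#include <assert.h>
#include <stdlib.h>

#define MAX_WIDE 64   /* arbitrary buffer size for this sketch */

/* Number of characters (not bytes) in a multibyte string, or
   (size_t)-1 if it contains an invalid sequence.  Strings with
   MAX_WIDE or more characters are truncated to MAX_WIDE. */
size_t wide_length(const char *mb)
{
    wchar_t wide[MAX_WIDE];
    return mbstowcs(wide, mb, MAX_WIDE);
}

void demo_wide(void)
{
    /* Basic characters occupy one byte each, so the counts agree. */
    assert(wide_length("abc") == 3);
}
```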


