Multibyte characters
Doug Gwyn
gwyn at smoke.BRL.MIL
Thu Jul 5 13:33:17 AEST 1990
In article <1467 at inset.UUCP> mikeb at inset.co.uk (Mike Banahan) writes:
>Presumably the integral constant '@' is a three-byte constant, no matter
>what it may look like?
No, the value of such a multibyte character constant is implementation-
defined. The type of the constant is int.
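To illustrate the point about the type, here is a minimal sketch in C. It uses an ordinary single-byte constant, since the value of a multibyte one is implementation-defined and cannot be shown portably. The function names are hypothetical, chosen only for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Size of a character constant. In C (unlike C++) the constant has
   type int, so this yields sizeof(int), not sizeof(char). */
size_t char_constant_size(void)
{
    return sizeof 'a';
}

/* Hypothetical self-check: aborts if the compiler disagrees. */
void check_char_constant_is_int(void)
{
    assert(char_constant_size() == sizeof(int));
}
```

Calling check_char_constant_is_int() should return silently on a conforming C compiler. The same source compiled as C++ would fail the check, because there a character constant has type char.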
>An alternative interpretation is that it violates the constraint in
>2.2.1.2 `a .. character constant .. shall begin and end in the initial
>shift state', but presumably I can expect my implementation to do the
>necessary good deeds and put a shift-out in there too.
No, you had better put the shift-out in there too or the final ' may not
be recognized by the compiler.
>Since it is a three-byte constant (assuming I'm right), then can I be
>sure that I do not get overflow when I assign it to a char variable?
It is most unlikely that the implementation definition will assign a
value less than 256 to '@'. Therefore (assuming that chars are
represented in 8 bits, as is usually the case these days), information
will be lost if you assign that character constant to a char variable.
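The loss of information can be sketched as follows. This example assumes unsigned char rather than plain char, because converting an out-of-range value to a signed char type gives an implementation-defined result, whereas conversion to an unsigned type is well defined. The helper names are hypothetical.

```c
#include <assert.h>
#include <limits.h>

/* Convert an int to unsigned char. Conversion to an unsigned type is
   well defined: the value is reduced modulo UCHAR_MAX + 1. */
unsigned char narrow_to_byte(int value)
{
    return (unsigned char)value;
}

/* Returns 1 if a non-negative value survives the narrowing unchanged,
   0 if information was lost. */
int survives_narrowing(int value)
{
    assert(value >= 0);
    return narrow_to_byte(value) == value;
}
```

With 8-bit chars, survives_narrowing(65) returns 1, while any value of 256 or more returns 0, which is the situation described above for a multibyte character constant.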
Situations like this are best dealt with by explicit use of the wchar_t
type, which is required to be able to represent every member of the
largest extended character set among the supported locales.
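As a rough sketch of the wchar_t approach, the standard library function mbtowc can decode a multibyte character into a wchar_t at run time. The wrapper name decode_first is hypothetical, and the result depends on the current locale; the program starts in the "C" locale unless setlocale is called.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Decode the first multibyte character of s into *out.
   Returns the number of bytes consumed, 0 if s is an empty string,
   or -1 if the bytes do not form a valid multibyte character. */
int decode_first(const char *s, wchar_t *out)
{
    assert(s != NULL && out != NULL);

    /* A null string argument resets mbtowc to the initial shift state. */
    (void)mbtowc(NULL, NULL, 0);

    /* Examine at most MB_CUR_MAX bytes, the longest multibyte
       character in the current locale. */
    return mbtowc(out, s, MB_CUR_MAX);
}
```

In the "C" locale, decode_first("A", &wc) consumes one byte and stores L'A' in wc. In a locale with a shift-state encoding, the same call would consume the shift sequence along with the character bytes.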