How to use toupper()

Karl Heuer karl at haddock.ima.isc.com
Thu Jan 12 07:32:35 AEST 1989


This has mostly reduced to an ANSI-C-specific issue, so I'm redirecting
followups to comp.std.c.

In article <1989Jan6.231955.7445 at sq.uucp> msb at sq.com (Mark Brader) writes:
>So for now, the best compromise seems to be:
>#ifdef __STDC__	/* [corrected --kwzh] */
>	if (*p >= 0) *p = toupper(*p);			/* Version 2 */
>#else
>	if (isascii(*p) && islower(*p)) *p = toupper(*p);  /* Version 5 */
>#endif

As Mark already pointed out, version 2 can break in an international
environment.  My recommendation (in a parallel article) was
	*p = toupper((unsigned char)*p);		/* Version 6 */
which has the subtle flaw that, if plain chars are signed and the result of
toupper() doesn't fit in a plain char, ANSI C does not guarantee the integrity
of the value (the result of the conversion is implementation-defined).
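
For concreteness, here is a minimal sketch of Version 6 applied to a whole
string.  The function name upcase_v6 is made up for illustration and appears
nowhere in this thread; the comments mark where the flaw sits.

```c
#include <ctype.h>

/* Hypothetical helper: upper-case a NUL-terminated string in place,
 * using Version 6 on each character. */
void upcase_v6(char *p)
{
    for (; *p != '\0'; p++) {
        /* The cast keeps the argument in the range 0..UCHAR_MAX,
         * which is what toupper() requires (besides EOF). */
        /* The assignment converts the int result back to plain char.
         * If char is signed and the result exceeds CHAR_MAX, the
         * stored value is implementation-defined. */
        *p = toupper((unsigned char)*p);
    }
}
```

Characters from the minimal execution character set are unaffected by the
flaw, so something like "ab1Z" should come out as "AB1Z" on any conforming
implementation.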

Mark further points out in e-mail:
>The trouble is that while Version 2 can break for some characters in the
>international environment, Version 6 can break for ALL characters in a
>vanilla environment ("C" locale)!

Well, not *all* characters; just those that appear negative (and hence don't
fit when converted back from unsigned char).  And this set is guaranteed to
exclude the minimal execution character set.  But the code as written could
still produce surprises on a sufficiently weird implementation that stays
within the letter of the Standard.
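
To make "doesn't fit" a bit more tangible, here is a rough sketch of a check
for whether a toupper() result survives being stored in a plain char.  Both
function names are hypothetical, and the string of sample characters is only
a selection from the minimal execution character set, not an exhaustive list.

```c
#include <limits.h>
#include <stddef.h>

/* Hypothetical check: nonzero if the int value v can be stored in a
 * plain char without an implementation-defined conversion. */
int fits_in_plain_char(int v)
{
    return v >= CHAR_MIN && v <= CHAR_MAX;
}

/* Hypothetical check: returns 1 if every sampled member of the minimal
 * execution character set is non-negative when held in a plain char. */
int basic_chars_nonnegative(void)
{
    static const char basic[] =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
        "abcdefghijklmnopqrstuvwxyz"
        "0123456789"
        "!\"#%&'()*+,-./:;<=>?[\\]^_{|}~ \t\n";
    size_t i;

    for (i = 0; i < sizeof basic - 1; i++)
        if (basic[i] < 0)
            return 0;
    return 1;
}
```

Where plain char is signed, fits_in_plain_char(UCHAR_MAX) is zero, and that
is exactly the kind of value Version 6 cannot store back portably.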

>The best you can do is to avoid "char" altogether and use "unsigned char".
>You probably have to do it throughout the program, in fact.

If the program has to be strictly conforming, you may be right.  (But then
string literals, and functions that expect `char *' arguments, may screw
things up; casting the pointers ought to be safe, though.)
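
One way the pointer cast might look, as a sketch only (the name
upcase_via_uchar is hypothetical): the caller keeps its `char *' interface,
and the characters are read and written through an `unsigned char *', so no
int-to-signed-char conversion ever takes place.

```c
#include <ctype.h>

/* Hypothetical helper: upper-case a NUL-terminated string in place,
 * accessing the bytes as unsigned char throughout. */
void upcase_via_uchar(char *s)
{
    /* Viewing char objects through an unsigned char pointer is allowed. */
    unsigned char *q = (unsigned char *)s;

    for (; *q != '\0'; q++) {
        /* For an argument in 0..UCHAR_MAX, toupper() returns a value in
         * the same range, so this conversion is well-defined. */
        *q = (unsigned char)toupper(*q);
    }
}
```

This sidesteps the problem in Version 6, at the cost of assuming the caller
passes a modifiable array rather than a string literal.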

Karl W. Z. Heuer (ima!haddock!karl or karl at haddock.isc.com), The Walking Lint


