Character Sets

Doug Gwyn gwyn at smoke.BRL.MIL
Thu May 18 08:28:32 AEST 1989


In article <442 at cybaswan.UUCP> iiit-sh at cybaswan.UUCP (Steve Hosgood) writes:
>surely the C language standardization committee has confused its brief with
>that of the character set standardization committee?

Not with respect to trigraphs; far from requiring sufficient character
set support, X3J11 bent over backwards to require only the minimum
practical character set according to already extant international
standards.

However, X3J11 did mandate that the character values for '0'..'9' have
adjacent values in ascending numerical order.  That is clearly a code
set requirement, which I argued against.  The need for some way to
map digit characters to numbers and vice versa does exist, but other
means to meet this need could have been specified.  For example, my
standard application system-tailored configuration header contains
the following, which ANSI C conforming implementations must support:

/* integer (or character) arguments and value: */
#define tonumber( c )	((c) - '0')	/* convt digit char to number */
#define todigit( n )	((n) + '0')	/* convt digit number to char */

Of course this is edited as required to match the actual implementation.
For years, I had been using the portable definition

#define todigit( n )	"0123456789"[n]	/* convt digit number to char */

but I never figured out a really good portable definition for tonumber().
I would much rather X3J11 had standardized macros like these than
imposed requirements on the code set.  The X3J11 requirement can "work"
only because all known implementations happen to already meet the
requirement.  If they didn't, it would be impractical to fix them!
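
For illustration only, here is a minimal sketch of how both conversions
could be written without assuming anything about the code set.  It is not
the header described above: the names digit_chars, todigit_tbl, tonumber_tbl
and check_digit_macros are hypothetical, and the lookup uses the standard
strchr() from <string.h>.

```c
#include <assert.h>
#include <string.h>

/* Definitions that rely on '0'..'9' being contiguous and ascending. */
#define tonumber( c )	((c) - '0')	/* digit char to number */
#define todigit( n )	((n) + '0')	/* digit number to char */

/* Hypothetical code-set-independent variants: table lookups only. */
static const char digit_chars[] = "0123456789";

#define todigit_tbl( n )	(digit_chars[n])	/* n must be 0..9 */

/* Returns 0..9 for a digit character, -1 for anything else. */
static int tonumber_tbl(int c)
{
    const char *p;

    if (c == '\0')		/* strchr() would match the terminator */
        return -1;
    p = strchr(digit_chars, c);
    return p != NULL ? (int)(p - digit_chars) : -1;
}

/* Self-check: both pairs of definitions round-trip every digit. */
static void check_digit_macros(void)
{
    int n;

    for (n = 0; n < 10; n++) {
        assert(tonumber(todigit(n)) == n);
        assert(tonumber_tbl(todigit_tbl(n)) == n);
    }
    assert(tonumber_tbl('x') == -1);
}
```

The lookup version is a function rather than a macro, so it evaluates its
argument exactly once, but it costs a search through the table where the
subtraction costs one instruction.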

>The 'UCASE' hack to allow UN*X to work on silly old terminals was put
>into the TTY handler. So I believe should this trigraph thingy.

Not every system has such facilities, but I agree with your general
sentiment.  In fact I expect that some of the more enlightened
implementors will take exactly this tack to deal with practical use
of so-called "European character sets".  The new ISO code set standards
should also help.  C trigraphs should remain essentially an inter-site
code transporting aid.


