Internationalisation (was: NULL as a string terminator)

Richard A. O'Keefe ok at goanna.cs.rmit.oz.au
Wed Aug 22 18:02:12 AEST 1990


In article <1881 at jura.tcom.stc.co.uk>, rmj at tcom.stc.co.uk (Rhodri James) writes:
> In article <3585 at goanna.cs.rmit.oz.au> ok at goanna.cs.rmit.oz.au (Richard A. O'Keefe) writes:
> }For why?  Internationalisation, _that's_ for why.

> I cringe when I see this (unwords like "internationalisation", I mean).

One uses language for the purpose of communication.
In order to effect that purpose, one uses words that other people know
and use, not the words one happens to like.  Like it or not,
"internationalise" and its derivatives are *words* in 1990s computing
jargon.  Perhaps Rhodri James may have a better term that is a miracle
of euphony and clarity; well for heaven's sake tell us what it is *now*
and let's get pushing it, for "internationalisation" bolted from its
stable long ago.  (By the way, there is no such word as "unword".  If
there were such a term, it would be "nonword".  "dictcheck -pedantic")

> Also I fail to see your point. Surely such #ifdef switching
> as above is more efficient, simpler to maintain and more legible than
> the scrabbling about with resource files you prefer?

So now Cn James reads minds and knows what I prefer.  Wonderful just.
No, it is *not* simpler to maintain.  The point of the resource file
approach (not my invention by any means; no-hopers like IBM, DEC, HP,
X/Open, AT&T, Apple, ... have been using it for a while and I just
copied the idea and simplified it a bit for this newsgroup) is that
you have all the text in one place; you don't have to go "scrabbling
about" in the source files to find all the strings.  You can give the
resource file to a human translator who knows nothing about the
programming language you are using.  A minor addition to such a tool
(have it generate
	INTEGER MSGNO
	PARAMETER (MSGNO=......
instead of #defines) will let you use the *same* message file with a
Fortran program.  Speaking as a no-hoper, I must admit that using a
technique that adapts to *all* the programming languages I use, not
just C, sounds like a saving.  But what do I know?
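
A rough sketch of that "minor addition", for the curious.  It is not the
tool from the earlier article: the names emit_constant, TARGET_C and
TARGET_FORTRAN are invented here, and the Fortran branch assumes
fixed-form source (statements starting in column 7).

```c
#include <stdio.h>

/* Target language for the generated constants (hypothetical). */
enum target { TARGET_C, TARGET_FORTRAN };

/*
 * Format one message-number definition into buf.
 * name and number would come from the message file; here they are
 * whatever the caller passes in.  Returns what snprintf returns:
 * the number of characters the full definition needs.
 */
int emit_constant(char *buf, size_t size, enum target t,
                  const char *name, long number)
{
    if (t == TARGET_C)
        return snprintf(buf, size, "#define %s %ld\n", name, number);

    /* Assumption: fixed-form Fortran, so indent six columns. */
    return snprintf(buf, size,
                    "      INTEGER %s\n      PARAMETER (%s=%ld)\n",
                    name, name, number);
}
```

The only thing that changes between languages is the output format; the
message file itself, and the translator's view of it, stay the same.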

As for efficiency, the point is that we are talking about a scheme for
generating messages for display to humans.  The cost of fishing the text
out of a file is (or was every time I measured it) considerably less than
the cost of displaying it on the terminal.

The real schemes (such as the X/Open one) identify messages by numbers,
not by address in the text file.  That has the disadvantage that finding
the right text is a wee bit more complex (but not very; one need merely
attach a directory at the end of the file), but it has the great
advantage that the program does not need to be recompiled.  This means
that one customer can be running the program with messages coming from
the "English-speaking idiot" message file and another with messages
coming from the "Spanish-speaking wizard" message file, and both can be
sharing the same copy of the program without any recompilation at all.
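
To make "a directory at the end of the file" concrete, here is a minimal
sketch under an invented layout: the message texts, then one
(offset, length) pair per message, then the message count.  This is not
the X/Open file format; struct msg_entry, write_catalog and find_message
are hypothetical names, and the binary layout is not portable between
machines.

```c
#include <stdio.h>
#include <string.h>

/* One directory entry: where a message starts and how long it is. */
struct msg_entry {
    long offset;
    long length;
};

/*
 * Write texts, then the directory, then the count.
 * Assumes fp is positioned at the start of an empty file.
 * Returns 0 on success, -1 on a write error.
 */
int write_catalog(FILE *fp, const char *const *texts, long count)
{
    struct msg_entry entry;
    long i, offset = 0;

    for (i = 0; i < count; i++)
        if (fputs(texts[i], fp) == EOF)
            return -1;
    for (i = 0; i < count; i++) {
        entry.offset = offset;
        entry.length = (long)strlen(texts[i]);
        if (fwrite(&entry, sizeof entry, 1, fp) != 1)
            return -1;
        offset += entry.length;
    }
    return fwrite(&count, sizeof count, 1, fp) == 1 ? 0 : -1;
}

/*
 * Copy message `number` (counting from 1) into buf.
 * Returns buf, or NULL if the number is out of range, the buffer
 * is too small, or the file cannot be read.
 */
char *find_message(FILE *fp, long number, char *buf, size_t size)
{
    struct msg_entry entry;
    long count, back;

    /* The count is the last thing in the file. */
    if (fseek(fp, -(long)sizeof count, SEEK_END) != 0 ||
        fread(&count, sizeof count, 1, fp) != 1)
        return NULL;
    if (number < 1 || number > count)
        return NULL;

    /* The directory sits just before the count. */
    back = (long)sizeof count + (count - number + 1) * (long)sizeof entry;
    if (fseek(fp, -back, SEEK_END) != 0 ||
        fread(&entry, sizeof entry, 1, fp) != 1)
        return NULL;
    if (entry.length < 0 || (size_t)entry.length >= size)
        return NULL;

    if (fseek(fp, entry.offset, SEEK_SET) != 0 ||
        fread(buf, 1, (size_t)entry.length, fp) != (size_t)entry.length)
        return NULL;
    buf[entry.length] = '\0';
    return buf;
}
```

The program only ever mentions message numbers, so swapping in a
different file with the same numbering changes the text without touching
the executable.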

That's the way it *is* in UNIX System V Release 4.  We might as well get
used to thinking about messages in that way now.
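
For reference, the X/Open interface is catopen(), catgets() and
catclose() from <nl_types.h>.  A hedged sketch of how a program might
use it follows; the catalogue name, the set and message numbers, and the
helper names message and print_message are all made up for illustration.

```c
#include <nl_types.h>
#include <stdio.h>

/* Hypothetical numbers; real ones come from the catalogue source. */
#define SET_ERRORS 1
#define MSG_NOFILE 1

/*
 * Return the text for (set, msg), or the built-in default when
 * no catalogue could be opened or the message is missing.
 */
const char *message(nl_catd cat, int set, int msg, const char *fallback)
{
    if (cat == (nl_catd)-1)
        return fallback;
    return catgets(cat, set, msg, fallback);
}

/*
 * Open the named catalogue for the current locale, print one
 * message, and close the catalogue again.
 */
void print_message(const char *catalogue, int set, int msg,
                   const char *fallback)
{
    nl_catd cat = catopen(catalogue, NL_CAT_LOCALE);

    puts(message(cat, set, msg, fallback));
    if (cat != (nl_catd)-1)
        catclose(cat);
}
```

A call such as print_message("myprog", SET_ERRORS, MSG_NOFILE,
"cannot open file") prints the translated text if a catalogue for the
user's locale is installed, and the English default otherwise.  Which
catalogue is found depends on the environment at run time, not on how
the program was compiled.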

> Demonstrate to me a negative impact on internationalisation (ugh) and I
> might believe you.  Any negative impact will do, I'm not too choosy.

The schemes actually used by IBM (MVS, CMS, AIX), HP (HP-UX), DEC (VMS,
Ultrix), AT&T (SVR4) and others essentially add another couple of layers
of indirection above what I presented.  Those systems all allow you to
switch languages at run time, without any recompilation.  Those systems
all allow you to translate message files without having any other access
to the sources.  They all allow many programs, and many programming
languages, to share the same message files.  They all allow a customer
to substitute his own translation of a message file (perhaps amplifying
some messages, or getting the grammar right, or ...) without access to
the sources.

There are four negative impacts of the #ifdef approach, just for starters.
-- 
The taxonomy of Pleistocene equids is in a state of confusion.


