Reserving identifiers for future use.

Kevin Martin kpmartin at watmath.UUCP
Thu Jul 19 23:07:34 AEST 1984


>From phipps at fortune.UUCP (Clay Phipps):
>Why do I continually run across the suggestion that prefixing names
>with underscores makes them unique, as if no one has used underscore
>characters for some special purpose before ?  
It doesn't make them unique. However, it does allow the documentation
to state that if the user supplies his own identifiers which begin
with underscore, he will (eventually) get burned. This effectively
reserves 1/27th of the possible identifiers, counting by first character
(26 letters plus the underscore), or 1/53rd if case is distinguished.
No matter what naming convention is used for "reserved names", there will
exist some programs which already use such names.
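
As a rough check of those fractions, here is a small sketch in C that counts
the characters allowed to start an identifier. The function name
leading_chars and its fold_case flag are made up for illustration, and the
code assumes an ASCII character set in the default "C" locale.

```c
#include <ctype.h>

/* Count the characters that may begin a C identifier: letters and '_'.
   If fold_case is nonzero, upper and lower case count as one letter.
   Assumes ASCII and the default "C" locale (an assumption of this sketch). */
int leading_chars(int fold_case)
{
    int c, count = 0;

    for (c = 0; c < 128; c++) {
        if (c == '_')
            count++;                       /* the one reserved lead character */
        else if (isalpha(c) && !(fold_case && isupper(c)))
            count++;                       /* a letter, counted once if folded */
    }
    return count;
}
```

With case folded this gives 27, and with case distinguished 53, so names
that begin with an underscore take one slot out of that many.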

>Sorry, the underscore has already been spoken for.
>The VAX C compiler uses that convention for external names, for example.
>Conventional VAX UN*X subroutines names, therefore, all begin with "_".
Only if you look at the 'as' input or beyond. Any C identifiers which
already start with an underscore end up starting with two.

>Making names all upper case isn't adequate, either.
I agree. Many people make all of their #define'd symbols uppercase.

>What is really needed is a name qualification or prefixing convention
>that can be applied across all of UN*X, for example,
>
>    <prefix> "_" <mnemonic name>
>
>The prefix would be the name of the program or routine package;
>for example, "cpp" for the C Preprocessor, "lp" for the Pascal Library, &c.
>Thus, "unix" would become, for example, "cpp_unix", 
>"waterloo" would be "cpp_waterloo", and Pascal Library "IN" could be "lp_in".
The problem with this is that it effectively reserves *every* name containing
an underscore (since the user has no clue as to what might become a
'prefix' in the future). Besides, Joe User shouldn't have to know that the
pass that happens to process an identifier was called 'cpp' in the twilight
ages of computing. With all this talk of #if sizeof(...), CPP might well have
to disappear as a separate entity.
(Also, one would hope that the naming convention would be useful and used
on non-unix systems too.)

There actually seem to be two problems here. Both involve conflicts between
users' identifiers and internally generated identifiers. I suspect that the
distinction between the problems stems from whether the identifier occurs
in the original source code or not.

For example, the problem with C or Pascal operators implemented as function
calls is easily solved: users' external symbols all have an underscore
prepended, and the built-in operators don't. Thus a user function can never
conflict with built-in functions (like 'in', which becomes 'lp_in', as
suggested above).
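
To make the mechanism concrete, here is a minimal sketch of that scheme. It
assumes the compiler forms the link-level name of each user external by
prepending one underscore, as described earlier for VAX C. The function names
user_link_name and is_builtin are hypothetical, and so are the table entries
"udiv" and "urem"; only 'in' comes from this discussion.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical run-time support routines. At link level they carry no
   leading underscore. Only "in" is taken from the discussion. */
static const char *const builtin_names[] = { "in", "udiv", "urem" };

/* Form the link-level name of a user's external symbol by prepending
   one underscore. Returns 0 on success, -1 if out is too small. */
int user_link_name(const char *c_name, char *out, size_t out_size)
{
    size_t len;

    assert(c_name != NULL && out != NULL);
    len = strlen(c_name);
    if (len + 2 > out_size)            /* '_' + name + terminating NUL */
        return -1;
    out[0] = '_';
    memcpy(out + 1, c_name, len + 1);  /* copies the NUL as well */
    return 0;
}

/* Nonzero if link_name is one of the built-in support routines. */
int is_builtin(const char *link_name)
{
    size_t i;

    assert(link_name != NULL);
    for (i = 0; i < sizeof builtin_names / sizeof builtin_names[0]; i++)
        if (strcmp(link_name, builtin_names[i]) == 0)
            return 1;
    return 0;
}
```

A user function called 'in' comes out as '_in', which is_builtin() rejects,
so it cannot collide with the support routine 'in'. Likewise a C identifier
'_flsbuf' comes out as '__flsbuf', starting with two underscores.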

On the other hand, symbols like 'unix' or '_flsbuf', which (eventually) appear
in the source code, cannot benefit from this solution. They require a naming
convention. The simplest convention is "Any identifier containing an
underscore may become reserved in the future". This is unacceptable, since
it leaves no 'break character' for users' identifiers. Another simple rule
is "Any identifier ending with an underscore ...", but this gets clipped by
the loss of trailing characters in various compilers and linkers.
"Any identifier beginning with an underscore ..." uses only a small fraction
of the available identifiers and is easy to describe.
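
The three candidate rules can be compared with a short sketch in C. All of
the function names here are invented for illustration, and the limit of 8
significant characters used in the note after the code is only an assumed
example of the truncation some compilers and linkers perform.

```c
#include <assert.h>
#include <string.h>

/* Rule 1: any identifier containing an underscore is reserved. */
int reserved_if_contains(const char *id)
{
    assert(id != NULL);
    return strchr(id, '_') != NULL;
}

/* Rule 2: any identifier ending with an underscore is reserved. */
int reserved_if_ends(const char *id)
{
    size_t n;

    assert(id != NULL);
    n = strlen(id);
    return n > 0 && id[n - 1] == '_';
}

/* Rule 3: any identifier beginning with an underscore is reserved. */
int reserved_if_begins(const char *id)
{
    assert(id != NULL);
    return id[0] == '_';
}

/* Keep at most sig leading characters of id, the way a truncating
   compiler or linker would. The result is always NUL-terminated. */
void truncate_name(const char *id, size_t sig, char *out, size_t out_size)
{
    size_t n;

    assert(id != NULL && out != NULL && out_size > 0);
    n = strlen(id);
    if (n > sig)
        n = sig;
    if (n > out_size - 1)
        n = out_size - 1;
    memcpy(out, id, n);
    out[n] = '\0';
}
```

Rule 1 flags an ordinary user name such as 'line_count', which is the
objection about the break character. Under rule 2, 'buffer_flush_' cut to 8
characters becomes 'buffer_f' and no longer looks reserved. Under rule 3,
'_flsbuf' keeps its leading underscore after the same truncation.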

                           Kevin Martin, U. of Waterloo


