length of external names

Henry Spencer henry at utzoo.UUCP
Tue Jan 8 08:51:29 AEST 1985


> Henry Spencer, who seems to be one of the chief exponents of short
> external names, just posted a convincing explanation of the need to not
> break existing linkers. ...

To rebut a misconception:  I don't like short external names.  I merely
think that (a) some provision for them in the standard is inevitable,
and (b) annoying though this is, we can live with it, which is a passing
grade for a standard that has to apply to everyone.

> [A] solution that was used, and worked, was to have the COMPILER use the
> external "name" to store a hashed value.  During the recent net
> discussion I posted a description of this technique and some analysis of
> the chance and cost of collisions.

I don't recall seeing the previous posting about this, but the problem of
collisions is definitely a nasty one.  Bearing in mind that separately-
compiled modules must agree on the object-file (i.e. short) name under
which an identifier is known, the possibility of collisions is a major
flaw in a hashing scheme.  I've worked with compilers that did similar
things (first 4 and last 3 chars of the identifier, as I recall) and one
had to be careful about collisions; it really wasn't much better than
short identifiers.  If the algorithm used is really a hashing function
rather than a systematic "cut and paste" rearrangement of the original
identifier, collisions become (a) less likely, and (b) harder to spot and
deal with.
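
To make the collision problem concrete, here is a minimal sketch of a
"first 4 and last 3" squeeze.  This is only a guess at the flavour of such a
scheme, not the algorithm of any particular compiler; the names squeeze_name,
names_collide and SHORT_LEN are invented for the illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SHORT_LEN 7  /* 4 leading + 3 trailing characters (assumed limit) */

/* Hypothetical helper: squeeze `name` into at most SHORT_LEN characters by
 * keeping its first 4 and last 3 characters.  `out` must have room for
 * SHORT_LEN + 1 bytes.  Names that already fit are copied unchanged. */
void squeeze_name(const char *name, char *out)
{
    size_t n;

    assert(name != NULL && out != NULL);
    n = strlen(name);
    if (n <= SHORT_LEN) {
        strcpy(out, name);
        return;
    }
    memcpy(out, name, 4);             /* first 4 characters */
    memcpy(out + 4, name + n - 3, 3); /* last 3 characters */
    out[SHORT_LEN] = '\0';
}

/* Nonzero if two distinct long names squeeze to the same short name. */
int names_collide(const char *a, const char *b)
{
    char sa[SHORT_LEN + 1], sb[SHORT_LEN + 1];

    squeeze_name(a, sa);
    squeeze_name(b, sb);
    return strcmp(a, b) != 0 && strcmp(sa, sb) == 0;
}
```

Under this sketch, read_input_buffer and read_output_buffer both become
"readfer", while get_count and set_count stay distinct.  The collisions are
at least predictable by eye, which is the one advantage over a true hash.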

Note that hashing *demands* a way to force an internal-to-external
correspondence, like the proposed "entry" clause, for linking to system
services and other languages.
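
A rough sketch of why the override is needed, under stated assumptions: the
6-character limit, the hash function, and the table of forced names below are
all invented for illustration, and the table merely stands in for whatever
the "entry" clause would provide.  Without the table, a hashed name could
never match a fixed name already known to the linker.

```c
#include <stddef.h>
#include <string.h>

#define EXT_LEN 6  /* assumed linker limit on external names */

/* Hypothetical stand-in for "entry"-style declarations: internal names whose
 * external spelling is dictated from outside (system services, other
 * languages).  Forced externals must be at most EXT_LEN characters. */
struct forced_name {
    const char *internal;
    const char *external;
};

static const struct forced_name forced[] = {
    { "write_to_terminal", "write" }
};

/* Produce the external name for `name` in `out` (EXT_LEN + 1 bytes).
 * Forced names are used verbatim; everything else is hashed to EXT_LEN
 * lowercase letters. */
void external_name(const char *name, char *out)
{
    const unsigned char *p;
    unsigned long h = 5381;
    size_t i;

    for (i = 0; i < sizeof forced / sizeof forced[0]; i++) {
        if (strcmp(name, forced[i].internal) == 0) {
            strcpy(out, forced[i].external);
            return;
        }
    }
    /* Simple multiplicative string hash, kept to 32 bits for portability. */
    for (p = (const unsigned char *)name; *p != '\0'; p++)
        h = (h * 33 + *p) & 0xFFFFFFFFUL;
    /* Spell the hash out in base 26. */
    for (i = 0; i < EXT_LEN; i++) {
        out[i] = (char)('a' + h % 26);
        h /= 26;
    }
    out[EXT_LEN] = '\0';
}
```

Every separately-compiled module must run the identical function to arrive
at the same external name, and nothing in the output hints at which source
identifier it came from.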

I like the idea of using an "entry" clause to manage correspondences
between internal/long and external/short names, although if you ignore
the issue of identifiers containing funny characters, you can do exactly
the same thing with #define.  (Note that preprocessor identifiers are
internal, hence must be long.)
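
A minimal sketch of the #define approach; the short names rdibuf and rdobuf
and the function bodies are made up for the example.  The preprocessor
rewrites every use of the long name, so the linker only ever sees the short
one.

```c
/* Long internal names mapped onto short external ones (hypothetical names).
 * Typically these would sit in a header shared by all modules. */
#define read_input_buffer   rdibuf
#define read_output_buffer  rdobuf

/* Written with the long names; compiled and linked as rdibuf and rdobuf. */
int read_input_buffer(void)
{
    return 1;
}

int read_output_buffer(void)
{
    return 2;
}
```

Since both long names would otherwise agree in their leading characters, the
mapping also removes the collision, but it is up to the programmer to pick
distinct short names and keep the header consistent everywhere.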

I am not a member of the committee, but will comment on some of the
suggestions addressed to them...

> 1.  Suppose that the standard required longer names and suggested the
>     hashing technique as an implementation technique, you would force
>     manufacturers to update either linker or compiler to meet the
>     standard.  Is this politically possible?

I don't know.  If the problem of collisions can be shown to be a non-issue,
and the "entry" clause or something like it can be introduced, it might be
viable.  It depends on how manufacturers feel about hashing.

> 2.  In some other areas, I am told, the standard described a relatively
>     high level language, rather than the minimum of implementations.
>     This will prevent some present compilers from meeting the standard.
>     Why should it pick the minimum here?

Because the problems go much farther than the compiler.  Object-module
formats are visible system-wide, making changes much harder.

> 3.  How can I get a copy of the draft standard?

I believe the draft has gone to ANSI for publication for formal public
comment; it should be available from CBEMA (don't have the address handy)
shortly.  The price will be unpleasant, though, knowing CBEMA.  I don't
know whether the older informal channels are still open.

> 4.  Is this an adequate method of getting comments and questions to the
>     committee? If not, what is a useful channel?

Some of the committee folks definitely do read this newsgroup.  If you
want to be forceful about something, though, the recommended course is
to write (on a piece of paper) to them.  The transition to ANSI formal-
public-comment phase may have altered this, though.
-- 
				Henry Spencer @ U of Toronto Zoology
				{allegra,ihnp4,linus,decvax}!utzoo!henry


