C not LALR(1) & compiler bugs

rex ballard rb at ccivax.UUCP
Fri Feb 7 08:30:26 AEST 1986


In article <10200037 at ada-uts.UUCP> richw at ada-uts.UUCP writes:
>
>Well, I've calmed down a bit.  That was quite a flame...
>
>Whether you consider these ambiguities "all that bad" seems to
>be a matter of personal taste.
>
>In any case, the 8 significant character stuff...  (yes, more flames)
>
>If anybody can think of ANY reason for limiting the number of significant
>characters of non-external identifiers to 8, I'd be honestly interested
>in hearing it.

As has been mentioned before, the big problem with more than 6 or 8
significant characters lies with the assemblers.  The old RT-11 assemblers
only provided 6 characters.  Some of the Whitesmiths flavors are the same,
along with most old micro assemblers, such as those for the 8080, where
the memory available for the symbol table was less than 64K.
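
To make the effect concrete, here is a rough sketch of what a symbol
table with fixed-width entries does to a long name.  The function name
truncate_symbol and the sample identifiers are made up for illustration;
no particular assembler is being quoted.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy at most `significant` characters of `name`
   into `out`, the way a symbol table with fixed-width entries would
   keep them.  `out` is always NUL-terminated. */
void truncate_symbol(const char *name, size_t significant,
                     char *out, size_t out_size)
{
    size_t len;

    assert(name != NULL && out != NULL && out_size > 0);

    len = strlen(name);
    if (len > significant)
        len = significant;      /* drop the non-significant tail */
    if (len > out_size - 1)
        len = out_size - 1;     /* never overrun the caller's buffer */

    memcpy(out, name, len);
    out[len] = '\0';
}
```

With 6 significant characters, total_count_in and total_count_out both
come out as total_, so the assembler or linker sees a single symbol.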

The same problem exists with the linkers, librarians, and debuggers.
Debuggers such as DBX and SDB, which reference source line labels,
are free of this limitation, but source is not always available.

Static and automatic labels are often treated as local labels, named
L[1-256] or by a similar approach.  This makes debugging more difficult,
but eases the symbol table crunch.  These can be made variable length
so long as the programmer understands the difference.  Structure
member names are another candidate for variable length names.

One popular approach is to use #defines to define full names and use a
good preprocessor to resolve them into their cryptic names.  Doing
this automatically seems attractive until you get into the problem of
global resolution.  Perhaps incorporating the file name into the label
would help.
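
A minimal sketch of the #define approach, done by hand.  All four names
here are hypothetical, and the 6-character limit is just the RT-11 style
case assumed for the example.

```c
/* Hypothetical names, for illustration only.  The long spellings are what
   the programmer reads and writes; the 6-character spellings are what the
   assembler and linker see after preprocessing. */
#define initialize_symbol_table  symini
#define initialize_symbol_list   lstini

/* Without the #defines above, both names would begin "initia" (6 chars)
   or "initiali" (8 chars) and could be treated as one symbol. */
int initialize_symbol_table(void) { return 1; }
int initialize_symbol_list(void)  { return 2; }
```

Every file that uses the long names has to see the same #defines, which
is where the global resolution problem comes in.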

As long as there are assemblers that run in 'PDP-11 emulation mode'
and machines with memory restrictions, the restriction will hold for
'portable code'.

One alternative (though less practical) is to compile directly from 'C'
source to link module.  This bypasses the assembler but not the linker.
Another is to go from source to executable; this can lead to a very
large compiler, like SmallTalk, but would make debugging easier.

I have noticed that lint (4.2) will complain when a very long
name is declared and a truncated version is referenced (particularly
with #defines).  Ideally, lint should check for both 'Unique prefix'
and 'Unique full-name' on all labels, issuing '<label> not used',
'<label> undefined', and '<label> multiply defined' if there is a clash
either way.
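
The 'Unique prefix' half of that check could look something like the
sketch below.  This is not how lint does it; count_prefix_clashes and its
message format are invented here, and a real tool would read the names
from the symbol table rather than from an array.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical check: report every pair of names that differ in full but
   agree in their first `significant` characters.  Returns the number of
   such pairs. */
size_t count_prefix_clashes(const char *const names[], size_t count,
                            size_t significant)
{
    size_t i, j, clashes = 0;

    assert(names != NULL || count == 0);

    for (i = 0; i < count; i++) {
        for (j = i + 1; j < count; j++) {
            if (strcmp(names[i], names[j]) != 0 &&
                strncmp(names[i], names[j], significant) == 0) {
                printf("%s and %s clash in the first %lu characters\n",
                       names[i], names[j], (unsigned long)significant);
                clashes++;
            }
        }
    }
    return clashes;
}
```

Given buffer_length, buffer_limit, and index, this reports one clash at 8
significant characters and none at 31.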

Unfortunately, it seems that there is little incentive to rewrite
assemblers, linkers, librarians...  Fourth generation languages,
incremental compilers, and interactive development systems are
following the FORTH tradition of keeping symbol table information
directly even after the source is compiled.  Could something like this
be done for C?

I am glad there is interest in making these types of improvements in the
language.  Perhaps by investigating some of the good features of languages
like FORTH or SmallTalk, rather than trying to poke at their weaknesses
(they have many), a better, more powerful 'C' development system
will evolve.  I'd love to see a fully interactive environment that
allows unit testing and gradual integration of actual compiled code.
DBX comes close, but I'm almost positive it could be even better.


