C not LALR(1) & compiler bugs

Jack Jansen jack at boring.uucp
Sun Feb 2 05:01:30 AEST 1986


In article <7800010 at datacube.UUCP> stephen at datacube.UUCP writes:
>
>C is context sensitive in many ways, but so is just about every other
>programming language in existence. PASCAL or MODULA-II, for example,
>have an ambiguity for:
>
>Statement ::= assign_stmt | procedure_invocation | ...
>
>assign_stmt ::= IDENTIFIER ':=' expression
>procedure_invocation ::= IDENTIFIER ...
>
>This also must be resolved by feedback to the lexer.
This is not true. The lexical analyzer has nothing to do with it,
since both alternatives begin with an IDENTIFIER. The problem in
C is that type names behave like reserved words, so, in effect, a
'typedef' introduces a new reserved word. That forces you either
to tell lex about it (the quick-and-dirty approach, used by most
compilers I know of) or to mess up your yacc grammar in a truly
horrible way.
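
To make the "tell lex about it" approach concrete, here is a minimal
sketch in C. Everything in it is made up for illustration: the token
codes, the fixed-size table, and the names declare_typedef() and
classify() are assumptions, not taken from any particular compiler.
It also ignores scoping, which a real compiler would have to handle.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical token codes; a real yacc grammar would generate these. */
enum token { IDENTIFIER, TYPE_NAME };

/* Hypothetical fixed-size table of names introduced by 'typedef'. */
#define MAX_TYPEDEFS 64
static const char *typedef_names[MAX_TYPEDEFS];
static int typedef_count = 0;

/* The parser would call this when it finishes a typedef declaration. */
void declare_typedef(const char *name)
{
    assert(typedef_count < MAX_TYPEDEFS);
    typedef_names[typedef_count++] = name;
}

/* The lexer would call this for every identifier-shaped lexeme. */
enum token classify(const char *lexeme)
{
    int i;

    for (i = 0; i < typedef_count; i++)
        if (strcmp(typedef_names[i], lexeme) == 0)
            return TYPE_NAME;
    return IDENTIFIER;
}
```

The feedback is the call to declare_typedef(): the parser changes
what the lexer will return for the same spelling later on.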

Also, note that, at least in Pascal, where there are *no*
procedure-type variables, you only have to look ahead exactly
*one* token to find out whether it is an assignment or a call.
(If the next token is '(' or any statement terminator, it is a
call; otherwise it is an assignment.)
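
The one-token rule can be sketched in C roughly as follows. The token
codes and the function names are hypothetical, and the set of
statement terminators (';', 'end', 'else', 'until') is my assumption
about what a Pascal parser would check.

```c
#include <assert.h>

/* Hypothetical token codes for a Pascal-like lexer. */
enum token {
    TOK_IDENTIFIER, TOK_ASSIGN, TOK_LPAREN, TOK_LBRACKET,
    TOK_SEMICOLON, TOK_END, TOK_ELSE, TOK_UNTIL
};

enum stmt_kind { STMT_ASSIGNMENT, STMT_CALL };

/* Tokens assumed to be able to follow a complete statement. */
static int is_statement_terminator(enum token t)
{
    return t == TOK_SEMICOLON || t == TOK_END ||
           t == TOK_ELSE || t == TOK_UNTIL;
}

/* toks[0] is the leading IDENTIFIER; only toks[1] is inspected. */
enum stmt_kind classify_statement(const enum token *toks, int n)
{
    assert(n >= 2 && toks[0] == TOK_IDENTIFIER);
    if (toks[1] == TOK_LPAREN || is_statement_terminator(toks[1]))
        return STMT_CALL;
    return STMT_ASSIGNMENT;
}
```

Note that the decision never looks past toks[1], however long the
rest of the statement is.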

In C, you could conceivably have to look ahead an unlimited number
of tokens:

typedef int geheel_getal;

geheel_getal ((((((((((((((((((((a))))))))))))))))))));

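With the typedef in effect this declares a; without it, the same
tokens are a call to a function named geheel_getal, and the
parentheses can be nested arbitrarily deep. This makes the C
typedef problem much harder than the Pascal one.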
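
For anyone who wants to try it, here is a small C sketch of the two
readings of the same token shape. The function names
declared_via_typedef(), record() and called_as_function() are mine,
added only for illustration; the typedef is the one from the example.

```c
#include <assert.h>

typedef int geheel_getal;

/* geheel_getal is a type name here, so the line is a declaration. */
int declared_via_typedef(void)
{
    geheel_getal ((((a))));          /* declares a local int named a */

    assert(sizeof a == sizeof(int)); /* a really is an int object */
    a = 42;
    return a;
}

static int last_arg;

/* An ordinary function, used to show the other reading. */
static void record(int x)
{
    last_arg = x;
}

/* record is not a type name, so the same shape is a call. */
int called_as_function(void)
{
    int a = 7;

    record ((((a))));                /* calls record(a) */
    return last_arg;
}
```

The only difference between the two parenthesized lines is what the
first identifier was declared as earlier in the file.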
-- 
	Jack Jansen, jack at mcvax.UUCP
	The shell is my oyster.
