FYUO (For YACC Users Only), YACC "feature" (b-u-g) found

utzoo!decvax!harpo!eagle!mhuxt!mhuxj!mhuxi!mhuxv!burl!rcj
Fri Apr 8 10:19:14 AEST 1983


Don't use token numbers of 1000 or over!!!!  Explanation follows:

I have found a bug in YACC, or rather the lack of a caveat in the
YACC documentation.  YACC allows the user to either specify his
own token numbers following a %token directive, or let YACC default
them starting at 257.  This feature was quite useful in early versions
of YACC, like the one that I learned on (V6).  If you declared a token:

%token	CMOP	310

YACC's debug option would tell you that it had received token 310,
not CMOP like it does today.  If you let YACC default your token
numbers, you had to get a new printout every time you ran YACC with
a token change to find out what token number went with what token --
hence it was easier to define your own token numbers.

I kept doing this out of habit, and recently wrote an assembler
preprocessor using YACC, and started my token numbers at 1000 and
went up by increments of 10.  The program that it generated was
flagging a syntax error at strange places, and I took it into sdb
and found out that even though the y.output file said that there
were several legal inputs in a given state, there was a hardwired
test of the action table (yypact) that would take the default action
if the yypact entry for that state was <= -1000.  The parser was
taking the default action without even asking yylex() for the next
token!!  I found out after much sdb'ing of the UNCOMMENTED
(Grrrr!!!!) YACC source that YACC somehow complements token numbers
to make these action table entries.  Therefore, token numbers
of >= 1000 cause a number of <= -1000 to be entered into yypact, and
the parser assumes the default action for that state.
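
For anyone who wants to see the collision in miniature, here is a rough
sketch in C.  It is NOT the real YACC source: I am assuming the table
entry is simply the negated token number (the real encoding is more
involved), and the names DEFAULT_ONLY_FLAG, make_entry and
skips_yylex are made up for illustration.  Only the "<= -1000" test
comes from what I saw in the parser.

```c
/* Assumed sentinel: an entry at or below this value means
 * "this state has only a default action, don't bother reading a token". */
#define DEFAULT_ONLY_FLAG (-1000)

/* Simplified model (an assumption, not YACC's real encoding):
 * the table entry for a state is the negated token number. */
int make_entry(int token_number)
{
    return -token_number;
}

/* Returns 1 if a parser using the "<= -1000" test would take the
 * default action without calling yylex(), 0 otherwise. */
int skips_yylex(int entry)
{
    return entry <= DEFAULT_ONLY_FLAG;
}
```

In this model a token numbered 310 gives an entry of -310, which passes
the test and lets the parser read input as usual.  A token numbered
1000 gives -1000, which is indistinguishable from the "default only"
marker, so the parser never looks at the next token.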

The easiest cure, since YACC's debug is now smart enough to use a
symbol table to return the ascii name of your token to you on the
debug output rather than the token number, is to simply let YACC
default the token numbers for you.  The Unix Hotline has been informed
of the problem and will submit an MR to the documentation so that
it appears in the BUGS section.
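
A small sketch of what "let YACC default them" looks like from the C
side.  I am assuming the tokens were declared with a bare "%token CMOP
MOVOP" and that YACC was run with -d so it writes the #defines to
y.tab.h; the token MOVOP, the mnemonics, and the helper
classify_mnemonic are hypothetical, made up for this example.

```c
#include <string.h>

/* Assumed contents of y.tab.h when no explicit numbers are given:
 * YACC hands out numbers starting at 257. */
#define CMOP  257
#define MOVOP 258

/* Hypothetical helper that a yylex() might call.  It returns the
 * symbolic token name, never a hand-picked literal number. */
int classify_mnemonic(const char *word)
{
    if (strcmp(word, "cmp") == 0)
        return CMOP;
    if (strcmp(word, "mov") == 0)
        return MOVOP;
    return 0;   /* not a known mnemonic in this sketch */
}
```

Since the lexer only ever mentions CMOP and MOVOP by name, it does not
matter which numbers YACC picks, and they stay well away from 1000.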
-- 

The MAD Programmer -- 919-228-3814 (Cornet 291)
alias: Curtis Jackson	...!floyd!burl!rcj
			...!sb1!burl!rcj
			...!mhuxv!burl!rcj


