Using more than one "yacc" in a single program

Thu Feb 2 14:35:06 AEST 1984

From:            Rich Wales <v.wales at ucla-locus>

Some people out there may already be aware of the following trick, but
in case there's anyone who isn't . . .

Those of you who have used "yacc" extensively have probably noticed that
the global symbols in the output file ("y.tab.c") are the same in every
program (they all start with "yy").  This becomes a big pain if you want
to use "yacc" in two or more separate places in the same program, or in
a library routine which might be invoked by another "yacc"-based
program: the two parsers' symbols collide at link time.

I recently had just such a situation on my hands, and I got around it by
massaging the symbols in question so they would be unique -- as shown in
the following sample excerpt from a makefile:

blah.c blah.h: blah.y
	yacc -d blah.y
	mv y.tab.c blah.c
	mv y.tab.h blah.h

blah.o: blah.c blah.h
	cc -S blah.c
	/lib/c2 blah.s | sed 's/_yy/_blah_yy/g' | as -o blah.o
	rm -f blah.s

The first group of commands above shows how to get around the fact that
the output files produced by "yacc" always get the same names ("y.tab.c"
and "y.tab.h").

The second group of commands uses the "-S" option of "cc" to generate
the assembly-language version of the C program (in "blah.s").  "/lib/c2"
is the C optimizer (normally invoked via the "-O" option of "cc").  The
"sed" call changes all symbols starting with "_yy" so that they start
with "_blah_yy" instead.  (Note the initial underscore; all global
symbols have one prepended to them, at least on the PDP-11 and VAX
UNIXes I am familiar with.)  And "as" (the assembler) finishes the
compilation job.  The entry point to the resulting parser, as seen from
C, would be "blah_yyparse".
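
To see what the "sed" step does without running a whole compile, here is
a minimal sketch that feeds it a few made-up assembly lines.  The helper
name "rename_yy" and the sample lines are only for illustration; they
are not real compiler output.

```shell
# Hypothetical helper: rename yacc's globals in assembly text on stdin.
# $1 is the prefix chosen for this parser (e.g. "blah").
rename_yy() {
    sed "s/_yy/_${1}_yy/g"
}

# Made-up sample lines standing in for the contents of blah.s.
printf '%s\n' '.globl _yyparse' 'calls $0,_yylex' 'movl _yychar,r0' |
    rename_yy blah
# prints:
#   .globl _blah_yyparse
#   calls $0,_blah_yylex
#   movl _blah_yychar,r0
```

Note that the pattern matches "_yy" anywhere on a line, so a symbol of
your own that happens to contain "_yy" would be renamed as well.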

The point, of course, is that if you had more than one "yacc" in a
single project, you could use a different prefix string (instead of
"blah" as I used here) for each "yacc".
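
As a rough sketch of that idea, the following shell function prints the
same pair of makefile rules once for each grammar name it is given,
using the name as the prefix.  The function name "gen_rules" and the
grammar names "expr" and "cmd" are made up for the example; the rules
themselves are just the ones shown above with "blah" replaced.

```shell
# Hypothetical generator: print the makefile rules from above for each
# grammar name given as an argument, using that name as the prefix.
gen_rules() {
    for p in "$@"; do
        printf '%s.c %s.h: %s.y\n' "$p" "$p" "$p"
        printf '\tyacc -d %s.y\n' "$p"
        printf '\tmv y.tab.c %s.c\n' "$p"
        printf '\tmv y.tab.h %s.h\n\n' "$p"
        printf '%s.o: %s.c %s.h\n' "$p" "$p" "$p"
        printf '\tcc -S %s.c\n' "$p"
        printf "\t/lib/c2 %s.s | sed 's/_yy/_%s_yy/g' | as -o %s.o\n" \
            "$p" "$p" "$p"
        printf '\trm -f %s.s\n\n' "$p"
    done
}

gen_rules expr cmd    # "expr" and "cmd" are made-up grammar names
```

The entry points would then be "expr_yyparse" and "cmd_yyparse".  Since
every "yacc" run writes "y.tab.c" and "y.tab.h" before they are renamed,
the rules should not be run at the same time in the same directory.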

This same trick can be used for "lex" as well.  If you do this kind of
thing with "lex", you can't use the standard "lex" library ("-ll" or
"-lln") any more, since its routines still go by their original "yy"
names; you must supply your own routines instead.

Anyone who has compiled UNIX kernel code has probably already seen this
technique, by the way.  In the case of the kernel, the assembly code is
run through "sys/asm.sed" to change selected function calls into
privileged machine instructions.

I'm not what you would call a "modular software tools" fanatic, but it
is kind of neat that you can do things like the above.  If I had had
things my way, though, I would have added an option to "yacc" and "lex"
so that they would generate file and symbol names with a unique prefix
of the programmer's choice.

-- Rich <v.wales at UCLA-LOCUS>