yacc sorrows

Chris Torek chris at mimsy.umd.edu
Thu Feb 15 06:39:27 AEST 1990


(Incidentally, this is another thing that does not really belong in
comp.lang.c, but in this case there *is* no appropriate group, so I
have not attempted to redirect followups....)

A few minor points:

In article <1990Feb9.171557.18465 at tcsc3b2.tcsc.com> prs at tcsc3b2.tcsc.com
(Paul Stath) writes:
>The string that gets matched in LEX is stored in a character pointer called
>`yytext'.

Actually, this is an array (of size YYLMAX, typically 200) of characters,
not a pointer.

[example lex code]
>${alpha}{alphanum}*	{
>				yylval.str=malloc(strlen(yytext)+1);
>				strcpy(yylval.str, yytext);
>				return (Identifier);
>			}


It is not actually necessary to call malloc() here, as the characters
in yytext[] will be left undisturbed until the next call to yylex().
The string saving, if necessary, can be deferred to the parser.  One
useful trick is to have a parse rule like save_id:

	%type <str> save_id
	%token <str> ID
	%%
	save_id: ID { $$ = savestr($1); };

Then, whenever you need an ID that must be saved from destruction by
the next call to yylex(), you can use save_id instead of ID.
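
To make the hazard concrete, here is a small sketch in plain C, with no lex
or yacc involved. The names tokbuf and next_token are hypothetical stand-ins
for yytext[] and yylex(), and this savestr is only one guess at what such a
routine might look like (copy to the heap, exit on failure). The real one
could just as well be built on a string table.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define TOKMAX 200		/* stands in for YYLMAX */

static char tokbuf[TOKMAX];	/* stands in for yytext[] */

/* Toy stand-in for yylex(): every call overwrites the shared buffer. */
char *next_token(const char *lexeme)
{
	strncpy(tokbuf, lexeme, TOKMAX - 1);
	tokbuf[TOKMAX - 1] = '\0';
	return tokbuf;
}

/* Hypothetical savestr(): private heap copy, exits if malloc fails. */
char *savestr(const char *s)
{
	char *p = malloc(strlen(s) + 1);

	if (p == NULL) {
		fprintf(stderr, "savestr: out of memory\n");
		exit(EXIT_FAILURE);
	}
	strcpy(p, s);
	return p;
}
```

If a caller keeps the pointer returned by next_token("first") and then calls
next_token("second"), the kept pointer now reads "second", while a copy made
with savestr() in between still reads "first". A yacc parser can be in the
same position whenever it has to fetch a lookahead token before running an
action.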

Another trick (which I have used in some hand-coded lexers) is
to save all strings in hash tables, possibly reference counted (depending
on whether many should be freed later).  In any case, a routine that
calls malloc() should check for no-space: instead of

			yylval.str=malloc(strlen(yytext)+1);
			strcpy(yylval.str, yytext);

you need something like

			yylval.str = malloc(strlen(yytext) + 1);
			if (yylval.str == NULL)
				die_horribly_due_to_running_out_of_space();
			strcpy(yylval.str, yytext);

or more simply

			yylval.str = estrdup(yytext);

where estrdup is like strdup, but errors out if out of space.  (strdup
is a common library function that acts like malloc+strcpy, returning
NULL if out of space.)
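
For illustration, here is one way estrdup and the hash-table trick mentioned
earlier might look. Everything below is an assumption rather than code from
any particular library: the names intern, hash, and NBUCKETS are made up, the
bucket count is arbitrary, and reference counting is left out to keep the
sketch short. estrdup is written with malloc directly, since strdup is not
available everywhere.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Like strdup, but exits instead of returning NULL. */
char *estrdup(const char *s)
{
	size_t n = strlen(s) + 1;
	char *p = malloc(n);

	if (p == NULL) {
		fprintf(stderr, "estrdup: out of memory\n");
		exit(EXIT_FAILURE);
	}
	memcpy(p, s, n);
	return p;
}

#define NBUCKETS 211		/* arbitrary table size for this sketch */

struct entry {
	struct entry *next;	/* next entry in the same bucket */
	char *str;		/* the single saved copy of the string */
};

static struct entry *buckets[NBUCKETS];

/* Simple multiplicative string hash, reduced to a bucket index. */
static unsigned hash(const char *s)
{
	unsigned h = 0;

	while (*s != '\0')
		h = h * 31 + (unsigned char)*s++;
	return h % NBUCKETS;
}

/* Return the shared copy of s, saving it the first time it is seen. */
const char *intern(const char *s)
{
	unsigned h = hash(s);
	struct entry *e;

	for (e = buckets[h]; e != NULL; e = e->next)
		if (strcmp(e->str, s) == 0)
			return e->str;
	e = malloc(sizeof *e);
	if (e == NULL) {
		fprintf(stderr, "intern: out of memory\n");
		exit(EXIT_FAILURE);
	}
	e->str = estrdup(s);
	e->next = buckets[h];
	buckets[h] = e;
	return e->str;
}
```

With something like this, a lex action could set yylval.str from
intern(yytext) (given a suitably const-qualified member), and two
occurrences of the same identifier would share one pointer, so they could
be compared with == instead of strcmp.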

>LEX and YACC are powerful tools which IMHO are poorly documented.

The real documentation for both of these tools is found in compiler
courses and in compiler textbooks, not in the supplementary Unix
documents.  The latter assume you know what LALR parsing and regular
expressions are all about, and merely tell you how to tell yacc and
lex what syntax rules and regular expressions to recognise, and what
actions to take on recognition.
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163)
Domain:	chris at cs.umd.edu	Path:	uunet!mimsy!chris


