LEX rule, anyone???

Chris Torek chris at mimsy.umd.edu
Sat Dec 9 04:45:45 AEST 1989


In article <1989Dec7.193738.9829 at twwells.com> bill at twwells.com
(T. William Wells) writes:
>I don't know what triggers it, but, in certain cases, an RE that
>is in some start state gets recognized, even when the program is
>not in that start state. I've been bitten by this one three
>times, on three different machines (a VAX, a Sun, and a '386), so
>it appears to be a generic problem with lex.

Lex's start states are not terribly well documented.  Besides %start,
BEGIN(state), and <state>text, there is the default initial state
(called `INITIAL') and the fact that a lex rule with no <state> prefix
is active in *all* states.  For instance:

	%state FOO BAR
	%%
	f	{ BEGIN(FOO); }
	b	{ BEGIN(BAR); }
	i	{ BEGIN(INITIAL); }
	<INITIAL>t { return (1); }
	<FOO>t	{ return (2); }
	ack	{ return (3); }
	<BAR>gasp	{ return (4); }
	.|\n	;
	%%
	yywrap() { return (1); }
	main() { int c; while ((c = yylex()) != 0) printf("lex => %d\n", c); }

(Yes, this is not conformant: main should return an int.  But then this
thread belongs in some other group anyway, and is here only because this
is where it started.)  Running it shows how this works:

	% a.out
	ack
	lex => 3
	gasp
	t
	lex => 1
	f
	ack
	lex => 3
	gasp
	t
	lex => 2
	i
	b
	ack
	lex => 3
	gasp
	lex => 4
	t
	i
	^D %

`ack' is unadorned, hence recognised in all states, while `t' produces
a token only in states INITIAL and FOO, and `gasp' produces something only
in state BAR.  (Everything else is eaten.)
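
If the surprise is an unadorned rule firing where it was not wanted, one
way out is to give that rule an explicit state list.  What follows is
only a sketch under assumptions, not a tested fix for Bill's scanner: it
is a variant of the example above in which `ack' is confined to INITIAL,
and token number 5 is a made-up value for `ack' seen in FOO or BAR.  It
also assumes a lex that accepts the comma-separated <FOO,BAR> form and a
C compiler that takes prototypes.

```lex
%{
#include <stdio.h>
%}
%start FOO BAR
%%
f		{ BEGIN(FOO); }
b		{ BEGIN(BAR); }
i		{ BEGIN(INITIAL); }
<INITIAL>ack	{ return (3); }	/* matched only in state INITIAL */
<FOO,BAR>ack	{ return (5); }	/* hypothetical token for FOO and BAR */
.|\n		;
%%
int yywrap(void) { return (1); }

int main(void)
{
	int c;

	/* print each token number until yylex() reports end of input */
	while ((c = yylex()) != 0)
		printf("lex => %d\n", c);
	return (0);
}
```

With this version `ack' yields 3 only before any `f' or `b' has been
seen (or after an `i'), and 5 otherwise.  The f, b, i, and .|\n rules
are still unadorned, so they remain active in every state.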

At any rate, perhaps what Bill Wells is remembering is lex matching
an unadorned rule while in some state other than INITIAL, which it
apparently should not have done.  Or it could be yet another lex bug....
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163)
Domain:	chris at cs.umd.edu	Path:	uunet!mimsy!chris


