using lex with strings, not files

Garrett Wollman wollman at emily.uvm.edu
Sun May 12 15:13:48 AEST 1991


In article <187 at shasta.Stanford.EDU> shap at shasta.Stanford.EDU (shap) writes:
>Yacc doesn't read the input directly, so there's no work there.  My
>recollection is that lex uses two macros: GET() and UNGET() to
>obtain/pushback characters.  If you take a look at the lex-generated C
>code you will spot them. 
>
>What you need to do is supply your own version of these macros at the
>top of your file.
>
>Jonathan

Well, sort of.  (In AT&T lex the macros are actually called input()
and unput().)  Flex, for one, does not let the user redefine them;
with the way flex scanners work, that would mean an *extremely*
serious performance hit.  [In fact, the major performance feature of
flex is that it uses read() to fetch a whole block at once, rather
than pulling input through stdio a character at a time as
conventional lex does.]

Thankfully, you can spot a flex scanner very easily in your code...

#ifdef FLEX_SCANNER
/* flex specific code here */
#else
/* old slow lex specific code here */
#endif

But flex does its read()ing through a macro, so without too much
hassle you can write a string scanner that works correctly under
both flex and lex.  Your users will thank you for it.
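
For the old-lex half of that #ifdef, here is a minimal sketch.  It
assumes an AT&T-style lex whose generated scanner fetches and pushes
back characters through input() and unput() macros.  The names
my_string and scan_string() are made up for illustration, and the
sketch skips any line counting the stock macros may do.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical cursor into a writable, NUL-terminated buffer of text. */
static char *my_string;

#ifdef FLEX_SCANNER
/* flex: leave input()/unput() alone and replace YY_INPUT instead */
#else
/* old lex: every character goes through these two macros */
#undef input
#undef unput
/* next character, or 0 at the end of the string (lex's end-of-input value) */
#define input()  (*my_string ? (int)(unsigned char)*my_string++ : 0)
/* push c back; only valid after at least one character has been read */
#define unput(c) ((void)(*--my_string = (char)(c)))
#endif

/* Hypothetical helper: point the scanner at a new string. */
static void scan_string(char *text)
{
    assert(text != NULL);
    my_string = text;
}
```

Since unput() writes into the buffer, the text handed to scan_string()
has to live in writable storage, not in a string literal.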

In particular, you can redefine the following macro (taken from a
2.1-beta skeleton):
/* gets input and stuffs it into "buf".  number of characters read, or YY_NULL,
 * is returned in "result".
 */
#define YY_INPUT(buf,result,max_size) \
	if ( (result = read( fileno(yyin), buf, max_size )) < 0 ) \
	    YY_FATAL_ERROR( "read() in flex scanner failed" );

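to something like this (my_string is a char * pointing at the text
still to be scanned; you need <string.h> for strlen() and memcpy()):

#undef YY_INPUT
#define YY_INPUT(buf,result,max_size) \
    { \
	int len = strlen(my_string); \
	if (len > (max_size)) \
	    len = (max_size);	/* never copy more than flex asked for */ \
	memcpy(buf, my_string, len); \
	my_string += len;	/* advance by exactly what was copied */ \
	result = len;		/* 0 (YY_NULL) at end of string */ \
    }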

[there are probably some errors... in which case please remember that
it's now 1:15 in the morning here.]
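
To check that feeding a string out in blocks like this neither drops
nor repeats characters at the block boundaries, here is a small
standalone sketch that can be compiled without flex.  STRING_INPUT
and drain() are hypothetical names; the macro only mimics the shape
of a YY_INPUT replacement and is not flex's own code.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical cursor into the text still to be scanned. */
static const char *my_string = "";

/* Same shape as a YY_INPUT replacement: copy at most max_size bytes
 * into buf, advance the cursor, and report the count in result.
 * A result of 0 plays the role of YY_NULL (end of input). */
#define STRING_INPUT(buf, result, max_size) \
    do { \
        size_t len_ = strlen(my_string); \
        if (len_ > (size_t)(max_size)) \
            len_ = (size_t)(max_size); \
        memcpy((buf), my_string, len_); \
        my_string += len_; \
        (result) = (int)len_; \
    } while (0)

/* Hypothetical helper: drain `text` through STRING_INPUT in blocks of
 * at most `chunk` bytes, appending each block to `out`.  Returns the
 * number of non-empty refills. */
static int drain(const char *text, char *out, int chunk)
{
    char buf[64];
    int n, calls = 0;

    assert(chunk > 0 && chunk <= (int)sizeof buf);
    my_string = text;
    out[0] = '\0';
    for (;;) {
        STRING_INPUT(buf, n, chunk);
        if (n == 0)
            break;                      /* end of string reached */
        strncat(out, buf, (size_t)n);   /* buf is not NUL-terminated */
        calls++;
    }
    return calls;
}
```

Because the cursor moves forward by exactly the number of bytes
copied, the reassembled output should match the input whatever block
size is used.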

-GAWollman

Garrett A. Wollman - wollman at emily.uvm.edu

Disclaimer:  I'm not even sure this represents *my* opinion, never
mind UVM's, EMBA's, EMBA-CF's, or indeed anyone else's.


