lex & yacc questions

Lapus Lazuli chryses at xurilka.UUCP
Wed Aug 22 01:59:32 AEST 1990


In article <8144 at jarthur.Claremont.EDU> ssdken at watson.Claremont.EDU (Ken Nelson) writes:
>> by defining letters in the lex rules section.  In my example
>> I am using letters  [A-Za-z'_'] to match upper or lower case characters
>> and possibly the underscore.  My question is this, how can you get
>> lex to match a reserved word you have declared, whether it's upper case
>> or not.  For example, Unify has the reserved command word "application".
>Try this:
>[Aa][Pp][Pp][Ll][Ii][Cc][Aa][Tt][Ii][Oo][Nn] 	{ return _APPLICATION; }
>this will match no matter what combination of case is used.
>


Sorry about the blank article, I had a lapse of motor control.


The above method will work, but lex will generate HUGE tables if you have
more than a couple of reserved words written that way.  Someone I know tried
it with a Pascal-type language and lex ran out of memory.

A better way would be to define a single IDENTIFIER token, i.e., a letter or
underscore followed by zero or more letters, digits, or underscores.  Send
every match to a function which converts it to uppercase.  If you have a fixed
list of reserved words, you can put them in a hash table.  Return the reserved
word's token value if the converted text is in the hash table, else just
return IDENTIFIER.
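
Here is a rough sketch of that lookup in C.  The token codes (T_APPLICATION
and friends), the word list, the table size, and the name lookup_word() are
all made up for illustration; in a real scanner the token values would come
from the header yacc generates.  It uses a small open-addressing hash table
with linear probing, which is just one way to do it.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical token codes; yacc would normally supply these in y.tab.h. */
enum { IDENTIFIER = 258, T_APPLICATION, T_BEGIN, T_END };

#define TABLE_SIZE 16   /* power of two, well above the number of words */
#define MAX_WORD   32   /* longer than any reserved word */

struct entry { const char *word; int token; };

/* Reserved words are stored in uppercase only. */
static const struct entry reserved[] = {
    { "APPLICATION", T_APPLICATION },
    { "BEGIN",       T_BEGIN },
    { "END",         T_END },
};

static const struct entry *table[TABLE_SIZE];  /* NULL means empty slot */
static int table_ready;

static unsigned hash(const char *s)
{
    unsigned h = 0;
    while (*s)
        h = h * 31u + (unsigned char)*s++;
    return h & (TABLE_SIZE - 1);
}

/* Insert every reserved word, stepping to the next slot on a collision. */
static void init_table(void)
{
    size_t i;
    for (i = 0; i < sizeof reserved / sizeof reserved[0]; i++) {
        unsigned h = hash(reserved[i].word);
        unsigned probes = 0;
        while (table[h] != NULL) {
            h = (h + 1) & (TABLE_SIZE - 1);
            probes++;
            assert(probes < TABLE_SIZE);   /* table must never fill up */
        }
        table[h] = &reserved[i];
    }
    table_ready = 1;
}

/* Return the reserved word's token, or IDENTIFIER if text is not reserved. */
int lookup_word(const char *text)
{
    char upper[MAX_WORD];
    size_t i, len = strlen(text);
    unsigned h;

    if (!table_ready)
        init_table();
    if (len >= MAX_WORD)
        return IDENTIFIER;          /* too long to be a reserved word */

    for (i = 0; i < len; i++)
        upper[i] = (char)toupper((unsigned char)text[i]);
    upper[len] = '\0';

    for (h = hash(upper); table[h] != NULL; h = (h + 1) & (TABLE_SIZE - 1))
        if (strcmp(table[h]->word, upper) == 0)
            return table[h]->token;
    return IDENTIFIER;
}
```

The lex side then shrinks to a single rule, something like

    [A-Za-z_][A-Za-z0-9_]*    { return lookup_word(yytext); }

so the size of the lex tables no longer depends on how many reserved words
you have.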

There are algorithms out there to generate perfect hash functions (guaranteed
no collisions), given a fixed list of values.  The resulting table is about 3
or 4 times the size of the list, but if your list is 50 values or less that
shouldn't be a problem.  I don't remember where I got mine.  Perhaps someone
else can point to a source.
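
To give the flavour of it, here is a naive brute-force sketch.  This is not
one of the published algorithms, just a trial-and-error search.  It tries
successive multipliers until the fixed word list hashes into distinct slots
of a table about 3 times the size of the list.  The word list, the slot
count, and the function names are made up for illustration.

```c
#include <stddef.h>
#include <string.h>

#define NWORDS 5
#define SLOTS  17   /* prime, roughly 3 times the number of words */

/* Hypothetical fixed list of reserved words (uppercase). */
static const char *const words[NWORDS] = {
    "APPLICATION", "BEGIN", "END", "IF", "WHILE"
};

/* Polynomial hash whose multiplier is the parameter being searched for. */
static unsigned seeded_hash(const char *s, unsigned seed)
{
    unsigned h = 0;
    while (*s)
        h = h * seed + (unsigned char)*s++;
    return h % SLOTS;
}

/*
 * Return the first seed in 1..limit that sends every word to its own
 * slot, or 0 if no such seed was found within the limit.
 */
unsigned find_perfect_seed(unsigned limit)
{
    unsigned seed;

    for (seed = 1; seed <= limit; seed++) {
        char used[SLOTS];
        size_t i;

        memset(used, 0, sizeof used);
        for (i = 0; i < NWORDS; i++) {
            unsigned h = seeded_hash(words[i], seed);
            if (used[h])
                break;              /* collision, try the next seed */
            used[h] = 1;
        }
        if (i == NWORDS)
            return seed;            /* no collisions for this seed */
    }
    return 0;
}
```

You would run the search once, offline, and hard-code the seed it finds.  At
scan time a lookup is then one hash and one strcmp() against the word stored
in that slot, with no probing.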


Happy hacking!
Phong.



>				Ken Nelson


-- 
Phong T. Co (Lapus Lazuli) |	One in your belly, and one for Rudi,
chryses at xurilka.UUCP	   |	You got what you gave by the heel of my bootie,
dada Indugu Inc.	   |	Bang, bang, out! like an old cherootie,
Montreal, CANADA	   |	I'm coming for you.	-- Kate Bush


