
Parser tools: How to make the lexer context sensitive?

I have spent some time translating the lex and yacc specifications


for the C programming language into something suitable for the
parser tools collection provided by plt-scheme.

I am almost done (with respect to my goal) but have yet to
overcome one little obstacle. 

In the specification for the lexer is the following rule

   {L}({L}|{D})*		{ count(); return(check_type()); }

which says that a letter, possibly followed by letters or digits,
is an IDENTIFIER or a TYPE_NAME, since check_type() is defined as:

int check_type()
{ /*
  * pseudo code --- this is what it should check
  *	if (yytext == type_name)
  *		return(TYPE_NAME);
  *	return(IDENTIFIER);
  */
}
Thus it should be possible, from within the lexer, to determine
whether the parser is currently parsing a 'type_name' or not.
The lexer should return a TYPENAME while the non-terminal
'type_name' is being parsed and an IDENTIFIER otherwise. If I recall
correctly from my compiler course (5 years ago?), one solution in
lex/yacc was to set a global flag when entering the parsing of
type_name and to reset it when leaving.

How do I do this using the parser tools?

The rule I use in the parser is:

    [(specifier_qualifier_list)                     (list 'type_name $1)]
    [(specifier_qualifier_list abstract_declarator) (list 'type_name $1 $2)])

The non-working rule in the lexer is

   [(@ (L) (* (: (L) (D))))  (token-IDENTIFIER (get-lexeme))]

which should be

   [(@ (L) (* (: (L) (D))))  (if <parsing a type_name?>
                                 (token-TYPENAME (get-lexeme))
                                 (token-IDENTIFIER (get-lexeme)))]

What shall I write in place of <parsing a type_name?> ?
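One idea I have been toying with (a sketch only, and the helper names
type-names, add-type-name! and type-name? are my own invention, not
part of the parser tools): since the lexer and parser actions are both
plain Scheme code, they can share a mutable variable. Instead of a flag,
keep a table of the names introduced by typedef so far and consult it
from the identifier rule -- the classic "lexer hack" for C:

    ;; Shared state: names declared via typedef so far.
    ;; (All three definitions below are hypothetical helpers.)
    (define type-names '())

    (define (add-type-name! name)               ; call from the parser
      (set! type-names (cons name type-names))) ; action for typedefs

    (define (type-name? name)
      (and (member name type-names) #t))

    ;; The identifier rule in the lexer would then read:
    ;;
    ;;   [(@ (L) (* (: (L) (D))))
    ;;    (if (type-name? (get-lexeme))
    ;;        (token-TYPENAME   (get-lexeme))
    ;;        (token-IDENTIFIER (get-lexeme)))]
    ;;
    ;; and the parser action that reduces a typedef declaration
    ;; calls (add-type-name! ...) on the declared identifier.

A table seems safer than a flag here, because an LALR parser reads
tokens ahead of its reductions, so a flag toggled on entering/leaving
type_name may not be set at the moment the lexer sees the identifier.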

For the curious: I am implementing the parser in order to
make writing extensions for MzScheme easier. Some time ago
I wrote a tiny extension in order to use some functions
from ImageMagick. Due to the number of constants and functions
in ImageMagick, I figured it would be worthwhile to automate
some of the extension writing. So far I can successfully
convert typedefed enumerations into association lists
(using a little evaluator of the constant expressions).

I am now working on extracting the function definitions.
In most cases, if two functions in the API have the same type,
I can reuse the corresponding functions in the extension.

Jens Axel Søgaard

There is no substitute for good manners, except, perhaps, fast reflexes.
  - fortune