Re: using the PLT scheme lexer
OK, my bad. There was a trivial bug in my code.
BTW, there is a trivial bug in the example lexer in
collects/parser-tools/doc.txt. The comment abbreviation is missing a star:
(define-lex-abbrevs
  [initial (: (- a z) (- #\A #\Z) ! $ % & * / : < = > ? ^ _ ~)]
  [subsequent (: (initial) (digit) + - #\. @)]
  [digit (- #\0 #\9)]
  [comment (@ #\; (^ #\newline) #\newline)])
should be
(define-lex-abbrevs
  [initial (: (- a z) (- #\A #\Z) ! $ % & * / : < = > ? ^ _ ~)]
  [subsequent (: (initial) (digit) + - #\. @)]
  [digit (- #\0 #\9)]
  [comment (@ #\; (* (^ #\newline)) #\newline)])
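To see why the star matters, here is a rough analogy using Python's re
module rather than the parser-tools syntax. It assumes that @ means
concatenation, (^ ...) means "any character except", and * means
repetition. The names BROKEN and FIXED are mine, purely for illustration:

```python
import re

# Assumed regex analogues of the two `comment` abbreviations.
BROKEN = re.compile(r";[^\n]\n")    # ';', exactly one non-newline char, newline
FIXED = re.compile(r";[^\n]*\n")    # ';', zero or more non-newline chars, newline

line = "; a typical comment\n"

broken_match = BROKEN.match(line)
fixed_match = FIXED.match(line)

print(broken_match)                   # None: the comment body is longer than one char
print(fixed_match.group(0) == line)   # True: the whole comment line is consumed
```

Without the star, only comments with a one-character body would be
recognized; anything longer fails to match as a comment.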
Also, I'm curious as to why some characters have to be written with escape
sequences (e.g. #\A) and some don't (e.g. a, z, !, $ etc.). Is there an
automatic conversion to lower case which is overridden by the #\ ?
Mike
> Date: Sun, 16 Dec 2001 19:53:52 -0700 (MST)
> From: Scott Owens <sowens@cs.utah.edu>
>
> The intended behavior is to match the longest possible token, so this
> would indeed be a bug. However, I have been unable to reproduce it
> as follows:
>
> Welcome to MzScheme version 200alpha4, Copyright (c) 1995-2001 PLT
> Note: readline loaded
> > (require (lib "lex.ss" "parser-tools"))
> >
> > (define l
>     (lex
>      ("-" 1)
>      ("42" 2)
>      ("-42" 3)))
> > (define in (make-lex-buf (current-input-port)))
> > (l in)
> -42
> 3
>
> Thus a more detailed description of how to reproduce the bug would be
> useful. (Just send me an e-mail; don't submit a bug report.)
>
> -Scott Owens
>
> On Sun, 16 Dec 2001, Michael Vanier wrote:
>
> >
> > Playing around with the example lexer, it appears as if the lexer will
> > always return the first matching token, even if that token is the prefix to
> > a longer token which is also a valid match. For instance, if you have two
> > token categories, one of which matches "-" (e.g. a symbol) and one of which
> > matches "-42" (an integer), the lexer matches "-" instead of "-42". This
> > is different from standard lex/flex behavior. Is this a bug?
> >
> > Mike
> >
> > > Date: Thu, 13 Dec 2001 14:12:01 -0700 (MST)
> > > From: Scott Owens <sowens@cs.utah.edu>
> > >
> > > In general, the return value of the lexer is not restricted. The token
> > > structure is provided for interoperation with a parser, but you are free
> > > to return whatever values you prefer.
> > >
> > > Since the lexer generator is a new tool to v200, I would appreciate any
> > > suggestions/feedback on your experience using it.
> > >
> > > -Scott Owens
> > >
> > > On Thu, 13 Dec 2001, Michael Vanier wrote:
> > >
> > > >
> > > > OK, now that I've found the lexer, here's a simple question.
> > > >
> > > > I can run the lexer to return a single value of type <struct:token>, but I
> > > > can't figure out how to extract the fields from the struct. As far as I
> > > > can tell (and I'm a total newbie with the module system), the token struct
> > > > type is defined in collects/parser-tools/private-lex/token.ss. I can't
> > > > import it directly because it already gets imported when I import the lex
> > > > library. However, the token accessor functions do not appear to be
> > > > exported. Does that mean that tokens are an opaque data type? If I can't
> > > > get the value of the token, the lex module is not usable by me. I don't
> > > > want or need to use the yacc module. Sorry for my utter cluelessness.
> > > >
> > > > Mike
> > > >
> > > >
> > >
> > >
> >
>
>
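
As a footnote to the longest-match question quoted above, here is a minimal
sketch of that rule in Python. It is not the parser-tools implementation.
RULES and lex_one are hypothetical names, and the three literal rules simply
mirror the "-", "42", "-42" example from the quoted transcript. Breaking ties
in favour of the earlier rule follows the lex/flex convention; whether the
PLT lexer does the same is an assumption here.

```python
# Hypothetical rules mirroring the quoted example: (literal, action value).
RULES = [("-", 1), ("42", 2), ("-42", 3)]

def lex_one(text, pos=0):
    """Return (value, length) for the longest rule matching at pos, or None."""
    best = None
    for literal, value in RULES:
        if text.startswith(literal, pos):
            # A strictly longer match wins; on a tie the earlier rule is kept.
            if best is None or len(literal) > best[1]:
                best = (value, len(literal))
    return best

print(lex_one("-42"))   # (3, 3): "-42" beats the shorter prefix "-"
print(lex_one("-x"))    # (1, 1): only "-" matches here
```

A lexer that returned on the first matching rule instead would yield 1 for
the input "-42", which is the behaviour originally suspected.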