Regexp optimization question

Magnus Lie Hetland mlh at furu.idi.ntnu.no
Fri Apr 23 11:52:53 EDT 2004


In article <c6a8ku$jga$1 at newsreader2.netcologne.de>, Günter Jantzen wrote:
>
>"Magnus Lie Hetland" <mlh at furu.idi.ntnu.no> schrieb im Newsbeitrag
>news:slrnc8gal3.9da.mlh at furu.idi.ntnu.no...
>>
>> Any ideas?
>>
>
>Maybe Plex is helpful. I haven't used it myself yet, but it seems to address
>your problem.

Ahah.

>The author of Plex is Greg Ewing. He built Pyrex on top of Plex.

Yeah, I know -- I just never looked at Plex in detail.

>The documentation
>http://www.cosc.canterbury.ac.nz/~greg/python/Plex/version/doc/index.html
>contains
[snip]

Yeah, I looked at the docs, and it looks very promising!

One of the reasons I've avoided existing lexers is that I don't do
standard tokenization -- I don't partition all of the text into regexp
tokens. I allow the lexer to skip over text -- somewhat like how
whitespace is normally handled, except that this can be *any* text --
and to return the next token that is of any use to the current parsing
rule.
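
In case it helps, here's a rough sketch of that idea in plain Python
re -- this isn't my actual code, and the patterns are made up, but it
shows the "skip anything, return the next useful token" behaviour:

  import re

  def next_token(text, pos, patterns):
      # patterns: (name, compiled regexp) pairs for the current rule
      best = None
      for name, regexp in patterns:
          m = regexp.search(text, pos)
          if m is not None and (best is None or m.start() < best[1].start()):
              best = (name, m)
      if best is None:
          return None                      # nothing useful left
      name, m = best
      # Whatever sat between pos and m.start() is simply skipped over.
      return name, m.group(), m.end()

  rules = [("emph", re.compile(r"\*[^*]+\*"))]
  print(next_token("some plain text *emphasized* and more", 0, rules))
  # -> ('emph', '*emphasized*', 28)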

But who knows -- I may be able to use Plex anyway.

One problem might be that the regexp format seems to be quite
stripped-down (although, of course, a regexp is a regexp,
theoretically speaking ;)

No non-greedy matching, no lookahead/lookbehind, etc.
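
Just to be concrete, this is the kind of thing I mean, in standard
Python re (as far as I can tell from the Plex docs, its pattern
constructors have no equivalent):

  import re

  # Non-greedy matching: grab the shortest span between delimiters.
  print(re.findall(r"\*(.+?)\*", "*one* and *two*"))   # ['one', 'two']

  # Lookbehind: match 'bar' only when directly preceded by 'foo'.
  print(re.findall(r"(?<=foo)bar", "foobar barfoo"))   # ['bar']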

But if Plex gives a real performance boost, I may be able to live with
that. (In my case, the regexp support is functionality that is exposed
to the user.)

>Hope I could help you

It might. Thanks.

>Guenter

-- 
Magnus Lie Hetland              "Wake up!"  - Rage Against The Machine
http://hetland.org              "Shut up!"  - Linkin Park


