Regexp optimization question
Magnus Lie Hetland
mlh at furu.idi.ntnu.no
Sat Apr 24 11:15:45 EDT 2004
In article <kblic.15952$hR1.8421 at fe2.texas.rr.com>, Paul McGuire
wrote: [snip]
>
> pyparsing supports this kind of text skipping, using scanString()
> instead of parseString().
I already have a working implementation in Python -- if this isn't
more efficient (I'm just talking about the tokenization part) I don't
think there would be much gain in switching.
(IIRC, the pyparsing docs say that pyparsing is slow for complex
grammars, at least.)
BTW: I have now done some experiments with Plex with lots of regular
expressions; simply compiling a pattern with 500 alternatives took
forever, whereas re.compile was quite fast.
So... If I can somehow be content with only getting one match per
position, I guess re is the best solution.
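For illustration, here is a minimal sketch of that re-based approach: one compiled pattern with 500 alternatives, scanned over the text with finditer(). The vocabulary (word0 ... word499), the tokenize() helper and the sample text are made up for the example and are not from my actual code. Sorting the alternatives longest-first is my own assumption about how one would want ties resolved, since re takes the first alternative that matches at a given position.

```python
import re

# Hypothetical vocabulary: 500 made-up literal terms (word0 ... word499).
terms = ["word%d" % i for i in range(500)]

# re tries alternatives left to right and keeps the first that matches,
# so put longer terms first ("word12" must come before "word1").
terms.sort(key=len, reverse=True)

# Escape each term so it is matched literally, then join into one pattern.
pattern = re.compile("|".join(re.escape(t) for t in terms))


def tokenize(text):
    """Return (start, matched_text) pairs for each non-overlapping match.

    Unmatched text between tokens is simply skipped.
    """
    return [(m.start(), m.group()) for m in pattern.finditer(text)]


tokens = tokenize("xx word12 yy word499 word7")
print(tokens)
```

Note that only "word12" is reported at offset 3, even though "word1" also matches there; that is the "one match per position" limitation.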
Or I could implement something in C (Pyrex)... (Or use something like
pybison.)
--
Magnus Lie Hetland "Wake up!" - Rage Against The Machine
http://hetland.org "Shut up!" - Linkin Park