Regular Expressions - Python vs Perl

Ilpo Nyyssönen iny+news at iki.fi
Sat Apr 23 01:08:21 EDT 2005


"Fredrik Lundh" <fredrik at pythonware.com> writes:

> so you picked the wrong file format for the task, and the slowest
> tool you could find for that file format, and instead of fixing
> that, you decided that the regular expression engine was to blame
> for the bad performance. hmm.

What would you recommend instead?

I have searched for alternatives, but I still find XML the best
there is. It is a standard format with a standard programming API.

I don't want to lose my calendar data. XML as a standard format makes
it easier to convert later to some other format. As a textual format
it is also readable in raw form, which eases debugging.

And my point is that regular expression compilation can be a
problem in Python. The current regular expression engine is just
unusably slow in short-lived programs that use a larger number of
regexps. And fixing it should not be that hard: an easy improvement
would be to add some kind of storing mechanism for the compiled
regexps. Are there any reasons not to do this?

>> Nowdays I use libxml2-python as the XML parser and so the problem is
>> not so acute anymore. (That is just harder to get in running for
>> python compiled from source outside the rpm system and it is not so
>> easy to use via DOM interface.)
>
> python has shipped with a fast XML parser since 2.1, or so.

With what features? Validation? I really want a validating parser with
a DOM interface. (Or something better than DOM, as long as it is
object-oriented.)
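
A small sketch of what is at stake, assuming the standard library's
xml.dom.minidom as the DOM implementation: it gives a DOM tree but does
not validate against the DTD. The calendar document and its element
names are made up for this example.

```python
from xml.dom import minidom

# Hypothetical calendar document: the internal DTD says <calendar> may
# only contain <event> elements, but the body contains <bogus/> instead.
DOC = """<?xml version="1.0"?>
<!DOCTYPE calendar [
  <!ELEMENT calendar (event+)>
  <!ELEMENT event (#PCDATA)>
]>
<calendar><bogus/></calendar>
"""

# The document is well-formed, so parsing succeeds even though it
# violates the DTD; no validation error is raised.
dom = minidom.parseString(DOC)

# Collect the element children of the root through the DOM interface.
children = [n.tagName for n in dom.documentElement.childNodes
            if n.nodeType == n.ELEMENT_NODE]
```

Here children ends up holding the invalid element name, so any check
against the DTD would have to be written by hand or done by a different
parser.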

I don't want to make my programs ugly (read: use some lower-level
interface) and error-prone (read: no validation) to make them fast.

-- 
Ilpo Nyyssönen # biny # /* :-) */
