Object oriented storage with validation (was: Re: Caching compiled regexps across sessions (was Re: Regular Expressions - Python vs Perl))

Ilpo Nyyssönen iny+news at iki.fi
Sun Apr 24 03:36:37 EDT 2005


[reorganized a bit]

Ville Vainio <ville at spammers.com> writes:

> Why don't you use external validation on the created xml? Validating
> it every time sounds like way too much like Javaic B&D to be fun
> anymore. Pickle should serve you well, and would probably remove about
> half of your code. "Do the simplest thing that could possibly work"
> and all that.

What is the point of doing validation if it isn't done every time? Why
wouldn't I do it every time? It isn't that slow a thing to do.

Pickle doesn't have validation. I am not comfortable using it as a
storage format that should stay reliable over years while the program
evolves. It also doesn't tell me if my program has put something into
the data other than what I meant to. The program will just throw some
weird exception.
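
A minimal sketch of the concern, assuming a hypothetical record stored
as a dict with a string "name" and an integer "count" (both names are
made up for illustration). It only shows that pickle round-trips
mistyped data silently, and that an explicit check at load time gives a
readable error instead:

```python
import pickle

def validate(record):
    """Raise a clear error when a field has an unexpected type."""
    if not isinstance(record.get("name"), str):
        raise ValueError("name must be a string, got %r" % (record.get("name"),))
    if not isinstance(record.get("count"), int):
        raise ValueError("count must be an integer, got %r" % (record.get("count"),))
    return record

# A bug stores a string where an integer was meant (hypothetical data).
bad = {"name": "spam", "count": "three"}

# pickle saves and restores the wrong data without complaint.
restored = pickle.loads(pickle.dumps(bad))

# Only an explicit check reports the problem when the data is loaded.
try:
    validate(restored)
    error = None
except ValueError as exc:
    error = str(exc)
```

Without the check, the mistake would surface only later, wherever the
program first tries to use "count" as a number.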

I want to do the simplest thing, but I also want something that helps
me keep the program usable in the future. I prefer putting some
resources into validation initially over spending more resources later
on dealing with an undetermined lump of data.

>     >> python has shipped with a fast XML parser since 2.1, or so.
>
>     Ilpo> With what features? validation? I really want a validating
>     Ilpo> parser with a DOM interface. (Or something better than DOM,
>     Ilpo> must be object oriented.)
>
> Check out (coincidentally) Fredrik's elementtree:
>
> http://effbot.org/zone/element-index.htm

At least the interface looks quite simple and usable. With some
validation wrapping over it, it might be ok...
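
A rough sketch of what such a wrapper might look like. It uses
xml.etree.ElementTree, which is where the elementtree library lives in
later Python versions. The SCHEMA table, the element names and the
load() helper are all hypothetical, and this is a hand-rolled check,
not DTD or XML Schema validation:

```python
import xml.etree.ElementTree as ET

# Hypothetical schema: tag -> (required attributes, allowed child tags).
SCHEMA = {
    "library": (set(), {"book"}),
    "book": ({"title"}, {"author"}),
    "author": (set(), set()),
}

def validate(element, schema=SCHEMA):
    """Recursively check an element against the schema; raise ValueError."""
    if element.tag not in schema:
        raise ValueError("unexpected element <%s>" % element.tag)
    required, children = schema[element.tag]
    missing = required - set(element.attrib)
    if missing:
        raise ValueError("<%s> is missing attributes: %s"
                         % (element.tag, ", ".join(sorted(missing))))
    for child in element:
        if child.tag not in children:
            raise ValueError("<%s> is not allowed inside <%s>"
                             % (child.tag, element.tag))
        validate(child, schema)
    return element

def load(text):
    """Parse XML text and validate it every time it is read."""
    return validate(ET.fromstring(text))

good = load('<library><book title="Dune"><author>Herbert</author></book></library>')
```

Calling load() on a document with a <book> that lacks its title
attribute raises ValueError right at the point of reading.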

>     Ilpo> And my point is that the regular expression compilation can
>     Ilpo> be a problem in python. The current regular expression
>     Ilpo> engine is just unusable slow in short lived programs with a
>     Ilpo> bit bigger amount of regexps. And fixing it should not be
>     Ilpo> that hard: an easy improvement would be to add some kind of
>     Ilpo> storing mechanism for the compiled regexps. Are there any
>     Ilpo> reasons not to do this?
>
> It should start life as a third-party module (perhaps written by you,
> who knows :-). If it is deemed useful and clean enough, it could be
> integrated w/ python proper. This is clearly something that should not
> be in the python core, because the regexps themselves aren't there
> either.

How can it work automatically in a separate module? Replacing
re.compile with something sounds like a possible way of getting the
regexps, but how and where to store the compiled data? Is there a way
to put it into the byte code file?
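
One way a separate module could stand in for re.compile is sketched
below. Note that this is a different technique from storing compiled
data: it only defers compilation until a pattern is first used, which
helps a short-lived program that defines many regexps but touches few
of them. It does not persist anything across sessions; in CPython,
pickling a compiled pattern saves just the pattern string and flags, so
loading it compiles again. LazyPattern and lazy_compile are
hypothetical names:

```python
import re

class LazyPattern:
    """Hypothetical stand-in for a compiled pattern; compiles on first use."""

    def __init__(self, pattern, flags=0):
        self.pattern = pattern
        self.flags = flags
        self._compiled = None

    def _get(self):
        # Compile once, the first time the pattern is actually needed.
        if self._compiled is None:
            self._compiled = re.compile(self.pattern, self.flags)
        return self._compiled

    def __getattr__(self, name):
        # Reached only for attributes not set on this object
        # (match, search, findall, sub, ...); forward them.
        if name.startswith("_"):
            raise AttributeError(name)
        return getattr(self._get(), name)

def lazy_compile(pattern, flags=0):
    """Drop-in replacement for re.compile that postpones the real work."""
    return LazyPattern(pattern, flags)

# Defining patterns costs almost nothing until one is used.
WORD = lazy_compile(r"\w+")
NUMBER = lazy_compile(r"\d+")
found = NUMBER.findall("a1 b22 c333")
```

Here only NUMBER is ever compiled; WORD stays a cheap placeholder for
the life of the program.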

Maybe I need to take a look at it when I find the time...

-- 
Ilpo Nyyssönen # biny # /* :-) */


