How to request data from a lazily-created tree structure ?

méchoui laurent.ploix at gmail.com
Mon Jun 16 17:56:22 EDT 2008


On Jun 16, 11:16 pm, "Diez B. Roggisch" <de... at nospam.web.de> wrote:
> méchoui schrieb:
>
> > Problem:
>
> > - You have a tree structure (XML-like) that you don't want to create
> > 100% in memory, because it just takes too long (for instance, you need
> > an HTTP request to fetch the information from a slow, distant site).
> > - But you want to be able to request data from it, such as "give me
> > all nodes that are under a "//foo/bar" tree and have a child with a
> > "baz" attribute of value "zzz"".
>
> > Question:
>
> > Do you have any other ideas for requesting data from a lazily-created
> > tree structure?
>
> > And does it make sense to create a DOM-like structure and to use a
> > generic XPath engine to query the tree? (And does such a generic
> > XPath engine exist?)
>
> > The idea is to have the tree structure created on the fly (we are in
> > Python), only when the XPath engine requests the data. Hopefully the
> > XPath engine will not request all the data from the tree (if the
> > request is smart enough and does not contain **, for instance).
>
> Generic XPath works only with a DOM-like structure. How else would you
> evaluate, e.g., an expression like foo[last()]?
>
> So if you really need lazy evaluation, you will need to analyze the
> specific query of interest and see whether it can be coded in a way
> that allows forgetting as much of the tree as possible or, even
> better, not querying it at all.
>
> Diez

Yes, I need to make sure my requests are properly written so that the
generic XPath engine does not need all the structure in memory.

There are quite a few cases where you really don't need to load
everything at all. /a/b/*/c/d is an example. But even with an example
like /x/z[last()]/t, you don't need to load everything under every
/x/z node. You just need to find the last one and check whether there
is a t node under it.
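
To make the /x/z[last()]/t case concrete, here is a minimal sketch. It
is not a real XPath engine: LazyNode, make_node, select and the
"fetched" log are all hypothetical names invented for this
illustration, and the nested list stands in for the slow remote
source. It only handles name, * and name[last()] steps.

```python
class LazyNode:
    """Tree node whose children are fetched only on first access."""

    def __init__(self, name, fetch_children):
        self.name = name
        self._fetch_children = fetch_children  # callable returning a list of LazyNode
        self._children = None                  # None means "not fetched yet"

    @property
    def children(self):
        if self._children is None:
            self._children = self._fetch_children()
        return self._children


fetched = []  # labels of the nodes whose children were actually requested


def make_node(name, spec, label):
    """Wrap a nested [(name, child_spec), ...] list (stand-in for a slow remote source)."""
    def fetch():
        fetched.append(label)
        # The "#i" suffix is just the child position, used to tell siblings apart in the log.
        return [make_node(n, s, "%s/%s#%d" % (label, n, i))
                for i, (n, s) in enumerate(spec, 1)]
    return LazyNode(name, fetch)


def select(root, path):
    """Evaluate a tiny XPath subset: name, * and name[last()] steps."""
    current = [LazyNode("", lambda: [root])]  # virtual document node above the root
    for step in path.strip("/").split("/"):
        keep_last = step.endswith("[last()]")
        name = step[:-len("[last()]")] if keep_last else step
        matched = []
        for node in current:
            hits = [c for c in node.children if name == "*" or c.name == name]
            # last() is relative to each parent, so keep only that parent's final hit.
            matched.extend(hits[-1:] if keep_last else hits)
        current = matched
    return current


tree = make_node("x", [
    ("z", [("big", [("deep", [])])]),   # never fetched by the query below
    ("z", [("t", [])]),                 # never fetched either
    ("z", [("t", []), ("u", [])]),
], "x")

result = select(tree, "/x/z[last()]/t")
print([n.name for n in result])  # ['t']
print(fetched)                   # ['x', 'x/z#3']
```

Only the children of x and of the last z are requested; the first two
z subtrees are never touched. The engine still has to list all the z
children of x to know which one is last, so how much is saved depends
on how cheap that listing is compared to loading the subtrees.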

Anyway, if I make requests that need all the data, then the need for
lazy instantiation of nodes disappears, right?


