lxml/ElementTree and .tail

Stefan Behnel stefan.behnel-n05pAM at web.de
Thu Nov 16 08:47:47 EST 2006


Chas Emerick wrote:
> the delta between Elements and DOM-style elements leads to other issues.
> There's no doubt that the needed helpers are simple, but all things being
> equal, not having to carry them around anywhere we're doing DOM
> manipulations is a big plus.
> 
> Because we're far from doing anything that is regular or one-off in nature.
> We're systematizing the extraction of data from functionally unstructured
> content, and it's flatly necessary to normalize the XHTML into something
> that can be easily consumed by the processes we've built that can do that
> content->data extraction/conversion from plain text, XML, PDF, and now
> XHTML.
> 
> Remember, corner cases. :-)

Hmm, then I really don't get why you didn't just write a customised XHTML API
on top of lxml's custom Element classes feature. Hiding XML-language-specific
behaviour directly in the Element classes really helps to keep your code
clean, especially in larger code bases.
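
To make that a bit more concrete, here is a minimal sketch of what I mean. It
assumes a reasonably recent lxml (the class lookup API shown here is the
current one and may differ from what your installed version offers). The class
name XHTMLElement, the full_text property, the drop_keeping_tail() method and
the build_parser() helper are all made up for illustration; they are not part
of lxml. The idea is only that the .tail handling lives in one place, on the
element class, instead of in helpers you carry around.

```python
from lxml import etree

XHTML_NS = "http://www.w3.org/1999/xhtml"


class XHTMLElement(etree.ElementBase):
    """Hypothetical element class that hides the .text/.tail bookkeeping."""

    @property
    def full_text(self):
        # itertext() yields this element's text plus the text and tails of
        # all descendants, in document order (not this element's own tail).
        return "".join(self.itertext())

    def drop_keeping_tail(self):
        """Remove this element from its parent but keep its tail text."""
        parent = self.getparent()
        if parent is None:
            return  # root element, nothing to detach from
        tail = self.tail or ""
        previous = self.getprevious()
        if previous is not None:
            # The tail text now follows the preceding sibling.
            previous.tail = (previous.tail or "") + tail
        else:
            # No preceding sibling: the tail text joins the parent's text.
            parent.text = (parent.text or "") + tail
        parent.remove(self)


def build_parser():
    """Return an XML parser that uses XHTMLElement for all XHTML elements."""
    lookup = etree.ElementNamespaceClassLookup()
    # The None key registers the default class for the whole namespace.
    lookup.get_namespace(XHTML_NS)[None] = XHTMLElement
    parser = etree.XMLParser()
    parser.set_element_class_lookup(lookup)
    return parser


doc = etree.fromstring(
    '<p xmlns="http://www.w3.org/1999/xhtml">'
    "Price: <span>(approx.) </span>42 EUR</p>",
    build_parser(),
)
print(doc.full_text)        # Price: (approx.) 42 EUR
doc[0].drop_keeping_tail()  # removes <span>, keeps "42 EUR"
print(doc.full_text)        # Price: 42 EUR
```

Code that walks the tree then only calls methods on the elements and never
touches .tail itself. Per-tag classes can be registered in the same namespace
lookup if some elements need special treatment.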

Stefan



More information about the Python-list mailing list