lxml and identification of elements in source

Stefan Behnel stefan.behnel-n05pAM at web.de
Thu Sep 20 07:55:10 EDT 2007


Laurent Pointal wrote:
> I use lxml to parse a document, modify some elements in the tree, run
> xinclude on it (thanks to Stefan Behnel for the pointer), and finally use
> the data to feed my sax-aware tool.
> 
> To report warnings/errors to the user (not related to XML itself, but to
> things like missing source for a cross-reference by example), is there a
> way to get the source-file and line/column of an element ?
> 
> [note: these informations are available when lxml parse the source, but are
> they kept anywhere in the document tree ?]

The lxml mailing list is a good place to ask these questions - although having
the name come up in c.l.py from time to time makes for some good advertising :)

Elements have a "sourceline" property that returns the line the Element was
parsed from. It returns None if the Element was created through the API. Also,
lines may become meaningless when you merge documents, e.g. xincluded elements
will refer to the line in the xincluded document, not some magically
calculated line in the target document. The column is not stored for an
Element after parsing is finished (no, we can't change that).
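
A minimal sketch of the behaviour described above, assuming a small made-up document (the element names and the XML string are only for illustration):

```python
from lxml import etree

# Made-up sample document; <xref> sits on line 3 of the source.
XML = b"""<doc>
  <section id="a">
    <xref target="missing"/>
  </section>
</doc>"""

root = etree.fromstring(XML)

# An element that came out of the parser remembers its source line.
xref = root.find(".//xref")
parsed_line = xref.sourceline

# An element created through the API has no source line.
created = etree.SubElement(root, "note")
created_line = created.sourceline

print("parsed element line:", parsed_line)
print("API-created element line:", created_line)
```

There is no corresponding column attribute to read, as noted above.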

You can call getroottree() on an Element to retrieve its root ElementTree,
which knows about the URL (and other things) in its docinfo property.
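
Putting the two together for a warning message could look roughly like this. It is only a sketch: describe_location() is a hypothetical helper, not part of lxml, and the temporary file stands in for your real source document.

```python
import os
import tempfile

from lxml import etree


def describe_location(element):
    """Hypothetical helper: build a 'source:line' string for messages."""
    docinfo = element.getroottree().docinfo
    # docinfo.URL is None when the document was not loaded from a file/URL.
    source = docinfo.URL or "<unknown source>"
    line = element.sourceline
    return "%s:%s" % (source, line if line is not None else "?")


with tempfile.TemporaryDirectory() as tmpdir:
    # Stand-in for a real source file.
    path = os.path.join(tmpdir, "doc.xml")
    with open(path, "wb") as f:
        f.write(b"<doc>\n  <section>\n    <xref target='missing'/>\n"
                b"  </section>\n</doc>\n")

    tree = etree.parse(path)
    xref = tree.find(".//xref")
    location = describe_location(xref)

print("warning: unresolved cross-reference at", location)
```

Keep the XInclude caveat in mind: for an included element, the line belongs to the included document while the root tree's URL is that of the including document, so the two may not match.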

Stefan


