lxml parsing with validation and target?

Robin Becker robin at reportlab.com
Wed Nov 3 10:59:29 EDT 2021


On 02/11/2021 12:55, Robin Becker wrote:
> I'm having a problem using lxml.etree to make a treebuilding parser that validates; I have test code where invalid xml 
> is detected and an error raised when the line below target=ET.TreeBuilder(), is commented out.
> 
.........

I managed to overcome this problem by using the non-targeted parser, which returns an _ElementTree object. I can then 
convert it to a tuple tree using code like this

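> class TT:
> 	"""Convert an lxml tree into nested (tag, attrib, content, sourceline) tuples."""
> 
> 	def __call__(self,tree):
> 		# compare with None: an _Element that has no children is falsy
> 		if tree is None: return
> 		# iter() yields the root first for both an _ElementTree and an _Element
> 		return self.maketuple(next(tree.iter()))
> 
> 	def maketuple(self,e):
> 		return (e.tag,e.attrib or None,self.content(e),e.sourceline)
> 
> 	def content(self,e):
> 		t = e.text
> 		kids = list(e)	# getchildren() is deprecated in lxml
> 		if not kids and t is None:
> 			return None
> 		r = []
> 		if t is not None: r.append(t)
> 		for c in kids:
> 			r.append(self.maketuple(c))
> 			# text that follows a child is stored on that child's tail
> 			if c.tail is not None:
> 				r.append(c.tail)
> 		return r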
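
For anyone wanting to try the whole round trip, here is a minimal sketch of the approach described above: validate with an ordinary (non-targeted) parser, then walk the resulting tree into tuples. The original test code and schema are not shown in this thread, so the inline DTD, the sample documents and the names VALID, INVALID and parse_to_tuples are all assumptions made for illustration. The converter is written as a plain function and copies the attributes into a dict, which differs slightly from the class above.

```python
import io
from lxml import etree

# Hypothetical sample document with an internal DTD (an assumption for this sketch).
VALID = b"""<!DOCTYPE doc [
<!ELEMENT doc (p+)>
<!ELEMENT p (#PCDATA)>
]>
<doc><p>hello</p><p>world</p></doc>"""

# Same DTD, but <q> is not declared, so validation should reject it.
INVALID = VALID.replace(b"<p>world</p>", b"<q/>")

def maketuple(e):
    """Return (tag, attrib, content, sourceline) for element e, recursively."""
    kids = list(e)
    if not kids and e.text is None:
        content = None
    else:
        content = [] if e.text is None else [e.text]
        for c in kids:
            content.append(maketuple(c))
            # text that follows a child is stored on that child's tail
            if c.tail is not None:
                content.append(c.tail)
    return (e.tag, dict(e.attrib) or None, content, e.sourceline)

def parse_to_tuples(data):
    """Parse bytes with DTD validation (no target), then build the tuple tree."""
    parser = etree.XMLParser(dtd_validation=True)
    tree = etree.parse(io.BytesIO(data), parser)  # an _ElementTree
    return maketuple(tree.getroot())

if __name__ == "__main__":
    print(parse_to_tuples(VALID))
    try:
        parse_to_tuples(INVALID)
    except etree.XMLSyntaxError as err:
        print("rejected:", err)
```

The cost of this route is that the full element tree is built before the tuples are, so it uses more memory than a target parser would; in exchange, validation errors are reported as usual.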


-- 
Robin Becker


More information about the Python-list mailing list