How *build* new elements and *replace* elements with xml.dom.minidom ?

Stefan Behnel stefan_ml at behnel.de
Fri Jun 12 01:08:19 EDT 2009


Johannes Bauer wrote:
> Stefan Behnel schrieb:
> 
>>> So I need to build hyperlinks (a elements) with href attribute and
>>> replace the text elements (numbers) somehow.
>> Try lxml.html instead. It makes it really easy to do these things. For
>> example, you can use XPath to find all table cells that contain numbers:
>>
>> 	td_list = doc.xpath("//td[number() >= 0]")
>>
>> or maybe using regular expressions to make sure it's an int:
>>
>> 	td_list = doc.xpath("//td[re:match(., '^[0-9]+$')]",
>>               namespaces={'re':'http://exslt.org/regular-expressions'})
>>
>> and then replace them by a hyperlink:
>>
>> 	# assuming links = ['http://...', ...]
>>
>> 	from lxml.html.builder import A
>> 	for td in td_list:
>> 		index = int(td.text)
>> 		a = A("some text", href=links[index])
>> 		td.getparent().replace(td, a)
> 
> Oh no! I was looking for something like this for *ages* but always
> fought with minidom - where this is a real pain :-(
> 
> Had I only known before that such a wonderful library exists. I'll
> definitely use lxml from now on.

Yep, I keep advertising it all over the place, but there are still so many
references to minidom on the web that it's hard to become the first hit in
Google when you search for "Python XML". ;)

Actually, the first hit (for me) is currently PyXML, which is officially
unmaintained. Wasn't there 'some' Python developer working for Google? What
about fixing their database?


> Does it compile with Python3?

Sure. :)

Stefan



More information about the Python-list mailing list