[XML-SIG] Alternatives for xml.dom.ext.reader.HtmlReader?

Iwan Vosloo iwan at reahl.org
Fri May 30 16:44:31 CEST 2008


Hi there,

We have code using the PyXML modules xml.dom.ext.reader.HtmlReader and
xml.dom.ext.Print.

After an upgrade from Ubuntu gutsy to hardy, this now breaks, because
the hardy release moved python-xml off the default module search path
(see https://bugs.launchpad.net/ubuntu/+source/python-xml/+bug/215723 ).

We have tried the workaround suggested in the changelog, which is to do:
sys.path.append('/usr/lib/python%s/site-packages/oldxml' %
sys.version[:3])

But that has to be done before the first import of xml, and working out
where that first import happens is difficult in a large app...
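
As far as we understand it, the stock xml package decides whether to hand
over to PyXML at the moment xml is first imported, which is why the path
change has to come first. A minimal sketch of the workaround wrapped in a
helper that also reports whether it ran early enough; the names
oldxml_path and enable_oldxml are our own invention, not part of PyXML or
the Ubuntu package:

```python
import sys


def oldxml_path(version=None):
    """Build the oldxml directory named in the changelog workaround."""
    if version is None:
        version = sys.version[:3]  # e.g. '2.5'
    return '/usr/lib/python%s/site-packages/oldxml' % version


def enable_oldxml(path=None, modules=None):
    """Append the oldxml directory to sys.path.

    Returns True if no xml package had been imported yet (the path
    change can still take effect), False if the call came too late.
    """
    if path is None:
        path = oldxml_path()
    if modules is None:
        modules = sys.modules
    in_time = 'xml' not in modules
    if path not in sys.path:
        sys.path.append(path)
    return in_time
```

Calling enable_oldxml() as the very first statement of the application's
entry script, and logging when it returns False, would at least tell us
whether something imported xml earlier.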

We're using these to solve the following problem:
 - to read possibly erroneous HTML (not XML) into a DOM tree
 - and then to create an HTML fragment (part of the complete doc) which
we have to render as a chunk of HTML (not XML) again.

We chose these modules based on
http://www.boddie.org.uk/python/HTML.html .

We don't really want to use libxml2dom, as also suggested there, since it
is not packaged for Ubuntu (and dealing with ad-hoc unpackaged things
would be a nightmare in our environment).

We also tried the following at the start of the problematic module:

import sys
import test.test_support

test.test_support.forget('xml')
test.test_support.forget('xml.dom')
sys.path.insert(0, '/usr/lib/python%s/site-packages/oldxml' %
                sys.version[:3])

But this also does not work...
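
The two forget() calls only drop the xml and xml.dom entries from the
module cache; any other xml submodule that was already imported stays
cached. A broader variant of the same idea, as an untested sketch:
forget_package is a hypothetical helper of ours, and we do not know
whether it is enough, since modules that already hold a reference to the
old xml package would keep it.

```python
import sys


def forget_package(name, modules=None):
    """Remove a package and all of its submodules from the module cache.

    Returns the sorted list of module names that were dropped.
    """
    if modules is None:
        modules = sys.modules
    prefix = name + '.'
    # Copy the keys first so the dict is not modified while iterating.
    doomed = [key for key in list(modules)
              if key == name or key.startswith(prefix)]
    for key in doomed:
        del modules[key]
    return sorted(doomed)
```

forget_package('xml') followed by the sys.path.insert(...) call above
would replace the two forget() lines.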

Any pointers to a quick workaround would be appreciated. (And to "the
right way" too)

Thanks
-i


