[XML-SIG] Anybody using PyXML (4DOM) HTML DOM?

John J Lee jjl at pobox.com
Mon Aug 25 19:39:11 EDT 2003


On Mon, 25 Aug 2003, John J Lee wrote:
[...]
> 1. HTMLDocument.getElementsByTagName doesn't work at all for lower-case
> attribute values (SF bug 782470):

Attribute *names*, I should have said (e.g. the 'name' in 'name="blah"' in
the example below).

Actually, this problem seems to be a parser bug: Sgmlop.HtmlParser sets
attributes with setAttributeNS.  If you change that call to setAttribute,
it works.
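
To illustrate the kind of mismatch I think is going on, here is a toy
model.  It is NOT the 4DOM source: ToyHTMLElement and its method bodies are
made up for illustration, and I am assuming that only the HTML-aware
setAttribute path normalises the name to upper case, while setAttributeNS
stores it exactly as given.

```python
# Toy model only, not the 4DOM implementation.  The class name and the
# method bodies are hypothetical; they mirror the behaviour described above.
class ToyHTMLElement(object):
    def __init__(self):
        # Attributes keyed by (namespaceURI, name).
        self._attrs = {}

    def setAttribute(self, name, value):
        # HTML-aware path (assumed): normalise the name before storing.
        self._attrs[(None, name.upper())] = value

    def setAttributeNS(self, namespaceURI, qualifiedName, value):
        # Namespace-aware path (assumed): store the name exactly as given.
        self._attrs[(namespaceURI, qualifiedName)] = value

    def getAttribute(self, name):
        # Lookup uppercases the name, so it only finds upper-case keys.
        return self._attrs.get((None, name.upper()), "")


# What the parser does now: the lower-case name is stored verbatim.
broken = ToyHTMLElement()
broken.setAttributeNS(None, "name", "blah")

# With the parser changed to call setAttribute instead.
fixed = ToyHTMLElement()
fixed.setAttribute("name", "blah")

print(repr(broken.getAttribute("name")))  # '' (attribute not found)
print(repr(fixed.getAttribute("name")))   # 'blah'
```

If that is roughly right, anything that goes through getAttribute (as
getElementsByName presumably does) never sees attributes the parser stored
in lower case.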


> #!/usr/bin/env python
>
> from xml.dom.ext.reader import HtmlLib
>
> doc = HtmlLib.FromHtml("""<html><head><title></title></head><body>
> <form name="blah"></form>
> </body></html>""")
>
> # HTMLElement.getAttribute uppercases the name, but it was *stored*
> # in lower case, so both fail.
> print repr(doc.getElementsByName("blah"))
> print repr(doc.getElementsByName("BLAH"))
[...]


John



