SAX-Parser entity

Harvey Thomas hst at empolis.co.uk
Fri Mar 1 08:06:21 EST 2002


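I would guess that your document is encoded in ISO-8859-1 (otherwise known as Latin-1) but does not declare it, so the parser assumes UTF-8 and rejects the ü as not well-formed. XML parsers must be able to parse UTF-8 and UTF-16 and may support other encodings. If your parser supports Latin-1, add encoding="iso-8859-1" to the XML declaration. Otherwise, use the codecs module to convert the document to UTF-8 before parsing it.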
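
To illustrate both suggestions, here is a minimal sketch. It assumes a current Python 3 with the standard-library minidom (not the Python 2.0 / _xmlplus setup in the traceback below), and the sample document and variable names are made up for the example.

```python
import codecs
from xml.dom import minidom
from xml.parsers.expat import ExpatError

# Hypothetical sample: Latin-1 bytes with no encoding in the XML declaration.
raw = '<?xml version="1.0"?>\n<text>Grüße</text>\n'.encode("iso-8859-1")

# Without a declared encoding the parser assumes UTF-8 and gives up at the ü.
try:
    minidom.parseString(raw)
    failed = False
except ExpatError:
    failed = True

# Option 1: declare the encoding in the XML declaration.
declared = raw.replace(
    b'<?xml version="1.0"?>',
    b'<?xml version="1.0" encoding="iso-8859-1"?>',
    1,
)
text1 = minidom.parseString(declared).documentElement.firstChild.data

# Option 2: transcode to UTF-8 (the XML default) with the codecs module.
utf8 = codecs.encode(codecs.decode(raw, "iso-8859-1"), "utf-8")
text2 = minidom.parseString(utf8).documentElement.firstChild.data

print(failed, text1, text2)
```

Option 1 only works if the underlying parser knows Latin-1 (expat does). Option 2 should work with any conforming parser, since UTF-8 support is mandatory.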


> -----Original Message-----
> From: fabi.kreutz at gmx.de [mailto:fabi.kreutz at gmx.de]
> Sent: 01 March 2002 12:43
> To: python-list at python.org
> Subject: SAX-Parser entity
> 
> 
> Hi,
> 
> I spent the last 3 hours browsing FAQs and the mailing list without
> success, but nevertheless have the feeling that this is a very easy
> question:
> 
> I am trying to use the minidom XML parser to parse my little file in
> order to generate HTML code.
> Being German, I really like to use umlauts, but minidom does not:
> Traceback (most recent call last):
>   File "<stdin>", line 1, in ?
>   File "/usr/lib/python2.0/site-packages/_xmlplus/dom/minidom.py", line 908, in parse
>     return _doparse(pulldom.parse, args, kwargs)
>   File "/usr/lib/python2.0/site-packages/_xmlplus/dom/minidom.py", line 900, in _doparse
>     toktype, rootNode = events.getEvent()
>   File "/usr/lib/python2.0/site-packages/_xmlplus/dom/pulldom.py", line 251, in getEvent
>     self.parser.feed(buf)
>   File "/usr/lib/python2.0/site-packages/_xmlplus/sax/expatreader.py", line 92, in feed
>     self._err_handler.fatalError(exc)
>   File "/usr/lib/python2.0/site-packages/_xmlplus/sax/handler.py", line 38, in fatalError
>     raise exception
> xml.sax._exceptions.SAXParseException: <unknown>:29:19: not well-formed
> 
> where character 19 in row 29 is the occurrence of an ü.
> 
> After browsing the FAQs I changed the default encoding in site.py to
> iso-8859-1, which had some nice effect, but not on minidom.
> Some more browsing led me to tell pulldom to use StringIO instead of
> cStringIO, still without success.
> 
> Since I want to use the text in HTML, it would be enough if I could
> use &uuml; instead, but in this case parse gives me
> ...
>   File "/usr/lib/python2.0/site-packages/_xmlplus/sax/handler.py", line 38, in fatalError
>     raise exception
> xml.sax._exceptions.SAXParseException: <unknown>:29:19: undefined entity
> 
> where 29:19 is the &.
> I tried to protect it with a \ or / but still without success.
> 
> Can anybody help me with this?
> 
> -- 
> The irony of the Information Age is that it has given
> new respectability to uninformed opinion.
> 					- John Lawton
