Please help!! SAXParseException: not well-formed (invalid token)

jvictor118 at yahoo.fr
Tue Mar 27 10:59:45 EDT 2007


I've been using the xml.sax.handler module to do event-driven parsing
of XML files in a Python application I'm working on. However, I keep
hitting really pesky "invalid token" exceptions. Initially, I was only
getting them on control characters, and a little
"sed -e 's/[^[:print:]]/ /g' $1;" took care of that just fine. But
recently, I've been getting these invalid token exceptions on n-tildes
(like the ñ in España), smart/fancy/curly quotes and other seemingly
harmless characters. Specifying encoding="utf-8" in the XML header
hasn't helped matters.
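
For reference, here is a minimal sketch of one way this symptom can
arise. It assumes (this is a guess, not something confirmed for the
files in question) that the bytes on disk are Latin-1 or Windows-1252
while the header declares UTF-8. The handler class, the helper
parse_bytes, and the sample document are made up for illustration, and
the code is written for Python 3.

```python
import xml.sax
from xml.sax.handler import ContentHandler


class TextCollector(ContentHandler):
    """Hypothetical handler that just gathers character data."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def characters(self, content):
        self.chunks.append(content)


def parse_bytes(data):
    """Parse a bytes document and return its concatenated text."""
    handler = TextCollector()
    xml.sax.parseString(data, handler)
    return "".join(handler.chunks)


header = b'<?xml version="1.0" encoding="utf-8"?>'

# Same text, two different byte encodings behind the same UTF-8 header.
good = header + "<c>España</c>".encode("utf-8")
bad = header + "<c>España</c>".encode("latin-1")  # assumed mismatch

parse_bytes(good)
print("UTF-8 bytes: parsed OK")

try:
    parse_bytes(bad)
except xml.sax.SAXParseException as exc:
    # The lone Latin-1 byte for n-tilde is not a valid UTF-8 sequence.
    print("Latin-1 bytes failed:", exc.getMessage())
```

If the second parse fails while the first succeeds, the declared
encoding and the real byte encoding probably disagree, and changing
the header alone would not fix that.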

Any ideas? As a last resort, I'd be willing to scrub invalid
characters, but it seems strange that curly quotes and n-tildes
wouldn't be valid XML! Is that really the case?
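
A rough sketch of what the scrubbing fallback could look like, for
comparison. The XML 1.0 specification does allow characters such as ñ
and curly quotes, so this only replaces code points outside the
spec's Char range (mostly control characters). The function name
to_clean_utf8 and the cp1252 fallback are assumptions made for this
example; the real source encoding of the files may differ.

```python
import re

# Anything outside the XML 1.0 "Char" production:
# #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
_INVALID_XML_CHARS = re.compile(
    "[^\u0009\u000A\u000D\u0020-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def to_clean_utf8(raw, fallback_encoding="cp1252"):
    """Return UTF-8 bytes with disallowed XML characters blanked out.

    Tries UTF-8 first; if the bytes are not valid UTF-8, decodes them
    with fallback_encoding instead (an assumption about the data).
    """
    try:
        text = raw.decode("utf-8")
    except UnicodeDecodeError:
        text = raw.decode(fallback_encoding, errors="replace")
    # Replace each disallowed character with a space, like the sed call.
    return _INVALID_XML_CHARS.sub(" ", text).encode("utf-8")


# Example: Latin-1 bytes for an n-tilde come out as proper UTF-8.
cleaned = to_clean_utf8("España".encode("latin-1"))
```

The output is always UTF-8, so the encoding declared in the XML header
has to say utf-8 (or be absent) for the parser to read it correctly.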

TIA!

Jason



