xml.sax removing newlines from attribute value?

Grant Edwards grante at visi.com
Thu Sep 29 15:12:20 EDT 2005


On 2005-09-29, Fredrik Lundh <fredrik at pythonware.com> wrote:

>> I'm using xml.sax to parse the "datebook" xml file generated
>> by QTopiaDesktop.  When I look at the xml file, some of the
>> attribute strings have newlines in them (as they are supposed
>> to).
>>
>> However, when xml.sax passes the attributes to my
>> startElement() method the newlines seem to have been deleted.
>>
>> How do I get the un-munged element attribute values?
>
> newlines as in chr(10) rather than &#xa; ?

Yup, looks that way.

> if so, the only way is to avoid XML:
>
>     http://www.w3.org/TR/REC-xml/#AVNormalize

I can't quite find it in the BNF, but I take it that chr(10)
isn't really allowed in XML attribute strings.  IOW, the file
generated by Trolltech's app is broken.
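
For illustration, here is a minimal sketch of the normalization
described at the link above, using xml.sax.  As far as I can tell the
parser does not reject a literal chr(10) in an attribute value; it
replaces it with a space, while a newline written as the character
reference &#10; comes through intact.  The element name "event", the
attribute name "note", and the helper names are made up for the
example and are not taken from the datebook file.

```python
import xml.sax


class NoteCollector(xml.sax.ContentHandler):
    """Collect the (hypothetical) 'note' attribute of every element."""

    def __init__(self):
        super().__init__()
        self.notes = []

    def startElement(self, name, attrs):
        # attrs holds the values *after* attribute-value normalization.
        if "note" in attrs:
            self.notes.append(attrs["note"])


def collect_notes(doc):
    """Parse a bytes document and return the 'note' values in order."""
    handler = NoteCollector()
    xml.sax.parseString(doc, handler)
    return handler.notes


# First element: literal newline inside the attribute value.
# Second element: the same newline written as a character reference.
SAMPLE = (
    b'<events>'
    b'<event note="line one\nline two"/>'
    b'<event note="line one&#10;line two"/>'
    b'</events>'
)

notes = collect_notes(SAMPLE)
print(repr(notes[0]))  # literal newline has been normalized to a space
print(repr(notes[1]))  # character reference keeps the newline
```

So the original line breaks cannot be recovered from startElement()
once the file has been written with literal newlines.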

> if the "yes, I know, but I have good reasons" approach is okay
> with you,

I didn't define the file or write the program that generated
it.  It's claimed to be "xml", and I'm just trying to parse it.

> and you're big enough to defend yourself against the
> XML-Is-The-Law crowd, you can use a "sloppy" XML parser such
> as sgmlop to deal with your files:
>
>     http://effbot.org/zone/sgmlop-index.htm

Good to know for future reference.  For now, I think I'll just
live with the way it works.  Everything basically works, except
some strings don't display quite "right".  My current app
treats the file as read-only.  If I ever get around to
modifying data and writing it back, I'll probably have to deal
with the newline issue at that point.
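
A rough sketch of what dealing with it on the writing side might look
like, assuming the goal is for newlines to survive a later parse:
write chr(10) as the character reference &#10; instead of a literal
newline.  xml.sax.saxutils.quoteattr() accepts a dict of extra
replacements, which is passed explicitly here so the sketch does not
depend on any default.  The "event"/"note" names and the helper
functions are hypothetical, and whether QTopiaDesktop itself would
accept a file written this way is untested.

```python
import xml.sax
from xml.sax.saxutils import quoteattr


def to_attr(value):
    """Return value quoted for use as an attribute, newlines preserved."""
    # quoteattr escapes &, < and > and picks suitable quotes; the extra
    # mapping turns chr(10) into a character reference so that
    # attribute-value normalization leaves it alone when parsed.
    return quoteattr(value, {"\n": "&#10;"})


class _FirstNote(xml.sax.ContentHandler):
    """Remember the first (hypothetical) 'note' attribute seen."""

    def __init__(self):
        super().__init__()
        self.note = None

    def startElement(self, name, attrs):
        if self.note is None and "note" in attrs:
            self.note = attrs["note"]


def round_trip(value):
    """Write value into a one-element document and parse it back."""
    doc = "<event note=%s/>" % to_attr(value)
    handler = _FirstNote()
    xml.sax.parseString(doc.encode("utf-8"), handler)
    return handler.note


original = "Dentist\nBring X-rays & forms"
print(to_attr(original))
print(round_trip(original) == original)
```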

-- 
Grant Edwards                   grante             Yow!  When this load is
                                  at               DONE I think I'll wash
                               visi.com            it AGAIN...


