[Expat-discuss] Is this a bug?

Karl Waclawek karl at waclawek.net
Fri Apr 28 15:35:06 CEST 2006


Elver Loho wrote:
> Hiya, all!
>
> I have no idea whether this is a bug or not, but I'd like to get
> people's opinion on it.
>
> I'm using Python's xml.parsers.expat to parse an RSS feed from
> http://blog.kriso.ee/feed/ to generate a static component for our
> webstore sidebar at http://www.kriso.ee/
>
> The problem is with the title of the latest entry: "Nip/Tuck'i
> stsenarist filmi kirjutamas" -- I have a method registered as
> CharacterDataHandler, and for the title it gets called *three times*.
>
> So the line:
>
> <title>Nip/Tuck'i stsenarist filmi kirjutamas</title>
>
> Ends up causing calls to:
> StartElementHandler("title")
> CharacterDataHandler("Nip/Tuck")
> CharacterDataHandler("'")
> CharacterDataHandler("i stsenarist filmi kirjutamas")
> EndElementHandler("title")
>
> It's trivial to work around it, but what's going on here?
>   
That is normal behaviour. Expat may split the character data of a single
element across any number of callbacks. You need to buffer the pieces and
join them yourself, for example when the end tag arrives.
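
A minimal sketch of that buffering in Python, assuming only the text of
<title> elements is wanted. The function name collect_titles and the three
handler names are made up for illustration; only the xml.parsers.expat
calls come from the library.

```python
import xml.parsers.expat


def collect_titles(xml_text):
    """Return the complete text of every <title> element in xml_text."""
    titles = []
    chunks = None  # list of pieces while inside <title>, otherwise None

    def start_element(name, attrs):
        nonlocal chunks
        if name == "title":
            chunks = []  # start a fresh buffer for this element

    def char_data(data):
        # May be called several times per element; just collect the pieces.
        if chunks is not None:
            chunks.append(data)

    def end_element(name):
        nonlocal chunks
        if name == "title" and chunks is not None:
            titles.append("".join(chunks))  # join only once the element ends
            chunks = None

    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = start_element
    parser.CharacterDataHandler = char_data
    parser.EndElementHandler = end_element
    parser.Parse(xml_text, True)  # True marks this as the final chunk of input
    return titles
```

The Python parser object also has a buffer_text attribute that can be set
to true to reduce the number of CharacterDataHandler calls, but as far as I
know it only does so "whenever possible", so joining the pieces yourself is
the safer route.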

Karl

