[XML-SIG] outputting non-ascii strings
Martin v. Loewis
martin@v.loewis.de
23 May 2002 08:44:50 +0200
Matt Patterson <matt@reprocessed.org> writes:
> The other thing that's cropped up now - using Juergen's suggestion to
> use stream.write(ustring.encode('utf-8')), which works a treat, decodes
> all the entities in the text, so I now have free-floating ampersands and
> angle brackets, where before I had entities. I do have typographer's
> quotes still :-)
>
> Is there an easy way around this problem? I've looked through my Python
> books (Learning Python, Programming Python, Python and XML) and can't
> find a comprehensive treatment of this issue - if there is one I'd like
> to know, please! Is there a good place to go and look for such
> documentation?
I'm not sure which issue you are referring to here.
Are you saying that,
- when using XML library functions to generate XML, it will produce
  ampersands and angle brackets which are not markup? That would be a
  bug; please report details.
- when using your own custom XML generating functions, you see such
  things? You may use xml.sax.saxutils.escape to replace them with
  the built-in entity references.
- with your custom writeback routines, you see more "literal"
  characters in the output text than you originally had in the input
  document, and you want them all back, exactly where they used to be?
  This is not possible. You have the option of searching the output
  strings yourself, and either writing character references or entity
  references where appropriate.
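
For the second and third cases, something along these lines might serve
as a starting point. It is only a sketch, written in present-day Python
syntax rather than what was current when this was posted: the helper
name to_xml_bytes and the sample string are made up for illustration,
while xml.sax.saxutils.escape and the "xmlcharrefreplace" error handler
are standard library features. escape() turns the markup characters
back into the built-in entity references, and the error handler writes
a numeric character reference for anything the target encoding cannot
represent.

```python
from xml.sax.saxutils import escape


def to_xml_bytes(text, encoding="ascii"):
    """Hypothetical helper: escape markup characters, then encode.

    escape() replaces &, < and > with &amp;, &lt; and &gt;.
    The "xmlcharrefreplace" handler emits &#NNNN; for every character
    that the chosen encoding cannot represent.
    """
    escaped = escape(text)
    return escaped.encode(encoding, "xmlcharrefreplace")


# Sample text with markup characters and typographer's quotes.
sample = "Fish & chips <\u201cquoted\u201d>"

# ASCII output: the quotes become numeric character references.
print(to_xml_bytes(sample))

# UTF-8 output: the quotes stay literal, only markup is escaped.
print(to_xml_bytes(sample, "utf-8"))
```

Note that this does not restore the references exactly where the input
document had them; it only guarantees that the output is well-formed
character data.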
Regards,
Martin