[XML-SIG] PyExpat encoding (was: XML support in Python 1.6)

Greg Stein gstein@lyra.org
Thu, 1 Jun 2000 12:56:28 -0700 (PDT)


On Thu, 1 Jun 2000, Fred L. Drake, Jr. wrote:
>...
>   We also need to determine how Unicode should be supported; should
> the parser always produce Unicode strings, or UTF-8, and provide a
> wrapper that converts everything?  Since it appears likely that
> auto-conversion between Unicode and narrow strings will likely only
> work for 7-bit narrow strings, it may be reasonable to create Unicode
> output directly from the parser (probably at the pyexpat level for
> efficiency).

Expat is typically compiled to spit out a particular encoding. By default,
this is UTF-8.

Presuming that the compilation flags are exposed and/or runtime-queryable,
then pyexpat can compensate accordingly. This implies that it would
sometimes return UTF-8 strings, or Unicode objects.

IMO, we should have a fixed output format, which is the Expat default:
UTF-8.

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/