[Python-Dev] Minidom and Unicode

Fredrik Lundh <effbot@telia.com>
Mon, 3 Jul 2000 19:01:27 +0200


mal wrote:

> > Anyhow, why would it be wrong for Fredrick to hard-code an encoding in
> > repr but right for me to hard-code one in minidom?
>
> Because hardcoding the encoding into the core Python API touches
> all programs. Hardcoded encodings should be userland options
> wherever possible.

the problem is that the existing design breaks people's
expectations: first, minidom didn't work because people
expected to be able to return the result of:

    "8 bit string" + something + "8 bit string"

or

    "8 bit string %s" % something

from __repr__.  that's a reasonable expectation (just look
in the python standard library).
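
a minimal sketch of that idiom, for illustration only.  it assumes
present-day python 3, where every str is unicode and the pattern
simply works; in python 2.0, if "something" is a unicode string, the
result of either expression is a unicode string too.  the Element
class and its tag_name attribute are hypothetical stand-ins, not
minidom's actual code:

```python
# hypothetical element class (an assumption for illustration, not minidom's code)
class Element:
    def __init__(self, tag_name):
        self.tag_name = tag_name

    def __repr__(self):
        # the idiom from the text: plain string literal formatted with a value
        return "<DOM Element: %s>" % self.tag_name


# works for an ASCII-only name ...
ascii_repr = repr(Element("title"))

# ... and, in python 3, for a name containing non-ASCII characters as well
unicode_repr = repr(Element("\u00fcberschrift"))
```

the point of the example is the expectation itself: whoever writes
__repr__ this way expects it to work whatever "something" holds.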

after my fix, minidom still didn't work because people expected
the conversion to work on all strings, on all platforms.  that's
also a reasonable expectation (read on).

> Besides, we're talking about __repr__ which is mainly a
> debug tool and doesn't affect program flow or interfacing
> in any way.

exactly.  this is the whole point: __repr__ is a debug tool,
and therefore it must work on all platforms, for all strings.

if it's true that repr() cannot be changed to return unicode
strings (in which case the conversion will be done on the
way out to the user, by a file object or a user-interface
library which might actually know what encoding to use),
using a lossless encoding is the second best thing.
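
a rough sketch of what "lossless" buys you, assuming python 3 (str is
unicode, bytes stands in for the 8-bit string).  the sample text and
the codecs used here are illustrative assumptions, not necessarily
the ones under discussion in this thread:

```python
# hypothetical repr text containing non-ASCII characters
text = "caf\u00e9 \u2603"

# strict ASCII fails for some strings, so it cannot work "for all strings"
try:
    text.encode("ascii")
    strict_ok = True
except UnicodeEncodeError:
    strict_ok = False

# lossy: characters that do not fit are replaced by "?", and the
# original text cannot be recovered from the result
lossy = text.encode("ascii", "replace")

# lossless: every character survives as an ASCII escape sequence
# and can be decoded back to the original text
lossless = text.encode("unicode_escape")
round_trip = lossless.decode("unicode_escape")
```

the strict conversion raises, the lossy one hides information from
whoever is debugging, and only the lossless one both always succeeds
and keeps everything.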

on the other hand, if we can change repr/str, this is a
non-issue.  maybe someone could tell me exactly what code
we'll break if we make that change?

</F>