[Python-Dev] Re: Python 1.6a2 Unicode bug (was Re: comparing strings and ints)

Mark Hammond mhammond@skippinet.com.au
Thu, 27 Apr 2000 10:08:23 +1000


Is it necessary for us to also have this scrag-fight in public?
Most of the thread on c.l.py is filled in by people who are also
py-dev members!

[MAL writes]

> Please note that the support for mixing strings and Unicode
> objects is really only there to aid porting applications
> to Unicode.
>
> New code should use Unicode directly and apply all needed
> conversions explicitly using one of the many ways to
> encode or decode Unicode data.

This will _never_ happen.  The Python programmer should never need
to be aware they have a Unicode string versus a standard string -
just a "string"!  The fact there are 2 string types should be
considered an implementation detail, and not a conceptual model for
people to work within.

I think we will be mixing Unicode and strings forever!  The only
way to avoid it would be a unified type - possibly Py3k.  Until
then, people will still generally use strings as literals in their
code, and should not even be aware they are mixing.  I'm never going
to prefix my ASCII-only strings with u"" just to avoid the
possibility of mixing!

Listening to the arguments, I've got to say I'm coming down squarely
on the side of Fredrik and Just.  Strings must be sequences of
characters, whose length is the number of characters.  A string
holding an encoding should be considered logically a byte array, and
conversions should be explicit.
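
To make that distinction concrete, here is a minimal sketch.  It
assumes the str/bytes split of modern Python 3, which postdates this
thread and is not the 1.6a2 API under discussion; the variable and
function names are purely illustrative.

```python
# Sketch only: uses Python 3's str/bytes types as a stand-in for the
# "string of characters" vs. "byte array" model argued for above.
text = "caf\u00e9"                  # a string: a sequence of 4 characters
data = text.encode("utf-8")         # explicit conversion to a byte array

char_count = len(text)              # counts characters
byte_count = len(data)              # counts bytes; e-acute takes 2 in UTF-8

round_trip = data.decode("utf-8")   # explicit conversion back to a string


def mixing_fails():
    """Return True if concatenating a string and a byte array is rejected."""
    try:
        text + data
    except TypeError:
        return True
    return False
```

In this model the two lengths differ (4 characters, 5 bytes), and any
crossing between the two types has to be spelled out with encode() or
decode().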

> The auto-conversions are only there to help out and provide some
> convenience.

Doesn't sound like it is working :-(

Mark.