[Python-Dev] More Unicode support
M.-A. Lemburg
mal@lemburg.com
Mon, 06 Nov 2000 10:14:12 +0100
Guido van Rossum wrote:
>
> [me]
> > > - Internationalization. Barry knows what he wants here; I bet Martin
> > > von Loewis and Marc-Andre Lemburg have ideas too.
>
> [MAL]
> > We'd need a few more codecs, support for the Unicode compression,
> > normalization and collation algorithms.
>
> Hm... There's also the problem that there's no easy way to do Unicode
> I/O. I'd like to have a way to turn a particular file into a Unicode
> output device (where the actual encoding might be UTF-8 or UTF-16 or a
> local encoding), which should mean that writing Unicode objects to the
> file should "do the right thing" (in particular should not try to
> coerce it to an 8-bit string using the default encoding first, like
> print and str() currently do) and that writing 8-bit string objects to
> it should first convert them to Unicode using the default encoding
> (meaning that at least ASCII strings can be written to a Unicode file
> without having to specify a conversion). I suppose that reading from
> a "Unicode file" should always return a Unicode string object (even if
> the actual characters read all happen to fall in the ASCII range).
>
> This requires some serious changes to the current I/O mechanisms; in
> particular str() needs to be fixed, or perhaps a ustr() needs to be
> added that is used in certain cases. Tricky, tricky!
It's not all that tricky since you can write a StreamRecoder
subclass which implements this. AFAIR, I posted such an implementation
on i18n-sig.
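
To make the idea concrete, here is a minimal sketch of such a "Unicode file" wrapper. It is not the implementation posted on i18n-sig: it is written for present-day Python (where `str` is the Unicode type and `bytes` is the 8-bit string type), it wraps `codecs.StreamReaderWriter` rather than subclassing `StreamRecoder`, and the class name `UnicodeFile` and the `default_encoding` parameter are hypothetical names chosen for illustration.

```python
import codecs
import io


class UnicodeFile:
    """Hypothetical wrapper that turns a byte stream into a Unicode device."""

    def __init__(self, stream, encoding="utf-8", default_encoding="ascii"):
        # StreamReaderWriter encodes on write and decodes on read.
        self._rw = codecs.StreamReaderWriter(
            stream,
            codecs.getreader(encoding),
            codecs.getwriter(encoding),
        )
        # Assumption: 8-bit strings are interpreted with this encoding.
        self._default_encoding = default_encoding

    def write(self, data):
        # 8-bit strings are first converted to Unicode using the default
        # encoding, so plain ASCII data needs no explicit conversion.
        if isinstance(data, bytes):
            data = data.decode(self._default_encoding)
        # Unicode text goes straight to the file's own encoding.
        self._rw.write(data)

    def read(self, size=-1):
        # Always returns a Unicode string, even for pure-ASCII content.
        return self._rw.read(size)


raw = io.BytesIO()
out = UnicodeFile(raw, encoding="utf-8")
out.write("L\u00f6wis ")   # Unicode object
out.write(b"and Lemburg")  # 8-bit string, decoded as ASCII first
```

Writing non-ASCII 8-bit data raises `UnicodeDecodeError` in this sketch, which matches the "at least ASCII strings" behaviour described in the quoted text.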
BTW, one of my patches on SF adds unistr(). Could be that it's
time to apply it :-)
--
Marc-Andre Lemburg
______________________________________________________________________
Business: http://www.lemburg.com/
Python Pages: http://www.lemburg.com/python/