PEP 263 comments

Huaiyu Zhu huaiyu at gauss.almadan.ibm.com
Fri Mar 1 14:41:54 EST 2002


I've been following this discussion with quite some interest, but I do not
have the background to delimit the scope of the various concepts.  Is there a
gentle introduction for a Unicode newbie?

On 01 Mar 2002 15:39:42 +0900, Stephen J. Turnbull <stephen at xemacs.org> wrote:
>
>IMO, the Python source code parser should never see any text data[1]
>that is not UTF-8 encoded.  

Presumably this discussion only concerns Unicode strings - I don't think we
want to lose the ability to read in arbitrary binary data as a raw string.
But then you mention

>[1]  Ie, Python language or character text.  It might be convenient to
>have an octet-string primitive data type, in which you could put
>EUC-encoded Japanese or Java byte codes.  

What's the difference between this and a raw string (a byte sequence) that
you can translate into any other encoding?
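
To make the question concrete, here is a minimal sketch.  It assumes
present-day Python 3, where bytes and str are separate types, rather than
the Python of this thread, and the variable names are purely illustrative.
The same four octets are just numbers until an encoding is chosen for them:

```python
# Illustrative only: Python 3 semantics, not the 2002 interpreter.
# Four octets that happen to be the EUC-JP encoding of two kanji.
raw = b"\xc6\xfc\xcb\xdc"

# Interpreting the octets as EUC-JP yields two characters of text.
text = raw.decode("euc_jp")

# The same text can be re-encoded into any other encoding, e.g. UTF-8.
utf8 = text.encode("utf-8")

# Interpreting the very same octets as Latin-1 also "works", but
# yields four unrelated characters: the bytes carry no encoding label.
misread = raw.decode("latin-1")

print(len(raw), len(text), len(utf8), len(misread))
```

Whether an "octet-string" type would differ from this kind of byte sequence
is exactly what I am asking.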

Huaiyu


