differences between 22 and python 23

Bengt Richter bokr at oz.net
Thu Dec 4 05:33:33 EST 2003


On 04 Dec 2003 08:28:16 +0100, martin at v.loewis.de (Martin v. Löwis) wrote:

>"Mike C. Fletcher" <mcfletch at rogers.com> writes:
>
>> AFAIK, that's the plan.  IIRC, rationale was that there would be some
>> other type for 8-bit data, while all "normal" strings would become
>> Unicode strings.  
>
>No. <type 'str'> will remain a byte string type for any foreseeable
>future. The only change that is likely to happen is this: To denote
>bytes > 128 in source code, you will need to use escape codes. 
>
Anyone considered extending the hex escape with delimiters to make
long runs more dense? E.g.,

   'ab\x00\x01\x02\x03cd'

being spellable as

   'ab\<00010203>cd'

or

   'ab' x'00010203' 'cd'

or

   x'6162000102036364'
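
For comparison, much the same density can already be had at run time by
going through binascii.unhexlify. A minimal sketch, assuming a current
Python 3 interpreter (where byte strings are spelled b'...'); the helper
name dense() is made up for illustration and is not a proposed builtin:

```python
import binascii


def dense(hex_digits):
    """Hypothetical helper: return the bytes spelled by a run of hex digits."""
    return binascii.unhexlify(hex_digits)


# The long-hand spelling with one \x escape per byte.
escaped = b'ab\x00\x01\x02\x03cd'

# Run-time stand-in for the 'ab' x'00010203' 'cd' spelling.
mixed = b'ab' + dense('00010203') + b'cd'

# Run-time stand-in for the x'6162000102036364' spelling.
whole = dense('6162000102036364')
```

All three names end up bound to the same eight bytes. The difference from
the literal syntax suggested above is that the conversion here happens at
run time rather than when the source is compiled.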

>A change that might happen in the future is this: A string literal
>does not create an instance of <type 'str'>, but an instance of <type
>'unicode'>. However, IMO, this should only happen after a syntax for
>byte string literals has been introduced.
>
Still, the actual characters used in the _source_ representation will have to
be whatever the -*- coding: xxx -*- line says, right? That includes the
characters in the source representation of a string that might wind up as
utf-8 internally. So could you have several modules whose sources are encoded
differently, and have the run time see a single unified internal
representation (utf-8? Or wchar/utf-16le?)?
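
To make the question concrete, here is a small sketch, assuming a current
Python 3 interpreter (where string literals are already unicode). The two
in-memory "modules" and their file names are invented for illustration.
Each carries its own coding line and is stored in a different byte
encoding, yet both bind the same text:

```python
# Two module sources holding the same text, stored in different encodings.
latin_src = (
    "# -*- coding: iso-8859-15 -*-\n"
    "name = 'L\u00f6wis'\n"
).encode("iso-8859-15")

utf8_src = (
    "# -*- coding: utf-8 -*-\n"
    "name = 'L\u00f6wis'\n"
).encode("utf-8")

# compile() accepts bytes and honours the coding declaration in each source.
ns_latin, ns_utf8 = {}, {}
exec(compile(latin_src, "<latin15_module>", "exec"), ns_latin)
exec(compile(utf8_src, "<utf8_module>", "exec"), ns_utf8)

# The bytes on disk differ, but the strings seen at run time do not.
same_text = ns_latin["name"] == ns_utf8["name"]
```

Whatever the interpreter uses internally to store the string is not visible
from this level; the sketch only shows that the source encoding does not
leak into the value.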

>> >I'm very unimpressed with this decision if that's the case.
>> >
>> Doesn't make me ecstatic, either, as I like the simple 8-bit-clean
>> string type.  But maybe we'll luck out and it will turn out that I'm
>> all wet on this one :) .
>
>The byte string type is not going away. It is a useful type, e.g. when
>reading or writing to or from a byte stream.
>
Is this moving towards a single 8-bit str base type with various
encoding-specifying subtypes?
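
Something along these lines is what I have in mind. This is only a rough
sketch of the idea, written against a current Python 3 interpreter with
bytes standing in for the 8-bit str type; the class names and the text()
method are hypothetical and not part of any actual plan:

```python
class EncodedBytes(bytes):
    """Hypothetical 8-bit base type; subtypes say how the bytes are encoded."""

    encoding = None  # plain byte data: no encoding claimed

    def text(self):
        """Decode using the encoding the subtype declares."""
        if self.encoding is None:
            raise ValueError("no encoding specified for this byte string")
        return self.decode(self.encoding)


class Latin15Bytes(EncodedBytes):
    encoding = "iso-8859-15"


class Utf8Bytes(EncodedBytes):
    encoding = "utf-8"


a = Latin15Bytes(b"L\xf6wis")
b = Utf8Bytes("L\u00f6wis".encode("utf-8"))
```

Here a and b hold different bytes, both are still ordinary byte strings
for I/O purposes, and a.text() == b.text().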

Regards,
Bengt Richter



