[Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?]

Michael Foord fuzzyman at voidspace.org.uk
Tue Feb 14 00:53:16 CET 2006


Guido van Rossum wrote:
> On 2/13/06, Michael Foord <fuzzyman at voidspace.org.uk> wrote:
>   
>> Phillip J. Eby wrote:
>> [snip..]
>>     
>>> In fact, the 'encoding' argument seems useless in the case of str objects,
>>> and it seems it should default to latin-1 for unicode objects.  The only
>>>
>>>       
>> -1 for having an implicit encode that behaves differently to other
>> implicit encodes/decodes that happen in Python. Life is confusing enough
>> already.
>>     
>
> But adding an encoding doesn't help. The str.encode() method always
> assumes that the string itself is ASCII-encoded, and that's not good
> enough:
>
>   
Sorry - I meant for the unicode to bytes case. A default encoding that 
behaves differently to the current implicit encodes/decodes would be 
confusing IMHO.

I agree that string to bytes shouldn't change the value of the bytes. 
The least confusing description of a non-unicode string is 'byte-string'.

Michael Foord
> >>> "abc".encode("latin-1")
> 'abc'
> >>> "abc".decode("latin-1")
> u'abc'
> >>> "abc\xf0".decode("latin-1")
> u'abc\xf0'
> >>> "abc\xf0".encode("latin-1")
> Traceback (most recent call last):
>   File "<stdin>", line 1, in ?
> UnicodeDecodeError: 'ascii' codec can't decode byte 0xf0 in position 3: ordinal not in range(128)
>
> The right way to look at this is, as Phillip says, to consider
> conversion between str and bytes as not an encoding but a data type
> change *only*.
>
> --
> --Guido van Rossum (home page: http://www.python.org/~guido/)
>
>   
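To make the failure in the quoted session concrete, here is a rough sketch of the two-step behaviour described above: str.encode() on a byte string first does an implicit ASCII decode, and only then the encode that was asked for. It is written against Python 3's separate bytes type purely so that it runs on a current interpreter; that choice, and the helper names py2_style_str_encode and str_to_bytes, are my own assumptions for illustration and not part of any proposal in this thread.

```python
def py2_style_str_encode(raw: bytes, encoding: str) -> bytes:
    """Emulate the two-step str.encode() of a Python 2 byte string (hypothetical helper)."""
    # Step 1: the implicit decode with the default codec (ASCII).
    # This is the step that fails for any byte >= 0x80.
    text = raw.decode("ascii")
    # Step 2: the encode the caller actually asked for.
    return text.encode(encoding)


def str_to_bytes(raw: bytes) -> bytes:
    """A data type change only: no codec involved, every byte value kept (hypothetical helper)."""
    return bytes(raw)


# Pure ASCII data survives the implicit decode.
print(py2_style_str_encode(b"abc", "latin-1"))

# A byte outside ASCII fails before the latin-1 encode is ever reached.
try:
    py2_style_str_encode(b"abc\xf0", "latin-1")
except UnicodeDecodeError as exc:
    print("implicit ASCII decode failed:", exc.reason)

# The type-change view leaves the same data untouched.
print(str_to_bytes(b"abc\xf0") == b"abc\xf0")
```

The point of the sketch is only that passing a different 'encoding' argument cannot help in the str case, because the error comes from the implicit ASCII step rather than from the codec named in the call.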


