Totally confused by the str/bytes/unicode differences introduced in Python 3.x

Terry Reedy tjreedy at udel.edu
Sat Jan 17 22:00:38 EST 2009


John Machin wrote:
> On Jan 18, 9:10 am, Terry Reedy <tjre... at udel.edu> wrote:
>> Martin v. Löwis wrote:
>>>>> Does he intend to maintain two separate codebases, one 2.x and the
>>>>> other 3.x?
>>>> I think I have no other choice.
>>>> Why? Is it theoretically possible to maintain a unique code base for
>>>> both 2.x and 3.x?
>>> That is certainly possible! One might have to make tradeoffs wrt.
>>> readability sometimes, but I found that this approach works quite
>>> well for Django. I think Mark Hammond is also working on maintaining
>>> a single code base for both 2.x and 3.x, for PythonWin.
>> Where 'single codebase' means that the code runs as is in 2.x and as
>> autoconverted by 2to3 (or possibly a custom converter) in 3.x.
>>
>> One barrier to doing this is when the 2.x code has a mix of string
>> literals with some being character strings that should not have 'b'
>> prepended and some being true byte strings that should have 'b'
>> prepended.  (Many programs do not have such a mix.)
>>
>> One approach to dealing with string constants I have not yet seen
>> discussed here is to put them all in separate file(s) to be imported.
>> Group the text and bytes separately.  Then marking the bytes with a 'b',
>> either by hand or by program, would be easy.
> 
> (1) How would this work for somebody who wanted/needed to support 2.5
> and earlier?
> 
> (2) Assuming supporting only 2.6 and 3.x:
> 
> Suppose you have this line:
> if binary_data[:4] == "PK\x03\x04": # signature of ZIP file
> 
> Plan A:
> Change original to:
> if binary_data[:4] == ZIPFILE_SIG: # "PK\x03\x04"
> Add this to the bytes section of the separate file:
> ZIPFILE_SIG = "PK\x03\x04"
> [somewhat later]
> Change the above to:
> ZIPFILE_SIG = b"PK\x03\x04"
> [once per original file]
> Add near the top:
> from separatefile import *
> 
> Plan B:
> Change original to:
> if binary_data[:4] == ZIPFILE_SIG: # "PK\x03\x04"
> Add this to the separate file:
> ZIPFILE_SIG = b"PK\x03\x04"
> [once per original file]
> Add near the top:
> from separatefile import *
> 
> Plan C:
> Change original to:
> if binary_data[:4] == b"PK\3\4": # signature of ZIP file
> 
> Unless I'm gravely mistaken, you seem to be suggesting Plan A or some
> variety thereof -- what advantages do you see in this over Plan C?
> 
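For reference, here is a minimal sketch of the comparison behind Plans B and C, assuming 2.6+ or 3.x as in point (2). ZIPFILE_SIG comes from the quoted plans; the two function names are hypothetical, and the constant is defined inline here rather than in the 'separatefile' module so that the snippet stands alone. It shows that the named constant and the inline octal-escape literal test for the same four bytes.

```python
import io
import zipfile

# Plans A/B: a named bytes constant (inline here; the plans would put it
# in a separate module and import it).
ZIPFILE_SIG = b"PK\x03\x04"


def looks_like_zip_named(binary_data):
    """Plans A/B: compare the first four bytes against the shared constant."""
    return binary_data[:4] == ZIPFILE_SIG


def looks_like_zip_inline(binary_data):
    """Plan C: compare against an inline bytes literal with octal escapes."""
    return binary_data[:4] == b"PK\3\4"  # signature of ZIP file


# Build a small in-memory archive so there is real data to test against.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("hello.txt", "hi")
data = buf.getvalue()

print(looks_like_zip_named(data), looks_like_zip_inline(data))
```

In 2.6 the b prefix is accepted and the literal is an ordinary str, so the same line works on both sides. In 3.x an unprefixed "PK\x03\x04" is a str and never compares equal to bytes, which is why the literal has to be marked one way or another.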



