[Web-SIG] Unicode in Python 3

René Dudfield renesd at gmail.com
Sat Sep 19 16:01:38 CEST 2009


On Sat, Sep 19, 2009 at 2:26 PM, Georg Brandl <g.brandl at gmx.net> wrote:
> René Dudfield schrieb:
>
>>>> Here is a snippet from the compat.py we used to port pygame to support
>>>> python2.3 through 3.1
>>> How is that related?
>>>
>>
>> Rather than using a 2to3 tool - which then makes you have two versions
>> of your code, making the code work in python 2.x and 3.x.  2to3
>> outputs python2.x incompatible code - when it doesn't have to.
>
> Sorry, but I think you do not express the intent of 2to3 correctly here.
> It is not meant to provide a one-time conversion, so that you then
> have to maintain two codebases, it is meant to be run over your 2.x code
> every time you want to distribute a version for Python 3, or even
> transparently in the distutils build process.  This of course means that
> the 2.x code needs to be written with 3.x and the conversion in mind.
>

My point is: using b'' rules out a single code base for those who
choose to keep one.  Not everyone can use 2to3, but for those who can: great!

There is no 2to3 for extension modules.  There is no 2to3 distutils
mod to run 2to3 automatically at this time (correct me if I'm wrong).
People are creating separate branches for py3k... and the projects
that do that seem to let the py3k version rot.  You still need to
debug and support multiple versions of the code... since 2to3 generates
a second version.  If someone sends you a patch for the 3.0 version,
you need to either port it back to the 2.x source yourself or find
someone to do it for you... same thing with bug reports and tracebacks.

Those are some of the reasons why 2to3 is not OK for every project.

> Writing code that runs unchanged on 2.x (where x < 6) and 3.x may seem
> nice, but forces you to do unnecessary workarounds, e.g. in exception
> handlers.

Well, I'm sure there are cases where it would cause unnecessary
workarounds... however, with the right compat.py and compat.h setup,
porting this way hasn't been too hard in my experience.

There is an easy workaround for the exception syntax change...
# define geterror in your compatibility module.
import sys

def geterror():
    # return the exception instance that is currently being handled.
    return sys.exc_info()[1]

Now you can write:
except ImportError:
    e = geterror()

Instead of these:
#py2
except ImportError, e:
    pass
#py3k
except ImportError as e:
    pass
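
To make that concrete, here is a minimal self-contained sketch of the
same idea.  The try_import function and the module name passed to it
are made up for illustration; only geterror comes from the snippet
above.  The file parses unchanged on 2.x and 3.x because it never
uses either "except ..., e" or "except ... as e".

```python
import sys


def geterror():
    # Return the exception instance currently being handled.
    # sys.exc_info() is available in both Python 2 and Python 3.
    return sys.exc_info()[1]


def try_import(name):
    # Hypothetical helper, only here to show geterror() in use.
    try:
        __import__(name)
    except ImportError:
        e = geterror()
        return "import failed: %s" % (e,)
    return "ok"


print(try_import("sys"))
print(try_import("no_such_module_for_this_example"))
```

On 2.6 and later "except ImportError as e" works too, so the helper
only matters while 2.5 or older has to stay supported.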



>
>>>> Arguments against using bytes (and using unicode instead).
>>>>
>>>> So I'm -1 on using b'' all over the place since it's not in both
>>>> versions of python, and makes it impossible for code bases to share
>>>> the same code for multiple versions of python.
>>> That would not matter much because the high-level applications never see
>>> what's under the hood.  Besides web2py all frameworks and libraries I
>>> know about are using unicode internally anyways.
>>>
>>
>> It would mean code bases need to support b'' - which is not compatible
>> with python2.
>
> b'' is supported as of Python 2.6.
>
> Georg

Ah yes.  I guess I meant the python2 series as a whole.  Python 2.5 is
still the most popular python... with 2.6 catching up (or passing it)
in popularity.  So that should be changed to:

"""It would mean code bases need to support b'' - which is not
compatible with python <= 2.5.4"""

... anyway, just something to consider.
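
For code that also has to run on 2.5 and earlier, one way around the
missing b'' literal is a small helper in the compatibility module.
This is only a sketch: the name as_bytes and the choice of latin-1
are assumptions made for this example, not something any particular
project is required to use.

```python
import sys

if sys.version_info[0] >= 3:
    def as_bytes(text):
        # Python 3: '' literals are unicode, so encode them.
        # latin-1 maps code points 0-255 to the same byte values,
        # which keeps escapes like "\xff" intact.
        return text.encode("latin-1")
else:
    def as_bytes(text):
        # Python 2: a plain '' literal is already a byte string.
        return text


# Written once, without b'', and usable on both series.
header = as_bytes("GIF89a")
print(repr(header))
```

The cost is a function call wherever a bytes literal is needed, which
is the kind of workaround Georg is talking about, but it keeps a
single source file importable on every version.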


ps. my facebook account on this email address was just banned.  I
swear I didn't rant about how tornado sucks!

