Python 2 to 3 conversion - embrace the pain
INADA Naoki
songofacandy at gmail.com
Mon Mar 16 15:41:11 EDT 2015
On Tue, Mar 17, 2015 at 2:47 AM, Terry Reedy <tjreedy at udel.edu> wrote:
> On 3/16/2015 5:13 AM, INADA Naoki wrote:
>
>> Another experience is porting Flask application in my company from
>> Python 2 to Python 3.
>> It has 26k lines of code and 7.6k lines of tests.
>>
>> Since we don't need to support both of PY2 and PY3, we used 2to3.
>> 2to3 changes 740 lines.
>
>
> That is less than 3% of the lines. Were any changes incorrect? How many
> lines *not* flagged by 2to3 needed change?
All changes were OK. Flask (and Werkzeug) handle most of the pain.
An application using Flask already uses unicode almost everywhere on Python 2.
The few changes 2to3 couldn't handle look like this:
- reader = DictReader(open(file_path, 'r'), delimiter='\t')
+ reader = DictReader(open(file_path, 'r', encoding='utf-8'), delimiter='\t')
Since the csv module in Python 2 doesn't support unicode, we had to parse
csv as bytestrings.
And since our server doesn't have a utf-8 locale, we have to specify the
encoding explicitly on PY3.
There were a few (fewer than 10, maybe) easy problems like this.
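As a minimal sketch of the Python 3 side (not our actual code): the helper
name read_rows is hypothetical, and it assumes a tab-delimited file encoded
as UTF-8. The newline='' argument is what the csv documentation recommends
when opening a file for the csv module.

```python
import csv


def read_rows(file_path):
    """Read a tab-delimited UTF-8 file into a list of dicts (hypothetical helper)."""
    # Passing encoding explicitly means the result does not depend on the
    # server's locale; without it, open() uses the locale's preferred encoding.
    with open(file_path, 'r', encoding='utf-8', newline='') as f:
        return list(csv.DictReader(f, delimiter='\t'))
```

On a server with a non-utf-8 locale, dropping the encoding argument would
typically fail or return wrong text as soon as a row contains non-ascii data.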
>
>> I had to replace google-api-client with
>> requests+oauthlib since
>> it had not supported PY3 yet.
>
>
> Other than those needed for this change, which 2to3 could not anticipate or
> handle?
>
>> After that, we encountered few trouble with untested code. But Porting
>> effort is surprisingly small.
>> We're happy now with Python 3. We can write non-ascii string to log
>> without fear of UnicodeError.
>> We can use csv with unicode without hack.
>
>
> People who use ascii only or perhaps one encoding everywhere severely
> underestimate the benefit of unicode strings (and utf-8) everywhere.
I agree. We can easily lose log messages on Python 2, which makes
investigating bugs hard.
>>> import logging
>>> logging.error("%s %s", u'こんにちは', 'こんにちは')
Traceback (most recent call last):
  ...
  File "/usr/local/Cellar/python/2.7.9/Frameworks/Python.framework/Versions/2.7/lib/python2.7/logging/__init__.py", line 335, in getMessage
    msg = msg % self.args
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe3 in position 0: ordinal not in range(128)
Logged from file <stdin>, line 1
And logs containing unicode are hard to read.
>>> logging.error("%s", [u'こんにちは'])
ERROR:root:[u'\u3053\u3093\u306b\u3061\u306f']
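For comparison, here is a rough sketch of the same two calls on Python 3.
The helper log_to_string and the logger name "demo" are made up for
illustration; the helper just captures the formatted record in a string so
the result is easy to inspect.

```python
import io
import logging


def log_to_string(fmt, *args):
    """Log one error record and return the formatted text (hypothetical helper)."""
    stream = io.StringIO()
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter("%(levelname)s:%(name)s:%(message)s"))
    logger = logging.getLogger("demo")
    logger.propagate = False  # keep the record away from the root logger
    logger.addHandler(handler)
    try:
        logger.error(fmt, *args)
    finally:
        logger.removeHandler(handler)
    return stream.getvalue()


# All strings are unicode on Python 3, so no implicit ascii decode happens.
print(log_to_string("%s %s", 'こんにちは', 'こんにちは'), end='')
# The repr of a list keeps printable non-ascii characters readable.
print(log_to_string("%s", ['こんにちは']), end='')
```

Neither call raises, and the list is logged as ['こんにちは'] instead of a
row of \u escapes.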
Python 3 makes our development faster and easier.
Since experienced Python programmers know how to avoid the pitfalls of
Python 2, writing Python 2 is not a pain for them.
But when teaching Python to PHP programmers, teaching tons of pitfalls is a pain.
This is why I think new applications should start with Python 3.
>
>> Porting *modern* *application* code to *PY3 only* is easy, while
>> porting libraries on the edge of
>> bytes/unicode like google-api-client to PY2/3 is not easy.
>>
>> I think application developers should use *only* Python 3 from this year.
>> If we start moving, more library developers will be able to start
>> writing Python 3 only code from next year.
>
>
> --
> Terry Jan Reedy
>
> --
> https://mail.python.org/mailman/listinfo/python-list
--
INADA Naoki <songofacandy at gmail.com>