"python -3" not working as expected

John Machin sjmachin at lexicon.net
Fri Jan 9 00:35:19 EST 2009


On Jan 9, 1:56 pm, Benjamin <musiccomposit... at gmail.com> wrote:
> On Jan 8, 4:21 pm, Thorsten Kampe <thors... at thorstenkampe.de> wrote:
>
> > * Terry Reedy (Thu, 08 Jan 2009 17:04:04 -0500)
> > > Since you are, I believe, at least the second person to report being bit
> > > by this confusion, please open an issue at bugs.python.org and suggest a
> > > couple of revised sentences that you think are more informative.
>
> > Will do tomorrow. The revised sentence could be in the line of "warn
> > about Python 3.x incompatibilities that cannot trivially be fixed by
> > 2to3.py".
>
> Actually, don't bother now; I've fixed it up in the trunk.

Would you mind giving a pointer to where or what your fix is? The
reason for asking is that Thorsten's suggestion is ambiguous: warn
about some? all? 3.x problems that can't be trivially fixed by 2to3?
Can't be "all"; there are in fact a number of problems that can't be
trivially fixed by 2to3 and can't be detected by running 2.6 with the
-3 option.

These include
(a) problems that cause a reasonably informative exception in 3.x
right at the point where the problem exists
(b) problems where the behaviour has changed but no exception is
raised, and your code lurches off down the wrong path, and you need to
work backwards from subsequent exception(s) and/or failing test
case(s) to pin-point the problem.

I'll use string constants to provide an example of each type. When
faced with "abcd", 2to3 has no way of telling whether that should be
str ("abcd") or bytes (b"abcd"). In the vast majority of cases, to
guess it should be str is correct, so there is no change to the source
file, and a warning would almost always be noise.

Example of problem (a): chunks is a list of slices of bytes read from
a binary file.
In 2.x you write
glued = ''.join(chunks)
In 3.0 you get this:
>>> chunks = [b'x', b'y']
>>> ''.join(chunks)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: sequence item 0: expected str instance, bytes found
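
For what it's worth, the usual repair for case (a) is to make the separator a bytes object too. A minimal sketch, assuming chunks really does hold bytes as in the example above (the b'' literal is also accepted by 2.6, where it is just a str):

```python
# Assumption: chunks holds bytes objects, as in the example above.
chunks = [b'x', b'y']

# Use a bytes separator so that the join works on 3.x and the
# result stays bytes.
glued = b''.join(chunks)
print(glued)  # b'xy' on 3.x
```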

Example of problem (b): some_bytes has been read from a file that was
opened in 'rb' mode and contains the 4 ASCII bytes 'abcd'
# 2.x simulation
>>> some_bytes == "abcd"
True
# 3.0 simulation
>>> type(some_bytes)
<class 'bytes'>
>>> type("abcd")
<class 'str'>
>>> some_bytes == "abcd"
False # because bytes and str never compare equal in 3.x
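
A rough sketch of two ways to repair case (b) once it has been tracked down. Here some_bytes is a stand-in for the data read from the file, and the decode step assumes the data is known to be ASCII:

```python
# Stand-in for data read from a file opened in 'rb' mode.
some_bytes = b"abcd"

# The silent failure: bytes vs. str is False on 3.x, no exception raised.
silent_result = (some_bytes == "abcd")

# Option 1: compare against a bytes literal.
matches_bytes = (some_bytes == b"abcd")

# Option 2: decode first (assumes the data is known to be ASCII).
matches_text = (some_bytes.decode("ascii") == "abcd")

print(silent_result, matches_bytes, matches_text)
```

Running the 3.x interpreter with the -b option should also help here: it issues a BytesWarning when bytes are compared with str (and -bb turns that into an error).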

Another type (b) example is the (majority-guessed) 2to3 change from
[c]StringIO.StringIO to io.StringIO ... if you really should feed some
library an io.BytesIO instance instead, it can travel quite a distance
before blowing up.
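
To illustrate where such a buffer finally falls over, here is a small sketch for 3.x. The payload is made up; the point is only that io.StringIO accepts str and nothing else, while io.BytesIO takes the binary data happily:

```python
import io

payload = b"abcd"  # hypothetical binary data

# The buffer 2to3 would have guessed: fine until bytes are written to it.
text_buf = io.StringIO()
try:
    text_buf.write(payload)  # io.StringIO accepts only str
    error = None
except TypeError as exc:
    error = exc

# The buffer that was actually needed.
byte_buf = io.BytesIO()
byte_buf.write(payload)
result = byte_buf.getvalue()

print(type(error).__name__, result)
```

In real code the StringIO instance may be created a long way from the first write of bytes, which is what makes this one tedious to track down.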

Perhaps some of this info could be put into
http://docs.python.org/dev/py3k/whatsnew/3.0.html#porting-to-python-3-0
... or maybe a separate HOWTO or wiki chapter could be set up for
porting to 3.x, including topics like:
(1) maintaining one set of source files (when you are maintaining a
package that must run on e.g. 2.1 through 3.x)
(2) the possibility of a 3to2 kit for those who have the 2.1 to 3.x
support-range issue but would like to have the one set of source
looking like 3.x code instead of the ugliness of version-conditional
stuff like
BYTES_NULL = bytes(0) # 3.x
or
BYTES_NULL = '' # 2.x
and (3) [getting in early] what about a %to{} kit (or a {}to% kit!)?
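
For the record, the version-conditional ugliness mentioned in (2) tends to look something like the sketch below. The helper name glue is made up for illustration, and I have not checked this against anything as old as 2.1:

```python
import sys

# Pick the "empty bytes" constant once, at import time.
if sys.version_info[0] >= 3:
    BYTES_NULL = bytes(0)  # b'' on 3.x
else:
    BYTES_NULL = ''        # a plain str doubles as bytes on 2.x

def glue(chunks):
    """Join chunks of binary data on either major version."""
    return BYTES_NULL.join(chunks)

print(len(glue([BYTES_NULL, BYTES_NULL])))  # 0
```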

Cheers,
John


