[issue38003] Change 2to3 to replace 'basestring' with '(str,bytes)'

Terry J. Reedy report at bugs.python.org
Sat Sep 7 14:28:56 EDT 2019


Terry J. Reedy <tjreedy at udel.edu> added the comment:

Replacing 'basestring' with 'str' is not a bug in the behavioral sense because it is intended and documented.
https://docs.python.org/3/library/2to3.html#2to3fixer-basestring

Why the current behavior is correct: 2to3 converts syntactically valid 2.x code to syntactically valid 3.x code.  It cannot, however, guarantee semantic correctness.  A particular problem is that str is semantically ambiguous in 2.x, as it is used both for (encoded) text and for binary data.  To resolve the ambiguity, 2.6 introduced 'bytes' as a synonym for 'str'.  2to3 assumes that 'bytes' means binary data, including text that will still be encoded in 3.x, while 'str' means text that is encoded bytes in 2.x but *will be unicode* in 3.x.  Hence it changes 'unicode' to the unambiguous 'str', and 'basestring' == Union(unicode, str) to Union(str, str) == 'str'.
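As a rough illustration of that mapping, the sketch below shows how a check written against 'basestring' behaves under 3.x once 2to3 has rewritten it to 'str'.  The function name is hypothetical and only stands in for whatever code performs the check.

```python
# 2.x source:   isinstance(value, basestring)
# 2to3 output:  isinstance(value, str)

def is_text_after_2to3(value):
    """The check as it reads after 2to3 rewrites 'basestring' to 'str'."""
    return isinstance(value, str)

# Text still passes the check in 3.x.
print(is_text_after_2to3("abc"))
# A value that is still bytes in 3.x no longer passes.
print(is_text_after_2to3(b"abc"))
```

In 2.x both values would have passed the original check, since "abc" and b"abc" were the same type there.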

If you fool 2to3 by applying isinstance(value, basestring) to a value that will still be bytes at that point in 3.x, you get a semantic change.  Possible fixes:

1. Since you decode value after the check, do it before the check.

# 2.x source; 'encoding' is whatever encoding the data uses.
if isinstance(value, bytes):
    value = value.decode(encoding)
if not isinstance(value, unicode):
    pass  # some other code

2. Replace 'basestring' with '(unicode, bytes)'

In both cases, the 'unicode' to 'str' replacement should result in correct 3.x code.
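A minimal sketch of what such code might look like under 3.x after conversion: the decode-first pattern from fix 1, plus a tuple check that accepts both text and bytes, as the issue title requests.  The function names, the utf-8 default, and the TypeError standing in for "some other code" are assumptions made for illustration only.

```python
def to_text(value, encoding="utf-8"):
    """Decode bytes first, then test for str (fix 1 after conversion)."""
    if isinstance(value, bytes):
        value = value.decode(encoding)
    if not isinstance(value, str):
        # Placeholder for "some other code" in the original example.
        raise TypeError("expected text or bytes, got %s" % type(value).__name__)
    return value

def is_string_like(value):
    """Tuple check that accepts both text and binary data in 3.x."""
    return isinstance(value, (str, bytes))

print(to_text(b"abc"))         # bytes are decoded before the str check
print(is_string_like(b"abc"))  # bytes pass the tuple check
```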

3. Edit Lib/lib2to3/fixes/fix_basestring.py to replace with '(str, bytes)'.  This should be straightforward, but ask on python-list if you need help.

As for your second example, 2to3 is not meant for 2&3 code using exception tricks and six/future imports.  Turning 2&3 code into idiomatic 3-only code is a separate subject.

Since others have run and will run into the same issues, I intend to post a revised version of the explanation above, with fixes for a revised example, to python-list as "2to3, str, and basestring".  Any further discussion should go there.

----------
resolution: rejected -> not a bug

_______________________________________
Python tracker <report at bugs.python.org>
<https://bugs.python.org/issue38003>
_______________________________________

