Why Python3

Terry Reedy tjreedy at udel.edu
Sun Jun 27 20:12:10 EDT 2010


Some people appear not to understand the purpose of Python3 or, more
specifically, of the changes that break Python2 code. I attempt here to
give a relatively full explanation.

SUMMARY: Python3 completes (or makes progress in) several transitions 
begun in Python2.

In particular, Python3 bunches together several feature removals (which 
always break someone's code) and a few feature changes (which also break 
code). The alternative would have been to make the same changes, a few 
at a time, over several releases, starting with about 2.5.

Another alternative would have been to declare 2.0 or 2.1 complete as
far as it went and forbid adding new features that duplicate and
supersede existing features.

Another would have been to add but never remove anything, with the
consequence that Python would become increasingly difficult to learn and
the interpreter increasingly difficult to maintain with volunteers. I
think 2.7 is far enough in that direction.

SPECIFIC REPLACEMENTS:

1. Integer division

In Python1, arithmetic expressions are mostly generic (polymorphic) in
that they have the same meaning for all builtin number types. The
glaring exception is division, which has a special meaning for pairs of
integers. Newbies were constantly surprised that 1/2 equals 0.

Guido proposed to fix the situation with a second division operator 
'//', with a standard warning/deprecation/removal process for the old 
behavior over three releases, perhaps ending with 2.5. In response to 
complaints about having to find and examine every use of '/', Guido 
promised that there would be a helper program. I proposed that the 
version that had only the new behavior, without 'from __future__ import 
division', be named 3.0 to signal the change.

Guido obviously decided to make a lot more changes in a 3.0 release, and
the helper program was expanded into the current 2to3. In any case,
Python3 finished the integer division transition.
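
A minimal sketch of the end state, assuming a Python 3 interpreter (the
variable names are only illustrative):

```python
# Python 3: '/' is true division for every built-in number type,
# while '//' is floor division.
true_quotient = 1 / 2      # 0.5, no longer 0
floor_quotient = 1 // 2    # 0, the old integer behavior, now spelled '//'
float_floor = 7.0 // 2     # 3.0, '//' also works on floats
negative_floor = -7 // 2   # -4, floor division rounds toward minus infinity

print(true_quotient, floor_quotient, float_floor, negative_floor)
```

Code that relied on the old meaning of '/' for integers needs '//' in
Python 3, which is the kind of change the helper program was meant to
locate.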

2. User classes

In Python1, user classes, defined with a class statement, are instances 
of a class type and are otherwise separate from and cannot inherit from 
builtin types. Instances of user classes are actually instances of an 
instance type, not of the user class.

Python2.2 introduced a unified class-type system. Instead of 'from 
__future__ import newclass', the new system was invoked by defining 
__metaclass__ or inheriting from a future class. Python3 finished this 
transition by dropping ClassType and InstanceType and making the 
built-in class 'object' the default baseclass for all user classes.
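
A small sketch of what the unified system looks like in Python 3. The
class names Plain and Stack are hypothetical, chosen only for the
example:

```python
class Plain:            # no explicit base class given
    pass

instance = Plain()

# The class is an instance of 'type', and 'object' is its implicit base.
print(type(Plain))                 # <class 'type'>
print(Plain.__bases__)             # (<class 'object'>,)

# An instance is an instance of its own class, not of a generic
# instance type.
print(type(instance) is Plain)     # True


class Stack(list):      # user classes can inherit from built-in types
    def push(self, item):
        self.append(item)

stack = Stack()
stack.push(1)
stack.push(2)
print(stack)                       # [1, 2]
```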

3. User-defined exceptions

In Python1, users usually defined exceptions by using strings as
exceptions, with the caveat that it was the identity and not the value
of the string that defined the exception. The ability to subclass
built-in exceptions made this obsolete. String exceptions were
discouraged in 2.5 and removed in 3.0, completing another transition.
Now all exceptions are instances of subclasses of BaseException.
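
As a hedged illustration under Python 3 (ConfigError and load are
made-up names, not part of any library):

```python
class ConfigError(ValueError):
    """Hypothetical application-specific exception."""


def load(value):
    # Illustrative helper: reject a missing value with our own exception.
    if value is None:
        raise ConfigError("no value given")
    return value


# Because ConfigError subclasses ValueError, a handler for the built-in
# exception also catches the user-defined one.
try:
    load(None)
except ValueError as exc:
    caught = exc

# Raising a plain string is now an error in itself.
try:
    raise "old-style string exception"
except TypeError as exc:
    string_raise_error = exc

print(type(caught).__name__)        # ConfigError
print(type(string_raise_error))     # <class 'TypeError'>
```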

4. Function application

Suppose one has a function f, object sequence args, and name-object
mapping kwds, and one wants to call f with the *contents* of args and
kwds as arguments. Calling f(args,kwds) does not do that, as it would
pass the collections themselves as a pair of arguments. In Python1, one
called apply(f, args, kwds). Python2 introduced the synonymous syntax
f(*args,**kwds) and deprecated 'apply' by 2.5. Python3 removed apply,
completing the transition.
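
A short sketch of the surviving spelling. The function body and the
argument values here are invented for the example:

```python
def f(a, b, sep="-", end="!"):
    # Illustrative function: join two values with a separator.
    return f"{a}{sep}{b}{end}"

args = (1, 2)
kwds = {"sep": "+", "end": "?"}

# Unpack the contents of args and kwds into the call;
# equivalent to f(1, 2, sep="+", end="?").
result = f(*args, **kwds)
print(result)   # 1+2?
```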

5. Data interchange

In Python1, lists are the primary data interchange type used by built-in
functions such as filter, map, and range. Python2.2 introduced a new
iterator/iterable protocol, the iterator class 'generator', and
generator functions. One advantage is that iterables and iterators can
compactly represent virtual sequences of indefinite length. Another is
that non-sequence collections, like sets and dicts, could be made
iterable. Python2.3 introduced iterator versions of filter, map, and zip
in the itertools module. It also defined the new built-in function
enumerate as returning an iterator rather than a list, as would have
been the case if introduced much earlier. 2.4 introduced 'reversed' as
returning an iterator. (The new in 2.4 'sorted' returns a list because
it has to construct one anyway.)

Python3 moved much closer to replacing lists with iterators as the
primary data interchange format. The itertools functions ifilter, imap,
and izip replaced filter, map, and zip; xrange, which preceded 2.2,
replaced range. Getting lists, as previously, is easy with 'list()'.
The transition is not complete (and may never be), as slicing still
returns a subsequence rather than an iterator, so itertools still has
islice.
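
A brief sketch of this behavior under Python 3 (variable names are
illustrative only):

```python
from itertools import count, islice

# map returns a lazy iterator rather than a list.
squares = map(lambda n: n * n, range(5))
is_lazy = iter(squares) is squares       # an iterator is its own iterator
square_list = list(squares)              # ask for a list explicitly

# range is a compact virtual sequence, so a huge one costs almost nothing.
big = range(10**12)
big_len = len(big)

# A generator can represent a sequence of indefinite length;
# islice takes a finite piece of it without building a list first.
evens = (n for n in count() if n % 2 == 0)
first_evens = list(islice(evens, 3))

# zip is lazy too.
pairs = list(zip("ab", [1, 2]))

print(square_list, big_len, first_evens, pairs)
```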

6. Integers

Python1 had two separate integer types, int and long. I believe that 
integer operations overflowed if the answer was too large. At some 
point, the two types were partially unified in that ints were converted 
to long when necessary to avoid overflow. At this point, having two 
types, with 'L' appended or not on output, became something of a 
nuisance. Python3 finishes int-long unification.
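
For illustration, a small Python 3 sketch (the names are mine):

```python
import sys

# One past the largest native machine-word value is still a plain int.
big = sys.maxsize + 1
print(type(big) is int)     # True

# Arbitrarily large results neither overflow nor change type,
# and no 'L' suffix appears in the output.
huge = 2 ** 100
print(repr(huge))           # 1267650600228229401496703205376
```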

7. Order comparisons

In early Python1, I believe all objects could be (arbitrarily) compared 
and sorted. When Guido added the complex type, he decided not to add an 
arbitrary order, as he thought that could mask bugs. I believe other 
classes were added later that only allowed comparisons between their own 
instances. Python3 completed the transition from comparable by default 
to incomparable by default.
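
A sketch of what incomparable by default means in practice, assuming
Python 3. Token is a hypothetical class with no comparison methods:

```python
# Mixed built-in types no longer have an arbitrary order.
try:
    sorted([3, "two", 1])
except TypeError as exc:
    mixed_sort_error = exc


class Token:
    pass

# Instances of a user class that defines no ordering cannot be ordered.
try:
    Token() < Token()
except TypeError as exc:
    default_order_error = exc

# Equality still works by default, falling back to identity.
same = Token() == Token()

print(type(mixed_sort_error).__name__, type(default_order_error).__name__, same)
```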

More controversially, it also completed a transition from __cmp__ to the 
full comparison set. Similarly, list.sort dropped the 'cmp' parameter 
after gaining the 'key' parameter.
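
A hedged sketch of the replacements, using functools from the standard
library. The Version class and the by_length function are invented for
the example:

```python
from functools import cmp_to_key, total_ordering


@total_ordering                 # fills in <=, >, >= from __eq__ and __lt__
class Version:
    def __init__(self, major, minor):
        self.key = (major, minor)

    def __eq__(self, other):
        return self.key == other.key

    def __lt__(self, other):
        return self.key < other.key


older_or_same = Version(1, 2) <= Version(1, 10)

words = ["pear", "Apple", "fig"]

# The 'key' parameter computes one sort key per item.
by_key = sorted(words, key=str.lower)

# An old-style cmp function can still be used by wrapping it.
def by_length(a, b):
    return (len(a) > len(b)) - (len(a) < len(b))

by_cmp = sorted(words, key=cmp_to_key(by_length))

print(older_or_same, by_key, by_cmp)
```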

8. Text

Python1 had a byte string type that doubled as a text string type. (Some
would put this the other way.) Python2 introduced a second text type,
unicode, but kept using bytes as the default, as in its namespace dicts.
Python3 replaced bytes with unicode as the default text type, including
in its namespaces, thereby making Python3 a more universally useful
language. It kept bytes both for generic binary data use and for
specialized encoded text use. This part of the transition is not
complete, especially for some of the internet interfacing libraries.
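
A minimal sketch of the text/bytes split as it looks in Python 3. The
sample string and variable names are only for illustration:

```python
# Text is unicode by default; escapes are used here so the source
# file itself stays plain ASCII.
text = "na\u00efve caf\u00e9"
data = text.encode("utf-8")       # explicit step from text to bytes

print(type(text).__name__, len(text))    # str 10 (characters)
print(type(data).__name__, len(data))    # bytes 12 (encoded bytes)

round_trip = data.decode("utf-8")        # explicit step back to text

# The two types no longer mix silently.
try:
    text + data
except TypeError as exc:
    mix_error = exc

# Namespace keys are text strings as well.
namespace_keys_are_str = all(isinstance(name, str) for name in list(globals()))
```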

I think that covers the main transitions in core Python.

-- 
Terry Jan Reedy



