the Gravity of Python 2

Mon Jan 6 20:55:01 EST 2014

On Mon, Jan 6, 2014 at 5:00 PM, Chris Angelico <rosuav at gmail.com> wrote:
> On Tue, Jan 7, 2014 at 11:27 AM, Devin Jeanpierre
> <jeanpierreda at gmail.com> wrote:
>> For example, I imagine that it is kind of _silly_ to have a
>> __future__.disable_str_autoencoding on a per-module basis, because
>> some modules' functions will fail when they are given the wrong type,
>> and some won't -- but in the context of making migration easier, that
>> silliness is probably OK.
>
> At what point does the auto-encoding happen, though? If a function
> calls another function calls another function, at what point do you
> decide that this ought to have become a str?

Python has a defined place where the coercion happens. For example,
the __add__ method of str objects can do it.

As you note below for dicts, though, the place where the behavior
changes can vary. For example, maybe all str objects created in a
module cannot be coerced anywhere else, or maybe it's the coercions
that happen inside a module that are disabled. The former is more
efficient, but its effects creep out transitively in the most
difficult way possible. The latter is essentially just an API change
(rather than a type change), and so easy enough, but it's
prohibitively expensive, in a way that makes all code everywhere in
Python slower. In the end, we can still choose one of those, and in
principle the __future__ feature would work, even if it's not the
best. (In fact, if you want, you could even do both.)
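
To make the two placements concrete, here is a rough sketch that
emulates them on Python 3 with bytes subclasses. Everything in it is
hypothetical: StrictStr, SiteCheckedStr, add_in_module and the
DISABLE_STR_AUTOENCODING flag are invented for illustration, and
sys._getframe is a CPython implementation detail. It is not how a real
__future__ feature would be implemented, only a picture of where the
check could live.

```python
import sys


class StrictStr(bytes):
    """Option 1 (type change): objects of this type never auto-decode."""

    def __add__(self, other):
        if isinstance(other, str):
            # The refusal travels with the object into every module.
            raise TypeError("implicit decoding is disabled for this object")
        return StrictStr(bytes.__add__(self, other))


class SiteCheckedStr(bytes):
    """Option 2 (API change): the module doing the operation decides."""

    def __add__(self, other):
        if isinstance(other, str):
            # Look up a hypothetical opt-in flag in the globals of the
            # code that performed the "+".
            caller_globals = sys._getframe(1).f_globals
            if caller_globals.get("DISABLE_STR_AUTOENCODING", False):
                raise TypeError("implicit decoding is disabled in this module")
            # Emulate Python 2's implicit ASCII decode.
            return self.decode("ascii") + other
        return SiteCheckedStr(bytes.__add__(self, other))


def add_in_module(left, right, opted_in):
    """Evaluate left + right as if inside a module with the given flag."""
    namespace = {
        "DISABLE_STR_AUTOENCODING": opted_in,
        "left": left,
        "right": right,
    }
    exec("result = left + right", namespace)
    return namespace["result"]
```

With the first class, an object created in an opted-in module raises
no matter which module later tries to mix it with text. With the
second, the same object coerces or raises depending on who performs
the operation, at the price of a lookup on each such operation.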

> I suspect there'll be quite a few problems that can't be solved
> per-module. The division change is easy, because it just changes the
> way code gets compiled (there's still "integer division" and "float
> division", it's just that / gets compiled into the latter instead of
> the former). With print_function I can imagine there might be some
> interactions that are affected, but nothing too major. Deploying
> new-style classes exclusively could be minorly problematic, but it'd
> probably work (effectively, a future directive stipulates that
> everything in this module inherits from object - technically should
> work, but might cause code readability confusion). But there are much
> subtler issues. Compare this code in Python 2 and Python 3:
>
> def f1():
>     return {1:2, 11:22, 111:222}
>
> def f2(d):
>     return d.keys()
>
> def f3(k):
>     return k.pop()
>
> process_me = f2(f1())
> try:
>     while True:
>         current = f3(process_me)
>         # ....
> except IndexError:
>     pass
>
> Obviously this works in Python 2, and fails in Python 3 (because
> keys() returns a view). Now imagine these are four separate modules.
> Somewhere along the way, something needs to pass the view through
> list() to make it poppable. Or, putting it the other way, somewhere
> there needs to be an alert saying that this won't work in Py3. Whose
> responsibility is it?
>
> * Is it f1's responsibility to create a different sort of dict that
> has a keys() method that returns a view?
> * Is it f2's responsibility to notice that it's calling keys() on a
> dictionary, and that it should warn that this will change (or switch
> to compatibility mode, or raise error, or whatever)? This is where the
> error actually is.
> * Is it f3's responsibility? This one I'm pretty sure is not so.
> * Is it the main routine's job to turn process_me into a list? I don't
> think so. There's nothing in that code that indicates that it's using
> either a dictionary or a list.
>
> I'd put the job either on f1 or on f2. A __future__ directive could
> change the interpretation of the { } literal syntax and have it return
> a dictionary with a keys view, but the fix would be better done in f2
> - where it's not obvious that it's using a dictionary at all.
>
> I'm not sure that a future directive can really solve this one. Maybe
> a command-line argument could, but that doesn't help with the gradual
> migration of individual modules.

What if we decide there is no single source of responsibility, and it
can't be limited exactly to a module, and make a __future__ feature
the best we can regardless? We can still extract some benefit from a
"sloppy" __future__ feature: we can still move code piecemeal.

If whatever __future__ feature there is, when enabled on the module
with f2 (or, in another case, f1), causes an error in f3, that's a
little misleading in that the error is in the wrong place, but it
doesn't fundamentally mean we can't move the codebase piecemeal. It
means that the change we make to the file for f2 (or f1) might require
some additional changes elsewhere or internally due to outside-facing
changes in semantics. It makes the required changes larger than in the
case of division, like you say, but it's still potentially smaller and
simpler than in the case of an atomic migration to Python 3.
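
As a small illustration of the error landing in the wrong place, here
is the quoted example run on Python 3, where keys() already returns a
view. The names f2_unported, f2_ported, drain and error_site are mine,
not from the original code, and wrapping the keys in list() is just
one possible follow-up change, assumed here to live in f2's module.

```python
def f1():
    return {1: 2, 11: 22, 111: 222}


def f2_unported(d):
    # On Python 3 this returns a dict view, which has no pop().
    return d.keys()


def f2_ported(d):
    # Hypothetical follow-up fix in f2's module: materialize a list.
    return list(d.keys())


def f3(k):
    return k.pop()


def drain(keys):
    """The main routine: pop keys until the container is empty."""
    seen = []
    try:
        while True:
            seen.append(f3(keys))
    except IndexError:
        pass
    return seen


# The failure surfaces inside f3 as an AttributeError, even though the
# behavior that changed belongs to f2.
try:
    drain(f2_unported(f1()))
    error_site = None
except AttributeError:
    error_site = "f3"

ported_result = drain(f2_ported(f1()))
```

The traceback points at f3, but the change that makes the code work
again is confined to f2, which is the kind of piecemeal step I mean.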

-- Devin