[Python-ideas] Where-statement (Proposal for function expressions)

Steven D'Aprano steve at pearwood.info
Sat Jul 18 02:39:56 CEST 2009


On Sat, 18 Jul 2009 06:28:40 am Gerald Britton wrote:

> I often time my code and often find that removing function calls
> "where" possible actually makes a measurable difference.  This is
> especially the case with little getters and setters in class defs.

Perhaps you shouldn't be writing Java code with lots of getters and 
setters then *wink*

Seriously, I'm very glad to hear you're measuring before optimising, 
but a "measurable difference" is not necessarily a significant 
difference. Anyway, we're not here to discuss the pros and cons of 
optimization, so back to the topic at hand:

> Anyway, I like the "where" idea not because of real or imagined
> performance gains, but because of its cleanness when expressing
> problems.

I just don't see this cleanness.

It seems to me that the "where" construct makes writing code easier than 
reading code. It is a top-down construct: first you come up with some 
abstract expression:

element = w + x.y - f(z) where:

and then you need to think about how to implement the construct:

    w = 2
    x = thingy()
    z = 4

That roughly models the process of writing code in the first place, so I 
can see the attraction. But it doesn't model the process of *reading* 
code:

"Hmmm, element is equal to w + x.y - f(z), okay, but I haven't seen any 
of those names before, are they globals? Ah, there's a 'where' 
statement, better pop that expression into short-term memory while I 
read the block and find out what they are..."

In a real sense, the proposed 'where' block breaks the flow of reading 
code. Normally, reading a function proceeds in an orderly fashion from 
the start of the function to the end in (mostly) sequential order, a 
bottom-up process:

def parrot():
    x = 1
    y = 2
    z = 3
    result = x+y+z
    return result

A complication: if the function you are reading relies on globals, 
including other functions, then you may need to jump out of it to 
discover what they are. But if you already know what a global does, or 
can infer it from the name, then it doesn't disrupt the sequential 
reading.

But the where clause introduces *look-ahead* to the process:

def parrot():
    x = 1
    result = x+y+z where:  # look-ahead
        y = 2
        z = 3
    return result


This is not a heavy burden if you are limited to a few simple names, but 
as soon as you re-use pre-existing names in the 'where' scope, or 
increase its complexity, readability suffers greatly.

def harder():
    w = 5
    z = 2
    result = w + x.y - f(z) where:
        w = 2
        class K:
            y = 3
        x = K()
        def f(a):
            return a**2 + a
        z = 4
    return result


Look-ahead is simply harder on the reader than reading sequentially.
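
For comparison, here is a rough sketch of harder() in today's Python, 
with the definitions moved ahead of the expression that uses them. It 
assumes that names bound in the 'where' block would shadow the outer w 
and z (the proposal's exact scoping rules are not settled here), and the 
name harder_sequential is made up for illustration:

```python
def harder_sequential():
    # Assumed semantics: the where-block's w and z shadow the outer
    # w = 5 and z = 2, so those earlier bindings are simply dropped.
    w = 2

    class K:
        y = 3

    x = K()

    def f(a):
        return a**2 + a

    z = 4
    # Every name is already defined by the time the reader gets here.
    result = w + x.y - f(z)
    return result
```

Under that assumption the expression is 2 + 3 - f(4), and since f(4) is 
20 the function returns -15. The reader meets each name before it is 
used.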

You can abuse any tool, write spaghetti code in any language, but some 
idioms encourage misuse, and in my opinion this is one of them.



> A big use for me would be in list comprehensions.  In one 
> project I work on, I see things like:
>
> for i in [item in self.someobject.get_generator() if
> self.important_test(item)]
>
> and other really long object references.

I don't see anything objectionable in that. It's a tad on the long side, 
but not excessively.


> which I would like to write:
>
> mylist =  [item in f if g(item)] where:
>     f = self.someobject.get_generator()
>     g = self.important_test

I don't think much of your naming convention. Surely g should be used 
for the generator, and f for the function, instead of the other way 
around?



> To my eyes, the first is harder to read than the second one.  Of
> course I can do this:
>
>     f = self.someobject.get_generator()
>     g = self.important_test
>     mylist =  [item in f if g(item)]:
>
>
> but then "f" and "g" pollute the calling context's namespace.

Holy cow! How many variable names do you have in a single function that 
you have to worry about that???

Honestly, I think this entire proposal is a pessimation: you're 
proposing to significantly complicate the language and negatively 
impact the readability of code in order to avoid polluting the 
namespace of a function?

Of course, then people will start worrying about "polluting the 
namespace" of the where-block, and start doing this:

result = x + y + z where:
    x = 1
    y = 2
    z = a*b - c where:
        a = 5
        b = 6
        c = d*e where:
            d = 3
            e = 4
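
Flattened into ordinary sequential statements, that nesting has to be 
read inside-out. A sketch, assuming each where-block is evaluated before 
the expression it qualifies (the function name flattened is made up for 
illustration):

```python
def flattened():
    # Innermost where-block first.
    d = 3
    e = 4
    c = d * e
    # Middle where-block.
    a = 5
    b = 6
    z = a * b - c
    # Outermost where-block, then the expression itself.
    x = 1
    y = 2
    result = x + y + z
    return result
```

Here c is 12, z is 18, and the result is 21, with every name defined 
before it is used.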




-- 
Steven D'Aprano


