Explanation of this Python language feature? [x for x in x for x in x] (to flatten a nested list)

Mon Apr 7 01:10:22 EDT 2014

On Sun, 06 Apr 2014 20:45:47 -0400, Terry Reedy wrote:

> On 4/6/2014 7:48 PM, Steven D'Aprano wrote:
>> On Sun, 06 Apr 2014 23:10:47 +0300, Marko Rauhamaa wrote:
>>
>>> Steven D'Aprano <steve+comp.lang.python at pearwood.info>:
>>>
>>>> On Sun, 06 Apr 2014 12:05:16 +0300, Marko Rauhamaa wrote:
>>>>> Python, BTW, is perfectly suitable for computer science.
>>>>
>>>> I don't think it is. Python is not a pure functional language, so
>>>> it's very difficult to prove anything about the code apart from
>>>> running it.
>>>
>>> Many classic CS ideas are expressed in terms of an Algol-like
>>> language. Nothing would prevent you from framing those ideas in a
>>> Python-like (pseudo)language. The question is mostly whether you
>>> prefer begin/end, braces or indentation.
>>
>> Okay, I made an error in stating that it's because Python is not a pure
>> functional language. It's because Python is so dynamic that it is very
>> difficult to prove anything about the code apart from running it. Take
>> this code-snippet of Python:
>>
>> n = len([1, 2, 3])
>>
>> What can we say about it? Almost nothing!
> 
> One merely needs to stipulate that builtin names have not been rebound
> to give the answer: n is bound to 3. 

But if I can do that, I can also stipulate that len() has been rebound to 
a function that ignores its argument and always returns the string 
"Surprise!". In that case, n is bound to the string "Surprise!". I can 
prove that this code snippet does almost *anything*, just by making some 
assumption about len.
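
To make that concrete, here is a minimal sketch. The function name 
snippet and the replacement lambda are only illustrative; the one thing 
taken from the discussion is the line n = len([1, 2, 3]).

```python
import builtins

def snippet():
    # The line under discussion, unchanged.
    n = len([1, 2, 3])
    return n

before = snippet()  # len still refers to the real builtin here

_original_len = builtins.len
# Illustrative replacement: ignores its argument entirely.
builtins.len = lambda obj: "Surprise!"
try:
    after = snippet()  # same source code, different result
finally:
    builtins.len = _original_len  # always restore the real builtin

print(before, after)
```

The source of snippet() never changes between the two calls; only the 
binding of the name len does.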

The point is that one cannot derive much about the behaviour of Python 
code except by analysing the whole program, which is a very difficult 
problem, and often even that is not enough. The only way to be sure what value is 
bound to len at the time that code snippet is executed is to actually run 
the code up to that code snippet and then look. In practical terms, 
things are not quite as bleak as I've made out: a little bit of runtime 
analysis goes a long way, as the success of PyPy, Numba, Cython and Psyco 
proves. That's why optimizers like PyPy generally produce code like this:

    if some guard condition is true:
        run fast optimized branch
    else:
        fall back on standard Python 

where the guard condition is generally checked at runtime, not at compile 
time.
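
A hand-written sketch of that guard pattern, applied to the len example, 
might look like the following. This is not what PyPy actually emits; the 
names _REAL_LEN and length_of_literal are invented for illustration, and 
a real JIT checks its guards far more cheaply than this.

```python
import builtins

# Captured once, at the time the "optimizer" specialises the code.
_REAL_LEN = builtins.len

def length_of_literal():
    # Guard: is the name len still bound to the builtin we assumed?
    if len is _REAL_LEN:
        return 3  # fast optimized branch: the call has been folded away
    else:
        return len([1, 2, 3])  # fall back on standard Python semantics

fast = length_of_literal()  # guard holds, constant is returned

builtins.len = lambda obj: "Surprise!"  # illustrative rebinding
try:
    slow = length_of_literal()  # guard fails, fallback honours the rebinding
finally:
    builtins.len = _REAL_LEN

print(fast, slow)
```

The guard is evaluated on every call, which is the price of staying 
correct when len can be rebound at any moment.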

But *in isolation*, you can't tell what len will be bound to unless you 
wait until runtime and look. A peephole optimizer that replaced a call 
like len([1,2,3]) with the constant 3 every time it sees it would be 
*wrong*.
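
One way to see that CPython's own compiler does not make this 
substitution is to disassemble the code and look for a run-time lookup 
of the name len. The sketch below assumes CPython and wraps the line in 
a function (snippet is an illustrative name); opcode names are an 
implementation detail and may differ between versions.

```python
import dis

def snippet():
    n = len([1, 2, 3])
    return n

# Collect the names that the compiled function looks up at run time.
instructions = list(dis.get_instructions(snippet))
global_loads = [ins.argval for ins in instructions
                if ins.opname == "LOAD_GLOBAL"]

print(global_loads)
```

If the call had been folded to a constant, there would be no lookup of 
len left in the bytecode at all.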

> In the absence of code or text
> specifying otherwise, that is the reasonable default assumption and the
> one that most makes when reading code.

Well of course, but the requirements of an optimizer or correctness 
prover or similar are much stricter than just "this is a reasonable 
default assumption". 

> Restricting the usage of Python's flexibility does not make it another
> language. It makes it the actual language that the vast majority of
> programs are written in and that people assume when reading code.

That's incorrect. If len were a keyword, and couldn't be shadowed or 
replaced, it would be another language. It is not an accident that you 
can replace len in builtins; it is a deliberate feature of the language.
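
The difference between a keyword and an ordinary name can be checked 
directly. In this small sketch the helper can_be_assigned is made up for 
the purpose; it only asks the compiler whether an assignment to the 
given name is syntactically legal.

```python
import keyword

def can_be_assigned(name):
    """Return True if `name = None` compiles, False on a SyntaxError."""
    try:
        compile(f"{name} = None", "<demo>", "exec")
    except SyntaxError:
        return False
    return True

len_is_keyword = keyword.iskeyword("len")   # len is an ordinary name
len_assignable = can_be_assigned("len")     # so rebinding it is legal
if_assignable = can_be_assigned("if")       # a real keyword is rejected

print(len_is_keyword, len_assignable, if_assignable)
```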

-- 
Steven