[Python-ideas] Map-then-filter in comprehensions

Sjoerd Job Postmus sjoerdjob at sjec.nl
Fri Mar 11 08:23:51 EST 2016


On Thu, Mar 10, 2016 at 11:20:21PM +0100, Michał Żukowski wrote:
> Some time ago I was trying to solve the same issue but using a new keyword
> "where", and I thought that new keyword is too much for just list
> comprehension filtering, so I've made it something like assignments in
> expression, e.g.
> 
> (x+y)**2 + (x-y)**2 where x=1, y=2
> 
> So for list comprehension I can write:
> 
> [stripped for line in lines if stripped where stripped=line.strip()]
> 
> or:
> 
> result = map(f, objs) where f=lambda x: x.return_something()
> 
> or:
> 
> it = iter(lines)
> while len(line) > 4 where line=next(it, '').strip():
>     print(line)
> 
> or:
> 
> lambda x, y: (
>     0 if z == 0 else
>     1 if z > 0 else
>     -1) where z = x + y
> 
> or even:
> 
> lambda something: d where (d, _)=something, d['a']=1
> 
> I even implemented it:
> https://github.com/thektulu/cpython/commit/9e669d63d292a639eb6ba2ecea3ed2c0c23f2636
> 
> and it works nicely. I was thinking to reuse "with [expr] as [var]" but I
> also don't like idea of context sensitive semantics, and I even thought
> that maybe someone, someday would want to write "content = fp.read() with
> open('foo.txt') as fp"...
> 
> The "where" keyword is from guards pattern in Haskell :)

But in Haskell, `where` also introduces a scope. That is, outside the
definition the `where` is attached to, you can't access the names it
binds.

Even though the `where` looks kind of nice, it is (at least to me) also
a bit confusing with respect to evaluation order. Consider:

    [ stripped for idx, line in enumerate(lines) if idx >= 5 or stripped where stripped=line.strip() ]

Intended semantics: give me all lines, stripped, but drop any of the
first 5 lines that are whitespace-only:

    retval = []
    for idx, line in enumerate(lines):
        stripped = line.strip()
        if idx >= 5 or stripped:
            retval.append(stripped)

Now, I'm not entirely sure, but I expect that what actually happens is:

    retval = []
    for idx, line in enumerate(lines):
        if idx < 5:
            stripped = line.strip()
        if idx >= 5 or stripped:
            retval.append(stripped)

That is, should I read it as

    (if idx >= 5 or stripped) where stripped=line.strip()

or as

    if idx >= 5 or (stripped where stripped=line.strip())
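
To make the difference concrete, here is a rough sketch of the two
readings as plain functions in today's Python. The function names and the
sample data are made up for illustration. The second function mirrors the
second loop above, where a skipped binding leaves the previous value of
`stripped` in place; an actual implementation of the proposal might
instead leave the name unbound.

```python
def where_covers_whole_condition(lines):
    # Reading 1: the binding is evaluated once per item, before the test.
    retval = []
    for idx, line in enumerate(lines):
        stripped = line.strip()
        if idx >= 5 or stripped:
            retval.append(stripped)
    return retval


def where_covers_right_operand(lines):
    # Reading 2: the binding belongs to the right operand of `or`, so
    # short-circuiting skips it once idx >= 5 and the old value survives.
    retval = []
    stripped = None  # assumption: only here so the name always exists
    for idx, line in enumerate(lines):
        if idx < 5:
            stripped = line.strip()
        if idx >= 5 or stripped:
            retval.append(stripped)
    return retval


# Hypothetical input: seven lines, some of them whitespace-only.
lines = ["  a  ", "   ", "b", "", " c", "  d  ", "   "]

first = where_covers_whole_condition(lines)
second = where_covers_right_operand(lines)

print(first)   # ['a', 'b', 'c', 'd', '']
print(second)  # ['a', 'b', 'c', 'c', 'c']
```

With this input the two readings agree on the first five lines and
diverge from index 5 onwards, which is exactly where the short-circuit
kicks in.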

For comprehensions, I'd think a `let` clause might make more sense.
Abusing Haskell's notation:

    [ stripped | (idx, line) <- zip [0..] lines, let stripped = strip line, idx >= 5 || length stripped > 0 ]

Porting this to something Python-ish, it'd be

    [ stripped for idx, line in enumerate(lines) let stripped = line.strip() if idx >= 5 or stripped ]

where `let` is a keyword (possibly only valid inside a comprehension).
In Haskell it's a keyword everywhere, but it has somewhat different
semantics.
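
For comparison, something close to the `let` ordering can already be
spelled in current Python, without new syntax. The sketch below shows two
ways I'd expect to work: an extra `for` clause over a one-element list,
and a generator that does the mapping before the filtering. The sample
data and variable names are made up for illustration.

```python
# Hypothetical input: seven lines, some of them whitespace-only.
lines = ["  a  ", "   ", "b", "", " c", "  d  ", "   "]

# Variant 1: a one-element inner loop acts as the binding. It runs once
# per item and always before the `if`, so the order is unambiguous.
with_inner_for = [
    stripped
    for idx, line in enumerate(lines)
    for stripped in [line.strip()]
    if idx >= 5 or stripped
]

# Variant 2: map first in a generator expression, then filter.
pairs = ((idx, line.strip()) for idx, line in enumerate(lines))
with_generator = [stripped for idx, stripped in pairs if idx >= 5 or stripped]

print(with_inner_for)  # ['a', 'b', 'c', 'd', '']
print(with_generator)  # ['a', 'b', 'c', 'd', '']
```

Both variants compute `line.strip()` exactly once per line and match the
intended semantics; whether either is readable enough to make a `let`
keyword unnecessary is a separate question.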

