[Python-ideas] Function composition (was no subject)

Douglas La Rocca larocca at abiresearch.com
Sun May 10 12:36:59 CEST 2015


> I understand why you named it; I don't understand why you didn't just use
> def if you were going to name it (and declare it in a statement instead of the
> middle of an expression). Anyway, this is already in operator, as itemgetter,
> and it's definitely useful to functional code, especially itertools-style
> generator-driven functional code. And it feels like the pattern ought to be
> generalizable... but other than attrgetter, it's hard to think of another
> example where you want the same thing. After all, Python only has a couple
> of syntactic forms that you'd want to wrap up as functions at all, so it only has
> a couple of syntactic forms that you'd want to wrap up as curried functions.

Sorry for the confusion here--I was trying to say that it's correct to use def in order to properly set __name__, leave room for docstrings, etc.
The downside is that the nesting can strain readability. I was only showing the "formal" lambda-style equivalent to point out that currying arguments isn't really confusing at all, considering that

    lambda x: lambda y: lambda z: <some expression of x, y, z>

begins to resemble syntactically

    def anon(x, y, z):
        <some expression of x, y, z>

(obviously semantically these are different). 
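A concrete instance of the parallel above (the names here are purely my own illustration):

    # Nested lambdas: one argument at a time.
    add_curried = lambda x: lambda y: lambda z: x + y + z

    # Flat def: all arguments at once.
    def add(x, y, z):
        return x + y + z

    add_curried(1)(2)(3)  # -> 6
    add(1, 2, 3)          # -> 6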

Regarding the `getitem` example, this wasn't intended as a use case. It's true that Python has few syntactic forms you'd want to wrap (isinstance, hasattr, etc.). I mostly had external module APIs in mind here.

> I don't understand why this is called fmap. I see below that you're not
> implying anything like Haskell's fmap (which confused me...), but then what
> _does_ the f mean? It seems like this is just a manually curried map, that
> returns a list instead of an iterator, and only takes one iterable instead of one
> or more. None of those things say "f" to me, but maybe I'm still hung up on
> expecting it to mean "functor" and I'll feel like an idiot once you clear it up. :)

> Also, why _is_ it calling list? Do your notions of composition and currying not
> play well with iterators? If so, that seems like a pretty major thing to give up.
> And why isn't it variadic in the iterables? You can trivially change that by just
> having the wrapped function take and pass *x, but I assume there's some
> reason you didn't?

It was only called fmap to leave the builtin map in the namespace; the 'f' just meant 'function'.

Taking a single iterable as the argument rather than varargs avoids the need for a `star` shim in the composition. I do have a wrapper `s` for this but find it ugly to use. It's basically a conventional decision, forced by the difference between passing a single value to a "monadic" function (in the APL sense, not the Haskell sense) and passing arguments to a variadic one. In my own util library this also shows up as two versions of the identity function:

    def identity(x):
        return x

    def identity_star(*x):
        return x

These may seem useless, but their purpose becomes apparent when you're in the middle of a composition.
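A trivial illustration of the calling-convention difference (my own example, outside any composition):

    identity(1)            # -> 1
    identity((1, 2))       # -> (1, 2), a single tuple argument
    identity_star(1, 2)    # -> (1, 2), varargs collected into a tuple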

For data structures where you want to map over lists of lists of lists etc., you can either define a higher map or do something like

    fmap(fmap(fmap(function_to_apply)))(iterable)

which would incidentally be the same as the uglier

    compose(*(fmap,)*3)(function_to_apply)(iterable)

though the latter makes it possible to parametrize the iteration depth.
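For concreteness, here is one minimal sketch of `fmap` and `compose` under which the two forms above coincide (my own approximations; the real definitions earlier in the thread may differ, e.g. in composition order):

    from functools import reduce

    def fmap(fn):
        # Curried map over a single iterable, eagerly building a list.
        return lambda iterable: [fn(x) for x in iterable]

    def compose(*fns):
        # Left-to-right composition: compose(f, g, h)(x) == h(g(f(x))).
        return lambda x: reduce(lambda acc, fn: fn(acc), fns, x)

    data = [[["ab", "c"], ["def"]], [["gh"]]]   # three levels of nesting
    fmap(fmap(fmap(len)))(data)                 # -> [[[2, 1], [3]], [[2]]]
    compose(*(fmap,) * 3)(len)(data)            # -> the same result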

As for wrapping in `list`--in some cases (I can't immediately recall them all) the list actually needed to be built in order for the composition to work. A simple case would be

    compose(mul(10), fmap(len), len)([[1]*10]*10)

which would raise a TypeError if fmap were lazy, since the final len would receive a generator rather than a list. I should look again to see whether there's a better way to fix it. But I reverted to the 2.x default (an eager, list-building map) because I made full use of generators before moving to 3.x and decided I didn't need map to be lazy. To be honest, the preference for everything to be lazy seems somewhat fashionable at the moment... you can get along just as well by knowing where things shouldn't be fully loaded into memory (i.e. when to use a generator).
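To illustrate (reusing the sketch of `fmap`/`compose` above, with a `partial(mul, 10)` as a stand-in for a curried `mul(10)`--these are my assumptions, not the thread's definitions): the eager, list-building variant lets the pipeline finish, while the lazy one hands a generator to the final len:

    from functools import partial, reduce
    from operator import mul

    def compose(*fns):
        # Left-to-right composition, as in the sketch above.
        return lambda x: reduce(lambda acc, fn: fn(acc), fns, x)

    def fmap_eager(fn):
        return lambda it: [fn(x) for x in it]   # builds the list

    def fmap_lazy(fn):
        return lambda it: (fn(x) for x in it)   # generator, no len()

    data = [[1] * 10] * 10
    mul10 = partial(mul, 10)                    # stand-in for a curried mul(10)

    compose(mul10, fmap_eager(len), len)(data)  # -> 100
    compose(mul10, fmap_lazy(len), len)(data)   # TypeError: object of type 'generator' has no len()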

> These two aren't variadic in fn like fmap was. Is that just a typo, or is there a
> reason not to be?

Yes just a typo!

> Now that we have a concrete example... This looks like a nifty translation of
> what you might write in Haskell, but it doesn't look at all like Python to me.
> 
> And compare:
> 
>     def f(d):
>         pairs = (pair.strip(' ').split(':') for pair in d.split('•'))
>         strippedpairs = ((part.strip(' ') for part in pair) for pair in pairs)
>         return dict(strippedpairs)
> 
> Or, even better:
> 
>     def f(d):
>         pairs = (pair.strip(' ').split(':') for pair in d.split('•'))
>         return {k.strip(' '): v.strip(' ') for k, v in pairs}
> 
> Of course I skipped a lot of steps--turning the inner iterables into tuples,
> then into dicts, then turning the outer iterable into a list, then merging all the
> dicts, and of course wrapping various subsets of the process up into
> functions and calling them--but that's because those steps are unnecessary.
> We have comprehensions, we have iterators, why try to write for Python
> 2.2?

I agree these work just as well.

> And notice that any chain of iterator transformations like this _could_ be
> written as a single expression. But the fact that it doesn't _have_ to be--that
> you can take any step you want and name the intermediate iterable without
> having to change anything (and with negligible performance cost), and you
> can make your code vertical and play into Python indentation instead of
> writing it horizontally and faking indentation with paren-continuation--is
> what makes generator expressions and map and filter so nice.

> Well, that, and the fact that in a comprehension I can just write an expression
> and it means that expression. I don't have to wrap the expression in a
> function, or try to come up with a higher-order expression that will effect
> that first-order expression when evaluated.

> But often, the individual values have useful names that make it easier to
> keep track of them. Like calling the keys and values k and v instead of having
> them be elements 0 and 1 of an implicit *args.

I agree for the most part, but there are cases where you're really deep into some structure, manipulating the values in a generic way, and the names *do* get in the way. The temptation for me in those cases is to use x, y, z, s, t, etc. At this point the readability really suffers. The alternative is to modularize more, breaking the functions apart, but this only helps so much... In a certain way I find `(pair.strip(' ').split(':') for pair in d.split('•'))` to be less readable than the first steps in the composition--with the generator I'm reading back and forth in order to find out what's happening whereas the composition + map outlines the steps in a tree-like structure.

