[Python-ideas] Function composition (was no subject)

Douglas La Rocca larocca at abiresearch.com
Sun May 10 06:58:29 CEST 2015


(Newcomer here.)

I use function composition pretty extensively. I've found it to be incredibly powerful, though it can lead to bad practices, and it has other drawbacks, like unreadable tracebacks. But in many cases there are real benefits, and for data pipelines where you want to avoid state and mutation it works well.

The fn and pymonad modules implement infix composition through operator overloading, but I've found this to be unworkable.

For me, the ideal infix operator would simply be a space, with the composition wrapped in parentheses. So e.g.

>>> (list str sorted)(range(10))
[' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ' ', ',', ',', ',', ',', ',', ',', ',', ',', ',', '0', '1', '2', '3', '4', '5', '6', '7', '8', '9', '[', ']']
 
I might be overlooking something, but it seems to me this would work with existing syntax and semantics and wouldn't conflict with anything else, the way operator overloading would. The only other place where non-indentation spaces are significant is around keywords, which can't be re-assigned. So e.g. `(yield from gen())` wouldn't be parsed as three functions, and `(def func)` would raise SyntaxError.

Here's the composition function I'm working with, stripped of the little debugging helpers:

```
def compose(*fns):
    def compose_(*x):
        # unpack into a new name -- reusing `fns` as the target would make
        # it local to compose_ and raise UnboundLocalError
        fn, *rest = fns
        value = fn(*x)
        if rest:
            return compose(*rest)(value)
        else:
            return value
    return compose_

O=compose
```

I haven't had any issues with the recursion. The `O` alias rubs me the wrong way but seemed to make sense at the time. The thought was that it should look like an operator because it acts like one.

So the use looks like

>>> O(fn1, fn2, fn3, ...)('string to be piped')
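As a quick sanity check with plain string methods (`compose` is restated so the snippet runs standalone):

```python
def compose(*fns):
    # apply fns left to right: compose(f, g)(x) == g(f(x))
    def compose_(*x):
        fn, *rest = fns
        value = fn(*x)
        return compose(*rest)(value) if rest else value
    return compose_

O = compose

# strip first, then uppercase
print(O(str.strip, str.upper)('  hello  '))  # -> 'HELLO'
```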

The fundamental problem for composition is argument passing, and it has to do with the convenience of `*args` and `**kwargs`.

The way to make composition work predictably is to curry the functions yourself, wrapping the arguments you expect to get in nested closures, then repairing `__name__` etc. with `functools.wraps` or `update_wrapper` in the usual way. This looks much nicer and almost natural when you write it with lambdas, e.g.

>>> getitem = lambda item: lambda container: container[item]

(Apologies for having named that lambda there...)
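Each partial application yields a single-argument function that's ready to drop into a pipeline:

```python
getitem = lambda item: lambda container: container[item]

first = getitem(0)  # a one-argument function, suitable for composition
print(first('abc'))         # -> 'a'
print(first([10, 20, 30]))  # -> 10
```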

The other way to manage passing values from one function to the next is to define a function like

def star(fn):
    # wrap fn so it takes a single iterable and unpacks it as positional args
    return lambda x: fn(*x)

Then if you get a list or tuple at one point in the pipeline and the next function expects separate positional arguments, you can wrap the function and call it like

>>> star(getattr)((getattr, '__name__'))
'getattr'

I've run into problems using the @curried decorators from the fn and pymonad modules because they don't know how to handle *args, i.e. when to stop collecting arguments and finally make the function call.

If you want to have the composition order reversed you could decorate the definition with

```
def flip(f):
    def flip_(*x):
        return f(*reversed(x))
    return flip_
```
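For example, `flip` turns subtraction into "subtract from" (restated so the snippet runs standalone):

```python
import operator

def flip(f):
    # call f with its positional arguments reversed
    def flip_(*x):
        return f(*reversed(x))
    return flip_

rsub = flip(operator.sub)
print(rsub(1, 10))  # -> 9, i.e. 10 - 1
```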

Once we have composition we can write partials for `map`, `filter`, and `reduce`, but with a small twist: make them variadic in the first argument and pass the arguments to compose:

```
from functools import reduce  # Python 3 moved reduce into functools

def fmap(*fn):
    def fmap_(x):
        return list(map(compose(*fn), x))
    return fmap_

def ffilter(fn):
    def ffilter_(xs):
        return list(filter(fn, xs))
    return ffilter_

def freduce(fn):
    def freduce_(xs):
        return reduce(fn, xs)
    return freduce_

def Fmap(*fns):
    def Fmap_(x):
        return list(map(lambda fn: fn(x), fns))
    return Fmap_
```

The `Fmap` function seemed like some sort of "conjugate" to `fmap` so I tried to give it name suggesting this (again, at the expense of abusing naming conventions).

Instead of mapping a function over an iterable like `fmap`, `Fmap` applies each given function to a single value. So

>>> Fmap(add(1), sub(1))(1)
[2, 0]
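The examples here assume curried arithmetic helpers: `add(1)`, `sub(1)`, and `mul(2)` each return a single-argument function. A minimal sketch of those helpers, with `Fmap` restated so the snippet runs standalone:

```python
# curried arithmetic helpers assumed by the examples
add = lambda n: lambda x: x + n
sub = lambda n: lambda x: x - n
mul = lambda n: lambda x: x * n

def Fmap(*fns):
    # apply each given function to the same value
    def Fmap_(x):
        return list(map(lambda fn: fn(x), fns))
    return Fmap_

print(Fmap(add(1), sub(1))(1))  # -> [2, 0]
```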

I've called them `fmap`, `ffilter`, and `freduce`, but I don't much like these names, as they imply they're the same as Haskell's `fmap`, and they're not. As far as I can tell there's no way to make them anything like Haskell's, and they shouldn't be. If these implement a "paradigm", it's not purely functional but tacit/concatenative.

It made sense to compose the passed arguments because there's no reason to pass anything else to `fmap` in the first call. So sequential calls to (the return value of) `fmap` inside a pipeline, like

>>> O(mul(10),
...   fmap(add(1)), 
...   fmap(mul(2))
...  )([1])
[4, 4, 4, 4, 4, 4, 4, 4, 4, 4]                                          
                                                                        
can instead be written like

>>> O(mul(10),
...   fmap(add(1), 
...        mul(2))
...  )([1])
[4, 4, 4, 4, 4, 4, 4, 4, 4, 4]
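Both spellings can be checked against each other (everything is restated so the snippet runs standalone; note that `mul(10)` on a list repeats it, since `*` is list repetition):

```python
def compose(*fns):
    def compose_(*x):
        fn, *rest = fns
        value = fn(*x)
        return compose(*rest)(value) if rest else value
    return compose_

O = compose

def fmap(*fn):
    def fmap_(x):
        return list(map(compose(*fn), x))
    return fmap_

add = lambda n: lambda x: x + n
mul = lambda n: lambda x: x * n  # [1] * 10 repeats the list

fused = O(mul(10), fmap(add(1), mul(2)))([1])
staged = O(mul(10), fmap(add(1)), fmap(mul(2)))([1])
print(fused)  # -> [4, 4, 4, 4, 4, 4, 4, 4, 4, 4]
```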
                                                                        
It also makes it easier to work at different levels inside nested structures. In these heavily nested cases the composition pipeline even begins to resemble the data structure passing through, which makes sense.

As another example, following is part of a pipeline that takes strings of bullet-separated strings of "key:value" pairs and converts each one to a dictionary, then folds the result together:

>>> d = [' foo00 : bar00 • foo01 : bar01 ', 
...      ' foo10 : bar10 • foo11 : bar11 ', 
...      ' foo20 : bar10 • foo21 : bar21 ',]

>>> dict_foldl = freduce(lambda d1, d2: dict(d1, **d2))
>>> strip = lambda x: lambda s: s.strip(x)
>>> split = lambda x: lambda s: s.split(x)

>>> f = O(fmap(strip(' '),
...            split('•'), 
...            fmap(split(':'),
...                 fmap(strip(' ')),
...                 tuple), 
...            tuple,
...            dict),
...       dict_foldl)

>>> f(d)
{'foo00': 'bar00',
 'foo01': 'bar01',
 'foo10': 'bar10',
 'foo11': 'bar11',
 'foo20': 'bar10',
 'foo21': 'bar21'}
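The whole pipeline can be run end to end. Everything is restated below so the snippet stands alone (note the inner `strip` is wrapped in `fmap`, since `split(':')` hands it a list of strings, not a single string):

```python
from functools import reduce

def compose(*fns):
    def compose_(*x):
        fn, *rest = fns
        value = fn(*x)
        return compose(*rest)(value) if rest else value
    return compose_

O = compose

def fmap(*fn):
    def fmap_(x):
        return list(map(compose(*fn), x))
    return fmap_

def freduce(fn):
    def freduce_(xs):
        return reduce(fn, xs)
    return freduce_

dict_foldl = freduce(lambda d1, d2: dict(d1, **d2))
strip = lambda x: lambda s: s.strip(x)
split = lambda x: lambda s: s.split(x)

d = [' foo00 : bar00 • foo01 : bar01 ',
     ' foo10 : bar10 • foo11 : bar11 ',
     ' foo20 : bar10 • foo21 : bar21 ']

f = O(fmap(strip(' '),
           split('•'),
           fmap(split(':'),
                fmap(strip(' ')),
                tuple),
           tuple,
           dict),
      dict_foldl)

print(f(d))
```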

The combination of `compose`, `fmap`, and `Fmap` can be amazingly powerful for doing lots of work in a neat way while keeping the focus on the pipeline itself and not the individual values passing through.

The other thing is that this opens the door to a full "algebra" of maps which is kind of insane:

```
def mapeach(*fns):
    def mapeach_(*xs):
        # pair the i-th function with the i-th element of each iterable
        return list(map(lambda fn, *x: fn(*x), fns, *xs))
    return mapeach_

def product_map(fns):
    # apply every function to every element
    return lambda xs: list(map(lambda x: list(map(lambda fn: fn(x), fns)), xs))

def smap(*fns):
    "star map"
    return lambda xs: list(map(O(*fns), *xs))

def pmap(*fns):
    return lambda *xs: list(map(lambda *x: list(map(lambda fn: fn(*x), fns)), *xs))

def matrix_map(*_fns):
    def matrix_map_(*_xs):
        return list(map(lambda fns, xs: list(map(lambda fn, x: fmap(fn)(x), fns, xs)), _fns, _xs))
    return matrix_map_

def mapcat(*fn):
    "clojure-inspired?"
    return compose(fmap(*fn), freduce(list.__add__))

def filtercat(fn):
    return compose(ffilter(fn), freduce(list.__add__))
```
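As one concrete data point, `mapeach` pairs the i-th function with the i-th element of each iterable (restated so the snippet runs standalone):

```python
import operator

def mapeach(*fns):
    def mapeach_(*xs):
        # fns[i] is applied to the i-th item drawn from each iterable in xs
        return list(map(lambda fn, *x: fn(*x), fns, *xs))
    return mapeach_

# add pairs (1, 3); mul pairs (2, 4)
print(mapeach(operator.add, operator.mul)([1, 2], [3, 4]))  # -> [4, 8]
```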

I rarely use any of these. They grew out of an attempt to tease out some hidden structure behind the combination of `map` and star packing/unpacking.

I do think there's something there but the names get in the way--it would be better to find a way to define a function that takes a specification of the structures of functions and values and knows what to do, e.g. something like

>>> from types import FunctionType
>>> fn = FunctionType
>>> # then the desired/imaginary version of map...
>>> _map(fn, [int])(add(1))(range(5)) # sort of like `fmap`
[1,2,3,4,5]
>>> _map([fn], [int])((add(x) for x in range(5)))(range(5)) # sort of like `mapeach`
[0,2,4,6,8]
>>> _map([[fn]], [[int]])(((add(x) for x in range(5))*10))((list(range(5)))*10) # sort of like `matrix_map`
[[[0, 1, 2, 3, 4],
  [1, 2, 3, 4, 5],
  [2, 3, 4, 5, 6],
  [3, 4, 5, 6, 7],
  [4, 5, 6, 7, 8],
  [0, 1, 2, 3, 4],
  [1, 2, 3, 4, 5],
  [2, 3, 4, 5, 6],
  [3, 4, 5, 6, 7],
  [4, 5, 6, 7, 8]]]

In most cases the first argument would just be `fn`, but it would be *really* nice to be able to do something like

>>> map(fn, [[int], [[int],[[str],[str]]]])

where all you need to do is give the schema and indicate which values to apply the function to. Giving the type would be an added measure, but passing `type` in the schema for unknowns should work just as well.
________________________________________
From: Python-ideas <python-ideas-bounces+larocca=abiresearch.com at python.org> on behalf of Steven D'Aprano <steve at pearwood.info>
Sent: Saturday, May 09, 2015 11:20 PM
To: python-ideas at python.org
Subject: Re: [Python-ideas] Function composition (was no subject)

On Sun, May 10, 2015 at 03:51:38AM +0300, Koos Zevenhoven wrote:

> Another way to deal with elementwise operations on iterables would be to
> make a small, mostly backwards compatible change in map:
>
> When map is called with just one argument, for instance map(square), it
> would return a function that takes iterables and maps them element-wise.
>
> Now it would be easier to use map in pipelines, for example:
>
> rms = sqrt @ mean @ map(square)

Or just use a tiny helper function:

from functools import partial

def vectorise(func):
    return partial(map, func)

rms = sqrt @ mean @ vectorise(square)


--
Steve
_______________________________________________
Python-ideas mailing list
Python-ideas at python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/

