[Python-ideas] Proposal: A Reduce-Map Comprehension and a "last" builtin

David Mertz mertz at gnosis.cx
Mon Apr 9 21:46:11 EDT 2018

I continue to find all this weird new syntax to create absurdly long
one-liners confusing and mysterious. Python is not Perl for a reason.

On Mon, Apr 9, 2018, 5:55 PM Peter O'Connor <peter.ed.oconnor at gmail.com>
wrote:

> Kyle, you sounded so reasonable when you were trashing
> itertools.accumulate (which I now agree is horrible).  But then you go and
> support Serhiy's madness:  "smooth_signal = [average for average in [0] for
> x in signal for average in [(1-decay)*average + decay*x]]" which I agree
> is clever, but reads more like a riddle than readable code.
> Anyway, I continue to stand by:
>     (y:= f(y, x) for x in iter_x from y=initial_y)
> And, if that's not offensive enough, to its extension:
>     (z, y := f(z, x) -> y for x in iter_x from z=initial_z)
> Which carries state "z" forward but only yields "y" at each iteration.
> (see proposal: https://github.com/petered/peps/blob/master/pep-9999.rst)
> Why am I so obsessed?  Because it will allow you to conveniently replace
> classes with more clean, concise, functional code.  People who thought they
> never needed such a construct may suddenly start finding it indispensable
> once they get used to it.
> How many times have you written something of this form?
>     class StatefulThing(object):
>         def __init__(self, initial_state, param_1, param_2):
>             self._param_1 = param_1
>             self._param_2 = param_2
>             self._state = initial_state
>         def update_and_get_output(self, new_observation):  # (or just __call__)
>             self._state = do_some_state_update(
>                 self._state, new_observation, self._param_1)
>             output = transform_state_to_output(self._state, self._param_2)
>             return output
>     processor = StatefulThing(initial_state=initial_state, param_1=1,
>                               param_2=4)
>     processed_things = [processor.update_and_get_output(x) for x in x_gen]
> I've done this many times.  Video encoding, robot controllers, neural
> networks, any iterative machine learning algorithm, and probably lots of
> things I don't know about - they all tend to have this general form.
> And how many times have I had issues like "Oh no now I want to change
> param_1 on the fly instead of just setting it on initialization, I guess I
> have to refactor all usages of this class to pass param_1 into
> update_and_get_output instead of __init__".
> What if instead I could just write:
>     def update_and_get_output(last_state, new_observation, param_1,
>                               param_2):
>         new_state = do_some_state_update(last_state, new_observation,
>                                          param_1)
>         output = transform_state_to_output(new_state, param_2)
>         return new_state, output
>     processed_things = [state, output:= update_and_get_output(state, x,
>         param_1=1, param_2=4) -> output for x in observations from
>         state=initial_state]
> Now we have:
> - No mutable objects (which cuts off a whole slew of potential bugs and
> anti-patterns familiar to people who do OOP.)
> - Fewer lines of code
> - Looser assumptions on usage and less refactoring. (if I want to now pass
> in param_1 at each iteration instead of just initialization, I need to make
> no changes to update_and_get_output).
> - No need for state getters/setters, since state is passed around
> explicitly.
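
For comparison, the state-carrying pattern described above can be written in
current Python with a small generator function. This is only a sketch: the
helper name `scan_outputs` and the toy body of `update_and_get_output` are
assumptions made for illustration, not part of the proposal.

```python
def scan_outputs(step, observations, initial_state):
    """Yield one output per observation while threading state through `step`.

    `step(state, x)` must return a `(new_state, output)` pair.
    """
    state = initial_state
    for x in observations:
        state, output = step(state, x)
        yield output


# Toy stand-in for update_and_get_output (illustrative only): the state is a
# weighted running sum and the output is the new state scaled by param_2.
def update_and_get_output(last_state, new_observation, param_1, param_2):
    new_state = last_state + param_1 * new_observation
    output = new_state * param_2
    return new_state, output


processed_things = list(
    scan_outputs(
        lambda state, x: update_and_get_output(state, x, param_1=1, param_2=4),
        [1, 2, 3],
        initial_state=0,
    )
)
```

The state stays explicit and no object is mutated, at the cost of one named
helper rather than new syntax.
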
> I realize that calling for changes to syntax is a lot to ask - but I still
> believe that the main objections to this syntax would also have been raised
> as objections to the now-ubiquitous list-comprehensions - they seem hostile
> and alien-looking at first, but very lovable once you get used to them.
> On Sun, Apr 8, 2018 at 1:41 PM, Kyle Lahnakoski <klahnakoski at mozilla.com>
> wrote:
>> On 2018-04-05 21:18, Steven D'Aprano wrote:
>> > (I don't understand why so many people have such an aversion to writing
>> > functions and seek to eliminate them from their code.)
>> >
>> I think I am one of those people that have an aversion to writing
>> functions!
>> I hope you do not mind that I attempt to explain my aversion here. I
>> want to clarify my thoughts on this, and maybe others will find
>> something useful in this explanation, maybe someone has wise words for
>> me. I think this is relevant to python-ideas because someone with this
>> aversion will make different language suggestions than those that don't.
>> Here is why I have an aversion to writing functions: Every unread
>> function represents multiple unknowns in the code. Every function adds
>> to code complexity by mapping an inaccurate name to specific
>> functionality.
>> When I read code, this is what I see:
>> >    x = you_will_never_guess_how_corner_cases_are_handled(a, b, c)
>> >    y = you_dont_know_I_throw_a_BaseException_when_I_do_not_like_your_arguments(j, k, l)
>> Not everyone sees code this way: I see people read method calls, make a
>> number of wild assumptions about how those methods work, AND THEY ARE
>> CORRECT!  How do they do it!?  It is as if there is some unspoken
>> convention about how code should work that's opaque to me.
>> For example before I read the docs on
>> itertools.accumulate(list_of_length_N, func), here are the unknowns I see:
>> * Does it return N, or N-1 values?
>> * How are initial conditions handled?
>> * Must `func` perform the initialization by accepting just one
>> parameter, and accumulate with more-than-one parameter?
>> * If `func` is a binary function, and `accumulate` returns N values,
>> what's the Nth value?
>> * If `func` is a non-commutative binary function, in what order are the
>> arguments passed?
>> * Maybe accumulate expects func(*args)?
>> * Is there a window size? Is it equal to the number of arguments of
>> `func`?
>> These are not all answered by reading the docs; they are answered by
>> reading the code. The code tells me the first value is a special case;
>> the first parameter of `func` is the accumulated `total`; `func` is
>> applied in order; and an iterator is returned.  Despite all my
>> questions, notice I missed asking what `accumulate` returns? It is the
>> unknown unknowns that get me most.
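
Several of the questions in the list above can also be settled with a short
experiment instead of reading the source. A minimal sketch, with arbitrary
input values chosen for illustration:

```python
from itertools import accumulate
import operator

values = [1, 2, 3, 4]

# N inputs give N outputs; the first output is the first input unchanged.
running_sums = list(accumulate(values))

# With a non-commutative function the call order is func(total, element).
running_diffs = list(accumulate(values, operator.sub))

# The return value is a lazy iterator, not a list.
result = accumulate(values)
is_iterator = iter(result) is result
```

Here `running_sums` is `[1, 3, 6, 10]` and `running_diffs` is
`[1, -1, -4, -8]`, which shows that the accumulated total is passed as the
first argument.
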
>> So, `itertools.accumulate` is a kinda-inaccurate name given to a
>> specific functionality: Not a problem on its own, and even delightfully
>> useful if I need it often.
>> What if I am in a domain where I see `accumulate` only a few times a
>> year? Or how about a program that uses `accumulate` in only one place?
>> For me, I must (re)read the `accumulate` source (or run the caller
>> through the debugger) before I know what the code is doing. In these
>> cases I advocate for in-lining the function code to remove these
>> unknowns. Instead of an inaccurate name, there is explicit code. If we
>> are lucky, that explicit code follows idioms that make the increased
>> verbosity easier to read.
>> Consider Serhiy Storchaka's elegant solution, which I reformatted for
>> readability:
>> > smooth_signal = [
>> >     average
>> >     for average in [0]
>> >     for x in signal
>> >     for average in [(1-decay)*average + decay*x]
>> > ]
>> We see the initial conditions, we see the primary function, we see how
>> the accumulation happens, we see the number of returned values, and we
>> see it's a list. It is a compact, easy read, from top to bottom. Yes, we
>> must know `for x in [y]` is an idiom for assignment, but we can reuse
>> that knowledge in all our other list comprehensions.  So, in the
>> specific case of this Reduce-Map thread, I would advocate using the list
>> comprehension.
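
To make the `for x in [y]` idiom concrete, here is the comprehension run
against a plain loop that performs the same updates. The sample `signal` and
`decay` values are assumptions picked so the arithmetic is exact in floating
point.

```python
signal = [1.0, 1.0, 1.0]
decay = 0.5

# Comprehension form: `for average in [...]` acts as an assignment.
smooth_signal = [
    average
    for average in [0]
    for x in signal
    for average in [(1 - decay) * average + decay * x]
]

# Equivalent explicit loop.
expected = []
average = 0
for x in signal:
    average = (1 - decay) * average + decay * x
    expected.append(average)
```

Both forms produce `[0.5, 0.75, 0.875]` for this input.
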
>> In general, all functions introduce non-trivial code debt: This debt is
>> worth it if the function is used enough; but, in single-use or rare-use
>> cases, functions can obfuscate.
>> Thank you for your time.
