Python-based monads essay part 2

Steven D'Aprano steve+comp.lang.python at pearwood.info
Thu Oct 20 03:42:48 EDT 2016


On Thursday 20 October 2016 04:57, Chris Angelico wrote:

> Short-circuiting depends only on what's known *prior* to that
> evaluation. Dead code elimination removes the notion of time:

Surely not. I understand "dead code" to mean the same as *unreachable* code: 
code which cannot be reached by any path through the program:

if False:
    spam()  # dead code

not merely code that doesn't affect the return result:

x = spam()
x = 999

Calling spam is not dead code, even if the result is immediately thrown away!

In any case, whichever definition you use, for spam() to be dead code in the 
later example, it has to be known to have no side-effects.


> int(input("Enter something:")) * 0 + 5
> 
> This can simply return 5, because (barring an exception being thrown)
> the value entered cannot affect the result. Python does not do this.
> And any system that has optimization flags (PostgreSQL, Pike, etc),
> this would be marked "has side effects", and therefore is not
> considered dead code. But if you (maybe mistakenly) mark input() as a
> pure function, then yes, this could indeed be optimized out.

*Assuming* that the compiler can distinguish between functions with side-
effects and those that aren't, and *assuming* that the compiler knows that 
input() returns something which can be cast to an int with no side-effects or 
errors, then in principle the compiler could optimize that to just 5.

But I don't think that's terribly likely, even for an aggressive optimizing 
compiler. Maybe C, because there's nothing so terrible I wouldn't believe about 
C compilers *wink*

This is not the best example, because its obvious that the call to int() might 
fail, and hence the effect of this code may be to raise ValueError. A better 
example would be:

len(input("Enter something:")) * 0 + 5


Assuming the compiler knows that len() and input() have their standard meaning, 
is it justified to optimized that expression to just 5? I don't think so, 
because the side effect of input() displaying text on the screen and waiting 
for the user to type something is the whole reason for the function to exist.

Regardless of whether your language is imperative, or functional with monads, I 
think that expression has to write text to the screen and wait for the user to 
type something, or the compiler is buggy. But it doesn't have to happen 
straight away! Consider:

L = [0, 1, 2, 3, 4, len(input("A")) * 0 + 5, 6]
print("B")
print(L[5])

Does this program print "A B 5", or "B A 5"? In Python, it prints "A" first, 
but that may not be the case for all languages. My guess is that, like 
declarative, asynchronous, concurrent and parallel programming (did I miss any 
buzzwords?), functional programming doesn't guarantee the order in which code 
is executed.

Actually, neither does imperative programming, if the language supports "call 
by name" or similar parameter passing mechanisms:


def call_by_name(x):
    print("B")
    print(x)

call_by_name(len(input("A")) * 0 + 5)


If x is a thunk, then this too will print "B" first.


>> However, if we think of the I/O interaction (aka *the process*) to be
>> the return value of a function, every bead in the I/O chain counts.
> 
> And that's what I mean about making function purity meaningless. Here,
> look - this is a pure function!
> 
> magic_numbers = [1, 2, 4, 8]
> def do_magic(x):
>     magic_numbers[x % 4] += x / 4
>     return magic_numbers[(x + 1) % 4]
> 
> Since the previous state of magic_numbers can be considered an input,
> and the new state can be considered an output, this is a pure
> function! Right?

No, but you know that :-)

Inputs are explicitly arguments to the function, not external state. Here's how 
we can turn that into real input:

def do_magic(state, x):
    state = calculate_new_state(state)  # work it out for yourself *wink*
    value = state[(x + 1) % 4]
    return state, value

and then use a monad to make the state invisible, so that you actually only 
need to call do_magic(x).

I'm hand-waving here because I don't remember all the fine details from Greg's 
essay. Call it magic if you want, but the compiler can do it, because the 
details of how state is stored is invisible to the caller, the compiler is free 
to do the optimization:


    change the functional but inefficient idiom 
        state = calculate_new_state(state)
    into the imperative but efficient idiom:
        modify_in_place(state)


That's allowed, because (1) nobody and nothing except the do_magic() function 
can access the state; (2) the use of a monad makes this completely transparent 
and mathematically vigorous, which means the compiler can do it without human 
intervention; and (3) we want the program to run in less than a billion 
petabytes of memory. Practicality beats purity.

All joking aside, the whole point of functional programming is to avoid the 
human programmer having to track external state. It's not actually to avoid 
external state: given our existing hardware, such a thing would be (1) 
inefficient and (2) impossible because we're not doing a Platonic Ideal 
computation, we're manipulating bits in RAM with a CPU that has to bang 
trillions of electrons around really fast, generating heat. We just have to 
avoid external state *where the programmer can trip over it* and end up with 
buggy code.

And also where, in practice, you actually need and want it. We have to handle 
input and output, otherwise what's the point of the computation? If your code 
doesn't interact with the outside universe, it might as well not exist.

As I understand it, monads are just a way to put these essential I/O things 
into a mathematically vigorous footing so that the compiler can handle them as 
if they were functionally pure. It doesn't mean they *actually are* 
functionally pure -- obviously if you delete a file or print output to the 
screen or pause for the user to enter data, you're making changes of state to 
the outside world. So what? We don't live in Plato's Ideal World of forms.


> Technical means nothing if you're destroying the meaning of functional
> purity in order to squeeze I/O into it.

I think that's an extreme over-reaction.




-- 
Steven
git gets easier once you get the basic idea that branches are homeomorphic 
endofunctors mapping submanifolds of a Hilbert space.




More information about the Python-list mailing list