Flat-schmat! [was Re: Python syntax in Lisp and Scheme]

Sun Oct 5 15:56:13 EDT 2003

Kenny Tilton
> Not?:
>
> (defun quadratic (a b c)
>    (let ((rad (sqrt (- (* b b) (* 4 a c)))))
>      (mapcar (lambda (plus-or-minus)
>                (/ (funcall plus-or-minus (- b) rad) (+ a a)))
>        '(+ -))))
>
> :)

Indeed no, I don't suspect most people write it this way.

Also, as Robin Becker pointed out in this branch, the above
equation fails numerically for several cases, when intermediate
terms are too large for floats or when there's a b*b is very
much greater than 4*a*c.

> Well it was a bad example because it does require two similar
> calculations which can be done /faster/ by pre-computing shared
> components.

It was a bad example because it can have two return values.

> But then the flattening is about performance

That's not relevant.  That's why I said "possibly."

> and the subject is whether deeply nested forms are in fact
> simpler than flattened sequences where the algorithm
> itself would be drawn as a tree.

You said your enlightenment came from analogy to
math, where writing everything as one expression is
better than writing things in different parts.  I objected,
saying that math also uses ways to remove deep
hierarchies, including ways to flatten those equations.

It is not changing the subject.  It is a rebuttal to one of
your justifications and an attempt at doing a comparison
with other fields, to bring in my view that there is a
balance between flat and deep, and the choice of
where the different tradeoffs are made is the essense
of the Zen of Python ... or of any other field of endevour.

> No! Those are like subroutines; they do not flatten, they create call
> trees, hiding and encapsulating the details of subcomputations.

But didn't I do that when I said "root = sqrt(b*b-4*a*c)"?
Why isn't that "like [a] subroutine"?

> When any local computation gets too long, there is
> probably a subroutine to be carved out, or at least I can take 10 lines
> and give it a nice readable name so I can avoid confronting too much
> detail at any one time.

There's also common subexpressions like my 'root' which
aren't appropriate to compute as a function; that is, it's only
used twice and writing discriminate(a, b, c) is too much hiding
given what it does.  But it is one where it may make sense to
use a temporary variable because it makes the expression more
readable .. at least by those who prefer reading flatter equations
than you do.  Again, it's that balance thing.

I must agree that the examples weren't the best I could do.
The problem is that books have built up the nomenclature
during the preceeding pages, so don't do the "let X be ..."
that you would see elsewhere.  The people who do that
the most that I've seen are the fluid dynamics modellers,
so let me quote from a not atypical example I dug up

http://citeseer.nj.nec.com/cache/papers/cs/8949/http:zSzzSzcronos.rutgers.ed
uzSz~knightzSzpaperszSzaiaa96-0040.pdf/computation-of-d-asymmetric.pdf

===============================
   Nomenclature

M_infinity     Mach Number
M_t            Turbulent Mach Number
Re_delta       Reynolds Number, Based on
                 Incoming Boundary Layer Thickness
delta_infinity Incoming Boundary Layer Thickness
delta^*        Boundary Layer Displacement Thickness
   ...
The Reynolds-average equations for conservation of mass,
momentum and energy are,
         __            __  ~
  del_t  rho   + del_k rho u_k = 0
      __  ~            __  ~     ~
del_t rho u_i  + del_k rho u_i * u_k =
            _                 ,,  ''   ___
    - del_i p  + del_k (-rho u_i u_k + tau_{ik})

 ...
where del_t = del/del t, del_k = del / del x_k and the
Einstein summantion convention is employed.  The overbar
denotes a conventional Reynolds average, while
the overtilde is used to denote the Favre mass
average.  A double superscript '' represents
flucuations with respect to the Favre average, while
a single superscript ' stands for fluctuations
respect to the Reynolds average.
                        ___                       ~
In the above equations, rho  is the mean density, u_i
                              _
is the mass-averaged velocty, p is the mean pressure
    _
and e is the mass-averaged total enery per unit mass.
The following relations are employed to evaluate
_     _
p and e:
         _   ___   _
         p = rho R T
     _     _      ~ ~
     e = c_v T + (1/2) u_i u_i + k

where k is the mass-averaged turbulence kinetic
energy
                 _____________
   ___                  ,,  ,,
   rho k = (1/2) rho u_i u_i

===============================

All those terms, definitions, and relationships are
needed to understand the equations.  Some of these are
external variables, others are simple substitutions
      _          ___   _
(like p which is rho R T ) and some require multiple
                    _
substitutions, like e, which uses k, which is based
on another function).

This could all be written as one set of expressions,
                                         _     _
without any of the substituted variables p and e,
but it wouldn't be done because that makes the structure
of the equation less obvious and overly verbose.

> But I don't throw away the structure of the
> problem to get to simplicity.

Who said anything about throwing it away?  Again,
you interpreted things in an extreme view never
advocated by anyone here, much less me.

The questions are, when does the structure get too
complicated and what substructures are present which
can be used to simplify the overall structure without
losing understanding?  And how can it be presented so
that others can more easily understand what's going on?

> Hmmm, Zen constrained by the details of a computing language. Some
> philosophy! :) What I see in "flat is better" is the mind imposing
> preferred structure on an algorithm which has its own structure
> independent of any particular observer/mind.

But it's well known there are different but equivalent ways
to approach the same problem.  For example, in mechanics you
can use a Newtownian approach or a Lagrangian one and you'll
end up with the same answer even through the approaches are quite
different in style.  In quantum mechanics, Schrodinger's wave
equation is identical to Heisenberg's matrix formulation.
In analysis, you can use measure theory or infinitesimals to
solve the same problem.  Or as a simpler example, you can use
geometry to solve some problems easier than you can algebra.
(I recall a very nice, concise geometic proof of the Pythagorean
theorem.)  Consider Ramanujan, who came up with very different
and powerful ways to think about problems, including alternate
ways to handle existing proofs.  In computer science, all
recursive solutions can be made iterative, but there are times
when the recursive solution is easier to think about.  We
know that NP-hard problems are identically hard (up to a
polynomial scaling factor) but using an algorithm for solving
a minimax problem might be more appropriate for a given task
than one meant for subgraph isomorphism, even if the two
solutions are equivalent.

By solving the problem using one of these alternatives, then
yes, your mind would impose a preferred structure to the solution.
So what?  Sapir-Whorf is wrong and we can come up with new
new language and structures.  It's hard to come up with new,
generally powerful solutions, but possible.

> I am getting excellent results lately by always striving to conform my
> code to the structure of the problem as it exists independently of me.
> How can I know the structure independently of my knowing? I cannot, but
> the problem will tell me if I screw up and maybe even suggest how I went
> wrong.

Suppose you needed to find a given number in an unsorted list
of numbers.  You know a priori that the number is in the list.
What algorithm do you use?  My guess is you use a linear search,
which is O(N).

However, if you had a general purpose quantum computer available
to you then you could use Grover's algorithm and solve the problem
in O(sqrt(N)) time.

Did you even consider that alternative?  Likely no, since
quantum algorithms still have little real life applicability.
(I think the biggest search so far has been in a list of 5
elements.)

If you didn't, then you must agree that your decision of how
to think of the problem and describe it on a computer is
constrained (not limited! - constraints can be broken) by
your training and the tools you have available to you.

Even if you did agree, can you prove there are no other
alternative solutions to your problem which are equally
graceful?  If you say yes, I'm sure a Forth fan would disagree.

> Python (I gather from what I read here) /deliberately/ interferes in my
> attempts to conform my code to the problem at hand, because the
> designers have decreed "flat is better". Python rips a tool from my
> hands without asking if, in some cases (I would say most) it might be
> the right tool (where an algorithm has a tree-like structure).

That's a false and deliberately antagonistic statement.  Python
doesn't constrain you to squat - you're free to use any other
language you want to use.  Python is another tool available to you,
and adding it to the universe of tools doesn't take any other one
away from you.

A programming language must balance between many and often
contrary factors.  The fact that you might be good at deeply
hierarchical structures does not mean that others share your
abilities or preferences (and studies have shown that people
in general are not good at deep hierarchies - look at how
flat most user directories are) nor does it mean that every
programming language must support your favored way of thinking.

One of the balances is the expressability and power
available to a single user vs. the synergies available to
a group of people with different skills who have a consistent
worldview -- even if imposed from the outside.  Ramanujan
made up his own notation for math, which has taken a lot
of time for others decode and so helping to reduce the impact
of his work on the rest of the world.  Your viewpoint is
that there is no tradeoff, or rather that the balance point
is not much different than full single-person expressability.
Others here, including me, disagree.

In any case, your argument that mathematics equations can be
used as justification for prefering a deeply hierarchical
description has no basis in how people actually use and
express equations.  If there was then FORTRAN would have
looked a lot different.

                    Andrew Dalke
                    dalke at dalkescientific.com