Python's simplicity philosophy

BW Glitch bwglitch at hotpop.com
Thu Nov 27 17:35:42 EST 2003


Douglas Alan wrote:

> Alex Martelli <aleax at aleax.it> writes:
> 
> 
>>>That would have a larger big O time growth value -- most likely
>>>O(n * log n) vs. O(n), for reasonable implementations.  And while I
> 
> 
>>If the sequence is carefully randomized, yes.  If the sequence has
>>any semblance of pre-existing order, the timsort is amazingly good
>>at exploiting it, so that in many real-world cases it DOES run as
>>O(N).
> 
> 
> C'mon -- to make robust programs you have to assume the worst-case
> scenario for your data, not the best case.  I certainly don't want to
> write a program that runs quickly most of the time and then for opaque
> reasons slows to a crawl occasionally.  I want it to either run
> quickly all of the time or run really slowly all of the time (so that
> I can then figure out what is wrong and fix it).

In theory, I'd agree with you, Douglas. But IRL, I agree with Alex. If
I have to choose between two algorithms that do almost the same thing,
but one works on a special case (one that is very common in my range of
use) and the other works in general, I'd go with the special case.
There is no compelling reason to get into the trouble of a general
approach if I can do it correctly with a simpler, special-case one.

One example: I once wrote a small program for my graphing calculator
to analyze (rather) simple electrical networks. To make a long story
short, I had two approaches: implement Gaussian elimination "purely"
(with no modifications to take some nasty numerical problems into
account) or implement it with scaled partial pivoting. Sure, partial
pivoting is _much_ better than no pivoting at all, but for the type of
electrical networks I was going to analyze, there was no need for it.
All the fuss about pivoting is to prevent a small pivot value from
wreaking havoc in the calculations. In this *specific* case (electric
networks with no dependent sources), that situation could never arise.

The results? Reliable answers within the specific working range, which
is what I wanted. :D
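For the curious, here is roughly what I mean, as a minimal sketch in
(modern) Python rather than calculator code -- illustrative only, since
it blindly assumes the pivots never become (near-)zero, exactly the
assumption my network matrices satisfied:

```python
def solve_no_pivot(A, b):
    """Solve A x = b by Gaussian elimination *without* pivoting.

    Safe only when no diagonal entry becomes (near-)zero during
    elimination, as in the resistive-network matrices described above.
    """
    n = len(A)
    # Work on copies so the caller's data is untouched.
    A = [row[:] for row in A]
    b = b[:]
    # Forward elimination: zero out everything below the diagonal.
    for k in range(n):
        for i in range(k + 1, n):
            factor = A[i][k] / A[k][k]  # would need a pivot check in general
            for j in range(k, n):
                A[i][j] -= factor * A[k][j]
            b[i] -= factor * b[k]
    # Back substitution.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(A[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (b[i] - s) / A[i][i]
    return x
```

With scaled partial pivoting you would additionally swap in the row with
the largest scaled pivot at each step; for well-behaved matrices the
swap never changes anything, which is why I could skip it.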

>>>wouldn't sweat a factor of 2 for a feature of a RAD or scripting
>>>language, I would be more concerned about moving to a larger big O
>>>value.
> 
>>Me too!  That's why I'd like to make SURE that some benighted soul
>>cannot code:
> 
>>    onebigstring = reduce(str.__add__, lotsofstrings)
> 
> 
> The idea of aiming a language at trying to prevent people from doing
> stupid things is just inane, if you ask me.  It's not just inane,
> it's offensive to my concept of human ability and creativity.  Let
> people make their mistakes, and then let them learn from them.  Make a
> programming language to be a fluid and natural medium for expressing
> their concepts, not a straight-jacket for citing canon in the
> orthodox manner.

It's not about restraining someone from doing something. It's about
making it possible to *read* the "f(.)+" code. Human ability and
creativity are not compromised when restrictions are imposed. In any
case, try programming an MCU (micro-controller unit); Python's
restrictions are nothing compared with what you have to deal with on an
MCU.

> Furthermore, by your argument, we have to get rid of loops, since
> an obvious way of appending strings is:
> 
>   result = ""
>   for s in strings: result += s

By your logic, fly swatters should be banned because shotguns are more
general. :S It's a matter of design decisions: whatever the designer
(in this case, GvR) thinks is better, so be it.

At least in my introductory CS class, one of the things we learned was
that a programming language can be extremely easy to read but very hard
to write, and vice versa. These design decisions *must* be made by
someone, *and* that someone should stick to them. Decisions can be
changed, but only with rather large quantities of caution.
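For what it's worth, both quoted spellings (the reduce() call and the
+= loop) can go quadratic, and the idiomatic escape hatch in either
case is str.join. A quick sketch (using functools.reduce, as it is
spelled in today's Python; back then reduce was a builtin):

```python
import functools
import operator

strings = ["spam"] * 1000

# Builds a brand-new string at every step: O(n**2) in the worst case.
slow = functools.reduce(operator.add, strings, "")

# Computes the final size once and copies each piece once: O(n).
fast = "".join(strings)

assert slow == fast
```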

>>>Your proposed extension to max() and min() has all the same problems.
> 
>>Not at all.  Maybe you have totally misunderstood "my proposed
>>extension"?
> 
> You are correct, sorry -- I misunderstood your proposed extension.
> But max() and min() still have all the same problems as reduce(), and
> so does sum(), since the programmer can provide his own comparison
> and addition operations in a user-defined class, and therefore, he can
> make precisely the same mistakes and abuses with sum(), min(), and
> max() that he can with reduce().

It might still be abused, but not as much as reduce(). Compared with
the alternative (namely, reduce()), it's *much* better, because it
shifts the problem to someone else. We are consenting adults, y'know.

>>>But reasonable programmers don't abuse this generality, and so there
> 
>>So, you're claiming that ALL people who were defending 'reduce' by
>>posting use cases which DID "abuse this generality" are
>>unreasonable?
> 
> In this small regard, at least, yes.  So, here reduce() has granted
> them the opportunity to learn from their mistakes and become better
> programmers.

Then reduce() shouldn't be as general as it is in the first place.

>>>this urge to be stifled.  Don't take my word for it -- ask Paul
>>>Graham.  I believe he was even invited to give the Keynote Address at
>>>a recent PyCon.
> 
>>However, if you agree with Paul Graham's theories on language
>>design, you should be consistent, and use Lisp.
> 
> I don't agree with *everything* that anyone says.  But if there were a
> version of Lisp that were as tuned for scripting as Python is, as
> portable as Python is, came with as many batteries installed, and had
> anywhere near as large a user-base, I probably *would* be using Lisp.
> But there isn't, so I don't.
> 
> And Python suits me fine.  But if it continues to be bloated with a
> large number special-purpose features, rather than a small number of
> general and expressive features, there may come a time when Python
> will no longer suit me.

reduce() ... expressive? LOL. I'll grant you it's (over)general, but
expressive? Hardly. That's not a bad thing in itself, but, as Martelli
noted, it is overgeneralized. Not everyone understands the concept off
the bat (as you _love_ to claim), and not everyone finds it useful.

>>If you consider Python to be preferable, then there must be some
>>point on which you disagree with him.  In my case, I would put
>>"simplicity vs generality" issues as the crux of my own
>>disagreements with Dr. Graham.
> 
> Bloating the language with lots of special-purpose features does not
> match my idea of simplicity.  To the extent that I have succeeded in
> this world, it has always been by understanding how to simplify things
> by moving to a more general model, meaning that I have to memorize and
> understand less.  Memorizing lots of detail is not something I am
> particularly good at, and is one of the reasons why I dislike Perl so
> much.  Who can remember all that stuff in Perl?  Certainly not I.  I
> suppose some people can, but this is why *I* prefer Python -- there is
> much less to remember.

Then you should understand some of the design decisions that were
reached (I said understand, not agree).

What I understand by simplicity is that I should not have to memorize
anything at all if I just read the code. That's one of the things I
absolutely hate about LISP/Scheme, especially when it comes to
debugging the darn code.

> Apparently you would have it so that for every common task you might
> want to do there is one "right" way to do it that you have to
> remember, and the language is packed to the brim with special features
> to support each of these common tasks.  That's not simple!  That's a
> nightmare of memorizing special cases, and if that were the future of
> Python, then that future would be little better than Perl.

As I have said, these are design decisions GvR has reached, or will
have to make in the future.

Now, what's so hard about sum()? Can't you tell what it does just by
reading the function name? The hardest part of _any_ software project
is not writing it, but maintaining it. IIRC (and if the figures aren't
correct, please someone correct me), more than 70% of the total budget
of a complete software project life cycle is spent on maintenance. So
you will find that most companies standardize on readable idioms even
where something could "obviously" be done faster or better another way,
e.g., writing x = 0 instead of the clever x ^= x in C/C++.
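To make that concrete, here are three spellings of the same
computation; only one of them reads aloud as what it does (a sketch,
again using the modern functools spelling of reduce):

```python
import functools
import operator

numbers = [1, 2, 3, 4]

# Spelling 1: an explicit loop -- clear, but verbose.
total_loop = 0
for n in numbers:
    total_loop += n

# Spelling 2: reduce -- you must already know what reduce() does.
total_reduce = functools.reduce(operator.add, numbers)

# Spelling 3: sum -- the name *is* the documentation.
total_sum = sum(numbers)

assert total_loop == total_reduce == total_sum == 10
```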

>>>Just what is it that I don't grasp again?  I think my position is
>>>clear: I have no intention to abuse reduce(), so I don't worry myself
>>>with ways in which I might be tempted to.
> 
>>Yet you want reduce to keep accepting ANY callable that takes two
>>arguments as its first argument, differently from APL's / (which does
>>NOT accept arbitrary functions on its left);
> 
> That's because I believe that there should be little distinction
> between features built into the language and the features that users
> can add to it themselves.  This is one of the primary benefits of
> object-oriented languages -- they allow the user to add new data types
> that are as facile to use as the built-in data types.

_Then_ let them *build* new classes that use sum(), min(), max(), etc.
In an OO approach, that functionality is better suited to a
class/object anyway, *not* to a function.

>>and you claimed that reduce could be removed if add, mul, etc, would
>>accept arbitrary numbers of arguments.  This set of stances is not
>>self-consistent.
> 
> Either solution is fine with me.  I just don't think that addition
> should be placed on a pedestal above other operations. This means that
> you have to remember that addition is different from all the other
> operations, and then when you want to multiply a bunch of numbers
> together, or xor them together, for example, you use a different
> idiom, and if you haven't remembered that addition has been placed on
> this pedestal, you become frustrated when you can't find the
> equivalent of sum() for multiplication or xor in the manual.

Have you ever programmed in assembly? It's worth a look...

(In case someone's wondering, addition is the only arithmetic operation
available on many MPUs/MCUs. Multiplication is heavily expensive even
on those that support it.)
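And for the record, the "equivalent of sum() for multiplication or xor"
that Douglas mentions can be spelled with reduce() plus the operator
module (a sketch, in the modern functools spelling):

```python
import functools
import operator

nums = [3, 5, 7]

product = functools.reduce(operator.mul, nums, 1)  # 3 * 5 * 7
xored = functools.reduce(operator.xor, nums, 0)    # 3 ^ 5 ^ 7

assert product == 105
assert xored == 1  # 3 ^ 5 = 6, then 6 ^ 7 = 1
```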

>>>So, now you *do* want multiple obviously right ways to do the same
>>>thing?
> 
>>sum(sequence) is the obviously right way to sum the numbers that are
>>the items of sequence.  If that maps to add.reduce(sequence), no problem;
>>nobody in their right mind would claim the latter as "the one obvious
>>way", exactly because it IS quite un-obvious.
> 
> It's quite obvious to me.  As is a loop.

Prosecution rests.

>>The point is that the primary meaning of "reduce" is "diminish", and
>>when you're summing (positive:-) numbers you are not diminishing
>>anything whatsoever
> 
> Of course you are: You are reducing a bunch of numbers down to one
> number.

That makes sense if you are in a math-related area. But to a
layperson, it is nonsense.

>>>"summary" or "gist" in addition to addition.  It also can be confusing
>>>by appearing to be just a synonym for "add".  Now people might have
>>>trouble remember what the difference between sum() and add() is.
> 
>>Got any relevant experience teaching Python?  I have plenty and I
>>have never met ANY case of the "trouble" you mention.
> 
> Yes, I taught a seminar on Python, and I didn't feel it necessary to
> teach either sum() or reduce().  I taught loops, and I feel confident
> that by the time a student is ready for sum(), they are ready for
> reduce().

<sarcasm>But why didn't you teach reduce()? If it were so simple, it
would have been a must in the seminar.</sarcasm>

Now, on a more serious note: reduce() is not an easy concept to grasp.
That's why many people don't want it in the language. The obvious
middle ground is to reduce the functionality of reduce().

>>>In Computer Science, however, "reduce" typically only has one meaning
>>>when provided as a function in a language, and programmers might as
>>>well learn that sooner than later.
>>
>>I think you're wrong.  "reduce dimensionality of a multi-dimensional
>>array by 1 by operating along one axis" is one such meaning, but there
>>are many others.  For example, the second Google hit for "reduce
>>function" gives me:
> 
>>http://www.gits.nl/dg/node65.html
> 
> That's a specialized meaning of "reduce" in a specific application
> domain, not a function in a general-purpose programming.
> 
>>where 'reduce' applies to rewriting for multi-dot grammars, and
>>the 5th hit is
> 
>>http://www.dcs.ed.ac.uk/home/stg/NOTES/node31.html
> 
>>which uses a much more complicated generalization:
> 
> It still means the same thing that reduce() typically means. They've
> just generalized it further.  Some language might generalize sum()
> further than you have in Python. That wouldn't mean that it still
> didn't mean the same thing.
> 
>>while http://csdl.computer.org/comp/trans/tp/1993/04/i0364abs.htm
>>deals with "the derivation of general methods for the L/sub 2/
>>approximation of signals by polynomial splines" and defines REDUCE
>>as "prefilter and down-sampler" (which is exactly as I might expect
>>it to be defined in any language dealing mostly with signal
>>processing, of course).
> 
> Again a specialized domain.

?-|

You mention "general-purpose programming" here. The other languages in
which I have written something more than a small code snippet (C/C++,
Java and PHP) lack a reduce()-like function, and the lack hasn't hurt
them. For something like reduce() to be "general", I ~think~ it should
appear in one of the mainstream languages. For example, regexes are
"general" in programming languages either because libraries for them
have been added to existing languages (C/C++), or because they have
been incorporated into rising languages (Perl, PHP).

>>Designing an over-general approach, and "fixing it in the docs" by
>>telling people not to use 90% of the generality they so obviously
>>get, is not a fully satisfactory solution.  Add in the caveats about
>>not using reduce(str.__add__, manystrings), etc, and any reasonable
>>observer would agree that reduce had better be redesigned.
> 
> You have to educate people not to do stupid things with loop and sum
> too.  I can't see this as much of an argument.

After reading the whole message, how do you plan to do *that*? The only
way to do it effectively is by warning the user explicitly *and*
limiting the power of the functionality. As Martelli said, APL and
Numeric do have an equivalent to reduce(), but it's limited to a range
of functions. Doing so ensures that abuse can be contained.

And remember, we are talking about the real world.
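To sketch what such a "contained" reduce() might look like -- this is
hypothetical Python, not APL's or Numeric's actual API, but it captures
the idea of tying reduction to a fixed menu of known-safe operations:

```python
import functools
import operator

# Whitelist of associative operations, each with its identity element.
_ALLOWED = {
    "add": (operator.add, 0),
    "mul": (operator.mul, 1),
    "xor": (operator.xor, 0),
}

def limited_reduce(opname, seq):
    """A reduce() restricted to a fixed menu of operations."""
    try:
        func, identity = _ALLOWED[opname]
    except KeyError:
        raise ValueError("operation %r is not reducible" % opname)
    return functools.reduce(func, seq, identity)

assert limited_reduce("add", [1, 2, 3]) == 6
assert limited_reduce("mul", [2, 3, 4]) == 24
```

Arbitrary two-argument callables are simply not accepted, so the
str.__add__ abuse above cannot even be written.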

>>Again, I commend APL's approach, also seen with more generality in
>>Numeric (in APL you're stuck with the existing operator on the left
>>of + -- in Numeric you can, in theory, write your own ufuncs), as
>>saner.  While not quite as advisable, allowing callables such as
>>operator.add to take multiple arguments would afford a similarly
>>_correctly-limited generality_ effect.  reduce + a zillion warnings
>>about not using most of its potential is just an unsatisfactory
>>combination.
> 
> You hardly need a zillion warnings.  A couple examples will suffice.

I'd rather have the warnings. That's much better than finding myself
saying "How funny, this shouldn't do that..." later. Why? Because you
can't predict what people will actually do, and pretending that most
people will act like you is insane.

Two last things:

1) Do you have any extensive experience with C/C++? (By extensive, I
mean at least a small-to-medium project.) These languages taught me the
value of -Wall. There are way too many bugs lurking in the warnings to
just ignore them.

2) Do you have any experience in the design process?

-- 
Andres Rosado

-----BEGIN TF FAN CODE BLOCK-----
G+++ G1 G2+ BW++++ MW++ BM+ Rid+ Arm-- FR+ FW-
#3 D+ ADA N++ W OQP MUSH- BC- CN++ OM P75
-----END TF FAN CODE BLOCK-----

"Greed and self-interest, eh? Excellent! I discern a protege!"
         -- Starscream to Blackarachnia, "Possession"








