[Edu-sig] Why Python?

Wed Apr 14 21:06:58 CEST 2010

I've been adding to this note over the course of the day and it has 
gotten very long. Hopefully it won't bore you too much.

I should start by saying that I'm a Python fan! Rich Enbody and I spent 
a lot of time converting the introductory computing class here at 
Michigan State to Python. Heck, we even wrote a book about it! So, let 
us all assume that I like Python.

However, Python is not the be-all and end-all of programming languages. 
It has any number of flaws, as does any programming language. There is 
no one "good" programming language, and I surely hope that better 
languages come along after Python. Having said that, I think Python 
*right now* is a very good language, one especially well suited to 
beginners. I have heard it described as a best practices language, and 
that is likely right. It combines many good features from existing 
languages and makes them relatively easily accessible to new users. It 
is an evolutionary step forward, but not revolutionary. Furthermore, and 
very importantly, it has a very active user community that helps 
translate basic programming skill into advanced tool use. 
http://pypi.python.org/pypi lists 9574 available packages for Python at 
the moment. Were it not for that, I think Python would be rather 
ordinary. I believe this because if students are going to put effort 
into learning a computer language, which is a daunting task for many, 
then they should get something for it. What they get for Python is the 
big user base and good tools. They can *do stuff* having learned Python, 
whatever their path might be.

So what sucks in Python? I have my own personal peeves list:

- concurrency. Others have commented on this already. Python does not 
run threads in parallel and will not likely do so soon. Python does 
provide threads, but in the standard interpreter (CPython) those threads 
do not execute Python code simultaneously due to the Global Interpreter 
Lock (GIL). You can learn threads, come to understand them, but will not 
see the typical multicore speedup for CPU-bound work in Python. Now, if 
Python calls other processes externally, yes you can get concurrency. 
You can even use external toolsets such as MPI to do a better job. Lest 
you say this is unimportant, I think you should reconsider. The world of 
computing has changed in the last 5 years. You don't get faster cores, 
you just get more of them. The hexacores are already being sold, with 
octocores soon to follow. Teaching early programmers the importance of 
concurrency means that in the future concurrency will be better 
utilized. Saying it isn't important dooms younger programmers to not 
understanding the problem.
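
A minimal sketch of the difference, assuming CPython and Python 3.7 or 
later. The names count_primes, run_in_threads, run_in_processes and 
WORKER_SOURCE are made up for illustration. The threaded version shares 
one interpreter and one GIL; the process version starts separate 
interpreters that the operating system is free to place on different 
cores. Both return the same answers, but only the second can be expected 
to use more than one core for this kind of CPU-bound work.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def count_primes(limit):
    """CPU-bound toy task: count primes below `limit` by trial division."""
    count = 0
    for n in range(2, limit):
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count


def run_in_threads(limits):
    # Threads share one interpreter; in CPython the GIL lets only one
    # of them execute Python bytecode at any moment.
    with ThreadPoolExecutor(max_workers=len(limits)) as pool:
        return list(pool.map(count_primes, limits))


# The same computation, written as a tiny script for a child interpreter.
WORKER_SOURCE = """
import sys
limit = int(sys.argv[1])
print(sum(1 for n in range(2, limit)
          if all(n % d for d in range(2, int(n ** 0.5) + 1))))
"""


def run_in_processes(limits):
    # Each child is a separate Python process with its own GIL.
    # All children are started first, then their output is collected.
    children = [
        subprocess.Popen([sys.executable, "-c", WORKER_SOURCE, str(limit)],
                         stdout=subprocess.PIPE, text=True)
        for limit in limits
    ]
    return [int(child.communicate()[0]) for child in children]
```

Timing the two functions with several larger limits on a multicore 
machine makes a reasonable classroom exercise. The numbers depend on the 
machine, so none are claimed here.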

- automatic type coercion. Python used to have a __coerce__ method but 
it is deprecated. Imagine you have a simple Rational number class, and 
an expression myRational + 1. You write an __add__ method to add two 
rationals, but that won't work with the given expression as the argument 
is an integer. What to do? C++ (and its ilk) would allow you to write a 
conversion (a constructor or operator) from one type to another. When a 
problem such as the above came along, the compiler would look for such a 
conversion, do it for you (converting 1 to 1/1), and now your existing 
method works! Not Python. You are required to do introspection in your 
__add__ method for any type that doesn't match up and convert it by hand 
in your method. Each and every one of them. Very kludgy compared to C++.
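
A minimal sketch of what that hand conversion looks like, assuming 
Python 3. The Rational class here is a toy written for illustration, not 
code from any library (the standard library's fractions.Fraction is the 
real thing).

```python
from math import gcd


class Rational:
    """Toy rational number (hypothetical, for illustration only)."""

    def __init__(self, num, den=1):
        if den == 0:
            raise ZeroDivisionError("denominator must not be zero")
        if den < 0:                      # keep the sign in the numerator
            num, den = -num, -den
        g = gcd(num, den)
        self.num, self.den = num // g, den // g

    def __repr__(self):
        return "Rational(%d, %d)" % (self.num, self.den)

    def __add__(self, other):
        # Python will not convert the int for us, so inspect the type
        # and do the conversion (1 becomes 1/1) by hand.
        if isinstance(other, int):
            other = Rational(other)
        if not isinstance(other, Rational):
            return NotImplemented        # tells Python we cannot handle it
        return Rational(self.num * other.den + other.num * self.den,
                        self.den * other.den)


my_rational = Rational(1, 2)
print(my_rational + 1)                   # Rational(3, 2)
```

Every other operator (__sub__, __mul__, and so on) needs the same 
isinstance check repeated, which is the complaint.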

- radd, iadd and the like. This really sucks. Consider the same 
situation, Rational number class, only the expression is 1 + myRational. 
Your __add__ method cannot handle such a call (Python first looks for an 
appropriate method in the integer class, which knows nothing about 
Rational). If it were C++, again automatic type conversion would come to 
the rescue. Instead, you have to write another method, __radd__, which 
is called when your object is the right-hand operand. Same with 
myRational += 1, there is an __iadd__ method (though thankfully they 
patched that recently). Very kludgy indeed.
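
A minimal sketch of the extra method, assuming Python 3. The Rational 
class is again a toy written for illustration, kept short by skipping 
normalization and zero checks.

```python
class Rational:
    """Toy rational number (hypothetical, for illustration only)."""

    def __init__(self, num, den=1):
        self.num, self.den = num, den    # no normalization, to stay short

    def __eq__(self, other):
        if isinstance(other, int):
            other = Rational(other)
        if not isinstance(other, Rational):
            return NotImplemented
        # Cross-multiply so that 1/2 and 2/4 compare equal.
        return self.num * other.den == other.num * self.den

    def __add__(self, other):
        if isinstance(other, int):
            other = Rational(other)
        if not isinstance(other, Rational):
            return NotImplemented
        return Rational(self.num * other.den + other.num * self.den,
                        self.den * other.den)

    # 1 + r: int.__add__(1, r) returns NotImplemented, so Python then
    # tries r.__radd__(1).  Addition is commutative, so reuse __add__.
    __radd__ = __add__

    # r += 1 uses __iadd__ if one is defined; with no __iadd__, Python
    # falls back to r = r.__add__(1).


r = Rational(1, 2)
assert 1 + r == Rational(3, 2)           # handled by __radd__
r += 1                                   # handled by the __add__ fallback
assert r == Rational(3, 2)
```

For a non-commutative operation such as subtraction, __rsub__ cannot 
simply alias __sub__; it has to compute other - self.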

- list comprehensions. I love list comprehensions but hate the syntax. 
It is overly confusing for a beginner. Other languages, for example 
Common Lisp, have a much more elegant and consistent solution:

    [x**2 for x in range(10) if x%2==0]
vs
    for x in range(10) collect x**2 when x%2==0

A list comprehension is really an extension of the for iterator, so why 
not work with that syntax? Now we have three potential meanings for 
something in [ ]: a list, an index and a list comprehension. Sure it 
works, and you can even point me to the math foundation for it, but it 
is uselessly complicated for a language structured for a beginner. 
Extending the for iterator would have been the way to go IMHO.
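
For readers newer to the syntax, a minimal sketch (Python 3, variable 
names made up for illustration) of the comprehension next to the plain 
for loop it abbreviates, followed by the three bracket meanings:

```python
# Comprehension form: expression first, then the loop, then the filter.
squares = [x**2 for x in range(10) if x % 2 == 0]

# The same computation as an ordinary for loop.
squares_loop = []
for x in range(10):
    if x % 2 == 0:
        squares_loop.append(x**2)

print(squares)                  # [0, 4, 16, 36, 64]
print(squares == squares_loop)  # True

# Three different jobs for square brackets:
literal = [0, 4, 16]                    # a list
third = squares[2]                      # an index, here 16
doubled = [2 * s for s in literal]      # a list comprehension
```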

- open source and language changes. I'm a big fan of open source, but it 
is often a two-edged sword. When Python started, it generated such 
enthusiasm and resulted in so many packages, which was wonderful. Then, 
the language guys came along and broke Python for the 3.x series, making 
a "better" language. Well, that's nice and all, but the difference 
between writing a package the first time and fixing a package for a 
bunch of language changes is pretty big. Lots of open source folks are 
happy to do work for kudos and praise (the first time) but no one wants 
to do the nasty work of a language upgrade. Who knows how long till 
numpy gets upgraded to 3.x, and it must happen for many other packages, 
and so on and so forth. Open source is a great starter but it has its 
troubles w.r.t. long-term maintenance and changes. Till then, we are 
stuck with 2.x if we want to work with all those nice packages. Worse, 
people developing for 3.x may not wait and may write their own numpy (or 
whatever package), causing a package split. Now there are two, with two 
groups and all the hassle that goes with it. Forking is another feature 
of open source (look at all the Linux distros).

I'll cut it there. Languages are tools. If you were a carpenter, you 
might learn how to use a hammer first but then move on to other tools as 
your problems change. You cannot solve all your problems with a hammer, 
which is why other tools exist. Same with languages. Each has its 
advantages and its flaws. Python is a great way to start, but don't think 
it is the only way to go.

        >>>bill<<<

David MacQuigg wrote:
> kirby urner wrote:
>>> On Tue, Apr 13, 2010 at 10:30:49AM -0700, David MacQuigg wrote:
>>>     
>>
>>>> That's not to say the 1% is unimportant.  Here we will find brilliant
>>>> programmers working on sophisticated techniques to break large problems
>>>> into pieces that can be executed concurrently by hundreds of processors.
>>>> Each problem is very different, and we may find programs for circuit
>>>> simulation using very different techniques than programs for weather
>>>> prediction.  These programs will be run in an environment controlled by
>>>> Python.  The circuit designer or atmospheric scientist will not be
>>>> concerned about the details of concurrency, as long as the result is fast
>>>> and accurate.
>>>>       
>>
>> There's a school of thought out there that says operating systems are
>> supposed to handle concurrency pretty well.  Some designs boot
>> multiple Pythons as multiple processes and let the OS take care of
>> concurrency issues.  Why reinvent the wheel and rely on internal
>> threading?  Instead of multiple threads within Python, make each
>> Python its own thread.
>>
>> These concurrently running Pythons are then trained to communicate in
>> a loosely coupled fashion by reading and writing to centralized SQL
>> tables, which double as audit trails.
>>
>> If a process dies, there might be a need to kill a process (zombie snake
>> problem), and maybe the controlling process (like air traffic control)
>> launches a new one -- depends on many factors.
>>   
>
> As long as the coupling is loose, this kind of "concurrency" is easily 
> handled by Python (either with threads, or separate processes).  The 
> more challenging concurrency problems involve tight coupling, tighter 
> than can be achieved with inter-process communications.   Think of a 
> large system of equations, with 1000 outputs depending on 1000 input 
> variables.
>
> The challenge is in partitioning that problem into smaller problems 
> that can be solved on separate processors, with manageable 
> requirements relating to interprocess communication.  Strategies tend 
> to be domain-specific.  If the domain is circuit design, the equations 
> follow the modularity of the design, e.g. a collection of subcircuits, 
> each with only one input and one output.  That strategy won't work in 
> other domains, so we have no partition() function that will do it 
> automatically, and (in my opinion) not much chance that some new 
> language will come to the rescue.  For problems that really need 
> partitioning, we need programmers that understand the problem domain.
>
> The question for teachers using Python is - Will there be some future 
> "concurrency" language that is:
> 1) So different that its features can't be unobtrusively merged into 
> Python (not burdening those who don't need it).
> 2) So easy that it will replace Python as a general-purpose language, 
> even for those 99% that don't need concurrency.
>
> I'll keep an open mind.  Meanwhile, I've got to make a choice for 
> PyKata.  We need to be ready for next semester.