Python advocacy in scientific computation

David Treadwell i.failed.turing.test at gmail.com
Fri Mar 3 22:05:19 EST 2006


On Mar 3, 2006, at 8:33 PM, sturlamolden wrote:


> Michael Tobis skrev:
>
> Being a scientist, I can tell you that you're not getting it right. If
> you speak computer science or business talk, no scientist is going to
> listen. Let's just see how you argue:
>
>
>> These include: source and version control and audit trails for runs,
>> build system management, test specification, deployment testing
>> (across multiple platforms), post-processing analysis, run-time and
>> asynchronous visualization, distributed control and ensemble
>> management.
>>
>
> At this point, no scientist will understand what the heck you are
> talking about any longer. All have stopped reading and are busy doing
> experiments in the laboratory instead. Perhaps it sounds good to a CS
> geek, but not to a busy researcher.
>

Agreed. I'm slowly learning the CS lingo, but then, I've been trying  
to learn the lingo since 1977.


>
> Typically a scientist needs to:
>
> 1. do a lot of experiments
>
> 2. analyse the data from experiments
>
> 3. run a simulation now and then
>

Generally correct, but the way of the physical scientist is becoming  
more #3 and less #1.


>
> Thus, we need something that is "easy to program" and "runs fast
> enough" (and by fast enough we usually mean extremely fast). The tools
> of choice seem to be Fortran for the older professors (you can't teach
> old dogs new tricks) and MATLAB (perhaps combined with plain C) for the
> younger ones (that would e.g. be yours truly).
>

I was very unlucky, as I was in college just as the old computer  
landscape was passing away and the new one was being born.
My first programs were punched on cards in WATFIV Fortran and run on
an IBM 360. Next, I got my Apple ][ and learned BASIC. Then off to  
college for more Fortran 77. The most advanced CS course I've ever  
taken was called "Introduction to Interactive Computing", where I was  
taught that there's more to life than punch cards and a line printer.


> Hiring professional
> programmers is usually futile, as they don't understand the problems
> we are working with. They can't solve problems they don't understand.
>

I wouldn't touch this comment with a 01010 foot pole.


>
> What you really need to address is something very simple:
>
>
>     Why is Python a better Matlab than Matlab?
>

1. Matlab costs $$$. In grad school, I had to buy my own student copy  
of Mathematica for my Mac Plus because there wasn't any research  
money or any access to anything else. IIRC, most math I had to do was  
done with the backs of stacks of computer printouts, a mechanical  
pencil and four pots of black coffee.

2. The Python community makes very sophisticated code available in a  
wide array of areas, from pure number crunching, to symbolic algebra,  
graphics, image processing, databases, communications, and on-and-on.  
And I can customize every bit of it to meet my needs.

3. The people who maintain and write SciPy and NumPy are  
knowledgeable, and helpful, despite having more pressing issues than  
helping me! (Thanks, guys!)

>
>
> The programs we need to write typically fall into one of three
> categories:
>
> 1. simulations
> 2. data analysis
> 3. experiment control and data acquisition
>
> (those are words that scientists do know)
>

Yes, but I write other code, too. I've often said that my favorite  
toy is a great programming language, and Python fits that concept  
perfectly.


> In addition, there are 10 things you should know about scientific
> programming:
>
> 1. Time is money. Time is the only thing that a scientist cannot afford
> to lose. Licensing fees for Matlab are not an issue. If we can spend
> $1,000,000 on specialised equipment we can pay whatever Mathworks or
> Lahey charges as well. However, time spent programming is an issue.
> (As is time spent learning a new language.)
>

True, if you work at a well-funded institution.

I work for myself. Very little money to go around. I don't have  
million-dollar instruments. What I have, I build as inexpensively as  
I can. "Hack" is a word with meanings beyond CS.


> 2. We don't need fancy GUIs. GUI coding is a waste of time we don't
> have. We don't care if Python has fancy GUI frameworks or not.
>

Fancy? No. Usable, most definitely! Without a decent UI, I have a  
hard time using my own code. Plus, if I want to share my ideas with  
anyone, an understandable GUI helps tremendously.


> 3. We do need fancy data plotting and graphing. We do need fancy
> plotting and graphing that is easy to use - as in Matlab or S-PLUS.
>

Yes, I need fancy visualizing tools, too. I have to work a bit to get  
what I want from Python, but I have total control when I get it.


> 4. Anything that has to do with website development or enterprise-class
> production quality control is crap that we don't care about.
>

There's a devil hiding in this statement. The last company I worked  
for was founded by engineers. The company went bankrupt after they  
decided that they knew more about quality programming than the CSs.


> 5. Version control? For each program there is only one developer and
> a single user or a handful of users.
>

Matlab is version controlled by people well-paid to do so. Whatever  
code you write is built upon a Matlab foundation. If it's of poor  
quality, then the results of your program will be, too. Also, I can't  
tell you how many times I've mixed up versions of 30-page programs  
I've written.


> 6. The prototype is the final version. We are not making software for a
> living, we are doing research.
>

I personally find it difficult to stop working on code. There are
always new features, better algorithms, and better interfaces to other
programs I write later.

[snip]


>
> 9. What are algorithms and data structures? Very few of us know how to
> use a data structure more complicated than an array. That is why we like
> Matlab and Fortran so much.
>

My ability to think of data structures was stunted BECAUSE of
Fortran and BASIC. It's very difficult for me to give up my bottom-up  
programming style, even though I write better, clearer and more  
useful code when I write top-down.


>
> 10. We are novice programmers. We are not passionate programmers. We
> take no pride in our work. The easier the hack, the better. We don't care
> if we are doing OOP or not. However, we do hate complicated APIs or APIs
> that look funny. We are used to seeing sin(x) in our calculus textbooks
> and because of that we don't find Math.Sin(x) particularly elegant --
> even though Math.Sin(x) is more OOP and sin(x) clutters the global
> namespace.
>

I agree with your thought. Certainly, the internals of the language  
are beyond me, and structures shouldn't be arcane. But I can have
sin(x) in Python if I want it. Python has taught me a great deal about
OOP. Pascal, C, C++, etc., still mystify me. I can't figure them out  
to save my life. But everything I've tried in Python (so far) has  
made sense to me, even if it took a few days of thought to figure it  
out.
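
For instance, here is a minimal sketch of the sin(x) point, using only
the standard library's math module. The variable names (x, bare,
qualified) are just for illustration; they are not from anyone's real
code.

```python
# A minimal sketch: calling sin(x) the textbook way, standard library only.
import math
from math import sin, pi

# Illustrative input: 30 degrees expressed in radians.
x = pi / 6

# Bare name, as it appears in a calculus textbook.
bare = sin(x)

# Namespaced form, for anyone who prefers a tidy global namespace.
qualified = math.sin(x)

print(bare, qualified)
```

Both spellings call the same function, so the choice is purely a matter
of taste and of how crowded you let your namespace get.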


>
> Now please go ahead and tell me how Python can help me become a better
> scientist. And try to steer clear of the computer science buzzwords
> that don't mean anything to me.
>

My pleasure. Here's my experience:

1. I don't have the money for Matlab.
2. I don't have the skills or time to write every module I might need.
3. I demand a general-purpose toolkit, not a gold-plated screwdriver.
4. I've learned new ways to organize computations because of Python.
5. User groups have given me access to thousands of professional  
scientists and engineers (computer and otherwise) around the world.
6. I love the MPFC jokes.


>
> Thanks!
>
> Sturla Molden
> (neuroscience PhD)
>

Kindest regards,

--David Treadwell
Chemistry PhD
Quintillion Materials Research LLC




