[SciPy-user] How to start with SciPy and NumPy

Vicent vginer at gmail.com
Sun Jan 25 06:17:33 EST 2009


On Sat, Jan 24, 2009 at 21:08, David Baddeley
<david_baddeley at yahoo.com.au> wrote:

> Hi Vincent,
>
> if you're new to both python and numerical programming I'd suggest you make
> yourself familiar with basic python first and then move on to the numerical
> stuff - it'll probably be easier that way.


Thank you for the advice.



> To answer your question, there are two main ways in which Numpy and Scipy
> help with numeric programming. The first (and simplest) of these is by
> providing lots of pre-rolled algorithms to do useful things (e.g. computing
> Bessel functions, Fourier transforms, and much more).


Yes, I realize that. In that respect, NumPy+SciPy are like any other
Python package, for me. Whenever I need something specific, I check whether
a package for it already exists.



> The second, and arguably more important (at least when it comes to
> performance) is to facilitate vectorisation, which is best illustrated with
> an example.

[...]
>
> The equivalent code using numpy/scipy would be:
>
> x = numpy.arange(0, 2*pi, 0.1)
> y = numpy.sin(x)
>
>
> This is much closer to the underlying maths, making it quicker to program
> and more readable, and also much faster. The reason for the speed increase
> is that Python is an interpreted language and the for loops above are slow.
> Numpy effectively executes these under the hood in compiled C code, which is
> much faster. An equally important factor is the cost of allocating and
> navigating the Python lists used for storage in the Python example: as each
> data point is processed, new memory needs to be allocated, which is highly
> unlikely to be contiguous with the original.
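
A minimal sketch of the comparison described above. The loop version was
elided from the quoted message, so the one below is only an assumed
reconstruction, and the names xs_loop, ys_loop, step and n are illustrative:

```python
import math
import numpy as np

# Pure-Python version: build the samples one element at a time.
step = 0.1
n = int(math.ceil(2 * math.pi / step))   # number of samples in [0, 2*pi)
xs_loop = []
ys_loop = []
for i in range(n):
    xi = i * step
    xs_loop.append(xi)
    ys_loop.append(math.sin(xi))

# Vectorised version, as in the quoted message: no explicit loop.
x = np.arange(0, 2 * np.pi, 0.1)
y = np.sin(x)
```

Both versions should produce the same samples; the NumPy one stores them in
a single contiguous block and runs the loop in compiled code.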


I understand this advantage. Sorry if this was already explained in the
online documentation, but I was not able to find it...

So, let me ask, in order to know if I have understood it well:

Any time I want to perform a task over all the elements of a list, and those
elements are all of the same type, it is better to store the data in a NumPy
array instead of a list. Is that right?

I have some questions related to this topic:

(1) Is there any point in maintaining a list, creating a temporary NumPy
array just to perform calculations, and then copying the results back into
the list?

I mean something similar to what I do, for example, with lists and sets: I
have a list, because I'm interested in order, but then I build a set based on
that list, just because I know it is faster to look up an element in a set
(isn't it??). Later, I "kill" the set, when it is no longer useful.

>>> c = [1, 2, 3, 1, 1, 2, "a"]
>>> type(c)
<type 'list'>
>>> d = set(c)
>>> type(d)
<type 'set'>
>>> d
set(['a', 1, 2, 3])
>>> "a" in d
True
>>> del d
>>> d
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'd' is not defined
>>> c
[1, 2, 3, 1, 1, 2, 'a']
>>>


(2) What about lists with different typed items within them?

(3) Can I perform operations over only those elements (scalars) of a given
array that meet some given condition?? For example, in your previous
example, "compute the sine only for those elements which are multiples of
pi/4 (or whatever)".
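
A minimal sketch touching on the three questions above, under stated
assumptions: all variable names are illustrative, and whether the round trip
in (1) pays off depends on the size of the list and the amount of
computation, since the conversion itself takes time.

```python
import numpy as np

# (1) Keep a list, borrow an array for the calculation, convert back.
values = [0.0, 1.0, 4.0, 9.0]
arr = np.asarray(values)               # builds a new array from the list
roots = (2 * np.sqrt(arr)).tolist()    # compute, then back to a plain list
del arr                                # the temporary array is no longer needed

# (2) Mixed-type lists: NumPy coerces everything to one common type.
mixed = [1, 2.5, "a"]
as_strings = np.array(mixed)                # every item becomes a string
as_objects = np.array(mixed, dtype=object)  # items stay Python objects

# (3) Apply an operation only where a condition holds (boolean mask).
k = np.arange(17)            # integer sample index
x = k * np.pi / 8            # 0, pi/8, 2*pi/8, ..., 2*pi
mask = (k % 2 == 0)          # True where x is a multiple of pi/4
y = x.copy()
y[mask] = np.sin(x[mask])    # sine only on the selected elements
```

In (2), an object array usually gives up most of the speed advantage, so it
is mainly a container. In (3), the mask is built from the integer index
because testing floating-point values for exact multiples of pi/4 is fragile
under rounding.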



>
>
> This probably doesn't fully answer your question, but should give you a
> starting point to do a little googling / more reading in the documentation.


Yes, thank you!








On Sat, Jan 24, 2009 at 22:51, Vasileios Gkinis <v.gkinis at gfy.ku.dk> wrote:

>
>
> Dear Vicent,
>
> You could perhaps take a closer look into the documentation section of
> scipy. I believe that many of your questions will be answered this way. I
> would suggest you take a look into the following performance study:
>
> http://www.scipy.org/PerformancePython
>

Thank you, Vas, that's a good example. Now I am starting to understand the
power of using NumPy.



> [...]
>
> With time, though, the complexity and size of the code get larger and
> larger, and that is where one can see the benefits of using the tools
> included in scipy/numpy. Reinventing the wheel is not a smart choice when
> tested and well coded methods are available.
>

OK, I get it...




On Sat, Jan 24, 2009 at 23:32, Gael Varoquaux
<gael.varoquaux at normalesup.org> wrote:

> On Sat, Jan 24, 2009 at 06:31:26PM +0100, Vicent wrote:
> >    For example, what is the difference between "random" from random
> >    module and "random" from numpy.random? Or are they the same?
>
> Well, if you look at the number of distributions included in numpy.random
> and random, this will give you a clue.



OK, but if I just want to generate a (pseudo)random number between 0 and 1
(uniform distribution), a single scalar rather than a vector, does NumPy
implement an improved algorithm for that, different from the algorithm
in standard Python?

[ The reason for this question is that, in the past, I worked with some
pseudorandom number generators in C++, and I had some problems with the
quality of the "randomness" of those numbers (and I had to use more
specialized "random"-packages, etc.). ]
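
A small sketch of the scalar case asked about here. To my knowledge, both
the standard library's random module and the classic numpy.random functions
are based on the Mersenne Twister generator, so a single uniform draw is not
expected to differ in quality; the names u_std, u_np and batch are
illustrative only:

```python
import random
import numpy as np

u_std = random.random()       # one float in [0.0, 1.0) from the standard library
u_np = np.random.random()     # one float in [0.0, 1.0) from NumPy
batch = np.random.random(5)   # NumPy can also fill a whole array in one call
```

The practical difference shows up in the last line: NumPy draws many values
at once, and offers more distributions.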



> In addition to shipping many more
> distributions, numpy.random, just like all of numpy and scipy, works with
> arrays rather than single numbers, which allows you to vectorize part of the
> code (check out
> http://en.wikipedia.org/wiki/Vectorization_(computer_science)
> and
> http://en.wikipedia.org/wiki/Array_programming).
>

> You seem to believe that working with large chunks of numbers organized
> in arrays is useful only for linear algebra, but on the contrary,
> avoiding loops and working on arrays is the basis of a whole category of
> very successful languages such as Matlab or IDL. Many old-school numerical
> developers despise these languages, but they have proven to be effective.



I admit I had no idea about this topic. Thank you for those links!

So, it could be said that NumPy adds array programming capabilities to
Python?
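
A minimal illustration of what "array programming" looks like in practice,
with made-up data: whole arrays are combined in single expressions, and the
element-by-element loop is implied rather than written out.

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0])
b = np.array([10.0, 20.0, 30.0])

total = a + b          # element-wise sum of two arrays
scaled = 2 * a + 1     # the scalars are applied to every element
dot = (a * b).sum()    # element-wise product, then a reduction
```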



> Yes, documentation showing the big picture is missing. The problem is
> that nobody seems to have time to write it. Maybe it is because it
> doesn't bring money, or academic credit in. We all need to survive.


An old problem...


> I reckon from your name that you might speak French. In which case,
> I just happen to have spent time writing a 12-page article trying to
> give the big picture on this problem:
> http://www.gnulinuxmag.com/index.php/2009/01/23/gnulinux-magazine-hs-n°40-janvierfevrier-2009-chez-votre-marchand-de-journaux



Merci, Gaël!

I don't speak French (well, just a little), but I understand it. Anyway, it
seems I can't get an online copy of your article.

[My name "Vicent" is in Valencian, which is a language of Spain. And, yes,
Valencian-Catalan is quite similar to French, in some aspects.]

Thank you to all for your kind answers!

--
Vicent