[SciPy-user] How to start with SciPy and NumPy

Vicent vginer at gmail.com
Sun Jan 25 07:26:08 EST 2009


On Sun, Jan 25, 2009 at 12:43, David Cournapeau <cournape at gmail.com> wrote:


> Depending on your POV, this may be true. But for many scientific
> usages, an array capability is so fundamental that it has strong
> consequences on all the dependent code (e.g. little scientific code in
> Python will use a list as its core data structure). It is a
> fundamental building block, if you will.


I see...


>
>
> I think the online documentation is organized for people who are
> familiar with those concepts - most people doing numerical
> computations are familiar with at least one of R/MATLAB/IDL/LabVIEW. I am
> not sure we have documentation for people not familiar with those
> concepts - this would certainly be nice.



I understand that the online documentation is not complete, especially while
SciPy's version number is still below 1 (NumPy itself is already past 1.0).

But, yes, an introduction to the benefits of array programming, or something
like that, would be desirable.



>
>
> > (1) Is there any point in maintaining a list and then creating a temporary
> > NumPy array just to perform calculations, and then "copying and pasting"
> > the results back into the list?
> >
>
> Depends on whether you need a list for later computation: a list
> generally takes much more memory if you only care about homogeneous
> items (a numpy array only takes M * N bytes + overhead, where M is the
> number of items and N the number of bytes per item - 4 for a 32-bit
> integer). OTOH, if you keep resizing your data, a list may make
> sense - and lists can be faster than arrays for small sizes.
>
> There is no unique rule, but for computation on a lot of data, numpy
> arrays certainly are a powerful data structure, useful on their own.
>
> > (2) What about lists with different typed items within them?
>
> Numpy arrays - and arrays generally - fundamentally rely on the
> assumption of the same type for every item. A lot of the performance
> of arrays comes from this assumption (it means you can access any item
> randomly without the need to traverse any other item first, etc...).
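
As a rough illustration of the "M * N bytes" point quoted above, here is a minimal sketch comparing the payload of a 32-bit integer array with an estimate for the equivalent list. The item count of 1000 and the variable names are assumptions chosen for the example. The list figure is only an estimate: it counts every int object separately, although CPython shares some small integers.

```python
import sys
import numpy as np

n_items = 1000                                 # illustrative size, not from the thread
values = list(range(n_items))
arr = np.arange(n_items, dtype=np.int32)

# Array payload: number of items times bytes per item (4 for int32).
array_bytes = arr.size * arr.itemsize          # same value as arr.nbytes

# List estimate: the container stores one reference per item, and each
# item is a separate Python int object with its own overhead.
list_bytes = sys.getsizeof(values) + sum(sys.getsizeof(v) for v in values)

print("array payload:", array_bytes, "bytes")
print("list estimate:", list_bytes, "bytes")
```

The exact list figure depends on the Python build and platform, so only the array payload (4000 bytes here) is predictable.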



In my case, I am not expecting to change the type of the items within a
list once they've been entered. Also, I'll have some lists whose
elements will all be of the same type.

But I am also going to have a list of "variables" that can be "float",
"int" or "bool" (in the sense of 0-1 or bit valued), and I want to store,
for each variable (or value) in the list, which type it has.

If I do it using lists, I can get the type of a given element in the list by
doing something like this:

>>> c = [1.5, 2, True]   # hypothetical example list with mixed types
>>> type(c[1])
<type 'int'>

If I used NumPy arrays, then every value would be stored as "float" (I
guess), and then an extra field would be necessary in order to store and get
the actual type for each variable.
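
The guess above can be checked directly. This is a small sketch with a made-up mixed list `c` (the values are assumptions for illustration). It shows that the list keeps a type per element, while the array built from it has a single dtype for all elements.

```python
import numpy as np

c = [1.5, 2, True]                        # hypothetical mixed-type list

# The list remembers the type of each element individually.
list_types = [type(x).__name__ for x in c]
print(list_types)                         # ['float', 'int', 'bool']

# The array has one dtype for all items; int and bool are promoted to float.
a = np.array(c)
print(a.dtype)                            # float64
print(a.tolist())                         # [1.5, 2.0, 1.0]
```

So the original per-element type is indeed lost in the array and would have to be stored separately.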

I mean, I would have a "variable" class which would contain "value" and
"type" as properties (among others), and then I would have a NumPy array of
"variable" objects.


Stop! [I am thinking...]

Anyway, I'll have a "variable" object, because I need to store some
information for each variable; that doesn't depend on whether I use lists or
arrays to store the "variables".

So, within each "variable" element in the NumPy array, the "value" property
for that "variable" can contain an integer value, or a boolean value, etc.
Having "different types of elements" doesn't matter, because all of them are
"wrapped" in the "variable" structure.
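
The wrapping idea might look like the sketch below. The `Variable` class, its attribute names and the sample values are all hypothetical, chosen only for illustration. Note that a NumPy array of such objects (dtype=object) only stores references, much as a list does, so one possible alternative is to keep the values in a plain float array and the types alongside it.

```python
import numpy as np

class Variable:
    """Hypothetical wrapper: a value plus the type it is meant to have."""
    def __init__(self, name, value, kind):
        self.name = name
        self.kind = kind            # float, int or bool
        self.value = kind(value)    # coerce once, when the value is entered

variables = [
    Variable("x", 1.5, float),
    Variable("n", 2, int),
    Variable("flag", 1, bool),
]

# Object array: holds references to the Variable instances, like a list.
obj_arr = np.array(variables, dtype=object)

# Alternative layout: values in a homogeneous float array for the
# calculations, with the intended types kept in a parallel list.
values = np.array([v.value for v in variables], dtype=float)
kinds = [v.kind for v in variables]

print(obj_arr.dtype, obj_arr.shape)
print(values, [k.__name__ for k in kinds])
```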


Anyway, from your answer, I see that the point is "How large are the
lists/arrays I am planning to use/need?", isn't it?

Which approach would be better (lists or arrays)? I guess it depends on the
size of the set of variables... I am thinking of not many variables,
maybe from 10 to 100, at this point in my research.
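
For sizes in that range, one way to decide is simply to measure. The sketch below times a sum over a list and over an array of the same length. The helper name `time_sum`, the sizes and the repeat count are assumptions, and the timings depend on the machine, so no particular outcome is claimed here.

```python
import timeit
import numpy as np

def time_sum(n, repeats=2000):
    """Return (list_time, array_time) in seconds for summing n floats."""
    data_list = [float(i) for i in range(n)]
    data_arr = np.array(data_list)
    t_list = timeit.timeit(lambda: sum(data_list), number=repeats)
    t_arr = timeit.timeit(lambda: data_arr.sum(), number=repeats)
    return t_list, t_arr

# 10 and 100 match the sizes mentioned above; 10000 is added for contrast.
for n in (10, 100, 10000):
    t_list, t_arr = time_sum(n)
    print("n=%d  list: %.6f s  array: %.6f s" % (n, t_list, t_arr))
```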


--
Vicent