From robert.kern at gmail.com Sat Apr 1 00:20:00 2006 From: robert.kern at gmail.com (Robert Kern) Date: Sat Apr 1 00:20:00 2006 Subject: [Numpy-discussion] Trac maintenance Message-ID: <442E3770.6030809@gmail.com> I've been doing a bit of maintenance on the Trac instances for numpy and scipy. In particular, I've removed the default "component1" and "milestone2" nonsense and put meaningful values in their place. If you have any requests, or you think my component lists are bogus, enter a ticket, set the component to "Trac" and assign it to rkern. -- Robert Kern robert.kern at gmail.com "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From tim.hochberg at cox.net Sat Apr 1 06:57:17 2006 From: tim.hochberg at cox.net (Tim Hochberg) Date: Sat Apr 1 06:57:17 2006 Subject: [Numpy-discussion] numpy error handling In-Reply-To: <442E2F05.5080809@ieee.org> References: <442DE773.4060104@cox.net> <442E2F05.5080809@ieee.org> Message-ID: <442E94AD.1040200@cox.net> Travis Oliphant wrote: > Tim Hochberg wrote: > >> >> I've just been looking at how numpy handles changing the behaviour >> that is triggered when there are numeric error conditions (overflow, >> underflow, etc.). If I understand it correctly, and that's a big if, >> I don't think I like it nearly as much as what numarray has in >> place. >> >> It appears that numpy uses the two functions, seterr and geterr, to >> set and query the error handling. These set/read a secret variable >> stored in the local scope. > > This approach was decided on after discussions with Guido, who didn't > like the idea of pushing and popping from a global stack. I'm not > sure I'm completely in love with it myself, but it is actually more > flexible than the numarray approach. > > You can get the numarray approach back simply by setting the error in > the builtin scope (instead of in the local scope, which is done by > default). I saw that you could set it at different levels, but missed the implications. However, it's still missing one feature: thread-local storage. I would argue that the __builtin__ data should actually be stored in threading.local() instead of __builtin__. Then you could set up an equivalent stack system to numarray's. > Then, at the end of the function, you can restore it. If it was felt > useful to create a stack to handle this on the builtin level then that > is easily done as well. I've used the numarray error handling stuff for some time. My experience with it has led me to the following conclusions: 1. You don't use it that often. I have about 26 KLOC that's "active" and in that I use pushMode just 15 times. For comparison, I use asarray a tad over 100 times. 2. pushMode and popMode, modulo spelling, is the way to set errors. Once the with statement is around, that will be even better. 3. I, personally, would be very unlikely to use the local and global error handling; I'd just as soon see them go away, particularly if it helps performance, but I won't lobby for it. >> I assume that the various ufuncs then examine that value to determine >> how to handle errors. The secret variable approach is a little >> clunky, but that's not what concerns me. What concerns me is that >> this approach is *only* useful for built-in numpy functions and falls >> down if we call any user-defined functions. >> >> Suppose we want to be warned on underflow. Setting this is as simple as:
>> >> def func(*args): >> numpy.seterr(under='warn') >> # do stuff with args >> return result >> >> Since seterr is local to the function, we don't have to reset the >> error handling at the end, which is convenient. And, this works fine >> if all we are doing is calling numpy functions and methods. However, >> if we are calling a function of our own devising we're out of luck >> since the called function will not inherit the error settings that we >> have set. > > Again, you have control over where you set the "secret" variable > (local, global (module), and builtin). I also don't see how that's > any more clunky than a "secret" stack. In numarray, the stack is in the numarray module itself (actually in the Error object). They base their threading local behaviour off of thread.get_ident, not threading.local. That's not clunky at all, although it's arguably wrong since thread.get_ident can reuse ids from dead threads. In practice it's probably hard to get into trouble doing this, but I still wouldn't emulate it. I think that this was written before thread-local storage, so it was probably the best that could be done. However, if you use threading.local, it will be clunky in a similar sense. You'll be storing data in a global namespace you don't control and you've got to hope that no one stomps on your variable name. When you have local and module level secret storage names as well you're just doing a lot more of that and the chance of collision and confusion goes up from almost zero to very small. > You may set the error in the builtin scope --- in fact it would > probably be trivial to implement a stack based on this and implement the > > pushMode > popMode > > interface of numarray. Yes. Modulo the thread-local issue, I believe that this would indeed be easy. > > But, I think this question does deserve a bit of debate. I don't > think there has been a serious discussion over the method. To help > Tim and others understand what happens: > > When a ufunc is called, a specific variable name is searched for in > the following name-spaces in the following order: > > 1) local > 2) global > 3) builtin > > (There is a bit of an optimization in that when the error mode is the > default mode --- do nothing --- a global flag is set which by-passes the > search for the name). > The first time the variable name is found, the error mode is read from > that variable. This error mode is placed as part of the ufunc loop > object. At the end of each 1-d loop the IEEE error mode flags are > checked (depending on the state of the error mode) and appropriate > action taken. > > By the way, it would not be too difficult to change how the error mode > is set (probably an hour's worth of work). So, concern over > implementation changes should not be a factor right now. > Currently the error mode is read from a variable using standard > scoping rules. It would save the (not insignificant) name-space > lookup time to instead use a global stack (i.e. a Python list) and > just get the error mode from the top of that stack. > >> Thus we have no way to influence the error settings of functions >> downstream from us. > > Of course, there is a way to do this by setting the variable in the > global or builtin scope as I've described above. > What's really the argument here is whether having the flexibility at > the local and global name-spaces is really worth the extra name-lookups > for each ufunc.
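For concreteness, here is a rough pure-Python sketch of the lookup order just described (the variable name below is made up, and the real search happens in C inside the ufunc machinery, so treat this purely as an illustration):

    import sys

    _ERRMODE_NAME = "__numpy_errmode__"   # hypothetical name; the real one differs
    _DEFAULT_MODE = 0                     # 0 means "do nothing": skip all checks

    def _lookup_errmode():
        # Emulate the search order: caller's locals, then globals, then builtins.
        frame = sys._getframe(1)
        for ns in (frame.f_locals, frame.f_globals, frame.f_builtins):
            if _ERRMODE_NAME in ns:
                return ns[_ERRMODE_NAME]
        return _DEFAULT_MODE

Written out this way the cost is easy to see: up to three dictionary lookups per ufunc call before any computation starts, which is exactly what the by-passing flag for the default mode avoids.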
> > I've argued that the numarray behavior can result from using the > builtin namespace for the error control (perhaps with better > Python-side support for setting and retrieving it). What numpy has is > control at the global and local namespace level as well, which can > override the builtin name-space behavior. > > So, we should at least frame the discussion in terms of what is > actually possible. Yes, sorry for spreading misinformation. >> >> I also would prefer more verbose keys ala numarray (underflow, >> overflow, dividebyzero and invalid) than those currently used by >> numpy (under, over, divide and invalid). > > > In my mind, verbose keys are just extra baggage unless they are really > self documenting. You just need reminders and clues. It seems to be > a preference thing. I guess I hate typing long strings when only the > first few letters clue me in to what is being talked about. In this case, overflow, underflow and dividebyzero seem pretty self documenting to me. And 'invalid' is pretty cryptic in both implementations. This may be a matter of taste, but I tend to prefer short pithy names for functions that I use a lot, or that are crammed a bunch to a line. In functions like this, that are more rarely used and get a full line to themselves, I lean towards the more verbose. >> And (will he never stop) I like numarray's defaults better here too: >> overflow='warn', underflow='ignore', dividebyzero='warn', >> invalid='warn'. Currently, numpy defaults to ignore for all cases. >> These last points are relatively minor though. > > This has optimization issues the way the code is written now. The > defaults are there to produce the fastest loops. Can you elaborate on this a bit? Reading between the lines, there seem to be two issues related to speed here. One is the actual namespace lookup of the error mode -- there's a setting that says we are using the defaults, so don't bother to look. This saves the namespace lookup. Changing the defaults shouldn't affect the timing of that. I'm not sure how this would interact with thread-local storage though. The second issue is that running the core loop with no checks in place is faster. That means that to get maximum performance you want to be running both at the default setting and with no checks, which implies that the default setting needs to be no checking. Is that correct? I think there should be a way to finesse this issue, but I'll wait for the dust to settle a bit on the local, global, builtin issue before I propose anything. Particularly since by finesse I mean: do something moderately unsavory. > So, I'm hesitant to change them based only on ambiguous preferences. It's not entirely plucked out of the air. As I recall, the decision was arrived at something like this: 1. Errors should never pass silently (unless explicitly silenced). 2. Let's have everything raise by default. 3. In practice this was no good because you often wanted to look at the results and see where the problem was. 4. OK, let's have everything warn. 5. This almost worked, but underflow was almost never a real error, so everyone always overrode underflow. A default that you always need to override is not a good default. 6. So, warn for everything except underflow. Ignore that. And that's where numarray is today. I and others have been using that error system happily for quite some time now. At least I haven't heard any complaints for quite a while. > Good feedback. Thanks again for taking the time to look at this and > offer review. You're very welcome.
Thanks for all of the work you've been putting in to make the grand numerification happen. -tim From arnd.baecker at web.de Sat Apr 1 09:09:06 2006 From: arnd.baecker at web.de (Arnd Baecker) Date: Sat Apr 1 09:09:06 2006 Subject: [Numpy-discussion] extension to xrange for numpy Message-ID: Dear numpy enthusiasts, one python command which is extremely useful in 1D situations is `xrange`. However, for higher dimensional settings we strongly lack the commands `yrange` and `zrange`. These could be shorthands for the corresponding constructs with `:,NewAxis` added. Any comments, suggestion and even implementations are very welcome, Arnd P.S.: What I am not sure about is the right command for the 4-dimensional case - which letter should be used after the "z"? (it seems that "a" would be a very natural choice...) From faltet at carabos.com Sat Apr 1 11:01:05 2006 From: faltet at carabos.com (Francesc Altet) Date: Sat Apr 1 11:01:05 2006 Subject: [Numpy-discussion] ANN: PyTables 1.3 released Message-ID: <200604012100.38726.faltet@carabos.com> ========================= Announcing PyTables 1.3 ========================= This is a new major release of PyTables. The most remarkable feature added in this version is a complete support (well, almost, because unicode arrays are not there yet) for NumPy objects. Improved support for native HDF5 is there as well. As an aside, I'm happy to inform you that the PyTables web site (http://www.pytables.org) has been converted into a wiki so that users can contribute to the project with recipes or any other document. Try it out! Go to the (new) PyTables web site for downloading the beast: http://www.pytables.org/ or keep reading for more info about the new features and bugs fixed. Changes more in depth ===================== Improvements: - Support for NumPy objects in all the objects of PyTables, namely: Array, CArray, EArray, VLArray and Table. All the numerical and character (except unicode arrays) flavors are supported as well as plain and nested heterogeneous NumPy arrays. PyTables leverages the adoption of the array interface (http://numeric.scipy.org/array_interface.html) for a very efficient conversion between all the numarray (which continues to be the native flavor for PyTables) object to/from NumPy/Numeric. - The FLAVOR schema in PyTables has been refined and simplified. Now, the only 'flavors' allowed for data objects are: "numarray", "numpy", "numeric" and "python". The changes has been made so that they are fully backward compatible with existing PyTables files. However, when users would try to use old flavors (like "Numeric" or "Tuple") in existing code, a ``DeprecationWarning`` will be issued in order to encourage them to migrate to the new flavors as soon as possible. - Nested fields can be specified in the "field" parameter of Table.read by using a '/' as a separator between fields (e.g. 'Info/value'). - The Table.Cols accessor has received a new ``__setitem__()`` method that allows doing things like: table.cols[4] = record table.cols.x[4:1000:2] = array # homogeneous column table.cols.Info[4:1000:2] = recarray # nested column - A clean-up function (using ``atexit``) has been registered so that remaining opened files are closed when a user hits a ^C, for example. That would help to avoid ending with corrupted files. - Native HDF5 compound datasets that are contiguous are supported now. Before, only chunked datasets were supported. - Updated (and much improved) sections about compression issues in the User's Guide. 
It includes new benchmarks made with PyTables 1.3 and a exhaustive comparison between Zlib, LZO and bzip2. - The HTML version of manual is made now from the docbook2html package for an improved look (IMO). Bug fixes: - Solved a problem when trying to save CharArrays with itemsize = 0 as attributes of nodes. Now, these objects are pickled in order to prevent HDF5 from crashing. - Fixed some alignment issues with nested record arrays under certain architectures (e.g. PowerPC). - Fixed automatic conversions when a VLArray is read in a platform with a byte ordering different from the file. Deprecated features: - Due to recurrent problems with the UCL compression library, it has been declared deprecated from this version on. You can still compile PyTables with UCL support (using the --force-ucl), but you are urged to not use it anymore and convert any existing datafiles with UCL to other supported library (zlib, lzo or bzip2) with the ``ptrepack`` utility. Backward-incompatible changes: - Please, see ``RELEASE-NOTES.txt`` file. Important note for Windows users ================================ If you are willing to use PyTables with Python 2.4 in Windows platforms, you will need to get the HDF5 library compiled for MSVC 7.1, aka .NET 2003. It can be found at: ftp://ftp.ncsa.uiuc.edu/HDF/HDF5/current/bin/windows/5-165-win-net.ZIP Users of Python 2.3 on Windows will have to download the version of HDF5 compiled with MSVC 6.0 available in: ftp://ftp.ncsa.uiuc.edu/HDF/HDF5/current/bin/windows/5-165-win.ZIP What it is ========== **PyTables** is a package for managing hierarchical datasets and designed to efficiently cope with extremely large amounts of data (with support for full 64-bit file addressing). It features an object-oriented interface that, combined with C extensions for the performance-critical parts of the code, makes it a very easy-to-use tool for high performance data storage and retrieval. PyTables runs on top of the HDF5 library and numarray (but NumPy and Numeric are also supported) package for achieving maximum throughput and convenient use. Besides, PyTables I/O for table objects is buffered, implemented in C and carefully tuned so that you can reach much better performance with PyTables than with your own home-grown wrappings to the HDF5 library. PyTables sports indexing capabilities as well, allowing doing selections in tables exceeding one billion of rows in just seconds. Platforms ========= This version has been extensively checked on quite a few platforms, like Linux on Intel32 (Pentium), Win on Intel32 (Pentium), Linux on Intel64 (Itanium2), FreeBSD on AMD64 (Opteron), Linux on PowerPC (and PowerPC64) and MacOSX on PowerPC. For other platforms, chances are that the code can be easily compiled and run without further issues. Please, contact us in case you are experiencing problems. Resources ========= Go to the PyTables web site for more details: http://www.pytables.org About the HDF5 library: http://hdf.ncsa.uiuc.edu/HDF5/ About numarray: http://www.stsci.edu/resources/software_hardware/numarray To know more about the company behind the PyTables development, see: http://www.carabos.com/ Acknowledgments =============== Thanks to various the users who provided feature improvements, patches, bug reports, support and suggestions. See the ``THANKS`` file in the distribution package for a (incomplete) list of contributors. Many thanks also to SourceForge who have helped to make and distribute this package! 
And last but not least, a big thank you to THG (http://www.hdfgroup.org/) for sponsoring many of the new features recently introduced in PyTables. Share your experience ===================== Let us know of any bugs, suggestions, gripes, kudos, etc. you may have. ---- **Enjoy data!** -- The PyTables Team -- >0,0< Francesc Altet ? ? http://www.carabos.com/ V V C?rabos Coop. V. ??Enjoy Data "-" From oliphant.travis at ieee.org Sat Apr 1 12:20:01 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Sat Apr 1 12:20:01 2006 Subject: [Numpy-discussion] numpy error handling In-Reply-To: <442E94AD.1040200@cox.net> References: <442DE773.4060104@cox.net> <442E2F05.5080809@ieee.org> <442E94AD.1040200@cox.net> Message-ID: <442EE026.8060806@ieee.org> Tim Hochberg wrote: >> >> You can get the numarray approach back simply by setting the error in >> the builtin scope (instead of in the local scope which is done by >> default. > > I saw that you could set it at different levels, but missed the > implications. However, it's still missing one feature, thread local > storage. I would argue that the __builtin__ data should actually be > stored in threading.local() instead of __builtin__. Then you could > setup an equivalent stack system to numpy's. Yes, the per-thread storage escaped me. But, threading.local() only exists in Python 2.4 and NumPy is supposed to be compatible with Python 2.3 What about PyThreadState_GetDict() ? and then default to use the builtin dictionary if this returns NULL? I'm actually not particularly enthused about the three name-space lookups. Changing it to only 1 place to look may be better. It would require a setting and restoring operation. A stack could be used, but why not just use local variables (i.e. save = numpy.seterr(dividebyzero='warn') ... numpy.seterr(restore=save) > > I've used the numarray error handling stuff for some time. My > experience with it has led me to the following conclusions: > > 1. You don't use it that often. I have about 26 KLOC that's "active" > and in that I use pushMode just 15 times. For comparison, I use > asarray a tad over 100 times. > 2. pushMode and popMode, modulo spelling, is the way to set errors. > Once the with statement is around, that will be even better. > 3. I, personally, would be very unlikely to use the local and global > error handling, I'd just as soon see them go away, particularly if > it helps performance, but I won't lobby for it. > This is good feedback. I have almost zero experience with changing the error handling. So, I'm not sure what features are desireable. Eliminating unnecessary name-lookups is usually a good thing. > > In numarray, the stack is in the numarray module itself (actually in > the Error object). They base their threading local behaviour off of > thread.get_ident, not threading.local. That's not clunky at all, > although it's arguably wrong since thread.get_ident can reuse ids from > dead threads. In practice it's probably hard to get into trouble doing > this, but I still wouldn't emulate it. I think that this was written > before thread local storage, so it was probably the best that could be > done. Right, but thread local storage is still Python 2.4 only.... What about PyThreadState_GetDict() ? > > However, if you use threading.local, it will be clunky in a similar > sense. You'll be storing data in a global namespace you don't control > and you've got to hope that no one stomps on your variable name. 
The PyThreadState_GetDict() documentation states that extension module writers should use a unique name based on their extension module. > When you have local and module level secret storage names as well > you're just doing a lot more of that and the chance of collision and > confusion goes up from almost zero to very small. This is true. Similar to the C-variable naming issues. >> So, we should at least frame the discussion in terms of what is >> actually possible. > > Yes, sorry for spreading misinformation. But you did point out the very important thread-local storage fact that I had missed. This alone makes me willing to revamp what we are doing. > In this case, overflow, underflow and dividebyzero seem pretty self > documenting to me. And 'invalid' is pretty cryptic in both > implementations. This may be a matter of taste, but I tend to prefer > short pithy names for functions that I use a lot, or that are crammed a > bunch to a line. In functions like this, that are more rarely used and > get a full line to themselves, I lean towards the more verbose. The rarely-used factor is a persuasive argument. > Can you elaborate on this a bit? Reading between the lines, there seem > to be two issues related to speed here. One is the actual namespace > lookup of the error mode -- there's a setting that says we are using > the defaults, so don't bother to look. This saves the namespace > lookup. Changing the defaults shouldn't affect the timing of that. > I'm not sure how this would interact with thread-local storage though. > > The second issue is that running the core loop with no checks in place > is faster. Basically, on the C-level, the error mode is an integer with specific bits allocated to the various error-possibilities (2 bits per possibility). If this is 0 then the error checking is not even done (thus no error handling at all). Yes, the name-lookup optimization could work with any defaults (but with thread-specific storage it couldn't work anyway). One question I have with threads and error handling, though: Right now, the ufuncs release the Python lock during computation (and re-acquire it to do error handling if needed). If another ufunc was started by another Python thread and ran with different error handling, wouldn't the IEEE flags get confused about which ufunc was setting what? The flags are only checked after each 1-d loop. If another thread set the processor flag, the current thread could get very confused. This seems like a problem that I'm not sure how to handle. > It's not entirely plucked out of the air. As I recall, the decision > was arrived at something like this: > > 1. Errors should never pass silently (unless explicitly silenced). > 2. Let's have everything raise by default. > 3. In practice this was no good because you often wanted to look at > the results and see where the problem was. > 4. OK, let's have everything warn. > 5. This almost worked, but underflow was almost never a real error, > so everyone always overrode underflow. A default that you always > need to override is not a good default. > 6. So, warn for everything except underflow. Ignore that. > > And that's where numarray is today. I and others have been using that > error system happily for quite some time now. At least I haven't heard > any complaints for quite a while. I can appreciate this choice, but I don't agree that errors should never pass silently. The fact that people disagree about this is the reason for the error handling.
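On the per-thread storage side of this, a minimal Python-2.3-compatible stand-in for threading.local(), keyed on thread.get_ident() the way numarray does it, might look like the sketch below (all names are made up, and it inherits the caveat Tim raised: ids of dead threads can be reused, and stale entries would need to be cleaned up somewhere):

    import thread

    _errmode_by_thread = {}   # maps thread id -> error-mode integer
    _DEFAULT_MODE = 0         # 0 = no checking at all

    def set_errmode(mode):
        _errmode_by_thread[thread.get_ident()] = mode

    def get_errmode():
        # Threads that never set a mode fall back to the default.
        return _errmode_by_thread.get(thread.get_ident(), _DEFAULT_MODE)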
Note that overflow is not detected everywhere for integers --- we have to simulate the floating-point errors for them. Only on integer multiply is it detected. Checking for it would slow down all other integer arithmetic --- one solution, of course is to have two different integer additions (one that checks for overflow and another that doesn't). There is really a bit of work left here to do. Best, -Travis From tim.hochberg at cox.net Sat Apr 1 14:01:04 2006 From: tim.hochberg at cox.net (Tim Hochberg) Date: Sat Apr 1 14:01:04 2006 Subject: [Numpy-discussion] numpy error handling In-Reply-To: <442EE026.8060806@ieee.org> References: <442DE773.4060104@cox.net> <442E2F05.5080809@ieee.org> <442E94AD.1040200@cox.net> <442EE026.8060806@ieee.org> Message-ID: <442EF7D9.9010404@cox.net> Travis Oliphant wrote: > Tim Hochberg wrote: > >>> >>> You can get the numarray approach back simply by setting the error >>> in the builtin scope (instead of in the local scope which is done by >>> default. >> >> >> I saw that you could set it at different levels, but missed the >> implications. However, it's still missing one feature, thread local >> storage. I would argue that the __builtin__ data should actually be >> stored in threading.local() instead of __builtin__. Then you could >> setup an equivalent stack system to numpy's. > > Yes, the per-thread storage escaped me. But, threading.local() only > exists in Python 2.4 and NumPy is supposed to be compatible with > Python 2.3 > > What about PyThreadState_GetDict() ? and then default to use the > builtin dictionary if this returns NULL? That sounds reasonable. I've never used that, but the name sounds promising! > I'm actually not particularly enthused about the three name-space > lookups. Changing it to only 1 place to look may be better. It > would require a setting and restoring operation. A stack could be > used, but why not just use local variables (i.e. > save = numpy.seterr(dividebyzero='warn') > > ... > > numpy.seterr(restore=save) That would work as well, I think. It gets a little hairy if you want to set error nestedly in a single function, but I've never done that, so I'm not too worried about it. Besides, what I really want to support is 'with', which I imagine we can support using the above as a base. >> I've used the numarray error handling stuff for some time. My >> experience with it has led me to the following conclusions: >> >> 1. You don't use it that often. I have about 26 KLOC that's "active" >> and in that I use pushMode just 15 times. For comparison, I use >> asarray a tad over 100 times. >> 2. pushMode and popMode, modulo spelling, is the way to set errors. >> Once the with statement is around, that will be even better. >> 3. I, personally, would be very unlikely to use the local and global >> error handling, I'd just as soon see them go away, particularly if >> it helps performance, but I won't lobby for it. >> > > This is good feedback. I have almost zero experience with changing > the error handling. So, I'm not sure what features are desireable. > Eliminating unnecessary name-lookups is usually a good thing. I hope some of the other numarray users chime in. A sample of one is not very good data! >> In numarray, the stack is in the numarray module itself (actually in >> the Error object). They base their threading local behaviour off of >> thread.get_ident, not threading.local. That's not clunky at all, >> although it's arguably wrong since thread.get_ident can reuse ids >> from dead threads. 
In practice it's probably hard to get into trouble >> doing this, but I still wouldn't emulate it. I think that this was >> written before thread local storage, so it was probably the best that >> could be done. > > > Right, but thread local storage is still Python 2.4 only.... > > What about PyThreadState_GetDict() ? That sounds reasonable. Essentially we would be rolling our own threading.local() >> >> However, if you use threading.local, it will be clunky in a similar >> sense. You'll be storing data in a global namespace you don't >> control and you've got to hope that no one stomps on your variable name. > > The PyThreadState_GetDict() documenation states that extension module > writers should use a unique name based on their extension module. > >> When you have local and module level secret storage names as well >> you're just doing a lot more of that and the chance of collision and >> confusion goes up from almost zero to very small. > > This is true. Similar to the C-variable naming issues. > >>> So, we should at least frame the discussion in terms of what is >>> actually possible. >> >> >> Yes, sorry for spreading misinformation. > > > But you did point out the very important thread-local storage fact > that I had missed. This alone makes me willing to revamp what we are > doing. > >> >> In this case, overflow, underflow and dividebyzero seem pretty self >> documenting to me. And 'invalid' is pretty cryptic in both >> implementations. This may be a matter of taste, but I tend to prefer >> short pithy names for functions that I use a lot, or that crammed a >> bunch to a line. In functions like this, that are more rarely used >> and get a full line to themselves I lean to towards the more verbose. > > > The rarely-used factor is a persuasive argument. > >> Can you elaborate on this a bit? Reading between the lines, there >> seem to be two issues related to speed here. One is the actual >> namespace lookup of the error mode -- there's a setting that says we >> are using the defaults, so don't bother to look. This saves the >> namespace lookup. Changing the defaults shouldn't affect the timing >> of that. I'm not sure how this would interact with thread local >> storage though. >> >> The second issue is that running the core loop with no checks in >> place is faster. > > Basically, on the C-level, the error mode is an integer with specific > bits allocated to the various error-possibilites (2-bits per > possibility). If this is 0 then the error checking is not even done > (thus no error handling at all). > Yes the name-lookup optimization could work with any defaults (but > with thread-specific storage couldn't work anyway). > > One question I have with threads and error handling though? Right > now, the ufuncs release the Python lock during computation (and > re-acquire it to do error handling if needed). If another ufunc was > started by another Python thread and ran with different error > handling, wouldn't the IEEE flags get confused about which ufunc was > setting what? The flags are only checked after each 1-d loop. If > another thread set the processor flag, the current thread could get > very confused. > > This seems like a problem that I'm not sure how to handle. Yeah, me either. It seems that somehow we'll need to block until all current operations are done, but I don't know how to do that off the top of my head. Perhaps ufuncs need to lock the flags when they start and release them when they finish. 
This looks feasible, but I'm not sure of the proper incantation to get this right. The ufuncs would all need to be able to increment and decrement the lock, whatever it is, even though they are in different threads. Meanwhile the setting code should only be able to work when the lock is unheld. It's some sort of poly thread recursive lock thing. I'll think about it, perhaps there's an obvious way. >> >> It's not entirely plucked out of the air. As I recall, the decision >> was arrived at something like this: >> >> 1. Errors should never pass silently (unless explicitly silenced). >> 2. Let's have everything raise by default. >> 3. In practice this was no good because you often wanted to look at >> the results and see where the problem was. >> 4. OK, let's have everything warn. >> 5. This almost worked, but underflow was almost never a real error, >> so everyone always overrode underflow. A default that you always >> need to override is not a good default. >> 6. So, warn for everything except underflow. Ignore that. >> >> And that's where numarray is today. I and others have been using that >> error system happily for quite some time now. At least I haven't >> heard any complaints for quite a while. > > > I can appreciate this choice, but I don't agree that errors should > never pass silently. You'll notice that we ended up with a slightly more nuanced choice. Besides, the full quote is important: "errors should not pass silently unless explicitly silenced". That's quite a bit different than a blanket "errors should never pass silently". > The fact that people disagree about this is the reason for the error > handling. Yes. While I like the above defaults, if we have a reasonable approach I can just set them at startup and forget about them. Let's try not to penalize me too much for that though. > Note that overflow is not detected everywhere for integers --- we have > to simulate the floating-point errors for them. Only on integer > multiply is it detected. Checking for it would slow down all other > integer arithmetic --- one solution, of course, is to have two > different integer additions (one that checks for overflow and another > that doesn't). Or just document it and don't worry about it. If I'm doing integer arithmetic and I need overflow detection, I can generally cast to doubles and do my math there, casting back at the end as needed. This doesn't seem worth too much extra complication. Is my floating point bias showing? > There is really a bit of work left here to do. Yep. Looks like it, but nothing insurmountable. -tim From strawman at astraw.com Sat Apr 1 15:56:03 2006 From: strawman at astraw.com (Andrew Straw) Date: Sat Apr 1 15:56:03 2006 Subject: [Numpy-discussion] numpy error handling In-Reply-To: <442EF7D9.9010404@cox.net> References: <442DE773.4060104@cox.net> <442E2F05.5080809@ieee.org> <442E94AD.1040200@cox.net> <442EE026.8060806@ieee.org> <442EF7D9.9010404@cox.net> Message-ID: <442F130E.3060802@astraw.com> Tim Hochberg wrote: > Travis Oliphant wrote: > >> >> One question I have with threads and error handling, though: Right >> now, the ufuncs release the Python lock during computation (and >> re-acquire it to do error handling if needed). If another ufunc was >> started by another Python thread and ran with different error >> handling, wouldn't the IEEE flags get confused about which ufunc was >> setting what? The flags are only checked after each 1-d loop. If >> another thread set the processor flag, the current thread could get >> very confused.
>> >> This seems like a problem that I'm not sure how to handle. > > > Yeah, me either. It seems that somehow we'll need to block until all > current operations are done, but I don't know how to do that off the > top of my head. Perhaps ufuncs need to lock the flags when they start > and release them when they finish. This looks feasible, but I'm not > sure of the proper incantation to get this right. The ufuncs would all > need to be able able to increment and decrement the lock, whatever it > is, even though they are in different threads. Meanwhile the setting > code should only be able to work when the lock is unheld. It's some > sort of poly thread recursive lock thing. I'll think about it, perhaps > there's an obvious way. I am also absolutely no expert in this area, but isn't this exactly what the kernel supports multiple threads for? In other words, I'm not sure we have to worry about it at all. I expect that the kernel sets/restores the CPU/FPU error flags on thread switches and this is part of the cost associated with switching threads. As I understand it, linux threads are actually implemented as new processes, so if we did have to be worried about this, wouldn't we also have to be worried that program A might alter the FPU error state while we're also using program B? This is just my unsophisticated and possibly wrong understanding of these things. If anyone can help clarify the issue, I'd be glad to be enlightened. Cheers! Andrew From aisaac at american.edu Sat Apr 1 16:12:01 2006 From: aisaac at american.edu (Alan G Isaac) Date: Sat Apr 1 16:12:01 2006 Subject: [Numpy-discussion] extension to xrange for numpy In-Reply-To: References: Message-ID: On Sat, 1 Apr 2006, (CEST) Arnd Baecker apparently wrote: > one python command which is extremely useful in 1D > situations is `xrange`. Which will very soon be 'range'. Cheers, Alan Isaac From gruben at bigpond.net.au Sat Apr 1 18:46:07 2006 From: gruben at bigpond.net.au (Gary Ruben) Date: Sat Apr 1 18:46:07 2006 Subject: [Numpy-discussion] extension to xrange for numpy In-Reply-To: References: Message-ID: <442F41AE.1080806@bigpond.net.au> A few rough thoughts: I'm a bit ambivalent about this. It's not very n-dimensional and enforces an x,y,z,(t?) ordering of the array dimensions which some programmers may not want to adhere to. On the occasions I've had to write code which loops over multiple dimensions, I've found the python cookbook routines for permutation and combination generators really useful so I'd find some sort of numpy iterator equivalents of these more useful. This would allow list comprehensions like [f(x,y,z) for (x,y,z) in ndrange(10,10,10)] It would also be good to have it able to specify the rank of the object returned to allow whole array rows or matrices to be returned i.e. array slices. Maybe the ndrange function could allow something like [f(xy,z) for (xy,z) in ndrange((10,0,1),10)] where you use a tuple to specify a range and the axes to slice out. [f(x,yz) for (x,yz) in ndrange(10,(10,1,2))] [f(xz,y) for (xz,y) in ndrange((10,0,2),(10,1))] On the other hand your idea would potentially make some code a lot easier to understand, so I'm not against it and if it was picked up, I'd propose "t" or "w" for the 4th dimension. It might help to post some code that you think might benefit from your idea. Gary R. Arnd Baecker wrote: > Dear numpy enthusiasts, > > one python command which is extremely useful in 1D situations > is `xrange`. 
However, for higher dimensional > settings we strongly lack the commands `yrange` and `zrange`. > These could be shorthands for the corresponding > constructs with `:,NewAxis` added. > > Any comments, suggestion and even implementations are very welcome, > > Arnd > > P.S.: What I am not sure about is the right command for > the 4-dimensional case - which letter should be used after the "z"? > (it seems that "a" would be a very natural choice...) From rob at hooft.net Sat Apr 1 22:38:04 2006 From: rob at hooft.net (Rob Hooft) Date: Sat Apr 1 22:38:04 2006 Subject: [Numpy-discussion] numpy error handling In-Reply-To: <442EE026.8060806@ieee.org> References: <442DE773.4060104@cox.net> <442E2F05.5080809@ieee.org> <442E94AD.1040200@cox.net> <442EE026.8060806@ieee.org> Message-ID: <442F7114.40908@hooft.net> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Travis Oliphant wrote: | save = numpy.seterr(dividebyzero='warn') | | ... | | numpy.seterr(restore=save) Most of this discussion is outside of my scope, but I have programmed this kind of pattern in a different way before: ~ save = context.push(something) ~ ... ~ del save i.e. the destructor of the saved context object restores the old situation. In most cases it will be called by letting "save" go out of scope. I know that relying on timely object destruction can be troublesome when porting to Jython, but it is very convenient in CPython. If that goes too far, one could make a separate method on save: ~ save.pop() This can do sanity checking too (are we really at the top of the stack? Only called once?). The destructor should check whether pop has been called. Rob - -- Rob W.W. Hooft || rob at hooft.net || http://www.hooft.net/people/rob/ -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.2.2 (GNU/Linux) Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org iD8DBQFEL3EUH7J/Cv8rb3QRAuvsAJ9PO6ZITdVSm+hIwxkWDHHbTNFHdQCcDSWI Iv7gupkFc8+Fby/5MFwHQf4= =zE/o -----END PGP SIGNATURE----- From aisaac at american.edu Sun Apr 2 06:58:34 2006 From: aisaac at american.edu (Alan G Isaac) Date: Sun Apr 2 06:58:34 2006 Subject: [Numpy-discussion] extension to xrange for numpy In-Reply-To: <442F41AE.1080806@bigpond.net.au> References: <442F41AE.1080806@bigpond.net.au> Message-ID: On Sun, 02 Apr 2006, Gary Ruben apparently wrote: > I'd find some sort of numpy iterator equivalents of these more > useful. This would allow list comprehensions like > [f(x,y,z) for (x,y,z) in ndrange(10,10,10)] How is this better than using ogrid? E.g., >>> x=N.ogrid[:3,:2] >>> N.power(*x) array([[1, 0], [1, 1], [1, 2]]) Thanks, Alan From cjw at sympatico.ca Sun Apr 2 07:22:09 2006 From: cjw at sympatico.ca (Colin J. Williams) Date: Sun Apr 2 07:22:09 2006 Subject: [Numpy-discussion] first impressions with numpy In-Reply-To: <442DD638.60706@cox.net> References: <442D9124.5020905@msg.ucsf.edu> <442D9695.2050900@cox.net> <442DB655.2050203@cox.net> <442DB91F.9030103@msg.ucsf.edu> <442DD638.60706@cox.net> Message-ID: <442FDDD5.8050404@sympatico.ca> Tim Hochberg wrote: > Sebastian Haase wrote: > >> Thanks Tim, >> that's OK - I got the idea... >> BTW, is there a (policy) reason that you sent the first email just to >> me and not the mailing list !? > > > No. Just clumsy fingers. Probably the same reason the functions got > all garbled! > >> >> I would really be more interested in comments to my first point ;-) >> I think it's important that numpy will not be to cryptic and only for >> "hackers", but nice to look at ... 
(hope you get what I mean ;-) > > Well, I think it's probably a good idea and it sounds like Travis likes > the idea "for some of the builtin types". I suspect that's code for > "not types for which it doesn't make sense, like recarrays". > Tim, Could you elaborate on this please? Surely, it would be good for all functions and methods to have meaningful parameter lists and good doc strings. Colin W. From tim.hochberg at cox.net Sun Apr 2 08:11:17 2006 From: tim.hochberg at cox.net (Tim Hochberg) Date: Sun Apr 2 08:11:17 2006 Subject: [Numpy-discussion] first impressions with numpy In-Reply-To: <442FDDD5.8050404@sympatico.ca> References: <442D9124.5020905@msg.ucsf.edu> <442D9695.2050900@cox.net> <442DB655.2050203@cox.net> <442DB91F.9030103@msg.ucsf.edu> <442DD638.60706@cox.net> <442FDDD5.8050404@sympatico.ca> Message-ID: <442FE950.8090000@cox.net> Colin J. Williams wrote: > Tim Hochberg wrote: > >> Sebastian Haase wrote: >> >>> Thanks Tim, >>> that's OK - I got the idea... >>> BTW, is there a (policy) reason that you sent the first email just >>> to me and not the mailing list !? >> >> >> >> No. Just clumsy fingers. Probably the same reason the functions got >> all garbled! >> >>> >>> I would really be more interested in comments to my first point ;-) >>> I think it's important that numpy will not be too cryptic and only >>> for "hackers", but nice to look at ... (hope you get what I mean ;-) >> >> >> >> Well, I think it's probably a good idea and it sounds like Travis >> likes the idea "for some of the builtin types". I suspect that's code >> for "not types for which it doesn't make sense, like recarrays". >> > Tim, > > Could you elaborate on this please? Surely, it would be good for all > functions and methods to have meaningful parameter lists and good doc > strings. This isn't really about parameter lists and docstrings, it's about __str__ and possibly __repr__. The basic issue is that the way dtypes are displayed is powerful, but unfriendly. If I create an array of integers: >>> a = arange(4) >>> print repr(a.dtype), str(a.dtype) dtype('<i4') int32 It would be friendlier if the repr here were simply dtype(int32). On the other hand, a byte-swapped dtype('>i4') is not the same as dtype(int32) on my machine and should probably not be displayed using int32[1]. These cases should be rare in practice and it seems fine to fall back to the less friendly but more flexible notation. Recarrays were probably not such a good example. Here is an example from a recarray: dtype([('x', '<f8'), ('z', '<c16')]) This would work fine if repr were instead: dtype([('x', float64), ('z', complex128)]) Anyway, this all seems reasonable to me at first glance. That said, I don't plan to work on this, I've got other fish to fry at the moment. [1] Confusingly, dtype('>i4').name is 'int32', which seems wrong. From tim.hochberg at cox.net Sun Apr 2 08:41:24 2006 From: tim.hochberg at cox.net (Tim Hochberg) Date: Sun Apr 2 08:41:24 2006 Subject: [Numpy-discussion] numpy error handling In-Reply-To: <442F7114.40908@hooft.net> References: <442DE773.4060104@cox.net> <442E2F05.5080809@ieee.org> <442E94AD.1040200@cox.net> <442EE026.8060806@ieee.org> <442F7114.40908@hooft.net> Message-ID: <442FF03F.2000406@cox.net> Rob Hooft wrote: > -----BEGIN PGP SIGNED MESSAGE----- > Hash: SHA1 > > Travis Oliphant wrote: > | save = numpy.seterr(dividebyzero='warn') > | > | ... > | > | numpy.seterr(restore=save) > > Most of this discussion is outside of my scope, but I have programmed > this kind of pattern in a different way before: > > ~ save = context.push(something) > ~ ... > ~ del save > > i.e. the destructor of the saved context object restores the old > situation. In most cases it will be called by letting "save" go out of > scope. I know that relying on timely object destruction can be > troublesome when porting to Jython, but it is very convenient in CPython.
> > If that goes too far, one could make a separate method on save: > > ~ save.pop() > > This can do sanity checking too (are we really at the top of the stack? > Only called once?). The destructor should check whether pop has been > called. Well, the syntax that *I* really want is this: class error_mode(object): def __init__(self, all=None, overflow=None, underflow=None, dividebyzero=None, invalid=None): self._args = (overflow, overflow, underflow, dividebyzero, invalid) def __enter__(self): self._save = numpy.seterr(*self._args) def __exit__(self): numpy.seterr(self._save) That way, in a few months, I can do this: with error_mode(overflow='raise'): # do stuff and it will be almost impossible to mess up. This syntax is lighter and cleaner than a stack or relying on garbage collection to free the resources. So, for my purposes, the simple syntax Travis proposes is perfectly adequate and simpler to implement and get right than a stack based approach. If 'with' wasn't coming down the pipe, I would push for a stack, but I like Travis' proposal just fine. YMMV of course. -tim From tim.hochberg at cox.net Sun Apr 2 08:52:09 2006 From: tim.hochberg at cox.net (Tim Hochberg) Date: Sun Apr 2 08:52:09 2006 Subject: [Numpy-discussion] observations Message-ID: <442FF2F8.3030906@cox.net> I've been doing a *lot* of playing with numpy over the last several days, so expect various observations to trickle from my abode over the next week or so. Here's the first installment. * tostring probably needs the order flag. I think you want the string generated from a multidimensional array in Fortran and C order to differ. * With the evolution of the order flag, ascontiguousarray is probably redundant, scarcely after it was added. b = asarray(a, order="C") Is actually clearer in intent than: b = ascontiguousarray(a) Does the latter leave a contiguous, Fortran order array alone? That's probably almost never what one wants. Unless your working with Fortran arrays, in which case the opposite ambiguity applies. Regards, -tim From tim.hochberg at cox.net Sun Apr 2 11:20:03 2006 From: tim.hochberg at cox.net (Tim Hochberg) Date: Sun Apr 2 11:20:03 2006 Subject: [Numpy-discussion] extension to xrange for numpy In-Reply-To: <442F41AE.1080806@bigpond.net.au> References: <442F41AE.1080806@bigpond.net.au> Message-ID: <44301590.4050707@cox.net> Gary Ruben wrote: > A few rough thoughts: > > I'm a bit ambivalent about this. It's not very n-dimensional and > enforces an x,y,z,(t?) ordering of the array dimensions which some > programmers may not want to adhere to. On the occasions I've had to > write code which loops over multiple dimensions, I've found the python > cookbook routines for permutation and combination generators really > useful > > > > > so I'd find some sort of numpy iterator equivalents of these more > useful. This would allow list comprehensions like > > [f(x,y,z) for (x,y,z) in ndrange(10,10,10)] > > It would also be good to have it able to specify the rank of the > object returned to allow whole array rows or matrices to be returned > i.e. array slices. Maybe the ndrange function could allow something like > > [f(xy,z) for (xy,z) in ndrange((10,0,1),10)] > where you use a tuple to specify a range and the axes to slice out. > [f(x,yz) for (x,yz) in ndrange(10,(10,1,2))] > [f(xz,y) for (xz,y) in ndrange((10,0,2),(10,1))] > > On the other hand your idea would potentially make some code a lot > easier to understand, so I'm not against it and if it was picked up, > I'd propose "t" or "w" for the 4th dimension. 
It might help to post > some code that you think might benefit from your idea. Bah, humbug! "Not every two-line Python function has to come pre-written" -- Tim Peters on C.L.P def xrange(*args, **kwargs): return arange(*args, **kwargs) def yrange(*args, **kwargs): return padshape(arange(*args, **kwargs), 2) def zrange(*args, **kwargs): return padshape(arange(*args, **kwargs), 3) def trange(*args, **kwargs): return padshape(arange(*args, **kwargs), 4) Of course, then you need padshape, which I'd be happy to contribute. I'm of the opinion that we should be trying to improve the usefulness of a smallish set of core primitives, not adding endless new functions. Stuff like this, which is of interest in a relatively limited domain and is trivial to implement when needed, should either not be added at all, or added in a separate module. >>> len(dir(numpy)) 476 Does anyone know what all of that does? I certainly don't. And I doubt anyone uses more than a fraction of that interface. I wouldn't be the least bit surprised if there are old moldy parts of that that are essentially unused. And, unused code is buggy code in my experience. "Perfection is achieved, not when there is nothing more to add, but when there is nothing left to take away." -- Antoine de Saint-Exupery It's probably difficult at this point in numpy's life cycle to remove stuff or even reorganize things substantially. Besides, I'm sure all the developers have their hands full doing more important, or at least less contentious, things. Still, I think we should cast a more critical eye on new stuff before adding it. Regards, -tim > > Gary R. > > Arnd Baecker wrote: > >> Dear numpy enthusiasts, >> >> one python command which is extremely useful in 1D situations >> is `xrange`. However, for higher dimensional >> settings we strongly lack the commands `yrange` and `zrange`. >> These could be shorthands for the corresponding >> constructs with `:,NewAxis` added. >> >> Any comments, suggestion and even implementations are very welcome, >> >> Arnd >> >> P.S.: What I am not sure about is the right command for >> the 4-dimensional case - which letter should be used after the "z"? >> (it seems that "a" would be a very natural choice...) From arnd.baecker at web.de Sun Apr 2 11:23:04 2006 From: arnd.baecker at web.de (Arnd Baecker) Date: Sun Apr 2 11:23:04 2006 Subject: [Numpy-discussion] extension to xrange for numpy In-Reply-To: <442F41AE.1080806@bigpond.net.au> References: <442F41AE.1080806@bigpond.net.au> Message-ID: Hi, On Sun, 2 Apr 2006, Gary Ruben wrote: > A few rough thoughts: [... useful stuff snipped ... ] > On the other hand your idea would potentially make some code a lot > easier to understand, so I'm not against it and if it was picked up, I'd > propose "t" or "w" for the 4th dimension. It might help to post some > code that you think might benefit from your idea. Hope you don't jump at me, but I would like to wait until April 1st next year then ...
((hmm, maybe my post contained too much of a possible truth to be considered as an April fools joke - yrange and zrange have been a running gag in our group for a while now - strange German humor ...;-)) Anyway, I hope I did not waste too much of your time ... Best, Arnd > Gary R. > > Arnd Baecker wrote: > > Dear numpy enthusiasts, > > > > one python command which is extremely useful in 1D situations > > is `xrange`. However, for higher dimensional > > settings we strongly lack the commands `yrange` and `zrange`. > > These could be shorthands for the corresponding > > constructs with `:,NewAxis` added. > > > > Any comments, suggestion and even implementations are very welcome, > > > > Arnd > > > > P.S.: What I am not sure about is the right command for > > the 4-dimensional case - which letter should be used after the "z"? > > (it seems that "a" would be a very natural choice...) > > From tim.hochberg at cox.net Sun Apr 2 11:34:03 2006 From: tim.hochberg at cox.net (Tim Hochberg) Date: Sun Apr 2 11:34:03 2006 Subject: [Numpy-discussion] extension to xrange for numpy In-Reply-To: References: <442F41AE.1080806@bigpond.net.au> Message-ID: <44301908.2000607@cox.net> Arnd Baecker wrote: >Hi, > >On Sun, 2 Apr 2006, Gary Ruben wrote: > > > >>A few rough thoughts: >> >> > >[... useful stuff snipped ... ] > > > >>On the other hand your idea would potentially make some code a lot >>easier to understand, so I'm not against it and if it was picked up, I'd >>propose "t" or "w" for the 4th dimension. It might help to post some >>code that you think might benefit from your idea. >> >> > >Hope you don't jump at me, but I would like to >wait until April 1st next year then ... >((hmm, maybe my post contained too much of a possible truth >to be considered as an April fools joke - >yrange and zrange have been a running gag in our group for >a while now - strange German humor ...;-)) > >Anyway, I hope I did not waste too much of your time ... > > Ouch! Got me anyway... >Best, Arnd > > > > >>Gary R. >> >>Arnd Baecker wrote: >> >> >>>Dear numpy enthusiasts, >>> >>>one python command which is extremely useful in 1D situations >>>is `xrange`. However, for higher dimensional >>>settings we strongly lack the commands `yrange` and `zrange`. >>>These could be shorthands for the corresponding >>>constructs with `:,NewAxis` added. >>> >>>Any comments, suggestion and even implementations are very welcome, >>> >>>Arnd >>> >>>P.S.: What I am not sure about is the right command for >>>the 4-dimensional case - which letter should be used after the "z"? >>>(it seems that "a" would be a very natural choice...) >>> >>> >> >> > > >------------------------------------------------------- >This SF.Net email is sponsored by xPML, a groundbreaking scripting language >that extends applications into web and mobile media. Attend the live webcast >and join the prime developer group breaking into this new coding territory! 
>http://sel.as-us.falkag.net/sel?cmd=lnk&kid=110944&bid=241720&dat=121642 >_______________________________________________ >Numpy-discussion mailing list >Numpy-discussion at lists.sourceforge.net >https://lists.sourceforge.net/lists/listinfo/numpy-discussion > > > > From schofield at ftw.at Sun Apr 2 13:05:02 2006 From: schofield at ftw.at (Ed Schofield) Date: Sun Apr 2 13:05:02 2006 Subject: [Numpy-discussion] Deprecating old names In-Reply-To: <44301590.4050707@cox.net> References: <442F41AE.1080806@bigpond.net.au> <44301590.4050707@cox.net> Message-ID: <44302EA9.9050302@ftw.at> Tim Hochberg wrote, in a different thread: > >>> len(dir(numpy)) > 476 > > Does anyone know what all of that does? I certainly don't. And I doubt > anyone uses more than a fraction of that interface. I wouldn't be the > least bit suprised if there are old moldy parts of that are > essentially used. And, unused code is buggy code in my experience. > > "Perfection is achieved, not when there is nothing more to add, but > when there is nothing left to take away." -- Antoine de Saint-Exupery I'd like to revise a proposal I made last week. Then I proposed that we reduce namespace clutter by not importing the contents of the oldnumeric namespace by default. But Travis didn't want to deprecate the functional interfaces (sum(), take(), etc), so I now propose instead that we split up the contents of oldnumeric.py into interfaces we want to keep around indefinitely and interfaces we don't. The ones we want to keep could go into another file, e.g. fromnumeric.py, whose contents are imported into the numpy namespace by default. The deprecated ones could stay in oldnumeric.py, and could be accessible through 'from oldnumeric import *' at the top of source files, but not imported by default. Strong candidates for deprecation are the capitalised type names, like Int8, Complex64, UnsignedInt. I'd also argue for deprecating NewAxis, UFuncType, ArrayType, arraytype, and anything else that duplicates functionality available under NumPy under a different name. Two of the Python design principles (from http://www.python.org/dev/culture/) are: - There should be one -- and preferably only one -- obvious way to do it. - Namespaces are one honking great idea -- let's do more of those! Let's clean up the cruft! -- Ed From gruben at bigpond.net.au Sun Apr 2 16:06:10 2006 From: gruben at bigpond.net.au (Gary Ruben) Date: Sun Apr 2 16:06:10 2006 Subject: [Numpy-discussion] extension to xrange for numpy In-Reply-To: References: <442F41AE.1080806@bigpond.net.au> Message-ID: <443058AE.2070808@bigpond.net.au> Doh! It's OK Arnd; I've recently seen you (or someone else withe the same name) acknowledged in a PhD I've been reading so I suspect you're a nice guy :-) And, thanks Alan. I knew about mgrid but not ogrid. One small way in which that example might be better than using ogrid is that you could avoid creating the index arrays and lazily generate the indices. However, ogrid is better than mgrid in this respect. thanks, Gary Alan G Isaac wrote: > On Sun, 02 Apr 2006, Gary Ruben apparently wrote: >> I'd find some sort of numpy iterator equivalents of these more >> useful. This would allow list comprehensions like >> [f(x,y,z) for (x,y,z) in ndrange(10,10,10)] > > How is this better than using ogrid? 
E.g., > >>>> x=N.ogrid[:3,:2] >>>> N.power(*x) > array([[1, 0], > [1, 1], > [1, 2]]) > > Thanks, > Alan From zpincus at stanford.edu Sun Apr 2 16:07:07 2006 From: zpincus at stanford.edu (Zachary Pincus) Date: Sun Apr 2 16:07:07 2006 Subject: [Numpy-discussion] Speed up function on cross product of two sets? Message-ID: Hi folks, I have an inner loop that looks like this: out = [] for elem1 in l1: for elem2 in l2: out.append(do_something(l1, l2)) result = do_something_else(out) where do_something and do_something_else are implemented with only numpy ufuncs, and l1 and l2 are numpy arrays. As an example, I need to compute the median distance from any element in one set to any element in another set. What's the best way to speed this sort of thing up with numpy (e.g. push as much down into the underlying C as possible)? I could re- write do_something with the numexpr tools (which are very cool), but that doesn't address the fact that I've still got nested loops living in Python. Perhaps there's some way in numpy to make one big honking array that contains all the pairs from the two lists, and then just run my do_something on that huge array, but that of course scales poorly. Any thoughts? Zach From tim.hochberg at cox.net Sun Apr 2 16:53:05 2006 From: tim.hochberg at cox.net (Tim Hochberg) Date: Sun Apr 2 16:53:05 2006 Subject: [Numpy-discussion] Speed up function on cross product of two sets? In-Reply-To: References: Message-ID: <443063C0.3050002@cox.net> Zachary Pincus wrote: > Hi folks, Hi Zach, > > I have an inner loop that looks like this: > out = [] > for elem1 in l1: > for elem2 in l2: > out.append(do_something(l1, l2)) this is do_something(elem1, elem2), correct? > result = do_something_else(out) > > where do_something and do_something_else are implemented with only > numpy ufuncs, and l1 and l2 are numpy arrays. > > As an example, I need to compute the median distance from any element > in one set to any element in another set. > > What's the best way to speed this sort of thing up with numpy (e.g. > push as much down into the underlying C as possible)? I could re- > write do_something with the numexpr tools (which are very cool), but > that doesn't address the fact that I've still got nested loops living > in Python. The exact approach I'd take would depend on the sizes of l1 and l2 and a certain amount of trial and error. However, the first thing I'd try is: n1 = len(l1) n2 = len(l2) out = numpy.zeros([n1*n2], appropriate_dtype) for i, elem1 in enumerate(l1): out[i*n2:(i+1)*n2] = do_something(elem1, l2) result = do_something_else(out) That may work as is, or you may have to tweak do_something slightly to handle l2 correctly. You might also try to do the operations in place and stuff the results into out directly by using X= and three argument ufuncs. I'd not do that at first though. One thing to consider is that, in my experience, numpy works best on chunks of about 10,000 elements. I believe that this is a function of cache size. Anyway, this may affect the choice of which of l1 and l2 you continue to loop over, and which you vectorize. If they both might get really big, you could even consider chopping up l2 when you vectorize it. Again I wouldn't do that unless it really looks like you need it. If that all sounds opaque, feel free to ask more questions. Or if you have questions about microoptimizing the guts of do_something, I have a bunch of experience with that and I like a good puzzle.
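Filled in for the median-distance example, the sketch above becomes something like the following (the data sets and the absolute-difference "do_something" are stand-ins for illustration):

import numpy

l1 = numpy.random.rand(100)     # stand-in for the first set
l2 = numpy.random.rand(250)     # stand-in for the second set

n1, n2 = len(l1), len(l2)
out = numpy.zeros(n1 * n2, dtype=float)
for i, elem1 in enumerate(l1):
    # the vectorized inner loop: distances from one element of l1
    # to every element of l2, written into one slice of out
    out[i*n2:(i+1)*n2] = numpy.absolute(elem1 - l2)

# "do_something_else": the median of all the pairwise distances
out.sort()
m = len(out)
result = 0.5 * (out[(m - 1) // 2] + out[m // 2])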
> > Perhaps there's some way in numpy to make one big honking array that > contains all the pairs from the two lists, and then just run my > do_something on that huge array, but that of course scales poorly. I know of at least one way, but it's a bit of a kludge. I don't think I'd try that though. As you said, it scales poorly. As long as you can vectorize your inner loop, it's not necessary and sometimes makes things worse, to vectorize your outer loop as well. That's assuming your inner loop is large, it doesn't help if your inner loop is 3 elements long for instance, but that doesn't seem like it should be a problem here. Regards, -tim From haase at msg.ucsf.edu Sun Apr 2 17:01:04 2006 From: haase at msg.ucsf.edu (Sebastian Haase) Date: Sun Apr 2 17:01:04 2006 Subject: [Numpy-discussion] first impressions with numpy In-Reply-To: <442FE950.8090000@cox.net> References: <442D9124.5020905@msg.ucsf.edu> <442D9695.2050900@cox.net> <442DB655.2050203@cox.net> <442DB91F.9030103@msg.ucsf.edu> <442DD638.60706@cox.net> <442FDDD5.8050404@sympatico.ca> <442FE950.8090000@cox.net> Message-ID: <44306594.50305@msg.ucsf.edu> Tim Hochberg wrote: > This would work fine if repr were instead: > > dtype([('x', float64), ('z', complex128)]) > > Anyway, this all seems reasonable to me at first glance. That said, I > don't plan to work on this, I've got other fish to fry at the moment. A new point: Please remind me (and probably others): when did it get decided to introduce 'complex128' to mean numarray's complex64 and the 'complex64' to mean numarray's complex32 ? I do understand the logic that 128 is really the bit-size of one (complex) element - but I also liked the old way, because: 1. e.g. in fft transforms, float32 would "go with" complex32 and float64 with complex64 2. complex128 is one character extra (longer) and also (alphabetically) now sorts before(!) complex64 These might just be my personal (idiotic ;-) comments - but I would appreciate some feedback/comments. Also: Is it now to late to (re-)start a discussion on this !? Thanks - Sebastian Haase From haase at msg.ucsf.edu Sun Apr 2 17:09:07 2006 From: haase at msg.ucsf.edu (Sebastian Haase) Date: Sun Apr 2 17:09:07 2006 Subject: [Numpy-discussion] first impressions with numpy In-Reply-To: <442FE950.8090000@cox.net> References: <442D9124.5020905@msg.ucsf.edu> <442D9695.2050900@cox.net> <442DB655.2050203@cox.net> <442DB91F.9030103@msg.ucsf.edu> <442DD638.60706@cox.net> <442FDDD5.8050404@sympatico.ca> <442FE950.8090000@cox.net> Message-ID: <44306774.5030507@msg.ucsf.edu> Tim Hochberg wrote: > This would work fine if repr were instead: > > dtype([('x', float64), ('z', complex128)]) > > Anyway, this all seems reasonable to me at first glance. That said, I > don't plan to work on this, I've got other fish to fry at the moment. A new point: Please remind me (and probably others): when did it get decided to introduce 'complex128' to mean numarray's complex64 and the 'complex64' to mean numarray's complex32 ? I do understand the logic that 128 is really the bit-size of one (complex) element - but I also liked the old way, because: 1. e.g. in fft transforms, float32 would "go with" complex32 and float64 with complex64 2. complex128 is one character extra (longer) and also (alphabetically) now sorts before(!) complex64 3 Mostly of course: this new naming will confuse all my code and introduce hard to find bugs - when I see complex64 I will "think" the old way for quite some time ... 
These might just be my personal (idiotic ;-) comments - but I would appreciate some feedback/comments. Also: Is it now to late to (re-)start a discussion on this !? Thanks - Sebastian Haase From zpincus at stanford.edu Sun Apr 2 17:17:06 2006 From: zpincus at stanford.edu (Zachary Pincus) Date: Sun Apr 2 17:17:06 2006 Subject: [Numpy-discussion] Speed up function on cross product of two sets? In-Reply-To: <443063C0.3050002@cox.net> References: <443063C0.3050002@cox.net> Message-ID: <8717BF2C-01EB-422A-B63A-7807E607DEE9@stanford.edu> Tim - Thanks for your suggestions -- that all makes good sense. It sounds like the general take home message is, as always: "the first thing to try is to vectorize your inner loop." Zach >> I have a inner loop that looks like this: >> out = [] >> for elem1 in l1: >> for elem2 in l2: >> out.append(do_something(l1, l2)) > > this is do_something(elem1, elem2), correct? > >> result = do_something_else(out) >> >> where do_something and do_something_else are implemented with >> only numpy ufuncs, and l1 and l2 are numpy arrays. >> >> As an example, I need to compute the median distance from any >> element in one set to any element in another set. >> >> What's the best way to speed this sort of thing up with numpy >> (e.g. push as much down into the underlying C as possible)? I >> could re- write do_something with the numexpr tools (which are >> very cool), but that doesn't address the fact that I've still got >> nested loops living in Python. > > The exact approach I'd take would depend on the sizes of l1 and l2 > and a certain amount of trial and error. However, the first thing > I'd try is: > > n1 = len(l1) > n2 = len(l2) > out = numpy.zeros([n1*n2], appropriate_dtype) > for i, elem1 in enumerate(l1): > out[i*n2:(i+1)*n2] = do_something(elem1, l1) > result = do_something_else(out) > > That may work as is, or you may have to tweak do_something slightly > to handle l1 correctly. You might also try to do the operations in > place and stuff the results into out directly by using X= and three > argument ufuncs. I'd not do that at first though. > > One thing to consider is that, in my experience, numpy works best > on chunks of about 10,000 elements. I believe that this is a > function of cache size. Anyway, this may choice of which of l1 and > l2 you continue to loop over, and which you vectorize. If they both > might get really big, you could even consider chopping up l1 when > you vectorize it. Again I wouldn't do that unless it really looks > like you need it. > > If that all sounds opaque, feel free to ask more questions. Or if > you have questions about microoptimizing the guts of do_something, > I have a bunch of experience with that and I like a good puzzle. > >> >> Perhaps there's some way in numpy to make one big honking array >> that contains all the pairs from the two lists, and then just run >> my do_something on that huge array, but that of course scales >> poorly. > > I know of at least one way, but it's a bit of a kludge. I don't > think I'd try that though. As you said, it scales poorly. As long > as you can vectorize your inner loop, it's not necessary and > sometimes makes things worse, to vectorize your outer loop as well. > That's assuming your inner loop is large, it doesn't help if your > inner loop is 3 elements long for instance, but that doesn't seem > like it should be a problem here. 
> > Regards, > > -tim > From haase at msg.ucsf.edu Sun Apr 2 17:21:14 2006 From: haase at msg.ucsf.edu (Sebastian Haase) Date: Sun Apr 2 17:21:14 2006 Subject: [Fwd: Re: [Numpy-discussion] first impressions with numpy] Message-ID: <44306A2C.4040606@msg.ucsf.edu> supposedly meant for the whole list ... From: Tim Hochberg Sebastian Haase wrote: > Tim Hochberg wrote: > > >> This would work fine if repr were instead: >> >> dtype([('x', float64), ('z', complex128)]) >> >> Anyway, this all seems reasonable to me at first glance. That said, I >> don't plan to work on this, I've got other fish to fry at the moment. > > > A new point: Please remind me (and probably others): when did it get > decided to introduce 'complex128' to mean numarray's complex64 > and the 'complex64' to mean numarray's complex32 ? I haven't the faintest idea -- it happened when I was off in Numarray land I assume. Or it was always that way? No idea. Hopefully Travis will answer this. -tim > > I do understand the logic that 128 is really the bit-size of one > (complex) element - but I also liked the old way, because: > 1. e.g. in fft transforms, float32 would "go with" complex32 > and float64 with complex64 > 2. complex128 is one character extra (longer) and also > (alphabetically) now sorts before(!) complex64 > 3 Mostly of course: this new naming will confuse all my code and > introduce hard to find bugs - when I see complex64 I will "think" the > old way for quite some time ... > > > These might just be my personal (idiotic ;-) comments - but I would > appreciate some feedback/comments. > Also: Is it now to late to (re-)start a discussion on this !? > > Thanks > - Sebastian Haase > > From tim.hochberg at cox.net Sun Apr 2 17:53:01 2006 From: tim.hochberg at cox.net (Tim Hochberg) Date: Sun Apr 2 17:53:01 2006 Subject: [Numpy-discussion] Speed up function on cross product of two sets? In-Reply-To: <8717BF2C-01EB-422A-B63A-7807E607DEE9@stanford.edu> References: <443063C0.3050002@cox.net> <8717BF2C-01EB-422A-B63A-7807E607DEE9@stanford.edu> Message-ID: <443071BA.4090606@cox.net> Zachary Pincus wrote: > Tim - > > Thanks for your suggestions -- that all makes good sense. > > It sounds like the general take home message is, as always: "the > first thing to try is to vectorize your inner loop." Exactly and far more pithy than my meanderings. If I were going to make a list it would look something like: 0. Think about your algorithm. 1. Vectorize your inner loop. 2. Eliminate temporaries 3. Ask for help 4. Recode in C. 5 Accept that your code will never be fast. Step zero should probably be repeated after every other step ;) -tim > > Zach > > >>> I have a inner loop that looks like this: >>> out = [] >>> for elem1 in l1: >>> for elem2 in l2: >>> out.append(do_something(l1, l2)) >> >> >> this is do_something(elem1, elem2), correct? >> >>> result = do_something_else(out) >>> >>> where do_something and do_something_else are implemented with only >>> numpy ufuncs, and l1 and l2 are numpy arrays. >>> >>> As an example, I need to compute the median distance from any >>> element in one set to any element in another set. >>> >>> What's the best way to speed this sort of thing up with numpy >>> (e.g. push as much down into the underlying C as possible)? I >>> could re- write do_something with the numexpr tools (which are very >>> cool), but that doesn't address the fact that I've still got >>> nested loops living in Python. 
>> >> >> The exact approach I'd take would depend on the sizes of l1 and l2 >> and a certain amount of trial and error. However, the first thing >> I'd try is: >> >> n1 = len(l1) >> n2 = len(l2) >> out = numpy.zeros([n1*n2], appropriate_dtype) >> for i, elem1 in enumerate(l1): >> out[i*n2:(i+1)*n2] = do_something(elem1, l1) >> result = do_something_else(out) >> >> That may work as is, or you may have to tweak do_something slightly >> to handle l1 correctly. You might also try to do the operations in >> place and stuff the results into out directly by using X= and three >> argument ufuncs. I'd not do that at first though. >> >> One thing to consider is that, in my experience, numpy works best on >> chunks of about 10,000 elements. I believe that this is a function >> of cache size. Anyway, this may choice of which of l1 and l2 you >> continue to loop over, and which you vectorize. If they both might >> get really big, you could even consider chopping up l1 when you >> vectorize it. Again I wouldn't do that unless it really looks like >> you need it. >> >> If that all sounds opaque, feel free to ask more questions. Or if >> you have questions about microoptimizing the guts of do_something, I >> have a bunch of experience with that and I like a good puzzle. >> >>> >>> Perhaps there's some way in numpy to make one big honking array >>> that contains all the pairs from the two lists, and then just run >>> my do_something on that huge array, but that of course scales poorly. >> >> >> I know of at least one way, but it's a bit of a kludge. I don't >> think I'd try that though. As you said, it scales poorly. As long >> as you can vectorize your inner loop, it's not necessary and >> sometimes makes things worse, to vectorize your outer loop as well. >> That's assuming your inner loop is large, it doesn't help if your >> inner loop is 3 elements long for instance, but that doesn't seem >> like it should be a problem here. >> >> Regards, >> >> -tim >> > > > From oliphant.travis at ieee.org Sun Apr 2 21:14:01 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Sun Apr 2 21:14:01 2006 Subject: [Numpy-discussion] Deprecating old names In-Reply-To: <44302EA9.9050302@ftw.at> References: <442F41AE.1080806@bigpond.net.au> <44301590.4050707@cox.net> <44302EA9.9050302@ftw.at> Message-ID: <4430A0BF.1080207@ieee.org> Ed Schofield wrote: > Tim Hochberg wrote, in a different thread: > >> >>> len(dir(numpy)) >> 476 >> >> Does anyone know what all of that does? I certainly don't. And I doubt >> anyone uses more than a fraction of that interface. I wouldn't be the >> least bit suprised if there are old moldy parts of that are >> essentially used. And, unused code is buggy code in my experience. >> >> "Perfection is achieved, not when there is nothing more to add, but >> when there is nothing left to take away." -- Antoine de Saint-Exupery >> > > I'd like to revise a proposal I made last week. Then I proposed that we > reduce namespace clutter by not importing the contents of the oldnumeric > namespace by default. But Travis didn't want to deprecate the > functional interfaces (sum(), take(), etc), so I now propose instead > that we split up the contents of oldnumeric.py into interfaces we want > to keep around indefinitely and interfaces we don't. Good idea... -Travis From rob at hooft.net Sun Apr 2 22:46:09 2006 From: rob at hooft.net (Rob W.W. 
Hooft) Date: Sun Apr 2 22:46:09 2006 Subject: [Fwd: Re: [Numpy-discussion] first impressions with numpy] In-Reply-To: <44306A2C.4040606@msg.ucsf.edu> References: <44306A2C.4040606@msg.ucsf.edu> Message-ID: <4430B5D6.7020907@hooft.net> Sebastian Haase wrote: >> A new point: Please remind me (and probably others): when did it get >> decided to introduce 'complex128' to mean numarray's complex64 >> and the 'complex64' to mean numarray's complex32 ? > > > I haven't the faintest idea -- it happened when I was off in Numarray > land I assume. Or it was always that way? No idea. Hopefully Travis will > answer this. Fortran heritage? REAL*8 is paired with COMPLEX*16 there.... Regards, Rob Hooft From arnd.baecker at web.de Mon Apr 3 02:18:08 2006 From: arnd.baecker at web.de (Arnd Baecker) Date: Mon Apr 3 02:18:08 2006 Subject: [Numpy-discussion] Speed up function on cross product of two sets? In-Reply-To: References: Message-ID: Hi, On Sun, 2 Apr 2006, Zachary Pincus wrote: > Hi folks, > > I have a inner loop that looks like this: > out = [] > for elem1 in l1: > for elem2 in l2: > out.append(do_something(l1, l2)) > result = do_something_else(out) > > where do_something and do_something_else are implemented with only > numpy ufuncs, and l1 and l2 are numpy arrays. > > As an example, I need to compute the median distance from any element > in one set to any element in another set. > > What's the best way to speed this sort of thing up with numpy (e.g. > push as much down into the underlying C as possible)? I could re- > write do_something with the numexpr tools (which are very cool), but > that doesn't address the fact that I've still got nested loops living > in Python. If do_something eats arrays, you could try: result = do_something(l1[:,NewAxis], l2) E.g.: from numpy import * l1 = linspace(0.0, pi, 10) l2 = linspace(0.0, pi, 3) def f(y, x): return sin(y)*cos(x) print f(l1[:,NewAxis], l2) ((Note that I just learned in some other thread that with numpy there is an alternative to NewAxis, but I haven't figured out which that is ...)) Best, Arnd From zpincus at stanford.edu Mon Apr 3 08:50:10 2006 From: zpincus at stanford.edu (Zachary Pincus) Date: Mon Apr 3 08:50:10 2006 Subject: [Numpy-discussion] Speed up function on cross product of two sets? In-Reply-To: <443071BA.4090606@cox.net> References: <443063C0.3050002@cox.net> <8717BF2C-01EB-422A-B63A-7807E607DEE9@stanford.edu> <443071BA.4090606@cox.net> Message-ID: > If I were going to make a list it would look something like: > > 0. Think about your algorithm. > 1. Vectorize your inner loop. > 2. Eliminate temporaries > 3. Ask for help > 4. Recode in C. > 5 Accept that your code will never be fast. > > Step zero should probably be repeated after every other step ;) Thanks for this list -- it's a good one. Since we're discussing this, could I ask about the best way to eliminate temporaries? If you're using ufuncs, is there some way to make them work in-place? Or is the lowest-hanging fruit (temporary- wise) typically elsewhere? Zach From tim.hochberg at cox.net Mon Apr 3 10:10:40 2006 From: tim.hochberg at cox.net (Tim Hochberg) Date: Mon Apr 3 10:10:40 2006 Subject: [Numpy-discussion] Speed up function on cross product of two sets? In-Reply-To: References: Message-ID: <44315633.4010600@cox.net> Arnd Baecker wrote: [SNIP] >((Note that I just learned in some other thread that with numpy there is >an alternative to NewAxis, but I haven't figured out which that is ...)) > > If you're old school you could just use None. But you probably mean 'newaxis'. 
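A quick sketch confirming that the spellings agree, using Arnd's example values:

import numpy

l1 = numpy.linspace(0.0, numpy.pi, 10)
l2 = numpy.linspace(0.0, numpy.pi, 3)

def f(y, x):
    return numpy.sin(y) * numpy.cos(x)

# numpy.newaxis is simply an alias for None, so these are equivalent
a = f(l1[:, numpy.newaxis], l2)
b = f(l1[:, None], l2)
print a.shape, numpy.all(a == b)    # (10, 3) True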
-tim From robert.kern at gmail.com Mon Apr 3 10:19:02 2006 From: robert.kern at gmail.com (Robert Kern) Date: Mon Apr 3 10:19:02 2006 Subject: [Numpy-discussion] Re: Speed up function on cross product of two sets? In-Reply-To: References: <443063C0.3050002@cox.net> <8717BF2C-01EB-422A-B63A-7807E607DEE9@stanford.edu> <443071BA.4090606@cox.net> Message-ID: Zachary Pincus wrote: >> If I were going to make a list it would look something like: >> >> 0. Think about your algorithm. >> 1. Vectorize your inner loop. >> 2. Eliminate temporaries >> 3. Ask for help >> 4. Recode in C. >> 5 Accept that your code will never be fast. >> >> Step zero should probably be repeated after every other step ;) > > Thanks for this list -- it's a good one. > > Since we're discussing this, could I ask about the best way to > eliminate temporaries? If you're using ufuncs, is there some way to > make them work in-place? Or is the lowest-hanging fruit (temporary- > wise) typically elsewhere? Many binary ufuncs take an optional third argument which is an array which the ufunc should put the result in. In [2]: x = arange(10) In [3]: y = arange(10) In [4]: id(x) Out[4]: 91297984 In [5]: add(x, y, x) Out[5]: array([ 0, 2, 4, 6, 8, 10, 12, 14, 16, 18]) In [6]: id(Out[5]) Out[6]: 91297984 In [7]: x Out[7]: array([ 0, 2, 4, 6, 8, 10, 12, 14, 16, 18]) -- Robert Kern robert.kern at gmail.com "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From tim.hochberg at cox.net Mon Apr 3 10:36:05 2006 From: tim.hochberg at cox.net (Tim Hochberg) Date: Mon Apr 3 10:36:05 2006 Subject: [Numpy-discussion] Speed up function on cross product of two sets? In-Reply-To: References: <443063C0.3050002@cox.net> <8717BF2C-01EB-422A-B63A-7807E607DEE9@stanford.edu> <443071BA.4090606@cox.net> Message-ID: <44315CD6.3010001@cox.net> Zachary Pincus wrote: >> If I were going to make a list it would look something like: >> >> 0. Think about your algorithm. >> 1. Vectorize your inner loop. >> 2. Eliminate temporaries >> 3. Ask for help >> 4. Recode in C. >> 5 Accept that your code will never be fast. >> >> Step zero should probably be repeated after every other step ;) > > > Thanks for this list -- it's a good one. > > Since we're discussing this, could I ask about the best way to > eliminate temporaries? If you're using ufuncs, is there some way to > make them work in-place? Or is the lowest-hanging fruit (temporary- > wise) typically elsewhere? The least cryptic is to use *=, +=, where you can. But that only gets you so far. As you guessed, there is a secret extra argument to ufuncs that allows you to do results in place. One could replace scratch=a*(b+sqrt(a)) with: >>> scratch = zeros([5], dtype=float) >>> a = arange(5, dtype=float) >>> b = arange(5, dtype=float) >>> sqrt(a, scratch) array([ 0. , 1. , 1.41421356, 1.73205081, 2. ]) >>> add(scratch, b, scratch) array([ 0. , 2. , 3.41421356, 4.73205081, 6. ]) >>> multiply(a, scratch, scratch) array([ 0. , 2. , 6.82842712, 14.19615242, 24. ]) The downside of this is that your code goes from comprehensible to insanely cryptic pretty fast. I only resort to this in extreme circumstances. You could also use numexpr, which should be faster and is much less cryptic, but may not be completely stable yet. Oh, and don't forget step 0, that's sometimes a good way to reduce temporaries.
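For completeness, a sketch of the less cryptic *=/+= spelling of the same computation, which trades the hidden temporaries for one explicit scratch array:

import numpy

a = numpy.arange(5, dtype=float)
b = numpy.arange(5, dtype=float)

# same value as a*(b + sqrt(a)), built up in a single buffer
scratch = numpy.sqrt(a)    # the one allocation
scratch += b               # in place
scratch *= a               # in place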
regards, -tim From verveer at embl-heidelberg.de Mon Apr 3 12:00:04 2006 From: verveer at embl-heidelberg.de (Peter Verveer) Date: Mon Apr 3 12:00:04 2006 Subject: [Numpy-discussion] Re: Speed up function on cross product of two sets? In-Reply-To: References: <443063C0.3050002@cox.net> <8717BF2C-01EB-422A-B63A-7807E607DEE9@stanford.edu> <443071BA.4090606@cox.net> Message-ID: <012B117C-4046-4058-B7F9-AC5EDB68A532@embl-heidelberg.de> On 3 Apr 2006, at 19:17, Robert Kern wrote: > Zachary Pincus wrote: >>> If I were going to make a list it would look something like: >>> >>> 0. Think about your algorithm. >>> 1. Vectorize your inner loop. >>> 2. Eliminate temporaries >>> 3. Ask for help >>> 4. Recode in C. >>> 5 Accept that your code will never be fast. >>> >>> Step zero should probably be repeated after every other step ;) >> >> Thanks for this list -- it's a good one. >> >> Since we're discussing this, could I ask about the best way to >> eliminate temporaries? If you're using ufuncs, is there some way to >> make them work in-place? Or is the lowest-hanging fruit (temporary- >> wise) typically elsewhere? > > Many binary ufuncs take an optional third argument which is an > array which the > ufunc should put the result in. I wished many times that all functions would support an optional output argument. It is not only important for speed optimization, but also if you work with large data sets. I guess the use of a return values is much more natural but when the point comes that you want to optimize your algorithm, the ability to use an output argument instead is very valuable. It would be nice if all functions by default would support a standard keyword argument 'output', just like ufuncs do. I suppose these could in principle be added while still maintaining backwards compatibility. Cheers, Peter From oliphant at ee.byu.edu Mon Apr 3 15:59:06 2006 From: oliphant at ee.byu.edu (Travis Oliphant) Date: Mon Apr 3 15:59:06 2006 Subject: [Numpy-discussion] first impressions with numpy In-Reply-To: <44306594.50305@msg.ucsf.edu> References: <442D9124.5020905@msg.ucsf.edu> <442D9695.2050900@cox.net> <442DB655.2050203@cox.net> <442DB91F.9030103@msg.ucsf.edu> <442DD638.60706@cox.net> <442FDDD5.8050404@sympatico.ca> <442FE950.8090000@cox.net> <44306594.50305@msg.ucsf.edu> Message-ID: <4431A8A0.9010604@ee.byu.edu> Sebastian Haase wrote: > Tim Hochberg wrote: > > >> This would work fine if repr were instead: >> >> dtype([('x', float64), ('z', complex128)]) >> >> Anyway, this all seems reasonable to me at first glance. That said, I >> don't plan to work on this, I've got other fish to fry at the moment. > > > A new point: Please remind me (and probably others): when did it get > decided to introduce 'complex128' to mean numarray's complex64 > and the 'complex64' to mean numarray's complex32 ? It was last February (i.e. 2005) when I first started posting regarding the new NumPy. I claimed it was more consistent to use actual bit-widths. A few people, including Perry, indicated they weren't opposed to the change and so I went ahead with it. You can read relevant posts by searching on numpy-discussion at lists.sourceforge.net Discussions are always welcome. I suppose it's not too late to change something like this --- but it's getting there... 
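A quick way to check the convention at the prompt, as a sketch (itemsize is in bytes):

import numpy

# the numeric suffix counts total bits, i.e. itemsize * 8
print numpy.dtype(numpy.complex128).itemsize    # 16: two 64-bit floats
print numpy.dtype(numpy.complex64).itemsize     # 8: two 32-bit floats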
-Travis From ryanlists at gmail.com Mon Apr 3 17:50:03 2006 From: ryanlists at gmail.com (Ryan Krauss) Date: Mon Apr 3 17:50:03 2006 Subject: [Numpy-discussion] string matrices Message-ID: I am trying to use NumPy to generate some matrix inputs to Maxima for symbolic analysis. I am using a fair number of matrix.astype('S%d'%maxlen) statements. This seems to work very well. It also doesn't seem to pad the elements in anyway if maxlen is bigger than I need, which is great. This may seem like a dumb computer science question, but what is the memory/performance cost of making maxlen bigger than I want (but making sure that it is way bigger than I need so that the elements don't get truncated)? If my biggest matrices will be 13x13, how long can the strings be before I consume more than a few megs (or a few dozen megs) of memory? Thanks, Ryan From haase at msg.ucsf.edu Mon Apr 3 22:06:05 2006 From: haase at msg.ucsf.edu (Sebastian Haase) Date: Mon Apr 3 22:06:05 2006 Subject: [Numpy-discussion] Vote: complex64 vs complex128 (was: first impressions with numpy In-Reply-To: <4431A8A0.9010604@ee.byu.edu> References: <442D9124.5020905@msg.ucsf.edu> <442D9695.2050900@cox.net> <442DB655.2050203@cox.net> <442DB91F.9030103@msg.ucsf.edu> <442DD638.60706@cox.net> <442FDDD5.8050404@sympatico.ca> <442FE950.8090000@cox.net> <44306594.50305@msg.ucsf.edu> <4431A8A0.9010604@ee.byu.edu> Message-ID: <4431FE90.6060301@msg.ucsf.edu> Hi, Could we start another poll on this !? I think I would vote +1 for complex32 & complex64 mostly just because of "that's what I'm used to" But I'm curious to hear what others "know to be in use" - e.g. Matlab or IDL ! - Thanks Sebastian Haase Travis Oliphant wrote: > Sebastian Haase wrote: > >> Tim Hochberg wrote: >> >> >>> This would work fine if repr were instead: >>> >>> dtype([('x', float64), ('z', complex128)]) >>> >>> Anyway, this all seems reasonable to me at first glance. That said, I >>> don't plan to work on this, I've got other fish to fry at the moment. >> >> >> A new point: Please remind me (and probably others): when did it get >> decided to introduce 'complex128' to mean numarray's complex64 >> and the 'complex64' to mean numarray's complex32 ? > > It was last February (i.e. 2005) when I first started posting regarding > the new NumPy. I claimed it was more consistent to use actual > bit-widths. A few people, including Perry, indicated they weren't > opposed to the change and so I went ahead with it. > > You can read relevant posts by searching on > numpy-discussion at lists.sourceforge.net > > Discussions are always welcome. I suppose it's not too late to change > something like this --- but it's getting there... > > -Travis From robert.kern at gmail.com Mon Apr 3 22:41:02 2006 From: robert.kern at gmail.com (Robert Kern) Date: Mon Apr 3 22:41:02 2006 Subject: [Numpy-discussion] Re: Vote: complex64 vs complex128 In-Reply-To: <4431FE90.6060301@msg.ucsf.edu> References: <442D9124.5020905@msg.ucsf.edu> <442D9695.2050900@cox.net> <442DB655.2050203@cox.net> <442DB91F.9030103@msg.ucsf.edu> <442DD638.60706@cox.net> <442FDDD5.8050404@sympatico.ca> <442FE950.8090000@cox.net> <44306594.50305@msg.ucsf.edu> <4431A8A0.9010604@ee.byu.edu> <4431FE90.6060301@msg.ucsf.edu> Message-ID: Sebastian Haase wrote: > Hi, > Could we start another poll on this !? Please, let's leave voting as a method of last resort. > I think I would vote > +1 for complex32 & complex64 mostly just because of "that's what I'm > used to" > > But I'm curious to hear what others "know to be in use" - e.g. 
Matlab or > IDL ! On the merits of the issue, I like the new scheme better. For whatever reason, I tend to remember it when coding. With Numeric, I would frequently second-guess myself and go to the prompt and tab-complete to look at all of the options and reason out the one I wanted. -- Robert Kern robert.kern at gmail.com "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From tim.hochberg at cox.net Mon Apr 3 22:49:02 2006 From: tim.hochberg at cox.net (Tim Hochberg) Date: Mon Apr 3 22:49:02 2006 Subject: [Numpy-discussion] Re: Vote: complex64 vs complex128 In-Reply-To: References: <442D9124.5020905@msg.ucsf.edu> <442D9695.2050900@cox.net> <442DB655.2050203@cox.net> <442DB91F.9030103@msg.ucsf.edu> <442DD638.60706@cox.net> <442FDDD5.8050404@sympatico.ca> <442FE950.8090000@cox.net> <44306594.50305@msg.ucsf.edu> <4431A8A0.9010604@ee.byu.edu> <4431FE90.6060301@msg.ucsf.edu> Message-ID: <443208B9.40106@cox.net> Robert Kern wrote: >Sebastian Haase wrote: > > >>Hi, >>Could we start another poll on this !? >> >> > >Please, let's leave voting as a method of last resort. > > > >>I think I would vote >>+1 for complex32 & complex64 mostly just because of "that's what I'm >>used to" >> >>But I'm curious to hear what others "know to be in use" - e.g. Matlab or >>IDL ! >> >> > >On the merits of the issue, I like the new scheme better. For whatever reason, I >tend to remember it when coding. With Numeric, I would frequently second-guess >myself and go to the prompt and tab-complete to look at all of the options and >reason out the one I wanted. > > I can't bring myself to care. I almost always use dtype=complex and on the rare times I don't I can never remember what the scheme is regardless of which scheme it is / was / will be. On the other hand, if the scheme was Complex32x2 and Complex64x2, I could probably decipher what that was without looking it up. It is a little ugly and weird I admit, but that probably wouldn't bother me. Regards, -tim From arnd.baecker at web.de Mon Apr 3 23:36:00 2006 From: arnd.baecker at web.de (Arnd Baecker) Date: Mon Apr 3 23:36:00 2006 Subject: [Numpy-discussion] Re: Vote: complex64 vs complex128 In-Reply-To: References: <442D9124.5020905@msg.ucsf.edu> <442D9695.2050900@cox.net> <442DB655.2050203@cox.net> <442DB91F.9030103@msg.ucsf.edu> <442DD638.60706@cox.net> <442FDDD5.8050404@sympatico.ca> <442FE950.8090000@cox.net> <44306594.50305@msg.ucsf.edu> <4431A8A0.9010604@ee.byu.edu> <4431FE90.6060301@msg.ucsf.edu> Message-ID: On Tue, 4 Apr 2006, Robert Kern wrote: > Sebastian Haase wrote: > > Hi, > > Could we start another poll on this !? > > Please, let's leave voting as a method of last resort. > > > I think I would vote > > +1 for complex32 & complex64 mostly just because of "that's what I'm > > used to" > > > > But I'm curious to hear what others "know to be in use" - e.g. Matlab or > > IDL ! > > On the merits of the issue, I like the new scheme better. For whatever reason, I > tend to remember it when coding. With Numeric, I would frequently second-guess > myself and go to the prompt and tab-complete to look at all of the options and > reason out the one I wanted. In order to get an opinion on the subject: How would one presently find out about the meaning of complex64 and complex128? The following attempt does not help: In [1]:import numpy In [2]:numpy.complex64?
Type: type Base Class: String Form: Namespace: Interactive Docstring: In [3]:numpy.complex128? Type: type Base Class: String Form: Namespace: Interactive Docstring: I also looked in Travis' "Guide to NumPy", where the different types are discussed on page 18 (referring to the sample chapters at http://www.tramy.us/guidetoscipy.html) Maybe chapter 12 contains more info on this ((our library was still not able to buy the 20 copies since this request was approved a month ago ...)) Best, Arnd From cjw at sympatico.ca Tue Apr 4 06:20:44 2006 From: cjw at sympatico.ca (Colin J. Williams) Date: Tue Apr 4 06:20:44 2006 Subject: [Numpy-discussion] Vote: complex64 vs complex128 In-Reply-To: <4431FE90.6060301@msg.ucsf.edu> References: <442D9124.5020905@msg.ucsf.edu> <442D9695.2050900@cox.net> <442DB655.2050203@cox.net> <442DB91F.9030103@msg.ucsf.edu> <442DD638.60706@cox.net> <442FDDD5.8050404@sympatico.ca> <442FE950.8090000@cox.net> <44306594.50305@msg.ucsf.edu> <4431A8A0.9010604@ee.byu.edu> <4431FE90.6060301@msg.ucsf.edu> Message-ID: <443271C9.6080907@sympatico.ca> Sebastian Haase wrote: > Hi, > Could we start another poll on this !? > > I think I would vote > +1 for complex32 & complex64 mostly just because of "that's what I'm > used to" +1 Most people look to the number to give a clue as to the precision of the value. Colin W. > > But I'm curious to hear what others "know to be in use" - e.g. Matlab > or IDL ! > > - Thanks > Sebastian Haase > > > > Travis Oliphant wrote: > >> Sebastian Haase wrote: >> >>> Tim Hochberg wrote: >>> >>> >>>> This would work fine if repr were instead: >>>> >>>> dtype([('x', float64), ('z', complex128)]) >>>> >>>> Anyway, this all seems reasonable to me at first glance. That said, >>>> I don't plan to work on this, I've got other fish to fry at the >>>> moment. >>> >>> >>> >>> A new point: Please remind me (and probably others): when did it get >>> decided to introduce 'complex128' to mean numarray's complex64 >>> and the 'complex64' to mean numarray's complex32 ? >> >> >> It was last February (i.e. 2005) when I first started posting >> regarding the new NumPy. I claimed it was more consistent to use >> actual bit-widths. A few people, including Perry, indicated they >> weren't opposed to the change and so I went ahead with it. >> >> You can read relevant posts by searching on >> numpy-discussion at lists.sourceforge.net >> >> Discussions are always welcome. I suppose it's not too late to >> change something like this --- but it's getting there... >> >> -Travis > > > > > ------------------------------------------------------- > This SF.Net email is sponsored by xPML, a groundbreaking scripting > language > that extends applications into web and mobile media. Attend the live > webcast > and join the prime developer group breaking into this new coding > territory! > http://sel.as-us.falkag.net/sel?cmd=lnk&kid=110944&bid=241720&dat=121642 > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/numpy-discussion > From ryanlists at gmail.com Tue Apr 4 07:27:01 2006 From: ryanlists at gmail.com (Ryan Krauss) Date: Tue Apr 4 07:27:01 2006 Subject: [Numpy-discussion] Re: string matrices In-Reply-To: References: Message-ID: I actually have a problem with the elements of a string matrix from astype('S#'). The shorter elements in my matrix have a bunch of terms like '1.0', because the matrix they started from was a float. 
I need to keep the float type, but want to get rid of the '.0 ' when I convert the string output to latex. I was going to check if element[-2:]=='.0' but ran into this problem: In [15]: temp[-2:] Out[15]: '\x00\x00' In [16]: temp.strip() Out[16]: '1.0\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00' I think I can get rid of the \x00's by calling str(element), but is this a feature or a bug? It would be slightly cleaner for me if the string matrix elements didn't have the trailing null characters (or whatever those are), but this may not be possible given the underlying representation. Thanks, Ryan On 4/3/06, Ryan Krauss wrote: > I am trying to use NumPy to generate some matrix inputs to Maxima for > symbolic analysis. I am using a fair number of > matrix.astype('S%d'%maxlen) statements. This seems to work very well. > It also doesn't seem to pad the elements in anyway if maxlen is > bigger than I need, which is great. This may seem like a dumb > computer science question, but what is the memory/performance cost of > making maxlen bigger than I want (but making sure that it is way > bigger than I need so that the elements don't get truncated)? If my > biggest matrices will be 13x13, how long can the strings be before I > consume more than a few megs (or a few dozen megs) of memory? > > Thanks, > > Ryan > From charlesr.harris at gmail.com Tue Apr 4 08:16:07 2006 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue Apr 4 08:16:07 2006 Subject: [Numpy-discussion] Vote: complex64 vs complex128 In-Reply-To: <443271C9.6080907@sympatico.ca> References: <442D9124.5020905@msg.ucsf.edu> <442DB655.2050203@cox.net> <442DB91F.9030103@msg.ucsf.edu> <442DD638.60706@cox.net> <442FDDD5.8050404@sympatico.ca> <442FE950.8090000@cox.net> <44306594.50305@msg.ucsf.edu> <4431A8A0.9010604@ee.byu.edu> <4431FE90.6060301@msg.ucsf.edu> <443271C9.6080907@sympatico.ca> Message-ID: I can't get worked up over this one way or the other: complex128 make sense if I count bits, complex64 makes sense if I note precision; I just have to remember the numpy convention. One could argue that complex64 is the more conventional choice and so has the virtue of least surprise, but I don't think it is terribly difficult to become accustomed to using complex128 in its place. I suppose this is one of those programmer's vs user's point of view thingees. For the guy writing general low level numpy code what matters is the length of the type, how many bytes have to be moved and so on, and from the other point of view what counts is the precision of the arithmetic. Chuck On 4/4/06, Colin J. Williams wrote: > > Sebastian Haase wrote: > > > Hi, > > Could we start another poll on this !? > > > > I think I would vote > > +1 for complex32 & complex64 mostly just because of "that's what I'm > > used to" > > +1 Most people look to the number to give a clue as to the precision of > the value. > > Colin W. > > > > > But I'm curious to hear what others "know to be in use" - e.g. Matlab > > or IDL ! > > > > - Thanks > > Sebastian Haase > > > > > > > > Travis Oliphant wrote: > > > >> Sebastian Haase wrote: > >> > >>> Tim Hochberg wrote: > >>> > >>> > >>>> This would work fine if repr were instead: > >>>> > >>>> dtype([('x', float64), ('z', complex128)]) > >>>> > >>>> Anyway, this all seems reasonable to me at first glance. That said, > >>>> I don't plan to work on this, I've got other fish to fry at the > >>>> moment. 
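The two viewpoints are easy to put side by side, as a sketch (the names are the current numpy ones):

import numpy

# the "length" view: bits per element; under the "precision" view,
# the complex figures would be halved
for t in (numpy.float32, numpy.float64, numpy.complex64, numpy.complex128):
    dt = numpy.dtype(t)
    print dt.name, dt.itemsize * 8, 'bits per element'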
> >>> > >>> > >>> > >>> A new point: Please remind me (and probably others): when did it get > >>> decided to introduce 'complex128' to mean numarray's complex64 > >>> and the 'complex64' to mean numarray's complex32 ? > >> > >> > >> It was last February (i.e. 2005) when I first started posting > >> regarding the new NumPy. I claimed it was more consistent to use > >> actual bit-widths. A few people, including Perry, indicated they > >> weren't opposed to the change and so I went ahead with it. > >> > >> You can read relevant posts by searching on > >> numpy-discussion at lists.sourceforge.net > >> > >> Discussions are always welcome. I suppose it's not too late to > >> change something like this --- but it's getting there... > >> > >> -Travis > > > > > > > > > > ------------------------------------------------------- > > This SF.Net email is sponsored by xPML, a groundbreaking scripting > > language > > that extends applications into web and mobile media. Attend the live > > webcast > > and join the prime developer group breaking into this new coding > > territory! > > http://sel.as-us.falkag.net/sel?cmd=lnk&kid=110944&bid=241720&dat=121642 > > _______________________________________________ > > Numpy-discussion mailing list > > Numpy-discussion at lists.sourceforge.net > > https://lists.sourceforge.net/lists/listinfo/numpy-discussion > > > > > > ------------------------------------------------------- > This SF.Net email is sponsored by xPML, a groundbreaking scripting > language > that extends applications into web and mobile media. Attend the live > webcast > and join the prime developer group breaking into this new coding > territory! > http://sel.as-us.falkag.net/sel?cmd=lnk&kid=110944&bid=241720&dat=121642 > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From faltet at carabos.com Tue Apr 4 08:49:11 2006 From: faltet at carabos.com (Francesc Altet) Date: Tue Apr 4 08:49:11 2006 Subject: [Numpy-discussion] Re: Vote: complex64 vs complex128 In-Reply-To: References: <442D9124.5020905@msg.ucsf.edu> <4431FE90.6060301@msg.ucsf.edu> Message-ID: <200604041747.57180.faltet@carabos.com> A Dimarts 04 Abril 2006 07:40, Robert Kern va escriure: > Sebastian Haase wrote: > > I think I would vote > > +1 for complex32 & complex64 mostly just because of "that's what I'm > > used to" > > > > But I'm curious to hear what others "know to be in use" - e.g. Matlab or > > IDL ! > > On the merits of the issue, I like the new scheme better. For whatever > reason, I tend to remember it when coding. With Numeric, I would frequently > second-guess myself and go to the prompt and tab-complete to look at all of > the options and reason out the one I wanted. I agree with Robert. From the very beginning NumPy design has been very consequent with typeEXTENT_IN_BITS mapping (even for unicode), and if we go back to numarray (complex32/complex64) convention, this would be the only exception to this rule. Perhaps I'm a bit biased by being a developer more interested in type 'sizes' that in 'precision' issues, but I'd definitely prefer a completely consistent approach for this matter. So +1 for complex64 & complex128 Cheers, -- >0,0< Francesc Altet ? ? http://www.carabos.com/ V V C?rabos Coop. V. 
??Enjoy Data "-" From haase at msg.ucsf.edu Tue Apr 4 09:33:07 2006 From: haase at msg.ucsf.edu (Sebastian Haase) Date: Tue Apr 4 09:33:07 2006 Subject: [Numpy-discussion] Vote: complex64 vs complex128 In-Reply-To: References: <442D9124.5020905@msg.ucsf.edu> <443271C9.6080907@sympatico.ca> Message-ID: <200604040929.15815.haase@msg.ucsf.edu> On Tuesday 04 April 2006 08:09, Charles R Harris wrote: > I can't get worked up over this one way or the other: complex128 make sense > if I count bits, complex64 makes sense if I note precision; I just have to > remember the numpy convention. One could argue that complex64 is the more > conventional choice and so has the virtue of least surprise, but I don't > think it is terribly difficult to become accustomed to using complex128 in > its place. I suppose this is one of those programmer's vs user's point of > view thingees. For the guy writing general low level numpy code what > matters is the length of the type, how many bytes have to be moved and so > on, and from the other point of view what counts is the precision of the > arithmetic. I kind of like your comparison of programmer vs user ;-) And so I was "hoping" that numpy (and scipy !!) is intended for the users - like supposedly IDL and Matlab are... No one likes my "backwards compatibility" argument !? Thanks - Sebastian Haase PS: I understand that voting is only for a last resort - some people, always use na.Complex and na.Float and don't care - BUT I use single precision all the time because my image data is already getting to large. So I have to look at this every day, and as Travis pointed out, now is about the last chance to possibly change complex128 to complex64 ... > > Chuck > > On 4/4/06, Colin J. Williams wrote: > > Sebastian Haase wrote: > > > Hi, > > > Could we start another poll on this !? > > > > > > I think I would vote > > > +1 for complex32 & complex64 mostly just because of "that's what I'm > > > used to" > > > > +1 Most people look to the number to give a clue as to the precision of > > the value. > > > > Colin W. > > > > > But I'm curious to hear what others "know to be in use" - e.g. Matlab > > > or IDL ! > > > > > > - Thanks > > > Sebastian Haase > > > > > > Travis Oliphant wrote: > > >> Sebastian Haase wrote: > > >>> Tim Hochberg wrote: > > >>> > > >>> > > >>>> This would work fine if repr were instead: > > >>>> > > >>>> dtype([('x', float64), ('z', complex128)]) > > >>>> > > >>>> Anyway, this all seems reasonable to me at first glance. That said, > > >>>> I don't plan to work on this, I've got other fish to fry at the > > >>>> moment. > > >>> > > >>> A new point: Please remind me (and probably others): when did it get > > >>> decided to introduce 'complex128' to mean numarray's complex64 > > >>> and the 'complex64' to mean numarray's complex32 ? > > >> > > >> It was last February (i.e. 2005) when I first started posting > > >> regarding the new NumPy. I claimed it was more consistent to use > > >> actual bit-widths. A few people, including Perry, indicated they > > >> weren't opposed to the change and so I went ahead with it. > > >> > > >> You can read relevant posts by searching on > > >> numpy-discussion at lists.sourceforge.net > > >> > > >> Discussions are always welcome. I suppose it's not too late to > > >> change something like this --- but it's getting there... 
> > >> > > >> -Travis From robert.kern at gmail.com Tue Apr 4 09:52:11 2006 From: robert.kern at gmail.com (Robert Kern) Date: Tue Apr 4 09:52:11 2006 Subject: [Numpy-discussion] Re: string matrices In-Reply-To: References: Message-ID: Ryan Krauss wrote: > I actually have a problem with the elements of a string matrix from > astype('S#'). The shorter elements in my matrix have a bunch of terms > like '1.0', because the matrix they started from was a float. I need > to keep the float type, but want to get rid of the '.0 ' when I > convert the string output to latex. I was going to check if > element[-2:]=='.0' but ran into this problem: > > In [15]: temp[-2:] > Out[15]: '\x00\x00' > > In [16]: temp.strip() > Out[16]: '1.0\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00' > > I think I can get rid of the \x00's by calling str(element), but is > this a feature or a bug? Probably both. :-) On the one hand, you want to be able to get a useful string out of the array; the nulls are just padding, and the string that you put in was '1.0'. However, suppose that the string you put in was '1.\x00'. Then you would get the "wrong" string out. However, the only real alternative is to also store an integer containing the length of the string with each element. That probably interferes with some of the uses of string arrays. > It would be slightly cleaner for me if the > string matrix elements didn't have the trailing null characters (or > whatever those are), but this may not be possible given the underlying > representation. You can also use temp.strip('\x00') which is a bit more explicit. -- Robert Kern robert.kern at gmail.com "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From zpincus at stanford.edu Tue Apr 4 09:54:06 2006 From: zpincus at stanford.edu (Zachary Pincus) Date: Tue Apr 4 09:54:06 2006 Subject: [Numpy-discussion] Re: Vote: complex64 vs complex128 In-Reply-To: <443208B9.40106@cox.net> References: <442D9124.5020905@msg.ucsf.edu> <442D9695.2050900@cox.net> <442DB655.2050203@cox.net> <442DB91F.9030103@msg.ucsf.edu> <442DD638.60706@cox.net> <442FDDD5.8050404@sympatico.ca> <442FE950.8090000@cox.net> <44306594.50305@msg.ucsf.edu> <4431A8A0.9010604@ee.byu.edu> <4431FE90.6060301@msg.ucsf.edu> <443208B9.40106@cox.net> Message-ID: > On the other hand, if the scheme was Complex32x2 and Complex64x2, > I could probably decipher what that was without looking it up. It > is is a little ugly and weird I admit, but that probably wouldn't > bother me. On consideration, I'm +1 on Tim's suggestion here, if any change is going to be made. At least it has the virtue of being relatively clear, if a bit ugly. Zach From jh at oobleck.astro.cornell.edu Tue Apr 4 11:14:04 2006 From: jh at oobleck.astro.cornell.edu (Joe Harrington) Date: Tue Apr 4 11:14:04 2006 Subject: [Numpy-discussion] Re: Vote: complex64 vs complex128 In-Reply-To: <20060404155003.AF09016220@sc8-sf-spam2.sourceforge.net> (numpy-discussion-request@lists.sourceforge.net) References: <20060404155003.AF09016220@sc8-sf-spam2.sourceforge.net> Message-ID: <200604041813.k34IDmAP011500@oobleck.astro.cornell.edu> When I first heard of Complex128, my first response was, "Cool! I didn't even know there was a Double128!" Folks seem to agree that precision-based naming would be most intuitive to new users, but that length-based naming would be most intuitive to low-level programmers. 
This is a high-level package, whose purpose is to hide the numerical details and programming drudgery from the user as much as possible, while still offering high performance and not limiting capability too much. For this type of package, a good metric is "when it doesn't restrict capability, do what makes sense for new/naiive users". So, I favor Complex32 and Complex64. When you say "complex", everyone knows you mean 2 numbers. When you say 32 or 64 or 128, in the context of bits for floating values, almost everyone assumes you are talking that many bits of precision to represent one number. Consider future conversations about precision and data size. In precision discussions, you'd always have to clarify that complex128 had 64 bits of precision, just to make sure everyone was on the same key (particularly when 128-bit machines arrive). In data-size discussions, everyone would know to double the size for the two components. No extra clarification would be needed. IDL's behavior is irrelevant to us, since they just say "complex", and "dcomplex" for 32-bit and 64-bit precision. --jh-- From oliphant.travis at ieee.org Tue Apr 4 11:25:11 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Tue Apr 4 11:25:11 2006 Subject: [Numpy-discussion] Re: string matrices In-Reply-To: References: Message-ID: <4432B9C2.7040307@ieee.org> Ryan Krauss wrote: > I actually have a problem with the elements of a string matrix from > astype('S#'). The shorter elements in my matrix have a bunch of terms > like '1.0', because the matrix they started from was a float. I need > to keep the float type, but want to get rid of the '.0 ' when I > convert the string output to latex. I was going to check if > element[-2:]=='.0' but ran into this problem > > In [15]: temp[-2:] > Out[15]: '\x00\x00' > > In [16]: temp.strip() > Out[16]: '1.0\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00' > > I think I can get rid of the \x00's by calling str(element), but is > this a feature or a bug? Of course the elements are padded with '\x00' so that they are all the same length, but we have been trying to make it so that it doesn't matter. Equality testing is one area where it still does. We are using the underlying string equality testing (and it doesn't strip the '\x00'). So, I guess it's a missing feature at this point. -Travis From tim.hochberg at cox.net Tue Apr 4 11:41:10 2006 From: tim.hochberg at cox.net (Tim Hochberg) Date: Tue Apr 4 11:41:10 2006 Subject: [Numpy-discussion] Re: string matrices In-Reply-To: References: Message-ID: <4432BD89.3050501@cox.net> Robert Kern wrote: >Ryan Krauss wrote: > > >>I actually have a problem with the elements of a string matrix from >>astype('S#'). The shorter elements in my matrix have a bunch of terms >>like '1.0', because the matrix they started from was a float. I need >>to keep the float type, but want to get rid of the '.0 ' when I >>convert the string output to latex. I was going to check if >>element[-2:]=='.0' but ran into this problem: >> >>In [15]: temp[-2:] >>Out[15]: '\x00\x00' >> >>In [16]: temp.strip() >>Out[16]: '1.0\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00' >> >>I think I can get rid of the \x00's by calling str(element), but is >>this a feature or a bug? >> >> > >Probably both. :-) On the one hand, you want to be able to get a useful string >out of the array; the nulls are just padding, and the string that you put in was >'1.0'. However, suppose that the string you put in was '1.\x00'. 
Then you would >get the "wrong" string out. > >However, the only real alternative is to also store an integer containing the >length of the string with each element. That probably interferes with some of >the uses of string arrays. > > > >>It would be slightly cleaner for me if the >>string matrix elements didn't have the trailing null characters (or >>whatever those are), but this may not be possible given the underlying >>representation. >> >> > >You can also use temp.strip('\x00') which is a bit more explicit. > > > Or even temp.rstrip('\x00') which works for all those time you pad the front of your string with '\x00' ;) -tim From faltet at carabos.com Tue Apr 4 11:46:08 2006 From: faltet at carabos.com (Francesc Altet) Date: Tue Apr 4 11:46:08 2006 Subject: [Numpy-discussion] Re: Vote: complex64 vs complex128 In-Reply-To: <200604041813.k34IDmAP011500@oobleck.astro.cornell.edu> References: <20060404155003.AF09016220@sc8-sf-spam2.sourceforge.net> <200604041813.k34IDmAP011500@oobleck.astro.cornell.edu> Message-ID: <200604042045.39955.faltet@carabos.com> A Dimarts 04 Abril 2006 20:13, Joe Harrington va escriure: > When I first heard of Complex128, my first response was, "Cool! I > didn't even know there was a Double128!" > > Folks seem to agree that precision-based naming would be most > intuitive to new users, but that length-based naming would be most > intuitive to low-level programmers. This is a high-level package, > whose purpose is to hide the numerical details and programming > drudgery from the user as much as possible, while still offering high > performance and not limiting capability too much. For this type of > package, a good metric is "when it doesn't restrict capability, do > what makes sense for new/naiive users". > > So, I favor Complex32 and Complex64. When you say "complex", everyone > knows you mean 2 numbers. When you say 32 or 64 or 128, in the > context of bits for floating values, almost everyone assumes you are > talking that many bits of precision to represent one number. Consider > future conversations about precision and data size. In precision > discussions, you'd always have to clarify that complex128 had 64 bits > of precision, just to make sure everyone was on the same key > (particularly when 128-bit machines arrive). In data-size > discussions, everyone would know to double the size for the two > components. No extra clarification would be needed. Well, from my point of view of "low-level" user, I don't specially like this, but I understand the "high-level" position to be much more important in terms of number of users. Besides, I also see that NumPy should be adressed specially to the requirements of the later users. So for me is fine with complex32/complex64. Cheers, -- >0,0< Francesc Altet ? ? http://www.carabos.com/ V V C?rabos Coop. V. ??Enjoy Data "-" From robert.kern at gmail.com Tue Apr 4 12:15:08 2006 From: robert.kern at gmail.com (Robert Kern) Date: Tue Apr 4 12:15:08 2006 Subject: [Numpy-discussion] Re: Vote: complex64 vs complex128 In-Reply-To: <200604041813.k34IDmAP011500@oobleck.astro.cornell.edu> References: <20060404155003.AF09016220@sc8-sf-spam2.sourceforge.net> <200604041813.k34IDmAP011500@oobleck.astro.cornell.edu> Message-ID: Joe Harrington wrote: > When I first heard of Complex128, my first response was, "Cool! I > didn't even know there was a Double128!" > > Folks seem to agree that precision-based naming would be most > intuitive to new users, but that length-based naming would be most > intuitive to low-level programmers. 
I'm pretty sure that when any of us say that such-and-such is going to make the
most sense to new users, we're just guessing. Or projecting our experienced-user
prejudices onto them. If I had to register my guess, I would say that either way
will make just as much sense to new users.

I think it's time that we start taking backwards compatibility with previous
releases of numpy seriously and not break numpy code without clear, significant
gains in usability.

--
Robert Kern
robert.kern at gmail.com

"I have come to believe that the whole world is an enigma, a harmless enigma
that is made terrible by our own mad attempt to interpret it as though it had
an underlying truth."
  -- Umberto Eco

From aisaac at american.edu Tue Apr 4 12:38:05 2006
From: aisaac at american.edu (Alan G Isaac)
Date: Tue Apr 4 12:38:05 2006
Subject: [Numpy-discussion] Re: Vote: complex64 vs complex128
In-Reply-To:
References: <20060404155003.AF09016220@sc8-sf-spam2.sourceforge.net> <200604041813.k34IDmAP011500@oobleck.astro.cornell.edu>
Message-ID:

On Tue, 04 Apr 2006, Robert Kern apparently wrote:
> I would say that either way will make just as much sense
> to new users.

User's perspective: agreed. Just give me
i. consistency and
ii. an easy way to inspect the object for its meaning.

Cheers,
Alan Isaac

From tim.hochberg at cox.net Tue Apr 4 12:52:04 2006
From: tim.hochberg at cox.net (Tim Hochberg)
Date: Tue Apr 4 12:52:04 2006
Subject: [Numpy-discussion] Re: Vote: complex64 vs complex128
In-Reply-To:
References: <20060404155003.AF09016220@sc8-sf-spam2.sourceforge.net> <200604041813.k34IDmAP011500@oobleck.astro.cornell.edu>
Message-ID: <4432CE1F.3010209@cox.net>

Robert Kern wrote:

>Joe Harrington wrote:
>>When I first heard of Complex128, my first response was, "Cool! I
>>didn't even know there was a Double128!"
>> [...]
>
>I'm pretty sure that when any of us say that such-and-such is going to make the
>most sense to new users, we're just guessing. Or projecting our experienced-user
>prejudices onto them. If I had to register my guess, I would say that either way
>will make just as much sense to new users.

Agreed.

>I think it's time that we start taking backwards compatibility with previous
>releases of numpy seriously and not break numpy code without clear, significant
>gains in usability.

So what does that mean in this case? The current status, nice for
existing users of numpy? Or the old status, nice for people
transitioning to numpy from Numeric? It's hard to know which way these
backwards compatibility arguments cut when they involve reverting a
change from some old behaviour.

I've got an idea.
Rather than go round and round about complex64 versus complex128, let's
just leave things as they are and add a docstring to complex128 and
complex64 explaining the situation. [code...code...]

>>> help(complex128)
class complex128scalar(complexfloatingscalar, complex)
 |  complex128: composed of two 64 bit floats
 |
 |  Method resolution order:
 |      complex128scalar
 |      complexfloatingscalar
 |      inexactscalar
 |      numberscalar
 |      genericscalar
 |      complex
 |      object
 ...

If someone wants to give me some better text for the docstring, I'll go
ahead and commit this change. Heck, if you've got some text for the other
scalar objects (within reason) I'll be happy to add that at the same time.

Regards,

-tim

From robert.kern at gmail.com Tue Apr 4 13:06:01 2006
From: robert.kern at gmail.com (Robert Kern)
Date: Tue Apr 4 13:06:01 2006
Subject: [Numpy-discussion] Re: Vote: complex64 vs complex128
In-Reply-To: <4432CE1F.3010209@cox.net>
References: <20060404155003.AF09016220@sc8-sf-spam2.sourceforge.net> <200604041813.k34IDmAP011500@oobleck.astro.cornell.edu> <4432CE1F.3010209@cox.net>
Message-ID:

Tim Hochberg wrote:
> Robert Kern wrote:

>> I think it's time that we start taking backwards compatibility with
>> previous releases of numpy seriously and not break numpy code without
>> clear, significant gains in usability.
>>
> So what does that mean in this case? The current status, nice for
> existing users of numpy? Or the old status, nice for people
> transitioning to numpy from Numeric? It's hard to know which way these
> backwards compatibility arguments cut when they involve reverting a
> change from some old behaviour.

I mean numpy. Neither complex64 nor complex128 are backwards-compatible
with Numeric. Complex32 and Complex64 already exist and are hopefully
isolated as compatibility aliases for typecodes. By
backwards-compatibility, I refer to code, not habits.

> I've got an idea. Rather than go round and round about complex64 versus
> complex128, let's just leave things as they are and add a docstring to
> complex128 and complex64 explaining the situation. [...]
>
> If someone wants to give me some better text for the docstring, I'll go
> ahead and commit this change. Heck, if you've got some text for the other
> scalar objects (within reason) I'll be happy to add that at the same time.

+1

--
Robert Kern
robert.kern at gmail.com

"I have come to believe that the whole world is an enigma, a harmless enigma
that is made terrible by our own mad attempt to interpret it as though it had
an underlying truth."
  -- Umberto Eco

From oliphant at ee.byu.edu Tue Apr 4 13:42:38 2006
From: oliphant at ee.byu.edu (Travis Oliphant)
Date: Tue Apr 4 13:42:38 2006
Subject: [Numpy-discussion] Re: Vote: complex64 vs complex128
In-Reply-To:
References: <20060404155003.AF09016220@sc8-sf-spam2.sourceforge.net> <200604041813.k34IDmAP011500@oobleck.astro.cornell.edu>
Message-ID: <4432D9C3.3040109@ee.byu.edu>

Robert Kern wrote:

>Joe Harrington wrote:
>>When I first heard of Complex128, my first response was, "Cool! I
>>didn't even know there was a Double128!"
>> [...]
>I'm pretty sure that when any of us say that such-and-such is going to make the
>most sense to new users, we're just guessing. Or projecting our experienced-user
>prejudices onto them. If I had to register my guess, I would say that either way
>will make just as much sense to new users.

Totally agree. I don't see the argument that Complex64 is a "precision"
description. To a new user it could go either way depending on their
previous experience. I think most new users won't even use the bit width
names but will instead use 'complex' and be done with it...

>I think it's time that we start taking backwards compatibility with previous
>releases of numpy seriously and not break numpy code without clear, significant
>gains in usability.

+1

-Travis

From perry at stsci.edu Tue Apr 4 14:09:02 2006
From: perry at stsci.edu (Perry Greenfield)
Date: Tue Apr 4 14:09:02 2006
Subject: [Numpy-discussion] Re: Vote: complex64 vs complex128
In-Reply-To: <4432D9C3.3040109@ee.byu.edu>
References: <20060404155003.AF09016220@sc8-sf-spam2.sourceforge.net> <200604041813.k34IDmAP011500@oobleck.astro.cornell.edu> <4432D9C3.3040109@ee.byu.edu>
Message-ID: <6e9f9be0cfb968840dc4314d65c9e655@stsci.edu>

On Apr 4, 2006, at 4:40 PM, Travis Oliphant wrote:
>
> Totally agree. I don't see the argument that Complex64 is a
> "precision" description. To a new user it could go either way
> depending on their previous experience. I think most new users won't
> even use the bit width names but will instead use 'complex' and be
> done with it...
>
>> I think it's time that we start taking backwards compatibility with
>> previous releases of numpy seriously and not break numpy code without
>> clear, significant gains in usability.
>>
> +1
>

The issue that just won't go away. We did it the current way for
numarray initially and were persuaded to switch to be compatible with
Numeric. I agree that it isn't obvious what the number means for
complex. That ambiguity will always be there. Unless we did a real user
test to find out, we wouldn't know for sure what future users would most
likely expect. But in the end, pick one and let's not change it again
(or even talk about changing it). It doesn't matter that much to me
which it is.

Perry

From oliphant at ee.byu.edu Tue Apr 4 14:18:59 2006
From: oliphant at ee.byu.edu (Travis Oliphant)
Date: Tue Apr 4 14:18:59 2006
Subject: [Numpy-discussion] NumPy documentation
Message-ID: <4432E27E.6030906@ee.byu.edu>

I received a rather hurtful email today that was very discouraging to me
personally. Basically, I was called "lame" and a "wolf" in sheep's
clothing because I'm charging for documentation. Fortunately it's the
first email of that nature I've received. Others have disagreed with my
choice to charge for the documentation but at least they've not resorted
to personal attacks on me and my motivations. Please know that such
emails do have an impact. While I try to build a tough skin, such
unappreciative statements reduce my enthusiasm for working on NumPy
significantly.

My purpose, however, is not to rant about the misguided words of one
person. He brought up a point that I want to clarify.
He asked if I "would sue" if somebody else wrote documentation for NumPy.
I want to be perfectly clear that this is a ridiculous statement that
barely deserves a response. Of course I wouldn't. First of all, it would
be extreme circumstances indeed for me to resort to that course of action
(basically a company would have to copy my book and start distributing it
on a large scale, belligerently). Second of all, I would love to see
*more* documentation for NumPy.

If there are other (less vocal) people out there who are not using NumPy
because of my book, then I certainly feel sorry about that. Please dig in
and create the documentation you so urgently want to be free. I will not
stand in your way, but may even help.

But please consider that time is money. Most people are better off
spending their time on something else and just cooperating with others by
paying for the book. But, I'm not going to dislike or have any kind of
ill feelings with anyone who decides to spend their time on
"documentation." In fact, I'll appreciate it just like everyone else. I
love the growth of the SciPy Wiki. There are some great recipes and
examples there. This is fantastic. I'm 100% behind this kind of work.
Rather than write some kind of "replacement" documentation, contribute
docstrings to the code and recipes to the Wiki. Then, those that can't or
won't buy the book will still have plenty of resources to use to learn
NumPy.

I'm completely behind all forms of "free" information on NumPy / SciPy
and related tools. The only reason I have to charge for the documentation
is that I just don't have the resources to simply donate *all* of my
time. I want to thank all of you who have already purchased the
documentation. It has been extremely helpful to me personally and
professionally. Without you, my time to spend on NumPy would have been
significantly reduced. Thank you very much.

Best wishes,

-Travis

From Chris.Barker at noaa.gov Tue Apr 4 14:48:01 2006
From: Chris.Barker at noaa.gov (Christopher Barker)
Date: Tue Apr 4 14:48:01 2006
Subject: [Numpy-discussion] NumPy documentation
In-Reply-To: <4432E27E.6030906@ee.byu.edu>
References: <4432E27E.6030906@ee.byu.edu>
Message-ID: <4432E973.8070601@noaa.gov>

Travis,

I'm very sorry to hear that you got such a response. It was completely
unwarranted. I am often quite surprised at the vitriol that sometimes
results from people that are not getting what they want from an open
source project. Indeed, the comment about "suing" makes it completely
clear that this individual completely misunderstood your intentions (and
the reality of copyright law: you would only have a course of action if
your book was copied!).

When you first announced the book, I know there was a fair bit of
discussion about it, and you made it quite clear how reasonable your
position is.
Personally, I think financing open source projects by writing and selling
books about them is an excellent approach: it works well for everyone. My
freedom is not restricted, you get some compensation for your time.
Ideally, I'd like to see comprehensive reference documentation distributed
for free, while more in-depth explanatory docs could be either free or
not. One of these days I'll put my keyboard where my mouth is and actually
write a doc string or two! In the meantime, I am absolutely thrilled that
you've put as much effort into numpy as you have. You are doing a fabulous
job, and I hope the appreciation of all is clear to you.

thank you,
-Chris

PS: If we get a reasonable budget next year, I'll be sure to buy a few
copies of your book.

--
Christopher Barker, Ph.D.
Oceanographer

NOAA/OR&R/HAZMAT          (206) 526-6959  voice
7600 Sand Point Way NE    (206) 526-6329  fax
Seattle, WA 98115         (206) 526-6317  main reception

Chris.Barker at noaa.gov

From tim.hochberg at cox.net Tue Apr 4 15:37:06 2006
From: tim.hochberg at cox.net (Tim Hochberg)
Date: Tue Apr 4 15:37:06 2006
Subject: [Numpy-discussion] NumPy documentation
In-Reply-To: <4432E973.8070601@noaa.gov>
References: <4432E27E.6030906@ee.byu.edu> <4432E973.8070601@noaa.gov>
Message-ID: <4432F4DD.6060000@cox.net>

Travis,

I'm sorry to hear that you received such an unwarranted attack. Although,
sadly, not terribly surprised; there are plenty of unpleasant fanatics of
various stripes that roam the bitstreams. Let me add a hearty "me too" to
everything that Chris just said.

This finally motivated me to go out and buy your book, something that's
been on my list of things that I should do "one of these days now". I'm
hoping that makes this mystery person unhappy.

Regards,
-tim

From svetosch at gmx.net Tue Apr 4 16:03:02 2006
From: svetosch at gmx.net (Sven Schreiber)
Date: Tue Apr 4 16:03:02 2006
Subject: [Numpy-discussion] kron with matrices
Message-ID: <4432FADE.3070705@gmx.net>

Hi,

First of all, thanks for including kron in numpy, it's very useful. Now I
have just built numpy from svn for the first time in order to spot
matrix-related bugs before a new release as promised. That worked well,
thanks to the great wiki instructions. The old bugs (in linalg) are gone,
but I wonder whether the following behavior is another one:

>>> import numpy as n
>>> n.kron(n.asmatrix(n.ones((1,2))), n.asmatrix(n.zeros((2,2))))
array([[0, 0, 0, 0],
       [0, 0, 0, 0]])

I would prefer if kron returned a matrix at least if both inputs are
matrices, as in the given example.

Thanks,
Sven

From jdhunter at ace.bsd.uchicago.edu Tue Apr 4 16:10:13 2006
From: jdhunter at ace.bsd.uchicago.edu (John Hunter)
Date: Tue Apr 4 16:10:13 2006
Subject: [Numpy-discussion] NumPy documentation
In-Reply-To: <4432E27E.6030906@ee.byu.edu> (Travis Oliphant's message of "Tue, 04 Apr 2006 15:17:50 -0600")
References: <4432E27E.6030906@ee.byu.edu>
Message-ID: <87wte5ndot.fsf@peds-pc311.bsd.uchicago.edu>

>>>>> "Travis" == Travis Oliphant writes:

    Travis> I received a rather hurtful email today that was very
    Travis> discouraging to me personally. Basically, I was called
    Travis> "lame" and a "wolf" in sheep's clothing because I'm
    Travis> charging for documentation. Fortunately it's the first

Wow, harsh. I would just like to (for a second time) voice my support for
your charging for documentation, and throw out a couple of points for
people to consider who oppose it.
I think a low-ball estimate of the dollar value of the amount of time
Travis has donated to scientific python is about $500,000 (5 years,
full-time, $100k/yr -- this is low ball because he has probably donated
more time and he is certainly worth more than that annually!). If he gets
the $300,000 or so he hopes to raise from this book, he still has a net
contribution of more than $200k. Those of you who are critical: have you
put in that much of your time or money?

Secondly, I know personally that Travis has resisted several offers to
lure him from academia into industry. Academia, by its nature, affords
more flexibility to develop open source software driven by issues of
breadth and quality rather than deadlines and customer demands. By
charging for this book, it makes it more feasible for him to continue to
work in academia and support these projects. Travis and I share some
similarities: we both have a wife and kids, with low-paying academic
careers, and lead active python projects. Only Travis leads two projects
to my one and he has five kids to my three. I recently left academia for
a job in industry because of financial considerations, and while my firm
is supportive of my matplotlib development (we use it and python
extensively in house), it does leave me less time for development.

So to those of you grumbling to Travis directly or behind the scenes,
think about what he is giving and back off. And start donating some of
your own time instead of encouraging Travis to donate more of his.

JDH

From aisaac at american.edu Tue Apr 4 16:27:10 2006
From: aisaac at american.edu (Alan G Isaac)
Date: Tue Apr 4 16:27:10 2006
Subject: [Numpy-discussion] NumPy documentation
In-Reply-To: <4432E27E.6030906@ee.byu.edu>
References: <4432E27E.6030906@ee.byu.edu>
Message-ID:

On Tue, 04 Apr 2006, Travis Oliphant apparently wrote:
> I'm not going to dislike or have any kind of ill feelings
> with anyone who decides to spend their time on
> "documentation." In fact, I'll appreciate it just like
> everyone else.

Of course you were extremely clear about this from the beginning. Thank
you for numpy!!!

Alan Isaac
(grateful user of numpy)

PS Your book is *very* helpful.

From zpincus at stanford.edu Tue Apr 4 16:48:06 2006
From: zpincus at stanford.edu (Zachary Pincus)
Date: Tue Apr 4 16:48:06 2006
Subject: [Numpy-discussion] NumPy documentation
In-Reply-To: <4432F4DD.6060000@cox.net>
References: <4432E27E.6030906@ee.byu.edu> <4432E973.8070601@noaa.gov> <4432F4DD.6060000@cox.net>
Message-ID: <1A94CA82-2EB5-4145-9EA9-453DE60AE684@stanford.edu>

Hi folks -

I must admit that when I first saw the trelgol web page, I was briefly a
bit confused and put off about the prospect of moving to numpy from
Numeric. Now, it didn't take long for me to come to my senses and realize
(a) that no formerly-free documentation had been revoked, (b) that there
was enough documentation about the C API in the numpy distribution to get
me started, (c) that there was a lot of support available on the email
list, and most importantly (d) that Travis and many others are extremely
generous with their time, both in answering emails on the numpy list and
in making numpy better.

I now of course wholeheartedly agree with everything everyone has said in
this thread, and with the idea behind selling the documentation. In fact,
I feel a bit ashamed that I ever felt otherwise, even though it was just
for a few minutes. However, were I a more grumpy (or stupid) type, I
might not have come to my senses as rapidly, or ever.
That would have been my loss, of course. But, perhaps a few little things
could help newcomers better understand the rationale behind the ebook.

Basically, everyone on this list knows (and supports, it seems!) the
reasoning behind selling the docs, because it was discussed on the list.
However, it's not hard to imagine someone new to numpy, or maybe a
convert from Numeric (who was used to the large, free manual) scratching
their head a little when confronted with http://www.tramy.us/ . (It's
less reasonable to imagine someone then going on to personally attack
Travis in email -- that's absolutely unconscionable.)

I would suggest that the link from the scipy page be changed to point to
http://www.tramy.us/guidetoscipy.html , which is a little more clearly
about the ebook, and a little less about the publishing method. It might
not hurt to expand a bit on that page and mention the basic reasoning
behind selling the docs, and even (if you see fit, Travis) to maybe
include links to the other numpy documentation resources (list archive
and sign up page, old and out-of-date Numeric reference [with maybe some
mention of why buying the book would be better, but that the old ref at
least gives the right high-level picture to get a newcomer started using
numpy], and the numpy wiki pages). Any of this would certainly put a
newcomer in a more charitable state of mind, and forestall any lingering
concerns about greed or any such foolishness.

Since free advice is worth exactly what you paid for it, feel free to
ignore any or all of this. I just wanted to mention a few easy things
that I think might help newcomers understand and feel good about the
ebook (the first step toward buying it!).

Zach

On Apr 4, 2006, at 5:36 PM, Tim Hochberg wrote:
>
> Travis,
>
> I'm sorry to hear that you received such an unwarranted attack. [...]

From zpincus at stanford.edu Tue Apr 4 17:19:18 2006
From: zpincus at stanford.edu (Zachary Pincus)
Date: Tue Apr 4 17:19:18 2006
Subject: [Numpy-discussion] array constructor from generators?
Message-ID:

Hi folks,

Sorry if this has already been discussed, but do you all think it a good
idea to extend the array constructor so that it can accept generators
instead of lists?

I often construct arrays from list comprehensions on generators, e.g.
to read a tab-delimited file in:
  numpy.array([map(float, line.split()) for line in file])
or making an array of pairs of numbers:
  numpy.array([f for f in unique_combinations(input, 2)])

If the array constructor accepted generators (and turned them into lists
behind the scenes, or even evaluated them lazily while filling in the
memory buffer, not sure what would be more efficient), the above could be
written somewhat more cleanly:
  numpy.array(map(float, line.split()) for line in file)
(using a generator expression) and
  numpy.array(unique_combinations(input, 2))
the latter is especially a win.

Moreover, it's becoming more standard for any python thing that can
accept a list to also accept a generator.

The downside is that currently, passing array() an object makes a 0-d
object array with that object. If this were changed, then passing array()
an iterator object would be handled differently than passing array any
other object. This might possibly be a fatal flaw in this idea.

I'd be happy to look into implementing this functionality if people think
it is a good idea, and could give me some tips as to the best way to
implement it.

Zach

From wbaxter at gmail.com Tue Apr 4 17:24:38 2006
From: wbaxter at gmail.com (Bill Baxter)
Date: Tue Apr 4 17:24:38 2006
Subject: [Numpy-discussion] NumPy documentation
In-Reply-To: <1A94CA82-2EB5-4145-9EA9-453DE60AE684@stanford.edu>
References: <4432E27E.6030906@ee.byu.edu> <4432E973.8070601@noaa.gov> <4432F4DD.6060000@cox.net> <1A94CA82-2EB5-4145-9EA9-453DE60AE684@stanford.edu>
Message-ID:

First of all, it sounds like the individual who mailed Travis about being
a "wolf in sheep's clothing" is suffering from the delusion that you can
actually get rich by selling technical documentation at 40 bucks a pop.

Travis does have a web page up somewhere explaining all his rationale --
I ran across it somewhere. I remember when I saw it I was thinking
"that's bizarre -- why on earth would you have to make a whole web page
to justify selling something you yourself created?" I mean, like it or
not, Travis wrote it so he can do whatever he wants with it. That's just
common sense. Something apparently some lack.

It reminds me of the story my father told me when I was like 8 years old
about a man who shows up one day and gives a little boy a dollar bill.
The boy is ecstatic, and thanks the man profusely. Then the next day the
same thing, another dollar. The boy can't believe his luck. The whole
week the guy comes, then it becomes a month, and then a year. Every day
another dollar. Eventually it becomes such a routine that the boy doesn't
even bother to thank the guy. Then one day the man doesn't show up. The
little boy is furious. He was counting on that dollar, he already knew
how he was going to spend every penny. The person who emailed Travis is
just like that little boy, furious for not getting the dollar that wasn't
his to begin with, rather than being thankful for the $365 he was given
out of the blue for no particular reason.

--bb

From tim.hochberg at cox.net Tue Apr 4 17:41:15 2006
From: tim.hochberg at cox.net (Tim Hochberg)
Date: Tue Apr 4 17:41:15 2006
Subject: [Numpy-discussion] array constructor from generators?
In-Reply-To:
References:
Message-ID: <44331200.2020604@cox.net>

Zachary Pincus wrote:

> Hi folks,
>
> Sorry if this has already been discussed, but do you all think it a
> good idea to extend the array constructor so that it can accept
> generators instead of lists?
> I often construct arrays from list comprehensions on generators, e.g.
> to read a tab-delimited file in:
>   numpy.array([map(float, line.split()) for line in file])
> or making an array of pairs of numbers:
>   numpy.array([f for f in unique_combinations(input, 2)])
>
> If the array constructor accepted generators (and turned them into
> lists behind the scenes, or even evaluated them lazily while filling
> in the memory buffer, not sure what would be more efficient), the
> above could be written somewhat more cleanly:
>   numpy.array(map(float, line.split()) for line in file)
> (using a generator expression) and
>   numpy.array(unique_combinations(input, 2))
> the latter is especially a win.
>
> Moreover, it's becoming more standard for any python thing that can
> accept a list to also accept a generator.
>
> The downside is that currently, passing array() an object makes a 0-d
> object array with that object. If this were changed, then passing
> array() an iterator object would be handled differently than passing
> array any other object. This might possibly be a fatal flaw in this
> idea.

You pretty much can't count on anything when trying to implicitly create
object arrays anyway. There's already buckets of special cases to make
the other array types user friendly. In other words I don't think we
should care. You do have to be careful to special case iterators after
all the other special case machinery, so that lists and whatnot that are
treated efficiently don't get slowed down.

> I'd be happy to look into implementing this functionality if people
> think it is a good idea, and could give me some tips as to the best
> way to implement it.

Hi Zach,

I brought this up last week and Travis was OK with it. I have it on my
todo list, but if you are in a hurry you're welcome to do it instead.

If you do look at it, consider looking into the __length_hint__ parameter
that's slated to go into Python 2.5. When this is present, it's
potentially a big win, since you can preallocate the array and fill it
directly from the iterator. Without this, you probably can't do much
better than just building a list from the iterator and then an array from
the list. What would work well would be to build a list, then steal its
memory. I'm not sure if that's feasible without leaking a reference to
the list though.

Also, with iterators, specifying dtype will make a huge difference. If an
object has __length_hint__ and you specify dtype, then you can
preallocate the array as I suggested above. However, if dtype is not
specified, you still need to build the list completely, determine what
type it is, allocate the array memory and then copy the values into it.
Much less efficient!

Regards,

-tim

> Zach
> [...]

From robert.kern at gmail.com Tue Apr 4 17:50:05 2006
From: robert.kern at gmail.com (Robert Kern)
Date: Tue Apr 4 17:50:05 2006
Subject: [Numpy-discussion] Re: array constructor from generators?
In-Reply-To: References: Message-ID: Zachary Pincus wrote: > The downside is that currently, passing array() an object makes a 0-d > object array with that object. If this were changed, then passing > array() an iterator object would be handled differently than passing > array any other object. This might possibly be a fatal flaw in this idea. I don't think so. We can pass appropriate lists to array(), and it handles them fine. Iterator objects are just another kind of object that gets special treatment. The tricky bit is recognizing them. > I'd be happy to look in to implementing this functionality if people > think it is a good idea, and could give me some tips as to the best way > to implement it. I think a prerequisite for turning an arbitrary iterable into a numpy array is to iterate over it and store all of the objects in a temporary buffer that expands with a sensible strategy. I can't think of a better buffer object than regular Python lists. I think you can recognize when you have to use the temporary list strategy by seeing if the input has .__iter__() but not .__len__(). I'd have to refresh myself on the details of PyArray_New to be more sure, though. As Tim suggests, 2.5's __length_hint__ will also help. Another note of caution: You are going to have to deal with iterators of iterators of iterators of.... I'm not sure if that actually overly complicates matters; I haven't looked at PyArray_New for some time. Enjoy! -- Robert Kern robert.kern at gmail.com "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From ted.horst at earthlink.net Tue Apr 4 21:33:04 2006 From: ted.horst at earthlink.net (Ted Horst) Date: Tue Apr 4 21:33:04 2006 Subject: [Numpy-discussion] NumPy documentation In-Reply-To: <4432E27E.6030906@ee.byu.edu> References: <4432E27E.6030906@ee.byu.edu> Message-ID: <9E52488D-6736-4F10-A045-2D5B39CBD3F5@earthlink.net> I'll just add my voice to the people speaking up to support Travis's efforts. I buy lots of books, and most of the time I don't think too much about who I am supporting when I buy them, but I probably would have bought this book even if I didn't need that level of documentation just to help support what I see as very important work. I don't see how writing about an open source project and using the proceeds to further that project could be seen as anything other than a positive. I also just want to say how impressed I am with what Travis has accomplished with this project. From the organizational effort, patience, and persistence of bringing the various communities together to the quality and quantity of the ideas, code, and discussions, his contributions have been inspiring. Ted Horst From eric at enthought.com Tue Apr 4 21:59:10 2006 From: eric at enthought.com (eric jones) Date: Tue Apr 4 21:59:10 2006 Subject: [Numpy-discussion] NumPy documentation In-Reply-To: <4432E27E.6030906@ee.byu.edu> References: <4432E27E.6030906@ee.byu.edu> Message-ID: <44334E74.3000406@enthought.com> Travis Oliphant wrote: > > I received a rather hurtful email today that was very discouraging to > me personally. Basically, I was called "lame" and a "wolf" in sheep's > clothing because I'm charging for documentation. Hmmmm.... Chickens getting eaten by foxes. Farmer builds wire coop. Coop destroyed by foxes. More chickens eaten. Wolf builds wooden coop for free. Also stands guard but for a fee. No more chickens eaten. 
Most chickens gladly pay. A few grumble about extortion! That's fine. Let
them take the guard. Foxes aren't so afraid of Chickens. This chicken
will take his chances with this wolf. Turns out it's just a lame chicken
in wolves' clothing. Smart chicken, he is.

Dumb letter. Dumb story. Let's see here, you're a chicken. Check. Travis
is smart wolf-chicken... yeah that works. Numpy is the wooden chicken
coop. errr... Guard duty is documentation. hmmm... foxes, not sure...
Guess I should keep my day job.

Slightly more seriously... There's a chicken's foot full of people on the
planet that could have done what Travis has pulled off -- I've actually
thought about this a little. Maybe Jim Hugunin could have done it given
similar time and motivation. After that, I come up a little short of
candidates -- so maybe it's just a pig's foot full. I consider us lucky
that one of the few people able to fuse Numeric/numarray bailed us out
and did it.

Documentation is another matter as far as scarcity of qualified authors.
I would trust any number of yayhoos to create at least passable
documentation for Travis' creation. Heck, David Ascher managed to write
the Numeric documentation. That said, writing docs is work, hard to do
well, and not nearly as much fun as writing actual code (for the people
on this list anyway). That significantly lowers the probability of it
getting done. In fact, I believe LLNL funded the first documentation
effort to help ensure that it happened (though I'm not positive about
that). And, think of the creek we'd be up if he chose to keep the library
and give away the docs.

I'm all for someone writing free documentation. It'd be great to have.
And, if it were as good as Travis', I might even use it. Still, it would
probably be better for the world if you spent your time on other things
that don't already have a solution (like documenting SciPy...). Once that
and all similar problems are solved, loop back around and do the NumPy
docs.

One other comment. I've used another amazing library called agg
(www.antigrain.com) extensively for rendering in kiva/chaco. I view Maxim
(the author of Agg) and graphics rendering in a similar light as Travis
and Numpy -- there are only a handful of people that could have written
agg. For that I am hugely grateful. On the downside, agg is very complex
and has very little documentation. Still a number of people use it
without complaint. Based on the evidence, if Maxim wrote documentation
and charged for it, the number of complaints would actually increase. It
is just silly. I would pay his price and sing his praises for the days of
my life that he gave back to me.

eric

ps.
# Based on a definitive monte carlo simulation, one of every hundred
# chickens will complain. Don't believe me. Try it.
dist = stats.uniform(0.0, 1.0)
for chicken in chickens:
    if dist.rvs()[0] < 0.01:
        print "extortion"

From pfdubois at gmail.com Tue Apr 4 22:01:02 2006
From: pfdubois at gmail.com (Paul Dubois)
Date: Tue Apr 4 22:01:02 2006
Subject: [Numpy-discussion] NumPy documentation
In-Reply-To: <9E52488D-6736-4F10-A045-2D5B39CBD3F5@earthlink.net>
References: <4432E27E.6030906@ee.byu.edu> <9E52488D-6736-4F10-A045-2D5B39CBD3F5@earthlink.net>
Message-ID:

Amen.

On 04 Apr 2006 21:33:12 -0700, Ted Horst wrote:
>
> I'll just add my voice to the people speaking up to support Travis's
> efforts. [...]
From jdhunter at ace.bsd.uchicago.edu Tue Apr 4 22:54:01 2006
From: jdhunter at ace.bsd.uchicago.edu (John Hunter)
Date: Tue Apr 4 22:54:01 2006
Subject: [Numpy-discussion] NumPy documentation
In-Reply-To: <44334E74.3000406@enthought.com> (eric jones's message of "Tue, 04 Apr 2006 23:58:28 -0500")
References: <4432E27E.6030906@ee.byu.edu> <44334E74.3000406@enthought.com>
Message-ID: <873bgsa7vp.fsf@peds-pc311.bsd.uchicago.edu>

>>>>> "eric" == eric jones writes:

    eric> Let's see here, you're a chicken. Check. Travis is smart
    eric> wolf-chicken... yeah that works. Numpy is the wooden chicken
    eric> coop. errr... Guard duty is documentation. hmmm... foxes,
    eric> not sure...

And I thought you didn't drink anything stronger than Dr Pepper :-)

JDH

From sransom at nrao.edu Wed Apr 5 00:04:03 2006
From: sransom at nrao.edu (Scott Ransom)
Date: Wed Apr 5 00:04:03 2006
Subject: [Numpy-discussion] NumPy documentation
In-Reply-To: <9E52488D-6736-4F10-A045-2D5B39CBD3F5@earthlink.net>
References: <4432E27E.6030906@ee.byu.edu> <9E52488D-6736-4F10-A045-2D5B39CBD3F5@earthlink.net>
Message-ID: <20060405070150.GB8682@ssh.cv.nrao.edu>

As someone who has been actively using Numeric/Numarray/Numpy for about 7
years now, I heartily agree.

Thanks, Travis.

Scott

On Tue, Apr 04, 2006 at 11:32:42PM -0500, Ted Horst wrote:
>
> I'll just add my voice to the people speaking up to support Travis's
> efforts. [...]
--
Scott M. Ransom            Address:  NRAO
Phone:  (434) 296-0320               520 Edgemont Rd.
email:  sransom at nrao.edu         Charlottesville, VA 22903 USA
GPG Fingerprint: 06A9 9553 78BE 16DB 407B  FFCA 9BFA B6FF FFD3 2989

From charlesr.harris at gmail.com Wed Apr 5 00:27:02 2006
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Wed Apr 5 00:27:02 2006
Subject: [Numpy-discussion] NumPy documentation
In-Reply-To: <4432E27E.6030906@ee.byu.edu>
References: <4432E27E.6030906@ee.byu.edu>
Message-ID:

Travis,

On 4/4/06, Travis Oliphant wrote:
>
> I received a rather hurtful email today that was very discouraging to me
> personally. Basically, I was called "lame" and a "wolf" in sheep's
> clothing because I'm charging for documentation.

Geez, what's with that? There are any number of "real" books out on
python, I don't hear folks bitching. I think it's wonderful that we have
such a good reference. I mean, look at numarray 8) I spent the money for
your book and it didn't hurt a bit and was well worth the cost. Anyone
who has tried to write extensive documentation on a big project knows how
much work it takes, it isn't easy. Thanks for taking the time and sweat
to do so.

Chuck

From arnd.baecker at web.de Wed Apr 5 01:51:08 2006
From: arnd.baecker at web.de (Arnd Baecker)
Date: Wed Apr 5 01:51:08 2006
Subject: [Numpy-discussion] Re: Vote: complex64 vs complex128
In-Reply-To:
References: <20060404155003.AF09016220@sc8-sf-spam2.sourceforge.net> <200604041813.k34IDmAP011500@oobleck.astro.cornell.edu> <4432CE1F.3010209@cox.net>
Message-ID:

On Tue, 4 Apr 2006, Robert Kern wrote:

> Tim Hochberg wrote:
[...]
> > >>> help(complex128)
> > class complex128scalar(complexfloatingscalar, complex)
> >  | complex128: composed of two 64 bit floats
> >  |
> >  | Method resolution order:
> >  |     complex128scalar
> >  |     complexfloatingscalar
> >  |     inexactscalar
> >  |     numberscalar
> >  |     genericscalar
> >  |     complex
> >  |     object
> > ...

I am puzzled why this does not show up with IPython:

In [1]:import numpy
In [2]:numpy.complex128?
Type:           type
Base Class:
String Form:
Namespace:      Interactive
Docstring:

whereas

In [3]:help(numpy.complex128)

shows the above! So this might be more of an IPython question (I am
running IPython 0.7.2.svn), but maybe numpy does some magic tricks to
hide the docs from IPython (surely not on purpose ...)? It seems that
numpy.complex128.__doc__ is None.

Best, Arnd

From meesters at uni-mainz.de Wed Apr 5 02:03:06 2006
From: meesters at uni-mainz.de (Christian Meesters)
Date: Wed Apr 5 02:03:06 2006
Subject: [Numpy-discussion] NumPy documentation
In-Reply-To:
References: <4432E27E.6030906@ee.byu.edu>
Message-ID: <200604051048.52766.meesters@uni-mainz.de>

I'm glad, Travis, that you got such supportive replies - but I didn't
expect anything else. Just let me give two more cents:

a) I am a grateful user of Numpy/Scipy, too.
b) I am among those who fully understand and support your decisions about
selling the book.
c) I didn't buy the book - yet. (Simply forgotten after a minor PayPal
problem I had.)
d) ad c): This will change soon.
And e): Thank you for all your work put into Numpy/Scipy!

Christian

From amcmorl at gmail.com Wed Apr 5 02:30:01 2006
From: amcmorl at gmail.com (amcmorl)
Date: Wed Apr 5 02:30:01 2006
Subject: [Numpy-discussion] Newbie indexing question and print order
Message-ID: <44338DF4.7050603@gmail.com>

Hi all,

I'm having a bit of trouble getting my head around numpy's indexing
capabilities. A quick summary of the problem is that I want to
lookup/index in nD from a second array of rank n+1, such that the last
(or first, I guess) dimension contains the lookup co-ordinates for the
value to extract from the first array. Here's a 2D (3,3) example:

In [12]:print ar
[[ 0.15  0.75  0.2 ]
 [ 0.82  0.5   0.77]
 [ 0.21  0.91  0.59]]

In [24]:print inds
[[[1 1]
  [1 1]
  [2 1]]

 [[2 2]
  [0 0]
  [1 0]]

 [[1 1]
  [0 0]
  [2 1]]]

then somehow return the array (barring me making any row/column errors):

In [26]: c = ar.somefancyindexingroutinehere(inds)

In [26]:print c
[[ 0.5   0.5   0.91]
 [ 0.59  0.15  0.82]
 [ 0.5   0.15  0.91]]

i.e. c[x,y] = a[ inds[x,y,0], inds[x,y,1] ]

Any suggestions? It looks like it should be relatively simple using 'put'
or 'take' or 'fetch' or 'sit' or something like that, but I'm not getting
it.

While I'm here, can someone help me understand the rationale behind
'print' printing row, column (i.e. a[0,1] = 0.75 in the above example)
rather than x, y (= column, row, in which case 0.75 would be in the first
column and second row), which seems to me more intuitive.

I'm really enjoying getting into numpy - I can see it'll be
simpler/faster coding than my previous environments, despite me not
knowing my way at the moment, and that python has better opportunities
for extensibility. So, many thanks for your great work.

--
Angus McMorland
email a.mcmorland at auckland.ac.nz
mobile +64-21-155-4906

PhD Student, Neurophysiology / Multiphoton & Confocal Imaging
Physiology, University of Auckland
phone +64-9-3737-599 x89707

Armourer, Auckland University Fencing
Secretary, Fencing North Inc.

From faltet at carabos.com Wed Apr 5 02:56:06 2006
From: faltet at carabos.com (Francesc Altet)
Date: Wed Apr 5 02:56:06 2006
Subject: [Numpy-discussion] NumPy documentation
In-Reply-To: <4432E27E.6030906@ee.byu.edu>
References: <4432E27E.6030906@ee.byu.edu>
Message-ID: <1144230907.7563.14.camel@localhost.localdomain>

Travis,

First of all, I think that you should be happy that you received *only*
one mail of this kind in the year and some months that you have been at
the NumPy project. As somebody already noted: "take a large enough
community, and you will always find a person (or several) that thinks
that the wisest developer and the best professional is evil". We can
discuss at length why this should happen, but the answer is easy: it's
human nature.

Let me also THANK YOU not only for your impressive dedication to the
NumPy project but also for your openness to other ideas and for being
the best advocate of the "I prefer to code, rather than talk" mantra.
Let's do more of this and let others talk. I'm positive that 99% of the
community is with you, and that's the only consideration that matters.

Best,
Francesc

On Tue, 04 Apr 2006 at 15:17 -0600, Travis Oliphant wrote:
> I received a rather hurtful email today that was very discouraging to me
> personally. [...]
--
>0,0<   Francesc Altet     http://www.carabos.com/
V V     Cárabos Coop. V.   Enjoy Data
 "-"
From pau.gargallo at gmail.com Wed Apr 5 03:10:01 2006
From: pau.gargallo at gmail.com (Pau Gargallo)
Date: Wed Apr 5 03:10:01 2006
Subject: [Numpy-discussion] Newbie indexing question and print order
In-Reply-To: <44338DF4.7050603@gmail.com>
References: <44338DF4.7050603@gmail.com>
Message-ID: <6ef8f3380604050309t1ed4c79bv395ed1a9fb45ce9d@mail.gmail.com>

hi,

i had the same problem and i defined a function with a similar syntax to
interp2 which i call take2 to solve it:

from numpy import *

def take2( a, x, y ):
    # flat index into C-ordered (row-major) a is row*ncols + col
    return take( ravel(a), x*a.shape[1] + y )

a = array( [[ 0.15, 0.75, 0.2 ],
            [ 0.82, 0.5,  0.77],
            [ 0.21, 0.91, 0.59]] )

xy = array([ [[1, 1], [1, 1], [2, 1]],
             [[2, 2], [0, 0], [1, 0]],
             [[1, 1], [0, 0], [2, 1]]] )

print take2( a, xy[...,0], xy[...,1] )

i hope this helps you.
pau

On 4/5/06, amcmorl wrote:
> Hi all,
>
> I'm having a bit of trouble getting my head around numpy's indexing
> capabilities. [...]
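For the lookup question above, numpy's fancy indexing should also work
directly, with no flat-index arithmetic at all: passing one integer index
array per axis selects elementwise, so the coordinate planes of inds can
be used as the indices themselves. A minimal sketch, assuming the a and
inds arrays exactly as given in Angus's message:

from numpy import array

a = array([[ 0.15, 0.75, 0.2 ],
           [ 0.82, 0.5,  0.77],
           [ 0.21, 0.91, 0.59]])

inds = array([ [[1, 1], [1, 1], [2, 1]],
               [[2, 2], [0, 0], [1, 0]],
               [[1, 1], [0, 0], [2, 1]]])

# index axis 0 with the first coordinate plane and axis 1 with the
# second, so that c[x,y] == a[inds[x,y,0], inds[x,y,1]]
c = a[inds[..., 0], inds[..., 1]]

print c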
From tim.hochberg at cox.net Wed Apr 5 05:30:14 2006
From: tim.hochberg at cox.net (Tim Hochberg)
Date: Wed Apr 5 05:30:14 2006
Subject: [Numpy-discussion] Re: Vote: complex64 vs complex128
In-Reply-To:
References: <20060404155003.AF09016220@sc8-sf-spam2.sourceforge.net> <200604041813.k34IDmAP011500@oobleck.astro.cornell.edu> <4432CE1F.3010209@cox.net>
Message-ID: <4433B816.1080307@cox.net>

Arnd Baecker wrote:

>On Tue, 4 Apr 2006, Robert Kern wrote:
>
>>Tim Hochberg wrote:
>
>[...]
>
>I am puzzled why this does not show up with IPython:
>
>In [1]:import numpy
>In [2]:numpy.complex128?
> [...]
>
>It seems that numpy.complex128.__doc__ is None

That's right, none of the scalar types have docstrings at present. The
builtin help (AKA pydoc.help) tracks back through all the base classes
and presents all kinds of extra information. The result tends to be
awfully verbose; so much so that I just stuffed a function called hint
into __builtins__ that just prints the results of pydoc.describe and
pydoc.getdoc. It's quite possible that such a function already exists,
maybe even in pydoc, but oddly enough the docs for pydoc are pretty
impenetrable.

Here I've added basic docstrings to the complex types. I was hoping
someone would have some ideas for other stuff that should go into the
docstrings, but perhaps I'll just commit that change as is. Here's what I
see here using hint:

>>> hint(numpy.float64)   # Still no docstring
class float64scalar
>>> hint(numpy.complex64)   # Now has a terse docstring
class complex64scalar
 | Composed of two 32 bit floats
>>> hint(numpy.complex128)   # Same here.
class complex128scalar
 | Composed of two 64 bit floats

Regards,

-tim

From arnd.baecker at web.de Wed Apr 5 05:48:02 2006
From: arnd.baecker at web.de (Arnd Baecker)
Date: Wed Apr 5 05:48:02 2006
Subject: [Numpy-discussion] Speed up function on cross product of two sets?
In-Reply-To: <44315633.4010600@cox.net>
References: <44315633.4010600@cox.net>
Message-ID:

On Mon, 3 Apr 2006, Tim Hochberg wrote:

> Arnd Baecker wrote:
>
> [SNIP]
>
> >((Note that I just learned in some other thread that with numpy there is
> >an alternative to NewAxis, but I haven't figured out which that is ...))
>
> If you're old school you could just use None.

Well, I have been using python/Numeric/... for a while, but I am
definitely not old school - I was not aware that NewAxis is a longer
spelling of None ;-)

> But you probably mean 'newaxis'.

yes - perfect! Many thanks.

BTW, it seems that we have no Numeric to numpy transition remarks on
www.scipy.org.
I only found http://www.scipy.org/PearuPeterson/NumpyVersusNumeric and of course Travis' "Guide to NumPy" contains a detailed list of necessary changes in chapter 2.6.1. In addition ``site-packages/numpy/lib/convertcode.py`` provides an automatic conversion. Would it be helpful to start a new wiki page "ConvertingFromNumeric" (similar to http://www.scipy.org/Converting_from_numarray) which aims at summarizing the necessary changes or expand Pearu's page (if he agrees) on this? Best, Arnd From arnd.baecker at web.de Wed Apr 5 05:57:16 2006 From: arnd.baecker at web.de (Arnd Baecker) Date: Wed Apr 5 05:57:16 2006 Subject: [Numpy-discussion] Re: Vote: complex64 vs complex128 In-Reply-To: <4433B816.1080307@cox.net> References: <20060404155003.AF09016220@sc8-sf-spam2.sourceforge.net> <200604041813.k34IDmAP011500@oobleck.astro.cornell.edu> <4432CE1F.3010209@cox.net> <4433B816.1080307@cox.net> Message-ID: Hi, On Wed, 5 Apr 2006, Tim Hochberg wrote: [...] > That's right, none of the scalar types have docstrings at present. The > builtin help (AKA pydoc.help) tracks back through all the base classes > and presents all kinds of extra information. I see - so that might be something Ipython could do as well (if that's really what we would like to see...) > The result tends to be > awfully verbose; so much so that I just stuffed a function called hint > into __builtins___ that just prints the results of pydoc.describe and > pydoc.getdoc. It's quite possible that such a function already exists, > maybe even in pydoc, but oddly enough the docs for pydoc are pretty > impenatrable. > > Here I've added basic docstrings to the complex types. I was hoping > someone would have some ideas for other stuff that should go into the > docstrings, but perhaps I'll just commit that change as is. Here's what > I see here using hint: > > >>> hint(numpy.float64) # Still no docstring > class float64scalar > >>> hint(numpy.complex64) # Now has a terse docstring > class complex64scalar > | Composed of two 32 bit floats > >>> hint(numpy.complex128) # Same here. > class complex128scalar > | Composed of two 64 bit floats That looks much better. I am a bit unsure about `hint` though for the following reasons: There are quite a few ways to access documentation: - help(defined_object) - help("numpy.complex128") - scipy.info(defined_object) - hint(defined_object) - defined_object? # with IPython (and then of course the pydoc commands as well ...). Clearly, I would prefer to have "?" in IPython as the only thing one needs to know about accessing documentation. There are surely many aspects to consider here, but I have to rush now ... Best, Arnd From tim.hochberg at cox.net Wed Apr 5 06:24:11 2006 From: tim.hochberg at cox.net (Tim Hochberg) Date: Wed Apr 5 06:24:11 2006 Subject: [Numpy-discussion] Re: Vote: complex64 vs complex128 In-Reply-To: References: <20060404155003.AF09016220@sc8-sf-spam2.sourceforge.net> <200604041813.k34IDmAP011500@oobleck.astro.cornell.edu> <4432CE1F.3010209@cox.net> <4433B816.1080307@cox.net> Message-ID: <4433C4CC.7010003@cox.net> Arnd Baecker wrote: >Hi, > >On Wed, 5 Apr 2006, Tim Hochberg wrote: > >[...] > > > >>That's right, none of the scalar types have docstrings at present. The >>builtin help (AKA pydoc.help) tracks back through all the base classes >>and presents all kinds of extra information. >> >> > >I see - so that might be something Ipython could do as well >(if that's really what we would like to see...) 
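The hint helper Tim mentions isn't shown anywhere in the thread, so here is a minimal sketch of what it might look like -- the body and output format are guesses; only pydoc.describe and pydoc.getdoc are assumed:

import pydoc

def hint(obj):
    # one-line description, e.g. "class complex128scalar"
    print pydoc.describe(obj)
    # just the object's own docstring, without the base-class
    # dump that pydoc.help produces
    doc = pydoc.getdoc(obj)
    if doc:
        print doc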
> > > >>The result tends to be >>awfully verbose; so much so that I just stuffed a function called hint >>into __builtins___ that just prints the results of pydoc.describe and >>pydoc.getdoc. It's quite possible that such a function already exists, >>maybe even in pydoc, but oddly enough the docs for pydoc are pretty >>impenatrable. >> >>Here I've added basic docstrings to the complex types. I was hoping >>someone would have some ideas for other stuff that should go into the >>docstrings, but perhaps I'll just commit that change as is. Here's what >>I see here using hint: >> >> >>> hint(numpy.float64) # Still no docstring >>class float64scalar >> >>> hint(numpy.complex64) # Now has a terse docstring >>class complex64scalar >> | Composed of two 32 bit floats >> >>> hint(numpy.complex128) # Same here. >>class complex128scalar >> | Composed of two 64 bit floats >> >> > >That looks much better. >I am a bit unsure about `hint` though for the following reasons: >There are quite a few ways to access documentation: > - help(defined_object) > - help("numpy.complex128") > - scipy.info(defined_object) > - hint(defined_object) > - defined_object? # with IPython >(and then of course the pydoc commands as well ...). > > Sorry, I was unclear. Hint is only for my enjoyment -- it's not related to numpy. I just tossed it into my sitecustomize file. I was just get sick of doing help(complex64) and getting pages of text when all I cared about was the docstring. I suppose I could just have done "print complex64.__doc__", but I felt like hint might be useful. However, it's not something I was proposing to add to numpy, the changes I was talking about are strictly in the docstrings of complexXXX. -tim >Clearly, I would prefer to have "?" in IPython as the only thing one needs >to know about accessing documentation. > >There are surely many aspects to consider here, but I have to rush now ... > >Best, Arnd > > > > > > From emsellem at obs.univ-lyon1.fr Wed Apr 5 06:33:23 2006 From: emsellem at obs.univ-lyon1.fr (Eric Emsellem) Date: Wed Apr 5 06:33:23 2006 Subject: [Numpy-discussion] A random.normal function with stdev as array Message-ID: <4433C6D6.5080800@obs.univ-lyon1.fr> Hi, I am trying to optimize a code where I derive random numbers many times and having an array of values for the stdev parameter. I wish to have an efficient way of doing something like: ################## stdev = array([1.1,1.2,1.0,2.2]) result = numpy.zeros(stdev.shape, Float) for i in range(len(stdev)) : result[i] = numpy.random.normal(0, stdev[i]) ################## In my case, stdev can in fact be an array of a few millions floats... so I really need to optimize things. Any hint on how to code this efficiently ? And in general, where could I find tips for optimizing a code where I unfortunately have too many loops such as "for i in range(Nbody) : " with Nbody being > 10^6 ? thanks! Eric From dd55 at cornell.edu Wed Apr 5 06:34:00 2006 From: dd55 at cornell.edu (Darren Dale) Date: Wed Apr 5 06:34:00 2006 Subject: [Numpy-discussion] NumPy documentation In-Reply-To: <9E52488D-6736-4F10-A045-2D5B39CBD3F5@earthlink.net> References: <4432E27E.6030906@ee.byu.edu> <9E52488D-6736-4F10-A045-2D5B39CBD3F5@earthlink.net> Message-ID: <200604050932.56744.dd55@cornell.edu> On Wednesday 05 April 2006 00:32, Ted Horst wrote: > I'll just add my voice to the people speaking up to support Travis's > efforts. 
I buy lots of books, and most of the time I don't think too > much about who I am supporting when I buy them, but I probably would > have bought this book even if I didn't need that level of > documentation just to help support what I see as very important > work. I don't see how writing about an open source project and using > the proceeds to further that project could be seen as anything other > than a positive. > > I also just want to say how impressed I am with what Travis has > accomplished with this project. From the organizational effort, > patience, and persistence of bringing the various communities > together to the quality and quantity of the ideas, code, and > discussions, his contributions have been inspiring. I agree. I support what Travis has done. From pearu at scipy.org Wed Apr 5 07:18:02 2006 From: pearu at scipy.org (Pearu Peterson) Date: Wed Apr 5 07:18:02 2006 Subject: [Numpy-discussion] Speed up function on cross product of two sets? In-Reply-To: References: <44315633.4010600@cox.net> Message-ID: On Wed, 5 Apr 2006, Arnd Baecker wrote: > BTW, it seems that we have no Numeric to numpy transition remarks in > www.scipy.org. I only found > http://www.scipy.org/PearuPeterson/NumpyVersusNumeric > and of course Travis' "Guide to NumPy" contains a detailed list of > necessary changes in chapter 2.6.1. > In addition ``site-packages/numpy/lib/convertcode.py`` provides an > automatic conversion. > > Would it be helpful to start a new wiki page "ConvertingFromNumeric" > (similar to http://www.scipy.org/Converting_from_numarray) > which aims at summarizing the necessary changes > or expand Pearu's page (if he agrees) on this? It's better to start a new wiki page similar to Converting_from_numarray (I like the table). Btw, I have a few notes about the necessary changes for the Numeric->numpy transition in the following page: http://svn.enthought.com/enthought/wiki/NumpyPort#NotesonchangesduetoreplacingNumeric/scipy_basewithnumpy Feel free to grab these notes. Pearu From perry at stsci.edu Wed Apr 5 08:08:19 2006 From: perry at stsci.edu (Perry Greenfield) Date: Wed Apr 5 08:08:19 2006 Subject: [Numpy-discussion] NumPy documentation In-Reply-To: References: <4432E27E.6030906@ee.byu.edu> <9E52488D-6736-4F10-A045-2D5B39CBD3F5@earthlink.net> Message-ID: Speaking as someone who thinks he knows what kind of effort is involved in creating numpy, I suspect relatively few have any idea of the effort and skill that is required to do what Travis has done. Indeed, I wouldn't be surprised if Travis hadn't fully anticipated at the start what he was getting himself into, and if he hasn't asked himself more than once whether he would do it again had he known [I imagine that many worthy and memorable efforts fall into this category. Much human progress springs out of such initial optimism.] John Hunter is right that Travis's contributions to this and other scipy-related projects amount to years of work. For those that find it objectionable that Travis is trying to get some partial compensation for this work, consider whether there was anyone at all in the Python community willing to do this as well as he has for free, or even for what he will actually recover from the book. I doubt it very much. Fortunately, I think the number of people that object to Travis charging for the book is small. Unfortunately, their impact can be disproportionately large. 
I hope Travis can effectively ignore them. Perry From lennart.ohlsson at cs.lth.se Wed Apr 5 08:12:20 2006 From: lennart.ohlsson at cs.lth.se (Lennart Ohlsson) Date: Wed Apr 5 08:12:20 2006 Subject: [Numpy-discussion] Re: Newbie indexing question and print order Message-ID: <008201c658c3$30d06ab0$2f32eb82@cs060109> Hi, Although I mainly use for 2D takes here is an nd-version of such a function: def vtake(a, indices): """Corresponding to take in numpy but with vector valued indices""" indexrank = indices.shape[-1] flattedindex = 0 for i in range(indexrank): flattedindex = flattedindex*a.shape[i] + indices[...,i] flattedshape = (-1,) + a.shape[indexrank:] return a.reshape(flattedshape).take(flattedindex) - Lennart On 4/5/06, Pau Gargallo wrote: hi, i had the same problem and i defined a function with a similar sintax to interp2 which i call take2 to solve it: from numpy import * def take2( a, x,y ): return take( ravel(a), x + y*a.shape[0] ) a = array( [[ 0.15, 0.75, 0.2 ], [ 0.82, 0.5, 0.77], [ 0.21, 0.91, 0.59]] ) xy = array([ [[1, 1], [1, 1], [2, 1]], [[2, 2], [0, 0], [1, 0]], [[1, 1], [0, 0], [2, 1]]] ) print take2( a, xy[...,0], xy[...,1] ) i hope this helps you. pau On 4/5/06, amcmorl wrote: > Hi all, > > I'm having a bit of trouble getting my head around numpy's indexing > capabilities. A quick summary of the problem is that I want to > lookup/index in nD from a second array of rank n+1, such that the last > (or first, I guess) dimension contains the lookup co-ordinates for the > value to extract from the first array. Here's a 2D (3,3) example: > > In [12]:print ar > [[ 0.15 0.75 0.2 ] > [ 0.82 0.5 0.77] > [ 0.21 0.91 0.59]] > > In [24]:print inds > [[[1 1] > [1 1] > [2 1]] > > [[2 2] > [0 0] > [1 0]] > > [[1 1] > [0 0] > [2 1]]] > > then somehow return the array (barring me making any row/column errors): > In [26]: c = ar.somefancyindexingroutinehere(inds) > > In [26]:print c > [[ 0.5 0.5 0.91] > [ 0.59 0.15 0.82] > [ 0.5 0.15 0.91]] > > i.e. c[x,y] = a[ inds[x,y,0], inds[x,y,1] ] > > Any suggestions? It looks like it should be relatively simple using > 'put' or 'take' or 'fetch' or 'sit' or something like that, but I'm not > getting it. > > While I'm here, can someone help me understand the rationale behind > 'print' printing row, column (i.e. a[0,1] = 0.75 in the above example > rather than x, y (=column, row; in which case 0.75 would be in the first > column and second row), which seems to me to be more intuitive. > > I'm really enjoying getting into numpy - I can see it'll be > simpler/faster coding than my previous environments, despite me not > knowing my way at the moment, and that python has better opportunities > for extensibility. So, many thanks for your great work. > -- > Angus McMorland > email a.mcmorland at auckland.ac.nz > mobile +64-21-155-4906 > > PhD Student, Neurophysiology / Multiphoton & Confocal Imaging > Physiology, University of Auckland > phone +64-9-3737-599 x89707 > > Armourer, Auckland University Fencing > Secretary, Fencing North Inc. > > > ------------------------------------------------------- > This SF.Net email is sponsored by xPML, a groundbreaking scripting language > that extends applications into web and mobile media. Attend the live webcast > and join the prime developer group breaking into this new coding territory! 
> http://sel.as-us.falkag.net/sel?cmd=lnk&kid=110944&bid=241720&dat=121642 > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/numpy-discussion > From a.h.jaffe at gmail.com Wed Apr 5 08:18:03 2006 From: a.h.jaffe at gmail.com (Andrew Jaffe) Date: Wed Apr 5 08:18:03 2006 Subject: [Numpy-discussion] weird interaction: pickle, numpy, matplotlib.hist Message-ID: <4433DF85.7030109@gmail.com> Hi All, I've encountered a strange problem: I've been running some python code on both a linux box and OS X, both with python 2.4.1 and the latest numpy and matplotlib from svn. I have found that when I transfer pickled numpy arrays from one machine to the other (in either direction), the resulting data *looks* all right (i.e., it is a numpy array of the correct type with the correct values at the correct indices), but it seems to produce the wrong result in (at least) one circumstance: matplotlib.hist() gives the completely wrong picture (and set of bins). This can be ameliorated by running the array through arr=numpy.asarray(arr, dtype=numpy.float64) but this seems like a complete kludge (and is only needed when you do the transfer between machines). I've attached a minimal code that exhibits the problem: try test_pickle_hist.test(write=True) on one machine, transfer the output file to another machine, and run test_pickle_hist.test(write=False) on another, and you should see a very strange result (and it should be fixed if you set asarray=True). Any ideas? Andrew -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: test_pickle_hist.py URL: From ryanlists at gmail.com Wed Apr 5 08:23:06 2006 From: ryanlists at gmail.com (Ryan Krauss) Date: Wed Apr 5 08:23:06 2006 Subject: [Numpy-discussion] NumPy documentation In-Reply-To: References: <4432E27E.6030906@ee.byu.edu> Message-ID: I just realized that my "Amen" to all of this went only to Alan Isaac. I don't "reply-to-all" by default. In response to Perry's comment: "I hope Travis can effectively ignore them." I think a spam filter with "wolf" and "sheep" might be a good start, but it could accidentally delete some interesting "poetry" . Ryan On 4/4/06, Ryan Krauss wrote: > Let me add my thanks and also say that as a grad student who plans to > buy your book once I graduate, NumPy's use is not inhibited by Travis > charging for the documentation. > > Thanks! > > Ryan Krauss > > On 4/4/06, Alan G Isaac wrote: > > On Tue, 04 Apr 2006, Travis Oliphant apparently wrote: > > > I'm not going to dislike or have any kind of ill feelings > > > with anyone who decides to spend their time on > > > "documentation." In fact, I'll appreciate it just like > > > everyone else. > > > > Of course you were extremely clear about this from the > > beginning. Thank you for numpy!!! > > Alan Isaac (grateful user of numpy) > > PS Your book is *very* helpful. > > > > > > > > > > > > ------------------------------------------------------- > > This SF.Net email is sponsored by xPML, a groundbreaking scripting language > > that extends applications into web and mobile media. Attend the live webcast > > and join the prime developer group breaking into this new coding territory! 
> > http://sel.as-us.falkag.net/sel?cmd=lnk&kid=110944&bid=241720&dat=121642 > > _______________________________________________ > > Numpy-discussion mailing list > > Numpy-discussion at lists.sourceforge.net > > https://lists.sourceforge.net/lists/listinfo/numpy-discussion > > > From zpincus at stanford.edu Wed Apr 5 08:32:02 2006 From: zpincus at stanford.edu (Zachary Pincus) Date: Wed Apr 5 08:32:02 2006 Subject: [Numpy-discussion] array constructor from generators? In-Reply-To: <44331200.2020604@cox.net> References: <44331200.2020604@cox.net> Message-ID: <884F03C6-599C-426A-A0A0-97009B63EACB@stanford.edu> [sorry if this comes through twice -- seems to have not sent the first time] Hi folks, tim> > I brought this up last week and Travis was OK with it. I have it on > my todo list, but if you are in a hurry you're welcome to do it > instead. Sorry if that was on the list and I missed it! Hate to be adding more noise than signal. At any rate, I'm not in a hurry, but I'd be happy to help where I can. (Though for the next week or so I think I'm swamped...) tim> > If you do look at it, consider looking into the '__length_hint__ > parameter that's slated to go into Python 2.5. When this is > present, it's potentially a big win, since you can preallocate the > array and fill it directly from the iterator. Without this, you > probably can't do much better than just building a list from the > array. What would work well would be to build a list, then steal > its memory. I'm not sure if that's feasible without leaking a > reference to the list though. Can you steal its memory and then give it some dummy memory that it can free without problems, so that the list can be deallocated without trouble? Does anyone know if you can just give the list a NULL pointer for it's memory and then immediately decref it? free (NULL) should always be safe, I think. (??) > Also, with iterators, specifying dtype will make a huge difference. > If an object has __length_hint__ and you specify dtype, then you > can preallocate the array as I suggested above. However, if dtype > is not specified, you still need to build the list completely, > determine what type it is, allocate the array memory and then copy > the values into it. Much less efficient! How accurate is __length_hint__ going to be? It could lead to a fair bit of special case code for growing and shrinking the final array if __length_hint__ turns out to be wrong. Code that python lists already have, moreover. If the list's memory can be stolen safely, how does this strategy sound: - Given a generator, build it up into a list internally, and then steal the list's memory. - If a dtype is provided, wrap the generator with another generator that casts the original generator's output to the correct dtype. Then use the wrapped generator to create a list of the proper dtype, and steal that list's memory. A potential problem with stealing list memory is that it could waste memory if the list has more bytes allocated than it is using (I'm not sure if python lists can get this way, but I presume that they resize themselves only every so often, like C++ or Java vectors, so most of the time they have some allocated but unused bytes). If lists have a squeeze method that's guaranteed not to cause any copies, or if this can be added with judicious use of realloc, then that problem is obviated. robert> > Another note of caution: You are going to have to deal with > iterators of > iterators of iterators of.... 
I'm not sure if that actually overly > complicates > matters; I haven't looked at PyArray_New for some time. Enjoy! This is a good point. Numpy does fine with nested lists, but what should it do with nested generators? I originally thought that basically 'array(generator)' should make the exact same thing as 'array([f for f in generator])'. However, for nested generators, this would be an object array of generators. I'm not sure which is better -- having more special cases for generators that make generators, or having a simple rubric like above for how generators are treated. Any thoughts? Zach From robert.kern at gmail.com Wed Apr 5 08:36:03 2006 From: robert.kern at gmail.com (Robert Kern) Date: Wed Apr 5 08:36:03 2006 Subject: [Numpy-discussion] Re: A random.normal function with stdev as array In-Reply-To: <4433C6D6.5080800@obs.univ-lyon1.fr> References: <4433C6D6.5080800@obs.univ-lyon1.fr> Message-ID: Eric Emsellem wrote: > Hi, > > I am trying to optimize a code where I derive random numbers many times > and having an array of values for the stdev parameter. > > I wish to have an efficient way of doing something like: > ################## > stdev = array([1.1,1.2,1.0,2.2]) > result = numpy.zeros(stdev.shape, Float) > for i in range(len(stdev)) : > result[i] = numpy.random.normal(0, stdev[i]) > ################## You can use the fact that the standard deviation of a normal distribution is a scale parameter. You can get random normal deviates of varying standard deviation by multiplying a standard normal deviate by the desired standard deviation (how's that for confusing terminology, eh?). result = numpy.random.standard_normal(stdev.shape) * stdev > In my case, stdev can in fact be an array of a few millions floats... > so I really need to optimize things. > > Any hint on how to code this efficiently ? > > And in general, where could I find tips for optimizing a code where I > unfortunately have too many loops such as "for i in range(Nbody) : " > with Nbody being > 10^6 ? Tim Hochberg recently made this list: """ 0. Think about your algorithm. 1. Vectorize your inner loop. 2. Eliminate temporaries 3. Ask for help 4. Recode in C. 5. Accept that your code will never be fast. Step zero should probably be repeated after every other step ;) """ That's probably the best general advice. To get better advice, we would need to know the specifics of the problem. -- Robert Kern robert.kern at gmail.com "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From a.h.jaffe at gmail.com Wed Apr 5 08:48:27 2006 From: a.h.jaffe at gmail.com (Andrew Jaffe) Date: Wed Apr 5 08:48:27 2006 Subject: [Numpy-discussion] Re: weird interaction: pickle, numpy, matplotlib.hist [sort() method problem?] Message-ID: OK, I think I've managed to track the problem down a bit further: the sort() method is failing for arrays pickled on another machine! That is, it's definitely not sorting the array, but changing to a very strange order (neither the way it started nor sorted). Again, the array seems to otherwise behave fine (indeed, it even satisfies all(a==a1) for a pair that behave differently in this circumstance). Hmmm... A On 4/5/06, Andrew Jaffe wrote: > > Hi All, > > I've encountered a strange problem: I've been running some python code > on both a linux box and OS X, both with python 2.4.1 and the latest > numpy and matplotlib from svn. 
> > I have found that when I transfer pickled numpy arrays from one machine > to the other (in either direction), the resulting data *looks* all right > (i.e., it is a numpy array of the correct type with the correct values > at the correct indices), but it seems to produce the wrong result in (at > least) one circumstance: matplotlib.hist() gives the completely wrong > picture (and set of bins). > > This can be ameliorated by running the array through > arr=numpy.asarray(arr, dtype=numpy.float64) > but this seems like a complete kludge (and is only needed when you do > the transfer between machines). > > I've attached a minimal code that exhibits the problem: try > test_pickle_hist.test(write=True) > on one machine, transfer the output file to another machine, and run > test_pickle_hist.test(write=False) > on another, and you should see a very strange result (and it should be > fixed if you set asarray=True). > > Any ideas? > > Andrew > > > import cPickle > import numpy > import pylab > > def test(write=True,asarray=False): > > a = numpy.linspace(-3,3,num=100) > > if write: > f1 = file("a.cpkl", 'w') > cPickle.dump(a, f1) > f1.close() > > f1 = open("a.cpkl", 'r') > a1 = cPickle.load(f1) > f1.close() > > pylab.subplot(1,2,1) > h = pylab.hist(a) > > if asarray: > a1 = numpy.asarray(a1, dtype=numpy.float64) > > pylab.subplot(1,2,2) > h1 = pylab.hist(a1) > > return a, a1 > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From byrnes at bu.edu Wed Apr 5 08:58:21 2006 From: byrnes at bu.edu (John Byrnes) Date: Wed Apr 5 08:58:21 2006 Subject: [Numpy-discussion] A random.normal function with stdev as array In-Reply-To: <4433C6D6.5080800@obs.univ-lyon1.fr> References: <4433C6D6.5080800@obs.univ-lyon1.fr> Message-ID: <20060405155736.GA9364@localhost.localdomain> Hi Eric, In the past , I've done things like ###### normdist = lambda x: numpy.random.normal(0,x) vecnormal = numpy.vectorize(normdist) stdev = numpy.array([1.1,1.2,1.0,2.2]) result = vecnormal(stdev) ###### This works fine for up to 10k elements for stdev for some reason. Any larger then that and i get a Bus error on my PPC mac and a segfault on my x86 linux box. I'm running numpy 0.9.7.2325 on both machines. Perhaps for larger inputs, you could break up your loop into smaller vectorized chunks. Regards, John On Wed, Apr 05, 2006 at 03:32:06PM +0200, Eric Emsellem wrote: > Hi, > > I am trying to optimize a code where I derive random numbers many times > and having an array of values for the stdev parameter. > > I wish to have an efficient way of doing something like: > ################## > stdev = array([1.1,1.2,1.0,2.2]) > result = numpy.zeros(stdev.shape, Float) > for i in range(len(stdev)) : > result[i] = numpy.random.normal(0, stdev[i]) > ################## > > In my case, stdev can in fact be an array of a few millions floats... > so I really need to optimize things. > > Any hint on how to code this efficiently ? > > And in general, where could I find tips for optimizing a code where I > unfortunately have too many loops such as "for i in range(Nbody) : " > with Nbody being > 10^6 ? > > thanks! > Eric > > > ------------------------------------------------------- > This SF.Net email is sponsored by xPML, a groundbreaking scripting language > that extends applications into web and mobile media. Attend the live webcast > and join the prime developer group breaking into this new coding territory! 
> http://sel.as-us.falkag.net/sel?cmd=lnk&kid=110944&bid=241720&dat=121642 > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/numpy-discussion -- If liberty and equality, as is thought by some are chiefly to be found in democracy, they will be best attained when all persons alike share in the government to the utmost. -- Aristotle, Politics From bsouthey at gmail.com Wed Apr 5 09:05:03 2006 From: bsouthey at gmail.com (Bruce Southey) Date: Wed Apr 5 09:05:03 2006 Subject: [Numpy-discussion] NumPy documentation In-Reply-To: <4432E27E.6030906@ee.byu.edu> References: <4432E27E.6030906@ee.byu.edu> Message-ID: Hi, Sorry that you received such an email. It is one thing to disagree with your choice but it is inexcusable to dictate what you should do with your code/documentation (not to mention the language). Unfortunately, this appears to be the result of the typical confusion of what 'free' refers to in open source software. If this person thought that purchasing documentation is bad then I wonder what they think of the PyMOL project: "If you use PyMOL at work, then you are asked and expected to sponsor the project by purchasing a PyMOL Subscription" (http://www.pymol.org/funding.html)! Really the 'book' issue is more an excuse than a real reason for people not to use numpy. Personally I really think that you should get the 1.0 release out that probably would change some minds. Based on the list postings, the stability of numpy already exceeds a typical 1.0 release level. Regards Bruce From schofield at ftw.at Wed Apr 5 09:10:05 2006 From: schofield at ftw.at (Ed Schofield) Date: Wed Apr 5 09:10:05 2006 Subject: [Numpy-discussion] NumPy documentation In-Reply-To: <4432E27E.6030906@ee.byu.edu> References: <4432E27E.6030906@ee.byu.edu> Message-ID: <4433EC3C.9050706@ftw.at> I'd also like to express my gratitude, Travis, for all the time and energy you've donated to both NumPy and SciPy. I also fully support your decision to charge for your book. Perhaps your correspondent expects your book to be free because it's online. Perhaps some re-branding -- from "fee-based documentation" to "book" or "handbook for users and developers" -- would help to avoid evoking such unfair responses? Incidentally, you mention on on the site that you'll print and bind hard-copy version once your sales reach 200 copies. I think this would help to encourage libraries and conservative institutions to purchase copies. Are your sales still under this level?! I'm now going to order a copy for my institution -- and a hard copy when it's available :) -- Ed From robert.kern at gmail.com Wed Apr 5 09:11:01 2006 From: robert.kern at gmail.com (Robert Kern) Date: Wed Apr 5 09:11:01 2006 Subject: [Numpy-discussion] Re: weird interaction: pickle, numpy, matplotlib.hist In-Reply-To: <4433DF85.7030109@gmail.com> References: <4433DF85.7030109@gmail.com> Message-ID: Andrew Jaffe wrote: > Hi All, > > I've encountered a strange problem: I've been running some python code > on both a linux box and OS X, both with python 2.4.1 and the latest > numpy and matplotlib from svn. 
> > I have found that when I transfer pickled numpy arrays from one machine > to the other (in either direction), the resulting data *looks* all right > (i.e., it is a numpy array of the correct type with the correct values > at the correct indices), but it seems to produce the wrong result in (at > least) one circumstance: matplotlib.hist() gives the completely wrong > picture (and set of bins). > > This can be ameliorated by running the array through > arr=numpy.asarray(arr, dtype=numpy.float64) > but this seems like a complete kludge (and is only needed when you do > the transfer between machines). You have a byteorder issue. Your Linux box, which I presume has an Intel or AMD CPU, is little-endian, where your OS X box, which I presume has a PPC CPU, is big-endian. numpy arrays can store their data in either endianness on either kind of platform; their dtype objects tell you which byteorder they are using. In the dtype specifications below, '>' means big-endian (I am using a PPC PowerBook), and '<' means little-endian. In [31]: a = linspace(0, 10, 11) In [32]: a Out[32]: array([ 0., 1., 2., 3., 4., 5., 6., 7., 8., 9., 10.]) In [33]: a.dtype Out[33]: dtype('>f8') In [34]: b = a.newbyteorder() In [35]: b Out[35]: array([ 0.00000000e+000, 3.03865194e-319, 3.16202013e-322, 1.04346664e-320, 2.05531309e-320, 2.56123631e-320, 3.06715953e-320, 3.57308275e-320, 4.07900597e-320, 4.33196758e-320, 4.58492919e-320]) In [36]: b.dtype Out[36]: dtype('<f8') From Chris.Barker at noaa.gov Wed Apr 5 2006 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Wed Apr 5 2006 Subject: [Numpy-discussion] NumPy documentation References: <4432E27E.6030906@ee.byu.edu> <4432E973.8070601@noaa.gov> <4432F4DD.6060000@cox.net> <1A94CA82-2EB5-4145-9EA9-453DE60AE684@stanford.edu> Message-ID: <4433F1F6.4010603@noaa.gov> Zachary Pincus wrote: > from Numeric (who was used to the large, free manual) Which brings up a question: Is the source to the old Numeric manual available? It would be nice to "port" it to SciPy. -Chris -- Christopher Barker, Ph.D. Oceanographer NOAA/OR&R/HAZMAT (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From bsouthey at gmail.com Wed Apr 5 09:46:03 2006 From: bsouthey at gmail.com (Bruce Southey) Date: Wed Apr 5 09:46:03 2006 Subject: [Numpy-discussion] A random.normal function with stdev as array In-Reply-To: <4433C6D6.5080800@obs.univ-lyon1.fr> References: <4433C6D6.5080800@obs.univ-lyon1.fr> Message-ID: Hi, Can you provide more details on what you are doing, especially how you are using this? The one item that is not directly part of Tim's list is that sometimes you need to reorder your loops (perhaps this is part of "Think about your algorithm"?). Loop swapping is very common to improve performance. However, it usually requires a very clear head or someone else to do it. Also, you might need to break loops into pieces where you repeat the same tasks and computations over and over. The other aspect is to do some algebra on the calculations, as the stdev is essentially a constant, so depending on how you use it you can factor it out further. Again it all depends on what you are actually doing with these numbers. From a different view, you need to be very careful with your (pseudo)random number generator with that many samples. These have a tendency to repeat so your random number stream is no longer random. See the Wikipedia entry: http://en.wikipedia.org/wiki/Pseudorandom_number_generator If I recall correctly, the Python random number generator is a Mersenne twister but ranlib is not and so prone to the mentioned problems. 
I do not know if SciPy adds any other generators. Finally I would also cheat by reducing the stdev values because in many cases you will not see a real difference between a normal with mean zero and variance 1.0 and a normal with mean zero and variance 1.1 (especially if you are doing more than comparing distributions so there are more sources of 'error') unless you have a really large number of samples. Regards Bruce On 4/5/06, Eric Emsellem wrote: > Hi, > > I am trying to optimize a code where I derive random numbers many times > and having an array of values for the stdev parameter. > > I wish to have an efficient way of doing something like: > ################## > stdev = array([1.1,1.2,1.0,2.2]) > result = numpy.zeros(stdev.shape, Float) > for i in range(len(stdev)) : > result[i] = numpy.random.normal(0, stdev[i]) > ################## > > In my case, stdev can in fact be an array of a few millions floats... > so I really need to optimize things. > > Any hint on how to code this efficiently ? > > And in general, where could I find tips for optimizing a code where I > unfortunately have too many loops such as "for i in range(Nbody) : " > with Nbody being > 10^6 ? > > thanks! > Eric > > > ------------------------------------------------------- > This SF.Net email is sponsored by xPML, a groundbreaking scripting language > that extends applications into web and mobile media. Attend the live webcast > and join the prime developer group breaking into this new coding territory! > http://sel.as-us.falkag.net/sel?cmd=lnk&kid=110944&bid=241720&dat=121642 > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/numpy-discussion > From tim.hochberg at cox.net Wed Apr 5 09:58:08 2006 From: tim.hochberg at cox.net (Tim Hochberg) Date: Wed Apr 5 09:58:08 2006 Subject: [Numpy-discussion] A random.normal function with stdev as array] Message-ID: <4433F71B.5080201@cox.net> Eric Emsellem wrote: > Hi, > this is illuminating in fact. These are things I would not have > thought about. > > I am trying at the moment to understand why two versions of my program > have a difference of about 10% (here it is 2sec for 1000 points, so > you can imagine for 10^6...) although the code is nearly the same. > > I have loops such as: > > #################### > bigarray = array of Nbig points > for i in range(N) : > bigarray = bigarray + calculation > #################### If you tell us more about calculation, we could probably help more. This sounds like you want to vectorize the inner loop, but you may in fact have already done that. There's nothing wrong with looping in python as long as you amortize the loop overhead over a large number of operations. Thus, the advice to vectorize your *inner* loop, not vectorize all loops. Attempting the latter can lead to impenatrable code, usually doesn't help signifigantly and sometimes slows things down as you overflow the cache with big matrices. > > I thought to do it by: > #################### > bigarray = numpy.sum(array([calculation for i in range(N)])) > #################### > not sure this is good... I suspect not, but timeit is your friend.... > > And you are basically saying that > > bigarray = bigarray + calculation > > is better than > > bigarray += calculation > > or is it strictly equivalent? (in terms of CPU...) Actually the reverse. "bigarray += calculation" should be better in terms of both speed and memory usage. 
In this case it's also clearer, so it's an improvement all around. They both do the same number of adds, but the first allocates more memory and pushes more data back and forth between main memory and the cache. The point I was making about += verus + was that I wouldn't in general recommend: a = some_func() a += something_else over: a = some_func() + something_else because it's less clear. In cases, where you do need really need the speed, it's fine, but most of the time that's not the case. In your case, the speedup is fairly minor, I believe because random.normal is fairly expensive. If you instead compare these two ways of computing a cube, you'll see a much larger difference (37%). >>> setup = "import numpy; stddev=numpy.arange(1e6,dtype=float)%3" >>> timeit.Timer('stddev * stddev * stddev', setup).timeit(20) 1.206557537340359 >>> timeit.Timer('result = stddev*stddev; result *= stddev', setup).timeit(20) 0.88055493086403658 However, if you work with smaller matrices, the effect almost disappears (5%): >>> setup = "import numpy; stddev=numpy.arange(1e4,dtype=float)%3" >>> timeit.Timer('result = stddev*stddev; result *= stddev', setup).time 0.10166515576702295 >>> timeit.Timer('stddev * stddev * stddev', setup).timeit(2000) 0.10613667379493563 I believe that's because the speedup is nearly all due to reducing the amount of data you move around. In the second case everything fits in the cache, so this effect is minor. In the first you are pushing data back and forth to main memory so it's fairly large. On my machine these sort of effects kick in somewhere between 10,000 and 100,000 elements. > > thanks for the help, and sorry for the dum questions Not a problem. These are all legitimate questions that you can't really be expected to know without a fair amount of experience with numpy or its predecessors. It would be cool if someone added a page to the wicki on the topic so we could start collecting and orgainizing this information. For all I know there's one already there though -- I should probably check. -tim > > Eric > > Tim Hochberg wrote: > >> Eric Emsellem wrote: >> >>> >>>> >>>> >>>> Since stdev essentially scales the output of random, wouldn't the >>>> followin be equivalent to the above? >>>> >>>> result = numpy.random.normal(0, 1, stddev.shape) >>>> result *= stdev >>>> >>> yes indeed, this is a good option where in fact I could do >>> >>> result = stddev * numpy.random.normal(0, 1, stddev.shape) >>> >>> in one line. >>> thanks for the tip >> >> >> Indeed you can. However, keep in mind that the one line version is >> equivalent to: >> >> temp = numpy.random.normal(0, 1, stddev.shape) >> result = stddev * temp >> >> That is, it creates an extra temporary variable only to throw it >> away. The two line version I posted above avoids that temporary and >> thus should be both faster and less memory hungry. It's always good >> to check these things however: >> >> >>> setup = "import numpy; stddev=numpy.arange(1e6,dtype=float)%3" >> >>> timeit.Timer('stddev * numpy.random.normal(0, 1, stddev.shape)', >> setup).timeit(20) >> 3.4527201082819232 >> >>> timeit.Timer('result = numpy.random.normal(0, 1, stddev.shape); >> result*=stddev', setup).timeit(20) >> 3.1093337281693607 >> >> So, yes, the two line method is marginally faster (about 10%). Most >> of the time you shouldn't care about this: the one line version is >> clearer and most of the code you write isn't a bottleneck. Starting >> out writing this as the two line version is premature optimization. 
I >> used it here since the question was about optimization . >> >> I see Robert Kern just posted my list. If you want to put this in >> terms of that list, then: >> >> 0. Think about your algorithm >> => Recognize that stddev is a scale parameter >> 1. Vectorize your inner loop. >> => This is a no brainer after 0 resulting in the one line version >> 2. Eliminate temporaries >> => This results in the two line version. >> ... >> >> Also key here is recognizing when to stop. Steps 0 is always >> appropriate and step 1 is almost always good, resulting in code that >> is both clearer and faster. However, once you get to step 2 and >> beyond you tend to trade speed/memory usage for clarity. Not always: >> sometime *= and friends are clearer, but often, particularly if you >> start resorting to three arg ufuncs. So, my advice is to stop >> optimizing as soon as your code is fast enough. >> >> >>> (of course this is not strictly equivalent depending on the random >>> generator, but that will be fine for my purpose) >> >> >> I'll have to take your word for it -- after the normal distribution >> my knowledge in the area peters out rapidly/ >> >> Regards, >> >> -tim >> >> > From emsellem at obs.univ-lyon1.fr Wed Apr 5 10:06:04 2006 From: emsellem at obs.univ-lyon1.fr (Eric Emsellem) Date: Wed Apr 5 10:06:04 2006 Subject: [Numpy-discussion] A random.normal function with stdev as array In-Reply-To: References: <4433C6D6.5080800@obs.univ-lyon1.fr> Message-ID: <4433F8D1.7090305@obs.univ-lyon1.fr> An HTML attachment was scrubbed... URL: From perry at stsci.edu Wed Apr 5 10:09:01 2006 From: perry at stsci.edu (Perry Greenfield) Date: Wed Apr 5 10:09:01 2006 Subject: [Numpy-discussion] NumPy documentation In-Reply-To: <4433F1F6.4010603@noaa.gov> References: <4432E27E.6030906@ee.byu.edu> <4432E973.8070601@noaa.gov> <4432F4DD.6060000@cox.net> <1A94CA82-2EB5-4145-9EA9-453DE60AE684@stanford.edu> <4433F1F6.4010603@noaa.gov> Message-ID: On Apr 5, 2006, at 12:36 PM, Christopher Barker wrote: > Zachary Pincus wrote: >> from Numeric (who was used to the large, free manual) > > Which brings up a question: Is the source to the old Numeric manual > available? it would be nice to "port" it to SciPy. Sort of. The original source was in Framemaker format. It was converted to the Python latex framework in the process of being adopted to numarray. The source for that is available on the numarray repository. If you want the framemaker source, I may be able to dig that up somewhere (or I may have lost track of it :-). Paul Dubois can likely provide it as well; that's who gave me the source. Perry From hetland at tamu.edu Wed Apr 5 10:15:27 2006 From: hetland at tamu.edu (Robert Hetland) Date: Wed Apr 5 10:15:27 2006 Subject: [Numpy-discussion] NumPy documentation In-Reply-To: <4433F1F6.4010603@noaa.gov> References: <4432E27E.6030906@ee.byu.edu> <4432E973.8070601@noaa.gov> <4432F4DD.6060000@cox.net> <1A94CA82-2EB5-4145-9EA9-453DE60AE684@stanford.edu> <4433F1F6.4010603@noaa.gov> Message-ID: Let's not forget that this documentation will eventually be free *no matter what* -- after a financial goal is met or after a certain amount of time. This makes it fundamentally different than a published book (and in my opinion, much better). I personally think this is an innovative way to create a free product that everybody wants, but nobody wants to do. 
-Rob ----- Rob Hetland, Assistant Professor Dept of Oceanography, Texas A&M University p: 979-458-0096, f: 979-845-6331 e: hetland at tamu.edu, w: http://pong.tamu.edu From fonnesbeck at gmail.com Wed Apr 5 10:28:10 2006 From: fonnesbeck at gmail.com (Chris Fonnesbeck) Date: Wed Apr 5 10:28:10 2006 Subject: Fwd: [Numpy-discussion] NumPy documentation In-Reply-To: <723eb6930604051026q7dbcaad2w47c059f6c88e8db7@mail.gmail.com> References: <4432E27E.6030906@ee.byu.edu> <723eb6930604051026q7dbcaad2w47c059f6c88e8db7@mail.gmail.com> Message-ID: <723eb6930604051027m5aac408dnbba356ebdcb389ac@mail.gmail.com> On 4/4/06, Travis Oliphant wrote: > > I received a rather hurtful email today that was very discouraging to me > personally. Basically, I was called "lame" and a "wolf" in sheep's > clothing because I'm charging for documentation. There is one in every crowd, it seems. This email, and any others like it, should be utterly ignored, in the hopes that their authors will go elsewhere for scientific computing solutions. If they had spent any time at all on this list, they would have noticed the seemingly boundless attention and support that Travis bestows upon both scipy and its user community. Chris -- Chris Fonnesbeck + Atlanta, GA + http://trichech.us -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Wed Apr 5 10:29:07 2006 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed Apr 5 10:29:07 2006 Subject: [Numpy-discussion] NumPy documentation In-Reply-To: <4433EC3C.9050706@ftw.at> References: <4432E27E.6030906@ee.byu.edu> <4433EC3C.9050706@ftw.at> Message-ID: Heh, On 4/5/06, Ed Schofield wrote: > Perhaps some re-branding -- from "fee-based documentation" to > "book" or "handbook for users and developers" I think that's a great idea! "Handbook for Users and Developers" sounds much better and doesn't have that nasty "documentation should be free" implication. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Wed Apr 5 11:35:01 2006 From: robert.kern at gmail.com (Robert Kern) Date: Wed Apr 5 11:35:01 2006 Subject: [Numpy-discussion] Re: A random.normal function with stdev as array In-Reply-To: <4433F8D1.7090305@obs.univ-lyon1.fr> References: <4433C6D6.5080800@obs.univ-lyon1.fr> <4433F8D1.7090305@obs.univ-lyon1.fr> Message-ID: > Bruce Southey wrote: >>>From a different view, you need to be very careful with your >>(pseudo)random number generator with that many samples. These have a >>tendency to repeat so your random number stream is no longer random. >>See the Wikipedia entry: >>http://en.wikipedia.org/wiki/Pseudorandom_number_generator >> >>If I recall correctly, the Python random number generator is a >>Mersenne twister but ranlib is not and so prone to the mentioned >>problems. I do not know if SciPy adds any other generators. numpy.random uses the Mersenne Twister. RANLIB is dead! Long live MT19937! -- Robert Kern robert.kern at gmail.com "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." 
-- Umberto Eco From Chris.Barker at noaa.gov Wed Apr 5 11:59:04 2006 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Wed Apr 5 11:59:04 2006 Subject: [Numpy-discussion] NumPy documentation In-Reply-To: References: <4432E27E.6030906@ee.byu.edu> <4432E973.8070601@noaa.gov> <4432F4DD.6060000@cox.net> <1A94CA82-2EB5-4145-9EA9-453DE60AE684@stanford.edu> <4433F1F6.4010603@noaa.gov> Message-ID: <44341348.3050505@noaa.gov> Perry Greenfield wrote: > Sort of. The original source was in Framemaker format. It was converted > to the Python latex framework in the process of being adopted to > numarray. The source for that is available on the numarray repository. > If you want the framemaker source, I may be able to dig that up > somewhere (or I may have lost track of it :-). Paul Dubois can likely > provide it as well; that's who gave me the source. Thanks. That's good news. Now, when I'm done with everything else I want to work on..... LaTeX is a better option for me anyway. In fact, it's a better option for anyone that doesn't already use FrameMaker, as you can at least edit some of the text without knowing or using LaTeX at all. -Chris -- Christopher Barker, Ph.D. Oceanographer NOAA/OR&R/HAZMAT (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From Chris.Barker at noaa.gov Wed Apr 5 12:07:10 2006 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Wed Apr 5 12:07:10 2006 Subject: [Numpy-discussion] array constructor from generators? In-Reply-To: References: Message-ID: <44341538.4040907@noaa.gov> Zachary Pincus wrote: > I often construct arrays from list comprehensions on generators, > numpy.array([map(float, line.split()) for line in file]) I know there are other uses, and this was just an example, but you can now do: numpy.fromfile(file, dtype=numpy.Float, sep="\t") Which is much faster and cleaner, if you ask me. Thanks for adding this, Travis! Tim Hochberg wrote: > Without this, you probably can't do much > better than just building a list from the array. What would work well > would be to build a list, then steal its memory. Perhaps another option is to borrow the machinery from fromfile (see above), that builds an array without knowing how big it is when it starts. I haven't looked at the code, but I know that Travis got at least the idea, if not the method, from my FileScanner module I wrote a while back, and that dynamically allocated the memory it needed as it grew. -Chris -- Christopher Barker, Ph.D. Oceanographer NOAA/OR&R/HAZMAT (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From tim.hochberg at cox.net Wed Apr 5 12:16:11 2006 From: tim.hochberg at cox.net (Tim Hochberg) Date: Wed Apr 5 12:16:11 2006 Subject: [Numpy-discussion] array constructor from generators? In-Reply-To: <884F03C6-599C-426A-A0A0-97009B63EACB@stanford.edu> References: <44331200.2020604@cox.net> <884F03C6-599C-426A-A0A0-97009B63EACB@stanford.edu> Message-ID: <4434175D.10103@cox.net> Zachary Pincus wrote: > [sorry if this comes through twice -- seems to have not sent the > first time] I've only seen it once so far, but my numpy mail seems to be coming through all out of order right now. > Hi folks, > > tim> > >> I brought this up last week and Travis was OK with it. I have it on >> my todo list, but if you are in a hurry you're welcome to do it >> instead. > > > Sorry if that was on the list and I missed it! 
Hate to be adding more > noise than signal. At any rate, I'm not in a hurry, but I'd be happy > to help where I can. (Though for the next week or so I think I'm > swamped...) There was no real discussion then. I said I thought it was a good idea. Travis said OK. That was about it. > tim> > >> If you do look at it, consider looking into the '__length_hint__ >> parameter that's slated to go into Python 2.5. When this is present, >> it's potentially a big win, since you can preallocate the array and >> fill it directly from the iterator. Without this, you probably can't >> do much better than just building a list from the array. What would >> work well would be to build a list, then steal its memory. I'm not >> sure if that's feasible without leaking a reference to the list though. > > > Can you steal its memory and then give it some dummy memory that it > can free without problems, so that the list can be deallocated > without trouble? Does anyone know if you can just give the list a > NULL pointer for it's memory and then immediately decref it? free > (NULL) should always be safe, I think. (??) That might well work, but now I realize that using a list this way probably won't work out well for other reasons. >> Also, with iterators, specifying dtype will make a huge difference. >> If an object has __length_hint__ and you specify dtype, then you can >> preallocate the array as I suggested above. However, if dtype is not >> specified, you still need to build the list completely, determine >> what type it is, allocate the array memory and then copy the values >> into it. Much less efficient! > > > How accurate is __length_hint__ going to be? It could lead to a fair > bit of special case code for growing and shrinking the final array if > __length_hint__ turns out to be wrong. see below. > Code that python lists already have, moreover. If we don't know dtype up front, lists are great. All the code is there and we need to look at all of the elements before we know what the elements are anyway. However, if you do know what dtype is the situation is different. Since these are generators, the object they create may only last until the next next() call if we don't hold onto it. That means that for a matrix of size N, generating thw whole list is going to require N*(sizeof(long) + sizeof(pyobjType) + sizeof(dtype)), versus just N*sizeof(dtype) if we're careful. I'm not sure what all of those various sizes are, but I'm going to guess that we'd be at least doubling our memory. All is not lost however. When we know the dtype, we should just use a *python* array to hold the data. It works just like a list, but on packed data. > > If the list's memory can be stolen safely, how does this strategy sound: Let me break this into two cases: 1. We don't know the dtype. > - Given a generator, build it up into a list internally +1 > , and then steal the list's memory. -0.5 I'm not sure this buys us as much as I thought initially. The list memory is PyObject*, so this would only work on dtypes no larger than the size of a pointer, usually that means no larger than a long. So, basically this would work on most of the integer types, but not the floating point types. And, it adds extra complexity to support two different cases. I'd be inclined to start with just copying the objects out of the list. If someone feels like it later, they can come back and try to optimize the case of integers to steal the lists memory.. 
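As an aside on the packed python-array idea above: the known-dtype path can be prototyped in pure Python, before any C-level memory stealing, by appending into a stdlib array.array and copying the packed bytes out at the end. A rough sketch only -- fromiter_sketch is a made-up name, and it assumes numpy.frombuffer accepts any buffer-exporting object such as array.array:

import array
import numpy

def fromiter_sketch(iterable, typecode='d'):
    buf = array.array(typecode)   # packed storage that appends like a list
    for item in iterable:
        buf.append(item)
    # copy the packed buffer into a fresh numpy array (a copy, not a steal)
    return numpy.frombuffer(buf, dtype=typecode).copy()

squares = fromiter_sketch(x * x for x in range(10))

The point of the packed buffer is the memory figure discussed above: one machine double per element instead of a pointer plus a float object per element.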
Keep in mind that once we have a list, we can simply pass it to the machinery that already exists for creating arrays from lists, making our lives much easier. > - If a dtype is provided, wrap the generator with another generator > that casts the original generator's output to the correct dtype. Then > use the wrapped generator to create a list of the proper dtype, and > steal that list's memory. -1. This wastes a lot of space and sort of defeats the purpose of the whole exercise in my mind. 2. Dtype is known. The case where dtype is provided is more complicated, but this is the case we really want to support well. Actually though, I think we can simplify it by judicious punting. Case 2a: Array is not 1-dimensional. Punt and fall back on the general code above. We can determine this simply by testing the first element. If it's not int/float/complex/whatever-other-scalar-values-we-have, fall back to case 1. Case 2b: length_hint is not given. In this case, we build up the array in a python array, steal the data, possibly realloc and we're done. Case 2c: length_hint is given. Same as above, but preallocate the appropriate amount of memory, growing if length_hint lies. > > A potential problem with stealing list memory is that it could waste > memory if the list has more bytes allocated than it is using (I'm not > sure if python lists can get this way, but I presume that they resize > themselves only every so often, like C++ or Java vectors, so most of > the time they have some allocated but unused bytes). If lists have a > squeeze method that's guaranteed not to cause any copies, or if this > can be added with judicious use of realloc, then that problem is > obviated. I imagine once you steal the memory, realloc would be the thing to try. However, I don't think it's worth stealing the memory from lists. I do think it's worth stealing the memory from python arrays however, and I'm sure that the same issue exists there. We'll have to look at how the deallocation for an array works. It probably uses Py_XDecref, in which case we can just replace the memory with NULL and we'll be fine. OK, just had a look at the code for the python array object (Modules/arraymodule.c). Looks like it'll be a piece of cake. We can allocate it to the exact size we want if we have length_hint, otherwise resize only overallocates by 6%. That's not enough to worry about reallocing. Stealing the data looks like it shouldn't be a problem either, just NULL ob_item as you suggested. Regards, -tim > > robert> > >> Another note of caution: You are going to have to deal with >> iterators of >> iterators of iterators of.... I'm not sure if that actually overly >> complicates >> matters; I haven't looked at PyArray_New for some time. Enjoy! > > > This is a good point. Numpy does fine with nested lists, but what > should it do with nested generators? I originally thought that > basically 'array(generator)' should make the exact same thing as > 'array([f for f in generator])'. However, for nested generators, this > would be an object array of generators. > > I'm not sure which is better -- having more special cases for > generators that make generators, or having a simple rubric like above > for how generators are treated. > > Any thoughts? > > Zach
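A rough pure-Python sketch of the 1-d, known-dtype strategy Tim describes (the function name is made up; the real payoff would come from stealing the python array's packed buffer in C rather than copying, as is done here):

    import array
    import numpy

    def fromiter_1d(it, typecode='d'):
        # a python array is "just like a list, but on packed data"
        buf = array.array(typecode, it)
        # the C version would steal buf's memory; here we just copy
        return numpy.array(buf, dtype=typecode)

    a = fromiter_1d(x*x for x in xrange(10))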
From aisaac at american.edu Wed Apr 5 14:01:01 2006 From: aisaac at american.edu (Alan G Isaac) Date: Wed Apr 5 14:01:01 2006 Subject: [Numpy-discussion] NumPy documentation In-Reply-To: <4433EC3C.9050706@ftw.at> References: <4432E27E.6030906@ee.byu.edu><4433EC3C.9050706@ftw.at> Message-ID: On Wed, 05 Apr 2006, Ed Schofield apparently wrote: > you mention on the site that you'll print and bind > a hard-copy version once your sales reach 200 copies. > I think this would help to encourage libraries and > conservative institutions to purchase copies. Unfortunately, my library falls in this category. They were uncertain how to enforce the copyright with an electronic copy. (They are still thinking about it, last I heard.) Cheers, Alan Isaac From rahul.kanwar at gmail.com Wed Apr 5 16:25:01 2006 From: rahul.kanwar at gmail.com (Rahul Kanwar) Date: Wed Apr 5 16:25:01 2006 Subject: [Numpy-discussion] Numpy on 64 bit Xeon with ifort and mkl Message-ID: <63dec5bf0604051624k70c565baw70347a2fd571c253@mail.gmail.com> Hello, I am trying to compile Numpy on a 64 bit Xeon with ifort and the mkl libraries, running Suse 10.0 linux. I had set the MKLROOT variable to the mkl library root but it couldn't find the 64 bit library. After a little bit of snooping I found the following in numpy/distutils/cpuinfo.py ------------------------------ def _is_XEON(self): return re.match(r'.*?XEON\b', self.info[0]['model name']) is not None _is_Xeon = _is_XEON ------------------------------ I changed XEON to Xeon and it worked and was able to identify the em64t libraries. But it again got stuck with the following message.
I used the following command to build Numpy python setup.py config_fc --fcompiler=intel install ------------------------------ building 'numpy.core._dotblas' extension compiling C sources gcc options: '-pthread -fno-strict-aliasing -DNDEBUG -O2 -fmessage-length=0 -Wall -D_FORTIFY_SOURCE=2 -g -fPIC' compile options: '-Inumpy/core/blasdot -I/opt/intel/mkl/8.0.2/include -Inumpy/core/include -Ibuild/src/numpy/core -Inumpy/core/src -Inumpy/core/include -I/usr/include/python2.4 -c' gcc -pthread -shared build/temp.linux-x86_64-2.4/numpy/core/blasdot/_dotblas.o -L/opt/intel/mkl/8.0.2/lib/em64t -lmkl_em64t -lmkl -lvml -lguide -lpthread -o build/lib.linux-x86_64-2.4/numpy/core/_dotblas.so /usr/lib64/gcc/x86_64-suse-linux/4.0.2/../../../../x86_64-suse-linux/bin/ld: /opt/intel/mkl/8.0.2/lib/em64t/libmkl_em64t.a(def_cgemm_omp.o): relocation R_X86_64_PC32 against `_mkl_blas_def_cgemm_276__par_loop0' can not be used when making a shared object; recompile with -fPIC /usr/lib64/gcc/x86_64-suse-linux/4.0.2/../../../../x86_64-suse-linux/bin/ld: final link failed: Bad value collect2: ld returned 1 exit status /usr/lib64/gcc/x86_64-suse-linux/4.0.2/../../../../x86_64-suse-linux/bin/ld: /opt/intel/mkl/8.0.2/lib/em64t/libmkl_em64t.a(def_cgemm_omp.o): relocation R_X86_64_PC32 against `_mkl_blas_def_cgemm_276__par_loop0' can not be used when making a shared object; recompile with -fPIC /usr/lib64/gcc/x86_64-suse-linux/4.0.2/../../../../x86_64-suse-linux/bin/ld: final link failed: Bad value collect2: ld returned 1 exit status error: Command "gcc -pthread -shared build/temp.linux-x86_64-2.4/numpy/core/blasdot/_dotblas.o -L/opt/intel/mkl/8.0.2/lib/em64t -lmkl_em64t -lmkl -lvml -lguide -lpthread -o build/lib.linux-x86_64-2.4/numpy/core/_dotblas.so" failed with exit status 1 ---------------------------------------------- I successfully compiled it without the -lmkl_em64t flag, but when I import numpy in python it gives an error that some symbol is missing. I think that maybe if I use ifort as the linker instead of gcc then things will work out properly, but I couldn't find how to change the linker to ifort. Anyone there who can help me with this problem? regards, Rahul From robert.kern at gmail.com Wed Apr 5 17:17:04 2006 From: robert.kern at gmail.com (Robert Kern) Date: Wed Apr 5 17:17:04 2006 Subject: [Numpy-discussion] Re: Numpy on 64 bit Xeon with ifort and mkl In-Reply-To: <63dec5bf0604051624k70c565baw70347a2fd571c253@mail.gmail.com> References: <63dec5bf0604051624k70c565baw70347a2fd571c253@mail.gmail.com> Message-ID: Rahul Kanwar wrote: > I successfully compiled it without the -lmkl_em64t flag, but when I import > numpy in python it gives an error that some symbol is missing. I think > that maybe if I use ifort as the linker instead of gcc then things > will work out properly, but I couldn't find how to change the linker > to ifort. Anyone there who can help me with this problem? It's not likely that using ifort to link will help. The problem is this bit: > /opt/intel/mkl/8.0.2/lib/em64t/libmkl_em64t.a(def_cgemm_omp.o): > relocation R_X86_64_PC32 against `_mkl_blas_def_cgemm_276__par_loop0' > can not be used when making a shared object; recompile with -fPIC You are linking against static libraries which were not compiled to be "position independent;" that is, they can't be used in shared libraries, which are what Python extension modules are.
C.f.: http://en.wikipedia.org/wiki/Position_independent_code Look around in /opt/intel/; they've almost certainly provided shared library versions of the MKL that could be used. Google gives me these, for example: http://www.intel.com/support/performancetools/libraries/mkl/linux/sb/cs-017267.htm http://www.intel.com/software/products/mkl/docs/mklgs_lnx.htm#Linking_Your_Application_with_Intel_MKL -- Robert Kern robert.kern at gmail.com "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From ryanlists at gmail.com Wed Apr 5 19:50:07 2006 From: ryanlists at gmail.com (Ryan Krauss) Date: Wed Apr 5 19:50:07 2006 Subject: [Numpy-discussion] eye(N,dtype='S10') Message-ID: I am trying to create a function that can return a matrix that is either made up of complex numbers or strings depending on the input. I have created a symbolic string class to help me with that and it works well. One clumsy part is that in several cases I want to create an identity matrix and just replace a couple of elements. I currently have to do this in two steps: In [27]: mymat=numpy.eye(4,dtype='f') In [28]: mymat.astype('S10') Out[28]: array([[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.0, 1.0]], dtype=(string,10)) I create a floating point matrix in the string case rather than a complex matrix so I don't have to parse the +0.0j stuff. But what I would really like is to be able to just create either a complex matrix or a string matrix at the beginning. But trying numpy.eye(4,dtype='S10') produces array([[True, False, False, False], [False, True, False, False], [False, False, True, False], [False, False, False, True]], dtype=(string,10)) rather than array([[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.0, 1.0]], dtype=(string,10)) I need 1's and 0's rather than True and False because when I am done, I put the string representation into an input script to Maxima and Maxima wouldn't handle the True and False values well. Is there a way to directly create an identity string matrix with '1' and '0' instead of True and False? Thanks, Ryan
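A workaround sketch for Ryan's question (untested against the numpy of this era; numpy.where and the astype chain are assumed to behave as documented): go through an integer representation so the string conversion yields '1' and '0' rather than 'True' and 'False'.

    import numpy
    # route 1: float identity -> integer -> length-10 strings
    mymat = numpy.eye(4).astype(numpy.int32).astype('S10')
    # route 2: build the strings directly from the boolean pattern
    mymat = numpy.where(numpy.eye(4) != 0, '1', '0')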
From arnd.baecker at web.de Wed Apr 5 23:51:03 2006 From: arnd.baecker at web.de (Arnd Baecker) Date: Wed Apr 5 23:51:03 2006 Subject: [Numpy-discussion] Converting from Numeric (was: Speed up function on cross product of two sets?) In-Reply-To: References: <44315633.4010600@cox.net> Message-ID: Moin Moin, On Wed, 5 Apr 2006, Pearu Peterson wrote: > On Wed, 5 Apr 2006, Arnd Baecker wrote: > > > BTW, it seems that we have no Numeric to numpy transition remarks in > > www.scipy.org. I only found > > http://www.scipy.org/PearuPeterson/NumpyVersusNumeric > > and of course Travis' "Guide to NumPy" contains a detailed list of > > necessary changes in chapter 2.6.1. > > In addition ``site-packages/numpy/lib/convertcode.py`` provides an > > automatic conversion. > > > > Would it be helpful to start a new wiki page "ConvertingFromNumeric" > > (similar to http://www.scipy.org/Converting_from_numarray) > > which aims at summarizing the necessary changes > > or expand Pearu's page (if he agrees) on this? > > It's better to start a new wiki page similar to Converting_from_numarray > (I like the table). Based on the above links I have set up a first draft at http://www.scipy.org/Converting_from_Numeric It is surely not complete and there are a couple of things which have to be checked for correctness (I tried out some, but not all ...). Also some remarks on using the new features of numpy (e.g., use array indexing instead of take and put...) might be useful. > Btw, I have a few notes about the necessary changes for > Numeric->numpy transition in the following page: > > http://svn.enthought.com/enthought/wiki/NumpyPort#NotesonchangesduetoreplacingNumeric/scipy_basewithnumpy > > Feel free to grab these notes. Great - thanks, I tried to incorporate them as well. Best, Arnd From cimrman3 at ntc.zcu.cz Thu Apr 6 01:48:05 2006 From: cimrman3 at ntc.zcu.cz (Robert Cimrman) Date: Thu Apr 6 01:48:05 2006 Subject: [Numpy-discussion] NumPy documentation In-Reply-To: <4432E27E.6030906@ee.byu.edu> References: <4432E27E.6030906@ee.byu.edu> Message-ID: <4434D58D.2010505@ntc.zcu.cz> Travis Oliphant wrote: > > I received a rather hurtful email today that was very discouraging to me > ... Coming late on line, I can just +1 to all the support and appreciation you have received so far! r. From oliphant.travis at ieee.org Thu Apr 6 01:54:01 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Thu Apr 6 01:54:01 2006 Subject: [Numpy-discussion] Speed up function on cross product of two sets? In-Reply-To: References: <44315633.4010600@cox.net> Message-ID: <4434D6DF.2020306@ieee.org> Arnd Baecker wrote: > BTW, it seems that we have no Numeric to numpy transition remarks in > www.scipy.org. I only found > http://www.scipy.org/PearuPeterson/NumpyVersusNumeric > and of course Travis' "Guide to NumPy" contains a detailed list of > necessary changes in chapter 2.6.1. > For clarification: this is in the sample chapter available on-line to all.... > In addition ``site-packages/numpy/lib/convertcode.py`` provides an > automatic conversion. > > Would it be helpful to start a new wiki page "ConvertingFromNumeric" > (similar to http://www.scipy.org/Converting_from_numarray) > which aims at summarizing the necessary changes > or expand Pearu's page (if he agrees) on this? > Absolutely. I did the Numarray page because I'd written a lot on Converting from Numeric (even providing convertcode.py) but very little for numarray --- except the ndimage conversion. So, I started the Numarray page. Sounds like a great idea to have a dual page. -Travis From oliphant.travis at ieee.org Thu Apr 6 02:21:02 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Thu Apr 6 02:21:02 2006 Subject: [Numpy-discussion] array constructor from generators? In-Reply-To: References: <44331200.2020604@cox.net> Message-ID: <4434DD42.8010205@ieee.org> > Can you steal its memory and then give it some dummy memory that it > can free without problems, so that the list can be deallocated without > trouble? Does anyone know if you can just give the list a NULL pointer > for its memory and then immediately decref it? free(NULL) should > always be safe, I think. (??) > I don't think you can steal a list's memory since each list element is actually a pointer to some other Python object. However, a Python array's memory could be stolen (as Tim mentions later). > This is a good point. Numpy does fine with nested lists, but what > should it do with nested generators? I originally thought that > basically 'array(generator)' should make the exact same thing as > 'array([f for f in generator])'.
However, for nested generators, this > would be an object array of generators. > > I'm not sure which is better -- having more special cases for > generators that make generators, or having a simple rubric like above > for how generators are treated. I like the idea that generators of generators act the same as lists of lists (i.e. recursively defined). Basically to implement this, we need to repeat Array_FromSequence discover_depth discover_dimensions discover_itemsize Or, just maybe we can figure out a way to enhance those functions so that creating an array from generators works the same as creating an array from sequences. Right now, the sequence interface is used. Perhaps we could figure out a way to use a more abstract interface which would include both generators and sequences. If that causes too much alteration then I don't think it's worth it and we just repeat those functions for generators. Now, I think there are two cases here that are being discussed as one 1) Creating arrays from iterators --- array(iter(xrange(10))) 2) Creating arrays from generators --- array(x for x in xrange(10)) Both of these cases really ought to be handled and really should be integrated into the Array_FromSequence code. That code is inherited from Numeric and was written before iterators and generators arose on the scene. There ought to be a way to unify all of these notions (Actually if you handle iterators, then sequences will come along for the ride since sequences can behave as iterators). I'd rather see one place in the code that handles these cases. But, working code is usually better than dreamy plans :-) -Travis From oliphant.travis at ieee.org Thu Apr 6 02:38:04 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Thu Apr 6 02:38:04 2006 Subject: [Numpy-discussion] A random.normal function with stdev as array In-Reply-To: <20060405155736.GA9364@localhost.localdomain> References: <4433C6D6.5080800@obs.univ-lyon1.fr> <20060405155736.GA9364@localhost.localdomain> Message-ID: <4434E13B.4000702@ieee.org> John Byrnes wrote: > Hi Eric, > > In the past, I've done things like > > ###### > normdist = lambda x: numpy.random.normal(0,x) > vecnormal = numpy.vectorize(normdist) > > stdev = numpy.array([1.1,1.2,1.0,2.2]) > result = vecnormal(stdev) > > ###### > > This works fine for up to 10k elements for stdev for some reason. > Any larger than that and I get a Bus error on my PPC mac and a segfault on > my x86 linux box. > > This needs to be tracked down. It looks like some kind of error is not being caught correctly. You should not get a segfault. Could you provide a stack-trace when the problem occurs? One issue is that vectorize is using object arrays under the covers, which is consuming roughly 2x the memory you may expect. An object array is created and the function is called for every element. This object array is then converted to a number type after the fact. The segfault should be tracked down in any case. -Travis
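A vectorize-free sketch of John's computation (not from the thread; it relies only on broadcasting and the identity N(0, s^2) = s * N(0, 1)):

    import numpy
    stdev = numpy.array([1.1, 1.2, 1.0, 2.2])
    # draw unit normals of the right shape, then scale elementwise
    result = numpy.random.standard_normal(stdev.shape) * stdev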
From pau.gargallo at gmail.com Thu Apr 6 02:44:03 2006 From: pau.gargallo at gmail.com (Pau Gargallo) Date: Thu Apr 6 02:44:03 2006 Subject: [Numpy-discussion] NumPy documentation In-Reply-To: References: <4432E27E.6030906@ee.byu.edu> <4432E973.8070601@noaa.gov> <4432F4DD.6060000@cox.net> <1A94CA82-2EB5-4145-9EA9-453DE60AE684@stanford.edu> <4433F1F6.4010603@noaa.gov> Message-ID: <6ef8f3380604060243u2f54efc3r2baba94688c5d0af@mail.gmail.com> On 4/5/06, Perry Greenfield wrote: > > On Apr 5, 2006, at 12:36 PM, Christopher Barker wrote: > > > Zachary Pincus wrote: > >> from Numeric (who was used to the large, free manual) > > > > Which brings up a question: Is the source to the old Numeric manual > > available? it would be nice to "port" it to SciPy. > > Sort of. The original source was in Framemaker format. It was converted > to the Python latex framework in the process of being adapted to > numarray. The source for that is available on the numarray repository. > If you want the framemaker source, I may be able to dig that up > somewhere (or I may have lost track of it :-). Paul Dubois can likely > provide it as well; that's who gave me the source. > > Perry > +1 to any support for Travis Oliphant. Your work is really helping us. I am quite ignorant about licences and copyright things, so I would like to know: 1.- Is it OK to just copy the old Numeric documentation to the wiki and use it as a starting point for a more complete and updated doc? 2.- Would that be fine for the authors? I guess it will be very useful to everyone (especially beginners) to have an extended version of this documentation where there are many examples of use for every function. The wiki seems a very efficient way to build such a thing. It will take some time to manually copy-paste everything to the wiki, but it is doable. What do you think? pau From oliphant.travis at ieee.org Thu Apr 6 02:46:02 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Thu Apr 6 02:46:02 2006 Subject: [Numpy-discussion] Re: weird interaction: pickle, numpy, matplotlib.hist In-Reply-To: References: <4433DF85.7030109@gmail.com> Message-ID: <4434E31B.5030306@ieee.org> Robert Kern wrote: > You have a byteorder issue. Your Linux box, which I presume has an Intel or AMD > CPU, is little-endian, while your OS X box, which I presume has a PPC CPU, is > big-endian. numpy arrays can store their data in either endianness on either > kind of platform; their dtype objects tell you which byteorder they are using. > > In [54]: c.sort() > > In [55]: c > Out[55]: array([ 0., 2., 3., 4., 5., 6., 7., 8., 9., 10., 1.]) > > > This is a bug. > > http://projects.scipy.org/scipy/numpy/ticket/47 > Good catch. This bug was due to an oversight when adding the new sorting functions. The case of byte-swapped data was not handled. Judicious use of copyswap on the buffer fixed it. But, this brings up the point that currently the pickled raw data which is read in as a string by Python is used as the memory for the new array (i.e. the string memory is "stolen"). This should work. The fact that it didn't with sort was a bug that is now fixed in SVN. However, operations on out-of-byte-order arrays will always be slower. Thus, perhaps on pickle read the data should be copied to native byte-order if necessary. Opinions? -Travis
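What the proposed normalization could look like at the Python level (a sketch only -- the real change would live in the C unpickling path; dtype.isnative and dtype.newbyteorder are standard numpy API):

    import numpy

    def to_native(a):
        # copy to native byte order only when necessary
        if a.dtype.isnative:
            return a
        return a.astype(a.dtype.newbyteorder('='))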
From benjamin at decideur.info Thu Apr 6 03:23:09 2006 From: benjamin at decideur.info (Benjamin Thyreau) Date: Thu Apr 6 03:23:09 2006 Subject: [Numpy-discussion] Recarray and shared datas Message-ID: <200604061020.k36AKIsQ018238@decideur.info> Hi, Numpy has a nice recarray feature, i.e. records which can hold column names. I'd like to use such a feature in order to better interact with R, i.e. passing R data to Python without a copy. The current rpy bindings do a full copy, and convert to a plain ndarray. Looking at the recarray api in the Guide, and also at the source code, I don't find any recarray constructor which can take shared data (all the examples from section 8.6 are doing copies). Is there some way to do it, in Python or in C? Or are there any plans to? Thanks for the info -- Benjamin Thyreau CEA/SHFJ Orsay From oliphant.travis at ieee.org Thu Apr 6 03:40:05 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Thu Apr 6 03:40:05 2006 Subject: [Numpy-discussion] Newbie indexing question and print order In-Reply-To: <44338DF4.7050603@gmail.com> References: <44338DF4.7050603@gmail.com> Message-ID: <4434E522.3060101@ieee.org> amcmorl wrote: > Hi all, > > I'm having a bit of trouble getting my head around numpy's indexing > capabilities. A quick summary of the problem is that I want to > lookup/index in nD from a second array of rank n+1, such that the last > (or first, I guess) dimension contains the lookup co-ordinates for the > value to extract from the first array. Here's a 2D (3,3) example: > > In [12]:print ar > [[ 0.15 0.75 0.2 ] > [ 0.82 0.5 0.77] > [ 0.21 0.91 0.59]] > > In [24]:print inds > [[[1 1] > [1 1] > [2 1]] > > [[2 2] > [0 0] > [1 0]] > > [[1 1] > [0 0] > [2 1]]] > > then somehow return the array (barring me making any row/column errors): > In [26]: c = ar.somefancyindexingroutinehere(inds) > You can do this with "fancy-indexing". Obviously it is going to take some time for people to get used to this idea as none of the responses yet suggest it. But the following works. c = ar[inds[...,0],inds[...,1]] gives the desired effect. Thus, your simple description c[x,y] = ar[inds[x,y,0],inds[x,y,1]] is a text-book description of what fancy-indexing does. Best regards, -Travis > In [26]:print c > [[ 0.5 0.5 0.91] > [ 0.59 0.15 0.82] > [ 0.5 0.15 0.91]] > > i.e. c[x,y] = a[ inds[x,y,0], inds[x,y,1] ] > > Any suggestions? It looks like it should be relatively simple using > 'put' or 'take' or 'fetch' or 'sit' or something like that, but I'm not > getting it. > > While I'm here, can someone help me understand the rationale behind > 'print' printing row, column (i.e. a[0,1] = 0.75 in the above example) > rather than x, y (=column, row; in which case 0.75 would be in the first > column and second row), which seems to me to be more intuitive. > > I'm really enjoying getting into numpy - I can see it'll be > simpler/faster coding than my previous environments, despite me not > knowing my way at the moment, and that python has better opportunities > for extensibility. So, many thanks for your great work.
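For concreteness, the recipe applied to the example arrays from the original post (an illustrative session, not part of the thread):

    >>> import numpy
    >>> ar = numpy.array([[0.15, 0.75, 0.2],
    ...                   [0.82, 0.5, 0.77],
    ...                   [0.21, 0.91, 0.59]])
    >>> inds = numpy.array([[[1, 1], [1, 1], [2, 1]],
    ...                     [[2, 2], [0, 0], [1, 0]],
    ...                     [[1, 1], [0, 0], [2, 1]]])
    >>> ar[inds[..., 0], inds[..., 1]]
    array([[ 0.5 ,  0.5 ,  0.91],
           [ 0.59,  0.15,  0.82],
           [ 0.5 ,  0.15,  0.91]])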
From faltet at carabos.com Thu Apr 6 03:44:02 2006 From: faltet at carabos.com (Francesc Altet) Date: Thu Apr 6 03:44:02 2006 Subject: [Numpy-discussion] Re: weird interaction: pickle, numpy, matplotlib.hist In-Reply-To: <4434E31B.5030306@ieee.org> References: <4433DF85.7030109@gmail.com> <4434E31B.5030306@ieee.org> Message-ID: <200604061243.48122.faltet@carabos.com> On Thursday 06 April 2006 11:44, Travis Oliphant wrote: > But, this brings up the point that currently the pickled raw data which > is read in as a string by Python is used as the memory for the new array > (i.e. the string memory is "stolen"). This should work. The fact > that it didn't with sort was a bug that is now fixed in SVN. However, > operations on out-of-byte-order arrays will always be slower. Thus, > perhaps on pickle read the data should be copied to native byte-order if > necessary. Yes, I think that converting directly to native byteorder at unpickling time would be the best. Cheers! -- >0,0< Francesc Altet http://www.carabos.com/ V V Cárabos Coop. V. "Enjoy Data" "-" From a.u.r.e.l.i.a.n at gmx.net Thu Apr 6 04:16:11 2006 From: a.u.r.e.l.i.a.n at gmx.net (Johannes Loehnert) Date: Thu Apr 6 04:16:11 2006 Subject: [Numpy-discussion] Re: weird interaction: pickle, numpy, matplotlib.hist In-Reply-To: <200604061243.48122.faltet@carabos.com> References: <4433DF85.7030109@gmail.com> <4434E31B.5030306@ieee.org> <200604061243.48122.faltet@carabos.com> Message-ID: <200604061315.23340.a.u.r.e.l.i.a.n@gmx.net> Hi, > > But, this brings up the point that currently the pickled raw data which > > is read in as a string by Python is used as the memory for the new array > > (i.e. the string memory is "stolen"). This should work. The fact > > that it didn't with sort was a bug that is now fixed in SVN. However, > > operations on out-of-byte-order arrays will always be slower. Thus, > > perhaps on pickle read the data should be copied to native byte-order if > > necessary. > > Yes, I think that converting directly to native byteorder at > unpickling time would be the best. If you stored your data in the wrong byte order for some odd reason (maybe you use a library that requires a certain byte order), then you would want pickle to deliver the data back exactly as stored. I think this should be made a user option in some way, although I do not know a good place for it right now. Johannes From cimrman3 at ntc.zcu.cz Thu Apr 6 05:16:07 2006 From: cimrman3 at ntc.zcu.cz (Robert Cimrman) Date: Thu Apr 6 05:16:07 2006 Subject: [Numpy-discussion] site.cfg.example In-Reply-To: <4435020B.9040705@iam.uni-stuttgart.de> References: <44280161.4030708@ntc.zcu.cz> <442808AF.6090006@ftw.at> <44280C20.8000003@ntc.zcu.cz> <44297152.9000305@ftw.at> <442A698C.9000104@ntc.zcu.cz> <442A7E78.1030901@ftw.at> <442A86D2.20902@ntc.zcu.cz> <442A9A67.8050106@ftw.at> <442A9F8D.906@ntc.zcu.cz> <443253D4.90806@iam.uni-stuttgart.de> <4434D699.5030102@ntc.zcu.cz> <4434D8D3.7050200@iam.uni-stuttgart.de> <4434FC6B.3000905@ntc.zcu.cz> <4435020B.9040705@iam.uni-stuttgart.de> Message-ID: <44350672.4020008@ntc.zcu.cz> I have added numpy/site.cfg.example to the SVN. It should contain a list of all possible sections and relevant fields, so that a (new) user sees what can be configured and then just copies the file to numpy/site.cfg, removes the unwanted sections and edits the wanted ones. If you think it is a good idea and have a section that is not present or properly described, contribute it, please :-) When/if the file grows, we can put it on the Wiki. cheers, r.
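For readers who have not seen such a file, the entries follow numpy.distutils conventions and might look roughly like this (the section and field names below are standard, but the paths are made up):

    [atlas]
    library_dirs = /usr/local/lib/atlas
    atlas_libs = lapack, f77blas, cblas, atlas

    [fftw]
    libraries = fftw3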
From tim.hochberg at cox.net Thu Apr 6 08:39:00 2006 From: tim.hochberg at cox.net (Tim Hochberg) Date: Thu Apr 6 08:39:00 2006 Subject: [Numpy-discussion] Re: weird interaction: pickle, numpy, matplotlib.hist In-Reply-To: <200604061315.23340.a.u.r.e.l.i.a.n@gmx.net> References: <4433DF85.7030109@gmail.com> <4434E31B.5030306@ieee.org> <200604061243.48122.faltet@carabos.com> <200604061315.23340.a.u.r.e.l.i.a.n@gmx.net> Message-ID: <44353646.6010009@cox.net> Johannes Loehnert wrote: >Hi, > > >>>But, this brings up the point that currently the pickled raw data which >>>is read in as a string by Python is used as the memory for the new array >>>(i.e. the string memory is "stolen"). This should work. The fact >>>that it didn't with sort was a bug that is now fixed in SVN. However, >>>operations on out-of-byte-order arrays will always be slower. Thus, >>>perhaps on pickle read the data should be copied to native byte-order if >>>necessary. >>> >>> >>Yes, I think that converting directly to native byteorder at >>unpickling time would be the best. >> >> > >If you stored your data in the wrong byte order for some odd reason (maybe you use >a library that requires a certain byte order), then you would want pickle to >deliver the data back exactly as stored. I think this should be made a user >option in some way, although I do not know a good place for it right now. > > If this is really something we want to do, it seems that the "correct" solution is to have a different dtype when an object defaults to a given byte order than when it is forced to that byte order. Pickle could keep track of that and do the right thing on loading. For example, an array whose byte order was set explicitly could carry a different dtype than one that merely defaults to the machine's order. From tim.hochberg at cox.net Thu Apr 6 2006 From: tim.hochberg at cox.net (Tim Hochberg) Subject: [Numpy-discussion] Re: array constructor from generators? In-Reply-To: <4434DD42.8010205@ieee.org> References: <44331200.2020604@cox.net> <4434DD42.8010205@ieee.org> Message-ID: <44353880.2040406@cox.net> Travis Oliphant wrote: > >> Can you steal its memory and then give it some dummy memory that it >> can free without problems, so that the list can be deallocated >> without trouble? Does anyone know if you can just give the list a >> NULL pointer for its memory and then immediately decref it? >> free(NULL) should always be safe, I think. (??) >> > I don't think you can steal a list's memory since each list element is > actually a pointer to some other Python object. > However, a Python array's memory could be stolen (as Tim mentions later). > >> This is a good point. Numpy does fine with nested lists, but what >> should it do with nested generators? I originally thought that >> basically 'array(generator)' should make the exact same thing as >> 'array([f for f in generator])'. However, for nested generators, this >> would be an object array of generators. >> >> I'm not sure which is better -- having more special cases for >> generators that make generators, or having a simple rubric like above >> for how generators are treated. > > I like the idea that generators of generators act the same as lists > of lists (i.e. recursively defined). Basically to implement this, we > need to repeat > > Array_FromSequence > discover_depth > discover_dimensions > discover_itemsize > > Or, just maybe we can figure out a way to enhance those functions so > that creating an array from generators works the same as creating an > array from sequences. Right now, the sequence interface is used. > Perhaps we could figure out a way to use a more abstract interface > which would include both generators and sequences. If that causes too > much alteration then I don't think it's worth it and we just repeat > those functions for generators.
> Now, I think there are two cases here that are being discussed as one > > 1) Creating arrays from iterators --- array(iter(xrange(10))) > 2) Creating arrays from generators --- array(x for x in xrange(10)) > > Both of these cases really ought to be handled and really should be > integrated into the Array_FromSequence code. That code is inherited > from Numeric and was written before iterators and generators arose on > the scene. There ought to be a way to unify all of these notions > (Actually if you handle iterators, then sequences will come along for > the ride since sequences can behave as iterators). > I'd rather see one place in the code that handles these cases. But, > working code is usually better than dreamy plans :-) I agree with all of this. However, there's one specific case that I think we should optimize the heck out of. In fact, I'd be tempted as a first cut to only implement this case and raise exceptions in the other cases until we get around to implementing them. This one case is: * dtype known * 1-dimensional I care about this case because it's common and we can do it efficiently. In the other cases I could write a python function that does almost as good a job as we're likely to do in C, both in terms of speed and memory usage. So the known dtype, 1D case adds important functionality while the other "merely" adds convenience (and consistency). Those are good, but personally the added functionality is higher on my priority list. -tim From byrnes at bu.edu Thu Apr 6 09:15:25 2006 From: byrnes at bu.edu (John Byrnes) Date: Thu Apr 6 09:15:25 2006 Subject: [Numpy-discussion] A random.normal function with stdev as array In-Reply-To: <4434E13B.4000702@ieee.org> References: <4433C6D6.5080800@obs.univ-lyon1.fr> <20060405155736.GA9364@localhost.localdomain> <4434E13B.4000702@ieee.org> Message-ID: <20060406161450.GA18606@localhost.localdomain> On Thu, Apr 06, 2006 at 03:36:59AM -0600, Travis Oliphant wrote: > John Byrnes wrote: > >Hi Eric, > > > >In the past, I've done things like > > > >###### > >normdist = lambda x: numpy.random.normal(0,x) > >vecnormal = numpy.vectorize(normdist) > > > >stdev = numpy.array([1.1,1.2,1.0,2.2]) > >result = vecnormal(stdev) > > > >###### > > > >This works fine for up to 10k elements for stdev for some reason. > >Any larger than that and I get a Bus error on my PPC mac and a segfault on > >my x86 linux box. > > > > > > This needs to be tracked down. It looks like some kind of error is not > being caught correctly. You should not get a segfault. Could you > provide a stack-trace when the problem occurs? > > One issue is that vectorize is using object arrays under the covers, > which is consuming roughly 2x the memory you may expect. An > object array is created and the function is called for every element. > This object array is then converted to a number type after the fact. > > The segfault should be tracked down in any case. > > -Travis > > > Hi Travis, Here is a backtrace from gdb on my mac. John #0 0x00470b88 in log1pl () #1 0x00000000 in ?? 
() Cannot access memory at address 0x0 Cannot access memory at address 0x0 #2 0x004708ec in log1pl () #3 0x1000c348 in PyObject_Call (func=0x4, arg=0x4, kw=0x15fb) at /Users/bob/src/Python-2.4.1/Objects/abstract.c:1751 #4 0x1007ce34 in ext_do_call (func=0x1, pp_stack=0xbfffed90, flags=211904, na=8656012, nk=1194304) at /Users/bob/src/Python-2.4.1/Python/ceval.c:3824 #5 0x1007a230 in PyEval_EvalFrame (f=0x848410) at /Users/bob/src/Python-2.4.1/Python/ceval.c:2203 #6 0x1007b284 in PyEval_EvalCodeEx (co=0x2, globals=0x4, locals=0x1, args=0x3, argcount=1049072, kws=0x841150, kwcount=1, defs=0x8411fc, defcount=0, closure=0x0) at /Users/bob/src/Python-2.4.1/Python/ceval.c:2730 #7 0x10026274 in function_call (func=0x880bb0, arg=0x1001f0, kw=0x848410) at /Users/bob/src/Python-2.4.1/Objects/funcobject.c:548 #8 0x1000c348 in PyObject_Call (func=0x4, arg=0x4, kw=0x15fb) at /Users/bob/src/Python-2.4.1/Objects/abstract.c:1751 #9 0x10015a88 in instancemethod_call (func=0x52eef0, arg=0x54a170, kw=0x0) at /Users/bob/src/Python-2.4.1/Objects/classobject.c:2431 #10 0x1000c348 in PyObject_Call (func=0x4, arg=0x4, kw=0x15fb) at /Users/bob/src/Python-2.4.1/Objects/abstract.c:1751 #11 0x10059358 in slot_tp_call (self=0x53e4f0, args=0x5b310, kwds=0x0) at /Users/bob/src/Python-2.4.1/Objects/typeobject.c:4526 #12 0x1000c348 in PyObject_Call (func=0x4, arg=0x4, kw=0x15fb) at /Users/bob/src/Python-2.4.1/Objects/abstract.c:1751 #13 0x1007c9e4 in do_call (func=0x53e4f0, pp_stack=0x53e4f0, na=0, nk=8655844) at /Users/bob/src/Python-2.4.1/Python/ceval.c:3755 #14 0x1007c6dc in call_function (pp_stack=0x0, oparg=4) at /Users/bob/src/Python-2.4.1/Python/ceval.c:3570 #15 0x1007a140 in PyEval_EvalFrame (f=0x10e200) at /Users/bob/src/Python-2.4.1/Python/ceval.c:2163 #16 0x1007c83c in fast_function (func=0x4, pp_stack=0x10e360, n=268927488, na=268755664, nk=1) at /Users/bob/src/Python-2.4.1/Python/ceval.c:3629 #17 0x1007c6c4 in call_function (pp_stack=0xbffff5bc, oparg=4) at /Users/bob/src/Python-2.4.1/Python/ceval.c:3568 #18 0x1007a140 in PyEval_EvalFrame (f=0x10e030) at /Users/bob/src/Python-2.4.1/Python/ceval.c:2163 #19 0x1007b284 in PyEval_EvalCodeEx (co=0x0, globals=0x4, locals=0x1, args=0x10078200, argcount=1049072, kws=0x841150, kwcount=1, defs=0x8411fc, defcount=0, closure=0x0) at /Users/bob/src/Python-2.4.1/Python/ceval.c:2730 #20 0x1007e678 in PyEval_EvalCode (co=0x4, globals=0x4, locals=0x15fb) at /Users/bob/src/Python-2.4.1/Python/ceval.c:484 #21 0x100b2ee0 in run_node (n=0x10078200, filename=0x4
, globals=0x0, locals=0x10e180, flags=0x2) at /Users/bob/src/Python-2.4.1/Python/pythonrun.c:1265 #22 0x100b23b0 in PyRun_InteractiveOneFlags (fp=0x54a1a5, filename=0x56ca0 "", flags=0x10e030) at /Users/bob/src/Python-2.4.1/Python/pythonrun.c:762 #23 0x100b2190 in PyRun_InteractiveLoopFlags (fp=0x56b94, filename=0xd440 "", flags=0x100f21b8) at /Users/bob/src/Python-2.4.1/Python/pythonrun.c:695 #24 0x100b3bb0 in PyRun_AnyFileExFlags (fp=0xa0001554, filename=0x100f36ac "", closeit=0, flags=0xbffff934) at /Users/bob/src/Python-2.4.1/Python/pythonrun.c:658 #25 0x100bf640 in Py_Main (argc=269413412, argv=0x20000000) at /Users/bob/src/Python-2.4.1/Modules/main.c:484 #26 0x000018d0 in start () #27 0x8fe1a278 in __dyld__dyld_start () From ndarray at mac.com Thu Apr 6 12:42:17 2006 From: ndarray at mac.com (Sasha) Date: Thu Apr 6 12:42:17 2006 Subject: [Numpy-discussion] What is diagonal for nd>2? Message-ID: It looks like the definition of the diagonal changed somewhere between Numeric 24.0 and numpy: In Numeric: >>> x = Numeric.arange(2*4*4) >>> x = Numeric.reshape(x, (2, 4, 4)) >>> x array([[[ 0, 1, 2, 3], [ 4, 5, 6, 7], [ 8, 9, 10, 11], [12, 13, 14, 15]], [[16, 17, 18, 19], [20, 21, 22, 23], [24, 25, 26, 27], [28, 29, 30, 31]]]) >>> Numeric.diagonal(x) array([[ 0, 5, 10, 15], [16, 21, 26, 31]]) But in numpy: >>> import numpy as Numeric >>> x = Numeric.arange(2*4*4) >>> x = Numeric.reshape(x, (2, 4, 4)) >>> Numeric.diagonal(x) array([[ 0, 20], [ 1, 21], [ 2, 22], [ 3, 23]]) The old logic seems to be clear: x is a pair of matrices and diagonal returns a pair of diagonals, but the new logic seems unclear: the diagonal returns the first rows of the two matrices, transposed. Does anyone know when this change was introduced and why? From pgmdevlist at mailcan.com Thu Apr 6 13:51:04 2006 From: pgmdevlist at mailcan.com (Pierre GM) Date: Thu Apr 6 13:51:04 2006 Subject: [Numpy-discussion] What is diagonal for nd>2? In-Reply-To: References: Message-ID: <200604061652.30764.pgmdevlist@mailcan.com> > Does anyone know when this change was introduced and why? Isn't it more a problem of default values? By default, x.diagonal() == x.diagonal(0,0,1) x.diagonal() array([[ 0, 20], [ 1, 21], [ 2, 22], [ 3, 23]]) If you want the paired diagonal: x.diagonal(0,1,-1) array([[ 0, 5, 10, 15], [16, 21, 26, 31]]) From ndarray at mac.com Thu Apr 6 14:46:10 2006 From: ndarray at mac.com (Sasha) Date: Thu Apr 6 14:46:10 2006 Subject: [Numpy-discussion] What is diagonal for nd>2? In-Reply-To: <200604061652.30764.pgmdevlist@mailcan.com> References: <200604061652.30764.pgmdevlist@mailcan.com> Message-ID: I see. However, something needs to be changed. In the current version help(diagonal) prints the following: {{{ Help on function diagonal in module numpy.core.oldnumeric: diagonal(a, offset=0, axis1=0, axis2=1) diagonal(a, offset=0, axis1=0, axis2=1) returns the given diagonals defined by the last two dimensions of the array. }}} I would think axes 0 and 1 are the first, not the last two dimensions. We can either change the documentation or change the defaults in oldnumeric. I would vote for the change in defaults because oldnumeric is a compatibility module and should not introduce changes. In addition, the fact that the reduced axes become the first (rather than the last or one of the axis1 and axis2) dimension should be spelled out in the docstring. On 4/6/06, Pierre GM wrote: > > > Does anyone know when this change was introduced and why?
> Isn't it more a problem of default values? > By default, x.diagonal() == x.diagonal(0,0,1) > > x.diagonal() > array([[ 0, 20], > [ 1, 21], > [ 2, 22], > [ 3, 23]]) > > If you want the paired diagonal: > x.diagonal(0,1,-1) > array([[ 0, 5, 10, 15], > [16, 21, 26, 31]]) > From Chris.Barker at noaa.gov Thu Apr 6 14:59:03 2006 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Thu Apr 6 14:59:03 2006 Subject: [Numpy-discussion] Re: weird interaction: pickle, numpy, matplotlib.hist In-Reply-To: <4434E31B.5030306@ieee.org> References: <4433DF85.7030109@gmail.com> <4434E31B.5030306@ieee.org> Message-ID: <44358EEA.4080609@noaa.gov> Travis Oliphant wrote: > Thus, perhaps on pickle read the data should be copied to native byte-order if > necessary. +1 Those that are working with non-native byte order on purpose presumably know what they are doing, and can check and swap as necessary -- or use tofile and fromfile, which I presume don't do any byteswapping for you. -Chris -- Christopher Barker, Ph.D. Oceanographer NOAA/OR&R/HAZMAT (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From pgmdevlist at mailcan.com Thu Apr 6 15:01:03 2006 From: pgmdevlist at mailcan.com (Pierre GM) Date: Thu Apr 6 15:01:03 2006 Subject: [Numpy-discussion] What is diagonal for nd>2? In-Reply-To: References: <200604061652.30764.pgmdevlist@mailcan.com> Message-ID: <200604061802.20457.pgmdevlist@mailcan.com> > I would think axes 0 and 1 are the first, not the last two dimensions. We > can either change the documentation or change the defaults in > oldnumeric. I would vote for the change in defaults because oldnumeric is > a compatibility module and should not introduce changes. So, change the default to: diagonal(a, offset=0, axis1=-2, axis2=-1) ? That'd make sense, I'm for that... From ndarray at mac.com Thu Apr 6 16:11:01 2006 From: ndarray at mac.com (Sasha) Date: Thu Apr 6 16:11:01 2006 Subject: [Numpy-discussion] New patch for MA In-Reply-To: <200603280427.52789.pgmdevlist@mailcan.com> References: <200603280427.52789.pgmdevlist@mailcan.com> Message-ID: I have applied the patch with minor modifications. See <http://projects.scipy.org/scipy/numpy/changeset/2331>. Here are a few suggestions for posting patches. 1. If you are using svn, please post the output of "svn diff" in the project root directory (the directory that *contains* "numpy", not the "numpy" directory). 2. If appropriate, add unit tests to an existing file instead of creating a new one. (In the case of ma, the correct file is test_ma.py.) 3. If you follow recommendation #1, this will happen automatically; if you cannot use svn for some reason, concatenate the output of diff for code and tests in the same patch file. Here are some topics for discussion. 1. I've initially implemented some ma array methods by wrapping existing module level functions. I am not sure this is the best approach to implement new methods. It is probably cleaner to implement them as methods and provide wrappers at the module level similar to oldnumeric. 2. I am not sure cumprod and cumsum should fill masked elements with 1 and 0. I would think the result should be masked if any prior element along the axis being accumulated is masked. To ignore masked elements, filled can be called explicitly before cum[prod|sum].
One of the problems with filling by default is that 1 or 0 are not appropriate values for object arrays (for example, "" is an appropriate fill value for cumsum of an array of strings). On 3/28/06, Pierre GM wrote: > > Folks, > You can find a new patch for MA on the wiki > > http://projects.scipy.org/scipy/numpy/attachment/wiki/MaskedArray/ma-200603280900.patch > along with a test suite. > The 'clip' method should now work with array arguments. Also added were > cumsum, cumprod, std, var and squeeze. > I'll deal with flags, setflags, setfield, dump and others when I have a > better idea of how it works -- which probably won't happen anytime soon, > as I > don't really have time to dig into the code for these functions. AAMOF, I'm > more interested in checking/patching some other aspects of numpy for MA > (eg, > mlab...) > Once again, please send me your comments and suggestions. > Thx for everything > P. From michael.sorich at gmail.com Thu Apr 6 17:41:19 2006 From: michael.sorich at gmail.com (Michael Sorich) Date: Thu Apr 6 17:41:19 2006 Subject: [Numpy-discussion] New patch for MA In-Reply-To: References: <200603280427.52789.pgmdevlist@mailcan.com> Message-ID: <16761e100604061733r586cca6cr94d72c554b54fdd0@mail.gmail.com> On 4/7/06, Sasha wrote: > > > 2. I am not sure cumprod and cumsum should fill masked elements with 1 and > 0. I would think the result should be masked if any prior element along the > axis being accumulated is masked. To ignore masked elements, filled can be > called explicitly before cum[prod|sum]. One of the problems with filling by > default is that 1 or 0 are not appropriate values for object arrays (for > example, "" is an appropriate fill value for cumsum of an array of strings). > > There are often a number of options for how masked values can be dealt with. In general (not just with cum*), I would prefer for the result to be masked when masked values are involved unless I explicitly indicate what should be done with the masked values. Otherwise it is too easy to forget that some default manipulation of masked values has been applied. In R there is commonly an na.action or na.rm parameter to functions. Michael From pgmdevlist at mailcan.com Thu Apr 6 19:19:02 2006 From: pgmdevlist at mailcan.com (Pierre GM) Date: Thu Apr 6 19:19:02 2006 Subject: [Numpy-discussion] New patch for MA In-Reply-To: References: <200603280427.52789.pgmdevlist@mailcan.com> Message-ID: <200604062218.05876.pgmdevlist@mailcan.com> Sasha, Thanks for your advice with SVN. I'll make sure to use that method from now on. > 1. I've initially implemented some ma array methods by wrapping existing > module level functions. I am not sure this is the best approach to > implement new methods. 
It is probably cleaner to implement them as methods > and provide wrappers at the module level similar to oldnumeric. Well, I tried to stick to the latest convention, getting rid of the _wrapit part. Let me know. > > 2. I am not sure cumprod and cumsum should fill masked elements with 1 and > 0. Good point for the object/string arrays, yet other cases I overlooked (I'm still not used to object arrays, I'm now realizing they're quite useful). Actually, I coded that way because it's how I use these functions. But well, as many settings as users, eh? Michael's suggestion of introducing R-like options sounds interesting, but I wonder whether it would not be a bit heavy for methods, with the introduction of an extra flag. That'd be great for functions, though. So, for cumsum and cumprod methods, maybe we could stick to Sasha's and Michael's preference (mask all values after the first missing), and we would just have to create two functions. We could use the 4 R ones: na.omit, na.fail, na.pass, na.exclude. For our current problem (cumsum,cumprod) na.omit: would return the result I implemented (fill with 0 or 1) na.fail: would return masked values after the first missing na.exclude: would correspond to compressed().cumsum() ? I don't like that, it changes the initial length/size na.pass: I don't know... From ndarray at mac.com Thu Apr 6 21:14:01 2006 From: ndarray at mac.com (Sasha) Date: Thu Apr 6 21:14:01 2006 Subject: [Numpy-discussion] New patch for MA In-Reply-To: <16761e100604061733r586cca6cr94d72c554b54fdd0@mail.gmail.com> References: <200603280427.52789.pgmdevlist@mailcan.com> <16761e100604061733r586cca6cr94d72c554b54fdd0@mail.gmail.com> Message-ID: On 4/6/06, Michael Sorich wrote: > ... I would prefer for the result to be masked > when masked values are involved unless I explicitly indicate what should be > done with the masked values. ... This is the case in r2332: >>> from numpy.core.ma import * >>> print array([1,2,3], mask=[0,1,0]).cumsum() [1 -- --] From a.mcmorland at auckland.ac.nz Fri Apr 7 00:30:07 2006 From: a.mcmorland at auckland.ac.nz (Angus McMorland) Date: Fri Apr 7 00:30:07 2006 Subject: [Numpy-discussion] Newbie indexing question [fancy indexing in nD] In-Reply-To: <4434E522.3060101@ieee.org> References: <44338DF4.7050603@gmail.com> <4434E522.3060101@ieee.org> Message-ID: <4435F672.1040701@auckland.ac.nz> Hi again. Thanks, everyone, for your quick replies. Travis Oliphant wrote: > amcmorl wrote: > >> Hi all, >> >> I'm having a bit of trouble getting my head around numpy's indexing >> capabilities. A quick summary of the problem is that I want to >> lookup/index in nD from a second array of rank n+1, such that the last >> (or first, I guess) dimension contains the lookup co-ordinates for the >> value to extract from the first array. Here's a 2D (3,3) example: >> >> In [12]:print ar >> [[ 0.15 0.75 0.2 ] >> [ 0.82 0.5 0.77] >> [ 0.21 0.91 0.59]] >> >> In [24]:print inds >> [[[1 1] >> [1 1] >> [2 1]] >> >> [[2 2] >> [0 0] >> [1 0]] >> >> [[1 1] >> [0 0] >> [2 1]]] >> >> then somehow return the array (barring me making any row/column errors): >> In [26]: c = ar.somefancyindexingroutinehere(inds) > > You can do this with "fancy-indexing". Obviously it is going to take > some time for people to get used to this idea as none of the responses > yet suggest it. > But the following works. > c = ar[inds[...,0],inds[...,1]] > > gives the desired effect. > > Thus, your simple description c[x,y] = ar[inds[x,y,0],inds[x,y,1]] is a > text-book description of what fancy-indexing does. 
Great. Turns out I wasn't too far off then. I've written a quick function of my own that extends the fancy indexing to nD: def fancy_index_nd(ar, ind): evList = ['ar['] for i in range(len(ar.shape)): evList = evList + [' ind[...,%d]' % i] if i < len(ar.shape) - 1: evList = evList + [","] evList = evList + [' ]'] return eval(''.join(evList)) 1) Am I missing a simpler way to extend the fancy-indexing to n-dimensions? If not... 2) it seems (conceptually) that this might be a little faster than the routines that have to calculate a flat index. Hopefully it could be of use to people. Any thoughts? Cheers, Angus -- Angus McMorland email a.mcmorland at auckland.ac.nz mobile +64-21-155-4906 PhD Student, Neurophysiology / Multiphoton & Confocal Imaging Physiology, University of Auckland phone +64-9-3737-599 x89707 Armourer, Auckland University Fencing Secretary, Fencing North Inc. From pau.gargallo at gmail.com Fri Apr 7 02:37:05 2006 From: pau.gargallo at gmail.com (Pau Gargallo) Date: Fri Apr 7 02:37:05 2006 Subject: [Numpy-discussion] Newbie indexing question [fancy indexing in nD] In-Reply-To: <4435F672.1040701@auckland.ac.nz> References: <44338DF4.7050603@gmail.com> <4434E522.3060101@ieee.org> <4435F672.1040701@auckland.ac.nz> Message-ID: <6ef8f3380604070236m2d606983l82403cbc2305fefa@mail.gmail.com> you can do things like a[ list( ind[...,i] for i in range(ind.shape[-1]) ) ] if the indices could be accessed as ind[i] instead of ind[...,i] (transposing the indices array) then you could simply do: a[ list(ind) ] pau On 4/7/06, Angus McMorland wrote: > Hi again. > > Thanks, everyone, for your quick replies. > > Travis Oliphant wrote: > > amcmorl wrote: > > > >> Hi all, > >> > >> I'm having a bit of trouble getting my head around numpy's indexing > >> capabilities. A quick summary of the problem is that I want to > >> lookup/index in nD from a second array of rank n+1, such that the last > >> (or first, I guess) dimension contains the lookup co-ordinates for the > >> value to extract from the first array. Here's a 2D (3,3) example: > >> > >> In [12]:print ar > >> [[ 0.15 0.75 0.2 ] > >> [ 0.82 0.5 0.77] > >> [ 0.21 0.91 0.59]] > >> > >> In [24]:print inds > >> [[[1 1] > >> [1 1] > >> [2 1]] > >> > >> [[2 2] > >> [0 0] > >> [1 0]] > >> > >> [[1 1] > >> [0 0] > >> [2 1]]] > >> > >> then somehow return the array (barring me making any row/column errors): > >> In [26]: c = ar.somefancyindexingroutinehere(inds) > > > > You can do this with "fancy-indexing". Obviously it is going to take > > some time for people to get used to this idea as none of the responses > > yet suggest it. > > But the following works. > > c = ar[inds[...,0],inds[...,1]] > > > > gives the desired effect. > > > > Thus, your simple description c[x,y] = ar[inds[x,y,0],inds[x,y,1]] is a > > text-book description of what fancy-indexing does. > > Great. Turns out I wasn't too far off then. I've written a quick > function of my own that extends the fancy indexing to nD: > > def fancy_index_nd(ar, ind): > evList = ['ar['] > for i in range(len(ar.shape)): > evList = evList + [' ind[...,%d]' % i] > if i < len(ar.shape) - 1: > evList = evList + [","] > evList = evList + [' ]'] > return eval(''.join(evList)) > > 1) Am I missing a simpler way to extend the fancy-indexing to > n-dimensions? If not... > > 2) it seems (conceptually) that this might be a little faster than the > routines that have to calculate a flat index. Hopefully it could be of > use to people. Any thoughts? > > Cheers, > > Angus > -- > Angus McMorland > email a.mcmorland at auckland.ac.nz > mobile +64-21-155-4906 > > PhD Student, Neurophysiology / Multiphoton & Confocal Imaging > Physiology, University of Auckland > phone +64-9-3737-599 x89707 > > Armourer, Auckland University Fencing > Secretary, Fencing North Inc.
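A tuple-based variant of the same idea (a sketch, not from the thread): numpy treats a tuple of index arrays exactly like the comma-separated form, so the eval() in Angus's function can be avoided.

    import numpy

    def fancy_index_nd(ar, ind):
        # one index array per dimension, taken from the last axis of ind
        return ar[tuple(ind[..., i] for i in range(ind.shape[-1]))]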
From a.h.jaffe at gmail.com Fri Apr 7 06:54:09 2006 From: a.h.jaffe at gmail.com (Andrew Jaffe) Date: Fri Apr 7 06:54:09 2006 Subject: [Numpy-discussion] Re: weird interaction: pickle, numpy, matplotlib.hist In-Reply-To: <4434E31B.5030306@ieee.org> References: <4433DF85.7030109@gmail.com> <4434E31B.5030306@ieee.org> Message-ID: <44366E71.7060601@gmail.com> Travis Oliphant wrote: > But, this brings up the point that currently the pickled raw data which > is read in as a string by Python is used as the memory for the new array > (i.e. the string memory is "stolen"). This should work. The fact > that it didn't with sort was a bug that is now fixed in SVN. However, > operations on out-of-byte-order arrays will always be slower. Thus, > perhaps on pickle read the data should be copied to native byte-order if > necessary. +1 from me, too. I assume that byteswapping is fast compared to I/O in most cases, and the only times when you wouldn't want it would be 'advanced' usage that the developer could take control of via a custom reduce, __getstate__, __setstate__, etc. Andrew ______________________________________________________________________ Andrew Jaffe a.jaffe at imperial.ac.uk Astrophysics Group +44 207 594-7526 Blackett Laboratory, Room 1013 FAX 7541 Imperial College, Prince Consort Road London SW7 2AZ ENGLAND http://astro.imperial.ac.uk/~jaffe From ndarray at mac.com Fri Apr 7 10:26:06 2006 From: ndarray at mac.com (Sasha) Date: Fri Apr 7 10:26:06 2006 Subject: [Numpy-discussion] Re: ndarray.fill and ma.array.filled In-Reply-To: References: Message-ID: I am posting a reply to my own post in the hope of generating some discussion of the original proposal. I am proposing to add a "filled" method to ndarray. This can be a pass-through, an alias to "copy" or a method to replace nans or some other type-specific values. This will allow code that uses "filled" to work on ndarrays without changes. On 3/22/06, Sasha wrote: > > In an ideal world, any function that accepts ndarray would accept > ma.array and vice versa. Moreover, if the ma.array has no masked
> Moreover, if the ma.array has no masked
> elements and the same data as ndarray, the result should be the same.
> Obviously current implementation falls short of this goal, but there
> is one feature that seems to make this goal unachievable.
>
> This feature is the "filled" method of ma.array. Pydoc for this
> method reports the following:
>
> | filled(self, fill_value=None)
> | A numeric array with masked values filled. If fill_value is None,
> | use self.fill_value().
> |
> | If mask is nomask, copy data only if not contiguous.
> | Result is always a contiguous, numeric array.
> | # Is contiguous really necessary now?
>
> That is not the best possible description ("filled" is "filled"), but
> the essence is that the result of a.filled(value) is a contiguous
> ndarray obtained from the masked array by copying non-masked elements
> and using value for masked values.
>
> I would like to propose to add a "filled" method to ndarray. I see
> several possibilities and would like to hear your opinion:
>
> 1. Make filled simply return self.
>
> 2. Make filled return a contiguous copy.
>
> 3. Make filled replace nans with the fill_value if array is of
> floating point type.
>
> Unfortunately, adding "filled" will result in a rather confusing
> situation where "fill" and "filled" both exist and have very different
> meanings.
>
> I would like to note that "fill" is a somewhat odd ndarray method.
> AFAICT, it is the only non-special method that mutates the array. It
> appears to be just a performance trick: the same result can be achieved
> with "a[...] = ".
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From webb.sprague at gmail.com Fri Apr 7 10:38:03 2006
From: webb.sprague at gmail.com (Webb Sprague)
Date: Fri Apr 7 10:38:03 2006
Subject: [Numpy-discussion] Tiling / disk storage for matrix in numpy?
Message-ID: 

Hi all,

Is there a way in numpy to associate a (large) matrix with a disk file,
then tile and index it, then cache it as you process the various pieces?
This is pretty important with massive image files, which can't fit into
working memory, but in which (for example) you might be doing a
convolution on a 100 x 100 pixel window on a small subset of the image.

I know that caching algorithms are (1) complicated and (2) never
general. But there you go. Perhaps I can't find it, perhaps it would be
a good project for the future? If HDF or something does this already,
could someone point me in the right direction?

Thx
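Webb's question is, in effect, a memory-mapped array plus a chunk cache.
numpy's memmap covers the disk-backed-array half; a minimal sketch,
assuming a reasonably current numpy (the file name and shapes here are
made up, and true chunked/tiled storage would still need something like
HDF5 via PyTables on top):

import numpy as np

# Create (or reopen with mode='r+') a disk-backed float32 image.
img = np.memmap('image.dat', dtype=np.float32, mode='w+',
                shape=(2000, 2000))

# Slicing touches only the pages that back this 100 x 100 window,
# so the full image never has to fit in working memory.
window = img[500:600, 500:600]
print(window.mean())

img.flush()   # push any modified pages back to the file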
From tim.hochberg at cox.net Fri Apr 7 11:22:05 2006
From: tim.hochberg at cox.net (Tim Hochberg)
Date: Fri Apr 7 11:22:05 2006
Subject: [Numpy-discussion] Re: ndarray.fill and ma.array.filled
In-Reply-To: 
References: 
Message-ID: <4436AE31.7000306@cox.net>

Sasha wrote:

> I am posting a reply to my own post in a hope to generate some
> discussion of the original proposal.
>
> I am proposing to add a "filled" method to ndarray. This can be a
> pass-through, an alias to "copy" or a method to replace nans or some
> other type-specific values. This will allow code that uses "filled"
> to work on ndarrays without changes.

In general, I'm skeptical of adding more methods to the ndarray object
-- there are plenty already. In addition, it appears that both the
method and function versions of filled are "dangerous" in the sense that
they sometimes return the array itself and sometimes a copy. Finally,
changing ndarray to support masked array feels a bit like the tail
wagging the dog.

Let me throw out an alternative proposal. I will admit up front that
this proposal is based on exactly zero experience with masked array, so
there may be some stupidities in it, but perhaps it will lead to an
alternative solution.

def asUnmaskedArray(obj, fill_value=None):
    mask = getattr(obj, 'mask', False)  # no mask attribute -> plain array
    if mask is False:
        return obj
    if fill_value is None:
        fill_value = obj.get_fill_value()
    newobj = obj.data().copy()
    newobj[mask] = fill_value
    return newobj

Or something like that anyway. This particular version should work on
any array as long as, if it exports a mask attribute, it also exports
get_fill_value and data. At least once any bugs are ironed out, I
haven't tested it.

ma would have to be modified to use this instead of using filled
everywhere, but that seems more appropriate than tacking on another
method to ndarray IMO. One advantage of this approach is that most
array-like objects that don't subclass ndarray will work with this
automagically. If we keep expanding the methods of ndarray, it's harder
and harder to implement other array-like objects since they have to
implement more and more methods, most of which are irrelevant to their
particular case. The more we can implement stuff like this in terms of
some relatively small set of core primitives, the happier we'll all be
in the long run.

This also builds on the idea of trying to push as much of the array/view
ambiguity into the asXXXArray corner.

Regards,

-tim

>
> On 3/22/06, *Sasha* > wrote:
>
>     In an ideal world, any function that accepts ndarray would accept
>     ma.array and vice versa. Moreover, if the ma.array has no masked
>     elements and the same data as ndarray, the result should be the same.
>     Obviously current implementation falls short of this goal, but there
>     is one feature that seems to make this goal unachievable.
>
>     This feature is the "filled" method of ma.array. Pydoc for this
>     method reports the following:
>
>     | filled(self, fill_value=None)
>     | A numeric array with masked values filled. If fill_value is
>     None,
>     | use self.fill_value().
>     |
>     | If mask is nomask, copy data only if not contiguous.
>     | Result is always a contiguous, numeric array.
>     | # Is contiguous really necessary now?
>
>     That is not the best possible description ("filled" is "filled"), but
>     the essence is that the result of a.filled(value) is a contiguous
>     ndarray obtained from the masked array by copying non-masked elements
>     and using value for masked values.
>
>     I would like to propose to add a "filled" method to ndarray. I see
>     several possibilities and would like to hear your opinion:
>
>     1. Make filled simply return self.
>
>     2. Make filled return a contiguous copy.
>
>     3. Make filled replace nans with the fill_value if array is of
>     floating point type.
>
>     Unfortunately, adding "filled" will result in a rather confusing
>     situation where "fill" and "filled" both exist and have very different
>     meanings.
>
>     I would like to note that "fill" is a somewhat odd ndarray method.
>     AFAICT, it is the only non-special method that mutates the array. It
>     appears to be just a performance trick: the same result can be
>     achieved
>     with "a[...] = ".
>
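A quick exercise of the asUnmaskedArray sketch above, restated so the
snippet is self-contained; the Toy class is hypothetical, standing in
for anything that exports the mask/get_fill_value/data protocol Tim
describes (none of these names are real numpy API):

import numpy as np

def asUnmaskedArray(obj, fill_value=None):   # restated from above
    mask = getattr(obj, 'mask', False)
    if mask is False:
        return obj
    if fill_value is None:
        fill_value = obj.get_fill_value()
    newobj = obj.data().copy()
    newobj[mask] = fill_value
    return newobj

class Toy:
    def __init__(self, data, mask, fill):
        self.mask = mask          # boolean array: True = masked
        self._data = data
        self._fill = fill
    def get_fill_value(self):
        return self._fill
    def data(self):
        return self._data

t = Toy(np.array([1.0, 2.0, 3.0]), np.array([False, True, False]), -1.0)
print(asUnmaskedArray(t))             # -> [ 1. -1.  3.]
print(asUnmaskedArray(np.arange(3)))  # plain ndarrays pass straight through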
From ndarray at mac.com Fri Apr 7 12:20:15 2006
From: ndarray at mac.com (Sasha)
Date: Fri Apr 7 12:20:15 2006
Subject: [Numpy-discussion] Re: ndarray.fill and ma.array.filled
In-Reply-To: <4436AE31.7000306@cox.net>
References: <4436AE31.7000306@cox.net>
Message-ID: 

On 4/7/06, Tim Hochberg wrote:
>
> ...
> In general, I'm skeptical of adding more methods to the ndarray object
> -- there are plenty already.

I've also proposed to drop "fill" in favor of optimizing x[...] = .
Having both "fill" and "filled" in the interface is plain awkward.
You may like the combined proposal better because it does not change
the total number of methods :-)

> In addition, it appears that both the method and function versions of
> filled are "dangerous" in the sense that they sometimes return the array
> itself and sometimes a copy.

This is true in ma, but may certainly be changed.

> Finally, changing ndarray to support masked array feels a bit like the
> tail wagging the dog.

I disagree. Numpy is pretty much alone among the array languages because
it does not have "native" support for missing values. For the floating
point types some rudimentary support for nans exists, but is not really
usable. There is no missing values mechanism for integer types. I
believe adding "filled" and maybe "mask" to ndarray (not necessarily
under these names) could be a meaningful step towards "native" support
for missing values.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From webb.sprague at gmail.com Fri Apr 7 12:36:00 2006
From: webb.sprague at gmail.com (Webb Sprague)
Date: Fri Apr 7 12:36:00 2006
Subject: [Numpy-discussion] Silly array question
Message-ID: 

In R, if you have an Nx2 array of integers, you can use that to index
a TxS array, yielding a 1xN result. Is there a way to do that in
numpy? I looked for a pairs function but I couldn't find it, vaguely
remembering that might be around... I know it would be a trivial loop
to write, but a numpy array function would be faster (I hope).

Example

I = [[0,0], [1,1], [2,2], [1,1]]
M = [[1, 2, 3, 4],
     [5, 6, 7, 8],
     [9,10,11, 12],
     [13, 14, 15, 16]]

M[I] = [1,6,11,6].

Thanks!

From ndarray at mac.com Fri Apr 7 12:53:03 2006
From: ndarray at mac.com (Sasha)
Date: Fri Apr 7 12:53:03 2006
Subject: [Numpy-discussion] Silly array question
In-Reply-To: 
References: 
Message-ID: 

>>> M.ravel()[dot(I,(4,1))]
array([ 1, 6, 11, 6])

On 4/7/06, Webb Sprague wrote:
>
> In R, if you have an Nx2 array of integers, you can use that to index
> a TxS array, yielding a 1xN result. Is there a way to do that in
> numpy? I looked for a pairs function but I couldn't find it, vaguely
> remembering that might be around... I know it would be a trivial loop
> to write, but a numpy array function would be faster (I hope).
>
> Example
>
> I = [[0,0], [1,1], [2,2], [1,1]]
> M = [[1, 2, 3, 4],
>      [5, 6, 7, 8],
>      [9,10,11, 12],
>      [13, 14, 15, 16]]
>
> M[I] = [1,6,11,6].
>
> Thanks!
>
>
> -------------------------------------------------------
> This SF.Net email is sponsored by xPML, a groundbreaking scripting
> language
> that extends applications into web and mobile media. Attend the live
> webcast
> and join the prime developer group breaking into this new coding
> territory!
> http://sel.as-us.falkag.net/sel?cmd=lnk&kid=110944&bid=241720&dat=121642
> _______________________________________________
> Numpy-discussion mailing list
> Numpy-discussion at lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/numpy-discussion
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From efiring at hawaii.edu Fri Apr 7 13:22:06 2006
From: efiring at hawaii.edu (Eric Firing)
Date: Fri Apr 7 13:22:06 2006
Subject: [Numpy-discussion] Re: ndarray.fill and ma.array.filled
In-Reply-To: 
References: <4436AE31.7000306@cox.net>
Message-ID: <4436C965.8020808@hawaii.edu>

Sasha wrote:
>
>
> On 4/7/06, *Tim Hochberg* > wrote:
>
> ...
> In general, I'm skeptical of adding more methods to the ndarray object > -- there are plenty already. > > > I've also proposed to drop "fill" in favor of optimizing x[...] = > . Having both "fill" and "filled" in the interface is plain > awkward. You may like the combined proposal better because it does not > change the total number of methods :-) > > > In addition, it appears that both the method and function versions of > filled are "dangerous" in the sense that they sometimes return the > array > itself and sometimes a copy. > > > This is true in ma, but may certainly be changed. > > > Finally, changing ndarray to support masked array feels a bit like the > tail wagging the dog. > > > I disagree. Numpy is pretty much alone among the array languages because > it does not have "native" support for missing values. For the floating > point types some rudimental support for nans exists, but is not really > usable. There is no missing values machanism for integer types. I > believe adding "filled" and maybe "mask" to ndarray (not necessarily > under these names) could be a meaningful step towards "native" support > for missing values. I agree strongly with you, Sasha. I get the impression that the world of numerical computation is divided into those who work with idealized "data", where nothing is missing, and those who work with real observations, where there is always something missing. As an oceanographer, I am solidly in the latter category. If good support for missing values is not built in, it has to be bolted on, and it becomes clunky and awkward. I was reluctant to speak up about this earlier because I thought it was too much to ask of Travis when he was in the midst of putting numpy on solid ground. But I am delighted that missing value support has a champion among numpy developers, and I agree that now is the time to change it from "bolted on" to "integrated". Eric From Chris.Barker at noaa.gov Fri Apr 7 13:28:02 2006 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Fri Apr 7 13:28:02 2006 Subject: [Numpy-discussion] Silly array question In-Reply-To: References: Message-ID: <4436CB1C.3040308@noaa.gov> Webb Sprague wrote: > In R, if you have an Nx2 array of integers, you can use that to index > an TxS array, yielding a 1xN result. this seems to work: >>> import numpy as N >>> I = N.array([[0,0], [1,1], [2,2], [1,1]]) >>> I array([[0, 0], [1, 1], [2, 2], [1, 1]]) >>> M = N. array( [[1, 2, 3, 4], [5, 6, 7, 8], [9,10,11, 12], [13, 14, 15, 16]]) >>> M array([[ 1, 2, 3, 4], [ 5, 6, 7, 8], [ 9, 10, 11, 12], [13, 14, 15, 16]]) >>> M[I[:,0], I[:,1]] array([ 1, 6, 11, 6]) -- Christopher Barker, Ph.D. Oceanographer NOAA/OR&R/HAZMAT (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From ndarray at mac.com Fri Apr 7 13:56:02 2006 From: ndarray at mac.com (Sasha) Date: Fri Apr 7 13:56:02 2006 Subject: [Numpy-discussion] Silly array question In-Reply-To: <4436CB1C.3040308@noaa.gov> References: <4436CB1C.3040308@noaa.gov> Message-ID: One more obfuscated numpy entry: >>> M[tuple(transpose(I))] array([ 1, 6, 11, 6]) On 4/7/06, Christopher Barker wrote: > > > > Webb Sprague wrote: > > In R, if you have an Nx2 array of integers, you can use that to index > > an TxS array, yielding a 1xN result. > > this seems to work: > > >>> import numpy as N > >>> I = N.array([[0,0], [1,1], [2,2], [1,1]]) > >>> I > array([[0, 0], > [1, 1], > [2, 2], > [1, 1]]) > > >>> M = N. 
array( [[1, 2, 3, 4], [5, 6, 7, 8], [9,10,11, 12], [13, 14, > 15, 16]]) > > >>> M > array([[ 1, 2, 3, 4], > [ 5, 6, 7, 8], > [ 9, 10, 11, 12], > [13, 14, 15, 16]]) > > >>> M[I[:,0], I[:,1]] > array([ 1, 6, 11, 6]) > > -- > Christopher Barker, Ph.D. > Oceanographer > > NOAA/OR&R/HAZMAT (206) 526-6959 voice > 7600 Sand Point Way NE (206) 526-6329 fax > Seattle, WA 98115 (206) 526-6317 main reception > > Chris.Barker at noaa.gov > > > ------------------------------------------------------- > This SF.Net email is sponsored by xPML, a groundbreaking scripting > language > that extends applications into web and mobile media. Attend the live > webcast > and join the prime developer group breaking into this new coding > territory! > http://sel.as-us.falkag.net/sel?cmd=lnk&kid=110944&bid=241720&dat=121642 > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From webb.sprague at gmail.com Fri Apr 7 14:00:10 2006 From: webb.sprague at gmail.com (Webb Sprague) Date: Fri Apr 7 14:00:10 2006 Subject: [Numpy-discussion] Silly array question In-Reply-To: References: <4436CB1C.3040308@noaa.gov> Message-ID: I appreciate everyone's help, but is there a NON obfuscated way to do this without looping? I think Chris's is my favorite, but I didn't know I was starting a contest :) > >>> M[I[:,0], I[:,1]] > array([ 1, 6, 11, 6]) W From webb.sprague at gmail.com Fri Apr 7 14:05:04 2006 From: webb.sprague at gmail.com (Webb Sprague) Date: Fri Apr 7 14:05:04 2006 Subject: [Numpy-discussion] Silly array question In-Reply-To: References: <4436CB1C.3040308@noaa.gov> Message-ID: Ok, so now I get it M[(tuple for rows), (tuple for columns)] Whew On 4/7/06, Webb Sprague wrote: > I appreciate everyone's help, but is there a NON obfuscated way to do > this without looping? I think Chris's is my favorite, but I didn't > know I was starting a contest :) > > > >>> M[I[:,0], I[:,1]] > > array([ 1, 6, 11, 6]) > > W > From tim.hochberg at cox.net Fri Apr 7 14:16:06 2006 From: tim.hochberg at cox.net (Tim Hochberg) Date: Fri Apr 7 14:16:06 2006 Subject: [Numpy-discussion] Re: ndarray.fill and ma.array.filled In-Reply-To: <4436C965.8020808@hawaii.edu> References: <4436AE31.7000306@cox.net> <4436C965.8020808@hawaii.edu> Message-ID: <4436D6D1.6040302@cox.net> Eric Firing wrote: > Sasha wrote: > >> >> >> On 4/7/06, *Tim Hochberg* > > wrote: >> >> ... >> In general, I'm skeptical of adding more methods to the ndarray >> object >> -- there are plenty already. >> >> >> I've also proposed to drop "fill" in favor of optimizing x[...] = >> . Having both "fill" and "filled" in the interface is plain >> awkward. You may like the combined proposal better because it does >> not change the total number of methods :-) >> >> >> In addition, it appears that both the method and function >> versions of >> filled are "dangerous" in the sense that they sometimes return the >> array >> itself and sometimes a copy. >> >> >> This is true in ma, but may certainly be changed. >> >> >> Finally, changing ndarray to support masked array feels a bit >> like the >> tail wagging the dog. >> >> I disagree. Numpy is pretty much alone among the array languages >> because it does not have "native" support for missing values. For >> the floating point types some rudimental support for nans exists, >> but is not really usable. 
>> There is no missing values mechanism for
>> integer types. I believe adding "filled" and maybe "mask" to ndarray
>> (not necessarily under these names) could be a meaningful step
>> towards "native" support for missing values.
>
>
> I agree strongly with you, Sasha. I get the impression that the world
> of numerical computation is divided into those who work with idealized
> "data", where nothing is missing, and those who work with real
> observations, where there is always something missing.

I think your experience is clouding your judgement here. Or at least
this comes off as unnecessarily pejorative. There's a large class of
people who work with data that doesn't have missing values either
because of the nature of data acquisition or because they're doing
simulations. I take zillions of measurements with digital oscilloscopes
and they *never* have missing values. Clipped values, yes, but even if I
somehow could query the scope about which values were actually clipped
or simply make an educated guess based on their value, the facilities of
ma would be useless to me. The clipped values are what I would want in
any case. I also do a lot of work with simulations derived from this
and other data. I don't come across missing values here but again, if I
did, the way ma works would not help me. I'd have to treat them either
by rejecting the data outright or by some sort of interpolation.

> As an oceanographer, I am solidly in the latter category. If good
> support for missing values is not built in, it has to be bolted on,
> and it becomes clunky and awkward.

This may be a false dichotomy. It's certainly not obvious to me that
this is so. At least if "bolted on" means "not adding a filled method to
ndarray".

> I was reluctant to speak up about this earlier because I thought it
> was too much to ask of Travis when he was in the midst of putting
> numpy on solid ground. But I am delighted that missing value support
> has a champion among numpy developers, and I agree that now is the
> time to change it from "bolted on" to "integrated".

I have no objection to ma support improving. In fact I think it would be
great although I don't foresee it helping me anytime soon. I also
support Sasha's goal of being able to mix MaskedArrays and ndarrays
reasonably seamlessly.

However, I do think the situation needs more thought. Slapping filled
and mask onto ndarray is the path of least resistance, but it's not
clear that it's the best one.

If we do decide we are going to add both of these methods to ndarray
(with filled returning a copy!), then it may be worth considering making
ndarray a subclass of MaskedArray. Conceptually this makes sense, since
at this point an ndarray will just be a MaskedArray where mask is always
False. I think that they could share much of the implementation except
that ndarray would be set up to use methods that ignored the mask
attribute since they would know that it's always false. Even that might
not be worth it, since the check for whether mask is True/False is just
a pointer compare.

It may in fact be best just to do away with MaskedArray entirely, moving
the functionality into ndarray. That may have performance implications,
although I don't see them at the moment, and I don't know if there are
other methods/attributes that this would imply need to be moved over,
although it looks like just mask, filled and possibly fill_value,
although the latter looks a little dubious to me.

Either of the above two options would certainly improve the quality of
MaskedArray.
Copy for instance seems not to have been implemented, and who knows what other dark corners remain unexplored here. There's a whole spectrum of possibilities here from ones that don't intrude on ndarray at all to ones that profoundly change it. Sasha's suggestion looks like it's probably the simplest thing in the short term, but I don't know that it's the best long term solution. I think it needs more thought and discussion, which is after all what Sasha asked for ;) Regards, -tim From Chris.Barker at noaa.gov Fri Apr 7 15:13:02 2006 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Fri Apr 7 15:13:02 2006 Subject: [Numpy-discussion] Silly array question In-Reply-To: References: <4436CB1C.3040308@noaa.gov> Message-ID: <4436E3C9.2040807@noaa.gov> Sasha wrote: > One more obfuscated numpy entry: > >>>> M[tuple(transpose(I))] > array([ 1, 6, 11, 6]) exactly. Can anyone explain why that works, but: M[transpose(I)] or M[I] doesn't? -Chris - Christopher Barker, Ph.D. Oceanographer NOAA/OR&R/HAZMAT (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From efiring at hawaii.edu Fri Apr 7 15:37:03 2006 From: efiring at hawaii.edu (Eric Firing) Date: Fri Apr 7 15:37:03 2006 Subject: [Numpy-discussion] Re: ndarray.fill and ma.array.filled In-Reply-To: <4436D6D1.6040302@cox.net> References: <4436AE31.7000306@cox.net> <4436C965.8020808@hawaii.edu> <4436D6D1.6040302@cox.net> Message-ID: <4436E95B.4090009@hawaii.edu> Tim Hochberg wrote: > Eric Firing wrote: > >> Sasha wrote: >> >>> >>> >>> On 4/7/06, *Tim Hochberg* >> > wrote: >>> >>> ... >>> In general, I'm skeptical of adding more methods to the ndarray >>> object >>> -- there are plenty already. >>> >>> >>> I've also proposed to drop "fill" in favor of optimizing x[...] = >>> . Having both "fill" and "filled" in the interface is plain >>> awkward. You may like the combined proposal better because it does >>> not change the total number of methods :-) >>> >>> >>> In addition, it appears that both the method and function >>> versions of >>> filled are "dangerous" in the sense that they sometimes return the >>> array >>> itself and sometimes a copy. >>> >>> >>> This is true in ma, but may certainly be changed. >>> >>> >>> Finally, changing ndarray to support masked array feels a bit >>> like the >>> tail wagging the dog. >>> >>> I disagree. Numpy is pretty much alone among the array languages >>> because it does not have "native" support for missing values. For >>> the floating point types some rudimental support for nans exists, >>> but is not really usable. There is no missing values machanism for >>> integer types. I believe adding "filled" and maybe "mask" to ndarray >>> (not necessarily under these names) could be a meaningful step >>> towards "native" support for missing values. >> >> >> >> I agree strongly with you, Sasha. I get the impression that the world >> of numerical computation is divided into those who work with idealized >> "data", where nothing is missing, and those who work with real >> observations, where there is always something missing. > > > I think your experience is clouding your judgement here. Or at least > this comes off as unnecessarily perjorative. There's a large class of > people who work with data that doesn't have missing values either > because of the nature of data acquisition or because they're doing > simulations. I take zillions of measurements with digital oscillopscopes > and they *never* have missing values. 
Clipped values, yes, but even if I > somehow could queery the scope about which values were actually clipped > or simply make an educated guess based on their value, the facilities of > ma would be useless to me. The clipped values are what I would want in > any case. I also do a lot of work with simulations derived from this > and other data. I don't come across missing values here but again, if I > did, the way ma works would not help me. I'd have to treat them either > by rejecting the data outright or by some sort of interpolation. Tim, The point is well-taken, and I apologize. I stated my case badly. (I would be delighted if I did not have to be concerned with missing values-they are a pain regardless of how well a numerical package handles them.) > >> As an oceanographer, I am solidly in the latter category. If good >> support for missing values is not built in, it has to be bolted on, >> and it becomes clunky and awkward. > > > This may be a false dichotomy. It's certainly not obvious to me that > this is so. At least if "bolted on" means "not adding a filled method to > ndarray". I probably overstated it, but I think we actually agree. I intended to lend support to the priority of making missing-value support as seamless and painless as possible. It will help some people, and not others. > >> I was reluctant to speak up about this earlier because I thought it >> was too much to ask of Travis when he was in the midst of putting >> numpy on solid ground. But I am delighted that missing value support >> has a champion among numpy developers, and I agree that now is the >> time to change it from "bolted on" to "integrated". > > > > I have no objection to ma support improving. In fact I think it would be > great although I don't forsee it helping me anytime soon. I also support > Sasha's goal of being able to mix MaskedArrays and ndarrays reasonably > seemlessly. > > However, I do think the situation needs more thought. Slapping filled > and mask onto ndarray is the path of least resistance, but it's not > clear that it's the best one. > > If we do decide we are going to add both of these methods to ndarray > (with filled returning a copy!), then it may worth considering making > ndarray a subclass of MaskedArray. Conceptually this makes sense, since > at this point an ndarray will just be a MaskedArray where mask is always > False. I think that they could share much of the implementation except > that ndarray would be set up to use methods that ignored the mask > attribute since they would know that it's always false. Even that might > not be worth it, since the check for whether mask is True/False is just > a pointer compare. > > It may in fact be best just to do away with MaskedArray entirely, moving > the functionality into ndarray. That may have performance implications, > although I don't seem them at the moment, and I don't know if there are > other methods/attributes that this would imply need to be moved over, > although it looks like just mask, filled and possibly filled_value, > although the latter looks a little dubious to me. > This is exactly the option that I was afraid to bring up because I thought it might be too disruptive, and because I am not contributing to numpy, and probably don't have the competence (or time) to do so. > Either of the above two options would certainly improve the quality of > MaskedArray. Copy for instance seems not to have been implemented, and > who knows what other dark corners remain unexplored here. 
> > There's a whole spectrum of possibilities here from ones that don't > intrude on ndarray at all to ones that profoundly change it. Sasha's > suggestion looks like it's probably the simplest thing in the short > term, but I don't know that it's the best long term solution. I think it > needs more thought and discussion, which is after all what Sasha asked > for ;) Exactly! Thank you for broadening the discussion. Eric From ndarray at mac.com Fri Apr 7 15:38:04 2006 From: ndarray at mac.com (Sasha) Date: Fri Apr 7 15:38:04 2006 Subject: [Numpy-discussion] Re: ndarray.fill and ma.array.filled In-Reply-To: <4436D6D1.6040302@cox.net> References: <4436AE31.7000306@cox.net> <4436C965.8020808@hawaii.edu> <4436D6D1.6040302@cox.net> Message-ID: On 4/7/06, Tim Hochberg wrote: > [...] > > However, I do think the situation needs more thought. Slapping filled > and mask onto ndarray is the path of least resistance, but it's not > clear that it's the best one. Completely agree. I have many gripes about current ma implementation of both "filled" and "mask". filled: 1. I don't like default fill value. It should be mandatory to supply fill value. 2. It should return masked array (with trivial mask), not ndarray. 3. The name conflicts with the "fill" method. 4. View/Copy inconsistency. Does not provide a method to fill values in-place. mask: 1. I've got rid of mask returning None in favor of False_ (boolean array scalar), but it is still not perfect. I would prefer data.shape == mask.shape invariant and if space saving/performance is deemed necessary use zero-stride arrays. 2. I don't like the name. "Missing" or "na" would be better. > If we do decide we are going to add both of these methods to ndarray > (with filled returning a copy!), then it may worth considering making > ndarray a subclass of MaskedArray. Conceptually this makes sense, since > at this point an ndarray will just be a MaskedArray where mask is always > False. I think that they could share much of the implementation except > that ndarray would be set up to use methods that ignored the mask > attribute since they would know that it's always false. Even that might > not be worth it, since the check for whether mask is True/False is just > a pointer compare. > The tail becoming the dog! Yet I agree, this makes sense from the implementation point of view. From OOP perspective this would make sense if arrays were immutable, but since mask is settable in MaskedArray, making it constant in the subclass will violate the substitution principle. I would not object making mask read only, however. > It may in fact be best just to do away with MaskedArray entirely, moving > the functionality into ndarray. That may have performance implications, > although I don't seem them at the moment, and I don't know if there are > other methods/attributes that this would imply need to be moved over, > although it looks like just mask, filled and possibly filled_value, > although the latter looks a little dubious to me. > I think MA can coexist with ndarray and share the interface. Ndarray can use special bit-patterns like IEEE NaN to indicate missing floating point values. Add-on modules can redefine arithmetic to make INT_MIN behave as a missing marker for signed integers (R, K and J (I think) languages use this approach). Applications that need missing values support across the board will use MA. > Either of the above two options would certainly improve the quality of > MaskedArray. 
> Copy for instance seems not to have been implemented, and
> who knows what other dark corners remain unexplored here.
>
More (corners) than you want to know about! Reimplementing MA in C
would be a worthwhile goal (and what you suggest seems to require just
that), but it is too big of a project. I suggest that we focus on the
interface first. If existing MA interface is rejected (which is
likely) for ndarray, we can easily experiment with the alternatives
within MA, which is pure python.

> There's a whole spectrum of possibilities here from ones that don't
> intrude on ndarray at all to ones that profoundly change it. Sasha's
> suggestion looks like it's probably the simplest thing in the short
> term, but I don't know that it's the best long term solution. I think it
> needs more thought and discussion, which is after all what Sasha asked
> for ;)

Exactly!

From robert.kern at gmail.com Fri Apr 7 15:39:02 2006
From: robert.kern at gmail.com (Robert Kern)
Date: Fri Apr 7 15:39:02 2006
Subject: [Numpy-discussion] Re: Silly array question
In-Reply-To: <4436E3C9.2040807@noaa.gov>
References: <4436CB1C.3040308@noaa.gov> <4436E3C9.2040807@noaa.gov>
Message-ID: 

Christopher Barker wrote:
> Sasha wrote:
>
>> One more obfuscated numpy entry:
>>
>>>>> M[tuple(transpose(I))]
>>
>> array([ 1, 6, 11, 6])
>
> exactly. Can anyone explain why that works, but:
>
> M[transpose(I)]
>
> or
> M[I]
>
> doesn't?

There's some typechecking going on in __getitem__. Tuples are presumed
to mean that each item in the tuple is indexing on a different axis.
Non-tuples are presumed to be fancy array-indexing.

--
Robert Kern
robert.kern at gmail.com

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
  -- Umberto Eco
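A short session illustrating Robert's point, assuming a current numpy
(M and I are the arrays from Webb's question):

import numpy as np

M = np.arange(1, 17).reshape(4, 4)
I = np.array([[0, 0], [1, 1], [2, 2], [1, 1]])

# tuple: one index array per axis -> per-pair element lookup
print(M[tuple(I.T)])    # [ 1  6 11  6]

# bare array: fancy indexing along the first axis -> rows
print(M[I].shape)       # (4, 2, 4)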
From pgmdevlist at mailcan.com Fri Apr 7 15:54:01 2006
From: pgmdevlist at mailcan.com (Pierre GM)
Date: Fri Apr 7 15:54:01 2006
Subject: [Numpy-discussion] Re: ndarray.fill and ma.array.filled
In-Reply-To: <4436D6D1.6040302@cox.net>
References: <4436C965.8020808@hawaii.edu> <4436D6D1.6040302@cox.net>
Message-ID: <200604071844.37724.pgmdevlist@mailcan.com>

Folks,
I'm more or less in Eric's field (hydrology), and we do have to deal
with missing values that we can't interpolate straightforwardly (that
is, without some dark statistical magic). Purely discarding the data is
not an option either. MA fills the need, most of it.

I think one of the issues is what is meant by 'masked data':
- a missing observation ?
- a NAN ?
- a data we don't want to consider at one particular point ?
For the last point, think about raster maps or bitmaps: calculations
should be performed on a chunk of data, the initial data left untouched,
and the result should both have the same size as the original, and be
valid only on the initial chunk. The current MA implementation, with its
_data part and its _mask part, works nicely for the 3rd point.

- I wonder whether implementing a 'filled' method for ndarrays is really
better than letting the user create a MaskedArray, where the NANs are
masked. In any case, a 'filled' method should always return a copy, as
it's no longer the initial data.

- I'm not sure what to do with the idea of making ndarray a subclass of
MA. On one side, Tim pointed rightly that a ndarray is just a MA with a
'False' mask. Actually, I'm a bit frustrated with the standard 'asarray'
that shows up in many functions. I'd prefer something like "if the
argument is a non-numpy sequence (tuples, lists), transform it into an
ndarray, but if it's already a ndarray or a MA, leave it as it is. Don't
touch the mask if present". That's how MA.asarray works, but
unfortunately the std "asarray" gets rid of the mask (and you end up
with something which is not what you'd expect). A 'mask=False' attribute
in ndarray would be nice. On another side, some methods/functions make
sense only on unmasked ndarray (FFT, solving equations), some others are
a bit tricky to implement (diff ? median...). Some exception could be
raised if the arguments of these functions return True with ismasked
(cf below), or that could be simplified if 'mask' was a default
attribute of ndarrays. I regularly have to use an ismasked function
(cf below).

def ismasked(a):
    if hasattr(a,'mask'):
        return a.mask.any()
    else:
        return False

We're going towards MA as the default object. But then again, what would
be the behavior to deal with missing values ? Using R-like na.actions ?
That'd be great, but it's getting more complex.
Oh, and another thing: if 'mask', or 'masked' becomes a default
attribute of ndarrays, how do we define a mask? As a boolean ndarray
whose 'mask' is always 'False' ? How do you __repr__ it ?

- I agree that 'fill_value' is not very useful. If I want to fill an
array, I'm happy to specify what value I want it filled with. In fact,
I'd be happier to specify 'values'. I often have to work with 2D arrays,
each column representing a different variable. If this array has to be
filled, I'd like each column to be filled with one particular value, not
necessarily the same along all columns: something like

column_stack([A[:,k].filled(filler[k]) for k in range(A.shape[1])])

with filler a 1xA.shape[1] array of filling values. Of course, we could
imagine the same thing for rows, or higher dimensions...

Sorry for the rants...
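The per-column fill Pierre describes can also be done in one shot by
broadcasting a row of fill values; a sketch on a plain array with NaNs
standing in for masked entries (A and filler are made-up names, assuming
a current numpy):

import numpy as np

A = np.array([[1.0, np.nan],
              [np.nan, 4.0]])
filler = np.array([-1.0, -2.0])    # one fill value per column

# filler broadcasts across rows, so column k gets filler[k]
filled = np.where(np.isnan(A), filler, A)
# [[ 1. -2.]
#  [-1.  4.]]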
From pgmdevlist at mailcan.com Fri Apr 7 16:13:02 2006
From: pgmdevlist at mailcan.com (Pierre GM)
Date: Fri Apr 7 16:13:02 2006
Subject: [Numpy-discussion] Re: ndarray.fill and ma.array.filled
In-Reply-To: 
References: <4436D6D1.6040302@cox.net>
Message-ID: <200604071914.44752.pgmdevlist@mailcan.com>

> filled:
> 1. I don't like default fill value. It should be mandatory to
> supply fill value.

+1

> 2. It should return masked array (with trivial mask), not ndarray.

-1. Unless 'mask/missing/na' becomes a default in ndarray, and other
basic ndarray functions know how to deal with MA seamlessly

> 3. The name conflicts with the "fill" method.

fillmask ? clog ?

> 4. View/Copy inconsistency. Does not provide a method to fill values
> in-place.

But once again, I don't think it should be the default behaviour ! A
filled array should always be a copy of the initial array. Changing in
place means changing the initial data, and I foresee lots of fun trying
to find the original back. No ctrl+Z.

> mask:
>
> 1. I've got rid of mask returning None in favor of False_ (boolean
> array scalar), but it is still not perfect. I would prefer data.shape
> == mask.shape invariant and if space saving/performance is deemed
> necessary use zero-stride arrays.

You lost me on the strides, but I agree with data.shape==mask.shape as
a std

> 2. I don't like the name. "Missing" or "na" would be better.

Once again, it's a point of view. Masked data also means 'data that I
don't wanna see now, but that I may want to see later'. Like masking a
bitmap/raster area. +0 for na, no for missing.

> I would not object making mask read only, however.

Good point, but I was more and more thinking of the opposite. I have a
set of data that I group in three classes. Plotting one class is
straightforward, I just have to mask the other two. Do I really
want/need three objects for the same data ? Can't I just save three
masks, and then run a data[mask] ?

> If existing MA interface is rejected (which is
> likely) for ndarray, we can easily experiment with the alternatives
> within MA, which is pure python.

Er... How many of us are using MA on a regular basis ? Aren't we a
minority ? It'd seem wiser to adapt MA to numpy, in Python (but maybe
that's the 19th-century French integration model I grew up with that
makes me talk here...)

From ndarray at mac.com Fri Apr 7 16:31:03 2006
From: ndarray at mac.com (Sasha)
Date: Fri Apr 7 16:31:03 2006
Subject: [Numpy-discussion] Re: ndarray.fill and ma.array.filled
In-Reply-To: <200604071844.37724.pgmdevlist@mailcan.com>
References: <4436C965.8020808@hawaii.edu> <4436D6D1.6040302@cox.net>
	<200604071844.37724.pgmdevlist@mailcan.com>
Message-ID: 

On 4/7/06, Pierre GM wrote:
> ...
> We're going towards MA as the default object.
>
I will be against changing the array structure to handle missing
values. Let's keep the discussion focused on the interface. Once we
agree on the interface, it will be clear if any structural changes are
necessary.

> But then again, what would be the behavior to deal with missing values ?

We can postpone this discussion as well. Just adding a mask attribute
that returns False and a filled method that returns a copy is an
example of a minimalistic change.

> Using R-like na.actions ? That'd be great, but it's getting more complex.
>
I don't like na.actions. I think missing values should behave like IEEE
NaNs and in the floating point case should be represented by NaNs. The
functionality provided by na.actions can always be achieved by calling
an extra function (filled or compress).

> Oh, and another thing: if 'mask', or 'masked' becomes a default attribute of
> ndarrays, how do we define a mask? As a boolean ndarray whose 'mask' is
> always 'False' ? How do you __repr__ it ?
>
See above. For ndarray mask is always False unless an add-on module is
loaded that redefines arithmetic to recognize special bit-patterns such
as NaN or INT_MIN.
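For reference, both escape hatches Sasha mentions have concrete
spellings in the masked-array world; a sketch using the modern numpy.ma
names:

import numpy as np
import numpy.ma as ma

x = ma.masked_values([1.0, -999.0, 3.0], -999.0)
print(x.filled(0.0))     # [ 1.  0.  3.] -- masked slots replaced
print(x.compressed())    # [ 1.  3.]     -- masked slots dropped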
From tim.hochberg at cox.net Fri Apr 7 17:09:11 2006
From: tim.hochberg at cox.net (Tim Hochberg)
Date: Fri Apr 7 17:09:11 2006
Subject: [Numpy-discussion] Re: ndarray.fill and ma.array.filled
In-Reply-To: 
References: <4436AE31.7000306@cox.net> <4436C965.8020808@hawaii.edu>
	<4436D6D1.6040302@cox.net>
Message-ID: <4436FF73.7080408@cox.net>

Sasha wrote:

>On 4/7/06, Tim Hochberg wrote:
>
>>[...]
>>
>>However, I do think the situation needs more thought. Slapping filled
>>and mask onto ndarray is the path of least resistance, but it's not
>>clear that it's the best one.
>>
>
>Completely agree. I have many gripes about current ma implementation
>of both "filled" and "mask".
>
>filled:
>
>1. I don't like default fill value. It should be mandatory to
>supply fill value.
>
That makes perfect sense. If anything should have a default fill value,
it's the function calling filled, not the arrays themselves.

>2. It should return masked array (with trivial mask), not ndarray.
>
So, just with mask = False? In a follow-on message Pierre disagrees and
claims that what you really want is the ndarray since not everything
will accept it. Then I guess you'd need to call b.filled(fill).data. I
agree with Sasha in principle but Pierre, perhaps in practice. I almost
suggested it get renamed a.asndarray(fill), except that asXXX has the
wrong connotations. I think this one needs to bounce around some more.

>3. The name conflicts with the "fill" method.
>
I thought you wanted to kill that. I'd certainly support that. Can't we
just special case __setitem__ for that one case so that the performance
is just as good if performance is really the issue?

>4. View/Copy inconsistency. Does not provide a method to fill values in-place.
>
b[b.mask] = fill_value; b.unmask()

seems to work for this purpose. Can we just have filled return a copy?

>mask:
>
>1. I've got rid of mask returning None in favor of False_ (boolean
>array scalar), but it is still not perfect. I would prefer data.shape
>== mask.shape invariant and if space saving/performance is deemed
>necessary use zero-stride arrays.
>
Interesting idea. Is that feasible yet?

>2. I don't like the name. "Missing" or "na" would be better.
>
I'm not on board here, although really I'd like to hear from other
people who use the package. 'na' seems too cryptic to me and 'missing'
too specific -- there might be other reasons to mask a value other than
it being missing. The problem with mask is that it's not clear whether
True means the data is useful or unuseful. Keep throwing out names,
maybe one will stick.

>>If we do decide we are going to add both of these methods to ndarray
>>(with filled returning a copy!), then it may be worth considering making
>>ndarray a subclass of MaskedArray. Conceptually this makes sense, since
>>at this point an ndarray will just be a MaskedArray where mask is always
>>False. I think that they could share much of the implementation except
>>that ndarray would be set up to use methods that ignored the mask
>>attribute since they would know that it's always false. Even that might
>>not be worth it, since the check for whether mask is True/False is just
>>a pointer compare.
>>
>
>The tail becoming the dog! Yet I agree, this makes sense from the
>implementation point of view. From OOP perspective this would make
>sense if arrays were immutable, but since mask is settable in
>MaskedArray, making it constant in the subclass will violate the
>substitution principle. I would not object making mask read only,
>however.
>
How do you set the mask? I keep getting attribute errors when I try it.
And unmask would be a noop on an ndarray.

>>It may in fact be best just to do away with MaskedArray entirely, moving
>>the functionality into ndarray. That may have performance implications,
>>although I don't see them at the moment, and I don't know if there are
>>other methods/attributes that this would imply need to be moved over,
>>although it looks like just mask, filled and possibly fill_value,
>>although the latter looks a little dubious to me.
>>
>
>I think MA can coexist with ndarray and share the interface. Ndarray
>can use special bit-patterns like IEEE NaN to indicate missing
>floating point values. Add-on modules can redefine arithmetic to make
>INT_MIN behave as a missing marker for signed integers (R, K and J (I
>think) languages use this approach). Applications that need missing
>values support across the board will use MA.
>
>>Either of the above two options would certainly improve the quality of
>>MaskedArray. Copy for instance seems not to have been implemented, and
>>who knows what other dark corners remain unexplored here.
>>
>
>More (corners) than you want to know about!
>Reimplementing MA in C
>would be a worthwhile goal (and what you suggest seems to require just
>that), but it is too big of a project. I suggest that we focus on the
>interface first. If existing MA interface is rejected (which is
>likely) for ndarray, we can easily experiment with the alternatives
>within MA, which is pure python.
>
Perhaps MaskedArray should inherit from ndarray for the time being. Many
of the methods would need to be reimplemented anyway, but it would make
asanyarray work. Someone was just complaining about asarray munging his
arrays. That's correct behaviour, but it would be nice if asanyarray did
the right thing. I suppose we could just special case asanyarray to
ignore MaskedArrays, that might be better since it's less constraining
from an implementation side too.

>>There's a whole spectrum of possibilities here from ones that don't
>>intrude on ndarray at all to ones that profoundly change it. Sasha's
>>suggestion looks like it's probably the simplest thing in the short
>>term, but I don't know that it's the best long term solution. I think it
>>needs more thought and discussion, which is after all what Sasha asked
>>for ;)
>>
>
>Exactly!
>
This may be an opportune time to propose something that's been cooking
in the back of my head for a week or so now: A stripped down array
superclass. The details of this are not at all locked down, but here's
a strawman proposal.

We add an array superclass, call it basearray, that has the same
C-structure as the existing ndarray. However, it has *no* methods or
attributes. It's simply a big blob of data. Functions that work on the
C structure of arrays (ufuncs, etc) would still work on these arrays, as
would asarray, so it could be converted to an ndarray as necessary.

In addition, we would supply a minimal set of functions that would
operate on this object. These functions would be chosen so that the
current array interface could be implemented on top of them and the
basearray object in pure Python. These functions would be things like
set_shape(a, shape), etc. They would be segregated off in their own
namespace, not in the numpy core. [Note that I'm not proposing we
actually implement ndarray this way, just that we make it possible].

This leads to several useful outcomes.

1. If we're careful, this could be the basic array object that we
propose, at least for the first round, for inclusion in the Python core.
It's not useful for anything but passing data between various
applications that understand the data structure, but that in itself
could be a huge win. And the fact that it's dirt simple would probably
be an advantage to getting it into the core.

2. It provides a useful marker class. MA could inherit from it (and use
itself for its data attribute) and then asanyarray would behave
properly. MA could also use this, or a subclass, as the mask object
preventing anyone from accidentally using it as data (they could always
use it on purpose with asarray).

3. It provides a platform for people to build other, ndarray-like
classes in pure Python. This is my main interest. I've put together a
thin shell over numpy that strips it down to its absolute essentials
including a stripped down version of ndarray that removes most of the
methods. All of the __array_wrap__[1] stuff works quite well most of the
time, but there are still some issues with being a subclass when this
particular class is conceptually a superclass. If we had an array
superclass of some sort, I believe that these would be resolved.
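A toy, pure-Python rendering of the proposed split, just to make the
shape of the idea concrete (every name here is made up; the real
basearray would share ndarray's C structure rather than being a Python
class):

class basearray(object):
    """A blob of data: a flat buffer plus a shape, and nothing else."""
    def __init__(self, buf, shape):
        self.buf = list(buf)
        self.shape = tuple(shape)

# The interface lives in a separate namespace of plain functions, so a
# full ndarray-like class could be written on top of them in pure Python.
def size(a):
    n = 1
    for dim in a.shape:
        n *= dim
    return n

def set_shape(a, shape):
    n = 1
    for dim in shape:
        n *= dim
    if n != size(a):
        raise ValueError("total size must be preserved")
    a.shape = tuple(shape)

b = basearray(range(6), (2, 3))
set_shape(b, (3, 2))
assert b.shape == (3, 2)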
In principle at least, this shouldn't be that hard. I think it should
mostly be rearranging some code and adding some wrappers to existing
functions. That's in principle. In practice, I'm not certain yet as I
haven't investigated the code in question in much depth yet.

I've been meaning to write this up into a more fleshed out proposal, but
I got distracted by the whole Protocol discussion on python-dev3000.
This writeup is pretty weak, but hopefully you get the idea.

Anyway, this is something that I would be willing to put some time on
that would benefit both me and probably the MA folks as well.

Regards,

-tim

From efiring at hawaii.edu Fri Apr 7 17:27:09 2006
From: efiring at hawaii.edu (Eric Firing)
Date: Fri Apr 7 17:27:09 2006
Subject: [Numpy-discussion] Re: ndarray.fill and ma.array.filled
In-Reply-To: <4436FF73.7080408@cox.net>
References: <4436AE31.7000306@cox.net> <4436C965.8020808@hawaii.edu>
	<4436D6D1.6040302@cox.net> <4436FF73.7080408@cox.net>
Message-ID: <44370328.2060508@hawaii.edu>

Tim Hochberg wrote:
[...]
>
>> 2. I don't like the name. "Missing" or "na" would be better.
>>
>
> I'm not on board here, although really I'd like to hear from other
> people who use the package. 'na' seems too cryptic to me and 'missing'
> too specific -- there might be other reasons to mask a value other than
> it being missing. The problem with mask is that it's not clear whether
> True means the data is useful or unuseful. Keep throwing out names,
> maybe one will stick.

"hide" or "hidden"? A mask value of True essentially hides the
underlying value.

Eric

From ndarray at mac.com Fri Apr 7 17:56:24 2006
From: ndarray at mac.com (Sasha)
Date: Fri Apr 7 17:56:24 2006
Subject: [Numpy-discussion] Re: ndarray.fill and ma.array.filled
In-Reply-To: <4436FF73.7080408@cox.net>
References: <4436AE31.7000306@cox.net> <4436C965.8020808@hawaii.edu>
	<4436D6D1.6040302@cox.net> <4436FF73.7080408@cox.net>
Message-ID: 

On 4/7/06, Tim Hochberg wrote:
> [...]
> Perhaps MaskedArray should inherit from ndarray for the time being. Many
> of the methods would need to be reimplemented anyway, but it would make
> asanyarray work.
>>Someone was just complaining about asarray munging his
>>arrays. That's correct behaviour, but it would be nice if asanyarray did
>>the right thing. I suppose we could just special case asanyarray to
>>ignore MaskedArrays, that might be better since it's less constraining
>>from an implementation side too.
>>
>
>Just for the record. Currently MA does not inherit from ndarray.
>
Right, I checked that. That's why asanyarray won't work now with MA
(unless someone changed the implementation of that while I wasn't
looking).

>There are some benefits to be gained from changing MA design from
>containment to inheritance, but I am very sceptical about the use of
>inheritance in the array setting.
>
That's probably a sensible position. Still it would be nice to have
asanyarray pass masked arrays through somehow. I haven't thought this
through very well, but I wonder if it would make sense for asanyarray to
pass any object that supplies __array__. I'm leery of special casing
asanyarray just for MA; somehow that seems the wrong approach.

>>This may be an opportune time to propose something that's been cooking in
>>the back of my head for a week or so now: A stripped down array
>>superclass.
>>
>
>This is a very worthwhile idea and I hate to see it buried in a
>non-descriptive thread. I've copied your proposal to the wiki at
>.
>
Thanks for doing that. I'm glad you like the general idea. I do plan to
write it through and try to get a better handle on what this would
entail and what the consequences would be. However, I'm not sure exactly
when I'll get around to it so it's probably better that a rough draft be
out there for people to think about in the interim.

-tim

From ndarray at mac.com Fri Apr 7 18:47:09 2006
From: ndarray at mac.com (Sasha)
Date: Fri Apr 7 18:47:09 2006
Subject: [Numpy-discussion] Re: ndarray.fill and ma.array.filled
In-Reply-To: <4436FF73.7080408@cox.net>
References: <4436AE31.7000306@cox.net> <4436C965.8020808@hawaii.edu>
	<4436D6D1.6040302@cox.net> <4436FF73.7080408@cox.net>
Message-ID: 

On 4/7/06, Tim Hochberg wrote:
> [...]
> >1. I don't like default fill value. It should be mandatory to
> >supply fill value.
> >
> That makes perfect sense. If anything should have a default fill value,
> it's the function calling filled, not the arrays themselves.
>
It looks like we are getting close to a consensus on this one. I will
remove fill_value attribute.

[...]
> >3. The name conflicts with the "fill" method.
> >
> I thought you wanted to kill that. I'd certainly support that. Can't we
> just special case __setitem__ for that one case so that the performance
> is just as good if performance is really the issue?
>
I'll propose a patch.

> >4. View/Copy inconsistency. Does not provide a method to fill values in-place.
> >
> b[b.mask] = fill_value; b.unmask()
>
> seems to work for this purpose. Can we just have filled return a copy?
>
+1

> >mask:
> >
> >1. I've got rid of mask returning None in favor of False_ (boolean
> >array scalar), but it is still not perfect. I would prefer data.shape
> >== mask.shape invariant and if space saving/performance is deemed
> >necessary use zero-stride arrays.
> >
> Interesting idea. Is that feasible yet?
>
It is not feasible in a pure python module like ma, but easy in ndarray.
We can also reset the writeable flag to avoid various problems that zero
strides may cause. I'll propose a patch.
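The zero-stride trick Sasha refers to can be sketched with today's
numpy: a single byte pretending to be a full-size constant mask, with
the writeable flag dropped so nothing can write through the aliasing
(the names here are made up):

import numpy as np
from numpy.lib.stride_tricks import as_strided

buf = np.zeros(1, dtype=bool)                       # one real element
mask = as_strided(buf, shape=(1000, 1000), strides=(0, 0))
mask.flags.writeable = False                        # guard the alias

print(mask.shape, mask.any())   # (1000, 1000) False -- costs one byte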
> >2. I don't like the name. "Missing" or "na" would be better.
> >
> I'm not on board here, although really I'd like to hear from other
> people who use the package. 'na' seems too cryptic to me and 'missing'
> too specific -- there might be other reasons to mask a value other than
> it being missing. The problem with mask is that it's not clear whether
> True means the data is useful or unuseful. Keep throwing out names,
> maybe one will stick.
>
The problem with the "mask" name is that ndarray already has an
unrelated "putmask" method. On the other hand putmask is redundant with
fancy indexing. I have no other problem with the "mask" name, so we may
just decide to get rid of "putmask".

> [...]
> How do you set the mask? I keep getting attribute errors when I try it.

a[i] = masked

makes the i-th element masked. If mask is an array, you can just set
its elements.

> And unmask would be a noop on an ndarray.
>
Yes.

[...]

From ndarray at mac.com Fri Apr 7 18:56:01 2006
From: ndarray at mac.com (Sasha)
Date: Fri Apr 7 18:56:01 2006
Subject: [Numpy-discussion] Re: ndarray.fill and ma.array.filled
In-Reply-To: <44371593.8060806@cox.net>
References: <4436AE31.7000306@cox.net> <4436C965.8020808@hawaii.edu>
	<4436D6D1.6040302@cox.net> <4436FF73.7080408@cox.net>
	<44371593.8060806@cox.net>
Message-ID: 

On 4/7/06, Tim Hochberg wrote:
> [...]
> Still it would be nice to have asanyarray pass masked arrays through
> somehow. I haven't thought this through very well, but I wonder if it
> would make sense for asanyarray to pass any object that supplies
> __array__. I'm leery of special casing asanyarray just for MA; somehow
> that seems the wrong approach.

One possibility is to make asanyarray pass through objects that have
the __array_wrap__ attribute.
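A sketch of the pass-through Sasha suggests; the helper name is
hypothetical, not numpy API (note that modern ndarrays and MaskedArrays
both expose __array_wrap__, so both would pass through untouched):

import numpy as np

def pass_through_wrappers(obj):
    # Anything that knows how to wrap results stays as-is;
    # everything else is coerced to a plain ndarray.
    if hasattr(obj, '__array_wrap__'):
        return obj
    return np.asarray(obj)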
"putmask" really seems overkill indeed. I wouldn't miss it. > How do you set the mask? I keep getting attribute errors when I try it. > And unmask would be a noop on an ndarray. I've implemented something like that for some classes (inheriting from MA.MaskedArray). Never really used it yet, though #-------------------------------------------- def applymask(self,m): if not MA.is_mask(m): raise MA.MAError,"Invalid mask !" elif self._data.shape != m.shape: raise MA.MAError,"Mask and data not compatible." else: self._dmask = m > This may be an oportune time to propose something that's been cooking in > the back of my head for a week or so now: A stripped down array > superclass. That'd be great indeed, and may solve some problems reported on th list about subclassing ndarray. AAMOF, I gave up trying to use ndarray as a superclass, and rely only on MA From zdm105 at tom.com Sat Apr 8 01:56:02 2006 From: zdm105 at tom.com (=?GB2312?B?NNTCMTUtMTbJz7qjLzIxLTIyye7b2g==?=) Date: Sat Apr 8 01:56:02 2006 Subject: [Numpy-discussion] =?GB2312?B?QUTUy9PDRVhDRUy02b34ytCzodOqz/rT67LGzvG53MDt?= Message-ID: An HTML attachment was scrubbed... URL: From webb.sprague at gmail.com Sat Apr 8 20:02:11 2006 From: webb.sprague at gmail.com (Webb Sprague) Date: Sat Apr 8 20:02:11 2006 Subject: [Numpy-discussion] Unexpected change of array used to index another array Message-ID: Hi. I indexed an 10 x 10(called bigM below) with another array (OFFS_TMP below). I suppose because OFFS_TMP has negative numbers, it was changed to cycle around to 9 wherever there is a negative 1 (which is the forward version of -1 if you are a 10 x 10 matrix). You can analogous behavior with -2 => 8, etc. Is changing the indexing matrix really the correct behavior? The result of using the index seems to be fine. Has this story been told already and I didn't know it? Below is my ipython session. 
In [57]: OFFS_TMP
Out[57]:
array([[-1,  1],
       [ 0,  1],
       [ 1,  1],
       [-1,  0],
       [ 0,  0],
       [ 1,  0],
       [-1, -1],
       [ 0, -1],
       [ 1, -1]])

In [58]: bigM[OFFS_TMP]
Out[58]:
array([[[False, True, False, False, True, False, True, True, True, False],
        [False, True, False, True, True, False, False, False, True, True]],
       [[True, False, True, False, True, True, False, False, False, True],
        [False, True, False, True, True, False, False, False, True, True]],
       [[False, True, False, True, True, False, False, False, True, True],
        [False, True, False, True, True, False, False, False, True, True]],
       [[False, True, False, False, True, False, True, True, True, False],
        [True, False, True, False, True, True, False, False, False, True]],
       [[True, False, True, False, True, True, False, False, False, True],
        [True, False, True, False, True, True, False, False, False, True]],
       [[False, True, False, True, True, False, False, False, True, True],
        [True, False, True, False, True, True, False, False, False, True]],
       [[False, True, False, False, True, False, True, True, True, False],
        [False, True, False, False, True, False, True, True, True, False]],
       [[True, False, True, False, True, True, False, False, False, True],
        [False, True, False, False, True, False, True, True, True, False]],
       [[False, True, False, True, True, False, False, False, True, True],
        [False, True, False, False, True, False, True, True, True, False]]], dtype=bool)

In [59]: OFFS_TMP
Out[59]:
array([[9, 1],
       [0, 1],
       [1, 1],
       [9, 0],
       [0, 0],
       [1, 0],
       [9, 9],
       [0, 9],
       [1, 9]])

From robert.kern at gmail.com Sat Apr 8 21:17:28 2006 From: robert.kern at gmail.com (Robert Kern) Date: Sat Apr 8 21:17:28 2006 Subject: [Numpy-discussion] Re: Unexpected change of array used to index another array In-Reply-To: References: Message-ID: Webb Sprague wrote: > Hi. > > I indexed a 10 x 10 array (called bigM below) with another array (OFFS_TMP > below). I suppose because OFFS_TMP has negative numbers, it was > changed to cycle around to 9 wherever there is a negative 1 (which is > the forward version of -1 if you are a 10 x 10 matrix). You can get > analogous behavior with -2 => 8, etc. Is changing the indexing matrix > really the correct behavior? The result of using the index seems to > be fine. Has this story been told already and I didn't know it? I think it's a bug. I've located the problem, but I'm not familiar with that part of the code so I'm not entirely sure how to go about fixing it. http://projects.scipy.org/scipy/numpy/ticket/49 -- Robert Kern robert.kern at gmail.com "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco
From webb.sprague at gmail.com Sun Apr 9 15:21:01 2006 From: webb.sprague at gmail.com (Webb Sprague) Date: Sun Apr 9 15:21:01 2006 Subject: [Numpy-discussion] numpy.floor() is supposed to return an int, but returns a float Message-ID: Could someone explain this behavior:

In [13]: type(N.floor(1))
Out[13]:

In [14]: N.floor?
Type: ufunc
String Form:
Namespace: Interactive
Docstring: y = floor(x) elementwise largest integer <= x

I wouldn't complain, except the only time I use floor() is to make indices (dividing ages by age widths, for example). Thanks!

From tim.hochberg at cox.net Sun Apr 9 15:30:02 2006 From: tim.hochberg at cox.net (Tim Hochberg) Date: Sun Apr 9 15:30:02 2006 Subject: [Numpy-discussion] numpy.floor() is supposed to return an int, but returns a float In-Reply-To: References: Message-ID: <44398AFD.4050304@cox.net> Webb Sprague wrote: >Could someone explain this behavior: > >In [13]: type(N.floor(1)) >Out[13]: > >In [14]: N.floor? >Type: ufunc >String Form: >Namespace: Interactive >Docstring: > y = floor(x) elementwise largest integer <= x > >I wouldn't complain, except the only time I use floor() is to make >indices (dividing ages by age widths, for example). > > Well, floor returns an integer, but not an int -- it's an integral floating point value. What you want is: numpy.floor(1).astype(int) (If you're only using scalars, you might also consider int(floor(x)) instead.) Regards, -tim >Thanks!

From webb.sprague at gmail.com Sun Apr 9 15:40:02 2006 From: webb.sprague at gmail.com (Webb Sprague) Date: Sun Apr 9 15:40:02 2006 Subject: [Numpy-discussion] numpy.floor() is supposed to return an int, but returns a float In-Reply-To: <44398AFD.4050304@cox.net> References: <44398AFD.4050304@cox.net> Message-ID: I think the docstring implies that numpy.floor() returns an integer value. One can cast the float value to a usable integer value, but either the docstring should read something different or the function should be changed (my preference). "y = floor(x) elementwise largest integer <= x" is the docstring. As far as "integral valued float" versus "integer", this distinction seems a little obscure... I am sure the difference is very important in some contexts, but I for one think that floor should return a straight up integer, if just for code style (see example below). Plus it will be upcast to a float whenever necessary, so floor(4.5) + .75 == 4.75 whether floor() returns an int or a float. fooMatrix[numpy.floor(age/ageWidth)] is better (easier to type, read, and debug) than fooMatrix[numpy.floor(age/ageWidth).astype(int)] If there is an explanation as to why an integral valued float is a better return value, I would be interested in a link.
Thx W

From robert.kern at gmail.com Sun Apr 9 15:46:04 2006 From: robert.kern at gmail.com (Robert Kern) Date: Sun Apr 9 15:46:04 2006 Subject: [Numpy-discussion] Re: numpy.floor() is supposed to return an int, but returns a float In-Reply-To: References: <44398AFD.4050304@cox.net> Message-ID: Webb Sprague wrote: > If there is an explanation as to why an integral valued float is a > better return value, I would be interested in a link.

In [4]: import numpy
In [5]: numpy.floor(2.**50)
Out[5]: 1125899906842624.0
In [6]: numpy.floor(2.**50).astype(int)
Out[6]: 2147483647

-- Robert Kern robert.kern at gmail.com "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco

From tim.hochberg at cox.net Sun Apr 9 16:07:02 2006 From: tim.hochberg at cox.net (Tim Hochberg) Date: Sun Apr 9 16:07:02 2006 Subject: [Numpy-discussion] numpy.floor() is supposed to return an int, but returns a float In-Reply-To: References: <44398AFD.4050304@cox.net> Message-ID: <443993E3.1090901@cox.net> Webb Sprague wrote: >I think the docstring implies that numpy.floor() returns an integer >value. > You've been programming too much! Everywhere but the computer programming world, 1.0 is an integer. Even there, many (most?) computer languages avoid the term integer, using int, Int or something similar. The distinction made between ints and integral floating point values is mostly an artificial one resulting from implementation issues. Making this distinction is also a handy, if imperfect, proxy for exact versus inexact numbers. >One can cast the float value to a usable integer value, but >either the docstring should read something different or the function >should be changed (my preference). > >"y = floor(x) elementwise largest integer <= x" is the docstring. > >As far as "integral valued float" versus "integer", this distinction >seems a little obscure... > An integral floating point value *is* an integer, just ask any 12-year-old. What's obscure is the way concepts of integers and reals get mapped to ints and floats. Don't get me wrong, these are reasonable compromises given the sad reality that computers are not so hot at representing infinite quantities. However, we get sucked into thinking that integers and ints are really the same things at our peril. Similarly for floats and reals. >I am sure the difference is very important >in some contexts, but I for one think that floor should return a >straight up integer, > It's a ufunc. Ufuncs in general return the same type that they operate on. So, not only would this be difficult, it would make the signature of ufuncs harder to remember. Also, as Robert Kern just pointed out, not all integral FP values can be represented as ints. > if just for code style (see example below). Plus >it will be upcast to a float whenever necessary, so floor(4.5) + .75 >== 4.75 whether floor() returns an int or a float. > > Not every two-line Python function has to come pre-written -- Tim Peters on C.L.P

    def webbsfloor(x):
        return numpy.floor(x).astype(int)

>fooMatrix[numpy.floor(age/ageWidth)] > >is better (easier to type, read, and debug) than > >fooMatrix[numpy.floor(age/ageWidth).astype(int)] > >If there is an explanation as to why an integral valued float is a >better return value, I would be interested in a link. > > I think there's at least four reasons: 1. It would be a pain. 2. It would make the ufuncs inconsistent. 3.
It's a thin wrapper over C's floor, so people coming from that language would be confused. 4. It wouldn't work for numbers with very large magnitudes. Pick any three. Regards, -tim

From tim.hochberg at cox.net Sun Apr 9 20:09:03 2006 From: tim.hochberg at cox.net (Tim Hochberg) Date: Sun Apr 9 20:09:03 2006 Subject: [Numpy-discussion] numpy.floor() is supposed to return an int, but returns a float In-Reply-To: <443993E3.1090901@cox.net> References: <44398AFD.4050304@cox.net> <443993E3.1090901@cox.net> Message-ID: <4439CC7E.90704@cox.net> Tim Hochberg wrote: > Webb Sprague wrote: > >> I think the docstring implies that numpy.floor() returns an integer >> value. > > You've been programming too much! > > Everywhere but the computer programming world, 1.0 is an integer. Even > there, many (most?) computer languages avoid the term integer, using > int, Int or something similar. The distinction made between ints and > integral floating point values is mostly an artificial one resulting > from implementation issues. Making this distinction is also a handy, > if imperfect, proxy for exact versus inexact numbers. > >> One can cast the float value to a usable integer value, but >> either the docstring should read something different or the function >> should be changed (my preference). >> >> "y = floor(x) elementwise largest integer <= x" is the docstring. > Let me just add that, since this seems to cause confusion, it would be appropriate to amend the docstring to be explicit that this always returns an integral floating point value. If someone wants to suggest wording, I can figure out where to put it. One possibility is: "y = floor(x) elementwise largest integer <= x; note that the result is a floating point value" or "y = floor(x) elementwise largest integral float <= x" Neither of those is great, but perhaps they'll inspire someone to do better. -tim >> >> As far as "integral valued float" versus "integer", this distinction >> seems a little obscure... >> > An integral floating point value *is* an integer, just ask any 12-year-old. > What's obscure is the way concepts of integers and reals get > mapped to ints and floats. Don't get me wrong, these are reasonable > compromises given the sad reality that computers are not so hot at > representing infinite quantities. However, we get sucked into > thinking that integers and ints are really the same things at our > peril. Similarly for floats and reals. > >> I am sure the difference is very important >> in some contexts, but I for one think that floor should return a >> straight up integer, >> > It's a ufunc. Ufuncs in general return the same type that they operate > on. So, not only would this be difficult, it would make the signature > of ufuncs harder to remember. > > Also, as Robert Kern just pointed out, not all integral FP values can > be represented as ints. > >> if just for code style (see example below). Plus >> it will be upcast to a float whenever necessary, so floor(4.5) + .75 >> == 4.75 whether floor() returns an int or a float. >> >> > Not every two-line Python function has to come pre-written -- Tim > Peters on C.L.P > > def webbsfloor(x): > return numpy.floor(x).astype(int) > >> fooMatrix[numpy.floor(age/ageWidth)] >> >> is better (easier to type, read, and debug) than >> >> fooMatrix[numpy.floor(age/ageWidth).astype(int)] >> >> If there is an explanation as to why an integral valued float is a >> better return value, I would be interested in a link. >> >> > I think there's at least four reasons: > > 1. It would be a pain. > 2.
It would make the ufuncs inconsistent. > 3. It's a thin wrapper over C's floor, so people coming from that > language would be confused. > 4. It wouldn't work for numbers with very large magnitudes. > > Pick any three. > > Regards, > > -tim

From charlesr.harris at gmail.com Sun Apr 9 22:12:02 2006 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sun Apr 9 22:12:02 2006 Subject: [Numpy-discussion] numpy.floor() is supposed to return an int, but returns a float In-Reply-To: <4439CC7E.90704@cox.net> References: <44398AFD.4050304@cox.net> <443993E3.1090901@cox.net> <4439CC7E.90704@cox.net> Message-ID: Tim, On 4/9/06, Tim Hochberg wrote: > Let me just add that, since this seems to cause confusion, it would be > appropriate to amend the docstring to be explicit that this always > returns an integral floating point value. If someone wants to suggest > wording, I can figure out where to put it. One possibility is: > > "y = floor(x) elementwise largest integer <= x; note that the result > is a floating point value" > > or > > "y = floor(x) elementwise largest integral float <= x" How about, "for each item in x returns the largest integral float <= item." Chuck P.S. I too once found the C definition of the floor function annoying, but I got used to it. Sorta like getting used to a broken leg. The main problem is that the result can't be used as an index without conversion to a "real" integer. Integers aren't members of the reals (or rationals): apart from +/- 1, integers don't have inverses. There happens to be an injective ring homomorphism of the integers into the reals, but that is not the same thing. On the other hand, ints are generally not big enough to hold all of the integral doubles, so as a practical matter the originators made the best choice. Things do get a bit weird for large floats because above a certain threshold floats are already integral values.

From charlesr.harris at gmail.com Sun Apr 9 22:21:02 2006 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sun Apr 9 22:21:02 2006 Subject: [Numpy-discussion] Unexpected change of array used to index another array In-Reply-To: References: Message-ID: On 4/8/06, Webb Sprague wrote: > > Hi. > > I indexed a 10 x 10 array (called bigM below) with another array (OFFS_TMP > below). I suppose because OFFS_TMP has negative numbers, it was > changed to cycle around to 9 wherever there is a negative 1 (which is > the forward version of -1 if you are a 10 x 10 matrix). You can get > analogous behavior with -2 => 8, etc. Is changing the indexing matrix > really the correct behavior? The result of using the index seems to > be fine. Has this story been told already and I didn't know it? It's the python way:

    >>> a = [1,2,3]
    >>> a[-1]
    3

It gives a convenient way to index from the end of the array. But I'm not sure that was your question. Chuck

From robert.kern at gmail.com Mon Apr 10 00:02:01 2006 From: robert.kern at gmail.com (Robert Kern) Date: Mon Apr 10 00:02:01 2006 Subject: [Numpy-discussion] Re: Unexpected change of array used to index another array In-Reply-To: References: Message-ID: Charles R Harris wrote: > > On 4/8/06, *Webb Sprague* > wrote: > > Hi. > > I indexed a 10 x 10 array (called bigM below) with another array (OFFS_TMP > below).
I suppose because OFFS_TMP has negative numbers, it was > changed to cycle around to 9 wherever there is a negative 1 (which is > the forward version of -1 if you are a 10 x 10 matrix). You can get > analogous behavior with -2 => 8, etc. Is changing the indexing matrix > really the correct behavior? The result of using the index seems to > be fine. Has this story been told already and I didn't know it? > > It's the python way: > >>>> a = [1,2,3] >>>> a[-1] > 3 > > It gives a convenient way to index from the end of the array. But I'm > not sure that was your question. That's not the issue. The problem was that the index array was being modified in-place simply by being used as an index array. -- Robert Kern robert.kern at gmail.com "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco

From arnd.baecker at web.de Mon Apr 10 04:01:05 2006 From: arnd.baecker at web.de (Arnd Baecker) Date: Mon Apr 10 04:01:05 2006 Subject: [Numpy-discussion] Speed up function on cross product of two sets? In-Reply-To: <4434D6DF.2020306@ieee.org> References: <44315633.4010600@cox.net> <4434D6DF.2020306@ieee.org> Message-ID: On Thu, 6 Apr 2006, Travis Oliphant wrote: > Arnd Baecker wrote: > > BTW, it seems that we have no Numeric to numpy transition remarks in > > www.scipy.org. I only found > > http://www.scipy.org/PearuPeterson/NumpyVersusNumeric > > and of course Travis' "Guide to NumPy" contains a detailed list of > > necessary changes in chapter 2.6.1. > > > For clarification: this is in the sample chapter available on-line to > all.... yes, I should have emphasized that. I tried to make this also clearer at http://www.scipy.org/Converting_from_Numeric > > In addition ``site-packages/numpy/lib/convertcode.py`` provides an > > automatic conversion. > > > > Would it be helpful to start a new wiki page "ConvertingFromNumeric" > > (similar to http://www.scipy.org/Converting_from_numarray) > > which aims at summarizing the necessary changes > > or expand Pearu's page (if he agrees) on this? > > > > Absolutely. I did the Numarray page because I'd written a lot on > Converting from Numeric (even providing convertcode.py) but very little > for numarray --- except the ndimage conversion. So, I started the > Numarray page. Sounds like a great idea to have a dual page. Best, Arnd P.S.: BTW +1 to all that has been said in the other thread on NumPy documentation - you are really doing a brilliant job, Travis!!!

From webb.sprague at gmail.com Mon Apr 10 07:16:04 2006 From: webb.sprague at gmail.com (Webb Sprague) Date: Mon Apr 10 07:16:04 2006 Subject: [Numpy-discussion] Unexpected change of array used to index another array In-Reply-To: References: Message-ID: > > It's the python way: > > >>> a = [1,2,3] > >>> a[-1] > 3 > > It gives a convenient way to index from the end of the array. But I'm not > sure that was your question. No, there was a bug: when using one matrix to index another, the indexing matrix gets changed. As if you did:

    >>> a = [1,2,3]
    >>> i = -1
    >>> a[i]
    3
    >>> print i
    2

(as though merely using i as an index had turned it into the equivalent positive index). I know about the negative trick in simple python lists; I was trying to do something in matrices (where it works too), but that wasn't the issue. Thanks for trying to help, though. W
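For the record, a minimal sketch of the reported misbehavior (the numbers follow the session above; this is the bug tracked in ticket #49, and fixed versions leave the index array untouched):

    import numpy as np

    idx = np.array([[-1, 1], [0, 1]])
    bigM = np.zeros((10, 10), dtype=bool)
    _ = bigM[idx]      # the indexing itself works fine
    print(idx[0, 0])   # -1 on fixed versions; 9 on the buggy 2006 builds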
From webb.sprague at gmail.com Mon Apr 10 07:19:22 2006 From: webb.sprague at gmail.com (Webb Sprague) Date: Mon Apr 10 07:19:22 2006 Subject: [Numpy-discussion] numpy.floor() is supposed to return an int, but returns a float In-Reply-To: References: <44398AFD.4050304@cox.net> <443993E3.1090901@cox.net> <4439CC7E.90704@cox.net> Message-ID: > > "y = floor(x) elementwise largest integer <= x; note that the result > > is a floating point value" I prefer this, if it makes any difference. The others are more succinct, but less likely to help others in my situation. > I too once found the C definition of the floor function annoying, but I got > used to it. Sorta like getting used to a broken leg. Annoying yes, crippling no. I guess I should have grown up on a real programming language :)

From tim.hochberg at cox.net Mon Apr 10 09:13:03 2006 From: tim.hochberg at cox.net (Tim Hochberg) Date: Mon Apr 10 09:13:03 2006 Subject: [Numpy-discussion] numpy.floor() is supposed to return an int, but returns a float In-Reply-To: References: <44398AFD.4050304@cox.net> <443993E3.1090901@cox.net> <4439CC7E.90704@cox.net> Message-ID: <443A844C.7070306@cox.net> Charles R Harris wrote: > Tim, > > On 4/9/06, *Tim Hochberg* > wrote: > > Let me just add that, since this seems to cause confusion, it would be > appropriate to amend the docstring to be explicit that this always > returns an integral floating point value. If someone wants to suggest > wording, I can figure out where to put it. One possibility is: > > "y = floor(x) elementwise largest integer <= x; note that the > result > is a floating point value" > > or > > "y = floor(x) elementwise largest integral float <= x" > > > How about, "for each item in x returns the largest integral float <= > item." That seems pretty good. I'll wait a day or so and see what else shows up. > > Chuck > > P.S. > > I too once found the C definition of the floor function annoying, but > I got used to it. Sorta like getting used to a broken leg. The main > problem is that the result can't be used as an index without > conversion to a "real" integer. Integers aren't members of the reals > (or rationals): apart from +/- 1, integers don't have inverses. > There happens to be an injective ring homomorphism of the integers > into the reals, but that is not the same thing. I'm not conversant with the terminology [here I rummage through google to try to get the terminology sort of right], but as I understand it integers (I) are a subset of reals (R). The ring that you construct with integers consists of the set of integers plus the operations of addition/subtraction and multiplication as well as an identity. I've seen that specified as something like (I, +/-, *, 0). Similarly, the set of reals (R) and the field that one constructs from them are not really the same thing. So while the ring of integers is not a subset of the field of reals (the statement doesn't even make sense when put that way), the set of integers is a subset of the set of reals. I think that most people, outside of computer programmers and perhaps math majors, think of the set of integers, not the field of integers, to the extent that they think about integers and reals at all. I imagine most people would conjure up some Dali-like image when confronted with the notion of a field of integers! (C-int, +/-, *, 0) actually forms a finite ring, which is not at all the same thing as the ring of integers.
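For instance, a quick illustration of the difference (a sketch; whether the overflow wraps silently, warns, or raises depends on the numpy version and error state):

    import numpy as np

    a = np.int32(2147483647)   # largest 32-bit int
    print(a + np.int32(1))     # -2147483648: modular wraparound,
                               # something true integers never do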
Bit twiddlers tend to understand and even exploit this, but a lot of people conflate the field of ints with the field of integers. This works fine as long as your values are small in magnitude, but eventually it will rise up and bite you. Floats are even worse, since they don't even form a field; I think they're actually a semiring because of INF/NAN/IND, but I'm not certain about that. Issues with floating point pop up everywhere and if you squint the right way, you can blame them on their lack of fieldness. Which is closely tied to their finite range and precision, which is what bites people. Because Python automatically promotes (Python) ints to (Python) longs, Python ints map, for most purposes, onto the field of integers. However, in numpy we're stuck using C-ints for performance reasons, so we'd be wise to keep the differences between ints and integers in the back of our mind. This is wandering rather far afield (although it's entertaining). > On the other hand, ints are generally not big enough to hold all of > the integral doubles, so as a practical matter the originators made > the best choice. Things do get a bit weird for large floats because > above a certain threshold floats are already integral values. Another issue at the moment is that integer division does an implicit flooring or truncation (I believe it's implementation dependent in C) in both C and Python, so if you aren't using floor to produce an index, something I've been known to do, having it return an integer could also lead to nasty surprises. For example:

    def half_integer(x):
        "return nearest half integer below x"
        return floor(2*x) / 2

would start failing mysteriously. Of course the above is an overflow magnet, so perhaps it's not the best example. Eventually, '/' is going to mean true_division and '//' will mean floor_division, so this particular issue will go away. Regards, -tim

From bsouthey at gmail.com Mon Apr 10 09:16:08 2006 From: bsouthey at gmail.com (Bruce Southey) Date: Mon Apr 10 09:16:08 2006 Subject: [Numpy-discussion] Re: ndarray.fill and ma.array.filled In-Reply-To: References: <4436C965.8020808@hawaii.edu> <4436D6D1.6040302@cox.net> <200604071844.37724.pgmdevlist@mailcan.com> Message-ID: Hi, On 4/7/06, Sasha wrote: > On 4/7/06, Pierre GM wrote: > > ... > > We're going towards MA as the default object. > > > I will be against changing the array structure to handle missing > values. Let's keep the discussion focused on the interface. Once we > agree on the interface, it will be clear if any structural changes are > necessary. > > > > But then again, what would be the behavior to deal with missing values ? > > We can postpone this discussion as well. Just adding a mask attribute that > returns False and a filled method that returns a copy is an example of a > minimalistic change. I think that the usage of MA is important because this often dictates the interface. The other aspect is the penalty that is imposed by requiring masked features, especially in situations that don't need any of these features. > > > Using R-like na.actions ? That'd be great, but it's getting more complex. > > > I don't like na.actions. I think missing values should behave like > IEEE NaNs and in the floating point case should be represented by > NaNs. I think the issue relates to how masked values should be handled in computation. Does it matter if the result of an operation is due to a masked value or a numerical problem (like dividing by zero)? (I am presuming that it is possible to identify this difference.)
If not, then I support the idea of treating masked values as NaN. >The functionality provided by na.actions can always be achieved > by calling an extra function (filled or compress). I am not clear on what you actually mean here. For example, if you are summing across a particular dimension, I would presume that any masked value would be ignored and that there would be some record of the fact that a masked value was encountered. This would allow that 'extra function' to handle the associated result. Alternatively the 'extra function' would have to be included as an argument - which is what the na.actions do. Regards Bruce

From ndarray at mac.com Mon Apr 10 09:49:05 2006 From: ndarray at mac.com (Sasha) Date: Mon Apr 10 09:49:05 2006 Subject: [Numpy-discussion] Re: ndarray.fill and ma.array.filled In-Reply-To: References: <4436C965.8020808@hawaii.edu> <4436D6D1.6040302@cox.net> <200604071844.37724.pgmdevlist@mailcan.com> Message-ID: On 4/10/06, Bruce Southey wrote: > > [...] > I think the issue relates to how masked values should be handled in > computation. Does it matter if the result of an operation is due to a > masked value or a numerical problem (like dividing by zero)? (I am > presuming that it is possible to identify this difference.) If not, > then I support the idea of treating masked values as NaN. > The IEEE standard provides plenty of spare bits in NaNs to represent pretty much everything, and some languages take advantage of that feature. (I believe NA and NaN are distinct in R.) In MA, however, mask elements are boolean and no distinction is made between various reasons for not having a data element. For consistency, a non-trivial (not always false) implementation of ndarray.mask should return "not finite" and ignore bits that distinguish NaNs and infinities. > >The functionality provided by na.actions can always be achieved > > by calling an extra function (filled or compress). > > I am not clear on what you actually mean here. For example, if you > are summing across a particular dimension, I would presume that any > masked value would be ignored and that there would be some record of > the fact that a masked value was encountered. This would allow that > 'extra function' to handle the associated result. Alternatively the > 'extra function' would have to be included as an argument - which is > what the na.actions do. > If you sum along a particular dimension and encounter a masked value, the result is masked. The same is true if you encounter a NaN - the result is NaN. If you would like to ignore masked values, you write a.filled(0).sum() instead of a.sum(). In the 1-d case, you can also use a.compress().sum(). In other words, what in R you achieve with a flag, such as in sum(a, na.rm=TRUE), in numpy you achieve by an explicit call to "fill". This is not quite the same as na.actions in R, but that is what I had in mind.
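A short sketch of these idioms as they survive in today's numpy.ma (note the modern method is spelled compressed(), and numpy.ma ultimately kept the omit-masked-values behavior for sum):

    import numpy.ma as ma

    a = ma.array([1.0, 2.0, 3.0], mask=[0, 1, 0])
    print(a.sum())               # 4.0 -- the masked addend is omitted
    print(a.filled(0).sum())     # 4.0 -- same result, but the 0 is explicit
    print(a.compressed().sum())  # 4.0 -- drop masked entries (1-d only)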
From pgmdevlist at mailcan.com Mon Apr 10 10:58:02 2006 From: pgmdevlist at mailcan.com (Pierre GM) Date: Mon Apr 10 10:58:02 2006 Subject: [Numpy-discussion] Re: ndarray.fill and ma.array.filled In-Reply-To: References: Message-ID: <200604101356.44903.pgmdevlist@mailcan.com> > If you sum along a particular dimension and encounter a masked value, > the result is masked. That's not how it currently works (still on 0.9.6):

    x = arange(12).reshape(3,4)
    MA.masked_where((x%5==0) | (x%3==0), x).sum(0)
    array(data = [12 1 2 18], mask = [False False False False], fill_value=999999)

and frankly, I'd be quite frustrated if it had to change: - `filled` is not an ndarray method, which means that a.filled(0).sum() fails if a is not MA. Right now, I can use a.sum() without having to check the nature of a first. - this behavior was already in Numeric - All my scripts rely on it (but I guess that's my problem) - The current way reflects how masks are used in GIS or image processing. > If you would like to ignore masked values, you write > a.filled(0).sum() instead of a.sum(). In the 1-d case, you can also use > a.compress().sum(). Once again, Sasha, I'd agree with you if it wasn't a major difference > In other words, what in R you achieve with a > flag, such as in sum(a, na.rm=TRUE), in numpy you achieve by an > explicit call to "fill". This is not quite the same as na.actions in > R, but that is what I had in mind. I kinda like the idea of a flag, though

From ndarray at mac.com Mon Apr 10 11:37:00 2006 From: ndarray at mac.com (Sasha) Date: Mon Apr 10 11:37:00 2006 Subject: [Numpy-discussion] Re: ndarray.fill and ma.array.filled In-Reply-To: <200604101356.44903.pgmdevlist@mailcan.com> References: <200604101356.44903.pgmdevlist@mailcan.com> Message-ID: On 4/10/06, Pierre GM wrote: > > If you sum along a particular dimension and encounter a masked value, > > the result is masked. > > That's not how it currently works (still on 0.9.6): > > [... longish example snipped ...]

    >>> ma.array([1,1], mask=[0,1]).sum()
    1

> and frankly, I'd be quite frustrated if it had to change: > - `filled` is not an ndarray method, which means that a.filled(0).sum() fails > if a is not MA. Right now, I can use a.sum() without having to check the > nature of a first. This is exactly the point of the current discussion: make filled a method of ndarray. With the current behavior, how would you achieve masking (no fill) of a.sum()? > - this behavior was already in Numeric That's true, but it makes the result of sum(a) different from __builtins__.sum(a). I believe consistency with the python conventions is more important than with legacy Numeric in the long run. > [...] > - The current way reflects how masks are used in GIS or image processing. > Can you elaborate on this? Note that in R na.rm is false by default in sum: > sum(c(1,NA)) [1] NA So it looks like the convention is different in the field of statistics. > > If you would like to ignore masked values, you write > > a.filled(0).sum() instead of a.sum(). In the 1-d case, you can also use > > a.compress().sum(). > > Once again, Sasha, I'd agree with you if it wasn't a major difference Array methods are a very recent addition to ma. We can still use this window of opportunity to get things right before too many people get used to the wrong behavior. (Note that I changed your implementation of cumsum and cumprod.) > > > In other words, what in R you achieve with a > > flag, such as in sum(a, na.rm=TRUE), in numpy you achieve by an > > explicit call to "fill". This is not quite the same as na.actions in > > R, but that is what I had in mind. > > I kinda like the idea of a flag, though With the flag approach, making ndarray and ma.array interfaces consistent would require adding an extra argument to many methods. Instead, I propose to add one method, filled, to ndarray.
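A hypothetical sketch of what the proposal amounts to: give plain arrays a trivial filled, so the same expression works whether or not the input is masked (a function form is shown here; the actual patch would add a method):

    import numpy as np

    def filled(a, value):
        if hasattr(a, 'filled'):   # masked array: replace masked slots
            return a.filled(value)
        return np.asarray(a)       # plain ndarray: nothing to fill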
From pgmdevlist at mailcan.com Mon Apr 10 13:37:07 2006 From: pgmdevlist at mailcan.com (Pierre GM) Date: Mon Apr 10 13:37:07 2006 Subject: [Numpy-discussion] Re: ndarray.fill and ma.array.filled In-Reply-To: References: <200604101356.44903.pgmdevlist@mailcan.com> Message-ID: <200604101638.29979.pgmdevlist@mailcan.com> > > [... longish example snipped ...] > > >>> ma.array([1,1], mask=[0,1]).sum() > > 1 So ? The result is not `masked`, the missing value has been omitted.

    MA.array([[1,1],[1,1]], mask=[[0,1],[1,0]]).sum()
    array(data = [1 1], mask = [False False], fill_value=999999)

> This is exactly the point of the current discussion: make filled a > method of ndarray. Mrf. I'm still not convinced, but I have nothing against it. Along with a mask=False_ by default ? > With the current behavior, how would you achieve masking (no fill) of a.sum()? Er, why would I want to get MA.masked along one axis if one value is masked ? The current behavior is to mask only if all the values along that axis are masked:

    MA.array([[1,1],[1,1]], mask=[[0,1],[1,1]]).sum()
    array(data = [1 999999], mask = [False True], fill_value=999999)

With a.filled(0).sum(), how would you distinguish between the cases (a) at least one value is not masked and (b) all values are masked ? (OK, by querying the mask with something in the line of a._mask.all(axis), but it's longer... Oh well, I'll just have to adapt) > > - this behavior was already in Numeric > > That's true, but it makes the result of sum(a) different from > __builtins__.sum(a). I believe consistency with the python > conventions is more important than with legacy Numeric in the long > run. > > Array methods are a very recent addition to ma. We can still use this > window of opportunity to get things right before too many people get > used to the wrong behavior. (Note that I changed your implementation > of cumsum and cumprod.) Good points... We'll just have to put strong warnings everywhere. > > > - The current way reflects how masks are used in GIS or image processing. > > Can you elaborate on this? Note that in R na.rm is false by default in sum: > > sum(c(1,NA)) > > [1] NA > > So it looks like the convention is different in the field of statistics. MMh. *digs in his old GRASS scripts* OK, my bad. I had to fill missing values somehow, or at least check whether there were any before processing. I'll double check on that. Please temporarily forget that comment. > With the flag approach, making ndarray and ma.array interfaces > consistent would require adding an extra argument to many methods. > Instead, I propose to add one method, filled, to ndarray. OK, good point. On a semantic aspect: While digging these GRASS scripts I mentioned, I realized/remembered that masked values are called 'null', when there's no data, a NaN, or just when you want to hide some values. What about 'null' instead of 'mask', 'missing', 'na' ?
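The explicit query Pierre alludes to, sketched with the public helper getmaskarray rather than the private ._mask:

    import numpy.ma as ma

    a = ma.array([[1, 1], [1, 1]], mask=[[0, 1], [1, 1]])
    totals = a.filled(0).sum(axis=0)        # [1 0]
    empty = ma.getmaskarray(a).all(axis=0)  # [False  True]
    result = ma.array(totals, mask=empty)   # [1 --]: the all-masked column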
> >MA.array([[1,1],[1,1]],mask=[[0,1],[1,0]]).sum() >array(data = [1 1], mask = [False False], fill_value=999999) > > > > >>This is exactly the point of the current discussion: make fill a >>method of ndarray. >> >> >Mrf. I'm still not convinced, but I have nothing against it. Along with a >mask=False_ by default ? > > > >>With the current behavior, how would you achieve masking (no fill) a.sum()? >> >> >Er, why would I want to get MA.masked along one axis if one value is masked ? > > Any number of reasons I would think. It depends on what your using the data for. If the sum is the total amount that you spent in the month, and a masked value means you lost that check stub, then you don't know how much you actually spent and that value should be masked. To chose a boring example. >The current behavior is to mask only if all the values along that axis are >masked: > >MA.array([[1,1],[1,1]],mask=[[0,1],[1,1]]).sum() >array(data = [1 999999], mask = [False True], fill_value=999999) > >With a.filled(0).sum(), how would you distinguish between the cases (a) at >least one value is not masked and (b) all values are masked ? (OK, by >querying the mask with something in the line of a a._mask.all(axis), but it's >longer... Oh well, I'll just to adapt) > > Actually I'm going to ask you the same question. Why would care if all of the values are masked? I may be missing something, but either there's a sensible default value, in which case it doesn't matter how many values are masked, or you can't handle any masked values and the result should be masked if there are any masks in the input. Sasha's proposal handle those two cases well. Your behaviour a little more clunkily, but I'd like to understand why you want that behaviour. Regards, -tim > > >>>- this behavior was already in Numeric >>> >>> >>That's true, but it makes the result of sum(a) different from >>__builtins__.sum(a). I believe consistency with the python >>conventions is more important than with legacy Numeric in the long >>run. >> >>Array methods are a very recent addition to ma. We can still use this >>window of opportunity to get things right before to many people get >>used to the wrong behavior. (Note that I changed your implementation >>of cumsum and cumprod.) >> >> > >Good points... We'll just have to put strong warnings everywhere. > > > >>>- The current way reflects how mask are used in GIS or image processing. >>> >>> >>Can you elaborate on this? Note that in R na.rm is false by default in sum: >> >> >>>sum(c(1,NA)) >>> >>> >>[1] NA >> >>So it looks like the convention is different in the field of statistics. >> >> > >MMh. *digs in his old GRASS scripts* >OK, my bad. I had to fill missing values somehow, or at least check whether >there were any before processing. I'll double check on that. Please >temporarily forget that comment. > > > >>With the flag approach making ndarray and ma.array interfaces >>consistent would require adding an extra argument to many methods. >>Instead, I poropose to add one method: fill to ndarray. >> >> >OK, good point. > > >On a semantic aspect: >While digging these GRASS scripts I mentioned, I realized/remembered that >masked values are called 'null', when there's no data, a NAN, or just when >you want to hide some values. What about 'null' instead of >'mask','missing','na' ? > > > >------------------------------------------------------- >This SF.Net email is sponsored by xPML, a groundbreaking scripting language >that extends applications into web and mobile media. 
From oliphant at ee.byu.edu Mon Apr 10 15:07:06 2006 From: oliphant at ee.byu.edu (Travis Oliphant) Date: Mon Apr 10 15:07:06 2006 Subject: [Numpy-discussion] Recarray and shared datas In-Reply-To: <200604061020.k36AKIsQ018238@decideur.info> References: <200604061020.k36AKIsQ018238@decideur.info> Message-ID: <443AD6CF.4010800@ee.byu.edu> Benjamin Thyreau wrote: >Hi, >Numpy has a nice feature of recarray, i.e. records which can hold column names. >I'd like to use such a feature in order to better interact with R, i.e. passing >R data to python without copy. The current rpy bindings do a full copy, and >convert to a simple ndarray. Looking at the recarray api in the Guide, >and also at the source code, i don't find any recarray constructor which can >take shared data (all the examples from section 8.6 are doing copies). >Is there some way to do it ? In Python or in C ? Or are there any plans to ? > > Yes, you can share data with a recarray because a "recarray" is just a numpy array with a fancy data-type and with attribute access overriding to do "field" lookups if the attribute cannot otherwise be found. What exactly are you trying to share data with? I'm having a hard time understanding how to answer your question without more information. Best, -Travis

From oliphant at ee.byu.edu Mon Apr 10 15:14:05 2006 From: oliphant at ee.byu.edu (Travis Oliphant) Date: Mon Apr 10 15:14:05 2006 Subject: [Numpy-discussion] Tiling / disk storage for matrix in numpy? In-Reply-To: References: Message-ID: <443AD889.7020004@ee.byu.edu> Webb Sprague wrote: >Hi all, > >Is there a way in numpy to associate a (large) matrix with a disk >file, then tile and index it, then cache it as you process the >various pieces? This is pretty important with massive image files, >which can't fit into working memory, but in which (for example) you >might be doing a convolution on a 100 x 100 pixel window on a small >subset of the image. > > I suppose if you used a memory-mapped array, then you would be at the mercy of the operating system's caching. But, this would be the easiest way. -Travis

From oliphant at ee.byu.edu Mon Apr 10 15:21:07 2006 From: oliphant at ee.byu.edu (Travis Oliphant) Date: Mon Apr 10 15:21:07 2006 Subject: [Numpy-discussion] Re: ndarray.fill and ma.array.filled In-Reply-To: References: <4436AE31.7000306@cox.net> Message-ID: <443ADA43.8060400@ee.byu.edu> Sasha wrote: > > On 4/7/06, *Tim Hochberg* > wrote: > > ... > In general, I'm skeptical of adding more methods to the ndarray object > -- there are plenty already. > > I've also proposed to drop "fill" in favor of optimizing x[...] = <value>. Having both "fill" and "filled" in the interface is plain > awkward. You may like the combined proposal better because it does > not change the total number of methods :-) > > In addition, it appears that both the method and function versions of > filled are "dangerous" in the sense that they sometimes return the > array itself and sometimes a copy. > > This is true in ma, but may certainly be changed. > > Finally, changing ndarray to support masked array feels a bit like the > tail wagging the dog. > > I disagree.
Numpy is pretty much alone among the array languages > because it does not have "native" support for missing values. For the > floating point types some rudimentary support for nans exists, but it is > not really usable. There is no missing values mechanism for integer > types. I believe adding "filled" and maybe "mask" to ndarray (not > necessarily under these names) could be a meaningful step towards > "native" support for missing values. Supporting missing values is a useful thing (but not for every usage of arrays). Thus, ultimately, I see missing-value arrays as a solid sub-class of the basic array class. I'm glad Sasha is working on missing value arrays and have tried to be supportive. I'm a little hesitant to add a special-case method basically for one particular sub-class, though, unless it is the only workable solution. We are still exploring this whole sub-class space and have not really mastered it... -Travis

From oliphant at ee.byu.edu Mon Apr 10 15:44:07 2006 From: oliphant at ee.byu.edu (Travis Oliphant) Date: Mon Apr 10 15:44:07 2006 Subject: [Numpy-discussion] Re: ndarray.fill and ma.array.filled In-Reply-To: <4436FF73.7080408@cox.net> References: <4436AE31.7000306@cox.net> <4436C965.8020808@hawaii.edu> <4436D6D1.6040302@cox.net> <4436FF73.7080408@cox.net> Message-ID: <443ADF9A.9050001@ee.byu.edu> > This may be an opportune time to propose something that's been cooking > in the back of my head for a week or so now: A stripped down array > superclass. The details of this are not at all locked down, but here's > a strawman proposal. This is in essence what I've been proposing since SciPy 2005. I want what goes into Python to be essentially just this super-class. Look at this http://numeric.scipy.org/array_interface.html and check out this

    svn co http://svn.scipy.org/svn/PEP arrayPEP

I've obviously been way over-booked to do this myself. Nick Coghlan expressed interest in this idea (he called it dimarray, but I like basearray better). > > We add an array superclass, call it basearray, that has the same > C-structure as the existing ndarray. However, it has *no* methods or > attributes. Why not give it the attributes corresponding to its C-structure? I'm happy with no methods though. > 1. If we're careful, this could be the basic array object that > we propose, at least for the first round, for inclusion in the > Python core. It's not useful for anything but passing data between > various applications that understand the data structure, but that in > itself could be a huge win. And the fact that it's dirt simple would > probably be an advantage to getting it into the core. The only extra thing I'm proposing is to add the data-descriptor object into the Python core as well --- otherwise what do you do with the PyArray_Descr * part of the C-structure? > 2. It provides a useful marker class. MA could inherit from it > (and use itself for its data attribute) and then asanyarray would > behave properly. MA could also use this, or a subclass, as the mask > object, preventing anyone from accidentally using it as data (they > could always use it on purpose with asarray). > 3. It provides a platform for people to build other, > ndarray-like classes in pure Python. This is my main interest. I've > put together a thin shell over numpy that strips it down to its > absolute essentials including a stripped down version of ndarray that > removes most of the methods.
All of the __array_wrap__[1] stuff > works quite well most of the time, but there are still some issues > with being a subclass when this particular class is conceptually a > superclass. If we had an array superclass of some sort, I believe > that these would be resolved. > > In principle at least, this shouldn't be that hard. I think it should > mostly be rearranging some code and adding some wrappers to existing > functions. That's in principle. In practice, I'm not certain yet as I > haven't investigated the code in question in much depth yet. I've been > meaning to write this up into a more fleshed out proposal, but I got > distracted by the whole Protocol discussion on python-dev3000. This > writeup is pretty weak, but hopefully you get the idea. This is exactly what needs to be done to improve array-support in Python. This is the conclusion I came to and I'm glad to see that Tim has now basically reached the same conclusion. There are obviously some details to work out. But, having a base structure to inherit from would be perfect. -Travis

From oliphant at ee.byu.edu Mon Apr 10 15:49:01 2006 From: oliphant at ee.byu.edu (Travis Oliphant) Date: Mon Apr 10 15:49:01 2006 Subject: [Numpy-discussion] Re: ndarray.fill and ma.array.filled In-Reply-To: <200604072258.34153.pgmdevlist@mailcan.com> References: <4436FF73.7080408@cox.net> <200604072258.34153.pgmdevlist@mailcan.com> Message-ID: <443AE0A1.3000002@ee.byu.edu> Pierre GM wrote: >>decide to get rid of "putmask". >> > >"putmask" really seems overkill indeed. I wouldn't miss it. > I'm not opposed to getting rid of putmask either. Several of the newer methods are open for discussion before 1.0. I'd have to check to be sure, but .take and .put are not entirely replaced by fancy-indexing. Also, fancy indexing has enough overhead that a method doing exactly what you want is faster. -Travis

From ndarray at mac.com Mon Apr 10 16:06:00 2006 From: ndarray at mac.com (Sasha) Date: Mon Apr 10 16:06:00 2006 Subject: [Numpy-discussion] Re: ndarray.fill and ma.array.filled In-Reply-To: <200604101638.29979.pgmdevlist@mailcan.com> References: <200604101356.44903.pgmdevlist@mailcan.com> <200604101638.29979.pgmdevlist@mailcan.com> Message-ID: On 4/10/06, Pierre GM wrote: > > > [... longish example snipped ...] > > > > > >>> ma.array([1,1], mask=[0,1]).sum() > > > > 1 > So ? The result is not `masked`, the missing value has been omitted. > I am just making your point with a shorter example. > [...] > Mrf. I'm still not convinced, but I have nothing against it. Along with a > mask=False_ by default ? > It looks like there is little opposition here. I'll submit a patch soon and unless better names are suggested, it will probably go in. > > With the current behavior, how would you achieve masking (no fill) of a.sum()? > Er, why would I want to get MA.masked along one axis if one value is masked ? Because if you don't know one of the addends you don't know the sum. Replacing missing values with zeros is not always the right strategy. If you know that your data has non-zero mean, for example, you might want to replace missing values with the mean instead of zero.
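A hypothetical helper sketching the propagating behavior argued for here (numpy.ma's own sum omits masked values instead):

    import numpy.ma as ma

    def strict_sum(a, axis=None):
        return ma.array(a.filled(0).sum(axis),
                        mask=ma.getmaskarray(a).any(axis))

    a = ma.array([[1, 1], [1, 1]], mask=[[0, 1], [1, 0]])
    print(strict_sum(a, axis=0))  # [-- --]: each column has a masked entry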
> The current behavior is to mask only if all the values along that axis are > masked: > > MA.array([[1,1],[1,1]], mask=[[0,1],[1,1]]).sum() > array(data = [1 999999], mask = [False True], fill_value=999999) > I did not realize that, but it is really bad. What is the justification for this? In R:

    > sum(c(NA,NA), na.rm=TRUE)
    [1] 0

What does MATLAB do in this case? > With a.filled(0).sum(), how would you distinguish between the cases (a) at > least one value is not masked and (b) all values are masked ? (OK, by > querying the mask with something in the line of a._mask.all(axis), but it's > longer... Oh well, I'll just have to adapt) > Exactly. Explicit is better than implicit. The Zen of Python. > > > - this behavior was already in Numeric > > That's true, but it makes the result of sum(a) different from > > __builtins__.sum(a). I believe consistency with the python > > conventions is more important than with legacy Numeric in the long > > run. > > > > Array methods are a very recent addition to ma. We can still use this > > window of opportunity to get things right before too many people get > > used to the wrong behavior. (Note that I changed your implementation > > of cumsum and cumprod.) > > Good points... We'll just have to put strong warnings everywhere. > Do you agree with my proposal as long as we have explicit warnings in the documentation that methods behave differently from legacy functions? > [... GIS comment snipped ...] > > With the flag approach, making ndarray and ma.array interfaces > > consistent would require adding an extra argument to many methods. > > Instead, I propose to add one method, filled, to ndarray. > OK, good point. > > On a semantic aspect: > While digging these GRASS scripts I mentioned, I realized/remembered that > masked values are called 'null', when there's no data, a NaN, or just when > you want to hide some values. What about 'null' instead of > 'mask', 'missing', 'na' ? > I don't think "null" returning an array of bools will create a lot of enthusiasm. It sounds more like ma.masked as in a[i] = ma.masked. Besides, there is probably a reason why python uses the name "None" instead of "Null" - I just don't know what it is :-).

From tim.hochberg at cox.net Mon Apr 10 16:09:03 2006 From: tim.hochberg at cox.net (Tim Hochberg) Date: Mon Apr 10 16:09:03 2006 Subject: [Numpy-discussion] Re: ndarray.fill and ma.array.filled In-Reply-To: <443ADF9A.9050001@ee.byu.edu> References: <4436AE31.7000306@cox.net> <4436C965.8020808@hawaii.edu> <4436D6D1.6040302@cox.net> <4436FF73.7080408@cox.net> <443ADF9A.9050001@ee.byu.edu> Message-ID: <443AE5C7.8010804@cox.net> Travis Oliphant wrote: > >> This may be an opportune time to propose something that's been cooking >> in the back of my head for a week or so now: A stripped down array >> superclass. The details of this are not at all locked down, but >> here's a strawman proposal. > > This is in essence what I've been proposing since SciPy 2005. I want > what goes into Python to be essentially just this super-class. > Look at this http://numeric.scipy.org/array_interface.html > > and check out this > > svn co http://svn.scipy.org/svn/PEP arrayPEP > > I've obviously been way over-booked to do this myself. Nick > Coghlan expressed interest in this idea (he called it dimarray, but I > like basearray better). I'll look these over. I suppose I should have been paying more attention before! >> >> We add an array superclass, call it basearray, that has the same >> C-structure as the existing ndarray.
>> However, it has *no* methods or >> attributes. > > Why not give it the attributes corresponding to its C-structure? I'm > happy with no methods though. Mainly because I didn't want to argue too much about whether a given method or attribute was a good idea, and I was in a hurry when I tossed that proposal out. It seemed better to start with the most stripped down proposal I could come up with and see what people demanded I add. I'm actually sort of inclined to give it *read-only* attributes associated with the C-structure, but no methods. That way you can examine the shape, type, etc., but you can't set them [I'm specifically thinking of shape here, but there may be others]. I think that there are cases where you don't want the base array to be mutable at all, but I don't think introspection should be a problem. If the attributes were settable, you could always override them with readonly properties, but it'd be cleaner to just start with readonly functionality and add setability (is that a word?) only in those cases where it's needed. > >> 1. If we're careful, this could be the basic array object that >> we propose, at least for the first round, for inclusion in the >> Python core. It's not useful for anything but passing data between >> various applications that understand the data structure, but that in >> itself could be a huge win. And the fact that it's dirt simple would >> probably be an advantage to getting it into the core. > > The only extra thing I'm proposing is to add the data-descriptor > object into the Python core as well --- otherwise what do you do > with the PyArray_Descr * part of the C-structure? Good point. > >> 2. It provides a useful marker class. MA could inherit from it >> (and use itself for its data attribute) and then asanyarray would >> behave properly. MA could also use this, or a subclass, as the mask >> object, preventing anyone from accidentally using it as data (they >> could always use it on purpose with asarray). > >> 3. It provides a platform for people to build other, >> ndarray-like classes in pure Python. This is my main interest. I've >> put together a thin shell over numpy that strips it down to its >> absolute essentials including a stripped down version of ndarray that >> removes most of the methods. All of the __array_wrap__[1] stuff >> works quite well most of the time, but there are still some issues >> with being a subclass when this particular class is conceptually a >> superclass. If we had an array superclass of some sort, I believe >> that these would be resolved. >> >> In principle at least, this shouldn't be that hard. I think it should >> mostly be rearranging some code and adding some wrappers to existing >> functions. That's in principle. In practice, I'm not certain yet as I >> haven't investigated the code in question in much depth yet. I've >> been meaning to write this up into a more fleshed out proposal, but I >> got distracted by the whole Protocol discussion on python-dev3000. >> This writeup is pretty weak, but hopefully you get the idea. > > This is exactly what needs to be done to improve array-support in > Python. This is the conclusion I came to and I'm glad to see that Tim > has now basically reached the same conclusion. There are obviously > some details to work out. But, having a base structure to inherit > from would be perfect. > Hmm. This idea seems to have a fair bit of consensus behind it. I guess that means I'd better look into exactly what it would take to make it work. The details of what attributes to expose, etc. are probably not too important to work out immediately. Regards, -tim
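A pure-Python strawman of the proposed basearray: just the fields of the C structure, exposed read-only, with no arithmetic and no other methods (a hypothetical sketch following the discussion, not a real API):

    class basearray(object):
        def __init__(self, data, shape, dtype, strides):
            self._data = data
            self._shape = tuple(shape)
            self._dtype = dtype
            self._strides = tuple(strides)

        # read-only views of the C-level fields
        shape = property(lambda self: self._shape)
        dtype = property(lambda self: self._dtype)
        strides = property(lambda self: self._strides)
        data = property(lambda self: self._data)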
The details of what attributes to expose, etc are probably not too important to work out immediately. Regards, -tim From pierregm at engr.uga.edu Mon Apr 10 16:24:01 2006 From: pierregm at engr.uga.edu (Pierre GM) Date: Mon Apr 10 16:24:01 2006 Subject: [Numpy-discussion] Re: ndarray.fill and ma.array.filled In-Reply-To: <443AC5CB.2000704@cox.net> References: <200604101638.29979.pgmdevlist@mailcan.com> <443AC5CB.2000704@cox.net> Message-ID: <200604101923.36290.pierregm@engr.uga.edu> > [Sasha] > > So ? The result is not `masked`, the missing value has been omitted. > I am just making your point with a shorter example. OK, now I get it :) > >Er, why would I want to get MA.masked along one axis if one value is > > masked ? > > [Tim] > Any number of reasons I would think. I understand that, and I eventually agree it should be the default. > [Sasha] > Because if you don't know one of the addends you don't know the sum. Unless you want to discard some data on purpose. > Replacing missing values with zeros is not always the right strategy. > If you know that your data has non-zero mean, for example, you might > want to replace missing values with the mean instead of zero. Hence the need to get rid of filled_values >[Tim] > Actually I'm going to ask you the same question. Why would care if all > of the values are masked? > > MA.array([[1,1],[1,1]],mask=[[0,1],[1,1]]).sum() > > array(data = [1 999999], mask = [False True], fill_value=999999) > > [Sasha] > I did not realize that, but it is really bad. What is the > justification for this? Masked values are not necessarily nans or missing. I quite regularly mask values that do not satisfy a given condition. For various reasons, I can't compress the array, I need to preserve its shape. With the current behavior, a.sum() gives me the sum of the values that satisfy the condition. If there's no such value, the result is masked, and that way I know that the condition was never met. Here, I could use Sasha's method combined with a._mask.all, no problem Another example: let x a 2D array with missing values, to be normalized along one axis. Currently, x/x.sum() give the result I want (provided it's true division). Sasha's method would give me a completely masked array. > > Good points... We'll just have to put strong warnings everywhere. > [Sasha] > Do you agree with my proposal as long as we have explicit warnings in > the documentation that methods behave differently from legacy > functions? Your points are quite valid. I'm just worried it's gonna break a lot of things in the next future. And where do we stop ? So, if we follow Sasha's way: x.prod() should be the same, right ? What about a.min(), a.max() ? a.mean() ? From oliphant at ee.byu.edu Mon Apr 10 16:37:06 2006 From: oliphant at ee.byu.edu (Travis Oliphant) Date: Mon Apr 10 16:37:06 2006 Subject: [Numpy-discussion] Re: weird interaction: pickle, numpy, matplotlib.hist In-Reply-To: <44366E71.7060601@gmail.com> References: <4433DF85.7030109@gmail.com> <4434E31B.5030306@ieee.org> <44366E71.7060601@gmail.com> Message-ID: <443AEC07.5070904@ee.byu.edu> Andrew Jaffe wrote: > Travis Oliphant wrote: > >> But, this brings up the point that currently the pickled raw-data >> which is read-in as a string by Python is used as the memory for the >> new array (i.e. the string memory is "stolen"). This should work. >> The fact that it didn't with sort was a bug that is now fixed in >> SVN. However, operations on out-of-byte-order arrays will always be >> slower. 
Thus, perhaps on pickle read the data should be copied to >> native byte-order if necessary. > > > +1 from me, too. I assume that byteswapping is fast compared to I/O in > most cases, and the only times when you wouldn't want it would be > 'advanced' usage that the developer could take control of via a custom > reduce, __getstate__, __setstate__, etc. > There was one reasonable objection, and one proposal to further complicate the array object to handle both cases :-) But most were supportive of automatic conversion to the platform byte-order on pickle-read. This is probably what most people expect if they are using Pickle anyway. So, I've added it to SVN. -Travis From michael.sorich at gmail.com Mon Apr 10 16:45:07 2006 From: michael.sorich at gmail.com (Michael Sorich) Date: Mon Apr 10 16:45:07 2006 Subject: [Numpy-discussion] Recarray and shared datas In-Reply-To: <200604061020.k36AKIsQ018238@decideur.info> References: <200604061020.k36AKIsQ018238@decideur.info> Message-ID: <16761e100604101644v1c447aa1xb646e1d44d8672f8@mail.gmail.com> On 4/6/06, Benjamin Thyreau wrote: > > Hi, > Numpy has a nice feature of recarray, ie. record which can hold columns > names. > I'd like to use such a feature in order to better interact with R, ie. > passing > R datas to python without copy. The current rpy bindings do a full copy, > and > convert to simple ndarray. Looking at the recarray api in the Guide, > and also at the source code, i don't find any recarray constructor which > can > get shared datas (all the examples from section 8.6 are doing copies). > Is there some way to do it ? in Python or in C ? Or is there any plans to > ? As a current user of rpy (at least until I can easily do the equivalent in numpy/scipy) this sound very interesting. What will happen if the R data.frame has NA data? I don't think the recarray can currently handle masked data. Oh well, one step forward at a time. Good luck. Mike -------------- next part -------------- An HTML attachment was scrubbed... URL: From michael.sorich at gmail.com Mon Apr 10 17:18:15 2006 From: michael.sorich at gmail.com (Michael Sorich) Date: Mon Apr 10 17:18:15 2006 Subject: [Numpy-discussion] Re: ndarray.fill and ma.array.filled In-Reply-To: References: <200604101356.44903.pgmdevlist@mailcan.com> <200604101638.29979.pgmdevlist@mailcan.com> Message-ID: <16761e100604101717y6a8dbecat4800d8a77bb3615a@mail.gmail.com> On 4/11/06, Sasha wrote: > > On 4/10/06, Pierre GM wrote: > > > > [... longish example snipped ...] > > > > > > > >>> ma.array([1,1], mask=[0,1]).sum() > > > > > > 1 > > So ? The result is not `masked`, the missing value has been omitted. > > > I am just making your point with a shorter example. > > > [...] > > Mrf. I'm still not convinced, but I have nothing against it. Along with > a > > mask=False_ by default ? > > > It looks like there is little opposition here. I'll submit a patch > soon and unless better names are suggested, it will probably go in. > > > > With the current behavior, how would you achieve masking (no fill) > a.sum()? > > Er, why would I want to get MA.masked along one axis if one value is > masked ? > > Because if you don't know one of the addends you don't know the sum. > Replacing missing values with zeros is not always the right strategy. > If you know that your data has non-zero mean, for example, you might > want to replace missing values with the mean instead of zero. I feel that in general implicitly replacing masked values will definitely lead to bugs in my code. 
Unless it is really obvious what the best way to deal with the masked
values is for the particular function, I would definitely prefer to be
explicit about it. In most cases there are a number of reasonable
options for what can be done. Masking the result when masked values are
involved seems the most transparent default option. For example, it
gives me a really bad feeling to think that sum will automatically
return the sum of all non-masked values. When dealing with large
datasets, I will not always know when I need to be careful of missing
values. Summing over the non-masked arrays will often not be the
appropriate course and I fear that I will not notice that this has
actually occurred. If masked values are returned it is pretty obvious
what has happened and easy to go back and explicitly handle the masked
data in another way if appropriate.

Mike
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From ndarray at mac.com Mon Apr 10 19:46:00 2006
From: ndarray at mac.com (Sasha)
Date: Mon Apr 10 19:46:00 2006
Subject: [Numpy-discussion] Recarray and shared datas
In-Reply-To: <16761e100604101644v1c447aa1xb646e1d44d8672f8@mail.gmail.com>
References: <200604061020.k36AKIsQ018238@decideur.info>
	<16761e100604101644v1c447aa1xb646e1d44d8672f8@mail.gmail.com>
Message-ID:

This thread probably belongs to rpy-list, so I'll cross-post.

I may be wrong, but I think R data frames are stored column-wise unlike
recarrays. This also means that data sharing between R and numpy is
feasible even without recarrays. RPy support for doing this should
probably wait until RPy 2.0, when R objects become wrapped in a Python
type. That type will need to provide the __array_struct__ interface to
allow data sharing.

NA data handling in numpy is a topic of an active discussion now. A
numpy array with data shared with an R vector will see NAs differently
for different types. For ints, it will be INT_MIN (-2^31 on 32-bit
machines); for floats it will be a NaN with some special bit-pattern in
the mantissa and thus not fully compatible with numpy's nan.

I would like to use this cross-post as an opportunity to invite RPy
users to participate in numpy's discussion of missing (or masked)
values. See the "ndarray.fill and ma.array.filled" thread.

On 4/10/06, Michael Sorich wrote:
> On 4/6/06, Benjamin Thyreau wrote:
> > Hi,
> > Numpy has a nice feature of recarray, ie. record which can hold
> > columns names.
> > I'd like to use such a feature in order to better interact with R,
> > ie. passing R datas to python without copy. The current rpy bindings
> > do a full copy, and convert to simple ndarray. Looking at the
> > recarray api in the Guide, and also at the source code, i don't find
> > any recarray constructor which can get shared datas (all the examples
> > from section 8.6 are doing copies).
> > Is there some way to do it ? in Python or in C ? Or is there any
> > plans to ?
>
> As a current user of rpy (at least until I can easily do the equivalent
> in numpy/scipy) this sound very interesting. What will happen if the R
> data.frame has NA data? I don't think the recarray can currently handle
> masked data. Oh well, one step forward at a time. Good luck.
> > Mike > > > From tim.hochberg at cox.net Mon Apr 10 19:49:01 2006 From: tim.hochberg at cox.net (Tim Hochberg) Date: Mon Apr 10 19:49:01 2006 Subject: [Numpy-discussion] Re: ndarray.fill and ma.array.filled In-Reply-To: <443AE0A1.3000002@ee.byu.edu> References: <4436FF73.7080408@cox.net> <200604072258.34153.pgmdevlist@mailcan.com> <443AE0A1.3000002@ee.byu.edu> Message-ID: <443B1957.7060301@cox.net> Travis Oliphant wrote: > Pierre GM wrote: > >>> decide to get rid of "putmask". >>> >> >> >> "putmask" really seems overkill indeed. I wouldn't miss it. >> >> > > I'm not opposed to getting rid of putmask either. Several of the > newer methods are open for discussion before 1.0. I'd have to check > to be sure, but .take and .put are not entirely replaced by > fancy-indexing. Also, fancy indexing has enough overhead that a > method doing exactly what you want is faster. I'm curious, what use cases does fancy indexing not handle that take works for? Not counting speed issues. Regards, -tim From bsouthey at gmail.com Tue Apr 11 12:47:02 2006 From: bsouthey at gmail.com (Bruce Southey) Date: Tue Apr 11 12:47:02 2006 Subject: [Numpy-discussion] Re: ndarray.fill and ma.array.filled In-Reply-To: <200604101923.36290.pierregm@engr.uga.edu> References: <200604101638.29979.pgmdevlist@mailcan.com> <443AC5CB.2000704@cox.net> <200604101923.36290.pierregm@engr.uga.edu> Message-ID: Hi, My view is solely as user so I really do appreciate the thought that you all are putting into this! I am somewhat concerned that having to use filled() is an extra level of complexity and computational burden. For example, in computing the mean/average I using filled would require a one effort to get the sum and another to count the non-masked elements. For at least summation would it make more sense to add an optional flag(s) such that there appears little difference between a normal array and a masked array? For example, a.sum() is the current default a.sum(filled_value=x) where x is some value such as zero or other user defined value. a.sum(ignore_mask=True) or similar to address whether or not masked values should be used. I am also not clear on what happens with other operations or dimensions. Regards Bruce On 4/10/06, Pierre GM wrote: > > [Sasha] > > > So ? The result is not `masked`, the missing value has been omitted. > > I am just making your point with a shorter example. > > OK, now I get it :) > > > > >Er, why would I want to get MA.masked along one axis if one value is > > > masked ? > > > > [Tim] > > Any number of reasons I would think. > > I understand that, and I eventually agree it should be the default. > > > [Sasha] > > Because if you don't know one of the addends you don't know the sum. > Unless you want to discard some data on purpose. > > > Replacing missing values with zeros is not always the right strategy. > > If you know that your data has non-zero mean, for example, you might > > want to replace missing values with the mean instead of zero. > Hence the need to get rid of filled_values > > >[Tim] > > Actually I'm going to ask you the same question. Why would care if all > > of the values are masked? > > > > MA.array([[1,1],[1,1]],mask=[[0,1],[1,1]]).sum() > > > array(data = [1 999999], mask = [False True], fill_value=999999) > > > > [Sasha] > > I did not realize that, but it is really bad. What is the > > justification for this? > > Masked values are not necessarily nans or missing. I quite regularly mask > values that do not satisfy a given condition. 
For various reasons, I can't > compress the array, I need to preserve its shape. > > With the current behavior, a.sum() gives me the sum of the values that satisfy > the condition. If there's no such value, the result is masked, and that way I > know that the condition was never met. Here, I could use Sasha's method > combined with a._mask.all, no problem > > Another example: let x a 2D array with missing values, to be normalized along > one axis. Currently, x/x.sum() give the result I want (provided it's true > division). Sasha's method would give me a completely masked array. > > > > > Good points... We'll just have to put strong warnings everywhere. > > [Sasha] > > Do you agree with my proposal as long as we have explicit warnings in > > the documentation that methods behave differently from legacy > > functions? > > Your points are quite valid. I'm just worried it's gonna break a lot of things > in the next future. And where do we stop ? So, if we follow Sasha's way: > x.prod() should be the same, right ? What about a.min(), a.max() ? a.mean() ? > > > > > > > ------------------------------------------------------- > This SF.Net email is sponsored by xPML, a groundbreaking scripting language > that extends applications into web and mobile media. Attend the live webcast > and join the prime developer group breaking into this new coding territory! > http://sel.as-us.falkag.net/sel?cmd=lnk&kid=110944&bid=241720&dat=121642 > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/numpy-discussion > From travis at enthought.com Tue Apr 11 13:11:04 2006 From: travis at enthought.com (Travis N. Vaught) Date: Tue Apr 11 13:11:04 2006 Subject: [Numpy-discussion] ANN: SciPy 2006 Conference Message-ID: <443C0D36.80608@enthought.com> Greetings, The *SciPy 2006 Conference* is scheduled for August 17-18, 2006 at CalTech. A tremendous amount of work has gone into SciPy and Numpy over the past few months, and the scientific python community around these and other tools has truly flourished[1]. The Scipy 2006 Conference is an excellent opportunity to exchange ideas, learn techniques, contribute code and affect the direction of scientific computing with Python. Conference details are at http://www.scipy.org/SciPy2006 Keynote ------- Python language author Guido van Rossum (!) has agreed to be the Keynote speaker at this year's Conference. http://www.python.org/~guido/ Registration: ------------- Registration is now open. You may register early online for $100.00 at http://www.enthought.com/scipy06. Registration includes breakfast and lunch Thursday & Friday and a very nice dinner Thursday night. After July 14, 2006, registration will cost $150.00. Call for Presenters ------------------- If you are interested in presenting at the conference, you may submit an abstract in Plain Text, PDF or MS Word formats to abstracts at scipy.org -- the deadline for abstract submission is July 7, 2006. Papers and/or presentation slides are acceptable and are due by August 4, 2006. Tutorial Sessions ----------------- Several people have expressed interest in attending a tutorial session. The Wednesday before the conference might be a good day for this. Please email the list if you have particular topics that you are interested in. 
Here's a preliminary list: - Migrating from Numeric or Numarray to Numpy - 2D Visualization with Python - 3D Visualization with Python - Introduction to Scientific Computing with Python - Building Scientific Simulation Applications - Traits/TraitsUI Please rate these and add others in a subsequent thread to the SciPy-user mailing list. Perhaps we can pick 4-6 top ideas and recruit speakers as demand dictates. The authoritative list will be tracked here: http://www.scipy.org/SciPy2006/TutorialSessions Coding Sprints -------------- If anyone would like to arrive earlier (Monday and Tuesday the 14th and 15th of August), we can borrow a room on the CalTech campus to sit and code against particular libraries or apps of interest. Please register your interest in these coding sprints on the SciPy-user mailing list as well. The authoritative list will be tracked here: http://www.scipy.org/SciPy2006/CodingSprints Mailing list address: scipy-user at scipy.org Mailing list archives: http://dir.gmane.org/gmane.comp.python.scientific.user Mailing list signup: http://www.scipy.net/mailman/listinfo/scipy-user [1] Some stats: NumPy has averaged over 16,000 downloads per month Sept. 05 to March 06. SciPy has averaged over 3,800 downloads per month in Feb. and March 06. (both scipy and numpy figures do not include the 2000 instances per month downloaded as part of the Python Enthought Edition Distribution for Windows.) From rowen at cesmail.net Tue Apr 11 13:32:14 2006 From: rowen at cesmail.net (Russell E. Owen) Date: Tue Apr 11 13:32:14 2006 Subject: [Numpy-discussion] Re: ndarray.fill and ma.array.filled References: <4436AE31.7000306@cox.net> Message-ID: In article , Sasha wrote: > I disagree. Numpy is pretty much alone among the array languages because it > does not have "native" support for missing values. For the floating point > types some rudimental support for nans exists, but is not really usable. > There is no missing values machanism for integer types. I believe adding > "filled" and maybe "mask" to ndarray (not necessarily under these names) > could be a meaningful step towards "native" support for missing values. I completely agree with this. I would really like to see proper native support for arrays with masked values in numpy (such that all ufuncs, functions, etc. work with masked arrays). I would be thrilled to be able to filter masked arrays, for instance. -- Russell From tim.hochberg at cox.net Tue Apr 11 16:15:04 2006 From: tim.hochberg at cox.net (Tim Hochberg) Date: Tue Apr 11 16:15:04 2006 Subject: [Numpy-discussion] Let's blame Java [was ndarray.fill and ma.array.filled] In-Reply-To: References: <200604101638.29979.pgmdevlist@mailcan.com> <443AC5CB.2000704@cox.net> <200604101923.36290.pierregm@engr.uga.edu> Message-ID: <443C38BE.8090606@cox.net> As I understand it, the goal that Sasha is pursuing here is to make masked arrays and normal arrays interchangeable as much as practical. I believe that there is reasonable consensus that this is desirable. Sasha has proposed a compromise solution that adds minimal attributes to ndarray while allowing a lot of interoperability between ma and ndarray. However it has it's clunky aspects as evidenced by the pushback he's been getting from masked array users. Here's one example. In the masked array context it seems perfectly reasonable to pass a fill value to sum. That is: x.sum(fill=0.0) But, if you want to preserve interoperability, that means you have to add fill arguments to all of the ndarray methods and what do you have? A mess! 
Particularly if some *other* package comes along that we decide is
important to support in the same manner as ma. Then we have another set
of methods or keyword args that we need to tack on to ndarray. Ugh!

However, I know who, or rather what, to blame for our problems: the
object-oriented hype industry in general and Java in particular <0.1
wink>. Why? Because the root of the problem here is the move from
functions to methods in numpy. I appreciate a nice method as much as
the next person, but they're not always better than the equivalent
function and in this case they're worse.

Let's fantasize for a minute that most of the methods of ndarray
vanished and instead we went back to functions. Just to show that I'm
not a total purist, I'll let the mask attribute stay on both
MaskedArray and ndarray. However, filled bites the dust on *both*
MaskedArray and ndarray just like the rest. How would we deal with sum
then? Something like this:

# ma.py

def filled(x, fill):
    x = x.copy()
    if x.mask is not False:
        x[x.mask] = fill   # write the fill value into the masked slots
        x.unmask()         # then drop the mask
    return x

def sum(x, axis, fill=None):
    if fill is not None:
        x = filled(x, fill)
    # I'm blowing off the correct treatment of the fill=None case here
    # because I'm lazy
    return add.reduce(x, axis)

# numpy.py (or __init__ or oldnumeric or something)

def sum(x, axis):
    if x.mask is not False:
        raise ValueError("use ma.sum for masked arrays")
    return add.reduce(x, axis)

[Fixing the fill=None case and dealing correctly with dtype is left as
an exercise for the reader.]

All of a sudden all of the problems we're running into go away. Users
of masked arrays simply use the functions from ma and can use ndarrays
and masked arrays interchangeably. On the other hand, users of
non-masked arrays aren't burdened with the extra interface and if they
accidentally get passed a masked array they quickly find out about it
(you don't want to be accidentally using masked arrays in an
application that doesn't expect them -- that way lies disaster).

I realize that railing against methods is tilting at windmills, but
somehow I can't help myself ;-|

Regards,

-tim

From aisaac at american.edu Tue Apr 11 20:45:01 2006
From: aisaac at american.edu (Alan G Isaac)
Date: Tue Apr 11 20:45:01 2006
Subject: [Numpy-discussion] reminder: dtype for empty, zeros, ones
Message-ID:

I notice that the empty, ones, and zeros still have an integer default
dtype (numpy 0.9.6). I had the impression that this was slated to
change to a float dtype, on the reasonable assumption that new users
will otherwise be surprised. Perhaps I remember this incorrectly.

Cheers,
Alan Isaac

From tim.hochberg at cox.net Tue Apr 11 21:27:00 2006
From: tim.hochberg at cox.net (Tim Hochberg)
Date: Tue Apr 11 21:27:00 2006
Subject: [Numpy-discussion] Let's blame Java [was ndarray.fill and
	ma.array.filled]
In-Reply-To: <443C38BE.8090606@cox.net>
References: <200604101638.29979.pgmdevlist@mailcan.com>
	<443AC5CB.2000704@cox.net> <200604101923.36290.pierregm@engr.uga.edu>
	<443C38BE.8090606@cox.net>
Message-ID: <443C81E2.4090800@cox.net>

[Tim rants a lot]

Just to be clear, I'm not advocating getting rid of methods. I'm not
advocating anything, that just seems to get me into trouble ;-)

I still blame Java though.
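For what it's worth, usage under the function-style sketch in the
previous message (a hypothetical API; the module names are just the
ones from that sketch) would look something like:

    import ma, numpy
    s = ma.sum(x, axis=0, fill=0.0)  # masked-aware, fill made explicit
    t = numpy.sum(y, axis=0)         # plain arrays; raises on masked input

so the choice of semantics is visible at the call site.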
Regards,

-tim

From stefan at sun.ac.za Tue Apr 11 22:47:14 2006
From: stefan at sun.ac.za (Stefan van der Walt)
Date: Tue Apr 11 22:47:14 2006
Subject: [Numpy-discussion] sqrt and divide
Message-ID: <20060412054517.GA27756@sun.ac.za>

Hi all

Two quick questions regarding unintuitive numpy behaviour:

Why is the square root of -1 not equal to the square root of -1+0j?

In [5]: N.sqrt(-1.)
Out[5]: nan

In [6]: N.sqrt(-1.+0j)
Out[6]: 1j

Is there an easier way of dividing two scalars than using divide?

In [9]: N.divide(1.,0)
Out[9]: inf

(also

In [8]: N.divide(1,0)
Out[8]: 0

should probably return inf / nan?)

Regards
Stéfan

From robert.kern at gmail.com Tue Apr 11 23:16:03 2006
From: robert.kern at gmail.com (Robert Kern)
Date: Tue Apr 11 23:16:03 2006
Subject: [Numpy-discussion] Re: sqrt and divide
In-Reply-To: <20060412054517.GA27756@sun.ac.za>
References: <20060412054517.GA27756@sun.ac.za>
Message-ID:

Stefan van der Walt wrote:
> Hi all
>
> Two quick questions regarding unintuitive numpy behaviour:
>
> Why is the square root of -1 not equal to the square root of -1+0j?
>
> In [5]: N.sqrt(-1.)
> Out[5]: nan
>
> In [6]: N.sqrt(-1.+0j)
> Out[6]: 1j

It is frequently the case that the argument being passed to sqrt() is
expected to be non-negative and all of the surrounding code strictly
deals with numbers in the real domain. If the argument happens to be
negative, then it is a sign of a bug earlier in the code or a floating
point instability. Returning nan gives the programmer the opportunity
for sqrt() to complain loudly and expose bugs instead of silently
upcasting to a complex type. Programmers who *do* want to work in the
complex domain can easily perform the cast explicitly.

> Is there an easier way of dividing two scalars than using divide?
>
> In [9]: N.divide(1.,0)
> Out[9]: inf

x/y ?

> (also
>
> In [8]: N.divide(1,0)
> Out[8]: 0
>
> should probably return inf / nan?)

inf and nan are floating point values. The definition of int division
used when both arguments to divide() are ints also yields ints, not
floats.

--
Robert Kern
robert.kern at gmail.com

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
  -- Umberto Eco
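For illustration, the explicit cast Robert describes amounts to this (a
minimal sketch; expected outputs shown as comments):

    import numpy as N
    N.sqrt(-1.)           # nan: the computation stays in the real domain
    N.sqrt(complex(-1.))  # 1j: the caller opts in to the complex domain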
From faltet at carabos.com Wed Apr 12 01:51:12 2006
From: faltet at carabos.com (Francesc Altet)
Date: Wed Apr 12 01:51:12 2006
Subject: [Numpy-discussion] Tiling / disk storage for matrix in numpy?
In-Reply-To:
References:
Message-ID: <200604121050.15552.faltet@carabos.com>

On Friday 07 April 2006 19:30, Webb Sprague wrote:
> Hi all,
>
> Is there a way in numpy to associate a (large) matrix with a disk
> file, then tile and index it, then cache it as you process the
> various pieces? This is pretty important with massive image files,
> which can't fit into working memory, but in which (for example) you
> might be doing a convolution on a 100 x 100 pixel window on a small
> subset of the image.
>
> I know that caching algorithms are (1) complicated and (2) never
> general. But there you go.
>
> Perhaps I can't find it, perhaps it would be a good project for the
> future? If HDF or something does this already, could someone point me
> in the right direction?

In addition to using shared memory arrays, you may also want to
experiment with compressing images on-disk and reading small chunks to
operate with them in-memory. This has the advantage that, if your image
is compressible enough (and most of them are), the total size of the
image in-file will be smaller, leaving more room for the underlying OS
filesystem cache to fit larger areas of the image.

Here is a small PyTables program that exemplifies the concept:

import tables
import numpy

# Create a container for the image in the file
f = tables.openFile('image.h5', 'w')
img = f.createEArray(f.root, 'img',
                     tables.Atom(shape=(1024,0), dtype='Int32',
                                 flavor='numpy'),
                     filters=tables.Filters(complevel=1),
                     expectedrows=1024)

# Add 1024 rows to the image
for i in xrange(1024):
    img.append((numpy.random.randn(1024,1)*1024).astype('int32'))
img.flush()

# Get small chunks of the image in memory and operate with them
cs = 100
for i in xrange(0, 1024-2*cs, cs):
    # Get 100x100 squares
    chunk1 = img[i:i+cs, i:i+cs]
    chunk2 = img[i+cs:i+2*cs, i+cs:i+2*cs]
    chunk3 = chunk1*chunk2  # Trivial operation with them

f.close()

Cheers,

--
>0,0<   Francesc Altet     http://www.carabos.com/
V   V   Cárabos Coop. V.   Enjoy Data
 "-"

From stefan at sun.ac.za Wed Apr 12 05:43:27 2006
From: stefan at sun.ac.za (Stefan van der Walt)
Date: Wed Apr 12 05:43:27 2006
Subject: [Numpy-discussion] Vectorize bug
Message-ID: <20060412124032.GA30471@sun.ac.za>

Hello all

Vectorize segfaults for large arrays. I filed the bug at
http://projects.scipy.org/scipy/numpy/ticket/52

The offending code is

import numpy as N
x = N.linspace(-3,2,10000)
y = N.vectorize(lambda x: x)
# Segfaults here
y(x)

Regards
Stéfan

From cimrman3 at ntc.zcu.cz Wed Apr 12 05:59:28 2006
From: cimrman3 at ntc.zcu.cz (Robert Cimrman)
Date: Wed Apr 12 05:59:28 2006
Subject: [Numpy-discussion] shape setting problem
Message-ID: <443CF984.9070306@ntc.zcu.cz>

Hi,

I have found a weird behaviour when setting the shape of a view of an
array, see below...

r.
---
In [43]:a = nm.zeros( (10,5) )

In [44]:b = a[:,2]

In [47]:b.fill( 3 )

In [48]:a
Out[48]:
array([[0, 0, 3, 0, 0],
       [0, 0, 3, 0, 0],
       [0, 0, 3, 0, 0],
       [0, 0, 3, 0, 0],
       [0, 0, 3, 0, 0],
       [0, 0, 3, 0, 0],
       [0, 0, 3, 0, 0],
       [0, 0, 3, 0, 0],
       [0, 0, 3, 0, 0],
       [0, 0, 3, 0, 0]])
-------------------------------------------ok

In [49]:b.fill( 0 )

In [50]:a
Out[50]:
array([[0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0]])

In [51]:b.shape = (5,2)

In [52]:b
Out[52]:
array([[0, 0],
       [0, 0],
       [0, 0],
       [0, 0],
       [0, 0]])

In [53]:b.fill( 3 )

In [54]:a
Out[54]:
array([[0, 0, 3, 3, 3],
       [3, 3, 3, 3, 3],
       [3, 3, 0, 0, 0],
       [0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0]])
------------------------------------ wrong?

Should not this give the same result as Out[48]?

From aisaac at american.edu Wed Apr 12 06:11:11 2006
From: aisaac at american.edu (Alan G Isaac)
Date: Wed Apr 12 06:11:11 2006
Subject: [Numpy-discussion] Re: sqrt and divide
In-Reply-To:
References: <20060412054517.GA27756@sun.ac.za>
Message-ID:

> Stefan van der Walt wrote:
>> In [8]: N.divide(1,0)
>> Out[8]: 0
>> should probably return inf / nan?)

On Wed, 12 Apr 2006, Robert Kern apparently wrote:
> inf and nan are floating point values. The definition of
> int division used when both arguments to divide() are ints
> also yields ints, not floats.

But the Python behavior seems better for this case.

>>> 1/0
Traceback (most recent call last):
  File "", line 1, in ?
ZeroDivisionError: integer division or modulo by zero

fwiw,
Alan Isaac

From tim.hochberg at cox.net Wed Apr 12 08:36:05 2006
From: tim.hochberg at cox.net (Tim Hochberg)
Date: Wed Apr 12 08:36:05 2006
Subject: [Numpy-discussion] Re: sqrt and divide
In-Reply-To:
References: <20060412054517.GA27756@sun.ac.za>
Message-ID: <443D1E2B.5040604@cox.net>

Robert Kern wrote:

>Stefan van der Walt wrote:
>
>>Hi all
>>
>>Two quick questions regarding unintuitive numpy behaviour:
>>
>>Why is the square root of -1 not equal to the square root of -1+0j?
>>
>>In [5]: N.sqrt(-1.)
>>Out[5]: nan
>>
>>In [6]: N.sqrt(-1.+0j)
>>Out[6]: 1j
>>
>
>It is frequently the case that the argument being passed to sqrt() is
>expected to be non-negative and all of the surrounding code strictly
>deals with numbers in the real domain. If the argument happens to be
>negative, then it is a sign of a bug earlier in the code or a floating
>point instability. Returning nan gives the programmer the opportunity
>for sqrt() to complain loudly and expose bugs instead of silently
>upcasting to a complex type. Programmers who *do* want to work in the
>complex domain can easily perform the cast explicitly.
>
>>Is there an easier way of dividing two scalars than using divide?
>>
>>In [9]: N.divide(1.,0)
>>Out[9]: inf
>
>x/y ?
>
>>(also
>>
>>In [8]: N.divide(1,0)
>>Out[8]: 0
>>
>>should probably return inf / nan?)
>
>inf and nan are floating point values. The definition of int division
>used when both arguments to divide() are ints also yields ints, not
>floats.

This relates to the discussion that Travis and I were having about
error handling last week. The current defaults for handling errors are
to ignore them all. This is for speed reasons, although our discussion
may have alleviated some of these. The numarray default was to ignore
underflow, but warn for the rest; this seemed to work well in practice.
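In numpy's seterr terms that numarray default would be roughly the
following (a sketch; 'under' and 'divide' are the spellings used
elsewhere in this thread, and I'm assuming 'over' and 'invalid' follow
the same pattern):

    import numpy
    numpy.seterr(under='ignore', over='warn', divide='warn',
                 invalid='warn')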
However, this example points in another possible direction.... Travis
mentioned that checking the various error conditions in integer
operations was painful and slowed things down since there wasn't
machine support for it. My current opinion is that we should just punt
on overflow and let integers overflow silently. That's what bit
twiddlers want anyway and it'll be somewhere between difficult and
impossible to do a good job. I don't think invalid and underflow apply
to integers, so that leaves divide. I think my preference here would be
for int divide to raise by default. That would require that there be
five error classes, shown here with my preferred defaults:

    divide_by_zero="warn", overflow="warn", underflow="ignore",
    invalid="warn"
    int_divide_by_zero="raise"

The first four apply to floating point (and complex) operations, while
the last applies to integer operations. The separation of warnings into
two classes also helps avoid the expectation that we should be doing
something useful about integer overflow.

I don't *think* this should be too difficult; just stick an
int_divide_by_zero flag on some thread-local variable and set it to
true when there's been a divide by zero, checking on the way out of the
ufunc machinery. I haven't tried it though, so it may be much harder
than I envision.

In any event, the current divide by zero checking seems to be a bit
broken. I took a quick look at the code and it's not obvious why
(unless my optimizer is eliding the error generation code?). This is
the behaviour I see under windows compiled using VC7:

>>> one = np.array(1)
>>> zero = np.array(0)
>>> one/zero
0
>>> np.seterr(divide='raise')
>>> one/zero # Should raise an error
0
>>> (one*1.0 / zero) # Works for floats though?!
Traceback (most recent call last):
  File "", line 1, in ?
FloatingPointError: divide by zero encountered in divide

Regards,

-tim

From pfdubois at gmail.com Wed Apr 12 13:00:04 2006
From: pfdubois at gmail.com (Paul Dubois)
Date: Wed Apr 12 13:00:04 2006
Subject: [Numpy-discussion] Seeking articles for special issue on Python
	and Science and Engineering
Message-ID:

IEEE's magazine, Computing in Science and Engineering (CiSE), has asked
me to put together a theme issue on the use of Python in Science and
Engineering. I will write an overview to be accompanied by 3-5 articles
of a few pages (say 3000 words or so) each. The deadline for
manuscripts will be in the Fall and publication early next year.

I would like to select articles that show a diverse set of applications
or tools, to give our readers a sense of whether or not Python might be
useful in their own work. I will tailor the overview to "fill in the
holes" a bit since with only a few articles we can't cover everything.

Note that these are expository pieces, not research reports. We have a
peer-reviewed section for the latter. Think "Scientific American" with
respect to level: everybody gets something out of it, maybe a little
more for those who know about the area.

Please contact me if you are interested in writing such an article. The
process is that I work with you on the shape of the article, then you
write it, and our editorial staff helps you get it ready for
publication. There is no annoying review process except that I am
annoying. Ideas for cover art to go with the issue are always welcome.

Information about CiSE and our author's guidelines are at
computer.org/cise. It has a fairly large readership as such things go.
Thanks,
Paul Dubois
Editor, Scientific Programming Department
CiSE
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From stefan at sun.ac.za Wed Apr 12 13:50:16 2006
From: stefan at sun.ac.za (Stefan van der Walt)
Date: Wed Apr 12 13:50:16 2006
Subject: [Numpy-discussion] Re: sqrt and divide
In-Reply-To:
References: <20060412054517.GA27756@sun.ac.za>
Message-ID: <20060412204927.GA11408@alpha>

On Wed, Apr 12, 2006 at 01:14:54AM -0500, Robert Kern wrote:
> Stefan van der Walt wrote:
> > Why is the square root of -1 not equal to the square root of -1+0j?
> >
> > In [5]: N.sqrt(-1.)
> > Out[5]: nan
> >
> > In [6]: N.sqrt(-1.+0j)
> > Out[6]: 1j
>
> It is frequently the case that the argument being passed to sqrt() is
> expected to be non-negative and all of the surrounding code strictly
> deals with numbers in the real domain. If the argument happens to be
> negative, then it is a sign of a bug earlier in the code or a floating
> point instability. Returning nan gives the programmer the opportunity
> for sqrt() to complain loudly and expose bugs instead of silently
> upcasting to a complex type. Programmers who *do* want to work in the
> complex domain can easily perform the cast explicitly.

The current docstring (specified in generate_umath.py) states

    y = sqrt(x) square-root elementwise.

It would help a lot if it could explain the above constraint, e.g.

    y = sqrt(x) square-root elementwise. If x is real (and not
    complex), the domain is restricted to x >= 0.

> > In [9]: N.divide(1.,0)
> > Out[9]: inf
>
> x/y ?

On my system, x/y (for x=1., y=0) throws a ZeroDivisionError. Are the
two divisions supposed to behave the same?

Thanks for your feedback!

Regards
Stéfan

From robert.kern at gmail.com Wed Apr 12 14:08:06 2006
From: robert.kern at gmail.com (Robert Kern)
Date: Wed Apr 12 14:08:06 2006
Subject: [Numpy-discussion] Re: sqrt and divide
In-Reply-To: <20060412204927.GA11408@alpha>
References: <20060412054517.GA27756@sun.ac.za>
	<20060412204927.GA11408@alpha>
Message-ID:

Stefan van der Walt wrote:
> On Wed, Apr 12, 2006 at 01:14:54AM -0500, Robert Kern wrote:
>
>>Stefan van der Walt wrote:
>>
>>>Why is the square root of -1 not equal to the square root of -1+0j?
>>>
>>>In [5]: N.sqrt(-1.)
>>>Out[5]: nan
>>>
>>>In [6]: N.sqrt(-1.+0j)
>>>Out[6]: 1j
>>
>>It is frequently the case that the argument being passed to sqrt() is
>>expected to be non-negative and all of the surrounding code strictly
>>deals with numbers in the real domain. If the argument happens to be
>>negative, then it is a sign of a bug earlier in the code or a floating
>>point instability. Returning nan gives the programmer the opportunity
>>for sqrt() to complain loudly and expose bugs instead of silently
>>upcasting to a complex type. Programmers who *do* want to work in the
>>complex domain can easily perform the cast explicitly.
>
> The current docstring (specified in generate_umath.py) states
>
>     y = sqrt(x) square-root elementwise.
>
> It would help a lot if it could explain the above constraint, e.g.
>
>     y = sqrt(x) square-root elementwise. If x is real (and not
>     complex), the domain is restricted to x >= 0.

I'll get around to it sometime. In the meantime, please make a ticket:

http://projects.scipy.org/scipy/numpy/newticket

>>>In [9]: N.divide(1.,0)
>>>Out[9]: inf
>>
>>x/y ?
>
> On my system, x/y (for x=1., y=0) throws a ZeroDivisionError. Are
> the two divisions supposed to behave the same?

Not exactly, no. Specifically, the error handling is, by design, more
flexible with numpy than regular float objects.
If you want that flexibility, then you need to use numpy scalars or ufuncs. -- Robert Kern robert.kern at gmail.com "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From jmgore75 at gmail.com Wed Apr 12 14:30:05 2006 From: jmgore75 at gmail.com (Jeremy Gore) Date: Wed Apr 12 14:30:05 2006 Subject: [Numpy-discussion] Massive differences in numpy vs. numeric string handling Message-ID: <3C3B42A3-962D-4F34-B704-AB1BF0E2390A@gmail.com> In Numeric: Numeric.array('test') -> array([t, e, s, t],'c'); shape = (4,) Numeric.array(['test','two']) -> array([[t, e, s, t], [t, w, o, ]],'c') but in numpy: numpy.array('test') -> array('test', dtype='|S4'); shape = () numpy.array('test','S1') -> array('t', dtype='|S1'); shape = () in fact you have to do an extra list cast: numpy.array(list('test'),'S1') -> array([t, e, s, t], dtype='|S1'); shape = (4,) to get the desired result. I don't think this is very pythonic, as strings are fully indexable and iterable objects. Furthermore, converting/treating a string as an array of characters is a very common thing. convertcode.py would not appear to convert this part of the code correctly either. Also, the use of quotes in the shape () array but not in the shape (4,) array is inconsistent. I realize the ability to use strings of arbitrary length as array elements is important in numpy, but there really should be a more natural option to convert/cast strings as character arrays. Also, unlike Numeric.equal and 'c' arrays, numpy.equal cannot compare '|S1' arrays or presumably other strings for equality, although this is a very useful comparison to make. For the record, I have used the Numeric (and to a lesser degree the numarray) module extensively in bioinformatics applications for its speed and brevity. Jeremy From oliphant at ee.byu.edu Wed Apr 12 15:04:06 2006 From: oliphant at ee.byu.edu (Travis Oliphant) Date: Wed Apr 12 15:04:06 2006 Subject: [Numpy-discussion] Massive differences in numpy vs. numeric string handling In-Reply-To: <3C3B42A3-962D-4F34-B704-AB1BF0E2390A@gmail.com> References: <3C3B42A3-962D-4F34-B704-AB1BF0E2390A@gmail.com> Message-ID: <443D7939.2060406@ee.byu.edu> Jeremy Gore wrote: > In Numeric: > > Numeric.array('test') -> array([t, e, s, t],'c'); shape = (4,) > Numeric.array(['test','two']) -> > array([[t, e, s, t], > [t, w, o, ]],'c') > > but in numpy: > > numpy.array('test') -> array('test', dtype='|S4'); shape = () > numpy.array('test','S1') -> array('t', dtype='|S1'); shape = () > > in fact you have to do an extra list cast: > > numpy.array(list('test'),'S1') -> array([t, e, s, t], dtype='|S1'); > shape = (4,) > > to get the desired result. I don't think this is very pythonic, as > strings are fully indexable and iterable objects. Let's not cast this discussion in Pythonic vs. un-pythonic because that does not really shed light on the issues. NumPy adds full support for string arrays. Numeric had this step-child called a character array which was really just an array of bytes that printed differently. This does raise some compatibility issues that have been hard to get exactly right, and convertcode indeed does not really solve the problem for a heavy character-array user. I have resisted simply adding back a 1-character string data-type back into NumPy, but that could be done if it is really necessary. But, I don't think it is. 
> Furthermore, converting/treating a string as an array of characters
> is a very common thing. convertcode.py would not appear to convert
> this part of the code correctly either. Also, the use of quotes in
> the shape () array but not in the shape (4,) array is inconsistent.
>
> I realize the ability to use strings of arbitrary length as array
> elements is important in numpy, but there really should be a more
> natural option to convert/cast strings as character arrays.

Perhaps all that is needed to simplify handling is to handle the 'S1'
case better so that array('test','S1') works the same as
array('test','c') used to work (i.e. not stopping at strings for the
sequence decomposition).

> Also, unlike Numeric.equal and 'c' arrays, numpy.equal cannot compare
> '|S1' arrays or presumably other strings for equality, although this
> is a very useful comparison to make.

This is a known missing feature due to the fact that comparisons use
ufuncs but ufuncs are not supported for variable-length arrays.
Currently, however, you can use the chararray class which does allow
comparisons of strings.

There are simple ways to work around this, of course. If you do have
'S1' arrays, then you can simply view them as unsigned bytes (using the
.view method) and do comparison that way. If s1 and s2 are "character
arrays":

s1.view(ubyte) >= s2.view(ubyte)

-Travis

From tim.hochberg at cox.net Wed Apr 12 15:15:05 2006
From: tim.hochberg at cox.net (Tim Hochberg)
Date: Wed Apr 12 15:15:05 2006
Subject: [Numpy-discussion] Massive differences in numpy vs. numeric
	string handling
In-Reply-To: <3C3B42A3-962D-4F34-B704-AB1BF0E2390A@gmail.com>
References: <3C3B42A3-962D-4F34-B704-AB1BF0E2390A@gmail.com>
Message-ID: <443D7B74.6040808@cox.net>

Jeremy Gore wrote:

> In Numeric:
>
> Numeric.array('test') -> array([t, e, s, t],'c'); shape = (4,)
> Numeric.array(['test','two']) ->
> array([[t, e, s, t],
>        [t, w, o, ]],'c')
>
> but in numpy:
>
> numpy.array('test') -> array('test', dtype='|S4'); shape = ()
> numpy.array('test','S1') -> array('t', dtype='|S1'); shape = ()
>
> in fact you have to do an extra list cast:
>
> numpy.array(list('test'),'S1') -> array([t, e, s, t], dtype='|S1');
> shape = (4,)

The creation of arrays from python objects is full of all kinds of
weird special cases. For numerical arrays this works pretty well, but
for other sorts of arrays, like strings and even worse, objects, it's
impossible to always guess the correct kind of thing to return. I'll
leave it to the various string array users to battle it out over what's
the right way to convert strings. However, in the meantime or if you do
not prevail in this debate, I suggest you slap an appropriate three
line function into your code somewhere. If all you care about is the
interface issue, use:

def chararray(astring):
    return numpy.array(list(astring), 'S1')

If you are worried about the performance of this, you could use the
more cryptic, but more efficient:

def chararray(astring):
    a = numpy.array(astring)
    return numpy.ndarray([len(astring)], 'S1', a.data)

Perhaps these will let you sleep at night.

Regards,

-tim

> to get the desired result. I don't think this is very pythonic, as
> strings are fully indexable and iterable objects. Furthermore,
> converting/treating a string as an array of characters is a very
> common thing. convertcode.py would not appear to convert this part
> of the code correctly either. Also, the use of quotes in the shape
> () array but not in the shape (4,) array is inconsistent.
> > I realize the ability to use strings of arbitrary length as array > elements is important in numpy, but there really should be a more > natural option to convert/cast strings as character arrays. > > Also, unlike Numeric.equal and 'c' arrays, numpy.equal cannot compare > '|S1' arrays or presumably other strings for equality, although this > is a very useful comparison to make. > > For the record, I have used the Numeric (and to a lesser degree the > numarray) module extensively in bioinformatics applications for its > speed and brevity. > > Jeremy > > > ------------------------------------------------------- > This SF.Net email is sponsored by xPML, a groundbreaking scripting > language > that extends applications into web and mobile media. Attend the live > webcast > and join the prime developer group breaking into this new coding > territory! > http://sel.as-us.falkag.net/sel?cmd=lnk&kid=110944&bid=241720&dat=121642 > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/numpy-discussion > > From oliphant at ee.byu.edu Wed Apr 12 15:16:01 2006 From: oliphant at ee.byu.edu (Travis Oliphant) Date: Wed Apr 12 15:16:01 2006 Subject: [Numpy-discussion] [SciPy-user] Regarding what "where" returns In-Reply-To: References: <443C0D36.80608@enthought.com> <443D39F6.6040805@enthought.com> <443D601E.3020500@enthought.com> Message-ID: <443D7BD7.3060007@ee.byu.edu> Perry Greenfield wrote: >We've noticed that in numpy that the where() function behaves >differently than for numarray. In numarray, where() (when used with a >mask or condition array only) always returns a tuple of index arrays, >even for the 1D case whereas numpy returns an index array for the 1D >case and a tuple for higher dimension cases. While the tuple is a >annoyance for users when they want to manipulate the 1D case, the >benefit is that one always knows that where is returning a tuple, and >thus can write code accordingly. The problem with the current numpy >behavior is that it requires special case testing to see which kind >return one has before manipulating if you aren't certain of what the >dimensionality of the argument is going to be. > > I think this is reasonable. I don't think much thought went in to the current behavior as it simply defaults to the behavior of the nonzero method (where just defaults to nonzero in the circumstances you are describing). The nonzero method has it's behavior because of the nonzero function in Numeric (which only worked with 1-d and returned an array not a tuple). Ideally, I think we should fix the nonzero method and where to have the same behavior (both return tuples --- that's actually what the docstring of nonzero says right now). The nonzero function can be special-cased to index the tuple for backward compatibility. -Travis From tim.hochberg at cox.net Wed Apr 12 15:32:04 2006 From: tim.hochberg at cox.net (Tim Hochberg) Date: Wed Apr 12 15:32:04 2006 Subject: [Numpy-discussion] Massive differences in numpy vs. 
numeric string handling In-Reply-To: <443D7939.2060406@ee.byu.edu> References: <3C3B42A3-962D-4F34-B704-AB1BF0E2390A@gmail.com> <443D7939.2060406@ee.byu.edu> Message-ID: <443D7F5E.1020007@cox.net> Travis Oliphant wrote: > Jeremy Gore wrote: > >> In Numeric: >> >> Numeric.array('test') -> array([t, e, s, t],'c'); shape = (4,) >> Numeric.array(['test','two']) -> >> array([[t, e, s, t], >> [t, w, o, ]],'c') >> >> but in numpy: >> >> numpy.array('test') -> array('test', dtype='|S4'); shape = () >> numpy.array('test','S1') -> array('t', dtype='|S1'); shape = () >> >> in fact you have to do an extra list cast: >> >> numpy.array(list('test'),'S1') -> array([t, e, s, t], dtype='|S1'); >> shape = (4,) >> >> to get the desired result. I don't think this is very pythonic, as >> strings are fully indexable and iterable objects. > > > > Let's not cast this discussion in Pythonic vs. un-pythonic because > that does not really shed light on the issues. > > NumPy adds full support for string arrays. Numeric had this > step-child called a character array which was really just an array of > bytes that printed differently. > This does raise some compatibility issues that have been hard to get > exactly right, and convertcode indeed does not really solve the > problem for a heavy character-array user. I have resisted simply > adding back a 1-character string data-type back into NumPy, but that > could be done if it is really necessary. But, I don't think it is. > >> Furthermore, converting/treating a string as an array of >> characters is a very common thing. convertcode.py would not appear >> to convert this part of the code correctly either. Also, the use of >> quotes in the shape () array but not in the shape (4,) array is >> inconsistent. > > >> >> >> I realize the ability to use strings of arbitrary length as array >> elements is important in numpy, but there really should be a more >> natural option to convert/cast strings as character arrays. > > > Perhaps all that is needed to simplify handling is to handle the 'S1' > case better so that > > array('test','S1') works the same as array('test','c') used to work > (i.e. not stopping at strings for the sequence decomposition). It seems a little wacky that 'S2' and 'S1' would have vastly different behaviour. >> >> Also, unlike Numeric.equal and 'c' arrays, numpy.equal cannot >> compare '|S1' arrays or presumably other strings for equality, >> although this is a very useful comparison to make. > > > This is a known missing feature due to the fact that comparisons use > ufuncs but ufuncs are not supported for variable-length arrays. > Currently, however you can use the chararray class which does allow > comparisons of strings. It seems like this should be easy to worm around in __cmp__ (or array_compare or however it's spelled). Since the strings really have a fixed length, they're more or less equivalent to byte arrays with one extra dimension. Writing a little lexographic comparison thing on top of the results of a ufunc operating on the result of a compare of these byte arrays should be a piece of cake; in theory at least. > > There are simple ways to work around this, of course. If you do have > 'S1' arrays, then you can simply view them as unsigned bytes (using > the .view method) and do comparison that way. > if s1 and s2 are "character arrays" > > s1.view(ubyte) >= s2.view(ubyte) Nice! 
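For example, a quick sketch of that byte-view trick (assuming
1-character string arrays; expected output in the comment):

    import numpy
    s1 = numpy.array(list('abc'), 'S1')
    s2 = numpy.array(list('abd'), 'S1')
    # compare the raw bytes instead of the strings
    print s1.view(numpy.ubyte) >= s2.view(numpy.ubyte)  # [True True False]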
Regards, -tim From oliphant at ee.byu.edu Wed Apr 12 15:47:04 2006 From: oliphant at ee.byu.edu (Travis Oliphant) Date: Wed Apr 12 15:47:04 2006 Subject: ***[Possible UCE]*** Re: [Numpy-discussion] Massive differences in numpy vs. numeric string handling In-Reply-To: <443D7F5E.1020007@cox.net> References: <3C3B42A3-962D-4F34-B704-AB1BF0E2390A@gmail.com> <443D7939.2060406@ee.byu.edu> <443D7F5E.1020007@cox.net> Message-ID: <443D8336.60606@ee.byu.edu> Tim Hochberg wrote: > > It seems a little wacky that 'S2' and 'S1' would have vastly different > behaviour. True. Much better is a compatibility function such as the one you gave. >> This is a known missing feature due to the fact that comparisons use >> ufuncs but ufuncs are not supported for variable-length arrays. >> Currently, however you can use the chararray class which does allow >> comparisons of strings. > > > It seems like this should be easy to worm around in __cmp__ (or > array_compare or however it's spelled). Since the strings really have > a fixed length, they're more or less equivalent to byte arrays with > one extra dimension. Writing a little lexographic comparison thing on > top of the results of a ufunc operating on the result of a compare of > these byte arrays should be a piece of cake; in theory at least. Yes, indeed it could be handled there as well. It's the rich_compare function (all the cases are handled there...). Right now, equality testing is special-cased a bit (inheriting behavior from Numeric). I've gone back and forth on whether I should put effort into handling variable-length arrays with ufuncs (which might be better long-term --- or just an example of feature bloat as I can't think of many use cases except this one), or just special-case the needed comparisons (which would take less thought to implement). I'm leaning towards the latter case --- special-case comparison of string arrays in the rich_compare function. The next thing to think about is then Unicode arrays. The problem with comparisons on unicode arrays though is "how do you compare unicode strings" in a meaningful way (i.e. what is alphabetical?). -Travis From oliphant at ee.byu.edu Wed Apr 12 15:56:03 2006 From: oliphant at ee.byu.edu (Travis Oliphant) Date: Wed Apr 12 15:56:03 2006 Subject: [Numpy-discussion] Re: ***[Possible UCE]*** [SciPy-user] Regarding what "where" returns In-Reply-To: References: <443C0D36.80608@enthought.com> <443D39F6.6040805@enthought.com> <443D601E.3020500@enthought.com> Message-ID: <443D857F.9000605@ee.byu.edu> Perry Greenfield wrote: >We've noticed that in numpy that the where() function behaves >differently than for numarray. In numarray, where() (when used with a >mask or condition array only) always returns a tuple of index arrays, >even for the 1D case whereas numpy returns an index array for the 1D >case and a tuple for higher dimension cases. While the tuple is a >annoyance for users when they want to manipulate the 1D case, the >benefit is that one always knows that where is returning a tuple, and >thus can write code accordingly. The problem with the current numpy >behavior is that it requires special case testing to see which kind >return one has before manipulating if you aren't certain of what the >dimensionality of the argument is going to be. > > I went ahead and made this change to the code. The nonzero function still behaves as before (and in fact only works for 1-d arrays as it did in Numeric). The where(condition) function works the same as condition.nonzero() and both always return a tuple. 
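A sketch of the behavior after this change (the 1-d result is a
one-element tuple that callers unpack explicitly):

    import numpy
    a = numpy.array([3, 0, 5])
    idx, = numpy.where(a > 0)  # where() returns a tuple even for 1-d input
    print idx                  # [0 2]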
I had to change exactly one piece of code that used the new where syntax. This does represent a code breakage with the where syntax (but only if you used the newer, numarray-introduced usage). I think this is a small-enough segment that we can make this change. -Travis From robert.kern at gmail.com Wed Apr 12 15:57:06 2006 From: robert.kern at gmail.com (Robert Kern) Date: Wed Apr 12 15:57:06 2006 Subject: [Numpy-discussion] Re: Massive differences in numpy vs. numeric string handling In-Reply-To: <443D7B74.6040808@cox.net> References: <3C3B42A3-962D-4F34-B704-AB1BF0E2390A@gmail.com> <443D7B74.6040808@cox.net> Message-ID: Tim Hochberg wrote: > Jeremy Gore wrote: > >> In Numeric: >> >> Numeric.array('test') -> array([t, e, s, t],'c'); shape = (4,) >> Numeric.array(['test','two']) -> >> array([[t, e, s, t], >> [t, w, o, ]],'c') >> >> but in numpy: >> >> numpy.array('test') -> array('test', dtype='|S4'); shape = () >> numpy.array('test','S1') -> array('t', dtype='|S1'); shape = () >> >> in fact you have to do an extra list cast: >> >> numpy.array(list('test'),'S1') -> array([t, e, s, t], dtype='|S1'); >> shape = (4,) > > The creation of arrays from python objects is full of all kinds of weird > special cases. For numerical arrays this is works pretty well , but for > other sorts of arrays, like strings and even worse, objects, it's > impossible to always guess the correct kind of thing to return. I'll > leave it to the various string array users to battle it out over what's > the right way to convert strings. However, in the meantime or if you do > not prevail in this debate, I suggest you slap an appropriate three line > function into your code somewhere. I would suggest this way of thinking about it: numpy.array() shouldn't have to handle every possible way to construct an array. People building less-common arrays from less-common Python objects may have to use a different constructor if they want to do so in a natural way. Implementing every possible combination in numpy.array() *and* making it intuitive and readable are incommensurate goals, in my opinion. > If all you care about is the interface issues use: > > def chararray(astring): > return numpy.array(list(astring), 'S1') > > If you are worried about the performance of this, you could use the more > cryptic, but more efficient: > > def chararray(astring): > a = numpy.array(astring) > return numpy.ndarray([len(astring)], 'S1', a.data) Better: In [31]: fromstring('test', dtype('S1')) Out[31]: array([t, e, s, t], dtype='|S1') There's still the issue of N-D arrays of character, though. -- Robert Kern robert.kern at gmail.com "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From oliphant at ee.byu.edu Wed Apr 12 17:04:05 2006 From: oliphant at ee.byu.edu (Travis Oliphant) Date: Wed Apr 12 17:04:05 2006 Subject: [Numpy-discussion] Toward release 1.0 of NumPy Message-ID: <443D9543.8040601@ee.byu.edu> The next release of NumPy will be 0.9.8 Before this release is made, I want to make sure the following tickets are implemented http://projects.scipy.org/scipy/numpy/ticket/54 http://projects.scipy.org/scipy/numpy/ticket/55 http://projects.scipy.org/scipy/numpy/ticket/56 Once 0.9.8 is out, I'd like to name the next release NumPy 1.0 Release Candidate 1 and have a series of release candidates so that hopefully by SciPy 2006 conference, NumPy 1.0 is out. 
This also dove-tails nicely with the Python 2.5 release schedule so that NumPy 1.0 should work with Python 2.5 and be fully 64-bit capable for handling very-large arrays. The recent discussions and bug-reports have been very helpful. If you have found a bug, please report it on the Trac pages so that we don't lose sight of it. Report bugs by "submitting a ticket" here: http://projects.scipy.org/scipy/numpy/newticket -Travis From oliphant at ee.byu.edu Wed Apr 12 17:11:04 2006 From: oliphant at ee.byu.edu (Travis Oliphant) Date: Wed Apr 12 17:11:04 2006 Subject: [Numpy-discussion] Toward release 1.0 of NumPy In-Reply-To: <443D9543.8040601@ee.byu.edu> References: <443D9543.8040601@ee.byu.edu> Message-ID: <443D96DC.3060501@ee.byu.edu> Travis Oliphant wrote: > > The next release of NumPy will be 0.9.8 > > Before this release is made, I want to make sure the following > tickets are implemented > > http://projects.scipy.org/scipy/numpy/ticket/54 > http://projects.scipy.org/scipy/numpy/ticket/55 > http://projects.scipy.org/scipy/numpy/ticket/56 So you don't have to read each one individually: #54 : implement thread-based error-handling modes #55 : finish scalar-math implementation which recognizes same error-handling #56 : implement rich_comparisons on string arrays and unicode arrays. -Travis From robert.kern at gmail.com Wed Apr 12 17:19:07 2006 From: robert.kern at gmail.com (Robert Kern) Date: Wed Apr 12 17:19:07 2006 Subject: [Numpy-discussion] Re: Toward release 1.0 of NumPy In-Reply-To: <443D9543.8040601@ee.byu.edu> References: <443D9543.8040601@ee.byu.edu> Message-ID: Travis Oliphant wrote: > > The next release of NumPy will be 0.9.8 I have added a "0.9.8 Release" milestone to the Trac and have scheduled all of these tickets for that milestone. -- Robert Kern robert.kern at gmail.com "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From tim.hochberg at cox.net Wed Apr 12 17:59:12 2006 From: tim.hochberg at cox.net (Tim Hochberg) Date: Wed Apr 12 17:59:12 2006 Subject: [Numpy-discussion] Toward release 1.0 of NumPy In-Reply-To: <443D96DC.3060501@ee.byu.edu> References: <443D9543.8040601@ee.byu.edu> <443D96DC.3060501@ee.byu.edu> Message-ID: <443DA1B1.8040406@cox.net> Travis Oliphant wrote: > Travis Oliphant wrote: > >> >> The next release of NumPy will be 0.9.8 >> >> Before this release is made, I want to make sure the following >> tickets are implemented >> >> http://projects.scipy.org/scipy/numpy/ticket/54 >> http://projects.scipy.org/scipy/numpy/ticket/55 >> http://projects.scipy.org/scipy/numpy/ticket/56 > > > > So you don't have to read each one individually: > > > #54 : implement thread-based error-handling modes > #55 : finish scalar-math implementation which recognizes same > error-handling > #56 : implement rich_comparisons on string arrays and unicode arrays. I'll help with #54 at least, since I was the complainer, er I mean, since I brought that one up. It's probably better to get that started before #55 anyway. The open issues that I see connected to this are: 1. Better support for catching integer divide by zero. That doesn't work at all here, I'm guessing because my optimizer is too smart. I spent a half hour this morning trying how to set the divide by zero flag directly using VC7, but I couldn't find anything. I suppose I could see if there's some pragma to turn off optimization around that one function. 
However, I'm interested in what you think of stuffing the integer divide by zero information directly into a flag on the thread local object and then checking it on the way out. This is cleaner in that it doesn't rely on platform specific flag setting ifdeffery and it allows us to consider issue #2.

2. Breaking integer divide by zero out from floating point divide by zero. The former is more serious in that it's silent. The latter returns INF, so you can see that something happened by examining your results, while the former returns zero. That has much more potential for confusion and silent bugs. Thus, it seems reasonable to be able to set the error handling differently for integer divide by zero and floating point divide by zero. Note that this would allow integer divide by zero to be set to 'raise' and still run all the FP ops at max speed, since the flag saying do no error checking could ignore the int_divide_by_zero setting.

3. Tossing out the overflow checking on integer operations. It's incomplete anyway and it slows things down. I don't really expect my integer operations to be overflow checked, and personally I think that incomplete checking is worse than no checking. I think we should at least disable the support for the time being and possibly revisit this later when we have time to do a complete job and if it seems necessary.

4. Different defaults. I'd like to enable different defaults without slowing things down in the really super fast case.

Looking at this list now, it looks like only #4 needs to be addressed when doing the initial implementation of the thread local error handling, and even that one can be done in parallel, so I guess we should just start with creating the thread local object and see what happens. If you like I can start working on this, although I may not be able to get much done on it till Monday.

Regards,

-tim

From simon at arrowtheory.com Wed Apr 12 18:17:03 2006
From: simon at arrowtheory.com (Simon Burton)
Date: Wed Apr 12 18:17:03 2006
Subject: [Numpy-discussion] index objects are not broadcastable to a single shape
Message-ID: <20060413111612.3bb4e6fc.simon@arrowtheory.com>

This must be up there with the most useless, confusing error messages:

>>> a=numpy.array([1,2,3])
>>> b=numpy.array([1,2,3,4])
>>> a*b
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
ValueError: index objects are not broadcastable to a single shape
>>>

Simon.

--
Simon Burton, B.Sc.
Licensed PO Box 8066 ANU Canberra 2601 Australia
Ph.
61 02 6249 6940 http://arrowtheory.com From oliphant at ee.byu.edu Wed Apr 12 18:25:03 2006 From: oliphant at ee.byu.edu (Travis Oliphant) Date: Wed Apr 12 18:25:03 2006 Subject: [Numpy-discussion] Toward release 1.0 of NumPy In-Reply-To: <443DA1B1.8040406@cox.net> References: <443D9543.8040601@ee.byu.edu> <443D96DC.3060501@ee.byu.edu> <443DA1B1.8040406@cox.net> Message-ID: <443DA866.3090806@ee.byu.edu> Tim Hochberg wrote: > Travis Oliphant wrote: > >> Travis Oliphant wrote: >> >>> >>> The next release of NumPy will be 0.9.8 >>> >>> Before this release is made, I want to make sure the following >>> tickets are implemented >>> >>> http://projects.scipy.org/scipy/numpy/ticket/54 >>> http://projects.scipy.org/scipy/numpy/ticket/55 >>> http://projects.scipy.org/scipy/numpy/ticket/56 >> >> >> >> >> So you don't have to read each one individually: >> >> >> #54 : implement thread-based error-handling modes >> #55 : finish scalar-math implementation which recognizes same >> error-handling >> #56 : implement rich_comparisons on string arrays and unicode arrays. > > > I'll help with #54 at least, since I was the complainer, er I mean, > since I brought that one up. It's probably better to get that started > before #55 anyway. The open issues that I see connected to this are: Great. I agree that #54 needs to be done before #55 (error handling is what's been holding up #55 the whole time. > > 1. Better support for catching integer divide by zero. That doesn't > work at all here, Probably a platform/compiler issue. The numarray equivalent code had an if statement to prevent the compiler from optimizing it away. Perhaps we need to do something like that. Also, perhaps VC7 has some means to set the divide by zero error more directly and we can just use that. > I'm guessing because my optimizer is too smart. I spent a half hour > this morning trying how to set the divide by zero flag directly using > VC7, but I couldn't find anything. I suppose I could see if there's > some pragma to turn off optimization around that one function. > However, I'm interested in what you think of stuffing the integer > divide by zero information directly into a flag on the thread local > object and then checking it on the way out. Hmm.. The only issue is that dictionary look-ups are more expensive then register look-ups. This could be costly. > This is cleaner in that it doesn't rely on platform specific flag > setting ifdeffery and it allows us to consider issue #2. > > 2. Breaking integer divide by zero out from floating point divide > by zero. The former is more serious in that it's silent. The latter > returns INF, so you can see that something happened by examing your > results, while the former returns zero. That has much more potential > for confusion and silents bugs. Thus, it seems reasonable to be able > to set the error handling different for integer divide by zero and > floating point divide by zero. Note that this would allow integer > divide by zero to be set to 'raise' and still run all the FP ops at > max speed, since the flag saying do no error checking could ignore the > int_divide_by_zero setting. Interesting proposal. Yes, it is true that integer division returning zero is less well-justified. But, I'm still concerned with doing a dictionary lookup for every divide-by-zero, and (more importantly) to check to see if a divide-by-zero has occurred. The dictionary lookups is the largest source of small-array slow-down when comparing Numeric to NumPy. > > 3. 
Tossing out the overflow checking on integer operations. It's > incomplete anyway and it slows things down. I don't really expect my > integer operations to be overflow checked, and personally I think that > incomplete checking is worse than no checking. I think we should at > least disable the support for the time being and possibly revisit this > latter when we have time to do a complete job and if it seems necessary. I'm all for that. I think it makes the code slower and because it is incomplete (addition and subtraction don't do it), it makes for harder-to-explain code. On the scalar operations, we should check for over-flow, however... > > 4. Different defaults I'd like to enable different defaults without > slowing things down in the really super fast case. The discussion on different defaults is fine. The slow-down is that with the current defaults, the error register flags are not actually checked if the default has not been changed. With the numarray-defaults, the register flags would be checked at the end of each 1-d loop. I'm not sure what kind of slow-down that would bring. Certainly for 1-d cases, there would be little difference. One could actually simply store different defaults (but it would result in minor slow-downs because the register flags would be checked. -Travis From oliphant at ee.byu.edu Wed Apr 12 18:30:03 2006 From: oliphant at ee.byu.edu (Travis Oliphant) Date: Wed Apr 12 18:30:03 2006 Subject: [Numpy-discussion] index objects are not broadcastable to a single shape In-Reply-To: <20060413111612.3bb4e6fc.simon@arrowtheory.com> References: <20060413111612.3bb4e6fc.simon@arrowtheory.com> Message-ID: <443DA966.1020301@ee.byu.edu> Simon Burton wrote: >This must be up there with the most useless confusing error messages: > > > >>>>a=numpy.array([1,2,3]) >>>>b=numpy.array([1,2,3,4]) >>>>a*b >>>> >>>> >Traceback (most recent call last): > File "", line 1, in ? >ValueError: index objects are not broadcastable to a single shape > > > > > The problem with these error messages is that some code is used in a wide-variety of circumstances. The original error message was conceived in thinking about the application of the code to one circumstance while this particular error is occurring in a different one. The standard behavior is to just propagate the error up. Better error messages means catching a lot more errors and special-casing error messages. It can be done, but it's tedious work. -Travis From simon at arrowtheory.com Wed Apr 12 20:34:04 2006 From: simon at arrowtheory.com (Simon Burton) Date: Wed Apr 12 20:34:04 2006 Subject: [Numpy-discussion] index objects are not broadcastable to a single shape In-Reply-To: <443DA966.1020301@ee.byu.edu> References: <20060413111612.3bb4e6fc.simon@arrowtheory.com> <443DA966.1020301@ee.byu.edu> Message-ID: <20060413133326.2889a5c5.simon@arrowtheory.com> On Wed, 12 Apr 2006 19:29:10 -0600 Travis Oliphant wrote: > The problem with these error messages is that some code is used in a > wide-variety of circumstances. The original error message was conceived > in thinking about the application of the code to one circumstance while > this particular error is occurring in a different one. > > The standard behavior is to just propagate the error up. Better error > messages means catching a lot more errors and special-casing error > messages. It can be done, but it's tedious work. OK. Can the error message be a little more generic, longer, etc. ? "shape mismatch (index objects are not broadcastable to a single shape)" ? I don't know either. 
I'm just thinking about all the new numpy/python users at work here that I will need to hand hold. Error messages like this are pretty scary. Simon. -- Simon Burton, B.Sc. Licensed PO Box 8066 ANU Canberra 2601 Australia Ph. 61 02 6249 6940 http://arrowtheory.com From tim.hochberg at cox.net Wed Apr 12 21:59:01 2006 From: tim.hochberg at cox.net (Tim Hochberg) Date: Wed Apr 12 21:59:01 2006 Subject: [Numpy-discussion] Toward release 1.0 of NumPy In-Reply-To: <443DA866.3090806@ee.byu.edu> References: <443D9543.8040601@ee.byu.edu> <443D96DC.3060501@ee.byu.edu> <443DA1B1.8040406@cox.net> <443DA866.3090806@ee.byu.edu> Message-ID: <443DD9D9.9080004@cox.net> Travis Oliphant wrote: > Tim Hochberg wrote: > >> Travis Oliphant wrote: >> >>> Travis Oliphant wrote: >>> >>>> >>>> The next release of NumPy will be 0.9.8 >>>> >>>> Before this release is made, I want to make sure the following >>>> tickets are implemented >>>> >>>> http://projects.scipy.org/scipy/numpy/ticket/54 >>>> http://projects.scipy.org/scipy/numpy/ticket/55 >>>> http://projects.scipy.org/scipy/numpy/ticket/56 >>> >>> >>> >>> >>> >>> So you don't have to read each one individually: >>> >>> >>> #54 : implement thread-based error-handling modes >>> #55 : finish scalar-math implementation which recognizes same >>> error-handling >>> #56 : implement rich_comparisons on string arrays and unicode arrays. >> >> >> >> I'll help with #54 at least, since I was the complainer, er I mean, >> since I brought that one up. It's probably better to get that started >> before #55 anyway. The open issues that I see connected to this are: > > > Great. I agree that #54 needs to be done before #55 (error handling > is what's been holding up #55 the whole time. > >> >> 1. Better support for catching integer divide by zero. That >> doesn't work at all here, > > > Probably a platform/compiler issue. The numarray equivalent code had > an if statement to prevent the compiler from optimizing it away. > Perhaps we need to do something like that. Also, perhaps VC7 has > some means to set the divide by zero error more directly and we can > just use that. > >> I'm guessing because my optimizer is too smart. I spent a half hour >> this morning trying how to set the divide by zero flag directly using >> VC7, but I couldn't find anything. I suppose I could see if there's >> some pragma to turn off optimization around that one function. >> However, I'm interested in what you think of stuffing the integer >> divide by zero information directly into a flag on the thread local >> object and then checking it on the way out. > > > > Hmm.. The only issue is that dictionary look-ups are more expensive > then register look-ups. This could be costly. > > >> This is cleaner in that it doesn't rely on platform specific flag >> setting ifdeffery and it allows us to consider issue #2. >> >> 2. Breaking integer divide by zero out from floating point divide >> by zero. The former is more serious in that it's silent. The latter >> returns INF, so you can see that something happened by examing your >> results, while the former returns zero. That has much more potential >> for confusion and silents bugs. Thus, it seems reasonable to be able >> to set the error handling different for integer divide by zero and >> floating point divide by zero. Note that this would allow integer >> divide by zero to be set to 'raise' and still run all the FP ops at >> max speed, since the flag saying do no error checking could ignore >> the int_divide_by_zero setting. > > > > Interesting proposal. 
Yes, it is true that integer division
> returning zero is less well-justified. But, I'm still concerned with
> doing a dictionary lookup for every divide-by-zero, and (more
> importantly) to check to see if a divide-by-zero has occurred. The
> dictionary lookups is the largest source of small-array slow-down when
> comparing Numeric to NumPy.

Well, assuming that we can fix the error flag setting code here, we could still break the divide by zero error handling out by doing some special casing in the ufunc machinery since the ufuncs presumably can figure out their own types. Still, the thread local storage option is cleaner if we can figure out a way to make the dictionary lookups fast enough.

The lookup in the failing case is not a big deal I don't think. First, it's normally an error so I don't mind introducing some slowing. Second, it should be easy to only do the lookup once. Just have a flag that ensures that after the first lookup, the divide by zero flag is not set a second time. I guess the bigger issue is the lookup on the way out to see if anything failed. I have a plan, which I'll present at the bottom.

>> 3. Tossing out the overflow checking on integer operations. It's
>> incomplete anyway and it slows things down. I don't really expect my
>> integer operations to be overflow checked, and personally I think
>> that incomplete checking is worse than no checking. I think we should
>> at least disable the support for the time being and possibly revisit
>> this latter when we have time to do a complete job and if it seems
>> necessary.
>
> I'm all for that. I think it makes the code slower and because it is
> incomplete (addition and subtraction don't do it), it makes for
> harder-to-explain code.
>
> On the scalar operations, we should check for over-flow, however...

OK.

>> 4. Different defaults I'd like to enable different defaults without
>> slowing things down in the really super fast case.
>
> The discussion on different defaults is fine. The slow-down is that
> with the current defaults, the error register flags are not actually
> checked if the default has not been changed. With the
> numarray-defaults, the register flags would be checked at the end of
> each 1-d loop. I'm not sure what kind of slow-down that would
> bring. Certainly for 1-d cases, there would be little difference.
>
> One could actually simply store different defaults (but it would
> result in minor slow-downs because the register flags would be checked.
>

OK, here's my plan. It sounds like it will work, but this threading business is always tricky so find holes in it if you can.

1. As we've discussed we grow some thread local storage. This storage has flags check_divide, check_over, check_under, check_invalid and check_int_divide. It also has a flag int_divide_err. These flags are initialized to False, but then may immediately be set to a different default value. This is to simplify #3.

2. We grow 6 static longs that correspond to the above and are initialized to zero. They should be called check_divide_count, etc. or something similar.

3. Whenever a flag is switched from False to True its corresponding global is incremented. Similarly, when switched from True to False the global is decremented.

4. When a divide by integer zero occurs, we check the int_divide_err flag. If it is false, we set it to true and also increment int_divide_err_count. We also set a local flag so that we don't do this again in that call to the ufunc core function.
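In rough Python terms, the bookkeeping for steps 1-4 might look something like this (just a sketch -- the real version would live in C, and every name here is a placeholder):

    import threading

    _local = threading.local()     # 1. per-thread flags live here
    check_int_divide_count = 0     # 2. one static counter per flag
    int_divide_err_count = 0

    def set_check_int_divide(on):
        # 3. keep the global counter in sync when a thread flips its flag
        global check_int_divide_count
        old = getattr(_local, 'check_int_divide', False)
        if on and not old:
            check_int_divide_count += 1
        elif old and not on:
            check_int_divide_count -= 1
        _local.check_int_divide = on

    def note_int_divide_by_zero():
        # 4. called from the inner loop on an integer zero-divide; a C
        # version would do its own locking/atomic increments as needed
        global int_divide_err_count
        if check_int_divide_count and not getattr(_local, 'int_divide_err', False):
            _local.int_divide_err = True
            int_divide_err_count += 1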
We can actually skip this whole step if check_int_divide_count is zero.

With all that in place, I think we should be able to do things efficiently. The ufunc can check whether any of the XXX_check_counts are nonzero and turn on register flag checking as appropriate. If an error occurs, it still only has to go to the per thread dictionary if the count for that particular error type is nonzero. Similarly, if the count int_divide_err_count is nonzero, the ufunc will have to go to the dictionary. If the error was set in this thread, then appropriate action (including possibly nothing) is taken and int_divide_err_count is decremented.

That all sounds more complicated than it really is, at least in my head ;) Anyway, try to find the holes in it. It should be able to run at full speed if you turn off error checking in all threads. It should run at almost full speed as long as there aren't any errors that are being checked in *any thread*. I think in practice this means that most of the speed hit that is seen in numarray won't be here. It doesn't actually matter what the defaults are; turning off all error checking will still be fast.

Regards,

-tim

From arnd.baecker at web.de Thu Apr 13 00:58:04 2006
From: arnd.baecker at web.de (Arnd Baecker)
Date: Thu Apr 13 00:58:04 2006
Subject: [Numpy-discussion] Massive differences in numpy vs. numeric string handling
In-Reply-To: <3C3B42A3-962D-4F34-B704-AB1BF0E2390A@gmail.com>
References: <3C3B42A3-962D-4F34-B704-AB1BF0E2390A@gmail.com>
Message-ID:

On Wed, 12 Apr 2006, Jeremy Gore wrote:

> In Numeric: [...]
> but in numpy: [...]
> For the record, I have used the Numeric (and to a lesser degree the
> numarray) module extensively in bioinformatics applications for its
> speed and brevity.

If (after this round of discussion) there remain any differences, it would be good if you could add them to the wiki at

http://www.scipy.org/Converting_from_Numeric

Best, Arnd

P.S.: The same applies of course to any other differences which show up!

From svetosch at gmx.net Thu Apr 13 01:20:02 2006
From: svetosch at gmx.net (Sven Schreiber)
Date: Thu Apr 13 01:20:02 2006
Subject: [Numpy-discussion] Toward release 1.0 of NumPy
In-Reply-To: <443D9543.8040601@ee.byu.edu>
References: <443D9543.8040601@ee.byu.edu>
Message-ID: <443E096D.3040407@gmx.net>

Travis Oliphant wrote:
>
> The next release of NumPy will be 0.9.8
>
> The recent discussions and bug-reports have been very helpful. If you
> have found a bug, please report it on the Trac pages so that we don't
> lose sight of it.
> Report bugs by "submitting a ticket" here:
>

Before submitting the following as a bug, I would like to repeat what I posted earlier (no replies) to check whether you agree it's a bug:

The "kron" (Kronecker product) function returns numpy-arrays even if both arguments are numpy-matrices; imho that's a bug in light of the proclaimed goal of preserving matrices where possible/sensible.

On a related issue, "eye" also still returns a numpy-array instead of a numpy-matrix. At least one person (I think it was Ed Schofield) agreed that it would be better to return a numpy-matrix, given that another function ("identity") already returns a numpy-array. Currently, one of the two functions seems redundant.

So unless somebody tells me otherwise, I will submit these two things as bugs/tickets.

Great that numpy soon will be officially stable!

Cheers,
Sven

From pgmdevlist at mailcan.com Thu Apr 13 01:41:02 2006
From: pgmdevlist at mailcan.com (Pierre GM)
Date: Thu Apr 13 01:41:02 2006
Subject: [Numpy-discussion] range/arange
Message-ID: <200604130507.40241.pgmdevlist@mailcan.com>

Folks,
Could any of you explain me why the two following commands give different results ? It's mere curiosity, for my personal edification.

[(m-5)/10 for m in arange(1,10)]
[0, 0, 0, 0, 0, 0, 0, 0, 0]

[(m-5)/10 for m in range(1,10)]
[-1, -1, -1, -1, 0, 0, 0, 0, 0]

From lars.bittrich at googlemail.com Thu Apr 13 02:30:01 2006
From: lars.bittrich at googlemail.com (Lars Bittrich)
Date: Thu Apr 13 02:30:01 2006
Subject: [Numpy-discussion] range/arange
In-Reply-To: <200604130507.40241.pgmdevlist@mailcan.com>
References: <200604130507.40241.pgmdevlist@mailcan.com>
Message-ID: <200604131123.56171.lars.bittrich@googlemail.com>

Hi,

On Thursday 13 April 2006 11:07, Pierre GM wrote:
> Could any of you explain me why the two following commands give different
> results ? It's mere curiosity, for my personal edification.
>
> [(m-5)/10 for m in arange(1,10)]
> [0, 0, 0, 0, 0, 0, 0, 0, 0]
>
> [(m-5)/10 for m in range(1,10)]
> [-1, -1, -1, -1, 0, 0, 0, 0, 0]

I have no idea where the reason is located exactly, but it seems to be caused by different types of range and arange.

In [15]: type(arange(1,10)[0])
Out[15]: <type 'int32scalar'>

In [14]: type(range(1,10)[0])
Out[14]: <type 'int'>

If you use for example:

In [16]: -1/10
Out[16]: -1

you get the normal behavior of the "floor" function.

In [17]: floor(-.1)
Out[17]: -1.0

The behavior of int32scalar seems more intuitive to me.

Best regards,
Lars

From robert.kern at gmail.com Thu Apr 13 05:17:05 2006
From: robert.kern at gmail.com (Robert Kern)
Date: Thu Apr 13 05:17:05 2006
Subject: [Numpy-discussion] Re: range/arange
In-Reply-To: <200604130507.40241.pgmdevlist@mailcan.com>
References: <200604130507.40241.pgmdevlist@mailcan.com>
Message-ID:

Pierre GM wrote:
> Folks,
> Could any of you explain me why the two following commands give different
> results ? It's mere curiosity, for my personal edification.
>
> [(m-5)/10 for m in arange(1,10)]
> [0, 0, 0, 0, 0, 0, 0, 0, 0]
>
> [(m-5)/10 for m in range(1,10)]
> [-1, -1, -1, -1, 0, 0, 0, 0, 0]

Python's rule for integer division is to round towards negative infinity. C's rule (if it has one; I think it may be platform dependent) is to round towards 0. When it comes to arithmetic, numpy tends to expose the C behavior because it's fastest. As Lars pointed out, the type of the object that you get from iterating over an array is a numpy int32scalar object, so the numpy behavior is used.
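To see the two rounding rules side by side in plain Python (illustration only, no numpy involved):

    # Python's integer division rounds toward negative infinity:
    print (1 - 5) / 10           # -1, which is what the range() version shows
    # C-style truncation toward zero, which the int32scalar arithmetic exposes:
    print int((1 - 5) / 10.0)    # 0, which is what the arange() version shows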
-- Robert Kern robert.kern at gmail.com "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From fullung at gmail.com Thu Apr 13 05:18:04 2006 From: fullung at gmail.com (Albert Strasheim) Date: Thu Apr 13 05:18:04 2006 Subject: [Numpy-discussion] Segfault when indexing on second or higher dimension with list or tuple Message-ID: <20060413121710.GA30372@dogbert.sdsl.sun.ac.za> Hello all, The following segfault bug was discovered in NumPy 0.9.7.2348 by someone at our Python workshop: import numpy as N F = N.zeros((1,1)) F[:,[0]] = 0 The following also segfaults: F[:,(0,)] = 0 Something seems to go wrong when one uses a tuple or a list to index into a NumPy array on the second or higher dimension, since the following code works: F = N.zeros((1,)) F[[0]] = 0 The Trac ticket is here: http://projects.scipy.org/scipy/numpy/ticket/59 If someone gets around to fixing this, please include some test cases. Thanks! Regards, Albert From cimrman3 at ntc.zcu.cz Thu Apr 13 05:24:02 2006 From: cimrman3 at ntc.zcu.cz (Robert Cimrman) Date: Thu Apr 13 05:24:02 2006 Subject: [Numpy-discussion] Re: ***[Possible UCE]*** [SciPy-user] Regarding what "where" returns In-Reply-To: <443D857F.9000605@ee.byu.edu> References: <443C0D36.80608@enthought.com> <443D39F6.6040805@enthought.com> <443D601E.3020500@enthought.com> <443D857F.9000605@ee.byu.edu> Message-ID: <443E42A2.80402@ntc.zcu.cz> Travis Oliphant wrote: > I went ahead and made this change to the code. The nonzero function > still behaves as before (and in fact only works for 1-d arrays as it did > in Numeric). > > The where(condition) function works the same as condition.nonzero() and > both always return a tuple. So, for 1-d arrays, using 'nonzero( condition )' should be faster than 'where( condition )[0]', right? r. From charlesr.harris at gmail.com Thu Apr 13 05:35:13 2006 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu Apr 13 05:35:13 2006 Subject: [Numpy-discussion] Toward release 1.0 of NumPy In-Reply-To: <443E096D.3040407@gmx.net> References: <443D9543.8040601@ee.byu.edu> <443E096D.3040407@gmx.net> Message-ID: Sven, On 4/13/06, Sven Schreiber wrote: > > Travis Oliphant schrieb: > > > > The next release of NumPy will be 0.9.8 > > > > > The recent discussions and bug-reports have been very helpful. If you > > have found a bug, please report it on the Trac pages so that we don't > > lose sight of it. > > Report bugs by "submitting a ticket" here: > > > > Before submitting the following as a bug, I would like to repeat what I > posted earlier (no replies) to check whether you agree it's a bug: > > The "kron" (Kronecker product) function returns numpy-arrays even if > both arguments are numpy-matrices; imho that's a bug in light of the > proclaimed goal of preserving matrices where possible/sensible. What would you do instead? The Kronecker product (aka Tensor product) of two matrices isn't a matrix. I suppose you could make it one by appealing to the universal property -- bilinear map on the Cartesian product of linear spaces -> linear map on the tensor product of linear spaces -- but that seems a bit abstract for numpy and you would need to define the indices of the resulting object as some sort of pair. Chuck -------------- next part -------------- An HTML attachment was scrubbed... 
URL:

From pjssilva at ime.usp.br Thu Apr 13 05:51:02 2006
From: pjssilva at ime.usp.br (Paulo Jose da Silva e Silva)
Date: Thu Apr 13 05:51:02 2006
Subject: [Numpy-discussion] Re: range/arange
In-Reply-To:
References: <200604130507.40241.pgmdevlist@mailcan.com>
Message-ID: <1144932598.16449.5.camel@localhost.localdomain>

On Thu, 2006-04-13 at 07:15 -0500, Robert Kern wrote:
>
> Python's rule for integer division is to round towards negative infinity. C's
> rule (if it has one; I think it may be platform dependent) is to round towards
> 0. When it comes to arithmetic, numpy tends to expose the C behavior because
> it's fastest. As Lars pointed out, the type of the object that you get from
> iterating over an array is a numpy int32scalar object, so the numpy behavior is
> used.
>

Actually, in the C99 standard the division was defined to always truncate towards zero; see item 25 in:

http://home.datacomm.ch/t_wolf/tw/c/c9x_changes.html

So it is not platform dependent anymore.

Paulo

Note: It once was platform dependent. Old gcc (for Linux) would truncate towards infinity. I know this because of a "bug" in somebody else's code. It took me quite some time to discover that the problem was the shift in gcc behavior in this matter.

From aisaac at american.edu Thu Apr 13 07:02:11 2006
From: aisaac at american.edu (Alan G Isaac)
Date: Thu Apr 13 07:02:11 2006
Subject: [Numpy-discussion] Toward release 1.0 of NumPy
In-Reply-To:
References: <443D9543.8040601@ee.byu.edu> <443E096D.3040407@gmx.net>
Message-ID:

On Thu, 13 Apr 2006, Charles R Harris apparently wrote:
> The Kronecker product (aka Tensor product) of two
> matrices isn't a matrix.

That is an unusual way to describe things in the world of econometrics. Here is a more common way:

http://planetmath.org/encyclopedia/KroneckerProduct.html

I share Sven's expectation.
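For concreteness (a small sketch; the point is only that the result is itself an ordinary block matrix):

    import numpy

    a = numpy.array([[1, 2],
                     [3, 4]])
    b = numpy.array([[1, 0],
                     [0, 1]])
    # kron(a, b) is the block matrix whose (i, j) block is a[i, j] * b
    print numpy.kron(a, b)
    # [[1 0 2 0]
    #  [0 1 0 2]
    #  [3 0 4 0]
    #  [0 3 0 4]]

So when a and b come in as numpy-matrices, a 4x4 numpy-matrix seems like the natural thing to hand back.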
Cheers, Alan Isaac From fullung at gmail.com Thu Apr 13 07:24:02 2006 From: fullung at gmail.com (Albert Strasheim) Date: Thu Apr 13 07:24:02 2006 Subject: [Numpy-discussion] Segfault when indexing on second or higher dimension with list or tuple In-Reply-To: <20060413121710.GA30372@dogbert.sdsl.sun.ac.za> References: <20060413121710.GA30372@dogbert.sdsl.sun.ac.za> Message-ID: <20060413142246.GA6870@dogbert.sdsl.sun.ac.za> Hello all I've attached a test case that reproduces the bug to the ticket: http://projects.scipy.org/scipy/numpy/attachment/ticket/59/test_list_tuple_indexing.diff I've also created a test case for the recent vectorize bug: http://projects.scipy.org/scipy/numpy/attachment/ticket/52/test_vectorize.diff Regards, Albert On Thu, 13 Apr 2006, Albert Strasheim wrote: > Hello all, > > The following segfault bug was discovered in NumPy 0.9.7.2348 by > someone at our Python workshop: > > import numpy as N > F = N.zeros((1,1)) > F[:,[0]] = 0 > > The following also segfaults: > > F[:,(0,)] = 0 > > Something seems to go wrong when one uses a tuple or a list to index > into a NumPy array on the second or higher dimension, since the > following code works: > > F = N.zeros((1,)) > F[[0]] = 0 > > The Trac ticket is here: > > http://projects.scipy.org/scipy/numpy/ticket/59 > > If someone gets around to fixing this, please include some test cases. > > Thanks! > > Regards, > > Albert > > > ------------------------------------------------------- > This SF.Net email is sponsored by xPML, a groundbreaking scripting language > that extends applications into web and mobile media. Attend the live webcast > and join the prime developer group breaking into this new coding territory! > http://sel.as-us.falkag.net/sel?cmd=lnk&kid=110944&bid=241720&dat=121642 > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/numpy-discussion From oliphant.travis at ieee.org Thu Apr 13 07:58:05 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Thu Apr 13 07:58:05 2006 Subject: [Numpy-discussion] index objects are not broadcastable to a single shape In-Reply-To: <20060413133326.2889a5c5.simon@arrowtheory.com> References: <20060413111612.3bb4e6fc.simon@arrowtheory.com> <443DA966.1020301@ee.byu.edu> <20060413133326.2889a5c5.simon@arrowtheory.com> Message-ID: <443E66AC.2020108@ieee.org> Simon Burton wrote: > On Wed, 12 Apr 2006 19:29:10 -0600 > Travis Oliphant wrote: > > >> The problem with these error messages is that some code is used in a >> wide-variety of circumstances. The original error message was conceived >> in thinking about the application of the code to one circumstance while >> this particular error is occurring in a different one. >> >> The standard behavior is to just propagate the error up. Better error >> messages means catching a lot more errors and special-casing error >> messages. It can be done, but it's tedious work. >> > > OK. Can the error message be a little more generic, longer, etc. ? > > Absolutely, I should have finished the above message with an appeal for more helpful generic messages. All suggestions are welcome. > "shape mismatch (index objects are not broadcastable to a single shape)" ? > Definitely better. I would probably drop the index qualifier as well. Thanks for the tip. 
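Something like this, say (exact wording to be determined):

    >>> a*b
    Traceback (most recent call last):
      File "<stdin>", line 1, in ?
    ValueError: shape mismatch (objects are not broadcastable to a single shape)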
-Travis From oliphant.travis at ieee.org Thu Apr 13 08:16:13 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Thu Apr 13 08:16:13 2006 Subject: [Numpy-discussion] Segfault when indexing on second or higher dimension with list or tuple In-Reply-To: <20060413121710.GA30372@dogbert.sdsl.sun.ac.za> References: <20060413121710.GA30372@dogbert.sdsl.sun.ac.za> Message-ID: <443E6B01.7000906@ieee.org> Albert Strasheim wrote: > Hello all, > > The following segfault bug was discovered in NumPy 0.9.7.2348 by > someone at our Python workshop: > > import numpy as N > F = N.zeros((1,1)) > F[:,[0]] = 0 > > The following also segfaults: > > F[:,(0,)] = 0 > > Something seems to go wrong when one uses a tuple or a list to index > into a NumPy array on the second or higher dimension, since the > following code works: > > The segfault was due to an error condition not being caught. This is now fixed, so now you get (a rather cryptic error). Now, to figure out why this code doesn't work.... -Travis From oliphant.travis at ieee.org Thu Apr 13 08:29:01 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Thu Apr 13 08:29:01 2006 Subject: [Numpy-discussion] Segfault when indexing on second or higher dimension with list or tuple In-Reply-To: <443E6B01.7000906@ieee.org> References: <20060413121710.GA30372@dogbert.sdsl.sun.ac.za> <443E6B01.7000906@ieee.org> Message-ID: <443E6DF1.5020206@ieee.org> Travis Oliphant wrote: > Albert Strasheim wrote: >> Hello all, >> >> The following segfault bug was discovered in NumPy 0.9.7.2348 by >> someone at our Python workshop: >> >> import numpy as N >> F = N.zeros((1,1)) >> F[:,[0]] = 0 >> >> The following also segfaults: >> >> F[:,(0,)] = 0 >> >> Something seems to go wrong when one uses a tuple or a list to index >> into a NumPy array on the second or higher dimension, since the >> following code works: >> >> > The segfault was due to an error condition not being caught. This is > now fixed, so now you get (a rather cryptic error). Now, to figure > out why this code doesn't work.... > The problem is that the code is not handling arbitrary shapes on the RHS of the equal sign. I'll enter a ticket and fix this before 0.9.8. Basically, right now, the RHS needs to have the same shape as the LHS so F[:,[0]] = [[0]] should work already. -Travis From oliphant.travis at ieee.org Thu Apr 13 08:43:14 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Thu Apr 13 08:43:14 2006 Subject: [Numpy-discussion] Re: ***[Possible UCE]*** [SciPy-user] Regarding what "where" returns In-Reply-To: <443E42A2.80402@ntc.zcu.cz> References: <443C0D36.80608@enthought.com> <443D39F6.6040805@enthought.com> <443D601E.3020500@enthought.com> <443D857F.9000605@ee.byu.edu> <443E42A2.80402@ntc.zcu.cz> Message-ID: <443E7150.2010006@ieee.org> Robert Cimrman wrote: > Travis Oliphant wrote: >> I went ahead and made this change to the code. The nonzero >> function still behaves as before (and in fact only works for 1-d >> arrays as it did in Numeric). >> >> The where(condition) function works the same as condition.nonzero() >> and both always return a tuple. > > So, for 1-d arrays, using 'nonzero( condition )' should be faster than > 'where( condition )[0]', right? > No. since the function just selects off the first element of the tuple returned by the method... 
'condition.nonzero()[0]' may be *slightly* faster than 'where(condition)[0]' however

-Travis

From tim.hochberg at cox.net Thu Apr 13 08:44:47 2006
From: tim.hochberg at cox.net (Tim Hochberg)
Date: Thu Apr 13 08:44:47 2006
Subject: [Numpy-discussion] Toward release 1.0 of NumPy
In-Reply-To:
References: <443D9543.8040601@ee.byu.edu> <443E096D.3040407@gmx.net>
Message-ID: <443E7109.6080808@cox.net>

Alan G Isaac wrote:

>On Thu, 13 Apr 2006, Charles R Harris apparently wrote:
>
>>The Kronecker product (aka Tensor product) of two
>>matrices isn't a matrix.
>
>That is an unusual way to describe things in
>the world of econometrics. Here is a more
>common way:
>http://planetmath.org/encyclopedia/KroneckerProduct.html
>I share Sven's expectation.

MathWorld also agrees with you. As does the documentation (as best as I can tell) and the actual output of kron. I think Charles must be thinking of the tensor product instead.

In fact, if you look at the code you see this:

    # TODO: figure out how to keep arrays the same

I think that in general this is going to be a bit of an issue whenever we have multiple arguments. Let me propose the world's second dumbest (in a good way, maybe) procedure:

    def kron(a, b):
        wrappers = [(getattr(x, '__array_priority__', 0), x.__array_wrap__)
                    for x in [a, b] if hasattr(x, '__array_wrap__')]
        if wrappers:
            wrappers.sort()              # highest priority wins ties
            priority, wrap = wrappers[-1]
        else:
            wrap = None
        # ....
        result = concatenate(concatenate(o, axis=1), axis=1)
        if wrap is not None:
            result = wrap(result)
        return result

This generalizes what _wrapit does to arbitrary arguments: it breaks 'ties' where more than one argument wants to wrap something by using __array_priority__. You'd actually want to factor out the wrapper finding code.

Thoughts? Better plans?

-tim

From ryanlists at gmail.com Thu Apr 13 09:11:10 2006
From: ryanlists at gmail.com (Ryan Krauss)
Date: Thu Apr 13 09:11:10 2006
Subject: [Numpy-discussion] where
Message-ID:

Can someone help me understand the proper use of where?

I want to use it like this

myvect=where(f>19.5 and phase>0, f, phase)

but I seem to be getting or rather than and.

Thanks,

Ryan

From oliphant at ee.byu.edu Thu Apr 13 09:18:05 2006
From: oliphant at ee.byu.edu (Travis Oliphant)
Date: Thu Apr 13 09:18:05 2006
Subject: [Numpy-discussion] where
In-Reply-To:
References:
Message-ID: <443E79A5.2000700@ee.byu.edu>

Ryan Krauss wrote:

>Can someone help me understand the proper use of where?
>
>I want to use it like this
>
>myvect=where(f>19.5 and phase>0, f, phase)
>
>but I seem to be getting or rather than and.

It is probably your use of the 'and' statement. Use '&' instead:

(f > 19.5) & (phase > 0)

What version are you using? In numarray and NumPy the use of 'and' like this should raise an error if 'f' and/or 'phase' are arrays of more than one element.

-Travis

From ryanlists at gmail.com Thu Apr 13 09:27:06 2006
From: ryanlists at gmail.com (Ryan Krauss)
Date: Thu Apr 13 09:27:06 2006
Subject: [Numpy-discussion] where
In-Reply-To: <443E79A5.2000700@ee.byu.edu>
References: <443E79A5.2000700@ee.byu.edu>
Message-ID:

Does where return a mask?

If I do

myvect=where((f > 19.5) & (phase > 0),f,phase)

myvect is the same length as f and phase and there is some modification of the values where the condition is met, but what that modification is is unclear to me.

If I do

myind=where((f > 19.5) & (phase > 0))

I seem to get the indices of the points where both conditions are met.

I am using version 0.9.5.2043.
I see those kinds of errors about truth testing an array often, but not in this case. Thanks, Ryan On 4/13/06, Travis Oliphant wrote: > Ryan Krauss wrote: > > >Can someone help me understand the proper use of where? > > > >I want to use it like this > > > >myvect=where(f>19.5 and phase>0, f, phase) > > > >but I seem to be getting or rather than and. > > > > > > > It is probably your use of the 'and' statement. Use '&' instead > > (f > 19.5) & (phase > 0) > > What version are you using. In numarray and NumPy the use of 'and' like > this should raise an error if 'f' and/or 'phase' are arrays of more than > one element. > > -Travis > > From oliphant at ee.byu.edu Thu Apr 13 09:39:04 2006 From: oliphant at ee.byu.edu (Travis Oliphant) Date: Thu Apr 13 09:39:04 2006 Subject: [Numpy-discussion] where In-Reply-To: References: <443E79A5.2000700@ee.byu.edu> Message-ID: <443E7E7B.2030203@ee.byu.edu> Ryan Krauss wrote: >Does where return a mask? > > Only in the second use case... >If I do >myvect=where((f > 19.5) & (phase > 0),f,phase) >myvect is the same length as f and phase and there is some >modification of the values where the condition is met, but what that >modification is is unclear to me. > > The behavior of where(condition, for_true, for_false) is to return an array of the same shape as condition with elements of for_true where condition is true and for_false where condition is false. Thus myvect will contain elements of f where the condition is met and elements of phase otherwise. >If I do >myind=where((f > 19.5) & (phase > 0)) >I seem to get the indices of the points where both conditions are met. > > Yes. That is correct. It is a different use-case... Note, however, that in the current SVN version of NumPy, this use-case will always return a tuple of indices (use the nonzero function instead for behavior that will stay constant). For your 1-d example (I'm guessing it's 1-d) where will return a length-1 tuple. >I am using version 0.9.5.2043. I see those kinds of errors about >truth testing an array often, but not in this case. > > That is strange. What are the sizes of f and phase? -Travis From robert.kern at gmail.com Thu Apr 13 09:42:04 2006 From: robert.kern at gmail.com (Robert Kern) Date: Thu Apr 13 09:42:04 2006 Subject: [Numpy-discussion] Re: where In-Reply-To: References: <443E79A5.2000700@ee.byu.edu> Message-ID: Ryan Krauss wrote: > Does where return a mask? > > If I do > myvect=where((f > 19.5) & (phase > 0),f,phase) > myvect is the same length as f and phase and there is some > modification of the values where the condition is met, but what that > modification is is unclear to me. > > If I do > myind=where((f > 19.5) & (phase > 0)) > I seem to get the indices of the points where both conditions are met. > > I am using version 0.9.5.2043. I see those kinds of errors about > truth testing an array often, but not in this case. Have you read the docstring? In [33]: where? Type: builtin_function_or_method Base Class: String Form: Namespace: Interactive Docstring: where(condition, | x, y) is shaped like condition and has elements of x and y where condition is respectively true or false. If x or y are not given, then it is equivalent to nonzero(condition). -- Robert Kern robert.kern at gmail.com "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." 
-- Umberto Eco From ryanlists at gmail.com Thu Apr 13 09:44:01 2006 From: ryanlists at gmail.com (Ryan Krauss) Date: Thu Apr 13 09:44:01 2006 Subject: [Numpy-discussion] where In-Reply-To: <443E7E7B.2030203@ee.byu.edu> References: <443E79A5.2000700@ee.byu.edu> <443E7E7B.2030203@ee.byu.edu> Message-ID: f and phase are each (4250,) I have something that is working but doesn't use where. Can this be done easier using where: f1=f>19.5 f2=f<38 myf=f1&f2 myp=phase>0 myind=myf&myp correction=myind*-360 newphase=phase+correction Basically, can where give me an output vector of the same size as f and phase where the output is either 1 or 0? Ryan On 4/13/06, Travis Oliphant wrote: > Ryan Krauss wrote: > > >Does where return a mask? > > > > > Only in the second use case... > > >If I do > >myvect=where((f > 19.5) & (phase > 0),f,phase) > >myvect is the same length as f and phase and there is some > >modification of the values where the condition is met, but what that > >modification is is unclear to me. > > > > > > The behavior of > > where(condition, for_true, for_false) > > is to return an array of the same shape as condition with elements of > for_true where condition is true and > for_false where condition is false. > > Thus myvect will contain elements of f where the condition is met and > elements of phase otherwise. > > >If I do > >myind=where((f > 19.5) & (phase > 0)) > >I seem to get the indices of the points where both conditions are met. > > > > > Yes. That is correct. It is a different use-case... Note, however, > that in the current SVN version of NumPy, this use-case will always > return a tuple of indices (use the nonzero function instead for behavior > that will stay constant). For your 1-d example (I'm guessing it's 1-d) > where will return a length-1 tuple. > > >I am using version 0.9.5.2043. I see those kinds of errors about > >truth testing an array often, but not in this case. > > > > > That is strange. What are the sizes of f and phase? > > -Travis > > From robert.kern at gmail.com Thu Apr 13 09:54:05 2006 From: robert.kern at gmail.com (Robert Kern) Date: Thu Apr 13 09:54:05 2006 Subject: [Numpy-discussion] Re: where In-Reply-To: References: <443E79A5.2000700@ee.byu.edu> <443E7E7B.2030203@ee.byu.edu> Message-ID: Ryan Krauss wrote: > f and phase are each (4250,) > > I have something that is working but doesn't use where. Can this be > done easier using where: > > f1=f>19.5 > f2=f<38 > myf=f1&f2 > myp=phase>0 > myind=myf&myp > correction=myind*-360 > newphase=phase+correction (untested) phase[((f>19.5) & (f<38)) & (phase>0)] -= 360 > Basically, can where give me an output vector of the same size as f > and phase where the output is either 1 or 0? Why? The condition array that you would pass into where() is already such an array. -- Robert Kern robert.kern at gmail.com "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." 
-- Umberto Eco

From arnd.baecker at web.de Thu Apr 13 10:07:14 2006
From: arnd.baecker at web.de (Arnd Baecker)
Date: Thu Apr 13 10:07:14 2006
Subject: [Numpy-discussion] range/arange
In-Reply-To: <200604131123.56171.lars.bittrich@googlemail.com>
References: <200604130507.40241.pgmdevlist@mailcan.com> <200604131123.56171.lars.bittrich@googlemail.com>
Message-ID:

On Thu, 13 Apr 2006, Lars Bittrich wrote:

> Hi,
>
> On Thursday 13 April 2006 11:07, Pierre GM wrote:
> > Could any of you explain me why the two following commands give different
> > results ? It's mere curiosity, for my personal edification.
> >
> > [(m-5)/10 for m in arange(1,10)]
> > [0, 0, 0, 0, 0, 0, 0, 0, 0]
> >
> > [(m-5)/10 for m in range(1,10)]
> > [-1, -1, -1, -1, 0, 0, 0, 0, 0]
>
> I have no idea where the reason is located exactly, but it seems to be caused
> by different types of range and arange.

Interestingly with Numeric you get the following:

In [1]: from Numeric import *
In [2]: [(m-5)/10 for m in arange(1,10)]
Out[2]: [-1, -1, -1, -1, 0, 0, 0, 0, 0]
In [3]: type(arange(1,10)[0])
Out[3]: <type 'int'>

Will this cause any trouble for projects transitioning from Numeric to numpy? Presumably a proper explanation (which?) should go into the scipy wiki ("Converting from Numeric").

> In [15]: type(arange(1,10)[0])
> Out[15]: <type 'int32scalar'>
>
> In [14]: type(range(1,10)[0])
> Out[14]: <type 'int'>
>
> If you use for example:
>
> In [16]: -1/10
> Out[16]: -1
>
> you get the normal behavior of the "floor" function.
>
> In [17]: floor(-.1)
> Out[17]: -1.0
>
> The behavior of int32scalar seems more intuitive to me.

Me too.

Best, Arnd

From ryanlists at gmail.com Thu Apr 13 10:12:06 2006
From: ryanlists at gmail.com (Ryan Krauss)
Date: Thu Apr 13 10:12:06 2006
Subject: [Numpy-discussion] Re: where
In-Reply-To:
References: <443E79A5.2000700@ee.byu.edu>
Message-ID:

Sorry, I can't explain myself. I read the docstring and it didn't make sense before. Now it seems clear enough. Somehow I got it in my head that I needed to be passing f and phase so that condition could use them. It turns out that this:

myvect=where((f>19.5) & (f<38) & (phase>0),ones(shape(phase)),zeros(shape(phase)))

does exactly what I want.

Ryan

On 4/13/06, Robert Kern wrote:
> Ryan Krauss wrote:
> > Does where return a mask?
> >
> > If I do
> > myvect=where((f > 19.5) & (phase > 0),f,phase)
> > myvect is the same length as f and phase and there is some
> > modification of the values where the condition is met, but what that
> > modification is is unclear to me.
> >
> > If I do
> > myind=where((f > 19.5) & (phase > 0))
> > I seem to get the indices of the points where both conditions are met.
> >
> > I am using version 0.9.5.2043. I see those kinds of errors about
> > truth testing an array often, but not in this case.
>
> Have you read the docstring?
>
> In [33]: where?
> Type: builtin_function_or_method
> Base Class:
> String Form:
> Namespace: Interactive
> Docstring:
> where(condition, | x, y) is shaped like condition and has elements of x and
> y where condition is respectively true or false. If x or y are not given, then
> it is equivalent to nonzero(condition).
>
> --
> Robert Kern
> robert.kern at gmail.com

From ryanlists at gmail.com Thu Apr 13 10:15:03 2006
From: ryanlists at gmail.com (Ryan Krauss)
Date: Thu Apr 13 10:15:03 2006
Subject: [Numpy-discussion] Re: where
In-Reply-To:
References: <443E79A5.2000700@ee.byu.edu> <443E7E7B.2030203@ee.byu.edu>
Message-ID:

> Why? The condition array that you would pass into where() is already such an array.

That is the key point I was missing. Until I played around with the conditions myself I didn't get that I was passing in an explicit array of 1's and 0's. I guess I thought I was passing in some magic expression that where was somehow making sense of. That is why I thought I would need to pass f and phase to the function.
>
> Ryan
>
> [snip]

From oliphant at ee.byu.edu Thu Apr 13 10:49:06 2006
From: oliphant at ee.byu.edu (Travis Oliphant)
Date: Thu Apr 13 10:49:06 2006
Subject: [Numpy-discussion] range/arange
In-Reply-To:
References: <200604130507.40241.pgmdevlist@mailcan.com> <200604131123.56171.lars.bittrich@googlemail.com>
Message-ID: <443E8EEB.9070609@ee.byu.edu>

Arnd Baecker wrote:

> [snip]
>
> Will this cause any trouble for projects
> transitioning from Numeric to numpy?
> Presumably a proper explanation (which?)
> should go into the scipy wiki ("Converting from Numeric").

Yes, some discussion will be needed about the fact that NumPy now has its
own scalars. This will give us quite a bit more flexibility moving forward
and should be seamless for the most part.

-Travis

From pgmdevlist at mailcan.com Thu Apr 13 11:29:09 2006
From: pgmdevlist at mailcan.com (Pierre GM)
Date: Thu Apr 13 11:29:09 2006
Subject: [Numpy-discussion] Re: range/arange
In-Reply-To:
References: <200604130507.40241.pgmdevlist@mailcan.com>
Message-ID: <200604131456.48570.pgmdevlist@mailcan.com>

> Python's rule for integer division is to round towards negative infinity.
> C's rule (if it has one; I think it may be platform dependent) is to round
> towards 0.

Ah OK. That makes sense, and it's something I'll have to keep in mind later
on. Thanks y'all for your answers, I feel quite edified now :)
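The two rounding rules Pierre is being pointed at can be checked in a few
lines. A standalone sketch, with the truncating rule written out as a
hypothetical helper (c_div is not a numpy or Python function, just an
illustration):

import math

# Python's integer division floors: it rounds towards negative infinity,
# in agreement with math.floor.
assert (-1) // 10 == -1
assert math.floor(-0.1) == -1.0

# C-style division truncates, i.e. rounds towards zero.  This is what the
# 2006-era int32 scalar in the thread above was doing when (1-5)/10 came
# out as 0.
def c_div(a, b):
    q = abs(a) // abs(b)
    return q if (a >= 0) == (b >= 0) else -q

assert c_div(-1, 10) == 0       # truncation
assert c_div(1 - 5, 10) == 0    # numpy's answer in the original example
assert (1 - 5) // 10 == -1      # Python's answer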
From ndarray at mac.com Thu Apr 13 11:53:00 2006
From: ndarray at mac.com (Sasha)
Date: Thu Apr 13 11:53:00 2006
Subject: [Numpy-discussion] Toward release 1.0 of NumPy
In-Reply-To: <443D9543.8040601@ee.byu.edu>
References: <443D9543.8040601@ee.byu.edu>
Message-ID:

On 4/12/06, Travis Oliphant wrote:
> ... This also dove-tails nicely
> with the Python 2.5 release schedule so that NumPy 1.0 should work with
> Python 2.5 and be fully 64-bit capable for handling very-large arrays.

I would like to mention one feature that is going to appear in Python 2.5
that covers some of the functionality of NumPy. I am talking about the
ctypes module. Like NumPy, ctypes provides a set of Python classes that
represent basic C types:

c_byte
c_char
c_char_p
c_double
c_float
c_int
c_long
c_short
c_ubyte
...

and the ability to describe composite structures. The latter functionality
is very close to what the dtype class provides in numpy.

There are some features in ctypes that I like better than similar features
in numpy. For example, in ctypes a fixed-width array is described by
multiplying a basic type by an integer:

>>> c_char * 10

I find this approach more elegant than numpy's dtype('S10').

It looks like there is some synergy to be exploited here, particularly in
the area of record arrays.

From oliphant at ee.byu.edu Thu Apr 13 12:49:02 2006
From: oliphant at ee.byu.edu (Travis Oliphant)
Date: Thu Apr 13 12:49:02 2006
Subject: [Numpy-discussion] Toward release 1.0 of NumPy
In-Reply-To:
References: <443D9543.8040601@ee.byu.edu>
Message-ID: <443EAB01.8040700@ee.byu.edu>

Sasha wrote:

> [snip]
>
> It looks like there is some synergy to be exploited here, particularly
> in the area of record arrays.

Definitely. I'm not familiar enough with ctypes to do this. Any help is
appreciated.

-Travis
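The parallel Sasha draws can be made concrete. A small illustrative sketch,
not from the original mails, putting the standard-library ctypes spelling
next to the numpy one (the Point structure is an invented example):

import ctypes
import numpy

TenChars = ctypes.c_char * 10        # fixed-width array type: type * length
print(TenChars)                      # a c_char array class of length 10
print(numpy.dtype('S10').itemsize)   # 10 -- numpy's spelling of that field

# Composite structures line up the same way:
class Point(ctypes.Structure):
    _fields_ = [('x', ctypes.c_double), ('y', ctypes.c_double)]

point_dtype = numpy.dtype([('x', numpy.float64), ('y', numpy.float64)])
print(ctypes.sizeof(Point), point_dtype.itemsize)   # both 16, typically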
From charlesr.harris at gmail.com Thu Apr 13 13:33:08 2006
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Thu Apr 13 13:33:08 2006
Subject: [Numpy-discussion] Toward release 1.0 of NumPy
In-Reply-To: <443E7109.6080808@cox.net>
References: <443D9543.8040601@ee.byu.edu> <443E096D.3040407@gmx.net> <443E7109.6080808@cox.net>
Message-ID:

Tim,

On 4/13/06, Tim Hochberg wrote:
> Alan G Isaac wrote:
> > On Thu, 13 Apr 2006, Charles R Harris apparently wrote:
> > > The Kronecker product (aka Tensor product) of two
> > > matrices isn't a matrix.
> >
> > That is an unusual way to describe things in
> > the world of econometrics. Here is a more
> > common way:
> > http://planetmath.org/encyclopedia/KroneckerProduct.html
> > I share Sven's expectation.
>
> mathworld also agrees with you. As does the documentation (as best as I
> can tell) and the actual output of kron. I think Charles must be
> thinking of the tensor product instead.

It *is* the tensor product, A \tensor B, but it is not the most general
tensor with four indices, just as a bivector is not the most general tensor
with two indices. Numerically, kron chooses to represent the tensor product
of two vector spaces a, b with dimensions n, m respectively as the direct
sum of n copies of b, and the tensor product of two operators takes the
given form. More generally, the B matrix in each spot could be replaced
with an arbitrary matrix of the correct dimensions and you would recover
the general tensor with four indices.

Anyway, it sounds like you are proposing that the tensor (outer) product of
two matrices be reshaped to run over two indices. It seems that likewise
the tensor (outer) product of two vectors should be reshaped to run over
one index (i.e. flat). That would do the trick.

Chuck

From charlesr.harris at gmail.com Thu Apr 13 14:19:01 2006
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Thu Apr 13 14:19:01 2006
Subject: [Numpy-discussion] Toward release 1.0 of NumPy
In-Reply-To:
References: <443D9543.8040601@ee.byu.edu> <443E096D.3040407@gmx.net> <443E7109.6080808@cox.net>
Message-ID:

Tim,

In particular:

def kron(a, b):
    n = shape(a)[1]*shape(b)[1]
    c = transpose(multiply.outer(a, b), axes=(0, 2, 1, 3)).reshape(-1, n)
    # wrap c as a matrix.

On 4/13/06, Charles R Harris wrote:
> [snip]
>
> Chuck

From tim.hochberg at cox.net Thu Apr 13 14:32:04 2006
From: tim.hochberg at cox.net (Tim Hochberg)
Date: Thu Apr 13 14:32:04 2006
Subject: [Numpy-discussion] Toward release 1.0 of NumPy
In-Reply-To:
References: <443D9543.8040601@ee.byu.edu> <443E096D.3040407@gmx.net> <443E7109.6080808@cox.net>
Message-ID: <443EC2B4.807@cox.net>

Charles R Harris wrote:

> [snip]
> Anyway, it sounds like you are proposing that the tensor (outer) product
> of two matrices be reshaped to run over two indices. It seems that
> likewise the tensor (outer) product of two vectors should be reshaped to
> run over one index (i.e. flat). That would do the trick.

I'm not proposing anything. I don't care at all what kron does. I just want
to fix the return type if that's feasible so that people stop complaining
about it. As far as I can tell, kron already returns a flattened tensor
product of some sort. I believe the general tensor product that you are
talking about is already covered by multiply.outer, but I'm not sure so
correct me if I'm wrong. Here's what kron does at present:

>>> a
array([[1, 1],
       [1, 1]])
>>> kron(a,a)  # => 4x4 matrix
array([[1, 1, 1, 1],
       [1, 1, 1, 1],
       [1, 1, 1, 1],
       [1, 1, 1, 1]])
>>> kron(a,a[0])  # => 8x1
array([1, 1, 1, 1, 1, 1, 1, 1])
>>> kron(a[0], a[0])
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "C:\Python24\Lib\site-packages\numpy\lib\shape_base.py", line 577, in kron
    result = concatenate(concatenate(o, axis=1), axis=1)
ValueError: 0-d arrays can't be concatenated
>>> b.shape
(2, 2, 2)
>>> kron(b,b).shape
(4, 4, 2, 2)

So, it looks like the 2d x 2d product obeys Alan's definition. The other
products are probably all broken.

Regards,

-tim

From charlesr.harris at gmail.com Thu Apr 13 16:02:04 2006
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Thu Apr 13 16:02:04 2006
Subject: [Numpy-discussion] Toward release 1.0 of NumPy
In-Reply-To: <443EC2B4.807@cox.net>
References: <443D9543.8040601@ee.byu.edu> <443E096D.3040407@gmx.net> <443E7109.6080808@cox.net> <443EC2B4.807@cox.net>
Message-ID:

On 4/13/06, Tim Hochberg wrote:
> [snip]
> Here's what kron does at present:
>
> >>> a
> array([[1, 1],
>        [1, 1]])
> >>> kron(a,a)  # => 4x4 matrix
> array([[1, 1, 1, 1],
>        [1, 1, 1, 1],
>        [1, 1, 1, 1],
>        [1, 1, 1, 1]])

Good at first look. Let's see a simpler version... Nevermind, seems numpy
isn't working on this machine (X86_64, fc5 64 bit) at the moment, maybe I
need to check out a clean version.

> >>> kron(a,a[0])  # => 8x1
> array([1, 1, 1, 1, 1, 1, 1, 1])

Looks broken. a[0] should be an operator (matrix), so either it should be
(2,1) or (1,2). In the first case, the return should have shape (4,2), in
the latter (2,4). Should probably raise an error, as the result strikes me
as ambiguous. But I have to admit I am not sure what the point of this
particular construction is.

> >>> kron(a[0], a[0])
> Traceback (most recent call last):
>   ...
> ValueError: 0-d arrays can't be concatenated

See above. This could be (1,4) or (4,1), depending.

> >>> b.shape
> (2, 2, 2)
> >>> kron(b,b).shape
> (4, 4, 2, 2)

I think this is doing transpose(outer(b,b), axes=(0,2,1,3)) and reshaping
the first 4 indices into 2. Again, I am not sure what the point is for
these operators. Now another way to get all this functionality is to have a
contraction function or method with a list of axes. For instance, consider
the matrices A(i,j) and B(k,l) operating on x(j) and y(l) like A(i,j)x(j)
and B(k,l)y(l), then the outer product of all of these is

A(i,j)B(k,l)x(j)y(l)

with the summation convention on the indices j and l. The result should be
the same as kron(A,B)*kron(x,y) up to a permutation of rows and columns. It
is just a question of which basis is used and how the elements are indexed.

> So, it looks like the 2d x 2d product obeys Alan's definition. The other
> products are probably all broken.

Chuck

From aisaac at american.edu Thu Apr 13 16:21:08 2006
From: aisaac at american.edu (Alan G Isaac)
Date: Thu Apr 13 16:21:08 2006
Subject: [Numpy-discussion] Toward release 1.0 of NumPy
In-Reply-To: <443EC2B4.807@cox.net>
References: <443D9543.8040601@ee.byu.edu> <443E096D.3040407@gmx.net> <443E7109.6080808@cox.net> <443EC2B4.807@cox.net>
Message-ID:

On Thu, 13 Apr 2006, Tim Hochberg apparently wrote:
> Here's what kron does at present:

As possible context:
http://www.mathworks.com/access/helpdesk/help/techdoc/ref/kron.html#998881
http://www.aptech.com/pdf_man/basicgauss.pdf p.69
In this sense, the 2-d handling is not surprising.

Cheers,
Alan Isaac
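Alan's references define kron(A, B) as the block matrix whose (i, j) block
is A[i, j]*B. That the reshaped outer product sketched earlier in the
thread computes the same thing is easy to verify; a standalone check, with
arbitrarily chosen shapes:

import numpy as np

a = np.arange(6.0).reshape(2, 3)
b = np.arange(8.0).reshape(2, 4)

# Block-matrix definition: block (i, j) of the result is a[i, j] * b.
blocks = np.multiply.outer(a, b)                  # shape (2, 3, 2, 4)
k = blocks.transpose(0, 2, 1, 3).reshape(2 * 2, 3 * 4)

assert np.allclose(k, np.kron(a, b))              # agrees with numpy's kron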
From charlesr.harris at gmail.com Thu Apr 13 16:32:01 2006
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Thu Apr 13 16:32:01 2006
Subject: [Numpy-discussion] Toward release 1.0 of NumPy
In-Reply-To:
References: <443D9543.8040601@ee.byu.edu> <443E096D.3040407@gmx.net> <443E7109.6080808@cox.net> <443EC2B4.807@cox.net>
Message-ID:

Hi,

On 4/13/06, Alan G Isaac wrote:
> On Thu, 13 Apr 2006, Tim Hochberg apparently wrote:
> > Here's what kron does at present:
>
> As possible context:
> http://www.mathworks.com/access/helpdesk/help/techdoc/ref/kron.html#998881
> http://www.aptech.com/pdf_man/basicgauss.pdf p.69
> In this sense, the 2-d handling is not surprising.

Yep, that is what the little Python routine I gave above does. Note that in
these cases only matrices are involved. Matlab, for instance, defines
vectors as (1,n) or (n,1), which is actually helpful in minding the
distinction between a vector space and its dual. I don't know how the numpy
matrix package works, but vectors of rank 1 are going to be a constant
source of ambiguity.

> Cheers,
> Alan Isaac

Chuck

From tim.hochberg at cox.net Thu Apr 13 16:37:04 2006
From: tim.hochberg at cox.net (Tim Hochberg)
Date: Thu Apr 13 16:37:04 2006
Subject: [Numpy-discussion] Toward release 1.0 of NumPy
In-Reply-To:
References: <443D9543.8040601@ee.byu.edu> <443E096D.3040407@gmx.net> <443E7109.6080808@cox.net> <443EC2B4.807@cox.net>
Message-ID: <443EDFE7.6010509@cox.net>

Charles R Harris wrote:

> [snip]
>
> > I'm not proposing anything. I don't care at all what kron does. I just
> > want to fix the return type if that's feasible so that people stop
> > complaining about it. As far as I can tell, kron already returns a
> > flattened tensor product of some sort.
> > I believe the general tensor product that you are talking about is
> > already covered by multiply.outer, but I'm not sure so correct me if
> > I'm wrong. Here's what kron does at present:
> >
> > >>> kron(a,a[0])  # => 8x1
> > array([1, 1, 1, 1, 1, 1, 1, 1])
>
> Looks broken. a[0] should be an operator (matrix), so either it should
> be (2,1) or (1,2).

Since a is an array here, a[0] is shape (2,). Let's repeat this exercise
using matrices, which are always rank-2, and see if they make sense.

>>> m
matrix([[1, 1],
        [1, 1]])
>>> kron(m, m[0])
matrix([[1, 1, 1, 1],
        [1, 1, 1, 1]])
>>> kron(m,m[:,0])
matrix([[1, 1],
        [1, 1],
        [1, 1],
        [1, 1]])

That looks OK.

> In the first case, the return should have shape (4,2), in the latter
> (2,4). Should probably raise an error, as the result strikes me as
> ambiguous. But I have to admit I am not sure what the point of this
> particular construction is.
>
> > >>> kron(a[0], a[0])
> > Traceback (most recent call last):
> >   ...
> > ValueError: 0-d arrays can't be concatenated

>>> kron(m[0], m[0])
matrix([[1, 1, 1, 1]])
>>> kron(m[:,0], m[:,0])
matrix([[1],
        [1],
        [1],
        [1]])
>>> kron(m[:,0],m[0])
matrix([[1, 1],
        [1, 1]])

> See above. This could be (1,4) or (4,1), depending.

All of these look like they're probably right without thinking about it
too hard.

> > >>> b.shape
> > (2, 2, 2)
> > >>> kron(b,b).shape
> > (4, 4, 2, 2)
>
> [snip]

Here's my best guess as to what is going on:

1. There is a relatively large group of people who use Kronecker product
as Alan does (probably the matrix as opposed to tensor math folks). I'm
guessing it's a large group since they manage to write the definitions at
both mathworld and planetmath.
2. kron was meant to implement this.
2.5 People who need the other meaning of kron can just use outer, so no
real conflict.
3. The implementation was either inappropriately generalized or it was
assumed that all inputs would be matrices (and hence rank-2).

Assuming 3. is correct, and I'd like to hear from people if they think that
the behaviour in the non rank-2 cases is sensible, the next question is
whether the behaviour in the rank-2 cases makes sense. It seems to, but I'm
not a user of kron.
If both of the preceding are true, it seems like a complete fix entails the
following two things:

1. Forbid arguments that are not rank-2. This allows all matrices, which is
really the main target here, I think.
2. Fix the return type issue. I have a fix for this ready to commit, but I
want to figure out the first part as well.

Regards,

-tim
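Tim's two-point fix can be sketched in a few lines. This is only an
illustration of the proposal as stated, not the patch he actually
committed, and kron_rank2 is a made-up name:

import numpy as np

def kron_rank2(a, b):
    # Point 1: forbid anything that is not rank-2; matrices always qualify.
    if np.ndim(a) != 2 or np.ndim(b) != 2:
        raise ValueError("kron is only defined for rank-2 arguments")
    out = np.kron(np.asarray(a), np.asarray(b))
    # Point 2: fix the return type, so matrix inputs give a matrix back.
    if isinstance(a, np.matrix) or isinstance(b, np.matrix):
        out = np.matrix(out)
    return out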
From charlesr.harris at gmail.com Thu Apr 13 17:14:32 2006
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Thu Apr 13 17:14:32 2006
Subject: [Numpy-discussion] Toward release 1.0 of NumPy
In-Reply-To: <443EDFE7.6010509@cox.net>
References: <443D9543.8040601@ee.byu.edu> <443E096D.3040407@gmx.net> <443E7109.6080808@cox.net> <443EC2B4.807@cox.net> <443EDFE7.6010509@cox.net>
Message-ID:

On 4/13/06, Tim Hochberg wrote:
> [snip]
>
> Here's my best guess as to what is going on:
>
> 1. There is a relatively large group of people who use Kronecker product
> as Alan does (probably the matrix as opposed to tensor math folks). I'm
> guessing it's a large group since they manage to write the definitions at
> both mathworld and planetmath.
> 2. kron was meant to implement this.
> 2.5 People who need the other meaning of kron can just use outer, so no
> real conflict.
> 3. The implementation was either inappropriately generalized or it was
> assumed that all inputs would be matrices (and hence rank-2).

Uh-huh.

> Assuming 3. is correct, and I'd like to hear from people if they think
> that the behaviour in the non rank-2 cases is sensible, the next question
> is whether the behaviour in the rank-2 cases makes sense. It seems to,
> but I'm not a user of kron. If both of the preceding are true, it seems
> like a complete fix entails the following two things:
>
> 1. Forbid arguments that are not rank-2. This allows all matrices, which
> is really the main target here, I think.
> 2. Fix the return type issue. I have a fix for this ready to commit, but
> I want to figure out the first part as well.
I think it was inappropriately generalized; it is hard to make sense of
what kron means for rank > 2. So I vote for restricting the usage to
matrices, or arrays of rank two. This avoids both the ambiguity of rank-1
arrays and the big "why?" that arises for arrays with rank > 2. Note that
in tensor algebra the rank-1 problem is solved by the use of upper or lower
indices: lower index => [1,n], upper index => [n,1]. Hmm, I should check
that kron is associative:

kron(kron(a,b),c) == kron(a, kron(b,c))

like a good tensor product should be. I suspect it is.

> Regards,
>
> -tim

Chuck
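Charles's associativity conjecture is easy to confirm numerically; a quick
standalone check, with arbitrarily chosen shapes:

import numpy as np

np.random.seed(0)
a = np.random.rand(2, 3)
b = np.random.rand(3, 2)
c = np.random.rand(2, 2)

# kron associates, as a tensor product should:
assert np.allclose(np.kron(np.kron(a, b), c),
                   np.kron(a, np.kron(b, c)))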
From charlesr.harris at gmail.com Thu Apr 13 17:22:01 2006
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Thu Apr 13 17:22:01 2006
Subject: [Numpy-discussion] Problem on FC5
Message-ID:

Has anyone else seen this:

> Python 2.4.2 (#1, Feb 12 2006, 03:45:41)
> [GCC 4.1.0 20060210 (Red Hat 4.1.0-0.24)] on linux2
> Type "help", "copyright", "credits" or "license" for more information.
> >>> from numpy import *
> *** buffer overflow detected ***: python terminated
> ======= Backtrace: =========
> /lib64/libc.so.6(__chk_fail+0x2f)[0x32c76dee3f]
> /usr/lib64/python2.4/site-packages/numpy/core/multiarray.so[0x2aaaae191099]

This is on FC5-x86_64. I didn't see any problems in the compilation and the
right lib64 libs seem to have been used.

Chuck

From ivazquez at ivazquez.net Thu Apr 13 17:48:18 2006
From: ivazquez at ivazquez.net (Ignacio Vazquez-Abrams)
Date: Thu Apr 13 17:48:18 2006
Subject: [Numpy-discussion] Problem on FC5
In-Reply-To:
References:
Message-ID: <1144975662.3758.3.camel@ignacio.lan>

On Thu, 2006-04-13 at 18:21 -0600, Charles R Harris wrote:
> this is on FC5-x86_64. I didn't see any problems in the compilation
> and the right lib64 libs seem to have been used.

Self-built or from Fedora Extras?

--
Ignacio Vazquez-Abrams
http://fedora.ivazquez.net/

gpg --keyserver hkp://subkeys.pgp.net --recv-key 38028b72

From charlesr.harris at gmail.com Thu Apr 13 19:04:10 2006
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Thu Apr 13 19:04:10 2006
Subject: [Numpy-discussion] Problem on FC5
In-Reply-To: <1144975662.3758.3.camel@ignacio.lan>
References: <1144975662.3758.3.camel@ignacio.lan>
Message-ID:

OK, I solved this problem by deleting the numpy directory in site-packages.
I probably should have tried that first :-/

On 4/13/06, Ignacio Vazquez-Abrams wrote:
> On Thu, 2006-04-13 at 18:21 -0600, Charles R Harris wrote:
> > this is on FC5-x86_64. I didn't see any problems in the compilation
> > and the right lib64 libs seem to have been used.
>
> Self-built or from Fedora Extras?

From chanley at stsci.edu Fri Apr 14 07:27:03 2006
From: chanley at stsci.edu (Christopher Hanley)
Date: Fri Apr 14 07:27:03 2006
Subject: [Numpy-discussion] numpy.test() segfaults under Solaris 8
Message-ID: <443FB11E.5040102@stsci.edu>

From the daily Solaris 8 regression tests:

Found 5 tests for numpy.distutils.misc_util
Found 4 tests for numpy.lib.getlimits
Found 30 tests for numpy.core.numerictypes
Found 13 tests for numpy.core.umath
Found 8 tests for numpy.lib.arraysetops
Found 42 tests for numpy.lib.type_check
Found 90 tests for numpy.core.multiarray
Found 3 tests for numpy.dft.helper
Found 36 tests for numpy.core.ma
Found 2 tests for numpy.core.oldnumeric
Found 9 tests for numpy.lib.twodim_base
Found 8 tests for numpy.core.defmatrix
Found 1 tests for numpy.lib.ufunclike
Found 32 tests for numpy.lib.function_base
Found 1 tests for numpy.lib.polynomial
Found 6 tests for numpy.core.records
Found 17 tests for numpy.core.numeric
Found 4 tests for numpy.lib.index_tricks
Found 44 tests for numpy.lib.shape_base
Found 0 tests for __main__

[many "Warning: No test file found" lines and several hundred individual
test results elided; every test up to this point reported "ok"]

check_simple (numpy.lib.tests.test_function_base.test_unwrap) ... ok
check_vectorize (numpy.lib.tests.test_function_base.test_vectorize)Segmentation Fault (core dumped)

This is a clean checkout and build of numpy that is done every morning on a
Solaris 8 system. We are currently using python 2.4.2 on this machine. The
equivalent build and test on a RHE system passed with no problems.

Chris

From fullung at gmail.com Fri Apr 14 08:15:14 2006
From: fullung at gmail.com (Albert Strasheim)
Date: Fri Apr 14 08:15:14 2006
Subject: [Numpy-discussion] Vectorize bug
In-Reply-To: <20060412124032.GA30471@sun.ac.za>
Message-ID: <006c01c65fd6$2d043b90$0502010a@dsp.sun.ac.za>

Hello all

There still seems to be a problem with vectorize (or something else). So far
I've only been able to reproduce the problem by running the test suite 5
times under IPython on Windows (weird, eh?). Details here:

http://projects.scipy.org/scipy/numpy/ticket/52

If anybody has some ideas on how to do a proper debug build with MinGW so
that I can get a useful stack trace from the Visual Studio debugger, I can
narrow down the problem further.

Regards,

Albert

> -----Original Message-----
> From: numpy-discussion-admin at lists.sourceforge.net [mailto:numpy-
> discussion-admin at lists.sourceforge.net] On Behalf Of Stefan van der Walt
> Sent: 12 April 2006 14:41
> To: numpy-discussion at lists.sourceforge.net
> Subject: [Numpy-discussion] Vectorize bug
>
> Hello all
>
> Vectorize segfaults for large arrays.
I filed the bug at > > http://projects.scipy.org/scipy/numpy/ticket/52 > > The offending code is
>
> import numpy as N
> x = N.linspace(-3,2,10000)
> y = N.vectorize(lambda x: x)
>
> # Segfaults here
> y(x)
>
> Regards
> Stéfan
From fullung at gmail.com Fri Apr 14 08:18:02 2006 From: fullung at gmail.com (Albert Strasheim) Date: Fri Apr 14 08:18:02 2006 Subject: [Numpy-discussion] numpy.test() segfaults under Solaris 8 In-Reply-To: <443FB11E.5040102@stsci.edu> Message-ID: <006d01c65fd6$85b72450$0502010a@dsp.sun.ac.za> Hello Chris I am seeing this same crash on Windows under IPython with revision 2351 of NumPy from SVN. If you can get a useful stack trace on your platform, you could add some details to this ticket: http://projects.scipy.org/scipy/numpy/ticket/52 Regards, Albert > -----Original Message----- > From: numpy-discussion-admin at lists.sourceforge.net [mailto:numpy- > discussion-admin at lists.sourceforge.net] On Behalf Of Christopher Hanley > Sent: 14 April 2006 16:27 > To: numpy-discussion > Subject: [Numpy-discussion] numpy.test() segfaults under Solaris 8 > > From the daily Solaris 8 regression tests: > check_vectorize > (numpy.lib.tests.test_function_base.test_vectorize)Segmentation Fault > (core dumped) > > This is a clean checkout and build of numpy that is done every morning > on a Solaris 8 system. We are currently using python 2.4.2 on this > machine. The equivalent build and test on a RHE system passed with no > problems. > > Chris From faltet at xot.carabos.com Fri Apr 14 14:36:06 2006 From: faltet at xot.carabos.com (faltet at xot.carabos.com) Date: Fri Apr 14 14:36:06 2006 Subject: [Numpy-discussion] Performance problems with strided arrays in NumPy Message-ID: <20060414213511.GA14355@xot.carabos.com> Hi, I'm seeing some slowness in NumPy when dealing with strided arrays. numarray is dealing better with these situations, so I guess that something could be done in NumPy about this. Below are the situations that I've found up to now (maybe there are others). For the timings, I've used numpy 0.9.7.2278 and numarray 1.5.1. 
It seems that NumPy copy() method is almost 3x slower than in numarray:

In [105]: npcopy=timeit.Timer('b=a.copy()','import numpy as np;a=np.arange(1000000,dtype="Float64")[::10]')
In [106]: npcopy.repeat(3,10)
Out[106]: [0.171913146972656, 0.175906896591186, 0.171195983886718]

In [107]: nacopy=timeit.Timer('b=a.copy()','import numarray as np;a=np.arange(1000000,type="Float64")[::10]')
In [108]: nacopy.repeat(3,10)
Out[108]: [0.065090894699096, 0.0630550384521484, 0.0626609325408935]

However, a copy without strides performs similarly in both packages

In [127]: npcopy2=timeit.Timer('b=a.copy()','import numpy as np;a=np.arange(1000000,dtype="Float64")')
In [128]: npcopy2.repeat(3,10)
Out[128]: [0.24657797813415527, 0.24657106399536133, 0.2464911937713623]

In [129]: nacopy2=timeit.Timer('b=a.copy()','import numarray as np;a=np.arange(1000000,type="Float64")')
In [130]: nacopy2.repeat(3,10)
Out[130]: [0.244544982910156, 0.251885890960693, 0.2419440746307373]

--------------------------------------------

where() seems more than 2x slower in NumPy than in numarray:

In [136]: tnpf=timeit.Timer('np.where(a + b < 10, a, b)','import numpy as np;a=np.arange(100000,dtype="float64");b=a*2')
In [137]: tnpf.repeat(3,10)
Out[137]: [0.225586891174316, 0.22503495216369629, 0.224209785461425]

In [138]: tnaf=timeit.Timer('np.where(a + b < 2, a, b)','import numarray as np;a=np.arange(100000,type="Float64");b=a*2')
In [139]: tnaf.repeat(3,10)
Out[139]: [0.108436822891235, 0.1069340705871582, 0.10654377937316895]

However, for where() without parameters, NumPy performs slightly better than numarray:

In [143]: tnpf2=timeit.Timer('np.where(a + b < 10)','import numpy as np;a=np.arange(100000,dtype="float64");b=a*2')
In [144]: tnpf2.repeat(3,10)
Out[144]: [0.0759999752044677, 0.0731539726257324, 0.073034048080444336]

In [145]: tnaf2=timeit.Timer('np.where(a + b < 2)','import numarray as np;a=np.arange(100000,type="Float64");b=a*2')
In [146]: tnaf2.repeat(3,10)
Out[146]: [0.0890851020812988, 0.0853078365325927, 0.085799932479858398]

Cheers, Francesc From oliphant at ee.byu.edu Fri Apr 14 14:54:06 2006 From: oliphant at ee.byu.edu (Travis Oliphant) Date: Fri Apr 14 14:54:06 2006 Subject: [Numpy-discussion] Vectorize bug In-Reply-To: <006c01c65fd6$2d043b90$0502010a@dsp.sun.ac.za> References: <006c01c65fd6$2d043b90$0502010a@dsp.sun.ac.za> Message-ID: <444019E8.8000700@ee.byu.edu> Albert Strasheim wrote: >Hello all > >There still seems to be a problem with vectorize (or something else). So far >I've only been able to reproduce the problem by running the test suite 5 >times under IPython on Windows (weird, eh?). Details here: > >http://projects.scipy.org/scipy/numpy/ticket/52 > > I'm pretty sure it's a reference-counting issue. I think I found the problem and it should now be fixed. I'm hoping this will clear up the Solaris issue as well. -Travis From oliphant at ee.byu.edu Fri Apr 14 16:04:02 2006 From: oliphant at ee.byu.edu (Travis Oliphant) Date: Fri Apr 14 16:04:02 2006 Subject: [Numpy-discussion] Performance problems with strided arrays in NumPy In-Reply-To: <20060414213511.GA14355@xot.carabos.com> References: <20060414213511.GA14355@xot.carabos.com> Message-ID: <44402A2A.9050300@ee.byu.edu> faltet at xot.carabos.com wrote: >Hi, > >I'm seeing some slowness in NumPy when dealing with strided arrays. >numarray is dealing better with these situations, so I guess that >something could be done in NumPy about this. Below are the situations >that I've found up to now (maybe there are others). 
> For the timings, I've used numpy 0.9.7.2278 and numarray 1.5.1. > > What I've found in experiments like this in the past is that numarray is good at striding in one direction but much worse at striding in another direction for multi-dimensional arrays. Of course my experiments were not complete. That just seemed to be the case. The array-iterator construct handles almost all of these cases. The copy method is a good place to start since it uses that code. -Travis From fullung at gmail.com Fri Apr 14 16:34:06 2006 From: fullung at gmail.com (Albert Strasheim) Date: Fri Apr 14 16:34:06 2006 Subject: [Numpy-discussion] Vectorize bug In-Reply-To: <444019E8.8000700@ee.byu.edu> Message-ID: <00f301c6601b$d340a350$0502010a@dsp.sun.ac.za> Hello Travis I'm still getting the same crash when running via IPython, which is the only way I've been able to reproduce the crash on Windows. Just to confirm:

In [1]: import numpy
In [2]: numpy.__version__
Out[2]: '0.9.7.2356'

The crash now happens in check_large, which is the new name of the test method in question. Cheers, Albert > -----Original Message----- > From: numpy-discussion-admin at lists.sourceforge.net [mailto:numpy- > discussion-admin at lists.sourceforge.net] On Behalf Of Travis Oliphant > Sent: 14 April 2006 23:54 > To: numpy-discussion > Subject: Re: [Numpy-discussion] Vectorize bug > > Albert Strasheim wrote: > > >Hello all > > > >There still seems to be a problem with vectorize (or something else). So > far > >I've only been able to reproduce the problem by running the test suite 5 > >times under IPython on Windows (weird, eh?). Details here: > > > >http://projects.scipy.org/scipy/numpy/ticket/52 > > > > > I'm pretty sure it's a reference-counting issue. I think I found the > problem and it should now be fixed. > > I'm hoping this will clear up the Solaris issue as well. > > -Travis From oliphant at ee.byu.edu Fri Apr 14 16:43:07 2006 From: oliphant at ee.byu.edu (Travis Oliphant) Date: Fri Apr 14 16:43:07 2006 Subject: [Numpy-discussion] Vectorize bug In-Reply-To: <00f301c6601b$d340a350$0502010a@dsp.sun.ac.za> References: <00f301c6601b$d340a350$0502010a@dsp.sun.ac.za> Message-ID: <44403354.2040708@ee.byu.edu> Albert Strasheim wrote: >Hello Travis > >I'm still getting the same crash when running via IPython, which is the only >way I've been able to reproduce the crash on Windows. > >Just to confirm: > >In [1]: import numpy > >In [2]: numpy.__version__ >Out[2]: '0.9.7.2356' > >The crash now happens in check_large, which is the new name of the test >method in question. > > Do you have SciPy installed? Make sure you are not importing an old version of SciPy. I cannot reproduce this problem. -Travis From fullung at gmail.com Fri Apr 14 16:55:04 2006 From: fullung at gmail.com (Albert Strasheim) Date: Fri Apr 14 16:55:04 2006 Subject: [Numpy-discussion] Vectorize bug In-Reply-To: <44403354.2040708@ee.byu.edu> Message-ID: <00fa01c6601e$c7707840$0502010a@dsp.sun.ac.za> Hello I don't have SciPy installed. Is there any way of doing a debug build of the C code so that I can investigate this problem? You say that you cannot reproduce this problem. Are you trying to reproduce it on Linux or on Windows under IPython? I have also been unable to reproduce the crash on Linux, but as we saw earlier, this crash also cropped up on Solaris, without having to run the tests N times. 
Regards, Albert > -----Original Message----- > From: numpy-discussion-admin at lists.sourceforge.net [mailto:numpy- > discussion-admin at lists.sourceforge.net] On Behalf Of Travis Oliphant > Sent: 15 April 2006 01:42 > To: numpy-discussion > Subject: Re: [Numpy-discussion] Vectorize bug > > Albert Strasheim wrote: > > >Hello Travis > > > >I'm still getting the same crash when running via IPython, which is the > only > >way I've been able to reproduce the crash on Windows. > > > >Just to confirm: > > > >In [1]: import numpy > > > >In [2]: numpy.__version__ > >Out[2]: '0.9.7.2356' > > > >The crash now happens in check_large, which is the new name of the test > >method in question. > > > > > Do you have SciPy installed? > > Make sure you are not importing an old version of SciPy. > > I cannot reproduce this problem. > > -Travis From fullung at gmail.com Fri Apr 14 16:58:03 2006 From: fullung at gmail.com (Albert Strasheim) Date: Fri Apr 14 16:58:03 2006 Subject: [Numpy-discussion] Summer of Code 2006 Message-ID: <00fb01c6601f$26e19b10$0502010a@dsp.sun.ac.za> Hello all The Google Summer of Code site for 2006 is up: http://code.google.com/soc/ Maybe the NumPy team can propose a few projects to be funded by this program. Personally, I'd be interested in working on the build system, especially on Windows, and/or extending the test suite. Regards, Albert From fullung at gmail.com Fri Apr 14 17:19:05 2006 From: fullung at gmail.com (Albert Strasheim) Date: Fri Apr 14 17:19:05 2006 Subject: [Numpy-discussion] Vectorize bug In-Reply-To: <00fa01c6601e$c7707840$0502010a@dsp.sun.ac.za> Message-ID: <00fc01c66022$1b51fb70$0502010a@dsp.sun.ac.za> Hello all I think Valgrind might be very useful in tracking down this bug. http://valgrind.org/ Example usage:

~/bin/valgrind \
-v --error-limit=no --leak-check=full \
python -c 'import numpy; numpy.test()'

Valgrind emits many warnings for things going on inside Python on my Fedora Core 4 system, but there are also a lot of interesting things going on in the NumPy code. 
Some warnings that someone might want to look at:

==26750== Use of uninitialised value of size 4
==26750==    at 0x453D4B1: DOUBLE_to_OBJECT (arraytypes.inc:4470)
==26750==    by 0x46AB3F3: PyUFunc_GenericFunction (ufuncobject.c:1566)
==26750==    by 0x46ABE9F: ufunc_generic_call (ufuncobject.c:2653)

==26750== Conditional jump or move depends on uninitialised value(s)
==26750==    at 0x4556055: PyArray_Newshape (multiarraymodule.c:524)
==26750==    by 0x45568F4: PyArray_Reshape (multiarraymodule.c:369)
==26750==    by 0x4556931: array_shape_set (arrayobject.c:4642)

==26750== Address 0x41D2010 is 392 bytes inside a block of size 1,648 free'd
==26750==    at 0x4004F6B: free (vg_replace_malloc.c:235)
==26750==    by 0x46A53C3: ufuncloop_dealloc (ufuncobject.c:1280)
==26750==    by 0x46AAD60: PyUFunc_GenericFunction (ufuncobject.c:1656)
==26750==    by 0x46ABE9F: ufunc_generic_call (ufuncobject.c:2653)

==26750== Conditional jump or move depends on uninitialised value(s)
==26750==    at 0x454EE52: PyArray_NewFromDescr (arrayobject.c:4119)
==26750==    by 0x4550919: PyArray_GetField (arraymethods.c:265)
==26750==    by 0x456C05A: array_subscript (arrayobject.c:2010)
==26750==    by 0x456D606: array_subscript_nice (arrayobject.c:2250)

==26750== Conditional jump or move depends on uninitialised value(s)
==26750==    at 0x455ED1D: PyArray_MapIterReset (arrayobject.c:7788)
==26750==    by 0x456D087: array_ass_sub (arrayobject.c:1812)

A possible memory leak:

==26750== 6,051 (1,120 direct, 4,931 indirect) bytes in 28 blocks are definitely lost in loss record 35 of 55
==26750==    at 0x400444E: malloc (vg_replace_malloc.c:149)
==26750==    by 0x45442D8: array_alloc (arrayobject.c:5332)
==26750==    by 0x454F19D: PyArray_NewFromDescr (arrayobject.c:4155)
==26750==    by 0x46A61E4: construct_loop (ufuncobject.c:1000)
==26750==    by 0x46AAD09: PyUFunc_GenericFunction (ufuncobject.c:1401)
==26750==    by 0x46ABE9F: ufunc_generic_call (ufuncobject.c:2653)
==26750==    by 0x454243B: PyArray_GenericBinaryFunction (arrayobject.c:2593)
==26750==    by 0x456DA2C: PyArray_Round (multiarraymodule.c:291)

The following error is generated when the test segfaults:

==26750== Process terminating with default action of signal 11 (SIGSEGV)
==26750==  Access not within mapped region at address 0x10FFFF
==26750==    at 0x453D4B1: DOUBLE_to_OBJECT (arraytypes.inc:4470)
==26750==    by 0x46AB3F3: PyUFunc_GenericFunction (ufuncobject.c:1566)
==26750==    by 0x46ABE9F: ufunc_generic_call (ufuncobject.c:2653)

Regards, Albert > -----Original Message----- > From: numpy-discussion-admin at lists.sourceforge.net [mailto:numpy- > discussion-admin at lists.sourceforge.net] On Behalf Of Albert Strasheim > Sent: 15 April 2006 01:55 > To: 'numpy-discussion' > Subject: RE: [Numpy-discussion] Vectorize bug > > Hello > > I don't have SciPy installed. Is there any way of doing a debug build of > the > C code so that I can investigate this problem? > > You say that you cannot reproduce this problem. Are you trying to > reproduce > it on Linux or on Windows under IPython? I have also been unable to > reproduce the crash on Linux, but as we saw earlier, this crash also > cropped
> > Regards, > > Albert > > > -----Original Message----- > > From: numpy-discussion-admin at lists.sourceforge.net [mailto:numpy- > > discussion-admin at lists.sourceforge.net] On Behalf Of Travis Oliphant > > Sent: 15 April 2006 01:42 > > To: numpy-discussion > > Subject: Re: [Numpy-discussion] Vectorize bug > > > > Albert Strasheim wrote: > > > > >Hello Travis > > > > > >I'm still getting the same crash when running via IPython, which is the > > only > > >way I've been able to reproduce the crash on Windows. > > > > > >Just to confirm: > > > > > >In [1]: import numpy > > > > > >In [2]: numpy.__version__ > > >Out[2]: '0.9.7.2356' > > > > > >The crash now happens in check_large, which is the new name of the test > > >method in question. > > > > > > > > Do you have SciPy installed? > > > > Make sure you are not importing an old version of SciPy. > > > > I cannot reproduce this problem. > > > > -Travis > > > > ------------------------------------------------------- > This SF.Net email is sponsored by xPML, a groundbreaking scripting > language > that extends applications into web and mobile media. Attend the live > webcast > and join the prime developer group breaking into this new coding > territory! > http://sel.as-us.falkag.net/sel?cmd=lnk&kid=110944&bid=241720&dat=121642 > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/numpy-discussion From oliphant.travis at ieee.org Fri Apr 14 18:20:03 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Fri Apr 14 18:20:03 2006 Subject: [Numpy-discussion] Vectorize bug In-Reply-To: <00fc01c66022$1b51fb70$0502010a@dsp.sun.ac.za> References: <00fc01c66022$1b51fb70$0502010a@dsp.sun.ac.za> Message-ID: <44404A18.1070202@ieee.org> Albert Strasheim wrote: > Hello all > > I think Valgrind might be very useful in tracking down this bug. > > http://valgrind.org/ > It's a good suggestion. I've run the code through Valgrind, several times before releasing the first version of NumPy. I tracked down many memory leaks that way already. There may be errors that have creeped in, but Valgrind does not help with reference counting errors which this may be. But, I need to be able to reproduce the problem to have any hope of finding it. -Travis From oliphant.travis at ieee.org Fri Apr 14 18:21:09 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Fri Apr 14 18:21:09 2006 Subject: [Numpy-discussion] Vectorize bug In-Reply-To: <00fa01c6601e$c7707840$0502010a@dsp.sun.ac.za> References: <00fa01c6601e$c7707840$0502010a@dsp.sun.ac.za> Message-ID: <44404A5B.5010802@ieee.org> Albert Strasheim wrote: > Hello > > I don't have SciPy installed. Is there any way of doing a debug build of the > C code so that I can investigate this problem? > > You say that you cannot reproduce this problem. Are you trying to reproduce > it on Linux or on Windows under IPython? I have also been unable to > reproduce the crash on Linux, but as we saw earlier, this crash also cropped > up on Solaris, without having to run the tests N times. > > I've tried under Linux with IPython and cannot reproduce the error. I've run numpy.test() 100 times with no error. I'm not sure if the Solaris crash is fixed or not yet after the recent changes to SVN. There may be more than one bug here... 
-Travis From oliphant.travis at ieee.org Fri Apr 14 18:47:01 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Fri Apr 14 18:47:01 2006 Subject: [Numpy-discussion] Vectorize bug In-Reply-To: <00fc01c66022$1b51fb70$0502010a@dsp.sun.ac.za> References: <00fc01c66022$1b51fb70$0502010a@dsp.sun.ac.za> Message-ID: <44405068.203@ieee.org> Albert Strasheim wrote: > Hello all > > I think Valgrind might be very useful in tracking down this bug. > > http://valgrind.org/ > > Example usage: > > ~/bin/valgrind \ > -v --error-limit=no --leak-check=full \ > python -c 'import numpy; numpy.test()' > > Valgrind emits many warnings for things going on inside Python on my Fedora > Core 4 system, but there are also a lot of interesting things going on in the > NumPy code. > > Some warnings that someone might want to look at: > > ==26750== Use of uninitialised value of size 4 > ==26750== at 0x453D4B1: DOUBLE_to_OBJECT (arraytypes.inc:4470) > ==26750== by 0x46AB3F3: PyUFunc_GenericFunction (ufuncobject.c:1566) > ==26750== by 0x46ABE9F: ufunc_generic_call (ufuncobject.c:2653) > I think this may be the culprit. The buffer was not being initialized to NULL and so DECREF was being called on whatever was there. This can produce strange results indeed depending on the environment. I've initialized the buffer now for loops involving OBJECTs (this same error has happened a couple of times as it's one of the big ones for object arrays). I thought I fixed all places where it might occur, but apparently not... Perhaps you could try the code again. From oliphant.travis at ieee.org Fri Apr 14 18:49:03 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Fri Apr 14 18:49:03 2006 Subject: [Numpy-discussion] Vectorize bug In-Reply-To: <00fc01c66022$1b51fb70$0502010a@dsp.sun.ac.za> References: <00fc01c66022$1b51fb70$0502010a@dsp.sun.ac.za> Message-ID: <444050DA.6050809@ieee.org> Albert Strasheim wrote: > Hello all > > I think Valgrind might be very useful in tracking down this bug. > > http://valgrind.org/ > > Example usage: > > ~/bin/valgrind \ > -v --error-limit=no --leak-check=full \ > python -c 'import numpy; numpy.test()' > Here's the command that I run to test a Python script provided at the command line:

valgrind --tool=memcheck --leak-check=yes --error-limit=no -v --log-file=testmem --suppressions=valgrind-python.supp --show-reachable=yes --num-callers=10 python $1

The valgrind-python.supp file will suppress the complaints valgrind emits for Python. -Travis From robert.kern at gmail.com Fri Apr 14 22:21:00 2006 From: robert.kern at gmail.com (Robert Kern) Date: Fri Apr 14 22:21:00 2006 Subject: [Numpy-discussion] Re: Summer of Code 2006 In-Reply-To: <00fb01c6601f$26e19b10$0502010a@dsp.sun.ac.za> References: <00fb01c6601f$26e19b10$0502010a@dsp.sun.ac.za> Message-ID: Albert Strasheim wrote: > Hello all > > The Google Summer of Code site for 2006 is up: > > http://code.google.com/soc/ > > Maybe the NumPy team can propose a few projects to be funded by this > program. Personally, I'd be interested in working on the build system, > especially on Windows, and/or extending the test suite. What work do you think needs to be done on the build system? (I'm not contending the point; I'm just curious.) -- Robert Kern robert.kern at gmail.com "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." 
-- Umberto Eco From fullung at gmail.com Sat Apr 15 02:26:04 2006 From: fullung at gmail.com (Albert Strasheim) Date: Sat Apr 15 02:26:04 2006 Subject: [Numpy-discussion] Re: Summer of Code 2006 In-Reply-To: Message-ID: <013501c6606e$86888200$0502010a@dsp.sun.ac.za> Hello all > -----Original Message----- > From: numpy-discussion-admin at lists.sourceforge.net [mailto:numpy- > discussion-admin at lists.sourceforge.net] On Behalf Of Robert Kern > Sent: 15 April 2006 07:20 > To: numpy-discussion at lists.sourceforge.net > Subject: [Numpy-discussion] Re: Summer of Code 2006 > > Albert Strasheim wrote: > > Hello all > > > > The Google Summer of Code site for 2006 is up: > > > > http://code.google.com/soc/ > > > > Maybe the NumPy team can propose a few projects to be funded by this > > program. Personally, I'd be interested in working on the build system, > > especially on Windows, and/or extending the test suite. > > What work do you think needs to be done on the build system? (I'm not > contending the point; I'm just curious.) Let me start by saying that the build system works fine for what I think is the default case, i.e. building NumPy on Linux with preinstalled LAPACK and BLAS. However, as soon as you vary any of those parameters, things get interesting. I've spent the past couple of days trying to build NumPy on Windows with ATLAS and CLAPACK with MinGW and Visual Studio .NET 2003 and VS 8. I don't know if it's just me, but this seems to be very hard. This could probably be partly attributed to the build systems of these libraries and to the lack of documentation, but I've also run into problems with NumPy build scripts. For example, the inclusion of the gcc library in the list of libraries when building Fortran code with MinGW causes the build to break. Also, building FLAPACK from source causes the build to fail (too many open files). While these errors on their own aren't particularly serious, I think it would be helpful to set up an automated system to check that builds of the various configurations NumPy supports can actually be done. There are probably a few million ways to build NumPy, but it would be nice if we could make sure that the N most common configurations always work, and provide documentation for "trying this at home." I also think it would be useful to set up a system that performs regular builds of the latest revision from the SVN repository. I think anyone attempting this is going to run into a few issues with the build scripts, especially when trying to build on multiple platforms. Things I would like to get right, which I think are much harder than they need to be (feel free to disagree): - Windows builds in general - Visual Studio .NET 2003 builds - Visual C++ Toolkit 2003 builds - Visual Studio 2005 builds - Builds with ATLAS and CLAPACK The reason I'm interested in the Microsoft compilers is that they have many features to help us make sure that the code is correct, both at compile time and at run time. Any comments? Anybody building on Windows that finds the process to be completely painless? Regards, Albert From fullung at gmail.com Sat Apr 15 02:42:06 2006 From: fullung at gmail.com (Albert Strasheim) Date: Sat Apr 15 02:42:06 2006 Subject: [Numpy-discussion] Vectorize bug In-Reply-To: <44404A5B.5010802@ieee.org> Message-ID: <013601c66070$d2377010$0502010a@dsp.sun.ac.za> Hello all The crash I was seeing seems to be fixed in revision 2358. 
Regards, Albert > -----Original Message----- > From: numpy-discussion-admin at lists.sourceforge.net [mailto:numpy- > discussion-admin at lists.sourceforge.net] On Behalf Of Travis Oliphant > Sent: 15 April 2006 03:20 > To: numpy-discussion > Subject: Re: [Numpy-discussion] Vectorize bug > > Albert Strasheim wrote: > > Hello > > > > I don't have SciPy installed. Is there any way of doing a debug build of > the > > C code so that I can investigate this problem? > > > > You say that you cannot reproduce this problem. Are you trying to > reproduce > > it on Linux or on Windows under IPython? I have also been unable to > > reproduce the crash on Linux, but as we saw earlier, this crash also > cropped > > up on Solaris, without having to run the tests N times. > > > > > I've tried under Linux with IPython and cannot reproduce the error. > I've run numpy.test() 100 times with no error. > > I'm not sure if the Solaris crash is fixed or not yet after the recent > changes to SVN. There may be more than one bug here... > > -Travis From fullung at gmail.com Sat Apr 15 04:59:03 2006 From: fullung at gmail.com (Albert Strasheim) Date: Sat Apr 15 04:59:03 2006 Subject: [Numpy-discussion] bool_ leaks memory Message-ID: <014701c66083$e3ca5c30$0502010a@dsp.sun.ac.za> Hello all According to Valgrind 3.1.1, the following code leaks memory:

from numpy import bool_
bool_(1)

Valgrind says:

==32531== 82 (80 direct, 2 indirect) bytes in 2 blocks are definitely lost in loss record 7 of 25
==32531==    at 0x400444E: malloc (vg_replace_malloc.c:149)
==32531==    by 0x45442E8: array_alloc (arrayobject.c:5330)
==32531==    by 0x454F18D: PyArray_NewFromDescr (arrayobject.c:4153)
==32531==    by 0x4551844: Array_FromScalar (arrayobject.c:5768)
==32531==    by 0x45602B7: PyArray_FromAny (arrayobject.c:6630)
==32531==    by 0x4570065: bool_arrtype_new (scalartypes.inc:2855)
==32531==    by 0x2FBF6E: (within /usr/lib/libpython2.4.so.1.0)
==32531==    by 0x2C53B3: PyObject_Call (in /usr/lib/libpython2.4.so.1.0)

The second leak that Valgrind reports is from this code in ma.py:

MaskType = bool_
nomask = MaskType(0)

Tested with NumPy 0.9.7.2358. Trac ticket at http://projects.scipy.org/scipy/numpy/ticket/60 Regards, Albert From faltet at xot.carabos.com Sat Apr 15 05:06:01 2006 From: faltet at xot.carabos.com (faltet at xot.carabos.com) Date: Sat Apr 15 05:06:01 2006 Subject: [Numpy-discussion] Performance problems with strided arrays in NumPy In-Reply-To: <44402A2A.9050300@ee.byu.edu> References: <20060414213511.GA14355@xot.carabos.com> <44402A2A.9050300@ee.byu.edu> Message-ID: <20060415120451.GA15123@xot.carabos.com> On Fri, Apr 14, 2006 at 05:03:06PM -0600, Travis Oliphant wrote: > What I've found in experiments like this in the past is that numarray is > good at striding in one direction but much worse at striding in another > direction for multi-dimensional arrays. Of course my experiments were > not complete. That just seemed to be the case. > > The array-iterator construct handles almost all of these cases. The > copy method is a good place to start since it uses that code. I'm not sure this is directly related with striding. 
Look at this:

In [5]: npcopy=timeit.Timer('a=a.copy()','import numpy as np; a=np.arange(1000000,dtype="Float64")[::10]')
In [6]: npcopy.repeat(3,10)
Out[6]: [0.061118125915527344, 0.061014175415039062, 0.063937187194824219]

In [7]: npcopy2=timeit.Timer('b=a.copy()','import numpy as np; a=np.arange(1000000,dtype="Float64")[::10]')
In [8]: npcopy2.repeat(3,10)
Out[8]: [0.29984092712402344, 0.29889702796936035, 0.29834103584289551]

You see? assigning to a new variable makes the copy go 5x slower! numarray is also affected by this, but not as much:

In [9]: nacopy=timeit.Timer('a=a.copy()','import numarray as np; a=np.arange(1000000,type="Float64")[::10]')
In [10]: nacopy.repeat(3,10)
Out[10]: [0.039573907852172852, 0.037765979766845703, 0.038245916366577148]

In [11]: nacopy2=timeit.Timer('b=a.copy()','import numarray as np; a=np.arange(1000000,type="Float64")[::10]')
In [12]: nacopy2.repeat(3,10)
Out[12]: [0.073218107223510742, 0.07414698600769043, 0.072872161865234375]

i.e. just a 2x slowdown. I don't understand this effect: in both cases we are doing a plain copy, no? I'm missing something, but not sure what it is. Regards, -- Francesc From fullung at gmail.com Sat Apr 15 06:38:02 2006 From: fullung at gmail.com (Albert Strasheim) Date: Sat Apr 15 06:38:02 2006 Subject: [Numpy-discussion] Vectorize bug In-Reply-To: <444050DA.6050809@ieee.org> Message-ID: <014e01c66091$b6b6b730$0502010a@dsp.sun.ac.za> Hello all I did some more Valgrinding and reduced all the warnings still produced when running NumPy revision 0.9.7.2358 to a few lines of code. The relevant Trac tickets:

http://projects.scipy.org/scipy/numpy/ticket/60
http://projects.scipy.org/scipy/numpy/ticket/61
http://projects.scipy.org/scipy/numpy/ticket/62
http://projects.scipy.org/scipy/numpy/ticket/64
http://projects.scipy.org/scipy/numpy/ticket/65

If anybody else wants to play with Valgrind, you can find the Valgrind suppressions for Python 2.4 here: http://svn.python.org/projects/python/branches/release24-maint/Misc/valgrind-python.supp See also http://svn.python.org/projects/python/branches/release24-maint/Misc/README.valgrind Regards, Albert > -----Original Message----- > From: numpy-discussion-admin at lists.sourceforge.net [mailto:numpy- > discussion-admin at lists.sourceforge.net] On Behalf Of Travis Oliphant > Sent: 15 April 2006 03:48 > To: numpy-discussion > Subject: Re: [Numpy-discussion] Vectorize bug > > Albert Strasheim wrote: > > Hello all > > > > I think Valgrind might be very useful in tracking down this bug. > > > > http://valgrind.org/ > > > > Example usage: > > > > ~/bin/valgrind \ > > -v --error-limit=no --leak-check=full \ > > python -c 'import numpy; numpy.test()' > > > > Here's the command that I run to test a Python script provided at the > command line: > > valgrind --tool=memcheck --leak-check=yes --error-limit=no -v > --log-file=testmem --suppressions=valgrind-python.supp > --show-reachable=yes --num-callers=10 python $1 > > > The valgrind-python.supp file will suppress the complaints valgrind > emits for Python. > > > -Travis From cjw at sympatico.ca Sat Apr 15 08:01:03 2006 From: cjw at sympatico.ca (Colin J. 
Williams) Date: Sat Apr 15 08:01:03 2006 Subject: [Numpy-discussion] Summer of Code 2006 In-Reply-To: <00fb01c6601f$26e19b10$0502010a@dsp.sun.ac.za> References: <00fb01c6601f$26e19b10$0502010a@dsp.sun.ac.za> Message-ID: <44410A87.70205@sympatico.ca> Albert Strasheim wrote: >Hello all > >The Google Summer of Code site for 2006 is up: > >http://code.google.com/soc/ > >Maybe the NumPy team can propose a few projects to be funded by this >program. Personally, I'd be interested in working on the build system, >especially on Windows, and/or extending the test suite. > >Regards, > >Albert > > > > I believe that the Python Software Foundation (http://www.python.org/psf/grants/) offers funding from time to time. Colin W. From Saqib.Sohail at colorado.edu Sat Apr 15 08:51:02 2006 From: Saqib.Sohail at colorado.edu (Saqib bin Sohail) Date: Sat Apr 15 08:51:02 2006 Subject: [Numpy-discussion] Code Question Message-ID: <1145116214.444116365d326@webmail.colorado.edu> Hi guys I have never used python, but I wanted to compute FFT of audio files, I came upon a page which had python code, so I installed Numpy but after beating the bush for a few days, I have finally come in here to ask. After taking the FFT I want to output it to a file and then use gnuplot to plot it. When I installed NumPy, and ran the tests, it seemed that all passed without a problem. My input is a .dat file converted from .wav file by sox. Here is the code which obviously doesn't work because it seems that changes have occurred since this code was written. (not my code, just from some website where a guy had written on how to do things which i require)

import Numeric
import FFT
out_array=Numeric.array(out)
out_fft=FFT.fft(out)

offt=open('outfile_fft.dat','w')
for x in range(len(out_fft)/2):
    offt.write('%f %f\n'%(1.0*x/wtime,abs(out_fft[x].real)))

I do the following at the python prompt

import numarray
myFile = open('test.dat', 'r')
my_array = numarray.arra(myFile)

/* at this stage I wanted to see if it was correctly read */

print myArray
[1632837691 1701605485 1952535072 ..., 538976288 538976288 168632368]

it seems that these values do not correspond to the values in the file (but I guess the array is considering these as ints when in fact these are floats) anyway the problem starts when i try to do fft, because I can't seem to find module or how to invoke it, the second problem is writing to the file, that code obviously doesn't work, and in my search through various documentations, i found arrayrange() but couldn't make it to work, call me stupid, but despite going through several examples, i haven't been able to make the for loop worked in any case, it would be very kind of someone if he could at least tell me what i am doing wrong and reply a simple example so that I can modify my code or at least be able to understand . Thanks -- Saqib bin Sohail PhD ECE University of Colorado at Boulder Res: (303) 786 0636 http://ucsu.colorado.edu/~sohail/index.html From ndarray at mac.com Sat Apr 15 09:10:07 2006 From: ndarray at mac.com (Sasha) Date: Sat Apr 15 09:10:07 2006 Subject: [Numpy-discussion] Vectorize bug In-Reply-To: <44404A18.1070202@ieee.org> References: <00fc01c66022$1b51fb70$0502010a@dsp.sun.ac.za> <44404A18.1070202@ieee.org> Message-ID: On 4/14/06, Travis Oliphant wrote: > ... > There may be errors that have crept in, but Valgrind does not help > with reference counting errors, which this may be. > ... Valgrind is a little bit more helpful if python is compiled using the --without-pymalloc config option. In addition to valgrind, memory problems can be exposed by using the --with-pydebug option. 
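(As a minimal sketch of the latter: sys.gettotalrefcount only exists in a --with-pydebug build, and exercise_suspect_code below is a hypothetical stand-in for whatever operation is suspected of leaking references:)

import sys

def exercise_suspect_code():
    # hypothetical stand-in for the suspect operation,
    # e.g. repeatedly calling a vectorized function
    pass

before = sys.gettotalrefcount()
for i in range(1000):
    exercise_suspect_code()
after = sys.gettotalrefcount()
# a total that keeps growing with the iteration count points at a leak
print after - before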
From faltet at xot.carabos.com Sat Apr 15 10:29:01 2006 From: faltet at xot.carabos.com (faltet at xot.carabos.com) Date: Sat Apr 15 10:29:01 2006 Subject: [Numpy-discussion] Performance problems with strided arrays in NumPy In-Reply-To: <44410972.4090502@cox.net> References: <20060414213511.GA14355@xot.carabos.com> <44402A2A.9050300@ee.byu.edu> <20060415120451.GA15123@xot.carabos.com> <44410972.4090502@cox.net> Message-ID: <20060415172755.GA15274@xot.carabos.com> On Sat, Apr 15, 2006 at 07:55:46AM -0700, Tim Hochberg wrote: > >I'm not sure this is directly related with striding. Look at this: > > > >In [5]: npcopy=timeit.Timer('a=a.copy()','import numpy as np; > >a=np.arange(1000000,dtype="Float64")[::10]') > > > >In [6]: npcopy.repeat(3,10) > >Out[6]: [0.061118125915527344, 0.061014175415039062, > >0.063937187194824219] > > > >In [7]: npcopy2=timeit.Timer('b=a.copy()','import numpy as np; > >a=np.arange(1000000,dtype="Float64")[::10]') > > > >In [8]: npcopy2.repeat(3,10) > >Out[8]: [0.29984092712402344, 0.29889702796936035, 0.29834103584289551] > > > >You see? assigning to a new variable makes the copy go 5x > >slower! > > > You are being tricked! In the first case, the array is discontiguous for > the first copy but for every subsequent copy is contiguous since you > replace 'a'. In the second case, the array is discontiguous for every copy Oh, yes! Thanks for noting this! So in order to compare apples with apples, the difference between numarray and numpy in case of strided copies is:

In [87]: npcopy_stride=timeit.Timer('b=a.copy()','import numpy as np; a=np.arange(1000000,dtype="Float64")[::10]')
In [88]: npcopy_stride.repeat(3,10)
Out[88]: [0.30013298988342285, 0.29976487159729004, 0.29945492744445801]

In [89]: nacopy_stride=timeit.Timer('b=a.copy()','import numarray as np; a=np.arange(1000000,type="Float64")[::10]')
In [90]: nacopy_stride.repeat(3,10)
Out[90]: [0.07545709609985351, 0.0731458663940429, 0.073173046112060547]

so numpy is approximately 4x slower than numarray. Cheers, Francesc From oliphant.travis at ieee.org Sat Apr 15 10:51:18 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Sat Apr 15 10:51:18 2006 Subject: [Numpy-discussion] Re: Summer of Code 2006 In-Reply-To: <013501c6606e$86888200$0502010a@dsp.sun.ac.za> References: <013501c6606e$86888200$0502010a@dsp.sun.ac.za> Message-ID: <44413251.3080505@ieee.org> Albert Strasheim wrote: > Hello all > > > Let me start by saying that the build system works fine for what I think is > the default case, i.e. building NumPy on Linux with preinstalled LAPACK and > BLAS. However, as soon as you vary any of those parameters, things get > interesting. > It also builds fine with mingw and pre-installed ATLAS (I do it all the time). It also builds fine with no-installed ATLAS (or LAPACK or BLAS) with mingw32 and Linux. It also builds on Mac OS X. It also builds on Solaris, AIX, and Cygwin. Work also went in recently to make sure it builds with a Visual Studio Compiler (the one Tim Hochberg was using...) So, I think it's a bit unfair to say that varying from only a Linux build causes "things to get interesting". Definitely there are configurations that can require a specialized site.cfg file and it can be difficult if you build with a compiler that was not used to build Python itself. But, it's not a one-platform build system. I just want that to be clear. 
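(One quick way to see what a given machine's build would actually pick up, sketched here with numpy.distutils directly; 'atlas' and 'lapack' are standard system_info section names, and an empty dict means the library was not detected:)

from numpy.distutils.system_info import get_info

# an empty dict here means the corresponding library was not found,
# so this shows at a glance which optimized libraries a build would use
print get_info('atlas')
print get_info('lapack')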
Documentation on the site.cfg file could be more prominent, of course, and this was aided recently by the addition of an example file to the source tree. The expert on the build system is Pearu Peterson. He has been very responsive to suggested fixes and problems that people have experienced. Robert Kern, David Cooke, and I also have some familiarity with the build system enough to assist from time to time. All help is greatly appreciated, however, as I know you can come up with configurations that do cause things to "get interesting." The more configurations that we get tested and working, the better off we will be. The more people who understand the build system well enough to help fix it, the better off we'll be as well. So, I definitely don't want to discourage any ideas you have on improving the build system. Thanks for being willing to dive in and help. -Travis > I've spent the past couple of days trying to build NumPy on Windows with > ATLAS and CLAPACK with MinGW and Visual Studio .NET 2003 and VS 8. I don't > know if it's just me, but this seems to be very hard. This could probably be > partly attributed to the build systems of these libraries and to the lack of > documentation, but I've also run into problems with NumPy build scripts. > > For example, the inclusion of the gcc library in the list of libraries when > building Fortran code with MinGW causes the build to break. Also, building > FLAPACK from source causes the build to fail (too many open files). > > While these errors on their own aren't particularly serious, I think it > would be helpful to set up an automated system to check that builds of the > various configurations NumPy supports can actually be done. There are > probably a few million ways to build NumPy, but it would be nice if we could > make sure that the N most common configurations always work, and provide > documentation for "trying this at home." > > I also think it would be useful to set up a system that performs regular > builds of the latest revision from the SVN repository. I think anyone > attempting this is going to run into a few issues with the build scripts, > especially when trying to build on multiple platforms. > > Things I would like to get right, which I think are much harder than they > need to be (feel free to disagree): > > - Windows builds in general > - Visual Studio .NET 2003 builds > - Visual C++ Toolkit 2003 builds > - Visual Studio 2005 builds > - Builds with ATLAS and CLAPACK > > The reason I'm interested in the Microsoft compilers is that they have many > features to help us make sure that the code is correct, both at compile time > and at run time. > > Any comments? Anybody building on Windows that finds the process to be > completely painless? > > Regards, > > Albert
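(For reference, the general shape of a site.cfg stanza for pointing the build at a non-standard ATLAS; the paths and library names below are purely illustrative, not a recommended configuration:)

[atlas]
library_dirs = /usr/local/lib/atlas
include_dirs = /usr/local/include/atlas
atlas_libs = lapack, f77blas, cblas, atlas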
From oliphant.travis at ieee.org Sat Apr 15 10:55:02 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Sat Apr 15 10:55:02 2006 Subject: [Numpy-discussion] Performance problems with strided arrays in NumPy In-Reply-To: <20060415172755.GA15274@xot.carabos.com> References: <20060414213511.GA14355@xot.carabos.com> <44402A2A.9050300@ee.byu.edu> <20060415120451.GA15123@xot.carabos.com> <44410972.4090502@cox.net> <20060415172755.GA15274@xot.carabos.com> Message-ID: <4441333D.50906@ieee.org> faltet at xot.carabos.com wrote: > On Sat, Apr 15, 2006 at 07:55:46AM -0700, Tim Hochberg wrote: > >>> I'm not sure this is directly related with striding. Look at this: >>> >>> In [5]: npcopy=timeit.Timer('a=a.copy()','import numpy as np; >>> a=np.arange(1000000,dtype="Float64")[::10]') >>> >>> In [6]: npcopy.repeat(3,10) >>> Out[6]: [0.061118125915527344, 0.061014175415039062, >>> 0.063937187194824219] >>> >>> In [7]: npcopy2=timeit.Timer('b=a.copy()','import numpy as np; >>> a=np.arange(1000000,dtype="Float64")[::10]') >>> >>> In [8]: npcopy2.repeat(3,10) >>> Out[8]: [0.29984092712402344, 0.29889702796936035, 0.29834103584289551] >>> >>> You see? assigning to a new variable makes the copy go 5x >>> slower! >>> >>> >> You are being tricked! In the first case, the array is discontiguous for >> the first copy but for every subsequent copy is contiguous since you >> replace 'a'. In the second case, the array is discontiguous for every copy >> > > Oh, yes! Thanks for noting this! So in order to compare apples with > apples, the difference between numarray and numpy in case of strided > copies is: > > In [87]: npcopy_stride=timeit.Timer('b=a.copy()','import numpy as np; > a=np.arange(1000000,dtype="Float64")[::10]') > > In [88]: npcopy_stride.repeat(3,10) > Out[88]: [0.30013298988342285, 0.29976487159729004, 0.29945492744445801] > > In [89]: nacopy_stride=timeit.Timer('b=a.copy()','import numarray as np; > a=np.arange(1000000,type="Float64")[::10]') > > In [90]: nacopy_stride.repeat(3,10) > Out[90]: [0.07545709609985351, 0.0731458663940429, 0.073173046112060547] > > so numpy is approximately 4x slower than numarray. > > This also seems to vary from compiler to compiler. On my system it's not quite so different (about 1.5x slower). I'm wondering what the effect of an inlined memmove is. Essentially numarray has an inlined for-loop to copy bytes while NumPy calls memmove. I'll try that out and see... -Travis From ryanlists at gmail.com Sat Apr 15 10:58:17 2006 From: ryanlists at gmail.com (Ryan Krauss) Date: Sat Apr 15 10:58:17 2006 Subject: [Numpy-discussion] Re: Summer of Code 2006 In-Reply-To: <44413251.3080505@ieee.org> References: <013501c6606e$86888200$0502010a@dsp.sun.ac.za> <44413251.3080505@ieee.org> Message-ID: As I understand the summer of code, we can basically get a full-time student (who gets paid $4500 for the summer) at no cost to us, as long as someone is willing to coach and define the project. (NumPy/SciPy would actually get $500 from Google). So, I think it would be great if we could define some projects and see what happens. (I am trying to graduate this summer, so maybe I should shut up if I can't help much). 
Ryan On 4/15/06, Travis Oliphant wrote: > Albert Strasheim wrote: > > Let me start by saying that the build system works fine for what I think is > > the default case, i.e. building NumPy on Linux with preinstalled LAPACK and > > BLAS. However, as soon as you vary any of those parameters, things get > > interesting. > > > It also builds fine with mingw and pre-installed ATLAS (I do it all the > time).
From robert.kern at gmail.com Sat Apr 15 11:31:01 2006 From: robert.kern at gmail.com (Robert Kern) Date: Sat Apr 15 11:31:01 2006 Subject: [Numpy-discussion] Re: Summer of Code 2006 In-Reply-To: <44410A87.70205@sympatico.ca> References: <00fb01c6601f$26e19b10$0502010a@dsp.sun.ac.za> <44410A87.70205@sympatico.ca> Message-ID: Colin J. Williams wrote: > I believe that the Python Software Foundation > (http://www.python.org/psf/grants/) offers funding from time to time. However, it likes to fund new projects, not continuing ones. -- Robert Kern robert.kern at gmail.com "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From oliphant.travis at ieee.org Sat Apr 15 11:35:04 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Sat Apr 15 11:35:04 2006 Subject: [Numpy-discussion] Vectorize bug In-Reply-To: <014e01c66091$b6b6b730$0502010a@dsp.sun.ac.za> References: <014e01c66091$b6b6b730$0502010a@dsp.sun.ac.za> Message-ID: <44413C9B.3080507@ieee.org> Albert Strasheim wrote: > Hello all > > I did some more Valgrinding and reduced all the warnings still produced when > running NumPy revision 0.9.7.2358 to a few lines of code. The relevant Trac > tickets: > > http://projects.scipy.org/scipy/numpy/ticket/60 > http://projects.scipy.org/scipy/numpy/ticket/61 > http://projects.scipy.org/scipy/numpy/ticket/62 > http://projects.scipy.org/scipy/numpy/ticket/64 > http://projects.scipy.org/scipy/numpy/ticket/65 > > This is very useful. Thank you for isolating the code producing the warnings like this. It makes it much easier to debug. 
-Travis From robert.kern at gmail.com Sat Apr 15 12:00:06 2006 From: robert.kern at gmail.com (Robert Kern) Date: Sat Apr 15 12:00:06 2006 Subject: [Numpy-discussion] Re: Code Question In-Reply-To: <1145116214.444116365d326@webmail.colorado.edu> References: <1145116214.444116365d326@webmail.colorado.edu> Message-ID: Saqib bin Sohail wrote: > Hi guys > > I have never used python, but I wanted to compute FFT of audio files, I came > upon a page which had python code, so I installed Numpy but after beating the > bush for a few days, I have finally come in here to ask. After taking the FFT I > want to output it to a file and then use gnuplot to plot it. > When I installed NumPy, and ran the tests, it seemed that all passed without a > problem. My input is a .dat file converted from .wav file by sox. > > Here is the code which obviously doesn't work because it seems that changes > have occurred since this code was written. (not my code, just from some website > where a guy had written on how to do things which i require) Okay, first some history. Originally, the package was named Numeric; occasionally, it was referred to by its nickname NumPy. Some years ago, a group needed features that couldn't be done in the Numeric codebase, so they started a rewrite called numarray. For various reasons that I don't want to get into, another group needed features that couldn't be done in the numarray codebase, so a second rewrite happened and this package is the one that is currently getting the most developer attention. It is called numpy. Since you are a new user, I highly recommend that you use numpy instead of Numeric or numarray. http://numeric.scipy.org/ > import Numeric > import FFT > out_array=Numeric.array(out) > out_fft=FFT.fft(out) > > offt=open('outfile_fft.dat','w') > for x in range(len(out_fft)/2): > offt.write('%f %f\n'%(1.0*x/wtime,abs(out_fft[x].real))) Rewritten for numpy (but untested):

import numpy

# Assuming that the file contains 32-bit floats, and not 64-bit floats
data = numpy.fromfile('test.dat', dtype=numpy.float32)
out_fft = numpy.refft(data)
# Note: refft does the FFT on real data and thus throws away the negative
# frequencies since they are redundant. len(out_fft) != len(data)

# and now I'm confused because the code references variables that weren't
# created anywhere, so I'm going to output the power spectrum
n = len(out_fft)
freqs = numpy.arange(n, dtype=numpy.float32) / len(data)
power = out_fft.real*out_fft.real + out_fft.imag*out_fft.imag
outarray = numpy.column_stack((freqs, power))
assert outarray.shape == (n, 2)

offt = open('outfile_fft.dat', 'w')
try:
    for f, p in outarray:
        offt.write('%f %f\n' % (f, p))
finally:
    offt.close()

> I do the following at the python prompt > > import numarray > myFile = open('test.dat', 'r') > my_array = numarray.arra(myFile) > > /* at this stage I wanted to see if it was correctly read */ > > print myArray > [1632837691 1701605485 1952535072 ..., 538976288 538976288 168632368] > > it seems that these values do not correspond to the values in the file (but I > guess the array is considering these as ints when in fact these are floats) Indeed. There is no way for the array constructor to know the data type in the file unless you tell it. The default type is int. -- Robert Kern robert.kern at gmail.com "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." 
-- Umberto Eco From Saqib.Sohail at colorado.edu Sat Apr 15 13:42:02 2006 From: Saqib.Sohail at colorado.edu (Saqib bin Sohail) Date: Sat Apr 15 13:42:02 2006 Subject: [Numpy-discussion] Code Question In-Reply-To: <06041504462800.00752@rbastian> References: <1145116214.444116365d326@webmail.colorado.edu> <06041504462800.00752@rbastian> Message-ID: <1145133678.44415a6e5d8f7@webmail.colorado.edu> Thanks a lot for your detailed email, unfortunately both of the following imports don't work:

import Gnuplot
import fft as FFT
from numarray import *

I think I need the Gnuplot package, but what I can't understand is why fft is not being imported; do I need to install the NumPy package with special options to install fft? Quoting René Bastian : > On Saturday 15 April 2006 17:50, Saqib bin Sohail wrote: > > Hi guys > > > > I have never used python, but I wanted to compute FFT of audio files, I > > came upon a page which had python code, so I installed Numpy but after > > beating the bush for a few days, I have finally come in here to ask. After > > taking the FFT I want to output it to a file and then use gnuplot to plot > > it. > > With the module Gnuplot.py you can plot arrays
>
> import Gnuplot
>
> g =Gnuplot.Gnuplot()
> g.plot(w) # w is an array
> raw_input("Enter")
> g.reset()
>
> I use numarray
>
> Some code :
> ----------------
>
> import fft as FFT
> from numarray import *
>
> T = arrayrange(0.0, 2*pi, 1.0/1000)
> a = sin(2*pi*440.0*T)
>
> r = FFT.fft(a)
> print r
> g.plot(r)
> raw_input("Enter")
> ....
> r = FFT.inverse_real_fft(a)
> r = FFT.real_fft(a)
> r = FFT.hermite_fft(a)
>
> g.reset()
> ----------------
>
> > When I installed NumPy, and ran the tests, it seemed that all passed without > > a problem. My input is a .dat file converted from .wav file by sox. > > > > Here is the code which obviously doesn't work because it seems that changes > > have occurred since this code was written. (not my code, just from some > > website where a guy had written on how to do things which i require) > > > > import Numeric > > import FFT > > out_array=Numeric.array(out) > > out_fft=FFT.fft(out) > > > > offt=open('outfile_fft.dat','w') > > for x in range(len(out_fft)/2): > > offt.write('%f %f\n'%(1.0*x/wtime,abs(out_fft[x].real))) > > > > I do the following at the python prompt > > > > import numarray > > myFile = open('test.dat', 'r') > > my_array = numarray.arra(myFile) > Read the manual on how to load a file of floats > I think there is a mistake > > /* at this stage I wanted to see if it was correctly read */ > > > > print myArray > > [1632837691 1701605485 1952535072 ..., 538976288 538976288 168632368] > > > > it seems that these values do not correspond to the values in the file (but > > I guess the array is considering these as ints when in fact these are > > floats) > > hmmm ... > > > > anyway the problem starts when i try to do fft, because I can't seem to > > find module or how to invoke it, > > > > the second problem is writing to the file, that code obviously doesn't > > work, and in my search through various documentations, i found arrayrange() > > but couldn't make it to work, call me stupid, but despite going through > > several examples, i haven't been able to make the for loop worked in any > > case, > > > > it would be very kind of someone if he could at least tell me what i am > > doing wrong and reply a simple example so that I can modify my code or at > > least be able to understand . 
> > Thanks
> >
> > --
> > Saqib bin Sohail
> > PhD ECE
> > University of Colorado at Boulder
> > Res: (303) 786 0636
> > http://ucsu.colorado.edu/~sohail/index.html
> >
> > -------------------------------------------------------
>
> --
> René Bastian
> http://pythoneon.musiques-rb.org "Musique en Python"
>

-- 
Saqib bin Sohail
PhD ECE
University of Colorado at Boulder
Res: (303) 786 0636
http://ucsu.colorado.edu/~sohail/index.html

From robert.kern at gmail.com Sun Apr 16 02:37:05 2006
From: robert.kern at gmail.com (Robert Kern)
Date: Sun Apr 16 02:37:05 2006
Subject: [Numpy-discussion] Trac Wikis closed for anonymous edits until further notice
Message-ID: <44421025.9060804@gmail.com>

We've been hit badly by spammers, so I can only presume our Trac sites are now on the traded spam lists. I am going to turn off anonymous edits for now. Ticket creation will probably still be left open for now. Many thanks to David Cooke for quickly removing the spam.

I am looking into ways to allow people to register themselves with the Trac sites so they can edit the Wikis and submit tickets without needing to be added by a project admin.

-- 
Robert Kern
robert.kern at gmail.com

"I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth."
-- Umberto Eco

From a.h.jaffe at gmail.com Sun Apr 16 12:36:01 2006
From: a.h.jaffe at gmail.com (Andrew Jaffe)
Date: Sun Apr 16 12:36:01 2006
Subject: [Numpy-discussion] g95 detection not working
Message-ID: <44429C55.2030500@gmail.com>

Hi all,

at least on my setup (OS X, Python 2.4.1, latest svn of numpy and scipy), config_fc fails to recognize my g95 compiler, which was directly downloaded from http://g95.sourceforge.net/ (and always has failed, I think). This is because the current version string doesn't conform to the regexp pattern; the version string is
"""
G95 (GCC 4.0.3 (g95!) Apr 12 2006)
Copyright (C) 2002-2005 Free Software Foundation, Inc.

G95 comes with NO WARRANTY, to the extent permitted by law.
You may redistribute copies of G95
under the terms of the GNU General Public License.
For more information about these matters, see the file named COPYING
"""

I've attached a patch below, although this identifies the version string with the date of the release, rather than the gcc version; I'm not sure which is the right one to use!

Andrew

--- numpy/distutils/fcompiler/g95.py (revision 2360)
+++ numpy/distutils/fcompiler/g95.py (working copy)
@@ -9,7 +9,7 @@
 class G95FCompiler(FCompiler):
     compiler_type = 'g95'
-    version_pattern = r'G95.*\(experimental\) \(g95!\) (?P<version>.*)\).*'
+    version_pattern = r'G95.*\(g95!\) (?P<version>.*)\).*'
     executables = {
         'version_cmd' : ["g95", "--version"],

From robert.kern at gmail.com Sun Apr 16 12:50:05 2006
From: robert.kern at gmail.com (Robert Kern)
Date: Sun Apr 16 12:50:05 2006
Subject: [Numpy-discussion] Re: g95 detection not working
In-Reply-To: <44429C55.2030500@gmail.com>
References: <44429C55.2030500@gmail.com>
Message-ID:

Andrew Jaffe wrote:
> Hi all,
>
> at least on my setup (OS X, Python 2.4.1, latest svn of numpy and
> scipy), config_fc fails to recognize my g95 compiler, which was directly
> downloaded from http://g95.sourceforge.net/ (and always has failed, I
> think). This is because the current version string doesn't conform to
> the regexp pattern; the version string is
> """
> G95 (GCC 4.0.3 (g95!) Apr 12 2006)
> Copyright (C) 2002-2005 Free Software Foundation, Inc.
>
> G95 comes with NO WARRANTY, to the extent permitted by law.
> You may redistribute copies of G95
> under the terms of the GNU General Public License.
> For more information about these matters, see the file named COPYING
> """
>
> I've attached a patch below, although this identifies the version string
> with the date of the release, rather than the gcc version; I'm not sure
> which is the right one to use!

We need the actual version number; in this case, "4.0.3".

-- 
Robert Kern
robert.kern at gmail.com

"I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth."
-- Umberto Eco

From a.h.jaffe at gmail.com Sun Apr 16 13:53:03 2006
From: a.h.jaffe at gmail.com (Andrew Jaffe)
Date: Sun Apr 16 13:53:03 2006
Subject: [Numpy-discussion] Re: g95 detection not working
In-Reply-To:
References: <44429C55.2030500@gmail.com>
Message-ID: <4442AE89.8080303@gmail.com>

Robert Kern wrote:
> Andrew Jaffe wrote:
>> Hi all,
>>
>> at least on my setup (OS X, Python 2.4.1, latest svn of numpy and
>> scipy), config_fc fails to recognize my g95 compiler, which was directly
>> downloaded from http://g95.sourceforge.net/ (and always has failed, I
>> think). This is because the current version string doesn't conform to
>> the regexp pattern; the version string is
>> """
>> G95 (GCC 4.0.3 (g95!) Apr 12 2006)
>> Copyright (C) 2002-2005 Free Software Foundation, Inc.
>>
>> G95 comes with NO WARRANTY, to the extent permitted by law.
>> You may redistribute copies of G95
>> under the terms of the GNU General Public License.
>> For more information about these matters, see the file named COPYING
>> """
>>
>> I've attached a patch below, although this identifies the version string
>> with the date of the release, rather than the gcc version; I'm not sure
>> which is the right one to use!
>
> We need the actual version number; in this case, "4.0.3".

Thanks -- OK, in that case the following regexp works for me:

version_pattern = r'G95.*\(GCC (?P<version>.*) \(g95!\)'

But are there different versions of the version string?

Also on an unrelated f2py note: is the f2py mailing list being read by the f2py developers? I've posted a question (about the status of F9x "types") without reply...

Yours,

Andrew

From robert.kern at gmail.com Sun Apr 16 13:56:02 2006
From: robert.kern at gmail.com (Robert Kern)
Date: Sun Apr 16 13:56:02 2006
Subject: [Numpy-discussion] Re: g95 detection not working
In-Reply-To: <44429C55.2030500@gmail.com>
References: <44429C55.2030500@gmail.com>
Message-ID:

Andrew Jaffe wrote:
> Hi all,
>
> at least on my setup (OS X, Python 2.4.1, latest svn of numpy and
> scipy), config_fc fails to recognize my g95 compiler, which was directly
> downloaded from http://g95.sourceforge.net/ (and always has failed, I
> think). This is because the current version string doesn't conform to
> the regexp pattern; the version string is
> """
> G95 (GCC 4.0.3 (g95!) Apr 12 2006)
> Copyright (C) 2002-2005 Free Software Foundation, Inc.
>
> G95 comes with NO WARRANTY, to the extent permitted by law.
> You may redistribute copies of G95
> under the terms of the GNU General Public License.
> For more information about these matters, see the file named COPYING
> """
>
> I've attached a patch below, although this identifies the version string
> with the date of the release, rather than the gcc version; I'm not sure
> which is the right one to use!
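For concreteness, both candidate patterns really do pull a field out of the banner quoted above. The following is a standalone re check rather than the distutils machinery, and the idea that the group must be named "version" for FCompiler's version matching is an assumption based on the patch earlier in this thread:

import re

banner = "G95 (GCC 4.0.3 (g95!) Apr 12 2006)"

# the patch's pattern: captures the release date
date_pattern = r'G95.*\(g95!\) (?P<version>.*)\).*'
# the GCC-based pattern: captures the compiler version number
gcc_pattern = r'G95.*\(GCC (?P<version>.*) \(g95!\)'

print re.match(date_pattern, banner).group('version')  # Apr 12 2006
print re.match(gcc_pattern, banner).group('version')   # 4.0.3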
Also, note that you can override the get_version() method entirely, if it's easier to grab the version using something other than a regex. You can look at hpux.py and ibm.py for examples.

-- 
Robert Kern
robert.kern at gmail.com

"I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth."
-- Umberto Eco

From Saqib.Sohail at colorado.edu Sun Apr 16 14:02:04 2006
From: Saqib.Sohail at colorado.edu (Saqib bin Sohail)
Date: Sun Apr 16 14:02:04 2006
Subject: [Numpy-discussion] Code Question
In-Reply-To:
References: <1145116214.444116365d326@webmail.colorado.edu> <1145133185.4441588148cb3@webmail.colorado.edu>
Message-ID: <1145221290.4442b0aa55961@webmail.colorado.edu>

Thanks Guys for all your prompt responses. I have tried to use the provided solutions, but I have had my share of issues, mixed with my lack of knowledge, to the point that I feel quite embarrassed to bother you guys.

Issue 1

I am running FC 3 with native python-2.3 and then I installed python-2.4 in it. numarray-1.5.1 seems to have installed with success in python-2.3. I have tried to install numpy-0.9.6-1.i586.rpm but I don't have python-base, and when I try to install python-base I get a long list of dependencies which I need. I haven't pursued further down that line; unfortunately I haven't been able to use numarray. I don't know how to use it, because people have repeatedly told me to use numpy, but I can't seem to get that installed.

Issue 2

To input the file, Ryan suggested to use scipy. I don't want to go down that path if only there is a simple way to input the file (I can clean up the file and format it in the right way in perl, I can do that in a heartbeat).

Issue 3

I don't want to use gnuplot functionality, or matplotlib; if only I am able to write the file, then again I can use perl to format it and use gnuplot then.

So if there is the simplest of ways in which I can just

i) read the file (formatting will be done in perl)
ii) get the fft
iii) write the file or files (and then use perl to format for gnuplot)

I am sure all of you will say why not use the existing functionalities, but after 3 days I haven't gotten anywhere. All I need to do is get the FFT of some sound files so that I can verify the results of the FFTs and compare them with my FFT code in VxWorks.

And Pierre, I started reading diveintopython.pdf but got nowhere when I tried two of its examples; the attached image shows what happened when I tried to run one of the examples on python-2.3, and the output wasn't according to what the guide suggested. (no output to be precise)

http://jobim.colorado.edu/~sohail/pythonExample.JPG

Thanks again guys.

Quoting Ryan Krauss :

> I guess it depends on how much you want to learn and what you want to do.
>
> I was able to load your data using
> data=scipy.io.read_array('monkey.dat')
>
> I had to comment out the first line to make it work. I couldn't make
> the fromfile method of numpy work because the data is actually fixed
> width.
>
> If you don't want to install scipy, you would need to learn enough
> Python to read the file and clean it up a little by hand.
>
> It seems like the first column is time and the second is the signal
> you want to fft. I was able to fft it with:
> myfft=numpy.fft(data[:,1])
> (I don't have the latest version of numpy and don't seem to have the
> refft function Robert mentioned).
> > t=data[:,0] > df=1/max(t) > df > maxf=8012 > fvect=arange(0,maxf+df,df) > > plot(fvect,abs(myfft)) > > I am plotting using matplotlib and the resulting figures are attached. > > If you really want to learn python for scientific and plotting > applications, I would highly recommend a few packages: > SciPy - some additional capabilities beyond Numpy (optimization, ode's , ...) > ipython - it is a really good interactive python shell > matplotlib - the best python 2d plotting package I am aware of > > Let me know if you have any additional questions. You can find out > about each package by googling it. They are all closely related to > Numpy and all have good mailing lists to help you. > > Ryan > > On 4/15/06, Saqib bin Sohail wrote: > > Do let me know if you get somewhere. > > > > Thanks > > > > > > Quoting Ryan Krauss : > > > > > email me the dat file and I could play with it a bit. If I can read > > > your input file, the rest should be easy. > > > > > > Ryan > > > > > > On 4/15/06, Saqib bin Sohail wrote: > > > > Hi guys > > > > > > > > I have never used python, but I wanted to compute FFT of audio files, I > > > came > > > > upon a page which had python code, so I installed Numpy but after > beating > > > the > > > > bush for a few days, I have finally come in here to ask. After taking > the > > > FFT I > > > > want to output it to a file and the use gnuplot to plot it. > > > > > > > > When I instaled NumPy, and ran the tests, it seemed that all passed > without > > > a > > > > problem. My input is a .dat file converted from .wav file by sox. > > > > > > > > Here is the code which obviously doesn't work because it seems that > changes > > > > have occured since this code was written. (not my code, just from some > > > website > > > > where a guy had written on how to do things which i require) > > > > > > > > import Numeric > > > > import FFT > > > > out_array=Numeric.array(out) > > > > out_fft=FFT.fft(out) > > > > > > > > offt=open('outfile_fft.dat','w') > > > > for x in range(len(out_fft)/2): > > > > offt.write('%f %f\n'%(1.0*x/wtime,abs(out_fft[x].real))) > > > > > > > > > > > > I do the following at the python prompt > > > > > > > > import numarray > > > > myFile = open('test.dat', 'r') > > > > my_array = numarray.arra(myFile) > > > > > > > > /* at this stage I wanted to see if it was correctly read */ > > > > > > > > print myArray > > > > [1632837691 1701605485 1952535072 ..., 538976288 538976288 > 168632368] > > > > > > > > it seems that these values do not correspond to the values in the file > (but > > > I > > > > guess the array is considering these as ints when infact these are > floats) > > > > > > > > anyway the problem starts when i try to do fft, because I can't seem to > > > find > > > > module or how to invoke it, > > > > > > > > the second problem is writing to the file, that code obviously doesn't > > > work, > > > > and in my search through various documentations, i found arrayrange() > but > > > > couldn't make it to work, call me stupid, but despite going through > several > > > > examples, i haven't been able to make the for loop worked in any case, > > > > > > > > it would be very kind of someone if he could at least tell me what i am > > > doing > > > > wrong and reply a simple example so that I can modify my code or at > least > > > be > > > > able to understand . 
> > > > > > > > Thanks > > > > > > > > > > > > > > > > -- > > > > Saqib bin Sohail > > > > PhD ECE > > > > University of Colorado at Boulder > > > > Res: (303) 786 0636 > > > > http://ucsu.colorado.edu/~sohail/index.html > > > > > > > > > > > > ------------------------------------------------------- > > > > This SF.Net email is sponsored by xPML, a groundbreaking scripting > language > > > > that extends applications into web and mobile media. Attend the live > > > webcast > > > > and join the prime developer group breaking into this new coding > territory! > > > > > http://sel.as-us.falkag.net/sel?cmd=lnk&kid=110944&bid=241720&dat=121642 > > > > _______________________________________________ > > > > Numpy-discussion mailing list > > > > Numpy-discussion at lists.sourceforge.net > > > > https://lists.sourceforge.net/lists/listinfo/numpy-discussion > > > > > > > > > > > > > -- > > Saqib bin Sohail > > PhD ECE > > University of Colorado at Boulder > > Res: (303) 786 0636 > > http://ucsu.colorado.edu/~sohail/index.html > > > > > -- Saqib bin Sohail PhD ECE University of Colorado at Boulder Res: (303) 786 0636 http://ucsu.colorado.edu/~sohail/index.html From robert.kern at gmail.com Sun Apr 16 14:03:01 2006 From: robert.kern at gmail.com (Robert Kern) Date: Sun Apr 16 14:03:01 2006 Subject: [Numpy-discussion] Re: g95 detection not working In-Reply-To: <4442AE89.8080303@gmail.com> References: <44429C55.2030500@gmail.com> <4442AE89.8080303@gmail.com> Message-ID: Andrew Jaffe wrote: > Thanks -- OK, in that case the following regexp works for me: > > version_pattern = r'G95.*\(GCC (?P.*) \(g95!\)' > > But are there different versions of the version string? Possibly. I don't really know. > Also on an unrelated f2py note: is the f2py mailing list being read by > the f2py developers? I've posted a question (about the status of F9x > "types") without reply... Pearu is really the only f2py developer, and he has just flown from his home in Estonia to Austin to work with us at Enthought for a month. I presume he has been busy preparing for his journey. -- Robert Kern robert.kern at gmail.com "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From robert.kern at gmail.com Sun Apr 16 14:26:06 2006 From: robert.kern at gmail.com (Robert Kern) Date: Sun Apr 16 14:26:06 2006 Subject: [Numpy-discussion] Re: Code Question In-Reply-To: <1145221290.4442b0aa55961@webmail.colorado.edu> References: <1145116214.444116365d326@webmail.colorado.edu> <1145133185.4441588148cb3@webmail.colorado.edu> <1145221290.4442b0aa55961@webmail.colorado.edu> Message-ID: Saqib bin Sohail wrote: > An Pierre, I started reading diveintopython.pdf but got nowhere when I tried > two of its examples, the attached image shows that when I tried to run one of > the examples on python-2.3 and the output wasn't according to what the guide > suggested. (no output to be precise) > > http://jobim.colorado.edu/~sohail/pythonExample.JPG Note the indentation. Indentation is important in Python. > Quoting Ryan Krauss : >>(I don't have the latest version of numpy and don't seem to have the >>refft function Robert mentioned). My example was wrong. It should have used "numpy.dft.refft()", not "numpy.refft()". 
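For the archives, the earlier sketch with that correction applied, still untested and still assuming the .dat file holds raw 32-bit floats (in later numpy releases the same function became numpy.fft.rfft):

import numpy

data = numpy.fromfile('test.dat', dtype=numpy.float32)
out_fft = numpy.dft.refft(data)   # real-input FFT, per the correction above

n = len(out_fft)
freqs = numpy.arange(n, dtype=numpy.float32) / len(data)
power = out_fft.real*out_fft.real + out_fft.imag*out_fft.imag

offt = open('outfile_fft.dat', 'w')
try:
    for i in range(n):
        offt.write('%f %f\n' % (freqs[i], power[i]))
finally:
    offt.close()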
-- Robert Kern robert.kern at gmail.com "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From robert.kern at gmail.com Sun Apr 16 14:37:02 2006 From: robert.kern at gmail.com (Robert Kern) Date: Sun Apr 16 14:37:02 2006 Subject: [Numpy-discussion] Re: Code Question In-Reply-To: <1145221290.4442b0aa55961@webmail.colorado.edu> References: <1145116214.444116365d326@webmail.colorado.edu> <1145133185.4441588148cb3@webmail.colorado.edu> <1145221290.4442b0aa55961@webmail.colorado.edu> Message-ID: Saqib bin Sohail wrote: > I am sure all of you will say why not use the existing functionalities, but > after 3 days I haven't gotten anywhere. All I need to do is get FFT of some > sound files so that I can verify the result of FFT's and compare them with my > FFT code in VxWorks. Well, if you are just trying to get an independent verification of your VxWorks FFT code, and you are much more comfortable with Perl, then you might want to use one of the FFT libraries available for Perl like Math::FFT. -- Robert Kern robert.kern at gmail.com "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From a.h.jaffe at gmail.com Sun Apr 16 15:18:02 2006 From: a.h.jaffe at gmail.com (Andrew Jaffe) Date: Sun Apr 16 15:18:02 2006 Subject: [Numpy-discussion] where() has started returning a tuple!? Message-ID: I think the following behavior is (only recently) wrong: In [7]: numpy.__version__ Out[7]: '0.9.7.2360' In [8]: numpy.nonzero([True, False, True]) Out[8]: array([0, 2]) In [9]: numpy.where([True, False, True]) Out[9]: (array([0, 2]),) Note the tuple output to where(), which should be the same as nonzero. Andrew From perry at stsci.edu Sun Apr 16 20:18:02 2006 From: perry at stsci.edu (Perry Greenfield) Date: Sun Apr 16 20:18:02 2006 Subject: [Numpy-discussion] where() has started returning a tuple!? In-Reply-To: Message-ID: see: http://sourceforge.net/mailarchive/forum.php?thread_id=10165581&forum_id=489 0 > -----Original Message----- > From: numpy-discussion-admin at lists.sourceforge.net > [mailto:numpy-discussion-admin at lists.sourceforge.net]On Behalf Of Andrew > Jaffe > Sent: Sunday, April 16, 2006 6:17 PM > To: numpy-discussion at lists.sourceforge.net > Subject: [Numpy-discussion] where() has started returning a tuple!? > > > I think the following behavior is (only recently) wrong: > > In [7]: numpy.__version__ > Out[7]: '0.9.7.2360' > > In [8]: numpy.nonzero([True, False, True]) > Out[8]: array([0, 2]) > > In [9]: numpy.where([True, False, True]) > Out[9]: (array([0, 2]),) > > Note the tuple output to where(), which should be the same as nonzero. > > Andrew > > > > ------------------------------------------------------- > This SF.Net email is sponsored by xPML, a groundbreaking > scripting language > that extends applications into web and mobile media. Attend the > live webcast > and join the prime developer group breaking into this new coding > territory! 
> http://sel.as-us.falkag.net/sel?cmd=lnk&kid=110944&bid=241720&dat=121642 > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/numpy-discussion > From a.h.jaffe at gmail.com Mon Apr 17 00:53:04 2006 From: a.h.jaffe at gmail.com (Andrew Jaffe) Date: Mon Apr 17 00:53:04 2006 Subject: [Numpy-discussion] Re: where() has started returning a tuple!? In-Reply-To: References: Message-ID: Aha, missed that thread (and the docstring -- my bad). And actually I misunderstood the effect of the change, anyway: a[where(a>0)] is still fine, it's just other activities like iterating over where(a>0) that is no longer possible in the same way. Thanks for the pointer! Andrew Perry Greenfield wrote: > see: > > http://sourceforge.net/mailarchive/forum.php?thread_id=10165581&forum_id=489 > 0 > >> -----Original Message----- >> From: numpy-discussion-admin at lists.sourceforge.net >> [mailto:numpy-discussion-admin at lists.sourceforge.net]On Behalf Of Andrew >> Jaffe >> Sent: Sunday, April 16, 2006 6:17 PM >> To: numpy-discussion at lists.sourceforge.net >> Subject: [Numpy-discussion] where() has started returning a tuple!? >> >> >> I think the following behavior is (only recently) wrong: >> >> In [7]: numpy.__version__ >> Out[7]: '0.9.7.2360' >> >> In [8]: numpy.nonzero([True, False, True]) >> Out[8]: array([0, 2]) >> >> In [9]: numpy.where([True, False, True]) >> Out[9]: (array([0, 2]),) >> >> Note the tuple output to where(), which should be the same as nonzero. >> >> Andrew >> From ryanlists at gmail.com Mon Apr 17 05:57:03 2006 From: ryanlists at gmail.com (Ryan Krauss) Date: Mon Apr 17 05:57:03 2006 Subject: [Numpy-discussion] Re: Code Question In-Reply-To: References: <1145116214.444116365d326@webmail.colorado.edu> <1145133185.4441588148cb3@webmail.colorado.edu> <1145221290.4442b0aa55961@webmail.colorado.edu> Message-ID: Alright Saqib, Robert is right that you should try fft in perl if you don't want to learn Python. But as I understand it, you want to read in this file, fft it, and write the fft to a file using only numarray. Attached is a script that does that. Most of the script is just low-level file io to avoid having to install scipy to read and write the arrays. Hope this helps, Ryan On 4/16/06, Robert Kern wrote: > Saqib bin Sohail wrote: > > > I am sure all of you will say why not use the existing functionalities, but > > after 3 days I haven't gotten anywhere. All I need to do is get FFT of some > > sound files so that I can verify the result of FFT's and compare them with my > > FFT code in VxWorks. > > Well, if you are just trying to get an independent verification of your VxWorks > FFT code, and you are much more comfortable with Perl, then you might want to > use one of the FFT libraries available for Perl like Math::FFT. > > -- > Robert Kern > robert.kern at gmail.com > > "I have come to believe that the whole world is an enigma, a harmless enigma > that is made terrible by our own mad attempt to interpret it as though it had > an underlying truth." > -- Umberto Eco > > > > ------------------------------------------------------- > This SF.Net email is sponsored by xPML, a groundbreaking scripting language > that extends applications into web and mobile media. Attend the live webcast > and join the prime developer group breaking into this new coding territory! 
> http://sel.as-us.falkag.net/sel?cmd=lnk&kid=110944&bid=241720&dat=121642 > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/numpy-discussion > -------------- next part -------------- A non-text attachment was scrubbed... Name: read_fft_write_numarray.py Type: text/x-python Size: 872 bytes Desc: not available URL: From chanley at stsci.edu Mon Apr 17 06:24:06 2006 From: chanley at stsci.edu (Christopher Hanley) Date: Mon Apr 17 06:24:06 2006 Subject: [Numpy-discussion] Vectorize bug In-Reply-To: <44404A5B.5010802@ieee.org> References: <00fa01c6601e$c7707840$0502010a@dsp.sun.ac.za> <44404A5B.5010802@ieee.org> Message-ID: <4443969D.4090604@stsci.edu> Travis Oliphant wrote: > I'm not sure if the Solaris crash is fixed or not yet after the recent > changes to SVN. There may be more than one bug here... The numpy.test() unit tests no longer cause segfaults on Solaris. All of my daily numpy regression tests are now passing for Solaris. Thank you for your time and help, Chris From michael.sorich at gmail.com Mon Apr 17 17:13:09 2006 From: michael.sorich at gmail.com (Michael Sorich) Date: Mon Apr 17 17:13:09 2006 Subject: [Numpy-discussion] using NaN, INT_MIN etc in ndarray instead of a masked array Message-ID: <16761e100604171712v195b47f5q111cb2c4519a03db@mail.gmail.com> On 4/8/06, Sasha wrote: > > ... > See above. For ndarray mask is always False unless an add-on module is > loaded that redefines arithmetic to recognize special bit-patterns > such as NaN or INT_MIN. > > Is it possible to implement masked values using these special bit patterns in the ndarray instead of using a separate MA class? If so has there been any thought as to whether this may be the better option. I think it would be preferable if the ability to handle masked data was available in the standard array class (ndarray), as this would increase the likelihood that functions built for numeric arrays will handle masked values well. It seems that ndarray already has decent support for nans (isnan() returns the equivalent of a boolean mask array), indicating that such an approach may be acceptable. How difficult is it to generalise the concept to other data types (int, string, bool)? Mike -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Mon Apr 17 19:53:01 2006 From: robert.kern at gmail.com (Robert Kern) Date: Mon Apr 17 19:53:01 2006 Subject: [Numpy-discussion] Re: using NaN, INT_MIN etc in ndarray instead of a masked array In-Reply-To: <16761e100604171712v195b47f5q111cb2c4519a03db@mail.gmail.com> References: <16761e100604171712v195b47f5q111cb2c4519a03db@mail.gmail.com> Message-ID: Michael Sorich wrote: > On 4/8/06, *Sasha* > wrote: > > ... > > See above. For ndarray mask is always False unless an add-on module is > loaded that redefines arithmetic to recognize special bit-patterns > such as NaN or INT_MIN. > > Is it possible to implement masked values using these special bit > patterns in the ndarray instead of using a separate MA class? If so has > there been any thought as to whether this may be the better option. I > think it would be preferable if the ability to handle masked data was > available in the standard array class (ndarray), as this would increase > the likelihood that functions built for numeric arrays will handle > masked values well. 
It seems that ndarray already has decent support for > nans (isnan() returns the equivalent of a boolean mask array), > indicating that such an approach may be acceptable. How difficult is it > to generalise the concept to other data types (int, string, bool)? Well, I'm certainly dead set against any change that would make all arrays that happen to contain those special values to be treated as masked arrays. -- Robert Kern robert.kern at gmail.com "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From oliphant.travis at ieee.org Mon Apr 17 23:04:04 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Mon Apr 17 23:04:04 2006 Subject: [Numpy-discussion] using NaN, INT_MIN etc in ndarray instead of a masked array In-Reply-To: <16761e100604171712v195b47f5q111cb2c4519a03db@mail.gmail.com> References: <16761e100604171712v195b47f5q111cb2c4519a03db@mail.gmail.com> Message-ID: <44448138.2080402@ieee.org> Michael Sorich wrote: > On 4/8/06, *Sasha* > wrote: > > ... > > See above. For ndarray mask is always False unless an add-on module is > loaded that redefines arithmetic to recognize special bit-patterns > such as NaN or INT_MIN. > > > Is it possible to implement masked values using these special bit > patterns in the ndarray instead of using a separate MA class? If so > has there been any thought as to whether this may be the better > option. I think it would be preferable if the ability to handle masked > data was available in the standard array class (ndarray), as this > would increase the likelihood that functions built for numeric arrays > will handle masked values well. It seems that ndarray already has > decent support for nans (isnan() returns the equivalent of a boolean > mask array), indicating that such an approach may be acceptable. How > difficult is it to generalise the concept to other data types (int, > string, bool)? > I don't think the approach can be generalized at all. It would only work with floating-point values and therefore is not particularly exciting. I think ultimately, making masked arrays a C-based sub-class is where masked array should go. For now the Python-based class is a good environment for developing the ideas behind how to preserve masked arrays through other functions if it is possible. It seems that masked arrays must do things quite differently than other arrays on certain applications, and I'm not altogether clear on how to support them in all the NumPy code. Because masked arrays are not used by everybody who uses NumPy arrays, it should be a separate sub-class. Ultimately, I hope we will get the basic array object into Python (what Tim was calling the super array) before 2.6 -Travis From svetosch at gmx.net Tue Apr 18 01:15:01 2006 From: svetosch at gmx.net (Sven Schreiber) Date: Tue Apr 18 01:15:01 2006 Subject: [Numpy-discussion] Toward release 1.0 of NumPy In-Reply-To: <443EDFE7.6010509@cox.net> References: <443D9543.8040601@ee.byu.edu> <443E096D.3040407@gmx.net> <443E7109.6080808@cox.net> <443EC2B4.807@cox.net> <443EDFE7.6010509@cox.net> Message-ID: <44449FC4.8020406@gmx.net> [Sorry for the late reaction, I was on vacation.] Tim Hochberg schrieb: >> > Here's my best guess as to what is going on: > 1. There is a relatively large group of people who use Kronecker > product as Alan does (probably the matrix as opposed to tensor math > folks). 
I'm guessing it's a large group since they manage to write the > definitions at both mathworld and planetmath. Yes. > 2. kron was meant to implement this. That's what I thought, anyway. > 2.5 People who need the other meaning of kron can just use outer, so > no real conflict. > 3. The implementation was either inappropriately generalized or it > was assumed that all inputs would be matrices (and hence rank-2). > > Assuming 3. is correct, and I'd like to hear from people if they think > that the behaviour in the non rank-2 cases is sensible, the next > question is whether the behaviour in the rank-2 cases makes sense. It > seem to, but I'm not a user of kron. If both of the preceeding are true, > it seems like a complete fix entails the following two things: > 1. Forbid arguments that are not rank-2. This allows all matrices, > which is really the main target here I think. > 2. Fix the return type issue. I have a fix for this ready to commit, > but I want to figure out the first part as well. > Both 1 and 2 sound very good to me as a user. So, should I still submit a new ticket about kron, or is it already being fixed? Greetings, Sven From a.u.r.e.l.i.a.n at gmx.net Tue Apr 18 01:46:04 2006 From: a.u.r.e.l.i.a.n at gmx.net (Johannes Loehnert) Date: Tue Apr 18 01:46:04 2006 Subject: [Numpy-discussion] Re: where In-Reply-To: References: Message-ID: <200604181045.05058.a.u.r.e.l.i.a.n@gmx.net> On Thursday 13 April 2006 19:16, Ryan Krauss wrote: > which makes this: > myvect=where((f>19.5) & (f<38) & > (phase>0),ones(shape(phase)),zeros(shape(phase))) > > actually really silly, sense all it is a complicated way to get back > the input of > (f>19.5) & (f<38) & (phase>0) > ...but you should cast the second to signed int32, otherwise a = (f>19.5) & (f<38) & (phase>0) print a-1 will give an array of 0's and 255's :) (since boolean arrays are by default upcast to unsigned int8) Johannes From ryanlists at gmail.com Tue Apr 18 05:31:15 2006 From: ryanlists at gmail.com (Ryan Krauss) Date: Tue Apr 18 05:31:15 2006 Subject: [Numpy-discussion] Re: where In-Reply-To: <200604181045.05058.a.u.r.e.l.i.a.n@gmx.net> References: <200604181045.05058.a.u.r.e.l.i.a.n@gmx.net> Message-ID: You are right. I actually did run into a problem with this. I was trying to subtract 360 degrees from the phase of some fft data and I multiplied -360 (no dot) times my bool array. It took me a while to track that one down. Ryan On 4/18/06, Johannes Loehnert wrote: > On Thursday 13 April 2006 19:16, Ryan Krauss wrote: > > which makes this: > > myvect=where((f>19.5) & (f<38) & > > (phase>0),ones(shape(phase)),zeros(shape(phase))) > > > > actually really silly, sense all it is a complicated way to get back > > the input of > > (f>19.5) & (f<38) & (phase>0) > > > > ...but you should cast the second to signed int32, otherwise > > a = (f>19.5) & (f<38) & (phase>0) > print a-1 > > will give an array of 0's and 255's :) (since boolean arrays are by default > upcast to unsigned int8) > > Johannes > > > ------------------------------------------------------- > This SF.Net email is sponsored by xPML, a groundbreaking scripting language > that extends applications into web and mobile media. Attend the live webcast > and join the prime developer group breaking into this new coding territory! 
> http://sel.as-us.falkag.net/sel?cmd=lnk&kid=110944&bid=241720&dat=121642 > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/numpy-discussion > From tim.hochberg at cox.net Tue Apr 18 06:24:09 2006 From: tim.hochberg at cox.net (Tim Hochberg) Date: Tue Apr 18 06:24:09 2006 Subject: [Numpy-discussion] Toward release 1.0 of NumPy In-Reply-To: <44449FC4.8020406@gmx.net> References: <443D9543.8040601@ee.byu.edu> <443E096D.3040407@gmx.net> <443E7109.6080808@cox.net> <443EC2B4.807@cox.net> <443EDFE7.6010509@cox.net> <44449FC4.8020406@gmx.net> Message-ID: <4444E7DD.2010209@cox.net> Sven Schreiber wrote: >[Sorry for the late reaction, I was on vacation.] > >Tim Hochberg schrieb: > > > >>Here's my best guess as to what is going on: >> 1. There is a relatively large group of people who use Kronecker >>product as Alan does (probably the matrix as opposed to tensor math >>folks). I'm guessing it's a large group since they manage to write the >>definitions at both mathworld and planetmath. >> >> > >Yes. > > > >> 2. kron was meant to implement this. >> >> > >That's what I thought, anyway. > > > >> 2.5 People who need the other meaning of kron can just use outer, so >>no real conflict. >> 3. The implementation was either inappropriately generalized or it >>was assumed that all inputs would be matrices (and hence rank-2). >> >>Assuming 3. is correct, and I'd like to hear from people if they think >>that the behaviour in the non rank-2 cases is sensible, the next >>question is whether the behaviour in the rank-2 cases makes sense. It >>seem to, but I'm not a user of kron. If both of the preceeding are true, >>it seems like a complete fix entails the following two things: >> 1. Forbid arguments that are not rank-2. This allows all matrices, >>which is really the main target here I think. >> 2. Fix the return type issue. I have a fix for this ready to commit, >>but I want to figure out the first part as well. >> >> >> > >Both 1 and 2 sound very good to me as a user. > >So, should I still submit a new ticket about kron, or is it already >being fixed? > > Go ahead and submit a ticket if you would. I have a fix here, but I've been waiting to submit it till I heard from some other people who use kron (and because I've been swamped the last couple of days). If you submit the ticket, that'll keep it from falling through the cracks. Thanks for the feedback, -tim >Greetings, >Sven > > > > From ndarray at mac.com Tue Apr 18 07:06:22 2006 From: ndarray at mac.com (Sasha) Date: Tue Apr 18 07:06:22 2006 Subject: [Numpy-discussion] using NaN, INT_MIN etc in ndarray instead of a masked array In-Reply-To: <44448138.2080402@ieee.org> References: <16761e100604171712v195b47f5q111cb2c4519a03db@mail.gmail.com> <44448138.2080402@ieee.org> Message-ID: On 4/18/06, Travis Oliphant wrote: > Michael Sorich wrote: > ... > > Is it possible to implement masked values using these special bit > > patterns in the ndarray instead of using a separate MA class? If so > > has there been any thought as to whether this may be the better > > option. I think it would be preferable if the ability to handle masked > > data was available in the standard array class (ndarray), as this > > would increase the likelihood that functions built for numeric arrays > > will handle masked values well. 
It seems that ndarray already has
> > decent support for nans (isnan() returns the equivalent of a boolean
> > mask array), indicating that such an approach may be acceptable. How
> > difficult is it to generalise the concept to other data types (int,
> > string, bool)?
> >
> I don't think the approach can be generalized at all. It would only
> work with floating-point values and therefore is not particularly exciting.
>
Not true. R supports "NA" for all its types except raw bytes.
For example:

> x<-logical(5)
> x
[1] FALSE FALSE FALSE FALSE FALSE
> x[1:2]=NA
> !x
[1] NA NA TRUE TRUE TRUE

> I think ultimately, making masked arrays a C-based sub-class is where
> masked array should go. For now the Python-based class is a good
> environment for developing the ideas behind how to preserve masked
> arrays through other functions if it is possible.
>
I've voiced my opposition to subclassing before. Here I believe it is more appropriate to have an add-on module that installs alternative math functions. Having two classes in the same application that are subtly different in the corner cases is already a problem with ma.array vs. ndarray; adding a third class will only make things worse.

> It seems that masked arrays must do things quite differently than other
> arrays on certain applications, and I'm not altogether clear on how to
> support them in all the NumPy code. Because masked arrays are not used
> by everybody who uses NumPy arrays, it should be a separate sub-class.
>
As far as I understand, people who don't use MA don't deal with missing values. For this category of users there will be no visible effect no matter how missing values are treated, as long as in the absence of missing values normal rules apply. Yes, many functions must treat missing values differently, but the same is true for NaNs. NumPy allows floating point arrays to have nans, but there is no real support beyond what happened to work at the OS level. For example:

>>> sort([5,nan,3,2])
array([ 5. , nan, 2. , 3. ])

Also, what is the justification for

>>> int_(nan)
0

?

> Ultimately, I hope we will get the basic array object into Python (what
> Tim was calling the super array) before 2.6

As far as I understand, that object will not come with arithmetic rules or math functions. Therefore, I don't see how this is relevant to the present discussion.

From oliphant.travis at ieee.org Tue Apr 18 09:39:11 2006
From: oliphant.travis at ieee.org (Travis Oliphant)
Date: Tue Apr 18 09:39:11 2006
Subject: [Numpy-discussion] using NaN, INT_MIN etc in ndarray instead of a masked array
In-Reply-To:
References: <16761e100604171712v195b47f5q111cb2c4519a03db@mail.gmail.com> <44448138.2080402@ieee.org>
Message-ID: <44451611.9070707@ieee.org>

Sasha wrote:
> On 4/18/06, Travis Oliphant wrote:
>
>> Michael Sorich wrote:
>> ...
>>
>>> Is it possible to implement masked values using these special bit
>>> patterns in the ndarray instead of using a separate MA class? If so
>>> has there been any thought as to whether this may be the better
>>> option. I think it would be preferable if the ability to handle masked
>>> data was available in the standard array class (ndarray), as this
>>> would increase the likelihood that functions built for numeric arrays
>>> will handle masked values well. It seems that ndarray already has
>>> decent support for nans (isnan() returns the equivalent of a boolean
>>> mask array), indicating that such an approach may be acceptable.
How >>> difficult is it to generalise the concept to other data types (int, >>> string, bool)? >>> >>> >> I don't think the approach can be generalized at all. It would only >> work with floating-point values and therefore is not particularly exciting. >> >> > Not true. R supports "NA" for all its types except raw bytes. > For example: > > >> x<-logical(5) >> x >> > [1] FALSE FALSE FALSE FALSE FALSE > >> x[1:2]=NA >> !x >> > [1] NA NA TRUE TRUE TRUE > For Boolean values there is "room" for a NA value, but what about arbitrary integers. Does R just limit the range of the integer value? That's what I meant: "fiddling with special-values" doesn't generalize to all data-types. >> arrays through other functions if it is possible. >> >> > I've voiced my opposition to subclassing before. And you haven't been very clear about why you are opposed. Just voicing concern is not enough. Python sub-classing in C amounts to exactly what masked arrays are: arrays with additional components in their structure (i.e. a mask). Please be more specific about whatever your concerns are with sub-classing. > Here I believe it is > more appropriate to have an add-on module that installs alternative > math functions. Sure that will work. But, we're talking about more than math functions. Ultimately masked array users will want *every* function they use to work "right" with masked arrays. > Having two classes in the same application that a > subtly different in the corner cases is already a problem with > ma.array vs. ndarray, adding the third class will only make things > worse. > I don't know what you are talking about. What is the "third class?" I'm talking about just making ma.array construct a sub-class.. >> It seems that masked arrays must do things quite differently than other >> arrays on certain applications, and I'm not altogether clear on how to >> support them in all the NumPy code. Because masked arrays are not used >> by everybody who uses NumPy arrays, it should be a separate sub-class. >> >> > As far as I understand, people who don't use MA don't deal with > missing values. For this category of users there will be no visible > effect no matter how missing values are treated as long as in the > absence of missing values, normal rules apply. Yes, many functions > must treat missing values differently, but the same is true for NaNs. > NumPy allows floating point arrays to have nans, but there is no real > support beyong what happened to work at the OS level. > Or we deal with missing values differently (i.e. manage it ourselves). Sure, there will be no behavioral effect, but the code will have to be re-written to "do the right thing" with masked arrays in such a way as to not slow everything else down (that's at least an "if" statement sprinkled throughout every sub-routine). Many people are not enthused about complicating the basic array object any more than necessary. If it can be shown that masked arrays can be integrated into the ndarray object without inordinate complication and/or slowness, then I don't think people would mind. The best way to prove that is to create a sub-class and change only the methods / functions that are necessary. That's really all I'm saying. > >> Ultimately, I hope we will get the basic array object into Python (what >> Tim was calling the super array) before 2.6 >> > > As far as I understand, that object will not come with arithmetic > rules or math functions. Therefore, I don't see how this is relevant > to the present discussion. 
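To make the R comparison concrete: a sentinel-based integer NA can be emulated by hand on top of a plain ndarray, but nothing in numpy itself treats INT_MIN specially, so the masking below is entirely user code, which is exactly the kind of add-on behaviour being debated (an illustrative sketch only):

import sys
import numpy

NA = -sys.maxint - 1                  # R's integer NA bit pattern (INT_MIN)
x = numpy.array([1, 2, NA, 4])

mask = (x == NA)                      # recover a boolean mask from the sentinel
good = x[numpy.logical_not(mask)]     # drop the "missing" element by hand
print good.sum()                      # 7: reduced over non-missing values only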
> Because it will help all array objects talk more cleanly to each other. But, if you are so opposed to sub-classing (which I'm not sure why in this case), then it may not matter. -Travis From strang at nmr.mgh.harvard.edu Tue Apr 18 10:37:03 2006 From: strang at nmr.mgh.harvard.edu (Gary Strangman) Date: Tue Apr 18 10:37:03 2006 Subject: [Numpy-discussion] using NaN, INT_MIN etc in ndarray instead of a masked array In-Reply-To: <44451611.9070707@ieee.org> References: <16761e100604171712v195b47f5q111cb2c4519a03db@mail.gmail.com> <44448138.2080402@ieee.org> <44451611.9070707@ieee.org> Message-ID: >> Not true. R supports "NA" for all its types except raw bytes. >> For example: (snip) > > For Boolean values there is "room" for a NA value, but what about arbitrary > integers. Does R just limit the range of the integer value? That's what I > meant: "fiddling with special-values" doesn't generalize to all data-types. In R, I believe NA = -sys.maxint-1 Gary From oliphant.travis at ieee.org Tue Apr 18 11:09:03 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Tue Apr 18 11:09:03 2006 Subject: [Numpy-discussion] String (and unicode) comparisons and per-thread error handling fixed Message-ID: <44452B04.4090403@ieee.org> String comparisons were added last week. Today, I added per-thread error handling to NumPy. There is 1 more enhancement (scalar math) prior to 0.9.8 release --- but it will probably take 1-2 weeks. The new error handling means that the three-scope system is gone. Now, there is only one per-Python-thread global scope for error handling. If you change the error handling it will affect all ufuncs. Because of this, the seterr function now returns an object with the old error-handling information. This object must be passed to umath.seterrobj() in order to restore the error handling. -Travis From tim.hochberg at cox.net Tue Apr 18 11:21:06 2006 From: tim.hochberg at cox.net (Tim Hochberg) Date: Tue Apr 18 11:21:06 2006 Subject: [Numpy-discussion] String (and unicode) comparisons and per-thread error handling fixed In-Reply-To: <44452B04.4090403@ieee.org> References: <44452B04.4090403@ieee.org> Message-ID: <44452D53.70009@cox.net> Travis Oliphant wrote: > > String comparisons were added last week. Today, I added per-thread > error handling to NumPy. There is 1 more enhancement (scalar math) > prior to 0.9.8 release --- but it will probably take 1-2 weeks. Oops! I'm about 2/3 done doing this one too. I think I'll go ahead and finish mine up and see how our approaches stack up performance wise and see if there's any of mine that's useful to roll into yours. -tim > > The new error handling means that the three-scope system is gone. > Now, there is only one per-Python-thread global scope for error > handling. If you change the error handling it will affect all > ufuncs. Because of this, the seterr function now returns an object > with the old error-handling information. This object must be passed > to umath.seterrobj() in order to restore the error handling. > > -Travis > > > > ------------------------------------------------------- > This SF.Net email is sponsored by xPML, a groundbreaking scripting > language > that extends applications into web and mobile media. Attend the live > webcast > and join the prime developer group breaking into this new coding > territory! 
> http://sel.as-us.falkag.net/sel?cmd=lnk&kid=110944&bid=241720&dat=121642 > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/numpy-discussion > > From oliphant.travis at ieee.org Tue Apr 18 12:14:14 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Tue Apr 18 12:14:14 2006 Subject: [Numpy-discussion] String (and unicode) comparisons and per-thread error handling fixed In-Reply-To: <44452D53.70009@cox.net> References: <44452B04.4090403@ieee.org> <44452D53.70009@cox.net> Message-ID: <44453A5E.4020506@ieee.org> Tim Hochberg wrote: > Travis Oliphant wrote: > >> >> String comparisons were added last week. Today, I added per-thread >> error handling to NumPy. There is 1 more enhancement (scalar math) >> prior to 0.9.8 release --- but it will probably take 1-2 weeks. > > Oops! I'm about 2/3 done doing this one too. I think I'll go ahead > and finish mine up and see how our approaches stack up performance > wise and see if there's any of mine that's useful to roll into yours. Darn. I thought I gave you enough time.... :-) On the other hand, all I did was change the way the error-mode is being looked-up (from the three dictionaries to just one). It's not much different than before except for that. I didn't do anything about the other ideas you spoke of. I did add a simple object to reset the error mode when it gets deleted, and had to fiddle with the seterr code a little to accept that object so that both methods of resetting the error mode work. A stack can certainly be built on top of what is now there (I'm thinking for numarray compatibility...), but I didn't do that. Sorry for stepping on your toes. I'm just anxious... I'll be gone for a couple of days and won't be working on NumPy/SciPy, so feel free to adjust. -Travis From rhl at astro.princeton.edu Tue Apr 18 13:07:04 2006 From: rhl at astro.princeton.edu (Robert Lupton) Date: Tue Apr 18 13:07:04 2006 Subject: [Numpy-discussion] Infinite recursion in numpy called from swig generated code In-Reply-To: References: <5809AC56-B2DF-4403-B7BC-9AEEAAC78505@astro.princeton.edu> <43FD32E4.10600@ieee.org> <44203F91.7010505@ieee.org> Message-ID: The latest version of swig (1.3.28 or 1.3.29) has broken my multiple-inheritance-from-C-and-numpy application; more specifically, it generates an infinite loop in numpy-land. I'm using numpy (0.9.6), and here's the offending code. Ideas anyone? I've pasted the crucial part of numpy.lib.UserArray onto the end of this message (how do I know? because you can replace the "from numpy.lib.UserArray" with this, and the problem persists). ##################################################### from numpy.lib.UserArray import * import types class myImage(types.ObjectType): def __init__(self, *args): this = None try: self.this.append(this) except: self.this = this class Image(UserArray, myImage): def __init__(self, *args): myImage.__init__(self, *args) ##################################################### The symptoms are: from recursionBug import *; Image(myImage()) ------------------------------------------------------------ Traceback (most recent call last): File "", line 1, in ? 
File "recursionBug.py", line 32, in __init__ myImage.__init__(self, *args) File "recursionBug.py", line 26, in __init__ except: self.this = this File "/sw/lib/python2.4/site-packages/numpy/lib/UserArray.py", line 187, in __setattr__ self.array.__setattr__(attr, value) File "/sw/lib/python2.4/site-packages/numpy/lib/UserArray.py", line 193, in __getattr__ return self.array.__getattribute__(attr) ... File "/sw/lib/python2.4/site-packages/numpy/lib/UserArray.py", line 193, in __getattr__ return self.array.__getattribute__(attr) File "/sw/lib/python2.4/site-packages/numpy/lib/UserArray.py", line 193, in __getattr__ return self.array.__getattribute__(attr) RuntimeError: maximum recursion depth exceeded The following stripped down piece of numpy seems to be the problem: class UserArray(object): def __setattr__(self,attr,value): try: self.array.__setattr__(attr, value) except AttributeError: object.__setattr__(self, attr, value) # Only called after other approaches fail. def __getattr__(self,attr): return self.array.__getattribute__(attr) R From cookedm at physics.mcmaster.ca Tue Apr 18 13:10:02 2006 From: cookedm at physics.mcmaster.ca (David M. Cooke) Date: Tue Apr 18 13:10:02 2006 Subject: [Numpy-discussion] Trac Wikis closed for anonymous edits until further notice In-Reply-To: <44421025.9060804@gmail.com> (Robert Kern's message of "Sun, 16 Apr 2006 04:36:37 -0500") References: <44421025.9060804@gmail.com> Message-ID: Robert Kern writes: > We've been hit badly by spammers, so I can only presume our Trac sites are now > on the traded spam lists. I am going to turn off anonymous edits for now. Ticket > creation will probably still be left open for now. Another thing that's concerned me is closing of tickets by anonymous; can we turn that off? It disturbs me when I'm browsing the RSS feed and I see that. If a user who's not a developer thinks it could be closed, they could post a comment saying that, and a developer could close it. > Many thanks to David Cooke for quickly removing the spam. The RSS feeds are great for that. Although having a way to quickly revert a change would have made it easier :-) > I am looking into ways to allow people to register themselves with the Trac > sites so they can edit the Wikis and submit tickets without needing to be added > by a project admin. that'd be good. -- |>|\/|< /--------------------------------------------------------------------------\ |David M. Cooke http://arbutus.physics.mcmaster.ca/dmc/ |cookedm at physics.mcmaster.ca From oliphant.travis at ieee.org Tue Apr 18 13:50:09 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Tue Apr 18 13:50:09 2006 Subject: [Numpy-discussion] Infinite recursion in numpy called from swig generated code In-Reply-To: References: <5809AC56-B2DF-4403-B7BC-9AEEAAC78505@astro.princeton.edu> <43FD32E4.10600@ieee.org> <44203F91.7010505@ieee.org> Message-ID: <444550CF.6090100@ieee.org> Robert Lupton wrote: > The latest version of swig (1.3.28 or 1.3.29) has broken my > multiple-inheritance-from-C-and-numpy application; more specifically, > it generates an infinite loop in numpy-land. I'm using numpy (0.9.6), > and here's the offending code. Ideas anyone? I've pasted the crucial > part of numpy.lib.UserArray onto the end of this message (how do I know? > because you can replace the "from numpy.lib.UserArray" with this, and > the problem persists). This is a problem in the getattr code of UserArray. This is fixed in SVN. 
But, you can just replace the getattr code in UserArray.py with the following:

def __getattr__(self,attr):
    if (attr == 'array'):
        return object.__getattribute__(self, attr)
    return self.array.__getattribute__(attr)

Thanks for finding and reporting this.

-Travis

From christian at marquardt.sc Tue Apr 18 14:48:06 2006
From: christian at marquardt.sc (Christian Marquardt)
Date: Tue Apr 18 14:48:06 2006
Subject: [Numpy-discussion] using NaN, INT_MIN etc in ndarray instead of a masked array
In-Reply-To:
References: <16761e100604171712v195b47f5q111cb2c4519a03db@mail.gmail.com> <44448138.2080402@ieee.org> <44451611.9070707@ieee.org>
Message-ID: <20053.84.167.224.64.1145396854.squirrel@webmail.marquardt.sc>

On Tue, April 18, 2006 19:36, Gary Strangman wrote:
>
>>> Not true. R supports "NA" for all its types except raw bytes.
>>> For example:
> (snip)
>>
>> For Boolean values there is "room" for a NA value, but what about
>> arbitrary
>> integers. Does R just limit the range of the integer value? That's
>> what I
>> meant: "fiddling with special-values" doesn't generalize to all
>> data-types.
>
> In R, I believe NA = -sys.maxint-1

Don't know if this helps, but I have found the following in the R Data Import/Export Manual (in section 6.5.1, available at http://cran.r-project.org/doc/manuals/R-data.html):

   The missing value for R logical and integer types is INT_MIN, the
   smallest representable int defined in the C header limits.h, normally
   corresponding to the bit pattern 0x80000000.

For doubles (I think R only uses double precision internally), it's a bit more complex apparently; in the section mentioned above, the authors explain that

   [If R's internal constant definitions / library functions can't be
   used], on all common platforms IEC 60559 (aka IEEE 754) arithmetic is
   used, so standard C facilities can be used to test for or set Inf,
   -Inf and NaN values. On such platforms NA is represented by the NaN
   value with low-word 0x7a2 (1954 in decimal).

The implementation of the floating point NA value is done in the file arithmetic.c of the R source code; the relevant code snippets defining the NA "value" are (I believe)

typedef union
{
    double value;
    unsigned int word[2];
} ieee_double;

#ifdef WORDS_BIGENDIAN
static CONST int hw = 0;
static CONST int lw = 1;
#else  /* !WORDS_BIGENDIAN */
static CONST int hw = 1;
static CONST int lw = 0;
#endif /* WORDS_BIGENDIAN */

static double R_ValueOfNA(void)
{
    /* The gcc shipping with RedHat 9 gets this wrong without
     * the volatile declaration. Thanks to Marc Schwartz. */
    volatile ieee_double x;
    x.word[hw] = 0x7ff00000;
    x.word[lw] = 1954;
    return x.value;
}

and the tests for a number being NA or NaN are

int R_IsNA(double x)
{
    if (isnan(x)) {
        ieee_double y;
        y.value = x;
        return (y.word[lw] == 1954);
    }
    return 0;
}

int R_IsNaN(double x)
{
    if (isnan(x)) {
        ieee_double y;
        y.value = x;
        return (y.word[lw] != 1954);
    }
    return 0;
}

Hope this is useful,

Christian.

From twegener at radlogic.com.au Tue Apr 18 18:07:02 2006
From: twegener at radlogic.com.au (Tim Wegener)
Date: Tue Apr 18 18:07:02 2006
Subject: [Numpy-discussion] Backporting numpy to Python 2.2
Message-ID: <20060419103554.4ac1df4a.twegener@radlogic.com.au>

Hi,

I am attempting to backport numpy-0.9.6 to be compatible with python 2.2. (Some of our machines run python 2.2 as part of Red Hat 9 and Red Hat 7.3 and it is hazardous to alter the standard setup.) I was able to change most of the 2.3-isms to be 2.2 compatible (see the attached patch).
However I had problems compiling the following C module:

In file included from numpy/core/src/multiarraymodule.c:64:
numpy/core/src/arrayobject.c: In function `arraydescr_dealloc':
numpy/core/src/arrayobject.c:8417: warning: passing arg 1 of pointer to function from incompatible pointer type
numpy/core/src/multiarraymodule.c: In function `PyArray_DescrConverter':
numpy/core/src/multiarraymodule.c:4072: `PyBool_Type' undeclared (first use in this function)
numpy/core/src/multiarraymodule.c: In function `setup_scalartypes':
numpy/core/src/multiarraymodule.c:5736: `PyBool_Type' undeclared (first use in this function)
numpy/core/src/multiarraymodule.c: In function `initmultiarray':
numpy/core/src/multiarraymodule.c:5897: `PyObject_SelfIter' undeclared (first use in this function)
error: Command "gcc -DNDEBUG -O2 -g -pipe -march=i386 -mcpu=i686 -D_GNU_SOURCE -fPIC -fPIC -Ibuild/src/numpy/core/src -Inumpy/core/include -Ibuild/src/numpy/core -Inumpy/core/src -Inumpy/core/include -I/usr/include/python2.2 -c numpy/core/src/multiarraymodule.c -o build/temp.linux-i686-2.2/multiarraymodule.o" failed with exit status 1

Is it possible to modify this module for python 2.2 compatibility, or have I reached a dead end? It would be great if numpy were compatible with 2.2 out of the box, given that 2.3 is only a couple of years old and 2.2 is still quite widely deployed. I am trying to migrate to numpy from Numeric, which worked happily with 2.2.

FYI, a quick summary of the compatibility amendments to the python code:
- backported os.walk
- backported enumerate
- backported distutils.log
- used slices instead of list.index(item, )
- used 'r' mode instead of 'U' mode (it didn't seem that universal newline support was needed where it was used)
- used the {} way of building a new dict rather than using keyword args to the dict constructor
- from __future__ import generators
- used str.count(substr) rather than substr in str
- used os.sep rather than os.path.sep
- commented out some of the new Configuration keyword arguments (download_url and classifiers)

The above don't really affect the functionality, but a couple of more unusual changes were needed as well:
- had to add "self.compiler.exe_extension = ''" to numpy/distutils/command/config.py (see patch)
- had to change the following to an empty dict: "kws = {'depends':ext.depends}" in numpy/distutils/command/build_ext.py (see patch)

These two changes may have unwanted side effects, and a better fix is probably needed there.

Regards,
Tim

-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: numpy-0.9.6_patched_for_py2.2_diff.txt
URL:

From oliphant at ee.byu.edu Tue Apr 18 20:03:01 2006
From: oliphant at ee.byu.edu (Travis Oliphant)
Date: Tue Apr 18 20:03:01 2006
Subject: [Numpy-discussion] Performance problems with strided arrays in NumPy
In-Reply-To: <20060414213511.GA14355@xot.carabos.com>
References: <20060414213511.GA14355@xot.carabos.com>
Message-ID: <4445A822.60207@ee.byu.edu>

faltet at xot.carabos.com wrote:

>Hi,
>
>I'm seeing some slowness in NumPy when dealing with strided arrays.
>numarray is dealing better with these situations, so I guess that
>something could be done in NumPy about this. Below are the situations
>that I've found up to now (maybe there are others). For the timings,
>I've used numpy 0.9.7.2278 and numarray 1.5.1.
>
>

The source of this slowness is the use in numarray of special-cases for certain-sized byte-copies.
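The slowdown is easy to see from pure Python; a small hedged timing sketch (figures vary by machine) that copies a contiguous array and then a strided view of the same data:

    import timeit
    # Copying a strided view exercises the strided-copy path under discussion.
    setup = "import numpy; a = numpy.arange(1000000.0)%s"
    for view in ("", "[::10]"):
        t = timeit.Timer("b = a.copy()", setup % view)
        print view or "contiguous", min(t.repeat(3, 10))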
Apparently, it is *much* faster to do

((double *)dst)[0] = ((double *)src)[0]

when you have aligned data than it is to do

memmove(dst, src, sizeof(double))

This is a useful piece of knowledge to have for optimization. There may be other optimizations like that already used by Numarray but still needing to be adapted for NumPy. I applied an optimization to take advantage of this when possible and got a 10x speed-up in the 1-d case.

My timings for your benchmark with current SVN of NumPy are:

NumPy: [0.021701812744140625, 0.021739959716796875, 0.021548032760620117]
Numarray: [0.052516937255859375, 0.052685976028442383, 0.052355051040649414]

Old timings:

NumPy: [~0.09, ~0.09, ~0.09]
Numarray: [~0.05, ~0.05, ~0.05]

-Travis

From ndarray at mac.com Tue Apr 18 20:26:16 2006
From: ndarray at mac.com (Sasha)
Date: Tue Apr 18 20:26:16 2006
Subject: [Numpy-discussion] Performance problems with strided arrays in NumPy
In-Reply-To: <4445A822.60207@ee.byu.edu>
References: <20060414213511.GA14355@xot.carabos.com> <4445A822.60207@ee.byu.edu>
Message-ID:

On 4/18/06, Travis Oliphant wrote:
> [...]
> Apparently, it is *much* faster to do
>
> ((double *)dst)[0] = ((double *)src)[0]
>
> when you have aligned data than it is to do
>
> memmove(dst, src, sizeof(double))
>
> This is a useful piece of knowledge to have for optimization.

This is not surprising because memmove has to assume arbitrary alignment and possibility of overlap between src and dst areas.

From tim.hochberg at cox.net Wed Apr 19 08:58:04 2006
From: tim.hochberg at cox.net (Tim Hochberg)
Date: Wed Apr 19 08:58:04 2006
Subject: [Numpy-discussion] seterr changes
Message-ID: <44465DEE.8090703@cox.net>

Hi Travis et al,

I started looking at your seterr changes. I stared at yours for a while then I stared at mine for a while. Then I decided that mine wouldn't work right in the presence of threads. Then I decided that yours wouldn't work right in the presence of threads either. Specifically, it looks like ufunc_update_use_defaults isn't going to work. I think I know how to fix that, but I'm not sure that it's worth the trouble since I also did some benchmarking and it appears that the benefit of special casing is minimal.

I looked at six cases: small (len-1), medium (len-1e4) and large (len-1e6) arrays with error checking on and error checking off. For medium and large arrays, I could discern no difference at all. For small arrays, there may be some difference, but it appears to be less than 5%. I'm not sure it's worth working through a bunch of finicky thread stuff to get just 5% back. If these benchmark numbers hold up I'd be inclined to rip out the use_default support since it's complicated enough that I know we'll end up chasing a few evil thread related bugs down through it.

I'll include the benchmarking code below.
If people could (a) look it over and confirm that I'm not doing something bogus and (b) try it on some different platforms and see if they see a more significant difference, I'd appreciate it.

I'm also curious about the seterr interface. It returns ufunc_values_obj. I wasn't sure how one is supposed to pass that back into seterr, so I modified seterr to instead return a dictionary. I also modified it so that the seterr function itself has no defaults (or rather they're all None). Instead, any unspecified values are taken from the current error state. Thus seterr(divide="warn") changes only the divide state, leaving the other entries alone.

Regards,

-tim

if True:
    from timeit import Timer
    setup = """
import numpy
numpy.seterr(divide="%s")
a = numpy.zeros([%s], dtype=float)
"""
    for size in [1, 10000, 1000000]:
        for i in range(3):
            for state in ['ignore', 'warn']:
                reps = min(100000000 / size, 100000)
                timer = Timer("a * a", setup % (state, size))
                print "%s|%s =>" % (state, size), timer.timeit(reps)
        print
    print

From arkaitz.bitorika at gmail.com Wed Apr 19 10:30:03 2006
From: arkaitz.bitorika at gmail.com (Arkaitz Bitorika)
Date: Wed Apr 19 10:30:03 2006
Subject: [Numpy-discussion] Floating point exception with numpy and embedded python interpreter
Message-ID: <81934aa60604191029h4d8a8d9bl550fa58cc67d3d5e@mail.gmail.com>

Hi,

I'm embedding Python in a big C++ program (the NS network simulator) and I have problems when importing the numpy module; I get a Floating Point exception. The C code that causes the exception is:

Py_Initialize();
PyObject* module = PyImport_ImportModule("numpy");
Py_DECREF(module);

I'm running Ubuntu Breezy on a dual processor Dell machine, with the stock python and numpy 0.9.6. One strange thing is that I haven't been able to reproduce the crash by writing a minimal C program with the code above; it only crashes when added to my program. I've been embedding Python for ages in the same program and other modules work fine, only numpy fails.

I've debugged the issue a bit and I've seen that the exception is thrown when the numpy __init__.py tries to import the core module. The GDB backtrace is pasted at the end. Any idea what may be going wrong?
Thanks, Arkaitz 0xb7900fd2 in initumath () at build/src/numpy/core/src/umathmodule.c:10321 10321 pinf *= mul; (gdb) bt #0 0xb7900fd2 in initumath () at build/src/numpy/core/src/umathmodule.c:10321 #1 0xb7e4e310 in _PyImport_LoadDynamicModule () from /usr/lib/libpython2.4.so.1.0 #2 0xb7e4c450 in _PyImport_FindModule () from /usr/lib/libpython2.4.so.1.0 #3 0xb7e4cc01 in PyImport_ReloadModule () from /usr/lib/libpython2.4.so.1.0 #4 0xb7e4ce26 in PyImport_ReloadModule () from /usr/lib/libpython2.4.so.1.0 #5 0xb7e4d2c6 in PyImport_ImportModuleEx () from /usr/lib/libpython2.4.so.1.0 #6 0xb7e22d9e in _PyUnicodeUCS4_ToLowercase () from /usr/lib/libpython2.4.so.1.0 #7 0xb7df5923 in PyCFunction_Call () from /usr/lib/libpython2.4.so.1.0 #8 0xb7dc8fdf in PyObject_Call () from /usr/lib/libpython2.4.so.1.0 #9 0xb7e2a92c in PyEval_CallObjectWithKeywords () from /usr/lib/libpython2.4.so.1.0 #10 0xb7e2e8f9 in PyEval_EvalFrame () from /usr/lib/libpython2.4.so.1.0 #11 0xb7e31a2d in PyEval_EvalCodeEx () from /usr/lib/libpython2.4.so.1.0 #12 0xb7e31b76 in PyEval_EvalCode () from /usr/lib/libpython2.4.so.1.0 #13 0xb7e4a525 in PyImport_ExecCodeModuleEx () from /usr/lib/libpython2.4.so.1.0 #14 0xb7e4a8e9 in PyImport_ExecCodeModule () from /usr/lib/libpython2.4.so.1.0 #15 0xb7e4c73e in _PyImport_FindModule () from /usr/lib/libpython2.4.so.1.0 #16 0xb7e4cc01 in PyImport_ReloadModule () from /usr/lib/libpython2.4.so.1.0 #17 0xb7e4ce26 in PyImport_ReloadModule () from /usr/lib/libpython2.4.so.1.0 #18 0xb7e4d2c6 in PyImport_ImportModuleEx () from /usr/lib/libpython2.4.so.1.0 #19 0xb7e22d9e in _PyUnicodeUCS4_ToLowercase () from /usr/lib/libpython2.4.so.1.0 #20 0xb7df5923 in PyCFunction_Call () from /usr/lib/libpython2.4.so.1.0 #21 0xb7dc8fdf in PyObject_Call () from /usr/lib/libpython2.4.so.1.0 #22 0xb7e2a92c in PyEval_CallObjectWithKeywords () from /usr/lib/libpython2.4.so.1.0 #23 0xb7e2e8f9 in PyEval_EvalFrame () from /usr/lib/libpython2.4.so.1.0 #24 0xb7e31a2d in PyEval_EvalCodeEx () from /usr/lib/libpython2.4.so.1.0 #25 0xb7e31b76 in PyEval_EvalCode () from /usr/lib/libpython2.4.so.1.0 #26 0xb7e5667f in PyRun_String () from /usr/lib/libpython2.4.so.1.0 #27 0xb7e2fce6 in PyEval_EvalFrame () from /usr/lib/libpython2.4.so.1.0 #28 0xb7e31a2d in PyEval_EvalCodeEx () from /usr/lib/libpython2.4.so.1.0 #29 0xb7e3011a in PyEval_EvalFrame () from /usr/lib/libpython2.4.so.1.0 #30 0xb7e31a2d in PyEval_EvalCodeEx () from /usr/lib/libpython2.4.so.1.0 #31 0xb7de31b6 in PyFunction_SetClosure () from /usr/lib/libpython2.4.so.1.0 #32 0xb7dc8fdf in PyObject_Call () from /usr/lib/libpython2.4.so.1.0 #33 0xb7dd079b in PyMethod_New () from /usr/lib/libpython2.4.so.1.0 #34 0xb7dc8fdf in PyObject_Call () from /usr/lib/libpython2.4.so.1.0 #35 0xb7dcfd7b in PyInstance_NewRaw () from /usr/lib/libpython2.4.so.1.0 #36 0xb7dc8fdf in PyObject_Call () from /usr/lib/libpython2.4.so.1.0 #37 0xb7e2f5d2 in PyEval_EvalFrame () from /usr/lib/libpython2.4.so.1.0 #38 0xb7e31a2d in PyEval_EvalCodeEx () from /usr/lib/libpython2.4.so.1.0 #39 0xb7e31b76 in PyEval_EvalCode () from /usr/lib/libpython2.4.so.1.0 #40 0xb7e4a525 in PyImport_ExecCodeModuleEx () from /usr/lib/libpython2.4.so.1.0 #41 0xb7e4a8e9 in PyImport_ExecCodeModule () from /usr/lib/libpython2.4.so.1.0 #42 0xb7e4c73e in _PyImport_FindModule () from /usr/lib/libpython2.4.so.1.0 #43 0xb7e4cc01 in PyImport_ReloadModule () from /usr/lib/libpython2.4.so.1.0 #44 0xb7e4ce26 in PyImport_ReloadModule () from /usr/lib/libpython2.4.so.1.0 #45 0xb7e4d2c6 in PyImport_ImportModuleEx () from 
/usr/lib/libpython2.4.so.1.0 #46 0xb7e22d9e in _PyUnicodeUCS4_ToLowercase () from /usr/lib/libpython2.4.so.1.0 #47 0xb7df5923 in PyCFunction_Call () from /usr/lib/libpython2.4.so.1.0 #48 0xb7dc8fdf in PyObject_Call () from /usr/lib/libpython2.4.so.1.0 #49 0xb7dcc6c0 in PyObject_CallFunction () from /usr/lib/libpython2.4.so.1.0 #50 0xb7e4d745 in PyImport_Import () from /usr/lib/libpython2.4.so.1.0 #51 0xb7e4d918 in PyImport_ImportModule () from /usr/lib/libpython2.4.so.1.0 -------------- next part -------------- An HTML attachment was scrubbed... URL: From strawman at astraw.com Wed Apr 19 10:38:11 2006 From: strawman at astraw.com (Andrew Straw) Date: Wed Apr 19 10:38:11 2006 Subject: [Numpy-discussion] Floating point exception with numpy and embedded python interpreter In-Reply-To: <81934aa60604191029h4d8a8d9bl550fa58cc67d3d5e@mail.gmail.com> References: <81934aa60604191029h4d8a8d9bl550fa58cc67d3d5e@mail.gmail.com> Message-ID: <44467576.1020708@astraw.com> Arkaitz Bitorika wrote: > Hi, > > I'm embedding Python in a big C++ program (the NS network simulator) > and I have problems when importing the numpy module, I get a Floating > Point exception. The C code that causes the exception is: I guess you mean a CPU/kernel level floating point exception (SIGFPE), not a Python exception? > > Py_Initialize(); > PyObject* module = PyImport_ImportModule("numpy"); > Py_DECREF(module); > > > I'm running Ubuntu Breezy on a dual processor Dell machine, with the > stock python and numpy 0.9.6. One strange thing is that I haven't been > able to reproduce the crash by writing a minimal C program with the > code above, it only crashes when added to my program. Does your program change error bits on the FPU or SSE units on your processor? (What processor are you using?) > I've been embedding Python for ages on the same program and other > modules work fine, only numpy fails. Most other modules don't use the SSE units, so wouldn't get hit by such a bug. > > I've debugged the issue a bit and I've seen that the exception is > thrown when the numpy __init__.py tries to import the core module. The > GDB backtrace is pasted at the end. > Any idea what may be going wrong? glibc 2.3.2 (e.g. in debian sarge) has a bug where the SSE unit has an error bit set wrong. But I'd guess Ubuntu isn't using this version of glibc, so I think the problem may be elsewhere. http://sources.redhat.com/bugzilla/show_bug.cgi?id=10 From strawman at astraw.com Wed Apr 19 11:30:10 2006 From: strawman at astraw.com (Andrew Straw) Date: Wed Apr 19 11:30:10 2006 Subject: [Numpy-discussion] Floating point exception with numpy and embedded python interpreter In-Reply-To: References: <81934aa60604191029h4d8a8d9bl550fa58cc67d3d5e@mail.gmail.com> <44467576.1020708@astraw.com> Message-ID: <4446819D.3030401@astraw.com> Arkaitz Bitorika wrote: > > On 19 Apr 2006, at 18:37, Andrew Straw wrote: > >> >>> I've been embedding Python for ages on the same program and other >>> modules work fine, only numpy fails. >> >> >> Most other modules don't use the SSE units, so wouldn't get hit by such >> a bug. > > > Is there a way of not using those units from numpy, to check if > that's what's going on? I think that numpy only accesses the SSE units through ATLAS or other external library. So, build numpy without ATLAS. But I'm not 100% sure anymore if there aren't any optimizations that directly use SSE if it's available. > Or alternatively, how would I check if my program is messing with the > SSE bits? Hmm, I think that's a bit hairy. 
I'd suggest simply asking the C++ library's mailing list if they alter the error bits on the control registers of the SSE unit. (Out of curiousity, what library is it?) If you want hairy, though, I think you'd have to check from C with the appropriate calls -- I'd start with the source code in that bug report. It looks like they're inlining an assembly statement to query a SSE control register. From faltet at xot.carabos.com Wed Apr 19 14:49:02 2006 From: faltet at xot.carabos.com (faltet at xot.carabos.com) Date: Wed Apr 19 14:49:02 2006 Subject: [Numpy-discussion] Performance problems with strided arrays in NumPy In-Reply-To: <4445A822.60207@ee.byu.edu> References: <20060414213511.GA14355@xot.carabos.com> <4445A822.60207@ee.byu.edu> Message-ID: <20060419214814.GA21524@xot.carabos.com> On Tue, Apr 18, 2006 at 09:01:54PM -0600, Travis Oliphant wrote: > faltet at xot.carabos.com wrote: > The source of this slowness is the use in numarray of special-cases for > certain-sized byte-copies. > > Apparently, it is *much* faster to do > > ((double *)dst)[0] = ((double *)src)[0] > > when you have aligned data than it is to do > > memmove(dst, src, sizeof(double)) Mmm.. very interesting. > My timings for your benchmark with current SVN of NumPy are: > > NumPy: [0.021701812744140625, 0.021739959716796875, 0.021548032760620117] > Numarray: [0.052516937255859375, 0.052685976028442383, 0.052355051040649414] Well, in my machine and using numpy SVN version: numpy: [0.0974161624908447, 0.0621590614318847, 0.0612149238586425] numarray: [0.0658359527587890, 0.0623040199279785, 0.0627131462097167] So, numpy and numarray exhibits same performance now (it's curious why you are actually getting better performance in your platform). However: In [25]: stnac=timeit.Timer('b=a.copy()','import numarray as np; a=np.arange(1000000,dtype="complex128")[::10]') In [26]: stnpc=timeit.Timer('b=a.copy()','import numpy as np; a=np.arange(1000000,dtype="complex128")[::10]') In [27]: stnac.repeat(3,10) Out[27]: [0.11303496360778809, 0.11540508270263672, 0.11556506156921387] In [28]: stnpc.repeat(3,10) Out[28]: [0.21353006362915039, 0.21468400955200195, 0.21390914916992188] So, it seems that you forgot optimizing complex types. Fortunately, the cure is easy; after adding the attached patch I'm getting: In [3]: stnpc.repeat(3,10) Out[3]: [0.10468602180480957, 0.10204982757568359, 0.10242295265197754] so, good performance for numpy in copying strided complex128 is achieved as well. Thanks for looking into this! Francesc ====================================================================== --- numpy/core/src/arrayobject.c (revision 2381) +++ numpy/core/src/arrayobject.c (working copy) @@ -629,6 +629,14 @@ char *tout = dst; char *tin = src; switch(elsize) { + case 16: + for (i=0; i References: <81934aa60604191029h4d8a8d9bl550fa58cc67d3d5e@mail.gmail.com> <44467576.1020708@astraw.com> <4446819D.3030401@astraw.com> Message-ID: <20060420091351.475439ab.simon@arrowtheory.com> On Wed, 19 Apr 2006 11:29:49 -0700 Andrew Straw wrote: > > > > > Is there a way of not using those units from numpy, to check if > > that's what's going on? > > I think that numpy only accesses the SSE units through ATLAS or other > external library. So, build numpy without ATLAS. But I'm not 100% sure > anymore if there aren't any optimizations that directly use SSE if it's > available. We had to disable attlas-sse on our debian system for these exact reasons. Simon. -- Simon Burton, B.Sc. Licensed PO Box 8066 ANU Canberra 2601 Australia Ph. 
61 02 6249 6940 http://arrowtheory.com From tom.denniston at alum.dartmouth.org Wed Apr 19 17:17:18 2006 From: tom.denniston at alum.dartmouth.org (Tom Denniston) Date: Wed Apr 19 17:17:18 2006 Subject: [Numpy-discussion] LAPACK question building numpy Message-ID: Is there a way to pass a command line argument to setup.py for numpy that does the equivalent of a make using the flags: -L/home/tdennist/lib -lmkl_lapack -lmkl_lapack32 -lmkl_ia32 -lmkl -lguide All i can find on the subject is a page on the scipy wiki that says to use the variable LAPACK and set it to a .a file. When I do so I get undefined symbol problems. I this is probably really obvous and documented somewhere but I haven't been able to find it. I don't really know where to look. --Tom From strawman at astraw.com Wed Apr 19 18:59:03 2006 From: strawman at astraw.com (Andrew Straw) Date: Wed Apr 19 18:59:03 2006 Subject: [Numpy-discussion] Floating point exception with numpy and embedded python interpreter In-Reply-To: <20060420091351.475439ab.simon@arrowtheory.com> References: <81934aa60604191029h4d8a8d9bl550fa58cc67d3d5e@mail.gmail.com> <44467576.1020708@astraw.com> <4446819D.3030401@astraw.com> <20060420091351.475439ab.simon@arrowtheory.com> Message-ID: <4446EAB9.7010209@astraw.com> Simon Burton wrote: >On Wed, 19 Apr 2006 11:29:49 -0700 >Andrew Straw wrote: > > > >>>Is there a way of not using those units from numpy, to check if >>>that's what's going on? >>> >>> >>I think that numpy only accesses the SSE units through ATLAS or other >>external library. So, build numpy without ATLAS. But I'm not 100% sure >>anymore if there aren't any optimizations that directly use SSE if it's >>available. >> >> > >We had to disable attlas-sse on our debian system for these exact >reasons. > > If you're using debian sarge and the problem is your glibc, you can fix it: http://www.its.caltech.edu/~astraw/coding.html#id3 From robert.kern at gmail.com Wed Apr 19 19:43:02 2006 From: robert.kern at gmail.com (Robert Kern) Date: Wed Apr 19 19:43:02 2006 Subject: [Numpy-discussion] Re: LAPACK question building numpy In-Reply-To: References: Message-ID: Tom Denniston wrote: > Is there a way to pass a command line argument to setup.py for numpy > that does the equivalent of a make using the flags: > -L/home/tdennist/lib -lmkl_lapack -lmkl_lapack32 -lmkl_ia32 -lmkl -lguide > > All i can find on the subject is a page on the scipy wiki that says to > use the variable LAPACK and set it to a .a file. When I do so I get > undefined symbol problems. > > I this is probably really obvous and documented somewhere but I > haven't been able to find it. I don't really know where to look. Don't worry, it's not really well documented. Create a file called site.cfg in the root source directory. There's an example site.cfg.example there. Unfortunately, it's pretty sparse at the moment. Now, I'm not terribly familiar with the MKL, so I don't know what libraries do what, but here is my guess at the appropriate things you will need in site.cfg: [DEFAULT] library_dirs=/home/tdennist/lib:/some/other/path/perhaps include_dirs=/home/tdennist/include [blas_opt] libraries=whatever_the_mkl_blas_lib_is,mkl_ia32,mkl,guide [lapack_opt] libraries=mkl_lapack,mkl_lapack32,mkl_ia32,mkl,guide There's some more documentation in numpy/distutils/system_info.py . -- Robert Kern robert.kern at gmail.com "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." 
-- Umberto Eco

From faltet at xot.carabos.com Wed Apr 19 19:46:03 2006
From: faltet at xot.carabos.com (faltet at xot.carabos.com)
Date: Wed Apr 19 19:46:03 2006
Subject: [Numpy-discussion] Performance problems with strided arrays in NumPy
In-Reply-To: <20060419214814.GA21524@xot.carabos.com>
References: <20060414213511.GA14355@xot.carabos.com> <4445A822.60207@ee.byu.edu> <20060419214814.GA21524@xot.carabos.com>
Message-ID: <20060420024510.GA21987@xot.carabos.com>

On Wed, Apr 19, 2006 at 09:48:14PM +0000, faltet at xot.carabos.com wrote:
> On Tue, Apr 18, 2006 at 09:01:54PM -0600, Travis Oliphant wrote:
> > Apparently, it is *much* faster to do
> >
> > ((double *)dst)[0] = ((double *)src)[0]
> >
> > when you have aligned data than it is to do
> >
> > memmove(dst, src, sizeof(double))
>
> Mmm.. very interesting.

A follow-up on this. After analyzing the issue somewhat, it seems that the problem with the memcpy() version was not the call itself, but the parameter that was passed as the number of bytes to copy. As this parameter's value was unknown at compile time, the compiler cannot generate optimized code for it and always has to fetch the value from memory (or cache).

In the version of the code that you optimized, you avoided this because you are telling the compiler (i.e. specifying at compile time) the exact extent of the data copy, allowing it to generate optimum code for the copy operation. However, if you do a similar thing but using the call (using doubles here):

memcpy(tout, tin, 8);

instead of:

((Float64 *)tout)[0] = ((Float64 *)tin)[0];

and repeat the operation for the other types, then you can achieve performance similar to the pointer version.

On the other hand, I see that you have disabled the optimization for unaligned data through the use of a check. Is there any reason for doing that? If I remove this check, I can achieve performance similar to numarray's (a bit better, in fact).

I'm attaching a small benchmark script that compares the performance of copying a 1D vector of 1 million elements in contiguous, strided (2 and 10), and strided (2 and 10 again) & unaligned flavors. The results for my machine (p4 at 2 GHz) are:

For the original numpy code (i.e.
For the pointer optimised code but releasing the unaligned data check: time for numpy contiguous --> 0.236 time for numarray contiguous --> 0.231 time for numpy strided (2) --> 0.213 time for numarray strided (2) --> 0.262 time for numpy strided (10) --> 0.297 time for numarray strided (10) --> 0.261 time for numpy strided (2) & unaligned--> 0.263 time for numarray strided (2) & unaligned--> 0.403 time for numpy strided (10) & unaligned--> 0.452 time for numarray strided (10) & unaligned--> 0.432 Ei! numpy is very similar to numarray in all cases, except for the strided with 2 elements and unaligned case, where numpy performs a 50% better. Finally, and just for showing the effect of providing memcpy with size information in compilation time, the numpy code using memcpy() with this optimization on (and disabling the alignment check, of course!): time for numpy contiguous --> 0.234 time for numarray contiguous --> 0.233 time for numpy strided (2) --> 0.223 time for numarray strided (2) --> 0.262 time for numpy strided (10) --> 0.285 time for numarray strided (10) --> 0.262 time for numpy strided (2) & unaligned--> 0.261 time for numarray strided (2) & unaligned--> 0.401 time for numpy strided (10) & unaligned--> 0.42 time for numarray strided (10) & unaligned--> 0.436 you can see that the figures are very similar to the previous case. So Travis, you may want to use the pointer indirection approach or the memcpy() one, whichever you prefer. Well, I just wanted to point this out. Time for sleep! Francesc -------------- next part -------------- A non-text attachment was scrubbed... Name: bench-copy.py Type: text/x-python Size: 2054 bytes Desc: not available URL: From tim.hochberg at cox.net Wed Apr 19 19:57:06 2006 From: tim.hochberg at cox.net (Tim Hochberg) Date: Wed Apr 19 19:57:06 2006 Subject: [Numpy-discussion] Summer of Code ideas Message-ID: <4446F8D8.40909@cox.net> Discussing ideas for summer of code projects seems to be all the rage right now on various other Python lists, so I though I'd throw out a few that I've had. There are several different things that could be done with numexpr including: 1. Adding broadcasting. 2. Coercing arrays a chunk at a time instead of all at once when coercion is necessary. 3. Fancier syntax. I think that some variant of the following could be made to work: with deferred_evaluation: # Converts everything in local namespace to special objects # all of these math operations are deferred a = 5 + b*32 c = a + 73 # Now all objects are restored and deferred experesions are evaluated. This might be cool or it might be useless, but it sounds fun to try. I haven't talked to David Cooke about any of these and since numexpr is really his project he should be consulted before anyone tries these. There's also some stuff to be done on the basearray front. I expect I'll have the actual basearray object together in the next couple of weeks depending on my level of busyness, but there'll be a lot of other stuff to do besides just that. My general plan it to build a toolkit around basearray that can be used to build other array packages. These packages might be lighter weight than numpy or they might be specialized in some way that's not really compatible with numpy and ndarray. There's also room for potential for experimentation with protocols / generic functions. If anyones interested I suggest you read the thread (currently dormant) on python-3000.devel on this topic. 
There are lots of possible applications for this in numpy including using them to implement or replace: * asarray * __array_priority__ (by making the ufuncs and thus __add__, etc overloaded functions). * __array__, __array_wrap__, etc. * all the various functions that are giving us trouble with MA. * probably a bunch of other stuff. The basic basearray toolkit I mentioned above would be a good place to experiment with stuff like this, once it exists, since in theory it will be simpler than the full numpy codebase and you don't have to worry so much about backwards compatibility. Anyway, that's a bunch of random ideas that I at least find interesting. Regards, -tim From oliphant at ee.byu.edu Wed Apr 19 20:44:02 2006 From: oliphant at ee.byu.edu (Travis Oliphant) Date: Wed Apr 19 20:44:02 2006 Subject: [Numpy-discussion] Performance problems with strided arrays in NumPy In-Reply-To: <20060420024510.GA21987@xot.carabos.com> References: <20060414213511.GA14355@xot.carabos.com> <4445A822.60207@ee.byu.edu> <20060419214814.GA21524@xot.carabos.com> <20060420024510.GA21987@xot.carabos.com> Message-ID: <44470255.302@ee.byu.edu> faltet at xot.carabos.com wrote: >On Wed, Apr 19, 2006 at 09:48:14PM +0000, faltet at xot.carabos.com wrote: > > >>On Tue, Apr 18, 2006 at 09:01:54PM -0600, Travis Oliphant wrote: >> >> >>>Apparently, it is *much* faster to do >>> >>>((double *)dst)[0] = ((double *)src)[0] >>> >>>when you have aligned data than it is to do >>> >>>memmove(dst, src, sizeof(double)) >>> >>> >>Mmm.. very interesting. >> >> > >A follow-up on this. After analyzing somewhat the issue, it seems that >the problem with the memcpy() version was not the call itself, but the >parameter that was passed as the number of bytes to copy. As this was a >parameter whose value was unknown in compile time, the compiler cannot >generate optimized code for it and always has to fetch its value from >memory (or cache). > > >In the version of the code that you optimized, you managed to do this >because you are telling to the compiler (i.e. specifying at compile >time) the exact extend of the data copy, so allowing it to generate >optimum code for the copy operation. However, if you do a similar >thing but using the call (using doubles here): > >memcpy(tout, tin, 8); > >instead of: > >((Float64 *)tout)[0] = ((Float64 *)tin)[0]; > >and repeat the operation for the other types, then you can achieve >similar performance than the pointer version. > > This is good to know. It certainly makes sense. I'll test it on my system when I get back. >On another hand, I see that you have disabled the optimization for >unaligned data through the use of a check. Is there any reason for >doing that? If I remove this check, I can achieve similar performance >than for numarray (a bit better, in fact). > > The only reason was to avoid pointer dereferencing on misaligned data (dereferencing a misaligned pointer causes bus errors on Solaris). But, if we can achieve it with a memmove, then there is no reason to limit the code. >I'm attaching a small benchmark script that compares the performance >of copying a 1D vector of 1 million of elements in contiguous, strided >(2 and 10), and strided (2 and 10 again) & unaligned flavors. The >results for my machine (p4 at 2 GHz) are: > >For the original numpy code (i.e. 
before Travis optimization): > >time for numpy contiguous --> 0.234 >time for numarray contiguous --> 0.229 >time for numpy strided (2) --> 1.605 >time for numarray strided (2) --> 0.263 >time for numpy strided (10) --> 1.72 >time for numarray strided (10) --> 0.264 >time for numpy strided (2) & unaligned--> 1.736 >time for numarray strided (2) & unaligned--> 0.402 >time for numpy strided (10) & unaligned--> 1.872 >time for numarray strided (10) & unaligned--> 0.435 > >where you can see that, for 1e6 elements the slowdown of original >numpy is almost 7x (!). Remember that in the previous benchmarks sent >here the slowdown was 3x, but we were copying 10 times less data. > >For the pointer optimised code (i.e. the current SVN version): > >time for numpy contiguous --> 0.238 >time for numarray contiguous --> 0.232 >time for numpy strided (2) --> 0.214 >time for numarray strided (2) --> 0.264 >time for numpy strided (10) --> 0.299 >time for numarray strided (10) --> 0.262 >time for numpy strided (2) & unaligned--> 1.736 >time for numarray strided (2) & unaligned--> 0.401 >time for numpy strided (10) & unaligned--> 1.874 >time for numarray strided (10) & unaligned--> 0.433 > >here you can see that your figures are very similar to numarray except >for unaligned data (4x slower). > >For the pointer optimised code but releasing the unaligned data check: > >time for numpy contiguous --> 0.236 >time for numarray contiguous --> 0.231 >time for numpy strided (2) --> 0.213 >time for numarray strided (2) --> 0.262 >time for numpy strided (10) --> 0.297 >time for numarray strided (10) --> 0.261 >time for numpy strided (2) & unaligned--> 0.263 >time for numarray strided (2) & unaligned--> 0.403 >time for numpy strided (10) & unaligned--> 0.452 >time for numarray strided (10) & unaligned--> 0.432 > >Ei! numpy is very similar to numarray in all cases, except for the >strided with 2 elements and unaligned case, where numpy performs a 50% >better. > >Finally, and just for showing the effect of providing memcpy with size >information in compilation time, the numpy code using memcpy() with >this optimization on (and disabling the alignment check, of course!): > >time for numpy contiguous --> 0.234 >time for numarray contiguous --> 0.233 >time for numpy strided (2) --> 0.223 >time for numarray strided (2) --> 0.262 >time for numpy strided (10) --> 0.285 >time for numarray strided (10) --> 0.262 >time for numpy strided (2) & unaligned--> 0.261 >time for numarray strided (2) & unaligned--> 0.401 >time for numpy strided (10) & unaligned--> 0.42 >time for numarray strided (10) & unaligned--> 0.436 > >you can see that the figures are very similar to the previous case. So >Travis, you may want to use the pointer indirection approach or the >memcpy() one, whichever you prefer. > >Well, I just wanted to point this out. Time for sleep! > > > Very, very useful information. 1000 Thank you's for talking the time to investigate and assemble it. Do you think the memmove would work similarly? -Travis From tom.denniston at alum.dartmouth.org Thu Apr 20 08:07:04 2006 From: tom.denniston at alum.dartmouth.org (Tom Denniston) Date: Thu Apr 20 08:07:04 2006 Subject: [Numpy-discussion] Re: LAPACK question building numpy In-Reply-To: References: Message-ID: Thanks for your help. I will try this. 
--Tom On 4/19/06, Robert Kern wrote: > Tom Denniston wrote: > > Is there a way to pass a command line argument to setup.py for numpy > > that does the equivalent of a make using the flags: > > -L/home/tdennist/lib -lmkl_lapack -lmkl_lapack32 -lmkl_ia32 -lmkl -lguide > > > > All i can find on the subject is a page on the scipy wiki that says to > > use the variable LAPACK and set it to a .a file. When I do so I get > > undefined symbol problems. > > > > I this is probably really obvous and documented somewhere but I > > haven't been able to find it. I don't really know where to look. > > Don't worry, it's not really well documented. Create a file called site.cfg in > the root source directory. There's an example site.cfg.example there. > Unfortunately, it's pretty sparse at the moment. Now, I'm not terribly familiar > with the MKL, so I don't know what libraries do what, but here is my guess at > the appropriate things you will need in site.cfg: > > [DEFAULT] > library_dirs=/home/tdennist/lib:/some/other/path/perhaps > include_dirs=/home/tdennist/include > > [blas_opt] > libraries=whatever_the_mkl_blas_lib_is,mkl_ia32,mkl,guide > > [lapack_opt] > libraries=mkl_lapack,mkl_lapack32,mkl_ia32,mkl,guide > > There's some more documentation in numpy/distutils/system_info.py . > > -- > Robert Kern > robert.kern at gmail.com > > "I have come to believe that the whole world is an enigma, a harmless enigma > that is made terrible by our own mad attempt to interpret it as though it had > an underlying truth." > -- Umberto Eco > > > > ------------------------------------------------------- > Using Tomcat but need to do more? Need to support web services, security? > Get stuff done quickly with pre-integrated technology to make your job easier > Download IBM WebSphere Application Server v.1.0.1 based on Apache Geronimo > http://sel.as-us.falkag.net/sel?cmd=lnk&kid=120709&bid=263057&dat=121642 > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/numpy-discussion > From faltet at xot.carabos.com Thu Apr 20 09:42:04 2006 From: faltet at xot.carabos.com (faltet at xot.carabos.com) Date: Thu Apr 20 09:42:04 2006 Subject: [Numpy-discussion] Performance problems with strided arrays in NumPy In-Reply-To: <44470255.302@ee.byu.edu> References: <20060414213511.GA14355@xot.carabos.com> <4445A822.60207@ee.byu.edu> <20060419214814.GA21524@xot.carabos.com> <20060420024510.GA21987@xot.carabos.com> <44470255.302@ee.byu.edu> Message-ID: <20060420164132.GA23763@xot.carabos.com> On Wed, Apr 19, 2006 at 09:39:01PM -0600, Travis Oliphant wrote: >>On another hand, I see that you have disabled the optimization for >>unaligned data through the use of a check. Is there any reason for >>doing that? If I remove this check, I can achieve similar performance >>than for numarray (a bit better, in fact). > >The only reason was to avoid pointer dereferencing on misaligned data >(dereferencing a misaligned pointer causes bus errors on Solaris). >But, if we can achieve it with a memmove, then there is no reason to >limit the code. I see. Well, I've tried out with memmove instead than memcpy, and I can reproduce the same slowdown than it was seen previously to using your pointer addressing optimisation. I'm afraid that Shasha was right in that memmove check for not overwriting destination is the responsible for this. 
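For readers wondering what the overlap issue looks like in practice, a small hedged illustration in plain numpy; the printed result depends on whether the underlying copy is overlap-safe:

    import numpy
    a = numpy.arange(6)
    # The source a[:-1] and the destination a[1:] alias the same buffer.
    # An overlap-safe copy (memmove semantics) yields [0 0 1 2 3 4];
    # a naive front-to-back element copy (memcpy-style) would keep
    # propagating the first element and yield [0 0 0 0 0 0].
    a[1:] = a[:-1]
    print a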
Having said that, and although I must admit that I don't know in deep the different situations under which the source of a copy may overlap the destination, my guess is that for typical element sizes (i.e. [1], 2, 4, 8 and 16) for which the optimization has been done, there is not any harm on using memcpy instead of memmove (admittedly, you may come with a counter-example of this, but I do hope you don't). In any case, the use of memcpy is completely equivalent to the current optimization using pointers except that, hopefully, pointer addressing is not made on unaligned data. So, perhaps using the memcpy approach in Solaris (under Sparc I guess) may avoid the bus errors. It would be nice if anyone with access to such a platform can confirm this point. I'm attaching a patch for current SVN numpy that uses the memcpy approach. Feel free to try it against the benchmarks (also attached). One last word, I've added a case for typesize 1 in addition of the existing ones as this effectively improves the speed for 1-byte types. Below are the speeds without the 1-byte case optimisation: time for numpy contiguous --> 0.03 time for numarray contiguous --> 0.062 time for numpy strided (2) --> 0.078 time for numarray strided (2) --> 0.064 time for numpy strided (10) --> 0.081 time for numarray strided (10) --> 0.07 I haven't added a case for the unaligned case because this makes non-sense for 1 byte sized types. and here with the 1-byte case optimisation added: time for numpy contiguous --> 0.03 time for numarray contiguous --> 0.062 time for numpy strided (2) --> 0.054 time for numarray strided (2) --> 0.065 time for numpy strided (10) --> 0.061 time for numarray strided (10) --> 0.07 you can notice an speed-up between a 30% and 45% over the previous case. Cheers, -------------- next part -------------- --- numpy/core/src/arrayobject.c (revision 2381) +++ numpy/core/src/arrayobject.c (working copy) @@ -628,28 +628,44 @@ intp i, j; char *tout = dst; char *tin = src; + /* For typical datasizes, the memcpy call is much faster than memmove + and perfectely safe */ switch(elsize) { + case 16: + for (i=0; ind) == src->nd && (nd > 0) && + if (!swap && (nd = dest->nd) == src->nd && (nd > 0) && PyArray_CompareLists(dest->dimensions, src->dimensions, nd)) { int maxaxis=0, maxdim=dest->dimensions[0]; int i; -------------- next part -------------- A non-text attachment was scrubbed... Name: bench-copy.py Type: text/x-python Size: 2053 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: bench-copy1.py Type: text/x-python Size: 1168 bytes Desc: not available URL: From rng7 at cornell.edu Thu Apr 20 13:49:13 2006 From: rng7 at cornell.edu (Ryan Gutenkunst) Date: Thu Apr 20 13:49:13 2006 Subject: [Numpy-discussion] Bypassing a[2].item()? Message-ID: <4447F397.7010006@cornell.edu> Hi all, I'm porting some code from old scipy to new scipy, and I've run into a rather large performance problem. The heart of the code is integrating a system of nonlinear differential equations using odeint. The function that dominates the time to run calculates the right hand side, given a current state x. (len(x) ~ 50.) Abstracted, the function looks like: def rhs(x) output = scipy.zeros(10, scipy.Float) a = x[0] b = x[1] ... output[0] = a/b + c*sqrt(d)... output[1] = b-a + 2*b... ... return output (I copy the elements of the current state to local variables to avoid the cost of repeatedly calling x.__getitem__, and to make the resulting equations easier to read.) 
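(A hedged aside on that pattern, not from the original code: for a float array, tolist() converts to plain Python floats in a single call, so the per-element __getitem__ cost can be hoisted; whether that helps depends on exactly the issue discussed next.)

    vals = x.tolist()          # one call; elements are plain Python floats
    a, b = vals[0], vals[1]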
When using numpy, a and b are now array scalars and the arithmetic is much slower, resulting in about a factor of 10 increase in runtimes from those using Numeric. I've tried doing: a = x[0].item(), which allows the arimetic be done on pure scalars. This is a little faster, but still results in a factor of 3 increase in runtime from old scipy. I imagine the slowdown comes from having to call __getitem__() followed by item() So questions: 1) I haven't followed the details of the array scalar discussions. Is it anticipated that array scalar arithmetic will eventually be as fast as arithmetic in native python types? 2) If not, is it possible to get a "pure" scalar directly from an array in one function call? Thanks for any help, Ryan -- Ryan Gutenkunst | Cornell LASSP | "It is not the mountain | we conquer but ourselves." Clark 535 / (607)227-7914 | -- Sir Edmund Hillary AIM: JepettoRNG | http://www.physics.cornell.edu/~rgutenkunst/ From robert.kern at gmail.com Thu Apr 20 14:20:02 2006 From: robert.kern at gmail.com (Robert Kern) Date: Thu Apr 20 14:20:02 2006 Subject: [Numpy-discussion] Re: Bypassing a[2].item()? In-Reply-To: <4447F397.7010006@cornell.edu> References: <4447F397.7010006@cornell.edu> Message-ID: Ryan Gutenkunst wrote: > So questions: > 1) I haven't followed the details of the array scalar discussions. Is it > anticipated that array scalar arithmetic will eventually be as fast as > arithmetic in native python types? More or less, if I'm not mistaken. This ticket is aimed at that: http://projects.scipy.org/scipy/numpy/ticket/55 > 2) If not, is it possible to get a "pure" scalar directly from an array > in one function call? float(x[0]) seems to be faster on my PowerBook. -- Robert Kern robert.kern at gmail.com "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From rng7 at cornell.edu Thu Apr 20 15:21:11 2006 From: rng7 at cornell.edu (Ryan Gutenkunst) Date: Thu Apr 20 15:21:11 2006 Subject: [Numpy-discussion] Re: Bypassing a[2].item()? In-Reply-To: References: <4447F397.7010006@cornell.edu> Message-ID: <9b9f0633c5a242a6ab8a199708c8dd94@cornell.edu> On Apr 20, 2006, at 5:18 PM, Robert Kern wrote: > Ryan Gutenkunst wrote: > >> So questions: >> 1) I haven't followed the details of the array scalar discussions. Is >> it >> anticipated that array scalar arithmetic will eventually be as fast as >> arithmetic in native python types? > > More or less, if I'm not mistaken. This ticket is aimed at that: > > http://projects.scipy.org/scipy/numpy/ticket/55 Good to hear. >> 2) If not, is it possible to get a "pure" scalar directly from an >> array >> in one function call? > > float(x[0]) seems to be faster on my PowerBook. It's faster for me, too, but float(x[0]) is still much slower than using Numeric where x[0] suffices. I guess I'll just have to warn my users away from the new scipy until numpy 0.9.8 comes out and scalar math is sped up. Cheers, Ryan -- Ryan Gutenkunst | Cornell Dept. of Physics | "It is not the mountain | we conquer but ourselves." Clark 535 / (607)255-6068 | -- Sir Edmund Hillary AIM: JepettoRNG | http://www.physics.cornell.edu/~rgutenkunst/ From robert.kern at gmail.com Thu Apr 20 16:22:09 2006 From: robert.kern at gmail.com (Robert Kern) Date: Thu Apr 20 16:22:09 2006 Subject: [Numpy-discussion] Re: Bypassing a[2].item()? 
In-Reply-To: <9b9f0633c5a242a6ab8a199708c8dd94@cornell.edu> References: <4447F397.7010006@cornell.edu> <9b9f0633c5a242a6ab8a199708c8dd94@cornell.edu> Message-ID: Ryan Gutenkunst wrote: > On Apr 20, 2006, at 5:18 PM, Robert Kern wrote: > >> Ryan Gutenkunst wrote: >>> 2) If not, is it possible to get a "pure" scalar directly from an array >>> in one function call? >> >> float(x[0]) seems to be faster on my PowerBook. > > It's faster for me, too, but float(x[0]) is still much slower than using > Numeric where x[0] suffices. I guess I'll just have to warn my users > away from the new scipy until numpy 0.9.8 comes out and scalar math is > sped up. For that matter, a plain "x[0]" seems to be about 3x faster with Numeric than numpy. -- Robert Kern robert.kern at gmail.com "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From oliphant at ee.byu.edu Thu Apr 20 20:16:02 2006 From: oliphant at ee.byu.edu (Travis Oliphant) Date: Thu Apr 20 20:16:02 2006 Subject: [Numpy-discussion] Re: Bypassing a[2].item()? In-Reply-To: References: <4447F397.7010006@cornell.edu> <9b9f0633c5a242a6ab8a199708c8dd94@cornell.edu> Message-ID: <44484E44.2050300@ee.byu.edu> Robert Kern wrote: >Ryan Gutenkunst wrote: > > >>On Apr 20, 2006, at 5:18 PM, Robert Kern wrote: >> >> >> >>>Ryan Gutenkunst wrote: >>> >>> > > > >>>>2) If not, is it possible to get a "pure" scalar directly from an array >>>>in one function call? >>>> >>>> >>>float(x[0]) seems to be faster on my PowerBook. >>> >>> >>It's faster for me, too, but float(x[0]) is still much slower than using >>Numeric where x[0] suffices. I guess I'll just have to warn my users >>away from the new scipy until numpy 0.9.8 comes out and scalar math is >>sped up. >> >> > >For that matter, a plain "x[0]" seems to be about 3x faster with Numeric than numpy. > > > We are already special-casing the integer select code but could special-case the getitem code so that if nd==1 a faster construction is used. I think right now a 0-dim array is being created only to get destroyed later on return. Please add a ticket as this extremely common operation should be made as fast as possible. This is a little tricky because array_big_item is called in a few places and is expected to return an array. If it returns a scalar in those places segfaults can occur. Either checks need to be made in each of those cases or the special-casing needs to be in array_big_item_nice. I'm not sure which I prefer.... -Travis From simon at arrowtheory.com Thu Apr 20 23:24:59 2006 From: simon at arrowtheory.com (Simon Burton) Date: Thu Apr 20 23:24:59 2006 Subject: [Numpy-discussion] announce: pyjit, a little jit for creating numpy ufuncs Message-ID: <20060421162336.42285837.simon@arrowtheory.com> Hi, Inspired by numexpr, pypy and llvm, i've built a simple JIT for creating numpy "ufuncs" (they are not yet real ufuncs). It uses llvm[1] as the backend machine code generator. The main things it can do are: *) parse simple python code (function def's) *) generate SSA assembly code for llvm *) build ufunc code for applying to numpy array's When I say simple I mean it: def calc(a,b): c = (a+b)/2.0 return c No control flow or type inference has been implemented. As with numexpr, significant speedups are possible. I'm putting this announce here to see what the other numpy'ers think. $ svn co http://rubis.rsise.anu.edu.au/local/repos/elefant/pyjit bye, Simon. 
[1] http://llvm.org/

--
Simon Burton, B.Sc.
Licensed PO Box 8066
ANU Canberra 2601
Australia
Ph. 61 02 6249 6940
http://arrowtheory.com

From cookedm at physics.mcmaster.ca Fri Apr 21 09:27:00 2006
From: cookedm at physics.mcmaster.ca (David M. Cooke)
Date: Fri Apr 21 09:27:00 2006
Subject: [Numpy-discussion] Source release of 0.9.6 on sourceforge is wrong
Message-ID:

Travis,

Looks like you uploaded the bdist .tar.gz of NumPy 0.9.6 to sourceforge, instead of the sdist. The one there isn't the source, it's a binary distribution of a 32-bit Linux compile.

It's been over a month, with 2684 downloads, and I can't find a mention that anybody's noticed this before... Have we silently lost people who think we're on crack, or are there 2684 people who haven't looked at what they got?

[On another note, the download URL on PyPi won't work with setuptools; I've fixed the setup.py in svn to use the correct one, but if you could fix it on PyPi and set it to http://sourceforge.net/project/showfiles.php?group_id=1369&package_id=175103 then people can use easy_install to install numpy.]

-- |>|\/|<
/--------------------------------------------------------------------------\
|David M. Cooke http://arbutus.physics.mcmaster.ca/dmc/
|cookedm at physics.mcmaster.ca

From cookedm at physics.mcmaster.ca Fri Apr 21 09:30:01 2006
From: cookedm at physics.mcmaster.ca (David M. Cooke)
Date: Fri Apr 21 09:30:01 2006
Subject: [Numpy-discussion] Source release of 0.9.6 on sourceforge is wrong
In-Reply-To: (David M. Cooke's message of "Fri, 21 Apr 2006 12:25:52 -0400")
References: Message-ID:

cookedm at physics.mcmaster.ca (David M. Cooke) writes:

> Travis,
>
> Looks like you uploaded the bdist .tar.gz of NumPy 0.9.6 to
> sourceforge, instead of the sdist. The one there isn't the source,
> it's a binary distribution of a 32-bit Linux compile.

Gah! My bad! When I convinced easy_install to grab the source, it grabbed numpy-0.9.6-py2.4-linux-i686.tar.gz instead, which of course is a binary package.

*why* it grabbed that one is another story (that's not my platform! I'm on py2.4-linux-x86_64).

-- |>|\/|<
/--------------------------------------------------------------------------\
|David M. Cooke http://arbutus.physics.mcmaster.ca/dmc/
|cookedm at physics.mcmaster.ca

From ndarray at mac.com Fri Apr 21 09:35:02 2006
From: ndarray at mac.com (Sasha)
Date: Fri Apr 21 09:35:02 2006
Subject: [Numpy-discussion] Source release of 0.9.6 on sourceforge is wrong
In-Reply-To: References: Message-ID:

I've downloaded numpy-0.9.6.tar.gz from SF about a month ago and it was fine:

> tar tzf ~/Archives/numpy-0.9.6.tar.gz
numpy-0.9.6/
numpy-0.9.6/numpy/
numpy-0.9.6/numpy/core/
numpy-0.9.6/numpy/core/blasdot/
numpy-0.9.6/numpy/core/blasdot/_dotblas.c
numpy-0.9.6/numpy/core/blasdot/cblas.h
...

On 4/21/06, David M. Cooke wrote:
> Travis,
>
> Looks like you uploaded the bdist .tar.gz of NumPy 0.9.6 to
> sourceforge, instead of the sdist. The one there isn't the source,
> it's a binary distribution of a 32-bit Linux compile.
>
> It's been over a month, with 2684 downloads, and I can't find a
> mention that anybody's noticed this before... Have we silently lost
> people who think we're on crack, or are there 2684 people who haven't
> looked at what they got?
>
> [On another note, the download URL on PyPI won't work with
> setuptools; I've fixed the setup.py in svn to use the correct one, but
> if you could fix it on PyPI and set it to
> http://sourceforge.net/project/showfiles.php?group_id=1369&package_id=175103
> then people can use easy_install to install numpy.]
>
> --
> |>|\/|<
> /--------------------------------------------------------------------------\
> |David M. Cooke http://arbutus.physics.mcmaster.ca/dmc/
> |cookedm at physics.mcmaster.ca

From bsouthey at gmail.com Fri Apr 21 10:35:02 2006
From: bsouthey at gmail.com (Bruce Southey)
Date: Fri Apr 21 10:35:02 2006
Subject: [Numpy-discussion] Source release of 0.9.6 on sourceforge is wrong
In-Reply-To: References: Message-ID:

Hi,
I concur, as I downloaded and installed it yesterday (April 20) afternoon (from my ls -l):

2006-04-20 13:38 numpy-0.9.6.tar.gz

I had no problems installing that version, as the import numpy appeared to work.

Regards
Bruce

On 4/21/06, Sasha wrote:
> I downloaded numpy-0.9.6.tar.gz from SF about a month ago and it was fine:
>
> > tar tzf ~/Archives/numpy-0.9.6.tar.gz
> numpy-0.9.6/
> numpy-0.9.6/numpy/
> numpy-0.9.6/numpy/core/
> numpy-0.9.6/numpy/core/blasdot/
> numpy-0.9.6/numpy/core/blasdot/_dotblas.c
> numpy-0.9.6/numpy/core/blasdot/cblas.h
> ...
>
> On 4/21/06, David M. Cooke wrote:
> > Travis,
> >
> > Looks like you uploaded the bdist .tar.gz of NumPy 0.9.6 to
> > sourceforge, instead of the sdist. The one there isn't the source,
> > it's a binary distribution of a 32-bit Linux compile.
> >
> > It's been over a month, with 2684 downloads, and I can't find a
> > mention that anybody's noticed this before... Have we silently lost
> > people who think we're on crack, or are there 2684 people who haven't
> > looked at what they got?
> >
> > [On another note, the download URL on PyPI won't work with
> > setuptools; I've fixed the setup.py in svn to use the correct one, but
> > if you could fix it on PyPI and set it to
> > http://sourceforge.net/project/showfiles.php?group_id=1369&package_id=175103
> > then people can use easy_install to install numpy.]
> >
> > --
> > |>|\/|<
> > /--------------------------------------------------------------------------\
> > |David M. Cooke http://arbutus.physics.mcmaster.ca/dmc/
> > |cookedm at physics.mcmaster.ca

From robert.kern at gmail.com Fri Apr 21 11:28:11 2006
From: robert.kern at gmail.com (Robert Kern)
Date: Fri Apr 21 11:28:11 2006
Subject: [Numpy-discussion] Re: Source release of 0.9.6 on sourceforge is wrong
In-Reply-To: References: Message-ID:

David M. Cooke wrote:
> cookedm at physics.mcmaster.ca (David M. Cooke) writes:
>
>>Travis,
>>
>>Looks like you uploaded the bdist .tar.gz of NumPy 0.9.6 to
>>sourceforge, instead of the sdist. The one there isn't the source,
>>it's a binary distribution of a 32-bit Linux compile.
>
> Gah! My bad! When I convinced easy_install to grab the source, it
> grabbed numpy-0.9.6-py2.4-linux-i686.tar.gz instead, which of course is a
> binary package.
>
> *why* it grabbed that one is another story (that's not my platform!
> I'm on py2.4-linux-x86_64).

Phillip Eby tells me that the bdist_dumb packages there confuse some versions of setuptools. He fixed it this morning.

-- Robert Kern robert.kern at gmail.com "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco

From faltet at xot.carabos.com Fri Apr 21 13:56:04 2006
From: faltet at xot.carabos.com (faltet at xot.carabos.com)
Date: Fri Apr 21 13:56:04 2006
Subject: [Numpy-discussion] numexpr enhancements
Message-ID: <20060421205530.GA25020@xot.carabos.com>

Hi,

After looking at the numpy performance issues on strided and unaligned data, I decided to have a try at the numexpr package and finally implemented better support for them. As a result, numexpr can now reach a 2x performance improvement for simple expressions, like 'a>2.'.

Along the way, I've added support for boolean expressions (&, | and ~, as in the where() function), a new boolean data type (important to get better performance on boolean expressions) and support for numarray (maintaining the compatibility with numpy, of course).

I've called the new package numexpr 0.2 so as not to confuse it with the existing 0.1. Well, let's hope that numexpr can continue making its way towards integration in numpy.

You can fetch this new package at:

http://www.carabos.com/downloads/divers/numexpr-0.2.tar.gz

Finally, let me say that numexpr is a wonderful toy to get your hands dirty ;-) Many thanks to David (and Tim) for this!

Cheers!

Francesc

From hetland at tamu.edu Fri Apr 21 15:02:12 2006
From: hetland at tamu.edu (Robert Hetland)
Date: Fri Apr 21 15:02:12 2006
Subject: [Numpy-discussion] 'append' array method request.
Message-ID:

I find myself writing things like

x = []; y = []; t = []
for line in open(filename).readlines():
    xstr, ystr, tstr = line.split()
    x.append(float(xstr))
    y.append(float(ystr))
    t.append(dateutil.parser.parse(tstr)) # or something similar
x = asarray(x)
y = asarray(y)
t = asarray(t)

I think it would be nice to be able to create empty arrays, and append the values onto the end as I loop through the file without creating the intermediate list. Is this reasonable? Is there a way to do this with existing methods or functions that I am missing? Is there a better way altogether?

-Rob.

----- Rob Hetland, Assistant Professor Dept of Oceanography, Texas A&M University p: 979-458-0096, f: 979-845-6331 e: hetland at tamu.edu, w: http://pong.tamu.edu

From robert.kern at gmail.com Fri Apr 21 15:13:07 2006
From: robert.kern at gmail.com (Robert Kern)
Date: Fri Apr 21 15:13:07 2006
Subject: [Numpy-discussion] Re: 'append' array method request.
In-Reply-To: References: Message-ID:

Robert Hetland wrote:
>
> I find myself writing things like
>
> x = []; y = []; t = []
> for line in open(filename).readlines():
>     xstr, ystr, tstr = line.split()
>     x.append(float(xstr))
>     y.append(float(ystr))
>     t.append(dateutil.parser.parse(tstr)) # or something similar
> x = asarray(x)
> y = asarray(y)
> t = asarray(t)
>
> I think it would be nice to be able to create empty arrays, and append
> the values onto the end as I loop through the file without creating the
> intermediate list. Is this reasonable?

Not in the core array object, no. We can't make the underlying pointer point to something else (because you've just reallocated the whole memory block to add an item to the array) without invalidating all of the views on that array. This is also the reason that numpy arrays can't use the standard library's array module as its storage.

That said:

> Is there a way to do this with
> existing methods or functions that I am missing? Is there a better way
> altogether?

We've done performance tests before. The fastest way that I've found is to use the stdlib array module to accumulate values (it uses the same preallocation strategy that Python lists use, and you can't create views from them, so you are always safe) and then create the numpy array using fromstring on that object (stdlib arrays obey the buffer protocol, so they will be treated like strings of binary data). I posted timings one or two or three years ago on one of the scipy lists. However, lists are fine if you don't need blazing speed/low memory usage.

-- Robert Kern robert.kern at gmail.com "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco

From ndarray at mac.com Fri Apr 21 15:20:01 2006
From: ndarray at mac.com (Sasha)
Date: Fri Apr 21 15:20:01 2006
Subject: [Numpy-discussion] 'append' array method request.
In-Reply-To: References: Message-ID:

On 4/21/06, Robert Hetland wrote:
> [...]
> I think it would be nice to be able to create empty arrays, and
> append the values onto the end as I loop through the file without
> creating the intermediate list. Is this reasonable? Is there a way
> to do this with existing methods or functions that I am missing? Is
> there a better way altogether?
>
Numpy arrays cannot grow in-place because there is no way for an array to tell if its data is shared with other arrays.
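(For concreteness, a rough sketch of the accumulate-then-convert pattern Robert describes above; the filename, the two-column layout, and the float dtype here are invented for illustration, and tostring() is used to hand the raw bytes to fromstring:

import array
import numpy

xacc = array.array('d')   # stdlib array: list-like appends, contiguous C buffer
yacc = array.array('d')
for line in open('data.txt'):
    xstr, ystr = line.split()
    xacc.append(float(xstr))
    yacc.append(float(ystr))
# stdlib arrays obey the buffer protocol, so their raw bytes can be
# copied straight into a numpy array in one shot
x = numpy.fromstring(xacc.tostring(), dtype=numpy.float64)
y = numpy.fromstring(yacc.tostring(), dtype=numpy.float64)

The dateutil column is omitted here, since object columns need a list anyway.)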
You can use Python's standard library arrays instead of lists:

>>> from numpy import *
>>> import array as a
>>> x = a.array('i',[])
>>> x.append(1)
>>> x.append(2)
>>> x.append(3)
>>> ndarray(len(x), dtype=int, buffer=x)
array([1, 2, 3])

Note that the data is not copied:

>>> ndarray(len(x), dtype=int, buffer=x)[1] = 20
>>> x
array('i', [1, 20, 3])

From charlesr.harris at gmail.com Fri Apr 21 18:50:02 2006
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Fri Apr 21 18:50:02 2006
Subject: [Numpy-discussion] 'append' array method request.
In-Reply-To: References: Message-ID:

Hi,

On 4/21/06, Robert Hetland wrote:
>
> I find myself writing things like
>
> x = []; y = []; t = []
> for line in open(filename).readlines():
>     xstr, ystr, tstr = line.split()
>     x.append(float(xstr))
>     y.append(float(ystr))
>     t.append(dateutil.parser.parse(tstr)) # or something similar
> x = asarray(x)
> y = asarray(y)
> t = asarray(t)

I think you can read the ascii file directly into an array with numeric conversions (fromfile) then just reshape it to have x,y,z columns. For example:

$[charris at E011704 ~]$ cat input.txt
1 2 3
4 5 6
7 8 9

Then after importing numpy into ipython:

In [6]:fromfile('input.txt',sep=' ').reshape(-1,3)
Out[6]:
array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])

Chuck

From oliphant.travis at ieee.org Fri Apr 21 19:51:07 2006
From: oliphant.travis at ieee.org (Travis Oliphant)
Date: Fri Apr 21 19:51:07 2006
Subject: [Numpy-discussion] Re: seterr changes
In-Reply-To: <44465DEE.8090703@cox.net>
References: <44465DEE.8090703@cox.net>
Message-ID: <444999E2.1040009@ieee.org>

Tim Hochberg wrote:
>
> Hi Travis et al,
>
> I started looking at your seterr changes.

Thank you very much for the help on this. I'm not an expert on threaded code by any means. In fact, as you clearly point out, I don't eat and drink what will work under threaded environments and what won't. Clearly global variables are problematic. That is the problem with the update_use_defaults bit, right? This is the way it was being managed before and I just changed names a bit to use PyThreadState_GetDict for the dictionary (it seems possible to use only from C until Python 2.4).

I say if it only buys 5% on small arrays then it's not worth it, as there are other fish to fry to make up for that 5%, and I agree that tracking down threading problems due to a finagled global variable is sticky. I did not think about the threading issues deeply enough.

> I'm also curious about the seterr interface. It returns
> ufunc_values_obj. I wasn't sure how one is supposed to pass that
> back in to seterr, so I modified seterr to instead return a
> dictionary. I also modified it so that the seterr function itself has
> no defaults (or rather they're all None). Instead, any unspecified
> values are taken from the current error state. Thus
> seterr(divide="warn") changes only the divide state, leaving the other
> entries alone.

Returning an object is a late-in-the-game idea and should be critiqued. It can be passed to seterr (an attribute check grabs the actual list --- did you want to change it to a dictionary?). Doesn't a small list have faster access than a small dictionary?

I'll look over your commits and comment later if I think of anything...

I'm thrilled with your work.
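(A minimal sketch of the round trip the dictionary return makes possible, assuming seterr hands back the previous settings as a dict and accepts them again as keyword arguments, which is what Tim's change implies:

import numpy

old = numpy.seterr(over='raise')   # returns the prior state as a dict
try:
    pass  # calculations that should trap overflow would go here
finally:
    numpy.seterr(**old)            # restore exactly what was in effect before

)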
Best, -Travis From bitorika at cs.tcd.ie Sat Apr 22 03:18:00 2006 From: bitorika at cs.tcd.ie (bitorika at cs.tcd.ie) Date: Sat Apr 22 03:18:00 2006 Subject: [Numpy-discussion] Floating point exception with numpy and embedded python interpreter In-Reply-To: <4446819D.3030401@astraw.com> References: <81934aa60604191029h4d8a8d9bl550fa58cc67d3d5e@mail.gmail.com> <44467576.1020708@astraw.com> <4446819D.3030401@astraw.com> Message-ID: <35791.134.226.38.190.1145701016.squirrel@webmail.cs.tcd.ie> >> On 19 Apr 2006, at 18:37, Andrew Straw wrote: > I think that numpy only accesses the SSE units through ATLAS or other > external library. So, build numpy without ATLAS. But I'm not 100% sure > anymore if there aren't any optimizations that directly use SSE if it's > available. I've tried getting rid of all atlas, blas and lapack packages in my system and rebuilding numpy to use its own unoptimised lapack_lite, but no luck. Just trying to import numpy with PyImport_ImportModule("numpy") causes the program to crash with just a "Floating point exception" message output. The program I'm embedding Python in is the NS Network Simulator (http://www.isi.edu/nsnam/ns/). It's a complex C++ beast with its own Object-Tcl interpreter, but it's been working fine with embedded Python except for this numpy crash. I've used Numeric before and it worked fine as well. I'm lost now regarding what to work on to find a solution, anyone familiar with numpy internals has any suggestion? Thanks, Arkaitz From jordi.bofill at upc.edu Sat Apr 22 09:46:00 2006 From: jordi.bofill at upc.edu (Jordi Bofill) Date: Sat Apr 22 09:46:00 2006 Subject: [Numpy-discussion] Re: Dumping record arrays References: <200603302127.24231.pgmdevlist@mailcan.com> Message-ID: Pierre GM wrote: > Folks, > I'd like to dump/pickle some record arrays. The pickling works, the > unpickling raises a ValueError (on my version of numpy 0.9.6). (cf below). > Is this already corrected in the svn version ? > Thx > > > ########################################################################### > # > > x1 = array([21,32,14]) > x2 = array(['my','first','name']) > x3 = array([3.1, 4.5, 6.2]) > r = rec.fromarrays([x1,x2,x3], names='id, word, number') > > r.dump('dumper') > rb=load('dumper') > --------------------------------------------------------------------------- > exceptions.ValueError Traceback (most > recent call last) > > /home/backtopop/Work/workspace-python/pyflows/src/ > > /usr/lib64/python2.4/site-packages/numpy/core/numeric.py in load(file) > 331 if isinstance(file, type("")): > 332 file = _file(file,"rb") > --> 333 return _cload(file) > 334 > 335 # These are all essentially abbreviations > > /usr/lib64/python2.4/site-packages/numpy/core/_internal.py in > _reconstruct(subtype, shape, dtype) > 251 > 252 def _reconstruct(subtype, shape, dtype): > --> 253 return ndarray.__new__(subtype, shape, dtype) > 254 > 255 > > ValueError: ('data-type with unspecified variable length', _reconstruct at 0x2aaaafcf1578>, (, > (0,), 'V')) > > > > > ------------------------------------------------------- > This SF.Net email is sponsored by xPML, a groundbreaking scripting > language that extends applications into web and mobile media. Attend the > live webcast and join the prime developer group breaking into this new > coding territory! > http://sel.as-us.falkag.net/sel?cmd=lnk&kid=110944&bid=241720&dat=121642 I'm newbie moving from numarray and I also get this error. I tried svn records.py with the same result. Any hope in getting it fixed? 
The error can be reproduced from the source example:

import numpy.core.records as rec
r=rec.fromrecords([(456,'dbe',1.2),(2,'de',1.3)],names='col1,col2,col3')
import cPickle
print cPickle.loads(cPickle.dumps(r))

---------------------------------------------------------------------------
exceptions.ValueError Traceback (most recent call last)

/home/jordi/temp/

/usr/lib/python2.4/site-packages/numpy/core/_internal.py in _reconstruct(subtype, shape, dtype)
    251
    252 def _reconstruct(subtype, shape, dtype):
--> 253     return ndarray.__new__(subtype, shape, dtype)
    254
    255

ValueError: ('data-type with unspecified variable length', <function _reconstruct at ...>, (<class 'numpy.core.records.recarray'>, (0,), 'V'))

From oliphant.travis at ieee.org Sat Apr 22 10:19:00 2006
From: oliphant.travis at ieee.org (Travis Oliphant)
Date: Sat Apr 22 10:19:00 2006
Subject: [Numpy-discussion] Re: Dumping record arrays
In-Reply-To: References: <200603302127.24231.pgmdevlist@mailcan.com>
Message-ID: <444A653A.9020402@ieee.org>

Jordi Bofill wrote:
> Pierre GM wrote:
>
>> Folks,
>> I'd like to dump/pickle some record arrays. The pickling works, the
>> unpickling raises a ValueError (on my version of numpy 0.9.6). (cf below).
>> Is this already corrected in the svn version ?
>> Thx
>>
>> ###########################################################################
>> #
>>
>> x1 = array([21,32,14])
>> x2 = array(['my','first','name'])
>> x3 = array([3.1, 4.5, 6.2])
>> r = rec.fromarrays([x1,x2,x3], names='id, word, number')
>>

This is fixed in SVN (but you have to get more than just the SVN records.py script). The needed change is in the __reduce__ method of the array object (which is in C). A re-compile is needed.

NumPy 0.9.8 should be out in a few weeks.

Best,

-Travis

>> r.dump('dumper')
>> rb=load('dumper')
>> ---------------------------------------------------------------------------
>> exceptions.ValueError Traceback (most
>> recent call last)
>>
>> /home/backtopop/Work/workspace-python/pyflows/src/
>>
>> /usr/lib64/python2.4/site-packages/numpy/core/numeric.py in load(file)
>> 331 if isinstance(file, type("")):
>> 332 file = _file(file,"rb")
>> --> 333 return _cload(file)
>> 334
>> 335 # These are all essentially abbreviations
>>
>> /usr/lib64/python2.4/site-packages/numpy/core/_internal.py in
>> _reconstruct(subtype, shape, dtype)
>> 251
>> 252 def _reconstruct(subtype, shape, dtype):
>> --> 253 return ndarray.__new__(subtype, shape, dtype)
>> 254
>> 255
>>
>> ValueError: ('data-type with unspecified variable length', <function
>> _reconstruct at 0x2aaaafcf1578>, (<class 'numpy.core.records.recarray'>,
>> (0,), 'V'))
>
> I'm newbie moving from numarray and I also get this error. I tried svn
> records.py with the same result. Any hope in getting it fixed?
> The error can be reproduced from the source example:
>
> import numpy.core.records as rec
> r=rec.fromrecords([(456,'dbe',1.2),(2,'de',1.3)],names='col1,col2,col3')
> import cPickle
> print cPickle.loads(cPickle.dumps(r))
> ---------------------------------------------------------------------------
> exceptions.ValueError Traceback (most recent
> call last)
>
> /home/jordi/temp/
>
> /usr/lib/python2.4/site-packages/numpy/core/_internal.py in
> _reconstruct(subtype, shape, dtype)
> 251
> 252 def _reconstruct(subtype, shape, dtype):
> --> 253 return ndarray.__new__(subtype, shape, dtype)
> 254
> 255
>
> ValueError: ('data-type with unspecified variable length', <function
> _reconstruct at 0xb78fce64>, (<class 'numpy.core.records.recarray'>,
> (0,), 'V'))

From fullung at gmail.com Sat Apr 22 10:53:05 2006
From: fullung at gmail.com (Albert Strasheim)
Date: Sat Apr 22 10:53:05 2006
Subject: [Numpy-discussion] Re: seterr changes
In-Reply-To: <444999E2.1040009@ieee.org>
Message-ID: <005701c66635$82b3a930$0502010a@dsp.sun.ac.za>

Hello all

I was just wondering if someone could provide some example code that would cause an error if invalid is set to 'raise'?

I also noticed that seterr returns the old values. Is this really useful? Consider its use in an IPython session:

In [184]: N.geterr()
Out[184]: {'over': 'ignore', 'divide': 'ignore', 'invalid': 'ignore', 'under': 'ignore'}

In [185]: N.seterr(over='raise')
Out[185]: {'over': 'ignore', 'divide': 'ignore', 'invalid': 'ignore', 'under': 'ignore'}

I think the following pattern would make sense, but it seems it doesn't work at present:

old=N.geterr()
N.seterr(over='raise')
# do some calculations that might overflow
N.seterr(old)

This currently causes the following error:

Traceback (most recent call last):
  File "", line 1, in ?
  File "C:\Python24\Lib\site-packages\numpy\core\numeric.py", line 426, in seterr
    maskvalue = ((_errdict[divide] << SHIFT_DIVIDEBYZERO) +
TypeError: dict objects are unhashable

Is this intended? I think it would be useful to be able to restore all the error states in one go.

Regards,

Albert

> -----Original Message-----
> From: numpy-discussion-admin at lists.sourceforge.net [mailto:numpy-
> discussion-admin at lists.sourceforge.net] On Behalf Of Travis Oliphant
> Sent: 22 April 2006 04:50
> To: tim.hochberg at ieee.org; numpy-discussion
> Subject: [Numpy-discussion] Re: seterr changes
>
> Tim Hochberg wrote:
> >
> > Hi Travis et al,
> >
> > I started looking at your seterr changes.
> Thank you very much for the help on this. I'm not an expert on threaded
> code by any means. In fact, as you clearly point out, I don't eat and
> drink what will work under threaded environments and what won't. Clearly
> global variables are problematic. That is the problem with the
> update_use_defaults bit, right? This is the way it was being managed
> before and I just changed names a bit to use PyThreadState_GetDict for
> the dictionary (it seems possible to use only from C until Python 2.4).
> > I say if it only buys 5% on small arrays then it's not worth it as there > are other fish to fry to make up for that 5% and I agree that tracking > down threading problems due to a fanagled global variable is sticky. I > did not think about the threading issues deeply enough. > > > I'm also curious about the seterr interface. It returns > > ufunc_values_obj. I'm wasn't sure how one is supposed to pass that > > back in to seterr, so I modified seterr to instead return a > > dictionary. I also modified it so that the seterr function itself has > > no defaults (or rather they're all None). Instead, any unspecified > > values are taken from the current error state. Thus > > seterr(divide="warn") changes only the divide state, leaving the other > > entries alone. > Returning an object is a late-in-the-game idea and should be critiqued. > It can be passed to seterr (an attribute check grabs the actual list --- > did you want to change it to a dictionary?). Doesn't a small list have > faster access than a small dictionary? > > I'll look over your commits and comment later if I think of anything... > > I'm thrilled with your work. > > Best, > > -Travis From rob at hooft.net Sat Apr 22 11:48:01 2006 From: rob at hooft.net (Rob Hooft) Date: Sat Apr 22 11:48:01 2006 Subject: [Numpy-discussion] Re: seterr changes In-Reply-To: <005701c66635$82b3a930$0502010a@dsp.sun.ac.za> References: <005701c66635$82b3a930$0502010a@dsp.sun.ac.za> Message-ID: <444A7A35.5090906@hooft.net> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Albert Strasheim wrote: | old=N.geterr() | N.seterr(over='raise') | # so some calculations that might overflow | N.seterr(old) You should try (but I didn't): N.seterr(**old) Rob - -- Rob W.W. Hooft || rob at hooft.net || http://www.hooft.net/people/rob/ -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.2.2 (GNU/Linux) Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org iD8DBQFESno1H7J/Cv8rb3QRAppZAKCGBRSvL++wg3wFer6odmG8sxyrFwCfQ1nq p0aVr4r+Z1ZfRBGQgir+KX0= =eZMa -----END PGP SIGNATURE----- From strawman at astraw.com Sat Apr 22 12:13:02 2006 From: strawman at astraw.com (Andrew Straw) Date: Sat Apr 22 12:13:02 2006 Subject: [Numpy-discussion] Floating point exception with numpy and embedded python interpreter In-Reply-To: <35791.134.226.38.190.1145701016.squirrel@webmail.cs.tcd.ie> References: <81934aa60604191029h4d8a8d9bl550fa58cc67d3d5e@mail.gmail.com> <44467576.1020708@astraw.com> <4446819D.3030401@astraw.com> <35791.134.226.38.190.1145701016.squirrel@webmail.cs.tcd.ie> Message-ID: <444A8026.3030307@astraw.com> bitorika at cs.tcd.ie wrote: >>>On 19 Apr 2006, at 18:37, Andrew Straw wrote: >>> >>> >>I think that numpy only accesses the SSE units through ATLAS or other >>external library. So, build numpy without ATLAS. But I'm not 100% sure >>anymore if there aren't any optimizations that directly use SSE if it's >>available. >> >> > >I've tried getting rid of all atlas, blas and lapack packages in my system >and rebuilding numpy to use its own unoptimised lapack_lite, but no luck. >Just trying to import numpy with PyImport_ImportModule("numpy") causes the >program to crash with just a "Floating point exception" message output. > >The program I'm embedding Python in is the NS Network Simulator >(http://www.isi.edu/nsnam/ns/). It's a complex C++ beast with its own >Object-Tcl interpreter, but it's been working fine with embedded Python >except for this numpy crash. I've used Numeric before and it worked fine >as well. 
>
> I'm lost now regarding what to work on to find a solution, anyone familiar
> with numpy internals has any suggestion?
>

OK, going back to your original gdb traceback, it looks like the SIGFPE originated in the following function in umathmodule.c:

static double
pinf_init(void)
{
    double mul = 1e10;
    double tmp = 0.0;
    double pinf;

    pinf = mul;
    for (;;) {
        pinf *= mul;
        if (pinf == tmp) break;
        tmp = pinf;
    }
    return pinf;
}

If you try just that function (instead of the whole Python interpreter and numpy module) and still get the exception, you'll be that much closer to narrowing down the issue.

From robert.kern at gmail.com Sat Apr 22 18:58:01 2006
From: robert.kern at gmail.com (Robert Kern)
Date: Sat Apr 22 18:58:01 2006
Subject: [Numpy-discussion] Re: Backporting numpy to Python 2.2
In-Reply-To: <20060419103554.4ac1df4a.twegener@radlogic.com.au>
References: <20060419103554.4ac1df4a.twegener@radlogic.com.au>
Message-ID:

Tim Wegener wrote:
> Hi,
>
> I am attempting to backport numpy-0.9.6 to be compatible with python 2.2. (Some of our machines run python 2.2 as part of Red Hat 9 and Red Hat 7.3 and it is hazardous to alter the standard setup.) I was able to change most of the 2.3-isms to be 2.2 compatible (see the attached patch). However I had problems compiling the following c module:

I was hoping that Travis would jump in and talk about the reasons that he targeted 2.3 and not 2.2. I don't think that it's going to be feasible to target 2.2 at this point. If nothing else, I've long since forgotten how to write 2.2 code. More seriously, doing an overhaul of all of the C code in numpy to use the older API is just going to make the code clumsier and more difficult to maintain.

I think it is going to be much easier for you to install a second, more recent Python interpreter on your machines than it will be for you to maintain a 2.2-compatible branch. Linux installations, even Red Hat, usually handle having multiple versions of Python installed side by side just fine. You don't have to remove Python 2.2.

-- Robert Kern robert.kern at gmail.com "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco

From zpincus at stanford.edu Sat Apr 22 20:48:00 2006
From: zpincus at stanford.edu (Zachary Pincus)
Date: Sat Apr 22 20:48:00 2006
Subject: [Numpy-discussion] Matrix and var method
Message-ID: <83468068-4E41-45A1-9753-90CEADF34722@stanford.edu>

Hi folks,

I just ran across an error with numpy.matrix types: the var() method does not seem to work! (I've tried all sorts of permutations on the matrix shape, and the axis parameter to var; nothing works.) Perhaps this has already been fixed -- I haven't updated my numpy in a week or so. If so, sorry; if not, I hope this helps.
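(A workaround in the meantime — suggested by the fact that, in the failing session below, the plain-array call succeeds while the matrix call dies inside matrix.__mul__ — is to drop to an ndarray view before reducing; a quick sketch:

import numpy

m = numpy.matrix([[1, 2, 3], [1, 2, 3]])
# asarray returns a plain ndarray view, so * is elementwise again
# and var() no longer trips over matrix multiplication
print numpy.asarray(m).var()

)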
Zach In [1]: import numpy In [2]: numpy.__version__ Out[2]: '0.9.7.2335' In [3]: numpy.matrix([[1,2,3], [1,2,3]]).var() ------------------------------------------------------------------------ --- exceptions.ValueError Traceback (most recent call last) /Users/zpincus/ /Library/Frameworks/Python.framework/Versions/2.4/lib/python2.4/site- packages/numpy/core/defmatrix.py in __mul__(self, other) 147 if isinstance(other, N.ndarray) or N.isscalar(other) or \ 148 not hasattr(other, '__rmul__'): --> 149 return N.dot(self, other) 150 else: 151 return NotImplemented ValueError: matrices are not aligned In [4]: numpy.array([[1,2,3], [1,2,3]]).var() Out[4]: 0.80000000000000004 From a.mcmorland at auckland.ac.nz Sun Apr 23 17:40:02 2006 From: a.mcmorland at auckland.ac.nz (Angus McMorland) Date: Sun Apr 23 17:40:02 2006 Subject: [Numpy-discussion] Error installing on amd64 Debian-unstable Message-ID: <444C1E24.8030603@auckland.ac.nz> I had no troubles installing numpy and scipy on my 32-bit laptop, but cannot get numpy to install on my amd64 debian desktop. I've pulled in the latest svn versions, then run: $ python setup.py install Installation seems to run okay (no error messages), but the following happens: In [1]: import numpy import core -> failed: /usr/lib/python2.3/site-packages/numpy/core/_sort.so: undefined symbol: PyArray_CompareUCS4 import lib -> failed: module compiled against version 90703 of C-API but this version of numpy is 90704 import linalg -> failed: module compiled against version 90703 of C-API but this version of numpy is 90704 import dft -> failed: cannot import name asarray import random -> failed: 'module' object has no attribute 'dtype' --------------------------------------------------------------------------- exceptions.ImportError Traceback (most recent call last) /home/amcmorl/ /usr/lib/python2.3/site-packages/numpy/__init__.py 47 return NumpyTest().test(level, verbosity) 48 ---> 49 import add_newdocs 50 51 if __doc__ is not None: /usr/lib/python2.3/site-packages/numpy/add_newdocs.py ----> 2 from lib import add_newdoc 3 4 add_newdoc('numpy.core','dtype', 5 [('fields', "Fields of the data-typedescr if any."), 6 ('alignment', "Needed alignment for this data-type"), ImportError: cannot import name add_newdoc Can anyone suggest what I'm doing wrong? Cheers, A. -- Angus McMorland email a.mcmorland at auckland.ac.nz mobile +64-21-155-4906 PhD Student, Neurophysiology / Multiphoton & Confocal Imaging Physiology, University of Auckland phone +64-9-3737-599 x89707 Armourer, Auckland University Fencing Secretary, Fencing North Inc. From robert.kern at gmail.com Sun Apr 23 17:55:08 2006 From: robert.kern at gmail.com (Robert Kern) Date: Sun Apr 23 17:55:08 2006 Subject: [Numpy-discussion] Re: Error installing on amd64 Debian-unstable In-Reply-To: <444C1E24.8030603@auckland.ac.nz> References: <444C1E24.8030603@auckland.ac.nz> Message-ID: Angus McMorland wrote: > I had no troubles installing numpy and scipy on my 32-bit laptop, but > cannot get numpy to install on my amd64 debian desktop. I've pulled in > the latest svn versions, then run: > > $ python setup.py install > > Installation seems to run okay (no error messages), but the following > happens: > > In [1]: import numpy > import core -> failed: > /usr/lib/python2.3/site-packages/numpy/core/_sort.so: undefined symbol: > PyArray_CompareUCS4 > import lib -> failed: module compiled against version 90703 of C-API but > this version of numpy is 90704 Please delete the build/ directory and the installed numpy package and rebuild. 
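Concretely, the clean rebuild amounts to something like the following from the numpy source tree (the site-packages path is taken from the traceback above; removing it may need root):

$ rm -rf build
$ rm -rf /usr/lib/python2.3/site-packages/numpy
$ python setup.py install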
If the problem persists, please let us know. -- Robert Kern robert.kern at gmail.com "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From robert.kern at gmail.com Sun Apr 23 17:58:22 2006 From: robert.kern at gmail.com (Robert Kern) Date: Sun Apr 23 17:58:22 2006 Subject: [Numpy-discussion] Changing the Trac authentication Message-ID: <444C20E5.7090309@gmail.com> I will be changing the Trac authentication over the next hour or so. I will be installing the AccountManagerPlugin to allow users to create accounts for themselves without needing to have SVN write access. Anonymous users will not be able to edit the Wikis or tickets. Non-developer, but registered users will be able to do so with some restrictions, notably not being able to resolve tickets. Developers who currently have accounts will have the same username/password as before. If you have problems using the Trac sites before I announce that I am done, please wait until I am finished. If there are still problems, please let me know and I will try to fix them as soon as possible. Thank you for your patience. Hopefully, this change will resolve the spam problem. -- Robert Kern robert.kern at gmail.com "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From robert.kern at gmail.com Sun Apr 23 18:12:02 2006 From: robert.kern at gmail.com (Robert Kern) Date: Sun Apr 23 18:12:02 2006 Subject: [Numpy-discussion] Re: Changing the Trac authentication In-Reply-To: <444C20E5.7090309@gmail.com> References: <444C20E5.7090309@gmail.com> Message-ID: <444C25A9.8080701@gmail.com> Robert Kern wrote: > I will be changing the Trac authentication over the next hour or so. Never mind. I'll have to do it tomorrow when I get to the office. -- Robert Kern robert.kern at gmail.com "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From rmuller at sandia.gov Mon Apr 24 09:12:13 2006 From: rmuller at sandia.gov (Rick Muller) Date: Mon Apr 24 09:12:13 2006 Subject: [Numpy-discussion] Problems building numpy Message-ID: <02801766-7F45-48EE-AD4A-7B4B0590C9AC@sandia.gov> Numpy really builds nicely now, and I appreciate all of the hard work that people have put into portability of this code. That being said, I just had my first system where Numpy failed to build. It's on a redhat 7.3 (yes, we have a 7.3 box. I didn't believe it either. not my decision.) and I get the following error when trying to run Numpy: Python 2.4.3 (#1, Apr 24 2006, 09:54:46) [GCC 3.2.3 20030502 (Red Hat Linux 3.2.3-42)] on linux2 Type "help", "copyright", "credits" or "license" for more information. >>> from numpy import array import linalg -> failed: /usr/local/lib/python2.4/site-packages/numpy/ linalg/lapack_lite.so: undefined symbol: s_wsfe If this is easy to fix, I'd prefer to fix it. However, if the numpy developers have better things to do than to support a 10-year-old operating system (and I suspect that they do), I'm cool with that. 
Rick Rick Muller rmuller at sandia.gov From arkaitz.bitorika at gmail.com Mon Apr 24 09:24:03 2006 From: arkaitz.bitorika at gmail.com (Arkaitz Bitorika) Date: Mon Apr 24 09:24:03 2006 Subject: [Numpy-discussion] Floating point exception with numpy and embedded python interpreter In-Reply-To: <444A8026.3030307@astraw.com> References: <81934aa60604191029h4d8a8d9bl550fa58cc67d3d5e@mail.gmail.com> <44467576.1020708@astraw.com> <4446819D.3030401@astraw.com> <35791.134.226.38.190.1145701016.squirrel@webmail.cs.tcd.ie> <444A8026.3030307@astraw.com> Message-ID: Andrew, I've verified that the function causes the exception when embedded in the program but not when used from a simple C program with just a main () function. The successful version iterates 31 times over the for loop while the crashing one fails the 30th time that it does "pinf *= mul". Now we know exactly where the crash is, but no idea how to fix it ;). It doesn't look it should be related to SSE2 flags, it's just doing a big multiplication, but I don't know enough about low level C and floating point operations to understand why it may be throwing the exception there. Any idea how I could avoid that function crashing? Thanks, Arkaitz On 22 Apr 2006, at 20:12, Andrew Straw wrote: > OK, going back to your original gdb traceback, it looks like the > SIGFPE > originated in the following funtion in umathmodule.c: > > static double > pinf_init(void) > { > double mul = 1e10; > double tmp = 0.0; > double pinf; > > pinf = mul; > for (;;) { > pinf *= mul; > if (pinf == tmp) break; > tmp = pinf; > } > return pinf; > } > > If you try just that function (instead of the whole Python interpreter > and numpy module) and still get the exception, you'll be that much > closer to narrowing down the issue. From robert.kern at gmail.com Mon Apr 24 09:53:02 2006 From: robert.kern at gmail.com (Robert Kern) Date: Mon Apr 24 09:53:02 2006 Subject: [Numpy-discussion] Re: Problems building numpy In-Reply-To: <02801766-7F45-48EE-AD4A-7B4B0590C9AC@sandia.gov> References: <02801766-7F45-48EE-AD4A-7B4B0590C9AC@sandia.gov> Message-ID: Rick Muller wrote: > Numpy really builds nicely now, and I appreciate all of the hard work > that people have put into portability of this code. > > That being said, I just had my first system where Numpy failed to > build. It's on a redhat 7.3 (yes, we have a 7.3 box. I didn't believe > it either. not my decision.) and I get the following error when trying > to run Numpy: > > Python 2.4.3 (#1, Apr 24 2006, 09:54:46) > [GCC 3.2.3 20030502 (Red Hat Linux 3.2.3-42)] on linux2 > Type "help", "copyright", "credits" or "license" for more information. >>>> from numpy import array > import linalg -> failed: /usr/local/lib/python2.4/site-packages/numpy/ > linalg/lapack_lite.so: undefined symbol: s_wsfe > > If this is easy to fix, I'd prefer to fix it. However, if the numpy > developers have better things to do than to support a 10-year-old > operating system (and I suspect that they do), I'm cool with that. This usually means that you are not linking in the g2c library: http://www.scipy.org/FAQ#head-26562f0a9e046b53eae17de300fc06408f9c91a8 -- Robert Kern robert.kern at gmail.com "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From ndarray at mac.com Mon Apr 24 10:07:06 2006 From: ndarray at mac.com (Sasha) Date: Mon Apr 24 10:07:06 2006 Subject: [Numpy-discussion] Broadcasting rules (Ticket 76). 
Message-ID: I was looking at ticket 76: http://projects.scipy.org/scipy/numpy/ticket/76 At first, I concluded that the ticket was valid and that >>> a = zeros([5,2]) >>> a[:] = arange(5) should raise an error as it did in Numeric. However, once I started looking at the code, I've realized that numpy supports more flexible broadcasting rules than Numeric. For example: >>> x = zeros([10]) >>> x[:] = 1,2 >>> x array([1, 2, 1, 2, 1, 2, 1, 2, 1, 2]) That would be an error in Numeric. Given that the above is valid, the result in Ticket 76 actually makes sense. I believe it is time to have some discussion about the future of broadcasting rules in numpy. Can anyone provide a summary of the status quo? From oliphant.travis at ieee.org Mon Apr 24 10:43:05 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Mon Apr 24 10:43:05 2006 Subject: [Numpy-discussion] Broadcasting rules (Ticket 76). In-Reply-To: References: Message-ID: <444D0DF7.2060307@ieee.org> Sasha wrote: > I was looking at ticket 76: > > http://projects.scipy.org/scipy/numpy/ticket/76 > > At first, I concluded that the ticket was valid and that > > >>>> a = zeros([5,2]) >>>> a[:] = arange(5) >>>> > > should raise an error as it did in Numeric. However, once I started > looking at the code, I've realized that numpy supports more flexible > broadcasting rules than Numeric. > This really isn't in the category of "broadcasting" as I see it. My understanding is that broadcasting refers to operations involving more than one array on the input side. It's really just a "universal function" concept. A copying operation is not handled using the same rules. In this case, for example, Numeric used to raise an error but in NumPy the array will be copied as many times as possible into the array. I don't believe ticket #76 is actually an error. This behavior could be changed if somebody wants to write the code to change it but only until version 1.0. It would be very difficult to change the other broadcasting behavior which was inherited from Numeric, however. The only possibility I see is adding new useful functionality where Numeric used to raise an error. -Travis From zpincus at stanford.edu Mon Apr 24 10:57:04 2006 From: zpincus at stanford.edu (Zachary Pincus) Date: Mon Apr 24 10:57:04 2006 Subject: [Numpy-discussion] Broadcasting rules (Ticket 76). In-Reply-To: <444D0DF7.2060307@ieee.org> References: <444D0DF7.2060307@ieee.org> Message-ID: <4AB1DE92-E877-4E22-83AB-69DDBB32FB25@stanford.edu> > It would be very difficult to change the other broadcasting > behavior which was inherited from Numeric, however. The only > possibility I see is adding new useful functionality where Numeric > used to raise an error. Well, there is one case that I run into all of the time where the broadcasting rules seem a bit constraining: In [1]: import numpy In [2]: numpy.__version__ '0.9.7.2335' In [3]: a = numpy.ones([50, 100]) In [4]: means = a.mean(axis = 1) In [5]: print a.shape, means.shape (50, 100) (50,) In [5]: a / means ValueError: index objects are not broadcastable to a single shape In [6]: (a.transpose() / means).transpose() #this works It's obvious why this doesn't work due to the broadcasting rules, but it also seems (to me, in this case at least) obvious what I am trying to do. I don't think I'm suggesting that the broadcasting rules be changed to allow matching-from-the-right in the general case, since that seems likely to make the broadcasting rules even more difficult to grok. But there do seem to be a lot of (....transpose () ... 
).transpose() bits in my code. Is there anything to be done here? I presume not, but I just wanted to mention it. Zach From oliphant.travis at ieee.org Mon Apr 24 11:25:06 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Mon Apr 24 11:25:06 2006 Subject: ***[Possible UCE]*** Re: [Numpy-discussion] Broadcasting rules (Ticket 76). In-Reply-To: <4AB1DE92-E877-4E22-83AB-69DDBB32FB25@stanford.edu> References: <444D0DF7.2060307@ieee.org> <4AB1DE92-E877-4E22-83AB-69DDBB32FB25@stanford.edu> Message-ID: <444D17E6.1070104@ieee.org> Zachary Pincus wrote: >> It would be very difficult to change the other broadcasting behavior >> which was inherited from Numeric, however. The only possibility I >> see is adding new useful functionality where Numeric used to raise an >> error. > > Well, there is one case that I run into all of the time where the > broadcasting rules seem a bit constraining: > > In [1]: import numpy > In [2]: numpy.__version__ > '0.9.7.2335' > In [3]: a = numpy.ones([50, 100]) > In [4]: means = a.mean(axis = 1) > In [5]: print a.shape, means.shape > (50, 100) (50,) > In [5]: a / means > ValueError: index objects are not broadcastable to a single shape > In [6]: (a.transpose() / means).transpose() > #this works > > It's obvious why this doesn't work due to the broadcasting rules, but > it also seems (to me, in this case at least) obvious what I am trying > to do. I don't think I'm suggesting that the broadcasting rules be > changed to allow matching-from-the-right in the general case, since > that seems likely to make the broadcasting rules even more difficult > to grok. But there do seem to be a lot of (....transpose() ... > ).transpose() bits in my code. > > Is there anything to be done here? I presume not, but I just wanted to > mention it. Yes, just be more explicit about which end to tack extra dimensions onto (the automatic extension always assumes pre-pending...) a / means[:,newaxis] is the suggested spelling... -Travis From ndarray at mac.com Mon Apr 24 11:30:05 2006 From: ndarray at mac.com (Sasha) Date: Mon Apr 24 11:30:05 2006 Subject: [Numpy-discussion] Broadcasting rules (Ticket 76). In-Reply-To: <4AB1DE92-E877-4E22-83AB-69DDBB32FB25@stanford.edu> References: <444D0DF7.2060307@ieee.org> <4AB1DE92-E877-4E22-83AB-69DDBB32FB25@stanford.edu> Message-ID: On 4/24/06, Zachary Pincus wrote: > [...] > In [5]: print a.shape, means.shape > (50, 100) (50,) > In [5]: a / means > ValueError: index objects are not broadcastable to a single shape > In [6]: (a.transpose() / means).transpose() > #this works This works too: >>> x = a / means[:,newaxis] no .transpose() :-). From ndarray at mac.com Mon Apr 24 11:49:04 2006 From: ndarray at mac.com (Sasha) Date: Mon Apr 24 11:49:04 2006 Subject: [Numpy-discussion] Broadcasting rules (Ticket 76). In-Reply-To: <444D0DF7.2060307@ieee.org> References: <444D0DF7.2060307@ieee.org> Message-ID: On 4/24/06, Travis Oliphant wrote: > [...] > A copying operation is not handled using the same rules. In this case, > for example, Numeric used to raise an error but in NumPy the array will > be copied as many times as possible into the array. I don't believe > ticket #76 is actually an error. > I disagree on the terminology. In my view broadcasting means repeating the values of the array to fit into a different shape no matter what dictates the new shape an operand or the receiver. IMHO the following is slightly confusing: >>> a = zeros([5,2]) >>> a[...] += arange(5) Traceback (most recent call last): File "", line 1, in ? 
ValueError: shape mismatch: objects cannot be broadcast to a single shape

>>> x+(1,)
array([1, 1, 1, 1])

I suggest that we make ufunc consistent with slice assignment. Currently:

>>> x[:]=1,1
>>> x[:]=1,1,1
Traceback (most recent call last):
  File "", line 1, in ?
ValueError: number of elements in destination must be integer multiple of number of elements in source

From cookedm at physics.mcmaster.ca Mon Apr 24 13:13:09 2006
From: cookedm at physics.mcmaster.ca (David M. Cooke)
Date: Mon Apr 24 13:13:09 2006
Subject: [Numpy-discussion] numexpr enhancements
In-Reply-To: <20060421205530.GA25020@xot.carabos.com> (faltet@xot.carabos.com's message of "Fri, 21 Apr 2006 20:55:30 +0000")
References: <20060421205530.GA25020@xot.carabos.com>
Message-ID:

faltet at xot.carabos.com writes:

> Hi,
>
> After looking at the numpy performance issues on strided and unaligned
> data, I decided to have a try at the numexpr package and finally
> implemented better support for them. As a result, numexpr can now reach
> a 2x performance improvement for simple expressions, like 'a>2.'.
>
> Along the way, I've added support for boolean expressions (&, | and ~, as
> in the where() function), a new boolean data type (important to get
> better performance on boolean expressions) and support for numarray
> (maintaining the compatibility with numpy, of course).
>
> I've called the new package numexpr 0.2 so as not to confuse it with the
> existing 0.1. Well, let's hope that numexpr can continue making its way
> towards integration in numpy.
>
> You can fetch this new package at:
>
> http://www.carabos.com/downloads/divers/numexpr-0.2.tar.gz
>
> Finally, let me say that numexpr is a wonderful toy to get your hands
> dirty ;-) Many thanks to David (and Tim) for this!

Unfortunately, real life (damn Ph.D.! :-) has gotten in my way, so I'm not going to be able to look at this for a while. But I'll add it to my list.

-- |>|\/|< /--------------------------------------------------------------------------\ |David M. Cooke http://arbutus.physics.mcmaster.ca/dmc/ |cookedm at physics.mcmaster.ca

From cookedm at physics.mcmaster.ca Mon Apr 24 13:18:05 2006
From: cookedm at physics.mcmaster.ca (David M. Cooke)
Date: Mon Apr 24 13:18:05 2006
Subject: [Numpy-discussion] announce: pyjit, a little jit for creating numpy ufuncs
In-Reply-To: <20060421162336.42285837.simon@arrowtheory.com> (Simon Burton's message of "Fri, 21 Apr 2006 16:23:36 +1000")
References: <20060421162336.42285837.simon@arrowtheory.com>
Message-ID:

Simon Burton writes:

> Hi,
>
> Inspired by numexpr, pypy and llvm, I've built a simple
> JIT for creating numpy "ufuncs" (they are not yet real ufuncs).
> It uses llvm[1] as the backend machine code generator.

Cool! I had a look at LLVM, but I wanted something to go into SciPy, and that was too heavy a dependence.
However, I could see doing more stuff with this than I can easily with numexpr.

> The main things it can do are:
>
> *) parse simple python code (function def's)
> *) generate SSA assembly code for llvm
> *) build ufunc code for applying to numpy arrays
>
> When I say simple I mean it:
>
> def calc(a,b):
>     c = (a+b)/2.0
>     return c
>
> No control flow or type inference has been implemented.
>
> As with numexpr, significant speedups are possible.
>
> I'm putting this announce here to see what the other numpy'ers think.
>
> $ svn co http://rubis.rsise.anu.edu.au/local/repos/elefant/pyjit
>
> [1] http://llvm.org/

How do the speedups compare with numexpr? Are there any lessons you learned from this that could apply to numexpr? Could we have a common frontend for numexpr/pyjit, and a different backend for each? Then each wouldn't have to reinvent the wheel in parsing (the same thought goes for weave, too...)

I don't have much time to look at it (real life sucking my time :-(), but I'll have a look when I do have the time.

-- |>|\/|< /--------------------------------------------------------------------------\ |David M. Cooke http://arbutus.physics.mcmaster.ca/dmc/ |cookedm at physics.mcmaster.ca

From oliphant.travis at ieee.org Mon Apr 24 14:22:02 2006
From: oliphant.travis at ieee.org (Travis Oliphant)
Date: Mon Apr 24 14:22:02 2006
Subject: [Numpy-discussion] Broadcasting rules (Ticket 76).
In-Reply-To: References: <444D0DF7.2060307@ieee.org>
Message-ID: <444D4143.4020204@ieee.org>

Sasha wrote:
> On 4/24/06, Travis Oliphant wrote:
>
>> [...]
>> A copying operation is not handled using the same rules. In this case,
>> for example, Numeric used to raise an error but in NumPy the array will
>> be copied as many times as possible into the array. I don't believe
>> ticket #76 is actually an error.
>>
>>
> I disagree on the terminology. In my view broadcasting means
> repeating the values of the array to fit into a different shape no
> matter what dictates the new shape an operand or the receiver.
>

I can understand that view. But, that's not been the historical use of broadcasting, which has always been only a "ufunc" concept. Code to implement a broader view of broadcasting across more operations, if people decide that is appropriate, could be done (carefully), but I don't have time to write it.

-Travis

From oliphant.travis at ieee.org Mon Apr 24 14:25:02 2006
From: oliphant.travis at ieee.org (Travis Oliphant)
Date: Mon Apr 24 14:25:02 2006
Subject: [Numpy-discussion] Broadcasting rules (Ticket 76).
In-Reply-To: References: <444D0DF7.2060307@ieee.org>
Message-ID: <444D41FE.7050904@ieee.org>

Sasha wrote:
> In this category, I would suggest to allow broadcasting to any
> multiple of the dimension even if the dimension is not 1. I don't see
> what makes 1 so special.
>

What's so special about 1 is that the code for it is relatively straightforward and already implemented using strides. Altering the code to allow any multiple of the dimension would be harder and slower.

-Travis

From oliphant.travis at ieee.org Mon Apr 24 14:30:01 2006
From: oliphant.travis at ieee.org (Travis Oliphant)
Date: Mon Apr 24 14:30:01 2006
Subject: [Numpy-discussion] Broadcasting rules (Ticket 76).
In-Reply-To: References: <444D0DF7.2060307@ieee.org>
Message-ID: <444D4329.9050700@ieee.org>

Sasha wrote:
>>>> x[:]=1,1
>>>> x[:]=1,1,1
>>>>
> Traceback (most recent call last):
>   File "", line 1, in ?
> ValueError: number of elements in destination must be integer multiple > of number of elements in source > I think the only reasonable thing to do is to raise an error unless the shapes were compatible like Numeric did and eliminate the multiple copying feature. This would bring the desired consistency. -Travis From strawman at astraw.com Mon Apr 24 14:33:01 2006 From: strawman at astraw.com (Andrew Straw) Date: Mon Apr 24 14:33:01 2006 Subject: [Numpy-discussion] Floating point exception with numpy and embedded python interpreter In-Reply-To: References: <81934aa60604191029h4d8a8d9bl550fa58cc67d3d5e@mail.gmail.com> <44467576.1020708@astraw.com> <4446819D.3030401@astraw.com> <35791.134.226.38.190.1145701016.squirrel@webmail.cs.tcd.ie> <444A8026.3030307@astraw.com> Message-ID: <444D43D0.3040308@astraw.com> This doesn't seem like an issue with numpy. Your test proved that. I'm curious what the outcome is, but I'm afraid there's not much we can do. At this point I think you should write the ns2 people and see what they say. Their program seems to be responsible for twiddling the FPU/SSE flags, so I think the issue is better solved, or at least discussed, by them. Cheers! Andrew Arkaitz Bitorika wrote: > Andrew, > > I've verified that the function causes the exception when embedded in > the program but not when used from a simple C program with just a main > () function. The successful version iterates 31 times over the for > loop while the crashing one fails the 30th time that it does "pinf *= > mul". > > Now we know exactly where the crash is, but no idea how to fix it ;). > It doesn't look it should be related to SSE2 flags, it's just doing a > big multiplication, but I don't know enough about low level C and > floating point operations to understand why it may be throwing the > exception there. Any idea how I could avoid that function crashing? > > Thanks, > Arkaitz > > On 22 Apr 2006, at 20:12, Andrew Straw wrote: > >> OK, going back to your original gdb traceback, it looks like the SIGFPE >> originated in the following funtion in umathmodule.c: >> >> static double >> pinf_init(void) >> { >> double mul = 1e10; >> double tmp = 0.0; >> double pinf; >> >> pinf = mul; >> for (;;) { >> pinf *= mul; >> if (pinf == tmp) break; >> tmp = pinf; >> } >> return pinf; >> } >> >> If you try just that function (instead of the whole Python interpreter >> and numpy module) and still get the exception, you'll be that much >> closer to narrowing down the issue. > From oliphant.travis at ieee.org Mon Apr 24 17:40:04 2006 From: oliphant.travis at ieee.org (Travis E. Oliphant) Date: Mon Apr 24 17:40:04 2006 Subject: [Numpy-discussion] Re: Backporting numpy to Python 2.2 In-Reply-To: <20060419103554.4ac1df4a.twegener@radlogic.com.au> References: <20060419103554.4ac1df4a.twegener@radlogic.com.au> Message-ID: Tim Wegener wrote: > Hi, > > I am attempting to backport numpy-0.9.6 to be compatible with python 2.2. (Some of our machines run python 2.2 as part of Red Hat 9 and Red Hat 7.3 and it is hazardous to alter the standard setup.) I was able to change most of the 2.3-isms to be 2.2 compatible (see the attached patch). However I had problems compiling the following c module: I targeted Python 2.3 because it added some very nice constructs (Python 2.4 added even more but I disciplined myself not to use them). I think it is not impossible to back-port it to Python 2.2 but I agree with Robert that I wonder if it is worth the effort. In this case Python 2.3 added the bool type which is used in NumPy. 
Basically this type would have to be constructed (the code could be grabbed from Python 2.3) in order to be used. The addition of the boolean type is probably the single biggest change that would make back-porting to 2.2 difficult. There may be others as well, but they are probably easier to work around...

-Travis

From robert.kern at gmail.com Mon Apr 24 18:00:01 2006
From: robert.kern at gmail.com (Robert Kern)
Date: Mon Apr 24 18:00:01 2006
Subject: [Numpy-discussion] Changing the Trac authentication, for real this time!
Message-ID: <444D7458.3020402@gmail.com>

If you encounter errors accessing the Trac sites for NumPy and SciPy over the next hour or so, please wait until I have announced that I have finished. If things are still broken after that, please let me know and I will try to fix it immediately. The details of the changes were posted to the previous thread "Changing the Trac authentication". Apologies for any disruption and for the noise.

-- Robert Kern robert.kern at gmail.com "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco

From ndarray at mac.com Mon Apr 24 18:26:07 2006
From: ndarray at mac.com (Sasha)
Date: Mon Apr 24 18:26:07 2006
Subject: [Numpy-discussion] Broadcasting rules (Ticket 76).
In-Reply-To: <444D4329.9050700@ieee.org>
References: <444D0DF7.2060307@ieee.org> <444D4329.9050700@ieee.org>
Message-ID:

On 4/24/06, Travis Oliphant wrote:
> Sasha wrote:
> >>>> x[:]=1,1
> >>>> x[:]=1,1,1
> >>>>
> > Traceback (most recent call last):
> >   File "", line 1, in ?
> > ValueError: number of elements in destination must be integer multiple
> > of number of elements in source
> >
> I think the only reasonable thing to do is to raise an error unless the
> shapes were compatible like Numeric did and eliminate the multiple
> copying feature.

I've attached a patch to the ticket:

I don't see why slice assignment cannot reuse the ufunc code. It looks like slice assignment can just be dispatched to a trivial (pass-through) ufunc. This approach may even prove to be faster because type-aware copying loops can be faster than memmove on popular platforms.

From robert.kern at gmail.com Mon Apr 24 19:39:02 2006
From: robert.kern at gmail.com (Robert Kern)
Date: Mon Apr 24 19:39:02 2006
Subject: [Numpy-discussion] Re: Changing the Trac authentication, for real this time!
In-Reply-To: <444D7458.3020402@gmail.com>
References: <444D7458.3020402@gmail.com>
Message-ID: <444D8BA2.1080407@gmail.com>

I hate computers. It's still not done.

-- Robert Kern robert.kern at gmail.com "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco

From stephen.walton at csun.edu Mon Apr 24 20:49:03 2006
From: stephen.walton at csun.edu (Stephen Walton)
Date: Mon Apr 24 20:49:03 2006
Subject: [Numpy-discussion] Re: Problems building numpy
In-Reply-To: References: <02801766-7F45-48EE-AD4A-7B4B0590C9AC@sandia.gov>
Message-ID: <444D9C0C.3030006@csun.edu>

Robert Kern wrote:

>Rick Muller wrote:
>
>>That being said, I just had my first system where Numpy failed to
>>build. It's on a redhat 7.3 (yes, we have a 7.3 box. I didn't believe
>>it either. not my decision.) and I get the following error when trying
>>to run Numpy:
>>
>This usually means that you are not linking in the g2c library.
> > On Redhat 7.3, I don't believe there was a g2c library, but an f2c one. So -lf2c is needed at the link step (and f2c needs to be installed). From robert.kern at gmail.com Mon Apr 24 20:54:02 2006 From: robert.kern at gmail.com (Robert Kern) Date: Mon Apr 24 20:54:02 2006 Subject: [Numpy-discussion] Re: Problems building numpy In-Reply-To: <444D9C0C.3030006@csun.edu> References: <02801766-7F45-48EE-AD4A-7B4B0590C9AC@sandia.gov> <444D9C0C.3030006@csun.edu> Message-ID: Stephen Walton wrote: > Robert Kern wrote: > >> Rick Muller wrote: >> >>> That being said, I just had my first system where Numpy failed to >>> build. It's on a redhat 7.3 (yes, we have a 7.3 box. I didn't believe >>> it either. not my decision.) and I get the following error when trying >>> to run Numpy: >> >> This usually means that you are not linking in the g2c library. >> > On Redhat 7.3, I don't believe there was a g2c library, but an f2c one. > So -lf2c is needed at the link step (and f2c needs to be installed). Well, there's libf2c which is a library provided by f2c, a program that converts FORTRAN to C. And then there's libg2c which is provided by g77. They really are different and, I don't think, interchangeable. Note that libg2c will be stuck several ellipses down in the bowels of /usr/lib/gcc/.../.../libg2c.a not up in /usr/lib/. -- Robert Kern robert.kern at gmail.com "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From stephen.walton at csun.edu Mon Apr 24 21:09:01 2006 From: stephen.walton at csun.edu (Stephen Walton) Date: Mon Apr 24 21:09:01 2006 Subject: [Numpy-discussion] Re: Problems building numpy In-Reply-To: References: <02801766-7F45-48EE-AD4A-7B4B0590C9AC@sandia.gov> <444D9C0C.3030006@csun.edu> Message-ID: <444DA0A5.80902@csun.edu> Robert Kern wrote: >Well, there's libf2c which is a library provided by f2c, a program that converts >FORTRAN to C. And then there's libg2c which is provided by g77. They really are >different > Oh, I knew that. My point was that there were some old Redhat releases (I don't recall if 7.3 is that old, probably not) which didn't include g77, just an f77 shell script which called f2c and cc. From robert.kern at gmail.com Mon Apr 24 21:14:01 2006 From: robert.kern at gmail.com (Robert Kern) Date: Mon Apr 24 21:14:01 2006 Subject: [Numpy-discussion] Re: Problems building numpy In-Reply-To: <444DA0A5.80902@csun.edu> References: <02801766-7F45-48EE-AD4A-7B4B0590C9AC@sandia.gov> <444D9C0C.3030006@csun.edu> <444DA0A5.80902@csun.edu> Message-ID: Stephen Walton wrote: > Robert Kern wrote: > >> Well, there's libf2c which is a library provided by f2c, a program >> that converts >> FORTRAN to C. And then there's libg2c which is provided by g77. They >> really are >> different > > Oh, I knew that. My point was that there were some old Redhat releases > (I don't recall if 7.3 is that old, probably not) which didn't include > g77, just an f77 shell script which called f2c and cc. Oy. I'm not sure if even we support that. -- Robert Kern robert.kern at gmail.com "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." 
-- Umberto Eco

From rob at hooft.net Mon Apr 24 21:25:01 2006
From: rob at hooft.net (Rob Hooft)
Date: Mon Apr 24 21:25:01 2006
Subject: [Numpy-discussion] Re: Problems building numpy
In-Reply-To: <444DA0A5.80902@csun.edu>
References: <02801766-7F45-48EE-AD4A-7B4B0590C9AC@sandia.gov> <444D9C0C.3030006@csun.edu> <444DA0A5.80902@csun.edu>
Message-ID: <444DA473.2010000@hooft.net>

Stephen Walton wrote:
| Robert Kern wrote:
|
|> Well, there's libf2c which is a library provided by f2c, a program
|> that converts
|> FORTRAN to C. And then there's libg2c which is provided by g77. They
|> really are
|> different
|
| Oh, I knew that. My point was that there were some old Redhat releases
| (I don't recall if 7.3 is that old, probably not) which didn't include
| g77, just an f77 shell script which called f2c and cc.

And in addition, very old versions of g77 (I'm not sure to which RedHat version this age corresponds) used f2c's library unmodified. I think the f2c/cc times (the compiler script was called fcomp?) were a bit older. I moved back to my current job with RedHat 4.x (1997), and I worked with self-compiled g77 already in my previous job....

Rob
--
Rob W.W. Hooft || rob at hooft.net || http://www.hooft.net/people/rob/

From oliphant.travis at ieee.org Mon Apr 24 21:31:02 2006
From: oliphant.travis at ieee.org (Travis Oliphant)
Date: Mon Apr 24 21:31:02 2006
Subject: [Numpy-discussion] Broadcasting rules (Ticket 76).
In-Reply-To:
References: <444D0DF7.2060307@ieee.org> <444D4329.9050700@ieee.org>
Message-ID: <444DA5D4.4080104@ieee.org>

Sasha wrote:
> On 4/24/06, Travis Oliphant wrote:
>
>> Sasha wrote:
>>
>>>>>> x[:]=1,1
>>>>>> x[:]=1,1,1
>>>>>>
>>> Traceback (most recent call last):
>>> File "", line 1, in ?
>>> ValueError: number of elements in destination must be integer multiple
>>> of number of elements in source
>>>
>> I think the only reasonable thing to do is to raise an error unless the
>> shapes were compatible like Numeric did and eliminate the multiple
>> copying feature.
>>
>
> I've attached a patch to the ticket:
>
> I don't see why slice assignment cannot reuse the ufunc code. It
> looks like slice assignment can just be dispatched to a trivial
> (pass-through) ufunc. This approach may even prove to be faster
> because type-aware copying loops can be faster than memmove on popular
> platforms.
>
It could re-use that code but there are at least two drawbacks to that approach:

1) The overhead of the ufunc for small array copies.

2) The special-casing that would be needed for variable-size arrays (string, unicode, void...) which are not supported by the ufunc machinery, and we've already improved the copying by making them type-aware.

Right now copying is handled by the data-type functions (not the ufuncs). Perhaps what should be done instead is to allow for strided copying in the copyswapn function. To fully support record arrays with object components the copy operation for the VOID case needs to be recursive when fields are defined.

-Travis

From oliphant.travis at ieee.org Mon Apr 24 22:00:02 2006
From: oliphant.travis at ieee.org (Travis Oliphant)
Date: Mon Apr 24 22:00:02 2006
Subject: [Numpy-discussion] Broadcasting rules (Ticket 76).
In-Reply-To:
References: <444D0DF7.2060307@ieee.org> <444D4329.9050700@ieee.org>
Message-ID: <444DACB8.50203@ieee.org>

Sasha wrote:
> On 4/24/06, Travis Oliphant wrote:
>
> I've attached a patch to the ticket:
>
I don't think the patch will do your definition of "the right thing" (i.e. mirror broadcasting behavior) in all cases. For example if "a" is 2x3x4x5 and "b" is 2x1x1x5, then a[...] = b will not fill the right sub-space of "a" with the contents of "b".

The PyArray_CopyInto gets called in a lot of places. Have you checked all of them to be sure that altering the semantics of copying (which are currently different than broadcasting) will work correctly? I agree that one can demonstrate a slight inconsistency. But, I'd rather have the inconsistency and tell people that copying and assignment is not a broadcasting ufunc, than feign consistency and have it not quite right.

-Travis

From robert.kern at gmail.com Mon Apr 24 22:22:03 2006
From: robert.kern at gmail.com (Robert Kern)
Date: Mon Apr 24 22:22:03 2006
Subject: [Numpy-discussion] Re: [SciPy-dev] Google Summer of Code
In-Reply-To: <44476AEA.7080003@decsai.ugr.es>
References: <44476AEA.7080003@decsai.ugr.es>
Message-ID: <444DB033.4000906@gmail.com>

[Cross-posted because this is partially an announcement. Continuing discussion should go to only one list, please.]

Antonio Arauzo Azofra wrote:
> Google Summer of Code
> http://code.google.com/soc/
>
> Have you considered participating as a Mentoring organization? Offering
> any project about Scipy?

I'm not sure which "you" you are referring to here, but yes! Unfortunately, it was a bit late in the process to be applying as a mentoring organization. Google started consolidating mentoring organizations. However, I and several others at Enthought are volunteering to mentor through the PSF. I encourage others on these lists to do the same or to apply as students, whichever is appropriate. We'll be happy to provide SVN workspace for numpy and scipy SoC projects.

I've added one fairly general scipy entry to the python.org Wiki page listing project ideas:

http://wiki.python.org/moin/SummerOfCode

If you have more specific ideas, please add them to the Wiki.

Potential mentors: Neal Norwitz is coordinating PSF mentors this year and has asked that those he or Guido does not know personally give personal references. If you've been active on this list, I'm sure we can play the "Two Degrees of Separation From Guido Game" and get you a reference from someone else here.

--
Robert Kern
robert.kern at gmail.com
"I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth."
-- Umberto Eco

From oliphant.travis at ieee.org Mon Apr 24 22:27:02 2006
From: oliphant.travis at ieee.org (Travis Oliphant)
Date: Mon Apr 24 22:27:02 2006
Subject: [Numpy-discussion] Broadcasting rules (Ticket 76).
In-Reply-To: <444DACB8.50203@ieee.org>
References: <444D0DF7.2060307@ieee.org> <444D4329.9050700@ieee.org> <444DACB8.50203@ieee.org>
Message-ID: <444DB302.30903@ieee.org>

Travis Oliphant wrote:
> Sasha wrote:
>> On 4/24/06, Travis Oliphant wrote:
>> I've attached a patch to the ticket:
>>
> I don't think the patch will do your definition of "the right thing"
> (i.e. mirror broadcasting behavior) in all cases. For example if "a"
> is 2x3x4x5 and "b" is 2x1x1x5, then a[...] = b will not fill the
> right sub-space of "a" with the contents of "b".
>
> The PyArray_CopyInto gets called in a lot of places. Have you checked
> all of them to be sure that altering the semantics of copying (which
> are currently different than broadcasting) will work correctly? I
> agree that one can demonstrate a slight inconsistency. But, I'd
> rather have the inconsistency and tell people that copying and
> assignment is not a broadcasting ufunc, than feign consistency and
> have it not quite right.
>
Of course, as I've said I'm not opposed to the consistency. To do it "right", one should use PyArray_MultiIterNew which abstracts the concept of broadcasting into iterators (and uses the broadcastable checking code that's already written --- so you guarantee consistency). I'm not sure what overhead it would bring. But, special cases could be checked-for (scalar, and same-size arrays for example). I'm also thinking that copyswapn should grow stride arguments so that it can be used more generally.

-Travis

From lroubeyrie at limair.asso.fr Tue Apr 25 00:39:04 2006
From: lroubeyrie at limair.asso.fr (Lionel Roubeyrie)
Date: Tue Apr 25 00:39:04 2006
Subject: [Numpy-discussion] equality with masked object
Message-ID: <200604250938.48648.lroubeyrie@limair.asso.fr>

Hi all,
I have a problem with masked_object (and masked_values too), like in this short example:
###########################################
lionel[Données]8>test=array([1,2,3,inf,5])
lionel[Données]9>test = ma.masked_object(test, inf)
lionel[Données]10>print test[3], type(test[3])
--
lionel[Données]11>print test.max(), type(test.max())
5.0
lionel[Données]12>test[3] == test.max()
Sortie[12]: array(data = [True], mask = True, fill_value=?)
###########################################
Why does 5.0 == -- return True? Is a float the same as a masked object?
thanks

--
Lionel Roubeyrie - lroubeyrie at limair.asso.fr
LIMAIR
http://www.limair.asso.fr

From nicolas.chauvat at logilab.fr Tue Apr 25 03:22:15 2006
From: nicolas.chauvat at logilab.fr (Nicolas Chauvat)
Date: Tue Apr 25 03:22:15 2006
Subject: [Numpy-discussion] announce: pyjit, a little jit for creating numpy ufuncs
In-Reply-To:
References: <20060421162336.42285837.simon@arrowtheory.com>
Message-ID: <20060425102134.GI24645@crater.logilab.fr>

On Mon, Apr 24, 2006 at 04:17:16PM -0400, David M. Cooke wrote:
> Simon Burton writes:
>
> > Hi,
> >
> > Inspired by numexpr, pypy and llvm, i've built a simple
> > JIT for creating numpy "ufuncs" (they are not yet real ufuncs).
> > It uses llvm[1] as the backend machine code generator.
>
> Cool! I had a look at LLVM, but I wanted something to go into SciPy,
> and that was too heavy a dependence. However, I could see doing more
> stuff with this than I can easily with numexpr.

Hello,

People interested in this might also be interested in PyPy's rctypes and the exploratory work done in PyPy to annotate code using arrays. The goal is "write Python code using numeric arrays and other C libs, then ask PyPy to translate it to C while removing the python wrapper of the C libs, then compile". Then you can run the code as python code when developing and compile the whole thing from C to assembly when speed matters.

Please note it is a goal. We are not there yet.
But any help will be welcome :)

--
Nicolas Chauvat

logilab.fr - services en informatique avancée et gestion de connaissances

From steffen.loeck at gmx.de Tue Apr 25 04:25:22 2006
From: steffen.loeck at gmx.de (Steffen Loeck)
Date: Tue Apr 25 04:25:22 2006
Subject: [Numpy-discussion] vectorize problem
Message-ID: <200604251324.42987.steffen.loeck@gmx.de>

Hello all,

I have a problem using scalar variables in a vectorized function:

from numpy import vectorize

def f(x):
    if x>0: return 1
    else: return 0

F = vectorize(f)

F(1)

gives the error message:
---------------------------------------------------------------------------
exceptions.AttributeError Traceback (most recent call last)

.../function_base.py in __call__(self, *args)
    619
    620         if self.nout == 1:
--> 621             return self.ufunc(*args).astype(self.otypes[0])
    622         else:
    623             return tuple([x.astype(c) for x, c in
zip(self.ufunc(*args), self.otypes)])

AttributeError: 'int' object has no attribute 'astype'

Is there any way to get vectorized functions working with scalars again?

Regards
Steffen

From ndarray at mac.com Tue Apr 25 06:17:13 2006
From: ndarray at mac.com (Sasha)
Date: Tue Apr 25 06:17:13 2006
Subject: [Numpy-discussion] equality with masked object
In-Reply-To: <200604250938.48648.lroubeyrie@limair.asso.fr>
References: <200604250938.48648.lroubeyrie@limair.asso.fr>
Message-ID:

On 4/25/06, Lionel Roubeyrie wrote:
>
> Why does 5.0 == -- return True? Is a float the same as a masked object?
> thanks

It does not. It returns ma.masked:

>>> test[3] is ma.masked
True

You should not access masked data - it makes no sense. The current behavior is historical and I don't really like it. Masked scalars are replaced by the ma.masked singleton in subscript operations to allow the a[i] is masked idiom. In my view it is not worth the trouble, but my suggestion to get rid of that feature was not met with much enthusiasm.

From ndarray at mac.com Tue Apr 25 06:59:07 2006
From: ndarray at mac.com (Sasha)
Date: Tue Apr 25 06:59:07 2006
Subject: [Numpy-discussion] Broadcasting rules (Ticket 76).
In-Reply-To: <444DACB8.50203@ieee.org>
References: <444D0DF7.2060307@ieee.org> <444D4329.9050700@ieee.org> <444DACB8.50203@ieee.org>
Message-ID:

On 4/25/06, Travis Oliphant wrote:
> Sasha wrote:
> > On 4/24/06, Travis Oliphant wrote:
> >
> > I've attached a patch to the ticket:
> >
> I don't think the patch will do your definition of "the right thing"
> (i.e. mirror broadcasting behavior) in all cases. For example if "a" is
> 2x3x4x5 and "b" is 2x1x1x5, then a[...] = b will not fill the right
> sub-space of "a" with the contents of "b".
>

You are right, but it is not the fault of my code. My code checks shapes correctly, but the code that follows does not implement broadcasting. I did not realize that. This also explains why we disagreed on whether slice assignment is the same as broadcasting before.

> The PyArray_CopyInto gets called in a lot of places. Have you checked
> all of them to be sure that altering the semantics of copying (which are
> currently different than broadcasting) will work correctly? I agree
> that one can demonstrate a slight inconsistency. But, I'd rather have
> the inconsistency and tell people that copying and assignment is not a
> broadcasting ufunc, than feign consistency and have it not quite right.
>

That's why I would rather use an identity ufunc for slice assignment instead of PyArray_CopyInto.
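A minimal sketch of the semantics in question (the 2x3x4x5 and 2x1x1x5 shapes reuse Travis's example above; the values are invented): a ufunc operation stretches the size-1 axes of "b" across the matching sub-spaces of "a", which is the behaviour slice assignment would share if it were dispatched to a pass-through ufunc:

>>> import numpy
>>> a = numpy.zeros((2, 3, 4, 5))
>>> b = numpy.ones((2, 1, 1, 5))
>>> a += b      # ufunc broadcasting: b is repeated along axes 1 and 2
>>> a.sum()     # all 2*3*4*5 = 120 elements were filled
120.0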
From perry at stsci.edu Tue Apr 25 08:21:02 2006
From: perry at stsci.edu (Perry Greenfield)
Date: Tue Apr 25 08:21:02 2006
Subject: [Numpy-discussion] Re: Backporting numpy to Python 2.2
In-Reply-To:
References: <20060419103554.4ac1df4a.twegener@radlogic.com.au>
Message-ID: <93BC9AD0-A6CA-4128-B0EE-9999F4CE8077@stsci.edu>

On Apr 24, 2006, at 8:38 PM, Travis E. Oliphant wrote:

> Tim Wegener wrote:
>> Hi, I am attempting to backport numpy-0.9.6 to be compatible with
>> python 2.2. (Some of our machines run python 2.2 as part of Red
>> Hat 9 and Red Hat 7.3 and it is hazardous to alter the standard
>> setup.) I was able to change most of the 2.3-isms to be 2.2
>> compatible (see the attached patch). However I had problems
>> compiling the following c module:
>
> I targeted Python 2.3 because it added some very nice constructs
> (Python 2.4 added even more but I disciplined myself not to use them).
>
> I think it is not impossible to back-port it to Python 2.2 but I
> agree with Robert that I wonder if it is worth the effort.
>
> In this case Python 2.3 added the bool type which is used in NumPy.
> Basically this type would have to be constructed (the code could be
> grabbed from Python 2.3) in order to be used.
>
> The addition of the boolean type is probably the single biggest
> change that would make back-porting to 2.2 difficult.

If I recall correctly, True and False were added in one of the 2.2 patch releases (one of those rare new features added in a patch release). Only as constant definitions using 0 and 1, and not the current boolean implementation. So depending on what the current dependencies on booleans are, it may or may not be usable from 2.2.3. But I also wonder if it is worth the effort. I tend to think not.

Perry

From ndarray at mac.com Tue Apr 25 10:27:10 2006
From: ndarray at mac.com (Sasha)
Date: Tue Apr 25 10:27:10 2006
Subject: [Numpy-discussion] Question about __array_struct__
Message-ID:

I am trying to add __array_struct__ attribute to R object wrappers in RPy. This is desirable because it eliminates a compile-time dependency on an array module and makes the binary compatible with either Numeric or numpy.

R has four types of data: logical, integer, float, and character. The first three map perfectly to Numpy with inter->data simply pointing to an appropriate internal memory area. The character type, however, is more problematic. In R character arrays are arrays of variable length strings and therefore similar to Numpy object arrays holding python strings. Obviously, there is no memory area that can be reused. I've tried to allocate new memory in the __array_struct__ getter, but this presents a problem: I cannot deallocate that memory in the CObject destructor because it is passed to the newly created array which lives long after the interface object is deleted.
The __array_struct__ mechanism does not seem to allow making the new array assume ownership of the data, but even if it did, I do not know what memory allocator is appropriate.

The only solution that I can think of is to create a dummy buffer type with the sole purpose of deleting an array of PyObjects and make an instance of that type the "base" of the new array.

Can anyone suggest a better approach?

From strawman at astraw.com Tue Apr 25 10:52:08 2006
From: strawman at astraw.com (Andrew Straw)
Date: Tue Apr 25 10:52:08 2006
Subject: [Numpy-discussion] Question about __array_struct__
In-Reply-To:
References:
Message-ID: <444E619C.6030802@astraw.com>

Sasha wrote:

>I cannot deallocate that memory in CObject destructor because it is
>passed to the newly created array which lives long after the interface
>object is deleted.
>
Normally, the array that's viewing the data held by the __array_struct__ should keep a reference to the base object alive, thus preventing the issue. If the base object isn't a Python object, you'll have to create some kind of Python type that will ensure the original data is not freed, although this would normally take place via refcounts if the data source was a Python object.

> The __array_struct__ mechanism does not seem to
>allow making the new array assume ownership of the data, but even if
>it did, I do not know what memory allocator is appropriate.
>
>The only solution that I can think of is to create a dummy buffer type
>with the sole purpose of deleting an array of PyObjects and make an
>instance of that type the "base" of the new array.
>
Yes, that's what I do. (See http://www.scipy.org/Cookbook/ArrayStruct_and_Pyrex for example.)

From fullung at gmail.com Tue Apr 25 14:16:06 2006
From: fullung at gmail.com (Albert Strasheim)
Date: Tue Apr 25 14:16:06 2006
Subject: [Numpy-discussion] SWIG wrappers: Inplace arrays
Message-ID: <006b01c668ad$68b12ab0$0502010a@dsp.sun.ac.za>

Hello all

I am using the SWIG Numpy typemaps to wrap some C code. I ran into the following problem when wrapping a function with INPLACE_ARRAY1.

In Python, I create the following array:

x = array([], dtype='<i4')

When this is passed to the C function expecting an int*, it goes via obj_to_array_no_conversion in numpy.i where a direct comparison of the typecodes is done, at which point a TypeError is raised.

In this case:

desired type = int [typecode 5]
actual type = long [typecode 7]

The typecode is obtained as follows:

#define array_type(a) (int)(((PyArrayObject *)a)->descr->type_num)

Given that I created the array with '<i4', I expected type_num to be int instead of long. Why isn't this happening?

Assuming there is a good reason for type_num being what it is, I think obj_to_array_no_conversion needs to be slightly cleverer about the conversions it allows. Is there any way to figure out that int and long are actually identical (at least on my system) using the Numpy C API? Any other suggestions or comments for solving this problem?

From tim.hochberg at cox.net Tue Apr 25 14:24:05 2006
From: tim.hochberg at cox.net (tim.hochberg at cox.net)
Date: Tue Apr 25 14:24:05 2006
Subject: [Numpy-discussion] Broadcasting rules (Ticket 76).
Message-ID: <4050236.1146000212838.JavaMail.root@fed1wml05.mgt.cox.net>

---- Travis Oliphant wrote:
> Sasha wrote:
> > In this category, I would suggest to allow broadcasting to any
> > multiple of the dimension even if the dimension is not 1. I don't see
> > what makes 1 so special.
> >
> What's so special about 1 is that the code for it is relatively
> straightforward and already implemented using strides. Altering the
> code to allow any multiple of the dimension would be harder and slower.

It also does the right thing most of the time and is easy to understand. It's my expectation that opening up broadcasting will be more effective in masking errors than in enabling useful new behaviour.

I think that's my ticket being discussed here. If so, it was motivated by a case that stopped working because the looser broadcasting behaviour was preventing some other broadcasting from taking place. I'm not home right now, so I can't provide details; I'll do that on Thursday.

Just keep in mind that it's much easier to keep the broadcasting rules restrictive for now and loosen them up later than to try to tighten them up later if loosening them up turns out to not be a good idea.

-tim

From oliphant at ee.byu.edu Tue Apr 25 15:55:04 2006
From: oliphant at ee.byu.edu (Travis Oliphant)
Date: Tue Apr 25 15:55:04 2006
Subject: [Numpy-discussion] SWIG wrappers: Inplace arrays
In-Reply-To: <006b01c668ad$68b12ab0$0502010a@dsp.sun.ac.za>
References: <006b01c668ad$68b12ab0$0502010a@dsp.sun.ac.za>
Message-ID: <444EA88B.4050704@ee.byu.edu>

Albert Strasheim wrote:

>Hello all
>
>I am using the SWIG Numpy typemaps to wrap some C code. I ran into the
>following problem when wrapping a function with INPLACE_ARRAY1.
>
>In Python, I create the following array:
>
>x = array([], dtype='<i4')
>
>When this is passed to the C function expecting an int*, it goes via
>obj_to_array_no_conversion in numpy.i where a direct comparison of the
>typecodes is done, at which point a TypeError is raised.
>
>In this case:
>
>desired type = int [typecode 5]
>actual type = long [typecode 7]
>
>The typecode is obtained as follows:
>
>#define array_type(a) (int)(((PyArrayObject *)a)->descr->type_num)
>
>Given that I created the array with '<i4', I expected type_num to be
>int instead of long. Why isn't this happening?
>
Actually there is ambiguity: 'i4' can be either int or long. If you want to guarantee an int type then use dtype=intc.

>Assuming there is a good reason for type_num being what it is, I think
>obj_to_array_no_conversion needs to be slightly cleverer about the
>conversions it allows. Is there any way to figure out that int and long are
>actually identical (at least on my system) using the Numpy C API? Any other
>suggestions or comments for solving this problem?
>
Yes. You can use one of

PyArray_EquivTypes(PyArray_Descr *dtype1, PyArray_Descr *dtype2)
PyArray_EquivTypenums(int typenum1, int typenum2)
PyArray_EquivArrTypes(PyObject *array1, PyObject *array2)

These return TRUE (non-zero) if the two type representations are equivalent.

-Travis

From oliphant at ee.byu.edu Tue Apr 25 16:07:05 2006
From: oliphant at ee.byu.edu (Travis Oliphant)
Date: Tue Apr 25 16:07:05 2006
Subject: [Numpy-discussion] SWIG wrappers: Inplace arrays
In-Reply-To: <006b01c668ad$68b12ab0$0502010a@dsp.sun.ac.za>
References: <006b01c668ad$68b12ab0$0502010a@dsp.sun.ac.za>
Message-ID: <444EAB81.3070001@ee.byu.edu>

Albert Strasheim wrote:

>Hello all
>
>I am using the SWIG Numpy typemaps to wrap some C code. I ran into the
>following problem when wrapping a function with INPLACE_ARRAY1.
>
>In Python, I create the following array:
>
>x = array([], dtype='<i4')
>
>When this is passed to the C function expecting an int*, it goes via
>obj_to_array_no_conversion in numpy.i where a direct comparison of the
>typecodes is done, at which point a TypeError is raised.
>
>In this case:
>
>desired type = int [typecode 5]
>actual type = long [typecode 7]
>
>The typecode is obtained as follows:
>
>#define array_type(a) (int)(((PyArrayObject *)a)->descr->type_num)
>
>Given that I created the array with '<i4', I expected type_num to be
>int instead of long. Why isn't this happening?
>
>Assuming there is a good reason for type_num being what it is, I think
>obj_to_array_no_conversion needs to be slightly cleverer about the
>conversions it allows. Is there any way to figure out that int and long are
>actually identical (at least on my system) using the Numpy C API? Any other
>suggestions or comments for solving this problem?
>
Here is the relevant new numpy.i code (just checked in...)

PyArrayObject* obj_to_array_no_conversion(PyObject* input, int typecode) {
    PyArrayObject* ary = NULL;
    if (is_array(input) &&
        (typecode == PyArray_NOTYPE ||
         PyArray_EquivTypenums(array_type(input), typecode))) {
        ary = (PyArrayObject*) input;
    }
    else if (is_array(input)) {
        char* desired_type = typecode_string(typecode);
        char* actual_type = typecode_string(array_type(input));
        PyErr_Format(PyExc_TypeError,
                     "Array of type '%s' required. Array of type '%s' given",
                     desired_type, actual_type);
        ary = NULL;
    }
    else {
        char* desired_type = typecode_string(typecode);
        char* actual_type = pytype_string(input);
        PyErr_Format(PyExc_TypeError,
                     "Array of type '%s' required. A %s was given",
                     desired_type, actual_type);
        ary = NULL;
    }
    return ary;
}

From ndarray at mac.com Tue Apr 25 18:17:04 2006
From: ndarray at mac.com (Sasha)
Date: Tue Apr 25 18:17:04 2006
Subject: [Numpy-discussion] Broadcasting rules (Ticket 76).
In-Reply-To: <4050236.1146000212838.JavaMail.root@fed1wml05.mgt.cox.net>
References: <4050236.1146000212838.JavaMail.root@fed1wml05.mgt.cox.net>
Message-ID:

On 4/25/06, tim.hochberg at cox.net wrote:
>
> ---- Travis Oliphant wrote:
> > Sasha wrote:
> > > In this category, I would suggest to allow broadcasting to any
> > > multiple of the dimension even if the dimension is not 1. I don't see
> > > what makes 1 so special.
> > >
> > What's so special about 1 is that the code for it is relatively
> > straightforward and already implemented using strides. Altering the
> > code to allow any multiple of the dimension would be harder and slower.

I don't think so. The same zero-stride trick that allows size-1 broadcasting can be used to implement repetition. I did not review the C code, but the following Python fragment shows that the loop that is already in numpy can be used to implement repetition by simply manipulating shapes and strides:

>>> x = zeros(6)
>>> reshape(x,(3,2))[...] = 1,2
>>> x
array([1, 2, 1, 2, 1, 2])

> It also does the right thing most of the time and is easy to understand.

Easy to understand? Let me quote Travis' book on this: "Broadcasting can be understood by four rules: ... While perhaps somewhat difficult to explain, broadcasting can be quite useful and becomes second nature rather quickly." I may be slow, but it did not become second nature for me. I am still getting bitten by subtle differences between unit length 1-d arrays and 0-d arrays.

> It's my expectation that opening up broadcasting will be more effective in masking
> errors than in enabling useful new behaviour.
>
In my experience broadcasting length-1 and not broadcasting other lengths is very error prone as it is. I understand that restricting broadcasting to make it a strictly dimension-increasing operation is not possible for two reasons:

1. Numpy cannot break legacy Numeric code.
2. It is not possible to differentiate between a 1-d array that broadcasts column-wise vs. one that broadcasts row-wise.

In my view none of these reasons is valid. In my experience Numeric code that relies on dimension-preserving broadcasting is already broken, only in a subtle and hard to reproduce way. Similarly the need to broadcast over a non-leading dimension is a sign of bad design. In rare cases where such broadcasting is desirable, it can be easily done via swapaxes, which is a cheap operation. Nevertheless, I lost that battle some time ago.

On the other hand I don't see much problem in making dimension-preserving broadcasting more permissive. In R, for example, (1-d) arrays can be broadcast to arbitrary size. This has an additional benefit that 1-d to 2-d broadcasting requires no special code, it just happens because matrices inherit arithmetic from vectors. I've never had a problem with R rules being too loose.

> I think that's my ticket being discussed here. If so, it was motivated by a case that
> stopped working because the looser broadcasting behaviour was preventing some
> other broadcasting from taking place. I'm not home right now, so I can't provide
> details; I'll do that on Thursday.

In my view the problem that your ticket highlighted is not so much in the particular set of broadcasting rules, but in the fact that a[...] = b uses one set of rules while a[...] += b uses another. This is *very* confusing.

> Just keep in mind that it's much easier to keep the broadcasting rules restrictive for
> now and loosen them up later than to try to tighten them up later if loosening them up
> turns out to not be a good idea.

You are preaching to the choir!

From simon at arrowtheory.com Tue Apr 25 18:29:01 2006
From: simon at arrowtheory.com (Simon Burton)
Date: Tue Apr 25 18:29:01 2006
Subject: [Numpy-discussion] announce: pyjit, a little jit for creating numpy ufuncs
In-Reply-To:
References: <20060421162336.42285837.simon@arrowtheory.com>
Message-ID: <20060426112808.531d652b.simon@arrowtheory.com>

On Mon, 24 Apr 2006 16:17:16 -0400
cookedm at physics.mcmaster.ca (David M. Cooke) wrote:

>
> How do the speedups compare with numexpr?

numexpr segfaults for me (running timings.py):

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread -1209670912 (LWP 31768)]
0xb7d2b696 in PyArray_NewFromDescr (subtype=0x626e6769, descr=0x64007469, nd=1919251557, dims=0x656e696d, strides=0x782d2073, data=0x656c6520, flags=1953391981, obj=0x65736977) at arrayobject.c:3942
3942    arrayobject.c: No such file or directory.
        in arrayobject.c

Simon.

--
Simon Burton, B.Sc.
Licensed PO Box 8066
ANU Canberra 2601
Australia
Ph. 61 02 6249 6940
http://arrowtheory.com

From robert.kern at gmail.com Tue Apr 25 20:10:07 2006
From: robert.kern at gmail.com (Robert Kern)
Date: Tue Apr 25 20:10:07 2006
Subject: [Numpy-discussion] Chang*ed* the Trac authentication
Message-ID: <444EE463.10007@gmail.com>

Trying not to embarrass myself again, I made the changes without telling you. :-)

In order to create or modify Wiki pages or tickets on the NumPy and SciPy Tracs, you will have to be logged in. You can register yourself by clicking the "Register" link in the upper right-hand corner of the page.
Developers who previously had accounts have the same username/password as before. You can now change your password if you like. Only developers have the ability to close tickets, delete Wiki pages entirely, or create new ticket reports (and possibly a couple of other things). Developers, please enter your name and email by clicking on the "Settings" link up at top once logged in.

Thank you for your patience. If there are any problems, please email me, and I will try to correct them quickly.

--
Robert Kern
robert.kern at gmail.com
"I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth."
-- Umberto Eco

From oliphant.travis at ieee.org Tue Apr 25 22:26:01 2006
From: oliphant.travis at ieee.org (Travis Oliphant)
Date: Tue Apr 25 22:26:01 2006
Subject: [Numpy-discussion] Broadcasting rules (Ticket 76).
In-Reply-To:
References: <4050236.1146000212838.JavaMail.root@fed1wml05.mgt.cox.net>
Message-ID: <444F0420.9000500@ieee.org>

Sasha wrote:
> On 4/25/06, tim.hochberg at cox.net wrote:
>
>> ---- Travis Oliphant wrote:
>>
>>> Sasha wrote:
>>>
>>>> In this category, I would suggest to allow broadcasting to any
>>>> multiple of the dimension even if the dimension is not 1. I don't see
>>>> what makes 1 so special.
>>>>
>>> What's so special about 1 is that the code for it is relatively
>>> straightforward and already implemented using strides. Altering the
>>> code to allow any multiple of the dimension would be harder and slower.
>>>
>
> I don't think so. The same zero-stride trick that allows size-1
> broadcasting can be used to implement repetition. I did not review
> the C code, but the following Python fragment shows that the loop that
> is already in numpy can be used to implement repetition by simply
> manipulating shapes and strides:
>
I don't think anyone is fundamentally opposed to multiple repetitions. We're just being cautious. Also, as you've noted, the assignment code is currently not using the ufunc broadcasting code and so they really aren't the same thing, yet.

>
>> It's my expectation that opening up broadcasting will be more effective in masking
>> errors than in enabling useful new behaviour.
>>
> In my experience broadcasting length-1 and not broadcasting other
> lengths is very error prone as it is.

That's not been my experience. But, I don't know R very well. I'm very interested in what ideas you can bring.

> I understand that restricting
> broadcasting to make it a strictly dimension-increasing operation is
> not possible for two reasons:
>
> 1. Numpy cannot break legacy Numeric code.
> 2. It is not possible to differentiate between 1-d array that
> broadcasts column-wise vs. one that broadcasts row-wise.
>
> In my view none of these reasons is valid. In my experience Numeric
> code that relies on dimension-preserving broadcasting is already
> broken, only in a subtle and hard to reproduce way.

I definitely don't agree with you here. Dimension-preserving broadcasting is at the heart of the utility of broadcasting and it is very, very useful for that. Calling it subtly broken suggests that you don't understand it and have never used it for its intended purpose. I've used dimension-preserving broadcasting literally hundreds of times. It's rather bold of you to say that all of that code is "broken".

Now, I'm sure there are other useful ways to "broadcast", but dimension-preserving is essentially what broadcasting *is* in NumPy.
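To make the dimension-preserving case concrete, a minimal sketch (the data is invented): an array of shape (3, 1) is stretched along its size-1 axis against an array of shape (3, 2), without changing the number of dimensions:

>>> import numpy
>>> x = numpy.arange(6.).reshape(3, 2)
>>> row_means = x.mean(axis=1).reshape(3, 1)   # same ndim as x, size-1 last axis
>>> x - row_means                              # (3, 1) stretched against (3, 2)
array([[-0.5,  0.5],
       [-0.5,  0.5],
       [-0.5,  0.5]])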
If anything it is the dimension-increasing rule that is somewhat arbitrary (e.g. why prepend with ones). Perhaps you want to introduce some other way for non-commensurate shapes to interact in an operation. I think you will find many open minds on this list (although probably not anyone who will want to code it up :-) ). We do welcome the discussion. Your experience with other array-like languages is helpful.

> Similarly the
> need to broadcast over non-leading dimension is a sign of bad design.
> In rare cases where such broadcasting is desirable, it can be easily
> done via swapaxes which is a cheap operation.
>
Again, it would help if you would refrain from using negative words about coding styles that are different from your own. Such broadcasting is not that rare. It happens quite frequently, actually. The point of a language like Python is that you can write algorithms simply without struggling with optimization questions up front like you seem to be hinting at.

> On the other hand I don't see much problem in making
> dimension-preserving broadcasting more permissive. In R, for example,
> (1-d) arrays can be broadcast to arbitrary size. This has an
> additional benefit that 1-d to 2-d broadcasting requires no special
> code, it just happens because matrices inherit arithmetic from
> vectors. I've never had a problem with R rules being too loose.
>
So, please explain exactly what you mean. Only a few on this list know what the R rules even are.

> In my view the problem that your ticket highlighted is not so much in
> the particular set of broadcasting rules, but in the fact that a[...]
> = b uses one set of rules while a[...] += b uses another. This is
> *very* confusing.
>
Yes, this is admittedly confusing. But, it's an outgrowth of the way Numeric code developed. Broadcasting was always only a ufunc concept in Numeric, and copying was not a ufunc. NumPy grew out of Numeric code. I was not trying to mimic broadcasting behavior when I wrote the array copy and array setting code. Perhaps I should have been.

I'm willing to change the code on this one, but only if the new copy code actually does implement broadcasting behavior equivalently. And going through the ufunc machinery is probably a waste of effort because the copy code must be written for variable length arrays anyway (and ufuncs don't support them). However, the broadcasting machinery has been abstracted in NumPy and can therefore be re-used in the copying code. In Numeric, broadcasting was basically implemented deep inside a confusing while loop.

-Travis

From fullung at gmail.com Tue Apr 25 23:42:05 2006
From: fullung at gmail.com (Albert Strasheim)
Date: Tue Apr 25 23:42:05 2006
Subject: [Numpy-discussion] SWIG wrappers: Passing NULL pointers or arrays
Message-ID: <00dd01c668fc$6d04b470$0502010a@dsp.sun.ac.za>

Hello all,

I'm currently wrapping a C library (libsvm) with NumPy.
libsvm has a few structs similar to the following:

struct svm_parameter {
    double* weight;
    int nr_weight;
};

In my SWIG wrapper I did the following:

struct svm_parameter {
    %immutable;
    int nr_weight;
    %mutable;
    double* weight;
    %extend {
        svm_parameter() {
            struct svm_parameter* param =
                (struct svm_parameter*) malloc(sizeof(struct svm_parameter));
            param->nr_weight = 0;
            param->weight = 0;
            return param;
        }
        ~svm_parameter() {
            free(self->weight);
            free(self);
        }
        void _set_weight(double* IN_ARRAY1, int DIM1) {
            free(self->weight);
            self->nr_weight = DIM1;
            self->weight = malloc(sizeof(double) * DIM1);
            if (!self->weight) {
                SWIG_exception(SWIG_MemoryError, "OOM");
            }
            memcpy(self->weight, IN_ARRAY1, sizeof(double) * DIM1);
            return;
        fail:
            self->nr_weight = 0;
            self->weight = 0;
        }
    }
};

This works pretty well (suggestions welcome though). However, one feature that I think is lacking from the current array typemaps is a way of passing NULL to the C function. On the Python side I want to be able to do:

svm_parameter.weight = N.array([1.0,2.0])

or

svm_parameter.weight = None

This heads off to __setattr__ where the following happens:

def __setattr__(self, attr, val):
    if attr in ['weight', 'weight_label']:
        set_func = getattr(self, '_set_%s' % (attr,))
        set_func(val)
    else:
        super(svm_parameter, self).__setattr__(attr, val)

At this point the typemap magic kicks in. However, passing a None doesn't work, because somewhere down the line somebody checks for the int argument. The current typemap looks like this:

%define TYPEMAP_IN1(type,typecode)
%typemap(in) (type* IN_ARRAY1, int DIM1)
             (PyArrayObject* array=NULL, int is_new_object) {
    int size[1] = {-1};
    array = obj_to_array_contiguous_allow_conversion($input, typecode, &is_new_object);
    if (!array || !require_dimensions(array,1) || !require_size(array,size,1))
        SWIG_fail;
    $1 = (type*) array->data;
    $2 = array->dimensions[0];
}
%typemap(freearg) (type* IN_ARRAY1, int DIM1) {
    if (is_new_object$argnum && array$argnum)
        Py_DECREF(array$argnum);
}
%enddef

I quickly hacked up the following typemap that seems to deal gracefully with a None passed instead of an array. Changed lines:

if ($input == Py_None) {
    is_new_object = 0;
    $1 = NULL;
    $2 = 0;
}
else {
    int size[1] = {-1};
    array = obj_to_array_contiguous_allow_conversion($input, typecode, &is_new_object);
    if (!array || !require_dimensions(array,1) || !require_size(array,size,1))
        SWIG_fail;
    $1 = (type*) array->data;
    $2 = array->dimensions[0];
}

Now I can write my set_weight function as follows:

void _set_weight(double* IN_ARRAY1, int DIM1) {
    free(self->weight);
    self->weight = 0;
    self->nr_weight = DIM1;
    if (DIM1 > 0) {
        self->weight = malloc(sizeof(double) * DIM1);
        if (!self->weight) {
            SWIG_exception(SWIG_MemoryError, "OOM");
        }
        memcpy(self->weight, IN_ARRAY1, sizeof(double) * DIM1);
    }
    return;
fail:
    self->nr_weight = 0;
}

Does it make sense to add this to the typemaps? Any other comments? Are there better ways to accomplish this?
Regards, Albert

From arnd.baecker at web.de Wed Apr 26 00:52:01 2006
From: arnd.baecker at web.de (Arnd Baecker)
Date: Wed Apr 26 00:52:01 2006
Subject: [Numpy-discussion] vectorize problem
In-Reply-To: <200604251324.42987.steffen.loeck@gmx.de>
References: <200604251324.42987.steffen.loeck@gmx.de>
Message-ID:

Hi,

On Tue, 25 Apr 2006, Steffen Loeck wrote:

> Hello all,
>
> I have a problem using scalar variables in a vectorized function:
>
> from numpy import vectorize
>
> def f(x):
>     if x>0: return 1
>     else: return 0
>
> F = vectorize(f)
>
> F(1)
>
> gives the error message:
> ---------------------------------------------------------------------------
> exceptions.AttributeError Traceback (most recent call last)
>
> .../function_base.py in __call__(self, *args)
>     619
>     620         if self.nout == 1:
> --> 621             return self.ufunc(*args).astype(self.otypes[0])
>     622         else:
>     623             return tuple([x.astype(c) for x, c in
> zip(self.ufunc(*args), self.otypes)])
>
> AttributeError: 'int' object has no attribute 'astype'

Ouch - that's not nice - a lot of my code relies on the fact that (old scipy) vectorize happily eats scalars *and* arrays.

I am not familiar with the code of numpy.vectorize (which has indeed changed quite a bit compared to the old scipy.vectorize), but maybe it is only a simple change?

> Is there any way to get vectorized functions working with scalars again?

+1

(or is there a particular reason why "vectorized" functions should not be able to operate on scalars?)

Best, Arnd

From pgmdevlist at mailcan.com Wed Apr 26 01:06:04 2006
From: pgmdevlist at mailcan.com (Pierre GM)
Date: Wed Apr 26 01:06:04 2006
Subject: [Numpy-discussion] A python interface for loess ?
Message-ID: <200604260329.17115.pgmdevlist@mailcan.com>

Folks,
Would any of you be aware of a Python interface to the loess routines?

http://netlib.bell-labs.com/netlib/a/dloess.gz

I could use the R implementation through Rpy, but I would prefer to stick to Python...
Thanks a lot in advance
P.

From arnd.baecker at web.de Wed Apr 26 02:39:05 2006
From: arnd.baecker at web.de (Arnd Baecker)
Date: Wed Apr 26 02:39:05 2006
Subject: [Numpy-discussion] concatenate, doc-string
Message-ID:

Hi,

the doc-string of concatenate is pretty short:

numpy.concatenate?
Docstring:
    concatenate((a1,a2,...),axis=None).

Would the following be better:
"""
concatenate((a1, a2,...), axis=None) joins the tuple `(a1, a2, ...)` of
sequences (or arrays) into a single numpy array.

Example::

    print concatenate( ([0,1,2], [5,6,7]))
"""

((The ``(or arrays)`` could be omitted if sequences include arrays by default, though it might not be obvious to beginners ...))

I was also tempted to suggest a dtype argument,
concatenate( ([0,1,2], [5,6,7]), dtype=numpy.Float)
but I am not sure if that would be a good idea ...

Best, Arnd

From gnchen at cortechs.net Wed Apr 26 06:52:01 2006
From: gnchen at cortechs.net (Gennan Chen)
Date: Wed Apr 26 06:52:01 2006
Subject: [Numpy-discussion] SWIG for 3D array
Message-ID:

Hi!

I would like to use SWIG to wrap my code. However, it seems the current numpy.i can only map 1- and 2-D arrays, not 3-D. Is that correct? Or am I missing something here? I don't mind spending some time to do it like scipy.ndimage if numpy.i does not support N-D arrays. But I am new to writing extensions for Python, and I really have a hard time understanding how to deal with reference counting issues. Does anyone know where I can find a good reference for that? Or a simple example in numpy would be appreciated....
Gen-Nan Chen, PhD
Chief Scientist
Research and Development Group
CorTechs Labs Inc (www.cortechs.net)
1020 Prospect St., #304, La Jolla, CA, 92037
Tel: 1-858-459-9700 ext 16
Fax: 1-858-459-9705
Email: gnchen at cortechs.net

From oliphant.travis at ieee.org Wed Apr 26 10:05:01 2006
From: oliphant.travis at ieee.org (Travis Oliphant)
Date: Wed Apr 26 10:05:01 2006
Subject: [Numpy-discussion] vectorize problem
In-Reply-To:
References: <200604251324.42987.steffen.loeck@gmx.de>
Message-ID: <444FA7E7.2070303@ieee.org>

Arnd Baecker wrote:
> Hi,
>
> On Tue, 25 Apr 2006, Steffen Loeck wrote:
>
>> Hello all,
>>
>> I have a problem using scalar variables in a vectorized function:
>>
>> from numpy import vectorize
>>
>> def f(x):
>>     if x>0: return 1
>>     else: return 0
>>
>> F = vectorize(f)
>>
>> F(1)
>>
>> gives the error message:
>> ---------------------------------------------------------------------------
>> exceptions.AttributeError Traceback (most recent call last)
>>
>> .../function_base.py in __call__(self, *args)
>>     619
>>     620         if self.nout == 1:
>> --> 621             return self.ufunc(*args).astype(self.otypes[0])
>>     622         else:
>>     623             return tuple([x.astype(c) for x, c in
>> zip(self.ufunc(*args), self.otypes)])
>>
>> AttributeError: 'int' object has no attribute 'astype'
>>
>
> Ouch - that's not nice - a lot of my code relies the fact that (old
> scipy) vectorize happily eats scalars *and* arrays.
>
> I am not familiar with the code of numpy.vectorize (which has indeed
> changed quite a bit compared to the old scipy.vectorize),
> but maybe it is only a simple change?
>
It is just a simple change. Scalars are supposed to be supported. They aren't only as a side-effect of the switch to not return object-scalars. I did not update the vectorize code to handle the scalar return value from the object ufunc (which is now no-longer an object-scalar with the methods of arrays (like astype) but is instead the underlying object). I'll add a check.

-Travis

From cookedm at physics.mcmaster.ca Wed Apr 26 12:33:01 2006
From: cookedm at physics.mcmaster.ca (David M. Cooke)
Date: Wed Apr 26 12:33:01 2006
Subject: [Numpy-discussion] Chang*ed* the Trac authentication
In-Reply-To: <444EE463.10007@gmail.com> (Robert Kern's message of "Tue, 25 Apr 2006 22:09:23 -0500")
References: <444EE463.10007@gmail.com>
Message-ID:

Robert Kern writes:

> Trying not to embarass myself again, I made the changes without telling you. :-)
>
> In order to create or modify Wiki pages or tickets on the NumPy and SciPy Tracs,
> you will have to be logged in. You can register yourself by clicking the
> "Register" link in the upper right-hand corner of the page.
>
> Developers who previously had accounts have the same username/password as
> before. You can now change your password if you like. Only developers have the
> ability to close tickets, delete Wiki pages entirely, or create new ticket
> reports (and possibly a couple of other things). Developers, please enter your
> name and email by clicking on the "Settings" link up at top once logged in.
>
> Thank you for your patience.
If there are any problems, please email me, and I > will try to correct them quickly. Thanks Robert; I hope this helps with our spam problem to an extent. -- |>|\/|< /--------------------------------------------------------------------------\ |David M. Cooke http://arbutus.physics.mcmaster.ca/dmc/ |cookedm at physics.mcmaster.ca From cookedm at physics.mcmaster.ca Wed Apr 26 12:48:04 2006 From: cookedm at physics.mcmaster.ca (David M. Cooke) Date: Wed Apr 26 12:48:04 2006 Subject: [Numpy-discussion] concatenate, doc-string In-Reply-To: (Arnd Baecker's message of "Wed, 26 Apr 2006 11:38:26 +0200 (CEST)") References: Message-ID: Arnd Baecker writes: > Hi, > > the doc-string of concatentate is pretty short: > > numpy.concatenate? > Docstring: > concatenate((a1,a2,...),axis=None). > > Would the following be better: > """ > concatenate((a1, a2,...), axis=None) joins the tuple `(a1, a2, ...)` of > sequences (or arrays) into a single numpy array. > > Example:: > > print concatenate( ([0,1,2], [5,6,7])) > """ > > ((The ``(or arrays)`` could be omitted if sequences include array by > default, though it might not be obvious to beginners ...)) Here's what I just checked in: concatenate((a1, a2, ...), axis=None) joins arrays together The tuple of sequences (a1, a2, ...) are joined along the given axis (default is the first one) into a single numpy array. Example: >>> concatenate( ([0,1,2], [5,6,7]) ) array([0, 1, 2, 5, 6, 7]) > I was also tempted to suggest a dtype argument, > concatenate( ([0,1,2], [5,6,7]), dtype=numpy.Float) > but I am not sure if that would be a good idea ... Well, that would require more code, so I didn't do it :-) -- |>|\/|< /--------------------------------------------------------------------------\ |David M. Cooke http://arbutus.physics.mcmaster.ca/dmc/ |cookedm at physics.mcmaster.ca From arnd.baecker at web.de Wed Apr 26 14:03:02 2006 From: arnd.baecker at web.de (Arnd Baecker) Date: Wed Apr 26 14:03:02 2006 Subject: [Numpy-discussion] concatenate, doc-string In-Reply-To: References: Message-ID: On Wed, 26 Apr 2006, David M. Cooke wrote: > Arnd Baecker writes: > > > Hi, > > > > the doc-string of concatentate is pretty short: > > > > numpy.concatenate? > > Docstring: > > concatenate((a1,a2,...),axis=None). > > > > Would the following be better: > > """ > > concatenate((a1, a2,...), axis=None) joins the tuple `(a1, a2, ...)` of > > sequences (or arrays) into a single numpy array. > > > > Example:: > > > > print concatenate( ([0,1,2], [5,6,7])) > > """ > > > > ((The ``(or arrays)`` could be omitted if sequences include array by > > default, though it might not be obvious to beginners ...)) > > Here's what I just checked in: > > concatenate((a1, a2, ...), axis=None) joins arrays together > > The tuple of sequences (a1, a2, ...) are joined along the given axis > (default is the first one) into a single numpy array. > > Example: > > >>> concatenate( ([0,1,2], [5,6,7]) ) > array([0, 1, 2, 5, 6, 7]) Great - many thanks!! There are some further routines which might benefit from some more explanation/examples - so if you don't mind I will try to suggest some additions (I could check them in directly, I think, but as I am not a native speaker I feel better to post them here for review/improvement). > > I was also tempted to suggest a dtype argument, > > concatenate( ([0,1,2], [5,6,7]), dtype=numpy.Float) > > but I am not sure if that would be a good idea ... 
> > Well, that would require more code, so I didn't do it :-)

;-)

It might also be problematic when one of the sequence elements would not fit into the output type.

Best, Arnd

From ndarray at mac.com Wed Apr 26 14:18:06 2006
From: ndarray at mac.com (Sasha)
Date: Wed Apr 26 14:18:06 2006
Subject: [Numpy-discussion] Broadcasting rules (Ticket 76).
In-Reply-To: <444F0420.9000500@ieee.org>
References: <4050236.1146000212838.JavaMail.root@fed1wml05.mgt.cox.net> <444F0420.9000500@ieee.org>
Message-ID:

I would like to apologize up-front if anyone found my overly general arguments inappropriate. I did not intend to be critical of anyone's code or design other than my own. Any references to "bad design" or "broken code" are related to my own misguided attempts to use some of the Numeric features in the past. It turned out that dimension-preserving broadcasting was the wrong feature to use for a specific class of problems that I am dealing with most of the time. This does not mean that it cannot be used appropriately in other domains. I was wrong in posting overly general opinions without providing specific examples. I will try to do better in this post.

Before I do that, however, let me try to explain why I hold strong views on certain things. In my view the most appealing feature in Python is the Zen of Python, and in particular "There should be one-- and preferably only one --obvious way to do it." In my view Python represents the "hard science" approach, appealing to physics and math types, while Perl is more of a "soft science" language. (There is nothing wrong with either Perl or soft sciences.) This is what makes Python so appealing for scientific computing. Unfortunately, it is a fact of life that there are always many ways to solve the same problem, and a successful "pythonic" design has to pick one (preferably the best) of the possible ways and make it obvious.

This said, let me present a specific problem that I will use to illustrate my points below. Suppose we study school statistics in different cities. Let city A have 10 schools with 20 classes and 30 students in each. It is natural to organize the data collected about the students in a 10x20x30 array. It is also natural to collect some of the data at the per-school or per-class level. This data may come from aggregating student-level statistics (say an average test score) or from characteristics that are class- or school-specific (say the grade or primary language). There are two obvious ways to represent such data: 1) we can use 3-d arrays for everything and make the shape of the per-class array 10x20x1 and the shape of the per-school array 10x1x1; or 2) use 2-d and 1-d arrays.

The first approach seems to be more flexible. We can also have 10x1x30 or 1x1x30 arrays to represent data which varies along the student dimension, but is constant across schools or classes. However, this added benefit is illusory: the first student in one class list has no relationship to the first student in another class, so in this particular problem an average score of the first student across classes makes no sense (it will also depend on whether students are ordered alphabetically or by an achievement rank). On the other hand this approach has a very significant drawback: functions that process city data have no way to distinguish between per-school data and a lucky city that can afford educating its students in individual classes.
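A minimal sketch of the two representations (the shapes follow the toy example; the data is invented):

>>> import numpy
>>> scores = numpy.zeros((10, 20, 30))           # 10 schools x 20 classes x 30 students
>>> class_avg = scores.mean(axis=2)              # per-class aggregate, shape (10, 20)
>>> class_avg_3d = class_avg.reshape(10, 20, 1)  # the all-3-d variant, shape (10, 20, 1)
>>> (scores - class_avg_3d).shape                # dimension-preserving broadcast
(10, 20, 30)

Note that nothing in the shape (10, 20, 1) itself says whether the trailing 1 means "aggregated over students" or "a class with a single student"; that is the drawback just described.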
Just as it is extremely unlikely to have one student per class in our toy example, in real-world problems it is not unreasonable to assume that a dimension of size 1 represents aggregate data. Software designed around this assumption is what I would call broken in a subtle way. Please see more below.

On 4/26/06, Travis Oliphant wrote:
> Sasha wrote:
> > On 4/25/06, tim.hochberg at cox.net wrote:
> >
> >> ---- Travis Oliphant wrote:
> [...]
> I don't think anyone is fundamentally opposed to multiple repetitions. We're just being cautious. Also, as you've noted, the assignment code is currently not using the ufunc broadcasting code and so they really aren't the same thing, yet.

It looks like there is a lot of development in this area going on at the moment. Please let me know if I can help.

> [...]
> > In my experience broadcasting length-1 and not broadcasting other lengths is very error prone as it is.
>
> That's not been my experience.

I should have been more specific. As I explained above, the special properties of length-1 led me to design a system that distinguished aggregate data by testing for unit length. This system was subtly broken. In a rare case when the population had only one element, the system was producing wrong results.

> But, I don't know R very well. I'm very interested in what ideas you can bring.

R takes a very simple approach: everything is a vector. There are no scalars; if you need a scalar, you use a vector of length 1. Broadcasting is simply repetition:

> x <- rep(0,10)
> x + c(1,2)
 [1] 1 2 1 2 1 2 1 2 1 2

The length of the larger vector does not even need to be a multiple of the shorter, but in this case a warning is issued:

> x + c(1,2,3)
 [1] 1 2 3 1 2 3 1 2 3 1
Warning message:
longer object length is not a multiple of shorter object length in: x + c(1, 2, 3)

Multi-dimensional arrays are implemented by setting a "dim" attribute:

> dim(x) <- c(2,5)
> x
     [,1] [,2] [,3] [,4] [,5]
[1,]    0    0    0    0    0
[2,]    0    0    0    0    0

(R uses Fortran order). Broadcasting ignores the dim attribute, but does the right thing for conformable vectors:

> x + c(1,2)
     [,1] [,2] [,3] [,4] [,5]
[1,]    1    1    1    1    1
[2,]    2    2    2    2    2

However, the following is unfortunate:

> x + 1:5
     [,1] [,2] [,3] [,4] [,5]
[1,]    1    3    5    2    4
[2,]    2    4    1    3    5

> > I understand that restricting broadcasting to make it a strictly dimension-increasing operation is not possible for two reasons:
> >
> > 1. Numpy cannot break legacy Numeric code.
> > 2. It is not possible to differentiate between a 1-d array that broadcasts column-wise vs. one that broadcasts row-wise.
> >
> > In my view none of these reasons is valid. In my experience Numeric code that relies on dimension-preserving broadcasting is already broken, only in a subtle and hard to reproduce way.
>
> I definitely don't agree with you here. Dimension-preserving broadcasting is at the heart of the utility of broadcasting and it is very, very useful for that. Calling it subtly broken suggests that you don't understand it and have never used it for its intended purpose. I've used dimension-preserving broadcasting literally hundreds of times. It's rather bold of you to say that all of that code is "broken"

Sorry I was not specific in the original post. I hope you now understand where I come from. Can you point me to some examples of the correct way to use dimension-preserving broadcasting?
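Is something along these lines what you have in mind? This is only my guess at a typical idiom (normalizing the rows of a 2-d array while keeping the length-1 axis):

    import numpy as np

    x = np.random.rand(3, 5)
    row_sums = x.sum(axis=1)[:, np.newaxis]  # shape (3, 1): dimension preserved
    normalized = x / row_sums                # (3, 5) / (3, 1) stretches along axis 1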
I would assume that it is probably more useful in the problem domains where there is no natural ordering of the dimensions, unlike in the hierarchial data example that I used. > Now, I'm sure there are other useful ways to "broadcast", but > dimension-preserving is essentially what broadcasting *is* in NumPy. > If anything it is the dimension-increasing rule that is somewhat > arbitrary (e.g. why prepend with ones). > The dimension-increasing broadcasting is very natural when you deal with hierarchical data where various dimensions correspond to the levels of aggregation. As I explained above, average student score per class makes sense while the average score per student over classes does not. It is very common to combine per-class data with per-student data by broadcasting per-class data. For example, the total time spent by student is a sum spent in regular per-class session plus individual elected courses. > > Perhaps you want to introduce some other way for non-commensurate shapes > to interact in an operation. I think you will find many open minds on > this list (although probably not anyone who will want to code it up :-) > ). We do welcome the discussion. Your experience with other > array-like languages is helpful. > I will be happy to contribute code if I see interest. > > > Similarly the > > need to broadcast over non-leading dimension is a sign of bad design. > > In rare cases where such broadcasting is desirable, it can be easily > > done via swapaxes which is a cheap operation. > > > > Again, it would help if you would refrain from using negative words > about coding styles that are different from your own. Such > broadcasting is not that rare. It happens quite frequently, actually. > The point of a language like Python is that you can write algorithms > simply without struggling with optimization questions up front like you > seem to be hinting at. > I hope you understand that I did not mean to criticize anyone's coding style. I was not really hinting at optimization issues, I just had a particular design problem in mind (see above). Incidentally, dimension-increasing broadcasting does tend to lead to more efficient code both in terms of memory utilization and more straightforward algorithms with fewer special cases, but this was not really what I was referring to. > > On the other hand I don't see much problem in making > > dimension-preserving broadcasting more permissive. In R, for example, > > (1-d) arrays can be broadcast to arbitrary size. This has an > > additional benefit that 1-d to 2-d broadcasting requires no special > > code, it just happens because matrices inherit arithmetics from > > vectors. I've never had a problem with R rules being too loose. > > > > So, please explain exactly what you mean. Only a few on this list know > what the R rules even are. See above. > > In my view the problem that your ticket highlighted is not so much in > > the particular set of broadcasting rules, but in the fact that a[...] > > = b uses one set of rules while a[...] += b uses another. This is > > *very* confusing. > > > > Yes, this is admittedly confusing. But, it's an outgrowth of the way > Numeric code developed. Broadcasting was always only a ufunc concept in > Numeric, and copying was not a ufunc. NumPy grew out of Numeric > code. I was not trying to mimick broadcasting behavior when I wrote > the array copy and array setting code. Perhaps I should have been. 
> In the spirit of appealing to obscure languages ;-), let me mention that in the K language (kx.com) element assignment is implemented using an Amend primitive that takes four arguments: @[x,i,f,y] is more or less equivalent to numpy's x[i] = f(x[i], y[i]), where x, y and i are vectors and f is a binary (broadcasting) function. Thus, x[i] += y[i] can be written as @[x,i,+,y] and x[i] = y[i] is @[x,i,:,y], where ':' denotes a binary function that returns its second argument and ignores the first. The K interpreter's Linux binary is less than 200K, and that includes a simple X window GUI! Such a small code size would not be possible without picking the right set of primitives and avoiding special-case code.

> I'm willing to change the code on this one, but only if the new copy code actually does implement broadcasting behavior equivalently. And going through the ufunc machinery is probably a waste of effort because the copy code must be written for variable length arrays anyway (and ufuncs don't support them).
>

I know close to nothing about variable length arrays. When I need to deal with relational database data, I transpose it so that each column gets into its own fixed-length array. This is the approach that both R and K take. However, at least at the C level, I don't see why the ufunc code cannot be generalized to handle variable length arrays. At the Python level, pre-defined arithmetic or math functions are probably not feasible for variable length, but the ability to define a variable length array function by just writing an inner loop implementation may be quite useful.

> However, the broadcasting machinery has been abstracted in NumPy and can therefore be re-used in the copying code. In Numeric, broadcasting was basically implemented deep inside a confusing while loop.

I've never understood Numeric's while loop and completely agree with your characterization. I am still studying the numpy code, but it is clearly a big improvement.

From shhong at u.washington.edu Wed Apr 26 14:19:01 2006 From: shhong at u.washington.edu (Sungho Hong) Date: Wed Apr 26 14:19:01 2006 Subject: [Numpy-discussion] Building Numpy with Windows and MKL? Message-ID: <207B8B70-6328-421D-8343-B32506AF47CA@u.washington.edu>

Has anyone tried to install numpy with MS Windows and the Intel Math Kernel Library, especially using the VC 2003 compiler? I began with MKLROOT=C:\Program Files\Inter\plsuite, but setup.py seems to have a problem with finding the library path. In that case, how do I manually set up all the relevant paths? Thanks. - SH

From ryanlists at gmail.com Wed Apr 26 14:21:07 2006 From: ryanlists at gmail.com (Ryan Krauss) Date: Wed Apr 26 14:21:07 2006 Subject: [Numpy-discussion] array.min() vs. min(array) Message-ID:

I was spending some time trying to track down how to speed up an algorithm that gets called a bunch of times during an optimization. I was startled when I finally figured out that most of the time was wasted by using the built-in Python min function. It turns out that in my case, using array.min() (i.e. the method of the Numpy array) is 300-500 times faster than the built-in Python min function (i.e. min(array)). So, thank you Travis and everyone who has put so much time into thinking through Numpy and making it fast (as well as making sure it is correct). And to the rest of us: use the Numpy array methods whenever you can.
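For the curious, here is a minimal sketch of how to time the difference (the array size is made up and the numbers will of course vary from machine to machine):

    import timeit

    setup = "import numpy; a = numpy.arange(1000000.0)"
    print(timeit.Timer("min(a)", setup).timeit(10))   # generic sequence path
    print(timeit.Timer("a.min()", setup).timeit(10))  # native array method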
Thanks, Ryan From oliphant.travis at ieee.org Wed Apr 26 14:42:05 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Wed Apr 26 14:42:05 2006 Subject: [Numpy-discussion] array.min() vs. min(array) In-Reply-To: References: Message-ID: <444FE909.5080209@ieee.org> Ryan Krauss wrote: > I was spending some time trying to track down how to speed up an > algorithm that gets called a bunch of times during an optimization. I > was startled when I finally figured out that most of the time was > wasted by using the built-in pyhton min function. It turns out that > in my case, using array.min() (i.e. the method of the Numpy array) is > 300-500 times faster than the built-in python min function (i.e. > min(array)). > > So, thank you Travis and everyone who has put so much time into > thinking through Numpy and making it fast (as well as making sure it > is correct). The builtin min function is a bit confusing because it usually does work on NumPy arrays. But, as you've noticed it is always slower because it uses the "generic sequence interface" that NumPy arrays expose. So, it's basically not much faster than a Python loop. In this case you are also being hit by the fact that scalarmath is not yet implemented (it's getting close though...) so the returned array scalars are being compared using the bulky ufunc machinery on each element separately. In Python 2.5 we are going to have the same issues with the new any() and all() functions of Python. -Travis From wbaxter at gmail.com Wed Apr 26 14:56:12 2006 From: wbaxter at gmail.com (Bill Baxter) Date: Wed Apr 26 14:56:12 2006 Subject: [Numpy-discussion] Broadcasting rules (Ticket 76). In-Reply-To: References: <4050236.1146000212838.JavaMail.root@fed1wml05.mgt.cox.net> <444F0420.9000500@ieee.org> Message-ID: Is that a representative example? It seems highly unlikely that in real life every one of the schools would have exactly 20 classes, and each of those exactly 30 students. I don't know anything about R or the way things are typically done with statistical languages -- maybe this is the norm there -- but from a pure CompSci data structures perspective, a 3D array seems ill-suited for this type of hierarchical data. Something more flexible, along the lines of a Python list of list of list, seems more apropriate. --bill On 4/27/06, Sasha wrote: > Suppose we study school statistics in > different cities. Let city A have 10 schools with 20 classes and 30 > students in each. It is natural to organize the data collected about > the students in a 10x20x30 array. > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ndarray at mac.com Wed Apr 26 15:24:07 2006 From: ndarray at mac.com (Sasha) Date: Wed Apr 26 15:24:07 2006 Subject: [Numpy-discussion] concatenate, doc-string In-Reply-To: References: Message-ID: On 4/26/06, David M. Cooke wrote: > .... > Here's what I just checked in: > > concatenate((a1, a2, ...), axis=None) joins arrays together > > The tuple of sequences (a1, a2, ...) are joined along the given axis > (default is the first one) into a single numpy array. > > Example: > > >>> concatenate( ([0,1,2], [5,6,7]) ) > array([0, 1, 2, 5, 6, 7]) > The first argument does not have to be a tuple: >>> print concatenate([[0,1,2], [5,6,7]]) [0 1 2 5 6 7] but the docstring is probably ok given that the alternative is "sequence of sequences" ... From ndarray at mac.com Wed Apr 26 15:58:04 2006 From: ndarray at mac.com (Sasha) Date: Wed Apr 26 15:58:04 2006 Subject: [Numpy-discussion] Broadcasting rules (Ticket 76). 
In-Reply-To: References: <4050236.1146000212838.JavaMail.root@fed1wml05.mgt.cox.net> <444F0420.9000500@ieee.org> Message-ID: On 4/26/06, Bill Baxter wrote: > Is that a representative example? It seems highly unlikely that in real > life every one of the schools would have exactly 20 classes, and each of > those exactly 30 students. You should not take my toy example too seriousely. However, with support for missing values, 3-d arrays may provide an efficient representation for a more realistic scenario when you only know upper bounds for the number of students/classes. Smaller schools will have missing values in their arrays. > I don't know anything about R or the way things > are typically done with statistical languages -- maybe this is the norm > there -- but from a pure CompSci data structures perspective, a 3D array > seems ill-suited for this type of hierarchical data. Something more > flexible, along the lines of a Python list of list of list, seems more > apropriate. > You are right. I am sorely missing ragged array support in numpy like the one available in K. Numpy supports nested arrays, but does not optimize the most common case when nested arrays are of the same type. > --bill > > > On 4/27/06, Sasha wrote: > > > Suppose we study school statistics in > > different cities. Let city A have 10 schools with 20 classes and 30 > > students in each. It is natural to organize the data collected about > > the students in a 10x20x30 array. > > > From ndarray at mac.com Wed Apr 26 16:16:07 2006 From: ndarray at mac.com (Sasha) Date: Wed Apr 26 16:16:07 2006 Subject: [Numpy-discussion] Broadcasting rules (Ticket 76). In-Reply-To: References: <4050236.1146000212838.JavaMail.root@fed1wml05.mgt.cox.net> <444F0420.9000500@ieee.org> Message-ID: On 4/26/06, Sasha wrote: > On 4/26/06, Bill Baxter wrote: > > Is that a representative example? It seems highly unlikely that in real > > life every one of the schools would have exactly 20 classes, and each of > > those exactly 30 students. > > You should not take my toy example too seriousely. However, with > support for missing values, 3-d arrays may provide an efficient > representation for a more realistic scenario when you only know upper > bounds for the number of students/classes. Smaller schools will have > missing values in their arrays. In addition, it is reasonable to sample a fixed number of classes from each school and a fixed number of students from each class at random for a statistical study. From simon at arrowtheory.com Wed Apr 26 16:41:04 2006 From: simon at arrowtheory.com (Simon Burton) Date: Wed Apr 26 16:41:04 2006 Subject: [Numpy-discussion] obtain indexes of a sort ? Message-ID: <20060427094025.10172889.simon@arrowtheory.com> Is it possible to obtain a permutation (array of indices) representing the transform that sorts an array ? Is there a numpy way of doing this ? I can do it in python as: a = [ 6, 5, 99, 2 ] idxs = range(len(a)) z = zip(idxs,a) def zcmp(u,v): if u[1]<=v[1]: return -1 return 1 z.sort( zcmp ) idxs = [u[0] for u in z] # <--- permutation Simon. -- Simon Burton, B.Sc. Licensed PO Box 8066 ANU Canberra 2601 Australia Ph. 61 02 6249 6940 http://arrowtheory.com From pgmdevlist at mailcan.com Wed Apr 26 16:45:02 2006 From: pgmdevlist at mailcan.com (Pierre GM) Date: Wed Apr 26 16:45:02 2006 Subject: [Numpy-discussion] obtain indexes of a sort ? 
In-Reply-To: <20060427094025.10172889.simon@arrowtheory.com> References: <20060427094025.10172889.simon@arrowtheory.com> Message-ID: <200604261944.01584.pgmdevlist@mailcan.com> On Wednesday 26 April 2006 19:40, Simon Burton wrote: > Is it possible to obtain a permutation (array of indices) > representing the transform that sorts an array ? Is there a numpy way > of doing this ? I guess argsort() could be what you want From ndarray at mac.com Wed Apr 26 16:45:03 2006 From: ndarray at mac.com (Sasha) Date: Wed Apr 26 16:45:03 2006 Subject: [Numpy-discussion] obtain indexes of a sort ? In-Reply-To: <20060427094025.10172889.simon@arrowtheory.com> References: <20060427094025.10172889.simon@arrowtheory.com> Message-ID: >>> argsort([ 6, 5, 99, 2 ]) array([3, 1, 0, 2]) On 4/26/06, Simon Burton wrote: > > Is it possible to obtain a permutation (array of indices) > representing the transform that sorts an array ? Is there a numpy way > of doing this ? > > I can do it in python as: > > a = [ 6, 5, 99, 2 ] > idxs = range(len(a)) > z = zip(idxs,a) > def zcmp(u,v): > if u[1]<=v[1]: > return -1 > return 1 > z.sort( zcmp ) > idxs = [u[0] for u in z] # <--- permutation > > Simon. > > -- > Simon Burton, B.Sc. > Licensed PO Box 8066 > ANU Canberra 2601 > Australia > Ph. 61 02 6249 6940 > http://arrowtheory.com > > > ------------------------------------------------------- > Using Tomcat but need to do more? Need to support web services, security? > Get stuff done quickly with pre-integrated technology to make your job easier > Download IBM WebSphere Application Server v.1.0.1 based on Apache Geronimo > http://sel.as-us.falkag.net/sel?cmd=lnk&kid=120709&bid=263057&dat=121642 > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/numpy-discussion > From zpincus at stanford.edu Wed Apr 26 16:46:05 2006 From: zpincus at stanford.edu (Zachary Pincus) Date: Wed Apr 26 16:46:05 2006 Subject: [Numpy-discussion] obtain indexes of a sort ? In-Reply-To: <20060427094025.10172889.simon@arrowtheory.com> References: <20060427094025.10172889.simon@arrowtheory.com> Message-ID: <800F9820-F672-4EBF-8F48-3C3AEF17FC34@stanford.edu> a.argsort() or numpy.argsort(a) Zach On Apr 26, 2006, at 4:40 PM, Simon Burton wrote: > > Is it possible to obtain a permutation (array of indices) > representing the transform that sorts an array ? Is there a numpy way > of doing this ? > > I can do it in python as: > > a = [ 6, 5, 99, 2 ] > idxs = range(len(a)) > z = zip(idxs,a) > def zcmp(u,v): > if u[1]<=v[1]: > return -1 > return 1 > z.sort( zcmp ) > idxs = [u[0] for u in z] # <--- permutation > > Simon. > > -- > Simon Burton, B.Sc. > Licensed PO Box 8066 > ANU Canberra 2601 > Australia > Ph. 61 02 6249 6940 > http://arrowtheory.com > > > ------------------------------------------------------- > Using Tomcat but need to do more? Need to support web services, > security? > Get stuff done quickly with pre-integrated technology to make your > job easier > Download IBM WebSphere Application Server v.1.0.1 based on Apache > Geronimo > http://sel.as-us.falkag.net/sel? 
> cmd=lnk&kid=120709&bid=263057&dat=121642 > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/numpy-discussion From pearu at scipy.org Wed Apr 26 16:56:05 2006 From: pearu at scipy.org (Pearu Peterson) Date: Wed Apr 26 16:56:05 2006 Subject: [Numpy-discussion] Possible ref.count bug in changeset #2422 Message-ID: Hi, Shouldn't result be Py_INCRE'ted when it is equal to Py_NotImplemented and returned from array_richcompare? Pearu From doug5y at shaw.ca Wed Apr 26 17:10:05 2006 From: doug5y at shaw.ca (Doug Nadworny) Date: Wed Apr 26 17:10:05 2006 Subject: [Numpy-discussion] Can't install numpy-0.9.6-1.i586.rpm on FC5 Message-ID: <44500B9E.10602@shaw.ca> when trying to install numpy-0.9.6-1.i586.rpm on Fedora Core 5, rpm reports incorrectly that python is the incorrect version, even though it is correct: >rpm -i --test numpy-0.9.6-1.i586.rpm ## Tests dependences of rpm package error: Failed dependencies: python-base >= 2.4 is needed by numpy-0.9.6-1.i586 >python -V Python 2.4.2 Is there a way around this? TIA, Doug N From cookedm at physics.mcmaster.ca Wed Apr 26 17:20:05 2006 From: cookedm at physics.mcmaster.ca (David M. Cooke) Date: Wed Apr 26 17:20:05 2006 Subject: [Numpy-discussion] Possible ref.count bug in changeset #2422 In-Reply-To: (Pearu Peterson's message of "Wed, 26 Apr 2006 18:55:55 -0500 (CDT)") References: Message-ID: Pearu Peterson writes: > Hi, > > Shouldn't result be Py_INCRE'ted when it is equal to Py_NotImplemented > and returned from array_richcompare? Theoretically, yes, but since the case statement "should" cover all cases, it doesn't matter. Bad code style though on my part; I've added a default: case instead. -- |>|\/|< /--------------------------------------------------------------------------\ |David M. Cooke http://arbutus.physics.mcmaster.ca/dmc/ |cookedm at physics.mcmaster.ca From silesalvarado at hotmail.com Wed Apr 26 17:33:04 2006 From: silesalvarado at hotmail.com (Hugo Siles) Date: Wed Apr 26 17:33:04 2006 Subject: [Numpy-discussion] crush!!!! Message-ID: HI, I have a problem when I run the following options in python: >>>from Numeric import * >>>from Linear algebra I define a matrix 'a' which prints correctly, calculates its inverse, determinat and so for but when I try to calculate the eigenvalues, such as >>> c = eigenvalues(a) the system just crushs without any message I made this test because in some other programs with source code happens the same thing. I hope some body can help, thanks Hugo Siles From ivazquez at ivazquez.net Wed Apr 26 17:33:08 2006 From: ivazquez at ivazquez.net (Ignacio Vazquez-Abrams) Date: Wed Apr 26 17:33:08 2006 Subject: [Numpy-discussion] Can't install numpy-0.9.6-1.i586.rpm on FC5 In-Reply-To: <44500B9E.10602@shaw.ca> References: <44500B9E.10602@shaw.ca> Message-ID: <1146098100.16081.15.camel@ignacio.lan> On Wed, 2006-04-26 at 18:09 -0600, Doug Nadworny wrote: > when trying to install numpy-0.9.6-1.i586.rpm on Fedora Core 5, rpm > reports incorrectly that python is the incorrect version, even though it > is correct: > > >rpm -i --test numpy-0.9.6-1.i586.rpm ## Tests dependences of rpm package > error: Failed dependencies: > python-base >= 2.4 is needed by numpy-0.9.6-1.i586 > >python -V > Python 2.4.2 Alright, alright, I'll update it already... 
-- Ignacio Vazquez-Abrams http://fedora.ivazquez.net/ gpg --keyserver hkp://subkeys.pgp.net --recv-key 38028b72

From ndarray at mac.com Wed Apr 26 18:15:04 2006 From: ndarray at mac.com (Sasha) Date: Wed Apr 26 18:15:04 2006 Subject: [Numpy-discussion] crush!!!! In-Reply-To: References: Message-ID:

Numeric computes eigenvalues by calling LAPACK's dgeev subroutine. Depending on the installation, Numeric may either use its own subset of LAPACK (translated from Fortran to C) or link to the system-supplied LAPACK libraries. It is possible that there is a bug in your system's LAPACK libraries. Some LAPACK bugs related to extended precision calculations were reported recently. What you observe is unlikely to be a Numeric bug. Note, however, that Numeric is no longer actively supported. If you can reproduce the same problem with numpy, it is likely to get more attention. Also, you have to give us some means to reproduce your matrix a if you expect more than general advice.

On 4/26/06, Hugo Siles wrote:
> Hi,
>
> I have a problem when I run the following in python:
>
> >>> from Numeric import *
> >>> from LinearAlgebra import *
> I define a matrix 'a' which prints correctly, calculates its inverse, determinant and so forth, but when I try to calculate the eigenvalues, such as
> >>> c = eigenvalues(a)
> the system just crashes without any message.
> I made this test because the same thing happens in some other programs with source code.
>
> I hope somebody can help, thanks
>
> Hugo Siles

From strawman at astraw.com Wed Apr 26 19:26:05 2006 From: strawman at astraw.com (Andrew Straw) Date: Wed Apr 26 19:26:05 2006 Subject: [Numpy-discussion] SWIG for 3D array In-Reply-To: References: Message-ID: <44502B85.3000504@astraw.com>

Gennan Chen wrote:
> And I really have a hard time understanding how to deal with reference counting issues. Anyone know where I can find a good reference for that?

http://docs.python.org/ext/refcounts.html

From oliphant.travis at ieee.org Wed Apr 26 20:30:12 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Wed Apr 26 20:30:12 2006 Subject: [Numpy-discussion] Broadcasting rules (Ticket 76). In-Reply-To: References: <4050236.1146000212838.JavaMail.root@fed1wml05.mgt.cox.net> <444F0420.9000500@ieee.org> Message-ID: <44503A8A.2050701@ieee.org>

Sasha wrote:
> In my view the most appealing feature in Python is the Zen of Python, and in particular "There should be one-- and preferably only one --obvious way to do it." In my view Python represents the "hard science" approach appealing to physics and math types while Perl is more of a "soft science" language.

Interesting analogy. I've not heard that expression before.
> Unfortunately, it is a fact of life that there are always many ways to solve the same problem, and a successful "pythonic" design has to pick one (preferably the best) of the possible ways and make it obvious.
>

And it's probably impossible to agree as to what is "best" because of the different uses that arrays receive. That's one reason I'm anxious to get a basic structure-only basearray into Python itself.

> This said, let me present a specific problem that I will use to illustrate my points below. Suppose we study school statistics in different cities. Let city A have 10 schools with 20 classes and 30 students in each. It is natural to organize the data collected about the students in a 10x20x30 array. It is also natural to collect some of the data at the per-school or per-class level. This data may come from aggregating student-level statistics (say an average test score) or from characteristics that are class or school specific (say the grade or primary language). There are two obvious ways to present such data: 1) we can use 3-d arrays for everything and make the shape of the per-class array 10x20x1 and the shape of the per-school array 10x1x1; or 2) use 2-d and 1-d arrays. The first approach seems to be more flexible. We can also have 10x1x30 or 1x1x30 arrays to represent data which varies along the student dimension, but is constant across schools or classes. However, this added benefit is illusory: the first student in one class list has no relationship to the first student in another class, so in this particular problem an average score of the first student across classes makes no sense (it will also depend on whether students are ordered alphabetically or by an achievement rank).
>
> On the other hand, this approach has a very significant drawback: functions that process city data have no way to distinguish between per-school data and a lucky city that can afford educating its students in individual classes.

I think I see what you are saying. This is a very specific circumstance. I can verify that the ndarray has not been designed to distinguish such hierarchical data. You will never be able to tell from the array itself if a dimension of length 1 means aggregate data or not. I don't see that as a limitation of the ndarray but as evidence that another object (i.e. an R-like data-frame) should probably be used. Such an object could even be built on top of the ndarray.

>> [...]
>> I don't think anyone is fundamentally opposed to multiple repetitions. We're just being cautious. Also, as you've noted, the assignment code is currently not using the ufunc broadcasting code and so they really aren't the same thing, yet.
>>
>
> It looks like there is a lot of development in this area going on at the moment. Please let me know if I can help.
>

Well, I did some refactoring to make it easier to expose the basic concept of the ufunc elsewhere:

1) Adjusting the inputs to a common shape (this is what I call broadcasting --- it appears to me that you use the term a little more loosely)

2) Setting up iterators to iterate over all but the longest dimension so that the inner loop is done.

These are the key ingredients to a fast ufunc.
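For step 1, the rule itself is simple enough to model in a few lines of Python; this is only a sketch of the shape logic, not the actual C implementation:

    def broadcast_shape(shape1, shape2):
        # pad the shorter shape with ones on the left
        ndim = max(len(shape1), len(shape2))
        s1 = (1,) * (ndim - len(shape1)) + tuple(shape1)
        s2 = (1,) * (ndim - len(shape2)) + tuple(shape2)
        # dimensions are compatible if they are equal or one of them is 1
        result = []
        for d1, d2 in zip(s1, s2):
            if d1 == d2 or d1 == 1 or d2 == 1:
                result.append(max(d1, d2))
            else:
                raise ValueError("shapes cannot be broadcast together")
        return tuple(result)

    print(broadcast_shape((10, 20, 30), (20, 1)))  # -> (10, 20, 30)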
There is one more optimization in the ufunc machinery for the contiguous case (when the inner loop is all that is needed), and then there is code to handle the buffering needed for unaligned and/or byte-swapped data. The final thing that makes a ufunc is the precise signature of the inner loop. Every inner loop has the same signature. This signature does not contain a slot for the length of the array element (that's a big reason why variable-length arrays are not supported in ufuncs). The ufuncs could be adapted, of course, but it was a bigger fish than I wanted to try and fry pre-1.0.

Note, though, that I haven't used these concepts yet to implement ufunc-like copying. The PyArray_Cast function will also need to be adjusted at the same time, and this could actually prove more difficult as it must implement buffering. Of course it could give us a chance to abstract out the buffered, broadcasted call as well. That might make a useful C-API function. Any help you can provide would be greatly appreciated. I'm focused right now on the scalar math module, as without it NumPy is still slower for people that use a lot of array elements.

>> [...]
>>
>>> In my experience broadcasting length-1 and not broadcasting other lengths is very error prone as it is.
>>>
>> That's not been my experience.
>>
>
> I should have been more specific. As I explained above, the special properties of length-1 led me to design a system that distinguished aggregate data by testing for unit length. This system was subtly broken. In a rare case when the population had only one element, the system was producing wrong results.
>

Yes, I can see that now. Your comments make a lot more sense. Trying to use ndarrays to represent hierarchical data can cause these subtle issues. The ndarray is a "flat" object in the sense that every element is seen as "equal" to every other element.

>> dim(x) <- c(2,5)
>> x
>>
> [,1] [,2] [,3] [,4] [,5]
> [1,] 0 0 0 0 0
> [2,] 0 0 0 0 0
>
> (R uses Fortran order). Broadcasting ignores the dim attribute, but does the right thing for conformable vectors:
>

Thanks for the description of R.

>> x + c(1,2)
>>
> [,1] [,2] [,3] [,4] [,5]
> [1,] 1 1 1 1 1
> [2,] 2 2 2 2 2
>
> However, the following is unfortunate:
>

Ahh... So, it looks like R does for arithmetic what NumPy copying is currently doing (treating both as flat spaces to fill).

>> x
>>
> Sorry I was not specific in the original post. I hope you now understand where I come from. Can you point me to some examples of the correct way to use dimension-preserving broadcasting? I would assume that it is probably more useful in the problem domains where there is no natural ordering of the dimensions, unlike in the hierarchical data example that I used.
>

Yes, the ndarray does not recognize any natural ordering to the dimensions at all. Every dimension is "equal." It's designed to be a very basic object. I'll post some examples later. I've got to go right now.

> The dimension-increasing broadcasting is very natural when you deal with hierarchical data where various dimensions correspond to the levels of aggregation. As I explained above, an average student score per class makes sense while the average score per student over classes does not. It is very common to combine per-class data with per-student data by broadcasting the per-class data. For example, the total time spent by a student is the time spent in regular per-class sessions plus individual elected courses.
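To make that concrete with the school shapes, here is a small numpy sketch (note that numpy pads shapes on the left, so the trailing student axis has to be added explicitly):

    import numpy as np

    class_hours = np.random.rand(10, 20)         # per-class data
    elective_hours = np.random.rand(10, 20, 30)  # per-student data
    # (10, 20) would be padded on the left to (1, 10, 20), which does not
    # align with (10, 20, 30); the student axis must be added by hand:
    total = class_hours[:, :, np.newaxis] + elective_hours  # shape (10, 20, 30)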
> I think you've hit on something here regarding the use of an array for "hierachial" data. I'm not sure I understand the implications entirely, but at least it helps me a little bit see what your concerns really are. > I hope you understand that I did not mean to criticize anyone's coding > style. I was not really hinting at optimization issues, I just had a > particular design problem in mind (see above). I do understand much better now. I still need to think about the hierarchial case a bit more. My basic concept of an array which definitely biases me is a medical imaging volume.... (i.e. the X-ray density at each location in 3-space). I could use improved understanding of how to use array's effectively in hierarchies. Perhaps we can come up with some useful concepts (or maybe another useful structure that inherits from the basearray) and can therefore share data effectively with the ndarray.... > In the spirit of appealing to obscure languages ;-), let me mention > that in the K language (kx.com) element assignment is implemented > using an Amend primitive that takes four arguments: @[x,i,f,y] id more > or less equivalent to numpy's x[i] = f(x[i], y[i]), where x, y and i > are vectors and f is a binary (broadcasting) function. Thus, x[i] += > y[i] can be written as @[x,i,+,y] and x[i] = y[i] is @[x,i,:,y], where > ':' denotes a binary function that returns it's second argument and > ignores the first. K interpretor's Linux binary is less than 200K and > that includes a simple X window GUI! Such small code size would not be > possible without picking the right set of primitives and avoiding > special case code. > Not to mention limiting the number of data-types :-) > I know close to nothing about variable length arrays. When I need to > deal with the relational database data, I transpose it so that each > column gets into its own fixed length array. Yeah, that was my strategy too and what I always suggested to the numarray folks who wanted the variable-length arrays. But, memory-mapping can't be done that way.... > This is the approach > that both R and K take. However, at least at the C level, I don't see > why ufunc code cannot be generalized to handle variable length arrays. > They of course, could be, it's just more re-factoring than I wanted to do. The biggest issue is the underlying 1-d loop function signature. I hesitated to change the signature because that would break compatibility with Numeric extension modules that defined ufuncs (like scipy-special...) The length could piggy-back in the data argument passed into those functions, but doing that right was more work than I wanted to do. If you solve that problem, everything else could be made to work without too much trouble. > At the python level, pre-defined arithmetic or math functions are > probably not feasible for variable length, but the ability to define a > variable length array function by just writing an inner loop > implementation may be quite useful. > Yes, it could have helped write the string comparisons much faster :-) >> However, the broadcasting machinery has been abstracted in NumPy and can >> therefore be re-used in the copying code. In Numeric, broadcasting was >> basically implemented deep inside a confusing while loop. >> > > I've never understood the Numeric's while loop and completely agree > with your characterization. I am still studying the numpy code, but > it is clearly a big improvement. > Well, it's more straightforward because I'm not the genius Jim Hugunin is. 
It makes heavy use of the iterator concept which I finally grok'd while trying to write things (and realized I had basically already implemented it in writing the old scipy.vectorize). I welcome many more eyes on the code. I know I've taken shortcuts in places that should be improved. Thanks for your continued help and useful comments. -Travis

From tim.hochberg at cox.net Wed Apr 26 21:02:10 2006 From: tim.hochberg at cox.net (Tim Hochberg) Date: Wed Apr 26 21:02:10 2006 Subject: [Numpy-discussion] Broadcasting rules (Ticket 76). In-Reply-To: References: <4050236.1146000212838.JavaMail.root@fed1wml05.mgt.cox.net> <444F0420.9000500@ieee.org> Message-ID: <44504296.8040602@cox.net>

I haven't fully waded through all the various replies to this thread. I plan to do that and send a reply on specific points later. This message is more of a historical, motivational or possibly philosophical nature.

First off, NumPy has used the term "broadcast" to mean the same thing since its inception, and changing the terminology now is asking for confusion. *In the context of this mailing list*, I think we should use "broadcast" in the numpy sense and use appropriate qualifiers when referring to how other array packages practice broadcasting. Referring to broadcasting as "shape-preserving broadcasting" or some such doesn't seem to make things any clearer and adds a bunch of excess verbiage. In any event, I plan to omit any "broadcast" qualifiers here.

The following understanding was formed by using and occasionally helping with the development of NumPy since it was developed in 1995 or thereabouts. That doesn't mean that my understanding agrees with that of the primary developers of the time; I may misremember things, and my recollections are likely tinged by the experience I've had with NumPy in the interim. So, don't take this as definitive, but perhaps it will help provide some insight into what NumPy's broadcasting is supposed to be.

Let's first dispense with the padding of dimensions. As I recall, this was a way to make matrix-like operations easier. This was way before there was a matrix class, and by defining padding in this way 1-D vectors could generally be treated as column vectors. Row vectors still needed to be 2-D (1xN), but they tended to be less frequent, so that was less of a burden. Or maybe I have that backwards; in any event, they were put there to facilitate matrix-like uses of numpy arrays. Given that there is a matrix class at this point, I doubt I would automagically pad the dimensions if I were designing numpy from scratch now. Since the dimension padding is at least partly a historical accident, and since it is in some sense orthogonal to the main point of numpy's broadcasting, I'm going to pretend it doesn't exist for the rest of this discussion.

At its core, broadcasting is about adjusting the shapes of two arrays so that they match. Consider an array 'A' and an array 'B' with shapes (3, Any) and (Any, 4). Here, 'Any' means that the given dimension of the array is unspecified and can take on any value that is convenient for functions operating on the array. If we add 'A' and 'B' together, we'd like the two 'Any' dimensions to stretch appropriately so that the result is an array of shape (3, 4). Similarly, adding an array of shape (3, 4) to an array of shape (Any, 4) should work and produce an array of shape (3, 4). So far, this is pretty straightforward; I believe it also bears a fair amount of resemblance to Sasha's 0-stride ideas.
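In numpy as it stands, with 'Any' spelled as '1' (the compromise described next), that stretching looks like this:

    >>> from numpy import zeros
    >>> (zeros((3, 1)) + zeros((1, 4))).shape
    (3, 4)
    >>> (zeros((3, 4)) + zeros((1, 4))).shape
    (3, 4)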
The complicating factor is that there wasn't a good way to spell 'Any' at the time. Or maybe we were lazy. Or maybe there was some other reason that I'm forgetting. In any event, we ended up spelling 'Any' as '1'. That means that there's no way to distinguish between a dimension that's of length 1 for some legitimate reason and one that is that length just for stretchability. It would be an interesting experiment to see how things would work with no padding and with an explicit 'Any' value available for dimensions. However, it's probably too much work and would result in too many backwards compatibility problems for NumPy proper. [Half-baked thoughts on how to do this though: newaxis would produce a new axis with length -1 (or some other marker length). This would be treated as length-1 axes are treated now. However, length-1 axes would no longer broadcast. Padding would be right out.]

In summary, the Platonic ideal of broadcasting is simple and clean. In practice it's more complicated for two reasons. First, the padding of dimensions; I believe that this is mostly historical baggage. The second is the conflation of '1' and 'Any' (a name that I made up for this message, so don't go searching for it). This may be a historical accident and/or an implementation artifact, but there may actually be some more practical reasons behind this as well that I am forgetting. Hopefully that is mildly informative, Regards, -tim

From kwgoodman at gmail.com Wed Apr 26 21:46:08 2006 From: kwgoodman at gmail.com (Keith Goodman) Date: Wed Apr 26 21:46:08 2006 Subject: [Numpy-discussion] matrix.std() returns array Message-ID:

I noticed that the mean of a matrix is a matrix but the standard deviation of a matrix is an array. Is that the expected behavior? I'm also getting the wrong values (0 and nan) for the standard deviation. Did I mess something up?

I'm trying to learn scipy (and python) by porting a small Octave program. I installed numpy from svn (today) on a Debian box. And numpy.test() says OK. Here's an example:

>> numpy.__version__
'0.9.7.2416'
>> x = asmatrix(random.uniform(0,1,(3,3)))
>> x
matrix([[ 0.56771284, 0.57053769, 0.57505946],
        [ 0.10479534, 0.81692248, 0.91829316],
        [ 0.48627829, 0.59255983, 0.32628573]])
>> x.mean(0)
matrix([[ 0.38626216, 0.66000667, 0.60654612]])
>> x.std(0)
array([ nan, 0. , 0. ])
What about changing the example to: """ Examples: >>> concatenate(([0, 1, 2], [5, 6, 7])) array([0, 1, 2, 5, 6, 7]) >>> concatenate([[0, 1, 2], [5, 6, 7]]) array([0, 1, 2, 5, 6, 7]) >>> z = arange(5) >>> concatenate(([0, 1, 2], [5, 6, 7], z)) array([0, 1, 2, 5, 6, 7, 0, 1, 2, 3, 4]) """ Best, Arnd From Chris.Barker at noaa.gov Wed Apr 26 23:42:02 2006 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Wed Apr 26 23:42:02 2006 Subject: [Numpy-discussion] array.min() vs. min(array) In-Reply-To: <444FE909.5080209@ieee.org> References: <444FE909.5080209@ieee.org> Message-ID: <445067C6.3050805@noaa.gov> Travis Oliphant wrote: > In Python 2.5 we are going to have the same issues with the new any() > and all() functions of Python. "Namespaces are one honking great idea -- let's do more of those!" Yet another reason to deprecate import * ! -Chris -- Christopher Barker, Ph.D. Oceanographer NOAA/OR&R/HAZMAT (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From arnd.baecker at web.de Wed Apr 26 23:49:06 2006 From: arnd.baecker at web.de (Arnd Baecker) Date: Wed Apr 26 23:49:06 2006 Subject: [Numpy-discussion] array.min() vs. min(array) In-Reply-To: <444FE909.5080209@ieee.org> References: <444FE909.5080209@ieee.org> Message-ID: Moin, On Wed, 26 Apr 2006, Travis Oliphant wrote: > Ryan Krauss wrote: > > I was spending some time trying to track down how to speed up an > > algorithm that gets called a bunch of times during an optimization. I > > was startled when I finally figured out that most of the time was > > wasted by using the built-in pyhton min function. It turns out that > > in my case, using array.min() (i.e. the method of the Numpy array) is > > 300-500 times faster than the built-in python min function (i.e. > > min(array)). > > > > So, thank you Travis and everyone who has put so much time into > > thinking through Numpy and making it fast (as well as making sure it > > is correct). > > The builtin min function is a bit confusing because it usually does work > on NumPy arrays. But, as you've noticed it is always slower because it > uses the "generic sequence interface" that NumPy arrays expose. So, > it's basically not much faster than a Python loop. In this case you are > also being hit by the fact that scalarmath is not yet implemented (it's > getting close though...) so the returned array scalars are being > compared using the bulky ufunc machinery on each element separately. > > In Python 2.5 we are going to have the same issues with the new any() > and all() functions of Python. I am just preparing a small text to collect such cases for the wiki. However, I am not sure about a good name for such a page: http://www.scipy.org/Cookbook/Speed http://www.scipy.org/Cookbook/SpeedProblems http://www.scipy.org/Cookbook/Performance ? (As usual, it is easy to start a page, than to properly maintain it. OTOH things like this get lost very quickly, in particular with this nice amount of traffic here). In addition this also relates to - profiling (For example I would like to add the contents of http://mail.enthought.com/pipermail/enthought-dev/2006-January/001075.html to the wiki at some point) - psyco - pyrex - f2py - weave - numexpr - ... Presently much of this is listed in the Cookbook under "Using NumPy With Other Languages (Advanced)", and therefore the above "Python only" issues don't quite fit. Any suggestions? 
Best, Arnd

From arnd.baecker at web.de Wed Apr 26 23:51:07 2006 From: arnd.baecker at web.de (Arnd Baecker) Date: Wed Apr 26 23:51:07 2006 Subject: [Numpy-discussion] array.min() vs. min(array) In-Reply-To: <445067C6.3050805@noaa.gov> References: <444FE909.5080209@ieee.org> <445067C6.3050805@noaa.gov> Message-ID:

On Wed, 26 Apr 2006, Christopher Barker wrote:
> Travis Oliphant wrote:
>
> > In Python 2.5 we are going to have the same issues with the new any() and all() functions of Python.
>
> "Namespaces are one honking great idea -- let's do more of those!"
>
> Yet another reason to deprecate import * !

Yep! But it would not work for `min` as there is no such function in numpy. (would we need one?...)

Best, Arnd

From Chris.Barker at noaa.gov Thu Apr 27 00:00:05 2006 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Thu Apr 27 00:00:05 2006 Subject: [Numpy-discussion] Broadcasting rules (Ticket 76). In-Reply-To: References: <4050236.1146000212838.JavaMail.root@fed1wml05.mgt.cox.net> <444F0420.9000500@ieee.org> Message-ID: <44506BE6.10301@noaa.gov>

As Sasha quite clearly pointed out, when you do aggregation, you really do want to reduce the dimensionality of your data. In fact, that's something that always bit me with MATLAB. If I had a matrix that happened to have a dimension of 1, MATLAB would interpret it as a vector. I ended up writing functions like "SumColumns" that would check if it was a single row vector before calling sum, so that I wouldn't suddenly get a scalar result if a matrix happened to have one row.

Once you reduce dimensionality with aggregating functions, I can see how it would be natural to want to use broadcasting to merge the reduced data and full data. However, I can't see how you could do that cleanly. How is the code to know whether a rank-1 array represents a column or row when multiplied with a rank-2 array? There is simply no way to know, in general. I suppose we could define a convention, like: "rank-1 arrays will be interpreted as row vectors for broadcasting." etc. for higher dimensions. However, I've found that even in my code, I don't find one convention always makes the most sense for all applications, so I'm just as happy to make it clear with a lot of calls like: v.shape = (-1, 1)

NOTE: It appears that numpy does, in fact, use such a convention:

>>> v = N.arange(5)
>>> m = N.ones((5,5))
>>> v * m
array([[0, 1, 2, 3, 4],
       [0, 1, 2, 3, 4],
       [0, 1, 2, 3, 4],
       [0, 1, 2, 3, 4],
       [0, 1, 2, 3, 4]])
>>> v.shape = (-1,1)
>>> v * m
array([[0, 0, 0, 0, 0],
       [1, 1, 1, 1, 1],
       [2, 2, 2, 2, 2],
       [3, 3, 3, 3, 3],
       [4, 4, 4, 4, 4]])

So what's the disagreement about? -Chris

-- Christopher Barker, Ph.D. Oceanographer NOAA/OR&R/HAZMAT (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov

From Chris.Barker at noaa.gov Thu Apr 27 00:10:03 2006 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Thu Apr 27 00:10:03 2006 Subject: [Numpy-discussion] concatenate, doc-string In-Reply-To: References: Message-ID: <44506E2F.9040902@noaa.gov>

David M. Cooke wrote:
> Here's what I just checked in:
>
> concatenate((a1, a2, ...), axis=None) joins arrays together
>
> The tuple of sequences (a1, a2, ...) are joined along the given axis
> (default is the first one) into a single numpy array.
>
> Example:
>
> >>> concatenate( ([0,1,2], [5,6,7]) )
> array([0, 1, 2, 5, 6, 7])

While we're at it, why not an example of how the axis argument works:

>>> concatenate( (ones((1,3)), zeros((1,3))) )
array([[1, 1, 1],
       [0, 0, 0]])
>>> concatenate( (ones((1,3)), zeros((1,3))), axis = 0 )
array([[1, 1, 1],
       [0, 0, 0]])
>>> concatenate( (ones((1,3)), zeros((1,3))), axis = 1 )
array([[1, 1, 1, 0, 0, 0]])

I'm not sure I like this example, but it's an easy way to do a one-liner. -Chris

-- Christopher Barker, Ph.D. Oceanographer NOAA/OR&R/HAZMAT (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov

From oliphant.travis at ieee.org Thu Apr 27 00:53:00 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Thu Apr 27 00:53:00 2006 Subject: [Numpy-discussion] matrix.std() returns array In-Reply-To: References: Message-ID: <4450780C.9060403@ieee.org>

Keith Goodman wrote:
> I noticed that the mean of a matrix is a matrix but the standard deviation of a matrix is an array. Is that the expected behavior? I'm also getting the wrong values (0 and nan) for the standard deviation. Did I mess something up?
>
> I'm trying to learn scipy (and python) by porting a small Octave program. I installed numpy from svn (today) on a Debian box. And numpy.test() says OK.
>

This should be fixed now in SVN. If somebody can add a test that would be great. Note that the methods taking axes also now preserve row and column orientation for matrices. -Travis

From oliphant.travis at ieee.org Thu Apr 27 01:03:04 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Thu Apr 27 01:03:04 2006 Subject: [Numpy-discussion] Warnings in NumPy SVN Message-ID: <44507A9D.8070902@ieee.org>

I want to apologize for the relative instability of the SVN tree in the past couple of days. Getting the scalarmath layout working took more C-API changes than I had anticipated. The SVN version of NumPy now builds scalarmath by default. The basic layout of the module is complete. However, there are many basic functions that are missing. As a result, during compilation you will get many warnings about undefined functions. If an attempt were made to load the module, it would cause an error as well due to undefined symbols. These undefined symbols are all the basic operations on fundamental C data-types that either need a function defined or a #define statement made. The names have this form:

@name@_ctype_@oper@

where @name@ is one of the 16 number-like types and @oper@ is one of the operations needing to be supported. The function (or macro) needs to implement the operation on the basic data-type and, if necessary, set an error flag in the floating-point registers. If anybody has time to help implement these basic operations, it would be greatly appreciated. -Travis

From zpincus at stanford.edu Thu Apr 27 01:22:05 2006 From: zpincus at stanford.edu (Zachary Pincus) Date: Thu Apr 27 01:22:05 2006 Subject: [Numpy-discussion] matrix.std() returns array In-Reply-To: <4450780C.9060403@ieee.org> References: <4450780C.9060403@ieee.org> Message-ID: <05B8DC8B-CD68-4EF2-BB2B-6FFABABF812E@stanford.edu>

On a slightly-related note, was anyone able to reproduce the exception with matrix types and the var() method? e.g. numpy.matrix([[1,2,3], [1,2,3]]).var() complains about unaligned data... Presumably if std is fixed in SVN, so is var. Also if a std unit test is added, a var one should be too.
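Something along these lines, perhaps; this is only a rough sketch (the function name is made up, and I have not checked it against the test-suite conventions):

    import numpy

    def check_matrix_std_var():
        x = numpy.asmatrix(numpy.random.rand(3, 3))
        # reductions along an axis should stay matrices...
        assert isinstance(x.std(0), numpy.matrix)
        assert isinstance(x.var(0), numpy.matrix)
        # ...and agree with the plain ndarray results
        a = numpy.asarray(x)
        assert numpy.allclose(x.std(0), a.std(0))
        assert numpy.allclose(x.var(0), a.var(0))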
Zach On Apr 27, 2006, at 12:51 AM, Travis Oliphant wrote: > Keith Goodman wrote: >> I noticed that the mean of a matrix is a matrix but the standard >> deviation of a matrix is an array. Is that the expected behavior? I'm >> also getting the wrong values (0 and nan) for the standard deviation. >> Did I mess something up? >> >> I'm trying to learn scipy (and python) by porting a small Octave >> program. I installed numpy from svn (today) on a Debian box. And >> numpy.test() says OK. >> >> > This should be fixed now in SVN. If somebody can add a test that > would be great. > > Note, that the methods taking axes also now preserve row and column > orientation for matrices. > > -Travis > > > > ------------------------------------------------------- > Using Tomcat but need to do more? Need to support web services, > security? > Get stuff done quickly with pre-integrated technology to make your > job easier > Download IBM WebSphere Application Server v.1.0.1 based on Apache > Geronimo > http://sel.as-us.falkag.net/sel? > cmd=lnk&kid=120709&bid=263057&dat=121642 > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/numpy-discussion From arnd.baecker at web.de Thu Apr 27 03:06:17 2006 From: arnd.baecker at web.de (Arnd Baecker) Date: Thu Apr 27 03:06:17 2006 Subject: [Numpy-discussion] vectorize problem In-Reply-To: <444FA7E7.2070303@ieee.org> References: <200604251324.42987.steffen.loeck@gmx.de> <444FA7E7.2070303@ieee.org> Message-ID: On Wed, 26 Apr 2006, Travis Oliphant wrote: [...] > It is just a simple change. Scalars are supposed to be supported. > They aren't only as a side-effect of the switch to not return > object-scalars. I did not update the vectorize code to handle the > scalar return value from the object ufunc (which is now no-longer an > object-scalar with the methods of arrays (like astype) but is instead > the underlying object). > > I'll add a check. Works perfect now - many thanks! This reminds me of some other issue when trying to vectorize f2py-wrapped functions: Pearu suggested a fix in terms of a more general way to determine the number of arguments of a callable Python object, http://www.scipy.net/pipermail/scipy-user/2006-April/007617.html However, it seems that this has fallen through the cracks (and I don't see how to incorporate it into numpy.vectorize...) Is this another simple one? ;-) Many thanks, Arnd From gruben at bigpond.net.au Thu Apr 27 05:05:02 2006 From: gruben at bigpond.net.au (Gary Ruben) Date: Thu Apr 27 05:05:02 2006 Subject: [Numpy-discussion] array.min() vs. min(array) In-Reply-To: References: <444FE909.5080209@ieee.org> Message-ID: <4450B34F.8010501@bigpond.net.au> Hi Arnd, You could call it PerformanceTips and include some search terms like "speed" in the page so search engines pick them up. Gary R. Arnd Baecker wrote: > I am just preparing a small text to collect such cases for the wiki. > > However, I am not sure about a good name for such a page: > http://www.scipy.org/Cookbook/Speed > http://www.scipy.org/Cookbook/SpeedProblems > http://www.scipy.org/Cookbook/Performance > ? From ryanlists at gmail.com Thu Apr 27 06:41:08 2006 From: ryanlists at gmail.com (Ryan Krauss) Date: Thu Apr 27 06:41:08 2006 Subject: [Numpy-discussion] array.min() vs. min(array) In-Reply-To: <4450B34F.8010501@bigpond.net.au> References: <444FE909.5080209@ieee.org> <4450B34F.8010501@bigpond.net.au> Message-ID: I think this is a great idea. 
We get a lot of these kinds of questions on the list, and the collective wisdom of people here who have really dug into this is really impressive. But, that wisdom does need to be a little easier to find. Speaking of which, I don't always feel like I get trustworthy results out of the profiler, so when I really want to know what is going on I find myself doing this alot: t1=time.time() [block of code here] t2=time.time() [more code] t3=time.time() and then comparing t3-t2 and t2-t1 to narrow down where the code is spending its time. Does anyone have good tips on how to do good profiling? Or is this question so vague and counter-intuitive that I seem silly and I had better come back with a believable example? Thanks, Ryan On 4/27/06, Gary Ruben wrote: > Hi Arnd, > > You could call it PerformanceTips and include some search terms like > "speed" in the page so search engines pick them up. > > Gary R. > > Arnd Baecker wrote: > > > I am just preparing a small text to collect such cases for the wiki. > > > > However, I am not sure about a good name for such a page: > > http://www.scipy.org/Cookbook/Speed > > http://www.scipy.org/Cookbook/SpeedProblems > > http://www.scipy.org/Cookbook/Performance > > ? > > > > ------------------------------------------------------- > Using Tomcat but need to do more? Need to support web services, security? > Get stuff done quickly with pre-integrated technology to make your job easier > Download IBM WebSphere Application Server v.1.0.1 based on Apache Geronimo > http://sel.as-us.falkag.net/sel?cmd=lnk&kid=120709&bid=263057&dat=121642 > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/numpy-discussion > From arnd.baecker at web.de Thu Apr 27 06:56:08 2006 From: arnd.baecker at web.de (Arnd Baecker) Date: Thu Apr 27 06:56:08 2006 Subject: [Numpy-discussion] array.min() vs. min(array) In-Reply-To: <4450B34F.8010501@bigpond.net.au> References: <444FE909.5080209@ieee.org> <4450B34F.8010501@bigpond.net.au> Message-ID: On Thu, 27 Apr 2006, Gary Ruben wrote: > Hi Arnd, > > You could call it PerformanceTips and include some search terms like > "speed" in the page so search engines pick them up. Alright, I put all I know on this (which is not that much ;-) at http://www.scipy.org/PerformanceTips The pointers to weave/f2py/pyrex/ (ah - psyco is missing) will have to be added. Also the profiling/benchmarking aspect, which is important (actually more important even before thinking about PerformanceTips) needs to be put somewhere, maybe even separately under http://www.scipy.org/BenchmarkingAndProfiling Best, Arnd > Gary R. > > Arnd Baecker wrote: > > > I am just preparing a small text to collect such cases for the wiki. > > > > However, I am not sure about a good name for such a page: > > http://www.scipy.org/Cookbook/Speed > > http://www.scipy.org/Cookbook/SpeedProblems > > http://www.scipy.org/Cookbook/Performance > > ? > > > > ------------------------------------------------------- > Using Tomcat but need to do more? Need to support web services, security? 
> Get stuff done quickly with pre-integrated technology to make your job easier > Download IBM WebSphere Application Server v.1.0.1 based on Apache Geronimo > http://sel.as-us.falkag.net/sel?cmd=lnk&kid=120709&bid=263057&dat=121642 > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/numpy-discussion > > From arnd.baecker at web.de Thu Apr 27 07:02:16 2006 From: arnd.baecker at web.de (Arnd Baecker) Date: Thu Apr 27 07:02:16 2006 Subject: [Numpy-discussion] array.min() vs. min(array) In-Reply-To: References: <444FE909.5080209@ieee.org> <4450B34F.8010501@bigpond.net.au> Message-ID: On Thu, 27 Apr 2006, Ryan Krauss wrote: > I think this is a great idea. We get a lot of these kinds of > questions on the list, and the collective wisdom of people here who > have really dug into this is really impressive. But, that wisdom does > need to be a little easier to find. > > Speaking of which, I don't always feel like I get trustworthy results > out of the profiler, so when I really want to know what is going on I > find myself doing this alot: > > t1=time.time() > [block of code here] > t2=time.time() > [more code] > t3=time.time() > > and then comparing t3-t2 and t2-t1 to narrow down where the code is > spending its time. > > Does anyone have good tips on how to do good profiling? Or is this > question so vague and counter-intuitive that I seem silly and I had > better come back with a believable example? Maybe this one is of interest then: http://www.physik.tu-dresden.de/~baecker/comp_talks.html and goto "Python and Co - some recent developments" Quite late in the talk there is an example on Profiling (sorry, it seems that no direct linking is possible) The corresponding files are at http://www.physik.tu-dresden.de/~baecker/talks/pyco/BenchExamples/ Essentially it is an example of using kcachegrind to display the results of hotshot (see also: http://mail.enthought.com/pipermail/enthought-dev/2006-January/001075.html ) Best, Arnd > Thanks, > > Ryan > > On 4/27/06, Gary Ruben wrote: > > Hi Arnd, > > > > You could call it PerformanceTips and include some search terms like > > "speed" in the page so search engines pick them up. > > > > Gary R. > > > > Arnd Baecker wrote: > > > > > I am just preparing a small text to collect such cases for the wiki. > > > > > > However, I am not sure about a good name for such a page: > > > http://www.scipy.org/Cookbook/Speed > > > http://www.scipy.org/Cookbook/SpeedProblems > > > http://www.scipy.org/Cookbook/Performance > > > ? > > > > > > > > ------------------------------------------------------- > > Using Tomcat but need to do more? Need to support web services, security? > > Get stuff done quickly with pre-integrated technology to make your job easier > > Download IBM WebSphere Application Server v.1.0.1 based on Apache Geronimo > > http://sel.as-us.falkag.net/sel?cmd=lnk&kid=120709&bid=263057&dat=121642 > > _______________________________________________ > > Numpy-discussion mailing list > > Numpy-discussion at lists.sourceforge.net > > https://lists.sourceforge.net/lists/listinfo/numpy-discussion > > > > > ------------------------------------------------------- > Using Tomcat but need to do more? Need to support web services, security? 
> Get stuff done quickly with pre-integrated technology to make your job easier > Download IBM WebSphere Application Server v.1.0.1 based on Apache Geronimo > http://sel.as-us.falkag.net/sel?cmd_______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/numpy-discussion > > From faltet at carabos.com Thu Apr 27 07:08:06 2006 From: faltet at carabos.com (Francesc Altet) Date: Thu Apr 27 07:08:06 2006 Subject: [Numpy-discussion] array.min() vs. min(array) In-Reply-To: References: <4450B34F.8010501@bigpond.net.au> Message-ID: <200604271606.52780.faltet@carabos.com> A Dijous 27 Abril 2006 15:40, Ryan Krauss va escriure: > I think this is a great idea. We get a lot of these kinds of > questions on the list, and the collective wisdom of people here who > have really dug into this is really impressive. But, that wisdom does > need to be a little easier to find. > > Speaking of which, I don't always feel like I get trustworthy results > out of the profiler, so when I really want to know what is going on I > find myself doing this alot: > > t1=time.time() > [block of code here] > t2=time.time() > [more code] > t3=time.time() > > and then comparing t3-t2 and t2-t1 to narrow down where the code is > spending its time. > > Does anyone have good tips on how to do good profiling? Or is this > question so vague and counter-intuitive that I seem silly and I had > better come back with a believable example? Well, if you are on Linux, and want to time C extension, then oprofile is a *very* good option. Another profiling tool is Cachegrind, part of Valgrind. It uses the processor emulation of Valgrind to run the executable, and catches all memory accesses for the trace. In addition, you can combine the output of oprofile with Cachegrind. In [3] one can see more info about these and more tools. [1] http://oprofile.sourceforge.net [2] http://kcachegrind.sourceforge.net/ [3] https://uimon.cern.ch/twiki/bin/view/Atlas/OptimisingCode Cheers, -- >0,0< Francesc Altet ? ? http://www.carabos.com/ V V C?rabos Coop. V. ??Enjoy Data "-" From lroubeyrie at limair.asso.fr Thu Apr 27 08:41:03 2006 From: lroubeyrie at limair.asso.fr (Lionel Roubeyrie) Date: Thu Apr 27 08:41:03 2006 Subject: [Numpy-discussion] equality with masked object In-Reply-To: References: <200604250938.48648.lroubeyrie@limair.asso.fr> Message-ID: <200604271740.11385.lroubeyrie@limair.asso.fr> Hi, thanks for your answer, but my problem is that I want to obtain the index of the max value in each column of a 2d masked array, then how can I do that without comparaison? Thanks Le Mardi 25 Avril 2006 15:10, Sasha a ?crit?: > On 4/25/06, Lionel Roubeyrie wrote: > > Why 5.0 == -- return True? A float is it the same as a masked object? > > thanks > > It does not. It returns ma.masked : > >>> test[3] is ma.masked > > True > > You should not access masked data - it makes no sense. The current > behavior is historical and I don't really like it. Masked scalars are > replaced by ma.masked singleton in subscript operations to allow a[i] > is masked idiom. In my view it is not worth the trouble, but my > suggestion to get rid of that feature was not met with much > enthusiasm. > > > ------------------------------------------------------- > Using Tomcat but need to do more? Need to support web services, security? 
> Get stuff done quickly with pre-integrated technology to make your job > easier Download IBM WebSphere Application Server v.1.0.1 based on Apache > Geronimo http://sel.as-us.falkag.net/sel?cmd=lnk&kid0709&bid&3057&dat1642 > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/numpy-discussion -- Lionel Roubeyrie - lroubeyrie at limair.asso.fr LIMAIR http://www.limair.asso.fr From ndarray at mac.com Thu Apr 27 08:57:07 2006 From: ndarray at mac.com (Sasha) Date: Thu Apr 27 08:57:07 2006 Subject: [Numpy-discussion] equality with masked object In-Reply-To: <200604271740.11385.lroubeyrie@limair.asso.fr> References: <200604250938.48648.lroubeyrie@limair.asso.fr> <200604271740.11385.lroubeyrie@limair.asso.fr> Message-ID: On 4/27/06, Lionel Roubeyrie wrote: >[....................] I want to obtain the index of > the max value in each column of a 2d masked array, then how can I do that > without comparaison? ma.argmax(x, axis=0, fill_value=ma.maximum_fill_value(x)) or better: argmax(x.fill(ma.maximum_fill_value(x)), axis=0) From kwgoodman at gmail.com Thu Apr 27 09:32:10 2006 From: kwgoodman at gmail.com (Keith Goodman) Date: Thu Apr 27 09:32:10 2006 Subject: [Numpy-discussion] Broadcasting rules (Ticket 76). In-Reply-To: <44506BE6.10301@noaa.gov> References: <4050236.1146000212838.JavaMail.root@fed1wml05.mgt.cox.net> <444F0420.9000500@ieee.org> <44506BE6.10301@noaa.gov> Message-ID: On 4/26/06, Christopher Barker wrote: > something that always bit me with MATLAB. If I had a matrix that > happened to have a dimension of 1, MATLAB would interpret it as a > vector. I ended up writing functions like "SumColumns" that would check > if it was a single row vector before calling sum, so that I wouldn't > suddenly get a scaler result if a matrix happened to have on row. In Octave or Matlab, all you need to do is sum(x,1). For example: >> x = rand(1,4) x = 0.56755 0.24575 0.53804 0.36521 >> sum(x,1) ans = 0.56755 0.24575 0.53804 0.36521 From schofield at ftw.at Thu Apr 27 09:50:03 2006 From: schofield at ftw.at (Ed Schofield) Date: Thu Apr 27 09:50:03 2006 Subject: [Numpy-discussion] matrix operations with axis=None In-Reply-To: <4450780C.9060403@ieee.org> References: <4450780C.9060403@ieee.org> Message-ID: <4450F6F4.2060800@ftw.at> Travis Oliphant wrote: > Keith Goodman wrote: >> I noticed that the mean of a matrix is a matrix but the standard >> deviation of a matrix is an array. Is that the expected behavior? I'm >> also getting the wrong values (0 and nan) for the standard deviation. >> Did I mess something up? > This should be fixed now in SVN. If somebody can add a test that > would be great. > > Note, that the methods taking axes also now preserve row and column > orientation for matrices. > Well done for doing this. In fact, you beat me to it by a few hours; I was going to post a patch this morning to preserve orientation with matrix operations. The approach I took was different in one respect. Matrix objects currently return a matrix of shape (1, 1) from methods with an axis=None argument. For example: >>> x = asmatrix(random.uniform(0,1,(3,3))) >>> x.std() matrix([[ 0.26890557]]) >>> x.argmax() matrix([[4]]) I believe this behaviour is unfortunate, and that an operation aggregating a matrix over all dimensions should return a scalar. I've posted a patch at http://projects.scipy.org/scipy/numpy/ticket/83 that modifies this behaviour to return scalars (as rank-0 arrays) instead. 
It also removes some code duplication. The behaviour with the patch is: >>> x.std() 0.29610630190701492 >>> x.std().shape () >>> x.argmax() 3 Returning scalars from methods with an axis=None argument is the current behaviour of scipy sparse matrices, while axis=0 or axis=1 yields a sparse matrix with height or width 1, like numpy matrices. A (1 x 1) sparse matrix would be a strange object indeed, and would not be usable in all contexts where scalars are expected. I suspect the same would hold for (1 x 1) dense matrices. One example is that they cannot be used as indices for Python lists. For some matrix methods, such as argmax, returning a scalar would be highly desirable, since it allows simpler code. A potential drawback to this change is that matrix operations aggregating along all dimensions, which would now share the behaviour of numpy arrays, would no longer be consistent with matrix operations that aggregate along only one dimension, which currently do not reduce dimension, because matrices are inherently 2-d. This could be an argument for introducing a new vector class to represent one-dimensional data with orientation. -- Ed From gnchen at cortechs.net Thu Apr 27 09:56:12 2006 From: gnchen at cortechs.net (Gennan Chen) Date: Thu Apr 27 09:56:12 2006 Subject: [Numpy-discussion] newbie for writing numpy/scipy extensions Message-ID: <228EDE46-B760-44BA-A987-273F2ADC9B81@cortechs.net> Hi! All, I have just started writing my own Python extension based on numpy. Couple of questions here: 1. I have some utility functions, such as wrappers for PyArray_GETPTR*, that need to be accessed by different extension modules. So, I put them in utils.h and utils.c. In utils.h, I need to include "numpy/arrayobject.h". But the compilation failed when I include it again in my extension module function, wrap.c: #include "numpy/arrayobject.h" #include "utils.h" When I remove it and use #include "utils.h" the compilation works. So, is it true that I can only include arrayobject.h once? 2. Which import should I use in my initial function: import_array() or import_libnumarray() Gen-Nan Chen, PhD Chief Scientist Research and Development Group CorTechs Labs Inc (www.cortechs.net) 1020 Prospect St., #304, La Jolla, CA, 92037 Tel: 1-858-459-9700 ext 16 Fax: 1-858-459-9705 Email: gnchen at cortechs.net From ndarray at mac.com Thu Apr 27 09:59:11 2006 From: ndarray at mac.com (Sasha) Date: Thu Apr 27 09:59:11 2006 Subject: [Numpy-discussion] Warnings in NumPy SVN In-Reply-To: <44507A9D.8070902@ieee.org> References: <44507A9D.8070902@ieee.org> Message-ID: On 4/27/06, Travis Oliphant wrote: > [...] > The function (or macro) needs to implement the operation on the basic > data-type and if necessary set an error-flag in the floating-point > registers. > > If anybody has time to help implement these basic operations, it would > be greatly appreciated. I can help. To make sure we don't duplicate our effort, let's do the following: 1. I will add place-holders for all the necessary functions to make them return "NotImplemented". 2. I will then follow up with the list of functions that need to be filled out and we can then split the work. 3. We will also need to write tests that will make sure scalars behave similarly to dimensionless arrays. If anyone would like to help with this, it will be greatly appreciated. No C coding skills are necessary for that.
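For concreteness, a first cut at such a consistency test could look roughly like this (just a sketch, not the final test layout):

import operator
import numpy

def check_scalar_vs_zero_d():
    ops = [operator.add, operator.sub, operator.mul]
    a, b = numpy.int32(7), numpy.int32(3)
    za, zb = numpy.array(7), numpy.array(3)   # dimensionless arrays
    for op in ops:
        assert op(a, b) == op(za, zb)
        assert op(a, zb) == op(za, b)   # mixed operands should match too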
From oliphant at ee.byu.edu Thu Apr 27 10:01:07 2006 From: oliphant at ee.byu.edu (Travis Oliphant) Date: Thu Apr 27 10:01:07 2006 Subject: [Numpy-discussion] matrix operations with axis=None In-Reply-To: <4450F6F4.2060800@ftw.at> References: <4450780C.9060403@ieee.org> <4450F6F4.2060800@ftw.at> Message-ID: <4450F7F2.1050707@ee.byu.edu> Ed Schofield wrote: >Travis Oliphant wrote: > > >>Keith Goodman wrote: >> >> >>>I noticed that the mean of a matrix is a matrix but the standard >>>deviation of a matrix is an array. Is that the expected behavior? I'm >>>also getting the wrong values (0 and nan) for the standard deviation. >>>Did I mess something up? >>> >>> >>This should be fixed now in SVN. If somebody can add a test that >>would be great. >> >>Note, that the methods taking axes also now preserve row and column >>orientation for matrices. >> >> >> >Well done for doing this. > >In fact, you beat me to it by a few hours; I was going to post a patch >this morning to preserve orientation with matrix operations. The >approach I took was different in one respect. > > I like your function-call approach as it ensures consistent behavior. >Returning scalars from methods with an axis=None argument is the current >behaviour of scipy sparse matrices, while axis=0 or axis=1 yields a >sparse matrix with height or width 1, like numpy matrices. A (1 x 1) >sparse matrix would be a strange object indeed, and would not be usable >in all contexts where scalars are expected. I suspect the same would >hold for (1 x 1) dense matrices. One example is that they cannot be >used as indices for Python lists. For some matrix methods, such as >argmax, returning a scalar would be highly desirable by allowing simpler >code. > >A potential drawback to this change is that matrix operations >aggregating along all dimensions, which would now share the behaviour of >numpy arrays, would be no longer be consistent with matrix operations >that aggregate along only one dimension, which currently do not reduce >dimension, because matrices are inherently 2-d. This could be an >argument for introducing a new vector class to represent one-dimensional >data with orientation. > > There is one more problem in that matrix-operations will not be preserved in all cases as they would have before. However, I suppose somebody doing a reduce over all dimensions would probably not expect the result to be a matrix, so I don't think it's a big drawback. Consistency with sparse matrices is another reason for returning a scalar. -Travis From Fernando.Perez at colorado.edu Thu Apr 27 10:04:01 2006 From: Fernando.Perez at colorado.edu (Fernando Perez) Date: Thu Apr 27 10:04:01 2006 Subject: [Numpy-discussion] Warnings in NumPy SVN In-Reply-To: References: <44507A9D.8070902@ieee.org> Message-ID: <4450F93D.9050905@colorado.edu> Sasha wrote: > On 4/27/06, Travis Oliphant wrote: > >>[...] >>The function (or macro) needs to implement the operation on the basic >>data-type and if necessary set an error-flag in the floating-point >>registers. >> >>If anybody has time to help implement these basic operations, it would >>be greatly appreciated. > > > I can help. To make sure we don't duplicate our effort, let's do the following: > > 1. I will add place-holders for all the necessary functions to make > them return "NotImplemented". just a minor reminder: raise NotImplementedError is the standard idiom for this. 
Cheers, f From kwgoodman at gmail.com Thu Apr 27 10:05:05 2006 From: kwgoodman at gmail.com (Keith Goodman) Date: Thu Apr 27 10:05:05 2006 Subject: [Numpy-discussion] matrix.std() returns array In-Reply-To: <4450780C.9060403@ieee.org> References: <4450780C.9060403@ieee.org> Message-ID: On 4/27/06, Travis Oliphant wrote: > This should be fixed now in SVN. If somebody can add a test that would > be great. > > Note, that the methods taking axes also now preserve row and column > orientation for matrices. Hey, it works. Thank you. From ndarray at mac.com Thu Apr 27 10:52:01 2006 From: ndarray at mac.com (Sasha) Date: Thu Apr 27 10:52:01 2006 Subject: [Numpy-discussion] Broadcasting rules (Ticket 76). In-Reply-To: References: <4050236.1146000212838.JavaMail.root@fed1wml05.mgt.cox.net> <444F0420.9000500@ieee.org> <44506BE6.10301@noaa.gov> Message-ID: On 4/27/06, Keith Goodman wrote: > [...] > In Octave or Matlab, all you need to do is sum(x,1). For example: > > >> x = rand(1,4) > x = > > 0.56755 0.24575 0.53804 0.36521 > > >> sum(x,1) > ans = > > 0.56755 0.24575 0.53804 0.36521 > How is this different from Numpy: >>> x = matrix(rand(4)) >>> sum(x.T, 1) matrix([[ 0.36186805], [ 0.90198107], [ 0.60407661], [ 0.49523327]]) From kwgoodman at gmail.com Thu Apr 27 11:05:03 2006 From: kwgoodman at gmail.com (Keith Goodman) Date: Thu Apr 27 11:05:03 2006 Subject: [Numpy-discussion] Broadcasting rules (Ticket 76). In-Reply-To: References: <4050236.1146000212838.JavaMail.root@fed1wml05.mgt.cox.net> <444F0420.9000500@ieee.org> <44506BE6.10301@noaa.gov> Message-ID: On 4/27/06, Sasha wrote: > On 4/27/06, Keith Goodman wrote: > > [...] > > In Octave or Matlab, all you need to do is sum(x,1). For example: > > > > >> x = rand(1,4) > > x = > > > > 0.56755 0.24575 0.53804 0.36521 > > > > >> sum(x,1) > > ans = > > > > 0.56755 0.24575 0.53804 0.36521 > > > > How is this different from Numpy: > > >>> x = matrix(rand(4)) > >>> sum(x.T, 1) > matrix([[ 0.36186805], > [ 0.90198107], > [ 0.60407661], > [ 0.49523327]]) > Exactly. That's why the OP doesn't need to write a special function in Matlab called SumColumns. From Chris.Barker at noaa.gov Thu Apr 27 11:11:03 2006 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Thu Apr 27 11:11:03 2006 Subject: [Numpy-discussion] Broadcasting rules (Ticket 76). In-Reply-To: References: <4050236.1146000212838.JavaMail.root@fed1wml05.mgt.cox.net> <444F0420.9000500@ieee.org> <44506BE6.10301@noaa.gov> Message-ID: <4451090C.5020901@noaa.gov> Keith Goodman wrote: > Exactly. That's why the OP doesn't need to write a special function in > Matlab called SumColumns. "Didn't". I haven't used MATLAB for much in years. Back in the day, that feature didn't exist. Or at least was poorly enough documented that i didn't think it existed. Matlab didn't used to only support 2-d arrays as well. Anyway, the point was that a (n,) array and a (n,1) array and a (1,n) array are all different, and that difference should be preserved. I'm still confused as to what behavior Sasha wants that doesn't exist. -Chris -- Christopher Barker, Ph.D. 
Oceanographer NOAA/OR&R/HAZMAT (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From oliphant at ee.byu.edu Thu Apr 27 11:17:02 2006 From: oliphant at ee.byu.edu (Travis Oliphant) Date: Thu Apr 27 11:17:02 2006 Subject: [Numpy-discussion] Warnings in NumPy SVN In-Reply-To: References: <44507A9D.8070902@ieee.org> Message-ID: <44510A6E.4090906@ee.byu.edu> Sasha wrote: >On 4/27/06, Travis Oliphant wrote: > > >>[...] >>The function (or macro) needs to implement the operation on the basic >>data-type and if necessary set an error-flag in the floating-point >>registers. >> >>If anybody has time to help implement these basic operations, it would >>be greatly appreciated. >> >> > >I can help. To make sure we don't duplicate our effort, let's do the following: > > > Thanks for your help. >1. I will add place-holders for all the necessary functions to make > > >them return "NotImplemented". > > The Python-object-returning functions are already there. All that is missing is the ctype functions to actually do the computation. So, I'm not sure what you mean. >2. I will then follow up with the list of functions that need to be >filled out and we can then split the work. > > This would be good to get a list. Some of the functions may require some repetition of what's in umathmodule.c. Let's just do the repetition for now and think about code refactoring after we know better what is actually duplicated. >3. We will also need to write tests that will make sure scalars behave >similar to dimensionless arrays. If anyone would like to help with >this, it will be greately appreciated. No C coding skills are >necessary for that. > > Tests would be necessary to ensure consistency. Thanks for jumping in... -Travis From cookedm at physics.mcmaster.ca Thu Apr 27 11:30:05 2006 From: cookedm at physics.mcmaster.ca (David M. Cooke) Date: Thu Apr 27 11:30:05 2006 Subject: [Numpy-discussion] Warnings in NumPy SVN In-Reply-To: <4450F93D.9050905@colorado.edu> (Fernando Perez's message of "Thu, 27 Apr 2006 11:02:53 -0600") References: <44507A9D.8070902@ieee.org> <4450F93D.9050905@colorado.edu> Message-ID: Fernando Perez writes: > Sasha wrote: >> On 4/27/06, Travis Oliphant wrote: >> >>>[...] >>>The function (or macro) needs to implement the operation on the basic >>>data-type and if necessary set an error-flag in the floating-point >>>registers. >>> >>>If anybody has time to help implement these basic operations, it would >>>be greatly appreciated. >> I can help. To make sure we don't duplicate our effort, let's do >> the following: >> 1. I will add place-holders for all the necessary functions to make >> them return "NotImplemented". > > just a minor reminder: > > raise NotImplementedError > > is the standard idiom for this. Just a note: For __xxx__ methods, "return NotImplemented" is the standard idiom. See section 3.3.8 (Coercion rules) of the Python 2.4 language manual: For most intents and purposes, an operator that returns NotImplemented is treated the same as one that is not implemented at all. I believe the idea is that it's not actually an error for an __xxx__ method to not be implemented, as there are fallbacks. -- |>|\/|< /--------------------------------------------------------------------------\ |David M. 
Cooke http://arbutus.physics.mcmaster.ca/dmc/ |cookedm at physics.mcmaster.ca From ndarray at mac.com Thu Apr 27 11:32:08 2006 From: ndarray at mac.com (Sasha) Date: Thu Apr 27 11:32:08 2006 Subject: [Numpy-discussion] Warnings in NumPy SVN In-Reply-To: <44510A6E.4090906@ee.byu.edu> References: <44507A9D.8070902@ieee.org> <44510A6E.4090906@ee.byu.edu> Message-ID: On 4/27/06, Travis Oliphant wrote: > [ ... ] > > The Python-object-returning functions are already there. All that is > missing is the ctype functions to actually do the computation. So, I'm > not sure what you mean. > I did not realize that. However, it is still reasonable to add non-working prototypes to kill the warnings first marked by /* XXX */. I will do that before the end of the day. > >2. I will then follow up with the list of functions that need to be > >filled out and we can then split the work. > > > > > This would be good to get a list. See attached. -------------- next part -------------- byte_ctype_multiply ubyte_ctype_multiply short_ctype_multiply ushort_ctype_multiply int_ctype_multiply uint_ctype_multiply long_ctype_multiply ulong_ctype_multiply longlong_ctype_multiply ulonglong_ctype_multiply byte_ctype_divide ubyte_ctype_divide short_ctype_divide ushort_ctype_divide int_ctype_divide uint_ctype_divide long_ctype_divide ulong_ctype_divide longlong_ctype_divide ulonglong_ctype_divide byte_ctype_remainder ubyte_ctype_remainder short_ctype_remainder ushort_ctype_remainder int_ctype_remainder uint_ctype_remainder long_ctype_remainder ulong_ctype_remainder longlong_ctype_remainder ulonglong_ctype_remainder byte_ctype_divmod ubyte_ctype_divmod short_ctype_divmod ushort_ctype_divmod int_ctype_divmod uint_ctype_divmod long_ctype_divmod ulong_ctype_divmod longlong_ctype_divmod ulonglong_ctype_divmod byte_ctype_power ubyte_ctype_power short_ctype_power ushort_ctype_power int_ctype_power uint_ctype_power long_ctype_power ulong_ctype_power longlong_ctype_power ulonglong_ctype_power byte_ctype_floor_divide ubyte_ctype_floor_divide short_ctype_floor_divide ushort_ctype_floor_divide int_ctype_floor_divide uint_ctype_floor_divide long_ctype_floor_divide ulong_ctype_floor_divide longlong_ctype_floor_divide ulonglong_ctype_floor_divide byte_ctype_true_divide ubyte_ctype_true_divide short_ctype_true_divide ushort_ctype_true_divide int_ctype_true_divide uint_ctype_true_divide long_ctype_true_divide ulong_ctype_true_divide longlong_ctype_true_divide ulonglong_ctype_true_divide byte_ctype_lshift ubyte_ctype_lshift short_ctype_lshift ushort_ctype_lshift int_ctype_lshift uint_ctype_lshift long_ctype_lshift ulong_ctype_lshift longlong_ctype_lshift ulonglong_ctype_lshift byte_ctype_rshift ubyte_ctype_rshift short_ctype_rshift ushort_ctype_rshift int_ctype_rshift uint_ctype_rshift long_ctype_rshift ulong_ctype_rshift longlong_ctype_rshift ulonglong_ctype_rshift byte_ctype_and ubyte_ctype_and short_ctype_and ushort_ctype_and int_ctype_and uint_ctype_and long_ctype_and ulong_ctype_and longlong_ctype_and ulonglong_ctype_and byte_ctype_or ubyte_ctype_or short_ctype_or ushort_ctype_or int_ctype_or uint_ctype_or long_ctype_or ulong_ctype_or longlong_ctype_or ulonglong_ctype_or byte_ctype_xor ubyte_ctype_xor short_ctype_xor ushort_ctype_xor int_ctype_xor uint_ctype_xor long_ctype_xor ulong_ctype_xor longlong_ctype_xor ulonglong_ctype_xor float_ctype_remainder double_ctype_remainder longdouble_ctype_remainder cfloat_ctype_remainder cdouble_ctype_remainder clongdouble_ctype_remainder float_ctype_divmod double_ctype_divmod longdouble_ctype_divmod 
cfloat_ctype_divmod cdouble_ctype_divmod clongdouble_ctype_divmod float_ctype_power double_ctype_power longdouble_ctype_power cfloat_ctype_power cdouble_ctype_power clongdouble_ctype_power cfloat_cfloat_divide cdouble_cfloat_divide clongdouble_cfloat_divide byte_ctype_negative ubyte_ctype_negative short_ctype_negative ushort_ctype_negative int_ctype_negative uint_ctype_negative long_ctype_negative ulong_ctype_negative longlong_ctype_negative ulonglong_ctype_negative float_ctype_negative double_ctype_negative longdouble_ctype_negative cfloat_ctype_negative cdouble_ctype_negative clongdouble_ctype_negative byte_ctype_positive ubyte_ctype_positive short_ctype_positive ushort_ctype_positive int_ctype_positive uint_ctype_positive long_ctype_positive ulong_ctype_positive longlong_ctype_positive ulonglong_ctype_positive float_ctype_positive double_ctype_positive longdouble_ctype_positive cfloat_ctype_positive cdouble_ctype_positive clongdouble_ctype_positive byte_ctype_absolute ubyte_ctype_absolute short_ctype_absolute ushort_ctype_absolute int_ctype_absolute uint_ctype_absolute long_ctype_absolute ulong_ctype_absolute longlong_ctype_absolute ulonglong_ctype_absolute float_ctype_absolute double_ctype_absolute longdouble_ctype_absolute cfloat_ctype_absolute cdouble_ctype_absolute clongdouble_ctype_absolute byte_ctype_nonzero ubyte_ctype_nonzero short_ctype_nonzero ushort_ctype_nonzero int_ctype_nonzero uint_ctype_nonzero long_ctype_nonzero ulong_ctype_nonzero longlong_ctype_nonzero ulonglong_ctype_nonzero float_ctype_nonzero double_ctype_nonzero longdouble_ctype_nonzero cfloat_ctype_nonzero cdouble_ctype_nonzero clongdouble_ctype_nonzero byte_ctype_invert ubyte_ctype_invert short_ctype_invert ushort_ctype_invert int_ctype_invert uint_ctype_invert long_ctype_invert ulong_ctype_invert longlong_ctype_invert ulonglong_ctype_invert float_ctype_invert double_ctype_invert longdouble_ctype_invert cfloat_ctype_invert cdouble_ctype_invert clongdouble_ctype_invert From cookedm at physics.mcmaster.ca Thu Apr 27 11:32:11 2006 From: cookedm at physics.mcmaster.ca (David M. Cooke) Date: Thu Apr 27 11:32:11 2006 Subject: [Numpy-discussion] newbie for writing numpy/scipy extensions In-Reply-To: <228EDE46-B760-44BA-A987-273F2ADC9B81@cortechs.net> (Gennan Chen's message of "Thu, 27 Apr 2006 09:55:42 -0700") References: <228EDE46-B760-44BA-A987-273F2ADC9B81@cortechs.net> Message-ID: Gennan Chen writes: > Hi! All, > > I just start writing my own python extension based on numpy. Couple > of questions here: > > 1. I have some utility functions, such as wrappers for > PyArray_GETPTR* needed be access by different extension modules. So, > I put them in utlis.h and utlis.c. In utils.h, I need to include > "numpy/arrayobject.h". But the compilation failed when I include it > again in my extension module function, wrap.c: > > #include "numpy/arrayobject.h" > #include "utils.h" > > When I remove it and use > > #include "utils.h" > > the compilation works. So, is it true that I can only include > arrayobject.h once? What is the compiler error message? > 2. which import I should use in my initial function: > > import_array() This one. It's the one to use for Numeric, numarray, and numpy. > or > import_libnumarray() This is for numarray, the other Numeric derivative. It pulls in the numarray-specific stuff IIRC. -- |>|\/|< /--------------------------------------------------------------------------\ |David M. 
Cooke http://arbutus.physics.mcmaster.ca/dmc/ |cookedm at physics.mcmaster.ca From oliphant at ee.byu.edu Thu Apr 27 11:36:06 2006 From: oliphant at ee.byu.edu (Travis Oliphant) Date: Thu Apr 27 11:36:06 2006 Subject: [Numpy-discussion] newbie for writing numpy/scipy extensions In-Reply-To: <228EDE46-B760-44BA-A987-273F2ADC9B81@cortechs.net> References: <228EDE46-B760-44BA-A987-273F2ADC9B81@cortechs.net> Message-ID: <44510F04.3020806@ee.byu.edu> Gennan Chen wrote: > Hi! All, > > I just start writing my own python extension based on numpy. Couple > of questions here: > > 1. I have some utility functions, such as wrappers for > PyArray_GETPTR* needed be access by different extension modules. So, > I put them in utlis.h and utlis.c. In utils.h, I need to include > "numpy/arrayobject.h". But the compilation failed when I include it > again in my extension module function, wrap.c: > > #include "numpy/arrayobject.h" > #include "utils.h" > > When I remove it and use > > #include "utils.h" > > the compilation works. So, is it true that I can only include > arrayobject.h once? No, you can include arrayobject.h more than once. However, if you make use of C-API functions (not just macros that access elements of the array) in more than one file for the same extension module, you need to do a couple of things to make it work. In the original file you must define PY_ARRAY_UNIQUE_SYMBOL to something unique to your extension module before you include the arrayobject.h file. In the helper c file you must define PY_ARRAY_UNIQUE_SYMBOL and define NO_IMPORT_ARRAY prior to including the arrayobject.h Thus, in wrap.c you do (feel free to change the name from _chen_extension to something else) #define PY_ARRAY_UNIQUE_SYMBOL _chen_extension #include "numpy/arrayobject.h" and in utils.c you do #define PY_ARRAY_UNIQUE_SYMBOL _chen_extension #define NO_IMPORT_ARRAY #include "numpy/arrayobject.h" > > 2. which import I should use in my initial function: > > import_array() import_array() -Travis From oliphant at ee.byu.edu Thu Apr 27 11:40:10 2006 From: oliphant at ee.byu.edu (Travis Oliphant) Date: Thu Apr 27 11:40:10 2006 Subject: [Numpy-discussion] Broadcasting rules (Ticket 76). In-Reply-To: <4451090C.5020901@noaa.gov> References: <4050236.1146000212838.JavaMail.root@fed1wml05.mgt.cox.net> <444F0420.9000500@ieee.org> <44506BE6.10301@noaa.gov> <4451090C.5020901@noaa.gov> Message-ID: <44510FD2.1090502@ee.byu.edu> Christopher Barker wrote: > Keith Goodman wrote: > >> Exactly. That's why the OP doesn't need to write a special function in >> Matlab called SumColumns. > > > "Didn't". I haven't used MATLAB for much in years. Back in the day, > that feature didn't exist. Or at least was poorly enough documented > that i didn't think it existed. Matlab didn't used to only support 2-d > arrays as well. > > Anyway, the point was that a (n,) array and a (n,1) array and a (1,n) > array are all different, and that difference should be preserved. > > I'm still confused as to what behavior Sasha wants that doesn't exist. I'm not exactly sure. But, one of the things I think he has suggested (please tell me if my understanding is wrong) is to allow a 2x3 array to be "broadcast" to a (2n)x(3m) array by repeated copying as needed. 
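For concreteness, that kind of repetition can already be spelled out by hand (a rough sketch, with n = m = 2):

>>> from numpy import arange, concatenate
>>> a = arange(6).reshape(2,3)
>>> b = concatenate([a, a], axis=0)   # 2x3 -> 4x3
>>> c = concatenate([b, b], axis=1)   # 4x3 -> 4x6

so the proposal is essentially to let ufuncs do this copying implicitly whenever the shapes are exact multiples.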
-Travis From gnchen at cortechs.net Thu Apr 27 12:24:38 2006 From: gnchen at cortechs.net (Gennan Chen) Date: Thu Apr 27 12:24:38 2006 Subject: [Numpy-discussion] newbie for writing numpy/scipy extensions In-Reply-To: <44510F04.3020806@ee.byu.edu> References: <228EDE46-B760-44BA-A987-273F2ADC9B81@cortechs.net> <44510F04.3020806@ee.byu.edu> Message-ID: Thanks! That solve the problem. May I ask what does those #define really means?? Gen On Apr 27, 2006, at 11:35 AM, Travis Oliphant wrote: > Gennan Chen wrote: > >> Hi! All, >> >> I just start writing my own python extension based on numpy. >> Couple of questions here: >> >> 1. I have some utility functions, such as wrappers for >> PyArray_GETPTR* needed be access by different extension modules. >> So, I put them in utlis.h and utlis.c. In utils.h, I need to >> include "numpy/arrayobject.h". But the compilation failed when I >> include it again in my extension module function, wrap.c: >> >> #include "numpy/arrayobject.h" >> #include "utils.h" >> >> When I remove it and use >> >> #include "utils.h" >> >> the compilation works. So, is it true that I can only include >> arrayobject.h once? > > > No, you can include arrayobject.h more than once. However, if you > make use of C-API functions (not just macros that access elements > of the array) in more than one file for the same extension module, > you need to do a couple of things to make it work. > > In the original file you must define PY_ARRAY_UNIQUE_SYMBOL to > something unique to your extension module before you include the > arrayobject.h file. > > In the helper c file you must define PY_ARRAY_UNIQUE_SYMBOL and > define NO_IMPORT_ARRAY prior to including the arrayobject.h > > Thus, in wrap.c you do (feel free to change the name from > _chen_extension to something else) > > #define PY_ARRAY_UNIQUE_SYMBOL _chen_extension #include "numpy/ > arrayobject.h" > > and in > > utils.c you do > > #define PY_ARRAY_UNIQUE_SYMBOL _chen_extension #define > NO_IMPORT_ARRAY > #include "numpy/arrayobject.h" > > >> >> 2. which import I should use in my initial function: >> >> import_array() > > > import_array() > > -Travis > > From gnchen at cortechs.net Thu Apr 27 12:24:41 2006 From: gnchen at cortechs.net (Gennan Chen) Date: Thu Apr 27 12:24:41 2006 Subject: [Numpy-discussion] newbie for writing numpy/scipy extensions In-Reply-To: References: <228EDE46-B760-44BA-A987-273F2ADC9B81@cortechs.net> Message-ID: <8CD47186-A354-4C8A-B5AF-8BEC2CE82D2E@cortechs.net> Got it. Looks like ndimage still used the old one. Gen-Nan Chen, PhD Chief Scientist Research and Development Group CorTechs Labs Inc (www.cortechs.net) 1020 Prospect St., #304, La Jolla, CA, 92037 Tel: 1-858-459-9700 ext 16 Fax: 1-858-459-9705 Email: gnchen at cortechs.net On Apr 27, 2006, at 11:31 AM, David M. Cooke wrote: > Gennan Chen writes: > >> Hi! All, >> >> I just start writing my own python extension based on numpy. Couple >> of questions here: >> >> 1. I have some utility functions, such as wrappers for >> PyArray_GETPTR* needed be access by different extension modules. So, >> I put them in utlis.h and utlis.c. In utils.h, I need to include >> "numpy/arrayobject.h". But the compilation failed when I include it >> again in my extension module function, wrap.c: >> >> #include "numpy/arrayobject.h" >> #include "utils.h" >> >> When I remove it and use >> >> #include "utils.h" >> >> the compilation works. So, is it true that I can only include >> arrayobject.h once? > > What is the compiler error message? > >> 2. 
which import I should use in my initial function: >> >> import_array() > > This one. It's the one to use for Numeric, numarray, and numpy. > >> or >> import_libnumarray() > > This is for numarray, the other Numeric derivative. It pulls in the > numarray-specific stuff IIRC. > > -- > |>|\/|< > /--------------------------------------------------------------------- > -----\ > |David M. Cooke http:// > arbutus.physics.mcmaster.ca/dmc/ > |cookedm at physics.mcmaster.ca > From ndarray at mac.com Thu Apr 27 12:29:03 2006 From: ndarray at mac.com (Sasha) Date: Thu Apr 27 12:29:03 2006 Subject: [Numpy-discussion] Broadcasting rules (Ticket 76). In-Reply-To: <44510FD2.1090502@ee.byu.edu> References: <4050236.1146000212838.JavaMail.root@fed1wml05.mgt.cox.net> <444F0420.9000500@ieee.org> <44506BE6.10301@noaa.gov> <4451090C.5020901@noaa.gov> <44510FD2.1090502@ee.byu.edu> Message-ID: On 4/27/06, Travis Oliphant wrote: > [...] > > I'm still confused as to what behavior Sasha wants that doesn't exist. > > > I'm not exactly sure. But, one of the things I think he has suggested > (please tell me if my understanding is wrong) is to allow a 2x3 array to > be "broadcast" to a (2n)x(3m) array by repeated copying as needed. Yes, this is the only new feature that I've suggested. I was also hoping that the same code that allows shape=(3,) being broadcast to shape (2,3) can be reused to broadcast (3,) to (6,). The idea is that since in terms of memory operations broadcasting and repetition is the same, the code can be reused. The idea is that since repetition can be achieved using broadcasting: >>> x = zeros(3) >>> x.reshape((2,3)) += arange(3) >>> x array([0, 1, 2, 0, 1, 2]) if we allow x += arange(3), it can use the same code as broadcasting internally. From ndarray at mac.com Thu Apr 27 12:30:05 2006 From: ndarray at mac.com (Sasha) Date: Thu Apr 27 12:30:05 2006 Subject: [Numpy-discussion] Broadcasting rules (Ticket 76). In-Reply-To: References: <4050236.1146000212838.JavaMail.root@fed1wml05.mgt.cox.net> <444F0420.9000500@ieee.org> <44506BE6.10301@noaa.gov> <4451090C.5020901@noaa.gov> <44510FD2.1090502@ee.byu.edu> Message-ID: On 4/27/06, Sasha wrote: > >>> x.reshape((2,3)) += arange(3) Oops, that should have been >>> x.reshape((2,3))[...] += arange(3) From Fernando.Perez at colorado.edu Thu Apr 27 12:58:02 2006 From: Fernando.Perez at colorado.edu (Fernando Perez) Date: Thu Apr 27 12:58:02 2006 Subject: [Numpy-discussion] Warnings in NumPy SVN In-Reply-To: References: <44507A9D.8070902@ieee.org> <4450F93D.9050905@colorado.edu> Message-ID: <44512213.9090902@colorado.edu> David M. Cooke wrote: > Fernando Perez writes: > > >>Sasha wrote: >> >>>On 4/27/06, Travis Oliphant wrote: >>> >>> >>>>[...] >>>>The function (or macro) needs to implement the operation on the basic >>>>data-type and if necessary set an error-flag in the floating-point >>>>registers. >>>> >>>>If anybody has time to help implement these basic operations, it would >>>>be greatly appreciated. >>> >>>I can help. To make sure we don't duplicate our effort, let's do >>>the following: >>>1. I will add place-holders for all the necessary functions to make >>>them return "NotImplemented". >> >>just a minor reminder: >> >> raise NotImplementedError >> >>is the standard idiom for this. > > > Just a note: For __xxx__ methods, "return NotImplemented" is the > standard idiom. 
See section 3.3.8 (Coercion rules) of the Python 2.4 > language manual: > > For most intents and purposes, an operator that returns > NotImplemented is treated the same as one that is not implemented > at all. > > I believe the idea is that it's not actually an error for an __xxx__ > method to not be implemented, as there are fallbacks. You are right. It's worth remembering that the actual syntaxes are return NotImplemented and raise NotImplementedError /without/ quotes (as per the original msg), since these are actual Python built-ins, not strings. That way they can be properly handled by their return value or proper exception handling. Cheers, f From nvf at MIT.EDU Thu Apr 27 21:02:03 2006 From: nvf at MIT.EDU (Nick Fotopoulos) Date: Thu Apr 27 21:02:03 2006 Subject: [Numpy-discussion] Freeing memory allocated in C Message-ID: <1C119933-F93B-47C6-AADA-4A61DF16B745@mit.edu> Dear numpy-discussion, I have written a Python module in C which wraps a C library (FrameL) in order to read data from specially formatted files into Python arrays. It works, but I think I have a memory leak, and I can't see what I might be doing wrong. This Python wrapper is almost identical to a Matlab wrapper, but the Matlab version doesn't leak. Perhaps someone here can help me out? I have read in many places that to return an array, one should wrap with PyArray_FromDimsAndData (or more modern versions) and then return it without freeing the memory. Does the same principle hold for strings? Are the following example snippets correct? // output2 = x-axis values relative to first data point. data = malloc(nData*sizeof(double)); for(i=0; i<nData; i++) { data[i] = vect->startX[0]+(double)i*dt; } shape[0] = nData; out2 = (PyArrayObject *) PyArray_FromDimsAndData(1,shape,PyArray_DOUBLE,(char *)data); //snip // output5 = gps start time as a string utc = vect->GTime - vect->ULeapS + FRGPSTAI; out5 = malloc(200*sizeof(char)); sprintf(out5,"Starting GPS time:%.1f UTC=%s", vect->GTime,FrStrGTime(utc)); //snip -- Free all memory not assigned to a return object return Py_BuildValue("(OOOdsss)",out1,out2,out3,out4,out5,out6,out7); I see in the Numpy book that I should modernize PyArray_FromDimsAndData, but will it be incompatible with users who have only Numeric? If the code above should not leak under your inspection, are there any other common places that Python C modules often leak that I should check? As a side note, here is how I have been defining "leak". I have been measuring memory usage by opening a pipe to ps to check rss between reading in frames and invoking del on them. Memory usage increases, but does not decrease. In contrast, if I commit the same data in an array to a pickle file and read that in, invoking del reduces memory usage.
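(For reference, the measurement itself is roughly the following, with ps reporting the resident set size in kilobytes:

import os

def rss():
    # resident set size of the current process, as reported by ps
    return int(os.popen('ps -o rss= -p %d' % os.getpid()).read())

and the comparison is just rss() before and after the del.)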
Many thanks, Nick From robert.kern at gmail.com Thu Apr 27 21:14:02 2006 From: robert.kern at gmail.com (Robert Kern) Date: Thu Apr 27 21:14:02 2006 Subject: [Numpy-discussion] Re: Freeing memory allocated in C In-Reply-To: <1C119933-F93B-47C6-AADA-4A61DF16B745@mit.edu> References: <1C119933-F93B-47C6-AADA-4A61DF16B745@mit.edu> Message-ID: Nick Fotopoulos wrote: > Dear numpy-discussion, > > I have written a python module in C which wraps a C library (FrameL) in > order to read data from specially formatted files into Python arrays. > It works, but I think have a memory leak, and I can't see what I might > be doing wrong. This Python wrapper is almost identical to a Matlab > wrapper, but the Matlab version doesn't leak. Perhaps someone here can > help me out? > > I have read in many places that to return an array, one should wrap > with PyArray_FromDimsAndData (or more modern versions) and then return > it without freeing the memory. Does the same principle hold for > strings? Are the following example snippets correct? > > // output2 = x-axis values relative to first data point. > data = malloc(nData*sizeof(double)); > for(i=0; i<nData; i++) { > data[i] = vect->startX[0]+(double)i*dt; > } > shape[0] = nData; > out2 = (PyArrayObject *) > PyArray_FromDimsAndData(1,shape,PyArray_DOUBLE,(char *)data); I wouldn't rely on PyArray_FromDimsAndData doing the right thing. Instead of malloc'ing a block of memory, why don't you create an empty array of the right size, use its data pointer to fill it with that for-loop, and then return that array object? > //snip > > // output5 = gps start time as a string > utc = vect->GTime - vect->ULeapS + FRGPSTAI; > out5 = malloc(200*sizeof(char)); > sprintf(out5,"Starting GPS time:%.1f UTC=%s", > vect->GTime,FrStrGTime(utc)); > > //snip -- Free all memory not assigned to a return object > > return Py_BuildValue("(OOOdsss)",out1,out2,out3,out4,out5,out6,out7); > > I see in the Numpy book that I should modernize > PyArray_FromDimsAndData, but will it be incompatible with users who > have only Numeric? Yes. However, I would suggest that new code should probably just use numpy fully, especially if the restrictions of the old Numeric API are causing you pain. The longer people support both, the longer people will *have* to support both. -- Robert Kern robert.kern at gmail.com "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From oliphant.travis at ieee.org Thu Apr 27 21:40:04 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Thu Apr 27 21:40:04 2006 Subject: [Numpy-discussion] Freeing memory allocated in C In-Reply-To: <1C119933-F93B-47C6-AADA-4A61DF16B745@mit.edu> References: <1C119933-F93B-47C6-AADA-4A61DF16B745@mit.edu> Message-ID: <44519C6E.80006@ieee.org> Nick Fotopoulos wrote: > Dear numpy-discussion, > > I have written a python module in C which wraps a C library (FrameL) > in order to read data from specially formatted files into Python > arrays. It works, but I think have a memory leak, and I can't see > what I might be doing wrong. This Python wrapper is almost identical > to a Matlab wrapper, but the Matlab version doesn't leak. Perhaps > someone here can help me out? > > I have read in many places that to return an array, one should wrap > with PyArray_FromDimsAndData (or more modern versions) and then return > it without freeing the memory. Does the same principle hold for > strings?
Are the following example snippets correct? Why don't you just use PyArray_FromDims and let NumPy manage the memory? FromDimsAndData is only for situations where you can't manage the memory with Python. Therefore the memory is never freed. If you do want to have NumPy deallocate the memory when you are done, then you have to 1) Make sure you are using the same allocator as NumPy is... _pya_malloc is defined in arrayobject.h (in NumPy but not in Numeric) 2) Reset the array flag so that OWN_DATA is set out2->flags |= OWN_DATA As long as you are using the same memory allocator, this should work. The OWN_DATA flag instructs the deallocator to free the data. But, I would strongly suggest just using PyArray_FromDims and let NumPy allocate the new array for you. > > // output2 = x-axis values relative to first data point. > data = malloc(nData*sizeof(double)); > for(i=0; i<nData; i++) { > data[i] = vect->startX[0]+(double)i*dt; > } > shape[0] = nData; > out2 = (PyArrayObject *) > PyArray_FromDimsAndData(1,shape,PyArray_DOUBLE,(char *)data); > > //snip > > // output5 = gps start time as a string > utc = vect->GTime - vect->ULeapS + FRGPSTAI; > out5 = malloc(200*sizeof(char)); > sprintf(out5,"Starting GPS time:%.1f UTC=%s", > vect->GTime,FrStrGTime(utc)); > > //snip -- Free all memory not assigned to a return object > > return Py_BuildValue("(OOOdsss)",out1,out2,out3,out4,out5,out6,out7); > > > I see in the Numpy book that I should modernize > PyArray_FromDimsAndData, but will it be incompatible with users who > have only Numeric? Yes, the only issue, however, is that PyArray_FromDims and friends will only allow int-length sizes which on 64-bit computers is not as large as intp-length sizes. So, if you don't care about allowing large sizes then you can use the old Numeric C-API. > > If the code above should not leak under your inspection, are there any > other common places that python C modules often leak that I should check? All of the malloc calls in your code leak. In general you should not assume that Python will deallocate memory you have allocated. Python uses its own memory manager so even if you manage to arrange things so that Python will free your memory (and you really have to hack things to do that), then you can run into trouble if you try mixing system malloc calls with Python's deallocation. The proper strategy for your arrays is to use PyArray_SimpleNew and then get the data-pointer to fill using PyArray_DATA(...). The proper way to handle strings is to create a new string (say using PyString_FromFormat) and then return everything as objects. /* make sure shape is defined as intp unless you don't care about 64-bit */ obj2 = PyArray_SimpleNew(1, shape, PyArray_DOUBLE); data = (double *)PyArray_DATA(obj2); [snip...] out5 = PyString_FromFormat("Starting GPS time:%.1f UTC=%s", vect->GTime,FrStrGTime(utc)); return Py_BuildValue("(NNNdNNN)",out1,out2,out3,out4,out5,out6,out7); Make sure you use the 'N' tag so that another reference count isn't generated. The 'O' tag will increase the reference count of your objects by one which is not necessarily what you want (but sometimes you do). Good luck, -Travis From oliphant.travis at ieee.org Fri Apr 28 00:14:16 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Fri Apr 28 00:14:16 2006 Subject: [Numpy-discussion] Scalar math module is ready for testing Message-ID: <4451C076.40608@ieee.org> The scalar math module is complete and ready to be tested. It should speed up code that relies heavily on scalar arithmetic by by-passing the ufunc machinery.
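To get a feel for the speed difference, something like the following can be timed once in a fresh interpreter with the scalarmath import and once without it (fresh interpreters because, as noted below, it cannot be switched off) -- just a sketch:

import timeit
setup = "import numpy; import numpy.core.scalarmath; x = numpy.float64(0.5)"
t = timeit.Timer("x*x + x", setup)
print t.timeit(100000)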
It needs lots of testing to be sure that it is doing the "right" thing. To enable scalarmath you need to import numpy.core.scalarmath You cannot disable it once it's enabled except by restarting Python. If we need that feature we can add it. The array scalars respond to the error modes of ufuncs. There is an experimental function called alter_scalars that replaces the Python int, float, and complex number tables with the array scalar equivalents. Thus, to amaze (or seriously annoy) your Python friends you can do import numpy.core.scalarmath as ncs ncs.alter_scalars(int) 1 / 0 This will return 0 unless you change the error modes... ncs.restore_scalars(int) Will put things back the way Guido intended.... Please try it out and send us error reports. Many thanks to Sasha for his help in getting all the code so it at least compiles and loads. All bugs should be blamed on me, though... Best, -Travis From arnd.baecker at web.de Fri Apr 28 00:48:04 2006 From: arnd.baecker at web.de (Arnd Baecker) Date: Fri Apr 28 00:48:04 2006 Subject: [Numpy-discussion] Scalar math module is ready for testing In-Reply-To: <4451C076.40608@ieee.org> References: <4451C076.40608@ieee.org> Message-ID: Hi Travis, On Fri, 28 Apr 2006, Travis Oliphant wrote: > > The scalar math module is complete and ready to be tested. It should > speed up code that relies heavily on scalar arithmetic by by-passing the > ufunc machinery. > > It needs lots of testing to be sure that it is doing the "right" > thing. To enable scalarmath you need to > > import numpy.core.scalarmath > > You cannot disable it once it's enabled except by restarting Python. If > we need that feature we can add it. The array scalars respond to the > error modes of ufuncs. > > There is an experimental function called alter_scalars that replaces the > Python int, float, and complex number tables with the array scalar > equivalents. Thus, to amaze (or seriously annoy) your Python friends LOL ;-) > you can do > > import numpy.core.scalarmath as ncs > > ncs.alter_scalars(int) > > 1 / 0 > > This will return 0 unless you change the error modes... > > ncs.restore_scalars(int) > > Will put things back the way Guido intended.... > > > Please try it out and send us error reports. Many thanks to Sasha for > his help in getting all the code so it at least compiles and loads. All > bugs should be blamed on me, though...
Well, it does not compile for me (64 Bit opteron, as usual;-): gcc options: '-pthread -fno-strict-aliasing -DNDEBUG -g -O3 -Wall -Wstrict-prototypes -fPIC' compile options: '-Inumpy/core/include -Ibuild/src.linux-x86_64-2.4/numpy/core -Inumpy/core/src -Inumpy/core/include -I/scr/python/include/python2.4 -c' gcc: build/src.linux-x86_64-2.4/numpy/core/src/scalarmathmodule.c build/src.linux-x86_64-2.4/numpy/core/src/scalarmathmodule.c:472: error: redefinition of 'ulong_ctype_multiply' build/src.linux-x86_64-2.4/numpy/core/src/scalarmathmodule.c:421: error: previous definition of 'ulong_ctype_multiply' was here build/src.linux-x86_64-2.4/numpy/core/src/scalarmathmodule.c:421: warning: 'ulong_ctype_multiply' defined but not used build/src.linux-x86_64-2.4/numpy/core/src/scalarmathmodule.c:472: error: redefinition of 'ulong_ctype_multiply' build/src.linux-x86_64-2.4/numpy/core/src/scalarmathmodule.c:421: error: previous definition of 'ulong_ctype_multiply' was here build/src.linux-x86_64-2.4/numpy/core/src/scalarmathmodule.c:421: warning: 'ulong_ctype_multiply' defined but not used error: Command "gcc -pthread -fno-strict-aliasing -DNDEBUG -g -O3 -Wall -Wstrict-prototypes -fPIC -Inumpy/core/include -Ibuild/src.linux-x86_64-2.4/numpy/core -Inumpy/core/src -Inumpy/core/include -I/scr/python/include/python2.4 -c build/src.linux-x86_64-2.4/numpy/core/src/scalarmathmodule.c -o build/temp.linux-x86_64-2.4/build/src.linux-x86_64-2.4/numpy/core/src/scalarmathmodule.o" failed with exit status 1 (I can't look into this now - meeting in -2 minutes ;-) Best, Arnd From schofield at ftw.at Fri Apr 28 01:32:00 2006 From: schofield at ftw.at (Ed Schofield) Date: Fri Apr 28 01:32:00 2006 Subject: [Numpy-discussion] Scalar math module is ready for testing In-Reply-To: <4451C076.40608@ieee.org> References: <4451C076.40608@ieee.org> Message-ID: <4451D3F0.7080408@ftw.at> Travis Oliphant wrote: > > The scalar math module is complete and ready to be tested. It should > speed up code that relies heavily on scalar arithmetic by by-passing > the ufunc machinery. Excellent! > It needs lots of testing to be sure that it is doing the "right" thing. With revision 2454 I get a segfault in numpy.test() after importing numpy.core.scalarmath: check_1 (numpy.distutils.tests.test_misc_util.test_appendpath) ... ok check_2 (numpy.distutils.tests.test_misc_util.test_appendpath) ... ok check_3 (numpy.distutils.tests.test_misc_util.test_appendpath) ... ok check_gpaths (numpy.distutils.tests.test_misc_util.test_gpaths) ... ok check_1 (numpy.distutils.tests.test_misc_util.test_minrelpath) ... ok check_singleton (numpy.lib.tests.test_getlimits.test_double) Program received signal SIGSEGV, Segmentation fault. [Switching to Thread -1208403744 (LWP 11232)] 0xb7142cf7 in int_richcompare (self=0x81c0ab8, other=0x8141dbc, cmp_op=3) at build/src.linux-i686-2.4/numpy/core/src/scalarmathmodule.c:19120 19120 PyArrayScalar_RETURN_TRUE; (gdb) bt #0 0xb7142cf7 in int_richcompare (self=0x81c0ab8, other=0x8141dbc, cmp_op=3) at build/src.linux-i686-2.4/numpy/core/src/scalarmathmodule.c:19120 #1 0x0807ce1f in PyObject_Print () #2 0x0807e451 in PyObject_RichCompare () Is this helpful? 
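A session along these lines should reproduce it (assuming python and numpy were built with debugging symbols):

$ gdb --args python -c "import numpy.core.scalarmath; import numpy; numpy.test()"
(gdb) run
... SIGSEGV ...
(gdb) bt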
-- Ed From steffen.loeck at gmx.de Fri Apr 28 01:34:07 2006 From: steffen.loeck at gmx.de (Steffen Loeck) Date: Fri Apr 28 01:34:07 2006 Subject: [Numpy-discussion] Scalar math module is ready for testing In-Reply-To: <4451C076.40608@ieee.org> References: <4451C076.40608@ieee.org> Message-ID: <200604281033.19781.steffen.loeck@gmx.de> On Friday 28 April 2006 09:12 am, Travis Oliphant wrote: > Please try it out and send us error reports. Many thanks to Sasha for > his help in getting all the code so it at least compiles and loads. All > bugs should be blamed on me, though... Running the tests with numpy.test(10) i get: /test/lib/python2.3/site-packages/numpy/testing/numpytest.py:179: DeprecationWarning: Non-ASCII character '\xf2' in file/test/lib/python2.3/site-packages/numpy/lib/tests/test_ufunclike.pyc on line 1, but no encoding declared; see http://www.python.org/peps/pep-0263.html for details m = imp.load_module(name, open(filename), filename,('.py','U',1)) E................................../test/lib/python2.3/site-packages/numpy/testing/numpytest.py:179: DeprecationWarning: Non-ASCII character '\xf2' in file test/lib/python2.3/site-packages/numpy/lib/tests/test_polynomial.pyc on line 1, but no encoding declared; see http://www.python.org/peps/pep-0263.html for details m = imp.load_module(name, open(filename), filename,('.py','U',1)) E........................................................................... ====================================================================== ERROR: check_doctests (numpy.lib.tests.test_ufunclike.test_docs) ---------------------------------------------------------------------- Traceback (most recent call last): File "/test/lib/python2.3/site-packages/numpy/lib/tests/test_ufunclike.py", line 59, in check_doctests def check_doctests(self): return self.rundocs() File "/test//lib/python2.3/site-packages/numpy/testing/numpytest.py", line 179, in rundocs m = imp.load_module(name, open(filename), filename,('.py','U',1)) File "test/lib/python2.3/site-packages/numpy/lib/tests/test_ufunclike.pyc", line 1 ;? ^ SyntaxError: invalid syntax ====================================================================== ERROR: check_doctests (numpy.lib.tests.test_polynomial.test_docs) ---------------------------------------------------------------------- Traceback (most recent call last): File "/test/lib/python2.3/site-packages/numpy/lib/tests/test_polynomial.py", line 79, in check_doctests def check_doctests(self): return self.rundocs() File "/test//lib/python2.3/site-packages/numpy/testing/numpytest.py", line 179, in rundocs m = imp.load_module(name, open(filename), filename,('.py','U',1)) File "/test/lib/python2.3/site-packages/numpy/lib/tests/test_polynomial.pyc", line 1 ;? ^ SyntaxError: invalid syntax I have no idea, where this comes from. Regards, Steffen From fullung at gmail.com Fri Apr 28 02:39:03 2006 From: fullung at gmail.com (Albert Strasheim) Date: Fri Apr 28 02:39:03 2006 Subject: [Numpy-discussion] newbie for writing numpy/scipy extensions In-Reply-To: <8CD47186-A354-4C8A-B5AF-8BEC2CE82D2E@cortechs.net> Message-ID: <018c01c66aa7$77764480$0a84a8c0@dsp.sun.ac.za> Hello all I've collected the information from this thread along with links to some recent threads on writing C extensions on the wiki at: http://www.scipy.org/Cookbook/C_Extensions Feel free to contribute! 
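As a companion to the wiki page, the build side of a small extension can be driven by numpy.distutils in a handful of lines (a sketch only; the module and source file names here are hypothetical, and the Extension import is assumed to be re-exported by numpy.distutils.core):

    # setup.py -- build a C extension against the numpy headers
    from numpy.distutils.core import setup, Extension

    setup(name='mymodule',
          version='0.1',
          ext_modules=[Extension('mymodule',
                                 sources=['wrap.c', 'utils.c'])])

Running "python setup.py build_ext --inplace" should then produce an importable mymodule next to the sources.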
Regards, Albert

> -----Original Message-----
> From: numpy-discussion-admin at lists.sourceforge.net [mailto:numpy-
> discussion-admin at lists.sourceforge.net] On Behalf Of Gennan Chen
> Sent: 27 April 2006 21:23
> To: David M.Cooke
> Cc: Numpy-discussion at lists.sourceforge.net
> Subject: Re: [Numpy-discussion] newbie for writing numpy/scipy extensions
>
> Got it. Looks like ndimage still used the old one.
>
> Gen-Nan Chen, PhD
> Chief Scientist
> Research and Development Group
> CorTechs Labs Inc (www.cortechs.net)
> 1020 Prospect St., #304, La Jolla, CA, 92037
> Tel: 1-858-459-9700 ext 16
> Fax: 1-858-459-9705
> Email: gnchen at cortechs.net
>
> On Apr 27, 2006, at 11:31 AM, David M. Cooke wrote:
>
> > Gennan Chen writes:
> >
> >> Hi! All,
> >>
> >> I just started writing my own python extension based on numpy. Couple
> >> of questions here:
> >>
> >> 1. I have some utility functions, such as wrappers for
> >> PyArray_GETPTR*, that need to be accessed by different extension modules.
> >> So, I put them in utils.h and utils.c. In utils.h, I need to include
> >> "numpy/arrayobject.h". But the compilation failed when I include it
> >> again in my extension module function, wrap.c:
> >>
> >> #include "numpy/arrayobject.h"
> >> #include "utils.h"
> >>
> >> When I remove it and use
> >>
> >> #include "utils.h"
> >>
> >> the compilation works. So, is it true that I can only include
> >> arrayobject.h once?
> >
> > What is the compiler error message?
> >
> >> 2. which import I should use in my initial function:
> >>
> >> import_array()
> >
> > This one. It's the one to use for Numeric, numarray, and numpy.
> >
> >> or
> >> import_libnumarray()
> >
> > This is for numarray, the other Numeric derivative. It pulls in the
> > numarray-specific stuff IIRC.
> >
> > --
> > |>|\/|<
> > /--------------------------------------------------------------------------\
> > |David M. Cooke http://arbutus.physics.mcmaster.ca/dmc/
> > |cookedm at physics.mcmaster.ca

From lcordier at point45.com Fri Apr 28 06:36:10 2006
From: lcordier at point45.com (Louis Cordier)
Date: Fri Apr 28 06:36:10 2006
Subject: [Numpy-discussion] Bug
Message-ID: 

Hi, I am not sure if this is the proper place to do a bug post. I looked at the active tickets on http://projects.scipy.org/scipy/numpy/ but didn't feel confident to go and create a new one. ;)

Anyway the current release version 0.9.6 has some broken behavior. I guess some example code would illustrate it best.

---8<----------------

>>> z = numpy.zeros((10,10), 'O')
>>> z.fill(None)
>>> z.fill([])
Segmentation fault (core dumped)

This happens on both Linux and FreeBSD machines. (both builds use *_lite versions of Lapack)

Linux bellagio 2.6.11-1.1369_FC4 #1 Thu Jun 2 22:55:56 EDT 2005 i686 i686 i386 GNU/Linux
Python 2.4.1
gcc version 4.0.0 20050519 (Red Hat 4.0.0-8)

FreeBSD cerberus.intranet 5.4-RELEASE-p12 FreeBSD 5.4-RELEASE-p12 #0: Wed Mar 15 16:06:48 UTC 2006
Python 2.4.2
gcc version 3.4.2 [FreeBSD] 20040728

I assume fill() will need to make a copy of the object for each coordinate in the matrix.

---8<----------------

While,

>>> import numpy
>>> z = numpy.zeros((2,2), 'O')
>>> z
array([[0, 0],
       [0, 0]], dtype=object)
>>> z.fill([1])
>>> z
array([[1, 1],
       [1, 1]], dtype=object)

and

>>> z.fill([1,2,3])
>>> z
array([[1, 1],
       [1, 1]], dtype=object)

I would have expected,

>>> z
array([[[1], [1]],
       [[1], [1]]], dtype=object)

and

>>> z
array([[[1, 2, 3], [1, 2, 3]],
       [[1, 2, 3], [1, 2, 3]]], dtype=object)

Regards, Louis. 
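For anyone hitting this today, an explicit loop gives the expected per-element behaviour and avoids the crashing fill() path entirely (a minimal sketch in plain Python; it assumes only that assigning to a single element of an object array stores the object itself):

    import numpy
    z = numpy.zeros((2, 2), 'O')
    # store a fresh list in every cell instead of calling z.fill([1, 2, 3])
    for i in range(z.shape[0]):
        for j in range(z.shape[1]):
            z[i, j] = [1, 2, 3]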
-- 
Louis Cordier cell: +27721472305
Point45 Entertainment (Pty) Ltd. http://www.point45.org

From ndarray at mac.com Fri Apr 28 09:04:09 2006
From: ndarray at mac.com (Sasha)
Date: Fri Apr 28 09:04:09 2006
Subject: [Numpy-discussion] Bug
In-Reply-To: 
References: 
Message-ID: 

The core dump is definitely a bug. I reproduced it on my Linux system. Please create a ticket. I am not sure whether fill should copy objects or not. When you populate an array with immutable objects, creating multiple copies is a waste.

On 4/28/06, Louis Cordier wrote:
>
> Hi, I am not sure if this is the proper place to do a bug post.
> I looked at the active tickets on http://projects.scipy.org/scipy/numpy/
> but didn't feel confident to go and create a new one. ;)
>
> Anyway the current release version 0.9.6 have some broken behavior.
> I guess some example code would illustrate it best.
>
> ---8<----------------
>
> >>> z = numpy.zeros((10,10), 'O')
> >>> z.fill(None)
> >>> z.fill([])
> Segmentation fault (core dumped)
>
> This happens on both Linux and FreeBSD machines.
> (both builds use *_lite versions of Lapack)
>
> Linux bellagio 2.6.11-1.1369_FC4 #1 Thu Jun 2 22:55:56 EDT 2005 i686 i686
> i386 GNU/Linux
> Python 2.4.1
> gcc version 4.0.0 20050519 (Red Hat 4.0.0-8)
>
> FreeBSD cerberus.intranet 5.4-RELEASE-p12 FreeBSD 5.4-RELEASE-p12 #0: Wed
> Mar 15 16:06:48 UTC 2006
> Python 2.4.2
> gcc version 3.4.2 [FreeBSD] 20040728
>
> I assume fill() will need to make a copy, of the object
> for each coordinate in the matix.
>
> ---8<----------------
>
> While,
>
> >>> import numpy
> >>> z = numpy.zeros((2,2), 'O')
> >>> z
> array([[0, 0],
>        [0, 0]], dtype=object)
> >>> z.fill([1])
> >>> z
> array([[1, 1],
>        [1, 1]], dtype=object)
>
> and
>
> >>> z.fill([1,2,3])
> >>> z
> array([[1, 1],
>        [1, 1]], dtype=object)
>
> I would have expected,
>
> >>> z
> array([[[1], [1]],
>        [[1], [1]]], dtype=object)
>
> and
>
> >>> z
> array([[[1, 2, 3], [1, 2, 3]],
>        [[1, 2, 3], [1, 2, 3]]], dtype=object)
>
> Regards, Louis.
>
> --
> Louis Cordier cell: +27721472305
> Point45 Entertainment (Pty) Ltd. http://www.point45.org
>
> -------------------------------------------------------
> Using Tomcat but need to do more? Need to support web services, security?
> Get stuff done quickly with pre-integrated technology to make your job easier
> Download IBM WebSphere Application Server v.1.0.1 based on Apache Geronimo
> http://sel.as-us.falkag.net/sel?cmd=lnk&kid=120709&bid=263057&dat=121642
> _______________________________________________
> Numpy-discussion mailing list
> Numpy-discussion at lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/numpy-discussion
>

From ndarray at mac.com Fri Apr 28 10:04:08 2006
From: ndarray at mac.com (Sasha)
Date: Fri Apr 28 10:04:08 2006
Subject: [Numpy-discussion] Bug
In-Reply-To: 
References: 
Message-ID: 

See <http://projects.scipy.org/scipy/numpy/ticket/86>.

On 4/28/06, Sasha wrote:
> The core dump is definitely a bug. I reproduced it on my Linux
> system. Please create a ticket. I am not sure whether fill should
> copy objects or not. When you populate an array with immutable
> objects, creating multiple copies is a waste.
>
> On 4/28/06, Louis Cordier wrote:
> >
> > Hi, I am not sure if this is the proper place to do a bug post.
> > I looked at the active tickets on http://projects.scipy.org/scipy/numpy/
> > but didn't feel confident to go and create a new one. ;)
> >
> > Anyway the current release version 0.9.6 have some broken behavior.
> > I guess some example code would illustrate it best. 
> > > > ---8<---------------- > > > > >>> z = numpy.zeros((10,10), 'O') > > >>> z.fill(None) > > >>> z.fill([]) > > Segmentation fault (core dumped) > > > > This happens on both Linux and FreeBSD machines. > > (both builds use *_lite versions of Lapack) > > > > Linux bellagio 2.6.11-1.1369_FC4 #1 Thu Jun 2 22:55:56 EDT 2005 i686 i686 > > i386 GNU/Linux > > Python 2.4.1 > > gcc version 4.0.0 20050519 (Red Hat 4.0.0-8) > > > > FreeBSD cerberus.intranet 5.4-RELEASE-p12 FreeBSD 5.4-RELEASE-p12 #0: Wed > > Mar 15 16:06:48 UTC 2006 > > Python 2.4.2 > > gcc version 3.4.2 [FreeBSD] 20040728 > > > > I assume fill() will need to make a copy, of the object > > for each coordinate in the matix. > > > > ---8<---------------- > > > > While, > > > > >>> import numpy > > >>> z = numpy.zeros((2,2), 'O') > > >>> z > > array([[0, 0], > > [0, 0]], dtype=object) > > >>> z.fill([1]) > > >>> z > > array([[1, 1], > > [1, 1]], dtype=object) > > > > and > > > > >>> z.fill([1,2,3]) > > >>> z > > array([[1, 1], > > [1, 1]], dtype=object) > > > > > > I would have expected, > > > > >>> z > > array([[[1], [1]], > > [[1], [1]]], dtype=object) > > > > and > > > > >>> z > > array([[[1, 2, 3], [1, 2, 3]], > > [[1, 2, 3], [1, 2, 3]]], dtype=object) > > > > > > Regards, Louis. > > > > -- > > Louis Cordier cell: +27721472305 > > Point45 Entertainment (Pty) Ltd. http://www.point45.org > > > > > > > > ------------------------------------------------------- > > Using Tomcat but need to do more? Need to support web services, security? > > Get stuff done quickly with pre-integrated technology to make your job easier > > Download IBM WebSphere Application Server v.1.0.1 based on Apache Geronimo > > http://sel.as-us.falkag.net/sel?cmd=lnk&kid=120709&bid=263057&dat=121642 > > _______________________________________________ > > Numpy-discussion mailing list > > Numpy-discussion at lists.sourceforge.net > > https://lists.sourceforge.net/lists/listinfo/numpy-discussion > > > From lcordier at point45.com Fri Apr 28 10:24:04 2006 From: lcordier at point45.com (Louis Cordier) Date: Fri Apr 28 10:24:04 2006 Subject: [Numpy-discussion] Bug In-Reply-To: References: Message-ID: > See . >> > >>> z.fill([1,2,3]) >> > >>> z >> > array([[1, 1], >> > [1, 1]], dtype=object) >> > >> > I would have expected, >> > >> > >>> z >> > array([[[1, 2, 3], [1, 2, 3]], >> > [[1, 2, 3], [1, 2, 3]]], dtype=object) Souldn't the second example be a ticket ? Or is it part of #86 ? Regards, Louis. -- Louis Cordier cell: +27721472305 Point45 Entertainment (Pty) Ltd. http://www.point45.org From ndarray at mac.com Fri Apr 28 10:49:02 2006 From: ndarray at mac.com (Sasha) Date: Fri Apr 28 10:49:02 2006 Subject: [Numpy-discussion] Bug In-Reply-To: References: Message-ID: On 4/28/06, Louis Cordier wrote: > Souldn't the second example be a ticket ? > Or is it part of #86 ? I think all your examples are different signs of the same problem. You can help by converting your examples into unit tests to be added to say test_multiarray.py and attaching a patch to the ticket. A brief comment for the developers: the problem that Louis reported is caused by the fact that x.fill([]) creates an empty array internally instead of a scalar object array containing an empty list. 
Note that numpy does not even have a good notation for the required object: >>> from numpy import * >>> x = zeros(1,'O') >>> x.shape=() >>> x[()] = [] >>> x array([], dtype=object) >>> x.shape () but >>> array([], dtype=object).shape (0,) From fullung at gmail.com Fri Apr 28 15:32:13 2006 From: fullung at gmail.com (Albert Strasheim) Date: Fri Apr 28 15:32:13 2006 Subject: [Numpy-discussion] Scalar math module is ready for testing In-Reply-To: <4451C076.40608@ieee.org> Message-ID: <007701c66b13$8365df00$0a84a8c0@dsp.sun.ac.za> Hello Travis I'm having some problems compiling the scalarmath code with the Visual Studio .NET 2003 compiler. Specifically, the compiler is failing to link in the llabs, fabsf and sqrtf functions. The reason it is not finding these symbols could be explained by the following errors I get when building the object file by hand using the parameters distutils passes to the compiler (for some reason distutils is suppressing compiler output -- this is pretty, but it makes debugging build failures hard): build\src.win32-2.4\numpy\core\src\scalarmathmodule.c(1737) : warning C4013: 'llabs' undefined; assuming extern returning int build\src.win32-2.4\numpy\core\src\scalarmathmodule.c(1751) : warning C4013: 'fabsf' undefined; assuming extern returning int build\src.win32-2.4\numpy\core\src\scalarmathmodule.c(1773) : warning C4013: 'sqrtf' undefined; assuming extern returning int In c:\Program Files\Microsoft Visual Studio .NET 2003\vc7\crt\src\math.h I have the following (extra code stripped): ... #ifndef __cplusplus #define acosl(x) ((long double)acos((double)(x))) #define asinl(x) ((long double)asin((double)(x))) #define atanl(x) ((long double)atan((double)(x))) ... /* NOTE! no sqrtf or fabsf is defined in this block */ #else /* __cplusplus */ ... #if !defined (_M_MRX000) && !defined (_M_ALPHA) && !defined (_M_IA64) /* NOTE! none of the above are defined on x86 */ ... inline float fabsf(float _X) {return ((float)fabs((double)_X)); } ... inline float sqrtf(float _X) {return ((float)sqrt((double)_X)); } ... #endif /* !defined (_M_MRX000) && !defined (_M_ALPHA) && !defined (_M_IA64) */ #endif /* __cplusplus */ >From this it would seem that Microsoft doesn't consider sqrtf and fabsf to be part of the C language? However, the C++ code provides a clue for how they implemented it. Also, llabs isn't defined anywhere. From reading the MSDN docs, I suspect it is called _abs64 on Windows. Regards, Albert > -----Original Message----- > From: numpy-discussion-admin at lists.sourceforge.net [mailto:numpy- > discussion-admin at lists.sourceforge.net] On Behalf Of Travis Oliphant > Sent: 28 April 2006 09:13 > To: numpy-discussion > Subject: [Numpy-discussion] Scalar math module is ready for testing > > > The scalar math module is complete and ready to be tested. It should > speed up code that relies heavily on scalar arithmetic by by-passing the > ufunc machinery. > > It needs lots of testing to be sure that it is doing the "right" > thing. To enable scalarmath you need to > > import numpy.core.scalarmath > > You cannot disable it once it's enabled except by restarting Python. If > we need that feature we can add it. The array scalars respond to the > error modes of ufuncs. > > There is an experimental function called alter_scalars that replaces the > Python int, float, and complex number tables with the array scalar > equivalents. 
Thus, to amaze (or seriously annoy) your Python friends
> you can do
>
> import numpy.core.scalarmath as ncs
>
> ncs.alter_scalars(int)
>
> 1 / 0
>
> This will return 0 unless you change the error modes...
>
> ncs.restore_scalars(int)
>
> Will put things back the way Guido intended....
>
> Please try it out and send us error reports. Many thanks to Sasha for
> his help in getting all the code so it at least compiles and loads. All
> bugs should be blamed on me, though...
>
> Best,
>
> -Travis

From jonathan.taylor at stanford.edu Fri Apr 28 16:21:15 2006
From: jonathan.taylor at stanford.edu (Jonathan Taylor)
Date: Fri Apr 28 16:21:15 2006
Subject: [Numpy-discussion] confusing recarray behaviour
Message-ID: <44528318.6010604@stanford.edu>

I'm new to recarrays and have been struggling with them. I keep getting an exception

TypeError: expected a readable buffer object

with no informative traceback. What I pass to N.array seems to agree with the examples in numpybook.

Below is an example that does work for me (excuse the longish example but it was just cut and paste to make my life easier). In my code, funny things happen (see ipython excerpt below this). In particular, I have a list v with v[0:2] = V and with the same dtype "ddesc" I get this exception when I change V to v[0:2]. Any help would be appreciated.

---------------------------------------------------------------------------------------

import numpy as N

timedesc = N.dtype({'names':['tm_year', 'tm_mon', 'tm_mday', 'tm_hour',
                             'tm_min', 'tm_sec', 'tm_wday', 'tm_yday', 'tm_isdst'],
                    'formats':['i2']*9})

ddesc = N.dtype({'names': ('Week', 'Date', 'Institution', 'SeqNo', 'HeightDone',
                           'Height', 'UnitsH', 'WeightDone', 'Weight', 'Units',
                           'PulseDone', 'Pulse', 'BPdone', 'BPSys', 'BPDia',
                           'PID', 'RN'),
                 'formats': ['f4', timedesc] + ['f4']*15})

V = [(12.0, (2005, 4, 22, 0, 0, 0, 4, 112, -1), 501.0, 1.0, 2.0, 0.0, 0, 1.0,
      91.5, 1.0, 1.0, 87.0, 1.0, 129.0, 76.0, 107.0, 11.0),
     (24.0, (2005, 2, 1, 0, 0, 0, 1, 32, -1), 504.0, 1.0, 2.0, 0.0, 0, 1.0,
      166.0, 2.0, 1.0, 84.0, 1.0, 128.0, 78.0, 401.0, 7.0)
]

w=N.array(V, dtype=ddesc)

--------------------------------------------------------------------------------------------------

In [97]:v[0:2] == V
Out[97]:True

In [98]:N.array(V, ddesc)
Out[98]:
array([ (12.0, (2005, 4, 22, 0, 0, 0, 4, 112, -1), 501.0, 1.0, 2.0, 0.0, 0.0, 1.0, 91.5, 1.0, 1.0, 87.0, 1.0, 129.0, 76.0, 107.0, 11.0),
        (24.0, (2005, 2, 1, 0, 0, 0, 1, 32, -1), 504.0, 1.0, 2.0, 0.0, 0.0, 1.0, 166.0, 2.0, 1.0, 84.0, 1.0, 128.0, 78.0, 401.0, 7.0)],
      dtype=[('Week', '<f4'), ...])

[...]

TypeError: expected a readable buffer object

--

------------------------------------------------------------------------
I'm part of the Team in Training: please support our efforts for the
Leukemia and Lymphoma Society!

http://www.active.com/donate/tntsvmb/tntsvmbJTaylor

GO TEAM !!!

------------------------------------------------------------------------
Jonathan Taylor Tel: 650.723.9230
Dept. of Statistics Fax: 650.725.8977
Sequoia Hall, 137 www-stat.stanford.edu/~jtaylo
390 Serra Mall
Stanford, CA 94305

-------------- next part --------------
A non-text attachment was scrubbed... 
Name: jonathan.taylor.vcf
Type: text/x-vcard
Size: 329 bytes
Desc: not available
URL: 

From Fernando.Perez at colorado.edu Fri Apr 28 16:21:17 2006
From: Fernando.Perez at colorado.edu (Fernando Perez)
Date: Fri Apr 28 16:21:17 2006
Subject: [Numpy-discussion] [OT] A weekend floating point/compiler question
Message-ID: <44528F49.3080005@colorado.edu>

Hi all,

this is somewhat off-topic, since it's really a gcc/g77 question. Yet for us here (my group) it may lead to the decision to stop using g77 for all fortran code and switch to another compiler for our python-wrapped libraries. So it did arise in the context of python usage of in-house code, and I'm appealing to anyone who may want to play a little with the question and help. Feel free to reply off-list to keep the noise down on the list.

The problem arose in some in-house library, but can be boiled down to this:

planck[f77bug]> cat testbug.f
      program testbug
c
      implicit real *8 (a-h,o-z)
c
      half = 0.5d0
      x = 0.49d0
      nnx = 100
      iax = (x+half)*nnx

      print *, 'Should be 99:',iax

      stop
      end
c EOF
planck[f77bug]> g77 -o testbug.g77 testbug.f
planck[f77bug]> ./testbug.g77
 Should be 99: 98

This can be seen as computing (x/n+1/2)*n and comparing it to x+n/2. Yes, I know about the dangers of floating point roundoff error (I didn't write the original code), but a variation of this is used inside a library that began crashing for certain inputs. The point is that this same code works fine with the Intel and Lahey compilers, but not with g77.

Now, to add a bit of mystery to the question, I wrote the following C code:

planck[f77bug]> cat scanbug.c
#include <stdio.h>

int main(int argc, char* argv[])
{
    double x;
    double eps = 1e-2;
    double x0 = 0.0;
    double xmax = 1.0;
    int nnx = 100;
    int i = 0;
    double dax;
    int iax, iax_direct;

    x = x0;
    while (x < xmax) {
[...]

References: <1C119933-F93B-47C6-AADA-4A61DF16B745@mit.edu> <44519C6E.80006@ieee.org>
Message-ID: 

Many thanks, with your help, I got it working without any leaks. I need to run on ~10 TB of data, so fixing this leak sure helps my program scale.

One error in the code below is that PyString_FromFormat does not accept %f, so I created a regular string and created the PyString with PyString_FromString (it seems to copy data), then freed the regular string. Is there any better way to do that?

I'm curious why I didn't see any explanation of PyArray_DATA in the NumPy book. It seems really important, especially if you're touting it as the Proper Strategy.

Finally, Robert encouraged me to stop using the legacy interface. I'm happy to do so, but I have to cater to my users. Approximately how old a version of Numeric (and Numarray) will still work with PyArray_SimpleNew?

Thanks,
Nick

On Apr 28, 2006, at 12:39 AM, Travis Oliphant wrote:

> The proper strategy for your arrays is to use PyArray_SimpleNew and
> then get the data-pointer to fill using PyArray_DATA(...). The
> proper way to handle strings is to create a new string (say using
> PyString_FromFormat) and then return everything as objects.
>
> /* make sure shape is defined as intp unless you don't care about
> 64-bit */
> obj2 = PyArray_SimpleNew(1, shape, PyArray_DOUBLE);
> data = (double *)PyArray_DATA(obj2);
> [snip...]
> out5 = PyString_FromFormat("Starting GPS time:%.1f UTC=%s",
> vect->GTime,FrStrGTime(utc));
>
> return Py_BuildValue("(NNNdNNN)",out1,out2,out3,out4,out5,out6,out7);
>
> Make sure you use the 'N' tag so that another reference count isn't
> generated. 
The 'O' tag will increase the reference count of your
> objects by one which is not necessarily what you want (but
> sometimes you do).

From robert.kern at gmail.com Fri Apr 28 16:43:18 2006
From: robert.kern at gmail.com (Robert Kern)
Date: Fri Apr 28 16:43:18 2006
Subject: [Numpy-discussion] Re: Freeing memory allocated in C
In-Reply-To: 
References: <1C119933-F93B-47C6-AADA-4A61DF16B745@mit.edu> <44519C6E.80006@ieee.org>
Message-ID: 

Nick Fotopoulos wrote:

> I'm curious why I didn't see any explanation of PyArray_DATA in the
> NumPy book. It seems really important, especially if you're touting it
> as the Proper Strategy.

Section 13.3 talks about PyArray_DATA.

> Finally, Robert encouraged me to stop using the legacy interface. I'm
> happy to do so, but I have to cater to my users. Approximately how old a
> version of Numeric (and Numarray) will still work with PyArray_SimpleNew?

None. It is new to Numpy. The old way would be to use PyArray_FromDims.

--
Robert Kern
robert.kern at gmail.com

"I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco

From Fernando.Perez at colorado.edu Fri Apr 28 16:55:02 2006
From: Fernando.Perez at colorado.edu (Fernando Perez)
Date: Fri Apr 28 16:55:02 2006
Subject: [Numpy-discussion] A weekend floating point/compiler question
Message-ID: <4452AB3F.8090700@colorado.edu>

Hi Robert and George,

We found a bug in g77 v. 3.4.4 as well as in gcc, which manifests itself in the following little snippet:

planck[f77bug]> cat testbug.f
      program testbug
c
      implicit real *8 (a-h,o-z)
c
      half = 0.5d0
      x = 0.49d0
      nnx = 100
      iax = (x+half)*nnx

      print *, 'Should be 99:',iax

      stop
      end
c EOF
planck[f77bug]> g77 -o testbug.g77 testbug.f
planck[f77bug]> ./testbug.g77
 Should be 99: 98

This can be seen as computing (x/n+1/2)*n and comparing it to x+n/2. Greg is using this in a number of places inside a library, which had never given trouble before when built with other compilers, like the sun, IBM, Intel and Lahey ones. Now with g77 it gives the result above.

Questions:

1. Have you seen similar behavior in the past?

2. If we switch away from g77, what do you suggest moving towards? We ran paranoia on ifort, lahey and g77, and lahey was the best performing of all. The intel one has the advantage of being free. On the other hand, paranoia did complain about arithmetic issues with it (though the above code works fine with intel).

Any ideas you can give us would be very appreciated.

Cheers,

Fernando and Greg.

ps. Apparently g77 v 3.3.2 does NOT have this problem.

From robert.kern at gmail.com Fri Apr 28 16:58:15 2006
From: robert.kern at gmail.com (Robert Kern)
Date: Fri Apr 28 16:58:15 2006
Subject: [Numpy-discussion] Re: [OT] A weekend floating point/compiler question
In-Reply-To: <44528F49.3080005@colorado.edu>
References: <44528F49.3080005@colorado.edu>
Message-ID: <4452ABFE.2040307@gmail.com>

Fernando Perez wrote:

> Any ideas/comments? Shouldn't the result be independent of the
> intermediate double var? It is for icc, can this be considered a gcc bug?

It seems like it might be processor-specific. On my G4 Powerbook (g77 3.4.4, gcc 3.3) and AMD64 Linux desktop (g77 3.4.5, gcc 4.0.2), both programs give the expected results. Specifically, the Intel 80-bit FPU thingy is probably a factor.

It might be worth filing a bug report against gcc. If nothing else, you might get a better explanation of what's going on. 
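Independent of compiler and FPU mode, the defensive fix for this class of bug is to round to nearest before truncating; a minimal sketch in Python (the function name is hypothetical, and positive inputs are assumed):

    def to_index(x, half, n):
        # Adding 0.5 before int() turns truncation into round-to-nearest,
        # so a product that lands one ulp below 99.0 still maps to 99,
        # whether intermediates are kept in 64 or 80 bits.
        return int((x + half) * n + 0.5)

    print to_index(0.49, 0.5, 100)   # -> 99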
-- Robert Kern robert.kern at gmail.com "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From Fernando.Perez at colorado.edu Fri Apr 28 17:13:16 2006 From: Fernando.Perez at colorado.edu (Fernando Perez) Date: Fri Apr 28 17:13:16 2006 Subject: [Numpy-discussion] A weekend floating point/compiler question In-Reply-To: <4452AB3F.8090700@colorado.edu> References: <4452AB3F.8090700@colorado.edu> Message-ID: <4452AF7D.6040008@colorado.edu> Fernando Perez wrote: > Hi Robert and George, Sorry! I was writing the same question to two colleagues and forgot to change the TO line. My apology. Cheers, f From gnchen at cortechs.net Fri Apr 28 18:08:03 2006 From: gnchen at cortechs.net (Gennan Chen) Date: Fri Apr 28 18:08:03 2006 Subject: [Numpy-discussion] Guide to Numpy book Message-ID: <3FA6601C-819F-4F15-A670-829FC428F47B@cortechs.net> Hi! What is the newest version of Guide to numpy? The recent one I got is dated at Jan 9 2005 on the cover. Gen-Nan Chen, PhD Chief Scientist Research and Development Group CorTechs Labs Inc (www.cortechs.net) 1020 Prospect St., #304, La Jolla, CA, 92037 Tel: 1-858-459-9700 ext 16 Fax: 1-858-459-9705 Email: gnchen at cortechs.net From luis at geodynamics.org Fri Apr 28 18:29:03 2006 From: luis at geodynamics.org (Luis Armendariz) Date: Fri Apr 28 18:29:03 2006 Subject: [Numpy-discussion] Guide to Numpy book In-Reply-To: <3FA6601C-819F-4F15-A670-829FC428F47B@cortechs.net> References: <3FA6601C-819F-4F15-A670-829FC428F47B@cortechs.net> Message-ID: <4452C145.8050803@geodynamics.org> Gennan Chen wrote: > Hi! > > What is the newest version of Guide to numpy? The recent one I got is > dated at Jan 9 2005 on the cover. > The one I got yesterday is dated March 15, 2006. -Luis From robert.kern at gmail.com Sat Apr 29 00:31:22 2006 From: robert.kern at gmail.com (Robert Kern) Date: Sat Apr 29 00:31:22 2006 Subject: [Numpy-discussion] Re: A python interface for loess ? In-Reply-To: <200604260329.17115.pgmdevlist@mailcan.com> References: <200604260329.17115.pgmdevlist@mailcan.com> Message-ID: <4453162E.1040901@gmail.com> Pierre GM wrote: > Folks, > Would any of you be aware of a Python interface to the loess routines ? > http://netlib.bell-labs.com/netlib/a/dloess.gz Not specifically this code, but there is a pure Python+old Numeric implementation of lowess in BioPython, specifically in the Bio.Statistics subpackage. It's short and could be easily ported to use numpy. http://www.biopython.org -- Robert Kern robert.kern at gmail.com "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From chris at pseudogreen.org Sat Apr 29 09:09:11 2006 From: chris at pseudogreen.org (Christopher Stawarz) Date: Sat Apr 29 09:09:11 2006 Subject: [Numpy-discussion] Re: A weekend floating point/compiler question Message-ID: <01fa3363e635409f488757070c5f8268@pseudogreen.org> Hi, I don't think this is a GCC bug, but it does seem to be related to Intel's 80-bit floating-point architecture. As of the Pentium 3, Intel and compatible processors have two sets of instructions for performing floating-point operations: the original 8087 set, which do all computations at 80-bit precision, and SSE (and their extension SSE2), which don't use extended precision. GCC allows you to select either instruction set. 
Unfortunately, in the absence of an explicit choice, it uses a default target that varies by platform: The i386 version defaults to 8087 instructions, while the x86-64 version defaults to SSE. See http://gcc.gnu.org/onlinedocs/gcc-4.1.0/gcc/i386-and-x86_002d64- Options.html for details. I can make your test programs behave correctly on a Pentium 4 by selecting SSE2: devel12-35: g77 testbug.f devel12-36: ./a.out Should be 99: 98 devel12-37: g77 -msse2 -mfpmath=sse testbug.f devel12-38: ./a.out Should be 99: 99 devel12-39: gcc scanbug.c devel12-40: ./a.out | head -1 ERROR at x=3.000000e-02! devel12-41: gcc -msse2 -mfpmath=sse scanbug.c devel12-42: ./a.out devel12-43: Interestingly, I expected to be able to induce incorrect results on an Opteron by using 8087, but that wasn't the case (both instruction sets produced the correct result). I'll have to think about why that's happening -- maybe casting between ints and doubles differs between 32 and 64-bit architectures? I've never used the Intel or Lahey Fortran compilers, but I suspect they must be generating SSE instructions by default. Actually, it's interesting that the 80-bit computations are causing problems here, since it's easy to come up with examples where they give you better results than computations done without the extra bits. Hope that helps, Chris From charlesr.harris at gmail.com Sat Apr 29 10:25:01 2006 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat Apr 29 10:25:01 2006 Subject: [Numpy-discussion] A weekend floating point/compiler question In-Reply-To: <4452AB3F.8090700@colorado.edu> References: <4452AB3F.8090700@colorado.edu> Message-ID: On 4/28/06, Fernando Perez wrote: > > Hi Robert and George, > > We found a bug in g77 v. 3.4.4 as well as in gcc, which manifests itself > in > the following little snippet: > > planck[f77bug]> cat testbug.f > program testbug > c > implicit real *8 (a-h,o-z) > c > half = 0.5d0 > x = 0.49d0 > nnx = 100 > iax = (x+half)*nnx > > print *, 'Should be 99:',iax > > stop > end > > c EOF I don't see why the answer should be 99. The number .99 can not be exactly represented in IEEE floating point, in fact it is ~ 0.9899999999999999911182. So as you can see the result is perfectly correct given the standard conversion to int by truncation. IMHO, this is programmer error, not a compiler problem and should be fixed in the code. Now you may get slightly different results depending on roundoff error if you indulge in such things as (.5 + .49)*100 vs (.33 + .17 + .49)*100, and since these numbers are constants they may also be precomputed by the compiler and the results will depend on the accuracy of the compiler's computation. The whole construction is ambiguous. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Sat Apr 29 10:43:08 2006 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat Apr 29 10:43:08 2006 Subject: [Numpy-discussion] A weekend floating point/compiler question In-Reply-To: References: <4452AB3F.8090700@colorado.edu> Message-ID: On 4/29/06, Charles R Harris wrote: > > > > On 4/28/06, Fernando Perez wrote: > > > > Hi Robert and George, > > > > We found a bug in g77 v. 
3.4.4 as well as in gcc, which manifests itself
> > in
> > the following little snippet:
> >
> > planck[f77bug]> cat testbug.f
> >       program testbug
> > c
> >       implicit real *8 (a-h,o-z)
> > c
> >       half = 0.5d0
> >       x = 0.49d0
> >       nnx = 100
> >       iax = (x+half)*nnx
> >
> >       print *, 'Should be 99:',iax
> >
> >       stop
> >       end
> >
> > c EOF
>
> I don't see why the answer should be 99. The number .99 can not be exactly
> represented in IEEE floating point, in fact it is ~
> 0.9899999999999999911182. So as you can see the result is perfectly
> correct given the standard conversion to int by truncation. IMHO, this is
> programmer error, not a compiler problem and should be fixed in the code.
> Now you may get slightly different results depending on roundoff error if
> you indulge in such things as (.5 + .49)*100 vs (.33 + .17 + .49)*100, and
> since these numbers are constants they may also be precomputed by the
> compiler and the results will depend on the accuracy of the compiler's
> computation. The whole construction is ambiguous.
>
> Chuck
>

As an example:

#include <stdio.h>

int main(int argc, char** argv)
{
    int x = 100;
    long double y = .49;
    long double z = .50;
    printf("%25.22Lf\n", (y + z)*x);
    return 0;
}

prints 98.9999999999999991118216 whereas the same code with doubles instead of long doubles prints 99.0000000000000000000000.

Chuck

-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From oliphant.travis at ieee.org Sat Apr 29 13:13:05 2006
From: oliphant.travis at ieee.org (Travis Oliphant)
Date: Sat Apr 29 13:13:05 2006
Subject: [Numpy-discussion] confusing recarray behaviour
In-Reply-To: <44528318.6010604@stanford.edu>
References: <44528318.6010604@stanford.edu>
Message-ID: <4453C8B7.8040000@ieee.org>

Jonathan Taylor wrote:
>
> What I pass to N.array seems to agree with the examples in numpybook.
>
> Below is an example that does work for me (excuse the longish example
> but it was just cut and paste to make my life easier). In my code,
> funny things happen
> (see ipython excerpt below this). In particular, I have a list v with
> v[0:2] = V and with the
> same dtype "ddesc" I get this exception when I change V to v[0:2].

Please show us what v is.

If I run v = V[:] and then try N.array(v[0:2],ddesc) I don't get any error. So something else must be going on.

Which version are you running?

-Travis

From fullung at gmail.com Sat Apr 29 14:30:10 2006
From: fullung at gmail.com (Albert Strasheim)
Date: Sat Apr 29 14:30:10 2006
Subject: [Numpy-discussion] Array data and struct alignment
Message-ID: <001601c66bd4$0a37ddb0$0a84a8c0@dsp.sun.ac.za>

Hello all

I'm busy wrapping a C library with NumPy. Some of the functions operate on a buffer containing structs that look like this:

struct node {
    int index;
    double value;
};

On the Python side, I do the following to set up my data. examples is a list containing lists or dicts.

nodes = []
for example in examples:
    if type(example) is dict:
        nodes.append(example.items())
    else:
        nodes.append(zip(range(1, len(example)+1), example))

descr = [('index','intc',1),('value','f8',1)]
self.nodes = map(lambda x: array(x, dtype=descr), nodes)

Assume examples = [[1.0, 2.0, 3.0], {4: 4.0}]. 
The nodes array can now be accessed in various useful ways:

nodes[0][0] -> (1, 1.0)
nodes[1][0] -> (4, 4.0)
nodes[0]['index'] -> [1,2,3]
nodes[0]['value'] -> [1.0,2.0,3.0]
nodes[1]['index'] -> [4]
nodes[1]['value'] -> [4.0]

On the C side I can now do the following:

PyObject* Svm_GetStructNode(PyObject* obj, PyObject* args)
{
    PyObject* op1;
    struct node* node;
    if(!PyArg_ParseTuple(args, "O", &op1)) {
        return NULL;
    }
    node = (struct node*) PyArray_DATA(op1);
    return Py_BuildValue("(id)", node->index, node->value);
}

However, this only works if struct node is tightly packed (#pragma pack(1) with the Visual C compiler).

I don't know how feasible this is, but it would be useful if NumPy could be told to pack its data on n-byte boundaries or on "same as the compiler" boundaries. I realise that there can be problems when mixing code compiled by more than one compiler, etc., etc., but a simple unit test can check for this.

Any thoughts?

Regards, Albert

From oliphant.travis at ieee.org Sat Apr 29 14:58:01 2006
From: oliphant.travis at ieee.org (Travis Oliphant)
Date: Sat Apr 29 14:58:01 2006
Subject: [Numpy-discussion] Array data and struct alignment
In-Reply-To: <001601c66bd4$0a37ddb0$0a84a8c0@dsp.sun.ac.za>
References: <001601c66bd4$0a37ddb0$0a84a8c0@dsp.sun.ac.za>
Message-ID: <4453E10E.5090108@ieee.org>

Albert Strasheim wrote:
> Hello all
>
> I'm busy wrapping a C library with NumPy. Some of the functions operate on a
> buffer containing structs that look like this:
>
> struct node {
>     int index;
>     double value;
> };
>
> [snip]
> However, this only works if struct node is tightly packed (#pragma pack(1)
> with the Visual C compiler).
>
> I don't know how feasible this is, but it would be useful if NumPy could be
> told to pack its data on n-byte boundaries or on "same as the compiler"
> boundaries. I realise that there can be problems when mixing code compiled
> by more than one compiler, etc., etc., but a simple unit test can check for
> this.
>

When you create a data-type using the dtype(...) syntax there is an align keyword that will "align" the data according to how the compiler does it. I'm not sure if it always works right so please test it out.

So, in your case you should be able to say:

descr = dtype([('index',intc),('value','f8')], align=1)

Note, I've eliminated some unnecessary verbiage in your description. Currently this is giving me an error that I will look into. 
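A quick cross-check that an aligned descr really matches the compiler's struct layout is to compare its itemsize against sizeof(struct node); a sketch (assuming a 4-byte int, an 8-byte double, and the usual padding to 16 bytes — not verified on every platform):

    from numpy import dtype, intc

    descr = dtype({'names': ['index', 'value'],
                   'formats': [intc, 'f8']}, align=1)
    # 4 (int) + 4 (padding) + 8 (double) on typical x86 ABIs
    assert descr.itemsize == 16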
-Travis

From oliphant.travis at ieee.org Sat Apr 29 15:11:07 2006
From: oliphant.travis at ieee.org (Travis Oliphant)
Date: Sat Apr 29 15:11:07 2006
Subject: [Numpy-discussion] Array data and struct alignment
In-Reply-To: <4453E293.7080502@ieee.org>
References: <001601c66bd4$0a37ddb0$0a84a8c0@dsp.sun.ac.za> <4453E293.7080502@ieee.org>
Message-ID: <4453E449.20407@ieee.org>

Travis Oliphant wrote:
> Albert Strasheim wrote:
>> Hello all
>>
>> I'm busy wrapping a C library with NumPy. Some of the functions
>> operate on a buffer containing structs that look like this:
>>
>> struct node {
>>     int index;
>>     double value;
>> };
>>
>
> In my previous discussion I was wrong. You cannot use the
> array_descriptor format for a data-type and the align keyword at the
> same time. You need to use a different method to specify fields.
>
> This, for example:
>
> descr = dtype({'names':['index', 'value'],
> 'formats':[intc,'f8']},align=1)
>
> On my (32-bit) system it doesn't produce any difference from align=0.
>
> -Travis
>

However notice the difference with

>>> dtype({'names':['index', 'value'], 'formats':[short,'f8']},align=1)
dtype([('index', '<i2'), ('', '|V6'), ('value', '<f8')])

>>> dtype({'names':['index', 'value'], 'formats':[short,'f8']},align=0)
dtype([('index', '<i2'), ('value', '<f8')])

There is padding inserted in the first-case. This corresponds to how the compiler packs a short; double struct on my system. The default is align=0. You need to use the dtype() constructor to change the default. The auto-constructor used in dtype= keyword calls will not change the alignment from align=0.

-Travis

From Fernando.Perez at colorado.edu Sat Apr 29 2006
From: Fernando.Perez at colorado.edu (Fernando Perez)
Subject: Re: [Numpy-discussion] A weekend floating point/compiler question
In-Reply-To: 
References: <4452AB3F.8090700@colorado.edu>
Message-ID: <4453F3A6.9030309@colorado.edu>

Charles R Harris wrote:

>>I don't see why the answer should be 99. The number .99 can not be exactly
>>represented in IEEE floating point, in fact it is ~
>>0.9899999999999999911182. So as you can see the result is perfectly
>>correct given the standard conversion to int by truncation. IMHO, this is
>>programmer error, not a compiler problem and should be fixed in the code.
>>Now you may get slightly different results depending on roundoff error if
>>you indulge in such things as (.5 + .49)*100 vs (.33 + .17 + .49)*100, and
>>since these numbers are constants they may also be precomputed by the
>>compiler and the results will depend on the accuracy of the compiler's
>>computation. The whole construction is ambiguous.
>>
>>Chuck
>>
>
> As an example: [...]

Thanks to yours and the other replies. I did try resetting the FPU control word as suggested to only 64 bits, and in fact the 'problem' does disappear, and I suspect that's also why Robert sees differences in CPUs without the extra 16 internal FPU bits.

I do agree that I don't like code like this, but unfortunately this one is outside of my control.

For the sake of completeness (since this thread has some educational value on the vagaries of FP arithmetic), I've slightly extended your example to:

abdul[f77bug]> cat print99.c
#include <stdio.h>

int main(int argc, char** argv)
{
    int x = 100;

    float fy = .49;
    float fz = .50;
    float fw = (fy + fz)*x;
    int ifw = fw;

    double y = .49;
    double z = .50;
    double w = (y + z)*x;
    int iw = w;

    long double ly = .49;
    long double lz = .50;
    long double lw = (ly + lz)*x;
    int ilw = lw;

    printf("floats:\n");
    printf("w=%25.22f, iw=%d\n", fw,ifw);
    printf("doubles:\n");
    printf("w=%25.22f, iw=%d\n", w,iw);
    printf("long doubles:\n");
    printf("w=%25.22Lf, iw=%d\n", lw,ilw);

    return 0;
}
// EOF

which gives on my box (AMD chip, running 32-bit fedora3):

abdul[f77bug]> ./print99.gcc
floats:
w=99.0000000000000000000000, iw=99
doubles:
w=99.0000000000000000000000, iw=99
long doubles:
w=98.9999999999999991118216, iw=98

This is consistent with the calculations done in 80 bits giving also different results.

One of the nice things about this community is precisely this kind of friendly expertise. Many thanks to all. 
Cheers,

f

From fullung at gmail.com Sat Apr 29 17:27:15 2006
From: fullung at gmail.com (Albert Strasheim)
Date: Sat Apr 29 17:27:15 2006
Subject: [Numpy-discussion] Array data and struct alignment
In-Reply-To: <4453E449.20407@ieee.org>
Message-ID: <001d01c66bec$c556ece0$0a84a8c0@dsp.sun.ac.za>

Thanks Travis, this works like a charm.

For the curious, here's a quick way to see if your system is doing the right thing:

In [87]: descr = dtype({'names':['a', 'b'], 'formats':[byte,'f8']},align=1)

In [88]: descr
Out[88]: dtype([('a', '|i1'), ('', '|V7'), ('b', '<f8')])

[...]

> -----Original Message-----
> From: numpy-discussion-admin at lists.sourceforge.net [mailto:numpy-
> discussion-admin at lists.sourceforge.net] On Behalf Of Travis Oliphant
> Sent: 30 April 2006 00:10
> To: numpy-discussion
> Subject: Re: [Numpy-discussion] Array data and struct alignment
>
> Travis Oliphant wrote:
> > Albert Strasheim wrote:
> >> Hello all
> >>
> >> I'm busy wrapping a C library with NumPy. Some of the functions
> >> operate on a
> >> buffer containing structs that look like this:
> >>
> >> struct node {
> >>     int index;
> >>     double value;
> >> };
> >>
> >
> > In my previous discussion I was wrong. You cannot use the
> > array_descriptor format for a data-type and the align keyword at the
> > same time. You need to use a different method to specify fields.
> >
> > This, for example:
> >
> > descr = dtype({'names':['index', 'value'],
> > 'formats':[intc,'f8']},align=1)
> >
> > On my (32-bit) system it doesn't produce any difference from align=0.
> >
> > -Travis
> >
> However notice the difference with
>
> >>> dtype({'names':['index', 'value'], 'formats':[short,'f8']},align=1)
> dtype([('index', '<i2'), ('', '|V6'), ('value', '<f8')])
>
> >>> dtype({'names':['index', 'value'], 'formats':[short,'f8']},align=0)
> dtype([('index', '<i2'), ('value', '<f8')])
>
> There is padding inserted in the first-case. This corresponds to how
> the compiler packs a short; double struct on my system. The default is
> align=0. You need to use the dtype() constructor to change the
> default. The auto-constructor used in dtype= keyword calls will not
> change the alignment from align=0.
>
> -Travis

From jonathan.taylor at stanford.edu Sat Apr 29 19:56:03 2006
From: jonathan.taylor at stanford.edu (Jonathan Taylor)
Date: Sat Apr 29 19:56:03 2006
Subject: [Numpy-discussion] confusing recarray behaviour
In-Reply-To: <4453C8B7.8040000@ieee.org>
References: <44528318.6010604@stanford.edu> <4453C8B7.8040000@ieee.org>
Message-ID: <44542730.4050609@stanford.edu>

Here is a pickle file with v and desc, v is just a list of tuples with integer and string entries.

My point with my example is that when I had two identical lists (i.e. v[0:2] == V) one time I got an error, the other time I didn't and the traceback had no information, i.e. I couldn't get anywhere with pdb.

I am using svn revision 2456.

Jonathan

Travis Oliphant wrote:
> Jonathan Taylor wrote:
>
>>
>> What I pass to N.array seems to agree with the examples in numpybook.
>>
>> Below is an example that does work for me (excuse the longish example
>> but it was just cut and paste to make my life easier). In my code,
>> funny things happen
>> (see ipython excerpt below this). In particular, I have a list v with
>> v[0:2] = V and with the
>> same dtype "ddesc" I get this exception when I change V to v[0:2].
>
> Please show us what v is.
>
> If I run v = V[:] and then try N.array(v[0:2],ddesc) I don't get any
> error. So something else must be going on.
>
> Which version are you running? 
> > -Travis
>
>
> -------------------------------------------------------
> Using Tomcat but need to do more? Need to support web services, security?
> Get stuff done quickly with pre-integrated technology to make your job
> easier
> Download IBM WebSphere Application Server v.1.0.1 based on Apache
> Geronimo
> http://sel.as-us.falkag.net/sel?cmd=lnk&kid=120709&bid=263057&dat=121642
> _______________________________________________
> Numpy-discussion mailing list
> Numpy-discussion at lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/numpy-discussion

--

------------------------------------------------------------------------
I'm part of the Team in Training: please support our efforts for the
Leukemia and Lymphoma Society!

http://www.active.com/donate/tntsvmb/tntsvmbJTaylor

GO TEAM !!!

------------------------------------------------------------------------
Jonathan Taylor Tel: 650.723.9230
Dept. of Statistics Fax: 650.725.8977
Sequoia Hall, 137 www-stat.stanford.edu/~jtaylo
390 Serra Mall
Stanford, CA 94305

-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: dump.pickle
URL: 

From ndarray at mac.com Sun Apr 30 10:12:06 2006
From: ndarray at mac.com (Sasha)
Date: Sun Apr 30 10:12:06 2006
Subject: [Numpy-discussion] [Numeric] "put" into object array corrupts memory
In-Reply-To: 
References: 
Message-ID: 

I know that Numeric is no longer maintained, but since this bug cost me two sleepless nights, I think it is appropriate to announce the bug and the fix to the list.

---------- Forwarded message ----------
From: SourceForge.net
Date: Apr 30, 2006 12:58 PM
Subject: [ numpy-Bugs-1479376 ] [Numeric] "put" into object array corrupts memory
To: noreply at sourceforge.net

Bugs item #1479376, was opened at 2006-04-30 12:46
Message generated for change (Comment added) made by belopolsky
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=101369&aid=1479376&group_id=1369

Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update.

Category: Fatal Error
Group: Normal bug
Status: Open
Priority: 5
Submitted By: Alexander Belopolsky (belopolsky)
Assigned to: Nobody/Anonymous (nobody)
Summary: [Numeric] "put" into object array corrupts memory

Initial Comment:
This is one of those bugs that are easier to fix than to reproduce:

$ cat test-put.py
class A(object):
    def __del__(self):
        print "deleting %r" % self
a = A()
from Numeric import *
x = array([None], 'O')
y = array([a], 'O')
put(x,[0],y)
del a,y
print "exiting"

$ python test-put.py
deleting <__main__.A object at 0xf7e4d24c>
exiting
Fatal Python error: deletion of interned string failed
Aborted (core dumped)

Numeric version: 24.2

----------------------------------------------------------------------

>Comment By: Alexander Belopolsky (belopolsky)
Date: 2006-04-30 12:58

Message:
Logged In: YES
user_id=835142

Attached patch fixes the bug. 
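Distilled into a regression test, the report reduces to a few lines (a sketch against old Numeric; the assert only checks that the value lands and the interpreter survives):

    import Numeric
    x = Numeric.array([None], 'O')
    y = Numeric.array([{'value': 1}], 'O')
    Numeric.put(x, [0], y)   # used to corrupt memory on object arrays
    assert x[0] == {'value': 1}
    print "ok"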
----------------------------------------------------------------------

You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=101369&aid=1479376&group_id=1369

From vidar+list at 37mm.no Sun Apr 30 16:27:00 2006
From: vidar+list at 37mm.no (Vidar Gundersen)
Date: Sun Apr 30 16:27:00 2006
Subject: [Numpy-discussion] Guide to Numpy book
In-Reply-To: <4452C145.8050803@geodynamics.org> (Luis Armendariz's message of "Fri, 28 Apr 2006 18:28:37 -0700")
References: <3FA6601C-819F-4F15-A670-829FC428F47B@cortechs.net> <4452C145.8050803@geodynamics.org>
Message-ID: 

===== Original message from Luis Armendariz | 29 Apr 2006:
>> What is the newest version of Guide to numpy? The recent one I got is
>> dated at Jan 9 2005 on the cover.
> The one I got yesterday is dated March 15, 2006.

aren't the updates supposed to be sent out to customers when available?

From ted.horst at earthlink.net Sun Apr 30 16:50:08 2006
From: ted.horst at earthlink.net (Ted Horst)
Date: Sun Apr 30 16:50:08 2006
Subject: [Numpy-discussion] Scalar math module is ready for testing
In-Reply-To: <4451C076.40608@ieee.org>
References: <4451C076.40608@ieee.org>
Message-ID: <3856FA57-539D-47DE-8427-2A6BB508F917@earthlink.net>

Here is an issue I am having with scalarmath:

>>> import numpy
>>> numpy.__version__
'0.9.7.2462'
>>> import numpy.core.scalarmath
>>> a = numpy.array([1], 'h')
>>> 1*a
array([1], dtype=int16)
>>> 1*a[0]
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
TypeError: unsupported operand type(s) for *: 'int' and 'int16scalar'

This happens because PyArray_CanCastSafely returns false for casting from int to short. alter_scalars(int) fixes this, but I have lots of non-numpy code that I don't want to behave differently.

Ted

On Apr 28, 2006, at 02:12, Travis Oliphant wrote:

> The scalar math module is complete and ready to be tested. It
> should speed up code that relies heavily on scalar arithmetic by by-
> passing the ufunc machinery.

From fullung at gmail.com Sun Apr 30 17:11:05 2006
From: fullung at gmail.com (Albert Strasheim)
Date: Sun Apr 30 17:11:05 2006
Subject: [Numpy-discussion] Creating a descr with aligned=1 using the C API
Message-ID: <000601c66cb3$b762a940$0a84a8c0@dsp.sun.ac.za>

Hello all

I was wondering what the best way would be to create the following descr using the C API:

descr = dtype({'names' : ['index', 'value'], 'formats' : [intc, 'f8']}, align=1)

One could use PyArray_DescrConverter in multiarraymodule.c, but there doesn't seem to be a way to specify aligned=1 and one would have to build the dict object before being able to pass it on for conversion.

Unless there's another easy way I'm missing, the API could possibly do with a function like PyArray_DescrFromCommaString(const char*, int align) which calls _convert_from_commastring. By the way, what is the general format of these commastrings?

Comments appreciated.

Regards, Albert

From tim.hochberg at cox.net Sun Apr 30 19:33:03 2006
From: tim.hochberg at cox.net (Tim Hochberg)
Date: Sun Apr 30 19:33:03 2006
Subject: [Numpy-discussion] basearray lives!
Message-ID: <445573B0.6020408@cox.net>

After a fashion anyway. I implemented the simplest thing that could possibly work and I've left out some stuff that even I think we need (docstring, repr and str). Still it exists, ndarray inherits from it and some stuff seems to work automagically. 
>>> import numpy as n
>>> ba = n.basearray([3,3], int, n.arange(9))
>>> ba
<numpy.basearray object at 0x...>
>>> a = asarray(ba)
>>> a
array([[0, 1, 2],
       [3, 4, 5],
       [6, 7, 8]])
>>> a + ba
array([[ 0, 2, 4],
       [ 6, 8, 10],
       [12, 14, 16]])
>>> isinstance(a, n.basearray)
True
>>> type(ba)
<type 'numpy.basearray'>
>>> type(a)
<type 'numpy.ndarray'>
>>> len(dir(ba))
19
>>> len(dir(a))
156

Travis: should I go ahead and check this into the trunk? It shouldn't interfere with anything. The only change to ndarray is the tp_base, which sets up the inheritance.

-tim

From ndarray at mac.com Sun Apr 30 20:27:09 2006
From: ndarray at mac.com (Sasha)
Date: Sun Apr 30 20:27:09 2006
Subject: [Numpy-discussion] basearray lives!
In-Reply-To: <445573B0.6020408@cox.net>
References: <445573B0.6020408@cox.net>
Message-ID: 

Let me add my $.02.

I am very much in favor of a basic array object. I would probably go much further than Tim in simplifying it. No need for repr/str. No number protocol. No sequence/mapping protocol either. Maybe even no dimensions/striding etc. What is left? Not much on top of buffer protocol: the type description.

I've expressed this opinion several times before (and was criticised for not supporting it:-): I don't think a basearray should be a base class. The main reason is that in most cases subclasses will need to adapt all the array methods. In many cases (speaking from ma experience, but probably matrix folks can relate) the adaptation is not automatic and has to be done on a method by method basis. Exposure of the base class methods without adaptation or with wrong adaptation leads to errors. Unless the base array is truly minimalistic and stays this way, methods that are added to the base class in the future will likely not work unadapted.

The only implementation that uses inheritance that I would like would be something similar to python's object type: rich C API and no Python API.

Would you consider checking your implementation in without modifying ndarray's tp_base?

On 4/30/06, Tim Hochberg wrote:
>
> After a fashion anyway. I implemented the simplest thing that could
> possibly work and I've left out some stuff that even I think we need
> (docstring, repr and str). Still it exists, ndarray inherits from it and
> some stuff seems to work automagically.
>
> >>> import numpy as n
> >>> ba = n.basearray([3,3], int, n.arange(9))
> >>> ba
> <numpy.basearray object at 0x...>
> >>> a = asarray(ba)
> >>> a
> array([[0, 1, 2],
>        [3, 4, 5],
>        [6, 7, 8]])
> >>> a + ba
> array([[ 0, 2, 4],
>        [ 6, 8, 10],
>        [12, 14, 16]])
> >>> isinstance(a, n.basearray)
> True
> >>> type(ba)
> <type 'numpy.basearray'>
> >>> type(a)
> <type 'numpy.ndarray'>
> >>> len(dir(ba))
> 19
> >>> len(dir(a))
> 156
>
> Travis: should I go ahead and check this into the trunk? It shouldn't
> interfere with anything. The only change to ndarray is the tp_base,
> which sets up the inheritance.
>
> -tim
>
>
> -------------------------------------------------------
> Using Tomcat but need to do more? Need to support web services, security? 
From ndarray at mac.com Sun Apr 30 20:27:09 2006
From: ndarray at mac.com (Sasha)
Date: Sun Apr 30 20:27:09 2006
Subject: [Numpy-discussion] basearray lives!
In-Reply-To: <445573B0.6020408@cox.net>
References: <445573B0.6020408@cox.net>
Message-ID:

Let me add my $.02. I am very much in favor of a basic array object. I
would probably go much further than Tim in simplifying it. No need for
repr/str. No number protocol. No sequence/mapping protocol either. Maybe
even no dimensions/striding etc. What is left? Not much on top of the
buffer protocol: the type description.

I've expressed this opinion several times before (and was criticised for
not supporting it :-)): I don't think a basearray should be a base class.
The main reason is that in most cases subclasses will need to adapt all
the array methods. In many cases (speaking from ma experience, but
probably matrix folks can relate) the adaptation is not automatic and has
to be done on a method-by-method basis. Exposure of the base class
methods without adaptation, or with wrong adaptation, leads to errors.
Unless the base array is truly minimalistic and stays this way, methods
that are added to the base class in the future will likely not work
unadapted.

The only implementation using inheritance that I would like would be
something similar to Python's object type: rich C API and no Python API.
Would you consider checking your implementation in without modifying
ndarray's tp_base?

On 4/30/06, Tim Hochberg <tim.hochberg at cox.net> wrote:
>
> After a fashion anyway. I implemented the simplest thing that could
> possibly work and I've left out some stuff that even I think we need
> (docstring, repr and str). Still it exists, ndarray inherits from it and
> some stuff seems to work automagically.
>
> >>> import numpy as n
> >>> ba = n.basearray([3,3], int, n.arange(9))
> >>> ba
> <numpy.basearray object at 0x...>
> >>> a = asarray(ba)
> >>> a
> array([[0, 1, 2],
>        [3, 4, 5],
>        [6, 7, 8]])
> >>> a + ba
> array([[ 0,  2,  4],
>        [ 6,  8, 10],
>        [12, 14, 16]])
> >>> isinstance(a, n.basearray)
> True
> >>> type(ba)
> <type 'numpy.basearray'>
> >>> type(a)
> <type 'numpy.ndarray'>
> >>> len(dir(ba))
> 19
> >>> len(dir(a))
> 156
>
> Travis: should I go ahead and check this into the trunk? It shouldn't
> interfere with anything. The only change to ndarray is the tp_base,
> which sets up the inheritance.
>
> -tim

From oliphant.travis at ieee.org Sun Apr 30 21:45:05 2006
From: oliphant.travis at ieee.org (Travis Oliphant)
Date: Sun Apr 30 21:45:05 2006
Subject: [Numpy-discussion] Creating a descr with aligned=1 using the C API
In-Reply-To: <000601c66cb3$b762a940$0a84a8c0@dsp.sun.ac.za>
References: <000601c66cb3$b762a940$0a84a8c0@dsp.sun.ac.za>
Message-ID: <44559204.3020902@ieee.org>

Albert Strasheim wrote:
> Hello all
>
> I was wondering what the best way would be to create the following descr
> using the C API:
>

You can use the "new" method:

PyArray_Descr *dtype;
PyObject *dict;

/* args tuple is (dict, align); kwds may be NULL */
dtype = (PyArray_Descr *)PyArrayDescr_Type.tp_new(&PyArrayDescr_Type,
            Py_BuildValue("(Oi)", dict, 1), NULL);

where the dict is the one you give. Yes, this could be an easier-to-use
API.

> descr = dtype({'names' : ['index', 'value'], 'formats' : [intc, 'f8']},
> align=1)
>
> One could use PyArray_DescrConverter in multiarraymodule.c, but there
> doesn't seem to be a way to specify aligned=1 and one would have to build
> the dict object before being able to pass it on for conversion.
>
> Unless there's another easy way I'm missing, the API could possibly do with
> a function like PyArray_DescrFromCommaString(const char*, int align) which
> calls _convert_from_commastring. By the way, what is the general format of
> these commastrings?
>

It's in the NumPy book and it's also documented by numarray...

-Travis
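What the align flag actually changes in the resulting descr can be
inspected from Python. A sketch (checked against a recent NumPy; the
numbers assume a platform where intc is 4 bytes):

>>> import numpy
>>> spec = {'names': ['index', 'value'], 'formats': [numpy.intc, 'f8']}
>>> numpy.dtype(spec, align=0).itemsize      # packed: 4 + 8
12
>>> d = numpy.dtype(spec, align=1)
>>> d.itemsize                               # 'value' padded to an 8-byte boundary
16
>>> [d.fields[name][1] for name in d.names]  # field offsets
[0, 8]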
From oliphant.travis at ieee.org Sun Apr 30 21:49:02 2006
From: oliphant.travis at ieee.org (Travis Oliphant)
Date: Sun Apr 30 21:49:02 2006
Subject: [Numpy-discussion] basearray lives!
In-Reply-To: <445573B0.6020408@cox.net>
References: <445573B0.6020408@cox.net>
Message-ID: <445592EB.1000406@ieee.org>

Tim Hochberg wrote:
>
> After a fashion anyway. I implemented the simplest thing that could
> possibly work and I've left out some stuff that even I think we need
> (docstring, repr and str). Still it exists, ndarray inherits from it
> and some stuff seems to work automagically.
>
> >>> import numpy as n
> >>> ba = n.basearray([3,3], int, n.arange(9))
> >>> ba
> <numpy.basearray object at 0x...>
> >>> a = asarray(ba)
> >>> a
> array([[0, 1, 2],
>        [3, 4, 5],
>        [6, 7, 8]])
> >>> a + ba
> array([[ 0,  2,  4],
>        [ 6,  8, 10],
>        [12, 14, 16]])
> >>> isinstance(a, n.basearray)
> True
> >>> type(ba)
> <type 'numpy.basearray'>
> >>> type(a)
> <type 'numpy.ndarray'>
> >>> len(dir(ba))
> 19
> >>> len(dir(a))
> 156
>
> Travis: should I go ahead and check this into the trunk? It shouldn't
> interfere with anything. The only change to ndarray is the tp_base,
> which sets up the inheritance.
>

I say go ahead. We can then all deal with it there and improve upon it.
The ndarray used to inherit from another array and things worked.
Python's inheritance in C is actually quite slick, especially for
structural issues.

I agree that the basearray should have minimal operations (I would not
even define several of the protocols for it). I'd probably keep only the
buffer and mapping protocols, and even then probably only a simple
mapping protocol (i.e. no fancy-indexing) that then gets enhanced by the
ndarray.

Thanks for the work.

-Travis
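For readers wondering what a "simple" mapping protocol keeps and what
fancy indexing adds on top, a short session (behavior as in a present-day
numpy):

>>> import numpy as n
>>> a = n.arange(9).reshape(3, 3)
>>> v = a[1:3]       # simple (slice) indexing: a view onto the same data
>>> v[0, 0] = 99
>>> a[1, 0]          # the base array sees the change
99
>>> c = a[[0, 2]]    # fancy (integer-array) indexing: always a copy
>>> c[0, 0] = -1
>>> a[0, 0]          # the original is untouched
0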
Setting this is as simple as: >> >> def func(*args): >> numpy.seterr(under='warn') >> # do stuff with args >> return result >> >> Since seterr is local to the function, we don't have to reset the >> error handling at the end, which is convenient. And, this works fine >> if all we are doing is calling numpy functions and methods. However, >> if we are calling a function of our own devising we're out of luck >> since the called function will not inherit the error settings that we >> have set. > > Again, you have control over where you set the "secret" variable > (local, global (module), and builtin). I also don't see how that's > anymore clunky then a "secret" stack. In numarray, the stack is in the numarray module itself (actually in the Error object). They base their threading local behaviour off of thread.get_ident, not threading.local. That's not clunky at all, although it's arguably wrong since thread.get_ident can reuse ids from dead threads. In practice it's probably hard to get into trouble doing this, but I still wouldn't emulate it. I think that this was written before thread local storage, so it was probably the best that could be done. However, if you use threading.local, it will be clunky in a similar sense. You'll be storing data in a global namespace you don't control and you've got to hope that no one stomps on your variable name. When you have local and module level secret storage names as well you're just doing a lot more of that and the chance of collision and confusion goes up from almost zero to very small. > You may set the error in the builtin scope --- in fact it would > probably be trivial to implement a stack based on this and implement the > > pushMode > popMode > > interface of numarray. Yes. Modulo the thread local issue, I believe that this would indeed be easy. > > But, I think this question does deserve a bit of debate. I don't > think there has been a serious discussion over the method. To help > Tim and others understand what happens: > > When a ufunc is called, a specific variable name is searched for in > the following name-spaces in the following order: > > 1) local > 2) global > 3) builtin > > (There is a bit of an optimization in that when the error mode is the > default mode --- do nothing, a global flag is set which by-passes the > search for the name). > The first time the variable name is found, the error mode is read from > that variable. This error mode is placed as part of the ufunc loop > object. At the end of each 1-d loop the IEEE error mode flags are > checked (depending on the state of the error mode) and appropriate > action taken. > > By the way, it would not be too difficult to change how the error mode > is set (probably an hour's worth of work). So, concern over > implementation changes should not be a factor right now. > Currently the error mode is read from a variable using standard > scoping rules. It would save the (not insignificant) name-space > lookup time to instead use a global stack (i.e. a Python list) and > just get the error mode from the top of that stack. > >> Thus we have no way to influence the error settings of functions >> downstream from us. > > Of course, there is a way to do this by setting the variable in the > global or builtin scope as I've described above. > What's really the argument here, is whether having the flexibility at > the local and global name-spaces really worth the extra name-lookups > for each ufunc. 
> > I've argued that the numarray behavior can result from using the > builtin namespace for the error control. (perhaps with better > Python-side support for setting and retrieving it). What numpy has is > control at the global and local namespace level as well which can > override the builtin name-space behavior. > > So, we should at least frame the discussion in terms of what is > actually possible. Yes, sorry for spreading misinformation. >> >> I also would prefer more verbose keys ala numarray (underflow, >> overflow, dicidebyzero and invalid) than those currently used by >> numpy (under, over, divide and invalid). > > > In my mind, verbose keys are just extra baggage unless they are really > self documenting. You just need reminders and clues. It seems to be > a preference thing. I guess I hate typing long strings when only the > first few letters clue me in to what is being talked about. In this case, overflow, underflow and dividebyzero seem pretty self documenting to me. And 'invalid' is pretty cryptic in both implementations. This may be a matter of taste, but I tend to prefer short pithy names for functions that I use a lot, or that crammed a bunch to a line. In functions like this, that are more rarely used and get a full line to themselves I lean to towards the more verbose. >> And (will he never stop) I like numarrays defaults better here too: >> overflow='warn', underflow='ignore', dividebyzero='warn', >> invalid='warn'. Currently, numpy defaults to ignore for all cases. >> These last points are relatively minor though. > > This has optimization issues the way the code is written now. The > defaults are there to produce the fastest loops. Can you elaborate on this a bit? Reading between the lines, there seem to be two issues related to speed here. One is the actual namespace lookup of the error mode -- there's a setting that says we are using the defaults, so don't bother to look. This saves the namespace lookup. Changing the defaults shouldn't affect the timing of that. I'm not sure how this would interact with thread local storage though. The second issue is that running the core loop with no checks in place is faster. That means that to get maximum performance you want to be running both at the default setting and with no checks, which implies that the default setting needs to be no checking. Is that correct? I think there should be a way to finesse this issue, but I'll wait for the dust to settle a bit on the local, global, builtin issue before I propose anything. Particularly since by finesse I mean: do something moderately unsavory. > So, I'm hesitant to change them based only on ambiguous preferences. It's not entirely plucked out of the error. As I recall, the decision was arrived at something likes this: 1. Errors should never pass silently (unless explicitly silenced). 2. Let's have everything raise by default 3. In practice this was no good because you often wanted to look at the results and see where the problem was. 4. OK, let's have everything warn 5. This almost worked, but underflow was almost never a real error, so everyone always overrode underflow. A default that you always need to override is not a good default. 6. So, warn for everything except underflow. Ignore that. And that's where numarry is today. I and other have been using that error system happily for quite some time now. At least I haven't heard any complaints for quite a while. > Good feedback. Thanks again for taking the time to look at this and > offer review. You're very welcome. 
Thanks for all of the work you've been putting in to make the grand numerification happen. -tim From arnd.baecker at web.de Sat Apr 1 09:09:06 2006 From: arnd.baecker at web.de (Arnd Baecker) Date: Sat Apr 1 09:09:06 2006 Subject: [Numpy-discussion] extension to xrange for numpy Message-ID: Dear numpy enthusiasts, one python command which is extremely useful in 1D situations is `xrange`. However, for higher dimensional settings we strongly lack the commands `yrange` and `zrange`. These could be shorthands for the corresponding constructs with `:,NewAxis` added. Any comments, suggestion and even implementations are very welcome, Arnd P.S.: What I am not sure about is the right command for the 4-dimensional case - which letter should be used after the "z"? (it seems that "a" would be a very natural choice...) From faltet at carabos.com Sat Apr 1 11:01:05 2006 From: faltet at carabos.com (Francesc Altet) Date: Sat Apr 1 11:01:05 2006 Subject: [Numpy-discussion] ANN: PyTables 1.3 released Message-ID: <200604012100.38726.faltet@carabos.com> ========================= Announcing PyTables 1.3 ========================= This is a new major release of PyTables. The most remarkable feature added in this version is a complete support (well, almost, because unicode arrays are not there yet) for NumPy objects. Improved support for native HDF5 is there as well. As an aside, I'm happy to inform you that the PyTables web site (http://www.pytables.org) has been converted into a wiki so that users can contribute to the project with recipes or any other document. Try it out! Go to the (new) PyTables web site for downloading the beast: http://www.pytables.org/ or keep reading for more info about the new features and bugs fixed. Changes more in depth ===================== Improvements: - Support for NumPy objects in all the objects of PyTables, namely: Array, CArray, EArray, VLArray and Table. All the numerical and character (except unicode arrays) flavors are supported as well as plain and nested heterogeneous NumPy arrays. PyTables leverages the adoption of the array interface (http://numeric.scipy.org/array_interface.html) for a very efficient conversion between all the numarray (which continues to be the native flavor for PyTables) object to/from NumPy/Numeric. - The FLAVOR schema in PyTables has been refined and simplified. Now, the only 'flavors' allowed for data objects are: "numarray", "numpy", "numeric" and "python". The changes has been made so that they are fully backward compatible with existing PyTables files. However, when users would try to use old flavors (like "Numeric" or "Tuple") in existing code, a ``DeprecationWarning`` will be issued in order to encourage them to migrate to the new flavors as soon as possible. - Nested fields can be specified in the "field" parameter of Table.read by using a '/' as a separator between fields (e.g. 'Info/value'). - The Table.Cols accessor has received a new ``__setitem__()`` method that allows doing things like: table.cols[4] = record table.cols.x[4:1000:2] = array # homogeneous column table.cols.Info[4:1000:2] = recarray # nested column - A clean-up function (using ``atexit``) has been registered so that remaining opened files are closed when a user hits a ^C, for example. That would help to avoid ending with corrupted files. - Native HDF5 compound datasets that are contiguous are supported now. Before, only chunked datasets were supported. - Updated (and much improved) sections about compression issues in the User's Guide. 
It includes new benchmarks made with PyTables 1.3 and a exhaustive comparison between Zlib, LZO and bzip2. - The HTML version of manual is made now from the docbook2html package for an improved look (IMO). Bug fixes: - Solved a problem when trying to save CharArrays with itemsize = 0 as attributes of nodes. Now, these objects are pickled in order to prevent HDF5 from crashing. - Fixed some alignment issues with nested record arrays under certain architectures (e.g. PowerPC). - Fixed automatic conversions when a VLArray is read in a platform with a byte ordering different from the file. Deprecated features: - Due to recurrent problems with the UCL compression library, it has been declared deprecated from this version on. You can still compile PyTables with UCL support (using the --force-ucl), but you are urged to not use it anymore and convert any existing datafiles with UCL to other supported library (zlib, lzo or bzip2) with the ``ptrepack`` utility. Backward-incompatible changes: - Please, see ``RELEASE-NOTES.txt`` file. Important note for Windows users ================================ If you are willing to use PyTables with Python 2.4 in Windows platforms, you will need to get the HDF5 library compiled for MSVC 7.1, aka .NET 2003. It can be found at: ftp://ftp.ncsa.uiuc.edu/HDF/HDF5/current/bin/windows/5-165-win-net.ZIP Users of Python 2.3 on Windows will have to download the version of HDF5 compiled with MSVC 6.0 available in: ftp://ftp.ncsa.uiuc.edu/HDF/HDF5/current/bin/windows/5-165-win.ZIP What it is ========== **PyTables** is a package for managing hierarchical datasets and designed to efficiently cope with extremely large amounts of data (with support for full 64-bit file addressing). It features an object-oriented interface that, combined with C extensions for the performance-critical parts of the code, makes it a very easy-to-use tool for high performance data storage and retrieval. PyTables runs on top of the HDF5 library and numarray (but NumPy and Numeric are also supported) package for achieving maximum throughput and convenient use. Besides, PyTables I/O for table objects is buffered, implemented in C and carefully tuned so that you can reach much better performance with PyTables than with your own home-grown wrappings to the HDF5 library. PyTables sports indexing capabilities as well, allowing doing selections in tables exceeding one billion of rows in just seconds. Platforms ========= This version has been extensively checked on quite a few platforms, like Linux on Intel32 (Pentium), Win on Intel32 (Pentium), Linux on Intel64 (Itanium2), FreeBSD on AMD64 (Opteron), Linux on PowerPC (and PowerPC64) and MacOSX on PowerPC. For other platforms, chances are that the code can be easily compiled and run without further issues. Please, contact us in case you are experiencing problems. Resources ========= Go to the PyTables web site for more details: http://www.pytables.org About the HDF5 library: http://hdf.ncsa.uiuc.edu/HDF5/ About numarray: http://www.stsci.edu/resources/software_hardware/numarray To know more about the company behind the PyTables development, see: http://www.carabos.com/ Acknowledgments =============== Thanks to various the users who provided feature improvements, patches, bug reports, support and suggestions. See the ``THANKS`` file in the distribution package for a (incomplete) list of contributors. Many thanks also to SourceForge who have helped to make and distribute this package! 
And last but not least, a big thank you to THG (http://www.hdfgroup.org/) for sponsoring many of the new features recently introduced in PyTables. Share your experience ===================== Let us know of any bugs, suggestions, gripes, kudos, etc. you may have. ---- **Enjoy data!** -- The PyTables Team -- >0,0< Francesc Altet ? ? http://www.carabos.com/ V V C?rabos Coop. V. ??Enjoy Data "-" From oliphant.travis at ieee.org Sat Apr 1 12:20:01 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Sat Apr 1 12:20:01 2006 Subject: [Numpy-discussion] numpy error handling In-Reply-To: <442E94AD.1040200@cox.net> References: <442DE773.4060104@cox.net> <442E2F05.5080809@ieee.org> <442E94AD.1040200@cox.net> Message-ID: <442EE026.8060806@ieee.org> Tim Hochberg wrote: >> >> You can get the numarray approach back simply by setting the error in >> the builtin scope (instead of in the local scope which is done by >> default. > > I saw that you could set it at different levels, but missed the > implications. However, it's still missing one feature, thread local > storage. I would argue that the __builtin__ data should actually be > stored in threading.local() instead of __builtin__. Then you could > setup an equivalent stack system to numpy's. Yes, the per-thread storage escaped me. But, threading.local() only exists in Python 2.4 and NumPy is supposed to be compatible with Python 2.3 What about PyThreadState_GetDict() ? and then default to use the builtin dictionary if this returns NULL? I'm actually not particularly enthused about the three name-space lookups. Changing it to only 1 place to look may be better. It would require a setting and restoring operation. A stack could be used, but why not just use local variables (i.e. save = numpy.seterr(dividebyzero='warn') ... numpy.seterr(restore=save) > > I've used the numarray error handling stuff for some time. My > experience with it has led me to the following conclusions: > > 1. You don't use it that often. I have about 26 KLOC that's "active" > and in that I use pushMode just 15 times. For comparison, I use > asarray a tad over 100 times. > 2. pushMode and popMode, modulo spelling, is the way to set errors. > Once the with statement is around, that will be even better. > 3. I, personally, would be very unlikely to use the local and global > error handling, I'd just as soon see them go away, particularly if > it helps performance, but I won't lobby for it. > This is good feedback. I have almost zero experience with changing the error handling. So, I'm not sure what features are desireable. Eliminating unnecessary name-lookups is usually a good thing. > > In numarray, the stack is in the numarray module itself (actually in > the Error object). They base their threading local behaviour off of > thread.get_ident, not threading.local. That's not clunky at all, > although it's arguably wrong since thread.get_ident can reuse ids from > dead threads. In practice it's probably hard to get into trouble doing > this, but I still wouldn't emulate it. I think that this was written > before thread local storage, so it was probably the best that could be > done. Right, but thread local storage is still Python 2.4 only.... What about PyThreadState_GetDict() ? > > However, if you use threading.local, it will be clunky in a similar > sense. You'll be storing data in a global namespace you don't control > and you've got to hope that no one stomps on your variable name. 
The PyThreadState_GetDict() documenation states that extension module writers should use a unique name based on their extension module. > When you have local and module level secret storage names as well > you're just doing a lot more of that and the chance of collision and > confusion goes up from almost zero to very small. This is true. Similar to the C-variable naming issues. >> So, we should at least frame the discussion in terms of what is >> actually possible. > > Yes, sorry for spreading misinformation. But you did point out the very important thread-local storage fact that I had missed. This alone makes me willing to revamp what we are doing. > > In this case, overflow, underflow and dividebyzero seem pretty self > documenting to me. And 'invalid' is pretty cryptic in both > implementations. This may be a matter of taste, but I tend to prefer > short pithy names for functions that I use a lot, or that crammed a > bunch to a line. In functions like this, that are more rarely used and > get a full line to themselves I lean to towards the more verbose. The rarely-used factor is a persuasive argument. > Can you elaborate on this a bit? Reading between the lines, there seem > to be two issues related to speed here. One is the actual namespace > lookup of the error mode -- there's a setting that says we are using > the defaults, so don't bother to look. This saves the namespace > lookup. Changing the defaults shouldn't affect the timing of that. > I'm not sure how this would interact with thread local storage though. > > The second issue is that running the core loop with no checks in place > is faster. Basically, on the C-level, the error mode is an integer with specific bits allocated to the various error-possibilites (2-bits per possibility). If this is 0 then the error checking is not even done (thus no error handling at all). Yes the name-lookup optimization could work with any defaults (but with thread-specific storage couldn't work anyway). One question I have with threads and error handling though? Right now, the ufuncs release the Python lock during computation (and re-acquire it to do error handling if needed). If another ufunc was started by another Python thread and ran with different error handling, wouldn't the IEEE flags get confused about which ufunc was setting what? The flags are only checked after each 1-d loop. If another thread set the processor flag, the current thread could get very confused. This seems like a problem that I'm not sure how to handle. > > It's not entirely plucked out of the error. As I recall, the decision > was arrived at something likes this: > > 1. Errors should never pass silently (unless explicitly silenced). > 2. Let's have everything raise by default > 3. In practice this was no good because you often wanted to look at > the results and see where the problem was. > 4. OK, let's have everything warn > 5. This almost worked, but underflow was almost never a real error, > so everyone always overrode underflow. A default that you always > need to override is not a good default. > 6. So, warn for everything except underflow. Ignore that. > > And that's where numarry is today. I and other have been using that > error system happily for quite some time now. At least I haven't heard > any complaints for quite a while. I can appreciate this choice, but I don't agree that errors should never pass silently. The fact that people disagree about this is the reason for the error handling. 
Note that overflow is not detected everywhere for integers --- we have to simulate the floating-point errors for them. Only on integer multiply is it detected. Checking for it would slow down all other integer arithmetic --- one solution, of course is to have two different integer additions (one that checks for overflow and another that doesn't). There is really a bit of work left here to do. Best, -Travis From tim.hochberg at cox.net Sat Apr 1 14:01:04 2006 From: tim.hochberg at cox.net (Tim Hochberg) Date: Sat Apr 1 14:01:04 2006 Subject: [Numpy-discussion] numpy error handling In-Reply-To: <442EE026.8060806@ieee.org> References: <442DE773.4060104@cox.net> <442E2F05.5080809@ieee.org> <442E94AD.1040200@cox.net> <442EE026.8060806@ieee.org> Message-ID: <442EF7D9.9010404@cox.net> Travis Oliphant wrote: > Tim Hochberg wrote: > >>> >>> You can get the numarray approach back simply by setting the error >>> in the builtin scope (instead of in the local scope which is done by >>> default. >> >> >> I saw that you could set it at different levels, but missed the >> implications. However, it's still missing one feature, thread local >> storage. I would argue that the __builtin__ data should actually be >> stored in threading.local() instead of __builtin__. Then you could >> setup an equivalent stack system to numpy's. > > Yes, the per-thread storage escaped me. But, threading.local() only > exists in Python 2.4 and NumPy is supposed to be compatible with > Python 2.3 > > What about PyThreadState_GetDict() ? and then default to use the > builtin dictionary if this returns NULL? That sounds reasonable. I've never used that, but the name sounds promising! > I'm actually not particularly enthused about the three name-space > lookups. Changing it to only 1 place to look may be better. It > would require a setting and restoring operation. A stack could be > used, but why not just use local variables (i.e. > save = numpy.seterr(dividebyzero='warn') > > ... > > numpy.seterr(restore=save) That would work as well, I think. It gets a little hairy if you want to set error nestedly in a single function, but I've never done that, so I'm not too worried about it. Besides, what I really want to support is 'with', which I imagine we can support using the above as a base. >> I've used the numarray error handling stuff for some time. My >> experience with it has led me to the following conclusions: >> >> 1. You don't use it that often. I have about 26 KLOC that's "active" >> and in that I use pushMode just 15 times. For comparison, I use >> asarray a tad over 100 times. >> 2. pushMode and popMode, modulo spelling, is the way to set errors. >> Once the with statement is around, that will be even better. >> 3. I, personally, would be very unlikely to use the local and global >> error handling, I'd just as soon see them go away, particularly if >> it helps performance, but I won't lobby for it. >> > > This is good feedback. I have almost zero experience with changing > the error handling. So, I'm not sure what features are desireable. > Eliminating unnecessary name-lookups is usually a good thing. I hope some of the other numarray users chime in. A sample of one is not very good data! >> In numarray, the stack is in the numarray module itself (actually in >> the Error object). They base their threading local behaviour off of >> thread.get_ident, not threading.local. That's not clunky at all, >> although it's arguably wrong since thread.get_ident can reuse ids >> from dead threads. 
In practice it's probably hard to get into trouble >> doing this, but I still wouldn't emulate it. I think that this was >> written before thread local storage, so it was probably the best that >> could be done. > > > Right, but thread local storage is still Python 2.4 only.... > > What about PyThreadState_GetDict() ? That sounds reasonable. Essentially we would be rolling our own threading.local() >> >> However, if you use threading.local, it will be clunky in a similar >> sense. You'll be storing data in a global namespace you don't >> control and you've got to hope that no one stomps on your variable name. > > The PyThreadState_GetDict() documenation states that extension module > writers should use a unique name based on their extension module. > >> When you have local and module level secret storage names as well >> you're just doing a lot more of that and the chance of collision and >> confusion goes up from almost zero to very small. > > This is true. Similar to the C-variable naming issues. > >>> So, we should at least frame the discussion in terms of what is >>> actually possible. >> >> >> Yes, sorry for spreading misinformation. > > > But you did point out the very important thread-local storage fact > that I had missed. This alone makes me willing to revamp what we are > doing. > >> >> In this case, overflow, underflow and dividebyzero seem pretty self >> documenting to me. And 'invalid' is pretty cryptic in both >> implementations. This may be a matter of taste, but I tend to prefer >> short pithy names for functions that I use a lot, or that crammed a >> bunch to a line. In functions like this, that are more rarely used >> and get a full line to themselves I lean to towards the more verbose. > > > The rarely-used factor is a persuasive argument. > >> Can you elaborate on this a bit? Reading between the lines, there >> seem to be two issues related to speed here. One is the actual >> namespace lookup of the error mode -- there's a setting that says we >> are using the defaults, so don't bother to look. This saves the >> namespace lookup. Changing the defaults shouldn't affect the timing >> of that. I'm not sure how this would interact with thread local >> storage though. >> >> The second issue is that running the core loop with no checks in >> place is faster. > > Basically, on the C-level, the error mode is an integer with specific > bits allocated to the various error-possibilites (2-bits per > possibility). If this is 0 then the error checking is not even done > (thus no error handling at all). > Yes the name-lookup optimization could work with any defaults (but > with thread-specific storage couldn't work anyway). > > One question I have with threads and error handling though? Right > now, the ufuncs release the Python lock during computation (and > re-acquire it to do error handling if needed). If another ufunc was > started by another Python thread and ran with different error > handling, wouldn't the IEEE flags get confused about which ufunc was > setting what? The flags are only checked after each 1-d loop. If > another thread set the processor flag, the current thread could get > very confused. > > This seems like a problem that I'm not sure how to handle. Yeah, me either. It seems that somehow we'll need to block until all current operations are done, but I don't know how to do that off the top of my head. Perhaps ufuncs need to lock the flags when they start and release them when they finish. 
This looks feasible, but I'm not sure of the proper incantation to get this right. The ufuncs would all need to be able able to increment and decrement the lock, whatever it is, even though they are in different threads. Meanwhile the setting code should only be able to work when the lock is unheld. It's some sort of poly thread recursive lock thing. I'll think about it, perhaps there's an obvious way. >> >> It's not entirely plucked out of the error. As I recall, the decision >> was arrived at something likes this: >> >> 1. Errors should never pass silently (unless explicitly silenced). >> 2. Let's have everything raise by default >> 3. In practice this was no good because you often wanted to look at >> the results and see where the problem was. >> 4. OK, let's have everything warn >> 5. This almost worked, but underflow was almost never a real error, >> so everyone always overrode underflow. A default that you always >> need to override is not a good default. >> 6. So, warn for everything except underflow. Ignore that. >> >> And that's where numarry is today. I and other have been using that >> error system happily for quite some time now. At least I haven't >> heard any complaints for quite a while. > > > I can appreciate this choice, but I don't agree that errors should > never pass silently. You'll notice that we ended up with a slightly more nuanced choice. Besides, the full quote is import: "errors should not pass silently unless explicitly silenced". That's quite a bit different than a blanket error should never pass silently. > The fact that people disagree about this is the reason for the error > handling. Yes. While I like the above defaults, if we have a reasonable approach I can just set them at startup and forget about them. Let's try not to penalize me too much for that though. > Note that overflow is not detected everywhere for integers --- we have > to simulate the floating-point errors for them. Only on integer > multiply is it detected. Checking for it would slow down all other > integer arithmetic --- one solution, of course is to have two > different integer additions (one that checks for overflow and another > that doesn't). Or just document it and don't worry about it. If I'm doing integer arithmetic and I need overflow detection, I can generally cast to doubles and do my math there, casting back at the end as needed. This doesn't seem worth too much extra complication. Is my floating point bias showing? > There is really a bit of work left here to do. Yep. Looks like it, but nothing insurmountable. -tim From strawman at astraw.com Sat Apr 1 15:56:03 2006 From: strawman at astraw.com (Andrew Straw) Date: Sat Apr 1 15:56:03 2006 Subject: [Numpy-discussion] numpy error handling In-Reply-To: <442EF7D9.9010404@cox.net> References: <442DE773.4060104@cox.net> <442E2F05.5080809@ieee.org> <442E94AD.1040200@cox.net> <442EE026.8060806@ieee.org> <442EF7D9.9010404@cox.net> Message-ID: <442F130E.3060802@astraw.com> Tim Hochberg wrote: > Travis Oliphant wrote: > >> >> One question I have with threads and error handling though? Right >> now, the ufuncs release the Python lock during computation (and >> re-acquire it to do error handling if needed). If another ufunc was >> started by another Python thread and ran with different error >> handling, wouldn't the IEEE flags get confused about which ufunc was >> setting what? The flags are only checked after each 1-d loop. If >> another thread set the processor flag, the current thread could get >> very confused. 
>> >> This seems like a problem that I'm not sure how to handle. > > > Yeah, me either. It seems that somehow we'll need to block until all > current operations are done, but I don't know how to do that off the > top of my head. Perhaps ufuncs need to lock the flags when they start > and release them when they finish. This looks feasible, but I'm not > sure of the proper incantation to get this right. The ufuncs would all > need to be able able to increment and decrement the lock, whatever it > is, even though they are in different threads. Meanwhile the setting > code should only be able to work when the lock is unheld. It's some > sort of poly thread recursive lock thing. I'll think about it, perhaps > there's an obvious way. I am also absolutely no expert in this area, but isn't this exactly what the kernel supports multiple threads for? In other words, I'm not sure we have to worry about it at all. I expect that the kernel sets/restores the CPU/FPU error flags on thread switches and this is part of the cost associated with switching threads. As I understand it, linux threads are actually implemented as new processes, so if we did have to be worried about this, wouldn't we also have to be worried that program A might alter the FPU error state while we're also using program B? This is just my unsophisticated and possibly wrong understanding of these things. If anyone can help clarify the issue, I'd be glad to be enlightened. Cheers! Andrew From aisaac at american.edu Sat Apr 1 16:12:01 2006 From: aisaac at american.edu (Alan G Isaac) Date: Sat Apr 1 16:12:01 2006 Subject: [Numpy-discussion] extension to xrange for numpy In-Reply-To: References: Message-ID: On Sat, 1 Apr 2006, (CEST) Arnd Baecker apparently wrote: > one python command which is extremely useful in 1D > situations is `xrange`. Which will very soon be 'range'. Cheers, Alan Isaac From gruben at bigpond.net.au Sat Apr 1 18:46:07 2006 From: gruben at bigpond.net.au (Gary Ruben) Date: Sat Apr 1 18:46:07 2006 Subject: [Numpy-discussion] extension to xrange for numpy In-Reply-To: References: Message-ID: <442F41AE.1080806@bigpond.net.au> A few rough thoughts: I'm a bit ambivalent about this. It's not very n-dimensional and enforces an x,y,z,(t?) ordering of the array dimensions which some programmers may not want to adhere to. On the occasions I've had to write code which loops over multiple dimensions, I've found the python cookbook routines for permutation and combination generators really useful so I'd find some sort of numpy iterator equivalents of these more useful. This would allow list comprehensions like [f(x,y,z) for (x,y,z) in ndrange(10,10,10)] It would also be good to have it able to specify the rank of the object returned to allow whole array rows or matrices to be returned i.e. array slices. Maybe the ndrange function could allow something like [f(xy,z) for (xy,z) in ndrange((10,0,1),10)] where you use a tuple to specify a range and the axes to slice out. [f(x,yz) for (x,yz) in ndrange(10,(10,1,2))] [f(xz,y) for (xz,y) in ndrange((10,0,2),(10,1))] On the other hand your idea would potentially make some code a lot easier to understand, so I'm not against it and if it was picked up, I'd propose "t" or "w" for the 4th dimension. It might help to post some code that you think might benefit from your idea. Gary R. Arnd Baecker wrote: > Dear numpy enthusiasts, > > one python command which is extremely useful in 1D situations > is `xrange`. 
However, for higher dimensional > settings we strongly lack the commands `yrange` and `zrange`. > These could be shorthands for the corresponding > constructs with `:,NewAxis` added. > > Any comments, suggestion and even implementations are very welcome, > > Arnd > > P.S.: What I am not sure about is the right command for > the 4-dimensional case - which letter should be used after the "z"? > (it seems that "a" would be a very natural choice...) From rob at hooft.net Sat Apr 1 22:38:04 2006 From: rob at hooft.net (Rob Hooft) Date: Sat Apr 1 22:38:04 2006 Subject: [Numpy-discussion] numpy error handling In-Reply-To: <442EE026.8060806@ieee.org> References: <442DE773.4060104@cox.net> <442E2F05.5080809@ieee.org> <442E94AD.1040200@cox.net> <442EE026.8060806@ieee.org> Message-ID: <442F7114.40908@hooft.net> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Travis Oliphant wrote: | save = numpy.seterr(dividebyzero='warn') | | ... | | numpy.seterr(restore=save) Most of this discussion is outside of my scope, but I have programmed this kind of pattern in a different way before: ~ save = context.push(something) ~ ... ~ del save i.e. the destructor of the saved context object restores the old situation. In most cases it will be called by letting "save" go out of scope. I know that relying on timely object destruction can be troublesome when porting to Jython, but it is very convenient in CPython. If that goes too far, one could make a separate method on save: ~ save.pop() This can do sanity checking too (are we really at the top of the stack? Only called once?). The destructor should check whether pop has been called. Rob - -- Rob W.W. Hooft || rob at hooft.net || http://www.hooft.net/people/rob/ -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.2.2 (GNU/Linux) Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org iD8DBQFEL3EUH7J/Cv8rb3QRAuvsAJ9PO6ZITdVSm+hIwxkWDHHbTNFHdQCcDSWI Iv7gupkFc8+Fby/5MFwHQf4= =zE/o -----END PGP SIGNATURE----- From aisaac at american.edu Sun Apr 2 06:58:34 2006 From: aisaac at american.edu (Alan G Isaac) Date: Sun Apr 2 06:58:34 2006 Subject: [Numpy-discussion] extension to xrange for numpy In-Reply-To: <442F41AE.1080806@bigpond.net.au> References: <442F41AE.1080806@bigpond.net.au> Message-ID: On Sun, 02 Apr 2006, Gary Ruben apparently wrote: > I'd find some sort of numpy iterator equivalents of these more > useful. This would allow list comprehensions like > [f(x,y,z) for (x,y,z) in ndrange(10,10,10)] How is this better than using ogrid? E.g., >>> x=N.ogrid[:3,:2] >>> N.power(*x) array([[1, 0], [1, 1], [1, 2]]) Thanks, Alan From cjw at sympatico.ca Sun Apr 2 07:22:09 2006 From: cjw at sympatico.ca (Colin J. Williams) Date: Sun Apr 2 07:22:09 2006 Subject: [Numpy-discussion] first impressions with numpy In-Reply-To: <442DD638.60706@cox.net> References: <442D9124.5020905@msg.ucsf.edu> <442D9695.2050900@cox.net> <442DB655.2050203@cox.net> <442DB91F.9030103@msg.ucsf.edu> <442DD638.60706@cox.net> Message-ID: <442FDDD5.8050404@sympatico.ca> Tim Hochberg wrote: > Sebastian Haase wrote: > >> Thanks Tim, >> that's OK - I got the idea... >> BTW, is there a (policy) reason that you sent the first email just to >> me and not the mailing list !? > > > No. Just clumsy fingers. Probably the same reason the functions got > all garbled! > >> >> I would really be more interested in comments to my first point ;-) >> I think it's important that numpy will not be to cryptic and only for >> "hackers", but nice to look at ... 
(hope you get what I mean ;-) > > > Well, I think it's probably a good idea and it sounds like Travis like > the idea " for some of the builtin types". I suspect that's code for > "not types for which it doesn't make sense, like recarrays". > Tim, Could you elaborate on this please? Surely, it would be good for all functions and methods to have meaningful parameter lists and good doc strings. Colin W. From tim.hochberg at cox.net Sun Apr 2 08:11:17 2006 From: tim.hochberg at cox.net (Tim Hochberg) Date: Sun Apr 2 08:11:17 2006 Subject: [Numpy-discussion] first impressions with numpy In-Reply-To: <442FDDD5.8050404@sympatico.ca> References: <442D9124.5020905@msg.ucsf.edu> <442D9695.2050900@cox.net> <442DB655.2050203@cox.net> <442DB91F.9030103@msg.ucsf.edu> <442DD638.60706@cox.net> <442FDDD5.8050404@sympatico.ca> Message-ID: <442FE950.8090000@cox.net> Colin J. Williams wrote: > Tim Hochberg wrote: > >> Sebastian Haase wrote: >> >>> Thanks Tim, >>> that's OK - I got the idea... >>> BTW, is there a (policy) reason that you sent the first email just >>> to me and not the mailing list !? >> >> >> >> No. Just clumsy fingers. Probably the same reason the functions got >> all garbled! >> >>> >>> I would really be more interested in comments to my first point ;-) >>> I think it's important that numpy will not be to cryptic and only >>> for "hackers", but nice to look at ... (hope you get what I mean ;-) >> >> >> >> Well, I think it's probably a good idea and it sounds like Travis >> like the idea " for some of the builtin types". I suspect that's code >> for "not types for which it doesn't make sense, like recarrays". >> > Tim, > > Could you elaborate on this please? Surely, it would be good for all > functions and methods to have meaningful parameter lists and good doc > strings. This isn't really about parameter lists and docstrings, it's about __str__ and possibly __repr__. The basic issue is that the way dtypes are displayed is powerful, but unfriendly. If I create an array of integers: >>> a = arange(4) >>> print repr(a.dtype), str(a.dtype) dtype('i4') is not the same as dtype(int32) on my machine and should probably not be displayed using int32[1]. These cases should be rare in practice and it seems fine to fall back to the less friendly but more flexible notation. Recarrays were probably not such a good example. Here is an example from a recarray: dtype([('x', 'i4').name is 'int32' which seems wrong. From tim.hochberg at cox.net Sun Apr 2 08:41:24 2006 From: tim.hochberg at cox.net (Tim Hochberg) Date: Sun Apr 2 08:41:24 2006 Subject: [Numpy-discussion] numpy error handling In-Reply-To: <442F7114.40908@hooft.net> References: <442DE773.4060104@cox.net> <442E2F05.5080809@ieee.org> <442E94AD.1040200@cox.net> <442EE026.8060806@ieee.org> <442F7114.40908@hooft.net> Message-ID: <442FF03F.2000406@cox.net> Rob Hooft wrote: > -----BEGIN PGP SIGNED MESSAGE----- > Hash: SHA1 > > Travis Oliphant wrote: > | save = numpy.seterr(dividebyzero='warn') > | > | ... > | > | numpy.seterr(restore=save) > > Most of this discussion is outside of my scope, but I have programmed > this kind of pattern in a different way before: > > ~ save = context.push(something) > ~ ... > ~ del save > > i.e. the destructor of the saved context object restores the old > situation. In most cases it will be called by letting "save" go out of > scope. I know that relying on timely object destruction can be > troublesome when porting to Jython, but it is very convenient in CPython. 
> > If that goes too far, one could make a separate method on save: > > ~ save.pop() > > This can do sanity checking too (are we really at the top of the stack? > Only called once?). The destructor should check whether pop has been > called. Well, the syntax that *I* really want is this: class error_mode(object): def __init__(self, all=None, overflow=None, underflow=None, dividebyzero=None, invalid=None): self._args = (overflow, overflow, underflow, dividebyzero, invalid) def __enter__(self): self._save = numpy.seterr(*self._args) def __exit__(self): numpy.seterr(self._save) That way, in a few months, I can do this: with error_mode(overflow='raise'): # do stuff and it will be almost impossible to mess up. This syntax is lighter and cleaner than a stack or relying on garbage collection to free the resources. So, for my purposes, the simple syntax Travis proposes is perfectly adequate and simpler to implement and get right than a stack based approach. If 'with' wasn't coming down the pipe, I would push for a stack, but I like Travis' proposal just fine. YMMV of course. -tim From tim.hochberg at cox.net Sun Apr 2 08:52:09 2006 From: tim.hochberg at cox.net (Tim Hochberg) Date: Sun Apr 2 08:52:09 2006 Subject: [Numpy-discussion] observations Message-ID: <442FF2F8.3030906@cox.net> I've been doing a *lot* of playing with numpy over the last several days, so expect various observations to trickle from my abode over the next week or so. Here's the first installment. * tostring probably needs the order flag. I think you want the string generated from a multidimensional array in Fortran and C order to differ. * With the evolution of the order flag, ascontiguousarray is probably redundant, scarcely after it was added. b = asarray(a, order="C") Is actually clearer in intent than: b = ascontiguousarray(a) Does the latter leave a contiguous, Fortran order array alone? That's probably almost never what one wants. Unless your working with Fortran arrays, in which case the opposite ambiguity applies. Regards, -tim From tim.hochberg at cox.net Sun Apr 2 11:20:03 2006 From: tim.hochberg at cox.net (Tim Hochberg) Date: Sun Apr 2 11:20:03 2006 Subject: [Numpy-discussion] extension to xrange for numpy In-Reply-To: <442F41AE.1080806@bigpond.net.au> References: <442F41AE.1080806@bigpond.net.au> Message-ID: <44301590.4050707@cox.net> Gary Ruben wrote: > A few rough thoughts: > > I'm a bit ambivalent about this. It's not very n-dimensional and > enforces an x,y,z,(t?) ordering of the array dimensions which some > programmers may not want to adhere to. On the occasions I've had to > write code which loops over multiple dimensions, I've found the python > cookbook routines for permutation and combination generators really > useful > > > > > so I'd find some sort of numpy iterator equivalents of these more > useful. This would allow list comprehensions like > > [f(x,y,z) for (x,y,z) in ndrange(10,10,10)] > > It would also be good to have it able to specify the rank of the > object returned to allow whole array rows or matrices to be returned > i.e. array slices. Maybe the ndrange function could allow something like > > [f(xy,z) for (xy,z) in ndrange((10,0,1),10)] > where you use a tuple to specify a range and the axes to slice out. > [f(x,yz) for (x,yz) in ndrange(10,(10,1,2))] > [f(xz,y) for (xz,y) in ndrange((10,0,2),(10,1))] > > On the other hand your idea would potentially make some code a lot > easier to understand, so I'm not against it and if it was picked up, > I'd propose "t" or "w" for the 4th dimension. 
It might help to post > some code that you think might benefit from your idea. Bah, humbug! "Not every two-line Python function has to come pre-written" -- Tim Peters on C.L.P def xrange(*args, **kwargs): return arange(*args, **kwargs) def yrange(*args, **kwargs): return padshape(arange(*args, **kwargs), 2) def zrange(*args, **kwargs): return padshape(arange(*args, **kwargs), 3) def trange(*args, **kwargs): return padshape(arange(*args, **kwargs), 4) Of course, then you need padshape which I'd be happy to contribute. I'm of the opinion that we should be trying to improve the usefullness of a smallish set of core primitives, not adding endless new functions. Stuff like this, which is of interest in a relatively limited domain and is trivial to implement when needed, should either not be added at all, or added in a separate module. >>> len(dir(numpy)) 476 Does anyone know what all of that does? I certainly don't. And I doubt anyone uses more than a fraction of that interface. I wouldn't be the least bit suprised if there are old moldy parts of that are essentially used. And, unused code is buggy code in my experience. "Perfection is achieved, not when there is nothing more to add, but when there is nothing left to take away." -- Antoine de Saint-Exupery It's probably difficult at this point in numpy's life cycle to remove stuff or even reorganize things substantially. Besides, I'm sure all the developers have their hands full doing more important, or at least less contentious, things. Still, I think we should cast a more critical eye on new stuff before adding it. Regards, -tim > > Gary R. > > Arnd Baecker wrote: > >> Dear numpy enthusiasts, >> >> one python command which is extremely useful in 1D situations >> is `xrange`. However, for higher dimensional >> settings we strongly lack the commands `yrange` and `zrange`. >> These could be shorthands for the corresponding >> constructs with `:,NewAxis` added. >> >> Any comments, suggestion and even implementations are very welcome, >> >> Arnd >> >> P.S.: What I am not sure about is the right command for >> the 4-dimensional case - which letter should be used after the "z"? >> (it seems that "a" would be a very natural choice...) > > > > > ------------------------------------------------------- > This SF.Net email is sponsored by xPML, a groundbreaking scripting > language > that extends applications into web and mobile media. Attend the live > webcast > and join the prime developer group breaking into this new coding > territory! > http://sel.as-us.falkag.net/sel?cmd=lnk&kid=110944&bid=241720&dat=121642 > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/numpy-discussion > > From arnd.baecker at web.de Sun Apr 2 11:23:04 2006 From: arnd.baecker at web.de (Arnd Baecker) Date: Sun Apr 2 11:23:04 2006 Subject: [Numpy-discussion] extension to xrange for numpy In-Reply-To: <442F41AE.1080806@bigpond.net.au> References: <442F41AE.1080806@bigpond.net.au> Message-ID: Hi, On Sun, 2 Apr 2006, Gary Ruben wrote: > A few rough thoughts: [... useful stuff snipped ... ] > On the other hand your idea would potentially make some code a lot > easier to understand, so I'm not against it and if it was picked up, I'd > propose "t" or "w" for the 4th dimension. It might help to post some > code that you think might benefit from your idea. Hope you don't jump at me, but I would like to wait until April 1st next year then ... 
((hmm, maybe my post contained too much of a possible truth to be considered as an April fools joke - yrange and zrange have been a running gag in our group for a while now - strange German humor ...;-)) Anyway, I hope I did not waste too much of your time ... Best, Arnd > Gary R. > > Arnd Baecker wrote: > > Dear numpy enthusiasts, > > > > one python command which is extremely useful in 1D situations > > is `xrange`. However, for higher dimensional > > settings we strongly lack the commands `yrange` and `zrange`. > > These could be shorthands for the corresponding > > constructs with `:,NewAxis` added. > > > > Any comments, suggestion and even implementations are very welcome, > > > > Arnd > > > > P.S.: What I am not sure about is the right command for > > the 4-dimensional case - which letter should be used after the "z"? > > (it seems that "a" would be a very natural choice...) > > From tim.hochberg at cox.net Sun Apr 2 11:34:03 2006 From: tim.hochberg at cox.net (Tim Hochberg) Date: Sun Apr 2 11:34:03 2006 Subject: [Numpy-discussion] extension to xrange for numpy In-Reply-To: References: <442F41AE.1080806@bigpond.net.au> Message-ID: <44301908.2000607@cox.net> Arnd Baecker wrote: >Hi, > >On Sun, 2 Apr 2006, Gary Ruben wrote: > > > >>A few rough thoughts: >> >> > >[... useful stuff snipped ... ] > > > >>On the other hand your idea would potentially make some code a lot >>easier to understand, so I'm not against it and if it was picked up, I'd >>propose "t" or "w" for the 4th dimension. It might help to post some >>code that you think might benefit from your idea. >> >> > >Hope you don't jump at me, but I would like to >wait until April 1st next year then ... >((hmm, maybe my post contained too much of a possible truth >to be considered as an April fools joke - >yrange and zrange have been a running gag in our group for >a while now - strange German humor ...;-)) > >Anyway, I hope I did not waste too much of your time ... > > Ouch! Got me anyway... >Best, Arnd > > > > >>Gary R. >> >>Arnd Baecker wrote: >> >> >>>Dear numpy enthusiasts, >>> >>>one python command which is extremely useful in 1D situations >>>is `xrange`. However, for higher dimensional >>>settings we strongly lack the commands `yrange` and `zrange`. >>>These could be shorthands for the corresponding >>>constructs with `:,NewAxis` added. >>> >>>Any comments, suggestion and even implementations are very welcome, >>> >>>Arnd >>> >>>P.S.: What I am not sure about is the right command for >>>the 4-dimensional case - which letter should be used after the "z"? >>>(it seems that "a" would be a very natural choice...) >>> >>> >> >> > > >------------------------------------------------------- >This SF.Net email is sponsored by xPML, a groundbreaking scripting language >that extends applications into web and mobile media. Attend the live webcast >and join the prime developer group breaking into this new coding territory! 
From schofield at ftw.at  Sun Apr 2 13:05:02 2006
From: schofield at ftw.at (Ed Schofield)
Date: Sun Apr 2 13:05:02 2006
Subject: [Numpy-discussion] Deprecating old names
In-Reply-To: <44301590.4050707@cox.net>
References: <442F41AE.1080806@bigpond.net.au> <44301590.4050707@cox.net>
Message-ID: <44302EA9.9050302@ftw.at>

Tim Hochberg wrote, in a different thread:

> >>> len(dir(numpy))
> 476
>
> Does anyone know what all of that does? I certainly don't. And I doubt
> anyone uses more than a fraction of that interface. I wouldn't be the
> least bit surprised if there are old moldy parts of it that are
> essentially unused. And, unused code is buggy code in my experience.
>
> "Perfection is achieved, not when there is nothing more to add, but
> when there is nothing left to take away." -- Antoine de Saint-Exupery

I'd like to revise a proposal I made last week. Then I proposed that we
reduce namespace clutter by not importing the contents of the oldnumeric
namespace by default. But Travis didn't want to deprecate the functional
interfaces (sum(), take(), etc), so I now propose instead that we split up
the contents of oldnumeric.py into interfaces we want to keep around
indefinitely and interfaces we don't. The ones we want to keep could go
into another file, e.g. fromnumeric.py, whose contents are imported into
the numpy namespace by default. The deprecated ones could stay in
oldnumeric.py, and could be accessible through 'from oldnumeric import *'
at the top of source files, but not imported by default.

Strong candidates for deprecation are the capitalised type names, like
Int8, Complex64, UnsignedInt. I'd also argue for deprecating NewAxis,
UFuncType, ArrayType, arraytype, and anything else that duplicates
functionality available under NumPy under a different name.

Two of the Python design principles (from
http://www.python.org/dev/culture/) are:

- There should be one -- and preferably only one -- obvious way to do it.
- Namespaces are one honking great idea -- let's do more of those!

Let's clean up the cruft!

-- Ed

From gruben at bigpond.net.au  Sun Apr 2 16:06:10 2006
From: gruben at bigpond.net.au (Gary Ruben)
Date: Sun Apr 2 16:06:10 2006
Subject: [Numpy-discussion] extension to xrange for numpy
In-Reply-To:
References: <442F41AE.1080806@bigpond.net.au>
Message-ID: <443058AE.2070808@bigpond.net.au>

Doh! It's OK Arnd; I've recently seen you (or someone else with the same
name) acknowledged in a PhD I've been reading so I suspect you're a nice
guy :-)

And, thanks Alan. I knew about mgrid but not ogrid. One small way in which
that example might be better than using ogrid is that you could avoid
creating the index arrays and lazily generate the indices. However, ogrid
is better than mgrid in this respect.

thanks,
Gary

Alan G Isaac wrote:
> On Sun, 02 Apr 2006, Gary Ruben apparently wrote:
>> I'd find some sort of numpy iterator equivalents of these more
>> useful. This would allow list comprehensions like
>> [f(x,y,z) for (x,y,z) in ndrange(10,10,10)]
>
> How is this better than using ogrid? E.g.,
>
>>>> x=N.ogrid[:3,:2]
>>>> N.power(*x)
> array([[1, 0],
>        [1, 1],
>        [1, 2]])
>
> Thanks,
> Alan
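The ndrange Gary describes is not a numpy builtin (today's numpy ships numpy.ndindex, which provides essentially this iteration), but a minimal pure-Python generator along those lines might be:

    def ndrange(*shape):
        # Yield index tuples in C (row-major) order, exactly as nested
        # for-loops over range(shape[0]), range(shape[1]), ... would.
        if not shape:
            yield ()
        else:
            for head in range(shape[0]):
                for tail in ndrange(*shape[1:]):
                    yield (head,) + tail

This makes [f(x, y, z) for (x, y, z) in ndrange(10, 10, 10)] run without materializing any index arrays, at the cost of looping in Python rather than in C.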
From zpincus at stanford.edu  Sun Apr 2 16:07:07 2006
From: zpincus at stanford.edu (Zachary Pincus)
Date: Sun Apr 2 16:07:07 2006
Subject: [Numpy-discussion] Speed up function on cross product of two sets?
Message-ID:

Hi folks,

I have an inner loop that looks like this:

out = []
for elem1 in l1:
    for elem2 in l2:
        out.append(do_something(l1, l2))
result = do_something_else(out)

where do_something and do_something_else are implemented with only numpy
ufuncs, and l1 and l2 are numpy arrays.

As an example, I need to compute the median distance from any element in
one set to any element in another set.

What's the best way to speed this sort of thing up with numpy (e.g. push
as much down into the underlying C as possible)? I could re-write
do_something with the numexpr tools (which are very cool), but that
doesn't address the fact that I've still got nested loops living in
Python.

Perhaps there's some way in numpy to make one big honking array that
contains all the pairs from the two lists, and then just run my
do_something on that huge array, but that of course scales poorly.

Any thoughts?

Zach

From tim.hochberg at cox.net  Sun Apr 2 16:53:05 2006
From: tim.hochberg at cox.net (Tim Hochberg)
Date: Sun Apr 2 16:53:05 2006
Subject: [Numpy-discussion] Speed up function on cross product of two sets?
In-Reply-To:
References:
Message-ID: <443063C0.3050002@cox.net>

Zachary Pincus wrote:

> Hi folks,

Hi Zach,

> I have an inner loop that looks like this:
> out = []
> for elem1 in l1:
>     for elem2 in l2:
>         out.append(do_something(l1, l2))

this is do_something(elem1, elem2), correct?

> result = do_something_else(out)
>
> where do_something and do_something_else are implemented with only
> numpy ufuncs, and l1 and l2 are numpy arrays.
>
> As an example, I need to compute the median distance from any element
> in one set to any element in another set.
>
> What's the best way to speed this sort of thing up with numpy (e.g.
> push as much down into the underlying C as possible)? I could re-write
> do_something with the numexpr tools (which are very cool), but that
> doesn't address the fact that I've still got nested loops living
> in Python.

The exact approach I'd take would depend on the sizes of l1 and l2 and a
certain amount of trial and error. However, the first thing I'd try is:

n1 = len(l1)
n2 = len(l2)
out = numpy.zeros([n1*n2], appropriate_dtype)
for i, elem1 in enumerate(l1):
    out[i*n2:(i+1)*n2] = do_something(elem1, l2)
result = do_something_else(out)

That may work as is, or you may have to tweak do_something slightly to
handle l2 correctly. You might also try to do the operations in place and
stuff the results into out directly by using X= and three argument ufuncs.
I'd not do that at first though.

One thing to consider is that, in my experience, numpy works best on
chunks of about 10,000 elements. I believe that this is a function of
cache size. Anyway, this may affect the choice of which of l1 and l2 you
continue to loop over, and which you vectorize. If they both might get
really big, you could even consider chopping up l1 when you vectorize it.
Again I wouldn't do that unless it really looks like you need it.

If that all sounds opaque, feel free to ask more questions. Or if you have
questions about microoptimizing the guts of do_something, I have a bunch
of experience with that and I like a good puzzle.

> Perhaps there's some way in numpy to make one big honking array that
> contains all the pairs from the two lists, and then just run my
> do_something on that huge array, but that of course scales poorly.

I know of at least one way, but it's a bit of a kludge. I don't think I'd
try that though. As you said, it scales poorly. As long as you can
vectorize your inner loop, it's not necessary, and sometimes makes things
worse, to vectorize your outer loop as well. That's assuming your inner
loop is large; it doesn't help if your inner loop is 3 elements long for
instance, but that doesn't seem like it should be a problem here.

Regards,

-tim
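Applied to Zach's concrete example, the median distance between two sets, Tim's recipe might come out like the following sketch, assuming 1-d arrays and absolute difference as the distance; median_cross_distance is a made-up name, and numpy.median is assumed available:

    import numpy

    def median_cross_distance(l1, l2):
        # Vectorize the inner loop over l2; keep the cheap Python loop over l1.
        n1, n2 = len(l1), len(l2)
        out = numpy.empty(n1 * n2)
        for i, elem1 in enumerate(l1):
            out[i*n2:(i+1)*n2] = numpy.abs(elem1 - l2)
        return numpy.median(out)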
From haase at msg.ucsf.edu  Sun Apr 2 17:09:07 2006
From: haase at msg.ucsf.edu (Sebastian Haase)
Date: Sun Apr 2 17:09:07 2006
Subject: [Numpy-discussion] first impressions with numpy
In-Reply-To: <442FE950.8090000@cox.net>
References: <442D9124.5020905@msg.ucsf.edu> <442D9695.2050900@cox.net>
	<442DB655.2050203@cox.net> <442DB91F.9030103@msg.ucsf.edu>
	<442DD638.60706@cox.net> <442FDDD5.8050404@sympatico.ca>
	<442FE950.8090000@cox.net>
Message-ID: <44306774.5030507@msg.ucsf.edu>

Tim Hochberg wrote:

> This would work fine if repr were instead:
>
> dtype([('x', float64), ('z', complex128)])
>
> Anyway, this all seems reasonable to me at first glance. That said, I
> don't plan to work on this, I've got other fish to fry at the moment.

A new point: Please remind me (and probably others): when did it get
decided to introduce 'complex128' to mean numarray's complex64 and the
'complex64' to mean numarray's complex32 ?

I do understand the logic that 128 is really the bit-size of one (complex)
element - but I also liked the old way, because:

1. e.g. in fft transforms, float32 would "go with" complex32
   and float64 with complex64
2. complex128 is one character extra (longer) and also (alphabetically)
   now sorts before(!) complex64
3. Mostly of course: this new naming will confuse all my code and
   introduce hard to find bugs - when I see complex64 I will "think" the
   old way for quite some time ...

These might just be my personal (idiotic ;-) comments - but I would
appreciate some feedback/comments. Also: Is it now too late to (re-)start
a discussion on this !?

Thanks
- Sebastian Haase

From zpincus at stanford.edu  Sun Apr 2 17:17:06 2006
From: zpincus at stanford.edu (Zachary Pincus)
Date: Sun Apr 2 17:17:06 2006
Subject: [Numpy-discussion] Speed up function on cross product of two sets?
In-Reply-To: <443063C0.3050002@cox.net>
References: <443063C0.3050002@cox.net>
Message-ID: <8717BF2C-01EB-422A-B63A-7807E607DEE9@stanford.edu>

Tim -

Thanks for your suggestions -- that all makes good sense.

It sounds like the general take home message is, as always: "the first
thing to try is to vectorize your inner loop."

Zach

>> I have an inner loop that looks like this:
>> out = []
>> for elem1 in l1:
>>     for elem2 in l2:
>>         out.append(do_something(l1, l2))
>
> this is do_something(elem1, elem2), correct?
>
>> result = do_something_else(out)
>>
>> where do_something and do_something_else are implemented with
>> only numpy ufuncs, and l1 and l2 are numpy arrays.
>>
>> As an example, I need to compute the median distance from any
>> element in one set to any element in another set.
>>
>> What's the best way to speed this sort of thing up with numpy
>> (e.g. push as much down into the underlying C as possible)? I
>> could re-write do_something with the numexpr tools (which are
>> very cool), but that doesn't address the fact that I've still got
>> nested loops living in Python.
>
> The exact approach I'd take would depend on the sizes of l1 and l2
> and a certain amount of trial and error. However, the first thing
> I'd try is:
>
> n1 = len(l1)
> n2 = len(l2)
> out = numpy.zeros([n1*n2], appropriate_dtype)
> for i, elem1 in enumerate(l1):
>     out[i*n2:(i+1)*n2] = do_something(elem1, l2)
> result = do_something_else(out)
>
> That may work as is, or you may have to tweak do_something slightly
> to handle l2 correctly. You might also try to do the operations in
> place and stuff the results into out directly by using X= and three
> argument ufuncs. I'd not do that at first though.
>
> One thing to consider is that, in my experience, numpy works best
> on chunks of about 10,000 elements. I believe that this is a
> function of cache size. Anyway, this may affect the choice of which
> of l1 and l2 you continue to loop over, and which you vectorize. If
> they both might get really big, you could even consider chopping up
> l1 when you vectorize it. Again I wouldn't do that unless it really
> looks like you need it.
>
> If that all sounds opaque, feel free to ask more questions. Or if
> you have questions about microoptimizing the guts of do_something,
> I have a bunch of experience with that and I like a good puzzle.
>
>> Perhaps there's some way in numpy to make one big honking array
>> that contains all the pairs from the two lists, and then just run
>> my do_something on that huge array, but that of course scales
>> poorly.
>
> I know of at least one way, but it's a bit of a kludge. I don't
> think I'd try that though. As you said, it scales poorly. As long
> as you can vectorize your inner loop, it's not necessary, and
> sometimes makes things worse, to vectorize your outer loop as well.
> That's assuming your inner loop is large; it doesn't help if your
> inner loop is 3 elements long for instance, but that doesn't seem
> like it should be a problem here.
>
> Regards,
>
> -tim

From haase at msg.ucsf.edu  Sun Apr 2 17:21:14 2006
From: haase at msg.ucsf.edu (Sebastian Haase)
Date: Sun Apr 2 17:21:14 2006
Subject: [Fwd: Re: [Numpy-discussion] first impressions with numpy]
Message-ID: <44306A2C.4040606@msg.ucsf.edu>

supposedly meant for the whole list ...

From: Tim Hochberg

Sebastian Haase wrote:
> Tim Hochberg wrote:
>
>> This would work fine if repr were instead:
>>
>> dtype([('x', float64), ('z', complex128)])
>>
>> Anyway, this all seems reasonable to me at first glance. That said, I
>> don't plan to work on this, I've got other fish to fry at the moment.
>
> A new point: Please remind me (and probably others): when did it get
> decided to introduce 'complex128' to mean numarray's complex64
> and the 'complex64' to mean numarray's complex32 ?

I haven't the faintest idea -- it happened when I was off in Numarray land
I assume. Or it was always that way? No idea. Hopefully Travis will answer
this.

-tim

> I do understand the logic that 128 is really the bit-size of one
> (complex) element - but I also liked the old way, because:
> 1. e.g. in fft transforms, float32 would "go with" complex32
>    and float64 with complex64
> 2. complex128 is one character extra (longer) and also
>    (alphabetically) now sorts before(!) complex64
> 3. Mostly of course: this new naming will confuse all my code and
>    introduce hard to find bugs - when I see complex64 I will "think"
>    the old way for quite some time ...
>
> These might just be my personal (idiotic ;-) comments - but I would
> appreciate some feedback/comments.
> Also: Is it now too late to (re-)start a discussion on this !?
>
> Thanks
> - Sebastian Haase

From tim.hochberg at cox.net  Sun Apr 2 17:53:01 2006
From: tim.hochberg at cox.net (Tim Hochberg)
Date: Sun Apr 2 17:53:01 2006
Subject: [Numpy-discussion] Speed up function on cross product of two sets?
In-Reply-To: <8717BF2C-01EB-422A-B63A-7807E607DEE9@stanford.edu>
References: <443063C0.3050002@cox.net>
	<8717BF2C-01EB-422A-B63A-7807E607DEE9@stanford.edu>
Message-ID: <443071BA.4090606@cox.net>

Zachary Pincus wrote:
> Tim -
>
> Thanks for your suggestions -- that all makes good sense.
>
> It sounds like the general take home message is, as always: "the
> first thing to try is to vectorize your inner loop."

Exactly and far more pithy than my meanderings. If I were going to make a
list it would look something like:

0. Think about your algorithm.
1. Vectorize your inner loop.
2. Eliminate temporaries.
3. Ask for help.
4. Recode in C.
5. Accept that your code will never be fast.

Step zero should probably be repeated after every other step ;)

-tim

> Zach
>
>>> I have an inner loop that looks like this:
>>> out = []
>>> for elem1 in l1:
>>>     for elem2 in l2:
>>>         out.append(do_something(l1, l2))
>>
>> this is do_something(elem1, elem2), correct?
>>
>>> result = do_something_else(out)
>>>
>>> where do_something and do_something_else are implemented with only
>>> numpy ufuncs, and l1 and l2 are numpy arrays.
>>>
>>> As an example, I need to compute the median distance from any
>>> element in one set to any element in another set.
>>>
>>> What's the best way to speed this sort of thing up with numpy
>>> (e.g. push as much down into the underlying C as possible)? I
>>> could re-write do_something with the numexpr tools (which are very
>>> cool), but that doesn't address the fact that I've still got
>>> nested loops living in Python.
>> >> >> The exact approach I'd take would depend on the sizes of l1 and l2 >> and a certain amount of trial and error. However, the first thing >> I'd try is: >> >> n1 = len(l1) >> n2 = len(l2) >> out = numpy.zeros([n1*n2], appropriate_dtype) >> for i, elem1 in enumerate(l1): >> out[i*n2:(i+1)*n2] = do_something(elem1, l1) >> result = do_something_else(out) >> >> That may work as is, or you may have to tweak do_something slightly >> to handle l1 correctly. You might also try to do the operations in >> place and stuff the results into out directly by using X= and three >> argument ufuncs. I'd not do that at first though. >> >> One thing to consider is that, in my experience, numpy works best on >> chunks of about 10,000 elements. I believe that this is a function >> of cache size. Anyway, this may choice of which of l1 and l2 you >> continue to loop over, and which you vectorize. If they both might >> get really big, you could even consider chopping up l1 when you >> vectorize it. Again I wouldn't do that unless it really looks like >> you need it. >> >> If that all sounds opaque, feel free to ask more questions. Or if >> you have questions about microoptimizing the guts of do_something, I >> have a bunch of experience with that and I like a good puzzle. >> >>> >>> Perhaps there's some way in numpy to make one big honking array >>> that contains all the pairs from the two lists, and then just run >>> my do_something on that huge array, but that of course scales poorly. >> >> >> I know of at least one way, but it's a bit of a kludge. I don't >> think I'd try that though. As you said, it scales poorly. As long >> as you can vectorize your inner loop, it's not necessary and >> sometimes makes things worse, to vectorize your outer loop as well. >> That's assuming your inner loop is large, it doesn't help if your >> inner loop is 3 elements long for instance, but that doesn't seem >> like it should be a problem here. >> >> Regards, >> >> -tim >> > > > From oliphant.travis at ieee.org Sun Apr 2 21:14:01 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Sun Apr 2 21:14:01 2006 Subject: [Numpy-discussion] Deprecating old names In-Reply-To: <44302EA9.9050302@ftw.at> References: <442F41AE.1080806@bigpond.net.au> <44301590.4050707@cox.net> <44302EA9.9050302@ftw.at> Message-ID: <4430A0BF.1080207@ieee.org> Ed Schofield wrote: > Tim Hochberg wrote, in a different thread: > >> >>> len(dir(numpy)) >> 476 >> >> Does anyone know what all of that does? I certainly don't. And I doubt >> anyone uses more than a fraction of that interface. I wouldn't be the >> least bit suprised if there are old moldy parts of that are >> essentially used. And, unused code is buggy code in my experience. >> >> "Perfection is achieved, not when there is nothing more to add, but >> when there is nothing left to take away." -- Antoine de Saint-Exupery >> > > I'd like to revise a proposal I made last week. Then I proposed that we > reduce namespace clutter by not importing the contents of the oldnumeric > namespace by default. But Travis didn't want to deprecate the > functional interfaces (sum(), take(), etc), so I now propose instead > that we split up the contents of oldnumeric.py into interfaces we want > to keep around indefinitely and interfaces we don't. Good idea... -Travis From rob at hooft.net Sun Apr 2 22:46:09 2006 From: rob at hooft.net (Rob W.W. 
Hooft)
Date: Sun Apr 2 22:46:09 2006
Subject: [Fwd: Re: [Numpy-discussion] first impressions with numpy]
In-Reply-To: <44306A2C.4040606@msg.ucsf.edu>
References: <44306A2C.4040606@msg.ucsf.edu>
Message-ID: <4430B5D6.7020907@hooft.net>

Sebastian Haase wrote:
>> A new point: Please remind me (and probably others): when did it get
>> decided to introduce 'complex128' to mean numarray's complex64
>> and the 'complex64' to mean numarray's complex32 ?
>
> I haven't the faintest idea -- it happened when I was off in Numarray
> land I assume. Or it was always that way? No idea. Hopefully Travis will
> answer this.

Fortran heritage? REAL*8 is paired with COMPLEX*16 there....

Regards,

Rob Hooft

From arnd.baecker at web.de  Mon Apr 3 02:18:08 2006
From: arnd.baecker at web.de (Arnd Baecker)
Date: Mon Apr 3 02:18:08 2006
Subject: [Numpy-discussion] Speed up function on cross product of two sets?
In-Reply-To:
References:
Message-ID:

Hi,

On Sun, 2 Apr 2006, Zachary Pincus wrote:

> Hi folks,
>
> I have an inner loop that looks like this:
> out = []
> for elem1 in l1:
>     for elem2 in l2:
>         out.append(do_something(l1, l2))
> result = do_something_else(out)
>
> where do_something and do_something_else are implemented with only
> numpy ufuncs, and l1 and l2 are numpy arrays.
>
> As an example, I need to compute the median distance from any element
> in one set to any element in another set.
>
> What's the best way to speed this sort of thing up with numpy (e.g.
> push as much down into the underlying C as possible)? I could re-write
> do_something with the numexpr tools (which are very cool), but that
> doesn't address the fact that I've still got nested loops living
> in Python.

If do_something eats arrays, you could try:

result = do_something(l1[:,NewAxis], l2)

E.g.:

from numpy import *

l1 = linspace(0.0, pi, 10)
l2 = linspace(0.0, pi, 3)

def f(y, x):
    return sin(y)*cos(x)

print f(l1[:,NewAxis], l2)

((Note that I just learned in some other thread that with numpy there is
an alternative to NewAxis, but I haven't figured out which that is ...))

Best, Arnd

From zpincus at stanford.edu  Mon Apr 3 08:50:10 2006
From: zpincus at stanford.edu (Zachary Pincus)
Date: Mon Apr 3 08:50:10 2006
Subject: [Numpy-discussion] Speed up function on cross product of two sets?
In-Reply-To: <443071BA.4090606@cox.net>
References: <443063C0.3050002@cox.net>
	<8717BF2C-01EB-422A-B63A-7807E607DEE9@stanford.edu>
	<443071BA.4090606@cox.net>
Message-ID:

> If I were going to make a list it would look something like:
>
> 0. Think about your algorithm.
> 1. Vectorize your inner loop.
> 2. Eliminate temporaries.
> 3. Ask for help.
> 4. Recode in C.
> 5. Accept that your code will never be fast.
>
> Step zero should probably be repeated after every other step ;)

Thanks for this list -- it's a good one.

Since we're discussing this, could I ask about the best way to eliminate
temporaries? If you're using ufuncs, is there some way to make them work
in-place? Or is the lowest-hanging fruit (temporary-wise) typically
elsewhere?

Zach

From tim.hochberg at cox.net  Mon Apr 3 10:10:40 2006
From: tim.hochberg at cox.net (Tim Hochberg)
Date: Mon Apr 3 10:10:40 2006
Subject: [Numpy-discussion] Speed up function on cross product of two sets?
In-Reply-To:
References:
Message-ID: <44315633.4010600@cox.net>

Arnd Baecker wrote:

[SNIP]

>((Note that I just learned in some other thread that with numpy there is
>an alternative to NewAxis, but I haven't figured out which that is ...))

If you're old school you could just use None. But you probably mean
'newaxis'.
-tim

From robert.kern at gmail.com  Mon Apr 3 10:19:02 2006
From: robert.kern at gmail.com (Robert Kern)
Date: Mon Apr 3 10:19:02 2006
Subject: [Numpy-discussion] Re: Speed up function on cross product of two sets?
In-Reply-To:
References: <443063C0.3050002@cox.net>
	<8717BF2C-01EB-422A-B63A-7807E607DEE9@stanford.edu>
	<443071BA.4090606@cox.net>
Message-ID:

Zachary Pincus wrote:
>> If I were going to make a list it would look something like:
>>
>> 0. Think about your algorithm.
>> 1. Vectorize your inner loop.
>> 2. Eliminate temporaries.
>> 3. Ask for help.
>> 4. Recode in C.
>> 5. Accept that your code will never be fast.
>>
>> Step zero should probably be repeated after every other step ;)
>
> Thanks for this list -- it's a good one.
>
> Since we're discussing this, could I ask about the best way to
> eliminate temporaries? If you're using ufuncs, is there some way to
> make them work in-place? Or is the lowest-hanging fruit (temporary-
> wise) typically elsewhere?

Many binary ufuncs take an optional third argument, which is an array in
which the ufunc should put the result.

In [2]: x = arange(10)

In [3]: y = arange(10)

In [4]: id(x)
Out[4]: 91297984

In [5]: add(x, y, x)
Out[5]: array([ 0,  2,  4,  6,  8, 10, 12, 14, 16, 18])

In [6]: id(Out[5])
Out[6]: 91297984

In [7]: x
Out[7]: array([ 0,  2,  4,  6,  8, 10, 12, 14, 16, 18])

-- Robert Kern
robert.kern at gmail.com

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth." -- Umberto Eco

From tim.hochberg at cox.net  Mon Apr 3 10:36:05 2006
From: tim.hochberg at cox.net (Tim Hochberg)
Date: Mon Apr 3 10:36:05 2006
Subject: [Numpy-discussion] Speed up function on cross product of two sets?
In-Reply-To:
References: <443063C0.3050002@cox.net>
	<8717BF2C-01EB-422A-B63A-7807E607DEE9@stanford.edu>
	<443071BA.4090606@cox.net>
Message-ID: <44315CD6.3010001@cox.net>

Zachary Pincus wrote:
>> If I were going to make a list it would look something like:
>>
>> 0. Think about your algorithm.
>> 1. Vectorize your inner loop.
>> 2. Eliminate temporaries.
>> 3. Ask for help.
>> 4. Recode in C.
>> 5. Accept that your code will never be fast.
>>
>> Step zero should probably be repeated after every other step ;)
>
> Thanks for this list -- it's a good one.
>
> Since we're discussing this, could I ask about the best way to
> eliminate temporaries? If you're using ufuncs, is there some way to
> make them work in-place? Or is the lowest-hanging fruit (temporary-
> wise) typically elsewhere?

The least cryptic is to use *=, +=, where you can. But that only gets you
so far. As you guessed, there is a secret extra argument to ufuncs that
allows you to do results in place. One could replace scratch=a*(b+sqrt(a))
with:

>>> scratch = zeros([5], dtype=float)
>>> a = arange(5, dtype=float)
>>> b = arange(5, dtype=float)
>>> sqrt(a, scratch)
array([ 0.        ,  1.        ,  1.41421356,  1.73205081,  2.        ])
>>> add(scratch, b, scratch)
array([ 0.        ,  2.        ,  3.41421356,  4.73205081,  6.        ])
>>> multiply(a, scratch)
array([  0.        ,   2.        ,   6.82842712,  14.19615242,  24.        ])

The downside of this is that your code goes from comprehensible to
insanely cryptic pretty fast. I only resort to this in extreme
circumstances.

You could also use numexpr, which should be faster and is much less
cryptic, but may not be completely stable yet.

Oh, and don't forget step 0, that's sometimes a good way to reduce
temporaries.

regards,

-tim
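For comparison, the numexpr route Tim mentions might look like the following sketch, assuming the numexpr package is installed; evaluate() compiles the whole expression and runs it in one pass over the operands, so sqrt(a) and b+sqrt(a) never appear as full-size Python-level temporaries:

    import numpy
    import numexpr

    a = numpy.arange(5, dtype=float)
    b = numpy.arange(5, dtype=float)
    # Same result as a*(b+sqrt(a)), without the intermediate arrays.
    result = numexpr.evaluate("a * (b + sqrt(a))")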
From verveer at embl-heidelberg.de  Mon Apr 3 12:00:04 2006
From: verveer at embl-heidelberg.de (Peter Verveer)
Date: Mon Apr 3 12:00:04 2006
Subject: [Numpy-discussion] Re: Speed up function on cross product of two sets?
In-Reply-To:
References: <443063C0.3050002@cox.net>
	<8717BF2C-01EB-422A-B63A-7807E607DEE9@stanford.edu>
	<443071BA.4090606@cox.net>
Message-ID: <012B117C-4046-4058-B7F9-AC5EDB68A532@embl-heidelberg.de>

On 3 Apr 2006, at 19:17, Robert Kern wrote:

> Zachary Pincus wrote:
>>> If I were going to make a list it would look something like:
>>>
>>> 0. Think about your algorithm.
>>> 1. Vectorize your inner loop.
>>> 2. Eliminate temporaries.
>>> 3. Ask for help.
>>> 4. Recode in C.
>>> 5. Accept that your code will never be fast.
>>>
>>> Step zero should probably be repeated after every other step ;)
>>
>> Thanks for this list -- it's a good one.
>>
>> Since we're discussing this, could I ask about the best way to
>> eliminate temporaries? If you're using ufuncs, is there some way to
>> make them work in-place? Or is the lowest-hanging fruit (temporary-
>> wise) typically elsewhere?
>
> Many binary ufuncs take an optional third argument, which is an
> array in which the ufunc should put the result.

I have wished many times that all functions would support an optional
output argument. It is not only important for speed optimization, but also
if you work with large data sets. I guess the use of a return value is
much more natural, but when the point comes that you want to optimize your
algorithm, the ability to use an output argument instead is very valuable.

It would be nice if all functions by default would support a standard
keyword argument 'output', just like ufuncs do. I suppose these could in
principle be added while still maintaining backwards compatibility.

Cheers, Peter

From oliphant at ee.byu.edu  Mon Apr 3 15:59:06 2006
From: oliphant at ee.byu.edu (Travis Oliphant)
Date: Mon Apr 3 15:59:06 2006
Subject: [Numpy-discussion] first impressions with numpy
In-Reply-To: <44306594.50305@msg.ucsf.edu>
References: <442D9124.5020905@msg.ucsf.edu> <442D9695.2050900@cox.net>
	<442DB655.2050203@cox.net> <442DB91F.9030103@msg.ucsf.edu>
	<442DD638.60706@cox.net> <442FDDD5.8050404@sympatico.ca>
	<442FE950.8090000@cox.net> <44306594.50305@msg.ucsf.edu>
Message-ID: <4431A8A0.9010604@ee.byu.edu>

Sebastian Haase wrote:
> Tim Hochberg wrote:
>
>> This would work fine if repr were instead:
>>
>> dtype([('x', float64), ('z', complex128)])
>>
>> Anyway, this all seems reasonable to me at first glance. That said, I
>> don't plan to work on this, I've got other fish to fry at the moment.
>
> A new point: Please remind me (and probably others): when did it get
> decided to introduce 'complex128' to mean numarray's complex64
> and the 'complex64' to mean numarray's complex32 ?

It was last February (i.e. 2005) when I first started posting regarding
the new NumPy. I claimed it was more consistent to use actual bit-widths.
A few people, including Perry, indicated they weren't opposed to the
change and so I went ahead with it.

You can read relevant posts by searching on
numpy-discussion at lists.sourceforge.net

Discussions are always welcome. I suppose it's not too late to change
something like this --- but it's getting there...
-Travis

From ryanlists at gmail.com  Mon Apr 3 17:50:03 2006
From: ryanlists at gmail.com (Ryan Krauss)
Date: Mon Apr 3 17:50:03 2006
Subject: [Numpy-discussion] string matrices
Message-ID:

I am trying to use NumPy to generate some matrix inputs to Maxima for
symbolic analysis. I am using a fair number of matrix.astype('S%d'%maxlen)
statements. This seems to work very well. It also doesn't seem to pad the
elements in any way if maxlen is bigger than I need, which is great. This
may seem like a dumb computer science question, but what is the
memory/performance cost of making maxlen bigger than I want (but making
sure that it is way bigger than I need so that the elements don't get
truncated)? If my biggest matrices will be 13x13, how long can the strings
be before I consume more than a few megs (or a few dozen megs) of memory?

Thanks,

Ryan
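The memory side of Ryan's question is easy to bound, since a string array stores exactly itemsize bytes per element. A quick check, as a sketch (nbytes is assumed here; size * itemsize gives the same number):

    >>> import numpy
    >>> m = numpy.zeros((13, 13), dtype='S1000')    # 13x13, 1000-byte strings
    >>> m.nbytes                                    # 13 * 13 * 1000
    169000

So even a generous maxlen of 10,000 costs only about 1.7 MB for a 13x13 matrix; reaching "a few dozen megs" would take strings well over a hundred kilobytes each.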
From haase at msg.ucsf.edu  Mon Apr 3 22:06:05 2006
From: haase at msg.ucsf.edu (Sebastian Haase)
Date: Mon Apr 3 22:06:05 2006
Subject: [Numpy-discussion] Vote: complex64 vs complex128 (was: first impressions with numpy)
In-Reply-To: <4431A8A0.9010604@ee.byu.edu>
References: <442D9124.5020905@msg.ucsf.edu> <442D9695.2050900@cox.net>
	<442DB655.2050203@cox.net> <442DB91F.9030103@msg.ucsf.edu>
	<442DD638.60706@cox.net> <442FDDD5.8050404@sympatico.ca>
	<442FE950.8090000@cox.net> <44306594.50305@msg.ucsf.edu>
	<4431A8A0.9010604@ee.byu.edu>
Message-ID: <4431FE90.6060301@msg.ucsf.edu>

Hi,

Could we start another poll on this !?

I think I would vote +1 for complex32 & complex64 mostly just because of
"that's what I'm used to"

But I'm curious to hear what others "know to be in use" - e.g. Matlab or
IDL !

- Thanks
Sebastian Haase

Travis Oliphant wrote:
> Sebastian Haase wrote:
>
>> Tim Hochberg wrote:
>>
>>> This would work fine if repr were instead:
>>>
>>> dtype([('x', float64), ('z', complex128)])
>>>
>>> Anyway, this all seems reasonable to me at first glance. That said, I
>>> don't plan to work on this, I've got other fish to fry at the moment.
>>
>> A new point: Please remind me (and probably others): when did it get
>> decided to introduce 'complex128' to mean numarray's complex64
>> and the 'complex64' to mean numarray's complex32 ?
>
> It was last February (i.e. 2005) when I first started posting
> regarding the new NumPy. I claimed it was more consistent to use
> actual bit-widths. A few people, including Perry, indicated they
> weren't opposed to the change and so I went ahead with it.
>
> You can read relevant posts by searching on
> numpy-discussion at lists.sourceforge.net
>
> Discussions are always welcome. I suppose it's not too late to
> change something like this --- but it's getting there...
>
> -Travis

From robert.kern at gmail.com  Mon Apr 3 22:41:02 2006
From: robert.kern at gmail.com (Robert Kern)
Date: Mon Apr 3 22:41:02 2006
Subject: [Numpy-discussion] Re: Vote: complex64 vs complex128
In-Reply-To: <4431FE90.6060301@msg.ucsf.edu>
References: <442D9124.5020905@msg.ucsf.edu> <442D9695.2050900@cox.net>
	<442DB655.2050203@cox.net> <442DB91F.9030103@msg.ucsf.edu>
	<442DD638.60706@cox.net> <442FDDD5.8050404@sympatico.ca>
	<442FE950.8090000@cox.net> <44306594.50305@msg.ucsf.edu>
	<4431A8A0.9010604@ee.byu.edu> <4431FE90.6060301@msg.ucsf.edu>
Message-ID:

Sebastian Haase wrote:
> Hi,
> Could we start another poll on this !?

Please, let's leave voting as a method of last resort.

> I think I would vote
> +1 for complex32 & complex64 mostly just because of "that's what I'm
> used to"
>
> But I'm curious to hear what others "know to be in use" - e.g. Matlab or
> IDL !

On the merits of the issue, I like the new scheme better. For whatever
reason, I tend to remember it when coding. With Numeric, I would frequently
second-guess myself and go to the prompt and tab-complete to look at all of
the options and reason out the one I wanted.

-- Robert Kern
robert.kern at gmail.com

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth." -- Umberto Eco

From tim.hochberg at cox.net  Mon Apr 3 22:49:02 2006
From: tim.hochberg at cox.net (Tim Hochberg)
Date: Mon Apr 3 22:49:02 2006
Subject: [Numpy-discussion] Re: Vote: complex64 vs complex128
In-Reply-To:
References: <442D9124.5020905@msg.ucsf.edu> <442D9695.2050900@cox.net>
	<442DB655.2050203@cox.net> <442DB91F.9030103@msg.ucsf.edu>
	<442DD638.60706@cox.net> <442FDDD5.8050404@sympatico.ca>
	<442FE950.8090000@cox.net> <44306594.50305@msg.ucsf.edu>
	<4431A8A0.9010604@ee.byu.edu> <4431FE90.6060301@msg.ucsf.edu>
Message-ID: <443208B9.40106@cox.net>

Robert Kern wrote:

>Sebastian Haase wrote:
>
>>Hi,
>>Could we start another poll on this !?
>
>Please, let's leave voting as a method of last resort.
>
>>I think I would vote
>>+1 for complex32 & complex64 mostly just because of "that's what I'm
>>used to"
>>
>>But I'm curious to hear what others "know to be in use" - e.g. Matlab or
>>IDL !
>
>On the merits of the issue, I like the new scheme better. For whatever
>reason, I tend to remember it when coding. With Numeric, I would frequently
>second-guess myself and go to the prompt and tab-complete to look at all of
>the options and reason out the one I wanted.

I can't bring myself to care. I almost always use dtype=complex and on the
rare times I don't I can never remember what the scheme is regardless of
which scheme it is / was / will be.

On the other hand, if the scheme was Complex32x2 and Complex64x2, I could
probably decipher what that was without looking it up. It is a little ugly
and weird, I admit, but that probably wouldn't bother me.

Regards,

-tim

From arnd.baecker at web.de  Mon Apr 3 23:36:00 2006
From: arnd.baecker at web.de (Arnd Baecker)
Date: Mon Apr 3 23:36:00 2006
Subject: [Numpy-discussion] Re: Vote: complex64 vs complex128
In-Reply-To:
References: <442D9124.5020905@msg.ucsf.edu> <442D9695.2050900@cox.net>
	<442DB655.2050203@cox.net> <442DB91F.9030103@msg.ucsf.edu>
	<442DD638.60706@cox.net> <442FDDD5.8050404@sympatico.ca>
	<442FE950.8090000@cox.net> <44306594.50305@msg.ucsf.edu>
	<4431A8A0.9010604@ee.byu.edu> <4431FE90.6060301@msg.ucsf.edu>
Message-ID:

On Tue, 4 Apr 2006, Robert Kern wrote:

> Sebastian Haase wrote:
> > Hi,
> > Could we start another poll on this !?
>
> Please, let's leave voting as a method of last resort.
>
> > I think I would vote
> > +1 for complex32 & complex64 mostly just because of "that's what I'm
> > used to"
> >
> > But I'm curious to hear what others "know to be in use" - e.g. Matlab or
> > IDL !
>
> On the merits of the issue, I like the new scheme better. For whatever
> reason, I tend to remember it when coding. With Numeric, I would
> frequently second-guess myself and go to the prompt and tab-complete to
> look at all of the options and reason out the one I wanted.

In order to get an opinion on the subject: How would one presently find
out about the meaning of complex64 and complex128? The following attempt
does not help:

In [1]: import numpy

In [2]: numpy.complex64?
Type:           type
Base Class:
String Form:
Namespace:      Interactive
Docstring:

In [3]: numpy.complex128?
Type:           type
Base Class:
String Form:
Namespace:      Interactive
Docstring:

I also looked in Travis' "Guide to NumPy", where the different types are
discussed on page 18 (referring to the sample chapters at
http://www.tramy.us/guidetoscipy.html). Maybe chapter 12 contains more info
on this ((our library was still not able to buy the 20 copies since this
request was approved a month ago ...))

Best, Arnd
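Lacking docstrings, the dtype object itself is one place the answer can be read off at the prompt; a sketch of that kind of inspection (itemsize counts the bytes of a whole element, i.e. both components of the complex number):

    >>> import numpy
    >>> numpy.dtype(numpy.complex64).itemsize     # two float32s
    8
    >>> numpy.dtype(numpy.complex128).itemsize    # two float64s
    16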
From cjw at sympatico.ca  Tue Apr 4 06:20:44 2006
From: cjw at sympatico.ca (Colin J. Williams)
Date: Tue Apr 4 06:20:44 2006
Subject: [Numpy-discussion] Vote: complex64 vs complex128
In-Reply-To: <4431FE90.6060301@msg.ucsf.edu>
References: <442D9124.5020905@msg.ucsf.edu> <442D9695.2050900@cox.net>
	<442DB655.2050203@cox.net> <442DB91F.9030103@msg.ucsf.edu>
	<442DD638.60706@cox.net> <442FDDD5.8050404@sympatico.ca>
	<442FE950.8090000@cox.net> <44306594.50305@msg.ucsf.edu>
	<4431A8A0.9010604@ee.byu.edu> <4431FE90.6060301@msg.ucsf.edu>
Message-ID: <443271C9.6080907@sympatico.ca>

Sebastian Haase wrote:
> Hi,
> Could we start another poll on this !?
>
> I think I would vote
> +1 for complex32 & complex64 mostly just because of "that's what I'm
> used to"

+1 Most people look to the number to give a clue as to the precision of
the value.

Colin W.

> But I'm curious to hear what others "know to be in use" - e.g. Matlab
> or IDL !
>
> - Thanks
> Sebastian Haase
>
> Travis Oliphant wrote:
>
>> Sebastian Haase wrote:
>>
>>> Tim Hochberg wrote:
>>>
>>>> This would work fine if repr were instead:
>>>>
>>>> dtype([('x', float64), ('z', complex128)])
>>>>
>>>> Anyway, this all seems reasonable to me at first glance. That said,
>>>> I don't plan to work on this, I've got other fish to fry at the
>>>> moment.
>>>
>>> A new point: Please remind me (and probably others): when did it get
>>> decided to introduce 'complex128' to mean numarray's complex64
>>> and the 'complex64' to mean numarray's complex32 ?
>>
>> It was last February (i.e. 2005) when I first started posting
>> regarding the new NumPy. I claimed it was more consistent to use
>> actual bit-widths. A few people, including Perry, indicated they
>> weren't opposed to the change and so I went ahead with it.
>>
>> You can read relevant posts by searching on
>> numpy-discussion at lists.sourceforge.net
>>
>> Discussions are always welcome. I suppose it's not too late to
>> change something like this --- but it's getting there...
>>
>> -Travis

From ryanlists at gmail.com  Tue Apr 4 07:27:01 2006
From: ryanlists at gmail.com (Ryan Krauss)
Date: Tue Apr 4 07:27:01 2006
Subject: [Numpy-discussion] Re: string matrices
In-Reply-To:
References:
Message-ID:

I actually have a problem with the elements of a string matrix from
astype('S#'). The shorter elements in my matrix have a bunch of terms like
'1.0', because the matrix they started from was a float. I need to keep
the float type, but want to get rid of the '.0' when I convert the string
output to latex. I was going to check if element[-2:]=='.0' but ran into
this problem:

In [15]: temp[-2:]
Out[15]: '\x00\x00'

In [16]: temp.strip()
Out[16]: '1.0\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'

I think I can get rid of the \x00's by calling str(element), but is this a
feature or a bug? It would be slightly cleaner for me if the string matrix
elements didn't have the trailing null characters (or whatever those are),
but this may not be possible given the underlying representation.

Thanks,

Ryan

On 4/3/06, Ryan Krauss wrote:
> I am trying to use NumPy to generate some matrix inputs to Maxima for
> symbolic analysis. I am using a fair number of
> matrix.astype('S%d'%maxlen) statements. This seems to work very well.
> It also doesn't seem to pad the elements in any way if maxlen is
> bigger than I need, which is great. This may seem like a dumb
> computer science question, but what is the memory/performance cost of
> making maxlen bigger than I want (but making sure that it is way
> bigger than I need so that the elements don't get truncated)? If my
> biggest matrices will be 13x13, how long can the strings be before I
> consume more than a few megs (or a few dozen megs) of memory?
>
> Thanks,
>
> Ryan

From charlesr.harris at gmail.com  Tue Apr 4 08:16:07 2006
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Tue Apr 4 08:16:07 2006
Subject: [Numpy-discussion] Vote: complex64 vs complex128
In-Reply-To: <443271C9.6080907@sympatico.ca>
References: <442D9124.5020905@msg.ucsf.edu> <442DB655.2050203@cox.net>
	<442DB91F.9030103@msg.ucsf.edu> <442DD638.60706@cox.net>
	<442FDDD5.8050404@sympatico.ca> <442FE950.8090000@cox.net>
	<44306594.50305@msg.ucsf.edu> <4431A8A0.9010604@ee.byu.edu>
	<4431FE90.6060301@msg.ucsf.edu> <443271C9.6080907@sympatico.ca>
Message-ID:

I can't get worked up over this one way or the other: complex128 makes
sense if I count bits, complex64 makes sense if I note precision; I just
have to remember the numpy convention. One could argue that complex64 is
the more conventional choice and so has the virtue of least surprise, but
I don't think it is terribly difficult to become accustomed to using
complex128 in its place. I suppose this is one of those programmer's vs
user's point of view thingees. For the guy writing general low level
numpy code what matters is the length of the type, how many bytes have to
be moved and so on, and from the other point of view what counts is the
precision of the arithmetic.

Chuck

On 4/4/06, Colin J. Williams wrote:
>
> Sebastian Haase wrote:
>
> > Hi,
> > Could we start another poll on this !?
> >
> > I think I would vote
> > +1 for complex32 & complex64 mostly just because of "that's what I'm
> > used to"
>
> +1 Most people look to the number to give a clue as to the precision of
> the value.
>
> Colin W.
>
> >
> > But I'm curious to hear what others "know to be in use" - e.g. Matlab
> > or IDL !
> >
> > - Thanks
> > Sebastian Haase
> >
> > Travis Oliphant wrote:
> >
> >> Sebastian Haase wrote:
> >>
> >>> Tim Hochberg wrote:
> >>>
> >>>> This would work fine if repr were instead:
> >>>>
> >>>> dtype([('x', float64), ('z', complex128)])
> >>>>
> >>>> Anyway, this all seems reasonable to me at first glance. That said,
> >>>> I don't plan to work on this, I've got other fish to fry at the
> >>>> moment.
> >>>
> >>> A new point: Please remind me (and probably others): when did it get
> >>> decided to introduce 'complex128' to mean numarray's complex64
> >>> and the 'complex64' to mean numarray's complex32 ?
> >>
> >> It was last February (i.e. 2005) when I first started posting
> >> regarding the new NumPy. I claimed it was more consistent to use
> >> actual bit-widths. A few people, including Perry, indicated they
> >> weren't opposed to the change and so I went ahead with it.
> >>
> >> You can read relevant posts by searching on
> >> numpy-discussion at lists.sourceforge.net
> >>
> >> Discussions are always welcome. I suppose it's not too late to
> >> change something like this --- but it's getting there...
> >>
> >> -Travis

From faltet at carabos.com  Tue Apr 4 08:49:11 2006
From: faltet at carabos.com (Francesc Altet)
Date: Tue Apr 4 08:49:11 2006
Subject: [Numpy-discussion] Re: Vote: complex64 vs complex128
In-Reply-To:
References: <442D9124.5020905@msg.ucsf.edu> <4431FE90.6060301@msg.ucsf.edu>
Message-ID: <200604041747.57180.faltet@carabos.com>

On Tuesday 04 April 2006 07:40, Robert Kern wrote:
> Sebastian Haase wrote:
> > I think I would vote
> > +1 for complex32 & complex64 mostly just because of "that's what I'm
> > used to"
> >
> > But I'm curious to hear what others "know to be in use" - e.g. Matlab or
> > IDL !
>
> On the merits of the issue, I like the new scheme better. For whatever
> reason, I tend to remember it when coding. With Numeric, I would
> frequently second-guess myself and go to the prompt and tab-complete to
> look at all of the options and reason out the one I wanted.

I agree with Robert. From the very beginning NumPy design has been very
consistent with the typeEXTENT_IN_BITS mapping (even for unicode), and if
we go back to the numarray (complex32/complex64) convention, this would be
the only exception to this rule. Perhaps I'm a bit biased by being a
developer more interested in type 'sizes' than in 'precision' issues, but
I'd definitely prefer a completely consistent approach for this matter.

So +1 for complex64 & complex128

Cheers,

--
>0,0<   Francesc Altet     http://www.carabos.com/
V   V   Cárabos Coop. V.   Enjoy Data
 "-"
From haase at msg.ucsf.edu  Tue Apr 4 09:33:07 2006
From: haase at msg.ucsf.edu (Sebastian Haase)
Date: Tue Apr 4 09:33:07 2006
Subject: [Numpy-discussion] Vote: complex64 vs complex128
In-Reply-To:
References: <442D9124.5020905@msg.ucsf.edu> <443271C9.6080907@sympatico.ca>
Message-ID: <200604040929.15815.haase@msg.ucsf.edu>

On Tuesday 04 April 2006 08:09, Charles R Harris wrote:
> I can't get worked up over this one way or the other: complex128 makes
> sense if I count bits, complex64 makes sense if I note precision; I just
> have to remember the numpy convention. One could argue that complex64 is
> the more conventional choice and so has the virtue of least surprise, but
> I don't think it is terribly difficult to become accustomed to using
> complex128 in its place. I suppose this is one of those programmer's vs
> user's point of view thingees. For the guy writing general low level
> numpy code what matters is the length of the type, how many bytes have to
> be moved and so on, and from the other point of view what counts is the
> precision of the arithmetic.

I kind of like your comparison of programmer vs user ;-)
And so I was "hoping" that numpy (and scipy !!) is intended for the users -
like supposedly IDL and Matlab are...

No one likes my "backwards compatibility" argument !?

Thanks
- Sebastian Haase

PS: I understand that voting is only for a last resort - some people always
use na.Complex and na.Float and don't care - BUT I use single precision all
the time because my image data is already getting too large. So I have to
look at this every day, and as Travis pointed out, now is about the last
chance to possibly change complex128 to complex64 ...

> Chuck
>
> On 4/4/06, Colin J. Williams wrote:
> > Sebastian Haase wrote:
> > > Hi,
> > > Could we start another poll on this !?
> > >
> > > I think I would vote
> > > +1 for complex32 & complex64 mostly just because of "that's what I'm
> > > used to"
> >
> > +1 Most people look to the number to give a clue as to the precision of
> > the value.
> >
> > Colin W.
> >
> > > But I'm curious to hear what others "know to be in use" - e.g. Matlab
> > > or IDL !
> > >
> > > - Thanks
> > > Sebastian Haase
> > >
> > > Travis Oliphant wrote:
> > >> Sebastian Haase wrote:
> > >>> Tim Hochberg wrote:
> > >>>
> > >>>> This would work fine if repr were instead:
> > >>>>
> > >>>> dtype([('x', float64), ('z', complex128)])
> > >>>>
> > >>>> Anyway, this all seems reasonable to me at first glance. That said,
> > >>>> I don't plan to work on this, I've got other fish to fry at the
> > >>>> moment.
> > >>>
> > >>> A new point: Please remind me (and probably others): when did it get
> > >>> decided to introduce 'complex128' to mean numarray's complex64
> > >>> and the 'complex64' to mean numarray's complex32 ?
> > >>
> > >> It was last February (i.e. 2005) when I first started posting
> > >> regarding the new NumPy. I claimed it was more consistent to use
> > >> actual bit-widths. A few people, including Perry, indicated they
> > >> weren't opposed to the change and so I went ahead with it.
> > >>
> > >> You can read relevant posts by searching on
> > >> numpy-discussion at lists.sourceforge.net
> > >>
> > >> Discussions are always welcome. I suppose it's not too late to
> > >> change something like this --- but it's getting there...
> > >> > > >> -Travis From robert.kern at gmail.com Tue Apr 4 09:52:11 2006 From: robert.kern at gmail.com (Robert Kern) Date: Tue Apr 4 09:52:11 2006 Subject: [Numpy-discussion] Re: string matrices In-Reply-To: References: Message-ID: Ryan Krauss wrote: > I actually have a problem with the elements of a string matrix from > astype('S#'). The shorter elements in my matrix have a bunch of terms > like '1.0', because the matrix they started from was a float. I need > to keep the float type, but want to get rid of the '.0 ' when I > convert the string output to latex. I was going to check if > element[-2:]=='.0' but ran into this problem: > > In [15]: temp[-2:] > Out[15]: '\x00\x00' > > In [16]: temp.strip() > Out[16]: '1.0\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00' > > I think I can get rid of the \x00's by calling str(element), but is > this a feature or a bug? Probably both. :-) On the one hand, you want to be able to get a useful string out of the array; the nulls are just padding, and the string that you put in was '1.0'. However, suppose that the string you put in was '1.\x00'. Then you would get the "wrong" string out. However, the only real alternative is to also store an integer containing the length of the string with each element. That probably interferes with some of the uses of string arrays. > It would be slightly cleaner for me if the > string matrix elements didn't have the trailing null characters (or > whatever those are), but this may not be possible given the underlying > representation. You can also use temp.strip('\x00') which is a bit more explicit. -- Robert Kern robert.kern at gmail.com "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From zpincus at stanford.edu Tue Apr 4 09:54:06 2006 From: zpincus at stanford.edu (Zachary Pincus) Date: Tue Apr 4 09:54:06 2006 Subject: [Numpy-discussion] Re: Vote: complex64 vs complex128 In-Reply-To: <443208B9.40106@cox.net> References: <442D9124.5020905@msg.ucsf.edu> <442D9695.2050900@cox.net> <442DB655.2050203@cox.net> <442DB91F.9030103@msg.ucsf.edu> <442DD638.60706@cox.net> <442FDDD5.8050404@sympatico.ca> <442FE950.8090000@cox.net> <44306594.50305@msg.ucsf.edu> <4431A8A0.9010604@ee.byu.edu> <4431FE90.6060301@msg.ucsf.edu> <443208B9.40106@cox.net> Message-ID: > On the other hand, if the scheme was Complex32x2 and Complex64x2, > I could probably decipher what that was without looking it up. It > is is a little ugly and weird I admit, but that probably wouldn't > bother me. On consideration, I'm +1 on Tim's suggestion here, if any change is going to be made. At least it has the virtue of being relatively clear, if a bit ugly. Zach From jh at oobleck.astro.cornell.edu Tue Apr 4 11:14:04 2006 From: jh at oobleck.astro.cornell.edu (Joe Harrington) Date: Tue Apr 4 11:14:04 2006 Subject: [Numpy-discussion] Re: Vote: complex64 vs complex128 In-Reply-To: <20060404155003.AF09016220@sc8-sf-spam2.sourceforge.net> (numpy-discussion-request@lists.sourceforge.net) References: <20060404155003.AF09016220@sc8-sf-spam2.sourceforge.net> Message-ID: <200604041813.k34IDmAP011500@oobleck.astro.cornell.edu> When I first heard of Complex128, my first response was, "Cool! I didn't even know there was a Double128!" Folks seem to agree that precision-based naming would be most intuitive to new users, but that length-based naming would be most intuitive to low-level programmers. 
This is a high-level package, whose purpose is to hide the numerical details and programming drudgery from the user as much as possible, while still offering high performance and not limiting capability too much. For this type of package, a good metric is "when it doesn't restrict capability, do what makes sense for new/naiive users". So, I favor Complex32 and Complex64. When you say "complex", everyone knows you mean 2 numbers. When you say 32 or 64 or 128, in the context of bits for floating values, almost everyone assumes you are talking that many bits of precision to represent one number. Consider future conversations about precision and data size. In precision discussions, you'd always have to clarify that complex128 had 64 bits of precision, just to make sure everyone was on the same key (particularly when 128-bit machines arrive). In data-size discussions, everyone would know to double the size for the two components. No extra clarification would be needed. IDL's behavior is irrelevant to us, since they just say "complex", and "dcomplex" for 32-bit and 64-bit precision. --jh-- From oliphant.travis at ieee.org Tue Apr 4 11:25:11 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Tue Apr 4 11:25:11 2006 Subject: [Numpy-discussion] Re: string matrices In-Reply-To: References: Message-ID: <4432B9C2.7040307@ieee.org> Ryan Krauss wrote: > I actually have a problem with the elements of a string matrix from > astype('S#'). The shorter elements in my matrix have a bunch of terms > like '1.0', because the matrix they started from was a float. I need > to keep the float type, but want to get rid of the '.0 ' when I > convert the string output to latex. I was going to check if > element[-2:]=='.0' but ran into this problem > > In [15]: temp[-2:] > Out[15]: '\x00\x00' > > In [16]: temp.strip() > Out[16]: '1.0\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00' > > I think I can get rid of the \x00's by calling str(element), but is > this a feature or a bug? Of course the elements are padded with '\x00' so that they are all the same length, but we have been trying to make it so that it doesn't matter. Equality testing is one area where it still does. We are using the underlying string equality testing (and it doesn't strip the '\x00'). So, I guess it's a missing feature at this point. -Travis From tim.hochberg at cox.net Tue Apr 4 11:41:10 2006 From: tim.hochberg at cox.net (Tim Hochberg) Date: Tue Apr 4 11:41:10 2006 Subject: [Numpy-discussion] Re: string matrices In-Reply-To: References: Message-ID: <4432BD89.3050501@cox.net> Robert Kern wrote: >Ryan Krauss wrote: > > >>I actually have a problem with the elements of a string matrix from >>astype('S#'). The shorter elements in my matrix have a bunch of terms >>like '1.0', because the matrix they started from was a float. I need >>to keep the float type, but want to get rid of the '.0 ' when I >>convert the string output to latex. I was going to check if >>element[-2:]=='.0' but ran into this problem: >> >>In [15]: temp[-2:] >>Out[15]: '\x00\x00' >> >>In [16]: temp.strip() >>Out[16]: '1.0\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00' >> >>I think I can get rid of the \x00's by calling str(element), but is >>this a feature or a bug? >> >> > >Probably both. :-) On the one hand, you want to be able to get a useful string >out of the array; the nulls are just padding, and the string that you put in was >'1.0'. However, suppose that the string you put in was '1.\x00'. 
Then you would >get the "wrong" string out. > >However, the only real alternative is to also store an integer containing the >length of the string with each element. That probably interferes with some of >the uses of string arrays. > > > >>It would be slightly cleaner for me if the >>string matrix elements didn't have the trailing null characters (or >>whatever those are), but this may not be possible given the underlying >>representation. >> >> > >You can also use temp.strip('\x00') which is a bit more explicit. > > > Or even temp.rstrip('\x00') which works for all those time you pad the front of your string with '\x00' ;) -tim From faltet at carabos.com Tue Apr 4 11:46:08 2006 From: faltet at carabos.com (Francesc Altet) Date: Tue Apr 4 11:46:08 2006 Subject: [Numpy-discussion] Re: Vote: complex64 vs complex128 In-Reply-To: <200604041813.k34IDmAP011500@oobleck.astro.cornell.edu> References: <20060404155003.AF09016220@sc8-sf-spam2.sourceforge.net> <200604041813.k34IDmAP011500@oobleck.astro.cornell.edu> Message-ID: <200604042045.39955.faltet@carabos.com> A Dimarts 04 Abril 2006 20:13, Joe Harrington va escriure: > When I first heard of Complex128, my first response was, "Cool! I > didn't even know there was a Double128!" > > Folks seem to agree that precision-based naming would be most > intuitive to new users, but that length-based naming would be most > intuitive to low-level programmers. This is a high-level package, > whose purpose is to hide the numerical details and programming > drudgery from the user as much as possible, while still offering high > performance and not limiting capability too much. For this type of > package, a good metric is "when it doesn't restrict capability, do > what makes sense for new/naiive users". > > So, I favor Complex32 and Complex64. When you say "complex", everyone > knows you mean 2 numbers. When you say 32 or 64 or 128, in the > context of bits for floating values, almost everyone assumes you are > talking that many bits of precision to represent one number. Consider > future conversations about precision and data size. In precision > discussions, you'd always have to clarify that complex128 had 64 bits > of precision, just to make sure everyone was on the same key > (particularly when 128-bit machines arrive). In data-size > discussions, everyone would know to double the size for the two > components. No extra clarification would be needed. Well, from my point of view of "low-level" user, I don't specially like this, but I understand the "high-level" position to be much more important in terms of number of users. Besides, I also see that NumPy should be adressed specially to the requirements of the later users. So for me is fine with complex32/complex64. Cheers, -- >0,0< Francesc Altet ? ? http://www.carabos.com/ V V C?rabos Coop. V. ??Enjoy Data "-" From robert.kern at gmail.com Tue Apr 4 12:15:08 2006 From: robert.kern at gmail.com (Robert Kern) Date: Tue Apr 4 12:15:08 2006 Subject: [Numpy-discussion] Re: Vote: complex64 vs complex128 In-Reply-To: <200604041813.k34IDmAP011500@oobleck.astro.cornell.edu> References: <20060404155003.AF09016220@sc8-sf-spam2.sourceforge.net> <200604041813.k34IDmAP011500@oobleck.astro.cornell.edu> Message-ID: Joe Harrington wrote: > When I first heard of Complex128, my first response was, "Cool! I > didn't even know there was a Double128!" > > Folks seem to agree that precision-based naming would be most > intuitive to new users, but that length-based naming would be most > intuitive to low-level programmers. 
This is a high-level package, > whose purpose is to hide the numerical details and programming > drudgery from the user as much as possible, while still offering high > performance and not limiting capability too much. For this type of > package, a good metric is "when it doesn't restrict capability, do > what makes sense for new/naiive users". > > So, I favor Complex32 and Complex64. When you say "complex", everyone > knows you mean 2 numbers. When you say 32 or 64 or 128, in the > context of bits for floating values, almost everyone assumes you are > talking that many bits of precision to represent one number. Consider > future conversations about precision and data size. In precision > discussions, you'd always have to clarify that complex128 had 64 bits > of precision, just to make sure everyone was on the same key > (particularly when 128-bit machines arrive). In data-size > discussions, everyone would know to double the size for the two > components. No extra clarification would be needed. Well, from my point of view as a "low-level" user, I don't especially like this, but I understand the "high-level" position to be much more important in terms of number of users. Besides, I also see that NumPy should be addressed especially to the requirements of the latter users. So complex32/complex64 is fine with me. Cheers, -- >0,0< Francesc Altet http://www.carabos.com/ V V Cárabos Coop. V. Enjoy Data "-" From robert.kern at gmail.com Tue Apr 4 12:15:08 2006 From: robert.kern at gmail.com (Robert Kern) Date: Tue Apr 4 12:15:08 2006 Subject: [Numpy-discussion] Re: Vote: complex64 vs complex128 In-Reply-To: <200604041813.k34IDmAP011500@oobleck.astro.cornell.edu> References: <20060404155003.AF09016220@sc8-sf-spam2.sourceforge.net> <200604041813.k34IDmAP011500@oobleck.astro.cornell.edu> Message-ID: Joe Harrington wrote: > When I first heard of Complex128, my first response was, "Cool! I > didn't even know there was a Double128!" > > Folks seem to agree that precision-based naming would be most > intuitive to new users, but that length-based naming would be most > intuitive to low-level programmers. This is a high-level package, > whose purpose is to hide the numerical details and programming > drudgery from the user as much as possible, while still offering high > performance and not limiting capability too much. For this type of > package, a good metric is "when it doesn't restrict capability, do > what makes sense for new/naiive users". I'm pretty sure that when any of us say that such-and-such is going to make the most sense to new users, we're just guessing. Or projecting our experienced-user prejudices onto them. If I had to register my guess, I would say that either way will make just as much sense to new users. I think it's time that we start taking backwards compatibility with previous releases of numpy seriously and not break numpy code without clear, significant gains in usability. -- Robert Kern robert.kern at gmail.com "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From aisaac at american.edu Tue Apr 4 12:38:05 2006 From: aisaac at american.edu (Alan G Isaac) Date: Tue Apr 4 12:38:05 2006 Subject: [Numpy-discussion] Re: Vote: complex64 vs complex128 In-Reply-To: References: <20060404155003.AF09016220@sc8-sf-spam2.sourceforge.net> <200604041813.k34IDmAP011500@oobleck.astro.cornell.edu> Message-ID: On Tue, 04 Apr 2006, Robert Kern apparently wrote: > I would say that either way will make just as much sense > to new users. User's perspective: agreed. Just give me i. consistency and ii. an easy way to inspect the object for its meaning. Cheers, Alan Isaac From tim.hochberg at cox.net Tue Apr 4 12:52:04 2006 From: tim.hochberg at cox.net (Tim Hochberg) Date: Tue Apr 4 12:52:04 2006 Subject: [Numpy-discussion] Re: Vote: complex64 vs complex128 In-Reply-To: References: <20060404155003.AF09016220@sc8-sf-spam2.sourceforge.net> <200604041813.k34IDmAP011500@oobleck.astro.cornell.edu> Message-ID: <4432CE1F.3010209@cox.net> Robert Kern wrote: >Joe Harrington wrote: > > >>When I first heard of Complex128, my first response was, "Cool! I >>didn't even know there was a Double128!" >> >>Folks seem to agree that precision-based naming would be most >>intuitive to new users, but that length-based naming would be most >>intuitive to low-level programmers. This is a high-level package, >>whose purpose is to hide the numerical details and programming >>drudgery from the user as much as possible, while still offering high >>performance and not limiting capability too much. For this type of >>package, a good metric is "when it doesn't restrict capability, do >>what makes sense for new/naiive users". >> >> > >I'm pretty sure that when any of us say that such-and-such is going to make the >most sense to new users, we're just guessing. Or projecting our experienced-user >prejudices onto them. If I had to register my guess, I would say that either way >will make just as much sense to new users. > > Agreed. >I think it's time that we start taking backwards compatibility with previous >releases of numpy seriously and not break numpy code without clear, significant >gains in usability. > > So what does that mean in this case? The current status; nice for existing users of numpy. Or, the old status, nice for people transitioning to numpy from Numeric. It's hard to know which way these backwards compatibility arguments cut when they involve reverting a change from some old behaviour. I've got an idea.
Rather than go round and round about complex64 versus complex128, let's just leave things as they are and add a docstring to complex128 and complex64 explaining the situation. [code...code...] >>> help(complex128) class complex128scalar(complexfloatingscalar, complex) | complex128: composed of two 64 bit floats | | Method resolution order: | complex128scalar | complexfloatingscalar | inexactscalar | numberscalar | genericscalar | complex | object ... If someone wants to give me some better text for the docstring, I'll go ahead and commit this change. Heck, if you've got some text for the other scalar objects (within reason) I'll be happy to add that at the same time. Regards, -tim From robert.kern at gmail.com Tue Apr 4 13:06:01 2006 From: robert.kern at gmail.com (Robert Kern) Date: Tue Apr 4 13:06:01 2006 Subject: [Numpy-discussion] Re: Vote: complex64 vs complex128 In-Reply-To: <4432CE1F.3010209@cox.net> References: <20060404155003.AF09016220@sc8-sf-spam2.sourceforge.net> <200604041813.k34IDmAP011500@oobleck.astro.cornell.edu> <4432CE1F.3010209@cox.net> Message-ID: Tim Hochberg wrote: > Robert Kern wrote: >> I think it's time that we start taking backwards compatibility with >> previous >> releases of numpy seriously and not break numpy code without clear, >> significant >> gains in usability. >> > So what does that mean in this case? The current status; nice for > existing users of numpy. Or, the old status, nice for people > transitioning to numpy from Numeric. It's hard to know which way these > backwards compatibility arguments cut when they involve reverting a > change from some old behaviour. I mean numpy. Neither complex64 nor complex128 are backwards-compatible with Numeric. Complex32 and Complex64 already exist and are hopefully isolated as compatibility aliases for typecodes. By backwards-compatibility, I refer to code, not habits. > I've got an idea. Rather than go round and round about complex64 versus > complex128, let's just leave things as they are and add a docstring to > complex128 and complex64 explaining the situation. [code...code...] > > >>> help(complex128) > class complex128scalar(complexfloatingscalar, complex) > | complex128: composed of two 64 bit floats > | > | Method resolution order: > | complex128scalar > | complexfloatingscalar > | inexactscalar > | numberscalar > | genericscalar > | complex > | object > ... > > If someone wants to give me some better text for the docstring, I'll go > ahead and commit this change. Heck, if you've got some text for the other > scalar objects (within reason) I'll be happy to add that at the same time. +1 -- Robert Kern robert.kern at gmail.com "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From oliphant at ee.byu.edu Tue Apr 4 13:42:38 2006 From: oliphant at ee.byu.edu (Travis Oliphant) Date: Tue Apr 4 13:42:38 2006 Subject: [Numpy-discussion] Re: Vote: complex64 vs complex128 In-Reply-To: References: <20060404155003.AF09016220@sc8-sf-spam2.sourceforge.net> <200604041813.k34IDmAP011500@oobleck.astro.cornell.edu> Message-ID: <4432D9C3.3040109@ee.byu.edu> Robert Kern wrote: >Joe Harrington wrote: > > >>When I first heard of Complex128, my first response was, "Cool! I >>didn't even know there was a Double128!" >> >>Folks seem to agree that precision-based naming would be most >>intuitive to new users, but that length-based naming would be most >>intuitive to low-level programmers.
This is a high-level package, >>whose purpose is to hide the numerical details and programming >>drudgery from the user as much as possible, while still offering high >>performance and not limiting capability too much. For this type of >>package, a good metric is "when it doesn't restrict capability, do >>what makes sense for new/naiive users". >> >> > >I'm pretty sure that when any of us say that such-and-such is going to make the >most sense to new users, we're just guessing. Or projecting our experienced-user >prejudices onto them. If I had to register my guess, I would say that either way >will make just as much sense to new users. > > Totally agree. I don't see the argument that Complex64 is a "precision" description. To a new user it could go either way depending on their previous experience. I think most new users won't even use the bit width names but will instead use 'complex' and be done with it... >I think it's time that we start taking backwards compatibility with previous >releases of numpy seriously and not break numpy code without clear, significant >gains in usability. > > +1 -Travis From perry at stsci.edu Tue Apr 4 14:09:02 2006 From: perry at stsci.edu (Perry Greenfield) Date: Tue Apr 4 14:09:02 2006 Subject: [Numpy-discussion] Re: Vote: complex64 vs complex128 In-Reply-To: <4432D9C3.3040109@ee.byu.edu> References: <20060404155003.AF09016220@sc8-sf-spam2.sourceforge.net> <200604041813.k34IDmAP011500@oobleck.astro.cornell.edu> <4432D9C3.3040109@ee.byu.edu> Message-ID: <6e9f9be0cfb968840dc4314d65c9e655@stsci.edu> On Apr 4, 2006, at 4:40 PM, Travis Oliphant wrote: > > Totally agree. I don't see the argument that Complex64 is a > "precision" description. To a new user it could go either way > depending on their previous experience. I think most new users won't > even use the bit width names but will instead use 'complex' and be > done with it... > >> I think it's time that we start taking backwards compatibility with >> previous >> releases of numpy seriously and not break numpy code without clear, >> significant >> gains in usability. >> > +1 > The issue that just won't go away. We did it the current way for numarray initially and were persuaded to switch to be compatible with Numeric. I agree that it isn't obvious what the number means for complex. That ambiguity will always be there. Unless we did a real user test to find out, we wouldn't know for sure what future users would most likely expect. But in the end, pick one and let's not change it again (or even talk about changing it). It doesn't matter that much to me which it is. Perry From oliphant at ee.byu.edu Tue Apr 4 14:18:59 2006 From: oliphant at ee.byu.edu (Travis Oliphant) Date: Tue Apr 4 14:18:59 2006 Subject: [Numpy-discussion] NumPy documentation Message-ID: <4432E27E.6030906@ee.byu.edu> I received a rather hurtful email today that was very discouraging to me personally. Basically, I was called "lame" and a "wolf" in sheep's clothing because I'm charging for documentation. Fortunately it's the first email of that nature I've received. Others have disagreed with my choice to charge for the documentation but at least they've not resorted to personal attacks on me and my motivations. Please know that such emails do have an impact. While I try to build a tough skin, such unappreciative statements reduce my enthusiasm for working on NumPy significantly. My purpose, however, is not to rant about the misguided words of one person. He brought up a point that I want to clarify. 
He asked if I "would sue" if somebody else wrote documentation for NumPy. I want to be perfectly clear that this is a ridiculous statement that barely deserves a response. Of course I wouldn't. First of all, it would be extreme circumstances indeed for me to resort to that course of action (basically a company would have to copy my book and start distributing it on a large scale, belligerently). Second of all, I would love to see *more* documentation for NumPy. If there are other (less vocal) people out there who are not using NumPy because of my book, then I certainly feel sorry about that. Please dig in and create the documentation you so urgently want to be free. I will not stand in your way, but may even help. But please consider that time is money. Most people are better off spending their time on something else and just cooperating with others by paying for the book. But, I'm not going to dislike or have any kind of ill feelings with anyone who decides to spend their time on "documentation." In fact, I'll appreciate it just like everyone else. I love the growth of the SciPy Wiki. There are some great recipes and examples there. This is fantastic. I'm 100% behind this kind of work. Rather than write some kind of "replacement" documentation, contribute docstrings to the code and recipes to the Wiki. Then, those that can't or won't buy the book will still have plenty of resources to use to learn NumPy. I'm completely behind all forms of "free" information on NumPy / SciPy and related tools. The only reason I have to charge for the documentation is that I just don't have the resources to simply donate *all* of my time. I want to thank all of you who have already purchased the documentation. It has been extremely helpful to me personally and professionally. Without you, my time to spend on NumPy would have been significantly reduced. Thank you very much. Best wishes, -Travis From Chris.Barker at noaa.gov Tue Apr 4 14:48:01 2006 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Tue Apr 4 14:48:01 2006 Subject: [Numpy-discussion] NumPy documentation In-Reply-To: <4432E27E.6030906@ee.byu.edu> References: <4432E27E.6030906@ee.byu.edu> Message-ID: <4432E973.8070601@noaa.gov> Travis, I'm very sorry to hear that you got such a response. It was completely unwarranted. I am often quite surprised at the vitriol that sometimes results from people that are not getting what they want from an open source project. Indeed, the comment about "suing" makes it completely clear that this individual completely misunderstood your intentions (and the reality of copyright law: you would only have a course of action if your book was copied!). When you first announced the book, I know there was a fair bit of discussion about it, and you made it quite clear how reasonable your position is.
Personally, I think financing open source projects by writing and selling books about them is an excellent approach: it works well for everyone. My freedom is not restricted, you get some compensation for your time. Ideally, I'd like to see comprehensive reference documentation distributed for free, while more in-depth explanatory docs could be either free or not. One of these days I'll put my keyboard where my mouth is and actually write a doc string or two! In the meantime, I am absolutely thrilled that you've put as much effort into numpy as you have. You are doing a fabulous job, and I hope the appreciation of all is clear to you. thank you, -Chris PS: If we get a reasonable budget next year, I'll be sure to buy a few copies of your book. -- Christopher Barker, Ph.D. Oceanographer NOAA/OR&R/HAZMAT (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From tim.hochberg at cox.net Tue Apr 4 15:37:06 2006 From: tim.hochberg at cox.net (Tim Hochberg) Date: Tue Apr 4 15:37:06 2006 Subject: [Numpy-discussion] NumPy documentation In-Reply-To: <4432E973.8070601@noaa.gov> References: <4432E27E.6030906@ee.byu.edu> <4432E973.8070601@noaa.gov> Message-ID: <4432F4DD.6060000@cox.net> Travis, I'm sorry to hear that you received such an unwarranted attack. Although, sadly, not terribly surprised; there are plenty of unpleasant fanatics of various stripes that roam the bitstreams. Let me add a hearty "me too" to everything that Chris just said. This finally motivated me to go out and buy your book, something that's been on my list of things that I should do "one of these days now". I'm hoping that makes this mystery person unhappy. Regards, -tim From svetosch at gmx.net Tue Apr 4 16:03:02 2006 From: svetosch at gmx.net (Sven Schreiber) Date: Tue Apr 4 16:03:02 2006 Subject: [Numpy-discussion] kron with matrices Message-ID: <4432FADE.3070705@gmx.net> Hi, first of all thanks for including kron in numpy, it's very useful. Now I have just built numpy from svn for the first time in order to spot matrix-related bugs before a new release as promised. That worked well, thanks to the great wiki instructions. The old bugs (in linalg) are gone, but I wonder whether the following behavior is another one: >>> import numpy as n >>> n.kron(n.asmatrix(n.ones((1,2))), n.asmatrix(n.zeros((2,2)))) array([[0, 0, 0, 0], [0, 0, 0, 0]]) I would prefer if kron returned a matrix at least if both inputs are matrices, as in the given example. Thanks, Sven From jdhunter at ace.bsd.uchicago.edu Tue Apr 4 16:10:13 2006 From: jdhunter at ace.bsd.uchicago.edu (John Hunter) Date: Tue Apr 4 16:10:13 2006 Subject: [Numpy-discussion] NumPy documentation In-Reply-To: <4432E27E.6030906@ee.byu.edu> (Travis Oliphant's message of "Tue, 04 Apr 2006 15:17:50 -0600") References: <4432E27E.6030906@ee.byu.edu> Message-ID: <87wte5ndot.fsf@peds-pc311.bsd.uchicago.edu> >>>>> "Travis" == Travis Oliphant writes: Travis> I received a rather hurtful email today that was very Travis> discouraging to me personally. Basically, I was called Travis> "lame" and a "wolf" in sheep's clothing because I'm Travis> charging for documentation. Fortunately it's the first Wow, harsh. I would just like to (for a second time) voice my support for your charging for documentation, and throw out a couple of points for people to consider who oppose it.
I think a low-ball estimate of the dollar value of the amount of time Travis has donated to scientific python is about $500,000 dollars (5 years, full-time, $100k/yr -- this is low ball because he has probably donated more time and he is certainly worth more than that annually!). If he gets the $300,000 or so dollars he hopes to raise from this book, he still has a net contribution of more than $200k. Those of you who are critical: have you put in that much of your time or money? Secondly, I know personally that Travis has resisted several offers to lure him from academia into industry. Academia, by its nature, affords more flexibility to develop open source software driven by issues of breadth and quality rather than deadlines and customer demands. By charging for this book, it makes it more feasible for him to continue to work in academia and support these projects. Travis and I share some similarities: we both have a wife and kids, with low-paying academic careers, and lead active python projects. Only Travis leads two projects to my one and he has five kids to my three. I recently left academia for a job in industry because of financial considerations, and while my firm is supportive of my matplotlib development (we use it and python extensively in house), it does leave me less time for development. So to those of you grumbling to Travis directly or behind the scenes, think about what he is giving and back off. And start donating some of your own time instead of encouraging Travis to donate more of his. JDH From aisaac at american.edu Tue Apr 4 16:27:10 2006 From: aisaac at american.edu (Alan G Isaac) Date: Tue Apr 4 16:27:10 2006 Subject: [Numpy-discussion] NumPy documentation In-Reply-To: <4432E27E.6030906@ee.byu.edu> References: <4432E27E.6030906@ee.byu.edu> Message-ID: On Tue, 04 Apr 2006, Travis Oliphant apparently wrote: > I'm not going to dislike or have any kind of ill feelings > with anyone who decides to spend their time on > "documentation." In fact, I'll appreciate it just like > everyone else. Of course you were extremely clear about this from the beginning. Thank you for numpy!!! Alan Isaac (grateful user of numpy) PS Your book is *very* helpful. From zpincus at stanford.edu Tue Apr 4 16:48:06 2006 From: zpincus at stanford.edu (Zachary Pincus) Date: Tue Apr 4 16:48:06 2006 Subject: [Numpy-discussion] NumPy documentation In-Reply-To: <4432F4DD.6060000@cox.net> References: <4432E27E.6030906@ee.byu.edu> <4432E973.8070601@noaa.gov> <4432F4DD.6060000@cox.net> Message-ID: <1A94CA82-2EB5-4145-9EA9-453DE60AE684@stanford.edu> Hi folks - I must admit that when I first saw the trelgol web page, I was briefly a bit confused and put off about the prospect of moving to numpy from Numeric. Now, it didn't take long for me to come to my senses and realize (a) that no formerly-free documentation had been revoked, (b) that there was enough documentation about the C API in the numpy distribution to get me started, (c) that there was a lot of support available on the email list, and most importantly (d) that Travis and many others are extremely generous with their time, both in answering emails on the numpy list and in making numpy better. I now of course wholeheartedly agree with everything everyone has said in this thread, and with the idea behind selling the documentation. In fact, I feel a bit ashamed that I ever felt otherwise, even though it was just for a few minutes. However, were I a more grumpy (or stupid) type, I might not have come to my senses as rapidly, or ever. 
That would have been my loss, of course. But, perhaps a few little things could help newcomers better understand the rationale behind the ebook. Basically, everyone on this list knows (and supports, it seems!) the reasoning behind selling the docs, because it was discussed on the list. However, it's not hard to imagine someone new to numpy, or maybe a convert from Numeric (who was used to the large, free manual) scratching their head a little when confronted with http:// www.tramy.us/ . (It's less reasonable to imagine someone then going on to personally attack Travis in email -- that's absolutely unconscionable.) I would suggest that the link from the scipy page be changed to point to http://www.tramy.us/guidetoscipy.html , which is a little more clearly about the ebook, and a little less about the publishing method. It might not hurt to expand a bit on that page and mention the basic reasoning behind selling the docs, and even (if you see fit, Travis) to maybe include links to the other numpy documentation resources (list archive and sign up page, old and out-of-date Numeric reference [with maybe some mention of why buying the book would be better, but that the old ref at least gives the right high-level picture to get a newcomer started using numpy], and the numpy wiki pages). Any of this would certainly put a newcomer in a more charitable state of mind, and forestall any lingering concerns about greed or any such foolishness. Since free advice is worth exactly what you paid for it, feel free to ignore any or all of this. I just wanted to mention a few easy things that I think might help newcomers understand and feel good about the ebook (the first step toward buying it!). Zach On Apr 4, 2006, at 5:36 PM, Tim Hochberg wrote: > > Travis, > > I'm sorry to hear that you received such an unwarranted attack. > Although, sadly, not terribly suprised; there are plenty of > unpleasant fanatics of various stripes that roam the bitstreams. > Let me add a hearty "me too" to everything that Chris just said. > > This finally motivated me to go out and buy your book, something > that's been on my list of things that I should do "one of these > days now". I'm hoping that makes this mystery person unhappy. > > Regards, > -tim > > > > ------------------------------------------------------- > This SF.Net email is sponsored by xPML, a groundbreaking scripting > language > that extends applications into web and mobile media. Attend the > live webcast > and join the prime developer group breaking into this new coding > territory! > http://sel.as-us.falkag.net/sel? > cmd=lnk&kid=110944&bid=241720&dat=121642 > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/numpy-discussion From zpincus at stanford.edu Tue Apr 4 17:19:18 2006 From: zpincus at stanford.edu (Zachary Pincus) Date: Tue Apr 4 17:19:18 2006 Subject: [Numpy-discussion] array constructor from generators? Message-ID: Hi folks, Sorry if this has already been discussed, but do you all think it a good idea to extend the array constructor so that it can accept generators instead of lists? I often construct arrays from list comprehensions on generators, e.g. 
to read a tab-delimited file in: numpy.array([map(float, line.split()) for line in file]) or making an array of pairs of numbers: numpy.array([f for f in unique_combinations(input, 2)]) If the array constructor accepted generators (and turned them into lists behind the scenes, or even evaluated them lazily while filling in the memory buffer, not sure what would be more efficient), the above could be written somewhat more cleanly: numpy.array(map(float, line.split()) for line in file) (using a generator expression) and numpy.array(unique_combinations(input, 2)) the latter is especially a win. Moreover, it's becoming more standard for any python thing that can accept a list to also accept a generator. The downside is that currently, passing array() an object makes a 0-d object array with that object. If this were changed, then passing array() an iterator object would be handled differently than passing array any other object. This might possibly be a fatal flaw in this idea. I'd be happy to look into implementing this functionality if people think it is a good idea, and could give me some tips as to the best way to implement it. Zach From wbaxter at gmail.com Tue Apr 4 17:24:38 2006 From: wbaxter at gmail.com (Bill Baxter) Date: Tue Apr 4 17:24:38 2006 Subject: [Numpy-discussion] NumPy documentation In-Reply-To: <1A94CA82-2EB5-4145-9EA9-453DE60AE684@stanford.edu> References: <4432E27E.6030906@ee.byu.edu> <4432E973.8070601@noaa.gov> <4432F4DD.6060000@cox.net> <1A94CA82-2EB5-4145-9EA9-453DE60AE684@stanford.edu> Message-ID: First of all, it sounds like the individual who mailed Travis about being a "wolf in sheep's clothing" is suffering from the delusion that you can actually get rich by selling technical documentation at 40 bucks a pop. Travis does have a web page up somewhere explaining all his rationale -- I ran across it at some point. I remember when I saw it I was thinking "that's bizarre -- why on earth would you have to make a whole web page to justify selling something you yourself created?" I mean, like it or not, Travis wrote it so he can do whatever he wants with it. That's just common sense. Something some apparently lack. It reminds me of the story my father told me when I was like 8 years old about a man who shows up one day and gives a little boy a dollar bill. The boy is ecstatic, and thanks the man profusely. Then the next day the same thing, another dollar. The boy can't believe his luck. The whole week the guy comes, then it becomes a month, and then a year. Every day another dollar. Eventually it becomes such a routine that the boy doesn't even bother to thank the guy. Then one day the man doesn't show up. The little boy is furious. He was counting on that dollar, he already knew how he was going to spend every penny. The person who emailed Travis is just like that little boy, furious for not getting the dollar that wasn't his to begin with, rather than being thankful for the $365 he was given out of the blue for no particular reason. --bb From tim.hochberg at cox.net Tue Apr 4 17:41:15 2006 From: tim.hochberg at cox.net (Tim Hochberg) Date: Tue Apr 4 17:41:15 2006 Subject: [Numpy-discussion] array constructor from generators? In-Reply-To: References: Message-ID: <44331200.2020604@cox.net> Zachary Pincus wrote: > Hi folks, > > Sorry if this has already been discussed, but do you all think it a > good idea to extend the array constructor so that it can accept > generators instead of lists?
> > I often construct arrays from list comprehensions on generators, e.g. > to read a tab-delimited file in: > numpy.array([map(float, line.split()) for line in file]) > or making an array of pairs of numbers: > numpy.array([f for f in unique_combinations(input, 2)]) > > If the array constructor accepted generators (and turned them into > lists behind the scenes, or even evaluated them lazily while filling > in the memory buffer, not sure what would be more efficient), the > above could be written somewhat more cleanly: > numpy.array(map(float, line.split() for line in file) (using a > generator expression) > and > numpy.array(unique_combinations(input, 2)) > > the latter is especially a win. > > Moreover, it's becoming more standard for any python thing that can > accept a list to also accept a generator. > > The downside is that currently, passing array() an object makes a 0-d > object array with that object. If this were changed, then passing > array() an iterator object would be handled differently than passing > array any other object. This might possibly be a fatal flaw in this > idea. You pretty much can't count on anything when trying to implicitly create object arrays anyway. There's already buckets of special cases to make the other array types user friendly. In other words I don't think we should care. You do have to be careful to special case iterators after all the other special case machinery, so that lists and whatnot that are treated efficiently don't get slowed down. > > I'd be happy to look in to implementing this functionality if people > think it is a good idea, and could give me some tips as to the best > way to implement it. Hi Zach, I brought this up last week and Travis was OK with it. I have it on my todo list, but if you are in a hurry you're welcome to do it instead. If you do look at it, consider looking into the '__length_hint__ parameter that's slated to go into Python 2.5. When this is present, it's potentially a big win, since you can preallocate the array and fill it directly from the iterator. Without this, you probably can't do much better than just building a list from the array. What would work well would be to build a list, then steal its memory. I'm not sure if that's feasible without leaking a reference to the list though. Also, with iterators, specifying dtype will make a huge difference. If an object has __length_hint__ and you specify dtype, then you can preallocate the array as I suggested above. However, if dtype is not specified, you still need to build the list completely, determine what type it is, allocate the array memory and then copy the values into it. Much less efficient! Regards, -tim > > Zach > > > > > ------------------------------------------------------- > This SF.Net email is sponsored by xPML, a groundbreaking scripting > language > that extends applications into web and mobile media. Attend the live > webcast > and join the prime developer group breaking into this new coding > territory! > http://sel.as-us.falkag.net/sel?cmd=lnk&kid=110944&bid=241720&dat=121642 > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/numpy-discussion > > From robert.kern at gmail.com Tue Apr 4 17:50:05 2006 From: robert.kern at gmail.com (Robert Kern) Date: Tue Apr 4 17:50:05 2006 Subject: [Numpy-discussion] Re: array constructor from generators? 
In-Reply-To: References: Message-ID: Zachary Pincus wrote: > The downside is that currently, passing array() an object makes a 0-d > object array with that object. If this were changed, then passing > array() an iterator object would be handled differently than passing > array any other object. This might possibly be a fatal flaw in this idea. I don't think so. We can pass appropriate lists to array(), and it handles them fine. Iterator objects are just another kind of object that gets special treatment. The tricky bit is recognizing them. > I'd be happy to look in to implementing this functionality if people > think it is a good idea, and could give me some tips as to the best way > to implement it. I think a prerequisite for turning an arbitrary iterable into a numpy array is to iterate over it and store all of the objects in a temporary buffer that expands with a sensible strategy. I can't think of a better buffer object than regular Python lists. I think you can recognize when you have to use the temporary list strategy by seeing if the input has .__iter__() but not .__len__(). I'd have to refresh myself on the details of PyArray_New to be more sure, though. As Tim suggests, 2.5's __length_hint__ will also help. Another note of caution: You are going to have to deal with iterators of iterators of iterators of.... I'm not sure if that actually overly complicates matters; I haven't looked at PyArray_New for some time. Enjoy! -- Robert Kern robert.kern at gmail.com "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From ted.horst at earthlink.net Tue Apr 4 21:33:04 2006 From: ted.horst at earthlink.net (Ted Horst) Date: Tue Apr 4 21:33:04 2006 Subject: [Numpy-discussion] NumPy documentation In-Reply-To: <4432E27E.6030906@ee.byu.edu> References: <4432E27E.6030906@ee.byu.edu> Message-ID: <9E52488D-6736-4F10-A045-2D5B39CBD3F5@earthlink.net> I'll just add my voice to the people speaking up to support Travis's efforts. I buy lots of books, and most of the time I don't think too much about who I am supporting when I buy them, but I probably would have bought this book even if I didn't need that level of documentation just to help support what I see as very important work. I don't see how writing about an open source project and using the proceeds to further that project could be seen as anything other than a positive. I also just want to say how impressed I am with what Travis has accomplished with this project. From the organizational effort, patience, and persistence of bringing the various communities together to the quality and quantity of the ideas, code, and discussions, his contributions have been inspiring. Ted Horst From eric at enthought.com Tue Apr 4 21:59:10 2006 From: eric at enthought.com (eric jones) Date: Tue Apr 4 21:59:10 2006 Subject: [Numpy-discussion] NumPy documentation In-Reply-To: <4432E27E.6030906@ee.byu.edu> References: <4432E27E.6030906@ee.byu.edu> Message-ID: <44334E74.3000406@enthought.com> Travis Oliphant wrote: > > I received a rather hurtful email today that was very discouraging to > me personally. Basically, I was called "lame" and a "wolf" in sheep's > clothing because I'm charging for documentation. Hmmmm.... Chickens getting eaten by foxes. Farmer builds wire coop. Coop destroyed by foxes. More chickens eaten. Wolf builds wooden coop for free. Also stands guard but for a fee. No more chickens eaten. 
Most chickens gladly pay. A few grumble about extortion! That's fine. Let them take the guard. Foxes aren't so afraid of Chickens. This chicken will take his chances with this wolf. Turns out it's just a lame chicken in wolf's clothing. Smart chicken, he is. Dumb letter. Dumb story. Let's see here: you're a chicken. Check. Travis is smart wolf-chicken... yeah that works. Numpy is the wooden chicken coop. errr... Guard duty is documentation. hmmm... foxes, not sure... Guess I should keep my day job. Slightly more seriously... There's a chicken's foot full of people on the planet that could have done what Travis has pulled off -- I've actually thought about this a little. Maybe Jim Hugunin could have done it given similar time and motivation. After that, I come up a little short of candidates -- so maybe it's just a pig's foot full. I consider us lucky that one of the few people able to fuse Numeric/numarray bailed us out and did it. Documentation is another matter as far as scarcity of qualified authors. I would trust any number of yayhoos to create at least passable documentation for Travis' creation. Heck, David Ascher managed to write the Numeric documentation. That said, writing docs is work, hard to do well, and not nearly as much fun as writing actual code (for the people on this list anyway). That significantly lowers the probability of it getting done. In fact, I believe LLNL funded the first documentation effort to help ensure that it happened (though I'm not positive about that). And, think of the creek we'd be up if he chose to keep the library and give away the docs. I'm all for someone writing free documentation. It'd be great to have. And, if it were as good as Travis', I might even use it. Still, it would probably be better for the world if you spent your time on other things that don't already have a solution (like documenting SciPy...). Once that and all similar problems are solved, loop back around and do the NumPy docs. One other comment. I've used another amazing library called agg (www.antigrain.com) extensively for rendering in kiva/chaco. I view Maxim (the author of Agg) and graphics rendering in a similar light as Travis and Numpy -- there are only a handful of people that could have written agg. For that I am hugely grateful. On the downside, agg is very complex and has very little documentation. Still a number of people use it without complaint. Based on the evidence, if Maxim wrote documentation and charged for it, the number of complaints would actually increase. It is just silly. I would pay his price and sing his praises for the days of my life that he gave back to me. eric ps. # Based on a definitive monte carlo simulation, one of every hundred chickens will # complain. Don't believe me. Try it. dist = stats.uniform(0.0, 1.0) for chicken in chickens: if dist.rvs()[0] < 0.01: print "extortion" From pfdubois at gmail.com Tue Apr 4 22:01:02 2006 From: pfdubois at gmail.com (Paul Dubois) Date: Tue Apr 4 22:01:02 2006 Subject: [Numpy-discussion] NumPy documentation In-Reply-To: <9E52488D-6736-4F10-A045-2D5B39CBD3F5@earthlink.net> References: <4432E27E.6030906@ee.byu.edu> <9E52488D-6736-4F10-A045-2D5B39CBD3F5@earthlink.net> Message-ID: Amen. On 04 Apr 2006 21:33:12 -0700, Ted Horst wrote: > > > I'll just add my voice to the people speaking up to support Travis's > efforts.
I buy lots of books, and most of the time I don't think too > much about who I am supporting when I buy them, but I probably would > have bought this book even if I didn't need that level of > documentation just to help support what I see as very important > work. I don't see how writing about an open source project and using > the proceeds to further that project could be seen as anything other > than a positive. > > I also just want to say how impressed I am with what Travis has > accomplished with this project. From the organizational effort, > patience, and persistence of bringing the various communities > together to the quality and quantity of the ideas, code, and > discussions, his contributions have been inspiring. > > Ted Horst > > > > ------------------------------------------------------- > This SF.Net email is sponsored by xPML, a groundbreaking scripting > language > that extends applications into web and mobile media. Attend the live > webcast > and join the prime developer group breaking into this new coding > territory! > http://sel.as-us.falkag.net/sel?cmd=lnk&kid=110944&bid=241720&dat=121642 > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jdhunter at ace.bsd.uchicago.edu Tue Apr 4 22:54:01 2006 From: jdhunter at ace.bsd.uchicago.edu (John Hunter) Date: Tue Apr 4 22:54:01 2006 Subject: [Numpy-discussion] NumPy documentation In-Reply-To: <44334E74.3000406@enthought.com> (eric jones's message of "Tue, 04 Apr 2006 23:58:28 -0500") References: <4432E27E.6030906@ee.byu.edu> <44334E74.3000406@enthought.com> Message-ID: <873bgsa7vp.fsf@peds-pc311.bsd.uchicago.edu> >>>>> "eric" == eric jones writes: eric> Let see here, your a chicken. check. Travis is smart eric> wolf-chicken... yeah that works. Numpy is the wooden chicken eric> coop. errr... Guard duty is documentation. hmmm... foxes, eric> not sure... And I thought you didn't drink anything stronger than Dr Pepper :-) JDH From sransom at nrao.edu Wed Apr 5 00:04:03 2006 From: sransom at nrao.edu (Scott Ransom) Date: Wed Apr 5 00:04:03 2006 Subject: [Numpy-discussion] NumPy documentation In-Reply-To: <9E52488D-6736-4F10-A045-2D5B39CBD3F5@earthlink.net> References: <4432E27E.6030906@ee.byu.edu> <9E52488D-6736-4F10-A045-2D5B39CBD3F5@earthlink.net> Message-ID: <20060405070150.GB8682@ssh.cv.nrao.edu> As someone who has been actively using Numeric/Numarray/Numpy for about 7 years, now, I heartily agree. Thanks, Travis. Scott On Tue, Apr 04, 2006 at 11:32:42PM -0500, Ted Horst wrote: > > I'll just add my voice to the people speaking up to support Travis's > efforts. I buy lots of books, and most of the time I don't think too > much about who I am supporting when I buy them, but I probably would > have bought this book even if I didn't need that level of > documentation just to help support what I see as very important > work. I don't see how writing about an open source project and using > the proceeds to further that project could be seen as anything other > than a positive. > > I also just want to say how impressed I am with what Travis has > accomplished with this project. From the organizational effort, > patience, and persistence of bringing the various communities > together to the quality and quantity of the ideas, code, and > discussions, his contributions have been inspiring. 
> > Ted Horst > > > > ------------------------------------------------------- > This SF.Net email is sponsored by xPML, a groundbreaking scripting language > that extends applications into web and mobile media. Attend the live webcast > and join the prime developer group breaking into this new coding territory! > http://sel.as-us.falkag.net/sel?cmd=lnk&kid=110944&bid=241720&dat=121642 > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/numpy-discussion -- -- Scott M. Ransom Address: NRAO Phone: (434) 296-0320 520 Edgemont Rd. email: sransom at nrao.edu Charlottesville, VA 22903 USA GPG Fingerprint: 06A9 9553 78BE 16DB 407B FFCA 9BFA B6FF FFD3 2989 From charlesr.harris at gmail.com Wed Apr 5 00:27:02 2006 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed Apr 5 00:27:02 2006 Subject: [Numpy-discussion] NumPy documentation In-Reply-To: <4432E27E.6030906@ee.byu.edu> References: <4432E27E.6030906@ee.byu.edu> Message-ID: Travis, On 4/4/06, Travis Oliphant wrote: > > > I received a rather hurtful email today that was very discouraging to me > personally. Basically, I was called "lame" and a "wolf" in sheep's > clothing because I'm charging for documentation. Geez, what's with that. There are any number of "real" books out on python, I don't hear folks bitching. I think it's wonderful that we have such a good reference. I mean, look at numarray 8) I spent the money for your book and it didn't hurt a bit and was well worth the cost. Anyone who has tried to write extensive documentation on a big project knows how much work it takes, it isn't easy. Thanks for taking the time and sweat to do so. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From arnd.baecker at web.de Wed Apr 5 01:51:08 2006 From: arnd.baecker at web.de (Arnd Baecker) Date: Wed Apr 5 01:51:08 2006 Subject: [Numpy-discussion] Re: Vote: complex64 vs complex128 In-Reply-To: References: <20060404155003.AF09016220@sc8-sf-spam2.sourceforge.net> <200604041813.k34IDmAP011500@oobleck.astro.cornell.edu> <4432CE1F.3010209@cox.net> Message-ID: On Tue, 4 Apr 2006, Robert Kern wrote: > Tim Hochberg wrote: [...] > > >>> help(complex128) > > class complex128scalar(complexfloatingscalar, complex) > > | complex128: composed of two 64 bit floats > > | > > | Method resolution order: > > | complex128scalar > > | complexfloatingscalar > > | inexactscalar > > | numberscalar > > | genericscalar > > | complex > > | object > > ... I am puzzled why this does not show up with Ipython: In [1]:import numpy In [2]:numpy.complex128? Type: type Base Class: String Form: Namespace: Interactive Docstring: whereas In [3]:help(numpy.complex128) shows the above! So this might be more of an IPython question (I am running IPython 0.7.2.svn), but maybe numpy does some magic tricks to hide the docs from IPython (surely not on purpose ...)? It seems that numpy.complex128.__doc__ is None. Best, Arnd From meesters at uni-mainz.de Wed Apr 5 02:03:06 2006 From: meesters at uni-mainz.de (Christian Meesters) Date: Wed Apr 5 02:03:06 2006 Subject: [Numpy-discussion] NumPy documentation In-Reply-To: References: <4432E27E.6030906@ee.byu.edu> Message-ID: <200604051048.52766.meesters@uni-mainz.de> I'm glad Travis, that you got such supportive replies - but didn't expect anything else. Just let me give two more cents: a) I am a grateful user of Numpy/Scipy, too. 
b) I am among those who fully understand and support your decisions about selling the book. c) I didn't buy the book - yet. (Simply forgotten after a minor PayPal problem I had.) d) ad c): This will change soon. And e): Thank you for all your work put into Numpy/Scipy! Christian From amcmorl at gmail.com Wed Apr 5 02:30:01 2006 From: amcmorl at gmail.com (amcmorl) Date: Wed Apr 5 02:30:01 2006 Subject: [Numpy-discussion] Newbie indexing question and print order Message-ID: <44338DF4.7050603@gmail.com> Hi all, I'm having a bit of trouble getting my head around numpy's indexing capabilities. A quick summary of the problem is that I want to lookup/index in nD from a second array of rank n+1, such that the last (or first, I guess) dimension contains the lookup co-ordinates for the value to extract from the first array. Here's a 2D (3,3) example: In [12]:print ar [[ 0.15 0.75 0.2 ] [ 0.82 0.5 0.77] [ 0.21 0.91 0.59]] In [24]:print inds [[[1 1] [1 1] [2 1]] [[2 2] [0 0] [1 0]] [[1 1] [0 0] [2 1]]] then somehow return the array (barring me making any row/column errors): In [26]: c = ar.somefancyindexingroutinehere(inds) In [26]:print c [[ 0.5 0.5 0.91] [ 0.59 0.15 0.82] [ 0.5 0.15 0.91]] i.e. c[x,y] = a[ inds[x,y,0], inds[x,y,1] ] Any suggestions? It looks like it should be relatively simple using 'put' or 'take' or 'fetch' or 'sit' or something like that, but I'm not getting it. While I'm here, can someone help me understand the rationale behind 'print' printing row, column (i.e. a[0,1] = 0.75 in the above example) rather than x, y (= column, row, in which case 0.75 would be in the first column and second row), which seems to me to be more intuitive? I'm really enjoying getting into numpy - I can see it'll be simpler/faster coding than my previous environments, despite me not knowing my way at the moment, and that python has better opportunities for extensibility. So, many thanks for your great work. -- Angus McMorland email a.mcmorland at auckland.ac.nz mobile +64-21-155-4906 PhD Student, Neurophysiology / Multiphoton & Confocal Imaging Physiology, University of Auckland phone +64-9-3737-599 x89707 Armourer, Auckland University Fencing Secretary, Fencing North Inc. From faltet at carabos.com Wed Apr 5 02:56:06 2006 From: faltet at carabos.com (Francesc Altet) Date: Wed Apr 5 02:56:06 2006 Subject: [Numpy-discussion] NumPy documentation In-Reply-To: <4432E27E.6030906@ee.byu.edu> References: <4432E27E.6030906@ee.byu.edu> Message-ID: <1144230907.7563.14.camel@localhost.localdomain> Travis, First of all, I think you should be happy that you have received *only* one mail of this kind in the year and some months that you have been at the NumPy project. As somebody already noted: "take a large enough community, and you will always find a person (or several) that thinks that the wisest developer and the best professional is evil". We could discuss at length why this should happen, but the answer is easy: it's human nature. Let me also THANK YOU not only for your impressive dedication to the NumPy project but also for your openness to other ideas and for being the best advocate of the "I prefer to code, rather than talk" mantra. Let's do more of this and let others talk. I'm positive that 99% of the community is with you, and that's the only consideration that matters. Best, Francesc On Tue, 04 Apr 2006 at 15:17 -0600, Travis Oliphant wrote: > I received a rather hurtful email today that was very discouraging to me > personally. Basically, I was called "lame" and a "wolf" in sheep's > clothing because I'm charging for documentation.
Basically, I was called "lame" and a "wolf" in sheep's > clothing because I'm charging for documentation. Fortunately it's the > first email of that nature I've received. Others have disagreed with my > choice to charge for the documentation but at least they've not resorted > to personal attacks on me and my motivations. Please know that such > emails do have an impact. While I try to build a tough skin, such > unappreciative statements reduce my enthusiasm for working on NumPy > significantly. > > My purpose, however, is not to rant about the misguided words of one > person. He brought up a point that I want to clarify. He asked if I > "would sue" if somebody else wrote documentation for NumPy. I want to > be perfectly clear that this is a ridiculous statement that barely > deserves a response. Of course I wouldn't. First of all, it would be > extreme circumstances indeed for me to resort to that course of action > (basically a company would have to copy my book and start distributing > it on a large scale, belligerently). Second of all, I would love to see > *more* documentation for NumPy. > > If there are other (less vocal) people out there who are not using NumPy > because of my book, then I certainly feel sorry about that. Please dig > in and create the documentation you so urgently want to be free. I > will not stand in your way, but may even help. > > But please consider that time is money. Most people are better off > spending their time on something else and just cooperating with others > by paying for the book. But, I'm not going to dislike or have any kind > of ill feelings with anyone who decides to spend their time on > "documentation." In fact, I'll appreciate it just like everyone else. > I love the growth of the SciPy Wiki. There are some great recipes and > examples there. This is fantastic. I'm 100% behind this kind of work. > Rather than write some kind of "replacement" documentation, contribute > docstrings to the code and recipes to the Wiki. Then, those that can't > or won't buy the book will still have plenty of resources to use to > learn NumPy. > > I'm completely behind all forms of "free" information on NumPy / SciPy > and related tools. The only reason I have to charge for the > documentation is that I just don't have the resources to simply donate > *all* of my time. I want to thank all of you who have already > purchased the documentation. It has been extremely helpful to me > personally and professionally. Without you, my time to spend on NumPy > would have been significantly reduced. Thank you very much. > > Best wishes, > > -Travis > > > > > > ------------------------------------------------------- > This SF.Net email is sponsored by xPML, a groundbreaking scripting language > that extends applications into web and mobile media. Attend the live webcast > and join the prime developer group breaking into this new coding territory! > http://sel.as-us.falkag.net/sel?cmd=lnk&kid=110944&bid=241720&dat=121642 > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/numpy-discussion -- >0,0< Francesc Altet http://www.carabos.com/ V V C?rabos Coop. V. 
Enjoy Data "-" From pau.gargallo at gmail.com Wed Apr 5 03:10:01 2006 From: pau.gargallo at gmail.com (Pau Gargallo) Date: Wed Apr 5 03:10:01 2006 Subject: [Numpy-discussion] Newbie indexing question and print order In-Reply-To: <44338DF4.7050603@gmail.com> References: <44338DF4.7050603@gmail.com> Message-ID: <6ef8f3380604050309t1ed4c79bv395ed1a9fb45ce9d@mail.gmail.com> hi, i had the same problem and i defined a function with a similar syntax to interp2 which i call take2 to solve it: from numpy import * def take2( a, x,y ): return take( ravel(a), x*a.shape[1] + y ) (a[i,j] sits at flat index i*a.shape[1] + j in C order, so this matches your c[x,y] = a[ inds[x,y,0], inds[x,y,1] ]; with a recent numpy, the fancy-indexing form a[x,y] with integer arrays x and y should give the same result directly) a = array( [[ 0.15, 0.75, 0.2 ], [ 0.82, 0.5, 0.77], [ 0.21, 0.91, 0.59]] ) xy = array([ [[1, 1], [1, 1], [2, 1]], [[2, 2], [0, 0], [1, 0]], [[1, 1], [0, 0], [2, 1]]] ) print take2( a, xy[...,0], xy[...,1] ) i hope this helps you. pau On 4/5/06, amcmorl wrote: > Hi all, > > I'm having a bit of trouble getting my head around numpy's indexing > capabilities. A quick summary of the problem is that I want to > lookup/index in nD from a second array of rank n+1, such that the last > (or first, I guess) dimension contains the lookup co-ordinates for the > value to extract from the first array. Here's a 2D (3,3) example: > > In [12]:print ar > [[ 0.15 0.75 0.2 ] > [ 0.82 0.5 0.77] > [ 0.21 0.91 0.59]] > > In [24]:print inds > [[[1 1] > [1 1] > [2 1]] > > [[2 2] > [0 0] > [1 0]] > > [[1 1] > [0 0] > [2 1]]] > > then somehow return the array (barring me making any row/column errors): > In [26]: c = ar.somefancyindexingroutinehere(inds) > > In [26]:print c > [[ 0.5 0.5 0.91] > [ 0.59 0.15 0.82] > [ 0.5 0.15 0.91]] > > i.e. c[x,y] = a[ inds[x,y,0], inds[x,y,1] ] > > Any suggestions? It looks like it should be relatively simple using > 'put' or 'take' or 'fetch' or 'sit' or something like that, but I'm not > getting it. > > While I'm here, can someone help me understand the rationale behind > 'print' printing row, column (i.e. a[0,1] = 0.75 in the above example) > rather than x, y (= column, row, in which case 0.75 would be in the first > column and second row), which seems to me to be more intuitive? > > I'm really enjoying getting into numpy - I can see it'll be > simpler/faster coding than my previous environments, despite me not > knowing my way at the moment, and that python has better opportunities > for extensibility. So, many thanks for your great work. > -- > Angus McMorland > email a.mcmorland at auckland.ac.nz > mobile +64-21-155-4906 > > PhD Student, Neurophysiology / Multiphoton & Confocal Imaging > Physiology, University of Auckland > phone +64-9-3737-599 x89707 > > Armourer, Auckland University Fencing > Secretary, Fencing North Inc. > > > ------------------------------------------------------- > This SF.Net email is sponsored by xPML, a groundbreaking scripting language > that extends applications into web and mobile media. Attend the live webcast > and join the prime developer group breaking into this new coding territory!
> http://sel.as-us.falkag.net/sel?cmd=lnk&kid=110944&bid=241720&dat=121642 > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/numpy-discussion > From tim.hochberg at cox.net Wed Apr 5 05:30:14 2006 From: tim.hochberg at cox.net (Tim Hochberg) Date: Wed Apr 5 05:30:14 2006 Subject: [Numpy-discussion] Re: Vote: complex64 vs complex128 In-Reply-To: References: <20060404155003.AF09016220@sc8-sf-spam2.sourceforge.net> <200604041813.k34IDmAP011500@oobleck.astro.cornell.edu> <4432CE1F.3010209@cox.net> Message-ID: <4433B816.1080307@cox.net> Arnd Baecker wrote: >On Tue, 4 Apr 2006, Robert Kern wrote: > > > >>Tim Hochberg wrote: >> >> > >[...] > > > >>> >>> help(complex128) >>> class complex128scalar(complexfloatingscalar, complex) >>> | complex128: composed of two 64 bit floats >>> | >>> | Method resolution order: >>> | complex128scalar >>> | complexfloatingscalar >>> | inexactscalar >>> | numberscalar >>> | genericscalar >>> | complex >>> | object >>> ... >>> >>> > >I am puzzled why this does not show up with Ipython: > >In [1]:import numpy >In [2]:numpy.complex128? >Type: type >Base Class: >String Form: >Namespace: Interactive >Docstring: > > >whereas > >In [3]:help(numpy.complex128) > >shows the above! >So this might be more of an IPython question (I am running IPython >0.7.2.svn), but maybe numpy does some magic tricks to hide the docs from >IPython (surely not on purpose ...)? >It seems that numpy.complex128.__doc__ is None > That's right, none of the scalar types have docstrings at present. The builtin help (AKA pydoc.help) tracks back through all the base classes and presents all kinds of extra information. The result tends to be awfully verbose; so much so that I just stuffed a function called hint into __builtins___ that just prints the results of pydoc.describe and pydoc.getdoc. It's quite possible that such a function already exists, maybe even in pydoc, but oddly enough the docs for pydoc are pretty impenatrable. Here I've added basic docstrings to the complex types. I was hoping someone would have some ideas for other stuff that should go into the docstrings, but perhaps I'll just commit that change as is. Here's what I see here using hint: >>> hint(numpy.float64) # Still no docstring class float64scalar >>> hint(numpy.complex64) # Now has a terse docstring class complex64scalar | Composed of two 32 bit floats >>> hint(numpy.complex128) # Same here. class complex128scalar | Composed of two 64 bit floats Regards, -tim From arnd.baecker at web.de Wed Apr 5 05:48:02 2006 From: arnd.baecker at web.de (Arnd Baecker) Date: Wed Apr 5 05:48:02 2006 Subject: [Numpy-discussion] Speed up function on cross product of two sets? In-Reply-To: <44315633.4010600@cox.net> References: <44315633.4010600@cox.net> Message-ID: On Mon, 3 Apr 2006, Tim Hochberg wrote: > Arnd Baecker wrote: > > [SNIP] > > >((Note that I just learned in some other thread that with numpy there is > >an alternative to NewAxis, but I haven't figured out which that is ...)) > > > > > If you're old school you could just use None. Well, I have been using python/Numeric/... for a while, but I am definitively not old school - I was not aware that NewAxis is a longer spelling of None ;-) > But you probably mean 'newaxis'. yes - perfect! Many thanks. BTW, it seems that we have no Numeric to numpy transition remarks in www.scipy.org. 
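(For concreteness, a few of the typical renames such a transition page would presumably collect -- a sketch drawn from changes mentioned on this list, not an exhaustive conversion guide:

    import numpy

    a = numpy.zeros((3, 3), float)   # Numeric: zeros((3, 3), Float)
    b = a[numpy.newaxis, :]          # Numeric: a[NewAxis, :]
    c = numpy.arange(10)             # Numeric: arrayrange(10)
    print a.dtype.char               # Numeric: a.typecode()

The convertcode.py script mentioned below automates much of this kind of mechanical substitution.)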
I only found http://www.scipy.org/PearuPeterson/NumpyVersusNumeric and of course Travis' "Guide to NumPy" contains a detailed list of necessary changes in chapter 2.6.1. In addition ``site-packages/numpy/lib/convertcode.py`` provides an automatic conversion. Would it be helpful to start a new wiki page "ConvertingFromNumeric" (similar to http://www.scipy.org/Converting_from_numarray) which aims at summarizing the necessary changes or expand Pearu's page (if he agrees) on this? Best, Arnd From arnd.baecker at web.de Wed Apr 5 05:57:16 2006 From: arnd.baecker at web.de (Arnd Baecker) Date: Wed Apr 5 05:57:16 2006 Subject: [Numpy-discussion] Re: Vote: complex64 vs complex128 In-Reply-To: <4433B816.1080307@cox.net> References: <20060404155003.AF09016220@sc8-sf-spam2.sourceforge.net> <200604041813.k34IDmAP011500@oobleck.astro.cornell.edu> <4432CE1F.3010209@cox.net> <4433B816.1080307@cox.net> Message-ID: Hi, On Wed, 5 Apr 2006, Tim Hochberg wrote: [...] > That's right, none of the scalar types have docstrings at present. The > builtin help (AKA pydoc.help) tracks back through all the base classes > and presents all kinds of extra information. I see - so that might be something Ipython could do as well (if that's really what we would like to see...) > The result tends to be > awfully verbose; so much so that I just stuffed a function called hint > into __builtins___ that just prints the results of pydoc.describe and > pydoc.getdoc. It's quite possible that such a function already exists, > maybe even in pydoc, but oddly enough the docs for pydoc are pretty > impenatrable. > > Here I've added basic docstrings to the complex types. I was hoping > someone would have some ideas for other stuff that should go into the > docstrings, but perhaps I'll just commit that change as is. Here's what > I see here using hint: > > >>> hint(numpy.float64) # Still no docstring > class float64scalar > >>> hint(numpy.complex64) # Now has a terse docstring > class complex64scalar > | Composed of two 32 bit floats > >>> hint(numpy.complex128) # Same here. > class complex128scalar > | Composed of two 64 bit floats That looks much better. I am a bit unsure about `hint` though for the following reasons: There are quite a few ways to access documentation: - help(defined_object) - help("numpy.complex128") - scipy.info(defined_object) - hint(defined_object) - defined_object? # with IPython (and then of course the pydoc commands as well ...). Clearly, I would prefer to have "?" in IPython as the only thing one needs to know about accessing documentation. There are surely many aspects to consider here, but I have to rush now ... Best, Arnd From tim.hochberg at cox.net Wed Apr 5 06:24:11 2006 From: tim.hochberg at cox.net (Tim Hochberg) Date: Wed Apr 5 06:24:11 2006 Subject: [Numpy-discussion] Re: Vote: complex64 vs complex128 In-Reply-To: References: <20060404155003.AF09016220@sc8-sf-spam2.sourceforge.net> <200604041813.k34IDmAP011500@oobleck.astro.cornell.edu> <4432CE1F.3010209@cox.net> <4433B816.1080307@cox.net> Message-ID: <4433C4CC.7010003@cox.net> Arnd Baecker wrote: >Hi, > >On Wed, 5 Apr 2006, Tim Hochberg wrote: > >[...] > > > >>That's right, none of the scalar types have docstrings at present. The >>builtin help (AKA pydoc.help) tracks back through all the base classes >>and presents all kinds of extra information. >> >> > >I see - so that might be something Ipython could do as well >(if that's really what we would like to see...) 
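(Tim's hint helper itself never appears in the thread. A minimal sketch of something equivalent, assuming only what he says above -- that it prints the results of pydoc.describe and pydoc.getdoc:

    import pydoc

    def hint(obj):
        # one line describing the object, plus its own docstring,
        # skipping the base-class dump that help() produces
        print pydoc.describe(obj)
        doc = pydoc.getdoc(obj)
        if doc:
            print doc

Dropping something like this into sitecustomize.py, as Tim describes later, makes it available everywhere.)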
> > > >>The result tends to be >>awfully verbose; so much so that I just stuffed a function called hint >>into __builtins___ that just prints the results of pydoc.describe and >>pydoc.getdoc. It's quite possible that such a function already exists, >>maybe even in pydoc, but oddly enough the docs for pydoc are pretty >>impenatrable. >> >>Here I've added basic docstrings to the complex types. I was hoping >>someone would have some ideas for other stuff that should go into the >>docstrings, but perhaps I'll just commit that change as is. Here's what >>I see here using hint: >> >> >>> hint(numpy.float64) # Still no docstring >>class float64scalar >> >>> hint(numpy.complex64) # Now has a terse docstring >>class complex64scalar >> | Composed of two 32 bit floats >> >>> hint(numpy.complex128) # Same here. >>class complex128scalar >> | Composed of two 64 bit floats >> >> > >That looks much better. >I am a bit unsure about `hint` though for the following reasons: >There are quite a few ways to access documentation: > - help(defined_object) > - help("numpy.complex128") > - scipy.info(defined_object) > - hint(defined_object) > - defined_object? # with IPython >(and then of course the pydoc commands as well ...). > > Sorry, I was unclear. Hint is only for my enjoyment -- it's not related to numpy. I just tossed it into my sitecustomize file. I was just get sick of doing help(complex64) and getting pages of text when all I cared about was the docstring. I suppose I could just have done "print complex64.__doc__", but I felt like hint might be useful. However, it's not something I was proposing to add to numpy, the changes I was talking about are strictly in the docstrings of complexXXX. -tim >Clearly, I would prefer to have "?" in IPython as the only thing one needs >to know about accessing documentation. > >There are surely many aspects to consider here, but I have to rush now ... > >Best, Arnd > > > > > > From emsellem at obs.univ-lyon1.fr Wed Apr 5 06:33:23 2006 From: emsellem at obs.univ-lyon1.fr (Eric Emsellem) Date: Wed Apr 5 06:33:23 2006 Subject: [Numpy-discussion] A random.normal function with stdev as array Message-ID: <4433C6D6.5080800@obs.univ-lyon1.fr> Hi, I am trying to optimize a code where I derive random numbers many times and having an array of values for the stdev parameter. I wish to have an efficient way of doing something like: ################## stdev = array([1.1,1.2,1.0,2.2]) result = numpy.zeros(stdev.shape, Float) for i in range(len(stdev)) : result[i] = numpy.random.normal(0, stdev[i]) ################## In my case, stdev can in fact be an array of a few millions floats... so I really need to optimize things. Any hint on how to code this efficiently ? And in general, where could I find tips for optimizing a code where I unfortunately have too many loops such as "for i in range(Nbody) : " with Nbody being > 10^6 ? thanks! Eric From dd55 at cornell.edu Wed Apr 5 06:34:00 2006 From: dd55 at cornell.edu (Darren Dale) Date: Wed Apr 5 06:34:00 2006 Subject: [Numpy-discussion] NumPy documentation In-Reply-To: <9E52488D-6736-4F10-A045-2D5B39CBD3F5@earthlink.net> References: <4432E27E.6030906@ee.byu.edu> <9E52488D-6736-4F10-A045-2D5B39CBD3F5@earthlink.net> Message-ID: <200604050932.56744.dd55@cornell.edu> On Wednesday 05 April 2006 00:32, Ted Horst wrote: > I'll just add my voice to the people speaking up to support Travis's > efforts. 
I buy lots of books, and most of the time I don't think too > much about who I am supporting when I buy them, but I probably would > have bought this book even if I didn't need that level of > documentation just to help support what I see as very important > work. I don't see how writing about an open source project and using > the proceeds to further that project could be seen as anything other > than a positive. > > I also just want to say how impressed I am with what Travis has > accomplished with this project. From the organizational effort, > patience, and persistence of bringing the various communities > together to the quality and quantity of the ideas, code, and > discussions, his contributions have been inspiring. I agree. I support of what Travis has done. From pearu at scipy.org Wed Apr 5 07:18:02 2006 From: pearu at scipy.org (Pearu Peterson) Date: Wed Apr 5 07:18:02 2006 Subject: [Numpy-discussion] Speed up function on cross product of two sets? In-Reply-To: References: <44315633.4010600@cox.net> Message-ID: On Wed, 5 Apr 2006, Arnd Baecker wrote: > BTW, it seems that we have no Numeric to numpy transition remarks in > www.scipy.org. I only found > http://www.scipy.org/PearuPeterson/NumpyVersusNumeric > and of course Travis' "Guide to NumPy" contains a detailed list of > necessary changes in chapter 2.6.1. > In addition ``site-packages/numpy/lib/convertcode.py`` provides an > automatic conversion. > > Would it be helpful to start a new wiki page "ConvertingFromNumeric" > (similar to http://www.scipy.org/Converting_from_numarray) > which aims at summarizing the necessary changes > or expand Pearu's page (if he agrees) on this? It's better to start a new wiki page similar to Converting_from_numarray (I like the table). Btw, I have few notes about the necessary changes for Numeric->numpy transition in the following page: http://svn.enthought.com/enthought/wiki/NumpyPort#NotesonchangesduetoreplacingNumeric/scipy_basewithnumpy Feel free to grab these notes. Pearu From zpincus at stanford.edu Wed Apr 5 08:04:33 2006 From: zpincus at stanford.edu (Zachary Pincus) Date: Wed Apr 5 08:04:33 2006 Subject: [Numpy-discussion] array constructor from generators? In-Reply-To: <44331200.2020604@cox.net> References: <44331200.2020604@cox.net> Message-ID: tim> > I brought this up last week and Travis was OK with it. I have it on > my todo list, but if you are in a hurry you're welcome to do it > instead. Sorry if that was on the list and I missed it! Hate to be adding more noise than signal. At any rate, I'm not in a hurry, but I'd be happy to help where I can. (Though for the next week or so I think I'm swamped...) tim> > If you do look at it, consider looking into the '__length_hint__ > parameter that's slated to go into Python 2.5. When this is > present, it's potentially a big win, since you can preallocate the > array and fill it directly from the iterator. Without this, you > probably can't do much better than just building a list from the > array. What would work well would be to build a list, then steal > its memory. I'm not sure if that's feasible without leaking a > reference to the list though. Can you steal its memory and then give it some dummy memory that it can free without problems, so that the list can be deallocated without trouble? Does anyone know if you can just give the list a NULL pointer for it's memory and then immediately decref it? free (NULL) should always be safe, I think. (??) > Also, with iterators, specifying dtype will make a huge difference. 
> If an object has __length_hint__ and you specify dtype, then you > can preallocate the array as I suggested above. However, if dtype > is not specified, you still need to build the list completely, > determine what type it is, allocate the array memory and then copy > the values into it. Much less efficient! How accurate is __length_hint__ going to be? It could lead to a fair bit of special case code for growing and shrinking the final array if __length_hint__ turns out to be wrong. Code that python lists already have, moreover. If the list's memory can be stolen safely, how does this strategy sound: - Given a generator, build it up into a list internally, and then steal the list's memory. - If a dtype is provided, wrap the generator with another generator that casts the original generator's output to the correct dtype. Then use the wrapped generator to create a list of the proper dtype, and steal that list's memory. A potential problem with stealing list memory is that it could waste memory if the list has more bytes allocated than it is using (I'm not sure if python lists can get this way, but I presume that they resize themselves only every so often, like C++ or Java vectors, so most of the time they have some allocated but unused bytes). If lists have a squeeze method that's guaranteed not to cause any copies, or if this can be added with judicious use of realloc, then that problem is obviated. robert> > Another note of caution: You are going to have to deal with > iterators of > iterators of iterators of.... I'm not sure if that actually overly > complicates > matters; I haven't looked at PyArray_New for some time. Enjoy! This is a good point. Numpy does fine with nested lists, but what should it do with nested generators? I originally thought that basically 'array(generator)' should make the exact same thing as 'array([f for f in generator])'. However, for nested generators, this would be an object array of generators. I'm not sure which is better -- having more special cases for generators that make generators, or having a simple rubric like above for how generators are treated. Any thoughts? Zach From perry at stsci.edu Wed Apr 5 08:08:19 2006 From: perry at stsci.edu (Perry Greenfield) Date: Wed Apr 5 08:08:19 2006 Subject: [Numpy-discussion] NumPy documentation In-Reply-To: References: <4432E27E.6030906@ee.byu.edu> <9E52488D-6736-4F10-A045-2D5B39CBD3F5@earthlink.net> Message-ID: Speaking as someone who thinks he knows what kind of effort is involved in creating numpy, I suspect relatively few have any idea of the effort and skill that is required to do what Travis has done. Indeed, I wouldn't be surprised if Travis hadn't fully anticipated at the start what he was getting himself into, and if he hasn't asked himself more than once whether he would do it again had he known [I imagine that many worthy and memorable efforts fall into this category. Much human progress springs out of such initial optimism.] John Hunter is right that Travis's contributions to this and other scipy-related projects amount to years of work. For those that find it objectionable that Travis is trying to get some partial compensation for this work, consider whether there was any one at all in the Python community willing to do this as well as he as for free, or even for what he will actually recover from the book. I doubt it very much. Fortunately, I think the number of people that object to Travis charging for the book is small. Unfortunately, their impact can be disproportionately large. 
I hope Travis can effectively ignore them. Perry From lennart.ohlsson at cs.lth.se Wed Apr 5 08:12:20 2006 From: lennart.ohlsson at cs.lth.se (Lennart Ohlsson) Date: Wed Apr 5 08:12:20 2006 Subject: [Numpy-discussion] Re: Newbie indexing question and print order Message-ID: <008201c658c3$30d06ab0$2f32eb82@cs060109> Hi, Although I mainly use it for 2D takes, here is an nd-version of such a function: def vtake(a, indices): """Corresponding to take in numpy but with vector valued indices""" indexrank = indices.shape[-1] flattedindex = 0 for i in range(indexrank): flattedindex = flattedindex*a.shape[i] + indices[...,i] flattedshape = (-1,) + a.shape[indexrank:] return a.reshape(flattedshape).take(flattedindex) - Lennart On 4/5/06, Pau Gargallo wrote: hi, i had the same problem and i defined a function with a similar syntax to interp2 which i call take2 to solve it: from numpy import * def take2( a, x,y ): return take( ravel(a), x*a.shape[1] + y ) a = array( [[ 0.15, 0.75, 0.2 ], [ 0.82, 0.5, 0.77], [ 0.21, 0.91, 0.59]] ) xy = array([ [[1, 1], [1, 1], [2, 1]], [[2, 2], [0, 0], [1, 0]], [[1, 1], [0, 0], [2, 1]]] ) print take2( a, xy[...,0], xy[...,1] ) i hope this helps you. pau On 4/5/06, amcmorl wrote: > Hi all, > > I'm having a bit of trouble getting my head around numpy's indexing > capabilities. A quick summary of the problem is that I want to > lookup/index in nD from a second array of rank n+1, such that the last > (or first, I guess) dimension contains the lookup co-ordinates for the > value to extract from the first array. Here's a 2D (3,3) example: > > In [12]:print ar > [[ 0.15 0.75 0.2 ] > [ 0.82 0.5 0.77] > [ 0.21 0.91 0.59]] > > In [24]:print inds > [[[1 1] > [1 1] > [2 1]] > > [[2 2] > [0 0] > [1 0]] > > [[1 1] > [0 0] > [2 1]]] > > then somehow return the array (barring me making any row/column errors): > In [26]: c = ar.somefancyindexingroutinehere(inds) > > In [26]:print c > [[ 0.5 0.5 0.91] > [ 0.59 0.15 0.82] > [ 0.5 0.15 0.91]] > > i.e. c[x,y] = a[ inds[x,y,0], inds[x,y,1] ] > > Any suggestions? It looks like it should be relatively simple using > 'put' or 'take' or 'fetch' or 'sit' or something like that, but I'm not > getting it. > > While I'm here, can someone help me understand the rationale behind > 'print' printing row, column (i.e. a[0,1] = 0.75 in the above example) > rather than x, y (= column, row; in which case 0.75 would be in the first > column and second row), which seems to me to be more intuitive. > > I'm really enjoying getting into numpy - I can see it'll be > simpler/faster coding than my previous environments, despite me not > knowing my way at the moment, and that python has better opportunities > for extensibility. So, many thanks for your great work. > -- > Angus McMorland > email a.mcmorland at auckland.ac.nz > mobile +64-21-155-4906 > > PhD Student, Neurophysiology / Multiphoton & Confocal Imaging > Physiology, University of Auckland > phone +64-9-3737-599 x89707 > > Armourer, Auckland University Fencing > Secretary, Fencing North Inc. > > > ------------------------------------------------------- > This SF.Net email is sponsored by xPML, a groundbreaking scripting language > that extends applications into web and mobile media. Attend the live webcast > and join the prime developer group breaking into this new coding territory!
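(For what it's worth, vtake as defined above does return exactly the array Angus asked for. A usage sketch with the 3x3 example from the thread, assuming take() with an array of indices returns an array shaped like those indices:

    from numpy import array
    # vtake as in Lennart's message above

    a = array([[ 0.15, 0.75, 0.2 ],
               [ 0.82, 0.5,  0.77],
               [ 0.21, 0.91, 0.59]])
    inds = array([[[1, 1], [1, 1], [2, 1]],
                  [[2, 2], [0, 0], [1, 0]],
                  [[1, 1], [0, 0], [2, 1]]])

    c = vtake(a, inds)
    # c[x,y] == a[inds[x,y,0], inds[x,y,1]]:
    # [[ 0.5   0.5   0.91]
    #  [ 0.59  0.15  0.82]
    #  [ 0.5   0.15  0.91]]

The values match the ones Angus listed.)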
> http://sel.as-us.falkag.net/sel?cmd=lnk&kid=110944&bid=241720&dat=121642 > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/numpy-discussion > From a.h.jaffe at gmail.com Wed Apr 5 08:18:03 2006 From: a.h.jaffe at gmail.com (Andrew Jaffe) Date: Wed Apr 5 08:18:03 2006 Subject: [Numpy-discussion] weird interaction: pickle, numpy, matplotlib.hist Message-ID: <4433DF85.7030109@gmail.com> Hi All, I've encountered a strange problem: I've been running some python code on both a linux box and OS X, both with python 2.4.1 and the latest numpy and matplotlib from svn. I have found that when I transfer pickled numpy arrays from one machine to the other (in either direction), the resulting data *looks* all right (i.e., it is a numpy array of the correct type with the correct values at the correct indices), but it seems to produce the wrong result in (at least) one circumstance: matplotlib.hist() gives the completely wrong picture (and set of bins). This can be ameliorated by running the array through arr=numpy.asarray(arr, dtype=numpy.float64) but this seems like a complete kludge (and is only needed when you do the transfer between machines). I've attached a minimal code that exhibits the problem: try test_pickle_hist.test(write=True) on one machine, transfer the output file to another machine, and run test_pickle_hist.test(write=False) on another, and you should see a very strange result (and it should be fixed if you set asarray=True). Any ideas? Andrew -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: test_pickle_hist.py URL: From ryanlists at gmail.com Wed Apr 5 08:23:06 2006 From: ryanlists at gmail.com (Ryan Krauss) Date: Wed Apr 5 08:23:06 2006 Subject: [Numpy-discussion] NumPy documentation In-Reply-To: References: <4432E27E.6030906@ee.byu.edu> Message-ID: I just realized that my "Amen" to all of this went only to Alan Isaac. I don't "reply-to-all" by default. In response to Perry's comment: "I hope Travis can effectively ignore them." I think a spam filter with "wolf" and "sheep" might be a good start, but it could accidentally delete some interesting "poetry" . Ryan On 4/4/06, Ryan Krauss wrote: > Let me add my thanks and also say that as a grad student who plans to > buy your book once I graduate, NumPy's use is not inhibited by Travis > charging for the documentation. > > Thanks! > > Ryan Krauss > > On 4/4/06, Alan G Isaac wrote: > > On Tue, 04 Apr 2006, Travis Oliphant apparently wrote: > > > I'm not going to dislike or have any kind of ill feelings > > > with anyone who decides to spend their time on > > > "documentation." In fact, I'll appreciate it just like > > > everyone else. > > > > Of course you were extremely clear about this from the > > beginning. Thank you for numpy!!! > > Alan Isaac (grateful user of numpy) > > PS Your book is *very* helpful. > > > > > > > > > > > > ------------------------------------------------------- > > This SF.Net email is sponsored by xPML, a groundbreaking scripting language > > that extends applications into web and mobile media. Attend the live webcast > > and join the prime developer group breaking into this new coding territory! 
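(Andrew's symptom above is diagnosed further down the thread as a byte-order mismatch: a pickled array keeps the dtype of the machine that wrote it. A sketch of a one-shot normalization after unpickling -- the dtype.byteorder and dtype.newbyteorder calls here are an assumption about the dtype interface, not code taken from the thread:

    import numpy

    def to_native(arr):
        # '=' means native byte order; '|' means byte order does not apply
        if arr.dtype.byteorder not in ('=', '|'):
            # byteswap the data and relabel the dtype in one step
            return arr.astype(arr.dtype.newbyteorder('='))
        return arr

This converts once at load time instead of kludging with asarray at every use.)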
> > http://sel.as-us.falkag.net/sel?cmd=lnk&kid=110944&bid=241720&dat=121642 > > _______________________________________________ > > Numpy-discussion mailing list > > Numpy-discussion at lists.sourceforge.net > > https://lists.sourceforge.net/lists/listinfo/numpy-discussion > > > From zpincus at stanford.edu Wed Apr 5 08:32:02 2006 From: zpincus at stanford.edu (Zachary Pincus) Date: Wed Apr 5 08:32:02 2006 Subject: [Numpy-discussion] array constructor from generators? In-Reply-To: <44331200.2020604@cox.net> References: <44331200.2020604@cox.net> Message-ID: <884F03C6-599C-426A-A0A0-97009B63EACB@stanford.edu> [sorry if this comes through twice -- seems to have not sent the first time] Hi folks, tim> > I brought this up last week and Travis was OK with it. I have it on > my todo list, but if you are in a hurry you're welcome to do it > instead. Sorry if that was on the list and I missed it! Hate to be adding more noise than signal. At any rate, I'm not in a hurry, but I'd be happy to help where I can. (Though for the next week or so I think I'm swamped...) tim> > If you do look at it, consider looking into the '__length_hint__ > parameter that's slated to go into Python 2.5. When this is > present, it's potentially a big win, since you can preallocate the > array and fill it directly from the iterator. Without this, you > probably can't do much better than just building a list from the > array. What would work well would be to build a list, then steal > its memory. I'm not sure if that's feasible without leaking a > reference to the list though. Can you steal its memory and then give it some dummy memory that it can free without problems, so that the list can be deallocated without trouble? Does anyone know if you can just give the list a NULL pointer for it's memory and then immediately decref it? free (NULL) should always be safe, I think. (??) > Also, with iterators, specifying dtype will make a huge difference. > If an object has __length_hint__ and you specify dtype, then you > can preallocate the array as I suggested above. However, if dtype > is not specified, you still need to build the list completely, > determine what type it is, allocate the array memory and then copy > the values into it. Much less efficient! How accurate is __length_hint__ going to be? It could lead to a fair bit of special case code for growing and shrinking the final array if __length_hint__ turns out to be wrong. Code that python lists already have, moreover. If the list's memory can be stolen safely, how does this strategy sound: - Given a generator, build it up into a list internally, and then steal the list's memory. - If a dtype is provided, wrap the generator with another generator that casts the original generator's output to the correct dtype. Then use the wrapped generator to create a list of the proper dtype, and steal that list's memory. A potential problem with stealing list memory is that it could waste memory if the list has more bytes allocated than it is using (I'm not sure if python lists can get this way, but I presume that they resize themselves only every so often, like C++ or Java vectors, so most of the time they have some allocated but unused bytes). If lists have a squeeze method that's guaranteed not to cause any copies, or if this can be added with judicious use of realloc, then that problem is obviated. robert> > Another note of caution: You are going to have to deal with > iterators of > iterators of iterators of.... 
I'm not sure if that actually overly > complicates > matters; I haven't looked at PyArray_New for some time. Enjoy! This is a good point. Numpy does fine with nested lists, but what should it do with nested generators? I originally thought that basically 'array(generator)' should make the exact same thing as 'array([f for f in generator])'. However, for nested generators, this would be an object array of generators. I'm not sure which is better -- having more special cases for generators that make generators, or having a simple rubric like above for how generators are treated. Any thoughts? Zach From robert.kern at gmail.com Wed Apr 5 08:36:03 2006 From: robert.kern at gmail.com (Robert Kern) Date: Wed Apr 5 08:36:03 2006 Subject: [Numpy-discussion] Re: A random.normal function with stdev as array In-Reply-To: <4433C6D6.5080800@obs.univ-lyon1.fr> References: <4433C6D6.5080800@obs.univ-lyon1.fr> Message-ID: Eric Emsellem wrote: > Hi, > > I am trying to optimize a code where I derive random numbers many times > and having an array of values for the stdev parameter. > > I wish to have an efficient way of doing something like: > ################## > stdev = array([1.1,1.2,1.0,2.2]) > result = numpy.zeros(stdev.shape, Float) > for i in range(len(stdev)) : > result[i] = numpy.random.normal(0, stdev[i]) > ################## You can use the fact that the standard deviation of a normal distribution is a scale parameter. You can get random normal deviates of varying standard deviation by multiplying a standard normal deviate by the desired standard deviation (how's that for confusing terminology, eh?). result = numpy.random.standard_normal(stdev.shape) * stdev > In my case, stdev can in fact be an array of a few millions floats... > so I really need to optimize things. > > Any hint on how to code this efficiently ? > > And in general, where could I find tips for optimizing a code where I > unfortunately have too many loops such as "for i in range(Nbody) : " > with Nbody being > 10^6 ? Tim Hochberg recently made this list: """ 0. Think about your algorithm. 1. Vectorize your inner loop. 2. Eliminate temporaries 3. Ask for help 4. Recode in C. 5. Accept that your code will never be fast. Step zero should probably be repeated after every other step ;) """ That's probably the best general advice. To get better advice, we would need to know the specifics of the problem. -- Robert Kern robert.kern at gmail.com "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From a.h.jaffe at gmail.com Wed Apr 5 08:48:27 2006 From: a.h.jaffe at gmail.com (Andrew Jaffe) Date: Wed Apr 5 08:48:27 2006 Subject: [Numpy-discussion] Re: weird interaction: pickle, numpy, matplotlib.hist [sort() method problem?] Message-ID: OK, I think I've managed to track the problem down a bit further: the sort() method is failing for arrays pickled on another machine! That is, it's definitely not sorting the array, but changing to a very strange order (neither the way it started nor sorted). Again, the array seems to otherwise behave fine (indeed, it even satisfies all(a==a1) for a pair that behave differently in this circumstance). Hmmm... A On 4/5/06, Andrew Jaffe wrote: > > Hi All, > > I've encountered a strange problem: I've been running some python code > on both a linux box and OS X, both with python 2.4.1 and the latest > numpy and matplotlib from svn. 
> > I have found that when I transfer pickled numpy arrays from one machine > to the other (in either direction), the resulting data *looks* all right > (i.e., it is a numpy array of the correct type with the correct values > at the correct indices), but it seems to produce the wrong result in (at > least) one circumstance: matplotlib.hist() gives the completely wrong > picture (and set of bins). > > This can be ameliorated by running the array through > arr=numpy.asarray(arr, dtype=numpy.float64) > but this seems like a complete kludge (and is only needed when you do > the transfer between machines). > > I've attached a minimal code that exhibits the problem: try > test_pickle_hist.test(write=True) > on one machine, transfer the output file to another machine, and run > test_pickle_hist.test(write=False) > on another, and you should see a very strange result (and it should be > fixed if you set asarray=True). > > Any ideas? > > Andrew > > > import cPickle > import numpy > import pylab > > def test(write=True,asarray=False): > > a = numpy.linspace(-3,3,num=100) > > if write: > f1 = file("a.cpkl", 'w') > cPickle.dump(a, f1) > f1.close() > > f1 = open("a.cpkl", 'r') > a1 = cPickle.load(f1) > f1.close() > > pylab.subplot(1,2,1) > h = pylab.hist(a) > > if asarray: > a1 = numpy.asarray(a1, dtype=numpy.float64) > > pylab.subplot(1,2,2) > h1 = pylab.hist(a1) > > return a, a1 > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From byrnes at bu.edu Wed Apr 5 08:58:21 2006 From: byrnes at bu.edu (John Byrnes) Date: Wed Apr 5 08:58:21 2006 Subject: [Numpy-discussion] A random.normal function with stdev as array In-Reply-To: <4433C6D6.5080800@obs.univ-lyon1.fr> References: <4433C6D6.5080800@obs.univ-lyon1.fr> Message-ID: <20060405155736.GA9364@localhost.localdomain> Hi Eric, In the past , I've done things like ###### normdist = lambda x: numpy.random.normal(0,x) vecnormal = numpy.vectorize(normdist) stdev = numpy.array([1.1,1.2,1.0,2.2]) result = vecnormal(stdev) ###### This works fine for up to 10k elements for stdev for some reason. Any larger then that and i get a Bus error on my PPC mac and a segfault on my x86 linux box. I'm running numpy 0.9.7.2325 on both machines. Perhaps for larger inputs, you could break up your loop into smaller vectorized chunks. Regards, John On Wed, Apr 05, 2006 at 03:32:06PM +0200, Eric Emsellem wrote: > Hi, > > I am trying to optimize a code where I derive random numbers many times > and having an array of values for the stdev parameter. > > I wish to have an efficient way of doing something like: > ################## > stdev = array([1.1,1.2,1.0,2.2]) > result = numpy.zeros(stdev.shape, Float) > for i in range(len(stdev)) : > result[i] = numpy.random.normal(0, stdev[i]) > ################## > > In my case, stdev can in fact be an array of a few millions floats... > so I really need to optimize things. > > Any hint on how to code this efficiently ? > > And in general, where could I find tips for optimizing a code where I > unfortunately have too many loops such as "for i in range(Nbody) : " > with Nbody being > 10^6 ? > > thanks! > Eric > > > ------------------------------------------------------- > This SF.Net email is sponsored by xPML, a groundbreaking scripting language > that extends applications into web and mobile media. Attend the live webcast > and join the prime developer group breaking into this new coding territory! 
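(The two working answers in this thread can be put side by side. A rough sketch -- the size stays below the 10k ceiling John reports for the vectorize approach, and the uniform spread of widths is made up purely for illustration:

    import numpy

    stdev = numpy.random.uniform(0.5, 2.5, 10000)

    # John's approach: vectorize() re-enters Python once per element
    vecnormal = numpy.vectorize(lambda s: numpy.random.normal(0, s))
    slow = vecnormal(stdev)

    # Robert's scale-parameter trick: one C-level pass plus one multiply
    fast = numpy.random.standard_normal(stdev.shape) * stdev

Both produce normal deviates with the requested per-element standard deviations; only the second should be expected to scale to millions of elements.)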
> http://sel.as-us.falkag.net/sel?cmd=lnk&kid=110944&bid=241720&dat=121642 > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/numpy-discussion -- If liberty and equality, as is thought by some are chiefly to be found in democracy, they will be best attained when all persons alike share in the government to the utmost. -- Aristotle, Politics From bsouthey at gmail.com Wed Apr 5 09:05:03 2006 From: bsouthey at gmail.com (Bruce Southey) Date: Wed Apr 5 09:05:03 2006 Subject: [Numpy-discussion] NumPy documentation In-Reply-To: <4432E27E.6030906@ee.byu.edu> References: <4432E27E.6030906@ee.byu.edu> Message-ID: Hi, Sorry that you received such an email. It is one thing to disagree with your choice but it is inexcusable to dictate what you should do with your code/documentation (not to mention the language). Unfortunately, this appears to be the result of the typical confusion of what 'free' refers to in open source software. If this person thought that purchasing documentation is bad then I wonder what they think of the PyMOL project: "If you use PyMOL at work, then you are asked and expected to sponsor the project by purchasing a PyMOL Subscription" (http://www.pymol.org/funding.html)! Really the 'book' issue is more an excuse than a real reason for people not to use numpy. Personally I really think that you should get the 1.0 release out that probably would change some minds. Based on the list postings, the stability of numpy already exceeds a typical 1.0 release level. Regards Bruce From schofield at ftw.at Wed Apr 5 09:10:05 2006 From: schofield at ftw.at (Ed Schofield) Date: Wed Apr 5 09:10:05 2006 Subject: [Numpy-discussion] NumPy documentation In-Reply-To: <4432E27E.6030906@ee.byu.edu> References: <4432E27E.6030906@ee.byu.edu> Message-ID: <4433EC3C.9050706@ftw.at> I'd also like to express my gratitude, Travis, for all the time and energy you've donated to both NumPy and SciPy. I also fully support your decision to charge for your book. Perhaps your correspondent expects your book to be free because it's online. Perhaps some re-branding -- from "fee-based documentation" to "book" or "handbook for users and developers" -- would help to avoid evoking such unfair responses? Incidentally, you mention on on the site that you'll print and bind hard-copy version once your sales reach 200 copies. I think this would help to encourage libraries and conservative institutions to purchase copies. Are your sales still under this level?! I'm now going to order a copy for my institution -- and a hard copy when it's available :) -- Ed From robert.kern at gmail.com Wed Apr 5 09:11:01 2006 From: robert.kern at gmail.com (Robert Kern) Date: Wed Apr 5 09:11:01 2006 Subject: [Numpy-discussion] Re: weird interaction: pickle, numpy, matplotlib.hist In-Reply-To: <4433DF85.7030109@gmail.com> References: <4433DF85.7030109@gmail.com> Message-ID: Andrew Jaffe wrote: > Hi All, > > I've encountered a strange problem: I've been running some python code > on both a linux box and OS X, both with python 2.4.1 and the latest > numpy and matplotlib from svn. 
> > I have found that when I transfer pickled numpy arrays from one machine > to the other (in either direction), the resulting data *looks* all right > (i.e., it is a numpy array of the correct type with the correct values > at the correct indices), but it seems to produce the wrong result in (at > least) one circumstance: matplotlib.hist() gives the completely wrong > picture (and set of bins). > > This can be ameliorated by running the array through > arr=numpy.asarray(arr, dtype=numpy.float64) > but this seems like a complete kludge (and is only needed when you do > the transfer between machines). You have a byteorder issue. Your Linux box, which I presume has an Intel or AMD CPU, is little-endian, whereas your OS X box, which I presume has a PPC CPU, is big-endian. numpy arrays can store their data in either endianness on either kind of platform; their dtype objects tell you which byteorder they are using. In the dtype specifications below, '>' means big-endian (I am using a PPC PowerBook), and '<' means little-endian. In [31]: a = linspace(0, 10, 11) In [32]: a Out[32]: array([ 0., 1., 2., 3., 4., 5., 6., 7., 8., 9., 10.]) In [33]: a.dtype Out[33]: dtype('>f8') In [34]: b = a.newbyteorder() In [35]: b Out[35]: array([ 0.00000000e+000, 3.03865194e-319, 3.16202013e-322, 1.04346664e-320, 2.05531309e-320, 2.56123631e-320, 3.06715953e-320, 3.57308275e-320, 4.07900597e-320, 4.33196758e-320, 4.58492919e-320]) In [36]: b.dtype Out[36]: dtype('<f8') -- Robert Kern robert.kern at gmail.com "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From Chris.Barker at noaa.gov Wed Apr 5 2006 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Wed Apr 5 2006 Subject: [Numpy-discussion] NumPy documentation In-Reply-To: <1A94CA82-2EB5-4145-9EA9-453DE60AE684@stanford.edu> References: <4432E27E.6030906@ee.byu.edu> <4432E973.8070601@noaa.gov> <4432F4DD.6060000@cox.net> <1A94CA82-2EB5-4145-9EA9-453DE60AE684@stanford.edu> Message-ID: <4433F1F6.4010603@noaa.gov> Zachary Pincus wrote: > from Numeric (who was used to the large, free manual) Which brings up a question: Is the source to the old Numeric manual available? It would be nice to "port" it to SciPy. -Chris -- Christopher Barker, Ph.D. Oceanographer NOAA/OR&R/HAZMAT (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From bsouthey at gmail.com Wed Apr 5 09:46:03 2006 From: bsouthey at gmail.com (Bruce Southey) Date: Wed Apr 5 09:46:03 2006 Subject: [Numpy-discussion] A random.normal function with stdev as array In-Reply-To: <4433C6D6.5080800@obs.univ-lyon1.fr> References: <4433C6D6.5080800@obs.univ-lyon1.fr> Message-ID: Hi, Can you provide more details on what you are doing, especially how you are using this? The one item that is not directly part of Tim's list is that sometimes you need to reorder your loops (perhaps this is part of "Think about your algorithm"?). Loop swapping is very common to improve performance. However, it usually requires a very clear head or someone else to do it. Also, you might need to break loops into pieces where you repeat the same tasks and computations over and over. The other aspect is to do some algebra on the calculations as the stdev is essentially a constant so depending on how you use it you can factor it out further. Again it all depends on what you are actually doing with these numbers. From a different view, you need to be very careful with your (pseudo)random number generator with that many samples. These have a tendency to repeat so your random number stream is no longer random. See the Wikipedia entry: http://en.wikipedia.org/wiki/Pseudorandom_number_generator If I recall correctly, the Python random number generator is a Mersenne twister but ranlib is not and so prone to the mentioned problems.
I do not know if SciPy adds any other generators. Finally I would also cheat by reducing the stdev values because in many cases you will not see a real difference between a normal with mean zero and variance 1.0 and a normal with mean zero and variance 1.1 (especially if you are doing more than comparing distributions so there are more sources of 'error') unless you have a really large number of samples. Regards Bruce On 4/5/06, Eric Emsellem wrote: > Hi, > > I am trying to optimize a code where I derive random numbers many times > and having an array of values for the stdev parameter. > > I wish to have an efficient way of doing something like: > ################## > stdev = array([1.1,1.2,1.0,2.2]) > result = numpy.zeros(stdev.shape, Float) > for i in range(len(stdev)) : > result[i] = numpy.random.normal(0, stdev[i]) > ################## > > In my case, stdev can in fact be an array of a few millions floats... > so I really need to optimize things. > > Any hint on how to code this efficiently ? > > And in general, where could I find tips for optimizing a code where I > unfortunately have too many loops such as "for i in range(Nbody) : " > with Nbody being > 10^6 ? > > thanks! > Eric > > > ------------------------------------------------------- > This SF.Net email is sponsored by xPML, a groundbreaking scripting language > that extends applications into web and mobile media. Attend the live webcast > and join the prime developer group breaking into this new coding territory! > http://sel.as-us.falkag.net/sel?cmd=lnk&kid=110944&bid=241720&dat=121642 > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/numpy-discussion > From tim.hochberg at cox.net Wed Apr 5 09:58:08 2006 From: tim.hochberg at cox.net (Tim Hochberg) Date: Wed Apr 5 09:58:08 2006 Subject: [Numpy-discussion] A random.normal function with stdev as array] Message-ID: <4433F71B.5080201@cox.net> Eric Emsellem wrote: > Hi, > this is illuminating in fact. These are things I would not have > thought about. > > I am trying at the moment to understand why two versions of my program > have a difference of about 10% (here it is 2sec for 1000 points, so > you can imagine for 10^6...) although the code is nearly the same. > > I have loops such as: > > #################### > bigarray = array of Nbig points > for i in range(N) : > bigarray = bigarray + calculation > #################### If you tell us more about calculation, we could probably help more. This sounds like you want to vectorize the inner loop, but you may in fact have already done that. There's nothing wrong with looping in python as long as you amortize the loop overhead over a large number of operations. Thus, the advice to vectorize your *inner* loop, not vectorize all loops. Attempting the latter can lead to impenetrable code, usually doesn't help significantly and sometimes slows things down as you overflow the cache with big matrices. > > I thought to do it by: > #################### > bigarray = numpy.sum(array([calculation for i in range(N)])) > #################### > not sure this is good... I suspect not, but timeit is your friend.... > > And you are basically saying that > > bigarray = bigarray + calculation > > is better than > > bigarray += calculation > > or is it strictly equivalent? (in terms of CPU...) Actually the reverse. "bigarray += calculation" should be better in terms of both speed and memory usage.
In this case it's also clearer, so it's an improvement all around. They both do the same number of adds, but the first allocates more memory and pushes more data back and forth between main memory and the cache. The point I was making about += versus + was that I wouldn't in general recommend: a = some_func() a += something_else over: a = some_func() + something_else because it's less clear. In cases where you really do need the speed, it's fine, but most of the time that's not the case. In your case, the speedup is fairly minor, I believe because random.normal is fairly expensive. If you instead compare these two ways of computing a cube, you'll see a much larger difference (37%). >>> setup = "import numpy; stddev=numpy.arange(1e6,dtype=float)%3" >>> timeit.Timer('stddev * stddev * stddev', setup).timeit(20) 1.206557537340359 >>> timeit.Timer('result = stddev*stddev; result *= stddev', setup).timeit(20) 0.88055493086403658 However, if you work with smaller matrices, the effect almost disappears (5%): >>> setup = "import numpy; stddev=numpy.arange(1e4,dtype=float)%3" >>> timeit.Timer('result = stddev*stddev; result *= stddev', setup).timeit(2000) 0.10166515576702295 >>> timeit.Timer('stddev * stddev * stddev', setup).timeit(2000) 0.10613667379493563 I believe that's because the speedup is nearly all due to reducing the amount of data you move around. In the second case everything fits in the cache, so this effect is minor. In the first you are pushing data back and forth to main memory so it's fairly large. On my machine these sorts of effects kick in somewhere between 10,000 and 100,000 elements. > > thanks for the help, and sorry for the dumb questions Not a problem. These are all legitimate questions that you can't really be expected to know without a fair amount of experience with numpy or its predecessors. It would be cool if someone added a page to the wiki on the topic so we could start collecting and organizing this information. For all I know there's one already there though -- I should probably check. -tim > > Eric > > Tim Hochberg wrote: > >> Eric Emsellem wrote: >> >>> >>>> >>>> >>>> Since stdev essentially scales the output of random, wouldn't the >>>> following be equivalent to the above? >>>> >>>> result = numpy.random.normal(0, 1, stddev.shape) >>>> result *= stdev >>>> >>> yes indeed, this is a good option where in fact I could do >>> >>> result = stddev * numpy.random.normal(0, 1, stddev.shape) >>> >>> in one line. >>> thanks for the tip >> >> >> Indeed you can. However, keep in mind that the one line version is >> equivalent to: >> >> temp = numpy.random.normal(0, 1, stddev.shape) >> result = stddev * temp >> >> That is, it creates an extra temporary variable only to throw it >> away. The two line version I posted above avoids that temporary and >> thus should be both faster and less memory hungry. It's always good >> to check these things however: >> >> >>> setup = "import numpy; stddev=numpy.arange(1e6,dtype=float)%3" >> >>> timeit.Timer('stddev * numpy.random.normal(0, 1, stddev.shape)', >> setup).timeit(20) >> 3.4527201082819232 >> >>> timeit.Timer('result = numpy.random.normal(0, 1, stddev.shape); >> result*=stddev', setup).timeit(20) >> 3.1093337281693607 >> >> So, yes, the two line method is marginally faster (about 10%). Most >> of the time you shouldn't care about this: the one line version is >> clearer and most of the code you write isn't a bottleneck. Starting >> out writing this as the two line version is premature optimization.
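(Eric's accumulation loop can be timed the same way, in the "timeit is your friend" spirit -- sizes and repeat counts here are arbitrary:

    import timeit

    setup = "import numpy; big = numpy.zeros(1000000); calc = numpy.ones(1000000)"

    # allocates a fresh temporary array on every pass
    print timeit.Timer("big = big + calc", setup).timeit(100)
    # in-place add: reuses big's memory, no temporary
    print timeit.Timer("big += calc", setup).timeit(100)

As with the cube example, the gap should be largest once the arrays outgrow the cache.)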
I >> used it here since the question was about optimization . >> >> I see Robert Kern just posted my list. If you want to put this in >> terms of that list, then: >> >> 0. Think about your algorithm >> => Recognize that stddev is a scale parameter >> 1. Vectorize your inner loop. >> => This is a no brainer after 0 resulting in the one line version >> 2. Eliminate temporaries >> => This results in the two line version. >> ... >> >> Also key here is recognizing when to stop. Steps 0 is always >> appropriate and step 1 is almost always good, resulting in code that >> is both clearer and faster. However, once you get to step 2 and >> beyond you tend to trade speed/memory usage for clarity. Not always: >> sometime *= and friends are clearer, but often, particularly if you >> start resorting to three arg ufuncs. So, my advice is to stop >> optimizing as soon as your code is fast enough. >> >> >>> (of course this is not strictly equivalent depending on the random >>> generator, but that will be fine for my purpose) >> >> >> I'll have to take your word for it -- after the normal distribution >> my knowledge in the area peters out rapidly/ >> >> Regards, >> >> -tim >> >> > From emsellem at obs.univ-lyon1.fr Wed Apr 5 10:06:04 2006 From: emsellem at obs.univ-lyon1.fr (Eric Emsellem) Date: Wed Apr 5 10:06:04 2006 Subject: [Numpy-discussion] A random.normal function with stdev as array In-Reply-To: References: <4433C6D6.5080800@obs.univ-lyon1.fr> Message-ID: <4433F8D1.7090305@obs.univ-lyon1.fr> An HTML attachment was scrubbed... URL: From perry at stsci.edu Wed Apr 5 10:09:01 2006 From: perry at stsci.edu (Perry Greenfield) Date: Wed Apr 5 10:09:01 2006 Subject: [Numpy-discussion] NumPy documentation In-Reply-To: <4433F1F6.4010603@noaa.gov> References: <4432E27E.6030906@ee.byu.edu> <4432E973.8070601@noaa.gov> <4432F4DD.6060000@cox.net> <1A94CA82-2EB5-4145-9EA9-453DE60AE684@stanford.edu> <4433F1F6.4010603@noaa.gov> Message-ID: On Apr 5, 2006, at 12:36 PM, Christopher Barker wrote: > Zachary Pincus wrote: >> from Numeric (who was used to the large, free manual) > > Which brings up a question: Is the source to the old Numeric manual > available? it would be nice to "port" it to SciPy. Sort of. The original source was in Framemaker format. It was converted to the Python latex framework in the process of being adopted to numarray. The source for that is available on the numarray repository. If you want the framemaker source, I may be able to dig that up somewhere (or I may have lost track of it :-). Paul Dubois can likely provide it as well; that's who gave me the source. Perry From hetland at tamu.edu Wed Apr 5 10:15:27 2006 From: hetland at tamu.edu (Robert Hetland) Date: Wed Apr 5 10:15:27 2006 Subject: [Numpy-discussion] NumPy documentation In-Reply-To: <4433F1F6.4010603@noaa.gov> References: <4432E27E.6030906@ee.byu.edu> <4432E973.8070601@noaa.gov> <4432F4DD.6060000@cox.net> <1A94CA82-2EB5-4145-9EA9-453DE60AE684@stanford.edu> <4433F1F6.4010603@noaa.gov> Message-ID: Let's not forget that this documentation will eventually be free *no matter what* -- after a financial goal is met or after a certain amount of time. This makes it fundamentally different than a published book (and in my opinion, much better). I personally think this is an innovative way to create a free product that everybody wants, but nobody wants to do. 
-Rob ----- Rob Hetland, Assistant Professor Dept of Oceanography, Texas A&M University p: 979-458-0096, f: 979-845-6331 e: hetland at tamu.edu, w: http://pong.tamu.edu From fonnesbeck at gmail.com Wed Apr 5 10:28:10 2006 From: fonnesbeck at gmail.com (Chris Fonnesbeck) Date: Wed Apr 5 10:28:10 2006 Subject: Fwd: [Numpy-discussion] NumPy documentation In-Reply-To: <723eb6930604051026q7dbcaad2w47c059f6c88e8db7@mail.gmail.com> References: <4432E27E.6030906@ee.byu.edu> <723eb6930604051026q7dbcaad2w47c059f6c88e8db7@mail.gmail.com> Message-ID: <723eb6930604051027m5aac408dnbba356ebdcb389ac@mail.gmail.com> On 4/4/06, Travis Oliphant wrote: > > I received a rather hurtful email today that was very discouraging to me > personally. Basically, I was called "lame" and a "wolf" in sheep's > clothing because I'm charging for documentation. There is one in every crowd, it seems. This email, and any others like it, should be utterly ignored, in the hopes that their authors will go elsewhere for scientific computing solutions. If they had spent any time at all on this list, they would have noticed the seemingly boundless attention and support that Travis bestows upon both scipy and its user community. Chris -- Chris Fonnesbeck + Atlanta, GA + http://trichech.us -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Wed Apr 5 10:29:07 2006 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed Apr 5 10:29:07 2006 Subject: [Numpy-discussion] NumPy documentation In-Reply-To: <4433EC3C.9050706@ftw.at> References: <4432E27E.6030906@ee.byu.edu> <4433EC3C.9050706@ftw.at> Message-ID: Heh, On 4/5/06, Ed Schofield wrote: > Perhaps some re-branding -- from "fee-based documentation" to > "book" or "handbook for users and developers" I think that's a great idea! "Handbook for Users and Developers" sounds much better and doesn't have that nasty "documentation should be free" implication. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Wed Apr 5 11:35:01 2006 From: robert.kern at gmail.com (Robert Kern) Date: Wed Apr 5 11:35:01 2006 Subject: [Numpy-discussion] Re: A random.normal function with stdev as array In-Reply-To: <4433F8D1.7090305@obs.univ-lyon1.fr> References: <4433C6D6.5080800@obs.univ-lyon1.fr> <4433F8D1.7090305@obs.univ-lyon1.fr> Message-ID: > Bruce Southey wrote: >>>From a different view, you need to be very careful with your >>(pseudo)random number generator with that many samples. These have a >>tendency to repeat so your random number stream is no longer random. >>See the Wikipedia entry: >>http://en.wikipedia.org/wiki/Pseudorandom_number_generator >> >>If I recall correctly, the Python random number generator is a >>Mersenne twister but ranlib is not and so prone to the mentioned >>problems. I do not know if SciPy adds any other generators. numpy.random uses the Mersenne Twister. RANLIB is dead! Long live MT19937! -- Robert Kern robert.kern at gmail.com "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." 
-- Umberto Eco From Chris.Barker at noaa.gov Wed Apr 5 11:59:04 2006 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Wed Apr 5 11:59:04 2006 Subject: [Numpy-discussion] NumPy documentation In-Reply-To: References: <4432E27E.6030906@ee.byu.edu> <4432E973.8070601@noaa.gov> <4432F4DD.6060000@cox.net> <1A94CA82-2EB5-4145-9EA9-453DE60AE684@stanford.edu> <4433F1F6.4010603@noaa.gov> Message-ID: <44341348.3050505@noaa.gov> Perry Greenfield wrote: > Sort of. The original source was in Framemaker format. It was converted > to the Python latex framework in the process of being adopted to > numarray. The source for that is available on the numarray repository. > If you want the framemaker source, I may be able to dig that up > somewhere (or I may have lost track of it :-). Paul Dubois can likely > provide it as well; that's who gave me the source. Thanks. That's good news. Now, when I'm done with everything else I want to work on..... LaTeX is a better option for me anyway. In fact, it's a better option for anyone that doesn't already use FrameMaker, as you can at least edit some of the text without knowing or using LaTeX at all. -Chris -- Christopher Barker, Ph.D. Oceanographer NOAA/OR&R/HAZMAT (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From Chris.Barker at noaa.gov Wed Apr 5 12:07:10 2006 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Wed Apr 5 12:07:10 2006 Subject: [Numpy-discussion] array constructor from generators? In-Reply-To: References: Message-ID: <44341538.4040907@noaa.gov> Zachary Pincus wrote: > I often construct arrays from list comprehensions on generators, > numpy.array([map(float, line.split()) for line in file]) I know there are other uses, and this was just an example, but you can now do: numpy.fromfile(file, dtype=numpy.Float, sep="\t") Which is much faster and cleaner, if you ask me. Thanks for adding this, Travis! Tim Hochberg wrote: > Without this, you probably can't do much > better than just building a list from the array. What would work well > would be to build a list, then steal its memory. Perhaps another option is to borrow the machinery from fromfile (see above), that builds an array without knowing how big it is when it starts. I haven't looked at the code, but I know that Travis got at least the idea, if not the method, from my FileScanner module I wrote a while back, and that dynamically allocated the memory it needed as it grew. -Chris -- Christopher Barker, Ph.D. Oceanographer NOAA/OR&R/HAZMAT (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From tim.hochberg at cox.net Wed Apr 5 12:16:11 2006 From: tim.hochberg at cox.net (Tim Hochberg) Date: Wed Apr 5 12:16:11 2006 Subject: [Numpy-discussion] array constructor from generators? In-Reply-To: <884F03C6-599C-426A-A0A0-97009B63EACB@stanford.edu> References: <44331200.2020604@cox.net> <884F03C6-599C-426A-A0A0-97009B63EACB@stanford.edu> Message-ID: <4434175D.10103@cox.net> Zachary Pincus wrote: > [sorry if this comes through twice -- seems to have not sent the > first time] I've only seen it once so far, but my numpy mail seems to be coming through all out of order right now. > Hi folks, > > tim> > >> I brought this up last week and Travis was OK with it. I have it on >> my todo list, but if you are in a hurry you're welcome to do it >> instead. > > > Sorry if that was on the list and I missed it! 
Hate to be adding more > noise than signal. At any rate, I'm not in a hurry, but I'd be happy > to help where I can. (Though for the next week or so I think I'm > swamped...) There was no real discussion then. I said I thought it was a good idea. Travis said OK. That was about it. > tim> > >> If you do look at it, consider looking into the '__length_hint__' >> parameter that's slated to go into Python 2.5. When this is present, >> it's potentially a big win, since you can preallocate the array and >> fill it directly from the iterator. Without this, you probably can't >> do much better than just building a list from the array. What would >> work well would be to build a list, then steal its memory. I'm not >> sure if that's feasible without leaking a reference to the list though. > > > Can you steal its memory and then give it some dummy memory that it > can free without problems, so that the list can be deallocated > without trouble? Does anyone know if you can just give the list a > NULL pointer for its memory and then immediately decref it? free > (NULL) should always be safe, I think. (??) That might well work, but now I realize that using a list this way probably won't work out well for other reasons. >> Also, with iterators, specifying dtype will make a huge difference. >> If an object has __length_hint__ and you specify dtype, then you can >> preallocate the array as I suggested above. However, if dtype is not >> specified, you still need to build the list completely, determine >> what type it is, allocate the array memory and then copy the values >> into it. Much less efficient! > > > How accurate is __length_hint__ going to be? It could lead to a fair > bit of special case code for growing and shrinking the final array if > __length_hint__ turns out to be wrong. see below. > Code that python lists already have, moreover. If we don't know dtype up front, lists are great. All the code is there and we need to look at all of the elements before we know what the elements are anyway. However, if you do know what the dtype is, the situation is different. Since these are generators, the object they create may only last until the next next() call if we don't hold onto it. That means that for a matrix of size N, generating the whole list is going to require N*(sizeof(long) + sizeof(pyobjType) + sizeof(dtype)), versus just N*sizeof(dtype) if we're careful. I'm not sure what all of those various sizes are, but I'm going to guess that we'd be at least doubling our memory. All is not lost however. When we know the dtype, we should just use a *python* array to hold the data. It works just like a list, but on packed data. > > If the list's memory can be stolen safely, how does this strategy sound: Let me break this into two cases: 1. We don't know the dtype. > - Given a generator, build it up into a list internally +1 > , and then steal the list's memory. -0.5 I'm not sure this buys us as much as I thought initially. The list memory is PyObject*, so this would only work on dtypes no larger than the size of a pointer, usually that means no larger than a long. So, basically this would work on most of the integer types, but not the floating point types. And, it adds extra complexity to support two different cases. I'd be inclined to start with just copying the objects out of the list. If someone feels like it later, they can come back and try to optimize the case of integers to steal the list's memory.
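To make the two paths concrete, here is a rough Python-level sketch (the helper name is made up for illustration; a real implementation would live in C and steal the python array's buffer instead of copying through a string):

import array
import numpy

def array_from_iter_sketch(iterable, typecode=None):
    if typecode is None:
        # unknown dtype: build a list and let the existing
        # array-from-sequence machinery figure out the type
        return numpy.array(list(iterable))
    # known dtype: accumulate in a python array, which grows like a
    # list but stores packed scalars rather than PyObject* pointers
    buf = array.array(typecode)
    for item in iterable:
        buf.append(item)
    return numpy.fromstring(buf.tostring(), dtype=typecode)

# e.g. array_from_iter_sketch((x*x for x in xrange(10)), 'd')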
Keep in mind that once we have a list, we can simply pass it to the machinery that already exists for creating arrays from lists, making our lives much easier. > - If a dtype is provided, wrap the generator with another generator > that casts the original generator's output to the correct dtype. Then > use the wrapped generator to create a list of the proper dtype, and > steal that list's memory. -1. This wastes a lot of space and sort of defeats the purpose of the whole exercise in my mind. 2. Dtype is known. The case where dtype is provided is more complicated, but this is the case we really want to support well. Actually though, I think we can simplify it by judicious punting. Case 2a. Array is not 1-dimensional. Punt and fall back on the general code above. We can determine this simply by testing the first element. If it's not int/float/complex/whatever-other-scalar-values-we-have, fall back to case 1. Case 2b: length_hint is not given. In this case, we build up the array in a python array, steal the data, possibly realloc and we're done. Case 2c: length_hint is given. Same as above, but preallocate the appropriate amount of memory, growing if length_hint lies. > > A potential problem with stealing list memory is that it could waste > memory if the list has more bytes allocated than it is using (I'm not > sure if python lists can get this way, but I presume that they resize > themselves only every so often, like C++ or Java vectors, so most of > the time they have some allocated but unused bytes). If lists have a > squeeze method that's guaranteed not to cause any copies, or if this > can be added with judicious use of realloc, then that problem is > obviated. I imagine once you steal the memory, realloc would be the thing to try. However, I don't think it's worth stealing the memory from lists. I do think it's worth stealing the memory from python arrays however, and I'm sure that the same issue exists there. We'll have to look at how the deallocation for an array works. It probably uses Py_XDecref, in which case we can just replace the memory with NULL and we'll be fine. OK, just had a look at the code for the python array object (Modules/arraymodule.c). Looks like it'll be a piece of cake. We can allocate it to the exact size we want if we have length_hint, otherwise resize only overallocates by 6%. That's not enough to worry about reallocing. Stealing the data looks like it shouldn't be a problem either, just NULL ob_item as you suggested. Regards, -tim > > robert> > >> Another note of caution: You are going to have to deal with > >> iterators of > >> iterators of iterators of.... I'm not sure if that actually overly > >> complicates > >> matters; I haven't looked at PyArray_New for some time. Enjoy! > > > This is a good point. Numpy does fine with nested lists, but what > should it do with nested generators? I originally thought that > basically 'array(generator)' should make the exact same thing as > 'array([f for f in generator])'. However, for nested generators, this > would be an object array of generators. > > I'm not sure which is better -- having more special cases for > generators that make generators, or having a simple rubric like above > for how generators are treated. > > Any thoughts? > > Zach
From aisaac at american.edu Wed Apr 5 14:01:01 2006 From: aisaac at american.edu (Alan G Isaac) Date: Wed Apr 5 14:01:01 2006 Subject: [Numpy-discussion] NumPy documentation In-Reply-To: <4433EC3C.9050706@ftw.at> References: <4432E27E.6030906@ee.byu.edu><4433EC3C.9050706@ftw.at> Message-ID: On Wed, 05 Apr 2006, Ed Schofield apparently wrote: > you mention on the site that you'll print and bind > a hard-copy version once your sales reach 200 copies. > I think this would help to encourage libraries and > conservative institutions to purchase copies. Unfortunately, my library falls in this category. They were uncertain how to enforce the copyright with an electronic copy. (They are still thinking about it, last I heard.) Cheers, Alan Isaac From rahul.kanwar at gmail.com Wed Apr 5 16:25:01 2006 From: rahul.kanwar at gmail.com (Rahul Kanwar) Date: Wed Apr 5 16:25:01 2006 Subject: [Numpy-discussion] Numpy on 64 bit Xeon with ifort and mkl Message-ID: <63dec5bf0604051624k70c565baw70347a2fd571c253@mail.gmail.com> Hello, I am trying to compile Numpy on a 64 bit Xeon with ifort and the mkl libraries, running Suse 10.0 linux. I had set the MKLROOT variable to the mkl library root but it couldn't find the 64 bit library. After a little bit of snooping I found the following in numpy/distutils/cpuinfo.py ------------------------------ def _is_XEON(self): return re.match(r'.*?XEON\b', self.info[0]['model name']) is not None _is_Xeon = _is_XEON ------------------------------ I changed XEON to Xeon and it worked and was able to identify the em64t libraries. But it again got stuck with the following message.
I used the following command to build Numpy python setup.py config_fc --fcompiler=intel install ------------------------------ building 'numpy.core._dotblas' extension compiling C sources gcc options: '-pthread -fno-strict-aliasing -DNDEBUG -O2 -fmessage-length=0 -Wall -D_FORTIFY_SOURCE=2 -g -fPIC' compile options: '-Inumpy/core/blasdot -I/opt/intel/mkl/8.0.2/include -Inumpy/core/include -Ibuild/src/numpy/core -Inumpy/core/src -Inumpy/core/include -I/usr/include/python2.4 -c' gcc -pthread -shared build/temp.linux-x86_64-2.4/numpy/core/blasdot/_dotblas.o -L/opt/intel/mkl/8.0.2/lib/em64t -lmkl_em64t -lmkl -lvml -lguide -lpthread -o build/lib.linux-x86_64-2.4/numpy/core/_dotblas.so /usr/lib64/gcc/x86_64-suse-linux/4.0.2/../../../../x86_64-suse-linux/bin/ld: /opt/intel/mkl/8.0.2/lib/em64t/libmkl_em64t.a(def_cgemm_omp.o): relocation R_X86_64_PC32 against `_mkl_blas_def_cgemm_276__par_loop0' can not be used when making a shared object; recompile with -fPIC /usr/lib64/gcc/x86_64-suse-linux/4.0.2/../../../../x86_64-suse-linux/bin/ld: final link failed: Bad value collect2: ld returned 1 exit status /usr/lib64/gcc/x86_64-suse-linux/4.0.2/../../../../x86_64-suse-linux/bin/ld: /opt/intel/mkl/8.0.2/lib/em64t/libmkl_em64t.a(def_cgemm_omp.o): relocation R_X86_64_PC32 against `_mkl_blas_def_cgemm_276__par_loop0' can not be used when making a shared object; recompile with -fPIC /usr/lib64/gcc/x86_64-suse-linux/4.0.2/../../../../x86_64-suse-linux/bin/ld: final link failed: Bad value collect2: ld returned 1 exit status error: Command "gcc -pthread -shared build/temp.linux-x86_64-2.4/numpy/core/blasdot/_dotblas.o -L/opt/intel/mkl/8.0.2/lib/em64t -lmkl_em64t -lmkl -lvml -lguide -lpthread -o build/lib.linux-x86_64-2.4/numpy/core/_dotblas.so" failed with exit status 1 ---------------------------------------------- I successfully compiled it without the -lmkl_em64t flag but when I import numpy in Python it gives an error that some symbol is missing. I think that maybe if I use ifort as the linker instead of gcc then things will work out properly, but I couldn't find how to change the linker to ifort. Anyone out there who can help me with this problem? regards, Rahul From robert.kern at gmail.com Wed Apr 5 17:17:04 2006 From: robert.kern at gmail.com (Robert Kern) Date: Wed Apr 5 17:17:04 2006 Subject: [Numpy-discussion] Re: Numpy on 64 bit Xeon with ifort and mkl In-Reply-To: <63dec5bf0604051624k70c565baw70347a2fd571c253@mail.gmail.com> References: <63dec5bf0604051624k70c565baw70347a2fd571c253@mail.gmail.com> Message-ID: Rahul Kanwar wrote: > I successfully compiled it without the -lmkl_em64t flag but when I import > numpy in Python it gives an error that some symbol is missing. I think > that maybe if I use ifort as the linker instead of gcc then things > will work out properly, but I couldn't find how to change the linker > to ifort. Anyone out there who can help me with this problem? It's not likely that using ifort to link will help. The problem is this bit: > /opt/intel/mkl/8.0.2/lib/em64t/libmkl_em64t.a(def_cgemm_omp.o): > relocation R_X86_64_PC32 against `_mkl_blas_def_cgemm_276__par_loop0' > can not be used when making a shared object; recompile with -fPIC You are linking against static libraries which were not compiled to be "position independent;" that is, they can't be used in shared libraries, which are what Python extension modules are.
C.f.: http://en.wikipedia.org/wiki/Position_independent_code Look around in /opt/intel/; they've almost certainly provided shared library versions of the MKL that could be used. Google gives me these, for example: http://www.intel.com/support/performancetools/libraries/mkl/linux/sb/cs-017267.htm http://www.intel.com/software/products/mkl/docs/mklgs_lnx.htm#Linking_Your_Application_with_Intel_MKL -- Robert Kern robert.kern at gmail.com "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From ryanlists at gmail.com Wed Apr 5 19:50:07 2006 From: ryanlists at gmail.com (Ryan Krauss) Date: Wed Apr 5 19:50:07 2006 Subject: [Numpy-discussion] eye(N,dtype='S10') Message-ID: I am trying to create a function that can return a matrix that is either made up of complex numbers or strings depending on the input. I have created a symbolic string class to help me with that and it works well. One clumsy part is that in several cases I want to create an identity matrix and just replace a couple of elements. I currently have to do this in two steps: In [27]: mymat=numpy.eye(4,dtype='f') In [28]: mymat.astype('S10') Out[28]: array([[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.0, 1.0]], dtype=(string,10)) I create a floating point matrix in the string case rather than a complex matrix so I don't have to parse the +0.0j stuff. But what I would really like is to be able to just create either a complex matrix or a string matrix at the beginning. But trying numpy.eye(4,dtype='S10') produces array([[True, False, False, False], [False, True, False, False], [False, False, True, False], [False, False, False, True]], dtype=(string,10)) rather than array([[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.0, 1.0]], dtype=(string,10)) I need 1's and 0's rather than True and False because when I am done, I put the string representation into an input script to Maxima and Maxima wouldn't handle the True and False values well. Is there a way to directly create an identity string matrix with '1' and '0' instead of True and False? Thanks, Ryan From arnd.baecker at web.de Wed Apr 5 23:51:03 2006 From: arnd.baecker at web.de (Arnd Baecker) Date: Wed Apr 5 23:51:03 2006 Subject: [Numpy-discussion] Converting from Numeric (was: Speed up function on cross product of two sets?) In-Reply-To: References: <44315633.4010600@cox.net> Message-ID: Moin Moin, On Wed, 5 Apr 2006, Pearu Peterson wrote: > On Wed, 5 Apr 2006, Arnd Baecker wrote: > > > BTW, it seems that we have no Numeric to numpy transition remarks in > > www.scipy.org. I only found > > http://www.scipy.org/PearuPeterson/NumpyVersusNumeric > > and of course Travis' "Guide to NumPy" contains a detailed list of > > necessary changes in chapter 2.6.1. > > In addition ``site-packages/numpy/lib/convertcode.py`` provides an > > automatic conversion. > > > > Would it be helpful to start a new wiki page "ConvertingFromNumeric" > > (similar to http://www.scipy.org/Converting_from_numarray) > > which aims at summarizing the necessary changes > > or expand Pearu's page (if he agrees) on this? > > It's better to start a new wiki page similar to Converting_from_numarray > (I like the table).
Based on the above links I have set up a first draft at http://www.scipy.org/Converting_from_Numeric It is surely not complete and there are a couple of things which have to be checked for correctness (I tried out some, but not all ...). Also some remarks on using the new features of numpy (e.g., use array indexing instead of take and put...) might be useful. > Btw, I have a few notes about the necessary changes for > Numeric->numpy transition in the following page: > > http://svn.enthought.com/enthought/wiki/NumpyPort#NotesonchangesduetoreplacingNumeric/scipy_basewithnumpy > > Feel free to grab these notes. Great - thanks, I tried to incorporate them as well. Best, Arnd From cimrman3 at ntc.zcu.cz Thu Apr 6 01:48:05 2006 From: cimrman3 at ntc.zcu.cz (Robert Cimrman) Date: Thu Apr 6 01:48:05 2006 Subject: [Numpy-discussion] NumPy documentation In-Reply-To: <4432E27E.6030906@ee.byu.edu> References: <4432E27E.6030906@ee.byu.edu> Message-ID: <4434D58D.2010505@ntc.zcu.cz> Travis Oliphant wrote: > > I received a rather hurtful email today that was very discouraging to me > ... Coming late on line, I can just add my +1 to all the support and appreciation you have received so far! r. From oliphant.travis at ieee.org Thu Apr 6 01:54:01 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Thu Apr 6 01:54:01 2006 Subject: [Numpy-discussion] Speed up function on cross product of two sets? In-Reply-To: References: <44315633.4010600@cox.net> Message-ID: <4434D6DF.2020306@ieee.org> Arnd Baecker wrote: > BTW, it seems that we have no Numeric to numpy transition remarks in > www.scipy.org. I only found > http://www.scipy.org/PearuPeterson/NumpyVersusNumeric > and of course Travis' "Guide to NumPy" contains a detailed list of > necessary changes in chapter 2.6.1. > For clarification: this is in the sample chapter available on-line to all.... > In addition ``site-packages/numpy/lib/convertcode.py`` provides an > automatic conversion. > > Would it be helpful to start a new wiki page "ConvertingFromNumeric" > (similar to http://www.scipy.org/Converting_from_numarray) > which aims at summarizing the necessary changes > or expand Pearu's page (if he agrees) on this? > Absolutely. I did the Numarray page because I'd written a lot on Converting from Numeric (even providing convertcode.py) but very little for numarray --- except the ndimage conversion. So, I started the Numarray page. Sounds like a great idea to have a dual page. -Travis From oliphant.travis at ieee.org Thu Apr 6 02:21:02 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Thu Apr 6 02:21:02 2006 Subject: [Numpy-discussion] array constructor from generators? In-Reply-To: References: <44331200.2020604@cox.net> Message-ID: <4434DD42.8010205@ieee.org> > Can you steal its memory and then give it some dummy memory that it > can free without problems, so that the list can be deallocated without > trouble? Does anyone know if you can just give the list a NULL pointer > for its memory and then immediately decref it? free(NULL) should > always be safe, I think. (??) > I don't think you can steal a list's memory since each list element is actually a pointer to some other Python object. However, a Python array's memory could be stolen (as Tim mentions later). > This is a good point. Numpy does fine with nested lists, but what > should it do with nested generators? I originally thought that > basically 'array(generator)' should make the exact same thing as > 'array([f for f in generator])'.
However, for nested generators, this > would be an object array of generators. > > I'm not sure which is better -- having more special cases for > generators that make generators, or having a simple rubric like above > for how generators are treated. I like the idea that generators of generators act the same as lists of lists (i.e. recursively defined). Basically to implement this, we need to repeat Array_FromSequence, discover_depth, discover_dimensions and discover_itemsize. Or, just maybe we can figure out a way to enhance those functions so that creating an array from generators works the same as creating an array from sequences. Right now, the sequence interface is used. Perhaps we could figure out a way to use a more abstract interface which would include both generators and sequences. If that causes too much alteration then I don't think it's worth it and we just repeat those functions for generators. Now, I think there are two cases here that are being discussed as one 1) Creating arrays from iterators --- array( iter(xrange(10)) ) 2) Creating arrays from generators --- array(x for x in xrange(10)) Both of these cases really ought to be handled and really should be integrated into the Array_FromSequence code. That code is inherited from Numeric and was written before iterators and generators arose on the scene. There ought to be a way to unify all of these notions (Actually if you handle iterators, then sequences will come along for the ride since sequences can behave as iterators). I'd rather see one place in the code that handles these cases. But, working code is usually better than dreamy plans :-) -Travis From oliphant.travis at ieee.org Thu Apr 6 02:38:04 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Thu Apr 6 02:38:04 2006 Subject: [Numpy-discussion] A random.normal function with stdev as array In-Reply-To: <20060405155736.GA9364@localhost.localdomain> References: <4433C6D6.5080800@obs.univ-lyon1.fr> <20060405155736.GA9364@localhost.localdomain> Message-ID: <4434E13B.4000702@ieee.org> John Byrnes wrote: > Hi Eric, > > In the past, I've done things like > > ###### > normdist = lambda x: numpy.random.normal(0,x) > vecnormal = numpy.vectorize(normdist) > > stdev = numpy.array([1.1,1.2,1.0,2.2]) > result = vecnormal(stdev) > > ###### > > This works fine for up to 10k elements for stdev for some reason. > Any larger than that and I get a Bus error on my PPC mac and a segfault on > my x86 linux box. > > This needs to be tracked down. It looks like some kind of error is not being caught correctly. You should not get a segfault. Could you provide a stack-trace when the problem occurs? One issue is that vectorize is using object arrays under the covers, which consumes roughly 2x the memory you might expect. An object array is created and the function is called for every element. This object array is then converted to a number type after the fact. The segfault should be tracked down in any case.
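In the meantime, a vectorize-free workaround may help: since N(0, s**2) is just s times N(0, 1), a single vectorized draw of standard normals can be scaled by the stdev array (a sketch, not tested against the failing versions above):

import numpy

stdev = numpy.array([1.1, 1.2, 1.0, 2.2])
# scale standard normal draws elementwise by the desired stdevs
result = numpy.random.standard_normal(stdev.shape) * stdev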
-Travis From pau.gargallo at gmail.com Thu Apr 6 02:44:03 2006 From: pau.gargallo at gmail.com (Pau Gargallo) Date: Thu Apr 6 02:44:03 2006 Subject: [Numpy-discussion] NumPy documentation In-Reply-To: References: <4432E27E.6030906@ee.byu.edu> <4432E973.8070601@noaa.gov> <4432F4DD.6060000@cox.net> <1A94CA82-2EB5-4145-9EA9-453DE60AE684@stanford.edu> <4433F1F6.4010603@noaa.gov> Message-ID: <6ef8f3380604060243u2f54efc3r2baba94688c5d0af@mail.gmail.com> On 4/5/06, Perry Greenfield wrote: > > On Apr 5, 2006, at 12:36 PM, Christopher Barker wrote: > > > Zachary Pincus wrote: > >> from Numeric (who was used to the large, free manual) > > > > Which brings up a question: Is the source to the old Numeric manual > > available? it would be nice to "port" it to SciPy. > > Sort of. The original source was in Framemaker format. It was converted > to the Python latex framework in the process of being adopted to > numarray. The source for that is available on the numarray repository. > If you want the framemaker source, I may be able to dig that up > somewhere (or I may have lost track of it :-). Paul Dubois can likely > provide it as well; that's who gave me the source. > > Perry > +1 to any support for Travis Oliphant. Your work is really helping us. I am quite ignorant about licences and copyright things, so I would like to know: 1.- Is it OK to just copy the old Numeric documentation to the wiki and use it as a starting point for a more complete and updated doc? 2.- Would that be fine with the authors? I guess it will be very useful to everyone (especially beginners) to have an extended version of this documentation where there are many examples of use for every function. The wiki seems a very efficient way to build such a thing. It will take some time to manually copy-paste everything to the wiki, but it is doable. What do you think? pau From oliphant.travis at ieee.org Thu Apr 6 02:46:02 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Thu Apr 6 02:46:02 2006 Subject: [Numpy-discussion] Re: weird interaction: pickle, numpy, matplotlib.hist In-Reply-To: References: <4433DF85.7030109@gmail.com> Message-ID: <4434E31B.5030306@ieee.org> Robert Kern wrote: > You have a byteorder issue. Your Linux box, which I presume has an Intel or AMD > CPU, is little-endian, whereas your OS X box, which I presume has a PPC CPU, is > big-endian. numpy arrays can store their data in either endianness on either > kind of platform; their dtype objects tell you which byteorder they are using. > > In [54]: c.sort() > > In [55]: c > Out[55]: array([ 0., 2., 3., 4., 5., 6., 7., 8., 9., 10., 1.]) > > > This is a bug. > > http://projects.scipy.org/scipy/numpy/ticket/47 > Good catch. This bug was due to an oversight when adding the new sorting functions. The case of byte-swapped data was not handled. Judicious use of copyswap on the buffer fixed it. But, this brings up the point that currently the pickled raw-data which is read-in as a string by Python is used as the memory for the new array (i.e. the string memory is "stolen"). This should work. The fact that it didn't with sort was a bug that is now fixed in SVN. However, operations on out-of-byte-order arrays will always be slower. Thus, perhaps on pickle read the data should be copied to native byte-order if necessary. Opinions?
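For reference, the conversion under discussion is cheap to express at the Python level (a sketch only; dtype.newbyteorder('=') requests native byte order):

import numpy

def to_native(a):
    # copy the data to native byte order only if it is byte-swapped
    if a.dtype.isnative:
        return a
    return a.astype(a.dtype.newbyteorder('='))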
-Travis From benjamin at decideur.info Thu Apr 6 03:23:09 2006 From: benjamin at decideur.info (Benjamin Thyreau) Date: Thu Apr 6 03:23:09 2006 Subject: [Numpy-discussion] Recarray and shared datas Message-ID: <200604061020.k36AKIsQ018238@decideur.info> Hi, Numpy has a nice feature, recarray, i.e. records which can hold column names. I'd like to use such a feature in order to better interact with R, i.e. passing R data to Python without copying. The current rpy bindings do a full copy, and convert to a simple ndarray. Looking at the recarray api in the Guide, and also at the source code, I don't find any recarray constructor which can use shared data (all the examples from section 8.6 are doing copies). Is there some way to do it, in Python or in C? Or are there any plans to? Thanks for the info -- Benjamin Thyreau CEA/SHFJ Orsay From oliphant.travis at ieee.org Thu Apr 6 03:40:05 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Thu Apr 6 03:40:05 2006 Subject: [Numpy-discussion] Newbie indexing question and print order In-Reply-To: <44338DF4.7050603@gmail.com> References: <44338DF4.7050603@gmail.com> Message-ID: <4434E522.3060101@ieee.org> amcmorl wrote: > Hi all, > > I'm having a bit of trouble getting my head around numpy's indexing > capabilities. A quick summary of the problem is that I want to > lookup/index in nD from a second array of rank n+1, such that the last > (or first, I guess) dimension contains the lookup co-ordinates for the > value to extract from the first array. Here's a 2D (3,3) example: > > In [12]:print ar > [[ 0.15 0.75 0.2 ] > [ 0.82 0.5 0.77] > [ 0.21 0.91 0.59]] > > In [24]:print inds > [[[1 1] > [1 1] > [2 1]] > > [[2 2] > [0 0] > [1 0]] > > [[1 1] > [0 0] > [2 1]]] > > then somehow return the array (barring me making any row/column errors): > In [26]: c = ar.somefancyindexingroutinehere(inds) > You can do this with "fancy-indexing". Obviously it is going to take some time for people to get used to this idea as none of the responses yet suggest it. But the following works. c = ar[inds[...,0],inds[...,1]] gives the desired effect. Thus, your simple description c[x,y] = ar[inds[x,y,0],inds[x,y,1]] is a text-book description of what fancy-indexing does. Best regards, -Travis > In [26]:print c > [[ 0.5 0.5 0.91] > [ 0.59 0.15 0.82] > [ 0.5 0.15 0.91]] > > i.e. c[x,y] = a[ inds[x,y,0], inds[x,y,1] ] > > Any suggestions? It looks like it should be relatively simple using > 'put' or 'take' or 'fetch' or 'sit' or something like that, but I'm not > getting it. > > While I'm here, can someone help me understand the rationale behind > 'print' printing row, column (i.e. a[0,1] = 0.75 in the above example) > rather than x, y (=column, row; in which case 0.75 would be in the first > column and second row), which seems to me to be more intuitive. > > I'm really enjoying getting into numpy - I can see it'll be > simpler/faster coding than my previous environments, despite me not > knowing my way at the moment, and that python has better opportunities > for extensibility. So, many thanks for your great work. 
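Generalizing this to n dimensions is mostly a matter of splitting the last axis of the index array into one index array per axis, e.g. (a sketch with a hypothetical helper name):

import numpy

def take_nd(ar, inds):
    # inds has shape result_shape + (ar.ndim,); one index array per axis
    return ar[tuple(inds[..., i] for i in range(ar.ndim))]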
From faltet at carabos.com Thu Apr 6 03:44:02 2006 From: faltet at carabos.com (Francesc Altet) Date: Thu Apr 6 03:44:02 2006 Subject: [Numpy-discussion] Re: weird interaction: pickle, numpy, matplotlib.hist In-Reply-To: <4434E31B.5030306@ieee.org> References: <4433DF85.7030109@gmail.com> <4434E31B.5030306@ieee.org> Message-ID: <200604061243.48122.faltet@carabos.com> On Thursday 06 April 2006 11:44, Travis Oliphant wrote: > But, this brings up the point that currently the pickled raw-data which > is read-in as a string by Python is used as the memory for the new array > (i.e. the string memory is "stolen"). This should work. The fact > that it didn't with sort was a bug that is now fixed in SVN. However, > operations on out-of-byte-order arrays will always be slower. Thus, > perhaps on pickle read the data should be copied to native byte-order if > necessary. Yes, I think that converting directly to native byteorder at unpickling time would be the best. Cheers! -- >0,0< Francesc Altet http://www.carabos.com/ V V Cárabos Coop. V. Enjoy Data "-" From a.u.r.e.l.i.a.n at gmx.net Thu Apr 6 04:16:11 2006 From: a.u.r.e.l.i.a.n at gmx.net (Johannes Loehnert) Date: Thu Apr 6 04:16:11 2006 Subject: [Numpy-discussion] Re: weird interaction: pickle, numpy, matplotlib.hist In-Reply-To: <200604061243.48122.faltet@carabos.com> References: <4433DF85.7030109@gmail.com> <4434E31B.5030306@ieee.org> <200604061243.48122.faltet@carabos.com> Message-ID: <200604061315.23340.a.u.r.e.l.i.a.n@gmx.net> Hi, > > But, this brings up the point that currently the pickled raw-data which > > is read-in as a string by Python is used as the memory for the new array > > (i.e. the string memory is "stolen"). This should work. The fact > > that it didn't with sort was a bug that is now fixed in SVN. However, > > operations on out-of-byte-order arrays will always be slower. Thus, > > perhaps on pickle read the data should be copied to native byte-order if > > necessary. > > Yes, I think that converting directly to native byteorder at > unpickling time would be the best. If you stored your data in the wrong byte order for some odd reason (maybe you use a library that requires a certain byte order), then you would want pickle to deliver the data back exactly as stored. I think this should be made a user option in some way, although I do not know a good place for it right now. Johannes From cimrman3 at ntc.zcu.cz Thu Apr 6 05:16:07 2006 From: cimrman3 at ntc.zcu.cz (Robert Cimrman) Date: Thu Apr 6 05:16:07 2006 Subject: [Numpy-discussion] site.cfg.example In-Reply-To: <4435020B.9040705@iam.uni-stuttgart.de> References: <44280161.4030708@ntc.zcu.cz> <442808AF.6090006@ftw.at> <44280C20.8000003@ntc.zcu.cz> <44297152.9000305@ftw.at> <442A698C.9000104@ntc.zcu.cz> <442A7E78.1030901@ftw.at> <442A86D2.20902@ntc.zcu.cz> <442A9A67.8050106@ftw.at> <442A9F8D.906@ntc.zcu.cz> <443253D4.90806@iam.uni-stuttgart.de> <4434D699.5030102@ntc.zcu.cz> <4434D8D3.7050200@iam.uni-stuttgart.de> <4434FC6B.3000905@ntc.zcu.cz> <4435020B.9040705@iam.uni-stuttgart.de> Message-ID: <44350672.4020008@ntc.zcu.cz> I have added numpy/site.cfg.example to the SVN. It should contain a list of all possible sections and relevant fields, so that a (new) user sees what can be configured and then just copies the file to numpy/site.cfg, removes the unwanted sections and edits the wanted ones. If you think it is a good idea and have a section that is not present or properly described, contribute it, please :-) When/if the file grows, we can put it on the Wiki. cheers, r. 
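For anyone who has not seen the file, a site.cfg entry looks roughly like this (the section, keys and paths below are illustrative only, not a tested configuration):

[atlas]
library_dirs = /usr/local/lib/atlas
include_dirs = /usr/local/include/atlas
atlas_libs = lapack, f77blas, cblas, atlas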
From tim.hochberg at cox.net Thu Apr 6 08:39:00 2006 From: tim.hochberg at cox.net (Tim Hochberg) Date: Thu Apr 6 08:39:00 2006 Subject: [Numpy-discussion] Re: weird interaction: pickle, numpy, matplotlib.hist In-Reply-To: <200604061315.23340.a.u.r.e.l.i.a.n@gmx.net> References: <4433DF85.7030109@gmail.com> <4434E31B.5030306@ieee.org> <200604061243.48122.faltet@carabos.com> <200604061315.23340.a.u.r.e.l.i.a.n@gmx.net> Message-ID: <44353646.6010009@cox.net> Johannes Loehnert wrote: >Hi, > > > >>>But, this brings up the point that currently the pickled raw-data which >>>is read-in as a string by Python is used as the memory for the new array >>>(i.e. the string memory is "stolen"). This should work. The fact >>>that it didn't with sort was a bug that is now fixed in SVN. However, >>>operations on out-of-byte-order arrays will always be slower. Thus, >>>perhaps on pickle read the data should be copied to native byte-order if >>>necessary. >>> >>> >>Yes, I think that converting directly to native byteorder at >>unpickling time would be the best. >> >> > >If you stored your data in the wrong byte order for some odd reason (maybe you use >a library that requires a certain byte order), then you would want pickle to >deliver the data back exactly as stored. I think this should be made a user >option in some way, although I do not know a good place for it right now. > > If this is really something we want to do, it seems that the "correct" solution is to have a different dtype when an object defaults to a given byte order than when it is forced to that byte order. Pickle could keep track of that and do the right thing on loading. From tim.hochberg at cox.net (Tim Hochberg) Subject: [Numpy-discussion] Re: array constructor from generators? In-Reply-To: <4434DD42.8010205@ieee.org> References: <44331200.2020604@cox.net> <4434DD42.8010205@ieee.org> Message-ID: <44353880.2040406@cox.net>
> > Now, I think there are two cases here that are being discussed as one > > 1) Creating arrays from iterators --- array( iter(xrange(10) ) > 2) Creating arrays from generators --- array(x for x in xrange(10)) > > Both of these cases really ought to be handled and really should be > integrated into the Array_FromSequence code. That code is inherited > from Numeric and was written before iterators and generators arose on > the scene. There ought to be a way to unify all of these notions > (Actually if you handle iterators, then sequences will come along for > the ride since sequences can behave as iterators). > I'd rather see one place in the code that handles these cases. But, > working code is usually better than dreamy plans :-) I agree with all of this. However, there's one specific case that I think we should optimize the heck out of. In fact, I'd be tempted as a first cut to only implement this case and raise exceptions in the other cases until we get around to implementing them. This one case is: * dtype known * 1-dimensional I care about this case because it's common and we can do it efficiently. In the other cases I could write a python function that does almost as good of a job as we're likely to do in C both in terms of speed and memory usage. So the known dtype, 1D case adds important functionality while the other "merely" adds convenience (and consistency). Those are good, but personally the added functionality is higher on my priority list. -tim From byrnes at bu.edu Thu Apr 6 09:15:25 2006 From: byrnes at bu.edu (John Byrnes) Date: Thu Apr 6 09:15:25 2006 Subject: [Numpy-discussion] A random.normal function with stdev as array In-Reply-To: <4434E13B.4000702@ieee.org> References: <4433C6D6.5080800@obs.univ-lyon1.fr> <20060405155736.GA9364@localhost.localdomain> <4434E13B.4000702@ieee.org> Message-ID: <20060406161450.GA18606@localhost.localdomain> On Thu, Apr 06, 2006 at 03:36:59AM -0600, Travis Oliphant wrote: > John Byrnes wrote: > >Hi Eric, > > > >In the past , I've done things like > > > >###### > >normdist = lambda x: numpy.random.normal(0,x) > >vecnormal = numpy.vectorize(normdist) > > > >stdev = numpy.array([1.1,1.2,1.0,2.2]) > >result = vecnormal(stdev) > > > >###### > > > >This works fine for up to 10k elements for stdev for some reason. > >Any larger then that and i get a Bus error on my PPC mac and a segfault on > >my x86 linux box. > > > > > > This needs to be tracked down. It looks like some-kind of error is not > being caught correctly. You should not get a segfault. Could you > provide a stack-trace when the problem occurs? > > One issue is that vectorize is using object arrays under the covers > which is consuming roughly 2x the memory than you may think. An > object array is created and the function is called for every element. > This object array is then converted to a number type after the fact. > > The segfault should be tracked down in any case. > > -Travis > > > Hi Travis, Here is a backtrace from gdb on my mac. John #0 0x00470b88 in log1pl () #1 0x00000000 in ?? 
() Cannot access memory at address 0x0 Cannot access memory at address 0x0 #2 0x004708ec in log1pl () #3 0x1000c348 in PyObject_Call (func=0x4, arg=0x4, kw=0x15fb) at /Users/bob/src/Python-2.4.1/Objects/abstract.c:1751 #4 0x1007ce34 in ext_do_call (func=0x1, pp_stack=0xbfffed90, flags=211904, na=8656012, nk=1194304) at /Users/bob/src/Python-2.4.1/Python/ceval.c:3824 #5 0x1007a230 in PyEval_EvalFrame (f=0x848410) at /Users/bob/src/Python-2.4.1/Python/ceval.c:2203 #6 0x1007b284 in PyEval_EvalCodeEx (co=0x2, globals=0x4, locals=0x1, args=0x3, argcount=1049072, kws=0x841150, kwcount=1, defs=0x8411fc, defcount=0, closure=0x0) at /Users/bob/src/Python-2.4.1/Python/ceval.c:2730 #7 0x10026274 in function_call (func=0x880bb0, arg=0x1001f0, kw=0x848410) at /Users/bob/src/Python-2.4.1/Objects/funcobject.c:548 #8 0x1000c348 in PyObject_Call (func=0x4, arg=0x4, kw=0x15fb) at /Users/bob/src/Python-2.4.1/Objects/abstract.c:1751 #9 0x10015a88 in instancemethod_call (func=0x52eef0, arg=0x54a170, kw=0x0) at /Users/bob/src/Python-2.4.1/Objects/classobject.c:2431 #10 0x1000c348 in PyObject_Call (func=0x4, arg=0x4, kw=0x15fb) at /Users/bob/src/Python-2.4.1/Objects/abstract.c:1751 #11 0x10059358 in slot_tp_call (self=0x53e4f0, args=0x5b310, kwds=0x0) at /Users/bob/src/Python-2.4.1/Objects/typeobject.c:4526 #12 0x1000c348 in PyObject_Call (func=0x4, arg=0x4, kw=0x15fb) at /Users/bob/src/Python-2.4.1/Objects/abstract.c:1751 #13 0x1007c9e4 in do_call (func=0x53e4f0, pp_stack=0x53e4f0, na=0, nk=8655844) at /Users/bob/src/Python-2.4.1/Python/ceval.c:3755 #14 0x1007c6dc in call_function (pp_stack=0x0, oparg=4) at /Users/bob/src/Python-2.4.1/Python/ceval.c:3570 #15 0x1007a140 in PyEval_EvalFrame (f=0x10e200) at /Users/bob/src/Python-2.4.1/Python/ceval.c:2163 #16 0x1007c83c in fast_function (func=0x4, pp_stack=0x10e360, n=268927488, na=268755664, nk=1) at /Users/bob/src/Python-2.4.1/Python/ceval.c:3629 #17 0x1007c6c4 in call_function (pp_stack=0xbffff5bc, oparg=4) at /Users/bob/src/Python-2.4.1/Python/ceval.c:3568 #18 0x1007a140 in PyEval_EvalFrame (f=0x10e030) at /Users/bob/src/Python-2.4.1/Python/ceval.c:2163 #19 0x1007b284 in PyEval_EvalCodeEx (co=0x0, globals=0x4, locals=0x1, args=0x10078200, argcount=1049072, kws=0x841150, kwcount=1, defs=0x8411fc, defcount=0, closure=0x0) at /Users/bob/src/Python-2.4.1/Python/ceval.c:2730 #20 0x1007e678 in PyEval_EvalCode (co=0x4, globals=0x4, locals=0x15fb) at /Users/bob/src/Python-2.4.1/Python/ceval.c:484 #21 0x100b2ee0 in run_node (n=0x10078200, filename=0x4
, globals=0x0, locals=0x10e180, flags=0x2) at /Users/bob/src/Python-2.4.1/Python/pythonrun.c:1265 #22 0x100b23b0 in PyRun_InteractiveOneFlags (fp=0x54a1a5, filename=0x56ca0 "", flags=0x10e030) at /Users/bob/src/Python-2.4.1/Python/pythonrun.c:762 #23 0x100b2190 in PyRun_InteractiveLoopFlags (fp=0x56b94, filename=0xd440 "", flags=0x100f21b8) at /Users/bob/src/Python-2.4.1/Python/pythonrun.c:695 #24 0x100b3bb0 in PyRun_AnyFileExFlags (fp=0xa0001554, filename=0x100f36ac "", closeit=0, flags=0xbffff934) at /Users/bob/src/Python-2.4.1/Python/pythonrun.c:658 #25 0x100bf640 in Py_Main (argc=269413412, argv=0x20000000) at /Users/bob/src/Python-2.4.1/Modules/main.c:484 #26 0x000018d0 in start () #27 0x8fe1a278 in __dyld__dyld_start () From ndarray at mac.com Thu Apr 6 12:42:17 2006 From: ndarray at mac.com (Sasha) Date: Thu Apr 6 12:42:17 2006 Subject: [Numpy-discussion] What is diagonal for nd>2? Message-ID: It looks like the definition of the diagonal changed somewhere between Numeric 24.0 and numpy: In Numeric: >>> x = Numeric.arange(2*4*4) >>> x = Numeric.reshape(x, (2, 4, 4)) >>> x array([[[ 0, 1, 2, 3], [ 4, 5, 6, 7], [ 8, 9, 10, 11], [12, 13, 14, 15]], [[16, 17, 18, 19], [20, 21, 22, 23], [24, 25, 26, 27], [28, 29, 30, 31]]]) >>> Numeric.diagonal(x) array([[ 0, 5, 10, 15], [16, 21, 26, 31]]) But in numpy: >>> import numpy as Numeric >>> x = Numeric.arange(2*4*4) >>> x = Numeric.reshape(x, (2, 4, 4)) >>> Numeric.diagonal(x) array([[ 0, 20], [ 1, 21], [ 2, 22], [ 3, 23]]) The old logic seems to be clear: x is a pair of matrices and diagonal returns a pair of diagonals, but the new logic seems unclear: the diagonal returns the first rows of the two matrices transposed. Does anyone know when this change was introduced and why? From pgmdevlist at mailcan.com Thu Apr 6 13:51:04 2006 From: pgmdevlist at mailcan.com (Pierre GM) Date: Thu Apr 6 13:51:04 2006 Subject: [Numpy-discussion] What is diagonal for nd>2? In-Reply-To: References: Message-ID: <200604061652.30764.pgmdevlist@mailcan.com> > Does anyone know when this change was introduced and why? Isn't it more a problem of default values? By default, x.diagonal() == x.diagonal(0,0,1) x.diagonal() array([[ 0, 20], [ 1, 21], [ 2, 22], [ 3, 23]]) If you want the paired diagonal: x.diagonal(0,1,-1) array([[ 0, 5, 10, 15], [16, 21, 26, 31]]) From ndarray at mac.com Thu Apr 6 14:46:10 2006 From: ndarray at mac.com (Sasha) Date: Thu Apr 6 14:46:10 2006 Subject: [Numpy-discussion] What is diagonal for nd>2? In-Reply-To: <200604061652.30764.pgmdevlist@mailcan.com> References: <200604061652.30764.pgmdevlist@mailcan.com> Message-ID:
> > Isn't it more a problem of default values ? > By default, x.diagonal() == x.diagonal(0,0,1) > > x.diagonal() > array([[ 0, 20], > [ 1, 21], > [ 2, 22], > [ 3, 23]]) > > If you want the paired diagonal: > x.diagonal(0,1,-1) > array([[ 0, 5, 10, 15], > [16, 21, 26, 31]]) > -------------- next part -------------- An HTML attachment was scrubbed... URL: From Chris.Barker at noaa.gov Thu Apr 6 14:59:03 2006 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Thu Apr 6 14:59:03 2006 Subject: [Numpy-discussion] Re: weird interaction: pickle, numpy, matplotlib.hist In-Reply-To: <4434E31B.5030306@ieee.org> References: <4433DF85.7030109@gmail.com> <4434E31B.5030306@ieee.org> Message-ID: <44358EEA.4080609@noaa.gov> Travis Oliphant wrote: > Thus, > perhaps on pickle read the data should be copied to native byte-order if > necessary. +1 Those that are working with non-native byte order on purpose presumably know what they are doing, and can check and swap as necessary -- or use tofile and fromfile, which I presume don't do any byteswapping for you. -Chris -- Christopher Barker, Ph.D. Oceanographer NOAA/OR&R/HAZMAT (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From pgmdevlist at mailcan.com Thu Apr 6 15:01:03 2006 From: pgmdevlist at mailcan.com (Pierre GM) Date: Thu Apr 6 15:01:03 2006 Subject: [Numpy-discussion] What is diagonal for nd>2? In-Reply-To: References: <200604061652.30764.pgmdevlist@mailcan.com> Message-ID: <200604061802.20457.pgmdevlist@mailcan.com> > I would think axes 0 and 1 are the first, not the last two dimensions. We > can either change the documentation or change the defaults in the > oldnumeric. I would vote for the change in defaults because oldnumeric is > a compatibility module and should not introduce changes. So, change the default to: diagonal(a, offset=0, axis1=-2, axis2=-1) ? That'd make sense, I'm for that... From ndarray at mac.com Thu Apr 6 16:11:01 2006 From: ndarray at mac.com (Sasha) Date: Thu Apr 6 16:11:01 2006 Subject: [Numpy-discussion] New patch for MA In-Reply-To: <200603280427.52789.pgmdevlist@mailcan.com> References: <200603280427.52789.pgmdevlist@mailcan.com> Message-ID: I have applied the patch with minor modifications. See < http://projects.scipy.org/scipy/numpy/changeset/2331>. Here are a few suggestions for posting patches. 1. If you are using svn, please post output of "svn diff" in the project root directory (the directory that *contains* "numpy", not the "numpy" directory. 2. If appropriate, add unit tests to an existing file instead of creating a new one. (In case of ma, the correct file is test_ma.py). 3. If you follow recommendation #1, this will happen automatically, if you cannot use svn for some reason, concatenate the output of diff for code and test in the same patch file. Here are some topics for discussion. 1. I've initially implemented some ma array methods by wrapping existing module level functions. I am not sure this is the best approach to implement new methods. It is probably cleaner to implement them as methods and provide wrappers at the module level similar to oldnumeric. 2. I am not sure cumprod and cumsum should fill masked elements with 1 and 0. I would think the result should be masked if any prior element along the axis being accumulated is masked. To ignore masked elements, filled can be called explicitly before cum[prod|sum]. 
One of the problems with filling by default is that 1 or 0 are not appropriate values for object arrays (for example, "" is an appropriate fill value for cumsum of an array of strings). On 3/28/06, Pierre GM wrote: > > Folks, > You can find a new patch for MA on the wiki > > http://projects.scipy.org/scipy/numpy/attachment/wiki/MaskedArray/ma-200603280900.patch > along with a test suite. > The 'clip' method should now work with array arguments. Were also added > cumsum, cumprod, std, var and squeeze. > I'll deal with flags, setflags, setfield, dump and others when I'll have a > better idea of how it works -- which probably won't happen anytime soon, > as I > don't really have time to dig in the code for these functions. AAMOF, I'm > more interested in checking/patching some other aspects of numpy for MA > (eg, > mlab...) > Once again, please send me your comments and suggestions. > Thx for everything > P. > > > ------------------------------------------------------- > This SF.Net email is sponsored by xPML, a groundbreaking scripting > language > that extends applications into web and mobile media. Attend the live > webcast > and join the prime developer group breaking into this new coding > territory! > http://sel.as-us.falkag.net/sel?cmd=lnk&kid=110944&bid=241720&dat=121642 > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From michael.sorich at gmail.com Thu Apr 6 17:41:19 2006 From: michael.sorich at gmail.com (Michael Sorich) Date: Thu Apr 6 17:41:19 2006 Subject: [Numpy-discussion] New patch for MA In-Reply-To: References: <200603280427.52789.pgmdevlist@mailcan.com> Message-ID: <16761e100604061733r586cca6cr94d72c554b54fdd0@mail.gmail.com> On 4/7/06, Sasha wrote: > > > 2. I am not sure cumprod and cumsum should fill masked elements with 1 and > 0. I would think the result should be masked if any prior element along the > axis being accumulated is masked. To ignore masked elements, filled can be > called explicitly before cum[prod|sum]. One of the problems with filling by > default is that 1 or 0 are not appropriate values for object arrays (for > example, "" is an appropriate fill value for cumsum of an array of strings). > > There are often a number of options for how masked values can be dealt with. In general (not just with cum*), I would prefer for the result to be masked when masked values are involved unless I explicitly indicate what should be done with the masked values. Otherwise it is too easy to forget that some default maniputlation of masked values has been applied. In R there is commonly an na.action or na.rm parameter to functions. Michael -------------- next part -------------- An HTML attachment was scrubbed... URL: From pgmdevlist at mailcan.com Thu Apr 6 19:19:02 2006 From: pgmdevlist at mailcan.com (Pierre GM) Date: Thu Apr 6 19:19:02 2006 Subject: [Numpy-discussion] New patch for MA In-Reply-To: References: <200603280427.52789.pgmdevlist@mailcan.com> Message-ID: <200604062218.05876.pgmdevlist@mailcan.com> Sasha, Thanks for your advice with SVN. I'll make sure to use that method from now on. > 1. I've initially implemented some ma array methods by wrapping existing > module level functions. I am not sure this is the best approach to > implement new methods. 
It is probably cleaner to implement them as methods > and provide wrappers at the module level similar to oldnumeric. Well, I tried to stick to the latest convention, getting rid of the _wrapit part. Let me know. > > 2. I am not sure cumprod and cumsum should fill masked elements with 1 and > 0. Good point for the object/string arrays, yet other cases I overlooked (I'm still not used to object arrays, I'm now realizing they're quite useful). Actually, I coded that way because it's how I use these functions. But well, as many settings as users, eh? Michael's suggestion of introducing R-like options sounds interesting, but I wonder whether it would not be a bit heavy for methods, with the introduction of an extra flag. That'd be great for functions, though. So, for cumsum and cumprod methods, maybe we could stick to Sasha's and Michael's preference (mask all values after the first missing), and we would just have to create two functions. We could use the 4 R ones: na.omit, na.fail, na.pass, na.exclude. For our current problem (cumsum,cumprod) na.omit: would return the result I implemented (fill with 0 or 1) na.fail: would return masked values after the first missing na.exclude: would correspond to compressed().cumsum() ? I don't like that, it changes the initial length/size na.pass: I don't know... From ndarray at mac.com Thu Apr 6 21:14:01 2006 From: ndarray at mac.com (Sasha) Date: Thu Apr 6 21:14:01 2006 Subject: [Numpy-discussion] New patch for MA In-Reply-To: <16761e100604061733r586cca6cr94d72c554b54fdd0@mail.gmail.com> References: <200603280427.52789.pgmdevlist@mailcan.com> <16761e100604061733r586cca6cr94d72c554b54fdd0@mail.gmail.com> Message-ID: On 4/6/06, Michael Sorich wrote: > ... I would prefer for the result to be masked > when masked values are involved unless I explicitly indicate what should be > done with the masked values. ... This is the case in r2332: >>> from numpy.core.ma import * >>> print array([1,2,3], mask=[0,1,0]).cumsum() [1 -- --] From a.mcmorland at auckland.ac.nz Fri Apr 7 00:30:07 2006 From: a.mcmorland at auckland.ac.nz (Angus McMorland) Date: Fri Apr 7 00:30:07 2006 Subject: [Numpy-discussion] Newbie indexing question [fancy indexing in nD] In-Reply-To: <4434E522.3060101@ieee.org> References: <44338DF4.7050603@gmail.com> <4434E522.3060101@ieee.org> Message-ID: <4435F672.1040701@auckland.ac.nz> Hi again. Thanks, everyone, for your quick replies. Travis Oliphant wrote: > amcmorl wrote: > >> Hi all, >> >> I'm having a bit of trouble getting my head around numpy's indexing >> capabilities. A quick summary of the problem is that I want to >> lookup/index in nD from a second array of rank n+1, such that the last >> (or first, I guess) dimension contains the lookup co-ordinates for the >> value to extract from the first array. Here's a 2D (3,3) example: >> >> In [12]:print ar >> [[ 0.15 0.75 0.2 ] >> [ 0.82 0.5 0.77] >> [ 0.21 0.91 0.59]] >> >> In [24]:print inds >> [[[1 1] >> [1 1] >> [2 1]] >> >> [[2 2] >> [0 0] >> [1 0]] >> >> [[1 1] >> [0 0] >> [2 1]]] >> >> then somehow return the array (barring me making any row/column errors): >> In [26]: c = ar.somefancyindexingroutinehere(inds) > > You can do this with "fancy-indexing". Obviously it is going to take > some time for people to get used to this idea as none of the responses > yet suggest it. > But the following works. > c = ar[inds[...,0],inds[...,1]] > > gives the desired effect. > > Thus, your simple description c[x,y] = ar[inds[x,y,0],inds[x,y,1]] is a > text-book description of what fancy-indexing does. 
Great. Turns out I wasn't too far off then. I've written a quick function of my own that extends the fancy indexing to nD: def fancy_index_nd(ar, ind): evList = ['ar['] for i in range(len(ar.shape)): evList = evList + [' ind[...,%d]' % i] if i < len(ar.shape) - 1: evList = evList + [","] evList = evList + [' ]'] return eval(''.join(evList)) 1) Am I missing a simpler way to extend the fancy-indexing to n-dimensions? If not... 2) this seems (conceptually) that it might be a little faster than the routines that have to calculate a flat index. Hopefully it could be of use to people. Any thoughts? Cheers, Angus -- Angus McMorland email a.mcmorland at auckland.ac.nz mobile +64-21-155-4906 PhD Student, Neurophysiology / Multiphoton & Confocal Imaging Physiology, University of Auckland phone +64-9-3737-599 x89707 Armourer, Auckland University Fencing Secretary, Fencing North Inc. From pau.gargallo at gmail.com Fri Apr 7 02:37:05 2006 From: pau.gargallo at gmail.com (Pau Gargallo) Date: Fri Apr 7 02:37:05 2006 Subject: [Numpy-discussion] Newbie indexing question [fancy indexing in nD] In-Reply-To: <4435F672.1040701@auckland.ac.nz> References: <44338DF4.7050603@gmail.com> <4434E522.3060101@ieee.org> <4435F672.1040701@auckland.ac.nz> Message-ID: <6ef8f3380604070236m2d606983l82403cbc2305fefa@mail.gmail.com> you can do things like a[ list( ind[...,i] for i in range(.shape[-1]) ) ] if the indices could be accessed as ind[i] instead of ind[...,i] (transposing the indices array) then you could simply do: a[ list(ind) ] pau On 4/7/06, Angus McMorland wrote: > Hi again. > > Thanks, everyone, for your quick replies. > > Travis Oliphant wrote: > > amcmorl wrote: > > > >> Hi all, > >> > >> I'm having a bit of trouble getting my head around numpy's indexing > >> capabilities. A quick summary of the problem is that I want to > >> lookup/index in nD from a second array of rank n+1, such that the last > >> (or first, I guess) dimension contains the lookup co-ordinates for the > >> value to extract from the first array. Here's a 2D (3,3) example: > >> > >> In [12]:print ar > >> [[ 0.15 0.75 0.2 ] > >> [ 0.82 0.5 0.77] > >> [ 0.21 0.91 0.59]] > >> > >> In [24]:print inds > >> [[[1 1] > >> [1 1] > >> [2 1]] > >> > >> [[2 2] > >> [0 0] > >> [1 0]] > >> > >> [[1 1] > >> [0 0] > >> [2 1]]] > >> > >> then somehow return the array (barring me making any row/column errors): > >> In [26]: c = ar.somefancyindexingroutinehere(inds) > > > > You can do this with "fancy-indexing". Obviously it is going to take > > some time for people to get used to this idea as none of the responses > > yet suggest it. > > But the following works. > > c = ar[inds[...,0],inds[...,1]] > > > > gives the desired effect. > > > > Thus, your simple description c[x,y] = ar[inds[x,y,0],inds[x,y,1]] is a > > text-book description of what fancy-indexing does. > > Great. Turns out I wasn't too far off then. I've written a quick > function of my own that extends the fancy indexing to nD: > > def fancy_index_nd(ar, ind): > evList = ['ar['] > for i in range(len(ar.shape)): > evList = evList + [' ind[...,%d]' % i] > if i < len(ar.shape) - 1: > evList = evList + [","] > evList = evList + [' ]'] > return eval(''.join(evList)) > > 1) Am I missing a simpler way to extend the fancy-indexing to > n-dimensions? If not... > > 2) this seems (conceptually) that it might be a little faster than the > routines that have to calculate a flat index. Hopefully it could be of > use to people. Any thoughts? 
> > Cheers,
> > Angus
> --
> Angus McMorland
> email a.mcmorland at auckland.ac.nz
> mobile +64-21-155-4906
>
> PhD Student, Neurophysiology / Multiphoton & Confocal Imaging
> Physiology, University of Auckland
> phone +64-9-3737-599 x89707
>
> Armourer, Auckland University Fencing
> Secretary, Fencing North Inc.
>

From a.h.jaffe at gmail.com Fri Apr 7 06:54:09 2006
From: a.h.jaffe at gmail.com (Andrew Jaffe)
Date: Fri Apr 7 06:54:09 2006
Subject: [Numpy-discussion] Re: weird interaction: pickle, numpy, matplotlib.hist
In-Reply-To: <4434E31B.5030306@ieee.org>
References: <4433DF85.7030109@gmail.com> <4434E31B.5030306@ieee.org>
Message-ID: <44366E71.7060601@gmail.com>

Travis Oliphant wrote:
> But, this brings up the point that currently the pickled raw-data which
> is read-in as a string by Python is used as the memory for the new array
> (i.e. the string memory is "stolen"). This should work. The fact
> that it didn't with sort was a bug that is now fixed in SVN. However,
> operations on out-of-byte-order arrays will always be slower. Thus,
> perhaps on pickle read the data should be copied to native byte-order if
> necessary.

+1 from me, too.

I assume that byteswapping is fast compared to I/O in most cases, and
the only times when you wouldn't want it would be 'advanced' usage that
the developer could take control of via a custom reduce, __getstate__,
__setstate__, etc.

Andrew

______________________________________________________________________
Andrew Jaffe a.jaffe at imperial.ac.uk
Astrophysics Group +44 207 594-7526
Blackett Laboratory, Room 1013 FAX 7541
Imperial College, Prince Consort Road
London SW7 2AZ ENGLAND
http://astro.imperial.ac.uk/~jaffe

From ndarray at mac.com Fri Apr 7 10:26:06 2006
From: ndarray at mac.com (Sasha)
Date: Fri Apr 7 10:26:06 2006
Subject: [Numpy-discussion] Re: ndarray.fill and ma.array.filled
In-Reply-To:
References:
Message-ID:

I am posting a reply to my own post in a hope to generate some
discussion of the original proposal.

I am proposing to add a "filled" method to ndarray. This can be a
pass-through, an alias to "copy" or a method to replace nans or some
other type-specific values. This will allow code that uses "filled"
work on ndarrays without changes.

On 3/22/06, Sasha wrote:
>
> In an ideal world, any function that accepts ndarray would accept
> ma.array and vice versa.
Moreover, if the ma.array has no masked > elements and the same data as ndarray, the result should be the same. > Obviously current implementation falls short of this goal, but there > is one feature that seems to make this goal unachievable. > > This feature is the "filled" method of ma.array. Pydoc for this > method reports the following: > > | filled(self, fill_value=None) > | A numeric array with masked values filled. If fill_value is None, > | use self.fill_value(). > | > | If mask is nomask, copy data only if not contiguous. > | Result is always a contiguous, numeric array. > | # Is contiguous really necessary now? > > > That is not the best possible description ("filled" is "filled"), but > the essence is that the result of a.filled(value) is a contiguous > ndarray obtained from the masked array by copying non-masked elements > and using value for masked values. > > I would like to propose to add a "filled" method to ndarray. I see > several possibilities and would like to hear your opinion: > > 1. Make filled simply return self. > > 2. Make filled return a contiguous copy. > > 3. Make filled replace nans with the fill_value if array is of > floating point type. > > > Unfortunately, adding "filled" will result is a rather confusing > situation where "fill" and "filled" both exist and have very different > meanings. > > I would like to note that "fill" is a somewhat odd ndarray method. > AFAICT, it is the only non-special method that mutates the array. It > appears to be just a performance trick: the same result can be achived > with "a[...] = ". > -------------- next part -------------- An HTML attachment was scrubbed... URL: From webb.sprague at gmail.com Fri Apr 7 10:38:03 2006 From: webb.sprague at gmail.com (Webb Sprague) Date: Fri Apr 7 10:38:03 2006 Subject: [Numpy-discussion] Tiling / disk storage for matrix in numpy? Message-ID: Hi all, Is there a way in numpy to associate a (large) matrix with a disk file, then and tile and index it, then cache it as you process the various pieces? This is pretty important with massive image files, which can't fit into working memory, but in which (for example) you might be doing a convolution on a 100 x 100 pixel window on a small subset of the image. I know that caching algorithms are (1) complicated and (2) never general. But there you go. Perhaps I can't find it, perhaps it would be a good project for the future? If HDF or something does this already, could someone point me in the right direction? Thx From tim.hochberg at cox.net Fri Apr 7 11:22:05 2006 From: tim.hochberg at cox.net (Tim Hochberg) Date: Fri Apr 7 11:22:05 2006 Subject: [Numpy-discussion] Re: ndarray.fill and ma.array.filled In-Reply-To: References: Message-ID: <4436AE31.7000306@cox.net> Sasha wrote: > I am posting a reply to my own post in a hope to generate some > discussion of the original proposal. > > I am proposing to add a "filled" method to ndarray. This can be a > pass-through, an alias to "copy" or a method to replace nans or some > other type-specific values. This will allow code that uses "filled" > work on > ndarrays without changes. In general, I'm skeptical of adding more methods to the ndarray object -- there are plenty already. In addition, it appears that both the method and function versions of filled are "dangerous" in the sense that they sometimes return the array itself and sometimes a copy. Finally, changing ndarray to support masked array feels a bit like the tail wagging the dog. Let me throw out an alternative proposal. 
I will admit up front that this proposal is based on exactly zero
experience with masked array, so there may be some stupidities in it,
but perhaps it will lead to an alternative solution.

def asUnmaskedArray(obj, fill_value=None):
    # objects without a mask pass through untouched
    mask = getattr(obj, 'mask', False)
    if mask is False:
        return obj
    if fill_value is None:
        fill_value = obj.get_fill_value()
    # copy the underlying data and overwrite the masked slots
    newobj = obj.data().copy()
    newobj[mask] = fill_value
    return newobj

Or something like that anyway. This particular version should work on
any array as long as, if it exports a mask attribute, it also exports
get_fill_value and data. At least once any bugs are ironed out, I
haven't tested it. ma would have to be modified to use this instead of
using filled everywhere, but that seems more appropriate than tacking
on another method to ndarray IMO.

One advantage of this approach is that most array like objects that
don't subclass ndarray will work with this automagically. If we keep
expanding the methods of ndarray, it's harder and harder to implement
other array like objects since they have to implement more and more
methods, most of which are irrelevant to their particular case. The
more we can implement stuff like this in terms of some relatively
small set of core primitives, the happier we'll all be in the long
run.

This also builds on the idea of trying to push as much of the
array/view ambiguity into the asXXXArray corner.

Regards,

-tim

>
>
> On 3/22/06, *Sasha* > wrote:
>
> In an ideal world, any function that accepts ndarray would accept
> ma.array and vice versa. Moreover, if the ma.array has no masked
> elements and the same data as ndarray, the result should be the same.
> Obviously current implementation falls short of this goal, but there
> is one feature that seems to make this goal unachievable.
>
> This feature is the "filled" method of ma.array. Pydoc for this
> method reports the following:
>
> | filled(self, fill_value=None)
> | A numeric array with masked values filled. If fill_value is
> None,
> | use self.fill_value().
> |
> | If mask is nomask, copy data only if not contiguous.
> | Result is always a contiguous, numeric array.
> | # Is contiguous really necessary now?
>
>
> That is not the best possible description ("filled" is "filled"), but
> the essence is that the result of a.filled(value) is a contiguous
> ndarray obtained from the masked array by copying non-masked elements
> and using value for masked values.
>
> I would like to propose to add a "filled" method to ndarray. I see
> several possibilities and would like to hear your opinion:
>
> 1. Make filled simply return self.
>
> 2. Make filled return a contiguous copy.
>
> 3. Make filled replace nans with the fill_value if array is of
> floating point type.
>
>
> Unfortunately, adding "filled" will result is a rather confusing
> situation where "fill" and "filled" both exist and have very different
> meanings.
>
> I would like to note that "fill" is a somewhat odd ndarray method.
> AFAICT, it is the only non-special method that mutates the array. It
> appears to be just a performance trick: the same result can be
> achived
> with "a[...] = ".
>

From ndarray at mac.com Fri Apr 7 12:20:15 2006
From: ndarray at mac.com (Sasha)
Date: Fri Apr 7 12:20:15 2006
Subject: [Numpy-discussion] Re: ndarray.fill and ma.array.filled
In-Reply-To: <4436AE31.7000306@cox.net>
References: <4436AE31.7000306@cox.net>
Message-ID:

On 4/7/06, Tim Hochberg wrote:
>
> ...
> In general, I'm skeptical of adding more methods to the ndarray object
> -- there are plenty already.
I've also proposed to drop "fill" in favor of optimizing x[...] = . Having both "fill" and "filled" in the interface is plain awkward. You may like the combined proposal better because it does not change the total number of methods :-) In addition, it appears that both the method and function versions of > filled are "dangerous" in the sense that they sometimes return the array > itself and sometimes a copy. This is true in ma, but may certainly be changed. > Finally, changing ndarray to support masked array feels a bit like the > tail wagging the dog. I disagree. Numpy is pretty much alone among the array languages because it does not have "native" support for missing values. For the floating point types some rudimental support for nans exists, but is not really usable. There is no missing values machanism for integer types. I believe adding "filled" and maybe "mask" to ndarray (not necessarily under these names) could be a meaningful step towards "native" support for missing values. -------------- next part -------------- An HTML attachment was scrubbed... URL: From webb.sprague at gmail.com Fri Apr 7 12:36:00 2006 From: webb.sprague at gmail.com (Webb Sprague) Date: Fri Apr 7 12:36:00 2006 Subject: [Numpy-discussion] Silly array question Message-ID: In R, if you have an Nx2 array of integers, you can use that to index an TxS array, yielding a 1xN result. Is there a way to do that in numpy? I looked for a pairs function but I coudn't find it, vaguely remembering that might be around... I know it would be a trivial loop to write, but a numpy array function would be faster (I hope). Example I = [[0,0], [1,1], [2,2], [1,1]] M = [[1, 2, 3, 4], [5, 6, 7, 8], [9,10,11, 12], [13, 14, 15, 16]] M[I] = [1,6,11,6]. Thanks! From ndarray at mac.com Fri Apr 7 12:53:03 2006 From: ndarray at mac.com (Sasha) Date: Fri Apr 7 12:53:03 2006 Subject: [Numpy-discussion] Silly array question In-Reply-To: References: Message-ID: >>> M.ravel()[dot(I,(4,1))] array([ 1, 6, 11, 6]) On 4/7/06, Webb Sprague wrote: > > In R, if you have an Nx2 array of integers, you can use that to index > an TxS array, yielding a 1xN result. Is there a way to do that in > numpy? I looked for a pairs function but I coudn't find it, vaguely > remembering that might be around... I know it would be a trivial loop > to write, but a numpy array function would be faster (I hope). > > Example > > I = [[0,0], [1,1], [2,2], [1,1]] > M = [[1, 2, 3, 4], > [5, 6, 7, 8], > [9,10,11, 12], > [13, 14, 15, 16]] > > M[I] = [1,6,11,6]. > > Thanks! > > > ------------------------------------------------------- > This SF.Net email is sponsored by xPML, a groundbreaking scripting > language > that extends applications into web and mobile media. Attend the live > webcast > and join the prime developer group breaking into this new coding > territory! > http://sel.as-us.falkag.net/sel?cmdlnk&kid0944&bid$1720&dat1642 > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From efiring at hawaii.edu Fri Apr 7 13:22:06 2006 From: efiring at hawaii.edu (Eric Firing) Date: Fri Apr 7 13:22:06 2006 Subject: [Numpy-discussion] Re: ndarray.fill and ma.array.filled In-Reply-To: References: <4436AE31.7000306@cox.net> Message-ID: <4436C965.8020808@hawaii.edu> Sasha wrote: > > > On 4/7/06, *Tim Hochberg* > wrote: > > ... 
> In general, I'm skeptical of adding more methods to the ndarray object > -- there are plenty already. > > > I've also proposed to drop "fill" in favor of optimizing x[...] = > . Having both "fill" and "filled" in the interface is plain > awkward. You may like the combined proposal better because it does not > change the total number of methods :-) > > > In addition, it appears that both the method and function versions of > filled are "dangerous" in the sense that they sometimes return the > array > itself and sometimes a copy. > > > This is true in ma, but may certainly be changed. > > > Finally, changing ndarray to support masked array feels a bit like the > tail wagging the dog. > > > I disagree. Numpy is pretty much alone among the array languages because > it does not have "native" support for missing values. For the floating > point types some rudimental support for nans exists, but is not really > usable. There is no missing values machanism for integer types. I > believe adding "filled" and maybe "mask" to ndarray (not necessarily > under these names) could be a meaningful step towards "native" support > for missing values. I agree strongly with you, Sasha. I get the impression that the world of numerical computation is divided into those who work with idealized "data", where nothing is missing, and those who work with real observations, where there is always something missing. As an oceanographer, I am solidly in the latter category. If good support for missing values is not built in, it has to be bolted on, and it becomes clunky and awkward. I was reluctant to speak up about this earlier because I thought it was too much to ask of Travis when he was in the midst of putting numpy on solid ground. But I am delighted that missing value support has a champion among numpy developers, and I agree that now is the time to change it from "bolted on" to "integrated". Eric From Chris.Barker at noaa.gov Fri Apr 7 13:28:02 2006 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Fri Apr 7 13:28:02 2006 Subject: [Numpy-discussion] Silly array question In-Reply-To: References: Message-ID: <4436CB1C.3040308@noaa.gov> Webb Sprague wrote: > In R, if you have an Nx2 array of integers, you can use that to index > an TxS array, yielding a 1xN result. this seems to work: >>> import numpy as N >>> I = N.array([[0,0], [1,1], [2,2], [1,1]]) >>> I array([[0, 0], [1, 1], [2, 2], [1, 1]]) >>> M = N. array( [[1, 2, 3, 4], [5, 6, 7, 8], [9,10,11, 12], [13, 14, 15, 16]]) >>> M array([[ 1, 2, 3, 4], [ 5, 6, 7, 8], [ 9, 10, 11, 12], [13, 14, 15, 16]]) >>> M[I[:,0], I[:,1]] array([ 1, 6, 11, 6]) -- Christopher Barker, Ph.D. Oceanographer NOAA/OR&R/HAZMAT (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From ndarray at mac.com Fri Apr 7 13:56:02 2006 From: ndarray at mac.com (Sasha) Date: Fri Apr 7 13:56:02 2006 Subject: [Numpy-discussion] Silly array question In-Reply-To: <4436CB1C.3040308@noaa.gov> References: <4436CB1C.3040308@noaa.gov> Message-ID: One more obfuscated numpy entry: >>> M[tuple(transpose(I))] array([ 1, 6, 11, 6]) On 4/7/06, Christopher Barker wrote: > > > > Webb Sprague wrote: > > In R, if you have an Nx2 array of integers, you can use that to index > > an TxS array, yielding a 1xN result. > > this seems to work: > > >>> import numpy as N > >>> I = N.array([[0,0], [1,1], [2,2], [1,1]]) > >>> I > array([[0, 0], > [1, 1], > [2, 2], > [1, 1]]) > > >>> M = N. 
array( [[1, 2, 3, 4], [5, 6, 7, 8], [9,10,11, 12], [13, 14, > 15, 16]]) > > >>> M > array([[ 1, 2, 3, 4], > [ 5, 6, 7, 8], > [ 9, 10, 11, 12], > [13, 14, 15, 16]]) > > >>> M[I[:,0], I[:,1]] > array([ 1, 6, 11, 6]) > > -- > Christopher Barker, Ph.D. > Oceanographer > > NOAA/OR&R/HAZMAT (206) 526-6959 voice > 7600 Sand Point Way NE (206) 526-6329 fax > Seattle, WA 98115 (206) 526-6317 main reception > > Chris.Barker at noaa.gov > > > ------------------------------------------------------- > This SF.Net email is sponsored by xPML, a groundbreaking scripting > language > that extends applications into web and mobile media. Attend the live > webcast > and join the prime developer group breaking into this new coding > territory! > http://sel.as-us.falkag.net/sel?cmd=lnk&kid=110944&bid=241720&dat=121642 > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From webb.sprague at gmail.com Fri Apr 7 14:00:10 2006 From: webb.sprague at gmail.com (Webb Sprague) Date: Fri Apr 7 14:00:10 2006 Subject: [Numpy-discussion] Silly array question In-Reply-To: References: <4436CB1C.3040308@noaa.gov> Message-ID: I appreciate everyone's help, but is there a NON obfuscated way to do this without looping? I think Chris's is my favorite, but I didn't know I was starting a contest :) > >>> M[I[:,0], I[:,1]] > array([ 1, 6, 11, 6]) W From webb.sprague at gmail.com Fri Apr 7 14:05:04 2006 From: webb.sprague at gmail.com (Webb Sprague) Date: Fri Apr 7 14:05:04 2006 Subject: [Numpy-discussion] Silly array question In-Reply-To: References: <4436CB1C.3040308@noaa.gov> Message-ID: Ok, so now I get it M[(tuple for rows), (tuple for columns)] Whew On 4/7/06, Webb Sprague wrote: > I appreciate everyone's help, but is there a NON obfuscated way to do > this without looping? I think Chris's is my favorite, but I didn't > know I was starting a contest :) > > > >>> M[I[:,0], I[:,1]] > > array([ 1, 6, 11, 6]) > > W > From tim.hochberg at cox.net Fri Apr 7 14:16:06 2006 From: tim.hochberg at cox.net (Tim Hochberg) Date: Fri Apr 7 14:16:06 2006 Subject: [Numpy-discussion] Re: ndarray.fill and ma.array.filled In-Reply-To: <4436C965.8020808@hawaii.edu> References: <4436AE31.7000306@cox.net> <4436C965.8020808@hawaii.edu> Message-ID: <4436D6D1.6040302@cox.net> Eric Firing wrote: > Sasha wrote: > >> >> >> On 4/7/06, *Tim Hochberg* > > wrote: >> >> ... >> In general, I'm skeptical of adding more methods to the ndarray >> object >> -- there are plenty already. >> >> >> I've also proposed to drop "fill" in favor of optimizing x[...] = >> . Having both "fill" and "filled" in the interface is plain >> awkward. You may like the combined proposal better because it does >> not change the total number of methods :-) >> >> >> In addition, it appears that both the method and function >> versions of >> filled are "dangerous" in the sense that they sometimes return the >> array >> itself and sometimes a copy. >> >> >> This is true in ma, but may certainly be changed. >> >> >> Finally, changing ndarray to support masked array feels a bit >> like the >> tail wagging the dog. >> >> I disagree. Numpy is pretty much alone among the array languages >> because it does not have "native" support for missing values. For >> the floating point types some rudimental support for nans exists, >> but is not really usable. 
There is no missing values machanism for >> integer types. I believe adding "filled" and maybe "mask" to ndarray >> (not necessarily under these names) could be a meaningful step >> towards "native" support for missing values. > > > I agree strongly with you, Sasha. I get the impression that the world > of numerical computation is divided into those who work with idealized > "data", where nothing is missing, and those who work with real > observations, where there is always something missing. I think your experience is clouding your judgement here. Or at least this comes off as unnecessarily perjorative. There's a large class of people who work with data that doesn't have missing values either because of the nature of data acquisition or because they're doing simulations. I take zillions of measurements with digital oscillopscopes and they *never* have missing values. Clipped values, yes, but even if I somehow could queery the scope about which values were actually clipped or simply make an educated guess based on their value, the facilities of ma would be useless to me. The clipped values are what I would want in any case. I also do a lot of work with simulations derived from this and other data. I don't come across missing values here but again, if I did, the way ma works would not help me. I'd have to treat them either by rejecting the data outright or by some sort of interpolation. > As an oceanographer, I am solidly in the latter category. If good > support for missing values is not built in, it has to be bolted on, > and it becomes clunky and awkward. This may be a false dichotomy. It's certainly not obvious to me that this is so. At least if "bolted on" means "not adding a filled method to ndarray". > I was reluctant to speak up about this earlier because I thought it > was too much to ask of Travis when he was in the midst of putting > numpy on solid ground. But I am delighted that missing value support > has a champion among numpy developers, and I agree that now is the > time to change it from "bolted on" to "integrated". I have no objection to ma support improving. In fact I think it would be great although I don't forsee it helping me anytime soon. I also support Sasha's goal of being able to mix MaskedArrays and ndarrays reasonably seemlessly. However, I do think the situation needs more thought. Slapping filled and mask onto ndarray is the path of least resistance, but it's not clear that it's the best one. If we do decide we are going to add both of these methods to ndarray (with filled returning a copy!), then it may worth considering making ndarray a subclass of MaskedArray. Conceptually this makes sense, since at this point an ndarray will just be a MaskedArray where mask is always False. I think that they could share much of the implementation except that ndarray would be set up to use methods that ignored the mask attribute since they would know that it's always false. Even that might not be worth it, since the check for whether mask is True/False is just a pointer compare. It may in fact be best just to do away with MaskedArray entirely, moving the functionality into ndarray. That may have performance implications, although I don't seem them at the moment, and I don't know if there are other methods/attributes that this would imply need to be moved over, although it looks like just mask, filled and possibly filled_value, although the latter looks a little dubious to me. Either of the above two options would certainly improve the quality of MaskedArray. 
Copy for instance seems not to have been implemented, and who knows what other dark corners remain unexplored here. There's a whole spectrum of possibilities here from ones that don't intrude on ndarray at all to ones that profoundly change it. Sasha's suggestion looks like it's probably the simplest thing in the short term, but I don't know that it's the best long term solution. I think it needs more thought and discussion, which is after all what Sasha asked for ;) Regards, -tim From Chris.Barker at noaa.gov Fri Apr 7 15:13:02 2006 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Fri Apr 7 15:13:02 2006 Subject: [Numpy-discussion] Silly array question In-Reply-To: References: <4436CB1C.3040308@noaa.gov> Message-ID: <4436E3C9.2040807@noaa.gov> Sasha wrote: > One more obfuscated numpy entry: > >>>> M[tuple(transpose(I))] > array([ 1, 6, 11, 6]) exactly. Can anyone explain why that works, but: M[transpose(I)] or M[I] doesn't? -Chris - Christopher Barker, Ph.D. Oceanographer NOAA/OR&R/HAZMAT (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From efiring at hawaii.edu Fri Apr 7 15:37:03 2006 From: efiring at hawaii.edu (Eric Firing) Date: Fri Apr 7 15:37:03 2006 Subject: [Numpy-discussion] Re: ndarray.fill and ma.array.filled In-Reply-To: <4436D6D1.6040302@cox.net> References: <4436AE31.7000306@cox.net> <4436C965.8020808@hawaii.edu> <4436D6D1.6040302@cox.net> Message-ID: <4436E95B.4090009@hawaii.edu> Tim Hochberg wrote: > Eric Firing wrote: > >> Sasha wrote: >> >>> >>> >>> On 4/7/06, *Tim Hochberg* >> > wrote: >>> >>> ... >>> In general, I'm skeptical of adding more methods to the ndarray >>> object >>> -- there are plenty already. >>> >>> >>> I've also proposed to drop "fill" in favor of optimizing x[...] = >>> . Having both "fill" and "filled" in the interface is plain >>> awkward. You may like the combined proposal better because it does >>> not change the total number of methods :-) >>> >>> >>> In addition, it appears that both the method and function >>> versions of >>> filled are "dangerous" in the sense that they sometimes return the >>> array >>> itself and sometimes a copy. >>> >>> >>> This is true in ma, but may certainly be changed. >>> >>> >>> Finally, changing ndarray to support masked array feels a bit >>> like the >>> tail wagging the dog. >>> >>> I disagree. Numpy is pretty much alone among the array languages >>> because it does not have "native" support for missing values. For >>> the floating point types some rudimental support for nans exists, >>> but is not really usable. There is no missing values machanism for >>> integer types. I believe adding "filled" and maybe "mask" to ndarray >>> (not necessarily under these names) could be a meaningful step >>> towards "native" support for missing values. >> >> >> >> I agree strongly with you, Sasha. I get the impression that the world >> of numerical computation is divided into those who work with idealized >> "data", where nothing is missing, and those who work with real >> observations, where there is always something missing. > > > I think your experience is clouding your judgement here. Or at least > this comes off as unnecessarily perjorative. There's a large class of > people who work with data that doesn't have missing values either > because of the nature of data acquisition or because they're doing > simulations. I take zillions of measurements with digital oscillopscopes > and they *never* have missing values. 
Clipped values, yes, but even if I > somehow could queery the scope about which values were actually clipped > or simply make an educated guess based on their value, the facilities of > ma would be useless to me. The clipped values are what I would want in > any case. I also do a lot of work with simulations derived from this > and other data. I don't come across missing values here but again, if I > did, the way ma works would not help me. I'd have to treat them either > by rejecting the data outright or by some sort of interpolation. Tim, The point is well-taken, and I apologize. I stated my case badly. (I would be delighted if I did not have to be concerned with missing values-they are a pain regardless of how well a numerical package handles them.) > >> As an oceanographer, I am solidly in the latter category. If good >> support for missing values is not built in, it has to be bolted on, >> and it becomes clunky and awkward. > > > This may be a false dichotomy. It's certainly not obvious to me that > this is so. At least if "bolted on" means "not adding a filled method to > ndarray". I probably overstated it, but I think we actually agree. I intended to lend support to the priority of making missing-value support as seamless and painless as possible. It will help some people, and not others. > >> I was reluctant to speak up about this earlier because I thought it >> was too much to ask of Travis when he was in the midst of putting >> numpy on solid ground. But I am delighted that missing value support >> has a champion among numpy developers, and I agree that now is the >> time to change it from "bolted on" to "integrated". > > > > I have no objection to ma support improving. In fact I think it would be > great although I don't forsee it helping me anytime soon. I also support > Sasha's goal of being able to mix MaskedArrays and ndarrays reasonably > seemlessly. > > However, I do think the situation needs more thought. Slapping filled > and mask onto ndarray is the path of least resistance, but it's not > clear that it's the best one. > > If we do decide we are going to add both of these methods to ndarray > (with filled returning a copy!), then it may worth considering making > ndarray a subclass of MaskedArray. Conceptually this makes sense, since > at this point an ndarray will just be a MaskedArray where mask is always > False. I think that they could share much of the implementation except > that ndarray would be set up to use methods that ignored the mask > attribute since they would know that it's always false. Even that might > not be worth it, since the check for whether mask is True/False is just > a pointer compare. > > It may in fact be best just to do away with MaskedArray entirely, moving > the functionality into ndarray. That may have performance implications, > although I don't seem them at the moment, and I don't know if there are > other methods/attributes that this would imply need to be moved over, > although it looks like just mask, filled and possibly filled_value, > although the latter looks a little dubious to me. > This is exactly the option that I was afraid to bring up because I thought it might be too disruptive, and because I am not contributing to numpy, and probably don't have the competence (or time) to do so. > Either of the above two options would certainly improve the quality of > MaskedArray. Copy for instance seems not to have been implemented, and > who knows what other dark corners remain unexplored here. 
> > There's a whole spectrum of possibilities here from ones that don't > intrude on ndarray at all to ones that profoundly change it. Sasha's > suggestion looks like it's probably the simplest thing in the short > term, but I don't know that it's the best long term solution. I think it > needs more thought and discussion, which is after all what Sasha asked > for ;) Exactly! Thank you for broadening the discussion. Eric From ndarray at mac.com Fri Apr 7 15:38:04 2006 From: ndarray at mac.com (Sasha) Date: Fri Apr 7 15:38:04 2006 Subject: [Numpy-discussion] Re: ndarray.fill and ma.array.filled In-Reply-To: <4436D6D1.6040302@cox.net> References: <4436AE31.7000306@cox.net> <4436C965.8020808@hawaii.edu> <4436D6D1.6040302@cox.net> Message-ID: On 4/7/06, Tim Hochberg wrote: > [...] > > However, I do think the situation needs more thought. Slapping filled > and mask onto ndarray is the path of least resistance, but it's not > clear that it's the best one. Completely agree. I have many gripes about current ma implementation of both "filled" and "mask". filled: 1. I don't like default fill value. It should be mandatory to supply fill value. 2. It should return masked array (with trivial mask), not ndarray. 3. The name conflicts with the "fill" method. 4. View/Copy inconsistency. Does not provide a method to fill values in-place. mask: 1. I've got rid of mask returning None in favor of False_ (boolean array scalar), but it is still not perfect. I would prefer data.shape == mask.shape invariant and if space saving/performance is deemed necessary use zero-stride arrays. 2. I don't like the name. "Missing" or "na" would be better. > If we do decide we are going to add both of these methods to ndarray > (with filled returning a copy!), then it may worth considering making > ndarray a subclass of MaskedArray. Conceptually this makes sense, since > at this point an ndarray will just be a MaskedArray where mask is always > False. I think that they could share much of the implementation except > that ndarray would be set up to use methods that ignored the mask > attribute since they would know that it's always false. Even that might > not be worth it, since the check for whether mask is True/False is just > a pointer compare. > The tail becoming the dog! Yet I agree, this makes sense from the implementation point of view. From OOP perspective this would make sense if arrays were immutable, but since mask is settable in MaskedArray, making it constant in the subclass will violate the substitution principle. I would not object making mask read only, however. > It may in fact be best just to do away with MaskedArray entirely, moving > the functionality into ndarray. That may have performance implications, > although I don't seem them at the moment, and I don't know if there are > other methods/attributes that this would imply need to be moved over, > although it looks like just mask, filled and possibly filled_value, > although the latter looks a little dubious to me. > I think MA can coexist with ndarray and share the interface. Ndarray can use special bit-patterns like IEEE NaN to indicate missing floating point values. Add-on modules can redefine arithmetic to make INT_MIN behave as a missing marker for signed integers (R, K and J (I think) languages use this approach). Applications that need missing values support across the board will use MA. > Either of the above two options would certainly improve the quality of > MaskedArray. 
Copy for instance seems not to have been implemented, and > who knows what other dark corners remain unexplored here. > More (corners) than you want to know about! Reimplementing MA in C would be a worthwhile goal (and what you suggest seems to require just that), but it is too big of a project. I suggest that we focus on the interface first. If existing MA interface is rejected (which is likely) for ndarray, we can easily experiment with the alternatives within MA, which is pure python. > There's a whole spectrum of possibilities here from ones that don't > intrude on ndarray at all to ones that profoundly change it. Sasha's > suggestion looks like it's probably the simplest thing in the short > term, but I don't know that it's the best long term solution. I think it > needs more thought and discussion, which is after all what Sasha asked > for ;) Exactly! From robert.kern at gmail.com Fri Apr 7 15:39:02 2006 From: robert.kern at gmail.com (Robert Kern) Date: Fri Apr 7 15:39:02 2006 Subject: [Numpy-discussion] Re: Silly array question In-Reply-To: <4436E3C9.2040807@noaa.gov> References: <4436CB1C.3040308@noaa.gov> <4436E3C9.2040807@noaa.gov> Message-ID: Christopher Barker wrote: > Sasha wrote: > >> One more obfuscated numpy entry: >> >>>>> M[tuple(transpose(I))] >> >> array([ 1, 6, 11, 6]) > > exactly. Can anyone explain why that works, but: > > M[transpose(I)] > > or > M[I] > > doesn't? There's some typechecking going on in __getitem__. Tuples are presumed to mean that each item in the tuple is indexing on a different axis. Non-tuples are presumed to be fancy array-indexing. -- Robert Kern robert.kern at gmail.com "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From pgmdevlist at mailcan.com Fri Apr 7 15:54:01 2006 From: pgmdevlist at mailcan.com (Pierre GM) Date: Fri Apr 7 15:54:01 2006 Subject: [Numpy-discussion] Re: ndarray.fill and ma.array.filled In-Reply-To: <4436D6D1.6040302@cox.net> References: <4436C965.8020808@hawaii.edu> <4436D6D1.6040302@cox.net> Message-ID: <200604071844.37724.pgmdevlist@mailcan.com> Folks, I'm more or less in Eric's field (hydrology), and we do have to deal with missing values, that we can't interpolate straightforwardly (that is, without some dark statistical magic). Purely discarding the data is not an option either. MA fills the need, most of it. I think one of the issues is what is meant by 'masked data': - a missing observation ? - a NAN ? - a data we don't want to consider at one particular point ? For the last point, think about raster maps or bitmaps: calculations should be performed on a chunk of data, the initial data left untouched, and the result should both have the same size as the original, and valid only on the initial chunk. The current MA implementation, with its _data part and is _mask part, works nicely for the 3rd point. - I wonder whether implementing a 'filled' method for ndarrays is really better than letting the user create a MaskedArray, where the NANs are masked.In any case, a 'filled' method should always return a copy, as it's no longer the initial data. - I'm not sure what to do with the idea of making ndarray a subclass of MA . One on side, Tim pointed rightly that a ndarray is just a MA with a 'False' mask. Actually, I'm a bit frustrated with the standard 'asarray' that shows up in many functions. 
I'd prefer something like "if the argument is a non-numpy sequence (tuples,lists), transforming it in a ndarray, but if it's already a ndarray or a MA, leave it as it is. Don't touch the mask if present". That's how MA.asarray works, but unfortunately the std "asarray" gets rid of the mask (and you end up with something which is not what you'd expect). A 'mask=False' attribute in ndarray would be nice. On another, some methods/functions make sense only on unmasked ndarray (FFT, solving equations), some others are a bit tricky to implement (diff ? median...). Some exception could be raised if the arguments of these functions return True with ismasked (cf below), or that could be simplified if 'mask' was a default attribute of numarrays. I regularly have to use a ismasked function (cf below). def ismasked(a): if hasattr(a,'mask'): return a.mask.any() else: return False We're going towards MA as the default object. But then again, what would be the behavior to deal with missing values ? Using R-like na.actions ? That'd be great, but it's getting more complex. Oh, and another thing: if 'mask', or 'masked' becomes a default attribute of ndarrays, how do we define a mask? As a boolean ndarray whose 'mask' is always 'False' ? How do you __repr__ it ? - I agree that 'filled_value' is not very useful. If I want to fill an array, I'm happy to specify what value I want it filled with. In facts, I'd be happier to specifiy 'values'. I often have to work with 2D arrays, each column representing a different variable. If this array has to be filled, I'd like each column to be filled with one particular value, not necessarily the same along all columns: something like column_stack([A[:,k].filled(filler[k]) for k in range(A.shape[1])]) with filler a 1xA.shape[1] array of filling values. Of course, we could imagine the same thing for rows, or higher dimensions... Sorry for the rants... From pgmdevlist at mailcan.com Fri Apr 7 16:13:02 2006 From: pgmdevlist at mailcan.com (Pierre GM) Date: Fri Apr 7 16:13:02 2006 Subject: [Numpy-discussion] Re: ndarray.fill and ma.array.filled In-Reply-To: References: <4436D6D1.6040302@cox.net> Message-ID: <200604071914.44752.pgmdevlist@mailcan.com> > filled: > 1. I don't like default fill value. It should be mandatory to > supply fill value. +1 > 2. It should return masked array (with trivial mask), not ndarray. -1. Unless 'mask/missing/na' becomes a default in ndarray, and other basic ndarray functions know how to deal with MA seamlessly > 3. The name conflicts with the "fill" method. fillmask ? clog ? > 4. View/Copy inconsistency. Does not provide a method to fill values > in-place. But once again, I don't think it should be the default behaviour ! A filled array should always be a copy of the initial array. Changing in place means changing the initial data, and I foresee lots of fun to find the original back. No ctrl+Z. > mask: > > 1. I've got rid of mask returning None in favor of False_ (boolean > array scalar), but it is still not perfect. I would prefer data.shape > == mask.shape invariant and if space saving/performance is deemed > necessary use zero-stride arrays. You,lost me on the strides, but I agree with data.shape==mask.shape as a std > 2. I don't like the name. "Missing" or "na" would be better. Once again, it's a point of view. Masked data also means 'data that I don't wanna see now, but that I may want to see later'. Like masking an bitmap/raster area. +0 for na, no for missing. > I would not object making mask read only, however. Good point. 
but I was more and more thinking of the opposite. I have a set of data that I group in three classes. Plotting one class is straightforward, I just have to mask the other two. Do I really want/need three objects for the same data ? Can't I just save three masks, and then run a data[mask] ? > If existing MA interface is rejected (which is > likely) for ndarray, we can easily experiment with the alternatives > within MA, which is pure python. Er... How many of us are using MA on a regular basis ? Aren't we a minority ? It'd seem wiser to adapt MA to numpy, in Python (but maybe that's the XIXe French integration model I grew up with that makes me talk here...) From ndarray at mac.com Fri Apr 7 16:31:03 2006 From: ndarray at mac.com (Sasha) Date: Fri Apr 7 16:31:03 2006 Subject: [Numpy-discussion] Re: ndarray.fill and ma.array.filled In-Reply-To: <200604071844.37724.pgmdevlist@mailcan.com> References: <4436C965.8020808@hawaii.edu> <4436D6D1.6040302@cox.net> <200604071844.37724.pgmdevlist@mailcan.com> Message-ID: On 4/7/06, Pierre GM wrote: > ... > We're going towards MA as the default object. > I will be against changing the array structure to handle missing values. Let's keep the discussion focuced on the interface. Once we agree on the interface, it will be clear if any structural changes are necessary. > But then again, what would be the behavior to deal with missing values ? We can postpone this discussion as well. Just add mask attribute that returns False and filled method that returns a copy is an example of a minimalistic change. > Using R-like na.actions ? That'd be great, but it's getting more complex. > I don't like na.actions. I think missing values should behave like IEEE NaNs and in the floating point case should be represented by NaNs. The functionality provided by na.actions can always be achieved by calling an extra function (filled or compress). > Oh, and another thing: if 'mask', or 'masked' becomes a default attribute of > ndarrays, how do we define a mask? As a boolean ndarray whose 'mask' is > always 'False' ? How do you __repr__ it ? > See above. For ndarray mask is always False unless an add-on module is loaded that redefines arithmetic to recognize special bit-patterns such as NaN or INT_MIN. From tim.hochberg at cox.net Fri Apr 7 17:09:11 2006 From: tim.hochberg at cox.net (Tim Hochberg) Date: Fri Apr 7 17:09:11 2006 Subject: [Numpy-discussion] Re: ndarray.fill and ma.array.filled In-Reply-To: References: <4436AE31.7000306@cox.net> <4436C965.8020808@hawaii.edu> <4436D6D1.6040302@cox.net> Message-ID: <4436FF73.7080408@cox.net> Sasha wrote: >On 4/7/06, Tim Hochberg wrote: > > >>[...] >> >>However, I do think the situation needs more thought. Slapping filled >>and mask onto ndarray is the path of least resistance, but it's not >>clear that it's the best one. >> >> > >Completely agree. I have many gripes about current ma implementation >of both "filled" and "mask". > >filled: > >1. I don't like default fill value. It should be mandatory to >supply fill value. > > That makes perfect sense. If anything should have a default fill value, it's the functsion calling filled, not the arrays themselves. >2. It should return masked array (with trivial mask), not ndarray. > > So, just with mask = False? In a follow on message Pierre disagress and claims that what you really want is the ndarray since not everything will accept. Then I guess you'd need to call b.filled(fill).data. I agree with Sasha in principle but Pierre, perhaps in practice. 
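(A tiny session may make the two positions concrete; the first line is
how numpy.core.ma behaves today, the comment sketches the hypothetical
alternative under discussion:)

import numpy.core.ma as MA

b = MA.array([1, 2, 3], mask=[0, 1, 0])
nd = b.filled(0)    # today: a plain ndarray, array([1, 0, 3])

# Under Sasha's proposal (hypothetical), filled would instead return a
# masked array with a trivial mask, and code that needs the bare
# ndarray -- Pierre's concern -- would read b.filled(0).data instead.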
I'm almost suggested it get renames a.asndarray(fill), except that asXXX has the wrong conotations. I think this one needs to bounce around some more. >3. The name conflicts with the "fill" method. > > I thought you wanted to kill that. I'd certainly support that. Can't we just special case __setitem__ for that one case so that the performance is just as good if performance is really the issue? >4. View/Copy inconsistency. Does not provide a method to fill values in-place. > > b[b.mask] = fill_value; b.unmask() seems to work for this purpose. Can we just have filled return a copy? >mask: > >1. I've got rid of mask returning None in favor of False_ (boolean >array scalar), but it is still not perfect. I would prefer data.shape >== mask.shape invariant and if space saving/performance is deemed >necessary use zero-stride arrays. > > Interesting idea. Is that feasible yet? >2. I don't like the name. "Missing" or "na" would be better. > > I'm not on board here, although really I'd like to here from other people who use the package. 'na' seems to cryptic to me and 'missing' to specific -- there might be other reasons to mask a value other it being missing. The problem with mask is that it's not clear whether True means the data is useful or unuseful. Keep throwing out names, maybe one will stick. > > >>If we do decide we are going to add both of these methods to ndarray >>(with filled returning a copy!), then it may worth considering making >>ndarray a subclass of MaskedArray. Conceptually this makes sense, since >>at this point an ndarray will just be a MaskedArray where mask is always >>False. I think that they could share much of the implementation except >>that ndarray would be set up to use methods that ignored the mask >>attribute since they would know that it's always false. Even that might >>not be worth it, since the check for whether mask is True/False is just >>a pointer compare. >> >> >> > >The tail becoming the dog! Yet I agree, this makes sense from the >implementation point of view. From OOP perspective this would make >sense if arrays were immutable, but since mask is settable in >MaskedArray, making it constant in the subclass will violate the >substitution principle. I would not object making mask read only, >however. > > How do you set the mask? I keep getting attribute errors when I try it. And unmask would be a noop on an ndarray. > > >>It may in fact be best just to do away with MaskedArray entirely, moving >>the functionality into ndarray. That may have performance implications, >>although I don't seem them at the moment, and I don't know if there are >>other methods/attributes that this would imply need to be moved over, >>although it looks like just mask, filled and possibly filled_value, >>although the latter looks a little dubious to me. >> >> >> >I think MA can coexist with ndarray and share the interface. Ndarray >can use special bit-patterns like IEEE NaN to indicate missing >floating point values. Add-on modules can redefine arithmetic to make >INT_MIN behave as a missing marker for signed integers (R, K and J (I >think) languages use this approach). Applications that need missing >values support across the board will use MA. > > > > >>Either of the above two options would certainly improve the quality of >>MaskedArray. Copy for instance seems not to have been implemented, and >>who knows what other dark corners remain unexplored here. >> >> >> >More (corners) than you want to know about! 
Reimplementing MA in C >would be a worthwhile goal (and what you suggest seems to require just >that), but it is too big of a project. I suggest that we focus on the >interface first. If existing MA interface is rejected (which is >likely) for ndarray, we can easily experiment with the alternatives >within MA, which is pure python. > > Perhaps MaskedArray should inherit from ndarray for the time being. Many of the methods would need to reimplemented anyway, but it would make asanyarray work. Someone was just complaining about asarray munging his arrays. That's correct behaviour, but it would be nice if asanyarray did the right thing. I suppose we could just special case asanyarray to ignore MaskedArrays, that might be better since it's less constraining from an implementation side too. >>There's a whole spectrum of possibilities here from ones that don't >>intrude on ndarray at all to ones that profoundly change it. Sasha's >>suggestion looks like it's probably the simplest thing in the short >>term, but I don't know that it's the best long term solution. I think it >>needs more thought and discussion, which is after all what Sasha asked >>for ;) >> >> > >Exactly! > > This may be an oportune time to propose something that's been cooking in the back of my head for a week or so now: A stripped down array superclass. The details of this are not at all locked down, but here's a strawman proposal. We add an array superclass. call it basearray, that has the same C-structure as the existing ndarray. However, it has *no* methods or attributes. It's simply a big blob of data. Functions that work on the C structure of arrays (ufuncs, etc) would still work on this arrays, as would asarray, so it could be converted to an ndarray as necessary. In addition, we would supply a minimal set of functions that would operate on this object. These functions would be chosen so that the current array interface could be implemented on top of them and the basearray object in pure python. These functions would be things like set_shape(a, shape), etc. They would be segregated off in their own namespace, not in the numpy core. [Note that I'm not proposing we actually implement ndarray this way, just that we make is possible]. This leads to several useful outcomes. 1. If we're careful, this could be the basic array object that we propose, at least for the first roun,d for inclusion in the Python core. It's not useful for anything but passing data betwen various application that understand the data structure, but that in itself could be a huge win. And the fact that it's dirt simple would probably be an advantage to getting it into the core. 2. It provides a useful marker class. MA could inherit from it (and use itself for it's data attribute) and then asanyarray would behave properly. MA could also use this, or a subclass, as the mask object preventing anyone from accidentally using it as data (they could always use it on purpose with asarray). 3. It provides a platform for people to build other, ndarray-like classes in Pure python. This is my main interest. I've put together a thin shell over numpy that strips it down to it's abolute essentials including a stripped down version of ndarray that removes most of the methods. All of the __array_wrap__[1] stuff works quite well most of the time, but there's still some issues with being a subclass when this particular class is conceptually a superclass. If we had an array superclass of some sort, I believe that these would be resolved. 
In principle at least, this shouldn't be that hard. I think it should mostly be rearanging some code and adding some wrappers to existing functions. That's in principle. In practice, I'm not certain yet as I haven't investigated the code in question in much depth yet. I've been meaning to write this up into a more fleshed out proposal, but I got distracted by the whole Protocol discussion on python-dev3000. This writeup is pretty weak, but hopefully you get the idea. Anyway, this is somethig that I would be willing to put some time on that would benefit both me and probably the MA folks as well. Regards, -tim From efiring at hawaii.edu Fri Apr 7 17:27:09 2006 From: efiring at hawaii.edu (Eric Firing) Date: Fri Apr 7 17:27:09 2006 Subject: [Numpy-discussion] Re: ndarray.fill and ma.array.filled In-Reply-To: <4436FF73.7080408@cox.net> References: <4436AE31.7000306@cox.net> <4436C965.8020808@hawaii.edu> <4436D6D1.6040302@cox.net> <4436FF73.7080408@cox.net> Message-ID: <44370328.2060508@hawaii.edu> Tim Hochberg wrote: [...] > >> 2. I don't like the name. "Missing" or "na" would be better. >> >> > I'm not on board here, although really I'd like to here from other > people who use the package. 'na' seems to cryptic to me and 'missing' to > specific -- there might be other reasons to mask a value other it being > missing. The problem with mask is that it's not clear whether > True means the data is useful or unuseful. Keep throwing out names, > maybe one will stick. "hide" or "hidden"? A mask value of True essentially hides the underlying value. Eric From ndarray at mac.com Fri Apr 7 17:56:24 2006 From: ndarray at mac.com (Sasha) Date: Fri Apr 7 17:56:24 2006 Subject: [Numpy-discussion] Re: ndarray.fill and ma.array.filled In-Reply-To: <4436FF73.7080408@cox.net> References: <4436AE31.7000306@cox.net> <4436C965.8020808@hawaii.edu> <4436D6D1.6040302@cox.net> <4436FF73.7080408@cox.net> Message-ID: On 4/7/06, Tim Hochberg wrote: > [...] > Perhaps MaskedArray should inherit from ndarray for the time being. Many > of the methods would need to reimplemented anyway, but it would make > asanyarray work. Someone was just complaining about asarray munging his > arrays. That's correct behaviour, but it would be nice if asanyarray did > the right thing. I suppose we could just special case asanyarray to > ignore MaskedArrays, that might be better since it's less constraining > from an implementation side too. > Just for the record. Currently MA does not inherit from ndarray. There are some benefits to be gained from changing MA design from containment to inheritance, by I am very sceptical about the use of inheritance in the array setting. > > > This may be an oportune time to propose something that's been cooking in > the back of my head for a week or so now: A stripped down array > superclass. This is a very worthwhile idea and I hate to see it burried in a non-descriptive thread. I've copied your proposal to the wiki at . From tim.hochberg at cox.net Fri Apr 7 18:44:02 2006 From: tim.hochberg at cox.net (Tim Hochberg) Date: Fri Apr 7 18:44:02 2006 Subject: [Numpy-discussion] Re: ndarray.fill and ma.array.filled In-Reply-To: References: <4436AE31.7000306@cox.net> <4436C965.8020808@hawaii.edu> <4436D6D1.6040302@cox.net> <4436FF73.7080408@cox.net> Message-ID: <44371593.8060806@cox.net> Sasha wrote: >On 4/7/06, Tim Hochberg wrote: > > >>[...] >>Perhaps MaskedArray should inherit from ndarray for the time being. Many >>of the methods would need to reimplemented anyway, but it would make >>asanyarray work. 
Someone was just complaining about asarray munging his >>arrays. That's correct behaviour, but it would be nice if asanyarray did >>the right thing. I suppose we could just special case asanyarray to >>ignore MaskedArrays, that might be better since it's less constraining >>from an implementation side too. >> >> >> >Just for the record. Currently MA does not inherit from ndarray. > > Right, I checked that. That's why asanyarray won't work now with MA (unless someone changed the implementation of that while I wan't looking. >There are some benefits to be gained from changing MA design from >containment to inheritance, by I am very sceptical about the use of >inheritance in the array setting. > > That's probably a sensible position. Still it would be nice to have asanyarray pass masked arrays through somehow. I haven't thought this through very well, but I wonder if it would make sense for asanyarray to pass any object that supplies __array__. I'm leary of special casing asanyarray just for MA; somehow that seems the wrong approach. >>This may be an oportune time to propose something that's been cooking in >>the back of my head for a week or so now: A stripped down array >>superclass. >> >> > >This is a very worthwhile idea and I hate to see it burried in a >non-descriptive thread. I've copied your proposal to the wiki at >. > > Thanks for doing that. I'm glad you like the general idea. I do plan to write it through and try to get a better handle on what this would entail and what the consequences would be. However, I'm not sure exactly when I'll get around to it so it's probably better that a rough draft be out there for people to think about in the interim. -tim > > > From ndarray at mac.com Fri Apr 7 18:47:09 2006 From: ndarray at mac.com (Sasha) Date: Fri Apr 7 18:47:09 2006 Subject: [Numpy-discussion] Re: ndarray.fill and ma.array.filled In-Reply-To: <4436FF73.7080408@cox.net> References: <4436AE31.7000306@cox.net> <4436C965.8020808@hawaii.edu> <4436D6D1.6040302@cox.net> <4436FF73.7080408@cox.net> Message-ID: On 4/7/06, Tim Hochberg wrote: > [...] > >1. I don't like default fill value. It should be mandatory to > >supply fill value. > > > > > That makes perfect sense. If anything should have a default fill value, > it's the functsion calling filled, not the arrays themselves. > It looks like we are getting close to a consensus on this one. I will remove fill_value attribute. [...] > >3. The name conflicts with the "fill" method. > > > > > I thought you wanted to kill that. I'd certainly support that. Can't we > just special case __setitem__ for that one case so that the performance > is just as good if performance is really the issue? > I'll propose a patch. > >4. View/Copy inconsistency. Does not provide a method to fill values in-place. > > > > > b[b.mask] = fill_value; b.unmask() > > seems to work for this purpose. Can we just have filled return a copy? > +1 > >mask: > > > >1. I've got rid of mask returning None in favor of False_ (boolean > >array scalar), but it is still not perfect. I would prefer data.shape > >== mask.shape invariant and if space saving/performance is deemed > >necessary use zero-stride arrays. > > > > > Interesting idea. Is that feasible yet? > It is not feasible in pure python module like ma, but easy in ndarray. We can also reset the writeable flag to avoid various problems that zero strides may cause. I'll propose a patch. > >2. I don't like the name. "Missing" or "na" would be better. 
> > > > > I'm not on board here, although really I'd like to hear from other > people who use the package. 'na' seems too cryptic to me and 'missing' too > specific -- there might be other reasons to mask a value other than it being > missing. The problem with mask is that it's not clear whether > True means the data is useful or unuseful. Keep throwing out names, > maybe one will stick. > The problem with the "mask" name is that ndarray already has an unrelated "putmask" method. On the other hand putmask is redundant with fancy indexing. I have no other problem with the "mask" name, so we may just decide to get rid of "putmask". > [...] > How do you set the mask? I keep getting attribute errors when I try it. a[i] = masked makes the i-th element masked. If mask is an array, you can just set its elements. > And unmask would be a noop on an ndarray. > Yes. [...] From ndarray at mac.com Fri Apr 7 18:56:01 2006 From: ndarray at mac.com (Sasha) Date: Fri Apr 7 18:56:01 2006 Subject: [Numpy-discussion] Re: ndarray.fill and ma.array.filled In-Reply-To: <44371593.8060806@cox.net> References: <4436AE31.7000306@cox.net> <4436C965.8020808@hawaii.edu> <4436D6D1.6040302@cox.net> <4436FF73.7080408@cox.net> <44371593.8060806@cox.net> Message-ID: On 4/7/06, Tim Hochberg wrote: > [...] > Still it would be nice to have asanyarray pass masked arrays through > somehow. I haven't thought this through very well, but I wonder if it > would make sense for asanyarray to pass any object that supplies > __array__. I'm leery of special casing asanyarray just for MA; somehow > that seems the wrong approach. One possibility is to make asanyarray pass through objects that have an __array_wrap__ attribute. From pgmdevlist at mailcan.com Fri Apr 7 20:40:03 2006 From: pgmdevlist at mailcan.com (Pierre GM) Date: Fri Apr 7 20:40:03 2006 Subject: [Numpy-discussion] Re: ndarray.fill and ma.array.filled In-Reply-To: <4436FF73.7080408@cox.net> References: <4436FF73.7080408@cox.net> Message-ID: <200604072258.34153.pgmdevlist@mailcan.com> > >2. It should return masked array (with trivial mask), not ndarray. > > So, just with mask = False? In a follow-on message Pierre disagrees and > claims that what you really want is the ndarray, since not everything > will accept one. Then I guess you'd need to call b.filled(fill).data. I > agree with Sasha in principle but Pierre, perhaps, in practice. Well, if 'mask' became a default argument of ndarray, that wouldn't be a problem any longer. I'm quite for that. > I > almost suggested it get renamed a.asndarray(fill), except that asXXX has > the wrong connotations. I think this one needs to bounce around some more. tondarray(fill) ? > >4. View/Copy inconsistency. Does not provide a method to fill values > > in-place. > seems to work for this purpose. Can we just have filled return a copy? Yes ! > > The problem with mask is that it's not clear whether > > True means the data is useful or unuseful. I have to think twice every time I want to create a mask, since True means in fact that I don't want the data, whereas True selects the data for ndarray... > "hide" or "hidden"? A mask value of True essentially hides the > underlying value. Unless there's no underlying value ;). Rose, rose... I'm happy with mask, it reminds me of GRASS and gimp > The problem with the "mask" name is that ndarray already has an unrelated > "putmask" method. On the other hand putmask is redundant with fancy > indexing. I have no other problem with the "mask" name, so we may just > decide to get rid of "putmask".
"putmask" really seems overkill indeed. I wouldn't miss it. > How do you set the mask? I keep getting attribute errors when I try it. > And unmask would be a noop on an ndarray. I've implemented something like that for some classes (inheriting from MA.MaskedArray). Never really used it yet, though #-------------------------------------------- def applymask(self,m): if not MA.is_mask(m): raise MA.MAError,"Invalid mask !" elif self._data.shape != m.shape: raise MA.MAError,"Mask and data not compatible." else: self._dmask = m > This may be an oportune time to propose something that's been cooking in > the back of my head for a week or so now: A stripped down array > superclass. That'd be great indeed, and may solve some problems reported on th list about subclassing ndarray. AAMOF, I gave up trying to use ndarray as a superclass, and rely only on MA From zdm105 at tom.com Sat Apr 8 01:56:02 2006 From: zdm105 at tom.com (=?GB2312?B?NNTCMTUtMTbJz7qjLzIxLTIyye7b2g==?=) Date: Sat Apr 8 01:56:02 2006 Subject: [Numpy-discussion] =?GB2312?B?QUTUy9PDRVhDRUy02b34ytCzodOqz/rT67LGzvG53MDt?= Message-ID: An HTML attachment was scrubbed... URL: From webb.sprague at gmail.com Sat Apr 8 20:02:11 2006 From: webb.sprague at gmail.com (Webb Sprague) Date: Sat Apr 8 20:02:11 2006 Subject: [Numpy-discussion] Unexpected change of array used to index another array Message-ID: Hi. I indexed an 10 x 10(called bigM below) with another array (OFFS_TMP below). I suppose because OFFS_TMP has negative numbers, it was changed to cycle around to 9 wherever there is a negative 1 (which is the forward version of -1 if you are a 10 x 10 matrix). You can analogous behavior with -2 => 8, etc. Is changing the indexing matrix really the correct behavior? The result of using the index seems to be fine. Has this story been told already and I didn't know it? Below is my ipython session. 
In [57]: OFFS_TMP
Out[57]:
array([[-1,  1],
       [ 0,  1],
       [ 1,  1],
       [-1,  0],
       [ 0,  0],
       [ 1,  0],
       [-1, -1],
       [ 0, -1],
       [ 1, -1]])

In [58]: bigM[OFFS_TMP]
Out[58]:
array([[[False, True, False, False, True, False, True, True, True, False],
        [False, True, False, True, True, False, False, False, True, True]],
       [[True, False, True, False, True, True, False, False, False, True],
        [False, True, False, True, True, False, False, False, True, True]],
       [[False, True, False, True, True, False, False, False, True, True],
        [False, True, False, True, True, False, False, False, True, True]],
       [[False, True, False, False, True, False, True, True, True, False],
        [True, False, True, False, True, True, False, False, False, True]],
       [[True, False, True, False, True, True, False, False, False, True],
        [True, False, True, False, True, True, False, False, False, True]],
       [[False, True, False, True, True, False, False, False, True, True],
        [True, False, True, False, True, True, False, False, False, True]],
       [[False, True, False, False, True, False, True, True, True, False],
        [False, True, False, False, True, False, True, True, True, False]],
       [[True, False, True, False, True, True, False, False, False, True],
        [False, True, False, False, True, False, True, True, True, False]],
       [[False, True, False, True, True, False, False, False, True, True],
        [False, True, False, False, True, False, True, True, True, False]]], dtype=bool)

In [59]: OFFS_TMP
Out[59]:
array([[9, 1],
       [0, 1],
       [1, 1],
       [9, 0],
       [0, 0],
       [1, 0],
       [9, 9],
       [0, 9],
       [1, 9]])

From robert.kern at gmail.com Sat Apr 8 21:17:28 2006 From: robert.kern at gmail.com (Robert Kern) Date: Sat Apr 8 21:17:28 2006 Subject: [Numpy-discussion] Re: Unexpected change of array used to index another array In-Reply-To: References: Message-ID: Webb Sprague wrote: > Hi. > > I indexed a 10 x 10 array (called bigM below) with another array (OFFS_TMP > below). I suppose because OFFS_TMP has negative numbers, it was > changed to cycle around to 9 wherever there is a -1 (which is > the forward version of -1 if you are a 10 x 10 matrix). You can > see analogous behavior with -2 => 8, etc. Is changing the indexing matrix > really the correct behavior? The result of using the index seems to > be fine. Has this story been told already and I didn't know it? I think it's a bug. I've located the problem, but I'm not familiar with that part of the code so I'm not entirely sure how to go about fixing it. http://projects.scipy.org/scipy/numpy/ticket/49 -- Robert Kern robert.kern at gmail.com "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco
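A defensive sketch of the workaround for the behavior Webb observed, on builds where the bug is present: index with a throwaway copy so the offsets array survives intact. The names and shapes here are illustrative only, and current NumPy no longer mutates index arrays:

import numpy as np

bigM = np.zeros((10, 10), dtype=bool)
OFFS = np.array([[-1, 1], [0, 1], [1, 1]])

# Index through a copy so any in-place wraparound hits the copy,
# not the original offsets.
result = bigM[OFFS.copy()]
assert (OFFS == [[-1, 1], [0, 1], [1, 1]]).all()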
From webb.sprague at gmail.com Sun Apr 9 15:21:01 2006 From: webb.sprague at gmail.com (Webb Sprague) Date: Sun Apr 9 15:21:01 2006 Subject: [Numpy-discussion] numpy.floor() is supposed to return an int, but returns a float Message-ID: Could someone explain this behavior:

In [13]: type(N.floor(1))
Out[13]: <type 'numpy.float64'>

In [14]: N.floor?
Type:        ufunc
String Form: <ufunc 'floor'>
Namespace:   Interactive
Docstring:
    y = floor(x) elementwise largest integer <= x

I wouldn't complain, except the only time I use floor() is to make indices (dividing ages by age widths, for example). Thanks! From tim.hochberg at cox.net Sun Apr 9 15:30:02 2006 From: tim.hochberg at cox.net (Tim Hochberg) Date: Sun Apr 9 15:30:02 2006 Subject: [Numpy-discussion] numpy.floor() is supposed to return an int, but returns a float In-Reply-To: References: Message-ID: <44398AFD.4050304@cox.net> Webb Sprague wrote: >Could someone explain this behavior: > >In [13]: type(N.floor(1)) >Out[13]: <type 'numpy.float64'> > >In [14]: N.floor? >Type: ufunc >String Form: <ufunc 'floor'> >Namespace: Interactive >Docstring: > y = floor(x) elementwise largest integer <= x > >I wouldn't complain, except the only time I use floor() is to make >indices (dividing ages by age widths, for example). > > Well, floor returns an integer, but not an int -- it's an integral floating point value. What you want is: numpy.floor(1).astype(int) (If you're only using scalars, you might also consider int(floor(x)) instead.) Regards, -tim >Thanks! > > From webb.sprague at gmail.com Sun Apr 9 15:40:02 2006 From: webb.sprague at gmail.com (Webb Sprague) Date: Sun Apr 9 15:40:02 2006 Subject: [Numpy-discussion] numpy.floor() is supposed to return an int, but returns a float In-Reply-To: <44398AFD.4050304@cox.net> References: <44398AFD.4050304@cox.net> Message-ID: I think the docstring implies that numpy.floor() returns an integer value. One can cast the float value to a usable integer value, but either the docstring should read something different or the function should be changed (my preference). "y = floor(x) elementwise largest integer <= x" is the docstring. As far as "integral valued float" versus "integer", this distinction seems a little obscure... I am sure the difference is very important in some contexts, but I for one think that floor should return a straight-up integer, if just for code style (see example below). Plus it will be upcast to a float whenever necessary, so floor(4.5) + .75 == 4.75 whether floor() returns an int or a float. fooMatrix[numpy.floor(age/ageWidth)] is better (easier to type, read, and debug) than fooMatrix[numpy.floor(age/ageWidth).astype(int)] If there is an explanation as to why an integral valued float is a better return value, I would be interested in a link.
Thx W From robert.kern at gmail.com Sun Apr 9 15:46:04 2006 From: robert.kern at gmail.com (Robert Kern) Date: Sun Apr 9 15:46:04 2006 Subject: [Numpy-discussion] Re: numpy.floor() is supposed to return an int, but returns a float In-Reply-To: References: <44398AFD.4050304@cox.net> Message-ID: Webb Sprague wrote: > If there is an explanation as to why an integral valued float is a > better return value, I would be interested in a link.

In [4]: import numpy

In [5]: numpy.floor(2.**50)
Out[5]: 1125899906842624.0

In [6]: numpy.floor(2.**50).astype(int)
Out[6]: 2147483647

-- Robert Kern robert.kern at gmail.com "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From tim.hochberg at cox.net Sun Apr 9 16:07:02 2006 From: tim.hochberg at cox.net (Tim Hochberg) Date: Sun Apr 9 16:07:02 2006 Subject: [Numpy-discussion] numpy.floor() is supposed to return an int, but returns a float In-Reply-To: References: <44398AFD.4050304@cox.net> Message-ID: <443993E3.1090901@cox.net> Webb Sprague wrote: >I think the docstring implies that numpy.floor() returns an integer >value. > You've been programming too much! Everywhere but the computer programming world, 1.0 is an integer. Even there, many (most?) computer languages avoid the term integer, using int, Int or something similar. The distinction made between ints and integral floating point values is mostly an artificial one resulting from implementation issues. Making this distinction is also a handy, if imperfect, proxy for exact versus inexact numbers. >One can cast the float value to a usable integer value, but >either the docstring should read something different or the function >should be changed (my preference). > >"y = floor(x) elementwise largest integer <= x" is the docstring. > >As far as "integral valued float" versus "integer", this distinction >seems a little obscure... > An integral floating point value *is* an integer, just ask any 12 year old. What's obscure is the way concepts of integers and reals get mapped to ints and floats. Don't get me wrong, these are reasonable compromises given the sad reality that computers are not so hot at representing infinite quantities. However, we get sucked into thinking that integers and ints are really the same things at our peril. Similarly for floats and reals. > I am sure the difference is very important >in some contexts, but I for one think that floor should return a >straight-up integer, > It's a ufunc. Ufuncs in general return the same type that they operate on. So, not only would this be difficult, it would make the signature of ufuncs harder to remember. Also, as Robert Kern just pointed out, not all integral FP values can be represented as ints. > if just for code style (see example below). Plus >it will be upcast to a float whenever necessary, so floor(4.5) + .75 >== 4.75 whether floor() returns an int or a float. > > Not every two-line Python function has to come pre-written -- Tim Peters on C.L.P

def webbsfloor(x):
    return numpy.floor(x).astype(int)

>fooMatrix[numpy.floor(age/ageWidth)] > >is better (easier to type, read, and debug) than > >fooMatrix[numpy.floor(age/ageWidth).astype(int)] > >If there is an explanation as to why an integral valued float is a >better return value, I would be interested in a link. > > I think there are at least four reasons: 1. It would be a pain. 2. It would make the ufuncs inconsistent. 3.
It's a thin wrapper over C's floor, so people coming from that language would be confused. 4. It wouldn't work for numbers with very large magnitudes. Pick any three. Regards, -tim From tim.hochberg at cox.net Sun Apr 9 20:09:03 2006 From: tim.hochberg at cox.net (Tim Hochberg) Date: Sun Apr 9 20:09:03 2006 Subject: [Numpy-discussion] numpy.floor() is supposed to return an int, but returns a float In-Reply-To: <443993E3.1090901@cox.net> References: <44398AFD.4050304@cox.net> <443993E3.1090901@cox.net> Message-ID: <4439CC7E.90704@cox.net> Tim Hochberg wrote: > Webb Sprague wrote: > >> I think the docstring implies that numpy.floor() returns an integer >> value. > > You've been programming too much! > > Everywhere but the computer programming world, 1.0 is an integer. Even > there, many (most?) computer languages avoid the term integer, using > int, Int or something similar. The distinction made between ints and > integral floating point values is mostly an artificial one resulting > from implementation issues. Making this distinction is also a handy, > if imperfect, proxy for exact versus inexact numbers. > >> One can cast the float value to a usable integer value, but >> either the docstring should read something different or the function >> should be changed (my preference). >> >> "y = floor(x) elementwise largest integer <= x" is the docstring. > Let me just add that, since this seems to cause confusion, it would be appropriate to amend the docstring to be explicit that this always returns an integral floating point value. If someone wants to suggest wording, I can figure out where to put it. One possibility is: "y = floor(x) elementwise largest integer <= x; note that the result is a floating point value" or "y = floor(x) elementwise largest integral float <= x" Neither of those is great, but perhaps they'll inspire someone to do better. -tim >> >> As far as "integral valued float" versus "integer", this distinction >> seems a little obscure... >> > An integral floating point value *is* an integer, just ask any 12 year > old. What's obscure is the way concepts of integers and reals get > mapped to ints and floats. Don't get me wrong, these are reasonable > compromises given the sad reality that computers are not so hot at > representing infinite quantities. However, we get sucked into > thinking that integers and ints are really the same things at our > peril. Similarly for floats and reals. > >> I am sure the difference is very important >> in some contexts, but I for one think that floor should return a >> straight-up integer, >> > It's a ufunc. Ufuncs in general return the same type that they operate > on. So, not only would this be difficult, it would make the signature > of ufuncs harder to remember. > > Also, as Robert Kern just pointed out, not all integral FP values can > be represented as ints. > >> if just for code style (see example below). Plus >> it will be upcast to a float whenever necessary, so floor(4.5) + .75 >> == 4.75 whether floor() returns an int or a float. >> >> > Not every two-line Python function has to come pre-written -- Tim > Peters on C.L.P > > def webbsfloor(x): > return numpy.floor(x).astype(int) > >> fooMatrix[numpy.floor(age/ageWidth)] >> >> is better (easier to type, read, and debug) than >> >> fooMatrix[numpy.floor(age/ageWidth).astype(int)] >> >> If there is an explanation as to why an integral valued float is a >> better return value, I would be interested in a link. >> >> > I think there are at least four reasons: > > 1. It would be a pain. > 2.
It would make the ufuncs inconsistent. > 3. It's a thin wrapper over C's floor, so people coming from that > language would be confused. > 4. It wouldn't work for numbers with very large magnitudes. > > Pick any three. > > > Regards, > > -tim > From charlesr.harris at gmail.com Sun Apr 9 22:12:02 2006 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sun Apr 9 22:12:02 2006 Subject: [Numpy-discussion] numpy.floor() is supposed to return an int, but returns a float In-Reply-To: <4439CC7E.90704@cox.net> References: <44398AFD.4050304@cox.net> <443993E3.1090901@cox.net> <4439CC7E.90704@cox.net> Message-ID: Tim, On 4/9/06, Tim Hochberg wrote: > Let me just add that, since this seems to cause confusion, it would be > appropriate to amend the docstring to be explicit that this always > returns an integral floating point value. If someone wants to suggest > wording, I can figure out where to put it. One possibility is: > > "y = floor(x) elementwise largest integer <= x; note that the result > is a floating point value" > > or > > "y = floor(x) elementwise largest integral float <= x" How about, "for each item in x returns the largest integral float <= item." Chuck P.S. I too once found the C definition of the floor function annoying, but I got used to it. Sorta like getting used to a broken leg. The main problem is that the result can't be used as an index without conversion to a "real" integer. Integers aren't members of the reals (or rationals): apart from +/- 1, integers don't have inverses. There happens to be an injective ring homomorphism of the integers into the reals, but that is not the same thing. On the other hand, ints are generally not big enough to hold all of the integral doubles, so as a practical matter the originators made the best choice. Things do get a bit weird for large floats because above a certain threshold floats are already integral values. From charlesr.harris at gmail.com Sun Apr 9 22:21:02 2006 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sun Apr 9 22:21:02 2006 Subject: [Numpy-discussion] Unexpected change of array used to index another array In-Reply-To: References: Message-ID: On 4/8/06, Webb Sprague wrote: > > Hi. > > I indexed a 10 x 10 array (called bigM below) with another array (OFFS_TMP > below). I suppose because OFFS_TMP has negative numbers, it was > changed to cycle around to 9 wherever there is a -1 (which is > the forward version of -1 if you are a 10 x 10 matrix). You can > see analogous behavior with -2 => 8, etc. Is changing the indexing matrix > really the correct behavior? The result of using the index seems to > be fine. Has this story been told already and I didn't know it? It's the python way:

>>> a = [1,2,3]
>>> a[-1]
3

It gives a convenient way to index from the end of the array. But I'm not sure that was your question. Chuck From robert.kern at gmail.com Mon Apr 10 00:02:01 2006 From: robert.kern at gmail.com (Robert Kern) Date: Mon Apr 10 00:02:01 2006 Subject: [Numpy-discussion] Re: Unexpected change of array used to index another array In-Reply-To: References: Message-ID: Charles R Harris wrote: > > On 4/8/06, *Webb Sprague* > wrote: > > Hi. > > I indexed a 10 x 10 array (called bigM below) with another array (OFFS_TMP > below).
I suppose because OFFS_TMP has negative numbers, it was > changed to cycle around to 9 wherever there is a -1 (which is > the forward version of -1 if you are a 10 x 10 matrix). You can > see analogous behavior with -2 => 8, etc. Is changing the indexing matrix > really the correct behavior? The result of using the index seems to > be fine. Has this story been told already and I didn't know it? > > It's the python way: > >>>> a = [1,2,3] >>>> a[-1] > 3 > > It gives a convenient way to index from the end of the array. But I'm > not sure that was your question. That's not the issue. The problem was that the index array was being modified in-place simply by being used as an index array. -- Robert Kern robert.kern at gmail.com "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From arnd.baecker at web.de Mon Apr 10 04:01:05 2006 From: arnd.baecker at web.de (Arnd Baecker) Date: Mon Apr 10 04:01:05 2006 Subject: [Numpy-discussion] Speed up function on cross product of two sets? In-Reply-To: <4434D6DF.2020306@ieee.org> References: <44315633.4010600@cox.net> <4434D6DF.2020306@ieee.org> Message-ID: On Thu, 6 Apr 2006, Travis Oliphant wrote: > Arnd Baecker wrote: > > BTW, it seems that we have no Numeric to numpy transition remarks in > > www.scipy.org. I only found > > http://www.scipy.org/PearuPeterson/NumpyVersusNumeric > > and of course Travis' "Guide to NumPy" contains a detailed list of > > necessary changes in chapter 2.6.1. > > > For clarification: this is in the sample chapter available on-line to > all.... yes, I should have emphasized that. I tried to make this also clearer at http://www.scipy.org/Converting_from_Numeric > > In addition ``site-packages/numpy/lib/convertcode.py`` provides an > > automatic conversion. > > > > Would it be helpful to start a new wiki page "ConvertingFromNumeric" > > (similar to http://www.scipy.org/Converting_from_numarray) > > which aims at summarizing the necessary changes > > or expand Pearu's page (if he agrees) on this? > > > > Absolutely. I did the Numarray page because I'd written a lot on > Converting from Numeric (even providing convertcode.py) but very little > for numarray --- except the ndimage conversion. So, I started the > Numarray page. Sounds like a great idea to have a dual page. Best, Arnd P.S.: BTW +1 to all that has been said in the other thread on NumPy documentation - you are really doing a brilliant job, Travis!!! From webb.sprague at gmail.com Mon Apr 10 07:16:04 2006 From: webb.sprague at gmail.com (Webb Sprague) Date: Mon Apr 10 07:16:04 2006 Subject: [Numpy-discussion] Unexpected change of array used to index another array In-Reply-To: References: Message-ID: > > It's the python way: > > >>> a = [1,2,3] > >>> a[-1] > 3 > > It gives a convenient way to index from the end of the array. But I'm not > sure that was your question. No, there was a bug: when using one matrix to index another, the indexing matrix gets changed. As if you did

>>> i = -1
>>> a = [1,2,3]
>>> a[i]
3
>>> print i
2

I know about the negative trick in simple python lists, I was trying to do something with matrices (where it works too), but that wasn't the issue. Thanks for trying to help, though.
W From webb.sprague at gmail.com Mon Apr 10 07:19:22 2006 From: webb.sprague at gmail.com (Webb Sprague) Date: Mon Apr 10 07:19:22 2006 Subject: [Numpy-discussion] numpy.floor() is supposed to return an int, but returns a float In-Reply-To: References: <44398AFD.4050304@cox.net> <443993E3.1090901@cox.net> <4439CC7E.90704@cox.net> Message-ID: > > "y = floor(x) elementwise largest integer <= x; note that the result > > is a floating point value" I prefer this, if it makes any difference. The others are more succinct, but less likely to help others in my situation. > I too once found the C definition of the floor function annoying, but I got > used to it. Sorta like getting used to a broken leg. Annoying yes, crippling no. I guess I should have grown up on a real programming language :) From tim.hochberg at cox.net Mon Apr 10 09:13:03 2006 From: tim.hochberg at cox.net (Tim Hochberg) Date: Mon Apr 10 09:13:03 2006 Subject: [Numpy-discussion] numpy.floor() is supposed to return an int, but returns a float In-Reply-To: References: <44398AFD.4050304@cox.net> <443993E3.1090901@cox.net> <4439CC7E.90704@cox.net> Message-ID: <443A844C.7070306@cox.net> Charles R Harris wrote: > Tim, > > On 4/9/06, *Tim Hochberg* > wrote: > > Let me just add that, since this seems to cause confusion, it would be > appropriate to amend the docstring to be explicit that this always > returns an integral floating point value. If someone wants to suggest > wording, I can figure out where to put it. One possibility is: > > "y = floor(x) elementwise largest integer <= x; note that the > result > is a floating point value" > > or > > "y = floor(x) elementwise largest integral float <= x" > > > How about, "for each item in x returns the largest integral float <= > item." That seems pretty good. I'll wait a day or so and see what else shows up. > > Chuck > > P.S. > > I too once found the C definition of the floor function annoying, but > I got used to it. Sorta like getting used to a broken leg. The main > problem is that the result can't be used as an index without > conversion to a "real" integer. Integers aren't members of the reals > (or rationals): apart from +/- 1, integers don't have inverses. > There happens to be an injective ring homomorphism of the integers > into the reals, but that is not the same thing. I'm not conversant with the terminology [here I rummage through google to try to get the terminology sort of right], but as I understand it integers (I) are a subset of reals (R). The ring that you construct with integers consists of the set of integers plus the operations of addition/subtraction and multiplication as well as an identity. I've seen that specified as something like (I, +/-, *, 0). Similarly, the set of reals (R) and the field that one constructs from them are not really the same thing. So while the ring of integers is not a subset of the field of reals (the statement doesn't even make sense when put that way), the set of integers is a subset of the set of reals. I think that most people, outside of computer programmers and perhaps math majors, think of the set of integers, not the ring of integers, to the extent that they think about integers and reals at all. I imagine most people would conjure up some Dali-like image when confronted with the notion of a ring of integers! (C-int, +/-, *, 0) actually forms a finite ring, which is not at all the same thing as the ring of integers. Bit twiddlers tend to understand and even exploit this, but a lot of people conflate the ring of ints with the ring of integers.
Bit twiddlers tend to understand and even exploit this, but a lot of people conflate the field of ints with the field of integers. This works fine as long as your values are small in magnitude, but eventually will rise up and bite you. Floats are even worse, since they don't even form a field, I think they're actually a semiring because of INF/NAN/IND, but I'm not certain about that. Issues with floating point pop up everywhere and if you squint the right way, you can blame them on their lack of fieldness. Which is closely tied to their finite range and precision, which is what bites people. Because Python automatically promotes (Python) ints to (Python) longs, Python ints map, for most puposes, onto the field of integers. However, in numpy wer're stuck using C-ints for performance reasons, so we'd be wise to keep the differences between ints and integers in the back of our mind. This is wandering rather far afield (although it's entertaining). > On the other hand, ints are generally not big enough to hold all of > the integral doubles, so as a practical matter the originators made > the best choice. Things do get a bit weird for large floats because > above a certain threshold floats are already integral values. Another issue at the moment is that integer division does an implicit flooring or truncation (I believe it's implementation dependant in C) in both C and Python, so if you aren't using floor to produce an index, something I've been known to do, having it return an integer could also lead to nasty suprises. For example: def half_integer(x): "return nearest half integer below x" return floor(2*x) / 2 Would start failing mysteriously. Of course the above is an overflow magnet, so perhaps it's not the best example. Eventually, '/' is going to mean true_division and '//' will mean floor_division, so this particular issue will go away. Regards, -tim > > > From bsouthey at gmail.com Mon Apr 10 09:16:08 2006 From: bsouthey at gmail.com (Bruce Southey) Date: Mon Apr 10 09:16:08 2006 Subject: [Numpy-discussion] Re: ndarray.fill and ma.array.filled In-Reply-To: References: <4436C965.8020808@hawaii.edu> <4436D6D1.6040302@cox.net> <200604071844.37724.pgmdevlist@mailcan.com> Message-ID: Hi, On 4/7/06, Sasha wrote: > On 4/7/06, Pierre GM wrote: > > ... > > We're going towards MA as the default object. > > > I will be against changing the array structure to handle missing > values. Let's keep the discussion focuced on the interface. Once we > agree on the interface, it will be clear if any structural changes are > necessary. > > > > But then again, what would be the behavior to deal with missing values ? > > We can postpone this discussion as well. Just add mask attribute that > returns False and filled method that returns a copy is an example of a > minimalistic change. I think that the usage of MA is important because this often dictates the interface. The other aspect is the penalty that is imposed by requiring a masked features especially to situations that don't need any of these features. > > > Using R-like na.actions ? That'd be great, but it's getting more complex. > > > > I don't like na.actions. I think missing values should behave like > IEEE NaNs and in the floating point case should be represented by > NaNs. I think the issue related to how masked values should be handled in computation. Does it matter if the result of an operation is due to a masked value or numerical problem (like dividing by zero)? (I am presuming that it is possible to identify this difference.) 
If not, then I support the idea of treating masked values as NaN. >The functionality provided by na.actions can always be achieved > by calling an extra function (filled or compress). I am not clear on what you actually mean here. For example, if you are summing across a particular dimension, I would presume that any masked value would be ignored and that there would be some record of the fact that a masked value was encountered. This would allow that 'extra function' to handle the associated result. Alternatively the 'extra function' would have to be included as an argument - which is what the na.actions do. Regards Bruce From ndarray at mac.com Mon Apr 10 09:49:05 2006 From: ndarray at mac.com (Sasha) Date: Mon Apr 10 09:49:05 2006 Subject: [Numpy-discussion] Re: ndarray.fill and ma.array.filled In-Reply-To: References: <4436C965.8020808@hawaii.edu> <4436D6D1.6040302@cox.net> <200604071844.37724.pgmdevlist@mailcan.com> Message-ID: On 4/10/06, Bruce Southey wrote: > > [...] > I think the issue is related to how masked values should be handled in > computation. Does it matter if the result of an operation is due to a > masked value or a numerical problem (like dividing by zero)? (I am > presuming that it is possible to identify this difference.) If not, > then I support the idea of treating masked values as NaN. > The IEEE standard provides plenty of spare bits in NaNs to represent pretty much everything, and some languages take advantage of that feature. (I believe NA and NaN are distinct in R.) In MA, however, mask elements are boolean and no distinction is made between the various reasons for not having a data element. For consistency, a non-trivial (not always false) implementation of ndarray.mask should return "not finite" and ignore the bits that distinguish NaNs and infinities. > >The functionality provided by na.actions can always be achieved > > by calling an extra function (filled or compress). > > I am not clear on what you actually mean here. For example, if you > are summing across a particular dimension, I would presume that any > masked value would be ignored and that there would be some record of > the fact that a masked value was encountered. This would allow that > 'extra function' to handle the associated result. Alternatively the > 'extra function' would have to be included as an argument - which is > what the na.actions do. > If you sum along a particular dimension and encounter a masked value, the result is masked. The same is true if you encounter a NaN - the result is NaN. If you would like to ignore masked values, you write a.filled(0).sum() instead of a.sum(). In the 1d case, you can also use a.compress().sum(). In other words, what in R you achieve with a flag, such as in sum(a, na.rm=TRUE), in numpy you achieve with an explicit call to "fill". This is not quite the same as na.actions in R, but that is what I had in mind. From pgmdevlist at mailcan.com Mon Apr 10 10:58:02 2006 From: pgmdevlist at mailcan.com (Pierre GM) Date: Mon Apr 10 10:58:02 2006 Subject: [Numpy-discussion] Re: ndarray.fill and ma.array.filled In-Reply-To: References: Message-ID: <200604101356.44903.pgmdevlist@mailcan.com> > If you sum along a particular dimension and encounter a masked value, > the result is masked.
That's not how it currently works (still on 0.9.6):

x=arange(12).reshape(3,4)
MA.masked_where((x%5==0) | (x%3==0),x).sum(0)
array(data = [12 1 2 18], mask = [False False False False], fill_value=999999)

and frankly, I'd be quite frustrated if it had to change: - `filled` is not an ndarray method, which means that a.filled(0).sum() fails if a is not MA. Right now, I can use a.sum() without having to check the nature of a first. - this behavior was already in Numeric - All my scripts rely on it (but I guess that's my problem) - The current way reflects how masks are used in GIS or image processing. > If you would like to ignore masked values, you write > a.filled(0).sum() instead of a.sum(). In the 1d case, you can also use > a.compress().sum(). Once again, Sasha, I'd agree with you if it weren't such a major difference > In other words, what in R you achieve with a > flag, such as in sum(a, na.rm=TRUE), in numpy you achieve with an > explicit call to "fill". This is not quite the same as na.actions in > R, but that is what I had in mind. I kinda like the idea of a flag, though From ndarray at mac.com Mon Apr 10 11:37:00 2006 From: ndarray at mac.com (Sasha) Date: Mon Apr 10 11:37:00 2006 Subject: [Numpy-discussion] Re: ndarray.fill and ma.array.filled In-Reply-To: <200604101356.44903.pgmdevlist@mailcan.com> References: <200604101356.44903.pgmdevlist@mailcan.com> Message-ID: On 4/10/06, Pierre GM wrote: > > If you sum along a particular dimension and encounter a masked value, > > the result is masked. > > That's not how it currently works (still on 0.9.6): > > [... longish example snipped ...]

>>> ma.array([1,1], mask=[0,1]).sum()
1

> and frankly, I'd be quite frustrated if it had to change: > - `filled` is not an ndarray method, which means that a.filled(0).sum() fails > if a is not MA. Right now, I can use a.sum() without having to check the > nature of a first. This is exactly the point of the current discussion: make filled a method of ndarray. With the current behavior, how would you achieve masking (no fill) in a.sum()? > - this behavior was already in Numeric That's true, but it makes the result of sum(a) different from __builtins__.sum(a). I believe consistency with the python conventions is more important than with legacy Numeric in the long run. > [...] > - The current way reflects how masks are used in GIS or image processing. > Can you elaborate on this? Note that in R na.rm is false by default in sum: > sum(c(1,NA)) [1] NA So it looks like the convention is different in the field of statistics. > > If you would like to ignore masked values, you write > > a.filled(0).sum() instead of a.sum(). In the 1d case, you can also use > > a.compress().sum(). > > Once again, Sasha, I'd agree with you if it weren't such a major difference Array methods are a very recent addition to ma. We can still use this window of opportunity to get things right before too many people get used to the wrong behavior. (Note that I changed your implementation of cumsum and cumprod.) > > In other words, what in R you achieve with a > > flag, such as in sum(a, na.rm=TRUE), in numpy you achieve with an > > explicit call to "fill". This is not quite the same as na.actions in > > R, but that is what I had in mind. > > I kinda like the idea of a flag, though With the flag approach, making ndarray and ma.array interfaces consistent would require adding an extra argument to many methods. Instead, I propose to add one method, filled, to ndarray.
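A minimal sketch of the semantics Sasha is proposing, written here as a free function since ndarray itself never did grow such a method; the spellings follow today's numpy.ma (the package under discussion was then called MA), and the helper name is purely illustrative:

import numpy as np
import numpy.ma as ma

def filled(a, fill_value):
    # Proposed semantics: replace masked entries, and pass plain
    # arrays through (as a copy, so the result is always safe to mutate).
    if isinstance(a, ma.MaskedArray):
        return a.filled(fill_value)
    return np.array(a)

x = ma.array([1, 1], mask=[False, True])
print(filled(x, 0).sum())                 # 1: the masked addend counts as 0
print(filled(np.array([1, 1]), 0).sum())  # 2: a no-op for a plain ndarray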
From pgmdevlist at mailcan.com Mon Apr 10 13:37:07 2006 From: pgmdevlist at mailcan.com (Pierre GM) Date: Mon Apr 10 13:37:07 2006 Subject: [Numpy-discussion] Re: ndarray.fill and ma.array.filled In-Reply-To: References: <200604101356.44903.pgmdevlist@mailcan.com> Message-ID: <200604101638.29979.pgmdevlist@mailcan.com> > > [... longish example snipped ...] > > > >>> ma.array([1,1], mask=[0,1]).sum() > > 1 So ? The result is not `masked`, the missing value has been omitted.

MA.array([[1,1],[1,1]],mask=[[0,1],[1,0]]).sum()
array(data = [1 1], mask = [False False], fill_value=999999)

> This is exactly the point of the current discussion: make filled a > method of ndarray. Mrf. I'm still not convinced, but I have nothing against it. Along with a mask=False_ by default ? > With the current behavior, how would you achieve masking (no fill) in a.sum()? Er, why would I want to get MA.masked along one axis if one value is masked ? The current behavior is to mask only if all the values along that axis are masked:

MA.array([[1,1],[1,1]],mask=[[0,1],[1,1]]).sum()
array(data = [1 999999], mask = [False True], fill_value=999999)

With a.filled(0).sum(), how would you distinguish between the cases (a) at least one value is not masked and (b) all values are masked ? (OK, by querying the mask with something in the line of a._mask.all(axis), but it's longer... Oh well, I'll just have to adapt.) > > - this behavior was already in Numeric > > That's true, but it makes the result of sum(a) different from > __builtins__.sum(a). I believe consistency with the python > conventions is more important than with legacy Numeric in the long > run. > > Array methods are a very recent addition to ma. We can still use this > window of opportunity to get things right before too many people get > used to the wrong behavior. (Note that I changed your implementation > of cumsum and cumprod.) Good points... We'll just have to put strong warnings everywhere. > > > > - The current way reflects how masks are used in GIS or image processing. > > Can you elaborate on this? Note that in R na.rm is false by default in sum: > > sum(c(1,NA)) > > [1] NA > > So it looks like the convention is different in the field of statistics. MMh. *digs in his old GRASS scripts* OK, my bad. I had to fill missing values somehow, or at least check whether there were any before processing. I'll double check on that. Please temporarily forget that comment. > With the flag approach, making ndarray and ma.array interfaces > consistent would require adding an extra argument to many methods. > Instead, I propose to add one method, filled, to ndarray. OK, good point. On a semantic aspect: While digging through these GRASS scripts I mentioned, I realized/remembered that masked values are called 'null', whether there's no data, a NaN, or just when you want to hide some values. What about 'null' instead of 'mask', 'missing', 'na' ? From tim.hochberg at cox.net Mon Apr 10 14:14:02 2006 From: tim.hochberg at cox.net (Tim Hochberg) Date: Mon Apr 10 14:14:02 2006 Subject: [Numpy-discussion] Re: ndarray.fill and ma.array.filled In-Reply-To: <200604101638.29979.pgmdevlist@mailcan.com> References: <200604101356.44903.pgmdevlist@mailcan.com> <200604101638.29979.pgmdevlist@mailcan.com> Message-ID: <443AC5CB.2000704@cox.net> Pierre GM wrote: >>>[... longish example snipped ...] >>> >>> >>>>>ma.array([1,1], mask=[0,1]).sum() >>>>> >>>>> >>1 >> >> >So ? The result is not `masked`, the missing value has been omitted.
>MA.array([[1,1],[1,1]],mask=[[0,1],[1,0]]).sum() >array(data = [1 1], mask = [False False], fill_value=999999) > > > > >>This is exactly the point of the current discussion: make filled a >>method of ndarray. >> >> >Mrf. I'm still not convinced, but I have nothing against it. Along with a >mask=False_ by default ? > > > >>With the current behavior, how would you achieve masking (no fill) in a.sum()? >> >> >Er, why would I want to get MA.masked along one axis if one value is masked ? > > Any number of reasons, I would think. It depends on what you're using the data for. If the sum is the total amount that you spent in the month, and a masked value means you lost that check stub, then you don't know how much you actually spent and that value should be masked. To choose a boring example. >The current behavior is to mask only if all the values along that axis are >masked: > >MA.array([[1,1],[1,1]],mask=[[0,1],[1,1]]).sum() >array(data = [1 999999], mask = [False True], fill_value=999999) > >With a.filled(0).sum(), how would you distinguish between the cases (a) at >least one value is not masked and (b) all values are masked ? (OK, by >querying the mask with something in the line of a._mask.all(axis), but it's >longer... Oh well, I'll just have to adapt.) > > Actually I'm going to ask you the same question. Why would you care if all of the values are masked? I may be missing something, but either there's a sensible default value, in which case it doesn't matter how many values are masked, or you can't handle any masked values and the result should be masked if there are any masks in the input. Sasha's proposal handles those two cases well. Yours handles them a little more clunkily, but I'd like to understand why you want that behaviour. Regards, -tim > > >>>- this behavior was already in Numeric >>> >>> >>That's true, but it makes the result of sum(a) different from >>__builtins__.sum(a). I believe consistency with the python >>conventions is more important than with legacy Numeric in the long >>run. >> >>Array methods are a very recent addition to ma. We can still use this >>window of opportunity to get things right before too many people get >>used to the wrong behavior. (Note that I changed your implementation >>of cumsum and cumprod.) >> >> > >Good points... We'll just have to put strong warnings everywhere. > > > >>>- The current way reflects how masks are used in GIS or image processing. >>> >>> >>Can you elaborate on this? Note that in R na.rm is false by default in sum: >> >> >>>sum(c(1,NA)) >>> >>> >>[1] NA >> >>So it looks like the convention is different in the field of statistics. >> >> > >MMh. *digs in his old GRASS scripts* >OK, my bad. I had to fill missing values somehow, or at least check whether >there were any before processing. I'll double check on that. Please >temporarily forget that comment. > > > >>With the flag approach, making ndarray and ma.array interfaces >>consistent would require adding an extra argument to many methods. >>Instead, I propose to add one method, filled, to ndarray. >> >> >OK, good point. > > >On a semantic aspect: >While digging through these GRASS scripts I mentioned, I realized/remembered that >masked values are called 'null', whether there's no data, a NaN, or just when >you want to hide some values. What about 'null' instead of >'mask', 'missing', 'na' ?
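For reference, how today's numpy.ma resolves the question Tim and Pierre are debating: masked addends are skipped, an all-masked reduction comes back masked, and an explicit fill is always available. A small sketch in modern spelling rather than either package's actual 2006 code:

import numpy.ma as ma

x = ma.array([1, 1], mask=[False, True])
print(x.sum())            # 1: the masked addend is skipped
print(x.filled(0).sum())  # 1: same number, but the fill is explicit

y = ma.array([1, 1], mask=[True, True])
print(y.sum())            # masked: nothing unmasked is left to sum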
From oliphant at ee.byu.edu Mon Apr 10 15:07:06 2006 From: oliphant at ee.byu.edu (Travis Oliphant) Date: Mon Apr 10 15:07:06 2006 Subject: [Numpy-discussion] Recarray and shared datas In-Reply-To: <200604061020.k36AKIsQ018238@decideur.info> References: <200604061020.k36AKIsQ018238@decideur.info> Message-ID: <443AD6CF.4010800@ee.byu.edu> Benjamin Thyreau wrote: >Hi, >Numpy has a nice feature of recarray, i.e. records which can hold column names. >I'd like to use such a feature in order to better interact with R, i.e. passing >R data to python without copying. The current rpy bindings do a full copy, and >convert to a simple ndarray. Looking at the recarray api in the Guide, >and also at the source code, i don't find any recarray constructor which can >share data (all the examples from section 8.6 make copies). >Is there some way to do it ? in Python or in C ? Or are there any plans to ? > > > Yes, you can share data with a recarray because a "recarray" is just a numpy array with a fancy data-type and with attribute access overriding to do "field" lookups if the attribute cannot otherwise be found. What exactly are you trying to share data with? I'm having a hard time understanding how to answer your question without more information. Best, -Travis From oliphant at ee.byu.edu Mon Apr 10 15:14:05 2006 From: oliphant at ee.byu.edu (Travis Oliphant) Date: Mon Apr 10 15:14:05 2006 Subject: [Numpy-discussion] Tiling / disk storage for matrix in numpy? In-Reply-To: References: Message-ID: <443AD889.7020004@ee.byu.edu> Webb Sprague wrote: >Hi all, > >Is there a way in numpy to associate a (large) matrix with a disk >file, then tile and index it, then cache it as you process the >various pieces? This is pretty important with massive image files, >which can't fit into working memory, but in which (for example) you >might be doing a convolution on a 100 x 100 pixel window on a small >subset of the image. > > > I suppose if you used a memory-mapped array, then you would be at the mercy of the operating system's caching. But, this would be the easiest way. -Travis From oliphant at ee.byu.edu Mon Apr 10 15:21:07 2006 From: oliphant at ee.byu.edu (Travis Oliphant) Date: Mon Apr 10 15:21:07 2006 Subject: [Numpy-discussion] Re: ndarray.fill and ma.array.filled In-Reply-To: References: <4436AE31.7000306@cox.net> Message-ID: <443ADA43.8060400@ee.byu.edu> Sasha wrote: > > > On 4/7/06, *Tim Hochberg* > wrote: > > ... > In general, I'm skeptical of adding more methods to the ndarray object > -- there are plenty already. > > > I've also proposed to drop "fill" in favor of optimizing x[...] = > . Having both "fill" and "filled" in the interface is plain > awkward. You may like the combined proposal better because it does > not change the total number of methods :-) > > > In addition, it appears that both the method and function versions of > filled are "dangerous" in the sense that they sometimes return the > array > itself and sometimes a copy. > > > This is true in ma, but may certainly be changed. > > > Finally, changing ndarray to support masked array feels a bit like the > tail wagging the dog. > > > I disagree.
Numpy is pretty much alone among the array languages > because it does not have "native" support for missing values. For the > floating point types some rudimentary support for nans exists, but it is > not really usable. There is no missing values mechanism for the integer > types. I believe adding "filled" and maybe "mask" to ndarray (not > necessarily under these names) could be a meaningful step towards > "native" support for missing values. Supporting missing values is a useful thing (but not for every usage of arrays). Thus, ultimately, I see missing-value arrays as a solid sub-class of the basic array class. I'm glad Sasha is working on missing value arrays and have tried to be supportive. I'm a little hesitant to add a special-case method basically for one particular sub-class, though, unless it is the only workable solution. We are still exploring this whole sub-class space and have not really mastered it... -Travis From oliphant at ee.byu.edu Mon Apr 10 15:44:07 2006 From: oliphant at ee.byu.edu (Travis Oliphant) Date: Mon Apr 10 15:44:07 2006 Subject: [Numpy-discussion] Re: ndarray.fill and ma.array.filled In-Reply-To: <4436FF73.7080408@cox.net> References: <4436AE31.7000306@cox.net> <4436C965.8020808@hawaii.edu> <4436D6D1.6040302@cox.net> <4436FF73.7080408@cox.net> Message-ID: <443ADF9A.9050001@ee.byu.edu> > This may be an opportune time to propose something that's been cooking > in the back of my head for a week or so now: A stripped down array > superclass. The details of this are not at all locked down, but here's > a strawman proposal. This is in essence what I've been proposing since SciPy 2005. I want what goes into Python to be essentially just this super-class. Look at this http://numeric.scipy.org/array_interface.html and check out this svn co http://svn.scipy.org/svn/PEP arrayPEP I've obviously been way over-booked to do this myself. Nick Coghlan expressed interest in this idea (he called it dimarray, but I like basearray better). > > We add an array superclass, call it basearray, that has the same > C-structure as the existing ndarray. However, it has *no* methods or > attributes. Why not give it the attributes corresponding to its C-structure? I'm happy with no methods though. > 1. If we're careful, this could be the basic array object that > we propose, at least for the first round, for inclusion in the > Python core. It's not useful for anything but passing data between > various applications that understand the data structure, but that in > itself could be a huge win. And the fact that it's dirt simple would > probably be an advantage to getting it into the core. The only extra thing I'm proposing is to add the data-descriptor object into the Python core as well --- otherwise what do you do with the PyArray_Descr * part of the C-structure? > 2. It provides a useful marker class. MA could inherit from it > (and use itself for its data attribute) and then asanyarray would > behave properly. MA could also use this, or a subclass, as the mask > object, preventing anyone from accidentally using it as data (they > could always use it on purpose with asarray). > 3. It provides a platform for people to build other, > ndarray-like classes in pure python. This is my main interest. I've > put together a thin shell over numpy that strips it down to its > absolute essentials, including a stripped down version of ndarray that > removes most of the methods.
All of the __array_wrap__[1] stuff > works quite well most of the time, but there are still some issues > with being a subclass when this particular class is conceptually a > superclass. If we had an array superclass of some sort, I believe > that these would be resolved. > > In principle at least, this shouldn't be that hard. I think it should > mostly be rearranging some code and adding some wrappers to existing > functions. That's in principle. In practice, I'm not certain yet as I > haven't investigated the code in question in much depth yet. I've been > meaning to write this up into a more fleshed out proposal, but I got > distracted by the whole Protocol discussion on python-dev3000. This > writeup is pretty weak, but hopefully you get the idea. This is exactly what needs to be done to improve array-support in Python. This is the conclusion I came to and I'm glad to see that Tim is now basically reaching the same conclusion. There are obviously some details to work out. But, having a base structure to inherit from would be perfect. -Travis From oliphant at ee.byu.edu Mon Apr 10 15:49:01 2006 From: oliphant at ee.byu.edu (Travis Oliphant) Date: Mon Apr 10 15:49:01 2006 Subject: [Numpy-discussion] Re: ndarray.fill and ma.array.filled In-Reply-To: <200604072258.34153.pgmdevlist@mailcan.com> References: <4436FF73.7080408@cox.net> <200604072258.34153.pgmdevlist@mailcan.com> Message-ID: <443AE0A1.3000002@ee.byu.edu> Pierre GM wrote: >>decide to get rid of "putmask". >> >> > >"putmask" really seems overkill indeed. I wouldn't miss it. > > I'm not opposed to getting rid of putmask either. Several of the newer methods are open for discussion before 1.0. I'd have to check to be sure, but .take and .put are not entirely replaced by fancy-indexing. Also, fancy indexing has enough overhead that a method doing exactly what you want is faster. -Travis From ndarray at mac.com Mon Apr 10 16:06:00 2006 From: ndarray at mac.com (Sasha) Date: Mon Apr 10 16:06:00 2006 Subject: [Numpy-discussion] Re: ndarray.fill and ma.array.filled In-Reply-To: <200604101638.29979.pgmdevlist@mailcan.com> References: <200604101356.44903.pgmdevlist@mailcan.com> <200604101638.29979.pgmdevlist@mailcan.com> Message-ID: On 4/10/06, Pierre GM wrote: > > > [... longish example snipped ...] > > > > > >>> ma.array([1,1], mask=[0,1]).sum() > > > > 1 > So ? The result is not `masked`, the missing value has been omitted. > I am just making your point with a shorter example. > [...] > Mrf. I'm still not convinced, but I have nothing against it. Along with a > mask=False_ by default ? > It looks like there is little opposition here. I'll submit a patch soon and unless better names are suggested, it will probably go in. > > With the current behavior, how would you achieve masking (no fill) in a.sum()? > Er, why would I want to get MA.masked along one axis if one value is masked ? Because if you don't know one of the addends you don't know the sum. Replacing missing values with zeros is not always the right strategy. If you know that your data has non-zero mean, for example, you might want to replace missing values with the mean instead of zero.
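A short illustration of that fill-with-the-mean idea, again in today's numpy.ma spelling rather than the MA package of the time:

import numpy.ma as ma

x = ma.array([1.0, 2.0, 3.0, 4.0], mask=[False, False, True, False])

# x.mean() averages only the unmasked entries (7/3 here), so the filled
# total is not biased toward zero the way x.filled(0).sum() would be.
total = x.filled(x.mean()).sum()   # 7 + 7/3, not 7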
> The current behavior is to mask only if all the values along that axis are > masked: > > MA.array([[1,1],[1,1]],mask=[[0,1],[1,1]]).sum() > array(data = [1 999999], mask = [False True], fill_value=999999) > I did not realize that, but it is really bad. What is the justification for this? In R: > sum(c(NA,NA), na.rm=TRUE) [1] 0 What does MATLAB do in this case? > With a.filled(0).sum(), how would you distinguish between the cases (a) at > least one value is not masked and (b) all values are masked ? (OK, by > querying the mask with something in the line of a._mask.all(axis), but it's > longer... Oh well, I'll just have to adapt.) > Exactly. Explicit is better than implicit. The Zen of Python. > > > - this behavior was already in Numeric > > > > That's true, but it makes the result of sum(a) different from > > __builtins__.sum(a). I believe consistency with the python > > conventions is more important than with legacy Numeric in the long > > run. > > > > Array methods are a very recent addition to ma. We can still use this > > window of opportunity to get things right before too many people get > > used to the wrong behavior. (Note that I changed your implementation > > of cumsum and cumprod.) > > Good points... We'll just have to put strong warnings everywhere. > Do you agree with my proposal as long as we have explicit warnings in the documentation that methods behave differently from legacy functions? > [... GIS comment snipped ...] > > With the flag approach, making ndarray and ma.array interfaces > > consistent would require adding an extra argument to many methods. > > Instead, I propose to add one method, filled, to ndarray. > OK, good point. > > > On a semantic aspect: > While digging through these GRASS scripts I mentioned, I realized/remembered that > masked values are called 'null', whether there's no data, a NaN, or just when > you want to hide some values. What about 'null' instead of > 'mask', 'missing', 'na' ? > I don't think "null" returning an array of bools will create a lot of enthusiasm. It sounds more like ma.masked, as in a[i] = ma.masked. Besides, there is probably a reason why python uses the name "None" instead of "Null" - I just don't know what it is :-). From tim.hochberg at cox.net Mon Apr 10 16:09:03 2006 From: tim.hochberg at cox.net (Tim Hochberg) Date: Mon Apr 10 16:09:03 2006 Subject: [Numpy-discussion] Re: ndarray.fill and ma.array.filled In-Reply-To: <443ADF9A.9050001@ee.byu.edu> References: <4436AE31.7000306@cox.net> <4436C965.8020808@hawaii.edu> <4436D6D1.6040302@cox.net> <4436FF73.7080408@cox.net> <443ADF9A.9050001@ee.byu.edu> Message-ID: <443AE5C7.8010804@cox.net> Travis Oliphant wrote: > >> This may be an opportune time to propose something that's been cooking >> in the back of my head for a week or so now: A stripped down array >> superclass. The details of this are not at all locked down, but >> here's a strawman proposal. > > This is in essence what I've been proposing since SciPy 2005. I want > what goes into Python to be essentially just this super-class. > Look at this http://numeric.scipy.org/array_interface.html > > and check out this > > svn co http://svn.scipy.org/svn/PEP arrayPEP > > I've obviously been way over-booked to do this myself. Nick > Coghlan expressed interest in this idea (he called it dimarray, but I > like basearray better). I'll look these over. I suppose I should have been paying more attention before! >> >> We add an array superclass, call it basearray, that has the same >> C-structure as the existing ndarray.
However, it has *no* methods or >> attributes. > > Why not give it the attributes corresponding to its C-structure. I'm > happy with no methods though. Mainly because I didn't want to argue too much about whether a given method or attribute was a good idea, and I was in a hurry when I tossed that proposal out. It seemed better to start with the most stripped down proposal I could come up with and see what people demanded I add. I'm actually sort of inclined to give it *read-only* attributes associated with the C-structure, but no methods. That way you can examine the shape, type, etc., but you can't set them [I'm specifically thinking of shape here, but there may be others]. I think that there are cases where you don't want the base array to be mutable at all, but I don't think introspection should be a problem. If the attributes were settable, you could always override them with readonly properties, but it'd be cleaner to just start with readonly functionality and add setability (is that a word?) only in those cases where it's needed. > >> 1. If we're careful, this could be the basic array object that >> we propose, at least for the first round, for inclusion in the >> Python core. It's not useful for anything but passing data between >> various applications that understand the data structure, but that in >> itself could be a huge win. And the fact that it's dirt simple would >> probably be an advantage to getting it into the core. > > The only extra thing I'm proposing is to add the data-descriptor > object into the Python core as well --- otherwise what do you do > with the PyArray_Descr * part of the C-structure? Good point. > >> 2. It provides a useful marker class. MA could inherit from it >> (and use itself for its data attribute) and then asanyarray would >> behave properly. MA could also use this, or a subclass, as the mask >> object preventing anyone from accidentally using it as data (they >> could always use it on purpose with asarray). > >> 3. It provides a platform for people to build other, >> ndarray-like classes in pure Python. This is my main interest. I've >> put together a thin shell over numpy that strips it down to its >> absolute essentials including a stripped down version of ndarray that >> removes most of the methods. All of the __array_wrap__[1] stuff >> works quite well most of the time, but there are still some issues >> with being a subclass when this particular class is conceptually a >> superclass. If we had an array superclass of some sort, I believe >> that these would be resolved. >> >> In principle at least, this shouldn't be that hard. I think it should >> mostly be rearranging some code and adding some wrappers to existing >> functions. That's in principle. In practice, I'm not certain yet as I >> haven't investigated the code in question in much depth yet. I've >> been meaning to write this up into a more fleshed out proposal, but I >> got distracted by the whole Protocol discussion on python-dev3000. >> This writeup is pretty weak, but hopefully you get the idea. > > This is exactly what needs to be done to improve array-support in > Python. This is the conclusion I came to and I'm glad to see that Tim > has now basically reached the same conclusion. There are obviously > some details to work out. But, having a base structure to inherit > from would be perfect. > Hmm. This idea seems to have a fair bit of consensus behind it. I guess that means I'd better look into exactly what it would take to make it work.
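To make the read-only idea concrete, here is roughly the shape of thing I'm picturing, as a pure-Python sketch only (the real basearray would live in C, and none of these names are settled):

    class basearray(object):
        # minimal array core: a block of memory plus read-only introspection
        def __init__(self, shape, dtype, data):
            self._shape = tuple(shape)
            self._dtype = dtype
            self._data = data   # buffer-like object holding the elements
        # read-only descriptors: you can examine the structure but not rebind it
        @property
        def shape(self):
            return self._shape
        @property
        def dtype(self):
            return self._dtype
        @property
        def data(self):
            return self._data

No arithmetic, no indexing, no methods -- subclasses (ndarray, masked arrays, and friends) would layer those on top.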
The details of what attributes to expose, etc. are probably not too important to work out immediately. Regards, -tim From pierregm at engr.uga.edu Mon Apr 10 16:24:01 2006 From: pierregm at engr.uga.edu (Pierre GM) Date: Mon Apr 10 16:24:01 2006 Subject: [Numpy-discussion] Re: ndarray.fill and ma.array.filled In-Reply-To: <443AC5CB.2000704@cox.net> References: <200604101638.29979.pgmdevlist@mailcan.com> <443AC5CB.2000704@cox.net> Message-ID: <200604101923.36290.pierregm@engr.uga.edu> > [Sasha] > > So ? The result is not `masked`, the missing value has been omitted. > I am just making your point with a shorter example. OK, now I get it :) > >Er, why would I want to get MA.masked along one axis if one value is > > masked ? > > [Tim] > Any number of reasons I would think. I understand that, and I eventually agree it should be the default. > [Sasha] > Because if you don't know one of the addends you don't know the sum. Unless you want to discard some data on purpose. > Replacing missing values with zeros is not always the right strategy. > If you know that your data has non-zero mean, for example, you might > want to replace missing values with the mean instead of zero. Hence the need to get rid of filled_values. >[Tim] > Actually I'm going to ask you the same question. Why would you care if all > of the values are masked? > > MA.array([[1,1],[1,1]],mask=[[0,1],[1,1]]).sum() > > array(data = [1 999999], mask = [False True], fill_value=999999) > > [Sasha] > I did not realize that, but it is really bad. What is the > justification for this? Masked values are not necessarily nans or missing. I quite regularly mask values that do not satisfy a given condition. For various reasons, I can't compress the array, I need to preserve its shape. With the current behavior, a.sum() gives me the sum of the values that satisfy the condition. If there's no such value, the result is masked, and that way I know that the condition was never met. Here, I could use Sasha's method combined with a._mask.all, no problem. Another example: let x be a 2D array with missing values, to be normalized along one axis. Currently, x/x.sum() gives the result I want (provided it's true division). Sasha's method would give me a completely masked array. > > > Good points... We'll just have to put strong warnings everywhere. > [Sasha] > Do you agree with my proposal as long as we have explicit warnings in > the documentation that methods behave differently from legacy > functions? Your points are quite valid. I'm just worried it's gonna break a lot of things in the near future. And where do we stop ? So, if we follow Sasha's way: x.prod() should be the same, right ? What about a.min(), a.max() ? a.mean() ? From oliphant at ee.byu.edu Mon Apr 10 16:37:06 2006 From: oliphant at ee.byu.edu (Travis Oliphant) Date: Mon Apr 10 16:37:06 2006 Subject: [Numpy-discussion] Re: weird interaction: pickle, numpy, matplotlib.hist In-Reply-To: <44366E71.7060601@gmail.com> References: <4433DF85.7030109@gmail.com> <4434E31B.5030306@ieee.org> <44366E71.7060601@gmail.com> Message-ID: <443AEC07.5070904@ee.byu.edu> Andrew Jaffe wrote: > Travis Oliphant wrote: > >> But, this brings up the point that currently the pickled raw-data >> which is read-in as a string by Python is used as the memory for the >> new array (i.e. the string memory is "stolen"). This should work. >> The fact that it didn't with sort was a bug that is now fixed in >> SVN. However, operations on out-of-byte-order arrays will always be >> slower.
Thus, perhaps on pickle read the data should be copied to >> native byte-order if necessary. > > +1 from me, too. I assume that byteswapping is fast compared to I/O in > most cases, and the only times when you wouldn't want it would be > 'advanced' usage that the developer could take control of via a custom > reduce, __getstate__, __setstate__, etc. > There was one reasonable objection, and one proposal to further complicate the array object to handle both cases :-) But most were supportive of automatic conversion to the platform byte-order on pickle-read. This is probably what most people expect if they are using Pickle anyway. So, I've added it to SVN. -Travis From michael.sorich at gmail.com Mon Apr 10 16:45:07 2006 From: michael.sorich at gmail.com (Michael Sorich) Date: Mon Apr 10 16:45:07 2006 Subject: [Numpy-discussion] Recarray and shared datas In-Reply-To: <200604061020.k36AKIsQ018238@decideur.info> References: <200604061020.k36AKIsQ018238@decideur.info> Message-ID: <16761e100604101644v1c447aa1xb646e1d44d8672f8@mail.gmail.com> On 4/6/06, Benjamin Thyreau wrote: > > Hi, > Numpy has a nice feature of recarray, i.e. records which can hold column > names. > I'd like to use such a feature in order to better interact with R, i.e. > passing > R data to Python without copying. The current rpy bindings do a full copy, > and > convert to a simple ndarray. Looking at the recarray api in the Guide, > and also at the source code, I don't find any recarray constructor which > can > get shared data (all the examples from section 8.6 are doing copies). > Is there some way to do it, in Python or in C? Or are there any plans to? As a current user of rpy (at least until I can easily do the equivalent in numpy/scipy) this sounds very interesting. What will happen if the R data.frame has NA data? I don't think the recarray can currently handle masked data. Oh well, one step forward at a time. Good luck. Mike -------------- next part -------------- An HTML attachment was scrubbed... URL: From michael.sorich at gmail.com Mon Apr 10 17:18:15 2006 From: michael.sorich at gmail.com (Michael Sorich) Date: Mon Apr 10 17:18:15 2006 Subject: [Numpy-discussion] Re: ndarray.fill and ma.array.filled In-Reply-To: References: <200604101356.44903.pgmdevlist@mailcan.com> <200604101638.29979.pgmdevlist@mailcan.com> Message-ID: <16761e100604101717y6a8dbecat4800d8a77bb3615a@mail.gmail.com> On 4/11/06, Sasha wrote: > > On 4/10/06, Pierre GM wrote: > > > > [... longish example snipped ...] > > > > > > > >>> ma.array([1,1], mask=[0,1]).sum() > > > > > > 1 > > So ? The result is not `masked`, the missing value has been omitted. > > > I am just making your point with a shorter example. > > > [...] > > Mrf. I'm still not convinced, but I have nothing against it. Along with > a > > mask=False_ by default ? > > > It looks like there is little opposition here. I'll submit a patch > soon and unless better names are suggested, it will probably go in. > > > > With the current behavior, how would you achieve masking (no fill) > a.sum()? > > Er, why would I want to get MA.masked along one axis if one value is > masked ? > > Because if you don't know one of the addends you don't know the sum. > Replacing missing values with zeros is not always the right strategy. > If you know that your data has non-zero mean, for example, you might > want to replace missing values with the mean instead of zero. I feel that in general implicitly replacing masked values will definitely lead to bugs in my code.
Unless it is really obvious what the best way to deal with the masked values is for the particular function, I would definitely prefer to be explicit about it. In most cases there are a number of reasonable options for what can be done. Masking the result when masked values are involved seems the most transparent default option. For example, it gives me a really bad feeling to think that sum will automatically return the sum of all non-masked values. When dealing with large datasets, I will not always know when I need to be careful of missing values. Summing over only the non-masked values will often not be the appropriate course and I fear that I will not notice that this has actually occurred. If masked values are returned it is pretty obvious what has happened and easy to go back and explicitly handle the masked data in another way if appropriate. Mike -------------- next part -------------- An HTML attachment was scrubbed... URL: From ndarray at mac.com Mon Apr 10 19:46:00 2006 From: ndarray at mac.com (Sasha) Date: Mon Apr 10 19:46:00 2006 Subject: [Numpy-discussion] Recarray and shared datas In-Reply-To: <16761e100604101644v1c447aa1xb646e1d44d8672f8@mail.gmail.com> References: <200604061020.k36AKIsQ018238@decideur.info> <16761e100604101644v1c447aa1xb646e1d44d8672f8@mail.gmail.com> Message-ID: This thread probably belongs to rpy-list, so I'll cross-post. I may be wrong, but I think R data frames are stored column-wise unlike recarrays. This also means that data sharing between R and numpy is feasible even without recarrays. RPy support for doing this should probably wait until RPy 2.0 when R objects become wrapped in a Python type. That type will need to provide the __array_struct__ interface to allow data sharing. NA data handling in numpy is a topic of an active discussion now. A numpy array with data shared with an R vector will see NAs differently for different types. For ints, it will be INT_MIN (-2^31 on 32-bit machines), for floats it will be a NaN with some special bit-pattern in the mantissa and thus not fully compatible with numpy's nan. I would like to use this cross-post as an opportunity to invite RPy users to participate in numpy's discussion of missing (or masked) values. See "ndarray.fill and ma.array.filled" thread. On 4/10/06, Michael Sorich wrote: > On 4/6/06, Benjamin Thyreau wrote: > > > Hi, > > Numpy has a nice feature of recarray, i.e. records which can hold column names. > > I'd like to use such a feature in order to better interact with R, i.e. > passing > > R data to Python without copying. The current rpy bindings do a full copy, > and > > convert to a simple ndarray. Looking at the recarray api in the Guide, > > and also at the source code, I don't find any recarray constructor which > can > > get shared data (all the examples from section 8.6 are doing copies). > > Is there some way to do it, in Python or in C? Or are there any plans to? > > > As a current user of rpy (at least until I can easily do the equivalent in > numpy/scipy) this sounds very interesting. What will happen if the R > data.frame has NA data? I don't think the recarray can currently handle > masked data. Oh well, one step forward at a time. Good luck.
> > Mike > > > From tim.hochberg at cox.net Mon Apr 10 19:49:01 2006 From: tim.hochberg at cox.net (Tim Hochberg) Date: Mon Apr 10 19:49:01 2006 Subject: [Numpy-discussion] Re: ndarray.fill and ma.array.filled In-Reply-To: <443AE0A1.3000002@ee.byu.edu> References: <4436FF73.7080408@cox.net> <200604072258.34153.pgmdevlist@mailcan.com> <443AE0A1.3000002@ee.byu.edu> Message-ID: <443B1957.7060301@cox.net> Travis Oliphant wrote: > Pierre GM wrote: > >>> decide to get rid of "putmask". >>> >> >> >> "putmask" really seems overkill indeed. I wouldn't miss it. >> >> > > I'm not opposed to getting rid of putmask either. Several of the > newer methods are open for discussion before 1.0. I'd have to check > to be sure, but .take and .put are not entirely replaced by > fancy-indexing. Also, fancy indexing has enough overhead that a > method doing exactly what you want is faster. I'm curious, what use cases does fancy indexing not handle that take works for? Not counting speed issues. Regards, -tim From bsouthey at gmail.com Tue Apr 11 12:47:02 2006 From: bsouthey at gmail.com (Bruce Southey) Date: Tue Apr 11 12:47:02 2006 Subject: [Numpy-discussion] Re: ndarray.fill and ma.array.filled In-Reply-To: <200604101923.36290.pierregm@engr.uga.edu> References: <200604101638.29979.pgmdevlist@mailcan.com> <443AC5CB.2000704@cox.net> <200604101923.36290.pierregm@engr.uga.edu> Message-ID: Hi, My view is solely as a user so I really do appreciate the thought that you all are putting into this! I am somewhat concerned that having to use filled() is an extra level of complexity and computational burden. For example, in computing the mean/average, using filled would require one pass to get the sum and another to count the non-masked elements. For summation at least, would it make more sense to add an optional flag (or flags) such that there appears little difference between a normal array and a masked array? For example:

    a.sum() is the current default
    a.sum(filled_value=x) where x is some value such as zero or other user defined value.
    a.sum(ignore_mask=True) or similar to address whether or not masked values should be used.

I am also not clear on what happens with other operations or dimensions. Regards Bruce On 4/10/06, Pierre GM wrote: > > [Sasha] > > > So ? The result is not `masked`, the missing value has been omitted. > > I am just making your point with a shorter example. > > OK, now I get it :) > > > > > >Er, why would I want to get MA.masked along one axis if one value is > > > masked ? > > > > [Tim] > > Any number of reasons I would think. > > I understand that, and I eventually agree it should be the default. > > > [Sasha] > > Because if you don't know one of the addends you don't know the sum. > Unless you want to discard some data on purpose. > > > Replacing missing values with zeros is not always the right strategy. > > If you know that your data has non-zero mean, for example, you might > > want to replace missing values with the mean instead of zero. > Hence the need to get rid of filled_values. > > >[Tim] > > Actually I'm going to ask you the same question. Why would you care if all > > of the values are masked? > > > > MA.array([[1,1],[1,1]],mask=[[0,1],[1,1]]).sum() > > > array(data = [1 999999], mask = [False True], fill_value=999999) > > > > [Sasha] > > I did not realize that, but it is really bad. What is the > > justification for this? > > Masked values are not necessarily nans or missing. I quite regularly mask > values that do not satisfy a given condition.
> For various reasons, I can't > compress the array, I need to preserve its shape. > > With the current behavior, a.sum() gives me the sum of the values that satisfy > the condition. If there's no such value, the result is masked, and that way I > know that the condition was never met. Here, I could use Sasha's method > combined with a._mask.all, no problem. > > Another example: let x be a 2D array with missing values, to be normalized along > one axis. Currently, x/x.sum() gives the result I want (provided it's true > division). Sasha's method would give me a completely masked array. > > > > > Good points... We'll just have to put strong warnings everywhere. > > [Sasha] > > Do you agree with my proposal as long as we have explicit warnings in > > the documentation that methods behave differently from legacy > > functions? > > Your points are quite valid. I'm just worried it's gonna break a lot of things > in the near future. And where do we stop ? So, if we follow Sasha's way: > x.prod() should be the same, right ? What about a.min(), a.max() ? a.mean() ? > > > > > > > ------------------------------------------------------- > This SF.Net email is sponsored by xPML, a groundbreaking scripting language > that extends applications into web and mobile media. Attend the live webcast > and join the prime developer group breaking into this new coding territory! > http://sel.as-us.falkag.net/sel?cmd=lnk&kid=110944&bid=241720&dat=121642 > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/numpy-discussion > From travis at enthought.com Tue Apr 11 13:11:04 2006 From: travis at enthought.com (Travis N. Vaught) Date: Tue Apr 11 13:11:04 2006 Subject: [Numpy-discussion] ANN: SciPy 2006 Conference Message-ID: <443C0D36.80608@enthought.com> Greetings, The *SciPy 2006 Conference* is scheduled for August 17-18, 2006 at CalTech. A tremendous amount of work has gone into SciPy and Numpy over the past few months, and the scientific python community around these and other tools has truly flourished[1]. The SciPy 2006 Conference is an excellent opportunity to exchange ideas, learn techniques, contribute code and affect the direction of scientific computing with Python. Conference details are at http://www.scipy.org/SciPy2006

Keynote
-------
Python language author Guido van Rossum (!) has agreed to be the Keynote speaker at this year's Conference. http://www.python.org/~guido/

Registration:
-------------
Registration is now open. You may register early online for $100.00 at http://www.enthought.com/scipy06. Registration includes breakfast and lunch Thursday & Friday and a very nice dinner Thursday night. After July 14, 2006, registration will cost $150.00.

Call for Presenters
-------------------
If you are interested in presenting at the conference, you may submit an abstract in Plain Text, PDF or MS Word formats to abstracts at scipy.org -- the deadline for abstract submission is July 7, 2006. Papers and/or presentation slides are acceptable and are due by August 4, 2006.

Tutorial Sessions
-----------------
Several people have expressed interest in attending a tutorial session. The Wednesday before the conference might be a good day for this. Please email the list if you have particular topics that you are interested in.
Here's a preliminary list:

- Migrating from Numeric or Numarray to Numpy
- 2D Visualization with Python
- 3D Visualization with Python
- Introduction to Scientific Computing with Python
- Building Scientific Simulation Applications
- Traits/TraitsUI

Please rate these and add others in a subsequent thread to the SciPy-user mailing list. Perhaps we can pick 4-6 top ideas and recruit speakers as demand dictates. The authoritative list will be tracked here: http://www.scipy.org/SciPy2006/TutorialSessions

Coding Sprints
--------------
If anyone would like to arrive earlier (Monday and Tuesday the 14th and 15th of August), we can borrow a room on the CalTech campus to sit and code against particular libraries or apps of interest. Please register your interest in these coding sprints on the SciPy-user mailing list as well. The authoritative list will be tracked here: http://www.scipy.org/SciPy2006/CodingSprints

Mailing list address: scipy-user at scipy.org
Mailing list archives: http://dir.gmane.org/gmane.comp.python.scientific.user
Mailing list signup: http://www.scipy.net/mailman/listinfo/scipy-user

[1] Some stats: NumPy has averaged over 16,000 downloads per month Sept. 05 to March 06. SciPy has averaged over 3,800 downloads per month in Feb. and March 06. (both scipy and numpy figures do not include the 2000 instances per month downloaded as part of the Python Enthought Edition Distribution for Windows.) From rowen at cesmail.net Tue Apr 11 13:32:14 2006 From: rowen at cesmail.net (Russell E. Owen) Date: Tue Apr 11 13:32:14 2006 Subject: [Numpy-discussion] Re: ndarray.fill and ma.array.filled References: <4436AE31.7000306@cox.net> Message-ID: In article , Sasha wrote: > I disagree. Numpy is pretty much alone among the array languages because it > does not have "native" support for missing values. For the floating point > types some rudimentary support for nans exists, but is not really usable. > There is no missing-values mechanism for integer types. I believe adding > "filled" and maybe "mask" to ndarray (not necessarily under these names) > could be a meaningful step towards "native" support for missing values. I completely agree with this. I would really like to see proper native support for arrays with masked values in numpy (such that all ufuncs, functions, etc. work with masked arrays). I would be thrilled to be able to filter masked arrays, for instance. -- Russell From tim.hochberg at cox.net Tue Apr 11 16:15:04 2006 From: tim.hochberg at cox.net (Tim Hochberg) Date: Tue Apr 11 16:15:04 2006 Subject: [Numpy-discussion] Let's blame Java [was ndarray.fill and ma.array.filled] In-Reply-To: References: <200604101638.29979.pgmdevlist@mailcan.com> <443AC5CB.2000704@cox.net> <200604101923.36290.pierregm@engr.uga.edu> Message-ID: <443C38BE.8090606@cox.net> As I understand it, the goal that Sasha is pursuing here is to make masked arrays and normal arrays interchangeable as much as practical. I believe that there is reasonable consensus that this is desirable. Sasha has proposed a compromise solution that adds minimal attributes to ndarray while allowing a lot of interoperability between ma and ndarray. However, it has its clunky aspects, as evidenced by the pushback he's been getting from masked array users. Here's one example. In the masked array context it seems perfectly reasonable to pass a fill value to sum. That is:

    x.sum(fill=0.0)

But, if you want to preserve interoperability, that means you have to add fill arguments to all of the ndarray methods and what do you have? A mess!
Particularly if some *other* package comes along that we decide is important to support in the same manner as ma. Then we have another set of methods or keyword args that we need to tack on to ndarray. Ugh! However, I know who, or rather what, to blame for our problems: the object-oriented hype industry in general and Java in particular <0.1 wink>. Why? Because the root of the problem here is the move from functions to methods in numpy. I appreciate a nice method as much as the next person, but they're not always better than the equivalent function and in this case they're worse. Let's fantasize for a minute that most of the methods of ndarray vanished and instead we went back to functions. Just to show that I'm not a total purist, I'll let the mask attribute stay on both MaskedArray and ndarray. However, filled bites the dust on *both* MaskedArray and ndarray just like the rest. How would we deal with sum then? Something like this:

    # ma.py
    def filled(x, fill):
        x = x.copy()
        if x.mask is not False:
            x[x.mask] = fill
            x.umask()
        return x

    def sum(x, axis, fill=None):
        if fill is not None:
            x = filled(x, fill)
        # I'm blowing off the correct treatment of the fill=None case here because I'm lazy
        return add.reduce(x, axis)

    # numpy.py (or __init__ or oldnumeric or something)
    def sum(x, axis):
        if x.mask is not False:
            raise ValueError("use ma.sum for masked arrays")
        return add.reduce(x, axis)

[Fixing the fill=None case and dealing correctly with dtype is left as an exercise for the reader.] All of a sudden all of the problems we're running into go away. Users of masked arrays simply use the functions from ma and can use ndarrays and masked arrays interchangeably. On the other hand, users of non-masked arrays aren't burdened with the extra interface and if they accidentally get passed a masked array they quickly find out about it (you don't want to be accidentally using masked arrays in an application that doesn't expect them -- that way lies disaster). I realize that railing against methods is tilting at windmills, but somehow I can't help myself ;-| Regards, -tim From aisaac at american.edu Tue Apr 11 20:45:01 2006 From: aisaac at american.edu (Alan G Isaac) Date: Tue Apr 11 20:45:01 2006 Subject: [Numpy-discussion] reminder: dtype for empty, zeros, ones Message-ID: I notice that the empty, ones, and zeros still have an integer default dtype (numpy 0.9.6). I had the impression that this was slated to change to a float dtype, on the reasonable assumption that new users will otherwise be surprised. Perhaps I remember this incorrectly. Cheers, Alan Isaac From tim.hochberg at cox.net Tue Apr 11 21:27:00 2006 From: tim.hochberg at cox.net (Tim Hochberg) Date: Tue Apr 11 21:27:00 2006 Subject: [Numpy-discussion] Let's blame Java [was ndarray.fill and ma.array.filled] In-Reply-To: <443C38BE.8090606@cox.net> References: <200604101638.29979.pgmdevlist@mailcan.com> <443AC5CB.2000704@cox.net> <200604101923.36290.pierregm@engr.uga.edu> <443C38BE.8090606@cox.net> Message-ID: <443C81E2.4090800@cox.net> [Tim rants a lot] Just to be clear, I'm not advocating getting rid of methods. I'm not advocating anything, that just seems to get me into trouble ;-) I still blame Java though.
Regards, -tim From stefan at sun.ac.za Tue Apr 11 22:47:14 2006 From: stefan at sun.ac.za (Stefan van der Walt) Date: Tue Apr 11 22:47:14 2006 Subject: [Numpy-discussion] sqrt and divide Message-ID: <20060412054517.GA27756@sun.ac.za> Hi all Two quick questions regarding unintuitive numpy behaviour: Why is the square root of -1 not equal to the square root of -1+0j?

    In [5]: N.sqrt(-1.)
    Out[5]: nan

    In [6]: N.sqrt(-1.+0j)
    Out[6]: 1j

Is there an easier way of dividing two scalars than using divide?

    In [9]: N.divide(1.,0)
    Out[9]: inf

(also

    In [8]: N.divide(1,0)
    Out[8]: 0

should probably return inf / nan?) Regards Stéfan From robert.kern at gmail.com Tue Apr 11 23:16:03 2006 From: robert.kern at gmail.com (Robert Kern) Date: Tue Apr 11 23:16:03 2006 Subject: [Numpy-discussion] Re: sqrt and divide In-Reply-To: <20060412054517.GA27756@sun.ac.za> References: <20060412054517.GA27756@sun.ac.za> Message-ID: Stefan van der Walt wrote: > Hi all > > Two quick questions regarding unintuitive numpy behaviour: > > Why is the square root of -1 not equal to the square root of -1+0j? > > In [5]: N.sqrt(-1.) > Out[5]: nan > > In [6]: N.sqrt(-1.+0j) > Out[6]: 1j It is frequently the case that the argument being passed to sqrt() is expected to be non-negative and all of their code strictly deals with numbers in the real domain. If the argument happens to be negative, then it is a sign of a bug earlier in the code or a floating point instability. Returning nan gives the programmer the opportunity for sqrt() to complain loudly and expose bugs instead of silently upcasting to a complex type. Programmers who *do* want to work in the complex domain can easily perform the cast explicitly. > Is there an easier way of dividing two scalars than using divide? > > In [9]: N.divide(1.,0) > Out[9]: inf x/y ? > (also > > In [8]: N.divide(1,0) > Out[8]: 0 > > should probably return inf / nan?) inf and nan are floating point values. The definition of int division used when both arguments to divide() are ints also yields ints, not floats. -- Robert Kern robert.kern at gmail.com "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco
From faltet at carabos.com Wed Apr 12 01:51:12 2006 From: faltet at carabos.com (Francesc Altet) Date: Wed Apr 12 01:51:12 2006 Subject: [Numpy-discussion] Tiling / disk storage for matrix in numpy? In-Reply-To: References: Message-ID: <200604121050.15552.faltet@carabos.com> On Friday 07 April 2006 19:30, Webb Sprague wrote: > Hi all, > > Is there a way in numpy to associate a (large) matrix with a disk > file, then tile and index it, then cache it as you process the > various pieces? This is pretty important with massive image files, > which can't fit into working memory, but in which (for example) you > might be doing a convolution on a 100 x 100 pixel window on a small > subset of the image. > > I know that caching algorithms are (1) complicated and (2) never > general. But there you go. > > Perhaps I can't find it, perhaps it would be a good project for the > future? If HDF or something does this already, could someone point me > in the right direction? In addition to using shared memory arrays, you may also want to experiment with compressing images on-disk and reading small chunks to operate with them in-memory. This has the advantage that, if your image is compressible enough (and most of them are), the total size of the image in-file will be smaller, leaving more room for the underlying OS filesystem cache to fit larger areas of the image. Here is a small PyTables program that exemplifies the concept:

    import tables
    import numpy

    # Create a container for the image in file
    f = tables.openFile('image.h5', 'w')
    img = f.createEArray(f.root, 'img',
                         tables.Atom(shape=(1024,0), dtype='Int32', flavor='numpy'),
                         filters=tables.Filters(complevel=1),
                         expectedrows=1024)
    # Add 1024 rows to image
    for i in xrange(1024):
        img.append((numpy.randn(1024,1)*1024).astype('int32'))
    img.flush()

    # Get small chunks of the image in memory and operate with them
    cs = 100
    for i in xrange(0, 1024-2*cs, cs):
        # Get 100x100 squares
        chunk1 = img[i:i+cs, i:i+cs]
        chunk2 = img[i+cs:i+2*cs, i+cs:i+2*cs]
        chunk3 = chunk1*chunk2  # Trivial operation with them
    f.close()

Cheers,

--
>0,0<   Francesc Altet     http://www.carabos.com/
V   V   Cárabos Coop. V.   Enjoy Data
 "-"

From stefan at sun.ac.za Wed Apr 12 05:43:27 2006 From: stefan at sun.ac.za (Stefan van der Walt) Date: Wed Apr 12 05:43:27 2006 Subject: [Numpy-discussion] Vectorize bug Message-ID: <20060412124032.GA30471@sun.ac.za> Hello all Vectorize segfaults for large arrays. I filed the bug at http://projects.scipy.org/scipy/numpy/ticket/52 The offending code is

    import numpy as N
    x = N.linspace(-3,2,10000)
    y = N.vectorize(lambda x: x)
    # Segfaults here
    y(x)

Regards Stéfan From cimrman3 at ntc.zcu.cz Wed Apr 12 05:59:28 2006 From: cimrman3 at ntc.zcu.cz (Robert Cimrman) Date: Wed Apr 12 05:59:28 2006 Subject: [Numpy-discussion] shape setting problem Message-ID: <443CF984.9070306@ntc.zcu.cz> Hi, I have found a weird behaviour when setting the shape of a view of an array, see below... r.
---
In [43]:a = nm.zeros( (10,5) )
In [44]:b = a[:,2]
In [47]:b.fill( 3 )
In [48]:a
Out[48]:
array([[0, 0, 3, 0, 0],
       [0, 0, 3, 0, 0],
       [0, 0, 3, 0, 0],
       [0, 0, 3, 0, 0],
       [0, 0, 3, 0, 0],
       [0, 0, 3, 0, 0],
       [0, 0, 3, 0, 0],
       [0, 0, 3, 0, 0],
       [0, 0, 3, 0, 0],
       [0, 0, 3, 0, 0]])
-------------------------------------------ok
In [49]:b.fill( 0 )
In [50]:a
Out[50]:
array([[0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0]])
In [51]:b.shape = (5,2)
In [52]:b
Out[52]:
array([[0, 0],
       [0, 0],
       [0, 0],
       [0, 0],
       [0, 0]])
In [53]:b.fill( 3 )
In [54]:a
Out[54]:
array([[0, 0, 3, 3, 3],
       [3, 3, 3, 3, 3],
       [3, 3, 0, 0, 0],
       [0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0]])
------------------------------------ wrong?
Should not this give the same result as Out[48]? From aisaac at american.edu Wed Apr 12 06:11:11 2006 From: aisaac at american.edu (Alan G Isaac) Date: Wed Apr 12 06:11:11 2006 Subject: [Numpy-discussion] Re: sqrt and divide In-Reply-To: References: <20060412054517.GA27756@sun.ac.za> Message-ID: > Stefan van der Walt wrote: >> In [8]: N.divide(1,0) >> Out[8]: 0 >> should probably return inf / nan?) On Wed, 12 Apr 2006, Robert Kern apparently wrote: > inf and nan are floating point values. The definition of > int division used when both arguments to divide() are ints > also yields ints, not floats. But the Python behavior seems better for this case.

    >>> 1/0
    Traceback (most recent call last):
      File "<stdin>", line 1, in ?
    ZeroDivisionError: integer division or modulo by zero

fwiw, Alan Isaac From tim.hochberg at cox.net Wed Apr 12 08:36:05 2006 From: tim.hochberg at cox.net (Tim Hochberg) Date: Wed Apr 12 08:36:05 2006 Subject: [Numpy-discussion] Re: sqrt and divide In-Reply-To: References: <20060412054517.GA27756@sun.ac.za> Message-ID: <443D1E2B.5040604@cox.net> Robert Kern wrote: >Stefan van der Walt wrote: > > >>Hi all >> >>Two quick questions regarding unintuitive numpy behaviour: >> >>Why is the square root of -1 not equal to the square root of -1+0j? >> >>In [5]: N.sqrt(-1.) >>Out[5]: nan >> >>In [6]: N.sqrt(-1.+0j) >>Out[6]: 1j >> >> > >It is frequently the case that the argument being passed to sqrt() is expected >to be non-negative and all of their code strictly deals with numbers in the real >domain. If the argument happens to be negative, then it is a sign of a bug >earlier in the code or a floating point instability. Returning nan gives the >programmer the opportunity for sqrt() to complain loudly and expose bugs instead >of silently upcasting to a complex type. Programmers who *do* want to work in >the complex domain can easily perform the cast explicitly. > > > >>Is there an easier way of dividing two scalars than using divide? >> >>In [9]: N.divide(1.,0) >>Out[9]: inf >> >> > >x/y ? > > > >>(also >> >>In [8]: N.divide(1,0) >>Out[8]: 0 >> >>should probably return inf / nan?) >> >> > >inf and nan are floating point values. The definition of int division used when >both arguments to divide() are ints also yields ints, not floats. > > This relates to the discussion that Travis and I were having about error handling last week. The current defaults for handling errors are to ignore them all. This is for speed reasons, although our discussion may have alleviated some of these. The numarray default was to ignore underflow, but warn for the rest; this seemed to work well in practice.
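In terms of the current seterr interface, those numarray-ish defaults would be spelled something like this (a sketch -- the exact keyword names may not match what's in SVN):

    import numpy
    numpy.seterr(divide='warn', over='warn', under='ignore', invalid='warn')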
However, this example points in another possible direction.... Travis mentioned that checking the various error conditions in integer operations was painful and slowed things down since there wasn't machine support for it. My current opinion is that we should just punt on overflow and let integers overflow silently. That's what bit twiddlers want anyway and it'll be somewhere between difficult and impossible to do a good job. I don't think invalid and underflow apply to integers, so that leaves divide. I think my preference here would be for int divide to raise by default. That would require that there be five error classes, shown here with my preferred defaults:

    divide_by_zero="warn", overflow="warn", underflow="ignore", invalid="warn"
    int_divide_by_zero="raise"

The first four apply to floating point (and complex) operations, while the last applies to integer operations. The separation of warnings into two classes also helps avoid the expectation that we should be doing something useful about integer overflow. I don't *think* this should be too difficult; just stick an int_divide_by_zero flag on some thread_local variable and set it to true when there's been a divide by zero, checking on the way out of the ufunc machinery. I haven't tried it though, so it may be much harder than I envision. In any event, the current divide by zero checking seems to be a bit broken. I took a quick look at the code and it's not obvious why (unless my optimizer is eliding the error generation code?). This is the behaviour I see under Windows compiled using VC7:

    >>> one = np.array(1)
    >>> zero = np.array(0)
    >>> one/zero
    0
    >>> np.seterr(divide='raise')
    >>> one/zero # Should raise an error
    0
    >>> (one*1.0 / zero) # Works for floats though?!
    Traceback (most recent call last):
      File "<stdin>", line 1, in ?
    FloatingPointError: divide by zero encountered in divide

Regards, -tim From pfdubois at gmail.com Wed Apr 12 13:00:04 2006 From: pfdubois at gmail.com (Paul Dubois) Date: Wed Apr 12 13:00:04 2006 Subject: [Numpy-discussion] Seeking articles for special issue on Python and Science and Engineering Message-ID: IEEE's magazine, Computing in Science and Engineering (CiSE), has asked me to put together a theme issue on the use of Python in Science and Engineering. I will write an overview to be accompanied by 3-5 articles of a few pages (say 3000 words or so) each. The deadline for manuscripts will be in the Fall and publication early next year. I would like to select articles that show a diverse set of applications or tools, to give our readers a sense of whether or not Python might be useful in their own work. I will tailor the overview to "fill in the holes" a bit since with only a few articles we can't cover everything. Note that these are expository pieces, not research reports. We have a peer-reviewed section for the latter. Think "Scientific American" with respect to level: everybody gets something out of it, maybe a little more for those who know about the area. Please contact me if you are interested in writing such an article. The process is that I work with you on the shape of the article, then you write it, and our editorial staff helps you get it ready for publication. There is no annoying review process except that I am annoying. Ideas for cover art to go with the issue are always welcome. Information about CiSE and our author's guidelines are at computer.org/cise. It has a fairly large readership as such things go.
Thanks, Paul Dubois Editor, Scientific Programming Department CiSE -------------- next part -------------- An HTML attachment was scrubbed... URL: From stefan at sun.ac.za Wed Apr 12 13:50:16 2006 From: stefan at sun.ac.za (Stefan van der Walt) Date: Wed Apr 12 13:50:16 2006 Subject: [Numpy-discussion] Re: sqrt and divide In-Reply-To: References: <20060412054517.GA27756@sun.ac.za> Message-ID: <20060412204927.GA11408@alpha> On Wed, Apr 12, 2006 at 01:14:54AM -0500, Robert Kern wrote: > Stefan van der Walt wrote: > > Why is the square root of -1 not equal to the square root of -1+0j? > > > > In [5]: N.sqrt(-1.) > > Out[5]: nan > > > > In [6]: N.sqrt(-1.+0j) > > Out[6]: 1j > > It is frequently the case that the argument being passed to sqrt() is expected > to be non-negative and all of their code strictly deals with numbers in the real > domain. If the argument happens to be negative, then it is a sign of a bug > earlier in the code or a floating point instability. Returning nan gives the > programmer the opportunity for sqrt() to complain loudly and expose bugs instead > of silently upcasting to a complex type. Programmers who *do* want to work in > the complex domain can easily perform the cast explicitly. The current docstring (specified in generate_umath.py) states

    y = sqrt(x) square-root elementwise.

It would help a lot if it could explain the above constraint, e.g.

    y = sqrt(x) square-root elementwise. If x is real (and not complex),
    the domain is restricted to x >= 0.

> > In [9]: N.divide(1.,0) > > Out[9]: inf > > x/y ? On my system, x/y (for x=1., y=0) throws a ZeroDivisionError. Are the two divisions supposed to behave the same? Thanks for your feedback! Regards Stéfan From robert.kern at gmail.com Wed Apr 12 14:08:06 2006 From: robert.kern at gmail.com (Robert Kern) Date: Wed Apr 12 14:08:06 2006 Subject: [Numpy-discussion] Re: sqrt and divide In-Reply-To: <20060412204927.GA11408@alpha> References: <20060412054517.GA27756@sun.ac.za> <20060412204927.GA11408@alpha> Message-ID: Stefan van der Walt wrote: > On Wed, Apr 12, 2006 at 01:14:54AM -0500, Robert Kern wrote: > >>Stefan van der Walt wrote: >> >>>Why is the square root of -1 not equal to the square root of -1+0j? >>> >>>In [5]: N.sqrt(-1.) >>>Out[5]: nan >>> >>>In [6]: N.sqrt(-1.+0j) >>>Out[6]: 1j >> >>It is frequently the case that the argument being passed to sqrt() is expected >>to be non-negative and all of their code strictly deals with numbers in the real >>domain. If the argument happens to be negative, then it is a sign of a bug >>earlier in the code or a floating point instability. Returning nan gives the >>programmer the opportunity for sqrt() to complain loudly and expose bugs instead >>of silently upcasting to a complex type. Programmers who *do* want to work in >>the complex domain can easily perform the cast explicitly. > > The current docstring (specified in generate_umath.py) states > > y = sqrt(x) square-root elementwise. > > It would help a lot if it could explain the above constraint, e.g. > > y = sqrt(x) square-root elementwise. If x is real (and not complex), > the domain is restricted to x >= 0. I'll get around to it sometime. In the meantime, please make a ticket: http://projects.scipy.org/scipy/numpy/newticket >>>In [9]: N.divide(1.,0) >>>Out[9]: inf >> >>x/y ? > > On my system, x/y (for x=1., y=0) throws a ZeroDivisionError. Are > the two divisions supposed to behave the same? Not exactly, no. Specifically, the error handling is, by design, more flexible with numpy than regular float objects.
If you want that flexibility, then you need to use numpy scalars or ufuncs. -- Robert Kern robert.kern at gmail.com "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From jmgore75 at gmail.com Wed Apr 12 14:30:05 2006 From: jmgore75 at gmail.com (Jeremy Gore) Date: Wed Apr 12 14:30:05 2006 Subject: [Numpy-discussion] Massive differences in numpy vs. numeric string handling Message-ID: <3C3B42A3-962D-4F34-B704-AB1BF0E2390A@gmail.com> In Numeric:

    Numeric.array('test') -> array([t, e, s, t],'c'); shape = (4,)
    Numeric.array(['test','two']) ->
    array([[t, e, s, t],
           [t, w, o, ]],'c')

but in numpy:

    numpy.array('test') -> array('test', dtype='|S4'); shape = ()
    numpy.array('test','S1') -> array('t', dtype='|S1'); shape = ()

in fact you have to do an extra list cast:

    numpy.array(list('test'),'S1') -> array([t, e, s, t], dtype='|S1'); shape = (4,)

to get the desired result. I don't think this is very pythonic, as strings are fully indexable and iterable objects. Furthermore, converting/treating a string as an array of characters is a very common thing. convertcode.py would not appear to convert this part of the code correctly either. Also, the use of quotes in the shape () array but not in the shape (4,) array is inconsistent. I realize the ability to use strings of arbitrary length as array elements is important in numpy, but there really should be a more natural option to convert/cast strings as character arrays. Also, unlike Numeric.equal and 'c' arrays, numpy.equal cannot compare '|S1' arrays or presumably other strings for equality, although this is a very useful comparison to make. For the record, I have used the Numeric (and to a lesser degree the numarray) module extensively in bioinformatics applications for its speed and brevity. Jeremy From oliphant at ee.byu.edu Wed Apr 12 15:04:06 2006 From: oliphant at ee.byu.edu (Travis Oliphant) Date: Wed Apr 12 15:04:06 2006 Subject: [Numpy-discussion] Massive differences in numpy vs. numeric string handling In-Reply-To: <3C3B42A3-962D-4F34-B704-AB1BF0E2390A@gmail.com> References: <3C3B42A3-962D-4F34-B704-AB1BF0E2390A@gmail.com> Message-ID: <443D7939.2060406@ee.byu.edu> Jeremy Gore wrote: > In Numeric: > > Numeric.array('test') -> array([t, e, s, t],'c'); shape = (4,) > Numeric.array(['test','two']) -> > array([[t, e, s, t], > [t, w, o, ]],'c') > > but in numpy: > > numpy.array('test') -> array('test', dtype='|S4'); shape = () > numpy.array('test','S1') -> array('t', dtype='|S1'); shape = () > > in fact you have to do an extra list cast: > > numpy.array(list('test'),'S1') -> array([t, e, s, t], dtype='|S1'); > shape = (4,) > > to get the desired result. I don't think this is very pythonic, as > strings are fully indexable and iterable objects. Let's not cast this discussion in Pythonic vs. un-pythonic because that does not really shed light on the issues. NumPy adds full support for string arrays. Numeric had this step-child called a character array which was really just an array of bytes that printed differently. This does raise some compatibility issues that have been hard to get exactly right, and convertcode indeed does not really solve the problem for a heavy character-array user. I have resisted simply adding a 1-character string data-type back into NumPy, but that could be done if it is really necessary. But, I don't think it is.
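To be concrete about why: with the current code you can already get the old character-array behaviour out of a string without a new data-type. A sketch (fromstring and view are the pieces I'd lean on; treat the exact spellings as illustrative):

    import numpy

    # a 1-d array of single characters from a string
    c = numpy.fromstring('test', dtype='S1')   # shape (4,), dtype '|S1'
    # a 2-d "character array" from equal-length strings
    c2 = numpy.array(['test', 'twos'], dtype='S4').view('S1').reshape(2, 4)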
> Furthermore, converting/treating a string as an array of characters > is a very common thing. convertcode.py would not appear to convert > this part of the code correctly either. Also, the use of quotes in > the shape () array but not in the shape (4,) array is inconsistent. > > > I realize the ability to use strings of arbitrary length as array > elements is important in numpy, but there really should be a more > natural option to convert/cast strings as character arrays. Perhaps all that is needed to simplify handling is to handle the 'S1' case better so that array('test','S1') works the same as array('test','c') used to work (i.e. not stopping at strings for the sequence decomposition). > > Also, unlike Numeric.equal and 'c' arrays, numpy.equal cannot compare > '|S1' arrays or presumably other strings for equality, although this > is a very useful comparison to make. This is a known missing feature due to the fact that comparisons use ufuncs but ufuncs are not supported for variable-length arrays. Currently, however, you can use the chararray class which does allow comparisons of strings. There are simple ways to work around this, of course. If you do have 'S1' arrays, then you can simply view them as unsigned bytes (using the .view method) and do comparison that way. If s1 and s2 are "character arrays":

    s1.view(ubyte) >= s2.view(ubyte)

-Travis From tim.hochberg at cox.net Wed Apr 12 15:15:05 2006 From: tim.hochberg at cox.net (Tim Hochberg) Date: Wed Apr 12 15:15:05 2006 Subject: [Numpy-discussion] Massive differences in numpy vs. numeric string handling In-Reply-To: <3C3B42A3-962D-4F34-B704-AB1BF0E2390A@gmail.com> References: <3C3B42A3-962D-4F34-B704-AB1BF0E2390A@gmail.com> Message-ID: <443D7B74.6040808@cox.net> Jeremy Gore wrote: > In Numeric: > > Numeric.array('test') -> array([t, e, s, t],'c'); shape = (4,) > Numeric.array(['test','two']) -> > array([[t, e, s, t], > [t, w, o, ]],'c') > > but in numpy: > > numpy.array('test') -> array('test', dtype='|S4'); shape = () > numpy.array('test','S1') -> array('t', dtype='|S1'); shape = () > > in fact you have to do an extra list cast: > > numpy.array(list('test'),'S1') -> array([t, e, s, t], dtype='|S1'); > shape = (4,) The creation of arrays from python objects is full of all kinds of weird special cases. For numerical arrays this works pretty well, but for other sorts of arrays, like strings and even worse, objects, it's impossible to always guess the correct kind of thing to return. I'll leave it to the various string array users to battle it out over what's the right way to convert strings. However, in the meantime or if you do not prevail in this debate, I suggest you slap an appropriate three line function into your code somewhere. If all you care about is the interface issue, use:

    def chararray(astring):
        return numpy.array(list(astring), 'S1')

If you are worried about the performance of this, you could use the more cryptic, but more efficient:

    def chararray(astring):
        a = numpy.array(astring)
        return numpy.ndarray([len(astring)], 'S1', a.data)

Perhaps these will let you sleep at night. Regards, -tim > > to get the desired result. I don't think this is very pythonic, as > strings are fully indexable and iterable objects. Furthermore, > converting/treating a string as an array of characters is a very > common thing. convertcode.py would not appear to convert this part > of the code correctly either. Also, the use of quotes in the shape > () array but not in the shape (4,) array is inconsistent.
> > I realize the ability to use strings of arbitrary length as array > elements is important in numpy, but there really should be a more > natural option to convert/cast strings as character arrays. > > Also, unlike Numeric.equal and 'c' arrays, numpy.equal cannot compare > '|S1' arrays or presumably other strings for equality, although this > is a very useful comparison to make. > > For the record, I have used the Numeric (and to a lesser degree the > numarray) module extensively in bioinformatics applications for its > speed and brevity. > > Jeremy > > > ------------------------------------------------------- > This SF.Net email is sponsored by xPML, a groundbreaking scripting > language > that extends applications into web and mobile media. Attend the live > webcast > and join the prime developer group breaking into this new coding > territory! > http://sel.as-us.falkag.net/sel?cmd=lnk&kid=110944&bid=241720&dat=121642 > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/numpy-discussion > > From oliphant at ee.byu.edu Wed Apr 12 15:16:01 2006 From: oliphant at ee.byu.edu (Travis Oliphant) Date: Wed Apr 12 15:16:01 2006 Subject: [Numpy-discussion] [SciPy-user] Regarding what "where" returns In-Reply-To: References: <443C0D36.80608@enthought.com> <443D39F6.6040805@enthought.com> <443D601E.3020500@enthought.com> Message-ID: <443D7BD7.3060007@ee.byu.edu> Perry Greenfield wrote: >We've noticed that in numpy that the where() function behaves >differently than for numarray. In numarray, where() (when used with a >mask or condition array only) always returns a tuple of index arrays, >even for the 1D case whereas numpy returns an index array for the 1D >case and a tuple for higher dimension cases. While the tuple is a >annoyance for users when they want to manipulate the 1D case, the >benefit is that one always knows that where is returning a tuple, and >thus can write code accordingly. The problem with the current numpy >behavior is that it requires special case testing to see which kind >return one has before manipulating if you aren't certain of what the >dimensionality of the argument is going to be. > > I think this is reasonable. I don't think much thought went in to the current behavior as it simply defaults to the behavior of the nonzero method (where just defaults to nonzero in the circumstances you are describing). The nonzero method has it's behavior because of the nonzero function in Numeric (which only worked with 1-d and returned an array not a tuple). Ideally, I think we should fix the nonzero method and where to have the same behavior (both return tuples --- that's actually what the docstring of nonzero says right now). The nonzero function can be special-cased to index the tuple for backward compatibility. -Travis From tim.hochberg at cox.net Wed Apr 12 15:32:04 2006 From: tim.hochberg at cox.net (Tim Hochberg) Date: Wed Apr 12 15:32:04 2006 Subject: [Numpy-discussion] Massive differences in numpy vs. 
numeric string handling In-Reply-To: <443D7939.2060406@ee.byu.edu> References: <3C3B42A3-962D-4F34-B704-AB1BF0E2390A@gmail.com> <443D7939.2060406@ee.byu.edu> Message-ID: <443D7F5E.1020007@cox.net> Travis Oliphant wrote: > Jeremy Gore wrote: > >> In Numeric: >> >> Numeric.array('test') -> array([t, e, s, t],'c'); shape = (4,) >> Numeric.array(['test','two']) -> >> array([[t, e, s, t], >> [t, w, o, ]],'c') >> >> but in numpy: >> >> numpy.array('test') -> array('test', dtype='|S4'); shape = () >> numpy.array('test','S1') -> array('t', dtype='|S1'); shape = () >> >> in fact you have to do an extra list cast: >> >> numpy.array(list('test'),'S1') -> array([t, e, s, t], dtype='|S1'); >> shape = (4,) >> >> to get the desired result. I don't think this is very pythonic, as >> strings are fully indexable and iterable objects. > > > > Let's not cast this discussion in Pythonic vs. un-pythonic because > that does not really shed light on the issues. > > NumPy adds full support for string arrays. Numeric had this > step-child called a character array which was really just an array of > bytes that printed differently. > This does raise some compatibility issues that have been hard to get > exactly right, and convertcode indeed does not really solve the > problem for a heavy character-array user. I have resisted simply > adding back a 1-character string data-type back into NumPy, but that > could be done if it is really necessary. But, I don't think it is. > >> Furthermore, converting/treating a string as an array of >> characters is a very common thing. convertcode.py would not appear >> to convert this part of the code correctly either. Also, the use of >> quotes in the shape () array but not in the shape (4,) array is >> inconsistent. > > >> >> >> I realize the ability to use strings of arbitrary length as array >> elements is important in numpy, but there really should be a more >> natural option to convert/cast strings as character arrays. > > > Perhaps all that is needed to simplify handling is to handle the 'S1' > case better so that > > array('test','S1') works the same as array('test','c') used to work > (i.e. not stopping at strings for the sequence decomposition). It seems a little wacky that 'S2' and 'S1' would have vastly different behaviour. >> >> Also, unlike Numeric.equal and 'c' arrays, numpy.equal cannot >> compare '|S1' arrays or presumably other strings for equality, >> although this is a very useful comparison to make. > > > This is a known missing feature due to the fact that comparisons use > ufuncs but ufuncs are not supported for variable-length arrays. > Currently, however you can use the chararray class which does allow > comparisons of strings. It seems like this should be easy to worm around in __cmp__ (or array_compare or however it's spelled). Since the strings really have a fixed length, they're more or less equivalent to byte arrays with one extra dimension. Writing a little lexographic comparison thing on top of the results of a ufunc operating on the result of a compare of these byte arrays should be a piece of cake; in theory at least. > > There are simple ways to work around this, of course. If you do have > 'S1' arrays, then you can simply view them as unsigned bytes (using > the .view method) and do comparison that way. > if s1 and s2 are "character arrays" > > s1.view(ubyte) >= s2.view(ubyte) Nice! 
Regards, -tim From oliphant at ee.byu.edu Wed Apr 12 15:47:04 2006 From: oliphant at ee.byu.edu (Travis Oliphant) Date: Wed Apr 12 15:47:04 2006 Subject: ***[Possible UCE]*** Re: [Numpy-discussion] Massive differences in numpy vs. numeric string handling In-Reply-To: <443D7F5E.1020007@cox.net> References: <3C3B42A3-962D-4F34-B704-AB1BF0E2390A@gmail.com> <443D7939.2060406@ee.byu.edu> <443D7F5E.1020007@cox.net> Message-ID: <443D8336.60606@ee.byu.edu> Tim Hochberg wrote: > > It seems a little wacky that 'S2' and 'S1' would have vastly different > behaviour. True. Much better is a compatibility function such as the one you gave. >> This is a known missing feature due to the fact that comparisons use >> ufuncs but ufuncs are not supported for variable-length arrays. >> Currently, however you can use the chararray class which does allow >> comparisons of strings. > > > It seems like this should be easy to worm around in __cmp__ (or > array_compare or however it's spelled). Since the strings really have > a fixed length, they're more or less equivalent to byte arrays with > one extra dimension. Writing a little lexographic comparison thing on > top of the results of a ufunc operating on the result of a compare of > these byte arrays should be a piece of cake; in theory at least. Yes, indeed it could be handled there as well. It's the rich_compare function (all the cases are handled there...). Right now, equality testing is special-cased a bit (inheriting behavior from Numeric). I've gone back and forth on whether I should put effort into handling variable-length arrays with ufuncs (which might be better long-term --- or just an example of feature bloat as I can't think of many use cases except this one), or just special-case the needed comparisons (which would take less thought to implement). I'm leaning towards the latter case --- special-case comparison of string arrays in the rich_compare function. The next thing to think about is then Unicode arrays. The problem with comparisons on unicode arrays though is "how do you compare unicode strings" in a meaningful way (i.e. what is alphabetical?). -Travis From oliphant at ee.byu.edu Wed Apr 12 15:56:03 2006 From: oliphant at ee.byu.edu (Travis Oliphant) Date: Wed Apr 12 15:56:03 2006 Subject: [Numpy-discussion] Re: ***[Possible UCE]*** [SciPy-user] Regarding what "where" returns In-Reply-To: References: <443C0D36.80608@enthought.com> <443D39F6.6040805@enthought.com> <443D601E.3020500@enthought.com> Message-ID: <443D857F.9000605@ee.byu.edu> Perry Greenfield wrote: >We've noticed that in numpy that the where() function behaves >differently than for numarray. In numarray, where() (when used with a >mask or condition array only) always returns a tuple of index arrays, >even for the 1D case whereas numpy returns an index array for the 1D >case and a tuple for higher dimension cases. While the tuple is a >annoyance for users when they want to manipulate the 1D case, the >benefit is that one always knows that where is returning a tuple, and >thus can write code accordingly. The problem with the current numpy >behavior is that it requires special case testing to see which kind >return one has before manipulating if you aren't certain of what the >dimensionality of the argument is going to be. > > I went ahead and made this change to the code. The nonzero function still behaves as before (and in fact only works for 1-d arrays as it did in Numeric). The where(condition) function works the same as condition.nonzero() and both always return a tuple. 
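For example, a quick sketch of the now-uniform behavior (1-d case shown):

import numpy

cond = numpy.array([0, 3, 0, 5]) > 0
out = numpy.where(cond)       # always a tuple of index arrays
assert isinstance(out, tuple)
idx, = out                    # unpack the 1-d case explicitly
print(idx)                    # -> [1 3]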
I had to change exactly one piece of code that used the new where syntax. This does represent a code breakage with the where syntax (but only if you used the newer, numarray-introduced usage). I think this is a small-enough segment that we can make this change. -Travis From robert.kern at gmail.com Wed Apr 12 15:57:06 2006 From: robert.kern at gmail.com (Robert Kern) Date: Wed Apr 12 15:57:06 2006 Subject: [Numpy-discussion] Re: Massive differences in numpy vs. numeric string handling In-Reply-To: <443D7B74.6040808@cox.net> References: <3C3B42A3-962D-4F34-B704-AB1BF0E2390A@gmail.com> <443D7B74.6040808@cox.net> Message-ID: Tim Hochberg wrote: > Jeremy Gore wrote: > >> In Numeric: >> >> Numeric.array('test') -> array([t, e, s, t],'c'); shape = (4,) >> Numeric.array(['test','two']) -> >> array([[t, e, s, t], >> [t, w, o, ]],'c') >> >> but in numpy: >> >> numpy.array('test') -> array('test', dtype='|S4'); shape = () >> numpy.array('test','S1') -> array('t', dtype='|S1'); shape = () >> >> in fact you have to do an extra list cast: >> >> numpy.array(list('test'),'S1') -> array([t, e, s, t], dtype='|S1'); >> shape = (4,) > > The creation of arrays from python objects is full of all kinds of weird > special cases. For numerical arrays this is works pretty well , but for > other sorts of arrays, like strings and even worse, objects, it's > impossible to always guess the correct kind of thing to return. I'll > leave it to the various string array users to battle it out over what's > the right way to convert strings. However, in the meantime or if you do > not prevail in this debate, I suggest you slap an appropriate three line > function into your code somewhere. I would suggest this way of thinking about it: numpy.array() shouldn't have to handle every possible way to construct an array. People building less-common arrays from less-common Python objects may have to use a different constructor if they want to do so in a natural way. Implementing every possible combination in numpy.array() *and* making it intuitive and readable are incommensurate goals, in my opinion. > If all you care about is the interface issues use: > > def chararray(astring): > return numpy.array(list(astring), 'S1') > > If you are worried about the performance of this, you could use the more > cryptic, but more efficient: > > def chararray(astring): > a = numpy.array(astring) > return numpy.ndarray([len(astring)], 'S1', a.data) Better: In [31]: fromstring('test', dtype('S1')) Out[31]: array([t, e, s, t], dtype='|S1') There's still the issue of N-D arrays of character, though. -- Robert Kern robert.kern at gmail.com "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From oliphant at ee.byu.edu Wed Apr 12 17:04:05 2006 From: oliphant at ee.byu.edu (Travis Oliphant) Date: Wed Apr 12 17:04:05 2006 Subject: [Numpy-discussion] Toward release 1.0 of NumPy Message-ID: <443D9543.8040601@ee.byu.edu> The next release of NumPy will be 0.9.8 Before this release is made, I want to make sure the following tickets are implemented http://projects.scipy.org/scipy/numpy/ticket/54 http://projects.scipy.org/scipy/numpy/ticket/55 http://projects.scipy.org/scipy/numpy/ticket/56 Once 0.9.8 is out, I'd like to name the next release NumPy 1.0 Release Candidate 1 and have a series of release candidates so that hopefully by SciPy 2006 conference, NumPy 1.0 is out. 
This also dove-tails nicely with the Python 2.5 release schedule so that NumPy 1.0 should work with Python 2.5 and be fully 64-bit capable for handling very-large arrays. The recent discussions and bug-reports have been very helpful. If you have found a bug, please report it on the Trac pages so that we don't lose sight of it. Report bugs by "submitting a ticket" here: http://projects.scipy.org/scipy/numpy/newticket -Travis From oliphant at ee.byu.edu Wed Apr 12 17:11:04 2006 From: oliphant at ee.byu.edu (Travis Oliphant) Date: Wed Apr 12 17:11:04 2006 Subject: [Numpy-discussion] Toward release 1.0 of NumPy In-Reply-To: <443D9543.8040601@ee.byu.edu> References: <443D9543.8040601@ee.byu.edu> Message-ID: <443D96DC.3060501@ee.byu.edu> Travis Oliphant wrote: > > The next release of NumPy will be 0.9.8 > > Before this release is made, I want to make sure the following > tickets are implemented > > http://projects.scipy.org/scipy/numpy/ticket/54 > http://projects.scipy.org/scipy/numpy/ticket/55 > http://projects.scipy.org/scipy/numpy/ticket/56 So you don't have to read each one individually: #54 : implement thread-based error-handling modes #55 : finish scalar-math implementation which recognizes same error-handling #56 : implement rich_comparisons on string arrays and unicode arrays. -Travis From robert.kern at gmail.com Wed Apr 12 17:19:07 2006 From: robert.kern at gmail.com (Robert Kern) Date: Wed Apr 12 17:19:07 2006 Subject: [Numpy-discussion] Re: Toward release 1.0 of NumPy In-Reply-To: <443D9543.8040601@ee.byu.edu> References: <443D9543.8040601@ee.byu.edu> Message-ID: Travis Oliphant wrote: > > The next release of NumPy will be 0.9.8 I have added a "0.9.8 Release" milestone to the Trac and have scheduled all of these tickets for that milestone. -- Robert Kern robert.kern at gmail.com "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From tim.hochberg at cox.net Wed Apr 12 17:59:12 2006 From: tim.hochberg at cox.net (Tim Hochberg) Date: Wed Apr 12 17:59:12 2006 Subject: [Numpy-discussion] Toward release 1.0 of NumPy In-Reply-To: <443D96DC.3060501@ee.byu.edu> References: <443D9543.8040601@ee.byu.edu> <443D96DC.3060501@ee.byu.edu> Message-ID: <443DA1B1.8040406@cox.net> Travis Oliphant wrote: > Travis Oliphant wrote: > >> >> The next release of NumPy will be 0.9.8 >> >> Before this release is made, I want to make sure the following >> tickets are implemented >> >> http://projects.scipy.org/scipy/numpy/ticket/54 >> http://projects.scipy.org/scipy/numpy/ticket/55 >> http://projects.scipy.org/scipy/numpy/ticket/56 > > > > So you don't have to read each one individually: > > > #54 : implement thread-based error-handling modes > #55 : finish scalar-math implementation which recognizes same > error-handling > #56 : implement rich_comparisons on string arrays and unicode arrays. I'll help with #54 at least, since I was the complainer, er I mean, since I brought that one up. It's probably better to get that started before #55 anyway. The open issues that I see connected to this are: 1. Better support for catching integer divide by zero. That doesn't work at all here, I'm guessing because my optimizer is too smart. I spent a half hour this morning trying how to set the divide by zero flag directly using VC7, but I couldn't find anything. I suppose I could see if there's some pragma to turn off optimization around that one function. 
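In the meantime, here is a rough pure-Python sketch of the kind of per-thread
mode storage I have in mind (push_mode/pop_mode are illustrative names only,
not proposed API):

import threading

_state = threading.local()

def _modes():
    # one stack per thread, created lazily on first use in that thread
    if not hasattr(_state, 'stack'):
        _state.stack = [{'divide': 'ignore', 'over': 'ignore',
                         'under': 'ignore', 'invalid': 'ignore'}]
    return _state.stack

def push_mode(**kwds):
    # copy the current top so unspecified settings are inherited
    top = dict(_modes()[-1])
    top.update(kwds)
    _modes().append(top)

def pop_mode():
    # never pop the per-thread defaults
    if len(_modes()) > 1:
        _modes().pop()

def current_mode():
    return _modes()[-1]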
However, I'm interested in what you think of stuffing the integer divide by
zero information directly into a flag on the thread local object and then
checking it on the way out. This is cleaner in that it doesn't rely on
platform specific flag setting ifdeffery and it allows us to consider issue #2.

2. Breaking integer divide by zero out from floating point divide by zero.
The former is more serious in that it's silent. The latter returns INF, so
you can see that something happened by examining your results, while the
former returns zero. That has much more potential for confusion and silent
bugs. Thus, it seems reasonable to be able to set the error handling
differently for integer divide by zero and floating point divide by zero.
Note that this would allow integer divide by zero to be set to 'raise' and
still run all the FP ops at max speed, since the flag saying do no error
checking could ignore the int_divide_by_zero setting.

3. Tossing out the overflow checking on integer operations. It's incomplete
anyway and it slows things down. I don't really expect my integer operations
to be overflow checked, and personally I think that incomplete checking is
worse than no checking. I think we should at least disable the support for
the time being and possibly revisit this later when we have time to do a
complete job and if it seems necessary.

4. Different defaults. I'd like to enable different defaults without slowing
things down in the really super fast case.

Looking at this list now, it looks like only #4 needs to be addressed when
doing the initial implementation of the thread local error handling, and
even that one can be done in parallel, so I guess we should just start with
creating the thread local object and see what happens. If you like I can
start working on this, although I may not be able to get much done on it
till Monday.

Regards,

-tim

From simon at arrowtheory.com Wed Apr 12 18:17:03 2006
From: simon at arrowtheory.com (Simon Burton)
Date: Wed Apr 12 18:17:03 2006
Subject: [Numpy-discussion] index objects are not broadcastable to a single shape
Message-ID: <20060413111612.3bb4e6fc.simon@arrowtheory.com>

This must be up there with the most useless confusing error messages:

>>> a=numpy.array([1,2,3])
>>> b=numpy.array([1,2,3,4])
>>> a*b
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
ValueError: index objects are not broadcastable to a single shape
>>>

Simon.

--
Simon Burton, B.Sc.
Licensed PO Box 8066
ANU Canberra 2601
Australia
Ph.
61 02 6249 6940 http://arrowtheory.com From oliphant at ee.byu.edu Wed Apr 12 18:25:03 2006 From: oliphant at ee.byu.edu (Travis Oliphant) Date: Wed Apr 12 18:25:03 2006 Subject: [Numpy-discussion] Toward release 1.0 of NumPy In-Reply-To: <443DA1B1.8040406@cox.net> References: <443D9543.8040601@ee.byu.edu> <443D96DC.3060501@ee.byu.edu> <443DA1B1.8040406@cox.net> Message-ID: <443DA866.3090806@ee.byu.edu> Tim Hochberg wrote: > Travis Oliphant wrote: > >> Travis Oliphant wrote: >> >>> >>> The next release of NumPy will be 0.9.8 >>> >>> Before this release is made, I want to make sure the following >>> tickets are implemented >>> >>> http://projects.scipy.org/scipy/numpy/ticket/54 >>> http://projects.scipy.org/scipy/numpy/ticket/55 >>> http://projects.scipy.org/scipy/numpy/ticket/56 >> >> >> >> >> So you don't have to read each one individually: >> >> >> #54 : implement thread-based error-handling modes >> #55 : finish scalar-math implementation which recognizes same >> error-handling >> #56 : implement rich_comparisons on string arrays and unicode arrays. > > > I'll help with #54 at least, since I was the complainer, er I mean, > since I brought that one up. It's probably better to get that started > before #55 anyway. The open issues that I see connected to this are: Great. I agree that #54 needs to be done before #55 (error handling is what's been holding up #55 the whole time. > > 1. Better support for catching integer divide by zero. That doesn't > work at all here, Probably a platform/compiler issue. The numarray equivalent code had an if statement to prevent the compiler from optimizing it away. Perhaps we need to do something like that. Also, perhaps VC7 has some means to set the divide by zero error more directly and we can just use that. > I'm guessing because my optimizer is too smart. I spent a half hour > this morning trying how to set the divide by zero flag directly using > VC7, but I couldn't find anything. I suppose I could see if there's > some pragma to turn off optimization around that one function. > However, I'm interested in what you think of stuffing the integer > divide by zero information directly into a flag on the thread local > object and then checking it on the way out. Hmm.. The only issue is that dictionary look-ups are more expensive then register look-ups. This could be costly. > This is cleaner in that it doesn't rely on platform specific flag > setting ifdeffery and it allows us to consider issue #2. > > 2. Breaking integer divide by zero out from floating point divide > by zero. The former is more serious in that it's silent. The latter > returns INF, so you can see that something happened by examing your > results, while the former returns zero. That has much more potential > for confusion and silents bugs. Thus, it seems reasonable to be able > to set the error handling different for integer divide by zero and > floating point divide by zero. Note that this would allow integer > divide by zero to be set to 'raise' and still run all the FP ops at > max speed, since the flag saying do no error checking could ignore the > int_divide_by_zero setting. Interesting proposal. Yes, it is true that integer division returning zero is less well-justified. But, I'm still concerned with doing a dictionary lookup for every divide-by-zero, and (more importantly) to check to see if a divide-by-zero has occurred. The dictionary lookups is the largest source of small-array slow-down when comparing Numeric to NumPy. > > 3. 
Tossing out the overflow checking on integer operations. It's > incomplete anyway and it slows things down. I don't really expect my > integer operations to be overflow checked, and personally I think that > incomplete checking is worse than no checking. I think we should at > least disable the support for the time being and possibly revisit this > latter when we have time to do a complete job and if it seems necessary. I'm all for that. I think it makes the code slower and because it is incomplete (addition and subtraction don't do it), it makes for harder-to-explain code. On the scalar operations, we should check for over-flow, however... > > 4. Different defaults I'd like to enable different defaults without > slowing things down in the really super fast case. The discussion on different defaults is fine. The slow-down is that with the current defaults, the error register flags are not actually checked if the default has not been changed. With the numarray-defaults, the register flags would be checked at the end of each 1-d loop. I'm not sure what kind of slow-down that would bring. Certainly for 1-d cases, there would be little difference. One could actually simply store different defaults (but it would result in minor slow-downs because the register flags would be checked. -Travis From oliphant at ee.byu.edu Wed Apr 12 18:30:03 2006 From: oliphant at ee.byu.edu (Travis Oliphant) Date: Wed Apr 12 18:30:03 2006 Subject: [Numpy-discussion] index objects are not broadcastable to a single shape In-Reply-To: <20060413111612.3bb4e6fc.simon@arrowtheory.com> References: <20060413111612.3bb4e6fc.simon@arrowtheory.com> Message-ID: <443DA966.1020301@ee.byu.edu> Simon Burton wrote: >This must be up there with the most useless confusing error messages: > > > >>>>a=numpy.array([1,2,3]) >>>>b=numpy.array([1,2,3,4]) >>>>a*b >>>> >>>> >Traceback (most recent call last): > File "", line 1, in ? >ValueError: index objects are not broadcastable to a single shape > > > > > The problem with these error messages is that some code is used in a wide-variety of circumstances. The original error message was conceived in thinking about the application of the code to one circumstance while this particular error is occurring in a different one. The standard behavior is to just propagate the error up. Better error messages means catching a lot more errors and special-casing error messages. It can be done, but it's tedious work. -Travis From simon at arrowtheory.com Wed Apr 12 20:34:04 2006 From: simon at arrowtheory.com (Simon Burton) Date: Wed Apr 12 20:34:04 2006 Subject: [Numpy-discussion] index objects are not broadcastable to a single shape In-Reply-To: <443DA966.1020301@ee.byu.edu> References: <20060413111612.3bb4e6fc.simon@arrowtheory.com> <443DA966.1020301@ee.byu.edu> Message-ID: <20060413133326.2889a5c5.simon@arrowtheory.com> On Wed, 12 Apr 2006 19:29:10 -0600 Travis Oliphant wrote: > The problem with these error messages is that some code is used in a > wide-variety of circumstances. The original error message was conceived > in thinking about the application of the code to one circumstance while > this particular error is occurring in a different one. > > The standard behavior is to just propagate the error up. Better error > messages means catching a lot more errors and special-casing error > messages. It can be done, but it's tedious work. OK. Can the error message be a little more generic, longer, etc. ? "shape mismatch (index objects are not broadcastable to a single shape)" ? I don't know either. 
I'm just thinking about all the new numpy/python users at work here that I will need to hand hold. Error messages like this are pretty scary. Simon. -- Simon Burton, B.Sc. Licensed PO Box 8066 ANU Canberra 2601 Australia Ph. 61 02 6249 6940 http://arrowtheory.com From tim.hochberg at cox.net Wed Apr 12 21:59:01 2006 From: tim.hochberg at cox.net (Tim Hochberg) Date: Wed Apr 12 21:59:01 2006 Subject: [Numpy-discussion] Toward release 1.0 of NumPy In-Reply-To: <443DA866.3090806@ee.byu.edu> References: <443D9543.8040601@ee.byu.edu> <443D96DC.3060501@ee.byu.edu> <443DA1B1.8040406@cox.net> <443DA866.3090806@ee.byu.edu> Message-ID: <443DD9D9.9080004@cox.net> Travis Oliphant wrote: > Tim Hochberg wrote: > >> Travis Oliphant wrote: >> >>> Travis Oliphant wrote: >>> >>>> >>>> The next release of NumPy will be 0.9.8 >>>> >>>> Before this release is made, I want to make sure the following >>>> tickets are implemented >>>> >>>> http://projects.scipy.org/scipy/numpy/ticket/54 >>>> http://projects.scipy.org/scipy/numpy/ticket/55 >>>> http://projects.scipy.org/scipy/numpy/ticket/56 >>> >>> >>> >>> >>> >>> So you don't have to read each one individually: >>> >>> >>> #54 : implement thread-based error-handling modes >>> #55 : finish scalar-math implementation which recognizes same >>> error-handling >>> #56 : implement rich_comparisons on string arrays and unicode arrays. >> >> >> >> I'll help with #54 at least, since I was the complainer, er I mean, >> since I brought that one up. It's probably better to get that started >> before #55 anyway. The open issues that I see connected to this are: > > > Great. I agree that #54 needs to be done before #55 (error handling > is what's been holding up #55 the whole time. > >> >> 1. Better support for catching integer divide by zero. That >> doesn't work at all here, > > > Probably a platform/compiler issue. The numarray equivalent code had > an if statement to prevent the compiler from optimizing it away. > Perhaps we need to do something like that. Also, perhaps VC7 has > some means to set the divide by zero error more directly and we can > just use that. > >> I'm guessing because my optimizer is too smart. I spent a half hour >> this morning trying how to set the divide by zero flag directly using >> VC7, but I couldn't find anything. I suppose I could see if there's >> some pragma to turn off optimization around that one function. >> However, I'm interested in what you think of stuffing the integer >> divide by zero information directly into a flag on the thread local >> object and then checking it on the way out. > > > > Hmm.. The only issue is that dictionary look-ups are more expensive > then register look-ups. This could be costly. > > >> This is cleaner in that it doesn't rely on platform specific flag >> setting ifdeffery and it allows us to consider issue #2. >> >> 2. Breaking integer divide by zero out from floating point divide >> by zero. The former is more serious in that it's silent. The latter >> returns INF, so you can see that something happened by examing your >> results, while the former returns zero. That has much more potential >> for confusion and silents bugs. Thus, it seems reasonable to be able >> to set the error handling different for integer divide by zero and >> floating point divide by zero. Note that this would allow integer >> divide by zero to be set to 'raise' and still run all the FP ops at >> max speed, since the flag saying do no error checking could ignore >> the int_divide_by_zero setting. > > > > Interesting proposal. 
Yes, it is true that integer division
> returning zero is less well-justified. But, I'm still concerned with
> doing a dictionary lookup for every divide-by-zero, and (more
> importantly) to check to see if a divide-by-zero has occurred. The
> dictionary lookups is the largest source of small-array slow-down when
> comparing Numeric to NumPy.

Well, assuming that we can fix the error flag setting code here, we could
still break the divide by zero error handling out by doing some special
casing in the ufunc machinery since the ufuncs presumably can figure out
their own types. Still, the thread local storage option is cleaner if we
can figure out a way to make the dictionary lookups fast enough.

The lookup in the failing case is not a big deal I don't think. First, it's
normally an error so I don't mind introducing some slowing. Second, it
should be easy to only do the lookup once. Just have a flag that ensures
that after the first lookup, the divide by zero flag is not set a second
time. I guess the bigger issue is the lookup on the way out to see if
anything failed. I have a plan, which I'll present at the bottom.

>>
>> 3. Tossing out the overflow checking on integer operations. It's
>> incomplete anyway and it slows things down. I don't really expect my
>> integer operations to be overflow checked, and personally I think
>> that incomplete checking is worse than no checking. I think we should
>> at least disable the support for the time being and possibly revisit
>> this later when we have time to do a complete job and if it seems
>> necessary.
>
> I'm all for that. I think it makes the code slower and because it is
> incomplete (addition and subtraction don't do it), it makes for
> harder-to-explain code.
>
> On the scalar operations, we should check for over-flow, however...

OK.

>
>> 4. Different defaults. I'd like to enable different defaults without
>> slowing things down in the really super fast case.
>
> The discussion on different defaults is fine. The slow-down is that
> with the current defaults, the error register flags are not actually
> checked if the default has not been changed. With the
> numarray-defaults, the register flags would be checked at the end of
> each 1-d loop. I'm not sure what kind of slow-down that would
> bring. Certainly for 1-d cases, there would be little difference.
>
> One could actually simply store different defaults (but it would
> result in minor slow-downs because the register flags would be checked.

OK, here's my plan. It sounds like it will work, but this threading
business is always tricky so find holes in it if you can.

1. As we've discussed we grow some thread local storage. This storage has
flags check_divide, check_over, check_under, check_invalid and
check_int_divide. It also has a flag int_divide_err. These flags are
initialized to False, but then may immediately be set to a different
default value. This is to simplify #3.

2. We grow 6 static longs that correspond to the above and are initialized
to zero. They should be called check_divide_count, etc. or something
similar.

3. Whenever a flag is switched from False to True its corresponding global
is incremented. Similarly, when switched from True to False the global is
decremented.

4. When a divide by integer zero occurs, we check the int_divide_err flag.
If it is false, we set it to true and also increment int_divide_err_count.
We also set a local flag so that we don't do this again in that call to the
ufunc core function.
We can actually skip this whole step if check_int_divide_count is zero.

With all that in place, I think we should be able to do things efficiently.
The ufunc can check whether any of the XXX_check_counts are nonzero and
turn on register flag checking as appropriate. If an error occurs, it still
only has to go to the per thread dictionary if the count for that
particular error type is nonzero. Similarly, if the count
int_divide_err_count is nonzero, the ufunc will have to go to the
dictionary. If the error was set in this thread, then appropriate action
(including possibly nothing) is taken and int_divide_err_count is
decremented.

That all sounds more complicated than it really is, at least in my head ;)
Anyway, try to find the holes in it. It should be able to run at full speed
if you turn off error checking in all threads. It should run at almost full
speed as long as there aren't any errors that are being checked in *any
thread*. I think in practice this means that most of the speed hit that is
seen in numarray won't be here. It doesn't actually matter what the
defaults are; turning off all error checking will still be fast.

Regards,

-tim

From arnd.baecker at web.de Thu Apr 13 00:58:04 2006
From: arnd.baecker at web.de (Arnd Baecker)
Date: Thu Apr 13 00:58:04 2006
Subject: [Numpy-discussion] Massive differences in numpy vs. numeric string handling
In-Reply-To: <3C3B42A3-962D-4F34-B704-AB1BF0E2390A@gmail.com>
References: <3C3B42A3-962D-4F34-B704-AB1BF0E2390A@gmail.com>
Message-ID:

On Wed, 12 Apr 2006, Jeremy Gore wrote:

> In Numeric:
[...]
> but in numpy:
[...]
> For the record, I have used the Numeric (and to a lesser degree the
> numarray) module extensively in bioinformatics applications for its
> speed and brevity.

If (after this round of discussion) there remain any differences, it would
be good if you could add them to the wiki at
http://www.scipy.org/Converting_from_Numeric

Best, Arnd

P.S.: The same applies of course to any other differences which show up!

From svetosch at gmx.net Thu Apr 13 01:20:02 2006
From: svetosch at gmx.net (Sven Schreiber)
Date: Thu Apr 13 01:20:02 2006
Subject: [Numpy-discussion] Toward release 1.0 of NumPy
In-Reply-To: <443D9543.8040601@ee.byu.edu>
References: <443D9543.8040601@ee.byu.edu>
Message-ID: <443E096D.3040407@gmx.net>

Travis Oliphant wrote:
>
> The next release of NumPy will be 0.9.8
>
> The recent discussions and bug-reports have been very helpful. If you
> have found a bug, please report it on the Trac pages so that we don't
> lose sight of it.
> Report bugs by "submitting a ticket" here: > Before submitting the following as a bug, I would like to repeat what I posted earlier (no replies) to check whether you agree it's a bug: The "kron" (Kronecker product) function returns numpy-arrays even if both arguments are numpy-matrices; imho that's a bug in light of the proclaimed goal of preserving matrices where possible/sensible. On a related issue, "eye" also still returns a numpy-array instead of a numpy-matrix. At least one person (I think it was Ed Schofield) agreed that it would be better to return a numpy-matrix, given that another function ("identity") already returns a numpy-array. Currently, one of the two functions seems redundant. So unless somebody tells me otherwise, I will submit these two things as bugs/tickets. Great that numpy soon will be officially stable! Cheers, Sven From pgmdevlist at mailcan.com Thu Apr 13 01:41:02 2006 From: pgmdevlist at mailcan.com (Pierre GM) Date: Thu Apr 13 01:41:02 2006 Subject: [Numpy-discussion] range/arange Message-ID: <200604130507.40241.pgmdevlist@mailcan.com> Folks, Could any of you explain me why the two following commands give different results ? It's mere curiosity, for my personal edification. [(m-5)/10 for m in arange(1,10)] [0, 0, 0, 0, 0, 0, 0, 0, 0] [(m-5)/10 for m in range(1,10)] [-1, -1, -1, -1, 0, 0, 0, 0, 0] From lars.bittrich at googlemail.com Thu Apr 13 02:30:01 2006 From: lars.bittrich at googlemail.com (Lars Bittrich) Date: Thu Apr 13 02:30:01 2006 Subject: [Numpy-discussion] range/arange In-Reply-To: <200604130507.40241.pgmdevlist@mailcan.com> References: <200604130507.40241.pgmdevlist@mailcan.com> Message-ID: <200604131123.56171.lars.bittrich@googlemail.com> Hi, On Thursday 13 April 2006 11:07, Pierre GM wrote: > Could any of you explain me why the two following commands give different > results ? It's mere curiosity, for my personal edification. > > [(m-5)/10 for m in arange(1,10)] > [0, 0, 0, 0, 0, 0, 0, 0, 0] > > [(m-5)/10 for m in range(1,10)] > [-1, -1, -1, -1, 0, 0, 0, 0, 0] I have no idea where the reason is located exactly, but it seems to be caused by different types of range and arange. In [15]:type(arange(1,10)[0]) Out[15]: In [14]:type(range(1,10)[0]) Out[14]: If you use for example: In [16]:-1/10 Out[16]:-1 you get the normal behavior of the "floor" function. In [17]:floor(-.1) Out[17]:-1.0 The behavior of int32scalar seems more intuitive to me. Best regards, Lars From robert.kern at gmail.com Thu Apr 13 05:17:05 2006 From: robert.kern at gmail.com (Robert Kern) Date: Thu Apr 13 05:17:05 2006 Subject: [Numpy-discussion] Re: range/arange In-Reply-To: <200604130507.40241.pgmdevlist@mailcan.com> References: <200604130507.40241.pgmdevlist@mailcan.com> Message-ID: Pierre GM wrote: > Folks, > Could any of you explain me why the two following commands give different > results ? It's mere curiosity, for my personal edification. > > [(m-5)/10 for m in arange(1,10)] > [0, 0, 0, 0, 0, 0, 0, 0, 0] > > [(m-5)/10 for m in range(1,10)] > [-1, -1, -1, -1, 0, 0, 0, 0, 0] Python's rule for integer division is to round towards negative infinity. C's rule (if it has one; I think it may be platform dependent) is to round towards 0. When it comes to arithmetic, numpy tends to expose the C behavior because it's fastest. As Lars pointed out, the type of the object that you get from iterating over an array is a numpy int32scalar object, so the numpy behavior is used. 
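Here is a self-contained way to see the two conventions side by side (pure
Python; C99-style truncation emulated with math.trunc):

import math

def c_style_div(a, b):
    # C99 integer division: take the exact quotient, truncate towards zero
    return math.trunc(float(a) / b)

print([(m - 5) // 10 for m in range(1, 10)])
# Python floor division: [-1, -1, -1, -1, 0, 0, 0, 0, 0]
print([c_style_div(m - 5, 10) for m in range(1, 10)])
# C-style truncation:    [0, 0, 0, 0, 0, 0, 0, 0, 0]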
-- Robert Kern robert.kern at gmail.com "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From fullung at gmail.com Thu Apr 13 05:18:04 2006 From: fullung at gmail.com (Albert Strasheim) Date: Thu Apr 13 05:18:04 2006 Subject: [Numpy-discussion] Segfault when indexing on second or higher dimension with list or tuple Message-ID: <20060413121710.GA30372@dogbert.sdsl.sun.ac.za> Hello all, The following segfault bug was discovered in NumPy 0.9.7.2348 by someone at our Python workshop: import numpy as N F = N.zeros((1,1)) F[:,[0]] = 0 The following also segfaults: F[:,(0,)] = 0 Something seems to go wrong when one uses a tuple or a list to index into a NumPy array on the second or higher dimension, since the following code works: F = N.zeros((1,)) F[[0]] = 0 The Trac ticket is here: http://projects.scipy.org/scipy/numpy/ticket/59 If someone gets around to fixing this, please include some test cases. Thanks! Regards, Albert From cimrman3 at ntc.zcu.cz Thu Apr 13 05:24:02 2006 From: cimrman3 at ntc.zcu.cz (Robert Cimrman) Date: Thu Apr 13 05:24:02 2006 Subject: [Numpy-discussion] Re: ***[Possible UCE]*** [SciPy-user] Regarding what "where" returns In-Reply-To: <443D857F.9000605@ee.byu.edu> References: <443C0D36.80608@enthought.com> <443D39F6.6040805@enthought.com> <443D601E.3020500@enthought.com> <443D857F.9000605@ee.byu.edu> Message-ID: <443E42A2.80402@ntc.zcu.cz> Travis Oliphant wrote: > I went ahead and made this change to the code. The nonzero function > still behaves as before (and in fact only works for 1-d arrays as it did > in Numeric). > > The where(condition) function works the same as condition.nonzero() and > both always return a tuple. So, for 1-d arrays, using 'nonzero( condition )' should be faster than 'where( condition )[0]', right? r. From charlesr.harris at gmail.com Thu Apr 13 05:35:13 2006 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu Apr 13 05:35:13 2006 Subject: [Numpy-discussion] Toward release 1.0 of NumPy In-Reply-To: <443E096D.3040407@gmx.net> References: <443D9543.8040601@ee.byu.edu> <443E096D.3040407@gmx.net> Message-ID: Sven, On 4/13/06, Sven Schreiber wrote: > > Travis Oliphant schrieb: > > > > The next release of NumPy will be 0.9.8 > > > > > The recent discussions and bug-reports have been very helpful. If you > > have found a bug, please report it on the Trac pages so that we don't > > lose sight of it. > > Report bugs by "submitting a ticket" here: > > > > Before submitting the following as a bug, I would like to repeat what I > posted earlier (no replies) to check whether you agree it's a bug: > > The "kron" (Kronecker product) function returns numpy-arrays even if > both arguments are numpy-matrices; imho that's a bug in light of the > proclaimed goal of preserving matrices where possible/sensible. What would you do instead? The Kronecker product (aka Tensor product) of two matrices isn't a matrix. I suppose you could make it one by appealing to the universal property -- bilinear map on the Cartesian product of linear spaces -> linear map on the tensor product of linear spaces -- but that seems a bit abstract for numpy and you would need to define the indices of the resulting object as some sort of pair. Chuck -------------- next part -------------- An HTML attachment was scrubbed... 
URL:

From pjssilva at ime.usp.br Thu Apr 13 05:51:02 2006
From: pjssilva at ime.usp.br (Paulo Jose da Silva e Silva)
Date: Thu Apr 13 05:51:02 2006
Subject: [Numpy-discussion] Re: range/arange
In-Reply-To:
References: <200604130507.40241.pgmdevlist@mailcan.com>
Message-ID: <1144932598.16449.5.camel@localhost.localdomain>

On Thu, 2006-04-13 at 07:15 -0500, Robert Kern wrote:
>
> Python's rule for integer division is to round towards negative infinity. C's
> rule (if it has one; I think it may be platform dependent) is to round towards
> 0. When it comes to arithmetic, numpy tends to expose the C behavior because
> it's fastest. As Lars pointed out, the type of the object that you get from
> iterating over an array is a numpy int32scalar object, so the numpy behavior is
> used.
>

Actually, in the C99 standard the division was defined to always truncate
towards zero; see item 25 in:

http://home.datacomm.ch/t_wolf/tw/c/c9x_changes.html

So it is not platform dependent anymore.

Paulo

Note: It once was platform dependent. Old gcc (for Linux) would truncate
towards infinity. I know this because of a "bug" in somebody else's code.
It took me quite some time to discover that the problem was the shift in
gcc's behavior in this matter.

From aisaac at american.edu Thu Apr 13 07:02:11 2006
From: aisaac at american.edu (Alan G Isaac)
Date: Thu Apr 13 07:02:11 2006
Subject: [Numpy-discussion] Toward release 1.0 of NumPy
In-Reply-To:
References: <443D9543.8040601@ee.byu.edu> <443E096D.3040407@gmx.net>
Message-ID:

On Thu, 13 Apr 2006, Charles R Harris apparently wrote:
> The Kronecker product (aka Tensor product) of two
> matrices isn't a matrix.

That is an unusual way to describe things in the world of econometrics.
Here is a more common way:
http://planetmath.org/encyclopedia/KroneckerProduct.html
I share Sven's expectation.
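For concreteness, kron already produces the ordinary block matrix that
definition describes (plain arrays shown; output hand-checked):

import numpy

a = numpy.array([[1, 2],
                 [3, 4]])
print(numpy.kron(a, numpy.eye(2, dtype=int)))
# [[1 0 2 0]
#  [0 1 0 2]
#  [3 0 4 0]
#  [0 3 0 4]]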
Cheers, Alan Isaac From fullung at gmail.com Thu Apr 13 07:24:02 2006 From: fullung at gmail.com (Albert Strasheim) Date: Thu Apr 13 07:24:02 2006 Subject: [Numpy-discussion] Segfault when indexing on second or higher dimension with list or tuple In-Reply-To: <20060413121710.GA30372@dogbert.sdsl.sun.ac.za> References: <20060413121710.GA30372@dogbert.sdsl.sun.ac.za> Message-ID: <20060413142246.GA6870@dogbert.sdsl.sun.ac.za> Hello all I've attached a test case that reproduces the bug to the ticket: http://projects.scipy.org/scipy/numpy/attachment/ticket/59/test_list_tuple_indexing.diff I've also created a test case for the recent vectorize bug: http://projects.scipy.org/scipy/numpy/attachment/ticket/52/test_vectorize.diff Regards, Albert On Thu, 13 Apr 2006, Albert Strasheim wrote: > Hello all, > > The following segfault bug was discovered in NumPy 0.9.7.2348 by > someone at our Python workshop: > > import numpy as N > F = N.zeros((1,1)) > F[:,[0]] = 0 > > The following also segfaults: > > F[:,(0,)] = 0 > > Something seems to go wrong when one uses a tuple or a list to index > into a NumPy array on the second or higher dimension, since the > following code works: > > F = N.zeros((1,)) > F[[0]] = 0 > > The Trac ticket is here: > > http://projects.scipy.org/scipy/numpy/ticket/59 > > If someone gets around to fixing this, please include some test cases. > > Thanks! > > Regards, > > Albert > > > ------------------------------------------------------- > This SF.Net email is sponsored by xPML, a groundbreaking scripting language > that extends applications into web and mobile media. Attend the live webcast > and join the prime developer group breaking into this new coding territory! > http://sel.as-us.falkag.net/sel?cmd=lnk&kid=110944&bid=241720&dat=121642 > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/numpy-discussion From oliphant.travis at ieee.org Thu Apr 13 07:58:05 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Thu Apr 13 07:58:05 2006 Subject: [Numpy-discussion] index objects are not broadcastable to a single shape In-Reply-To: <20060413133326.2889a5c5.simon@arrowtheory.com> References: <20060413111612.3bb4e6fc.simon@arrowtheory.com> <443DA966.1020301@ee.byu.edu> <20060413133326.2889a5c5.simon@arrowtheory.com> Message-ID: <443E66AC.2020108@ieee.org> Simon Burton wrote: > On Wed, 12 Apr 2006 19:29:10 -0600 > Travis Oliphant wrote: > > >> The problem with these error messages is that some code is used in a >> wide-variety of circumstances. The original error message was conceived >> in thinking about the application of the code to one circumstance while >> this particular error is occurring in a different one. >> >> The standard behavior is to just propagate the error up. Better error >> messages means catching a lot more errors and special-casing error >> messages. It can be done, but it's tedious work. >> > > OK. Can the error message be a little more generic, longer, etc. ? > > Absolutely, I should have finished the above message with an appeal for more helpful generic messages. All suggestions are welcome. > "shape mismatch (index objects are not broadcastable to a single shape)" ? > Definitely better. I would probably drop the index qualifier as well. Thanks for the tip. 
-Travis From oliphant.travis at ieee.org Thu Apr 13 08:16:13 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Thu Apr 13 08:16:13 2006 Subject: [Numpy-discussion] Segfault when indexing on second or higher dimension with list or tuple In-Reply-To: <20060413121710.GA30372@dogbert.sdsl.sun.ac.za> References: <20060413121710.GA30372@dogbert.sdsl.sun.ac.za> Message-ID: <443E6B01.7000906@ieee.org> Albert Strasheim wrote: > Hello all, > > The following segfault bug was discovered in NumPy 0.9.7.2348 by > someone at our Python workshop: > > import numpy as N > F = N.zeros((1,1)) > F[:,[0]] = 0 > > The following also segfaults: > > F[:,(0,)] = 0 > > Something seems to go wrong when one uses a tuple or a list to index > into a NumPy array on the second or higher dimension, since the > following code works: > > The segfault was due to an error condition not being caught. This is now fixed, so now you get (a rather cryptic error). Now, to figure out why this code doesn't work.... -Travis From oliphant.travis at ieee.org Thu Apr 13 08:29:01 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Thu Apr 13 08:29:01 2006 Subject: [Numpy-discussion] Segfault when indexing on second or higher dimension with list or tuple In-Reply-To: <443E6B01.7000906@ieee.org> References: <20060413121710.GA30372@dogbert.sdsl.sun.ac.za> <443E6B01.7000906@ieee.org> Message-ID: <443E6DF1.5020206@ieee.org> Travis Oliphant wrote: > Albert Strasheim wrote: >> Hello all, >> >> The following segfault bug was discovered in NumPy 0.9.7.2348 by >> someone at our Python workshop: >> >> import numpy as N >> F = N.zeros((1,1)) >> F[:,[0]] = 0 >> >> The following also segfaults: >> >> F[:,(0,)] = 0 >> >> Something seems to go wrong when one uses a tuple or a list to index >> into a NumPy array on the second or higher dimension, since the >> following code works: >> >> > The segfault was due to an error condition not being caught. This is > now fixed, so now you get (a rather cryptic error). Now, to figure > out why this code doesn't work.... > The problem is that the code is not handling arbitrary shapes on the RHS of the equal sign. I'll enter a ticket and fix this before 0.9.8. Basically, right now, the RHS needs to have the same shape as the LHS so F[:,[0]] = [[0]] should work already. -Travis From oliphant.travis at ieee.org Thu Apr 13 08:43:14 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Thu Apr 13 08:43:14 2006 Subject: [Numpy-discussion] Re: ***[Possible UCE]*** [SciPy-user] Regarding what "where" returns In-Reply-To: <443E42A2.80402@ntc.zcu.cz> References: <443C0D36.80608@enthought.com> <443D39F6.6040805@enthought.com> <443D601E.3020500@enthought.com> <443D857F.9000605@ee.byu.edu> <443E42A2.80402@ntc.zcu.cz> Message-ID: <443E7150.2010006@ieee.org> Robert Cimrman wrote: > Travis Oliphant wrote: >> I went ahead and made this change to the code. The nonzero >> function still behaves as before (and in fact only works for 1-d >> arrays as it did in Numeric). >> >> The where(condition) function works the same as condition.nonzero() >> and both always return a tuple. > > So, for 1-d arrays, using 'nonzero( condition )' should be faster than > 'where( condition )[0]', right? > No. since the function just selects off the first element of the tuple returned by the method... 
'condition.nonzero()[0]' may be *slightly* faster than 'where(condition)[0]',
however.

-Travis

From tim.hochberg at cox.net Thu Apr 13 08:44:47 2006
From: tim.hochberg at cox.net (Tim Hochberg)
Date: Thu Apr 13 08:44:47 2006
Subject: [Numpy-discussion] Toward release 1.0 of NumPy
In-Reply-To:
References: <443D9543.8040601@ee.byu.edu> <443E096D.3040407@gmx.net>
Message-ID: <443E7109.6080808@cox.net>

Alan G Isaac wrote:
>On Thu, 13 Apr 2006, Charles R Harris apparently wrote:
>
>>The Kronecker product (aka Tensor product) of two
>>matrices isn't a matrix.
>>
>
>That is an unusual way to describe things in
>the world of econometrics. Here is a more
>common way:
>http://planetmath.org/encyclopedia/KroneckerProduct.html
>I share Sven's expectation.
>

mathworld also agrees with you. As does the documentation (as best as I
can tell) and the actual output of kron. I think Charles must be thinking
of the tensor product instead. In fact, if you look at the code you see
this:

# TODO: figure out how to keep arrays the same

I think that in general this is going to be a bit of an issue whenever we
have multiple arguments. Let me propose the world's second dumbest (in a
good way, maybe) procedure:

def kron(a, b):
    wrappers = [(getattr(x, '__array_priority__', 0), x.__array_wrap__)
                for x in [a, b] if hasattr(x, '__array_wrap__')]
    if wrappers:
        wrappers.sort()  # so the highest-priority wrapper ends up last
        priority, wrap = wrappers[-1]
    else:
        wrap = None
    # ....
    result = concatenate(concatenate(o, axis=1), axis=1)
    if wrap is not None:
        result = wrap(result)
    return result

This generalizes what _wrapit does for arbitrary arguments. It breaks
'ties' where more than one argument wants to wrap something by using
__array_priority__. You'd actually want to factor out the wrapper finding
code. This generalizes what _wrapit does to multiple dimensions.

Thoughts? Better plans?

-tim

From ryanlists at gmail.com Thu Apr 13 09:11:10 2006
From: ryanlists at gmail.com (Ryan Krauss)
Date: Thu Apr 13 09:11:10 2006
Subject: [Numpy-discussion] where
Message-ID:

Can someone help me understand the proper use of where?

I want to use it like this

myvect=where(f>19.5 and phase>0, f, phase)

but I seem to be getting or rather than and.

Thanks,

Ryan

From oliphant at ee.byu.edu Thu Apr 13 09:18:05 2006
From: oliphant at ee.byu.edu (Travis Oliphant)
Date: Thu Apr 13 09:18:05 2006
Subject: [Numpy-discussion] where
In-Reply-To:
References:
Message-ID: <443E79A5.2000700@ee.byu.edu>

Ryan Krauss wrote:

>Can someone help me understand the proper use of where?
>
>I want to use it like this
>
>myvect=where(f>19.5 and phase>0, f, phase)
>
>but I seem to be getting or rather than and.
>

It is probably your use of the 'and' statement. Use '&' instead

(f > 19.5) & (phase > 0)

What version are you using? In numarray and NumPy the use of 'and' like
this should raise an error if 'f' and/or 'phase' are arrays of more than
one element.

-Travis

From ryanlists at gmail.com Thu Apr 13 09:27:06 2006
From: ryanlists at gmail.com (Ryan Krauss)
Date: Thu Apr 13 09:27:06 2006
Subject: [Numpy-discussion] where
In-Reply-To: <443E79A5.2000700@ee.byu.edu>
References: <443E79A5.2000700@ee.byu.edu>
Message-ID:

Does where return a mask?

If I do
myvect=where((f > 19.5) & (phase > 0),f,phase)
myvect is the same length as f and phase and there is some modification of
the values where the condition is met, but what that modification is is
unclear to me.

If I do
myind=where((f > 19.5) & (phase > 0))
I seem to get the indices of the points where both conditions are met.

I am using version 0.9.5.2043.
I see those kinds of errors about truth testing an array often, but not in this case. Thanks, Ryan On 4/13/06, Travis Oliphant wrote: > Ryan Krauss wrote: > > >Can someone help me understand the proper use of where? > > > >I want to use it like this > > > >myvect=where(f>19.5 and phase>0, f, phase) > > > >but I seem to be getting or rather than and. > > > > > > > It is probably your use of the 'and' statement. Use '&' instead > > (f > 19.5) & (phase > 0) > > What version are you using. In numarray and NumPy the use of 'and' like > this should raise an error if 'f' and/or 'phase' are arrays of more than > one element. > > -Travis > > From oliphant at ee.byu.edu Thu Apr 13 09:39:04 2006 From: oliphant at ee.byu.edu (Travis Oliphant) Date: Thu Apr 13 09:39:04 2006 Subject: [Numpy-discussion] where In-Reply-To: References: <443E79A5.2000700@ee.byu.edu> Message-ID: <443E7E7B.2030203@ee.byu.edu> Ryan Krauss wrote: >Does where return a mask? > > Only in the second use case... >If I do >myvect=where((f > 19.5) & (phase > 0),f,phase) >myvect is the same length as f and phase and there is some >modification of the values where the condition is met, but what that >modification is is unclear to me. > > The behavior of where(condition, for_true, for_false) is to return an array of the same shape as condition with elements of for_true where condition is true and for_false where condition is false. Thus myvect will contain elements of f where the condition is met and elements of phase otherwise. >If I do >myind=where((f > 19.5) & (phase > 0)) >I seem to get the indices of the points where both conditions are met. > > Yes. That is correct. It is a different use-case... Note, however, that in the current SVN version of NumPy, this use-case will always return a tuple of indices (use the nonzero function instead for behavior that will stay constant). For your 1-d example (I'm guessing it's 1-d) where will return a length-1 tuple. >I am using version 0.9.5.2043. I see those kinds of errors about >truth testing an array often, but not in this case. > > That is strange. What are the sizes of f and phase? -Travis From robert.kern at gmail.com Thu Apr 13 09:42:04 2006 From: robert.kern at gmail.com (Robert Kern) Date: Thu Apr 13 09:42:04 2006 Subject: [Numpy-discussion] Re: where In-Reply-To: References: <443E79A5.2000700@ee.byu.edu> Message-ID: Ryan Krauss wrote: > Does where return a mask? > > If I do > myvect=where((f > 19.5) & (phase > 0),f,phase) > myvect is the same length as f and phase and there is some > modification of the values where the condition is met, but what that > modification is is unclear to me. > > If I do > myind=where((f > 19.5) & (phase > 0)) > I seem to get the indices of the points where both conditions are met. > > I am using version 0.9.5.2043. I see those kinds of errors about > truth testing an array often, but not in this case. Have you read the docstring? In [33]: where? Type: builtin_function_or_method Base Class: String Form: Namespace: Interactive Docstring: where(condition, | x, y) is shaped like condition and has elements of x and y where condition is respectively true or false. If x or y are not given, then it is equivalent to nonzero(condition). -- Robert Kern robert.kern at gmail.com "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." 
-- Umberto Eco From ryanlists at gmail.com Thu Apr 13 09:44:01 2006 From: ryanlists at gmail.com (Ryan Krauss) Date: Thu Apr 13 09:44:01 2006 Subject: [Numpy-discussion] where In-Reply-To: <443E7E7B.2030203@ee.byu.edu> References: <443E79A5.2000700@ee.byu.edu> <443E7E7B.2030203@ee.byu.edu> Message-ID: f and phase are each (4250,) I have something that is working but doesn't use where. Can this be done easier using where: f1=f>19.5 f2=f<38 myf=f1&f2 myp=phase>0 myind=myf&myp correction=myind*-360 newphase=phase+correction Basically, can where give me an output vector of the same size as f and phase where the output is either 1 or 0? Ryan On 4/13/06, Travis Oliphant wrote: > Ryan Krauss wrote: > > >Does where return a mask? > > > > > Only in the second use case... > > >If I do > >myvect=where((f > 19.5) & (phase > 0),f,phase) > >myvect is the same length as f and phase and there is some > >modification of the values where the condition is met, but what that > >modification is is unclear to me. > > > > > > The behavior of > > where(condition, for_true, for_false) > > is to return an array of the same shape as condition with elements of > for_true where condition is true and > for_false where condition is false. > > Thus myvect will contain elements of f where the condition is met and > elements of phase otherwise. > > >If I do > >myind=where((f > 19.5) & (phase > 0)) > >I seem to get the indices of the points where both conditions are met. > > > > > Yes. That is correct. It is a different use-case... Note, however, > that in the current SVN version of NumPy, this use-case will always > return a tuple of indices (use the nonzero function instead for behavior > that will stay constant). For your 1-d example (I'm guessing it's 1-d) > where will return a length-1 tuple. > > >I am using version 0.9.5.2043. I see those kinds of errors about > >truth testing an array often, but not in this case. > > > > > That is strange. What are the sizes of f and phase? > > -Travis > > From robert.kern at gmail.com Thu Apr 13 09:54:05 2006 From: robert.kern at gmail.com (Robert Kern) Date: Thu Apr 13 09:54:05 2006 Subject: [Numpy-discussion] Re: where In-Reply-To: References: <443E79A5.2000700@ee.byu.edu> <443E7E7B.2030203@ee.byu.edu> Message-ID: Ryan Krauss wrote: > f and phase are each (4250,) > > I have something that is working but doesn't use where. Can this be > done easier using where: > > f1=f>19.5 > f2=f<38 > myf=f1&f2 > myp=phase>0 > myind=myf&myp > correction=myind*-360 > newphase=phase+correction (untested) phase[((f>19.5) & (f<38)) & (phase>0)] -= 360 > Basically, can where give me an output vector of the same size as f > and phase where the output is either 1 or 0? Why? The condition array that you would pass into where() is already such an array. -- Robert Kern robert.kern at gmail.com "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." 
-- Umberto Eco From arnd.baecker at web.de Thu Apr 13 10:07:14 2006 From: arnd.baecker at web.de (Arnd Baecker) Date: Thu Apr 13 10:07:14 2006 Subject: [Numpy-discussion] range/arange In-Reply-To: <200604131123.56171.lars.bittrich@googlemail.com> References: <200604130507.40241.pgmdevlist@mailcan.com> <200604131123.56171.lars.bittrich@googlemail.com> Message-ID: On Thu, 13 Apr 2006, Lars Bittrich wrote: > Hi, > > On Thursday 13 April 2006 11:07, Pierre GM wrote: > > Could any of you explain me why the two following commands give different > > results ? It's mere curiosity, for my personal edification. > > > > [(m-5)/10 for m in arange(1,10)] > > [0, 0, 0, 0, 0, 0, 0, 0, 0] > > > > [(m-5)/10 for m in range(1,10)] > > [-1, -1, -1, -1, 0, 0, 0, 0, 0] > > I have no idea where the reason is located exactly, but it seems to be caused > by different types of range and arange. Interestingly with Numeric you get the following: In [1]: from Numeric import * In [2]: [(m-5)/10 for m in arange(1,10)] Out[2]: [-1, -1, -1, -1, 0, 0, 0, 0, 0] In [3]: type(arange(1,10)[0]) Out[3]: <type 'int'> Will this cause any trouble for projects transitioning from Numeric to numpy? Presumably a proper explanation (which?) should go into the scipy wiki ("Converting from Numeric"). > In [15]:type(arange(1,10)[0]) > Out[15]: <type 'int32scalar'> > > In [14]:type(range(1,10)[0]) > Out[14]: <type 'int'> > > If you use for example: > > In [16]:-1/10 > Out[16]:-1 > > you get the normal behavior of the "floor" function. > > In [17]:floor(-.1) > Out[17]:-1.0 > > The behavior of int32scalar seems more intuitive to me. Me too. Best, Arnd From ryanlists at gmail.com Thu Apr 13 10:12:06 2006 From: ryanlists at gmail.com (Ryan Krauss) Date: Thu Apr 13 10:12:06 2006 Subject: [Numpy-discussion] Re: where In-Reply-To: References: <443E79A5.2000700@ee.byu.edu> Message-ID: Sorry, I can't explain myself. I read the docstring and it didn't make sense before. Now it seems clear enough. Somehow I got it in my head that I needed to be passing f and phase so that condition could use them. It turns out that this: myvect=where((f>19.5) & (f<38) & (phase>0),ones(shape(phase)),zeros(shape(phase))) does exactly what I want. Ryan On 4/13/06, Robert Kern wrote: > Ryan Krauss wrote: > > Does where return a mask? > > > > If I do > > myvect=where((f > 19.5) & (phase > 0),f,phase) > > myvect is the same length as f and phase and there is some > > modification of the values where the condition is met, but what that > > modification is is unclear to me. > > > > If I do > > myind=where((f > 19.5) & (phase > 0)) > > I seem to get the indices of the points where both conditions are met. > > > > I am using version 0.9.5.2043. I see those kinds of errors about > > truth testing an array often, but not in this case. > > Have you read the docstring? > > In [33]: where? > Type: builtin_function_or_method > Base Class: <type 'builtin_function_or_method'> > String Form: <built-in function where> > Namespace: Interactive > Docstring: > where(condition, | x, y) is shaped like condition and has elements of x and > y where condition is respectively true or false. If x or y are not given, then > it is equivalent to nonzero(condition). > > -- > Robert Kern > robert.kern at gmail.com > > "I have come to believe that the whole world is an enigma, a harmless enigma > that is made terrible by our own mad attempt to interpret it as though it had > an underlying truth." > -- Umberto Eco From ryanlists at gmail.com Thu Apr 13 10:15:03 2006 From: ryanlists at gmail.com (Ryan Krauss) Date: Thu Apr 13 10:15:03 2006 Subject: [Numpy-discussion] Re: where In-Reply-To: References: <443E79A5.2000700@ee.byu.edu> <443E7E7B.2030203@ee.byu.edu> Message-ID: > Why? The condition array that you would pass into where() is already such an array. That is the key point I was missing. Until I played around with the conditions myself I didn't get that I was passing in an explicit array of 1's and 0's. I guess I thought I was passing in some magic expression that where was somehow making sense. That is why I thought I would need to pass f and phase to the function. Ryan On 4/13/06, Robert Kern wrote: > Ryan Krauss wrote: > > f and phase are each (4250,) > > > > I have something that is working but doesn't use where. Can this be > > done easier using where: > > > > f1=f>19.5 > > f2=f<38 > > myf=f1&f2 > > myp=phase>0 > > myind=myf&myp > > correction=myind*-360 > > newphase=phase+correction > > (untested) > phase[((f>19.5) & (f<38)) & (phase>0)] -= 360 > > > Basically, can where give me an output vector of the same size as f > > and phase where the output is either 1 or 0? > > Why? The condition array that you would pass into where() is already such an array. > > -- > Robert Kern > robert.kern at gmail.com > > "I have come to believe that the whole world is an enigma, a harmless enigma > that is made terrible by our own mad attempt to interpret it as though it had > an underlying truth." > -- Umberto Eco From ryanlists at gmail.com Thu Apr 13 10:17:14 2006 From: ryanlists at gmail.com (Ryan Krauss) Date: Thu Apr 13 10:17:14 2006 Subject: [Numpy-discussion] Re: where In-Reply-To: References: <443E79A5.2000700@ee.byu.edu> <443E7E7B.2030203@ee.byu.edu> Message-ID: which makes this: myvect=where((f>19.5) & (f<38) & (phase>0),ones(shape(phase)),zeros(shape(phase))) actually really silly, since all it is is a complicated way to get back the input of (f>19.5) & (f<38) & (phase>0) Ryan On 4/13/06, Ryan Krauss wrote: > > Why? The condition array that you would pass into where() is already such an array. > > That is the key point I was missing. Until I played around with the > conditions myself I didn't get that I was passing in an explicit array > of 1's and 0's. I guess I thought I was passing in some magic > expression that where was somehow making sense. That is why I > thought I would need to pass f and phase to the function. > > Ryan > > On 4/13/06, Robert Kern wrote: > > Ryan Krauss wrote: > > > f and phase are each (4250,) > > > > > > I have something that is working but doesn't use where. Can this be > > > done easier using where: > > > > > > f1=f>19.5 > > > f2=f<38 > > > myf=f1&f2 > > > myp=phase>0 > > > myind=myf&myp > > > correction=myind*-360 > > > newphase=phase+correction > > > > (untested) > > phase[((f>19.5) & (f<38)) & (phase>0)] -= 360 > > > > > Basically, can where give me an output vector of the same size as f > > > and phase where the output is either 1 or 0? > > > > Why? The condition array that you would pass into where() is already such an array. > > > > -- > > Robert Kern > > robert.kern at gmail.com > > > > "I have come to believe that the whole world is an enigma, a harmless enigma > > that is made terrible by our own mad attempt to interpret it as though it had > > an underlying truth." > > -- Umberto Eco From oliphant at ee.byu.edu Thu Apr 13 10:49:06 2006 From: oliphant at ee.byu.edu (Travis Oliphant) Date: Thu Apr 13 10:49:06 2006 Subject: [Numpy-discussion] range/arange In-Reply-To: References: <200604130507.40241.pgmdevlist@mailcan.com> <200604131123.56171.lars.bittrich@googlemail.com> Message-ID: <443E8EEB.9070609@ee.byu.edu> Arnd Baecker wrote: >On Thu, 13 Apr 2006, Lars Bittrich wrote: > > > >>Hi, >> >>On Thursday 13 April 2006 11:07, Pierre GM wrote: >> >> >>>Could any of you explain me why the two following commands give different >>>results ? It's mere curiosity, for my personal edification. >>> >>>[(m-5)/10 for m in arange(1,10)] >>>[0, 0, 0, 0, 0, 0, 0, 0, 0] >>> >>>[(m-5)/10 for m in range(1,10)] >>>[-1, -1, -1, -1, 0, 0, 0, 0, 0] >>> >>> >>I have no idea where the reason is located exactly, but it seems to be caused >>by different types of range and arange. >> >> > > >Interestingly with Numeric you get the following: > >In [1]: from Numeric import * >In [2]: [(m-5)/10 for m in arange(1,10)] >Out[2]: [-1, -1, -1, -1, 0, 0, 0, 0, 0] >In [3]: type(arange(1,10)[0]) >Out[3]: <type 'int'> > >Will this cause any trouble for projects >transitioning from Numeric to numpy? >Presumably a proper explanation (which?) >should go into the scipy wiki ("Converting from Numeric"). > > > Yes, some discussion will be needed about the fact that NumPy now has its own scalars. This will give us quite a bit more flexibility moving forward and should be seamless for the most part. -Travis From pgmdevlist at mailcan.com Thu Apr 13 11:29:09 2006 From: pgmdevlist at mailcan.com (Pierre GM) Date: Thu Apr 13 11:29:09 2006 Subject: [Numpy-discussion] Re: range/arange In-Reply-To: References: <200604130507.40241.pgmdevlist@mailcan.com> Message-ID: <200604131456.48570.pgmdevlist@mailcan.com> > Python's rule for integer division is to round towards negative infinity. > C's rule (if it has one; I think it may be platform dependent) is to round > towards 0. Ah OK. That makes sense, and it's something I'll have to keep in mind later on.
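For the archives, the whole difference fits in a couple of lines (a sketch, assuming numpy's int32 scalar keeps the C-style truncation shown above):

>>> -4/10            # Python int: rounds towards negative infinity
-1
>>> from numpy import int32
>>> int32(-4)/10     # numpy int32 scalar: rounds towards zero, like C
0

which is exactly why the arange version of my list comprehension came out all zeros.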
Thanks y'all for your answers, I feel quite edified now :) From ndarray at mac.com Thu Apr 13 11:53:00 2006 From: ndarray at mac.com (Sasha) Date: Thu Apr 13 11:53:00 2006 Subject: [Numpy-discussion] Toward release 1.0 of NumPy In-Reply-To: <443D9543.8040601@ee.byu.edu> References: <443D9543.8040601@ee.byu.edu> Message-ID: On 4/12/06, Travis Oliphant wrote: > ... This also dove-tails nicely > with the Python 2.5 release schedule so that NumPy 1.0 should work with > Python 2.5 and be fully 64-bit capable for handling very-large arrays. > I would like to mention one feature that is going to appear in Python 2.5 that is covering some of the functionality of NumPy. I am talking about the ctypes module . Like NumPy, ctypes provides a set of python classes that represent basic C types: c_byte c_char c_char_p c_double c_float c_int c_long c_short c_ubyte ... and the ability to describe composite structures. The later functionality is very close to what dtype class provides in numpy. There are some features in ctype that I like better than similar features in numpy. For example, in ctypes a fixed width array is described by multiplying basic type by an integer: >>> c_char * 10 I find this approach more elegant than numpy's dtype('S10'). It looks like there is some synergy to be exploited here, particularly in the area of record arrays. From oliphant at ee.byu.edu Thu Apr 13 12:49:02 2006 From: oliphant at ee.byu.edu (Travis Oliphant) Date: Thu Apr 13 12:49:02 2006 Subject: [Numpy-discussion] Toward release 1.0 of NumPy In-Reply-To: References: <443D9543.8040601@ee.byu.edu> Message-ID: <443EAB01.8040700@ee.byu.edu> Sasha wrote: >On 4/12/06, Travis Oliphant wrote: > > >>... This also dove-tails nicely >>with the Python 2.5 release schedule so that NumPy 1.0 should work with >>Python 2.5 and be fully 64-bit capable for handling very-large arrays. >> >> >> > >I would like to mention one feature that is going to appear in Python >2.5 that is covering some of the functionality of NumPy. I am talking >about the ctypes module >. Like >NumPy, ctypes provides a set of python classes that represent basic C >types: > > c_byte > c_char > c_char_p > c_double > c_float > c_int > c_long > c_short > c_ubyte > ... > >and the ability to describe composite structures. The later >functionality is very close to what dtype class provides in numpy. > >There are some features in ctype that I like better than similar >features in numpy. For example, in ctypes a fixed width array is >described by multiplying basic type by an integer: > > >>>>c_char * 10 >>>> >>>> > > >I find this approach more elegant than numpy's dtype('S10'). > >It looks like there is some synergy to be exploited here, particularly >in the area of record arrays. > > Definitely. I'm not familiar enough with c_types to do this. Any help is appreciated. -Travis From charlesr.harris at gmail.com Thu Apr 13 13:33:08 2006 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu Apr 13 13:33:08 2006 Subject: [Numpy-discussion] Toward release 1.0 of NumPy In-Reply-To: <443E7109.6080808@cox.net> References: <443D9543.8040601@ee.byu.edu> <443E096D.3040407@gmx.net> <443E7109.6080808@cox.net> Message-ID: Tim, On 4/13/06, Tim Hochberg wrote: > > Alan G Isaac wrote: > > >On Thu, 13 Apr 2006, Charles R Harris apparently wrote: > > > > > >>The Kronecker product (aka Tensor product) of two > >>matrices isn't a matrix. > >> > >> > > > >That is an unusual way to describe things in > >the world of econometrics. 
Here is a more >common way: >http://planetmath.org/encyclopedia/KroneckerProduct.html >I share Sven's expectation. > > > mathworld also agrees with you. As does the documentation (as best as I > can tell) and the actual output of kron. I think Charles must be > thinking of the tensor product instead. It *is* the tensor product, A \tensor B, but it is not the most general tensor with four indices just as a bivector is not the most general tensor with two indices. Numerically, kron chooses to represent the tensor product of two vector spaces a, b with dimensions n,m respectively as the direct sum of n copies of b, and the tensor product of two operators takes the given form. More generally, the B matrix in each spot could be replaced with an arbitrary matrix of the correct dimensions and you would recover the general tensor with four indices. Anyway, it sounds like you are proposing that the tensor (outer) product of two matrices be reshaped to run over two indices. It seems that likewise the tensor (outer) product of two vectors should be reshaped to run over one index (i.e. flat). That would do the trick. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Thu Apr 13 14:19:01 2006 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu Apr 13 14:19:01 2006 Subject: [Numpy-discussion] Toward release 1.0 of NumPy In-Reply-To: References: <443D9543.8040601@ee.byu.edu> <443E096D.3040407@gmx.net> <443E7109.6080808@cox.net> Message-ID: Tim, In particular: def kron(a,b): n = shape(a)[1]*shape(b)[1] c = transpose(multiply.outer(a,b), axes=(0,2,1,3)).reshape(-1,n) # wrap c as a matrix. On 4/13/06, Charles R Harris wrote: > > Tim, > > On 4/13/06, Tim Hochberg < tim.hochberg at cox.net> wrote: > > > > Alan G Isaac wrote: > > > > >On Thu, 13 Apr 2006, Charles R Harris apparently wrote: > > > > > > > > >>The Kronecker product (aka Tensor product) of two > > >>matrices isn't a matrix. > > >> > > >> > > > > > >That is an unusual way to describe things in > > >the world of econometrics. Here is a more > > >common way: > > > http://planetmath.org/encyclopedia/KroneckerProduct.html > > >I share Sven's expectation. > > > > > > > > mathworld also agrees with you. As does the documentation (as best as I > > can tell) and the actual output of kron. I think Charles must be > > thinking of the tensor product instead. > > Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From tim.hochberg at cox.net Thu Apr 13 14:32:04 2006 From: tim.hochberg at cox.net (Tim Hochberg) Date: Thu Apr 13 14:32:04 2006 Subject: [Numpy-discussion] Toward release 1.0 of NumPy In-Reply-To: References: <443D9543.8040601@ee.byu.edu> <443E096D.3040407@gmx.net> <443E7109.6080808@cox.net> Message-ID: <443EC2B4.807@cox.net> Charles R Harris wrote: > Tim, > > On 4/13/06, *Tim Hochberg* > wrote: > > Alan G Isaac wrote: > > >On Thu, 13 Apr 2006, Charles R Harris apparently wrote: > > > > > >>The Kronecker product (aka Tensor product) of two > >>matrices isn't a matrix. > >> > >> > > > >That is an unusual way to describe things in > >the world of econometrics. Here is a more > >common way: > >http://planetmath.org/encyclopedia/KroneckerProduct.html > > >I share Sven's expectation. > > > > > mathworld also agrees with you. As does the documentation (as best > as I > can tell) and the actual output of kron. I think Charles must be > thinking of the tensor product instead.
> > > It *is* the tensor product, A \tensor B, but it is not the most > general tensor with four indices just as a bivector is not the most > general tensor with two indices. Numerically, kron chooses to > represent the tensor product of two vector spaces a, b with dimensions > n,m respectively as the direct sum of n copies of b, and the tensor > product of two operators takes the given form. More generally, the B > matrix in each spot could be replaced with an arbitrary matrix of the > correct dimensions and you would recover the general tensor with four > indices. > > Anyway, it sounds like you are proposing that the tensor (outer) > product of two matrices be reshaped to run over two indices. It seems > that likewise the tensor (outer) product of two vectors should be > reshaped to run over one index ( i.e. flat). That would do the trick. I'm not proposing anything. I don't care at all what kron does. I just want to fix the return type if that's feasible so that people stop complaining about it. As far as I can tell, kron already returns a flattened tensor product of some sort. I believe the general tensor product that you are talking about is already covered by multiply.outer, but I'm not sure so correct me if I'm wrong. Here's what kron does as present: >>> a array([[1, 1], [1, 1]]) >>> kron(a,a) # => 4x4 matrix array([[1, 1, 1, 1], [1, 1, 1, 1], [1, 1, 1, 1], [1, 1, 1, 1]]) >>> kron(a,a[0]) => 8x1 array([1, 1, 1, 1, 1, 1, 1, 1]) >>> kron(a[0], a[0]) Traceback (most recent call last): File "", line 1, in ? File "C:\Python24\Lib\site-packages\numpy\lib\shape_base.py", line 577, in kron result = concatenate(concatenate(o, axis=1), axis=1) ValueError: 0-d arrays can't be concatenated >>> b.shape (2, 2, 2) >>> kron(b,b).shape (4, 4, 2, 2) So, it looks like the 2d x 2d product obeys Alan's definition. The other products are probably all broken. Regards, -tim From charlesr.harris at gmail.com Thu Apr 13 16:02:04 2006 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu Apr 13 16:02:04 2006 Subject: [Numpy-discussion] Toward release 1.0 of NumPy In-Reply-To: <443EC2B4.807@cox.net> References: <443D9543.8040601@ee.byu.edu> <443E096D.3040407@gmx.net> <443E7109.6080808@cox.net> <443EC2B4.807@cox.net> Message-ID: On 4/13/06, Tim Hochberg wrote: > > Charles R Harris wrote: > > > Tim, > > > > On 4/13/06, *Tim Hochberg* > > wrote: > > > > Alan G Isaac wrote: > > > > >On Thu, 13 Apr 2006, Charles R Harris apparently wrote: > > > > > > > > >>The Kronecker product (aka Tensor product) of two > > >>matrices isn't a matrix. > > >> > > >> > > > > > >That is an unusual way to describe things in > > >the world of econometrics. Here is a more > > >common way: > > >http://planetmath.org/encyclopedia/KroneckerProduct.html > > > > >I share Sven's expectation. > > > > > > > > mathworld also agrees with you. As does the documentation (as best > > as I > > can tell) and the actual output of kron. I think Charles must be > > thinking of the tensor product instead. > > > > > > It *is* the tensor product, A \tensor B, but it is not the most > > general tensor with four indices just as a bivector is not the most > > general tensor with two indices. Numerically, kron chooses to > > represent the tensor product of two vector spaces a, b with dimensions > > n,m respectively as the direct sum of n copies of b, and the tensor > > product of two operators takes the given form. 
More generally, the B > > matrix in each spot could be replaced with an arbitrary matrix of the > > correct dimensions and you would recover the general tensor with four > > indices. > > > > Anyway, it sounds like you are proposing that the tensor (outer) > > product of two matrices be reshaped to run over two indices. It seems > > that likewise the tensor (outer) product of two vectors should be > > reshaped to run over one index ( i.e. flat). That would do the trick. > > I'm not proposing anything. I don't care at all what kron does. I just > want to fix the return type if that's feasible so that people stop > complaining about it. As far as I can tell, kron already returns a > flattened tensor product of some sort. I believe the general tensor > product that you are talking about is already covered by multiply.outer, > but I'm not sure so correct me if I'm wrong. Here's what kron does as > present: > > >>> a > array([[1, 1], > [1, 1]]) > >>> kron(a,a) # => 4x4 matrix > array([[1, 1, 1, 1], > [1, 1, 1, 1], > [1, 1, 1, 1], > [1, 1, 1, 1]]) Good at first look. Lets see a simpler version... Nevermind, seems numpy isn't working on this machine (X86_64, fc5 64 bit) at the moment, maybe I need to check out a clean version. >>> kron(a,a[0]) => 8x1 > array([1, 1, 1, 1, 1, 1, 1, 1]) Looks broken. a[0] should be an operator (matrix), so either it should be (2,1) or (1,2). In the first case, the return should have shape (4,2), in the latter (2,4). Should probably raise an error as the result strikes me as ambiguous. But I have to admit I am not sure what the point of this particular construction is. >>> kron(a[0], a[0]) > Traceback (most recent call last): > File "", line 1, in ? > File "C:\Python24\Lib\site-packages\numpy\lib\shape_base.py", line > 577, in kron > result = concatenate(concatenate(o, axis=1), axis=1) > ValueError: 0-d arrays can't be concatenated See above. this could be (1,4) or (4,1), depending. >>> b.shape > (2, 2, 2) > >>> kron(b,b).shape > (4, 4, 2, 2) I think this is doing transpose(outer(b,b), axis=(0,2,1,3)) and reshaping the first 4 indices into 2. Again, I am not sure what the point is for these operators. Now another way to get all this functionality is to have a contraction function or method with a list of axis. For instance, consider the matrices A(i,j) and B(k,l) operating on x(j) and y(l) like A(i,j)x(j) and B(k,l)y(l), then the outer product of all of these is A(i,j)B(k,l)x(j)y(l) with the summation convention on the indices j and l. The result should be the same as kron(A,B)*kron(x,y) up to a permutation of rows and columes. It is just a question of which basis is used and how the elements are indexed. So, it looks like the 2d x 2d product obeys Alan's definition. The other > products are probably all broken. > > Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From aisaac at american.edu Thu Apr 13 16:21:08 2006 From: aisaac at american.edu (Alan G Isaac) Date: Thu Apr 13 16:21:08 2006 Subject: [Numpy-discussion] Toward release 1.0 of NumPy In-Reply-To: <443EC2B4.807@cox.net> References: <443D9543.8040601@ee.byu.edu> <443E096D.3040407@gmx.net> <443E7109.6080808@cox.net> <443EC2B4.807@cox.net> Message-ID: On Thu, 13 Apr 2006, Tim Hochberg apparently wrote: > Here's what kron does as present: As possible context: http://www.mathworks.com/access/helpdesk/help/techdoc/ref/kron.html#998881 http://www.aptech.com/pdf_man/basicgauss.pdf p.69 In this sense, the 2-d handling is not surprising. 
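For concreteness, a small session of what those pages describe (untested, written from memory) -- each entry a[i,j] of the first argument becomes the block a[i,j]*b:

>>> a = array([[1, 2], [3, 4]])
>>> kron(eye(2), a)
array([[ 1.,  2.,  0.,  0.],
       [ 3.,  4.,  0.,  0.],
       [ 0.,  0.,  1.,  2.],
       [ 0.,  0.,  3.,  4.]])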
Cheers, Alan Isaac From charlesr.harris at gmail.com Thu Apr 13 16:32:01 2006 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu Apr 13 16:32:01 2006 Subject: [Numpy-discussion] Toward release 1.0 of NumPy In-Reply-To: References: <443D9543.8040601@ee.byu.edu> <443E096D.3040407@gmx.net> <443E7109.6080808@cox.net> <443EC2B4.807@cox.net> Message-ID: Hi, On 4/13/06, Alan G Isaac wrote: > > On Thu, 13 Apr 2006, Tim Hochberg apparently wrote: > > Here's what kron does as present: > > As possible context: > http://www.mathworks.com/access/helpdesk/help/techdoc/ref/kron.html#998881 > http://www.aptech.com/pdf_man/basicgauss.pdf p.69 > In this sense, the 2-d handling is not surprising. Yep, that is what the little python routine I gave above does. Note that in these cases only matrices are involved. Matlab, for instance, defines vectors as (1,n) or (n,1), which is actually helpful in minding the distinction between a vector space and its dual. I don't know how the numpy matrix package works, but the vectors of rank 1 are going to be a constant source of ambiguity. Cheers, > Alan Isaac Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From tim.hochberg at cox.net Thu Apr 13 16:37:04 2006 From: tim.hochberg at cox.net (Tim Hochberg) Date: Thu Apr 13 16:37:04 2006 Subject: [Numpy-discussion] Toward release 1.0 of NumPy In-Reply-To: References: <443D9543.8040601@ee.byu.edu> <443E096D.3040407@gmx.net> <443E7109.6080808@cox.net> <443EC2B4.807@cox.net> Message-ID: <443EDFE7.6010509@cox.net> Charles R Harris wrote: > > > On 4/13/06, *Tim Hochberg* > wrote: > > Charles R Harris wrote: > > > Tim, > > > > On 4/13/06, *Tim Hochberg* > > >> wrote: > > > > Alan G Isaac wrote: > > > > >On Thu, 13 Apr 2006, Charles R Harris apparently wrote: > > > > > > > > >>The Kronecker product (aka Tensor product) of two > > >>matrices isn't a matrix. > > >> > > >> > > > > > >That is an unusual way to describe things in > > >the world of econometrics. Here is a more > > >common way: > > >http://planetmath.org/encyclopedia/KroneckerProduct.html > > < http://planetmath.org/encyclopedia/KroneckerProduct.html> > > >I share Sven's expectation. > > > > > > > > mathworld also agrees with you. As does the documentation > (as best > > as I > > can tell) and the actual output of kron. I think Charles must be > > thinking of the tensor product instead. > > > > > > It *is* the tensor product, A \tensor B, but it is not the most > > general tensor with four indices just as a bivector is not the most > > general tensor with two indices. Numerically, kron chooses to > > represent the tensor product of two vector spaces a, b with > dimensions > > n,m respectively as the direct sum of n copies of b, and the tensor > > product of two operators takes the given form. More generally, the B > > matrix in each spot could be replaced with an arbitrary matrix > of the > > correct dimensions and you would recover the general tensor with > four > > indices. > > > > Anyway, it sounds like you are proposing that the tensor (outer) > > product of two matrices be reshaped to run over two indices. It > seems > > that likewise the tensor (outer) product of two vectors should be > > reshaped to run over one index ( i.e. flat). That would do the > trick. > > I'm not proposing anything. I don't care at all what kron does. I > just > want to fix the return type if that's feasible so that people stop > complaining about it. As far as I can tell, kron already returns a > flattened tensor product of some sort. 
I believe the general tensor > product that you are talking about is already covered by > multiply.outer, > but I'm not sure so correct me if I'm wrong. Here's what kron does as > present: > > >>> a > array([[1, 1], > [1, 1]]) > >>> kron(a,a) # => 4x4 matrix > array([[1, 1, 1, 1], > [1, 1, 1, 1], > [1, 1, 1, 1], > [1, 1, 1, 1]]) > > > Good at first look. Lets see a simpler version... Nevermind, seems > numpy isn't working on this machine (X86_64, fc5 64 bit) at the > moment, maybe I need to check out a clean version. > > >>> kron(a,a[0]) => 8x1 > array([1, 1, 1, 1, 1, 1, 1, 1]) > > > Looks broken. a[0] should be an operator (matrix), so either it should > be (2,1) or (1,2). Since a is an array here, a[0] is shape (2,). Let's repeat this exercise using matrices, which are always rank-2 and see if they make sense. >>> m matrix([[1, 1], [1, 1]]) >>> kron(m, m[0]) matrix([[1, 1, 1, 1], [1, 1, 1, 1]]) >>> kron(m,m[:,0]) matrix([[1, 1], [1, 1], [1, 1], [1, 1]]) That looks OK. > In the first case, the return should have shape (4,2), in the latter > (2,4). Should probably raise an error as the result strikes me as > ambiguous. But I have to admit I am not sure what the point of this > particular construction is. > > >>> kron(a[0], a[0]) > Traceback (most recent call last): > File "", line 1, in ? > File "C:\Python24\Lib\site-packages\numpy\lib\shape_base.py", line > 577, in kron > result = concatenate(concatenate(o, axis=1), axis=1) > ValueError: 0-d arrays can't be concatenated > > >>> kron(m[0], m[0]) matrix([[1, 1, 1, 1]]) >>> kron(m[:,0], m[:,0]) matrix([[1], [1], [1], [1]]) >>> kron(m[:,0],m[0]) matrix([[1, 1], [1, 1]]) > See above. this could be (1,4) or (4,1), depending. All of these look like they're probably right without thinking about it too hard. > > >>> b.shape > (2, 2, 2) > >>> kron(b,b).shape > (4, 4, 2, 2) > > > I think this is doing transpose(outer(b,b), axis=(0,2,1,3)) and > reshaping the first 4 indices into 2. Again, I am not sure what the > point is for these operators. Now another way to get all this > functionality is to have a contraction function or method with a list > of axis. For instance, consider the matrices A(i,j) and B(k,l) > operating on x(j) and y(l) like A(i,j)x(j) and B(k,l)y(l), then the > outer product of all of these is > > A(i,j)B(k,l)x(j)y(l) > > with the summation convention on the indices j and l. The result > should be the same as kron(A,B)*kron(x,y) up to a permutation of rows > and columes. It is just a question of which basis is used and how the > elements are indexed. > > So, it looks like the 2d x 2d product obeys Alan's definition. The > other > products are probably all broken. > Here's my best guess as to what is going on: 1. There is a relatively large group of people who use Kronecker product as Alan does (probably the matrix as opposed to tensor math folks). I'm guessing it's a large group since they manage to write the definitions at both mathworld and planetmath. 2. kron was meant to implement this. 2.5 People who need the other meaning of kron can just use outer, so no real conflict. 3. The implementation was either inappropriately generalized or it was assumed that all inputs would be matrices (and hence rank-2). Assuming 3. is correct, and I'd like to hear from people if they think that the behaviour in the non rank-2 cases is sensible, the next question is whether the behaviour in the rank-2 cases makes sense. It seems to, but I'm not a user of kron.
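(For reference, the rank-2 rule in code -- a hypothetical sketch with a made-up name, not what numpy.lib.shape_base currently does:

from numpy import asarray, multiply

def kron2d(a, b):
    # Kronecker product for rank-2 inputs only: each entry a[i,j]
    # becomes the block a[i,j]*b in the result.
    a, b = asarray(a), asarray(b)
    if a.ndim != 2 or b.ndim != 2:
        raise ValueError("kron2d expects rank-2 arguments")
    c = multiply.outer(a, b)          # shape (n0, n1, m0, m1)
    c = c.transpose((0, 2, 1, 3))     # shape (n0, m0, n1, m1)
    return c.reshape(a.shape[0]*b.shape[0], a.shape[1]*b.shape[1])

It is just the outer product with the axes interleaved and then collapsed pairwise.)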
If both of the preceeding are true, it seems like a complete fix entails the following two things: 1. Forbid arguments that are not rank-2. This allows all matrices, which is really the main target here I think. 2. Fix the return type issue. I have a fix for this ready to commit, but I want to figure out the first part as well. Regards, -tim From charlesr.harris at gmail.com Thu Apr 13 17:14:32 2006 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu Apr 13 17:14:32 2006 Subject: [Numpy-discussion] Toward release 1.0 of NumPy In-Reply-To: <443EDFE7.6010509@cox.net> References: <443D9543.8040601@ee.byu.edu> <443E096D.3040407@gmx.net> <443E7109.6080808@cox.net> <443EC2B4.807@cox.net> <443EDFE7.6010509@cox.net> Message-ID: On 4/13/06, Tim Hochberg wrote: > > Charles R Harris wrote: > > > > > > > On 4/13/06, *Tim Hochberg* > > wrote: > > > > Charles R Harris wrote: > > > > > Tim, > > > > > > On 4/13/06, *Tim Hochberg* > > > > >> > wrote: > > > > > > Alan G Isaac wrote: > > > > > > >On Thu, 13 Apr 2006, Charles R Harris apparently wrote: > > > > > > > > > > > >>The Kronecker product (aka Tensor product) of two > > > >>matrices isn't a matrix. > > > >> > > > >> > > > > > > > >That is an unusual way to describe things in > > > >the world of econometrics. Here is a more > > > >common way: > > > >http://planetmath.org/encyclopedia/KroneckerProduct.html > > > < http://planetmath.org/encyclopedia/KroneckerProduct.html> > > > >I share Sven's expectation. > > > > > > > > > > > mathworld also agrees with you. As does the documentation > > (as best > > > as I > > > can tell) and the actual output of kron. I think Charles must > be > > > thinking of the tensor product instead. > > > > > > > > > It *is* the tensor product, A \tensor B, but it is not the most > > > general tensor with four indices just as a bivector is not the > most > > > general tensor with two indices. Numerically, kron chooses to > > > represent the tensor product of two vector spaces a, b with > > dimensions > > > n,m respectively as the direct sum of n copies of b, and > the tensor > > > product of two operators takes the given form. More generally, the > B > > > matrix in each spot could be replaced with an arbitrary matrix > > of the > > > correct dimensions and you would recover the general tensor with > > four > > > indices. > > > > > > Anyway, it sounds like you are proposing that the tensor (outer) > > > product of two matrices be reshaped to run over two indices. It > > seems > > > that likewise the tensor (outer) product of two vectors should be > > > reshaped to run over one index ( i.e. flat). That would do the > > trick. > > > > I'm not proposing anything. I don't care at all what kron does. I > > just > > want to fix the return type if that's feasible so that people stop > > complaining about it. As far as I can tell, kron already returns a > > flattened tensor product of some sort. I believe the general tensor > > product that you are talking about is already covered by > > multiply.outer, > > but I'm not sure so correct me if I'm wrong. Here's what kron does > as > > present: > > > > >>> a > > array([[1, 1], > > [1, 1]]) > > >>> kron(a,a) # => 4x4 matrix > > array([[1, 1, 1, 1], > > [1, 1, 1, 1], > > [1, 1, 1, 1], > > [1, 1, 1, 1]]) > > > > > > Good at first look. Lets see a simpler version... Nevermind, seems > > numpy isn't working on this machine (X86_64, fc5 64 bit) at the > > moment, maybe I need to check out a clean version. 
> > > > >>> kron(a,a[0]) => 8x1 > > array([1, 1, 1, 1, 1, 1, 1, 1]) > > > > > > Looks broken. a[0] should be an operator (matrix), so either it should > > be (2,1) or (1,2). > > Since a is an array here, a[0] is shape (2,). Let's repeat this > excercise using matrices, which are always rank-2 and see if they make > sense. > > >>> m > matrix([[1, 1], > [1, 1]]) > >>> kron(m, m[0]) > matrix([[1, 1, 1, 1], > [1, 1, 1, 1]]) > >>> kron(m,m[:,0]) > matrix([[1, 1], > [1, 1], > [1, 1], > [1, 1]]) > > That looks OK. > > > In the first case, the return should have shape (4,2), in the latter > > (2,4). Should probably raise an error as the result strikes me as > > ambiguous. But I have to admit I am not sure what the point of this > > particular construction is. > > > > >>> kron(a[0], a[0]) > > Traceback (most recent call last): > > File "", line 1, in ? > > File "C:\Python24\Lib\site-packages\numpy\lib\shape_base.py", line > > 577, in kron > > result = concatenate(concatenate(o, axis=1), axis=1) > > ValueError: 0-d arrays can't be concatenated > > > > > >>> kron(m[0], m[0]) > matrix([[1, 1, 1, 1]]) > >>> kron(m[:,0], m[:,0]) > matrix([[1], > [1], > [1], > [1]]) > >>> kron(m[:,0],m[0]) > matrix([[1, 1], > [1, 1]]) > > > See above. this could be (1,4) or (4,1), depending. > > All of these look like they're probably right without thinking about it > too hard. > > > > > >>> b.shape > > (2, 2, 2) > > >>> kron(b,b).shape > > (4, 4, 2, 2) > > > > > > I think this is doing transpose(outer(b,b), axis=(0,2,1,3)) and > > reshaping the first 4 indices into 2. Again, I am not sure what the > > point is for these operators. Now another way to get all this > > functionality is to have a contraction function or method with a list > > of axis. For instance, consider the matrices A(i,j) and B(k,l) > > operating on x(j) and y(l) like A(i,j)x(j) and B(k,l)y(l), then the > > outer product of all of these is > > > > A(i,j)B(k,l)x(j)y(l) > > > > with the summation convention on the indices j and l. The result > > should be the same as kron(A,B)*kron(x,y) up to a permutation of rows > > and columes. It is just a question of which basis is used and how the > > elements are indexed. > > > > So, it looks like the 2d x 2d product obeys Alan's definition. The > > other > > products are probably all broken. > > > Here's my best guess as to what is going on: > 1. There is a relatively large group of people who use Kronecker > product as Alan does (probably the matrix as opposed to tensor math > folks). I'm guessing it's a large group since they manage to write the > definitions at both mathworld and planetmath. > 2. kron was meant to implement this. > 2.5 People who need the other meaning of kron can just use outer, so > no real conflict. > 3. The implementation was either inappropriately generalized or it > was assumed that all inputs would be matrices (and hence rank-2). Uh-huh. Assuming 3. is correct, and I'd like to hear from people if they think > that the behaviour in the non rank-2 cases is sensible, the next > question is whether the behaviour in the rank-2 cases makes sense. It > seem to, but I'm not a user of kron. If both of the preceeding are true, > it seems like a complete fix entails the following two things: > 1. Forbid arguments that are not rank-2. This allows all matrices, > which is really the main target here I think. > 2. Fix the return type issue. I have a fix for this ready to commit, > but I want to figure out the first part as well. 
I think it was inappropriately generalized, it is hard to make sense of what kron means for rank > 2. So I vote for restricting the usage to matrices, or arrays of rank two. This avoids both the ambiguity of rank-one arrays and the big "why" that arises for arrays with rank > 2. Note that in tensor algebra the rank 1 problem is solved by the use of upper or lower indices, lower index => [1,n], upper index => [n,1]. Hmm, I should check that kron is associative: kron(kron(a,b),c) == kron(a, kron(b,c)) like a good tensor product should be. I suspect it is. Regards, > > -tim Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Thu Apr 13 17:22:01 2006 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu Apr 13 17:22:01 2006 Subject: [Numpy-discussion] Problem on FC5 Message-ID: Has anyone else seen this: Python 2.4.2 (#1, Feb 12 2006, 03:45:41) > [GCC 4.1.0 20060210 (Red Hat 4.1.0-0.24)] on linux2 > Type "help", "copyright", "credits" or "license" for more information. > >>> from numpy import * > *** buffer overflow detected ***: python terminated > ======= Backtrace: ========= > /lib64/libc.so.6(__chk_fail+0x2f)[0x32c76dee3f] > > /usr/lib64/python2.4/site-packages/numpy/core/multiarray.so[0x2aaaae191099] this is on FC5-x86_64. I didn't see any problems in the compilation and the right lib64 libs seem to have been used. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From ivazquez at ivazquez.net Thu Apr 13 17:48:18 2006 From: ivazquez at ivazquez.net (Ignacio Vazquez-Abrams) Date: Thu Apr 13 17:48:18 2006 Subject: [Numpy-discussion] Problem on FC5 In-Reply-To: References: Message-ID: <1144975662.3758.3.camel@ignacio.lan> On Thu, 2006-04-13 at 18:21 -0600, Charles R Harris wrote: > this is on FC5-x86_64. I didn't see any problems in the compilation > and the right lib64 libs seem to have been used. Self-built or from Fedora Extras? -- Ignacio Vazquez-Abrams http://fedora.ivazquez.net/ gpg --keyserver hkp://subkeys.pgp.net --recv-key 38028b72 -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 191 bytes Desc: This is a digitally signed message part URL: From charlesr.harris at gmail.com Thu Apr 13 19:04:10 2006 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu Apr 13 19:04:10 2006 Subject: [Numpy-discussion] Problem on FC5 In-Reply-To: <1144975662.3758.3.camel@ignacio.lan> References: <1144975662.3758.3.camel@ignacio.lan> Message-ID: OK, I solved this problem by deleting the numpy directory in site-packages. I probably should have tried that first :-/ On 4/13/06, Ignacio Vazquez-Abrams wrote: > > On Thu, 2006-04-13 at 18:21 -0600, Charles R Harris wrote: > > this is on FC5-x86_64. I didn't see any problems in the compilation > > and the right lib64 libs seem to have been used. > > Self-built or from Fedora Extras? > > -- > Ignacio Vazquez-Abrams > http://fedora.ivazquez.net/ > > gpg --keyserver hkp://subkeys.pgp.net --recv-key 38028b72 > > -------------- next part -------------- An HTML attachment was scrubbed...
URL: From chanley at stsci.edu Fri Apr 14 07:27:03 2006 From: chanley at stsci.edu (Christopher Hanley) Date: Fri Apr 14 07:27:03 2006 Subject: [Numpy-discussion] numpy.test() segfaults under Solaris 8 Message-ID: <443FB11E.5040102@stsci.edu> From the daily Solaris 8 regression tests: Found 5 tests for numpy.distutils.misc_util Warning: No test file found in /data/basil5/site-packages/lib/python/numpy/core/tests for module Warning: No test file found in /data/basil5/site-packages/lib/python/numpy/core/tests for module Found 4 tests for numpy.lib.getlimits Found 30 tests for numpy.core.numerictypes Warning: No test file found in /data/basil5/site-packages/lib/python/numpy/random/tests for module Warning: No test file found in /data/basil5/site-packages/lib/python/numpy/random/tests for module Warning: No test file found in /data/basil5/site-packages/lib/python/numpy/linalg/tests for module Warning: No test file found in /data/basil5/site-packages/lib/python/numpy/testing/tests for module Found 13 tests for numpy.core.umath Warning: No test file found in /data/basil5/site-packages/lib/python/numpy/distutils/tests for module Warning: No test file found in /data/basil5/site-packages/lib/python/numpy/distutils/tests for module Warning: No test file found in /data/basil5/site-packages/lib/python/numpy/linalg/tests for module Found 8 tests for numpy.lib.arraysetops Warning: No test file found in /data/basil5/numpy/tests for module Warning: No test file found in /data/basil5/site-packages/lib/python/numpy/distutils/tests for module Found 42 tests for numpy.lib.type_check Warning: No test file found in /data/basil5/site-packages/lib/python/numpy/distutils/tests for module Found 90 tests for numpy.core.multiarray Warning: No test file found in /data/basil5/site-packages/lib/python/numpy/distutils/tests for module Warning: No test file found in /data/basil5/numpy/tests for module Warning: No test file found in /data/basil5/site-packages/lib/python/numpy/distutils/tests for module Found 3 tests for numpy.dft.helper Warning: No test file found in /data/basil5/site-packages/lib/python/numpy/lib/tests for module Found 36 tests for numpy.core.ma Warning: No test file found in /data/basil5/site-packages/lib/python/numpy/f2py/tests for module Warning: No test file found in /data/basil5/site-packages/lib/python/numpy/lib/tests for module Warning: No test file found in /data/basil5/site-packages/lib/python/numpy/core/tests for module Warning: No test file found in /data/basil5/site-packages/lib/python/numpy/core/tests for module Found 2 tests for numpy.core.oldnumeric Warning: No test file found in /data/basil5/site-packages/lib/python/numpy/core/tests for module Warning: No test file found in /data/basil5/site-packages/lib/python/numpy/distutils/tests for module Warning: No test file found in /data/basil5/site-packages/lib/python/numpy/linalg/tests for module Warning: No test file found in /data/basil5/site-packages/lib/python/numpy/dft/tests for module Warning: No test file found in /data/basil5/site-packages/lib/python/numpy/dft/tests for module Warning: No test file found in /data/basil5/site-packages/lib/python/numpy/random/tests for module Found 9 tests for numpy.lib.twodim_base Warning: No test file found in /data/basil5/site-packages/lib/python/numpy/distutils/tests for module Warning: No test file found in /data/basil5/site-packages/lib/python/numpy/core/tests for module Warning: No test file found in /data/basil5/numpy/tests for module Warning: No test file found in 
/data/basil5/site-packages/lib/python/numpy/dft/tests for module Found 8 tests for numpy.core.defmatrix Warning: No test file found in /data/basil5/numpy/tests for module Warning: No test file found in /data/basil5/site-packages/lib/python/numpy/testing/tests for module Found 1 tests for numpy.lib.ufunclike Warning: No test file found in /data/basil5/site-packages/lib/python/numpy/lib/tests for module Warning: No test file found in /data/basil5/site-packages/lib/python/numpy/lib/tests for module Warning: No test file found in /data/basil5/site-packages/lib/python/numpy/dft/tests for module Found 32 tests for numpy.lib.function_base Found 1 tests for numpy.lib.polynomial Warning: No test file found in /data/basil5/numpy/tests for module Found 6 tests for numpy.core.records Warning: No test file found in /data/basil5/site-packages/lib/python/numpy/testing/tests for module Found 17 tests for numpy.core.numeric Warning: No test file found in /data/basil5/site-packages/lib/python/numpy/core/tests for module Warning: No test file found in /data/basil5/site-packages/lib/python/numpy/core/tests for module Warning: No test file found in /data/basil5/site-packages/lib/python/numpy/testing/tests for module Warning: No test file found in /data/basil5/site-packages/lib/python/numpy/lib/tests for module Found 4 tests for numpy.lib.index_tricks Found 44 tests for numpy.lib.shape_base Warning: No test file found in /data/basil5/site-packages/lib/python/numpy/lib/tests for module Warning: No test file found in /data/basil5/site-packages/lib/python/numpy/linalg/tests for module Found 0 tests for __main__ check_1 (numpy.distutils.tests.test_misc_util.test_appendpath) ... ok check_2 (numpy.distutils.tests.test_misc_util.test_appendpath) ... ok check_3 (numpy.distutils.tests.test_misc_util.test_appendpath) ... ok check_gpaths (numpy.distutils.tests.test_misc_util.test_gpaths) ... ok check_1 (numpy.distutils.tests.test_misc_util.test_minrelpath) ... ok check_singleton (numpy.lib.tests.test_getlimits.test_double) ... ok check_singleton (numpy.lib.tests.test_getlimits.test_longdouble) ... ok check_singleton (numpy.lib.tests.test_getlimits.test_python_float) ... ok check_singleton (numpy.lib.tests.test_getlimits.test_single) ... ok Check creation from list of list of tuples ... ok Check creation from list of tuples ... ok Check creation from tuples ... ok Check creation from list of list of tuples ... ok Check creation from list of tuples ... ok Check creation from tuples ... ok Check creation from list of list of tuples ... ok Check creation from list of tuples ... ok Check creation from tuples ... ok Check creation from list of list of tuples ... ok Check creation from list of tuples ... ok Check creation from tuples ... ok Check creation of 0-dimensional objects ... ok Check creation of multi-dimensional objects ... ok Check creation of single-dimensional objects ... ok Check creation of 0-dimensional objects ... ok Check creation of multi-dimensional objects ... ok Check creation of single-dimensional objects ... ok Check reading the top fields of a nested array ... ok Check reading the nested fields of a nested array (1st level) ... ok Check access nested descriptors of a nested array (1st level) ... ok Check reading the nested fields of a nested array (2nd level) ... ok Check access nested descriptors of a nested array (2nd level) ... ok Check reading the top fields of a nested array ... ok Check reading the nested fields of a nested array (1st level) ... 
ok Check access nested descriptors of a nested array (1st level) ... ok Check reading the nested fields of a nested array (2nd level) ... ok Check access nested descriptors of a nested array (2nd level) ... ok check_access_fields (numpy.core.tests.test_numerictypes.test_read_values_plain_multiple) ... ok check_access_fields (numpy.core.tests.test_numerictypes.test_read_values_plain_single) ... ok test_mixed (numpy.core.tests.test_umath.test_choose) ... ok check_expm1 (numpy.core.tests.test_umath.test_expm1) ... ok check_floating_point (numpy.core.tests.test_umath.test_floating_point) ... ok check_log1p (numpy.core.tests.test_umath.test_log1p) ... ok check_reduce_complex (numpy.core.tests.test_umath.test_maximum) ... ok check_reduce_complex (numpy.core.tests.test_umath.test_minimum) ... ok check_power_complex (numpy.core.tests.test_umath.test_power) ... ok check_power_float (numpy.core.tests.test_umath.test_power) ... ok test_array_with_context (numpy.core.tests.test_umath.test_special_methods) ... ok test_failing_wrap (numpy.core.tests.test_umath.test_special_methods) ... ok test_old_wrap (numpy.core.tests.test_umath.test_special_methods) ... ok test_priority (numpy.core.tests.test_umath.test_special_methods) ... ok test_wrap (numpy.core.tests.test_umath.test_special_methods) ... ok check_intersect1d (numpy.lib.tests.test_arraysetops.test_aso) ... ok check_intersect1d_nu (numpy.lib.tests.test_arraysetops.test_aso) ... ok check_manyways (numpy.lib.tests.test_arraysetops.test_aso) ... ok check_setdiff1d (numpy.lib.tests.test_arraysetops.test_aso) ... ok check_setmember1d (numpy.lib.tests.test_arraysetops.test_aso) ... ok check_setxor1d (numpy.lib.tests.test_arraysetops.test_aso) ... ok check_union1d (numpy.lib.tests.test_arraysetops.test_aso) ... ok check_unique1d (numpy.lib.tests.test_arraysetops.test_aso) ... ok check_cmplx (numpy.lib.tests.test_type_check.test_imag) ... ok check_real (numpy.lib.tests.test_type_check.test_imag) ... ok check_fail (numpy.lib.tests.test_type_check.test_iscomplex) ... ok check_pass (numpy.lib.tests.test_type_check.test_iscomplex) ... ok check_basic (numpy.lib.tests.test_type_check.test_iscomplexobj) ... ok check_complex (numpy.lib.tests.test_type_check.test_isfinite) ... ok check_complex1 (numpy.lib.tests.test_type_check.test_isfinite) ... ok check_goodvalues (numpy.lib.tests.test_type_check.test_isfinite) ... ok check_ind (numpy.lib.tests.test_type_check.test_isfinite) ... ok check_integer (numpy.lib.tests.test_type_check.test_isfinite) ... ok check_neginf (numpy.lib.tests.test_type_check.test_isfinite) ... ok check_posinf (numpy.lib.tests.test_type_check.test_isfinite) ... ok check_goodvalues (numpy.lib.tests.test_type_check.test_isinf) ... ok check_ind (numpy.lib.tests.test_type_check.test_isinf) ... ok check_neginf (numpy.lib.tests.test_type_check.test_isinf) ... ok check_neginf_scalar (numpy.lib.tests.test_type_check.test_isinf) ... ok check_posinf (numpy.lib.tests.test_type_check.test_isinf) ... ok check_posinf_scalar (numpy.lib.tests.test_type_check.test_isinf) ... ok check_complex (numpy.lib.tests.test_type_check.test_isnan) ... ok check_complex1 (numpy.lib.tests.test_type_check.test_isnan) ... ok check_goodvalues (numpy.lib.tests.test_type_check.test_isnan) ... ok check_ind (numpy.lib.tests.test_type_check.test_isnan) ... ok check_integer (numpy.lib.tests.test_type_check.test_isnan) ... ok check_neginf (numpy.lib.tests.test_type_check.test_isnan) ... ok check_posinf (numpy.lib.tests.test_type_check.test_isnan) ... 
ok check_generic (numpy.lib.tests.test_type_check.test_isneginf) ... ok check_generic (numpy.lib.tests.test_type_check.test_isposinf) ... ok check_fail (numpy.lib.tests.test_type_check.test_isreal) ... ok check_pass (numpy.lib.tests.test_type_check.test_isreal) ... ok check_basic (numpy.lib.tests.test_type_check.test_isrealobj) ... ok check_basic (numpy.lib.tests.test_type_check.test_isscalar) ... ok check_default_1 (numpy.lib.tests.test_type_check.test_mintypecode) ... ok check_default_2 (numpy.lib.tests.test_type_check.test_mintypecode) ... ok check_default_3 (numpy.lib.tests.test_type_check.test_mintypecode) ... ok check_complex_bad (numpy.lib.tests.test_type_check.test_nan_to_num) ... ok check_complex_bad2 (numpy.lib.tests.test_type_check.test_nan_to_num) ... ok check_complex_good (numpy.lib.tests.test_type_check.test_nan_to_num) ... ok check_generic (numpy.lib.tests.test_type_check.test_nan_to_num) ... ok check_integer (numpy.lib.tests.test_type_check.test_nan_to_num) ... ok check_cmplx (numpy.lib.tests.test_type_check.test_real) ... ok check_real (numpy.lib.tests.test_type_check.test_real) ... ok check_basic (numpy.lib.tests.test_type_check.test_real_if_close) ... ok Check assignment of 0-dimensional objects with values ... ok Check assignment of multi-dimensional objects with values ... ok Check assignment of single-dimensional objects with values ... ok Check assignment of 0-dimensional objects with values ... ok Check assignment of multi-dimensional objects with values ... ok Check assignment of single-dimensional objects with values ... ok Check assignment of 0-dimensional objects with values ... ok Check assignment of multi-dimensional objects with values ... ok Check assignment of single-dimensional objects with values ... ok Check assignment of 0-dimensional objects with values ... ok Check assignment of multi-dimensional objects with values ... ok Check assignment of single-dimensional objects with values ... ok Check assignment of 0-dimensional objects with values ... ok Check assignment of multi-dimensional objects with values ... ok Check assignment of single-dimensional objects with values ... ok Check assignment of 0-dimensional objects with values ... ok Check assignment of multi-dimensional objects with values ... ok Check assignment of single-dimensional objects with values ... ok check_attributes (numpy.core.tests.test_multiarray.test_attributes) ... ok check_dtypeattr (numpy.core.tests.test_multiarray.test_attributes) ... ok check_fill (numpy.core.tests.test_multiarray.test_attributes) ... ok check_set_stridesattr (numpy.core.tests.test_multiarray.test_attributes) ... ok check_stridesattr (numpy.core.tests.test_multiarray.test_attributes) ... ok check_test_interning (numpy.core.tests.test_multiarray.test_bool) ... ok Check byteorder of 0-dimensional objects ... ok Check byteorder of multi-dimensional objects ... ok Check byteorder of single-dimensional objects ... ok Check byteorder of 0-dimensional objects ... ok Check byteorder of multi-dimensional objects ... ok Check byteorder of single-dimensional objects ... ok Check byteorder of 0-dimensional objects ... ok Check byteorder of multi-dimensional objects ... ok Check byteorder of single-dimensional objects ... ok Check byteorder of 0-dimensional objects ... ok Check byteorder of multi-dimensional objects ... ok Check byteorder of single-dimensional objects ... ok Check byteorder of 0-dimensional objects ... ok Check byteorder of multi-dimensional objects ... ok Check byteorder of single-dimensional objects ... 
ok Check byteorder of 0-dimensional objects ... ok Check byteorder of multi-dimensional objects ... ok Check byteorder of single-dimensional objects ... ok Check creation of 0-dimensional objects with values ... ok Check creation of multi-dimensional objects with values ... ok Check creation of single-dimensional objects with values ... ok Check creation of 0-dimensional objects with values ... ok Check creation of multi-dimensional objects with values ... ok Check creation of single-dimensional objects with values ... ok Check creation of 0-dimensional objects with values ... ok Check creation of multi-dimensional objects with values ... ok Check creation of single-dimensional objects with values ... ok Check creation of 0-dimensional objects with values ... ok Check creation of multi-dimensional objects with values ... ok Check creation of single-dimensional objects with values ... ok Check creation of 0-dimensional objects with values ... ok Check creation of multi-dimensional objects with values ... ok Check creation of single-dimensional objects with values ... ok Check creation of 0-dimensional objects with values ... ok Check creation of multi-dimensional objects with values ... ok Check creation of single-dimensional objects with values ... ok Check creation of 0-dimensional objects ... ok Check creation of multi-dimensional objects ... ok Check creation of single-dimensional objects ... ok Check creation of 0-dimensional objects ... ok Check creation of multi-dimensional objects ... ok Check creation of single-dimensional objects ... ok Check creation of 0-dimensional objects ... ok Check creation of multi-dimensional objects ... ok Check creation of single-dimensional objects ... ok check_from_attribute (numpy.core.tests.test_multiarray.test_creation) ... ok check_construction (numpy.core.tests.test_multiarray.test_dtypedescr) ... ok check_list (numpy.core.tests.test_multiarray.test_fancy_indexing) ... ok check_tuple (numpy.core.tests.test_multiarray.test_fancy_indexing) ... ok check_otherflags (numpy.core.tests.test_multiarray.test_flags) ... ok check_writeable (numpy.core.tests.test_multiarray.test_flags) ... ok check_ascii (numpy.core.tests.test_multiarray.test_fromstring) ... ok check_binary (numpy.core.tests.test_multiarray.test_fromstring) ... ok check_test_round (numpy.core.tests.test_multiarray.test_methods) ... ok check_both (numpy.core.tests.test_multiarray.test_pickling) ... ok check_test_zero_rank (numpy.core.tests.test_multiarray.test_subscripting) ... ok check_constructor (numpy.core.tests.test_multiarray.test_zero_rank) ... ok check_ellipsis_subscript (numpy.core.tests.test_multiarray.test_zero_rank) ... ok check_ellipsis_subscript_assignment (numpy.core.tests.test_multiarray.test_zero_rank) ... ok check_empty_subscript (numpy.core.tests.test_multiarray.test_zero_rank) ... ok check_empty_subscript_assignment (numpy.core.tests.test_multiarray.test_zero_rank) ... ok check_invalid_newaxis (numpy.core.tests.test_multiarray.test_zero_rank) ... ok check_invalid_subscript (numpy.core.tests.test_multiarray.test_zero_rank) ... ok check_invalid_subscript_assignment (numpy.core.tests.test_multiarray.test_zero_rank) ... ok check_newaxis (numpy.core.tests.test_multiarray.test_zero_rank) ... ok check_output (numpy.core.tests.test_multiarray.test_zero_rank) ... ok check_definition (numpy.dft.tests.test_helper.test_fftfreq) ... ok check_definition (numpy.dft.tests.test_helper.test_fftshift) ... ok check_inverse (numpy.dft.tests.test_helper.test_fftshift) ... 
ok test_clip (numpy.core.tests.test_ma.test_array_methods) ... ok test_cumprod (numpy.core.tests.test_ma.test_array_methods) ... ok test_cumsum (numpy.core.tests.test_ma.test_array_methods) ... ok test_ptp (numpy.core.tests.test_ma.test_array_methods) ... ok test_swapaxes (numpy.core.tests.test_ma.test_array_methods) ... ok test_trace (numpy.core.tests.test_ma.test_array_methods) ... ok test_varstd (numpy.core.tests.test_ma.test_array_methods) ... ok check_testAPI (numpy.core.tests.test_ma.test_ma) ... ok Test add, sum, product. ... ok Test of basic arithmetic. ... ok check_testArrayAttributes (numpy.core.tests.test_ma.test_ma) ... ok check_testArrayMethods (numpy.core.tests.test_ma.test_ma) ... ok Test of average. ... ok More tests of average. ... ok Test of basic array creation and properties in 1 dimension. ... ok Test of basic array creation and properties in 2 dimensions. ... ok Test of conversions and indexing ... ok Tests of some subtle points of copying and sizing. ... ok Test of inplace operations and rich comparisons ... ok check_testMaPut (numpy.core.tests.test_ma.test_ma) ... ok Test of masked element ... ok Test of minumum, maximum. ... ok check_testMixedArithmetic (numpy.core.tests.test_ma.test_ma) ... ok Test of other odd features ... ok Test of pickling ... ok Test of put ... ok check_testScalarArithmetic (numpy.core.tests.test_ma.test_ma) ... ok check_testSingleElementSubscript (numpy.core.tests.test_ma.test_ma) ... ok Test of take, transpose, inner, outer products ... ok check_testToPython (numpy.core.tests.test_ma.test_ma) ... ok Test various functions such as sin, cos. ... ok Test count ... ok check_testUfuncRegression (numpy.core.tests.test_ma.test_ufuncs) ... ok test_minmax (numpy.core.tests.test_ma.test_ufuncs) ... ok test_nonzero (numpy.core.tests.test_ma.test_ufuncs) ... ok test_reduce (numpy.core.tests.test_ma.test_ufuncs) ... ok check_bug_r2089 (numpy.core.tests.test_oldnumeric.test_put) ... ok check_array_subclass (numpy.core.tests.test_oldnumeric.test_wrapit) ... ok check_matrix (numpy.lib.tests.test_twodim_base.test_diag) ... ok check_vector (numpy.lib.tests.test_twodim_base.test_diag) ... ok check_2d (numpy.lib.tests.test_twodim_base.test_eye) ... ok check_basic (numpy.lib.tests.test_twodim_base.test_eye) ... ok check_diag (numpy.lib.tests.test_twodim_base.test_eye) ... ok check_diag2d (numpy.lib.tests.test_twodim_base.test_eye) ... ok check_basic (numpy.lib.tests.test_twodim_base.test_fliplr) ... ok check_basic (numpy.lib.tests.test_twodim_base.test_flipud) ... ok check_basic (numpy.lib.tests.test_twodim_base.test_rot90) ... ok check_basic (numpy.core.tests.test_defmatrix.test_algebra) ... ok check_basic (numpy.core.tests.test_defmatrix.test_casting) ... ok check_basic (numpy.core.tests.test_defmatrix.test_ctor) ... ok check_asmatrix (numpy.core.tests.test_defmatrix.test_properties) ... ok check_basic (numpy.core.tests.test_defmatrix.test_properties) ... ok check_comparisons (numpy.core.tests.test_defmatrix.test_properties) ... ok check_noaxis (numpy.core.tests.test_defmatrix.test_properties) ... ok Test whether matrix.sum(axis=1) preserves orientation. ... ok Doctest: numpy.lib.tests.test_ufunclike ... ok check_basic (numpy.lib.tests.test_function_base.test_all) ... ok check_nd (numpy.lib.tests.test_function_base.test_all) ... ok check_basic (numpy.lib.tests.test_function_base.test_amax) ... ok check_basic (numpy.lib.tests.test_function_base.test_amin) ... ok check_basic (numpy.lib.tests.test_function_base.test_angle) ... 
check_basic (numpy.lib.tests.test_function_base.test_any) ... ok
check_nd (numpy.lib.tests.test_function_base.test_any) ... ok
check_basic (numpy.lib.tests.test_function_base.test_average) ... ok
check_basic (numpy.lib.tests.test_function_base.test_cumprod) ... ok
check_basic (numpy.lib.tests.test_function_base.test_cumsum) ... ok
check_basic (numpy.lib.tests.test_function_base.test_diff) ... ok
check_nd (numpy.lib.tests.test_function_base.test_diff) ... ok
check_basic (numpy.lib.tests.test_function_base.test_extins) ... ok
check_both (numpy.lib.tests.test_function_base.test_extins) ... ok
check_insert (numpy.lib.tests.test_function_base.test_extins) ... ok
check_bartlett (numpy.lib.tests.test_function_base.test_filterwindows) ... ok
check_blackman (numpy.lib.tests.test_function_base.test_filterwindows) ... ok
check_hamming (numpy.lib.tests.test_function_base.test_filterwindows) ... ok
check_hanning (numpy.lib.tests.test_function_base.test_filterwindows) ... ok
check_simple (numpy.lib.tests.test_function_base.test_histogram) ... ok
check_basic (numpy.lib.tests.test_function_base.test_linspace) ... ok
check_corner (numpy.lib.tests.test_function_base.test_linspace) ... ok
check_basic (numpy.lib.tests.test_function_base.test_logspace) ... ok
check_basic (numpy.lib.tests.test_function_base.test_prod) ... ok
check_basic (numpy.lib.tests.test_function_base.test_ptp) ... ok
check_simple (numpy.lib.tests.test_function_base.test_sinc) ... ok
check_simple (numpy.lib.tests.test_function_base.test_trapz) ... ok
check_basic (numpy.lib.tests.test_function_base.test_trim_zeros) ... ok
check_leading_skip (numpy.lib.tests.test_function_base.test_trim_zeros) ... ok
check_trailing_skip (numpy.lib.tests.test_function_base.test_trim_zeros) ... ok
check_simple (numpy.lib.tests.test_function_base.test_unwrap) ... ok
check_vectorize (numpy.lib.tests.test_function_base.test_vectorize)Segmentation Fault (core dumped)

This is a clean checkout and build of numpy that is done every morning on a Solaris 8 system. We are currently using python 2.4.2 on this machine. The equivalent build and test on a RHE system passed with no problems.

Chris

From fullung at gmail.com Fri Apr 14 08:15:14 2006
From: fullung at gmail.com (Albert Strasheim)
Date: Fri Apr 14 08:15:14 2006
Subject: [Numpy-discussion] Vectorize bug
In-Reply-To: <20060412124032.GA30471@sun.ac.za>
Message-ID: <006c01c65fd6$2d043b90$0502010a@dsp.sun.ac.za>

Hello all

There still seems to be a problem with vectorize (or something else). So far I've only been able to reproduce the problem by running the test suite 5 times under IPython on Windows (weird, eh?). Details here:

http://projects.scipy.org/scipy/numpy/ticket/52

If anybody has some ideas on how to do a proper debug build with MinGW so that I can get a useful stack trace from the Visual Studio debugger, I can narrow down the problem further.

Regards,

Albert

> -----Original Message-----
> From: numpy-discussion-admin at lists.sourceforge.net [mailto:numpy-
> discussion-admin at lists.sourceforge.net] On Behalf Of Stefan van der Walt
> Sent: 12 April 2006 14:41
> To: numpy-discussion at lists.sourceforge.net
> Subject: [Numpy-discussion] Vectorize bug
>
> Hello all
>
> Vectorize segfaults for large arrays. I filed the bug at
>
> http://projects.scipy.org/scipy/numpy/ticket/52
>
> The offending code is
>
> import numpy as N
> x = N.linspace(-3,2,10000)
> y = N.vectorize(lambda x: x)
>
> # Segfaults here
> y(x)
>
> Regards
> Stéfan

From fullung at gmail.com Fri Apr 14 08:18:02 2006
From: fullung at gmail.com (Albert Strasheim)
Date: Fri Apr 14 08:18:02 2006
Subject: [Numpy-discussion] numpy.test() segfaults under Solaris 8
In-Reply-To: <443FB11E.5040102@stsci.edu>
Message-ID: <006d01c65fd6$85b72450$0502010a@dsp.sun.ac.za>

Hello Chris

I am seeing this same crash on Windows under IPython with revision 2351 of NumPy from SVN. If you can get a useful stack trace on your platform, you could add some details to this ticket:

http://projects.scipy.org/scipy/numpy/ticket/52

Regards,

Albert

> -----Original Message-----
> From: numpy-discussion-admin at lists.sourceforge.net [mailto:numpy-
> discussion-admin at lists.sourceforge.net] On Behalf Of Christopher Hanley
> Sent: 14 April 2006 16:27
> To: numpy-discussion
> Subject: [Numpy-discussion] numpy.test() segfaults under Solaris 8
>
> From the daily Solaris 8 regression tests:
> check_vectorize
> (numpy.lib.tests.test_function_base.test_vectorize)Segmentation Fault
> (core dumped)
>
> This is a clean checkout and build of numpy that is done every morning
> on a Solaris 8 system. We are currently using python 2.4.2 on this
> machine. The equivalent build and test on a RHE system passed with no
> problems.
>
> Chris

From faltet at xot.carabos.com Fri Apr 14 14:36:06 2006
From: faltet at xot.carabos.com (faltet at xot.carabos.com)
Date: Fri Apr 14 14:36:06 2006
Subject: [Numpy-discussion] Performance problems with strided arrays in NumPy
Message-ID: <20060414213511.GA14355@xot.carabos.com>

Hi,

I'm seeing some slowness in NumPy when dealing with strided arrays. numarray is dealing better with these situations, so I guess that something could be done in NumPy about this. Below are the situations that I've found up to now (maybe there are others). For the timings, I've used numpy 0.9.7.2278 and numarray 1.5.1.
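To make the setup concrete: the timed arrays below are non-contiguous views that skip over a larger base buffer. A minimal sketch of such a view, with the same parameters as the timings:

import numpy as np
a = np.arange(1000000, dtype="Float64")[::10]  # take every 10th element: a strided view
print a.shape    # (100000,)
print a.strides  # (80,): 10 elements * 8 bytes between consecutive items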
It seems that NumPy copy() method is almost 3x slower than in numarray:

In [105]: npcopy=timeit.Timer('b=a.copy()','import numpy as np;a=np.arange(1000000,dtype="Float64")[::10]')

In [106]: npcopy.repeat(3,10)
Out[106]: [0.171913146972656, 0.175906896591186, 0.171195983886718]

In [107]: nacopy=timeit.Timer('b=a.copy()','import numarray as np;a=np.arange(1000000,type="Float64")[::10]')

In [108]: nacopy.repeat(3,10)
Out[108]: [0.065090894699096, 0.0630550384521484, 0.0626609325408935]

However, a copy without strides performs similarly in both packages:

In [127]: npcopy2=timeit.Timer('b=a.copy()','import numpy as np;a=np.arange(1000000,dtype="Float64")')

In [128]: npcopy2.repeat(3,10)
Out[128]: [0.24657797813415527, 0.24657106399536133, 0.2464911937713623]

In [129]: nacopy2=timeit.Timer('b=a.copy()','import numarray as np;a=np.arange(1000000,type="Float64")')

In [130]: nacopy2.repeat(3,10)
Out[130]: [0.244544982910156, 0.251885890960693, 0.2419440746307373]

--------------------------------------------

where() seems more than 2x slower in NumPy than in numarray:

In [136]: tnpf=timeit.Timer('np.where(a + b < 10, a, b)','import numpy as np;a=np.arange(100000,dtype="float64");b=a*2')

In [137]: tnpf.repeat(3,10)
Out[137]: [0.225586891174316, 0.22503495216369629, 0.224209785461425]

In [138]: tnaf=timeit.Timer('np.where(a + b < 2, a, b)','import numarray as np;a=np.arange(100000,type="Float64");b=a*2')

In [139]: tnaf.repeat(3,10)
Out[139]: [0.108436822891235, 0.1069340705871582, 0.10654377937316895]

However, for where() without parameters, NumPy performs slightly better than numarray:

In [143]: tnpf2=timeit.Timer('np.where(a + b < 10)','import numpy as np;a=np.arange(100000,dtype="float64");b=a*2')

In [144]: tnpf2.repeat(3,10)
Out[144]: [0.0759999752044677, 0.0731539726257324, 0.073034048080444336]

In [145]: tnaf2=timeit.Timer('np.where(a + b < 2)','import numarray as np;a=np.arange(100000,type="Float64");b=a*2')

In [146]: tnaf2.repeat(3,10)
Out[146]: [0.0890851020812988, 0.0853078365325927, 0.085799932479858398]

Cheers,

Francesc

From oliphant at ee.byu.edu Fri Apr 14 14:54:06 2006
From: oliphant at ee.byu.edu (Travis Oliphant)
Date: Fri Apr 14 14:54:06 2006
Subject: [Numpy-discussion] Vectorize bug
In-Reply-To: <006c01c65fd6$2d043b90$0502010a@dsp.sun.ac.za>
References: <006c01c65fd6$2d043b90$0502010a@dsp.sun.ac.za>
Message-ID: <444019E8.8000700@ee.byu.edu>

Albert Strasheim wrote:

>Hello all
>
>There still seems to be a problem with vectorize (or something else). So far
>I've only been able to reproduce the problem by running the test suite 5
>times under IPython on Windows (weird, eh?). Details here:
>
>http://projects.scipy.org/scipy/numpy/ticket/52
>
>
I'm pretty sure it's a reference-counting issue. I think I found the problem and it should now be fixed.

I'm hoping this will clear up the Solaris issue as well.

-Travis

From oliphant at ee.byu.edu Fri Apr 14 16:04:02 2006
From: oliphant at ee.byu.edu (Travis Oliphant)
Date: Fri Apr 14 16:04:02 2006
Subject: [Numpy-discussion] Performance problems with strided arrays in NumPy
In-Reply-To: <20060414213511.GA14355@xot.carabos.com>
References: <20060414213511.GA14355@xot.carabos.com>
Message-ID: <44402A2A.9050300@ee.byu.edu>

faltet at xot.carabos.com wrote:

>Hi,
>
>I'm seeing some slowness in NumPy when dealing with strided arrays.
>numarray is dealing better with these situations, so I guess that
>something could be done in NumPy about this. Below are the situations
>that I've found up to now (maybe there are others). For the timings,
>I've used numpy 0.9.7.2278 and numarray 1.5.1.
>
>
What I've found in experiments like this in the past is that numarray is good at striding in one direction but much worse at striding in another direction for multi-dimensional arrays. Of course my experiments were not complete. That just seemed to be the case.

The array-iterator construct handles almost all of these cases. The copy method is a good place to start since it uses that code.

-Travis

From fullung at gmail.com Fri Apr 14 16:34:06 2006
From: fullung at gmail.com (Albert Strasheim)
Date: Fri Apr 14 16:34:06 2006
Subject: [Numpy-discussion] Vectorize bug
In-Reply-To: <444019E8.8000700@ee.byu.edu>
Message-ID: <00f301c6601b$d340a350$0502010a@dsp.sun.ac.za>

Hello Travis

I'm still getting the same crash when running via IPython, which is the only way I've been able to reproduce the crash on Windows.

Just to confirm:

In [1]: import numpy

In [2]: numpy.__version__
Out[2]: '0.9.7.2356'

The crash now happens in check_large, which is the new name of the test method in question.

Cheers,

Albert

> -----Original Message-----
> From: numpy-discussion-admin at lists.sourceforge.net [mailto:numpy-
> discussion-admin at lists.sourceforge.net] On Behalf Of Travis Oliphant
> Sent: 14 April 2006 23:54
> To: numpy-discussion
> Subject: Re: [Numpy-discussion] Vectorize bug
>
> Albert Strasheim wrote:
>
> >Hello all
> >
> >There still seems to be a problem with vectorize (or something else). So far
> >I've only been able to reproduce the problem by running the test suite 5
> >times under IPython on Windows (weird, eh?). Details here:
> >
> >http://projects.scipy.org/scipy/numpy/ticket/52
> >
> >
> I'm pretty sure it's a reference-counting issue. I think I found the
> problem and it should now be fixed.
>
> I'm hoping this will clear up the Solaris issue as well.
>
> -Travis

From oliphant at ee.byu.edu Fri Apr 14 16:43:07 2006
From: oliphant at ee.byu.edu (Travis Oliphant)
Date: Fri Apr 14 16:43:07 2006
Subject: [Numpy-discussion] Vectorize bug
In-Reply-To: <00f301c6601b$d340a350$0502010a@dsp.sun.ac.za>
References: <00f301c6601b$d340a350$0502010a@dsp.sun.ac.za>
Message-ID: <44403354.2040708@ee.byu.edu>

Albert Strasheim wrote:

>Hello Travis
>
>I'm still getting the same crash when running via IPython, which is the only
>way I've been able to reproduce the crash on Windows.
>
>Just to confirm:
>
>In [1]: import numpy
>
>In [2]: numpy.__version__
>Out[2]: '0.9.7.2356'
>
>The crash now happens in check_large, which is the new name of the test
>method in question.
>
>
Do you have SciPy installed?

Make sure you are not importing an old version of SciPy.

I cannot reproduce this problem.

-Travis

From fullung at gmail.com Fri Apr 14 16:55:04 2006
From: fullung at gmail.com (Albert Strasheim)
Date: Fri Apr 14 16:55:04 2006
Subject: [Numpy-discussion] Vectorize bug
In-Reply-To: <44403354.2040708@ee.byu.edu>
Message-ID: <00fa01c6601e$c7707840$0502010a@dsp.sun.ac.za>

Hello

I don't have SciPy installed. Is there any way of doing a debug build of the C code so that I can investigate this problem?

You say that you cannot reproduce this problem. Are you trying to reproduce it on Linux or on Windows under IPython? I have also been unable to reproduce the crash on Linux, but as we saw earlier, this crash also cropped up on Solaris, without having to run the tests N times.
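For reference, "running the tests N times" amounts to repeating the suite in a single IPython session until it falls over, along these lines:

In [1]: import numpy

In [2]: for i in range(5):
   ...:     numpy.test()
   ...:     # on this Windows machine the crash shows up around the fifth run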
Regards,

Albert

> -----Original Message-----
> From: numpy-discussion-admin at lists.sourceforge.net [mailto:numpy-
> discussion-admin at lists.sourceforge.net] On Behalf Of Travis Oliphant
> Sent: 15 April 2006 01:42
> To: numpy-discussion
> Subject: Re: [Numpy-discussion] Vectorize bug
>
> Albert Strasheim wrote:
>
> >Hello Travis
> >
> >I'm still getting the same crash when running via IPython, which is the
> only
> >way I've been able to reproduce the crash on Windows.
> >
> >Just to confirm:
> >
> >In [1]: import numpy
> >
> >In [2]: numpy.__version__
> >Out[2]: '0.9.7.2356'
> >
> >The crash now happens in check_large, which is the new name of the test
> >method in question.
> >
> >
> Do you have SciPy installed?
>
> Make sure you are not importing an old version of SciPy.
>
> I cannot reproduce this problem.
>
> -Travis

From fullung at gmail.com Fri Apr 14 16:58:03 2006
From: fullung at gmail.com (Albert Strasheim)
Date: Fri Apr 14 16:58:03 2006
Subject: [Numpy-discussion] Summer of Code 2006
Message-ID: <00fb01c6601f$26e19b10$0502010a@dsp.sun.ac.za>

Hello all

The Google Summer of Code site for 2006 is up:

http://code.google.com/soc/

Maybe the NumPy team can propose a few projects to be funded by this program. Personally, I'd be interested in working on the build system, especially on Windows, and/or extending the test suite.

Regards,

Albert

From fullung at gmail.com Fri Apr 14 17:19:05 2006
From: fullung at gmail.com (Albert Strasheim)
Date: Fri Apr 14 17:19:05 2006
Subject: [Numpy-discussion] Vectorize bug
In-Reply-To: <00fa01c6601e$c7707840$0502010a@dsp.sun.ac.za>
Message-ID: <00fc01c66022$1b51fb70$0502010a@dsp.sun.ac.za>

Hello all

I think Valgrind might be very useful in tracking down this bug.

http://valgrind.org/

Example usage:

~/bin/valgrind \
    -v --error-limit=no --leak-check=full \
    python -c 'import numpy; numpy.test()'

Valgrind emits many warnings for things going on inside Python on my Fedora Core 4 system, but there are also a lot of interesting things going on in the NumPy code.
Some warnings that someone might want to look at:

==26750== Use of uninitialised value of size 4
==26750==    at 0x453D4B1: DOUBLE_to_OBJECT (arraytypes.inc:4470)
==26750==    by 0x46AB3F3: PyUFunc_GenericFunction (ufuncobject.c:1566)
==26750==    by 0x46ABE9F: ufunc_generic_call (ufuncobject.c:2653)

==26750== Conditional jump or move depends on uninitialised value(s)
==26750==    at 0x4556055: PyArray_Newshape (multiarraymodule.c:524)
==26750==    by 0x45568F4: PyArray_Reshape (multiarraymodule.c:369)
==26750==    by 0x4556931: array_shape_set (arrayobject.c:4642)

==26750== Address 0x41D2010 is 392 bytes inside a block of size 1,648 free'd
==26750==    at 0x4004F6B: free (vg_replace_malloc.c:235)
==26750==    by 0x46A53C3: ufuncloop_dealloc (ufuncobject.c:1280)
==26750==    by 0x46AAD60: PyUFunc_GenericFunction (ufuncobject.c:1656)
==26750==    by 0x46ABE9F: ufunc_generic_call (ufuncobject.c:2653)

==26750== Conditional jump or move depends on uninitialised value(s)
==26750==    at 0x454EE52: PyArray_NewFromDescr (arrayobject.c:4119)
==26750==    by 0x4550919: PyArray_GetField (arraymethods.c:265)
==26750==    by 0x456C05A: array_subscript (arrayobject.c:2010)
==26750==    by 0x456D606: array_subscript_nice (arrayobject.c:2250)

==26750== Conditional jump or move depends on uninitialised value(s)
==26750==    at 0x455ED1D: PyArray_MapIterReset (arrayobject.c:7788)
==26750==    by 0x456D087: array_ass_sub (arrayobject.c:1812)

A possible memory leak:

==26750== 6,051 (1,120 direct, 4,931 indirect) bytes in 28 blocks are definitely lost in loss record 35 of 55
==26750==    at 0x400444E: malloc (vg_replace_malloc.c:149)
==26750==    by 0x45442D8: array_alloc (arrayobject.c:5332)
==26750==    by 0x454F19D: PyArray_NewFromDescr (arrayobject.c:4155)
==26750==    by 0x46A61E4: construct_loop (ufuncobject.c:1000)
==26750==    by 0x46AAD09: PyUFunc_GenericFunction (ufuncobject.c:1401)
==26750==    by 0x46ABE9F: ufunc_generic_call (ufuncobject.c:2653)
==26750==    by 0x454243B: PyArray_GenericBinaryFunction (arrayobject.c:2593)
==26750==    by 0x456DA2C: PyArray_Round (multiarraymodule.c:291)

The following error is generated when the test segfaults:

==26750== Process terminating with default action of signal 11 (SIGSEGV)
==26750==  Access not within mapped region at address 0x10FFFF
==26750==    at 0x453D4B1: DOUBLE_to_OBJECT (arraytypes.inc:4470)
==26750==    by 0x46AB3F3: PyUFunc_GenericFunction (ufuncobject.c:1566)
==26750==    by 0x46ABE9F: ufunc_generic_call (ufuncobject.c:2653)

Regards,

Albert

> -----Original Message-----
> From: numpy-discussion-admin at lists.sourceforge.net [mailto:numpy-
> discussion-admin at lists.sourceforge.net] On Behalf Of Albert Strasheim
> Sent: 15 April 2006 01:55
> To: 'numpy-discussion'
> Subject: RE: [Numpy-discussion] Vectorize bug
>
> Hello
>
> I don't have SciPy installed. Is there any way of doing a debug build of
> the C code so that I can investigate this problem?
>
> You say that you cannot reproduce this problem. Are you trying to
> reproduce it on Linux or on Windows under IPython? I have also been
> unable to reproduce the crash on Linux, but as we saw earlier, this crash
> also cropped up on Solaris, without having to run the tests N times.
> Regards,
>
> Albert
>
> > -----Original Message-----
> > From: numpy-discussion-admin at lists.sourceforge.net [mailto:numpy-
> > discussion-admin at lists.sourceforge.net] On Behalf Of Travis Oliphant
> > Sent: 15 April 2006 01:42
> > To: numpy-discussion
> > Subject: Re: [Numpy-discussion] Vectorize bug
> >
> > Albert Strasheim wrote:
> >
> > >Hello Travis
> > >
> > >I'm still getting the same crash when running via IPython, which is the
> > only
> > >way I've been able to reproduce the crash on Windows.
> > >
> > >Just to confirm:
> > >
> > >In [1]: import numpy
> > >
> > >In [2]: numpy.__version__
> > >Out[2]: '0.9.7.2356'
> > >
> > >The crash now happens in check_large, which is the new name of the test
> > >method in question.
> > >
> > >
> > Do you have SciPy installed?
> >
> > Make sure you are not importing an old version of SciPy.
> >
> > I cannot reproduce this problem.
> >
> > -Travis

From oliphant.travis at ieee.org Fri Apr 14 18:20:03 2006
From: oliphant.travis at ieee.org (Travis Oliphant)
Date: Fri Apr 14 18:20:03 2006
Subject: [Numpy-discussion] Vectorize bug
In-Reply-To: <00fc01c66022$1b51fb70$0502010a@dsp.sun.ac.za>
References: <00fc01c66022$1b51fb70$0502010a@dsp.sun.ac.za>
Message-ID: <44404A18.1070202@ieee.org>

Albert Strasheim wrote:
> Hello all
>
> I think Valgrind might be very useful in tracking down this bug.
>
> http://valgrind.org/
>
It's a good suggestion. I've run the code through Valgrind several times before releasing the first version of NumPy. I tracked down many memory leaks that way already.

There may be errors that have crept in, but Valgrind does not help with reference counting errors, which this may be.

But, I need to be able to reproduce the problem to have any hope of finding it.

-Travis

From oliphant.travis at ieee.org Fri Apr 14 18:21:09 2006
From: oliphant.travis at ieee.org (Travis Oliphant)
Date: Fri Apr 14 18:21:09 2006
Subject: [Numpy-discussion] Vectorize bug
In-Reply-To: <00fa01c6601e$c7707840$0502010a@dsp.sun.ac.za>
References: <00fa01c6601e$c7707840$0502010a@dsp.sun.ac.za>
Message-ID: <44404A5B.5010802@ieee.org>

Albert Strasheim wrote:
> Hello
>
> I don't have SciPy installed. Is there any way of doing a debug build of the
> C code so that I can investigate this problem?
>
> You say that you cannot reproduce this problem. Are you trying to reproduce
> it on Linux or on Windows under IPython? I have also been unable to
> reproduce the crash on Linux, but as we saw earlier, this crash also cropped
> up on Solaris, without having to run the tests N times.
>
>
I've tried under Linux with IPython and cannot reproduce the error. I've run numpy.test() 100 times with no error.

I'm not sure if the Solaris crash is fixed or not yet after the recent changes to SVN. There may be more than one bug here...
-Travis

From oliphant.travis at ieee.org Fri Apr 14 18:47:01 2006
From: oliphant.travis at ieee.org (Travis Oliphant)
Date: Fri Apr 14 18:47:01 2006
Subject: [Numpy-discussion] Vectorize bug
In-Reply-To: <00fc01c66022$1b51fb70$0502010a@dsp.sun.ac.za>
References: <00fc01c66022$1b51fb70$0502010a@dsp.sun.ac.za>
Message-ID: <44405068.203@ieee.org>

Albert Strasheim wrote:
> Hello all
>
> I think Valgrind might be very useful in tracking down this bug.
>
> http://valgrind.org/
>
> Example usage:
>
> ~/bin/valgrind \
>     -v --error-limit=no --leak-check=full \
>     python -c 'import numpy; numpy.test()'
>
> Valgrind emits many warnings for things going on inside Python on my Fedora
> Core 4 system, but there are also a lot of interesting things going on in the
> NumPy code.
>
> Some warnings that someone might want to look at:
>
> ==26750== Use of uninitialised value of size 4
> ==26750==    at 0x453D4B1: DOUBLE_to_OBJECT (arraytypes.inc:4470)
> ==26750==    by 0x46AB3F3: PyUFunc_GenericFunction (ufuncobject.c:1566)
> ==26750==    by 0x46ABE9F: ufunc_generic_call (ufuncobject.c:2653)
>
I think this may be the culprit. The buffer was not being initialized to NULL, and so DECREF was being called on whatever was there. This can produce strange results indeed, depending on the environment.

I've initialized the buffer now for loops involving OBJECTs (this same error has happened a couple of times, as it's one of the big ones for object arrays). I thought I fixed all places where it might occur, but apparently not...

Perhaps you could try the code again.

From oliphant.travis at ieee.org Fri Apr 14 18:49:03 2006
From: oliphant.travis at ieee.org (Travis Oliphant)
Date: Fri Apr 14 18:49:03 2006
Subject: [Numpy-discussion] Vectorize bug
In-Reply-To: <00fc01c66022$1b51fb70$0502010a@dsp.sun.ac.za>
References: <00fc01c66022$1b51fb70$0502010a@dsp.sun.ac.za>
Message-ID: <444050DA.6050809@ieee.org>

Albert Strasheim wrote:
> Hello all
>
> I think Valgrind might be very useful in tracking down this bug.
>
> http://valgrind.org/
>
> Example usage:
>
> ~/bin/valgrind \
>     -v --error-limit=no --leak-check=full \
>     python -c 'import numpy; numpy.test()'
>
Here's the command that I run to test a Python script provided at the command line:

valgrind --tool=memcheck --leak-check=yes --error-limit=no -v --log-file=testmem --suppressions=valgrind-python.supp --show-reachable=yes --num-callers=10 python $1

The valgrind-python.supp file will suppress the complaints valgrind emits for Python.

-Travis

From robert.kern at gmail.com Fri Apr 14 22:21:00 2006
From: robert.kern at gmail.com (Robert Kern)
Date: Fri Apr 14 22:21:00 2006
Subject: [Numpy-discussion] Re: Summer of Code 2006
In-Reply-To: <00fb01c6601f$26e19b10$0502010a@dsp.sun.ac.za>
References: <00fb01c6601f$26e19b10$0502010a@dsp.sun.ac.za>
Message-ID:

Albert Strasheim wrote:
> Hello all
>
> The Google Summer of Code site for 2006 is up:
>
> http://code.google.com/soc/
>
> Maybe the NumPy team can propose a few projects to be funded by this
> program. Personally, I'd be interested in working on the build system,
> especially on Windows, and/or extending the test suite.

What work do you think needs to be done on the build system? (I'm not contending the point; I'm just curious.)

--
Robert Kern
robert.kern at gmail.com

"I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth."
  -- Umberto Eco

From fullung at gmail.com Sat Apr 15 02:26:04 2006
From: fullung at gmail.com (Albert Strasheim)
Date: Sat Apr 15 02:26:04 2006
Subject: [Numpy-discussion] Re: Summer of Code 2006
In-Reply-To:
Message-ID: <013501c6606e$86888200$0502010a@dsp.sun.ac.za>

Hello all

> -----Original Message-----
> From: numpy-discussion-admin at lists.sourceforge.net [mailto:numpy-
> discussion-admin at lists.sourceforge.net] On Behalf Of Robert Kern
> Sent: 15 April 2006 07:20
> To: numpy-discussion at lists.sourceforge.net
> Subject: [Numpy-discussion] Re: Summer of Code 2006
>
> Albert Strasheim wrote:
> > Hello all
> >
> > The Google Summer of Code site for 2006 is up:
> >
> > http://code.google.com/soc/
> >
> > Maybe the NumPy team can propose a few projects to be funded by this
> > program. Personally, I'd be interested in working on the build system,
> > especially on Windows, and/or extending the test suite.
>
> What work do you think needs to be done on the build system? (I'm not
> contending the point; I'm just curious.)

Let me start by saying that the build system works fine for what I think is the default case, i.e. building NumPy on Linux with preinstalled LAPACK and BLAS. However, as soon as you vary any of those parameters, things get interesting.

I've spent the past couple of days trying to build NumPy on Windows with ATLAS and CLAPACK with MinGW and Visual Studio .NET 2003 and VS 8. I don't know if it's just me, but this seems to be very hard. This could probably be partly attributed to the build systems of these libraries and to the lack of documentation, but I've also run into problems with NumPy build scripts.

For example, the inclusion of the gcc library in the list of libraries when building Fortran code with MinGW causes the build to break. Also, building FLAPACK from source causes the build to fail (too many open files).

While these errors on their own aren't particularly serious, I think it would be helpful to set up an automated system to check that builds of the various configurations NumPy supports can actually be done. There are probably a few million ways to build NumPy, but it would be nice if we could make sure that the N most common configurations always work, and provide documentation for "trying this at home."

I also think it would be useful to set up a system that performs regular builds of the latest revision from the SVN repository. I think anyone attempting this is going to run into a few issues with the build scripts, especially when trying to build on multiple platforms.

Things I would like to get right, which I think are much harder than they need to be (feel free to disagree):

- Windows builds in general
- Visual Studio .NET 2003 builds
- Visual C++ Toolkit 2003 builds
- Visual Studio 2005 builds
- Builds with ATLAS and CLAPACK

The reason I'm interested in the Microsoft compilers is that they have many features to help us make sure that the code is correct, both at compile time and at run time.

Any comments? Anybody building on Windows that finds the process to be completely painless?

Regards,

Albert

From fullung at gmail.com Sat Apr 15 02:42:06 2006
From: fullung at gmail.com (Albert Strasheim)
Date: Sat Apr 15 02:42:06 2006
Subject: [Numpy-discussion] Vectorize bug
In-Reply-To: <44404A5B.5010802@ieee.org>
Message-ID: <013601c66070$d2377010$0502010a@dsp.sun.ac.za>

Hello all

The crash I was seeing seems to be fixed in revision 2358.
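Just to confirm, in the same IPython setup that crashed before:

In [1]: import numpy

In [2]: numpy.__version__
Out[2]: '0.9.7.2358'

In [3]: numpy.test()  # repeated runs no longer segfault here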
Regards,

Albert

> -----Original Message-----
> From: numpy-discussion-admin at lists.sourceforge.net [mailto:numpy-
> discussion-admin at lists.sourceforge.net] On Behalf Of Travis Oliphant
> Sent: 15 April 2006 03:20
> To: numpy-discussion
> Subject: Re: [Numpy-discussion] Vectorize bug
>
> Albert Strasheim wrote:
> > Hello
> >
> > I don't have SciPy installed. Is there any way of doing a debug build of
> > the C code so that I can investigate this problem?
> >
> > You say that you cannot reproduce this problem. Are you trying to
> > reproduce it on Linux or on Windows under IPython? I have also been
> > unable to reproduce the crash on Linux, but as we saw earlier, this
> > crash also cropped up on Solaris, without having to run the tests N times.
> >
> I've tried under Linux with IPython and cannot reproduce the error.
> I've run numpy.test() 100 times with no error.
>
> I'm not sure if the Solaris crash is fixed or not yet after the recent
> changes to SVN. There may be more than one bug here...
>
> -Travis

From fullung at gmail.com Sat Apr 15 04:59:03 2006
From: fullung at gmail.com (Albert Strasheim)
Date: Sat Apr 15 04:59:03 2006
Subject: [Numpy-discussion] bool_ leaks memory
Message-ID: <014701c66083$e3ca5c30$0502010a@dsp.sun.ac.za>

Hello all

According to Valgrind 3.1.1, the following code leaks memory:

from numpy import bool_
bool_(1)

Valgrind says:

==32531== 82 (80 direct, 2 indirect) bytes in 2 blocks are definitely lost in loss record 7 of 25
==32531==    at 0x400444E: malloc (vg_replace_malloc.c:149)
==32531==    by 0x45442E8: array_alloc (arrayobject.c:5330)
==32531==    by 0x454F18D: PyArray_NewFromDescr (arrayobject.c:4153)
==32531==    by 0x4551844: Array_FromScalar (arrayobject.c:5768)
==32531==    by 0x45602B7: PyArray_FromAny (arrayobject.c:6630)
==32531==    by 0x4570065: bool_arrtype_new (scalartypes.inc:2855)
==32531==    by 0x2FBF6E: (within /usr/lib/libpython2.4.so.1.0)
==32531==    by 0x2C53B3: PyObject_Call (in /usr/lib/libpython2.4.so.1.0)

The second leak that Valgrind reports is from this code in ma.py:

MaskType = bool_
nomask = MaskType(0)

Tested with NumPy 0.9.7.2358. Trac ticket at http://projects.scipy.org/scipy/numpy/ticket/60

Regards,

Albert

From faltet at xot.carabos.com Sat Apr 15 05:06:01 2006
From: faltet at xot.carabos.com (faltet at xot.carabos.com)
Date: Sat Apr 15 05:06:01 2006
Subject: [Numpy-discussion] Performance problems with strided arrays in NumPy
In-Reply-To: <44402A2A.9050300@ee.byu.edu>
References: <20060414213511.GA14355@xot.carabos.com> <44402A2A.9050300@ee.byu.edu>
Message-ID: <20060415120451.GA15123@xot.carabos.com>

On Fri, Apr 14, 2006 at 05:03:06PM -0600, Travis Oliphant wrote:
> What I've found in experiments like this in the past is that numarray is
> good at striding in one direction but much worse at striding in another
> direction for multi-dimensional arrays. Of course my experiments were
> not complete. That just seemed to be the case.
>
> The array-iterator construct handles almost all of these cases. The
> copy method is a good place to start since it uses that code.

I'm not sure this is directly related with striding.
Look at this:

In [5]: npcopy=timeit.Timer('a=a.copy()','import numpy as np; a=np.arange(1000000,dtype="Float64")[::10]')

In [6]: npcopy.repeat(3,10)
Out[6]: [0.061118125915527344, 0.061014175415039062, 0.063937187194824219]

In [7]: npcopy2=timeit.Timer('b=a.copy()','import numpy as np; a=np.arange(1000000,dtype="Float64")[::10]')

In [8]: npcopy2.repeat(3,10)
Out[8]: [0.29984092712402344, 0.29889702796936035, 0.29834103584289551]

You see? Assigning to a new variable makes the copy go 5x slower!

numarray is also affected by this, but not as much:

In [9]: nacopy=timeit.Timer('a=a.copy()','import numarray as np; a=np.arange(1000000,type="Float64")[::10]')

In [10]: nacopy.repeat(3,10)
Out[10]: [0.039573907852172852, 0.037765979766845703, 0.038245916366577148]

In [11]: nacopy2=timeit.Timer('b=a.copy()','import numarray as np; a=np.arange(1000000,type="Float64")[::10]')

In [12]: nacopy2.repeat(3,10)
Out[12]: [0.073218107223510742, 0.07414698600769043, 0.072872161865234375]

i.e. just a 2x slowdown.

I don't understand this effect: in both cases we are doing a plain copy, no? I'm missing something, but not sure what it is.

Regards,

--
Francesc

From fullung at gmail.com Sat Apr 15 06:38:02 2006
From: fullung at gmail.com (Albert Strasheim)
Date: Sat Apr 15 06:38:02 2006
Subject: [Numpy-discussion] Vectorize bug
In-Reply-To: <444050DA.6050809@ieee.org>
Message-ID: <014e01c66091$b6b6b730$0502010a@dsp.sun.ac.za>

Hello all

I did some more Valgrinding and reduced all the warnings still produced when running NumPy revision 0.9.7.2358 to a few lines of code. The relevant Trac tickets:

http://projects.scipy.org/scipy/numpy/ticket/60
http://projects.scipy.org/scipy/numpy/ticket/61
http://projects.scipy.org/scipy/numpy/ticket/62
http://projects.scipy.org/scipy/numpy/ticket/64
http://projects.scipy.org/scipy/numpy/ticket/65

If anybody else wants to play with Valgrind, you can find the Valgrind suppressions for Python 2.4 here:

http://svn.python.org/projects/python/branches/release24-maint/Misc/valgrind-python.supp

See also

http://svn.python.org/projects/python/branches/release24-maint/Misc/README.valgrind

Regards,

Albert

> -----Original Message-----
> From: numpy-discussion-admin at lists.sourceforge.net [mailto:numpy-
> discussion-admin at lists.sourceforge.net] On Behalf Of Travis Oliphant
> Sent: 15 April 2006 03:48
> To: numpy-discussion
> Subject: Re: [Numpy-discussion] Vectorize bug
>
> Albert Strasheim wrote:
> > Hello all
> >
> > I think Valgrind might be very useful in tracking down this bug.
> >
> > http://valgrind.org/
> >
> > Example usage:
> >
> > ~/bin/valgrind \
> >     -v --error-limit=no --leak-check=full \
> >     python -c 'import numpy; numpy.test()'
> >
> Here's the command that I run to test a Python script provided at the
> command line:
>
> valgrind --tool=memcheck --leak-check=yes --error-limit=no -v
> --log-file=testmem --suppressions=valgrind-python.supp
> --show-reachable=yes --num-callers=10 python $1
>
> The valgrind-python.supp file will suppress the complaints valgrind
> emits for Python.
>
> -Travis

From cjw at sympatico.ca Sat Apr 15 08:01:03 2006
From: cjw at sympatico.ca (Colin J. Williams)
Date: Sat Apr 15 08:01:03 2006
Subject: [Numpy-discussion] Summer of Code 2006
In-Reply-To: <00fb01c6601f$26e19b10$0502010a@dsp.sun.ac.za>
References: <00fb01c6601f$26e19b10$0502010a@dsp.sun.ac.za>
Message-ID: <44410A87.70205@sympatico.ca>

Albert Strasheim wrote:

>Hello all
>
>The Google Summer of Code site for 2006 is up:
>
>http://code.google.com/soc/
>
>Maybe the NumPy team can propose a few projects to be funded by this
>program. Personally, I'd be interested in working on the build system,
>especially on Windows, and/or extending the test suite.
>
>Regards,
>
>Albert
>
I believe that the Python Software Foundation (http://www.python.org/psf/grants/) offers funding from time to time.

Colin W.

From Saqib.Sohail at colorado.edu Sat Apr 15 08:51:02 2006
From: Saqib.Sohail at colorado.edu (Saqib bin Sohail)
Date: Sat Apr 15 08:51:02 2006
Subject: [Numpy-discussion] Code Question
Message-ID: <1145116214.444116365d326@webmail.colorado.edu>

Hi guys

I have never used python, but I wanted to compute the FFT of audio files. I came upon a page which had python code, so I installed NumPy, but after beating the bush for a few days, I have finally come in here to ask. After taking the FFT I want to output it to a file and then use gnuplot to plot it.

When I installed NumPy and ran the tests, it seemed that all passed without a problem. My input is a .dat file converted from a .wav file by sox.

Here is the code, which obviously doesn't work because it seems that changes have occurred since this code was written. (Not my code, just from some website where a guy had written on how to do the things which I require.)

import Numeric
import FFT
out_array=Numeric.array(out)
out_fft=FFT.fft(out)

offt=open('outfile_fft.dat','w')
for x in range(len(out_fft)/2):
    offt.write('%f %f\n'%(1.0*x/wtime,abs(out_fft[x].real)))

I do the following at the python prompt:

import numarray
myFile = open('test.dat', 'r')
my_array = numarray.array(myFile)

/* at this stage I wanted to see if it was correctly read */

print my_array
[1632837691 1701605485 1952535072 ..., 538976288 538976288 168632368]

It seems that these values do not correspond to the values in the file (but I guess the array is considering these as ints when in fact these are floats).

Anyway, the problem starts when I try to do the FFT, because I can't seem to find the module or how to invoke it. The second problem is writing to the file: that code obviously doesn't work, and in my search through various documentations I found arrayrange() but couldn't make it work. Call me stupid, but despite going through several examples, I haven't been able to make the for loop work in any case.

It would be very kind of someone if he could at least tell me what I am doing wrong and reply with a simple example so that I can modify my code, or at least be able to understand.

Thanks

--
Saqib bin Sohail
PhD ECE
University of Colorado at Boulder
Res: (303) 786 0636
http://ucsu.colorado.edu/~sohail/index.html

From ndarray at mac.com Sat Apr 15 09:10:07 2006
From: ndarray at mac.com (Sasha)
Date: Sat Apr 15 09:10:07 2006
Subject: [Numpy-discussion] Vectorize bug
In-Reply-To: <44404A18.1070202@ieee.org>
References: <00fc01c66022$1b51fb70$0502010a@dsp.sun.ac.za> <44404A18.1070202@ieee.org>
Message-ID:

On 4/14/06, Travis Oliphant wrote:
> ...
> There may be errors that have crept in, but Valgrind does not help
> with reference counting errors, which this may be.
> ...

Valgrind is a little bit more helpful if python is compiled using the --without-pymalloc config option.
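That amounts to rebuilding the interpreter roughly like this (a sketch; the source directory is illustrative):

cd Python-2.4.2
./configure --without-pymalloc
make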
In addition to valgrind, memory problems can be exposed by using the --with-pydebug option.

From faltet at xot.carabos.com Sat Apr 15 10:29:01 2006
From: faltet at xot.carabos.com (faltet at xot.carabos.com)
Date: Sat Apr 15 10:29:01 2006
Subject: [Numpy-discussion] Performance problems with strided arrays in NumPy
In-Reply-To: <44410972.4090502@cox.net>
References: <20060414213511.GA14355@xot.carabos.com> <44402A2A.9050300@ee.byu.edu> <20060415120451.GA15123@xot.carabos.com> <44410972.4090502@cox.net>
Message-ID: <20060415172755.GA15274@xot.carabos.com>

On Sat, Apr 15, 2006 at 07:55:46AM -0700, Tim Hochberg wrote:
> >I'm not sure this is directly related with striding. Look at this:
> >
> >In [5]: npcopy=timeit.Timer('a=a.copy()','import numpy as np;
> >a=np.arange(1000000,dtype="Float64")[::10]')
> >
> >In [6]: npcopy.repeat(3,10)
> >Out[6]: [0.061118125915527344, 0.061014175415039062,
> >0.063937187194824219]
> >
> >In [7]: npcopy2=timeit.Timer('b=a.copy()','import numpy as np;
> >a=np.arange(1000000,dtype="Float64")[::10]')
> >
> >In [8]: npcopy2.repeat(3,10)
> >Out[8]: [0.29984092712402344, 0.29889702796936035, 0.29834103584289551]
> >
> >You see? Assigning to a new variable makes the copy go 5x slower!
> >
> You are being tricked! In the first case, the array is discontiguous for
> the first copy, but for every subsequent copy it is contiguous, since you
> replace 'a'. In the second case, the array is discontiguous for every copy.

Oh, yes! Thanks for noting this! So in order to compare apples with apples, the difference between numarray and numpy in the case of strided copies is:

In [87]: npcopy_stride=timeit.Timer('b=a.copy()','import numpy as np; a=np.arange(1000000,dtype="Float64")[::10]')

In [88]: npcopy_stride.repeat(3,10)
Out[88]: [0.30013298988342285, 0.29976487159729004, 0.29945492744445801]

In [89]: nacopy_stride=timeit.Timer('b=a.copy()','import numarray as np; a=np.arange(1000000,type="Float64")[::10]')

In [90]: nacopy_stride.repeat(3,10)
Out[90]: [0.07545709609985351, 0.0731458663940429, 0.073173046112060547]

so numpy is approximately 4x slower than numarray.

Cheers,

Francesc

From oliphant.travis at ieee.org Sat Apr 15 10:51:18 2006
From: oliphant.travis at ieee.org (Travis Oliphant)
Date: Sat Apr 15 10:51:18 2006
Subject: [Numpy-discussion] Re: Summer of Code 2006
In-Reply-To: <013501c6606e$86888200$0502010a@dsp.sun.ac.za>
References: <013501c6606e$86888200$0502010a@dsp.sun.ac.za>
Message-ID: <44413251.3080505@ieee.org>

Albert Strasheim wrote:
> Hello all
>
> Let me start by saying that the build system works fine for what I think is
> the default case, i.e. building NumPy on Linux with preinstalled LAPACK and
> BLAS. However, as soon as you vary any of those parameters, things get
> interesting.
>
It also builds fine with mingw and pre-installed ATLAS (I do it all the time). It also builds fine with no installed ATLAS (or LAPACK or BLAS) with mingw32 and Linux. It also builds on Mac OS X. It also builds on Solaris, AIX, and Cygwin. Work also went in recently to make sure it builds with a Visual Studio compiler (the one Tim Hochberg was using...).

So, I think it's a bit unfair to say that varying from only a Linux build causes "things to get interesting". Definitely there are configurations that can require a specialized site.cfg file, and it can be difficult if you build with a compiler that was not used to build Python itself. But, it's not a one-platform build system. I just want that to be clear.

Documentation on the site.cfg file could be more prominent, of course, and this was aided recently by the addition of an example file to the source tree.
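For illustration, a minimal site.cfg fragment for a non-standard ATLAS install looks something like this (the section and key names follow the example file; the paths are machine-specific):

[atlas]
library_dirs = /usr/local/lib/atlas
atlas_libs = lapack, f77blas, cblas, atlas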
The expert on the build system is Pearu Peterson. He has been very responsive to suggested fixes and problems that people have experienced. Robert Kern, David Cooke, and I also have some familiarity with the build system, enough to assist from time to time.

All help is greatly appreciated, however, as I know you can come up with configurations that do cause things to "get interesting." The more configurations that we get tested and working, the better off we will be. The more people who understand the build system well enough to help fix it, the better off we'll be as well. So, I definitely don't want to discourage any ideas you have on improving the build system.

Thanks for being willing to dive in and help.

-Travis

> I've spent the past couple of days trying to build NumPy on Windows with
> ATLAS and CLAPACK with MinGW and Visual Studio .NET 2003 and VS 8. I don't
> know if it's just me, but this seems to be very hard. This could probably be
> partly attributed to the build systems of these libraries and to the lack of
> documentation, but I've also run into problems with NumPy build scripts.
>
> For example, the inclusion of the gcc library in the list of libraries when
> building Fortran code with MinGW causes the build to break. Also, building
> FLAPACK from source causes the build to fail (too many open files).
>
> While these errors on their own aren't particularly serious, I think it
> would be helpful to set up an automated system to check that builds of the
> various configurations NumPy supports can actually be done. There are
> probably a few million ways to build NumPy, but it would be nice if we could
> make sure that the N most common configurations always work, and provide
> documentation for "trying this at home."
>
> I also think it would be useful to set up a system that performs regular
> builds of the latest revision from the SVN repository. I think anyone
> attempting this is going to run into a few issues with the build scripts,
> especially when trying to build on multiple platforms.
> http://sel.as-us.falkag.net/sel?cmd=lnk&kid=110944&bid=241720&dat=121642 > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/numpy-discussion > From oliphant.travis at ieee.org Sat Apr 15 10:55:02 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Sat Apr 15 10:55:02 2006 Subject: [Numpy-discussion] Performance problems with strided arrays in NumPy In-Reply-To: <20060415172755.GA15274@xot.carabos.com> References: <20060414213511.GA14355@xot.carabos.com> <44402A2A.9050300@ee.byu.edu> <20060415120451.GA15123@xot.carabos.com> <44410972.4090502@cox.net> <20060415172755.GA15274@xot.carabos.com> Message-ID: <4441333D.50906@ieee.org> faltet at xot.carabos.com wrote: > On Sat, Apr 15, 2006 at 07:55:46AM -0700, Tim Hochberg wrote: > >>> I'm not sure this is directly related with striding. Look at this: >>> >>> In [5]: npcopy=timeit.Timer('a=a.copy()','import numpy as np; >>> a=np.arange(1000000,dtype="Float64")[::10]') >>> >>> In [6]: npcopy.repeat(3,10) >>> Out[6]: [0.061118125915527344, 0.061014175415039062, >>> 0.063937187194824219] >>> >>> In [7]: npcopy2=timeit.Timer('b=a.copy()','import numpy as np; >>> a=np.arange(1000000,dtype="Float64")[::10]') >>> >>> In [8]: npcopy2.repeat(3,10) >>> Out[8]: [0.29984092712402344, 0.29889702796936035, 0.29834103584289551] >>> >>> You see? assigning to a new variable makes the copy go 5x times >>> slower! >>> >>> >> You are being tricked! In the first case, the array is discontiguous for >> the first copy but for every subsequenc copy is contiguous since you >> replace 'a'. In the second case, the array is discontiguous for every copy >> > > Oh, yes!. Thanks for noting this!. So in order to compare apples with > apples, the difference between numarray and numpy in case of strided > copies is: > > In [87]: npcopy_stride=timeit.Timer('b=a.copy()','import numpy as np; > a=np.arange(1000000,dtype="Float64")[::10]') > > In [88]: npcopy_stride.repeat(3,10) > Out[88]: [0.30013298988342285, 0.29976487159729004, 0.29945492744445801] > > In [89]: nacopy_stride=timeit.Timer('b=a.copy()','import numarray as np; > a=np.arange(1000000,type="Float64")[::10]') > > In [90]: nacopy_stride.repeat(3,10) > Out[90]: [0.07545709609985351, 0.0731458663940429, 0.073173046112060547] > > so numpy is aproximately 4x times slower than numarray. > > This also seems to vary from compiler to compiler. On my system it's not quite so different (about 1.5x slower). I'm wondering what the effect of an inlined memmove is. Essentially numarray has an inlined for-loop to copy bytes while NumPy calles memmove. I'll try that out and see... -Travis From ryanlists at gmail.com Sat Apr 15 10:58:17 2006 From: ryanlists at gmail.com (Ryan Krauss) Date: Sat Apr 15 10:58:17 2006 Subject: [Numpy-discussion] Re: Summer of Code 2006 In-Reply-To: <44413251.3080505@ieee.org> References: <013501c6606e$86888200$0502010a@dsp.sun.ac.za> <44413251.3080505@ieee.org> Message-ID: As I understand the summer of code, we can basically get a full time student (who gets paid $4500 for the summer) at no cost to us, as long as someone is willing to coach and define the project. (NumPy/SciPy would actually get $500 from Google). So, I think it would be great if we could define some projects and see what happens. (I am trying to graduate this summer, so maybe I should shut up if I can't help much). 
Ryan

On 4/15/06, Travis Oliphant wrote:
> Albert Strasheim wrote:
> > Hello all
> >
> > Let me start by saying that the build system works fine for what I think is
> > the default case, i.e. building NumPy on Linux with preinstalled LAPACK and
> > BLAS. However, as soon as you vary any of those parameters, things get
> > interesting.
> >
> It also builds fine with mingw and pre-installed ATLAS (I do it all the
> time). It also builds fine with no installed ATLAS (or LAPACK or BLAS)
> with mingw32 and Linux. It also builds on Mac OS X. It also builds on
> Solaris, AIX, and Cygwin. Work also went in recently to make sure it
> builds with a Visual Studio compiler (the one Tim Hochberg was using...).
>
> So, I think it's a bit unfair to say that varying from only a Linux
> build causes "things to get interesting". Definitely there are
> configurations that can require a specialized site.cfg file, and it can
> be difficult if you build with a compiler that was not used to build
> Python itself. But, it's not a one-platform build system. I just
> want that to be clear.
>
> Documentation on the site.cfg file could be more prominent, of course,
> and this was aided recently by the addition of an example file to the
> source tree.
>
> The expert on the build system is Pearu Peterson. He has been very
> responsive to suggested fixes and problems that people have
> experienced. Robert Kern, David Cooke, and I also have some
> familiarity with the build system, enough to assist from time to time.
>
> All help is greatly appreciated, however, as I know you can come up with
> configurations that do cause things to "get interesting." The more
> configurations that we get tested and working, the better off we will
> be. The more people who understand the build system well enough to
> help fix it, the better off we'll be as well. So, I definitely don't
> want to discourage any ideas you have on improving the build system.
>
> Thanks for being willing to dive in and help.
>
> -Travis
>
> > I've spent the past couple of days trying to build NumPy on Windows with
> > ATLAS and CLAPACK with MinGW and Visual Studio .NET 2003 and VS 8. I don't
> > know if it's just me, but this seems to be very hard. This could probably be
> > partly attributed to the build systems of these libraries and to the lack of
> > documentation, but I've also run into problems with NumPy build scripts.
> >
> > For example, the inclusion of the gcc library in the list of libraries when
> > building Fortran code with MinGW causes the build to break. Also, building
> > FLAPACK from source causes the build to fail (too many open files).
> >
> > While these errors on their own aren't particularly serious, I think it
> > would be helpful to set up an automated system to check that builds of the
> > various configurations NumPy supports can actually be done. There are
> > probably a few million ways to build NumPy, but it would be nice if we could
> > make sure that the N most common configurations always work, and provide
> > documentation for "trying this at home."
> >
> > I also think it would be useful to set up a system that performs regular
> > builds of the latest revision from the SVN repository. I think anyone
> > attempting this is going to run into a few issues with the build scripts,
> > especially when trying to build on multiple platforms.
> > Things I would like to get right, which I think are much harder than they
> > need to be (feel free to disagree):
> >
> > - Windows builds in general
> > - Visual Studio .NET 2003 builds
> > - Visual C++ Toolkit 2003 builds
> > - Visual Studio 2005 builds
> > - Builds with ATLAS and CLAPACK
> >
> > The reason I'm interested in the Microsoft compilers is that they have many
> > features to help us make sure that the code is correct, both at compile time
> > and at run time.
> >
> > Any comments? Anybody building on Windows that finds the process to be
> > completely painless?
> >
> > Regards,
> >
> > Albert

From robert.kern at gmail.com Sat Apr 15 11:31:01 2006
From: robert.kern at gmail.com (Robert Kern)
Date: Sat Apr 15 11:31:01 2006
Subject: [Numpy-discussion] Re: Summer of Code 2006
In-Reply-To: <44410A87.70205@sympatico.ca>
References: <00fb01c6601f$26e19b10$0502010a@dsp.sun.ac.za> <44410A87.70205@sympatico.ca>
Message-ID:

Colin J. Williams wrote:
> I believe that the Python Software Foundation
> (http://www.python.org/psf/grants/) offers funding from time to time.

However, it likes to fund new projects, not continuing ones.

--
Robert Kern
robert.kern at gmail.com

"I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth."
  -- Umberto Eco

From oliphant.travis at ieee.org Sat Apr 15 11:35:04 2006
From: oliphant.travis at ieee.org (Travis Oliphant)
Date: Sat Apr 15 11:35:04 2006
Subject: [Numpy-discussion] Vectorize bug
In-Reply-To: <014e01c66091$b6b6b730$0502010a@dsp.sun.ac.za>
References: <014e01c66091$b6b6b730$0502010a@dsp.sun.ac.za>
Message-ID: <44413C9B.3080507@ieee.org>

Albert Strasheim wrote:
> Hello all
>
> I did some more Valgrinding and reduced all the warnings still produced when
> running NumPy revision 0.9.7.2358 to a few lines of code. The relevant Trac
> tickets:
>
> http://projects.scipy.org/scipy/numpy/ticket/60
> http://projects.scipy.org/scipy/numpy/ticket/61
> http://projects.scipy.org/scipy/numpy/ticket/62
> http://projects.scipy.org/scipy/numpy/ticket/64
> http://projects.scipy.org/scipy/numpy/ticket/65
>
This is very useful. Thank you for isolating the code producing the warnings like this. It makes it much easier to debug.
-Travis

From robert.kern at gmail.com Sat Apr 15 12:00:06 2006
From: robert.kern at gmail.com (Robert Kern)
Date: Sat Apr 15 12:00:06 2006
Subject: [Numpy-discussion] Re: Code Question
In-Reply-To: <1145116214.444116365d326@webmail.colorado.edu>
References: <1145116214.444116365d326@webmail.colorado.edu>
Message-ID:

Saqib bin Sohail wrote:
> Hi guys
>
> I have never used python, but I wanted to compute the FFT of audio files. I
> came upon a page which had python code, so I installed NumPy, but after
> beating the bush for a few days, I have finally come in here to ask. After
> taking the FFT I want to output it to a file and then use gnuplot to plot it.
> When I installed NumPy and ran the tests, it seemed that all passed without a
> problem. My input is a .dat file converted from a .wav file by sox.
>
> Here is the code, which obviously doesn't work because it seems that changes
> have occurred since this code was written. (Not my code, just from some
> website where a guy had written on how to do the things which I require.)

Okay, first some history. Originally, the package was named Numeric; occasionally, it was referred to by its nickname NumPy. Some years ago, a group needed features that couldn't be done in the Numeric codebase, so they started a rewrite called numarray. For various reasons that I don't want to get into, another group needed features that couldn't be done in the numarray codebase, so a second rewrite happened, and this package is the one that is currently getting the most developer attention. It is called numpy.

Since you are a new user, I highly recommend that you use numpy instead of Numeric or numarray.

http://numeric.scipy.org/

> import Numeric
> import FFT
> out_array=Numeric.array(out)
> out_fft=FFT.fft(out)
>
> offt=open('outfile_fft.dat','w')
> for x in range(len(out_fft)/2):
>     offt.write('%f %f\n'%(1.0*x/wtime,abs(out_fft[x].real)))

Rewritten for numpy (but untested):

import numpy

# Assuming that the file contains 32-bit floats, and not 64-bit floats
data = numpy.fromfile('test.dat', dtype=numpy.float32)

out_fft = numpy.refft(data)
# Note: refft does the FFT on real data and thus throws away the negative
# frequencies since they are redundant. len(out_fft) != len(data)

# and now I'm confused because the code references variables that weren't
# created anywhere, so I'm going to output the power spectrum
n = len(out_fft)
freqs = numpy.arange(n, dtype=numpy.float32) / len(data)
power = out_fft.real*out_fft.real + out_fft.imag*out_fft.imag
outarray = numpy.column_stack((freqs, power))
assert outarray.shape == (n, 2)

offt = open('outfile_fft.dat', 'w')
try:
    for f, p in outarray:
        offt.write('%f %f\n' % (f, p))
finally:
    offt.close()

> I do the following at the python prompt:
>
> import numarray
> myFile = open('test.dat', 'r')
> my_array = numarray.array(myFile)
>
> /* at this stage I wanted to see if it was correctly read */
>
> print my_array
> [1632837691 1701605485 1952535072 ..., 538976288 538976288 168632368]
>
> It seems that these values do not correspond to the values in the file (but I
> guess the array is considering these as ints when in fact these are floats).

Indeed. There is no way for the array constructor to know the data type in the file unless you tell it. The default type is int.

--
Robert Kern
robert.kern at gmail.com

"I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth."
From Saqib.Sohail at colorado.edu Sat Apr 15 13:42:02 2006 From: Saqib.Sohail at colorado.edu (Saqib bin Sohail) Date: Sat Apr 15 13:42:02 2006 Subject: [Numpy-discussion] Code Question In-Reply-To: <06041504462800.00752@rbastian> References: <1145116214.444116365d326@webmail.colorado.edu> <06041504462800.00752@rbastian> Message-ID: <1145133678.44415a6e5d8f7@webmail.colorado.edu> Thanks a lot for your detailed email, unfortunately both of the following imports don't work: import Gnuplot import fft as FFT from numarray import * I think I need the Gnuplot package, but what I can't understand is why fft is not being imported; do I need to install the NumPy package with special options to install fft? Quoting René Bastian : > Le Samedi 15 Avril 2006 17:50, Saqib bin Sohail a écrit : > > Hi guys > > > > I have never used python, but I wanted to compute FFT of audio files, I > > came upon a page which had python code, so I installed Numpy but after > > beating the bush for a few days, I have finally come in here to ask. After > > taking the FFT I want to output it to a file and the use gnuplot to plot > > it. > > With the module Gnuplot.py you can plot arrays > > import Gnuplot > > g =Gnuplot.Gnuplot() > g.plot(w) # w is an array > raw_input("Enter") > g.reset() > > I use numarray > > Some code : > ---------------- > > import fft as FFT > from numarray import * > > T = arrayrange(0.0, 2*pi, 1.0/1000) > a = sin(2*pi*440.0*T) > > r = FFT.fft(a) > print r > g.plot(r) > raw_input("Enter") > .... > r = FFT.inverse_real_fft(a) > r = FFT.real_fft(a) > r = FFT.hermite_fft(a) > > g.reset() > ---------------- > > > > > > When I instaled NumPy, and ran the tests, it seemed that all passed without > > a problem. My input is a .dat file converted from .wav file by sox. > > > > > > Here is the code which obviously doesn't work because it seems that changes > > have occured since this code was written. (not my code, just from some > > website where a guy had written on how to do things which i require) > > > > import Numeric > > import FFT > > out_array=Numeric.array(out) > > out_fft=FFT.fft(out) > > > > > > offt=open('outfile_fft.dat','w') > > for x in range(len(out_fft)/2): > > offt.write('%f %f\n'%(1.0*x/wtime,abs(out_fft[x].real))) > > > > > > I do the following at the python prompt > > > > import numarray > > myFile = open('test.dat', 'r') > > my_array = numarray.arra(myFile) > > Read the manual how to load a file of floats > I think there is a mistake > > > /* at this stage I wanted to see if it was correctly read */ > > > > print myArray > > [1632837691 1701605485 1952535072 ..., 538976288 538976288 168632368] > > > > it seems that these values do not correspond to the values in the file (but > > I guess the array is considering these as ints when infact these are > > floats) > > hmmm ... > > > > > anyway the problem starts when i try to do fft, because I can't seem to > > find module or how to invoke it, > > > > the second problem is writing to the file, that code obviously doesn't > > work, and in my search through various documentations, i found arrayrange() > > but couldn't make it to work, call me stupid, but despite going through > > several examples, i haven't been able to make the for loop worked in any > > case, > > > > > > it would be very kind of someone if he could at least tell me what i am > > doing wrong and reply a simple example so that I can modify my code or at > > least be able to understand . > > > > Thanks > > > > > > > > -- > > Saqib bin Sohail > > PhD ECE > > University of Colorado at Boulder > > Res: (303) 786 0636 > > http://ucsu.colorado.edu/~sohail/index.html > > > > > > ------------------------------------------------------- > > -- > René Bastian > http://pythoneon.musiques-rb.org "Musique en Python" > > -- Saqib bin Sohail PhD ECE University of Colorado at Boulder Res: (303) 786 0636 http://ucsu.colorado.edu/~sohail/index.html
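One likely reason the bare "import fft" fails: numarray ships its FFT routines as a subpackage rather than a top-level module, so on a stock install the import usually has to be qualified. A sketch of René's example rewritten that way (assuming numarray 1.x; untested here, and Gnuplot.py is a separate package that does need its own install):

import numarray.fft as FFT
from numarray import arange, sin, pi

t = arange(0.0, 1.0, 1.0/8000)   # one second sampled at 8 kHz
a = sin(2*pi*440.0*t)            # a 440 Hz tone
r = FFT.real_fft(a)              # spectrum of a real-valued signal
print r[:5]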
From robert.kern at gmail.com Sun Apr 16 02:37:05 2006 From: robert.kern at gmail.com (Robert Kern) Date: Sun Apr 16 02:37:05 2006 Subject: [Numpy-discussion] Trac Wikis closed for anonymous edits until further notice Message-ID: <44421025.9060804@gmail.com> We've been hit badly by spammers, so I can only presume our Trac sites are now on the traded spam lists. I am going to turn off anonymous edits for now. Ticket creation will probably still be left open for now. Many thanks to David Cooke for quickly removing the spam. I am looking into ways to allow people to register themselves with the Trac sites so they can edit the Wikis and submit tickets without needing to be added by a project admin. -- Robert Kern robert.kern at gmail.com "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From a.h.jaffe at gmail.com Sun Apr 16 12:36:01 2006 From: a.h.jaffe at gmail.com (Andrew Jaffe) Date: Sun Apr 16 12:36:01 2006 Subject: [Numpy-discussion] g95 detection not working Message-ID: <44429C55.2030500@gmail.com> Hi all, at least on my setup (OS X, Python 2.4.1, latest svn of numpy and scipy), config_fc fails to recognize my g95 compiler, which was directly downloaded from http://g95.sourceforge.net/ (and always has failed, I think). This is because the current version string doesn't conform to the regexp pattern; the version string is """ G95 (GCC 4.0.3 (g95!) Apr 12 2006) Copyright (C) 2002-2005 Free Software Foundation, Inc. G95 comes with NO WARRANTY, to the extent permitted by law. You may redistribute copies of G95 under the terms of the GNU General Public License. For more information about these matters, see the file named COPYING """ I've attached a patch below, although this identifies the version string with the date of the release, rather than the gcc version; I'm not sure which is the right one to use! Andrew --- numpy/distutils/fcompiler/g95.py (revision 2360) +++ numpy/distutils/fcompiler/g95.py (working copy) @@ -9,7 +9,7 @@ class G95FCompiler(FCompiler): compiler_type = 'g95' - version_pattern = r'G95.*\(experimental\) \(g95!\) (?P<version>.*)\).*' + version_pattern = r'G95.*\(g95!\) (?P<version>.*)\).*' executables = { 'version_cmd' : ["g95", "--version"],
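A quick interactive check shows why the stock pattern misses this banner while the patched one matches it (patterns as in the diff above; illustrative):

import re

banner = "G95 (GCC 4.0.3 (g95!) Apr 12 2006)"

old = r'G95.*\(experimental\) \(g95!\) (?P<version>.*)\).*'
new = r'G95.*\(g95!\) (?P<version>.*)\).*'

print re.match(old, banner)                    # None: the banner has no "(experimental)"
print re.match(new, banner).group('version')   # 'Apr 12 2006', the release date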
From robert.kern at gmail.com Sun Apr 16 12:50:05 2006 From: robert.kern at gmail.com (Robert Kern) Date: Sun Apr 16 12:50:05 2006 Subject: [Numpy-discussion] Re: g95 detection not working In-Reply-To: <44429C55.2030500@gmail.com> References: <44429C55.2030500@gmail.com> Message-ID: Andrew Jaffe wrote: > Hi all, > > at least on my setup (OS X, Python 2.4.1, latest svn of numpy and > scipy), config_fc fails to recognize my g95 compiler, which was directly > downloaded from http://g95.sourceforge.net/ (and always has failed, I > think). This is because the current version string doesn't conform to > the regexp pattern; the version string is > """ > G95 (GCC 4.0.3 (g95!) Apr 12 2006) > Copyright (C) 2002-2005 Free Software Foundation, Inc. > > G95 comes with NO WARRANTY, to the extent permitted by law. > You may redistribute copies of G95 > under the terms of the GNU General Public License. > For more information about these matters, see the file named COPYING > """ > > I've attached a patch below, although this identifies the version string > with the date of the release, rather than the gcc version; I'm not sure > which is the right one to use! We need the actual version number; in this case, "4.0.3". -- Robert Kern robert.kern at gmail.com "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From a.h.jaffe at gmail.com Sun Apr 16 13:53:03 2006 From: a.h.jaffe at gmail.com (Andrew Jaffe) Date: Sun Apr 16 13:53:03 2006 Subject: [Numpy-discussion] Re: g95 detection not working In-Reply-To: References: <44429C55.2030500@gmail.com> Message-ID: <4442AE89.8080303@gmail.com> Robert Kern wrote: > Andrew Jaffe wrote: >> Hi all, >> >> at least on my setup (OS X, Python 2.4.1, latest svn of numpy and >> scipy), config_fc fails to recognize my g95 compiler, which was directly >> downloaded from http://g95.sourceforge.net/ (and always has failed, I >> think). This is because the current version string doesn't conform to >> the regexp pattern; the version string is >> """ >> G95 (GCC 4.0.3 (g95!) Apr 12 2006) >> Copyright (C) 2002-2005 Free Software Foundation, Inc. >> >> G95 comes with NO WARRANTY, to the extent permitted by law. >> You may redistribute copies of G95 >> under the terms of the GNU General Public License. >> For more information about these matters, see the file named COPYING >> """ >> >> I've attached a patch below, although this identifies the version string >> with the date of the release, rather than the gcc version; I'm not sure >> which is the right one to use! > > We need the actual version number; in this case, "4.0.3". Thanks -- OK, in that case the following regexp works for me: version_pattern = r'G95.*\(GCC (?P<version>.*) \(g95!\)' But are there different versions of the version string? Also on an unrelated f2py note: is the f2py mailing list being read by the f2py developers? I've posted a question (about the status of F9x "types") without reply... Yours, Andrew From robert.kern at gmail.com Sun Apr 16 13:56:02 2006 From: robert.kern at gmail.com (Robert Kern) Date: Sun Apr 16 13:56:02 2006 Subject: [Numpy-discussion] Re: g95 detection not working In-Reply-To: <44429C55.2030500@gmail.com> References: <44429C55.2030500@gmail.com> Message-ID: Andrew Jaffe wrote: > Hi all, > > at least on my setup (OS X, Python 2.4.1, latest svn of numpy and > scipy), config_fc fails to recognize my g95 compiler, which was directly > downloaded from http://g95.sourceforge.net/ (and always has failed, I > think). This is because the current version string doesn't conform to > the regexp pattern; the version string is > """ > G95 (GCC 4.0.3 (g95!) Apr 12 2006) > Copyright (C) 2002-2005 Free Software Foundation, Inc. > > G95 comes with NO WARRANTY, to the extent permitted by law. > You may redistribute copies of G95 > under the terms of the GNU General Public License. > For more information about these matters, see the file named COPYING > """ > > I've attached a patch below, although this identifies the version string > with the date of the release, rather than the gcc version; I'm not sure > which is the right one to use! Also, note that you can override the get_version() method entirely, if it's easier to grab the version using something other than a regex. You can look at hpux.py and ibm.py for examples. -- Robert Kern robert.kern at gmail.com "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco
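A rough sketch of what such an override can look like, following the class skeleton in g95.py above (the banner-parsing details here are hypothetical, not taken from hpux.py or ibm.py):

import os
import re
from numpy.distutils.fcompiler import FCompiler

class G95FCompiler(FCompiler):
    compiler_type = 'g95'

    def get_version(self):
        # Hypothetical override: run the compiler and parse the banner
        # by hand instead of relying on a single version_pattern regex.
        output = os.popen('g95 --version').read()
        m = re.search(r'\(GCC (?P<version>[\d.]+)', output)
        return m and m.group('version')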
From Saqib.Sohail at colorado.edu Sun Apr 16 14:02:04 2006 From: Saqib.Sohail at colorado.edu (Saqib bin Sohail) Date: Sun Apr 16 14:02:04 2006 Subject: [Numpy-discussion] Code Question In-Reply-To: References: <1145116214.444116365d326@webmail.colorado.edu> <1145133185.4441588148cb3@webmail.colorado.edu> Message-ID: <1145221290.4442b0aa55961@webmail.colorado.edu> Thanks Guys for all your prompt responses. I have tried to use the provided solutions but I have had my share of issues, mixed with my lack of knowledge, to the point that I feel quite embarrassed to bother you guys. Issue 1 I am running FC 3 with native python-2.3 and then I installed python-2.4 in it. numarray-1.5.1 seems to have installed with success in python-2.3. I have tried to install numpy-0.9.6-1.i586.rpm but I don't have python-base and when I try to install python-base I get a long list of dependencies which I need. I haven't further pursued down that line, unfortunately I haven't been able to use numarray, I don't know how to use it because people have repeatedly told me to use numpy but I can't seem to get that installed. Issue 2 To input the file, Ryan suggested to use scipy, I don't want to go down that path, if only there is a simple way to input the file, (i can clean up the file and format it in the right way in perl, I can do that in a heartbeat) Issue 3 I don't want to use gnuplot functionality, or mathplot, if only I am able to write the file then again I can use perl to format it and use gnuplot then, So if there is the simplest of ways in which I can just i) read the file (formatting will be done in perl) ii) get the fft iii) write the file or files (and then use perl to format for gnuplot) I am sure all of you will say why not use the existing functionalities, but after 3 days I haven't gotten anywhere. All I need to do is get FFT of some sound files so that I can verify the result of FFT's and compare them with my FFT code in VxWorks. An Pierre, I started reading diveintopython.pdf but got nowhere when I tried two of its examples, the attached image shows that when I tried to run one of the examples on python-2.3 the output wasn't according to what the guide suggested. (no output to be precise) http://jobim.colorado.edu/~sohail/pythonExample.JPG Thanks again guys. Quoting Ryan Krauss : > I guess it depends on how much you want to learn and what you want to do. > > I was able to load your data using > data=scipy.io.read_array('monkey.dat') > > I had to comment out the first line to make it work. I couldn't make > the fromfile method of numpy work because the data is actually fixed > width. > > If you don't want to install scipy, you would need to learn enough > Python to read the file and clean it up a little by hand. > > It seems like the first column is time and the second is the signal > you want to fft. I was able to fft it with: > myfft=numpy.fft(data[:,1]) > (I don't have the latest version of numpy and don't seem to have the > refft function Robert mentioned).
> > t=data[:,0] > df=1/max(t) > df > maxf=8012 > fvect=arange(0,maxf+df,df) > > plot(fvect,abs(myfft)) > > I am plotting using matplotlib and the resulting figures are attached. > > If you really want to learn python for scientific and plotting > applications, I would highly recommend a few packages: > SciPy - some additional capabilities beyond Numpy (optimization, ode's , ...) > ipython - it is a really good interactive python shell > matplotlib - the best python 2d plotting package I am aware of > > Let me know if you have any additional questions. You can find out > about each package by googling it. They are all closely related to > Numpy and all have good mailing lists to help you. > > Ryan > > On 4/15/06, Saqib bin Sohail wrote: > > Do let me know if you get somewhere. > > > > Thanks > > > > > > Quoting Ryan Krauss : > > > > > email me the dat file and I could play with it a bit. If I can read > > > your input file, the rest should be easy. > > > > > > Ryan > > > > > > On 4/15/06, Saqib bin Sohail wrote: > > > > Hi guys > > > > > > > > I have never used python, but I wanted to compute FFT of audio files, I > > > came > > > > upon a page which had python code, so I installed Numpy but after > beating > > > the > > > > bush for a few days, I have finally come in here to ask. After taking > the > > > FFT I > > > > want to output it to a file and the use gnuplot to plot it. > > > > > > > > When I instaled NumPy, and ran the tests, it seemed that all passed > without > > > a > > > > problem. My input is a .dat file converted from .wav file by sox. > > > > > > > > Here is the code which obviously doesn't work because it seems that > changes > > > > have occured since this code was written. (not my code, just from some > > > website > > > > where a guy had written on how to do things which i require) > > > > > > > > import Numeric > > > > import FFT > > > > out_array=Numeric.array(out) > > > > out_fft=FFT.fft(out) > > > > > > > > offt=open('outfile_fft.dat','w') > > > > for x in range(len(out_fft)/2): > > > > offt.write('%f %f\n'%(1.0*x/wtime,abs(out_fft[x].real))) > > > > > > > > > > > > I do the following at the python prompt > > > > > > > > import numarray > > > > myFile = open('test.dat', 'r') > > > > my_array = numarray.arra(myFile) > > > > > > > > /* at this stage I wanted to see if it was correctly read */ > > > > > > > > print myArray > > > > [1632837691 1701605485 1952535072 ..., 538976288 538976288 > 168632368] > > > > > > > > it seems that these values do not correspond to the values in the file > (but > > > I > > > > guess the array is considering these as ints when infact these are > floats) > > > > > > > > anyway the problem starts when i try to do fft, because I can't seem to > > > find > > > > module or how to invoke it, > > > > > > > > the second problem is writing to the file, that code obviously doesn't > > > work, > > > > and in my search through various documentations, i found arrayrange() > but > > > > couldn't make it to work, call me stupid, but despite going through > several > > > > examples, i haven't been able to make the for loop worked in any case, > > > > > > > > it would be very kind of someone if he could at least tell me what i am > > > doing > > > > wrong and reply a simple example so that I can modify my code or at > least > > > be > > > > able to understand . 
> > > > > > > > Thanks > > > > > > > > > > > > > > > > -- > > > > Saqib bin Sohail > > > > PhD ECE > > > > University of Colorado at Boulder > > > > Res: (303) 786 0636 > > > > http://ucsu.colorado.edu/~sohail/index.html > > > > > > > > > > > > ------------------------------------------------------- > > > > This SF.Net email is sponsored by xPML, a groundbreaking scripting > language > > > > that extends applications into web and mobile media. Attend the live > > > webcast > > > > and join the prime developer group breaking into this new coding > territory! > > > > > http://sel.as-us.falkag.net/sel?cmd=lnk&kid=110944&bid=241720&dat=121642 > > > > _______________________________________________ > > > > Numpy-discussion mailing list > > > > Numpy-discussion at lists.sourceforge.net > > > > https://lists.sourceforge.net/lists/listinfo/numpy-discussion > > > > > > > > > > > > > -- > > Saqib bin Sohail > > PhD ECE > > University of Colorado at Boulder > > Res: (303) 786 0636 > > http://ucsu.colorado.edu/~sohail/index.html > > > > > -- Saqib bin Sohail PhD ECE University of Colorado at Boulder Res: (303) 786 0636 http://ucsu.colorado.edu/~sohail/index.html From robert.kern at gmail.com Sun Apr 16 14:03:01 2006 From: robert.kern at gmail.com (Robert Kern) Date: Sun Apr 16 14:03:01 2006 Subject: [Numpy-discussion] Re: g95 detection not working In-Reply-To: <4442AE89.8080303@gmail.com> References: <44429C55.2030500@gmail.com> <4442AE89.8080303@gmail.com> Message-ID: Andrew Jaffe wrote: > Thanks -- OK, in that case the following regexp works for me: > > version_pattern = r'G95.*\(GCC (?P.*) \(g95!\)' > > But are there different versions of the version string? Possibly. I don't really know. > Also on an unrelated f2py note: is the f2py mailing list being read by > the f2py developers? I've posted a question (about the status of F9x > "types") without reply... Pearu is really the only f2py developer, and he has just flown from his home in Estonia to Austin to work with us at Enthought for a month. I presume he has been busy preparing for his journey. -- Robert Kern robert.kern at gmail.com "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From robert.kern at gmail.com Sun Apr 16 14:26:06 2006 From: robert.kern at gmail.com (Robert Kern) Date: Sun Apr 16 14:26:06 2006 Subject: [Numpy-discussion] Re: Code Question In-Reply-To: <1145221290.4442b0aa55961@webmail.colorado.edu> References: <1145116214.444116365d326@webmail.colorado.edu> <1145133185.4441588148cb3@webmail.colorado.edu> <1145221290.4442b0aa55961@webmail.colorado.edu> Message-ID: Saqib bin Sohail wrote: > An Pierre, I started reading diveintopython.pdf but got nowhere when I tried > two of its examples, the attached image shows that when I tried to run one of > the examples on python-2.3 and the output wasn't according to what the guide > suggested. (no output to be precise) > > http://jobim.colorado.edu/~sohail/pythonExample.JPG Note the indentation. Indentation is important in Python. > Quoting Ryan Krauss : >>(I don't have the latest version of numpy and don't seem to have the >>refft function Robert mentioned). My example was wrong. It should have used "numpy.dft.refft()", not "numpy.refft()". 
-- Robert Kern robert.kern at gmail.com "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From robert.kern at gmail.com Sun Apr 16 14:37:02 2006 From: robert.kern at gmail.com (Robert Kern) Date: Sun Apr 16 14:37:02 2006 Subject: [Numpy-discussion] Re: Code Question In-Reply-To: <1145221290.4442b0aa55961@webmail.colorado.edu> References: <1145116214.444116365d326@webmail.colorado.edu> <1145133185.4441588148cb3@webmail.colorado.edu> <1145221290.4442b0aa55961@webmail.colorado.edu> Message-ID: Saqib bin Sohail wrote: > I am sure all of you will say why not use the existing functionalities, but > after 3 days I haven't gotten anywhere. All I need to do is get FFT of some > sound files so that I can verify the result of FFT's and compare them with my > FFT code in VxWorks. Well, if you are just trying to get an independent verification of your VxWorks FFT code, and you are much more comfortable with Perl, then you might want to use one of the FFT libraries available for Perl like Math::FFT. -- Robert Kern robert.kern at gmail.com "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From a.h.jaffe at gmail.com Sun Apr 16 15:18:02 2006 From: a.h.jaffe at gmail.com (Andrew Jaffe) Date: Sun Apr 16 15:18:02 2006 Subject: [Numpy-discussion] where() has started returning a tuple!? Message-ID: I think the following behavior is (only recently) wrong: In [7]: numpy.__version__ Out[7]: '0.9.7.2360' In [8]: numpy.nonzero([True, False, True]) Out[8]: array([0, 2]) In [9]: numpy.where([True, False, True]) Out[9]: (array([0, 2]),) Note the tuple output to where(), which should be the same as nonzero. Andrew From perry at stsci.edu Sun Apr 16 20:18:02 2006 From: perry at stsci.edu (Perry Greenfield) Date: Sun Apr 16 20:18:02 2006 Subject: [Numpy-discussion] where() has started returning a tuple!? In-Reply-To: Message-ID: see: http://sourceforge.net/mailarchive/forum.php?thread_id=10165581&forum_id=489 0 > -----Original Message----- > From: numpy-discussion-admin at lists.sourceforge.net > [mailto:numpy-discussion-admin at lists.sourceforge.net]On Behalf Of Andrew > Jaffe > Sent: Sunday, April 16, 2006 6:17 PM > To: numpy-discussion at lists.sourceforge.net > Subject: [Numpy-discussion] where() has started returning a tuple!? > > > I think the following behavior is (only recently) wrong: > > In [7]: numpy.__version__ > Out[7]: '0.9.7.2360' > > In [8]: numpy.nonzero([True, False, True]) > Out[8]: array([0, 2]) > > In [9]: numpy.where([True, False, True]) > Out[9]: (array([0, 2]),) > > Note the tuple output to where(), which should be the same as nonzero. > > Andrew > > > > ------------------------------------------------------- > This SF.Net email is sponsored by xPML, a groundbreaking > scripting language > that extends applications into web and mobile media. Attend the > live webcast > and join the prime developer group breaking into this new coding > territory! 
> http://sel.as-us.falkag.net/sel?cmd=lnk&kid=110944&bid=241720&dat=121642 > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/numpy-discussion > From a.h.jaffe at gmail.com Mon Apr 17 00:53:04 2006 From: a.h.jaffe at gmail.com (Andrew Jaffe) Date: Mon Apr 17 00:53:04 2006 Subject: [Numpy-discussion] Re: where() has started returning a tuple!? In-Reply-To: References: Message-ID: Aha, missed that thread (and the docstring -- my bad). And actually I misunderstood the effect of the change, anyway: a[where(a>0)] is still fine, it's just other activities like iterating over where(a>0) that is no longer possible in the same way. Thanks for the pointer! Andrew Perry Greenfield wrote: > see: > > http://sourceforge.net/mailarchive/forum.php?thread_id=10165581&forum_id=489 > 0 > >> -----Original Message----- >> From: numpy-discussion-admin at lists.sourceforge.net >> [mailto:numpy-discussion-admin at lists.sourceforge.net]On Behalf Of Andrew >> Jaffe >> Sent: Sunday, April 16, 2006 6:17 PM >> To: numpy-discussion at lists.sourceforge.net >> Subject: [Numpy-discussion] where() has started returning a tuple!? >> >> >> I think the following behavior is (only recently) wrong: >> >> In [7]: numpy.__version__ >> Out[7]: '0.9.7.2360' >> >> In [8]: numpy.nonzero([True, False, True]) >> Out[8]: array([0, 2]) >> >> In [9]: numpy.where([True, False, True]) >> Out[9]: (array([0, 2]),) >> >> Note the tuple output to where(), which should be the same as nonzero. >> >> Andrew >> From ryanlists at gmail.com Mon Apr 17 05:57:03 2006 From: ryanlists at gmail.com (Ryan Krauss) Date: Mon Apr 17 05:57:03 2006 Subject: [Numpy-discussion] Re: Code Question In-Reply-To: References: <1145116214.444116365d326@webmail.colorado.edu> <1145133185.4441588148cb3@webmail.colorado.edu> <1145221290.4442b0aa55961@webmail.colorado.edu> Message-ID: Alright Saqib, Robert is right that you should try fft in perl if you don't want to learn Python. But as I understand it, you want to read in this file, fft it, and write the fft to a file using only numarray. Attached is a script that does that. Most of the script is just low-level file io to avoid having to install scipy to read and write the arrays. Hope this helps, Ryan On 4/16/06, Robert Kern wrote: > Saqib bin Sohail wrote: > > > I am sure all of you will say why not use the existing functionalities, but > > after 3 days I haven't gotten anywhere. All I need to do is get FFT of some > > sound files so that I can verify the result of FFT's and compare them with my > > FFT code in VxWorks. > > Well, if you are just trying to get an independent verification of your VxWorks > FFT code, and you are much more comfortable with Perl, then you might want to > use one of the FFT libraries available for Perl like Math::FFT. > > -- > Robert Kern > robert.kern at gmail.com > > "I have come to believe that the whole world is an enigma, a harmless enigma > that is made terrible by our own mad attempt to interpret it as though it had > an underlying truth." > -- Umberto Eco > > > > ------------------------------------------------------- > This SF.Net email is sponsored by xPML, a groundbreaking scripting language > that extends applications into web and mobile media. Attend the live webcast > and join the prime developer group breaking into this new coding territory! 
> http://sel.as-us.falkag.net/sel?cmd=lnk&kid=110944&bid=241720&dat=121642 > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/numpy-discussion > -------------- next part -------------- A non-text attachment was scrubbed... Name: read_fft_write_numarray.py Type: text/x-python Size: 872 bytes Desc: not available URL: From chanley at stsci.edu Mon Apr 17 06:24:06 2006 From: chanley at stsci.edu (Christopher Hanley) Date: Mon Apr 17 06:24:06 2006 Subject: [Numpy-discussion] Vectorize bug In-Reply-To: <44404A5B.5010802@ieee.org> References: <00fa01c6601e$c7707840$0502010a@dsp.sun.ac.za> <44404A5B.5010802@ieee.org> Message-ID: <4443969D.4090604@stsci.edu> Travis Oliphant wrote: > I'm not sure if the Solaris crash is fixed or not yet after the recent > changes to SVN. There may be more than one bug here... The numpy.test() unit tests no longer cause segfaults on Solaris. All of my daily numpy regression tests are now passing for Solaris. Thank you for your time and help, Chris From michael.sorich at gmail.com Mon Apr 17 17:13:09 2006 From: michael.sorich at gmail.com (Michael Sorich) Date: Mon Apr 17 17:13:09 2006 Subject: [Numpy-discussion] using NaN, INT_MIN etc in ndarray instead of a masked array Message-ID: <16761e100604171712v195b47f5q111cb2c4519a03db@mail.gmail.com> On 4/8/06, Sasha wrote: > > ... > See above. For ndarray mask is always False unless an add-on module is > loaded that redefines arithmetic to recognize special bit-patterns > such as NaN or INT_MIN. > > Is it possible to implement masked values using these special bit patterns in the ndarray instead of using a separate MA class? If so has there been any thought as to whether this may be the better option. I think it would be preferable if the ability to handle masked data was available in the standard array class (ndarray), as this would increase the likelihood that functions built for numeric arrays will handle masked values well. It seems that ndarray already has decent support for nans (isnan() returns the equivalent of a boolean mask array), indicating that such an approach may be acceptable. How difficult is it to generalise the concept to other data types (int, string, bool)? Mike -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Mon Apr 17 19:53:01 2006 From: robert.kern at gmail.com (Robert Kern) Date: Mon Apr 17 19:53:01 2006 Subject: [Numpy-discussion] Re: using NaN, INT_MIN etc in ndarray instead of a masked array In-Reply-To: <16761e100604171712v195b47f5q111cb2c4519a03db@mail.gmail.com> References: <16761e100604171712v195b47f5q111cb2c4519a03db@mail.gmail.com> Message-ID: Michael Sorich wrote: > On 4/8/06, *Sasha* > wrote: > > ... > > See above. For ndarray mask is always False unless an add-on module is > loaded that redefines arithmetic to recognize special bit-patterns > such as NaN or INT_MIN. > > Is it possible to implement masked values using these special bit > patterns in the ndarray instead of using a separate MA class? If so has > there been any thought as to whether this may be the better option. I > think it would be preferable if the ability to handle masked data was > available in the standard array class (ndarray), as this would increase > the likelihood that functions built for numeric arrays will handle > masked values well. 
It seems that ndarray already has decent support for > nans (isnan() returns the equivalent of a boolean mask array), > indicating that such an approach may be acceptable. How difficult is it > to generalise the concept to other data types (int, string, bool)? Well, I'm certainly dead set against any change that would make all arrays that happen to contain those special values to be treated as masked arrays. -- Robert Kern robert.kern at gmail.com "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From oliphant.travis at ieee.org Mon Apr 17 23:04:04 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Mon Apr 17 23:04:04 2006 Subject: [Numpy-discussion] using NaN, INT_MIN etc in ndarray instead of a masked array In-Reply-To: <16761e100604171712v195b47f5q111cb2c4519a03db@mail.gmail.com> References: <16761e100604171712v195b47f5q111cb2c4519a03db@mail.gmail.com> Message-ID: <44448138.2080402@ieee.org> Michael Sorich wrote: > On 4/8/06, *Sasha* > wrote: > > ... > > See above. For ndarray mask is always False unless an add-on module is > loaded that redefines arithmetic to recognize special bit-patterns > such as NaN or INT_MIN. > > > Is it possible to implement masked values using these special bit > patterns in the ndarray instead of using a separate MA class? If so > has there been any thought as to whether this may be the better > option. I think it would be preferable if the ability to handle masked > data was available in the standard array class (ndarray), as this > would increase the likelihood that functions built for numeric arrays > will handle masked values well. It seems that ndarray already has > decent support for nans (isnan() returns the equivalent of a boolean > mask array), indicating that such an approach may be acceptable. How > difficult is it to generalise the concept to other data types (int, > string, bool)? > I don't think the approach can be generalized at all. It would only work with floating-point values and therefore is not particularly exciting. I think ultimately, making masked arrays a C-based sub-class is where masked array should go. For now the Python-based class is a good environment for developing the ideas behind how to preserve masked arrays through other functions if it is possible. It seems that masked arrays must do things quite differently than other arrays on certain applications, and I'm not altogether clear on how to support them in all the NumPy code. Because masked arrays are not used by everybody who uses NumPy arrays, it should be a separate sub-class. Ultimately, I hope we will get the basic array object into Python (what Tim was calling the super array) before 2.6 -Travis From svetosch at gmx.net Tue Apr 18 01:15:01 2006 From: svetosch at gmx.net (Sven Schreiber) Date: Tue Apr 18 01:15:01 2006 Subject: [Numpy-discussion] Toward release 1.0 of NumPy In-Reply-To: <443EDFE7.6010509@cox.net> References: <443D9543.8040601@ee.byu.edu> <443E096D.3040407@gmx.net> <443E7109.6080808@cox.net> <443EC2B4.807@cox.net> <443EDFE7.6010509@cox.net> Message-ID: <44449FC4.8020406@gmx.net> [Sorry for the late reaction, I was on vacation.] Tim Hochberg schrieb: >> > Here's my best guess as to what is going on: > 1. There is a relatively large group of people who use Kronecker > product as Alan does (probably the matrix as opposed to tensor math > folks). 
I'm guessing it's a large group since they manage to write the > definitions at both mathworld and planetmath. Yes. > 2. kron was meant to implement this. That's what I thought, anyway. > 2.5 People who need the other meaning of kron can just use outer, so > no real conflict. > 3. The implementation was either inappropriately generalized or it > was assumed that all inputs would be matrices (and hence rank-2). > > Assuming 3. is correct, and I'd like to hear from people if they think > that the behaviour in the non rank-2 cases is sensible, the next > question is whether the behaviour in the rank-2 cases makes sense. It > seem to, but I'm not a user of kron. If both of the preceeding are true, > it seems like a complete fix entails the following two things: > 1. Forbid arguments that are not rank-2. This allows all matrices, > which is really the main target here I think. > 2. Fix the return type issue. I have a fix for this ready to commit, > but I want to figure out the first part as well. > Both 1 and 2 sound very good to me as a user. So, should I still submit a new ticket about kron, or is it already being fixed? Greetings, Sven From a.u.r.e.l.i.a.n at gmx.net Tue Apr 18 01:46:04 2006 From: a.u.r.e.l.i.a.n at gmx.net (Johannes Loehnert) Date: Tue Apr 18 01:46:04 2006 Subject: [Numpy-discussion] Re: where In-Reply-To: References: Message-ID: <200604181045.05058.a.u.r.e.l.i.a.n@gmx.net> On Thursday 13 April 2006 19:16, Ryan Krauss wrote: > which makes this: > myvect=where((f>19.5) & (f<38) & > (phase>0),ones(shape(phase)),zeros(shape(phase))) > > actually really silly, sense all it is a complicated way to get back > the input of > (f>19.5) & (f<38) & (phase>0) > ...but you should cast the second to signed int32, otherwise a = (f>19.5) & (f<38) & (phase>0) print a-1 will give an array of 0's and 255's :) (since boolean arrays are by default upcast to unsigned int8) Johannes From ryanlists at gmail.com Tue Apr 18 05:31:15 2006 From: ryanlists at gmail.com (Ryan Krauss) Date: Tue Apr 18 05:31:15 2006 Subject: [Numpy-discussion] Re: where In-Reply-To: <200604181045.05058.a.u.r.e.l.i.a.n@gmx.net> References: <200604181045.05058.a.u.r.e.l.i.a.n@gmx.net> Message-ID: You are right. I actually did run into a problem with this. I was trying to subtract 360 degrees from the phase of some fft data and I multiplied -360 (no dot) times my bool array. It took me a while to track that one down. Ryan On 4/18/06, Johannes Loehnert wrote: > On Thursday 13 April 2006 19:16, Ryan Krauss wrote: > > which makes this: > > myvect=where((f>19.5) & (f<38) & > > (phase>0),ones(shape(phase)),zeros(shape(phase))) > > > > actually really silly, sense all it is a complicated way to get back > > the input of > > (f>19.5) & (f<38) & (phase>0) > > > > ...but you should cast the second to signed int32, otherwise > > a = (f>19.5) & (f<38) & (phase>0) > print a-1 > > will give an array of 0's and 255's :) (since boolean arrays are by default > upcast to unsigned int8) > > Johannes > > > ------------------------------------------------------- > This SF.Net email is sponsored by xPML, a groundbreaking scripting language > that extends applications into web and mobile media. Attend the live webcast > and join the prime developer group breaking into this new coding territory! 
> http://sel.as-us.falkag.net/sel?cmd=lnk&kid=110944&bid=241720&dat=121642 > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/numpy-discussion > From tim.hochberg at cox.net Tue Apr 18 06:24:09 2006 From: tim.hochberg at cox.net (Tim Hochberg) Date: Tue Apr 18 06:24:09 2006 Subject: [Numpy-discussion] Toward release 1.0 of NumPy In-Reply-To: <44449FC4.8020406@gmx.net> References: <443D9543.8040601@ee.byu.edu> <443E096D.3040407@gmx.net> <443E7109.6080808@cox.net> <443EC2B4.807@cox.net> <443EDFE7.6010509@cox.net> <44449FC4.8020406@gmx.net> Message-ID: <4444E7DD.2010209@cox.net> Sven Schreiber wrote: >[Sorry for the late reaction, I was on vacation.] > >Tim Hochberg schrieb: > > > >>Here's my best guess as to what is going on: >> 1. There is a relatively large group of people who use Kronecker >>product as Alan does (probably the matrix as opposed to tensor math >>folks). I'm guessing it's a large group since they manage to write the >>definitions at both mathworld and planetmath. >> >> > >Yes. > > > >> 2. kron was meant to implement this. >> >> > >That's what I thought, anyway. > > > >> 2.5 People who need the other meaning of kron can just use outer, so >>no real conflict. >> 3. The implementation was either inappropriately generalized or it >>was assumed that all inputs would be matrices (and hence rank-2). >> >>Assuming 3. is correct, and I'd like to hear from people if they think >>that the behaviour in the non rank-2 cases is sensible, the next >>question is whether the behaviour in the rank-2 cases makes sense. It >>seem to, but I'm not a user of kron. If both of the preceeding are true, >>it seems like a complete fix entails the following two things: >> 1. Forbid arguments that are not rank-2. This allows all matrices, >>which is really the main target here I think. >> 2. Fix the return type issue. I have a fix for this ready to commit, >>but I want to figure out the first part as well. >> >> >> > >Both 1 and 2 sound very good to me as a user. > >So, should I still submit a new ticket about kron, or is it already >being fixed? > > Go ahead and submit a ticket if you would. I have a fix here, but I've been waiting to submit it till I heard from some other people who use kron (and because I've been swamped the last couple of days). If you submit the ticket, that'll keep it from falling through the cracks. Thanks for the feedback, -tim >Greetings, >Sven > > > > From ndarray at mac.com Tue Apr 18 07:06:22 2006 From: ndarray at mac.com (Sasha) Date: Tue Apr 18 07:06:22 2006 Subject: [Numpy-discussion] using NaN, INT_MIN etc in ndarray instead of a masked array In-Reply-To: <44448138.2080402@ieee.org> References: <16761e100604171712v195b47f5q111cb2c4519a03db@mail.gmail.com> <44448138.2080402@ieee.org> Message-ID: On 4/18/06, Travis Oliphant wrote: > Michael Sorich wrote: > ... > > Is it possible to implement masked values using these special bit > > patterns in the ndarray instead of using a separate MA class? If so > > has there been any thought as to whether this may be the better > > option. I think it would be preferable if the ability to handle masked > > data was available in the standard array class (ndarray), as this > > would increase the likelihood that functions built for numeric arrays > > will handle masked values well. 
It seems that ndarray already has > > decent support for nans (isnan() returns the equivalent of a boolean > > mask array), indicating that such an approach may be acceptable. How > > difficult is it to generalise the concept to other data types (int, > > string, bool)? > > > I don't think the approach can be generalized at all. It would only > work with floating-point values and therefore is not particularly exciting. > Not true. R supports "NA" for all its types except raw bytes. For example: > > x<-logical(5) > x [1] FALSE FALSE FALSE FALSE FALSE > x[1:2]=NA > !x [1] NA NA TRUE TRUE TRUE > I think ultimately, making masked arrays a C-based sub-class is where > masked array should go. For now the Python-based class is a good > environment for developing the ideas behind how to preserve masked > arrays through other functions if it is possible. > I've voiced my opposition to subclassing before. Here I believe it is more appropriate to have an add-on module that installs alternative math functions. Having two classes in the same application that are subtly different in the corner cases is already a problem with ma.array vs. ndarray; adding a third class will only make things worse. > It seems that masked arrays must do things quite differently than other > arrays on certain applications, and I'm not altogether clear on how to > support them in all the NumPy code. Because masked arrays are not used > by everybody who uses NumPy arrays, it should be a separate sub-class. > As far as I understand, people who don't use MA don't deal with missing values. For this category of users there will be no visible effect no matter how missing values are treated as long as in the absence of missing values, normal rules apply. Yes, many functions must treat missing values differently, but the same is true for NaNs. NumPy allows floating point arrays to have nans, but there is no real support beyond what happened to work at the OS level. For example: >>> sort([5,nan,3,2]) array([ 5. , nan, 2. , 3. ]) Also, what is the justification for >>> int_(nan) 0 ? > Ultimately, I hope we will get the basic array object into Python (what > Tim was calling the super array) before 2.6 As far as I understand, that object will not come with arithmetic rules or math functions. Therefore, I don't see how this is relevant to the present discussion.
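For concreteness, R's integer convention can be imitated in numpy (the sentinel name INT_NA is invented here; nothing in numpy assigns it any meaning):

import numpy

INT_NA = -2**31                # R stores integer NA as INT_MIN

x = numpy.array([1, 2, INT_NA, 4])
mask = (x == INT_NA)           # the mask is recoverable from the data itself
print mask
print numpy.compress(mask == False, x).sum()   # 7: reductions must skip NA by hand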
From oliphant.travis at ieee.org Tue Apr 18 09:39:11 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Tue Apr 18 09:39:11 2006 Subject: [Numpy-discussion] using NaN, INT_MIN etc in ndarray instead of a masked array In-Reply-To: References: <16761e100604171712v195b47f5q111cb2c4519a03db@mail.gmail.com> <44448138.2080402@ieee.org> Message-ID: <44451611.9070707@ieee.org> Sasha wrote: > On 4/18/06, Travis Oliphant wrote: > >> Michael Sorich wrote: >> ... >>> Is it possible to implement masked values using these special bit >>> patterns in the ndarray instead of using a separate MA class? If so >>> has there been any thought as to whether this may be the better >>> option. I think it would be preferable if the ability to handle masked >>> data was available in the standard array class (ndarray), as this >>> would increase the likelihood that functions built for numeric arrays >>> will handle masked values well. It seems that ndarray already has >>> decent support for nans (isnan() returns the equivalent of a boolean >>> mask array), indicating that such an approach may be acceptable. How >>> difficult is it to generalise the concept to other data types (int, >>> string, bool)? >>> >>> >> I don't think the approach can be generalized at all. It would only >> work with floating-point values and therefore is not particularly exciting. >> >> > Not true. R supports "NA" for all its types except raw bytes. > For example: > > >> x<-logical(5) >> x >> > [1] FALSE FALSE FALSE FALSE FALSE > >> x[1:2]=NA >> !x >> > [1] NA NA TRUE TRUE TRUE > For Boolean values there is "room" for a NA value, but what about arbitrary integers. Does R just limit the range of the integer value? That's what I meant: "fiddling with special-values" doesn't generalize to all data-types. >> arrays through other functions if it is possible. >> > I've voiced my opposition to subclassing before. And you haven't been very clear about why you are opposed. Just voicing concern is not enough. Python sub-classing in C amounts to exactly what masked arrays are: arrays with additional components in their structure (i.e. a mask). Please be more specific about whatever your concerns are with sub-classing. > Here I believe it is > more appropriate to have an add-on module that installs alternative > math functions. Sure that will work. But, we're talking about more than math functions. Ultimately masked array users will want *every* function they use to work "right" with masked arrays. > Having two classes in the same application that are > subtly different in the corner cases is already a problem with > ma.array vs. ndarray; adding a third class will only make things > worse. > I don't know what you are talking about. What is the "third class"? I'm talking about just making ma.array construct a sub-class. >> It seems that masked arrays must do things quite differently than other >> arrays on certain applications, and I'm not altogether clear on how to >> support them in all the NumPy code. Because masked arrays are not used >> by everybody who uses NumPy arrays, it should be a separate sub-class. >> >> > As far as I understand, people who don't use MA don't deal with > missing values. For this category of users there will be no visible > effect no matter how missing values are treated as long as in the > absence of missing values, normal rules apply. Yes, many functions > must treat missing values differently, but the same is true for NaNs. > NumPy allows floating point arrays to have nans, but there is no real > support beyond what happened to work at the OS level. > Or we deal with missing values differently (i.e. manage it ourselves). Sure, there will be no behavioral effect, but the code will have to be re-written to "do the right thing" with masked arrays in such a way as to not slow everything else down (that's at least an "if" statement sprinkled throughout every sub-routine). Many people are not enthused about complicating the basic array object any more than necessary. If it can be shown that masked arrays can be integrated into the ndarray object without inordinate complication and/or slowness, then I don't think people would mind. The best way to prove that is to create a sub-class and change only the methods / functions that are necessary. That's really all I'm saying. > >> Ultimately, I hope we will get the basic array object into Python (what >> Tim was calling the super array) before 2.6 >> > > As far as I understand, that object will not come with arithmetic > rules or math functions. Therefore, I don't see how this is relevant > to the present discussion. >
> Because it will help all array objects talk more cleanly to each other. But, if you are so opposed to sub-classing (which I'm not sure why in this case), then it may not matter. -Travis From strang at nmr.mgh.harvard.edu Tue Apr 18 10:37:03 2006 From: strang at nmr.mgh.harvard.edu (Gary Strangman) Date: Tue Apr 18 10:37:03 2006 Subject: [Numpy-discussion] using NaN, INT_MIN etc in ndarray instead of a masked array In-Reply-To: <44451611.9070707@ieee.org> References: <16761e100604171712v195b47f5q111cb2c4519a03db@mail.gmail.com> <44448138.2080402@ieee.org> <44451611.9070707@ieee.org> Message-ID: >> Not true. R supports "NA" for all its types except raw bytes. >> For example: (snip) > > For Boolean values there is "room" for a NA value, but what about arbitrary > integers. Does R just limit the range of the integer value? That's what I > meant: "fiddling with special-values" doesn't generalize to all data-types. In R, I believe NA = -sys.maxint-1 Gary From oliphant.travis at ieee.org Tue Apr 18 11:09:03 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Tue Apr 18 11:09:03 2006 Subject: [Numpy-discussion] String (and unicode) comparisons and per-thread error handling fixed Message-ID: <44452B04.4090403@ieee.org> String comparisons were added last week. Today, I added per-thread error handling to NumPy. There is 1 more enhancement (scalar math) prior to 0.9.8 release --- but it will probably take 1-2 weeks. The new error handling means that the three-scope system is gone. Now, there is only one per-Python-thread global scope for error handling. If you change the error handling it will affect all ufuncs. Because of this, the seterr function now returns an object with the old error-handling information. This object must be passed to umath.seterrobj() in order to restore the error handling. -Travis From tim.hochberg at cox.net Tue Apr 18 11:21:06 2006 From: tim.hochberg at cox.net (Tim Hochberg) Date: Tue Apr 18 11:21:06 2006 Subject: [Numpy-discussion] String (and unicode) comparisons and per-thread error handling fixed In-Reply-To: <44452B04.4090403@ieee.org> References: <44452B04.4090403@ieee.org> Message-ID: <44452D53.70009@cox.net> Travis Oliphant wrote: > > String comparisons were added last week. Today, I added per-thread > error handling to NumPy. There is 1 more enhancement (scalar math) > prior to 0.9.8 release --- but it will probably take 1-2 weeks. Oops! I'm about 2/3 done doing this one too. I think I'll go ahead and finish mine up and see how our approaches stack up performance wise and see if there's any of mine that's useful to roll into yours. -tim > > The new error handling means that the three-scope system is gone. > Now, there is only one per-Python-thread global scope for error > handling. If you change the error handling it will affect all > ufuncs. Because of this, the seterr function now returns an object > with the old error-handling information. This object must be passed > to umath.seterrobj() in order to restore the error handling. > > -Travis > > > > ------------------------------------------------------- > This SF.Net email is sponsored by xPML, a groundbreaking scripting > language > that extends applications into web and mobile media. Attend the live > webcast > and join the prime developer group breaking into this new coding > territory! 
> http://sel.as-us.falkag.net/sel?cmd=lnk&kid=110944&bid=241720&dat=121642 > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/numpy-discussion > From oliphant.travis at ieee.org Tue Apr 18 12:14:14 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Tue Apr 18 12:14:14 2006 Subject: [Numpy-discussion] String (and unicode) comparisons and per-thread error handling fixed In-Reply-To: <44452D53.70009@cox.net> References: <44452B04.4090403@ieee.org> <44452D53.70009@cox.net> Message-ID: <44453A5E.4020506@ieee.org> Tim Hochberg wrote: > Travis Oliphant wrote: > >> >> String comparisons were added last week. Today, I added per-thread >> error handling to NumPy. There is 1 more enhancement (scalar math) >> prior to 0.9.8 release --- but it will probably take 1-2 weeks. > > Oops! I'm about 2/3 done doing this one too. I think I'll go ahead > and finish mine up and see how our approaches stack up performance > wise and see if there's any of mine that's useful to roll into yours. Darn. I thought I gave you enough time.... :-) On the other hand, all I did was change the way the error-mode is being looked-up (from the three dictionaries to just one). It's not much different than before except for that. I didn't do anything about the other ideas you spoke of. I did add a simple object to reset the error mode when it gets deleted, and had to fiddle with the seterr code a little to accept that object so that both methods of resetting the error mode work. A stack can certainly be built on top of what is now there (I'm thinking for numarray compatibility...), but I didn't do that. Sorry for stepping on your toes. I'm just anxious... I'll be gone for a couple of days and won't be working on NumPy/SciPy, so feel free to adjust. -Travis From rhl at astro.princeton.edu Tue Apr 18 13:07:04 2006 From: rhl at astro.princeton.edu (Robert Lupton) Date: Tue Apr 18 13:07:04 2006 Subject: [Numpy-discussion] Infinite recursion in numpy called from swig generated code In-Reply-To: References: <5809AC56-B2DF-4403-B7BC-9AEEAAC78505@astro.princeton.edu> <43FD32E4.10600@ieee.org> <44203F91.7010505@ieee.org> Message-ID: The latest version of swig (1.3.28 or 1.3.29) has broken my multiple-inheritance-from-C-and-numpy application; more specifically, it generates an infinite loop in numpy-land. I'm using numpy (0.9.6), and here's the offending code. Ideas anyone? I've pasted the crucial part of numpy.lib.UserArray onto the end of this message (how do I know? because you can replace the "from numpy.lib.UserArray" with this, and the problem persists). ##################################################### from numpy.lib.UserArray import * import types class myImage(types.ObjectType): def __init__(self, *args): this = None try: self.this.append(this) except: self.this = this class Image(UserArray, myImage): def __init__(self, *args): myImage.__init__(self, *args) ##################################################### The symptoms are: from recursionBug import *; Image(myImage()) ------------------------------------------------------------ Traceback (most recent call last): File "<stdin>", line 1, in ?
File "recursionBug.py", line 32, in __init__ myImage.__init__(self, *args) File "recursionBug.py", line 26, in __init__ except: self.this = this File "/sw/lib/python2.4/site-packages/numpy/lib/UserArray.py", line 187, in __setattr__ self.array.__setattr__(attr, value) File "/sw/lib/python2.4/site-packages/numpy/lib/UserArray.py", line 193, in __getattr__ return self.array.__getattribute__(attr) ... File "/sw/lib/python2.4/site-packages/numpy/lib/UserArray.py", line 193, in __getattr__ return self.array.__getattribute__(attr) File "/sw/lib/python2.4/site-packages/numpy/lib/UserArray.py", line 193, in __getattr__ return self.array.__getattribute__(attr) RuntimeError: maximum recursion depth exceeded The following stripped down piece of numpy seems to be the problem: class UserArray(object): def __setattr__(self,attr,value): try: self.array.__setattr__(attr, value) except AttributeError: object.__setattr__(self, attr, value) # Only called after other approaches fail. def __getattr__(self,attr): return self.array.__getattribute__(attr) R From cookedm at physics.mcmaster.ca Tue Apr 18 13:10:02 2006 From: cookedm at physics.mcmaster.ca (David M. Cooke) Date: Tue Apr 18 13:10:02 2006 Subject: [Numpy-discussion] Trac Wikis closed for anonymous edits until further notice In-Reply-To: <44421025.9060804@gmail.com> (Robert Kern's message of "Sun, 16 Apr 2006 04:36:37 -0500") References: <44421025.9060804@gmail.com> Message-ID: Robert Kern writes: > We've been hit badly by spammers, so I can only presume our Trac sites are now > on the traded spam lists. I am going to turn off anonymous edits for now. Ticket > creation will probably still be left open for now. Another thing that's concerned me is closing of tickets by anonymous; can we turn that off? It disturbs me when I'm browsing the RSS feed and I see that. If a user who's not a developer thinks it could be closed, they could post a comment saying that, and a developer could close it. > Many thanks to David Cooke for quickly removing the spam. The RSS feeds are great for that. Although having a way to quickly revert a change would have made it easier :-) > I am looking into ways to allow people to register themselves with the Trac > sites so they can edit the Wikis and submit tickets without needing to be added > by a project admin. that'd be good. -- |>|\/|< /--------------------------------------------------------------------------\ |David M. Cooke http://arbutus.physics.mcmaster.ca/dmc/ |cookedm at physics.mcmaster.ca From oliphant.travis at ieee.org Tue Apr 18 13:50:09 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Tue Apr 18 13:50:09 2006 Subject: [Numpy-discussion] Infinite recursion in numpy called from swig generated code In-Reply-To: References: <5809AC56-B2DF-4403-B7BC-9AEEAAC78505@astro.princeton.edu> <43FD32E4.10600@ieee.org> <44203F91.7010505@ieee.org> Message-ID: <444550CF.6090100@ieee.org> Robert Lupton wrote: > The latest version of swig (1.3.28 or 1.3.29) has broken my > multiple-inheritance-from-C-and-numpy application; more specifically, > it generates an infinite loop in numpy-land. I'm using numpy (0.9.6), > and here's the offending code. Ideas anyone? I've pasted the crucial > part of numpy.lib.UserArray onto the end of this message (how do I know? > because you can replace the "from numpy.lib.UserArray" with this, and > the problem persists). This is a problem in the getattr code of UserArray. This is fixed in SVN. 
But, you can just replace the getattr code in UserArray.py with the following: def __getattr__(self,attr): if (attr == 'array'): return object.__getattr__(self, attr) return self.array.__getattribute__(attr) Thanks for finding and reporting this. -Travis From christian at marquardt.sc Tue Apr 18 14:48:06 2006 From: christian at marquardt.sc (Christian Marquardt) Date: Tue Apr 18 14:48:06 2006 Subject: [Numpy-discussion] using NaN, INT_MIN etc in ndarray instead of a masked array In-Reply-To: References: <16761e100604171712v195b47f5q111cb2c4519a03db@mail.gmail.com> <44448138.2080402@ieee.org> <44451611.9070707@ieee.org> Message-ID: <20053.84.167.224.64.1145396854.squirrel@webmail.marquardt.sc> On Tue, April 18, 2006 19:36, Gary Strangman wrote: > >>> Not true. R supports "NA" for all its types except raw bytes. >>> For example: > (snip) >> >> For Boolean values there is "room" for a NA value, but what about >> arbitrary >> integers. Does R just limit the range of the integer value? That's >> what I >> meant: "fiddling with special-values" doesn't generalize to all >> data-types. > > In R, I believe NA = -sys.maxint-1 Don't know if this helps, but I have found the following in the R Data Import/Export Manual (in section 6.5.1, available at http://cran.r-project.org/doc/manuals/R-data.html): The missing value for R logical and integer types is INT_MIN, the smallest representable int defined in the C header limits.h, normally corresponding to the bit pattern 0xffffffff. For doubles (I think R only uses double precision internally), it's a bit more complex apparently; in the section mentioned above, the authors explain that [If R's internal constant definitions / library functions can't be used], on all common platforms IEC 60559 (aka IEEE 754) arithmetic is used, so standard C facilities can be used to test for or set Inf, -Inf and NaN values. On such platforms NA is represented by the NaN value with low-word 0x7a2 (1954 in decimal). The implementation of the floating point NA value is done in the file arithmetics.c of the R source code; the relevant code snippets defining the NA "value" are (I believe) typedef union { double value; unsigned int word[2]; } ieee_double; #ifdef WORDS_BIGENDIAN static CONST int hw = 0; static CONST int lw = 1; #else /* !WORDS_BIGENDIAN */ static CONST int hw = 1; static CONST int lw = 0; #endif /* WORDS_BIGENDIAN */ static double R_ValueOfNA(void) { /* The gcc shipping with RedHat 9 gets this wrong without * the volatile declaration. Thanks to Marc Schwartz. */ volatile ieee_double x; x.word[hw] = 0x7ff00000; x.word[lw] = 1954; return x.value; } and the tests for a number being NA or NaN are int R_IsNA(double x) { if (isnan(x)) { ieee_double y; y.value = x; return (y.word[lw] == 1954); } return 0; } int R_IsNaN(double x) { if (isnan(x)) { ieee_double y; y.value = x; return (y.word[lw] != 1954); } return 0; } Hope this is useful, Christian. From twegener at radlogic.com.au Tue Apr 18 18:07:02 2006 From: twegener at radlogic.com.au (Tim Wegener) Date: Tue Apr 18 18:07:02 2006 Subject: [Numpy-discussion] Backporting numpy to Python 2.2 Message-ID: <20060419103554.4ac1df4a.twegener@radlogic.com.au> Hi, I am attempting to backport numpy-0.9.6 to be compatible with python 2.2. (Some of our machines run python 2.2 as part of Red Hat 9 and Red Hat 7.3 and it is hazardous to alter the standard setup.) I was able to change most of the 2.3-isms to be 2.2 compatible (see the attached patch). 
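A typical shim for one of those 2.3-isms -- a sketch of the idea, not the actual patch -- is the enumerate backport, which on 2.2 also needs the generators future-import:

from __future__ import generators

try:
    enumerate
except NameError:
    def enumerate(iterable):
        # Minimal stand-in for the enumerate() builtin added in Python 2.3.
        i = 0
        for item in iterable:
            yield i, item
            i += 1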
However I had problems compiling the following c module: In file included from numpy/core/src/multiarraymodule.c:64: numpy/core/src/arrayobject.c: In function `arraydescr_dealloc': numpy/core/src/arrayobject.c:8417: warning: passing arg 1 of pointer to function from incompatible pointer type numpy/core/src/multiarraymodule.c: In function `PyArray_DescrConverter': numpy/core/src/multiarraymodule.c:4072: `PyBool_Type' undeclared (first use in this function) numpy/core/src/multiarraymodule.c: In function `setup_scalartypes': numpy/core/src/multiarraymodule.c:5736: `PyBool_Type' undeclared (first use in this function) numpy/core/src/multiarraymodule.c: In function `initmultiarray': numpy/core/src/multiarraymodule.c:5897: `PyObject_SelfIter' undeclared (first use in this function) error: Command "gcc -DNDEBUG -O2 -g -pipe -march=i386 -mcpu=i686 -D_GNU_SOURCE -fPIC -fPIC -Ibuild/src/numpy/core/src -Inumpy/core/include -Ibuild/src/numpy/core -Inumpy/core/src -Inumpy/core/include -I/usr/include/python2.2 -c numpy/core/src/multiarraymodule.c -o build/temp.linux-i686-2.2/multiarraymodule.o" failed with exit status 1 Is it possible to modify this module for python 2.2 compatibility or have I reached a dead end? It would be great if numpy were compatible with 2.2 out of the box, given that 2.3 is only a couple of years old (new), and 2.2 is still quite widely deployed. I am trying to migrate to numpy from Numeric, which worked happily with 2.2. FYI, a quick summary of the compatibility amendments to the python code: - backported os.walk - backported enumerate - backported distutils.log - used slices instead of list.index(item, ) - used 'r' mode instead of 'U' mode (it didn't seem that universal newline support was needed where it was used) - used the {} way of building a new dict rather than using keyword args to the dict constructor - from __future__ import generators - used str.count(substr) rather than substr in str - used os.sep rather than os.path.sep - commented out some of the new Configuration keyword arguments (download_url and classifiers) The above don't really affect the functionality, but a couple of more unusual changes were needed as well: - had to add "self.compiler.exe_extension = ''" to numpy/distutils/command/config.py (see patch) - had to change the following to an empty dict: "kws = {'depends':ext.depends}" in numpy/distutils/command/build_ext.py (see patch) These two changes may have unwanted side effects, and a better fix is probably needed there. Regards, Tim -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: numpy-0.9.6_patched_for_py2.2_diff.txt URL: From oliphant at ee.byu.edu Tue Apr 18 20:03:01 2006 From: oliphant at ee.byu.edu (Travis Oliphant) Date: Tue Apr 18 20:03:01 2006 Subject: [Numpy-discussion] Performance problems with strided arrays in NumPy In-Reply-To: <20060414213511.GA14355@xot.carabos.com> References: <20060414213511.GA14355@xot.carabos.com> Message-ID: <4445A822.60207@ee.byu.edu> faltet at xot.carabos.com wrote: >Hi, > >I'm seeing some slowness in NumPy when dealing with strided arrays. >numarray is dealing better with these situations, so I guess that >something could be done in NumPy about this. Below are the situations >that I've found up to now (maybe there are others). For the timings, >I've used numpy 0.9.7.2278 and numarray 1.5.1. > > The source of this slowness is the use in numarray of special-cases for certain-sized byte-copies.
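You can see the effect from Python with a quick strided-copy timing; this sketch is modelled on the bench-copy script posted later in this thread, and the array size and dtype are just illustrative:

import timeit

cases = [("contiguous",
          "import numpy; a = numpy.arange(1000000, dtype='float64')"),
         ("strided (10)",
          "import numpy; a = numpy.arange(1000000, dtype='float64')[::10]")]
for name, setup in cases:
    # a.copy() exercises the strided inner copy loop being discussed here.
    t = timeit.Timer("b = a.copy()", setup)
    print "time for numpy %s -->" % name, min(t.repeat(3, 10))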
Apparently, it is *much* faster to do ((double *)dst)[0] = ((double *)src)[0] when you have aligned data than it is to do memmove(dst, src, sizeof(double)) This is a useful piece of knowledge to have for optimization. There may be other optimizations like that already used by Numarray but still needing to be adapted for NumPy. I applied an optimization to take advantage of this when possible and got a 10x speed-up in the 1-d case. My timings for your benchmark with current SVN of NumPy are: NumPy: [0.021701812744140625, 0.021739959716796875, 0.021548032760620117] Numarray: [0.052516937255859375, 0.052685976028442383, 0.052355051040649414] Old timings: NumPy: [~0.09, ~0.09, ~0.09] Numarray: [~0.05, ~0.05, ~0.05] -Travis From ndarray at mac.com Tue Apr 18 20:26:16 2006 From: ndarray at mac.com (Sasha) Date: Tue Apr 18 20:26:16 2006 Subject: [Numpy-discussion] Performance problems with strided arrays in NumPy In-Reply-To: <4445A822.60207@ee.byu.edu> References: <20060414213511.GA14355@xot.carabos.com> <4445A822.60207@ee.byu.edu> Message-ID: On 4/18/06, Travis Oliphant wrote: > [...] > Apparently, it is *much* faster to do > > ((double *)dst)[0] = ((double *)src)[0] > > when you have aligned data than it is to do > > memmove(dst, src, sizeof(double)) > > This is a useful piece of knowledge to have for optimization. This is not surprising because memmove has to assume arbitrary alignment and possibility of overlap between src and dst areas. From tim.hochberg at cox.net Wed Apr 19 08:58:04 2006 From: tim.hochberg at cox.net (Tim Hochberg) Date: Wed Apr 19 08:58:04 2006 Subject: [Numpy-discussion] seterr changes Message-ID: <44465DEE.8090703@cox.net> Hi Travis et al, I started looking at your seterr changes. I stared at yours for a while then I stared at mine for a while. Then I decided that mine wouldn't work right in the presence of threads. Then I decided that yours wouldn't work right in the presence of threads either. Specifically, it looks like ufunc_update_use_defaults isn't going to work. I think I know how to fix that, but I'm not sure that it's worth the trouble since I also did some benchmarking and it appears that the benefit of special casing is minimal. I looked at six cases: small (len-1), medium (len-1e4) and large (len-1e6) arrays with error checking on and error checking off. For medium and large arrays, I could discern no difference at all. For small arrays, there may be some difference, but it appears to be less than 5%. I'm not sure it's worth working through a bunch of finicky thread stuff to get just 5% back. If these benchmark numbers hold up I'd be inclined to rip out the use_default support since it's complicated enough that I know we'll end up chasing a few evil thread related bugs down through it. I'll include the benchmarking code below.
If people could (a) look it over and confirm that I'm not doing something bogus and (b) try it on some different platforms and see if they see a more significant difference, I'd appreciate it. I'm also curious about the seterr interface. It returns ufunc_values_obj. I wasn't sure how one is supposed to pass that back in to seterr, so I modified seterr to instead return a dictionary. I also modified it so that the seterr function itself has no defaults (or rather they're all None). Instead, any unspecified values are taken from the current error state. Thus seterr(divide="warn") changes only the divide state, leaving the other entries alone. Regards, -tim if True: from timeit import Timer setup = """ import numpy numpy.seterr(divide="%s") a = numpy.zeros([%s], dtype=float) """ for size in [1, 10000, 1000000]: for i in range(3): for state in ['ignore', 'warn']: reps = min(100000000 / size, 100000) timer = Timer("a * a", setup % (state, size)) print "%s|%s =>" % (state, size), timer.timeit(reps) print print From arkaitz.bitorika at gmail.com Wed Apr 19 10:30:03 2006 From: arkaitz.bitorika at gmail.com (Arkaitz Bitorika) Date: Wed Apr 19 10:30:03 2006 Subject: [Numpy-discussion] Floating point exception with numpy and embedded python interpreter Message-ID: <81934aa60604191029h4d8a8d9bl550fa58cc67d3d5e@mail.gmail.com> Hi, I'm embedding Python in a big C++ program (the NS network simulator) and I have problems when importing the numpy module: I get a Floating Point exception. The C code that causes the exception is: Py_Initialize(); PyObject* module = PyImport_ImportModule("numpy"); Py_DECREF(module); I'm running Ubuntu Breezy on a dual processor Dell machine, with the stock python and numpy 0.9.6. One strange thing is that I haven't been able to reproduce the crash by writing a minimal C program with the code above, it only crashes when added to my program. I've been embedding Python for ages on the same program and other modules work fine, only numpy fails. I've debugged the issue a bit and I've seen that the exception is thrown when the numpy __init__.py tries to import the core module. The GDB backtrace is pasted at the end. Any idea what may be going wrong?
Thanks, Arkaitz 0xb7900fd2 in initumath () at build/src/numpy/core/src/umathmodule.c:10321 10321 pinf *= mul; (gdb) bt #0 0xb7900fd2 in initumath () at build/src/numpy/core/src/umathmodule.c:10321 #1 0xb7e4e310 in _PyImport_LoadDynamicModule () from /usr/lib/libpython2.4.so.1.0 #2 0xb7e4c450 in _PyImport_FindModule () from /usr/lib/libpython2.4.so.1.0 #3 0xb7e4cc01 in PyImport_ReloadModule () from /usr/lib/libpython2.4.so.1.0 #4 0xb7e4ce26 in PyImport_ReloadModule () from /usr/lib/libpython2.4.so.1.0 #5 0xb7e4d2c6 in PyImport_ImportModuleEx () from /usr/lib/libpython2.4.so.1.0 #6 0xb7e22d9e in _PyUnicodeUCS4_ToLowercase () from /usr/lib/libpython2.4.so.1.0 #7 0xb7df5923 in PyCFunction_Call () from /usr/lib/libpython2.4.so.1.0 #8 0xb7dc8fdf in PyObject_Call () from /usr/lib/libpython2.4.so.1.0 #9 0xb7e2a92c in PyEval_CallObjectWithKeywords () from /usr/lib/libpython2.4.so.1.0 #10 0xb7e2e8f9 in PyEval_EvalFrame () from /usr/lib/libpython2.4.so.1.0 #11 0xb7e31a2d in PyEval_EvalCodeEx () from /usr/lib/libpython2.4.so.1.0 #12 0xb7e31b76 in PyEval_EvalCode () from /usr/lib/libpython2.4.so.1.0 #13 0xb7e4a525 in PyImport_ExecCodeModuleEx () from /usr/lib/libpython2.4.so.1.0 #14 0xb7e4a8e9 in PyImport_ExecCodeModule () from /usr/lib/libpython2.4.so.1.0 #15 0xb7e4c73e in _PyImport_FindModule () from /usr/lib/libpython2.4.so.1.0 #16 0xb7e4cc01 in PyImport_ReloadModule () from /usr/lib/libpython2.4.so.1.0 #17 0xb7e4ce26 in PyImport_ReloadModule () from /usr/lib/libpython2.4.so.1.0 #18 0xb7e4d2c6 in PyImport_ImportModuleEx () from /usr/lib/libpython2.4.so.1.0 #19 0xb7e22d9e in _PyUnicodeUCS4_ToLowercase () from /usr/lib/libpython2.4.so.1.0 #20 0xb7df5923 in PyCFunction_Call () from /usr/lib/libpython2.4.so.1.0 #21 0xb7dc8fdf in PyObject_Call () from /usr/lib/libpython2.4.so.1.0 #22 0xb7e2a92c in PyEval_CallObjectWithKeywords () from /usr/lib/libpython2.4.so.1.0 #23 0xb7e2e8f9 in PyEval_EvalFrame () from /usr/lib/libpython2.4.so.1.0 #24 0xb7e31a2d in PyEval_EvalCodeEx () from /usr/lib/libpython2.4.so.1.0 #25 0xb7e31b76 in PyEval_EvalCode () from /usr/lib/libpython2.4.so.1.0 #26 0xb7e5667f in PyRun_String () from /usr/lib/libpython2.4.so.1.0 #27 0xb7e2fce6 in PyEval_EvalFrame () from /usr/lib/libpython2.4.so.1.0 #28 0xb7e31a2d in PyEval_EvalCodeEx () from /usr/lib/libpython2.4.so.1.0 #29 0xb7e3011a in PyEval_EvalFrame () from /usr/lib/libpython2.4.so.1.0 #30 0xb7e31a2d in PyEval_EvalCodeEx () from /usr/lib/libpython2.4.so.1.0 #31 0xb7de31b6 in PyFunction_SetClosure () from /usr/lib/libpython2.4.so.1.0 #32 0xb7dc8fdf in PyObject_Call () from /usr/lib/libpython2.4.so.1.0 #33 0xb7dd079b in PyMethod_New () from /usr/lib/libpython2.4.so.1.0 #34 0xb7dc8fdf in PyObject_Call () from /usr/lib/libpython2.4.so.1.0 #35 0xb7dcfd7b in PyInstance_NewRaw () from /usr/lib/libpython2.4.so.1.0 #36 0xb7dc8fdf in PyObject_Call () from /usr/lib/libpython2.4.so.1.0 #37 0xb7e2f5d2 in PyEval_EvalFrame () from /usr/lib/libpython2.4.so.1.0 #38 0xb7e31a2d in PyEval_EvalCodeEx () from /usr/lib/libpython2.4.so.1.0 #39 0xb7e31b76 in PyEval_EvalCode () from /usr/lib/libpython2.4.so.1.0 #40 0xb7e4a525 in PyImport_ExecCodeModuleEx () from /usr/lib/libpython2.4.so.1.0 #41 0xb7e4a8e9 in PyImport_ExecCodeModule () from /usr/lib/libpython2.4.so.1.0 #42 0xb7e4c73e in _PyImport_FindModule () from /usr/lib/libpython2.4.so.1.0 #43 0xb7e4cc01 in PyImport_ReloadModule () from /usr/lib/libpython2.4.so.1.0 #44 0xb7e4ce26 in PyImport_ReloadModule () from /usr/lib/libpython2.4.so.1.0 #45 0xb7e4d2c6 in PyImport_ImportModuleEx () from 
/usr/lib/libpython2.4.so.1.0 #46 0xb7e22d9e in _PyUnicodeUCS4_ToLowercase () from /usr/lib/libpython2.4.so.1.0 #47 0xb7df5923 in PyCFunction_Call () from /usr/lib/libpython2.4.so.1.0 #48 0xb7dc8fdf in PyObject_Call () from /usr/lib/libpython2.4.so.1.0 #49 0xb7dcc6c0 in PyObject_CallFunction () from /usr/lib/libpython2.4.so.1.0 #50 0xb7e4d745 in PyImport_Import () from /usr/lib/libpython2.4.so.1.0 #51 0xb7e4d918 in PyImport_ImportModule () from /usr/lib/libpython2.4.so.1.0 -------------- next part -------------- An HTML attachment was scrubbed... URL: From strawman at astraw.com Wed Apr 19 10:38:11 2006 From: strawman at astraw.com (Andrew Straw) Date: Wed Apr 19 10:38:11 2006 Subject: [Numpy-discussion] Floating point exception with numpy and embedded python interpreter In-Reply-To: <81934aa60604191029h4d8a8d9bl550fa58cc67d3d5e@mail.gmail.com> References: <81934aa60604191029h4d8a8d9bl550fa58cc67d3d5e@mail.gmail.com> Message-ID: <44467576.1020708@astraw.com> Arkaitz Bitorika wrote: > Hi, > > I'm embedding Python in a big C++ program (the NS network simulator) > and I have problems when importing the numpy module, I get a Floating > Point exception. The C code that causes the exception is: I guess you mean a CPU/kernel level floating point exception (SIGFPE), not a Python exception? > > Py_Initialize(); > PyObject* module = PyImport_ImportModule("numpy"); > Py_DECREF(module); > > > I'm running Ubuntu Breezy on a dual processor Dell machine, with the > stock python and numpy 0.9.6. One strange thing is that I haven't been > able to reproduce the crash by writing a minimal C program with the > code above, it only crashes when added to my program. Does your program change error bits on the FPU or SSE units on your processor? (What processor are you using?) > I've been embedding Python for ages on the same program and other > modules work fine, only numpy fails. Most other modules don't use the SSE units, so wouldn't get hit by such a bug. > > I've debugged the issue a bit and I've seen that the exception is > thrown when the numpy __init__.py tries to import the core module. The > GDB backtrace is pasted at the end. > Any idea what may be going wrong? glibc 2.3.2 (e.g. in debian sarge) has a bug where the SSE unit has an error bit set wrong. But I'd guess Ubuntu isn't using this version of glibc, so I think the problem may be elsewhere. http://sources.redhat.com/bugzilla/show_bug.cgi?id=10 From strawman at astraw.com Wed Apr 19 11:30:10 2006 From: strawman at astraw.com (Andrew Straw) Date: Wed Apr 19 11:30:10 2006 Subject: [Numpy-discussion] Floating point exception with numpy and embedded python interpreter In-Reply-To: References: <81934aa60604191029h4d8a8d9bl550fa58cc67d3d5e@mail.gmail.com> <44467576.1020708@astraw.com> Message-ID: <4446819D.3030401@astraw.com> Arkaitz Bitorika wrote: > > On 19 Apr 2006, at 18:37, Andrew Straw wrote: > >> >>> I've been embedding Python for ages on the same program and other >>> modules work fine, only numpy fails. >> >> >> Most other modules don't use the SSE units, so wouldn't get hit by such >> a bug. > > > Is there a way of not using those units from numpy, to check if > that's what's going on? I think that numpy only accesses the SSE units through ATLAS or other external library. So, build numpy without ATLAS. But I'm not 100% sure anymore if there aren't any optimizations that directly use SSE if it's available. > Or alternatively, how would I check if my program is messing with the > SSE bits? Hmm, I think that's a bit hairy. 
I'd suggest simply asking the C++ library's mailing list if they alter the error bits on the control registers of the SSE unit. (Out of curiosity, what library is it?) If you want hairy, though, I think you'd have to check from C with the appropriate calls -- I'd start with the source code in that bug report. It looks like they're inlining an assembly statement to query an SSE control register. From faltet at xot.carabos.com Wed Apr 19 14:49:02 2006 From: faltet at xot.carabos.com (faltet at xot.carabos.com) Date: Wed Apr 19 14:49:02 2006 Subject: [Numpy-discussion] Performance problems with strided arrays in NumPy In-Reply-To: <4445A822.60207@ee.byu.edu> References: <20060414213511.GA14355@xot.carabos.com> <4445A822.60207@ee.byu.edu> Message-ID: <20060419214814.GA21524@xot.carabos.com> On Tue, Apr 18, 2006 at 09:01:54PM -0600, Travis Oliphant wrote: > faltet at xot.carabos.com wrote: > The source of this slowness is the use in numarray of special-cases for > certain-sized byte-copies. > > Apparently, it is *much* faster to do > > ((double *)dst)[0] = ((double *)src)[0] > > when you have aligned data than it is to do > > memmove(dst, src, sizeof(double)) Mmm.. very interesting. > My timings for your benchmark with current SVN of NumPy are: > > NumPy: [0.021701812744140625, 0.021739959716796875, 0.021548032760620117] > Numarray: [0.052516937255859375, 0.052685976028442383, 0.052355051040649414] Well, on my machine and using the numpy SVN version: numpy: [0.0974161624908447, 0.0621590614318847, 0.0612149238586425] numarray: [0.0658359527587890, 0.0623040199279785, 0.0627131462097167] So, numpy and numarray exhibit the same performance now (it's curious why you are actually getting better performance on your platform). However: In [25]: stnac=timeit.Timer('b=a.copy()','import numarray as np; a=np.arange(1000000,dtype="complex128")[::10]') In [26]: stnpc=timeit.Timer('b=a.copy()','import numpy as np; a=np.arange(1000000,dtype="complex128")[::10]') In [27]: stnac.repeat(3,10) Out[27]: [0.11303496360778809, 0.11540508270263672, 0.11556506156921387] In [28]: stnpc.repeat(3,10) Out[28]: [0.21353006362915039, 0.21468400955200195, 0.21390914916992188] So, it seems that you forgot to optimize the complex types. Fortunately, the cure is easy; after adding the attached patch I'm getting: In [3]: stnpc.repeat(3,10) Out[3]: [0.10468602180480957, 0.10204982757568359, 0.10242295265197754] so, good performance for numpy in copying strided complex128 is achieved as well. Thanks for looking into this! Francesc ====================================================================== --- numpy/core/src/arrayobject.c (revision 2381) +++ numpy/core/src/arrayobject.c (working copy) @@ -629,6 +629,14 @@ char *tout = dst; char *tin = src; switch(elsize) { + case 16: + for (i=0; i References: <81934aa60604191029h4d8a8d9bl550fa58cc67d3d5e@mail.gmail.com> <44467576.1020708@astraw.com> <4446819D.3030401@astraw.com> Message-ID: <20060420091351.475439ab.simon@arrowtheory.com> On Wed, 19 Apr 2006 11:29:49 -0700 Andrew Straw wrote: > > > > > Is there a way of not using those units from numpy, to check if > > that's what's going on? > > I think that numpy only accesses the SSE units through ATLAS or other > external library. So, build numpy without ATLAS. But I'm not 100% sure > anymore if there aren't any optimizations that directly use SSE if it's > available. We had to disable atlas-sse on our debian system for these exact reasons. Simon. -- Simon Burton, B.Sc. Licensed PO Box 8066 ANU Canberra 2601 Australia Ph.
61 02 6249 6940 http://arrowtheory.com From tom.denniston at alum.dartmouth.org Wed Apr 19 17:17:18 2006 From: tom.denniston at alum.dartmouth.org (Tom Denniston) Date: Wed Apr 19 17:17:18 2006 Subject: [Numpy-discussion] LAPACK question building numpy Message-ID: Is there a way to pass a command line argument to setup.py for numpy that does the equivalent of a make using the flags: -L/home/tdennist/lib -lmkl_lapack -lmkl_lapack32 -lmkl_ia32 -lmkl -lguide All I can find on the subject is a page on the scipy wiki that says to use the variable LAPACK and set it to a .a file. When I do so I get undefined symbol problems. I think this is probably really obvious and documented somewhere but I haven't been able to find it. I don't really know where to look. --Tom From strawman at astraw.com Wed Apr 19 18:59:03 2006 From: strawman at astraw.com (Andrew Straw) Date: Wed Apr 19 18:59:03 2006 Subject: [Numpy-discussion] Floating point exception with numpy and embedded python interpreter In-Reply-To: <20060420091351.475439ab.simon@arrowtheory.com> References: <81934aa60604191029h4d8a8d9bl550fa58cc67d3d5e@mail.gmail.com> <44467576.1020708@astraw.com> <4446819D.3030401@astraw.com> <20060420091351.475439ab.simon@arrowtheory.com> Message-ID: <4446EAB9.7010209@astraw.com> Simon Burton wrote: >On Wed, 19 Apr 2006 11:29:49 -0700 >Andrew Straw wrote: > > > >>>Is there a way of not using those units from numpy, to check if >>>that's what's going on? >>> >>> >>I think that numpy only accesses the SSE units through ATLAS or other >>external library. So, build numpy without ATLAS. But I'm not 100% sure >>anymore if there aren't any optimizations that directly use SSE if it's >>available. >> >> > >We had to disable atlas-sse on our debian system for these exact >reasons. > > If you're using debian sarge and the problem is your glibc, you can fix it: http://www.its.caltech.edu/~astraw/coding.html#id3 From robert.kern at gmail.com Wed Apr 19 19:43:02 2006 From: robert.kern at gmail.com (Robert Kern) Date: Wed Apr 19 19:43:02 2006 Subject: [Numpy-discussion] Re: LAPACK question building numpy In-Reply-To: References: Message-ID: Tom Denniston wrote: > Is there a way to pass a command line argument to setup.py for numpy > that does the equivalent of a make using the flags: > -L/home/tdennist/lib -lmkl_lapack -lmkl_lapack32 -lmkl_ia32 -lmkl -lguide > > All I can find on the subject is a page on the scipy wiki that says to > use the variable LAPACK and set it to a .a file. When I do so I get > undefined symbol problems. > > I think this is probably really obvious and documented somewhere but I > haven't been able to find it. I don't really know where to look. Don't worry, it's not really well documented. Create a file called site.cfg in the root source directory. There's an example site.cfg.example there. Unfortunately, it's pretty sparse at the moment. Now, I'm not terribly familiar with the MKL, so I don't know what libraries do what, but here is my guess at the appropriate things you will need in site.cfg: [DEFAULT] library_dirs=/home/tdennist/lib:/some/other/path/perhaps include_dirs=/home/tdennist/include [blas_opt] libraries=whatever_the_mkl_blas_lib_is,mkl_ia32,mkl,guide [lapack_opt] libraries=mkl_lapack,mkl_lapack32,mkl_ia32,mkl,guide There's some more documentation in numpy/distutils/system_info.py . -- Robert Kern robert.kern at gmail.com "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth."
-- Umberto Eco From faltet at xot.carabos.com Wed Apr 19 19:46:03 2006 From: faltet at xot.carabos.com (faltet at xot.carabos.com) Date: Wed Apr 19 19:46:03 2006 Subject: [Numpy-discussion] Performance problems with strided arrays in NumPy In-Reply-To: <20060419214814.GA21524@xot.carabos.com> References: <20060414213511.GA14355@xot.carabos.com> <4445A822.60207@ee.byu.edu> <20060419214814.GA21524@xot.carabos.com> Message-ID: <20060420024510.GA21987@xot.carabos.com> On Wed, Apr 19, 2006 at 09:48:14PM +0000, faltet at xot.carabos.com wrote: > On Tue, Apr 18, 2006 at 09:01:54PM -0600, Travis Oliphant wrote: > > Apparently, it is *much* faster to do > > > > ((double *)dst)[0] = ((double *)src)[0] > > > > when you have aligned data than it is to do > > > > memmove(dst, src, sizeof(double)) > > Mmm.. very interesting. A follow-up on this. After analyzing the issue somewhat, it seems that the problem with the memmove() version was not the call itself, but the parameter that was passed as the number of bytes to copy. As this was a parameter whose value was unknown at compile time, the compiler cannot generate optimized code for it and always has to fetch its value from memory (or cache). In the version of the code that you optimized, you managed to avoid this because you are telling the compiler (i.e. specifying at compile time) the exact extent of the data copy, allowing it to generate optimum code for the copy operation. However, if you do a similar thing but using the call (using doubles here): memcpy(tout, tin, 8); instead of: ((Float64 *)tout)[0] = ((Float64 *)tin)[0]; and repeat the operation for the other types, then you can achieve similar performance to the pointer version. On the other hand, I see that you have disabled the optimization for unaligned data through the use of a check. Is there any reason for doing that? If I remove this check, I can achieve similar performance to numarray (a bit better, in fact). I'm attaching a small benchmark script that compares the performance of copying a 1D vector of 1 million elements in contiguous, strided (2 and 10), and strided (2 and 10 again) & unaligned flavors. The results for my machine (p4 at 2 GHz) are: For the original numpy code (i.e. before Travis's optimization): time for numpy contiguous --> 0.234 time for numarray contiguous --> 0.229 time for numpy strided (2) --> 1.605 time for numarray strided (2) --> 0.263 time for numpy strided (10) --> 1.72 time for numarray strided (10) --> 0.264 time for numpy strided (2) & unaligned--> 1.736 time for numarray strided (2) & unaligned--> 0.402 time for numpy strided (10) & unaligned--> 1.872 time for numarray strided (10) & unaligned--> 0.435 where you can see that, for 1e6 elements, the slowdown of original numpy is almost 7x (!). Remember that in the previous benchmarks sent here the slowdown was 3x, but we were copying 10 times less data. For the pointer optimised code (i.e. the current SVN version): time for numpy contiguous --> 0.238 time for numarray contiguous --> 0.232 time for numpy strided (2) --> 0.214 time for numarray strided (2) --> 0.264 time for numpy strided (10) --> 0.299 time for numarray strided (10) --> 0.262 time for numpy strided (2) & unaligned--> 1.736 time for numarray strided (2) & unaligned--> 0.401 time for numpy strided (10) & unaligned--> 1.874 time for numarray strided (10) & unaligned--> 0.433 here you can see that your figures are very similar to numarray except for unaligned data (4x slower).
For the pointer optimised code but releasing the unaligned data check: time for numpy contiguous --> 0.236 time for numarray contiguous --> 0.231 time for numpy strided (2) --> 0.213 time for numarray strided (2) --> 0.262 time for numpy strided (10) --> 0.297 time for numarray strided (10) --> 0.261 time for numpy strided (2) & unaligned--> 0.263 time for numarray strided (2) & unaligned--> 0.403 time for numpy strided (10) & unaligned--> 0.452 time for numarray strided (10) & unaligned--> 0.432 Ei! numpy is very similar to numarray in all cases, except for the strided with 2 elements and unaligned case, where numpy performs about 50% better. Finally, and just for showing the effect of providing memcpy with size information at compile time, the numpy code using memcpy() with this optimization on (and disabling the alignment check, of course!): time for numpy contiguous --> 0.234 time for numarray contiguous --> 0.233 time for numpy strided (2) --> 0.223 time for numarray strided (2) --> 0.262 time for numpy strided (10) --> 0.285 time for numarray strided (10) --> 0.262 time for numpy strided (2) & unaligned--> 0.261 time for numarray strided (2) & unaligned--> 0.401 time for numpy strided (10) & unaligned--> 0.42 time for numarray strided (10) & unaligned--> 0.436 you can see that the figures are very similar to the previous case. So Travis, you may want to use the pointer indirection approach or the memcpy() one, whichever you prefer. Well, I just wanted to point this out. Time for sleep! Francesc -------------- next part -------------- A non-text attachment was scrubbed... Name: bench-copy.py Type: text/x-python Size: 2054 bytes Desc: not available URL: From tim.hochberg at cox.net Wed Apr 19 19:57:06 2006 From: tim.hochberg at cox.net (Tim Hochberg) Date: Wed Apr 19 19:57:06 2006 Subject: [Numpy-discussion] Summer of Code ideas Message-ID: <4446F8D8.40909@cox.net> Discussing ideas for summer of code projects seems to be all the rage right now on various other Python lists, so I thought I'd throw out a few that I've had. There are several different things that could be done with numexpr including: 1. Adding broadcasting. 2. Coercing arrays a chunk at a time instead of all at once when coercion is necessary. 3. Fancier syntax. I think that some variant of the following could be made to work: with deferred_evaluation: # Converts everything in local namespace to special objects # all of these math operations are deferred a = 5 + b*32 c = a + 73 # Now all objects are restored and deferred expressions are evaluated. This might be cool or it might be useless, but it sounds fun to try. I haven't talked to David Cooke about any of these and since numexpr is really his project he should be consulted before anyone tries these. There's also some stuff to be done on the basearray front. I expect I'll have the actual basearray object together in the next couple of weeks depending on my level of busyness, but there'll be a lot of other stuff to do besides just that. My general plan is to build a toolkit around basearray that can be used to build other array packages. These packages might be lighter weight than numpy or they might be specialized in some way that's not really compatible with numpy and ndarray. There's also room for experimentation with protocols / generic functions. If anyone's interested I suggest you read the thread (currently dormant) on python-3000.devel on this topic.
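To make the "generic functions" idea concrete, here is a toy single-dispatch version -- purely a sketch, far simpler than anything discussed on that thread, and new-style classes only:

registry = {}

def implement(typ):
    # Decorator: register func as the implementation for instances of typ.
    def register(func):
        registry[typ] = func
        return func
    return register

def generic(obj, *args):
    # Dispatch on the type of the first argument, walking the MRO so that
    # subclasses pick up their base class's implementation.
    for typ in type(obj).__mro__:
        if typ in registry:
            return registry[typ](obj, *args)
    raise TypeError("no implementation for %s" % type(obj).__name__)

An asarray built on a mechanism like this would be openly extensible: third parties could register converters for their own array-like types instead of numpy hard-coding the possibilities.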
There are lots of possible applications for this in numpy including using them to implement or replace: * asarray * __array_priority__ (by making the ufuncs and thus __add__, etc overloaded functions). * __array__, __array_wrap__, etc. * all the various functions that are giving us trouble with MA. * probably a bunch of other stuff. The basic basearray toolkit I mentioned above would be a good place to experiment with stuff like this, once it exists, since in theory it will be simpler than the full numpy codebase and you don't have to worry so much about backwards compatibility. Anyway, that's a bunch of random ideas that I at least find interesting. Regards, -tim From oliphant at ee.byu.edu Wed Apr 19 20:44:02 2006 From: oliphant at ee.byu.edu (Travis Oliphant) Date: Wed Apr 19 20:44:02 2006 Subject: [Numpy-discussion] Performance problems with strided arrays in NumPy In-Reply-To: <20060420024510.GA21987@xot.carabos.com> References: <20060414213511.GA14355@xot.carabos.com> <4445A822.60207@ee.byu.edu> <20060419214814.GA21524@xot.carabos.com> <20060420024510.GA21987@xot.carabos.com> Message-ID: <44470255.302@ee.byu.edu> faltet at xot.carabos.com wrote: >On Wed, Apr 19, 2006 at 09:48:14PM +0000, faltet at xot.carabos.com wrote: > > >>On Tue, Apr 18, 2006 at 09:01:54PM -0600, Travis Oliphant wrote: >> >> >>>Apparently, it is *much* faster to do >>> >>>((double *)dst)[0] = ((double *)src)[0] >>> >>>when you have aligned data than it is to do >>> >>>memmove(dst, src, sizeof(double)) >>> >>> >>Mmm.. very interesting. >> >> > >A follow-up on this. After analyzing somewhat the issue, it seems that >the problem with the memcpy() version was not the call itself, but the >parameter that was passed as the number of bytes to copy. As this was a >parameter whose value was unknown in compile time, the compiler cannot >generate optimized code for it and always has to fetch its value from >memory (or cache). > > >In the version of the code that you optimized, you managed to do this >because you are telling to the compiler (i.e. specifying at compile >time) the exact extend of the data copy, so allowing it to generate >optimum code for the copy operation. However, if you do a similar >thing but using the call (using doubles here): > >memcpy(tout, tin, 8); > >instead of: > >((Float64 *)tout)[0] = ((Float64 *)tin)[0]; > >and repeat the operation for the other types, then you can achieve >similar performance than the pointer version. > > This is good to know. It certainly makes sense. I'll test it on my system when I get back. >On another hand, I see that you have disabled the optimization for >unaligned data through the use of a check. Is there any reason for >doing that? If I remove this check, I can achieve similar performance >than for numarray (a bit better, in fact). > > The only reason was to avoid pointer dereferencing on misaligned data (dereferencing a misaligned pointer causes bus errors on Solaris). But, if we can achieve it with a memmove, then there is no reason to limit the code. >I'm attaching a small benchmark script that compares the performance >of copying a 1D vector of 1 million of elements in contiguous, strided >(2 and 10), and strided (2 and 10 again) & unaligned flavors. The >results for my machine (p4 at 2 GHz) are: > >For the original numpy code (i.e. 
before Travis optimization): > >time for numpy contiguous --> 0.234 >time for numarray contiguous --> 0.229 >time for numpy strided (2) --> 1.605 >time for numarray strided (2) --> 0.263 >time for numpy strided (10) --> 1.72 >time for numarray strided (10) --> 0.264 >time for numpy strided (2) & unaligned--> 1.736 >time for numarray strided (2) & unaligned--> 0.402 >time for numpy strided (10) & unaligned--> 1.872 >time for numarray strided (10) & unaligned--> 0.435 > >where you can see that, for 1e6 elements the slowdown of original >numpy is almost 7x (!). Remember that in the previous benchmarks sent >here the slowdown was 3x, but we were copying 10 times less data. > >For the pointer optimised code (i.e. the current SVN version): > >time for numpy contiguous --> 0.238 >time for numarray contiguous --> 0.232 >time for numpy strided (2) --> 0.214 >time for numarray strided (2) --> 0.264 >time for numpy strided (10) --> 0.299 >time for numarray strided (10) --> 0.262 >time for numpy strided (2) & unaligned--> 1.736 >time for numarray strided (2) & unaligned--> 0.401 >time for numpy strided (10) & unaligned--> 1.874 >time for numarray strided (10) & unaligned--> 0.433 > >here you can see that your figures are very similar to numarray except >for unaligned data (4x slower). > >For the pointer optimised code but releasing the unaligned data check: > >time for numpy contiguous --> 0.236 >time for numarray contiguous --> 0.231 >time for numpy strided (2) --> 0.213 >time for numarray strided (2) --> 0.262 >time for numpy strided (10) --> 0.297 >time for numarray strided (10) --> 0.261 >time for numpy strided (2) & unaligned--> 0.263 >time for numarray strided (2) & unaligned--> 0.403 >time for numpy strided (10) & unaligned--> 0.452 >time for numarray strided (10) & unaligned--> 0.432 > >Ei! numpy is very similar to numarray in all cases, except for the >strided with 2 elements and unaligned case, where numpy performs a 50% >better. > >Finally, and just for showing the effect of providing memcpy with size >information in compilation time, the numpy code using memcpy() with >this optimization on (and disabling the alignment check, of course!): > >time for numpy contiguous --> 0.234 >time for numarray contiguous --> 0.233 >time for numpy strided (2) --> 0.223 >time for numarray strided (2) --> 0.262 >time for numpy strided (10) --> 0.285 >time for numarray strided (10) --> 0.262 >time for numpy strided (2) & unaligned--> 0.261 >time for numarray strided (2) & unaligned--> 0.401 >time for numpy strided (10) & unaligned--> 0.42 >time for numarray strided (10) & unaligned--> 0.436 > >you can see that the figures are very similar to the previous case. So >Travis, you may want to use the pointer indirection approach or the >memcpy() one, whichever you prefer. > >Well, I just wanted to point this out. Time for sleep! > > > Very, very useful information. 1000 Thank you's for talking the time to investigate and assemble it. Do you think the memmove would work similarly? -Travis From tom.denniston at alum.dartmouth.org Thu Apr 20 08:07:04 2006 From: tom.denniston at alum.dartmouth.org (Tom Denniston) Date: Thu Apr 20 08:07:04 2006 Subject: [Numpy-discussion] Re: LAPACK question building numpy In-Reply-To: References: Message-ID: Thanks for your help. I will try this. 
--Tom On 4/19/06, Robert Kern wrote: > Tom Denniston wrote: > > Is there a way to pass a command line argument to setup.py for numpy > > that does the equivalent of a make using the flags: > > -L/home/tdennist/lib -lmkl_lapack -lmkl_lapack32 -lmkl_ia32 -lmkl -lguide > > > > All I can find on the subject is a page on the scipy wiki that says to > > use the variable LAPACK and set it to a .a file. When I do so I get > > undefined symbol problems. > > > > I think this is probably really obvious and documented somewhere but I > > haven't been able to find it. I don't really know where to look. > > Don't worry, it's not really well documented. Create a file called site.cfg in > the root source directory. There's an example site.cfg.example there. > Unfortunately, it's pretty sparse at the moment. Now, I'm not terribly familiar > with the MKL, so I don't know what libraries do what, but here is my guess at > the appropriate things you will need in site.cfg: > > [DEFAULT] > library_dirs=/home/tdennist/lib:/some/other/path/perhaps > include_dirs=/home/tdennist/include > > [blas_opt] > libraries=whatever_the_mkl_blas_lib_is,mkl_ia32,mkl,guide > > [lapack_opt] > libraries=mkl_lapack,mkl_lapack32,mkl_ia32,mkl,guide > > There's some more documentation in numpy/distutils/system_info.py . > > -- > Robert Kern > robert.kern at gmail.com > > "I have come to believe that the whole world is an enigma, a harmless enigma > that is made terrible by our own mad attempt to interpret it as though it had > an underlying truth." > -- Umberto Eco > > > > ------------------------------------------------------- > Using Tomcat but need to do more? Need to support web services, security? > Get stuff done quickly with pre-integrated technology to make your job easier > Download IBM WebSphere Application Server v.1.0.1 based on Apache Geronimo > http://sel.as-us.falkag.net/sel?cmd=lnk&kid=120709&bid=263057&dat=121642 > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/numpy-discussion > From faltet at xot.carabos.com Thu Apr 20 09:42:04 2006 From: faltet at xot.carabos.com (faltet at xot.carabos.com) Date: Thu Apr 20 09:42:04 2006 Subject: [Numpy-discussion] Performance problems with strided arrays in NumPy In-Reply-To: <44470255.302@ee.byu.edu> References: <20060414213511.GA14355@xot.carabos.com> <4445A822.60207@ee.byu.edu> <20060419214814.GA21524@xot.carabos.com> <20060420024510.GA21987@xot.carabos.com> <44470255.302@ee.byu.edu> Message-ID: <20060420164132.GA23763@xot.carabos.com> On Wed, Apr 19, 2006 at 09:39:01PM -0600, Travis Oliphant wrote: >>On another hand, I see that you have disabled the optimization for >>unaligned data through the use of a check. Is there any reason for >>doing that? If I remove this check, I can achieve similar performance >>than for numarray (a bit better, in fact). > >The only reason was to avoid pointer dereferencing on misaligned data >(dereferencing a misaligned pointer causes bus errors on Solaris). >But, if we can achieve it with a memmove, then there is no reason to >limit the code. I see. Well, I've tried out memmove instead of memcpy, and I can reproduce the same slowdown that was seen prior to your pointer-addressing optimisation. I'm afraid that Sasha was right in that memmove's check for overlapping source and destination areas is responsible for this.
Having said that, and although I must admit that I don't know in depth the different situations under which the source of a copy may overlap the destination, my guess is that for the typical element sizes (i.e. [1], 2, 4, 8 and 16) for which the optimization has been done, there is no harm in using memcpy instead of memmove (admittedly, you may come up with a counter-example of this, but I do hope you don't). In any case, the use of memcpy is completely equivalent to the current optimization using pointers except that, hopefully, pointer addressing is not made on unaligned data. So, perhaps using the memcpy approach on Solaris (under Sparc I guess) may avoid the bus errors. It would be nice if anyone with access to such a platform could confirm this point. I'm attaching a patch for current SVN numpy that uses the memcpy approach. Feel free to try it against the benchmarks (also attached). One last word, I've added a case for typesize 1 in addition to the existing ones as this effectively improves the speed for 1-byte types. Below are the speeds without the 1-byte case optimisation: time for numpy contiguous --> 0.03 time for numarray contiguous --> 0.062 time for numpy strided (2) --> 0.078 time for numarray strided (2) --> 0.064 time for numpy strided (10) --> 0.081 time for numarray strided (10) --> 0.07 I haven't added a case for the unaligned case because this makes no sense for 1-byte sized types. and here with the 1-byte case optimisation added: time for numpy contiguous --> 0.03 time for numarray contiguous --> 0.062 time for numpy strided (2) --> 0.054 time for numarray strided (2) --> 0.065 time for numpy strided (10) --> 0.061 time for numarray strided (10) --> 0.07 you can notice a speed-up of between 30% and 45% over the previous case. Cheers, -------------- next part -------------- --- numpy/core/src/arrayobject.c (revision 2381) +++ numpy/core/src/arrayobject.c (working copy) @@ -628,28 +628,44 @@ intp i, j; char *tout = dst; char *tin = src; + /* For typical datasizes, the memcpy call is much faster than memmove + and perfectely safe */ switch(elsize) { + case 16: + for (i=0; ind) == src->nd && (nd > 0) && + if (!swap && (nd = dest->nd) == src->nd && (nd > 0) && PyArray_CompareLists(dest->dimensions, src->dimensions, nd)) { int maxaxis=0, maxdim=dest->dimensions[0]; int i; -------------- next part -------------- A non-text attachment was scrubbed... Name: bench-copy.py Type: text/x-python Size: 2053 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: bench-copy1.py Type: text/x-python Size: 1168 bytes Desc: not available URL: From rng7 at cornell.edu Thu Apr 20 13:49:13 2006 From: rng7 at cornell.edu (Ryan Gutenkunst) Date: Thu Apr 20 13:49:13 2006 Subject: [Numpy-discussion] Bypassing a[2].item()? Message-ID: <4447F397.7010006@cornell.edu> Hi all, I'm porting some code from old scipy to new scipy, and I've run into a rather large performance problem. The heart of the code is integrating a system of nonlinear differential equations using odeint. The function that dominates the time to run calculates the right hand side, given a current state x. (len(x) ~ 50.) Abstracted, the function looks like: def rhs(x): output = scipy.zeros(10, scipy.Float) a = x[0] b = x[1] ... output[0] = a/b + c*sqrt(d)... output[1] = b-a + 2*b... ... return output (I copy the elements of the current state to local variables to avoid the cost of repeatedly calling x.__getitem__, and to make the resulting equations easier to read.)
When using numpy, a and b are now array scalars and the arithmetic is much slower, resulting in about a factor of 10 increase in runtimes from those using Numeric. I've tried doing: a = x[0].item(), which allows the arithmetic to be done on pure scalars. This is a little faster, but still results in a factor of 3 increase in runtime from old scipy. I imagine the slowdown comes from having to call __getitem__() followed by item(). So questions: 1) I haven't followed the details of the array scalar discussions. Is it anticipated that array scalar arithmetic will eventually be as fast as arithmetic in native python types? 2) If not, is it possible to get a "pure" scalar directly from an array in one function call? Thanks for any help, Ryan -- Ryan Gutenkunst | Cornell LASSP | "It is not the mountain | we conquer but ourselves." Clark 535 / (607)227-7914 | -- Sir Edmund Hillary AIM: JepettoRNG | http://www.physics.cornell.edu/~rgutenkunst/ From robert.kern at gmail.com Thu Apr 20 14:20:02 2006 From: robert.kern at gmail.com (Robert Kern) Date: Thu Apr 20 14:20:02 2006 Subject: [Numpy-discussion] Re: Bypassing a[2].item()? In-Reply-To: <4447F397.7010006@cornell.edu> References: <4447F397.7010006@cornell.edu> Message-ID: Ryan Gutenkunst wrote: > So questions: > 1) I haven't followed the details of the array scalar discussions. Is it > anticipated that array scalar arithmetic will eventually be as fast as > arithmetic in native python types? More or less, if I'm not mistaken. This ticket is aimed at that: http://projects.scipy.org/scipy/numpy/ticket/55 > 2) If not, is it possible to get a "pure" scalar directly from an array > in one function call? float(x[0]) seems to be faster on my PowerBook. -- Robert Kern robert.kern at gmail.com "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From rng7 at cornell.edu Thu Apr 20 15:21:11 2006 From: rng7 at cornell.edu (Ryan Gutenkunst) Date: Thu Apr 20 15:21:11 2006 Subject: [Numpy-discussion] Re: Bypassing a[2].item()? In-Reply-To: References: <4447F397.7010006@cornell.edu> Message-ID: <9b9f0633c5a242a6ab8a199708c8dd94@cornell.edu> On Apr 20, 2006, at 5:18 PM, Robert Kern wrote: > Ryan Gutenkunst wrote: > >> So questions: >> 1) I haven't followed the details of the array scalar discussions. Is >> it >> anticipated that array scalar arithmetic will eventually be as fast as >> arithmetic in native python types? > > More or less, if I'm not mistaken. This ticket is aimed at that: > > http://projects.scipy.org/scipy/numpy/ticket/55 Good to hear. >> 2) If not, is it possible to get a "pure" scalar directly from an >> array >> in one function call? > > float(x[0]) seems to be faster on my PowerBook. It's faster for me, too, but float(x[0]) is still much slower than using Numeric where x[0] suffices. I guess I'll just have to warn my users away from the new scipy until numpy 0.9.8 comes out and scalar math is sped up. Cheers, Ryan -- Ryan Gutenkunst | Cornell Dept. of Physics | "It is not the mountain | we conquer but ourselves." Clark 535 / (607)255-6068 | -- Sir Edmund Hillary AIM: JepettoRNG | http://www.physics.cornell.edu/~rgutenkunst/ From robert.kern at gmail.com Thu Apr 20 16:22:09 2006 From: robert.kern at gmail.com (Robert Kern) Date: Thu Apr 20 16:22:09 2006 Subject: [Numpy-discussion] Re: Bypassing a[2].item()?
In-Reply-To: <9b9f0633c5a242a6ab8a199708c8dd94@cornell.edu> References: <4447F397.7010006@cornell.edu> <9b9f0633c5a242a6ab8a199708c8dd94@cornell.edu> Message-ID: Ryan Gutenkunst wrote: > On Apr 20, 2006, at 5:18 PM, Robert Kern wrote: > >> Ryan Gutenkunst wrote: >>> 2) If not, is it possible to get a "pure" scalar directly from an array >>> in one function call? >> >> float(x[0]) seems to be faster on my PowerBook. > > It's faster for me, too, but float(x[0]) is still much slower than using > Numeric where x[0] suffices. I guess I'll just have to warn my users > away from the new scipy until numpy 0.9.8 comes out and scalar math is > sped up. For that matter, a plain "x[0]" seems to be about 3x faster with Numeric than numpy. -- Robert Kern robert.kern at gmail.com "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From oliphant at ee.byu.edu Thu Apr 20 20:16:02 2006 From: oliphant at ee.byu.edu (Travis Oliphant) Date: Thu Apr 20 20:16:02 2006 Subject: [Numpy-discussion] Re: Bypassing a[2].item()? In-Reply-To: References: <4447F397.7010006@cornell.edu> <9b9f0633c5a242a6ab8a199708c8dd94@cornell.edu> Message-ID: <44484E44.2050300@ee.byu.edu> Robert Kern wrote: >Ryan Gutenkunst wrote: > > >>On Apr 20, 2006, at 5:18 PM, Robert Kern wrote: >> >> >> >>>Ryan Gutenkunst wrote: >>> >>> > > > >>>>2) If not, is it possible to get a "pure" scalar directly from an array >>>>in one function call? >>>> >>>> >>>float(x[0]) seems to be faster on my PowerBook. >>> >>> >>It's faster for me, too, but float(x[0]) is still much slower than using >>Numeric where x[0] suffices. I guess I'll just have to warn my users >>away from the new scipy until numpy 0.9.8 comes out and scalar math is >>sped up. >> >> > >For that matter, a plain "x[0]" seems to be about 3x faster with Numeric than numpy. > > > We are already special-casing the integer select code but could special-case the getitem code so that if nd==1 a faster construction is used. I think right now a 0-dim array is being created only to get destroyed later on return. Please add a ticket as this extremely common operation should be made as fast as possible. This is a little tricky because array_big_item is called in a few places and is expected to return an array. If it returns a scalar in those places segfaults can occur. Either checks need to be made in each of those cases or the special-casing needs to be in array_big_item_nice. I'm not sure which I prefer.... -Travis From simon at arrowtheory.com Thu Apr 20 23:24:59 2006 From: simon at arrowtheory.com (Simon Burton) Date: Thu Apr 20 23:24:59 2006 Subject: [Numpy-discussion] announce: pyjit, a little jit for creating numpy ufuncs Message-ID: <20060421162336.42285837.simon@arrowtheory.com> Hi, Inspired by numexpr, pypy and llvm, i've built a simple JIT for creating numpy "ufuncs" (they are not yet real ufuncs). It uses llvm[1] as the backend machine code generator. The main things it can do are: *) parse simple python code (function def's) *) generate SSA assembly code for llvm *) build ufunc code for applying to numpy array's When I say simple I mean it: def calc(a,b): c = (a+b)/2.0 return c No control flow or type inference has been implemented. As with numexpr, significant speedups are possible. I'm putting this announce here to see what the other numpy'ers think. $ svn co http://rubis.rsise.anu.edu.au/local/repos/elefant/pyjit bye, Simon. 
[1] http://llvm.org/ -- Simon Burton, B.Sc. Licensed PO Box 8066 ANU Canberra 2601 Australia Ph. 61 02 6249 6940 http://arrowtheory.com From cookedm at physics.mcmaster.ca Fri Apr 21 09:27:00 2006 From: cookedm at physics.mcmaster.ca (David M. Cooke) Date: Fri Apr 21 09:27:00 2006 Subject: [Numpy-discussion] Source release of 0.9.6 on sourceforge is wrong Message-ID: Travis, Looks like you uploaded the bdist .tar.gz of NumPy 0.9.6 to sourceforge, instead of the sdist. The one there isn't the source, it's a binary distribution of a 32-bit Linux compile. It's been over a month, with 2684 downloads, and I can't find a mention that anybody's noticed this before... Have we silently lost people who think we're on crack, or are there 2684 people who haven't looked at what they got? [On another note, the download URL on PyPi won't work with setuptools; I've fixed the setup.py in svn to use the correct one, but if you could fix it on PyPi and set it to http://sourceforge.net/project/showfiles.php?group_id=1369&package_id=175103 then people can use easy_install to install numpy.] -- |>|\/|< /--------------------------------------------------------------------------\ |David M. Cooke http://arbutus.physics.mcmaster.ca/dmc/ |cookedm at physics.mcmaster.ca From cookedm at physics.mcmaster.ca Fri Apr 21 09:30:01 2006 From: cookedm at physics.mcmaster.ca (David M. Cooke) Date: Fri Apr 21 09:30:01 2006 Subject: [Numpy-discussion] Source release of 0.9.6 on sourceforge is wrong In-Reply-To: (David M. Cooke's message of "Fri, 21 Apr 2006 12:25:52 -0400") References: Message-ID: cookedm at physics.mcmaster.ca (David M. Cooke) writes: > Travis, > > Looks like you uploaded the bdist .tar.gz of NumPy 0.9.6 to > sourceforge, instead of the sdist. The one there isn't the source, > it's a binary distribution of a 32-bit Linux compile. Gah! My bad! When I convinced easy_install to grab the source, it grabbed numpy-0.9.6-py2.4-linux-i686.tar.gz instead, which of course is a binary package. *why* it grabbed that one is another story (that's not my platform! I'm on py2.4-linux-x86_64). -- |>|\/|< /--------------------------------------------------------------------------\ |David M. Cooke http://arbutus.physics.mcmaster.ca/dmc/ |cookedm at physics.mcmaster.ca From ndarray at mac.com Fri Apr 21 09:35:02 2006 From: ndarray at mac.com (Sasha) Date: Fri Apr 21 09:35:02 2006 Subject: [Numpy-discussion] Source release of 0.9.6 on sourceforge is wrong In-Reply-To: References: Message-ID: I've downloaded numpy-0.9.6.tar.gz from SF about a month ago and it was fine: > tar tzf ~/Archives/numpy-0.9.6.tar.gz numpy-0.9.6/ numpy-0.9.6/numpy/ numpy-0.9.6/numpy/core/ numpy-0.9.6/numpy/core/blasdot/ numpy-0.9.6/numpy/core/blasdot/_dotblas.c numpy-0.9.6/numpy/core/blasdot/cblas.h ... On 4/21/06, David M. Cooke wrote: > Travis, > > Looks like you uploaded the bdist .tar.gz of NumPy 0.9.6 to > sourceforge, instead of the sdist. The one there isn't the source, > it's a binary distribution of a 32-bit Linux compile.
> > It's been over a month, with 2684 downloads, and I can't find a > mention that anybody's noticed this before... Have we silently lost > people who think we're on crack, or are there 2684 people who haven't > looked at what they got? > > [On another note, the download URL on PyPi won't work with > setuptools; I've fixed the setup.py in svn to use the correct one, but > if you could fix it on PyPi and set it to > http://sourceforge.net/project/showfiles.php?group_id=1369&package_id=175103 > then people can use easy_install to install numpy.] > > -- > |>|\/|< > /--------------------------------------------------------------------------\ > |David M. Cooke http://arbutus.physics.mcmaster.ca/dmc/ > |cookedm at physics.mcmaster.ca

From bsouthey at gmail.com Fri Apr 21 10:35:02 2006 From: bsouthey at gmail.com (Bruce Southey) Date: Fri Apr 21 10:35:02 2006 Subject: [Numpy-discussion] Source release of 0.9.6 on sourceforge is wrong In-Reply-To: References: Message-ID: Hi,

I concur, as I downloaded and installed it yesterday (April 20) afternoon (from my ls -l): 2006-04-20 13:38 numpy-0.9.6.tar.gz

I had no problems installing that version, as the import numpy appeared to work.

Regards Bruce

On 4/21/06, Sasha wrote: > I've downloaded numpy-0.9.6.tar.gz from SF about a month ago and it was fine: > > > tar tzf ~/Archives/numpy-0.9.6.tar.gz > numpy-0.9.6/ > numpy-0.9.6/numpy/ > numpy-0.9.6/numpy/core/ > numpy-0.9.6/numpy/core/blasdot/ > numpy-0.9.6/numpy/core/blasdot/_dotblas.c > numpy-0.9.6/numpy/core/blasdot/cblas.h > ... > > On 4/21/06, David M. Cooke wrote: > > Travis, > > > > Looks like you uploaded the bdist .tar.gz of NumPy 0.9.6 to > > sourceforge, instead of the sdist. The one there isn't the source, > > it's a binary distribution of a 32-bit Linux compile. > > > > It's been over a month, with 2684 downloads, and I can't find a > > mention that anybody's noticed this before... Have we silently lost > > people who think we're on crack, or are there 2684 people who haven't > > looked at what they got? > > > > [On another note, the download URL on PyPi won't work with > > setuptools; I've fixed the setup.py in svn to use the correct one, but > > if you could fix it on PyPi and set it to > > http://sourceforge.net/project/showfiles.php?group_id=1369&package_id=175103 > > then people can use easy_install to install numpy.] > > > > -- > > |>|\/|< > > /--------------------------------------------------------------------------\ > > |David M. Cooke http://arbutus.physics.mcmaster.ca/dmc/ > > |cookedm at physics.mcmaster.ca

From robert.kern at gmail.com Fri Apr 21 11:28:11 2006 From: robert.kern at gmail.com (Robert Kern) Date: Fri Apr 21 11:28:11 2006 Subject: [Numpy-discussion] Re: Source release of 0.9.6 on sourceforge is wrong In-Reply-To: References: Message-ID: David M. Cooke wrote: > cookedm at physics.mcmaster.ca (David M. Cooke) writes: > >>Travis, >> >>Looks like you uploaded the bdist .tar.gz of NumPy 0.9.6 to >>sourceforge, instead of the sdist. The one there isn't the source, >>it's a binary distribution of a 32-bit Linux compile. > > Gah! My bad! When I convinced easy_install to grab the source, it > grabbed numpy-0.9.6-py2.4-linux-i686.tar.gz instead, which of course is a > binary package. > > *why* it grabbed that one is another story (that's not my platform! > I'm on py2.4-linux-x86_64).

Phillip Eby tells me that the bdist_dumb packages there confuse some versions of setuptools. He fixed it this morning.

-- Robert Kern robert.kern at gmail.com "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco

From faltet at xot.carabos.com Fri Apr 21 13:56:04 2006 From: faltet at xot.carabos.com (faltet at xot.carabos.com) Date: Fri Apr 21 13:56:04 2006 Subject: [Numpy-discussion] numexpr enhancements Message-ID: <20060421205530.GA25020@xot.carabos.com> Hi,

After looking at the numpy performance issues on strided and unaligned data, I decided to have a try at the numexpr package and finally implemented better support for them. As a result, numexpr can now reach a 2x performance improvement for simple expressions, like 'a>2.'.

Along the way, I've added support for boolean expressions (&, | and ~, as in the where() function), a new boolean data type (important to get better performance on boolean expressions) and support for numarray (maintaining compatibility with numpy, of course).

I've called the new package numexpr 0.2 so as not to confuse it with the existing 0.1. Well, let's hope that numexpr can continue making its way towards integration in numpy.

You can fetch this new package at:

http://www.carabos.com/downloads/divers/numexpr-0.2.tar.gz

Finally, let me say that numexpr is a wonderful toy to get your hands dirty ;-) Many thanks to David (and Tim) for this!

Cheers! Francesc
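To illustrate the kind of expression this speeds up (a sketch: evaluate() is assumed to be numexpr's compile-and-run entry point as discussed on this list, and the arrays here are made up):

    import numpy
    from numexpr import evaluate

    a = numpy.arange(1e6)
    b = numpy.arange(1e6) % 4

    # compiled into a single pass over a and b; the boolean intermediates
    # never become full-size temporary arrays
    mask = evaluate("(a > 2.) & ~(b > 1.)")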
From hetland at tamu.edu Fri Apr 21 15:02:12 2006 From: hetland at tamu.edu (Robert Hetland) Date: Fri Apr 21 15:02:12 2006 Subject: [Numpy-discussion] 'append' array method request. Message-ID: I find myself writing things like

x = []; y = []; t = []
for line in open(filename).readlines():
    xstr, ystr, tstr = line.split()
    x.append(float(xstr))
    y.append(float(ystr))
    t.append(dateutil.parser.parse(tstr)) # or something similar
x = asarray(x)
y = asarray(y)
t = asarray(t)

I think it would be nice to be able to create empty arrays, and append the values onto the end as I loop through the file without creating the intermediate list. Is this reasonable? Is there a way to do this with existing methods or functions that I am missing? Is there a better way altogether?

-Rob.

----- Rob Hetland, Assistant Professor Dept of Oceanography, Texas A&M University p: 979-458-0096, f: 979-845-6331 e: hetland at tamu.edu, w: http://pong.tamu.edu

From robert.kern at gmail.com Fri Apr 21 15:13:07 2006 From: robert.kern at gmail.com (Robert Kern) Date: Fri Apr 21 15:13:07 2006 Subject: [Numpy-discussion] Re: 'append' array method request. In-Reply-To: References: Message-ID: Robert Hetland wrote: > > I find myself writing things like > > x = []; y = []; t = [] > for line in open(filename).readlines(): > xstr, ystr, tstr = line.split() > x.append(float(xstr)) > y.append(float(ystr)) > t.append(dateutil.parser.parse(tstr)) # or something similar > x = asarray(x) > y = asarray(y) > t = asarray(t) > > I think it would be nice to be able to create empty arrays, and append > the values onto the end as I loop through the file without creating the > intermediate list. Is this reasonable?

Not in the core array object, no. We can't make the underlying pointer point to something else (because you've just reallocated the whole memory block to add an item to the array) without invalidating all of the views on that array. This is also the reason that numpy arrays can't use the standard library's array module as its storage.

That said:

> Is there a way to do this with > existing methods or functions that I am missing? Is there a better way > altogether?

We've done performance tests before. The fastest way that I've found is to use the stdlib array module to accumulate values (it uses the same preallocation strategy that Python lists use, and you can't create views from them, so you are always safe) and then create the numpy array using fromstring on that object (stdlib arrays obey the buffer protocol, so they will be treated like strings of binary data). I posted timings one or two or three years ago on one of the scipy lists.

However, lists are fine if you don't need blazing speed/low memory usage.

-- Robert Kern robert.kern at gmail.com "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco
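A sketch of the idiom Robert describes (the stdlib array as the growable buffer, one copy into numpy at the end; 'data.txt' and its one-float-per-line layout are made up for illustration):

    import array
    import numpy

    acc = array.array('d')            # doubles; grows amortized, like a list
    for line in open('data.txt'):     # assumed format: one float per line
        acc.append(float(line))

    # stdlib arrays expose their contents as raw binary data
    x = numpy.fromstring(acc.tostring(), dtype=float)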
From ndarray at mac.com Fri Apr 21 15:20:01 2006 From: ndarray at mac.com (Sasha) Date: Fri Apr 21 15:20:01 2006 Subject: [Numpy-discussion] 'append' array method request. In-Reply-To: References: Message-ID: On 4/21/06, Robert Hetland wrote: > [...] > I think it would be nice to be able to create empty arrays, and > append the values onto the end as I loop through the file without > creating the intermediate list. Is this reasonable? Is there a way > to do this with existing methods or functions that I am missing? Is > there a better way altogether?

Numpy arrays cannot grow in-place because there is no way for an array to tell if its data is shared with other arrays.

You can use Python's standard library arrays instead of lists:

>>> from numpy import *
>>> import array as a
>>> x = a.array('i',[])
>>> x.append(1)
>>> x.append(2)
>>> x.append(3)
>>> ndarray(len(x), dtype=int, buffer=x)
array([1, 2, 3])

Note that data is not copied:

>>> ndarray(len(x), dtype=int, buffer=x)[1] = 20
>>> x
array('i', [1, 20, 3])

From charlesr.harris at gmail.com Fri Apr 21 18:50:02 2006 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri Apr 21 18:50:02 2006 Subject: [Numpy-discussion] 'append' array method request. In-Reply-To: References: Message-ID: Hi,

On 4/21/06, Robert Hetland wrote: > > > I find myself writing things like > > x = []; y = []; t = [] > for line in open(filename).readlines(): > xstr, ystr, tstr = line.split() > x.append(float(xstr)) > y.append(float(ystr)) > t.append(dateutil.parser.parse(tstr)) # or something similar > x = asarray(x) > y = asarray(y) > t = asarray(t)

I think you can read the ascii file directly into an array with numeric conversions (fromfile), then just reshape it to have x,y,z columns. For example:

$[charris at E011704 ~]$ cat input.txt
1 2 3
4 5 6
7 8 9

Then after importing numpy into ipython:

In [6]: fromfile('input.txt',sep=' ').reshape(-1,3)
Out[6]:
array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])

Chuck

From oliphant.travis at ieee.org Fri Apr 21 19:51:07 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Fri Apr 21 19:51:07 2006 Subject: [Numpy-discussion] Re: seterr changes In-Reply-To: <44465DEE.8090703@cox.net> References: <44465DEE.8090703@cox.net> Message-ID: <444999E2.1040009@ieee.org> Tim Hochberg wrote: > > Hi Travis et al, > > I started looking at your seterr changes.

Thank you very much for the help on this. I'm not an expert on threaded code by any means. In fact, as you clearly point out, I don't eat and drink what will work under threaded environments and what won't. Clearly global variables are problematic. That is the problem with the update_use_defaults bit, right? This is the way it was being managed before and I just changed names a bit to use PyThreadState_GetDict for the dictionary (it seems possible to use only from C until Python 2.4).

I say if it only buys 5% on small arrays then it's not worth it, as there are other fish to fry to make up for that 5%, and I agree that tracking down threading problems due to a finagled global variable is sticky. I did not think about the threading issues deeply enough.

> I'm also curious about the seterr interface. It returns > ufunc_values_obj. I wasn't sure how one is supposed to pass that > back into seterr, so I modified seterr to instead return a > dictionary. I also modified it so that the seterr function itself has > no defaults (or rather they're all None). Instead, any unspecified > values are taken from the current error state. Thus > seterr(divide="warn") changes only the divide state, leaving the other > entries alone.

Returning an object is a late-in-the-game idea and should be critiqued. It can be passed to seterr (an attribute check grabs the actual list --- did you want to change it to a dictionary?). Doesn't a small list have faster access than a small dictionary?

I'll look over your commits and comment later if I think of anything...

I'm thrilled with your work.

Best,

-Travis
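Travis's list-versus-dict question is easy to measure directly (an illustrative timing sketch; the mode names mirror the thread, and absolute numbers will vary by machine and Python build):

    import timeit

    setup_list = "modes = ['ignore', 'ignore', 'ignore', 'ignore']"
    setup_dict = ("modes = {'divide': 'ignore', 'over': 'ignore', "
                  "'under': 'ignore', 'invalid': 'ignore'}")

    # index into a 4-element list vs. key lookup in a 4-entry dict
    print timeit.Timer("modes[2]", setup_list).timeit()
    print timeit.Timer("modes['over']", setup_dict).timeit()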
From bitorika at cs.tcd.ie Sat Apr 22 03:18:00 2006 From: bitorika at cs.tcd.ie (bitorika at cs.tcd.ie) Date: Sat Apr 22 03:18:00 2006 Subject: [Numpy-discussion] Floating point exception with numpy and embedded python interpreter In-Reply-To: <4446819D.3030401@astraw.com> References: <81934aa60604191029h4d8a8d9bl550fa58cc67d3d5e@mail.gmail.com> <44467576.1020708@astraw.com> <4446819D.3030401@astraw.com> Message-ID: <35791.134.226.38.190.1145701016.squirrel@webmail.cs.tcd.ie>

>> On 19 Apr 2006, at 18:37, Andrew Straw wrote:

> I think that numpy only accesses the SSE units through ATLAS or other > external library. So, build numpy without ATLAS. But I'm not 100% sure > anymore if there aren't any optimizations that directly use SSE if it's > available.

I've tried getting rid of all atlas, blas and lapack packages in my system and rebuilding numpy to use its own unoptimised lapack_lite, but no luck. Just trying to import numpy with PyImport_ImportModule("numpy") causes the program to crash with just a "Floating point exception" message output.

The program I'm embedding Python in is the NS Network Simulator (http://www.isi.edu/nsnam/ns/). It's a complex C++ beast with its own Object-Tcl interpreter, but it's been working fine with embedded Python except for this numpy crash. I've used Numeric before and it worked fine as well.

I'm lost now regarding what to work on to find a solution; does anyone familiar with numpy internals have any suggestions?

Thanks, Arkaitz
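One quick sanity check that the rebuilt numpy really is free of ATLAS (this assumes the __config__ module that numpy's build machinery generates; the exact output format of that era may differ):

    import numpy
    # prints the blas/lapack/atlas sections recorded when numpy was built;
    # after removing ATLAS these should all read NOT AVAILABLE
    numpy.__config__.show()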
From jordi.bofill at upc.edu Sat Apr 22 09:46:00 2006 From: jordi.bofill at upc.edu (Jordi Bofill) Date: Sat Apr 22 09:46:00 2006 Subject: [Numpy-discussion] Re: Dumping record arrays References: <200603302127.24231.pgmdevlist@mailcan.com> Message-ID: Pierre GM wrote:

> Folks,
> I'd like to dump/pickle some record arrays. The pickling works, the
> unpickling raises a ValueError (on my version of numpy 0.9.6). (cf below).
> Is this already corrected in the svn version?
> Thx
>
> ###########################################################################
> #
>
> x1 = array([21,32,14])
> x2 = array(['my','first','name'])
> x3 = array([3.1, 4.5, 6.2])
> r = rec.fromarrays([x1,x2,x3], names='id, word, number')
>
> r.dump('dumper')
> rb=load('dumper')
> ---------------------------------------------------------------------------
> exceptions.ValueError Traceback (most recent call last)
>
> /home/backtopop/Work/workspace-python/pyflows/src/
>
> /usr/lib64/python2.4/site-packages/numpy/core/numeric.py in load(file)
> 331 if isinstance(file, type("")):
> 332 file = _file(file,"rb")
> --> 333 return _cload(file)
> 334
> 335 # These are all essentially abbreviations
>
> /usr/lib64/python2.4/site-packages/numpy/core/_internal.py in
> _reconstruct(subtype, shape, dtype)
> 251
> 252 def _reconstruct(subtype, shape, dtype):
> --> 253 return ndarray.__new__(subtype, shape, dtype)
> 254
> 255
>
> ValueError: ('data-type with unspecified variable length', <function
> _reconstruct at 0x2aaaafcf1578>, (,
> (0,), 'V'))

I'm a newbie moving from numarray and I also get this error. I tried svn records.py with the same result. Any hope in getting it fixed? The error can be reproduced from the source example:

import numpy.core.records as rec
r=rec.fromrecords([(456,'dbe',1.2),(2,'de',1.3)],names='col1,col2,col3')
import cPickle
print cPickle.loads(cPickle.dumps(r))

---------------------------------------------------------------------------
exceptions.ValueError Traceback (most recent call last)

/home/jordi/temp/

/usr/lib/python2.4/site-packages/numpy/core/_internal.py in _reconstruct(subtype, shape, dtype)
251
252 def _reconstruct(subtype, shape, dtype):
--> 253 return ndarray.__new__(subtype, shape, dtype)
254
255

ValueError: ('data-type with unspecified variable length', , (, (0,), 'V'))

From oliphant.travis at ieee.org Sat Apr 22 10:19:00 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Sat Apr 22 10:19:00 2006 Subject: [Numpy-discussion] Re: Dumping record arrays In-Reply-To: References: <200603302127.24231.pgmdevlist@mailcan.com> Message-ID: <444A653A.9020402@ieee.org> Jordi Bofill wrote:

> Pierre GM wrote:
>
>> Folks,
>> I'd like to dump/pickle some record arrays. The pickling works, the
>> unpickling raises a ValueError (on my version of numpy 0.9.6). (cf below).
>> Is this already corrected in the svn version?
>> Thx
>>
> ###########################################################################
>
>> #
>>
>> x1 = array([21,32,14])
>> x2 = array(['my','first','name'])
>> x3 = array([3.1, 4.5, 6.2])
>> r = rec.fromarrays([x1,x2,x3], names='id, word, number')
>>

This is fixed in SVN (but you have to get more than just the SVN records.py script). The needed change is in the __reduce__ method of the array object (which is in C). A re-compile is needed.

NumPy 0.9.8 should be out in a few weeks.

Best,

-Travis

>> r.dump('dumper')
>> rb=load('dumper')
>> ---------------------------------------------------------------------------
>> exceptions.ValueError Traceback (most recent call last)
>>
>> /home/backtopop/Work/workspace-python/pyflows/src/
>>
>> /usr/lib64/python2.4/site-packages/numpy/core/numeric.py in load(file)
>> 331 if isinstance(file, type("")):
>> 332 file = _file(file,"rb")
>> --> 333 return _cload(file)
>> 334
>> 335 # These are all essentially abbreviations
>>
>> /usr/lib64/python2.4/site-packages/numpy/core/_internal.py in
>> _reconstruct(subtype, shape, dtype)
>> 251
>> 252 def _reconstruct(subtype, shape, dtype):
>> --> 253 return ndarray.__new__(subtype, shape, dtype)
>> 254
>> 255
>>
>> ValueError: ('data-type with unspecified variable length', <function
>> _reconstruct at 0x2aaaafcf1578>, (,
>> (0,), 'V'))
>
> I'm a newbie moving from numarray and I also get this error. I tried svn
> records.py with the same result. Any hope in getting it fixed?
> The error can be reproduced from the source example:
>
> import numpy.core.records as rec
> r=rec.fromrecords([(456,'dbe',1.2),(2,'de',1.3)],names='col1,col2,col3')
> import cPickle
> print cPickle.loads(cPickle.dumps(r))
> ---------------------------------------------------------------------------
> exceptions.ValueError Traceback (most recent call last)
>
> /home/jordi/temp/
>
> /usr/lib/python2.4/site-packages/numpy/core/_internal.py in
> _reconstruct(subtype, shape, dtype)
> 251
> 252 def _reconstruct(subtype, shape, dtype):
> --> 253 return ndarray.__new__(subtype, shape, dtype)
> 254
> 255
>
> ValueError: ('data-type with unspecified variable length', <function
> _reconstruct at 0xb78fce64>, (,
> (0,), 'V'))
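For anyone tracking the fix, the round-trip that the __reduce__ change is meant to restore looks like this (a sketch of the expected post-fix behaviour, not output from the patched build):

    import cPickle
    import numpy.core.records as rec

    r = rec.fromrecords([(456, 'dbe', 1.2), (2, 'de', 1.3)],
                        names='col1,col2,col3')
    r2 = cPickle.loads(cPickle.dumps(r))   # raises the ValueError above before the fix
    print r2.col1                          # the same records back once it works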
From fullung at gmail.com Sat Apr 22 10:53:05 2006 From: fullung at gmail.com (Albert Strasheim) Date: Sat Apr 22 10:53:05 2006 Subject: [Numpy-discussion] Re: seterr changes In-Reply-To: <444999E2.1040009@ieee.org> Message-ID: <005701c66635$82b3a930$0502010a@dsp.sun.ac.za> Hello all

I was just wondering if someone could provide some example code that would cause an error if invalid is set to 'raise'?

I also noticed that seterr returns the old values. Is this really useful? Consider its use in an IPython session:

In [184]: N.geterr()
Out[184]: {'over': 'ignore', 'divide': 'ignore', 'invalid': 'ignore', 'under': 'ignore'}

In [185]: N.seterr(over='raise')
Out[185]: {'over': 'ignore', 'divide': 'ignore', 'invalid': 'ignore', 'under': 'ignore'}

I think the following pattern would make sense, but it seems it doesn't work at present:

old=N.geterr()
N.seterr(over='raise')
# do some calculations that might overflow
N.seterr(old)

This currently causes the following error:

Traceback (most recent call last):
File "", line 1, in ?
File "C:\Python24\Lib\site-packages\numpy\core\numeric.py", line 426, in seterr
maskvalue = ((_errdict[divide] << SHIFT_DIVIDEBYZERO) +
TypeError: dict objects are unhashable

Is this intended? I think it would be useful to be able to restore all the error states in one go.

Regards, Albert

> -----Original Message----- > From: numpy-discussion-admin at lists.sourceforge.net [mailto:numpy- > discussion-admin at lists.sourceforge.net] On Behalf Of Travis Oliphant > Sent: 22 April 2006 04:50 > To: tim.hochberg at ieee.org; numpy-discussion > Subject: [Numpy-discussion] Re: seterr changes > > Tim Hochberg wrote: > > > > Hi Travis et al, > > > > I started looking at your seterr changes. > Thank you very much for the help on this. I'm not an expert on threaded > code by any means. In fact, as you clearly point out, I don't eat and > drink what will work under threaded environments and what won't. Clearly > global variables are problematic. That is the problem with the > update_use_defaults bit, right? This is the way it was being managed > before and I just changed names a bit to use PyThreadState_GetDict for > the dictionary (it seems possible to use only from C until Python 2.4). > > I say if it only buys 5% on small arrays then it's not worth it, as there > are other fish to fry to make up for that 5%, and I agree that tracking > down threading problems due to a finagled global variable is sticky. I > did not think about the threading issues deeply enough. > > > I'm also curious about the seterr interface. It returns > > ufunc_values_obj. I wasn't sure how one is supposed to pass that > > back into seterr, so I modified seterr to instead return a > > dictionary. I also modified it so that the seterr function itself has > > no defaults (or rather they're all None). Instead, any unspecified > > values are taken from the current error state. Thus > > seterr(divide="warn") changes only the divide state, leaving the other > > entries alone. > Returning an object is a late-in-the-game idea and should be critiqued. > It can be passed to seterr (an attribute check grabs the actual list --- > did you want to change it to a dictionary?). Doesn't a small list have > faster access than a small dictionary? > > I'll look over your commits and comment later if I think of anything... > > I'm thrilled with your work. > > Best, > > -Travis
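The pattern Albert is after does work once the returned dict is splatted back in; Rob Hooft's reply just below gives the key detail. Collected as a sketch (N is numpy, as in Albert's session):

    import numpy as N

    old = N.seterr(over='raise')   # returns the previous modes as a dict
    try:
        pass                       # calculations that might overflow go here
    finally:
        N.seterr(**old)            # splat the dict back: all four states restored in one go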
From rob at hooft.net Sat Apr 22 11:48:01 2006 From: rob at hooft.net (Rob Hooft) Date: Sat Apr 22 11:48:01 2006 Subject: [Numpy-discussion] Re: seterr changes In-Reply-To: <005701c66635$82b3a930$0502010a@dsp.sun.ac.za> References: <005701c66635$82b3a930$0502010a@dsp.sun.ac.za> Message-ID: <444A7A35.5090906@hooft.net> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1

Albert Strasheim wrote:
| old=N.geterr()
| N.seterr(over='raise')
| # do some calculations that might overflow
| N.seterr(old)

You should try (but I didn't):

N.seterr(**old)

Rob - -- Rob W.W. Hooft || rob at hooft.net || http://www.hooft.net/people/rob/ -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.2.2 (GNU/Linux) Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org iD8DBQFESno1H7J/Cv8rb3QRAppZAKCGBRSvL++wg3wFer6odmG8sxyrFwCfQ1nq p0aVr4r+Z1ZfRBGQgir+KX0= =eZMa -----END PGP SIGNATURE-----

From strawman at astraw.com Sat Apr 22 12:13:02 2006 From: strawman at astraw.com (Andrew Straw) Date: Sat Apr 22 12:13:02 2006 Subject: [Numpy-discussion] Floating point exception with numpy and embedded python interpreter In-Reply-To: <35791.134.226.38.190.1145701016.squirrel@webmail.cs.tcd.ie> References: <81934aa60604191029h4d8a8d9bl550fa58cc67d3d5e@mail.gmail.com> <44467576.1020708@astraw.com> <4446819D.3030401@astraw.com> <35791.134.226.38.190.1145701016.squirrel@webmail.cs.tcd.ie> Message-ID: <444A8026.3030307@astraw.com> bitorika at cs.tcd.ie wrote:

>>>On 19 Apr 2006, at 18:37, Andrew Straw wrote:
>>
>>I think that numpy only accesses the SSE units through ATLAS or other
>>external library. So, build numpy without ATLAS. But I'm not 100% sure
>>anymore if there aren't any optimizations that directly use SSE if it's
>>available.
>
>I've tried getting rid of all atlas, blas and lapack packages in my system
>and rebuilding numpy to use its own unoptimised lapack_lite, but no luck.
>Just trying to import numpy with PyImport_ImportModule("numpy") causes the
>program to crash with just a "Floating point exception" message output.
>
>The program I'm embedding Python in is the NS Network Simulator
>(http://www.isi.edu/nsnam/ns/). It's a complex C++ beast with its own
>Object-Tcl interpreter, but it's been working fine with embedded Python
>except for this numpy crash. I've used Numeric before and it worked fine
>as well.
>
>I'm lost now regarding what to work on to find a solution; does anyone
>familiar with numpy internals have any suggestions?
>

OK, going back to your original gdb traceback, it looks like the SIGFPE originated in the following function in umathmodule.c:

static double
pinf_init(void)
{
    double mul = 1e10;
    double tmp = 0.0;
    double pinf;

    pinf = mul;
    for (;;) {
        pinf *= mul;
        if (pinf == tmp) break;
        tmp = pinf;
    }
    return pinf;
}

If you try just that function (instead of the whole Python interpreter and numpy module) and still get the exception, you'll be that much closer to narrowing down the issue.
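For reference, Andrew's C loop transliterated to Python (an illustrative sketch of what numpy's import-time infinity probe computes; under a host that unmasks the overflow trap, the multiply is where the SIGFPE would fire):

    # grow until the value saturates at IEEE infinity
    mul = 1e10
    pinf = mul
    tmp = 0.0
    while pinf != tmp:
        tmp = pinf
        pinf *= mul    # overflows to inf after roughly 30 iterations
    print pinf         # inf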
From robert.kern at gmail.com Sat Apr 22 18:58:01 2006 From: robert.kern at gmail.com (Robert Kern) Date: Sat Apr 22 18:58:01 2006 Subject: [Numpy-discussion] Re: Backporting numpy to Python 2.2 In-Reply-To: <20060419103554.4ac1df4a.twegener@radlogic.com.au> References: <20060419103554.4ac1df4a.twegener@radlogic.com.au> Message-ID: Tim Wegener wrote:

> Hi,
>
> I am attempting to backport numpy-0.9.6 to be compatible with python 2.2. (Some of our machines run python 2.2 as part of Red Hat 9 and Red Hat 7.3 and it is hazardous to alter the standard setup.) I was able to change most of the 2.3-isms to be 2.2 compatible (see the attached patch). However I had problems compiling the following c module:

I was hoping that Travis would jump in and talk about the reasons that he targeted 2.3 and not 2.2.

I don't think that it's going to be feasible to target 2.2 at this point. If nothing else, I've long since forgotten how to write 2.2 code. More seriously, doing an overhaul of all of the C code in numpy to use the older API is just going to make the code clumsier and more difficult to maintain.

I think it is going to be much easier for you to install a second, more recent Python interpreter on your machines than it will be for you to maintain a 2.2-compatible branch. Linux installations, even Red Hat, usually handle having multiple versions of Python installed side by side just fine. You don't have to remove Python 2.2.

-- Robert Kern robert.kern at gmail.com "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco

From zpincus at stanford.edu Sat Apr 22 20:48:00 2006 From: zpincus at stanford.edu (Zachary Pincus) Date: Sat Apr 22 20:48:00 2006 Subject: [Numpy-discussion] Matrix and var method Message-ID: <83468068-4E41-45A1-9753-90CEADF34722@stanford.edu> Hi folks,

I just ran across an error with numpy.matrix types: the var() method does not seem to work! (I've tried all sorts of permutations on the matrix shape, and the axis parameter to var; nothing works.)

Perhaps this has already been fixed -- I haven't updated my numpy in a week or so. If so, sorry; if not, I hope this helps.

Zach

In [1]: import numpy
In [2]: numpy.__version__
Out[2]: '0.9.7.2335'
In [3]: numpy.matrix([[1,2,3], [1,2,3]]).var()
---------------------------------------------------------------------------
exceptions.ValueError Traceback (most recent call last)

/Users/zpincus/

/Library/Frameworks/Python.framework/Versions/2.4/lib/python2.4/site-packages/numpy/core/defmatrix.py in __mul__(self, other)
147 if isinstance(other, N.ndarray) or N.isscalar(other) or \
148 not hasattr(other, '__rmul__'):
--> 149 return N.dot(self, other)
150 else:
151 return NotImplemented

ValueError: matrices are not aligned

In [4]: numpy.array([[1,2,3], [1,2,3]]).var()
Out[4]: 0.80000000000000004
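A workaround consistent with Zach's report while the defmatrix bug is open (a sketch: the reduction trips over matrix __mul__, so compute on a plain ndarray view instead):

    import numpy

    m = numpy.matrix([[1, 2, 3], [1, 2, 3]])
    v = numpy.asarray(m).var()   # 0.8, matching the plain-array result above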
From a.mcmorland at auckland.ac.nz Sun Apr 23 17:40:02 2006 From: a.mcmorland at auckland.ac.nz (Angus McMorland) Date: Sun Apr 23 17:40:02 2006 Subject: [Numpy-discussion] Error installing on amd64 Debian-unstable Message-ID: <444C1E24.8030603@auckland.ac.nz> I had no troubles installing numpy and scipy on my 32-bit laptop, but cannot get numpy to install on my amd64 debian desktop. I've pulled in the latest svn versions, then run:

$ python setup.py install

Installation seems to run okay (no error messages), but the following happens:

In [1]: import numpy
import core -> failed: /usr/lib/python2.3/site-packages/numpy/core/_sort.so: undefined symbol: PyArray_CompareUCS4
import lib -> failed: module compiled against version 90703 of C-API but this version of numpy is 90704
import linalg -> failed: module compiled against version 90703 of C-API but this version of numpy is 90704
import dft -> failed: cannot import name asarray
import random -> failed: 'module' object has no attribute 'dtype'
---------------------------------------------------------------------------
exceptions.ImportError Traceback (most recent call last)

/home/amcmorl/

/usr/lib/python2.3/site-packages/numpy/__init__.py
47 return NumpyTest().test(level, verbosity)
48
---> 49 import add_newdocs
50
51 if __doc__ is not None:

/usr/lib/python2.3/site-packages/numpy/add_newdocs.py
----> 2 from lib import add_newdoc
3
4 add_newdoc('numpy.core','dtype',
5 [('fields', "Fields of the data-typedescr if any."),
6 ('alignment', "Needed alignment for this data-type"),

ImportError: cannot import name add_newdoc

Can anyone suggest what I'm doing wrong?

Cheers, A.

-- Angus McMorland email a.mcmorland at auckland.ac.nz mobile +64-21-155-4906 PhD Student, Neurophysiology / Multiphoton & Confocal Imaging Physiology, University of Auckland phone +64-9-3737-599 x89707 Armourer, Auckland University Fencing Secretary, Fencing North Inc.

From robert.kern at gmail.com Sun Apr 23 17:55:08 2006 From: robert.kern at gmail.com (Robert Kern) Date: Sun Apr 23 17:55:08 2006 Subject: [Numpy-discussion] Re: Error installing on amd64 Debian-unstable In-Reply-To: <444C1E24.8030603@auckland.ac.nz> References: <444C1E24.8030603@auckland.ac.nz> Message-ID: Angus McMorland wrote:

> I had no troubles installing numpy and scipy on my 32-bit laptop, but
> cannot get numpy to install on my amd64 debian desktop. I've pulled in
> the latest svn versions, then run:
>
> $ python setup.py install
>
> Installation seems to run okay (no error messages), but the following
> happens:
>
> In [1]: import numpy
> import core -> failed:
> /usr/lib/python2.3/site-packages/numpy/core/_sort.so: undefined symbol:
> PyArray_CompareUCS4
> import lib -> failed: module compiled against version 90703 of C-API but
> this version of numpy is 90704

Please delete the build/ directory and the installed numpy package and rebuild.
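Concretely, that amounts to something like the following (an illustrative sequence; the site-packages prefix is taken from the traceback above, so adjust to your setup):

$ rm -rf build
$ rm -rf /usr/lib/python2.3/site-packages/numpy
$ python setup.py install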
If the problem persists, please let us know. -- Robert Kern robert.kern at gmail.com "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From robert.kern at gmail.com Sun Apr 23 17:58:22 2006 From: robert.kern at gmail.com (Robert Kern) Date: Sun Apr 23 17:58:22 2006 Subject: [Numpy-discussion] Changing the Trac authentication Message-ID: <444C20E5.7090309@gmail.com> I will be changing the Trac authentication over the next hour or so. I will be installing the AccountManagerPlugin to allow users to create accounts for themselves without needing to have SVN write access. Anonymous users will not be able to edit the Wikis or tickets. Non-developer, but registered users will be able to do so with some restrictions, notably not being able to resolve tickets. Developers who currently have accounts will have the same username/password as before. If you have problems using the Trac sites before I announce that I am done, please wait until I am finished. If there are still problems, please let me know and I will try to fix them as soon as possible. Thank you for your patience. Hopefully, this change will resolve the spam problem. -- Robert Kern robert.kern at gmail.com "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From robert.kern at gmail.com Sun Apr 23 18:12:02 2006 From: robert.kern at gmail.com (Robert Kern) Date: Sun Apr 23 18:12:02 2006 Subject: [Numpy-discussion] Re: Changing the Trac authentication In-Reply-To: <444C20E5.7090309@gmail.com> References: <444C20E5.7090309@gmail.com> Message-ID: <444C25A9.8080701@gmail.com> Robert Kern wrote: > I will be changing the Trac authentication over the next hour or so. Never mind. I'll have to do it tomorrow when I get to the office. -- Robert Kern robert.kern at gmail.com "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From rmuller at sandia.gov Mon Apr 24 09:12:13 2006 From: rmuller at sandia.gov (Rick Muller) Date: Mon Apr 24 09:12:13 2006 Subject: [Numpy-discussion] Problems building numpy Message-ID: <02801766-7F45-48EE-AD4A-7B4B0590C9AC@sandia.gov> Numpy really builds nicely now, and I appreciate all of the hard work that people have put into portability of this code. That being said, I just had my first system where Numpy failed to build. It's on a redhat 7.3 (yes, we have a 7.3 box. I didn't believe it either. not my decision.) and I get the following error when trying to run Numpy: Python 2.4.3 (#1, Apr 24 2006, 09:54:46) [GCC 3.2.3 20030502 (Red Hat Linux 3.2.3-42)] on linux2 Type "help", "copyright", "credits" or "license" for more information. >>> from numpy import array import linalg -> failed: /usr/local/lib/python2.4/site-packages/numpy/ linalg/lapack_lite.so: undefined symbol: s_wsfe If this is easy to fix, I'd prefer to fix it. However, if the numpy developers have better things to do than to support a 10-year-old operating system (and I suspect that they do), I'm cool with that. 
Rick Rick Muller rmuller at sandia.gov From arkaitz.bitorika at gmail.com Mon Apr 24 09:24:03 2006 From: arkaitz.bitorika at gmail.com (Arkaitz Bitorika) Date: Mon Apr 24 09:24:03 2006 Subject: [Numpy-discussion] Floating point exception with numpy and embedded python interpreter In-Reply-To: <444A8026.3030307@astraw.com> References: <81934aa60604191029h4d8a8d9bl550fa58cc67d3d5e@mail.gmail.com> <44467576.1020708@astraw.com> <4446819D.3030401@astraw.com> <35791.134.226.38.190.1145701016.squirrel@webmail.cs.tcd.ie> <444A8026.3030307@astraw.com> Message-ID: Andrew, I've verified that the function causes the exception when embedded in the program but not when used from a simple C program with just a main () function. The successful version iterates 31 times over the for loop while the crashing one fails the 30th time that it does "pinf *= mul". Now we know exactly where the crash is, but no idea how to fix it ;). It doesn't look it should be related to SSE2 flags, it's just doing a big multiplication, but I don't know enough about low level C and floating point operations to understand why it may be throwing the exception there. Any idea how I could avoid that function crashing? Thanks, Arkaitz On 22 Apr 2006, at 20:12, Andrew Straw wrote: > OK, going back to your original gdb traceback, it looks like the > SIGFPE > originated in the following funtion in umathmodule.c: > > static double > pinf_init(void) > { > double mul = 1e10; > double tmp = 0.0; > double pinf; > > pinf = mul; > for (;;) { > pinf *= mul; > if (pinf == tmp) break; > tmp = pinf; > } > return pinf; > } > > If you try just that function (instead of the whole Python interpreter > and numpy module) and still get the exception, you'll be that much > closer to narrowing down the issue. From robert.kern at gmail.com Mon Apr 24 09:53:02 2006 From: robert.kern at gmail.com (Robert Kern) Date: Mon Apr 24 09:53:02 2006 Subject: [Numpy-discussion] Re: Problems building numpy In-Reply-To: <02801766-7F45-48EE-AD4A-7B4B0590C9AC@sandia.gov> References: <02801766-7F45-48EE-AD4A-7B4B0590C9AC@sandia.gov> Message-ID: Rick Muller wrote: > Numpy really builds nicely now, and I appreciate all of the hard work > that people have put into portability of this code. > > That being said, I just had my first system where Numpy failed to > build. It's on a redhat 7.3 (yes, we have a 7.3 box. I didn't believe > it either. not my decision.) and I get the following error when trying > to run Numpy: > > Python 2.4.3 (#1, Apr 24 2006, 09:54:46) > [GCC 3.2.3 20030502 (Red Hat Linux 3.2.3-42)] on linux2 > Type "help", "copyright", "credits" or "license" for more information. >>>> from numpy import array > import linalg -> failed: /usr/local/lib/python2.4/site-packages/numpy/ > linalg/lapack_lite.so: undefined symbol: s_wsfe > > If this is easy to fix, I'd prefer to fix it. However, if the numpy > developers have better things to do than to support a 10-year-old > operating system (and I suspect that they do), I'm cool with that. This usually means that you are not linking in the g2c library: http://www.scipy.org/FAQ#head-26562f0a9e046b53eae17de300fc06408f9c91a8 -- Robert Kern robert.kern at gmail.com "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From ndarray at mac.com Mon Apr 24 10:07:06 2006 From: ndarray at mac.com (Sasha) Date: Mon Apr 24 10:07:06 2006 Subject: [Numpy-discussion] Broadcasting rules (Ticket 76). 
Message-ID: I was looking at ticket 76: http://projects.scipy.org/scipy/numpy/ticket/76 At first, I concluded that the ticket was valid and that >>> a = zeros([5,2]) >>> a[:] = arange(5) should raise an error as it did in Numeric. However, once I started looking at the code, I've realized that numpy supports more flexible broadcasting rules than Numeric. For example: >>> x = zeros([10]) >>> x[:] = 1,2 >>> x array([1, 2, 1, 2, 1, 2, 1, 2, 1, 2]) That would be an error in Numeric. Given that the above is valid, the result in Ticket 76 actually makes sense. I believe it is time to have some discussion about the future of broadcasting rules in numpy. Can anyone provide a summary of the status quo? From oliphant.travis at ieee.org Mon Apr 24 10:43:05 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Mon Apr 24 10:43:05 2006 Subject: [Numpy-discussion] Broadcasting rules (Ticket 76). In-Reply-To: References: Message-ID: <444D0DF7.2060307@ieee.org> Sasha wrote: > I was looking at ticket 76: > > http://projects.scipy.org/scipy/numpy/ticket/76 > > At first, I concluded that the ticket was valid and that > > >>>> a = zeros([5,2]) >>>> a[:] = arange(5) >>>> > > should raise an error as it did in Numeric. However, once I started > looking at the code, I've realized that numpy supports more flexible > broadcasting rules than Numeric. > This really isn't in the category of "broadcasting" as I see it. My understanding is that broadcasting refers to operations involving more than one array on the input side. It's really just a "universal function" concept. A copying operation is not handled using the same rules. In this case, for example, Numeric used to raise an error but in NumPy the array will be copied as many times as possible into the array. I don't believe ticket #76 is actually an error. This behavior could be changed if somebody wants to write the code to change it but only until version 1.0. It would be very difficult to change the other broadcasting behavior which was inherited from Numeric, however. The only possibility I see is adding new useful functionality where Numeric used to raise an error. -Travis From zpincus at stanford.edu Mon Apr 24 10:57:04 2006 From: zpincus at stanford.edu (Zachary Pincus) Date: Mon Apr 24 10:57:04 2006 Subject: [Numpy-discussion] Broadcasting rules (Ticket 76). In-Reply-To: <444D0DF7.2060307@ieee.org> References: <444D0DF7.2060307@ieee.org> Message-ID: <4AB1DE92-E877-4E22-83AB-69DDBB32FB25@stanford.edu> > It would be very difficult to change the other broadcasting > behavior which was inherited from Numeric, however. The only > possibility I see is adding new useful functionality where Numeric > used to raise an error. Well, there is one case that I run into all of the time where the broadcasting rules seem a bit constraining: In [1]: import numpy In [2]: numpy.__version__ '0.9.7.2335' In [3]: a = numpy.ones([50, 100]) In [4]: means = a.mean(axis = 1) In [5]: print a.shape, means.shape (50, 100) (50,) In [5]: a / means ValueError: index objects are not broadcastable to a single shape In [6]: (a.transpose() / means).transpose() #this works It's obvious why this doesn't work due to the broadcasting rules, but it also seems (to me, in this case at least) obvious what I am trying to do. I don't think I'm suggesting that the broadcasting rules be changed to allow matching-from-the-right in the general case, since that seems likely to make the broadcasting rules even more difficult to grok. But there do seem to be a lot of (....transpose () ... 
).transpose() bits in my code. Is there anything to be done here? I presume not, but I just wanted to mention it. Zach From oliphant.travis at ieee.org Mon Apr 24 11:25:06 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Mon Apr 24 11:25:06 2006 Subject: ***[Possible UCE]*** Re: [Numpy-discussion] Broadcasting rules (Ticket 76). In-Reply-To: <4AB1DE92-E877-4E22-83AB-69DDBB32FB25@stanford.edu> References: <444D0DF7.2060307@ieee.org> <4AB1DE92-E877-4E22-83AB-69DDBB32FB25@stanford.edu> Message-ID: <444D17E6.1070104@ieee.org> Zachary Pincus wrote: >> It would be very difficult to change the other broadcasting behavior >> which was inherited from Numeric, however. The only possibility I >> see is adding new useful functionality where Numeric used to raise an >> error. > > Well, there is one case that I run into all of the time where the > broadcasting rules seem a bit constraining: > > In [1]: import numpy > In [2]: numpy.__version__ > '0.9.7.2335' > In [3]: a = numpy.ones([50, 100]) > In [4]: means = a.mean(axis = 1) > In [5]: print a.shape, means.shape > (50, 100) (50,) > In [5]: a / means > ValueError: index objects are not broadcastable to a single shape > In [6]: (a.transpose() / means).transpose() > #this works > > It's obvious why this doesn't work due to the broadcasting rules, but > it also seems (to me, in this case at least) obvious what I am trying > to do. I don't think I'm suggesting that the broadcasting rules be > changed to allow matching-from-the-right in the general case, since > that seems likely to make the broadcasting rules even more difficult > to grok. But there do seem to be a lot of (....transpose() ... > ).transpose() bits in my code. > > Is there anything to be done here? I presume not, but I just wanted to > mention it. Yes, just be more explicit about which end to tack extra dimensions onto (the automatic extension always assumes pre-pending...) a / means[:,newaxis] is the suggested spelling... -Travis From ndarray at mac.com Mon Apr 24 11:30:05 2006 From: ndarray at mac.com (Sasha) Date: Mon Apr 24 11:30:05 2006 Subject: [Numpy-discussion] Broadcasting rules (Ticket 76). In-Reply-To: <4AB1DE92-E877-4E22-83AB-69DDBB32FB25@stanford.edu> References: <444D0DF7.2060307@ieee.org> <4AB1DE92-E877-4E22-83AB-69DDBB32FB25@stanford.edu> Message-ID: On 4/24/06, Zachary Pincus wrote: > [...] > In [5]: print a.shape, means.shape > (50, 100) (50,) > In [5]: a / means > ValueError: index objects are not broadcastable to a single shape > In [6]: (a.transpose() / means).transpose() > #this works This works too: >>> x = a / means[:,newaxis] no .transpose() :-). From ndarray at mac.com Mon Apr 24 11:49:04 2006 From: ndarray at mac.com (Sasha) Date: Mon Apr 24 11:49:04 2006 Subject: [Numpy-discussion] Broadcasting rules (Ticket 76). In-Reply-To: <444D0DF7.2060307@ieee.org> References: <444D0DF7.2060307@ieee.org> Message-ID: On 4/24/06, Travis Oliphant wrote: > [...] > A copying operation is not handled using the same rules. In this case, > for example, Numeric used to raise an error but in NumPy the array will > be copied as many times as possible into the array. I don't believe > ticket #76 is actually an error. > I disagree on the terminology. In my view broadcasting means repeating the values of the array to fit into a different shape no matter what dictates the new shape an operand or the receiver. IMHO the following is slightly confusing: >>> a = zeros([5,2]) >>> a[...] += arange(5) Traceback (most recent call last): File "", line 1, in ? 
ValueError: shape mismatch: objects cannot be broadcast to a single shape but >>> a[...] = arange(5) is ok. > This behavior could be changed if somebody wants to write the code to > change it but only until version 1.0. It would be very difficult to > change the other broadcasting behavior which was inherited from Numeric, > however. The only possibility I see is adding new useful functionality > where Numeric used to raise an error. In this category, I would suggest to allow broadcasting to any multiple of the dimension even if the dimension is not 1. I don't see what makes 1 so special. >>> x = zeros(4) >>> x+(1,2) Traceback (most recent call last): File "", line 1, in ? ValueError: shape mismatch: objects cannot be broadcast to a single shape >>> x+(1,) array([1, 1, 1, 1]) I suggest that we make ufunc sonsistent with slice assignment. Currently: >>> x[:]=1,1 >>> x[:]=1,1,1 Traceback (most recent call last): File "", line 1, in ? ValueError: number of elements in destination must be integer multiple of number of elements in source From cookedm at physics.mcmaster.ca Mon Apr 24 13:13:09 2006 From: cookedm at physics.mcmaster.ca (David M. Cooke) Date: Mon Apr 24 13:13:09 2006 Subject: [Numpy-discussion] numexpr enhancements In-Reply-To: <20060421205530.GA25020@xot.carabos.com> (faltet@xot.carabos.com's message of "Fri, 21 Apr 2006 20:55:30 +0000") References: <20060421205530.GA25020@xot.carabos.com> Message-ID: faltet at xot.carabos.com writes: > Hi, > > After looking at the numpy performance issues on strided and unaligned > data, I decided to have a try at the numexpr package and finally > implemented better suport for them. As a result, numexpr can reach now > a 2x of performance improvement for simple expressions, like 'a>2.'. > > In the way, I've added support for boolean expressions (&, | and ~, as > in the where() function), a new boolean data type (important to get > better performance on boolean expressions) and support for numarray > (maintaining the compatibility with numpy, of course). > > I've called the new package numexpr 0.2 to not confuse it with existing > 0.1. Well, let's hope that numexpr can continue making its way towards > integration in numpy. > > You can fetch this new package at: > > http://www.carabos.com/downloads/divers/numexpr-0.2.tar.gz > > Finally, let me say that numexpr is a wonderful toy to get your hands > dirty ;-) Many thanks to David (and Tim) for this! Unfortunately, real life (damn Ph.D.! :-) has gotten in my way, so I'm not going to be able to look at this for a while. But I'll add it to my list. -- |>|\/|< /--------------------------------------------------------------------------\ |David M. Cooke http://arbutus.physics.mcmaster.ca/dmc/ |cookedm at physics.mcmaster.ca From cookedm at physics.mcmaster.ca Mon Apr 24 13:18:05 2006 From: cookedm at physics.mcmaster.ca (David M. Cooke) Date: Mon Apr 24 13:18:05 2006 Subject: [Numpy-discussion] announce: pyjit, a little jit for creating numpy ufuncs In-Reply-To: <20060421162336.42285837.simon@arrowtheory.com> (Simon Burton's message of "Fri, 21 Apr 2006 16:23:36 +1000") References: <20060421162336.42285837.simon@arrowtheory.com> Message-ID: Simon Burton writes: > Hi, > > Inspired by numexpr, pypy and llvm, i've built a simple > JIT for creating numpy "ufuncs" (they are not yet real ufuncs). > It uses llvm[1] as the backend machine code generator. Cool! I had a look at LLVM, but I wanted something to go into SciPy, and that was too heavy a dependence. 
However, I could see doing more stuff with this than I can easily with numexpr. > The main things it can do are: > > *) parse simple python code (function def's) > *) generate SSA assembly code for llvm > *) build ufunc code for applying to numpy array's > > When I say simple I mean it: > > def calc(a,b): > c = (a+b)/2.0 > return c > > No control flow or type inference has been implemented. > > As with numexpr, significant speedups are possible. > > I'm putting this announce here to see what the other numpy'ers think. > > $ svn co http://rubis.rsise.anu.edu.au/local/repos/elefant/pyjit > > [1] http://llvm.org/ How do the speedups compare with numexpr? Are there any lessons you learned from this that could apply to numexpr? Could we have a common frontend for numexpr/pyjit, and a different backend for each? Then each wouldn't have to reinvent the wheel in parsing (the same thought goes with weave, too...) I don't have much time to look at it (real life sucking my time :-(), but I'll have a look when I do have the time. -- |>|\/|< /--------------------------------------------------------------------------\ |David M. Cooke http://arbutus.physics.mcmaster.ca/dmc/ |cookedm at physics.mcmaster.ca From oliphant.travis at ieee.org Mon Apr 24 14:22:02 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Mon Apr 24 14:22:02 2006 Subject: [Numpy-discussion] Broadcasting rules (Ticket 76). In-Reply-To: References: <444D0DF7.2060307@ieee.org> Message-ID: <444D4143.4020204@ieee.org> Sasha wrote: > On 4/24/06, Travis Oliphant wrote: > >> [...] >> A copying operation is not handled using the same rules. In this case, >> for example, Numeric used to raise an error but in NumPy the array will >> be copied as many times as possible into the array. I don't believe >> ticket #76 is actually an error. >> >> > I disagree on the terminology. In my view broadcasting means > repeating the values of the array to fit into a different shape no > matter what dictates the new shape an operand or the receiver. > I can understand that view. But, that's not been the historical use of broadcasting which has always been only a "ufunc" concept. Code to implement a broader view of broadcasting across more operations if people decide that is appropriate could be done (carefully), but I don't have time to write it. -Travis From oliphant.travis at ieee.org Mon Apr 24 14:25:02 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Mon Apr 24 14:25:02 2006 Subject: [Numpy-discussion] Broadcasting rules (Ticket 76). In-Reply-To: References: <444D0DF7.2060307@ieee.org> Message-ID: <444D41FE.7050904@ieee.org> Sasha wrote: > In this category, I would suggest to allow broadcasting to any > multiple of the dimension even if the dimension is not 1. I don't see > what makes 1 so special. > What's so special about 1 is that the code for it is relatively straightforward and already implemented using strides. Altering the code to allow any multiple of the dimension would be harder and slower. -Travis From oliphant.travis at ieee.org Mon Apr 24 14:30:01 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Mon Apr 24 14:30:01 2006 Subject: [Numpy-discussion] Broadcasting rules (Ticket 76). In-Reply-To: References: <444D0DF7.2060307@ieee.org> Message-ID: <444D4329.9050700@ieee.org> Sasha wrote: >>>> x[:]=1,1 >>>> x[:]=1,1,1 >>>> > Traceback (most recent call last): > File "", line 1, in ? 
> ValueError: number of elements in destination must be integer multiple > of number of elements in source > I think the only reasonable thing to do is to raise an error unless the shapes were compatible like Numeric did and eliminate the multiple copying feature. This would bring the desired consistency. -Travis From strawman at astraw.com Mon Apr 24 14:33:01 2006 From: strawman at astraw.com (Andrew Straw) Date: Mon Apr 24 14:33:01 2006 Subject: [Numpy-discussion] Floating point exception with numpy and embedded python interpreter In-Reply-To: References: <81934aa60604191029h4d8a8d9bl550fa58cc67d3d5e@mail.gmail.com> <44467576.1020708@astraw.com> <4446819D.3030401@astraw.com> <35791.134.226.38.190.1145701016.squirrel@webmail.cs.tcd.ie> <444A8026.3030307@astraw.com> Message-ID: <444D43D0.3040308@astraw.com> This doesn't seem like an issue with numpy. Your test proved that. I'm curious what the outcome is, but I'm afraid there's not much we can do. At this point I think you should write the ns2 people and see what they say. Their program seems to be responsible for twiddling the FPU/SSE flags, so I think the issue is better solved, or at least discussed, by them. Cheers! Andrew Arkaitz Bitorika wrote: > Andrew, > > I've verified that the function causes the exception when embedded in > the program but not when used from a simple C program with just a main > () function. The successful version iterates 31 times over the for > loop while the crashing one fails the 30th time that it does "pinf *= > mul". > > Now we know exactly where the crash is, but no idea how to fix it ;). > It doesn't look it should be related to SSE2 flags, it's just doing a > big multiplication, but I don't know enough about low level C and > floating point operations to understand why it may be throwing the > exception there. Any idea how I could avoid that function crashing? > > Thanks, > Arkaitz > > On 22 Apr 2006, at 20:12, Andrew Straw wrote: > >> OK, going back to your original gdb traceback, it looks like the SIGFPE >> originated in the following funtion in umathmodule.c: >> >> static double >> pinf_init(void) >> { >> double mul = 1e10; >> double tmp = 0.0; >> double pinf; >> >> pinf = mul; >> for (;;) { >> pinf *= mul; >> if (pinf == tmp) break; >> tmp = pinf; >> } >> return pinf; >> } >> >> If you try just that function (instead of the whole Python interpreter >> and numpy module) and still get the exception, you'll be that much >> closer to narrowing down the issue. > From oliphant.travis at ieee.org Mon Apr 24 17:40:04 2006 From: oliphant.travis at ieee.org (Travis E. Oliphant) Date: Mon Apr 24 17:40:04 2006 Subject: [Numpy-discussion] Re: Backporting numpy to Python 2.2 In-Reply-To: <20060419103554.4ac1df4a.twegener@radlogic.com.au> References: <20060419103554.4ac1df4a.twegener@radlogic.com.au> Message-ID: Tim Wegener wrote: > Hi, > > I am attempting to backport numpy-0.9.6 to be compatible with python 2.2. (Some of our machines run python 2.2 as part of Red Hat 9 and Red Hat 7.3 and it is hazardous to alter the standard setup.) I was able to change most of the 2.3-isms to be 2.2 compatible (see the attached patch). However I had problems compiling the following c module: I targeted Python 2.3 because it added some very nice constructs (Python 2.4 added even more but I disciplined myself not to use them). I think it is not impossible to back-port it to Python 2.2 but I agree with Robert that I wonder if it is worth the effort. In this case Python 2.3 added the bool type which is used in NumPy. 
Basically this type would have to be constructed (the code could be grabbed from Python 2.3) in order to be used. The addition of the boolean type is probably the single biggest change that would make back-porting to 2.2 difficult. There may be others as well but they are probably easier to work around... -Travis From robert.kern at gmail.com Mon Apr 24 18:00:01 2006 From: robert.kern at gmail.com (Robert Kern) Date: Mon Apr 24 18:00:01 2006 Subject: [Numpy-discussion] Changing the Trac authentication, for real this time! Message-ID: <444D7458.3020402@gmail.com> If you encounter errors accessing the Trac sites for NumPy and SciPy over the next hour or so, please wait until I have announced that I have finished. If things are still broken after that, please let me know and I will try to fix it immediately. The details of the changes were posted to the previous thread "Changing the Trac authentication". Apologies for any disruption and for the noise. -- Robert Kern robert.kern at gmail.com "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From ndarray at mac.com Mon Apr 24 18:26:07 2006 From: ndarray at mac.com (Sasha) Date: Mon Apr 24 18:26:07 2006 Subject: [Numpy-discussion] Broadcasting rules (Ticket 76). In-Reply-To: <444D4329.9050700@ieee.org> References: <444D0DF7.2060307@ieee.org> <444D4329.9050700@ieee.org> Message-ID: On 4/24/06, Travis Oliphant wrote: > Sasha wrote: > >>>> x[:]=1,1 > >>>> x[:]=1,1,1 > >>>> > > Traceback (most recent call last): > > File "", line 1, in ? > > ValueError: number of elements in destination must be integer multiple > > of number of elements in source > > > I think the only reasonable thing to do is to raise an error unless the > shapes were compatible like Numeric did and eliminate the multiple > copying feature. I've attached a patch to the ticket: I don't see why slice assignment cannot reuse the ufunc code. It looks like slice assignment can just be dispatched to a trivial (pass-through) ufunc. This aproach may even prove to be faster because type-aware copying loops can be faster than memmove on popular platforms. From robert.kern at gmail.com Mon Apr 24 19:39:02 2006 From: robert.kern at gmail.com (Robert Kern) Date: Mon Apr 24 19:39:02 2006 Subject: [Numpy-discussion] Re: Changing the Trac authentication, for real this time! In-Reply-To: <444D7458.3020402@gmail.com> References: <444D7458.3020402@gmail.com> Message-ID: <444D8BA2.1080407@gmail.com> I hate computers. It's still not done. -- Robert Kern robert.kern at gmail.com "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From stephen.walton at csun.edu Mon Apr 24 20:49:03 2006 From: stephen.walton at csun.edu (Stephen Walton) Date: Mon Apr 24 20:49:03 2006 Subject: [Numpy-discussion] Re: Problems building numpy In-Reply-To: References: <02801766-7F45-48EE-AD4A-7B4B0590C9AC@sandia.gov> Message-ID: <444D9C0C.3030006@csun.edu> Robert Kern wrote: >Rick Muller wrote: > >> >> >>That being said, I just had my first system where Numpy failed to >>build. It's on a redhat 7.3 (yes, we have a 7.3 box. I didn't believe >>it either. not my decision.) and I get the following error when trying >>to run Numpy: >> >> >> >This usually means that you are not linking in the g2c library. 
> > On Redhat 7.3, I don't believe there was a g2c library, but an f2c one. So -lf2c is needed at the link step (and f2c needs to be installed). From robert.kern at gmail.com Mon Apr 24 20:54:02 2006 From: robert.kern at gmail.com (Robert Kern) Date: Mon Apr 24 20:54:02 2006 Subject: [Numpy-discussion] Re: Problems building numpy In-Reply-To: <444D9C0C.3030006@csun.edu> References: <02801766-7F45-48EE-AD4A-7B4B0590C9AC@sandia.gov> <444D9C0C.3030006@csun.edu> Message-ID: Stephen Walton wrote: > Robert Kern wrote: > >> Rick Muller wrote: >> >>> That being said, I just had my first system where Numpy failed to >>> build. It's on a redhat 7.3 (yes, we have a 7.3 box. I didn't believe >>> it either. not my decision.) and I get the following error when trying >>> to run Numpy: >> >> This usually means that you are not linking in the g2c library. >> > On Redhat 7.3, I don't believe there was a g2c library, but an f2c one. > So -lf2c is needed at the link step (and f2c needs to be installed). Well, there's libf2c which is a library provided by f2c, a program that converts FORTRAN to C. And then there's libg2c which is provided by g77. They really are different and, I don't think, interchangeable. Note that libg2c will be stuck several ellipses down in the bowels of /usr/lib/gcc/.../.../libg2c.a not up in /usr/lib/. -- Robert Kern robert.kern at gmail.com "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From stephen.walton at csun.edu Mon Apr 24 21:09:01 2006 From: stephen.walton at csun.edu (Stephen Walton) Date: Mon Apr 24 21:09:01 2006 Subject: [Numpy-discussion] Re: Problems building numpy In-Reply-To: References: <02801766-7F45-48EE-AD4A-7B4B0590C9AC@sandia.gov> <444D9C0C.3030006@csun.edu> Message-ID: <444DA0A5.80902@csun.edu> Robert Kern wrote: >Well, there's libf2c which is a library provided by f2c, a program that converts >FORTRAN to C. And then there's libg2c which is provided by g77. They really are >different > Oh, I knew that. My point was that there were some old Redhat releases (I don't recall if 7.3 is that old, probably not) which didn't include g77, just an f77 shell script which called f2c and cc. From robert.kern at gmail.com Mon Apr 24 21:14:01 2006 From: robert.kern at gmail.com (Robert Kern) Date: Mon Apr 24 21:14:01 2006 Subject: [Numpy-discussion] Re: Problems building numpy In-Reply-To: <444DA0A5.80902@csun.edu> References: <02801766-7F45-48EE-AD4A-7B4B0590C9AC@sandia.gov> <444D9C0C.3030006@csun.edu> <444DA0A5.80902@csun.edu> Message-ID: Stephen Walton wrote: > Robert Kern wrote: > >> Well, there's libf2c which is a library provided by f2c, a program >> that converts >> FORTRAN to C. And then there's libg2c which is provided by g77. They >> really are >> different > > Oh, I knew that. My point was that there were some old Redhat releases > (I don't recall if 7.3 is that old, probably not) which didn't include > g77, just an f77 shell script which called f2c and cc. Oy. I'm not sure if even we support that. -- Robert Kern robert.kern at gmail.com "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." 
-- Umberto Eco From rob at hooft.net Mon Apr 24 21:25:01 2006 From: rob at hooft.net (Rob Hooft) Date: Mon Apr 24 21:25:01 2006 Subject: [Numpy-discussion] Re: Problems building numpy In-Reply-To: <444DA0A5.80902@csun.edu> References: <02801766-7F45-48EE-AD4A-7B4B0590C9AC@sandia.gov> <444D9C0C.3030006@csun.edu> <444DA0A5.80902@csun.edu> Message-ID: <444DA473.2010000@hooft.net> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Stephen Walton wrote: | Robert Kern wrote: | |> Well, there's libf2c which is a library provided by f2c, a program |> that converts |> FORTRAN to C. And then there's libg2c which is provided by g77. They |> really are |> different | | Oh, I knew that. My point was that there were some old Redhat releases | (I don't recall if 7.3 is that old, probably not) which didn't include | g77, just an f77 shell script which called f2c and cc. And in addition, very old versions of g77 (I'm not sure to which RedHat version this age corresponds) used f2c's library unmodified. I think the f2c/cc times (the compiler script was called fcomp?) were a bit older. I moved back to my current job with RedHat 4.x (1997), and I worked with self-compiled g77 already in my previous job.... Rob - -- Rob W.W. Hooft || rob at hooft.net || http://www.hooft.net/people/rob/ -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.2.2 (GNU/Linux) Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org iD8DBQFETaRzH7J/Cv8rb3QRAqtEAKCsDcj3tO7Gcvgsyj0CaDCu99JLSgCgjgjp sB7u8S0krk5a1G2bYC+h9cQ= =MLOS -----END PGP SIGNATURE----- From oliphant.travis at ieee.org Mon Apr 24 21:31:02 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Mon Apr 24 21:31:02 2006 Subject: [Numpy-discussion] Broadcasting rules (Ticket 76). In-Reply-To: References: <444D0DF7.2060307@ieee.org> <444D4329.9050700@ieee.org> Message-ID: <444DA5D4.4080104@ieee.org> Sasha wrote: > On 4/24/06, Travis Oliphant wrote: > >> Sasha wrote: >> >>>>>> x[:]=1,1 >>>>>> x[:]=1,1,1 >>>>>> >>>>>> >>> Traceback (most recent call last): >>> File "", line 1, in ? >>> ValueError: number of elements in destination must be integer multiple >>> of number of elements in source >>> >>> >> I think the only reasonable thing to do is to raise an error unless the >> shapes were compatible like Numeric did and eliminate the multiple >> copying feature. >> > > I've attached a patch to the ticket: > > > > I don't see why slice assignment cannot reuse the ufunc code. It > looks like slice assignment can just be dispatched to a trivial > (pass-through) ufunc. This aproach may even prove to be faster > because type-aware copying loops can be faster than memmove on popular > platforms. > > It could re-use that code but there are at least two drawbacks to that approach: 1) The overhead of the ufunc for small array copies. 2) The special-casing that would be needed for variable-size arrays (string, unicode, void...) which are not supported by the ufunc machinery. and we've already improved the copying by making them type-aware. Right now copying is handled by the data-type functions (not the ufuncs). Perhaps what should be done instead is to allow for strided copying in the copyswapn function. To fully support record arrays with object components the copy operation for the VOID case needs to be recursive when fields are defined. -Travis From oliphant.travis at ieee.org Mon Apr 24 22:00:02 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Mon Apr 24 22:00:02 2006 Subject: [Numpy-discussion] Broadcasting rules (Ticket 76). 
In-Reply-To: References: <444D0DF7.2060307@ieee.org> <444D4329.9050700@ieee.org> Message-ID: <444DACB8.50203@ieee.org> Sasha wrote: > On 4/24/06, Travis Oliphant wrote: > > I've attached a patch to the ticket: > > > I don't think the patch will do your definition of "the right thing" (i.e. mirror broadcasting behavior) in all cases. For example if "a" is 2x3x4x5 and "b" is 2x1x1x5, then a[...] = b will not fill the right sub-space of "a" with the contents of "b". The PyArray_CopyInto gets called in a lot of places. Have you checked all of them to be sure that altering the semantics of copying (which are currently different than broadcasting) will work correctly? I agree that one can demonstrate a slight in-consistency. But, I'd rather have the inconsistency and tell people that copying and assignment is not a broadcasting ufunc, then feign consistency and have it not quite right. -Travis From robert.kern at gmail.com Mon Apr 24 22:22:03 2006 From: robert.kern at gmail.com (Robert Kern) Date: Mon Apr 24 22:22:03 2006 Subject: [Numpy-discussion] Re: [SciPy-dev] Google Summer of Code In-Reply-To: <44476AEA.7080003@decsai.ugr.es> References: <44476AEA.7080003@decsai.ugr.es> Message-ID: <444DB033.4000906@gmail.com> [Cross-posted because this is partially an announcement. Continuing discussion should go to only one list, please.] Antonio Arauzo Azofra wrote: > Google Summer of Code > http://code.google.com/soc/ > > Have you considered participating as a Mentoring organization? Offering > any project about Scipy? I'm not sure which "you" you are referring to here, but yes! Unfortunately, it was a bit late in the process to be applying as a mentoring organization. Google started consolidating mentoring organizations. However, I and several others at Enthought are volunteering to mentor through the PSF. I encourage others on these lists to do the same or to apply as students, whichever is appropriate. We'll be happy to provide SVN workspace for numpy and scipy SoC projects. I've added one fairly general scipy entry to the python.org Wiki page listing project ideas: http://wiki.python.org/moin/SummerOfCode If you have more specific ideas, please add them to the Wiki. Potential mentors: Neal Norwitz is coordinating PSF mentors this year and has asked that those he or Guido does not know personally to give personal references. If you've been active on this list, I'm sure we can play the "Two Degrees of Separation From Guido Game" and get you a reference from someone else here. -- Robert Kern robert.kern at gmail.com "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From oliphant.travis at ieee.org Mon Apr 24 22:27:02 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Mon Apr 24 22:27:02 2006 Subject: [Numpy-discussion] Broadcasting rules (Ticket 76). In-Reply-To: <444DACB8.50203@ieee.org> References: <444D0DF7.2060307@ieee.org> <444D4329.9050700@ieee.org> <444DACB8.50203@ieee.org> Message-ID: <444DB302.30903@ieee.org> Travis Oliphant wrote: > Sasha wrote: >> On 4/24/06, Travis Oliphant wrote: >> I've attached a patch to the ticket: >> >> >> >> > I don't think the patch will do your definition of "the right thing" > (i.e. mirror broadcasting behavior) in all cases. For example if "a" > is 2x3x4x5 and "b" is 2x1x1x5, then a[...] = b will not fill the > right sub-space of "a" with the contents of "b". 
> > > The PyArray_CopyInto gets called in a lot of places. Have you checked > all of them to be sure that altering the semantics of copying (which > are currently different than broadcasting) will work correctly? I > agree that one can demonstrate a slight in-consistency. But, I'd > rather have the inconsistency and tell people that copying and > assignment is not a broadcasting ufunc, then feign consistency and > have it not quite right. > Of course, as I've said I'm not opposed to the consistency. To do it "right", one should use PyArray_MultiIterNew which abstracts the concept of broadcasting into iterators (and uses the broadcastable checking code that's already written --- so you guarantee consistency). I'm not sure what overhead it would bring. But, special cases could be checked-for (scalar, and same-size arrays for example). I'm also thinking that copyswapn should grow stride arguments so that it can be used more generally. -Travis From lroubeyrie at limair.asso.fr Tue Apr 25 00:39:04 2006 From: lroubeyrie at limair.asso.fr (Lionel Roubeyrie) Date: Tue Apr 25 00:39:04 2006 Subject: [Numpy-discussion] equality with masked object Message-ID: <200604250938.48648.lroubeyrie@limair.asso.fr> Hi all, I have a problem with masked_object (and masked_values to) like in this sort example : ########################################### lionel[Donn?es]8>test=array([1,2,3,inf,5]) lionel[Donn?es]9>test = ma.masked_object(test, inf) lionel[Donn?es]10>print test[3], type(test[3]) -- lionel[Donn?es]11>print test.max(), type(test.max()) 5.0 lionel[Donn?es]12>test[3] == test.max() Sortie[12]: array(data = [True], mask = True, fill_value=?) ########################################### Why 5.0 == -- return True? A float is it the same as a masked object? thanks -- Lionel Roubeyrie - lroubeyrie at limair.asso.fr LIMAIR http://www.limair.asso.fr From nicolas.chauvat at logilab.fr Tue Apr 25 03:22:15 2006 From: nicolas.chauvat at logilab.fr (Nicolas Chauvat) Date: Tue Apr 25 03:22:15 2006 Subject: [Numpy-discussion] announce: pyjit, a little jit for creating numpy ufuncs In-Reply-To: References: <20060421162336.42285837.simon@arrowtheory.com> Message-ID: <20060425102134.GI24645@crater.logilab.fr> On Mon, Apr 24, 2006 at 04:17:16PM -0400, David M. Cooke wrote: > Simon Burton writes: > > > Hi, > > > > Inspired by numexpr, pypy and llvm, i've built a simple > > JIT for creating numpy "ufuncs" (they are not yet real ufuncs). > > It uses llvm[1] as the backend machine code generator. > > Cool! I had a look at LLVM, but I wanted something to go into SciPy, > and that was too heavy a dependence. However, I could see doing more > stuff with this than I can easily with numexpr. Hello, People interested in this might also be interested in PyPy's rctypes and the exploratory work done in PyPy to annotate code using arrays. The goal is "write Python code using numeric arrays and other C libs, then ask PyPy to translate it to C while removing the python wrapper of the C libs, then compile". Then you can run the code as python code when developping and compile the all thing from C to assembly when speed matters. Please note it is a goal. We are not there yet. 
But any help will be welcome :) -- Nicolas Chauvat logilab.fr - services en informatique avancée et gestion de connaissances From steffen.loeck at gmx.de Tue Apr 25 04:25:22 2006 From: steffen.loeck at gmx.de (Steffen Loeck) Date: Tue Apr 25 04:25:22 2006 Subject: [Numpy-discussion] vectorize problem Message-ID: <200604251324.42987.steffen.loeck@gmx.de> Hello all, I have a problem using scalar variables in a vectorized function: from numpy import vectorize def f(x): if x>0: return 1 else: return 0 F = vectorize(f) F(1) gives the error message: --------------------------------------------------------------------------- exceptions.AttributeError Traceback (most recent call last) .../function_base.py in __call__(self, *args) 619 620 if self.nout == 1: --> 621 return self.ufunc(*args).astype(self.otypes[0]) 622 else: 623 return tuple([x.astype(c) for x, c in zip(self.ufunc(*args), self.otypes)]) AttributeError: 'int' object has no attribute 'astype' Is there any way to get vectorized functions working with scalars again? Regards Steffen From ndarray at mac.com Tue Apr 25 06:17:13 2006 From: ndarray at mac.com (Sasha) Date: Tue Apr 25 06:17:13 2006 Subject: [Numpy-discussion] equality with masked object In-Reply-To: <200604250938.48648.lroubeyrie@limair.asso.fr> References: <200604250938.48648.lroubeyrie@limair.asso.fr> Message-ID: On 4/25/06, Lionel Roubeyrie wrote: > > Why does 5.0 == -- return True? Is a float the same as a masked object? > thanks It does not. It returns ma.masked: >>> test[3] is ma.masked True You should not access masked data - it makes no sense. The current behavior is historical and I don't really like it. Masked scalars are replaced by the ma.masked singleton in subscript operations to allow the a[i] is masked idiom. In my view it is not worth the trouble, but my suggestion to get rid of that feature was not met with much enthusiasm. From ndarray at mac.com Tue Apr 25 06:59:07 2006 From: ndarray at mac.com (Sasha) Date: Tue Apr 25 06:59:07 2006 Subject: [Numpy-discussion] Broadcasting rules (Ticket 76). In-Reply-To: <444DACB8.50203@ieee.org> References: <444D0DF7.2060307@ieee.org> <444D4329.9050700@ieee.org> <444DACB8.50203@ieee.org> Message-ID: On 4/25/06, Travis Oliphant wrote: > Sasha wrote: > > On 4/24/06, Travis Oliphant wrote: > > > > I've attached a patch to the ticket: > > > > > > > I don't think the patch will do your definition of "the right thing" > (i.e. mirror broadcasting behavior) in all cases. For example if "a" is > 2x3x4x5 and "b" is 2x1x1x5, then a[...] = b will not fill the right > sub-space of "a" with the contents of "b". >
From charges at humortadela.com.br Tue Apr 25 07:23:06 2006 From: charges at humortadela.com.br (Humortadela) Date: Tue Apr 25 07:23:06 2006 Subject: [Numpy-discussion] Voce recebeu uma charge humortadela Message-ID: <80a6946d133576735a9bca9dea6ea1c3@humortadela.com.br> An HTML attachment was scrubbed... URL: From charges at humortadela.com.br Tue Apr 25 07:24:03 2006 From: charges at humortadela.com.br (Humortadela) Date: Tue Apr 25 07:24:03 2006 Subject: [Numpy-discussion] Voce recebeu uma charge humortadela Message-ID: <80a6946d133576735a9bca9dea6ea1c3@humortadela.com.br> An HTML attachment was scrubbed... URL: From perry at stsci.edu Tue Apr 25 08:21:02 2006 From: perry at stsci.edu (Perry Greenfield) Date: Tue Apr 25 08:21:02 2006 Subject: [Numpy-discussion] Re: Backporting numpy to Python 2.2 In-Reply-To: References: <20060419103554.4ac1df4a.twegener@radlogic.com.au> Message-ID: <93BC9AD0-A6CA-4128-B0EE-9999F4CE8077@stsci.edu> On Apr 24, 2006, at 8:38 PM, Travis E. Oliphant wrote: > Tim Wegener wrote: >> Hi, I am attempting to backport numpy-0.9.6 to be compatible with >> python 2.2. (Some of our machines run python 2.2 as part of Red >> Hat 9 and Red Hat 7.3 and it is hazardous to alter the standard >> setup.) I was able to change most of the 2.3-isms to be 2.2 >> compatible (see the attached patch). However I had problems >> compiling the following c module: > > I targeted Python 2.3 because it added some very nice constructs > (Python 2.4 added even more but I disciplined myself not to use them). > > I think it is not impossible to back-port it to Python 2.2 but I > agree with Robert that I wonder if it is worth the effort. > > In this case Python 2.3 added the bool type which is used in NumPy. > Basically this type would have to be constructed (the code could be > grabbed from Python 2.3) in order to be used. > > The addition of the boolean type is probably the single biggest > change that would make back-porting to 2.2 difficult. If I recall correctly, True and False were added in one of the 2.2 patch releases (one of those rare new features added in a patch release). Only as constant definitions using 0 and 1, and not the current boolean implementation. So depending on what the current dependencies on booleans are, it may or may not be usable from 2.2.3. But I also wonder if it is worth the effort. I tend to think not. Perry From ndarray at mac.com Tue Apr 25 10:27:10 2006 From: ndarray at mac.com (Sasha) Date: Tue Apr 25 10:27:10 2006 Subject: [Numpy-discussion] Question about __array_struct__ Message-ID: I am trying to add __array_struct__ attribute to R object wrappers in RPy. This is desirable because it eliminates a compile-time dependency on an array module and makes the binary compatible with either Numeric or numpy. R has four types of data: logical, integer, float, and character. The first three map perfectly to Numpy with inter->data simply pointing to an appropriate internal memory area. The character type, however is more problematic. In R character arrays are arrays of variable length strings and therefore similar to Numpy object arrays holding python strings. Obviously, there is no memory area that can be reused. I've tried to allocate new memory in __array_struct__ getter, but this presents a problem: I cannot deallocate that memory in CObject destructor because it is passed to the newly created array which lives long after the interface object is deleted. 
The __array_struct__ mechanism does not seem to allow to cause the new array assume ownership of the data, but even if it did, I do not know what memory allocator is appropriate. The only solution that I can think of is to create a dummy buffer type with the sole purpose of deleting an array of PyObjects and make an instance of that type the "base" of the new array. Can anyone suggest a better approach? From strawman at astraw.com Tue Apr 25 10:52:08 2006 From: strawman at astraw.com (Andrew Straw) Date: Tue Apr 25 10:52:08 2006 Subject: [Numpy-discussion] Question about __array_struct__ In-Reply-To: References: Message-ID: <444E619C.6030802@astraw.com> Sasha wrote: >I cannot deallocate that memory in CObject destructor because it is >passed to the newly created array which lives long after the interface >object is deleted. > Normally, the array that's viewing the data held by the __array_struct__ should keep a reference to the base object alive, thus preventing the issue. If the base object isn't a Python object, you'll have to create some kind of Python type that will ensure the original data is not freed, although this would normally take place via refcounts if the data source was a Python object. > The __array_struct__ mechanism does not seem to >allow to cause the new array assume ownership of the data, but even if >it did, I do not know what memory allocator is appropriate. > >The only solution that I can think of is to create a dummy buffer type >with the sole purpose of deleting an array of PyObjects and make an >instance of that type the "base" of the new array. > > Yes, that's I do. (See http://www.scipy.org/Cookbook/ArrayStruct_and_Pyrex for example.) From fullung at gmail.com Tue Apr 25 14:16:06 2006 From: fullung at gmail.com (Albert Strasheim) Date: Tue Apr 25 14:16:06 2006 Subject: [Numpy-discussion] SWIG wrappers: Inplace arrays Message-ID: <006b01c668ad$68b12ab0$0502010a@dsp.sun.ac.za> Hello all I am using the SWIG Numpy typemaps to wrap some C code. I ran into the following problem when wrapping a function with INPLACE_ARRAY1. In Python, I create the following array: x = array([],dtype='descr->type_num) Given that I created the array with ' ---- Travis Oliphant wrote: > Sasha wrote: > > In this category, I would suggest to allow broadcasting to any > > multiple of the dimension even if the dimension is not 1. I don't see > > what makes 1 so special. > > > What's so special about 1 is that the code for it is relatively > straightforward and already implemented using strides. Altering the > code to allow any multiple of the dimension would be harder and slower. It also does the right thing most of the time and is easy to understand. It's my expectation that oppening up broadcasting will be more effective in masking errors than in enabling useful new behaviour. I think that's my ticket being discussed here. If so, it was motivated by a case that stopped working because the looser broadcasting behaviour was preventing some other broadcasting from taking place. I'm not home right now, so I can't provide details; I'll do that on Thursday. Just keep in mind that it's much easier to keep the broadcasting rules restrictive for now and loosen them up later than to try to tighten them up later if loosening them up turns out to not be a good idea. -tim From tim.hochberg at cox.net Tue Apr 25 14:24:05 2006 From: tim.hochberg at cox.net (tim.hochberg at cox.net) Date: Tue Apr 25 14:24:05 2006 Subject: [Numpy-discussion] Broadcasting rules (Ticket 76). 
Message-ID: <4050236.1146000212838.JavaMail.root@fed1wml05.mgt.cox.net> ---- Travis Oliphant wrote: > Sasha wrote: > > In this category, I would suggest to allow broadcasting to any > > multiple of the dimension even if the dimension is not 1. I don't see > > what makes 1 so special. > > > What's so special about 1 is that the code for it is relatively > straightforward and already implemented using strides. Altering the > code to allow any multiple of the dimension would be harder and slower. It also does the right thing most of the time and is easy to understand. It's my expectation that oppening up broadcasting will be more effective in masking errors than in enabling useful new behaviour. I think that's my ticket being discussed here. If so, it was motivated by a case that stopped working because the looser broadcasting behaviour was preventing some other broadcasting from taking place. I'm not home right now, so I can't provide details; I'll do that on Thursday. Just keep in mind that it's much easier to keep the broadcasting rules restrictive for now and loosen them up later than to try to tighten them up later if loosening them up turns out to not be a good idea. -tim From oliphant at ee.byu.edu Tue Apr 25 15:55:04 2006 From: oliphant at ee.byu.edu (Travis Oliphant) Date: Tue Apr 25 15:55:04 2006 Subject: [Numpy-discussion] SWIG wrappers: Inplace arrays In-Reply-To: <006b01c668ad$68b12ab0$0502010a@dsp.sun.ac.za> References: <006b01c668ad$68b12ab0$0502010a@dsp.sun.ac.za> Message-ID: <444EA88B.4050704@ee.byu.edu> Albert Strasheim wrote: >Hello all > >I am using the SWIG Numpy typemaps to wrap some C code. I ran into the >following problem when wrapping a function with INPLACE_ARRAY1. > >In Python, I create the following array: > >x = array([],dtype=' >When this is passed to the C function expecting an int*, it goes via >obj_to_array_no_conversion in numpy.i where a direct comparison of the >typecodes is done, at which point a TypeError is raised. > >In this case: > >desired type = int [typecode 5] >actual type = long [typecode 7] > >The typecode is obtained as follows: > >#define array_type(a) (int)(((PyArrayObject *)a)->descr->type_num) > >Given that I created the array with 'int instead of long. Why isn't this happening? > > Actually there is ambiguity i4 can be either int or long. If you want to guarantee an int-type then use dtype=intc). >Assuming the is a good reason for type_num being what it is, I think >obj_to_array_no_conversion needs to be slightly cleverer about the >conversions it allows. Is there any way to figure out that int and long are >actually identical (at least on my system) using the Numpy C API? Any other >suggestions or comments for solving this problem? > > > Yes. You can use one of PyArray_EquivTypes(PyArray_Descr *dtype1, PyArray_Descr *dtype2) PyArray_EquivTypenums(int typenum1, int typenum2) PyArray_EquivArrTypes(PyObject *array1, PyObject *array2) These return TRUE (non-zero) if the two type representations are equivalent. -Travis From oliphant at ee.byu.edu Tue Apr 25 16:07:05 2006 From: oliphant at ee.byu.edu (Travis Oliphant) Date: Tue Apr 25 16:07:05 2006 Subject: [Numpy-discussion] SWIG wrappers: Inplace arrays In-Reply-To: <006b01c668ad$68b12ab0$0502010a@dsp.sun.ac.za> References: <006b01c668ad$68b12ab0$0502010a@dsp.sun.ac.za> Message-ID: <444EAB81.3070001@ee.byu.edu> Albert Strasheim wrote: >Hello all > >I am using the SWIG Numpy typemaps to wrap some C code. I ran into the >following problem when wrapping a function with INPLACE_ARRAY1. 
> >In Python, I create the following array: > >x = array([],dtype=' >When this is passed to the C function expecting an int*, it goes via >obj_to_array_no_conversion in numpy.i where a direct comparison of the >typecodes is done, at which point a TypeError is raised. > >In this case: > >desired type = int [typecode 5] >actual type = long [typecode 7] > >The typecode is obtained as follows: > >#define array_type(a) (int)(((PyArrayObject *)a)->descr->type_num) > >Given that I created the array with 'int instead of long. Why isn't this happening? > >Assuming the is a good reason for type_num being what it is, I think >obj_to_array_no_conversion needs to be slightly cleverer about the >conversions it allows. Is there any way to figure out that int and long are >actually identical (at least on my system) using the Numpy C API? Any other >suggestions or comments for solving this problem? > > > Here is the relevant new numpy.i code (just checked in...) PyArrayObject* obj_to_array_no_conversion(PyObject* input, int typecode) { PyArrayObject* ary = NULL; if (is_array(input) && (typecode == PyArray_NOTYPE || PyArray_EquivTypenums(array_type(input), typecode)) { ary = (PyArrayObject*) input; } else if is_array(input) { char* desired_type = typecode_string(typecode); char* actual_type = typecode_string(array_type(input)); PyErr_Format(PyExc_TypeError, "Array of type '%s' required. Array of type '%s' given", desired_type, actual_type); ary = NULL; } else { char * desired_type = typecode_string(typecode); char * actual_type = pytype_string(input); PyErr_Format(PyExc_TypeError, "Array of type '%s' required. A %s was given", desired_type, actual_type); ary = NULL; } return ary; } From ndarray at mac.com Tue Apr 25 18:17:04 2006 From: ndarray at mac.com (Sasha) Date: Tue Apr 25 18:17:04 2006 Subject: [Numpy-discussion] Broadcasting rules (Ticket 76). In-Reply-To: <4050236.1146000212838.JavaMail.root@fed1wml05.mgt.cox.net> References: <4050236.1146000212838.JavaMail.root@fed1wml05.mgt.cox.net> Message-ID: On 4/25/06, tim.hochberg at cox.net wrote: > > ---- Travis Oliphant wrote: > > Sasha wrote: > > > In this category, I would suggest to allow broadcasting to any > > > multiple of the dimension even if the dimension is not 1. I don't see > > > what makes 1 so special. > > > > > What's so special about 1 is that the code for it is relatively > > straightforward and already implemented using strides. Altering the > > code to allow any multiple of the dimension would be harder and slower. I don't think so. The same zero-stride trick that allows size-1 broadcasting can be used to implement repetition. I did not review the C code, but the following Python fragment shows that the loop that is already in numpy can be used to implement repetition by simply manipulating shapes and strides: >>> x = zeros(6) >>> reshape(x,(3,2))[...] = 1,2 >>> x array([1, 2, 1, 2, 1, 2]) > It also does the right thing most of the time and is easy to understand. Easy to understand? Let me quote Travis' book on this: "Broadcasting can be understood by four rules: ... While perhaps somewhat difficult to explain, broadcasting can be quite useful and becomes second nature rather quickly." I may be slow, but it did not become second nature for me. I am still getting bitten by subtle differences between unit length 1-d arrays and 0-d arrays. > It's my expectation that oppening up broadcasting will be more effective in masking > errors than in enabling useful new behaviour. 
> In my experience broadcasting length-1 and not broadcasting other lengths is very error prone as it is. I understand that restricting broadcasting to make it a strictly dimension-increasing operation is not possible for two reasons: 1. Numpy cannot break legacy Numeric code. 2. It is not possible to differentiate between 1-d array that broadcasts column-wise vs. one that broadcasts raw-wise. In my view none of these reasons is valid. In my experience Numeric code that relies on dimension-preserving broadcasting is already broken, only in a subtle and hard to reproduce way. Similarly the need to broadcast over non-leading dimension is a sign of bad design. In rare cases where such broadcasting is desirable, it can be easily done via swapaxes which is a cheap operation. Nevertheless, I've lost that battle some time ago. On the other hand I don't see much problem in making dimension-preserving broadcasting more permissive. In R, for example, (1-d) arrays can be broadcast to arbitrary size. This has an additional benefit that 1-d to 2-d broadcasting requires no special code, it just happens because matrices inherit arithmetics from vectors. I've never had a problem with R rules being too loose. > I think that's my ticket being discussed here. If so, it was motivated by a case that > stopped working because the looser broadcasting behaviour was preventing some > other broadcasting from taking place. I'm not home right now, so I can't provide > details; I'll do that on Thursday. In my view the problem that your ticket highlighted is not so much in the particular set of broadcasting rules, but in the fact that a[...] = b uses one set of rules while a[...] += b uses another. This is *very* confusing. > Just keep in mind that it's much easier to keep the broadcasting rules restrictive for > now and loosen them up later than to try to tighten them up later if loosening them up > turns out to not be a good idea. You are preaching to the choir! From simon at arrowtheory.com Tue Apr 25 18:29:01 2006 From: simon at arrowtheory.com (Simon Burton) Date: Tue Apr 25 18:29:01 2006 Subject: [Numpy-discussion] announce: pyjit, a little jit for creating numpy ufuncs In-Reply-To: References: <20060421162336.42285837.simon@arrowtheory.com> Message-ID: <20060426112808.531d652b.simon@arrowtheory.com> On Mon, 24 Apr 2006 16:17:16 -0400 cookedm at physics.mcmaster.ca (David M. Cooke) wrote: > > How do the speedups compare with numexpr? numexpr segfaults for me (runing timings.py): Program received signal SIGSEGV, Segmentation fault. [Switching to Thread -1209670912 (LWP 31768)] 0xb7d2b696 in PyArray_NewFromDescr (subtype=0x626e6769, descr=0x64007469, nd=1919251557, dims=0x656e696d, strides=0x782d2073, data=0x656c6520, flags=1953391981, obj=0x65736977) at arrayobject.c:3942 3942 arrayobject.c: No such file or directory. in arrayobject.c Simon. -- Simon Burton, B.Sc. Licensed PO Box 8066 ANU Canberra 2601 Australia Ph. 61 02 6249 6940 http://arrowtheory.com From robert.kern at gmail.com Tue Apr 25 20:10:07 2006 From: robert.kern at gmail.com (Robert Kern) Date: Tue Apr 25 20:10:07 2006 Subject: [Numpy-discussion] Chang*ed* the Trac authentication Message-ID: <444EE463.10007@gmail.com> Trying not to embarass myself again, I made the changes without telling you. :-) In order to create or modify Wiki pages or tickets on the NumPy and SciPy Tracs, you will have to be logged in. You can register yourself by clicking the "Register" link in the upper right-hand corner of the page. 
Developers who previously had accounts have the same username/password as before. You can now change your password if you like. Only developers have the ability to close tickets, delete Wiki pages entirely, or create new ticket reports (and possibly a couple of other things). Developers, please enter your name and email by clicking on the "Settings" link up at top once logged in. Thank you for your patience. If there are any problems, please email me, and I will try to correct them quickly. -- Robert Kern robert.kern at gmail.com "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From oliphant.travis at ieee.org Tue Apr 25 22:26:01 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Tue Apr 25 22:26:01 2006 Subject: [Numpy-discussion] Broadcasting rules (Ticket 76). In-Reply-To: References: <4050236.1146000212838.JavaMail.root@fed1wml05.mgt.cox.net> Message-ID: <444F0420.9000500@ieee.org> Sasha wrote: > On 4/25/06, tim.hochberg at cox.net wrote: > >> ---- Travis Oliphant wrote: >> >>> Sasha wrote: >>> >>>> In this category, I would suggest to allow broadcasting to any >>>> multiple of the dimension even if the dimension is not 1. I don't see >>>> what makes 1 so special. >>>> >>>> >>> What's so special about 1 is that the code for it is relatively >>> straightforward and already implemented using strides. Altering the >>> code to allow any multiple of the dimension would be harder and slower. >>> > > I don't think so. The same zero-stride trick that allows size-1 > broadcasting can be used to implement repetition. I did not review > the C code, but the following Python fragment shows that the loop that > is already in numpy can be used to implement repetition by simply > manipulating shapes and strides: > I don't think anyone is fundamentally opposed to multiple repetitions. We're just being cautious. Also, as you've noted, the assignment code is currently not using the ufunc broadcasting code and so they really aren't the same thing, yet. > >> It's my expectation that oppening up broadcasting will be more effective in masking >> errors than in enabling useful new behaviour. >> >> > In my experience broadcasting length-1 and not broadcasting other > lengths is very error prone as it is. That's not been my experience. But, I don't know R very well. I'm very interested in what ideas you can bring. > I understand that restricting > broadcasting to make it a strictly dimension-increasing operation is > not possible for two reasons: > > 1. Numpy cannot break legacy Numeric code. > 2. It is not possible to differentiate between 1-d array that > broadcasts column-wise vs. one that broadcasts raw-wise. > > In my view none of these reasons is valid. In my experience Numeric > code that relies on dimension-preserving broadcasting is already > broken, only in a subtle and hard to reproduce way. I definitely don't agree with you here. Dimension-preserving broadcasting is at the heart of the utility of broadcasting and it is very, very useful for that. Calling it subtly broken suggests that you don't understand it and have never used it for it's intended purpose. I've used dimension-preserving broadcasting literally hundreds of times. It's rather bold of you to say that all of that code is "broken" Now, I'm sure there are other useful ways to "broadcast", but dimension-preserving is essentially what broadcasting *is* in NumPy. 
If anything it is the dimension-increasing rule that is somewhat arbitrary (e.g. why prepend with ones). Perhaps you want to introduce some other way for non-commensurate shapes to interact in an operation. I think you will find many open minds on this list (although probably not anyone who will want to code it up :-) ). We do welcome the discussion. Your experience with other array-like languages is helpful. > Similarly the > need to broadcast over non-leading dimension is a sign of bad design. > In rare cases where such broadcasting is desirable, it can be easily > done via swapaxes which is a cheap operation. > Again, it would help if you would refrain from using negative words about coding styles that are different from your own. Such broadcasting is not that rare. It happens quite frequently, actually. The point of a language like Python is that you can write algorithms simply without struggling with optimization questions up front like you seem to be hinting at. > On the other hand I don't see much problem in making > dimension-preserving broadcasting more permissive. In R, for example, > (1-d) arrays can be broadcast to arbitrary size. This has an > additional benefit that 1-d to 2-d broadcasting requires no special > code, it just happens because matrices inherit arithmetics from > vectors. I've never had a problem with R rules being too loose. > So, please explain exactly what you mean. Only a few on this list know what the R rules even are. > In my view the problem that your ticket highlighted is not so much in > the particular set of broadcasting rules, but in the fact that a[...] > = b uses one set of rules while a[...] += b uses another. This is > *very* confusing. > Yes, this is admittedly confusing. But, it's an outgrowth of the way Numeric code developed. Broadcasting was always only a ufunc concept in Numeric, and copying was not a ufunc. NumPy grew out of Numeric code. I was not trying to mimick broadcasting behavior when I wrote the array copy and array setting code. Perhaps I should have been. I'm willing to change the code on this one, but only if the new copy code actually does implement broadcasting behavior equivalently. And going through the ufunc machinery is probably a waste of effort because the copy code must be written for variable length arrays anyway (and ufuncs don't support them). However, the broadcasting machinery has been abstracted in NumPy and can therefore be re-used in the copying code. In Numeric, broadcasting was basically implemented deep inside a confusing while loop. -Travis From fullung at gmail.com Tue Apr 25 23:42:05 2006 From: fullung at gmail.com (Albert Strasheim) Date: Tue Apr 25 23:42:05 2006 Subject: [Numpy-discussion] SWIG wrappers: Passing NULL pointers or arrays Message-ID: <00dd01c668fc$6d04b470$0502010a@dsp.sun.ac.za> Hello all, I've currently wrapping a C library (libsvm) with NumPy. 
libsvm has a few structs similiar to the following: struct svm_parameter { double* weight; int nr_weight; }; In my SWIG wrapper I did the following: struct svm_parameter { %immutable; int nr_weight; %mutable; double* weight; %extend { svm_parameter() { struct svm_parameter* param = (struct svm_parameter*) malloc(sizeof(struct svm_parameter)); param->nr_weight = 0; param->weight = 0; return param; } ~svm_parameter() { free(self->weight); free(self); } void _set_weight(double* IN_ARRAY1, int DIM1) { free(self->weight); self->nr_weight = DIM1; self->weight = malloc(sizeof(double) * DIM1); if (!self->weight) { SWIG_exception(SWIG_MemoryError, "OOM"); } memcpy(self->weight, IN_ARRAY1, sizeof(double) * DIM1); return; fail: self->nr_weight = 0; self->weight = 0; } } }; This works pretty well (suggestion welcome though). However, one feature that I think is lacking from the current array typemaps is a way of passing NULL to the C function. On the Python side I want to be able to do: svm_parameter.weight = N.array([1.0,2.0]) or svm_parameter.weight = None This heads off to __setattr__ where the following happens: def __setattr__(self, attr, val): if attr in ['weight', 'weight_label']: set_func = getattr(self, '_set_%s' % (attr,)) set_func(val) else: super(svm_parameter, self).__setattr__(attr, val) At this point the typemap magic kicks in. However, passing a None doesn't work, because somewhere down the line somebody checks for the int argument. The current typemap looks like this: %define TYPEMAP_IN1(type,typecode) %typemap(in) (type* IN_ARRAY1, int DIM1) (PyArrayObject* array=NULL, int is_new_object) { int size[1] = {-1}; array = obj_to_array_contiguous_allow_conversion($input, typecode, &is_new_object); if (!array || !require_dimensions(array,1) || !require_size(array,size,1)) SWIG_fail; $1 = (type*) array->data; $2 = array->dimensions[0]; } %typemap(freearg) (type* IN_ARRAY1, int DIM1) { if (is_new_object$argnum && array$argnum) Py_DECREF(array$argnum); } %enddef I quickly hacked up the following typemap that seems to deal gracefully when a None is passed instead of an array. Changed lines: if ($input == Py_None) { is_new_object = 0; $1 = NULL; $2 = 0; } else { int size[1] = {-1}; array = obj_to_array_contiguous_allow_conversion($input, typecode, &is_new_object); if (!array || !require_dimensions(array,1) || !require_size(array,size,1)) SWIG_fail; $1 = (type*) array->data; $2 = array->dimensions[0]; } Now I can write my set_weight function as follows: void _set_weight(double* IN_ARRAY1, int DIM1) { free(self->weight); self->weight = 0; self->nr_weight = DIM1; if (DIM1 > 0) { self->weight = malloc(sizeof(double) * DIM1); if (!self->weight) { SWIG_exception(SWIG_MemoryError, "OOM"); } memcpy(self->weight, IN_ARRAY1, sizeof(double) * DIM1); } return; fail: self->nr_weight = 0; } Does it make sense to add this to the typemaps? Any other comments? Are there better ways to accomplish this? 
Regards, Albert From arnd.baecker at web.de Wed Apr 26 00:52:01 2006 From: arnd.baecker at web.de (Arnd Baecker) Date: Wed Apr 26 00:52:01 2006 Subject: [Numpy-discussion] vectorize problem In-Reply-To: <200604251324.42987.steffen.loeck@gmx.de> References: <200604251324.42987.steffen.loeck@gmx.de> Message-ID: Hi, On Tue, 25 Apr 2006, Steffen Loeck wrote: > Hello all, > > I have a problem using scalar variables in a vectorized function: > > from numpy import vectorize > > def f(x): > if x>0: return 1 > else: return 0 > > F = vectorize(f) > > F(1) > > gives the error message: > --------------------------------------------------------------------------- > exceptions.AttributeError Traceback (most recent call last) > > .../function_base.py in __call__(self, *args) > 619 > 620 if self.nout == 1: > --> 621 return self.ufunc(*args).astype(self.otypes[0]) > 622 else: > 623 return tuple([x.astype(c) for x, c in > zip(self.ufunc(*args), self.otypes)]) > > AttributeError: 'int' object has no attribute 'astype' Ouch - that's not nice - a lot of my code relies the fact that (old scipy) vectorize happily eats scalars *and* arrays. I am not familiar with the code of numpy.vectorize (which has indeed changed quite a bit compared to the old scipy.vectorize), but maybe it is only a simple change? > Is there any way to get vectorized functions working with scalars again? +1 (or is there a particular reason why "vectorized" functions should not be able to operate on scalars?) Best, Arnd From pgmdevlist at mailcan.com Wed Apr 26 01:06:04 2006 From: pgmdevlist at mailcan.com (Pierre GM) Date: Wed Apr 26 01:06:04 2006 Subject: [Numpy-discussion] A python interface for loess ? Message-ID: <200604260329.17115.pgmdevlist@mailcan.com> Folks, Would any of you be aware of a Python interface to the loess routines ? http://netlib.bell-labs.com/netlib/a/dloess.gz I could use the R implementation through Rpy, but I would prefer to stick to Python... Thanks a lot in advance P. From arnd.baecker at web.de Wed Apr 26 02:39:05 2006 From: arnd.baecker at web.de (Arnd Baecker) Date: Wed Apr 26 02:39:05 2006 Subject: [Numpy-discussion] concatenate, doc-string Message-ID: Hi, the doc-string of concatentate is pretty short: numpy.concatenate? Docstring: concatenate((a1,a2,...),axis=None). Would the following be better: """ concatenate((a1, a2,...), axis=None) joins the tuple `(a1, a2, ...)` of sequences (or arrays) into a single numpy array. Example:: print concatenate( ([0,1,2], [5,6,7])) """ ((The ``(or arrays)`` could be omitted if sequences include array by default, though it might not be obvious to beginners ...)) I was also tempted to suggest a dtype argument, concatenate( ([0,1,2], [5,6,7]), dtype=numpy.Float) but I am not sure if that would be a good idea ... Best, Arnd From gnchen at cortechs.net Wed Apr 26 06:52:01 2006 From: gnchen at cortechs.net (Gennan Chen) Date: Wed Apr 26 06:52:01 2006 Subject: [Numpy-discussion] SWIG for 3D array Message-ID: Hi! I will like to use SWIG to wrap my code. However, it seems the current numpy.i only can map 1 and 2D array, but not 3D. Is it correct? Or I miss something here. I don't mind spend some time to do it like scipy.ndimage if numpy.i did not support ND arrary. But I am new to write extension to Python. And I really have hard time to understand how to deal with reference counting issues. Anyone know where I can know a good reference for that? Or a simple example in numpy will be appreciated.... 
Gen-Nan Chen, PhD Chief Scientist Research and Development Group CorTechs Labs Inc (www.cortechs.net) 1020 Prospect St., #304, La Jolla, CA, 92037 Tel: 1-858-459-9700 ext 16 Fax: 1-858-459-9705 Email: gnchen at cortechs.net From oliphant.travis at ieee.org Wed Apr 26 10:05:01 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Wed Apr 26 10:05:01 2006 Subject: [Numpy-discussion] vectorize problem In-Reply-To: References: <200604251324.42987.steffen.loeck@gmx.de> Message-ID: <444FA7E7.2070303@ieee.org> Arnd Baecker wrote: > Hi, > > On Tue, 25 Apr 2006, Steffen Loeck wrote: > > >> Hello all, >> >> I have a problem using scalar variables in a vectorized function: >> >> from numpy import vectorize >> >> def f(x): >> if x>0: return 1 >> else: return 0 >> >> F = vectorize(f) >> >> F(1) >> >> gives the error message: >> --------------------------------------------------------------------------- >> exceptions.AttributeError Traceback (most recent call last) >> >> .../function_base.py in __call__(self, *args) >> 619 >> 620 if self.nout == 1: >> --> 621 return self.ufunc(*args).astype(self.otypes[0]) >> 622 else: >> 623 return tuple([x.astype(c) for x, c in >> zip(self.ufunc(*args), self.otypes)]) >> >> AttributeError: 'int' object has no attribute 'astype' >> > > Ouch - that's not nice - a lot of my code relies the fact that (old > scipy) vectorize happily eats scalars *and* arrays. > > I am not familiar with the code of numpy.vectorize (which has indeed > changed quite a bit compared to the old scipy.vectorize), > but maybe it is only a simple change? > It is just a simple change. Scalars are supposed to be supported. They aren't only as a side-effect of the switch to not return object-scalars. I did not update the vectorize code to handle the scalar return value from the object ufunc (which is now no-longer an object-scalar with the methods of arrays (like astype) but is instead the underlying object). I'll add a check. -Travis From jrl at gatewayengineers.com Wed Apr 26 12:29:01 2006 From: jrl at gatewayengineers.com (Frida Maldonado) Date: Wed Apr 26 12:29:01 2006 Subject: [Numpy-discussion] vat Message-ID: <001a01c66967$82f94541$ddc46747@ijopi.sewtp> An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: controversy.gif Type: image/gif Size: 28493 bytes Desc: not available URL: From cookedm at physics.mcmaster.ca Wed Apr 26 12:33:01 2006 From: cookedm at physics.mcmaster.ca (David M. Cooke) Date: Wed Apr 26 12:33:01 2006 Subject: [Numpy-discussion] Chang*ed* the Trac authentication In-Reply-To: <444EE463.10007@gmail.com> (Robert Kern's message of "Tue, 25 Apr 2006 22:09:23 -0500") References: <444EE463.10007@gmail.com> Message-ID: Robert Kern writes: > Trying not to embarass myself again, I made the changes without telling you. :-) > > In order to create or modify Wiki pages or tickets on the NumPy and SciPy Tracs, > you will have to be logged in. You can register yourself by clicking the > "Register" link in the upper right-hand corner of the page. > > Developers who previously had accounts have the same username/password as > before. You can now change your password if you like. Only developers have the > ability to close tickets, delete Wiki pages entirely, or create new ticket > reports (and possibly a couple of other things). Developers, please enter your > name and email by clicking on the "Settings" link up at top once logged in. > > Thank you for your patience. 
If there are any problems, please email me, and I > will try to correct them quickly. Thanks Robert; I hope this helps with our spam problem to an extent. -- |>|\/|< /--------------------------------------------------------------------------\ |David M. Cooke http://arbutus.physics.mcmaster.ca/dmc/ |cookedm at physics.mcmaster.ca From cookedm at physics.mcmaster.ca Wed Apr 26 12:48:04 2006 From: cookedm at physics.mcmaster.ca (David M. Cooke) Date: Wed Apr 26 12:48:04 2006 Subject: [Numpy-discussion] concatenate, doc-string In-Reply-To: (Arnd Baecker's message of "Wed, 26 Apr 2006 11:38:26 +0200 (CEST)") References: Message-ID: Arnd Baecker writes: > Hi, > > the doc-string of concatentate is pretty short: > > numpy.concatenate? > Docstring: > concatenate((a1,a2,...),axis=None). > > Would the following be better: > """ > concatenate((a1, a2,...), axis=None) joins the tuple `(a1, a2, ...)` of > sequences (or arrays) into a single numpy array. > > Example:: > > print concatenate( ([0,1,2], [5,6,7])) > """ > > ((The ``(or arrays)`` could be omitted if sequences include array by > default, though it might not be obvious to beginners ...)) Here's what I just checked in: concatenate((a1, a2, ...), axis=None) joins arrays together The tuple of sequences (a1, a2, ...) are joined along the given axis (default is the first one) into a single numpy array. Example: >>> concatenate( ([0,1,2], [5,6,7]) ) array([0, 1, 2, 5, 6, 7]) > I was also tempted to suggest a dtype argument, > concatenate( ([0,1,2], [5,6,7]), dtype=numpy.Float) > but I am not sure if that would be a good idea ... Well, that would require more code, so I didn't do it :-) -- |>|\/|< /--------------------------------------------------------------------------\ |David M. Cooke http://arbutus.physics.mcmaster.ca/dmc/ |cookedm at physics.mcmaster.ca From arnd.baecker at web.de Wed Apr 26 14:03:02 2006 From: arnd.baecker at web.de (Arnd Baecker) Date: Wed Apr 26 14:03:02 2006 Subject: [Numpy-discussion] concatenate, doc-string In-Reply-To: References: Message-ID: On Wed, 26 Apr 2006, David M. Cooke wrote: > Arnd Baecker writes: > > > Hi, > > > > the doc-string of concatentate is pretty short: > > > > numpy.concatenate? > > Docstring: > > concatenate((a1,a2,...),axis=None). > > > > Would the following be better: > > """ > > concatenate((a1, a2,...), axis=None) joins the tuple `(a1, a2, ...)` of > > sequences (or arrays) into a single numpy array. > > > > Example:: > > > > print concatenate( ([0,1,2], [5,6,7])) > > """ > > > > ((The ``(or arrays)`` could be omitted if sequences include array by > > default, though it might not be obvious to beginners ...)) > > Here's what I just checked in: > > concatenate((a1, a2, ...), axis=None) joins arrays together > > The tuple of sequences (a1, a2, ...) are joined along the given axis > (default is the first one) into a single numpy array. > > Example: > > >>> concatenate( ([0,1,2], [5,6,7]) ) > array([0, 1, 2, 5, 6, 7]) Great - many thanks!! There are some further routines which might benefit from some more explanation/examples - so if you don't mind I will try to suggest some additions (I could check them in directly, I think, but as I am not a native speaker I feel better to post them here for review/improvement). > > I was also tempted to suggest a dtype argument, > > concatenate( ([0,1,2], [5,6,7]), dtype=numpy.Float) > > but I am not sure if that would be a good idea ... 
> > Well, that would require more code, so I didn't do it :-)

;-) It might also be problematic when one of the sequence elements does not fit into the output type.

Best, Arnd

From ndarray at mac.com Wed Apr 26 14:18:06 2006
From: ndarray at mac.com (Sasha)
Date: Wed Apr 26 14:18:06 2006
Subject: [Numpy-discussion] Broadcasting rules (Ticket 76).
In-Reply-To: <444F0420.9000500@ieee.org>
References: <4050236.1146000212838.JavaMail.root@fed1wml05.mgt.cox.net> <444F0420.9000500@ieee.org>
Message-ID:

I would like to apologize up-front if anyone found my overly general arguments inappropriate. I did not intend to be critical about anyone's code or design other than my own. Any references to "bad design" or "broken code" are related to my own misguided attempts to use some of the Numeric features in the past. It turned out that dimension-preserving broadcasting was the wrong feature to use for a specific class of problems that I deal with most of the time. This does not mean that it cannot be used appropriately in other domains. I was wrong in posting overly general opinions without providing specific examples. I will try to do better in this post.

Before I do that, however, let me try to explain why I hold strong views on certain things. In my view the most appealing feature in Python is the Zen of Python, and in particular "There should be one-- and preferably only one --obvious way to do it." In my view Python represents the "hard science" approach, appealing to physics and math types, while Perl is more of a "soft science" language. (There is nothing wrong with either Perl or soft sciences.) This is what makes Python so appealing for scientific computing. Unfortunately, it is a fact of life that there are always many ways to solve the same problem, and a successful "pythonic" design has to pick one (preferably the best) of the possible ways and make it obvious.

This said, let me present a specific problem that I will use to illustrate my points below. Suppose we study school statistics in different cities. Let city A have 10 schools with 20 classes and 30 students in each. It is natural to organize the data collected about the students in a 10x20x30 array. It is also natural to collect some of the data at the per-school or per-class level. This data may come from aggregating student-level statistics (say, average test score) or from characteristics that are class- or school-specific (say, the grade or primary language). There are two obvious ways to present such data: 1) we can use 3-d arrays for everything and make the shape of the per-class array 10x20x1 and the shape of the per-school array 10x1x1; or 2) use 2-d and 1-d arrays. The first approach seems to be more flexible. We can also have 10x1x30 or 1x1x30 arrays to represent data which varies along the student dimension, but is constant across schools or classes. However, this added benefit is illusory: the first student in one class list has no relationship to the first student in another class, so in this particular problem an average score of the first student across classes makes no sense (it will also depend on whether students are ordered alphabetically or by an achievement rank).

On the other hand, this approach has a very significant drawback: functions that process city data have no way to distinguish between per-school data and a lucky city that can afford educating its students in individual classes.
Just as it is extremely unlikely to have one student per class in our toy example, in real-world problems it is not unreasonable to assume that a dimension of size 1 represents aggregate data. Software designed on the basis of this assumption is what I would call broken in a subtle way. Please see more below.

On 4/26/06, Travis Oliphant wrote:
> Sasha wrote:
> > On 4/25/06, tim.hochberg at cox.net wrote:
> > >> ---- Travis Oliphant wrote:
> [...]
> I don't think anyone is fundamentally opposed to multiple repetitions.
> We're just being cautious. Also, as you've noted, the assignment code
> is currently not using the ufunc broadcasting code and so they really
> aren't the same thing, yet.

It looks like there is a lot of development in this area going on at the moment. Please let me know if I can help.

> [...]
> > In my experience broadcasting length-1 and not broadcasting other
> > lengths is very error prone as it is.
>
> That's not been my experience.

I should have been more specific. As I explained above, the special properties of length-1 led me to design a system that distinguished aggregate data by testing for unit length. This system was subtly broken. In a rare case when the population had only one element, the system was producing wrong results.

> But, I don't know R very well. I'm very
> interested in what ideas you can bring.

R takes a very simple approach: everything is a vector. There are no scalars; if you need a scalar, you use a vector of length 1. Broadcasting is simply repetition:

> x <- rep(0,10)
> x + c(1,2)
 [1] 1 2 1 2 1 2 1 2 1 2

The length of the larger vector does not even need to be a multiple of the shorter, but in this case a warning is issued:

> x + c(1,2,3)
 [1] 1 2 3 1 2 3 1 2 3 1
Warning message:
longer object length is not a multiple of shorter object length in: x + c(1, 2, 3)

Multi-dimensional arrays are implemented by setting a "dim" attribute:

> dim(x) <- c(2,5)
> x
     [,1] [,2] [,3] [,4] [,5]
[1,]    0    0    0    0    0
[2,]    0    0    0    0    0

(R uses Fortran order). Broadcasting ignores the dim attribute, but does the right thing for conformable vectors:

> x + c(1,2)
     [,1] [,2] [,3] [,4] [,5]
[1,]    1    1    1    1    1
[2,]    2    2    2    2    2

However, the following is unfortunate:

> x + 1:5
     [,1] [,2] [,3] [,4] [,5]
[1,]    1    3    5    2    4
[2,]    2    4    1    3    5

> > I understand that restricting
> > broadcasting to make it a strictly dimension-increasing operation is
> > not possible for two reasons:
> >
> > 1. Numpy cannot break legacy Numeric code.
> > 2. It is not possible to differentiate between a 1-d array that
> > broadcasts column-wise vs. one that broadcasts row-wise.
> >
> > In my view none of these reasons is valid. In my experience Numeric
> > code that relies on dimension-preserving broadcasting is already
> > broken, only in a subtle and hard to reproduce way.
>
> I definitely don't agree with you here. Dimension-preserving
> broadcasting is at the heart of the utility of broadcasting and it is
> very, very useful for that. Calling it subtly broken suggests that you
> don't understand it and have never used it for its intended purpose.
> I've used dimension-preserving broadcasting literally hundreds of
> times. It's rather bold of you to say that all of that code is "broken".

Sorry I was not specific in the original post. I hope you now understand where I come from. Can you point me to some examples of the correct way to use dimension-preserving broadcasting?
I would assume that it is probably more useful in the problem domains where there is no natural ordering of the dimensions, unlike in the hierarchial data example that I used. > Now, I'm sure there are other useful ways to "broadcast", but > dimension-preserving is essentially what broadcasting *is* in NumPy. > If anything it is the dimension-increasing rule that is somewhat > arbitrary (e.g. why prepend with ones). > The dimension-increasing broadcasting is very natural when you deal with hierarchical data where various dimensions correspond to the levels of aggregation. As I explained above, average student score per class makes sense while the average score per student over classes does not. It is very common to combine per-class data with per-student data by broadcasting per-class data. For example, the total time spent by student is a sum spent in regular per-class session plus individual elected courses. > > Perhaps you want to introduce some other way for non-commensurate shapes > to interact in an operation. I think you will find many open minds on > this list (although probably not anyone who will want to code it up :-) > ). We do welcome the discussion. Your experience with other > array-like languages is helpful. > I will be happy to contribute code if I see interest. > > > Similarly the > > need to broadcast over non-leading dimension is a sign of bad design. > > In rare cases where such broadcasting is desirable, it can be easily > > done via swapaxes which is a cheap operation. > > > > Again, it would help if you would refrain from using negative words > about coding styles that are different from your own. Such > broadcasting is not that rare. It happens quite frequently, actually. > The point of a language like Python is that you can write algorithms > simply without struggling with optimization questions up front like you > seem to be hinting at. > I hope you understand that I did not mean to criticize anyone's coding style. I was not really hinting at optimization issues, I just had a particular design problem in mind (see above). Incidentally, dimension-increasing broadcasting does tend to lead to more efficient code both in terms of memory utilization and more straightforward algorithms with fewer special cases, but this was not really what I was referring to. > > On the other hand I don't see much problem in making > > dimension-preserving broadcasting more permissive. In R, for example, > > (1-d) arrays can be broadcast to arbitrary size. This has an > > additional benefit that 1-d to 2-d broadcasting requires no special > > code, it just happens because matrices inherit arithmetics from > > vectors. I've never had a problem with R rules being too loose. > > > > So, please explain exactly what you mean. Only a few on this list know > what the R rules even are. See above. > > In my view the problem that your ticket highlighted is not so much in > > the particular set of broadcasting rules, but in the fact that a[...] > > = b uses one set of rules while a[...] += b uses another. This is > > *very* confusing. > > > > Yes, this is admittedly confusing. But, it's an outgrowth of the way > Numeric code developed. Broadcasting was always only a ufunc concept in > Numeric, and copying was not a ufunc. NumPy grew out of Numeric > code. I was not trying to mimick broadcasting behavior when I wrote > the array copy and array setting code. Perhaps I should have been. 
>
In the spirit of appealing to obscure languages ;-), let me mention that in the K language (kx.com) element assignment is implemented using an Amend primitive that takes four arguments: @[x,i,f,y] is more or less equivalent to numpy's x[i] = f(x[i], y[i]), where x, y and i are vectors and f is a binary (broadcasting) function. Thus, x[i] += y[i] can be written as @[x,i,+,y] and x[i] = y[i] is @[x,i,:,y], where ':' denotes a binary function that returns its second argument and ignores the first. The K interpreter's Linux binary is less than 200K, and that includes a simple X window GUI! Such a small code size would not be possible without picking the right set of primitives and avoiding special-case code.

> I'm willing to change the code on this one, but only if the new copy
> code actually does implement broadcasting behavior equivalently. And
> going through the ufunc machinery is probably a waste of effort because
> the copy code must be written for variable length arrays anyway (and
> ufuncs don't support them).

I know close to nothing about variable length arrays. When I need to deal with relational database data, I transpose it so that each column gets into its own fixed-length array. This is the approach that both R and K take. However, at least at the C level, I don't see why the ufunc code cannot be generalized to handle variable length arrays. At the Python level, pre-defined arithmetic or math functions are probably not feasible for variable length, but the ability to define a variable length array function by just writing an inner loop implementation may be quite useful.

> However, the broadcasting machinery has been abstracted in NumPy and can
> therefore be re-used in the copying code. In Numeric, broadcasting was
> basically implemented deep inside a confusing while loop.

I've never understood Numeric's while loop and completely agree with your characterization. I am still studying the numpy code, but it is clearly a big improvement.

From shhong at u.washington.edu Wed Apr 26 14:19:01 2006
From: shhong at u.washington.edu (Sungho Hong)
Date: Wed Apr 26 14:19:01 2006
Subject: [Numpy-discussion] Building Numpy with Windows and MKL?
Message-ID: <207B8B70-6328-421D-8343-B32506AF47CA@u.washington.edu>

Has anyone tried to install numpy with MS Windows and the Intel Math Kernel Library, especially using the VC 2003 compiler? I began with MKLROOT=C:\Program Files\Inter\plsuite, but setup.py seems to have a problem with finding the library path. In that case, how do I set up all the relevant paths manually?

Thanks.
- SH

From ryanlists at gmail.com Wed Apr 26 14:21:07 2006
From: ryanlists at gmail.com (Ryan Krauss)
Date: Wed Apr 26 14:21:07 2006
Subject: [Numpy-discussion] array.min() vs. min(array)
Message-ID:

I was spending some time trying to track down how to speed up an algorithm that gets called a bunch of times during an optimization. I was startled when I finally figured out that most of the time was wasted by using the built-in Python min function. It turns out that in my case, using array.min() (i.e. the method of the Numpy array) is 300-500 times faster than the built-in Python min function (i.e. min(array)).

So, thank you Travis and everyone who has put so much time into thinking through Numpy and making it fast (as well as making sure it is correct). And to the rest of us, use the Numpy array methods whenever you can.
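[A minimal way to reproduce the gap Ryan describes; this is an illustrative sketch only -- the array size and repeat count are arbitrary, and absolute timings will vary by machine:]

    import timeit

    setup = "import numpy; a = numpy.random.rand(100000)"

    # a.min() runs a single C loop over the raw data buffer.
    fast = timeit.Timer("a.min()", setup)
    # min(a) walks the generic sequence interface, comparing one
    # element at a time, which is where the 300-500x factor comes from.
    slow = timeit.Timer("min(a)", setup)

    print(fast.timeit(number=100))
    print(slow.timeit(number=100))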
Thanks,

Ryan

From oliphant.travis at ieee.org Wed Apr 26 14:42:05 2006
From: oliphant.travis at ieee.org (Travis Oliphant)
Date: Wed Apr 26 14:42:05 2006
Subject: [Numpy-discussion] array.min() vs. min(array)
In-Reply-To:
References:
Message-ID: <444FE909.5080209@ieee.org>

Ryan Krauss wrote:
> I was spending some time trying to track down how to speed up an
> algorithm that gets called a bunch of times during an optimization. I
> was startled when I finally figured out that most of the time was
> wasted by using the built-in Python min function. It turns out that
> in my case, using array.min() (i.e. the method of the Numpy array) is
> 300-500 times faster than the built-in python min function (i.e.
> min(array)).
>
> So, thank you Travis and everyone who has put so much time into
> thinking through Numpy and making it fast (as well as making sure it
> is correct).

The builtin min function is a bit confusing because it usually does work on NumPy arrays. But, as you've noticed, it is always slower because it uses the "generic sequence interface" that NumPy arrays expose. So, it's basically not much faster than a Python loop. In this case you are also being hit by the fact that scalarmath is not yet implemented (it's getting close though...) so the returned array scalars are being compared using the bulky ufunc machinery on each element separately.

In Python 2.5 we are going to have the same issues with the new any() and all() functions of Python.

-Travis

From wbaxter at gmail.com Wed Apr 26 14:56:12 2006
From: wbaxter at gmail.com (Bill Baxter)
Date: Wed Apr 26 14:56:12 2006
Subject: [Numpy-discussion] Broadcasting rules (Ticket 76).
In-Reply-To:
References: <4050236.1146000212838.JavaMail.root@fed1wml05.mgt.cox.net> <444F0420.9000500@ieee.org>
Message-ID:

Is that a representative example? It seems highly unlikely that in real life every one of the schools would have exactly 20 classes, and each of those exactly 30 students. I don't know anything about R or the way things are typically done with statistical languages -- maybe this is the norm there -- but from a pure CompSci data structures perspective, a 3D array seems ill-suited for this type of hierarchical data. Something more flexible, along the lines of a Python list of lists of lists, seems more appropriate.

--bill

On 4/27/06, Sasha wrote:
> Suppose we study school statistics in
> different cities. Let city A have 10 schools with 20 classes and 30
> students in each. It is natural to organize the data collected about
> the students in a 10x20x30 array.

From ndarray at mac.com Wed Apr 26 15:24:07 2006
From: ndarray at mac.com (Sasha)
Date: Wed Apr 26 15:24:07 2006
Subject: [Numpy-discussion] concatenate, doc-string
In-Reply-To:
References:
Message-ID:

On 4/26/06, David M. Cooke wrote:
> ....
> Here's what I just checked in:
>
> concatenate((a1, a2, ...), axis=None) joins arrays together
>
> The tuple of sequences (a1, a2, ...) are joined along the given axis
> (default is the first one) into a single numpy array.
>
> Example:
>
> >>> concatenate( ([0,1,2], [5,6,7]) )
> array([0, 1, 2, 5, 6, 7])

The first argument does not have to be a tuple:

>>> print concatenate([[0,1,2], [5,6,7]])
[0 1 2 5 6 7]

but the docstring is probably ok given that the alternative is "sequence of sequences" ...

From ndarray at mac.com Wed Apr 26 15:58:04 2006
From: ndarray at mac.com (Sasha)
Date: Wed Apr 26 15:58:04 2006
Subject: [Numpy-discussion] Broadcasting rules (Ticket 76).
In-Reply-To:
References: <4050236.1146000212838.JavaMail.root@fed1wml05.mgt.cox.net> <444F0420.9000500@ieee.org>
Message-ID:

On 4/26/06, Bill Baxter wrote:
> Is that a representative example? It seems highly unlikely that in real
> life every one of the schools would have exactly 20 classes, and each of
> those exactly 30 students.

You should not take my toy example too seriously. However, with support for missing values, 3-d arrays may provide an efficient representation for a more realistic scenario when you only know upper bounds for the number of students/classes. Smaller schools will have missing values in their arrays.

> I don't know anything about R or the way things
> are typically done with statistical languages -- maybe this is the norm
> there -- but from a pure CompSci data structures perspective, a 3D array
> seems ill-suited for this type of hierarchical data. Something more
> flexible, along the lines of a Python list of lists of lists, seems more
> appropriate.

You are right. I am sorely missing ragged array support in numpy like the one available in K. Numpy supports nested arrays, but does not optimize the most common case when nested arrays are of the same type.

> --bill
>
> On 4/27/06, Sasha wrote:
> > Suppose we study school statistics in
> > different cities. Let city A have 10 schools with 20 classes and 30
> > students in each. It is natural to organize the data collected about
> > the students in a 10x20x30 array.

From ndarray at mac.com Wed Apr 26 16:16:07 2006
From: ndarray at mac.com (Sasha)
Date: Wed Apr 26 16:16:07 2006
Subject: [Numpy-discussion] Broadcasting rules (Ticket 76).
In-Reply-To:
References: <4050236.1146000212838.JavaMail.root@fed1wml05.mgt.cox.net> <444F0420.9000500@ieee.org>
Message-ID:

On 4/26/06, Sasha wrote:
> On 4/26/06, Bill Baxter wrote:
> > Is that a representative example? It seems highly unlikely that in real
> > life every one of the schools would have exactly 20 classes, and each of
> > those exactly 30 students.
>
> You should not take my toy example too seriously. However, with
> support for missing values, 3-d arrays may provide an efficient
> representation for a more realistic scenario when you only know upper
> bounds for the number of students/classes. Smaller schools will have
> missing values in their arrays.

In addition, it is reasonable to sample a fixed number of classes from each school and a fixed number of students from each class at random for a statistical study.

From simon at arrowtheory.com Wed Apr 26 16:41:04 2006
From: simon at arrowtheory.com (Simon Burton)
Date: Wed Apr 26 16:41:04 2006
Subject: [Numpy-discussion] obtain indexes of a sort ?
Message-ID: <20060427094025.10172889.simon@arrowtheory.com>

Is it possible to obtain a permutation (array of indices) representing the transform that sorts an array? Is there a numpy way of doing this?

I can do it in python as:

a = [ 6, 5, 99, 2 ]
idxs = range(len(a))
z = zip(idxs,a)
def zcmp(u,v):
    if u[1]<=v[1]:
        return -1
    return 1
z.sort( zcmp )
idxs = [u[0] for u in z]   # <--- permutation

Simon.

--
Simon Burton, B.Sc.
Licensed PO Box 8066
ANU Canberra 2601
Australia
Ph. 61 02 6249 6940
http://arrowtheory.com

From pgmdevlist at mailcan.com Wed Apr 26 16:45:02 2006
From: pgmdevlist at mailcan.com (Pierre GM)
Date: Wed Apr 26 16:45:02 2006
Subject: [Numpy-discussion] obtain indexes of a sort ?
In-Reply-To: <20060427094025.10172889.simon@arrowtheory.com> References: <20060427094025.10172889.simon@arrowtheory.com> Message-ID: <200604261944.01584.pgmdevlist@mailcan.com> On Wednesday 26 April 2006 19:40, Simon Burton wrote: > Is it possible to obtain a permutation (array of indices) > representing the transform that sorts an array ? Is there a numpy way > of doing this ? I guess argsort() could be what you want From ndarray at mac.com Wed Apr 26 16:45:03 2006 From: ndarray at mac.com (Sasha) Date: Wed Apr 26 16:45:03 2006 Subject: [Numpy-discussion] obtain indexes of a sort ? In-Reply-To: <20060427094025.10172889.simon@arrowtheory.com> References: <20060427094025.10172889.simon@arrowtheory.com> Message-ID: >>> argsort([ 6, 5, 99, 2 ]) array([3, 1, 0, 2]) On 4/26/06, Simon Burton wrote: > > Is it possible to obtain a permutation (array of indices) > representing the transform that sorts an array ? Is there a numpy way > of doing this ? > > I can do it in python as: > > a = [ 6, 5, 99, 2 ] > idxs = range(len(a)) > z = zip(idxs,a) > def zcmp(u,v): > if u[1]<=v[1]: > return -1 > return 1 > z.sort( zcmp ) > idxs = [u[0] for u in z] # <--- permutation > > Simon. > > -- > Simon Burton, B.Sc. > Licensed PO Box 8066 > ANU Canberra 2601 > Australia > Ph. 61 02 6249 6940 > http://arrowtheory.com > > > ------------------------------------------------------- > Using Tomcat but need to do more? Need to support web services, security? > Get stuff done quickly with pre-integrated technology to make your job easier > Download IBM WebSphere Application Server v.1.0.1 based on Apache Geronimo > http://sel.as-us.falkag.net/sel?cmd=lnk&kid=120709&bid=263057&dat=121642 > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/numpy-discussion > From zpincus at stanford.edu Wed Apr 26 16:46:05 2006 From: zpincus at stanford.edu (Zachary Pincus) Date: Wed Apr 26 16:46:05 2006 Subject: [Numpy-discussion] obtain indexes of a sort ? In-Reply-To: <20060427094025.10172889.simon@arrowtheory.com> References: <20060427094025.10172889.simon@arrowtheory.com> Message-ID: <800F9820-F672-4EBF-8F48-3C3AEF17FC34@stanford.edu> a.argsort() or numpy.argsort(a) Zach On Apr 26, 2006, at 4:40 PM, Simon Burton wrote: > > Is it possible to obtain a permutation (array of indices) > representing the transform that sorts an array ? Is there a numpy way > of doing this ? > > I can do it in python as: > > a = [ 6, 5, 99, 2 ] > idxs = range(len(a)) > z = zip(idxs,a) > def zcmp(u,v): > if u[1]<=v[1]: > return -1 > return 1 > z.sort( zcmp ) > idxs = [u[0] for u in z] # <--- permutation > > Simon. > > -- > Simon Burton, B.Sc. > Licensed PO Box 8066 > ANU Canberra 2601 > Australia > Ph. 61 02 6249 6940 > http://arrowtheory.com > > > ------------------------------------------------------- > Using Tomcat but need to do more? Need to support web services, > security? > Get stuff done quickly with pre-integrated technology to make your > job easier > Download IBM WebSphere Application Server v.1.0.1 based on Apache > Geronimo > http://sel.as-us.falkag.net/sel? 
> cmd=lnk&kid=120709&bid=263057&dat=121642
> _______________________________________________
> Numpy-discussion mailing list
> Numpy-discussion at lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/numpy-discussion

From pearu at scipy.org Wed Apr 26 16:56:05 2006
From: pearu at scipy.org (Pearu Peterson)
Date: Wed Apr 26 16:56:05 2006
Subject: [Numpy-discussion] Possible ref.count bug in changeset #2422
Message-ID:

Hi,

Shouldn't result be Py_INCREF'ed when it is equal to Py_NotImplemented and returned from array_richcompare?

Pearu

From doug5y at shaw.ca Wed Apr 26 17:10:05 2006
From: doug5y at shaw.ca (Doug Nadworny)
Date: Wed Apr 26 17:10:05 2006
Subject: [Numpy-discussion] Can't install numpy-0.9.6-1.i586.rpm on FC5
Message-ID: <44500B9E.10602@shaw.ca>

When trying to install numpy-0.9.6-1.i586.rpm on Fedora Core 5, rpm incorrectly reports that python is the wrong version, even though it is correct:

>rpm -i --test numpy-0.9.6-1.i586.rpm  ## Tests dependencies of rpm package
error: Failed dependencies:
        python-base >= 2.4 is needed by numpy-0.9.6-1.i586
>python -V
Python 2.4.2

Is there a way around this?

TIA,
Doug N

From cookedm at physics.mcmaster.ca Wed Apr 26 17:20:05 2006
From: cookedm at physics.mcmaster.ca (David M. Cooke)
Date: Wed Apr 26 17:20:05 2006
Subject: [Numpy-discussion] Possible ref.count bug in changeset #2422
In-Reply-To: (Pearu Peterson's message of "Wed, 26 Apr 2006 18:55:55 -0500 (CDT)")
References:
Message-ID:

Pearu Peterson writes:
> Hi,
>
> Shouldn't result be Py_INCREF'ed when it is equal to Py_NotImplemented
> and returned from array_richcompare?

Theoretically, yes, but since the case statement "should" cover all cases, it doesn't matter. Bad code style though on my part; I've added a default: case instead.

From silesalvarado at hotmail.com Wed Apr 26 17:33:04 2006
From: silesalvarado at hotmail.com (Hugo Siles)
Date: Wed Apr 26 17:33:04 2006
Subject: [Numpy-discussion] crush!!!!
Message-ID:

HI,

I have a problem when I run the following in python:

>>> from Numeric import *
>>> from LinearAlgebra import *

I define a matrix 'a' which prints correctly, and calculates its inverse, determinant and so forth, but when I try to calculate the eigenvalues, such as

>>> c = eigenvalues(a)

the system just crashes without any message. I made this test because in some other programs with source code the same thing happens.

I hope somebody can help, thanks

Hugo Siles

From ivazquez at ivazquez.net Wed Apr 26 17:33:08 2006
From: ivazquez at ivazquez.net (Ignacio Vazquez-Abrams)
Date: Wed Apr 26 17:33:08 2006
Subject: [Numpy-discussion] Can't install numpy-0.9.6-1.i586.rpm on FC5
In-Reply-To: <44500B9E.10602@shaw.ca>
References: <44500B9E.10602@shaw.ca>
Message-ID: <1146098100.16081.15.camel@ignacio.lan>

On Wed, 2006-04-26 at 18:09 -0600, Doug Nadworny wrote:
> when trying to install numpy-0.9.6-1.i586.rpm on Fedora Core 5, rpm
> incorrectly reports that python is the wrong version, even though it
> is correct:
>
> >rpm -i --test numpy-0.9.6-1.i586.rpm  ## Tests dependencies of rpm package
> error: Failed dependencies:
>         python-base >= 2.4 is needed by numpy-0.9.6-1.i586
> >python -V
> Python 2.4.2

Alright, alright, I'll update it already...
--
Ignacio Vazquez-Abrams
http://fedora.ivazquez.net/
gpg --keyserver hkp://subkeys.pgp.net --recv-key 38028b72

From ndarray at mac.com Wed Apr 26 18:15:04 2006
From: ndarray at mac.com (Sasha)
Date: Wed Apr 26 18:15:04 2006
Subject: [Numpy-discussion] crush!!!!
In-Reply-To:
References:
Message-ID:

Numeric computes eigenvalues by calling lapack's dgeev subroutine. Depending on installation, Numeric may either use its own subset of lapack (translated from Fortran to C) or link to the system-supplied lapack libraries. It is possible that there is a bug in your system's lapack libraries. Some lapack bugs related to extended precision calculations were reported recently. What you observe is unlikely to be a Numeric bug.

Note, however, that Numeric is no longer actively supported. If you can reproduce the same problem with numpy, it will likely get more attention. Also you have to give us some means to reproduce your matrix 'a' if you expect more than general advice.

On 4/26/06, Hugo Siles wrote:
> HI,
>
> I have a problem when I run the following in python:
>
> >>> from Numeric import *
> >>> from LinearAlgebra import *
>
> I define a matrix 'a' which prints correctly, and calculates its inverse,
> determinant and so forth, but when I try to calculate the eigenvalues, such as
>
> >>> c = eigenvalues(a)
>
> the system just crashes without any message. I made this test because in some
> other programs with source code the same thing happens.
>
> I hope somebody can help, thanks
>
> Hugo Siles

From strawman at astraw.com Wed Apr 26 19:26:05 2006
From: strawman at astraw.com (Andrew Straw)
Date: Wed Apr 26 19:26:05 2006
Subject: [Numpy-discussion] SWIG for 3D array
In-Reply-To:
References:
Message-ID: <44502B85.3000504@astraw.com>

Gennan Chen wrote:
> And I really have a hard time understanding how to deal with reference
> counting issues. Anyone know where I can find a good reference for that?

http://docs.python.org/ext/refcounts.html

From oliphant.travis at ieee.org Wed Apr 26 20:30:12 2006
From: oliphant.travis at ieee.org (Travis Oliphant)
Date: Wed Apr 26 20:30:12 2006
Subject: [Numpy-discussion] Broadcasting rules (Ticket 76).
In-Reply-To:
References: <4050236.1146000212838.JavaMail.root@fed1wml05.mgt.cox.net> <444F0420.9000500@ieee.org>
Message-ID: <44503A8A.2050701@ieee.org>

Sasha wrote:
> In my view the most appealing feature in Python is
> the Zen of Python, and in
> particular "There should be one-- and preferably only one --obvious
> way to do it." In my view Python represents the "hard science"
> approach appealing to physics and math types while Perl is more of a
> "soft science" language.

Interesting analogy. I've not heard that expression before.
> Unfortunately, it is the fact of life that there are > always many ways to solve the same problem and a successful "pythonic" > design has to pick one (preferably the best) of the possible ways and > make it obvious. > And it's probably impossible to agree as to what is "best" because of the different uses that array's receive. That's one reason I'm anxious to get a basic structure-only basearray into Python itself. > This said, let me present a specific problem that I will use to > illustrate my points below. Suppose we study school statistics in > different cities. Let city A have 10 schools with 20 classes and 30 > students in each. It is natural to organize the data collected about > the students in a 10x20x30 array. It is also natural to collect some > of the data at the per-school or per-class level. This data may come > from aggregating student level statistics (say average test score) or > from the characteristics that are class or school specific (say the > grade or primary language). There are two obvious ways to present > such data. 1) We can use 3-d arrays for everything and make the shape > of the per-class array 10x20x1 and the shape of per-school array > 10x1x1; and 2) use 2-d and 1-d arrays. The first approach seems to be > more flexible. We can also have 10x1x30 or 1x1x30 arrays to represent > data which varies along the student dimension, but is constant across > schools or classes. However, this added benefit is illusory: the > first student in one class list has no relationship to the first > student in the other class, so in this particular problem an average > score of the first student across classes makes no sense (it will also > depend on whether students are ordered alphabetically or by an > achievement rank). > > On the other hand this approach has a very significant drawback: > functions that process city data have no way to distinguish between > per-school data and a lucky city that can afford educating its > students in individual classes. Just as it is extremely unlikely to > have one student per class in our toy example, in real-world problems > it is not unreasonable to assume that dimension of size 1 represents > aggregate data. A software designed based on this assumption is what > I would call broken in a subtle way. > I think I see what you are saying. This is a very specific circumstance. I can verify that the ndarray has not been designed to distinguish such hierarchial data. You will never be able to tell from the array itself if a dimension of length 1 means aggregate data or not. I don't see that as a limitation of the ndarray but as evidence that another object (i.e. an R-like data-frame) should probably be used. Such an object could even be built on top of the ndarray. >> [...] >> I don't think anyone is fundamentally opposed to multiple repetitions. >> We're just being cautious. Also, as you've noted, the assignment code >> is currently not using the ufunc broadcasting code and so they really >> aren't the same thing, yet. >> > > It looks like there is a lot of development in this area going on at > the moment. Please let me know if I can help. > Well, I did some refactoring to make it easier to expose the basic concept of the ufunc elsewhere: 1) Adjusting the inputs to a common shape (this is what I call broadcasting --- it appears to me that you use the term a little more loosely) 2) Setting up iterators to iterate over all but the longest dimension so that the inner loop is done. These are the key ingredients to a fast ufunc. 
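[A rough pure-Python sketch of step 1, the shape-matching rule Travis describes; the function name and error message here are made up for illustration, and the real implementation lives in C:]

    import numpy

    def broadcast_shape(*shapes):
        # Right-align the shapes by prepending 1s, then require each
        # dimension to match or be 1 (a 1 stretches to the other size).
        ndim = max([len(s) for s in shapes])
        padded = [(1,) * (ndim - len(s)) + tuple(s) for s in shapes]
        result = []
        for dims in zip(*padded):
            d = max(dims)
            for x in dims:
                if x != 1 and x != d:
                    raise ValueError("shapes %r are not broadcastable" % (shapes,))
            result.append(d)
        return tuple(result)

    print(broadcast_shape((3, 4), (4,)))               # (3, 4)
    print(broadcast_shape((10, 20, 30), (10, 20, 1)))  # (10, 20, 30)
    # numpy agrees:
    print((numpy.zeros((3, 4)) + numpy.zeros(4)).shape)  # (3, 4)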
There is one more optimization in the ufunc machinery for the contiguous case (when the inner loop is all that is needed) and then there is code to handle the buffering needed for unaligned and/or byte-swapped data. The final thing that makes a ufunc is the precise signature of the inner loop. Every inner loop has the same signature. This signature does not contain a slot for the length of the array element (that's a big reason why variable-length arrays are not supported in ufuncs). The ufuncs could be adapted, of course, but it was a bigger fish than I wanted to try and fry pre-1.0.

Note, though, that I haven't used these concepts yet to implement ufunc-like copying. The PyArray_Cast function will also need to be adjusted at the same time and this could actually prove more difficult as it must implement buffering. Of course it could give us a chance to abstract-out the buffered, broadcasted call as well. That might make a useful C-API function.

Any help you can provide would be greatly appreciated. I'm focused right now on the scalar math module as without it, NumPy is still slower for people that use a lot of array elements.

>> [...]
>> >>> In my experience broadcasting length-1 and not broadcasting other
>> >>> lengths is very error prone as it is.
>>
>> That's not been my experience.
>
> I should have been more specific. As I explained above, the special
> properties of length-1 led me to design a system that distinguished
> aggregate data by testing for unit length. This system was subtly
> broken. In a rare case when the population had only one element, the
> system was producing wrong results.

Yes, I can see that now. Your comments make a lot more sense. Trying to use ndarrays to represent hierarchial data can cause these subtle issues. The ndarray is a "flat" object in the sense that every element is seen as "equal" to every other element.
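[Sasha's ambiguity, made concrete; a hypothetical illustration using the school example from earlier in the thread:]

    import numpy

    per_school = numpy.zeros((10, 1, 1))  # school-level aggregates
    tiny_city  = numpy.zeros((10, 1, 1))  # raw data: 1 class of 1 student per school

    # Nothing in the array records whether a length-1 axis means
    # "aggregated away" or "really just one element" -- the shapes,
    # and hence all broadcasting behavior, are identical:
    print(per_school.shape == tiny_city.shape)   # True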
> I think you've hit on something here regarding the use of an array for "hierachial" data. I'm not sure I understand the implications entirely, but at least it helps me a little bit see what your concerns really are. > I hope you understand that I did not mean to criticize anyone's coding > style. I was not really hinting at optimization issues, I just had a > particular design problem in mind (see above). I do understand much better now. I still need to think about the hierarchial case a bit more. My basic concept of an array which definitely biases me is a medical imaging volume.... (i.e. the X-ray density at each location in 3-space). I could use improved understanding of how to use array's effectively in hierarchies. Perhaps we can come up with some useful concepts (or maybe another useful structure that inherits from the basearray) and can therefore share data effectively with the ndarray.... > In the spirit of appealing to obscure languages ;-), let me mention > that in the K language (kx.com) element assignment is implemented > using an Amend primitive that takes four arguments: @[x,i,f,y] id more > or less equivalent to numpy's x[i] = f(x[i], y[i]), where x, y and i > are vectors and f is a binary (broadcasting) function. Thus, x[i] += > y[i] can be written as @[x,i,+,y] and x[i] = y[i] is @[x,i,:,y], where > ':' denotes a binary function that returns it's second argument and > ignores the first. K interpretor's Linux binary is less than 200K and > that includes a simple X window GUI! Such small code size would not be > possible without picking the right set of primitives and avoiding > special case code. > Not to mention limiting the number of data-types :-) > I know close to nothing about variable length arrays. When I need to > deal with the relational database data, I transpose it so that each > column gets into its own fixed length array. Yeah, that was my strategy too and what I always suggested to the numarray folks who wanted the variable-length arrays. But, memory-mapping can't be done that way.... > This is the approach > that both R and K take. However, at least at the C level, I don't see > why ufunc code cannot be generalized to handle variable length arrays. > They of course, could be, it's just more re-factoring than I wanted to do. The biggest issue is the underlying 1-d loop function signature. I hesitated to change the signature because that would break compatibility with Numeric extension modules that defined ufuncs (like scipy-special...) The length could piggy-back in the data argument passed into those functions, but doing that right was more work than I wanted to do. If you solve that problem, everything else could be made to work without too much trouble. > At the python level, pre-defined arithmetic or math functions are > probably not feasible for variable length, but the ability to define a > variable length array function by just writing an inner loop > implementation may be quite useful. > Yes, it could have helped write the string comparisons much faster :-) >> However, the broadcasting machinery has been abstracted in NumPy and can >> therefore be re-used in the copying code. In Numeric, broadcasting was >> basically implemented deep inside a confusing while loop. >> > > I've never understood the Numeric's while loop and completely agree > with your characterization. I am still studying the numpy code, but > it is clearly a big improvement. > Well, it's more straightforward because I'm not the genius Jim Hugunin is. 
It makes heavy use of the iterator concept which I finally grok'd while trying to write things (and realized I had basically already implemented in writing the old scipy.vectorize). I welcome many more eyes on the code. I know I've taken shortcuts in places that should be improved. Thanks for your continued help and useful comments. -Travis From tim.hochberg at cox.net Wed Apr 26 21:02:10 2006 From: tim.hochberg at cox.net (Tim Hochberg) Date: Wed Apr 26 21:02:10 2006 Subject: [Numpy-discussion] Broadcasting rules (Ticket 76). In-Reply-To: References: <4050236.1146000212838.JavaMail.root@fed1wml05.mgt.cox.net> <444F0420.9000500@ieee.org> Message-ID: <44504296.8040602@cox.net> I haven't fully waded through all the various replies and to this thread. I plan to do that and send a reply on specific points later. This is message is more of a historical, motivational or possibly philosophical nature. First off, NumPy has used the term "broadcast" to mean the same thing since its inception and changing the terminology now is asking for confusion. *In the context of this mailing list *,I think we should use "broadcast" in the numpy sense and use appropriate qualifiers when referring to how other array packages practice broadcasting. Referring to broadcasting as "shape-preserving broadcasting" or some such doesn't seems to make things any clearer and adds a bunch of excess verbiage. In any event, I plan to omit any "broadcast" qualifiers here. The following understanding was formed by using and occasionally helping with development of NumPy since it was developed in 1995 or thereabouts. That doesn't mean that my understanding aggrees with the primary developers of the time, I may misremember things and my recollections are likely tinged by the experience I've had with NumPy in the interim. So, don't take this as definitive, but perhaps it will help provide some insight into what NumPy's broadcasting is supposed to be. Let's first dispense with the padding of dimensions. As I recall, this was a way to make matrix like operations easier. This was way before there was a matrix class and by defining padding in this way 1-D vectors could generally be treated as column vectors. Row vectors still needed to be 2-D (1xN), but they tended to be less frequent, so that was less of a burden. Or maybe I have that backwards, in any event they were put there to to facilitate matrix-like uses of numpy arrays. Given that there is a matrix class at this point, I doubt I would automagically pad the dimensions if I were designing numpy from scratch now. Since the dimension padding is at least partly historical accident and since it is in some sense orthogonal to the main point of numpy's broadcasting I'm going to pretend it doesn't exist for the rest of this discussion. At it's core broadcasting is about adjusting the shapes of two arrays so that they match. Consider an array 'A' and an array 'B' with shaps (3, Any) and (Any, 4). Here, 'Any' means that the given dimension of the array is unspecified and can take on any value that is convenient for functions operating on the array. If we add 'A' and 'B' together we'd like the two 'Any' dimensions to stretch appropriately so that the result was an array of shape (3, 4). Similarly adding and array of shape (3, 4) to an array of shape (Any, 4) should work and produce an array of shape (3, 4). So far, this is pretty straightforward; I believe, it also bears a fair amount of resemblance to Sasha's 0-stride ideas. 
The complicating factor is that there wasn't a good way to spell 'Any' at the time. Or maybe we were lazy. Or maybe there was some other reason that I'm forgetting. In any event, we ended up spelling 'Any' as '1'. That means that there's no way to distinguish between a dimension that's of length-1 for some legitimate reason and one that is that length just for stretchability. It would be an interesting experiment to see how things would work with no padding and with an explicit 'Any' value available for dimensions. However, it's probably too much work and would result in too many backwards compatibility problems for NumPy proper. [Half baked thoughts on how to do this though: newaxis would produce a new axis with length -1 (or some other marker length). This would be treated as length-1 axes are treated now. However, length-1axes would no longer broadcast. Padding would be right out.] In summary, the platonic ideal of broadcasting is simple and clean. In practice it's more complicated for two reasons. First, padding the dimensions.I believe that this is mostly historical baggage. The second is the conflation of '1' and 'Any' (a name that I made up for this message, so don't go searching for it). This may be an hostorical accident and/or implementation artifact, but there may actually be some more practical reasons behind this as well that I am forgetting. Hopefully that is mildly informative, Regards, -tim From kwgoodman at gmail.com Wed Apr 26 21:46:08 2006 From: kwgoodman at gmail.com (Keith Goodman) Date: Wed Apr 26 21:46:08 2006 Subject: [Numpy-discussion] matrix.std() returns array Message-ID: I noticed that the mean of a matrix is a matrix but the standard deviation of a matrix is an array. Is that the expected behavior? I'm also getting the wrong values (0 and nan) for the standard deviation. Did I mess something up? I'm trying to learn scipy (and python) by porting a small Octave program. I installed numpy from svn (today) on a Debian box. And numpy.test() says OK. Here's an example: >> numpy.__version__ '0.9.7.2416' >> x = asmatrix(random.uniform(0,1,(3,3))) >> x matrix([[ 0.56771284, 0.57053769, 0.57505946], [ 0.10479534, 0.81692248, 0.91829316], [ 0.48627829, 0.59255983, 0.32628573]]) >> x.mean(0) matrix([[ 0.38626216, 0.66000667, 0.60654612]]) >> x.std(0) array([ nan, 0. , 0. ]) From arnd.baecker at web.de Wed Apr 26 23:01:03 2006 From: arnd.baecker at web.de (Arnd Baecker) Date: Wed Apr 26 23:01:03 2006 Subject: [Numpy-discussion] concatenate, doc-string In-Reply-To: References: Message-ID: On Wed, 26 Apr 2006, Sasha wrote: > On 4/26/06, David M. Cooke wrote: > > .... > > Here's what I just checked in: > > > > concatenate((a1, a2, ...), axis=None) joins arrays together > > > > The tuple of sequences (a1, a2, ...) are joined along the given axis > > (default is the first one) into a single numpy array. > > > > Example: > > > > >>> concatenate( ([0,1,2], [5,6,7]) ) > > array([0, 1, 2, 5, 6, 7]) > > > > The first argument does not have to be a tuple: > > >>> print concatenate([[0,1,2], [5,6,7]]) > [0 1 2 5 6 7] > > but the docstring is probably ok given that the alternative is > "sequence of sequences" ... Seems to be the usual problem of either being slightly unprecise but understandable or legally correct but impossible to understand (in particular for beginners). 
What about changing the example to: """ Examples: >>> concatenate(([0, 1, 2], [5, 6, 7])) array([0, 1, 2, 5, 6, 7]) >>> concatenate([[0, 1, 2], [5, 6, 7]]) array([0, 1, 2, 5, 6, 7]) >>> z = arange(5) >>> concatenate(([0, 1, 2], [5, 6, 7], z)) array([0, 1, 2, 5, 6, 7, 0, 1, 2, 3, 4]) """ Best, Arnd From Chris.Barker at noaa.gov Wed Apr 26 23:42:02 2006 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Wed Apr 26 23:42:02 2006 Subject: [Numpy-discussion] array.min() vs. min(array) In-Reply-To: <444FE909.5080209@ieee.org> References: <444FE909.5080209@ieee.org> Message-ID: <445067C6.3050805@noaa.gov> Travis Oliphant wrote: > In Python 2.5 we are going to have the same issues with the new any() > and all() functions of Python. "Namespaces are one honking great idea -- let's do more of those!" Yet another reason to deprecate import * ! -Chris -- Christopher Barker, Ph.D. Oceanographer NOAA/OR&R/HAZMAT (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From arnd.baecker at web.de Wed Apr 26 23:49:06 2006 From: arnd.baecker at web.de (Arnd Baecker) Date: Wed Apr 26 23:49:06 2006 Subject: [Numpy-discussion] array.min() vs. min(array) In-Reply-To: <444FE909.5080209@ieee.org> References: <444FE909.5080209@ieee.org> Message-ID: Moin, On Wed, 26 Apr 2006, Travis Oliphant wrote: > Ryan Krauss wrote: > > I was spending some time trying to track down how to speed up an > > algorithm that gets called a bunch of times during an optimization. I > > was startled when I finally figured out that most of the time was > > wasted by using the built-in pyhton min function. It turns out that > > in my case, using array.min() (i.e. the method of the Numpy array) is > > 300-500 times faster than the built-in python min function (i.e. > > min(array)). > > > > So, thank you Travis and everyone who has put so much time into > > thinking through Numpy and making it fast (as well as making sure it > > is correct). > > The builtin min function is a bit confusing because it usually does work > on NumPy arrays. But, as you've noticed it is always slower because it > uses the "generic sequence interface" that NumPy arrays expose. So, > it's basically not much faster than a Python loop. In this case you are > also being hit by the fact that scalarmath is not yet implemented (it's > getting close though...) so the returned array scalars are being > compared using the bulky ufunc machinery on each element separately. > > In Python 2.5 we are going to have the same issues with the new any() > and all() functions of Python. I am just preparing a small text to collect such cases for the wiki. However, I am not sure about a good name for such a page: http://www.scipy.org/Cookbook/Speed http://www.scipy.org/Cookbook/SpeedProblems http://www.scipy.org/Cookbook/Performance ? (As usual, it is easy to start a page, than to properly maintain it. OTOH things like this get lost very quickly, in particular with this nice amount of traffic here). In addition this also relates to - profiling (For example I would like to add the contents of http://mail.enthought.com/pipermail/enthought-dev/2006-January/001075.html to the wiki at some point) - psyco - pyrex - f2py - weave - numexpr - ... Presently much of this is listed in the Cookbook under "Using NumPy With Other Languages (Advanced)", and therefore the above "Python only" issues don't quite fit. Any suggestions? 
Best, Arnd From arnd.baecker at web.de Wed Apr 26 23:51:07 2006 From: arnd.baecker at web.de (Arnd Baecker) Date: Wed Apr 26 23:51:07 2006 Subject: [Numpy-discussion] array.min() vs. min(array) In-Reply-To: <445067C6.3050805@noaa.gov> References: <444FE909.5080209@ieee.org> <445067C6.3050805@noaa.gov> Message-ID: On Wed, 26 Apr 2006, Christopher Barker wrote: > Travis Oliphant wrote: > > > In Python 2.5 we are going to have the same issues with the new any() > > and all() functions of Python. > > "Namespaces are one honking great idea -- let's do more of those!" > > Yet another reason to deprecate import * ! Yep! But it would not work for `min` as there is no such function in numpy. (would we need one?...) Best, Arnd From Chris.Barker at noaa.gov Thu Apr 27 00:00:05 2006 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Thu Apr 27 00:00:05 2006 Subject: [Numpy-discussion] Broadcasting rules (Ticket 76). In-Reply-To: References: <4050236.1146000212838.JavaMail.root@fed1wml05.mgt.cox.net> <444F0420.9000500@ieee.org> Message-ID: <44506BE6.10301@noaa.gov> As Sasha quite clearly pointed out, when you do aggregation, you really do want to reduce the dimensionality of your data. IN fact, that's something that always bit me with MATLAB. If I had a matrix that happened to have a dimension of 1, MATLAB would interpret it as a vector. I ended up writing functions like "SumColumns" that would check if it was a single row vector before calling sum, so that I wouldn't suddenly get a scaler result if a matrix happened to have on row. Once you reduce dimensionality with aggregating functions, I can see how it would be natural to want to use broadcasting to to merge the reduced data and full data. However, I can't see how you could do that cleanly. How is the code to know whether a rank-1 array represents a column or row when multiplied with a rank-2 array? There is simply no way to know, in general. I suppose we could define a convention, like: "rank-1 arrays will be interpreted as row vectors for broadcasting." etc. for higher dimensions. However, I've found that even in my code, I don't find one convention always makes the most sense for all applications, so I'm just as happy to make it clear with a lot of calls like: v.shape = (-1, 1) NOTE: It appears that numpy does, in fact, use such a convention: >>> v = N.arange(5) >>> m = N.ones((5,5)) >>> v * m array([[0, 1, 2, 3, 4], [0, 1, 2, 3, 4], [0, 1, 2, 3, 4], [0, 1, 2, 3, 4], [0, 1, 2, 3, 4]]) >>> v.shape = (-1,1) >>> v * m array([[0, 0, 0, 0, 0], [1, 1, 1, 1, 1], [2, 2, 2, 2, 2], [3, 3, 3, 3, 3], [4, 4, 4, 4, 4]]) So what's the disagreement about? -Chris -- Christopher Barker, Ph.D. Oceanographer NOAA/OR&R/HAZMAT (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From Chris.Barker at noaa.gov Thu Apr 27 00:10:03 2006 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Thu Apr 27 00:10:03 2006 Subject: [Numpy-discussion] concatenate, doc-string In-Reply-To: References: Message-ID: <44506E2F.9040902@noaa.gov> David M. Cooke wrote: > Here's what I just checked in: > > concatenate((a1, a2, ...), axis=None) joins arrays together > > The tuple of sequences (a1, a2, ...) are joined along the given axis > (default is the first one) into a single numpy array. 
> > Example: > > >>> concatenate( ([0,1,2], [5,6,7]) ) > array([0, 1, 2, 5, 6, 7]) While we're at it, why not an example of how the axis argument works: >>> concatenate( (ones((1,3)), zeros((1,3))) ) array([[1, 1, 1], [0, 0, 0]]) >>> concatenate( (ones((1,3)), zeros((1,3))), axis = 0 ) array([[1, 1, 1], [0, 0, 0]]) >>> concatenate( (ones((1,3)), zeros((1,3))), axis = 1 ) array([[1, 1, 1, 0, 0, 0]]) I'm not sure I like this example, but it's a easy way to do a one liner. -Chris -- Christopher Barker, Ph.D. Oceanographer NOAA/OR&R/HAZMAT (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From oliphant.travis at ieee.org Thu Apr 27 00:53:00 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Thu Apr 27 00:53:00 2006 Subject: [Numpy-discussion] matrix.std() returns array In-Reply-To: References: Message-ID: <4450780C.9060403@ieee.org> Keith Goodman wrote: > I noticed that the mean of a matrix is a matrix but the standard > deviation of a matrix is an array. Is that the expected behavior? I'm > also getting the wrong values (0 and nan) for the standard deviation. > Did I mess something up? > > I'm trying to learn scipy (and python) by porting a small Octave > program. I installed numpy from svn (today) on a Debian box. And > numpy.test() says OK. > > This should be fixed now in SVN. If somebody can add a test that would be great. Note, that the methods taking axes also now preserve row and column orientation for matrices. -Travis From oliphant.travis at ieee.org Thu Apr 27 01:03:04 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Thu Apr 27 01:03:04 2006 Subject: [Numpy-discussion] Warnings in NumPy SVN Message-ID: <44507A9D.8070902@ieee.org> I want to apologize for the relative instability of the SVN tree in the past couple of days. Getting the scalarmath layout working took more C-API changes than I had anticipated. The SVN version of NumPy now builds scalarmath by default. The basic layout of the module is complete. However, there are many basic functions that are missing. As a result, during compile you will get many warnings about undefined functions. If an attempt were made to load the module it would cause an error as well due to undefined symbols. These undefined symbols are all the basic operations on fundamental c data-types that either need a function defined or a #define statement made. The names have this form: @name at _ctype_@oper@ where @name@ is one of the 16 Number-like types and @oper@ is one of the operations needing to be supported. The function (or macro) needs to implement the operation on the basic data-type and if necessary set an error-flag in the floating-point registers. If anybody has time to help implement these basic operations, it would be greatly appreciated. -Travis From zpincus at stanford.edu Thu Apr 27 01:22:05 2006 From: zpincus at stanford.edu (Zachary Pincus) Date: Thu Apr 27 01:22:05 2006 Subject: [Numpy-discussion] matrix.std() returns array In-Reply-To: <4450780C.9060403@ieee.org> References: <4450780C.9060403@ieee.org> Message-ID: <05B8DC8B-CD68-4EF2-BB2B-6FFABABF812E@stanford.edu> On a slightly-related note, was anyone able to reproduce the exception with matrix types and the var() method? e.g. numpy.matrix([[1,2,3], [1,2,3]]).var() complains about unaligned data... Presumably if std is fixed in SVN, so is var. Also if a std unit test is added, a var one should be too. 
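[A sketch of what such a test might look like; the function name is hypothetical, and the expected types follow Travis's description above:]

    import numpy
    from numpy import asmatrix, random

    def test_matrix_std_var():
        # std/var of a matrix should come back as matrices and preserve
        # the row orientation, matching mean()
        x = asmatrix(random.uniform(0, 1, (3, 3)))
        assert isinstance(x.std(0), numpy.matrix)
        assert isinstance(x.var(0), numpy.matrix)
        assert x.std(0).shape == (1, 3)
        # values should agree with the plain-array result
        a = numpy.asarray(x)
        assert numpy.allclose(numpy.asarray(x.std(0)).ravel(), a.std(0))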
Zach On Apr 27, 2006, at 12:51 AM, Travis Oliphant wrote: > Keith Goodman wrote: >> I noticed that the mean of a matrix is a matrix but the standard >> deviation of a matrix is an array. Is that the expected behavior? I'm >> also getting the wrong values (0 and nan) for the standard deviation. >> Did I mess something up? >> >> I'm trying to learn scipy (and python) by porting a small Octave >> program. I installed numpy from svn (today) on a Debian box. And >> numpy.test() says OK. >> >> > This should be fixed now in SVN. If somebody can add a test that > would be great. > > Note, that the methods taking axes also now preserve row and column > orientation for matrices. > > -Travis > > > > ------------------------------------------------------- > Using Tomcat but need to do more? Need to support web services, > security? > Get stuff done quickly with pre-integrated technology to make your > job easier > Download IBM WebSphere Application Server v.1.0.1 based on Apache > Geronimo > http://sel.as-us.falkag.net/sel? > cmd=lnk&kid=120709&bid=263057&dat=121642 > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/numpy-discussion From arnd.baecker at web.de Thu Apr 27 03:06:17 2006 From: arnd.baecker at web.de (Arnd Baecker) Date: Thu Apr 27 03:06:17 2006 Subject: [Numpy-discussion] vectorize problem In-Reply-To: <444FA7E7.2070303@ieee.org> References: <200604251324.42987.steffen.loeck@gmx.de> <444FA7E7.2070303@ieee.org> Message-ID: On Wed, 26 Apr 2006, Travis Oliphant wrote: [...] > It is just a simple change. Scalars are supposed to be supported. > They aren't only as a side-effect of the switch to not return > object-scalars. I did not update the vectorize code to handle the > scalar return value from the object ufunc (which is now no-longer an > object-scalar with the methods of arrays (like astype) but is instead > the underlying object). > > I'll add a check. Works perfect now - many thanks! This reminds me of some other issue when trying to vectorize f2py-wrapped functions: Pearu suggested a fix in terms of a more general way to determine the number of arguments of a callable Python object, http://www.scipy.net/pipermail/scipy-user/2006-April/007617.html However, it seems that this has fallen through the cracks (and I don't see how to incorporate it into numpy.vectorize...) Is this another simple one? ;-) Many thanks, Arnd From gruben at bigpond.net.au Thu Apr 27 05:05:02 2006 From: gruben at bigpond.net.au (Gary Ruben) Date: Thu Apr 27 05:05:02 2006 Subject: [Numpy-discussion] array.min() vs. min(array) In-Reply-To: References: <444FE909.5080209@ieee.org> Message-ID: <4450B34F.8010501@bigpond.net.au> Hi Arnd, You could call it PerformanceTips and include some search terms like "speed" in the page so search engines pick them up. Gary R. Arnd Baecker wrote: > I am just preparing a small text to collect such cases for the wiki. > > However, I am not sure about a good name for such a page: > http://www.scipy.org/Cookbook/Speed > http://www.scipy.org/Cookbook/SpeedProblems > http://www.scipy.org/Cookbook/Performance > ? From ryanlists at gmail.com Thu Apr 27 06:41:08 2006 From: ryanlists at gmail.com (Ryan Krauss) Date: Thu Apr 27 06:41:08 2006 Subject: [Numpy-discussion] array.min() vs. min(array) In-Reply-To: <4450B34F.8010501@bigpond.net.au> References: <444FE909.5080209@ieee.org> <4450B34F.8010501@bigpond.net.au> Message-ID: I think this is a great idea. 
We get a lot of these kinds of questions on the list, and the collective wisdom of people here who have really dug into this is really impressive. But, that wisdom does need to be a little easier to find. Speaking of which, I don't always feel like I get trustworthy results out of the profiler, so when I really want to know what is going on I find myself doing this a lot: t1=time.time() [block of code here] t2=time.time() [more code] t3=time.time() and then comparing t3-t2 and t2-t1 to narrow down where the code is spending its time. Does anyone have good tips on how to do good profiling? Or is this question so vague and counter-intuitive that I seem silly and I had better come back with a believable example? Thanks, Ryan On 4/27/06, Gary Ruben wrote: > Hi Arnd, > > You could call it PerformanceTips and include some search terms like > "speed" in the page so search engines pick them up. > > Gary R. > > Arnd Baecker wrote: > > > I am just preparing a small text to collect such cases for the wiki. > > > > However, I am not sure about a good name for such a page: > > http://www.scipy.org/Cookbook/Speed > > http://www.scipy.org/Cookbook/SpeedProblems > > http://www.scipy.org/Cookbook/Performance > > ? > > > > ------------------------------------------------------- > Using Tomcat but need to do more? Need to support web services, security? > Get stuff done quickly with pre-integrated technology to make your job easier > Download IBM WebSphere Application Server v.1.0.1 based on Apache Geronimo > http://sel.as-us.falkag.net/sel?cmd=lnk&kid=120709&bid=263057&dat=121642 > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/numpy-discussion > From arnd.baecker at web.de Thu Apr 27 06:56:08 2006 From: arnd.baecker at web.de (Arnd Baecker) Date: Thu Apr 27 06:56:08 2006 Subject: [Numpy-discussion] array.min() vs. min(array) In-Reply-To: <4450B34F.8010501@bigpond.net.au> References: <444FE909.5080209@ieee.org> <4450B34F.8010501@bigpond.net.au> Message-ID: On Thu, 27 Apr 2006, Gary Ruben wrote: > Hi Arnd, > > You could call it PerformanceTips and include some search terms like > "speed" in the page so search engines pick them up. Alright, I put all I know on this (which is not that much ;-) at http://www.scipy.org/PerformanceTips The pointers to weave/f2py/pyrex/ (ah - psyco is missing) will have to be added. Also the profiling/benchmarking aspect, which is important (actually more important even before thinking about PerformanceTips) needs to be put somewhere, maybe even separately under http://www.scipy.org/BenchmarkingAndProfiling Best, Arnd > Gary R. > > Arnd Baecker wrote: > > > I am just preparing a small text to collect such cases for the wiki. > > > > However, I am not sure about a good name for such a page: > > http://www.scipy.org/Cookbook/Speed > > http://www.scipy.org/Cookbook/SpeedProblems > > http://www.scipy.org/Cookbook/Performance > > ? > > > > ------------------------------------------------------- > Using Tomcat but need to do more? Need to support web services, security? 
> Get stuff done quickly with pre-integrated technology to make your job easier > Download IBM WebSphere Application Server v.1.0.1 based on Apache Geronimo > http://sel.as-us.falkag.net/sel?cmd=lnk&kid=120709&bid=263057&dat=121642 > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/numpy-discussion > > From arnd.baecker at web.de Thu Apr 27 07:02:16 2006 From: arnd.baecker at web.de (Arnd Baecker) Date: Thu Apr 27 07:02:16 2006 Subject: [Numpy-discussion] array.min() vs. min(array) In-Reply-To: References: <444FE909.5080209@ieee.org> <4450B34F.8010501@bigpond.net.au> Message-ID: On Thu, 27 Apr 2006, Ryan Krauss wrote: > I think this is a great idea. We get a lot of these kinds of > questions on the list, and the collective wisdom of people here who > have really dug into this is really impressive. But, that wisdom does > need to be a little easier to find. > > Speaking of which, I don't always feel like I get trustworthy results > out of the profiler, so when I really want to know what is going on I > find myself doing this alot: > > t1=time.time() > [block of code here] > t2=time.time() > [more code] > t3=time.time() > > and then comparing t3-t2 and t2-t1 to narrow down where the code is > spending its time. > > Does anyone have good tips on how to do good profiling? Or is this > question so vague and counter-intuitive that I seem silly and I had > better come back with a believable example? Maybe this one is of interest then: http://www.physik.tu-dresden.de/~baecker/comp_talks.html and goto "Python and Co - some recent developments" Quite late in the talk there is an example on Profiling (sorry, it seems that no direct linking is possible) The corresponding files are at http://www.physik.tu-dresden.de/~baecker/talks/pyco/BenchExamples/ Essentially it is an example of using kcachegrind to display the results of hotshot (see also: http://mail.enthought.com/pipermail/enthought-dev/2006-January/001075.html ) Best, Arnd > Thanks, > > Ryan > > On 4/27/06, Gary Ruben wrote: > > Hi Arnd, > > > > You could call it PerformanceTips and include some search terms like > > "speed" in the page so search engines pick them up. > > > > Gary R. > > > > Arnd Baecker wrote: > > > > > I am just preparing a small text to collect such cases for the wiki. > > > > > > However, I am not sure about a good name for such a page: > > > http://www.scipy.org/Cookbook/Speed > > > http://www.scipy.org/Cookbook/SpeedProblems > > > http://www.scipy.org/Cookbook/Performance > > > ? > > > > > > > > ------------------------------------------------------- > > Using Tomcat but need to do more? Need to support web services, security? > > Get stuff done quickly with pre-integrated technology to make your job easier > > Download IBM WebSphere Application Server v.1.0.1 based on Apache Geronimo > > http://sel.as-us.falkag.net/sel?cmd=lnk&kid=120709&bid=263057&dat=121642 > > _______________________________________________ > > Numpy-discussion mailing list > > Numpy-discussion at lists.sourceforge.net > > https://lists.sourceforge.net/lists/listinfo/numpy-discussion > > > > > ------------------------------------------------------- > Using Tomcat but need to do more? Need to support web services, security? 
> Get stuff done quickly with pre-integrated technology to make your job easier > Download IBM WebSphere Application Server v.1.0.1 based on Apache Geronimo > http://sel.as-us.falkag.net/sel?cmd=lnk&kid=120709&bid=263057&dat=121642 > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/numpy-discussion > > From faltet at carabos.com Thu Apr 27 07:08:06 2006 From: faltet at carabos.com (Francesc Altet) Date: Thu Apr 27 07:08:06 2006 Subject: [Numpy-discussion] array.min() vs. min(array) In-Reply-To: References: <4450B34F.8010501@bigpond.net.au> Message-ID: <200604271606.52780.faltet@carabos.com> On Thursday 27 April 2006 15:40, Ryan Krauss wrote: > I think this is a great idea. We get a lot of these kinds of > questions on the list, and the collective wisdom of people here who > have really dug into this is really impressive. But, that wisdom does > need to be a little easier to find. > > Speaking of which, I don't always feel like I get trustworthy results > out of the profiler, so when I really want to know what is going on I > find myself doing this alot: > > t1=time.time() > [block of code here] > t2=time.time() > [more code] > t3=time.time() > > and then comparing t3-t2 and t2-t1 to narrow down where the code is > spending its time. > > Does anyone have good tips on how to do good profiling? Or is this > question so vague and counter-intuitive that I seem silly and I had > better come back with a believable example? Well, if you are on Linux, and want to time C extensions, then oprofile is a *very* good option. Another profiling tool is Cachegrind, part of Valgrind. It uses the processor emulation of Valgrind to run the executable, and catches all memory accesses for the trace. In addition, you can combine the output of oprofile with Cachegrind. In [3] one can see more info about these and more tools. [1] http://oprofile.sourceforge.net [2] http://kcachegrind.sourceforge.net/ [3] https://uimon.cern.ch/twiki/bin/view/Atlas/OptimisingCode Cheers, -- >0,0< Francesc Altet http://www.carabos.com/ V V Cárabos Coop. V. Enjoy Data "-" From lroubeyrie at limair.asso.fr Thu Apr 27 08:41:03 2006 From: lroubeyrie at limair.asso.fr (Lionel Roubeyrie) Date: Thu Apr 27 08:41:03 2006 Subject: [Numpy-discussion] equality with masked object In-Reply-To: References: <200604250938.48648.lroubeyrie@limair.asso.fr> Message-ID: <200604271740.11385.lroubeyrie@limair.asso.fr> Hi, thanks for your answer, but my problem is that I want to obtain the index of the max value in each column of a 2d masked array, then how can I do that without comparison? Thanks On Tuesday 25 April 2006 15:10, Sasha wrote: > On 4/25/06, Lionel Roubeyrie wrote: > > Why 5.0 == -- return True? A float is it the same as a masked object? > > thanks > > It does not. It returns ma.masked : > >>> test[3] is ma.masked > > True > > You should not access masked data - it makes no sense. The current > behavior is historical and I don't really like it. Masked scalars are > replaced by ma.masked singleton in subscript operations to allow a[i] > is masked idiom. In my view it is not worth the trouble, but my > suggestion to get rid of that feature was not met with much > enthusiasm. > > > ------------------------------------------------------- > Using Tomcat but need to do more? Need to support web services, security? 
> Get stuff done quickly with pre-integrated technology to make your job > easier Download IBM WebSphere Application Server v.1.0.1 based on Apache > Geronimo http://sel.as-us.falkag.net/sel?cmd=lnk&kid=120709&bid=263057&dat=121642 > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/numpy-discussion -- Lionel Roubeyrie - lroubeyrie at limair.asso.fr LIMAIR http://www.limair.asso.fr From ndarray at mac.com Thu Apr 27 08:57:07 2006 From: ndarray at mac.com (Sasha) Date: Thu Apr 27 08:57:07 2006 Subject: [Numpy-discussion] equality with masked object In-Reply-To: <200604271740.11385.lroubeyrie@limair.asso.fr> References: <200604250938.48648.lroubeyrie@limair.asso.fr> <200604271740.11385.lroubeyrie@limair.asso.fr> Message-ID: On 4/27/06, Lionel Roubeyrie wrote: >[....................] I want to obtain the index of > the max value in each column of a 2d masked array, then how can I do that > without comparaison? ma.argmax(x, axis=0, fill_value=ma.maximum_fill_value(x)) or better: argmax(x.filled(ma.maximum_fill_value(x)), axis=0) From kwgoodman at gmail.com Thu Apr 27 09:32:10 2006 From: kwgoodman at gmail.com (Keith Goodman) Date: Thu Apr 27 09:32:10 2006 Subject: [Numpy-discussion] Broadcasting rules (Ticket 76). In-Reply-To: <44506BE6.10301@noaa.gov> References: <4050236.1146000212838.JavaMail.root@fed1wml05.mgt.cox.net> <444F0420.9000500@ieee.org> <44506BE6.10301@noaa.gov> Message-ID: On 4/26/06, Christopher Barker wrote: > something that always bit me with MATLAB. If I had a matrix that > happened to have a dimension of 1, MATLAB would interpret it as a > vector. I ended up writing functions like "SumColumns" that would check > if it was a single row vector before calling sum, so that I wouldn't > suddenly get a scaler result if a matrix happened to have on row. In Octave or Matlab, all you need to do is sum(x,1). For example: >> x = rand(1,4) x = 0.56755 0.24575 0.53804 0.36521 >> sum(x,1) ans = 0.56755 0.24575 0.53804 0.36521 From schofield at ftw.at Thu Apr 27 09:50:03 2006 From: schofield at ftw.at (Ed Schofield) Date: Thu Apr 27 09:50:03 2006 Subject: [Numpy-discussion] matrix operations with axis=None In-Reply-To: <4450780C.9060403@ieee.org> References: <4450780C.9060403@ieee.org> Message-ID: <4450F6F4.2060800@ftw.at> Travis Oliphant wrote: > Keith Goodman wrote: >> I noticed that the mean of a matrix is a matrix but the standard >> deviation of a matrix is an array. Is that the expected behavior? I'm >> also getting the wrong values (0 and nan) for the standard deviation. >> Did I mess something up? > This should be fixed now in SVN. If somebody can add a test that > would be great. > > Note, that the methods taking axes also now preserve row and column > orientation for matrices. > Well done for doing this. In fact, you beat me to it by a few hours; I was going to post a patch this morning to preserve orientation with matrix operations. The approach I took was different in one respect. Matrix objects currently return a matrix of shape (1, 1) from methods with an axis=None argument. For example: >>> x = asmatrix(random.uniform(0,1,(3,3))) >>> x.std() matrix([[ 0.26890557]]) >>> x.argmax() matrix([[4]]) I believe this behaviour is unfortunate, and that an operation aggregating a matrix over all dimensions should return a scalar. I've posted a patch at http://projects.scipy.org/scipy/numpy/ticket/83 that modifies this behaviour to return scalars (as rank-0 arrays) instead. 
It also removes some code duplication. The behaviour with the patch is: >>> x.std() 0.29610630190701492 >>> x.std().shape () >>> x.argmax() 3 Returning scalars from methods with an axis=None argument is the current behaviour of scipy sparse matrices, while axis=0 or axis=1 yields a sparse matrix with height or width 1, like numpy matrices. A (1 x 1) sparse matrix would be a strange object indeed, and would not be usable in all contexts where scalars are expected. I suspect the same would hold for (1 x 1) dense matrices. One example is that they cannot be used as indices for Python lists. For some matrix methods, such as argmax, returning a scalar would be highly desirable by allowing simpler code. A potential drawback to this change is that matrix operations aggregating along all dimensions, which would now share the behaviour of numpy arrays, would no longer be consistent with matrix operations that aggregate along only one dimension, which currently do not reduce dimension, because matrices are inherently 2-d. This could be an argument for introducing a new vector class to represent one-dimensional data with orientation. -- Ed From gnchen at cortechs.net Thu Apr 27 09:56:12 2006 From: gnchen at cortechs.net (Gennan Chen) Date: Thu Apr 27 09:56:12 2006 Subject: [Numpy-discussion] newbie for writing numpy/scipy extensions Message-ID: <228EDE46-B760-44BA-A987-273F2ADC9B81@cortechs.net> Hi! All, I just started writing my own python extension based on numpy. Couple of questions here: 1. I have some utility functions, such as wrappers for PyArray_GETPTR*, that need to be accessed by different extension modules. So, I put them in utils.h and utils.c. In utils.h, I need to include "numpy/arrayobject.h". But the compilation failed when I include it again in my extension module file, wrap.c: #include "numpy/arrayobject.h" #include "utils.h" When I remove it and use #include "utils.h" the compilation works. So, is it true that I can only include arrayobject.h once? 2. Which import should I use in my init function: import_array() or import_libnumarray() Gen-Nan Chen, PhD Chief Scientist Research and Development Group CorTechs Labs Inc (www.cortechs.net) 1020 Prospect St., #304, La Jolla, CA, 92037 Tel: 1-858-459-9700 ext 16 Fax: 1-858-459-9705 Email: gnchen at cortechs.net From ndarray at mac.com Thu Apr 27 09:59:11 2006 From: ndarray at mac.com (Sasha) Date: Thu Apr 27 09:59:11 2006 Subject: [Numpy-discussion] Warnings in NumPy SVN In-Reply-To: <44507A9D.8070902@ieee.org> References: <44507A9D.8070902@ieee.org> Message-ID: On 4/27/06, Travis Oliphant wrote: > [...] > The function (or macro) needs to implement the operation on the basic > data-type and if necessary set an error-flag in the floating-point > registers. > > If anybody has time to help implement these basic operations, it would > be greatly appreciated. I can help. To make sure we don't duplicate our effort, let's do the following: 1. I will add place-holders for all the necessary functions to make them return "NotImplemented". 2. I will then follow up with the list of functions that need to be filled out and we can then split the work. 3. We will also need to write tests that will make sure scalars behave similarly to dimensionless arrays. If anyone would like to help with this, it will be greatly appreciated. No C coding skills are necessary for that. 
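A sketch of the kind of cross-check point 3 calls for -- the types and operators below are placeholders for a fuller sweep, and operator.div is the Python 2 division hook:

import operator
import numpy

def check_scalar_vs_zero_d(typ, a, b):
    sa, sb = typ(a), typ(b)                            # array scalars
    za, zb = numpy.array(a, typ), numpy.array(b, typ)  # 0-d arrays (ufunc path)
    for op in (operator.add, operator.sub, operator.mul, operator.div):
        assert op(sa, sb) == op(za, zb), (typ, op.__name__)

for typ in (numpy.int16, numpy.int32, numpy.float32, numpy.float64):
    check_scalar_vs_zero_d(typ, 7, 3)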
From oliphant at ee.byu.edu Thu Apr 27 10:01:07 2006 From: oliphant at ee.byu.edu (Travis Oliphant) Date: Thu Apr 27 10:01:07 2006 Subject: [Numpy-discussion] matrix operations with axis=None In-Reply-To: <4450F6F4.2060800@ftw.at> References: <4450780C.9060403@ieee.org> <4450F6F4.2060800@ftw.at> Message-ID: <4450F7F2.1050707@ee.byu.edu> Ed Schofield wrote: >Travis Oliphant wrote: > > >>Keith Goodman wrote: >> >> >>>I noticed that the mean of a matrix is a matrix but the standard >>>deviation of a matrix is an array. Is that the expected behavior? I'm >>>also getting the wrong values (0 and nan) for the standard deviation. >>>Did I mess something up? >>> >>> >>This should be fixed now in SVN. If somebody can add a test that >>would be great. >> >>Note, that the methods taking axes also now preserve row and column >>orientation for matrices. >> >> >> >Well done for doing this. > >In fact, you beat me to it by a few hours; I was going to post a patch >this morning to preserve orientation with matrix operations. The >approach I took was different in one respect. > > I like your function-call approach as it ensures consistent behavior. >Returning scalars from methods with an axis=None argument is the current >behaviour of scipy sparse matrices, while axis=0 or axis=1 yields a >sparse matrix with height or width 1, like numpy matrices. A (1 x 1) >sparse matrix would be a strange object indeed, and would not be usable >in all contexts where scalars are expected. I suspect the same would >hold for (1 x 1) dense matrices. One example is that they cannot be >used as indices for Python lists. For some matrix methods, such as >argmax, returning a scalar would be highly desirable by allowing simpler >code. > >A potential drawback to this change is that matrix operations >aggregating along all dimensions, which would now share the behaviour of >numpy arrays, would be no longer be consistent with matrix operations >that aggregate along only one dimension, which currently do not reduce >dimension, because matrices are inherently 2-d. This could be an >argument for introducing a new vector class to represent one-dimensional >data with orientation. > > There is one more problem in that matrix-operations will not be preserved in all cases as they would have before. However, I suppose somebody doing a reduce over all dimensions would probably not expect the result to be a matrix, so I don't think it's a big drawback. Consistency with sparse matrices is another reason for returning a scalar. -Travis From Fernando.Perez at colorado.edu Thu Apr 27 10:04:01 2006 From: Fernando.Perez at colorado.edu (Fernando Perez) Date: Thu Apr 27 10:04:01 2006 Subject: [Numpy-discussion] Warnings in NumPy SVN In-Reply-To: References: <44507A9D.8070902@ieee.org> Message-ID: <4450F93D.9050905@colorado.edu> Sasha wrote: > On 4/27/06, Travis Oliphant wrote: > >>[...] >>The function (or macro) needs to implement the operation on the basic >>data-type and if necessary set an error-flag in the floating-point >>registers. >> >>If anybody has time to help implement these basic operations, it would >>be greatly appreciated. > > > I can help. To make sure we don't duplicate our effort, let's do the following: > > 1. I will add place-holders for all the necessary functions to make > them return "NotImplemented". just a minor reminder: raise NotImplementedError is the standard idiom for this. 
Cheers, f From kwgoodman at gmail.com Thu Apr 27 10:05:05 2006 From: kwgoodman at gmail.com (Keith Goodman) Date: Thu Apr 27 10:05:05 2006 Subject: [Numpy-discussion] matrix.std() returns array In-Reply-To: <4450780C.9060403@ieee.org> References: <4450780C.9060403@ieee.org> Message-ID: On 4/27/06, Travis Oliphant wrote: > This should be fixed now in SVN. If somebody can add a test that would > be great. > > Note, that the methods taking axes also now preserve row and column > orientation for matrices. Hey, it works. Thank you. From ndarray at mac.com Thu Apr 27 10:52:01 2006 From: ndarray at mac.com (Sasha) Date: Thu Apr 27 10:52:01 2006 Subject: [Numpy-discussion] Broadcasting rules (Ticket 76). In-Reply-To: References: <4050236.1146000212838.JavaMail.root@fed1wml05.mgt.cox.net> <444F0420.9000500@ieee.org> <44506BE6.10301@noaa.gov> Message-ID: On 4/27/06, Keith Goodman wrote: > [...] > In Octave or Matlab, all you need to do is sum(x,1). For example: > > >> x = rand(1,4) > x = > > 0.56755 0.24575 0.53804 0.36521 > > >> sum(x,1) > ans = > > 0.56755 0.24575 0.53804 0.36521 > How is this different from Numpy: >>> x = matrix(rand(4)) >>> sum(x.T, 1) matrix([[ 0.36186805], [ 0.90198107], [ 0.60407661], [ 0.49523327]]) From kwgoodman at gmail.com Thu Apr 27 11:05:03 2006 From: kwgoodman at gmail.com (Keith Goodman) Date: Thu Apr 27 11:05:03 2006 Subject: [Numpy-discussion] Broadcasting rules (Ticket 76). In-Reply-To: References: <4050236.1146000212838.JavaMail.root@fed1wml05.mgt.cox.net> <444F0420.9000500@ieee.org> <44506BE6.10301@noaa.gov> Message-ID: On 4/27/06, Sasha wrote: > On 4/27/06, Keith Goodman wrote: > > [...] > > In Octave or Matlab, all you need to do is sum(x,1). For example: > > > > >> x = rand(1,4) > > x = > > > > 0.56755 0.24575 0.53804 0.36521 > > > > >> sum(x,1) > > ans = > > > > 0.56755 0.24575 0.53804 0.36521 > > > > How is this different from Numpy: > > >>> x = matrix(rand(4)) > >>> sum(x.T, 1) > matrix([[ 0.36186805], > [ 0.90198107], > [ 0.60407661], > [ 0.49523327]]) > Exactly. That's why the OP doesn't need to write a special function in Matlab called SumColumns. From Chris.Barker at noaa.gov Thu Apr 27 11:11:03 2006 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Thu Apr 27 11:11:03 2006 Subject: [Numpy-discussion] Broadcasting rules (Ticket 76). In-Reply-To: References: <4050236.1146000212838.JavaMail.root@fed1wml05.mgt.cox.net> <444F0420.9000500@ieee.org> <44506BE6.10301@noaa.gov> Message-ID: <4451090C.5020901@noaa.gov> Keith Goodman wrote: > Exactly. That's why the OP doesn't need to write a special function in > Matlab called SumColumns. "Didn't". I haven't used MATLAB for much in years. Back in the day, that feature didn't exist. Or at least was poorly enough documented that I didn't think it existed. Matlab also used to support only 2-d arrays. Anyway, the point was that a (n,) array and a (n,1) array and a (1,n) array are all different, and that difference should be preserved. I'm still confused as to what behavior Sasha wants that doesn't exist. -Chris -- Christopher Barker, Ph.D. 
Oceanographer NOAA/OR&R/HAZMAT (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From oliphant at ee.byu.edu Thu Apr 27 11:17:02 2006 From: oliphant at ee.byu.edu (Travis Oliphant) Date: Thu Apr 27 11:17:02 2006 Subject: [Numpy-discussion] Warnings in NumPy SVN In-Reply-To: References: <44507A9D.8070902@ieee.org> Message-ID: <44510A6E.4090906@ee.byu.edu> Sasha wrote: >On 4/27/06, Travis Oliphant wrote: > > >>[...] >>The function (or macro) needs to implement the operation on the basic >>data-type and if necessary set an error-flag in the floating-point >>registers. >> >>If anybody has time to help implement these basic operations, it would >>be greatly appreciated. >> >> > >I can help. To make sure we don't duplicate our effort, let's do the following: > > > Thanks for your help. >1. I will add place-holders for all the necessary functions to make > > >them return "NotImplemented". > > The Python-object-returning functions are already there. All that is missing is the ctype functions to actually do the computation. So, I'm not sure what you mean. >2. I will then follow up with the list of functions that need to be >filled out and we can then split the work. > > This would be good to get a list. Some of the functions may require some repetition of what's in umathmodule.c. Let's just do the repetition for now and think about code refactoring after we know better what is actually duplicated. >3. We will also need to write tests that will make sure scalars behave >similar to dimensionless arrays. If anyone would like to help with >this, it will be greately appreciated. No C coding skills are >necessary for that. > > Tests would be necessary to ensure consistency. Thanks for jumping in... -Travis From cookedm at physics.mcmaster.ca Thu Apr 27 11:30:05 2006 From: cookedm at physics.mcmaster.ca (David M. Cooke) Date: Thu Apr 27 11:30:05 2006 Subject: [Numpy-discussion] Warnings in NumPy SVN In-Reply-To: <4450F93D.9050905@colorado.edu> (Fernando Perez's message of "Thu, 27 Apr 2006 11:02:53 -0600") References: <44507A9D.8070902@ieee.org> <4450F93D.9050905@colorado.edu> Message-ID: Fernando Perez writes: > Sasha wrote: >> On 4/27/06, Travis Oliphant wrote: >> >>>[...] >>>The function (or macro) needs to implement the operation on the basic >>>data-type and if necessary set an error-flag in the floating-point >>>registers. >>> >>>If anybody has time to help implement these basic operations, it would >>>be greatly appreciated. >> I can help. To make sure we don't duplicate our effort, let's do >> the following: >> 1. I will add place-holders for all the necessary functions to make >> them return "NotImplemented". > > just a minor reminder: > > raise NotImplementedError > > is the standard idiom for this. Just a note: For __xxx__ methods, "return NotImplemented" is the standard idiom. See section 3.3.8 (Coercion rules) of the Python 2.4 language manual: For most intents and purposes, an operator that returns NotImplemented is treated the same as one that is not implemented at all. I believe the idea is that it's not actually an error for an __xxx__ method to not be implemented, as there are fallbacks. -- |>|\/|< /--------------------------------------------------------------------------\ |David M. 
Cooke http://arbutus.physics.mcmaster.ca/dmc/ |cookedm at physics.mcmaster.ca From ndarray at mac.com Thu Apr 27 11:32:08 2006 From: ndarray at mac.com (Sasha) Date: Thu Apr 27 11:32:08 2006 Subject: [Numpy-discussion] Warnings in NumPy SVN In-Reply-To: <44510A6E.4090906@ee.byu.edu> References: <44507A9D.8070902@ieee.org> <44510A6E.4090906@ee.byu.edu> Message-ID: On 4/27/06, Travis Oliphant wrote: > [ ... ] > > The Python-object-returning functions are already there. All that is > missing is the ctype functions to actually do the computation. So, I'm > not sure what you mean. > I did not realize that. However, it is still reasonable to add non-working prototypes to kill the warnings first marked by /* XXX */. I will do that before the end of the day. > >2. I will then follow up with the list of functions that need to be > >filled out and we can then split the work. > > > > > This would be good to get a list. See attached. -------------- next part -------------- byte_ctype_multiply ubyte_ctype_multiply short_ctype_multiply ushort_ctype_multiply int_ctype_multiply uint_ctype_multiply long_ctype_multiply ulong_ctype_multiply longlong_ctype_multiply ulonglong_ctype_multiply byte_ctype_divide ubyte_ctype_divide short_ctype_divide ushort_ctype_divide int_ctype_divide uint_ctype_divide long_ctype_divide ulong_ctype_divide longlong_ctype_divide ulonglong_ctype_divide byte_ctype_remainder ubyte_ctype_remainder short_ctype_remainder ushort_ctype_remainder int_ctype_remainder uint_ctype_remainder long_ctype_remainder ulong_ctype_remainder longlong_ctype_remainder ulonglong_ctype_remainder byte_ctype_divmod ubyte_ctype_divmod short_ctype_divmod ushort_ctype_divmod int_ctype_divmod uint_ctype_divmod long_ctype_divmod ulong_ctype_divmod longlong_ctype_divmod ulonglong_ctype_divmod byte_ctype_power ubyte_ctype_power short_ctype_power ushort_ctype_power int_ctype_power uint_ctype_power long_ctype_power ulong_ctype_power longlong_ctype_power ulonglong_ctype_power byte_ctype_floor_divide ubyte_ctype_floor_divide short_ctype_floor_divide ushort_ctype_floor_divide int_ctype_floor_divide uint_ctype_floor_divide long_ctype_floor_divide ulong_ctype_floor_divide longlong_ctype_floor_divide ulonglong_ctype_floor_divide byte_ctype_true_divide ubyte_ctype_true_divide short_ctype_true_divide ushort_ctype_true_divide int_ctype_true_divide uint_ctype_true_divide long_ctype_true_divide ulong_ctype_true_divide longlong_ctype_true_divide ulonglong_ctype_true_divide byte_ctype_lshift ubyte_ctype_lshift short_ctype_lshift ushort_ctype_lshift int_ctype_lshift uint_ctype_lshift long_ctype_lshift ulong_ctype_lshift longlong_ctype_lshift ulonglong_ctype_lshift byte_ctype_rshift ubyte_ctype_rshift short_ctype_rshift ushort_ctype_rshift int_ctype_rshift uint_ctype_rshift long_ctype_rshift ulong_ctype_rshift longlong_ctype_rshift ulonglong_ctype_rshift byte_ctype_and ubyte_ctype_and short_ctype_and ushort_ctype_and int_ctype_and uint_ctype_and long_ctype_and ulong_ctype_and longlong_ctype_and ulonglong_ctype_and byte_ctype_or ubyte_ctype_or short_ctype_or ushort_ctype_or int_ctype_or uint_ctype_or long_ctype_or ulong_ctype_or longlong_ctype_or ulonglong_ctype_or byte_ctype_xor ubyte_ctype_xor short_ctype_xor ushort_ctype_xor int_ctype_xor uint_ctype_xor long_ctype_xor ulong_ctype_xor longlong_ctype_xor ulonglong_ctype_xor float_ctype_remainder double_ctype_remainder longdouble_ctype_remainder cfloat_ctype_remainder cdouble_ctype_remainder clongdouble_ctype_remainder float_ctype_divmod double_ctype_divmod longdouble_ctype_divmod 
cfloat_ctype_divmod cdouble_ctype_divmod clongdouble_ctype_divmod float_ctype_power double_ctype_power longdouble_ctype_power cfloat_ctype_power cdouble_ctype_power clongdouble_ctype_power cfloat_cfloat_divide cdouble_cfloat_divide clongdouble_cfloat_divide byte_ctype_negative ubyte_ctype_negative short_ctype_negative ushort_ctype_negative int_ctype_negative uint_ctype_negative long_ctype_negative ulong_ctype_negative longlong_ctype_negative ulonglong_ctype_negative float_ctype_negative double_ctype_negative longdouble_ctype_negative cfloat_ctype_negative cdouble_ctype_negative clongdouble_ctype_negative byte_ctype_positive ubyte_ctype_positive short_ctype_positive ushort_ctype_positive int_ctype_positive uint_ctype_positive long_ctype_positive ulong_ctype_positive longlong_ctype_positive ulonglong_ctype_positive float_ctype_positive double_ctype_positive longdouble_ctype_positive cfloat_ctype_positive cdouble_ctype_positive clongdouble_ctype_positive byte_ctype_absolute ubyte_ctype_absolute short_ctype_absolute ushort_ctype_absolute int_ctype_absolute uint_ctype_absolute long_ctype_absolute ulong_ctype_absolute longlong_ctype_absolute ulonglong_ctype_absolute float_ctype_absolute double_ctype_absolute longdouble_ctype_absolute cfloat_ctype_absolute cdouble_ctype_absolute clongdouble_ctype_absolute byte_ctype_nonzero ubyte_ctype_nonzero short_ctype_nonzero ushort_ctype_nonzero int_ctype_nonzero uint_ctype_nonzero long_ctype_nonzero ulong_ctype_nonzero longlong_ctype_nonzero ulonglong_ctype_nonzero float_ctype_nonzero double_ctype_nonzero longdouble_ctype_nonzero cfloat_ctype_nonzero cdouble_ctype_nonzero clongdouble_ctype_nonzero byte_ctype_invert ubyte_ctype_invert short_ctype_invert ushort_ctype_invert int_ctype_invert uint_ctype_invert long_ctype_invert ulong_ctype_invert longlong_ctype_invert ulonglong_ctype_invert float_ctype_invert double_ctype_invert longdouble_ctype_invert cfloat_ctype_invert cdouble_ctype_invert clongdouble_ctype_invert From cookedm at physics.mcmaster.ca Thu Apr 27 11:32:11 2006 From: cookedm at physics.mcmaster.ca (David M. Cooke) Date: Thu Apr 27 11:32:11 2006 Subject: [Numpy-discussion] newbie for writing numpy/scipy extensions In-Reply-To: <228EDE46-B760-44BA-A987-273F2ADC9B81@cortechs.net> (Gennan Chen's message of "Thu, 27 Apr 2006 09:55:42 -0700") References: <228EDE46-B760-44BA-A987-273F2ADC9B81@cortechs.net> Message-ID: Gennan Chen writes: > Hi! All, > > I just start writing my own python extension based on numpy. Couple > of questions here: > > 1. I have some utility functions, such as wrappers for > PyArray_GETPTR* needed be access by different extension modules. So, > I put them in utlis.h and utlis.c. In utils.h, I need to include > "numpy/arrayobject.h". But the compilation failed when I include it > again in my extension module function, wrap.c: > > #include "numpy/arrayobject.h" > #include "utils.h" > > When I remove it and use > > #include "utils.h" > > the compilation works. So, is it true that I can only include > arrayobject.h once? What is the compiler error message? > 2. which import I should use in my initial function: > > import_array() This one. It's the one to use for Numeric, numarray, and numpy. > or > import_libnumarray() This is for numarray, the other Numeric derivative. It pulls in the numarray-specific stuff IIRC. -- |>|\/|< /--------------------------------------------------------------------------\ |David M. 
Cooke http://arbutus.physics.mcmaster.ca/dmc/ |cookedm at physics.mcmaster.ca From oliphant at ee.byu.edu Thu Apr 27 11:36:06 2006 From: oliphant at ee.byu.edu (Travis Oliphant) Date: Thu Apr 27 11:36:06 2006 Subject: [Numpy-discussion] newbie for writing numpy/scipy extensions In-Reply-To: <228EDE46-B760-44BA-A987-273F2ADC9B81@cortechs.net> References: <228EDE46-B760-44BA-A987-273F2ADC9B81@cortechs.net> Message-ID: <44510F04.3020806@ee.byu.edu> Gennan Chen wrote: > Hi! All, > > I just start writing my own python extension based on numpy. Couple > of questions here: > > 1. I have some utility functions, such as wrappers for > PyArray_GETPTR* needed be access by different extension modules. So, > I put them in utlis.h and utlis.c. In utils.h, I need to include > "numpy/arrayobject.h". But the compilation failed when I include it > again in my extension module function, wrap.c: > > #include "numpy/arrayobject.h" > #include "utils.h" > > When I remove it and use > > #include "utils.h" > > the compilation works. So, is it true that I can only include > arrayobject.h once? No, you can include arrayobject.h more than once. However, if you make use of C-API functions (not just macros that access elements of the array) in more than one file for the same extension module, you need to do a couple of things to make it work. In the original file you must define PY_ARRAY_UNIQUE_SYMBOL to something unique to your extension module before you include the arrayobject.h file. In the helper c file you must define PY_ARRAY_UNIQUE_SYMBOL and define NO_IMPORT_ARRAY prior to including the arrayobject.h Thus, in wrap.c you do (feel free to change the name from _chen_extension to something else) #define PY_ARRAY_UNIQUE_SYMBOL _chen_extension #include "numpy/arrayobject.h" and in utils.c you do #define PY_ARRAY_UNIQUE_SYMBOL _chen_extension #define NO_IMPORT_ARRAY #include "numpy/arrayobject.h" > > 2. which import I should use in my initial function: > > import_array() import_array() -Travis From oliphant at ee.byu.edu Thu Apr 27 11:40:10 2006 From: oliphant at ee.byu.edu (Travis Oliphant) Date: Thu Apr 27 11:40:10 2006 Subject: [Numpy-discussion] Broadcasting rules (Ticket 76). In-Reply-To: <4451090C.5020901@noaa.gov> References: <4050236.1146000212838.JavaMail.root@fed1wml05.mgt.cox.net> <444F0420.9000500@ieee.org> <44506BE6.10301@noaa.gov> <4451090C.5020901@noaa.gov> Message-ID: <44510FD2.1090502@ee.byu.edu> Christopher Barker wrote: > Keith Goodman wrote: > >> Exactly. That's why the OP doesn't need to write a special function in >> Matlab called SumColumns. > > > "Didn't". I haven't used MATLAB for much in years. Back in the day, > that feature didn't exist. Or at least was poorly enough documented > that i didn't think it existed. Matlab didn't used to only support 2-d > arrays as well. > > Anyway, the point was that a (n,) array and a (n,1) array and a (1,n) > array are all different, and that difference should be preserved. > > I'm still confused as to what behavior Sasha wants that doesn't exist. I'm not exactly sure. But, one of the things I think he has suggested (please tell me if my understanding is wrong) is to allow a 2x3 array to be "broadcast" to a (2n)x(3m) array by repeated copying as needed. 
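For concreteness, that kind of repetition can already be spelled today with a reshaped view, along the lines of the reshape example further down the thread -- a sketch with n = m = 2 hard-coded:

import numpy as N

a = N.arange(6).reshape((2, 3))   # the 2x3 array to be repeated
big = N.empty((4, 6), a.dtype)    # the (2n) x (3m) target, here n = m = 2
# View big as (block-row, row-in-block, block-col, col-in-block); the
# (2, 1, 3)-shaped right-hand side then broadcasts across both block axes.
big.reshape((2, 2, 2, 3))[...] = a[:, N.newaxis, :]
# big now holds [[a a], [a a]]

The proposal would let the ufunc machinery do this copying implicitly instead of requiring the explicit blocked view.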
-Travis From gnchen at cortechs.net Thu Apr 27 12:24:38 2006 From: gnchen at cortechs.net (Gennan Chen) Date: Thu Apr 27 12:24:38 2006 Subject: [Numpy-discussion] newbie for writing numpy/scipy extensions In-Reply-To: <44510F04.3020806@ee.byu.edu> References: <228EDE46-B760-44BA-A987-273F2ADC9B81@cortechs.net> <44510F04.3020806@ee.byu.edu> Message-ID: Thanks! That solve the problem. May I ask what does those #define really means?? Gen On Apr 27, 2006, at 11:35 AM, Travis Oliphant wrote: > Gennan Chen wrote: > >> Hi! All, >> >> I just start writing my own python extension based on numpy. >> Couple of questions here: >> >> 1. I have some utility functions, such as wrappers for >> PyArray_GETPTR* needed be access by different extension modules. >> So, I put them in utlis.h and utlis.c. In utils.h, I need to >> include "numpy/arrayobject.h". But the compilation failed when I >> include it again in my extension module function, wrap.c: >> >> #include "numpy/arrayobject.h" >> #include "utils.h" >> >> When I remove it and use >> >> #include "utils.h" >> >> the compilation works. So, is it true that I can only include >> arrayobject.h once? > > > No, you can include arrayobject.h more than once. However, if you > make use of C-API functions (not just macros that access elements > of the array) in more than one file for the same extension module, > you need to do a couple of things to make it work. > > In the original file you must define PY_ARRAY_UNIQUE_SYMBOL to > something unique to your extension module before you include the > arrayobject.h file. > > In the helper c file you must define PY_ARRAY_UNIQUE_SYMBOL and > define NO_IMPORT_ARRAY prior to including the arrayobject.h > > Thus, in wrap.c you do (feel free to change the name from > _chen_extension to something else) > > #define PY_ARRAY_UNIQUE_SYMBOL _chen_extension #include "numpy/ > arrayobject.h" > > and in > > utils.c you do > > #define PY_ARRAY_UNIQUE_SYMBOL _chen_extension #define > NO_IMPORT_ARRAY > #include "numpy/arrayobject.h" > > >> >> 2. which import I should use in my initial function: >> >> import_array() > > > import_array() > > -Travis > > From gnchen at cortechs.net Thu Apr 27 12:24:41 2006 From: gnchen at cortechs.net (Gennan Chen) Date: Thu Apr 27 12:24:41 2006 Subject: [Numpy-discussion] newbie for writing numpy/scipy extensions In-Reply-To: References: <228EDE46-B760-44BA-A987-273F2ADC9B81@cortechs.net> Message-ID: <8CD47186-A354-4C8A-B5AF-8BEC2CE82D2E@cortechs.net> Got it. Looks like ndimage still used the old one. Gen-Nan Chen, PhD Chief Scientist Research and Development Group CorTechs Labs Inc (www.cortechs.net) 1020 Prospect St., #304, La Jolla, CA, 92037 Tel: 1-858-459-9700 ext 16 Fax: 1-858-459-9705 Email: gnchen at cortechs.net On Apr 27, 2006, at 11:31 AM, David M. Cooke wrote: > Gennan Chen writes: > >> Hi! All, >> >> I just start writing my own python extension based on numpy. Couple >> of questions here: >> >> 1. I have some utility functions, such as wrappers for >> PyArray_GETPTR* needed be access by different extension modules. So, >> I put them in utlis.h and utlis.c. In utils.h, I need to include >> "numpy/arrayobject.h". But the compilation failed when I include it >> again in my extension module function, wrap.c: >> >> #include "numpy/arrayobject.h" >> #include "utils.h" >> >> When I remove it and use >> >> #include "utils.h" >> >> the compilation works. So, is it true that I can only include >> arrayobject.h once? > > What is the compiler error message? > >> 2. 
which import I should use in my initial function: >> >> import_array() > > This one. It's the one to use for Numeric, numarray, and numpy. > >> or >> import_libnumarray() > > This is for numarray, the other Numeric derivative. It pulls in the > numarray-specific stuff IIRC. > > -- > |>|\/|< > /--------------------------------------------------------------------- > -----\ > |David M. Cooke http:// > arbutus.physics.mcmaster.ca/dmc/ > |cookedm at physics.mcmaster.ca > From ndarray at mac.com Thu Apr 27 12:29:03 2006 From: ndarray at mac.com (Sasha) Date: Thu Apr 27 12:29:03 2006 Subject: [Numpy-discussion] Broadcasting rules (Ticket 76). In-Reply-To: <44510FD2.1090502@ee.byu.edu> References: <4050236.1146000212838.JavaMail.root@fed1wml05.mgt.cox.net> <444F0420.9000500@ieee.org> <44506BE6.10301@noaa.gov> <4451090C.5020901@noaa.gov> <44510FD2.1090502@ee.byu.edu> Message-ID: On 4/27/06, Travis Oliphant wrote: > [...] > > I'm still confused as to what behavior Sasha wants that doesn't exist. > > > I'm not exactly sure. But, one of the things I think he has suggested > (please tell me if my understanding is wrong) is to allow a 2x3 array to > be "broadcast" to a (2n)x(3m) array by repeated copying as needed. Yes, this is the only new feature that I've suggested. I was also hoping that the same code that allows shape=(3,) being broadcast to shape (2,3) can be reused to broadcast (3,) to (6,). The idea is that since in terms of memory operations broadcasting and repetition are the same, the code can be reused. And since repetition can be achieved using broadcasting: >>> x = zeros(6) >>> x.reshape((2,3)) += arange(3) >>> x array([0, 1, 2, 0, 1, 2]) if we allow x += arange(3), it can use the same code as broadcasting internally. From ndarray at mac.com Thu Apr 27 12:30:05 2006 From: ndarray at mac.com (Sasha) Date: Thu Apr 27 12:30:05 2006 Subject: [Numpy-discussion] Broadcasting rules (Ticket 76). In-Reply-To: References: <4050236.1146000212838.JavaMail.root@fed1wml05.mgt.cox.net> <444F0420.9000500@ieee.org> <44506BE6.10301@noaa.gov> <4451090C.5020901@noaa.gov> <44510FD2.1090502@ee.byu.edu> Message-ID: On 4/27/06, Sasha wrote: > >>> x.reshape((2,3)) += arange(3) Oops, that should have been >>> x.reshape((2,3))[...] += arange(3) From Fernando.Perez at colorado.edu Thu Apr 27 12:58:02 2006 From: Fernando.Perez at colorado.edu (Fernando Perez) Date: Thu Apr 27 12:58:02 2006 Subject: [Numpy-discussion] Warnings in NumPy SVN In-Reply-To: References: <44507A9D.8070902@ieee.org> <4450F93D.9050905@colorado.edu> Message-ID: <44512213.9090902@colorado.edu> David M. Cooke wrote: > Fernando Perez writes: > > >>Sasha wrote: >> >>>On 4/27/06, Travis Oliphant wrote: >>> >>> >>>>[...] >>>>The function (or macro) needs to implement the operation on the basic >>>>data-type and if necessary set an error-flag in the floating-point >>>>registers. >>>> >>>>If anybody has time to help implement these basic operations, it would >>>>be greatly appreciated. >>> >>>I can help. To make sure we don't duplicate our effort, let's do >>>the following: >>>1. I will add place-holders for all the necessary functions to make >>>them return "NotImplemented". >> >>just a minor reminder: >> >> raise NotImplementedError >> >>is the standard idiom for this. > > > Just a note: For __xxx__ methods, "return NotImplemented" is the > standard idiom. 
See section 3.3.8 (Coercion rules) of the Python 2.4 > language manual: > > For most intents and purposes, an operator that returns > NotImplemented is treated the same as one that is not implemented > at all. > > I believe the idea is that it's not actually an error for an __xxx__ > method to not be implemented, as there are fallbacks. You are right. It's worth remembering that the actual syntaxes are return NotImplemented and raise NotImplementedError /without/ quotes (as per the original msg), since these are actual python builtins, not strings. That way they can be properly handled by their return value or proper exception handling. Cheers, f From nvf at MIT.EDU Thu Apr 27 21:02:03 2006 From: nvf at MIT.EDU (Nick Fotopoulos) Date: Thu Apr 27 21:02:03 2006 Subject: [Numpy-discussion] Freeing memory allocated in C Message-ID: <1C119933-F93B-47C6-AADA-4A61DF16B745@mit.edu> Dear numpy-discussion, I have written a python module in C which wraps a C library (FrameL) in order to read data from specially formatted files into Python arrays. It works, but I think I have a memory leak, and I can't see what I might be doing wrong. This Python wrapper is almost identical to a Matlab wrapper, but the Matlab version doesn't leak. Perhaps someone here can help me out? I have read in many places that to return an array, one should wrap with PyArray_FromDimsAndData (or more modern versions) and then return it without freeing the memory. Does the same principle hold for strings? Are the following example snippets correct? // output2 = x-axis values relative to first data point. data = malloc(nData*sizeof(double)); for(i=0; i<nData; i++) { data[i] = vect->startX[0]+(double)i*dt; } shape[0] = nData; out2 = (PyArrayObject *) PyArray_FromDimsAndData(1,shape,PyArray_DOUBLE,(char *)data); //snip // output5 = gps start time as a string utc = vect->GTime - vect->ULeapS + FRGPSTAI; out5 = malloc(200*sizeof(char)); sprintf(out5,"Starting GPS time:%.1f UTC=%s", vect->GTime,FrStrGTime(utc)); //snip -- Free all memory not assigned to a return object return Py_BuildValue("(OOOdsss)",out1,out2,out3,out4,out5,out6,out7); I see in the Numpy book that I should modernize PyArray_FromDimsAndData, but will it be incompatible with users who have only Numeric? If the code above should not leak under your inspection, are there any other common places that python C modules often leak that I should check? As a side note, here is how I have been defining "leak". I have been measuring memory usage by opening a pipe to ps to check rss between reading in frames and invoking del on them. Memory usage increases, but does not decrease. In contrast, if I commit the same data in an array to a pickle file and read that in, invoking del reduces memory usage. 
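A minimal version of that ps-based check might look like the following (Linux/Unix only, and the ps field names vary a little between systems):

import os

def rss_kb():
    # resident set size of the current process, in kilobytes
    return int(os.popen('ps -o rss= -p %d' % os.getpid()).read())

before = rss_kb()
# ... read a frame and del it here ...
after = rss_kb()
print 'rss changed by %d kB' % (after - before)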
Many thanks, Nick From robert.kern at gmail.com Thu Apr 27 21:14:02 2006 From: robert.kern at gmail.com (Robert Kern) Date: Thu Apr 27 21:14:02 2006 Subject: [Numpy-discussion] Re: Freeing memory allocated in C In-Reply-To: <1C119933-F93B-47C6-AADA-4A61DF16B745@mit.edu> References: <1C119933-F93B-47C6-AADA-4A61DF16B745@mit.edu> Message-ID: Nick Fotopoulos wrote: > Dear numpy-discussion, > > I have written a python module in C which wraps a C library (FrameL) in > order to read data from specially formatted files into Python arrays. > It works, but I think have a memory leak, and I can't see what I might > be doing wrong. This Python wrapper is almost identical to a Matlab > wrapper, but the Matlab version doesn't leak. Perhaps someone here can > help me out? > > I have read in many places that to return an array, one should wrap > with PyArray_FromDimsAndData (or more modern versions) and then return > it without freeing the memory. Does the same principle hold for > strings? Are the following example snippets correct? > > // output2 = x-axis values relative to first data point. > data = malloc(nData*sizeof(double)); > for(i=0; i<nData; i++) { > data[i] = vect->startX[0]+(double)i*dt; > } > shape[0] = nData; > out2 = (PyArrayObject *) > PyArray_FromDimsAndData(1,shape,PyArray_DOUBLE,(char *)data); I wouldn't rely on PyArray_FromDimsAndData doing the right thing. Instead of malloc'ing a block of memory, why don't you create an empty array of the right size, use its data pointer to fill it with that for-loop, and then return that array object? > //snip > > // output5 = gps start time as a string > utc = vect->GTime - vect->ULeapS + FRGPSTAI; > out5 = malloc(200*sizeof(char)); > sprintf(out5,"Starting GPS time:%.1f UTC=%s", > vect->GTime,FrStrGTime(utc)); > > //snip -- Free all memory not assigned to a return object > > return Py_BuildValue("(OOOdsss)",out1,out2,out3,out4,out5,out6,out7); > > I see in the Numpy book that I should modernize > PyArray_FromDimsAndData, but will it be incompatible with users who > have only Numeric? Yes. However, I would suggest that new code should probably just use numpy fully, especially if the restrictions of the old Numeric API are causing you pain. The longer people support both, the longer people will *have* to support both. -- Robert Kern robert.kern at gmail.com "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From oliphant.travis at ieee.org Thu Apr 27 21:40:04 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Thu Apr 27 21:40:04 2006 Subject: [Numpy-discussion] Freeing memory allocated in C In-Reply-To: <1C119933-F93B-47C6-AADA-4A61DF16B745@mit.edu> References: <1C119933-F93B-47C6-AADA-4A61DF16B745@mit.edu> Message-ID: <44519C6E.80006@ieee.org> Nick Fotopoulos wrote: > Dear numpy-discussion, > > I have written a python module in C which wraps a C library (FrameL) > in order to read data from specially formatted files into Python > arrays. It works, but I think have a memory leak, and I can't see > what I might be doing wrong. This Python wrapper is almost identical > to a Matlab wrapper, but the Matlab version doesn't leak. Perhaps > someone here can help me out? > > I have read in many places that to return an array, one should wrap > with PyArray_FromDimsAndData (or more modern versions) and then return > it without freeing the memory. Does the same principle hold for > strings? 
Are the following example snippets correct? Why don't you just use PyArray_FromDims and let NumPy manage the memory? FromDimsAndData is only for situations where you can't manage the memory with Python. Therefore the memory is never freed. If you do want to have NumPy deallocate the memory when you are done, then you have to 1) Make sure you are using the same allocator as NumPy is... _pya_malloc is defined in arrayobject.h (in NumPy but not in Numeric) 2) Reset the array flag so that OWN_DATA is set out2->flags |= OWN_DATA As long as you are using the same memory allocator, this should work. The OWN_DATA flag instructs the deallocator to free the data. But, I would strongly suggest just using PyArray_FromDims and let NumPy allocate the new array for you. > > // output2 = x-axis values relative to first data point. > data = malloc(nData*sizeof(double)); > for(i=0; i<nData; i++) { > data[i] = vect->startX[0]+(double)i*dt; > } > shape[0] = nData; > out2 = (PyArrayObject *) > PyArray_FromDimsAndData(1,shape,PyArray_DOUBLE,(char *)data); > > //snip > > // output5 = gps start time as a string > utc = vect->GTime - vect->ULeapS + FRGPSTAI; > out5 = malloc(200*sizeof(char)); > sprintf(out5,"Starting GPS time:%.1f UTC=%s", > vect->GTime,FrStrGTime(utc)); > > //snip -- Free all memory not assigned to a return object > > return Py_BuildValue("(OOOdsss)",out1,out2,out3,out4,out5,out6,out7); > > > I see in the Numpy book that I should modernize > PyArray_FromDimsAndData, but will it be incompatible with users who > have only Numeric? Yes, the only issue, however, is that PyArray_FromDims and friends will only allow int-length sizes which on 64-bit computers is not as large as intp-length sizes. So, if you don't care about allowing large sizes then you can use the old Numeric C-API. > > If the code above should not leak under your inspection, are there any > other common places that python C modules often leak that I should check? All of the malloc calls in your code leak. In general you should not assume that Python will deallocate memory you have allocated. Python uses its own memory manager so even if you manage to arrange things so that Python will free your memory (and you really have to hack things to do that), then you can run into trouble if you try mixing system malloc calls with Python's deallocation. 
It needs lots of testing to be sure that it is doing the "right" thing. To enable scalarmath you need to import numpy.core.scalarmath You cannot disable it once it's enabled except by restarting Python. If we need that feature we can add it. The array scalars respond to the error modes of ufuncs. There is an experimental function called alter_scalars that replaces the Python int, float, and complex number tables with the array scalar equivalents. Thus, to amaze (or seriously annoy) your Python friends you can do import numpy.core.scalarmath as ncs ncs.alter_scalars(int) 1 / 0 This will return 0 unless you change the error modes... ncs.retore_scalars(int) Will put things back the way Guido intended.... Please try it out and send us error reports. Many thanks to Sasha for his help in getting all the code so it at least compiles and loads. All bugs should be blamed on me, though... Best, -Travis From arnd.baecker at web.de Fri Apr 28 00:48:04 2006 From: arnd.baecker at web.de (Arnd Baecker) Date: Fri Apr 28 00:48:04 2006 Subject: [Numpy-discussion] Scalar math module is ready for testing In-Reply-To: <4451C076.40608@ieee.org> References: <4451C076.40608@ieee.org> Message-ID: Hi Travis, On Fri, 28 Apr 2006, Travis Oliphant wrote: > > The scalar math module is complete and ready to be tested. It should > speed up code that relies heavily on scalar arithmetic by by-passing the > ufunc machinery. > > It needs lots of testing to be sure that it is doing the "right" > thing. To enable scalarmath you need to > > import numpy.core.scalarmath > > You cannot disable it once it's enabled except by restarting Python. If > we need that feature we can add it. The array scalars respond to the > error modes of ufuncs. > > There is an experimental function called alter_scalars that replaces the > Python int, float, and complex number tables with the array scalar > equivalents. Thus, to amaze (or seriously annoy) your Python friends LOL ;-) > you can do > > import numpy.core.scalarmath as ncs > > ncs.alter_scalars(int) > > 1 / 0 > > This will return 0 unless you change the error modes... > > ncs.retore_scalars(int) > > Will put things back the way Guido intended.... > > > Please try it out and send us error reports. Many thanks to Sasha for > his help in getting all the code so it at least compiles and loads. All > bugs should be blamed on me, though... 
Well, it does not compile for me (64 Bit opteron, as usual;-): gcc options: '-pthread -fno-strict-aliasing -DNDEBUG -g -O3 -Wall -Wstrict-prototypes -fPIC' compile options: '-Inumpy/core/include -Ibuild/src.linux-x86_64-2.4/numpy/core -Inumpy/core/src -Inumpy/core/include -I/scr/python/include/python2.4 -c' gcc: build/src.linux-x86_64-2.4/numpy/core/src/scalarmathmodule.c build/src.linux-x86_64-2.4/numpy/core/src/scalarmathmodule.c:472: error: redefinition of 'ulong_ctype_multiply' build/src.linux-x86_64-2.4/numpy/core/src/scalarmathmodule.c:421: error: previous definition of 'ulong_ctype_multiply' was here build/src.linux-x86_64-2.4/numpy/core/src/scalarmathmodule.c:421: warning: 'ulong_ctype_multiply' defined but not used build/src.linux-x86_64-2.4/numpy/core/src/scalarmathmodule.c:472: error: redefinition of 'ulong_ctype_multiply' build/src.linux-x86_64-2.4/numpy/core/src/scalarmathmodule.c:421: error: previous definition of 'ulong_ctype_multiply' was here build/src.linux-x86_64-2.4/numpy/core/src/scalarmathmodule.c:421: warning: 'ulong_ctype_multiply' defined but not used error: Command "gcc -pthread -fno-strict-aliasing -DNDEBUG -g -O3 -Wall -Wstrict-prototypes -fPIC -Inumpy/core/include -Ibuild/src.linux-x86_64-2.4/numpy/core -Inumpy/core/src -Inumpy/core/include -I/scr/python/include/python2.4 -c build/src.linux-x86_64-2.4/numpy/core/src/scalarmathmodule.c -o build/temp.linux-x86_64-2.4/build/src.linux-x86_64-2.4/numpy/core/src/scalarmathmodule.o" failed with exit status 1 (I can't look into this now - meeting in -2 minutes ;-) Best, Arnd From schofield at ftw.at Fri Apr 28 01:32:00 2006 From: schofield at ftw.at (Ed Schofield) Date: Fri Apr 28 01:32:00 2006 Subject: [Numpy-discussion] Scalar math module is ready for testing In-Reply-To: <4451C076.40608@ieee.org> References: <4451C076.40608@ieee.org> Message-ID: <4451D3F0.7080408@ftw.at> Travis Oliphant wrote: > > The scalar math module is complete and ready to be tested. It should > speed up code that relies heavily on scalar arithmetic by by-passing > the ufunc machinery. Excellent! > It needs lots of testing to be sure that it is doing the "right" thing. With revision 2454 I get a segfault in numpy.test() after importing numpy.core.scalarmath: check_1 (numpy.distutils.tests.test_misc_util.test_appendpath) ... ok check_2 (numpy.distutils.tests.test_misc_util.test_appendpath) ... ok check_3 (numpy.distutils.tests.test_misc_util.test_appendpath) ... ok check_gpaths (numpy.distutils.tests.test_misc_util.test_gpaths) ... ok check_1 (numpy.distutils.tests.test_misc_util.test_minrelpath) ... ok check_singleton (numpy.lib.tests.test_getlimits.test_double) Program received signal SIGSEGV, Segmentation fault. [Switching to Thread -1208403744 (LWP 11232)] 0xb7142cf7 in int_richcompare (self=0x81c0ab8, other=0x8141dbc, cmp_op=3) at build/src.linux-i686-2.4/numpy/core/src/scalarmathmodule.c:19120 19120 PyArrayScalar_RETURN_TRUE; (gdb) bt #0 0xb7142cf7 in int_richcompare (self=0x81c0ab8, other=0x8141dbc, cmp_op=3) at build/src.linux-i686-2.4/numpy/core/src/scalarmathmodule.c:19120 #1 0x0807ce1f in PyObject_Print () #2 0x0807e451 in PyObject_RichCompare () Is this helpful? 
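For reference, the trace came from nothing fancier than running the test suite under gdb, roughly:

$ gdb python
(gdb) run -c "import numpy; import numpy.core.scalarmath; numpy.test()"
... Segmentation fault ...
(gdb) bt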
-- Ed From steffen.loeck at gmx.de Fri Apr 28 01:34:07 2006 From: steffen.loeck at gmx.de (Steffen Loeck) Date: Fri Apr 28 01:34:07 2006 Subject: [Numpy-discussion] Scalar math module is ready for testing In-Reply-To: <4451C076.40608@ieee.org> References: <4451C076.40608@ieee.org> Message-ID: <200604281033.19781.steffen.loeck@gmx.de> On Friday 28 April 2006 09:12 am, Travis Oliphant wrote: > Please try it out and send us error reports. Many thanks to Sasha for > his help in getting all the code so it at least compiles and loads. All > bugs should be blamed on me, though... Running the tests with numpy.test(10) i get: /test/lib/python2.3/site-packages/numpy/testing/numpytest.py:179: DeprecationWarning: Non-ASCII character '\xf2' in file/test/lib/python2.3/site-packages/numpy/lib/tests/test_ufunclike.pyc on line 1, but no encoding declared; see http://www.python.org/peps/pep-0263.html for details m = imp.load_module(name, open(filename), filename,('.py','U',1)) E................................../test/lib/python2.3/site-packages/numpy/testing/numpytest.py:179: DeprecationWarning: Non-ASCII character '\xf2' in file test/lib/python2.3/site-packages/numpy/lib/tests/test_polynomial.pyc on line 1, but no encoding declared; see http://www.python.org/peps/pep-0263.html for details m = imp.load_module(name, open(filename), filename,('.py','U',1)) E........................................................................... ====================================================================== ERROR: check_doctests (numpy.lib.tests.test_ufunclike.test_docs) ---------------------------------------------------------------------- Traceback (most recent call last): File "/test/lib/python2.3/site-packages/numpy/lib/tests/test_ufunclike.py", line 59, in check_doctests def check_doctests(self): return self.rundocs() File "/test//lib/python2.3/site-packages/numpy/testing/numpytest.py", line 179, in rundocs m = imp.load_module(name, open(filename), filename,('.py','U',1)) File "test/lib/python2.3/site-packages/numpy/lib/tests/test_ufunclike.pyc", line 1 ;? ^ SyntaxError: invalid syntax ====================================================================== ERROR: check_doctests (numpy.lib.tests.test_polynomial.test_docs) ---------------------------------------------------------------------- Traceback (most recent call last): File "/test/lib/python2.3/site-packages/numpy/lib/tests/test_polynomial.py", line 79, in check_doctests def check_doctests(self): return self.rundocs() File "/test//lib/python2.3/site-packages/numpy/testing/numpytest.py", line 179, in rundocs m = imp.load_module(name, open(filename), filename,('.py','U',1)) File "/test/lib/python2.3/site-packages/numpy/lib/tests/test_polynomial.pyc", line 1 ;? ^ SyntaxError: invalid syntax I have no idea, where this comes from. Regards, Steffen From fullung at gmail.com Fri Apr 28 02:39:03 2006 From: fullung at gmail.com (Albert Strasheim) Date: Fri Apr 28 02:39:03 2006 Subject: [Numpy-discussion] newbie for writing numpy/scipy extensions In-Reply-To: <8CD47186-A354-4C8A-B5AF-8BEC2CE82D2E@cortechs.net> Message-ID: <018c01c66aa7$77764480$0a84a8c0@dsp.sun.ac.za> Hello all I've collected the information from this thread along with links to some recent threads on writing C extensions on the wiki at: http://www.scipy.org/Cookbook/C_Extensions Feel free to contribute! 
Regards, Albert > -----Original Message----- > From: numpy-discussion-admin at lists.sourceforge.net [mailto:numpy- > discussion-admin at lists.sourceforge.net] On Behalf Of Gennan Chen > Sent: 27 April 2006 21:23 > To: David M.Cooke > Cc: Numpy-discussion at lists.sourceforge.net > Subject: Re: [Numpy-discussion] newbie for writing numpy/scipy extensions > > Got it. Looks like ndimage still used the old one. > > Gen-Nan Chen, PhD > Chief Scientist > Research and Development Group > CorTechs Labs Inc (www.cortechs.net) > 1020 Prospect St., #304, La Jolla, CA, 92037 > Tel: 1-858-459-9700 ext 16 > Fax: 1-858-459-9705 > Email: gnchen at cortechs.net > > > On Apr 27, 2006, at 11:31 AM, David M. Cooke wrote: > > > Gennan Chen writes: > > > >> Hi! All, > >> > >> I just start writing my own python extension based on numpy. Couple > >> of questions here: > >> > >> 1. I have some utility functions, such as wrappers for > >> PyArray_GETPTR* needed be access by different extension modules. So, > >> I put them in utlis.h and utlis.c. In utils.h, I need to include > >> "numpy/arrayobject.h". But the compilation failed when I include it > >> again in my extension module function, wrap.c: > >> > >> #include "numpy/arrayobject.h" > >> #include "utils.h" > >> > >> When I remove it and use > >> > >> #include "utils.h" > >> > >> the compilation works. So, is it true that I can only include > >> arrayobject.h once? > > > > What is the compiler error message? > > > >> 2. which import I should use in my initial function: > >> > >> import_array() > > > > This one. It's the one to use for Numeric, numarray, and numpy. > > > >> or > >> import_libnumarray() > > > > This is for numarray, the other Numeric derivative. It pulls in the > > numarray-specific stuff IIRC. > > > > -- > > |>|\/|< > > /--------------------------------------------------------------------- > > -----\ > > |David M. Cooke http:// > > arbutus.physics.mcmaster.ca/dmc/ > > |cookedm at physics.mcmaster.ca From lcordier at point45.com Fri Apr 28 06:36:10 2006 From: lcordier at point45.com (Louis Cordier) Date: Fri Apr 28 06:36:10 2006 Subject: [Numpy-discussion] Bug Message-ID: Hi, I am not sure if this is the proper place to do a bug post. I looked at the active tickets on http://projects.scipy.org/scipy/numpy/ but didn't feel confident to go and create a new one. ;) Anyway the current release version 0.9.6 have some broken behavior. I guess some example code would illustrate it best. ---8<---------------- >>> z = numpy.zeros((10,10), 'O') >>> z.fill(None) >>> z.fill([]) Segmentation fault (core dumped) This happens on both Linux and FreeBSD machines. (both builds use *_lite versions of Lapack) Linux bellagio 2.6.11-1.1369_FC4 #1 Thu Jun 2 22:55:56 EDT 2005 i686 i686 i386 GNU/Linux Python 2.4.1 gcc version 4.0.0 20050519 (Red Hat 4.0.0-8) FreeBSD cerberus.intranet 5.4-RELEASE-p12 FreeBSD 5.4-RELEASE-p12 #0: Wed Mar 15 16:06:48 UTC 2006 Python 2.4.2 gcc version 3.4.2 [FreeBSD] 20040728 I assume fill() will need to make a copy, of the object for each coordinate in the matix. ---8<---------------- While, >>> import numpy >>> z = numpy.zeros((2,2), 'O') >>> z array([[0, 0], [0, 0]], dtype=object) >>> z.fill([1]) >>> z array([[1, 1], [1, 1]], dtype=object) and >>> z.fill([1,2,3]) >>> z array([[1, 1], [1, 1]], dtype=object) I would have expected, >>> z array([[[1], [1]], [[1], [1]]], dtype=object) and >>> z array([[[1, 2, 3], [1, 2, 3]], [[1, 2, 3], [1, 2, 3]]], dtype=object) Regards, Louis. 
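P.S. In case someone else trips over this before it is fixed, an explicit loop avoids the crash and gives the per-element behavior I expected (quick sketch; the list literal is re-evaluated each pass, so each cell gets its own copy):

>>> z = numpy.zeros((2,2), 'O')
>>> for i in range(2):
...     for j in range(2):
...         z[i,j] = [1,2,3]
...
>>> z[0,0]
[1, 2, 3]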
-- Louis Cordier cell: +27721472305 Point45 Entertainment (Pty) Ltd. http://www.point45.org From ndarray at mac.com Fri Apr 28 09:04:09 2006 From: ndarray at mac.com (Sasha) Date: Fri Apr 28 09:04:09 2006 Subject: [Numpy-discussion] Bug In-Reply-To: References: Message-ID: The core dump is definitely a bug. I reproduced it on my Linux system. Please create a ticket. I am not sure whether fill should copy objects or not. When you populate an array with immutable objects, creating multiple copies is a waste. On 4/28/06, Louis Cordier wrote: > > Hi, I am not sure if this is the proper place to do a bug post. > I looked at the active tickets on http://projects.scipy.org/scipy/numpy/ > but didn't feel confident to go and create a new one. ;) > > Anyway the current release version 0.9.6 have some broken behavior. > I guess some example code would illustrate it best. > > ---8<---------------- > > >>> z = numpy.zeros((10,10), 'O') > >>> z.fill(None) > >>> z.fill([]) > Segmentation fault (core dumped) > > This happens on both Linux and FreeBSD machines. > (both builds use *_lite versions of Lapack) > > Linux bellagio 2.6.11-1.1369_FC4 #1 Thu Jun 2 22:55:56 EDT 2005 i686 i686 > i386 GNU/Linux > Python 2.4.1 > gcc version 4.0.0 20050519 (Red Hat 4.0.0-8) > > FreeBSD cerberus.intranet 5.4-RELEASE-p12 FreeBSD 5.4-RELEASE-p12 #0: Wed > Mar 15 16:06:48 UTC 2006 > Python 2.4.2 > gcc version 3.4.2 [FreeBSD] 20040728 > > I assume fill() will need to make a copy, of the object > for each coordinate in the matix. > > ---8<---------------- > > While, > > >>> import numpy > >>> z = numpy.zeros((2,2), 'O') > >>> z > array([[0, 0], > [0, 0]], dtype=object) > >>> z.fill([1]) > >>> z > array([[1, 1], > [1, 1]], dtype=object) > > and > > >>> z.fill([1,2,3]) > >>> z > array([[1, 1], > [1, 1]], dtype=object) > > > I would have expected, > > >>> z > array([[[1], [1]], > [[1], [1]]], dtype=object) > > and > > >>> z > array([[[1, 2, 3], [1, 2, 3]], > [[1, 2, 3], [1, 2, 3]]], dtype=object) > > > Regards, Louis. > > -- > Louis Cordier cell: +27721472305 > Point45 Entertainment (Pty) Ltd. http://www.point45.org > > > > ------------------------------------------------------- > Using Tomcat but need to do more? Need to support web services, security? > Get stuff done quickly with pre-integrated technology to make your job easier > Download IBM WebSphere Application Server v.1.0.1 based on Apache Geronimo > http://sel.as-us.falkag.net/sel?cmd=lnk&kid=120709&bid=263057&dat=121642 > _______________________________________________ > Numpy-discussion mailing list > Numpy-discussion at lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/numpy-discussion > From ndarray at mac.com Fri Apr 28 10:04:08 2006 From: ndarray at mac.com (Sasha) Date: Fri Apr 28 10:04:08 2006 Subject: [Numpy-discussion] Bug In-Reply-To: References: Message-ID: See . On 4/28/06, Sasha wrote: > The core dump is definitely a bug. I reproduced it on my Linux > system. Please create a ticket. I am not sure whether fill should > copy objects or not. When you populate an array with immutable > objects, creating multiple copies is a waste. > > On 4/28/06, Louis Cordier wrote: > > > > Hi, I am not sure if this is the proper place to do a bug post. > > I looked at the active tickets on http://projects.scipy.org/scipy/numpy/ > > but didn't feel confident to go and create a new one. ;) > > > > Anyway the current release version 0.9.6 have some broken behavior. > > I guess some example code would illustrate it best. 
> > > > ---8<---------------- > > > > >>> z = numpy.zeros((10,10), 'O') > > >>> z.fill(None) > > >>> z.fill([]) > > Segmentation fault (core dumped) > > > > This happens on both Linux and FreeBSD machines. > > (both builds use *_lite versions of Lapack) > > > > Linux bellagio 2.6.11-1.1369_FC4 #1 Thu Jun 2 22:55:56 EDT 2005 i686 i686 > > i386 GNU/Linux > > Python 2.4.1 > > gcc version 4.0.0 20050519 (Red Hat 4.0.0-8) > > > > FreeBSD cerberus.intranet 5.4-RELEASE-p12 FreeBSD 5.4-RELEASE-p12 #0: Wed > > Mar 15 16:06:48 UTC 2006 > > Python 2.4.2 > > gcc version 3.4.2 [FreeBSD] 20040728 > > > > I assume fill() will need to make a copy, of the object > > for each coordinate in the matix. > > > > ---8<---------------- > > > > While, > > > > >>> import numpy > > >>> z = numpy.zeros((2,2), 'O') > > >>> z > > array([[0, 0], > > [0, 0]], dtype=object) > > >>> z.fill([1]) > > >>> z > > array([[1, 1], > > [1, 1]], dtype=object) > > > > and > > > > >>> z.fill([1,2,3]) > > >>> z > > array([[1, 1], > > [1, 1]], dtype=object) > > > > > > I would have expected, > > > > >>> z > > array([[[1], [1]], > > [[1], [1]]], dtype=object) > > > > and > > > > >>> z > > array([[[1, 2, 3], [1, 2, 3]], > > [[1, 2, 3], [1, 2, 3]]], dtype=object) > > > > > > Regards, Louis. > > > > -- > > Louis Cordier cell: +27721472305 > > Point45 Entertainment (Pty) Ltd. http://www.point45.org > > > > > > > > ------------------------------------------------------- > > Using Tomcat but need to do more? Need to support web services, security? > > Get stuff done quickly with pre-integrated technology to make your job easier > > Download IBM WebSphere Application Server v.1.0.1 based on Apache Geronimo > > http://sel.as-us.falkag.net/sel?cmd=lnk&kid=120709&bid=263057&dat=121642 > > _______________________________________________ > > Numpy-discussion mailing list > > Numpy-discussion at lists.sourceforge.net > > https://lists.sourceforge.net/lists/listinfo/numpy-discussion > > > From lcordier at point45.com Fri Apr 28 10:24:04 2006 From: lcordier at point45.com (Louis Cordier) Date: Fri Apr 28 10:24:04 2006 Subject: [Numpy-discussion] Bug In-Reply-To: References: Message-ID: > See . >> > >>> z.fill([1,2,3]) >> > >>> z >> > array([[1, 1], >> > [1, 1]], dtype=object) >> > >> > I would have expected, >> > >> > >>> z >> > array([[[1, 2, 3], [1, 2, 3]], >> > [[1, 2, 3], [1, 2, 3]]], dtype=object) Souldn't the second example be a ticket ? Or is it part of #86 ? Regards, Louis. -- Louis Cordier cell: +27721472305 Point45 Entertainment (Pty) Ltd. http://www.point45.org From ndarray at mac.com Fri Apr 28 10:49:02 2006 From: ndarray at mac.com (Sasha) Date: Fri Apr 28 10:49:02 2006 Subject: [Numpy-discussion] Bug In-Reply-To: References: Message-ID: On 4/28/06, Louis Cordier wrote: > Souldn't the second example be a ticket ? > Or is it part of #86 ? I think all your examples are different signs of the same problem. You can help by converting your examples into unit tests to be added to say test_multiarray.py and attaching a patch to the ticket. A brief comment for the developers: the problem that Louis reported is caused by the fact that x.fill([]) creates an empty array internally instead of a scalar object array containing an empty list. 
Note that numpy does not even have a good notation for the required object: >>> from numpy import * >>> x = zeros(1,'O') >>> x.shape=() >>> x[()] = [] >>> x array([], dtype=object) >>> x.shape () but >>> array([], dtype=object).shape (0,) From fullung at gmail.com Fri Apr 28 15:32:13 2006 From: fullung at gmail.com (Albert Strasheim) Date: Fri Apr 28 15:32:13 2006 Subject: [Numpy-discussion] Scalar math module is ready for testing In-Reply-To: <4451C076.40608@ieee.org> Message-ID: <007701c66b13$8365df00$0a84a8c0@dsp.sun.ac.za> Hello Travis I'm having some problems compiling the scalarmath code with the Visual Studio .NET 2003 compiler. Specifically, the compiler is failing to link in the llabs, fabsf and sqrtf functions. The reason it is not finding these symbols could be explained by the following errors I get when building the object file by hand using the parameters distutils passes to the compiler (for some reason distutils is suppressing compiler output -- this is pretty, but it makes debugging build failures hard): build\src.win32-2.4\numpy\core\src\scalarmathmodule.c(1737) : warning C4013: 'llabs' undefined; assuming extern returning int build\src.win32-2.4\numpy\core\src\scalarmathmodule.c(1751) : warning C4013: 'fabsf' undefined; assuming extern returning int build\src.win32-2.4\numpy\core\src\scalarmathmodule.c(1773) : warning C4013: 'sqrtf' undefined; assuming extern returning int In c:\Program Files\Microsoft Visual Studio .NET 2003\vc7\crt\src\math.h I have the following (extra code stripped): ... #ifndef __cplusplus #define acosl(x) ((long double)acos((double)(x))) #define asinl(x) ((long double)asin((double)(x))) #define atanl(x) ((long double)atan((double)(x))) ... /* NOTE! no sqrtf or fabsf is defined in this block */ #else /* __cplusplus */ ... #if !defined (_M_MRX000) && !defined (_M_ALPHA) && !defined (_M_IA64) /* NOTE! none of the above are defined on x86 */ ... inline float fabsf(float _X) {return ((float)fabs((double)_X)); } ... inline float sqrtf(float _X) {return ((float)sqrt((double)_X)); } ... #endif /* !defined (_M_MRX000) && !defined (_M_ALPHA) && !defined (_M_IA64) */ #endif /* __cplusplus */ >From this it would seem that Microsoft doesn't consider sqrtf and fabsf to be part of the C language? However, the C++ code provides a clue for how they implemented it. Also, llabs isn't defined anywhere. From reading the MSDN docs, I suspect it is called _abs64 on Windows. Regards, Albert > -----Original Message----- > From: numpy-discussion-admin at lists.sourceforge.net [mailto:numpy- > discussion-admin at lists.sourceforge.net] On Behalf Of Travis Oliphant > Sent: 28 April 2006 09:13 > To: numpy-discussion > Subject: [Numpy-discussion] Scalar math module is ready for testing > > > The scalar math module is complete and ready to be tested. It should > speed up code that relies heavily on scalar arithmetic by by-passing the > ufunc machinery. > > It needs lots of testing to be sure that it is doing the "right" > thing. To enable scalarmath you need to > > import numpy.core.scalarmath > > You cannot disable it once it's enabled except by restarting Python. If > we need that feature we can add it. The array scalars respond to the > error modes of ufuncs. > > There is an experimental function called alter_scalars that replaces the > Python int, float, and complex number tables with the array scalar > equivalents. 
Thus, to amaze (or seriously annoy) your Python friends > you can do > > import numpy.core.scalarmath as ncs > > ncs.alter_scalars(int) > > 1 / 0 > > This will return 0 unless you change the error modes... > > ncs.retore_scalars(int) > > Will put things back the way Guido intended.... > > > Please try it out and send us error reports. Many thanks to Sasha for > his help in getting all the code so it at least compiles and loads. All > bugs should be blamed on me, though... > > > Best, > > -Travis From jonathan.taylor at stanford.edu Fri Apr 28 16:21:15 2006 From: jonathan.taylor at stanford.edu (Jonathan Taylor) Date: Fri Apr 28 16:21:15 2006 Subject: [Numpy-discussion] confusing recarray behaviour Message-ID: <44528318.6010604@stanford.edu> I'm new to recarrays and have been struggling with them. I keep getting an exception TypeError: expected a readable buffer object with no informative traceback. What I pass to N.array seems to agree with the examples in numpybook. Below is an example that does work for me (excuse the longish example but it was just cut and paste to make my life easier). In my code, funny things happen (see ipython excerpt below this). In particular, I have a list v with v[0:2] = V and with the same dtype "ddesc" I get this exception when I change V to v[0:2]. Any help would be appreciated. --------------------------------------------------------------------------------------- import numpy as N timedesc = N.dtype({'names':['tm_year', 'tm_mon', 'tm_mday', 'tm_hour', 'tm_min', 'tm_sec', 'tm_wday', 'tm_yday', 'tm_isdst'], 'formats':['i2']*9}) ddesc = N.dtype({'names': ('Week', 'Date', 'Institution', 'SeqNo', 'HeightDone', 'Height', 'UnitsH', 'WeightDone', 'Weight', 'Units', 'PulseDone', 'Pulse', 'BPdone', 'BPSys', 'BPDia', 'PID', 'RN'), 'formats': ['f4', timedesc] + ['f4']*15}) V = [(12.0, (2005, 4, 22, 0, 0, 0, 4, 112, -1), 501.0, 1.0, 2.0, 0.0, 0, 1.0, 91.5, 1.0, 1.0, 87.0, 1.0, 129.0, 76.0, 107.0, 11.0), (24.0, (2005, 2, 1, 0, 0, 0, 1, 32, -1), 504.0, 1.0, 2.0, 0.0, 0, 1.0, 166.0, 2.0, 1.0, 84.0, 1.0, 128.0, 78.0, 401.0, 7.0) ] w=N.array(V, dtype=ddesc) -------------------------------------------------------------------------------------------------- In [97]:v[0:2] == V Out[97]:True In [98]:N.array(V, ddesc) Out[98]: array([ (12.0, (2005, 4, 22, 0, 0, 0, 4, 112, -1), 501.0, 1.0, 2.0, 0.0, 0.0, 1.0, 91.5, 1.0, 1.0, 87.0, 1.0, 129.0, 76.0, 107.0, 11.0), (24.0, (2005, 2, 1, 0, 0, 0, 1, 32, -1), 504.0, 1.0, 2.0, 0.0, 0.0, 1.0, 166.0, 2.0, 1.0, 84.0, 1.0, 128.0, 78.0, 401.0, 7.0)], dtype=[('Week', ' TypeError: expected a readable buffer object -- ------------------------------------------------------------------------ I'm part of the Team in Training: please support our efforts for the Leukemia and Lymphoma Society! http://www.active.com/donate/tntsvmb/tntsvmbJTaylor GO TEAM !!! ------------------------------------------------------------------------ Jonathan Taylor Tel: 650.723.9230 Dept. of Statistics Fax: 650.725.8977 Sequoia Hall, 137 www-stat.stanford.edu/~jtaylo 390 Serra Mall Stanford, CA 94305 -------------- next part -------------- A non-text attachment was scrubbed... 
Name: jonathan.taylor.vcf
Type: text/x-vcard
Size: 329 bytes
Desc: not available
URL:

From Fernando.Perez at colorado.edu Fri Apr 28 16:21:17 2006
From: Fernando.Perez at colorado.edu (Fernando Perez)
Date: Fri Apr 28 16:21:17 2006
Subject: [Numpy-discussion] [OT] A weekend floating point/compiler question
Message-ID: <44528F49.3080005@colorado.edu>

Hi all,

this is somewhat off-topic, since it's really a gcc/g77 question. Yet for us here (my group) it may lead to the decision to stop using g77 for all fortran code and switch to another compiler for our python-wrapped libraries. So it did arise in the context of python usage of in-house code, and I'm appealing to anyone who may want to play a little with the question and help. Feel free to reply off-list to keep the noise down on the list.

The problem arose in some in-house library, but can be boiled down to this:

planck[f77bug]> cat testbug.f
      program testbug
c
      implicit real *8 (a-h,o-z)
c
      half = 0.5d0
      x = 0.49d0
      nnx = 100
      iax = (x+half)*nnx

      print *, 'Should be 99:',iax

      stop
      end
c EOF
planck[f77bug]> g77 -o testbug.g77 testbug.f
planck[f77bug]> ./testbug.g77
 Should be 99: 98

This can be seen as computing (x/n+1/2)*n and comparing it to x+n/2. Yes, I know about the dangers of floating point roundoff error (I didn't write the original code), but a variation of this is used inside a library that began crashing for certain inputs. The point is that this same code works fine with the Intel and Lahey compilers, but not with g77.

Now, to add a bit of mystery to the question, I wrote the following C code:

planck[f77bug]> cat scanbug.c
#include <stdio.h>

int main(int argc, char* argv[])
{
    double x;
    double eps = 1e-2;
    double x0 = 0.0;
    double xmax = 1.0;
    int nnx = 100;
    int i = 0;
    double dax;
    int iax, iax_direct;

    /* loop body reconstructed from the declarations above and the
       output quoted in the replies: compare truncation through a
       double temporary against direct truncation */
    x = x0;
    while (x <= xmax) {
        dax = (x + 0.5)*nnx;
        iax = dax;
        iax_direct = (x + 0.5)*nnx;
        if (iax != iax_direct)
            printf("ERROR at x=%e!\n", x);
        i++;
        x = x0 + i*eps;
    }
    return 0;
}

Any ideas/comments? Shouldn't the result be independent of the intermediate double var? It is for icc, can this be considered a gcc bug?

From: Nick Fotopoulos
Date: Fri Apr 28 2006
Subject: [Numpy-discussion] Re: Freeing memory allocated in C
In-Reply-To: <44519C6E.80006@ieee.org>
References: <1C119933-F93B-47C6-AADA-4A61DF16B745@mit.edu> <44519C6E.80006@ieee.org>
Message-ID:

Many thanks, with your help, I got it working without any leaks. I need to run on ~10 TB of data, so fixing this leak sure helps my program scale.

One error in the code below is that PyString_FromFormat does not accept %f, so I created a regular string and created the PyString with PyString_FromString (it seems to copy data), then freed the regular string. Is there any better way to do that?

I'm curious why I didn't see any explanation of PyArray_DATA in the NumPy book. It seems really important, especially if you're touting it as the Proper Strategy.

Finally, Robert encouraged me to stop using the legacy interface. I'm happy to do so, but I have to cater to my users. Approximately how old a version of Numeric (and Numarray) will still work with PyArray_SimpleNew?

Thanks,
Nick

On Apr 28, 2006, at 12:39 AM, Travis Oliphant wrote:

> The proper strategy for your arrays is to use PyArray_SimpleNew and
> then get the data-pointer to fill using PyArray_DATA(...). The
> proper way to handle strings is to create a new string (say using
> PyString_FromFormat) and then return everything as objects.
>
> /* make sure shape is defined as intp unless you don't care about
> 64-bit */
> obj2 = PyArray_SimpleNew(1, shape, PyArray_DOUBLE);
> data = (double *)PyArray_DATA(obj2)
> [snip...]
> out5 = PyString_FromFormat("Starting GPS time:%.1f UTC=%s",
> vect->GTime,FrStrGTime(utc));
>
> return Py_BuildValue("(NNNdNNN)",out1,out2,out3,out4,out5,out6,out7);
>
> Make sure you use the 'N' tag so that another reference count isn't
> generated.
The 'O' tag will increase the reference count of your > objects by one which is is not necessarily what you want (but > sometimes you do). From robert.kern at gmail.com Fri Apr 28 16:43:18 2006 From: robert.kern at gmail.com (Robert Kern) Date: Fri Apr 28 16:43:18 2006 Subject: [Numpy-discussion] Re: Freeing memory allocated in C In-Reply-To: References: <1C119933-F93B-47C6-AADA-4A61DF16B745@mit.edu> <44519C6E.80006@ieee.org> Message-ID: Nick Fotopoulos wrote: > I'm curious why I didn't see any explanation of PyArray_DATA in the > NumPy book. It seems really important, especially if you're touting it > as the Proper Strategy. Section 13.3 talks about PyArray_DATA. > Finally, Robert encouraged me to stop using the legacy interface. I'm > happy to do so, but I have to cater to my users. Approximately old a > version of Numeric (and Numarray) will still work with PyArray_SimpleNew? None. It is new to Numpy. The old way would be to use PyArray_FromDims. -- Robert Kern robert.kern at gmail.com "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From Fernando.Perez at colorado.edu Fri Apr 28 16:55:02 2006 From: Fernando.Perez at colorado.edu (Fernando Perez) Date: Fri Apr 28 16:55:02 2006 Subject: [Numpy-discussion] A weekend floating point/compiler question Message-ID: <4452AB3F.8090700@colorado.edu> Hi Robert and George, We found a bug in g77 v. 3.4.4 as well as in gcc, which manifests itself in the following little snippet: planck[f77bug]> cat testbug.f program testbug c implicit real *8 (a-h,o-z) c half = 0.5d0 x = 0.49d0 nnx = 100 iax = (x+half)*nnx print *, 'Should be 99:',iax stop end c EOF planck[f77bug]> g77 -o testbug.g77 testbug.f planck[f77bug]> ./testbug.g77 Should be 99: 98 This can be seen as computing (x/n+1/2)*n and comparing it to x+n/2. Greg is using this in a number of places inside a library, which had never given trouble before when built with other compilers, like the sun, IBM, Intel and Lahey ones. Now with g77 it gives the result above. Questions: 1. Have you seen similar behavior in the past? 2. If we switch away from g77, what do you suggest moving towards? We ran paranoia on ifort, lahey and g77, and lahey was the best performing of all. The intel one has the advantage of being free. On the other hand, paranoia did complain about arithmetic issues with it (though the above code works fine with intel). Any ideas you can give us would be very appreciated. Cheers, Fernando and Greg. ps. Apparently g77 v 3.3.2 does NOT have this problem. From robert.kern at gmail.com Fri Apr 28 16:58:15 2006 From: robert.kern at gmail.com (Robert Kern) Date: Fri Apr 28 16:58:15 2006 Subject: [Numpy-discussion] Re: [OT] A weekend floating point/compiler question In-Reply-To: <44528F49.3080005@colorado.edu> References: <44528F49.3080005@colorado.edu> Message-ID: <4452ABFE.2040307@gmail.com> Fernando Perez wrote: > Any ideas/comments? Shouldn't the result be independent of the > intermediate double var? It is for icc, can this be considered a gcc bug? It seems like it might be processor-specific. On my G4 Powerbook (g77 3.4.4, gcc 3.3) and AMD64 Linux desktop (g77 3.4.5, gcc 4.0.2), both programs give the expected results. Specifically, the Intel 80-bit FPU thingy is probably a factor. It might be worth filing a bug report against gcc. If nothing else, you might get a better explanation of what's going on. 
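As a quick sanity check that plain double arithmetic is not the culprit: CPython stores every intermediate result as a C double, and there the product rounds back to exactly 99.0 (interpreter sketch):

>>> 0.49 + 0.5
0.98999999999999999
>>> (0.49 + 0.5) * 100
99.0
>>> int((0.49 + 0.5) * 100)
99

So the 98 really does come from the extra bits of the 80-bit intermediate.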
-- Robert Kern robert.kern at gmail.com "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From Fernando.Perez at colorado.edu Fri Apr 28 17:13:16 2006 From: Fernando.Perez at colorado.edu (Fernando Perez) Date: Fri Apr 28 17:13:16 2006 Subject: [Numpy-discussion] A weekend floating point/compiler question In-Reply-To: <4452AB3F.8090700@colorado.edu> References: <4452AB3F.8090700@colorado.edu> Message-ID: <4452AF7D.6040008@colorado.edu> Fernando Perez wrote: > Hi Robert and George, Sorry! I was writing the same question to two colleagues and forgot to change the TO line. My apology. Cheers, f From gnchen at cortechs.net Fri Apr 28 18:08:03 2006 From: gnchen at cortechs.net (Gennan Chen) Date: Fri Apr 28 18:08:03 2006 Subject: [Numpy-discussion] Guide to Numpy book Message-ID: <3FA6601C-819F-4F15-A670-829FC428F47B@cortechs.net> Hi! What is the newest version of Guide to numpy? The recent one I got is dated at Jan 9 2005 on the cover. Gen-Nan Chen, PhD Chief Scientist Research and Development Group CorTechs Labs Inc (www.cortechs.net) 1020 Prospect St., #304, La Jolla, CA, 92037 Tel: 1-858-459-9700 ext 16 Fax: 1-858-459-9705 Email: gnchen at cortechs.net From luis at geodynamics.org Fri Apr 28 18:29:03 2006 From: luis at geodynamics.org (Luis Armendariz) Date: Fri Apr 28 18:29:03 2006 Subject: [Numpy-discussion] Guide to Numpy book In-Reply-To: <3FA6601C-819F-4F15-A670-829FC428F47B@cortechs.net> References: <3FA6601C-819F-4F15-A670-829FC428F47B@cortechs.net> Message-ID: <4452C145.8050803@geodynamics.org> Gennan Chen wrote: > Hi! > > What is the newest version of Guide to numpy? The recent one I got is > dated at Jan 9 2005 on the cover. > The one I got yesterday is dated March 15, 2006. -Luis From robert.kern at gmail.com Sat Apr 29 00:31:22 2006 From: robert.kern at gmail.com (Robert Kern) Date: Sat Apr 29 00:31:22 2006 Subject: [Numpy-discussion] Re: A python interface for loess ? In-Reply-To: <200604260329.17115.pgmdevlist@mailcan.com> References: <200604260329.17115.pgmdevlist@mailcan.com> Message-ID: <4453162E.1040901@gmail.com> Pierre GM wrote: > Folks, > Would any of you be aware of a Python interface to the loess routines ? > http://netlib.bell-labs.com/netlib/a/dloess.gz Not specifically this code, but there is a pure Python+old Numeric implementation of lowess in BioPython, specifically in the Bio.Statistics subpackage. It's short and could be easily ported to use numpy. http://www.biopython.org -- Robert Kern robert.kern at gmail.com "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From chris at pseudogreen.org Sat Apr 29 09:09:11 2006 From: chris at pseudogreen.org (Christopher Stawarz) Date: Sat Apr 29 09:09:11 2006 Subject: [Numpy-discussion] Re: A weekend floating point/compiler question Message-ID: <01fa3363e635409f488757070c5f8268@pseudogreen.org> Hi, I don't think this is a GCC bug, but it does seem to be related to Intel's 80-bit floating-point architecture. As of the Pentium 3, Intel and compatible processors have two sets of instructions for performing floating-point operations: the original 8087 set, which do all computations at 80-bit precision, and SSE (and their extension SSE2), which don't use extended precision. GCC allows you to select either instruction set. 
Unfortunately, in the absence of an explicit choice, it uses a default target that varies by platform: The i386 version defaults to 8087 instructions, while the x86-64 version defaults to SSE. See http://gcc.gnu.org/onlinedocs/gcc-4.1.0/gcc/i386-and-x86_002d64- Options.html for details. I can make your test programs behave correctly on a Pentium 4 by selecting SSE2: devel12-35: g77 testbug.f devel12-36: ./a.out Should be 99: 98 devel12-37: g77 -msse2 -mfpmath=sse testbug.f devel12-38: ./a.out Should be 99: 99 devel12-39: gcc scanbug.c devel12-40: ./a.out | head -1 ERROR at x=3.000000e-02! devel12-41: gcc -msse2 -mfpmath=sse scanbug.c devel12-42: ./a.out devel12-43: Interestingly, I expected to be able to induce incorrect results on an Opteron by using 8087, but that wasn't the case (both instruction sets produced the correct result). I'll have to think about why that's happening -- maybe casting between ints and doubles differs between 32 and 64-bit architectures? I've never used the Intel or Lahey Fortran compilers, but I suspect they must be generating SSE instructions by default. Actually, it's interesting that the 80-bit computations are causing problems here, since it's easy to come up with examples where they give you better results than computations done without the extra bits. Hope that helps, Chris From charlesr.harris at gmail.com Sat Apr 29 10:25:01 2006 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat Apr 29 10:25:01 2006 Subject: [Numpy-discussion] A weekend floating point/compiler question In-Reply-To: <4452AB3F.8090700@colorado.edu> References: <4452AB3F.8090700@colorado.edu> Message-ID: On 4/28/06, Fernando Perez wrote: > > Hi Robert and George, > > We found a bug in g77 v. 3.4.4 as well as in gcc, which manifests itself > in > the following little snippet: > > planck[f77bug]> cat testbug.f > program testbug > c > implicit real *8 (a-h,o-z) > c > half = 0.5d0 > x = 0.49d0 > nnx = 100 > iax = (x+half)*nnx > > print *, 'Should be 99:',iax > > stop > end > > c EOF I don't see why the answer should be 99. The number .99 can not be exactly represented in IEEE floating point, in fact it is ~ 0.9899999999999999911182. So as you can see the result is perfectly correct given the standard conversion to int by truncation. IMHO, this is programmer error, not a compiler problem and should be fixed in the code. Now you may get slightly different results depending on roundoff error if you indulge in such things as (.5 + .49)*100 vs (.33 + .17 + .49)*100, and since these numbers are constants they may also be precomputed by the compiler and the results will depend on the accuracy of the compiler's computation. The whole construction is ambiguous. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Sat Apr 29 10:43:08 2006 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat Apr 29 10:43:08 2006 Subject: [Numpy-discussion] A weekend floating point/compiler question In-Reply-To: References: <4452AB3F.8090700@colorado.edu> Message-ID: On 4/29/06, Charles R Harris wrote: > > > > On 4/28/06, Fernando Perez wrote: > > > > Hi Robert and George, > > > > We found a bug in g77 v. 
3.4.4 as well as in gcc, which manifests itself > > in > > the following little snippet: > > > > planck[f77bug]> cat testbug.f > > program testbug > > c > > implicit real *8 (a-h,o-z) > > c > > half = 0.5d0 > > x = 0.49d0 > > nnx = 100 > > iax = (x+half)*nnx > > > > print *, 'Should be 99:',iax > > > > stop > > end > > > > c EOF > > > I don't see why the answer should be 99. The number .99 can not be exactly > represented in IEEE floating point, in fact it is ~ > 0.9899999999999999911182. So as you can see the result is perfectly > correct given the standard conversion to int by truncation. IMHO, this is > programmer error, not a compiler problem and should be fixed in the code. > Now you may get slightly different results depending on roundoff error if > you indulge in such things as (.5 + .49)*100 vs (.33 + .17 + .49)*100, and > since these numbers are constants they may also be precomputed by the > compiler and the results will depend on the accuracy of the compiler's > computation. The whole construction is ambiguous. > > Chuck > As an example: #include int main(int argc, char** argv) { int x = 100; long double y = .49; long double z = .50; printf("%25.22Lf\n", (y + z)*x); return 0; } prints 98.9999999999999991118216 whereas the same code with doubles instead of long doubles prints 99.0000000000000000000000. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From oliphant.travis at ieee.org Sat Apr 29 13:13:05 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Sat Apr 29 13:13:05 2006 Subject: [Numpy-discussion] confusing recarray behaviour In-Reply-To: <44528318.6010604@stanford.edu> References: <44528318.6010604@stanford.edu> Message-ID: <4453C8B7.8040000@ieee.org> Jonathan Taylor wrote: > > What I pass to N.array seems to agree with the examples in numpybook. > > Below is an example that does work for me (excuse the longish example > but it was just cut and paste to make my life easier). In my code, > funny things happen > (see ipython excerpt below this). In particular, I have a list v with > v[0:2] = V and with the > same dtype "ddesc" I get this exception when I change V to v[0:2]. Please show us what v is. If I run v = V[:] and then try N.array(v[0:2],ddesc) I don't get any error. So something else must be going on. Which version are you running? -Travis From fullung at gmail.com Sat Apr 29 14:30:10 2006 From: fullung at gmail.com (Albert Strasheim) Date: Sat Apr 29 14:30:10 2006 Subject: [Numpy-discussion] Array data and struct alignment Message-ID: <001601c66bd4$0a37ddb0$0a84a8c0@dsp.sun.ac.za> Hello all I'm busy wrapping a C library with NumPy. Some of the functions operate on a buffer containing structs that look like this: struct node { int index; double value; }; On the Python side, I do the following to set up my data. examples is a list containing lists or dicts. nodes = [] for example in examples: if type(example) is dict: nodes.append(example.items()) else: nodes.append(zip(range(1, len(example)+1), example)) descr = [('index','intc',1),('value','f8',1)] self.nodes = map(lambda x: array(x, dtype=descr), nodes) Assume example = [[1.0, 2.0, 3.0], {4: 4.0}]. 
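Concretely, the two branches produce index/value pairs like this (interpreter sketch, using the example above):

>>> zip(range(1, 4), [1.0, 2.0, 3.0])
[(1, 1.0), (2, 2.0), (3, 3.0)]
>>> {4: 4.0}.items()
[(4, 4.0)]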
The nodes array can now be accessed in various useful ways: nodes[0][0] -> (1, 1.0) nodes[1][0] -> (4, 4.0)) nodes[0]['index'] -> [1,2,3] nodes[0]['value'] -> [1.0,2.0,3.0]) nodes[1]['index'] -> [4] nodes[1]['value'] -> [4.0] On the C side I can now do the following: PyObject* Svm_GetStructNode(PyObject* obj, PyObject* args) { PyObject* op1; struct node* node; if(!PyArg_ParseTuple(args, "O", &op1)) { return NULL; } node = (struct node*) PyArray_DATA(op1); return Py_BuildValue("(id)", node->index, node->value); } However, this only works if struct node is tightly packed (#pragma pack(1) with the Visual C compiler). I don't know how feasible this is, but it would be useful if NumPy could be told to pack its data on n-byte boundaries or on "same as the compiler" boundaries. I realise that there can be problems when mixing code compiled by more than one compiler, etc., etc., but a simple unit test can check for this. Any thoughts? Regards, Albert From oliphant.travis at ieee.org Sat Apr 29 14:58:01 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Sat Apr 29 14:58:01 2006 Subject: [Numpy-discussion] Array data and struct alignment In-Reply-To: <001601c66bd4$0a37ddb0$0a84a8c0@dsp.sun.ac.za> References: <001601c66bd4$0a37ddb0$0a84a8c0@dsp.sun.ac.za> Message-ID: <4453E10E.5090108@ieee.org> Albert Strasheim wrote: > Hello all > > I'm busy wrapping a C library with NumPy. Some of the functions operate on a > buffer containing structs that look like this: > > struct node { > int index; > double value; > }; > > [snip] > However, this only works if struct node is tightly packed (#pragma pack(1) > with the Visual C compiler). > > I don't know how feasible this is, but it would be useful if NumPy could be > told to pack its data on n-byte boundaries or on "same as the compiler" > boundaries. I realise that there can be problems when mixing code compiled > by more than one compiler, etc., etc., but a simple unit test can check for > this. > When you create a data-type using the dtype(...) syntax there is an align keyword that will "align" the data according to how the compiler does it. I'm not sure if it always works right so please test it out. So, in your case you should be able to say. descr = dtype([('index',intc),('value','f8')], align=1) Note, I've eliminated some unnecessary verbage in your description. Currently this is giving me an error that I will look into. -Travis From oliphant.travis at ieee.org Sat Apr 29 15:04:10 2006 From: oliphant.travis at ieee.org (Travis Oliphant) Date: Sat Apr 29 15:04:10 2006 Subject: [Numpy-discussion] Array data and struct alignment In-Reply-To: <001601c66bd4$0a37ddb0$0a84a8c0@dsp.sun.ac.za> References: <001601c66bd4$0a37ddb0$0a84a8c0@dsp.sun.ac.za> Message-ID: <4453E293.7080502@ieee.org> Albert Strasheim wrote: > Hello all > > I'm busy wrapping a C library with NumPy. Some of the functions operate on a > buffer containing structs that look like this: > > struct node { > int index; > double value; > }; > > In my previous discussion I was wrong. You cannot use the array_descriptor format for a data-type and the align keyword at the same time. You need to use a different method to specify fields. This, for example: descr = dtype({'names':['index', 'value'], 'formats':[intc,'f8']},align=1) On my (32-bit) system it doesn't produce any difference from align=0. 
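A quick way to check what align=1 actually does on a given platform is to compare itemsizes (illustrative sketch; on this box both come out 12, but expect 12 vs. 16 anywhere the compiler gives doubles 8-byte alignment):

>>> from numpy import dtype, intc
>>> d0 = dtype({'names':['index','value'], 'formats':[intc,'f8']}, align=0)
>>> d1 = dtype({'names':['index','value'], 'formats':[intc,'f8']}, align=1)
>>> d0.itemsize, d1.itemsize
(12, 12)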
-Travis

From oliphant.travis at ieee.org Sat Apr 29 15:11:07 2006
From: oliphant.travis at ieee.org (Travis Oliphant)
Date: Sat Apr 29 15:11:07 2006
Subject: [Numpy-discussion] Array data and struct alignment
In-Reply-To: <4453E293.7080502@ieee.org>
References: <001601c66bd4$0a37ddb0$0a84a8c0@dsp.sun.ac.za> <4453E293.7080502@ieee.org>
Message-ID: <4453E449.20407@ieee.org>

Travis Oliphant wrote:
> Albert Strasheim wrote:
>> Hello all
>>
>> I'm busy wrapping a C library with NumPy. Some of the functions
>> operate on a buffer containing structs that look like this:
>>
>> struct node {
>>     int index;
>>     double value;
>> };
>>
> In my previous discussion I was wrong. You cannot use the
> array_descriptor format for a data-type and the align keyword at the
> same time. You need to use a different method to specify fields.
>
> This, for example:
>
> descr = dtype({'names':['index', 'value'],
>     'formats':[intc,'f8']},align=1)
>
> On my (32-bit) system it doesn't produce any difference from align=0.
>
> -Travis

However notice the difference with

>>> dtype({'names':['index', 'value'], 'formats':[short,'f8']},align=1)
dtype([('index', '<i2'), ('', '|V6'), ('value', '<f8')])

>>> dtype({'names':['index', 'value'], 'formats':[short,'f8']},align=0)
dtype([('index', '<i2'), ('value', '<f8')])

There is padding inserted in the first case. This corresponds to how the compiler packs a short; double struct on my system. The default is align=0. You need to use the dtype() constructor to change the default. The auto-constructor used in dtype= keyword calls will not change the alignment from align=0.

-Travis

From Fernando.Perez at colorado.edu Sat Apr 29 2006
From: Fernando.Perez at colorado.edu (Fernando Perez)
Date: Sat Apr 29 2006
Subject: [Numpy-discussion] A weekend floating point/compiler question
In-Reply-To: References: <4452AB3F.8090700@colorado.edu>
Message-ID: <4453F3A6.9030309@colorado.edu>

Charles R Harris wrote:
>>I don't see why the answer should be 99. The number .99 can not be exactly
>>represented in IEEE floating point, in fact it is ~
>>0.9899999999999999911182. So as you can see the result is perfectly
>>correct given the standard conversion to int by truncation. IMHO, this is
>>programmer error, not a compiler problem and should be fixed in the code.
>>Now you may get slightly different results depending on roundoff error if
>>you indulge in such things as (.5 + .49)*100 vs (.33 + .17 + .49)*100, and
>>since these numbers are constants they may also be precomputed by the
>>compiler and the results will depend on the accuracy of the compiler's
>>computation. The whole construction is ambiguous.
>>
>>Chuck
>
> As an example: [...]

Thanks to yours and the other replies. I did try resetting the FPU control word as suggested to only 64 bits, and in fact the 'problem' does disappear, and I suspect that's also why Robert sees differences in CPUs without the extra 16 internal FPU bits.

I do agree that I don't like code like this, but unfortunately this one is outside of my control. For the sake of completeness (since this thread has some educational value on the vagaries of FP arithmetic), I've slightly extended your example to:

abdul[f77bug]> cat print99.c
#include <stdio.h>

int main(int argc, char** argv)
{
    int x = 100;

    float fy = .49;
    float fz = .50;
    float fw = (fy + fz)*x;
    int ifw = fw;

    double y = .49;
    double z = .50;
    double w = (y + z)*x;
    int iw = w;

    long double ly = .49;
    long double lz = .50;
    long double lw = (ly + lz)*x;
    int ilw = lw;

    printf("floats:\n");
    printf("w=%25.22f, iw=%d\n", fw,ifw);
    printf("doubles:\n");
    printf("w=%25.22f, iw=%d\n", w,iw);
    printf("long doubles:\n");
    printf("w=%25.22Lf, iw=%d\n", lw,ilw);

    return 0;
}
// EOF

which gives on my box (AMD chip, running 32-bit fedora3):

abdul[f77bug]> ./print99.gcc
floats:
w=99.0000000000000000000000, iw=99
doubles:
w=99.0000000000000000000000, iw=99
long doubles:
w=98.9999999999999991118216, iw=98

This is consistent with the calculations done in 80 bits also giving different results. One of the nice things about this community is precisely this kind of friendly expertise. Many thanks to all.
Cheers, f From fullung at gmail.com Sat Apr 29 17:27:15 2006 From: fullung at gmail.com (Albert Strasheim) Date: Sat Apr 29 17:27:15 2006 Subject: [Numpy-discussion] Array data and struct alignment In-Reply-To: <4453E449.20407@ieee.org> Message-ID: <001d01c66bec$c556ece0$0a84a8c0@dsp.sun.ac.za> Thanks Travis, this works like a charm. For the curious, here's a quick way to see if your system is doing the right thing: In [87]: descr = dtype({'names':['a', 'b'], 'formats':[byte,'f8']},align=1) In [88]: descr Out[88]: dtype([('a', '|i1'), ('', '|V7'), ('b', ' -----Original Message----- > From: numpy-discussion-admin at lists.sourceforge.net [mailto:numpy- > discussion-admin at lists.sourceforge.net] On Behalf Of Travis Oliphant > Sent: 30 April 2006 00:10 > To: numpy-discussion > Subject: Re: [Numpy-discussion] Array data and struct alignment > > Travis Oliphant wrote: > > Albert Strasheim wrote: > >> Hello all > >> > >> I'm busy wrapping a C library with NumPy. Some of the functions > >> operate on a > >> buffer containing structs that look like this: > >> > >> struct node { > >> int index; > >> double value; > >> }; > >> > >> > > > > In my previous discussion I was wrong. You cannot use the > > array_descriptor format for a data-type and the align keyword at the > > same time. You need to use a different method to specify fields. > > > > This, for example: > > > > descr = dtype({'names':['index', 'value'], > > 'formats':[intc,'f8']},align=1) > > > > On my (32-bit) system it doesn't produce any difference from align=0. > > > > -Travis > > > > > > However notice the difference with > > >>> dtype({'names':['index', 'value'], 'formats':[short,'f8']},align=1) > dtype([('index', ' > >>> dtype({'names':['index', 'value'], 'formats':[short,'f8']},align=0) > dtype([('index', ' > > There is padding inserted in the first-case. This corresponds to how > the compiler packs a short; double struct on my system. The default is > align=0. You need to use the dtype() constructor to change the > default. The auto-constructor used in dtype= keyword calls will not > change the alignment from align=0. > > > -Travis From jonathan.taylor at stanford.edu Sat Apr 29 19:56:03 2006 From: jonathan.taylor at stanford.edu (Jonathan Taylor) Date: Sat Apr 29 19:56:03 2006 Subject: [Numpy-discussion] confusing recarray behaviour In-Reply-To: <4453C8B7.8040000@ieee.org> References: <44528318.6010604@stanford.edu> <4453C8B7.8040000@ieee.org> Message-ID: <44542730.4050609@stanford.edu> Here is a pickle file with v and desc, v is just a list of tuples with integer and string entries. My point with my example is that when I had two identical lists (i.e. v[0:2] == V) one time I got an error, the other time I didn't and the traceback had no information, i.e. I couldn't get anywhere with pdb. I am using svn revision 2456. Jonathan Travis Oliphant wrote: > Jonathan Taylor wrote: > >> >> What I pass to N.array seems to agree with the examples in numpybook. >> >> Below is an example that does work for me (excuse the longish example >> but it was just cut and paste to make my life easier). In my code, >> funny things happen >> (see ipython excerpt below this). In particular, I have a list v with >> v[0:2] = V and with the >> same dtype "ddesc" I get this exception when I change V to v[0:2]. > > Please show us what v is. > > If I run v = V[:] and then try N.array(v[0:2],ddesc) I don't get any > error. So something else must be going on. > > Which version are you running? 
> > -Travis
> >
> -------------------------------------------------------
> Using Tomcat but need to do more? Need to support web services, security?
> Get stuff done quickly with pre-integrated technology to make your job
> easier
> Download IBM WebSphere Application Server v.1.0.1 based on Apache
> Geronimo
> http://sel.as-us.falkag.net/sel?cmd=lnk&kid=120709&bid=263057&dat=121642
> _______________________________________________
> Numpy-discussion mailing list
> Numpy-discussion at lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/numpy-discussion

--
------------------------------------------------------------------------
I'm part of the Team in Training: please support our efforts for the
Leukemia and Lymphoma Society!

http://www.active.com/donate/tntsvmb/tntsvmbJTaylor

GO TEAM !!!
------------------------------------------------------------------------
Jonathan Taylor                        Tel: 650.723.9230
Dept. of Statistics                    Fax: 650.725.8977
Sequoia Hall, 137                      www-stat.stanford.edu/~jtaylo
390 Serra Mall
Stanford, CA 94305

-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: dump.pickle
URL:

From ndarray at mac.com Sun Apr 30 10:12:06 2006
From: ndarray at mac.com (Sasha)
Date: Sun Apr 30 10:12:06 2006
Subject: [Numpy-discussion] [Numeric] "put" into object array corrupts memory
In-Reply-To: References: Message-ID:

I know that Numeric is no longer maintained, but since this bug cost me two sleepless nights, I think it is appropriate to announce the bug and the fix to the list.

---------- Forwarded message ----------
From: SourceForge.net
Date: Apr 30, 2006 12:58 PM
Subject: [ numpy-Bugs-1479376 ] [Numeric] "put" into object array corrupts memory
To: noreply at sourceforge.net

Bugs item #1479376, was opened at 2006-04-30 12:46
Message generated for change (Comment added) made by belopolsky
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=101369&aid=1479376&group_id=1369

Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update.

Category: Fatal Error
Group: Normal bug
Status: Open
Priority: 5
Submitted By: Alexander Belopolsky (belopolsky)
Assigned to: Nobody/Anonymous (nobody)
Summary: [Numeric] "put" into object array corrupts memory

Initial Comment:
This is one of those bugs that are easier to fix than to reproduce:

$ cat test-put.py
class A(object):
    def __del__(self):
        print "deleting %r" % self

a = A()

from Numeric import *
x = array([None], 'O')
y = array([a], 'O')
put(x,[0],y)
del a,y
print "exiting"

$ python test-put.py
deleting <__main__.A object at 0xf7e4d24c>
exiting
Fatal Python error: deletion of interned string failed
Aborted (core dumped)

Numeric version: 24.2

----------------------------------------------------------------------

>Comment By: Alexander Belopolsky (belopolsky)
Date: 2006-04-30 12:58

Message:
Logged In: YES
user_id=835142

Attached patch fixes the bug.
---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=101369&aid=1479376&group_id=1369 From vidar+list at 37mm.no Sun Apr 30 16:27:00 2006 From: vidar+list at 37mm.no (Vidar Gundersen) Date: Sun Apr 30 16:27:00 2006 Subject: [Numpy-discussion] Guide to Numpy book In-Reply-To: <4452C145.8050803@geodynamics.org> (Luis Armendariz's message of "Fri, 28 Apr 2006 18:28:37 -0700") References: <3FA6601C-819F-4F15-A670-829FC428F47B@cortechs.net> <4452C145.8050803@geodynamics.org> Message-ID: ===== Original message from Luis Armendariz | 29 Apr 2006: >> What is the newest version of Guide to numpy? The recent one I got is >> dated at Jan 9 2005 on the cover. > The one I got yesterday is dated March 15, 2006. aren't the updates supposed to be sent out to customers when available? From ted.horst at earthlink.net Sun Apr 30 16:50:08 2006 From: ted.horst at earthlink.net (Ted Horst) Date: Sun Apr 30 16:50:08 2006 Subject: [Numpy-discussion] Scalar math module is ready for testing In-Reply-To: <4451C076.40608@ieee.org> References: <4451C076.40608@ieee.org> Message-ID: <3856FA57-539D-47DE-8427-2A6BB508F917@earthlink.net> Here is an issue I am having with scalarmath: >>> import numpy >>> numpy.__version__ '0.9.7.2462' >>> import numpy.core.scalarmath >>> a = numpy.array([1], 'h') >>> 1*a array([1], dtype=int16) >>> 1*a[0] Traceback (most recent call last): File "", line 1, in ? TypeError: unsupported operand type(s) for *: 'int' and 'int16scalar' This happens because PyArray_CanCastSafely returns false for casting from int to short. alter_scalars(int) fixes this, but I have lots of non-numpy code that I don't want to behave differently. Ted On Apr 28, 2006, at 02:12, Travis Oliphant wrote: > The scalar math module is complete and ready to be tested. It > should speed up code that relies heavily on scalar arithmetic by by- > passing the ufunc machinery. From fullung at gmail.com Sun Apr 30 17:11:05 2006 From: fullung at gmail.com (Albert Strasheim) Date: Sun Apr 30 17:11:05 2006 Subject: [Numpy-discussion] Creating a descr with aligned=1 using the C API Message-ID: <000601c66cb3$b762a940$0a84a8c0@dsp.sun.ac.za> Hello all I was wondering what the best way would be to create the following descr using the C API: descr = dtype({'names' : ['index', 'value'], 'formats' : [intc, 'f8']}, align=1) One could use PyArray_DescrConverter in multiarraymodule.c, but there doesn't seem to be a way to specify aligned=1 and one would have to build the dict object before being able to pass it on for conversion. Unless there's another easy way I'm missing, the API could possibly do with a function like PyArray_DescrFromCommaString(const char*, int align) which calls _convert_from_commastring. By the way, what is the general format of these commastrings? Comments appreciated. Regards, Albert From tim.hochberg at cox.net Sun Apr 30 19:33:03 2006 From: tim.hochberg at cox.net (Tim Hochberg) Date: Sun Apr 30 19:33:03 2006 Subject: [Numpy-discussion] basearray lives! Message-ID: <445573B0.6020408@cox.net> After a fashion anyway. I implemented the simplest thing that could possibly work and I've left out some stuff that even I think we need (docstring, repr and str). Still it exists, ndarray inherits from it and some stuff seems to work automagically. 
From tim.hochberg at cox.net Sun Apr 30 19:33:03 2006
From: tim.hochberg at cox.net (Tim Hochberg)
Date: Sun Apr 30 19:33:03 2006
Subject: [Numpy-discussion] basearray lives!
Message-ID: <445573B0.6020408@cox.net>

After a fashion anyway. I implemented the simplest thing that could
possibly work and I've left out some stuff that even I think we need
(docstring, repr and str). Still it exists, ndarray inherits from it and
some stuff seems to work automagically.

>>> import numpy as n
>>> ba = n.basearray([3,3], int, n.arange(9))
>>> ba
<numpy.basearray object at 0x...>
>>> a = asarray(ba)
>>> a
array([[0, 1, 2],
       [3, 4, 5],
       [6, 7, 8]])
>>> a + ba
array([[ 0,  2,  4],
       [ 6,  8, 10],
       [12, 14, 16]])
>>> isinstance(a, n.basearray)
True
>>> type(ba)
<type 'numpy.basearray'>
>>> type(a)
<type 'numpy.ndarray'>
>>> len(dir(ba))
19
>>> len(dir(a))
156

Travis: should I go ahead and check this into the trunk? It shouldn't
interfere with anything. The only change to ndarray is the tp_base,
which sets up the inheritance.

-tim

From ndarray at mac.com Sun Apr 30 20:27:09 2006
From: ndarray at mac.com (Sasha)
Date: Sun Apr 30 20:27:09 2006
Subject: [Numpy-discussion] basearray lives!
In-Reply-To: <445573B0.6020408@cox.net>
References: <445573B0.6020408@cox.net>
Message-ID:

Let me add my $.02.

I am very much in favor of a basic array object. I would probably go much
further than Tim in simplifying it. No need for repr/str. No number
protocol. No sequence/mapping protocol either. Maybe even no
dimensions/striding etc. What is left? Not much on top of the buffer
protocol: the type description.

I've expressed this opinion several times before (and was criticised for
not supporting it :-)): I don't think a basearray should be a base class.
The main reason is that in most cases subclasses will need to adapt all
the array methods. In many cases (speaking from ma experience, but
probably matrix folks can relate) the adaptation is not automatic and has
to be done on a method-by-method basis. Exposure of the base class methods
without adaptation or with wrong adaptation leads to errors. Unless the
base array is truly minimalistic and stays this way, methods that are
added to the base class in the future will likely not work unadapted.

The only inheritance-based implementation that I would like is something
similar to python's object type: rich C API and no Python API.

Would you consider checking your implementation in without modifying
ndarray's tp_base?

On 4/30/06, Tim Hochberg wrote:
>
> After a fashion anyway. I implemented the simplest thing that could
> possibly work and I've left out some stuff that even I think we need
> (docstring, repr and str). Still it exists, ndarray inherits from it and
> some stuff seems to work automagically.
>
> >>> import numpy as n
> >>> ba = n.basearray([3,3], int, n.arange(9))
> >>> ba
> <numpy.basearray object at 0x...>
> >>> a = asarray(ba)
> >>> a
> array([[0, 1, 2],
>        [3, 4, 5],
>        [6, 7, 8]])
> >>> a + ba
> array([[ 0,  2,  4],
>        [ 6,  8, 10],
>        [12, 14, 16]])
> >>> isinstance(a, n.basearray)
> True
> >>> type(ba)
> <type 'numpy.basearray'>
> >>> type(a)
> <type 'numpy.ndarray'>
> >>> len(dir(ba))
> 19
> >>> len(dir(a))
> 156
>
> Travis: should I go ahead and check this into the trunk? It shouldn't
> interfere with anything. The only change to ndarray is the tp_base,
> which sets up the inheritance.
>
> -tim
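Sasha's adaptation worry is easy to see in miniature. A pure-Python sketch
only, not numpy code; BaseArray and MaskedArray here are hypothetical
stand-ins showing how an unadapted base-class method silently ignores
subclass state:

# Illustration of the method-adaptation pitfall: a base-class method that
# knows nothing about subclass invariants gives wrong answers when
# inherited unadapted.

class BaseArray(object):
    def __init__(self, data):
        self.data = list(data)
    def sum(self):                     # knows nothing about masks
        return sum(self.data)

class MaskedArray(BaseArray):
    def __init__(self, data, mask):
        BaseArray.__init__(self, data)
        self.mask = list(mask)
    # sum() is inherited unadapted, so masked entries are counted

m = MaskedArray([1, 2, 999], mask=[False, False, True])
print m.sum()   # 1002 -- the masked 999 leaks in

class AdaptedMaskedArray(MaskedArray):
    def sum(self):                     # the per-method adaptation Sasha means
        return sum(d for d, mk in zip(self.data, self.mask) if not mk)

print AdaptedMaskedArray([1, 2, 999], [False, False, True]).sum()   # 3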
From oliphant.travis at ieee.org Sun Apr 30 21:45:05 2006
From: oliphant.travis at ieee.org (Travis Oliphant)
Date: Sun Apr 30 21:45:05 2006
Subject: [Numpy-discussion] Creating a descr with aligned=1 using the C API
In-Reply-To: <000601c66cb3$b762a940$0a84a8c0@dsp.sun.ac.za>
References: <000601c66cb3$b762a940$0a84a8c0@dsp.sun.ac.za>
Message-ID: <44559204.3020902@ieee.org>

Albert Strasheim wrote:
> Hello all
>
> I was wondering what the best way would be to create the following descr
> using the C API:
>

You can use the "new" method:

PyArray_Descr *dtype;
PyObject *dict;
PyObject *args;

args = Py_BuildValue("(Oi)", dict, 1);
dtype = (PyArray_Descr *)PyArrayDescr_Type.tp_new(&PyArrayDescr_Type,
                                                  args, NULL);
Py_DECREF(args);

where the dict is the one you give.

Yes, this could be an easier-to-use API.

> descr = dtype({'names' : ['index', 'value'], 'formats' : [intc, 'f8']},
>               align=1)
>
> One could use PyArray_DescrConverter in multiarraymodule.c, but there
> doesn't seem to be a way to specify aligned=1, and one would have to build
> the dict object before being able to pass it on for conversion.
>
> Unless there's another easy way I'm missing, the API could possibly do with
> a function like PyArray_DescrFromCommaString(const char*, int align) which
> calls _convert_from_commastring. By the way, what is the general format of
> these commastrings?
>

It's in the NumPy book and it's also documented by numarray...

-Travis

From oliphant.travis at ieee.org Sun Apr 30 21:49:02 2006
From: oliphant.travis at ieee.org (Travis Oliphant)
Date: Sun Apr 30 21:49:02 2006
Subject: [Numpy-discussion] basearray lives!
In-Reply-To: <445573B0.6020408@cox.net>
References: <445573B0.6020408@cox.net>
Message-ID: <445592EB.1000406@ieee.org>

Tim Hochberg wrote:
>
> After a fashion anyway. I implemented the simplest thing that could
> possibly work and I've left out some stuff that even I think we need
> (docstring, repr and str). Still it exists, ndarray inherits from it
> and some stuff seems to work automagically.
>
> >>> import numpy as n
> >>> ba = n.basearray([3,3], int, n.arange(9))
> >>> ba
> <numpy.basearray object at 0x...>
> >>> a = asarray(ba)
> >>> a
> array([[0, 1, 2],
>        [3, 4, 5],
>        [6, 7, 8]])
> >>> a + ba
> array([[ 0,  2,  4],
>        [ 6,  8, 10],
>        [12, 14, 16]])
> >>> isinstance(a, n.basearray)
> True
> >>> type(ba)
> <type 'numpy.basearray'>
> >>> type(a)
> <type 'numpy.ndarray'>
> >>> len(dir(ba))
> 19
> >>> len(dir(a))
> 156
>
> Travis: should I go ahead and check this into the trunk? It shouldn't
> interfere with anything. The only change to ndarray is the tp_base,
> which sets up the inheritance.
>

I say go ahead. We can then all deal with it there and improve upon it.
The ndarray used to inherit from another array and things worked.
Python's inheritance in C is actually quite slick, especially for
structural issues.

I agree that the basearray should have minimal operations (I would not
even define several of the protocols for it). I'd probably keep only the
buffer and mapping protocols, and even then only a simple mapping
protocol (i.e. no fancy indexing) that then gets enhanced by the ndarray.

Thanks for the work.

-Travis
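To make the division of labor Travis describes concrete, here is a
pure-Python analogy, a sketch of the design idea only: the real
implementation is C, and these class bodies are hypothetical. The base
type carries only structure and a simple mapping protocol, the subclass
layers the rich behavior on top, and isinstance works across the split
for free:

# Pure-Python analogy of a minimal basearray with a rich ndarray subclass.

class basearray(object):
    """Structure only: a type description, a shape, and raw data."""
    def __init__(self, shape, typecode, data):
        self.shape = tuple(shape)
        self.typecode = typecode
        self.data = list(data)
    def __getitem__(self, i):
        # simple mapping protocol only -- no fancy indexing here
        return self.data[i]

class ndarray(basearray):
    """The rich type adds the number protocol, fancy indexing, etc."""
    def __add__(self, other):
        return ndarray(self.shape, self.typecode,
                       [x + y for x, y in zip(self.data, other.data)])

a = ndarray((3,), 'i', [1, 2, 3])
b = ndarray((3,), 'i', [10, 20, 30])
print (a + b).data               # [11, 22, 33]
print isinstance(a, basearray)   # True -- the cheap win from tp_base
print hasattr(basearray((3,), 'i', [0]), '__add__')   # False: base stays lean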