From dileepkunjaai at gmail.com Mon Aug 1 05:31:13 2011 From: dileepkunjaai at gmail.com (dileep kunjaai) Date: Mon, 1 Aug 2011 15:01:13 +0530 Subject: [Numpy-discussion] Fill a particular value in the place of number satisfying certain condition by another number in an array. Message-ID: Dear sir, How can we fill a particular value in the place of number satisfying certain condition by another number in an array. Example: A=[[[ 9.42233087e-42 - 4.71116544e-42 0.00000000e+00 ..., 1.48303127e+01 1.31524124e+01 1.14745111e+01] [ 3.91788793e+00 1.95894396e+00 0.00000000e+00 ..., 1.78252487e+01 1.28667984e+01 7.90834856e+00] [ 7.83592510e+00 -3.91796255e+00 0.00000000e+00 ..., 2.08202991e+01 1.25811749e+01 4.34205008e+00] ..., [ -8.51249974e-03 7.00901222e+00 -1.40095119e+01 ..., 0.00000000e+00 0.00000000e+00 0.00000000e+00] [ 4.26390441e-03 3.51080871e+00 -7.01735353e+00 ..., 0.00000000e+00 0.00000000e+00 0.00000000e+00] [ 0.00000000e+00 0.00000000e+00 0.00000000e+00 ..., 0.00000000e+00 0.00000000e+00 0.00000000e+00]] [[ 9.42233087e-42 -4.71116544e-42 0.00000000e+00 ..., 8.48242474e+00 7.97146845e+00 7.46051216e+00] [ 5.16325808e+00 2.58162904e+00 0.00000000e+00 ..., 8.47719383e+00 8.28024673e+00 8.08330059e+00] [ 1.03267126e+01 5.16335630e+00 0.00000000e+00 ..., 8.47196198e+00 8.58903694e+00 8.70611191e+00] ..., [ 0.00000000e+00 2.74500012e-01 5.49000025e-01 ..., 0.00000000e+00 0.00000000e+00 0.00000000e+00] [ 0.00000000e+00 1.37496844e-01 -2.74993688e-01 ..., 0.00000000e+00 0.00000000e+00 0.00000000e+00] [ 0.00000000e+00 0.00000000e+00 0.00000000e+00 ..., 0.00000000e+00 0.00000000e+00 0.00000000e+00]] [[ 9.42233087e-42 4.71116544e-42 0.00000000e+00 ..., 1.18437748e+01 9.72778034e+00 7.61178637e+00] [ 2.96431869e-01 1.48215935e-01 0.00000000e+00 ..., 1.64031239e+01 1.32768812e+01 1.01506386e+01] [ 5.92875004e-01 2.96437502e-01 0.00000000e+00 ..., 2.09626484e+01 1.68261185e+01 1.26895866e+01] ..., [ 1.78188753e+00 -8.90943766e-01 0.00000000e+00 ..., 0.00000000e+00 1.27500005e-03 2.55000009e-03] [ 9.34620261e-01 -4.67310131e-01 0.00000000e+00 ..., 0.00000000e+00 6.38646539e-04 1.27729308e-03] [ 8.43000039e-02 4.21500020e-02 0.00000000e+00 ..., 0.00000000e+00 0.00000000e+00 0.00000000e+00]]] A contain some negative value i want to change the negative numbers to '0'. I used 'masked_where', command but I failed. Please help me -- DILEEPKUMAR. R J R F, IIT DELHI -------------- next part -------------- An HTML attachment was scrubbed... URL: From mlist at re-factory.de Mon Aug 1 05:37:09 2011 From: mlist at re-factory.de (Robert Elsner) Date: Mon, 01 Aug 2011 11:37:09 +0200 Subject: [Numpy-discussion] C api doc shortcomings Message-ID: <4E3673C5.2020600@re-factory.de> Hey Everybody, I noticed that the c-api docs (2.0.dev-72ab385) lack a clear statement what the preferred entry point into the c-api is (from a users point of view). Normally I would expect a sentence or two stating that the api entry point is arrayobject.h (or whatever). Instead the docs ponder about reading the c sources but do not give any hints where to start. I suggest something akin to the official Python docs in a prominent place: All function, type and macro definitions needed to use the Python/C API are included in your code by the following line: #include "Python.h" This implies inclusion of the following standard headers: , , , , and (if available). modified for Numpy. 
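As an aside, the directory that holds numpy's C headers (arrayobject.h among them) can at least be located from Python itself; a rough sketch, nothing more:

import numpy
# prints the include directory containing numpy/arrayobject.h; pass it as an
# -I flag (or include_dirs entry) when compiling an extension against the C API
print(numpy.get_include())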
cheers Robert From miguel.deval at gmail.com Mon Aug 1 05:41:52 2011 From: miguel.deval at gmail.com (Miguel de Val-Borro) Date: Mon, 1 Aug 2011 11:41:52 +0200 Subject: [Numpy-discussion] Fill a particular value in the place of number satisfying certain condition by another number in an array. In-Reply-To: References: Message-ID: <20110801094152.GE30796@poincare.pc.linmpi.mpg.de> Dear Dileep, the numpy.where function returns the elements from A or 0 depending if the condition in the first argument is satisfied: B = np.where(A >= 0, A, 0) Miguel On Mon, Aug 01, 2011 at 03:01:13PM +0530, dileep kunjaai wrote: > Dear sir, > How can we fill a particular value in the place of number satisfying > certain condition by another number in an array. > > > Example: > A=[[[ 9.42233087e-42 - 4.71116544e-42 0.00000000e+00 ..., > 1.48303127e+01 > 1.31524124e+01 1.14745111e+01] > [ 3.91788793e+00 1.95894396e+00 0.00000000e+00 ..., 1.78252487e+01 > 1.28667984e+01 7.90834856e+00] > [ 7.83592510e+00 -3.91796255e+00 0.00000000e+00 ..., 2.08202991e+01 > 1.25811749e+01 4.34205008e+00] > ..., > [ -8.51249974e-03 7.00901222e+00 -1.40095119e+01 ..., > 0.00000000e+00 > 0.00000000e+00 0.00000000e+00] > [ 4.26390441e-03 3.51080871e+00 -7.01735353e+00 ..., 0.00000000e+00 > 0.00000000e+00 0.00000000e+00] > [ 0.00000000e+00 0.00000000e+00 0.00000000e+00 ..., 0.00000000e+00 > 0.00000000e+00 0.00000000e+00]] > > [[ 9.42233087e-42 -4.71116544e-42 0.00000000e+00 ..., 8.48242474e+00 > 7.97146845e+00 7.46051216e+00] > [ 5.16325808e+00 2.58162904e+00 0.00000000e+00 ..., 8.47719383e+00 > 8.28024673e+00 8.08330059e+00] > [ 1.03267126e+01 5.16335630e+00 0.00000000e+00 ..., 8.47196198e+00 > 8.58903694e+00 8.70611191e+00] > ..., > [ 0.00000000e+00 2.74500012e-01 5.49000025e-01 ..., 0.00000000e+00 > 0.00000000e+00 0.00000000e+00] > [ 0.00000000e+00 1.37496844e-01 -2.74993688e-01 ..., 0.00000000e+00 > 0.00000000e+00 0.00000000e+00] > [ 0.00000000e+00 0.00000000e+00 0.00000000e+00 ..., 0.00000000e+00 > 0.00000000e+00 0.00000000e+00]] > > [[ 9.42233087e-42 4.71116544e-42 0.00000000e+00 ..., 1.18437748e+01 > 9.72778034e+00 7.61178637e+00] > [ 2.96431869e-01 1.48215935e-01 0.00000000e+00 ..., 1.64031239e+01 > 1.32768812e+01 1.01506386e+01] > [ 5.92875004e-01 2.96437502e-01 0.00000000e+00 ..., 2.09626484e+01 > 1.68261185e+01 1.26895866e+01] > ..., > [ 1.78188753e+00 -8.90943766e-01 0.00000000e+00 ..., 0.00000000e+00 > 1.27500005e-03 2.55000009e-03] > [ 9.34620261e-01 -4.67310131e-01 0.00000000e+00 ..., 0.00000000e+00 > 6.38646539e-04 1.27729308e-03] > [ 8.43000039e-02 4.21500020e-02 0.00000000e+00 ..., 0.00000000e+00 > 0.00000000e+00 0.00000000e+00]]] > A contain some negative value i want to change the negative numbers to > '0'. > I used 'masked_where', command but I failed. > > > > Please help me > > -- > DILEEPKUMAR. R > J R F, IIT DELHI > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From silva at lma.cnrs-mrs.fr Mon Aug 1 05:43:13 2011 From: silva at lma.cnrs-mrs.fr (Fabrice Silva) Date: Mon, 01 Aug 2011 11:43:13 +0200 Subject: [Numpy-discussion] Fill a particular value in the place of number satisfying certain condition by another number in an array. In-Reply-To: References: Message-ID: <1312191793.5117.7.camel@lma-98.cnrs-mrs.fr> Le lundi 01 ao?t 2011 ? 
15:01 +0530, dileep kunjaai a ?crit : > Dear sir, > How can we fill a particular value in the place of number satisfying > certain condition by another number in an array. > A contain some negative value i want to change the negative numbers to > '0'. I used 'masked_where', command but I failed. Does np.clip fulfill your requirements ? http://docs.scipy.org/doc/numpy/reference/generated/numpy.clip.html Be aware that it needs an upper limit (which can be np.inf). Another option A[A<0] = 0. -- Fabrice Silva From jeffspencerd at gmail.com Mon Aug 1 08:14:51 2011 From: jeffspencerd at gmail.com (Jeffrey Spencer) Date: Mon, 01 Aug 2011 22:14:51 +1000 Subject: [Numpy-discussion] Fill a particular value in the place of number satisfying certain condition by another number in an array. In-Reply-To: References: Message-ID: <4E3698BB.5000002@gmail.com> Depends where it is contained but another option is and I find it to typically be faster: B = zeros(A.shape) maximum(A,B,A) On 08/01/2011 07:31 PM, dileep kunjaai wrote: > Dear sir, > How can we fill a particular value in the place of number > satisfying certain condition by another number in an array. > > > Example: > A=[[[ 9.42233087e-42 - 4.71116544e-42 0.00000000e+00 ..., > 1.48303127e+01 > 1.31524124e+01 1.14745111e+01] > [ 3.91788793e+00 1.95894396e+00 0.00000000e+00 ..., > 1.78252487e+01 > 1.28667984e+01 7.90834856e+00] > [ 7.83592510e+00 -3.91796255e+00 0.00000000e+00 ..., > 2.08202991e+01 > 1.25811749e+01 4.34205008e+00] > ..., > [ -8.51249974e-03 7.00901222e+00 -1.40095119e+01 ..., > 0.00000000e+00 > 0.00000000e+00 0.00000000e+00] > [ 4.26390441e-03 3.51080871e+00 -7.01735353e+00 ..., > 0.00000000e+00 > 0.00000000e+00 0.00000000e+00] > [ 0.00000000e+00 0.00000000e+00 0.00000000e+00 ..., > 0.00000000e+00 > 0.00000000e+00 0.00000000e+00]] > > [[ 9.42233087e-42 -4.71116544e-42 0.00000000e+00 ..., > 8.48242474e+00 > 7.97146845e+00 7.46051216e+00] > [ 5.16325808e+00 2.58162904e+00 0.00000000e+00 ..., > 8.47719383e+00 > 8.28024673e+00 8.08330059e+00] > [ 1.03267126e+01 5.16335630e+00 0.00000000e+00 ..., > 8.47196198e+00 > 8.58903694e+00 8.70611191e+00] > ..., > [ 0.00000000e+00 2.74500012e-01 5.49000025e-01 ..., > 0.00000000e+00 > 0.00000000e+00 0.00000000e+00] > [ 0.00000000e+00 1.37496844e-01 -2.74993688e-01 ..., > 0.00000000e+00 > 0.00000000e+00 0.00000000e+00] > [ 0.00000000e+00 0.00000000e+00 0.00000000e+00 ..., > 0.00000000e+00 > 0.00000000e+00 0.00000000e+00]] > > [[ 9.42233087e-42 4.71116544e-42 0.00000000e+00 ..., > 1.18437748e+01 > 9.72778034e+00 7.61178637e+00] > [ 2.96431869e-01 1.48215935e-01 0.00000000e+00 ..., > 1.64031239e+01 > 1.32768812e+01 1.01506386e+01] > [ 5.92875004e-01 2.96437502e-01 0.00000000e+00 ..., > 2.09626484e+01 > 1.68261185e+01 1.26895866e+01] > ..., > [ 1.78188753e+00 -8.90943766e-01 0.00000000e+00 ..., > 0.00000000e+00 > 1.27500005e-03 2.55000009e-03] > [ 9.34620261e-01 -4.67310131e-01 0.00000000e+00 ..., > 0.00000000e+00 > 6.38646539e-04 1.27729308e-03] > [ 8.43000039e-02 4.21500020e-02 0.00000000e+00 ..., > 0.00000000e+00 > 0.00000000e+00 0.00000000e+00]]] > A contain some negative value i want to change the negative numbers > to '0'. > I used 'masked_where', command but I failed. > > > > Please help me > > -- > DILEEPKUMAR. R > J R F, IIT DELHI > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From brett.olsen at gmail.com Mon Aug 1 10:34:16 2011 From: brett.olsen at gmail.com (Brett Olsen) Date: Mon, 1 Aug 2011 09:34:16 -0500 Subject: [Numpy-discussion] Fill a particular value in the place of number satisfying certain condition by another number in an array. In-Reply-To: References: Message-ID: This method is probably simpler: In [1]: import numpy as N In [2]: A = N.random.random_integers(-10, 10, 25).reshape((5, 5)) In [3]: A Out[3]: array([[ -5, 9, 1, 9, -2], [ -8, 0, 9, 7, -10], [ 2, -3, -1, 5, -7], [ 0, -2, -2, 9, 1], [ -7, -9, -4, -1, 6]]) In [4]: A[A < 0] = 0 In [5]: A Out[5]: array([[0, 9, 1, 9, 0], [0, 0, 9, 7, 0], [2, 0, 0, 5, 0], [0, 0, 0, 9, 1], [0, 0, 0, 0, 6]]) ~Brett On Mon, Aug 1, 2011 at 4:31 AM, dileep kunjaai wrote: > Dear sir, > ?? How can we fill a particular value in the place of number satisfying > certain condition by another number in an array. > > > Example: > ?A=[[[? 9.42233087e-42? - 4.71116544e-42?? 0.00000000e+00 ..., > 1.48303127e+01 > ???? 1.31524124e+01?? 1.14745111e+01] > ? [? 3.91788793e+00?? 1.95894396e+00?? 0.00000000e+00 ...,?? 1.78252487e+01 > ???? 1.28667984e+01?? 7.90834856e+00] > ? [? 7.83592510e+00?? -3.91796255e+00?? 0.00000000e+00 ...,?? 2.08202991e+01 > ???? 1.25811749e+01?? 4.34205008e+00] > ? ..., > ? [? -8.51249974e-03?? 7.00901222e+00?? -1.40095119e+01 ..., > 0.00000000e+00 > ???? 0.00000000e+00?? 0.00000000e+00] > ? [? 4.26390441e-03?? 3.51080871e+00?? -7.01735353e+00 ...,?? 0.00000000e+00 > ???? 0.00000000e+00?? 0.00000000e+00] > ? [? 0.00000000e+00?? 0.00000000e+00?? 0.00000000e+00 ...,?? 0.00000000e+00 > ???? 0.00000000e+00?? 0.00000000e+00]] > > ?[[? 9.42233087e-42?? -4.71116544e-42?? 0.00000000e+00 ...,?? 8.48242474e+00 > ???? 7.97146845e+00?? 7.46051216e+00] > ? [? 5.16325808e+00?? 2.58162904e+00?? 0.00000000e+00 ...,?? 8.47719383e+00 > ???? 8.28024673e+00?? 8.08330059e+00] > ? [? 1.03267126e+01?? 5.16335630e+00?? 0.00000000e+00 ...,?? 8.47196198e+00 > ???? 8.58903694e+00?? 8.70611191e+00] > ? ..., > ? [? 0.00000000e+00?? 2.74500012e-01?? 5.49000025e-01 ...,?? 0.00000000e+00 > ???? 0.00000000e+00?? 0.00000000e+00] > ? [? 0.00000000e+00?? 1.37496844e-01?? -2.74993688e-01 ...,?? 0.00000000e+00 > ???? 0.00000000e+00?? 0.00000000e+00] > ? [? 0.00000000e+00?? 0.00000000e+00?? 0.00000000e+00 ...,?? 0.00000000e+00 > ???? 0.00000000e+00?? 0.00000000e+00]] > > ?[[? 9.42233087e-42?? 4.71116544e-42?? 0.00000000e+00 ...,?? 1.18437748e+01 > ???? 9.72778034e+00?? 7.61178637e+00] > ? [? 2.96431869e-01?? 1.48215935e-01?? 0.00000000e+00 ...,?? 1.64031239e+01 > ???? 1.32768812e+01?? 1.01506386e+01] > ? [? 5.92875004e-01?? 2.96437502e-01?? 0.00000000e+00 ...,?? 2.09626484e+01 > ???? 1.68261185e+01?? 1.26895866e+01] > ? ..., > ? [? 1.78188753e+00?? -8.90943766e-01?? 0.00000000e+00 ...,?? 0.00000000e+00 > ???? 1.27500005e-03?? 2.55000009e-03] > ? [? 9.34620261e-01?? -4.67310131e-01?? 0.00000000e+00 ...,?? 0.00000000e+00 > ???? 6.38646539e-04?? 1.27729308e-03] > ? [? 8.43000039e-02?? 4.21500020e-02?? 0.00000000e+00 ...,?? 0.00000000e+00 > ???? 0.00000000e+00?? 0.00000000e+00]]] > ? A contain some negative value i want to change the negative numbers to > '0'. > I used 'masked_where', command but I failed. > > > > Please help me > > -- > DILEEPKUMAR. 
R > J R F, IIT DELHI > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > From e.antero.tammi at gmail.com Mon Aug 1 12:23:04 2011 From: e.antero.tammi at gmail.com (eat) Date: Mon, 1 Aug 2011 19:23:04 +0300 Subject: [Numpy-discussion] Fill a particular value in the place of number satisfying certain condition by another number in an array. In-Reply-To: <4E3698BB.5000002@gmail.com> References: <4E3698BB.5000002@gmail.com> Message-ID: Hi On Mon, Aug 1, 2011 at 3:14 PM, Jeffrey Spencer wrote: > Depends where it is contained but another option is and I find it to > typically be faster: > > B = zeros(A.shape) > maximum(A,B,A) > Since maximum(.) can handle broadcasting maximum(A, 0, A) will be even faster. -eat > > > On 08/01/2011 07:31 PM, dileep kunjaai wrote: > > Dear sir, > How can we fill a particular value in the place of number satisfying > certain condition by another number in an array. > > > Example: > A=[[[ 9.42233087e-42 - 4.71116544e-42 0.00000000e+00 ..., > 1.48303127e+01 > 1.31524124e+01 1.14745111e+01] > [ 3.91788793e+00 1.95894396e+00 0.00000000e+00 ..., 1.78252487e+01 > 1.28667984e+01 7.90834856e+00] > [ 7.83592510e+00 -3.91796255e+00 0.00000000e+00 ..., > 2.08202991e+01 > 1.25811749e+01 4.34205008e+00] > ..., > [ -8.51249974e-03 7.00901222e+00 -1.40095119e+01 ..., > 0.00000000e+00 > 0.00000000e+00 0.00000000e+00] > [ 4.26390441e-03 3.51080871e+00 -7.01735353e+00 ..., > 0.00000000e+00 > 0.00000000e+00 0.00000000e+00] > [ 0.00000000e+00 0.00000000e+00 0.00000000e+00 ..., 0.00000000e+00 > 0.00000000e+00 0.00000000e+00]] > > [[ 9.42233087e-42 -4.71116544e-42 0.00000000e+00 ..., > 8.48242474e+00 > 7.97146845e+00 7.46051216e+00] > [ 5.16325808e+00 2.58162904e+00 0.00000000e+00 ..., 8.47719383e+00 > 8.28024673e+00 8.08330059e+00] > [ 1.03267126e+01 5.16335630e+00 0.00000000e+00 ..., 8.47196198e+00 > 8.58903694e+00 8.70611191e+00] > ..., > [ 0.00000000e+00 2.74500012e-01 5.49000025e-01 ..., 0.00000000e+00 > 0.00000000e+00 0.00000000e+00] > [ 0.00000000e+00 1.37496844e-01 -2.74993688e-01 ..., > 0.00000000e+00 > 0.00000000e+00 0.00000000e+00] > [ 0.00000000e+00 0.00000000e+00 0.00000000e+00 ..., 0.00000000e+00 > 0.00000000e+00 0.00000000e+00]] > > [[ 9.42233087e-42 4.71116544e-42 0.00000000e+00 ..., 1.18437748e+01 > 9.72778034e+00 7.61178637e+00] > [ 2.96431869e-01 1.48215935e-01 0.00000000e+00 ..., 1.64031239e+01 > 1.32768812e+01 1.01506386e+01] > [ 5.92875004e-01 2.96437502e-01 0.00000000e+00 ..., 2.09626484e+01 > 1.68261185e+01 1.26895866e+01] > ..., > [ 1.78188753e+00 -8.90943766e-01 0.00000000e+00 ..., > 0.00000000e+00 > 1.27500005e-03 2.55000009e-03] > [ 9.34620261e-01 -4.67310131e-01 0.00000000e+00 ..., > 0.00000000e+00 > 6.38646539e-04 1.27729308e-03] > [ 8.43000039e-02 4.21500020e-02 0.00000000e+00 ..., 0.00000000e+00 > 0.00000000e+00 0.00000000e+00]]] > A contain some negative value i want to change the negative numbers to > '0'. > I used 'masked_where', command but I failed. > > > > Please help me > > -- > DILEEPKUMAR. 
R > J R F, IIT DELHI > > > > _______________________________________________ > NumPy-Discussion mailing listNumPy-Discussion at scipy.orghttp://mail.scipy.org/mailman/listinfo/numpy-discussion > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From shish at keba.be Mon Aug 1 12:52:34 2011 From: shish at keba.be (Olivier Delalleau) Date: Mon, 1 Aug 2011 12:52:34 -0400 Subject: [Numpy-discussion] recommendation for saving data In-Reply-To: <8807AC87-DA23-49BE-9D6D-74FE528DBBAC@bryant.edu> References: <8807AC87-DA23-49BE-9D6D-74FE528DBBAC@bryant.edu> Message-ID: I personally use pickle, which does exactly what you are asking for (and can be customized with __getstate__ and __setstate__ if needed). What are your issues with pickle? -=- Olivier 2011/7/31 Brian Blais > Hello, > > I was wondering if there are any recommendations for formats for saving > scientific data. I am running a simulation, which has many > somewhat-indepedent parts which have their own internal state and > parameters. I've been using pickle (gzipped) to save the entire object > (which contains subobjects, etc...), but it is getting too unwieldy and I > think it is time to look for a more robust solution. Ideally I'd like to > have something where I can call a save method on the simulation object, and > it will call the save methods on all the children, on down the line all > saving into one file. It'd also be nice if it were cross-platform, and I > could depend on the files being readable into the future for a while. > > Are there any good standards for this? What do you use for saving > scientific data? > > > thank you, > > Brian Blais > > > > -- > Brian Blais > bblais at bryant.edu > http://web.bryant.edu/~bblais > http://bblais.blogspot.com/ > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From Chris.Barker at noaa.gov Mon Aug 1 13:08:21 2011 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Mon, 01 Aug 2011 10:08:21 -0700 Subject: [Numpy-discussion] recommendation for saving data In-Reply-To: <8807AC87-DA23-49BE-9D6D-74FE528DBBAC@bryant.edu> References: <8807AC87-DA23-49BE-9D6D-74FE528DBBAC@bryant.edu> Message-ID: <4E36DD85.1040801@noaa.gov> On 7/31/11 5:48 AM, Brian Blais wrote: > I was wondering if there are any recommendations for formats for saving scientific data. every field has it's own standards -- I'd try to find one that is likely to be used by folks that may care about your results. For Oceanographic and Atmospheric modeling data, netcdf is a good option. I like the NetCDF4 python lib: http://code.google.com/p/netcdf4-python/ (there are others) For broader use, and a bit more flexibility, HDF is a good option. There are at least two ways to use it with numpy: PyTables: http://www.pytables.org (Nice higher-level interface) hf5py: http://alfven.org/wp/hdf5-for-python/ (a more raw HDF5 wrapper) There is also the npz format, built in to numpy, if you are happy with requiring python to read the data. -Chris I am running a simulation, which has many somewhat-indepedent parts which have their own internal state and parameters. 
I've been using pickle (gzipped) to save the entire object (which contains subobjects, etc...), but it is getting too unwieldy and I think it is time to look for a more robust solution. Ideally I'd like to have something where I can call a save method on the simulation object, and it will call the save methods on all the children, on down the line all saving into one file. It'd also be nice if it were cross-platform, and I could depend on the files being readable into the future for a while. > > Are there any good standards for this? What do you use for saving scientific data? > > > thank you, > > Brian Blais > > > -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From aarchiba at physics.mcgill.ca Mon Aug 1 13:17:33 2011 From: aarchiba at physics.mcgill.ca (Anne Archibald) Date: Mon, 1 Aug 2011 13:17:33 -0400 Subject: [Numpy-discussion] [SciPy-User] recommendation for saving data In-Reply-To: <4E36DD85.1040801@noaa.gov> References: <8807AC87-DA23-49BE-9D6D-74FE528DBBAC@bryant.edu> <4E36DD85.1040801@noaa.gov> Message-ID: In astronomy we tend to use FITS, which is well-supported by pyfits, but a little limited. Some new instruments are beginning to use HDF5. All these generic formats allow very general data storage, so you will need to come up with a standrdized way to represent your own data. Used well, these formats can be self-describing enough that generic tools can be very useful (e.g. display images, build histograms) but it takes some thought when designing files. Anne On 8/1/11, Christopher Barker wrote: > On 7/31/11 5:48 AM, Brian Blais wrote: >> I was wondering if there are any recommendations for formats for saving >> scientific data. > > every field has it's own standards -- I'd try to find one that is likely > to be used by folks that may care about your results. > > For Oceanographic and Atmospheric modeling data, netcdf is a good > option. I like the NetCDF4 python lib: > > http://code.google.com/p/netcdf4-python/ > > (there are others) > > For broader use, and a bit more flexibility, HDF is a good option. There > are at least two ways to use it with numpy: > > PyTables: http://www.pytables.org > > (Nice higher-level interface) > > hf5py: > http://alfven.org/wp/hdf5-for-python/ > > (a more raw HDF5 wrapper) > > There is also the npz format, built in to numpy, if you are happy with > requiring python to read the data. > > -Chris > > > I am running a simulation, which has many somewhat-indepedent parts > which have their own internal state and parameters. I've been using > pickle (gzipped) to save the entire object (which contains subobjects, > etc...), but it is getting too unwieldy and I think it is time to look > for a more robust solution. Ideally I'd like to have something where I > can call a save method on the simulation object, and it will call the > save methods on all the children, on down the line all saving into one > file. It'd also be nice if it were cross-platform, and I could depend > on the files being readable into the future for a while. >> >> Are there any good standards for this? What do you use for saving >> scientific data? >> >> >> thank you, >> >> Brian Blais >> >> >> > > > -- > Christopher Barker, Ph.D. 
> Oceanographer > > Emergency Response Division > NOAA/NOS/OR&R (206) 526-6959 voice > 7600 Sand Point Way NE (206) 526-6329 fax > Seattle, WA 98115 (206) 526-6317 main reception > > Chris.Barker at noaa.gov > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -- Sent from my mobile device From tkluck at infty.nl Mon Aug 1 16:33:28 2011 From: tkluck at infty.nl (Timo Kluck) Date: Mon, 1 Aug 2011 22:33:28 +0200 Subject: [Numpy-discussion] numpy.interp running time In-Reply-To: References: <4E3452F1.7010607@hawaii.edu> Message-ID: 2011/8/1 Timo Kluck > 2011/7/30 Eric Firing >> Maybe the thing to do is to pre-calculate if len(xp) <= len(x), or some >> such guess as to which method would be more efficient. >> > What you're suggesting is reasonable. The cutoff at len(xp) <= len(x) can distinguish between the 'refinement' case > and the 'just one value' case. I'll implement it for a start. I just submitted a patch at http://projects.scipy.org/numpy/ticket/1920 . It implements Eric's suggestion. Please review, I'll be happy to adapt it to any of your feedback. Timo From craigyk at me.com Mon Aug 1 19:20:50 2011 From: craigyk at me.com (Craig Yoshioka) Date: Mon, 01 Aug 2011 16:20:50 -0700 Subject: [Numpy-discussion] limit to number of fields in recarray Message-ID: <3D27C63E-6C4A-4908-B2E5-0CA01EDB53A2@me.com> Is there a limit to the number of fields a numpy recarray can have? I was getting a strange error about a duplicate column name, but it wasn't a duplicate. From bevan07 at gmail.com Mon Aug 1 20:43:28 2011 From: bevan07 at gmail.com (Bevan Jenkins) Date: Tue, 2 Aug 2011 00:43:28 +0000 (UTC) Subject: [Numpy-discussion] hold parameters Message-ID: Hello, I have a function that I fitting to a curve via scipy.optimize.leastsq. The function has 4 parameters and this is all working fine. For a site, I have a number of curves (n=10 in the example below). I would like to some of the parameters to be the best fit across all curves (best fit for a site) while letting the other parameters vary for each curve. I have this working as well. The issue I have is like to be able to vary this for a run. That is do a run where parameter1 is best fit for entire site, whith the remaining three varying per curve. Then on the next run, have two parameters being held or fitted for all curves at one. Or be able to do a run where all 4 parameters are fit for each individual curve. Using my e.g. below, if I change the 'fix' dict, so that 'a','b', and 'c' are True, with 'd' False, then I will have to change the zip to for a,b,c in zip(a,b,c): solve(a,b,c,d) I would prefer to find a way to do this via code. I hope this example makes sense. The code below is all within my objective function that is being called by scipy.optimize.leastsq. 
import numpy as np def solve(a,b,c,d): print a,b,c,d #return x*a*b*c*d fix = {"a":True,"b":True,"c":False,"d":False} n=10 params = np.array([0,1,2,3]*n) params = params.reshape(-1,4) if fix["a"] is True: a = params[0,0] else: a = params[:,0] if fix["b"] is True: b = params[0,1] else: b = params[:,1] if fix["c"] is True: c = params[0,2] else: c = params[:,2] if fix["d"] is True: d = params[0,3] else: d = params[:,3] res=[] for c,d in zip(c,d): res = solve(a,b,c,d) #res = solve(a,b,c,d)-self.orig #return np.hstack(res)**2 From pgmdevlist at gmail.com Tue Aug 2 02:18:53 2011 From: pgmdevlist at gmail.com (Pierre GM) Date: Tue, 2 Aug 2011 08:18:53 +0200 Subject: [Numpy-discussion] limit to number of fields in recarray In-Reply-To: <3D27C63E-6C4A-4908-B2E5-0CA01EDB53A2@me.com> References: <3D27C63E-6C4A-4908-B2E5-0CA01EDB53A2@me.com> Message-ID: <59DEC051-5161-49B2-9577-8C873A89CB3C@gmail.com> On Aug 2, 2011, at 1:20 AM, Craig Yoshioka wrote: > Is there a limit to the number of fields a numpy recarray can have? I was getting a strange error about a duplicate column name, but it wasn't a duplicate. And the error was? ? From josef.pktd at gmail.com Tue Aug 2 05:07:20 2011 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Tue, 2 Aug 2011 05:07:20 -0400 Subject: [Numpy-discussion] hold parameters In-Reply-To: References: Message-ID: On Mon, Aug 1, 2011 at 8:43 PM, Bevan Jenkins wrote: > Hello, > > I have a function that I fitting to a curve via scipy.optimize.leastsq. ?The > function has 4 parameters and this is all working fine. > > For a site, I have a number of curves (n=10 in the example below). ?I would > like to some of the parameters to be the best fit across all curves (best fit > for a site) while letting the other parameters vary for each curve. ?I have > this working as well. > > The issue I have is like to be able to vary this for a run. ?That is do a run > where parameter1 is best fit for entire site, whith the remaining three > varying per curve. Then on the next run, have two parameters being held or > fitted for all curves at one. ?Or be able to do a run where all 4 parameters > are fit for each individual curve. > > Using my e.g. below, if I change the 'fix' dict, so that 'a','b', and 'c' are > True, with 'd' False, then I will have to change the zip to > for a,b,c in zip(a,b,c): > ? ?solve(a,b,c,d) > > I would prefer to find a way to do this via code. ?I hope this example makes > sense. ?The code below is all within my objective function that is being > called by scipy.optimize.leastsq. > import numpy as np > > def solve(a,b,c,d): > ? ?print a,b,c,d > ? ?#return x*a*b*c*d > > > > fix = {"a":True,"b":True,"c":False,"d":False} > > n=10 > params = np.array([0,1,2,3]*n) > params = params.reshape(-1,4) > > if fix["a"] is True: > ? ?a = params[0,0] > else: > ? ?a = params[:,0] > if fix["b"] is True: > ? ?b = params[0,1] > else: > ? ?b = params[:,1] > if fix["c"] is True: > ? ?c = params[0,2] > else: > ? ?c = params[:,2] > if fix["d"] is True: > ? ?d = params[0,3] > else: > ? ?d = params[:,3] > > res=[] > for c,d in zip(c,d): > ? ?res = solve(a,b,c,d) > ? ?#res = solve(a,b,c,d)-self.orig > #return np.hstack(res)**2 I'm not a fan of named individual parameters for function arguments when the number of arguments varies, *args What I'm using is a full parameter array with nan's fixed = np.array([nan, nan, c, d]) #fix c,d def func(args): fixed[np.isnan(fixed)] = args a,b,c,d = fixed ... 
to set starting values allstartvals = np.array([a0, b0, c0, d0]) startvals = allstartvals[np.isnan(fixed) optimize.leastsq(func, startvals, other_args) or something like this. I find it easier to keep track of the parameters, if I just have an array or tuple. for an alternative, Travis used a different way in the scipy.stats implementation of partially fixed parameters for distributions fit with named arguments. (I don't remember the details) Josef > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From bsouthey at gmail.com Tue Aug 2 09:23:43 2011 From: bsouthey at gmail.com (Bruce Southey) Date: Tue, 02 Aug 2011 08:23:43 -0500 Subject: [Numpy-discussion] hold parameters In-Reply-To: References: Message-ID: <4E37FA5F.6090900@gmail.com> On 08/01/2011 07:43 PM, Bevan Jenkins wrote: > Hello, > > I have a function that I fitting to a curve via scipy.optimize.leastsq. The > function has 4 parameters and this is all working fine. > > For a site, I have a number of curves (n=10 in the example below). I would > like to some of the parameters to be the best fit across all curves (best fit > for a site) while letting the other parameters vary for each curve. I have > this working as well. > > The issue I have is like to be able to vary this for a run. That is do a run > where parameter1 is best fit for entire site, whith the remaining three > varying per curve. Then on the next run, have two parameters being held or > fitted for all curves at one. Or be able to do a run where all 4 parameters > are fit for each individual curve. It would really help to know what you mean by 'entire site' and 'run'. If the runs are not independent then what you are doing is incorrect. > Using my e.g. below, if I change the 'fix' dict, so that 'a','b', and 'c' are > True, with 'd' False, then I will have to change the zip to > for a,b,c in zip(a,b,c): > solve(a,b,c,d) > > I would prefer to find a way to do this via code. I hope this example makes > sense. The code below is all within my objective function that is being > called by scipy.optimize.leastsq. > import numpy as np > > def solve(a,b,c,d): > print a,b,c,d > #return x*a*b*c*d > > > > fix = {"a":True,"b":True,"c":False,"d":False} > > n=10 > params = np.array([0,1,2,3]*n) > params = params.reshape(-1,4) > > if fix["a"] is True: > a = params[0,0] > else: > a = params[:,0] > if fix["b"] is True: > b = params[0,1] > else: > b = params[:,1] > if fix["c"] is True: > c = params[0,2] > else: > c = params[:,2] > if fix["d"] is True: > d = params[0,3] > else: > d = params[:,3] > > res=[] > for c,d in zip(c,d): > res = solve(a,b,c,d) > #res = solve(a,b,c,d)-self.orig > #return np.hstack(res)**2 > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion Basically this code seems to be trying to do what an analysis of covariance would do. Analysis of covariance type of approach provides a statistical framework where you have a 'global' parameter and condition specific parameters that modify that parameter. Here is one example under SAS that fits a common slope but different intercepts due to drug level. http://support.sas.com/documentation/cdl/en/statug/63347/HTML/default/viewer.htm#statug_glm_sect050.htm That model can be extended allow for different slopes due different drug levels by fitting the interaction between both variables. 
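For instance, a rough and untested sketch (with made-up toy data) of a design matrix that fits one common slope but a separate intercept per curve via ordinary least squares:

import numpy as np

# toy data: n_curves curves, each observed at the same x values
n_curves, n_obs = 3, 20
x = np.tile(np.linspace(0.0, 1.0, n_obs), n_curves)
curve = np.repeat(np.arange(n_curves), n_obs)
y = 2.0 * x + curve + 0.1 * np.random.randn(x.size)  # common slope 2, intercepts 0, 1, 2

# design matrix: the first column carries the common slope, the remaining 0/1
# indicator columns give each curve its own intercept
X = np.column_stack([x] + [(curve == i).astype(float) for i in range(n_curves)])

coef, resid, rank, sv = np.linalg.lstsq(X, y)
print(coef)  # [common slope, intercept_0, intercept_1, intercept_2]

Per-curve slopes would then just mean adding the x-by-indicator interaction columns as well.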
You can do that easily in numpy/scipy by creating the correct 'design matrix'. The real issue is that you need a statistical measure of the model fit as well as comparison between models (or restrictions). For linear models usually likelihood (or similar measure like Bayesian information criterion) and extra-sums of squares tests are used. But these measures get more interesting in nonlinear cases. Bruce From ralf.gommers at googlemail.com Tue Aug 2 10:07:33 2011 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Tue, 2 Aug 2011 16:07:33 +0200 Subject: [Numpy-discussion] numpy.sqrt behaving differently on MacOS Lion In-Reply-To: References: <4E3073C8.8060601@noaa.gov> Message-ID: On Wed, Jul 27, 2011 at 10:33 PM, Ilan Schnell wrote: > > Please don't distribute a different numpy binary for each version of > > MacOS X. > +1 > > Maybe I should mention that I just finished testing all Python > packages in EPD under 10.7, and everything (execpt numpy.sqr > for weird complex values such as inf/nan) works fine! > In particular building C and Fortran extensions with the new LLVM > based gcc and importing them into Python (both 32 and 64-bit). > There are two MacOS builds of EPD (one 32-bit and 64-bit), they > are compiled on 10.5 using gcc 4.0.1 and then tested on 10.5, 10.6 > and 10.7. > > Good to know Ilan. It seems that the problems that so many people experienced with scipy are now solved by the new gfortran binary available from http://r.research.att.com/tools/. So it should be fine to just skip the failing tests. Apple says my computer is too old for Lion, so I need a little help here. Could you either open a ticket with the full output of numpy.test() and assign it to me, or produce a patch? Thanks, Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From jlconlin at gmail.com Tue Aug 2 10:44:26 2011 From: jlconlin at gmail.com (Jeremy Conlin) Date: Tue, 2 Aug 2011 08:44:26 -0600 Subject: [Numpy-discussion] Finding many ways to incorrectly create a numpy array. Please advice Message-ID: I am trying to create a numpy array from some text I'm reading from a file. Ideally, I'd like to create a structured array with the first element as an int and the remaining as floats. I'm currently unsuccessful in my attempts. I've copied a simple script below that shows what I've done and the wrong output. Can someone please show me what is happening? I'm using numpy version 1.5.1 under Python 2.7.1 on a Mac running Snow Leopard. 
Thanks, Jeremy import numpy l = ' 32000 7.89131E-01 8.05999E-03 3.88222E+03' tfc_dtype = numpy.dtype([('nps', 'u8'), ('t', 'f8'), ('e', 'f8'), ('fom', 'f8')]) m = numpy.fromstring(l, sep=' ') print("\nm: {}".format(m)) # Next line gives: # ValueError: don't know how to read character strings with that array type #n = numpy.fromstring(l, dtype=tfc_dtype, sep=' ') #print("\nn: {}".format(n)) words = l.split() o = numpy.array(words, dtype='f8') print("\no: {}".format(o)) # Next line(s) gives bad answer p = numpy.array(words, dtype=tfc_dtype) print("\np: {}".format(p)) nps = int(words[0]) t = float(words[1]) e = float(words[2]) fom = float(words[3]) a = [nps, t, e, fom] # Next line(s) converts int to float in first element r = numpy.array(a) print("\nr: {}".format(r)) # Next line gives: # TypeError: expected a readable buffer object # s = numpy.array(a, dtype=tfc_dtype) # print("\ns: {}".format(s)) From brett.olsen at gmail.com Tue Aug 2 11:09:18 2011 From: brett.olsen at gmail.com (Brett Olsen) Date: Tue, 2 Aug 2011 10:09:18 -0500 Subject: [Numpy-discussion] Finding many ways to incorrectly create a numpy array. Please advice In-Reply-To: References: Message-ID: On Tue, Aug 2, 2011 at 9:44 AM, Jeremy Conlin wrote: > I am trying to create a numpy array from some text I'm reading from a > file. Ideally, I'd like to create a structured array with the first > element as an int and the remaining as floats. I'm currently > unsuccessful in my attempts. I've copied a simple script below that > shows what I've done and the wrong output. Can someone please show me > what is happening? > > I'm using numpy version 1.5.1 under Python 2.7.1 on a Mac running Snow Leopard. > > Thanks, > Jeremy I'd use numpy.loadtxt: In [1]: import numpy, StringIO In [2]: l = ' 32000 7.89131E-01 8.05999E-03 3.88222E+03' In [3]: tfc_dtype = numpy.dtype([('nps', 'u8'), ('t', 'f8'), ('e', 'f8'), ('fom', 'f8')]) In [4]: input = StringIO.StringIO(l) In [5]: numpy.loadtxt(input, dtype=tfc_dtype) Out[5]: array((32000L, 0.78913100000000003, 0.0080599899999999995, 3882.2199999999998), dtype=[('nps', ' References: Message-ID: On Tue, Aug 2, 2011 at 9:09 AM, Brett Olsen wrote: > On Tue, Aug 2, 2011 at 9:44 AM, Jeremy Conlin wrote: >> I am trying to create a numpy array from some text I'm reading from a >> file. Ideally, I'd like to create a structured array with the first >> element as an int and the remaining as floats. I'm currently >> unsuccessful in my attempts. I've copied a simple script below that >> shows what I've done and the wrong output. Can someone please show me >> what is happening? >> >> I'm using numpy version 1.5.1 under Python 2.7.1 on a Mac running Snow Leopard. >> >> Thanks, >> Jeremy > > I'd use numpy.loadtxt: > > In [1]: import numpy, StringIO > > In [2]: l = ' ? ? ?32000 ?7.89131E-01 ?8.05999E-03 ?3.88222E+03' > > In [3]: tfc_dtype = numpy.dtype([('nps', 'u8'), ('t', 'f8'), ('e', > 'f8'), ('fom', 'f8')]) > > In [4]: input = StringIO.StringIO(l) > > In [5]: numpy.loadtxt(input, dtype=tfc_dtype) > Out[5]: > array((32000L, 0.78913100000000003, 0.0080599899999999995, 3882.2199999999998), > ? ? ?dtype=[('nps', ' > In [6]: input.close() > > In [7]: input = StringIO.StringIO(l) > > In [8]: numpy.loadtxt(input) > Out[8]: > array([ ?3.20000000e+04, ? 7.89131000e-01, ? 8.05999000e-03, > ? ? ? ? 3.88222000e+03]) > > In [9]: input.close() > > If you're reading from a file you can replace the StringIO objects > with file objects. Thanks, Brett. Using StringIO and numpy.loadtxt worked great. 
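For the record, reading straight from the file works the same way; a rough sketch (the file name is made up):

import numpy

tfc_dtype = numpy.dtype([('nps', 'u8'), ('t', 'f8'), ('e', 'f8'), ('fom', 'f8')])
# 'tally.txt' is a hypothetical file with one nps/t/e/fom record per line
with open('tally.txt') as fh:
    tfc = numpy.loadtxt(fh, dtype=tfc_dtype)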
I'm still curious why what I was doing didn't work. Everything I can see indicates it should work. Jeremy From thomasmarkovich at gmail.com Tue Aug 2 11:50:16 2011 From: thomasmarkovich at gmail.com (Thomas Markovich) Date: Tue, 2 Aug 2011 10:50:16 -0500 Subject: [Numpy-discussion] Segmentation Fault in Numpy.test() Message-ID: Hi All, I installed numpy from the scipy superpack on Snow Leopard with python 2.7 and it all appears to work but when I do the following, I get a segmentation fault. >>> import numpy >>> print numpy.__version__, numpy.__file__ 2.0.0.dev-b5cdaee /Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/numpy-2.0.0.dev_b5cdaee_20110710-py2.6-macosx-10.6-universal.egg/numpy/__init__.pyc >>> numpy.test() Running unit tests for numpy NumPy version 2.0.0.dev-b5cdaee NumPy is installed in /Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/numpy-2.0.0.dev_b5cdaee_20110710-py2.6-macosx-10.6-universal.egg/numpy Python version 2.7.2 (v2.7.2:8527427914a2, Jun 11 2011, 15:22:34) [GCC 4.2.1 (Apple Inc. build 5666) (dot 3)] nose version 1.1.2 ............................................................................................................................................................................................................................................................................................................................Segmentation fault thomasmarkovich:~ Thomas$ What is the best way to trouble shoot this? Do you guys have any suggestions? I have also included the core dump in this email as a pastie link. http://pastie.org/2309652 Best, Thomas -------------- next part -------------- An HTML attachment was scrubbed... URL: From shish at keba.be Tue Aug 2 12:08:35 2011 From: shish at keba.be (Olivier Delalleau) Date: Tue, 2 Aug 2011 12:08:35 -0400 Subject: [Numpy-discussion] Segmentation Fault in Numpy.test() In-Reply-To: References: Message-ID: It's a wild guess, but in the past I've had seg faults issues on Mac due to conflicting versions of Python. Do you have multiple Python installs on your Mac? -=- Olivier 2011/8/2 Thomas Markovich > Hi All, > > I installed numpy from the scipy superpack on Snow Leopard with python 2.7 > and it all appears to work but when I do the following, I get a segmentation > fault. > > >>> import numpy > >>> print numpy.__version__, numpy.__file__ > 2.0.0.dev-b5cdaee > /Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/numpy-2.0.0.dev_b5cdaee_20110710-py2.6-macosx-10.6-universal.egg/numpy/__init__.pyc > >>> numpy.test() > Running unit tests for numpy > NumPy version 2.0.0.dev-b5cdaee > NumPy is installed in > /Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/numpy-2.0.0.dev_b5cdaee_20110710-py2.6-macosx-10.6-universal.egg/numpy > Python version 2.7.2 (v2.7.2:8527427914a2, Jun 11 2011, 15:22:34) [GCC > 4.2.1 (Apple Inc. build 5666) (dot 3)] > nose version 1.1.2 > ............................................................................................................................................................................................................................................................................................................................Segmentation > fault > thomasmarkovich:~ Thomas$ > > What is the best way to trouble shoot this? Do you guys have any > suggestions? I have also included the core dump in this email as a pastie > link. 
> > http://pastie.org/2309652 > > Best, > > Thomas > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From thomasmarkovich at gmail.com Tue Aug 2 12:14:15 2011 From: thomasmarkovich at gmail.com (Thomas Markovich) Date: Tue, 2 Aug 2011 11:14:15 -0500 Subject: [Numpy-discussion] Segmentation Fault in Numpy.test() In-Reply-To: References: Message-ID: I just have the default "apple" version of python that comes with Snow Leopard (Python 2.6.1 (r261:67515, Aug 2 2010, 20:10:18)) and python 2.7 (Python 2.7.2 (v2.7.2:8527427914a2, Jun 11 2011, 15:22:34) ) installed. Should I just remove 2.7 and reinstall everything with the standard apple python? On Tue, Aug 2, 2011 at 11:08 AM, Olivier Delalleau wrote: > It's a wild guess, but in the past I've had seg faults issues on Mac due to > conflicting versions of Python. Do you have multiple Python installs on your > Mac? > > -=- Olivier > > > 2011/8/2 Thomas Markovich > >> Hi All, >> >> I installed numpy from the scipy superpack on Snow Leopard with python 2.7 >> and it all appears to work but when I do the following, I get a segmentation >> fault. >> >> >>> import numpy >> >>> print numpy.__version__, numpy.__file__ >> 2.0.0.dev-b5cdaee >> /Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/numpy-2.0.0.dev_b5cdaee_20110710-py2.6-macosx-10.6-universal.egg/numpy/__init__.pyc >> >>> numpy.test() >> Running unit tests for numpy >> NumPy version 2.0.0.dev-b5cdaee >> NumPy is installed in >> /Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/numpy-2.0.0.dev_b5cdaee_20110710-py2.6-macosx-10.6-universal.egg/numpy >> Python version 2.7.2 (v2.7.2:8527427914a2, Jun 11 2011, 15:22:34) [GCC >> 4.2.1 (Apple Inc. build 5666) (dot 3)] >> nose version 1.1.2 >> ............................................................................................................................................................................................................................................................................................................................Segmentation >> fault >> thomasmarkovich:~ Thomas$ >> >> What is the best way to trouble shoot this? Do you guys have any >> suggestions? I have also included the core dump in this email as a pastie >> link. >> >> http://pastie.org/2309652 >> >> Best, >> >> Thomas >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsouthey at gmail.com Tue Aug 2 12:27:40 2011 From: bsouthey at gmail.com (Bruce Southey) Date: Tue, 02 Aug 2011 11:27:40 -0500 Subject: [Numpy-discussion] Segmentation Fault in Numpy.test() In-Reply-To: References: Message-ID: <4E38257C.5080802@gmail.com> On 08/02/2011 11:14 AM, Thomas Markovich wrote: > I just have the default "apple" version of python that comes with Snow > Leopard (Python 2.6.1 (r261:67515, Aug 2 2010, 20:10:18)) and python > 2.7 (Python 2.7.2 (v2.7.2:8527427914a2, Jun 11 2011, 15:22:34) ) > installed. 
> > Should I just remove 2.7 and reinstall everything with the standard > apple python? > > On Tue, Aug 2, 2011 at 11:08 AM, Olivier Delalleau > wrote: > > It's a wild guess, but in the past I've had seg faults issues on > Mac due to conflicting versions of Python. Do you have multiple > Python installs on your Mac? > > -=- Olivier > > > 2011/8/2 Thomas Markovich > > > Hi All, > > I installed numpy from the scipy superpack on Snow Leopard > with python 2.7 and it all appears to work but when I do the > following, I get a segmentation fault. > > >>> import numpy > >>> print numpy.__version__, numpy.__file__ > 2.0.0.dev-b5cdaee > /Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/numpy-2.0.0.dev_b5cdaee_20110710-py2.6-macosx-10.6-universal.egg/numpy/__init__.pyc > >>> numpy.test() > Running unit tests for numpy > NumPy version 2.0.0.dev-b5cdaee > NumPy is installed in > /Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/numpy-2.0.0.dev_b5cdaee_20110710-py2.6-macosx-10.6-universal.egg/numpy > Python version 2.7.2 (v2.7.2:8527427914a2, Jun 11 2011, > 15:22:34) [GCC 4.2.1 (Apple Inc. build 5666) (dot 3)] > nose version 1.1.2 > ............................................................................................................................................................................................................................................................................................................................Segmentation > fault > thomasmarkovich:~ Thomas$ > > What is the best way to trouble shoot this? Do you guys have > any suggestions? I have also included the core dump in this > email as a pastie link. > > http://pastie.org/2309652 > > Best, > > Thomas > > Use the numpy test verbose argument ie numpy.test(verbose=10) to find which test it is causing the crash. I have no idea of the Mac but I am curious why there is a 'py2.6' in your numpy version with Python2.7. Bruce -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at googlemail.com Tue Aug 2 12:28:52 2011 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Tue, 2 Aug 2011 18:28:52 +0200 Subject: [Numpy-discussion] Segmentation Fault in Numpy.test() In-Reply-To: References: Message-ID: On Tue, Aug 2, 2011 at 6:14 PM, Thomas Markovich wrote: > I just have the default "apple" version of python that comes with Snow > Leopard (Python 2.6.1 (r261:67515, Aug 2 2010, 20:10:18)) and python 2.7 > (Python 2.7.2 (v2.7.2:8527427914a2, Jun 11 2011, 15:22:34) ) installed. > > Should I just remove 2.7 and reinstall everything with the standard apple > python? > > Did you get it from http://stronginference.com/scipy-superpack/? The info on the 10.6 installer has disappeared, but the 10.7 one is built against Apple's Python. So conflicting Pythons makes sense. Even if you find the right one, it may be worth emailing Chris to ask him to put back the info for the 10.6 installer. Ralf On Tue, Aug 2, 2011 at 11:08 AM, Olivier Delalleau wrote: > >> It's a wild guess, but in the past I've had seg faults issues on Mac due >> to conflicting versions of Python. Do you have multiple Python installs on >> your Mac? >> >> -=- Olivier >> >> >> 2011/8/2 Thomas Markovich >> >>> Hi All, >>> >>> I installed numpy from the scipy superpack on Snow Leopard with python >>> 2.7 and it all appears to work but when I do the following, I get a >>> segmentation fault. 
>>> >>> >>> import numpy >>> >>> print numpy.__version__, numpy.__file__ >>> 2.0.0.dev-b5cdaee >>> /Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/numpy-2.0.0.dev_b5cdaee_20110710-py2.6-macosx-10.6-universal.egg/numpy/__init__.pyc >>> >>> numpy.test() >>> Running unit tests for numpy >>> NumPy version 2.0.0.dev-b5cdaee >>> NumPy is installed in >>> /Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/numpy-2.0.0.dev_b5cdaee_20110710-py2.6-macosx-10.6-universal.egg/numpy >>> Python version 2.7.2 (v2.7.2:8527427914a2, Jun 11 2011, 15:22:34) [GCC >>> 4.2.1 (Apple Inc. build 5666) (dot 3)] >>> nose version 1.1.2 >>> ............................................................................................................................................................................................................................................................................................................................Segmentation >>> fault >>> thomasmarkovich:~ Thomas$ >>> >>> What is the best way to trouble shoot this? Do you guys have any >>> suggestions? I have also included the core dump in this email as a pastie >>> link. >>> >>> http://pastie.org/2309652 >>> >>> Best, >>> >>> Thomas >>> >>> _______________________________________________ >>> NumPy-Discussion mailing list >>> NumPy-Discussion at scipy.org >>> http://mail.scipy.org/mailman/listinfo/numpy-discussion >>> >>> >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From thomasmarkovich at gmail.com Tue Aug 2 12:57:21 2011 From: thomasmarkovich at gmail.com (Thomas Markovich) Date: Tue, 2 Aug 2011 11:57:21 -0500 Subject: [Numpy-discussion] Segmentation Fault in Numpy.test() In-Reply-To: References: Message-ID: It appears that uninstalling python 2.7 and installing the scipy superpack with the apple standard python removes the segfaulting behavior from numpy. Now it appears that just scipy is segfaulting at test "test_arpack.test_hermitian_modes(True, , 'F', 2, 'SM', None, 0.5, ) ... Segmentation fault" Thomas On Tue, Aug 2, 2011 at 11:28 AM, Ralf Gommers wrote: > > > On Tue, Aug 2, 2011 at 6:14 PM, Thomas Markovich < > thomasmarkovich at gmail.com> wrote: > >> I just have the default "apple" version of python that comes with Snow >> Leopard (Python 2.6.1 (r261:67515, Aug 2 2010, 20:10:18)) and python 2.7 >> (Python 2.7.2 (v2.7.2:8527427914a2, Jun 11 2011, 15:22:34) ) installed. >> >> Should I just remove 2.7 and reinstall everything with the standard apple >> python? >> >> Did you get it from http://stronginference.com/scipy-superpack/? The info > on the 10.6 installer has disappeared, but the 10.7 one is built against > Apple's Python. So conflicting Pythons makes sense. Even if you find the > right one, it may be worth emailing Chris to ask him to put back the info > for the 10.6 installer. > > Ralf > > > On Tue, Aug 2, 2011 at 11:08 AM, Olivier Delalleau wrote: >> >>> It's a wild guess, but in the past I've had seg faults issues on Mac due >>> to conflicting versions of Python. Do you have multiple Python installs on >>> your Mac? 
>>> >>> -=- Olivier >>> >>> >>> 2011/8/2 Thomas Markovich >>> >>>> Hi All, >>>> >>>> I installed numpy from the scipy superpack on Snow Leopard with python >>>> 2.7 and it all appears to work but when I do the following, I get a >>>> segmentation fault. >>>> >>>> >>> import numpy >>>> >>> print numpy.__version__, numpy.__file__ >>>> 2.0.0.dev-b5cdaee >>>> /Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/numpy-2.0.0.dev_b5cdaee_20110710-py2.6-macosx-10.6-universal.egg/numpy/__init__.pyc >>>> >>> numpy.test() >>>> Running unit tests for numpy >>>> NumPy version 2.0.0.dev-b5cdaee >>>> NumPy is installed in >>>> /Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/numpy-2.0.0.dev_b5cdaee_20110710-py2.6-macosx-10.6-universal.egg/numpy >>>> Python version 2.7.2 (v2.7.2:8527427914a2, Jun 11 2011, 15:22:34) [GCC >>>> 4.2.1 (Apple Inc. build 5666) (dot 3)] >>>> nose version 1.1.2 >>>> ............................................................................................................................................................................................................................................................................................................................Segmentation >>>> fault >>>> thomasmarkovich:~ Thomas$ >>>> >>>> What is the best way to trouble shoot this? Do you guys have any >>>> suggestions? I have also included the core dump in this email as a pastie >>>> link. >>>> >>>> http://pastie.org/2309652 >>>> >>>> Best, >>>> >>>> Thomas >>>> >>>> _______________________________________________ >>>> NumPy-Discussion mailing list >>>> NumPy-Discussion at scipy.org >>>> http://mail.scipy.org/mailman/listinfo/numpy-discussion >>>> >>>> >>> >>> _______________________________________________ >>> NumPy-Discussion mailing list >>> NumPy-Discussion at scipy.org >>> http://mail.scipy.org/mailman/listinfo/numpy-discussion >>> >>> >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at googlemail.com Tue Aug 2 13:06:37 2011 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Tue, 2 Aug 2011 19:06:37 +0200 Subject: [Numpy-discussion] Segmentation Fault in Numpy.test() In-Reply-To: References: Message-ID: On Tue, Aug 2, 2011 at 6:57 PM, Thomas Markovich wrote: > It appears that uninstalling python 2.7 and installing the scipy superpack > with the apple standard python removes the segfaulting behavior from numpy. > Now it appears that just scipy is segfaulting at test > "test_arpack.test_hermitian_modes(True, , 'F', 2, 'SM', None, > 0.5, ) ... Segmentation fault" > > That is a known problem (unfortunately hard to fix), see http://projects.scipy.org/scipy/ticket/1472 Everything else besides arpack should work fine for you. 
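If the crash gets in the way of running the rest of the suite, one possible workaround (untested here) is to let nose skip the offending module, something like:

import scipy
# pass nose's --exclude option through the test runner so the arpack tests,
# which are known to segfault on this setup, are not collected
scipy.test(extra_argv=['--exclude=test_arpack'])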
Cheers, Ralf > > > > On Tue, Aug 2, 2011 at 11:28 AM, Ralf Gommers > wrote: > >> >> >> On Tue, Aug 2, 2011 at 6:14 PM, Thomas Markovich < >> thomasmarkovich at gmail.com> wrote: >> >>> I just have the default "apple" version of python that comes with Snow >>> Leopard (Python 2.6.1 (r261:67515, Aug 2 2010, 20:10:18)) and python 2.7 >>> (Python 2.7.2 (v2.7.2:8527427914a2, Jun 11 2011, 15:22:34) ) installed. >>> >>> Should I just remove 2.7 and reinstall everything with the standard apple >>> python? >>> >>> Did you get it from http://stronginference.com/scipy-superpack/? The >> info on the 10.6 installer has disappeared, but the 10.7 one is built >> against Apple's Python. So conflicting Pythons makes sense. Even if you find >> the right one, it may be worth emailing Chris to ask him to put back the >> info for the 10.6 installer. >> >> Ralf >> >> >> On Tue, Aug 2, 2011 at 11:08 AM, Olivier Delalleau wrote: >>> >>>> It's a wild guess, but in the past I've had seg faults issues on Mac due >>>> to conflicting versions of Python. Do you have multiple Python installs on >>>> your Mac? >>>> >>>> -=- Olivier >>>> >>>> >>>> 2011/8/2 Thomas Markovich >>>> >>>>> Hi All, >>>>> >>>>> I installed numpy from the scipy superpack on Snow Leopard with python >>>>> 2.7 and it all appears to work but when I do the following, I get a >>>>> segmentation fault. >>>>> >>>>> >>> import numpy >>>>> >>> print numpy.__version__, numpy.__file__ >>>>> 2.0.0.dev-b5cdaee >>>>> /Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/numpy-2.0.0.dev_b5cdaee_20110710-py2.6-macosx-10.6-universal.egg/numpy/__init__.pyc >>>>> >>> numpy.test() >>>>> Running unit tests for numpy >>>>> NumPy version 2.0.0.dev-b5cdaee >>>>> NumPy is installed in >>>>> /Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/numpy-2.0.0.dev_b5cdaee_20110710-py2.6-macosx-10.6-universal.egg/numpy >>>>> Python version 2.7.2 (v2.7.2:8527427914a2, Jun 11 2011, 15:22:34) [GCC >>>>> 4.2.1 (Apple Inc. build 5666) (dot 3)] >>>>> nose version 1.1.2 >>>>> ............................................................................................................................................................................................................................................................................................................................Segmentation >>>>> fault >>>>> thomasmarkovich:~ Thomas$ >>>>> >>>>> What is the best way to trouble shoot this? Do you guys have any >>>>> suggestions? I have also included the core dump in this email as a pastie >>>>> link. 
>>>>> >>>>> http://pastie.org/2309652 >>>>> >>>>> Best, >>>>> >>>>> Thomas >>>>> >>>>> _______________________________________________ >>>>> NumPy-Discussion mailing list >>>>> NumPy-Discussion at scipy.org >>>>> http://mail.scipy.org/mailman/listinfo/numpy-discussion >>>>> >>>>> >>>> >>>> _______________________________________________ >>>> NumPy-Discussion mailing list >>>> NumPy-Discussion at scipy.org >>>> http://mail.scipy.org/mailman/listinfo/numpy-discussion >>>> >>>> >>> >>> _______________________________________________ >>> NumPy-Discussion mailing list >>> NumPy-Discussion at scipy.org >>> http://mail.scipy.org/mailman/listinfo/numpy-discussion >>> >>> >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From derek at astro.physik.uni-goettingen.de Tue Aug 2 13:12:12 2011 From: derek at astro.physik.uni-goettingen.de (Derek Homeier) Date: Tue, 2 Aug 2011 19:12:12 +0200 Subject: [Numpy-discussion] Segmentation Fault in Numpy.test() In-Reply-To: References: Message-ID: On 2 Aug 2011, at 18:57, Thomas Markovich wrote: > It appears that uninstalling python 2.7 and installing the scipy > superpack with the apple standard python removes the Did the superpack installer automatically install numpy to the python2.7 directory when present? Even if so, I reckon you could simply reinstall python2.7 after the numpy installation (still calling python2.6 to use numpy of course...). > segfaulting behavior from numpy. Now it appears that just scipy is > segfaulting at test "test_arpack.test_hermitian_modes(True, hermitian>, 'F', 2, 'SM', None, 0.5, 0x1043b1848>) ... Segmentation fault" Which architecture is this? Being on Snow Leopard, probably x86_46... I remember encountering similar problems on PPC, which I suspect are related to stability issues with Apple's Accelerate framework. Cheers, Derek From thomasmarkovich at gmail.com Tue Aug 2 13:14:02 2011 From: thomasmarkovich at gmail.com (Thomas Markovich) Date: Tue, 2 Aug 2011 12:14:02 -0500 Subject: [Numpy-discussion] Segmentation Fault in Numpy.test() In-Reply-To: References: Message-ID: Oh okay, that's unfortunate but I guess not unexpected. Regardless, thank you so much for all your help Ralf, Bruce, and Oliver! You guys are great. Just to recap, the issue appears to stem from using the scipy superpack with python 2.7 from python.org. This was solved by using the apple python along with the scipy superpack. Thomas On Tue, Aug 2, 2011 at 12:06 PM, Ralf Gommers wrote: > > > On Tue, Aug 2, 2011 at 6:57 PM, Thomas Markovich < > thomasmarkovich at gmail.com> wrote: > >> It appears that uninstalling python 2.7 and installing the scipy superpack >> with the apple standard python removes the segfaulting behavior from numpy. >> Now it appears that just scipy is segfaulting at test >> "test_arpack.test_hermitian_modes(True, , 'F', 2, 'SM', None, >> 0.5, ) ... Segmentation fault" >> >> That is a known problem (unfortunately hard to fix), see > http://projects.scipy.org/scipy/ticket/1472 > Everything else besides arpack should work fine for you. 
> > Cheers, > Ralf > > >> >> >> >> On Tue, Aug 2, 2011 at 11:28 AM, Ralf Gommers < >> ralf.gommers at googlemail.com> wrote: >> >>> >>> >>> On Tue, Aug 2, 2011 at 6:14 PM, Thomas Markovich < >>> thomasmarkovich at gmail.com> wrote: >>> >>>> I just have the default "apple" version of python that comes with Snow >>>> Leopard (Python 2.6.1 (r261:67515, Aug 2 2010, 20:10:18)) and python 2.7 >>>> (Python 2.7.2 (v2.7.2:8527427914a2, Jun 11 2011, 15:22:34) ) installed. >>>> >>>> Should I just remove 2.7 and reinstall everything with the standard >>>> apple python? >>>> >>>> Did you get it from http://stronginference.com/scipy-superpack/? The >>> info on the 10.6 installer has disappeared, but the 10.7 one is built >>> against Apple's Python. So conflicting Pythons makes sense. Even if you find >>> the right one, it may be worth emailing Chris to ask him to put back the >>> info for the 10.6 installer. >>> >>> Ralf >>> >>> >>> On Tue, Aug 2, 2011 at 11:08 AM, Olivier Delalleau wrote: >>>> >>>>> It's a wild guess, but in the past I've had seg faults issues on Mac >>>>> due to conflicting versions of Python. Do you have multiple Python installs >>>>> on your Mac? >>>>> >>>>> -=- Olivier >>>>> >>>>> >>>>> 2011/8/2 Thomas Markovich >>>>> >>>>>> Hi All, >>>>>> >>>>>> I installed numpy from the scipy superpack on Snow Leopard with python >>>>>> 2.7 and it all appears to work but when I do the following, I get a >>>>>> segmentation fault. >>>>>> >>>>>> >>> import numpy >>>>>> >>> print numpy.__version__, numpy.__file__ >>>>>> 2.0.0.dev-b5cdaee >>>>>> /Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/numpy-2.0.0.dev_b5cdaee_20110710-py2.6-macosx-10.6-universal.egg/numpy/__init__.pyc >>>>>> >>> numpy.test() >>>>>> Running unit tests for numpy >>>>>> NumPy version 2.0.0.dev-b5cdaee >>>>>> NumPy is installed in >>>>>> /Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/numpy-2.0.0.dev_b5cdaee_20110710-py2.6-macosx-10.6-universal.egg/numpy >>>>>> Python version 2.7.2 (v2.7.2:8527427914a2, Jun 11 2011, 15:22:34) [GCC >>>>>> 4.2.1 (Apple Inc. build 5666) (dot 3)] >>>>>> nose version 1.1.2 >>>>>> ............................................................................................................................................................................................................................................................................................................................Segmentation >>>>>> fault >>>>>> thomasmarkovich:~ Thomas$ >>>>>> >>>>>> What is the best way to trouble shoot this? Do you guys have any >>>>>> suggestions? I have also included the core dump in this email as a pastie >>>>>> link. 
>>>>>> >>>>>> http://pastie.org/2309652 >>>>>> >>>>>> Best, >>>>>> >>>>>> Thomas >>>>>> >>>>>> _______________________________________________ >>>>>> NumPy-Discussion mailing list >>>>>> NumPy-Discussion at scipy.org >>>>>> http://mail.scipy.org/mailman/listinfo/numpy-discussion >>>>>> >>>>>> >>>>> >>>>> _______________________________________________ >>>>> NumPy-Discussion mailing list >>>>> NumPy-Discussion at scipy.org >>>>> http://mail.scipy.org/mailman/listinfo/numpy-discussion >>>>> >>>>> >>>> >>>> _______________________________________________ >>>> NumPy-Discussion mailing list >>>> NumPy-Discussion at scipy.org >>>> http://mail.scipy.org/mailman/listinfo/numpy-discussion >>>> >>>> >>> >>> _______________________________________________ >>> NumPy-Discussion mailing list >>> NumPy-Discussion at scipy.org >>> http://mail.scipy.org/mailman/listinfo/numpy-discussion >>> >>> >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From Chris.Barker at noaa.gov Tue Aug 2 13:15:50 2011 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Tue, 02 Aug 2011 10:15:50 -0700 Subject: [Numpy-discussion] Finding many ways to incorrectly create a numpy array. Please advice In-Reply-To: References: Message-ID: <4E3830C6.1060902@noaa.gov> On 8/2/11 8:38 AM, Jeremy Conlin wrote: > Thanks, Brett. Using StringIO and numpy.loadtxt worked great. I'm > still curious why what I was doing didn't work. Everything I can see > indicates it should work. In [11]: tfc_dtype Out[11]: dtype([('nps', '>u8'), ('t', '>f8'), ('e', '>f8'), ('fom', '>f8')]) In [15]: n = numpy.fromstring(l, dtype=tfc_dtype, sep=' ') --------------------------------------------------------------------------- ValueError Traceback (most recent call last) /Users/cbarker/ in () ValueError: don't know how to read character strings with that array type means just what it says. In theory, numpy.fromstring() (and fromfile() ) provides a way to quickly and efficiently generate arrays from text, but it practice, the code is quite limited (and has a bug or two). I don't think anyone has gotten around to writing the code to use structured dtypes with it -- so it can't do what you want (rational though that expectation is) In [21]: words Out[21]: ['32000', '7.89131E-01', '8.05999E-03', '3.88222E+03'] In [22]: p = Display all 249 possibilities? (y or n) In [22]: p = numpy.array(words, dtype=tfc_dtype) In [23]: p Out[23]: array([(3689064028291727360L, 0.0, 0.0, 0.0), (3976177339304456517L, 4.967820413490985e-91, 0.0, 0.0), (4048226120204106053L, 4.970217431784588e-91, 0.0, 0.0), (3687946958874489413L, 1.1572189237420885e-100, 0.0, 0.0)], dtype=[('nps', '>u8'), ('t', '>f8'), ('e', '>f8'), ('fom', '>f8')]) similarly here -- converting from text to structured dtypes is not fully supported In [29]: a Out[29]: [32000, 0.789131, 0.00805999, 3882.22] In [30]: r = numpy.array(a) In [31]: r Out[31]: array([ 3.20000000e+04, 7.89131000e-01, 8.05999000e-03, 3.88222000e+03]) sure -- numpy's default behavior is to find a dtype that will hold all the input array -- this pre-dates structured dtypes, and probably what you would want b default anyway. 
In [32]: s = numpy.array(a, dtype=tfc_dtype) --------------------------------------------------------------------------- TypeError Traceback (most recent call last) /Users/cbarker/ in () TypeError: expected a readable buffer object OK -- I can see why you'd expect that to work. However, the trick with structured dtypes is that the dimensionality of the inputs can be less than obvious -- you are passing in a 1-d list of 4 numbers -- do you want a 1-d array? or ? -- in this case, it's pretty obvious (as a human) what you would want -- you have a dtype with four fields, and you're passing in four numbers, but there are so many possible combinations that numpy doesn't try to be "smart" about it. So as a rule, you need to be quite specific when working with structured dtypes. However, the default is for numpy to map tuples to dtypes, so if you pass in a tuple instead, it works: In [34]: t = tuple(a) In [35]: s = numpy.array(t, dtype=tfc_dtype) In [36]: s Out[36]: array((32000L, 0.789131, 0.00805999, 3882.22), dtype=[('nps', '>u8'), ('t', '>f8'), ('e', '>f8'), ('fom', '>f8')]) you were THIS close! -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From shish at keba.be Tue Aug 2 13:21:59 2011 From: shish at keba.be (Olivier Delalleau) Date: Tue, 2 Aug 2011 13:21:59 -0400 Subject: [Numpy-discussion] Segmentation Fault in Numpy.test() In-Reply-To: References: Message-ID: Maybe specify which scipy superpack. Your issue was probably because the superpack you installed was not meant to be used with Python 2.7. -=- Olivier 2011/8/2 Thomas Markovich > Oh okay, that's unfortunate but I guess not unexpected. Regardless, thank > you so much for all your help Ralf, Bruce, and Oliver! You guys are great. > > Just to recap, the issue appears to stem from using the scipy superpack > with python 2.7 from python.org. This was solved by using the apple python > along with the scipy superpack. > > Thomas > > > On Tue, Aug 2, 2011 at 12:06 PM, Ralf Gommers > wrote: > >> >> >> On Tue, Aug 2, 2011 at 6:57 PM, Thomas Markovich < >> thomasmarkovich at gmail.com> wrote: >> >>> It appears that uninstalling python 2.7 and installing the scipy >>> superpack with the apple standard python removes the segfaulting behavior >>> from numpy. Now it appears that just scipy is segfaulting at test >>> "test_arpack.test_hermitian_modes(True, , 'F', 2, 'SM', None, >>> 0.5, ) ... Segmentation fault" >>> >>> That is a known problem (unfortunately hard to fix), see >> http://projects.scipy.org/scipy/ticket/1472 >> Everything else besides arpack should work fine for you. >> >> Cheers, >> Ralf >> >> >>> >>> >>> >>> On Tue, Aug 2, 2011 at 11:28 AM, Ralf Gommers < >>> ralf.gommers at googlemail.com> wrote: >>> >>>> >>>> >>>> On Tue, Aug 2, 2011 at 6:14 PM, Thomas Markovich < >>>> thomasmarkovich at gmail.com> wrote: >>>> >>>>> I just have the default "apple" version of python that comes with Snow >>>>> Leopard (Python 2.6.1 (r261:67515, Aug 2 2010, 20:10:18)) and python 2.7 >>>>> (Python 2.7.2 (v2.7.2:8527427914a2, Jun 11 2011, 15:22:34) ) installed. >>>>> >>>>> Should I just remove 2.7 and reinstall everything with the standard >>>>> apple python? >>>>> >>>>> Did you get it from http://stronginference.com/scipy-superpack/? The >>>> info on the 10.6 installer has disappeared, but the 10.7 one is built >>>> against Apple's Python. So conflicting Pythons makes sense. 
Even if you find >>>> the right one, it may be worth emailing Chris to ask him to put back the >>>> info for the 10.6 installer. >>>> >>>> Ralf >>>> >>>> >>>> On Tue, Aug 2, 2011 at 11:08 AM, Olivier Delalleau wrote: >>>>> >>>>>> It's a wild guess, but in the past I've had seg faults issues on Mac >>>>>> due to conflicting versions of Python. Do you have multiple Python installs >>>>>> on your Mac? >>>>>> >>>>>> -=- Olivier >>>>>> >>>>>> >>>>>> 2011/8/2 Thomas Markovich >>>>>> >>>>>>> Hi All, >>>>>>> >>>>>>> I installed numpy from the scipy superpack on Snow Leopard with >>>>>>> python 2.7 and it all appears to work but when I do the following, I get a >>>>>>> segmentation fault. >>>>>>> >>>>>>> >>> import numpy >>>>>>> >>> print numpy.__version__, numpy.__file__ >>>>>>> 2.0.0.dev-b5cdaee >>>>>>> /Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/numpy-2.0.0.dev_b5cdaee_20110710-py2.6-macosx-10.6-universal.egg/numpy/__init__.pyc >>>>>>> >>> numpy.test() >>>>>>> Running unit tests for numpy >>>>>>> NumPy version 2.0.0.dev-b5cdaee >>>>>>> NumPy is installed in >>>>>>> /Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/numpy-2.0.0.dev_b5cdaee_20110710-py2.6-macosx-10.6-universal.egg/numpy >>>>>>> Python version 2.7.2 (v2.7.2:8527427914a2, Jun 11 2011, 15:22:34) >>>>>>> [GCC 4.2.1 (Apple Inc. build 5666) (dot 3)] >>>>>>> nose version 1.1.2 >>>>>>> ............................................................................................................................................................................................................................................................................................................................Segmentation >>>>>>> fault >>>>>>> thomasmarkovich:~ Thomas$ >>>>>>> >>>>>>> What is the best way to trouble shoot this? Do you guys have any >>>>>>> suggestions? I have also included the core dump in this email as a pastie >>>>>>> link. >>>>>>> >>>>>>> http://pastie.org/2309652 >>>>>>> >>>>>>> Best, >>>>>>> >>>>>>> Thomas >>>>>>> >>>>>>> _______________________________________________ >>>>>>> NumPy-Discussion mailing list >>>>>>> NumPy-Discussion at scipy.org >>>>>>> http://mail.scipy.org/mailman/listinfo/numpy-discussion >>>>>>> >>>>>>> >>>>>> >>>>>> _______________________________________________ >>>>>> NumPy-Discussion mailing list >>>>>> NumPy-Discussion at scipy.org >>>>>> http://mail.scipy.org/mailman/listinfo/numpy-discussion >>>>>> >>>>>> >>>>> >>>>> _______________________________________________ >>>>> NumPy-Discussion mailing list >>>>> NumPy-Discussion at scipy.org >>>>> http://mail.scipy.org/mailman/listinfo/numpy-discussion >>>>> >>>>> >>>> >>>> _______________________________________________ >>>> NumPy-Discussion mailing list >>>> NumPy-Discussion at scipy.org >>>> http://mail.scipy.org/mailman/listinfo/numpy-discussion >>>> >>>> >>> >>> _______________________________________________ >>> NumPy-Discussion mailing list >>> NumPy-Discussion at scipy.org >>> http://mail.scipy.org/mailman/listinfo/numpy-discussion >>> >>> >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... 
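One quick way to see which interpreter and which numpy build are actually being picked up -- a minimal check, handy whenever several Pythons live side by side on one machine:

import sys
import numpy

print(sys.executable)      # path of the interpreter that is actually running
print(sys.version)         # e.g. Apple's 2.6.1 vs. python.org's 2.7.2
print(numpy.__version__)
print(numpy.__file__)      # a "...py2.6..." egg sitting under a Versions/2.7 tree is the mismatch to look for

In the report quoted earlier in this thread, numpy.__file__ pointed at exactly such a py2.6 egg installed under the 2.7 framework.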
URL: From Chris.Barker at noaa.gov Tue Aug 2 13:24:15 2011 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Tue, 02 Aug 2011 10:24:15 -0700 Subject: [Numpy-discussion] Segmentation Fault in Numpy.test() In-Reply-To: References: Message-ID: <4E3832BF.1030300@noaa.gov> On 8/2/11 10:14 AM, Thomas Markovich wrote: > Just to recap, the issue appears to stem from using the scipy superpack > with python 2.7 from python.org . This was solved by > using the apple python along with the scipy superpack. This sure sounds like a bug in the sciy superpack installer -- if it was build for the system python2.6, it should not get installed into 2.7. Unless you did something to force that. -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From derek at astro.physik.uni-goettingen.de Tue Aug 2 14:19:31 2011 From: derek at astro.physik.uni-goettingen.de (Derek Homeier) Date: Tue, 2 Aug 2011 20:19:31 +0200 Subject: [Numpy-discussion] Finding many ways to incorrectly create a numpy array. Please advice In-Reply-To: <4E3830C6.1060902@noaa.gov> References: <4E3830C6.1060902@noaa.gov> Message-ID: On 2 Aug 2011, at 19:15, Christopher Barker wrote: > In [32]: s = numpy.array(a, dtype=tfc_dtype) > --------------------------------------------------------------------------- > TypeError Traceback (most recent > call last) > > /Users/cbarker/ in () > > TypeError: expected a readable buffer object > > OK -- I can see why you'd expect that to work. However, the trick with > structured dtypes is that the dimensionality of the inputs can be less > than obvious -- you are passing in a 1-d list of 4 numbers -- do you > want a 1-d array? or ? -- in this case, it's pretty obvious (as a > human) > what you would want -- you have a dtype with four fields, and you're > passing in four numbers, but there are so many possible combinations > that numpy doesn't try to be "smart" about it. So as a rule, you > need to > be quite specific when working with structured dtypes. > > However, the default is for numpy to map tuples to dtypes, so if you > pass in a tuple instead, it works: > > In [34]: t = tuple(a) > > In [35]: s = numpy.array(t, dtype=tfc_dtype) > > In [36]: s > Out[36]: > array((32000L, 0.789131, 0.00805999, 3882.22), > dtype=[('nps', '>u8'), ('t', '>f8'), ('e', '>f8'), ('fom', > '>f8')]) > > you were THIS close! Thanks for the detailed discussion! BTW this works also without explicitly converting the words one by one: In [1]: l = ' 32000 7.89131E-01 8.05999E-03 3.88222E+03' In [2]: tfc_dtype = numpy.dtype([('nps', 'u8'), ('t', 'f8'), ('e', 'f8'),('fom', 'f8')]) In [3]: numpy.array(tuple(l.split()), dtype=tfc_dtype) Out[3]: array((32000L, 0.789131, 0.00805999, 3882.22), dtype=[('nps', ' References: <3D27C63E-6C4A-4908-B2E5-0CA01EDB53A2@me.com> <59DEC051-5161-49B2-9577-8C873A89CB3C@gmail.com> Message-ID: <58F1BDB2-4767-4E4B-9E33-0E04F80C1D52@me.com> duplicate column in dtype? I just consolidated some of the columns and the error went away... none had duplicate field names... hence the question. On Aug 1, 2011, at 11:18 PM, Pierre GM wrote: > > On Aug 2, 2011, at 1:20 AM, Craig Yoshioka wrote: > >> Is there a limit to the number of fields a numpy recarray can have? I was getting a strange error about a duplicate column name, but it wasn't a duplicate. > > And the error was? ? 
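For what it's worth, a dtype with a genuinely repeated field name is rejected up front, and the field count by itself is not the limiting factor. A quick sketch (the field names here are made up):

import numpy as np

# A repeated name fails immediately, whatever the total number of fields.
try:
    np.dtype([('a', 'f8'), ('a', 'i4')])
except ValueError as err:
    print(err)

# Several hundred distinct fields build without complaint.
many = np.dtype([('f%03d' % i, 'f8') for i in range(500)])
print(len(many.names))    # 500

So a duplicate-name complaint points at the names themselves (or at automatic name cleanup somewhere upstream) rather than at the size of the record.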
> _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From jsseabold at gmail.com Tue Aug 2 15:31:06 2011 From: jsseabold at gmail.com (Skipper Seabold) Date: Tue, 2 Aug 2011 15:31:06 -0400 Subject: [Numpy-discussion] limit to number of fields in recarray In-Reply-To: <58F1BDB2-4767-4E4B-9E33-0E04F80C1D52@me.com> References: <3D27C63E-6C4A-4908-B2E5-0CA01EDB53A2@me.com> <59DEC051-5161-49B2-9577-8C873A89CB3C@gmail.com> <58F1BDB2-4767-4E4B-9E33-0E04F80C1D52@me.com> Message-ID: On Tue, Aug 2, 2011 at 3:19 PM, Craig Yoshioka wrote: > duplicate column in dtype? > "Duplicate field names given."? Can you post code to replicate? > I just consolidated some of the columns and the error went away... none had duplicate field names... hence the question. > I don't think this would be raised unless there are duplicates. There is some name changing for invalid field names that could result in a name collision. I think I've run into this before. Skipper From craigyk at me.com Tue Aug 2 16:09:59 2011 From: craigyk at me.com (Craig Yoshioka) Date: Tue, 02 Aug 2011 13:09:59 -0700 Subject: [Numpy-discussion] limit to number of fields in recarray In-Reply-To: References: <3D27C63E-6C4A-4908-B2E5-0CA01EDB53A2@me.com> <59DEC051-5161-49B2-9577-8C873A89CB3C@gmail.com> <58F1BDB2-4767-4E4B-9E33-0E04F80C1D52@me.com> Message-ID: <04693AFD-A761-45DC-8E20-074064B82F36@me.com> yup, duplicate field names given. I didn't commit the non-working version and I didn't want to mess up my working code so I tried duplicating the dtype in a new file and couldn't recreate the error. I suppose the answer to my question is, there is no limit to the number of records? Must have been an invalid name, or a different error on my part. Out of curiosity, what does recarray consider an invalid field name? On Aug 2, 2011, at 12:31 PM, Skipper Seabold wrote: > On Tue, Aug 2, 2011 at 3:19 PM, Craig Yoshioka wrote: >> duplicate column in dtype? >> > > "Duplicate field names given."? Can you post code to replicate? > >> I just consolidated some of the columns and the error went away... none had duplicate field names... hence the question. >> > > I don't think this would be raised unless there are duplicates. There > is some name changing for invalid field names that could result in a > name collision. I think I've run into this before. > > Skipper > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From jsseabold at gmail.com Tue Aug 2 16:18:01 2011 From: jsseabold at gmail.com (Skipper Seabold) Date: Tue, 2 Aug 2011 16:18:01 -0400 Subject: [Numpy-discussion] limit to number of fields in recarray In-Reply-To: <04693AFD-A761-45DC-8E20-074064B82F36@me.com> References: <3D27C63E-6C4A-4908-B2E5-0CA01EDB53A2@me.com> <59DEC051-5161-49B2-9577-8C873A89CB3C@gmail.com> <58F1BDB2-4767-4E4B-9E33-0E04F80C1D52@me.com> <04693AFD-A761-45DC-8E20-074064B82F36@me.com> Message-ID: On Tue, Aug 2, 2011 at 4:09 PM, Craig Yoshioka wrote: > yup, duplicate field names given. ?I didn't commit the non-working version and I didn't want to mess up my working code so I tried duplicating the dtype in a new file and couldn't recreate the error. ? I suppose the answer to my question is, there is no limit to the number of records? ?Must have been an invalid name, or a different error on my part. 
?Out of curiosity, what does recarray consider an invalid field name? I guess this checking is only done in genfromtxt and that's where I recall coming across it. http://docs.scipy.org/doc/numpy/user/basics.io.genfromtxt.html#validating-names Skipper From jlconlin at gmail.com Tue Aug 2 16:40:01 2011 From: jlconlin at gmail.com (Jeremy Conlin) Date: Tue, 2 Aug 2011 14:40:01 -0600 Subject: [Numpy-discussion] Finding many ways to incorrectly create a numpy array. Please advice In-Reply-To: <4E3830C6.1060902@noaa.gov> References: <4E3830C6.1060902@noaa.gov> Message-ID: On Tue, Aug 2, 2011 at 11:15 AM, Christopher Barker wrote: > On 8/2/11 8:38 AM, Jeremy Conlin wrote: >> Thanks, Brett. Using StringIO and numpy.loadtxt worked great. I'm >> still curious why what I was doing didn't work. Everything I can see >> indicates it should work. > > In [11]: tfc_dtype > Out[11]: dtype([('nps', '>u8'), ('t', '>f8'), ('e', '>f8'), ('fom', '>f8')]) > > In [15]: n = numpy.fromstring(l, dtype=tfc_dtype, sep=' ') > --------------------------------------------------------------------------- > ValueError ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?Traceback (most recent call last) > > /Users/cbarker/ in () > > ValueError: don't know how to read character strings with that array type > > means just what it says. In theory, numpy.fromstring() (and fromfile() ) > provides a way to quickly and efficiently generate arrays from text, but > it practice, the code is quite limited (and has a bug or two). I don't > think anyone has gotten around to writing the code to use structured > dtypes with it -- so it can't do what you want (rational though that > expectation is) > > In [21]: words > Out[21]: ['32000', '7.89131E-01', '8.05999E-03', '3.88222E+03'] > > In [22]: p = > Display all 249 possibilities? (y or n) > > In [22]: p = numpy.array(words, dtype=tfc_dtype) > > In [23]: p > Out[23]: > array([(3689064028291727360L, 0.0, 0.0, 0.0), > ? ? ? ?(3976177339304456517L, 4.967820413490985e-91, 0.0, 0.0), > ? ? ? ?(4048226120204106053L, 4.970217431784588e-91, 0.0, 0.0), > ? ? ? ?(3687946958874489413L, 1.1572189237420885e-100, 0.0, 0.0)], > ? ? ? dtype=[('nps', '>u8'), ('t', '>f8'), ('e', '>f8'), ('fom', '>f8')]) > > similarly here -- converting from text to structured dtypes is not fully > supported > > In [29]: a > Out[29]: [32000, 0.789131, 0.00805999, 3882.22] > > In [30]: r = numpy.array(a) > > In [31]: r > Out[31]: > array([ ?3.20000000e+04, ? 7.89131000e-01, ? 8.05999000e-03, > ? ? ? ? ?3.88222000e+03]) > > sure -- numpy's default behavior is to find a dtype that will hold all > the input array -- this pre-dates structured dtypes, and probably what > you would want b default anyway. > > In [32]: s = numpy.array(a, dtype=tfc_dtype) > --------------------------------------------------------------------------- > TypeError ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? Traceback (most recent call last) > > /Users/cbarker/ in () > > TypeError: expected a readable buffer object > > OK -- I can see why you'd expect that to work. However, the trick with > structured dtypes is that the dimensionality of the inputs can be less > than obvious -- you are passing in a 1-d list of 4 numbers -- do you > want a 1-d array? or ? -- in this case, it's pretty obvious (as a human) > what you would want -- you have a dtype with four fields, and you're > passing in four numbers, but there are so many possible combinations > that numpy doesn't try to be "smart" about it. So as a rule, you need to > be quite specific when working with structured dtypes. 
> > However, the default is for numpy to map tuples to dtypes, so if you > pass in a tuple instead, it works: > > In [34]: t = tuple(a) > > In [35]: s = numpy.array(t, dtype=tfc_dtype) > > In [36]: s > Out[36]: > array((32000L, 0.789131, 0.00805999, 3882.22), > ? ? ? dtype=[('nps', '>u8'), ('t', '>f8'), ('e', '>f8'), ('fom', '>f8')]) > > you were THIS close! > > -Chris > > > > > > > -- > Christopher Barker, Ph.D. > Oceanographer Chris, Thanks for that information. It helps greatly in understanding what is happening. Next time I'll put my data into tuples. Jeremy From fperez.net at gmail.com Wed Aug 3 03:40:46 2011 From: fperez.net at gmail.com (Fernando Perez) Date: Wed, 3 Aug 2011 00:40:46 -0700 Subject: [Numpy-discussion] [ANN] IPython 0.11 is officially out In-Reply-To: References: Message-ID: On Sun, Jul 31, 2011 at 10:19 AM, Fernando Perez wrote: > Please see our release notes for the full details on everything about > this release: https://github.com/ipython/ipython/zipball/rel-0.11 And embarrassingly, that URL was for a zip download instead (copy/paste error), the detailed release notes are here: http://ipython.org/ipython-doc/rel-0.11/whatsnew/version0.11.html Sorry about the mistake... Cheers, f From Chris.Barker at noaa.gov Wed Aug 3 11:50:11 2011 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Wed, 03 Aug 2011 08:50:11 -0700 Subject: [Numpy-discussion] Finding many ways to incorrectly create a numpy array. Please advice In-Reply-To: References: <4E3830C6.1060902@noaa.gov> Message-ID: <4E396E33.4070503@noaa.gov> On 8/2/11 1:40 PM, Jeremy Conlin wrote: > Thanks for that information. It helps greatly in understanding what is > happening. Next time I'll put my data into tuples. I don't remember where they all are, but there are a few places in numpy where tuples and lists are interpreted differently (fancy indexing?). It kind of breaks python "duck typing" (a sequence is a sequence), but it's useful, too. So when a list fails to do what you want, try a tuple. -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From kikocorreoso at gmail.com Wed Aug 3 12:30:29 2011 From: kikocorreoso at gmail.com (Kiko) Date: Wed, 3 Aug 2011 18:30:29 +0200 Subject: [Numpy-discussion] Reading a big netcdf file Message-ID: Hi. I'm trying to read a big netcdf file (445 Mb) using netcdf4-python. The data are described as: *The GEBCO gridded data set is stored in NetCDF as a one dimensional array of 2-byte signed integers that represent integer elevations in metres. The complete data set gives global coverage. It consists of 21601 x 10801 data values, one for each one minute of latitude and longitude for 233312401 points. The data start at position 90?N, 180?W and are arranged in bands of 360 degrees x 60 points/degree + 1 = 21601 values. The data range eastward from 180?W longitude to 180?E longitude, i.e. the 180? value is repeated.* The problem is that it is very slow (or I am quite newbie). Anyone has a suggestion to get these data in a numpy array in a faster way? Thanks in advance. -------------- next part -------------- An HTML attachment was scrubbed... 
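For concreteness, the straightforward read looks roughly like this -- a sketch only; 'GridOne.grd' is the file name used later in this thread, and the reshape just follows the band layout described above (one band of 21601 longitude values per minute of latitude, starting at 90N):

import netCDF4

nc = netCDF4.Dataset('GridOne.grd', 'r')
z = nc.variables['z'][:]            # pulls the whole 1-D variable into memory
grid = z.reshape(10801, 21601)      # row 0 is the 90N band, column 0 is 180W

At two bytes per value that is roughly 445 MB as int16; whether it actually comes back as int16 or gets promoted to a float array is part of what the rest of the thread turns up.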
URL: From gokhansever at gmail.com Wed Aug 3 12:46:18 2011 From: gokhansever at gmail.com (=?UTF-8?Q?G=C3=B6khan_Sever?=) Date: Wed, 3 Aug 2011 10:46:18 -0600 Subject: [Numpy-discussion] Reading a big netcdf file In-Reply-To: References: Message-ID: Here are my values for your comparison: test.nc file is about 715 MB. The details are below: In [21]: netCDF4.__version__ Out[21]: '0.9.4' In [22]: np.__version__ Out[22]: '2.0.0.dev-b233716' In [23]: from netCDF4 import Dataset In [24]: f = Dataset("test.nc") In [25]: f.variables['reflectivity'].shape Out[25]: (6, 18909, 506) In [26]: f.variables['reflectivity'].size Out[26]: 57407724 In [27]: f.variables['reflectivity'][:].dtype Out[27]: dtype('float32') In [28]: timeit z = f.variables['reflectivity'][:] 1 loops, best of 3: 731 ms per loop How long it takes in your side to read that big array? On Wed, Aug 3, 2011 at 10:30 AM, Kiko wrote: > Hi. > > I'm trying to read a big netcdf file (445 Mb) using netcdf4-python. > > The data are described as: > *The GEBCO gridded data set is stored in NetCDF as a one dimensional array > of 2-byte signed integers that represent integer elevations in metres. > The complete data set gives global coverage. It consists of 21601 x 10801 > data values, one for each one minute of latitude and longitude for 233312401 > points. > The data start at position 90?N, 180?W and are arranged in bands of 360 > degrees x 60 points/degree + 1 = 21601 values. The data range eastward from > 180?W longitude to 180?E longitude, i.e. the 180? value is repeated.* > > The problem is that it is very slow (or I am quite newbie). > > Anyone has a suggestion to get these data in a numpy array in a faster way? > > Thanks in advance. > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -- G?khan -------------- next part -------------- An HTML attachment was scrubbed... URL: From Chris.Barker at noaa.gov Wed Aug 3 12:50:50 2011 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Wed, 03 Aug 2011 09:50:50 -0700 Subject: [Numpy-discussion] Reading a big netcdf file In-Reply-To: References: Message-ID: <4E397C6A.8010101@noaa.gov> On 8/3/11 9:30 AM, Kiko wrote: > I'm trying to read a big netcdf file (445 Mb) using netcdf4-python. I've never noticed that netCDF4 was particularly slow for reading (writing can be pretty slow some times). How slow is slow? > The data are described as: please post the results of: ncdump -h the_file_name.nc So we can see if there is anything odd in the structure (though I don't know what it might be) Post your code (in the simnd pplest form you can). and post your timings and machine type Is the file netcdf4 or 3 format? (the python lib will read either) As a reference, reading that much data in from a raw file into a numpy array takes 2.57 on my machine (a rather old Mac, but disks haven't gotten much faster). YOu can test that like this: a = np.zeros((21601, 10801), dtype=np.uint16) a.tofile('temp.npa') del a timeit a = np.fromfile('temp.npa', dtype=np.uint16) (using ipython's timeit) -Chris -- Christopher Barker, Ph.D. 
Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From gokhansever at gmail.com Wed Aug 3 13:01:24 2011 From: gokhansever at gmail.com (=?UTF-8?Q?G=C3=B6khan_Sever?=) Date: Wed, 3 Aug 2011 11:01:24 -0600 Subject: [Numpy-discussion] Reading a big netcdf file In-Reply-To: References: Message-ID: Just a few extra tests on my side pushing the limits of my system memory: In [34]: k = np.zeros((21601, 10801, 3), dtype='int16') k ndarray 21601x10801x3: 699937203 elems, type `int16`, 1399874406 bytes (1335 Mb) And for the first time my memory explodes with a hard kernel crash: In [36]: k = np.zeros((21601, 10801, 13), dtype='int16') Message from syslogd at ccn at Aug 3 10:51:43 ... kernel:[48715.531155] ------------[ cut here ]------------ Message from syslogd at ccn at Aug 3 10:51:43 ... kernel:[48715.531163] invalid opcode: 0000 [#1] SMP Message from syslogd at ccn at Aug 3 10:51:43 ... kernel:[48715.531166] last sysfs file: /sys/devices/system/cpu/cpu3/cache/index2/shared_cpu_map Message from syslogd at ccn at Aug 3 10:51:43 ... kernel:[48715.531253] Stack: Message from syslogd at ccn at Aug 3 10:51:43 ... kernel:[48715.531265] Call Trace: Message from syslogd at ccn at Aug 3 10:51:43 ... kernel:[48715.531332] Code: be 33 01 00 00 48 89 fb 48 c7 c7 67 31 7a 81 e8 b0 2d f1 ff e8 90 f2 33 00 48 89 df e8 86 db 00 00 48 83 bb 60 01 00 00 00 74 02 <0f> 0b 48 8b 83 10 02 00 00 a8 20 75 02 0f 0b a8 40 74 02 0f 0b On Wed, Aug 3, 2011 at 10:46 AM, G?khan Sever wrote: > Here are my values for your comparison: > > test.nc file is about 715 MB. The details are below: > > In [21]: netCDF4.__version__ > Out[21]: '0.9.4' > > In [22]: np.__version__ > Out[22]: '2.0.0.dev-b233716' > > In [23]: from netCDF4 import Dataset > > In [24]: f = Dataset("test.nc") > > In [25]: f.variables['reflectivity'].shape > Out[25]: (6, 18909, 506) > > In [26]: f.variables['reflectivity'].size > Out[26]: 57407724 > > In [27]: f.variables['reflectivity'][:].dtype > Out[27]: dtype('float32') > > In [28]: timeit z = f.variables['reflectivity'][:] > 1 loops, best of 3: 731 ms per loop > > How long it takes in your side to read that big array? > > On Wed, Aug 3, 2011 at 10:30 AM, Kiko wrote: > >> Hi. >> >> I'm trying to read a big netcdf file (445 Mb) using netcdf4-python. >> >> The data are described as: >> *The GEBCO gridded data set is stored in NetCDF as a one dimensional >> array of 2-byte signed integers that represent integer elevations in metres. >> >> The complete data set gives global coverage. It consists of 21601 x 10801 >> data values, one for each one minute of latitude and longitude for 233312401 >> points. >> The data start at position 90?N, 180?W and are arranged in bands of 360 >> degrees x 60 points/degree + 1 = 21601 values. The data range eastward from >> 180?W longitude to 180?E longitude, i.e. the 180? value is repeated.* >> >> The problem is that it is very slow (or I am quite newbie). >> >> Anyone has a suggestion to get these data in a numpy array in a faster >> way? >> >> Thanks in advance. >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> > > > -- > G?khan > -- G?khan -------------- next part -------------- An HTML attachment was scrubbed... 
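A cheap sanity check before an allocation like that is to work out the byte count first instead of letting np.zeros discover it the hard way -- a small sketch:

import numpy as np

shape = (21601, 10801, 13)
nbytes = np.prod(shape, dtype=np.int64) * np.dtype('int16').itemsize
print('%.1f GB' % (nbytes / 1e9))   # ~6.1 GB for the case that crashed above

How gracefully the system then refuses an oversized request is another matter, as the syslog messages above show.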
URL: From ijstokes at hkl.hms.harvard.edu Wed Aug 3 14:09:07 2011 From: ijstokes at hkl.hms.harvard.edu (Ian Stokes-Rees) Date: Wed, 03 Aug 2011 14:09:07 -0400 Subject: [Numpy-discussion] Reading a big netcdf file In-Reply-To: <4E397C6A.8010101@noaa.gov> References: <4E397C6A.8010101@noaa.gov> Message-ID: <4E398EC3.4080800@hkl.hms.harvard.edu> On 8/3/11 12:50 PM, Christopher Barker wrote: > As a reference, reading that much data in from a raw file into a numpy > array takes 2.57 on my machine (a rather old Mac, but disks haven't > gotten much faster). 2.57 seconds? or minutes? If seconds, does it actually read the whole thing into memory in that time, or is there some kind of delayed read going on? Ian -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: ijstokes.vcf Type: text/x-vcard Size: 380 bytes Desc: not available URL: From Chris.Barker at noaa.gov Wed Aug 3 14:38:08 2011 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Wed, 03 Aug 2011 11:38:08 -0700 Subject: [Numpy-discussion] Reading a big netcdf file In-Reply-To: <4E398EC3.4080800@hkl.hms.harvard.edu> References: <4E397C6A.8010101@noaa.gov> <4E398EC3.4080800@hkl.hms.harvard.edu> Message-ID: <4E399590.3020203@noaa.gov> On 8/3/11 11:09 AM, Ian Stokes-Rees wrote: > On 8/3/11 12:50 PM, Christopher Barker wrote: >> As a reference, reading that much data in from a raw file into a numpy >> array takes 2.57 on my machine (a rather old Mac, but disks haven't >> gotten much faster). > > 2.57 seconds? or minutes? sorry -- seconds. >If seconds, does it actually read the whole > thing into memory in that time, or is there some kind of delayed read > going on? I think it reads it all in. However, now that you bring it up, I think "timeit" does it a few times, and after the first time, there may well be disk cache that speeds things up. In fact, as I recently wrote the file, there may be disk cache issues even on the first read. I'm no timing expert, but there must be ways to get a clean time. -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From Chris.Barker at noaa.gov Wed Aug 3 14:40:08 2011 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Wed, 03 Aug 2011 11:40:08 -0700 Subject: [Numpy-discussion] Reading a big netcdf file In-Reply-To: References: Message-ID: <4E399608.6000005@noaa.gov> On 8/3/11 9:46 AM, G?khan Sever wrote: > In [23]: from netCDF4 import Dataset > > In [24]: f = Dataset("test.nc ") > > In [25]: f.variables['reflectivity'].shape > Out[25]: (6, 18909, 506) > > In [26]: f.variables['reflectivity'].size > Out[26]: 57407724 > > In [27]: f.variables['reflectivity'][:].dtype > Out[27]: dtype('float32') > > In [28]: timeit z = f.variables['reflectivity'][:] > 1 loops, best of 3: 731 ms per loop that seems pretty fast, actually -- are you sure that [:] forces the full data read? It probably does, but I'm not totally sure. is "z" a numpy array object at that point? -Chris -- Christopher Barker, Ph.D. 
Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From gokhansever at gmail.com Wed Aug 3 16:57:19 2011 From: gokhansever at gmail.com (=?UTF-8?Q?G=C3=B6khan_Sever?=) Date: Wed, 3 Aug 2011 14:57:19 -0600 Subject: [Numpy-discussion] Reading a big netcdf file In-Reply-To: <4E397C6A.8010101@noaa.gov> References: <4E397C6A.8010101@noaa.gov> Message-ID: This is what I get here: In [1]: a = np.zeros((21601, 10801), dtype=np.uint16) In [2]: a.tofile('temp.npa') In [3]: del a In [4]: timeit a = np.fromfile('temp.npa', dtype=np.uint16) 1 loops, best of 3: 251 ms per loop On Wed, Aug 3, 2011 at 10:50 AM, Christopher Barker wrote: > On 8/3/11 9:30 AM, Kiko wrote: > > I'm trying to read a big netcdf file (445 Mb) using netcdf4-python. > > I've never noticed that netCDF4 was particularly slow for reading > (writing can be pretty slow some times). How slow is slow? > > > The data are described as: > > please post the results of: > > ncdump -h the_file_name.nc > > So we can see if there is anything odd in the structure (though I don't > know what it might be) > > Post your code (in the simnd pplest form you can). > > and post your timings and machine type > > Is the file netcdf4 or 3 format? (the python lib will read either) > > As a reference, reading that much data in from a raw file into a numpy > array takes 2.57 on my machine (a rather old Mac, but disks haven't > gotten much faster). YOu can test that like this: > > a = np.zeros((21601, 10801), dtype=np.uint16) > > a.tofile('temp.npa') > > del a > > timeit a = np.fromfile('temp.npa', dtype=np.uint16) > > (using ipython's timeit) > > -Chris > > > > -- > Christopher Barker, Ph.D. > Oceanographer > > Emergency Response Division > NOAA/NOS/OR&R (206) 526-6959 voice > 7600 Sand Point Way NE (206) 526-6329 fax > Seattle, WA 98115 (206) 526-6317 main reception > > Chris.Barker at noaa.gov > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -- G?khan -------------- next part -------------- An HTML attachment was scrubbed... URL: From gokhansever at gmail.com Wed Aug 3 17:02:28 2011 From: gokhansever at gmail.com (=?UTF-8?Q?G=C3=B6khan_Sever?=) Date: Wed, 3 Aug 2011 15:02:28 -0600 Subject: [Numpy-discussion] Reading a big netcdf file In-Reply-To: <4E399608.6000005@noaa.gov> References: <4E399608.6000005@noaa.gov> Message-ID: I think these answer your questions. 
In [3]: type f.variables['reflectivity'] ------> type(f.variables['reflectivity']) Out[3]: In [4]: type f.variables['reflectivity'][:] ------> type(f.variables['reflectivity'][:]) Out[4]: In [5]: z = f.variables['reflectivity'][:] In [6]: type z ------> type(z) Out[6]: In [10]: id f.variables['reflectivity'][:] -------> id(f.variables['reflectivity'][:]) Out[10]: 37895488 In [11]: id z -------> id(z) Out[11]: 37901440 On Wed, Aug 3, 2011 at 12:40 PM, Christopher Barker wrote: > On 8/3/11 9:46 AM, G?khan Sever wrote: > > In [23]: from netCDF4 import Dataset > > > > In [24]: f = Dataset("test.nc ") > > > > In [25]: f.variables['reflectivity'].shape > > Out[25]: (6, 18909, 506) > > > > In [26]: f.variables['reflectivity'].size > > Out[26]: 57407724 > > > > In [27]: f.variables['reflectivity'][:].dtype > > Out[27]: dtype('float32') > > > > In [28]: timeit z = f.variables['reflectivity'][:] > > 1 loops, best of 3: 731 ms per loop > > that seems pretty fast, actually -- are you sure that [:] forces the > full data read? It probably does, but I'm not totally sure. > > is "z" a numpy array object at that point? > > -Chris > > > -- > Christopher Barker, Ph.D. > Oceanographer > > Emergency Response Division > NOAA/NOS/OR&R (206) 526-6959 voice > 7600 Sand Point Way NE (206) 526-6329 fax > Seattle, WA 98115 (206) 526-6317 main reception > > Chris.Barker at noaa.gov > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -- G?khan -------------- next part -------------- An HTML attachment was scrubbed... URL: From Chris.Barker at noaa.gov Wed Aug 3 17:15:06 2011 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Wed, 03 Aug 2011 14:15:06 -0700 Subject: [Numpy-discussion] Reading a big netcdf file In-Reply-To: References: <4E397C6A.8010101@noaa.gov> Message-ID: <4E39BA5A.5070806@noaa.gov> On 8/3/11 1:57 PM, G?khan Sever wrote: > This is what I get here: > > In [1]: a = np.zeros((21601, 10801), dtype=np.uint16) > > In [2]: a.tofile('temp.npa') > > In [3]: del a > > In [4]: timeit a = np.fromfile('temp.npa', dtype=np.uint16) > 1 loops, best of 3: 251 ms per loop so that's about 10 times faster than my machine. I didn't think disks had gotten much faster -- they are still generally 7200 rpm (or slower in laptops). So I've either got a really slow disk, or you have a really fast one (or both), or maybe you're getting cache effect, as you wrote the file just before reading it. repeating, doing just what you did: In [8]: timeit a = np.fromfile('temp.npa', dtype=np.uint16) 1 loops, best of 3: 2.53 s per loop then I wrote a bunch of others to disk, and tried again: In [17]: timeit a = np.fromfile('temp.npa', dtype=np.uint16) 1 loops, best of 3: 2.45 s per loop so ti seems I'm not seeing cache effects, but maybe you are. Anyway, we haven't heard from the OP -- I'm not sure what s/he thought was slow. -Chris > > On Wed, Aug 3, 2011 at 10:50 AM, Christopher Barker > > wrote: > > On 8/3/11 9:30 AM, Kiko wrote: > > I'm trying to read a big netcdf file (445 Mb) using netcdf4-python. > > I've never noticed that netCDF4 was particularly slow for reading > (writing can be pretty slow some times). How slow is slow? > > > The data are described as: > > please post the results of: > > ncdump -h the_file_name.nc > > So we can see if there is anything odd in the structure (though I don't > know what it might be) > > Post your code (in the simnd pplest form you can). 
> > and post your timings and machine type > > Is the file netcdf4 or 3 format? (the python lib will read either) > > As a reference, reading that much data in from a raw file into a numpy > array takes 2.57 on my machine (a rather old Mac, but disks haven't > gotten much faster). YOu can test that like this: > > a = np.zeros((21601, 10801), dtype=np.uint16) > > a.tofile('temp.npa') > > del a > > timeit a = np.fromfile('temp.npa', dtype=np.uint16) > > (using ipython's timeit) > > -Chris > > > > -- > Christopher Barker, Ph.D. > Oceanographer > > Emergency Response Division > NOAA/NOS/OR&R (206) 526-6959 voice > 7600 Sand Point Way NE (206) 526-6329 fax > Seattle, WA 98115 (206) 526-6317 main > reception > > Chris.Barker at noaa.gov > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > > -- > G?khan > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From gokhansever at gmail.com Wed Aug 3 17:24:33 2011 From: gokhansever at gmail.com (=?UTF-8?Q?G=C3=B6khan_Sever?=) Date: Wed, 3 Aug 2011 15:24:33 -0600 Subject: [Numpy-discussion] Reading a big netcdf file In-Reply-To: <4E39BA5A.5070806@noaa.gov> References: <4E397C6A.8010101@noaa.gov> <4E39BA5A.5070806@noaa.gov> Message-ID: On Wed, Aug 3, 2011 at 3:15 PM, Christopher Barker wrote: > On 8/3/11 1:57 PM, G?khan Sever wrote: > > This is what I get here: > > > > In [1]: a = np.zeros((21601, 10801), dtype=np.uint16) > > > > In [2]: a.tofile('temp.npa') > > > > In [3]: del a > > > > In [4]: timeit a = np.fromfile('temp.npa', dtype=np.uint16) > > 1 loops, best of 3: 251 ms per loop > > so that's about 10 times faster than my machine. I didn't think disks > had gotten much faster -- they are still generally 7200 rpm (or slower > in laptops). > > So I've either got a really slow disk, or you have a really fast one (or > both), or maybe you're getting cache effect, as you wrote the file just > before reading it. > > repeating, doing just what you did: > > In [8]: timeit a = np.fromfile('temp.npa', dtype=np.uint16) > 1 loops, best of 3: 2.53 s per loop > > then I wrote a bunch of others to disk, and tried again: > > In [17]: timeit a = np.fromfile('temp.npa', dtype=np.uint16) > 1 loops, best of 3: 2.45 s per loop > > so ti seems I'm not seeing cache effects, but maybe you are. > > Anyway, we haven't heard from the OP -- I'm not sure what s/he thought > was slow. > > -Chris In [11]: a = np.zeros((21601, 10801), dtype=np.uint16) In [12]: a.tofile('temp.npa') In [13]: del a Quitting here and restarting IPython. (this should cut the caching effect isn't it?) I[1]: timeit a = np.fromfile('temp.npa', dtype=np.uint16) 1 loops, best of 3: 263 ms per loop #More information about my system: hdparm -I /dev/sda | grep Rotation Nominal Media Rotation Rate: 7200 uname -a #64-bit Fedora 14 Linux ccn 2.6.35.13-92.fc14.x86_64 #1 Filesystem(s) ext4 -- G?khan -------------- next part -------------- An HTML attachment was scrubbed... 
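Restarting the interpreter does not change much here, by the way: the fast re-reads come from the operating system's page cache, which lives in the kernel, not in the Python process. A single-shot timing helper, as a sketch -- the cache still has to be dropped between runs to get a true cold-read number:

import time
import numpy as np

def timed_read(path, dtype=np.uint16):
    # For a genuinely cold read, drop the page cache first (Linux, as root):
    #   sync; echo 3 > /proc/sys/vm/drop_caches
    t0 = time.time()
    data = np.fromfile(path, dtype=dtype)
    print('%.0f MB in %.2f s' % (data.nbytes / 1e6, time.time() - t0))
    return data

# data = timed_read('temp.npa')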
URL: From warren.weckesser at enthought.com Wed Aug 3 18:02:16 2011 From: warren.weckesser at enthought.com (Warren Weckesser) Date: Wed, 3 Aug 2011 17:02:16 -0500 Subject: [Numpy-discussion] Reading a big netcdf file In-Reply-To: References: <4E397C6A.8010101@noaa.gov> <4E39BA5A.5070806@noaa.gov> Message-ID: On Wed, Aug 3, 2011 at 4:24 PM, G?khan Sever wrote: > > > On Wed, Aug 3, 2011 at 3:15 PM, Christopher Barker > wrote: >> >> On 8/3/11 1:57 PM, G?khan Sever wrote: >> > This is what I get here: >> > >> > In [1]: a = np.zeros((21601, 10801), dtype=np.uint16) >> > >> > In [2]: a.tofile('temp.npa') >> > >> > In [3]: del a >> > >> > In [4]: timeit a = np.fromfile('temp.npa', dtype=np.uint16) >> > 1 loops, best of 3: 251 ms per loop >> >> so that's about 10 times faster than my machine. I didn't think disks >> had gotten much faster -- they are still generally 7200 rpm (or slower >> in laptops). >> >> So I've either got a really slow disk, or you have a really fast one (or >> both), or maybe you're getting cache effect, as you wrote the file just >> before reading it. >> >> repeating, doing just what you did: >> >> In [8]: timeit a = np.fromfile('temp.npa', dtype=np.uint16) >> 1 loops, best of 3: 2.53 s per loop >> >> then I wrote a bunch of others to disk, and tried again: >> >> In [17]: timeit a = np.fromfile('temp.npa', dtype=np.uint16) >> 1 loops, best of 3: 2.45 s per loop >> >> so ti seems I'm not seeing cache effects, but maybe you are. >> >> Anyway, we haven't heard from the OP -- I'm not sure what s/he thought >> was slow. >> >> -Chris > > In [11]: a = np.zeros((21601, 10801), dtype=np.uint16) > In [12]: a.tofile('temp.npa') > In [13]: del a > Quitting here and restarting IPython. (this should cut the caching effect > isn't it?) Not necessarily. In Linux, this should do it: $ sync; echo 3 > /proc/sys/vm/drop_caches (Run as root, or use sudo.) Google for something like "linux reset disk cache" to find other variations. Warren > I[1]: timeit a = np.fromfile('temp.npa', dtype=np.uint16) > 1 loops, best of 3: 263 ms per loop > #More information about my system: > hdparm -I /dev/sda | grep Rotation > Nominal Media Rotation Rate: 7200 > uname -a ?#64-bit Fedora 14 > Linux ccn 2.6.35.13-92.fc14.x86_64 #1 > Filesystem(s) ext4 > -- > G?khan > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > From efiring at hawaii.edu Wed Aug 3 18:52:12 2011 From: efiring at hawaii.edu (Eric Firing) Date: Wed, 03 Aug 2011 12:52:12 -1000 Subject: [Numpy-discussion] Reading a big netcdf file In-Reply-To: References: <4E397C6A.8010101@noaa.gov> <4E39BA5A.5070806@noaa.gov> Message-ID: <4E39D11C.7010706@hawaii.edu> On 08/03/2011 11:24 AM, G?khan Sever wrote: > I[1]: timeit a = np.fromfile('temp.npa', dtype=np.uint16) > 1 loops, best of 3: 263 ms per loop You need to clear your cache and then run timeit with options "-n1 -r1". Eric From gokhansever at gmail.com Wed Aug 3 18:56:04 2011 From: gokhansever at gmail.com (=?UTF-8?Q?G=C3=B6khan_Sever?=) Date: Wed, 3 Aug 2011 16:56:04 -0600 Subject: [Numpy-discussion] Reading a big netcdf file In-Reply-To: <4E39D11C.7010706@hawaii.edu> References: <4E397C6A.8010101@noaa.gov> <4E39BA5A.5070806@noaa.gov> <4E39D11C.7010706@hawaii.edu> Message-ID: Back to the reality. 
After clearing the cache using Warren's suggestion: In [1]: timeit -n1 -r1 a = np.fromfile('temp.npa', dtype=np.uint16) 1 loops, best of 1: 7.23 s per loop On Wed, Aug 3, 2011 at 4:52 PM, Eric Firing wrote: > On 08/03/2011 11:24 AM, G?khan Sever wrote: > > > I[1]: timeit a = np.fromfile('temp.npa', dtype=np.uint16) > > 1 loops, best of 3: 263 ms per loop > > You need to clear your cache and then run timeit with options "-n1 -r1". > > Eric > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -- G?khan -------------- next part -------------- An HTML attachment was scrubbed... URL: From kikocorreoso at gmail.com Thu Aug 4 06:46:55 2011 From: kikocorreoso at gmail.com (Kiko) Date: Thu, 4 Aug 2011 12:46:55 +0200 Subject: [Numpy-discussion] Reading a big netcdf file Message-ID: Hi, all. Thank you very much for your replies. I am obtaining some issues. If I use netcdf4-python or scipy.io.netcdf libraries: In [4]: import netCDF4 as n4 In [5]: from scipy.io import netcdf as nS In [6]: import numpy as np In [7]: gebco4 = n4.Dataset('GridOne.grd', 'r') In [8]: gebcoS = nS.netcdf_file('GridOne.grd', 'r') Now, if a do: In [9]: z4 = gebco4.variables['z'] I got no problems and I have: In [14]: type(z4); z4.shape; z4.size Out[14]: Out[14]: (233312401,) Out[14]: 233312401 But if I do: In [15]: z4 = gebco4.variables['z'][:] ------------------------------------------------------------ Traceback (most recent call last): File "", line 1, in File "netCDF4.pyx", line 2466, in netCDF4.Variable.__getitem__ (netCDF4.c:22943) File "C:\Python26\lib\site-packages\netCDF4_utils.py", line 278, in _StartCountStride n = len(range(beg,end,inc)) MemoryError I got a memory error. But if a select a smaller array I've got: In [16]: z4 = gebco4.variables['z'][:10000000] In [17]: type(z4); z4.shape; z4.size Out[17]: Out[17]: (10000000,) Out[17]: 10000000 What's the difference between z4 as a netCDF4.Variable and as a numpy.ndarray? Now, if I use scipy.io.netcdf: In [18]: zS = gebcoS.variables['z'] In [20]: type(zS); zS.shape Out[20]: Out[20]: (233312401,) In [21]: zS = gebcoS.variables['z'][:] In [22]: type(zS); zS.shape Out[22]: Out[22]: (233312401,) What's the difference between zS as a scipy.io.netcdf.netcdf_variable and as a numpy.ndarray? Why with scipy.io.netcdf I do not have a MemoryError? Finally, if I do the following (maybe it's a silly thing do this) using Eric suggestions to clear the cache: In [32]: zS = gebcoS.variables['z'] In [38]: timeit -n1 -r1 zSS = np.array(zS[:100000000]) # 100.000.000 out of 233.312.401 because I've got a MemoryError 1 loops, best of 1: 73.1 s per loop (If I use a copy, timeit -n1 -r1 zSS = np.array(zS[:100000000], copy=True), I get a MemoryError and I have to set the size to 50.000.000 but it's quite fast). Than you very much for your replies and excuse me if some questions are very basic. Best regards. *********************************************************************** The results of ncdump -h netcdf GridOne { dimensions: side = 2 ; xysize = 233312401 ; variables: double x_range(side) ; x_range:units = "user_x_unit" ; double y_range(side) ; y_range:units = "user_y_unit" ; short z_range(side) ; z_range:units = "user_z_unit" ; double spacing(side) ; short dimension(side) ; short z(xysize) ; z:scale_factor = 1. ; z:add_offset = 0. 
; z:node_offset = 0 ; // global attributes: :title = "GEBCO One Minute Grid" ; :source = "1.02" ; } The file is publicly available from: http://www.gebco.net/data_and_products/gridded_bathymetry_data/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From jswhit at fastmail.fm Thu Aug 4 11:53:03 2011 From: jswhit at fastmail.fm (Jeff Whitaker) Date: Thu, 04 Aug 2011 09:53:03 -0600 Subject: [Numpy-discussion] Reading a big netcdf file In-Reply-To: References: Message-ID: <4E3AC05F.2040006@fastmail.fm> On 8/4/11 4:46 AM, Kiko wrote: > Hi, all. > > Thank you very much for your replies. > > I am obtaining some issues. If I use netcdf4-python or scipy.io.netcdf > libraries: > > In [4]: import netCDF4 as n4 > In [5]: from scipy.io import netcdf as nS > In [6]: import numpy as np > In [7]: gebco4 = n4.Dataset('GridOne.grd', 'r') > In [8]: gebcoS = nS.netcdf_file('GridOne.grd', 'r') > > Now, if a do: > > In [9]: z4 = gebco4.variables['z'] > > I got no problems and I have: > > In [14]: type(z4); z4.shape; z4.size > Out[14]: > Out[14]: (233312401,) > Out[14]: 233312401 > > But if I do: > > In [15]: z4 = gebco4.variables['z'][:] > ------------------------------------------------------------ > Traceback (most recent call last): > File "", line 1, in > File "netCDF4.pyx", line 2466, in netCDF4.Variable.__getitem__ > (netCDF4.c:22943) > File "C:\Python26\lib\site-packages\netCDF4_utils.py", line 278, in > _StartCountStride > n = len(range(beg,end,inc)) > MemoryError > > I got a memory error. Kiko: I think the difference may be that when you read the data with netcdf4-python, it tries to unpack the short integers to a float32 array, thereby using much more memory (more than you have available). scipy.io.netcdf is just returning you a numpy array of short integers. I bet if you do gebco4.set_automaskandscale(False) before reading the data from the getco4 variable, it will work, since this turns off the auto conversion to float32. You'll have to do the conversion manually then, at which point you will may run out of memory anyway. > But if a select a smaller array I've got: > > In [16]: z4 = gebco4.variables['z'][:10000000] > In [17]: type(z4); z4.shape; z4.size > Out[17]: > Out[17]: (10000000,) > Out[17]: 10000000 > > What's the difference between z4 as a netCDF4.Variable and as a > numpy.ndarray? the netcdf variable object just refers to the data in the file - only when you slice the object is the data read in and converted to a numpy array. -Jeff > > Now, if I use scipy.io.netcdf: > > In [18]: zS = gebcoS.variables['z'] > In [20]: type(zS); zS.shape > Out[20]: > Out[20]: (233312401,) > > In [21]: zS = gebcoS.variables['z'][:] > In [22]: type(zS); zS.shape > Out[22]: > Out[22]: (233312401,) > > What's the difference between zS as a scipy.io.netcdf.netcdf_variable > and as a numpy.ndarray? > Why with scipy.io.netcdf I do not have a MemoryError? > > Finally, if I do the following (maybe it's a silly thing do this) > using Eric suggestions to clear the cache: > > In [32]: zS = gebcoS.variables['z'] > In [38]: timeit -n1 -r1 zSS = np.array(zS[:100000000]) # 100.000.000 > out of 233.312.401 because I've got a MemoryError > 1 loops, best of 1: 73.1 s per loop > > (If I use a copy, timeit -n1 -r1 zSS = np.array(zS[:100000000], > copy=True), I get a MemoryError and I have to set the size to > 50.000.000 but it's quite fast). > > Than you very much for your replies and excuse me if some questions > are very basic. > > Best regards. 
> > *********************************************************************** > The results of ncdump -h > netcdf GridOne { > dimensions: > side = 2 ; > xysize = 233312401 ; > variables: > double x_range(side) ; > x_range:units = "user_x_unit" ; > double y_range(side) ; > y_range:units = "user_y_unit" ; > short z_range(side) ; > z_range:units = "user_z_unit" ; > double spacing(side) ; > short dimension(side) ; > short z(xysize) ; > z:scale_factor = 1. ; > z:add_offset = 0. ; > z:node_offset = 0 ; > > // global attributes: > :title = "GEBCO One Minute Grid" ; > :source = "1.02" ; > } > > The file is publicly available from: > http://www.gebco.net/data_and_products/gridded_bathymetry_data/ > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- An HTML attachment was scrubbed... URL: From Chris.Barker at noaa.gov Thu Aug 4 13:02:19 2011 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Thu, 04 Aug 2011 10:02:19 -0700 Subject: [Numpy-discussion] Reading a big netcdf file In-Reply-To: References: Message-ID: <4E3AD09B.6020406@noaa.gov> On 8/4/11 3:46 AM, Kiko wrote: > In [9]: z4 = gebco4.variables['z'] > > I got no problems and I have: > > In [14]: type(z4); z4.shape; z4.size > Out[14]: > Out[14]: (233312401,) > Out[14]: 233312401 > > But if I do: > > In [15]: z4 = gebco4.variables['z'][:] > MemoryError > What's the difference between z4 as a netCDF4.Variable and as a > numpy.ndarray? a netCDF4.Variable is an object that holds the properties of the variable, but does not actually load the dat from the file into memory until it is needed, so, it doesn't matter how big the data is at this point. > The results of ncdump -h ... > short z_range(side) ; > z_range:units = "user_z_unit" ; On 8/4/11 8:53 AM, Jeff Whitaker wrote: > Kiko: I think the difference may be that when you read the data with > netcdf4-python, it tries to unpack the short integers to a float32 > array. Jeff, why is that? is it an netcdf4 convention? I always thought that the netcdf data model matched numpy's quite well, including the clear choice and specification of data type. I guess I've mostly used float data anyway, so hadn't noticed this, but ti comes as a surprise to me! > gebco4.set_automaskandscale(False) > before reading the data from the getco4 variable, it will work, since > this turns off the auto conversion to float32. Thanks -- I'll have to remember that. > You'll have to do the conversion manually then, at which point you will > may run out of memory anyway. why would you have to do the conversion at all? (OK, you may, depending on your use case, but for the most part, data stored in a file as an integer type would be suitable for use in an integer array) -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From Chris.Barker at noaa.gov Thu Aug 4 13:04:16 2011 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Thu, 04 Aug 2011 10:04:16 -0700 Subject: [Numpy-discussion] Reading a big netcdf file In-Reply-To: References: <4E397C6A.8010101@noaa.gov> <4E39BA5A.5070806@noaa.gov> <4E39D11C.7010706@hawaii.edu> Message-ID: <4E3AD110.7000608@noaa.gov> On 8/3/11 3:56 PM, G?khan Sever wrote: > Back to the reality. 
After clearing the cache using Warren's suggestion: > > In [1]: timeit -n1 -r1 a = np.fromfile('temp.npa', dtype=np.uint16) > 1 loops, best of 1: 7.23 s per loop yup -- that cache sure can be handy! -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From Chris.Barker at noaa.gov Thu Aug 4 15:26:29 2011 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Thu, 04 Aug 2011 12:26:29 -0700 Subject: [Numpy-discussion] Reading a big netcdf file In-Reply-To: <4E3AD09B.6020406@noaa.gov> References: <4E3AD09B.6020406@noaa.gov> Message-ID: <4E3AF265.9020802@noaa.gov> On 8/4/11 10:02 AM, Christopher Barker wrote: > On 8/4/11 8:53 AM, Jeff Whitaker wrote: >> Kiko: I think the difference may be that when you read the data with >> netcdf4-python, it tries to unpack the short integers to a float32 >> array. > > Jeff, why is that? is it an netcdf4 convention? I always thought that > the netcdf data model matched numpy's quite well, including the clear > choice and specification of data type. I guess I've mostly used float > data anyway, so hadn't noticed this, but ti comes as a surprise to me! > > > gebco4.set_automaskandscale(False) OK -- looked at this a bit more, and see in the OP's ncdump: variables: short z(xysize) ; z:scale_factor = 1. ; z:add_offset = 0. ; z:node_offset = 0 ; so I presume netCDF4 is seeing the scale_factor and offsets, and thus converting to float. In this case, the scale factor is 1.0, and the offsets are 0.0, so there isn't any need to convert, but that may be too smart! -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From derek at astro.physik.uni-goettingen.de Thu Aug 4 19:08:51 2011 From: derek at astro.physik.uni-goettingen.de (Derek Homeier) Date: Fri, 5 Aug 2011 01:08:51 +0200 Subject: [Numpy-discussion] longlong format error with Python <= 2.6 in scalartypes.c Message-ID: <8D5A8864-6827-4164-B8F6-198000B7491D@astro.physik.uni-goettingen.de> Hi, commits c15a807e and c135371e (thus most immediately addressed to Mark, but I am sending this to the list hoping for more insight on the issue) introduce a test failure with Python 2.5+2.6 on Mac: FAIL: test_timedelta_scalar_construction (test_datetime.TestDateTime) ---------------------------------------------------------------------- Traceback (most recent call last): File "/Users/derek/lib/python2.6/site-packages/numpy/core/tests/test_datetime.py", line 219, in test_timedelta_scalar_construction assert_equal(str(np.timedelta64(3, 's')), '3 seconds') File "/Users/derek/lib/python2.6/site-packages/numpy/testing/utils.py", line 313, in assert_equal raise AssertionError(msg) AssertionError: Items are not equal: ACTUAL: '%lld seconds' DESIRED: '3 seconds' due to the "lld" format passed to PyUString_FromFormat in scalartypes.c. In the current npy_common.h I found the comment * in Python 2.6 the %lld formatter is not supported. In this * case we work around the problem by using the %zd formatter. though I did not notice that problem when I cleaned up the NPY_LONGLONG_FMT definitions in that file (and it is not entirely clear whether the comment only pertains to Windows...). 
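A quick, purely illustrative way to exercise the failure on an affected
interpreter is below -- the extra values and units are arbitrary, and it
assumes a current dev build where timedelta64 accepts a unit argument:

import numpy as np
# On an affected Python <= 2.6 build the raw "%lld" specifier leaks
# through into the string instead of the formatted number.
for n, unit in [(3, 's'), (10, 'm'), (7, 'D')]:
    s = str(np.timedelta64(n, unit))
    assert '%' not in s, "format specifier leaked into the repr: %r" % s
    print(s)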
Anyway changing the formatters in scalartypes.c to "zd" as well removes the failure and still works with Python 2.7 and 3.2 (at least on Mac OS). However I am wondering if a) NPY_[U]LONGLONG_FMT should also be defined conditional to the Python version (and if "%zu" is a valid formatter), and b) scalartypes.c should use NPY_LONGLONG_FMT from npy_common.h I am attaching a patch implementing a), but only the quick and dirty solution to b). Cheers, Derek -------------- next part -------------- A non-text attachment was scrubbed... Name: npy_longlong_fmt.patch Type: application/octet-stream Size: 2151 bytes Desc: not available URL: From morph at debian.org Fri Aug 5 19:37:56 2011 From: morph at debian.org (Sandro Tosi) Date: Sat, 6 Aug 2011 01:37:56 +0200 Subject: [Numpy-discussion] Error building numpy (1.5.1 and 1.6.1rc3) with python2.7 debug In-Reply-To: References: Message-ID: Hello, On Sat, Jul 16, 2011 at 22:45, Bruce Southey wrote: > On Sat, Jul 16, 2011 at 4:34 AM, Sandro Tosi wrote: >> Hello, >> while preparing a test upload for 1.6.1rc3 in Debian, I noticed that >> it gets an error when building blas with python 2.7 in the debug >> flavor, the build log is at [1]. It's also been confirmed it fails >> also with 1.5.1 [2] >> >> [1] http://people.debian.org/~morph/python-numpy_1.6.1~rc3-1_amd64.build >> [2] http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=634012 >> >> I think it might be a toolchain change in Debian (since 1.5.1 was >> built successfully and now it fails), but could you please give me a >> hand in debugging the issue? >> >> Thanks in advance, >> -- >> Sandro Tosi (aka morph, morpheus, matrixhasu) >> My website: http://matrixhasu.altervista.org/ >> Me at Debian: http://wiki.debian.org/SandroTosi >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> > Hi, > What do you mean by 'python2.7 debug'? > > Numpy 1.6.1rc's and earlier build and install with Python 2.7 build in > debug mode ($ ./configure --with-pydebug > ) on 64-bit Fedora 14 and 15. But, if I can follow you build process > (should be the plain 'python setup.py build' to be useful) I think > numpy is not finding the correct blas/lapack/atlas libraries so either > you may need a site.cfg for that system or install those in the Linux > standard locations such as /usr/lib64. > > You should probably try building without blas, lapack and atlas etc.: > BLAS=None LAPACK=None ATLAS=None python setup.py build It's not a matter of not finding the headers: the same build process succeeds if run using gfortran-4.5 while fails if run with gfortran-4.6 , it's likely that gcc is more strict now and something needs to be adapted in numpy. Has someone successfully built numpy with gcc 4.6 ? 
Cheers, -- Sandro Tosi (aka morph, morpheus, matrixhasu) My website: http://matrixhasu.altervista.org/ Me at Debian: http://wiki.debian.org/SandroTosi From charlesr.harris at gmail.com Sat Aug 6 00:25:04 2011 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri, 5 Aug 2011 22:25:04 -0600 Subject: [Numpy-discussion] Error building numpy (1.5.1 and 1.6.1rc3) with python2.7 debug In-Reply-To: References: Message-ID: On Fri, Aug 5, 2011 at 5:37 PM, Sandro Tosi wrote: > Hello, > > On Sat, Jul 16, 2011 at 22:45, Bruce Southey wrote: > > On Sat, Jul 16, 2011 at 4:34 AM, Sandro Tosi wrote: > >> Hello, > >> while preparing a test upload for 1.6.1rc3 in Debian, I noticed that > >> it gets an error when building blas with python 2.7 in the debug > >> flavor, the build log is at [1]. It's also been confirmed it fails > >> also with 1.5.1 [2] > >> > >> [1] > http://people.debian.org/~morph/python-numpy_1.6.1~rc3-1_amd64.build > >> [2] http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=634012 > >> > >> I think it might be a toolchain change in Debian (since 1.5.1 was > >> built successfully and now it fails), but could you please give me a > >> hand in debugging the issue? > >> > >> Thanks in advance, > >> -- > >> Sandro Tosi (aka morph, morpheus, matrixhasu) > >> My website: http://matrixhasu.altervista.org/ > >> Me at Debian: http://wiki.debian.org/SandroTosi > >> _______________________________________________ > >> NumPy-Discussion mailing list > >> NumPy-Discussion at scipy.org > >> http://mail.scipy.org/mailman/listinfo/numpy-discussion > >> > > Hi, > > What do you mean by 'python2.7 debug'? > > > > Numpy 1.6.1rc's and earlier build and install with Python 2.7 build in > > debug mode ($ ./configure --with-pydebug > > ) on 64-bit Fedora 14 and 15. But, if I can follow you build process > > (should be the plain 'python setup.py build' to be useful) I think > > numpy is not finding the correct blas/lapack/atlas libraries so either > > you may need a site.cfg for that system or install those in the Linux > > standard locations such as /usr/lib64. > > > > You should probably try building without blas, lapack and atlas etc.: > > BLAS=None LAPACK=None ATLAS=None python setup.py build > > It's not a matter of not finding the headers: the same build process > succeeds if run using gfortran-4.5 while fails if run with > gfortran-4.6 , it's likely that gcc is more strict now and something > needs to be adapted in numpy. > > Has someone successfully built numpy with gcc 4.6 ? > > Yes, all the time ;) gcc version 4.6.0 20110603 (Red Hat 4.6.0-10) (GCC) Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From d.s.seljebotn at astro.uio.no Sat Aug 6 05:18:53 2011 From: d.s.seljebotn at astro.uio.no (Dag Sverre Seljebotn) Date: Sat, 06 Aug 2011 11:18:53 +0200 Subject: [Numpy-discussion] [ANN] Cython 0.15 Message-ID: <4E3D06FD.9030701@astro.uio.no> We are excited to announce the release of Cython 0.15, which is a huge step forward in achieving full Python language coverage as well as many new features, optimizations, and bugfixes. Download: http://cython.org/ or http://pypi.python.org/pypi/Cython == Major Features == * Generators (yield) - Cython has full support for generators, generator expressions and coroutines. http://www.python.org/dev/peps/pep-0342/ * The nonlocal keyword is supported. * Re-acquiring the gil: with gil - works as expected within a nogil context. 
* OpenMP support: http://docs.cython.org/0.15/src/userguide/parallelism.html * Control flow analysis prunes dead code and emits warnings and errors about uninitialised variables. * Debugger command cy set to assign values of expressions to Cython variables and cy exec counterpart cy_eval(). * Exception chaining http://www.python.org/dev/peps/pep-3134/ * Relative imports http://www.python.org/dev/peps/pep-0328/ * The with statement has its own dedicated and faster C implementation. * Improved pure syntax including cython.cclass, cython.cfunc, and cython.ccall. http://docs.cython.org/0.15/src/tutorial/pure.html * Support for del. * Boundschecking directives implemented for builtin Python sequence types. * Several updates and additions to the shipped standard library pxd files https://github.com/cython/cython/tree/master/Cython/Includes * Forward declaration of types is no longer required for circular references. Note: this will be the last release to support Python 2.3; Python 2.4 will be supported for at least one more release. == General improvements and bug fixes == This release contains over a thousand commits including hundreds of bugfixes and optimizations. The bug tracker has not been as heavily used this release cycle, but is still useful http://trac.cython.org/cython_trac/query?status=closed&group=component&order=id&col=id&col=summary&col=milestone&col=status&col=type&col=priority&col=owner&col=component&milestone=0.15&desc=1 == Incompatible changes == * Uninitialized variables are no longer initialized to None and accessing them has the same semantics as standard Python. * globals() now returns a read-only dict of the Cython module's globals, rather than the globals of the first non-Cython module in the stack * Many C++ exceptions are now special cases to give closer Python counterparts. This means that except+ functions that formally raised generic RuntimeErrors may raise something else such as ArithmaticError. == Known regressions == * The inlined generator expressions (introduced in Cython 0.13) were disabled in favour of full generator expression support. This induces a performance regression for cases that were previously inlined. == Contributors == Many thanks to: Francesc Alted, Haoyu Bai, Stefan Behnel, Robert Bradshaw, Lars Buitinck, Lisandro Dalcin, John Ehresman, Mark Florisson, Christoph Gohlke, Jason Grout, Chris Lasher, Vitja Makarov, Brent Pedersen, Dag Sverre Seljebotn, Nathaniel Smith, and Pauli Virtanen From morph at debian.org Sat Aug 6 18:43:41 2011 From: morph at debian.org (Sandro Tosi) Date: Sun, 7 Aug 2011 00:43:41 +0200 Subject: [Numpy-discussion] Error building numpy (1.5.1 and 1.6.1rc3) with python2.7 debug In-Reply-To: References: Message-ID: On Sat, Aug 6, 2011 at 06:25, Charles R Harris wrote: > Yes, all the time ;) > > gcc version 4.6.0 20110603 (Red Hat 4.6.0-10) (GCC) Great, in fact it turned out it was a debian tool that went nuts :) (I was able to build _doblas by hand, so it's just a matter of configuration) The situation is this: - until recently, we had this command in our makefile: 'unexport LDFLAGS' that removes any presence of the variable LDFLAGS from the environment. 
- recently, a Debian-specific tool, started adding LDFLAGS (and other build variables) to the env, in a way no more controllable by the makefile - with that variable set, gfortran misses a '-shared' option and it generates the error I mentioned in the original email - I'm following the path to ask for that tool to be made more flexible, so to allow to "unset" those variables, but maybe I can workaround it patching the code (I know, I hate to diverge from upstream, but in extreme situations...), so I'd like to ask your guidance in thise :) it's probably something numpy/distutils/fcompiler/ but additional clues would be awesome :) Cheers, -- Sandro Tosi (aka morph, morpheus, matrixhasu) My website: http://matrixhasu.altervista.org/ Me at Debian: http://wiki.debian.org/SandroTosi From sturla at molden.no Sat Aug 6 22:09:01 2011 From: sturla at molden.no (Sturla Molden) Date: Sun, 07 Aug 2011 04:09:01 +0200 Subject: [Numpy-discussion] [ANN] Cython 0.15 In-Reply-To: <4E3D06FD.9030701@astro.uio.no> References: <4E3D06FD.9030701@astro.uio.no> Message-ID: <4E3DF3BD.5000703@molden.no> Den 06.08.2011 11:18, skrev Dag Sverre Seljebotn: > We are excited to announce the release of Cython 0.15, which is a huge > step forward in achieving full Python language coverage as well as > many new features, optimizations, and bugfixes. > > This is really great. With Cython progressing like this, I might soon have written my last line of Fortran. :-) I'm finally getting over the post-traumatic stress from writing Matlab MEX files ;-) Sturla From derek at astro.physik.uni-goettingen.de Sun Aug 7 15:58:27 2011 From: derek at astro.physik.uni-goettingen.de (Derek Homeier) Date: Sun, 7 Aug 2011 21:58:27 +0200 Subject: [Numpy-discussion] [ANN] Cython 0.15 In-Reply-To: <4E3DF3BD.5000703@molden.no> References: <4E3D06FD.9030701@astro.uio.no> <4E3DF3BD.5000703@molden.no> Message-ID: <3A86B0A1-B6E6-4146-A5B1-626112AD7E47@astro.physik.uni-goettingen.de> On 7 Aug 2011, at 04:09, Sturla Molden wrote: > Den 06.08.2011 11:18, skrev Dag Sverre Seljebotn: >> We are excited to announce the release of Cython 0.15, which is a huge >> step forward in achieving full Python language coverage as well as >> many new features, optimizations, and bugfixes. >> >> > > This is really great. With Cython progressing like this, I might soon > have written my last line of Fortran. :-) +1 (except the bit about writing Fortran, probably ;-) I am only getting 4 errors with Python 3.1 + 3.2 (Mac OS X 10.6/x86_64): compiling (cpp) and running numpy_bufacc_T155, numpy_cimport, numpy_parallel, numpy_test... I could not find much documentation about the runtests.py script (like how to figure out the exact gcc command used), but I am happy to send more details wherever requested. Adding a '-v' flag prints the following additional info: numpy_bufacc_T155.c: In function ?PyInit_numpy_bufacc_T155?: numpy_bufacc_T155.c:3652: warning: ?return? with no value, in function returning non-void .numpy_bufacc_T155.cpp: In function ?PyObject* PyInit_numpy_bufacc_T155()?: numpy_bufacc_T155.cpp:3652: error: return-statement with no value, in function returning ?PyObject*? Enumpy_cimport.c: In function ?PyInit_numpy_cimport?: numpy_cimport.c:3327: warning: ?return? with no value, in function returning non-void .numpy_cimport.cpp: In function ?PyObject* PyInit_numpy_cimport()?: numpy_cimport.cpp:3327: error: return-statement with no value, in function returning ?PyObject*? Enumpy_parallel.c: In function ?PyInit_numpy_parallel?: numpy_parallel.c:3824: warning: ?return? 
with no value, in function returning non-void .numpy_parallel.cpp: In function ?PyObject* PyInit_numpy_parallel()?: numpy_parallel.cpp:3824: error: return-statement with no value, in function returning ?PyObject*? Enumpy_test.c: In function ?PyInit_numpy_test?: numpy_test.c:11611: warning: ?return? with no value, in function returning non-void .numpy_test.cpp: In function ?PyObject* PyInit_numpy_test()?: numpy_test.cpp:11611: error: return-statement with no value, in function returning ?PyObject*? This happens with numpy 1.5.1, 1.6.0, 1.6.1 or git master installed, With Python 2.5-2.7 all 5536 tests are passing! Cheers, Derek From paul.anton.letnes at gmail.com Sun Aug 7 16:11:38 2011 From: paul.anton.letnes at gmail.com (Paul Anton Letnes) Date: Sun, 7 Aug 2011 22:11:38 +0200 Subject: [Numpy-discussion] numpy.savetxt Ticket 1573 - suggested fix Message-ID: <0F3DE703-0C85-49BF-9FA9-6C4E33F377C3@gmail.com> (A pull request has been submitted on github, but I'm posting here so people can discuss the user interface issues.) As of now, the fmt= kwarg kan be (for complex dtype): a) a single specifier, fmt='%.4e', resulting in numbers formatted like ' (%s+%sj)' % (fmt, fmt) b) a full string specifying every real and imaginary part, e.g. ' %.4e %+.4j' * 3 for 3 columns c) a list of specifiers, one per column - in this case, the real and imaginary part must have separate specifiers, e.g. ['%.3e + %.3ej', '(%.15e%+.15ej)'] It would be good if people could air their opinion as to whether this is what they would expect from savetxt behavior for real (float) numbers. Ticket link: http://projects.scipy.org/numpy/ticket/1573 Cheers, Paul From paul.anton.letnes at gmail.com Sun Aug 7 16:31:27 2011 From: paul.anton.letnes at gmail.com (Paul Anton Letnes) Date: Sun, 7 Aug 2011 22:31:27 +0200 Subject: [Numpy-discussion] [ANN] Cython 0.15 In-Reply-To: <4E3D06FD.9030701@astro.uio.no> References: <4E3D06FD.9030701@astro.uio.no> Message-ID: Looks like you have done some great work! I've been using f2py in the past, but I always liked the idea of cython - gradually wrapping more and more code as the need arises. I read somewhere that fortran wrapping with cython was coming - dare I ask what the status on this is? Is it a goal for cython to support easy fortran wrapping at all? Keep up the good work! Paul On 6. aug. 2011, at 11.18, Dag Sverre Seljebotn wrote: > We are excited to announce the release of Cython 0.15, which is a huge > step forward in achieving full Python language coverage as well as > many new features, optimizations, and bugfixes. > > Download: http://cython.org/ or http://pypi.python.org/pypi/Cython > > == Major Features == > > * Generators (yield) - Cython has full support for generators, > generator expressions and coroutines. > http://www.python.org/dev/peps/pep-0342/ > > * The nonlocal keyword is supported. > > * Re-acquiring the gil: with gil - works as expected within a nogil > context. > > * OpenMP support: > http://docs.cython.org/0.15/src/userguide/parallelism.html > > * Control flow analysis prunes dead code and emits warnings and > errors about uninitialised variables. > > * Debugger command cy set to assign values of expressions to Cython > variables and cy exec counterpart cy_eval(). > > * Exception chaining http://www.python.org/dev/peps/pep-3134/ > > * Relative imports http://www.python.org/dev/peps/pep-0328/ > > * The with statement has its own dedicated and faster C > implementation. > > * Improved pure syntax including cython.cclass, cython.cfunc, and > cython.ccall. 
http://docs.cython.org/0.15/src/tutorial/pure.html > > * Support for del. > > * Boundschecking directives implemented for builtin Python sequence > types. > > * Several updates and additions to the shipped standard library pxd > files https://github.com/cython/cython/tree/master/Cython/Includes > > * Forward declaration of types is no longer required for circular > references. > > Note: this will be the last release to support Python 2.3; Python 2.4 > will be supported for at least one more release. > > == General improvements and bug fixes == > > This release contains over a thousand commits including hundreds of > bugfixes and optimizations. The bug tracker has not been as heavily > used this release cycle, but is still useful > http://trac.cython.org/cython_trac/query?status=closed&group=component&order=id&col=id&col=summary&col=milestone&col=status&col=type&col=priority&col=owner&col=component&milestone=0.15&desc=1 > > == Incompatible changes == > > * Uninitialized variables are no longer initialized to None and > accessing them has the same semantics as standard Python. > > * globals() now returns a read-only dict of the Cython module's > globals, rather than the globals > of the first non-Cython module in the stack > > * Many C++ exceptions are now special cases to give closer Python > counterparts. This means that except+ functions that formally raised > generic RuntimeErrors may raise something else such as > ArithmaticError. > > == Known regressions == > > * The inlined generator expressions (introduced in Cython 0.13) were > disabled in favour of full generator expression support. This induces > a performance regression for cases that were previously inlined. > > == Contributors == > > Many thanks to: > > Francesc Alted, > Haoyu Bai, > Stefan Behnel, > Robert Bradshaw, > Lars Buitinck, > Lisandro Dalcin, > John Ehresman, > Mark Florisson, > Christoph Gohlke, > Jason Grout, > Chris Lasher, > Vitja Makarov, > Brent Pedersen, > Dag Sverre Seljebotn, > Nathaniel Smith, > and Pauli Virtanen > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From d.s.seljebotn at astro.uio.no Sun Aug 7 17:24:42 2011 From: d.s.seljebotn at astro.uio.no (Dag Sverre Seljebotn) Date: Sun, 07 Aug 2011 23:24:42 +0200 Subject: [Numpy-discussion] [ANN] Cython 0.15 In-Reply-To: References: <4E3D06FD.9030701@astro.uio.no> Message-ID: <4E3F029A.9010201@astro.uio.no> On 08/07/2011 10:31 PM, Paul Anton Letnes wrote: > Looks like you have done some great work! I've been using f2py in the past, but I always liked the idea of cython - gradually wrapping more and more code as the need arises. I read somewhere that fortran wrapping with cython was coming - dare I ask what the status on this is? Is it a goal for cython to support easy fortran wrapping at all? Fwrap scans Fortran sources and generate a Cython wrapper around a iso_c_binding Fortran 2003 wrapper around your Fortran code. Which is a bit more portable than f2py in theory, although it's pretty much the same in practice currently. It doesn't work for all Fortran code, but I think it works for what f2py does and then some more. The big difference is that it allows you to sidestep Python boxing of arguments when calling from Cython. In addition to the main website (use Google) there's been quite a lot more work on it my Github: https://github.com/dagss/fwrap that's not released. 
I'd like to continue on Fwrap but there's always 2-3 items higher on the priority list. I can't tell you yet whether the project will survive. But anyway, this is the way Fortran+Cython is supported. Dag Sverre From derek at astro.physik.uni-goettingen.de Sun Aug 7 17:26:57 2011 From: derek at astro.physik.uni-goettingen.de (Derek Homeier) Date: Sun, 7 Aug 2011 23:26:57 +0200 Subject: [Numpy-discussion] [ANN] Cython 0.15 In-Reply-To: References: <4E3D06FD.9030701@astro.uio.no> Message-ID: <2923CFF4-306F-4E15-9DD9-EBA9706BB598@astro.physik.uni-goettingen.de> On 7 Aug 2011, at 22:31, Paul Anton Letnes wrote: > Looks like you have done some great work! I've been using f2py in the past, but I always liked the idea of cython - gradually wrapping more and more code as the need arises. I read somewhere that fortran wrapping with cython was coming - dare I ask what the status on this is? Is it a goal for cython to support easy fortran wrapping at all? > Don't know if there is one besides fwrap, but http://pypi.python.org/pypi/fwrap/0.1.1 builds and tests OK on python 2.[5-7]. So I am bound to continue my Fortran writing... > Keep up the good work! Absolutely agreed! Derek From d.s.seljebotn at astro.uio.no Sun Aug 7 17:27:05 2011 From: d.s.seljebotn at astro.uio.no (Dag Sverre Seljebotn) Date: Sun, 07 Aug 2011 23:27:05 +0200 Subject: [Numpy-discussion] [ANN] Cython 0.15 In-Reply-To: <3A86B0A1-B6E6-4146-A5B1-626112AD7E47@astro.physik.uni-goettingen.de> References: <4E3D06FD.9030701@astro.uio.no> <4E3DF3BD.5000703@molden.no> <3A86B0A1-B6E6-4146-A5B1-626112AD7E47@astro.physik.uni-goettingen.de> Message-ID: <4E3F0329.9090606@astro.uio.no> On 08/07/2011 09:58 PM, Derek Homeier wrote: > On 7 Aug 2011, at 04:09, Sturla Molden wrote: > >> Den 06.08.2011 11:18, skrev Dag Sverre Seljebotn: >>> We are excited to announce the release of Cython 0.15, which is a huge >>> step forward in achieving full Python language coverage as well as >>> many new features, optimizations, and bugfixes. >>> >>> >> >> This is really great. With Cython progressing like this, I might soon >> have written my last line of Fortran. :-) > > +1 (except the bit about writing Fortran, probably ;-) > > I am only getting 4 errors with Python 3.1 + 3.2 (Mac OS X 10.6/x86_64): > compiling (cpp) and running numpy_bufacc_T155, numpy_cimport, numpy_parallel, numpy_test... > I could not find much documentation about the runtests.py script (like how to figure out the exact gcc command used), but I am happy to send more details wherever requested. Adding a '-v' flag prints the following additional info: > > numpy_bufacc_T155.c: In function ?PyInit_numpy_bufacc_T155?: > numpy_bufacc_T155.c:3652: warning: ?return? with no value, in function returning non-void > .numpy_bufacc_T155.cpp: In function ?PyObject* PyInit_numpy_bufacc_T155()?: > numpy_bufacc_T155.cpp:3652: error: return-statement with no value, in function returning ?PyObject*? > Enumpy_cimport.c: In function ?PyInit_numpy_cimport?: > numpy_cimport.c:3327: warning: ?return? with no value, in function returning non-void > .numpy_cimport.cpp: In function ?PyObject* PyInit_numpy_cimport()?: > numpy_cimport.cpp:3327: error: return-statement with no value, in function returning ?PyObject*? > Enumpy_parallel.c: In function ?PyInit_numpy_parallel?: > numpy_parallel.c:3824: warning: ?return? 
with no value, in function returning non-void > .numpy_parallel.cpp: In function ?PyObject* PyInit_numpy_parallel()?: > numpy_parallel.cpp:3824: error: return-statement with no value, in function returning ?PyObject*? > Enumpy_test.c: In function ?PyInit_numpy_test?: > numpy_test.c:11611: warning: ?return? with no value, in function returning non-void > .numpy_test.cpp: In function ?PyObject* PyInit_numpy_test()?: > numpy_test.cpp:11611: error: return-statement with no value, in function returning ?PyObject*? > > This happens with numpy 1.5.1, 1.6.0, 1.6.1 or git master installed, > With Python 2.5-2.7 all 5536 tests are passing! I believe this is http://projects.scipy.org/numpy/ticket/1919 Can you confirm? I don't think there's anything we can do on the Cython end to fix this, if the report is correct. Dag Sverre From derek at astro.physik.uni-goettingen.de Sun Aug 7 19:35:26 2011 From: derek at astro.physik.uni-goettingen.de (Derek Homeier) Date: Mon, 8 Aug 2011 01:35:26 +0200 Subject: [Numpy-discussion] [ANN] Cython 0.15 In-Reply-To: <4E3F0329.9090606@astro.uio.no> References: <4E3D06FD.9030701@astro.uio.no> <4E3DF3BD.5000703@molden.no> <3A86B0A1-B6E6-4146-A5B1-626112AD7E47@astro.physik.uni-goettingen.de> <4E3F0329.9090606@astro.uio.no> Message-ID: <0EDEE61F-FA30-4E74-BCDA-5FAB6196FFF4@astro.physik.uni-goettingen.de> On 7 Aug 2011, at 23:27, Dag Sverre Seljebotn wrote: >> Enumpy_test.c: In function ?PyInit_numpy_test?: >> numpy_test.c:11611: warning: ?return? with no value, in function returning non-void >> .numpy_test.cpp: In function ?PyObject* PyInit_numpy_test()?: >> numpy_test.cpp:11611: error: return-statement with no value, in function returning ?PyObject*? >> >> This happens with numpy 1.5.1, 1.6.0, 1.6.1 or git master installed, >> With Python 2.5-2.7 all 5536 tests are passing! > > I believe this is http://projects.scipy.org/numpy/ticket/1919 > > Can you confirm? > > I don't think there's anything we can do on the Cython end to fix this, > if the report is correct. Yes, the proposed patch fixes the errors! I have added a comment to the ticket, hopefully this can be merged soon. Cheers, Derek From seb.haase at gmail.com Mon Aug 8 04:21:57 2011 From: seb.haase at gmail.com (Sebastian Haase) Date: Mon, 8 Aug 2011 10:21:57 +0200 Subject: [Numpy-discussion] [ANN] Cython 0.15 In-Reply-To: <4E3F029A.9010201@astro.uio.no> References: <4E3D06FD.9030701@astro.uio.no> <4E3F029A.9010201@astro.uio.no> Message-ID: On Sun, Aug 7, 2011 at 11:24 PM, Dag Sverre Seljebotn wrote: > On 08/07/2011 10:31 PM, Paul Anton Letnes wrote: >> Looks like you have done some great work! I've been using f2py in the past, but I always liked the idea of cython - gradually wrapping more and more code as the need arises. I read somewhere that fortran wrapping with cython was coming - dare I ask what the status on this is? Is it a goal for cython to support easy fortran wrapping at all? > > Fwrap scans Fortran sources and generate a Cython wrapper around a > iso_c_binding Fortran 2003 wrapper around your Fortran code. Which is a > bit more portable than f2py in theory, although it's pretty much the > same in practice currently. > > It doesn't work for all Fortran code, but I think it works for what f2py > does and then some more. > > The big difference is that it allows you to sidestep Python boxing of > arguments when calling from Cython. > > In addition to the main website (use Google) there's been quite a lot > more work on it my Github: > > https://github.com/dagss/fwrap > > that's not released. 
I'd like to continue on Fwrap but there's always > 2-3 items higher on the priority list. I can't tell you yet whether the > project will survive. > > But anyway, this is the way Fortran+Cython is supported. > > Dag Sverre Hi, Not to hijack this thread .... First, also my congratulations to making such great progress with such a great project ! a) Is there anything that would parse a C/C++ header file and generate Cython wrapper code for it ? b) What is the status of supporting multi-type Cython functions -- ala C++ templates ? This would be one of my top ranked favorites, since I like writing simple algorithms (like computing certain statistics over a numpy array), and have this support all of e.g. unit8, int32, unit16, float32 and float64... (I'm using some macro-enhanced SWIG for this so far) Thanks, Sebastian Haase From d.s.seljebotn at astro.uio.no Mon Aug 8 05:47:26 2011 From: d.s.seljebotn at astro.uio.no (Dag Sverre Seljebotn) Date: Mon, 08 Aug 2011 11:47:26 +0200 Subject: [Numpy-discussion] [ANN] Cython 0.15 In-Reply-To: References: <4E3D06FD.9030701@astro.uio.no> <4E3F029A.9010201@astro.uio.no> Message-ID: <4E3FB0AE.6010504@astro.uio.no> On 08/08/2011 10:21 AM, Sebastian Haase wrote: > On Sun, Aug 7, 2011 at 11:24 PM, Dag Sverre Seljebotn > wrote: >> On 08/07/2011 10:31 PM, Paul Anton Letnes wrote: >>> Looks like you have done some great work! I've been using f2py in the past, but I always liked the idea of cython - gradually wrapping more and more code as the need arises. I read somewhere that fortran wrapping with cython was coming - dare I ask what the status on this is? Is it a goal for cython to support easy fortran wrapping at all? >> >> Fwrap scans Fortran sources and generate a Cython wrapper around a >> iso_c_binding Fortran 2003 wrapper around your Fortran code. Which is a >> bit more portable than f2py in theory, although it's pretty much the >> same in practice currently. >> >> It doesn't work for all Fortran code, but I think it works for what f2py >> does and then some more. >> >> The big difference is that it allows you to sidestep Python boxing of >> arguments when calling from Cython. >> >> In addition to the main website (use Google) there's been quite a lot >> more work on it my Github: >> >> https://github.com/dagss/fwrap >> >> that's not released. I'd like to continue on Fwrap but there's always >> 2-3 items higher on the priority list. I can't tell you yet whether the >> project will survive. >> >> But anyway, this is the way Fortran+Cython is supported. >> >> Dag Sverre > > Hi, > Not to hijack this thread .... > First, also my congratulations to making such great progress with such > a great project ! > > a) Is there anything that would parse a C/C++ header file and > generate Cython wrapper code for it ? This come up now and again and I believe there's several half-baked/started solutions out there by Cython users, but nothing that is standard or that I know is carried out to completion. I.e., you should ask on the cython-users list. It'd be good if somebody would compile a list of the efforts so far on the wiki as well... > b) What is the status of supporting multi-type Cython functions -- ala > C++ templates ? > This would be one of my top ranked favorites, since I like writing > simple algorithms (like computing certain statistics over a numpy > array), and have this support all of e.g. unit8, int32, unit16, > float32 and float64... 
(I'm using some macro-enhanced SWIG for this > so far) It's been implemented as part of Mark Florisson's GSoC (he also did the OpenMP support!), currently waiting for review AFAIK. We take an approach different to C++ though. http://wiki.cython.org/enhancements/fusedtypes Dag Sverre From amcmorl at gmail.com Mon Aug 8 11:27:14 2011 From: amcmorl at gmail.com (Angus McMorland) Date: Mon, 8 Aug 2011 11:27:14 -0400 Subject: [Numpy-discussion] PEP 3118 array size check Message-ID: Hi all, I've just upgraded to the latest numpy from git along with upgrading Ubuntu to natty. Now some of my code, which relies on ctypes-wrapping of data structures from a messaging system, fails with the error message: "RuntimeWarning: Item size computed from the PEP 3118 buffer format string does not match the actual item size." Can anyone tell me if this was a change that has been added into the git version recently, in which case I can checkout a previous version of numpy, or if I've got to try downgrading the whole system (ergh.) Thanks, Angus -- AJC McMorland Post-doctoral research fellow Neurobiology, University of Pittsburgh From sturla at molden.no Mon Aug 8 11:29:38 2011 From: sturla at molden.no (Sturla Molden) Date: Mon, 08 Aug 2011 17:29:38 +0200 Subject: [Numpy-discussion] [ANN] Cython 0.15 In-Reply-To: <4E3FB0AE.6010504@astro.uio.no> References: <4E3D06FD.9030701@astro.uio.no> <4E3F029A.9010201@astro.uio.no> <4E3FB0AE.6010504@astro.uio.no> Message-ID: <4E4000E2.9030304@molden.no> Den 08.08.2011 11:47, skrev Dag Sverre Seljebotn: > This come up now and again and I believe there's several > half-baked/started solutions out there by Cython users, but nothing > that is standard or that I know is carried out to completion. I.e., > you should ask on the cython-users list. It'd be good if somebody > would compile a list of the efforts so far on the wiki as well... I wrote a mock-up pxd-generator for the OpenGL headers. It only worked with a particular set of OpenGL header files, and the output still required a few cases of manual editing. But this still saved me a lot of time re-declaring OpenGL to Cython, and an important benefit is correctness (i.e. almost no manual code to proof) :) The script is so bad, though, I am not sure I want to show it to anyone ;-) Writing a general "headerfile2pxd.py" script is a huge undertaking. The C preprocessor makes this particularly annoying, because some symbols might be defined at compile-time. To make things worse, C headers can also be recursive. Running the preprocessor in advance is not an option either, because some C APIs rely heavily on defined symbols. These will not survive through the preprocessor. Sometimes we want to fool Cython into thinking that a defined constant is an external variable or function with some C types, which also complicates this effort. It's not that bad if we write a generator for a particular set of header files, but a PITA when we write one for general use. Sturla From Chris.Barker at noaa.gov Mon Aug 8 12:46:06 2011 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Mon, 08 Aug 2011 09:46:06 -0700 Subject: [Numpy-discussion] [ANN] Cython 0.15 In-Reply-To: References: <4E3D06FD.9030701@astro.uio.no> <4E3F029A.9010201@astro.uio.no> Message-ID: <4E4012CE.2020908@noaa.gov> On 8/8/11 1:21 AM, Sebastian Haase wrote: > b) What is the status of supporting multi-type Cython functions -- ala > C++ templates ? 
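A toy sketch of the kind of script being talked about here -- purely
illustrative, it only recognises plain 'ret name(args);' prototypes and
ignores the preprocessor, typedefs and everything else that makes real
headers hard:

import re

# Toy prototype matcher: a return type, a name, and an argument list.
# Real headers need a real C parser (preprocessor, typedefs, macros, ...).
PROTO = re.compile(r'^\s*([A-Za-z_][\w ]*\*?)\s*([A-Za-z_]\w*)\s*\(([^)]*)\)\s*;')

def header_to_pxd(header_text, header_name):
    out = ['cdef extern from "%s":' % header_name]
    for line in header_text.splitlines():
        m = PROTO.match(line)
        if m:
            ret, name, args = (s.strip() for s in m.groups())
            out.append('    %s %s(%s)' % (ret, name, args))
    return '\n'.join(out)

print(header_to_pxd("double cos(double x);\nint abs(int x);", "math.h"))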
You might want to take a look at what Keith Goodman has done with the "Bottleneck" project -- I think he used a generic template tool to generate Cython code for a variety of types from a single definition. -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From shish at keba.be Mon Aug 8 12:54:17 2011 From: shish at keba.be (Olivier Delalleau) Date: Mon, 8 Aug 2011 12:54:17 -0400 Subject: [Numpy-discussion] Weird upcast behavior with 1.6.x, working as intended? Message-ID: Hi, This is with numpy 1.6.1 under Linux x86_64, testing the upcast mechanism of "scalar + array": >>> import numpy; print (numpy.array(3, dtype=numpy.complex128) + numpy.ones(3, dtype=numpy.float32)).dtype complex64 Since it has to upcast my array (float32 is not "compatible enough" with complex128), why does it upcast it to complex64 instead of complex128? As far as I can tell 1.4.x and 1.5.x versions of numpy are indeed upcasting to complex128. Thanks, -=- Olivier -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Mon Aug 8 13:24:13 2011 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 8 Aug 2011 11:24:13 -0600 Subject: [Numpy-discussion] Weird upcast behavior with 1.6.x, working as intended? In-Reply-To: References: Message-ID: On Mon, Aug 8, 2011 at 10:54 AM, Olivier Delalleau wrote: > Hi, > > This is with numpy 1.6.1 under Linux x86_64, testing the upcast mechanism > of "scalar + array": > > >>> import numpy; print (numpy.array(3, dtype=numpy.complex128) + > numpy.ones(3, dtype=numpy.float32)).dtype > complex64 > > Since it has to upcast my array (float32 is not "compatible enough" with > complex128), why does it upcast it to complex64 instead of complex128? > As far as I can tell 1.4.x and 1.5.x versions of numpy are indeed upcasting > to complex128. > > The 0 dimensional array is being treated as a scalar, hence is cast to the type of the 1d array. This seems more consistent with the idea that 0 dimensional arrays act like scalars, but I suppose that is open to discussion. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From shish at keba.be Mon Aug 8 15:38:05 2011 From: shish at keba.be (Olivier Delalleau) Date: Mon, 8 Aug 2011 15:38:05 -0400 Subject: [Numpy-discussion] Weird upcast behavior with 1.6.x, working as intended? In-Reply-To: References: Message-ID: 2011/8/8 Charles R Harris > > > On Mon, Aug 8, 2011 at 10:54 AM, Olivier Delalleau wrote: > >> Hi, >> >> This is with numpy 1.6.1 under Linux x86_64, testing the upcast mechanism >> of "scalar + array": >> >> >>> import numpy; print (numpy.array(3, dtype=numpy.complex128) + >> numpy.ones(3, dtype=numpy.float32)).dtype >> complex64 >> >> Since it has to upcast my array (float32 is not "compatible enough" with >> complex128), why does it upcast it to complex64 instead of complex128? >> As far as I can tell 1.4.x and 1.5.x versions of numpy are indeed >> upcasting to complex128. >> >> > The 0 dimensional array is being treated as a scalar, hence is cast to the > type of the 1d array. This seems more consistent with the idea that 0 > dimensional arrays act like scalars, but I suppose that is open to > discussion. > > Chuck > I'm afraid I don't understand your reply. 
I know that the 0d array is a scalar, and thus should not lead to an upcast "unless the scalar is of a fundamentally different kind of data (*i.e.*, under a different hierarchy in the data-type hierarchy) than the array" (quoted from http://docs.scipy.org/doc/numpy/reference/ufuncs.html). This is one case where it is under a different hierarchy and thus should trigger an upcast. What I don't understand it why it upcasts to complex64 instead of complex128. Note that: 1. When replacing "numpy.ones" with "numpy.array" it yields complex128 (expected upcast of scalar addition of complex128 with float32) 2. The behavior is similar if instead of "3" I use a number which cannot be represented exactly with a complex64 (so it's not a rule about picking the smallest data type able to exactly represent the result) -=- Olivier -------------- next part -------------- An HTML attachment was scrubbed... URL: From ndbecker2 at gmail.com Mon Aug 8 19:01:28 2011 From: ndbecker2 at gmail.com (Neal Becker) Date: Mon, 08 Aug 2011 19:01:28 -0400 Subject: [Numpy-discussion] Warning: invalid value encountered in divide Message-ID: Warning: invalid value encountered in divide No traceback. How can I get more info on this? Can this warning be converted to an exception so I can get a trace? From wesmckinn at gmail.com Mon Aug 8 19:25:32 2011 From: wesmckinn at gmail.com (Wes McKinney) Date: Mon, 8 Aug 2011 19:25:32 -0400 Subject: [Numpy-discussion] Warning: invalid value encountered in divide In-Reply-To: References: Message-ID: On Mon, Aug 8, 2011 at 7:01 PM, Neal Becker wrote: > Warning: invalid value encountered in divide > > No traceback. ?How can I get more info on this? ?Can this warning be converted > to an exception so I can get a trace? > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > Try calling np.seterr(divide='raise') or np.seterr(all='raise') From charlesr.harris at gmail.com Tue Aug 9 00:56:09 2011 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 8 Aug 2011 22:56:09 -0600 Subject: [Numpy-discussion] Static analysis of python c extensions. Message-ID: Thought some might find this of interest. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From robertwb at math.washington.edu Tue Aug 9 12:25:17 2011 From: robertwb at math.washington.edu (Robert Bradshaw) Date: Tue, 9 Aug 2011 09:25:17 -0700 Subject: [Numpy-discussion] [ANN] Cython 0.15 In-Reply-To: <4E4012CE.2020908@noaa.gov> References: <4E3D06FD.9030701@astro.uio.no> <4E3F029A.9010201@astro.uio.no> <4E4012CE.2020908@noaa.gov> Message-ID: On Mon, Aug 8, 2011 at 9:46 AM, Christopher Barker wrote: > On 8/8/11 1:21 AM, Sebastian Haase wrote: > >> b) What is the status of supporting multi-type Cython functions -- ala >> C++ templates ? > > You might want to take a look at what Keith Goodman has done with the > "Bottleneck" project -- I think he used a generic template tool to > generate Cython code for a variety of types from a single definition. Templating and type parameterization is a really tricky issue to get right, especially when grafting into a "statically typeless" language like Python. The consensus that we had over Cython days was, at least for the medium term, was: (1) Implement http://wiki.cython.org/enhancements/fusedtypes for the most common usecases. Mark has nearly finished this as part of his GSoC project. 
(2) Better support for pre-processing Cython with a templating language, which would provide users with a high level of flexibility to do anything sophisticated. (Not implemented, but users are already doing this.) (3) While would like to support template C++ functions better in C++ mode, for many reasons this is not the model we want to follow for Cython type parameterization. - Robert From alex.flint at gmail.com Tue Aug 9 17:23:36 2011 From: alex.flint at gmail.com (Alex Flint) Date: Tue, 9 Aug 2011 17:23:36 -0400 Subject: [Numpy-discussion] dealing with RGB images Message-ID: Until now, I've been representing RGB images in numpy using MxNx3 arrays, as returned by scipy.misc.imread and friends. However, when performing image transformations, the color dimension is semantically different from the spatial dimensions. For example, I would like to write an image scaling function that works for both grayscale array (MxN) and RGB images (MxNx3): def imscale(image, scale): return scipy.ndimage.zoom(imscale, scale) But this will apply scaling along the color dimension, resulting in an image with more/less image channels. So I do: def imscale(image, scale) if image.ndim == 2: return scipy.ndimage.zoom(imscale, scale) else: return scipy.ndimage.zoom(imscale, (scale[0], scale[1], 1)) But now this fails if the scale argument is a scalar. It is possible to cover all cases but all my functions are become case nighmares as the combinations of RGB and scalar images multiply. I am thinking of writing an RGB class with overrides for all the math operations that make sense (addition, scalar multiplication), and then creating arrays with dtype=RGB. This will mean that color images always have ndim=2. Does this make sense? Is there a neater way to achieve this within numpy? Alex -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Tue Aug 9 19:06:56 2011 From: njs at pobox.com (Nathaniel Smith) Date: Tue, 9 Aug 2011 16:06:56 -0700 Subject: [Numpy-discussion] dealing with RGB images In-Reply-To: References: Message-ID: 1) Have you considered using MxNx1 arrays for greyscale images, so all images have the same dimensionality? 2) Instead of defining an RGB class from scratch, would a structured dtype do what you want? - Nathaniel On Aug 9, 2011 2:23 PM, "Alex Flint" wrote: > Until now, I've been representing RGB images in numpy using MxNx3 arrays, as > returned by scipy.misc.imread and friends. However, when performing image > transformations, the color dimension is semantically different from the > spatial dimensions. For example, I would like to write an image scaling > function that works for both grayscale array (MxN) and RGB images (MxNx3): > > def imscale(image, scale): > return scipy.ndimage.zoom(imscale, scale) > > But this will apply scaling along the color dimension, resulting in an image > with more/less image channels. So I do: > > def imscale(image, scale) > if image.ndim == 2: > return scipy.ndimage.zoom(imscale, scale) > else: > return scipy.ndimage.zoom(imscale, (scale[0], scale[1], 1)) > > But now this fails if the scale argument is a scalar. It is possible to > cover all cases but all my functions are become case nighmares as the > combinations of RGB and scalar images multiply. > > I am thinking of writing an RGB class with overrides for all the math > operations that make sense (addition, scalar multiplication), and then > creating arrays with dtype=RGB. This will mean that color images always have > ndim=2. Does this make sense? 
Is there a neater way to achieve this within > numpy? > > Alex -------------- next part -------------- An HTML attachment was scrubbed... URL: From pav at iki.fi Wed Aug 10 04:01:59 2011 From: pav at iki.fi (Pauli Virtanen) Date: Wed, 10 Aug 2011 08:01:59 +0000 (UTC) Subject: [Numpy-discussion] PEP 3118 array size check References: Message-ID: Mon, 08 Aug 2011 11:27:14 -0400, Angus McMorland wrote: > I've just upgraded to the latest numpy from git along with upgrading > Ubuntu to natty. Now some of my code, which relies on ctypes-wrapping of > data structures from a messaging system, fails with the error message: > > "RuntimeWarning: Item size computed from the PEP 3118 buffer format > string does not match the actual item size." > > Can anyone tell me if this was a change that has been added into the git > version recently, in which case I can checkout a previous version of > numpy, or if I've got to try downgrading the whole system (ergh.) Python's ctypes module implements its PEP 3118 support incorrectly in recent Python versions. There's a patch in waiting: http://bugs.python.org/issue10744 In the meantime, you can just silence the warnings using the warnings module, warnings.simplefilter("ignore", RuntimeWarning) -- Pauli Virtanen From gnurser at gmail.com Wed Aug 10 06:45:52 2011 From: gnurser at gmail.com (George Nurser) Date: Wed, 10 Aug 2011 11:45:52 +0100 Subject: [Numpy-discussion] problems with multiple outputs with numpy.nditer Message-ID: Hi, I'm running numpy 1.6.1rc2 + python 2.7.1 64-bit from python.org on OSX 10.6.8. I have a f2py'd fortran routine that inputs and outputs fortran real*8 scalars, and I normally call it like tu,tv,E,El,IF,HF,HFI = LW.rotate2u(u,v,NN,ff,0) I now want to call it over 2D arrays UT,VT,N,f Using steam-age indexing works fine: mflux_east,mflux_north,IWE,IWE_lin,InvFr,HFroude = np.empty([6,ny-1,nx],dtype=np.float64) for j in range(ny-1): for i in range(nx): u,v,NN,ff = [x[j,i] for x in UT,VT,N,f] mflux_east[j,i],mflux_north[j,i],IWE[j,i],IWE_lin[j,i],InvFr[j,i],HFroude[j,i],HFI = LW.rotate2u(u,v,NN,ff,0) I decided to try the new nditer option, with it = np.nditer([UT,VT,N,f,None,None,None,None,None,None,None] ,op_flags=4*[['readonly']]+7*[['writeonly','allocate']] ,op_dtypes=np.float64) for (u,v,NN,ff,tu,tv,E,El,IF,HF,HFI) in it: tu,tv,E,El,IF,HF,HFI = LW.rotate2u(u,v,NN,ff,0) Unfortunately this doesn't seem to work. Writing aa,bb,cc,dd,ee,ff,gg = it.operands[4:] aa seems to contain the contents of UT (bizarrely rescaled to lie between 0 and 1), while bb,cc etc are all zero. I'm not sure whether I've just called it incorrectly, or whether perhaps it's only supposed to work with one output array. --George Nurser. From mwwiebe at gmail.com Wed Aug 10 12:15:48 2011 From: mwwiebe at gmail.com (Mark Wiebe) Date: Wed, 10 Aug 2011 09:15:48 -0700 Subject: [Numpy-discussion] problems with multiple outputs with numpy.nditer In-Reply-To: References: Message-ID: On Wed, Aug 10, 2011 at 3:45 AM, George Nurser wrote: > Hi, > I'm running numpy 1.6.1rc2 + python 2.7.1 64-bit from python.org on OSX > 10.6.8. 
> > I have a f2py'd fortran routine that inputs and outputs fortran real*8 > scalars, and I normally call it like > > tu,tv,E,El,IF,HF,HFI = LW.rotate2u(u,v,NN,ff,0) > > I now want to call it over 2D arrays UT,VT,N,f > > Using steam-age indexing works fine: > > mflux_east,mflux_north,IWE,IWE_lin,InvFr,HFroude = > np.empty([6,ny-1,nx],dtype=np.float64) > for j in range(ny-1): > for i in range(nx): > u,v,NN,ff = [x[j,i] for x in UT,VT,N,f] > > mflux_east[j,i],mflux_north[j,i],IWE[j,i],IWE_lin[j,i],InvFr[j,i],HFroude[j,i],HFI > = LW.rotate2u(u,v,NN,ff,0) > > > > I decided to try the new nditer option, with > > it = np.nditer([UT,VT,N,f,None,None,None,None,None,None,None] > ,op_flags=4*[['readonly']]+7*[['writeonly','allocate']] > ,op_dtypes=np.float64) > for (u,v,NN,ff,tu,tv,E,El,IF,HF,HFI) in it: > tu,tv,E,El,IF,HF,HFI = LW.rotate2u(u,v,NN,ff,0) > > > Unfortunately this doesn't seem to work. Writing > aa,bb,cc,dd,ee,ff,gg = it.operands[4:] > One problem here is that the assignment needs to assign into the view the iterator gives, something a direct assignment doesn't actually do. Instead of a, b = f(c,d) you need to write it like a[...], b[...] = f(c,d) so that the actual values being iterated get modified. Here's what I get: In [7]: a = np.arange(5.) In [8]: b, c, d = a + 1, a + 2, a + 3 In [9]: it = np.nditer([a,b,c,d] + [None]*7, ...: op_flags=4*[['readonly']]+7*[['writeonly','allocate']], ...: op_dtypes=np.float64) In [10]: for (x,y,z,w,A,B,C,D,E,F,G) in it: ....: A[...], B[...], C[...], D[...], E[...], F[...], G[...] = x, y, z, w, x+y, y+z, z+w ....: In [11]: it.operands[4] Out[11]: array([ 0., 1., 2., 3., 4.]) In [12]: it.operands[5] Out[12]: array([ 1., 2., 3., 4., 5.]) In [13]: it.operands[6] Out[13]: array([ 2., 3., 4., 5., 6.]) In [14]: it.operands[7] Out[14]: array([ 3., 4., 5., 6., 7.]) In [15]: it.operands[8] Out[15]: array([ 1., 3., 5., 7., 9.]) In [16]: it.operands[9] Out[16]: array([ 3., 5., 7., 9., 11.]) In [17]: it.operands[10] Out[17]: array([ 5., 7., 9., 11., 13.]) -Mark > > aa seems to contain the contents of UT (bizarrely rescaled to lie > between 0 and 1), while bb,cc etc are all zero. > > > I'm not sure whether I've just called it incorrectly, or whether > perhaps it's only supposed to work with one output array. > > > --George Nurser. > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From amcmorl at gmail.com Wed Aug 10 12:50:39 2011 From: amcmorl at gmail.com (Angus McMorland) Date: Wed, 10 Aug 2011 12:50:39 -0400 Subject: [Numpy-discussion] numpy/ctypes segfault [was: PEP 3118 array size check] Message-ID: On 10 August 2011 04:01, Pauli Virtanen wrote: > Mon, 08 Aug 2011 11:27:14 -0400, Angus McMorland wrote: >> I've just upgraded to the latest numpy from git along with upgrading >> Ubuntu to natty. Now some of my code, which relies on ctypes-wrapping of >> data structures from a messaging system, fails with the error message: >> >> "RuntimeWarning: Item size computed from the PEP 3118 buffer format >> string does not match the actual item size." >> >> Can anyone tell me if this was a change that has been added into the git >> version recently, in which case I can checkout a previous version of >> numpy, or if I've got to try downgrading the whole system (ergh.) > > Python's ctypes module implements its PEP 3118 support incorrectly > in recent Python versions. 
There's a patch in waiting: > > ? ? ? ?http://bugs.python.org/issue10744 > > In the meantime, you can just silence the warnings using the warnings > module, > > ? ? ? ?warnings.simplefilter("ignore", RuntimeWarning) Thanks Pauli. I was seeing a segfault everytime I saw the error message, and since both started happening at the same time, I was guilty of mixing correlation and causation. After rebuilding numpy about 10 times, I have identified the first git commit after which the segfault appears (feb8079070b8a659d7ee) , and a small piece of code that triggers it: from ctypes import Structure, c_double #-- copied out of an xml2py generated file class S(Structure): pass S._pack_ = 4 S._fields_ = [ ('field', c_double * 2), ] #-- import numpy as np print np.version.version s = S() print "S", np.asarray(s.field) Can anyone confirm this, in which case it's probably a bug? Thanks, Angus -- AJC McMorland Post-doctoral research fellow Neurobiology, University of Pittsburgh From gnurser at gmail.com Wed Aug 10 12:55:59 2011 From: gnurser at gmail.com (George Nurser) Date: Wed, 10 Aug 2011 17:55:59 +0100 Subject: [Numpy-discussion] problems with multiple outputs with numpy.nditer In-Reply-To: References: Message-ID: Works fine with the [...]s. Thanks very much. --George On 10 August 2011 17:15, Mark Wiebe wrote: > On Wed, Aug 10, 2011 at 3:45 AM, George Nurser wrote: >> >> Hi, >> I'm running numpy 1.6.1rc2 + python 2.7.1 64-bit from python.org on OSX >> 10.6.8. >> >> I have a f2py'd fortran routine that inputs and outputs fortran real*8 >> scalars, and I normally call it like >> >> tu,tv,E,El,IF,HF,HFI = LW.rotate2u(u,v,NN,ff,0) >> >> I now want to call it over 2D arrays UT,VT,N,f >> >> Using steam-age indexing works fine: >> >> mflux_east,mflux_north,IWE,IWE_lin,InvFr,HFroude = >> np.empty([6,ny-1,nx],dtype=np.float64) >> for j in range(ny-1): >> ? for i in range(nx): >> ? ? ? u,v,NN,ff = [x[j,i] for x in UT,VT,N,f] >> >> mflux_east[j,i],mflux_north[j,i],IWE[j,i],IWE_lin[j,i],InvFr[j,i],HFroude[j,i],HFI >> = LW.rotate2u(u,v,NN,ff,0) >> >> >> >> I decided to try the new nditer option, with >> >> it = np.nditer([UT,VT,N,f,None,None,None,None,None,None,None] >> ? ? ? ? ? ? ?,op_flags=4*[['readonly']]+7*[['writeonly','allocate']] >> ? ? ? ? ? ? ?,op_dtypes=np.float64) >> for (u,v,NN,ff,tu,tv,E,El,IF,HF,HFI) in it: >> ? tu,tv,E,El,IF,HF,HFI = LW.rotate2u(u,v,NN,ff,0) >> >> >> Unfortunately this doesn't seem to work. Writing >> aa,bb,cc,dd,ee,ff,gg = it.operands[4:] > > One problem here is that the assignment needs to assign into the view the > iterator gives, something a direct assignment doesn't actually do. Instead > of > a, b = f(c,d) > you need to write it like > a[...], b[...] = f(c,d) > so that the actual values being iterated get modified. Here's what I get: > In [7]: a = np.arange(5.) > In [8]: b, c, d = a + 1, a + 2, a + 3 > In [9]: it = np.nditer([a,b,c,d] + [None]*7, > ? ?...: ? ? ? ? op_flags=4*[['readonly']]+7*[['writeonly','allocate']], > ? ?...: ? ? ? ? op_dtypes=np.float64) > In [10]: for (x,y,z,w,A,B,C,D,E,F,G) in it: > ? ?....: ? ? A[...], B[...], C[...], D[...], E[...], F[...], G[...] = x, y, > z, w, x+y, y+z, z+w > ? 
?....: > In [11]: it.operands[4] > Out[11]: array([ 0., ?1., ?2., ?3., ?4.]) > In [12]: it.operands[5] > Out[12]: array([ 1., ?2., ?3., ?4., ?5.]) > In [13]: it.operands[6] > Out[13]: array([ 2., ?3., ?4., ?5., ?6.]) > In [14]: it.operands[7] > Out[14]: array([ 3., ?4., ?5., ?6., ?7.]) > In [15]: it.operands[8] > Out[15]: array([ 1., ?3., ?5., ?7., ?9.]) > In [16]: it.operands[9] > Out[16]: array([ ?3., ? 5., ? 7., ? 9., ?11.]) > In [17]: it.operands[10] > Out[17]: array([ ?5., ? 7., ? 9., ?11., ?13.]) > > -Mark > >> >> aa seems to contain the contents of UT (bizarrely rescaled to lie >> between 0 and 1), while bb,cc etc are all zero. >> >> >> I'm not sure whether I've just called it incorrectly, or whether >> perhaps it's only supposed to work with one output array. >> >> >> --George Nurser. >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > From rowen at uw.edu Wed Aug 10 13:22:14 2011 From: rowen at uw.edu (Russell E. Owen) Date: Wed, 10 Aug 2011 10:22:14 -0700 Subject: [Numpy-discussion] Efficient way to load a 1Gb file? Message-ID: A coworker is trying to load a 1Gb text data file into a numpy array using numpy.loadtxt, but he says it is using up all of his machine's 6Gb of RAM. Is there a more efficient way to read such text data files? -- Russell From matthew.brett at gmail.com Wed Aug 10 15:28:53 2011 From: matthew.brett at gmail.com (Matthew Brett) Date: Wed, 10 Aug 2011 12:28:53 -0700 Subject: [Numpy-discussion] numpydoc - latex longtables error Message-ID: Hi, I think this one might be for Pauli. I've run into an odd problem that seems to be an interaction of numpydoc and autosummary and large classes. In summary, large classes and numpydoc lead to large tables of class methods, and there seems to be an error in the creation of the large tables in latex. Specifically, if I run 'make latexpdf' with the attached minimal sphinx setup, I get a pdflatex error ending thus: ... l.118 \begin{longtable}{LL} and this is because longtable does not accept LL as an argument, but needs '|l|l|' (bar - el - bar - el - bar). I see in sphinx.writers.latex.py, around line 657, that sphinx knows about this in general, and long tables in standard ReST work fine with the el-bar arguments passed to longtable. if self.table.colspec: self.body.append(self.table.colspec) else: if self.table.has_problematic: colwidth = 0.95 / self.table.colcount colspec = ('p{%.3f\\linewidth}|' % colwidth) * \ self.table.colcount self.body.append('{|' + colspec + '}\n') elif self.table.longtable: self.body.append('{|' + ('l|' * self.table.colcount) + '}\n') else: self.body.append('{|' + ('L|' * self.table.colcount) + '}\n') However, using numpydoc and autosummary (see the conf.py file), what seems to happen is that, when we reach the self.table.colspec test at the beginning of the snippet above, 'self.table.colspec' is defined: In [1]: self.table.colspec Out[1]: '{LL}\n' and thus the LL gets written as the arg to longtable: \begin{longtable}{LL} and the pdf build breaks. I'm using the numpydoc out of the current numpy source tree. At that point I wasn't sure how to proceed with debugging. Can you give any hints? Thanks a lot, Matthew -------------- next part -------------- A non-text attachment was scrubbed... 
Name: long_test.tgz Type: application/x-gzip Size: 11907 bytes Desc: not available URL: From jsseabold at gmail.com Wed Aug 10 15:38:24 2011 From: jsseabold at gmail.com (Skipper Seabold) Date: Wed, 10 Aug 2011 15:38:24 -0400 Subject: [Numpy-discussion] numpydoc - latex longtables error In-Reply-To: References: Message-ID: On Wed, Aug 10, 2011 at 3:28 PM, Matthew Brett wrote: > Hi, > > I think this one might be for Pauli. > > I've run into an odd problem that seems to be an interaction of > numpydoc and autosummary and large classes. > > In summary, large classes and numpydoc lead to large tables of class > methods, and there seems to be an error in the creation of the large > tables in latex. > > Specifically, if I run 'make latexpdf' with the attached minimal > sphinx setup, I get a pdflatex error ending thus: > > ... > l.118 \begin{longtable}{LL} > > and this is because longtable does not accept LL as an argument, but > needs '|l|l|' (bar - el - bar - el - bar). > > I see in sphinx.writers.latex.py, around line 657, that sphinx knows > about this in general, and long tables in standard ReST work fine with > the el-bar arguments passed to longtable. > > ? ? ? ?if self.table.colspec: > ? ? ? ? ? ?self.body.append(self.table.colspec) > ? ? ? ?else: > ? ? ? ? ? ?if self.table.has_problematic: > ? ? ? ? ? ? ? ?colwidth = 0.95 / self.table.colcount > ? ? ? ? ? ? ? ?colspec = ('p{%.3f\\linewidth}|' % colwidth) * \ > ? ? ? ? ? ? ? ? ? ? ? ? ?self.table.colcount > ? ? ? ? ? ? ? ?self.body.append('{|' + colspec + '}\n') > ? ? ? ? ? ?elif self.table.longtable: > ? ? ? ? ? ? ? ?self.body.append('{|' + ('l|' * self.table.colcount) + '}\n') > ? ? ? ? ? ?else: > ? ? ? ? ? ? ? ?self.body.append('{|' + ('L|' * self.table.colcount) + '}\n') > > However, using numpydoc and autosummary (see the conf.py file), what > seems to happen is that, when we reach the self.table.colspec test at > the beginning of the snippet above, 'self.table.colspec' is defined: > > In [1]: self.table.colspec > Out[1]: '{LL}\n' > > and thus the LL gets written as the arg to longtable: > > \begin{longtable}{LL} > > and the pdf build breaks. > > I'm using the numpydoc out of the current numpy source tree. > > At that point I wasn't sure how to proceed with debugging. ?Can you > give any hints? > It's not a proper fix, but our workaround is to edit the Makefile for latex (and latexpdf) to https://github.com/statsmodels/statsmodels/blob/master/scikits/statsmodels/docs/Makefile#L94 https://github.com/statsmodels/statsmodels/blob/master/scikits/statsmodels/docs/make.bat#L121 to call the script to replace the longtable arguments https://github.com/statsmodels/statsmodels/blob/master/scikits/statsmodels/docs/fix_longtable.py The workaround itself probably isn't optimal, and I'd be happy to hear of a proper fix. Cheers, Skipper From derek at astro.physik.uni-goettingen.de Wed Aug 10 15:43:47 2011 From: derek at astro.physik.uni-goettingen.de (Derek Homeier) Date: Wed, 10 Aug 2011 21:43:47 +0200 Subject: [Numpy-discussion] Efficient way to load a 1Gb file? In-Reply-To: References: Message-ID: On 10 Aug 2011, at 19:22, Russell E. Owen wrote: > A coworker is trying to load a 1Gb text data file into a numpy array > using numpy.loadtxt, but he says it is using up all of his machine's 6Gb > of RAM. Is there a more efficient way to read such text data files? 
The npyio routines (loadtxt as well as genfromtxt) first read in the entire data as lists, which creates of course significant overhead, but is not easy to circumvent, since numpy arrays are immutable - so you have to first store the numbers in some kind of mutable object. One could write a custom parser that tries to be somewhat more efficient, e.g. first reading in sub-arrays from a smaller buffer. Concatenating those sub-arrays would still require about twice the memory of the final array. I don't know if using the array.array type (which is mutable) is much more efficient than a list... To really avoid any excess memory usage you'd have to know the total data size in advance - either by reading in the file in a first pass to count the rows, or explicitly specifying it to a custom reader. Basically, assuming a completely regular file without missing values etc., you could then read in the data like X = np.zeros((n_lines, n_columns), dtype=float) delimiter = ' ' for n, line in enumerate(file(fname, 'r')): X[n] = np.array(line.split(delimiter), dtype=float) (adjust delimiter and dtype as needed...) HTH, Derek From aarchiba at physics.mcgill.ca Wed Aug 10 16:01:37 2011 From: aarchiba at physics.mcgill.ca (Anne Archibald) Date: Wed, 10 Aug 2011 16:01:37 -0400 Subject: [Numpy-discussion] Efficient way to load a 1Gb file? In-Reply-To: References: Message-ID: There was also some work on a semi-mutable array type that allowed appending along one axis, then 'freezing' to yield a normal numpy array (unfortunately I'm not sure how to find it in the mailing list archives). One could write such a setup by hand, using mmap() or realloc(), but I'd be inclined to simply write a filter that converted the text file to some sort of binary file on the fly, value by value. Then the file can be loaded in or mmap()ed. A 1 Gb text file is a miserable object anyway, so it might be desirable to convert to (say) HDF5 and then throw away the text file. Anne On 10 August 2011 15:43, Derek Homeier wrote: > On 10 Aug 2011, at 19:22, Russell E. Owen wrote: > >> A coworker is trying to load a 1Gb text data file into a numpy array >> using numpy.loadtxt, but he says it is using up all of his machine's 6Gb >> of RAM. Is there a more efficient way to read such text data files? > > The npyio routines (loadtxt as well as genfromtxt) first read in the entire data as lists, which creates of course significant overhead, but is not easy to circumvent, since numpy arrays are immutable - so you have to first store the numbers in some kind of mutable object. One could write a custom parser that tries to be somewhat more efficient, e.g. first reading in sub-arrays from a smaller buffer. Concatenating those sub-arrays would still require about twice the memory of the final array. I don't know if using the array.array type (which is mutable) is much more efficient than a list... > To really avoid any excess memory usage you'd have to know the total data size in advance - either by reading in the file in a first pass to count the rows, or explicitly specifying it to a custom reader. Basically, assuming a completely regular file without missing values etc., you could then read in the data like > > X = np.zeros((n_lines, n_columns), dtype=float) > delimiter = ' ' > for n, line in enumerate(file(fname, 'r')): > ? ?X[n] = np.array(line.split(delimiter), dtype=float) > > (adjust delimiter and dtype as needed...) > > HTH, > ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? 
?Derek > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From gael.varoquaux at normalesup.org Wed Aug 10 16:03:03 2011 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Wed, 10 Aug 2011 22:03:03 +0200 Subject: [Numpy-discussion] Efficient way to load a 1Gb file? In-Reply-To: References: Message-ID: <20110810200303.GA24720@phare.normalesup.org> On Wed, Aug 10, 2011 at 04:01:37PM -0400, Anne Archibald wrote: > A 1 Gb text file is a miserable object anyway, so it might be desirable > to convert to (say) HDF5 and then throw away the text file. +1 G From derek at astro.physik.uni-goettingen.de Wed Aug 10 16:12:37 2011 From: derek at astro.physik.uni-goettingen.de (Derek Homeier) Date: Wed, 10 Aug 2011 22:12:37 +0200 Subject: [Numpy-discussion] Efficient way to load a 1Gb file? In-Reply-To: <20110810200303.GA24720@phare.normalesup.org> References: <20110810200303.GA24720@phare.normalesup.org> Message-ID: <28298416-05FF-446F-8841-039DE31AD77A@astro.physik.uni-goettingen.de> On 10 Aug 2011, at 22:03, Gael Varoquaux wrote: > On Wed, Aug 10, 2011 at 04:01:37PM -0400, Anne Archibald wrote: >> A 1 Gb text file is a miserable object anyway, so it might be desirable >> to convert to (say) HDF5 and then throw away the text file. > > +1 There might be concerns about ensuring data accessibility that argue against throwing the text file away, but converting to HDF5 would be an elegant way of reading it in without the memory issues, too (I must confess though, I've regularly read ~ 1GB ASCII files into memory - with decent virtual memory management it did not turn out too bad...) Cheers, Derek From paul.anton.letnes at gmail.com Wed Aug 10 16:23:06 2011 From: paul.anton.letnes at gmail.com (Paul Anton Letnes) Date: Wed, 10 Aug 2011 21:23:06 +0100 Subject: [Numpy-discussion] Efficient way to load a 1Gb file? In-Reply-To: <20110810200303.GA24720@phare.normalesup.org> References: <20110810200303.GA24720@phare.normalesup.org> Message-ID: <0F0B6E30-34C3-429E-9098-AD512532F4D0@gmail.com> On 10. aug. 2011, at 21.03, Gael Varoquaux wrote: > On Wed, Aug 10, 2011 at 04:01:37PM -0400, Anne Archibald wrote: >> A 1 Gb text file is a miserable object anyway, so it might be desirable >> to convert to (say) HDF5 and then throw away the text file. > > +1 > > G +1 and a very warm recommendation of h5py. Paul From ben.root at ou.edu Wed Aug 10 16:55:37 2011 From: ben.root at ou.edu (Benjamin Root) Date: Wed, 10 Aug 2011 15:55:37 -0500 Subject: [Numpy-discussion] bug with assignment into an indexed array? Message-ID: Came across this today when trying to determine what was wrong with my code: import numpy as np matched_to = np.array([-1] * 5) in_ellipse = np.array([False, True, True, True, False]) match = np.array([False, True, True]) matched_to[in_ellipse][match] = 3 I would expect matched_to to now be "array([-1, -1, 3, 3, -1])", but instead, it is still all -1. It would seem that unless the view was created by a slice, then the assignment into the indexed view would not work as expected. This works: >>> matched_to[:3][match] = 3 but not: >>> matched_to[np.array([0, 1, 2])][match] = 3 Note that the following does work: >>> matched_to[np.array([0, 1, 2])] = 3 Is this a bug, or was I wrong to expect this to work this way? Thanks, Ben Root -------------- next part -------------- An HTML attachment was scrubbed...
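A quick check makes the copy-versus-view behaviour visible - a minimal sketch using nothing beyond plain numpy:

import numpy as np

matched_to = np.array([-1] * 5)
in_ellipse = np.array([False, True, True, True, False])

print np.may_share_memory(matched_to, matched_to[:3])          # True: a slice is a view
print np.may_share_memory(matched_to, matched_to[in_ellipse])  # False: boolean indexing copies

Since the boolean-indexed result is a fresh copy, a later assignment into it can never reach matched_to.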
URL: From matthew.brett at gmail.com Wed Aug 10 18:17:26 2011 From: matthew.brett at gmail.com (Matthew Brett) Date: Wed, 10 Aug 2011 15:17:26 -0700 Subject: [Numpy-discussion] numpydoc - latex longtables error In-Reply-To: References: Message-ID: Hi, On Wed, Aug 10, 2011 at 12:38 PM, Skipper Seabold wrote: > On Wed, Aug 10, 2011 at 3:28 PM, Matthew Brett wrote: >> Hi, >> >> I think this one might be for Pauli. >> >> I've run into an odd problem that seems to be an interaction of >> numpydoc and autosummary and large classes. >> >> In summary, large classes and numpydoc lead to large tables of class >> methods, and there seems to be an error in the creation of the large >> tables in latex. >> >> Specifically, if I run 'make latexpdf' with the attached minimal >> sphinx setup, I get a pdflatex error ending thus: >> >> ... >> l.118 \begin{longtable}{LL} >> >> and this is because longtable does not accept LL as an argument, but >> needs '|l|l|' (bar - el - bar - el - bar). >> >> I see in sphinx.writers.latex.py, around line 657, that sphinx knows >> about this in general, and long tables in standard ReST work fine with >> the el-bar arguments passed to longtable. >> >> ? ? ? ?if self.table.colspec: >> ? ? ? ? ? ?self.body.append(self.table.colspec) >> ? ? ? ?else: >> ? ? ? ? ? ?if self.table.has_problematic: >> ? ? ? ? ? ? ? ?colwidth = 0.95 / self.table.colcount >> ? ? ? ? ? ? ? ?colspec = ('p{%.3f\\linewidth}|' % colwidth) * \ >> ? ? ? ? ? ? ? ? ? ? ? ? ?self.table.colcount >> ? ? ? ? ? ? ? ?self.body.append('{|' + colspec + '}\n') >> ? ? ? ? ? ?elif self.table.longtable: >> ? ? ? ? ? ? ? ?self.body.append('{|' + ('l|' * self.table.colcount) + '}\n') >> ? ? ? ? ? ?else: >> ? ? ? ? ? ? ? ?self.body.append('{|' + ('L|' * self.table.colcount) + '}\n') >> >> However, using numpydoc and autosummary (see the conf.py file), what >> seems to happen is that, when we reach the self.table.colspec test at >> the beginning of the snippet above, 'self.table.colspec' is defined: >> >> In [1]: self.table.colspec >> Out[1]: '{LL}\n' >> >> and thus the LL gets written as the arg to longtable: >> >> \begin{longtable}{LL} >> >> and the pdf build breaks. >> >> I'm using the numpydoc out of the current numpy source tree. >> >> At that point I wasn't sure how to proceed with debugging. ?Can you >> give any hints? >> > > It's not a proper fix, but our workaround is to edit the Makefile for > latex (and latexpdf) to > > https://github.com/statsmodels/statsmodels/blob/master/scikits/statsmodels/docs/Makefile#L94 > https://github.com/statsmodels/statsmodels/blob/master/scikits/statsmodels/docs/make.bat#L121 > > to call the script to replace the longtable arguments > > https://github.com/statsmodels/statsmodels/blob/master/scikits/statsmodels/docs/fix_longtable.py > > The workaround itself probably isn't optimal, and I'd be happy to hear > of a proper fix. Thanks - yes - I found your workaround in my explorations, I put in a version in our tree too: https://github.com/matthew-brett/nipy/blob/latex_build_fixes/tools/fix_longtable.py - but I agree it seems much better to get to the root cause. 
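For anyone who just needs the build to go through, the post-processing boils down to rewriting the bad column spec in the generated .tex files before pdflatex runs. A minimal stand-alone sketch (not the linked script itself; it only handles the two-column {LL} case seen here, and the build directory name is an assumption):

import glob

def fix_longtables(latex_dir='build/latex'):
    # rewrite the column spec that numpydoc/autosummary leave behind,
    # e.g. \begin{longtable}{LL} -> \begin{longtable}{|l|l|}
    for name in glob.glob(latex_dir + '/*.tex'):
        tex = open(name).read()
        tex = tex.replace(r'\begin{longtable}{LL}',
                          r'\begin{longtable}{|l|l|}')
        open(name, 'w').write(tex)

fix_longtables()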
See you, Matthew From josef.pktd at gmail.com Wed Aug 10 20:03:53 2011 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 10 Aug 2011 20:03:53 -0400 Subject: [Numpy-discussion] numpydoc - latex longtables error In-Reply-To: References: Message-ID: On Wed, Aug 10, 2011 at 6:17 PM, Matthew Brett wrote: > Hi, > > On Wed, Aug 10, 2011 at 12:38 PM, Skipper Seabold wrote: >> On Wed, Aug 10, 2011 at 3:28 PM, Matthew Brett wrote: >>> Hi, >>> >>> I think this one might be for Pauli. >>> >>> I've run into an odd problem that seems to be an interaction of >>> numpydoc and autosummary and large classes. >>> >>> In summary, large classes and numpydoc lead to large tables of class >>> methods, and there seems to be an error in the creation of the large >>> tables in latex. >>> >>> Specifically, if I run 'make latexpdf' with the attached minimal >>> sphinx setup, I get a pdflatex error ending thus: >>> >>> ... >>> l.118 \begin{longtable}{LL} >>> >>> and this is because longtable does not accept LL as an argument, but >>> needs '|l|l|' (bar - el - bar - el - bar). >>> >>> I see in sphinx.writers.latex.py, around line 657, that sphinx knows >>> about this in general, and long tables in standard ReST work fine with >>> the el-bar arguments passed to longtable. >>> >>> ? ? ? ?if self.table.colspec: >>> ? ? ? ? ? ?self.body.append(self.table.colspec) >>> ? ? ? ?else: >>> ? ? ? ? ? ?if self.table.has_problematic: >>> ? ? ? ? ? ? ? ?colwidth = 0.95 / self.table.colcount >>> ? ? ? ? ? ? ? ?colspec = ('p{%.3f\\linewidth}|' % colwidth) * \ >>> ? ? ? ? ? ? ? ? ? ? ? ? ?self.table.colcount >>> ? ? ? ? ? ? ? ?self.body.append('{|' + colspec + '}\n') >>> ? ? ? ? ? ?elif self.table.longtable: >>> ? ? ? ? ? ? ? ?self.body.append('{|' + ('l|' * self.table.colcount) + '}\n') >>> ? ? ? ? ? ?else: >>> ? ? ? ? ? ? ? ?self.body.append('{|' + ('L|' * self.table.colcount) + '}\n') >>> >>> However, using numpydoc and autosummary (see the conf.py file), what >>> seems to happen is that, when we reach the self.table.colspec test at >>> the beginning of the snippet above, 'self.table.colspec' is defined: >>> >>> In [1]: self.table.colspec >>> Out[1]: '{LL}\n' >>> >>> and thus the LL gets written as the arg to longtable: >>> >>> \begin{longtable}{LL} >>> >>> and the pdf build breaks. >>> >>> I'm using the numpydoc out of the current numpy source tree. >>> >>> At that point I wasn't sure how to proceed with debugging. ?Can you >>> give any hints? >>> >> >> It's not a proper fix, but our workaround is to edit the Makefile for >> latex (and latexpdf) to >> >> https://github.com/statsmodels/statsmodels/blob/master/scikits/statsmodels/docs/Makefile#L94 >> https://github.com/statsmodels/statsmodels/blob/master/scikits/statsmodels/docs/make.bat#L121 >> >> to call the script to replace the longtable arguments >> >> https://github.com/statsmodels/statsmodels/blob/master/scikits/statsmodels/docs/fix_longtable.py >> >> The workaround itself probably isn't optimal, and I'd be happy to hear >> of a proper fix. > > Thanks - yes - I found your workaround in my explorations, I put in a > version in our tree too: > > https://github.com/matthew-brett/nipy/blob/latex_build_fixes/tools/fix_longtable.py > > ?- but I agree it seems much better to get to the root cause. When I tried to figure this out, I never found out why the correct sphinx longtable code path never gets reached, or which code (numpydoc, autosummary or sphinx) is filling in the colspec. 
Josef > > See you, > > Matthew > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From matthew.brett at gmail.com Wed Aug 10 20:17:26 2011 From: matthew.brett at gmail.com (Matthew Brett) Date: Wed, 10 Aug 2011 17:17:26 -0700 Subject: [Numpy-discussion] numpydoc - latex longtables error In-Reply-To: References: Message-ID: Hi, On Wed, Aug 10, 2011 at 5:03 PM, wrote: > On Wed, Aug 10, 2011 at 6:17 PM, Matthew Brett wrote: >> Hi, >> >> On Wed, Aug 10, 2011 at 12:38 PM, Skipper Seabold wrote: >>> On Wed, Aug 10, 2011 at 3:28 PM, Matthew Brett wrote: >>>> Hi, >>>> >>>> I think this one might be for Pauli. >>>> >>>> I've run into an odd problem that seems to be an interaction of >>>> numpydoc and autosummary and large classes. >>>> >>>> In summary, large classes and numpydoc lead to large tables of class >>>> methods, and there seems to be an error in the creation of the large >>>> tables in latex. >>>> >>>> Specifically, if I run 'make latexpdf' with the attached minimal >>>> sphinx setup, I get a pdflatex error ending thus: >>>> >>>> ... >>>> l.118 \begin{longtable}{LL} >>>> >>>> and this is because longtable does not accept LL as an argument, but >>>> needs '|l|l|' (bar - el - bar - el - bar). >>>> >>>> I see in sphinx.writers.latex.py, around line 657, that sphinx knows >>>> about this in general, and long tables in standard ReST work fine with >>>> the el-bar arguments passed to longtable. >>>> >>>> ? ? ? ?if self.table.colspec: >>>> ? ? ? ? ? ?self.body.append(self.table.colspec) >>>> ? ? ? ?else: >>>> ? ? ? ? ? ?if self.table.has_problematic: >>>> ? ? ? ? ? ? ? ?colwidth = 0.95 / self.table.colcount >>>> ? ? ? ? ? ? ? ?colspec = ('p{%.3f\\linewidth}|' % colwidth) * \ >>>> ? ? ? ? ? ? ? ? ? ? ? ? ?self.table.colcount >>>> ? ? ? ? ? ? ? ?self.body.append('{|' + colspec + '}\n') >>>> ? ? ? ? ? ?elif self.table.longtable: >>>> ? ? ? ? ? ? ? ?self.body.append('{|' + ('l|' * self.table.colcount) + '}\n') >>>> ? ? ? ? ? ?else: >>>> ? ? ? ? ? ? ? ?self.body.append('{|' + ('L|' * self.table.colcount) + '}\n') >>>> >>>> However, using numpydoc and autosummary (see the conf.py file), what >>>> seems to happen is that, when we reach the self.table.colspec test at >>>> the beginning of the snippet above, 'self.table.colspec' is defined: >>>> >>>> In [1]: self.table.colspec >>>> Out[1]: '{LL}\n' >>>> >>>> and thus the LL gets written as the arg to longtable: >>>> >>>> \begin{longtable}{LL} >>>> >>>> and the pdf build breaks. >>>> >>>> I'm using the numpydoc out of the current numpy source tree. >>>> >>>> At that point I wasn't sure how to proceed with debugging. ?Can you >>>> give any hints? >>>> >>> >>> It's not a proper fix, but our workaround is to edit the Makefile for >>> latex (and latexpdf) to >>> >>> https://github.com/statsmodels/statsmodels/blob/master/scikits/statsmodels/docs/Makefile#L94 >>> https://github.com/statsmodels/statsmodels/blob/master/scikits/statsmodels/docs/make.bat#L121 >>> >>> to call the script to replace the longtable arguments >>> >>> https://github.com/statsmodels/statsmodels/blob/master/scikits/statsmodels/docs/fix_longtable.py >>> >>> The workaround itself probably isn't optimal, and I'd be happy to hear >>> of a proper fix. 
>> >> Thanks - yes - I found your workaround in my explorations, I put in a >> version in our tree too: >> >> https://github.com/matthew-brett/nipy/blob/latex_build_fixes/tools/fix_longtable.py >> >> ?- but I agree it seems much better to get to the root cause. > > When I tried to figure this out, I never found out why the correct > sphinx longtable code path never gets reached, or which code > (numpydoc, autosummary or sphinx) is filling in the colspec. No - it looked hard to debug. I established that it required numpydoc and autosummary to be enabled. See you, Matthew From yoyoq at yahoo.com Wed Aug 10 20:50:00 2011 From: yoyoq at yahoo.com (jp d) Date: Wed, 10 Aug 2011 17:50:00 -0700 (PDT) Subject: [Numpy-discussion] matrix inversion Message-ID: <1313023800.18375.YahooMailNeo@web130115.mail.mud.yahoo.com> hi, i am trying to invert matrices like this: [[ 0.01643777 -0.13539939? 0.11946689] ?[ 0.12479926? 0.01210898 -0.09217618] ?[-0.13050087? 0.07575163? 0.01144993]] in perl using Math::MatrixReal; and in various online calculators i get [? 2.472715991745? 3.680743681735 -3.831392002314 ] [ -4.673105249083 -5.348238625096 -5.703193038649 ] [? 2.733966489601 -6.567940452290 -5.936617926811 ] using python , numpy and linalg.inv (or linalg.pinv) i get? a divergent answer [[? 6.79611151e+07?? 1.01163031e+08?? 1.05303510e+08] ?[? 1.01163057e+08?? 1.50585545e+08?? 1.56748838e+08] ?[? 1.05303548e+08?? 1.56748831e+08?? 1.63164381e+08]] any suggestions? thanks jpd -------------- next part -------------- An HTML attachment was scrubbed... URL: From alan.isaac at gmail.com Wed Aug 10 21:06:41 2011 From: alan.isaac at gmail.com (Alan G Isaac) Date: Wed, 10 Aug 2011 21:06:41 -0400 Subject: [Numpy-discussion] matrix inversion In-Reply-To: <1313023800.18375.YahooMailNeo@web130115.mail.mud.yahoo.com> References: <1313023800.18375.YahooMailNeo@web130115.mail.mud.yahoo.com> Message-ID: <4E432B21.607@gmail.com> On 8/10/2011 8:50 PM, jp d wrote: > i am trying to invert matrices like this: > [[ 0.01643777 -0.13539939 0.11946689] > [ 0.12479926 0.01210898 -0.09217618] > [-0.13050087 0.07575163 0.01144993]] > in perl using Math::MatrixReal; > and in various online calculators i get > [ 2.472715991745 3.680743681735 -3.831392002314 ] > [ -4.673105249083 -5.348238625096 -5.703193038649 ] > [ 2.733966489601 -6.567940452290 -5.936617926811 ] > using python , numpy and linalg.inv (or linalg.pinv) i get a divergent answer > [[ 6.79611151e+07 1.01163031e+08 1.05303510e+08] > [ 1.01163057e+08 1.50585545e+08 1.56748838e+08] > [ 1.05303548e+08 1.56748831e+08 1.63164381e+08]] Please demonstrate with code:: >>> m = np.mat([[ 0.01643777,-0.13539939, 0.11946689],[ 0.12479926, 0.01210898,-0.09217618 ],[-0.13050087, 0.07575163, 0.01144993]]) >>> m.I matrix([[ -2.60023901e+08, -3.87056678e+08, -4.02898472e+08], [ -3.87056814e+08, -5.76150592e+08, -5.99731775e+08], [ -4.02898597e+08, -5.99731775e+08, -6.24278108e+08]]) Thank you, Alan Isaac From nadavh at visionsense.com Thu Aug 11 00:21:34 2011 From: nadavh at visionsense.com (Nadav Horesh) Date: Wed, 10 Aug 2011 21:21:34 -0700 Subject: [Numpy-discussion] matrix inversion In-Reply-To: <1313023800.18375.YahooMailNeo@web130115.mail.mud.yahoo.com> References: <1313023800.18375.YahooMailNeo@web130115.mail.mud.yahoo.com> Message-ID: <26FC23E7C398A64083C980D16001012D246DFC5F87@VA3DIAXVS361.RED001.local> The matrix in singular, so you can not expect a stable inverse. Nadav. 
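A small check with plain numpy makes the near-singularity visible (numbers approximate):

import numpy as np

a = np.array([[ 0.01643777, -0.13539939,  0.11946689],
              [ 0.12479926,  0.01210898, -0.09217618],
              [-0.13050087,  0.07575163,  0.01144993]])

print np.linalg.svd(a, compute_uv=False)  # smallest singular value is ~7e-10
print np.linalg.cond(a)                   # ~3e8, so inv() mostly amplifies rounding noise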
________________________________ From: numpy-discussion-bounces at scipy.org [numpy-discussion-bounces at scipy.org] On Behalf Of jp d [yoyoq at yahoo.com] Sent: 11 August 2011 03:50 To: numpy-discussion at scipy.org Subject: [Numpy-discussion] matrix inversion hi, i am trying to invert matrices like this: [[ 0.01643777 -0.13539939 0.11946689] [ 0.12479926 0.01210898 -0.09217618] [-0.13050087 0.07575163 0.01144993]] in perl using Math::MatrixReal; and in various online calculators i get [ 2.472715991745 3.680743681735 -3.831392002314 ] [ -4.673105249083 -5.348238625096 -5.703193038649 ] [ 2.733966489601 -6.567940452290 -5.936617926811 ] using python , numpy and linalg.inv (or linalg.pinv) i get a divergent answer [[ 6.79611151e+07 1.01163031e+08 1.05303510e+08] [ 1.01163057e+08 1.50585545e+08 1.56748838e+08] [ 1.05303548e+08 1.56748831e+08 1.63164381e+08]] any suggestions? thanks jpd -------------- next part -------------- An HTML attachment was scrubbed... URL: From focke at slac.stanford.edu Thu Aug 11 00:42:33 2011 From: focke at slac.stanford.edu (Warren Focke) Date: Wed, 10 Aug 2011 21:42:33 -0700 (PDT) Subject: [Numpy-discussion] matrix inversion In-Reply-To: <26FC23E7C398A64083C980D16001012D246DFC5F87@VA3DIAXVS361.RED001.local> References: <1313023800.18375.YahooMailNeo@web130115.mail.mud.yahoo.com> <26FC23E7C398A64083C980D16001012D246DFC5F87@VA3DIAXVS361.RED001.local> Message-ID: The svs are 1.99991695e-01, 1.99991682e-01, 6.84719250e-10 so if you try >>> np.linalg.pinv(a,1e-5) array([[ 0.41097834, 3.12024106, -3.26279309], [-3.38526587, 0.30274957, 1.89394811], [ 2.98692033, -2.30459609, 0.28627222]]) you at least get an answer that's not near-random. w On Wed, 10 Aug 2011, Nadav Horesh wrote: > The matrix in singular, so you can not expect a stable inverse. > > Nadav. > > ________________________________ > From: numpy-discussion-bounces at scipy.org [numpy-discussion-bounces at scipy.org] On Behalf Of jp d [yoyoq at yahoo.com] > Sent: 11 August 2011 03:50 > To: numpy-discussion at scipy.org > Subject: [Numpy-discussion] matrix inversion > > hi, > i am trying to invert matrices like this: > [[ 0.01643777 -0.13539939 0.11946689] > [ 0.12479926 0.01210898 -0.09217618] > [-0.13050087 0.07575163 0.01144993]] > > in perl using Math::MatrixReal; > and in various online calculators i get > [ 2.472715991745 3.680743681735 -3.831392002314 ] > [ -4.673105249083 -5.348238625096 -5.703193038649 ] > [ 2.733966489601 -6.567940452290 -5.936617926811 ] > > using python , numpy and linalg.inv (or linalg.pinv) i get a divergent answer > [[ 6.79611151e+07 1.01163031e+08 1.05303510e+08] > [ 1.01163057e+08 1.50585545e+08 1.56748838e+08] > [ 1.05303548e+08 1.56748831e+08 1.63164381e+08]] > > any suggestions? > > thanks > jpd > From lkb.teichmann at gmail.com Thu Aug 11 02:41:06 2011 From: lkb.teichmann at gmail.com (Martin Teichmann) Date: Thu, 11 Aug 2011 08:41:06 +0200 Subject: [Numpy-discussion] matrix inversion In-Reply-To: <1313023800.18375.YahooMailNeo@web130115.mail.mud.yahoo.com> References: <1313023800.18375.YahooMailNeo@web130115.mail.mud.yahoo.com> Message-ID: Hi, > i am trying to invert matrices like this: > [[ 0.01643777 -0.13539939? 0.11946689] > ?[ 0.12479926? 0.01210898 -0.09217618] > ?[-0.13050087? 0.07575163? 0.01144993]] > > in perl using Math::MatrixReal; > and in various online calculators i get > [? 2.472715991745? 3.680743681735 -3.831392002314 ] > [ -4.673105249083 -5.348238625096 -5.703193038649 ] > [? 
2.733966489601 -6.567940452290 -5.936617926811 ] well, inverting the latter matrix, I get >>> n=np.mat([[ 2.472715991745 , 3.680743681735 ,-3.831392002314 ], [ -4.673105249083, -5.348238625096, -5.703193038649 ], [ 2.733966489601, -6.567940452290, -5.936617926811 ]]) >>> n.I matrix([[ 0.01643777, -0.13539939, 0.11946689], [ 0.12479926, 0.01210898, -0.09217618], [-0.13050087, -0.07575163, -0.01144993]]) Which is nearly the same matrix as the one you started with, but not quite: there are some extra minus signs in the last two values... are you sure you didn't drop them? Adding them by hand gives nearly the same inverse as your Perl result - better, in fact: the residuals on the off-diagonals are significantly lower. Greetings Martin From keith.hughitt at gmail.com Thu Aug 11 08:59:57 2011 From: keith.hughitt at gmail.com (Keith Hughitt) Date: Thu, 11 Aug 2011 08:59:57 -0400 Subject: [Numpy-discussion] Returning ndimage subclass instances from scipy methods? Message-ID: Hi all, Does anyone know if it is possible to have SciPy methods which work on/return ndarray instances return subclass instances instead? For example, I can pass in an instance of an ndarray subclass to methods in scipy.ndimage, but a normal ndarray is returned instead of a new subclass instance. Wrapping the result in __array_wrap__ and __array_finalize__ has not seemed to help. I also tried asking on the scipy-users list, but so far have not gotten any response. Any suggestions? Keith -------------- next part -------------- An HTML attachment was scrubbed... URL: From dhanjal at telecom-paristech.fr Thu Aug 11 09:23:22 2011 From: dhanjal at telecom-paristech.fr (dhanjal at telecom-paristech.fr) Date: Thu, 11 Aug 2011 15:23:22 +0200 Subject: [Numpy-discussion] SVD does not converge on "clean" matrix Message-ID: <203aa1b32d794c238d32cb8d29036cc2.squirrel@webmail1.telecom-paristech.fr> Hi all, I get an error message "numpy.linalg.linalg.LinAlgError: SVD did not converge" when calling numpy.linalg.svd on a "clean" matrix of size (1952, 895). The matrix is clean in the sense that it contains no NaN or Inf values. The corresponding npz file is available here: https://docs.google.com/leaf?id=0Bw0NXKxxc40jMWEyNTljMWUtMzBmNS00NGZmLThhZWUtY2I2MWU2MGZiNDgx&hl=fr Here is some information about my setup: I use Python 2.7.1 on Ubuntu 11.04 with numpy 1.6.1. Furthermore, I thought the problem might be solved by recompiling numpy with my local ATLAS library (version 3.8.3), and this didn't seem to help. On another machine with Python 2.7.1 and numpy 1.5.1 the SVD does converge; however, it contains 1 NaN singular value and 3 negative singular values of the order -10^-1 (singular values should always be non-negative). I also tried computing the SVD of the matrix using Octave 3.2.4 and Matlab 7.10.0.499 (R2010a) 64-bit (glnxa64) and there were no problems. Any help is greatly appreciated. Thanks in advance, Charanpal From shish at keba.be Thu Aug 11 09:37:45 2011 From: shish at keba.be (Olivier Delalleau) Date: Thu, 11 Aug 2011 09:37:45 -0400 Subject: [Numpy-discussion] bug with assignment into an indexed array? In-Reply-To: References: Message-ID: Maybe confusing, but working as expected.
When you write: matched_to[:3][match] = 3 it first calls __getitem__ with the slice as argument, which returns a view of your array, then it calls __setitem__ on this view, and it fills your matched_to array at the same time. But when you write: matched_to[np.array([0, 1, 2])][match] = 3 it first calls __getitem__ with the array as argument, which retunrs a *copy* of your array, so that calling __setitem__ on this copy has no effect on your original array. -=- Olivier 2011/8/10 Benjamin Root > Came across this today when trying to determine what was wrong with my > code: > > import numpy as np > matched_to = np.array([-1] * 5) > in_ellipse = np.array([False, True, True, True, False]) > match = np.array([False, True, True]) > matched_to[in_ellipse][match] = 3 > > I would expect matched_to to now be "array([-1, -1, 3, 3, -1])", but > instead, it is still all -1. > > It would seem that unless the view was created by a slice, then the > assignment into the indexed view would not work as expected. This works: > > >>> matched_to[:3][match] = 3 > > but not: > > >>> matched_to[np.array([0, 1, 2])][match] = 3 > > Note that the following does work: > > >>> matched_to[np.array([0, 1, 2])] = 3 > > Is this a bug, or was I wrong to expect this to work this way? > > Thanks, > Ben Root > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From nadavh at visionsense.com Thu Aug 11 10:21:09 2011 From: nadavh at visionsense.com (Nadav Horesh) Date: Thu, 11 Aug 2011 07:21:09 -0700 Subject: [Numpy-discussion] SVD does not converge on "clean" matrix In-Reply-To: <203aa1b32d794c238d32cb8d29036cc2.squirrel@webmail1.telecom-paristech.fr> References: <203aa1b32d794c238d32cb8d29036cc2.squirrel@webmail1.telecom-paristech.fr> Message-ID: <26FC23E7C398A64083C980D16001012D246DFC5F90@VA3DIAXVS361.RED001.local> Had no problem on a gentoo 64 bit machine using atlas 3.8.0 (Core I7, python 2.7.2, numpy versions1.60 and 1.6.1) Nadav ________________________________________ From: numpy-discussion-bounces at scipy.org [numpy-discussion-bounces at scipy.org] On Behalf Of dhanjal at telecom-paristech.fr [dhanjal at telecom-paristech.fr] Sent: 11 August 2011 16:23 To: numpy-discussion at scipy.org Subject: [Numpy-discussion] SVD does not converge on "clean" matrix Hi all, I get an error message "numpy.linalg.linalg.LinAlgError: SVD did not converge" when calling numpy.linalg.svd on a "clean" matrix of size (1952, 895). The matrix is clean in the sense that it contains no NaN or Inf values. The corresponding npz file is available here: https://docs.google.com/leaf?id=0Bw0NXKxxc40jMWEyNTljMWUtMzBmNS00NGZmLThhZWUtY2I2MWU2MGZiNDgx&hl=fr Here is some information about my setup: I use Python 2.7.1 on Ubuntu 11.04 with numpy 1.6.1. Furthermore, I thought the problem might be solved by recompiling numpy with my local ATLAS library (version 3.8.3), and this didn't seem to help. On another machine with Python 2.7.1 and numpy 1.5.1 the SVD does converge however it contains 1 NaN singular value and 3 negative singular values of the order -10^-1 (singular values should always be non-negative). I also tried computing the SVD of the matrix using Octave 3.2.4 and Matlab 7.10.0.499 (R2010a) 64-bit (glnxa64) and there were no problems. Any help is greatly appreciated. 
Thanks in advance, Charanpal _______________________________________________ NumPy-Discussion mailing list NumPy-Discussion at scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion From ben.root at ou.edu Thu Aug 11 11:16:37 2011 From: ben.root at ou.edu (Benjamin Root) Date: Thu, 11 Aug 2011 10:16:37 -0500 Subject: [Numpy-discussion] bug with assignment into an indexed array? In-Reply-To: References: Message-ID: On Thu, Aug 11, 2011 at 8:37 AM, Olivier Delalleau wrote: > Maybe confusing, but working as expected. > > > When you write: > matched_to[np.array([0, 1, 2])] = 3 > it calls __setitem__ on matched_to, with arguments (np.array([0, 1, 2]), > 3). So numpy understand you want to write 3 at these indices. > > > When you write: > matched_to[:3][match] = 3 > it first calls __getitem__ with the slice as argument, which returns a view > of your array, then it calls __setitem__ on this view, and it fills your > matched_to array at the same time. > > > But when you write: > matched_to[np.array([0, 1, 2])][match] = 3 > it first calls __getitem__ with the array as argument, which retunrs a > *copy* of your array, so that calling __setitem__ on this copy has no effect > on your original array. > > -=- Olivier > > Right, but I guess my question is does it *have* to be that way? I guess it makes some sense with respect to indexing with a numpy array like I did with the last example, because an element could be referred to multiple times (which explains the common surprise with '+='), but with boolean indexing, we are guaranteed that each element of the view will appear at most once. Therefore, shouldn't boolean indexing always return a view, not a copy? Is the general case of arbitrary array selection inherently impossible to encode in a view versus a slice with a regular spacing? Thanks, Ben Root -------------- next part -------------- An HTML attachment was scrubbed... URL: From shish at keba.be Thu Aug 11 11:33:41 2011 From: shish at keba.be (Olivier Delalleau) Date: Thu, 11 Aug 2011 11:33:41 -0400 Subject: [Numpy-discussion] bug with assignment into an indexed array? In-Reply-To: References: Message-ID: 2011/8/11 Benjamin Root > > > On Thu, Aug 11, 2011 at 8:37 AM, Olivier Delalleau wrote: > >> Maybe confusing, but working as expected. >> >> >> When you write: >> matched_to[np.array([0, 1, 2])] = 3 >> it calls __setitem__ on matched_to, with arguments (np.array([0, 1, 2]), >> 3). So numpy understand you want to write 3 at these indices. >> >> >> When you write: >> matched_to[:3][match] = 3 >> it first calls __getitem__ with the slice as argument, which returns a >> view of your array, then it calls __setitem__ on this view, and it fills >> your matched_to array at the same time. >> >> >> But when you write: >> matched_to[np.array([0, 1, 2])][match] = 3 >> it first calls __getitem__ with the array as argument, which retunrs a >> *copy* of your array, so that calling __setitem__ on this copy has no effect >> on your original array. >> >> -=- Olivier >> >> > Right, but I guess my question is does it *have* to be that way? I guess > it makes some sense with respect to indexing with a numpy array like I did > with the last example, because an element could be referred to multiple > times (which explains the common surprise with '+='), but with boolean > indexing, we are guaranteed that each element of the view will appear at > most once. Therefore, shouldn't boolean indexing always return a view, not > a copy? 
Is the general case of arbitrary array selection inherently > impossible to encode in a view versus a slice with a regular spacing? > Yes, due to the fact the array interface only supports regular spacing (otherwise it is more difficult to get efficient access to arbitrary array positions). -=- Olivier -------------- next part -------------- An HTML attachment was scrubbed... URL: From tmp50 at ukr.net Thu Aug 11 12:02:00 2011 From: tmp50 at ukr.net (Dmitrey) Date: Thu, 11 Aug 2011 19:02:00 +0300 Subject: [Numpy-discussion] bug with latest numpy git snapshot build with Python3 Message-ID: bug in KUBUNTU 11.04, latest numpy git snapshot build with Python3 >>> import numpy Traceback (most recent call last): File "", line 1, in File "/usr/local/lib/python3.2/dist-packages/numpy/__init__.py", line 137, in from . import add_newdocs File "/usr/local/lib/python3.2/dist-packages/numpy/add_newdocs.py", line 9, in from numpy.lib import add_newdoc File "/usr/local/lib/python3.2/dist-packages/numpy/lib/__init__.py", line 4, in from .type_check import * File "/usr/local/lib/python3.2/dist-packages/numpy/lib/type_check.py", line 8, in import numpy.core.numeric as _nx File "/usr/local/lib/python3.2/dist-packages/numpy/core/__init__.py", line 10, in from .numeric import * File "/usr/local/lib/python3.2/dist-packages/numpy/core/numeric.py", line 27, in import multiarray ImportError: No module named multiarray -------------- next part -------------- An HTML attachment was scrubbed... URL: From wesmckinn at gmail.com Thu Aug 11 14:25:07 2011 From: wesmckinn at gmail.com (Wes McKinney) Date: Thu, 11 Aug 2011 14:25:07 -0400 Subject: [Numpy-discussion] Questionable reduceat behavior Message-ID: I'm a little perplexed why reduceat was made to behave like this: In [26]: arr = np.ones((10, 4), dtype=bool) In [27]: arr Out[27]: array([[ True, True, True, True], [ True, True, True, True], [ True, True, True, True], [ True, True, True, True], [ True, True, True, True], [ True, True, True, True], [ True, True, True, True], [ True, True, True, True], [ True, True, True, True], [ True, True, True, True]], dtype=bool) In [30]: np.add.reduceat(arr, [0, 3, 3, 7, 9], axis=0) Out[30]: array([[3, 3, 3, 3], [1, 1, 1, 1], [4, 4, 4, 4], [2, 2, 2, 2], [1, 1, 1, 1]]) this does not seem intuitively correct. Since we have: In [33]: arr[3:3].sum(0) Out[33]: array([0, 0, 0, 0]) I would expect array([[3, 3, 3, 3], [0, 0, 0, 0], [4, 4, 4, 4], [2, 2, 2, 2], [1, 1, 1, 1]]) Obviously I can RTFM and see why it does this ("if ``indices[i] >= indices[i + 1]``, the i-th generalized "row" is simply ``a[indices[i]]``"), but it doesn't make much sense to me, and I need work around it. Suggestions? From rowen at uw.edu Thu Aug 11 14:50:14 2011 From: rowen at uw.edu (Russell E. Owen) Date: Thu, 11 Aug 2011 11:50:14 -0700 Subject: [Numpy-discussion] Efficient way to load a 1Gb file? References: Message-ID: In article , Anne Archibald wrote: > There was also some work on a semi-mutable array type that allowed > appending along one axis, then 'freezing' to yield a normal numpy > array (unfortunately I'm not sure how to find it in the mailing list > archives). One could write such a setup by hand, using mmap() or > realloc(), but I'd be inclined to simply write a filter that converted > the text file to some sort of binary file on the fly, value by value. > Then the file can be loaded in or mmap()ed. A 1 Gb text file is a > miserable object anyway, so it might be desirable to convert to (say) > HDF5 and then throw away the text file. 
Thank you and the others for your help. It seems a shame that loadtxt has no argument for predicted length, which would allow preallocation and less appending/copying data. And yes...reading the whole file first to figure out how many elements it has seems sensible to me -- at least as a switchable behavior, and preferably the default. 1Gb isn't that large in modern systems, but loadtxt is filing up all 6Gb of RAM reading it! I'll suggest the HDF5 solution to my colleague. Meanwhile I think he's hacked around the problem by reading the file through once to figure out the array length, allocating that, and reading the data in with a Python loop. Sounds slow, but it's working. -- Russell From ben.root at ou.edu Thu Aug 11 16:37:26 2011 From: ben.root at ou.edu (Benjamin Root) Date: Thu, 11 Aug 2011 15:37:26 -0500 Subject: [Numpy-discussion] bug with assignment into an indexed array? In-Reply-To: References: Message-ID: On Thu, Aug 11, 2011 at 10:33 AM, Olivier Delalleau wrote: > 2011/8/11 Benjamin Root > >> >> >> On Thu, Aug 11, 2011 at 8:37 AM, Olivier Delalleau wrote: >> >>> Maybe confusing, but working as expected. >>> >>> >>> When you write: >>> matched_to[np.array([0, 1, 2])] = 3 >>> it calls __setitem__ on matched_to, with arguments (np.array([0, 1, 2]), >>> 3). So numpy understand you want to write 3 at these indices. >>> >>> >>> When you write: >>> matched_to[:3][match] = 3 >>> it first calls __getitem__ with the slice as argument, which returns a >>> view of your array, then it calls __setitem__ on this view, and it fills >>> your matched_to array at the same time. >>> >>> >>> But when you write: >>> matched_to[np.array([0, 1, 2])][match] = 3 >>> it first calls __getitem__ with the array as argument, which retunrs a >>> *copy* of your array, so that calling __setitem__ on this copy has no effect >>> on your original array. >>> >>> -=- Olivier >>> >>> >> Right, but I guess my question is does it *have* to be that way? I guess >> it makes some sense with respect to indexing with a numpy array like I did >> with the last example, because an element could be referred to multiple >> times (which explains the common surprise with '+='), but with boolean >> indexing, we are guaranteed that each element of the view will appear at >> most once. Therefore, shouldn't boolean indexing always return a view, not >> a copy? Is the general case of arbitrary array selection inherently >> impossible to encode in a view versus a slice with a regular spacing? >> > > Yes, due to the fact the array interface only supports regular spacing > (otherwise it is more difficult to get efficient access to arbitrary array > positions). > > -=- Olivier > > This still bothers me, though. I imagine that it is next to impossible to detect this situation from numpy's perspective, so it can't even emit a warning or error. Furthermore, for someone who makes a general function to modify the contents of some externally provided array, there is a possibility that the provided array is actually a copy not a view. Although, I guess it is the responsibility of the user to know the difference. I guess that is the key problem. The key advantage we are taught about numpy arrays is the use of views for efficient access. It would seem that most access operations would use it, but in reality, only sliced access do. Everything else is a copy (unless you are doing fancy indexing with assignment). 
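For what it is worth, the assignment itself can always be collapsed into a single indexing operation, which does hit the original array - a rough sketch for the example that started this thread:

import numpy as np

matched_to = np.array([-1] * 5)
in_ellipse = np.array([False, True, True, True, False])
match = np.array([False, True, True])

# turn the two chained selections into one set of absolute indices,
# then assign through a single __setitem__ call
idx = np.flatnonzero(in_ellipse)[match]
matched_to[idx] = 3
print matched_to   # [-1 -1  3  3 -1]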
Maybe with some of the forthcoming changes that have been done with respect to nditer and ufuncs (in particular, I am thinking of the "where" kwarg), maybe we could consider an enhancement allowing fancy indexing (or at least boolean indexing) to produce a view? Even if it is less efficient than a view from slicing, it would bring better consistency in behavior between the different forms of indexing. Just my 2 cents, Ben Root -------------- next part -------------- An HTML attachment was scrubbed... URL: From borreguero at gmail.com Thu Aug 11 19:43:12 2011 From: borreguero at gmail.com (Jose Borreguero) Date: Thu, 11 Aug 2011 19:43:12 -0400 Subject: [Numpy-discussion] how to create a block diagonal matrix by repeating the block? Message-ID: Dear numpy users, I have a 3x3 matrix which I want to repeat 50 times along a diagonal, thus creating a 150x150 block diagonal matrix. I know of a method usin scipy.linalg.block_diag, but I don't know if this is the best one: a = random.randn(3,3) b = a.reshape(1,3,3).repeat(50,axis=0) scipy.linalg.block_diag( *b ) Jose -------------- next part -------------- An HTML attachment was scrubbed... URL: From fperez.net at gmail.com Thu Aug 11 20:15:21 2011 From: fperez.net at gmail.com (Fernando Perez) Date: Thu, 11 Aug 2011 17:15:21 -0700 Subject: [Numpy-discussion] how to create a block diagonal matrix by repeating the block? In-Reply-To: References: Message-ID: On Thu, Aug 11, 2011 at 4:43 PM, Jose Borreguero wrote: > a = random.randn(3,3) > b = a.reshape(1,3,3).repeat(50,axis=0) > scipy.linalg.block_diag( *b ) > slightly simpler, but equivalent, code: b = [a]*50 scipy.linalg.block_diag( *b) Cheers, f From warren.weckesser at enthought.com Thu Aug 11 22:01:03 2011 From: warren.weckesser at enthought.com (Warren Weckesser) Date: Thu, 11 Aug 2011 21:01:03 -0500 Subject: [Numpy-discussion] how to create a block diagonal matrix by repeating the block? In-Reply-To: References: Message-ID: On Thu, Aug 11, 2011 at 7:15 PM, Fernando Perez wrote: > On Thu, Aug 11, 2011 at 4:43 PM, Jose Borreguero > wrote: > > a = random.randn(3,3) > > b = a.reshape(1,3,3).repeat(50,axis=0) > > scipy.linalg.block_diag( *b ) > > > > slightly simpler, but equivalent, code: > > b = [a]*50 > scipy.linalg.block_diag( *b) > > The following is unnecessarily complicated--using block_diag is fine--but it can be fun to stretch out into the fourth dimension with stride tricks: from numpy import array, zeros from numpy.lib.stride_tricks import as_strided # N is the number of 3x3 blocks. # N = 50 N = 4 a = array([[1,2,3],[4,5,6],[7,8,9]]) # b will be the block-diagonal array. b = zeros((3*N, 3*N), dtype=a.dtype) bstr = b.strides c = as_strided(b, shape=(N,N,3,3), strides=(3*bstr[0], 3*bstr[1], bstr[0], bstr[1])) # Assign a to the diagonal blocks. c[range(N), range(N)] = a print b Output: [[1 2 3 0 0 0 0 0 0 0 0 0] [4 5 6 0 0 0 0 0 0 0 0 0] [7 8 9 0 0 0 0 0 0 0 0 0] [0 0 0 1 2 3 0 0 0 0 0 0] [0 0 0 4 5 6 0 0 0 0 0 0] [0 0 0 7 8 9 0 0 0 0 0 0] [0 0 0 0 0 0 1 2 3 0 0 0] [0 0 0 0 0 0 4 5 6 0 0 0] [0 0 0 0 0 0 7 8 9 0 0 0] [0 0 0 0 0 0 0 0 0 1 2 3] [0 0 0 0 0 0 0 0 0 4 5 6] [0 0 0 0 0 0 0 0 0 7 8 9]] Warren Cheers, > > f > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... 
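Since every block is identical in this example, yet another option is a Kronecker product against an identity matrix - a short sketch:

import numpy as np

a = np.random.randn(3, 3)
N = 50
# kron places a copy of `a` wherever eye(N) has a 1, i.e. along the diagonal
big = np.kron(np.eye(N), a)
print big.shape   # (150, 150)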
URL: From Chris.Barker at noaa.gov Fri Aug 12 00:49:18 2011 From: Chris.Barker at noaa.gov (Chris Barker) Date: Thu, 11 Aug 2011 21:49:18 -0700 Subject: [Numpy-discussion] Efficient way to load a 1Gb file? In-Reply-To: References: Message-ID: <4E44B0CE.4070402@noaa.gov> On 8/10/2011 1:01 PM, Anne Archibald wrote: > There was also some work on a semi-mutable array type that allowed > appending along one axis, then 'freezing' to yield a normal numpy > array (unfortunately I'm not sure how to find it in the mailing list > archives). That was me, and here is the thread -- however, I'm on vacation, and don't have the test code, etc with me, but I found the core class. It's enclosed. >> The npyio routines (loadtxt as well as genfromtxt) first read in the entire data as lists, which creates of course significant overhead, but is not easy to circumvent, since numpy arrays are immutable - so you have to first store the numbers in some kind of mutable object. One could write a custom parser that tries to be somewhat more efficient, e.g. first reading in sub-arrays from a smaller buffer. Concatenating those sub-arrays would still require about twice the memory of the final array. I don't know if using the array.array type (which is mutable) is much more efficient than a list... Indeed, and are holding all the text as well, which is generally going to be bigger than the resulting numbers. Interesting, when I wrote accumulator, I found that it didn't, for the most part, have any performance advantage over accumlating on lists, then converting to arrays -- but there is a memory advantage, so this may be a good use case. you could do something like (untested): If your rows are all one dtype: X = accumulator(dtype=np.float32, block_shape = (num_cols,)) if they are not, then build a custon dtype to hold the rows, and use that: dt = np.dtype('%id'%num_columns) # create a dtype that holds a row #num_columns doubles in this case. # create an accumulator for that dtype X = accumulator(dtype=dt) # loop through the file to build the array: delimiter = ' ' for line in file(fname, 'r'): X.append ( np.array(line.split(delimiter), dtype=float) ) X = np.array(X) # gives a regular old array as a copy I note that converting to a regular array requires a data copy, which, if memoery is tight, might not be good. The solution would be to have a way to make a view, so you'd get a regular array from the same data (with maybe the extra buffer space) I'd like to see this calss get more mature, robust, and better performing, but so far it's worked for my use cases. Contributions welcome. -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: accumulator.py URL: From Chris.Barker at noaa.gov Fri Aug 12 00:51:37 2011 From: Chris.Barker at noaa.gov (Chris Barker) Date: Thu, 11 Aug 2011 21:51:37 -0700 Subject: [Numpy-discussion] Efficient way to load a 1Gb file? In-Reply-To: <4E44B0CE.4070402@noaa.gov> References: <4E44B0CE.4070402@noaa.gov> Message-ID: <4E44B159.1080505@noaa.gov> aarrgg! I cleaned up the doc string a bit, but didn't save before sending -- here it is again, Sorry about that. -Chris -- Christopher Barker, Ph.D. 
Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: accumulator.py URL: From dhanjal at telecom-paristech.fr Fri Aug 12 05:03:30 2011 From: dhanjal at telecom-paristech.fr (Charanpal Dhanjal) Date: Fri, 12 Aug 2011 11:03:30 +0200 Subject: [Numpy-discussion] SVD does not converge on "clean" matrix In-Reply-To: <203aa1b32d794c238d32cb8d29036cc2.squirrel@webmail1.telecom-paristech.fr> References: <203aa1b32d794c238d32cb8d29036cc2.squirrel@webmail1.telecom-paristech.fr> Message-ID: <06f59405fc0dfce9e04f83d001963a23@telecom-paristech.fr> Thank Nadav for testing out the matrix. I wonder if you had a chance to check if the resulting decomposition contained NaN or Inf values? As far I understood, numpy.linalg.svd uses routines in LAPACK and ATLAS (if available) to compute the corresponding SVD. I did some complementary tests on Debian Squeeze on an Intel Xeon W3550 CPU and the call to numpy.linalg.svd results in the LinAlgError "SVD did not converge", however the test leading to results containing NaN values ran on Debian Lenny on an Intel Core 2 Quad. In both of these situations we use Python 2.7.1 and numpy 1.5.1 (without ATLAS), and so the reasons for the differences seem to be OS or processor dependent. Any ideas? Charanpal Date: Thu, 11 Aug 2011 07:21:09 -0700 From: Nadav Horesh Subject: Re: [Numpy-discussion] SVD does not converge on "clean" matrix To: Discussion of Numerical Python Message-ID: <26FC23E7C398A64083C980D16001012D246DFC5F90 at VA3DIAXVS361.RED001.local> Content-Type: text/plain; charset="us-ascii" > Had no problem on a gentoo 64 bit machine using atlas 3.8.0 (Core I7, > python 2.7.2, numpy versions1.60 and 1.6.1) > > Nadav >On Thu, 11 Aug 2011 15:23:22 +0200, dhanjal at telecom-paristech.fr > wrote: >> Hi all, >> >> I get an error message "numpy.linalg.linalg.LinAlgError: SVD did not >> converge" when calling numpy.linalg.svd on a "clean" matrix of size >> (1952, >> 895). The matrix is clean in the sense that it contains no NaN or >> Inf >> values. The corresponding npz file is available here: >> >> https://docs.google.com/leaf?id=0Bw0NXKxxc40jMWEyNTljMWUtMzBmNS00NGZmLThhZWUtY2I2MWU2MGZiNDgx&hl=fr >> >> Here is some information about my setup: I use Python 2.7.1 on >> Ubuntu >> 11.04 with numpy 1.6.1. Furthermore, I thought the problem might be >> solved >> by recompiling numpy with my local ATLAS library (version 3.8.3), >> and this >> didn't seem to help. On another machine with Python 2.7.1 and numpy >> 1.5.1 >> the SVD does converge however it contains 1 NaN singular value and 3 >> negative singular values of the order -10^-1 (singular values should >> always be non-negative). >> >> I also tried computing the SVD of the matrix using Octave 3.2.4 and >> Matlab >> 7.10.0.499 (R2010a) 64-bit (glnxa64) and there were no problems. Any >> help >> is greatly appreciated. >> >> Thanks in advance, >> Charanpal From borreguero at gmail.com Fri Aug 12 07:53:29 2011 From: borreguero at gmail.com (Jose Borreguero) Date: Fri, 12 Aug 2011 07:53:29 -0400 Subject: [Numpy-discussion] how to create a block diagonal matrix by repeating the block? In-Reply-To: References: Message-ID: Thanks! 
Jose On Thu, Aug 11, 2011 at 8:15 PM, Fernando Perez wrote: > On Thu, Aug 11, 2011 at 4:43 PM, Jose Borreguero > wrote: > > a = random.randn(3,3) > > b = a.reshape(1,3,3).repeat(50,axis=0) > > scipy.linalg.block_diag( *b ) > > > > slightly simpler, but equivalent, code: > > b = [a]*50 > scipy.linalg.block_diag( *b) > > Cheers, > > f > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From andrea.gavana at gmail.com Fri Aug 12 09:32:05 2011 From: andrea.gavana at gmail.com (Andrea Gavana) Date: Fri, 12 Aug 2011 15:32:05 +0200 Subject: [Numpy-discussion] Statistical distributions on samples Message-ID: Hi All, I am working on something that appeared to be a no-brainer issue (at the beginning), by my complete ignorance in statistics is overwhelming and I got stuck. What I am trying to do can be summarized as follows Let's assume that I have to generate a sample of a 1,000 values for a variable (let's say, "velocity") using a normal distribution (but later I will have to do it with log-normal, triangular and a couple of others). The only thing I know about this velocity sample is the minimum and maximum values (let's say 50 and 200 respectively) and, obviously for the normal distribution (but not so for the other distributions), the mean value (125 in this case). Now, I would like to generate this sample of 1,000 points, in which none of the point has velocity smaller than 50 or bigger than 200, and the number of samples close to the mean (125) should be higher than the number of samples close to the minimum and the maximum, following some kind of normal distribution. What I have tried up to now is summarized in the code below, but as you can easily see, I don't really know what I am doing. I am open to every suggestion, and I apologize for the dumbness of my question. import numpy from scipy import stats import matplotlib.pyplot as plt minval, maxval = 50.0, 250.0 x = numpy.linspace(minval, maxval, 500) samp = stats.norm.rvs(size=len(x)) pdf = stats.norm.pdf(x) cdf = stats.norm.cdf(x) ppf = stats.norm.ppf(x) ax1 = plt.subplot(2, 2, 1) ax1.plot(range(len(x)), samp) ax2 = plt.subplot(2, 2, 2) ax2.plot(x, pdf) ax3 = plt.subplot(2, 2, 3) ax3.plot(x, cdf) ax4 = plt.subplot(2, 2, 4) ax4.plot(x, ppf) plt.show() Andrea. "Imagination Is The Only Weapon In The War Against Reality." http://xoomer.alice.it/infinity77/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From warren.weckesser at enthought.com Fri Aug 12 09:33:46 2011 From: warren.weckesser at enthought.com (Warren Weckesser) Date: Fri, 12 Aug 2011 08:33:46 -0500 Subject: [Numpy-discussion] SVD does not converge on "clean" matrix In-Reply-To: <06f59405fc0dfce9e04f83d001963a23@telecom-paristech.fr> References: <203aa1b32d794c238d32cb8d29036cc2.squirrel@webmail1.telecom-paristech.fr> <06f59405fc0dfce9e04f83d001963a23@telecom-paristech.fr> Message-ID: On Fri, Aug 12, 2011 at 4:03 AM, Charanpal Dhanjal < dhanjal at telecom-paristech.fr> wrote: > Thank Nadav for testing out the matrix. I wonder if you had a chance to > check if the resulting decomposition contained NaN or Inf values? > > As far I understood, numpy.linalg.svd uses routines in LAPACK and ATLAS > (if available) to compute the corresponding SVD. 
I did some > complementary tests on Debian Squeeze on an Intel Xeon W3550 CPU and the > call to numpy.linalg.svd results in the LinAlgError "SVD did not > converge", however the test leading to results containing NaN values ran > on Debian Lenny on an Intel Core 2 Quad. In both of these situations we > use Python 2.7.1 and numpy 1.5.1 (without ATLAS), and so the reasons for > the differences seem to be OS or processor dependent. Any ideas? > > Charanpal > > Date: Thu, 11 Aug 2011 07:21:09 -0700 > From: Nadav Horesh > Subject: Re: [Numpy-discussion] SVD does not converge on "clean" > matrix > To: Discussion of Numerical Python > Message-ID: > > <26FC23E7C398A64083C980D16001012D246DFC5F90 at VA3DIAXVS361.RED001.local> > Content-Type: text/plain; charset="us-ascii" > > > > Had no problem on a gentoo 64 bit machine using atlas 3.8.0 (Core I7, > > python 2.7.2, numpy versions1.60 and 1.6.1) > Another data point: on Mac OS X, with Python 2.7.2 and numpy 1.6.0 (using EPD 7.1), I get the error: $ ipython --pylab Enthought Python Distribution -- www.enthought.com Python 2.7.2 |EPD 7.1-1 (32-bit)| (default, Jul 3 2011, 15:40:35) Type "copyright", "credits" or "license" for more information. IPython 0.11.rc1 -- An enhanced Interactive Python. ? -> Introduction and overview of IPython's features. %quickref -> Quick reference. help -> Python's own help system. object? -> Details about 'object', use 'object??' for extra details. Welcome to pylab, a matplotlib-based Python environment [backend: WXAgg]. For more information, type 'help(pylab)'. In [1]: numpy.__version__ Out[1]: '1.6.0' In [2]: arr = load('matrix_leading_to_bad_SVD.npz')['arr_0'] In [3]: np.linalg.svd(arr) --------------------------------------------------------------------------- LinAlgError Traceback (most recent call last) /Users/warren/tmp/ in () ----> 1 np.linalg.svd(arr) /Library/Frameworks/Python.framework/Versions/7.1/lib/python2.7/site-packages/numpy/linalg/linalg.py in svd(a, full_matrices, compute_uv) 1319 work, lwork, iwork, 0) 1320 if results['info'] > 0: -> 1321 raise LinAlgError, 'SVD did not converge' 1322 s = s.astype(_realType(result_t)) 1323 if compute_uv: LinAlgError: SVD did not converge Warren > > > > Nadav > > >On Thu, 11 Aug 2011 15:23:22 +0200, dhanjal at telecom-paristech.fr > > wrote: > >> Hi all, > >> > >> I get an error message "numpy.linalg.linalg.LinAlgError: SVD did not > >> converge" when calling numpy.linalg.svd on a "clean" matrix of size > >> (1952, > >> 895). The matrix is clean in the sense that it contains no NaN or > >> Inf > >> values. The corresponding npz file is available here: > >> > >> > https://docs.google.com/leaf?id=0Bw0NXKxxc40jMWEyNTljMWUtMzBmNS00NGZmLThhZWUtY2I2MWU2MGZiNDgx&hl=fr > >> > >> Here is some information about my setup: I use Python 2.7.1 on > >> Ubuntu > >> 11.04 with numpy 1.6.1. Furthermore, I thought the problem might be > >> solved > >> by recompiling numpy with my local ATLAS library (version 3.8.3), > >> and this > >> didn't seem to help. On another machine with Python 2.7.1 and numpy > >> 1.5.1 > >> the SVD does converge however it contains 1 NaN singular value and 3 > >> negative singular values of the order -10^-1 (singular values should > >> always be non-negative). > >> > >> I also tried computing the SVD of the matrix using Octave 3.2.4 and > >> Matlab > >> 7.10.0.499 (R2010a) 64-bit (glnxa64) and there were no problems. Any > >> help > >> is greatly appreciated. 
> >> > >> Thanks in advance, > >> Charanpal > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ndbecker2 at gmail.com Fri Aug 12 10:30:00 2011 From: ndbecker2 at gmail.com (Neal Becker) Date: Fri, 12 Aug 2011 10:30 -0400 Subject: [Numpy-discussion] nditer confusion Message-ID: There'a a boatload of options for nditer. I need a simple explanation, maybe a few simple examples. Is there anything that might help? From cjordan1 at uw.edu Fri Aug 12 10:53:12 2011 From: cjordan1 at uw.edu (Christopher Jordan-Squire) Date: Fri, 12 Aug 2011 09:53:12 -0500 Subject: [Numpy-discussion] Statistical distributions on samples In-Reply-To: References: Message-ID: Hi Andrea--An easy way to get something like this would be import numpy as np import scipy.stats as stats sigma = #some reasonable standard deviation for your application x = stats.norm.rvs(size=1000, loc=125, scale=sigma) x = x[x>50] x = x[x<200] That will give a roughly normal distribution to your velocities, as long as, say, sigma<25. (I'm using the rule of thumb for the normal distribution that normal random samples lie 3 standard deviations away from the mean about 1 out of 350 times.) Though you won't be able to get exactly normal errors about your mean since normal random samples can theoretically be of any size. You can use this same process for any other distribution, as long as you've chosen a scale variable so that the probability of samples being outside your desired interval is really small. Of course, once again your random errors won't be exactly from the distribution you get your original samples from. -Chris JS On Fri, Aug 12, 2011 at 8:32 AM, Andrea Gavana wrote: > Hi All, > > I am working on something that appeared to be a no-brainer issue (at > the beginning), by my complete ignorance in statistics is overwhelming and I > got stuck. > > What I am trying to do can be summarized as follows > > Let's assume that I have to generate a sample of a 1,000 values for a > variable (let's say, "velocity") using a normal distribution (but later I > will have to do it with log-normal, triangular and a couple of others). The > only thing I know about this velocity sample is the minimum and maximum > values (let's say 50 and 200 respectively) and, obviously for the normal > distribution (but not so for the other distributions), the mean value (125 > in this case). > > Now, I would like to generate this sample of 1,000 points, in which none of > the point has velocity smaller than 50 or bigger than 200, and the number of > samples close to the mean (125) should be higher than the number of samples > close to the minimum and the maximum, following some kind of normal > distribution. > > What I have tried up to now is summarized in the code below, but as you can > easily see, I don't really know what I am doing. I am open to every > suggestion, and I apologize for the dumbness of my question. 
> > import numpy > > from scipy import stats > import matplotlib.pyplot as plt > > minval, maxval = 50.0, 250.0 > x = numpy.linspace(minval, maxval, 500) > > samp = stats.norm.rvs(size=len(x)) > pdf = stats.norm.pdf(x) > cdf = stats.norm.cdf(x) > ppf = stats.norm.ppf(x) > > ax1 = plt.subplot(2, 2, 1) > ax1.plot(range(len(x)), samp) > > ax2 = plt.subplot(2, 2, 2) > ax2.plot(x, pdf) > > ax3 = plt.subplot(2, 2, 3) > ax3.plot(x, cdf) > > ax4 = plt.subplot(2, 2, 4) > ax4.plot(x, ppf) > > plt.show() > > > Andrea. > > "Imagination Is The Only Weapon In The War Against Reality." > http://xoomer.alice.it/infinity77/ > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From nadavh at visionsense.com Fri Aug 12 13:23:53 2011 From: nadavh at visionsense.com (Nadav Horesh) Date: Fri, 12 Aug 2011 10:23:53 -0700 Subject: [Numpy-discussion] SVD does not converge on "clean" matrix In-Reply-To: References: <203aa1b32d794c238d32cb8d29036cc2.squirrel@webmail1.telecom-paristech.fr> <06f59405fc0dfce9e04f83d001963a23@telecom-paristech.fr>, Message-ID: <26FC23E7C398A64083C980D16001012D246DFC5F95@VA3DIAXVS361.RED001.local> I tested all the the result 3 matrices with alltrue(infinite(mat)) and got True answer for all of them. Nadav ________________________________ From: numpy-discussion-bounces at scipy.org [numpy-discussion-bounces at scipy.org] On Behalf Of Warren Weckesser [warren.weckesser at enthought.com] Sent: 12 August 2011 16:33 To: Discussion of Numerical Python Subject: Re: [Numpy-discussion] SVD does not converge on "clean" matrix On Fri, Aug 12, 2011 at 4:03 AM, Charanpal Dhanjal > wrote: Thank Nadav for testing out the matrix. I wonder if you had a chance to check if the resulting decomposition contained NaN or Inf values? As far I understood, numpy.linalg.svd uses routines in LAPACK and ATLAS (if available) to compute the corresponding SVD. I did some complementary tests on Debian Squeeze on an Intel Xeon W3550 CPU and the call to numpy.linalg.svd results in the LinAlgError "SVD did not converge", however the test leading to results containing NaN values ran on Debian Lenny on an Intel Core 2 Quad. In both of these situations we use Python 2.7.1 and numpy 1.5.1 (without ATLAS), and so the reasons for the differences seem to be OS or processor dependent. Any ideas? Charanpal Date: Thu, 11 Aug 2011 07:21:09 -0700 From: Nadav Horesh > Subject: Re: [Numpy-discussion] SVD does not converge on "clean" matrix To: Discussion of Numerical Python > Message-ID: <26FC23E7C398A64083C980D16001012D246DFC5F90 at VA3DIAXVS361.RED001.local> Content-Type: text/plain; charset="us-ascii" > Had no problem on a gentoo 64 bit machine using atlas 3.8.0 (Core I7, > python 2.7.2, numpy versions1.60 and 1.6.1) Another data point: on Mac OS X, with Python 2.7.2 and numpy 1.6.0 (using EPD 7.1), I get the error: $ ipython --pylab Enthought Python Distribution -- www.enthought.com Python 2.7.2 |EPD 7.1-1 (32-bit)| (default, Jul 3 2011, 15:40:35) Type "copyright", "credits" or "license" for more information. IPython 0.11.rc1 -- An enhanced Interactive Python. ? -> Introduction and overview of IPython's features. %quickref -> Quick reference. help -> Python's own help system. object? -> Details about 'object', use 'object??' for extra details. Welcome to pylab, a matplotlib-based Python environment [backend: WXAgg]. 
For more information, type 'help(pylab)'. In [1]: numpy.__version__ Out[1]: '1.6.0' In [2]: arr = load('matrix_leading_to_bad_SVD.npz')['arr_0'] In [3]: np.linalg.svd(arr) --------------------------------------------------------------------------- LinAlgError Traceback (most recent call last) /Users/warren/tmp/ in () ----> 1 np.linalg.svd(arr) /Library/Frameworks/Python.framework/Versions/7.1/lib/python2.7/site-packages/numpy/linalg/linalg.py in svd(a, full_matrices, compute_uv) 1319 work, lwork, iwork, 0) 1320 if results['info'] > 0: -> 1321 raise LinAlgError, 'SVD did not converge' 1322 s = s.astype(_realType(result_t)) 1323 if compute_uv: LinAlgError: SVD did not converge Warren > > Nadav >On Thu, 11 Aug 2011 15:23:22 +0200, dhanjal at telecom-paristech.fr > wrote: >> Hi all, >> >> I get an error message "numpy.linalg.linalg.LinAlgError: SVD did not >> converge" when calling numpy.linalg.svd on a "clean" matrix of size >> (1952, >> 895). The matrix is clean in the sense that it contains no NaN or >> Inf >> values. The corresponding npz file is available here: >> >> https://docs.google.com/leaf?id=0Bw0NXKxxc40jMWEyNTljMWUtMzBmNS00NGZmLThhZWUtY2I2MWU2MGZiNDgx&hl=fr >> >> Here is some information about my setup: I use Python 2.7.1 on >> Ubuntu >> 11.04 with numpy 1.6.1. Furthermore, I thought the problem might be >> solved >> by recompiling numpy with my local ATLAS library (version 3.8.3), >> and this >> didn't seem to help. On another machine with Python 2.7.1 and numpy >> 1.5.1 >> the SVD does converge however it contains 1 NaN singular value and 3 >> negative singular values of the order -10^-1 (singular values should >> always be non-negative). >> >> I also tried computing the SVD of the matrix using Octave 3.2.4 and >> Matlab >> 7.10.0.499 (R2010a) 64-bit (glnxa64) and there were no problems. Any >> help >> is greatly appreciated. >> >> Thanks in advance, >> Charanpal _______________________________________________ NumPy-Discussion mailing list NumPy-Discussion at scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- An HTML attachment was scrubbed... URL: From mwwiebe at gmail.com Fri Aug 12 14:35:13 2011 From: mwwiebe at gmail.com (Mark Wiebe) Date: Fri, 12 Aug 2011 11:35:13 -0700 Subject: [Numpy-discussion] nditer confusion In-Reply-To: References: Message-ID: I'll write up some more introductory-style documentation, you're right that the examples I put in the reference page aren't a nice simple starting point. Will post back here for feedback when I have a draft for you to review. Cheers, Mark On Fri, Aug 12, 2011 at 7:30 AM, Neal Becker wrote: > There'a a boatload of options for nditer. I need a simple explanation, > maybe a > few simple examples. Is there anything that might help? > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From nouiz at nouiz.org Fri Aug 12 16:06:49 2011 From: nouiz at nouiz.org (=?ISO-8859-1?Q?Fr=E9d=E9ric_Bastien?=) Date: Fri, 12 Aug 2011 16:06:49 -0400 Subject: [Numpy-discussion] Theano 0.4.1 released Message-ID: =========================== Announcing Theano 0.4.1 =========================== This is an important release, with lots of new features, bug fixes and some deprecation warning. The upgrade is recommended for everybody. 
For those using the bleeding edge version in the mercurial repository, we encourage you to update to the `0.4.1` tag. What's New ---------- New features: * `R_op `_ macro like theano.tensor.grad * Not all tests are done yet (TODO) * Added alias theano.tensor.bitwise_{and,or,xor,not}. They are the numpy names. * Updates returned by Scan (you need to pass them to the theano.function) are now a new Updates class. That allow more check and easier work with them. The Updates class is a subclass of dict * Scan can now work in a "do while" loop style. * We scan until a condition is met. * There is a minimum of 1 iteration(can't do "while do" style loop) * The "Interactive Debugger" (compute_test_value theano flags) * Now should work with all ops (even the one with only C code) * In the past some errors were caught and re-raised as unrelated errors (ShapeMismatch replaced with NotImplemented). We don't do that anymore. * The new Op.make_thunk function(introduced in 0.4.0) is now used by constant_folding and DebugMode * Added A_TENSOR_VARIABLE.astype() as a way to cast. NumPy allows this syntax. * New BLAS GER implementation. * Insert GEMV more frequently. * Added new ifelse(scalar condition, rval_if_true, rval_if_false) Op. * This is a subset of the elemwise switch (tensor condition, rval_if_true, rval_if_false). * With the new feature in the sandbox, only one of rval_if_true or rval_if_false will be evaluated. Optimizations: * Subtensor has C code * {Inc,Set}Subtensor has C code * ScalarFromTensor has C code * dot(zeros,x) and dot(x,zeros) * IncSubtensor(x, zeros, idx) -> x * SetSubtensor(x, x[idx], idx) -> x (when x is a constant) * subtensor(alloc,...) -> alloc * Many new scan optimization * Lower scan execution overhead with a Cython implementation * Removed scan double compilation (by using the new Op.make_thunk mechanism) * Certain computations from the inner graph are now Pushed out into the outer graph. This means they are not re-comptued at every step of scan. * Different scan ops get merged now into a single op (if possible), reducing the overhead and sharing computations between the two instances GPU: * PyCUDA/CUDAMat/Gnumpy/Theano bridge and `documentation `_. * New function to easily convert pycuda GPUArray object to and from CudaNdarray object * Fixed a bug if you crated a view of a manually created CudaNdarray that are view of GPUArray. * Removed a warning when nvcc is not available and the user did not requested it. * renamed config option cuda.nvccflags -> nvcc.flags * Allow GpuSoftmax and GpuSoftmaxWithBias to work with bigger input. Bugs fixed: * In one case an AdvancedSubtensor1 could be converted to a GpuAdvancedIncSubtensor1 insted of GpuAdvancedSubtensor1. It probably didn't happen due to the order of optimizations, but that order is not guaranteed to be the same on all computers. * Derivative of set_subtensor was wrong. * Derivative of Alloc was wrong. Crash fixed: * On an unusual Python 2.4.4 on Windows * When using a C cache copied from another location * On Windows 32 bits when setting a complex64 to 0. * Compilation crash with CUDA 4 * When wanting to copy the compilation cache from a computer to another * This can be useful for using Theano on a computer without a compiler. * GPU: * Compilation crash fixed under Ubuntu 11.04 * Compilation crash fixed with CUDA 4.0 Know bug: * CAReduce with nan in inputs don't return the good output (`Ticket `_). * This is used in tensor.{max,mean,prod,sum} and in the grad of PermuteRowElements. 
* This is not a new bug, just a bug discovered since the last release that we didn't had time to fix. Deprecation (will be removed in Theano 0.5, warning generated if you use them): * The string mode (accepted only by theano.function()) FAST_RUN_NOGC. Use Mode(linker='c|py_nogc') instead. * The string mode (accepted only by theano.function()) STABILIZE. Use Mode(optimizer='stabilize') instead. * scan interface change: * The use of `return_steps` for specifying how many entries of the output scan has been depricated * The same thing can be done by applying a subtensor on the output return by scan to select a certain slice * The inner function (that scan receives) should return its outputs and updates following this order: [outputs], [updates], [condition]. One can skip any of the three if not used, but the order has to stay unchanged. * tensor.grad(cost, wrt) will return an object of the "same type" as wrt (list/tuple/TensorVariable). * Currently tensor.grad return a type list when the wrt is a list/tuple of more then 1 element. Sandbox: * MRG random generator now implements the same casting behavior as the regular random generator. Sandbox New features(not enabled by default): * New Linkers (theano flags linker={vm,cvm}) * The new linker allows lazy evaluation of the new ifelse op, meaning we compute only the true or false branch depending of the condition. This can speed up some types of computation. * Uses a new profiling system (that currently tracks less stuff) * The cvm is implemented in C, so it lowers Theano's overhead. * The vm is implemented in python. So it can help debugging in some cases. * In the future, the default will be the cvm. * Some new not yet well tested sparse ops: theano.sparse.sandbox.{SpSum, Diag, SquareDiagonal, ColScaleCSC, RowScaleCSC, Remove0, EnsureSortedIndices, ConvolutionIndices} Documentation: * How to compute the `Jacobian, Hessian, Jacobian times a vector, Hessian times a vector `_. * Slide for a 3 hours class with exercises that was done at the HPCS2011 Conference in Montreal. Others: * Logger name renamed to be consistent. * Logger function simplified and made more consistent. * Fixed transformation of error by other not related error with the compute_test_value Theano flag. * Compilation cache enhancements. * Made compatible with NumPy 1.6 and SciPy 0.9 * Fix tests when there was new dtype in NumPy that is not supported by Theano. * Fixed some tests when SciPy is not available. * Don't compile anything when Theano is imported. Compile support code when we compile the first C code. * Python 2.4 fix: * Fix the file theano/misc/check_blas.py * For python 2.4.4 on Windows, replaced float("inf") with numpy.inf. * Removes useless inputs to a scan node * Beautification mostly, making the graph more visible. Such inputs would appear as a consequence of other optimizations Core: * there is a new mechanism that lets an Op permit that one of its inputs to be aliased to another destroyed input. This will generally result in incorrect calculation, so it should be used with care! The right way to use it is when the caller can guarantee that even if these two inputs look aliased, they actually will never overlap. This mechanism can be used, for example, by a new alternative approach to implementing Scan. If an op has an attribute called "destroyhandler_tolerate_aliased" then this is what's going on. IncSubtensor is thus far the only Op to use this mechanism.Mechanism Download -------- You can download Theano from http://pypi.python.org/pypi/Theano. 
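For readers who have not used Theano before, here is a minimal sketch of the define/compile/evaluate workflow the announcement describes (this assumes the 0.4.x tensor/function API and is an illustration, not text from the release notes):

import theano
import theano.tensor as T

x = T.dvector('x')
y = T.sum(x ** 2)
gy = T.grad(y, x)                  # symbolic differentiation
f = theano.function([x], [y, gy])  # compile the expression graph

print(f([1.0, 2.0, 3.0]))          # [array(14.0), array([ 2.,  4.,  6.])]
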
Description ----------- Theano is a Python library that allows you to define, optimize, and efficiently evaluate mathematical expressions involving multi-dimensional arrays. It is built on top of NumPy. Theano features: * tight integration with NumPy: a similar interface to NumPy's. numpy.ndarrays are also used internally in Theano-compiled functions. * transparent use of a GPU: perform data-intensive computations up to 140x faster than on a CPU (support for float32 only). * efficient symbolic differentiation: Theano can compute derivatives for functions of one or many inputs. * speed and stability optimizations: avoid nasty bugs when computing expressions such as log(1+ exp(x)) for large values of x. * dynamic C code generation: evaluate expressions faster. * extensive unit-testing and self-verification: includes tools for detecting and diagnosing bugs and/or potential problems. Theano has been powering large-scale computationally intensive scientific research since 2007, but it is also approachable enough to be used in the classroom (IFT6266 at the University of Montreal). Resources --------- About Theano: http://deeplearning.net/software/theano/ About NumPy: http://numpy.scipy.org/ About SciPy: http://www.scipy.org/ Machine Learning Tutorial with Theano on Deep Architectures: http://deeplearning.net/tutorial/ Acknowledgments --------------- I would like to thank all contributors of Theano. For this particular release, here is the people that contributed code and/or documentation: (in alphabetical order) Frederic Bastien, James Bergstra, Olivier Delalleau, Xavier Glorot, Ian Goodfellow, Pascal Lamblin, Gr?goire Mesnil, Razvan Pascanu, Ilya Sutskever and David Warde-Farley Also, thank you to all NumPy and Scipy developers as Theano builds on its strength. All questions/comments are always welcome on the Theano mailing-lists ( http://deeplearning.net/software/theano/ ) From ralf.gommers at googlemail.com Sat Aug 13 11:58:41 2011 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Sat, 13 Aug 2011 17:58:41 +0200 Subject: [Numpy-discussion] disabling SVN (was: Trouble installing scipy after upgrading to Mac OS X 10.7 aka Lion) Message-ID: On Thu, Aug 11, 2011 at 8:19 PM, Jonathan Guyer wrote: > > On Aug 10, 2011, at 5:16 PM, Ralf Gommers wrote: > > > Ah, with "svn" you actually meant svn:) I thought that was supposed to > not even work anymore. > > It does work and it's confusing. I had not been following the transition > closely and so was under the impression that the svn repository was being > mirrored from git. It's not. It's just old. > > Who can disable SVN access for numpy and scipy? There are still plenty of links to http://svn.scipy.org/svn/numpy/trunk/ floating around that can confuse users. Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at googlemail.com Sat Aug 13 12:14:11 2011 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Sat, 13 Aug 2011 18:14:11 +0200 Subject: [Numpy-discussion] [SciPy-User] disabling SVN (was: Trouble installing scipy after upgrading to Mac OS X 10.7 aka Lion) In-Reply-To: References: Message-ID: On Sat, Aug 13, 2011 at 6:00 PM, Ognen Duzlevski wrote: > On Sat, Aug 13, 2011 at 11:58 AM, Ralf Gommers < > ralf.gommers at googlemail.com> wrote: > >> >> >> On Thu, Aug 11, 2011 at 8:19 PM, Jonathan Guyer wrote: >> >>> >>> On Aug 10, 2011, at 5:16 PM, Ralf Gommers wrote: >>> >>> > Ah, with "svn" you actually meant svn:) I thought that was supposed to >>> not even work anymore. 
>>> >>> It does work and it's confusing. I had not been following the transition >>> closely and so was under the impression that the svn repository was being >>> mirrored from git. It's not. It's just old. >>> >>> Who can disable SVN access for numpy and scipy? There are still plenty of >> links to http://svn.scipy.org/svn/numpy/trunk/ floating around that can >> confuse users. >> >> Ralf >> > > Hi Ognen, > Ralf, > > I am the new Enthought sys admin. Is there anything I can do to help? > > We should check if there's still any code in SVN branches that is useful. If so the people who are interested in it should move it somewhere else. Anyone? After that I think you can pull the plug on http://svn.scipy.org/svn/numpy/and http://svn.scipy.org/svn/scipy/. Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Sat Aug 13 15:13:25 2011 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 13 Aug 2011 13:13:25 -0600 Subject: [Numpy-discussion] SVD does not converge on "clean" matrix In-Reply-To: <203aa1b32d794c238d32cb8d29036cc2.squirrel@webmail1.telecom-paristech.fr> References: <203aa1b32d794c238d32cb8d29036cc2.squirrel@webmail1.telecom-paristech.fr> Message-ID: On Thu, Aug 11, 2011 at 7:23 AM, wrote: > Hi all, > > I get an error message "numpy.linalg.linalg.LinAlgError: SVD did not > converge" when calling numpy.linalg.svd on a "clean" matrix of size (1952, > 895). The matrix is clean in the sense that it contains no NaN or Inf > values. The corresponding npz file is available here: > > https://docs.google.com/leaf?id=0Bw0NXKxxc40jMWEyNTljMWUtMzBmNS00NGZmLThhZWUtY2I2MWU2MGZiNDgx&hl=fr > > Here is some information about my setup: I use Python 2.7.1 on Ubuntu > 11.04 with numpy 1.6.1. Furthermore, I thought the problem might be solved > by recompiling numpy with my local ATLAS library (version 3.8.3), and this > didn't seem to help. On another machine with Python 2.7.1 and numpy 1.5.1 > the SVD does converge however it contains 1 NaN singular value and 3 > negative singular values of the order -10^-1 (singular values should > always be non-negative). > > I also tried computing the SVD of the matrix using Octave 3.2.4 and Matlab > 7.10.0.499 (R2010a) 64-bit (glnxa64) and there were no problems. Any help > is greatly appreciated. > > Thanks in advance, > Charanpal > > > Fails here also, fedora 15 64 bits AMD 940. There should be a maximum iterations argument somewhere... Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Sat Aug 13 15:42:19 2011 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 13 Aug 2011 13:42:19 -0600 Subject: [Numpy-discussion] bug with latest numpy git snapshot build with Python3 In-Reply-To: References: Message-ID: 2011/8/11 Dmitrey > bug in KUBUNTU 11.04, latest numpy git snapshot build with Python3 > >>> import numpy > Traceback (most recent call last): > File "", line 1, in > File "/usr/local/lib/python3.2/dist-packages/numpy/__init__.py", line > 137, in > from . 
import add_newdocs > File "/usr/local/lib/python3.2/dist-packages/numpy/add_newdocs.py", line > 9, in > from numpy.lib import add_newdoc > File "/usr/local/lib/python3.2/dist-packages/numpy/lib/__init__.py", line > 4, in > from .type_check import * > File "/usr/local/lib/python3.2/dist-packages/numpy/lib/type_check.py", > line 8, in > import numpy.core.numeric as _nx > File "/usr/local/lib/python3.2/dist-packages/numpy/core/__init__.py", > line 10, in > from .numeric import > * > File "/usr/local/lib/python3.2/dist-packages/numpy/core/numeric.py", line > 27, in > import > multiarray > ImportError: No module named multiarray > > I don't see this. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From pav at iki.fi Sat Aug 13 15:45:23 2011 From: pav at iki.fi (Pauli Virtanen) Date: Sat, 13 Aug 2011 19:45:23 +0000 (UTC) Subject: [Numpy-discussion] [SciPy-User] disabling SVN (was: Trouble installing scipy after upgrading to Mac OS X 10.7 aka Lion) References: Message-ID: Sat, 13 Aug 2011 18:14:11 +0200, Ralf Gommers wrote: [clip] > We should check if there's still any code in SVN branches that is > useful. > If so the people who are interested in it should move it somewhere else. > Anyone? All the SVN branches are available in Git, though some are hidden. Do git fetch upstream +refs/*:refs/remotes/upstream/everything/* and you shall receive (also some Github's internal branches named pull/*). However, AFAIK, there's not so much useful in there. In any case, as far as I'm aware, the SVN can be safely be turned off, both for Numpy and Scipy. The admins can access the original repository on the server, so if something turns out to be missed, it can be brought back. Pauli From paul.anton.letnes at gmail.com Sat Aug 13 16:00:57 2011 From: paul.anton.letnes at gmail.com (Paul Anton Letnes) Date: Sat, 13 Aug 2011 21:00:57 +0100 Subject: [Numpy-discussion] bug with latest numpy git snapshot build with Python3 In-Reply-To: References: Message-ID: On 13. aug. 2011, at 20.42, Charles R Harris wrote: > > > 2011/8/11 Dmitrey > bug in KUBUNTU 11.04, latest numpy git snapshot build with Python3 > >>> import numpy > Traceback (most recent call last): > File "", line 1, in > File "/usr/local/lib/python3.2/dist-packages/numpy/__init__.py", line 137, in > from . import add_newdocs > File "/usr/local/lib/python3.2/dist-packages/numpy/add_newdocs.py", line 9, in > from numpy.lib import add_newdoc > File "/usr/local/lib/python3.2/dist-packages/numpy/lib/__init__.py", line 4, in > from .type_check import * > File "/usr/local/lib/python3.2/dist-packages/numpy/lib/type_check.py", line 8, in > import numpy.core.numeric as _nx > File "/usr/local/lib/python3.2/dist-packages/numpy/core/__init__.py", line 10, in > from .numeric import * > File "/usr/local/lib/python3.2/dist-packages/numpy/core/numeric.py", line 27, in > import multiarray > ImportError: No module named multiarray > > > I don't see this. > > Chuck > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion Mac OS X 10.6.8, python3.2, I don't see this either. "import multiarray" does not work, but "import numpy" works beautifully. 
Paul From mwwiebe at gmail.com Sat Aug 13 18:00:41 2011 From: mwwiebe at gmail.com (Mark Wiebe) Date: Sat, 13 Aug 2011 15:00:41 -0700 Subject: [Numpy-discussion] nditer confusion In-Reply-To: References: Message-ID: I've made a pull request with some fairly extensive introductory material. It's available here: https://github.com/numpy/numpy/pull/138 It walks through nditer usage starting with basic iteration of one array, through broadcasting and iterator-allocated outputs, and finally covers accelerating the inner loop with Cython. Please read and review! Thanks, Mark On Fri, Aug 12, 2011 at 7:30 AM, Neal Becker wrote: > There'a a boatload of options for nditer. I need a simple explanation, > maybe a > few simple examples. Is there anything that might help? > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mwwiebe at gmail.com Sat Aug 13 20:06:59 2011 From: mwwiebe at gmail.com (Mark Wiebe) Date: Sat, 13 Aug 2011 17:06:59 -0700 Subject: [Numpy-discussion] Questionable reduceat behavior In-Reply-To: References: Message-ID: Looks like this is the second-oldest open bug in the bug tracker. http://projects.scipy.org/numpy/ticket/236 For what it's worth, I'm in favour of changing this behavior to be more consistent as proposed in that ticket. -Mark On Thu, Aug 11, 2011 at 11:25 AM, Wes McKinney wrote: > I'm a little perplexed why reduceat was made to behave like this: > > In [26]: arr = np.ones((10, 4), dtype=bool) > > In [27]: arr > Out[27]: > array([[ True, True, True, True], > [ True, True, True, True], > [ True, True, True, True], > [ True, True, True, True], > [ True, True, True, True], > [ True, True, True, True], > [ True, True, True, True], > [ True, True, True, True], > [ True, True, True, True], > [ True, True, True, True]], dtype=bool) > > > In [30]: np.add.reduceat(arr, [0, 3, 3, 7, 9], axis=0) > Out[30]: > array([[3, 3, 3, 3], > [1, 1, 1, 1], > [4, 4, 4, 4], > [2, 2, 2, 2], > [1, 1, 1, 1]]) > > this does not seem intuitively correct. Since we have: > > In [33]: arr[3:3].sum(0) > Out[33]: array([0, 0, 0, 0]) > > I would expect > > array([[3, 3, 3, 3], > [0, 0, 0, 0], > [4, 4, 4, 4], > [2, 2, 2, 2], > [1, 1, 1, 1]]) > > Obviously I can RTFM and see why it does this ("if ``indices[i] >= > indices[i + 1]``, the i-th generalized "row" is simply > ``a[indices[i]]``"), but it doesn't make much sense to me, and I need > work around it. Suggestions? > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mwwiebe at gmail.com Sat Aug 13 20:17:32 2011 From: mwwiebe at gmail.com (Mark Wiebe) Date: Sat, 13 Aug 2011 17:17:32 -0700 Subject: [Numpy-discussion] bug with assignment into an indexed array? In-Reply-To: References: Message-ID: On Thu, Aug 11, 2011 at 1:37 PM, Benjamin Root wrote: > On Thu, Aug 11, 2011 at 10:33 AM, Olivier Delalleau wrote: > >> 2011/8/11 Benjamin Root >> >>> >>> >>> On Thu, Aug 11, 2011 at 8:37 AM, Olivier Delalleau wrote: >>> >>>> Maybe confusing, but working as expected. >>>> >>>> >>>> When you write: >>>> matched_to[np.array([0, 1, 2])] = 3 >>>> it calls __setitem__ on matched_to, with arguments (np.array([0, 1, 2]), >>>> 3). 
So numpy understand you want to write 3 at these indices. >>>> >>>> >>>> When you write: >>>> matched_to[:3][match] = 3 >>>> it first calls __getitem__ with the slice as argument, which returns a >>>> view of your array, then it calls __setitem__ on this view, and it fills >>>> your matched_to array at the same time. >>>> >>>> >>>> But when you write: >>>> matched_to[np.array([0, 1, 2])][match] = 3 >>>> it first calls __getitem__ with the array as argument, which retunrs a >>>> *copy* of your array, so that calling __setitem__ on this copy has no effect >>>> on your original array. >>>> >>>> -=- Olivier >>>> >>>> >>> Right, but I guess my question is does it *have* to be that way? I guess >>> it makes some sense with respect to indexing with a numpy array like I did >>> with the last example, because an element could be referred to multiple >>> times (which explains the common surprise with '+='), but with boolean >>> indexing, we are guaranteed that each element of the view will appear at >>> most once. Therefore, shouldn't boolean indexing always return a view, not >>> a copy? Is the general case of arbitrary array selection inherently >>> impossible to encode in a view versus a slice with a regular spacing? >>> >> >> Yes, due to the fact the array interface only supports regular spacing >> (otherwise it is more difficult to get efficient access to arbitrary array >> positions). >> >> -=- Olivier >> >> > This still bothers me, though. I imagine that it is next to impossible to > detect this situation from numpy's perspective, so it can't even emit a > warning or error. Furthermore, for someone who makes a general function to > modify the contents of some externally provided array, there is a > possibility that the provided array is actually a copy not a view. > Although, I guess it is the responsibility of the user to know the > difference. > > I guess that is the key problem. The key advantage we are taught about > numpy arrays is the use of views for efficient access. It would seem that > most access operations would use it, but in reality, only sliced access do. > Everything else is a copy (unless you are doing fancy indexing with > assignment). Maybe with some of the forthcoming changes that have been done > with respect to nditer and ufuncs (in particular, I am thinking of the > "where" kwarg), maybe we could consider an enhancement allowing fancy > indexing (or at least boolean indexing) to produce a view? Even if it is > less efficient than a view from slicing, it would bring better consistency > in behavior between the different forms of indexing. > > Just my 2 cents, > Ben Root > I think it would be nice to evolve the NumPy indexing and array representation towards the goal of indexing returning a view in all cases with no exceptions. This would provide a much nicer mental model to program with. Accomplishing such a transition will take a fair bit of time, though. -Mark > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... 
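To make the view-versus-copy distinction in the thread above concrete, a small sketch reusing the matched_to/match names from earlier messages (the printed values are what plain NumPy indexing rules predict):

import numpy as np

matched_to = np.zeros(5)
match = np.array([True, False, True])

# A slice returns a view, so assigning through it writes back.
matched_to[:3][match] = 3
print(matched_to)                  # [ 3.  0.  3.  0.  0.]

matched_to = np.zeros(5)
# Fancy indexing returns a copy, so this assignment is silently lost.
matched_to[np.array([0, 1, 2])][match] = 3
print(matched_to)                  # [ 0.  0.  0.  0.  0.]

# Folding both selections into a single __setitem__ call works again.
matched_to[np.array([0, 1, 2])[match]] = 3
print(matched_to)                  # [ 3.  0.  3.  0.  0.]
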
URL: From josef.pktd at gmail.com Sat Aug 13 22:00:33 2011 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Sat, 13 Aug 2011 22:00:33 -0400 Subject: [Numpy-discussion] [SciPy-User] disabling SVN (was: Trouble installing scipy after upgrading to Mac OS X 10.7 aka Lion) In-Reply-To: References: Message-ID: On Sat, Aug 13, 2011 at 3:45 PM, Pauli Virtanen wrote: > Sat, 13 Aug 2011 18:14:11 +0200, Ralf Gommers wrote: > [clip] >> We should check if there's still any code in SVN branches that is >> useful. >> If so the people who are interested in it should move it somewhere else. >> Anyone? > > All the SVN branches are available in Git, though some are hidden. Do > > ? ? ? ?git fetch upstream +refs/*:refs/remotes/upstream/everything/* > > and you shall receive (also some Github's internal branches named pull/*). > However, AFAIK, there's not so much useful in there. > > In any case, as far as I'm aware, the SVN can be safely be turned off, > both for Numpy and Scipy. The admins can access the original repository > on the server, so if something turns out to be missed, it can be brought > back. > > ? ? ? ?Pauli Does Trac require svn access to dig out old information? for example links to old changesets, annotate/blame, ... ? Josef > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From dhanjal at telecom-paristech.fr Sun Aug 14 08:22:07 2011 From: dhanjal at telecom-paristech.fr (Charanpal Dhanjal) Date: Sun, 14 Aug 2011 14:22:07 +0200 Subject: [Numpy-discussion] SVD does not converge on "clean" matrix In-Reply-To: References: <203aa1b32d794c238d32cb8d29036cc2.squirrel@webmail1.telecom-paristech.fr> Message-ID: I had a quick look at the code (https://github.com/numpy/numpy/blob/master/numpy/linalg/linalg.py) and the numpy.linalg.svd function calls lapack_lite.dgesdd (for real matrices) so I guess the non-convergence occurs in this function. As I understood lapack_lite is used by default unless numpy is installed with ATLAS/MKL etc. I wonder why svd works for Nadav and not for anyone else? Any ideas anyone? Charanpal On Sat, 13 Aug 2011 13:13:25 -0600, Charles R Harris wrote: > On Thu, Aug 11, 2011 at 7:23 AM, wrote: > >> Hi all, >> >> I get an error message "numpy.linalg.linalg.LinAlgError: SVD did >> not >> converge" when calling numpy.linalg.svd on a "clean" matrix of size >> (1952, >> 895). The matrix is clean in the sense that it contains no NaN or >> Inf >> values. The corresponding npz file is available here: >> > > https://docs.google.com/leaf?id=0Bw0NXKxxc40jMWEyNTljMWUtMzBmNS00NGZmLThhZWUtY2I2MWU2MGZiNDgx&hl=fr >> [1] >> >> Here is some information about my setup: I use Python 2.7.1 on >> Ubuntu >> 11.04 with numpy 1.6.1. Furthermore, I thought the problem might be >> solved >> by recompiling numpy with my local ATLAS library (version 3.8.3), >> and this >> didnt seem to help. On another machine with Python 2.7.1 and numpy >> 1.5.1 >> the SVD does converge however it contains 1 NaN singular value and >> 3 >> negative singular values of the order -10^-1 (singular values >> should >> always be non-negative). >> >> I also tried computing the SVD of the matrix using Octave 3.2.4 and >> Matlab >> 7.10.0.499 (R2010a) 64-bit (glnxa64) and there were no problems. >> Any help >> is greatly appreciated. >> >> Thanks in advance, >> Charanpal > > Fails here also, fedora 15 64 bits AMD 940. There should be a maximum > iterations argument somewhere... 
> > Chuck > > > > Links: > ------ > [1] > > https://docs.google.com/leaf?id=0Bw0NXKxxc40jMWEyNTljMWUtMzBmNS00NGZmLThhZWUtY2I2MWU2MGZiNDgx|+|amp|+|hl=fr > [2] mailto:dhanjal at telecom-paristech.fr From lou_boog2000 at yahoo.com Sun Aug 14 10:27:06 2011 From: lou_boog2000 at yahoo.com (Lou Pecora) Date: Sun, 14 Aug 2011 07:27:06 -0700 (PDT) Subject: [Numpy-discussion] SVD does not converge on "clean" matrix In-Reply-To: References: <203aa1b32d794c238d32cb8d29036cc2.squirrel@webmail1.telecom-paristech.fr> Message-ID: <1313332026.61861.YahooMailNeo@web34404.mail.mud.yahoo.com> Chuck wrote: ________________________________ Fails here also, fedora 15 64 bits AMD 940. There should be a maximum iterations argument somewhere... Chuck --------------------------------------------------- ?? *** ?Here's the "FIX": Chuck is right. ?There is a max iterations. ?Here is a reply from a thread of mine in this group several years ago about this problem and some comments that might help you. ---- From Mr.?Damian Menscher who was kind enough to find the iteration location and provide some insight: Ok, so after several hours of trying to read that code, I found the parameter that needs to be tuned. ?In case anyone has this problem and finds this thread a year from now, here's your hint: File: Src/dlapack_lite.c Subroutine: dlasd4_ Line: 22562 There's a for loop there that limits the number of iterations to 20. ?Increasing this value to 50 allows my matrix to converge. I have not bothered to test what the "best" value for this number is, though. ?In any case, it appears the number just exists to prevent infinite loops, and 50 isn't really that much closer to infinity than 20.... ?(Actually, I'm just going to set it to 100 so I don't have to think about it ever again.) Damian Menscher --? -=#| Physics Grad Student & SysAdmin @ U Illinois Urbana-Champaign |#=- -=#| 488 LLP, 1110 W. Green St, Urbana, IL 61801 Ofc:(217)333-0038 |#=- -=#| 1412 DCL, Workstation Services Group, CITES Ofc:(217)244-3862 |#=- -=#| www.uiuc.edu/~menscher/ Fax:(217)333-9819 |#=- ---- My reply and a "fix" of sorts without changing the hard coded iteration max: I have looked in Src/dlapack_lite.c and line 22562 is no longer a line that sets a max. iterations parameter. ?There are several set in the file, but that code is hard to figure (sort of a Fortran-in-C hybrid). ? Here's one, for example: ?? ?maxit = *n * 6 * *n; ? // Line 887 I have no idea which parameter to tweak. ?Apparently this error is still in numpy (at least to my version). Does anyone have a fix? ?Should I start a ticket (I think this is what people do)? ?Any help appreciated. I'm using a Mac Book Pro (Intel chip), system 10.4.11, Python 2.4.4. ============ Possible try/except ===========================? # ?A is the original matrix try: ?? ?U,W,VT=linalg.svd(A) except linalg.linalg.LinAlgError: ?# "Square" the matrix and do SVD ?? ?print "Got svd except, trying square of A." ?? ?A2=dot(conj(A.T),A) ?? ?U,W2,VT=linalg.svd(A2) This works so far. --------------------------------------------------------------------------------------- I've been using that simple "fix" of "squaring" the original matrix for several years and it's worked every time. ?I'm not sure why. ?It was just a test and it worked. ? You could also change the underlying C or Fortran code, but you then have to recompile everything in numpy. ?I wasn't that brave. -- Lou Pecora, my views are my own. -------------- next part -------------- An HTML attachment was scrubbed... 
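One caveat about the "squaring" workaround quoted above: the SVD of conj(A.T) dot A only yields the right singular vectors and the squares of the singular values, so U still has to be rebuilt from A, and squaring the matrix also squares its condition number. A sketch of one way to package the fallback (an illustration, not code from the thread):

import numpy as np

def svd_with_gram_fallback(A):
    try:
        return np.linalg.svd(A, full_matrices=False)
    except np.linalg.LinAlgError:
        # Hermitian problem: conj(A.T) A = V diag(s**2) conj(V.T)
        A2 = np.dot(np.conj(A.T), A)
        _, w2, vt = np.linalg.svd(A2)
        s = np.sqrt(w2)
        # Drop singular values that are numerically zero before dividing.
        keep = s > s[0] * np.finfo(np.float64).eps * max(A.shape)
        s, vt = s[keep], vt[keep]
        u = np.dot(A, vt.conj().T) / s   # columns are A v_i / s_i
        return u, s, vt
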
URL: From torgil.svensson at gmail.com Sun Aug 14 11:31:24 2011 From: torgil.svensson at gmail.com (Torgil Svensson) Date: Sun, 14 Aug 2011 17:31:24 +0200 Subject: [Numpy-discussion] Efficient way to load a 1Gb file? In-Reply-To: References: Message-ID: Try the fromiter function, that will allow you to pass an iterator which can read the file line by line and not preload the whole file. file_iterator = iter(open('filename.txt') line_parser = lambda x: map(float,x.split('\t')) a=np.fromiter(itertools.imap(line_parser,file_iterator),dtype=float) You have also the option to iterate the file twice and pass the "count" argument. //Torgil On Wed, Aug 10, 2011 at 7:22 PM, Russell E. Owen wrote: > A coworker is trying to load a 1Gb text data file into a numpy array > using numpy.loadtxt, but he says it is using up all of his machine's 6Gb > of RAM. Is there a more efficient way to read such text data files? > > -- Russell > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From wesmckinn at gmail.com Sun Aug 14 11:58:30 2011 From: wesmckinn at gmail.com (Wes McKinney) Date: Sun, 14 Aug 2011 11:58:30 -0400 Subject: [Numpy-discussion] Questionable reduceat behavior In-Reply-To: References: Message-ID: On Sat, Aug 13, 2011 at 8:06 PM, Mark Wiebe wrote: > Looks like this is the second-oldest open bug in the bug tracker. > http://projects.scipy.org/numpy/ticket/236 > For what it's worth, I'm in favour of changing this behavior to be more > consistent as proposed in that ticket. > -Mark > > On Thu, Aug 11, 2011 at 11:25 AM, Wes McKinney wrote: >> >> I'm a little perplexed why reduceat was made to behave like this: >> >> In [26]: arr = np.ones((10, 4), dtype=bool) >> >> In [27]: arr >> Out[27]: >> array([[ True, ?True, ?True, ?True], >> ? ? ? [ True, ?True, ?True, ?True], >> ? ? ? [ True, ?True, ?True, ?True], >> ? ? ? [ True, ?True, ?True, ?True], >> ? ? ? [ True, ?True, ?True, ?True], >> ? ? ? [ True, ?True, ?True, ?True], >> ? ? ? [ True, ?True, ?True, ?True], >> ? ? ? [ True, ?True, ?True, ?True], >> ? ? ? [ True, ?True, ?True, ?True], >> ? ? ? [ True, ?True, ?True, ?True]], dtype=bool) >> >> >> In [30]: np.add.reduceat(arr, [0, 3, 3, 7, 9], axis=0) >> Out[30]: >> array([[3, 3, 3, 3], >> ? ? ? [1, 1, 1, 1], >> ? ? ? [4, 4, 4, 4], >> ? ? ? [2, 2, 2, 2], >> ? ? ? [1, 1, 1, 1]]) >> >> this does not seem intuitively correct. Since we have: >> >> In [33]: arr[3:3].sum(0) >> Out[33]: array([0, 0, 0, 0]) >> >> I would expect >> >> array([[3, 3, 3, 3], >> ? ? ? [0, 0, 0, 0], >> ? ? ? [4, 4, 4, 4], >> ? ? ? [2, 2, 2, 2], >> ? ? ? [1, 1, 1, 1]]) >> >> Obviously I can RTFM and see why it does this ("if ``indices[i] >= >> indices[i + 1]``, the i-th generalized "row" is simply >> ``a[indices[i]]``"), but it doesn't make much sense to me, and I need >> work around it. Suggestions? >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > Well, I certainly hope it doesn't get forgotten about for another 5 years. I think having more consistent behavior would be better rather than conforming to a seemingly arbitrary decision made ages ago in Numeric. 
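A possible workaround for the empty-segment behaviour, pending any change to reduceat itself (a sketch assuming segments defined by `indices` along axis 0, as in the example above):

import numpy as np

arr = np.ones((10, 4), dtype=bool)
indices = np.array([0, 3, 3, 7, 9])

out = np.add.reduceat(arr, indices, axis=0)

# Segment lengths; the last segment runs to the end of the axis.
lengths = np.diff(np.append(indices, arr.shape[0]))
# Empty segments currently hold arr[indices[i]]; overwrite them with the
# identity of the reduction (0 for add).
out[lengths <= 0] = 0

print(out)
# [[3 3 3 3]
#  [0 0 0 0]
#  [4 4 4 4]
#  [2 2 2 2]
#  [1 1 1 1]]
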
- Wes From alan at ajackson.org Sun Aug 14 13:43:06 2011 From: alan at ajackson.org (alan at ajackson.org) Date: Sun, 14 Aug 2011 12:43:06 -0500 Subject: [Numpy-discussion] help translating matlab to numpy Message-ID: <20110814124306.51a1aba1@ajackson.org> I'm translating some code from Matlab to numpy, and struggling a bit since I have very little knowledge of Matlab. My question is this - the arg function in Matlab (which seems to be deprecated, they don't show it in their current documentation) is exactly equivalent to what in Numpy? I know it is angle(x, deg=1) to get degrees instead of radians, but what is the output range for the Matlab function -pi to pi, or 0 to 2*pi ? Thanks! -- ----------------------------------------------------------------------- | Alan K. Jackson | To see a World in a Grain of Sand | | alan at ajackson.org | And a Heaven in a Wild Flower, | | www.ajackson.org | Hold Infinity in the palm of your hand | | Houston, Texas | And Eternity in an hour. - Blake | ----------------------------------------------------------------------- From silva at lma.cnrs-mrs.fr Sun Aug 14 13:55:12 2011 From: silva at lma.cnrs-mrs.fr (Fabrice Silva) Date: Sun, 14 Aug 2011 19:55:12 +0200 Subject: [Numpy-discussion] help translating matlab to numpy In-Reply-To: <20110814124306.51a1aba1@ajackson.org> References: <20110814124306.51a1aba1@ajackson.org> Message-ID: <1313344512.2432.1.camel@amilo.coursju> Le dimanche 14 ao?t 2011 ? 12:43 -0500, alan at ajackson.org a ?crit : > I'm translating some code from Matlab to numpy, and struggling a bit > since I have very little knowledge of Matlab. > > My question is this - the arg function in Matlab (which seems to be deprecated, > they don't show it in their current documentation) is exactly equivalent to > what in Numpy? I know it is angle(x, deg=1) to get degrees instead of radians, > but what is the output range for the Matlab function -pi to pi, or 0 to 2*pi ? Can you tell from which toolbox your arg function comes from ? Using help (or which ?) for instance... It could help! -- Fabrice From alan at ajackson.org Sun Aug 14 13:56:43 2011 From: alan at ajackson.org (alan at ajackson.org) Date: Sun, 14 Aug 2011 12:56:43 -0500 Subject: [Numpy-discussion] help translating matlab to numpy In-Reply-To: <20110814124306.51a1aba1@ajackson.org> References: <20110814124306.51a1aba1@ajackson.org> Message-ID: <20110814125643.2f9311d4@ajackson.org> Never mind, I've been digging through too much stuff and got confused... I think trying to read Matlab code can do that to you. 8-) >I'm translating some code from Matlab to numpy, and struggling a bit >since I have very little knowledge of Matlab. > >My question is this - the arg function in Matlab (which seems to be deprecated, >they don't show it in their current documentation) is exactly equivalent to >what in Numpy? I know it is angle(x, deg=1) to get degrees instead of radians, >but what is the output range for the Matlab function -pi to pi, or 0 to 2*pi ? > >Thanks! > >-- >----------------------------------------------------------------------- >| Alan K. Jackson | To see a World in a Grain of Sand | >| alan at ajackson.org | And a Heaven in a Wild Flower, | >| www.ajackson.org | Hold Infinity in the palm of your hand | >| Houston, Texas | And Eternity in an hour. 
- Blake | >----------------------------------------------------------------------- >_______________________________________________ >NumPy-Discussion mailing list >NumPy-Discussion at scipy.org >http://mail.scipy.org/mailman/listinfo/numpy-discussion -- ----------------------------------------------------------------------- | Alan K. Jackson | To see a World in a Grain of Sand | | alan at ajackson.org | And a Heaven in a Wild Flower, | | www.ajackson.org | Hold Infinity in the palm of your hand | | Houston, Texas | And Eternity in an hour. - Blake | ----------------------------------------------------------------------- From dhanjal at telecom-paristech.fr Sun Aug 14 15:15:35 2011 From: dhanjal at telecom-paristech.fr (Charanpal Dhanjal) Date: Sun, 14 Aug 2011 21:15:35 +0200 Subject: [Numpy-discussion] SVD does not converge on "clean" matrix In-Reply-To: <1313332026.61861.YahooMailNeo@web34404.mail.mud.yahoo.com> References: <203aa1b32d794c238d32cb8d29036cc2.squirrel@webmail1.telecom-paristech.fr> <1313332026.61861.YahooMailNeo@web34404.mail.mud.yahoo.com> Message-ID: <6d45c5e06b9e78cd9f56cf3ff2d604a5@telecom-paristech.fr> Thanks very much Lou for the information. I tried delving into the C code and found a line in the dlasd4_ routine which reads: for (niter = iter; niter <= MAXITERLOOPS; ++niter) { This is apparently the main loop for this subroutine and the value of MAXITERLOOPS = 100. All I did was increase the maximum number of iterations to 200, and this seemed to solve the problem for the matrix in question. Let this matrix be called A, then >>> P0, o0, Q0 = numpy.linalg.svd(A, full_matrices=False) >>> numpy.linalg.norm((P0*o0).dot(Q0)- A) 1.8558089412794851 >>> numpy.linalg.norm(A) 4.558649005154054 >>> A.shape (1952, 895) It seems A has quite a small norm given its dimension, and perhaps this explains the error in the SVD (the numpy.linalg.norm((P0*o0).dot(Q0)- A) bit). To investigate a little further I tried finding the SVD of A*1000: >>> P0, o0, Q0 = numpy.linalg.svd(A*1000, full_matrices=False) >>> numpy.isfinite(Q0).all() False >>> numpy.isfinite(P0).all() False >>> numpy.isfinite(o0).all() False and hence increasing the number of iterations does not solve the problem in this case. That was about as far as I felt I could go with investigating the C code. In the meanwhile I will try the squaring the matrix solution. Incidentally, I am confused as to why numpy calls the lapack lite routines - when I call numpy.show_config() it seems to have detected my ATLAS libraries and I would have expected it to use those. Charanpal On Sun, 14 Aug 2011 07:27:06 -0700 (PDT), Lou Pecora wrote: > Chuck wrote: > > ------------------------- > > Fails here also, fedora 15 64 bits AMD 940. There should be a maximum > iterations argument somewhere... > > Chuck > > --------------------------------------------------- > > *** Here's the "FIX": > > Chuck is right. There is a max iterations. Here is a reply from a > thread of mine in this group several years ago about this problem and > some comments that might help you. > > ---- From Mr. Damian Menscher who was kind enough to find the > iteration location and provide some insight: > > Ok, so after several hours of trying to read that code, I found > the parameter that needs to be tuned. In case anyone has this > problem and finds this thread a year from now, here's your hint: > > File: Src/dlapack_lite.c > Subroutine: dlasd4_ > Line: 22562 > > There's a for loop there that limits the number of iterations to > 20. 
Increasing this value to 50 allows my matrix to converge. > I have not bothered to test what the "best" value for this number > is, though. In any case, it appears the number just exists to > prevent infinite loops, and 50 isn't really that much closer to > infinity than 20.... (Actually, I'm just going to set it to 100 > so I don't have to think about it ever again.) > > Damian Menscher > -- > -=#| Physics Grad Student & SysAdmin @ U Illinois Urbana-Champaign > |#=- > -=#| 488 LLP, 1110 W. Green St, Urbana, IL 61801 Ofc:(217)333-0038 > |#=- > -=#| 1412 DCL, Workstation Services Group, CITES Ofc:(217)244-3862 > |#=- > -=#| www.uiuc.edu/~menscher/ Fax:(217)333-9819 |#=- > > ---- My reply and a "fix" of sorts without changing the hard coded > iteration max: > > I have looked in Src/dlapack_lite.c and line 22562 is no longer a > line > that sets a max. iterations parameter. There are several set in the > file, but that code is hard to figure (sort of a Fortran-in-C > hybrid). > > > Here's one, for example: > > maxit = *n * 6 * *n; // Line 887 > > I have no idea which parameter to tweak. Apparently this error is > still in numpy (at least to my version). Does anyone have a fix? > Should I start a ticket (I think this is what people do)? Any help > appreciated. > > I'm using a Mac Book Pro (Intel chip), system 10.4.11, Python 2.4.4. > > ============ Possible try/except =========================== > > # A is the original matrix > try: > U,W,VT=linalg.svd(A) > except linalg.linalg.LinAlgError: # "Square" the matrix and do SVD > > print "Got svd except, trying square of A." > A2=dot(conj(A.T),A) > U,W2,VT=linalg.svd(A2) > > This works so far. > > > --------------------------------------------------------------------------------------- > > I've been using that simple "fix" of "squaring" the original matrix > for several years and it's worked every time. I'm not sure why. It > was > just a test and it worked. > > You could also change the underlying C or Fortran code, but you then > have to recompile everything in numpy. I wasn't that brave. > > -- Lou Pecora, my views are my own. From brennan.williams at visualreservoir.com Sun Aug 14 18:59:27 2011 From: brennan.williams at visualreservoir.com (Brennan Williams) Date: Mon, 15 Aug 2011 10:59:27 +1200 Subject: [Numpy-discussion] Statistical distributions on samples In-Reply-To: References: Message-ID: <4E48534F.8010008@visualreservoir.com> You can use scipy.stats.truncnorm, can't you? Unless I misread, you want to sample a normal distribution but with generated values only being within a specified range? However you also say you want to do this with triangular and log normal and for these I presume the easiest way is to sample and then accept/reject. Brennan On 13/08/2011 2:53 a.m., Christopher Jordan-Squire wrote: > Hi Andrea--An easy way to get something like this would be > > import numpy as np > import scipy.stats as stats > > sigma = #some reasonable standard deviation for your application > x = stats.norm.rvs(size=1000, loc=125, scale=sigma) > x = x[x>50] > x = x[x<200] > > That will give a roughly normal distribution to your velocities, as > long as, say, sigma<25. (I'm using the rule of thumb for the normal > distribution that normal random samples lie 3 standard deviations away > from the mean about 1 out of 350 times.) Though you won't be able to > get exactly normal errors about your mean since normal random samples > can theoretically be of any size. 
> > You can use this same process for any other distribution, as long as > you've chosen a scale variable so that the probability of samples > being outside your desired interval is really small. Of course, once > again your random errors won't be exactly from the distribution you > get your original samples from. > > -Chris JS > > On Fri, Aug 12, 2011 at 8:32 AM, Andrea Gavana > > wrote: > > Hi All, > > I am working on something that appeared to be a no-brainer > issue (at the beginning), by my complete ignorance in statistics > is overwhelming and I got stuck. > > What I am trying to do can be summarized as follows > > Let's assume that I have to generate a sample of a 1,000 values > for a variable (let's say, "velocity") using a normal distribution > (but later I will have to do it with log-normal, triangular and a > couple of others). The only thing I know about this velocity > sample is the minimum and maximum values (let's say 50 and 200 > respectively) and, obviously for the normal distribution (but not > so for the other distributions), the mean value (125 in this case). > > Now, I would like to generate this sample of 1,000 points, in > which none of the point has velocity smaller than 50 or bigger > than 200, and the number of samples close to the mean (125) should > be higher than the number of samples close to the minimum and the > maximum, following some kind of normal distribution. > > What I have tried up to now is summarized in the code below, but > as you can easily see, I don't really know what I am doing. I am > open to every suggestion, and I apologize for the dumbness of my > question. > > import numpy > > from scipy import stats > import matplotlib.pyplot as plt > > minval, maxval = 50.0, 250.0 > x = numpy.linspace(minval, maxval, 500) > > samp = stats.norm.rvs(size=len(x)) > pdf = stats.norm.pdf(x) > cdf = stats.norm.cdf(x) > ppf = stats.norm.ppf(x) > > ax1 = plt.subplot(2, 2, 1) > ax1.plot(range(len(x)), samp) > > ax2 = plt.subplot(2, 2, 2) > ax2.plot(x, pdf) > > ax3 = plt.subplot(2, 2, 3) > ax3.plot(x, cdf) > > ax4 = plt.subplot(2, 2, 4) > ax4.plot(x, ppf) > > plt.show() > > > Andrea. > > "Imagination Is The Only Weapon In The War Against Reality." > http://xoomer.alice.it/infinity77/ > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- An HTML attachment was scrubbed... URL: From aronne.merrelli at gmail.com Sun Aug 14 19:51:10 2011 From: aronne.merrelli at gmail.com (Aronne Merrelli) Date: Sun, 14 Aug 2011 18:51:10 -0500 Subject: [Numpy-discussion] __array_wrap__ / __array_finalize__ in NumPy v1.4+ Message-ID: Hello, I'm attempting to implement a subclass of ndarray, and becoming confused about the way __array_wrap__ and __array_finalize__ operate. I boiled it down to a short subclass, which is the example on the website at http://docs.scipy.org/doc/numpy-1.6.0/user/basics.subclassing.html, with one added attribute that is a copy of the self array multiplied by 2. The doubled copy is stored in a "plain" ndarray. The attachment has the python code. The output below is from NumPy 1.3 and 1.6 (1.4 has the same output as 1.6). The output from 1.3 matches the documentation on the website. 
In 1.6, __array_wrap__ and __array_finalize__ are invoked in the opposite order, __array_finalize__ appears to be getting an "empty" array, and array_wrap's argument is no longer an ndarray but rather an instance of the subclass. This doesn't match the documentation so I am not sure if this is the correct behavior in newer NumPy. Is this a bug, or the expected behavior in newer NumPy versions? Am I just missing something simple? The actual code I am trying to write uses essentially the same idea - keeping another array, related to the self array through some calculation, as another object attribute. Is there a better way to accomplish this? Thanks, Aronne NumPy version: 1.3.0 object creation In __array_finalize__: self type , values TestClass([0, 1]) obj type , values array([0, 1]) object + ndarray In __array_wrap__: self type , values TestClass([0, 1]) arr type , values array([2, 3]) In __array_finalize__: self type , values TestClass([2, 3]) obj type , values TestClass([0, 1]) obj= [0 1] [0 2] ret= [2 3] [4 6] NumPy version: 1.6.0 object creation In __array_finalize__: self type , values TestClass([0, 1]) obj type , values array([0, 1]) object + ndarray In __array_finalize__: self type , values TestClass([ 15, 22033837]) obj type , values TestClass([0, 1]) In __array_wrap__: self type , values TestClass([0, 1]) arr type , values TestClass([2, 3]) obj= [0 1] [0 2] ret= [2 3] [ 30 44067674] -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: array_wrap_test.py Type: application/octet-stream Size: 1033 bytes Desc: not available URL: From andrea.gavana at gmail.com Mon Aug 15 05:10:37 2011 From: andrea.gavana at gmail.com (Andrea Gavana) Date: Mon, 15 Aug 2011 11:10:37 +0200 Subject: [Numpy-discussion] Statistical distributions on samples In-Reply-To: <4E48534F.8010008@visualreservoir.com> References: <4E48534F.8010008@visualreservoir.com> Message-ID: Hi Chris & Brennan, On 15 August 2011 00:59, Brennan Williams wrote: > You can use scipy.stats.truncnorm, can't you? Unless I misread, you want to > sample a normal distribution but with generated values only being within a > specified range? However you also say you want to do this with triangular > and log normal and for these I presume the easiest way is to sample and then > accept/reject. > > Brennan > > On 13/08/2011 2:53 a.m., Christopher Jordan-Squire wrote: > > Hi Andrea--An easy way to get something like this would be > > import numpy as np > import scipy.stats as stats > > sigma = #some reasonable standard deviation for your application > x = stats.norm.rvs(size=1000, loc=125, scale=sigma) > x = x[x>50] > x = x[x<200] > > That will give a roughly normal distribution to your velocities, as long as, > say, sigma<25. (I'm using the rule of thumb for the normal distribution that > normal random samples lie 3 standard deviations away from the mean about 1 > out of 350 times.) Though you won't be able to get exactly normal errors > about your mean since normal random samples can theoretically be of any > size. > > You can use this same process for any other distribution, as long as you've > chosen a scale variable so that the probability of samples being outside > your desired interval is really small. Of course, once again your random > errors won't be exactly from the distribution you get your original samples > from. Thank you for your answer. 
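Just to check I have understood the suggestion, this is the kind of sketch I have in mind for the normal case (assuming truncnorm's a and b are given in standard deviations from loc, i.e. (bound - loc) / scale):

import scipy.stats as stats

lower, upper, mu, sigma = 50.0, 200.0, 125.0, 25.0
# truncnorm takes its bounds in standardized units
a, b = (lower - mu) / sigma, (upper - mu) / sigma
velocities = stats.truncnorm.rvs(a, b, loc=mu, scale=sigma, size=1000)
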
Indeed, it appears that a truncated distribution implementation exists only for the normal distribution (in the subset of distributions I need to use). I haven't checked yet what the code for truncnorm does but maybe it might be possible to apply the same approach for other distributions. In any case the sampling/reject/accept approach is the best approach for me, due to my ignorance about statistical things :-) Thank you again. Andrea. "Imagination Is The Only Weapon In The War Against Reality." http://xoomer.alice.it/infinity77/ >>> import PyQt4.QtGui Traceback (most recent call last): ? File "", line 1, in ImportError: No module named PyQt4.QtGui >>> >>> import pygtk Traceback (most recent call last): ? File "", line 1, in ImportError: No module named pygtk >>> >>> import wx >>> >>> From pearu.peterson at gmail.com Mon Aug 15 08:50:40 2011 From: pearu.peterson at gmail.com (Pearu Peterson) Date: Mon, 15 Aug 2011 15:50:40 +0300 Subject: [Numpy-discussion] ULONG not in UINT16, UINT32, UINT64 under 64-bit windows, is this possible? Message-ID: Hi, A student of mine using 32-bit numpy 1.5 under 64-bit Windows 7 noticed that giving a numpy array with dtype=uint32 to an extension module the following codelet would fail: switch(PyArray_TYPE(ARR)) { case PyArray_UINT16: /* do smth */ break; case PyArray_UINT32: /* do smth */ break; case PyArray_UINT64: /* do smth */ break; default: /* raise type error exception */ } The same test worked fine under Linux. Checking the value of PyArray_TYPE(ARR) (=8) showed that it corresponds to NPY_ULONG (when counting the items in the enum definition). Is this situation possible where NPY_ULONG does not correspond to a 16 or 32 or 64 bit integer? Or does this indicate a bug somewhere for this particular platform? Thanks, Pearu -------------- next part -------------- An HTML attachment was scrubbed... URL: From shish at keba.be Mon Aug 15 09:25:22 2011 From: shish at keba.be (Olivier Delalleau) Date: Mon, 15 Aug 2011 09:25:22 -0400 Subject: [Numpy-discussion] ULONG not in UINT16, UINT32, UINT64 under 64-bit windows, is this possible? In-Reply-To: References: Message-ID: The reason is there can be multiple dtypes (i.e. with different .num) representing the same kind of data. Usually in Python this goes unnoticed, because you do not test a dtype through its .num, instead you use for instance "== 'uint32'", and all works fine. However, it can indeed confuse C code in situations like the one you describe, because of direct comparison of .num. I guess you have a few options: - Do not compare .num (I'm not sure what would be the equivalent to "== 'utin32' in C though) => probably slower - Re-cast your array in the exact dtype you need (in Python you can do this with .view) => probably cumbersome - Write a customized comparison function that figures out at initialization time all dtypes that represent the same data, and then is able to do a fast comparison based on .num => probably best, but requires more work Here's some Python code that lists the various scalar dtypes associated to unique .num in numpy (excerpt slightly modified from code found in Theano -- http://deeplearning.net/software/theano -- BSD license). Call the "get_numeric_types()" function, and print both the string representation of the resulting dtypes as well as their .num. def get_numeric_subclasses(cls=numpy.number, ignore=None): """ Return subclasses of `cls` in the numpy scalar hierarchy. We only return subclasses that correspond to unique data types. 
The hierarchy can be seen here: http://docs.scipy.org/doc/numpy/reference/arrays.scalars.html """ if ignore is None: ignore = [] rval = [] dtype = numpy.dtype(cls) dtype_num = dtype.num if dtype_num not in ignore: # Safety check: we should be able to represent 0 with this data type. numpy.array(0, dtype=dtype) rval.append(cls) ignore.append(dtype_num) for sub in cls.__subclasses__(): rval += [c for c in get_numeric_subclasses(sub, ignore=ignore)] return rval def get_numeric_types(): """ Return numpy numeric data types. :returns: A list of unique data type objects. Note that multiple data types may share the same string representation, but can be differentiated through their `num` attribute. """ rval = [] def is_within(cls1, cls2): # Return True if scalars defined from `cls1` are within the hierarchy # starting from `cls2`. # The third test below is to catch for instance the fact that # one can use ``dtype=numpy.number`` and obtain a float64 scalar, even # though `numpy.number` is not under `numpy.floating` in the class # hierarchy. return (cls1 is cls2 or issubclass(cls1, cls2) or isinstance(numpy.array([0], dtype=cls1)[0], cls2)) for cls in get_numeric_subclasses(): dtype = numpy.dtype(cls) rval.append([str(dtype), dtype, dtype.num]) # We sort it to be deterministic, then remove the string and num elements. return [x[1] for x in sorted(rval, key=str)] 2011/8/15 Pearu Peterson > > Hi, > > A student of mine using 32-bit numpy 1.5 under 64-bit Windows 7 noticed > that > giving a numpy array with dtype=uint32 to an extension module the > following codelet would fail: > > switch(PyArray_TYPE(ARR)) { > case PyArray_UINT16: /* do smth */ break; > case PyArray_UINT32: /* do smth */ break; > case PyArray_UINT64: /* do smth */ break; > default: /* raise type error exception */ > } > > The same test worked fine under Linux. > > Checking the value of PyArray_TYPE(ARR) (=8) showed that it corresponds to > NPY_ULONG (when counting the items in the enum definition). > > Is this situation possible where NPY_ULONG does not correspond to a 16 or > 32 or 64 bit integer? > Or does this indicate a bug somewhere for this particular platform? > > Thanks, > Pearu > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From andrea.gavana at gmail.com Mon Aug 15 09:53:17 2011 From: andrea.gavana at gmail.com (Andrea Gavana) Date: Mon, 15 Aug 2011 15:53:17 +0200 Subject: [Numpy-discussion] Statistical distributions on samples In-Reply-To: References: Message-ID: Hi Chris and All, On 12 August 2011 16:53, Christopher Jordan-Squire wrote: > Hi Andrea--An easy way to get something like this would be > > import numpy as np > import scipy.stats as stats > > sigma = #some reasonable standard deviation for your application > x = stats.norm.rvs(size=1000, loc=125, scale=sigma) > x = x[x>50] > x = x[x<200] > > That will give a roughly normal distribution to your velocities, as long as, > say, sigma<25. (I'm using the rule of thumb for the normal distribution that > normal random samples lie 3 standard deviations away from the mean about 1 > out of 350 times.) Though you won't be able to get exactly normal errors > about your mean since normal random samples can theoretically be of any > size. 
> > You can use this same process for any other distribution, as long as you've > chosen a scale variable so that the probability of samples being outside > your desired interval is really small. Of course, once again your random > errors won't be exactly from the distribution you get your original samples > from. Thank you for your suggestion. There are a couple of things I am not clear with, however. The first one (the easy one), is: let's suppose I need 200 values, and the accept/discard procedure removes 5 of them from the list. Is there any way to draw these 200 values from a bigger sample so that the accept/reject procedure will not interfere too much? And how do I get 200 values out of the bigger sample so that these values are still representative? Another thing, possibly completely unrelated. I am trying to design a toy Latin Hypercube script (just for my own understanding). I found this piece of code on the web (and I modified it slightly): def lhs(dist, size=100): ''' Latin Hypercube sampling of any distrbution. dist is is a scipy.stats random number generator such as stats.norm, stats.beta, etc parms is a tuple with the parameters needed for the specified distribution. :Parameters: - `dist`: random number generator from scipy.stats module. - `size` :size for the output sample ''' n = size perc = numpy.arange(0.0, 1.0, 1.0/n) numpy.random.shuffle(perc) smp = [stats.uniform(i,1.0/n).rvs() for i in perc] v = dist.ppf(smp) return v Now, I am not 100% clear of what the percent point function is (I have read around the web, but please keep in mind that my statistical skills are close to minus infinity). From this page: http://www.itl.nist.gov/div898/handbook/eda/section3/eda362.htm I gather that, if you plot the results of the ppf, with the horizontal axis as probability, the vertical axis goes from the smallest to the largest value of the cumulative distribution function. If i do this: numpy.random.seed(123456) distribution = stats.norm(loc=125, scale=25) my_lhs = lhs(distribution, 50) Will my_lhs always contain valid values (i.e., included between 50 and 200)? I assume the answer is no... but even if this was the case, is this my_lhs array ready to be used to setup a LHS experiment when I have multi-dimensional problems (in which all the variables are completely independent from each other - no correlation)? My apologies for the idiocy of the questions. Andrea. "Imagination Is The Only Weapon In The War Against Reality." http://xoomer.alice.it/infinity77/ >>> import PyQt4.QtGui Traceback (most recent call last): ? File "", line 1, in ImportError: No module named PyQt4.QtGui >>> >>> import pygtk Traceback (most recent call last): ? File "", line 1, in ImportError: No module named pygtk >>> >>> import wx >>> >>> From tmp50 at ukr.net Mon Aug 15 15:21:30 2011 From: tmp50 at ukr.net (Dmitrey) Date: Mon, 15 Aug 2011 22:21:30 +0300 Subject: [Numpy-discussion] [ANN] Constrained optimization solver with guaranteed precision Message-ID: Hi all, I'm glad to inform you that general constraints handling for interalg (free solver with guaranteed user-defined precision) now is available. Despite it is very premature and requires lots of improvements, it is already capable of outperforming commercial BARON (example: http://openopt.org/interalg_bench#Test_4) and thus you could be interested in trying it right now (next OpenOpt release will be no sooner than 1 month). 
interalg can be especially more effective than BARON (and some other competitors) on problems with huge or absent Lipschitz constant, for example on funcs like sqrt(x), log(x), 1/x, x**alpha, alpha<1, when domain of x is something like [small_positive_value, another_value]. Let me also remember you that interalg can search for all solutions of nonlinear equations / systems of them where local solvers like scipy.optimize fsolve cannot find anyone, and search single/multiple integral with guaranteed user-defined precision (speed of integration is intended to be enhanced in future). However, only FuncDesigner models are handled (read interalg webpage for more details). Regards, D. -------------- next part -------------- An HTML attachment was scrubbed... URL: From cjordan1 at uw.edu Mon Aug 15 15:40:16 2011 From: cjordan1 at uw.edu (Christopher Jordan-Squire) Date: Mon, 15 Aug 2011 14:40:16 -0500 Subject: [Numpy-discussion] Statistical distributions on samples In-Reply-To: References: Message-ID: On Mon, Aug 15, 2011 at 8:53 AM, Andrea Gavana wrote: > Hi Chris and All, > > On 12 August 2011 16:53, Christopher Jordan-Squire wrote: > > Hi Andrea--An easy way to get something like this would be > > > > import numpy as np > > import scipy.stats as stats > > > > sigma = #some reasonable standard deviation for your application > > x = stats.norm.rvs(size=1000, loc=125, scale=sigma) > > x = x[x>50] > > x = x[x<200] > > > > That will give a roughly normal distribution to your velocities, as long > as, > > say, sigma<25. (I'm using the rule of thumb for the normal distribution > that > > normal random samples lie 3 standard deviations away from the mean about > 1 > > out of 350 times.) Though you won't be able to get exactly normal errors > > about your mean since normal random samples can theoretically be of any > > size. > > > > You can use this same process for any other distribution, as long as > you've > > chosen a scale variable so that the probability of samples being outside > > your desired interval is really small. Of course, once again your random > > errors won't be exactly from the distribution you get your original > samples > > from. > > Thank you for your suggestion. There are a couple of things I am not > clear with, however. The first one (the easy one), is: let's suppose I > need 200 values, and the accept/discard procedure removes 5 of them > from the list. Is there any way to draw these 200 values from a bigger > sample so that the accept/reject procedure will not interfere too > much? And how do I get 200 values out of the bigger sample so that > these values are still representative? > FWIW, I'm not really advocating a truncated normal so much as making the standard deviation small enough so that there's no real difference between a true normal distribution and a truncated normal. If you're worried about getting exactly 200 samples, then you could sample N with N>200 and such that after throwing out the ones that lie outside your desired region you're left with M>200. Then just randomly pick 200 from those M. That shouldn't bias anything as long as you randomly pick them. (Or just pick the first 200, if you haven't done anything to impose any order on the samples, such as sorting them by size.) But I'm not sure why you'd want exactly 200 samples instead of some number of samples close to 200. > > Another thing, possibly completely unrelated. I am trying to design a > toy Latin Hypercube script (just for my own understanding). 
I found > this piece of code on the web (and I modified it slightly): > > def lhs(dist, size=100): > ''' > Latin Hypercube sampling of any distrbution. > dist is is a scipy.stats random number generator > such as stats.norm, stats.beta, etc > parms is a tuple with the parameters needed for > the specified distribution. > > :Parameters: > - `dist`: random number generator from scipy.stats module. > - `size` :size for the output sample > ''' > > n = size > > perc = numpy.arange(0.0, 1.0, 1.0/n) > numpy.random.shuffle(perc) > > smp = [stats.uniform(i,1.0/n).rvs() for i in perc] > > v = dist.ppf(smp) > > return v > > > Now, I am not 100% clear of what the percent point function is (I have > read around the web, but please keep in mind that my statistical > skills are close to minus infinity). From this page: > > http://www.itl.nist.gov/div898/handbook/eda/section3/eda362.htm > > The ppf is what's called the quantile function elsewhere. I do not know why scipy calls it the ppf/percent point function. The quantile function is the inverse of the cumulative density function (cdf). So dist.ppf(z) is the x such that P(dist <= x) = z. Roughly. (Things get slightly more finicky if you think about discrete distributions because then you have to pick what happens at the jumps in the cdf.) So dist.ppf(0.5) gives the median of dist, and dist.ppf(0.25) gives the lower/first quartile of dist. > I gather that, if you plot the results of the ppf, with the horizontal > axis as probability, the vertical axis goes from the smallest to the > largest value of the cumulative distribution function. If i do this: > > numpy.random.seed(123456) > > distribution = stats.norm(loc=125, scale=25) > > my_lhs = lhs(distribution, 50) > > Will my_lhs always contain valid values (i.e., included between 50 and > 200)? I assume the answer is no... but even if this was the case, is > this my_lhs array ready to be used to setup a LHS experiment when I > have multi-dimensional problems (in which all the variables are > completely independent from each other - no correlation)? > > I'm not really sure if the above function is doing the lhs you want. To answer your question, it won't always generate values within [50,200]. If size is large enough then you're dividing up the probability space evenly. So even with the random perturbations (whose use I don't really understand), you'll ensure that some of the samples you get when you apply the ppf will correspond to the extremely low probability samples that are <50 or >200. -Chris JS My apologies for the idiocy of the questions. > > Andrea. > > "Imagination Is The Only Weapon In The War Against Reality." > http://xoomer.alice.it/infinity77/ > > >>> import PyQt4.QtGui > Traceback (most recent call last): > File "", line 1, in > ImportError: No module named PyQt4.QtGui > >>> > >>> import pygtk > Traceback (most recent call last): > File "", line 1, in > ImportError: No module named pygtk > >>> > >>> import wx > >>> > >>> > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... 
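As a concrete illustration of the ppf-as-inverse-cdf point above (a quick sketch; the numbers are approximate):

import scipy.stats as stats

dist = stats.norm(loc=125, scale=25)
dist.ppf(0.5)              # 125.0, the median
dist.cdf(dist.ppf(0.975))  # ~0.975, since ppf inverts cdf
dist.ppf([0.001, 0.999])   # roughly [47.7, 202.3], i.e. tails outside [50, 200]
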
URL: From andrea.gavana at gmail.com Mon Aug 15 16:01:05 2011 From: andrea.gavana at gmail.com (Andrea Gavana) Date: Mon, 15 Aug 2011 23:01:05 +0300 Subject: [Numpy-discussion] [ANN] Constrained optimization solver with guaranteed precision In-Reply-To: References: Message-ID: Hi Dmitrey, 2011/8/15 Dmitrey : > Hi all, > I'm glad to inform you that general constraints handling for interalg (free > solver with guaranteed user-defined precision) now is available. Despite it > is very premature and requires lots of improvements, it is already capable > of outperforming commercial BARON (example: > http://openopt.org/interalg_bench#Test_4)? and thus you could be interested > in trying it right now (next OpenOpt release will be no sooner than 1 > month). > > interalg can be especially more effective than BARON (and some other > competitors) on problems with huge or absent Lipschitz constant, for example > on funcs like sqrt(x), log(x), 1/x, x**alpha, alpha<1, when domain of x is > something like [small_positive_value, another_value]. > > Let me also remember you that interalg can search for all solutions of > nonlinear equations / systems of them where local solvers like > scipy.optimize fsolve cannot find anyone, and search single/multiple > integral with guaranteed user-defined precision (speed of integration is > intended to be enhanced in future). > However, only FuncDesigner models are handled (read interalg webpage for > more details). Thank you for this new improvements. I am one of those who use OpenOpt in real life problems, and if I can advance a suggestion (for the second time), when you post a benchmark of various optimization methods, please do not consider the "elapsed time" only as a meaningful variable to measure a success/failure of an algorithm. Some (most?) of real life problems require intensive and time consuming simulations for every *function evaluation*; the time spent by the solver itself doing its calculations simply disappears in front of the real process simulation. I know it because our simulations take between 2 and 48 hours to run, so what's 300 seconds more or less in the solver calculations? If you talk about synthetic problems (such as the ones defined by a formula), I can see your point. For everything else, I believe the number of function evaluations is a more direct way to assess the quality of an optimization algorithm. Just my 2c. Andrea. "Imagination Is The Only Weapon In The War Against Reality." http://xoomer.alice.it/infinity77/ >>> import PyQt4.QtGui Traceback (most recent call last): ? File "", line 1, in ImportError: No module named PyQt4.QtGui >>> >>> import pygtk Traceback (most recent call last): ? File "", line 1, in ImportError: No module named pygtk >>> >>> import wx >>> >>> From tmp50 at ukr.net Mon Aug 15 16:09:37 2011 From: tmp50 at ukr.net (Dmitrey) Date: Mon, 15 Aug 2011 23:09:37 +0300 Subject: [Numpy-discussion] [ANN] Constrained optimization solver with guaranteed precision In-Reply-To: References: Message-ID: Hi Andrea, I believe benchmarks should be like Hans Mittelman do ( http://plato.asu.edu/bench.html ) and of course number of funcs evaluations matters when slow Python code vs compiled is tested, but my current work doesn't allow me to spend so much time for OpenOpt development, so, moreover, for auxiliary work such as benchmarking (and making it properly like that). Also, benchmarks of someone's own soft usually are not very trustful, moreover, on his own probs. 
BTW, please don't reply on my posts in scipy mail lists - I use them only to post the announcements like this and can miss a reply. Regards, D. --- ???????? ????????? --- ?? ????: " Andrea Gavana" ????: " Discussion of Numerical Python" ????: 15 ??????? 2011, 23:01:05 ????: Re: [Numpy-discussion] [ANN] Constrained optimization solver with guaranteed precision Hi Dmitrey, 2011/8/15 Dmitrey < tmp50 at ukr.net >: > Hi all, > I'm glad to inform you that general constraints handling for interalg (free > solver with guaranteed user-defined precision) now is available. Despite it > is very premature and requires lots of improvements, it is already capable > of outperforming commercial BARON (example: > http://openopt.org/interalg_bench#Test_4 ) and thus you could be interested > in trying it right now (next OpenOpt release will be no sooner than 1 > month). > > interalg can be especially more effective than BARON (and some other > competitors) on problems with huge or absent Lipschitz constant, for example > on funcs like sqrt(x), log(x), 1/x, x**alpha, alpha<1, when domain of x is > something like [small_positive_value, another_value]. > > Let me also remember you that interalg can search for all solutions of > nonlinear equations / systems of them where local solvers like > scipy.optimize fsolve cannot find anyone, and search single/multiple > integral with guaranteed user-defined precision (speed of integration is > intended to be enhanced in future). > However, only FuncDesigner models are handled (read interalg webpage for > more details). Thank you for this new improvements. I am one of those who use OpenOpt in real life problems, and if I can advance a suggestion (for the second time), when you post a benchmark of various optimization methods, please do not consider the "elapsed time" only as a meaningful variable to measure a success/failure of an algorithm. Some (most?) of real life problems require intensive and time consuming simulations for every *function evaluation*; the time spent by the solver itself doing its calculations simply disappears in front of the real process simulation. I know it because our simulations take between 2 and 48 hours to run, so what's 300 seconds more or less in the solver calculations? If you talk about synthetic problems (such as the ones defined by a formula), I can see your point. For everything else, I believe the number of function evaluations is a more direct way to assess the quality of an optimization algorithm. Just my 2c. Andrea. -------------- next part -------------- An HTML attachment was scrubbed... URL: From daniel.wheeler2 at gmail.com Mon Aug 15 16:11:24 2011 From: daniel.wheeler2 at gmail.com (Daniel Wheeler) Date: Mon, 15 Aug 2011 16:11:24 -0400 Subject: [Numpy-discussion] inverting and calculating eigenvalues for many small matrices In-Reply-To: <4E1BFCE6.5030702@astro.uio.no> References: <4E1BFCE6.5030702@astro.uio.no> Message-ID: Hi, I put together a set of tools for inverting, multiplying and finding eigenvalues for many small matrices (arrays of shape (N, M, M) where MxM is the size of each matrix). Thanks to the posoter who suggested using the Tokyo package. Although not used directly, it helped with figuring the correct arguments to pass to the lapack routines and getting stated with cython. I put the code up here if anyone happens to be interested. It consists of three files, smallMatrixTools.py, smt.pyx amd smt.pxd. The speed tests comparing with numpy came out something like this... 
N, M, M: 10000, 2, 2 mulinv speed up: 65.9, eigvec speed up: 11.2 N, M, M: 10000, 3, 3 mulinv speed up: 32.3, eigvec speed up: 7.2 N, M, M: 10000, 4, 4 mulinv speed up: 24.1, eigvec speed up: 5.9 N, M, M: 10000, 5, 5 mulinv speed up: 17.0, eigvec speed up: 5.2 for random matrices. Not bad speed ups, but not out of this world. I'm new to cython so there may be some costly mistakes in the implementation. I profiled and it seems that most of the time is now being spent in the lapack routines, but still not completely convinced by the profiling results. One thing that I know I'm doing wrong is reassigning every sub-matrix to a new array. This may not be that costly, but it seems fairly ugly. I wasn't sure how to pass the address of the submatrix to the lapack routines so I'm assigning to a new array and passing that instead. I tested with and speed tests were done using . Cheers On Tue, Jul 12, 2011 at 3:51 AM, Dag Sverre Seljebotn wrote: > On 07/11/2011 11:01 PM, Daniel Wheeler wrote: >> Hi, I am trying to find the eigenvalues and eigenvectors as well as >> the inverse for a large number of small matrices. The matrix size >> (MxM) will typically range from 2x2 to 8x8 at most. The number of >> matrices (N) can be from 100 up to a million or more. My current >> solution is to define "eig" and "inv" to be, >> >> def inv(A): >> ? ? ?""" >> ? ? ?Inverts N MxM matrices, A.shape = (M, M, N), inv(A).shape = (M, M, N). >> ? ? ?""" >> ? ? ?return np.array(map(np.linalg.inv, A.transpose(2, 0, 1))).transpose(1, 2, 0) >> >> def eig(A): >> ? ? ?""" >> ? ? ?Calculate the eigenvalues and eigenvectors of N MxM matrices, >> A.shape = (M, M, N), eig(A)[0].shape = (M, N), eig(A)[1].shape = (M, >> M, N) >> ? ? ?""" >> ? ? ?tmp = zip(*map(np.linalg.eig, A.transpose(2, 0, 1))) >> ? ? ?return (np.array(tmp[0]).swapaxes(0,1), np.array(tmp[1]).transpose(1,2,0)) >> >> The above uses "map" to fake a vector solution, but this is heinously >> slow. Are there any better ways to do this without resorting to cython >> or weave (would it even be faster (or possible) to use "np.linalg.eig" >> and "np.linalg.inv" within cython)? I could write specialized versions > > If you want to go the Cython route, here's a start: > > http://www.vetta.org/2009/09/tokyo-a-cython-blas-wrapper-for-fast-matrix-math/ > > Dag Sverre > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -- Daniel Wheeler From rowen at uw.edu Mon Aug 15 16:25:50 2011 From: rowen at uw.edu (Russell E. Owen) Date: Mon, 15 Aug 2011 13:25:50 -0700 Subject: [Numpy-discussion] Efficient way to load a 1Gb file? References: Message-ID: In article , Torgil Svensson wrote: > Try the fromiter function, that will allow you to pass an iterator > which can read the file line by line and not preload the whole file. > > file_iterator = iter(open('filename.txt') > line_parser = lambda x: map(float,x.split('\t')) > a=np.fromiter(itertools.imap(line_parser,file_iterator),dtype=float) > > You have also the option to iterate the file twice and pass the > "count" argument. Thanks. That sounds great! 
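For the archive, a self-contained variant of that recipe (a sketch with an assumed column count; note the snippet above is missing a closing parenthesis after open(), and fromiter with a plain float dtype wants a flat stream of scalars, so the rows are flattened first and reshaped at the end):

import itertools
import numpy as np

ncols = 4   # assumed number of tab-separated columns per line

def parse_line(line):
    return [float(v) for v in line.split('\t')]

with open('filename.txt') as f:
    # the file is read line by line; only the parsed numbers are kept in memory
    flat = np.fromiter(itertools.chain.from_iterable(parse_line(line) for line in f),
                       dtype=float)
data = flat.reshape(-1, ncols)   # one row per input line
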
-- RUssell From matthew.brett at gmail.com Mon Aug 15 17:53:13 2011 From: matthew.brett at gmail.com (Matthew Brett) Date: Mon, 15 Aug 2011 14:53:13 -0700 Subject: [Numpy-discussion] Segfault for np.lookfor Message-ID: Hi, On current trunk, all tests pass but running the (forgive my language) doctests, I found this: In [1]: import numpy as np In [2]: np.__version__ Out[2]: '2.0.0.dev-730b861' In [3]: np.lookfor('cos') Segmentation fault on: Linux angela 2.6.38-10-generic #46-Ubuntu SMP Tue Jun 28 15:07:17 UTC 2011 x86_64 x86_64 x86_64 GNU/Linux Ubuntu Natty Python 2.7.1+ Best, Matthew From matthew.brett at gmail.com Mon Aug 15 18:29:08 2011 From: matthew.brett at gmail.com (Matthew Brett) Date: Mon, 15 Aug 2011 15:29:08 -0700 Subject: [Numpy-discussion] numpydoc - latex longtables error In-Reply-To: References: Message-ID: Hi, On Wed, Aug 10, 2011 at 5:17 PM, Matthew Brett wrote: > Hi, > > On Wed, Aug 10, 2011 at 5:03 PM, ? wrote: >> On Wed, Aug 10, 2011 at 6:17 PM, Matthew Brett wrote: >>> Hi, >>> >>> On Wed, Aug 10, 2011 at 12:38 PM, Skipper Seabold wrote: >>>> On Wed, Aug 10, 2011 at 3:28 PM, Matthew Brett wrote: >>>>> Hi, >>>>> >>>>> I think this one might be for Pauli. >>>>> >>>>> I've run into an odd problem that seems to be an interaction of >>>>> numpydoc and autosummary and large classes. >>>>> >>>>> In summary, large classes and numpydoc lead to large tables of class >>>>> methods, and there seems to be an error in the creation of the large >>>>> tables in latex. >>>>> >>>>> Specifically, if I run 'make latexpdf' with the attached minimal >>>>> sphinx setup, I get a pdflatex error ending thus: >>>>> >>>>> ... >>>>> l.118 \begin{longtable}{LL} >>>>> >>>>> and this is because longtable does not accept LL as an argument, but >>>>> needs '|l|l|' (bar - el - bar - el - bar). >>>>> >>>>> I see in sphinx.writers.latex.py, around line 657, that sphinx knows >>>>> about this in general, and long tables in standard ReST work fine with >>>>> the el-bar arguments passed to longtable. >>>>> >>>>> ? ? ? ?if self.table.colspec: >>>>> ? ? ? ? ? ?self.body.append(self.table.colspec) >>>>> ? ? ? ?else: >>>>> ? ? ? ? ? ?if self.table.has_problematic: >>>>> ? ? ? ? ? ? ? ?colwidth = 0.95 / self.table.colcount >>>>> ? ? ? ? ? ? ? ?colspec = ('p{%.3f\\linewidth}|' % colwidth) * \ >>>>> ? ? ? ? ? ? ? ? ? ? ? ? ?self.table.colcount >>>>> ? ? ? ? ? ? ? ?self.body.append('{|' + colspec + '}\n') >>>>> ? ? ? ? ? ?elif self.table.longtable: >>>>> ? ? ? ? ? ? ? ?self.body.append('{|' + ('l|' * self.table.colcount) + '}\n') >>>>> ? ? ? ? ? ?else: >>>>> ? ? ? ? ? ? ? ?self.body.append('{|' + ('L|' * self.table.colcount) + '}\n') >>>>> >>>>> However, using numpydoc and autosummary (see the conf.py file), what >>>>> seems to happen is that, when we reach the self.table.colspec test at >>>>> the beginning of the snippet above, 'self.table.colspec' is defined: >>>>> >>>>> In [1]: self.table.colspec >>>>> Out[1]: '{LL}\n' >>>>> >>>>> and thus the LL gets written as the arg to longtable: >>>>> >>>>> \begin{longtable}{LL} >>>>> >>>>> and the pdf build breaks. >>>>> >>>>> I'm using the numpydoc out of the current numpy source tree. >>>>> >>>>> At that point I wasn't sure how to proceed with debugging. ?Can you >>>>> give any hints? 
>>>>> >>>> >>>> It's not a proper fix, but our workaround is to edit the Makefile for >>>> latex (and latexpdf) to >>>> >>>> https://github.com/statsmodels/statsmodels/blob/master/scikits/statsmodels/docs/Makefile#L94 >>>> https://github.com/statsmodels/statsmodels/blob/master/scikits/statsmodels/docs/make.bat#L121 >>>> >>>> to call the script to replace the longtable arguments >>>> >>>> https://github.com/statsmodels/statsmodels/blob/master/scikits/statsmodels/docs/fix_longtable.py >>>> >>>> The workaround itself probably isn't optimal, and I'd be happy to hear >>>> of a proper fix. >>> >>> Thanks - yes - I found your workaround in my explorations, I put in a >>> version in our tree too: >>> >>> https://github.com/matthew-brett/nipy/blob/latex_build_fixes/tools/fix_longtable.py >>> >>> ?- but I agree it seems much better to get to the root cause. >> >> When I tried to figure this out, I never found out why the correct >> sphinx longtable code path never gets reached, or which code >> (numpydoc, autosummary or sphinx) is filling in the colspec. > > No - it looked hard to debug. ?I established that it required numpydoc > and autosummary to be enabled. It looks like this conversation dried up, so I've moved it to a ticket: http://projects.scipy.org/numpy/ticket/1935 Best, Matthew From charlesr.harris at gmail.com Mon Aug 15 20:56:12 2011 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 15 Aug 2011 18:56:12 -0600 Subject: [Numpy-discussion] Segfault for np.lookfor In-Reply-To: References: Message-ID: On Mon, Aug 15, 2011 at 3:53 PM, Matthew Brett wrote: > Hi, > > On current trunk, all tests pass but running the (forgive my language) > doctests, I found this: > > In [1]: import numpy as np > > In [2]: np.__version__ > Out[2]: '2.0.0.dev-730b861' > > In [3]: np.lookfor('cos') > Segmentation fault > > on: > > Linux angela 2.6.38-10-generic #46-Ubuntu SMP Tue Jun 28 15:07:17 UTC > 2011 x86_64 x86_64 x86_64 GNU/Linux > Ubuntu Natty Python 2.7.1+ > > The problem is somewhere in print_coercion_tables.py Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Mon Aug 15 21:09:11 2011 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 15 Aug 2011 19:09:11 -0600 Subject: [Numpy-discussion] Segfault for np.lookfor In-Reply-To: References: Message-ID: On Mon, Aug 15, 2011 at 6:56 PM, Charles R Harris wrote: > > > On Mon, Aug 15, 2011 at 3:53 PM, Matthew Brett wrote: > >> Hi, >> >> On current trunk, all tests pass but running the (forgive my language) >> doctests, I found this: >> >> In [1]: import numpy as np >> >> In [2]: np.__version__ >> Out[2]: '2.0.0.dev-730b861' >> >> In [3]: np.lookfor('cos') >> Segmentation fault >> >> on: >> >> Linux angela 2.6.38-10-generic #46-Ubuntu SMP Tue Jun 28 15:07:17 UTC >> 2011 x86_64 x86_64 x86_64 GNU/Linux >> Ubuntu Natty Python 2.7.1+ >> >> > The problem is somewhere in print_coercion_tables.py > > Or more precisely, it triggered by importing print_coercion_tables.py. I don't think lookfor should be doing that, but in any case: array + scalar + ? b h i l q p B H I L Q P e f d g F D G S U V O M m ? ? b h i l l l B H I L L L e f d g F D G O O # O ! m b b b b b b b b b b b b b b e f d g F D G O O # O ! m h h h h h h h h h h h h h h f f d g F D G O O # O ! m i i i i i i i i i i i i i i d d d g D D G O O # O ! m l l l l l l l l l l l l l l d d d g D D G O O # O ! m q l l l l l l l l l l l l l d d d g D D G O O # O ! m p l l l l l l l l l l l l l d d d g D D G O O # O ! 
m B B B B B B B B B B B B B B e f d g F D G O O # O ! m H H H H H H H H H H H H H H f f d g F D G O O # O ! m I I I I I I I I I I I I I I d d d g D D G O O # O ! m L L L L L L L L L L L L L L d d d g D D G O O # O ! m Q L L L L L L L L L L L L L d d d g D D G O O # O ! m P L L L L L L L L L L L L L d d d g D D G O O # O ! m e e e e e e e e e e e e e e e e e e F F F O O # O ! # f f f f f f f f f f f f f f f f f f F F F O O # O ! # d d d d d d d d d d d d d d d d d d D D D O O # O ! # g g g g g g g g g g g g g g g g g g G G G O O # O ! # F F F F F F F F F F F F F F F F F F F F F O O # O ! # D D D D D D D D D D D D D D D D D D D D D O O # O ! # G G G G G G G G G G G G G G G G G G G G G O O # O ! # S O O O O O O O O O O O O O O O O O O O O O O # O ! O U O O O O O O O O O O O O O O O O O O O O O O # O ! O Segmentation fault (core dumped) Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Mon Aug 15 21:43:57 2011 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 15 Aug 2011 19:43:57 -0600 Subject: [Numpy-discussion] Segfault for np.lookfor In-Reply-To: References: Message-ID: On Mon, Aug 15, 2011 at 7:09 PM, Charles R Harris wrote: > > > On Mon, Aug 15, 2011 at 6:56 PM, Charles R Harris < > charlesr.harris at gmail.com> wrote: > >> >> >> On Mon, Aug 15, 2011 at 3:53 PM, Matthew Brett wrote: >> >>> Hi, >>> >>> On current trunk, all tests pass but running the (forgive my language) >>> doctests, I found this: >>> >>> In [1]: import numpy as np >>> >>> In [2]: np.__version__ >>> Out[2]: '2.0.0.dev-730b861' >>> >>> In [3]: np.lookfor('cos') >>> Segmentation fault >>> >>> on: >>> >>> Linux angela 2.6.38-10-generic #46-Ubuntu SMP Tue Jun 28 15:07:17 UTC >>> 2011 x86_64 x86_64 x86_64 GNU/Linux >>> Ubuntu Natty Python 2.7.1+ >>> >>> >> The problem is somewhere in print_coercion_tables.py >> >> > Or more precisely, it triggered by importing print_coercion_tables.py. I > don't think lookfor should be doing that, but in any case: > > array + scalar > + ? b h i l q p B H I L Q P e f d g F D G S U V O M m > ? ? b h i l l l B H I L L L e f d g F D G O O # O ! m > b b b b b b b b b b b b b b e f d g F D G O O # O ! m > h h h h h h h h h h h h h h f f d g F D G O O # O ! m > i i i i i i i i i i i i i i d d d g D D G O O # O ! m > l l l l l l l l l l l l l l d d d g D D G O O # O ! m > q l l l l l l l l l l l l l d d d g D D G O O # O ! m > p l l l l l l l l l l l l l d d d g D D G O O # O ! m > B B B B B B B B B B B B B B e f d g F D G O O # O ! m > H H H H H H H H H H H H H H f f d g F D G O O # O ! m > I I I I I I I I I I I I I I d d d g D D G O O # O ! m > L L L L L L L L L L L L L L d d d g D D G O O # O ! m > Q L L L L L L L L L L L L L d d d g D D G O O # O ! m > P L L L L L L L L L L L L L d d d g D D G O O # O ! m > e e e e e e e e e e e e e e e e e e F F F O O # O ! # > f f f f f f f f f f f f f f f f f f F F F O O # O ! # > d d d d d d d d d d d d d d d d d d D D D O O # O ! # > g g g g g g g g g g g g g g g g g g G G G O O # O ! # > F F F F F F F F F F F F F F F F F F F F F O O # O ! # > D D D D D D D D D D D D D D D D D D D D D O O # O ! # > G G G G G G G G G G G G G G G G G G G G G O O # O ! # > S O O O O O O O O O O O O O O O O O O O O O O # O ! O > U O O O O O O O O O O O O O O O O O O O O O O # O ! O > Segmentation fault (core dumped) > A quick fix is to put the print statements in a function. 
diff --git a/numpy/testing/print_coercion_tables.py b/numpy/testing/print_coercion_tables.p index d875449..3bc9253 100755 --- a/numpy/testing/print_coercion_tables.py +++ b/numpy/testing/print_coercion_tables.py @@ -65,22 +65,23 @@ def print_coercion_table(ntypes, inputfirstvalue, inputsecondvalue, fir print char, print -print "can cast" -print_cancast_table(np.typecodes['All']) -print -print "In these tables, ValueError is '!', OverflowError is '@', TypeError is '#'" -print -print "scalar + scalar" -print_coercion_table(np.typecodes['All'], 0, 0, False) -print -print "scalar + neg scalar" -print_coercion_table(np.typecodes['All'], 0, -1, False) -print -print "array + scalar" -print_coercion_table(np.typecodes['All'], 0, 0, True) -print -print "array + neg scalar" -print_coercion_table(np.typecodes['All'], 0, -1, True) -print -print "promote_types" -print_coercion_table(np.typecodes['All'], 0, 0, False, True) +def printem(): + print "can cast" + print_cancast_table(np.typecodes['All']) + print + print "In these tables, ValueError is '!', OverflowError is '@', TypeError is '#'" + print + print "scalar + scalar" + print_coercion_table(np.typecodes['All'], 0, 0, False) + print + print "scalar + neg scalar" + print_coercion_table(np.typecodes['All'], 0, -1, False) + print + print "array + scalar" + print_coercion_table(np.typecodes['All'], 0, 0, True) + print + print "array + neg scalar" + print_coercion_table(np.typecodes['All'], 0, -1, True) + print + print "promote_types" + print_coercion_table(np.typecodes['All'], 0, 0, False, True) Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From lpc at cmu.edu Mon Aug 15 22:42:43 2011 From: lpc at cmu.edu (Luis Pedro Coelho) Date: Mon, 15 Aug 2011 22:42:43 -0400 Subject: [Numpy-discussion] As any array, really any array Message-ID: <201108152242.48346.lpc@cmu.edu> Hello all, I often find myself writing the following code: try: features = np.asanyarray(features) except: features = np.asanyarray(features, dtype=object) I basically want to be able to use fany indexing on features and, in most cases, it will be a numpy floating point array. Otherwise, default to having it be an array of dtype=object. Is there a more elegant way to do it with numpy? Thank you, -- Luis Pedro Coelho | Carnegie Mellon University | http://luispedro.org -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 198 bytes Desc: This is a digitally signed message part. URL: From wardefar at iro.umontreal.ca Tue Aug 16 05:53:59 2011 From: wardefar at iro.umontreal.ca (David Warde-Farley) Date: Tue, 16 Aug 2011 05:53:59 -0400 Subject: [Numpy-discussion] inverting and calculating eigenvalues for many small matrices In-Reply-To: References: <4E1BFCE6.5030702@astro.uio.no> Message-ID: <97108383-605D-472E-B97C-75D2150B88D8@iro.umontreal.ca> On 2011-08-15, at 4:11 PM, Daniel Wheeler wrote: > One thing that I know I'm doing wrong is > reassigning every sub-matrix to a new array. This may not be that > costly, but it seems fairly ugly. I wasn't sure how to pass the > address of the submatrix to the lapack routines so I'm assigning to a > new array and passing that instead. It looks like the arrays you're passing are C contiguous. Am I right about this? (I ask because I was under the impression that BLAS/LAPACK routines typically want Fortran-ordered input arrays). If your 3D array is also C-contiguous, you should be able to do pointer arithmetic with A.data and B.data. 
foo.strides[0] will tell you how many bytes you need to move to get to the next element along that axis. If the 3D array is anything but C contiguous, then I believe the copy is necessary. You should check for that in your Python-visible "solve" wrapper, and make a copy of it that is C contiguous if necessary (check foo.flags.c_contiguous), as this will be likely faster than copying to the same buffer each time in the loop. David From J.Lee at bom.gov.au Tue Aug 16 07:32:34 2011 From: J.Lee at bom.gov.au (Jin Lee) Date: Tue, 16 Aug 2011 21:32:34 +1000 Subject: [Numpy-discussion] f2py - undefined symbol: _intel_fast_memset [SEC=UNCLASSIFIED] In-Reply-To: <0E3686EB9FA8AA409AFA0A25468DCE43017138F5542C@BOM-VMBX-HO.bom.gov.au> References: <0E3686EB9FA8AA409AFA0A25468DCE43017138F5542C@BOM-VMBX-HO.bom.gov.au> Message-ID: <0E3686EB9FA8AA409AFA0A25468DCE43017138F5542E@BOM-VMBX-HO.bom.gov.au> Hello, This is my very first attempt at using f2py but I have come across a problem. If anyone can assist me I would appreciate it very much. I have a very simple test Fortran source, sub.f90 which is: subroutine sub1(x,y) implicit none integer, intent(in) :: x integer, intent(out) :: y ! start y = x end subroutine sub1 I then used f2py to produce an object file, sub.so: f2py -c -m sub sub.f90 --fcompiler='gfortran' After starting a Python interactive session I tried to import the Fortran-derived Python module but I get an error message: >>> import sub Traceback (most recent call last): File "", line 1, in ImportError: ./sub.so: undefined symbol: _intel_fast_memset Can anyone suggest what this error message means and how I can overcome it, please? Regards, Jin From pearu.peterson at gmail.com Tue Aug 16 07:45:20 2011 From: pearu.peterson at gmail.com (Pearu Peterson) Date: Tue, 16 Aug 2011 14:45:20 +0300 Subject: [Numpy-discussion] f2py - undefined symbol: _intel_fast_memset [SEC=UNCLASSIFIED] In-Reply-To: <0E3686EB9FA8AA409AFA0A25468DCE43017138F5542E@BOM-VMBX-HO.bom.gov.au> References: <0E3686EB9FA8AA409AFA0A25468DCE43017138F5542C@BOM-VMBX-HO.bom.gov.au> <0E3686EB9FA8AA409AFA0A25468DCE43017138F5542E@BOM-VMBX-HO.bom.gov.au> Message-ID: <4E4A5850.3020901@cens.ioc.ee> On 08/16/2011 02:32 PM, Jin Lee wrote: > Hello, > > This is my very first attempt at using f2py but I have come across a problem. If anyone can assist me I would appreciate it very much. > > I have a very simple test Fortran source, sub.f90 which is: > > subroutine sub1(x,y) > implicit none > > integer, intent(in) :: x > integer, intent(out) :: y > > ! start > y = x > > end subroutine sub1 > > > I then used f2py to produce an object file, sub.so: > > f2py -c -m sub sub.f90 --fcompiler='gfortran' > > After starting a Python interactive session I tried to import the Fortran-derived Python module but I get an error message: > >>>> import sub > Traceback (most recent call last): > File "", line 1, in > ImportError: ./sub.so: undefined symbol: _intel_fast_memset > > > Can anyone suggest what this error message means and how I can overcome it, please? Try f2py -c -m sub sub.f90 --fcompiler=gnu95 HTH, Pearu From pav at iki.fi Tue Aug 16 09:01:56 2011 From: pav at iki.fi (Pauli Virtanen) Date: Tue, 16 Aug 2011 13:01:56 +0000 (UTC) Subject: [Numpy-discussion] [SciPy-User] disabling SVN (was: Trouble installing scipy after upgrading to Mac OS X 10.7 aka Lion) References: Message-ID: Sat, 13 Aug 2011 22:00:33 -0400, josef.pktd wrote: [clip] > Does Trac require svn access to dig out old information? for example > links to old changesets, annotate/blame, ... ? 
It does not require HTTP access to SVN, as it looks directly at the SVN repo on the local disk. It also probably doesn't use the old SVN repo for anything in reality, as there's a simple Git plugin installed that just grabs the Git history to the timeline, and redirects source browsing etc to Github. However, I don't know whether the timeline views etc continue to function even without the local SVN repo, so I'd just disable the HTTP access and leave the local repo as it is as a backup. Pauli From tkluck at infty.nl Tue Aug 16 10:22:40 2011 From: tkluck at infty.nl (Timo Kluck) Date: Tue, 16 Aug 2011 16:22:40 +0200 Subject: [Numpy-discussion] numpy.interp running time In-Reply-To: References: <4E3452F1.7010607@hawaii.edu> Message-ID: 2011/8/1 Timo Kluck : > I just submitted a patch at > http://projects.scipy.org/numpy/ticket/1920 . It implements Eric's > suggestion. Please review, I'll be happy to adapt it to any of your > feedback. > I submitted a minor patch a while ago. It hasn't been reviewed yet, but I don't know whether that's just because the reviewers just haven't had time yet, or whether some extra action is required on my part. Perhaps the ticket should be 'tagged' for review, or similar? Let me know if there's anything more that I should do. Timo From daniel.wheeler2 at gmail.com Tue Aug 16 10:28:20 2011 From: daniel.wheeler2 at gmail.com (Daniel Wheeler) Date: Tue, 16 Aug 2011 10:28:20 -0400 Subject: [Numpy-discussion] inverting and calculating eigenvalues for many small matrices In-Reply-To: <97108383-605D-472E-B97C-75D2150B88D8@iro.umontreal.ca> References: <4E1BFCE6.5030702@astro.uio.no> <97108383-605D-472E-B97C-75D2150B88D8@iro.umontreal.ca> Message-ID: On Tue, Aug 16, 2011 at 5:53 AM, David Warde-Farley wrote: > On 2011-08-15, at 4:11 PM, Daniel Wheeler wrote: > >> One thing that I know I'm doing wrong is >> reassigning every sub-matrix to a new array. This may not be that >> costly, but it seems fairly ugly. I wasn't sure how to pass the >> address of the submatrix to the lapack routines so I'm assigning to a >> new array and passing that instead. > > It looks like the arrays you're passing are C contiguous. Am I right about this? (I ask because I was under the impression that BLAS/LAPACK routines typically want Fortran-ordered input arrays). Are you saying that fortran ordered arrays should be passed? The tests pass when compared against doing numpy equivalents so I don't believe that its currently broken (maybe suboptimal). There is a transpose and copy here and here . I believe that reorders correctly. Maybe I should cast the arrays to explicit fortran ordering rather than doing that (not sure how)? However, the transpose and copy doesn't seem to be expensive compared with the actual lapack routines. > If your 3D array is also C-contiguous, you should be able to do pointer arithmetic with A.data and B.data. foo.strides[0] will tell you how many bytes you need to move to get to the next element along that axis. Sounds complicated, but I'll try and figure it out. Thanks for the idea. > If the 3D array is anything but C contiguous, then I believe the copy is necessary. You should check for that in your Python-visible "solve" wrapper, and make a copy of it that is C contiguous if necessary (check foo.flags.c_contiguous), as this will be likely faster than copying to the same buffer each time in the loop. The copy is required after the transpose (which is required for fortran ordering). I'll look into the pointer arithmetic stuff and see if that helps any. Thanks. 
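As an aside on the "explicit fortran ordering" question above, numpy.asfortranarray does exactly that and only copies when the input is not already Fortran-ordered; a small sketch with a made-up array:

import numpy as np

a = np.arange(12, dtype=np.float64).reshape(3, 4)   # C-ordered

f = np.asfortranarray(a)        # same effect as np.asarray(a, order='F')
print(a.flags.f_contiguous)     # False
print(f.flags.f_contiguous)     # True
# The memory layout matches a.T.copy().T, but the intent is stated directly,
# and no copy is made if 'a' is already Fortran-contiguous.
np.testing.assert_array_equal(f, a)
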
-- Daniel Wheeler From wesmckinn at gmail.com Tue Aug 16 11:02:06 2011 From: wesmckinn at gmail.com (Wes McKinney) Date: Tue, 16 Aug 2011 11:02:06 -0400 Subject: [Numpy-discussion] Questionable reduceat behavior In-Reply-To: References: Message-ID: On Sun, Aug 14, 2011 at 11:58 AM, Wes McKinney wrote: > On Sat, Aug 13, 2011 at 8:06 PM, Mark Wiebe wrote: >> Looks like this is the second-oldest open bug in the bug tracker. >> http://projects.scipy.org/numpy/ticket/236 >> For what it's worth, I'm in favour of changing this behavior to be more >> consistent as proposed in that ticket. >> -Mark >> >> On Thu, Aug 11, 2011 at 11:25 AM, Wes McKinney wrote: >>> >>> I'm a little perplexed why reduceat was made to behave like this: >>> >>> In [26]: arr = np.ones((10, 4), dtype=bool) >>> >>> In [27]: arr >>> Out[27]: >>> array([[ True, ?True, ?True, ?True], >>> ? ? ? [ True, ?True, ?True, ?True], >>> ? ? ? [ True, ?True, ?True, ?True], >>> ? ? ? [ True, ?True, ?True, ?True], >>> ? ? ? [ True, ?True, ?True, ?True], >>> ? ? ? [ True, ?True, ?True, ?True], >>> ? ? ? [ True, ?True, ?True, ?True], >>> ? ? ? [ True, ?True, ?True, ?True], >>> ? ? ? [ True, ?True, ?True, ?True], >>> ? ? ? [ True, ?True, ?True, ?True]], dtype=bool) >>> >>> >>> In [30]: np.add.reduceat(arr, [0, 3, 3, 7, 9], axis=0) >>> Out[30]: >>> array([[3, 3, 3, 3], >>> ? ? ? [1, 1, 1, 1], >>> ? ? ? [4, 4, 4, 4], >>> ? ? ? [2, 2, 2, 2], >>> ? ? ? [1, 1, 1, 1]]) >>> >>> this does not seem intuitively correct. Since we have: >>> >>> In [33]: arr[3:3].sum(0) >>> Out[33]: array([0, 0, 0, 0]) >>> >>> I would expect >>> >>> array([[3, 3, 3, 3], >>> ? ? ? [0, 0, 0, 0], >>> ? ? ? [4, 4, 4, 4], >>> ? ? ? [2, 2, 2, 2], >>> ? ? ? [1, 1, 1, 1]]) >>> >>> Obviously I can RTFM and see why it does this ("if ``indices[i] >= >>> indices[i + 1]``, the i-th generalized "row" is simply >>> ``a[indices[i]]``"), but it doesn't make much sense to me, and I need >>> work around it. Suggestions? >>> _______________________________________________ >>> NumPy-Discussion mailing list >>> NumPy-Discussion at scipy.org >>> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> > > Well, I certainly hope it doesn't get forgotten about for another 5 > years. I think having more consistent behavior would be better rather > than conforming to a seemingly arbitrary decision made ages ago in > Numeric. > > - Wes > just a manual hack for now where I needed it... https://github.com/wesm/pandas/blob/master/pandas/core/frame.py#L2155 From jgomezdans at gmail.com Tue Aug 16 12:50:25 2011 From: jgomezdans at gmail.com (Jose Gomez-Dans) Date: Tue, 16 Aug 2011 17:50:25 +0100 Subject: [Numpy-discussion] [f2py] How to specify compile options in setup.py Message-ID: Hi, Up to now, I have managed to build Fortran extensions with f2py by ussing the following command: $ python setup.py config_fc --fcompiler=gnu95 --f77flags='-fmy_flags' --f90flags='-fmy_flags' build I think that these options should be able to go in a setup.py file, and use the f2py_options file. One way of doing this is to extend sys.argv with the required command line options: import sys sys.argv.extend ( ['config_fc', '--fcompiler=gnu95', '--f77flags="-fmy_flags"', "--f90flags='-fmy_flags"] ) This works well if all the extensions require the same flags. 
In my case, however, One of the extensions requires a different set of flags (in particular, it requires that flag -fdefault-real-8 isn't set, which is required by the extensions). I tried setting the f2py_options in the add_extension method call: config.add_extension( 'my_extension', sources = my_sources, f2py_options=['f77flags="-ffixed-line-length-0" -fdefault-real-8', 'f90flags="-fdefault-real-8"'] ) This compiles the extensions (using the two dashes in front of the f2py option eg --f77flags results in an unrecognised option), but the f2p_options goes unheeded. Here's the relevant bit of the output from python setup.py build: compiling Fortran sources Fortran f77 compiler: /usr/bin/gfortran -ffixed-line-length-0 -fPIC -O3 -march=native Fortran f90 compiler: /usr/bin/gfortran -ffixed-line-length-0 -fPIC -O3 -march=native Fortran fix compiler: /usr/bin/gfortran -Wall -ffixed-form -fno-second-underscore -ffixed-line-length-0 -fPIC -O3 -march=native compile options: '-Ibuild/src.linux-i686-2.7 -I/usr/lib/pymodules/python2.7/numpy/core/include -I/usr/include/python2.7 -c' extra options: '-Jbuild/temp.linux-i686-2.7/my_dir -Ibuild/temp.linux-i686-2.7/my_dir' How can I disable (or enable) one option for compiling one particular extension? Thanks! Jose -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Tue Aug 16 13:05:33 2011 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 16 Aug 2011 11:05:33 -0600 Subject: [Numpy-discussion] Segfault for np.lookfor In-Reply-To: References: Message-ID: On Mon, Aug 15, 2011 at 7:43 PM, Charles R Harris wrote: > > > On Mon, Aug 15, 2011 at 7:09 PM, Charles R Harris < > charlesr.harris at gmail.com> wrote: > >> >> >> On Mon, Aug 15, 2011 at 6:56 PM, Charles R Harris < >> charlesr.harris at gmail.com> wrote: >> >>> >>> >>> On Mon, Aug 15, 2011 at 3:53 PM, Matthew Brett wrote: >>> >>>> Hi, >>>> >>>> On current trunk, all tests pass but running the (forgive my language) >>>> doctests, I found this: >>>> >>>> In [1]: import numpy as np >>>> >>>> In [2]: np.__version__ >>>> Out[2]: '2.0.0.dev-730b861' >>>> >>>> In [3]: np.lookfor('cos') >>>> Segmentation fault >>>> >>>> on: >>>> >>>> Linux angela 2.6.38-10-generic #46-Ubuntu SMP Tue Jun 28 15:07:17 UTC >>>> 2011 x86_64 x86_64 x86_64 GNU/Linux >>>> Ubuntu Natty Python 2.7.1+ >>>> >>>> >>> The problem is somewhere in print_coercion_tables.py >>> >>> >> Or more precisely, it triggered by importing print_coercion_tables.py. I >> don't think lookfor should be doing that, but in any case: >> >> array + scalar >> + ? b h i l q p B H I L Q P e f d g F D G S U V O M m >> ? ? b h i l l l B H I L L L e f d g F D G O O # O ! m >> b b b b b b b b b b b b b b e f d g F D G O O # O ! m >> h h h h h h h h h h h h h h f f d g F D G O O # O ! m >> i i i i i i i i i i i i i i d d d g D D G O O # O ! m >> l l l l l l l l l l l l l l d d d g D D G O O # O ! m >> q l l l l l l l l l l l l l d d d g D D G O O # O ! m >> p l l l l l l l l l l l l l d d d g D D G O O # O ! m >> B B B B B B B B B B B B B B e f d g F D G O O # O ! m >> H H H H H H H H H H H H H H f f d g F D G O O # O ! m >> I I I I I I I I I I I I I I d d d g D D G O O # O ! m >> L L L L L L L L L L L L L L d d d g D D G O O # O ! m >> Q L L L L L L L L L L L L L d d d g D D G O O # O ! m >> P L L L L L L L L L L L L L d d d g D D G O O # O ! m >> e e e e e e e e e e e e e e e e e e F F F O O # O ! # >> f f f f f f f f f f f f f f f f f f F F F O O # O ! 
# >> d d d d d d d d d d d d d d d d d d D D D O O # O ! # >> g g g g g g g g g g g g g g g g g g G G G O O # O ! # >> F F F F F F F F F F F F F F F F F F F F F O O # O ! # >> D D D D D D D D D D D D D D D D D D D D D O O # O ! # >> G G G G G G G G G G G G G G G G G G G G G O O # O ! # >> S O O O O O O O O O O O O O O O O O O O O O O # O ! O >> U O O O O O O O O O O O O O O O O O O O O O O # O ! O >> Segmentation fault (core dumped) >> > > A quick fix is to put the print statements in a function. > > diff --git a/numpy/testing/print_coercion_tables.py > b/numpy/testing/print_coercion_tables.p > index d875449..3bc9253 100755 > --- a/numpy/testing/print_coercion_tables.py > +++ b/numpy/testing/print_coercion_tables.py > @@ -65,22 +65,23 @@ def print_coercion_table(ntypes, inputfirstvalue, > inputsecondvalue, fir > print char, > print > > -print "can cast" > -print_cancast_table(np.typecodes['All']) > -print > -print "In these tables, ValueError is '!', OverflowError is '@', TypeError > is '#'" > -print > -print "scalar + scalar" > -print_coercion_table(np.typecodes['All'], 0, 0, False) > -print > -print "scalar + neg scalar" > -print_coercion_table(np.typecodes['All'], 0, -1, False) > -print > -print "array + scalar" > -print_coercion_table(np.typecodes['All'], 0, 0, True) > -print > -print "array + neg scalar" > -print_coercion_table(np.typecodes['All'], 0, -1, True) > -print > -print "promote_types" > -print_coercion_table(np.typecodes['All'], 0, 0, False, True) > +def printem(): > + print "can cast" > + print_cancast_table(np.typecodes['All']) > + print > + print "In these tables, ValueError is '!', OverflowError is '@', > TypeError is '#'" > + print > + print "scalar + scalar" > + print_coercion_table(np.typecodes['All'], 0, 0, False) > + print > + print "scalar + neg scalar" > + print_coercion_table(np.typecodes['All'], 0, -1, False) > + print > + print "array + scalar" > + print_coercion_table(np.typecodes['All'], 0, 0, True) > + print > + print "array + neg scalar" > + print_coercion_table(np.typecodes['All'], 0, -1, True) > + print > + print "promote_types" > + print_coercion_table(np.typecodes['All'], 0, 0, False, True) > > I opened ticket #1937 for this Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From efiring at hawaii.edu Tue Aug 16 13:14:35 2011 From: efiring at hawaii.edu (Eric Firing) Date: Tue, 16 Aug 2011 07:14:35 -1000 Subject: [Numpy-discussion] numpy.interp running time In-Reply-To: References: <4E3452F1.7010607@hawaii.edu> Message-ID: <4E4AA57B.8070506@hawaii.edu> On 08/16/2011 04:22 AM, Timo Kluck wrote: > 2011/8/1 Timo Kluck: >> I just submitted a patch at >> http://projects.scipy.org/numpy/ticket/1920 . It implements Eric's >> suggestion. Please review, I'll be happy to adapt it to any of your >> feedback. >> > I submitted a minor patch a while ago. It hasn't been reviewed yet, > but I don't know whether that's just because the reviewers just > haven't had time yet, or whether some extra action is required on my > part. Perhaps the ticket should be 'tagged' for review, or similar? > Let me know if there's anything more that I should do. > > Timo Timo, I suspect the one thing that would improve the likelihood of review would be if you were to supply the patch via a github pull request. In addition, posting a timing test (code and results) might help. 
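For reference, a timing test of the kind suggested could be as small as the sketch below (the problem sizes are arbitrary, chosen only for illustration):

import timeit

setup = """
import numpy as np
xp = np.linspace(0.0, 1.0, 10000)
fp = np.sin(xp)
x = np.random.rand(1000)
"""
times = timeit.repeat("np.interp(x, xp, fp)", setup=setup,
                      repeat=3, number=1000)
print("np.interp, 1000 points into 10000 knots: %.3f ms per call"
      % (min(times) / 1000 * 1e3))
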
Eric From matthew.brett at gmail.com Tue Aug 16 15:15:22 2011 From: matthew.brett at gmail.com (Matthew Brett) Date: Tue, 16 Aug 2011 12:15:22 -0700 Subject: [Numpy-discussion] Segfault for np.lookfor In-Reply-To: References: Message-ID: Hi, On Tue, Aug 16, 2011 at 10:05 AM, Charles R Harris wrote: > > > On Mon, Aug 15, 2011 at 7:43 PM, Charles R Harris > wrote: >> >> >> On Mon, Aug 15, 2011 at 7:09 PM, Charles R Harris >> wrote: >>> >>> >>> On Mon, Aug 15, 2011 at 6:56 PM, Charles R Harris >>> wrote: >>>> >>>> >>>> On Mon, Aug 15, 2011 at 3:53 PM, Matthew Brett >>>> wrote: >>>>> >>>>> Hi, >>>>> >>>>> On current trunk, all tests pass but running the (forgive my language) >>>>> doctests, I found this: >>>>> >>>>> In [1]: import numpy as np >>>>> >>>>> In [2]: np.__version__ >>>>> Out[2]: '2.0.0.dev-730b861' >>>>> >>>>> In [3]: np.lookfor('cos') >>>>> Segmentation fault >>>>> >>>>> on: >>>>> >>>>> Linux angela 2.6.38-10-generic #46-Ubuntu SMP Tue Jun 28 15:07:17 UTC >>>>> 2011 x86_64 x86_64 x86_64 GNU/Linux >>>>> Ubuntu Natty Python 2.7.1+ >>>>> >>>> >>>> The problem is somewhere in print_coercion_tables.py >>>> >>> >>> Or more precisely, it triggered by importing? print_coercion_tables.py. I >>> don't think lookfor should be doing that, but in any case: >>> >>> array + scalar >>> + ? b h i l q p B H I L Q P e f d g F D G S U V O M m >>> ? ? b h i l l l B H I L L L e f d g F D G O O # O ! m >>> b b b b b b b b b b b b b b e f d g F D G O O # O ! m >>> h h h h h h h h h h h h h h f f d g F D G O O # O ! m >>> i i i i i i i i i i i i i i d d d g D D G O O # O ! m >>> l l l l l l l l l l l l l l d d d g D D G O O # O ! m >>> q l l l l l l l l l l l l l d d d g D D G O O # O ! m >>> p l l l l l l l l l l l l l d d d g D D G O O # O ! m >>> B B B B B B B B B B B B B B e f d g F D G O O # O ! m >>> H H H H H H H H H H H H H H f f d g F D G O O # O ! m >>> I I I I I I I I I I I I I I d d d g D D G O O # O ! m >>> L L L L L L L L L L L L L L d d d g D D G O O # O ! m >>> Q L L L L L L L L L L L L L d d d g D D G O O # O ! m >>> P L L L L L L L L L L L L L d d d g D D G O O # O ! m >>> e e e e e e e e e e e e e e e e e e F F F O O # O ! # >>> f f f f f f f f f f f f f f f f f f F F F O O # O ! # >>> d d d d d d d d d d d d d d d d d d D D D O O # O ! # >>> g g g g g g g g g g g g g g g g g g G G G O O # O ! # >>> F F F F F F F F F F F F F F F F F F F F F O O # O ! # >>> D D D D D D D D D D D D D D D D D D D D D O O # O ! # >>> G G G G G G G G G G G G G G G G G G G G G O O # O ! # >>> S O O O O O O O O O O O O O O O O O O O O O O # O ! O >>> U O O O O O O O O O O O O O O O O O O O O O O # O ! O >>> Segmentation fault (core dumped) >> >> A quick fix is to put the print statements in a function. >> >> diff --git a/numpy/testing/print_coercion_tables.py >> b/numpy/testing/print_coercion_tables.p >> index d875449..3bc9253 100755 >> --- a/numpy/testing/print_coercion_tables.py >> +++ b/numpy/testing/print_coercion_tables.py >> @@ -65,22 +65,23 @@ def print_coercion_table(ntypes, inputfirstvalue, >> inputsecondvalue, fir >> ???????????? print char, >> ???????? 
print >> >> -print "can cast" >> -print_cancast_table(np.typecodes['All']) >> -print >> -print "In these tables, ValueError is '!', OverflowError is '@', >> TypeError is '#'" >> -print >> -print "scalar + scalar" >> -print_coercion_table(np.typecodes['All'], 0, 0, False) >> -print >> -print "scalar + neg scalar" >> -print_coercion_table(np.typecodes['All'], 0, -1, False) >> -print >> -print "array + scalar" >> -print_coercion_table(np.typecodes['All'], 0, 0, True) >> -print >> -print "array + neg scalar" >> -print_coercion_table(np.typecodes['All'], 0, -1, True) >> -print >> -print "promote_types" >> -print_coercion_table(np.typecodes['All'], 0, 0, False, True) >> +def printem(): >> +??? print "can cast" >> +??? print_cancast_table(np.typecodes['All']) >> +??? print >> +??? print "In these tables, ValueError is '!', OverflowError is '@', >> TypeError is '#'" >> +??? print >> +??? print "scalar + scalar" >> +??? print_coercion_table(np.typecodes['All'], 0, 0, False) >> +??? print >> +??? print "scalar + neg scalar" >> +??? print_coercion_table(np.typecodes['All'], 0, -1, False) >> +??? print >> +??? print "array + scalar" >> +??? print_coercion_table(np.typecodes['All'], 0, 0, True) >> +??? print >> +??? print "array + neg scalar" >> +??? print_coercion_table(np.typecodes['All'], 0, -1, True) >> +??? print >> +??? print "promote_types" >> +??? print_coercion_table(np.typecodes['All'], 0, 0, False, True) >> > > I opened ticket #1937 for this >From git-bisect it looks like the culprit is: feb8079070b8a659d7eee1b4acbddf470fd8a81d is the first bad commit commit feb8079070b8a659d7eee1b4acbddf470fd8a81d Author: Ben Walsh Date: Sun Jul 10 12:52:52 2011 +0100 BUT: Stop _array_find_type trying to make every list element a subtype of bool. Just to remind me, my procedure was: <~/tmp/testfor.py> #!/usr/bin/env python import sys from functools import partial from subprocess import check_call, Popen, PIPE, CalledProcessError caller = partial(check_call, shell=True) popener = partial(Popen, stdout=PIPE, stderr=PIPE, shell=True) try: caller('git clean -fxd') caller('python setup.py build_ext -i') except CalledProcessError: sys.exit(125) # untestable proc = popener('python -c "%s"' % """import sys import numpy as np np.lookfor('cos', output=sys.stdout) """) stdout, stderr = proc.communicate() if 'Segmentation fault' in stderr: sys.exit(1) # bad sys.exit(0) # good Then, I established the v1.6.1 did not have the segfault, and (man git-bisect): git co main-master # current upstream master git bisect start HEAD v1.6.1 -- git bisect run ~/tmp/testfor.py See y'all, Matthew From pearu.peterson at gmail.com Tue Aug 16 16:51:24 2011 From: pearu.peterson at gmail.com (Pearu Peterson) Date: Tue, 16 Aug 2011 23:51:24 +0300 Subject: [Numpy-discussion] [f2py] How to specify compile options in setup.py In-Reply-To: References: Message-ID: , On Tue, Aug 16, 2011 at 7:50 PM, Jose Gomez-Dans wrote: > Hi, > > Up to now, I have managed to build Fortran extensions with f2py by ussing > the following command: > $ python setup.py config_fc --fcompiler=gnu95 > --f77flags='-fmy_flags' --f90flags='-fmy_flags' build > > I think that these options should be able to go in a setup.py file, and use > the f2py_options file. One way of doing this is to extend sys.argv with the > required command line options: > import sys > sys.argv.extend ( ['config_fc', '--fcompiler=gnu95', > '--f77flags="-fmy_flags"', "--f90flags='-fmy_flags"] ) > > This works well if all the extensions require the same flags. 
In my case, > however, One of the extensions requires a different set of flags (in > particular, it requires that flag -fdefault-real-8 isn't set, which is > required by the extensions). I tried setting the f2py_options in the > add_extension method call: > > config.add_extension( 'my_extension', sources = my_sources, > f2py_options=['f77flags="-ffixed-line-length-0" -fdefault-real-8', > 'f90flags="-fdefault-real-8"'] ) > > This compiles the extensions (using the two dashes in front of the f2py > option eg --f77flags results in an unrecognised option), but the f2p_options > goes unheeded. Here's the relevant bit of the output from python setup.py > build: > > compiling Fortran sources > Fortran f77 compiler: /usr/bin/gfortran -ffixed-line-length-0 -fPIC -O3 > -march=native > Fortran f90 compiler: /usr/bin/gfortran -ffixed-line-length-0 -fPIC -O3 > -march=native > Fortran fix compiler: /usr/bin/gfortran -Wall -ffixed-form > -fno-second-underscore -ffixed-line-length-0 -fPIC -O3 -march=native > compile options: '-Ibuild/src.linux-i686-2.7 > -I/usr/lib/pymodules/python2.7/numpy/core/include -I/usr/include/python2.7 > -c' > extra options: '-Jbuild/temp.linux-i686-2.7/my_dir > -Ibuild/temp.linux-i686-2.7/my_dir' > > How can I disable (or enable) one option for compiling one particular > extension? > > You cannot do it unless you update numpy from git repo. I just implemented the support for extra_f77_compile_args and extra_f90_compile_args options that can be used in config.add_extension as well as in config.add_library. See https://github.com/numpy/numpy/commit/43862759 So, with recent numpy the following will work config.add_extension( 'my_extension', sources = my_sources, extra_f77_compile_args = ["-ffixed-line-length-0", "-fdefault-real-8"], extra_f90_compile_args = ["-fdefault-real-8"], ) HTH, Pearu -------------- next part -------------- An HTML attachment was scrubbed... URL: From hongchunjin at gmail.com Tue Aug 16 17:19:07 2011 From: hongchunjin at gmail.com (Hongchun Jin) Date: Tue, 16 Aug 2011 16:19:07 -0500 Subject: [Numpy-discussion] Trim a numpy array in numpy. Message-ID: *Hi there, * * * *I have a question regarding how to trim a string array in numpy. * * * *>>> import numpy as np* *>>> x = np.array(['aaa.hdf', 'bbb.hdf', 'ccc.hdf', 'ddd.hdf'])* * * *I expect to trim a certain part of each element in the array, for example '.hdf', giving me ['aaa', 'bbb', 'ccc', 'ddd']. Of course, I can do a loop thing. However, in my actual dataset, I have more than one million elements in such an array. So I am wondering is there a faster and better way to do it, like STRMID function in IDL? I try to google it, but it turns out that I can not find any discussion about it. Thanks. * * Hongchun* -------------- next part -------------- An HTML attachment was scrubbed... URL: From derek at astro.physik.uni-goettingen.de Tue Aug 16 17:39:26 2011 From: derek at astro.physik.uni-goettingen.de (Derek Homeier) Date: Tue, 16 Aug 2011 23:39:26 +0200 Subject: [Numpy-discussion] Trim a numpy array in numpy. In-Reply-To: References: Message-ID: Hi Hongchun, On 16 Aug 2011, at 23:19, Hongchun Jin wrote: > I have a question regarding how to trim a string array in numpy. > > >>> import numpy as np > >>> x = np.array(['aaa.hdf', 'bbb.hdf', 'ccc.hdf', 'ddd.hdf']) > > I expect to trim a certain part of each element in the array, for example '.hdf', giving me ['aaa', 'bbb', 'ccc', 'ddd']. Of course, I can do a loop thing. However, in my actual dataset, I have more than one million elements in such an array. 
So I am wondering is there a faster and better way to do it, like STRMID function in IDL? I try to google it, but it turns out that I can not find any discussion about it. Thanks. > For a case like above, if you really have all constant length strings and want to truncate to a fixed length, you could simply do x.astype('|S3') For more complex cases like trimming regex patterns I can't think of a numpy solution right now, coding the loop in cython might be a better bet there... Cheers, Derek From hongchunjin at gmail.com Tue Aug 16 17:51:49 2011 From: hongchunjin at gmail.com (Hongchun Jin) Date: Tue, 16 Aug 2011 16:51:49 -0500 Subject: [Numpy-discussion] Trim a numpy array in numpy. In-Reply-To: References: Message-ID: *Thanks Derek for the quick reply. But **I am sorry, I did not make it clear in my last email. Assume I have an array like * * ['CAL_LID_L2_05kmCLay-Prov-V3-01.2008-01-01T00-37-48ZD.hdf' 'CAL_LID_L2_05kmCLay-Prov-V3-01.2008-01-01T00-37-48ZD.hdf' 'CAL_LID_L2_05kmCLay-Prov-V3-01.2008-01-01T00-37-48ZD.hdf' ..., 'CAL_LID_L2_05kmCLay-Prov-V3-01.2008-01-31T23-56-35ZD.hdf' 'CAL_LID_L2_05kmCLay-Prov-V3-01.2008-01-31T23-56-35ZD.hdf' 'CAL_LID_L2_05kmCLay-Prov-V3-01.2008-01-31T23-56-35ZD.hdf'] I need to get the sub-string for date and time, for example, ** '2008-01-31T23-56-35ZD' in the middle of each element. In more general cases, the sub-string could be any part of the string in such an array. I hope to assign the start and stop of the sub-string when I am subsetting it. * *Best, Hongchun * On Tue, Aug 16, 2011 at 4:39 PM, Derek Homeier < derek at astro.physik.uni-goettingen.de> wrote: > x.astype('|S3') -------------- next part -------------- An HTML attachment was scrubbed... URL: From derek at astro.physik.uni-goettingen.de Tue Aug 16 18:43:50 2011 From: derek at astro.physik.uni-goettingen.de (Derek Homeier) Date: Wed, 17 Aug 2011 00:43:50 +0200 Subject: [Numpy-discussion] Trim a numpy array in numpy. In-Reply-To: References: Message-ID: On 16 Aug 2011, at 23:51, Hongchun Jin wrote: > Thanks Derek for the quick reply. But I am sorry, I did not make it clear in my last email. Assume I have an array like > ['CAL_LID_L2_05kmCLay-Prov-V3-01.2008-01-01T00-37-48ZD.hdf' > > 'CAL_LID_L2_05kmCLay-Prov-V3-01.2008-01-01T00-37-48ZD.hdf' > > 'CAL_LID_L2_05kmCLay-Prov-V3-01.2008-01-01T00-37-48ZD.hdf' ..., > > 'CAL_LID_L2_05kmCLay-Prov-V3-01.2008-01-31T23-56-35ZD.hdf' > > 'CAL_LID_L2_05kmCLay-Prov-V3-01.2008-01-31T23-56-35ZD.hdf' > > 'CAL_LID_L2_05kmCLay-Prov-V3-01.2008-01-31T23-56-35ZD.hdf'] > > I need to get the sub-string for date and time, for example, > > '2008-01-31T23-56-35ZD' in the middle of each element. In more general cases, the sub-string could be any part of the string in such an array. I hope to assign the start and stop of the sub-string when I am subsetting it. > Well, maybe I was a bit too quick in my reply - see the documentation for np.char for some vectorized array operations that might be of use. Unfortunately, operations like 'lstrip' and 'rstrip' don't do exactly what you might them expect to, but you could use for example np.char.split(x,'.') to create an array of lists of substrings and then deal with them; something like removing the '.hdf' suffix would already require a somewhat lengthy recursion: np.char.rstrip(np.char.rstrip(np.char.rstrip(np.char.rstrip(x, 'f'), 'd'), 'h'), '.') To also remove the leading substring in your case clearly would lead to a very clumsy expression... 
It turns out however, something like the above for a similar test case with a length 100000 array takes about 3 times longer than the np.char.split() way; but even that is slower than a direct loop over string functions: In [6]: %timeit -n 10 y = np.char.split(x, '.') 10 loops, best of 3: 188 ms per loop In [7]: %timeit -n 10 y = np.char.split(x, '.'); z = np.fromiter( (l[1] for l in y), dtype='|S3', count=x.shape[0]) 10 loops, best of 3: 218 ms per loop In [8]: %timeit -n 10 z = np.fromiter( (l.split('.')[1] for l in x), dtype='|S3', count=x.shape[0]) 10 loops, best of 3: 143 ms per loop So it seems all of the vectorization in np.char is not that great after all (and the direct loop might still be acceptable for 1.e6 elements...)! Cheers, Derek From warren.weckesser at enthought.com Tue Aug 16 19:44:10 2011 From: warren.weckesser at enthought.com (Warren Weckesser) Date: Tue, 16 Aug 2011 18:44:10 -0500 Subject: [Numpy-discussion] Trim a numpy array in numpy. In-Reply-To: References: Message-ID: On Tue, Aug 16, 2011 at 4:51 PM, Hongchun Jin wrote: > *Thanks Derek for the quick reply. But **I am sorry, I did not make it > clear in my last email. Assume I have an array like * > * > > ['CAL_LID_L2_05kmCLay-Prov-V3-01.2008-01-01T00-37-48ZD.hdf' > > 'CAL_LID_L2_05kmCLay-Prov-V3-01.2008-01-01T00-37-48ZD.hdf' > > 'CAL_LID_L2_05kmCLay-Prov-V3-01.2008-01-01T00-37-48ZD.hdf' ..., > > 'CAL_LID_L2_05kmCLay-Prov-V3-01.2008-01-31T23-56-35ZD.hdf' > > 'CAL_LID_L2_05kmCLay-Prov-V3-01.2008-01-31T23-56-35ZD.hdf' > > 'CAL_LID_L2_05kmCLay-Prov-V3-01.2008-01-31T23-56-35ZD.hdf'] > > I need to get the sub-string for date and time, for example, > ** > > '2008-01-31T23-56-35ZD' in the middle of each element. In more general > cases, the sub-string could be any part of the string in such an array. I > hope to assign the start and stop of the sub-string when I am subsetting it. > > * > Here's one way: ----- import numpy as np def strslice(x, start=None, stop=None, step=None): """ Given a contiguous 1-d numpy array `x` of strings, return a new numpy array `y` of strings so that y[k] = x[k][start:stop:step]. `y` contains a copy of the data, not a view. 
""" slc = slice(start, stop, step) x2d = x.view(np.byte).reshape(-1, x.itemsize) y2d = x2d[:, slc].copy() y = y2d.view('S' + str(y2d.shape[-1])).ravel() return y if __name__ == "__main__": x = np.array(['CAL_LID_L2_05kmCLay-Prov-V3-01.2008-01-01T00-37-48ZD.hdf', 'CAL_LID_L2_05kmCLay-Prov-V3-01.2008-01-01T00-37-48ZD.hdf', 'CAL_LID_L2_05kmCLay-Prov-V3-01.2008-01-01T00-37-48ZD.hdf', 'CAL_LID_L2_05kmCLay-Prov-V3-01.2008-01-31T23-56-35ZD.hdf', 'CAL_LID_L2_05kmCLay-Prov-V3-01.2008-01-31T23-56-35ZD.hdf', 'CAL_LID_L2_05kmCLay-Prov-V3-01.2008-01-31T23-56-35ZD.hdf']) print "x:\n", x y = strslice(x, start=31, stop=52) print "y:\n", y ----- Output: ----- x: ['CAL_LID_L2_05kmCLay-Prov-V3-01.2008-01-01T00-37-48ZD.hdf' 'CAL_LID_L2_05kmCLay-Prov-V3-01.2008-01-01T00-37-48ZD.hdf' 'CAL_LID_L2_05kmCLay-Prov-V3-01.2008-01-01T00-37-48ZD.hdf' 'CAL_LID_L2_05kmCLay-Prov-V3-01.2008-01-31T23-56-35ZD.hdf' 'CAL_LID_L2_05kmCLay-Prov-V3-01.2008-01-31T23-56-35ZD.hdf' 'CAL_LID_L2_05kmCLay-Prov-V3-01.2008-01-31T23-56-35ZD.hdf'] y: ['2008-01-01T00-37-48ZD' '2008-01-01T00-37-48ZD' '2008-01-01T00-37-48ZD' '2008-01-31T23-56-35ZD' '2008-01-31T23-56-35ZD' '2008-01-31T23-56-35ZD'] ----- Warren * > > > * > *Best, > > Hongchun > * > > > On Tue, Aug 16, 2011 at 4:39 PM, Derek Homeier < > derek at astro.physik.uni-goettingen.de> wrote: > >> x.astype('|S3') > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From J.Lee at bom.gov.au Tue Aug 16 22:16:41 2011 From: J.Lee at bom.gov.au (Jin Lee) Date: Wed, 17 Aug 2011 12:16:41 +1000 Subject: [Numpy-discussion] f2py - undefined symbol: _intel_fast_memset [SEC=UNCLASSIFIED] In-Reply-To: <4E4A5850.3020901@cens.ioc.ee> Message-ID: <0E3686EB9FA8AA409AFA0A25468DCE43017138E5BD75@BOM-VMBX-HO.bom.gov.au> Hello Pearu, Thank you for your reply. It turned out that I was using Intel C/C++ compiler (icc) as my environment was set up for that compiler. I changed my compile environment to gcc and f2py worked. BTW for the '--fcompiler' switch both 'gnu95' and 'gfortran' seem to work fine. Many thanks for your prompt reply. Regards, Jin > -----Original Message----- > From: numpy-discussion-bounces at scipy.org > [mailto:numpy-discussion-bounces at scipy.org] On Behalf Of > Pearu Peterson > Sent: Tuesday, 16 August 2011 21:45 > To: Discussion of Numerical Python > Subject: Re: [Numpy-discussion] f2py - undefined symbol: > _intel_fast_memset [SEC=UNCLASSIFIED] > > > > On 08/16/2011 02:32 PM, Jin Lee wrote: > > Hello, > > > > This is my very first attempt at using f2py but I have come > across a problem. If anyone can assist me I would appreciate > it very much. > > > > I have a very simple test Fortran source, sub.f90 which is: > > > > subroutine sub1(x,y) > > implicit none > > > > integer, intent(in) :: x > > integer, intent(out) :: y > > > > ! start > > y = x > > > > end subroutine sub1 > > > > > > I then used f2py to produce an object file, sub.so: > > > > f2py -c -m sub sub.f90 --fcompiler='gfortran' > > > > After starting a Python interactive session I tried to > import the Fortran-derived Python module but I get an error message: > > > >>>> import sub > > Traceback (most recent call last): > > File "", line 1, in > > ImportError: ./sub.so: undefined symbol: _intel_fast_memset > > > > > > Can anyone suggest what this error message means and how I > can overcome it, please? 
> > Try > f2py -c -m sub sub.f90 --fcompiler=gnu95 > > HTH, > Pearu > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From jdh2358 at gmail.com Wed Aug 17 09:01:53 2011 From: jdh2358 at gmail.com (John Hunter) Date: Wed, 17 Aug 2011 08:01:53 -0500 Subject: [Numpy-discussion] segfault on complex array on solaris x86 In-Reply-To: References: Message-ID: On Wed, Apr 13, 2011 at 8:50 AM, John Hunter wrote: > On Sat, Jan 15, 2011 at 7:28 AM, Ralf Gommers > wrote: >> I've opened http://projects.scipy.org/numpy/ticket/1713 so this doesn't get >> lost. > > Just wanted to bump this -- bug still exists in numpy HEAD 2.0.0.dev-fe3852f Just wanted to mention that this segfault still exists in 2.0.0.dev-4386275 and I updated the ticket at http://projects.scipy.org/numpy/ticket/1713 with a much simpler test script. Basically:: import numpy as np xn = np.exp(2j) is causing a segfault on my solaris platform From keith.hughitt at gmail.com Wed Aug 17 10:04:10 2011 From: keith.hughitt at gmail.com (Keith Hughitt) Date: Wed, 17 Aug 2011 10:04:10 -0400 Subject: [Numpy-discussion] Best way to construct/slice 3-dimensional ndarray from multiple 2d ndarrays? Message-ID: Hi all, I have a method which builds a single 3d ndarray from several equal-dimension 2d ndarrays, and another method which extracts the original 2d ndarrays back out from the 3d one. The way I'm doing this right now is pretty simple, e.g.: cube = np.asarray([arr1, arr2,...]) ... x = cube[0] I believe the way this is currently handled, is to use new memory locations first for the 3d array, and then later for the 2d slices. Does anyone know if there is a better way to handle this? Ideally, I would like to reuse the same memory locations instead of copying it anew each time. Also, when subclassing ndarray and calling obj = data.view(cls) for an ndarray "data", does this copy the data into the new object by value or reference? The method which extracts the 2d slice actually returns a subclass of ndarray created using the extracted data, so this is why I ask. Any insight or suggestions would be appreciated. Thanks! Keith -------------- next part -------------- An HTML attachment was scrubbed... URL: From shish at keba.be Wed Aug 17 13:00:19 2011 From: shish at keba.be (Olivier Delalleau) Date: Wed, 17 Aug 2011 13:00:19 -0400 Subject: [Numpy-discussion] Best way to construct/slice 3-dimensional ndarray from multiple 2d ndarrays? In-Reply-To: References: Message-ID: Right now you allocate new memory only when creating your 3d array. When you do "x = cube[0]" this creates a view that does not allocate more memory. If your 2d arrays were created independently, I don't think you can avoid this. If you have some control on the way your original 2D arrays are created, you can first initialize the 3d array with correct shape (or an upper bound on the number of 2d arrays), then use views on this 3d array ("x_i = cube[i]") to fill your 2D arrays in the same memory space. I can't help with your second question, sorry. -=- Olivier 2011/8/17 Keith Hughitt > Hi all, > > I have a method which builds a single 3d ndarray from several > equal-dimension 2d ndarrays, and another method which extracts the original > 2d ndarrays back out from the 3d one. > > The way I'm doing this right now is pretty simple, e.g.: > > cube = np.asarray([arr1, arr2,...]) > ... 
> x = cube[0] > > I believe the way this is currently handled, is to use new memory locations > first for the 3d array, and then later for the 2d slices. > > Does anyone know if there is a better way to handle this? Ideally, I would > like to reuse the same memory locations instead of copying it anew each > time. > > Also, when subclassing ndarray and calling obj = data.view(cls) for an > ndarray "data", does this copy the data into the new object by value or > reference? The method which extracts the 2d slice actually returns a > subclass of ndarray created using the extracted data, so this is why I ask. > > Any insight or suggestions would be appreciated. > > Thanks! > Keith > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From keith.hughitt at gmail.com Wed Aug 17 13:43:55 2011 From: keith.hughitt at gmail.com (Keith Hughitt) Date: Wed, 17 Aug 2011 13:43:55 -0400 Subject: [Numpy-discussion] Best way to construct/slice 3-dimensional ndarray from multiple 2d ndarrays? In-Reply-To: References: Message-ID: The 2d arrays are read in using another library (PyFITS), so I probably won't be able to control that too much, otherwise that sounds like exactly what I need. I'm actually overriding the indexing operation so that the user gets back an ndarray subclass when they do "cube[0]": def __getitem__(self, key): """Overiding indexing operation""" if isinstance(key, int): data = np.ndarray.__getitem__(self, key) header = self._headers[key] for cls in BaseMap.__subclasses__(): if cls.is_datasource_for(header): return cls(data, header) raise UnrecognizedDataSouceError else: return np.ndarray.__getitem__(self, key) Which relates to the second part of the question I had about how the ndarray is handled when an instance of a ndarray subclass is created. Thanks for the suggestions! Keith On Wed, Aug 17, 2011 at 1:00 PM, Olivier Delalleau wrote: > Right now you allocate new memory only when creating your 3d array. When > you do "x = cube[0]" this creates a view that does not allocate more memory. > > If your 2d arrays were created independently, I don't think you can avoid > this. > If you have some control on the way your original 2D arrays are created, > you can first initialize the 3d array with correct shape (or an upper bound > on the number of 2d arrays), then use views on this 3d array ("x_i = > cube[i]") to fill your 2D arrays in the same memory space. > > I can't help with your second question, sorry. > > -=- Olivier > > 2011/8/17 Keith Hughitt > >> Hi all, >> >> I have a method which builds a single 3d ndarray from several >> equal-dimension 2d ndarrays, and another method which extracts the original >> 2d ndarrays back out from the 3d one. >> >> The way I'm doing this right now is pretty simple, e.g.: >> >> cube = np.asarray([arr1, arr2,...]) >> ... >> x = cube[0] >> >> I believe the way this is currently handled, is to use new memory >> locations first for the 3d array, and then later for the 2d slices. >> >> Does anyone know if there is a better way to handle this? Ideally, I would >> like to reuse the same memory locations instead of copying it anew each >> time. >> >> Also, when subclassing ndarray and calling obj = data.view(cls) for an >> ndarray "data", does this copy the data into the new object by value or >> reference? 
The method which extracts the 2d slice actually returns a >> subclass of ndarray created using the extracted data, so this is why I ask. >> >> Any insight or suggestions would be appreciated. >> >> Thanks! >> Keith >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From aronne.merrelli at gmail.com Wed Aug 17 13:46:12 2011 From: aronne.merrelli at gmail.com (Aronne Merrelli) Date: Wed, 17 Aug 2011 12:46:12 -0500 Subject: [Numpy-discussion] Best way to construct/slice 3-dimensional ndarray from multiple 2d ndarrays? In-Reply-To: References: Message-ID: On Wed, Aug 17, 2011 at 9:04 AM, Keith Hughitt wrote: > > Also, when subclassing ndarray and calling obj = data.view(cls) for an > ndarray "data", does this copy the data into the new object by value or > reference? The method which extracts the 2d slice actually returns a > subclass of ndarray created using the extracted data, so this is why I ask. > > > I think it should pass a reference - the following code suggests the subclass is sharing the same fundamental array object. You can use the .base attribute of the ndarray object to see if it is a view back to another ndarray object: import numpy as np class TestClass(np.ndarray): def __new__(cls, inp_array): return inp_array.view(cls) In [2]: x = np.ones(5) In [3]: obj = TestClass(x) In [4]: id(x), id(obj), id(obj.base) Out[4]: (23517648, 19708080, 23517648) In [5]: print x, obj [ 1. 1. 1. 1. 1.] [ 1. 1. 1. 1. 1.] In [6]: x[2] = 2 In [7]: print x, obj [ 1. 1. 2. 1. 1.] [ 1. 1. 2. 1. 1.] If you change the TestClass.__new__() to: "return np.array(inp_array).view(cls)" then you will make a copy of the input array instead, if that is needed. In that case, it looks like the .base attribute is a new ndarray, copied from the input array. Aronne [PS - also note that .base is set to None, if the ndarray is not a view into another ndarray; it turns out that None has a valid object number, which confused me at first - see id(None).] -------------- next part -------------- An HTML attachment was scrubbed... URL: From keith.hughitt at gmail.com Wed Aug 17 14:11:33 2011 From: keith.hughitt at gmail.com (Keith Hughitt) Date: Wed, 17 Aug 2011 14:11:33 -0400 Subject: [Numpy-discussion] Best way to construct/slice 3-dimensional ndarray from multiple 2d ndarrays? In-Reply-To: References: Message-ID: Great! It looks like it is in fact working as desired: In [4]: cube.shape Out[4]: (5, 4096, 4096) In [5]: slice = cube[0] In [6]: cube[0,1000,1000] Out[6]: 618 In [7]: slice[1000,1000] Out[7]: 618 In [8]: slice[1000,1000] = 123 In [9]: cube[0, 1000,1000] Out[9]: 123 I didn't know about the .base attribute; that is really useful. Thank you both for the feedback. Keith On Wed, Aug 17, 2011 at 1:46 PM, Aronne Merrelli wrote: > > > On Wed, Aug 17, 2011 at 9:04 AM, Keith Hughitt wrote: > >> >> Also, when subclassing ndarray and calling obj = data.view(cls) for an >> ndarray "data", does this copy the data into the new object by value or >> reference? The method which extracts the 2d slice actually returns a >> subclass of ndarray created using the extracted data, so this is why I ask. 
>> >> >> > I think it should pass a reference - the following code suggests the > subclass is sharing the same fundamental array object. You can use the .base > attribute of the ndarray object to see if it is a view back to another > ndarray object: > > import numpy as np > class TestClass(np.ndarray): > def __new__(cls, inp_array): > return inp_array.view(cls) > > In [2]: x = np.ones(5) > In [3]: obj = TestClass(x) > In [4]: id(x), id(obj), id(obj.base) > Out[4]: (23517648, 19708080, 23517648) > In [5]: print x, obj > [ 1. 1. 1. 1. 1.] [ 1. 1. 1. 1. 1.] > In [6]: x[2] = 2 > In [7]: print x, obj > [ 1. 1. 2. 1. 1.] [ 1. 1. 2. 1. 1.] > > > If you change the TestClass.__new__() to: "return > np.array(inp_array).view(cls)" then you will make a copy of the input array > instead, if that is needed. In that case, it looks like the .base attribute > is a new ndarray, copied from the input array. > > > Aronne > > [PS - also note that .base is set to None, if the ndarray is not a view > into another ndarray; it turns out that None has a valid object number, > which confused me at first - see id(None).] > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ben.root at ou.edu Wed Aug 17 14:54:06 2011 From: ben.root at ou.edu (Benjamin Root) Date: Wed, 17 Aug 2011 13:54:06 -0500 Subject: [Numpy-discussion] bug with assignment into an indexed array? In-Reply-To: References: Message-ID: On Sat, Aug 13, 2011 at 7:17 PM, Mark Wiebe wrote: > On Thu, Aug 11, 2011 at 1:37 PM, Benjamin Root wrote: > >> On Thu, Aug 11, 2011 at 10:33 AM, Olivier Delalleau wrote: >> >>> 2011/8/11 Benjamin Root >>> >>>> >>>> >>>> On Thu, Aug 11, 2011 at 8:37 AM, Olivier Delalleau wrote: >>>> >>>>> Maybe confusing, but working as expected. >>>>> >>>>> >>>>> When you write: >>>>> matched_to[np.array([0, 1, 2])] = 3 >>>>> it calls __setitem__ on matched_to, with arguments (np.array([0, 1, >>>>> 2]), 3). So numpy understand you want to write 3 at these indices. >>>>> >>>>> >>>>> When you write: >>>>> matched_to[:3][match] = 3 >>>>> it first calls __getitem__ with the slice as argument, which returns a >>>>> view of your array, then it calls __setitem__ on this view, and it fills >>>>> your matched_to array at the same time. >>>>> >>>>> >>>>> But when you write: >>>>> matched_to[np.array([0, 1, 2])][match] = 3 >>>>> it first calls __getitem__ with the array as argument, which retunrs a >>>>> *copy* of your array, so that calling __setitem__ on this copy has no effect >>>>> on your original array. >>>>> >>>>> -=- Olivier >>>>> >>>>> >>>> Right, but I guess my question is does it *have* to be that way? I >>>> guess it makes some sense with respect to indexing with a numpy array like I >>>> did with the last example, because an element could be referred to multiple >>>> times (which explains the common surprise with '+='), but with boolean >>>> indexing, we are guaranteed that each element of the view will appear at >>>> most once. Therefore, shouldn't boolean indexing always return a view, not >>>> a copy? Is the general case of arbitrary array selection inherently >>>> impossible to encode in a view versus a slice with a regular spacing? >>>> >>> >>> Yes, due to the fact the array interface only supports regular spacing >>> (otherwise it is more difficult to get efficient access to arbitrary array >>> positions). 
>>> >>> -=- Olivier >>> >>> >> This still bothers me, though. I imagine that it is next to impossible to >> detect this situation from numpy's perspective, so it can't even emit a >> warning or error. Furthermore, for someone who makes a general function to >> modify the contents of some externally provided array, there is a >> possibility that the provided array is actually a copy not a view. >> Although, I guess it is the responsibility of the user to know the >> difference. >> >> I guess that is the key problem. The key advantage we are taught about >> numpy arrays is the use of views for efficient access. It would seem that >> most access operations would use it, but in reality, only sliced access do. >> Everything else is a copy (unless you are doing fancy indexing with >> assignment). Maybe with some of the forthcoming changes that have been done >> with respect to nditer and ufuncs (in particular, I am thinking of the >> "where" kwarg), maybe we could consider an enhancement allowing fancy >> indexing (or at least boolean indexing) to produce a view? Even if it is >> less efficient than a view from slicing, it would bring better consistency >> in behavior between the different forms of indexing. >> >> Just my 2 cents, >> Ben Root >> > > I think it would be nice to evolve the NumPy indexing and array > representation towards the goal of indexing returning a view in all cases > with no exceptions. This would provide a much nicer mental model to program > with. Accomplishing such a transition will take a fair bit of time, though. > > -Mark > > Mark, It is good to know that there is a chance to make this possible, eventually. However, I just thought of a possible barrier that might have to be overcome before achieving this. Because it has always been very clear that non-slicing produces copies, I can easily imagine situations where developers have come to depend on this copying behavior. While I think most copies are unintended (but unnoticed because it was read-only), it is quite possible that there are situations where this copy behavior is entirely intended. Therefore, changing this behavior may break code in subtle ways. I am not saying that it shouldn't be done (clarity and simplicity should be paramount), but one should tread carefully here. My 2 cents, Ben Root -------------- next part -------------- An HTML attachment was scrubbed... URL: From mwwiebe at gmail.com Wed Aug 17 15:12:28 2011 From: mwwiebe at gmail.com (Mark Wiebe) Date: Wed, 17 Aug 2011 12:12:28 -0700 Subject: [Numpy-discussion] bug with assignment into an indexed array? In-Reply-To: References: Message-ID: On Wed, Aug 17, 2011 at 11:54 AM, Benjamin Root wrote: > On Sat, Aug 13, 2011 at 7:17 PM, Mark Wiebe wrote: > >> On Thu, Aug 11, 2011 at 1:37 PM, Benjamin Root wrote: >> >>> On Thu, Aug 11, 2011 at 10:33 AM, Olivier Delalleau wrote: >>> >>>> 2011/8/11 Benjamin Root >>>> >>>>> >>>>> >>>>> On Thu, Aug 11, 2011 at 8:37 AM, Olivier Delalleau wrote: >>>>> >>>>>> Maybe confusing, but working as expected. >>>>>> >>>>>> >>>>>> When you write: >>>>>> matched_to[np.array([0, 1, 2])] = 3 >>>>>> it calls __setitem__ on matched_to, with arguments (np.array([0, 1, >>>>>> 2]), 3). So numpy understand you want to write 3 at these indices. >>>>>> >>>>>> >>>>>> When you write: >>>>>> matched_to[:3][match] = 3 >>>>>> it first calls __getitem__ with the slice as argument, which returns a >>>>>> view of your array, then it calls __setitem__ on this view, and it fills >>>>>> your matched_to array at the same time. 
>>>>>> >>>>>> >>>>>> But when you write: >>>>>> matched_to[np.array([0, 1, 2])][match] = 3 >>>>>> it first calls __getitem__ with the array as argument, which retunrs a >>>>>> *copy* of your array, so that calling __setitem__ on this copy has no effect >>>>>> on your original array. >>>>>> >>>>>> -=- Olivier >>>>>> >>>>>> >>>>> Right, but I guess my question is does it *have* to be that way? I >>>>> guess it makes some sense with respect to indexing with a numpy array like I >>>>> did with the last example, because an element could be referred to multiple >>>>> times (which explains the common surprise with '+='), but with boolean >>>>> indexing, we are guaranteed that each element of the view will appear at >>>>> most once. Therefore, shouldn't boolean indexing always return a view, not >>>>> a copy? Is the general case of arbitrary array selection inherently >>>>> impossible to encode in a view versus a slice with a regular spacing? >>>>> >>>> >>>> Yes, due to the fact the array interface only supports regular spacing >>>> (otherwise it is more difficult to get efficient access to arbitrary array >>>> positions). >>>> >>>> -=- Olivier >>>> >>>> >>> This still bothers me, though. I imagine that it is next to impossible >>> to detect this situation from numpy's perspective, so it can't even emit a >>> warning or error. Furthermore, for someone who makes a general function to >>> modify the contents of some externally provided array, there is a >>> possibility that the provided array is actually a copy not a view. >>> Although, I guess it is the responsibility of the user to know the >>> difference. >>> >>> I guess that is the key problem. The key advantage we are taught about >>> numpy arrays is the use of views for efficient access. It would seem that >>> most access operations would use it, but in reality, only sliced access do. >>> Everything else is a copy (unless you are doing fancy indexing with >>> assignment). Maybe with some of the forthcoming changes that have been done >>> with respect to nditer and ufuncs (in particular, I am thinking of the >>> "where" kwarg), maybe we could consider an enhancement allowing fancy >>> indexing (or at least boolean indexing) to produce a view? Even if it is >>> less efficient than a view from slicing, it would bring better consistency >>> in behavior between the different forms of indexing. >>> >>> Just my 2 cents, >>> Ben Root >>> >> >> I think it would be nice to evolve the NumPy indexing and array >> representation towards the goal of indexing returning a view in all cases >> with no exceptions. This would provide a much nicer mental model to program >> with. Accomplishing such a transition will take a fair bit of time, though. >> >> -Mark >> >> > > Mark, > > It is good to know that there is a chance to make this possible, > eventually. However, I just thought of a possible barrier that might have > to be overcome before achieving this. Because it has always been very clear > that non-slicing produces copies, I can easily imagine situations where > developers have come to depend on this copying behavior. While I think most > copies are unintended (but unnoticed because it was read-only), it is quite > possible that there are situations where this copy behavior is entirely > intended. Therefore, changing this behavior may break code in subtle ways. > > I am not saying that it shouldn't be done (clarity and simplicity should be > paramount), but one should tread carefully here. > Absolutely. 
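To make the behaviour in question concrete, a short illustration (variable names are made up):

import numpy as np

a = np.arange(6)
view = a[1:4]            # basic slice -> view
copy = a[[1, 2, 3]]      # integer fancy indexing -> copy
mask = a > 3
bcopy = a[mask]          # boolean indexing -> also a copy

view[0] = 100            # visible in a
copy[0] = -1             # not visible in a
bcopy[:] = -1            # not visible in a
print(a)                 # [  0 100   2   3   4   5]

# Writing *through* the index is a single __setitem__, so it does modify a:
a[mask] = 0
print(a)                 # [  0 100   2   3   0   0]
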
It would necessarily be very long term and the specifics of how it could be done are nontrivial, but I figured it was worth mentioning the idea. -Mark > > My 2 cents, > Ben Root > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From keith.hughitt at gmail.com Wed Aug 17 15:25:33 2011 From: keith.hughitt at gmail.com (Keith Hughitt) Date: Wed, 17 Aug 2011 15:25:33 -0400 Subject: [Numpy-discussion] Overriding numpy.ndarray.__getitem__? Message-ID: Hi all, I have a subclass of ndarray which is built using using a stack of images. Rather than store the image header information separately, I overrode __getitem__ so that when the user indexes into the image cube a single image a different object type (which includes the header information) is returned : class ImageCube(np.ndarray): . . . def __getitem__(self, key): """Overiding indexing operation""" if isinstance(key, int): data = np.ndarray.__getitem__(self, key) header = self._headers[key] return SingleImage(data, header) else: return np.ndarray.__getitem__(self, key) Everything seems to work well, however, now when I try to combine that with indexing into the other dimensions of a single image, errors relating to numpy's array printing arise, e.g.: >>> print imagecube[0,0:256,0:256] . . . /usr/lib/pymodules/python2.7/numpy/core/arrayprint.pyc in _formatArray(a, format_function, rank, max_line_len, next_line_prefix, separator, edge_items, summary_insert) 371 if leading_items or i != trailing_items: 372 s += next_line_prefix --> 373 s += _formatArray(a[-i], format_function, rank-1, max_line_len, 374 " " + next_line_prefix, separator, edge_items, 375 summary_insert) I think the problem has to do with how I am overriding __getitem__: I check to see if the input is a single integer, and if it is, I return the new object instance. This should only occur when something like "imagecube[n]" is called, however, array2str ends up calling imagecube[x], even if the original thing you are trying to print is something like imagecube[0,1:256,1:256]. Any ideas? I apologize if the explanation is not very clear; I'm still trying to figure out exactly what is going on. Thanks, Keith -------------- next part -------------- An HTML attachment was scrubbed... URL: From keith.hughitt at gmail.com Wed Aug 17 16:22:30 2011 From: keith.hughitt at gmail.com (Keith Hughitt) Date: Wed, 17 Aug 2011 16:22:30 -0400 Subject: [Numpy-discussion] Overriding numpy.ndarray.__getitem__? In-Reply-To: References: Message-ID: Okay, I found something that seems to do the trick for this particular problem. Instead of just checking whether the input to __getitem__ is an int, I also check the number of dimensions to make sure we are indexing within the full cube, and not some sub-index of the cube: if self.ndim is 3 and isinstance(key, int): ... I think what was happening is that when repr() is called on the map, it recursively walks through displaying one dimension at a time, and this is what was causing my code to choke; the instantiation of a subclass only makes sense for one of the three dimensions. Keith On Wed, Aug 17, 2011 at 3:25 PM, Keith Hughitt wrote: > Hi all, > > I have a subclass of ndarray which is built using using a stack of images. 
> Rather than store the image header information separately, I overrode > __getitem__ so that when the user indexes into the image cube a single image > a different object type (which includes the header information) is returned > : > > class ImageCube(np.ndarray): > . > . > . > def __getitem__(self, key): > """Overiding indexing operation""" > if isinstance(key, int): > data = np.ndarray.__getitem__(self, key) > header = self._headers[key] > return SingleImage(data, header) > else: > return np.ndarray.__getitem__(self, key) > > > Everything seems to work well, however, now when I try to combine that with > indexing into the other dimensions of a single image, errors relating to > numpy's array printing arise, e.g.: > > > >>> print imagecube[0,0:256,0:256] > . > . > . > > /usr/lib/pymodules/python2.7/numpy/core/arrayprint.pyc in _formatArray(a, > format_function, rank, max_line_len, next_line_prefix, separator, > edge_items, summary_insert) > 371 if leading_items or i != trailing_items: > 372 s += next_line_prefix > --> 373 s += _formatArray(a[-i], format_function, rank-1, > max_line_len, > 374 " " + next_line_prefix, separator, > edge_items, > 375 summary_insert) > > > I think the problem has to do with how I am overriding __getitem__: I check > to see if the input is a single integer, and if it is, I return the new > object instance. This should only occur when something like "imagecube[n]" > is called, however, array2str ends up calling imagecube[x], even if the > original thing you are trying to print is something like > imagecube[0,1:256,1:256]. > > Any ideas? I apologize if the explanation is not very clear; I'm still > trying to figure out exactly what is going on. > > Thanks, > Keith > -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris at simplistix.co.uk Thu Aug 18 10:19:06 2011 From: chris at simplistix.co.uk (Chris Withers) Date: Thu, 18 Aug 2011 07:19:06 -0700 Subject: [Numpy-discussion] summing an array Message-ID: <4E4D1F5A.2000205@simplistix.co.uk> Hi All, Hopefully a simple newbie question, if I have an array such as : array([0, 1, 2, 3, 4]) ...what's the best way to cummulatively sum it so that I end up with: array([0, 1, 3, 6, 10]) How would I do this both in-place and to create a new array? cheers, Chris -- Simplistix - Content Management, Batch Processing & Python Consulting - http://www.simplistix.co.uk From jsseabold at gmail.com Thu Aug 18 10:22:51 2011 From: jsseabold at gmail.com (Skipper Seabold) Date: Thu, 18 Aug 2011 10:22:51 -0400 Subject: [Numpy-discussion] summing an array In-Reply-To: <4E4D1F5A.2000205@simplistix.co.uk> References: <4E4D1F5A.2000205@simplistix.co.uk> Message-ID: On Thu, Aug 18, 2011 at 10:19 AM, Chris Withers wrote: > Hi All, > > Hopefully a simple newbie question, if I have an array such as : > > array([0, 1, 2, 3, 4]) > > ...what's the best way to cummulatively sum it so that I end up with: > > array([0, 1, 3, 6, 10]) > > How would I do this both in-place and to create a new array? 
> [~/] [1]: a = np.arange(5) [~/] [2]: a [2]: array([0, 1, 2, 3, 4]) [~/] [3]: np.cumsum(a) [3]: array([ 0, 1, 3, 6, 10]) [~/] [4]: np.cumsum(a,out=a) [4]: array([ 0, 1, 3, 6, 10]) [~/] [5]: a [5]: array([ 0, 1, 3, 6, 10]) Skipper From rjd4+numpy at cam.ac.uk Thu Aug 18 10:58:25 2011 From: rjd4+numpy at cam.ac.uk (Bob Dowling) Date: Thu, 18 Aug 2011 15:58:25 +0100 Subject: [Numpy-discussion] summing an array In-Reply-To: <4E4D1F5A.2000205@simplistix.co.uk> References: <4E4D1F5A.2000205@simplistix.co.uk> Message-ID: <4E4D2891.4@cam.ac.uk> On 18/08/11 15:19, Chris Withers wrote: > Hopefully a simple newbie question, if I have an array such as : > > array([0, 1, 2, 3, 4]) > > ...what's the best way to cummulatively sum it so that I end up with: > > array([0, 1, 3, 6, 10]) > > How would I do this both in-place and to create a new array? >>> a = numpy.arange(0,5) >>> a array([0, 1, 2, 3, 4]) >>> numpy.add.accumulate(a) array([ 0, 1, 3, 6, 10]) >>> numpy.add.accumulate(a, out=a) array([ 0, 1, 3, 6, 10]) >>> a array([ 0, 1, 3, 6, 10]) >>> And similarly with numpy.multiply for products etc. From mwwiebe at gmail.com Thu Aug 18 17:43:17 2011 From: mwwiebe at gmail.com (Mark Wiebe) Date: Thu, 18 Aug 2011 14:43:17 -0700 Subject: [Numpy-discussion] NA masks for NumPy are ready to test Message-ID: It's taken a lot of changes to get the NA mask support to its current point, but the code ready for some testing now. You can read the work-in-progress release notes here: https://github.com/m-paradox/numpy/blob/missingdata/doc/release/2.0.0-notes.rst To try it out, check out the missingdata branch from my github account, here, and build in the standard way: https://github.com/m-paradox/numpy The things most important to test are: * Confirm that existing code still works correctly. I've tested against SciPy and matplotlib. * Confirm that the performance of code not using NA masks is the same or better. * Try to do computations with the NA values, find places they don't work yet, and nominate unimplemented functionality important to you to be next on the development list. The release notes have a preliminary list of implemented/unimplemented functions. * Report any crashes, build problems, or unexpected behaviors. In addition to adding the NA mask, I've also added features and done a few performance changes here and there, like letting reductions like sum take lists of axes instead of being a single axis or all of them. These changes affect various bugs like http://projects.scipy.org/numpy/ticket/1143 and http://projects.scipy.org/numpy/ticket/533. Thanks! 
Mark Here's a small example run using NAs: >>> import numpy as np >>> np.__version__ '2.0.0.dev-8a5e2a1' >>> a = np.random.rand(3,3,3) >>> a.flags.maskna = True >>> a[np.random.rand(3,3,3) < 0.5] = np.NA >>> a array([[[NA, NA, 0.11511708], [ 0.46661454, 0.47565512, NA], [NA, NA, NA]], [[NA, 0.57860351, NA], [NA, NA, 0.72012669], [ 0.36582123, NA, 0.76289794]], [[ 0.65322748, 0.92794386, NA], [ 0.53745165, 0.97520989, 0.17515083], [ 0.71219688, 0.5184328 , 0.75802805]]]) >>> np.mean(a, axis=-1) array([[NA, NA, NA], [NA, NA, NA], [NA, 0.56260412, 0.66288591]]) >>> np.std(a, axis=-1) array([[NA, NA, NA], [NA, NA, NA], [NA, 0.32710662, 0.10384331]]) >>> np.mean(a, axis=-1, skipna=True) /home/mwiebe/installtest/lib64/python2.7/site-packages/numpy/core/fromnumeric.py:2474: RuntimeWarning: invalid value encountered in true_divide um.true_divide(ret, rcount, out=ret, casting='unsafe') array([[ 0.11511708, 0.47113483, nan], [ 0.57860351, 0.72012669, 0.56435958], [ 0.79058567, 0.56260412, 0.66288591]]) >>> np.std(a, axis=-1, skipna=True) /home/mwiebe/installtest/lib64/python2.7/site-packages/numpy/core/fromnumeric.py:2707: RuntimeWarning: invalid value encountered in true_divide um.true_divide(arrmean, rcount, out=arrmean, casting='unsafe') /home/mwiebe/installtest/lib64/python2.7/site-packages/numpy/core/fromnumeric.py:2730: RuntimeWarning: invalid value encountered in true_divide um.true_divide(ret, rcount, out=ret, casting='unsafe') array([[ 0. , 0.00452029, nan], [ 0. , 0. , 0.19853835], [ 0.13735819, 0.32710662, 0.10384331]]) >>> np.std(a, axis=(1,2), skipna=True) array([ 0.16786895, 0.15498008, 0.23811937]) -------------- next part -------------- An HTML attachment was scrubbed... URL: From rblove_lists at comcast.net Thu Aug 18 22:24:01 2011 From: rblove_lists at comcast.net (Robert Love) Date: Thu, 18 Aug 2011 21:24:01 -0500 Subject: [Numpy-discussion] dtype and shape for 1.6.1 seems broken? Message-ID: <506D2909-E407-4BE5-9F82-48D2E5D88E9D@comcast.net> This works under 1.5.1 and 1.6.0 but gives me errors in 1.6.1 import numpy as np def main(): print"numpy version: "+ np.__version__ zdt = np.dtype([('et','i4'),('r','f8',3)]) zdata = np.loadtxt('zdum.txt', zdt) In 1.6.1 I get this error: ValueError: setting an array element with a sequence. Is this a known problem? -------------- next part -------------- An HTML attachment was scrubbed... URL: From mwwiebe at gmail.com Thu Aug 18 23:44:50 2011 From: mwwiebe at gmail.com (Mark Wiebe) Date: Thu, 18 Aug 2011 20:44:50 -0700 Subject: [Numpy-discussion] dtype and shape for 1.6.1 seems broken? In-Reply-To: <506D2909-E407-4BE5-9F82-48D2E5D88E9D@comcast.net> References: <506D2909-E407-4BE5-9F82-48D2E5D88E9D@comcast.net> Message-ID: This could be related to ticket #1936: http://projects.scipy.org/numpy/ticket/1936 for which there's a pull request against master here: https://github.com/numpy/numpy/pull/140 -Mark On Thu, Aug 18, 2011 at 7:24 PM, Robert Love wrote: > > This works under 1.5.1 and 1.6.0 but gives me errors in 1.6.1 > > import numpy as np > > def main(): > > print"numpy version: "+ np.__version__ > > zdt = np.dtype([('et','i4'),('r','f8',3)]) > > zdata = np.loadtxt('zdum.txt', zdt) > > In 1.6.1 I get this error: > > ValueError: setting an array element with a sequence. Is this a known > problem? 
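For reference, a self-contained sketch of the same call (the layout of zdum.txt is an assumption here: one et integer followed by the three r floats per record; StringIO stands in for the real file). This is the usage reported to work on 1.5.1/1.6.0 and to fail on 1.6.1:

import numpy as np
from StringIO import StringIO  # Python 2, as used in the report

zdt = np.dtype([('et', 'i4'), ('r', 'f8', 3)])

# Assumed file layout: four whitespace-separated columns per line.
text = StringIO("1 0.1 0.2 0.3\n2 1.1 1.2 1.3\n")

zdata = np.loadtxt(text, dtype=zdt)
# zdata['et'] -> array([1, 2], dtype=int32)
# zdata['r']  -> array([[ 0.1,  0.2,  0.3],
#                       [ 1.1,  1.2,  1.3]])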
> > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From cgohlke at uci.edu Thu Aug 18 23:45:09 2011 From: cgohlke at uci.edu (Christoph Gohlke) Date: Thu, 18 Aug 2011 20:45:09 -0700 Subject: [Numpy-discussion] dtype and shape for 1.6.1 seems broken? In-Reply-To: <506D2909-E407-4BE5-9F82-48D2E5D88E9D@comcast.net> References: <506D2909-E407-4BE5-9F82-48D2E5D88E9D@comcast.net> Message-ID: <4E4DDC45.3010303@uci.edu> On 8/18/2011 7:24 PM, Robert Love wrote: > > This works under 1.5.1 and 1.6.0 but gives me errors in 1.6.1 > > import numpy as np > > def main(): > > print"numpy version: "+ np.__version__ > > zdt = np.dtype([('et','i4'),('r','f8',3)]) > > zdata = np.loadtxt('zdum.txt', zdt) > > In 1.6.1 I get this error: > > ValueError: setting an array element with a sequence. Is this a known > problem? > This looks like The ValueError is raised in "numpy\lib\npyio.py", line 804, in loadtxt. Npyio.py is identical for numpy 1.6.0 and 1.6.1. This is an actual function call from line 804, which works in numpy 1.6.0 but fails with 1.6.1: >>> np.array([(0, ((0., 0., 0.),))], dtype=[('et', ' References: <506D2909-E407-4BE5-9F82-48D2E5D88E9D@comcast.net> Message-ID: <28A70610-5F10-4D9C-8F6B-FFE17C4F5A1C@iro.umontreal.ca> On 2011-08-18, at 10:24 PM, Robert Love wrote: > In 1.6.1 I get this error: > > ValueError: setting an array element with a sequence. Is this a known problem? You'll have to post a traceback if we're to figure out what the problem is. A few lines of zdum.txt would also be nice. Suffice it to say the dtype line runs fine in 1.6.1, so the problem is either in loadtxt or the data it's being asked to process. From mwwiebe at gmail.com Fri Aug 19 00:01:42 2011 From: mwwiebe at gmail.com (Mark Wiebe) Date: Thu, 18 Aug 2011 21:01:42 -0700 Subject: [Numpy-discussion] longlong format error with Python <= 2.6 in scalartypes.c In-Reply-To: <8D5A8864-6827-4164-B8F6-198000B7491D@astro.physik.uni-goettingen.de> References: <8D5A8864-6827-4164-B8F6-198000B7491D@astro.physik.uni-goettingen.de> Message-ID: On Thu, Aug 4, 2011 at 4:08 PM, Derek Homeier < derek at astro.physik.uni-goettingen.de> wrote: > Hi, > > commits c15a807e and c135371e (thus most immediately addressed to Mark, but > I am sending this to the list hoping for more insight on the issue) > introduce a test failure with Python 2.5+2.6 on Mac: > > FAIL: test_timedelta_scalar_construction (test_datetime.TestDateTime) > ---------------------------------------------------------------------- > Traceback (most recent call last): > File > "/Users/derek/lib/python2.6/site-packages/numpy/core/tests/test_datetime.py", > line 219, in test_timedelta_scalar_construction > assert_equal(str(np.timedelta64(3, 's')), '3 seconds') > File "/Users/derek/lib/python2.6/site-packages/numpy/testing/utils.py", > line 313, in assert_equal > raise AssertionError(msg) > AssertionError: > Items are not equal: > ACTUAL: '%lld seconds' > DESIRED: '3 seconds' > > due to the "lld" format passed to PyUString_FromFormat in scalartypes.c. > In the current npy_common.h I found the comment > * in Python 2.6 the %lld formatter is not supported. In this > * case we work around the problem by using the %zd formatter. 
> though I did not notice that problem when I cleaned up the NPY_LONGLONG_FMT > definitions in that file (and it is not entirely clear whether the comment > only pertains to Windows...). Anyway changing the formatters in > scalartypes.c to "zd" as well removes the failure and still works with > Python 2.7 and 3.2 (at least on Mac OS). However I am wondering if > a) NPY_[U]LONGLONG_FMT should also be defined conditional to the Python > version (and if "%zu" is a valid formatter), and > b) scalartypes.c should use NPY_LONGLONG_FMT from npy_common.h > > I am attaching a patch implementing a), but only the quick and dirty > solution to b). > I've touched this stuff as little as possible, because I rather dislike the way the *_FMT macros are set up right now. I added a comment about NPY_INTP_FMT in npy_common.h which I see you read. If you're going to try to fix this, I hope you fix it deeper than this patch so it's not error-prone anymore. NPY_INTP_FMT is used together with PyErr_Format/PyString_FromFormat, whereas the other *_FMT are used with the *printf functions from the C libraries. These are not compatible, and the %zd hack was put in place because it exists even in Python 2.4, and Py_ssize_t seems matches the pointer size in all CPython versions. Switching the timedelta64 format in scalartypes.c.src to "%zd" won't help on 32-bit platforms, because it won't be a 64-bit type there, unlike how it works ok for the NPY_INTP_FMT. In summary: * There need to be changes to create a clear distinction between the *_FMT for PyString_FromFormat vs the *_FMT for C library *printf functions * I suspect we're out of luck for 32-bit older versions of CPython with PyString_FromFormat Cheers, -Mark > > Cheers, > Derek > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Fri Aug 19 00:32:44 2011 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 18 Aug 2011 22:32:44 -0600 Subject: [Numpy-discussion] dtype and shape for 1.6.1 seems broken? In-Reply-To: <4E4DDC45.3010303@uci.edu> References: <506D2909-E407-4BE5-9F82-48D2E5D88E9D@comcast.net> <4E4DDC45.3010303@uci.edu> Message-ID: On Thu, Aug 18, 2011 at 9:45 PM, Christoph Gohlke wrote: > > > On 8/18/2011 7:24 PM, Robert Love wrote: > > > > This works under 1.5.1 and 1.6.0 but gives me errors in 1.6.1 > > > > import numpy as np > > > > def main(): > > > > print"numpy version: "+ np.__version__ > > > > zdt = np.dtype([('et','i4'),('r','f8',3)]) > > > > zdata = np.loadtxt('zdum.txt', zdt) > > > > In 1.6.1 I get this error: > > > > ValueError: setting an array element with a sequence. Is this a known > > problem? > > > > This looks like > > The ValueError is raised in "numpy\lib\npyio.py", line 804, in loadtxt. > > Npyio.py is identical for numpy 1.6.0 and 1.6.1. 
> > This is an actual function call from line 804, which works in numpy > 1.6.0 but fails with 1.6.1: > > >>> np.array([(0, ((0., 0., 0.),))], dtype=[('et', ' (3,))]) > > Looks malformed, shouldn't that be In [16]: np.array((0, (0., 0., 0.)), dtype=[('et', ' From ralf.gommers at googlemail.com Fri Aug 19 06:48:29 2011 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Fri, 19 Aug 2011 12:48:29 +0200 Subject: [Numpy-discussion] [SciPy-User] disabling SVN (was: Trouble installing scipy after upgrading to Mac OS X 10.7 aka Lion) In-Reply-To: References: Message-ID: On Tue, Aug 16, 2011 at 3:01 PM, Pauli Virtanen wrote: > Sat, 13 Aug 2011 22:00:33 -0400, josef.pktd wrote: > [clip] > > Does Trac require svn access to dig out old information? for example > > links to old changesets, annotate/blame, ... ? > > It does not require HTTP access to SVN, as it looks directly at the > SVN repo on the local disk. > > It also probably doesn't use the old SVN repo for anything in reality, > as there's a simple Git plugin installed that just grabs the Git history > to the timeline, and redirects source browsing etc to Github. > However, I don't know whether the timeline views etc continue to > function even without the local SVN repo, so I'd just disable the HTTP > access and leave the local repo as it is as a backup. > > Hi Ognen, Could you please disable http access to numpy and scipy svn? Thanks a lot, Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From dirk.ullrich at googlemail.com Fri Aug 19 07:26:16 2011 From: dirk.ullrich at googlemail.com (Dirk Ullrich) Date: Fri, 19 Aug 2011 13:26:16 +0200 Subject: [Numpy-discussion] Build of current Git HEAD for NumPy fails Message-ID: Hi, when trying to build current Git HAED of NumPy with - both for $PYTHON=python2 or $PYTHON=python3: $PYTHON setup.py config_fc --fcompiler=gnu95 install --prefix=$WHATEVER I get the following error - here for PYTHON=python3.2 running build_clib customize UnixCCompiler customize UnixCCompiler using build_clib building 'npymath' library Traceback (most recent call last): File "setup.py", line 214, in setup_package() File "setup.py", line 207, in setup_package configuration=configuration ) File "/common/packages/build/makepkg-du/python-numpy-git/src/numpy-build/build/py3k/numpy/distutils/core.py", line 186, in setup return old_setup(**new_attr) File "/usr/lib/python3.2/distutils/core.py", line 150, in setup dist.run_commands() File "/usr/lib/python3.2/distutils/dist.py", line 919, in run_commands self.run_command(cmd) File "/usr/lib/python3.2/distutils/dist.py", line 938, in run_command cmd_obj.run() File "/common/packages/build/makepkg-du/python-numpy-git/src/numpy-build/build/py3k/numpy/distutils/command/build.py", line 37, in run old_build.run(self) File "/usr/lib/python3.2/distutils/command/build.py", line 128, in run self.run_command(cmd_name) File "/usr/lib/python3.2/distutils/cmd.py", line 315, in run_command self.distribution.run_command(command) File "/usr/lib/python3.2/distutils/dist.py", line 938, in run_command cmd_obj.run() File "/common/packages/build/makepkg-du/python-numpy-git/src/numpy-build/build/py3k/numpy/distutils/command/build_clib.py", line 100, in run self.build_libraries(self.libraries) File "/common/packages/build/makepkg-du/python-numpy-git/src/numpy-build/build/py3k/numpy/distutils/command/build_clib.py", line 119, in build_libraries self.build_a_library(build_info, lib_name, libraries) File 
"/common/packages/build/makepkg-du/python-numpy-git/src/numpy-build/build/py3k/numpy/distutils/command/build_clib.py", line 179, in build_a_library fcompiler.extra_f77_compile_args = build_info.get('extra_f77_compile_args') or [] AttributeError: 'str' object has no attribute 'extra_f77_compile_args' It seems that `fcompiler's value in line 179 of `numpy/distutils/command/build_clib.py' is not properly initialized as an appropriate `fcompiler' object. Dirk From pearu.peterson at gmail.com Fri Aug 19 07:59:58 2011 From: pearu.peterson at gmail.com (Pearu Peterson) Date: Fri, 19 Aug 2011 14:59:58 +0300 Subject: [Numpy-discussion] Build of current Git HEAD for NumPy fails In-Reply-To: References: Message-ID: <4E4E503E.4060408@cens.ioc.ee> On 08/19/2011 02:26 PM, Dirk Ullrich wrote: > Hi, > > when trying to build current Git HAED of NumPy with - both for > $PYTHON=python2 or $PYTHON=python3: > > $PYTHON setup.py config_fc --fcompiler=gnu95 install --prefix=$WHATEVER > > I get the following error - here for PYTHON=python3.2 The command works fine here with Numpy HEAD and Python 2.7. Btw, why do you specify --fcompiler=gnu95 for numpy? Numpy has no Fortran sources. So, fortran compiler is not needed for building Numpy (unless you use Fortran libraries for numpy.linalg). > running build_clib ... > File "/common/packages/build/makepkg-du/python-numpy-git/src/numpy-build/build/py3k/numpy/distutils/command/build_clib.py", > line 179, in build_a_library > fcompiler.extra_f77_compile_args = > build_info.get('extra_f77_compile_args') or [] > AttributeError: 'str' object has no attribute 'extra_f77_compile_args' Reading the code, I don't see how this can happen. Very strange. Anyway, I cleaned up build_clib to follow similar coding convention as in build_ext. Could you try numpy head again? Regards, Pearu From pav at iki.fi Fri Aug 19 08:48:01 2011 From: pav at iki.fi (Pauli Virtanen) Date: Fri, 19 Aug 2011 12:48:01 +0000 (UTC) Subject: [Numpy-discussion] [SciPy-User] disabling SVN (was: Trouble installing scipy after upgrading to Mac OS X 10.7 aka Lion) References: Message-ID: Fri, 19 Aug 2011 12:48:29 +0200, Ralf Gommers wrote: [clip] > Hi Ognen, > > Could you please disable http access to numpy and scipy svn? Turns out also I had enough permissions to disable this. Now: $ svn co http://svn.scipy.org/svn/numpy/trunk numpy svn: Repository moved permanently to 'http://github.com/numpy/numpy/'; please relocate From dirk.ullrich at googlemail.com Fri Aug 19 08:50:18 2011 From: dirk.ullrich at googlemail.com (Dirk Ullrich) Date: Fri, 19 Aug 2011 14:50:18 +0200 Subject: [Numpy-discussion] Build of current Git HEAD for NumPy fails In-Reply-To: <4E4E503E.4060408@cens.ioc.ee> References: <4E4E503E.4060408@cens.ioc.ee> Message-ID: Hi Paeru, 2011/8/19 Pearu Peterson : > > > On 08/19/2011 02:26 PM, Dirk Ullrich wrote: >> Hi, >> >> when trying to build current Git HAED of NumPy with - both for >> $PYTHON=python2 or $PYTHON=python3: >> >> $PYTHON setup.py config_fc --fcompiler=gnu95 install --prefix=$WHATEVER >> >> I get the following error - here for PYTHON=python3.2 > > The command works fine here with Numpy HEAD and Python 2.7. > Btw, why do you specify --fcompiler=gnu95 for numpy? Numpy > has no Fortran sources. So, fortran compiler is not needed > for building Numpy (unless you use Fortran libraries > for numpy.linalg). > I do use Lapack. Sorry for not mentioning it. >> running build_clib > ... >> ? 
?File "/common/packages/build/makepkg-du/python-numpy-git/src/numpy-build/build/py3k/numpy/distutils/command/build_clib.py", >> line 179, in build_a_library >> ? ? ?fcompiler.extra_f77_compile_args = >> build_info.get('extra_f77_compile_args') or [] >> AttributeError: 'str' object has no attribute 'extra_f77_compile_args' > > Reading the code, I don't see how this can happen. Very strange. > Anyway, I cleaned up build_clib to follow similar coding convention > as in build_ext. Could you try numpy head again? >[...] Now it seems to work for for Python 3.2 and 2.7. Thank you very much, Pearu! Dirk From jlconlin at gmail.com Fri Aug 19 09:00:31 2011 From: jlconlin at gmail.com (Jeremy Conlin) Date: Fri, 19 Aug 2011 07:00:31 -0600 Subject: [Numpy-discussion] How to start at line # x when using numpy.memmap Message-ID: I would like to use numpy's memmap on some data files I have. The first 12 or so lines of the files contain text (header information) and the remainder has the numerical data. Is there a way I can tell memmap to skip a specified number of lines instead of a number of bytes? Thanks, Jeremy From pav at iki.fi Fri Aug 19 09:19:24 2011 From: pav at iki.fi (Pauli Virtanen) Date: Fri, 19 Aug 2011 13:19:24 +0000 (UTC) Subject: [Numpy-discussion] How to start at line # x when using numpy.memmap References: Message-ID: Fri, 19 Aug 2011 07:00:31 -0600, Jeremy Conlin wrote: > I would like to use numpy's memmap on some data files I have. The first > 12 or so lines of the files contain text (header information) and the > remainder has the numerical data. Is there a way I can tell memmap to > skip a specified number of lines instead of a number of bytes? First use standard Python I/O functions to determine the number of bytes to skip at the beginning and the number of data items. Then pass in `offset` and `shape` parameters to numpy.memmap. -- Pauli Virtanen From jlconlin at gmail.com Fri Aug 19 09:29:44 2011 From: jlconlin at gmail.com (Jeremy Conlin) Date: Fri, 19 Aug 2011 07:29:44 -0600 Subject: [Numpy-discussion] How to start at line # x when using numpy.memmap In-Reply-To: References: Message-ID: On Fri, Aug 19, 2011 at 7:19 AM, Pauli Virtanen wrote: > Fri, 19 Aug 2011 07:00:31 -0600, Jeremy Conlin wrote: >> I would like to use numpy's memmap on some data files I have. The first >> 12 or so lines of the files contain text (header information) and the >> remainder has the numerical data. Is there a way I can tell memmap to >> skip a specified number of lines instead of a number of bytes? > > First use standard Python I/O functions to determine the number of > bytes to skip at the beginning and the number of data items. Then pass > in `offset` and `shape` parameters to numpy.memmap. Thanks for that suggestion. However, I'm unfamiliar with the I/O functions you are referring to. Can you point me to do the documentation? Thanks again, Jeremy From ralf.gommers at googlemail.com Fri Aug 19 09:48:51 2011 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Fri, 19 Aug 2011 15:48:51 +0200 Subject: [Numpy-discussion] [SciPy-User] disabling SVN (was: Trouble installing scipy after upgrading to Mac OS X 10.7 aka Lion) In-Reply-To: References: Message-ID: On Fri, Aug 19, 2011 at 2:48 PM, Pauli Virtanen wrote: > Fri, 19 Aug 2011 12:48:29 +0200, Ralf Gommers wrote: > [clip] > > Hi Ognen, > > > > Could you please disable http access to numpy and scipy svn? > > Turns out also I had enough permissions to disable this. 
Now: > > $ svn co http://svn.scipy.org/svn/numpy/trunk numpy > svn: Repository moved permanently to 'http://github.com/numpy/numpy/'; > please relocate > > A helpful message even, nice touch. -------------- next part -------------- An HTML attachment was scrubbed... URL: From bpederse at gmail.com Fri Aug 19 10:01:06 2011 From: bpederse at gmail.com (Brent Pedersen) Date: Fri, 19 Aug 2011 08:01:06 -0600 Subject: [Numpy-discussion] How to start at line # x when using numpy.memmap In-Reply-To: References: Message-ID: On Fri, Aug 19, 2011 at 7:29 AM, Jeremy Conlin wrote: > On Fri, Aug 19, 2011 at 7:19 AM, Pauli Virtanen wrote: >> Fri, 19 Aug 2011 07:00:31 -0600, Jeremy Conlin wrote: >>> I would like to use numpy's memmap on some data files I have. The first >>> 12 or so lines of the files contain text (header information) and the >>> remainder has the numerical data. Is there a way I can tell memmap to >>> skip a specified number of lines instead of a number of bytes? >> >> First use standard Python I/O functions to determine the number of >> bytes to skip at the beginning and the number of data items. Then pass >> in `offset` and `shape` parameters to numpy.memmap. > > Thanks for that suggestion. However, I'm unfamiliar with the I/O > functions you are referring to. Can you point me to do the > documentation? > > Thanks again, > Jeremy > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > this might get you started: import numpy as np # make some fake data with 12 header lines. with open('test.mm', 'w') as fhw: print >> fhw, "\n".join('header' for i in range(12)) np.arange(100, dtype=np.uint).tofile(fhw) # use normal python io to determine of offset after 12 lines. with open('test.mm') as fhr: for i in range(12): fhr.readline() offset = fhr.tell() # use the offset in your call to np.memmap. a = np.memmap('test.mm', mode='r', dtype=np.uint, offset=offset) assert all(a == np.arange(100)) From pearu.peterson at gmail.com Fri Aug 19 10:07:54 2011 From: pearu.peterson at gmail.com (Pearu Peterson) Date: Fri, 19 Aug 2011 17:07:54 +0300 Subject: [Numpy-discussion] How to start at line # x when using numpy.memmap In-Reply-To: References: Message-ID: <4E4E6E3A.5010908@cens.ioc.ee> On 08/19/2011 05:01 PM, Brent Pedersen wrote: > On Fri, Aug 19, 2011 at 7:29 AM, Jeremy Conlin wrote: >> On Fri, Aug 19, 2011 at 7:19 AM, Pauli Virtanen wrote: >>> Fri, 19 Aug 2011 07:00:31 -0600, Jeremy Conlin wrote: >>>> I would like to use numpy's memmap on some data files I have. The first >>>> 12 or so lines of the files contain text (header information) and the >>>> remainder has the numerical data. Is there a way I can tell memmap to >>>> skip a specified number of lines instead of a number of bytes? >>> >>> First use standard Python I/O functions to determine the number of >>> bytes to skip at the beginning and the number of data items. Then pass >>> in `offset` and `shape` parameters to numpy.memmap. >> >> Thanks for that suggestion. However, I'm unfamiliar with the I/O >> functions you are referring to. Can you point me to do the >> documentation? >> >> Thanks again, >> Jeremy >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> > > this might get you started: > > > import numpy as np > > # make some fake data with 12 header lines. 
> with open('test.mm', 'w') as fhw: > print>> fhw, "\n".join('header' for i in range(12)) > np.arange(100, dtype=np.uint).tofile(fhw) > > # use normal python io to determine of offset after 12 lines. > with open('test.mm') as fhr: > for i in range(12): fhr.readline() > offset = fhr.tell() I think that before reading a line the program should check whether the line starts with "#". Otherwise fhr.readline() may return a very large junk of data (may be the rest of the file content) that ought to be read only via memmap. HTH, Pearu From bsouthey at gmail.com Fri Aug 19 10:15:56 2011 From: bsouthey at gmail.com (Bruce Southey) Date: Fri, 19 Aug 2011 09:15:56 -0500 Subject: [Numpy-discussion] NA masks for NumPy are ready to test In-Reply-To: References: Message-ID: <4E4E701C.1030305@gmail.com> On 08/18/2011 04:43 PM, Mark Wiebe wrote: > It's taken a lot of changes to get the NA mask support to its current > point, but the code ready for some testing now. You can read the > work-in-progress release notes here: > > https://github.com/m-paradox/numpy/blob/missingdata/doc/release/2.0.0-notes.rst > > To try it out, check out the missingdata branch from my github > account, here, and build in the standard way: > > https://github.com/m-paradox/numpy > > The things most important to test are: > > * Confirm that existing code still works correctly. I've tested > against SciPy and matplotlib. > * Confirm that the performance of code not using NA masks is the same > or better. > * Try to do computations with the NA values, find places they don't > work yet, and nominate unimplemented functionality important to you to > be next on the development list. The release notes have a preliminary > list of implemented/unimplemented functions. > * Report any crashes, build problems, or unexpected behaviors. > > In addition to adding the NA mask, I've also added features and done a > few performance changes here and there, like letting reductions like > sum take lists of axes instead of being a single axis or all of them. > These changes affect various bugs like > http://projects.scipy.org/numpy/ticket/1143 and > http://projects.scipy.org/numpy/ticket/533. > > Thanks! 
> Mark > > Here's a small example run using NAs: > > >>> import numpy as np > >>> np.__version__ > '2.0.0.dev-8a5e2a1' > >>> a = np.random.rand(3,3,3) > >>> a.flags.maskna = True > >>> a[np.random.rand(3,3,3) < 0.5] = np.NA > >>> a > array([[[NA, NA, 0.11511708], > [ 0.46661454, 0.47565512, NA], > [NA, NA, NA]], > > [[NA, 0.57860351, NA], > [NA, NA, 0.72012669], > [ 0.36582123, NA, 0.76289794]], > > [[ 0.65322748, 0.92794386, NA], > [ 0.53745165, 0.97520989, 0.17515083], > [ 0.71219688, 0.5184328 , 0.75802805]]]) > >>> np.mean(a, axis=-1) > array([[NA, NA, NA], > [NA, NA, NA], > [NA, 0.56260412, 0.66288591]]) > >>> np.std(a, axis=-1) > array([[NA, NA, NA], > [NA, NA, NA], > [NA, 0.32710662, 0.10384331]]) > >>> np.mean(a, axis=-1, skipna=True) > /home/mwiebe/installtest/lib64/python2.7/site-packages/numpy/core/fromnumeric.py:2474: > RuntimeWarning: invalid value encountered in true_divide > um.true_divide(ret, rcount, out=ret, casting='unsafe') > array([[ 0.11511708, 0.47113483, nan], > [ 0.57860351, 0.72012669, 0.56435958], > [ 0.79058567, 0.56260412, 0.66288591]]) > >>> np.std(a, axis=-1, skipna=True) > /home/mwiebe/installtest/lib64/python2.7/site-packages/numpy/core/fromnumeric.py:2707: > RuntimeWarning: invalid value encountered in true_divide > um.true_divide(arrmean, rcount, out=arrmean, casting='unsafe') > /home/mwiebe/installtest/lib64/python2.7/site-packages/numpy/core/fromnumeric.py:2730: > RuntimeWarning: invalid value encountered in true_divide > um.true_divide(ret, rcount, out=ret, casting='unsafe') > array([[ 0. , 0.00452029, nan], > [ 0. , 0. , 0.19853835], > [ 0.13735819, 0.32710662, 0.10384331]]) > >>> np.std(a, axis=(1,2), skipna=True) > array([ 0.16786895, 0.15498008, 0.23811937]) > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion Hi, That is great news! (Python2.x will be another email.) Python3.1 and Python3.2 failed with building 'multiarraymodule_onefile.o' but I could not see any obvious reason. I had removed my build directory and then 'python3 setup.py build' but I saw this message: Running from numpy source directory. numpy/core/setup_common.py:86: MismatchCAPIWarning: API mismatch detected, the C API version numbers have to be updated. Current C api version is 6, with checksum ef5688af03ffa23dd8e11734f5b69313, but recorded checksum for C API version 6 in codegen_dir/cversions.txt is e61d5dc51fa1c6459328266e215d6987. If functions were added in the C API, you have to update C_API_VERSION in numpy/core/setup_common.py. MismatchCAPIWarning) Upstream of the build log is below. Bruce In file included from numpy/core/src/multiarray/multiarraymodule_onefile.c:53:0: numpy/core/src/multiarray/na_singleton.c: At top level: numpy/core/src/multiarray/na_singleton.c:708:25: error: ?Py_TPFLAGS_CHECKTYPES? undeclared here (not in a function) numpy/core/src/multiarray/common.c:48:1: warning: ?_use_default_type? defined but not used numpy/core/src/multiarray/ctors.h:93:1: warning: ?_arrays_overlap? declared ?static? but never defined numpy/core/src/multiarray/scalartypes.c.src:2251:1: warning: ?gentype_getsegcount? defined but not used numpy/core/src/multiarray/scalartypes.c.src:2269:1: warning: ?gentype_getcharbuf? defined but not used numpy/core/src/multiarray/mapping.c:110:1: warning: ?_array_ass_item? defined but not used numpy/core/src/multiarray/number.c:266:1: warning: ?array_divide? 
defined but not used numpy/core/src/multiarray/number.c:464:1: warning: ?array_inplace_divide? defined but not used numpy/core/src/multiarray/buffer.c:25:1: warning: ?array_getsegcount? defined but not used numpy/core/src/multiarray/buffer.c:58:1: warning: ?array_getwritebuf? defined but not used numpy/core/src/multiarray/buffer.c:71:1: warning: ?array_getcharbuf? defined but not used numpy/core/src/multiarray/na_mask.c:681:1: warning: ?PyArray_GetMaskInversionFunction? defined but not used In file included from numpy/core/src/multiarray/scalartypes.c.src:25:0, from numpy/core/src/multiarray/multiarraymodule_onefile.c:10: numpy/core/src/multiarray/_datetime.h:9:1: warning: function declaration isn?t a prototype In file included from numpy/core/src/multiarray/multiarraymodule_onefile.c:13:0: numpy/core/src/multiarray/datetime.c:33:1: warning: function declaration isn?t a prototype In file included from numpy/core/src/multiarray/multiarraymodule_onefile.c:17:0: numpy/core/src/multiarray/arraytypes.c.src: In function ?VOID_getitem?: numpy/core/src/multiarray/arraytypes.c.src:643:9: warning: passing argument 2 of ?PyArray_SetBaseObject? from incompatible pointer type build/src.linux-x86_64-3.2/numpy/core/include/numpy/__multiarray_api.h:763:12: note: expected ?struct PyObject *? but argument is of type ?struct PyArrayObject *? In file included from numpy/core/src/multiarray/multiarraymodule_onefile.c:44:0: numpy/core/src/multiarray/nditer_pywrap.c: In function ?npyiter_subscript?: numpy/core/src/multiarray/nditer_pywrap.c:2395:29: warning: passing argument 1 of ?PySlice_GetIndices? from incompatible pointer type /usr/local/include/python3.2m/sliceobject.h:38:5: note: expected ?struct PyObject *? but argument is of type ?struct PySliceObject *? numpy/core/src/multiarray/nditer_pywrap.c: In function ?npyiter_ass_subscript?: numpy/core/src/multiarray/nditer_pywrap.c:2440:29: warning: passing argument 1 of ?PySlice_GetIndices? from incompatible pointer type /usr/local/include/python3.2m/sliceobject.h:38:5: note: expected ?struct PyObject *? but argument is of type ?struct PySliceObject *? In file included from numpy/core/src/multiarray/multiarraymodule_onefile.c:53:0: numpy/core/src/multiarray/na_singleton.c: At top level: numpy/core/src/multiarray/na_singleton.c:708:25: error: ?Py_TPFLAGS_CHECKTYPES? undeclared here (not in a function) numpy/core/src/multiarray/common.c:48:1: warning: ?_use_default_type? defined but not used numpy/core/src/multiarray/ctors.h:93:1: warning: ?_arrays_overlap? declared ?static? but never defined numpy/core/src/multiarray/scalartypes.c.src:2251:1: warning: ?gentype_getsegcount? defined but not used numpy/core/src/multiarray/scalartypes.c.src:2269:1: warning: ?gentype_getcharbuf? defined but not used numpy/core/src/multiarray/mapping.c:110:1: warning: ?_array_ass_item? defined but not used numpy/core/src/multiarray/number.c:266:1: warning: ?array_divide? defined but not used numpy/core/src/multiarray/number.c:464:1: warning: ?array_inplace_divide? defined but not used numpy/core/src/multiarray/buffer.c:25:1: warning: ?array_getsegcount? defined but not used numpy/core/src/multiarray/buffer.c:58:1: warning: ?array_getwritebuf? defined but not used numpy/core/src/multiarray/buffer.c:71:1: warning: ?array_getcharbuf? defined but not used numpy/core/src/multiarray/na_mask.c:681:1: warning: ?PyArray_GetMaskInversionFunction? 
defined but not used error: Command "gcc -pthread -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -Inumpy/core/include -Ibuild/src.linux-x86_64-3.2/numpy/core/include/numpy -Inumpy/core/src/private -Inumpy/core/src -Inumpy/core -Inumpy/core/src/npymath -Inumpy/core/src/multiarray -Inumpy/core/src/umath -Inumpy/core/src/npysort -Inumpy/core/include -I/usr/local/include/python3.2m -Ibuild/src.linux-x86_64-3.2/numpy/core/src/multiarray -Ibuild/src.linux-x86_64-3.2/numpy/core/src/umath -c numpy/core/src/multiarray/multiarraymodule_onefile.c -o build/temp.linux-x86_64-3.2/numpy/core/src/multiarray/multiarraymodule_onefile.o" failed with exit status 1 -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris at simplistix.co.uk Fri Aug 19 10:49:33 2011 From: chris at simplistix.co.uk (Chris Withers) Date: Fri, 19 Aug 2011 07:49:33 -0700 Subject: [Numpy-discussion] summing an array In-Reply-To: <4E4D2891.4@cam.ac.uk> References: <4E4D1F5A.2000205@simplistix.co.uk> <4E4D2891.4@cam.ac.uk> Message-ID: <4E4E77FD.8070107@simplistix.co.uk> On 18/08/2011 07:58, Bob Dowling wrote: > > >>> numpy.add.accumulate(a) > array([ 0, 1, 3, 6, 10]) > > >>> numpy.add.accumulate(a, out=a) > array([ 0, 1, 3, 6, 10]) What's the difference between numpy.cumsum and numpy.add.accumulate? Where can I find the reference docs for these? cheers, Chris -- Simplistix - Content Management, Batch Processing & Python Consulting - http://www.simplistix.co.uk From bsouthey at gmail.com Fri Aug 19 10:55:46 2011 From: bsouthey at gmail.com (Bruce Southey) Date: Fri, 19 Aug 2011 09:55:46 -0500 Subject: [Numpy-discussion] NA masks for NumPy are ready to test In-Reply-To: References: Message-ID: <4E4E7972.9060807@gmail.com> On 08/18/2011 04:43 PM, Mark Wiebe wrote: > It's taken a lot of changes to get the NA mask support to its current > point, but the code ready for some testing now. You can read the > work-in-progress release notes here: > > https://github.com/m-paradox/numpy/blob/missingdata/doc/release/2.0.0-notes.rst > > To try it out, check out the missingdata branch from my github > account, here, and build in the standard way: > > https://github.com/m-paradox/numpy > > The things most important to test are: > > * Confirm that existing code still works correctly. I've tested > against SciPy and matplotlib. > * Confirm that the performance of code not using NA masks is the same > or better. > * Try to do computations with the NA values, find places they don't > work yet, and nominate unimplemented functionality important to you to > be next on the development list. The release notes have a preliminary > list of implemented/unimplemented functions. > * Report any crashes, build problems, or unexpected behaviors. > > In addition to adding the NA mask, I've also added features and done a > few performance changes here and there, like letting reductions like > sum take lists of axes instead of being a single axis or all of them. > These changes affect various bugs like > http://projects.scipy.org/numpy/ticket/1143 and > http://projects.scipy.org/numpy/ticket/533. > > Thanks! 
> Mark > > Here's a small example run using NAs: > > >>> import numpy as np > >>> np.__version__ > '2.0.0.dev-8a5e2a1' > >>> a = np.random.rand(3,3,3) > >>> a.flags.maskna = True > >>> a[np.random.rand(3,3,3) < 0.5] = np.NA > >>> a > array([[[NA, NA, 0.11511708], > [ 0.46661454, 0.47565512, NA], > [NA, NA, NA]], > > [[NA, 0.57860351, NA], > [NA, NA, 0.72012669], > [ 0.36582123, NA, 0.76289794]], > > [[ 0.65322748, 0.92794386, NA], > [ 0.53745165, 0.97520989, 0.17515083], > [ 0.71219688, 0.5184328 , 0.75802805]]]) > >>> np.mean(a, axis=-1) > array([[NA, NA, NA], > [NA, NA, NA], > [NA, 0.56260412, 0.66288591]]) > >>> np.std(a, axis=-1) > array([[NA, NA, NA], > [NA, NA, NA], > [NA, 0.32710662, 0.10384331]]) > >>> np.mean(a, axis=-1, skipna=True) > /home/mwiebe/installtest/lib64/python2.7/site-packages/numpy/core/fromnumeric.py:2474: > RuntimeWarning: invalid value encountered in true_divide > um.true_divide(ret, rcount, out=ret, casting='unsafe') > array([[ 0.11511708, 0.47113483, nan], > [ 0.57860351, 0.72012669, 0.56435958], > [ 0.79058567, 0.56260412, 0.66288591]]) > >>> np.std(a, axis=-1, skipna=True) > /home/mwiebe/installtest/lib64/python2.7/site-packages/numpy/core/fromnumeric.py:2707: > RuntimeWarning: invalid value encountered in true_divide > um.true_divide(arrmean, rcount, out=arrmean, casting='unsafe') > /home/mwiebe/installtest/lib64/python2.7/site-packages/numpy/core/fromnumeric.py:2730: > RuntimeWarning: invalid value encountered in true_divide > um.true_divide(ret, rcount, out=ret, casting='unsafe') > array([[ 0. , 0.00452029, nan], > [ 0. , 0. , 0.19853835], > [ 0.13735819, 0.32710662, 0.10384331]]) > >>> np.std(a, axis=(1,2), skipna=True) > array([ 0.16786895, 0.15498008, 0.23811937]) > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion Hi, I had to rebuild my Python2.6 as a 'normal' version. Anyhow, Python2.4, 2.5, 2.6 and 2.7 all build and pass the numpy tests. Curiously, only tests in Python2.7 give almost no warnings but all the other Python2.x give lots of warnings - Python2.6 and Python2.7 are below. My expectation is that all versions should behave the same regarding printing messages. Also the message 'Need pytz library to test datetime timezones' means that there are invalid tests that have to be rewritten (ticket 1939: http://projects.scipy.org/numpy/ticket/1939 ). Bruce $ python2.6 -c "import numpy; numpy.test()" Running unit tests for numpy NumPy version 2.0.0.dev-93236a2 NumPy is installed in /usr/local/lib/python2.6/site-packages/numpy Python version 2.6.6 (r266:84292, Aug 19 2011, 09:21:38) [GCC 4.5.1 20100924 (Red Hat 4.5.1-4)] nose version 1.0.0 ......................../usr/local/lib/python2.6/site-packages/numpy/core/tests/test_datetime.py:1313: UserWarning: Need pytz library to test datetime timezones warnings.warn("Need pytz library to test datetime timezones") .........................................................................................................................../usr/local/lib/python2.6/unittest.py:336: DeprecationWarning: DType strings 'O4' and 'O8' are deprecated because they are platform specific. 
Use 'O' instead callableObj(*args, **kwargs) ............................................................................................................................................................................................................./usr/local/lib/python2.6/site-packages/numpy/core/_internal.py:555: DeprecationWarning: Setting NumPy dtype names is deprecated, the dtype will become immutable in a future version value.names = tuple(names) ...../usr/local/lib/python2.6/site-packages/numpy/core/tests/test_multiarray.py:1912: DeprecationWarning: Setting NumPy dtype names is deprecated, the dtype will become immutable in a future version dt.names = tuple(names) ...../usr/local/lib/python2.6/site-packages/numpy/core/tests/test_multiarray.py:804: DeprecationWarning: DType strings 'O4' and 'O8' are deprecated because they are platform specific. Use 'O' instead return loads(obj) ..../usr/local/lib/python2.6/site-packages/numpy/core/tests/test_multiarray.py:1046: DeprecationWarning: putmask has been deprecated. Use copyto with 'where' as the mask instead np.putmask(x,[True,False,True],-1) ../usr/local/lib/python2.6/site-packages/numpy/core/tests/test_multiarray.py:1025: DeprecationWarning: putmask has been deprecated. Use copyto with 'where' as the mask instead np.putmask(x, mask, val) ................................................/usr/local/lib/python2.6/unittest.py:336: DeprecationWarning: putmask has been deprecated. Use copyto with 'where' as the mask instead callableObj(*args, **kwargs) ../usr/local/lib/python2.6/site-packages/numpy/core/tests/test_multiarray.py:1057: DeprecationWarning: putmask has been deprecated. Use copyto with 'where' as the mask instead np.putmask(rec['x'],[True,False],10) /usr/local/lib/python2.6/site-packages/numpy/core/tests/test_multiarray.py:1061: DeprecationWarning: putmask has been deprecated. Use copyto with 'where' as the mask instead np.putmask(rec['y'],[True,False],11) .S/usr/local/lib/python2.6/site-packages/numpy/core/tests/test_multiarray.py:1395: DeprecationWarning: Setting NumPy dtype names is deprecated, the dtype will become immutable in a future version dt.names = ['p','q'] ..................................................................................................................................................................................................................................................................................................................................................................................../usr/local/lib/python2.6/site-packages/numpy/core/records.py:157: DeprecationWarning: DType strings 'O4' and 'O8' are deprecated because they are platform specific. 
Use 'O' instead dtype = sb.dtype(formats, aligned) ........................................................./usr/local/lib/python2.6/site-packages/numpy/core/tests/test_regression.py:1426: DeprecationWarning: Setting NumPy dtype names is deprecated, the dtype will become immutable in a future version ra.dtype.names = ('f1', 'f2') /usr/local/lib/python2.6/unittest.py:336: DeprecationWarning: Setting NumPy dtype names is deprecated, the dtype will become immutable in a future version callableObj(*args, **kwargs) ............../usr/local/lib/python2.6/site-packages/numpy/core/tests/test_regression.py:1017: DeprecationWarning: Setting NumPy dtype names is deprecated, the dtype will become immutable in a future version a.dtype.names = b ......................................................................................................................./usr/local/lib/python2.6/pickle.py:1133: DeprecationWarning: DType strings 'O4' and 'O8' are deprecated because they are platform specific. Use 'O' instead value = func(*args) ..........................................................................................K..................................................................................................K......................K..........................................................................................................S...................................../usr/local/lib/python2.6/site-packages/numpy/lib/_iotools.py:857: DeprecationWarning: Setting NumPy dtype names is deprecated, the dtype will become immutable in a future version ndtype.names = validate(ndtype.names, defaultfmt=defaultfmt) /usr/local/lib/python2.6/site-packages/numpy/lib/_iotools.py:854: DeprecationWarning: Setting NumPy dtype names is deprecated, the dtype will become immutable in a future version ndtype.names = validate([''] * nbtypes, defaultfmt=defaultfmt) /usr/local/lib/python2.6/site-packages/numpy/lib/_iotools.py:847: DeprecationWarning: Setting NumPy dtype names is deprecated, the dtype will become immutable in a future version defaultfmt=defaultfmt) ......................................................................................................................................................................................./usr/local/lib/python2.6/site-packages/numpy/lib/format.py:358: DeprecationWarning: DType strings 'O4' and 'O8' are deprecated because they are platform specific. Use 'O' instead dtype = numpy.dtype(d['descr']) /usr/local/lib/python2.6/site-packages/numpy/lib/format.py:449: DeprecationWarning: DType strings 'O4' and 'O8' are deprecated because they are platform specific. Use 'O' instead array = cPickle.load(fp) .............................................................................................................................................................................................................................................................................................................................../usr/local/lib/python2.6/site-packages/numpy/ma/core.py:366: DeprecationWarning: DType strings 'O4' and 'O8' are deprecated because they are platform specific. 
Use 'O' instead deflist.append(default_fill_value(np.dtype(currenttype))) ................/usr/local/lib/python2.6/site-packages/numpy/lib/npyio.py:1640: DeprecationWarning: Setting NumPy dtype names is deprecated, the dtype will become immutable in a future version dtype.names = names ........................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................ ---------------------------------------------------------------------- Ran 3064 tests in 22.795s OK (KNOWNFAIL=3, SKIP=2) $ python -c "import numpy; numpy.test()" Running unit tests for numpy NumPy version 2.0.0.dev-93236a2 NumPy is installed in /usr/lib64/python2.7/site-packages/numpy Python version 2.7 (r27:82500, Sep 16 2010, 18:02:00) [GCC 4.5.1 20100907 (Red Hat 4.5.1-3)] nose version 1.0.0 ......................../usr/lib64/python2.7/site-packages/numpy/core/tests/test_datetime.py:1313: UserWarning: Need pytz library to test datetime timezones warnings.warn("Need pytz library to test datetime timezones") 
...........................................................................................................................................................................................................................................................................................................................................................................................................S............................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................K..................................................................................................K......................K..........................................................................................................S................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................... ---------------------------------------------------------------------- Ran 3064 tests in 23.180s OK (KNOWNFAIL=3, SKIP=2) -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From ralf.gommers at googlemail.com Fri Aug 19 11:04:38 2011 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Fri, 19 Aug 2011 17:04:38 +0200 Subject: [Numpy-discussion] NA masks for NumPy are ready to test In-Reply-To: <4E4E7972.9060807@gmail.com> References: <4E4E7972.9060807@gmail.com> Message-ID: On Fri, Aug 19, 2011 at 4:55 PM, Bruce Southey wrote: > ** > > Hi, > I had to rebuild my Python2.6 as a 'normal' version. > > Anyhow, Python2.4, 2.5, 2.6 and 2.7 all build and pass the numpy tests. > > Curiously, only tests in Python2.7 give almost no warnings but all the > other Python2.x give lots of warnings - Python2.6 and Python2.7 are below. > My expectation is that all versions should behave the same regarding > printing messages. > This is due to a change in Python 2.7 itself - deprecation warnings are not shown anymore by default. Furthermore, all those messages are unrelated to Mark's missing data commits. Cheers, Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From jlconlin at gmail.com Fri Aug 19 11:09:26 2011 From: jlconlin at gmail.com (Jeremy Conlin) Date: Fri, 19 Aug 2011 09:09:26 -0600 Subject: [Numpy-discussion] How to start at line # x when using numpy.memmap In-Reply-To: References: Message-ID: On Fri, Aug 19, 2011 at 8:01 AM, Brent Pedersen wrote: > On Fri, Aug 19, 2011 at 7:29 AM, Jeremy Conlin wrote: >> On Fri, Aug 19, 2011 at 7:19 AM, Pauli Virtanen wrote: >>> Fri, 19 Aug 2011 07:00:31 -0600, Jeremy Conlin wrote: >>>> I would like to use numpy's memmap on some data files I have. The first >>>> 12 or so lines of the files contain text (header information) and the >>>> remainder has the numerical data. Is there a way I can tell memmap to >>>> skip a specified number of lines instead of a number of bytes? >>> >>> First use standard Python I/O functions to determine the number of >>> bytes to skip at the beginning and the number of data items. Then pass >>> in `offset` and `shape` parameters to numpy.memmap. >> >> Thanks for that suggestion. However, I'm unfamiliar with the I/O >> functions you are referring to. Can you point me to do the >> documentation? >> >> Thanks again, >> Jeremy >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> > > this might get you started: > > > import numpy as np > > # make some fake data with 12 header lines. > with open('test.mm', 'w') as fhw: > ? ?print >> fhw, "\n".join('header' for i in range(12)) > ? ?np.arange(100, dtype=np.uint).tofile(fhw) > > # use normal python io to determine of offset after 12 lines. > with open('test.mm') as fhr: > ? ?for i in range(12): fhr.readline() > ? ?offset = fhr.tell() > > # use the offset in your call to np.memmap. > a = np.memmap('test.mm', mode='r', dtype=np.uint, offset=offset) Thanks, that looks good. I tried it, but it doesn't get the correct data. I really don't understand what is going on. A simple code and sample data is attached if anyone has a chance to look at it. Thanks, Jeremy -------------- next part -------------- A non-text attachment was scrubbed... Name: tmp.dat Type: application/octet-stream Size: 1668 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: tmp.py Type: application/octet-stream Size: 429 bytes Desc: not available URL: From mwwiebe at gmail.com Fri Aug 19 11:14:39 2011 From: mwwiebe at gmail.com (Mark Wiebe) Date: Fri, 19 Aug 2011 08:14:39 -0700 Subject: [Numpy-discussion] NA masks for NumPy are ready to test In-Reply-To: <4E4E7972.9060807@gmail.com> References: <4E4E7972.9060807@gmail.com> Message-ID: On Fri, Aug 19, 2011 at 7:55 AM, Bruce Southey wrote: > ** > On 08/18/2011 04:43 PM, Mark Wiebe wrote: > > It's taken a lot of changes to get the NA mask support to its current > point, but the code ready for some testing now. You can read the > work-in-progress release notes here: > > > https://github.com/m-paradox/numpy/blob/missingdata/doc/release/2.0.0-notes.rst > > To try it out, check out the missingdata branch from my github account, > here, and build in the standard way: > > https://github.com/m-paradox/numpy > > The things most important to test are: > > * Confirm that existing code still works correctly. I've tested against > SciPy and matplotlib. > * Confirm that the performance of code not using NA masks is the same or > better. > * Try to do computations with the NA values, find places they don't work > yet, and nominate unimplemented functionality important to you to be next on > the development list. The release notes have a preliminary list of > implemented/unimplemented functions. > * Report any crashes, build problems, or unexpected behaviors. > > In addition to adding the NA mask, I've also added features and done a > few performance changes here and there, like letting reductions like sum > take lists of axes instead of being a single axis or all of them. These > changes affect various bugs like > http://projects.scipy.org/numpy/ticket/1143 and > http://projects.scipy.org/numpy/ticket/533. > > Thanks! > Mark > > Here's a small example run using NAs: > > >>> import numpy as np > >>> np.__version__ > '2.0.0.dev-8a5e2a1' > >>> a = np.random.rand(3,3,3) > >>> a.flags.maskna = True > >>> a[np.random.rand(3,3,3) < 0.5] = np.NA > >>> a > array([[[NA, NA, 0.11511708], > [ 0.46661454, 0.47565512, NA], > [NA, NA, NA]], > > [[NA, 0.57860351, NA], > [NA, NA, 0.72012669], > [ 0.36582123, NA, 0.76289794]], > > [[ 0.65322748, 0.92794386, NA], > [ 0.53745165, 0.97520989, 0.17515083], > [ 0.71219688, 0.5184328 , 0.75802805]]]) > >>> np.mean(a, axis=-1) > array([[NA, NA, NA], > [NA, NA, NA], > [NA, 0.56260412, 0.66288591]]) > >>> np.std(a, axis=-1) > array([[NA, NA, NA], > [NA, NA, NA], > [NA, 0.32710662, 0.10384331]]) > >>> np.mean(a, axis=-1, skipna=True) > /home/mwiebe/installtest/lib64/python2.7/site-packages/numpy/core/fromnumeric.py:2474: > RuntimeWarning: invalid value encountered in true_divide > um.true_divide(ret, rcount, out=ret, casting='unsafe') > array([[ 0.11511708, 0.47113483, nan], > [ 0.57860351, 0.72012669, 0.56435958], > [ 0.79058567, 0.56260412, 0.66288591]]) > >>> np.std(a, axis=-1, skipna=True) > /home/mwiebe/installtest/lib64/python2.7/site-packages/numpy/core/fromnumeric.py:2707: > RuntimeWarning: invalid value encountered in true_divide > um.true_divide(arrmean, rcount, out=arrmean, casting='unsafe') > /home/mwiebe/installtest/lib64/python2.7/site-packages/numpy/core/fromnumeric.py:2730: > RuntimeWarning: invalid value encountered in true_divide > um.true_divide(ret, rcount, out=ret, casting='unsafe') > array([[ 0. , 0.00452029, nan], > [ 0. , 0. 
, 0.19853835], > [ 0.13735819, 0.32710662, 0.10384331]]) > >>> np.std(a, axis=(1,2), skipna=True) > array([ 0.16786895, 0.15498008, 0.23811937]) > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.orghttp://mail.scipy.org/mailman/listinfo/numpy-discussion > > Hi, > I had to rebuild my Python2.6 as a 'normal' version. > > Anyhow, Python2.4, 2.5, 2.6 and 2.7 all build and pass the numpy tests. > Thanks for running the tests! > > Curiously, only tests in Python2.7 give almost no warnings but all the > other Python2.x give lots of warnings - Python2.6 and Python2.7 are below. > My expectation is that all versions should behave the same regarding > printing messages. > The lack of deprecation warnings is because you need to add -Wd explicitly when you run under 2.7. There was an idea to make this the default from within the test suite execution code, but no one has stepped up and implemented that. See here: http://projects.scipy.org/numpy/ticket/1894 > Also the message 'Need pytz library to test datetime timezones' means that > there are invalid tests that have to be rewritten (ticket 1939: > http://projects.scipy.org/numpy/ticket/1939 ). > I did it this way because Python has no timezone objects built in, just provides the interface. If someone is willing to copy or write timezone instances into the testsuite to fix this I would be very grateful! I think all these policies I keep breaking should be written down somewhere. I don't think it's reasonable to call something a community/project policy unless a particular wording of it in an easily discoverable official document has been agreed upon by the community. I nominate this as a new policy. ;) Thanks, Mark > > Bruce > > $ python2.6 -c "import numpy; numpy.test()" > Running unit tests for numpy > NumPy version 2.0.0.dev-93236a2 > NumPy is installed in /usr/local/lib/python2.6/site-packages/numpy > Python version 2.6.6 (r266:84292, Aug 19 2011, 09:21:38) [GCC 4.5.1 > 20100924 (Red Hat 4.5.1-4)] > nose version 1.0.0 > ......................../usr/local/lib/python2.6/site-packages/numpy/core/tests/test_datetime.py:1313: > UserWarning: Need pytz library to test datetime timezones > warnings.warn("Need pytz library to test datetime timezones") > .........................................................................................................................../usr/local/lib/python2.6/unittest.py:336: > DeprecationWarning: DType strings 'O4' and 'O8' are deprecated because they > are platform specific. Use 'O' instead > callableObj(*args, **kwargs) > ............................................................................................................................................................................................................./usr/local/lib/python2.6/site-packages/numpy/core/_internal.py:555: > DeprecationWarning: Setting NumPy dtype names is deprecated, the dtype will > become immutable in a future version > value.names = tuple(names) > ...../usr/local/lib/python2.6/site-packages/numpy/core/tests/test_multiarray.py:1912: > DeprecationWarning: Setting NumPy dtype names is deprecated, the dtype will > become immutable in a future version > dt.names = tuple(names) > ...../usr/local/lib/python2.6/site-packages/numpy/core/tests/test_multiarray.py:804: > DeprecationWarning: DType strings 'O4' and 'O8' are deprecated because they > are platform specific. 
Use 'O' instead > return loads(obj) > ..../usr/local/lib/python2.6/site-packages/numpy/core/tests/test_multiarray.py:1046: > DeprecationWarning: putmask has been deprecated. Use copyto with 'where' as > the mask instead > np.putmask(x,[True,False,True],-1) > ../usr/local/lib/python2.6/site-packages/numpy/core/tests/test_multiarray.py:1025: > DeprecationWarning: putmask has been deprecated. Use copyto with 'where' as > the mask instead > np.putmask(x, mask, val) > ................................................/usr/local/lib/python2.6/unittest.py:336: > DeprecationWarning: putmask has been deprecated. Use copyto with 'where' as > the mask instead > callableObj(*args, **kwargs) > ../usr/local/lib/python2.6/site-packages/numpy/core/tests/test_multiarray.py:1057: > DeprecationWarning: putmask has been deprecated. Use copyto with 'where' as > the mask instead > np.putmask(rec['x'],[True,False],10) > /usr/local/lib/python2.6/site-packages/numpy/core/tests/test_multiarray.py:1061: > DeprecationWarning: putmask has been deprecated. Use copyto with 'where' as > the mask instead > np.putmask(rec['y'],[True,False],11) > .S/usr/local/lib/python2.6/site-packages/numpy/core/tests/test_multiarray.py:1395: > DeprecationWarning: Setting NumPy dtype names is deprecated, the dtype will > become immutable in a future version > dt.names = ['p','q'] > ..................................................................................................................................................................................................................................................................................................................................................................................../usr/local/lib/python2.6/site-packages/numpy/core/records.py:157: > DeprecationWarning: DType strings 'O4' and 'O8' are deprecated because they > are platform specific. Use 'O' instead > dtype = sb.dtype(formats, aligned) > ........................................................./usr/local/lib/python2.6/site-packages/numpy/core/tests/test_regression.py:1426: > DeprecationWarning: Setting NumPy dtype names is deprecated, the dtype will > become immutable in a future version > ra.dtype.names = ('f1', 'f2') > /usr/local/lib/python2.6/unittest.py:336: DeprecationWarning: Setting NumPy > dtype names is deprecated, the dtype will become immutable in a future > version > callableObj(*args, **kwargs) > ............../usr/local/lib/python2.6/site-packages/numpy/core/tests/test_regression.py:1017: > DeprecationWarning: Setting NumPy dtype names is deprecated, the dtype will > become immutable in a future version > a.dtype.names = b > ......................................................................................................................./usr/local/lib/python2.6/pickle.py:1133: > DeprecationWarning: DType strings 'O4' and 'O8' are deprecated because they > are platform specific. 
Use 'O' instead > value = func(*args) > ..........................................................................................K..................................................................................................K......................K..........................................................................................................S...................................../usr/local/lib/python2.6/site-packages/numpy/lib/_iotools.py:857: > DeprecationWarning: Setting NumPy dtype names is deprecated, the dtype will > become immutable in a future version > ndtype.names = validate(ndtype.names, defaultfmt=defaultfmt) > /usr/local/lib/python2.6/site-packages/numpy/lib/_iotools.py:854: > DeprecationWarning: Setting NumPy dtype names is deprecated, the dtype will > become immutable in a future version > ndtype.names = validate([''] * nbtypes, defaultfmt=defaultfmt) > /usr/local/lib/python2.6/site-packages/numpy/lib/_iotools.py:847: > DeprecationWarning: Setting NumPy dtype names is deprecated, the dtype will > become immutable in a future version > defaultfmt=defaultfmt) > ......................................................................................................................................................................................./usr/local/lib/python2.6/site-packages/numpy/lib/format.py:358: > DeprecationWarning: DType strings 'O4' and 'O8' are deprecated because they > are platform specific. Use 'O' instead > dtype = numpy.dtype(d['descr']) > /usr/local/lib/python2.6/site-packages/numpy/lib/format.py:449: > DeprecationWarning: DType strings 'O4' and 'O8' are deprecated because they > are platform specific. Use 'O' instead > array = cPickle.load(fp) > .............................................................................................................................................................................................................................................................................................................................../usr/local/lib/python2.6/site-packages/numpy/ma/core.py:366: > DeprecationWarning: DType strings 'O4' and 'O8' are deprecated because they > are platform specific. Use 'O' instead > deflist.append(default_fill_value(np.dtype(currenttype))) > ................/usr/local/lib/python2.6/site-packages/numpy/lib/npyio.py:1640: > DeprecationWarning: Setting NumPy dtype names is deprecated, the dtype will > become immutable in a future version > dtype.names = names > .............................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................. 
> .......................................................................................................................................................................................................................... > ---------------------------------------------------------------------- > Ran 3064 tests in 22.795s > > OK (KNOWNFAIL=3, SKIP=2) > $ python -c "import numpy; numpy.test()" > Running unit tests for numpy > NumPy version 2.0.0.dev-93236a2 > NumPy is installed in /usr/lib64/python2.7/site-packages/numpy > Python version 2.7 (r27:82500, Sep 16 2010, 18:02:00) [GCC 4.5.1 20100907 > (Red Hat 4.5.1-3)] > nose version 1.0.0 > ......................../usr/lib64/python2.7/site-packages/numpy/core/tests/test_datetime.py:1313: > UserWarning: Need pytz library to test datetime timezones > warnings.warn("Need pytz library to test datetime timezones") > ...........................................................................................................................................................................................................................................................................................................................................................................................................S.................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................. > ..........................................................K..................................................................................................K......................K..........................................................................................................S.............................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................. 
> .............................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................. > ....................................................................... > ---------------------------------------------------------------------- > Ran 3064 tests in 23.180s > > OK (KNOWNFAIL=3, SKIP=2) > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bpederse at gmail.com Fri Aug 19 11:18:12 2011 From: bpederse at gmail.com (Brent Pedersen) Date: Fri, 19 Aug 2011 09:18:12 -0600 Subject: [Numpy-discussion] How to start at line # x when using numpy.memmap In-Reply-To: References: Message-ID: On Fri, Aug 19, 2011 at 9:09 AM, Jeremy Conlin wrote: > On Fri, Aug 19, 2011 at 8:01 AM, Brent Pedersen wrote: >> On Fri, Aug 19, 2011 at 7:29 AM, Jeremy Conlin wrote: >>> On Fri, Aug 19, 2011 at 7:19 AM, Pauli Virtanen wrote: >>>> Fri, 19 Aug 2011 07:00:31 -0600, Jeremy Conlin wrote: >>>>> I would like to use numpy's memmap on some data files I have. The first >>>>> 12 or so lines of the files contain text (header information) and the >>>>> remainder has the numerical data. Is there a way I can tell memmap to >>>>> skip a specified number of lines instead of a number of bytes? >>>> >>>> First use standard Python I/O functions to determine the number of >>>> bytes to skip at the beginning and the number of data items. Then pass >>>> in `offset` and `shape` parameters to numpy.memmap. >>> >>> Thanks for that suggestion. However, I'm unfamiliar with the I/O >>> functions you are referring to. Can you point me to do the >>> documentation? >>> >>> Thanks again, >>> Jeremy >>> _______________________________________________ >>> NumPy-Discussion mailing list >>> NumPy-Discussion at scipy.org >>> http://mail.scipy.org/mailman/listinfo/numpy-discussion >>> >> >> this might get you started: >> >> >> import numpy as np >> >> # make some fake data with 12 header lines. >> with open('test.mm', 'w') as fhw: >> ? ?print >> fhw, "\n".join('header' for i in range(12)) >> ? ?np.arange(100, dtype=np.uint).tofile(fhw) >> >> # use normal python io to determine of offset after 12 lines. >> with open('test.mm') as fhr: >> ? ?for i in range(12): fhr.readline() >> ? ?offset = fhr.tell() >> >> # use the offset in your call to np.memmap. >> a = np.memmap('test.mm', mode='r', dtype=np.uint, offset=offset) > > Thanks, that looks good. I tried it, but it doesn't get the correct > data. I really don't understand what is going on. 
A simple code and > sample data is attached if anyone has a chance to look at it. > > Thanks, > Jeremy > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > in that case, i would use: np.loadtxt('tmp.dat', skiprows=12) From bsouthey at gmail.com Fri Aug 19 11:23:37 2011 From: bsouthey at gmail.com (Bruce Southey) Date: Fri, 19 Aug 2011 10:23:37 -0500 Subject: [Numpy-discussion] NA masks for NumPy are ready to test In-Reply-To: References: <4E4E7972.9060807@gmail.com> Message-ID: <4E4E7FF9.6030006@gmail.com> On 08/19/2011 10:04 AM, Ralf Gommers wrote: > > > On Fri, Aug 19, 2011 at 4:55 PM, Bruce Southey > wrote: > > Hi, > I had to rebuild my Python2.6 as a 'normal' version. > > Anyhow, Python2.4, 2.5, 2.6 and 2.7 all build and pass the numpy > tests. > > Curiously, only tests in Python2.7 give almost no warnings but all > the other Python2.x give lots of warnings - Python2.6 and > Python2.7 are below. My expectation is that all versions should > behave the same regarding printing messages. > > > This is due to a change in Python 2.7 itself - deprecation warnings > are not shown anymore by default. Furthermore, all those messages are > unrelated to Mark's missing data commits. > > Cheers, > Ralf > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion Yet: $ python2.6 -c "import numpy; numpy.test()" Running unit tests for numpy NumPy version 1.6.1 NumPy is installed in /usr/local/lib/python2.6/site-packages/numpy Python version 2.6.6 (r266:84292, Aug 19 2011, 09:21:38) [GCC 4.5.1 20100924 (Red Hat 4.5.1-4)] nose version 1.0.0 
......................................................................................................K...................................................K......................K......................................................................................................
---------------------------------------------------------------------- Ran 3533 tests in 22.062s OK (KNOWNFAIL=3) Hence why I was curious about all the messages having not seen them. Is there some plan to cleanup these tests rather than 'hide' them? Bruce -------------- next part -------------- An HTML attachment was scrubbed... URL: From warren.weckesser at enthought.com Fri Aug 19 11:23:34 2011 From: warren.weckesser at enthought.com (Warren Weckesser) Date: Fri, 19 Aug 2011 10:23:34 -0500 Subject: [Numpy-discussion] How to start at line # x when using numpy.memmap In-Reply-To: References: Message-ID: On Fri, Aug 19, 2011 at 10:09 AM, Jeremy Conlin wrote: > On Fri, Aug 19, 2011 at 8:01 AM, Brent Pedersen > wrote: > > On Fri, Aug 19, 2011 at 7:29 AM, Jeremy Conlin > wrote: > >> On Fri, Aug 19, 2011 at 7:19 AM, Pauli Virtanen wrote: > >>> Fri, 19 Aug 2011 07:00:31 -0600, Jeremy Conlin wrote: > >>>> I would like to use numpy's memmap on some data files I have. The > first > >>>> 12 or so lines of the files contain text (header information) and the > >>>> remainder has the numerical data. Is there a way I can tell memmap to > >>>> skip a specified number of lines instead of a number of bytes? > >>> > >>> First use standard Python I/O functions to determine the number of > >>> bytes to skip at the beginning and the number of data items. Then pass > >>> in `offset` and `shape` parameters to numpy.memmap. > >> > >> Thanks for that suggestion. However, I'm unfamiliar with the I/O > >> functions you are referring to. Can you point me to do the > >> documentation? > >> > >> Thanks again, > >> Jeremy > >> _______________________________________________ > >> NumPy-Discussion mailing list > >> NumPy-Discussion at scipy.org > >> http://mail.scipy.org/mailman/listinfo/numpy-discussion > >> > > > > this might get you started: > > > > > > import numpy as np > > > > # make some fake data with 12 header lines. > > with open('test.mm', 'w') as fhw: > > print >> fhw, "\n".join('header' for i in range(12)) > > np.arange(100, dtype=np.uint).tofile(fhw) > > > > # use normal python io to determine of offset after 12 lines. > > with open('test.mm') as fhr: > > for i in range(12): fhr.readline() > > offset = fhr.tell() > > > > # use the offset in your call to np.memmap. > > a = np.memmap('test.mm', mode='r', dtype=np.uint, offset=offset) > > Thanks, that looks good. I tried it, but it doesn't get the correct > data. I really don't understand what is going on. A simple code and > sample data is attached if anyone has a chance to look at it. > Your data file is all text. memmap is generally for binary data; it won't work with this file. Warren > > Thanks, > Jeremy > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From jlconlin at gmail.com Fri Aug 19 11:26:02 2011 From: jlconlin at gmail.com (Jeremy Conlin) Date: Fri, 19 Aug 2011 09:26:02 -0600 Subject: [Numpy-discussion] How to start at line # x when using numpy.memmap In-Reply-To: References: Message-ID: On Fri, Aug 19, 2011 at 9:23 AM, Warren Weckesser wrote: > > > On Fri, Aug 19, 2011 at 10:09 AM, Jeremy Conlin wrote: >> >> On Fri, Aug 19, 2011 at 8:01 AM, Brent Pedersen >> wrote: >> > On Fri, Aug 19, 2011 at 7:29 AM, Jeremy Conlin >> > wrote: >> >> On Fri, Aug 19, 2011 at 7:19 AM, Pauli Virtanen wrote: >> >>> Fri, 19 Aug 2011 07:00:31 -0600, Jeremy Conlin wrote: >> >>>> I would like to use numpy's memmap on some data files I have. The >> >>>> first >> >>>> 12 or so lines of the files contain text (header information) and the >> >>>> remainder has the numerical data. Is there a way I can tell memmap to >> >>>> skip a specified number of lines instead of a number of bytes? >> >>> >> >>> First use standard Python I/O functions to determine the number of >> >>> bytes to skip at the beginning and the number of data items. Then pass >> >>> in `offset` and `shape` parameters to numpy.memmap. >> >> >> >> Thanks for that suggestion. However, I'm unfamiliar with the I/O >> >> functions you are referring to. Can you point me to do the >> >> documentation? >> >> >> >> Thanks again, >> >> Jeremy >> >> _______________________________________________ >> >> NumPy-Discussion mailing list >> >> NumPy-Discussion at scipy.org >> >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> >> > >> > this might get you started: >> > >> > >> > import numpy as np >> > >> > # make some fake data with 12 header lines. >> > with open('test.mm', 'w') as fhw: >> > ? ?print >> fhw, "\n".join('header' for i in range(12)) >> > ? ?np.arange(100, dtype=np.uint).tofile(fhw) >> > >> > # use normal python io to determine of offset after 12 lines. >> > with open('test.mm') as fhr: >> > ? ?for i in range(12): fhr.readline() >> > ? ?offset = fhr.tell() >> > >> > # use the offset in your call to np.memmap. >> > a = np.memmap('test.mm', mode='r', dtype=np.uint, offset=offset) >> >> Thanks, that looks good. I tried it, but it doesn't get the correct >> data. I really don't understand what is going on. A simple code and >> sample data is attached if anyone has a chance to look at it. > > > Your data file is all text.? memmap is generally for binary data; it won't > work with this file. > > Warren Yikes! I missed the "binary" in the first line of the documentation. Sorry! Jeremy From ralf.gommers at googlemail.com Fri Aug 19 11:27:43 2011 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Fri, 19 Aug 2011 17:27:43 +0200 Subject: [Numpy-discussion] NA masks for NumPy are ready to test In-Reply-To: <4E4E7FF9.6030006@gmail.com> References: <4E4E7972.9060807@gmail.com> <4E4E7FF9.6030006@gmail.com> Message-ID: On Fri, Aug 19, 2011 at 5:23 PM, Bruce Southey wrote: > ** > On 08/19/2011 10:04 AM, Ralf Gommers wrote: > > > > On Fri, Aug 19, 2011 at 4:55 PM, Bruce Southey wrote: > >> Hi, >> I had to rebuild my Python2.6 as a 'normal' version. >> >> Anyhow, Python2.4, 2.5, 2.6 and 2.7 all build and pass the numpy tests. >> >> Curiously, only tests in Python2.7 give almost no warnings but all the >> other Python2.x give lots of warnings - Python2.6 and Python2.7 are below. >> My expectation is that all versions should behave the same regarding >> printing messages. >> > > This is due to a change in Python 2.7 itself - deprecation warnings are not > shown anymore by default. 
Furthermore, all those messages are unrelated to > Mark's missing data commits. > > Cheers, > Ralf > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.orghttp://mail.scipy.org/mailman/listinfo/numpy-discussion > > Yet: > > $ python2.6 -c "import numpy; numpy.test()" > Running unit tests for numpy > NumPy version 1.6.1 > > NumPy is installed in /usr/local/lib/python2.6/site-packages/numpy > Python version 2.6.6 (r266:84292, Aug 19 2011, 09:21:38) [GCC 4.5.1 > 20100924 (Red Hat 4.5.1-4)] > nose version 1.0.0 > ..............................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................K............................... > ..................................................................K......................K.................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................... > .............................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................. 
> ................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................... > ---------------------------------------------------------------------- > Ran 3533 tests in 22.062s > > OK (KNOWNFAIL=3) > > Hence why I was curious about all the messages having not seen them. > > Is there some plan to cleanup these tests rather than 'hide' them? > > Yes, that happens before every release. Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From alok at merfinllc.com Fri Aug 19 11:41:59 2011 From: alok at merfinllc.com (Alok Singhal) Date: Fri, 19 Aug 2011 08:41:59 -0700 Subject: [Numpy-discussion] longlong format error with Python <= 2.6 in scalartypes.c In-Reply-To: References: <8D5A8864-6827-4164-B8F6-198000B7491D@astro.physik.uni-goettingen.de> Message-ID: On Thu, Aug 18, 2011 at 9:01 PM, Mark Wiebe wrote: > On Thu, Aug 4, 2011 at 4:08 PM, Derek Homeier > wrote: >> >> Hi, >> >> commits c15a807e and c135371e (thus most immediately addressed to Mark, >> but I am sending this to the list hoping for more insight on the issue) >> introduce a test failure with Python 2.5+2.6 on Mac: >> >> FAIL: test_timedelta_scalar_construction (test_datetime.TestDateTime) >> ---------------------------------------------------------------------- >> Traceback (most recent call last): >> ?File >> "/Users/derek/lib/python2.6/site-packages/numpy/core/tests/test_datetime.py", >> line 219, in test_timedelta_scalar_construction >> ? ?assert_equal(str(np.timedelta64(3, 's')), '3 seconds') >> ?File "/Users/derek/lib/python2.6/site-packages/numpy/testing/utils.py", >> line 313, in assert_equal >> ? ?raise AssertionError(msg) >> AssertionError: >> Items are not equal: >> ?ACTUAL: '%lld seconds' >> ?DESIRED: '3 seconds' >> >> due to the "lld" format passed to PyUString_FromFormat in scalartypes.c. >> In the current npy_common.h I found the comment >> ?* ? ? ?in Python 2.6 the %lld formatter is not supported. In this >> ?* ? ? ?case we work around the problem by using the %zd formatter. >> though I did not notice that problem when I cleaned up the >> NPY_LONGLONG_FMT definitions in that file (and it is not entirely clear >> whether the comment only pertains to Windows...). Anyway changing the >> formatters in scalartypes.c to "zd" as well removes the failure and still >> works with Python 2.7 and 3.2 (at least on Mac OS). However I am wondering >> if >> a) NPY_[U]LONGLONG_FMT should also be defined conditional to the Python >> version (and if "%zu" is a valid formatter), and >> b) scalartypes.c should use NPY_LONGLONG_FMT from npy_common.h >> >> I am attaching a patch implementing a), but only the quick and dirty >> solution to b). > > I've touched this stuff as little as possible, because I rather dislike the > way the *_FMT macros are set up right now. I added a comment about > NPY_INTP_FMT in npy_common.h which I see you read. If you're going to try to > fix this, I hope you fix it deeper than this patch so it's not error-prone > anymore. 
> NPY_INTP_FMT is used together with PyErr_Format/PyString_FromFormat, whereas > the other *_FMT are used with the *printf functions from the C libraries. > These are not compatible, and the %zd hack was put in place because it > exists even in Python 2.4, and Py_ssize_t seems matches the ?pointer size in > all CPython versions. > Switching the timedelta64 format in scalartypes.c.src to "%zd" won't help on > 32-bit platforms, because it won't be a 64-bit type there, unlike how it > works ok for the NPY_INTP_FMT. In summary: > * There need to be changes to create a clear distinction between the *_FMT > for PyString_FromFormat vs the *_FMT for C library *printf functions > * I suspect we're out of luck for 32-bit older versions of CPython with > PyString_FromFormat > Cheers, > -Mark By the way, the above bug is fixed in the current master (see https://github.com/numpy/numpy/commit/730b861120094b1ab38670b9a8895a36c19296a7). I fixed it in the most direct way possible, because "the correct" way would require changes to a lot of places. From mwwiebe at gmail.com Fri Aug 19 11:48:54 2011 From: mwwiebe at gmail.com (Mark Wiebe) Date: Fri, 19 Aug 2011 08:48:54 -0700 Subject: [Numpy-discussion] NA masks for NumPy are ready to test In-Reply-To: <4E4E701C.1030305@gmail.com> References: <4E4E701C.1030305@gmail.com> Message-ID: On Fri, Aug 19, 2011 at 7:15 AM, Bruce Southey wrote: > ** > On 08/18/2011 04:43 PM, Mark Wiebe wrote: > > It's taken a lot of changes to get the NA mask support to its current > point, but the code ready for some testing now. You can read the > work-in-progress release notes here: > > > https://github.com/m-paradox/numpy/blob/missingdata/doc/release/2.0.0-notes.rst > > To try it out, check out the missingdata branch from my github account, > here, and build in the standard way: > > https://github.com/m-paradox/numpy > > The things most important to test are: > > * Confirm that existing code still works correctly. I've tested against > SciPy and matplotlib. > * Confirm that the performance of code not using NA masks is the same or > better. > * Try to do computations with the NA values, find places they don't work > yet, and nominate unimplemented functionality important to you to be next on > the development list. The release notes have a preliminary list of > implemented/unimplemented functions. > * Report any crashes, build problems, or unexpected behaviors. > > In addition to adding the NA mask, I've also added features and done a > few performance changes here and there, like letting reductions like sum > take lists of axes instead of being a single axis or all of them. These > changes affect various bugs like > http://projects.scipy.org/numpy/ticket/1143 and > http://projects.scipy.org/numpy/ticket/533. > > Thanks! 
> Mark > > Here's a small example run using NAs: > > >>> import numpy as np > >>> np.__version__ > '2.0.0.dev-8a5e2a1' > >>> a = np.random.rand(3,3,3) > >>> a.flags.maskna = True > >>> a[np.random.rand(3,3,3) < 0.5] = np.NA > >>> a > array([[[NA, NA, 0.11511708], > [ 0.46661454, 0.47565512, NA], > [NA, NA, NA]], > > [[NA, 0.57860351, NA], > [NA, NA, 0.72012669], > [ 0.36582123, NA, 0.76289794]], > > [[ 0.65322748, 0.92794386, NA], > [ 0.53745165, 0.97520989, 0.17515083], > [ 0.71219688, 0.5184328 , 0.75802805]]]) > >>> np.mean(a, axis=-1) > array([[NA, NA, NA], > [NA, NA, NA], > [NA, 0.56260412, 0.66288591]]) > >>> np.std(a, axis=-1) > array([[NA, NA, NA], > [NA, NA, NA], > [NA, 0.32710662, 0.10384331]]) > >>> np.mean(a, axis=-1, skipna=True) > /home/mwiebe/installtest/lib64/python2.7/site-packages/numpy/core/fromnumeric.py:2474: > RuntimeWarning: invalid value encountered in true_divide > um.true_divide(ret, rcount, out=ret, casting='unsafe') > array([[ 0.11511708, 0.47113483, nan], > [ 0.57860351, 0.72012669, 0.56435958], > [ 0.79058567, 0.56260412, 0.66288591]]) > >>> np.std(a, axis=-1, skipna=True) > /home/mwiebe/installtest/lib64/python2.7/site-packages/numpy/core/fromnumeric.py:2707: > RuntimeWarning: invalid value encountered in true_divide > um.true_divide(arrmean, rcount, out=arrmean, casting='unsafe') > /home/mwiebe/installtest/lib64/python2.7/site-packages/numpy/core/fromnumeric.py:2730: > RuntimeWarning: invalid value encountered in true_divide > um.true_divide(ret, rcount, out=ret, casting='unsafe') > array([[ 0. , 0.00452029, nan], > [ 0. , 0. , 0.19853835], > [ 0.13735819, 0.32710662, 0.10384331]]) > >>> np.std(a, axis=(1,2), skipna=True) > array([ 0.16786895, 0.15498008, 0.23811937]) > > > _______________________________________________ > NumPy-Discussion mailing listNumPy-Discussion at scipy.orghttp://mail.scipy.org/mailman/listinfo/numpy-discussion > > Hi, > That is great news! > (Python2.x will be another email.) > > Python3.1 and Python3.2 failed with building 'multiarraymodule_onefile.o' > but I could not see any obvious reason. > I've pushed a change to fix the Python 3 build, it was a use of Py_TPFLAGS_CHECKTYPES, which is no longer in Python3 but is always default now. Tested with 3.2. Thanks! Mark > > I had removed my build directory and then 'python3 setup.py build' but I > saw this message: > Running from numpy source directory. > numpy/core/setup_common.py:86: MismatchCAPIWarning: API mismatch detected, > the C API version numbers have to be updated. Current C api version is 6, > with checksum ef5688af03ffa23dd8e11734f5b69313, but recorded checksum for C > API version 6 in codegen_dir/cversions.txt is > e61d5dc51fa1c6459328266e215d6987. If functions were added in the C API, you > have to update C_API_VERSION in numpy/core/setup_common.py. > MismatchCAPIWarning) > > Upstream of the build log is below. > > Bruce > > In file included from > numpy/core/src/multiarray/multiarraymodule_onefile.c:53:0: > numpy/core/src/multiarray/na_singleton.c: At top level: > numpy/core/src/multiarray/na_singleton.c:708:25: error: > ?Py_TPFLAGS_CHECKTYPES? undeclared here (not in a function) > numpy/core/src/multiarray/common.c:48:1: warning: ?_use_default_type? > defined but not used > numpy/core/src/multiarray/ctors.h:93:1: warning: ?_arrays_overlap? declared > ?static? but never defined > numpy/core/src/multiarray/scalartypes.c.src:2251:1: warning: > ?gentype_getsegcount? defined but not used > numpy/core/src/multiarray/scalartypes.c.src:2269:1: warning: > ?gentype_getcharbuf? 
defined but not used > numpy/core/src/multiarray/mapping.c:110:1: warning: ?_array_ass_item? > defined but not used > numpy/core/src/multiarray/number.c:266:1: warning: ?array_divide? defined > but not used > numpy/core/src/multiarray/number.c:464:1: warning: ?array_inplace_divide? > defined but not used > numpy/core/src/multiarray/buffer.c:25:1: warning: ?array_getsegcount? > defined but not used > numpy/core/src/multiarray/buffer.c:58:1: warning: ?array_getwritebuf? > defined but not used > numpy/core/src/multiarray/buffer.c:71:1: warning: ?array_getcharbuf? > defined but not used > numpy/core/src/multiarray/na_mask.c:681:1: warning: > ?PyArray_GetMaskInversionFunction? defined but not used > In file included from numpy/core/src/multiarray/scalartypes.c.src:25:0, > from > numpy/core/src/multiarray/multiarraymodule_onefile.c:10: > numpy/core/src/multiarray/_datetime.h:9:1: warning: function declaration > isn?t a prototype > In file included from > numpy/core/src/multiarray/multiarraymodule_onefile.c:13:0: > numpy/core/src/multiarray/datetime.c:33:1: warning: function declaration > isn?t a prototype > In file included from > numpy/core/src/multiarray/multiarraymodule_onefile.c:17:0: > numpy/core/src/multiarray/arraytypes.c.src: In function ?VOID_getitem?: > numpy/core/src/multiarray/arraytypes.c.src:643:9: warning: passing argument > 2 of ?PyArray_SetBaseObject? from incompatible pointer type > build/src.linux-x86_64-3.2/numpy/core/include/numpy/__multiarray_api.h:763:12: > note: expected ?struct PyObject *? but argument is of type ?struct > PyArrayObject *? > In file included from > numpy/core/src/multiarray/multiarraymodule_onefile.c:44:0: > numpy/core/src/multiarray/nditer_pywrap.c: In function ?npyiter_subscript?: > numpy/core/src/multiarray/nditer_pywrap.c:2395:29: warning: passing > argument 1 of ?PySlice_GetIndices? from incompatible pointer type > /usr/local/include/python3.2m/sliceobject.h:38:5: note: expected ?struct > PyObject *? but argument is of type ?struct PySliceObject *? > numpy/core/src/multiarray/nditer_pywrap.c: In function > ?npyiter_ass_subscript?: > numpy/core/src/multiarray/nditer_pywrap.c:2440:29: warning: passing > argument 1 of ?PySlice_GetIndices? from incompatible pointer type > /usr/local/include/python3.2m/sliceobject.h:38:5: note: expected ?struct > PyObject *? but argument is of type ?struct PySliceObject *? > In file included from > numpy/core/src/multiarray/multiarraymodule_onefile.c:53:0: > numpy/core/src/multiarray/na_singleton.c: At top level: > numpy/core/src/multiarray/na_singleton.c:708:25: error: > ?Py_TPFLAGS_CHECKTYPES? undeclared here (not in a function) > numpy/core/src/multiarray/common.c:48:1: warning: ?_use_default_type? > defined but not used > numpy/core/src/multiarray/ctors.h:93:1: warning: ?_arrays_overlap? declared > ?static? but never defined > numpy/core/src/multiarray/scalartypes.c.src:2251:1: warning: > ?gentype_getsegcount? defined but not used > numpy/core/src/multiarray/scalartypes.c.src:2269:1: warning: > ?gentype_getcharbuf? defined but not used > numpy/core/src/multiarray/mapping.c:110:1: warning: ?_array_ass_item? > defined but not used > numpy/core/src/multiarray/number.c:266:1: warning: ?array_divide? defined > but not used > numpy/core/src/multiarray/number.c:464:1: warning: ?array_inplace_divide? > defined but not used > numpy/core/src/multiarray/buffer.c:25:1: warning: ?array_getsegcount? > defined but not used > numpy/core/src/multiarray/buffer.c:58:1: warning: ?array_getwritebuf? 
> defined but not used > numpy/core/src/multiarray/buffer.c:71:1: warning: ?array_getcharbuf? > defined but not used > numpy/core/src/multiarray/na_mask.c:681:1: warning: > ?PyArray_GetMaskInversionFunction? defined but not used > error: Command "gcc -pthread -DNDEBUG -g -fwrapv -O3 -Wall > -Wstrict-prototypes -fPIC -Inumpy/core/include > -Ibuild/src.linux-x86_64-3.2/numpy/core/include/numpy > -Inumpy/core/src/private -Inumpy/core/src -Inumpy/core > -Inumpy/core/src/npymath -Inumpy/core/src/multiarray -Inumpy/core/src/umath > -Inumpy/core/src/npysort -Inumpy/core/include > -I/usr/local/include/python3.2m > -Ibuild/src.linux-x86_64-3.2/numpy/core/src/multiarray > -Ibuild/src.linux-x86_64-3.2/numpy/core/src/umath -c > numpy/core/src/multiarray/multiarraymodule_onefile.c -o > build/temp.linux-x86_64-3.2/numpy/core/src/multiarray/multiarraymodule_onefile.o" > failed with exit status 1 > > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ben.root at ou.edu Fri Aug 19 11:50:50 2011 From: ben.root at ou.edu (Benjamin Root) Date: Fri, 19 Aug 2011 10:50:50 -0500 Subject: [Numpy-discussion] Can't mix np.newaxis with boolean indexing Message-ID: I could have sworn that this use to work: import numpy as np a = np.random.random((100,)) b = (a > 0.5) print a[b, np.newaxis] But instead, I get this error on the latest master: Traceback (most recent call last): File "", line 1, in TypeError: long() argument must be a string or a number, not 'NoneType' Note, the simple work-around would be "a[b][:, np.newaxis]", but I can't imagine why the intuitive syntax would not be valid. Thanks, Ben Root -------------- next part -------------- An HTML attachment was scrubbed... URL: From rjd4+numpy at cam.ac.uk Fri Aug 19 12:14:15 2011 From: rjd4+numpy at cam.ac.uk (Bob Dowling) Date: Fri, 19 Aug 2011 17:14:15 +0100 Subject: [Numpy-discussion] summing an array In-Reply-To: <4E4E77FD.8070107@simplistix.co.uk> References: <4E4D1F5A.2000205@simplistix.co.uk> <4E4D2891.4@cam.ac.uk> <4E4E77FD.8070107@simplistix.co.uk> Message-ID: <4E4E8BD7.8020201@cam.ac.uk> On 19/08/11 15:49, Chris Withers wrote: > On 18/08/2011 07:58, Bob Dowling wrote: >> >> >>> numpy.add.accumulate(a) >> array([ 0, 1, 3, 6, 10]) >> >> >>> numpy.add.accumulate(a, out=a) >> array([ 0, 1, 3, 6, 10]) > > What's the difference between numpy.cumsum and numpy.add.accumulate? I think they're equivalent, with numpy.cumprod() serving for numpy.multiply.accumulate() I have a prefeence for general procedures rather than special short cuts. The numpy..accumulate works for any of the binary ufuncs I think. The cumsum() and cumprod() functions only exist for add and multiply. e.g. >>> a = numpy.arange(2,5) >>> a array([2, 3, 4]) >>> numpy.power.accumulate(a) array([ 2, 8, 4096]) > Where can I find the reference docs for these? help(numpy.ufunc) help(numpy.ufunc.accumulate) is where I started. 
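A minimal interactive sketch of the equivalence described above (assuming a recent NumPy; the running-maximum line is an added illustration of accumulate working with any binary ufunc, not an example taken from the thread):

>>> import numpy as np
>>> a = np.arange(5)
>>> np.cumsum(a)                 # convenience spelling
array([ 0,  1,  3,  6, 10])
>>> np.add.accumulate(a)         # general ufunc spelling, same result
array([ 0,  1,  3,  6, 10])
>>> np.maximum.accumulate(np.array([1, 5, 3, 7, 2]))   # running maximum
array([1, 5, 5, 7, 7])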
From bsouthey at gmail.com Fri Aug 19 13:55:13 2011 From: bsouthey at gmail.com (Bruce Southey) Date: Fri, 19 Aug 2011 12:55:13 -0500 Subject: [Numpy-discussion] NA masks for NumPy are ready to test In-Reply-To: References: <4E4E701C.1030305@gmail.com> Message-ID: On Fri, Aug 19, 2011 at 10:48 AM, Mark Wiebe wrote: > On Fri, Aug 19, 2011 at 7:15 AM, Bruce Southey wrote: >> >> On 08/18/2011 04:43 PM, Mark Wiebe wrote: >> >> It's taken a lot of changes to get the NA mask support to its current >> point, but the code ready for some testing now. You can read the >> work-in-progress release notes here: >> >> https://github.com/m-paradox/numpy/blob/missingdata/doc/release/2.0.0-notes.rst >> To try it out, check out the missingdata branch from my github account, >> here, and build in the standard way: >> https://github.com/m-paradox/numpy >> The things most important to test are: >> * Confirm that existing code still works correctly. I've tested against >> SciPy and matplotlib. >> * Confirm that the performance of code not using NA masks is the same or >> better. >> * Try to do computations with the NA values, find places they don't work >> yet, and nominate unimplemented functionality important to you to be next on >> the development list. The release notes have a preliminary list of >> implemented/unimplemented functions. >> * Report any crashes, build problems, or unexpected behaviors. >> In addition to adding the NA mask, I've also added features and done a few >> performance changes here and there, like letting reductions like sum take >> lists of axes instead of being a single axis or all of them. These changes >> affect various bugs >> like?http://projects.scipy.org/numpy/ticket/1143?and?http://projects.scipy.org/numpy/ticket/533. >> Thanks! >> Mark >> Here's a small example run using NAs: >> >>> import numpy as np >> >>> np.__version__ >> '2.0.0.dev-8a5e2a1' >> >>> a = np.random.rand(3,3,3) >> >>> a.flags.maskna = True >> >>> a[np.random.rand(3,3,3) < 0.5] = np.NA >> >>> a >> array([[[NA, NA, ?0.11511708], >> ? ? ? ? [ 0.46661454, ?0.47565512, NA], >> ? ? ? ? [NA, NA, NA]], >> ? ? ? ?[[NA, ?0.57860351, NA], >> ? ? ? ? [NA, NA, ?0.72012669], >> ? ? ? ? [ 0.36582123, NA, ?0.76289794]], >> ? ? ? ?[[ 0.65322748, ?0.92794386, NA], >> ? ? ? ? [ 0.53745165, ?0.97520989, ?0.17515083], >> ? ? ? ? [ 0.71219688, ?0.5184328 , ?0.75802805]]]) >> >>> np.mean(a, axis=-1) >> array([[NA, NA, NA], >> ? ? ? ?[NA, NA, NA], >> ? ? ? ?[NA, ?0.56260412, ?0.66288591]]) >> >>> np.std(a, axis=-1) >> array([[NA, NA, NA], >> ? ? ? ?[NA, NA, NA], >> ? ? ? ?[NA, ?0.32710662, ?0.10384331]]) >> >>> np.mean(a, axis=-1, skipna=True) >> >> /home/mwiebe/installtest/lib64/python2.7/site-packages/numpy/core/fromnumeric.py:2474: >> RuntimeWarning: invalid value encountered in true_divide >> ? um.true_divide(ret, rcount, out=ret, casting='unsafe') >> array([[ 0.11511708, ?0.47113483, ? ? ? ? nan], >> ? ? ? ?[ 0.57860351, ?0.72012669, ?0.56435958], >> ? ? ? ?[ 0.79058567, ?0.56260412, ?0.66288591]]) >> >>> np.std(a, axis=-1, skipna=True) >> >> /home/mwiebe/installtest/lib64/python2.7/site-packages/numpy/core/fromnumeric.py:2707: >> RuntimeWarning: invalid value encountered in true_divide >> ? um.true_divide(arrmean, rcount, out=arrmean, casting='unsafe') >> >> /home/mwiebe/installtest/lib64/python2.7/site-packages/numpy/core/fromnumeric.py:2730: >> RuntimeWarning: invalid value encountered in true_divide >> ? um.true_divide(ret, rcount, out=ret, casting='unsafe') >> array([[ 0. ? ? ? ?, ?0.00452029, ? ? ? ? nan], >> ? ? ? ?[ 0. 
? ? ? ?, ?0. ? ? ? ?, ?0.19853835], >> ? ? ? ?[ 0.13735819, ?0.32710662, ?0.10384331]]) >> >>> np.std(a, axis=(1,2), skipna=True) >> array([ 0.16786895, ?0.15498008, ?0.23811937]) >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> Hi, >> That is great news! >> (Python2.x will be another email.) >> >> Python3.1 and Python3.2 failed with building 'multiarraymodule_onefile.o' >> but I could not see any obvious reason. > > I've pushed a change to fix the Python 3 build, it was a use > of?Py_TPFLAGS_CHECKTYPES, which is no longer in Python3 but is always > default now. Tested with 3.2. > Thanks! > Mark > >> >> I had removed my build directory and then 'python3 setup.py build' but I >> saw this message: >> Running from numpy source directory. >> numpy/core/setup_common.py:86: MismatchCAPIWarning: API mismatch detected, >> the C API version numbers have to be updated. Current C api version is 6, >> with checksum ef5688af03ffa23dd8e11734f5b69313, but recorded checksum for C >> API version 6 in codegen_dir/cversions.txt is >> e61d5dc51fa1c6459328266e215d6987. If functions were added in the C API, you >> have to update C_API_VERSION? in numpy/core/setup_common.py. >> ? MismatchCAPIWarning) >> >> Upstream of the build log is below. >> >> Bruce >> >> In file included from >> numpy/core/src/multiarray/multiarraymodule_onefile.c:53:0: >> numpy/core/src/multiarray/na_singleton.c: At top level: >> numpy/core/src/multiarray/na_singleton.c:708:25: error: >> ?Py_TPFLAGS_CHECKTYPES? undeclared here (not in a function) >> numpy/core/src/multiarray/common.c:48:1: warning: ?_use_default_type? >> defined but not used >> numpy/core/src/multiarray/ctors.h:93:1: warning: ?_arrays_overlap? >> declared ?static? but never defined >> numpy/core/src/multiarray/scalartypes.c.src:2251:1: warning: >> ?gentype_getsegcount? defined but not used >> numpy/core/src/multiarray/scalartypes.c.src:2269:1: warning: >> ?gentype_getcharbuf? defined but not used >> numpy/core/src/multiarray/mapping.c:110:1: warning: ?_array_ass_item? >> defined but not used >> numpy/core/src/multiarray/number.c:266:1: warning: ?array_divide? defined >> but not used >> numpy/core/src/multiarray/number.c:464:1: warning: ?array_inplace_divide? >> defined but not used >> numpy/core/src/multiarray/buffer.c:25:1: warning: ?array_getsegcount? >> defined but not used >> numpy/core/src/multiarray/buffer.c:58:1: warning: ?array_getwritebuf? >> defined but not used >> numpy/core/src/multiarray/buffer.c:71:1: warning: ?array_getcharbuf? >> defined but not used >> numpy/core/src/multiarray/na_mask.c:681:1: warning: >> ?PyArray_GetMaskInversionFunction? defined but not used >> In file included from numpy/core/src/multiarray/scalartypes.c.src:25:0, >> ???????????????? from >> numpy/core/src/multiarray/multiarraymodule_onefile.c:10: >> numpy/core/src/multiarray/_datetime.h:9:1: warning: function declaration >> isn?t a prototype >> In file included from >> numpy/core/src/multiarray/multiarraymodule_onefile.c:13:0: >> numpy/core/src/multiarray/datetime.c:33:1: warning: function declaration >> isn?t a prototype >> In file included from >> numpy/core/src/multiarray/multiarraymodule_onefile.c:17:0: >> numpy/core/src/multiarray/arraytypes.c.src: In function ?VOID_getitem?: >> numpy/core/src/multiarray/arraytypes.c.src:643:9: warning: passing >> argument 2 of ?PyArray_SetBaseObject? 
from incompatible pointer type >> >> build/src.linux-x86_64-3.2/numpy/core/include/numpy/__multiarray_api.h:763:12: >> note: expected ?struct PyObject *? but argument is of type ?struct >> PyArrayObject *? >> In file included from >> numpy/core/src/multiarray/multiarraymodule_onefile.c:44:0: >> numpy/core/src/multiarray/nditer_pywrap.c: In function >> ?npyiter_subscript?: >> numpy/core/src/multiarray/nditer_pywrap.c:2395:29: warning: passing >> argument 1 of ?PySlice_GetIndices? from incompatible pointer type >> /usr/local/include/python3.2m/sliceobject.h:38:5: note: expected ?struct >> PyObject *? but argument is of type ?struct PySliceObject *? >> numpy/core/src/multiarray/nditer_pywrap.c: In function >> ?npyiter_ass_subscript?: >> numpy/core/src/multiarray/nditer_pywrap.c:2440:29: warning: passing >> argument 1 of ?PySlice_GetIndices? from incompatible pointer type >> /usr/local/include/python3.2m/sliceobject.h:38:5: note: expected ?struct >> PyObject *? but argument is of type ?struct PySliceObject *? >> In file included from >> numpy/core/src/multiarray/multiarraymodule_onefile.c:53:0: >> numpy/core/src/multiarray/na_singleton.c: At top level: >> numpy/core/src/multiarray/na_singleton.c:708:25: error: >> ?Py_TPFLAGS_CHECKTYPES? undeclared here (not in a function) >> numpy/core/src/multiarray/common.c:48:1: warning: ?_use_default_type? >> defined but not used >> numpy/core/src/multiarray/ctors.h:93:1: warning: ?_arrays_overlap? >> declared ?static? but never defined >> numpy/core/src/multiarray/scalartypes.c.src:2251:1: warning: >> ?gentype_getsegcount? defined but not used >> numpy/core/src/multiarray/scalartypes.c.src:2269:1: warning: >> ?gentype_getcharbuf? defined but not used >> numpy/core/src/multiarray/mapping.c:110:1: warning: ?_array_ass_item? >> defined but not used >> numpy/core/src/multiarray/number.c:266:1: warning: ?array_divide? defined >> but not used >> numpy/core/src/multiarray/number.c:464:1: warning: ?array_inplace_divide? >> defined but not used >> numpy/core/src/multiarray/buffer.c:25:1: warning: ?array_getsegcount? >> defined but not used >> numpy/core/src/multiarray/buffer.c:58:1: warning: ?array_getwritebuf? >> defined but not used >> numpy/core/src/multiarray/buffer.c:71:1: warning: ?array_getcharbuf? >> defined but not used >> numpy/core/src/multiarray/na_mask.c:681:1: warning: >> ?PyArray_GetMaskInversionFunction? defined but not used >> error: Command "gcc -pthread -DNDEBUG -g -fwrapv -O3 -Wall >> -Wstrict-prototypes -fPIC -Inumpy/core/include >> -Ibuild/src.linux-x86_64-3.2/numpy/core/include/numpy >> -Inumpy/core/src/private -Inumpy/core/src -Inumpy/core >> -Inumpy/core/src/npymath -Inumpy/core/src/multiarray -Inumpy/core/src/umath >> -Inumpy/core/src/npysort -Inumpy/core/include >> -I/usr/local/include/python3.2m >> -Ibuild/src.linux-x86_64-3.2/numpy/core/src/multiarray >> -Ibuild/src.linux-x86_64-3.2/numpy/core/src/umath -c >> numpy/core/src/multiarray/multiarraymodule_onefile.c -o >> build/temp.linux-x86_64-3.2/numpy/core/src/multiarray/multiarraymodule_onefile.o" >> failed with exit status 1 >> >> >> >> Thanks for the prompt responses. That fixes the build problem for both Python3.1 and Python3.2. I got some test errors below but I guess you are working on those. 
Bruce $ python3 -c "import numpy; numpy.test()" Running unit tests for numpy NumPy version 2.0.0.dev-965a5c6 NumPy is installed in /usr/lib64/python3.2/site-packages/numpy Python version 3.2 (r32:88445, Feb 21 2011, 21:11:06) [GCC 4.6.0 20110212 (Red Hat 4.6.0-0.7)] nose version 1.0.0 ..............S.......EFF.....E............................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................K...................................................................................................................................................................................................K..................................................................................................K......................K..........................................................................................................S......................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................./usr/lib64/python3.2/site-packages/numpy/lib/format.py:575: ResourceWarning: unclosed file <_io.BufferedReader name='/tmp/tmpfmmo7x'> mode=mode, offset=offset) ......................................................................................................................................................................................................................../usr/lib64/python3.2/subprocess.py:460: ResourceWarning: unclosed file <_io.BufferedReader name=3> return Popen(*popenargs, **kwargs).wait() /usr/lib64/python3.2/subprocess.py:460: ResourceWarning: unclosed file <_io.BufferedReader name=8> return Popen(*popenargs, **kwargs).wait() 
.................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................... ====================================================================== ERROR: test_datetime_array_str (test_datetime.TestDateTime) ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/lib64/python3.2/site-packages/numpy/core/tests/test_datetime.py", line 510, in test_datetime_array_str assert_equal(str(a), "['2011-03-16' '1920-01-01' '2013-05-19']") File "/usr/lib64/python3.2/site-packages/numpy/core/numeric.py", line 1400, in array_str return array2string(a, max_line_width, precision, suppress_small, ' ', "", str) File "/usr/lib64/python3.2/site-packages/numpy/core/arrayprint.py", line 459, in array2string separator, prefix, formatter=formatter) File "/usr/lib64/python3.2/site-packages/numpy/core/arrayprint.py", line 331, in _array2string _summaryEdgeItems, summary_insert)[:-1] File "/usr/lib64/python3.2/site-packages/numpy/core/arrayprint.py", line 502, in _formatArray word = format_function(a[-i]) + separator File "/usr/lib64/python3.2/site-packages/numpy/core/arrayprint.py", line 770, in __call__ casting=self.casting) TypeError: Cannot create a local timezone-based date string from a NumPy datetime without forcing 'unsafe' casting ====================================================================== ERROR: test_datetime_divide (test_datetime.TestDateTime) ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/lib64/python3.2/site-packages/numpy/core/tests/test_datetime.py", line 926, in test_datetime_divide assert_equal(tda / tdb, 6.0 / 9.0) TypeError: internal error: could not find appropriate datetime inner loop in true_divide ufunc ====================================================================== FAIL: test_datetime_as_string (test_datetime.TestDateTime) ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/lib64/python3.2/site-packages/numpy/core/tests/test_datetime.py", line 1166, in test_datetime_as_string '1959') File "/usr/lib64/python3.2/site-packages/numpy/testing/utils.py", line 313, in assert_equal raise AssertionError(msg) AssertionError: Items are not equal: ACTUAL: b'1959' DESIRED: '1959' ====================================================================== FAIL: test_datetime_as_string_timezone (test_datetime.TestDateTime) ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/lib64/python3.2/site-packages/numpy/core/tests/test_datetime.py", line 1277, in test_datetime_as_string_timezone 
'2010-03-15T06:30Z') File "/usr/lib64/python3.2/site-packages/numpy/testing/utils.py", line 313, in assert_equal raise AssertionError(msg) AssertionError: Items are not equal: ACTUAL: b'2010-03-15T06:30Z' DESIRED: '2010-03-15T06:30Z' ---------------------------------------------------------------------- Ran 3063 tests in 37.701s FAILED (KNOWNFAIL=4, SKIP=2, errors=2, failures=2) From bsouthey at gmail.com Fri Aug 19 13:55:55 2011 From: bsouthey at gmail.com (Bruce Southey) Date: Fri, 19 Aug 2011 12:55:55 -0500 Subject: [Numpy-discussion] NA masks for NumPy are ready to test In-Reply-To: References: <4E4E7972.9060807@gmail.com> <4E4E7FF9.6030006@gmail.com> Message-ID: On Fri, Aug 19, 2011 at 10:27 AM, Ralf Gommers wrote: > > > On Fri, Aug 19, 2011 at 5:23 PM, Bruce Southey wrote: >> >> On 08/19/2011 10:04 AM, Ralf Gommers wrote: >> >> On Fri, Aug 19, 2011 at 4:55 PM, Bruce Southey wrote: >>> >>> Hi, >>> I had to rebuild my Python2.6 as a 'normal' version. >>> >>> Anyhow, Python2.4, 2.5, 2.6 and 2.7 all build and pass the numpy tests. >>> >>> Curiously, only tests in Python2.7 give almost no warnings but all the >>> other Python2.x give lots of warnings - Python2.6 and Python2.7 are below. >>> My expectation is that all versions should behave the same regarding >>> printing messages. >> >> This is due to a change in Python 2.7 itself - deprecation warnings are >> not shown anymore by default. Furthermore, all those messages are unrelated >> to Mark's missing data commits. >> >> Cheers, >> Ralf >> >> >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> Yet: >> $ python2.6 -c "import numpy; numpy.test()" >> Running unit tests for numpy >> NumPy version 1.6.1 >> NumPy is installed in /usr/local/lib/python2.6/site-packages/numpy >> Python version 2.6.6 (r266:84292, Aug 19 2011, 09:21:38) [GCC 4.5.1 >> 20100924 (Red Hat 4.5.1-4)] >> nose version 1.0.0 >> >> ..............................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................K............................... 
>> ..................................................................K......................K.................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................... >> .............................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................. >> ................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................... >> ---------------------------------------------------------------------- >> Ran 3533 tests in 22.062s >> >> OK (KNOWNFAIL=3) >> >> Hence why I was curious about all the messages having not seen them. >> >> Is there some plan to cleanup these tests rather than 'hide' them? >> > Yes, that happens before every release. > > Ralf > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > Many thanks for the clarification! 
Bruce

From charlesr.harris at gmail.com  Fri Aug 19 14:07:45 2011
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Fri, 19 Aug 2011 12:07:45 -0600
Subject: [Numpy-discussion] NA masks for NumPy are ready to test
In-Reply-To: References: <4E4E701C.1030305@gmail.com>
Message-ID: 

On Fri, Aug 19, 2011 at 11:55 AM, Bruce Southey wrote:
> That fixes the build problem for both Python3.1 and Python3.2.
> I got some test errors below but I guess you are working on those.
> Ran 3063 tests in 37.701s
> FAILED (KNOWNFAIL=4, SKIP=2, errors=2, failures=2)

The 3.2 test errors aren't new. I'd fix the tests except I'm not sure if
Mark wants to modify the datetime stuff instead.

Chuck
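For reference, the datetime string failures quoted above come down to the 'S'
dtype holding byte strings, which never compare equal to str on Python 3. A
minimal illustration, separate from the datetime code (the example value and
the 'S4'/'U4' sizes are made up):

>>> import numpy as np
>>> a = np.array(['1959'], dtype='S4')   # 'S' is a byte-string dtype
>>> a[0]
b'1959'
>>> a[0] == '1959'                       # bytes vs. str is always unequal on Python 3
False
>>> a.astype('U4')[0]                    # casting to the unicode 'U' dtype gives str back
'1959'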
From mwwiebe at gmail.com  Fri Aug 19 14:12:10 2011
From: mwwiebe at gmail.com (Mark Wiebe)
Date: Fri, 19 Aug 2011 11:12:10 -0700
Subject: [Numpy-discussion] NA masks for NumPy are ready to test
In-Reply-To: References: <4E4E701C.1030305@gmail.com>
Message-ID: 

On Fri, Aug 19, 2011 at 11:07 AM, Charles R Harris <charlesr.harris at gmail.com> wrote:
> The 3.2 test errors aren't new. I'd fix the tests except I'm not sure if
> Mark wants to modify the datetime stuff instead.

I left them largely untouched because I found it weird that the 'S' data
type doesn't return strings in Python 3... I guess maybe the
datetime_as_string function should convert to the 'U' data type on Python 3
after building the 'S' array to work around this design choice. I'll look at
it after the NA stuff is wrapped up.

-Mark

From bsouthey at gmail.com  Fri Aug 19 14:37:28 2011
From: bsouthey at gmail.com (Bruce Southey)
Date: Fri, 19 Aug 2011 13:37:28 -0500
Subject: [Numpy-discussion] NA masks for NumPy are ready to test
In-Reply-To: References: 
Message-ID: 

Hi,
Just some immediate minor observations that are really about trying to
be consistent:

1) Could you keep the display of the NA dtype the same as the array?
For example, NA dtype is displayed as '<f8' not 'float64' as that is
the array dtype.
>>> a = np.array([[1, 2, 3, np.NA], [3, 4, np.nan, 5]])
>>> a
array([[  1.,   2.,   3.,  NA],
       [  3.,   4.,  nan,   5.]])
>>> a.dtype
dtype('float64')
>>> a.sum()
NA(dtype='<f8')

2) Can the 'skipna' flag be added to the methods?
>>> a.sum(skipna=True)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'skipna' is an invalid keyword argument for this function
>>> np.sum(a, skipna=True)
nan

3) Can the skipna flag be extended to exclude other non-finite cases
like NaN?

4) Assigning np.NA needs a better error message, but the integer array
case is more informative:
>>> b = np.array([1, 2, 3, 4], dtype=np.float128)
>>> b[0] = np.NA
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: float() argument must be a string or a number
>>> j = np.array([1, 2, 3])
>>> j
array([1, 2, 3])
>>> j[0] = ina
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: int() argument must be a string or a number, not 'numpy.NAType'

But it is nice that np.NA 'adjusts' to the insertion array:
>>> b.flags.maskna = True
>>> ana
NA(dtype='
>>> b[0] = ana
>>> b[0]
NA(dtype='

5) Different display depending on masked state. That is, I think that
'maskna=True' should be displayed always when flags.maskna is True:
>>> j = np.array([1, 2, 3], dtype=np.int8)
>>> j
array([1, 2, 3], dtype=int8)
>>> j.flags.maskna = True
>>> j
array([1, 2, 3], maskna=True, dtype=int8)
>>> j[0] = np.NA
>>> j
array([NA, 2, 3], dtype=int8)  # I think it should still display 'maskna=True'.

Bruce

From youknowho2000 at yahoo.com  Fri Aug 19 14:38:23 2011
From: youknowho2000 at yahoo.com (Ian)
Date: Fri, 19 Aug 2011 11:38:23 -0700 (PDT)
Subject: [Numpy-discussion] Reconstruct multidimensional array from buffer without shape
Message-ID: <1313779103.55122.YahooMailNeo@web39408.mail.mud.yahoo.com>

Hello list,

I am storing a multidimensional array as binary in a Postgres 9.04
database. For retrieval of this array from the database I thought
frombuffer() was my solution, however I see that this constructs a
one-dimensional array.
I read in the documentation about the buffer parameter in the ndarray() constructor, but that requires the shape of the array. Is there a way to re-construct a multidimensional array from a buffer without knowing its shape? Thanks. -------------- next part -------------- An HTML attachment was scrubbed... URL: From shish at keba.be Fri Aug 19 14:44:02 2011 From: shish at keba.be (Olivier Delalleau) Date: Fri, 19 Aug 2011 14:44:02 -0400 Subject: [Numpy-discussion] Reconstruct multidimensional array from buffer without shape In-Reply-To: <1313779103.55122.YahooMailNeo@web39408.mail.mud.yahoo.com> References: <1313779103.55122.YahooMailNeo@web39408.mail.mud.yahoo.com> Message-ID: How could it be possible? If you only have the buffer data, there could be many different valid shapes associated to this data. -=- Olivier 2011/8/19 Ian > Hello list, > > I am storing a multidimensional array as binary in a Postgres 9.04 > database. For retrieval of this array from the database I thought > frombuffer() was my solution, however I see that this constructs a > one-dimensional array. I read in the documentation about the buffer > parameter in the ndarray() constructor, but that requires the shape of the > array. > > Is there a way to re-construct a multidimensional array from a buffer > without knowing its shape? > > Thanks. > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Fri Aug 19 14:44:18 2011 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri, 19 Aug 2011 12:44:18 -0600 Subject: [Numpy-discussion] NA masks for NumPy are ready to test In-Reply-To: References: Message-ID: On Fri, Aug 19, 2011 at 12:37 PM, Bruce Southey wrote: > Hi, > Just some immediate minor observations that are really about trying to > be consistent: > > 1) Could you keep the display of the NA dtype be the same as the array? > For example, NA dtype is displayed as ' 'float64' as that is the array dtype. > >>> a=np.array([[1,2,3,np.NA], [3,4,np.nan,5]]) > >>> a > array([[ 1., 2., 3., NA], > [ 3., 4., nan, 5.]]) > >>> a.dtype > dtype('float64') > >>> a.sum() > NA(dtype=' > 2) Can the 'skipna' flag be added to the methods? > >>> a.sum(skipna=True) > Traceback (most recent call last): > File "", line 1, in > TypeError: 'skipna' is an invalid keyword argument for this function > >>> np.sum(a,skipna=True) > nan > > 3) Can the skipna flag be extended to exclude other non-finite cases like > NaN? > > 4) Assigning a np.NA needs a better error message but the Integer > array case is more informative: > >>> b=np.array([1,2,3,4], dtype=np.float128) > >>> b[0]=np.NA > Traceback (most recent call last): > File "", line 1, in > TypeError: float() argument must be a string or a number > > >>> j=np.array([1,2,3]) > >>> j > array([1, 2, 3]) > >>> j[0]=ina > Traceback (most recent call last): > File "", line 1, in > TypeError: int() argument must be a string or a number, not 'numpy.NAType' > > But it is nice that np.NA 'adjusts' to the insertion array: > >>> b.flags.maskna = True > >>> ana > NA(dtype=' >>> b[0]=ana > >>> b[0] > NA(dtype=' > 5) Different display depending on masked state. 
That is I think that > 'maskna=True' should be displayed always when flags.maskna is True : > >>> j=np.array([1,2,3], dtype=np.int8) > >>> j > array([1, 2, 3], dtype=int8) > >>> j.flags.maskna=True > >>> j > array([1, 2, 3], maskna=True, dtype=int8) > >>> j[0]=np.NA > >>> j > array([NA, 2, 3], dtype=int8) # Ithink it should still display > 'maskna=True'. > > My main peeve is that NA is upper case ;) I suppose that could use some discussion. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From youknowho2000 at yahoo.com Fri Aug 19 14:57:49 2011 From: youknowho2000 at yahoo.com (Ian) Date: Fri, 19 Aug 2011 11:57:49 -0700 (PDT) Subject: [Numpy-discussion] Reconstruct multidimensional array from buffer without shape In-Reply-To: References: <1313779103.55122.YahooMailNeo@web39408.mail.mud.yahoo.com> Message-ID: <1313780269.53187.YahooMailNeo@web39414.mail.mud.yahoo.com> Right. I'm new to NumPy so I figured I'd check if there was some nifty way of preserving the shape without storing it in the database that I hadn't discovered yet. No worries, I'll store the shape alongside the array. Thanks for the reply. Ian >________________________________ >From: Olivier Delalleau >To: Discussion of Numerical Python >Sent: Friday, August 19, 2011 11:44 AM >Subject: Re: [Numpy-discussion] Reconstruct multidimensional array from buffer without shape > > >How could it be possible? If you only have the buffer data, there could be many different valid shapes associated to this data. > >-=- Olivier > > >2011/8/19 Ian > >Hello list, >> >> >>I am storing a multidimensional array as binary in a Postgres 9.04 database. For retrieval of this array from the database I thought frombuffer() was my solution, however I see that this constructs a one-dimensional array. I read in the documentation about the buffer parameter in the ndarray() constructor, but that requires the shape of the array. >> >> >>Is there a way to re-construct a multidimensional array from a buffer without knowing its shape? >> >> >>Thanks. >>_______________________________________________ >>NumPy-Discussion mailing list >>NumPy-Discussion at scipy.org >>http://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> > >_______________________________________________ >NumPy-Discussion mailing list >NumPy-Discussion at scipy.org >http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From paul.anton.letnes at gmail.com Fri Aug 19 15:13:51 2011 From: paul.anton.letnes at gmail.com (Paul Anton Letnes) Date: Fri, 19 Aug 2011 20:13:51 +0100 Subject: [Numpy-discussion] Reconstruct multidimensional array from buffer without shape In-Reply-To: <1313780269.53187.YahooMailNeo@web39414.mail.mud.yahoo.com> References: <1313779103.55122.YahooMailNeo@web39408.mail.mud.yahoo.com> <1313780269.53187.YahooMailNeo@web39414.mail.mud.yahoo.com> Message-ID: <3F5EE6DA-BC39-4B47-81D8-C8A20DA90F58@gmail.com> On 19. aug. 2011, at 19.57, Ian wrote: > Right. I'm new to NumPy so I figured I'd check if there was some nifty way of preserving the shape without storing it in the database that I hadn't discovered yet. No worries, I'll store the shape alongside the array. Thanks for the reply. > I love the h5py package so I keep recommending it (and pytables is supposed to be good, I think?). h5py stores files in hdf5, which is readable from C,C++,fortran,java,python... It also keeps track of shape and you can store other metadata (e.g. strings) as desired. 
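If the raw-buffer route is kept instead, a minimal sketch (the database column layout is left open here) of storing the shape and dtype string next to the bytes and rebuilding the array with frombuffer:

import numpy as np

a = np.arange(12, dtype=np.float64).reshape(3, 4)

# what goes into the database: the raw bytes plus enough metadata to rebuild
blob = a.tostring()                 # raw buffer
meta = (a.dtype.str, a.shape)       # e.g. ('<f8', (3, 4))

# reconstruction on the way back out
b = np.frombuffer(blob, dtype=meta[0]).reshape(meta[1])
assert (a == b).all()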
Also I believe the numpy format (see e.g. http://docs.scipy.org/doc/numpy/reference/generated/numpy.savez.html#numpy.savez) can do the same, although I don't think performance scales as well for huge arrays, and it's not language-neutral (to my knowledge). Cheers Paul From mwwiebe at gmail.com Fri Aug 19 15:15:05 2011 From: mwwiebe at gmail.com (Mark Wiebe) Date: Fri, 19 Aug 2011 12:15:05 -0700 Subject: [Numpy-discussion] NA masks for NumPy are ready to test In-Reply-To: References: Message-ID: On Fri, Aug 19, 2011 at 11:37 AM, Bruce Southey wrote: > Hi, > Just some immediate minor observations that are really about trying to > be consistent: > > 1) Could you keep the display of the NA dtype be the same as the array? > For example, NA dtype is displayed as ' 'float64' as that is the array dtype. > >>> a=np.array([[1,2,3,np.NA], [3,4,np.nan,5]]) > >>> a > array([[ 1., 2., 3., NA], > [ 3., 4., nan, 5.]]) > >>> a.dtype > dtype('float64') > >>> a.sum() > NA(dtype=' I suppose I can do it that way, sure. I think it would be good to change the 'float64' into ' 2) Can the 'skipna' flag be added to the methods? > >>> a.sum(skipna=True) > Traceback (most recent call last): > File "", line 1, in > TypeError: 'skipna' is an invalid keyword argument for this function > >>> np.sum(a,skipna=True) > nan > Yeah, but I think this is low priority compared to a lot of other things that need doing. The methods are written in C with a particular hardcoded implementation pattern, whereas with the functions in the numpy namespace I was able to adjust to call the ufunc reduce methods without much menial effort. 3) Can the skipna flag be extended to exclude other non-finite cases like > NaN? > That wasn't really within the scope of the original design, except for one particular case of the NA-bitpattern dtypes. It's possible to make a new mask and assign NA to the NaN values like this: a = [array with NaNs] aview = a.view(ownmaskna=True) aview[np.isnan(aview)] = np.NA np.sum(aview, skipna=True) 4) Assigning a np.NA needs a better error message but the Integer > array case is more informative: > >>> b=np.array([1,2,3,4], dtype=np.float128) > >>> b[0]=np.NA > Traceback (most recent call last): > File "", line 1, in > TypeError: float() argument must be a string or a number > > >>> j=np.array([1,2,3]) > >>> j > array([1, 2, 3]) > >>> j[0]=ina > Traceback (most recent call last): > File "", line 1, in > TypeError: int() argument must be a string or a number, not 'numpy.NAType' > I coded this up the way I did to ease the future transition to NA-bitpattern dtypes, which would handle this conversion from the NA object. The error message is being produced by CPython in both of these cases, so it looks like they didn't make their messages consistent. This could be changed to match the error message like this: >>> a = np.array([np.NA, 3]) >>> b = np.array([3,4]) >>> b[...] = a Traceback (most recent call last): File "", line 1, in ValueError: Cannot assign NA value to an array which does not support NAs > But it is nice that np.NA 'adjusts' to the insertion array: > >>> b.flags.maskna = True > >>> ana > NA(dtype=' >>> b[0]=ana > >>> b[0] > NA(dtype=' It should generally follow the NumPy type promotion rules, but may be a bit more liberal in places. > 5) Different display depending on masked state. 
That is I think that > 'maskna=True' should be displayed always when flags.maskna is True : > >>> j=np.array([1,2,3], dtype=np.int8) > >>> j > array([1, 2, 3], dtype=int8) > >>> j.flags.maskna=True > >>> j > array([1, 2, 3], maskna=True, dtype=int8) > >>> j[0]=np.NA > >>> j > array([NA, 2, 3], dtype=int8) # Ithink it should still display > 'maskna=True'. > This is just like how NumPy hides the dtype in some cases, it's hiding the maskna=True whenever it would be automatically detected from the input list. >>> np.array([1.0, 2.0]) array([ 1., 2.]) >>> np.array([1.0, 2.0], dtype=np.float32) array([ 1., 2.], dtype=float32) Cheers, Mark > > Bruce > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From alan at ajackson.org Fri Aug 19 16:01:25 2011 From: alan at ajackson.org (alan at ajackson.org) Date: Fri, 19 Aug 2011 15:01:25 -0500 Subject: [Numpy-discussion] Statistical distributions on samples In-Reply-To: References: Message-ID: <20110819150125.1649d49e@ajackson.org> I have applied the update to the documentation (although that function needs a general rewrite - later...) >On Mon, Aug 15, 2011 at 8:53 AM, Andrea Gavana wrote: > >> Hi Chris and All, >> >> On 12 August 2011 16:53, Christopher Jordan-Squire wrote: >> > Hi Andrea--An easy way to get something like this would be >> > >> > import numpy as np >> > import scipy.stats as stats >> > >> > sigma = #some reasonable standard deviation for your application >> > x = stats.norm.rvs(size=1000, loc=125, scale=sigma) >> > x = x[x>50] >> > x = x[x<200] >> > >> > That will give a roughly normal distribution to your velocities, as long >> as, >> > say, sigma<25. (I'm using the rule of thumb for the normal distribution >> that >> > normal random samples lie 3 standard deviations away from the mean about >> 1 >> > out of 350 times.) Though you won't be able to get exactly normal errors >> > about your mean since normal random samples can theoretically be of any >> > size. >> > >> > You can use this same process for any other distribution, as long as >> you've >> > chosen a scale variable so that the probability of samples being outside >> > your desired interval is really small. Of course, once again your random >> > errors won't be exactly from the distribution you get your original >> samples >> > from. >> >> Thank you for your suggestion. There are a couple of things I am not >> clear with, however. The first one (the easy one), is: let's suppose I >> need 200 values, and the accept/discard procedure removes 5 of them >> from the list. Is there any way to draw these 200 values from a bigger >> sample so that the accept/reject procedure will not interfere too >> much? And how do I get 200 values out of the bigger sample so that >> these values are still representative? >> > >FWIW, I'm not really advocating a truncated normal so much as making the >standard deviation small enough so that there's no real difference between a >true normal distribution and a truncated normal. > >If you're worried about getting exactly 200 samples, then you could sample N >with N>200 and such that after throwing out the ones that lie outside your >desired region you're left with M>200. Then just randomly pick 200 from >those M. That shouldn't bias anything as long as you randomly pick them. 
(Or >just pick the first 200, if you haven't done anything to impose any order on >the samples, such as sorting them by size.) But I'm not sure why you'd want >exactly 200 samples instead of some number of samples close to 200. > > >> >> Another thing, possibly completely unrelated. I am trying to design a >> toy Latin Hypercube script (just for my own understanding). I found >> this piece of code on the web (and I modified it slightly): >> >> def lhs(dist, size=100): >> ''' >> Latin Hypercube sampling of any distrbution. >> dist is is a scipy.stats random number generator >> such as stats.norm, stats.beta, etc >> parms is a tuple with the parameters needed for >> the specified distribution. >> >> :Parameters: >> - `dist`: random number generator from scipy.stats module. >> - `size` :size for the output sample >> ''' >> >> n = size >> >> perc = numpy.arange(0.0, 1.0, 1.0/n) >> numpy.random.shuffle(perc) >> >> smp = [stats.uniform(i,1.0/n).rvs() for i in perc] >> >> v = dist.ppf(smp) >> >> return v >> >> >> Now, I am not 100% clear of what the percent point function is (I have >> read around the web, but please keep in mind that my statistical >> skills are close to minus infinity). From this page: >> >> http://www.itl.nist.gov/div898/handbook/eda/section3/eda362.htm >> >> >The ppf is what's called the quantile function elsewhere. I do not know why >scipy calls it the ppf/percent point function. > >The quantile function is the inverse of the cumulative density function >(cdf). So dist.ppf(z) is the x such that P(dist <= x) = z. Roughly. (Things >get slightly more finicky if you think about discrete distributions because >then you have to pick what happens at the jumps in the cdf.) So >dist.ppf(0.5) gives the median of dist, and dist.ppf(0.25) gives the >lower/first quartile of dist. > > >> I gather that, if you plot the results of the ppf, with the horizontal >> axis as probability, the vertical axis goes from the smallest to the >> largest value of the cumulative distribution function. If i do this: >> >> numpy.random.seed(123456) >> >> distribution = stats.norm(loc=125, scale=25) >> >> my_lhs = lhs(distribution, 50) >> >> Will my_lhs always contain valid values (i.e., included between 50 and >> 200)? I assume the answer is no... but even if this was the case, is >> this my_lhs array ready to be used to setup a LHS experiment when I >> have multi-dimensional problems (in which all the variables are >> completely independent from each other - no correlation)? >> >> >I'm not really sure if the above function is doing the lhs you want. To >answer your question, it won't always generate values within [50,200]. If >size is large enough then you're dividing up the probability space evenly. >So even with the random perturbations (whose use I don't really understand), >you'll ensure that some of the samples you get when you apply the ppf will >correspond to the extremely low probability samples that are <50 or >200. > >-Chris JS > >My apologies for the idiocy of the questions. >> >> Andrea. >> >> "Imagination Is The Only Weapon In The War Against Reality." 
>> http://xoomer.alice.it/infinity77/ >> >> >>> import PyQt4.QtGui >> Traceback (most recent call last): >> File "", line 1, in >> ImportError: No module named PyQt4.QtGui >> >>> >> >>> import pygtk >> Traceback (most recent call last): >> File "", line 1, in >> ImportError: No module named pygtk >> >>> >> >>> import wx >> >>> >> >>> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> -- ----------------------------------------------------------------------- | Alan K. Jackson | To see a World in a Grain of Sand | | alan at ajackson.org | And a Heaven in a Wild Flower, | | www.ajackson.org | Hold Infinity in the palm of your hand | | Houston, Texas | And Eternity in an hour. - Blake | ----------------------------------------------------------------------- From ralf.gommers at googlemail.com Fri Aug 19 16:04:15 2011 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Fri, 19 Aug 2011 22:04:15 +0200 Subject: [Numpy-discussion] NA masks for NumPy are ready to test In-Reply-To: References: Message-ID: On Fri, Aug 19, 2011 at 9:15 PM, Mark Wiebe wrote: > On Fri, Aug 19, 2011 at 11:37 AM, Bruce Southey wrote: > >> Hi, >> Just some immediate minor observations that are really about trying to >> be consistent: >> >> 1) Could you keep the display of the NA dtype be the same as the array? >> For example, NA dtype is displayed as '> 'float64' as that is the array dtype. >> >>> a=np.array([[1,2,3,np.NA], [3,4,np.nan,5]]) >> >>> a >> array([[ 1., 2., 3., NA], >> [ 3., 4., nan, 5.]]) >> >>> a.dtype >> dtype('float64') >> >>> a.sum() >> NA(dtype='> > > I suppose I can do it that way, sure. I think it would be good to change > the 'float64' into ' > I don't think that looks better. It would also screws up people's doctests again. Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From amcmorl at gmail.com Fri Aug 19 16:04:03 2011 From: amcmorl at gmail.com (Angus McMorland) Date: Fri, 19 Aug 2011 16:04:03 -0400 Subject: [Numpy-discussion] numpy segfaults with ctypes Message-ID: Hi all, I'm giving this email a new subject, in case that helps it catch the attention of someone who can fix my problem. I currently cannot upgrade numpy from git to any date more recent than 10 July. Git commit feb8079070b8a659d7ee is the first that causes the problem (according to github, the commit was authored by walshb and committed by m-paradox, in case that jogs anyone's memory). I've tried taking a look at the code diff, but I'm afraid I'm just a user, rather than a developer, and it didn't make much sense. My problem is that python segfaults when I run it with the following code: > from ctypes import Structure, c_double > > #-- copied out of an xml2py generated file > class S(Structure): > ? ?pass > S._pack_ = 4 > S._fields_ = [ > ? ?('field', c_double * 2), > ? 
] > #-- > > import numpy as np > print np.version.version > s = S() > print "S", np.asarray(s.field) Thanks, Angus -- AJC McMorland Post-doctoral research fellow Neurobiology, University of Pittsburgh From mwwiebe at gmail.com Fri Aug 19 16:05:23 2011 From: mwwiebe at gmail.com (Mark Wiebe) Date: Fri, 19 Aug 2011 13:05:23 -0700 Subject: [Numpy-discussion] NA masks for NumPy are ready to test In-Reply-To: References: Message-ID: On Fri, Aug 19, 2011 at 11:44 AM, Charles R Harris < charlesr.harris at gmail.com> wrote: > > > On Fri, Aug 19, 2011 at 12:37 PM, Bruce Southey wrote: > >> Hi, >> Just some immediate minor observations that are really about trying to >> be consistent: >> >> 1) Could you keep the display of the NA dtype be the same as the array? >> For example, NA dtype is displayed as '> 'float64' as that is the array dtype. >> >>> a=np.array([[1,2,3,np.NA], [3,4,np.nan,5]]) >> >>> a >> array([[ 1., 2., 3., NA], >> [ 3., 4., nan, 5.]]) >> >>> a.dtype >> dtype('float64') >> >>> a.sum() >> NA(dtype='> >> 2) Can the 'skipna' flag be added to the methods? >> >>> a.sum(skipna=True) >> Traceback (most recent call last): >> File "", line 1, in >> TypeError: 'skipna' is an invalid keyword argument for this function >> >>> np.sum(a,skipna=True) >> nan >> >> 3) Can the skipna flag be extended to exclude other non-finite cases like >> NaN? >> >> 4) Assigning a np.NA needs a better error message but the Integer >> array case is more informative: >> >>> b=np.array([1,2,3,4], dtype=np.float128) >> >>> b[0]=np.NA >> Traceback (most recent call last): >> File "", line 1, in >> TypeError: float() argument must be a string or a number >> >> >>> j=np.array([1,2,3]) >> >>> j >> array([1, 2, 3]) >> >>> j[0]=ina >> Traceback (most recent call last): >> File "", line 1, in >> TypeError: int() argument must be a string or a number, not 'numpy.NAType' >> >> But it is nice that np.NA 'adjusts' to the insertion array: >> >>> b.flags.maskna = True >> >>> ana >> NA(dtype='> >>> b[0]=ana >> >>> b[0] >> NA(dtype='> >> 5) Different display depending on masked state. That is I think that >> 'maskna=True' should be displayed always when flags.maskna is True : >> >>> j=np.array([1,2,3], dtype=np.int8) >> >>> j >> array([1, 2, 3], dtype=int8) >> >>> j.flags.maskna=True >> >>> j >> array([1, 2, 3], maskna=True, dtype=int8) >> >>> j[0]=np.NA >> >>> j >> array([NA, 2, 3], dtype=int8) # Ithink it should still display >> 'maskna=True'. >> >> > My main peeve is that NA is upper case ;) I suppose that could use some > discussion. > There is some proliferation of cases in the NaN case: >>> np.nan nan >>> np.NAN nan >>> np.NaN nan The pros I see for NA over na are: * less confusion of NA vs nan (should this carry over to the np.isna function, should it be np.isNA according to this point?) * more comfortable for switching between NumPy and R when people have to use both at the same time The main con is: * Inconsistent with current nan, inf printing. Here's a hackish workaround: >>> np.na = np.NA >>> np.set_printoptions(nastr='na') >>> np.array([np.na, 2.0]) array([na, 2.]) What's your list of pros and cons? -Mark > > Chuck > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From matthew.brett at gmail.com Fri Aug 19 16:11:40 2011 From: matthew.brett at gmail.com (Matthew Brett) Date: Fri, 19 Aug 2011 13:11:40 -0700 Subject: [Numpy-discussion] numpy segfaults with ctypes In-Reply-To: References: Message-ID: Hi, On Fri, Aug 19, 2011 at 1:04 PM, Angus McMorland wrote: > Hi all, > > I'm giving this email a new subject, in case that helps it catch the > attention of someone who can fix my problem. I currently cannot > upgrade numpy from git to any date more recent than 10 July. Git > commit feb8079070b8a659d7ee is the first that causes the problem > (according to github, the commit was authored by walshb and committed > by m-paradox, in case that jogs anyone's memory). I've tried taking a > look at the code diff, but I'm afraid I'm just a user, rather than a > developer, and it didn't make much sense. > > My problem is that python segfaults when I run it with the following code: > >> from ctypes import Structure, c_double >> >> #-- copied out of an xml2py generated file >> class S(Structure): >> ? ?pass >> S._pack_ = 4 >> S._fields_ = [ >> ? ?('field', c_double * 2), >> ? ] >> #-- >> >> import numpy as np >> print np.version.version >> s = S() >> print "S", np.asarray(s.field) Just to say, that that commit is also the commit that causes a segfault for np.lookfor: http://www.mail-archive.com/numpy-discussion at scipy.org/msg33114.html http://projects.scipy.org/numpy/ticket/1937 The latter ticket is closed because Mark's missing-data development branch does not have the segfault. I guess you could try that branch and see whether it fixes the problem? I guess also that means we'll have to merge in the missing data branch in order to fix the problem. See you, matthew From mwwiebe at gmail.com Fri Aug 19 18:46:50 2011 From: mwwiebe at gmail.com (Mark Wiebe) Date: Fri, 19 Aug 2011 15:46:50 -0700 Subject: [Numpy-discussion] NA masks for NumPy are ready to test In-Reply-To: References: Message-ID: On Thu, Aug 18, 2011 at 2:43 PM, Mark Wiebe wrote: > It's taken a lot of changes to get the NA mask support to its current > point, but the code ready for some testing now. You can read the > work-in-progress release notes here: > > > https://github.com/m-paradox/numpy/blob/missingdata/doc/release/2.0.0-notes.rst > > To try it out, check out the missingdata branch from my github account, > here, and build in the standard way: > > https://github.com/m-paradox/numpy > > The things most important to test are: > > * Confirm that existing code still works correctly. I've tested against > SciPy and matplotlib. > * Confirm that the performance of code not using NA masks is the same or > better. > * Try to do computations with the NA values, find places they don't work > yet, and nominate unimplemented functionality important to you to be next on > the development list. The release notes have a preliminary list of > implemented/unimplemented functions. > * Report any crashes, build problems, or unexpected behaviors. > > In addition to adding the NA mask, I've also added features and done a few > performance changes here and there, like letting reductions like sum take > lists of axes instead of being a single axis or all of them. These changes > affect various bugs like http://projects.scipy.org/numpy/ticket/1143 and > http://projects.scipy.org/numpy/ticket/533. > With a new fix to the unitless reduction logic I just committed, the situation for bug http://projects.scipy.org/numpy/ticket/450 is also improved. Cheers, Mark > Thanks! 
> Mark > > Here's a small example run using NAs: > > >>> import numpy as np > >>> np.__version__ > '2.0.0.dev-8a5e2a1' > >>> a = np.random.rand(3,3,3) > >>> a.flags.maskna = True > >>> a[np.random.rand(3,3,3) < 0.5] = np.NA > >>> a > array([[[NA, NA, 0.11511708], > [ 0.46661454, 0.47565512, NA], > [NA, NA, NA]], > > [[NA, 0.57860351, NA], > [NA, NA, 0.72012669], > [ 0.36582123, NA, 0.76289794]], > > [[ 0.65322748, 0.92794386, NA], > [ 0.53745165, 0.97520989, 0.17515083], > [ 0.71219688, 0.5184328 , 0.75802805]]]) > >>> np.mean(a, axis=-1) > array([[NA, NA, NA], > [NA, NA, NA], > [NA, 0.56260412, 0.66288591]]) > >>> np.std(a, axis=-1) > array([[NA, NA, NA], > [NA, NA, NA], > [NA, 0.32710662, 0.10384331]]) > >>> np.mean(a, axis=-1, skipna=True) > /home/mwiebe/installtest/lib64/python2.7/site-packages/numpy/core/fromnumeric.py:2474: > RuntimeWarning: invalid value encountered in true_divide > um.true_divide(ret, rcount, out=ret, casting='unsafe') > array([[ 0.11511708, 0.47113483, nan], > [ 0.57860351, 0.72012669, 0.56435958], > [ 0.79058567, 0.56260412, 0.66288591]]) > >>> np.std(a, axis=-1, skipna=True) > /home/mwiebe/installtest/lib64/python2.7/site-packages/numpy/core/fromnumeric.py:2707: > RuntimeWarning: invalid value encountered in true_divide > um.true_divide(arrmean, rcount, out=arrmean, casting='unsafe') > /home/mwiebe/installtest/lib64/python2.7/site-packages/numpy/core/fromnumeric.py:2730: > RuntimeWarning: invalid value encountered in true_divide > um.true_divide(ret, rcount, out=ret, casting='unsafe') > array([[ 0. , 0.00452029, nan], > [ 0. , 0. , 0.19853835], > [ 0.13735819, 0.32710662, 0.10384331]]) > >>> np.std(a, axis=(1,2), skipna=True) > array([ 0.16786895, 0.15498008, 0.23811937]) > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From dominique.orban at gmail.com Fri Aug 19 19:00:04 2011 From: dominique.orban at gmail.com (Dominique Orban) Date: Fri, 19 Aug 2011 23:00:04 +0000 Subject: [Numpy-discussion] ImportError: dynamic module does not define init function (initmultiarray) Message-ID: Dear list, I'm embedding Python inside a C program to pull functions from user-supplied Python modules. All is well except when the user-supplied module imports numpy. Requesting a stack trace when an exception occurs reveals the following: --- Traceback (most recent call last): File "/Users/dpo/.virtualenvs/matrox/matrox/curve.py", line 3, in import numpy as np File "/Users/dpo/.virtualenvs/matrox/lib/python2.7/site-packages/numpy/__init__.py", line 137, in import add_newdocs File "/Users/dpo/.virtualenvs/matrox/lib/python2.7/site-packages/numpy/add_newdocs.py", line 9, in from numpy.lib import add_newdoc File "/Users/dpo/.virtualenvs/matrox/lib/python2.7/site-packages/numpy/lib/__init__.py", line 4, in from type_check import * File "/Users/dpo/.virtualenvs/matrox/lib/python2.7/site-packages/numpy/lib/type_check.py", line 8, in import numpy.core.numeric as _nx File "/Users/dpo/.virtualenvs/matrox/lib/python2.7/site-packages/numpy/core/__init__.py", line 5, in import multiarray ImportError: dynamic module does not define init function (initmultiarray) --- (here, "curve.py" is the user-supplied module in question.) The symbol initmultiarray *is* defined in multiarray.so so I'm wondering if anybody has suggestions as to what the problem may be here. 
A bit of Googling reveals the following: * The 3rd example of Section 31.2.5 of http://www.swig.org/Doc1.3/Python.html says "This error is almost always caused when a bad name is given to the shared object file. For example, if you created a file example.so instead of _example.so you would get this error." * Item #2 in the FAQ at http://biggles.sourceforge.net/doc/1.5/faq says "This is a problem with your module search path. Python is loading [multiarray].so as a module instead of [multiarray].py" But I don't have any multiarray.py. I have other multiarray.so's, but they're not in my search path. And I'm not finding any _multiarray.so with a leading underscore. So I am lead to ask: should multiarray.so really be called _multiarray.so? If not, any idea what the problem is? I'm using Python 2.7.2 compiled as a framework using Homebrew on OSX 10.6.8 and Numpy 1.6.1 installed from PyPi a day or two ago. Thanks much in advance! -- Dominique From bsouthey at gmail.com Fri Aug 19 19:52:53 2011 From: bsouthey at gmail.com (Bruce Southey) Date: Fri, 19 Aug 2011 18:52:53 -0500 Subject: [Numpy-discussion] NA masks for NumPy are ready to test In-Reply-To: References: Message-ID: On Fri, Aug 19, 2011 at 3:05 PM, Mark Wiebe wrote: > On Fri, Aug 19, 2011 at 11:44 AM, Charles R Harris > wrote: >> >> >> On Fri, Aug 19, 2011 at 12:37 PM, Bruce Southey >> wrote: >>> >>> Hi, >>> Just some immediate minor observations that are really about trying to >>> be consistent: >>> >>> 1) Could you keep the display of the NA dtype be the same as the array? >>> For example, NA dtype is displayed as '>> 'float64' as that is the array dtype. >>> ?>>> a=np.array([[1,2,3,np.NA], [3,4,np.nan,5]]) >>> >>> a >>> array([[ ?1., ? 2., ? 3., NA], >>> ? ? ? [ ?3., ? 4., ?nan, ? 5.]]) >>> >>> a.dtype >>> dtype('float64') >>> >>> a.sum() >>> NA(dtype='>> >>> 2) Can the 'skipna' flag be added to the methods? >>> >>> a.sum(skipna=True) >>> Traceback (most recent call last): >>> ?File "", line 1, in >>> TypeError: 'skipna' is an invalid keyword argument for this function >>> >>> np.sum(a,skipna=True) >>> nan >>> >>> 3) Can the skipna flag be extended to exclude other non-finite cases like >>> NaN? >>> >>> 4) Assigning a np.NA needs a better error message but the Integer >>> array case is more informative: >>> >>> b=np.array([1,2,3,4], dtype=np.float128) >>> >>> b[0]=np.NA >>> Traceback (most recent call last): >>> ?File "", line 1, in >>> TypeError: float() argument must be a string or a number >>> >>> >>> j=np.array([1,2,3]) >>> >>> j >>> array([1, 2, 3]) >>> >>> j[0]=ina >>> Traceback (most recent call last): >>> ?File "", line 1, in >>> TypeError: int() argument must be a string or a number, not >>> 'numpy.NAType' >>> >>> But it is nice that np.NA 'adjusts' to the insertion array: >>> >>> b.flags.maskna = True >>> >>> ana >>> NA(dtype='>> >>> b[0]=ana >>> >>> b[0] >>> NA(dtype='>> >>> 5) Different display depending on masked state. That is I think that >>> 'maskna=True' should be displayed always when flags.maskna is True : >>> >>> j=np.array([1,2,3], dtype=np.int8) >>> >>> j >>> array([1, 2, 3], dtype=int8) >>> >>> j.flags.maskna=True >>> >>> j >>> array([1, 2, 3], maskna=True, dtype=int8) >>> >>> j[0]=np.NA >>> >>> j >>> array([NA, 2, 3], dtype=int8) # Ithink it should still display >>> 'maskna=True'. >>> >> >> My main peeve is that NA is upper case ;) I suppose that could use some >> discussion. 
> > There is some proliferation of cases in the NaN case: >>>> np.nan > nan >>>> np.NAN > nan >>>> np.NaN > nan > The pros I see for NA over na are: > * less confusion of NA vs nan (should this carry over to the np.isna > function, should it be np.isNA according to this point?) > * more comfortable for switching between NumPy and R when people have to use > both at the same time > The main con is: > * Inconsistent with current nan, inf printing. Here's a hackish workaround: >>>> np.na = np.NA >>>> np.set_printoptions(nastr='na') >>>> np.array([np.na, 2.0]) > array([na, ?2.]) > What's your list of pros and cons? > -Mark > >> >> Chuck >> In part I sort of like to have NA and nan since poor eyesight/typing/editing avoiding problems dropping the last 'n'. Regarding nan/NAN, do you mean something like my ticket 1051? http://projects.scipy.org/numpy/ticket/1051 I do not care that much about the case (mixed case is not good) provided that there is only one to specify these. Also should np.isfinite() return False for np.NA? >>> np.isfinite([1,2,np.NA,4]) array([ True, True, NA, True], dtype=bool) Anyhow, many thanks for the replies to my observations and your amazing effect in getting this done. Bruce From josef.pktd at gmail.com Fri Aug 19 23:19:01 2011 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Fri, 19 Aug 2011 23:19:01 -0400 Subject: [Numpy-discussion] life expectancy of scipy.stats nan statistics Message-ID: I'm just looking at http://projects.scipy.org/scipy/ticket/1200 I agree with Ralf that the bias keyword should be changed to ddof as in the numpy functions. For functions in scipy.stats, and statistics in general, I prefer the usual axis=0 default. However, I think these functions, like scipy.stats.nanstd, should be replaced by corresponding numpy functions, which might happen relatively soon. But how soon? Is it worth deprecating bias in scipy 0.10, and then deprecate again for removal in 0.11 or 0.12? Josef From zelbier at gmail.com Sat Aug 20 03:47:10 2011 From: zelbier at gmail.com (Olivier Verdier) Date: Sat, 20 Aug 2011 09:47:10 +0200 Subject: [Numpy-discussion] Can't mix np.newaxis with boolean indexing In-Reply-To: References: Message-ID: Your syntax is not as intuitive as you may think. Suppose I take a matrix instead a = np.array([1,2,3,4]).reshape(2,2) b = (a>1) # np.array([[False,True],[True,True]]) How would a[b,np.newaxis] be supposed to work? Note that other (simple) slices work perfectly with newaxis, such as a[:1,np.newaxis] == Olivier On 19 August 2011 17:50, Benjamin Root wrote: > I could have sworn that this use to work: > > import numpy as np > a = np.random.random((100,)) > b = (a > 0.5) > print a[b, np.newaxis] > > But instead, I get this error on the latest master: > > Traceback (most recent call last): > ? File "", line 1, in > TypeError: long() argument must be a string or a number, not 'NoneType' > > Note, the simple work-around would be "a[b][:, np.newaxis]", but I can't > imagine why the intuitive syntax would not be valid. > > Thanks, > Ben Root > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > From heshiming at gmail.com Sat Aug 20 04:01:42 2011 From: heshiming at gmail.com (He Shiming) Date: Sat, 20 Aug 2011 16:01:42 +0800 Subject: [Numpy-discussion] RGB <-> HSV in numpy? Message-ID: Hi, I'm wondering how to do RGB <-> HSV conversion in numpy. 
I found a couple solutions through stackoverflow, but somehow they can't be used in my array format. I understand the concept of conversion, but I'm not that familiar with numpy. My source buffer format is 'RGBA' sequence. I can take it into numpy via: numpy.fromstring(data, 'B').astype('I'). So that nd[0::4] becomes the array for the red channel. After color manipulation, I'll convert it back by nd.astype('B').tostring(). How do I run RGB <-> HSV conversion on the nd array? I'd like to keep SV values in the range of 0-1, and H in 0-360. Thank you. -- Best regards, He Shiming From wardefar at iro.umontreal.ca Sat Aug 20 04:17:26 2011 From: wardefar at iro.umontreal.ca (David Warde-Farley) Date: Sat, 20 Aug 2011 04:17:26 -0400 Subject: [Numpy-discussion] RGB <-> HSV in numpy? In-Reply-To: References: Message-ID: On 2011-08-20, at 4:01 AM, He Shiming wrote: > Hi, > > I'm wondering how to do RGB <-> HSV conversion in numpy. I found a > couple solutions through stackoverflow, but somehow they can't be used > in my array format. I understand the concept of conversion, but I'm > not that familiar with numpy. > > My source buffer format is 'RGBA' sequence. I can take it into numpy > via: numpy.fromstring(data, 'B').astype('I'). So that nd[0::4] becomes > the array for the red channel. After color manipulation, I'll convert > it back by nd.astype('B').tostring(). There are functions for this available in scikits.image: http://stefanv.github.com/scikits.image/api/scikits.image.color.html Although you may need to reshape it with reshape(arr, (width, height, 4)) or something similar first. David From gael.varoquaux at normalesup.org Sat Aug 20 04:49:56 2011 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Sat, 20 Aug 2011 10:49:56 +0200 Subject: [Numpy-discussion] SVD does not converge on "clean" matrix In-Reply-To: <6d45c5e06b9e78cd9f56cf3ff2d604a5@telecom-paristech.fr> References: <203aa1b32d794c238d32cb8d29036cc2.squirrel@webmail1.telecom-paristech.fr> <1313332026.61861.YahooMailNeo@web34404.mail.mud.yahoo.com> <6d45c5e06b9e78cd9f56cf3ff2d604a5@telecom-paristech.fr> Message-ID: <20110820084956.GA16846@phare.normalesup.org> On Sun, Aug 14, 2011 at 09:15:35PM +0200, Charanpal Dhanjal wrote: > Incidentally, I am confused as to why numpy calls the lapack lite > routines - when I call numpy.show_config() it seems to have detected my > ATLAS libraries and I would have expected it to use those. My rule of thumb is to never use numpy for linear algebra, but only scipy. It avoids such confusions that I have seen so often, including with my colleagues. My 2 cents, Gael From mwwiebe at gmail.com Sat Aug 20 12:26:45 2011 From: mwwiebe at gmail.com (Mark Wiebe) Date: Sat, 20 Aug 2011 09:26:45 -0700 Subject: [Numpy-discussion] NA masks for NumPy are ready to test In-Reply-To: References: Message-ID: On Fri, Aug 19, 2011 at 11:37 AM, Bruce Southey wrote: > Hi, > Just some immediate minor observations that are really about trying to > be consistent: > > 1) Could you keep the display of the NA dtype be the same as the array? > For example, NA dtype is displayed as ' 'float64' as that is the array dtype. > >>> a=np.array([[1,2,3,np.NA], [3,4,np.nan,5]]) > >>> a > array([[ 1., 2., 3., NA], > [ 3., 4., nan, 5.]]) > >>> a.dtype > dtype('float64') > >>> a.sum() > NA(dtype=' I've implemented this: >>> a=np.array([[1,2,3,np.NA], [3,4,np.nan,5]]) >>> a array([[ 1., 2., 3., NA], [ 3., 4., nan, 5.]]) >>> a.dtype dtype('float64') >>> a.sum() NA(dtype='float64') > 2) Can the 'skipna' flag be added to the methods? 
> >>> a.sum(skipna=True) > Traceback (most recent call last): > File "", line 1, in > TypeError: 'skipna' is an invalid keyword argument for this function > >>> np.sum(a,skipna=True) > nan > > 3) Can the skipna flag be extended to exclude other non-finite cases like > NaN? > > 4) Assigning a np.NA needs a better error message but the Integer > array case is more informative: > >>> b=np.array([1,2,3,4], dtype=np.float128) > >>> b[0]=np.NA > Traceback (most recent call last): > File "", line 1, in > TypeError: float() argument must be a string or a number > > >>> j=np.array([1,2,3]) > >>> j > array([1, 2, 3]) > >>> j[0]=ina > Traceback (most recent call last): > File "", line 1, in > TypeError: int() argument must be a string or a number, not 'numpy.NAType' > Here are the new error messages in these cases: >>> b=np.array([1,2,3,4], dtype=np.float128) >>> b[0]=np.NA Traceback (most recent call last): File "", line 1, in ValueError: Cannot assign NA to an array which does not support NAs >>> j=np.array([1,2,3]) >>> j[0] = np.NA Traceback (most recent call last): File "", line 1, in ValueError: Cannot assign NA to an array which does not support NAs Cheers, Mark > > But it is nice that np.NA 'adjusts' to the insertion array: > >>> b.flags.maskna = True > >>> ana > NA(dtype=' >>> b[0]=ana > >>> b[0] > NA(dtype=' > 5) Different display depending on masked state. That is I think that > 'maskna=True' should be displayed always when flags.maskna is True : > >>> j=np.array([1,2,3], dtype=np.int8) > >>> j > array([1, 2, 3], dtype=int8) > >>> j.flags.maskna=True > >>> j > array([1, 2, 3], maskna=True, dtype=int8) > >>> j[0]=np.NA > >>> j > array([NA, 2, 3], dtype=int8) # Ithink it should still display > 'maskna=True'. > > Bruce > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mwwiebe at gmail.com Sat Aug 20 12:32:40 2011 From: mwwiebe at gmail.com (Mark Wiebe) Date: Sat, 20 Aug 2011 09:32:40 -0700 Subject: [Numpy-discussion] NA masks for NumPy are ready to test In-Reply-To: References: Message-ID: On Fri, Aug 19, 2011 at 4:52 PM, Bruce Southey wrote: > On Fri, Aug 19, 2011 at 3:05 PM, Mark Wiebe wrote: > > On Fri, Aug 19, 2011 at 11:44 AM, Charles R Harris > > wrote: > >> > >> > >> > >> > >> My main peeve is that NA is upper case ;) I suppose that could use some > >> discussion. > > > > There is some proliferation of cases in the NaN case: > >>>> np.nan > > nan > >>>> np.NAN > > nan > >>>> np.NaN > > nan > > The pros I see for NA over na are: > > * less confusion of NA vs nan (should this carry over to the np.isna > > function, should it be np.isNA according to this point?) > > * more comfortable for switching between NumPy and R when people have to > use > > both at the same time > > The main con is: > > * Inconsistent with current nan, inf printing. Here's a hackish > workaround: > >>>> np.na = np.NA > >>>> np.set_printoptions(nastr='na') > >>>> np.array([np.na, 2.0]) > > array([na, 2.]) > > What's your list of pros and cons? > > -Mark > > > >> > >> Chuck > >> > > In part I sort of like to have NA and nan since poor > eyesight/typing/editing avoiding problems dropping the last 'n'. > > Regarding nan/NAN, do you mean something like my ticket 1051? 
> http://projects.scipy.org/numpy/ticket/1051 > I do not care that much about the case (mixed case is not good) > provided that there is only one to specify these. > > Also should np.isfinite() return False for np.NA? > >>> np.isfinite([1,2,np.NA,4]) > array([ True, True, NA, True], dtype=bool) > This is correct according to the NA computational model in the NEP. An NA represents a value which exists but is unknown, and could be anything representable by the type. Thus, it could the a finite number or it could be inf, meaning the answer to isfinite could be True or it could be False, and the answer must be NA. > Anyhow, many thanks for the replies to my observations and your > amazing effect in getting this done. > Thanks for taking the time to take the software for a spin, I appreciate your feedback! -Mark > > Bruce > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ben.root at ou.edu Sat Aug 20 16:17:02 2011 From: ben.root at ou.edu (Benjamin Root) Date: Sat, 20 Aug 2011 15:17:02 -0500 Subject: [Numpy-discussion] Can't mix np.newaxis with boolean indexing In-Reply-To: References: Message-ID: On Sat, Aug 20, 2011 at 2:47 AM, Olivier Verdier wrote: > Your syntax is not as intuitive as you may think. > > Suppose I take a matrix instead > > a = np.array([1,2,3,4]).reshape(2,2) > b = (a>1) # np.array([[False,True],[True,True]]) > > How would a[b,np.newaxis] be supposed to work? > > Note that other (simple) slices work perfectly with newaxis, such as > a[:1,np.newaxis] > > == Olivier > > Personally, I would have expected it to flatten the results and added a dimension: a[b, np.newaxis] array([[2], [3], [4]]) or a[np.newaxis, b] array([[2, 3, 4]]) I mean, it flattens the results anyway when doing boolean indexing for multi-dimensional arrays, so someone doing that should expect that anyway. At the very least, I think maybe we could have a better error message than just saying that long() can't take a NoneType? Thanks, Ben Root -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris at simplistix.co.uk Sat Aug 20 18:37:19 2011 From: chris at simplistix.co.uk (Chris Withers) Date: Sat, 20 Aug 2011 15:37:19 -0700 Subject: [Numpy-discussion] Decimal arrays? Message-ID: <4E50371F.1070105@simplistix.co.uk> Hi All, What's the best type of array to use for decimal values? (ie: where I care about precision and want to avoid any possible rounding errors) cheers, Chris -- Simplistix - Content Management, Batch Processing & Python Consulting - http://www.simplistix.co.uk From robert.kern at gmail.com Sat Aug 20 18:38:20 2011 From: robert.kern at gmail.com (Robert Kern) Date: Sat, 20 Aug 2011 17:38:20 -0500 Subject: [Numpy-discussion] Decimal arrays? In-Reply-To: <4E50371F.1070105@simplistix.co.uk> References: <4E50371F.1070105@simplistix.co.uk> Message-ID: On Sat, Aug 20, 2011 at 17:37, Chris Withers wrote: > Hi All, > > What's the best type of array to use for decimal values? > (ie: where I care about precision and want to avoid any possible > rounding errors) dtype=object -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." ? 
-- Umberto Eco From chris at simplistix.co.uk Sat Aug 20 18:49:40 2011 From: chris at simplistix.co.uk (Chris Withers) Date: Sat, 20 Aug 2011 15:49:40 -0700 Subject: [Numpy-discussion] Decimal arrays? In-Reply-To: References: <4E50371F.1070105@simplistix.co.uk> Message-ID: <4E503A04.8070600@simplistix.co.uk> On 20/08/2011 15:38, Robert Kern wrote: > On Sat, Aug 20, 2011 at 17:37, Chris Withers wrote: >> Hi All, >> >> What's the best type of array to use for decimal values? >> (ie: where I care about precision and want to avoid any possible >> rounding errors) > > dtype=object Thanks! What are the performance implications, if any, of this array type? Chris -- Simplistix - Content Management, Batch Processing & Python Consulting - http://www.simplistix.co.uk From chris at simplistix.co.uk Sat Aug 20 19:18:55 2011 From: chris at simplistix.co.uk (Chris Withers) Date: Sat, 20 Aug 2011 16:18:55 -0700 Subject: [Numpy-discussion] saving groups of numpy arrays to disk Message-ID: <4E5040DF.9090303@simplistix.co.uk> Hi All, I've got a tree of nested dicts that at their leaves end in numpy arrays of identical sizes. What's the easiest way to persist these to disk so that I can pick up with them where I left off? What's the most "correct" way to do so? I'm using IPython if that makes things easier... I had wondered about PyTables, but that seems a bit too heavyweight for this, unless I'm missing something? cheers, Chris -- Simplistix - Content Management, Batch Processing & Python Consulting - http://www.simplistix.co.uk From charlesr.harris at gmail.com Sat Aug 20 19:41:49 2011 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 20 Aug 2011 17:41:49 -0600 Subject: [Numpy-discussion] Decimal arrays? In-Reply-To: <4E503A04.8070600@simplistix.co.uk> References: <4E50371F.1070105@simplistix.co.uk> <4E503A04.8070600@simplistix.co.uk> Message-ID: On Sat, Aug 20, 2011 at 4:49 PM, Chris Withers wrote: > On 20/08/2011 15:38, Robert Kern wrote: > > On Sat, Aug 20, 2011 at 17:37, Chris Withers > wrote: > >> Hi All, > >> > >> What's the best type of array to use for decimal values? > >> (ie: where I care about precision and want to avoid any possible > >> rounding errors) > > > > dtype=object > > Thanks! > > What are the performance implications, if any, of this array type? > > It will be slower, the effect depends on how much data/computation you have. You need to look into using the decimal objects in decimal module, i.e., import decimal. Note that 1/7 still isn't going to be exact in decimals. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Sat Aug 20 20:08:04 2011 From: robert.kern at gmail.com (Robert Kern) Date: Sat, 20 Aug 2011 19:08:04 -0500 Subject: [Numpy-discussion] Decimal arrays? In-Reply-To: <4E503A04.8070600@simplistix.co.uk> References: <4E50371F.1070105@simplistix.co.uk> <4E503A04.8070600@simplistix.co.uk> Message-ID: On Sat, Aug 20, 2011 at 17:49, Chris Withers wrote: > On 20/08/2011 15:38, Robert Kern wrote: >> On Sat, Aug 20, 2011 at 17:37, Chris Withers ?wrote: >>> Hi All, >>> >>> What's the best type of array to use for decimal values? >>> (ie: where I care about precision and want to avoid any possible >>> rounding errors) >> >> dtype=object > > Thanks! > > What are the performance implications, if any, of this array type? It will be slower than floats, obviously, because there will be several C function calls and plenty of extra instructions for each operation on each element. 
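For instance, a small sketch of what the object-dtype route looks like in practice (exact reprs may vary between versions):

>>> import numpy as np
>>> from decimal import Decimal
>>> a = np.array([Decimal('0.1'), Decimal('0.2'), Decimal('0.3')], dtype=object)
>>> a.sum()          # arithmetic dispatches to Decimal, so this stays exact
Decimal('0.6')
>>> a * 2
array([Decimal('0.2'), Decimal('0.4'), Decimal('0.6')], dtype=object)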
But it will be somewhat faster than looping in Python. Note that decimal.Decimal objects are implemented in pure Python, so you will also be paying for Python function call overhead and other costs going through ceval.c several times over. You may want to try the cdecimal package: http://pypi.python.org/pypi/cdecimal/ This will provide an extension module defining an extension type implemented in C. You can avoid the ceval.c overhead entirely during the array operation. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." ? -- Umberto Eco From heshiming at gmail.com Sun Aug 21 00:52:12 2011 From: heshiming at gmail.com (He Shiming) Date: Sun, 21 Aug 2011 12:52:12 +0800 Subject: [Numpy-discussion] RGB <-> HSV in numpy? In-Reply-To: References: Message-ID: On Sat, Aug 20, 2011 at 4:17 PM, David Warde-Farley wrote: > > There are functions for this available in scikits.image: > > http://stefanv.github.com/scikits.image/api/scikits.image.color.html > > Although you may need to reshape it with reshape(arr, (width, height, 4)) or something similar first. > > David Thanks, I'll check it out. -- Best regards, He Shiming From paul.anton.letnes at gmail.com Sun Aug 21 03:19:30 2011 From: paul.anton.letnes at gmail.com (Paul Anton Letnes) Date: Sun, 21 Aug 2011 08:19:30 +0100 Subject: [Numpy-discussion] saving groups of numpy arrays to disk In-Reply-To: <4E5040DF.9090303@simplistix.co.uk> References: <4E5040DF.9090303@simplistix.co.uk> Message-ID: <83268EE4-194B-4EF0-B370-8F3D1EA6EA6D@gmail.com> Hi! On 21. aug. 2011, at 00.18, Chris Withers wrote: > Hi All, > > I've got a tree of nested dicts that at their leaves end in numpy arrays > of identical sizes. > > What's the easiest way to persist these to disk so that I can pick up > with them where I left off? Probably giving them names like trunk_branch_leaf.txt with numpy.savetxt, if you want it quick and dirty. Or possibly, use numpy.savez directly on your dict. > What's the most "correct" way to do so? > > I'm using IPython if that makes things easier... > > I had wondered about PyTables, but that seems a bit too heavyweight for > this, unless I'm missing something? In my (perhaps limited) experience, hdf5 is great for this. I personally use h5py, I believe it is a little lighter. You get the "tree structure" for free in something like a directory structure: /branch/leaf /trunk/branch/leaf etc. Cheers Paul From ben_w_123 at yahoo.co.uk Sun Aug 21 04:53:05 2011 From: ben_w_123 at yahoo.co.uk (Ben Walsh) Date: Sun, 21 Aug 2011 09:53:05 +0100 (BST) Subject: [Numpy-discussion] Segfault for np.lookfor In-Reply-To: References: Message-ID: Hi My bad. Very sorry about that, guys. There's a patch for this here: https://github.com/walshb/numpy/tree/fix_np_lookfor_segv And I submitted a pull request. I'll add something to the tests too when I have a little more time. 
Cheers Ben > ------------------------------ > > Message: 3 > Date: Tue, 16 Aug 2011 12:15:22 -0700 > From: Matthew Brett > Subject: Re: [Numpy-discussion] Segfault for np.lookfor > To: Discussion of Numerical Python > Message-ID: > > Content-Type: text/plain; charset=ISO-8859-1 > >> >> I opened ticket #1937 for this > >> From git-bisect it looks like the culprit is: > > feb8079070b8a659d7eee1b4acbddf470fd8a81d is the first bad commit > commit feb8079070b8a659d7eee1b4acbddf470fd8a81d > Author: Ben Walsh > Date: Sun Jul 10 12:52:52 2011 +0100 > > BUT: Stop _array_find_type trying to make every list element a > subtype of bool. > > Just to remind me, my procedure was: > > <~/tmp/testfor.py> > #!/usr/bin/env python > import sys > from functools import partial > from subprocess import check_call, Popen, PIPE, CalledProcessError > > caller = partial(check_call, shell=True) > popener = partial(Popen, stdout=PIPE, stderr=PIPE, shell=True) > > try: > caller('git clean -fxd') > caller('python setup.py build_ext -i') > except CalledProcessError: > sys.exit(125) # untestable > > proc = popener('python -c "%s"' % > """import sys > import numpy as np > np.lookfor('cos', output=sys.stdout) > """) > > stdout, stderr = proc.communicate() > if 'Segmentation fault' in stderr: > sys.exit(1) # bad > sys.exit(0) # good > > > Then, I established the v1.6.1 did not have the segfault, and (man git-bisect): > > git co main-master # current upstream master > git bisect start HEAD v1.6.1 -- > git bisect run ~/tmp/testfor.py > > See y'all, > > Matthew > From pav at iki.fi Sun Aug 21 08:24:46 2011 From: pav at iki.fi (Pauli Virtanen) Date: Sun, 21 Aug 2011 12:24:46 +0000 (UTC) Subject: [Numpy-discussion] saving groups of numpy arrays to disk References: <4E5040DF.9090303@simplistix.co.uk> Message-ID: On Sat, 20 Aug 2011 16:18:55 -0700, Chris Withers wrote: > I've got a tree of nested dicts that at their leaves end in numpy arrays > of identical sizes. > > What's the easiest way to persist these to disk so that I can pick up > with them where I left off? Depends on your requirements. You can use Python pickling, if you do *not* have a requirement for: - real persistence, i.e., being able to easily read the data years later - a standard data format - access from non-Python programs - safety against malicious parties (unpickling can execute some code in the input -- although this is possible to control) then you can use Python pickling: import pickle file = open('out.pck', 'wb') pickle.dump(file, tree, protocol=pickle.HIGHEST_PROTOCOL) file.close() file = open('out.pck', 'rb') tree = pickle.load(file) file.close() This should just work (TM) directly with your tree-of-dicts-and-arrays. > What's the most "correct" way to do so? > > I'm using IPython if that makes things easier... > > I had wondered about PyTables, but that seems a bit too heavyweight for > this, unless I'm missing something? If I had one or more of the requirements listed above, I'd use the HDF5 format, via either PyTables or h5py. If I'd just need to cache the trees, then I'd use pickling. I think the only reason to consider heavy-weighedness is distribution: does your target audience have these libraries already installed (they are pre-installed in several Python-for-science distributions), and how difficult would it be for you to ship them with your stuff, or to require the users to install them. 
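For the HDF5 route, a rough sketch (assuming h5py is installed; the group/dataset layout shown is just one possible mapping) of walking a tree of nested dicts into groups and datasets and back:

import h5py
import numpy as np

def save_tree(group, tree):
    # dicts become HDF5 groups, array leaves become datasets
    for key, value in tree.items():
        if isinstance(value, dict):
            save_tree(group.create_group(key), value)
        else:
            group.create_dataset(key, data=np.asarray(value))

def load_tree(group):
    out = {}
    for key, value in group.items():
        if isinstance(value, h5py.Group):
            out[key] = load_tree(value)
        else:
            out[key] = value[...]   # read the dataset back into memory
    return out

tree = {'run1': {'pos': np.zeros((4, 3)), 'vel': np.ones((4, 3))},
        'run2': {'pos': np.arange(12.0).reshape(4, 3)}}

f = h5py.File('tree.h5', 'w')
save_tree(f, tree)
f.close()

f = h5py.File('tree.h5', 'r')
restored = load_tree(f)
f.close()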
-- Pauli Virtanen From torgil.svensson at gmail.com Sun Aug 21 16:07:16 2011 From: torgil.svensson at gmail.com (Torgil Svensson) Date: Sun, 21 Aug 2011 22:07:16 +0200 Subject: [Numpy-discussion] Can't mix np.newaxis with boolean indexing In-Reply-To: References: Message-ID: Since the result is one-dimensional after using boolean indexing you can always do: a[b][:, np.newaxis] array([[2], [3], [4]]) a[b][np.newaxis, :] array([[2, 3, 4]]) //Torgil On Sat, Aug 20, 2011 at 10:17 PM, Benjamin Root wrote: > On Sat, Aug 20, 2011 at 2:47 AM, Olivier Verdier wrote: >> >> Your syntax is not as intuitive as you may think. >> >> Suppose I take a matrix instead >> >> a = np.array([1,2,3,4]).reshape(2,2) >> b = (a>1) # np.array([[False,True],[True,True]]) >> >> How would a[b,np.newaxis] be supposed to work? >> >> Note that other (simple) slices work perfectly with newaxis, such as >> a[:1,np.newaxis] >> >> == Olivier >> > > Personally, I would have expected it to flatten the results and added a > dimension: > > a[b, np.newaxis] > array([[2], > ???????? [3], > ???????? [4]]) > > or > > a[np.newaxis, b] > array([[2, 3, 4]]) > > I mean, it flattens the results anyway when doing boolean indexing for > multi-dimensional arrays, so someone doing that should expect that anyway. > > At the very least, I think maybe we could have a better error message than > just saying that long() can't take a NoneType? > > Thanks, > Ben Root > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > From ben.root at ou.edu Sun Aug 21 16:39:02 2011 From: ben.root at ou.edu (Benjamin Root) Date: Sun, 21 Aug 2011 15:39:02 -0500 Subject: [Numpy-discussion] Can't mix np.newaxis with boolean indexing In-Reply-To: References: Message-ID: On Sunday, August 21, 2011, Torgil Svensson wrote: > Since the result is one-dimensional after using boolean indexing you > can always do: > > a[b][:, np.newaxis] > array([[2], > [3], > [4]]) > > a[b][np.newaxis, :] > array([[2, 3, 4]]) > > //Torgil Correct, which I already noted as a workaround in my first email. The point I am making is that that shouldn't be necessary because of generic programming concepts, or a better error message should be emitted in case a developer didn't know that he was doing Boolean indexing. Ben Root -------------- next part -------------- An HTML attachment was scrubbed... URL: From heshiming at gmail.com Sun Aug 21 23:39:15 2011 From: heshiming at gmail.com (He Shiming) Date: Mon, 22 Aug 2011 11:39:15 +0800 Subject: [Numpy-discussion] RGB <-> HSV in numpy? In-Reply-To: References: Message-ID: > On Sat, Aug 20, 2011 at 4:17 PM, David Warde-Farley > wrote: > > Thanks, I'll check it out. > > -- > Best regards, > He Shiming > Hi again. Project scikits.image appeared to be difficult to install under ubuntu. It complains about something related to OpenCV, and I didn't see any option to compile without it. I'm wondering if there are any simpler solutions, without using scikits.image or scipy, just numpy plus calculations. All I'm trying to do is to convert this algorithm: http://code.activestate.com/recipes/576919-python-rgb-and-hsv-conversion/ to numpy flavor. -- Best regards, He Shiming From ralf.gommers at googlemail.com Mon Aug 22 02:21:33 2011 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Mon, 22 Aug 2011 08:21:33 +0200 Subject: [Numpy-discussion] RGB <-> HSV in numpy? 
In-Reply-To: References: Message-ID: On Mon, Aug 22, 2011 at 5:39 AM, He Shiming wrote: > > On Sat, Aug 20, 2011 at 4:17 PM, David Warde-Farley > > wrote: > > > > Thanks, I'll check it out. > > > > -- > > Best regards, > > He Shiming > > > > Hi again. Project scikits.image appeared to be difficult to install > under ubuntu. It complains about something related to OpenCV, and I > didn't see any option to compile without it. I'm wondering if there > are any simpler solutions, without using scikits.image or scipy, just > numpy plus calculations. All I'm trying to do is to convert this > algorithm: > http://code.activestate.com/recipes/576919-python-rgb-and-hsv-conversion/ > to numpy flavor. > > You can use this file standalone without installing scikits.image: https://github.com/stefanv/scikits.image/blob/master/scikits/image/color/colorconv.py Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From heshiming at gmail.com Mon Aug 22 02:29:29 2011 From: heshiming at gmail.com (He Shiming) Date: Mon, 22 Aug 2011 14:29:29 +0800 Subject: [Numpy-discussion] RGB <-> HSV in numpy? In-Reply-To: References: Message-ID: On Mon, Aug 22, 2011 at 2:21 PM, Ralf Gommers wrote: > > > You can use this file standalone without installing scikits.image: > https://github.com/stefanv/scikits.image/blob/master/scikits/image/color/colorconv.py > > Ralf > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > Thanks Ralf. I've managed to extract the conversion routines from this colorconv.py, and adapted it on RGBA sequence. -- Best regards, He Shiming From mdickinson at enthought.com Mon Aug 22 03:18:57 2011 From: mdickinson at enthought.com (Mark Dickinson) Date: Mon, 22 Aug 2011 08:18:57 +0100 Subject: [Numpy-discussion] Decimal arrays? In-Reply-To: References: <4E50371F.1070105@simplistix.co.uk> <4E503A04.8070600@simplistix.co.uk> Message-ID: On Sun, Aug 21, 2011 at 1:08 AM, Robert Kern wrote: > You may want to try the cdecimal package: > > ?http://pypi.python.org/pypi/cdecimal/ I'll second this suggestion. cdecimal is an extraordinarily carefully written and well-tested (almost) drop-in replacement for the decimal module, and well worth a try. It would probably be in the Python standard library by now if anyone had had proper time to review it... Mark From konrad.hinsen at fastmail.net Mon Aug 22 03:36:15 2011 From: konrad.hinsen at fastmail.net (Konrad Hinsen) Date: Mon, 22 Aug 2011 09:36:15 +0200 Subject: [Numpy-discussion] Bug or feature? Message-ID: <00EE14FD-C521-4197-8C94-5B7E53EAA246@fastmail.net> Hi everyone, I just stumbled on a behavior in NumPy for which I can't find an explanation in the documentation. I wonder whether this is a bug or an undocumented (or badly documented) feature: -------------------------------------------------------------------------------------- import numpy t = numpy.dtype([("rotation", numpy.float64, (3, 3)), ("translation", numpy.float64, (3,))]) # works a1 = numpy.array([], dtype=t) # doesn't work a2 = numpy.array((), dtype=t) # -> ValueError: size of tuple must match number of fields. -------------------------------------------------------------------------------------- According to my understanding of how numpy.array should work, it shouldn't make a difference if the first argument is a list or a tuple, but in this case there is a difference. Konrad. 
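For reference, a tuple whose length matches the number of fields is accepted, because it is then read as a single record -- a small, untested illustration reusing the dtype t from above:

# hypothetical follow-up, using the dtype t defined above
rec = numpy.array((numpy.eye(3), numpy.zeros(3)), dtype=t)       # 0-d array, one record
a3 = numpy.array([(numpy.eye(3), numpy.zeros(3))] * 2, dtype=t)  # shape (2,), list of records

The empty tuple fails only because its size (0) does not match the number of fields (2), which is what the ValueError above is complaining about.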
From stefan-usenet at bytereef.org Mon Aug 22 08:30:34 2011 From: stefan-usenet at bytereef.org (Stefan Krah) Date: Mon, 22 Aug 2011 14:30:34 +0200 Subject: [Numpy-discussion] memoryview shape/strides representation for ndim = 0 Message-ID: <20110822123034.GA7743@sleipnir.bytereef.org> Hello, Numpy arrays and memoryview currently have different representations for shape and strides if ndim = 0: >>> from numpy import * >>> x = array(9, int32) >>> x.ndim 0 >>> x.shape () >>> x.strides () >>> m = memoryview(x) >>> m.ndim 0L >>> m.shape is None True >>> m.strides is None True I think the Numpy representation is nicer. Also, I think that memoryviews should attempt to mimic the underlying object as closely as possible. Since the ndim = 0 case probably only occurs in Numpy, it might be possible to change the representation in memoryview. Travis, was the "shape is None" representation used for compatibility with ctypes? Would it be possible or advisable to use the Numpy representation? Stefan Krah From mdickinson at enthought.com Mon Aug 22 08:35:56 2011 From: mdickinson at enthought.com (Mark Dickinson) Date: Mon, 22 Aug 2011 13:35:56 +0100 Subject: [Numpy-discussion] memoryview shape/strides representation for ndim = 0 In-Reply-To: <20110822123034.GA7743@sleipnir.bytereef.org> References: <20110822123034.GA7743@sleipnir.bytereef.org> Message-ID: On Mon, Aug 22, 2011 at 1:30 PM, Stefan Krah wrote: > Numpy arrays and memoryview currently have different representations > for shape and strides if ndim = 0: > >>>> from numpy import * >>>> x = array(9, int32) >>>> x.ndim > 0 >>>> x.shape > () >>>> x.strides > () >>>> m = memoryview(x) >>>> m.ndim > 0L >>>> m.shape is None > True >>>> m.strides is None > True > > > I think the Numpy representation is nicer. Also, I think that memoryviews > should attempt to mimic the underlying object as closely as possible. Agreed on both points. If there's no good reason for m.shape and m.strides to be None, I think it should be changed. Mark From teoliphant at gmail.com Mon Aug 22 08:50:06 2011 From: teoliphant at gmail.com (Travis Oliphant) Date: Mon, 22 Aug 2011 07:50:06 -0500 Subject: [Numpy-discussion] Bug or feature? In-Reply-To: <00EE14FD-C521-4197-8C94-5B7E53EAA246@fastmail.net> References: <00EE14FD-C521-4197-8C94-5B7E53EAA246@fastmail.net> Message-ID: <32C3323B-4398-41A3-9A0D-3C8006DB14E6@enthought.com> This goes into the category of "feature". Structured arrays use tuples to indicate a record. So, (only) when using structured arrays as a dtype, there is a difference between lists and tuples. In this case, array sees the tuple and expects it to have 2 elements to match the number of fields in 2. Best, -Travis On Aug 22, 2011, at 2:36 AM, Konrad Hinsen wrote: > Hi everyone, > > I just stumbled on a behavior in NumPy for which I can't find an > explanation in the documentation. I wonder whether this is a bug or an > undocumented (or badly documented) feature: > > -------------------------------------------------------------------------------------- > import numpy > > t = numpy.dtype([("rotation", numpy.float64, (3, 3)), > ("translation", numpy.float64, (3,))]) > > # works > a1 = numpy.array([], dtype=t) > > # doesn't work > a2 = numpy.array((), dtype=t) > # -> ValueError: size of tuple must match number of fields. 
> -------------------------------------------------------------------------------------- > > According to my understanding of how numpy.array should work, it > shouldn't make a difference if the first argument is a list or a > tuple, but in this case there is a difference. > > Konrad. > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion --- Travis Oliphant Enthought, Inc. oliphant at enthought.com 1-512-536-1057 http://www.enthought.com From chris at simplistix.co.uk Mon Aug 22 11:07:11 2011 From: chris at simplistix.co.uk (Chris Withers) Date: Mon, 22 Aug 2011 08:07:11 -0700 Subject: [Numpy-discussion] Decimal arrays? In-Reply-To: References: <4E50371F.1070105@simplistix.co.uk> <4E503A04.8070600@simplistix.co.uk> Message-ID: <4E52709F.4050300@simplistix.co.uk> On 22/08/2011 00:18, Mark Dickinson wrote: > On Sun, Aug 21, 2011 at 1:08 AM, Robert Kern wrote: >> You may want to try the cdecimal package: >> >> http://pypi.python.org/pypi/cdecimal/ > > I'll second this suggestion. cdecimal is an extraordinarily carefully > written and well-tested (almost) drop-in replacement for the decimal > module, and well worth a try. It would probably be in the Python > standard library by now if anyone had had proper time to review it... Who would need to review it? I'm surprised this isn't in EPD... any ideas why? cheers, Chris -- Simplistix - Content Management, Batch Processing & Python Consulting - http://www.simplistix.co.uk From mdickinson at enthought.com Mon Aug 22 11:10:23 2011 From: mdickinson at enthought.com (Mark Dickinson) Date: Mon, 22 Aug 2011 16:10:23 +0100 Subject: [Numpy-discussion] Decimal arrays? In-Reply-To: <4E52709F.4050300@simplistix.co.uk> References: <4E50371F.1070105@simplistix.co.uk> <4E503A04.8070600@simplistix.co.uk> <4E52709F.4050300@simplistix.co.uk> Message-ID: On Mon, Aug 22, 2011 at 4:07 PM, Chris Withers wrote: > On 22/08/2011 00:18, Mark Dickinson wrote: >> >> On Sun, Aug 21, 2011 at 1:08 AM, Robert Kern >> ?wrote: >>> >>> You may want to try the cdecimal package: >>> >>> ?http://pypi.python.org/pypi/cdecimal/ >> >> I'll second this suggestion. ?cdecimal is an extraordinarily carefully >> written and well-tested (almost) drop-in replacement for the decimal >> module, and well worth a try. ?It would probably be in the Python >> standard library by now if anyone had had proper time to review it... > > Who would need to review it? Well, anyone who has the time and understands the domain, really; it's just useful to have a second pair of eyes going through the code. Putting several thousands of lines of unreviewed C code into the Python standard library is a bit of a no-no. Mark From amcmorl at gmail.com Mon Aug 22 12:23:00 2011 From: amcmorl at gmail.com (Angus McMorland) Date: Mon, 22 Aug 2011 12:23:00 -0400 Subject: [Numpy-discussion] numpy segfaults with ctypes In-Reply-To: References: Message-ID: On 19 August 2011 16:11, Matthew Brett wrote: > Hi, > > On Fri, Aug 19, 2011 at 1:04 PM, Angus McMorland wrote: >> Hi all, >> >> I'm giving this email a new subject, in case that helps it catch the >> attention of someone who can fix my problem. I currently cannot >> upgrade numpy from git to any date more recent than 10 July. Git >> commit feb8079070b8a659d7ee is the first that causes the problem >> (according to github, the commit was authored by walshb and committed >> by m-paradox, in case that jogs anyone's memory). 
I've tried taking a >> look at the code diff, but I'm afraid I'm just a user, rather than a >> developer, and it didn't make much sense. >> >> My problem is that python segfaults when I run it with the following code: >> >>> from ctypes import Structure, c_double >>> >>> #-- copied out of an xml2py generated file >>> class S(Structure): >>> ? ?pass >>> S._pack_ = 4 >>> S._fields_ = [ >>> ? ?('field', c_double * 2), >>> ? ] >>> #-- >>> >>> import numpy as np >>> print np.version.version >>> s = S() >>> print "S", np.asarray(s.field) > > Just to say, that that commit is also the commit that causes a > segfault for np.lookfor: > > http://www.mail-archive.com/numpy-discussion at scipy.org/msg33114.html > http://projects.scipy.org/numpy/ticket/1937 > > The latter ticket is closed because Mark's missing-data development > branch does not have the segfault. > > I guess you could try that branch and see whether it fixes the problem? > > I guess also that means we'll have to merge in the missing data branch > in order to fix the problem. Thanks for the reply Matthew. The latest commit d7b12a3 fixes the problem. Angus. > See you, > > matthew > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -- AJC McMorland Post-doctoral research fellow Neurobiology, University of Pittsburgh From matthew.brett at gmail.com Mon Aug 22 12:59:02 2011 From: matthew.brett at gmail.com (Matthew Brett) Date: Mon, 22 Aug 2011 09:59:02 -0700 Subject: [Numpy-discussion] Segfault for np.lookfor In-Reply-To: References: Message-ID: Hi, On Sun, Aug 21, 2011 at 1:53 AM, Ben Walsh wrote: > > Hi > > My bad. Very sorry about that, guys. > > There's a patch for this here: > > https://github.com/walshb/numpy/tree/fix_np_lookfor_segv > > And I submitted a pull request. I'll add something to the tests too when I > have a little more time. Thanks a lot - no criticism intended - just life in the wilds of tracking trunk... Cheers, Matthew From robert.kern at gmail.com Mon Aug 22 16:51:03 2011 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 22 Aug 2011 15:51:03 -0500 Subject: [Numpy-discussion] Decimal arrays? In-Reply-To: <4E52709F.4050300@simplistix.co.uk> References: <4E50371F.1070105@simplistix.co.uk> <4E503A04.8070600@simplistix.co.uk> <4E52709F.4050300@simplistix.co.uk> Message-ID: On Mon, Aug 22, 2011 at 10:07, Chris Withers wrote: > On 22/08/2011 00:18, Mark Dickinson wrote: >> On Sun, Aug 21, 2011 at 1:08 AM, Robert Kern ?wrote: >>> You may want to try the cdecimal package: >>> >>> ? http://pypi.python.org/pypi/cdecimal/ >> >> I'll second this suggestion. ?cdecimal is an extraordinarily carefully >> written and well-tested (almost) drop-in replacement for the decimal >> module, and well worth a try. ?It would probably be in the Python >> standard library by now if anyone had had proper time to review it... > > Who would need to review it? > > I'm surprised this isn't in EPD... any ideas why? No one has asked for it, to my knowledge. We do provide it in our PyPI repository, so $ enpkg cdecimal should install it if you are an EPD subscriber. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." ? 
-- Umberto Eco From konrad.hinsen at fastmail.net Tue Aug 23 02:28:27 2011 From: konrad.hinsen at fastmail.net (Konrad Hinsen) Date: Tue, 23 Aug 2011 08:28:27 +0200 Subject: [Numpy-discussion] Bug or feature? In-Reply-To: <32C3323B-4398-41A3-9A0D-3C8006DB14E6@enthought.com> References: <00EE14FD-C521-4197-8C94-5B7E53EAA246@fastmail.net> <32C3323B-4398-41A3-9A0D-3C8006DB14E6@enthought.com> Message-ID: <00A2AF9E-5E2B-4042-816E-80D4934675C9@fastmail.net> On 22 Aug 2011, at 14:50, Travis Oliphant wrote: > This goes into the category of "feature". Structured arrays use > tuples to indicate a record. So, (only) when using structured > arrays as a dtype, there is a difference between lists and > tuples. In this case, array sees the tuple and expects it to have > 2 elements to match the number of fields in 2. Thanks, that sounds reasonable. But is this role of tuples in the creation of structured arrays documented anywhere? The documentation on structured arrays concentrates on specifying the dtype. All I could find about array construction is a few examples. Konrad. From stefan-usenet at bytereef.org Tue Aug 23 08:10:48 2011 From: stefan-usenet at bytereef.org (Stefan Krah) Date: Tue, 23 Aug 2011 14:10:48 +0200 Subject: [Numpy-discussion] PyBUF_SIMPLE/PyBUF_FORMAT: casts to unsigned bytes Message-ID: <20110823121048.GA14594@sleipnir.bytereef.org> Hello, PEP-3118 presumably intended that a PyBUF_SIMPLE request should cast the original buffer's data type to 'B' (unsigned bytes). Here is a one-dimensional example that currently occurs in Lib/test/test_multiprocessing: >>> import array, io >>> a = array.array('i', [1,2,3,4,5]) >>> m = memoryview(a) >>> m.format 'i' >>> buf = io.BytesIO(bytearray(5*8)) >>> buf.readinto(m) buf.readinto() calls PyObject_AsWriteBuffer(), which requests a simple buffer from the memoryview, thus casting the 'i' data type to the implied type 'B'. The consumer can see that a cast has occurred because the new buffer's format field is NULL. This seems fine for the one-dimensional case. Numpy currently also allows such casts for multidimensional contiguous and non-contiguous arrays. See below for the examples; I don't want to distract from the main point of the post, which is this: I'm seeking a clear specification for the Python documentation that determines under what circumstances casts to 'B' should succeed. I'll formulate the points as statements for clarity, but in fact they are also questions: 1) An exporter of a C-contiguous array with ndim <= 1 MUST honor a PyBUF_SIMPLE request, setting format, shape and strides to NULL and itemsize to 1. As a corner case, an array with ndim = 0, format = "L" (or other) would also morph into a buffer of unsigned bytes. test_ctypes currently makes use of this. 2) An exporter of a C-contiguous buffer with ndim > 1 MUST honor a PyBUF_SIMPLE request, setting format, shape, and strides to NULL and itemsize to 1. 3) An exporter of a buffer that is not C-contiguous MUST raise BufferError in response to a PyBUF_SIMPLE request. Why am I looking for such rigid rules? The problem with memoryview is that it has to act as a re-exporter itself. For several reasons (performance of chained memoryviews, garbage collection, early release, etc.) it has been decided that the new memoryview object has a managed buffer that takes a snapshot of the original exporter's buffer (See: http://bugs.python.org/issue10181). Now, since getbuffer requests to the memoryview object cannot be redirected to the original object, strict rules are needed for memory_getbuf(). 
Could you agree with these rules? Point 2) isn't clear from the PEP itself. I assumed it because Numpy currently allows it, and it appears harmless. Stefan Krah Examples: ========= Cast a multidimensional contiguous array: ----------------------------------------- I think itemsize in the result should be 1. [_testbuffer.ndarray is from http://hg.python.org/features/pep-3118#memoryview] >>> from _testbuffer import * >>> from numpy import * >>> from _testbuffer import ndarray as pyarray >>> >>> exporter = ndarray(shape=[3,4], dtype="L") # Issue a PyBUF_SIMPLE request to 'exporter' and act as a re-exporter: >>> x = pyarray(exporter, getbuf=PyBUF_SIMPLE) >>> x.len 96 >>> x.shape () >>> x.strides () >>> x.format '' >>> x.itemsize # I think this should be 1, not 8. 8 Cast a multidimensional non-contiguous array: --------------------------------------------- This is clearly not right, since y.buf points to a location that the consumer cannot handle without shape and strides. >>> nd = ndarray(buffer=bytearray(96), shape=[3,4], dtype="L") [182658 refs] >>> exporter = nd[::-1, ::-2] [182661 refs] >>> exporter array([[0, 0], [0, 0], [0, 0]], dtype=uint64) [182659 refs] >>> y = pyarray(exporter, getbuf=PyBUF_SIMPLE) [182665 refs] >>> y.len 48 [182666 refs] >>> y.strides () [182666 refs] >>> y.shape () [182666 refs] >>> y.format '' [182666 refs] >>> y.itemsize 8 [182666 refs] From stefan-usenet at bytereef.org Tue Aug 23 08:16:40 2011 From: stefan-usenet at bytereef.org (Stefan Krah) Date: Tue, 23 Aug 2011 14:16:40 +0200 Subject: [Numpy-discussion] memoryview shape/strides representation for ndim = 0 In-Reply-To: References: <20110822123034.GA7743@sleipnir.bytereef.org> Message-ID: <20110823121640.GB14594@sleipnir.bytereef.org> Mark Dickinson wrote: > On Mon, Aug 22, 2011 at 1:30 PM, Stefan Krah wrote: > > Numpy arrays and memoryview currently have different representations > > for shape and strides if ndim = 0: [...] > > I think the Numpy representation is nicer. Also, I think that memoryviews > > should attempt to mimic the underlying object as closely as possible. > > Agreed on both points. If there's no good reason for m.shape and > m.strides to be None, I think it should be changed. Excellent, I'll go ahead with it then (in the feature repo). Stefan Krah From nadavh at visionsense.com Tue Aug 23 10:33:16 2011 From: nadavh at visionsense.com (Nadav Horesh) Date: Tue, 23 Aug 2011 07:33:16 -0700 Subject: [Numpy-discussion] Wrong treatment of byte order? Message-ID: <26FC23E7C398A64083C980D16001012D246DFC5FB5@VA3DIAXVS361.RED001.local> My system is a 64 bit gentoo linux on core i7 machine. Numpy version 1.6.1 and pyton(s) 2.7.2 and 3.2.1 Problem summary: I tried t invert a matrix of explicit little endian byte-order and got an error. The inversion run properly with a native byte order, and I get a wrong answer with not error message when the matrix is set to big-endian. 
mat is a 3x3 float64 array >> import numpy as N >>> mat.dtype.byteorder '<' >>> N.linalg.inv(mat) # Refuse to ibvert Traceback (most recent call last): File "", line 1, in N.linalg.inv(mat) File "/usr/lib64/python2.7/site-packages/numpy/linalg/linalg.py", line 445, in inv return wrap(solve(a, identity(a.shape[0], dtype=a.dtype))) File "/usr/lib64/python2.7/site-packages/numpy/linalg/linalg.py", line 326, in solve results = lapack_routine(n_eq, n_rhs, a, n_eq, pivots, b, n_eq, 0) LapackError: Parameter a has non-native byte order in lapack_lite.dgesv >>> N.linalg.inv(mat.newbyteorder('=')) # OK array([[ 0.09234453, 0.46163744, 0.2713108 ], [ 0.48886135, 0.51230859, 0.2277598 ], [ 0.48303131, 0.82571266, 0.17551993]]) >>> N.linalg.inv(mat.newbyteorder('>')) # WRONG !!! array([[ 2.39051169e-159, -7.70643158e-157, 5.34087235e-160], [ 2.11823992e+305, 2.37224043e+307, -4.31607382e+304], [ -1.26608299e+304, -1.43225563e+306, 7.22233688e+303]]) Nadav -------------- next part -------------- An HTML attachment was scrubbed... URL: From derek at astro.physik.uni-goettingen.de Tue Aug 23 12:07:08 2011 From: derek at astro.physik.uni-goettingen.de (Derek Homeier) Date: Tue, 23 Aug 2011 18:07:08 +0200 Subject: [Numpy-discussion] Efficient way to load a 1Gb file? In-Reply-To: References: Message-ID: <781AF0C6-B761-4ABB-9798-9385582536E5@astro.physik.uni-goettingen.de> On 11.08.2011, at 8:50PM, Russell E. Owen wrote: > It seems a shame that loadtxt has no argument for predicted length, > which would allow preallocation and less appending/copying data. > > And yes...reading the whole file first to figure out how many elements > it has seems sensible to me -- at least as a switchable behavior, and > preferably the default. 1Gb isn't that large in modern systems, but > loadtxt is filing up all 6Gb of RAM reading it! 1 GB is indeed not much in terms of disk space these days, but using text files for such data amounts is nonetheless very much non-state-of-the-art ;-) That said, of course there is no justification to use excessive amounts of memory where it could be avoided! Implementing the above scheme for npyio is not quite as straightforward as in the example I gave before, mainly for the following reasons: loadtxt also has to deal with more complex data like structured arrays, plus comments, empty lines etc., meaning it has to find and count the actual valid data lines. Ideally, genfromtxt, which offers yet more functionality to deal with missing data, should offer the same options, but they would be certainly more difficult to implement there. More than 6 GB is still remarkable - from what info I found in the web, lists seem to consume ~24 Bytes/element, i.e. 3 times more than a final float64 array. The text representation would typically take 10-20 char's for one float (though with <12 digits, they could usually be read as float32 without loss of precision). Thus a factor >6 seems quite extreme, unless the file is full of (relatively) short integers... But this also means copying of the final array would still have a relatively low memory footprint compared to the buffer list, thus using some kind of mutable array type for reading should be a reasonable solution as well. Unfortunately fromiter is not of that much use here since it only reads 1D-arrays. I haven't tried to use Chris' accumulator class yet, so for now I did go the 2x read approach with loadtxt, it turned out to add only ~10% to the read-in time. 
For compressed files this goes up to 30-50%, but once physical memory is exhausted it should probably actually become faster. I've made a pull request https://github.com/numpy/numpy/pull/144 implementing that option as a switch 'prescan'; could you review it in particular regarding the following: Is the option reasonably named and documented? In the case the allocated array does not match the input data (which really should never happen), right now just a warning is issued, filling any excess buffer with zeros or discarding remaining input data - should this rather raise an IndexError? No prediction if/when I might be able to provide this for genfromtxt, sorry! Cheers, Derek From fperez.net at gmail.com Tue Aug 23 16:13:04 2011 From: fperez.net at gmail.com (Fernando Perez) Date: Tue, 23 Aug 2011 13:13:04 -0700 (PDT) Subject: [Numpy-discussion] Bug or feature? In-Reply-To: <00A2AF9E-5E2B-4042-816E-80D4934675C9@fastmail.net> References: <00EE14FD-C521-4197-8C94-5B7E53EAA246@fastmail.net> <32C3323B-4398-41A3-9A0D-3C8006DB14E6@enthought.com> <00A2AF9E-5E2B-4042-816E-80D4934675C9@fastmail.net> Message-ID: <4e5409d0.0295e50a.4548.0bd7@mx.google.com> On Mon, Aug 22, 2011 at 11:28 PM, Konrad Hinsen wrote: > Thanks, that sounds reasonable. But is this role of tuples in the > creation of structured arrays documented anywhere? The documentation > on structured arrays concentrates on specifying the dtype. All I could > find about array construction is a few examples. Note from the peanut gallery: this is one area of the docs that could really use a separate .. warning:: (or at least .. note::) with the info Travis gave, because it's not obvious at all, and deciphering the resulting error message is pretty tricky if you've never seen it before. I know it's bitten me more than once and I always scratch my head for a few minutes... It just occurred to me that it would be very cool to have in the docs a few standalone HowTo documents on selected topics. Over the years I've found some of the Python howtos extremely useful, and I was very happy when I saw they started including them in the bundled docs. They complement very nicely the reference/api and explain certain key topics in a more tutorial fashion. Off the top of my head, here are a few ideas for enterprising souls to make a very useful contribution with a howto on each of these topics: - dtype/structured arrays and record arrays - fancy indexing, broadcasting, lib.index_tricks (if Anne could find the time to write this one, we'd be eternally grateful) - ctypes and cython for C interfacing and optimization - missing data/masked arrays (including the new goodies). These could be written by a small team, perhaps pairing an experienced numpy contributor with a new member who can provide the balance of perspective of a newcomer (very important in tutorial documentation) and simultaneously gain in-depth experience with important topics. OK, back to the comfort of my chair up here in the gallery... Cheers, f From stefan at sun.ac.za Tue Aug 23 17:47:03 2011 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Tue, 23 Aug 2011 14:47:03 -0700 Subject: [Numpy-discussion] Bug or feature? 
In-Reply-To: <4e5409d0.0295e50a.4548.0bd7@mx.google.com> References: <00EE14FD-C521-4197-8C94-5B7E53EAA246@fastmail.net> <32C3323B-4398-41A3-9A0D-3C8006DB14E6@enthought.com> <00A2AF9E-5E2B-4042-816E-80D4934675C9@fastmail.net> <4e5409d0.0295e50a.4548.0bd7@mx.google.com> Message-ID: On Tue, Aug 23, 2011 at 1:13 PM, Fernando Perez wrote: > Off the top of my head, here are a few ideas for enterprising souls to make a very useful contribution with a howto on each of these topics: > > - dtype/structured arrays and record arrays > - fancy indexing, broadcasting, lib.index_tricks (if Anne could find the time to write this one, we'd be eternally grateful) > - ctypes and cython for C interfacing and optimization > - missing data/masked arrays (including the new goodies). Some of these are included in numpy: import numpy.doc as doc doc.structured_arrays doc.indexing doc.performance <-- currently empty and IIRC they are elso edited via the docs editor. > These could be written by a small team, perhaps pairing an experienced numpy contributor with a new member who can provide the balance of perspective of a newcomer (very important in tutorial documentation) and simultaneously gain in-depth experience with important topics. I agree fully. Cheers St?fan From d.s.seljebotn at astro.uio.no Wed Aug 24 05:49:31 2011 From: d.s.seljebotn at astro.uio.no (Dag Sverre Seljebotn) Date: Wed, 24 Aug 2011 11:49:31 +0200 Subject: [Numpy-discussion] PyBUF_SIMPLE/PyBUF_FORMAT: casts to unsigned bytes In-Reply-To: <20110823121048.GA14594@sleipnir.bytereef.org> References: <20110823121048.GA14594@sleipnir.bytereef.org> Message-ID: <9d39119a-723a-4018-8ba2-149416f59658@email.android.com> (sorry for the top-post, no way around it) Under 2), would it make sense to also export the contents of a Fortran-contiguous buffer as a raw byte stream? I was just the other week writing code to serialize an array in Fortran order to a binary stream. OTOH I could easily serialize its transpose for the same effect. Just something to think about. -- Sent from my Android phone with K-9 Mail. Please excuse my brevity. Stefan Krah wrote: Hello, PEP-3118 presumably intended that a PyBUF_SIMPLE request should cast the original buffer's data type to 'B' (unsigned bytes). Here is a one-dimensional example that currently occurs in Lib/test/test_multiprocessing: >>> import array, io >>> a = array.array('i', [1,2,3,4,5]) >>> m = memoryview(a) >>> m.format 'i' >>> buf = io.BytesIO(bytearray(5*8)) >>> buf.readinto(m) buf.readinto() calls PyObject_AsWriteBuffer(), which requests a simple buffer from the memoryview, thus casting the 'i' data type to the implied type 'B'. The consumer can see that a cast has occurred because the new buffer's format field is NULL. This seems fine for the one-dimensional case. Numpy currently also allows such casts for multidimensional contiguous and non-contiguous arrays. See below for the examples; I don't want to distract from the main point of the post, which is this: I'm seeking a clear specification for the Python documentation that determines under what circumstances casts to 'B' should succeed. I'll formulate the points as statements for clarity, but in fact they are also questions: 1) An exporter of a C-contiguous array with ndim <= 1 MUST honor a PyBUF_SIMPLE request, setting format, shape and strides to NULL and itemsize to 1. As a corner case, an array with ndim = 0, format = "L" (or other) would also morph into a buffer of unsigned bytes. test_ctypes currently makes use of this. 
2) An exporter of a C-contiguous buffer with ndim > 1 MUST honor a PyBUF_SIMPLE request, setting format, shape, and strides to NULL and itemsize to 1. 3) An exporter of a buffer that is not C-contiguous MUST raise BufferError in response to a PyBUF_SIMPLE request. Why am I looking for such rigid rules? The problem with memoryview is that it has to act as a re-exporter itself. For several reasons (performance of chained memoryviews, garbage collection, early release, etc.) it has been decided that the new memoryview object has a managed buffer that takes a snapshot of the original exporter's buffer (See: http://bugs.python.org/issue10181). Now, since getbuffer requests to the memoryview object cannot be redirected to the original object, strict rules are needed for memory_getbuf(). Could you agree with these rules? Point 2) isn't clear from the PEP itself. I assumed it because Numpy currently allows it, and it appears harmless. Stefan Krah Examples: ========= Cast a multidimensional contiguous array:_____________________________________________ I think itemsize in the result should be 1. [_testbuffer.ndarray is from http://hg.python.org/features/pep-3118#memoryview] >>> from _testbuffer import * >>> from numpy import * >>> from _testbuffer import ndarray as pyarray >>> >>> exporter = ndarray(shape=[3,4], dtype="L") # Issue a PyBUF_SIMPLE request to 'exporter' and act as a re-exporter: >>> x = pyarray(exporter, getbuf=PyBUF_SIMPLE) >>> x.len 96 >>> x.shape () >>> x.strides () >>> x.format '' >>> x.itemsize # I think this should be 1, not 8. 8 Cast a multidimensional non-contiguous array:_____________________________________________ This is clearly not right, since y.buf points to a location that the consumer cannot handle without shape and strides. >>> nd = ndarray(buffer=bytearray(96), shape=[3,4], dtype="L") [182658 refs] >>> exporter = nd[::-1, ::-2] [182661 refs] >>> exporter array([[0, 0], [0, 0], [0, 0]], dtype=uint64) [182659 refs] >>> y = pyarray(exporter, getbuf=PyBUF_SIMPLE) [182665 refs] >>> y.len 48 [182666 refs] >>> y.strides () [182666 refs] >>> y.shape () [182666 refs] >>> y.format '' [182666 refs] >>> y.itemsize 8 [182666 refs]_____________________________________________ NumPy-Discussion mailing list NumPy-Discussion at scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- An HTML attachment was scrubbed... URL: From scopatz at gmail.com Wed Aug 24 12:22:10 2011 From: scopatz at gmail.com (Anthony Scopatz) Date: Wed, 24 Aug 2011 11:22:10 -0500 Subject: [Numpy-discussion] saving groups of numpy arrays to disk In-Reply-To: References: <4E5040DF.9090303@simplistix.co.uk> Message-ID: On Sun, Aug 21, 2011 at 7:24 AM, Pauli Virtanen wrote: > On Sat, 20 Aug 2011 16:18:55 -0700, Chris Withers wrote: > > I've got a tree of nested dicts that at their leaves end in numpy arrays > > of identical sizes. > > > > What's the easiest way to persist these to disk so that I can pick up > > with them where I left off? > > Depends on your requirements. 
> > You can use Python pickling, if you do *not* have a requirement for: > > - real persistence, i.e., being able to easily read the data years later > - a standard data format > - access from non-Python programs > - safety against malicious parties (unpickling can execute some code > in the input -- although this is possible to control) > > then you can use Python pickling: > > import pickle > > file = open('out.pck', 'wb') > pickle.dump(file, tree, protocol=pickle.HIGHEST_PROTOCOL) > file.close() > > file = open('out.pck', 'rb') > tree = pickle.load(file) > file.close() > > This should just work (TM) directly with your tree-of-dicts-and-arrays. > > > What's the most "correct" way to do so? > > > > I'm using IPython if that makes things easier... > > > > I had wondered about PyTables, but that seems a bit too heavyweight for > > this, unless I'm missing something? > > If I had one or more of the requirements listed above, I'd use the HDF5 > format, via either PyTables or h5py. If I'd just need to cache the trees, > then I'd use pickling. > > I think the only reason to consider heavy-weighedness is distribution: > does your target audience have these libraries already installed > (they are pre-installed in several Python-for-science distributions), > and how difficult would it be for you to ship them with your stuff, > or to require the users to install them. > +1 to PyTables or h5py. > > -- > Pauli Virtanen > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From srean.list at gmail.com Wed Aug 24 19:53:09 2011 From: srean.list at gmail.com (srean) Date: Wed, 24 Aug 2011 18:53:09 -0500 Subject: [Numpy-discussion] c-info.ufunc-tutorial.rst Message-ID: Hi, I was reading this document, https://github.com/numpy/numpy/blob/master/doc/source/user/c-info.ufunc-tutorial.rst its well written and there is a good build up to exciting code examples that are coming, but I do not see the actual examples, only how they may be used. Is it located somewhere else and not linked? or is it that the c-info.ufunc-tutorial.rst document is incomplete and the examples have not been written. I suspect the former. In that case could anyone point to the code examples and may be also update the c-info.ufunc-tutorial.rst document. Thanks -- srean -------------- next part -------------- An HTML attachment was scrubbed... URL: From srean.list at gmail.com Wed Aug 24 20:05:38 2011 From: srean.list at gmail.com (srean) Date: Wed, 24 Aug 2011 19:05:38 -0500 Subject: [Numpy-discussion] c-info.ufunc-tutorial.rst In-Reply-To: References: Message-ID: Following up on my own question: I can see the code in the commit. So it appears that code-block:: Are not being rendered correctly. Could anyone confirm ? In case it is my browser alone, though I did try after disabling no-script. On Wed, Aug 24, 2011 at 6:53 PM, srean wrote: > Hi, > > I was reading this document, > https://github.com/numpy/numpy/blob/master/doc/source/user/c-info.ufunc-tutorial.rst > > its well written and there is a good build up to exciting code examples > that are coming, but I do not see the actual examples, only how they may be > used. Is it located somewhere else and not linked? or is it that the > c-info.ufunc-tutorial.rst document is incomplete and the examples have not > been written. I suspect the former. 
In that case could anyone point to the > code examples and may be also update the c-info.ufunc-tutorial.rst document. > > Thanks > > -- srean > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mwwiebe at gmail.com Wed Aug 24 20:08:59 2011 From: mwwiebe at gmail.com (Mark Wiebe) Date: Wed, 24 Aug 2011 17:08:59 -0700 Subject: [Numpy-discussion] NA mask C-API documentation Message-ID: I've added C-API documentation to the missingdata branch. The .rst file (beware of the github rst parser though, it drops some of the content) is here: https://github.com/m-paradox/numpy/blob/missingdata/doc/source/reference/c-api.maskna.rst and I made a small example module which goes with it here: https://github.com/m-paradox/spdiv Cheers, Mark -------------- next part -------------- An HTML attachment was scrubbed... URL: From mwwiebe at gmail.com Wed Aug 24 20:10:34 2011 From: mwwiebe at gmail.com (Mark Wiebe) Date: Wed, 24 Aug 2011 17:10:34 -0700 Subject: [Numpy-discussion] c-info.ufunc-tutorial.rst In-Reply-To: References: Message-ID: On Wed, Aug 24, 2011 at 5:05 PM, srean wrote: > Following up on my own question: I can see the code in the commit. So it > appears that > > code-block:: > > Are not being rendered correctly. Could anyone confirm ? In case it is my > browser alone, though I did try after disabling no-script. I believe this is because of github's .rst processor which simply drops blocks it can't understand. When building NumPy documentation, many more extensions and context exists. I'm getting the same thing in the C-API NA-mask documentation I just posted. -Mark > > > On Wed, Aug 24, 2011 at 6:53 PM, srean wrote: > >> Hi, >> >> I was reading this document, >> https://github.com/numpy/numpy/blob/master/doc/source/user/c-info.ufunc-tutorial.rst >> >> its well written and there is a good build up to exciting code examples >> that are coming, but I do not see the actual examples, only how they may be >> used. Is it located somewhere else and not linked? or is it that the >> c-info.ufunc-tutorial.rst document is incomplete and the examples have not >> been written. I suspect the former. In that case could anyone point to the >> code examples and may be also update the c-info.ufunc-tutorial.rst document. >> >> Thanks >> >> -- srean >> > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From scopatz at gmail.com Wed Aug 24 20:19:13 2011 From: scopatz at gmail.com (Anthony Scopatz) Date: Wed, 24 Aug 2011 19:19:13 -0500 Subject: [Numpy-discussion] c-info.ufunc-tutorial.rst In-Reply-To: References: Message-ID: code-block:: is a directive that I think might be specific to sphinx. Naturally, github's renderer will drop it. On Wed, Aug 24, 2011 at 7:10 PM, Mark Wiebe wrote: > On Wed, Aug 24, 2011 at 5:05 PM, srean wrote: > >> Following up on my own question: I can see the code in the commit. So it >> appears that >> >> code-block:: >> >> Are not being rendered correctly. Could anyone confirm ? In case it is my >> browser alone, though I did try after disabling no-script. > > > I believe this is because of github's .rst processor which simply drops > blocks it can't understand. When building NumPy documentation, many more > extensions and context exists. I'm getting the same thing in the C-API > NA-mask documentation I just posted. 
> > -Mark > > >> >> >> On Wed, Aug 24, 2011 at 6:53 PM, srean wrote: >> >>> Hi, >>> >>> I was reading this document, >>> https://github.com/numpy/numpy/blob/master/doc/source/user/c-info.ufunc-tutorial.rst >>> >>> its well written and there is a good build up to exciting code examples >>> that are coming, but I do not see the actual examples, only how they may be >>> used. Is it located somewhere else and not linked? or is it that the >>> c-info.ufunc-tutorial.rst document is incomplete and the examples have not >>> been written. I suspect the former. In that case could anyone point to the >>> code examples and may be also update the c-info.ufunc-tutorial.rst document. >>> >>> Thanks >>> >>> -- srean >>> >> >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mwwiebe at gmail.com Wed Aug 24 20:19:58 2011 From: mwwiebe at gmail.com (Mark Wiebe) Date: Wed, 24 Aug 2011 17:19:58 -0700 Subject: [Numpy-discussion] NA masks for NumPy are ready to test In-Reply-To: References: Message-ID: On Fri, Aug 19, 2011 at 11:37 AM, Bruce Southey wrote: > Hi, > > > 2) Can the 'skipna' flag be added to the methods? > >>> a.sum(skipna=True) > Traceback (most recent call last): > File "", line 1, in > TypeError: 'skipna' is an invalid keyword argument for this function > >>> np.sum(a,skipna=True) > nan > I've added this now, as well. I think that finishes up the changes you suggested in this email which felt right to me. Cheers, Mark > -------------- next part -------------- An HTML attachment was scrubbed... URL: From srean.list at gmail.com Wed Aug 24 20:34:28 2011 From: srean.list at gmail.com (srean) Date: Wed, 24 Aug 2011 19:34:28 -0500 Subject: [Numpy-discussion] c-info.ufunc-tutorial.rst In-Reply-To: References: Message-ID: Thanks Anthony and Mark, this is good to know. So what would be the advised way of looking at freshly baked documentation ? Just look at the raw files ? or is there some place else where the correct sphinx rendered docs are hosted. On Wed, Aug 24, 2011 at 7:19 PM, Anthony Scopatz wrote: > code-block:: is a directive that I think might be specific to sphinx. > Naturally, github's renderer will drop it. > > On Wed, Aug 24, 2011 at 7:10 PM, Mark Wiebe wrote: > >> >> I believe this is because of github's .rst processor which simply drops >> blocks it can't understand. When building NumPy documentation, many more >> extensions and context exists. I'm getting the same thing in the C-API >> NA-mask documentation I just posted. >> >> -Mark >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From scopatz at gmail.com Wed Aug 24 20:43:45 2011 From: scopatz at gmail.com (Anthony Scopatz) Date: Wed, 24 Aug 2011 19:43:45 -0500 Subject: [Numpy-discussion] c-info.ufunc-tutorial.rst In-Reply-To: References: Message-ID: On Wed, Aug 24, 2011 at 7:34 PM, srean wrote: > Thanks Anthony and Mark, this is good to know. > > So what would be the advised way of looking at freshly baked documentation > ? Just look at the raw files ? or is there some place else where the correct > sphinx rendered docs are hosted. > Building the docs yourself is probably the safest bet. 
However, someone should probably hook up the numpy and scipy repos to readthedocs.org. That would solve this problem... > > On Wed, Aug 24, 2011 at 7:19 PM, Anthony Scopatz wrote: > >> code-block:: is a directive that I think might be specific to sphinx. >> Naturally, github's renderer will drop it. >> >> On Wed, Aug 24, 2011 at 7:10 PM, Mark Wiebe wrote: >> >>> >>> I believe this is because of github's .rst processor which simply drops >>> blocks it can't understand. When building NumPy documentation, many more >>> extensions and context exists. I'm getting the same thing in the C-API >>> NA-mask documentation I just posted. >>> >>> -Mark >>> >> > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From dominique.orban at gmail.com Wed Aug 24 21:07:33 2011 From: dominique.orban at gmail.com (dpo) Date: Wed, 24 Aug 2011 18:07:33 -0700 (PDT) Subject: [Numpy-discussion] ImportError: dynamic module does not define init function (initmultiarray) In-Reply-To: References: Message-ID: <32330873.post@talk.nabble.com> dpo wrote: > > --- > Traceback (most recent call last): > File "/Users/dpo/.virtualenvs/matrox/matrox/curve.py", line 3, in > > import numpy as np > File > "/Users/dpo/.virtualenvs/matrox/lib/python2.7/site-packages/numpy/__init__.py", > line 137, in > import add_newdocs > File > "/Users/dpo/.virtualenvs/matrox/lib/python2.7/site-packages/numpy/add_newdocs.py", > line 9, in > from numpy.lib import add_newdoc > File > "/Users/dpo/.virtualenvs/matrox/lib/python2.7/site-packages/numpy/lib/__init__.py", > line 4, in > from type_check import * > File > "/Users/dpo/.virtualenvs/matrox/lib/python2.7/site-packages/numpy/lib/type_check.py", > line 8, in > import numpy.core.numeric as _nx > File > "/Users/dpo/.virtualenvs/matrox/lib/python2.7/site-packages/numpy/core/__init__.py", > line 5, in > import multiarray > ImportError: dynamic module does not define init function (initmultiarray) > --- > > So I am lead to ask: should multiarray.so really be called > _multiarray.so? If not, any idea what the problem is? > If I may answer my own question, the answer is no. The issue here is that numpy was compiled for the x86_64 architecture only, while other libraries I need to link with are i386 only. Changing CFLAGS and LDFLAGS to "-arch i386 -arch x86_64" resolved the issue. Sorry for the noise. Dominique -- View this message in context: http://old.nabble.com/ImportError%3A-dynamic-module-does-not-define-init-function-%28initmultiarray%29-tp32299073p32330873.html Sent from the Numpy-discussion mailing list archive at Nabble.com. From wesmckinn at gmail.com Wed Aug 24 21:09:35 2011 From: wesmckinn at gmail.com (Wes McKinney) Date: Wed, 24 Aug 2011 21:09:35 -0400 Subject: [Numpy-discussion] NA masks for NumPy are ready to test In-Reply-To: References: Message-ID: On Wed, Aug 24, 2011 at 8:19 PM, Mark Wiebe wrote: > On Fri, Aug 19, 2011 at 11:37 AM, Bruce Southey wrote: >> >> Hi, >> >> >> 2) Can the 'skipna' flag be added to the methods? >> >>> a.sum(skipna=True) >> Traceback (most recent call last): >> ?File "", line 1, in >> TypeError: 'skipna' is an invalid keyword argument for this function >> >>> np.sum(a,skipna=True) >> nan > > I've added this now, as well. I think that finishes up the changes you > suggested in this email which felt right to me. 
> Cheers, > Mark > >> >> > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > Sorry I haven't had a chance to have a tinker yet. My initial observations: - I haven't decided whether this is a problem: In [50]: arr = np.arange(100) In [51]: arr[5:10] = np.NA --------------------------------------------------------------------------- ValueError Traceback (most recent call last) /home/wesm/ in () ----> 1 arr[5:10] = np.NA ValueError: Cannot set NumPy array values to NA values without first enabling NA support in the array I assume when you flip the maskna switch that a mask is created? - Performance with skipna is a bit disappointing: In [52]: arr = np.random.randn(1e6) In [54]: arr.flags.maskna = True In [56]: arr[::2] = np.NA In [58]: timeit arr.sum(skipna=True) 100 loops, best of 3: 7.31 ms per loop this goes down to 2.12 ms if there are no NAs present. but: In [59]: import bottleneck as bn In [60]: arr = np.random.randn(1e6) In [61]: arr[::2] = np.nan In [62]: timeit bn.nansum(arr) 1000 loops, best of 3: 1.17 ms per loop do you have a sense if this gap can be closed? I assume you've been, as you should, focused on a correct implementation as opposed with squeezing out performance. best, Wes From mwwiebe at gmail.com Wed Aug 24 21:35:50 2011 From: mwwiebe at gmail.com (Mark Wiebe) Date: Wed, 24 Aug 2011 18:35:50 -0700 Subject: [Numpy-discussion] NA masks for NumPy are ready to test In-Reply-To: References: Message-ID: On Wed, Aug 24, 2011 at 6:09 PM, Wes McKinney wrote: > On Wed, Aug 24, 2011 at 8:19 PM, Mark Wiebe wrote: > > On Fri, Aug 19, 2011 at 11:37 AM, Bruce Southey > wrote: > >> > >> Hi, > >> > >> > >> 2) Can the 'skipna' flag be added to the methods? > >> >>> a.sum(skipna=True) > >> Traceback (most recent call last): > >> File "", line 1, in > >> TypeError: 'skipna' is an invalid keyword argument for this function > >> >>> np.sum(a,skipna=True) > >> nan > > > > I've added this now, as well. I think that finishes up the changes you > > suggested in this email which felt right to me. > > Cheers, > > Mark > > > >> > >> > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > > > Sorry I haven't had a chance to have a tinker yet. My initial observations: > > - I haven't decided whether this is a problem: > > In [50]: arr = np.arange(100) > > In [51]: arr[5:10] = np.NA > --------------------------------------------------------------------------- > ValueError Traceback (most recent call last) > /home/wesm/ in () > ----> 1 arr[5:10] = np.NA > > ValueError: Cannot set NumPy array values to NA values without first > enabling NA support in the array > > I assume when you flip the maskna switch that a mask is created? > That's correct, it creates a fully exposed mask when you set the flag. The thought was that having an assignment automatically add a mask to an array would be a bad idea ("explicit vs implicit"). > > - Performance with skipna is a bit disappointing: > > In [52]: arr = np.random.randn(1e6) > In [54]: arr.flags.maskna = True > In [56]: arr[::2] = np.NA > In [58]: timeit arr.sum(skipna=True) > 100 loops, best of 3: 7.31 ms per loop > > this goes down to 2.12 ms if there are no NAs present. > The alternating case is going to get the worst possible performance currently. 
The masked loop has no specialization to the operation or data type whatsoever yet, it simply calls the regular inner loop on the appropriate runs of data. > but: > > In [59]: import bottleneck as bn > In [60]: arr = np.random.randn(1e6) > In [61]: arr[::2] = np.nan > In [62]: timeit bn.nansum(arr) > 1000 loops, best of 3: 1.17 ms per loop > > do you have a sense if this gap can be closed? I assume you've been, > as you should, focused on a correct implementation as opposed with > squeezing out performance. > I've been focusing on a correct implementation while installing hooks in the right places so that the performance can be improved later. For the straightforward masked copying code, I previously created a ticket describing what needs to be done: http://projects.scipy.org/numpy/ticket/1901 For element-wise ufuncs, the changes needed are similar, creating inner loops specialized for masks. In doing these changes, I also figured out a way to add the ability to more properly specialize the inner loops along the lines of einsum without breaking ABI compatibility, so I set up the API as required for this. Thanks for taking a look, Mark > > best, > Wes > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mwwiebe at gmail.com Wed Aug 24 22:29:44 2011 From: mwwiebe at gmail.com (Mark Wiebe) Date: Wed, 24 Aug 2011 19:29:44 -0700 Subject: [Numpy-discussion] NA masks for NumPy are ready to test In-Reply-To: References: Message-ID: On Wed, Aug 24, 2011 at 6:09 PM, Wes McKinney wrote: > > - Performance with skipna is a bit disappointing: > > In [52]: arr = np.random.randn(1e6) > In [54]: arr.flags.maskna = True > In [56]: arr[::2] = np.NA > In [58]: timeit arr.sum(skipna=True) > 100 loops, best of 3: 7.31 ms per loop > > this goes down to 2.12 ms if there are no NAs present. > > but: > > In [59]: import bottleneck as bn > In [60]: arr = np.random.randn(1e6) > In [61]: arr[::2] = np.nan > In [62]: timeit bn.nansum(arr) > 1000 loops, best of 3: 1.17 ms per loop > > do you have a sense if this gap can be closed? I assume you've been, > as you should, focused on a correct implementation as opposed with > squeezing out performance. > It looks like the spdiv example module I created for the C-API documentation can give a bit of an idea for some performance expectations. The example has no specialization for strides, and it operates exactly like np.divide except it converts the output to NA instead of dividing by zero. It *always* creates an NA mask for the output, and does a masked loop. 
Here's a link to the example module: https://github.com/m-paradox/spdiv In [1]: from spdiv_mod import spdiv In [2]: arr = np.random.randn(1e6) Since spdiv always creates an NA mask, this is comparing an NA-masked divide with a regular NumPy divide: In [3]: timeit spdiv(arr, 3.1) 100 loops, best of 3: 13.8 ms per loop In [4]: timeit arr / 3.1 10 loops, best of 3: 11.4 ms per loop Here, the divide is causing an NA mask to be created in the output, just like in spdiv: In [5]: timeit spdiv(arr, np.NA) 100 loops, best of 3: 4.72 ms per loop In [6]: timeit arr / np.NA 100 loops, best of 3: 8.71 ms per loop Here are the same tests, but after giving 'arr' an NA mask: In [7]: arr.flags.maskna = True In [8]: timeit spdiv(arr, 3.1) 100 loops, best of 3: 14.2 ms per loop In [9]: timeit arr / 3.1 10 loops, best of 3: 20.1 ms per loop In [10]: timeit spdiv(arr, np.NA) 100 loops, best of 3: 4.02 ms per loop In [11]: timeit arr / np.NA 100 loops, best of 3: 8.69 ms per loop Another thought is to compare sum to count_nonzero, which is implemented in a straightforward fashion without the masked wrapping mechanism that's in the ufuncs. n [12]: arr[::2] = np.NA In [13]: np.count_nonzero(arr) Out[13]: NA(dtype='int64') In [14]: np.count_nonzero(arr, skipna=True) Out[14]: 500000 In [15]: timeit np.count_nonzero(arr, skipna=True) 100 loops, best of 3: 5.86 ms per loop In [16]: timeit np.sum(arr, skipna=True) 10 loops, best of 3: 16.1 ms per loop In [17]: timeit np.count_nonzero(arr, skipna=False) 100 loops, best of 3: 1.85 ms per loop In [18]: timeit np.sum(arr, skipna=False) 100 loops, best of 3: 1.86 ms per loop Cheers, Mark > > best, > Wes > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From teoliphant at gmail.com Thu Aug 25 00:39:15 2011 From: teoliphant at gmail.com (Travis Oliphant) Date: Wed, 24 Aug 2011 23:39:15 -0500 Subject: [Numpy-discussion] Decimal arrays? In-Reply-To: References: <4E50371F.1070105@simplistix.co.uk> <4E503A04.8070600@simplistix.co.uk> <4E52709F.4050300@simplistix.co.uk> Message-ID: <4CFFFC38-D955-4AA2-9B52-F34DB941387E@enthought.com> On Aug 22, 2011, at 3:51 PM, Robert Kern wrote: > On Mon, Aug 22, 2011 at 10:07, Chris Withers wrote: >> On 22/08/2011 00:18, Mark Dickinson wrote: >>> On Sun, Aug 21, 2011 at 1:08 AM, Robert Kern wrote: >>>> You may want to try the cdecimal package: >>>> >>>> http://pypi.python.org/pypi/cdecimal/ >>> >>> I'll second this suggestion. cdecimal is an extraordinarily carefully >>> written and well-tested (almost) drop-in replacement for the decimal >>> module, and well worth a try. It would probably be in the Python >>> standard library by now if anyone had had proper time to review it... >> >> Who would need to review it? >> >> I'm surprised this isn't in EPD... any ideas why? > > No one has asked for it, to my knowledge. We do provide it in our PyPI > repository, so > > $ enpkg cdecimal > > should install it if you are an EPD subscriber. This should work with EPDFree even if you aren't an EPD subscriber as well. The fact that cDecimal isn't in EPD is a good thing for you in this case. Automatically built PyPI packages are provided to everybody as long as the package itself is not in EPD proper (to avoid our EPD customers getting "automatic" builds of core packages instead of our tested and verified builds). 
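For what it's worth, once either module is installed the numpy side is the same: Decimal objects can be held in an object array. A small, untested sketch (cdecimal is intended as a drop-in replacement for the stdlib decimal module):

from cdecimal import Decimal   # or: from decimal import Decimal
import numpy as np

a = np.array([Decimal('1.10'), Decimal('2.20'), Decimal('3.30')], dtype=object)
total = a.sum()            # exact decimal arithmetic, no binary-float rounding
scaled = a * Decimal(2)    # element-wise, each element stays a Decimal

Object arrays give up the speed of a float64 array, but the values stay true Decimals.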
-Travis > > -- > Robert Kern > > "I have come to believe that the whole world is an enigma, a harmless > enigma that is made terrible by our own mad attempt to interpret it as > though it had an underlying truth." > -- Umberto Eco > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion --- Travis Oliphant Enthought, Inc. oliphant at enthought.com 1-512-536-1057 http://www.enthought.com From teoliphant at gmail.com Thu Aug 25 00:42:06 2011 From: teoliphant at gmail.com (Travis Oliphant) Date: Wed, 24 Aug 2011 23:42:06 -0500 Subject: [Numpy-discussion] memoryview shape/strides representation for ndim = 0 In-Reply-To: References: <20110822123034.GA7743@sleipnir.bytereef.org> Message-ID: On Aug 22, 2011, at 7:35 AM, Mark Dickinson wrote: > On Mon, Aug 22, 2011 at 1:30 PM, Stefan Krah wrote: >> Numpy arrays and memoryview currently have different representations >> for shape and strides if ndim = 0: >> >>>>> from numpy import * >>>>> x = array(9, int32) >>>>> x.ndim >> 0 >>>>> x.shape >> () >>>>> x.strides >> () >>>>> m = memoryview(x) >>>>> m.ndim >> 0L >>>>> m.shape is None >> True >>>>> m.strides is None >> True >> >> >> I think the Numpy representation is nicer. Also, I think that memoryviews >> should attempt to mimic the underlying object as closely as possible. > > Agreed on both points. If there's no good reason for m.shape and > m.strides to be None, I think it should be changed. I can't think of any good reason not to change it to use the NumPy defaults. This sounds right to me. -Travis > > Mark > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion --- Travis Oliphant Enthought, Inc. oliphant at enthought.com 1-512-536-1057 http://www.enthought.com From josef.pktd at gmail.com Thu Aug 25 01:04:03 2011 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 25 Aug 2011 01:04:03 -0400 Subject: [Numpy-discussion] c-info.ufunc-tutorial.rst In-Reply-To: References: Message-ID: On Wed, Aug 24, 2011 at 8:43 PM, Anthony Scopatz wrote: > > > On Wed, Aug 24, 2011 at 7:34 PM, srean wrote: >> >> Thanks Anthony and Mark, this is good to know. >> >> So what would be the advised way of looking at freshly baked documentation >> ? Just look at the raw files ? or is there some place else where the correct >> sphinx rendered docs are hosted. > > Building the docs yourself is probably the safest bet. ?However, someone > should probably hook up the numpy and scipy repos to readthedocs.org. ?That > would solve this problem... Maybe someone just needs to add it here http://docs.scipy.org/numpy/docs/numpy-docs/user/c-info.rst/#c-info and it would show up in numpy's own docs, which are hooked up to the repo, as far as I know. Josef > >> >> On Wed, Aug 24, 2011 at 7:19 PM, Anthony Scopatz >> wrote: >>> >>> code-block:: is a directive that I think might be specific to sphinx. >>> ?Naturally, github's renderer will drop it. >>> On Wed, Aug 24, 2011 at 7:10 PM, Mark Wiebe wrote: >>>> >>>> I believe this is because of github's .rst processor which simply drops >>>> blocks it can't understand. When building NumPy documentation, many more >>>> extensions and context exists. I'm getting the same thing in the C-API >>>> NA-mask documentation I just posted. 
>>>> -Mark >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > From srean.list at gmail.com Thu Aug 25 08:23:40 2011 From: srean.list at gmail.com (srean) Date: Thu, 25 Aug 2011 07:23:40 -0500 Subject: [Numpy-discussion] the build and installation process Message-ID: Hi, I would like to know a bit about how the installation process works. Could you point me to a resource. In particular I want to know how the site.cfg configuration works. Is it numpy/scipy specific or is it standard with distutils. I googled for site.cfg and distutils but did not find any authoritative document. I believe many new users trip up on the installation process, especially in trying to substitute their favourite library in place os the standard. So a canonical document explaining the process will be very helpful. http://docs.scipy.org/doc/numpy/user/install.html does cover some of the important points but its a bit sketchy, and has a "this is all that you need to know" flavor. Doesnt quite enable the reader to fix his own problems. So a resource that is somewhere in between reading up all the sources that get invoked during the installation and building, and the current install document will be very welcome. English is not my native language, but if there is anyway I can help, I would do so gladly. -- srean -------------- next part -------------- An HTML attachment was scrubbed... URL: From mwwiebe at gmail.com Thu Aug 25 13:55:49 2011 From: mwwiebe at gmail.com (Mark Wiebe) Date: Thu, 25 Aug 2011 10:55:49 -0700 Subject: [Numpy-discussion] NA-mask introductory documentation Message-ID: I've written some introductory documentation for the NA-masked arrays. The patch is here: https://github.com/m-paradox/numpy/commit/227e39c34b0e5d9dfde2bbce054b5a8ac088fd64 This is approaching the end of what I will implement for NA masks at the moment. I think the system is quite usable as is, though it is missing a number of major pieces like support for struct-NA, file I/O, and other things mentioned in the release notes. On the other hand, the C API for working with NA-masked arrays is solid and designed for future expansion to multi-NA, and many things can be done already with the implementation. It's also very stable and does not break ABI compatibility, so a NumPy release with NA masks in its current state should be perfectly reasonable. Cheers, Mark -------------- next part -------------- An HTML attachment was scrubbed... URL: From Chris.Barker at noaa.gov Thu Aug 25 14:42:04 2011 From: Chris.Barker at noaa.gov (Chris.Barker) Date: Thu, 25 Aug 2011 11:42:04 -0700 Subject: [Numpy-discussion] saving groups of numpy arrays to disk In-Reply-To: References: <4E5040DF.9090303@simplistix.co.uk> Message-ID: <4E56977C.3000107@noaa.gov> On 8/24/11 9:22 AM, Anthony Scopatz wrote: > You can use Python pickling, if you do *not* have a requirement for: I can't recall why, but it seem pickling of numpy arrays has been fragile and not very performant. 
I like the npy / npz format, built in to numpy, if you don't need: > - access from non-Python programs it's quick and easy to use: In [5]: a Out[5]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9]) In [6]: b Out[6]: array([ 0., 1., 2., 3., 4.]) In [7]: filename = "test.npz" In [8]: np.savez(filename, a=a, b=b) In [9]: del a, b In [10]: # now reload: In [11]: data = np.load(filename) In [14]: data['a'] Out[14]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9]) In [15]: data['b'] Out[15]: array([ 0., 1., 2., 3., 4.]) I'd go with hdf5 or netcdf if you want a standard format that can be read by non-python software. -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From paulepanter at users.sourceforge.net Thu Aug 25 15:10:34 2011 From: paulepanter at users.sourceforge.net (Paul Menzel) Date: Thu, 25 Aug 2011 21:10:34 +0200 Subject: [Numpy-discussion] How to output array with indexes to a text file? Message-ID: <1314299436.18748.19.camel@mattotaupa> Dear NumPy folks, is there an easy way to also save the indexes of an array (columns, rows or both) when outputting it to a text file. For saving an array to a file I only found `savetxt()` [1] which does not seem to have such an option. Adding indexes manually is doable but I would like to avoid that. --- minimal example (also attached) --- from numpy import * a = zeros([2, 3], int) print(a) savetxt("/tmp/test1.txt", a, fmt='%8i') # Work around for adding the indexes for the columns. a[0] = range(3) print(a) savetxt("/tmp/test2.txt", a, fmt='%8i') --- minimal example --- The output is the following. $ python output-array.py [[0 0 0] [0 0 0]] [[0 1 2] [0 0 0]] $ more /tmp/test* :::::::::::::: /tmp/test1.txt :::::::::::::: 0 0 0 0 0 0 :::::::::::::: /tmp/test2.txt :::::::::::::: 0 1 2 0 0 0 Is there a way to accomplish that task without reserving the 0th row or column to store the indexes? I want to process these text files to produce graphs and MetaPost?s [2] graph package needs these indexes. (I know about Matplotlib [3], but I would like to use MetaPost.) Thanks, Paul [1] http://docs.scipy.org/doc/numpy/reference/generated/numpy.savetxt.html [2] http://wiki.contextgarden.net/MetaPost [3] http://matplotlib.sourceforge.net/ -------------- next part -------------- A non-text attachment was scrubbed... Name: output-array.py Type: text/x-python Size: 209 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 198 bytes Desc: This is a digitally signed message part URL: From jjhelmus at gmail.com Thu Aug 25 15:36:57 2011 From: jjhelmus at gmail.com (Jonathan Helmus) Date: Thu, 25 Aug 2011 15:36:57 -0400 Subject: [Numpy-discussion] How to output array with indexes to a text file? In-Reply-To: <1314299436.18748.19.camel@mattotaupa> References: <1314299436.18748.19.camel@mattotaupa> Message-ID: <4E56A459.4010601@gmail.com> Paul Menzel wrote: > Dear NumPy folks, > > > is there an easy way to also save the indexes of an array (columns, rows > or both) when outputting it to a text file. For saving an array to a > file I only found `savetxt()` [1] which does not seem to have such an > option. Adding indexes manually is doable but I would like to avoid > that. 
> > --- minimal example (also attached) --- > from numpy import * > > a = zeros([2, 3], int) > print(a) > > savetxt("/tmp/test1.txt", a, fmt='%8i') > > # Work around for adding the indexes for the columns. > a[0] = range(3) > print(a) > > savetxt("/tmp/test2.txt", a, fmt='%8i') > --- minimal example --- > > The output is the following. > > $ python output-array.py > [[0 0 0] > [0 0 0]] > [[0 1 2] > [0 0 0]] > $ more /tmp/test* > :::::::::::::: > /tmp/test1.txt > :::::::::::::: > 0 0 0 > 0 0 0 > :::::::::::::: > /tmp/test2.txt > :::::::::::::: > 0 1 2 > 0 0 0 > > Is there a way to accomplish that task without reserving the 0th row or > column to store the indexes? > > I want to process these text files to produce graphs and MetaPost?s [2] > graph package needs these indexes. (I know about Matplotlib [3], but I > would like to use MetaPost.) > > > Thanks, > > Paul > > > [1] http://docs.scipy.org/doc/numpy/reference/generated/numpy.savetxt.html > [2] http://wiki.contextgarden.net/MetaPost > [3] http://matplotlib.sourceforge.net/ > > ------------------------------------------------------------------------ > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion Paul, I don't know of any numpy function which will output the array indexes but with numpy's ndindex this can be accomplished with a for loop. import numpy as np a = np.arange(12).reshape(3,4) f = open("test.txt",'w') for i in np.ndindex(a.shape): print >> f," ".join([str[s] for s in i]),a[i] f.close() cat test.txt 0 0 0 0 1 1 0 2 2 ... From wardefar at iro.umontreal.ca Thu Aug 25 18:49:16 2011 From: wardefar at iro.umontreal.ca (David Warde-Farley) Date: Thu, 25 Aug 2011 18:49:16 -0400 Subject: [Numpy-discussion] saving groups of numpy arrays to disk In-Reply-To: <4E56977C.3000107@noaa.gov> References: <4E5040DF.9090303@simplistix.co.uk> <4E56977C.3000107@noaa.gov> Message-ID: On 2011-08-25, at 2:42 PM, Chris.Barker wrote: > On 8/24/11 9:22 AM, Anthony Scopatz wrote: >> You can use Python pickling, if you do *not* have a requirement for: > > I can't recall why, but it seem pickling of numpy arrays has been > fragile and not very performant. > > I like the npy / npz format, built in to numpy, if you don't need: > >> - access from non-Python programs While I'm not aware of reader implementations for any other language, NPY is a dirt-simple and well-documented format designed by Robert Kern, and should be readable without too much trouble from any language that supports binary I/O. The full spec is at https://github.com/numpy/numpy/blob/master/doc/neps/npy-format.txt It should be especially trivial to read arrays of simple scalar numeric dtypes, but reading compound dtypes is also doable. For NPZ, use a standard zip file reading library to access individual files in the archive, which are in .npy format (or just unzip it by hand first -- it's a normal .zip file with a special extension). David From kk1674 at nyu.edu Fri Aug 26 00:27:31 2011 From: kk1674 at nyu.edu (Kibeom Kim) Date: Fri, 26 Aug 2011 00:27:31 -0400 Subject: [Numpy-discussion] lazy loading ndarray? (not from file, but from user function) Message-ID: Hello, Q1. Is lazy loading ndarray from user defined data supplying function possible? Q2. If possible, how can I implement it? The closest method I can think of is, (which requires c++ posix) 1. create a memory region using mmap and protect read operation by mprotect. 2. 
add SIGSEGV signal handler to trap read operation on the memory region, and the handler will provide appropriate user data and recover from SIGSEGV. 3. slightly modify memmap class to use the above mmap (memmap is already using mmap internally, so it's not a big deal) but obviously, recovering from SIGSEGV requires removing mprotect (see http://stackoverflow.com/questions/2663456/write-a-signal-handler-to-catch-sigsegv) and it's impossible to know when to lock the region by mprotect again. Thanks, -Kibeom Kim From robert.kern at gmail.com Fri Aug 26 00:32:51 2011 From: robert.kern at gmail.com (Robert Kern) Date: Thu, 25 Aug 2011 23:32:51 -0500 Subject: [Numpy-discussion] lazy loading ndarray? (not from file, but from user function) In-Reply-To: References: Message-ID: On Thu, Aug 25, 2011 at 23:27, Kibeom Kim wrote: > Hello, > > Q1. Is lazy loading ndarray from user defined data supplying function possible? No, not really. > Q2. If possible, how can I implement it? > > > The closest method I can think of is, (which requires c++ posix) > > 1. create a memory region using mmap and protect read operation by mprotect. > 2. add SIGSEGV signal handler to trap read operation on the memory > region, and the handler will provide appropriate user data and recover > from SIGSEGV. > 3. slightly modify memmap class to use the above mmap (memmap is > already using mmap internally, so it's not a big deal) > > but obviously, recovering from SIGSEGV requires removing mprotect (see > http://stackoverflow.com/questions/2663456/write-a-signal-handler-to-catch-sigsegv) > and it's impossible to know when to lock the region by mprotect again. Well, if you're willing to go *that* far, you might was well make a userspace file system with fuse and mmap a file within that. http://fuse.sourceforge.net/ You can even implement it in Python! http://pypi.python.org/pypi/fuse-python http://code.google.com/p/fusepy/ -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." ? -- Umberto Eco From paul.anton.letnes at gmail.com Fri Aug 26 05:22:56 2011 From: paul.anton.letnes at gmail.com (Paul Anton Letnes) Date: Fri, 26 Aug 2011 10:22:56 +0100 Subject: [Numpy-discussion] saving groups of numpy arrays to disk In-Reply-To: References: <4E5040DF.9090303@simplistix.co.uk> <4E56977C.3000107@noaa.gov> Message-ID: On 25. aug. 2011, at 23.49, David Warde-Farley wrote: > On 2011-08-25, at 2:42 PM, Chris.Barker wrote: > >> On 8/24/11 9:22 AM, Anthony Scopatz wrote: >>> You can use Python pickling, if you do *not* have a requirement for: >> >> I can't recall why, but it seem pickling of numpy arrays has been >> fragile and not very performant. >> >> I like the npy / npz format, built in to numpy, if you don't need: >> >>> - access from non-Python programs > > While I'm not aware of reader implementations for any other language, NPY is a dirt-simple and well-documented format designed by Robert Kern, and should be readable without too much trouble from any language that supports binary I/O. The full spec is at > > https://github.com/numpy/numpy/blob/master/doc/neps/npy-format.txt > > It should be especially trivial to read arrays of simple scalar numeric dtypes, but reading compound dtypes is also doable. 
> > For NPZ, use a standard zip file reading library to access individual files in the archive, which are in .npy format (or just unzip it by hand first -- it's a normal .zip file with a special extension). > > David Out of curiosity: is the .npy format guaranteed to be independent of architecture (endianness and similar issues)? Paul From derek at astro.physik.uni-goettingen.de Fri Aug 26 08:04:20 2011 From: derek at astro.physik.uni-goettingen.de (Derek Homeier) Date: Fri, 26 Aug 2011 14:04:20 +0200 Subject: [Numpy-discussion] saving groups of numpy arrays to disk In-Reply-To: <4E56977C.3000107@noaa.gov> References: <4E5040DF.9090303@simplistix.co.uk> <4E56977C.3000107@noaa.gov> Message-ID: <1D5F9D60-05A9-476F-80D7-DB5E0949BE22@astro.physik.uni-goettingen.de> On 25.08.2011, at 8:42PM, Chris.Barker wrote: > On 8/24/11 9:22 AM, Anthony Scopatz wrote: >> You can use Python pickling, if you do *not* have a requirement for: > > I can't recall why, but it seem pickling of numpy arrays has been > fragile and not very performant. > Hmm, the pure Python version might be, but, I've used cPickle for a long time and never noted any stability problems. And it is still noticeably faster than pytables, in my experience. Still, for the sake of a standardised format I'd go with HDF5 any time now (and usually prefer h5py now when starting anything new - my pytables implementation mentioned above likely is not the most efficient compared to cPickle). But with the usual disclaimers, you should be able to simply use cPickle as a drop-in replacement in the example below. Cheers, Derek On 21.08.2011, at 2:24PM, Pauli Virtanen wrote: > You can use Python pickling, if you do *not* have a requirement for: > > - real persistence, i.e., being able to easily read the data years later > - a standard data format > - access from non-Python programs > - safety against malicious parties (unpickling can execute some code > in the input -- although this is possible to control) > > then you can use Python pickling: > > import pickle > > file = open('out.pck', 'wb') > pickle.dump(file, tree, protocol=pickle.HIGHEST_PROTOCOL) > file.close() > > file = open('out.pck', 'rb') > tree = pickle.load(file) > file.close() From derek at astro.physik.uni-goettingen.de Fri Aug 26 10:09:53 2011 From: derek at astro.physik.uni-goettingen.de (Derek Homeier) Date: Fri, 26 Aug 2011 16:09:53 +0200 Subject: [Numpy-discussion] array_equal and array_equiv comparison functions for structured arrays Message-ID: <8CD90113-73B8-429D-9EE4-C23C91823CD2@astro.physik.uni-goettingen.de> Hi, as the subject says, the array_* comparison functions currently do not operate on structured/record arrays. Pull request https://github.com/numpy/numpy/pull/146 implements these comparisons. There are two commits, differing in their interpretation whether two arrays with different field names, but identical data, are equivalent; i.e. res = array_equiv(array((1,2), dtype=[('i','i4'),('v','f8')]), array((1,2), dtype=[('n','i4'),('f','f8')])) is True in the current HEAD, but False in its parent. Feedback and additional comments are invited. 
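For readers who want to experiment before the pull request lands, a rough field-by-field comparison that ignores field names reproduces the current-HEAD behaviour for that example. This is only a sketch, not the code from the PR, and the helper name struct_equiv is made up here:

    import numpy as np

    a = np.array((1, 2.0), dtype=[('i', 'i4'), ('v', 'f8')])
    b = np.array((1, 2.0), dtype=[('n', 'i4'), ('f', 'f8')])

    def struct_equiv(x, y):
        # compare structured arrays field by field, ignoring the field names
        if x.dtype.names is None or y.dtype.names is None:
            return np.array_equiv(x, y)
        if len(x.dtype.names) != len(y.dtype.names):
            return False
        return all(np.array_equiv(x[fx], y[fy])
                   for fx, fy in zip(x.dtype.names, y.dtype.names))

    print(struct_equiv(a, b))        # True: identical data, different field names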
Cheers, Derek From Chris.Barker at noaa.gov Fri Aug 26 11:51:27 2011 From: Chris.Barker at noaa.gov (Chris.Barker) Date: Fri, 26 Aug 2011 08:51:27 -0700 Subject: [Numpy-discussion] saving groups of numpy arrays to disk In-Reply-To: <1D5F9D60-05A9-476F-80D7-DB5E0949BE22@astro.physik.uni-goettingen.de> References: <4E5040DF.9090303@simplistix.co.uk> <4E56977C.3000107@noaa.gov> <1D5F9D60-05A9-476F-80D7-DB5E0949BE22@astro.physik.uni-goettingen.de> Message-ID: <4E57C0FF.8070908@noaa.gov> On 8/26/11 5:04 AM, Derek Homeier wrote: > Hmm, the pure Python version might be, but, I've used cPickle for a long time > and never noted any stability problems. well, here is the NEP: https://github.com/numpy/numpy/blob/master/doc/neps/npy-format.txt It addresses the why's and hows of the format. -CHB -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From robert.kern at gmail.com Fri Aug 26 12:05:19 2011 From: robert.kern at gmail.com (Robert Kern) Date: Fri, 26 Aug 2011 11:05:19 -0500 Subject: [Numpy-discussion] saving groups of numpy arrays to disk In-Reply-To: <1D5F9D60-05A9-476F-80D7-DB5E0949BE22@astro.physik.uni-goettingen.de> References: <4E5040DF.9090303@simplistix.co.uk> <4E56977C.3000107@noaa.gov> <1D5F9D60-05A9-476F-80D7-DB5E0949BE22@astro.physik.uni-goettingen.de> Message-ID: On Fri, Aug 26, 2011 at 07:04, Derek Homeier wrote: > On 25.08.2011, at 8:42PM, Chris.Barker wrote: > >> On 8/24/11 9:22 AM, Anthony Scopatz wrote: >>> ? ?You can use Python pickling, if you do *not* have a requirement for: >> >> I can't recall why, but it seem pickling of numpy arrays has been >> fragile and not very performant. >> > Hmm, the pure Python version might be, but, I've used cPickle for a long time > and never noted any stability problems. IIRC, there have been one or two releases where we accidentally broke the ability to load some old pickles. I think that's the kind of fragility Chris meant. As for the other kind of stability, we have had, at times, problems passing unpickled arrays to linear algebra functions. This is because the SSE instructions used by the optimized linear algebra package required aligned memory, but the unpickling machinery did not give us such an option. We do some nasty hacks to make unpickling performant. The unpickling machinery reads the actual byte data in as a str object, then passes that to a numpy function to reconstruct the array object. We simply reuse the memory underlying the str object. This is a hack, but it's the only way to avoid copying potentially large amounts of data. This is the cause the unaligned memory. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." ? -- Umberto Eco From mjanikas at esri.com Fri Aug 26 13:10:39 2011 From: mjanikas at esri.com (Mark Janikas) Date: Fri, 26 Aug 2011 10:10:39 -0700 Subject: [Numpy-discussion] Identifying Colinear Columns of a Matrix Message-ID: Hello All, I am trying to identify columns of a matrix that are perfectly collinear. It is not that difficult to identify when two columns are identical are have zero variance, but I do not know how to ID when the culprit is of a higher order. i.e. columns 1 + 2 + 3 = column 4. 
NUM.corrcoef(matrix.T) will return NaNs when the matrix is singular, and LA.cond(matrix.T) will provide a very large condition number.... But they do not tell me which columns are causing the problem. For example: zt = numpy. array([[ 1. , 1. , 1. , 1. , 1. ], [ 0.25, 0.1 , 0.2 , 0.25, 0.5 ], [ 0.75, 0.9 , 0.8 , 0.75, 0.5 ], [ 3. , 8. , 0. , 5. , 0. ]]) How can I identify that columns 0,1,2 are the issue because: column 1 + column 2 = column 0? Any input would be greatly appreciated. Thanks much, MJ -------------- next part -------------- An HTML attachment was scrubbed... URL: From mjanikas at esri.com Fri Aug 26 13:12:20 2011 From: mjanikas at esri.com (Mark Janikas) Date: Fri, 26 Aug 2011 10:12:20 -0700 Subject: [Numpy-discussion] Identifying Colinear Columns of a Matrix In-Reply-To: References: Message-ID: As you will note, since most of the functions work on rows, the matrix in question has been transposed. From: numpy-discussion-bounces at scipy.org [mailto:numpy-discussion-bounces at scipy.org] On Behalf Of Mark Janikas Sent: Friday, August 26, 2011 10:11 AM To: 'Discussion of Numerical Python' Subject: [Numpy-discussion] Identifying Colinear Columns of a Matrix Hello All, I am trying to identify columns of a matrix that are perfectly collinear. It is not that difficult to identify when two columns are identical are have zero variance, but I do not know how to ID when the culprit is of a higher order. i.e. columns 1 + 2 + 3 = column 4. NUM.corrcoef(matrix.T) will return NaNs when the matrix is singular, and LA.cond(matrix.T) will provide a very large condition number.... But they do not tell me which columns are causing the problem. For example: zt = numpy. array([[ 1. , 1. , 1. , 1. , 1. ], [ 0.25, 0.1 , 0.2 , 0.25, 0.5 ], [ 0.75, 0.9 , 0.8 , 0.75, 0.5 ], [ 3. , 8. , 0. , 5. , 0. ]]) How can I identify that columns 0,1,2 are the issue because: column 1 + column 2 = column 0? Any input would be greatly appreciated. Thanks much, MJ -------------- next part -------------- An HTML attachment was scrubbed... URL: From jsseabold at gmail.com Fri Aug 26 13:27:38 2011 From: jsseabold at gmail.com (Skipper Seabold) Date: Fri, 26 Aug 2011 13:27:38 -0400 Subject: [Numpy-discussion] Identifying Colinear Columns of a Matrix In-Reply-To: References: Message-ID: On Fri, Aug 26, 2011 at 1:10 PM, Mark Janikas wrote: > Hello All, > > > > I am trying to identify columns of a matrix that are perfectly collinear. > It is not that difficult to identify when two columns are identical are have > zero variance, but I do not know how to ID when the culprit is of a higher > order. i.e. columns 1 + 2 + 3 = column 4.? NUM.corrcoef(matrix.T) will > return NaNs when the matrix is singular, and LA.cond(matrix.T) will provide > a very large condition number?. But they do not tell me which columns are > causing the problem. ??For example: > > > > zt = numpy. array([[ 1.? ,? 1.? ,? 1.? ,? 1.? ,? 1.? ], > > ?????? ????????????????????[ 0.25,? 0.1 ,? 0.2 ,? 0.25,? 0.5 ], > > ?????? ????????????????????[ 0.75,? 0.9 ,? 0.8 ,? 0.75,? 0.5 ], > > ?????? ????????????????????[ 3.? ,? 8.? ,? 0.? ,? 5.? ,? 0.? ]]) > > > > How can I identify that columns 0,1,2 are the issue because: column 1 + > column 2 = column 0? > > > > Any input would be greatly appreciated.? Thanks much, > The way that I know to do this in a regression context for (near perfect) multicollinearity is VIF. It's long been on my todo list for statsmodels. http://en.wikipedia.org/wiki/Variance_inflation_factor Maybe there are other ways with decompositions. 
I'd be happy to hear about them. Please post back if you write any code to do this. Skipper From mjanikas at esri.com Fri Aug 26 13:34:33 2011 From: mjanikas at esri.com (Mark Janikas) Date: Fri, 26 Aug 2011 10:34:33 -0700 Subject: [Numpy-discussion] Identifying Colinear Columns of a Matrix In-Reply-To: References: Message-ID: I actually use the VIF when the design matrix can be inverted.... I do it the quick and dirty way as opposed to the step regression: 1. Calc the correlation coefficient of the matrix (w/o the intercept) 2. Return the diagonal of the inversion of the correlation matrix in step 1. Again, the problem lies in the multiple column relationship... I wouldn't be able to run sub regressions at all when the columns are perfectly collinear. MJ -----Original Message----- From: numpy-discussion-bounces at scipy.org [mailto:numpy-discussion-bounces at scipy.org] On Behalf Of Skipper Seabold Sent: Friday, August 26, 2011 10:28 AM To: Discussion of Numerical Python Subject: Re: [Numpy-discussion] Identifying Colinear Columns of a Matrix On Fri, Aug 26, 2011 at 1:10 PM, Mark Janikas wrote: > Hello All, > > > > I am trying to identify columns of a matrix that are perfectly collinear. > It is not that difficult to identify when two columns are identical are have > zero variance, but I do not know how to ID when the culprit is of a higher > order. i.e. columns 1 + 2 + 3 = column 4.? NUM.corrcoef(matrix.T) will > return NaNs when the matrix is singular, and LA.cond(matrix.T) will provide > a very large condition number.. But they do not tell me which columns are > causing the problem. ??For example: > > > > zt = numpy. array([[ 1.? ,? 1.? ,? 1.? ,? 1.? ,? 1.? ], > > ?????? ????????????????????[ 0.25,? 0.1 ,? 0.2 ,? 0.25,? 0.5 ], > > ?????? ????????????????????[ 0.75,? 0.9 ,? 0.8 ,? 0.75,? 0.5 ], > > ?????? ????????????????????[ 3.? ,? 8.? ,? 0.? ,? 5.? ,? 0.? ]]) > > > > How can I identify that columns 0,1,2 are the issue because: column 1 + > column 2 = column 0? > > > > Any input would be greatly appreciated.? Thanks much, > The way that I know to do this in a regression context for (near perfect) multicollinearity is VIF. It's long been on my todo list for statsmodels. http://en.wikipedia.org/wiki/Variance_inflation_factor Maybe there are other ways with decompositions. I'd be happy to hear about them. Please post back if you write any code to do this. Skipper _______________________________________________ NumPy-Discussion mailing list NumPy-Discussion at scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion From mjanikas at esri.com Fri Aug 26 13:41:35 2011 From: mjanikas at esri.com (Mark Janikas) Date: Fri, 26 Aug 2011 10:41:35 -0700 Subject: [Numpy-discussion] Identifying Colinear Columns of a Matrix In-Reply-To: References: Message-ID: I wonder if my last statement is essentially the only answer... which I wanted to avoid... Should I just use combinations of the columns and try and construct the corrcoef() (then ID whether NaNs are present), or use the condition number to ID the singularity? I just wanted to avoid the whole k! algorithm. MJ -----Original Message----- From: numpy-discussion-bounces at scipy.org [mailto:numpy-discussion-bounces at scipy.org] On Behalf Of Mark Janikas Sent: Friday, August 26, 2011 10:35 AM To: Discussion of Numerical Python Subject: Re: [Numpy-discussion] Identifying Colinear Columns of a Matrix I actually use the VIF when the design matrix can be inverted.... I do it the quick and dirty way as opposed to the step regression: 1. 
Calc the correlation coefficient of the matrix (w/o the intercept) 2. Return the diagonal of the inversion of the correlation matrix in step 1. Again, the problem lies in the multiple column relationship... I wouldn't be able to run sub regressions at all when the columns are perfectly collinear. MJ -----Original Message----- From: numpy-discussion-bounces at scipy.org [mailto:numpy-discussion-bounces at scipy.org] On Behalf Of Skipper Seabold Sent: Friday, August 26, 2011 10:28 AM To: Discussion of Numerical Python Subject: Re: [Numpy-discussion] Identifying Colinear Columns of a Matrix On Fri, Aug 26, 2011 at 1:10 PM, Mark Janikas wrote: > Hello All, > > > > I am trying to identify columns of a matrix that are perfectly collinear. > It is not that difficult to identify when two columns are identical are have > zero variance, but I do not know how to ID when the culprit is of a higher > order. i.e. columns 1 + 2 + 3 = column 4.? NUM.corrcoef(matrix.T) will > return NaNs when the matrix is singular, and LA.cond(matrix.T) will provide > a very large condition number.. But they do not tell me which columns are > causing the problem. ??For example: > > > > zt = numpy. array([[ 1.? ,? 1.? ,? 1.? ,? 1.? ,? 1.? ], > > ?????? ????????????????????[ 0.25,? 0.1 ,? 0.2 ,? 0.25,? 0.5 ], > > ?????? ????????????????????[ 0.75,? 0.9 ,? 0.8 ,? 0.75,? 0.5 ], > > ?????? ????????????????????[ 3.? ,? 8.? ,? 0.? ,? 5.? ,? 0.? ]]) > > > > How can I identify that columns 0,1,2 are the issue because: column 1 + > column 2 = column 0? > > > > Any input would be greatly appreciated.? Thanks much, > The way that I know to do this in a regression context for (near perfect) multicollinearity is VIF. It's long been on my todo list for statsmodels. http://en.wikipedia.org/wiki/Variance_inflation_factor Maybe there are other ways with decompositions. I'd be happy to hear about them. Please post back if you write any code to do this. Skipper _______________________________________________ NumPy-Discussion mailing list NumPy-Discussion at scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion _______________________________________________ NumPy-Discussion mailing list NumPy-Discussion at scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion From fperez.net at gmail.com Fri Aug 26 14:01:46 2011 From: fperez.net at gmail.com (Fernando Perez) Date: Fri, 26 Aug 2011 20:01:46 +0200 Subject: [Numpy-discussion] Identifying Colinear Columns of a Matrix In-Reply-To: References: Message-ID: On Fri, Aug 26, 2011 at 7:41 PM, Mark Janikas wrote: > I wonder if my last statement is essentially the only answer... which I wanted to avoid... > > Should I just use combinations of the columns and try and construct the corrcoef() (then ID whether NaNs are present), or use the condition number to ID the singularity? ?I just wanted to avoid the whole k! algorithm. > This is a completely naive, off-the-top of my head reply, so most likely completely wrong. But wouldn't a Gram-Schmidt type process let you identify things here? You're effectively looking for n vectors that belong to an m-dimensional subspace with n>m. As you walk through the G-S process you could probably track the projections and identify when one of the vectors in the m-n set is 'emptied out' by the G-S projections, and would have the info of what it projected into. I don't remember the details of G-S so perhaps there's a really obvious reason why the above is dumb and doesn't work. But just in case it gets you thinking in the right direction... 
(and I'll learn something from the corrections) Cheers, f From charlesr.harris at gmail.com Fri Aug 26 14:04:07 2011 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri, 26 Aug 2011 12:04:07 -0600 Subject: [Numpy-discussion] Identifying Colinear Columns of a Matrix In-Reply-To: References: Message-ID: On Fri, Aug 26, 2011 at 11:41 AM, Mark Janikas wrote: > I wonder if my last statement is essentially the only answer... which I > wanted to avoid... > > Should I just use combinations of the columns and try and construct the > corrcoef() (then ID whether NaNs are present), or use the condition number > to ID the singularity? I just wanted to avoid the whole k! algorithm. > > MJ > > -----Original Message----- > From: numpy-discussion-bounces at scipy.org [mailto: > numpy-discussion-bounces at scipy.org] On Behalf Of Mark Janikas > Sent: Friday, August 26, 2011 10:35 AM > To: Discussion of Numerical Python > Subject: Re: [Numpy-discussion] Identifying Colinear Columns of a Matrix > > I actually use the VIF when the design matrix can be inverted.... I do it > the quick and dirty way as opposed to the step regression: > > 1. Calc the correlation coefficient of the matrix (w/o the intercept) > 2. Return the diagonal of the inversion of the correlation matrix in step > 1. > > Again, the problem lies in the multiple column relationship... I wouldn't > be able to run sub regressions at all when the columns are perfectly > collinear. > > MJ > > -----Original Message----- > From: numpy-discussion-bounces at scipy.org [mailto: > numpy-discussion-bounces at scipy.org] On Behalf Of Skipper Seabold > Sent: Friday, August 26, 2011 10:28 AM > To: Discussion of Numerical Python > Subject: Re: [Numpy-discussion] Identifying Colinear Columns of a Matrix > > On Fri, Aug 26, 2011 at 1:10 PM, Mark Janikas wrote: > > Hello All, > > > > > > > > I am trying to identify columns of a matrix that are perfectly collinear. > > It is not that difficult to identify when two columns are identical are > have > > zero variance, but I do not know how to ID when the culprit is of a > higher > > order. i.e. columns 1 + 2 + 3 = column 4. NUM.corrcoef(matrix.T) will > > return NaNs when the matrix is singular, and LA.cond(matrix.T) will > provide > > a very large condition number.. But they do not tell me which columns are > > causing the problem. For example: > > > > > > > > zt = numpy. array([[ 1. , 1. , 1. , 1. , 1. ], > > > > [ 0.25, 0.1 , 0.2 , 0.25, 0.5 ], > > > > [ 0.75, 0.9 , 0.8 , 0.75, 0.5 ], > > > > [ 3. , 8. , 0. , 5. , 0. ]]) > > > > > > > > How can I identify that columns 0,1,2 are the issue because: column 1 + > > column 2 = column 0? > > > > > > > > Any input would be greatly appreciated. Thanks much, > > > > The way that I know to do this in a regression context for (near > perfect) multicollinearity is VIF. It's long been on my todo list for > statsmodels. > > http://en.wikipedia.org/wiki/Variance_inflation_factor > > Maybe there are other ways with decompositions. I'd be happy to hear about > them. > > Please post back if you write any code to do this. > > Why not svd? In [13]: u,d,v = svd(zt) In [14]: d Out[14]: array([ 1.01307066e+01, 1.87795095e+00, 3.03454566e-01, 3.29253945e-16]) In [15]: u[:,3] Out[15]: array([ 0.57735027, -0.57735027, -0.57735027, 0. ]) In [16]: dot(u[:,3], zt) Out[16]: array([ -7.77156117e-16, -6.66133815e-16, -7.21644966e-16, -7.77156117e-16, -8.88178420e-16]) Chuck -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From josef.pktd at gmail.com Fri Aug 26 14:13:43 2011 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Fri, 26 Aug 2011 14:13:43 -0400 Subject: [Numpy-discussion] Identifying Colinear Columns of a Matrix In-Reply-To: References: Message-ID: On Fri, Aug 26, 2011 at 1:41 PM, Mark Janikas wrote: > I wonder if my last statement is essentially the only answer... which I wanted to avoid... > > Should I just use combinations of the columns and try and construct the corrcoef() (then ID whether NaNs are present), or use the condition number to ID the singularity? ?I just wanted to avoid the whole k! algorithm. > > MJ > > -----Original Message----- > From: numpy-discussion-bounces at scipy.org [mailto:numpy-discussion-bounces at scipy.org] On Behalf Of Mark Janikas > Sent: Friday, August 26, 2011 10:35 AM > To: Discussion of Numerical Python > Subject: Re: [Numpy-discussion] Identifying Colinear Columns of a Matrix > > I actually use the VIF when the design matrix can be inverted.... I do it the quick and dirty way as opposed to the step regression: > > 1. Calc the correlation coefficient of the matrix (w/o the intercept) > 2. Return the diagonal of the inversion of the correlation matrix in step 1. > > Again, the problem lies in the multiple column relationship... I wouldn't be able to run sub regressions at all when the columns are perfectly collinear. > > MJ > > -----Original Message----- > From: numpy-discussion-bounces at scipy.org [mailto:numpy-discussion-bounces at scipy.org] On Behalf Of Skipper Seabold > Sent: Friday, August 26, 2011 10:28 AM > To: Discussion of Numerical Python > Subject: Re: [Numpy-discussion] Identifying Colinear Columns of a Matrix > > On Fri, Aug 26, 2011 at 1:10 PM, Mark Janikas wrote: >> Hello All, >> >> >> >> I am trying to identify columns of a matrix that are perfectly collinear. >> It is not that difficult to identify when two columns are identical are have >> zero variance, but I do not know how to ID when the culprit is of a higher >> order. i.e. columns 1 + 2 + 3 = column 4.? NUM.corrcoef(matrix.T) will >> return NaNs when the matrix is singular, and LA.cond(matrix.T) will provide >> a very large condition number.. But they do not tell me which columns are >> causing the problem. ??For example: >> >> >> >> zt = numpy. array([[ 1.? ,? 1.? ,? 1.? ,? 1.? ,? 1.? ], >> >> ?????? ????????????????????[ 0.25,? 0.1 ,? 0.2 ,? 0.25,? 0.5 ], >> >> ?????? ????????????????????[ 0.75,? 0.9 ,? 0.8 ,? 0.75,? 0.5 ], >> >> ?????? ????????????????????[ 3.? ,? 8.? ,? 0.? ,? 5.? ,? 0.? ]]) >> >> >> >> How can I identify that columns 0,1,2 are the issue because: column 1 + >> column 2 = column 0? >> >> >> >> Any input would be greatly appreciated.? Thanks much, >> > > The way that I know to do this in a regression context for (near > perfect) multicollinearity is VIF. It's long been on my todo list for > statsmodels. > > http://en.wikipedia.org/wiki/Variance_inflation_factor > > Maybe there are other ways with decompositions. I'd be happy to hear about them. > > Please post back if you write any code to do this. Partial answer in a different context. I have written a function that only adds columns if they maintain invertibility, using brute force: add each column sequentially, check whether the matrix is singular. Don't add the columns that already included as linear combination. (But this doesn't tell which columns are in the colinear vector.) I did this for categorical variables, so sequence was predefined. 
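A minimal sketch of that brute-force idea (only an illustration, not the actual function referred to above; the helper name independent_columns is made up, and it uses numpy.linalg.matrix_rank rather than an explicit invertibility check):

    import numpy as np

    def independent_columns(X, tol=1e-10):
        # keep a column only if it does not make the selected set rank-deficient
        keep = []
        for j in range(X.shape[1]):
            if np.linalg.matrix_rank(X[:, keep + [j]], tol=tol) == len(keep) + 1:
                keep.append(j)
        return keep

    zt = np.array([[1.,   1.,   1.,   1.,   1.  ],
                   [0.25, 0.1,  0.2,  0.25, 0.5 ],
                   [0.75, 0.9,  0.8,  0.75, 0.5 ],
                   [3.,   8.,   0.,   5.,   0.  ]])
    print(independent_columns(zt.T))     # [0, 1, 3]

The rejected index (2 here) is the redundant column, although this still does not say which combination of the kept columns reproduces it.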
Just finding a non-singular subspace would be easier, PCA, SVD, or scikits.learn matrix decomposition (?). (factor models and Johansen's cointegration tests are also just doing matrix decomposition that identify subspaces) Maybe rotation in Factor Analysis is able to identify the vectors, but I don't have much idea about that. Josef > > Skipper > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From mjanikas at esri.com Fri Aug 26 14:38:28 2011 From: mjanikas at esri.com (Mark Janikas) Date: Fri, 26 Aug 2011 11:38:28 -0700 Subject: [Numpy-discussion] Identifying Colinear Columns of a Matrix In-Reply-To: References: Message-ID: Charles! That looks like it could be a winner! It looks like you always choose the last column of the U matrix and ID the columns that have the same values? It works when I add extra columns as well! BTW, sorry for my lack of knowledge... but what was the point of the dot multiply at the end? That they add up to essentially zero, indicating singularity? Thanks so much! MJ From: numpy-discussion-bounces at scipy.org [mailto:numpy-discussion-bounces at scipy.org] On Behalf Of Charles R Harris Sent: Friday, August 26, 2011 11:04 AM To: Discussion of Numerical Python Subject: Re: [Numpy-discussion] Identifying Colinear Columns of a Matrix On Fri, Aug 26, 2011 at 11:41 AM, Mark Janikas > wrote: I wonder if my last statement is essentially the only answer... which I wanted to avoid... Should I just use combinations of the columns and try and construct the corrcoef() (then ID whether NaNs are present), or use the condition number to ID the singularity? I just wanted to avoid the whole k! algorithm. MJ -----Original Message----- From: numpy-discussion-bounces at scipy.org [mailto:numpy-discussion-bounces at scipy.org] On Behalf Of Mark Janikas Sent: Friday, August 26, 2011 10:35 AM To: Discussion of Numerical Python Subject: Re: [Numpy-discussion] Identifying Colinear Columns of a Matrix I actually use the VIF when the design matrix can be inverted.... I do it the quick and dirty way as opposed to the step regression: 1. Calc the correlation coefficient of the matrix (w/o the intercept) 2. Return the diagonal of the inversion of the correlation matrix in step 1. Again, the problem lies in the multiple column relationship... I wouldn't be able to run sub regressions at all when the columns are perfectly collinear. MJ -----Original Message----- From: numpy-discussion-bounces at scipy.org [mailto:numpy-discussion-bounces at scipy.org] On Behalf Of Skipper Seabold Sent: Friday, August 26, 2011 10:28 AM To: Discussion of Numerical Python Subject: Re: [Numpy-discussion] Identifying Colinear Columns of a Matrix On Fri, Aug 26, 2011 at 1:10 PM, Mark Janikas > wrote: > Hello All, > > > > I am trying to identify columns of a matrix that are perfectly collinear. > It is not that difficult to identify when two columns are identical are have > zero variance, but I do not know how to ID when the culprit is of a higher > order. i.e. columns 1 + 2 + 3 = column 4. 
NUM.corrcoef(matrix.T) will > return NaNs when the matrix is singular, and LA.cond(matrix.T) will provide > a very large condition number.. But they do not tell me which columns are > causing the problem. For example: > > > > zt = numpy. array([[ 1. , 1. , 1. , 1. , 1. ], > > [ 0.25, 0.1 , 0.2 , 0.25, 0.5 ], > > [ 0.75, 0.9 , 0.8 , 0.75, 0.5 ], > > [ 3. , 8. , 0. , 5. , 0. ]]) > > > > How can I identify that columns 0,1,2 are the issue because: column 1 + > column 2 = column 0? > > > > Any input would be greatly appreciated. Thanks much, > The way that I know to do this in a regression context for (near perfect) multicollinearity is VIF. It's long been on my todo list for statsmodels. http://en.wikipedia.org/wiki/Variance_inflation_factor Maybe there are other ways with decompositions. I'd be happy to hear about them. Please post back if you write any code to do this. Why not svd? In [13]: u,d,v = svd(zt) In [14]: d Out[14]: array([ 1.01307066e+01, 1.87795095e+00, 3.03454566e-01, 3.29253945e-16]) In [15]: u[:,3] Out[15]: array([ 0.57735027, -0.57735027, -0.57735027, 0. ]) In [16]: dot(u[:,3], zt) Out[16]: array([ -7.77156117e-16, -6.66133815e-16, -7.21644966e-16, -7.77156117e-16, -8.88178420e-16]) Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From cjordan1 at uw.edu Fri Aug 26 14:47:59 2011 From: cjordan1 at uw.edu (Christopher Jordan-Squire) Date: Fri, 26 Aug 2011 13:47:59 -0500 Subject: [Numpy-discussion] NA mask C-API documentation In-Reply-To: References: Message-ID: Regarding ufuncs and NA's, all the mechanics of handling NA from a ufunc are in the PyUFunc_FromFuncAndData function, right? So the ufunc creation docs don't have to be updated to include NA's? -Chris JS On Wed, Aug 24, 2011 at 7:08 PM, Mark Wiebe wrote: > I've added C-API documentation to the missingdata branch. The .rst file > (beware of the github rst parser though, it drops some of the content) is > here: > https://github.com/m-paradox/numpy/blob/missingdata/doc/source/reference/c-api.maskna.rst > and I made a small example module which goes with it here: > https://github.com/m-paradox/spdiv > Cheers, > Mark > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > From charlesr.harris at gmail.com Fri Aug 26 14:57:26 2011 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri, 26 Aug 2011 12:57:26 -0600 Subject: [Numpy-discussion] Identifying Colinear Columns of a Matrix In-Reply-To: References: Message-ID: On Fri, Aug 26, 2011 at 12:38 PM, Mark Janikas wrote: > Charles! That looks like it could be a winner! It looks like you always > choose the last column of the U matrix and ID the columns that have the same > values? It works when I add extra columns as well! BTW, sorry for my lack > of knowledge? but what was the point of the dot multiply at the end? That > they add up to essentially zero, indicating singularity? Thanks so much! > The indicator of collinearity is the singular value in d, the corresponding column in u represent the linear combination of rows that are ~0, the corresponding row in v represents the linear combination of columns that are ~0. If you have several combinations that are ~0, of course you can add them together and get another. 
Basically, if you take the rows in v corresponding to small singular values, you get a basis for the for the null space of the matrix, the corresponding columns in u are a basis for the orthogonal complement of the range of the matrix. If that is getting a bit technical you can just play around with things. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From mwwiebe at gmail.com Fri Aug 26 15:58:42 2011 From: mwwiebe at gmail.com (Mark Wiebe) Date: Fri, 26 Aug 2011 12:58:42 -0700 Subject: [Numpy-discussion] NA mask C-API documentation In-Reply-To: References: Message-ID: On Fri, Aug 26, 2011 at 11:47 AM, Christopher Jordan-Squire wrote: > Regarding ufuncs and NA's, all the mechanics of handling NA from a > ufunc are in the PyUFunc_FromFuncAndData function, right? So the ufunc > creation docs don't have to be updated to include NA's? > That's correct, any ufunc will automatically support NAs with a propagation approach. It's probably worth mentioning this in the ufunc docs. I've added some additional type resolution and loop selection functions, but I'd rather keep them private in NumPy for a version or two so improvements can be made as experience is gained with them. Unfortunately some aspects of this are in public headers because of how the API is designed, ideally more of the classes struct layouts should be hidden from the ABI just as I've done in deprecating that access for PyArrayObject. -Mark > > -Chris JS > > On Wed, Aug 24, 2011 at 7:08 PM, Mark Wiebe wrote: > > I've added C-API documentation to the missingdata branch. The .rst file > > (beware of the github rst parser though, it drops some of the content) is > > here: > > > https://github.com/m-paradox/numpy/blob/missingdata/doc/source/reference/c-api.maskna.rst > > and I made a small example module which goes with it here: > > https://github.com/m-paradox/spdiv > > Cheers, > > Mark > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Fri Aug 26 16:03:21 2011 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Fri, 26 Aug 2011 16:03:21 -0400 Subject: [Numpy-discussion] Identifying Colinear Columns of a Matrix In-Reply-To: References: Message-ID: On Fri, Aug 26, 2011 at 2:57 PM, Charles R Harris wrote: > > > On Fri, Aug 26, 2011 at 12:38 PM, Mark Janikas wrote: >> >> Charles!? That looks like it could be a winner!? It looks like you always >> choose the last column of the U matrix and ID the columns that have the same >> values?? It works when I add extra columns as well!? BTW, sorry for my lack >> of knowledge? but what was the point of the dot multiply at the end?? That >> they add up to essentially zero, indicating singularity?? Thanks so much! > > The indicator of collinearity is the singular value in d, the corresponding > column in u represent the linear combination of rows that are ~0, the > corresponding row in v represents the linear combination of columns that are > ~0. If you have several combinations that are ~0, of course you can add them > together and get another. 
Basically, if you take the rows in v corresponding > to small singular values, you get a basis for the for the null space of the > matrix, the corresponding columns in u are a basis for the orthogonal > complement of the range of the matrix. If that is getting a bit technical > you can just play around with things. Interpretation is a bit difficult if there are more than one zero eigenvalues >>> zt2 = np.vstack((zt, zt[2,:] + zt[3,:])) >>> zt2 array([[ 1. , 1. , 1. , 1. , 1. ], [ 0.25, 0.1 , 0.2 , 0.25, 0.5 ], [ 0.75, 0.9 , 0.8 , 0.75, 0.5 ], [ 3. , 8. , 0. , 5. , 0. ], [ 3.75, 8.9 , 0.8 , 5.75, 0.5 ]]) >>> u,d,v = np.linalg.svd(zt2) >>> d array([ 1.51561431e+01, 1.91327688e+00, 3.25113875e-01, 1.05664844e-15, 5.29054218e-16]) >>> u[:,-2:] array([[ 0.59948553, -0.12496837], [-0.59948553, 0.12496837], [-0.51747833, -0.48188813], [ 0.0820072 , -0.60685651], [-0.0820072 , 0.60685651]]) Josef > > > > Chuck > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > From brett.olsen at gmail.com Fri Aug 26 16:20:20 2011 From: brett.olsen at gmail.com (Brett Olsen) Date: Fri, 26 Aug 2011 15:20:20 -0500 Subject: [Numpy-discussion] How to output array with indexes to a text file? In-Reply-To: <1314299436.18748.19.camel@mattotaupa> References: <1314299436.18748.19.camel@mattotaupa> Message-ID: On Thu, Aug 25, 2011 at 2:10 PM, Paul Menzel wrote: > is there an easy way to also save the indexes of an array (columns, rows > or both) when outputting it to a text file. For saving an array to a > file I only found `savetxt()` [1] which does not seem to have such an > option. Adding indexes manually is doable but I would like to avoid > that. > Is there a way to accomplish that task without reserving the 0th row or > column to store the indexes? > > I want to process these text files to produce graphs and MetaPost?s [2] > graph package needs these indexes. (I know about Matplotlib [3], but I > would like to use MetaPost.) > > > Thanks, > > Paul Why don't you just write a wrapper for numpy.savetxt that adds the indices? E.g.: In [1]: import numpy as N In [2]: a = N.arange(6,12).reshape((2,3)) In [3]: a Out[3]: array([[ 6, 7, 8], [ 9, 10, 11]]) In [4]: def save_with_indices(filename, output): ...: (rows, cols) = output.shape ...: tmp = N.hstack((N.arange(1,rows+1).reshape((rows,1)), output)) ...: tmp = N.vstack((N.arange(cols+1).reshape((1,cols+1)), tmp)) ...: N.savetxt(filename, tmp, fmt='%8i') ...: In [5]: N.savetxt('noidx.txt', a, fmt='%8i') In [6]: save_with_indices('idx.txt', a) 'noidx.txt' looks like: 6 7 8 9 10 11 'idx.txt' looks like: 0 1 2 3 1 6 7 8 2 9 10 11 ~Brett From ralf.gommers at googlemail.com Fri Aug 26 17:52:52 2011 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Fri, 26 Aug 2011 23:52:52 +0200 Subject: [Numpy-discussion] the build and installation process In-Reply-To: References: Message-ID: On Thu, Aug 25, 2011 at 2:23 PM, srean wrote: > Hi, > > I would like to know a bit about how the installation process works. Could > you point me to a resource. In particular I want to know how the site.cfg > configuration works. Is it numpy/scipy specific or is it standard with > distutils. I googled for site.cfg and distutils but did not find any > authoritative document. There is not much more than what's described in the site.cfg.example file that's in the numpy source tree root dir. 
As far as I know the site.cfg name is numpy specific, but python distutils uses a distutils.cfg file in the same format. > > I believe many new users trip up on the installation process, especially in > trying to substitute their favourite library in place os the standard. So a > canonical document explaining the process will be very helpful. > > http://docs.scipy.org/doc/numpy/user/install.html > The most up-to-date descriptions for each OS can be found at http://www.scipy.org/Installing_SciPy > > does cover some of the important points but its a bit sketchy, and has a > "this is all that you need to know" flavor. Doesnt quite enable the reader > to fix his own problems. So a resource that is somewhere in between reading > up all the sources that get invoked during the installation and building, > and the current install document will be very welcome. > > English is not my native language, but if there is anyway I can help, I > would do so gladly. > If the above docs don't help as much as you'd want, please point out the most problematic points. The install instructions are a wiki so you can make changes yourself. Especially about things like linking to specific versions of MKL there's not enough or outdated info, any contributions there will be very useful. Cheers, Ralf > -- srean > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From cournape at gmail.com Sat Aug 27 06:28:14 2011 From: cournape at gmail.com (David Cournapeau) Date: Sat, 27 Aug 2011 12:28:14 +0200 Subject: [Numpy-discussion] Removing numscons, adding bento scripts to main branch ? Message-ID: Hi there, I am finally at a stage where bento can do most of what numscons could do. I would rather avoid having 3 different set of build scripts (distutils+bento+numscons) to maintain in the long term, so I would favor removing numscons scripts from numpy and scipy. I was thinking about keeping maybe numscons scripts for one release for both numpy/scipy, with a warning about their deprecation, and then removing them one release later. Does that sound ok with everyone ? cheers, David From ralf.gommers at googlemail.com Sat Aug 27 07:31:17 2011 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Sat, 27 Aug 2011 13:31:17 +0200 Subject: [Numpy-discussion] Removing numscons, adding bento scripts to main branch ? In-Reply-To: References: Message-ID: On Sat, Aug 27, 2011 at 12:28 PM, David Cournapeau wrote: > Hi there, > > I am finally at a stage where bento can do most of what numscons could > do. I would rather avoid having 3 different set of build scripts > (distutils+bento+numscons) to maintain in the long term, so I would > favor removing numscons scripts from numpy and scipy. > > That's awesome! > I was thinking about keeping maybe numscons scripts for one release > for both numpy/scipy, with a warning about their deprecation, and then > removing them one release later. > > Does that sound ok with everyone ? > > Sounds like the right thing to do. Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Sat Aug 27 08:30:25 2011 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 27 Aug 2011 06:30:25 -0600 Subject: [Numpy-discussion] Removing numscons, adding bento scripts to main branch ? 
In-Reply-To: References: Message-ID: On Sat, Aug 27, 2011 at 4:28 AM, David Cournapeau wrote: > Hi there, > > I am finally at a stage where bento can do most of what numscons could > do. I would rather avoid having 3 different set of build scripts > (distutils+bento+numscons) to maintain in the long term, so I would > favor removing numscons scripts from numpy and scipy. > > I was thinking about keeping maybe numscons scripts for one release > for both numpy/scipy, with a warning about their deprecation, and then > removing them one release later. > > Does that sound ok with everyone ? > > Sounds good. The numscons scripts don't work for python3 builds anyway. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From mwwiebe at gmail.com Sat Aug 27 12:13:29 2011 From: mwwiebe at gmail.com (Mark Wiebe) Date: Sat, 27 Aug 2011 09:13:29 -0700 Subject: [Numpy-discussion] Removing numscons, adding bento scripts to main branch ? In-Reply-To: References: Message-ID: On Sat, Aug 27, 2011 at 3:28 AM, David Cournapeau wrote: > Hi there, > > I am finally at a stage where bento can do most of what numscons could > do. I would rather avoid having 3 different set of build scripts > (distutils+bento+numscons) to maintain in the long term, so I would > favor removing numscons scripts from numpy and scipy. > > I was thinking about keeping maybe numscons scripts for one release > for both numpy/scipy, with a warning about their deprecation, and then > removing them one release later. > > Does that sound ok with everyone ? > Sounds great to me! -Mark > > cheers, > > David > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From teoliphant at gmail.com Sat Aug 27 13:53:01 2011 From: teoliphant at gmail.com (Travis Oliphant) Date: Sat, 27 Aug 2011 12:53:01 -0500 Subject: [Numpy-discussion] Removing numscons, adding bento scripts to main branch ? In-Reply-To: References: Message-ID: <0D65F094-742E-4379-B953-5231BDD26EBE@enthought.com> Three cheers! -Travis On Aug 27, 2011, at 5:28 AM, David Cournapeau wrote: > Hi there, > > I am finally at a stage where bento can do most of what numscons could > do. I would rather avoid having 3 different set of build scripts > (distutils+bento+numscons) to maintain in the long term, so I would > favor removing numscons scripts from numpy and scipy. > > I was thinking about keeping maybe numscons scripts for one release > for both numpy/scipy, with a warning about their deprecation, and then > removing them one release later. > > Does that sound ok with everyone ? > > cheers, > > David > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion --- Travis Oliphant Enthought, Inc. oliphant at enthought.com 1-512-536-1057 http://www.enthought.com From teoliphant at gmail.com Sat Aug 27 13:56:26 2011 From: teoliphant at gmail.com (Travis Oliphant) Date: Sat, 27 Aug 2011 12:56:26 -0500 Subject: [Numpy-discussion] Removing numscons, adding bento scripts to main branch ? In-Reply-To: References: Message-ID: <92EE6D63-023A-498F-AF4E-45453AF2B986@enthought.com> Three cheers! Thanks David, -Travis On Aug 27, 2011, at 5:28 AM, David Cournapeau wrote: > Hi there, > > I am finally at a stage where bento can do most of what numscons could > do. 
I would rather avoid having 3 different set of build scripts > (distutils+bento+numscons) to maintain in the long term, so I would > favor removing numscons scripts from numpy and scipy. > > I was thinking about keeping maybe numscons scripts for one release > for both numpy/scipy, with a warning about their deprecation, and then > removing them one release later. > > Does that sound ok with everyone ? > > cheers, > > David > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion --- Travis Oliphant Enthought, Inc. oliphant at enthought.com 1-512-536-1057 http://www.enthought.com From cjordan1 at uw.edu Sat Aug 27 14:08:25 2011 From: cjordan1 at uw.edu (Christopher Jordan-Squire) Date: Sat, 27 Aug 2011 14:08:25 -0400 Subject: [Numpy-discussion] load from text files Pull Request Review Message-ID: Hi-- I've submitted a pull request for a new method for loading data from text files into a record array/masked record array. https://github.com/numpy/numpy/pull/143 Click on the link for more info, but the general idea is to create a regular expression for what entries should look like and loop over the file, updating the regular expression if it's wrong. Once the types are determined the file is loaded line by line into a pre-allocated numpy array. Compared to genfromtxt this function has several advantages/potential advantages. *More modular (genfromtxt is a rather large, nearly 500 line, monolithic function. In my pull request no individual method is longer than around 80 lines, and they're fairly self-contained.) *delimiters can be specified via regex's *missing data can be specified via regex's *it's bit simpler and has sensible defaults *it actually works on some (unfortunately proprietary) data that genfromtxt doesn't seem robust enough for *it supports datetimes *fairly extensible for the power user *makes two passes through the file, the first to determine types/sizes for strings and the second to read in the data, and pre-allocates the array for the second pass. So no giant memory bloating for reading large text files *fairly fast, though I think there is plenty of room for optimizations All that said, it's entirely possible that the innards which determine the type should be ripped out and submitted as a function on their own. I'd love suggestions for improvements, as well as suggestions for a better name. (Currently it's called loadtable, which I don't really like. It was just a working name.) -Chris Jordan-Squire From dominique.orban at gmail.com Sat Aug 27 16:09:11 2011 From: dominique.orban at gmail.com (Dominique Orban) Date: Sat, 27 Aug 2011 20:09:11 +0000 Subject: [Numpy-discussion] numpy.log does not raise exceptions Message-ID: Hi, I'm wondering why numpy.log doesn't raise a ValueError exception the way math.log does: 1< import numpy as np 2< np.log([-1]) Warning: invalid value encountered in log 2> array([ nan]) 3< import math 4< math.log(-1) --------------------------------------------------------------------------- ValueError Traceback (most recent call last) It would make it a lot easier to trap domain errors than using isnan(). 
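For concreteness, the isnan()-based check I have in mind is something along these lines (just a sketch):

import numpy as np

x = np.array([2.0, -1.0, 0.5])
y = np.log(x)  # quietly yields nan for the negative entry
if np.any(np.isnan(y)):
    # the domain error has to be detected and handled by hand
    raise ValueError("np.log received a negative argument")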
Thanks, -- Dominique From robert.kern at gmail.com Sat Aug 27 16:37:14 2011 From: robert.kern at gmail.com (Robert Kern) Date: Sat, 27 Aug 2011 15:37:14 -0500 Subject: [Numpy-discussion] numpy.log does not raise exceptions In-Reply-To: References: Message-ID: On Sat, Aug 27, 2011 at 15:09, Dominique Orban wrote: > Hi, > > I'm wondering why numpy.log doesn't raise a ValueError exception the > way math.log does: > > 1< import numpy as np > 2< np.log([-1]) > Warning: invalid value encountered in log > 2> array([ nan]) > > 3< import math > 4< math.log(-1) > --------------------------------------------------------------------------- > ValueError ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?Traceback (most recent call last) > > It would make it a lot easier to trap domain errors than using isnan(). The reason we don't raise exceptions by default is because when processing large arrays, you usually don't want to cancel the whole operation just because some values were out of the domain. You would rather get an array with NaNs in the elements that had invalid inputs so you can do something useful with the other elements and actually track down where the NaNs got their bad inputs. Always raising an exception destroys that information. That said, if you do want to raise an exception, this is entirely configurable. http://docs.scipy.org/doc/numpy/reference/generated/numpy.seterr.html [~] |1> import numpy as np [~] |2> np.log([-1]) Warning: invalid value encountered in log array([ nan]) [~] |3> np.seterr(invalid='raise') {'divide': 'print', 'invalid': 'print', 'over': 'print', 'under': 'ignore'} [~] |4> np.log([-1]) --------------------------------------------------------------------------- FloatingPointError Traceback (most recent call last) /Users/rkern/ in () ----> 1 np.log([-1]) FloatingPointError: invalid value encountered in log -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." ? -- Umberto Eco From pav at iki.fi Sat Aug 27 18:02:11 2011 From: pav at iki.fi (Pauli Virtanen) Date: Sat, 27 Aug 2011 22:02:11 +0000 (UTC) Subject: [Numpy-discussion] Removing numscons, adding bento scripts to main branch ? References: Message-ID: Hey, Sat, 27 Aug 2011 12:28:14 +0200, David Cournapeau wrote: > I am finally at a stage where bento can do most of what numscons could > do. I would rather avoid having 3 different set of build scripts > (distutils+bento+numscons) to maintain in the long term, so I would > favor removing numscons scripts from numpy and scipy. > > I was thinking about keeping maybe numscons scripts for one release for > both numpy/scipy, with a warning about their deprecation, and then > removing them one release later. Definite +1 from me! Pauli From dominique.orban at gmail.com Sun Aug 28 12:36:27 2011 From: dominique.orban at gmail.com (dpo) Date: Sun, 28 Aug 2011 09:36:27 -0700 (PDT) Subject: [Numpy-discussion] numpy.log does not raise exceptions In-Reply-To: References: Message-ID: <32352209.post@talk.nabble.com> Robert Kern-2 wrote: > > The reason we don't raise exceptions by default is because when > processing large arrays, you usually don't want to cancel the whole > operation just because some values were out of the domain. You would > rather get an array with NaNs in the elements that had invalid inputs > so you can do something useful with the other elements and actually > track down where the NaNs got their bad inputs. 
Always raising an > exception destroys that information. > > That said, if you do want to raise an exception, this is entirely > configurable. > > http://docs.scipy.org/doc/numpy/reference/generated/numpy.seterr.html > > [~] > |1> import numpy as np > > [~] > |2> np.log([-1]) > Warning: invalid value encountered in log > array([ nan]) > > [~] > |3> np.seterr(invalid='raise') > {'divide': 'print', 'invalid': 'print', 'over': 'print', 'under': > 'ignore'} > > [~] > |4> np.log([-1]) > --------------------------------------------------------------------------- > FloatingPointError Traceback (most recent call > last) > /Users/rkern/ in () > ----> 1 np.log([-1]) > > FloatingPointError: invalid value encountered in log > Excellent, thanks. I was hoping it would be configurable. Dominique -- View this message in context: http://old.nabble.com/numpy.log-does-not-raise-exceptions-tp32348907p32352209.html Sent from the Numpy-discussion mailing list archive at Nabble.com. From rpmuller at gmail.com Mon Aug 29 10:56:31 2011 From: rpmuller at gmail.com (Rick Muller) Date: Mon, 29 Aug 2011 08:56:31 -0600 Subject: [Numpy-discussion] Eigenvalues did not converge Message-ID: I'm bumping into the old "Eigenvalues did not converge" error using numpy.linalg.eigh() on several different linux builds of numpy (1.4.1). The matrix is 166x166. I can compute the eigenvalues on a Macintosh build of numpy, and I can confirm that there aren't degenerate eigenvalues, and that the matrix appears to be negative definite. I've seen this before (though not for several years), and what I normally do is to build lapack with -O0. This trick did not work in the current instance. Does anyone have any tricks to getting eigh to work? Other weird things that I've noticed about this case: I can compute the eigenvalues using eigvals and eigvalsh, and can compute the eigenvals/vecs using eig(). The matrix is real symmetric, and I've tested that it's symmetric enough by forcibly symmetrizing it. Thanks in advance for any help you can offer. -- Rick Muller rpmuller at gmail.com 505-750-7557 -------------- next part -------------- An HTML attachment was scrubbed... URL: From dhanjal at telecom-paristech.fr Mon Aug 29 11:21:05 2011 From: dhanjal at telecom-paristech.fr (Charanpal Dhanjal) Date: Mon, 29 Aug 2011 16:21:05 +0100 Subject: [Numpy-discussion] Eigenvalues did not converge In-Reply-To: References: Message-ID: <9992db82607f9fe061235882402b64f7@telecom-paristech.fr> I posted a similar question about the non-convergence of numpy.linalg.svd a few weeks ago. I'm not sure I can help but I wonder if you compiled numpy with ATLAS/MKL support (try numpy.show_config()) and whether it made a difference? Also what is the condition number and Frobenius norm of the matrix in question? Charanpal On Mon, 29 Aug 2011 08:56:31 -0600, Rick Muller wrote: > Im bumping into the old "Eigenvalues did not converge" error using > numpy.linalg.eigh() on several different linux builds of numpy > (1.4.1). The matrix is 166x166. I can compute the eigenvalues on a > Macintosh build of numpy, and I can confirm that there arent > degenerate eigenvalues, and that the matrix appears to be negative > definite. > > Ive seen this before (though not for several years), and what I > normally do is to build lapack with -O0. This trick did not work in > the current instance. Does anyone have any tricks to getting eigh to > work? 
> > Other weird things that Ive noticed about this case: I can compute > the eigenvalues using eigvals and eigvalsh, and can compute the > eigenvals/vecs using eig(). The matrix is real symmetric, and Ive > tested that its symmetric enough by forcibly symmetrizing it. > > Thanks in advance for any help you can offer. From paul.anton.letnes at gmail.com Mon Aug 29 11:31:09 2011 From: paul.anton.letnes at gmail.com (Paul Anton Letnes) Date: Mon, 29 Aug 2011 16:31:09 +0100 Subject: [Numpy-discussion] Eigenvalues did not converge In-Reply-To: <9992db82607f9fe061235882402b64f7@telecom-paristech.fr> References: <9992db82607f9fe061235882402b64f7@telecom-paristech.fr> Message-ID: I recently got into trouble with these calculations (although I used scipy). I actually got segfaults and "bus errors". The solution for me was to not link against ATLAS, but rather link against Apple's blas/lapack libraries. That got everything working again. I would suggest trying to install against something other than ATLAS and see if that helps (or, more generally, determining which blas/lapack you are linking against, and try something else). Paul On 29. aug. 2011, at 16.21, Charanpal Dhanjal wrote: > I posted a similar question about the non-convergence of > numpy.linalg.svd a few weeks ago. I'm not sure I can help but I wonder > if you compiled numpy with ATLAS/MKL support (try numpy.show_config()) > and whether it made a difference? Also what is the condition number and > Frobenius norm of the matrix in question? > > Charanpal > > On Mon, 29 Aug 2011 08:56:31 -0600, Rick Muller wrote: >> Im bumping into the old "Eigenvalues did not converge" error using >> numpy.linalg.eigh() on several different linux builds of numpy >> (1.4.1). The matrix is 166x166. I can compute the eigenvalues on a >> Macintosh build of numpy, and I can confirm that there arent >> degenerate eigenvalues, and that the matrix appears to be negative >> definite. >> >> Ive seen this before (though not for several years), and what I >> normally do is to build lapack with -O0. This trick did not work in >> the current instance. Does anyone have any tricks to getting eigh to >> work? >> >> Other weird things that Ive noticed about this case: I can compute >> the eigenvalues using eigvals and eigvalsh, and can compute the >> eigenvals/vecs using eig(). The matrix is real symmetric, and Ive >> tested that its symmetric enough by forcibly symmetrizing it. >> >> Thanks in advance for any help you can offer. 
> > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From marquett at iap.fr Tue Aug 30 04:46:24 2011 From: marquett at iap.fr (Marquette Jean-Baptiste) Date: Tue, 30 Aug 2011 10:46:24 +0200 Subject: [Numpy-discussion] A question about dtype syntax Message-ID: Hi all, I have this piece of code: Stats = [CatBase, round(stats.mean(Data.Ra), 5), round(stats.mean(Data.Dec), 5), len(Sep), round(stats.mean(Sep),4), round(stats.stdev(Sep),4)] print Stats if First: StatsAll = np.array(np.asarray(Stats), dtype=('a11, f8, f8, i4, f8, f8')) First = False else: StatsAll = np.vstack((StatsAll, np.asarray(Stats))) print len(StatsAll) This yields the error: ['bs3000k.cat', 280.60341, -7.09118, 9480, 0.2057, 0.14] Traceback (most recent call last): File "/Users/marquett/workspace/Distort/src/StatsSep.py", line 40, in StatsAll = np.array(np.asarray(Stats), dtype=('a11, f8, f8, i4, f8, f8')) ValueError: could not convert string to float: bs3000k.cat What's wrong ? Thanks for your help Cheers JB -------------- next part -------------- An HTML attachment was scrubbed... URL: From pgmdevlist at gmail.com Tue Aug 30 05:50:48 2011 From: pgmdevlist at gmail.com (Pierre GM) Date: Tue, 30 Aug 2011 11:50:48 +0200 Subject: [Numpy-discussion] A question about dtype syntax In-Reply-To: References: Message-ID: <6AB4E3BA-C9B9-4D99-A470-259CD81589A7@gmail.com> On Aug 30, 2011, at 10:46 AM, Marquette Jean-Baptiste wrote: > Hi all, > > I have this piece of code: > > Stats = [CatBase, round(stats.mean(Data.Ra), 5), round(stats.mean(Data.Dec), 5), len(Sep), round(stats.mean(Sep),4), round(stats.stdev(Sep),4)] > print Stats > if First: > StatsAll = np.array(np.asarray(Stats), dtype=('a11, f8, f8, i4, f8, f8')) > First = False > else: > StatsAll = np.vstack((StatsAll, np.asarray(Stats))) > print len(StatsAll) > > This yields the error: > > ['bs3000k.cat', 280.60341, -7.09118, 9480, 0.2057, 0.14] > Traceback (most recent call last): > File "/Users/marquett/workspace/Distort/src/StatsSep.py", line 40, in > StatsAll = np.array(np.asarray(Stats), dtype=('a11, f8, f8, i4, f8, f8')) > ValueError: could not convert string to float: bs3000k.cat > > What's wrong ? My guess: Stats is a list of 5 elements, but you want a list of 1 5-element tuple to match the type. > Stats = [(CatBase, round(stats.mean(Data.Ra), 5), round(stats.mean(Data.Dec), 5), len(Sep), round(stats.mean(Sep),4), round(stats.stdev(Sep),4),)] From qisheng at multicorewareinc.com Tue Aug 30 05:51:28 2011 From: qisheng at multicorewareinc.com (Qisheng Yang) Date: Tue, 30 Aug 2011 17:51:28 +0800 Subject: [Numpy-discussion] Want to find a scientific app using NumPy to process large set of data, say more than 1000000 elements in ndarray. Message-ID: Hello, All As the subject say, I want to exercise *multiprocessing *module in NumPy in order to take advantage of multi-cores. A project which processing large set of data will be useful to compare single thread with multi-thread. I have reviewed some projects using NumPy/SciPy list on SciPy homepage. But I haven't yet found a project which using NumPy ufunc to process large set of data. Any suggestions would be greatly appreciated. Thanks much. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From ndbecker2 at gmail.com Tue Aug 30 08:30:54 2011 From: ndbecker2 at gmail.com (Neal Becker) Date: Tue, 30 Aug 2011 08:30:54 -0400 Subject: [Numpy-discussion] wierd numpy.void behavior Message-ID: I've encountered something weird about numpy.void. arr = np.empty ((len(results),), dtype=[('deltaf', float), ('quantize', [('int', int), ('frac', int)])]) for i,r in enumerate (results): arr[i] = (r[0]['deltaf'], tuple(r[0]['quantize_mf'])) from collections import defaultdict, namedtuple experiments = defaultdict(list) testcase = namedtuple ('testcase', ['quantize']) for e in arr: experiments[testcase(e['quantize'])].append (e) Now it seems that when e['quantize'] is used as a dictionary key, equal values are not compared as equal: In [36]: experiments Out[36]: defaultdict(, {testcase(quantize=(0, 0)): [(1.25, (0, 0))], testcase(quantize=(0, 0)): [(1.25, (0, 0))], testcase(quantize=(0, 0)): [(1.25, (0, 0))]}) See, there are 3 'testcases' inserted, all with keys quantize=(0,0). In [37]: e['quantize'] Out[37]: (0, 0) In [38]: type(e['quantize']) Out[38]: There's something weird here. If instead I do: for e in arr: experiments[testcase(tuple(e['quantize']))].append (e) that is, convert e['quantize'] to a tuple before using it as a key, I get the expected behavior: In [40]: experiments Out[40]: defaultdict(, {testcase(quantize=(0, 0)): [(1.25, (0, 0)), (1.25, (0, 0)), (1.25, (0, 0))]}) From shish at keba.be Tue Aug 30 09:09:50 2011 From: shish at keba.be (Olivier Delalleau) Date: Tue, 30 Aug 2011 09:09:50 -0400 Subject: [Numpy-discussion] wierd numpy.void behavior In-Reply-To: References: Message-ID: It looks like numpy.void does not properly implement __hash__: In [35]: arr[0]['quantize'] == arr[1]['quantize'] Out[35]: True In [34]: hash(arr[0]['quantize']) == hash(arr[1]['quantize']) Out[34]: False I'm not familiar enough with this kind of data type to tell you if you are using it as it should be used though. Maybe such data is not supposed to be hashed (but then shouldn'it it raise an exception?). -=- Olivier 2011/8/30 Neal Becker > I've encountered something weird about numpy.void. > > arr = np.empty ((len(results),), dtype=[('deltaf', float), > ('quantize', [('int', int), ('frac', > int)])]) > > for i,r in enumerate (results): > arr[i] = (r[0]['deltaf'], > tuple(r[0]['quantize_mf'])) > > > from collections import defaultdict, namedtuple > experiments = defaultdict(list) > > testcase = namedtuple ('testcase', ['quantize']) > > for e in arr: > experiments[testcase(e['quantize'])].append (e) > > Now it seems that when e['quantize'] is used as a dictionary key, equal > values > are not compared as equal: > > In [36]: experiments > Out[36]: defaultdict(, {testcase(quantize=(0, 0)): [(1.25, (0, > 0))], testcase(quantize=(0, 0)): [(1.25, (0, 0))], testcase(quantize=(0, > 0)): > [(1.25, (0, 0))]}) > > See, there are 3 'testcases' inserted, all with keys quantize=(0,0). > > In [37]: e['quantize'] > Out[37]: (0, 0) > > In [38]: type(e['quantize']) > Out[38]: > > There's something weird here. 
If instead I do: > > for e in arr: > experiments[testcase(tuple(e['quantize']))].append (e) > > that is, convert e['quantize'] to a tuple before using it as a key, I get > the > expected behavior: > > In [40]: experiments > Out[40]: defaultdict(, {testcase(quantize=(0, 0)): [(1.25, (0, > 0)), > (1.25, (0, 0)), (1.25, (0, 0))]}) > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From rdmoores at gmail.com Tue Aug 30 09:47:22 2011 From: rdmoores at gmail.com (Richard D. Moores) Date: Tue, 30 Aug 2011 06:47:22 -0700 Subject: [Numpy-discussion] numpy-1.6.1.win32-py3.2.exe (md5) won't install Message-ID: Python 3.2, 64-bit Win 7 When I try to install numpy-1.6.1.win32-py3.2.exe (md5) I get "Python version 3.2 required, which was not found in the registry". What to do? Thanks, Dick Moores From shish at keba.be Tue Aug 30 09:50:47 2011 From: shish at keba.be (Olivier Delalleau) Date: Tue, 30 Aug 2011 09:50:47 -0400 Subject: [Numpy-discussion] numpy-1.6.1.win32-py3.2.exe (md5) won't install In-Reply-To: References: Message-ID: win32 = 32 bit Python. That's probably the issue. -=- Olivier 2011/8/30 Richard D. Moores > Python 3.2, 64-bit Win 7 > > When I try to install numpy-1.6.1.win32-py3.2.exe (md5) I get "Python > version 3.2 required, which was not found in the registry". What to > do? > > Thanks, > > Dick Moores > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Tue Aug 30 09:53:54 2011 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 30 Aug 2011 07:53:54 -0600 Subject: [Numpy-discussion] numpy-1.6.1.win32-py3.2.exe (md5) won't install In-Reply-To: References: Message-ID: On Tue, Aug 30, 2011 at 7:47 AM, Richard D. Moores wrote: > Python 3.2, 64-bit Win 7 > > When I try to install numpy-1.6.1.win32-py3.2.exe (md5) I get "Python > version 3.2 required, which was not found in the registry". What to > do? > > Did you already install python from python.org ? Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From rdmoores at gmail.com Tue Aug 30 09:56:27 2011 From: rdmoores at gmail.com (Richard D. Moores) Date: Tue, 30 Aug 2011 06:56:27 -0700 Subject: [Numpy-discussion] numpy-1.6.1.win32-py3.2.exe (md5) won't install In-Reply-To: References: Message-ID: On Tue, Aug 30, 2011 at 06:53, Charles R Harris wrote: > > > On Tue, Aug 30, 2011 at 7:47 AM, Richard D. Moores > wrote: >> >> Python 3.2, 64-bit Win 7 >> >> When I try to install numpy-1.6.1.win32-py3.2.exe (md5) I get "Python >> version 3.2 required, which was not found in the registry". What to >> do? >> > > Did you already install python from python.org? Yes. Dick From bsouthey at gmail.com Tue Aug 30 10:19:54 2011 From: bsouthey at gmail.com (Bruce Southey) Date: Tue, 30 Aug 2011 09:19:54 -0500 Subject: [Numpy-discussion] numpy-1.6.1.win32-py3.2.exe (md5) won't install In-Reply-To: References: Message-ID: On Tue, Aug 30, 2011 at 8:56 AM, Richard D. Moores wrote: > On Tue, Aug 30, 2011 at 06:53, Charles R Harris > wrote: >> >> >> On Tue, Aug 30, 2011 at 7:47 AM, Richard D. 
Moores >> wrote: >>> >>> Python 3.2, 64-bit Win 7 >>> >>> When I try to install numpy-1.6.1.win32-py3.2.exe (md5) I get "Python >>> version 3.2 required, which was not found in the registry". What to >>> do? >>> >> >> Did you already install python from python.org? > > Yes. > > Dick > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > Where did you get that file from? The official file is called: numpy-1.6.1-win32-superpack-python3.2.exe (http://sourceforge.net/projects/numpy/files/NumPy/1.6.1/) Nor does it seem to be one of Christoph's as those have names like 'numpy-unoptimized-1.6.1.win32-py3.2.?exe' http://www.lfd.uci.edu/~gohlke/pythonlibs/ As Olivier indicated, this is for a 32-bit install of Python 3.2 and you do not have a 32-bit version of Python installed. I just confirmed that under my 64-bit Windows 7 system: Python 3.2 (r32:88445, Feb 20 2011, 21:29:02) [MSC v.1500 32 bit (Intel)] on win32 Type "copyright", "credits" or "license()" for more information. >>> import numpy >>> numpy.test() Running unit tests for numpy NumPy version 1.6.1 NumPy is installed in C:\Python32\lib\site-packages\numpy Python version 3.2 (r32:88445, Feb 20 2011, 21:29:02) [MSC v.1500 32 bit (Intel)] nose version 1.0.0 ..... Bruce From johann.cohentanugi at gmail.com Tue Aug 30 10:33:05 2011 From: johann.cohentanugi at gmail.com (Johann Cohen-Tanugi) Date: Tue, 30 Aug 2011 16:33:05 +0200 Subject: [Numpy-discussion] numpy oddity Message-ID: <4E5CF4A1.7090505@gmail.com> I have numpy version 1.6.1 and I see the following behavior : In [380]: X Out[380]: 1.0476157527896641 In [381]: X.__class__ Out[381]: numpy.float64 In [382]: (2,3)*X Out[382]: (2, 3) In [383]: (2,3)/X Out[383]: array([ 1.90909691, 2.86364537]) In [384]: X=float(X) In [385]: (2,3)/X --------------------------------------------------------------------------- TypeError Traceback (most recent call last) /home/cohen/ in () ----> 1 (2,3)/X TypeError: unsupported operand type(s) for /: 'tuple' and 'float' So it appears that X being a numpy float allows numpy to play some trick on the tuple so that division becomes possible, which regular built-in float does not allow arithmetics with tuples. But why is multiplication with "*" not following the same prescription? best, Johann From charlesr.harris at gmail.com Tue Aug 30 10:52:09 2011 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 30 Aug 2011 08:52:09 -0600 Subject: [Numpy-discussion] numpy oddity In-Reply-To: <4E5CF4A1.7090505@gmail.com> References: <4E5CF4A1.7090505@gmail.com> Message-ID: On Tue, Aug 30, 2011 at 8:33 AM, Johann Cohen-Tanugi < johann.cohentanugi at gmail.com> wrote: > I have numpy version 1.6.1 and I see the following behavior : > > In [380]: X > Out[380]: 1.0476157527896641 > > In [381]: X.__class__ > Out[381]: numpy.float64 > > In [382]: (2,3)*X > Out[382]: (2, 3) > > In [383]: (2,3)/X > Out[383]: array([ 1.90909691, 2.86364537]) > > In [384]: X=float(X) > > In [385]: (2,3)/X > --------------------------------------------------------------------------- > TypeError Traceback (most recent call last) > /home/cohen/ in () > ----> 1 (2,3)/X > > TypeError: unsupported operand type(s) for /: 'tuple' and 'float' > > > So it appears that X being a numpy float allows numpy to play some trick > on the tuple so that division becomes possible, which regular built-in > float does not allow arithmetics with tuples. 
> But why is multiplication with "*" not following the same prescription? > > That's strange. In [16]: x = float64(2.1) In [17]: (2,3)*x Out[17]: (2, 3, 2, 3) In [18]: (2,3)/x Out[18]: array([ 0.95238095, 1.42857143]) Note that in the first case x is treated like an integer. In the second the tuple is turned into an array. I think both of these cases should raise exceptions. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From shish at keba.be Tue Aug 30 11:01:53 2011 From: shish at keba.be (Olivier Delalleau) Date: Tue, 30 Aug 2011 11:01:53 -0400 Subject: [Numpy-discussion] numpy oddity In-Reply-To: References: <4E5CF4A1.7090505@gmail.com> Message-ID: 2011/8/30 Charles R Harris > > > On Tue, Aug 30, 2011 at 8:33 AM, Johann Cohen-Tanugi < > johann.cohentanugi at gmail.com> wrote: > >> I have numpy version 1.6.1 and I see the following behavior : >> >> In [380]: X >> Out[380]: 1.0476157527896641 >> >> In [381]: X.__class__ >> Out[381]: numpy.float64 >> >> In [382]: (2,3)*X >> Out[382]: (2, 3) >> >> In [383]: (2,3)/X >> Out[383]: array([ 1.90909691, 2.86364537]) >> >> In [384]: X=float(X) >> >> In [385]: (2,3)/X >> >> --------------------------------------------------------------------------- >> TypeError Traceback (most recent call >> last) >> /home/cohen/ in () >> ----> 1 (2,3)/X >> >> TypeError: unsupported operand type(s) for /: 'tuple' and 'float' >> >> >> So it appears that X being a numpy float allows numpy to play some trick >> on the tuple so that division becomes possible, which regular built-in >> float does not allow arithmetics with tuples. >> But why is multiplication with "*" not following the same prescription? >> >> > That's strange. > > In [16]: x = float64(2.1) > > In [17]: (2,3)*x > Out[17]: (2, 3, 2, 3) > > In [18]: (2,3)/x > Out[18]: array([ 0.95238095, 1.42857143]) > > Note that in the first case x is treated like an integer. In the second the > tuple is turned into an array. I think both of these cases should raise > exceptions. > > Chuck > > > The tuple does not know what to do with /, so Python asks the numpy float if it can do something when dividing a tuple, and numpy implements this (see http://docs.python.org/reference/datamodel.html?highlight=radd#object.__radd__for how reflected operands work). That part makes sense to me. The behavior with * doesn't though, it definitely seems wrong. -=- Olivier -------------- next part -------------- An HTML attachment was scrubbed... URL: From rdmoores at gmail.com Tue Aug 30 11:48:42 2011 From: rdmoores at gmail.com (Richard D. Moores) Date: Tue, 30 Aug 2011 08:48:42 -0700 Subject: [Numpy-discussion] numpy-1.6.1.win32-py3.2.exe (md5) won't install In-Reply-To: References: Message-ID: On Tue, Aug 30, 2011 at 07:19, Bruce Southey wrote: > On Tue, Aug 30, 2011 at 8:56 AM, Richard D. Moores wrote: >> On Tue, Aug 30, 2011 at 06:53, Charles R Harris >> wrote: >>> >>> >>> On Tue, Aug 30, 2011 at 7:47 AM, Richard D. Moores >>> wrote: >>>> >>>> Python 3.2, 64-bit Win 7 >>>> >>>> When I try to install numpy-1.6.1.win32-py3.2.exe (md5) I get "Python >>>> version 3.2 required, which was not found in the registry". What to >>>> do? >> > Where did you get that file from? from , I believe, but right now the numpy link on that page times out. 
> > The official file is called: > numpy-1.6.1-win32-superpack-python3.2.exe > (http://sourceforge.net/projects/numpy/files/NumPy/1.6.1/) > Nor does it seem to be one of Christoph's as those have names like > 'numpy-unoptimized-1.6.1.win32-py3.2.exe' > http://www.lfd.uci.edu/~gohlke/pythonlibs/ > > As Olivier indicated, this is for a 32-bit install of Python 3.2 and > you do not have a 32-bit version of Python installed. I just confirmed > that under my 64-bit Windows 7 system: > Python 3.2 (r32:88445, Feb 20 2011, 21:29:02) [MSC v.1500 32 bit > (Intel)] on win32 > Type "copyright", "credits" or "license()" for more information. >>>> import numpy >>>> numpy.test() > Running unit tests for numpy > NumPy version 1.6.1 > NumPy is installed in C:\Python32\lib\site-packages\numpy > Python version 3.2 (r32:88445, Feb 20 2011, 21:29:02) [MSC v.1500 32 > bit (Intel)] > nose version 1.0.0 So there is no 64-bit 3.x numpy? Is it possible to install 32-bit Python 3.2 on 64-bit Win 7 (you seem to have done so), so I could use numpy? Dick From shish at keba.be Tue Aug 30 11:51:21 2011 From: shish at keba.be (Olivier Delalleau) Date: Tue, 30 Aug 2011 11:51:21 -0400 Subject: [Numpy-discussion] numpy-1.6.1.win32-py3.2.exe (md5) won't install In-Reply-To: References: Message-ID: 2011/8/30 Richard D. Moores > Is it possible to install 32-bit > Python 3.2 on 64-bit Win 7 (you seem to have done so), so I could use > numpy? > > Yes you can install Python 32 bit on 64 bit Windows. -=- Olivier -------------- next part -------------- An HTML attachment was scrubbed...
URL: From Chris.Barker at noaa.gov Tue Aug 30 12:21:00 2011 From: Chris.Barker at noaa.gov (Chris.Barker) Date: Tue, 30 Aug 2011 09:21:00 -0700 Subject: [Numpy-discussion] load from text files Pull Request Review In-Reply-To: References: Message-ID: <4E5D0DEC.2070507@noaa.gov> On 8/27/11 11:08 AM, Christopher Jordan-Squire wrote: > I've submitted a pull request for a new method for loading data from > text files into a record array/masked record array. > Click on the link for more info, but the general idea is to create a > regular expression for what entries should look like and loop over the > file, updating the regular expression if it's wrong. Once the types > are determined the file is loaded line by line into a pre-allocated > numpy array. nice stuff. Have you looked at my "accumulator" class, rather than pre-allocating? Less the class itself than that ideas behind it. It's easy enough to do, and would keep you from having to run through the file twice. The cost of memory re-allocation as the array grows is very small. I've posted the code recently, but let me know if you want it again. -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From rdmoores at gmail.com Tue Aug 30 12:27:54 2011 From: rdmoores at gmail.com (Richard D. Moores) Date: Tue, 30 Aug 2011 09:27:54 -0700 Subject: [Numpy-discussion] numpy-1.6.1.win32-py3.2.exe (md5) won't install In-Reply-To: References: Message-ID: On Tue, Aug 30, 2011 at 09:09, Charles R Harris wrote: > > > On Tue, Aug 30, 2011 at 10:01 AM, Richard D. Moores > wrote: >> >> On Tue, Aug 30, 2011 at 08:51, Olivier Delalleau wrote: >> > >> > 2011/8/30 Richard D. Moores >> >> >> >> Is it possible to install 32-bit >> >> Python 3.2 on 64-bit Win 7 (you seem to have done so), so I could use >> >> numpy? >> >> >> > >> > Yes you can insteall Python 32 bit on 64 bit Windows. >> >> Thanks. Would doing so leave my 64-bit Python 3.2 intact, so I could >> switch to the 32-bit only to install and use numpy? >> > > You might want to try the win64 packages here. > > Chuck Thanks Chuck! I downloaded numpy-unoptimized-1.6.1.win-amd64-py3.2.exe. numpy is now installed for 64-bit Python 3.21 But what are the implications of "unoptimized"? Python 3.2.1 (default, Jul 10 2011, 20:02:51) [MSC v.1500 64 bit (AMD64)] Type "help", "copyright", "credits" or "license" for more information. >>> import numpy; help(numpy) Help on package numpy: I copy and pasted this to an RTF file dedicated to the numpy help. It has 86,363 lines! Wow! Dick From charlesr.harris at gmail.com Tue Aug 30 12:43:51 2011 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 30 Aug 2011 10:43:51 -0600 Subject: [Numpy-discussion] numpy-1.6.1.win32-py3.2.exe (md5) won't install In-Reply-To: References: Message-ID: On Tue, Aug 30, 2011 at 10:27 AM, Richard D. Moores wrote: > On Tue, Aug 30, 2011 at 09:09, Charles R Harris > wrote: > > > > > > On Tue, Aug 30, 2011 at 10:01 AM, Richard D. Moores > > wrote: > >> > >> On Tue, Aug 30, 2011 at 08:51, Olivier Delalleau wrote: > >> > > >> > 2011/8/30 Richard D. Moores > >> >> > >> >> Is it possible to install 32-bit > >> >> Python 3.2 on 64-bit Win 7 (you seem to have done so), so I could use > >> >> numpy? > >> >> > >> > > >> > Yes you can insteall Python 32 bit on 64 bit Windows. > >> > >> Thanks. 
Would doing so leave my 64-bit Python 3.2 intact, so I could > >> switch to the 32-bit only to install and use numpy? > >> > > > > You might want to try the win64 packages here. > > > > Chuck > > Thanks Chuck! I downloaded > numpy-unoptimized-1.6.1.win-amd64-py3.2.exe. numpy is now installed > for 64-bit Python 3.21 > > But what are the implications of "unoptimized"? > > Array operations will be slower. The optimized versions will be faster because they are linked to the highly optimized and tuned Intel MKL library rather than the fallback code included in numpy. If you have a lot of big arrays the speed difference will be significant. For small arrays call overhead tends to dominate and there isn't that much difference. You might want to download ipython and matplotlib also so that you have the basic numpy stack. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From rdmoores at gmail.com Tue Aug 30 13:02:17 2011 From: rdmoores at gmail.com (Richard D. Moores) Date: Tue, 30 Aug 2011 10:02:17 -0700 Subject: [Numpy-discussion] numpy-1.6.1.win32-py3.2.exe (md5) won't install In-Reply-To: References: Message-ID: On Tue, Aug 30, 2011 at 09:43, Charles R Harris wrote: > You might want to download ipython and matplotlib also so that you have the > basic numpy stack. Good idea. I got matplotlib, but ipython for Python 3x isn't on http://www.lfd.uci.edu/~gohlke/pythonlibs/ . Dick From charlesr.harris at gmail.com Tue Aug 30 13:22:03 2011 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 30 Aug 2011 11:22:03 -0600 Subject: [Numpy-discussion] numpy-1.6.1.win32-py3.2.exe (md5) won't install In-Reply-To: References: Message-ID: On Tue, Aug 30, 2011 at 11:02 AM, Richard D. Moores wrote: > On Tue, Aug 30, 2011 at 09:43, Charles R Harris > wrote: > > > You might want to download ipython and matplotlib also so that you have > the > > basic numpy stack. > > Good idea. I got matplotlib, but ipython for Python 3x isn't on > http://www.lfd.uci.edu/~gohlke/pythonlibs/ . > Looks like python 3 support is still experimental: http://wiki.ipython.org/Python_3. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From rdmoores at gmail.com Tue Aug 30 14:00:29 2011 From: rdmoores at gmail.com (Richard D. Moores) Date: Tue, 30 Aug 2011 11:00:29 -0700 Subject: [Numpy-discussion] numpy-1.6.1.win32-py3.2.exe (md5) won't install In-Reply-To: References: Message-ID: On Tue, Aug 30, 2011 at 10:22, Charles R Harris wrote: > > > On Tue, Aug 30, 2011 at 11:02 AM, Richard D. Moores > wrote: >> >> On Tue, Aug 30, 2011 at 09:43, Charles R Harris >> wrote: >> >> > You might want to download ipython and matplotlib also so that you have >> > the >> > basic numpy stack. >> >> Good idea. I got matplotlib, but ipython for Python 3x isn't on >> http://www.lfd.uci.edu/~gohlke/pythonlibs/ . > > Looks like python 3 support is still experimental: > http://wiki.ipython.org/Python_3. > > Chuck Yes. Thanks again, Chuck. 
Dick From robert.kern at gmail.com Tue Aug 30 14:17:35 2011 From: robert.kern at gmail.com (Robert Kern) Date: Tue, 30 Aug 2011 13:17:35 -0500 Subject: [Numpy-discussion] numpy oddity In-Reply-To: References: <4E5CF4A1.7090505@gmail.com> Message-ID: On Tue, Aug 30, 2011 at 09:52, Charles R Harris wrote: > > On Tue, Aug 30, 2011 at 8:33 AM, Johann Cohen-Tanugi > wrote: >> >> I have numpy version 1.6.1 and I see the following behavior : >> >> In [380]: X >> Out[380]: 1.0476157527896641 >> >> In [381]: X.__class__ >> Out[381]: numpy.float64 >> >> In [382]: (2,3)*X >> Out[382]: (2, 3) >> >> In [383]: (2,3)/X >> Out[383]: array([ 1.90909691, ?2.86364537]) >> >> In [384]: X=float(X) >> >> In [385]: (2,3)/X >> >> --------------------------------------------------------------------------- >> TypeError ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? Traceback (most recent call >> last) >> /home/cohen/ in () >> ----> 1 (2,3)/X >> >> TypeError: unsupported operand type(s) for /: 'tuple' and 'float' >> >> >> So it appears that X being a numpy float allows numpy to play some trick >> on the tuple so that division becomes possible, which regular built-in >> float does not allow arithmetics with tuples. >> But why is multiplication with "*" not following the same prescription? >> > > That's strange. > > In [16]: x = float64(2.1) > > In [17]: (2,3)*x > Out[17]: (2, 3, 2, 3) > > In [18]: (2,3)/x > Out[18]: array([ 0.95238095,? 1.42857143]) > > Note that in the first case x is treated like an integer. In the second the > tuple is turned into an array. I think both of these cases should raise > exceptions. In scalartypes.c.src: tatic PyObject * gentype_multiply(PyObject *m1, PyObject *m2) { PyObject *ret = NULL; long repeat; if (!PyArray_IsScalar(m1, Generic) && ((Py_TYPE(m1)->tp_as_number == NULL) || (Py_TYPE(m1)->tp_as_number->nb_multiply == NULL))) { /* Try to convert m2 to an int and try sequence repeat */ repeat = PyInt_AsLong(m2); if (repeat == -1 && PyErr_Occurred()) { return NULL; } ret = PySequence_Repeat(m1, (int) repeat); } else if (!PyArray_IsScalar(m2, Generic) && ((Py_TYPE(m2)->tp_as_number == NULL) || (Py_TYPE(m2)->tp_as_number->nb_multiply == NULL))) { /* Try to convert m1 to an int and try sequence repeat */ repeat = PyInt_AsLong(m1); if (repeat == -1 && PyErr_Occurred()) { return NULL; } ret = PySequence_Repeat(m2, (int) repeat); } if (ret == NULL) { PyErr_Clear(); /* no effect if not set */ ret = PyArray_Type.tp_as_number->nb_multiply(m1, m2); } return ret; } The PyInt_AsLong() calls should be changed to check for __index__ability, instead. Not sure about the other operators. Some people *may* be relying on the coerce-sequences-to-ndarray behavior with numpy scalars just like they do so with ndarrays. On the other hand, the repeat behavior with * should have thrown a monkey wrench to them if they were, so the number of people who do this is probably small. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." ? -- Umberto Eco From johann.cohentanugi at gmail.com Tue Aug 30 14:58:04 2011 From: johann.cohentanugi at gmail.com (Johann Cohen-Tanugi) Date: Tue, 30 Aug 2011 20:58:04 +0200 Subject: [Numpy-discussion] numpy oddity In-Reply-To: References: <4E5CF4A1.7090505@gmail.com> Message-ID: <4E5D32BC.4040307@gmail.com> I am not sure I follow : is the problem the coerce-sequences-to-ndarrays behavior, or is it the fact that it applies to division and not multiplication? 
I thought the second situation is the more problematic. Anyway, you seem to take it as a bug, should I file a ticket somewhere? thanks, johann On 08/30/2011 08:17 PM, Robert Kern wrote: > On Tue, Aug 30, 2011 at 09:52, Charles R Harris > wrote: >> On Tue, Aug 30, 2011 at 8:33 AM, Johann Cohen-Tanugi >> wrote: >>> I have numpy version 1.6.1 and I see the following behavior : >>> >>> In [380]: X >>> Out[380]: 1.0476157527896641 >>> >>> In [381]: X.__class__ >>> Out[381]: numpy.float64 >>> >>> In [382]: (2,3)*X >>> Out[382]: (2, 3) >>> >>> In [383]: (2,3)/X >>> Out[383]: array([ 1.90909691, 2.86364537]) >>> >>> In [384]: X=float(X) >>> >>> In [385]: (2,3)/X >>> >>> --------------------------------------------------------------------------- >>> TypeError Traceback (most recent call >>> last) >>> /home/cohen/ in() >>> ----> 1 (2,3)/X >>> >>> TypeError: unsupported operand type(s) for /: 'tuple' and 'float' >>> >>> >>> So it appears that X being a numpy float allows numpy to play some trick >>> on the tuple so that division becomes possible, which regular built-in >>> float does not allow arithmetics with tuples. >>> But why is multiplication with "*" not following the same prescription? >>> >> That's strange. >> >> In [16]: x = float64(2.1) >> >> In [17]: (2,3)*x >> Out[17]: (2, 3, 2, 3) >> >> In [18]: (2,3)/x >> Out[18]: array([ 0.95238095, 1.42857143]) >> >> Note that in the first case x is treated like an integer. In the second the >> tuple is turned into an array. I think both of these cases should raise >> exceptions. > In scalartypes.c.src: > > tatic PyObject * > gentype_multiply(PyObject *m1, PyObject *m2) > { > PyObject *ret = NULL; > long repeat; > > if (!PyArray_IsScalar(m1, Generic)&& > ((Py_TYPE(m1)->tp_as_number == NULL) || > (Py_TYPE(m1)->tp_as_number->nb_multiply == NULL))) { > /* Try to convert m2 to an int and try sequence repeat */ > repeat = PyInt_AsLong(m2); > if (repeat == -1&& PyErr_Occurred()) { > return NULL; > } > ret = PySequence_Repeat(m1, (int) repeat); > } > else if (!PyArray_IsScalar(m2, Generic)&& > ((Py_TYPE(m2)->tp_as_number == NULL) || > (Py_TYPE(m2)->tp_as_number->nb_multiply == NULL))) { > /* Try to convert m1 to an int and try sequence repeat */ > repeat = PyInt_AsLong(m1); > if (repeat == -1&& PyErr_Occurred()) { > return NULL; > } > ret = PySequence_Repeat(m2, (int) repeat); > } > if (ret == NULL) { > PyErr_Clear(); /* no effect if not set */ > ret = PyArray_Type.tp_as_number->nb_multiply(m1, m2); > } > return ret; > } > > The PyInt_AsLong() calls should be changed to check for > __index__ability, instead. Not sure about the other operators. Some > people *may* be relying on the coerce-sequences-to-ndarray behavior > with numpy scalars just like they do so with ndarrays. On the other > hand, the repeat behavior with * should have thrown a monkey wrench to > them if they were, so the number of people who do this is probably > small. > From robert.kern at gmail.com Tue Aug 30 15:06:34 2011 From: robert.kern at gmail.com (Robert Kern) Date: Tue, 30 Aug 2011 14:06:34 -0500 Subject: [Numpy-discussion] numpy oddity In-Reply-To: <4E5D32BC.4040307@gmail.com> References: <4E5CF4A1.7090505@gmail.com> <4E5D32BC.4040307@gmail.com> Message-ID: On Tue, Aug 30, 2011 at 13:58, Johann Cohen-Tanugi wrote: > I am not sure I follow : is the problem the coerce-sequences-to-ndarrays > behavior, or is it the fact that it applies to division and not > multiplication? > I thought the second situation is the more problematic. 
> Anyway, you seem to take it as a bug, should I file a ticket somewhere? * is the odd one out. /+- all behave the same: they coerce the sequence to an ndarray and broadcast the operation. Whether this is desirable is debatable, but there is at least a logic to it. Charles would rather have it raise an exception. (sequence * np.integer) is an interesting case. It should probably have the "repeat" semantics. However, this makes it an exception to the coerce-to-ndarray-and-broadcast rule with the other operations. This gives weight to Charles' preference to make the other operations raise an exception. What is an unambiguous bug is the behavior of * with a *float* scalar. It should never have the "repeat" semantics, no matter what. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." ? -- Umberto Eco From johann.cohentanugi at gmail.com Tue Aug 30 15:10:10 2011 From: johann.cohentanugi at gmail.com (Johann Cohen-Tanugi) Date: Tue, 30 Aug 2011 21:10:10 +0200 Subject: [Numpy-discussion] numpy oddity In-Reply-To: References: <4E5CF4A1.7090505@gmail.com> <4E5D32BC.4040307@gmail.com> Message-ID: <4E5D3592.7090009@gmail.com> ok thanks a lot. Safe code is often better than over-smart code, so I would line up with Charles here. There is too much potential for ambiguity in expected behavior. Johann On 08/30/2011 09:06 PM, Robert Kern wrote: > On Tue, Aug 30, 2011 at 13:58, Johann Cohen-Tanugi > wrote: >> I am not sure I follow : is the problem the coerce-sequences-to-ndarrays >> behavior, or is it the fact that it applies to division and not >> multiplication? >> I thought the second situation is the more problematic. >> Anyway, you seem to take it as a bug, should I file a ticket somewhere? > * is the odd one out. /+- all behave the same: they coerce the > sequence to an ndarray and broadcast the operation. Whether this is > desirable is debatable, but there is at least a logic to it. Charles > would rather have it raise an exception. > > (sequence * np.integer) is an interesting case. It should probably > have the "repeat" semantics. However, this makes it an exception to > the coerce-to-ndarray-and-broadcast rule with the other operations. > This gives weight to Charles' preference to make the other operations > raise an exception. > > What is an unambiguous bug is the behavior of * with a *float* scalar. > It should never have the "repeat" semantics, no matter what. > From bryce.ready at gmail.com Tue Aug 30 16:34:18 2011 From: bryce.ready at gmail.com (Bryce Ready) Date: Tue, 30 Aug 2011 14:34:18 -0600 Subject: [Numpy-discussion] converting standard array to record array Message-ID: Hello all, So i'm using numpy 1.6.0, and trying to convert a (4,4) numpy array of dtype 'f8' into a record array of this dtype: dt = np.dtype([('mat','(4,4)f8')]) > Here is the code snippet: In [21]: a = np.random.randn(4,4) > > In [22]: a.view(dt) > and the resulting error: ValueError: new type not compatible with array. > Can anyone shed some light for me on why this conversion is not possible? It is certainly technically possible, since the memory layout of the two arrays should be the same. Can anyone recommend a better way to do this conversion? Thanks in advance! -Bryce Ready -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From mjanikas at esri.com Tue Aug 30 18:48:18 2011 From: mjanikas at esri.com (Mark Janikas) Date: Tue, 30 Aug 2011 15:48:18 -0700 Subject: [Numpy-discussion] Question on LinAlg Inverse Algorithm Message-ID: Hello All, Last week I posted a question involving the identification of linear dependent columns of a matrix... but now I am finding an interesting result based on the linalg.inv() function... sometime I am able to invert a matrix that has linear dependent columns and other times I get the LinAlgError()... this suggests that there is some kind of random component to the INV method. Is this normal? Thanks much ahead of time, MJ Mark Janikas Product Developer ESRI, Geoprocessing 380 New York St. Redlands, CA 92373 909-793-2853 (2563) mjanikas at esri.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Tue Aug 30 18:54:43 2011 From: robert.kern at gmail.com (Robert Kern) Date: Tue, 30 Aug 2011 17:54:43 -0500 Subject: [Numpy-discussion] Question on LinAlg Inverse Algorithm In-Reply-To: References: Message-ID: On Tue, Aug 30, 2011 at 17:48, Mark Janikas wrote: > Hello All, > > Last week I posted a question involving the identification of linear > dependent columns of a matrix? but now I am finding an interesting result > based on the linalg.inv() function? sometime I am able to invert a matrix > that has linear dependent columns and other times I get the LinAlgError()? > this suggests that there is some kind of random component to the INV > method.? Is this normal?? Thanks much ahead of time, With exactly the same input in the same process? Can you provide that input? -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." ? -- Umberto Eco From cjordan1 at uw.edu Tue Aug 30 18:56:55 2011 From: cjordan1 at uw.edu (Christopher Jordan-Squire) Date: Tue, 30 Aug 2011 17:56:55 -0500 Subject: [Numpy-discussion] Question on LinAlg Inverse Algorithm In-Reply-To: References: Message-ID: Can you give an example matrix? I'm not a numerical linear algebra expert, but I suspect that if your matrix is singular (or nearly so, in floating point) then any inverse given will look pretty wonky. Huge determinant, eigenvalues, operator norm, etc.. -Chris JS On Tue, Aug 30, 2011 at 5:48 PM, Mark Janikas wrote: > Hello All, > > > > Last week I posted a question involving the identification of linear > dependent columns of a matrix? but now I am finding an interesting result > based on the linalg.inv() function? sometime I am able to invert a matrix > that has linear dependent columns and other times I get the LinAlgError()? > this suggests that there is some kind of random component to the INV > method.? Is this normal?? Thanks much ahead of time, > > > > MJ > > > > Mark Janikas > > Product Developer > > ESRI, Geoprocessing > > 380 New York St. > > Redlands, CA 92373 > > 909-793-2853 (2563) > > mjanikas at esri.com > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > From mjanikas at esri.com Tue Aug 30 19:01:55 2011 From: mjanikas at esri.com (Mark Janikas) Date: Tue, 30 Aug 2011 16:01:55 -0700 Subject: [Numpy-discussion] Question on LinAlg Inverse Algorithm In-Reply-To: References: Message-ID: Working on it... Give me a few minutes to get you the data. TY! 
MJ -----Original Message----- From: numpy-discussion-bounces at scipy.org [mailto:numpy-discussion-bounces at scipy.org] On Behalf Of Christopher Jordan-Squire Sent: Tuesday, August 30, 2011 3:57 PM To: Discussion of Numerical Python Subject: Re: [Numpy-discussion] Question on LinAlg Inverse Algorithm Can you give an example matrix? I'm not a numerical linear algebra expert, but I suspect that if your matrix is singular (or nearly so, in floating point) then any inverse given will look pretty wonky. Huge determinant, eigenvalues, operator norm, etc.. -Chris JS On Tue, Aug 30, 2011 at 5:48 PM, Mark Janikas wrote: > Hello All, > > > > Last week I posted a question involving the identification of linear > dependent columns of a matrix. but now I am finding an interesting result > based on the linalg.inv() function. sometime I am able to invert a matrix > that has linear dependent columns and other times I get the LinAlgError(). > this suggests that there is some kind of random component to the INV > method.? Is this normal?? Thanks much ahead of time, > > > > MJ > > > > Mark Janikas > > Product Developer > > ESRI, Geoprocessing > > 380 New York St. > > Redlands, CA 92373 > > 909-793-2853 (2563) > > mjanikas at esri.com > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > _______________________________________________ NumPy-Discussion mailing list NumPy-Discussion at scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion From mjanikas at esri.com Tue Aug 30 19:34:10 2011 From: mjanikas at esri.com (Mark Janikas) Date: Tue, 30 Aug 2011 16:34:10 -0700 Subject: [Numpy-discussion] Question on LinAlg Inverse Algorithm In-Reply-To: References: Message-ID: When I export to ascii I am losing precision and it getting consistency... I will try a flat dump. More to come. TY MJ -----Original Message----- From: numpy-discussion-bounces at scipy.org [mailto:numpy-discussion-bounces at scipy.org] On Behalf Of Mark Janikas Sent: Tuesday, August 30, 2011 4:02 PM To: Discussion of Numerical Python Subject: Re: [Numpy-discussion] Question on LinAlg Inverse Algorithm Working on it... Give me a few minutes to get you the data. TY! MJ -----Original Message----- From: numpy-discussion-bounces at scipy.org [mailto:numpy-discussion-bounces at scipy.org] On Behalf Of Christopher Jordan-Squire Sent: Tuesday, August 30, 2011 3:57 PM To: Discussion of Numerical Python Subject: Re: [Numpy-discussion] Question on LinAlg Inverse Algorithm Can you give an example matrix? I'm not a numerical linear algebra expert, but I suspect that if your matrix is singular (or nearly so, in floating point) then any inverse given will look pretty wonky. Huge determinant, eigenvalues, operator norm, etc.. -Chris JS On Tue, Aug 30, 2011 at 5:48 PM, Mark Janikas wrote: > Hello All, > > > > Last week I posted a question involving the identification of linear > dependent columns of a matrix. but now I am finding an interesting result > based on the linalg.inv() function. sometime I am able to invert a matrix > that has linear dependent columns and other times I get the LinAlgError(). > this suggests that there is some kind of random component to the INV > method.? Is this normal?? Thanks much ahead of time, > > > > MJ > > > > Mark Janikas > > Product Developer > > ESRI, Geoprocessing > > 380 New York St. 
> > Redlands, CA 92373 > > 909-793-2853 (2563) > > mjanikas at esri.com > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > _______________________________________________ NumPy-Discussion mailing list NumPy-Discussion at scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion _______________________________________________ NumPy-Discussion mailing list NumPy-Discussion at scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion From robert.kern at gmail.com Tue Aug 30 19:37:22 2011 From: robert.kern at gmail.com (Robert Kern) Date: Tue, 30 Aug 2011 18:37:22 -0500 Subject: [Numpy-discussion] Question on LinAlg Inverse Algorithm In-Reply-To: References: Message-ID: On Tue, Aug 30, 2011 at 18:34, Mark Janikas wrote: > When I export to ascii I am losing precision and it getting consistency... I will try a flat dump. ?More to come. ?TY Might as well np.save() it to an .npy binary file and attach it. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." ? -- Umberto Eco From robert.kern at gmail.com Tue Aug 30 19:42:26 2011 From: robert.kern at gmail.com (Robert Kern) Date: Tue, 30 Aug 2011 18:42:26 -0500 Subject: [Numpy-discussion] Question on LinAlg Inverse Algorithm In-Reply-To: References: Message-ID: On Tue, Aug 30, 2011 at 17:48, Mark Janikas wrote: > Hello All, > > Last week I posted a question involving the identification of linear > dependent columns of a matrix? but now I am finding an interesting result > based on the linalg.inv() function? sometime I am able to invert a matrix > that has linear dependent columns and other times I get the LinAlgError()? > this suggests that there is some kind of random component to the INV > method.? Is this normal?? Thanks much ahead of time, We will also need to know the platform that you are on as well as the LAPACK library that you linked numpy against. It is the behavior of that LAPACK library that is controlling here. Standard LAPACK does sometimes use pseudorandom numbers in certain situations, but AFAICT it deterministically seeds the PRNG on every call, and I don't think it does this for any subroutine involved with inversion. But if you use an optimized LAPACK from some vendor, I don't know what they may be doing. Some optimized LAPACK/BLAS libraries may be threaded and may dynamically determine how to break up the problem based on load (I don't know of any that specifically do this, but it's a possibility). -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." ? -- Umberto Eco From mjanikas at esri.com Tue Aug 30 20:38:59 2011 From: mjanikas at esri.com (Mark Janikas) Date: Tue, 30 Aug 2011 17:38:59 -0700 Subject: [Numpy-discussion] Question on LinAlg Inverse Algorithm In-Reply-To: References: Message-ID: OK... so I have been using checksums to compare and it looks like I am getting a different value when it fails as opposed to when it passes... I.e. the input is NOT the same. When I save them to npy files and run LA.inv() I get consistent results. Now I have to track down in my code why the inputs are different.... Sucks, because I keep having to dive deeper (more checksums... yeh!). 
But it is all linear algebra from the same input, so kinda weird that there is a diversion. Thanks for all of your help! And Ill post again when I find the culprit. (probably me :-)) MJ -----Original Message----- From: numpy-discussion-bounces at scipy.org [mailto:numpy-discussion-bounces at scipy.org] On Behalf Of Robert Kern Sent: Tuesday, August 30, 2011 4:42 PM To: Discussion of Numerical Python Subject: Re: [Numpy-discussion] Question on LinAlg Inverse Algorithm On Tue, Aug 30, 2011 at 17:48, Mark Janikas wrote: > Hello All, > > Last week I posted a question involving the identification of linear > dependent columns of a matrix? but now I am finding an interesting result > based on the linalg.inv() function? sometime I am able to invert a matrix > that has linear dependent columns and other times I get the LinAlgError()? > this suggests that there is some kind of random component to the INV > method.? Is this normal?? Thanks much ahead of time, We will also need to know the platform that you are on as well as the LAPACK library that you linked numpy against. It is the behavior of that LAPACK library that is controlling here. Standard LAPACK does sometimes use pseudorandom numbers in certain situations, but AFAICT it deterministically seeds the PRNG on every call, and I don't think it does this for any subroutine involved with inversion. But if you use an optimized LAPACK from some vendor, I don't know what they may be doing. Some optimized LAPACK/BLAS libraries may be threaded and may dynamically determine how to break up the problem based on load (I don't know of any that specifically do this, but it's a possibility). -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." ? -- Umberto Eco _______________________________________________ NumPy-Discussion mailing list NumPy-Discussion at scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion From thomas.robitaille at gmail.com Tue Aug 30 23:34:52 2011 From: thomas.robitaille at gmail.com (Thomas Robitaille) Date: Tue, 30 Aug 2011 23:34:52 -0400 Subject: [Numpy-discussion] Issue with dtype and nx1 arrays Message-ID: Hello, Is the following behavior normal? In [1]: import numpy as np In [2]: np.dtype([('a',' Hi, this is probably my lack of understanding...when i set up some masks for 2 arrays and try to divide one by the other I get a runtime warning. Seemingly this is when I am asking python to divide one nan by the other, however I thought by masking the array numpy would then know to ignore these nans? For example import numpy as np a = np.array([4.5, 6.7, 8.0, 9.0, 0.00001]) b = np.array([0.0001, 6.7, 8.0, 9.0, 0.00001]) a = np.ma.where(np.logical_or(a<0.01, b<0.01), np.nan, a) b = np.ma.where(np.logical_or(a<0.01, b<0.01), np.nan, b) a/b will produce ?./numpy/ma/core.py:772: RuntimeWarning: invalid value encountered in absolute return umath.absolute(a) * self.tolerance >= umath.absolute(b) but of course give the correct result masked_array(data = [-- 1.0 1.0 1.0 --], mask = [ True False False False True], fill_value = 1e+20) But what is the correct way to do this array division such that I don't produce the warning? The only way I can see that you can do it is a bit convoluted and involves empty the array of the masked values, e.g. 
a = a[np.isnan(a) == False] b = b[np.isnan(b) == False] a/b thanks, Martin -- View this message in context: http://old.nabble.com/nan-division-warnings-tp32369310p32369310.html Sent from the Numpy-discussion mailing list archive at Nabble.com. From warren.weckesser at enthought.com Tue Aug 30 23:42:27 2011 From: warren.weckesser at enthought.com (Warren Weckesser) Date: Tue, 30 Aug 2011 22:42:27 -0500 Subject: [Numpy-discussion] Issue with dtype and nx1 arrays In-Reply-To: References: Message-ID: On Tue, Aug 30, 2011 at 10:34 PM, Thomas Robitaille < thomas.robitaille at gmail.com> wrote: > Hello, > > Is the following behavior normal? > > In [1]: import numpy as np > > In [2]: np.dtype([('a',' Out[2]: dtype([('a', ' > In [3]: np.dtype([('a',' Out[3]: dtype([('a', ' > I.e. in the second case, the second dimension of the dtype (1) is > being ignored? Is there a way to avoid this? > Use a tuple to specify the dimension: In [11]: dtype([('a', ' > Thanks, > Thomas > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Tue Aug 30 23:47:58 2011 From: robert.kern at gmail.com (Robert Kern) Date: Tue, 30 Aug 2011 22:47:58 -0500 Subject: [Numpy-discussion] nan division warnings In-Reply-To: <32369310.post@talk.nabble.com> References: <32369310.post@talk.nabble.com> Message-ID: On Tue, Aug 30, 2011 at 22:39, mdekauwe wrote: > > Hi, > > this is probably my lack of understanding...when i set up some masks for 2 > arrays and try to divide one by the other I get a runtime warning. Seemingly > this is when I am asking python to divide one nan by the other, however I > thought by masking the array numpy would then know to ignore these nans? For > example > > import numpy as np > a = np.array([4.5, 6.7, 8.0, 9.0, 0.00001]) > b = np.array([0.0001, 6.7, 8.0, 9.0, 0.00001]) > a = np.ma.where(np.logical_or(a<0.01, b<0.01), np.nan, a) > b = np.ma.where(np.logical_or(a<0.01, b<0.01), np.nan, b) > a/b > > will produce > > ?./numpy/ma/core.py:772: RuntimeWarning: invalid value encountered in > absolute > return umath.absolute(a) * self.tolerance >= umath.absolute(b) > > but of course give the correct result > > masked_array(data = [-- 1.0 1.0 1.0 --], > ? ? ? ? ? ? mask = [ True False False False ?True], > ? ? ? fill_value = 1e+20) > > But what is the correct way to do this array division such that I don't > produce the warning? Just don't put NaNs in. [~] |10> a = np.array([4.5, 6.7, 8.0, 9.0, 0.00001]) [~] |11> b = np.array([0.0001, 6.7, 8.0, 9.0, 0.00001]) [~] |12> mask = (a < 0.01) | (b < 0.01) [~] |13> ma = np.ma.masked_array(a, mask=mask) [~] |14> mb = np.ma.masked_array(b, mask=mask) [~] |15> ma / mb masked_array(data = [-- 1.0 1.0 1.0 --], mask = [ True False False False True], fill_value = 1e+20) -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." ? 
-- Umberto Eco From josef.pktd at gmail.com Tue Aug 30 23:49:54 2011 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Tue, 30 Aug 2011 23:49:54 -0400 Subject: [Numpy-discussion] converting standard array to record array In-Reply-To: References: Message-ID: On Tue, Aug 30, 2011 at 4:34 PM, Bryce Ready wrote: > Hello all, > > So i'm using numpy 1.6.0, and trying to convert a (4,4) numpy array of dtype > 'f8' into a record array of this dtype: > >> dt = np.dtype([('mat','(4,4)f8')]) > > Here is the code snippet: > >> In [21]: a = np.random.randn(4,4) >> >> In [22]: a.view(dt) > > and the resulting error: > >> ValueError: new type not compatible with array. > > Can anyone shed some light for me on why this conversion is not possible? > It is certainly technically possible, since the memory layout of the two > arrays should be the same. > > Can anyone recommend a better way to do this conversion? I guess it can only convert rows, each row needs the memory size of the dt >>> np.random.randn(4,4).ravel().view(dt).shape (1,) >>> np.random.randn(2,4,4).reshape(-1,16).view(dt) array([[ ([[1.7107996212005496, 0.64334162481360346, -2.1589367225479004, 1.2302260107072134], [0.90703092017458831, -1.0297890301610224, -0.095086304368665275, 0.35407366904038495], [-1.1083969421298907, 0.83307347286837752, 0.39886399402076494, 0.26313136034262563], [0.81860729029038914, -1.1443047382313905, 0.73326737255810859, 0.34482475392499168]],)], [ ([[0.69027418489768777, 0.25867753263599164, 1.0320990807184023, 0.21836691513066409], [0.45913017094388614, -0.89570247025515981, 0.76452726059163534, -2.2953009964941642], [0.60248580944596275, 1.0863090037733505, -0.10849220482850662, -0.19176089514256078], [-1.0700600508627109, -1.4743316703511105, 0.79193567523155062, 0.82243321942810521]],)]], dtype=[('mat', ' > Thanks in advance! > > -Bryce Ready > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > From mdekauwe at gmail.com Wed Aug 31 01:00:30 2011 From: mdekauwe at gmail.com (mdekauwe) Date: Tue, 30 Aug 2011 22:00:30 -0700 (PDT) Subject: [Numpy-discussion] nan division warnings In-Reply-To: References: <32369310.post@talk.nabble.com> Message-ID: <32369517.post@talk.nabble.com> Perfect that works how I envisaged, I am an idiot, I clearly overcomplicated my solution. thanks. -- View this message in context: http://old.nabble.com/nan-division-warnings-tp32369310p32369517.html Sent from the Numpy-discussion mailing list archive at Nabble.com. From nadavh at visionsense.com Wed Aug 31 01:37:19 2011 From: nadavh at visionsense.com (Nadav Horesh) Date: Tue, 30 Aug 2011 22:37:19 -0700 Subject: [Numpy-discussion] Wrong treatment of byte-order. Message-ID: <26FC23E7C398A64083C980D16001012D261844C0BC@VA3DIAXVS361.RED001.local> Hi, This is my second post on this problem I found in numpy 1.6.1, and recently it cam up in the latest git version (2.0.0.dev-f3e70d9). The problem is numpy treats the native byte order ('<') as illegal while the wrong one ('>') as the right one. The output of the attached script (bult for python 2.6 + ) is given below (my system is a 64 bit linux on core i7. 
64 bit python 2.7.2/3.2 , numpy uses ATLAS): $ python test_byte_order.py a = [[ 0.28596132 0.31658824 0.34929676] [ 0.48739246 0.68020533 0.39616588] [ 0.29310406 0.9584545 0.8120068 ]] a1 = [[ 0.28596132 0.31658824 0.34929676] [ 0.48739246 0.68020533 0.39616588] [ 0.29310406 0.9584545 0.8120068 ]] (Wrong byte order on Intel CPUs): a2 = [[ 8.97948198e-017 1.73406416e-025 -4.25909057e+014] [ 4.59443694e+090 7.91693101e-029 5.26959329e-135] [ 2.93240450e+060 -2.25898860e-051 -2.06126917e+302]] Invert a: OK Invert a2 (Wrong byte order!): OK invert a1: Traceback (most recent call last): File "test_byte_order.py", line 20, in b1 = N.linalg.inv(a1) File "/usr/lib64/python2.7/site-packages/numpy/linalg/linalg.py", line 445, in inv return wrap(solve(a, identity(a.shape[0], dtype=a.dtype))) File "/usr/lib64/python2.7/site-packages/numpy/linalg/linalg.py", line 326, in solve results = lapack_routine(n_eq, n_rhs, a, n_eq, pivots, b, n_eq, 0) lapack_lite.LapackError: Parameter a has non-native byte order in lapack_lite.dgesv -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: test_byte_order.py URL: From pav at iki.fi Wed Aug 31 04:59:44 2011 From: pav at iki.fi (Pauli Virtanen) Date: Wed, 31 Aug 2011 08:59:44 +0000 (UTC) Subject: [Numpy-discussion] Question on LinAlg Inverse Algorithm References: Message-ID: On Tue, 30 Aug 2011 15:48:18 -0700, Mark Janikas wrote: > Last week I posted a question involving the identification of linear > dependent columns of a matrix... but now I am finding an interesting > result based on the linalg.inv() function... sometime I am able to > invert a matrix that has linear dependent columns and other times I get > the LinAlgError()... this suggests that there is some kind of random > component to the INV method. Is this normal? I suspect that this is a case of floating-point rounding errors. Floating-point arithmetic is inexact, so even if a certain matrix is singular in exact arithmetic, for a computer it may still be invertible (by a given algorithm). This type of things are not unusual in floating-point computations. The matrix condition number (`np.linalg.cond`) is a better measure of whether a matrix is invertible or not. -- Pauli Virtanen From marquett at iap.fr Wed Aug 31 06:20:01 2011 From: marquett at iap.fr (Jean-Baptiste Marquette) Date: Wed, 31 Aug 2011 12:20:01 +0200 Subject: [Numpy-discussion] A question about dtype syntax In-Reply-To: <6AB4E3BA-C9B9-4D99-A470-259CD81589A7@gmail.com> References: <6AB4E3BA-C9B9-4D99-A470-259CD81589A7@gmail.com> Message-ID: <8701E943-4F16-435B-9EA9-B39639E2754C@iap.fr> Hi Pierre, Thanks for the guess. Unfortunately, I got the same error: [('bs3000k.cat', 280.60341, -7.09118, 9480, 0.2057, 0.14)] Traceback (most recent call last): File "/Users/marquett/workspace/Distort/src/StatsSep.py", line 40, in StatsAll = np.array(np.asarray(Stats), dtype=('a15, f8, f8, i4, f8, f8')) ValueError: could not convert string to float: bs3000k.cat The code is Stats = [(CatBase, round(stats.mean(Data.Ra), 5), round(stats.mean(Data.Dec), 5), len(Sep), round(stats.mean(Sep),4), round(stats.stdev(Sep),4),)] print Stats if First: StatsAll = np.array(np.asarray(Stats), dtype=('a15, f8, f8, i4, f8, f8')) First = False else: StatsAll = np.vstack((StatsAll, np.asarray(Stats))) print len(StatsAll) I tried various syntaxes, without success. 
Cheers JB > > On Aug 30, 2011, at 10:46 AM, Marquette Jean-Baptiste wrote: > >> Hi all, >> >> I have this piece of code: >> >> Stats = [CatBase, round(stats.mean(Data.Ra), 5), round(stats.mean(Data.Dec), 5), len(Sep), round(stats.mean(Sep),4), round(stats.stdev(Sep),4)] >> print Stats >> if First: >> StatsAll = np.array(np.asarray(Stats), dtype=('a11, f8, f8, i4, f8, f8')) >> First = False >> else: >> StatsAll = np.vstack((StatsAll, np.asarray(Stats))) >> print len(StatsAll) >> >> This yields the error: >> >> ['bs3000k.cat', 280.60341, -7.09118, 9480, 0.2057, 0.14] >> Traceback (most recent call last): >> File "/Users/marquett/workspace/Distort/src/StatsSep.py", line 40, in >> StatsAll = np.array(np.asarray(Stats), dtype=('a11, f8, f8, i4, f8, f8')) >> ValueError: could not convert string to float: bs3000k.cat >> >> What's wrong ? > > My guess: > Stats is a list of 5 elements, but you want a list of 1 5-element tuple to match the type. > >> Stats = [(CatBase, round(stats.mean(Data.Ra), 5), round(stats.mean(Data.Dec), 5), len(Sep), round(stats.mean(Sep),4), round(stats.stdev(Sep),4),)] > > > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- An HTML attachment was scrubbed... URL: From pgmdevlist at gmail.com Wed Aug 31 06:42:21 2011 From: pgmdevlist at gmail.com (Pierre GM) Date: Wed, 31 Aug 2011 12:42:21 +0200 Subject: [Numpy-discussion] A question about dtype syntax In-Reply-To: <8701E943-4F16-435B-9EA9-B39639E2754C@iap.fr> References: <6AB4E3BA-C9B9-4D99-A470-259CD81589A7@gmail.com> <8701E943-4F16-435B-9EA9-B39639E2754C@iap.fr> Message-ID: <64CA5B4C-E000-4372-85FD-35767715DE3B@gmail.com> On Aug 31, 2011, at 12:20 PM, Jean-Baptiste Marquette wrote: > > Hi Pierre, > > Thanks for the guess. Unfortunately, I got the same error: > > [('bs3000k.cat', 280.60341, -7.09118, 9480, 0.2057, 0.14)] > Traceback (most recent call last): > File "/Users/marquett/workspace/Distort/src/StatsSep.py", line 40, in > StatsAll = np.array(np.asarray(Stats), dtype=('a15, f8, f8, i4, f8, f8')) > ValueError: could not convert string to float: bs3000k.cat Of course, silly me Your line 40 is actually >>> StatsAll = np.array(np.asarray(Stats), dtype=('a15, f8, f8, i4, f8, f8')) With np.asarray(Stats), you're trying to load Stats as an array using a dtype of float by default. Of course, np.asarray is choking on the first element. So, try to use instead >>> StatsAll = np.array(Stats, dtype=('a15, f8, f8, i4, f8, f8')) From dieter at uellue.de Wed Aug 31 06:58:50 2011 From: dieter at uellue.de (Dieter Weber) Date: Wed, 31 Aug 2011 12:58:50 +0200 Subject: [Numpy-discussion] Numpy performance boost Message-ID: <1314788330.2418.13.camel@media> Hi, just wanted to show an example of how python3 + numpy compares with just python3 and many other languages and language implementations: http://shootout.alioth.debian.org/u64q/performance.php?test=mandelbrot#about The python3 program using numpy is #6 and you find it with the "interesting alternative" programs on the bottom because it was disqualified for doing things differently. It is 6.3x slower than the fastest program and well ahead of all other interpreted languages. Thanks to all contributors for making numpy such a great piece of software! 
Greetings, Dieter From davide.lasagna at polito.it Wed Aug 31 09:30:37 2011 From: davide.lasagna at polito.it (Davide) Date: Wed, 31 Aug 2011 15:30:37 +0200 Subject: [Numpy-discussion] Model Predictive Control package Message-ID: <4E5E377D.8050109@polito.it> Dear List, Does anybody knows if there is a python package for simulating LTI dynamic systems controlled with a model predictive controller? I am writing some code which does the job, but the math is not super-easy and i would not like to reinvent the wheel and loose to much time. I will soon publish such codes somewhere, i.e Github, so anyone interested can pick it up. Cheers, Davide From marquett at iap.fr Wed Aug 31 09:40:55 2011 From: marquett at iap.fr (Jean-Baptiste Marquette) Date: Wed, 31 Aug 2011 15:40:55 +0200 Subject: [Numpy-discussion] A question about dtype syntax In-Reply-To: <64CA5B4C-E000-4372-85FD-35767715DE3B@gmail.com> References: <6AB4E3BA-C9B9-4D99-A470-259CD81589A7@gmail.com> <8701E943-4F16-435B-9EA9-B39639E2754C@iap.fr> <64CA5B4C-E000-4372-85FD-35767715DE3B@gmail.com> Message-ID: Hi Pierre, Bingo ! That works. I finally coded like: Stats = [(CatBase, round(stats.mean(Data.Ra), 5), round(stats.mean(Data.Dec), 5), len(Sep), round(stats.mean(Sep),4), round(stats.stdev(Sep),4),)] StatArray = np.array(Stats, dtype=([('Catalog', 'a15'), ('RaMean', 'f8'), ('DecMean', 'f8'), ('NStars', 'i4'), ('RMS', 'f8'), ('StdDev', 'f8')])) print StatArray if First: StatsAll = StatArray First = False else: StatsAll = np.vstack((StatsAll, StatArray)) My next problem deals with the writing of data to a file. I use the command: np.savetxt(Table, StatsAll, delimiter=' ', fmt=['%15s %.5f %.5f %5d %.4f %.4f']) which yields: Traceback (most recent call last): File "/Users/marquett/workspace/Distort/src/StatsSep.py", line 44, in np.savetxt(Table, StatsAll, delimiter=' ', fmt=['%15s %.5f %.5f %5d %.4f %.4f']) File "/Library/Frameworks/EPD64.framework/Versions/7.1/lib/python2.7/site-packages/numpy/lib/npyio.py", line 979, in savetxt fh.write(asbytes(format % tuple(row) + newline)) TypeError: not enough arguments for format string I struggled with various unsuccessful fmt syntaxes, and the numpy doc is very discrete about that topic: fmt : string or sequence of strings A single format (%10.5f), a sequence of formats But I don't find this valid sequence nor an example... Cheers JB > > On Aug 31, 2011, at 12:20 PM, Jean-Baptiste Marquette wrote: > >> >> Hi Pierre, >> >> Thanks for the guess. Unfortunately, I got the same error: >> >> [('bs3000k.cat', 280.60341, -7.09118, 9480, 0.2057, 0.14)] >> Traceback (most recent call last): >> File "/Users/marquett/workspace/Distort/src/StatsSep.py", line 40, in >> StatsAll = np.array(np.asarray(Stats), dtype=('a15, f8, f8, i4, f8, f8')) >> ValueError: could not convert string to float: bs3000k.cat > > Of course, silly me > > Your line 40 is actually >>>> StatsAll = np.array(np.asarray(Stats), dtype=('a15, f8, f8, i4, f8, f8')) > > With np.asarray(Stats), you're trying to load Stats as an array using a dtype of float by default. Of course, np.asarray is choking on the first element. > > So, try to use instead >>>> StatsAll = np.array(Stats, dtype=('a15, f8, f8, i4, f8, f8')) > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From pgmdevlist at gmail.com Wed Aug 31 10:02:06 2011 From: pgmdevlist at gmail.com (Pierre GM) Date: Wed, 31 Aug 2011 16:02:06 +0200 Subject: [Numpy-discussion] A question about dtype syntax In-Reply-To: References: <6AB4E3BA-C9B9-4D99-A470-259CD81589A7@gmail.com> <8701E943-4F16-435B-9EA9-B39639E2754C@iap.fr> <64CA5B4C-E000-4372-85FD-35767715DE3B@gmail.com> Message-ID: <44057DAD-05DF-4513-A7D1-A80238702C21@gmail.com> On Aug 31, 2011, at 3:40 PM, Jean-Baptiste Marquette wrote: > Traceback (most recent call last): > File "/Users/marquett/workspace/Distort/src/StatsSep.py", line 44, in > np.savetxt(Table, StatsAll, delimiter=' ', fmt=['%15s %.5f %.5f %5d %.4f %.4f']) > File "/Library/Frameworks/EPD64.framework/Versions/7.1/lib/python2.7/site-packages/numpy/lib/npyio.py", line 979, in savetxt > fh.write(asbytes(format % tuple(row) + newline)) > TypeError: not enough arguments for format string Without knowing StatsAll, it ain't easy? From the exception message, we could expect that one of rows is empty or as less than the 6 elements required by your format string. If you're using IPython, switch to debugger mode (pdb), then inspect row and format to find out the content of the offending line. > I struggled with various unsuccessful fmt syntaxes, and the numpy doc is very discrete about that topic: > > fmt : string or sequence of strings > > A single format (%10.5f), a sequence of formats Looks clear enough to me? But yes, a comment in the code shows that " `fmt` can be a string with multiple insertion points or a list of formats. E.g. '%10.5f\t%10d' or ('%10.5f', '$10d')" (so we should probably update the doc to this regard) From marquett at iap.fr Wed Aug 31 10:24:10 2011 From: marquett at iap.fr (Jean-Baptiste Marquette) Date: Wed, 31 Aug 2011 16:24:10 +0200 Subject: [Numpy-discussion] A question about dtype syntax In-Reply-To: <44057DAD-05DF-4513-A7D1-A80238702C21@gmail.com> References: <6AB4E3BA-C9B9-4D99-A470-259CD81589A7@gmail.com> <8701E943-4F16-435B-9EA9-B39639E2754C@iap.fr> <64CA5B4C-E000-4372-85FD-35767715DE3B@gmail.com> <44057DAD-05DF-4513-A7D1-A80238702C21@gmail.com> Message-ID: <725FB731-F708-4B56-B084-4C8E1F34CDF1@iap.fr> Hi Pierre, > > On Aug 31, 2011, at 3:40 PM, Jean-Baptiste Marquette wrote: >> Traceback (most recent call last): >> File "/Users/marquett/workspace/Distort/src/StatsSep.py", line 44, in >> np.savetxt(Table, StatsAll, delimiter=' ', fmt=['%15s %.5f %.5f %5d %.4f %.4f']) >> File "/Library/Frameworks/EPD64.framework/Versions/7.1/lib/python2.7/site-packages/numpy/lib/npyio.py", line 979, in savetxt >> fh.write(asbytes(format % tuple(row) + newline)) >> TypeError: not enough arguments for format string > > Without knowing StatsAll, it ain't easy? From the exception message, we could expect that one of rows is empty or as less than the 6 elements required by your format string. > If you're using IPython, switch to debugger mode (pdb), then inspect row and format to find out the content of the offending line. 
Here is a (short) sample of StatsAll: [[('bs3000k.cat', 280.60341, -7.09118, 9480, 0.2057, 0.14)] [('bs3000l.cat', 280.61389, -7.24097, 11490, 0.1923, 0.0747)] [('bs3000m.cat', 280.77074, -7.08237, 13989, 0.2289, 0.1009)] [('bs3000n.cat', 280.77228, -7.23563, 15811, 0.1767, 0.1327)] [('bs3001k.cat', 280.95383, -7.10004, 7402, 0.2539, 0.0777)] [('bs3001l.cat', 280.95495, -7.23409, 13840, 0.1463, 0.1008)] [('bs3001m.cat', 281.1172, -7.08094, 9608, 0.2311, 0.1458)] [('bs3001n.cat', 281.12447, -7.23398, 14030, 0.2538, 0.1022)] [('bs3002k.cat', 280.62533, -7.47818, 593, 0.0291, 0.0237)] [('bs3002l.cat', 280.61508, -7.60359, 9122, 0.0518, 0.0205)] [('bs3002m.cat', 280.77209, -7.46262, 1510, 0.0415, 0.0302)] [('bs3002n.cat', 280.77578, -7.60117, 14177, 0.0807, 0.0327)] [('bs3003k.cat', 280.96463, -7.42967, 13506, 0.0305, 0.0225)] [('bs3003l.cat', 280.95638, -7.58462, 17903, 0.0458, 0.0298)] [('bs3003m.cat', 281.12729, -7.42516, 15676, 0.0879, 0.0446)] [('bs3003n.cat', 281.1354, -7.58497, 16015, 0.0685, 0.0376)] [('bs3004k.cat', 280.61148, -7.78976, 14794, 0.079, 0.0473)] [('bs3004l.cat', 280.61791, -7.94186, 15455, 0.0818, 0.0727)] [('bs3004m.cat', 280.78388, -7.78834, 14986, 0.0966, 0.0313)] [('bs3004n.cat', 280.78261, -7.93932, 18713, 0.0925, 0.0472)] [('bs3005k.cat', 280.9659, -7.78816, 14906, 0.0456, 0.022)] [('bs3005l.cat', 280.96811, -7.93894, 19744, 0.021, 0.0218)] [('bs3005m.cat', 281.1344, -7.78035, 15943, 0.0687, 0.0203)] [('bs3005n.cat', 281.13915, -7.93027, 18183, 0.1173, 0.0695)] [('bs3006k.cat', 280.61294, -8.14201, 13309, 0.143, 0.065)] [('bs3006l.cat', 280.65109, -8.29416, 405, 0.258, 0.1147)] [('bs3006m.cat', 280.78767, -8.13916, 14527, 0.1106, 0.0568)] [('bs3006n.cat', 280.80935, -8.28823, 818, 0.2382, 0.0764)] [('bs3007k.cat', 280.96614, -8.1401, 13251, 0.0946, 0.0415)] [('bs3007l.cat', 280.97158, -8.23797, 5807, 0.1758, 0.0636)] [('bs3007m.cat', 281.14129, -8.13799, 13886, 0.1524, 0.0517)] [('bs3007n.cat', 281.15309, -8.2476, 214, 0.1584, 0.0648)]] > >> I struggled with various unsuccessful fmt syntaxes, and the numpy doc is very discrete about that topic: >> >> fmt : string or sequence of strings >> >> A single format (%10.5f), a sequence of formats > > Looks clear enough to me? But yes, a comment in the code shows that " `fmt` can be a string with multiple insertion points or a list of formats. E.g. '%10.5f\t%10d' or ('%10.5f', '$10d')" (so we should probably update the doc to this regard) The command with parentheses: np.savetxt(Table, StatsAll, delimiter=' ', fmt=('%15s %.5f %.5f %5d %.4f %.4f')) fails as well, but with a different error: Traceback (most recent call last): File "/Users/marquett/workspace/Distort/src/StatsSep.py", line 44, in np.savetxt(Table, StatsAll, delimiter=' ', fmt=('%15s %.5f %.5f %5d %.4f %.4f')) File "/Library/Frameworks/EPD64.framework/Versions/7.1/lib/python2.7/site-packages/numpy/lib/npyio.py", line 974, in savetxt % fmt) AttributeError: fmt has wrong number of % formats. %15s %.5f %.5f %5d %.4f %.4f Plus, this one: np.savetxt(Table, StatsAll, delimiter=' ', fmt=('%15s', '%.5f', '%.5f', '%5d', '%.4f', '%.4f')) yields: Traceback (most recent call last): File "/Users/marquett/workspace/Distort/src/StatsSep.py", line 44, in np.savetxt(Table, StatsAll, delimiter=' ', fmt=('%15s', '%.5f', '%.5f', '%5d', '%.4f', '%.4f')) File "/Library/Frameworks/EPD64.framework/Versions/7.1/lib/python2.7/site-packages/numpy/lib/npyio.py", line 966, in savetxt raise AttributeError('fmt has wrong shape. %s' % str(fmt)) AttributeError: fmt has wrong shape. 
('%15s', '%.5f', '%.5f', '%5d', '%.4f', '%.4f') Quite puzzling... Should I switch to the I/O of asciitable package ? Anyway, thanks again for your help. JB -------------- next part -------------- An HTML attachment was scrubbed... URL: From pgmdevlist at gmail.com Wed Aug 31 10:33:26 2011 From: pgmdevlist at gmail.com (Pierre GM) Date: Wed, 31 Aug 2011 16:33:26 +0200 Subject: [Numpy-discussion] A question about dtype syntax In-Reply-To: <725FB731-F708-4B56-B084-4C8E1F34CDF1@iap.fr> References: <6AB4E3BA-C9B9-4D99-A470-259CD81589A7@gmail.com> <8701E943-4F16-435B-9EA9-B39639E2754C@iap.fr> <64CA5B4C-E000-4372-85FD-35767715DE3B@gmail.com> <44057DAD-05DF-4513-A7D1-A80238702C21@gmail.com> <725FB731-F708-4B56-B084-4C8E1F34CDF1@iap.fr> Message-ID: <57A95D9F-1AB2-4188-8970-12716CC82F2B@gmail.com> On Aug 31, 2011, at 4:24 PM, Jean-Baptiste Marquette wrote: > > Hi Pierre, > >> >> On Aug 31, 2011, at 3:40 PM, Jean-Baptiste Marquette wrote: >>> Traceback (most recent call last): >>> File "/Users/marquett/workspace/Distort/src/StatsSep.py", line 44, in >>> np.savetxt(Table, StatsAll, delimiter=' ', fmt=['%15s %.5f %.5f %5d %.4f %.4f']) >>> File "/Library/Frameworks/EPD64.framework/Versions/7.1/lib/python2.7/site-packages/numpy/lib/npyio.py", line 979, in savetxt >>> fh.write(asbytes(format % tuple(row) + newline)) >>> TypeError: not enough arguments for format string >> >> Without knowing StatsAll, it ain't easy? From the exception message, we could expect that one of rows is empty or as less than the 6 elements required by your format string. >> If you're using IPython, switch to debugger mode (pdb), then inspect row and format to find out the content of the offending line. > > Here is a (short) sample of StatsAll: > > [[('bs3000k.cat', 280.60341, -7.09118, 9480, 0.2057, 0.14)] Have you tried the debugger as I suggested ? There must be a line somewhere that doesn't follow the format (the first one?) >> >>> I struggled with various unsuccessful fmt syntaxes, and the numpy doc is very discrete about that topic: >>> >>> fmt : string or sequence of strings >>> >>> A single format (%10.5f), a sequence of formats >> >> Looks clear enough to me? But yes, a comment in the code shows that " `fmt` can be a string with multiple insertion points or a list of formats. E.g. '%10.5f\t%10d' or ('%10.5f', '$10d')" (so we should probably update the doc to this regard) > > The command with parentheses: > > np.savetxt(Table, StatsAll, delimiter=' ', fmt=('%15s %.5f %.5f %5d %.4f %.4f')) > > fails as well, but with a different error: Well, either you use 1 string fmt="%15s %.5f %.5f %5d %.4f %.4f" or you use a list of strings fmt=("%15s", "%.5f", "%.5f", "%5d", "%.4f", "%.4f") > > Plus, this one: > > np.savetxt(Table, StatsAll, delimiter=' ', fmt=('%15s', '%.5f', '%.5f', '%5d', '%.4f', '%.4f')) > > yields: > > Traceback (most recent call last): > File "/Users/marquett/workspace/Distort/src/StatsSep.py", line 44, in > np.savetxt(Table, StatsAll, delimiter=' ', fmt=('%15s', '%.5f', '%.5f', '%5d', '%.4f', '%.4f')) > File "/Library/Frameworks/EPD64.framework/Versions/7.1/lib/python2.7/site-packages/numpy/lib/npyio.py", line 966, in savetxt > raise AttributeError('fmt has wrong shape. %s' % str(fmt)) > AttributeError: fmt has wrong shape. ('%15s', '%.5f', '%.5f', '%5d', '%.4f', '%.4f') try fmt=[('%15s', '%.5f', '%.5f', '%5d', '%.4f', '%.4f')] > Quite puzzling... > Should I switch to the I/O of asciitable package ? As you wish. 
The easiest might be to write the file yourself.: >>> fmt = "%15s %.5f %.5f %5d %.4f %.4f\n" >>> f=open(Table,'r') >>> for line in StatsAll: >>> f.write(fmt % line) >>> f.close() or something like that From warren.weckesser at enthought.com Wed Aug 31 10:49:45 2011 From: warren.weckesser at enthought.com (Warren Weckesser) Date: Wed, 31 Aug 2011 09:49:45 -0500 Subject: [Numpy-discussion] A question about dtype syntax In-Reply-To: <725FB731-F708-4B56-B084-4C8E1F34CDF1@iap.fr> References: <6AB4E3BA-C9B9-4D99-A470-259CD81589A7@gmail.com> <8701E943-4F16-435B-9EA9-B39639E2754C@iap.fr> <64CA5B4C-E000-4372-85FD-35767715DE3B@gmail.com> <44057DAD-05DF-4513-A7D1-A80238702C21@gmail.com> <725FB731-F708-4B56-B084-4C8E1F34CDF1@iap.fr> Message-ID: On Wed, Aug 31, 2011 at 9:24 AM, Jean-Baptiste Marquette wrote: > > Hi Pierre, > > > On Aug 31, 2011, at 3:40 PM, Jean-Baptiste Marquette wrote: > > Traceback (most recent call last): > > File "/Users/marquett/workspace/Distort/src/StatsSep.py", line 44, in > > > np.savetxt(Table, StatsAll, delimiter=' ', fmt=['%15s %.5f %.5f %5d %.4f > %.4f']) > > File > "/Library/Frameworks/EPD64.framework/Versions/7.1/lib/python2.7/site-packages/numpy/lib/npyio.py", > line 979, in savetxt > > fh.write(asbytes(format % tuple(row) + newline)) > > TypeError: not enough arguments for format string > > > Without knowing StatsAll, it ain't easy? From the exception message, we > could expect that one of rows is empty or as less than the 6 elements > required by your format string. > If you're using IPython, switch to debugger mode (pdb), then inspect row > and format to find out the content of the offending line. > > > Here is a (short) sample of StatsAll: > > [[('bs3000k.cat', 280.60341, -7.09118, 9480, 0.2057, 0.14)] > [('bs3000l.cat', 280.61389, -7.24097, 11490, 0.1923, 0.0747)] > [('bs3000m.cat', 280.77074, -7.08237, 13989, 0.2289, 0.1009)] > [('bs3000n.cat', 280.77228, -7.23563, 15811, 0.1767, 0.1327)] > [('bs3001k.cat', 280.95383, -7.10004, 7402, 0.2539, 0.0777)] > [('bs3001l.cat', 280.95495, -7.23409, 13840, 0.1463, 0.1008)] > [('bs3001m.cat', 281.1172, -7.08094, 9608, 0.2311, 0.1458)] > [('bs3001n.cat', 281.12447, -7.23398, 14030, 0.2538, 0.1022)] > [('bs3002k.cat', 280.62533, -7.47818, 593, 0.0291, 0.0237)] > [('bs3002l.cat', 280.61508, -7.60359, 9122, 0.0518, 0.0205)] > [('bs3002m.cat', 280.77209, -7.46262, 1510, 0.0415, 0.0302)] > [('bs3002n.cat', 280.77578, -7.60117, 14177, 0.0807, 0.0327)] > [('bs3003k.cat', 280.96463, -7.42967, 13506, 0.0305, 0.0225)] > [('bs3003l.cat', 280.95638, -7.58462, 17903, 0.0458, 0.0298)] > [('bs3003m.cat', 281.12729, -7.42516, 15676, 0.0879, 0.0446)] > [('bs3003n.cat', 281.1354, -7.58497, 16015, 0.0685, 0.0376)] > [('bs3004k.cat', 280.61148, -7.78976, 14794, 0.079, 0.0473)] > [('bs3004l.cat', 280.61791, -7.94186, 15455, 0.0818, 0.0727)] > [('bs3004m.cat', 280.78388, -7.78834, 14986, 0.0966, 0.0313)] > [('bs3004n.cat', 280.78261, -7.93932, 18713, 0.0925, 0.0472)] > [('bs3005k.cat', 280.9659, -7.78816, 14906, 0.0456, 0.022)] > [('bs3005l.cat', 280.96811, -7.93894, 19744, 0.021, 0.0218)] > [('bs3005m.cat', 281.1344, -7.78035, 15943, 0.0687, 0.0203)] > [('bs3005n.cat', 281.13915, -7.93027, 18183, 0.1173, 0.0695)] > [('bs3006k.cat', 280.61294, -8.14201, 13309, 0.143, 0.065)] > [('bs3006l.cat', 280.65109, -8.29416, 405, 0.258, 0.1147)] > [('bs3006m.cat', 280.78767, -8.13916, 14527, 0.1106, 0.0568)] > [('bs3006n.cat', 280.80935, -8.28823, 818, 0.2382, 0.0764)] > [('bs3007k.cat', 280.96614, -8.1401, 13251, 0.0946, 0.0415)] > [('bs3007l.cat', 
280.97158, -8.23797, 5807, 0.1758, 0.0636)] > [('bs3007m.cat', 281.14129, -8.13799, 13886, 0.1524, 0.0517)] > [('bs3007n.cat', 281.15309, -8.2476, 214, 0.1584, 0.0648)]] > > Notice that your array is actually a 2D structured array with shape (n, 1). Try reshaping it to (n,) or apply np.squeeze before calling savetxt. Warren > > I struggled with various unsuccessful fmt syntaxes, and the numpy doc is > very discrete about that topic: > > > fmt : string or sequence of strings > > > A single format (%10.5f), a sequence of formats > > > Looks clear enough to me? But yes, a comment in the code shows that " > `fmt` can be a string with multiple insertion points or a list of formats. > E.g. '%10.5f\t%10d' or ('%10.5f', '$10d')" (so we should probably update > the doc to this regard) > > > The command with parentheses: > > np.savetxt(Table, StatsAll, delimiter=' ', fmt=('%15s %.5f %.5f > %5d %.4f %.4f')) > > fails as well, but with a different error: > > Traceback (most recent call last): > File "/Users/marquett/workspace/Distort/src/StatsSep.py", line 44, in > > np.savetxt(Table, StatsAll, delimiter=' ', fmt=('%15s %.5f %.5f %5d > %.4f %.4f')) > File > "/Library/Frameworks/EPD64.framework/Versions/7.1/lib/python2.7/site-packages/numpy/lib/npyio.py", > line 974, in savetxt > % fmt) > AttributeError: fmt has wrong number of % formats. %15s %.5f %.5f %5d %.4f > %.4f > > Plus, this one: > > np.savetxt(Table, StatsAll, delimiter=' ', fmt=('%15s', '%.5f', > '%.5f', '%5d', '%.4f', '%.4f')) > > yields: > > Traceback (most recent call last): > File "/Users/marquett/workspace/Distort/src/StatsSep.py", line 44, in > > np.savetxt(Table, StatsAll, delimiter=' ', fmt=('%15s', '%.5f', '%.5f', > '%5d', '%.4f', '%.4f')) > File > "/Library/Frameworks/EPD64.framework/Versions/7.1/lib/python2.7/site-packages/numpy/lib/npyio.py", > line 966, in savetxt > raise AttributeError('fmt has wrong shape. %s' % str(fmt)) > AttributeError: fmt has wrong shape. ('%15s', '%.5f', '%.5f', '%5d', > '%.4f', '%.4f') > > Quite puzzling... > Should I switch to the I/O of asciitable package ? > Anyway, thanks again for your help. > JB > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From igouy2 at yahoo.com Wed Aug 31 12:01:27 2011 From: igouy2 at yahoo.com (Isaac Gouy) Date: Wed, 31 Aug 2011 09:01:27 -0700 (PDT) Subject: [Numpy-discussion] Numpy performance boost In-Reply-To: <1314788330.2418.13.camel@media> References: <1314788330.2418.13.camel@media> Message-ID: <4eaad006-f5be-4ede-9584-ad0559debf35@p25g2000pri.googlegroups.com> Dieter, thank you for contributing a numpy mandelbrot program - but no thanks for your "disqualified for doing things differently" comment here. The benchmarks game has been showing a spectral-norm program based on numpy as an "interesting alternative" for the last couple of years - http://shootout.alioth.debian.org/u64q/program.php?test=spectralnorm&lang=python3&id=2 - simply because I thought numpy was interesting and wanted somehow to include a numpy program without taking on the chore of dealing with a whole bunch of numpy programs. The relevant point isn't that your numpy program is shown as "an interesting alternative". The relevant point is that your numpy program is shown at all. 
best wishes, Isaac On Aug 31, 3:58?am, Dieter Weber wrote: > Hi, > just wanted to show an example of how python3 + numpy compares with just > python3 and many other languages and language implementations:http://shootout.alioth.debian.org/u64q/performance.php?test=mandelbro... > > The python3 program using numpy is #6 and you find it with the > "interesting alternative" programs on the bottom because it was > disqualified for doing things differently. It is 6.3x slower than the > fastest program and well ahead of all other interpreted languages. > > Thanks to all contributors for making numpy such a great piece of > software! > > Greetings, > Dieter From Chris.Barker at noaa.gov Wed Aug 31 12:08:23 2011 From: Chris.Barker at noaa.gov (Chris.Barker) Date: Wed, 31 Aug 2011 09:08:23 -0700 Subject: [Numpy-discussion] Numpy performance boost In-Reply-To: <1314788330.2418.13.camel@media> References: <1314788330.2418.13.camel@media> Message-ID: <4E5E5C77.1090307@noaa.gov> On 8/31/11 3:58 AM, Dieter Weber wrote: > just wanted to show an example of how python3 + numpy compares with just > python3 and many other languages and language implementations: > http://shootout.alioth.debian.org/u64q/performance.php?test=mandelbrot#about hmmm - it would be interesting to see what PyPy does with this. Also Cython -- can you call that another language? Done right, it should be in the C ballpark. > The python3 program using numpy is #6 and you find it with the > "interesting alternative" programs on the bottom because it was > disqualified for doing things differently. I'm not sure what they mean by "differently" -- but if it's because numpy is not a standard part of the language -- who cares. It's too bad, though -- a lot of people do discount numpy for that reason, but as far as I'm concerned, doing numerics without numpy is like doing text processing without the string class (type?). Python would be essentially useless if string were implemented as lists or tuples of characters, and everything had to loop through them. So why isn't an ndarray considered a first class citizen in python? -CHB -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From igouy2 at yahoo.com Wed Aug 31 12:53:12 2011 From: igouy2 at yahoo.com (Isaac Gouy) Date: Wed, 31 Aug 2011 09:53:12 -0700 (PDT) Subject: [Numpy-discussion] Numpy performance boost References: <1314788330.2418.13.camel@media> <4E5E5C77.1090307@noaa.gov> Message-ID: <1314809592.67472.YahooMailNeo@web65615.mail.ac4.yahoo.com> ----- Original Message ----- > From: Chris.Barker > To: numpy-discussion at scipy.org > Cc: > Sent: Wednesday, August 31, 2011 9:08 AM > Subject: Re: [Numpy-discussion] Numpy performance boost > > On 8/31/11 3:58 AM, Dieter Weber wrote: >>? just wanted to show an example of how python3 + numpy compares with just >>? python3 and many other languages and language implementations: >> > http://shootout.alioth.debian.org/u64q/performance.php?test=mandelbrot#about > > hmmm - it would be interesting to see what PyPy does with this. So do it! http://shootout.alioth.debian.org/help.php#languagex Here's the nightly snapshot with source code for all the programs - https://alioth.debian.org/frs/?group_id=30402 Have fun. 
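[Editorial aside, not part of the archived thread: for readers wondering what the "python3 + numpy" mandelbrot approach under discussion looks like in practice, below is a minimal vectorized sketch. It is not the benchmark submission itself; the grid bounds, resolution and iteration count are arbitrary choices made only for illustration.]

import numpy as np

def mandelbrot(width=200, height=200, max_iter=50):
    # Complex grid covering the usual view of the set.
    re = np.linspace(-2.0, 0.5, width)
    im = np.linspace(-1.25, 1.25, height)
    c = re[np.newaxis, :] + 1j * im[:, np.newaxis]
    z = np.zeros_like(c)
    inside = np.ones(c.shape, dtype=bool)   # points that have not escaped yet
    for _ in range(max_iter):
        z[inside] = z[inside] ** 2 + c[inside]
        inside &= np.abs(z) <= 2.0
    return inside

The whole-array updates keep the inner loop in compiled code rather than in the Python interpreter, which is what Dieter's timings illustrate.
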
From mjanikas at esri.com Wed Aug 31 13:56:28 2011 From: mjanikas at esri.com (Mark Janikas) Date: Wed, 31 Aug 2011 10:56:28 -0700 Subject: [Numpy-discussion] Question on LinAlg Inverse Algorithm In-Reply-To: References: Message-ID: Right indeed... I have spent a lot of time looking at this and it seems a waste of time as the results are garbage anyways when the columns are collinear. I am just going to set a threshold, check the condition number, continue is satisfied, return error/warning if not.... now, what is too large?.... Ill poke around. TY! MJ -----Original Message----- From: numpy-discussion-bounces at scipy.org [mailto:numpy-discussion-bounces at scipy.org] On Behalf Of Pauli Virtanen Sent: Wednesday, August 31, 2011 2:00 AM To: numpy-discussion at scipy.org Subject: Re: [Numpy-discussion] Question on LinAlg Inverse Algorithm On Tue, 30 Aug 2011 15:48:18 -0700, Mark Janikas wrote: > Last week I posted a question involving the identification of linear > dependent columns of a matrix... but now I am finding an interesting > result based on the linalg.inv() function... sometime I am able to > invert a matrix that has linear dependent columns and other times I get > the LinAlgError()... this suggests that there is some kind of random > component to the INV method. Is this normal? I suspect that this is a case of floating-point rounding errors. Floating-point arithmetic is inexact, so even if a certain matrix is singular in exact arithmetic, for a computer it may still be invertible (by a given algorithm). This type of things are not unusual in floating-point computations. The matrix condition number (`np.linalg.cond`) is a better measure of whether a matrix is invertible or not. -- Pauli Virtanen _______________________________________________ NumPy-Discussion mailing list NumPy-Discussion at scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion From bsouthey at gmail.com Wed Aug 31 14:10:56 2011 From: bsouthey at gmail.com (Bruce Southey) Date: Wed, 31 Aug 2011 13:10:56 -0500 Subject: [Numpy-discussion] Question on LinAlg Inverse Algorithm In-Reply-To: References: Message-ID: <4E5E7930.6060209@gmail.com> On 08/31/2011 12:56 PM, Mark Janikas wrote: > Right indeed... I have spent a lot of time looking at this and it seems a waste of time as the results are garbage anyways when the columns are collinear. I am just going to set a threshold, check the condition number, continue is satisfied, return error/warning if not.... now, what is too large?.... Ill poke around. TY! > > MJ The results are not 'garbage' as if you have collinear columns as these have very well-known and understandable meaning. But if you don't expect this then you really need to examine how you are modeling or measuring your data because that is where the problem lies. For example, if you are measuring two variables then it means that those measurements are not independent as you are assuming. Bruce > -----Original Message----- > From: numpy-discussion-bounces at scipy.org [mailto:numpy-discussion-bounces at scipy.org] On Behalf Of Pauli Virtanen > Sent: Wednesday, August 31, 2011 2:00 AM > To: numpy-discussion at scipy.org > Subject: Re: [Numpy-discussion] Question on LinAlg Inverse Algorithm > > On Tue, 30 Aug 2011 15:48:18 -0700, Mark Janikas wrote: >> Last week I posted a question involving the identification of linear >> dependent columns of a matrix... but now I am finding an interesting >> result based on the linalg.inv() function... 
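[Editorial sketch, not part of Bruce's message: the guard Mark describes -- compute the condition number first and only invert if it is acceptable -- might look like the following, assuming a square design matrix x. The 1e12 cutoff is an arbitrary illustrative value; the thread itself leaves "how large is too large" open.]

import numpy as np

def guarded_inv(x, cond_threshold=1e12):
    # cond_threshold is an illustrative cutoff, not a value recommended
    # anywhere in this thread; pick it to suit the application.
    cond = np.linalg.cond(x)
    if not np.isfinite(cond) or cond > cond_threshold:
        raise np.linalg.LinAlgError(
            "matrix looks singular or ill conditioned (cond=%g)" % cond)
    return np.linalg.inv(x)

Exact or near linear dependence among columns also shows up as a rank deficiency, e.g. np.linalg.matrix_rank(x) < min(x.shape), which can be reported to the user before any inversion is attempted.
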
sometime I am able to >> invert a matrix that has linear dependent columns and other times I get >> the LinAlgError()... this suggests that there is some kind of random >> component to the INV method. Is this normal? > I suspect that this is a case of floating-point rounding errors. > Floating-point arithmetic is inexact, so even if a certain matrix > is singular in exact arithmetic, for a computer it may still be > invertible (by a given algorithm). This type of things are not > unusual in floating-point computations. > > The matrix condition number (`np.linalg.cond`) is a better measure > of whether a matrix is invertible or not. > From mjanikas at esri.com Wed Aug 31 14:32:19 2011 From: mjanikas at esri.com (Mark Janikas) Date: Wed, 31 Aug 2011 11:32:19 -0700 Subject: [Numpy-discussion] Question on LinAlg Inverse Algorithm In-Reply-To: <4E5E7930.6060209@gmail.com> References: <4E5E7930.6060209@gmail.com> Message-ID: When I say garbage, I mean in the context of my hypothesis testing when in the presence of perfect multicollinearity. I advise the user of the combination that leads to the problem and move on.... -----Original Message----- From: numpy-discussion-bounces at scipy.org [mailto:numpy-discussion-bounces at scipy.org] On Behalf Of Bruce Southey Sent: Wednesday, August 31, 2011 11:11 AM To: numpy-discussion at scipy.org Subject: Re: [Numpy-discussion] Question on LinAlg Inverse Algorithm On 08/31/2011 12:56 PM, Mark Janikas wrote: > Right indeed... I have spent a lot of time looking at this and it seems a waste of time as the results are garbage anyways when the columns are collinear. I am just going to set a threshold, check the condition number, continue is satisfied, return error/warning if not.... now, what is too large?.... Ill poke around. TY! > > MJ The results are not 'garbage' as if you have collinear columns as these have very well-known and understandable meaning. But if you don't expect this then you really need to examine how you are modeling or measuring your data because that is where the problem lies. For example, if you are measuring two variables then it means that those measurements are not independent as you are assuming. Bruce > -----Original Message----- > From: numpy-discussion-bounces at scipy.org [mailto:numpy-discussion-bounces at scipy.org] On Behalf Of Pauli Virtanen > Sent: Wednesday, August 31, 2011 2:00 AM > To: numpy-discussion at scipy.org > Subject: Re: [Numpy-discussion] Question on LinAlg Inverse Algorithm > > On Tue, 30 Aug 2011 15:48:18 -0700, Mark Janikas wrote: >> Last week I posted a question involving the identification of linear >> dependent columns of a matrix... but now I am finding an interesting >> result based on the linalg.inv() function... sometime I am able to >> invert a matrix that has linear dependent columns and other times I get >> the LinAlgError()... this suggests that there is some kind of random >> component to the INV method. Is this normal? > I suspect that this is a case of floating-point rounding errors. > Floating-point arithmetic is inexact, so even if a certain matrix > is singular in exact arithmetic, for a computer it may still be > invertible (by a given algorithm). This type of things are not > unusual in floating-point computations. > > The matrix condition number (`np.linalg.cond`) is a better measure > of whether a matrix is invertible or not. 
> _______________________________________________ NumPy-Discussion mailing list NumPy-Discussion at scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion From cjordan1 at uw.edu Wed Aug 31 14:51:13 2011 From: cjordan1 at uw.edu (Christopher Jordan-Squire) Date: Wed, 31 Aug 2011 13:51:13 -0500 Subject: [Numpy-discussion] non-uniform discrete sampling with given probabilities (w/ and w/o replacement) Message-ID: In numpy, is there a way of generating a random integer in a specified range where the integers in that range have given probabilities? So, for example, generating a random integer between 1 and 3 with probabilities [0.1, 0.2, 0.7] for the three integers? I'd like to know how to do this without replacement, as well. If the probabilities are uniform, there are a number of ways, including just shuffling the data and taking the first however-many elements of the shuffle. But this doesn't apply with non-uniform probabilities. Similarly, one could try arbitrary-sampling-method X (such as inverse-cdf sampling) and then rejecting repeats. But that is clearly sub-optimal if the number of samples desired is near the same order of magnitude as the total population, or if the probabilities are very skewed. (E.g. a weighted sample of size 2 without replacement from [0,1,2] with probabilities [0.999,.00005, 0.00005] will take a long time if you just sample repeatedly until you have two distinct samples.) I know parts of what I want can be done in scipy.statistics using a discrete_rv or with the python standard library's random package. I would much prefer to do it only using numpy because the eventual application shouldn't have a scipy dependency and should use the same random seed as numpy.random. (For more background, what I want is to create a function like sample in R, where I can give it an array-like of doo-hickeys and another array-like of probabilities associated with each doo-hickey, and then generate a random sample of doo-hickeys with those probabilities. One step for that is generating ints, to use as indices, with the same probabilities. I'd like a version of this to be in numpy/scipy, but it doesn't really belong in scipy since it doesn't -Chris JS From shish at keba.be Wed Aug 31 15:07:01 2011 From: shish at keba.be (Olivier Delalleau) Date: Wed, 31 Aug 2011 15:07:01 -0400 Subject: [Numpy-discussion] non-uniform discrete sampling with given probabilities (w/ and w/o replacement) In-Reply-To: References: Message-ID: You can use: 1 + numpy.argmax(numpy.random.multinomial(1, [0.1, 0.2, 0.7])) For your "real" application you'll probably want to use a value >1 for the first parameter (equal to your sample size), instead of calling it multiple times. -=- Olivier 2011/8/31 Christopher Jordan-Squire > In numpy, is there a way of generating a random integer in a specified > range where the integers in that range have given probabilities? So, > for example, generating a random integer between 1 and 3 with > probabilities [0.1, 0.2, 0.7] for the three integers? > > I'd like to know how to do this without replacement, as well. If the > probabilities are uniform, there are a number of ways, including just > shuffling the data and taking the first however-many elements of the > shuffle. But this doesn't apply with non-uniform probabilities. > Similarly, one could try arbitrary-sampling-method X (such as > inverse-cdf sampling) and then rejecting repeats. 
But that is clearly > sub-optimal if the number of samples desired is near the same order of > magnitude as the total population, or if the probabilities are very > skewed. (E.g. a weighted sample of size 2 without replacement from > [0,1,2] with probabilities [0.999,.00005, 0.00005] will take a long > time if you just sample repeatedly until you have two distinct > samples.) > > I know parts of what I want can be done in scipy.statistics using a > discrete_rv or with the python standard library's random package. I > would much prefer to do it only using numpy because the eventual > application shouldn't have a scipy dependency and should use the same > random seed as numpy.random. > > (For more background, what I want is to create a function like sample > in R, where I can give it an array-like of doo-hickeys and another > array-like of probabilities associated with each doo-hickey, and then > generate a random sample of doo-hickeys with those probabilities. One > step for that is generating ints, to use as indices, with the same > probabilities. I'd like a version of this to be in numpy/scipy, but it > doesn't really belong in scipy since it doesn't > > -Chris JS > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From cjordan1 at uw.edu Wed Aug 31 15:17:04 2011 From: cjordan1 at uw.edu (Christopher Jordan-Squire) Date: Wed, 31 Aug 2011 14:17:04 -0500 Subject: [Numpy-discussion] non-uniform discrete sampling with given probabilities (w/ and w/o replacement) In-Reply-To: References: Message-ID: On Wed, Aug 31, 2011 at 2:07 PM, Olivier Delalleau wrote: > You can use: > 1 + numpy.argmax(numpy.random.multinomial(1, [0.1, 0.2, 0.7])) > > For your "real" application you'll probably want to use a value >1 for the > first parameter (equal to your sample size), instead of calling it multiple > times. > > -=- Olivier Thanks. Warren (Weckesser) mentioned this possibility to me yesterday and I forgot to put it in my post. I assume you mean something like x = np.arange(3) y = np.random.multinomial(30, [0.1,0.2,0.7]) z = np.repeat(x, y) np.random.shuffle(z) That look right? -Chris JS > > 2011/8/31 Christopher Jordan-Squire >> >> In numpy, is there a way of generating a random integer in a specified >> range where the integers in that range have given probabilities? So, >> for example, generating a random integer between 1 and 3 with >> probabilities [0.1, 0.2, 0.7] for the three integers? >> >> I'd like to know how to do this without replacement, as well. If the >> probabilities are uniform, there are a number of ways, including just >> shuffling the data and taking the first however-many elements of the >> shuffle. But this doesn't apply with non-uniform probabilities. >> Similarly, one could try arbitrary-sampling-method X (such as >> inverse-cdf sampling) and then rejecting repeats. But that is clearly >> sub-optimal if the number of samples desired is near the same order of >> magnitude as the total population, or if the probabilities are very >> skewed. (E.g. a weighted sample of size 2 without replacement from >> [0,1,2] with probabilities [0.999,.00005, 0.00005] will take a long >> time if you just sample repeatedly until you have two distinct >> samples.) >> >> I know parts of what I want can be done in scipy.statistics using a >> discrete_rv or with the python standard library's random package. 
>> would much prefer to do it only using numpy because the eventual
>> application shouldn't have a scipy dependency and should use the same
>> random seed as numpy.random.
>>
>> (For more background, what I want is to create a function like sample
>> in R, where I can give it an array-like of doo-hickeys and another
>> array-like of probabilities associated with each doo-hickey, and then
>> generate a random sample of doo-hickeys with those probabilities. One
>> step for that is generating ints, to use as indices, with the same
>> probabilities. I'd like a version of this to be in numpy/scipy, but it
>> doesn't really belong in scipy since it doesn't
>>
>> -Chris JS
>> _______________________________________________
>> NumPy-Discussion mailing list
>> NumPy-Discussion at scipy.org
>> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
>

From shish at keba.be Wed Aug 31 15:22:43 2011
From: shish at keba.be (Olivier Delalleau)
Date: Wed, 31 Aug 2011 15:22:43 -0400
Subject: [Numpy-discussion] non-uniform discrete sampling with given probabilities (w/ and w/o replacement)
In-Reply-To: 
References: 
Message-ID: 

2011/8/31 Christopher Jordan-Squire
> On Wed, Aug 31, 2011 at 2:07 PM, Olivier Delalleau wrote:
> > You can use:
> > 1 + numpy.argmax(numpy.random.multinomial(1, [0.1, 0.2, 0.7]))
> >
> > For your "real" application you'll probably want to use a value >1 for
> the
> > first parameter (equal to your sample size), instead of calling it
> multiple
> > times.
> >
> > -=- Olivier
>
> Thanks. Warren (Weckesser) mentioned this possibility to me yesterday
> and I forgot to put it in my post. I assume you mean something like
>
> x = np.arange(3)
> y = np.random.multinomial(30, [0.1,0.2,0.7])
> z = np.repeat(x, y)
> np.random.shuffle(z)
>
> That look right?
>
> -Chris JS
>
>

Yes, exactly.

-=- Olivier
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From josef.pktd at gmail.com Wed Aug 31 16:34:17 2011
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Wed, 31 Aug 2011 16:34:17 -0400
Subject: [Numpy-discussion] non-uniform discrete sampling with given probabilities (w/ and w/o replacement)
In-Reply-To: 
References: 
Message-ID: 

On Wed, Aug 31, 2011 at 3:22 PM, Olivier Delalleau wrote:
> 2011/8/31 Christopher Jordan-Squire
>>
>> On Wed, Aug 31, 2011 at 2:07 PM, Olivier Delalleau wrote:
>> > You can use:
>> > 1 + numpy.argmax(numpy.random.multinomial(1, [0.1, 0.2, 0.7]))
>> >
>> > For your "real" application you'll probably want to use a value >1 for
>> > the
>> > first parameter (equal to your sample size), instead of calling it
>> > multiple
>> > times.
>> >
>> > -=- Olivier
>>
>> Thanks. Warren (Weckesser) mentioned this possibility to me yesterday
>> and I forgot to put it in my post. I assume you mean something like
>>
>> x = np.arange(3)
>> y = np.random.multinomial(30, [0.1,0.2,0.7])
>> z = np.repeat(x, y)
>> np.random.shuffle(z)
>>
>> That look right?
>>
>> -Chris JS
>>
>
> Yes, exactly.

Chuck's answer to the same question, when I asked on the list, used
searchsorted and is fast

cdfvalues.searchsorted(np.random.random(size))

my recent version of it for FiniteLatticeDistribution

    def rvs(self, size=1):
        '''draw random variables with shape given by size

        '''
        #w = self.pdfvalues
        #p = cumsum(w)/float(w.sum())
        #p.searchsorted(np.random.random(size))
        return self.support[self.cdfvalues.searchsorted(np.random.random(size))]

Josef

>
> -=- Olivier
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
>

From cjordan1 at uw.edu Wed Aug 31 16:58:08 2011
From: cjordan1 at uw.edu (Christopher Jordan-Squire)
Date: Wed, 31 Aug 2011 15:58:08 -0500
Subject: [Numpy-discussion] non-uniform discrete sampling with given probabilities (w/ and w/o replacement)
In-Reply-To: 
References: 
Message-ID: 

On Wed, Aug 31, 2011 at 3:34 PM, wrote:
> On Wed, Aug 31, 2011 at 3:22 PM, Olivier Delalleau wrote:
>> 2011/8/31 Christopher Jordan-Squire
>>>
>>> On Wed, Aug 31, 2011 at 2:07 PM, Olivier Delalleau wrote:
>>> > You can use:
>>> > 1 + numpy.argmax(numpy.random.multinomial(1, [0.1, 0.2, 0.7]))
>>> >
>>> > For your "real" application you'll probably want to use a value >1 for
>>> > the
>>> > first parameter (equal to your sample size), instead of calling it
>>> > multiple
>>> > times.
>>> >
>>> > -=- Olivier
>>>
>>> Thanks. Warren (Weckesser) mentioned this possibility to me yesterday
>>> and I forgot to put it in my post. I assume you mean something like
>>>
>>> x = np.arange(3)
>>> y = np.random.multinomial(30, [0.1,0.2,0.7])
>>> z = np.repeat(x, y)
>>> np.random.shuffle(z)
>>>
>>> That look right?
>>>
>>> -Chris JS
>>>
>>
>> Yes, exactly.
>
> Chuck's answer to the same question, when I asked on the list, used
> searchsorted and is fast
>
> cdfvalues.searchsorted(np.random.random(size))
>
> my recent version of it for FiniteLatticeDistribution
>
>     def rvs(self, size=1):
>         '''draw random variables with shape given by size
>
>         '''
>         #w = self.pdfvalues
>         #p = cumsum(w)/float(w.sum())
>         #p.searchsorted(np.random.random(size))
>         return self.support[self.cdfvalues.searchsorted(np.random.random(size))]
>
> Josef
>

That's exactly what I needed. Thanks!

-Chris JS

>
>>
>> -=- Olivier
>>
>> _______________________________________________
>> NumPy-Discussion mailing list
>> NumPy-Discussion at scipy.org
>> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>>
>>
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
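
Pulling the thread's suggestions together: below is a minimal, numpy-only sketch
of the searchsorted (inverse-CDF) approach Josef describes for weighted sampling
with replacement, plus one simple, not especially fast, way to get weighted
sampling without replacement by zeroing out the drawn weight and renormalizing
before each subsequent draw. The helper names weighted_sample and
weighted_sample_norepl are illustrative only, not numpy API, and the code is a
sketch rather than a reference implementation. (Later numpy releases, 1.7 and up,
added numpy.random.choice, which covers both cases through its p= and replace=
arguments.)

import numpy as np

def weighted_sample(values, probs, size):
    """Weighted sampling with replacement via cumulative sum + searchsorted."""
    values = np.asarray(values)
    cdf = np.cumsum(probs, dtype=float)
    cdf /= cdf[-1]                      # renormalize in case probs don't sum to exactly 1
    u = np.random.random_sample(size)   # uniform draws in [0, 1)
    return values[cdf.searchsorted(u)]

def weighted_sample_norepl(values, probs, size):
    """Sequential weighted sampling without replacement (renormalize each step)."""
    values = np.asarray(values)
    probs = np.array(probs, dtype=float)
    if size > np.count_nonzero(probs):
        raise ValueError("not enough values with nonzero probability")
    picks = []
    for _ in range(size):
        cdf = np.cumsum(probs)
        cdf /= cdf[-1]
        idx = int(cdf.searchsorted(np.random.random_sample()))
        picks.append(values[idx])
        probs[idx] = 0.0                # this value cannot be drawn again
    return np.array(picks)

# Example with the skewed probabilities from the original post
# (the cumulative sum is renormalized, so they need not sum to exactly 1):
# weighted_sample([0, 1, 2], [0.999, 0.00005, 0.00005], size=10)
# weighted_sample_norepl([0, 1, 2], [0.999, 0.00005, 0.00005], size=2)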