From felix.hartmann at crans.org Mon Jul 1 08:11:15 2013 From: felix.hartmann at crans.org (=?UTF-8?B?RsOpbGl4?= Hartmann) Date: Mon, 1 Jul 2013 14:11:15 +0200 Subject: [Numpy-discussion] np.insert with axis=-1 Message-ID: <20130701141115.480c8103@artemis.nancy.inra.local> Hi all, I recently upgraded from Numpy 1.6.2 to 1.7.1 on my Debian testing, and then got a bug in a program that was previously working. It turned out that the problem comes from the np.insert function when the argument `axis=-1` is given. Here is a minimal example: >>> u = np.zeros((2,3,4)) >>> ui = np.ones((2,3)) >>> u = np.insert(u, 1, ui, axis=-1) The last line should be equivalent to >>> u = np.insert(u, 1, ui, axis=2) It was indeed the case in Numpy 1.6, but in 1.7.1 it raises a ValueError exception. Note that the problem seems specific to axis=-1, and not to all negative axis values, since the following example works as expected: >>> u = np.zeros((2,3,4)) >>> ui = np.ones((2,4)) >>> u = np.insert(u, 1, ui, axis=-2) # equivalent to axis=1 I didn't check on current master, so maybe things have changed since 1.7.1. If they have not, do you think a bug report would be relevant? Cheers, F?lix From sebastian at sipsolutions.net Mon Jul 1 11:54:36 2013 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Mon, 01 Jul 2013 17:54:36 +0200 Subject: [Numpy-discussion] np.insert with axis=-1 In-Reply-To: <20130701141115.480c8103@artemis.nancy.inra.local> References: <20130701141115.480c8103@artemis.nancy.inra.local> Message-ID: <1372694076.16404.2.camel@sebastian-laptop> On Mon, 2013-07-01 at 14:11 +0200, F?lix Hartmann wrote: > Hi all, > > I recently upgraded from Numpy 1.6.2 to 1.7.1 on my Debian testing, and > then got a bug in a program that was previously working. It turned out > that the problem comes from the np.insert function when the argument > `axis=-1` is given. > Dang, yes, its a pretty stupid bug, exists basically the same in both 1.7 and 1.8. If you got a minute, it is because of np.rollaxis usage, and in it there it says `axis-1` which is wrong for negative axes! Could you create a pull request to fix that? That would be great. - Sebastian > Here is a minimal example: > >>> u = np.zeros((2,3,4)) > >>> ui = np.ones((2,3)) > >>> u = np.insert(u, 1, ui, axis=-1) > > The last line should be equivalent to > >>> u = np.insert(u, 1, ui, axis=2) > > It was indeed the case in Numpy 1.6, but in 1.7.1 it raises a > ValueError exception. > > Note that the problem seems specific to axis=-1, and not to all negative > axis values, since the following example works as expected: > >>> u = np.zeros((2,3,4)) > >>> ui = np.ones((2,4)) > >>> u = np.insert(u, 1, ui, axis=-2) # equivalent to axis=1 > > I didn't check on current master, so maybe things have changed since > 1.7.1. If they have not, do you think a bug report would be relevant? 
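Until that fix lands, one interim workaround (just a sketch, not the eventual numpy fix) is to normalize negative axes by hand before calling np.insert:

import numpy as np

u = np.zeros((2, 3, 4))
ui = np.ones((2, 3))

# Convert the negative axis to its positive equivalent (-1 -> 2 for a
# 3-d array), which sidesteps the rollaxis problem described above.
axis = -1
u = np.insert(u, 1, ui, axis=axis % u.ndim)
print(u.shape)   # (2, 3, 5)
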
> > Cheers, > F?lix > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From sebastian at sipsolutions.net Mon Jul 1 12:04:18 2013 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Mon, 01 Jul 2013 18:04:18 +0200 Subject: [Numpy-discussion] np.insert with axis=-1 In-Reply-To: <1372694076.16404.2.camel@sebastian-laptop> References: <20130701141115.480c8103@artemis.nancy.inra.local> <1372694076.16404.2.camel@sebastian-laptop> Message-ID: <1372694658.16404.3.camel@sebastian-laptop> On Mon, 2013-07-01 at 17:54 +0200, Sebastian Berg wrote: > On Mon, 2013-07-01 at 14:11 +0200, F?lix Hartmann wrote: > > Hi all, > > > > I recently upgraded from Numpy 1.6.2 to 1.7.1 on my Debian testing, and > > then got a bug in a program that was previously working. It turned out > > that the problem comes from the np.insert function when the argument > > `axis=-1` is given. > > > > Dang, yes, its a pretty stupid bug, exists basically the same in both > 1.7 and 1.8. If you got a minute, it is because of np.rollaxis usage, > and in it there it says `axis-1` which is wrong for negative axes! > That is axis + 1 of course... > Could you create a pull request to fix that? That would be great. > > - Sebastian > > > Here is a minimal example: > > >>> u = np.zeros((2,3,4)) > > >>> ui = np.ones((2,3)) > > >>> u = np.insert(u, 1, ui, axis=-1) > > > > The last line should be equivalent to > > >>> u = np.insert(u, 1, ui, axis=2) > > > > It was indeed the case in Numpy 1.6, but in 1.7.1 it raises a > > ValueError exception. > > > > Note that the problem seems specific to axis=-1, and not to all negative > > axis values, since the following example works as expected: > > >>> u = np.zeros((2,3,4)) > > >>> ui = np.ones((2,4)) > > >>> u = np.insert(u, 1, ui, axis=-2) # equivalent to axis=1 > > > > I didn't check on current master, so maybe things have changed since > > 1.7.1. If they have not, do you think a bug report would be relevant? > > > > Cheers, > > F?lix > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From lists at onerussian.com Mon Jul 1 15:30:06 2013 From: lists at onerussian.com (Yaroslav Halchenko) Date: Mon, 1 Jul 2013 15:30:06 -0400 Subject: [Numpy-discussion] Really cruel draft of vbench setup for NumPy (.add.reduce benchmarks since 2011) In-Reply-To: References: <20130506143241.GV5140@onerussian.com> <1367856232.2506.31.camel@sebastian-laptop> <20130506161153.GW5140@onerussian.com> <1367927238.23010.12.camel@sebastian-laptop> Message-ID: <20130701193006.GC27621@onerussian.com> Hi Guys, not quite the recommendations you expressed, but here is my ugly attempt to improve benchmarks coverage: http://www.onerussian.com/tmp/numpy-vbench-20130701/index.html initially I also ran those ufunc benchmarks per each dtype separately, but then resulting webpage is loong which brings my laptop on its knees by firefox. So I commented those out for now, and left only "summary" ones across multiple datatypes. There is a bug in sphinx which forbids embedding some figures for vb_random "as is", so pardon that for now... 
I have not set cpu affinity of the process (but ran it at nice -10), so may be that also contributed to variance of benchmark estimates. And there probably could be more of goodies (e.g. gc control etc) to borrow from https://github.com/pydata/pandas/blob/master/vb_suite/test_perf.py which I have just discovered to minimize variance. nothing really interesting was pin-pointed so far, besides that - svd became a bit faster since few months back ;-) http://www.onerussian.com/tmp/numpy-vbench-20130701/vb_vb_linalg.html - isnan (and isinf, isfinite) got improved http://www.onerussian.com/tmp/numpy-vbench-20130701/vb_vb_ufunc.html#numpy-isnan-a-10types - right_shift got a miniscule slowdown from what it used to be? http://www.onerussian.com/tmp/numpy-vbench-20130701/vb_vb_ufunc.html#numpy-right-shift-a-a-3types As before -- current code of those benchmarks collection is available at http://github.com/yarikoptic/numpy-vbench/pull/new/master if you have specific snippets you would like to benchmark -- just state them here or send a PR -- I will add them in. Cheers, On Tue, 07 May 2013, Da?id wrote: > On 7 May 2013 13:47, Sebastian Berg wrote: > > Indexing/assignment was the first thing I thought of too (also because > > fancy indexing/assignment really could use some speedups...). Other then > > that maybe some timings for small arrays/scalar math, but that might be > > nice for that GSoC project. > Why not going bigger? Ufunc operations on big arrays, CPU and memory bound. > Also, what about interfacing with other packages? It may increase the > compiling overhead, but I would like to see Cython in action (say, > only last version, maybe it can be fixed). > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion -- Yaroslav O. Halchenko, Ph.D. http://neuro.debian.net http://www.pymvpa.org http://www.fail2ban.org Senior Research Associate, Psychological and Brain Sciences Dept. Dartmouth College, 419 Moore Hall, Hinman Box 6207, Hanover, NH 03755 Phone: +1 (603) 646-9834 Fax: +1 (603) 646-1419 WWW: http://www.linkedin.com/in/yarik From lists at onerussian.com Mon Jul 1 17:58:05 2013 From: lists at onerussian.com (Yaroslav Halchenko) Date: Mon, 1 Jul 2013 17:58:05 -0400 Subject: [Numpy-discussion] Really cruel draft of vbench setup for NumPy (.add.reduce benchmarks since 2011) In-Reply-To: <20130701193006.GC27621@onerussian.com> References: <20130506143241.GV5140@onerussian.com> <1367856232.2506.31.camel@sebastian-laptop> <20130506161153.GW5140@onerussian.com> <1367927238.23010.12.camel@sebastian-laptop> <20130701193006.GC27621@onerussian.com> Message-ID: <20130701215804.GG27621@onerussian.com> FWIW -- updated plots with contribution from Julian Taylor http://www.onerussian.com/tmp/numpy-vbench-20130701/vb_vb_indexing.html#mmap-slicing ;-) On Mon, 01 Jul 2013, Yaroslav Halchenko wrote: > Hi Guys, > not quite the recommendations you expressed, but here is my ugly > attempt to improve benchmarks coverage: > http://www.onerussian.com/tmp/numpy-vbench-20130701/index.html > initially I also ran those ufunc benchmarks per each dtype separately, > but then resulting webpage is loong which brings my laptop on its knees > by firefox. So I commented those out for now, and left only "summary" > ones across multiple datatypes. > There is a bug in sphinx which forbids embedding some figures for > vb_random "as is", so pardon that for now... 
> I have not set cpu affinity of the process (but ran it at nice -10), so may be > that also contributed to variance of benchmark estimates. And there probably > could be more of goodies (e.g. gc control etc) to borrow from > https://github.com/pydata/pandas/blob/master/vb_suite/test_perf.py which I have > just discovered to minimize variance. > nothing really interesting was pin-pointed so far, besides that > - svd became a bit faster since few months back ;-) > http://www.onerussian.com/tmp/numpy-vbench-20130701/vb_vb_linalg.html > - isnan (and isinf, isfinite) got improved > http://www.onerussian.com/tmp/numpy-vbench-20130701/vb_vb_ufunc.html#numpy-isnan-a-10types > - right_shift got a miniscule slowdown from what it used to be? > http://www.onerussian.com/tmp/numpy-vbench-20130701/vb_vb_ufunc.html#numpy-right-shift-a-a-3types > As before -- current code of those benchmarks collection is available > at http://github.com/yarikoptic/numpy-vbench/pull/new/master > if you have specific snippets you would like to benchmark -- just state them > here or send a PR -- I will add them in. > Cheers, > On Tue, 07 May 2013, Da?id wrote: > > On 7 May 2013 13:47, Sebastian Berg wrote: > > > Indexing/assignment was the first thing I thought of too (also because > > > fancy indexing/assignment really could use some speedups...). Other then > > > that maybe some timings for small arrays/scalar math, but that might be > > > nice for that GSoC project. > > Why not going bigger? Ufunc operations on big arrays, CPU and memory bound. > > Also, what about interfacing with other packages? It may increase the > > compiling overhead, but I would like to see Cython in action (say, > > only last version, maybe it can be fixed). > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > http://mail.scipy.org/mailman/listinfo/numpy-discussion -- Yaroslav O. Halchenko, Ph.D. http://neuro.debian.net http://www.pymvpa.org http://www.fail2ban.org Senior Research Associate, Psychological and Brain Sciences Dept. Dartmouth College, 419 Moore Hall, Hinman Box 6207, Hanover, NH 03755 Phone: +1 (603) 646-9834 Fax: +1 (603) 646-1419 WWW: http://www.linkedin.com/in/yarik From mdroe at stsci.edu Tue Jul 2 12:39:27 2013 From: mdroe at stsci.edu (Michael Droettboom) Date: Tue, 2 Jul 2013 12:39:27 -0400 Subject: [Numpy-discussion] matplotlib user survey 2013 Message-ID: <51D3023F.3020108@stsci.edu> [Apologies for cross-posting] The matplotlib developers want to hear from you! We are conducting a user survey to determine how and where matplotlib is being used in order to focus its further development. This should only take a couple of minutes. To fill it out, visit: https://docs.google.com/spreadsheet/viewform?fromEmail=true&formkey=dHpQS25pcTZIRWdqX0pNckNSU01sMHc6MQ Please forward to your colleagues, particularly those who don't read these mailing lists. Cheers, Michael Droettboom, and the matplotlib team -------------- next part -------------- An HTML attachment was scrubbed... URL: From jjstickel at gmail.com Tue Jul 2 17:07:14 2013 From: jjstickel at gmail.com (Jonathan Stickel) Date: Tue, 02 Jul 2013 15:07:14 -0600 Subject: [Numpy-discussion] trouble with numpy.float64 multiplied with cvxopt.matrix Message-ID: <51D34102.2080507@gmail.com> I recently ran into some trouble with multiplying scalar variables of type numpy.float64 with cvxopt matrices. The cvxopt matrix is converted to a numpy array. 
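A minimal illustration of the behaviour (a sketch only; it assumes cvxopt is installed and uses the A*s ordering discussed just below):

import numpy as np
from cvxopt import matrix   # assumes cvxopt is installed

A = matrix(1.0, (2, 2))     # 2x2 cvxopt matrix of ones
s = np.float64(2.0)

print(type(s * A))   # numpy ndarray: the numpy scalar takes over and converts A
print(type(A * s))   # cvxopt matrix: the workaround ordering mentioned below
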
I first reported this at the cvxopt google groups, where I give an example: https://groups.google.com/forum/#!topic/cvxopt/4suFNOY75E4 The response I got is that it would be difficult to correct this in cvxopt, and that a workaround would be to do A*s rather than s*A (where s is the scalar of type numpy.float64 and A is the cvxopt matrix). I inferred from this that the leading object takes charge of how operator works, including the type conversion. Now, I don't know much about low-level programming for operators and type conversion, but I want to ask whether this should be considered a bug or whether correcting this might be a reasonable feature request in numpy. It seems to me that scalars of any numpy type, when operated with other objects, should not change the high-level type of those objects. Thanks, Jonathan From brad.froehle at gmail.com Tue Jul 2 23:44:19 2013 From: brad.froehle at gmail.com (Bradley M. Froehle) Date: Tue, 2 Jul 2013 20:44:19 -0700 Subject: [Numpy-discussion] Fancy indexing oddity Message-ID: A colleague just showed me this indexing behavior and I was at a loss to explain what was going on. Can anybody else chime in and help me understand this indexing behavior? >>> import numpy as np >>> np.__version__ '1.7.1' >>> A = np.ones((2,3,5)) >>> mask = np.array([True]*4 + [False], dtype=bool) >>> A.shape (2, 3, 5) >>> A[:,:,mask].shape (2, 3, 4) >>> A[:,1,mask].shape (2, 4) >>> A[1,:,mask].shape (4, 3) # Why is this not (3, 4)? >>> A[1][:,mask].shape (3, 4) Thanks! Brad From sebastian at sipsolutions.net Wed Jul 3 03:52:16 2013 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Wed, 03 Jul 2013 09:52:16 +0200 Subject: [Numpy-discussion] Fancy indexing oddity In-Reply-To: References: Message-ID: <1372837936.7112.10.camel@sebastian-laptop> On Tue, 2013-07-02 at 20:44 -0700, Bradley M. Froehle wrote: > A colleague just showed me this indexing behavior and I was at a loss > to explain what was going on. Can anybody else chime in and help me > understand this indexing behavior? > > >>> import numpy as np > >>> np.__version__ > '1.7.1' > >>> A = np.ones((2,3,5)) > >>> mask = np.array([True]*4 + [False], dtype=bool) > >>> A.shape > (2, 3, 5) > >>> A[:,:,mask].shape > (2, 3, 4) > >>> A[:,1,mask].shape > (2, 4) > >>> A[1,:,mask].shape > (4, 3) # Why is this not (3, 4)? > >>> A[1][:,mask].shape > (3, 4) > Numpy has slicing and fancy indexing. But scalars are both. They are fancy indexes, but they do not trigger fancy indexing (you could also add the special case of a scalar result to this, but it doesn't matter for this)! Implementation wise mixed fancy indexing/slicing is a multi step process: 1. Evaluate the slices. (no surprises here) 2. Evaluate the fancy indexing moving the new axes to the *front*. Here this means A[1,:,mask] -> A.transpose(2,0,1) then combining all fancy indexes so that A.shape goes from (2, 3, 5) via transpose (5,2,3) to (4,3), since the combination of 1 and mask gives a 1-d result with 4 entries. 3. If and only if all fancy indexes were consecutive, i.e. A[:,1,mask], A[mask,[[3]],:], numpy can basically guess where it would make sense to put the fancy axes. So it transposes it back. This is what makes a single fancy index behave like a slice. Now in your example A[1,:,mask] is *not* consecutive (remember scalars are fancy in this regard), so the fancy axis goes to the front instead of going to where "mask" was. In short, the resulting axes from the fancy indices is at the front if the fancy indices are not consecutive. 
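A quick check of that rule, using the shapes from the original example (nothing new here, just the cases side by side):

import numpy as np

A = np.ones((2, 3, 5))
mask = np.array([True] * 4 + [False], dtype=bool)

# scalar and boolean index separated by a slice -> not consecutive,
# so the broadcast fancy-index axis goes to the front:
print(A[1, :, mask].shape)    # (4, 3)

# scalar and boolean index next to each other -> consecutive,
# so the fancy axis stays where the indexed axes were:
print(A[:, 1, mask].shape)    # (2, 4)

# applying the scalar first leaves a single fancy index behind:
print(A[1][:, mask].shape)    # (3, 4)
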
And since scalars are considered fancy in this regard they are not consecutive in your example. - Sebastian > Thanks! > Brad > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From thomas.robitaille at gmail.com Thu Jul 4 09:06:49 2013 From: thomas.robitaille at gmail.com (Thomas Robitaille) Date: Thu, 4 Jul 2013 15:06:49 +0200 Subject: [Numpy-discussion] Equality not working as expected with ndarray sub-class Message-ID: Hi everyone, The following example: import numpy as np class SimpleArray(np.ndarray): __array_priority__ = 10000 def __new__(cls, input_array, info=None): return np.asarray(input_array).view(cls) def __eq__(self, other): return False a = SimpleArray(10) print (np.int64(10) == a) print (a == np.int64(10)) gives the following output $ python2.7 eq.py True False so that in the first case, SimpleArray.__eq__ is not called. Is this a bug, and if so, can anyone think of a workaround? If this is expected behavior, how do I ensure SimpleArray.__eq__ gets called in both cases? Thanks, Tom ps: cross-posting to stackoverflow From nouiz at nouiz.org Thu Jul 4 09:09:37 2013 From: nouiz at nouiz.org (=?ISO-8859-1?Q?Fr=E9d=E9ric_Bastien?=) Date: Thu, 4 Jul 2013 09:09:37 -0400 Subject: [Numpy-discussion] Equality not working as expected with ndarray sub-class In-Reply-To: References: Message-ID: Hi, __array__priority wasn't checked for ==, !=, <, <=, >, >= operation. I added it in the development version and someone else back-ported it to the 1.7.X branch. So this will work with the next release of numpy. I don't know of a workaround until the next release. Fred On Thu, Jul 4, 2013 at 9:06 AM, Thomas Robitaille < thomas.robitaille at gmail.com> wrote: > Hi everyone, > > The following example: > > import numpy as np > > class SimpleArray(np.ndarray): > > __array_priority__ = 10000 > > def __new__(cls, input_array, info=None): > return np.asarray(input_array).view(cls) > > def __eq__(self, other): > return False > > a = SimpleArray(10) > print (np.int64(10) == a) > print (a == np.int64(10)) > > gives the following output > > $ python2.7 eq.py > True > False > > so that in the first case, SimpleArray.__eq__ is not called. Is this a > bug, and if so, can anyone think of a workaround? If this is expected > behavior, how do I ensure SimpleArray.__eq__ gets called in both > cases? > > Thanks, > Tom > > ps: cross-posting to stackoverflow > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From sebastian at sipsolutions.net Thu Jul 4 09:12:16 2013 From: sebastian at sipsolutions.net (sebastian) Date: Thu, 04 Jul 2013 15:12:16 +0200 Subject: [Numpy-discussion] Equality not working as expected with ndarray sub-class In-Reply-To: References: Message-ID: On 2013-07-04 15:06, Thomas Robitaille wrote: > Hi everyone, > > The following example: > > import numpy as np > > class SimpleArray(np.ndarray): > > __array_priority__ = 10000 > > def __new__(cls, input_array, info=None): > return np.asarray(input_array).view(cls) > > def __eq__(self, other): > return False > > a = SimpleArray(10) > print (np.int64(10) == a) > print (a == np.int64(10)) > > gives the following output > > $ python2.7 eq.py > True > False > > so that in the first case, SimpleArray.__eq__ is not called. 
Is this a > bug, and if so, can anyone think of a workaround? If this is expected > behavior, how do I ensure SimpleArray.__eq__ gets called in both > cases? > This should be working in all development versions. I.e. NumPy >1.7.2 (which is not released yet). - Sebastian > Thanks, > Tom > > ps: cross-posting to stackoverflow > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From nouiz at nouiz.org Thu Jul 4 09:22:56 2013 From: nouiz at nouiz.org (=?ISO-8859-1?Q?Fr=E9d=E9ric_Bastien?=) Date: Thu, 4 Jul 2013 09:22:56 -0400 Subject: [Numpy-discussion] Equality not working as expected with ndarray sub-class In-Reply-To: References: Message-ID: On Thu, Jul 4, 2013 at 9:12 AM, sebastian wrote: > On 2013-07-04 15:06, Thomas Robitaille wrote: > > Hi everyone, > > > > The following example: > > > > import numpy as np > > > > class SimpleArray(np.ndarray): > > > > __array_priority__ = 10000 > > > > def __new__(cls, input_array, info=None): > > return np.asarray(input_array).view(cls) > > > > def __eq__(self, other): > > return False > > > > a = SimpleArray(10) > > print (np.int64(10) == a) > > print (a == np.int64(10)) > > > > gives the following output > > > > $ python2.7 eq.py > > True > > False > > > > so that in the first case, SimpleArray.__eq__ is not called. Is this a > > bug, and if so, can anyone think of a workaround? If this is expected > > behavior, how do I ensure SimpleArray.__eq__ gets called in both > > cases? > > > > This should be working in all development versions. I.e. NumPy >1.7.2 > (which is not released yet). > I think you mean: NumPy >= 1.7.2 Fred -------------- next part -------------- An HTML attachment was scrubbed... URL: From jtaylor.debian at googlemail.com Thu Jul 4 14:15:12 2013 From: jtaylor.debian at googlemail.com (Julian Taylor) Date: Thu, 04 Jul 2013 20:15:12 +0200 Subject: [Numpy-discussion] Reducing the rounding error of np.sum Message-ID: <51D5BBB0.5060505@googlemail.com> hi, numpys implementation of sum is just a simple: for i in d: sum += d[i] this suffers from rather high rounding errors in the order of the d.size * epsilon. Consider: (np.ones(50000) / 10.).sum() 5000.0000000006585 There are numerous algorithms which reduce the error of this operation. E.g. python implements one in math.fsum which is accurate but slow compared to np.sum [0]. Numpy is currently lacking a precise summation, but I think it would make sense to add one. The question is whether we go with the python approach of adding a new function which is slower but more precise or if we even change the default summation algorithm (or do nothing :) ). For a new function I guess the method used in python itself makes sense, its probably well chosen by the python developers (though I did not lookup the rational for the choice yet). For replacing the default, two algorithms come to my mind, pairwise summation [1] and kahan summation (compensated sum) [2]. pairwise summation adds in pairs so usually the magnitude of the two operands is the same magnitude, this produces an error of O(log n * epsilon) for the common case. This algorithm has the advantage that it is almost as fast as the naive sum with an reasonable error. Problematic might be the buffering numpy does when reducing, this would limit the error reduction to the buffer size. kahan summation adds some extra operations to recover the rounding errors. This results in an error of o(epsilon). 
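A scalar Python sketch of the compensated sum (for illustration only -- the real thing would live in the C inner loop):

import numpy as np

def kahan_sum(a):
    s = 0.0
    c = 0.0                  # running compensation for lost low-order bits
    for x in np.asarray(a, dtype=float).ravel():
        y = x - c            # apply the correction from the previous step
        t = s + y
        c = (t - s) - y      # what got rounded away in s + y
        s = t
    return s

print(np.sum(np.ones(50000) / 10.))     # 5000.0000000006585
print(kahan_sum(np.ones(50000) / 10.))  # ~5000.0
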
It is four times slower than the naive summation but this can be (partially) compensated by vectorizing it. It has the advantage of lower error, simpler implementation and buffering does not interfere. I did prototype implementations (only unit strides) of both in [3, 4]. Any thoughts on this? [0] http://docs.python.org/2/library/math.html#number-theoretic-and-representation-functions [1] http://en.wikipedia.org/wiki/Pairwise_summationg [2] http://en.wikipedia.org/wiki/Kahan_summation_algorithm [3] https://github.com/juliantaylor/numpy/tree/pairwise [4] https://github.com/juliantaylor/numpy/tree/kahan From deil.christoph at googlemail.com Thu Jul 4 14:36:07 2013 From: deil.christoph at googlemail.com (Christoph Deil) Date: Thu, 4 Jul 2013 20:36:07 +0200 Subject: [Numpy-discussion] Reducing the rounding error of np.sum In-Reply-To: <51D5BBB0.5060505@googlemail.com> References: <51D5BBB0.5060505@googlemail.com> Message-ID: <2144B798-CFF8-4E37-B713-865C5965C748@gmail.com> On Jul 4, 2013, at 8:15 PM, Julian Taylor wrote: > hi, > numpys implementation of sum is just a simple: > for i in d: > sum += d[i] > > this suffers from rather high rounding errors in the order of the d.size > * epsilon. Consider: > (np.ones(50000) / 10.).sum() > 5000.0000000006585 > > There are numerous algorithms which reduce the error of this operation. > E.g. python implements one in math.fsum which is accurate but slow > compared to np.sum [0]. > > Numpy is currently lacking a precise summation, but I think it would > make sense to add one. > The question is whether we go with the python approach of adding a new > function which is slower but more precise or if we even change the > default summation algorithm (or do nothing :) ). > > For a new function I guess the method used in python itself makes sense, > its probably well chosen by the python developers (though I did not > lookup the rational for the choice yet). > > For replacing the default, two algorithms come to my mind, pairwise > summation [1] and kahan summation (compensated sum) [2]. > pairwise summation adds in pairs so usually the magnitude of the two > operands is the same magnitude, this produces an error of O(log n * > epsilon) for the common case. > This algorithm has the advantage that it is almost as fast as the naive > sum with an reasonable error. > Problematic might be the buffering numpy does when reducing, this would > limit the error reduction to the buffer size. > > kahan summation adds some extra operations to recover the rounding > errors. This results in an error of o(epsilon). > It is four times slower than the naive summation but this can be > (partially) compensated by vectorizing it. > It has the advantage of lower error, simpler implementation and > buffering does not interfere. > > I did prototype implementations (only unit strides) of both in [3, 4]. > > Any thoughts on this? 
In case you are not aware, there has been some discussion on how numerically stable sum could be added to numpy here: https://github.com/numpy/numpy/issues/2448 > > > [0] > http://docs.python.org/2/library/math.html#number-theoretic-and-representation-functions > [1] http://en.wikipedia.org/wiki/Pairwise_summationg > [2] http://en.wikipedia.org/wiki/Kahan_summation_algorithm > [3] https://github.com/juliantaylor/numpy/tree/pairwise > [4] https://github.com/juliantaylor/numpy/tree/kahan > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From charlesr.harris at gmail.com Thu Jul 4 15:33:43 2013 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 4 Jul 2013 13:33:43 -0600 Subject: [Numpy-discussion] Reducing the rounding error of np.sum In-Reply-To: <51D5BBB0.5060505@googlemail.com> References: <51D5BBB0.5060505@googlemail.com> Message-ID: On Thu, Jul 4, 2013 at 12:15 PM, Julian Taylor < jtaylor.debian at googlemail.com> wrote: > hi, > numpys implementation of sum is just a simple: > for i in d: > sum += d[i] > > this suffers from rather high rounding errors in the order of the d.size > * epsilon. Consider: > (np.ones(50000) / 10.).sum() > 5000.0000000006585 > > There are numerous algorithms which reduce the error of this operation. > E.g. python implements one in math.fsum which is accurate but slow > compared to np.sum [0]. > > Numpy is currently lacking a precise summation, but I think it would > make sense to add one. > The question is whether we go with the python approach of adding a new > function which is slower but more precise or if we even change the > default summation algorithm (or do nothing :) ). > > For a new function I guess the method used in python itself makes sense, > its probably well chosen by the python developers (though I did not > lookup the rational for the choice yet). > > For replacing the default, two algorithms come to my mind, pairwise > summation [1] and kahan summation (compensated sum) [2]. > pairwise summation adds in pairs so usually the magnitude of the two > operands is the same magnitude, this produces an error of O(log n * > epsilon) for the common case. > This algorithm has the advantage that it is almost as fast as the naive > sum with an reasonable error. > Problematic might be the buffering numpy does when reducing, this would > limit the error reduction to the buffer size. > > kahan summation adds some extra operations to recover the rounding > errors. This results in an error of o(epsilon). > It is four times slower than the naive summation but this can be > (partially) compensated by vectorizing it. > It has the advantage of lower error, simpler implementation and > buffering does not interfere. > > I did prototype implementations (only unit strides) of both in [3, 4]. > > Any thoughts on this? > > > [0] > > http://docs.python.org/2/library/math.html#number-theoretic-and-representation-functions > [1] http://en.wikipedia.org/wiki/Pairwise_summationg > [2] http://en.wikipedia.org/wiki/Kahan_summation_algorithm > [3] https://github.com/juliantaylor/numpy/tree/pairwise > [4] https://github.com/juliantaylor/numpy/tree/kahan > I think this would be useful as part of a bigger package that included accurate mean, var, and std. In particular, the need for more accurate means and variances have been discussed on the list before, but no one has stepped forward to do anything about them. 
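For reference, the pairwise scheme Julian describes is also easy to sketch in pure Python (again illustration only; the proposed implementation would work on the C buffers):

import numpy as np

def pairwise_sum(a, blocksize=128):
    # Below the block size, fall back to plain summation; above it,
    # split in half so the operands of each addition have similar magnitude.
    a = np.asarray(a, dtype=float).ravel()
    if a.size <= blocksize:
        return a.sum()
    mid = a.size // 2
    return pairwise_sum(a[:mid], blocksize) + pairwise_sum(a[mid:], blocksize)

print(pairwise_sum(np.ones(50000) / 10.))   # ~5000.0, error grows only like log(n)
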
Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From matti.picus at gmail.com Thu Jul 4 15:43:20 2013 From: matti.picus at gmail.com (Matti Picus) Date: Thu, 04 Jul 2013 22:43:20 +0300 Subject: [Numpy-discussion] subtypes of ndarray and round() Message-ID: <51D5D058.9090502@gmail.com> round() does not consistently preserve subtype of the ndarray, is this known behaviour or should I file a bug for it? Python 2.7.3 (default, Sep 26 2012, 21:51:14) [GCC 4.7.2] on linux2 Type "help", "copyright", "credits" or "license" for more information. >>> import numpy as np >>> np.version.version '1.7.0' >>> a=np.matrix(range(10)) >>> a.round(decimals=10) matrix([[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]]) >>> a.round(decimals=-10) array([[0, 0, 0, 0, 0, 0, 0, 0, 0, 0]]) Matti From daschaich at gmail.com Thu Jul 4 20:01:17 2013 From: daschaich at gmail.com (David Schaich) Date: Thu, 04 Jul 2013 18:01:17 -0600 Subject: [Numpy-discussion] Covariance matrix from polyfit Message-ID: <51D60CCD.5090604@gmail.com> Hi all, I recently adopted python, and am in the process of replacing my old analysis tools. For simple (e.g., linear) interpolations and extrapolations, in the past I used gnuplot. Today I set up the equivalent with polyfit in numpy v1.7.1, first running a simple test to reproduce the gnuplot result. A discussion on this list back in February alerted me that I should use 1/sigma for the weights in polyfit as opposed to 1/sigma**2. Fine -- that's not what I'm used to, but I can make a note. http://mail.scipy.org/pipermail/numpy-discussion/2013-February/065649.html Another issue mentioned in that thread is scaling the covariance matrix by fac = resids / (len(x) - order - 2.0) This wreaked havoc on the simple test I mentioned above (and include below), which fit three data points to a straight line. I spent hours trying to figure out why numpy returned negative variances, before tracking down this line. And, indeed, if I add a fake fourth data point, I end up with inf's and nan's. There is some lengthy justification for subtracting that 2 around line 590 in lib/polynomial.py. Fine -- it's nothing I recall seeing before (and I removed it from my local installation), but I'm just a new user. However, I do think it is important to fix polyfit so that it doesn't produce pathological results like those I encountered today. Here are a couple of possibilities that would let the subtraction of 2 remain: * Check whether len(x) > order + 2, and if it is not, either ** Die with an error ** Scale by resids / (len(x) - order) instead of resids / (len(x) - order - 2.0) * Don't bother with this scaling at all, leaving it to the users (who can subtract 2 if they want). This is what scipy.optimize.leastsq does, after what seems to be a good deal of discussion: "This matrix must be multiplied by the residual variance to get the covariance of the parameter estimates ? see curve_fit." http://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.leastsq.html https://github.com/scipy/scipy/pull/448 I leave it to those of you with more numpy experience to decide what would be the best way to go. 
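A sketch of getting the covariance directly from the weighted design matrix (this mirrors what gnuplot and scipy.optimize.leastsq do in the examples further down, with len(x) - deg - 1 degrees of freedom; illustration only, not polyfit's internals):

import numpy as np

m = np.array([0.008, 0.01, 0.015])
dat = np.array([1.0822582, 1.0805417, 1.0766624])
sigma = np.array([0.000370, 0.000355, 0.000249])
deg = 1

A = np.vander(m, deg + 1) / sigma[:, np.newaxis]   # weighted design matrix, columns [x, 1]
b = dat / sigma

coef = np.linalg.lstsq(A, b)[0]
chi2_dof = ((b - A.dot(coef)) ** 2).sum() / (len(m) - (deg + 1))
cov = np.linalg.inv(A.T.dot(A)) * chi2_dof          # scale by the residual variance

print(coef)                   # ~[-0.7927, 1.0885]
print(np.sqrt(np.diag(cov)))  # ~[0.0153, 0.00019], matching the gnuplot errors below
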
Cheers, David http://www-hep.colorado.edu/~schaich/ +++ Here's the simple example: >>> import numpy as np >>> m = np.array([0.008, 0.01, 0.015]) >>> dat = np.array([1.0822582, 1.0805417, 1.0766624]) >>> weight = np.array([1/0.000370, 1/0.000355, 1/0.000249]) >>> out, cov = np.polyfit(m, dat, 1, full=False, w=weight, cov=True) >>> print out, '\n', cov [-0.79269957 1.08854252] [[ -2.34965006e-04 2.84428412e-06] [ 2.84428412e-06 -3.66283662e-08]] >>> print np.sqrt(-1. * cov[0][0]) 0.0153285682895 >>> print np.sqrt(-1. * cov[1][1]) 0.000191385386578 +++ Gnuplot gives +++ Final set of parameters Asymptotic Standard Error ======================= ========================== A = -0.792719 +/- 0.01533 (1.934%) B = 1.08854 +/- 0.0001914 (0.01758%) +++ so up to the negative sign, all is good. For its part, scipy.optimize.leastsq needs me to do the scaling: +++ >>> import numpy as np >>> from scipy import optimize >>> m = np.array([0.008, 0.01, 0.015]) >>> dat = np.array([1.0822582, 1.0805417, 1.0766624]) >>> err = np.array([0.000370, 0.000355, 0.000249]) >>> linear = lambda p, x: p[0] * x + p[1] >>> errfunc = lambda p, x, y, err: (linear(p, x) - y) / err >>> p_in = [-1., 1.] >>> all_out = optimize.leastsq(errfunc, p_in[:], args=(m, dat, err), full_output = 1) >>> out = all_out[0] >>> cov = all_out[1] >>> print out, '\n', cov [-0.79269959 1.08854252] [[ 3.40800756e-03 -4.12544212e-05] [ -4.12544212e-05 5.31270007e-07]] >>> chiSq_dof = ((errfunc(out, m, dat, err))**2).sum() / (len(m) - len(out)) >>> cov *= chiSq_dof >>> print cov [[ 2.34964190e-04 -2.84427528e-06] [ -2.84427528e-06 3.66282716e-08]] +++ From josef.pktd at gmail.com Thu Jul 4 21:40:17 2013 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 4 Jul 2013 21:40:17 -0400 Subject: [Numpy-discussion] Covariance matrix from polyfit In-Reply-To: <51D60CCD.5090604@gmail.com> References: <51D60CCD.5090604@gmail.com> Message-ID: On Thu, Jul 4, 2013 at 8:01 PM, David Schaich wrote: > Hi all, > > I recently adopted python, and am in the process of replacing my old > analysis tools. For simple (e.g., linear) interpolations and > extrapolations, in the past I used gnuplot. Today I set up the > equivalent with polyfit in numpy v1.7.1, first running a simple test to > reproduce the gnuplot result. > > A discussion on this list back in February alerted me that I should use > 1/sigma for the weights in polyfit as opposed to 1/sigma**2. Fine -- > that's not what I'm used to, but I can make a note. > http://mail.scipy.org/pipermail/numpy-discussion/2013-February/065649.html > > Another issue mentioned in that thread is scaling the covariance matrix by > fac = resids / (len(x) - order - 2.0) > This wreaked havoc on the simple test I mentioned above (and include > below), which fit three data points to a straight line. I spent hours > trying to figure out why numpy returned negative variances, before > tracking down this line. And, indeed, if I add a fake fourth data point, > I end up with inf's and nan's. > > There is some lengthy justification for subtracting that 2 around line > 590 in lib/polynomial.py. Fine -- it's nothing I recall seeing before > (and I removed it from my local installation), but I'm just a new user. > > However, I do think it is important to fix polyfit so that it doesn't > produce pathological results like those I encountered today. 
Here are a > couple of possibilities that would let the subtraction of 2 remain: > * Check whether len(x) > order + 2, and if it is not, either > ** Die with an error > ** Scale by resids / (len(x) - order) instead of resids / (len(x) - > order - 2.0) I would throw out the -2, or at least make it optional like `ddof`. (It's not in the docstring AFAICS) returning a negative (!) definite covariance matrix is definitely a bug. (should return nan or raise exception) my 1.5 cents Josef > > * Don't bother with this scaling at all, leaving it to the users (who > can subtract 2 if they want). This is what scipy.optimize.leastsq does, > after what seems to be a good deal of discussion: "This matrix must be > multiplied by the residual variance to get the covariance of the > parameter estimates ? see curve_fit." > http://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.leastsq.html > https://github.com/scipy/scipy/pull/448 > > I leave it to those of you with more numpy experience to decide what > would be the best way to go. > > Cheers, > David > http://www-hep.colorado.edu/~schaich/ > > > +++ > Here's the simple example: > >>> import numpy as np > >>> m = np.array([0.008, 0.01, 0.015]) > >>> dat = np.array([1.0822582, 1.0805417, 1.0766624]) > >>> weight = np.array([1/0.000370, 1/0.000355, 1/0.000249]) > >>> out, cov = np.polyfit(m, dat, 1, full=False, w=weight, cov=True) > >>> print out, '\n', cov > [-0.79269957 1.08854252] > [[ -2.34965006e-04 2.84428412e-06] > [ 2.84428412e-06 -3.66283662e-08]] > >>> print np.sqrt(-1. * cov[0][0]) > 0.0153285682895 > >>> print np.sqrt(-1. * cov[1][1]) > 0.000191385386578 > +++ > > Gnuplot gives > +++ > Final set of parameters Asymptotic Standard Error > ======================= ========================== > A = -0.792719 +/- 0.01533 (1.934%) > B = 1.08854 +/- 0.0001914 (0.01758%) > +++ > so up to the negative sign, all is good. > > For its part, scipy.optimize.leastsq needs me to do the scaling: > +++ > >>> import numpy as np > >>> from scipy import optimize > >>> m = np.array([0.008, 0.01, 0.015]) > >>> dat = np.array([1.0822582, 1.0805417, 1.0766624]) > >>> err = np.array([0.000370, 0.000355, 0.000249]) > >>> linear = lambda p, x: p[0] * x + p[1] > >>> errfunc = lambda p, x, y, err: (linear(p, x) - y) / err > >>> p_in = [-1., 1.] > >>> all_out = optimize.leastsq(errfunc, p_in[:], args=(m, dat, err), > full_output = 1) > >>> out = all_out[0] > >>> cov = all_out[1] > >>> print out, '\n', cov > [-0.79269959 1.08854252] > [[ 3.40800756e-03 -4.12544212e-05] > [ -4.12544212e-05 5.31270007e-07]] > >>> chiSq_dof = ((errfunc(out, m, dat, err))**2).sum() / (len(m) - > len(out)) > >>> cov *= chiSq_dof > >>> print cov > [[ 2.34964190e-04 -2.84427528e-06] > [ -2.84427528e-06 3.66282716e-08]] > +++ > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From bakhtiyor_zokhidov at mail.ru Fri Jul 5 04:20:28 2013 From: bakhtiyor_zokhidov at mail.ru (=?UTF-8?B?QmFraHRpeW9yIFpva2hpZG92?=) Date: Fri, 05 Jul 2013 12:20:28 +0400 Subject: [Numpy-discussion] =?utf-8?q?Unique=28=29_function_and_avoiding_L?= =?utf-8?q?oop?= Message-ID: <1373012428.804822615@f377.i.mail.ru> Hi everybody, I have a problem with sorting out the following function. What I expect is that I showed as an example below. Two problems are encountered to achieve the result: 1) The function sometimes can't not sort as expected: I showed an example for that below. 
2) I could not do vectorization to avoid loop. OR, Is there another way to solve that problem?? Thanks in advance Example: data = ['', 12, 12, 423, '1', 423, -32, 12, 721, 345]. Expected result:??[0, 12, 12, 423, 0, 423, -32, 12, 721, 345],? here, '' and '1' are string type I need to replace them by zero The result I got:?['', 12, 12, 423, '1', 423, -32, 12, 721, 345] import numpy as np def func(data): ? ? ? ? ? x, i = np.unique(data, return_inverse = True) ? ? ? ? ? f = [ np.where( i == ind )[0] for ind in range(len(x)) ] ? ? ? ? ? new_data = [] ? ? ? ? ? # Obtain 'data' arguments and give these data to New_data ? ? ? ? ? for i in range(len(x)): ? ? ? ? ? ? ? ? ? ? ? if np.size(f[i]) > 1: ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?for j in f[i]: ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?if str(data[j]) <> '': ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? new_data.append(data[j]) ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? else: ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? data[j] = 0 ? ? ? ? ? return data --? Bakhtiyor Zokhidov -------------- next part -------------- An HTML attachment was scrubbed... URL: From grb at skogoglandskap.no Fri Jul 5 14:29:23 2013 From: grb at skogoglandskap.no (Graeme B. Bell) Date: Fri, 5 Jul 2013 18:29:23 +0000 Subject: [Numpy-discussion] simple select(): for anyone with too many conditions for np.select(), or scalar-valued choicelists. Message-ID: <65BDE0BC-0E8B-44C5-AAFD-876490382FE3@skogoglandskap.no> I've made a drop-in replacement for select() which works with large numbers of conditions, and which consistently outperforms numpy.select for my use case (scalar condlist). It fixes a couple of other issues too, and (I feel) improves the internal documentation of the code. I have included benchmarks and some tests. https://github.com/gbb/numpy-simple-select The numpy dev team are welcome to include all or part of this code into the main numpy distribution or it can be kept separate or ignored if they prefer. :-) If you need more than 30 ndarrays in your 'condlist', or if you have an all-scalar choicelist, I think you will find this code particularly interesting. Formal test coverage is incomplete, but I think this is still going to be quite useful for some people. Have a nice weekend, Graeme From mjanikas at esri.com Fri Jul 5 17:48:42 2013 From: mjanikas at esri.com (Mark Janikas) Date: Fri, 5 Jul 2013 21:48:42 +0000 Subject: [Numpy-discussion] PyArray_PutTo Question Message-ID: <1C37EAF5F95D764D99E0EABE944AE95F225C7E53@RED-INF-EXMB-P1.esri.com> Hi All, I am a bit new to the NumPy C-API and I am having a hard time with placing results into output arrays... I am using PyArray_TakeFrom to grab an input dimension of data, then do a calculation, then I want to pack it back to the output... yet the PutTo function does not have an axis argument like the TakeFrom does... I am grabbing by column in a two-dimensional array and I would like to pack it that way. I know that I can build the result in reverse and pack the columns into rows and then reshape the output... but I am wondering why the PutTo does not behave exactly like the take-from does?... The python implementation "numpy.put" also does not have the axis... so I guess I can see the one-to-one reason for the omission. However, is building in reverse and reshaping the normal way to pack by column? Thanks much! MJ -------------- next part -------------- An HTML attachment was scrubbed... 
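One Python-level way around the missing axis argument in put is to write through a view or explicit flat indices instead of reshaping; a small sketch (the array here is only illustrative of the take/compute/put-back pattern described above):

import numpy as np

a = np.arange(12, dtype=float).reshape(3, 4)

col = np.take(a, 2, axis=1)          # grab column 2, like the axis-aware take
col = np.sqrt(col)                   # some per-column calculation

# put is flat, so either assign through the transposed view...
a.T[2] = col
# ...or hand put() the flat indices of that column explicitly.
np.put(a, 2 + np.arange(a.shape[0]) * a.shape[1], col)
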
URL: From alan.isaac at gmail.com Fri Jul 5 18:45:16 2013 From: alan.isaac at gmail.com (Alan G Isaac) Date: Fri, 05 Jul 2013 18:45:16 -0400 Subject: [Numpy-discussion] SLARTG In-Reply-To: <51D7158C.1090903@american.edu> References: <51D7158C.1090903@american.edu> Message-ID: <51D74C7C.5030500@gmail.com> On 7/5/2013 2:50 PM, Alan G Isaac wrote: > I see that CLARTG is here: > https://github.com/scipy/scipy/blob/master/scipy/sparse/linalg/eigen/arpack/ARPACK/SRC/sstqrb.f > > But is there a Python interface in SciPy? > (Or any other SciPy access to Givens rotation?) Sorry, that was SLARTG, whereas CLARTG is here: https://github.com/scipy/scipy/blob/master/scipy/sparse/linalg/eigen/arpack/ARPACK/SRC/cnapps.f But the question stands: is there a Python interface to Givens rotation? Thanks, Alan Isaac From alan.isaac at gmail.com Sun Jul 7 11:28:02 2013 From: alan.isaac at gmail.com (Alan G Isaac) Date: Sun, 07 Jul 2013 11:28:02 -0400 Subject: [Numpy-discussion] add .H attribute? Message-ID: <51D98902.1090403@gmail.com> With numpy arrays, I miss being able to spell a.conj().T as a.H, as one can with numpy matrices. Is adding this attribute to arrays ever under consideration? Thanks, Alan Isaac From charlesr.harris at gmail.com Sun Jul 7 16:49:18 2013 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sun, 7 Jul 2013 14:49:18 -0600 Subject: [Numpy-discussion] add .H attribute? In-Reply-To: <51D98902.1090403@gmail.com> References: <51D98902.1090403@gmail.com> Message-ID: On Sun, Jul 7, 2013 at 9:28 AM, Alan G Isaac wrote: > With numpy arrays, I miss being able to spell a.conj().T as a.H, > as one can with numpy matrices. > > Is adding this attribute to arrays ever under consideration? > There was a long thread about this back around 1.1 or so, long time ago in any case. IIRC correctly, Travis was opposed. I think part of the problem was that arr.T is a view, but arr.H would not be. Probably it could be be made to return an iterator that performed the conjugation, or we could simply return a new array. I'm not opposed myself, but I'd have to review the old discussion to see if there was good reason not to have it in the first place. I think the original discussion of an abs method took place about the same time. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From kimsunghyun at kaist.ac.kr Mon Jul 8 12:05:44 2013 From: kimsunghyun at kaist.ac.kr (sunghyun Kim) Date: Tue, 9 Jul 2013 01:05:44 +0900 Subject: [Numpy-discussion] f2py build with mkl lapack Message-ID: Hi I'm trying to use fortran wrapper f2py with intel's mkl following is my command LIB='-L/opt/intel/Compiler/11.1/064/mkl/lib/em64t/ -lguide -lpthread -lmkl_core -lmkl_intel_lp64 -lmkl_sequential' INC=-I/opt/intel/Compiler/11.1/064/mkl/include f2py --fcompiler=intelem $INC $LIB -m solveLE -c solveLE2.f solveLE2.f is simple fortran code using lapack's linear equation solver SGESV ============= CALL SGESV(N, NRHS, A, LDA, IPIV, B, LDB, INFO) ============= When i use the command, compile was done. But when I use the solveLE.so, I received following error massage ================== $python test.py python: symbol lookup error: /opt/intel/Compiler/11.1/064/mkl/lib/em64t/libmkl_lapack.so: undefined symbol: mkl_lapack_sgetrf ================== I think "mkl_lapack_sgetrf" is defined in -lmkl_sequential. I don't know what should I do. Any help would be greatly appreciated! Sunghyun Kim Ph.D. Candidate Theoretical Condensed Matter Physics Group. 
KAIST 291 Daehak-ro(373-1 Guseong-dong), Yuseong-gu, Daejeon 305-701, Republic of Korea +10-4144-5946 -------------- next part -------------- An HTML attachment was scrubbed... URL: From cournape at gmail.com Mon Jul 8 13:15:50 2013 From: cournape at gmail.com (David Cournapeau) Date: Mon, 8 Jul 2013 18:15:50 +0100 Subject: [Numpy-discussion] f2py build with mkl lapack In-Reply-To: References: Message-ID: On Mon, Jul 8, 2013 at 5:05 PM, sunghyun Kim wrote: > Hi > > I'm trying to use fortran wrapper f2py with intel's mkl > > following is my command > > LIB='-L/opt/intel/Compiler/11.1/064/mkl/lib/em64t/ -lguide -lpthread > -lmkl_core -lmkl_intel_lp64 -lmkl_sequential' > Linking order matters: if A needs B, A should appear before B, so -lpthread/-lguide should be at the end, mkl_intel_lp64 before mkl_core, and mkl_sequential in front of that. See the MKL manual for more details, David > INC=-I/opt/intel/Compiler/11.1/064/mkl/include > f2py --fcompiler=intelem $INC $LIB -m solveLE -c solveLE2.f > solveLE2.f is simple fortran code using lapack's linear equation solver > SGESV > ============= > CALL SGESV(N, NRHS, A, LDA, IPIV, B, LDB, INFO) > ============= > > When i use the command, compile was done. > But when I use the solveLE.so, I received following error massage > > ================== > $python test.py > python: symbol lookup error: > /opt/intel/Compiler/11.1/064/mkl/lib/em64t/libmkl_lapack.so: undefined > symbol: mkl_lapack_sgetrf > ================== > > I think "mkl_lapack_sgetrf" is defined in -lmkl_sequential. > > I don't know what should I do. > > > Any help would be greatly appreciated! > > > > > Sunghyun Kim > Ph.D. Candidate > Theoretical Condensed Matter Physics Group. > KAIST > 291 Daehak-ro(373-1 Guseong-dong), Yuseong-gu, Daejeon 305-701, Republic > of Korea > +10-4144-5946 > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From brad.froehle at gmail.com Mon Jul 8 14:37:22 2013 From: brad.froehle at gmail.com (Bradley M. Froehle) Date: Mon, 8 Jul 2013 11:37:22 -0700 Subject: [Numpy-discussion] f2py build with mkl lapack In-Reply-To: References: Message-ID: On Mon, Jul 8, 2013 at 10:15 AM, David Cournapeau wrote: > > > On Mon, Jul 8, 2013 at 5:05 PM, sunghyun Kim > wrote: >> >> Hi >> >> I'm trying to use fortran wrapper f2py with intel's mkl >> >> following is my command >> >> LIB='-L/opt/intel/Compiler/11.1/064/mkl/lib/em64t/ -lguide -lpthread >> -lmkl_core -lmkl_intel_lp64 -lmkl_sequential' > > > Linking order matters: if A needs B, A should appear before B, so > -lpthread/-lguide should be at the end, mkl_intel_lp64 before mkl_core, and > mkl_sequential in front of that. > > See the MKL manual for more details, You may also want to consult the MKL Link Line Advsior [1], which in your case recommends an ordering like: -lmkl_intel_lp64 -lmkl_sequential -lmkl_core -lpthread -lm [1]: http://software.intel.com/en-us/articles/intel-mkl-link-line-advisor -Brad From kimsunghyun at kaist.ac.kr Mon Jul 8 21:12:06 2013 From: kimsunghyun at kaist.ac.kr (sunghyun Kim) Date: Tue, 9 Jul 2013 10:12:06 +0900 Subject: [Numpy-discussion] f2py build with mkl lapack In-Reply-To: References: Message-ID: thank you for your help I tried following orders and many combinations... 
===================== LIB='-L/opt/intel/Compiler/11.1/064/mkl/lib/em64t/ -lmkl_intel_lp64 -lmkl_sequential -lmkl_core -lpthread -lm' LIB='-L/opt/intel/Compiler/11.1/064/mkl/lib/em64t/ -lmkl_solver_lp64_sequential -lmkl_intel_lp64 -lmkl_sequential -lmkl_core -lpthread -lm' ==================== but I still got following massage ================ undefined symbol: mkl_lapack_sgetrf ================ Sunghyun Kim Ph.D. Candidate Theoretical Condensed Matter Physics Group. KAIST 291 Daehak-ro(373-1 Guseong-dong), Yuseong-gu, Daejeon 305-701, Republic of Korea +10-4144-5946 On Tue, Jul 9, 2013 at 3:37 AM, Bradley M. Froehle wrote: > On Mon, Jul 8, 2013 at 10:15 AM, David Cournapeau > wrote: > > > > > > On Mon, Jul 8, 2013 at 5:05 PM, sunghyun Kim > > wrote: > >> > >> Hi > >> > >> I'm trying to use fortran wrapper f2py with intel's mkl > >> > >> following is my command > >> > >> LIB='-L/opt/intel/Compiler/11.1/064/mkl/lib/em64t/ -lguide -lpthread > >> -lmkl_core -lmkl_intel_lp64 -lmkl_sequential' > > > > > > Linking order matters: if A needs B, A should appear before B, so > > -lpthread/-lguide should be at the end, mkl_intel_lp64 before mkl_core, > and > > mkl_sequential in front of that. > > > > See the MKL manual for more details, > > You may also want to consult the MKL Link Line Advsior [1], which in > your case recommends an ordering like: > > -lmkl_intel_lp64 -lmkl_sequential -lmkl_core -lpthread -lm > > [1]: http://software.intel.com/en-us/articles/intel-mkl-link-line-advisor > > -Brad > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From chaoyuejoy at gmail.com Tue Jul 9 08:55:53 2013 From: chaoyuejoy at gmail.com (Chao YUE) Date: Tue, 9 Jul 2013 14:55:53 +0200 Subject: [Numpy-discussion] np.ma.argmax not respecting the mask? Message-ID: Dear all, I am using 1.7.1 version of numpy and np.ma.argmax is not repecting the mask? I expect for all data that are masked, it should also return a mask, but this is not the case. In [96]: d3 Out[96]: masked_array(data = [[-- -- -- -- 4] [5 -- 7 8 9]], mask = [[ True True True True False] [False True False False False]], fill_value = 6) In [97]: np.ma.argmax(d3,axis=0) Out[97]: array([1, 0, 1, 1, 1]) In [98]: np.__version__ Out[98]: '1.7.1' Can I file a bug report on this? thanks, Chao -- *********************************************************************************** Chao YUE Laboratoire des Sciences du Climat et de l'Environnement (LSCE-IPSL) UMR 1572 CEA-CNRS-UVSQ Batiment 712 - Pe 119 91191 GIF Sur YVETTE Cedex Tel: (33) 01 69 08 29 02; Fax:01.69.08.77.16 ************************************************************************************ -------------- next part -------------- An HTML attachment was scrubbed... URL: From stefan at sun.ac.za Tue Jul 9 09:14:28 2013 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Tue, 9 Jul 2013 15:14:28 +0200 Subject: [Numpy-discussion] np.ma.argmax not respecting the mask? In-Reply-To: References: Message-ID: On Tue, Jul 9, 2013 at 2:55 PM, Chao YUE wrote: > I am using 1.7.1 version of numpy and np.ma.argmax is not repecting the > mask? 
> > In [96]: d3 > Out[96]: > masked_array(data = > [[-- -- -- -- 4] > [5 -- 7 8 9]], > mask = > [[ True True True True False] > [False True False False False]], > fill_value = 6) > > > In [97]: np.ma.argmax(d3,axis=0) > Out[97]: array([1, 0, 1, 1, 1]) This is the result I would expect. If both values are masked, the fill value is used, so there is always an argmin value. The following workaround should have done the trick, but it exposes a different bug: x = np.ma.array([[0,1,2,3,4],[5,6,7,8, 9]], mask=[[1, 1, 1, 1, 0], [0, 1, 0, 0 ,0]], dtype=float) np.nanargmax(x.filled(np.nan), axis=0) This breaks with "ValueError: cannot convert float NaN to integer" St?fan From sebastian at sipsolutions.net Tue Jul 9 10:08:04 2013 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Tue, 09 Jul 2013 16:08:04 +0200 Subject: [Numpy-discussion] np.ma.argmax not respecting the mask? In-Reply-To: References: Message-ID: <1373378884.2604.6.camel@sebastian-laptop> On Tue, 2013-07-09 at 15:14 +0200, St?fan van der Walt wrote: > On Tue, Jul 9, 2013 at 2:55 PM, Chao YUE wrote: > > I am using 1.7.1 version of numpy and np.ma.argmax is not repecting the > > mask? > > > > In [96]: d3 > > Out[96]: > > masked_array(data = > > [[-- -- -- -- 4] > > [5 -- 7 8 9]], > > mask = > > [[ True True True True False] > > [False True False False False]], > > fill_value = 6) > > > > > > In [97]: np.ma.argmax(d3,axis=0) > > Out[97]: array([1, 0, 1, 1, 1]) > > This is the result I would expect. If both values are masked, the > fill value is used, so there is always an argmin value. > To be honest, I would expect the exact opposite. If there is no value, there is no minimum argument -> either its an error, or it signals invalid in some other way. On masked arrays I would expect it to be masked to signal this. The error for nanargmax is annoying, but it is right to be an error IMO, due to lack of a better representation. (Ideally mabe the user would be given the option to pass an Identity element for those nanfuncs (basically this is always NaN now, which fails for argmax since the result is integer) for which the ufunc does not have an Identity, and for those that do, we should actually use it. - Sebastian > The following workaround should have done the trick, but it exposes a > different bug: > > x = np.ma.array([[0,1,2,3,4],[5,6,7,8, 9]], mask=[[1, 1, 1, 1, 0], [0, > 1, 0, 0 ,0]], dtype=float) > np.nanargmax(x.filled(np.nan), axis=0) > > This breaks with "ValueError: cannot convert float NaN to integer" > > St?fan > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From pgmdevlist at gmail.com Tue Jul 9 10:26:08 2013 From: pgmdevlist at gmail.com (Pierre Gerard-Marchant) Date: Tue, 9 Jul 2013 16:26:08 +0200 Subject: [Numpy-discussion] np.ma.argmax not respecting the mask? In-Reply-To: <1373378884.2604.6.camel@sebastian-laptop> References: <1373378884.2604.6.camel@sebastian-laptop> Message-ID: <200BAAEA-7F9D-4A72-93B9-EE96EE347601@gmail.com> On Jul 9, 2013, at 16:08 , Sebastian Berg wrote: > On Tue, 2013-07-09 at 15:14 +0200, St?fan van der Walt wrote: >> On Tue, Jul 9, 2013 at 2:55 PM, Chao YUE wrote: >>> I am using 1.7.1 version of numpy and np.ma.argmax is not repecting the >>> mask? 
>>> >>> In [96]: d3
>>> Out[96]:
>>> masked_array(data =
>>> [[-- -- -- -- 4]
>>> [5 -- 7 8 9]],
>>> mask =
>>> [[ True True True True False]
>>> [False True False False False]],
>>> fill_value = 6)
>>>
>>>
>>> In [97]: np.ma.argmax(d3,axis=0)
>>> Out[97]: array([1, 0, 1, 1, 1])
>>
>> This is the result I would expect. If both values are masked, the
>> fill value is used, so there is always an argmin value.
>>
>
> To be honest, I would expect the exact opposite. If there is no value,
> there is no minimum argument -> either its an error, or it signals
> invalid in some other way. On masked arrays I would expect it to be
> masked to signal this.

The doc is quite clear: masked values are replaced by `fill_value` when
determining the argmax/argmin. Attaching a mask a posteriori is always
doable, but making the output of np.ma.argstuff a MaskedArray may be a
nuisance at this point (any input from heavy users?).

From chaoyuejoy at gmail.com Tue Jul 9 10:38:18 2013
From: chaoyuejoy at gmail.com (Chao YUE)
Date: Tue, 9 Jul 2013 16:38:18 +0200
Subject: [Numpy-discussion] np.ma.argmax not respecting the mask?
In-Reply-To: <200BAAEA-7F9D-4A72-93B9-EE96EE347601@gmail.com>
References: <1373378884.2604.6.camel@sebastian-laptop> <200BAAEA-7F9D-4A72-93B9-EE96EE347601@gmail.com>
Message-ID:

Sorry, I didn't read the docs very carefully. There is no doc for np.ma.argmax,
but there is one for np.ma.argmin, so it's expected behavior rather than a bug.
Let some heavy users say their ideas.

Practically, the returned value of 0 will always be confused with values which
are not masked but really do have their minimum or maximum at position 0 along
the specified axis.

One way to work around it is:

data_mask = np.ma.mean(data, axis=0).mask

np.ma.masked_array(np.ma.argmax(data, axis=0), mask=data_mask)

Chao

On Tue, Jul 9, 2013 at 4:26 PM, Pierre Gerard-Marchant wrote:

>
> On Jul 9, 2013, at 16:08 , Sebastian Berg
> wrote:
>
> > On Tue, 2013-07-09 at 15:14 +0200, Stéfan van der Walt wrote:
> >> On Tue, Jul 9, 2013 at 2:55 PM, Chao YUE wrote:
> >>> I am using 1.7.1 version of numpy and np.ma.argmax is not respecting the
> >>> mask?
> >>>
> >>> In [96]: d3
> >>> Out[96]:
> >>> masked_array(data =
> >>> [[-- -- -- -- 4]
> >>> [5 -- 7 8 9]],
> >>> mask =
> >>> [[ True True True True False]
> >>> [False True False False False]],
> >>> fill_value = 6)
> >>>
> >>>
> >>> In [97]: np.ma.argmax(d3,axis=0)
> >>> Out[97]: array([1, 0, 1, 1, 1])
> >>
> >> This is the result I would expect. If both values are masked, the
> >> fill value is used, so there is always an argmin value.
> >>
> >
> > To be honest, I would expect the exact opposite. If there is no value,
> > there is no minimum argument -> either its an error, or it signals
> > invalid in some other way. On masked arrays I would expect it to be
> > masked to signal this.
>
> The doc is quite clear: masked values are replaced by `fill_value` when
> determining the argmax/argmin. Attaching a mask a posteriori is always
> doable, but making the output of np.ma.argstuff a MaskedArray may be a
> nuisance at this point (any input from heavy users?).
> > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -- *********************************************************************************** Chao YUE Laboratoire des Sciences du Climat et de l'Environnement (LSCE-IPSL) UMR 1572 CEA-CNRS-UVSQ Batiment 712 - Pe 119 91191 GIF Sur YVETTE Cedex Tel: (33) 01 69 08 29 02; Fax:01.69.08.77.16 ************************************************************************************ -------------- next part -------------- An HTML attachment was scrubbed... URL: From pgmdevlist at gmail.com Tue Jul 9 10:55:51 2013 From: pgmdevlist at gmail.com (Pierre Gerard-Marchant) Date: Tue, 9 Jul 2013 16:55:51 +0200 Subject: [Numpy-discussion] np.ma.argmax not respecting the mask? In-Reply-To: References: <1373378884.2604.6.camel@sebastian-laptop> <200BAAEA-7F9D-4A72-93B9-EE96EE347601@gmail.com> Message-ID: On Jul 9, 2013, at 16:38 , Chao YUE wrote: > Sorry I didn't the docs very carefully. there is no doc for np.ma.argmax for indeed there is for np.ma.argmin Yeah, the doc of the function asks you to go check the doc of the method? Not the best. > so it's an expected behavior rather than a bug. Let some heavy users to say their ideas. > > Practicaly, the returned value of 0 will be always confused with the values which are not masked > but do have the minimum or maximum values at the 0 position over the specified axis. Well, it's just an index: if you take the corresponding value from the input array, it'll be masked... > One way to walk around is: > > > data_mask = np.ma.mean(axis=0).mask > > np.ma.masked_array(np.ma.argmax(data,axis=0), mask=data_mask) I find easier to use `mask=x.mask.prod(axis)` to get the combined mask along the desired axis (you could also use a `reduce(np.logical_and, x.mask)` for axis=0, but it's less convenient I think). From chaoyuejoy at gmail.com Tue Jul 9 11:20:46 2013 From: chaoyuejoy at gmail.com (Chao YUE) Date: Tue, 9 Jul 2013 17:20:46 +0200 Subject: [Numpy-discussion] np.ma.argmax not respecting the mask? In-Reply-To: References: <1373378884.2604.6.camel@sebastian-laptop> <200BAAEA-7F9D-4A72-93B9-EE96EE347601@gmail.com> Message-ID: Thanks Pierre, good to know there are so many tricks available. Chao On Tue, Jul 9, 2013 at 4:55 PM, Pierre Gerard-Marchant wrote: > > On Jul 9, 2013, at 16:38 , Chao YUE wrote: > > > Sorry I didn't the docs very carefully. there is no doc for np.ma.argmax > for indeed there is for np.ma.argmin > > Yeah, the doc of the function asks you to go check the doc of the method? > Not the best. > > > > so it's an expected behavior rather than a bug. Let some heavy users to > say their ideas. > > > > Practicaly, the returned value of 0 will be always confused with the > values which are not masked > > but do have the minimum or maximum values at the 0 position over the > specified axis. > > Well, it's just an index: if you take the corresponding value from the > input array, it'll be masked... > > > One way to walk around is: > > > > > > data_mask = np.ma.mean(axis=0).mask > > > > np.ma.masked_array(np.ma.argmax(data,axis=0), mask=data_mask) > > I find easier to use `mask=x.mask.prod(axis)` to get the combined mask > along the desired axis (you could also use a `reduce(np.logical_and, > x.mask)` for axis=0, but it's less convenient I think). 
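For readers following the archive, Pierre's `x.mask.prod(axis)` trick can be
written out as a complete example (an illustrative sketch built around the `d3`
array from earlier in the thread, not code taken from the original messages):

import numpy as np

d3 = np.ma.masked_array([[0, 0, 0, 0, 4],
                         [5, 0, 7, 8, 9]],
                        mask=[[1, 1, 1, 1, 0],
                              [0, 1, 0, 0, 0]])

# Combine the mask along the reduced axis: True only where *every*
# entry in that column is masked.
combined_mask = d3.mask.prod(axis=0).astype(bool)

# Attach it to the argmax result, so all-masked columns come back masked.
arg = np.ma.masked_array(np.ma.argmax(d3, axis=0), mask=combined_mask)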
> > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -- *********************************************************************************** Chao YUE Laboratoire des Sciences du Climat et de l'Environnement (LSCE-IPSL) UMR 1572 CEA-CNRS-UVSQ Batiment 712 - Pe 119 91191 GIF Sur YVETTE Cedex Tel: (33) 01 69 08 29 02; Fax:01.69.08.77.16 ************************************************************************************ -------------- next part -------------- An HTML attachment was scrubbed... URL: From lists at onerussian.com Tue Jul 9 12:10:07 2013 From: lists at onerussian.com (Yaroslav Halchenko) Date: Tue, 9 Jul 2013 12:10:07 -0400 Subject: [Numpy-discussion] Really cruel draft of vbench setup for NumPy (.add.reduce benchmarks since 2011) In-Reply-To: <20130701215804.GG27621@onerussian.com> References: <20130506143241.GV5140@onerussian.com> <1367856232.2506.31.camel@sebastian-laptop> <20130506161153.GW5140@onerussian.com> <1367927238.23010.12.camel@sebastian-laptop> <20130701193006.GC27621@onerussian.com> <20130701215804.GG27621@onerussian.com> Message-ID: <20130709161007.GL27621@onerussian.com> Julian Taylor contributed some benchmarks he was "concerned" about, so now the collection is even better. I will keep updating tests on the same url: http://www.onerussian.com/tmp/numpy-vbench/ [it is now running and later I will upload with more commits for higher temporal fidelity] of particular interest for you might be: some minor consistent recent losses in http://www.onerussian.com/tmp/numpy-vbench/vb_vb_io.html#strided-assign-float64 http://www.onerussian.com/tmp/numpy-vbench/vb_vb_io.html#strided-assign-float32 http://www.onerussian.com/tmp/numpy-vbench/vb_vb_io.html#strided-assign-int16 http://www.onerussian.com/tmp/numpy-vbench/vb_vb_io.html#strided-assign-int8 seems have lost more than 25% of performance throughout the timeline http://www.onerussian.com/tmp/numpy-vbench/vb_vb_io.html#memcpy-int8 "fast" calls to all/any seemed to be hurt twice in their life time now running *3 times slower* than in 2011 -- inflection points correspond to regressions and/or their fixes in those functions to bring back performance on "slow" cases (when array traversal is needed, e.g. on arrays of zeros for any) http://www.onerussian.com/tmp/numpy-vbench/vb_vb_reduce.html#numpy-all-fast http://www.onerussian.com/tmp/numpy-vbench/vb_vb_reduce.html#numpy-any-fast Enjoy On Mon, 01 Jul 2013, Yaroslav Halchenko wrote: > FWIW -- updated plots with contribution from Julian Taylor > http://www.onerussian.com/tmp/numpy-vbench-20130701/vb_vb_indexing.html#mmap-slicing > ;-) > On Mon, 01 Jul 2013, Yaroslav Halchenko wrote: > > Hi Guys, > > not quite the recommendations you expressed, but here is my ugly > > attempt to improve benchmarks coverage: > > http://www.onerussian.com/tmp/numpy-vbench-20130701/index.html > > initially I also ran those ufunc benchmarks per each dtype separately, > > but then resulting webpage is loong which brings my laptop on its knees > > by firefox. So I commented those out for now, and left only "summary" > > ones across multiple datatypes. > > There is a bug in sphinx which forbids embedding some figures for > > vb_random "as is", so pardon that for now... > > I have not set cpu affinity of the process (but ran it at nice -10), so may be > > that also contributed to variance of benchmark estimates. And there probably > > could be more of goodies (e.g. 
gc control etc) to borrow from > > https://github.com/pydata/pandas/blob/master/vb_suite/test_perf.py which I have > > just discovered to minimize variance. > > nothing really interesting was pin-pointed so far, besides that > > - svd became a bit faster since few months back ;-) > > http://www.onerussian.com/tmp/numpy-vbench-20130701/vb_vb_linalg.html > > - isnan (and isinf, isfinite) got improved > > http://www.onerussian.com/tmp/numpy-vbench-20130701/vb_vb_ufunc.html#numpy-isnan-a-10types > > - right_shift got a miniscule slowdown from what it used to be? > > http://www.onerussian.com/tmp/numpy-vbench-20130701/vb_vb_ufunc.html#numpy-right-shift-a-a-3types > > As before -- current code of those benchmarks collection is available > > at http://github.com/yarikoptic/numpy-vbench/pull/new/master > > if you have specific snippets you would like to benchmark -- just state them > > here or send a PR -- I will add them in. > > Cheers, > > On Tue, 07 May 2013, Da?id wrote: > > > On 7 May 2013 13:47, Sebastian Berg wrote: > > > > Indexing/assignment was the first thing I thought of too (also because > > > > fancy indexing/assignment really could use some speedups...). Other then > > > > that maybe some timings for small arrays/scalar math, but that might be > > > > nice for that GSoC project. > > > Why not going bigger? Ufunc operations on big arrays, CPU and memory bound. > > > Also, what about interfacing with other packages? It may increase the > > > compiling overhead, but I would like to see Cython in action (say, > > > only last version, maybe it can be fixed). > > > _______________________________________________ > > > NumPy-Discussion mailing list > > > NumPy-Discussion at scipy.org > > > http://mail.scipy.org/mailman/listinfo/numpy-discussion -- Yaroslav O. Halchenko, Ph.D. http://neuro.debian.net http://www.pymvpa.org http://www.fail2ban.org Senior Research Associate, Psychological and Brain Sciences Dept. Dartmouth College, 419 Moore Hall, Hinman Box 6207, Hanover, NH 03755 Phone: +1 (603) 646-9834 Fax: +1 (603) 646-1419 WWW: http://www.linkedin.com/in/yarik From lists at hilboll.de Wed Jul 10 11:02:07 2013 From: lists at hilboll.de (Andreas Hilboll) Date: Wed, 10 Jul 2013 17:02:07 +0200 Subject: [Numpy-discussion] flip array on axis Message-ID: <51DD776F.4040306@hilboll.de> Hi, there are np.flipud and np.fliplr methods to flip 2d arrays on the first and second dimension, respectively. What can I do to flip an array on an axis which I don't know before runtime? I'd really like to see a np.flip(arr, axis) method which lets me specify which axis to flip on. Any ideas? Cheers, Andreas. From matthew.brett at gmail.com Wed Jul 10 11:06:21 2013 From: matthew.brett at gmail.com (Matthew Brett) Date: Wed, 10 Jul 2013 11:06:21 -0400 Subject: [Numpy-discussion] flip array on axis In-Reply-To: <51DD776F.4040306@hilboll.de> References: <51DD776F.4040306@hilboll.de> Message-ID: Hi, On Wed, Jul 10, 2013 at 11:02 AM, Andreas Hilboll wrote: > Hi, > > there are np.flipud and np.fliplr methods to flip 2d arrays on the first > and second dimension, respectively. What can I do to flip an array on an > axis which I don't know before runtime? I'd really like to see a > np.flip(arr, axis) method which lets me specify which axis to flip on. 
I have something like that that's a few lines long: https://github.com/nipy/nibabel/blob/master/nibabel/orientations.py#L231 Cheers, Matthew From jgomezdans at gmail.com Wed Jul 10 11:50:25 2013 From: jgomezdans at gmail.com (Jose Gomez-Dans) Date: Wed, 10 Jul 2013 16:50:25 +0100 Subject: [Numpy-discussion] f2py and setup.py how can I specify where the .so file goes? Message-ID: Hi, I am building a package that exposes some Fortran libraries through f2py. The packages directory looks like this: setup.py my_pack/ | |---------->__init__.py |----------> some.pyf |-----------> code.f90 I thoughat that once installed, I'd get the .so and __init__.py in the same directory (namely ~/.local/lib/python2.7/site-packages/my_pack/). However, I get ~/.local/lib/python2.7/site-packages/mypack_fortran.so ~/.local/lib/python2.7/site-packages/my_pack__fortran-1.0.2-py2.7.egg-info ~/.local/lib/python2.7/site-packages/my_pack/__init__.py Thet setup file is this at the end, I am clearly missing some option here to move the *.so into the my_pack directory.... Anybody know which one? Cheers Jose [setup.py] #!/usr/bin/env python def configuration(parent_package='',top_path=None): from numpy.distutils.misc_util import Configuration config = Configuration(parent_package,top_path) config.add_extension('mypack_fortran', ['the_pack/code.f90'] ) return config if __name__ == "__main__": from numpy.distutils.core import setup # Global variables for this extension: name = "mypack_fortran" # name of the generated python extension (.so) description = "blah" author = "" author_email = "" setup( name=name,\ description=description, \ author=author, \ author_email = author_email, \ configuration = configuration, version="1.0.2",\ packages=["my_pack"]) -------------- next part -------------- An HTML attachment was scrubbed... URL: From lists at hilboll.de Wed Jul 10 12:03:38 2013 From: lists at hilboll.de (Andreas Hilboll) Date: Wed, 10 Jul 2013 18:03:38 +0200 Subject: [Numpy-discussion] flip array on axis In-Reply-To: References: <51DD776F.4040306@hilboll.de> Message-ID: <51DD85DA.906@hilboll.de> On 10.07.2013 17:06, Matthew Brett wrote: > Hi, > > On Wed, Jul 10, 2013 at 11:02 AM, Andreas Hilboll wrote: >> Hi, >> >> there are np.flipud and np.fliplr methods to flip 2d arrays on the first >> and second dimension, respectively. What can I do to flip an array on an >> axis which I don't know before runtime? I'd really like to see a >> np.flip(arr, axis) method which lets me specify which axis to flip on. > > I have something like that that's a few lines long: > > https://github.com/nipy/nibabel/blob/master/nibabel/orientations.py#L231 > > Cheers, > > Matthew > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > Thanks, Matthew! Should this go into numpy itself? If so, I could prepare a PR, if you point me to the right place (file) to put it. Cheers, Andreas. From aronne.merrelli at gmail.com Wed Jul 10 14:11:07 2013 From: aronne.merrelli at gmail.com (Aronne Merrelli) Date: Wed, 10 Jul 2013 14:11:07 -0400 Subject: [Numpy-discussion] Unique() function and avoiding Loop In-Reply-To: <1373012428.804822615@f377.i.mail.ru> References: <1373012428.804822615@f377.i.mail.ru> Message-ID: On Fri, Jul 5, 2013 at 4:20 AM, Bakhtiyor Zokhidov < bakhtiyor_zokhidov at mail.ru> wrote: > Hi everybody, > > I have a problem with sorting out the following function. What I expect is > that I showed as an example below. 
> > Two problems are encountered to achieve the result: > 1) The function sometimes can't not sort as expected: I showed an example > for that below. > 2) I could not do vectorization to avoid loop. > > > OR, Is there another way to solve that problem?? > Thanks in advance > > > Example: > data = ['', 12, 12, 423, '1', 423, -32, 12, 721, 345]. Expected > result: [0, 12, 12, 423, 0, 423, -32, 12, 721, 345], here, '' and '1' > are string type I need to replace them by zero > I don't understand your code example, but if your problem is fully described as above (replace the strings '' or '1' with the integer 0), then it would seem simplest to just do this with python built in functions rather than using numpy. The numpy functions work best with arrays, and your "data" variable is a python list with a mixture of integers and strings. Here is a possible solution: >>> data = ['', 12, 12, 423, '1', 423, -32, 12, 721, 345] >>> foo = lambda x: 0 if (x == '') or (x == '1') else x >>> print map(foo, data) [0, 12, 12, 423, 0, 423, -32, 12, 721, 345] Hope that helps, Aronne > The result I got: ['', 12, 12, 423, '1', 423, -32, 12, 721, 345] > > import numpy as np > > def func(data): > > x, i = np.unique(data, return_inverse = True) > f = [ np.where( i == ind )[0] for ind in range(len(x)) ] > > new_data = [] > # Obtain 'data' arguments and give these data to New_data > for i in range(len(x)): > if np.size(f[i]) > 1: > for j in f[i]: > if str(data[j]) <> '': > > new_data.append(data[j]) > else: > data[j] = 0 > return data > > -- > Bakhtiyor Zokhidov > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From warren.weckesser at gmail.com Wed Jul 10 14:29:00 2013 From: warren.weckesser at gmail.com (Warren Weckesser) Date: Wed, 10 Jul 2013 14:29:00 -0400 Subject: [Numpy-discussion] flip array on axis In-Reply-To: <51DD85DA.906@hilboll.de> References: <51DD776F.4040306@hilboll.de> <51DD85DA.906@hilboll.de> Message-ID: On Wed, Jul 10, 2013 at 12:03 PM, Andreas Hilboll wrote: > On 10.07.2013 17:06, Matthew Brett wrote: > > Hi, > > > > On Wed, Jul 10, 2013 at 11:02 AM, Andreas Hilboll > wrote: > >> Hi, > >> > >> there are np.flipud and np.fliplr methods to flip 2d arrays on the first > >> and second dimension, respectively. What can I do to flip an array on an > >> axis which I don't know before runtime? I'd really like to see a > >> np.flip(arr, axis) method which lets me specify which axis to flip on. > > > > I have something like that that's a few lines long: > > > > https://github.com/nipy/nibabel/blob/master/nibabel/orientations.py#L231 > > > > Cheers, > > > > Matthew > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > Thanks, Matthew! Should this go into numpy itself? If so, I could > prepare a PR, if you point me to the right place (file) to put it. > > Something like this would be nice to have in numpy, so we don't continue to reinvent it (e.g. https://github.com/scipy/scipy/blob/master/scipy/signal/_arraytools.py; see `axis_slice` and `axis_reverse`). Warren > Cheers, Andreas. 
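As a point of reference for the thread, the generic helper under discussion is
only a few lines when written with slice objects; a rough sketch (illustrative
only -- not the nibabel or scipy implementation) might be:

import numpy as np

def flip(arr, axis):
    # Reverse the order of elements along `axis`, returning a view.
    slicer = [slice(None)] * arr.ndim
    slicer[axis] = slice(None, None, -1)
    return arr[tuple(slicer)]

a = np.arange(24).reshape(2, 3, 4)
assert (flip(a, -1) == a[..., ::-1]).all()
assert (flip(a, 0) == a[::-1]).all()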
> _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From blake.a.griffith at gmail.com Wed Jul 10 23:29:05 2013 From: blake.a.griffith at gmail.com (Blake Griffith) Date: Wed, 10 Jul 2013 22:29:05 -0500 Subject: [Numpy-discussion] ufunc overrides Message-ID: Hello NumPy, Part of my GSoC is compatibility with SciPy's sparse matrices and NumPy's ufuncs. Currently there is no feasible way to do this without changing ufuncs a bit. I've been considering a mechanism to override ufuncs based on checking the ufuncs arguments for a __ufunc_override__ attribute. Then handing off the operation to a function specified by that attribute. I prototyped this in python and did a demo in a blog post here: http://cwl.cx/posts/week-6-ufunc-overrides.html This is similar to a previously discussed, but never implemented change: http://mail.scipy.org/pipermail/numpy-discussion/2011-June/056945.html However it seems like the ufunc machinery might be ripped out and replaced with a true multi-method implementation soon. See Travis' blog post: http://technicaldiscovery.blogspot.com/2013/07/thoughts-after-scipy-2013-and-specific.html So I'd like to make my changes as forward compatible as possible. However I'm not sure what I should even consider here, or how forward compatible my current implementation is. Thoughts? Until then, I'm writing up a nep, it is still pretty incomplete, it can be found here: https://github.com/cowlicks/numpy/blob/ufunc-override/doc/neps/ufunc-overrides.rst -------------- next part -------------- An HTML attachment was scrubbed... URL: From travis at continuum.io Thu Jul 11 01:01:15 2013 From: travis at continuum.io (Travis Oliphant) Date: Thu, 11 Jul 2013 00:01:15 -0500 Subject: [Numpy-discussion] ufunc overrides In-Reply-To: References: Message-ID: Hey Blake, To be clear, my blog-post is just a pre-NEP and should not be perceived as something that will transpire in NumPy anytime soon. You should take it as a "hey everyone, I think I know how to solve this problem, but I have no time to do it, but wanted to get the word out to those who might have the time" I think the multi-method approach I outline is the *right* thing to do for NumPy. Another attribute on ufuncs would be a bit of a hack (though easier to implement). But, on the other-hand, the current ufunc attributes are also a bit of a hack. While my overall proposal is to make *all* functions in NumPy (and SciPy and Scikits) multimethods, I think it's actually pretty straightforward and a more contained problem to make all *ufuncs* multi-methods. I think that could fit in a summer of code project. I don't think it would be that difficult to make all ufuncs multi-methods that dispatch based on the Python type (they are already multi-methods based on the array dtype). You could basically take the code from Guido's essay or from Peak Rules multi-method implementation or from the links below and integrate it with a wrapped version of the current ufuncs (or do a bit more glue and modify the ufunc_call function in 'C' directly and get nice general multi-methods for ufuncs). Of course, you would need to define a decorator that NumPy users could use to register their multi-method implementation with the ufunc. But, this again would not be too difficult. 
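To make the idea concrete, a toy sketch of such a registration decorator,
dispatching a wrapped np.add on the Python types of its arguments, could look
roughly like the following (purely illustrative -- none of these names are an
existing NumPy API):

import numpy as np

_add_registry = {}

def add_implementation(*types):
    # Register an override of add() for the given argument types.
    def decorator(func):
        _add_registry[types] = func
        return func
    return decorator

def add(x, y):
    # Multi-method style wrapper: dispatch on the argument types,
    # otherwise fall back to the regular ufunc.
    impl = _add_registry.get((type(x), type(y)))
    if impl is not None:
        return impl(x, y)
    return np.add(x, y)

class MySparse(object):
    def __init__(self, dense):
        self.dense = np.asarray(dense)

@add_implementation(MySparse, MySparse)
def _sparse_add(a, b):
    return MySparse(a.dense + b.dense)

add(np.arange(3), np.ones(3))            # plain ufunc path
add(MySparse([1, 2]), MySparse([3, 4]))  # registered override

A real implementation would hook this dispatch into the ufunc machinery itself
rather than a Python wrapper, but the registration pattern is the same.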
Look for examples and inspiration at the following places: http://alexgaynor.net/2010/jun/26/multimethods-python/ https://pypi.python.org/pypi/typed.py I really think this would be a great addition to NumPy (it would simplify a lot of cruft around masked arrays, character arrays, etc.) and be quite useful. I wish you the best. I can't promise I will have time to help, but I will try to chime in the best I can. Best regards, -Travis On Wed, Jul 10, 2013 at 10:29 PM, Blake Griffith wrote: > Hello NumPy, > > Part of my GSoC is compatibility with SciPy's sparse matrices and NumPy's > ufuncs. Currently there is no feasible way to do this without changing > ufuncs a bit. > > I've been considering a mechanism to override ufuncs based on checking the > ufuncs arguments for a __ufunc_override__ attribute. Then handing off the > operation to a function specified by that attribute. I prototyped this in > python and did a demo in a blog post here: > http://cwl.cx/posts/week-6-ufunc-overrides.html > This is similar to a previously discussed, but never implemented change: > http://mail.scipy.org/pipermail/numpy-discussion/2011-June/056945.html > > However it seems like the ufunc machinery might be ripped out and replaced > with a true multi-method implementation soon. See Travis' blog post: > > http://technicaldiscovery.blogspot.com/2013/07/thoughts-after-scipy-2013-and-specific.html > So I'd like to make my changes as forward compatible as possible. However > I'm not sure what I should even consider here, or how forward compatible my > current implementation is. Thoughts? > > Until then, I'm writing up a nep, it is still pretty incomplete, it can be > found here: > > https://github.com/cowlicks/numpy/blob/ufunc-override/doc/neps/ufunc-overrides.rst > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -- Travis Oliphant Continuum Analytics, Inc. http://www.continuum.io -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Thu Jul 11 15:00:30 2013 From: njs at pobox.com (Nathaniel Smith) Date: Thu, 11 Jul 2013 20:00:30 +0100 Subject: [Numpy-discussion] flip array on axis In-Reply-To: <51DD85DA.906@hilboll.de> References: <51DD776F.4040306@hilboll.de> <51DD85DA.906@hilboll.de> Message-ID: On Wed, Jul 10, 2013 at 5:03 PM, Andreas Hilboll wrote: > On 10.07.2013 17:06, Matthew Brett wrote: >> Hi, >> >> On Wed, Jul 10, 2013 at 11:02 AM, Andreas Hilboll wrote: >>> Hi, >>> >>> there are np.flipud and np.fliplr methods to flip 2d arrays on the first >>> and second dimension, respectively. What can I do to flip an array on an >>> axis which I don't know before runtime? I'd really like to see a >>> np.flip(arr, axis) method which lets me specify which axis to flip on. >> >> I have something like that that's a few lines long: >> >> https://github.com/nipy/nibabel/blob/master/nibabel/orientations.py#L231 >> >> Cheers, >> >> Matthew >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> > > Thanks, Matthew! Should this go into numpy itself? Don't see why not. > If so, I could > prepare a PR, if you point me to the right place (file) to put it. I don't think there's a lot of rigid logic to how numpy's source is laid out. numpy/lib/function_base.py maybe, or next to flipud/fliplr in numpy/lib/twodim_base.py? 
-n From scott.sinclair.za at gmail.com Fri Jul 12 05:02:16 2013 From: scott.sinclair.za at gmail.com (Scott Sinclair) Date: Fri, 12 Jul 2013 11:02:16 +0200 Subject: [Numpy-discussion] f2py and setup.py how can I specify where the .so file goes? In-Reply-To: References: Message-ID: On 10 July 2013 17:50, Jose Gomez-Dans wrote: > Hi, > I am building a package that exposes some Fortran libraries through f2py. > The packages directory looks like this: > setup.py > my_pack/ > | > |---------->__init__.py > |----------> some.pyf > |-----------> code.f90 > > I thoughat that once installed, I'd get the .so and __init__.py in the same > directory (namely ~/.local/lib/python2.7/site-packages/my_pack/). However, I > get > ~/.local/lib/python2.7/site-packages/mypack_fortran.so > ~/.local/lib/python2.7/site-packages/my_pack__fortran-1.0.2-py2.7.egg-info > ~/.local/lib/python2.7/site-packages/my_pack/__init__.py > > Thet setup file is this at the end, I am clearly missing some option here to > move the *.so into the my_pack directory.... Anybody know which one? > > Cheers > Jose > > [setup.py] > > #!/usr/bin/env python > > > def configuration(parent_package='',top_path=None): > from numpy.distutils.misc_util import Configuration > config = Configuration(parent_package,top_path) > config.add_extension('mypack_fortran', ['the_pack/code.f90'] ) > return config > > if __name__ == "__main__": > from numpy.distutils.core import setup > # Global variables for this extension: > name = "mypack_fortran" # name of the generated python > extension (.so) > description = "blah" > author = "" > author_email = "" > > setup( name=name,\ > description=description, \ > author=author, \ > author_email = author_email, \ > configuration = configuration, version="1.0.2",\ > packages=["my_pack"]) Something like the following should work... from numpy.distutils.core import setup, Extension my_ext = Extension(name = 'my_pack._fortran', sources = ['my_pack/code.f90']) if __name__ == "__main__": setup(name = 'my_pack', description = ..., author =..., author_email = ..., version = ..., packages = ['my_pack'], ext_modules = [my_ext], ) Cheers, Scott From sebastian at sipsolutions.net Fri Jul 12 08:38:08 2013 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Fri, 12 Jul 2013 14:38:08 +0200 Subject: [Numpy-discussion] Allow == and != to raise errors Message-ID: <1373632688.13968.13.camel@sebastian-laptop> Hey, the array comparisons == and != never raise errors but instead simply return False for invalid comparisons. The main example are arrays of non-matching dimensions, and object arrays with invalid element-wise comparisons: In [1]: np.array([1,2,3]) == np.array([1,2]) Out[1]: False In [2]: np.array([1, np.array([2, 3])], dtype=object) == [1, 2] Out[2]: False This seems wrong to me, and I am sure not just me. I doubt any large projects makes use of such comparisons and assume that most would prefer the shape mismatch to raise an error, so I would like to change it. But I am a bit unsure especially about smaller projects. So to keep the transition a bit safer could imagine implementing a FutureWarning for these cases (and that would at least notify new users that what they are doing doesn't seem like the right thing). So the question is: Is such a change safe enough, or is there some good reason for the current behavior that I am missing? 
Regards, Sebastian (There may be other issues with structured types that would continue returning False I think, because neither side knows how to compare) From ben.root at ou.edu Fri Jul 12 09:13:51 2013 From: ben.root at ou.edu (Benjamin Root) Date: Fri, 12 Jul 2013 09:13:51 -0400 Subject: [Numpy-discussion] Allow == and != to raise errors In-Reply-To: <1373632688.13968.13.camel@sebastian-laptop> References: <1373632688.13968.13.camel@sebastian-laptop> Message-ID: I can see where you are getting at, but I would have to disagree. First of all, when a comparison between two mis-shaped arrays occur, you get back a bone fide python boolean, not a numpy array of bools. So if any action was taken on the result of such a comparison assumed that the result was some sort of an array, it would fail (yes, this does make it a bit difficult to trace back the source of the problem, but not impossible). Second, no semantics are broken with this. Are the arrays equal or not? If they weren't broadcastible, then returning False for == and True for != makes perfect sense to me. At least, that is my take on it. Cheers! Ben Root On Fri, Jul 12, 2013 at 8:38 AM, Sebastian Berg wrote: > Hey, > > the array comparisons == and != never raise errors but instead simply > return False for invalid comparisons. > > The main example are arrays of non-matching dimensions, and object > arrays with invalid element-wise comparisons: > > In [1]: np.array([1,2,3]) == np.array([1,2]) > Out[1]: False > > In [2]: np.array([1, np.array([2, 3])], dtype=object) == [1, 2] > Out[2]: False > > This seems wrong to me, and I am sure not just me. I doubt any large > projects makes use of such comparisons and assume that most would prefer > the shape mismatch to raise an error, so I would like to change it. But > I am a bit unsure especially about smaller projects. So to keep the > transition a bit safer could imagine implementing a FutureWarning for > these cases (and that would at least notify new users that what they are > doing doesn't seem like the right thing). > > So the question is: Is such a change safe enough, or is there some good > reason for the current behavior that I am missing? > > Regards, > > Sebastian > > (There may be other issues with structured types that would continue > returning False I think, because neither side knows how to compare) > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jgomezdans at gmail.com Fri Jul 12 09:26:19 2013 From: jgomezdans at gmail.com (Jose Gomez-Dans) Date: Fri, 12 Jul 2013 14:26:19 +0100 Subject: [Numpy-discussion] f2py and setup.py how can I specify where the .so file goes? In-Reply-To: References: Message-ID: Hi Scott, thanks for your help. On 12 July 2013 10:02, Scott Sinclair wrote: > > Something like the following should work... [...] > Your suggestion works like what I already had. The issue is that the .so created by the Extension is copied to copying /lib/python2.7/site-packages/ and not to /lib/python2.7/site-packages/my_pack As it is, Python finds it with no problems (as site-packages is in the PYTHONPATH), but I'm worried that that might not be the case with all possible setups. But maybe that's the way it's suppossed to work. Thanks! Jose -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From gregorio.bastardo at gmail.com Fri Jul 12 10:41:04 2013 From: gregorio.bastardo at gmail.com (Gregorio Bastardo) Date: Fri, 12 Jul 2013 16:41:04 +0200 Subject: [Numpy-discussion] read-only or immutable masked array Message-ID: Hi, I use masked arrays to mark missing values in data and found it very convenient, although sometimes counterintuitive. I'd like to make a pool of masked arrays (shared between several processing steps) read-only (both data and mask property) to protect the arrays from accidental modification (and the array users from hours of debugging). The regular ndarray trick array.flags.writeable = False is perfectly fine, but it does not work on ma-s. Moreover, mask hardening only protects masked elements, and does not raise error (as I'd expect). Could you recommend an easy way to set an ma read-only? Thanks, Gregorio From stefan at sun.ac.za Fri Jul 12 11:45:30 2013 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Fri, 12 Jul 2013 17:45:30 +0200 Subject: [Numpy-discussion] read-only or immutable masked array In-Reply-To: References: Message-ID: On Fri, Jul 12, 2013 at 4:41 PM, Gregorio Bastardo wrote: > array.flags.writeable = False > > is perfectly fine, but it does not work on ma-s. Moreover, mask > hardening only protects masked elements, and does not raise error (as > I'd expect). You probably have to modify the underlying array and mask: x = np.ma.array(...) x.mask.flags.writeable = False x.data.flags.writeable = False St?fan From alan.isaac at gmail.com Fri Jul 12 14:53:44 2013 From: alan.isaac at gmail.com (Alan G Isaac) Date: Fri, 12 Jul 2013 14:53:44 -0400 Subject: [Numpy-discussion] numpy.sign query Message-ID: <51E050B8.6090006@gmail.com> The docs for numpy.sign at http://docs.scipy.org/doc/numpy/reference/generated/numpy.sign.html do not indicate how complex numbers are handled. Currently, np.sign appears to return the sign of the real part as a complex value. Was this an explicit choice? Was x/abs(x) considered (for non-zero elements)? Thanks, Alan Isaac From nouiz at nouiz.org Fri Jul 12 15:35:51 2013 From: nouiz at nouiz.org (=?ISO-8859-1?Q?Fr=E9d=E9ric_Bastien?=) Date: Fri, 12 Jul 2013 15:35:51 -0400 Subject: [Numpy-discussion] Allow == and != to raise errors In-Reply-To: References: <1373632688.13968.13.camel@sebastian-laptop> Message-ID: I also don't like that idea, but I'm not able to come to a good reasoning like Benjamin. I don't see advantage to this change and the reason isn't good enough to justify breaking the interface I think. But I don't think we rely on this, so if the change goes in, it probably won't break stuff or they will be easily seen and repared. Fred On Fri, Jul 12, 2013 at 9:13 AM, Benjamin Root wrote: > I can see where you are getting at, but I would have to disagree. First > of all, when a comparison between two mis-shaped arrays occur, you get back > a bone fide python boolean, not a numpy array of bools. So if any action > was taken on the result of such a comparison assumed that the result was > some sort of an array, it would fail (yes, this does make it a bit > difficult to trace back the source of the problem, but not impossible). > > Second, no semantics are broken with this. Are the arrays equal or not? If > they weren't broadcastible, then returning False for == and True for != > makes perfect sense to me. At least, that is my take on it. > > Cheers! 
> Ben Root > > > > On Fri, Jul 12, 2013 at 8:38 AM, Sebastian Berg < > sebastian at sipsolutions.net> wrote: > >> Hey, >> >> the array comparisons == and != never raise errors but instead simply >> return False for invalid comparisons. >> >> The main example are arrays of non-matching dimensions, and object >> arrays with invalid element-wise comparisons: >> >> In [1]: np.array([1,2,3]) == np.array([1,2]) >> Out[1]: False >> >> In [2]: np.array([1, np.array([2, 3])], dtype=object) == [1, 2] >> Out[2]: False >> >> This seems wrong to me, and I am sure not just me. I doubt any large >> projects makes use of such comparisons and assume that most would prefer >> the shape mismatch to raise an error, so I would like to change it. But >> I am a bit unsure especially about smaller projects. So to keep the >> transition a bit safer could imagine implementing a FutureWarning for >> these cases (and that would at least notify new users that what they are >> doing doesn't seem like the right thing). >> >> So the question is: Is such a change safe enough, or is there some good >> reason for the current behavior that I am missing? >> >> Regards, >> >> Sebastian >> >> (There may be other issues with structured types that would continue >> returning False I think, because neither side knows how to compare) >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From tmp50 at ukr.net Fri Jul 12 15:46:12 2013 From: tmp50 at ukr.net (Dmitrey) Date: Fri, 12 Jul 2013 22:46:12 +0300 Subject: [Numpy-discussion] new free software for knapsack problem Message-ID: <1373658171.847348466.d5wdxm7s@fmst-1.ukr.net> Hi all, FYI new free software for knapsack problem ( http://en.wikipedia.org/wiki/Knapsack_problem ) has been made (written in Python language); it can solve possibly constrained, possibly (with interalg ) nonlinear and multiobjective problems with specifiable accuracy. Along with interalg lots of? MILP ? solvers can be used. See http://openopt.org/KSP for details. Regards, Dmitrey. -------------- next part -------------- An HTML attachment was scrubbed... URL: From lists at hilboll.de Fri Jul 12 17:14:40 2013 From: lists at hilboll.de (Andreas Hilboll) Date: Fri, 12 Jul 2013 23:14:40 +0200 Subject: [Numpy-discussion] flip array on axis In-Reply-To: References: <51DD776F.4040306@hilboll.de> Message-ID: <51E071C0.7050504@hilboll.de> Am 10.07.2013 17:06, schrieb Matthew Brett: > Hi, > > On Wed, Jul 10, 2013 at 11:02 AM, Andreas Hilboll wrote: >> Hi, >> >> there are np.flipud and np.fliplr methods to flip 2d arrays on the first >> and second dimension, respectively. What can I do to flip an array on an >> axis which I don't know before runtime? I'd really like to see a >> np.flip(arr, axis) method which lets me specify which axis to flip on. 
> > I have something like that that's a few lines long: > > https://github.com/nipy/nibabel/blob/master/nibabel/orientations.py#L231 > > Cheers, > > Matthew > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > Hi Matthew, is it okay with you as the original author in nipy if I copy the flip_axis function to numpy, more or less verbatim, including tests? Cheers, Andreas. From josef.pktd at gmail.com Fri Jul 12 19:29:07 2013 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Fri, 12 Jul 2013 19:29:07 -0400 Subject: [Numpy-discussion] Allow == and != to raise errors In-Reply-To: References: <1373632688.13968.13.camel@sebastian-laptop> Message-ID: On Fri, Jul 12, 2013 at 3:35 PM, Fr?d?ric Bastien wrote: > I also don't like that idea, but I'm not able to come to a good reasoning > like Benjamin. > > I don't see advantage to this change and the reason isn't good enough to > justify breaking the interface I think. > > But I don't think we rely on this, so if the change goes in, it probably > won't break stuff or they will be easily seen and repared. > > Fred > > > On Fri, Jul 12, 2013 at 9:13 AM, Benjamin Root wrote: >> >> I can see where you are getting at, but I would have to disagree. First >> of all, when a comparison between two mis-shaped arrays occur, you get back >> a bone fide python boolean, not a numpy array of bools. So if any action was >> taken on the result of such a comparison assumed that the result was some >> sort of an array, it would fail (yes, this does make it a bit difficult to >> trace back the source of the problem, but not impossible). >> >> Second, no semantics are broken with this. Are the arrays equal or not? If >> they weren't broadcastible, then returning False for == and True for != >> makes perfect sense to me. At least, that is my take on it. >> >> Cheers! >> Ben Root >> >> >> >> On Fri, Jul 12, 2013 at 8:38 AM, Sebastian Berg >> wrote: >>> >>> Hey, >>> >>> the array comparisons == and != never raise errors but instead simply >>> return False for invalid comparisons. >>> >>> The main example are arrays of non-matching dimensions, and object >>> arrays with invalid element-wise comparisons: >>> >>> In [1]: np.array([1,2,3]) == np.array([1,2]) >>> Out[1]: False >>> >>> In [2]: np.array([1, np.array([2, 3])], dtype=object) == [1, 2] >>> Out[2]: False >>> >>> This seems wrong to me, and I am sure not just me. I doubt any large >>> projects makes use of such comparisons and assume that most would prefer >>> the shape mismatch to raise an error, so I would like to change it. But >>> I am a bit unsure especially about smaller projects. So to keep the >>> transition a bit safer could imagine implementing a FutureWarning for >>> these cases (and that would at least notify new users that what they are >>> doing doesn't seem like the right thing). >>> >>> So the question is: Is such a change safe enough, or is there some good >>> reason for the current behavior that I am missing? 
>>> >>> Regards, >>> >>> Sebastian >>> >>> (There may be other issues with structured types that would continue >>> returning False I think, because neither side knows how to compare) >>> >>> _______________________________________________ >>> NumPy-Discussion mailing list >>> NumPy-Discussion at scipy.org >>> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > I thought Benjamin sounds pretty convincing, and since I never use this, I don't care. However, I (and I'm pretty convinced all statsmodels code) uses equality comparison only element wise. Getting a boolean back is an indicator for a bug, which is most of the time easy to trace back. There is an inconsistency in the behavior with the inequalities. >>> np.array([1,2,3]) < np.array([1,2]) Traceback (most recent call last): File "", line 1, in ValueError: shape mismatch: objects cannot be broadcast to a single shape >>> np.array([1,2,3]) <= np.array([1,2]) Traceback (most recent call last): File "", line 1, in ValueError: shape mismatch: objects cannot be broadcast to a single shape >>> (np.array([1,2,3]) == np.array([1,2])).any() Traceback (most recent call last): File "", line 1, in AttributeError: 'bool' object has no attribute 'any' The last one could be misleading and difficult to catch. >>> np.any(np.array([1,2,3]) == np.array([1,2])) False numpy 1.5.1 since I'm playing rear guard Josef Josef From charlesr.harris at gmail.com Fri Jul 12 21:25:43 2013 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri, 12 Jul 2013 19:25:43 -0600 Subject: [Numpy-discussion] numpy.sign query In-Reply-To: <51E050B8.6090006@gmail.com> References: <51E050B8.6090006@gmail.com> Message-ID: On Fri, Jul 12, 2013 at 12:53 PM, Alan G Isaac wrote: > The docs for numpy.sign at > http://docs.scipy.org/doc/numpy/reference/generated/numpy.sign.html > do not indicate how complex numbers are handled. Currently, np.sign > appears to return the sign of the real part as a complex value. > Was this an explicit choice? Was x/abs(x) considered (for non-zero > elements)? > > ISTR some discussion of that. Personally, I like the x/abs(x) idea. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Fri Jul 12 21:46:54 2013 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri, 12 Jul 2013 19:46:54 -0600 Subject: [Numpy-discussion] nansum, nanmean, nanvar, nanstd Message-ID: Hi All, I've been working on Benjamin's PR, which I took down as he didn't have time to finish it. I've made the following changes and thought I'd run them past others before putting up a new pull request. 1. The new functions are consolidated with the old ones inn (new) numpy/lib/nanfunctions.py. 2. There is a new test module numpy/lib/tests/test_nanfunctions.py 3. The functions punt to standard routines if the array is not inexact. 4. If the array is inexact, then so must be the optional out and dtype arguments. 5. Nans are returned for all nan axis, no warnings are raised. 6. If cnt - ddof <= 0 the result is Nan for that axis, no warnings are raised. 7. For scalar returns the type of the array, or the type given by the dtype option, is preserved. 
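(As a sketch of what item 7 is meant to guarantee -- an illustration of the
intent, not output from the branch: np.nanmean(np.zeros(3, dtype=np.float32))
would come back as a np.float32 scalar rather than being upcast to float64,
unless a different dtype is passed explicitly.)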
Number 7 does not hold for current mean, var, and std. I propose that those functions be fixed. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From brady.mccary at gmail.com Fri Jul 12 23:00:08 2013 From: brady.mccary at gmail.com (Brady McCary) Date: Fri, 12 Jul 2013 22:00:08 -0500 Subject: [Numpy-discussion] PIL and NumPy Message-ID: NumPy Folks, I want to load images with PIL and then operate on them with NumPy. According to the PIL and NumPy documentation, I would expect the following to work, but it is not. Python 2.7.4 (default, Apr 19 2013, 18:28:01) [GCC 4.7.3] on linux2 Type "help", "copyright", "credits" or "license" for more information. >>> import numpy >>> numpy.version.version >>> >>> import Image >>> Image.VERSION '1.1.7' >>> >>> im = Image.open('big-0.png') >>> im.size (2550, 3300) >>> >>> ar = numpy.asarray(im) >>> ar.size 1 >>> ar.shape () >>> ar array(, dtype=object) By "not working" I mean that I would have expected the data to be loaded/available in ar. PIL and NumPy/SciPy seem to be working fine independently of each other. Any guidance? Brady From brady.mccary at gmail.com Fri Jul 12 23:50:26 2013 From: brady.mccary at gmail.com (Brady McCary) Date: Fri, 12 Jul 2013 22:50:26 -0500 Subject: [Numpy-discussion] PIL and NumPy In-Reply-To: References: Message-ID: NumPy Folks, Sorry for the self-reply, but I have determined that this may have something to do with an alpha channel being present. When I remove the alpha channel, things appear to work as I expect. Any discussion on the matter? Brady On Fri, Jul 12, 2013 at 10:00 PM, Brady McCary wrote: > NumPy Folks, > > I want to load images with PIL and then operate on them with NumPy. > According to the PIL and NumPy documentation, I would expect the > following to work, but it is not. > > > > Python 2.7.4 (default, Apr 19 2013, 18:28:01) > [GCC 4.7.3] on linux2 > Type "help", "copyright", "credits" or "license" for more information. >>>> import numpy >>>> numpy.version.version >>>> >>>> import Image >>>> Image.VERSION > '1.1.7' >>>> >>>> im = Image.open('big-0.png') >>>> im.size > (2550, 3300) >>>> >>>> ar = numpy.asarray(im) >>>> ar.size > 1 >>>> ar.shape > () >>>> ar > array( 0x1E5BA70>, dtype=object) > > > > By "not working" I mean that I would have expected the data to be > loaded/available in ar. PIL and NumPy/SciPy seem to be working fine > independently of each other. Any guidance? > > Brady From sebastian at sipsolutions.net Sat Jul 13 06:26:45 2013 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Sat, 13 Jul 2013 12:26:45 +0200 Subject: [Numpy-discussion] Allow == and != to raise errors In-Reply-To: References: <1373632688.13968.13.camel@sebastian-laptop> Message-ID: <1373711205.31992.20.camel@sebastian-laptop> On Fri, 2013-07-12 at 19:29 -0400, josef.pktd at gmail.com wrote: > On Fri, Jul 12, 2013 at 3:35 PM, Fr?d?ric Bastien wrote: > > I also don't like that idea, but I'm not able to come to a good reasoning > > like Benjamin. > > > > I don't see advantage to this change and the reason isn't good enough to > > justify breaking the interface I think. > > > > But I don't think we rely on this, so if the change goes in, it probably > > won't break stuff or they will be easily seen and repared. > > > > Fred > > I thought Benjamin sounds pretty convincing, and since I never use > this, I don't care. > > However, I (and I'm pretty convinced all statsmodels code) uses > equality comparison only element wise. 
Getting a boolean back is an > indicator for a bug, which is most of the time easy to trace back. > > There is an inconsistency in the behavior with the inequalities. > Well, I guess I tend to think on the purity side of things. And the comparisons currently mix container and element-wise comparison up. It seems to me that it can lead to bugs, though I suppose it is unlikely to really hit anyone. One thing that keeping the behaviour means, is that the object array comparisons will be a little buggy (you get False for the whole array, when an element comparison gives an error). Though I admit, that for example arrays inside containers make any equality for the container quirky, since arrays cannot define a truth value. But if there is concern that this really could break code I won't try to press for it. - Sebastian > >>> np.array([1,2,3]) < np.array([1,2]) > Traceback (most recent call last): > File "", line 1, in > ValueError: shape mismatch: objects cannot be broadcast to a single shape > > >>> np.array([1,2,3]) <= np.array([1,2]) > Traceback (most recent call last): > File "", line 1, in > ValueError: shape mismatch: objects cannot be broadcast to a single shape > > >>> (np.array([1,2,3]) == np.array([1,2])).any() > Traceback (most recent call last): > File "", line 1, in > AttributeError: 'bool' object has no attribute 'any' > > > The last one could be misleading and difficult to catch. > > >>> np.any(np.array([1,2,3]) == np.array([1,2])) > False > > numpy 1.5.1 since I'm playing rear guard > > Josef > > > Josef > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From gregorio.bastardo at gmail.com Sat Jul 13 07:36:49 2013 From: gregorio.bastardo at gmail.com (Gregorio Bastardo) Date: Sat, 13 Jul 2013 13:36:49 +0200 Subject: [Numpy-discussion] read-only or immutable masked array In-Reply-To: References: Message-ID: Hi St?fan, Thanks for the suggestion, but it does not protect the array: >>> x = np.ma.masked_array(xrange(4), [0,1,0,1]) >>> x masked_array(data = [0 -- 2 --], mask = [False True False True], fill_value = 999999) >>> x.mask.flags.writeable = False >>> x.data.flags.writeable = False >>> x.data.flags.writeable True >>> x.mask.flags.writeable False >>> x[0] = -1 >>> x masked_array(data = [-1 -- 2 --], mask = [False True False True], fill_value = 999999) Is there a working solution for this problem? Thanks, Gregorio 2013/7/12 St?fan van der Walt : > On Fri, Jul 12, 2013 at 4:41 PM, Gregorio Bastardo > wrote: >> array.flags.writeable = False >> >> is perfectly fine, but it does not work on ma-s. Moreover, mask >> hardening only protects masked elements, and does not raise error (as >> I'd expect). > > You probably have to modify the underlying array and mask: > > x = np.ma.array(...) > x.mask.flags.writeable = False > x.data.flags.writeable = False > > St?fan > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From njs at pobox.com Sat Jul 13 09:14:42 2013 From: njs at pobox.com (Nathaniel Smith) Date: Sat, 13 Jul 2013 14:14:42 +0100 Subject: [Numpy-discussion] Allow == and != to raise errors In-Reply-To: References: <1373632688.13968.13.camel@sebastian-laptop> Message-ID: On Fri, Jul 12, 2013 at 2:13 PM, Benjamin Root wrote: > I can see where you are getting at, but I would have to disagree. 
First of > all, when a comparison between two mis-shaped arrays occur, you get back a > bone fide python boolean, not a numpy array of bools. So if any action was > taken on the result of such a comparison assumed that the result was some > sort of an array, it would fail (yes, this does make it a bit difficult to > trace back the source of the problem, but not impossible). > > Second, no semantics are broken with this. Are the arrays equal or not? If > they weren't broadcastible, then returning False for == and True for != > makes perfect sense to me. At least, that is my take on it. But it does break semantics. Sure, it tells you that the arrays aren't equal -- but that's not the question you asked. "==" is not "are these arrays equal"; it's "is each pair of broadcasted aligned elements in these arrays equal", and these are totally different operations. It's unfortunate that "==" is a somewhat confusing name, but that's no reason to mix things up like this. "+" in python sometimes means "add all elements" and sometimes means "concatenate", but no-one would argue that ndarray.__add__ should the former when the arrays were broadcastable and the latter when they weren't. This is the same thing. "Errors should never pass silently", "In the face of ambiguity, refuse the temptation to guess." There's really no sensible interface here -- notice that '==' can return False but can never return True, and Josef gave an example of where it can silently produce misleading results. So to me it seems like a clear bug, but one of the sort that has a higher probability than usual that someone somewhere is depending on it... which makes it less clear what exactly to do with it. I guess one option is to just start raising errors in the first RC and see whether anyone complains! But people people don't seem to test the RCs enough to make this entirely reliable :-(. -n From josef.pktd at gmail.com Sat Jul 13 11:28:06 2013 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Sat, 13 Jul 2013 11:28:06 -0400 Subject: [Numpy-discussion] Allow == and != to raise errors In-Reply-To: References: <1373632688.13968.13.camel@sebastian-laptop> Message-ID: On Sat, Jul 13, 2013 at 9:14 AM, Nathaniel Smith wrote: > On Fri, Jul 12, 2013 at 2:13 PM, Benjamin Root wrote: >> I can see where you are getting at, but I would have to disagree. First of >> all, when a comparison between two mis-shaped arrays occur, you get back a >> bone fide python boolean, not a numpy array of bools. So if any action was >> taken on the result of such a comparison assumed that the result was some >> sort of an array, it would fail (yes, this does make it a bit difficult to >> trace back the source of the problem, but not impossible). >> >> Second, no semantics are broken with this. Are the arrays equal or not? If >> they weren't broadcastible, then returning False for == and True for != >> makes perfect sense to me. At least, that is my take on it. > > But it does break semantics. Sure, it tells you that the arrays aren't > equal -- but that's not the question you asked. "==" is not "are these > arrays equal"; it's "is each pair of broadcasted aligned elements in > these arrays equal", and these are totally different operations. It's > unfortunate that "==" is a somewhat confusing name, but that's no > reason to mix things up like this. 
"+" in python sometimes means "add > all elements" and sometimes means "concatenate", but no-one would > argue that ndarray.__add__ should the former when the arrays were > broadcastable and the latter when they weren't. This is the same > thing. > > "Errors should never pass silently", "In the face of ambiguity, refuse > the temptation to guess." > > There's really no sensible interface here -- notice that '==' can > return False but can never return True, and Josef gave an example of > where it can silently produce misleading results. So to me it seems > like a clear bug, but one of the sort that has a higher probability > than usual that someone somewhere is depending on it... which makes it > less clear what exactly to do with it. > > I guess one option is to just start raising errors in the first RC and > see whether anyone complains! But people people don't seem to test the > RCs enough to make this entirely reliable :-(. I'm now +1 on the exception that Sebastian proposed. I like consistency, and having a more straightforward mental model of the numpy behavior for elementwise operations, that don't pretend sometimes to be "python" (when I'm doing array math), like this >>> [1,2,3] < [1,2] False >>> [1,2,3] > [1,2] True Josef > > -n > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From alan.isaac at gmail.com Sat Jul 13 11:30:08 2013 From: alan.isaac at gmail.com (Alan G Isaac) Date: Sat, 13 Jul 2013 11:30:08 -0400 Subject: [Numpy-discussion] add .H attribute? In-Reply-To: References: <51D98902.1090403@gmail.com> Message-ID: <51E17280.2030105@gmail.com> > On Sun, Jul 7, 2013 at 9:28 AM, Alan G Isaac > wrote: > I miss being able to spell a.conj().T as a.H, as one can > with numpy matrices. On 7/7/2013 4:49 PM, Charles R Harris wrote: > There was a long thread about this back around 1.1 or so, > long time ago in any case. IIRC correctly, Travis was > opposed. I think part of the problem was that arr.T is > a view, but arr.H would not be. Probably it could be be > made to return an iterator that performed the conjugation, > or we could simply return a new array. I'm not opposed > myself, but I'd have to review the old discussion to see > if there was good reason not to have it in the first > place. I think the original discussion of an abs method > took place about the same time. If not being a view is determinative, could a .ct() method be considered? Or would the objection apply there too? Thanks, Alan From njs at pobox.com Sat Jul 13 13:46:12 2013 From: njs at pobox.com (Nathaniel Smith) Date: Sat, 13 Jul 2013 18:46:12 +0100 Subject: [Numpy-discussion] add .H attribute? In-Reply-To: <51E17280.2030105@gmail.com> References: <51D98902.1090403@gmail.com> <51E17280.2030105@gmail.com> Message-ID: On 13 Jul 2013 16:30, "Alan G Isaac" wrote: > > > On Sun, Jul 7, 2013 at 9:28 AM, Alan G Isaac > wrote: > > I miss being able to spell a.conj().T as a.H, as one can > > with numpy matrices. > > > On 7/7/2013 4:49 PM, Charles R Harris wrote: > > There was a long thread about this back around 1.1 or so, > > long time ago in any case. IIRC correctly, Travis was > > opposed. I think part of the problem was that arr.T is > > a view, but arr.H would not be. Probably it could be be > > made to return an iterator that performed the conjugation, > > or we could simply return a new array. 
I'm not opposed > > myself, but I'd have to review the old discussion to see > > if there was good reason not to have it in the first > > place. I think the original discussion of an abs method > > took place about the same time. > > > If not being a view is determinative, could a .ct() method > be considered? Or would the objection apply there too? Why not just write def H(a): return a.conj().T in your local namespace? The resulting code will be even more concise than if we had a .ct() method. ndarray has way too many attributes already IMHO (though I realize this may be a minority view). -n -------------- next part -------------- An HTML attachment was scrubbed... URL: From pav at iki.fi Sat Jul 13 14:31:32 2013 From: pav at iki.fi (Pauli Virtanen) Date: Sat, 13 Jul 2013 21:31:32 +0300 Subject: [Numpy-discussion] add .H attribute? In-Reply-To: References: <51D98902.1090403@gmail.com> <51E17280.2030105@gmail.com> Message-ID: 13.07.2013 20:46, Nathaniel Smith kirjoitti: [clip] > Why not just write > > def H(a): > return a.conj().T In long expressions, this puts H to the wrong side. -- Pauli Virtanen From alan.isaac at gmail.com Sat Jul 13 16:37:09 2013 From: alan.isaac at gmail.com (Alan G Isaac) Date: Sat, 13 Jul 2013 16:37:09 -0400 Subject: [Numpy-discussion] add .H attribute? In-Reply-To: References: <51D98902.1090403@gmail.com> <51E17280.2030105@gmail.com> Message-ID: <51E1BA75.4000806@gmail.com> On 7/13/2013 1:46 PM, Nathaniel Smith wrote: > Why not just write > > def H(a): > return a.conj().T > > in your local namespace? > First of all, I am sympathetic to being conservative about the addition of attributes! But the question about adding a.H about the possibility of improving - speed (relative to adding a function of my own) - readability (including error-free readability of others' code) - consistency (across code bases and objects) - competitiveness (with other array languages) - convenience (including key strokes) I agree that there are alternatives for the last of these. Alan From pgmdevlist at gmail.com Sun Jul 14 09:55:32 2013 From: pgmdevlist at gmail.com (Pierre GM) Date: Sun, 14 Jul 2013 15:55:32 +0200 Subject: [Numpy-discussion] read-only or immutable masked array In-Reply-To: References: Message-ID: <5585563F-9DEA-4142-A60E-6F8028E53D28@gmail.com> On Jul 13, 2013, at 13:36 , Gregorio Bastardo wrote: > Hi St?fan, > > Thanks for the suggestion, but it does not protect the array: Thinking about it, it can't: when `x` is a MaskedArray, `x.data` is just a view of the underlying array as a regular ndarray. As far as I understand, changing the `.flags` of a view doesn't affect the original. I'm a bit surprised, though. Here's what I tried >>> np.version.version <<< 1.7.0 >>> x = np.ma.array([1,2,3], mask=[0,1,0]) >>> x.flags.writeable=False >>> x[0]=-1 <<< ValueError: assignment destination is read-only What did you mean by >>> array.flags.writeable = False >>> >>> is perfectly fine, but it does not work on ma-s. ? Could you post what you did and what you got? >>> Moreover, mask >>> hardening only protects masked elements, and does not raise error (as >>> I'd expect). Yes, that's how it supposed to work. From charlesr.harris at gmail.com Sun Jul 14 14:55:17 2013 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sun, 14 Jul 2013 12:55:17 -0600 Subject: [Numpy-discussion] What should be the result in some statistics corner cases? Message-ID: Some corner cases in the mean, var, std. *Empty arrays* I think these cases should either raise an error or just return nan. 
Warnings seem ineffective to me as they are only issued once by default. In [3]: ones(0).mean() /home/charris/.local/lib/python2.7/site-packages/numpy/core/_methods.py:61: RuntimeWarning: invalid value encountered in double_scalars ret = ret / float(rcount) Out[3]: nan In [4]: ones(0).var() /home/charris/.local/lib/python2.7/site-packages/numpy/core/_methods.py:76: RuntimeWarning: invalid value encountered in true_divide out=arrmean, casting='unsafe', subok=False) /home/charris/.local/lib/python2.7/site-packages/numpy/core/_methods.py:100: RuntimeWarning: invalid value encountered in double_scalars ret = ret / float(rcount) Out[4]: nan In [5]: ones(0).std() /home/charris/.local/lib/python2.7/site-packages/numpy/core/_methods.py:76: RuntimeWarning: invalid value encountered in true_divide out=arrmean, casting='unsafe', subok=False) /home/charris/.local/lib/python2.7/site-packages/numpy/core/_methods.py:100: RuntimeWarning: invalid value encountered in double_scalars ret = ret / float(rcount) Out[5]: nan *ddof >= number of elements* I think these should just raise errors. The results for ddof >= #elements is happenstance, and certainly negative numbers should never be returned. In [6]: ones(2).var(ddof=2) /home/charris/.local/lib/python2.7/site-packages/numpy/core/_methods.py:100: RuntimeWarning: invalid value encountered in double_scalars ret = ret / float(rcount) Out[6]: nan In [7]: ones(2).var(ddof=3) Out[7]: -0.0 * nansum* Currently returns nan for empty arrays. I suspect it should return nan for slices that are all nan, but 0 for empty slices. That would make it consistent with sum in the empty case. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From warren.weckesser at gmail.com Sun Jul 14 16:55:08 2013 From: warren.weckesser at gmail.com (Warren Weckesser) Date: Sun, 14 Jul 2013 16:55:08 -0400 Subject: [Numpy-discussion] What should be the result in some statistics corner cases? In-Reply-To: References: Message-ID: On 7/14/13, Charles R Harris wrote: > Some corner cases in the mean, var, std. > > *Empty arrays* > > I think these cases should either raise an error or just return nan. > Warnings seem ineffective to me as they are only issued once by default. > > In [3]: ones(0).mean() > /home/charris/.local/lib/python2.7/site-packages/numpy/core/_methods.py:61: > RuntimeWarning: invalid value encountered in double_scalars > ret = ret / float(rcount) > Out[3]: nan > > In [4]: ones(0).var() > /home/charris/.local/lib/python2.7/site-packages/numpy/core/_methods.py:76: > RuntimeWarning: invalid value encountered in true_divide > out=arrmean, casting='unsafe', subok=False) > /home/charris/.local/lib/python2.7/site-packages/numpy/core/_methods.py:100: > RuntimeWarning: invalid value encountered in double_scalars > ret = ret / float(rcount) > Out[4]: nan > > In [5]: ones(0).std() > /home/charris/.local/lib/python2.7/site-packages/numpy/core/_methods.py:76: > RuntimeWarning: invalid value encountered in true_divide > out=arrmean, casting='unsafe', subok=False) > /home/charris/.local/lib/python2.7/site-packages/numpy/core/_methods.py:100: > RuntimeWarning: invalid value encountered in double_scalars > ret = ret / float(rcount) > Out[5]: nan > > *ddof >= number of elements* > > I think these should just raise errors. The results for ddof >= #elements > is happenstance, and certainly negative numbers should never be returned. 
> > In [6]: ones(2).var(ddof=2) > /home/charris/.local/lib/python2.7/site-packages/numpy/core/_methods.py:100: > RuntimeWarning: invalid value encountered in double_scalars > ret = ret / float(rcount) > Out[6]: nan > > In [7]: ones(2).var(ddof=3) > Out[7]: -0.0 > * > nansum* > > Currently returns nan for empty arrays. I suspect it should return nan for > slices that are all nan, but 0 for empty slices. That would make it > consistent with sum in the empty case. > For nansum, I would expect 0 even in the case of all nans. The point of these functions is to simply ignore nans, correct? So I would aim for this behaviour: nanfunc(x) behaves the same as func(x[~isnan(x)]) Warren > Chuck > From charlesr.harris at gmail.com Sun Jul 14 17:35:29 2013 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sun, 14 Jul 2013 15:35:29 -0600 Subject: [Numpy-discussion] What should be the result in some statistics corner cases? In-Reply-To: References: Message-ID: On Sun, Jul 14, 2013 at 2:55 PM, Warren Weckesser < warren.weckesser at gmail.com> wrote: > On 7/14/13, Charles R Harris wrote: > > Some corner cases in the mean, var, std. > > > > *Empty arrays* > > > > I think these cases should either raise an error or just return nan. > > Warnings seem ineffective to me as they are only issued once by default. > > > > In [3]: ones(0).mean() > > > /home/charris/.local/lib/python2.7/site-packages/numpy/core/_methods.py:61: > > RuntimeWarning: invalid value encountered in double_scalars > > ret = ret / float(rcount) > > Out[3]: nan > > > > In [4]: ones(0).var() > > > /home/charris/.local/lib/python2.7/site-packages/numpy/core/_methods.py:76: > > RuntimeWarning: invalid value encountered in true_divide > > out=arrmean, casting='unsafe', subok=False) > > > /home/charris/.local/lib/python2.7/site-packages/numpy/core/_methods.py:100: > > RuntimeWarning: invalid value encountered in double_scalars > > ret = ret / float(rcount) > > Out[4]: nan > > > > In [5]: ones(0).std() > > > /home/charris/.local/lib/python2.7/site-packages/numpy/core/_methods.py:76: > > RuntimeWarning: invalid value encountered in true_divide > > out=arrmean, casting='unsafe', subok=False) > > > /home/charris/.local/lib/python2.7/site-packages/numpy/core/_methods.py:100: > > RuntimeWarning: invalid value encountered in double_scalars > > ret = ret / float(rcount) > > Out[5]: nan > > > > *ddof >= number of elements* > > > > I think these should just raise errors. The results for ddof >= #elements > > is happenstance, and certainly negative numbers should never be returned. > > > > In [6]: ones(2).var(ddof=2) > > > /home/charris/.local/lib/python2.7/site-packages/numpy/core/_methods.py:100: > > RuntimeWarning: invalid value encountered in double_scalars > > ret = ret / float(rcount) > > Out[6]: nan > > > > In [7]: ones(2).var(ddof=3) > > Out[7]: -0.0 > > * > > nansum* > > > > Currently returns nan for empty arrays. I suspect it should return nan > for > > slices that are all nan, but 0 for empty slices. That would make it > > consistent with sum in the empty case. > > > > > For nansum, I would expect 0 even in the case of all nans. The point > of these functions is to simply ignore nans, correct? So I would aim > for this behaviour: nanfunc(x) behaves the same as func(x[~isnan(x)]) > > Agreed, although that changes current behavior. What about the other cases? Chuck -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From gregorio.bastardo at gmail.com Mon Jul 15 04:04:46 2013 From: gregorio.bastardo at gmail.com (Gregorio Bastardo) Date: Mon, 15 Jul 2013 10:04:46 +0200 Subject: [Numpy-discussion] read-only or immutable masked array In-Reply-To: <5585563F-9DEA-4142-A60E-6F8028E53D28@gmail.com> References: <5585563F-9DEA-4142-A60E-6F8028E53D28@gmail.com> Message-ID: Hi Pierre, > I'm a bit surprised, though. Here's what I tried > >>>> np.version.version > <<< 1.7.0 >>>> x = np.ma.array([1,2,3], mask=[0,1,0]) >>>> x.flags.writeable=False >>>> x[0]=-1 > <<< ValueError: assignment destination is read-only Thanks, it works perfectly =) Sorry, probably have overlooked this simple solution, tried to set x.data and x.mask directly. I noticed that this only protects the data, so mask also has to be set to read-only or be hardened to avoid accidental (un)masking. Gregorio From pgmdevlist at gmail.com Mon Jul 15 07:11:54 2013 From: pgmdevlist at gmail.com (Pierre Gerard-Marchant) Date: Mon, 15 Jul 2013 13:11:54 +0200 Subject: [Numpy-discussion] read-only or immutable masked array In-Reply-To: References: <5585563F-9DEA-4142-A60E-6F8028E53D28@gmail.com> Message-ID: On Jul 15, 2013, at 10:04 , Gregorio Bastardo wrote: > Hi Pierre, > >> I'm a bit surprised, though. Here's what I tried >> >>>>> np.version.version >> <<< 1.7.0 >>>>> x = np.ma.array([1,2,3], mask=[0,1,0]) >>>>> x.flags.writeable=False >>>>> x[0]=-1 >> <<< ValueError: assignment destination is read-only > > Thanks, it works perfectly =) Sorry, probably have overlooked this > simple solution, tried to set x.data and x.mask directly. I noticed > that this only protects the data, so mask also has to be set to > read-only or be hardened to avoid accidental (un)masking. Well, yes and no. Settings the flags of `x` doesn't set (yet) the flags of the mask, that's true. Still, `.writeable=False` should prevent you to unmask data, provided you're not trying to modify the mask directly but use basic assignment like `x[?]=?`. However, assigning `np.ma.masked` to array items does modify the mask and only the mask, hence the absence of error if the array is not writeable. Note as well that hardening the mask only prevents unmasking: you can still grow the mask, which may not be what you want. Use `x.mask.flags.writeable=False` to make the mask really read-only. From gregorio.bastardo at gmail.com Mon Jul 15 08:40:18 2013 From: gregorio.bastardo at gmail.com (Gregorio Bastardo) Date: Mon, 15 Jul 2013 14:40:18 +0200 Subject: [Numpy-discussion] read-only or immutable masked array In-Reply-To: References: <5585563F-9DEA-4142-A60E-6F8028E53D28@gmail.com> Message-ID: Hi Pierre, > Note as well that hardening the mask only prevents unmasking: you can still grow the mask, which may not be what you want. Use `x.mask.flags.writeable=False` to make the mask really read-only. I ran into an unmasking problem with the suggested approach: >>> np.version.version '1.7.0' >>> x = np.ma.masked_array(xrange(4), [0,1,0,1]) >>> x masked_array(data = [0 -- 2 --], mask = [False True False True], fill_value = 999999) >>> x.flags.writeable = False >>> x.mask.flags.writeable = False >>> x.mask[1] = 0 # ok Traceback (most recent call last): ... ValueError: assignment destination is read-only >>> x[1] = 0 # ok Traceback (most recent call last): ... ValueError: assignment destination is read-only >>> x.mask[1] = 0 # ?? >>> x masked_array(data = [0 1 2 --], mask = [False False False True], fill_value = 999999) I noticed that "sharedmask" attribute changes (from True to False) after "x[1] = 0". 
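A minimal sketch of that sequence (same NumPy 1.7 setup as above) makes the
order of events visible; it looks like the failed assignment still unshares
the mask, and the unshared copy comes back writeable:

>>> x = np.ma.masked_array(xrange(4), [0,1,0,1])
>>> x.flags.writeable = False
>>> x.mask.flags.writeable = False
>>> x.sharedmask
True
>>> x[1] = 0
Traceback (most recent call last):
...
ValueError: assignment destination is read-only
>>> x.sharedmask
False
>>> x.mask.flags.writeable
True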
Also, some of the ma operations result mask identity of the new ma, which causes ValueError when the new ma mask is modified: >>> x = np.ma.masked_array(xrange(4), [0,1,0,1]) >>> x.flags.writeable = False >>> x.mask.flags.writeable = False >>> x1 = x > 0 >>> x1.mask is x.mask # ok False >>> x2 = x != 0 >>> x2.mask is x.mask # ?? True >>> x2.mask[1] = 0 Traceback (most recent call last): ... ValueError: assignment destination is read-only which is a bit confusing. And I experienced that *_like operations give mask identity too: >>> y = np.ones_like(x) >>> y.mask is x.mask True but for that I found a recent discussion ("empty_like for masked arrays") on the mailing list: http://mail.scipy.org/pipermail/numpy-discussion/2013-June/066836.html I might be missing something but could you clarify these issues? Thanks, Gregorio From bruno.piguet at gmail.com Mon Jul 15 09:09:12 2013 From: bruno.piguet at gmail.com (bruno Piguet) Date: Mon, 15 Jul 2013 15:09:12 +0200 Subject: [Numpy-discussion] Allow == and != to raise errors In-Reply-To: References: <1373632688.13968.13.camel@sebastian-laptop> Message-ID: Python itself doesn't raise an exception in such cases : >>> (3,4) != (2, 3, 4) True >>> (3,4) == (2, 3, 4) False Should numpy behave differently ? Bruno. 2013/7/12 Fr?d?ric Bastien > I also don't like that idea, but I'm not able to come to a good reasoning > like Benjamin. > > I don't see advantage to this change and the reason isn't good enough to > justify breaking the interface I think. > > But I don't think we rely on this, so if the change goes in, it probably > won't break stuff or they will be easily seen and repared. > > Fred > > > On Fri, Jul 12, 2013 at 9:13 AM, Benjamin Root wrote: > >> I can see where you are getting at, but I would have to disagree. First >> of all, when a comparison between two mis-shaped arrays occur, you get back >> a bone fide python boolean, not a numpy array of bools. So if any action >> was taken on the result of such a comparison assumed that the result was >> some sort of an array, it would fail (yes, this does make it a bit >> difficult to trace back the source of the problem, but not impossible). >> >> Second, no semantics are broken with this. Are the arrays equal or not? >> If they weren't broadcastible, then returning False for == and True for != >> makes perfect sense to me. At least, that is my take on it. >> >> Cheers! >> Ben Root >> >> >> >> On Fri, Jul 12, 2013 at 8:38 AM, Sebastian Berg < >> sebastian at sipsolutions.net> wrote: >> >>> Hey, >>> >>> the array comparisons == and != never raise errors but instead simply >>> return False for invalid comparisons. >>> >>> The main example are arrays of non-matching dimensions, and object >>> arrays with invalid element-wise comparisons: >>> >>> In [1]: np.array([1,2,3]) == np.array([1,2]) >>> Out[1]: False >>> >>> In [2]: np.array([1, np.array([2, 3])], dtype=object) == [1, 2] >>> Out[2]: False >>> >>> This seems wrong to me, and I am sure not just me. I doubt any large >>> projects makes use of such comparisons and assume that most would prefer >>> the shape mismatch to raise an error, so I would like to change it. But >>> I am a bit unsure especially about smaller projects. So to keep the >>> transition a bit safer could imagine implementing a FutureWarning for >>> these cases (and that would at least notify new users that what they are >>> doing doesn't seem like the right thing). 
>>> >>> So the question is: Is such a change safe enough, or is there some good >>> reason for the current behavior that I am missing? >>> >>> Regards, >>> >>> Sebastian >>> >>> (There may be other issues with structured types that would continue >>> returning False I think, because neither side knows how to compare) >>> >>> _______________________________________________ >>> NumPy-Discussion mailing list >>> NumPy-Discussion at scipy.org >>> http://mail.scipy.org/mailman/listinfo/numpy-discussion >>> >> >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From gregorio.bastardo at gmail.com Mon Jul 15 09:33:24 2013 From: gregorio.bastardo at gmail.com (Gregorio Bastardo) Date: Mon, 15 Jul 2013 15:33:24 +0200 Subject: [Numpy-discussion] empty_like for masked arrays Message-ID: Hi, On Mon, Jun 10, 2013 at 3:47 PM, Nathaniel Smith wrote: > Hi all, > > Is there anyone out there using numpy masked arrays, who has an > opinion on how empty_like (and its friends ones_like, zeros_like) > should handle the mask? > > Right now apparently if you call np.ma.empty_like on a masked array, > you get a new masked array that shares the original array's mask, so > modifying one modifies the other. That's almost certainly wrong. This > PR: > https://github.com/numpy/numpy/pull/3404 > makes it so instead the new array has values that are all set to > empty/zero/one, and a mask which is set to match the input array's > mask (so whenever something was masked in the original array, the > empty/zero/one in that place is also masked). We don't know if this is > the desired behaviour for these functions, though. Maybe it's more > intuitive for the new array to match the original array in shape and > dtype, but to always have an empty mask. Or maybe not. None of us > really use np.ma, so if you do and have an opinion then please speak > up... I recently joined the mailing list, so the message might not reach the original thread, sorry for that. I use masked arrays extensively, and would vote for the first option, as I use the *_like operations with the assumption that the resulting array has the same mask as the original. I think it's more intuitive than selecting between all masked or all unmasked behaviour. If it's not too late, please consider my use case. Thanks, Gregorio From charlesr.harris at gmail.com Mon Jul 15 09:52:15 2013 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 15 Jul 2013 07:52:15 -0600 Subject: [Numpy-discussion] What should be the result in some statistics corner cases? In-Reply-To: References: Message-ID: On Sun, Jul 14, 2013 at 3:35 PM, Charles R Harris wrote: > > > On Sun, Jul 14, 2013 at 2:55 PM, Warren Weckesser < > warren.weckesser at gmail.com> wrote: > >> On 7/14/13, Charles R Harris wrote: >> > Some corner cases in the mean, var, std. >> > >> > *Empty arrays* >> > >> > I think these cases should either raise an error or just return nan. >> > Warnings seem ineffective to me as they are only issued once by default. 
>> > >> > In [3]: ones(0).mean() >> > >> /home/charris/.local/lib/python2.7/site-packages/numpy/core/_methods.py:61: >> > RuntimeWarning: invalid value encountered in double_scalars >> > ret = ret / float(rcount) >> > Out[3]: nan >> > >> > In [4]: ones(0).var() >> > >> /home/charris/.local/lib/python2.7/site-packages/numpy/core/_methods.py:76: >> > RuntimeWarning: invalid value encountered in true_divide >> > out=arrmean, casting='unsafe', subok=False) >> > >> /home/charris/.local/lib/python2.7/site-packages/numpy/core/_methods.py:100: >> > RuntimeWarning: invalid value encountered in double_scalars >> > ret = ret / float(rcount) >> > Out[4]: nan >> > >> > In [5]: ones(0).std() >> > >> /home/charris/.local/lib/python2.7/site-packages/numpy/core/_methods.py:76: >> > RuntimeWarning: invalid value encountered in true_divide >> > out=arrmean, casting='unsafe', subok=False) >> > >> /home/charris/.local/lib/python2.7/site-packages/numpy/core/_methods.py:100: >> > RuntimeWarning: invalid value encountered in double_scalars >> > ret = ret / float(rcount) >> > Out[5]: nan >> > >> > *ddof >= number of elements* >> > >> > I think these should just raise errors. The results for ddof >= >> #elements >> > is happenstance, and certainly negative numbers should never be >> returned. >> > >> > In [6]: ones(2).var(ddof=2) >> > >> /home/charris/.local/lib/python2.7/site-packages/numpy/core/_methods.py:100: >> > RuntimeWarning: invalid value encountered in double_scalars >> > ret = ret / float(rcount) >> > Out[6]: nan >> > >> > In [7]: ones(2).var(ddof=3) >> > Out[7]: -0.0 >> > * >> > nansum* >> > >> > Currently returns nan for empty arrays. I suspect it should return nan >> for >> > slices that are all nan, but 0 for empty slices. That would make it >> > consistent with sum in the empty case. >> > >> >> >> For nansum, I would expect 0 even in the case of all nans. The point >> of these functions is to simply ignore nans, correct? So I would aim >> for this behaviour: nanfunc(x) behaves the same as func(x[~isnan(x)]) >> >> > Agreed, although that changes current behavior. What about the other > cases? > > Looks like there isn't much interest in the topic, so I'll just go ahead with the following choices: Non-NaN case 1) Empty array -> ValueError The current behavior with stats is an accident, i.e., the nan arises from 0/0. I like to think that in this case the result is any number, rather than not a number, so *the* value is simply not defined. So in this case raise a ValueError for empty array. 2) ddof >= n -> ValueError If the number of elements, n, is not zero and ddof >= n, raise a ValueError for the ddof value. Nan case 1) Empty array -> Value Error 2) Empty slice -> NaN 3) For slice ddof >= n -> Nan Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Mon Jul 15 10:20:13 2013 From: njs at pobox.com (Nathaniel Smith) Date: Mon, 15 Jul 2013 15:20:13 +0100 Subject: [Numpy-discussion] Allow == and != to raise errors In-Reply-To: References: <1373632688.13968.13.camel@sebastian-laptop> Message-ID: On Mon, Jul 15, 2013 at 2:09 PM, bruno Piguet wrote: > Python itself doesn't raise an exception in such cases : > >>>> (3,4) != (2, 3, 4) > True >>>> (3,4) == (2, 3, 4) > False > > Should numpy behave differently ? 
The numpy equivalent to Python's scalar "==" is called array_equal, and that does indeed behave the same: In [5]: np.array_equal([3, 4], [2, 3, 4]) Out[5]: False But in numpy, the name "==" is shorthand for the ufunc np.equal, which raises an error: In [8]: np.equal([3, 4], [2, 3, 4]) ValueError: operands could not be broadcast together with shapes (2) (3) -n From sebastian at sipsolutions.net Mon Jul 15 10:21:35 2013 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Mon, 15 Jul 2013 16:21:35 +0200 Subject: [Numpy-discussion] Allow == and != to raise errors In-Reply-To: References: <1373632688.13968.13.camel@sebastian-laptop> Message-ID: <1373898095.15619.2.camel@sebastian-laptop> On Mon, 2013-07-15 at 15:09 +0200, bruno Piguet wrote: > Python itself doesn't raise an exception in such cases : > > >>> (3,4) != (2, 3, 4) > True > >>> (3,4) == (2, 3, 4) > False > > Should numpy behave differently ? > Yes, because Python tests whether the tuple is different, not whether the elements are: >>> (3, 4) == (3, 4) True >>> np.array([3, 4]) == np.array([3, 4]) array([ True, True], dtype=bool) So doing the test "like python" *changes* the behaviour. - Sebastian > > Bruno. > > > > 2013/7/12 Fr?d?ric Bastien > I also don't like that idea, but I'm not able to come to a > good reasoning like Benjamin. > > > I don't see advantage to this change and the reason isn't good > enough to justify breaking the interface I think. > > > But I don't think we rely on this, so if the change goes in, > it probably won't break stuff or they will be easily seen and > repared. > > > Fred > > > On Fri, Jul 12, 2013 at 9:13 AM, Benjamin Root > wrote: > I can see where you are getting at, but I would have > to disagree. First of all, when a comparison between > two mis-shaped arrays occur, you get back a bone fide > python boolean, not a numpy array of bools. So if any > action was taken on the result of such a comparison > assumed that the result was some sort of an array, it > would fail (yes, this does make it a bit difficult to > trace back the source of the problem, but not > impossible). > > > Second, no semantics are broken with this. Are the > arrays equal or not? If they weren't broadcastible, > then returning False for == and True for != makes > perfect sense to me. At least, that is my take on it. > > > Cheers! > > Ben Root > > > > > On Fri, Jul 12, 2013 at 8:38 AM, Sebastian Berg > wrote: > Hey, > > the array comparisons == and != never raise > errors but instead simply > return False for invalid comparisons. > > The main example are arrays of non-matching > dimensions, and object > arrays with invalid element-wise comparisons: > > In [1]: np.array([1,2,3]) == np.array([1,2]) > Out[1]: False > > In [2]: np.array([1, np.array([2, 3])], > dtype=object) == [1, 2] > Out[2]: False > > This seems wrong to me, and I am sure not just > me. I doubt any large > projects makes use of such comparisons and > assume that most would prefer > the shape mismatch to raise an error, so I > would like to change it. But > I am a bit unsure especially about smaller > projects. So to keep the > transition a bit safer could imagine > implementing a FutureWarning for > these cases (and that would at least notify > new users that what they are > doing doesn't seem like the right thing). > > So the question is: Is such a change safe > enough, or is there some good > reason for the current behavior that I am > missing? 
> > Regards, > > Sebastian > > (There may be other issues with structured > types that would continue > returning False I think, because neither side > knows how to compare) > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From ben.root at ou.edu Mon Jul 15 10:25:08 2013 From: ben.root at ou.edu (Benjamin Root) Date: Mon, 15 Jul 2013 10:25:08 -0400 Subject: [Numpy-discussion] What should be the result in some statistics corner cases? In-Reply-To: References: Message-ID: This is going to need to be heavily documented with doctests. Also, just to clarify, are we talking about a ValueError for doing a nansum on an empty array as well, or will that now return a zero? Ben Root On Mon, Jul 15, 2013 at 9:52 AM, Charles R Harris wrote: > > > On Sun, Jul 14, 2013 at 3:35 PM, Charles R Harris < > charlesr.harris at gmail.com> wrote: > >> >> >> On Sun, Jul 14, 2013 at 2:55 PM, Warren Weckesser < >> warren.weckesser at gmail.com> wrote: >> >>> On 7/14/13, Charles R Harris wrote: >>> > Some corner cases in the mean, var, std. >>> > >>> > *Empty arrays* >>> > >>> > I think these cases should either raise an error or just return nan. >>> > Warnings seem ineffective to me as they are only issued once by >>> default. >>> > >>> > In [3]: ones(0).mean() >>> > >>> /home/charris/.local/lib/python2.7/site-packages/numpy/core/_methods.py:61: >>> > RuntimeWarning: invalid value encountered in double_scalars >>> > ret = ret / float(rcount) >>> > Out[3]: nan >>> > >>> > In [4]: ones(0).var() >>> > >>> /home/charris/.local/lib/python2.7/site-packages/numpy/core/_methods.py:76: >>> > RuntimeWarning: invalid value encountered in true_divide >>> > out=arrmean, casting='unsafe', subok=False) >>> > >>> /home/charris/.local/lib/python2.7/site-packages/numpy/core/_methods.py:100: >>> > RuntimeWarning: invalid value encountered in double_scalars >>> > ret = ret / float(rcount) >>> > Out[4]: nan >>> > >>> > In [5]: ones(0).std() >>> > >>> /home/charris/.local/lib/python2.7/site-packages/numpy/core/_methods.py:76: >>> > RuntimeWarning: invalid value encountered in true_divide >>> > out=arrmean, casting='unsafe', subok=False) >>> > >>> /home/charris/.local/lib/python2.7/site-packages/numpy/core/_methods.py:100: >>> > RuntimeWarning: invalid value encountered in double_scalars >>> > ret = ret / float(rcount) >>> > Out[5]: nan >>> > >>> > *ddof >= number of elements* >>> > >>> > I think these should just raise errors. The results for ddof >= >>> #elements >>> > is happenstance, and certainly negative numbers should never be >>> returned. 
>>> > >>> > In [6]: ones(2).var(ddof=2) >>> > >>> /home/charris/.local/lib/python2.7/site-packages/numpy/core/_methods.py:100: >>> > RuntimeWarning: invalid value encountered in double_scalars >>> > ret = ret / float(rcount) >>> > Out[6]: nan >>> > >>> > In [7]: ones(2).var(ddof=3) >>> > Out[7]: -0.0 >>> > * >>> > nansum* >>> > >>> > Currently returns nan for empty arrays. I suspect it should return nan >>> for >>> > slices that are all nan, but 0 for empty slices. That would make it >>> > consistent with sum in the empty case. >>> > >>> >>> >>> For nansum, I would expect 0 even in the case of all nans. The point >>> of these functions is to simply ignore nans, correct? So I would aim >>> for this behaviour: nanfunc(x) behaves the same as func(x[~isnan(x)]) >>> >>> >> Agreed, although that changes current behavior. What about the other >> cases? >> >> > Looks like there isn't much interest in the topic, so I'll just go ahead > with the following choices: > > Non-NaN case > > 1) Empty array -> ValueError > > The current behavior with stats is an accident, i.e., the nan arises from > 0/0. I like to think that in this case the result is any number, rather > than not a number, so *the* value is simply not defined. So in this case > raise a ValueError for empty array. > > 2) ddof >= n -> ValueError > > If the number of elements, n, is not zero and ddof >= n, raise a > ValueError for the ddof value. > > Nan case > > 1) Empty array -> Value Error > 2) Empty slice -> NaN > 3) For slice ddof >= n -> Nan > > Chuck > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Mon Jul 15 10:33:47 2013 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 15 Jul 2013 08:33:47 -0600 Subject: [Numpy-discussion] What should be the result in some statistics corner cases? In-Reply-To: References: Message-ID: On Mon, Jul 15, 2013 at 8:25 AM, Benjamin Root wrote: > This is going to need to be heavily documented with doctests. Also, just > to clarify, are we talking about a ValueError for doing a nansum on an > empty array as well, or will that now return a zero? > > I was going to leave nansum as is, as it seems that the result was by choice rather than by accident. Tests, not doctests. I detest doctests ;) Examples, OTOH... Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From sebastian at sipsolutions.net Mon Jul 15 10:34:16 2013 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Mon, 15 Jul 2013 16:34:16 +0200 Subject: [Numpy-discussion] What should be the result in some statistics corner cases? In-Reply-To: References: Message-ID: <1373898856.15619.14.camel@sebastian-laptop> On Mon, 2013-07-15 at 07:52 -0600, Charles R Harris wrote: > > > On Sun, Jul 14, 2013 at 3:35 PM, Charles R Harris > wrote: > > > For nansum, I would expect 0 even in the case of all > nans. The point > of these functions is to simply ignore nans, correct? > So I would aim > for this behaviour: nanfunc(x) behaves the same as > func(x[~isnan(x)]) > > > Agreed, although that changes current behavior. What about the > other cases? 
> > > > Looks like there isn't much interest in the topic, so I'll just go > ahead with the following choices: > > Non-NaN case > > 1) Empty array -> ValueError > > The current behavior with stats is an accident, i.e., the nan arises > from 0/0. I like to think that in this case the result is any number, > rather than not a number, so *the* value is simply not defined. So in > this case raise a ValueError for empty array. > To be honest, I don't mind the current behaviour much sum([]) = 0, len([]) = 0, so it is in a way well defined. At least I am not sure if I would prefer always an error. I am a bit worried that just changing it might break code out there, such as plotting code where it makes perfectly sense to plot a NaN (i.e. nothing), but if that is the case it would probably be visible fast. > 2) ddof >= n -> ValueError > > If the number of elements, n, is not zero and ddof >= n, raise a > ValueError for the ddof value. > Makes sense to me, especially for ddof > n. Just returning nan in all cases for backward compatibility would be fine with me too. > Nan case > > 1) Empty array -> Value Error > 2) Empty slice -> NaN > 3) For slice ddof >= n -> Nan > Personally I would somewhat prefer if 1) and 2) would at least default to the same thing. But I don't use the nanfuncs anyway. I was wondering about adding the option for the user to pick what the fill is (and i.e. if it is None (maybe default) -> ValueError). We could also allow this for normal reductions without an identity, but I am not sure if it is useful there. - Sebastian > Chuck > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From charlesr.harris at gmail.com Mon Jul 15 10:47:07 2013 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 15 Jul 2013 08:47:07 -0600 Subject: [Numpy-discussion] What should be the result in some statistics corner cases? In-Reply-To: <1373898856.15619.14.camel@sebastian-laptop> References: <1373898856.15619.14.camel@sebastian-laptop> Message-ID: On Mon, Jul 15, 2013 at 8:34 AM, Sebastian Berg wrote: > On Mon, 2013-07-15 at 07:52 -0600, Charles R Harris wrote: > > > > > > On Sun, Jul 14, 2013 at 3:35 PM, Charles R Harris > > wrote: > > > > > > > > > For nansum, I would expect 0 even in the case of all > > nans. The point > > of these functions is to simply ignore nans, correct? > > So I would aim > > for this behaviour: nanfunc(x) behaves the same as > > func(x[~isnan(x)]) > > > > > > Agreed, although that changes current behavior. What about the > > other cases? > > > > > > > > Looks like there isn't much interest in the topic, so I'll just go > > ahead with the following choices: > > > > Non-NaN case > > > > 1) Empty array -> ValueError > > > > The current behavior with stats is an accident, i.e., the nan arises > > from 0/0. I like to think that in this case the result is any number, > > rather than not a number, so *the* value is simply not defined. So in > > this case raise a ValueError for empty array. > > > To be honest, I don't mind the current behaviour much sum([]) = 0, > len([]) = 0, so it is in a way well defined. At least I am not sure if I > would prefer always an error. I am a bit worried that just changing it > might break code out there, such as plotting code where it makes > perfectly sense to plot a NaN (i.e. nothing), but if that is the case it > would probably be visible fast. 
> I'm talking about mean, var, and std as statistics, sum isn't part of that. If there is agreement that nansum of empty arrays/columns should be zero I will do that. Note the sums of empty arrays may or may not be empty. In [1]: ones((0, 3)).sum(axis=0) Out[1]: array([ 0., 0., 0.]) In [2]: ones((3, 0)).sum(axis=0) Out[2]: array([], dtype=float64) Which, sort of, makes sense. > > > 2) ddof >= n -> ValueError > > > > If the number of elements, n, is not zero and ddof >= n, raise a > > ValueError for the ddof value. > > > Makes sense to me, especially for ddof > n. Just returning nan in all > cases for backward compatibility would be fine with me too. > > > Nan case > > > > 1) Empty array -> Value Error > > 2) Empty slice -> NaN > > 3) For slice ddof >= n -> Nan > > > Personally I would somewhat prefer if 1) and 2) would at least default > to the same thing. But I don't use the nanfuncs anyway. I was wondering > about adding the option for the user to pick what the fill is (and i.e. > if it is None (maybe default) -> ValueError). We could also allow this > for normal reductions without an identity, but I am not sure if it is > useful there. > > Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From nouiz at nouiz.org Mon Jul 15 10:57:17 2013 From: nouiz at nouiz.org (=?ISO-8859-1?Q?Fr=E9d=E9ric_Bastien?=) Date: Mon, 15 Jul 2013 10:57:17 -0400 Subject: [Numpy-discussion] Allow == and != to raise errors In-Reply-To: References: <1373632688.13968.13.camel@sebastian-laptop> Message-ID: Just a question, should == behave like a ufunc or like python == for tuple? I think that all ndarray comparision (==, !=, <=, ...) should behave the same. If they don't (like it was said), making them consistent is good. What is the minimal change to have them behave the same? From my understanding, it is your proposal to change == and != to behave like real ufunc. But I'm not sure if the minimal change is the best, for new user, what they will expect more? The ufunc of the python behavior? Anyway, I see the advantage to simplify the interface to something more consistent. Anyway, if we make all comparison behave like ufunc, there is array_equal as said to have the python behavior of ==, is it useful to have equivalent function the other comparison? Do they already exist. thanks Fred On Mon, Jul 15, 2013 at 10:20 AM, Nathaniel Smith wrote: > On Mon, Jul 15, 2013 at 2:09 PM, bruno Piguet > wrote: > > Python itself doesn't raise an exception in such cases : > > > >>>> (3,4) != (2, 3, 4) > > True > >>>> (3,4) == (2, 3, 4) > > False > > > > Should numpy behave differently ? > > The numpy equivalent to Python's scalar "==" is called array_equal, > and that does indeed behave the same: > > In [5]: np.array_equal([3, 4], [2, 3, 4]) > Out[5]: False > > But in numpy, the name "==" is shorthand for the ufunc np.equal, which > raises an error: > > In [8]: np.equal([3, 4], [2, 3, 4]) > ValueError: operands could not be broadcast together with shapes (2) (3) > > -n > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Mon Jul 15 10:58:12 2013 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 15 Jul 2013 08:58:12 -0600 Subject: [Numpy-discussion] What should be the result in some statistics corner cases? 
In-Reply-To: <1373898856.15619.14.camel@sebastian-laptop> References: <1373898856.15619.14.camel@sebastian-laptop> Message-ID: On Mon, Jul 15, 2013 at 8:34 AM, Sebastian Berg wrote: > On Mon, 2013-07-15 at 07:52 -0600, Charles R Harris wrote: > > > > > > On Sun, Jul 14, 2013 at 3:35 PM, Charles R Harris > > wrote: > > > > > > > > > For nansum, I would expect 0 even in the case of all > > nans. The point > > of these functions is to simply ignore nans, correct? > > So I would aim > > for this behaviour: nanfunc(x) behaves the same as > > func(x[~isnan(x)]) > > > > > > Agreed, although that changes current behavior. What about the > > other cases? > > > > > > > > Looks like there isn't much interest in the topic, so I'll just go > > ahead with the following choices: > > > > Non-NaN case > > > > 1) Empty array -> ValueError > > > > The current behavior with stats is an accident, i.e., the nan arises > > from 0/0. I like to think that in this case the result is any number, > > rather than not a number, so *the* value is simply not defined. So in > > this case raise a ValueError for empty array. > > > To be honest, I don't mind the current behaviour much sum([]) = 0, > len([]) = 0, so it is in a way well defined. At least I am not sure if I > would prefer always an error. I am a bit worried that just changing it > might break code out there, such as plotting code where it makes > perfectly sense to plot a NaN (i.e. nothing), but if that is the case it > would probably be visible fast. > > > 2) ddof >= n -> ValueError > > > > If the number of elements, n, is not zero and ddof >= n, raise a > > ValueError for the ddof value. > > > Makes sense to me, especially for ddof > n. Just returning nan in all > cases for backward compatibility would be fine with me too. > Currently if ddof > n it returns a negative number for variance, the NaN only comes when ddof == 0 and n == 0, leading to 0/0 (float is NaN, integer is zero division). > > > Nan case > > > > 1) Empty array -> Value Error > > 2) Empty slice -> NaN > > 3) For slice ddof >= n -> Nan > > > Personally I would somewhat prefer if 1) and 2) would at least default > to the same thing. But I don't use the nanfuncs anyway. I was wondering > about adding the option for the user to pick what the fill is (and i.e. > if it is None (maybe default) -> ValueError). We could also allow this > for normal reductions without an identity, but I am not sure if it is > useful there. > In the NaN case some slices may be empty, others not. My reasoning is that that is going to be data dependent, not operator error, but if the array is empty the writer of the code should deal with that. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From bruno.piguet at gmail.com Mon Jul 15 11:00:19 2013 From: bruno.piguet at gmail.com (bruno Piguet) Date: Mon, 15 Jul 2013 17:00:19 +0200 Subject: [Numpy-discussion] Allow == and != to raise errors In-Reply-To: References: <1373632688.13968.13.camel@sebastian-laptop> Message-ID: Thank-you for your explanations. So, if the operator "==" applied to np.arrays is a shorthand for the ufunc np.equal, it should definitly behave exactly as np.equal(), and raise an error. One side question about style : In case you would like to protect a "x == y" test by a try/except clause, wouldn't it feel more "natural" to write " np.equal(x, y)" ? Bruno. 
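To make the style question concrete, here is a minimal sketch of the two
spellings, with throwaway example arrays x and y (only the np.equal form
raises today; the operator form would only do so under the proposed change):

import numpy as np

x = np.array([3, 4])
y = np.array([2, 3, 4])

# Explicit ufunc spelling: np.equal already raises ValueError on a shape
# mismatch, so the try/except has something to catch right now.
try:
    equal = np.equal(x, y).all()
except ValueError:
    equal = False

# Operator spelling: today `x == y` returns the plain Python bool False for
# mismatched shapes, so the failure surfaces as an AttributeError from .all()
# rather than a ValueError here; under the proposed change the comparison
# itself would raise ValueError and this block would behave like the one above.
try:
    equal = (x == y).all()
except ValueError:
    equal = False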
2013/7/15 Nathaniel Smith > On Mon, Jul 15, 2013 at 2:09 PM, bruno Piguet > wrote: > > Python itself doesn't raise an exception in such cases : > > > >>>> (3,4) != (2, 3, 4) > > True > >>>> (3,4) == (2, 3, 4) > > False > > > > Should numpy behave differently ? > > The numpy equivalent to Python's scalar "==" is called array_equal, > and that does indeed behave the same: > > In [5]: np.array_equal([3, 4], [2, 3, 4]) > Out[5]: False > > But in numpy, the name "==" is shorthand for the ufunc np.equal, which > raises an error: > > In [8]: np.equal([3, 4], [2, 3, 4]) > ValueError: operands could not be broadcast together with shapes (2) (3) > > -n > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris.barker at noaa.gov Mon Jul 15 11:05:45 2013 From: chris.barker at noaa.gov (Chris Barker - NOAA Federal) Date: Mon, 15 Jul 2013 08:05:45 -0700 Subject: [Numpy-discussion] PIL and NumPy In-Reply-To: References: Message-ID: <4863100717398894614@unknownmsgid> On Jul 12, 2013, at 8:51 PM, Brady McCary wrote: > > something to do with an alpha channel being present. I'd check and see how PIL is storing the alpha channel. If it's RGBA, then I'd expect it to work. But I'd PIL is storing the alpha channel as a separate band, then I'm not surprised you have an issue. Can you either drop the alpha or convert to RGBA? There is also a package called something line "imageArray" that loads and saves image formats directly to/from numpy arrays-maybe that would be helpful. CHB > When I remove the > alpha channel, things appear to work as I expect. Any discussion on > the matter? > > Brady > > On Fri, Jul 12, 2013 at 10:00 PM, Brady McCary wrote: >> NumPy Folks, >> >> I want to load images with PIL and then operate on them with NumPy. >> According to the PIL and NumPy documentation, I would expect the >> following to work, but it is not. >> >> >> >> Python 2.7.4 (default, Apr 19 2013, 18:28:01) >> [GCC 4.7.3] on linux2 >> Type "help", "copyright", "credits" or "license" for more information. >>>>> import numpy >>>>> numpy.version.version >>>>> >>>>> import Image >>>>> Image.VERSION >> '1.1.7' >>>>> >>>>> im = Image.open('big-0.png') >>>>> im.size >> (2550, 3300) >>>>> >>>>> ar = numpy.asarray(im) >>>>> ar.size >> 1 >>>>> ar.shape >> () >>>>> ar >> array(> 0x1E5BA70>, dtype=object) >> >> >> >> By "not working" I mean that I would have expected the data to be >> loaded/available in ar. PIL and NumPy/SciPy seem to be working fine >> independently of each other. Any guidance? >> >> Brady > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From bruno.piguet at gmail.com Mon Jul 15 11:12:58 2013 From: bruno.piguet at gmail.com (bruno Piguet) Date: Mon, 15 Jul 2013 17:12:58 +0200 Subject: [Numpy-discussion] Allow == and != to raise errors In-Reply-To: References: <1373632688.13968.13.camel@sebastian-laptop> Message-ID: 2013/7/15 Fr?d?ric Bastien > Just a question, should == behave like a ufunc or like python == for tuple? > That's what I was also wondering. I see the advantage of consistency for newcomers. 
I'm not experienced enough to see if this is a problem for numerical practitionners Maybe they wouldn't even imagine that "==" applied to arrays could do anything else than element-wise comparison ? "Explicit is better than implicit" : to me, np.equal(x, y) is more explicit than "x == y". But "Beautiful is better than ugly". Is np.equal(x, y) ugly ? Bruno. > I think that all ndarray comparision (==, !=, <=, ...) should behave the > same. If they don't (like it was said), making them consistent is good. > What is the minimal change to have them behave the same? From my > understanding, it is your proposal to change == and != to behave like real > ufunc. But I'm not sure if the minimal change is the best, for new user, > what they will expect more? The ufunc of the python behavior? > > Anyway, I see the advantage to simplify the interface to something more > consistent. > > Anyway, if we make all comparison behave like ufunc, there is array_equal > as said to have the python behavior of ==, is it useful to have equivalent > function the other comparison? Do they already exist. > > thanks > > Fred > > > On Mon, Jul 15, 2013 at 10:20 AM, Nathaniel Smith wrote: > >> On Mon, Jul 15, 2013 at 2:09 PM, bruno Piguet >> wrote: >> > Python itself doesn't raise an exception in such cases : >> > >> >>>> (3,4) != (2, 3, 4) >> > True >> >>>> (3,4) == (2, 3, 4) >> > False >> > >> > Should numpy behave differently ? >> >> The numpy equivalent to Python's scalar "==" is called array_equal, >> and that does indeed behave the same: >> >> In [5]: np.array_equal([3, 4], [2, 3, 4]) >> Out[5]: False >> >> But in numpy, the name "==" is shorthand for the ufunc np.equal, which >> raises an error: >> >> In [8]: np.equal([3, 4], [2, 3, 4]) >> ValueError: operands could not be broadcast together with shapes (2) (3) >> >> -n >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From pgmdevlist at gmail.com Mon Jul 15 11:25:17 2013 From: pgmdevlist at gmail.com (Pierre Gerard-Marchant) Date: Mon, 15 Jul 2013 17:25:17 +0200 Subject: [Numpy-discussion] read-only or immutable masked array In-Reply-To: References: <5585563F-9DEA-4142-A60E-6F8028E53D28@gmail.com> Message-ID: <93CF6FC4-E06F-4697-93F9-6628E6C3528D@gmail.com> On Jul 15, 2013, at 14:40 , Gregorio Bastardo wrote: > Hi Pierre, > >> Note as well that hardening the mask only prevents unmasking: you can still grow the mask, which may not be what you want. Use `x.mask.flags.writeable=False` to make the mask really read-only. > > I ran into an unmasking problem with the suggested approach: > >>>> np.version.version > '1.7.0' >>>> x = np.ma.masked_array(xrange(4), [0,1,0,1]) >>>> x > masked_array(data = [0 -- 2 --], > mask = [False True False True], > fill_value = 999999) >>>> x.flags.writeable = False >>>> x.mask.flags.writeable = False >>>> x.mask[1] = 0 # ok > Traceback (most recent call last): > ... > ValueError: assignment destination is read-only >>>> x[1] = 0 # ok > Traceback (most recent call last): > ... > ValueError: assignment destination is read-only >>>> x.mask[1] = 0 # ?? >>>> x > masked_array(data = [0 1 2 --], > mask = [False False False True], > fill_value = 999999) Ouch? 
Quick workaround: use `x.harden_mask()` *then* `x.mask.flags.writeable=False` [Longer explanation] > I noticed that "sharedmask" attribute changes (from True to False) > after "x[1] = 0". Indeed, indeed? When setting items, the mask is unshared to limit some issues (like propagation to the other masked_arrays sharing the mask). Unsharing the mask involves a copy, which unfortunately doesn't copy the flags. In other terms, when you try `x[1]=0`, the mask becomes rewritable. That hurts? But! This call to `unshare_mask` is performed only when the mask is 'soft' hence the quick workaround? Note to self (or whomever will fix the issue before I can do it): * We could make sure that copying a mask copies some of its flags to (like the `writeable` one, which other ones?) * The call to `unshare_mask` is made *before* we try to call `__setitem__` on the `_data` part: that's silly, if we called `__setitem__(_data,index,dval)` before, the `ValueError: assignment destination is read-only` would be raised before the mask could get unshared? TLD;DR: move L3073 of np.ma.core to L3068 * There should be some simpler ways to make a masked_array read-only, this little dance is rapidly tiring. > Also, some of the ma operations result mask identity > of the new ma, which causes ValueError when the new ma mask is > modified: > >>>> x = np.ma.masked_array(xrange(4), [0,1,0,1]) >>>> x.flags.writeable = False >>>> x.mask.flags.writeable = False >>>> x1 = x > 0 >>>> x1.mask is x.mask # ok > False >>>> x2 = x != 0 >>>> x2.mask is x.mask # ?? > True >>>> x2.mask[1] = 0 > Traceback (most recent call last): > ... > ValueError: assignment destination is read-only > > which is a bit confusing. Ouch again. [TL;DR] No workaround, sorry [Long version] The inconsistency comes from the fact that '!=' or '==' call the `__ne__` or `__eq__` methods while other comparison operators call their own function. In the first case, because we're comparing with a non-masked scalar, no copy of the mask is made; in the second case, a copy is systematically made. As pointed out earlier, copies of a mask don't preserve its flags? [Note to self] * Define a factory for __lt__/__le__/__gt__/__ge__ based on __eq__ : MaskedArray.__eq__ and __ne__ already have almost the same code.. (but what about filling? Is it an issue?) > And I experienced that *_like operations > give mask identity too: > >>>> y = np.ones_like(x) >>>> y.mask is x.mask > True This may change in the future, depending on a yet-to-be-achieved consensus on the definition of 'least-surprising behaviour'. Right now, the *-like functions return an array that shares the mask with the input, as you've noticed. Some people complained about it, what's your take on that? > I might be missing something but could you clarify these issues? You were not missing anything, np.ma isn't the most straightforward module: plenty of corner cases, and the implementation is pretty naive at times (but hey, it works). My only advice is to never lose hope. From charlesr.harris at gmail.com Mon Jul 15 11:47:50 2013 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 15 Jul 2013 09:47:50 -0600 Subject: [Numpy-discussion] What should be the result in some statistics corner cases? 
In-Reply-To: References: <1373898856.15619.14.camel@sebastian-laptop> Message-ID: On Mon, Jul 15, 2013 at 8:58 AM, Charles R Harris wrote: > > > On Mon, Jul 15, 2013 at 8:34 AM, Sebastian Berg < > sebastian at sipsolutions.net> wrote: > >> On Mon, 2013-07-15 at 07:52 -0600, Charles R Harris wrote: >> > >> > >> > On Sun, Jul 14, 2013 at 3:35 PM, Charles R Harris >> > wrote: >> > >> >> >> >> > >> > For nansum, I would expect 0 even in the case of all >> > nans. The point >> > of these functions is to simply ignore nans, correct? >> > So I would aim >> > for this behaviour: nanfunc(x) behaves the same as >> > func(x[~isnan(x)]) >> > >> > >> > Agreed, although that changes current behavior. What about the >> > other cases? >> > >> > >> > >> > Looks like there isn't much interest in the topic, so I'll just go >> > ahead with the following choices: >> > >> > Non-NaN case >> > >> > 1) Empty array -> ValueError >> > >> > The current behavior with stats is an accident, i.e., the nan arises >> > from 0/0. I like to think that in this case the result is any number, >> > rather than not a number, so *the* value is simply not defined. So in >> > this case raise a ValueError for empty array. >> > >> To be honest, I don't mind the current behaviour much sum([]) = 0, >> len([]) = 0, so it is in a way well defined. At least I am not sure if I >> would prefer always an error. I am a bit worried that just changing it >> might break code out there, such as plotting code where it makes >> perfectly sense to plot a NaN (i.e. nothing), but if that is the case it >> would probably be visible fast. >> >> > 2) ddof >= n -> ValueError >> > >> > If the number of elements, n, is not zero and ddof >= n, raise a >> > ValueError for the ddof value. >> > >> Makes sense to me, especially for ddof > n. Just returning nan in all >> cases for backward compatibility would be fine with me too. >> > > Currently if ddof > n it returns a negative number for variance, the NaN > only comes when ddof == 0 and n == 0, leading to 0/0 (float is NaN, integer > is zero division). > > >> >> > Nan case >> > >> > 1) Empty array -> Value Error >> > 2) Empty slice -> NaN >> > 3) For slice ddof >= n -> Nan >> > >> Personally I would somewhat prefer if 1) and 2) would at least default >> to the same thing. But I don't use the nanfuncs anyway. I was wondering >> about adding the option for the user to pick what the fill is (and i.e. >> if it is None (maybe default) -> ValueError). We could also allow this >> for normal reductions without an identity, but I am not sure if it is >> useful there. >> > > In the NaN case some slices may be empty, others not. My reasoning is that > that is going to be data dependent, not operator error, but if the array is > empty the writer of the code should deal with that. > > In the case of the nanvar, nanstd, it might make more sense to handle ddof as 1) if ddof is >= axis size, raise ValueError 2) if ddof is >= number of values after removing NaNs, return NaN The first would be consistent with the non-nan case, the second accounts for the variable nature of data containing NaNs. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From ben.root at ou.edu Mon Jul 15 11:55:05 2013 From: ben.root at ou.edu (Benjamin Root) Date: Mon, 15 Jul 2013 11:55:05 -0400 Subject: [Numpy-discussion] What should be the result in some statistics corner cases? 
In-Reply-To: References: <1373898856.15619.14.camel@sebastian-laptop> Message-ID: On Jul 15, 2013 11:47 AM, "Charles R Harris" wrote: > > > On Mon, Jul 15, 2013 at 8:58 AM, Charles R Harris < > charlesr.harris at gmail.com> wrote: > >> >> >> On Mon, Jul 15, 2013 at 8:34 AM, Sebastian Berg < >> sebastian at sipsolutions.net> wrote: >> >>> On Mon, 2013-07-15 at 07:52 -0600, Charles R Harris wrote: >>> > >>> > >>> > On Sun, Jul 14, 2013 at 3:35 PM, Charles R Harris >>> > wrote: >>> > >>> >>> >>> >>> > >>> > For nansum, I would expect 0 even in the case of all >>> > nans. The point >>> > of these functions is to simply ignore nans, correct? >>> > So I would aim >>> > for this behaviour: nanfunc(x) behaves the same as >>> > func(x[~isnan(x)]) >>> > >>> > >>> > Agreed, although that changes current behavior. What about the >>> > other cases? >>> > >>> > >>> > >>> > Looks like there isn't much interest in the topic, so I'll just go >>> > ahead with the following choices: >>> > >>> > Non-NaN case >>> > >>> > 1) Empty array -> ValueError >>> > >>> > The current behavior with stats is an accident, i.e., the nan arises >>> > from 0/0. I like to think that in this case the result is any number, >>> > rather than not a number, so *the* value is simply not defined. So in >>> > this case raise a ValueError for empty array. >>> > >>> To be honest, I don't mind the current behaviour much sum([]) = 0, >>> len([]) = 0, so it is in a way well defined. At least I am not sure if I >>> would prefer always an error. I am a bit worried that just changing it >>> might break code out there, such as plotting code where it makes >>> perfectly sense to plot a NaN (i.e. nothing), but if that is the case it >>> would probably be visible fast. >>> >>> > 2) ddof >= n -> ValueError >>> > >>> > If the number of elements, n, is not zero and ddof >= n, raise a >>> > ValueError for the ddof value. >>> > >>> Makes sense to me, especially for ddof > n. Just returning nan in all >>> cases for backward compatibility would be fine with me too. >>> >> >> Currently if ddof > n it returns a negative number for variance, the NaN >> only comes when ddof == 0 and n == 0, leading to 0/0 (float is NaN, integer >> is zero division). >> >> >>> >>> > Nan case >>> > >>> > 1) Empty array -> Value Error >>> > 2) Empty slice -> NaN >>> > 3) For slice ddof >= n -> Nan >>> > >>> Personally I would somewhat prefer if 1) and 2) would at least default >>> to the same thing. But I don't use the nanfuncs anyway. I was wondering >>> about adding the option for the user to pick what the fill is (and i.e. >>> if it is None (maybe default) -> ValueError). We could also allow this >>> for normal reductions without an identity, but I am not sure if it is >>> useful there. >>> >> >> In the NaN case some slices may be empty, others not. My reasoning is >> that that is going to be data dependent, not operator error, but if the >> array is empty the writer of the code should deal with that. >> >> > In the case of the nanvar, nanstd, it might make more sense to handle ddof > as > > 1) if ddof is >= axis size, raise ValueError > 2) if ddof is >= number of values after removing NaNs, return NaN > > The first would be consistent with the non-nan case, the second accounts > for the variable nature of data containing NaNs. > > Chuck > > > I think this is a good idea in that it naturally follows well with the conventions of what to do with empty arrays / empty slices with nanmean, etc. 
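(As a concrete illustration of those two ddof rules, a rough sketch of a nanvar-style reduction is given below. The function name and every detail are made up for illustration only, assuming a NumPy-1.7-era API; this is not the implementation being proposed.)

import numpy as np

def nanvar_sketch(a, axis, ddof=0):
    # Rule 1: ddof at least as large as the axis length is treated as a
    # programming error and raises immediately.
    a = np.asarray(a, dtype=float)
    if ddof >= a.shape[axis]:
        raise ValueError("ddof is >= the length of the reduction axis")
    valid = ~np.isnan(a)
    n = valid.sum(axis=axis)                        # per-slice count of non-NaN values
    total = np.where(valid, a, 0.0).sum(axis=axis)
    mean = total / np.maximum(n, 1)
    dev2 = np.where(valid, (a - np.expand_dims(mean, axis)) ** 2, 0.0)
    denom = n - ddof
    var = dev2.sum(axis=axis) / np.maximum(denom, 1)
    # Rule 2: slices that lose too many values to NaN removal give NaN,
    # since that is data-dependent rather than a coding error.
    return np.where(denom > 0, var, np.nan)

x = np.array([[1.0, 2.0, np.nan],
              [np.nan, np.nan, np.nan]])
nanvar_sketch(x, axis=1, ddof=1)   # array([ 0.5,  nan]) -- second slice has too few values
nanvar_sketch(x, axis=1, ddof=3)   # ValueError: ddof is >= the length of the reduction axis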
Note, however, I am not a very big fan of the idea of having two different behaviors for what I see as semantically the same thing. But, my objections are not strong enough to veto it, and I do think this proposal is well thought-out. Ben Root -------------- next part -------------- An HTML attachment was scrubbed... URL: From sebastian at sipsolutions.net Mon Jul 15 11:55:44 2013 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Mon, 15 Jul 2013 17:55:44 +0200 Subject: [Numpy-discussion] What should be the result in some statistics corner cases? In-Reply-To: References: <1373898856.15619.14.camel@sebastian-laptop> Message-ID: <1373903744.15619.35.camel@sebastian-laptop> On Mon, 2013-07-15 at 08:47 -0600, Charles R Harris wrote: > > > On Mon, Jul 15, 2013 at 8:34 AM, Sebastian Berg > wrote: > On Mon, 2013-07-15 at 07:52 -0600, Charles R Harris wrote: > > > > > > On Sun, Jul 14, 2013 at 3:35 PM, Charles R Harris > > wrote: > > > > > > > > > > For nansum, I would expect 0 even in the > case of all > > nans. The point > > of these functions is to simply ignore nans, > correct? > > So I would aim > > for this behaviour: nanfunc(x) behaves the > same as > > func(x[~isnan(x)]) > > > > > > Agreed, although that changes current behavior. What > about the > > other cases? > > > > > > > > Looks like there isn't much interest in the topic, so I'll > just go > > ahead with the following choices: > > > > Non-NaN case > > > > 1) Empty array -> ValueError > > > > The current behavior with stats is an accident, i.e., the > nan arises > > from 0/0. I like to think that in this case the result is > any number, > > rather than not a number, so *the* value is simply not > defined. So in > > this case raise a ValueError for empty array. > > > > To be honest, I don't mind the current behaviour much sum([]) > = 0, > len([]) = 0, so it is in a way well defined. At least I am not > sure if I > would prefer always an error. I am a bit worried that just > changing it > might break code out there, such as plotting code where it > makes > perfectly sense to plot a NaN (i.e. nothing), but if that is > the case it > would probably be visible fast. > > I'm talking about mean, var, and std as statistics, sum isn't part of > that. If there is agreement that nansum of empty arrays/columns should > be zero I will do that. Note the sums of empty arrays may or may not > be empty. > > In [1]: ones((0, 3)).sum(axis=0) > Out[1]: array([ 0., 0., 0.]) > > In [2]: ones((3, 0)).sum(axis=0) > Out[2]: array([], dtype=float64) > > Which, sort of, makes sense. > > I think we can agree that the behaviour for reductions with an identity should default to returning the identity, including for the nanfuncs, i.e. sum([]) is 0, product([]) is 1... Since mean = sum/length is a sensible definition, having 0/0 as a result doesn't seem to bad to me to be honest, it might be accidental but it is not a special case in the code ;). Though I don't mind an error as long as it doesn't break matplotlib or so. I agree about the nanfuncs raising an error would probably be more of a problem then for a usual ufunc, but still a bit hesitant about saying that it is ok too. 
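(For reference, the identity behaviour described above can be checked directly against the ufunc reduce machinery. The session below is only illustrative of NumPy-1.7-era behaviour; exact error messages and warnings may differ between releases.)

>>> import numpy as np
>>> np.add.reduce([])        # additive identity
0.0
>>> np.multiply.reduce([])   # multiplicative identity
1.0
>>> np.maximum.reduce([])    # no identity, so this raises
Traceback (most recent call last):
    ...
ValueError: zero-size array to reduction operation maximum which has no identity
>>> np.mean([])              # sum/length -> 0/0: RuntimeWarning, result is nan
nan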
I could imagine adding a very general "identity" argument (though I would not call it identity, because it is not the same as `np.add.identity`, just used in a place where that would be used otherwise): np.add.reduce([], identity=123) -> [123] np.add.reduce([1], identity=123) -> [1] np.nanmean([np.nan], identity=None) -> Error np.nanmean([np.nan], identity=np.nan) -> np.nan It doesn't really make sense, but: np.subtract.reduce([]) -> Error, since np.substract.identity is None np.subtract.reduce([], identity=0) -> 0, suppressing the error. I am not sure if I am convinced myself, but especially for the nanfuncs it could maybe provide a way to circumvent the problem somewhat. Including functions such as np.nanargmin, whose result type does not even support NaN. Plus it gives an argument allowing for warnings about changing behaviour. - Sebastian > > > 2) ddof >= n -> ValueError > > > > If the number of elements, n, is not zero and ddof >= n, > raise a > > ValueError for the ddof value. > > > > Makes sense to me, especially for ddof > n. Just returning nan > in all > cases for backward compatibility would be fine with me too. > > > Nan case > > > > 1) Empty array -> Value Error > > 2) Empty slice -> NaN > > 3) For slice ddof >= n -> Nan > > > > Personally I would somewhat prefer if 1) and 2) would at least > default > to the same thing. But I don't use the nanfuncs anyway. I was > wondering > about adding the option for the user to pick what the fill is > (and i.e. > if it is None (maybe default) -> ValueError). We could also > allow this > for normal reductions without an identity, but I am not sure > if it is > useful there. > > > Chuck > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From sebastian at sipsolutions.net Mon Jul 15 12:18:44 2013 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Mon, 15 Jul 2013 18:18:44 +0200 Subject: [Numpy-discussion] Allow == and != to raise errors In-Reply-To: References: <1373632688.13968.13.camel@sebastian-laptop> Message-ID: <1373905124.15619.45.camel@sebastian-laptop> On Mon, 2013-07-15 at 17:12 +0200, bruno Piguet wrote: > > > > 2013/7/15 Fr?d?ric Bastien > Just a question, should == behave like a ufunc or like python > == for tuple? > > > > That's what I was also wondering. I am not sure I understand the question. Of course == should be (mostly?) identical to np.equal. Things like arr[arr == 0] = -1 etc., etc., are a common design pattern. Operations on arrays are element-wise by default, "falling back" to the python tuple/container behaviour" is a special case and I do not see a good reason for it, except possibly backward compatibility. Personally I doubt anyone who seriously uses numpy, uses the np.array([1, 2, 3]) == np.array([1,2]) -> False behaviour, and it seems a bit like a trap to me, because suddenly you get: np.array([1, 2, 3]) == np.array([1]) -> np.array([True, False, False]) (Though in combination with np.all, it can make sense and is then identical to np.array_equiv/np.array_equal) - Sebastian > I see the advantage of consistency for newcomers. > I'm not experienced enough to see if this is a problem for numerical > practitionners Maybe they wouldn't even imagine that "==" applied to > arrays could do anything else than element-wise comparison ? > > "Explicit is better than implicit" : to me, np.equal(x, y) is more > explicit than "x == y". > > But "Beautiful is better than ugly". 
Is np.equal(x, y) ugly ? > > > Bruno. > > > > > > I think that all ndarray comparision (==, !=, <=, ...) should > behave the same. If they don't (like it was said), making them > consistent is good. What is the minimal change to have them > behave the same? From my understanding, it is your proposal to > change == and != to behave like real ufunc. But I'm not sure > if the minimal change is the best, for new user, what they > will expect more? The ufunc of the python behavior? > > > Anyway, I see the advantage to simplify the interface to > something more consistent. > > > Anyway, if we make all comparison behave like ufunc, there is > array_equal as said to have the python behavior of ==, is it > useful to have equivalent function the other comparison? Do > they already exist. > > > thanks > > > > Fred > > > On Mon, Jul 15, 2013 at 10:20 AM, Nathaniel Smith > wrote: > On Mon, Jul 15, 2013 at 2:09 PM, bruno Piguet > wrote: > > Python itself doesn't raise an exception in such > cases : > > > >>>> (3,4) != (2, 3, 4) > > True > >>>> (3,4) == (2, 3, 4) > > False > > > > Should numpy behave differently ? > > > The numpy equivalent to Python's scalar "==" is called > array_equal, > and that does indeed behave the same: > > In [5]: np.array_equal([3, 4], [2, 3, 4]) > Out[5]: False > > But in numpy, the name "==" is shorthand for the ufunc > np.equal, which > raises an error: > > In [8]: np.equal([3, 4], [2, 3, 4]) > ValueError: operands could not be broadcast together > with shapes (2) (3) > > -n > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From stefan at sun.ac.za Mon Jul 15 12:25:56 2013 From: stefan at sun.ac.za (=?iso-8859-1?Q?St=E9fan?= van der Walt) Date: Mon, 15 Jul 2013 18:25:56 +0200 Subject: [Numpy-discussion] PIL and NumPy In-Reply-To: References: Message-ID: <20130715162556.GC18804@shinobi> Dear Brady On Fri, 12 Jul 2013 22:00:08 -0500, Brady McCary wrote: > > I want to load images with PIL and then operate on them with NumPy. > According to the PIL and NumPy documentation, I would expect the > following to work, but it is not. Reading images as PIL is a little bit trickier than one would hope. You can find an example of how to do it (taken scikit-image) here: https://github.com/scikit-image/scikit-image/blob/master/skimage/io/_plugins/pil_plugin.py#L15 St?fan From gregorio.bastardo at gmail.com Mon Jul 15 12:41:17 2013 From: gregorio.bastardo at gmail.com (Gregorio Bastardo) Date: Mon, 15 Jul 2013 18:41:17 +0200 Subject: [Numpy-discussion] read-only or immutable masked array In-Reply-To: <93CF6FC4-E06F-4697-93F9-6628E6C3528D@gmail.com> References: <5585563F-9DEA-4142-A60E-6F8028E53D28@gmail.com> <93CF6FC4-E06F-4697-93F9-6628E6C3528D@gmail.com> Message-ID: > Ouch? > Quick workaround: use `x.harden_mask()` *then* `x.mask.flags.writeable=False` Thanks for the update and the detailed explanation. I'll try this trick. > This may change in the future, depending on a yet-to-be-achieved consensus on the definition of 'least-surprising behaviour'. 
Right now, the *-like functions return an array that shares the mask with the input, as you've noticed. Some people complained about it, what's your take on that? I already took part in the survey (possibly out of thread): http://mail.scipy.org/pipermail/numpy-discussion/2013-July/067136.html > You were not missing anything, np.ma isn't the most straightforward module: plenty of corner cases, and the implementation is pretty naive at times (but hey, it works). My only advice is to never lose hope. I agree there are plenty of hard-to-define cases, and I came accross a hot debate on missing data representation in python: https://github.com/njsmith/numpy/wiki/NA-discussion-status but still I believe np.ma is very usable when compression is not strongly needed. From charlesr.harris at gmail.com Mon Jul 15 13:29:43 2013 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 15 Jul 2013 11:29:43 -0600 Subject: [Numpy-discussion] What should be the result in some statistics corner cases? In-Reply-To: <1373903744.15619.35.camel@sebastian-laptop> References: <1373898856.15619.14.camel@sebastian-laptop> <1373903744.15619.35.camel@sebastian-laptop> Message-ID: On Mon, Jul 15, 2013 at 9:55 AM, Sebastian Berg wrote: > On Mon, 2013-07-15 at 08:47 -0600, Charles R Harris wrote: > > > > > > On Mon, Jul 15, 2013 at 8:34 AM, Sebastian Berg > > wrote: > > On Mon, 2013-07-15 at 07:52 -0600, Charles R Harris wrote: > > > > > > > > > On Sun, Jul 14, 2013 at 3:35 PM, Charles R Harris > > > wrote: > > > > > > > > > > > > > > > > > For nansum, I would expect 0 even in the > > case of all > > > nans. The point > > > of these functions is to simply ignore nans, > > correct? > > > So I would aim > > > for this behaviour: nanfunc(x) behaves the > > same as > > > func(x[~isnan(x)]) > > > > > > > > > Agreed, although that changes current behavior. What > > about the > > > other cases? > > > > > > > > > > > > Looks like there isn't much interest in the topic, so I'll > > just go > > > ahead with the following choices: > > > > > > Non-NaN case > > > > > > 1) Empty array -> ValueError > > > > > > The current behavior with stats is an accident, i.e., the > > nan arises > > > from 0/0. I like to think that in this case the result is > > any number, > > > rather than not a number, so *the* value is simply not > > defined. So in > > > this case raise a ValueError for empty array. > > > > > > > To be honest, I don't mind the current behaviour much sum([]) > > = 0, > > len([]) = 0, so it is in a way well defined. At least I am not > > sure if I > > would prefer always an error. I am a bit worried that just > > changing it > > might break code out there, such as plotting code where it > > makes > > perfectly sense to plot a NaN (i.e. nothing), but if that is > > the case it > > would probably be visible fast. > > > > I'm talking about mean, var, and std as statistics, sum isn't part of > > that. If there is agreement that nansum of empty arrays/columns should > > be zero I will do that. Note the sums of empty arrays may or may not > > be empty. > > > > In [1]: ones((0, 3)).sum(axis=0) > > Out[1]: array([ 0., 0., 0.]) > > > > In [2]: ones((3, 0)).sum(axis=0) > > Out[2]: array([], dtype=float64) > > > > Which, sort of, makes sense. > > > > > I think we can agree that the behaviour for reductions with an identity > should default to returning the identity, including for the nanfuncs, > i.e. sum([]) is 0, product([]) is 1... 
> > Since mean = sum/length is a sensible definition, having 0/0 as a result > doesn't seem to bad to me to be honest, it might be accidental but it is > not a special case in the code ;). Though I don't mind an error as long > as it doesn't break matplotlib or so. > > I agree about the nanfuncs raising an error would probably be more of a > problem then for a usual ufunc, but still a bit hesitant about saying > that it is ok too. I could imagine adding a very general "identity" > argument (though I would not call it identity, because it is not the > same as `np.add.identity`, just used in a place where that would be used > otherwise): > > np.add.reduce([], identity=123) -> [123] > np.add.reduce([1], identity=123) -> [1] > np.nanmean([np.nan], identity=None) -> Error > np.nanmean([np.nan], identity=np.nan) -> np.nan > > It doesn't really make sense, but: > np.subtract.reduce([]) -> Error, since np.substract.identity is None > np.subtract.reduce([], identity=0) -> 0, suppressing the error. > > I am not sure if I am convinced myself, but especially for the nanfuncs > it could maybe provide a way to circumvent the problem somewhat. > Including functions such as np.nanargmin, whose result type does not > even support NaN. Plus it gives an argument allowing for warnings about > changing behaviour. > > Let me try to summarize. To begin with, the environment of the nan functions is rather special. 1) if the array is of not of inexact type, they punt to the non-nan versions. 2) if the array is of inexact type, then out and dtype must be inexact if specified The second assumption guarantees that NaN can be used in the return values. *sum and nansum* These should be consistent so that empty sums are 0. This should cover the empty array case, but will change the behaviour of nansum which currently returns NaN if the array isn't empty but the slice is after NaN removal. *mean and nanmean* In the case of empty arrays, an empty slice, this leads to 0/0. For Python this is always a zero division error, for Numpy this raises a warning and and returns NaN for floats, 0 for integers. Currently mean returns NaN and raises a RuntimeWarning when 0/0 occurs. In the special case where dtype=int, the NaN is cast to integer. Option1 1) mean raise error on 0/0 2) nanmean no warning, return NaN Option2 1) mean raise warning, return NaN (current behavior) 2) nanmean no warning, return NaN Option3 1) mean raise warning, return NaN (current behavior) 2) nanmean raise warning, return NaN *var, std, nanvar, nanstd* 1) if ddof > axis(axes) size, raise error, probably a program bug. 2) If ddof=0, then whatever is the case for mean, nanmean For nanvar, nanstd it is possible that some slice are good, some bad, so option1 1) if n - ddof <= 0 for a slice, raise warning, return NaN for slice option2 1) if n - ddof <= 0 for a slice, don't warn, return NaN for slice Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Mon Jul 15 14:55:04 2013 From: njs at pobox.com (Nathaniel Smith) Date: Mon, 15 Jul 2013 19:55:04 +0100 Subject: [Numpy-discussion] What should be the result in some statistics corner cases? In-Reply-To: References: <1373898856.15619.14.camel@sebastian-laptop> <1373903744.15619.35.camel@sebastian-laptop> Message-ID: On Mon, Jul 15, 2013 at 6:29 PM, Charles R Harris wrote: > Let me try to summarize. To begin with, the environment of the nan functions > is rather special. > > 1) if the array is of not of inexact type, they punt to the non-nan > versions. 
> 2) if the array is of inexact type, then out and dtype must be inexact if > specified > > The second assumption guarantees that NaN can be used in the return values. The requirement on the 'out' dtype only exists because currently the nan function like to return nan for things like empty arrays, right? If not for that, it could be relaxed? (it's a rather weird requirement, since the whole point of these functions is that they ignore nans, yet they don't always...) > sum and nansum > > These should be consistent so that empty sums are 0. This should cover the > empty array case, but will change the behaviour of nansum which currently > returns NaN if the array isn't empty but the slice is after NaN removal. I agree that returning 0 is the right behaviour, but we might need a FutureWarning period. > mean and nanmean > > In the case of empty arrays, an empty slice, this leads to 0/0. For Python > this is always a zero division error, for Numpy this raises a warning and > and returns NaN for floats, 0 for integers. > > Currently mean returns NaN and raises a RuntimeWarning when 0/0 occurs. In > the special case where dtype=int, the NaN is cast to integer. > > Option1 > 1) mean raise error on 0/0 > 2) nanmean no warning, return NaN > > Option2 > 1) mean raise warning, return NaN (current behavior) > 2) nanmean no warning, return NaN > > Option3 > 1) mean raise warning, return NaN (current behavior) > 2) nanmean raise warning, return NaN I have mixed feelings about the whole np.seterr apparatus, but since it exists, shouldn't we use it for consistency? I.e., just do whatever numpy is set up to do with 0/0? (Which I think means, warn and return NaN by default, but this can be changed.) > var, std, nanvar, nanstd > > 1) if ddof > axis(axes) size, raise error, probably a program bug. > 2) If ddof=0, then whatever is the case for mean, nanmean > > For nanvar, nanstd it is possible that some slice are good, some bad, so > > option1 > 1) if n - ddof <= 0 for a slice, raise warning, return NaN for slice > > option2 > 1) if n - ddof <= 0 for a slice, don't warn, return NaN for slice I don't really have any intuition for these ddof cases. Just raising an error on negative effective dof is pretty defensible and might be the safest -- it's a easy to turn an error into something sensible later if people come up with use cases... -n From josef.pktd at gmail.com Mon Jul 15 16:24:52 2013 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Mon, 15 Jul 2013 16:24:52 -0400 Subject: [Numpy-discussion] What should be the result in some statistics corner cases? In-Reply-To: References: <1373898856.15619.14.camel@sebastian-laptop> <1373903744.15619.35.camel@sebastian-laptop> Message-ID: On Mon, Jul 15, 2013 at 2:55 PM, Nathaniel Smith wrote: > On Mon, Jul 15, 2013 at 6:29 PM, Charles R Harris > wrote: >> Let me try to summarize. To begin with, the environment of the nan functions >> is rather special. >> >> 1) if the array is of not of inexact type, they punt to the non-nan >> versions. >> 2) if the array is of inexact type, then out and dtype must be inexact if >> specified >> >> The second assumption guarantees that NaN can be used in the return values. > > The requirement on the 'out' dtype only exists because currently the > nan function like to return nan for things like empty arrays, right? > If not for that, it could be relaxed? (it's a rather weird > requirement, since the whole point of these functions is that they > ignore nans, yet they don't always...) 
> >> sum and nansum >> >> These should be consistent so that empty sums are 0. This should cover the >> empty array case, but will change the behaviour of nansum which currently >> returns NaN if the array isn't empty but the slice is after NaN removal. > > I agree that returning 0 is the right behaviour, but we might need a > FutureWarning period. > >> mean and nanmean >> >> In the case of empty arrays, an empty slice, this leads to 0/0. For Python >> this is always a zero division error, for Numpy this raises a warning and >> and returns NaN for floats, 0 for integers. >> >> Currently mean returns NaN and raises a RuntimeWarning when 0/0 occurs. In >> the special case where dtype=int, the NaN is cast to integer. >> >> Option1 >> 1) mean raise error on 0/0 >> 2) nanmean no warning, return NaN >> >> Option2 >> 1) mean raise warning, return NaN (current behavior) >> 2) nanmean no warning, return NaN >> >> Option3 >> 1) mean raise warning, return NaN (current behavior) >> 2) nanmean raise warning, return NaN > > I have mixed feelings about the whole np.seterr apparatus, but since > it exists, shouldn't we use it for consistency? I.e., just do whatever > numpy is set up to do with 0/0? (Which I think means, warn and return > NaN by default, but this can be changed.) > >> var, std, nanvar, nanstd >> >> 1) if ddof > axis(axes) size, raise error, probably a program bug. >> 2) If ddof=0, then whatever is the case for mean, nanmean >> >> For nanvar, nanstd it is possible that some slice are good, some bad, so >> >> option1 >> 1) if n - ddof <= 0 for a slice, raise warning, return NaN for slice >> >> option2 >> 1) if n - ddof <= 0 for a slice, don't warn, return NaN for slice > > I don't really have any intuition for these ddof cases. Just raising > an error on negative effective dof is pretty defensible and might be > the safest -- it's a easy to turn an error into something sensible > later if people come up with use cases... related why does reduceat not have empty slices? >>> np.add.reduceat(np.arange(8),[0,4, 5, 7,7]) array([ 6, 4, 11, 7, 7]) I'm in favor of returning nans instead of raising exceptions, except if the return type is int and we cannot cast nan to int. If we get functions into numpy that know how to handle nans, then it would be useful to get the nans, so we can work with them Some cases where this might come in handy are when we iterate over slices of an array that define groups or category levels with possible empty groups *) >>> idx = np.repeat(np.array([0, 1, 2, 3]), [4, 3, 0, 2]) >>> x = np.arange(9) >>> [x[idx==ii].mean() for ii in range(4)] [1.5, 5.0, nan, 7.5] instead of >>> [x[idx==ii].mean() for ii in range(4) if (idx==ii).sum()>0] [1.5, 5.0, 7.5] same for var, I wouldn't have to check that the size is larger than the ddof (whatever that is in the specific case) *) groups could be empty because they were defined for a larger dataset or as a union of different datasets PS: I used mean() above and not var() because >>> np.__version__ '1.5.1' >>> np.mean([]) nan >>> np.var([]) 0.0 Josef > > -n > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From josef.pktd at gmail.com Mon Jul 15 16:44:18 2013 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Mon, 15 Jul 2013 16:44:18 -0400 Subject: [Numpy-discussion] What should be the result in some statistics corner cases? 
In-Reply-To: References: <1373898856.15619.14.camel@sebastian-laptop> <1373903744.15619.35.camel@sebastian-laptop> Message-ID: On Mon, Jul 15, 2013 at 4:24 PM, wrote: > On Mon, Jul 15, 2013 at 2:55 PM, Nathaniel Smith wrote: >> On Mon, Jul 15, 2013 at 6:29 PM, Charles R Harris >> wrote: >>> Let me try to summarize. To begin with, the environment of the nan functions >>> is rather special. >>> >>> 1) if the array is of not of inexact type, they punt to the non-nan >>> versions. >>> 2) if the array is of inexact type, then out and dtype must be inexact if >>> specified >>> >>> The second assumption guarantees that NaN can be used in the return values. >> >> The requirement on the 'out' dtype only exists because currently the >> nan function like to return nan for things like empty arrays, right? >> If not for that, it could be relaxed? (it's a rather weird >> requirement, since the whole point of these functions is that they >> ignore nans, yet they don't always...) >> >>> sum and nansum >>> >>> These should be consistent so that empty sums are 0. This should cover the >>> empty array case, but will change the behaviour of nansum which currently >>> returns NaN if the array isn't empty but the slice is after NaN removal. >> >> I agree that returning 0 is the right behaviour, but we might need a >> FutureWarning period. >> >>> mean and nanmean >>> >>> In the case of empty arrays, an empty slice, this leads to 0/0. For Python >>> this is always a zero division error, for Numpy this raises a warning and >>> and returns NaN for floats, 0 for integers. >>> >>> Currently mean returns NaN and raises a RuntimeWarning when 0/0 occurs. In >>> the special case where dtype=int, the NaN is cast to integer. >>> >>> Option1 >>> 1) mean raise error on 0/0 >>> 2) nanmean no warning, return NaN >>> >>> Option2 >>> 1) mean raise warning, return NaN (current behavior) >>> 2) nanmean no warning, return NaN >>> >>> Option3 >>> 1) mean raise warning, return NaN (current behavior) >>> 2) nanmean raise warning, return NaN >> >> I have mixed feelings about the whole np.seterr apparatus, but since >> it exists, shouldn't we use it for consistency? I.e., just do whatever >> numpy is set up to do with 0/0? (Which I think means, warn and return >> NaN by default, but this can be changed.) >> >>> var, std, nanvar, nanstd >>> >>> 1) if ddof > axis(axes) size, raise error, probably a program bug. >>> 2) If ddof=0, then whatever is the case for mean, nanmean >>> >>> For nanvar, nanstd it is possible that some slice are good, some bad, so >>> >>> option1 >>> 1) if n - ddof <= 0 for a slice, raise warning, return NaN for slice >>> >>> option2 >>> 1) if n - ddof <= 0 for a slice, don't warn, return NaN for slice >> >> I don't really have any intuition for these ddof cases. Just raising >> an error on negative effective dof is pretty defensible and might be >> the safest -- it's a easy to turn an error into something sensible >> later if people come up with use cases... > > related why does reduceat not have empty slices? > >>>> np.add.reduceat(np.arange(8),[0,4, 5, 7,7]) > array([ 6, 4, 11, 7, 7]) > > > I'm in favor of returning nans instead of raising exceptions, except > if the return type is int and we cannot cast nan to int. 
> > If we get functions into numpy that know how to handle nans, then it > would be useful to get the nans, so we can work with them > > Some cases where this might come in handy are when we iterate over > slices of an array that define groups or category levels with possible > empty groups *) > >>>> idx = np.repeat(np.array([0, 1, 2, 3]), [4, 3, 0, 2]) >>>> x = np.arange(9) >>>> [x[idx==ii].mean() for ii in range(4)] > [1.5, 5.0, nan, 7.5] > > instead of >>>> [x[idx==ii].mean() for ii in range(4) if (idx==ii).sum()>0] > [1.5, 5.0, 7.5] > > same for var, I wouldn't have to check that the size is larger than > the ddof (whatever that is in the specific case) > > *) groups could be empty because they were defined for a larger > dataset or as a union of different datasets background: I wrote several robust anova versions a few weeks ago, that were essentially list comprehension as above. However, I didn't allow nans and didn't check for minimum size. Allowing for empty groups to return nan would mainly be a convenience, since I need to check the group size only once. ddof: tests for proportions have ddof=0, for regular t-test ddof=1, for tests of correlation ddof=2 IIRC so we would need to check for the corresponding minimum size that n-ddof>0 "negative effective dof" doesn't exist, that's np.maximum(n - ddof, 0) which is always non-negative but might result in a zero-division error. :) I don't think making anything conditional on ddof>0 is useful. Josef > > > PS: I used mean() above and not var() because > >>>> np.__version__ > '1.5.1' >>>> np.mean([]) > nan >>>> np.var([]) > 0.0 > > Josef > >> >> -n >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion From charlesr.harris at gmail.com Mon Jul 15 17:34:22 2013 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 15 Jul 2013 15:34:22 -0600 Subject: [Numpy-discussion] What should be the result in some statistics corner cases? In-Reply-To: References: <1373898856.15619.14.camel@sebastian-laptop> <1373903744.15619.35.camel@sebastian-laptop> Message-ID: On Mon, Jul 15, 2013 at 2:44 PM, wrote: > On Mon, Jul 15, 2013 at 4:24 PM, wrote: > > On Mon, Jul 15, 2013 at 2:55 PM, Nathaniel Smith wrote: > >> On Mon, Jul 15, 2013 at 6:29 PM, Charles R Harris > >> wrote: > >>> Let me try to summarize. To begin with, the environment of the nan > functions > >>> is rather special. > >>> > >>> 1) if the array is of not of inexact type, they punt to the non-nan > >>> versions. > >>> 2) if the array is of inexact type, then out and dtype must be inexact > if > >>> specified > >>> > >>> The second assumption guarantees that NaN can be used in the return > values. > >> > >> The requirement on the 'out' dtype only exists because currently the > >> nan function like to return nan for things like empty arrays, right? > >> If not for that, it could be relaxed? (it's a rather weird > >> requirement, since the whole point of these functions is that they > >> ignore nans, yet they don't always...) > >> > >>> sum and nansum > >>> > >>> These should be consistent so that empty sums are 0. This should cover > the > >>> empty array case, but will change the behaviour of nansum which > currently > >>> returns NaN if the array isn't empty but the slice is after NaN > removal. > >> > >> I agree that returning 0 is the right behaviour, but we might need a > >> FutureWarning period. 
> >> > >>> mean and nanmean > >>> > >>> In the case of empty arrays, an empty slice, this leads to 0/0. For > Python > >>> this is always a zero division error, for Numpy this raises a warning > and > >>> and returns NaN for floats, 0 for integers. > >>> > >>> Currently mean returns NaN and raises a RuntimeWarning when 0/0 > occurs. In > >>> the special case where dtype=int, the NaN is cast to integer. > >>> > >>> Option1 > >>> 1) mean raise error on 0/0 > >>> 2) nanmean no warning, return NaN > >>> > >>> Option2 > >>> 1) mean raise warning, return NaN (current behavior) > >>> 2) nanmean no warning, return NaN > >>> > >>> Option3 > >>> 1) mean raise warning, return NaN (current behavior) > >>> 2) nanmean raise warning, return NaN > >> > >> I have mixed feelings about the whole np.seterr apparatus, but since > >> it exists, shouldn't we use it for consistency? I.e., just do whatever > >> numpy is set up to do with 0/0? (Which I think means, warn and return > >> NaN by default, but this can be changed.) > >> > >>> var, std, nanvar, nanstd > >>> > >>> 1) if ddof > axis(axes) size, raise error, probably a program bug. > >>> 2) If ddof=0, then whatever is the case for mean, nanmean > >>> > >>> For nanvar, nanstd it is possible that some slice are good, some bad, > so > >>> > >>> option1 > >>> 1) if n - ddof <= 0 for a slice, raise warning, return NaN for slice > >>> > >>> option2 > >>> 1) if n - ddof <= 0 for a slice, don't warn, return NaN for slice > >> > >> I don't really have any intuition for these ddof cases. Just raising > >> an error on negative effective dof is pretty defensible and might be > >> the safest -- it's a easy to turn an error into something sensible > >> later if people come up with use cases... > > > > related why does reduceat not have empty slices? > > > >>>> np.add.reduceat(np.arange(8),[0,4, 5, 7,7]) > > array([ 6, 4, 11, 7, 7]) > > > > > > I'm in favor of returning nans instead of raising exceptions, except > > if the return type is int and we cannot cast nan to int. > > > > If we get functions into numpy that know how to handle nans, then it > > would be useful to get the nans, so we can work with them > > > > Some cases where this might come in handy are when we iterate over > > slices of an array that define groups or category levels with possible > > empty groups *) > > > >>>> idx = np.repeat(np.array([0, 1, 2, 3]), [4, 3, 0, 2]) > >>>> x = np.arange(9) > >>>> [x[idx==ii].mean() for ii in range(4)] > > [1.5, 5.0, nan, 7.5] > > > > instead of > >>>> [x[idx==ii].mean() for ii in range(4) if (idx==ii).sum()>0] > > [1.5, 5.0, 7.5] > > > > same for var, I wouldn't have to check that the size is larger than > > the ddof (whatever that is in the specific case) > > > > *) groups could be empty because they were defined for a larger > > dataset or as a union of different datasets > > background: > > I wrote several robust anova versions a few weeks ago, that were > essentially list comprehension as above. However, I didn't allow nans > and didn't check for minimum size. > Allowing for empty groups to return nan would mainly be a convenience, > since I need to check the group size only once. > > ddof: tests for proportions have ddof=0, for regular t-test ddof=1, > for tests of correlation ddof=2 IIRC > so we would need to check for the corresponding minimum size that n-ddof>0 > > "negative effective dof" doesn't exist, that's np.maximum(n - ddof, 0) > which is always non-negative but might result in a zero-division > error. 
:) > > I don't think making anything conditional on ddof>0 is useful. > > So how would you want it? To summarize the problem areas: 1) What is the sum of an empty slice? NaN or 0? 2) What is mean of empy slice? NaN, NaN and warn, or error? 3) What if n - ddof < 0 for slice? NaN, NaN and warn, or error? 4) What if n - ddof = 0 for slice? NaN, NaN and warn, or error? I'm tending to NaN and warn for 2 -- 3, because, as Nathaniel notes, the warning can be turned into an error by the user. The errstate context manager would be good for that. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Mon Jul 15 17:57:56 2013 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Mon, 15 Jul 2013 17:57:56 -0400 Subject: [Numpy-discussion] What should be the result in some statistics corner cases? In-Reply-To: References: <1373898856.15619.14.camel@sebastian-laptop> <1373903744.15619.35.camel@sebastian-laptop> Message-ID: On Mon, Jul 15, 2013 at 5:34 PM, Charles R Harris wrote: > > > On Mon, Jul 15, 2013 at 2:44 PM, wrote: >> >> On Mon, Jul 15, 2013 at 4:24 PM, wrote: >> > On Mon, Jul 15, 2013 at 2:55 PM, Nathaniel Smith wrote: >> >> On Mon, Jul 15, 2013 at 6:29 PM, Charles R Harris >> >> wrote: >> >>> Let me try to summarize. To begin with, the environment of the nan >> >>> functions >> >>> is rather special. >> >>> >> >>> 1) if the array is of not of inexact type, they punt to the non-nan >> >>> versions. >> >>> 2) if the array is of inexact type, then out and dtype must be inexact >> >>> if >> >>> specified >> >>> >> >>> The second assumption guarantees that NaN can be used in the return >> >>> values. >> >> >> >> The requirement on the 'out' dtype only exists because currently the >> >> nan function like to return nan for things like empty arrays, right? >> >> If not for that, it could be relaxed? (it's a rather weird >> >> requirement, since the whole point of these functions is that they >> >> ignore nans, yet they don't always...) >> >> >> >>> sum and nansum >> >>> >> >>> These should be consistent so that empty sums are 0. This should cover >> >>> the >> >>> empty array case, but will change the behaviour of nansum which >> >>> currently >> >>> returns NaN if the array isn't empty but the slice is after NaN >> >>> removal. >> >> >> >> I agree that returning 0 is the right behaviour, but we might need a >> >> FutureWarning period. >> >> >> >>> mean and nanmean >> >>> >> >>> In the case of empty arrays, an empty slice, this leads to 0/0. For >> >>> Python >> >>> this is always a zero division error, for Numpy this raises a warning >> >>> and >> >>> and returns NaN for floats, 0 for integers. >> >>> >> >>> Currently mean returns NaN and raises a RuntimeWarning when 0/0 >> >>> occurs. In >> >>> the special case where dtype=int, the NaN is cast to integer. >> >>> >> >>> Option1 >> >>> 1) mean raise error on 0/0 >> >>> 2) nanmean no warning, return NaN >> >>> >> >>> Option2 >> >>> 1) mean raise warning, return NaN (current behavior) >> >>> 2) nanmean no warning, return NaN >> >>> >> >>> Option3 >> >>> 1) mean raise warning, return NaN (current behavior) >> >>> 2) nanmean raise warning, return NaN >> >> >> >> I have mixed feelings about the whole np.seterr apparatus, but since >> >> it exists, shouldn't we use it for consistency? I.e., just do whatever >> >> numpy is set up to do with 0/0? (Which I think means, warn and return >> >> NaN by default, but this can be changed.) 
>> >> >> >>> var, std, nanvar, nanstd >> >>> >> >>> 1) if ddof > axis(axes) size, raise error, probably a program bug. >> >>> 2) If ddof=0, then whatever is the case for mean, nanmean >> >>> >> >>> For nanvar, nanstd it is possible that some slice are good, some bad, >> >>> so >> >>> >> >>> option1 >> >>> 1) if n - ddof <= 0 for a slice, raise warning, return NaN for slice >> >>> >> >>> option2 >> >>> 1) if n - ddof <= 0 for a slice, don't warn, return NaN for slice >> >> >> >> I don't really have any intuition for these ddof cases. Just raising >> >> an error on negative effective dof is pretty defensible and might be >> >> the safest -- it's a easy to turn an error into something sensible >> >> later if people come up with use cases... >> > >> > related why does reduceat not have empty slices? >> > >> >>>> np.add.reduceat(np.arange(8),[0,4, 5, 7,7]) >> > array([ 6, 4, 11, 7, 7]) >> > >> > >> > I'm in favor of returning nans instead of raising exceptions, except >> > if the return type is int and we cannot cast nan to int. >> > >> > If we get functions into numpy that know how to handle nans, then it >> > would be useful to get the nans, so we can work with them >> > >> > Some cases where this might come in handy are when we iterate over >> > slices of an array that define groups or category levels with possible >> > empty groups *) >> > >> >>>> idx = np.repeat(np.array([0, 1, 2, 3]), [4, 3, 0, 2]) >> >>>> x = np.arange(9) >> >>>> [x[idx==ii].mean() for ii in range(4)] >> > [1.5, 5.0, nan, 7.5] >> > >> > instead of >> >>>> [x[idx==ii].mean() for ii in range(4) if (idx==ii).sum()>0] >> > [1.5, 5.0, 7.5] >> > >> > same for var, I wouldn't have to check that the size is larger than >> > the ddof (whatever that is in the specific case) >> > >> > *) groups could be empty because they were defined for a larger >> > dataset or as a union of different datasets >> >> background: >> >> I wrote several robust anova versions a few weeks ago, that were >> essentially list comprehension as above. However, I didn't allow nans >> and didn't check for minimum size. >> Allowing for empty groups to return nan would mainly be a convenience, >> since I need to check the group size only once. >> >> ddof: tests for proportions have ddof=0, for regular t-test ddof=1, >> for tests of correlation ddof=2 IIRC >> so we would need to check for the corresponding minimum size that n-ddof>0 >> >> "negative effective dof" doesn't exist, that's np.maximum(n - ddof, 0) >> which is always non-negative but might result in a zero-division >> error. :) >> >> I don't think making anything conditional on ddof>0 is useful. >> > > So how would you want it? > > To summarize the problem areas: > > 1) What is the sum of an empty slice? NaN or 0? 0 as it is now for sum, (including 0 for nansum with no valid entries). > 2) What is mean of empy slice? NaN, NaN and warn, or error? > 3) What if n - ddof < 0 for slice? NaN, NaN and warn, or error? > 4) What if n - ddof = 0 for slice? NaN, NaN and warn, or error? > > I'm tending to NaN and warn for 2 -- 3, because, as Nathaniel notes, the > warning can be turned into an error by the user. The errstate context > manager would be good for that. Yes, That's what I would prefer also, NaN and ZeroDivisionError, for 2-4, including mean, var and std, for both nan and non-nan functions. 
with the extra argument that 3) and 4) are the same case (except in polyfit :) Josef > > Chuck > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From charlesr.harris at gmail.com Mon Jul 15 18:03:01 2013 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 15 Jul 2013 16:03:01 -0600 Subject: [Numpy-discussion] What should be the result in some statistics corner cases? In-Reply-To: References: <1373898856.15619.14.camel@sebastian-laptop> <1373903744.15619.35.camel@sebastian-laptop> Message-ID: On Mon, Jul 15, 2013 at 3:57 PM, wrote: > On Mon, Jul 15, 2013 at 5:34 PM, Charles R Harris > wrote: > > > > > > On Mon, Jul 15, 2013 at 2:44 PM, wrote: > >> > >> On Mon, Jul 15, 2013 at 4:24 PM, wrote: > >> > On Mon, Jul 15, 2013 at 2:55 PM, Nathaniel Smith > wrote: > >> >> On Mon, Jul 15, 2013 at 6:29 PM, Charles R Harris > >> >> wrote: > >> >>> Let me try to summarize. To begin with, the environment of the nan > >> >>> functions > >> >>> is rather special. > >> >>> > >> >>> 1) if the array is of not of inexact type, they punt to the non-nan > >> >>> versions. > >> >>> 2) if the array is of inexact type, then out and dtype must be > inexact > >> >>> if > >> >>> specified > >> >>> > >> >>> The second assumption guarantees that NaN can be used in the return > >> >>> values. > >> >> > >> >> The requirement on the 'out' dtype only exists because currently the > >> >> nan function like to return nan for things like empty arrays, right? > >> >> If not for that, it could be relaxed? (it's a rather weird > >> >> requirement, since the whole point of these functions is that they > >> >> ignore nans, yet they don't always...) > >> >> > >> >>> sum and nansum > >> >>> > >> >>> These should be consistent so that empty sums are 0. This should > cover > >> >>> the > >> >>> empty array case, but will change the behaviour of nansum which > >> >>> currently > >> >>> returns NaN if the array isn't empty but the slice is after NaN > >> >>> removal. > >> >> > >> >> I agree that returning 0 is the right behaviour, but we might need a > >> >> FutureWarning period. > >> >> > >> >>> mean and nanmean > >> >>> > >> >>> In the case of empty arrays, an empty slice, this leads to 0/0. For > >> >>> Python > >> >>> this is always a zero division error, for Numpy this raises a > warning > >> >>> and > >> >>> and returns NaN for floats, 0 for integers. > >> >>> > >> >>> Currently mean returns NaN and raises a RuntimeWarning when 0/0 > >> >>> occurs. In > >> >>> the special case where dtype=int, the NaN is cast to integer. > >> >>> > >> >>> Option1 > >> >>> 1) mean raise error on 0/0 > >> >>> 2) nanmean no warning, return NaN > >> >>> > >> >>> Option2 > >> >>> 1) mean raise warning, return NaN (current behavior) > >> >>> 2) nanmean no warning, return NaN > >> >>> > >> >>> Option3 > >> >>> 1) mean raise warning, return NaN (current behavior) > >> >>> 2) nanmean raise warning, return NaN > >> >> > >> >> I have mixed feelings about the whole np.seterr apparatus, but since > >> >> it exists, shouldn't we use it for consistency? I.e., just do > whatever > >> >> numpy is set up to do with 0/0? (Which I think means, warn and return > >> >> NaN by default, but this can be changed.) > >> >> > >> >>> var, std, nanvar, nanstd > >> >>> > >> >>> 1) if ddof > axis(axes) size, raise error, probably a program bug. 
> >> >>> 2) If ddof=0, then whatever is the case for mean, nanmean > >> >>> > >> >>> For nanvar, nanstd it is possible that some slice are good, some > bad, > >> >>> so > >> >>> > >> >>> option1 > >> >>> 1) if n - ddof <= 0 for a slice, raise warning, return NaN for slice > >> >>> > >> >>> option2 > >> >>> 1) if n - ddof <= 0 for a slice, don't warn, return NaN for slice > >> >> > >> >> I don't really have any intuition for these ddof cases. Just raising > >> >> an error on negative effective dof is pretty defensible and might be > >> >> the safest -- it's a easy to turn an error into something sensible > >> >> later if people come up with use cases... > >> > > >> > related why does reduceat not have empty slices? > >> > > >> >>>> np.add.reduceat(np.arange(8),[0,4, 5, 7,7]) > >> > array([ 6, 4, 11, 7, 7]) > >> > > >> > > >> > I'm in favor of returning nans instead of raising exceptions, except > >> > if the return type is int and we cannot cast nan to int. > >> > > >> > If we get functions into numpy that know how to handle nans, then it > >> > would be useful to get the nans, so we can work with them > >> > > >> > Some cases where this might come in handy are when we iterate over > >> > slices of an array that define groups or category levels with possible > >> > empty groups *) > >> > > >> >>>> idx = np.repeat(np.array([0, 1, 2, 3]), [4, 3, 0, 2]) > >> >>>> x = np.arange(9) > >> >>>> [x[idx==ii].mean() for ii in range(4)] > >> > [1.5, 5.0, nan, 7.5] > >> > > >> > instead of > >> >>>> [x[idx==ii].mean() for ii in range(4) if (idx==ii).sum()>0] > >> > [1.5, 5.0, 7.5] > >> > > >> > same for var, I wouldn't have to check that the size is larger than > >> > the ddof (whatever that is in the specific case) > >> > > >> > *) groups could be empty because they were defined for a larger > >> > dataset or as a union of different datasets > >> > >> background: > >> > >> I wrote several robust anova versions a few weeks ago, that were > >> essentially list comprehension as above. However, I didn't allow nans > >> and didn't check for minimum size. > >> Allowing for empty groups to return nan would mainly be a convenience, > >> since I need to check the group size only once. > >> > >> ddof: tests for proportions have ddof=0, for regular t-test ddof=1, > >> for tests of correlation ddof=2 IIRC > >> so we would need to check for the corresponding minimum size that > n-ddof>0 > >> > >> "negative effective dof" doesn't exist, that's np.maximum(n - ddof, 0) > >> which is always non-negative but might result in a zero-division > >> error. :) > >> > >> I don't think making anything conditional on ddof>0 is useful. > >> > > > > So how would you want it? > > > > To summarize the problem areas: > > > > 1) What is the sum of an empty slice? NaN or 0? > 0 as it is now for sum, (including 0 for nansum with no valid entries). > > > 2) What is mean of empy slice? NaN, NaN and warn, or error? > > 3) What if n - ddof < 0 for slice? NaN, NaN and warn, or error? > > 4) What if n - ddof = 0 for slice? NaN, NaN and warn, or error? > > > > I'm tending to NaN and warn for 2 -- 3, because, as Nathaniel notes, the > > warning can be turned into an error by the user. The errstate context > > manager would be good for that. > > Yes, That's what I would prefer also, NaN and ZeroDivisionError, for > 2-4, including mean, var and std, for both nan and non-nan functions. 
> > with the extra argument that 3) and 4) are the same case (except in > polyfit :) > One extra possibility with the nan functions could be a new keyword, error, which would turn warnings into errors. But that might be a bit much. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From stefan at sun.ac.za Mon Jul 15 20:22:26 2013 From: stefan at sun.ac.za (=?iso-8859-1?Q?St=E9fan?= van der Walt) Date: Tue, 16 Jul 2013 02:22:26 +0200 Subject: [Numpy-discussion] What should be the result in some statistics corner cases? In-Reply-To: References: Message-ID: <20130716002226.GB864@shinobi> On Mon, 15 Jul 2013 08:33:47 -0600, Charles R Harris wrote: > On Mon, Jul 15, 2013 at 8:25 AM, Benjamin Root wrote: > > > This is going to need to be heavily documented with doctests. Also, just > > to clarify, are we talking about a ValueError for doing a nansum on an > > empty array as well, or will that now return a zero? > > > > > I was going to leave nansum as is, as it seems that the result was by > choice rather than by accident. That makes sense--I like Sebastian's explanation whereby operations that define an identity yields that upon empty input. St?fan From charlesr.harris at gmail.com Mon Jul 15 20:46:33 2013 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 15 Jul 2013 18:46:33 -0600 Subject: [Numpy-discussion] What should be the result in some statistics corner cases? In-Reply-To: <20130716002226.GB864@shinobi> References: <20130716002226.GB864@shinobi> Message-ID: On Mon, Jul 15, 2013 at 6:22 PM, St?fan van der Walt wrote: > On Mon, 15 Jul 2013 08:33:47 -0600, Charles R Harris wrote: > > On Mon, Jul 15, 2013 at 8:25 AM, Benjamin Root wrote: > > > > > This is going to need to be heavily documented with doctests. Also, > just > > > to clarify, are we talking about a ValueError for doing a nansum on an > > > empty array as well, or will that now return a zero? > > > > > > > > I was going to leave nansum as is, as it seems that the result was by > > choice rather than by accident. > > That makes sense--I like Sebastian's explanation whereby operations that > define an identity yields that upon empty input. > So nansum should return zeros rather than the current NaNs? Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From stefan at sun.ac.za Mon Jul 15 20:55:04 2013 From: stefan at sun.ac.za (=?iso-8859-1?Q?St=E9fan?= van der Walt) Date: Tue, 16 Jul 2013 02:55:04 +0200 Subject: [Numpy-discussion] What should be the result in some statistics corner cases? In-Reply-To: References: <20130716002226.GB864@shinobi> Message-ID: <20130716005504.GA2199@shinobi> On Mon, 15 Jul 2013 18:46:33 -0600, Charles R Harris wrote: > So nansum should return zeros rather than the current NaNs? Yes, my feeling is that nansum([]) should be 0. St?fan From ben.root at ou.edu Mon Jul 15 20:58:48 2013 From: ben.root at ou.edu (Benjamin Root) Date: Mon, 15 Jul 2013 20:58:48 -0400 Subject: [Numpy-discussion] What should be the result in some statistics corner cases? In-Reply-To: References: <20130716002226.GB864@shinobi> Message-ID: To add a bit of context to the question of nansum on empty results, we currently differ from MATLAB and R in this respect, they return zero no matter what. Personally, I think it should return zero, but our current behavior of returning nans has existed for a long time. Personally, I think we need a deprecation warning and possibly wait to change this until 2.0, with plenty of warning that this will change. 
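(A minimal sketch of the difference under discussion, assuming NumPy-1.7-era behaviour: the helper name is made up and only mimics the MATLAB/R convention, it is not a proposed API.)

import numpy as np

def nansum_like_matlab(a, axis=None):
    # Treat NaNs as if absent, so an empty or all-NaN slice falls back to the
    # additive identity, 0 -- the behaviour MATLAB and R already have.
    a = np.asarray(a, dtype=float)
    return np.where(np.isnan(a), 0.0, a).sum(axis=axis)

nansum_like_matlab([])                  # 0.0
nansum_like_matlab([np.nan, np.nan])    # 0.0
np.nansum([np.nan, np.nan])             # nan on 1.7-era releases (the long-standing behaviour)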
Ben Root On Jul 15, 2013 8:46 PM, "Charles R Harris" wrote: > > > On Mon, Jul 15, 2013 at 6:22 PM, St?fan van der Walt wrote: > >> On Mon, 15 Jul 2013 08:33:47 -0600, Charles R Harris wrote: >> > On Mon, Jul 15, 2013 at 8:25 AM, Benjamin Root wrote: >> > >> > > This is going to need to be heavily documented with doctests. Also, >> just >> > > to clarify, are we talking about a ValueError for doing a nansum on an >> > > empty array as well, or will that now return a zero? >> > > >> > > >> > I was going to leave nansum as is, as it seems that the result was by >> > choice rather than by accident. >> >> That makes sense--I like Sebastian's explanation whereby operations that >> define an identity yields that upon empty input. >> > > So nansum should return zeros rather than the current NaNs? > > Chuck > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From Catherine.M.Moroney at jpl.nasa.gov Mon Jul 15 21:03:26 2013 From: Catherine.M.Moroney at jpl.nasa.gov (Moroney, Catherine M (398D)) Date: Tue, 16 Jul 2013 01:03:26 +0000 Subject: [Numpy-discussion] retrieving original array locations from 2d argsort Message-ID: <36D0B3E2-E2CD-4622-89CE-E17D3737A7FE@jpl.nasa.gov> I know that there's an easy way to solve this problem, but I'm not sufficiently knowledgeable about numpy indexing to figure it out. Here is the problem: Take a 2-d array a, of any size. Sort it in ascending order using, I presume, argsort. Step through the sorted array in order, and for each element in the sorted array, retrieve what the corresponding (line, sample) indices in the original array are. For instance: a = numpy.arange(0, 16).reshape(4,4) a[0,:] = -1*numpy.arange(0,4) a[2,:] = -1*numpy.arange(4,8) asort = numpy.sort(a, axis=None) for idx in xrange(0, asort.size): element = asort[idx] !! Find the line and sample location in a that corresponds to the i-th element in assort Thank-you for your help, Catherine From warren.weckesser at gmail.com Mon Jul 15 21:23:30 2013 From: warren.weckesser at gmail.com (Warren Weckesser) Date: Mon, 15 Jul 2013 21:23:30 -0400 Subject: [Numpy-discussion] retrieving original array locations from 2d argsort In-Reply-To: <36D0B3E2-E2CD-4622-89CE-E17D3737A7FE@jpl.nasa.gov> References: <36D0B3E2-E2CD-4622-89CE-E17D3737A7FE@jpl.nasa.gov> Message-ID: On 7/15/13, Moroney, Catherine M (398D) wrote: > I know that there's an easy way to solve this problem, but I'm not > sufficiently knowledgeable > about numpy indexing to figure it out. > > Here is the problem: > > Take a 2-d array a, of any size. > Sort it in ascending order using, I presume, argsort. > Step through the sorted array in order, and for each element in the sorted > array, > retrieve what the corresponding (line, sample) indices in the original array > are. > > For instance: > > a = numpy.arange(0, 16).reshape(4,4) > a[0,:] = -1*numpy.arange(0,4) > a[2,:] = -1*numpy.arange(4,8) > > asort = numpy.sort(a, axis=None) > for idx in xrange(0, asort.size): > element = asort[idx] > !! 
Find the line and sample location in a that corresponds to the > i-th element in assort > One way is to use argsort and `numpy.unravel_index` to recover the original 2D indices: import numpy a = numpy.arange(0, 16).reshape(4,4) a[0,:] = -1*numpy.arange(0,4) a[2,:] = -1*numpy.arange(4,8) flat_sort_indices = numpy.argsort(a, axis=None) original_indices = numpy.unravel_index(flat_sort_indices, a.shape) print " i j a[i,j]" for i, j in zip(*original_indices): element = a[i,j] print "%3d %3d %6d" % (i, j, element) Warren > Thank-you for your help, > > Catherine > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From charlesr.harris at gmail.com Mon Jul 15 21:50:34 2013 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 15 Jul 2013 19:50:34 -0600 Subject: [Numpy-discussion] What should be the result in some statistics corner cases? In-Reply-To: References: <20130716002226.GB864@shinobi> Message-ID: On Mon, Jul 15, 2013 at 6:58 PM, Benjamin Root wrote: > To add a bit of context to the question of nansum on empty results, we > currently differ from MATLAB and R in this respect, they return zero no > matter what. Personally, I think it should return zero, but our current > behavior of returning nans has existed for a long time. > > Personally, I think we need a deprecation warning and possibly wait to > change this until 2.0, with plenty of warning that this will change. > Waiting for the mythical 2.0 probably won't work ;) We also need to give folks a way to adjust ahead of time. I think the easiest way to do that is with an extra keyword, say nanok, with True as the starting default, then later we can make False the default. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Tue Jul 16 01:36:51 2013 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Tue, 16 Jul 2013 07:36:51 +0200 Subject: [Numpy-discussion] What should be the result in some statistics corner cases? In-Reply-To: References: <20130716002226.GB864@shinobi> Message-ID: On Tue, Jul 16, 2013 at 3:50 AM, Charles R Harris wrote: > > > On Mon, Jul 15, 2013 at 6:58 PM, Benjamin Root wrote: > >> To add a bit of context to the question of nansum on empty results, we >> currently differ from MATLAB and R in this respect, they return zero no >> matter what. Personally, I think it should return zero, but our current >> behavior of returning nans has existed for a long time. >> >> Personally, I think we need a deprecation warning and possibly wait to >> change this until 2.0, with plenty of warning that this will change. >> > Waiting for the mythical 2.0 probably won't work ;) We also need to give > folks a way to adjust ahead of time. I think the easiest way to do that is > with an extra keyword, say nanok, with True as the starting default, then > later we can make False the default. > No special keywords to work around behavior change please, it doesn't work well and you end up with a keyword you don't really want. Why not just give a FutureWarning in 1.8 and change to returning zero in 1.9? Ralf -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From arinkverma at gmail.com Tue Jul 16 06:34:47 2013 From: arinkverma at gmail.com (Arink Verma) Date: Tue, 16 Jul 2013 16:04:47 +0530 Subject: [Numpy-discussion] Speedup by avoiding memory alloc twice in scalar array Message-ID: Hi, I am working on performance parity between numpy scalar/small array and python array as GSOC mentored By Charles. Currently I am looking at PyArray_Return, which allocate separate memory just for scalar return. Unlike python which allocate memory once for returning result of scalar operations; numpy calls malloc twice once for the array object itself, and a second time for the array data. These memory allocations are happening in PyArray_NewFromDescr and PyArray_Scalar. Stashing both within a single allocation would be more efficient. In, PyArray_Scalar, new struct (PyLongScalarObject) need allocation in case of scalar arrays. Instead, can we just some how convert/cast PyArrayObject to PyLongScalarObject.?? -- Arink Verma www.arinkverma.in -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Tue Jul 16 07:10:30 2013 From: njs at pobox.com (Nathaniel Smith) Date: Tue, 16 Jul 2013 12:10:30 +0100 Subject: [Numpy-discussion] Speedup by avoiding memory alloc twice in scalar array In-Reply-To: References: Message-ID: On 16 Jul 2013 11:35, "Arink Verma" wrote: > > Hi, > > I am working on performance parity between numpy scalar/small array and python array as GSOC mentored By Charles. > > Currently I am looking at PyArray_Return, which allocate separate memory just for scalar return. Unlike python which allocate memory once for returning result of scalar operations; numpy calls malloc twice once for the array object itself, and a second time for the array data. > > These memory allocations are happening in PyArray_NewFromDescr and PyArray_Scalar. Stashing both within a single allocation would be more efficient. > In, PyArray_Scalar, new struct (PyLongScalarObject) need allocation in case of scalar arrays. Instead, can we just some how convert/cast PyArrayObject to > PyLongScalarObject.?? I think there are more than 2 mallocs you're talking about here? Each ndarray does two mallocs, for the obj and buffer. These could be combined into 1 - just allocate the total size and do some pointer arithmetic, then set OWNDATA to false. Converting array to scalar does more allocations. I doubt there's a way to avoid these, but can't say for sure (on my phone now). In any case the idea of the project is to make scalars obsolete by making arrays competitive, right? So no need to go optimizing the competition ;-). (And more seriously, this slowdown *only* exists because of the array/scalar split, so ignoring it is fair.) In the bigger picture, these are pretty tiny optimizations, aren't they? In the quick profiling I did a while ago, it looked like there was a lot of much bigger low-hanging fruit, and fiddling around with one malloc versus two isn't going to do much if we're still wasting an order of magnitude more time in inefficient loop selection and unnecessary writes to the FP control word? -n -------------- next part -------------- An HTML attachment was scrubbed... URL: From arinkverma at gmail.com Tue Jul 16 09:34:57 2013 From: arinkverma at gmail.com (Arink Verma) Date: Tue, 16 Jul 2013 19:04:57 +0530 Subject: [Numpy-discussion] Speedup by avoiding memory alloc twice in scalar array In-Reply-To: References: Message-ID: >Each ndarray does two mallocs, for the obj and buffer. 
These could be
combined into 1 - just allocate the total size and do some pointer
>arithmetic, then set OWNDATA to false.

So those are the two mallocs that were mentioned in the project introduction; I had
that wrong.

>magnitude more time in inefficient loop selection and unnecessary writes
to the FP control word?

Loop selection contributes around 2~3% of the time. I implemented a cache with
PyThreadState_GetDict(), but it didn't help.
Even generating a prepopulated dict/list in code_generator/generate_umath.py does
not help.

Here is the distribution of time for addition operations.
All memory > related and BuildValue operations cost more than 7%, rest looping ones are > around 2-3%: > > - PyUFunc_AddititonTypeResolver(7.6%) > - *SimpleBinaryOperationTypeResolver(6.2%)* > > > - *execute_legacy_ufunc_loop(20.7%)* > - trivial_three_operand_loop(8.6%) ,this will be around 3.4% when pr # > 3521 get merged > - *PYArray_NewFromDescr(7.3%)* > - PyUFunc_DefaultLegacyInnerLoopSelector(2.5%) > > > - PyUFunc_GetPyValues(12.0%) > - *_extract_pyvals(9.2%)* > - *PyArray_Return(14.3%)* > > Hmm, you prodded me into running those numbers again to see :-) At http://www.arinkverma.in/2013/06/finding-bottleneck-in-pythonnumpy.htmlyou say that you're using a Python compiled with --with-pydebug. Is this true? If so then stop! You want numpy compiled with generic debugging information ("-g" on gcc), and maybe it helps to have Python compiled with "-g" as well. But --with-pydebug goes much further -- it actually changes the Python interpreter in many ways to add lots of expensive self-checks. On my machine simple operations like "[]" (allocate a list) or "1.0 + 1.0" go about 4x slower when I use Ubuntu's python-dbg package (which is compiled with --with-pydebug). You can't trust speed measurements you get from a --with-pydebug build. Anyway, I'm using 64-bit python2.7 from Ubuntu's repo, self-compiled numpy master, with this measurement code: import ctypes profiler = ctypes.CDLL("libprofiler.so.0") def loop(n): import numpy as np print "Numpy:", np.__version__ x = np.asarray([1.0, 2.0]) for i in xrange(n): x + x profiler.ProfilerStart("/tmp/master-array-float64-add.prof") loop(10000000) profiler.ProfilerStop() Graph attached. Notice: - because my benchmark has a 2-element array instead of a scalar array, the special-case scalar return logic (PyArray_Return etc.) disappears. This makes all percentages a bit higher in my graph, because the operation is overall faster. - PyArray_NewFromDescr does indeed take 11.6% of the time, but it's not clear why. Half that time is directly inside PyArray_NewFromDescr, not in any sub-calls to malloc-related functions. Also, you see a lot more time in array_alloc than I do, which may be caused by --with-pydebug. Taking a closer look with google-pprof --disasm=PyArray_NewFromDescr (also attached), it looks like the major cost here is, bizarrely enough, the calculation of the array size?! Out of 338 cumulative samples in this function, I count 175 that are associated with various div/mul instructions, while all the mallocs together take only 164 (= 5.6% of total time). This is pretty bizarre for a bunch of 1-dimensional 2-element arrays!? - PyUFunc_AdditionTypeResolver takes 10.9% of the time, and PyUFunc_DefaultLegacyInnerLoopSelector takes another 4.2% of the time, and this pretty absurd considering that we're talking about locating the float64 + float64 loop, which should not require any complicated logic. This should be like 0.1% or something. I'm not surprised that PyThreadState_GetDict() doesn't help -- doing dict lookups was probably was more expensive than the thing you replaced! But some sort of simple table lookup scheme that reduces loop lookup to chasing a few pointers should be totally doable. - We're spending 13.6% of the time in PyUFunc_getfperr. I'm pretty sure that a lot of this is totally wasted time, because we implement both 'set' and 'clear' operations as 'set+clear', making them twice as costly as necessary. 
(Eventually it would be even better if we could disable this logic entirely for integer arrays, and for when the user has turned off fp error reporting. But neither of these would help for this simple float+float benchmark.) - _extract_pyvals and PyUFunc_GetPyValues (not sure why they aren't linked in my graph, but they seem to be the same code) together use >11% of time. This is also completely silly -- all this time is spent on doing elaborate stuff to look up entries in a python dict, extract them, and convert them into, like, some C level bitmasks. And then doing that again and again on every operation. Instead we should convert this stuff to a C values once, when they're set in the first place, and stash those C values directly into a thread-local variable. See PyThread_*_key in pythread.h for a raw TLS implementation that's always available (and which is what PyThreadState_GetDict() is built on top of). The documentation is in the Python source distribution in comments in Python/thread.c. -n -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- ROUTINE ====================== PyArray_NewFromDescr 168 505 samples (flat, cumulative) 17.4% of total -------------------- /tmp/pip-UN1TwQ-build/numpy/core/src/multiarray/ctors.c 3 5 838: { 1 1 4daf0: push %r15 . . 4daf2: mov %edx,%r11d . . 4daf5: mov %rsi,%r15 . . 4daf8: push %r14 . . 4dafa: push %r13 . . 4dafc: push %r12 . . 4dafe: push %rbp . 1 4daff: mov %rcx,%rbp 1 2 4db02: push %rbx 1 1 4db03: mov %r8,%rbx . . 4db06: sub $0x248,%rsp . . 845: if (descr->subarray) { . . 4db0d: mov 0x28(%rsi),%r13 . . 838: { . . 4db11: mov %rdi,0x28(%rsp) . . 4db16: mov %r9,0x30(%rsp) . . 845: if (descr->subarray) { . . 4db1b: test %r13,%r13 . . 4db1e: je 4dcb0 . . 849: memcpy(newdims, dims, nd*sizeof(npy_intp)); . . 4db24: movslq %edx,%r12 -------------------- /usr/include/x86_64-linux-gnu/bits/string3.h . . 52: return __builtin___memcpy_chk (__dest, __src, __len, __bos0 (__dest)); . . 4db27: lea 0x40(%rsp),%rdi . . 4db2c: mov $0x200,%ecx -------------------- /tmp/pip-UN1TwQ-build/numpy/core/src/multiarray/ctors.c . . 849: memcpy(newdims, dims, nd*sizeof(npy_intp)); . . 4db31: shl $0x3,%r12 -------------------- /usr/include/x86_64-linux-gnu/bits/string3.h . . 52: return __builtin___memcpy_chk (__dest, __src, __len, __bos0 (__dest)); . . 4db35: mov %rbp,%rsi . . 4db38: mov %r11d,0x10(%rsp) . . 4db3d: mov %r12,%rdx . . 4db40: callq 1a1a0 <__memcpy_chk at plt> -------------------- /tmp/pip-UN1TwQ-build/numpy/core/src/multiarray/ctors.c . . 850: if (strides) { . . 4db45: test %rbx,%rbx . . 848: npy_intp *newstrides = NULL; . . 4db48: movq $0x0,0x20(%rsp) . . 850: if (strides) { . . 4db51: mov 0x10(%rsp),%r11d . . 4db56: je 4db7d . . 851: newstrides = newdims + NPY_MAXDIMS; . . 4db58: lea 0x140(%rsp),%rbp -------------------- /usr/include/x86_64-linux-gnu/bits/string3.h . . 52: return __builtin___memcpy_chk (__dest, __src, __len, __bos0 (__dest)); . . 4db60: mov $0x100,%ecx . . 4db65: mov %r12,%rdx . . 4db68: mov %rbx,%rsi . . 4db6b: mov %rbp,%rdi . . 4db6e: callq 1a1a0 <__memcpy_chk at plt> -------------------- /tmp/pip-UN1TwQ-build/numpy/core/src/multiarray/ctors.c . . 851: newstrides = newdims + NPY_MAXDIMS; . . 4db73: mov 0x10(%rsp),%r11d . . 4db78: mov %rbp,0x20(%rsp) . . 228: tuple = PyTuple_Check(old->subarray->shape); . . 4db7d: mov 0x8(%r13),%rdi . . 227: mydim = newdims + oldnd; . . 4db81: lea 0x40(%rsp),%r14 . . 224: *des = old->subarray->base; . . 4db86: mov 0x0(%r13),%rbp . . 
[... the bulk of the annotated google-pprof --disasm listing is omitted here for
readability; only the head and tail of the PyArray_NewFromDescr routine are kept.
As noted above, most of the samples in this routine land on the array size/stride
arithmetic (the div/idiv/imul instructions) and on the PyMem_Malloc/malloc calls. ...]
4e0bb: jmpq 4ddad -------------------- /tmp/pip-UN1TwQ-build/numpy/core/src/multiarray/ctors.c . . 884: sd = descr->elsize = 1; . . 4e0c0: movl $0x1,0x20(%rax) . . 4e0c7: mov %rax,%r15 . . 4e0ca: mov $0x1,%r12d . . 4e0d0: movabs $0x7fffffffffffffff,%rax . . 4e0da: jmpq 4dcd5 . . 4e0df: mov 0x8(%rbp),%rax . . 4e0e3: mov %rbp,%rdi . . 4e0e6: callq *0x30(%rax) . . 4e0e9: jmpq 4de6e . . 4e0ee: mov 0x8(%rbx),%rax . . 4e0f2: mov %rbx,%rdi . . 4e0f5: callq *0x30(%rax) . . 4e0f8: jmpq 4de63 . . 4e0fd: mov 0x8(%r15),%rdx . . 4e101: mov %r15,%rdi . . 4e104: mov %rax,0x18(%rsp) . . 4e109: callq *0x30(%rdx) . . 4e10c: mov 0x10(%rsp),%r11d . . 4e111: mov 0x18(%rsp),%rax . . 4e116: jmpq 4dffc -------------------- ...ip-UN1TwQ-build/numpy/core/include/numpy/npy_3kcompat.h . . 377: return PyCObject_AsVoidPtr(ptr); . . 4e11b: mov %rax,%rdi . . 4e11e: callq 1a6a0 -------------------- /tmp/pip-UN1TwQ-build/numpy/core/src/multiarray/ctors.c . . 1034: Py_DECREF(func); . . 4e123: subq $0x1,0x0(%rbp) -------------------- ...ip-UN1TwQ-build/numpy/core/include/numpy/npy_3kcompat.h . . 377: return PyCObject_AsVoidPtr(ptr); . . 4e128: mov %rax,%rbx -------------------- /tmp/pip-UN1TwQ-build/numpy/core/src/multiarray/ctors.c . . 1034: Py_DECREF(func); . . 4e12b: je 4e18e . 1 1035: if (cfunc((PyArrayObject *)fa, obj) < 0) { . . 4e12d: mov 0x288(%rsp),%rsi . . 4e135: mov %r13,%rdi . . 4e138: callq *%rbx . . 4e13a: test %eax,%eax . . 4e13c: mov %r13,%rbx . . 4e13f: jns 4dc45 . 1 4e145: jmpq 4e033 4 25 961: sd = _array_fill_strides(fa->strides, dims, nd, sd, 1 2 4e14a: mov 0x280(%rsp),%r8d 1 1 4e152: lea 0x40(%r13),%r9 . . 4e156: mov %r12,%rcx . 1 4e159: mov %r11d,%edx 1 1 4e15c: mov %rbp,%rsi . 19 4e15f: callq 4da20 <_array_fill_strides> 1 1 4e164: mov %rax,%r12 . . 4e167: jmpq 4dd9c . . 871: size = 1; . . 4e16c: mov $0x1,%r14d . . 4e172: jmpq 4dd21 . . 938: fa->flags |= NPY_ARRAY_F_CONTIGUOUS; . . 4e177: movl $0x503,0x40(%rax) . . 942: flags = NPY_ARRAY_F_CONTIGUOUS; . . 4e17e: movl $0x2,0x280(%rsp) . . 4e189: jmpq 4dd72 . . 4e18e: mov 0x8(%rbp),%rax . . 4e192: mov %rbp,%rdi . . 4e195: callq *0x30(%rax) . . 4e198: jmp 4e12d . . 4e19a: nopw 0x0(%rax,%rax,1) -------------- next part -------------- A non-text attachment was scrubbed... Name: master-array-float64-add.pdf Type: application/pdf Size: 19235 bytes Desc: not available URL: From nouiz at nouiz.org Tue Jul 16 14:53:30 2013 From: nouiz at nouiz.org (=?ISO-8859-1?Q?Fr=E9d=E9ric_Bastien?=) Date: Tue, 16 Jul 2013 14:53:30 -0400 Subject: [Numpy-discussion] Speedup by avoiding memory alloc twice in scalar array In-Reply-To: References: Message-ID: Hi, On Tue, Jul 16, 2013 at 11:55 AM, Nathaniel Smith wrote: > On Tue, Jul 16, 2013 at 2:34 PM, Arink Verma wrote: > >> >Each ndarray does two mallocs, for the obj and buffer. These could be >> combined into 1 - just allocate the total size and do some pointer >> >arithmetic, then set OWNDATA to false. >> So, that two mallocs has been mentioned in project introduction. I got >> that wrong. >> > > On further thought/reading the code, it appears to be more complicated > than that, actually. > > It looks like (for a non-scalar array) we have 2 calls to PyMem_Malloc: 1 > for the array object itself, and one for the shapes + strides. And, one > call to regular-old malloc: for the data buffer. > > (Mysteriously, shapes + strides together have 2*ndim elements, but to hold > them we allocate a memory region sized to hold 3*ndim elements. I'm not > sure why.) 
> > And contrary to what I said earlier, this is about as optimized as it can > be without breaking ABI. We need at least 2 calls to malloc/PyMem_Malloc, > because the shapes+strides may need to be resized without affecting the > much larger data area. But it's tempting to allocate the array object and > the data buffer in a single memory region, like I suggested earlier. And > this would ALMOST work. But, it turns out there is code out there which > assumes (whether wisely or not) that you can swap around which data buffer > a given PyArrayObject refers to (hi Theano!). And supporting this means > that data buffers and PyArrayObjects need to be in separate memory regions. > Are you sure that Theano "swap" the data ptr of an ndarray? When we play with that, it is on a newly create ndarray. So a node in our graph, won't change the input ndarray structure. It will create a new ndarray structure with new shape/strides and pass a data ptr and we flag the new ndarray with own_data correctly to my knowledge. If Theano pose a problem here, I'll suggest that I fix Theano. But currently I don't see the problem. So if this make you change your mind about this optimization, tell me. I don't want Theano to prevent optimization in NumPy. Fred -------------- next part -------------- An HTML attachment was scrubbed... URL: From irving at naml.us Tue Jul 16 19:44:37 2013 From: irving at naml.us (Geoffrey Irving) Date: Tue, 16 Jul 2013 16:44:37 -0700 Subject: [Numpy-discussion] restricting object arrays to a single Python type Message-ID: Is there a standard way of creating an object array restricted to a particular python type? I want a safe way of sending arrays of objects back and forth between Python and C++, and it'd be great if I could use numpy arrays on the Python side instead of creating a new type. For example, I might have a C++ class Force which is simultaneously a valid Python extension type (also named "Force"). I'd like to be able to switch between Array on the C++ side and a suitable numpy array on the Python side, while preventing Python from ever storing an object with different type (say, a tuple) in the array. Note that Array has the memory representation of an array of PyObject*'s. Thanks, Geoffrey From scopatz at gmail.com Tue Jul 16 19:51:58 2013 From: scopatz at gmail.com (Anthony Scopatz) Date: Tue, 16 Jul 2013 18:51:58 -0500 Subject: [Numpy-discussion] restricting object arrays to a single Python type In-Reply-To: References: Message-ID: Hi Geoffrey, Not to toot my own horn here too much, but you really should have a look at xdress (http://xdress.org/ and https://github.com/xdress/xdress). XDress will generate a wrapper of the Force class for you and then also create a custom numpy dtype for this class. In this way, you could get exactly what you want. If you run into any trouble, let me know and I'll be sure to help you out! This is the kind of thing that xdress is *supposed* to do so bugs here are a big priority for me personally =) Be Well Anthony On Tue, Jul 16, 2013 at 6:44 PM, Geoffrey Irving wrote: > Is there a standard way of creating an object array restricted to a > particular python type? I want a safe way of sending arrays of > objects back and forth between Python and C++, and it'd be great if I > could use numpy arrays on the Python side instead of creating a new > type. > > For example, I might have a C++ class Force which is simultaneously a > valid Python extension type (also named "Force"). 
I'd like to be able > to switch between Array on the C++ side and a suitable numpy > array on the Python side, while preventing Python from ever storing an > object with different type (say, a tuple) in the array. Note that > Array has the memory representation of an array of PyObject*'s. > > Thanks, > Geoffrey > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From irving at naml.us Tue Jul 16 20:15:22 2013 From: irving at naml.us (Geoffrey Irving) Date: Tue, 16 Jul 2013 17:15:22 -0700 Subject: [Numpy-discussion] restricting object arrays to a single Python type In-Reply-To: References: Message-ID: On Tue, Jul 16, 2013 at 4:51 PM, Anthony Scopatz wrote: > Hi Geoffrey, > > Not to toot my own horn here too much, but you really should have a look at > xdress (http://xdress.org/ and https://github.com/xdress/xdress). XDress > will generate a wrapper of the Force class for you and then also create a > custom numpy dtype for this class. In this way, you could get exactly what > you want. Unfortunately it's unlikely to work out of the box, since it uses gccxml which appears to still be based on gcc 4.2. All of our code is C++11, and we need to preserve portability to horrible places like Visual Studio (yes, these two constraints are just barely compatible at the moment). > If you run into any trouble, let me know and I'll be sure to help you out! > This is the kind of thing that xdress is supposed to do so bugs here are a > big priority for me personally =) We're currently using a custom Python binding layer which I wrote a while ago after getting fed up with boost::python. Our system is extremely lightweight but also limited, and in particular is missing a few key features like automatic support for named and default arguments (since these can't be introspected inside C++). It'd be great to chat more about our two feature sets and whether there are opportunities for collaboration and/or merging. I'm not sure if this list is a good place for that discussion, so we could optionally take it off list or to skype if you're up for that (send me an email directly if so). Here are links to our system, which unfortunately is undocumented at the moment: https://github.com/otherlab/core https://github.com/otherlab/core/blob/master/python/ClassTest.cpp # Unit test for wrapping a class Geoffrey From scopatz at gmail.com Tue Jul 16 22:15:03 2013 From: scopatz at gmail.com (Anthony Scopatz) Date: Tue, 16 Jul 2013 21:15:03 -0500 Subject: [Numpy-discussion] restricting object arrays to a single Python type In-Reply-To: References: Message-ID: Hey Geoffrey, Let's definitely take this off (this) list. The discussion could get involved :). Be Well Anthony On Tue, Jul 16, 2013 at 7:15 PM, Geoffrey Irving wrote: > On Tue, Jul 16, 2013 at 4:51 PM, Anthony Scopatz > wrote: > > Hi Geoffrey, > > > > Not to toot my own horn here too much, but you really should have a look > at > > xdress (http://xdress.org/ and https://github.com/xdress/xdress). > XDress > > will generate a wrapper of the Force class for you and then also create a > > custom numpy dtype for this class. In this way, you could get exactly > what > > you want. > > Unfortunately it's unlikely to work out of the box, since it uses > gccxml which appears to still be based on gcc 4.2. 
All of our code is > C++11, and we need to preserve portability to horrible places like > Visual Studio (yes, these two constraints are just barely compatible > at the moment). > > > If you run into any trouble, let me know and I'll be sure to help you > out! > > This is the kind of thing that xdress is supposed to do so bugs here are > a > > big priority for me personally =) > > We're currently using a custom Python binding layer which I wrote a > while ago after getting fed up with boost::python. Our system is > extremely lightweight but also limited, and in particular is missing a > few key features like automatic support for named and default > arguments (since these can't be introspected inside C++). It'd be > great to chat more about our two feature sets and whether there are > opportunities for collaboration and/or merging. I'm not sure if this > list is a good place for that discussion, so we could optionally take > it off list or to skype if you're up for that (send me an email > directly if so). > > Here are links to our system, which unfortunately is undocumented at the > moment: > > https://github.com/otherlab/core > https://github.com/otherlab/core/blob/master/python/ClassTest.cpp > # Unit test for wrapping a class > > Geoffrey > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From lists at onerussian.com Tue Jul 16 23:53:48 2013 From: lists at onerussian.com (Yaroslav Halchenko) Date: Tue, 16 Jul 2013 23:53:48 -0400 Subject: [Numpy-discussion] Really cruel draft of vbench setup for NumPy (.add.reduce benchmarks since 2011) In-Reply-To: <20130709161007.GL27621@onerussian.com> References: <20130506143241.GV5140@onerussian.com> <1367856232.2506.31.camel@sebastian-laptop> <20130506161153.GW5140@onerussian.com> <1367927238.23010.12.camel@sebastian-laptop> <20130701193006.GC27621@onerussian.com> <20130701215804.GG27621@onerussian.com> <20130709161007.GL27621@onerussian.com> Message-ID: <20130717035348.GN27621@onerussian.com> and to put so far reported findings into some kind of automated form, please welcome http://www.onerussian.com/tmp/numpy-vbench/#benchmarks-performance-analysis This is based on a simple 1-way anova of last 10 commits and some point in the past where 10 other commits had smallest timing and were significantly different from the last 10 commits. "Possible recent" is probably too noisy and not sure if useful -- it should point to a closest in time (to the latest commits) diff where a significant excursion from current performance was detected. So per se it has nothing to do with the initial detected performance hit, but in some cases seems still to reasonably locate commits hitting on performance. Enjoy, On Tue, 09 Jul 2013, Yaroslav Halchenko wrote: > Julian Taylor contributed some benchmarks he was "concerned" about, so > now the collection is even better. 
> I will keep updating tests on the same url: > http://www.onerussian.com/tmp/numpy-vbench/ > [it is now running and later I will upload with more commits for higher temporal fidelity] > of particular interest for you might be: > some minor consistent recent losses in > http://www.onerussian.com/tmp/numpy-vbench/vb_vb_io.html#strided-assign-float64 > http://www.onerussian.com/tmp/numpy-vbench/vb_vb_io.html#strided-assign-float32 > http://www.onerussian.com/tmp/numpy-vbench/vb_vb_io.html#strided-assign-int16 > http://www.onerussian.com/tmp/numpy-vbench/vb_vb_io.html#strided-assign-int8 > seems have lost more than 25% of performance throughout the timeline > http://www.onerussian.com/tmp/numpy-vbench/vb_vb_io.html#memcpy-int8 > "fast" calls to all/any seemed to be hurt twice in their life time now running > *3 times slower* than in 2011 -- inflection points correspond to regressions > and/or their fixes in those functions to bring back performance on "slow" > cases (when array traversal is needed, e.g. on arrays of zeros for any) > http://www.onerussian.com/tmp/numpy-vbench/vb_vb_reduce.html#numpy-all-fast > http://www.onerussian.com/tmp/numpy-vbench/vb_vb_reduce.html#numpy-any-fast > Enjoy > On Mon, 01 Jul 2013, Yaroslav Halchenko wrote: > > FWIW -- updated plots with contribution from Julian Taylor > > http://www.onerussian.com/tmp/numpy-vbench-20130701/vb_vb_indexing.html#mmap-slicing > > ;-) > > On Mon, 01 Jul 2013, Yaroslav Halchenko wrote: > > > Hi Guys, > > > not quite the recommendations you expressed, but here is my ugly > > > attempt to improve benchmarks coverage: > > > http://www.onerussian.com/tmp/numpy-vbench-20130701/index.html > > > initially I also ran those ufunc benchmarks per each dtype separately, > > > but then resulting webpage is loong which brings my laptop on its knees > > > by firefox. So I commented those out for now, and left only "summary" > > > ones across multiple datatypes. > > > There is a bug in sphinx which forbids embedding some figures for > > > vb_random "as is", so pardon that for now... > > > I have not set cpu affinity of the process (but ran it at nice -10), so may be > > > that also contributed to variance of benchmark estimates. And there probably > > > could be more of goodies (e.g. gc control etc) to borrow from > > > https://github.com/pydata/pandas/blob/master/vb_suite/test_perf.py which I have > > > just discovered to minimize variance. > > > nothing really interesting was pin-pointed so far, besides that > > > - svd became a bit faster since few months back ;-) > > > http://www.onerussian.com/tmp/numpy-vbench-20130701/vb_vb_linalg.html > > > - isnan (and isinf, isfinite) got improved > > > http://www.onerussian.com/tmp/numpy-vbench-20130701/vb_vb_ufunc.html#numpy-isnan-a-10types > > > - right_shift got a miniscule slowdown from what it used to be? > > > http://www.onerussian.com/tmp/numpy-vbench-20130701/vb_vb_ufunc.html#numpy-right-shift-a-a-3types > > > As before -- current code of those benchmarks collection is available > > > at http://github.com/yarikoptic/numpy-vbench/pull/new/master > > > if you have specific snippets you would like to benchmark -- just state them > > > here or send a PR -- I will add them in. > > > Cheers, > > > On Tue, 07 May 2013, Da?id wrote: > > > > On 7 May 2013 13:47, Sebastian Berg wrote: > > > > > Indexing/assignment was the first thing I thought of too (also because > > > > > fancy indexing/assignment really could use some speedups...). 
Other then > > > > > that maybe some timings for small arrays/scalar math, but that might be > > > > > nice for that GSoC project. > > > > Why not going bigger? Ufunc operations on big arrays, CPU and memory bound. > > > > Also, what about interfacing with other packages? It may increase the > > > > compiling overhead, but I would like to see Cython in action (say, > > > > only last version, maybe it can be fixed). > > > > _______________________________________________ > > > > NumPy-Discussion mailing list > > > > NumPy-Discussion at scipy.org > > > > http://mail.scipy.org/mailman/listinfo/numpy-discussion -- Yaroslav O. Halchenko, Ph.D. http://neuro.debian.net http://www.pymvpa.org http://www.fail2ban.org Senior Research Associate, Psychological and Brain Sciences Dept. Dartmouth College, 419 Moore Hall, Hinman Box 6207, Hanover, NH 03755 Phone: +1 (603) 646-9834 Fax: +1 (603) 646-1419 WWW: http://www.linkedin.com/in/yarik From scopatz at gmail.com Wed Jul 17 01:50:17 2013 From: scopatz at gmail.com (Anthony Scopatz) Date: Wed, 17 Jul 2013 00:50:17 -0500 Subject: [Numpy-discussion] restricting object arrays to a single Python type In-Reply-To: References: Message-ID: On Tue, Jul 16, 2013 at 7:15 PM, Geoffrey Irving wrote: > On Tue, Jul 16, 2013 at 4:51 PM, Anthony Scopatz > wrote: > > Hi Geoffrey, > > > > Not to toot my own horn here too much, but you really should have a look > at > > xdress (http://xdress.org/ and https://github.com/xdress/xdress). > XDress > > will generate a wrapper of the Force class for you and then also create a > > custom numpy dtype for this class. In this way, you could get exactly > what > > you want. > > Unfortunately it's unlikely to work out of the box, since it uses > gccxml which appears to still be based on gcc 4.2. All of our code is > C++11, and we need to preserve portability to horrible places like > Visual Studio (yes, these two constraints are just barely compatible > at the moment). > Hey Geoffrey, I don't think that GCC-XML should be a show stopper. There are a couple of reasons for this. The first is that, correct me if I am wrong, but most of the C++11 updates are not really changes that affect top-level API elements -- which is the only thing that you care about when creating wrappers. So unless you are relying on a lot of lambdas or something, my guess is that GCC-XML might just work anyways. The second reason is that xdress is written to be *very* modular. There is no reason that it needs to rely on GCC-XML at all. Other parsers and ASTs, such as Clang or SWIG or ROSE, could be used. In fact there is a mostly complete version of a Clang AST present in XDress already. I have disabled it and am not worrying about it personally because the current Clang Python AST bindings do not support template arguments. Since this is a major use case of mine, I had to abandon that code line. However, other people could forge ahead with Clang in one of a few ways: 1. Use the nascent XML output of Clang, 2. Use the existing Clang Python AST bindings, understanding that they are incomplete 3. Fix the Python Clang AST bindings 4. Write your on Python Clang AST Bindings (I know people who have done this but they are not open sorce), possibly using XDress! In any event, none of this is super difficult, wouldn't impair xdress development at all, and everyone would benefit. Alternative parsers are something that is on my radar and I would love to support. > > If you run into any trouble, let me know and I'll be sure to help you > out! 
> > This is the kind of thing that xdress is supposed to do so bugs here are > a > > big priority for me personally =) > > We're currently using a custom Python binding layer which I wrote a > while ago after getting fed up with boost::python. I have been down that painful road =) > Our system is > extremely lightweight but also limited, and in particular is missing a > few key features like automatic support for named and default > arguments (since these can't be introspected inside C++). XDress supports these. > It'd be > great to chat more about our two feature sets and whether there are > opportunities for collaboration and/or merging. I'm not sure if this > list is a good place for that discussion, so we could optionally take > it off list or to skype if you're up for that (send me an email > directly if so). > I'd be happy to! My skype name is 'scopatz' or you can find me on Google+. I tend not to just hang out on skype and I have a lot to do tomorrow, so if you want to set a time that would probably be best. My schedule is pretty flexible if busy. Be Well Anthony > Here are links to our system, which unfortunately is undocumented at the > moment: > > https://github.com/otherlab/core > https://github.com/otherlab/core/blob/master/python/ClassTest.cpp > # Unit test for wrapping a class > > Geoffrey > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Wed Jul 17 10:25:43 2013 From: njs at pobox.com (Nathaniel Smith) Date: Wed, 17 Jul 2013 15:25:43 +0100 Subject: [Numpy-discussion] ufunc overrides In-Reply-To: References: Message-ID: On Thu, Jul 11, 2013 at 4:29 AM, Blake Griffith wrote: > > Hello NumPy, > > Part of my GSoC is compatibility with SciPy's sparse matrices and NumPy's ufuncs. Currently there is no feasible way to do this without changing ufuncs a bit. > > I've been considering a mechanism to override ufuncs based on checking the ufuncs arguments for a __ufunc_override__ attribute. Then handing off the operation to a function specified by that attribute. I prototyped this in python and did a demo in a blog post here: > http://cwl.cx/posts/week-6-ufunc-overrides.html > This is similar to a previously discussed, but never implemented change: > http://mail.scipy.org/pipermail/numpy-discussion/2011-June/056945.html I've just posted long comment with a slightly different proposal in the PR: https://github.com/numpy/numpy/pull/3524#issuecomment-21115548 Mentioning this here because this has the potential to majorly affect anyone working with ndarray subclasses or other array-like objects (e.g., masked arrays, GPU arrays, etc.), so if you care about these things then please take a look and help us make sure that the final API is flexible enough to handle your needs. -n From njs at pobox.com Wed Jul 17 10:39:54 2013 From: njs at pobox.com (Nathaniel Smith) Date: Wed, 17 Jul 2013 15:39:54 +0100 Subject: [Numpy-discussion] Speedup by avoiding memory alloc twice in scalar array In-Reply-To: References: Message-ID: On Tue, Jul 16, 2013 at 7:53 PM, Fr?d?ric Bastien wrote: > Hi, > > > On Tue, Jul 16, 2013 at 11:55 AM, Nathaniel Smith wrote: >> >> On Tue, Jul 16, 2013 at 2:34 PM, Arink Verma wrote: >>> >>> >Each ndarray does two mallocs, for the obj and buffer. 
These could be >>> > combined into 1 - just allocate the total size and do some pointer >>> > >arithmetic, then set OWNDATA to false. >>> So, that two mallocs has been mentioned in project introduction. I got >>> that wrong. >> >> >> On further thought/reading the code, it appears to be more complicated >> than that, actually. >> >> It looks like (for a non-scalar array) we have 2 calls to PyMem_Malloc: 1 >> for the array object itself, and one for the shapes + strides. And, one call >> to regular-old malloc: for the data buffer. >> >> (Mysteriously, shapes + strides together have 2*ndim elements, but to hold >> them we allocate a memory region sized to hold 3*ndim elements. I'm not sure >> why.) >> >> And contrary to what I said earlier, this is about as optimized as it can >> be without breaking ABI. We need at least 2 calls to malloc/PyMem_Malloc, >> because the shapes+strides may need to be resized without affecting the much >> larger data area. But it's tempting to allocate the array object and the >> data buffer in a single memory region, like I suggested earlier. And this >> would ALMOST work. But, it turns out there is code out there which assumes >> (whether wisely or not) that you can swap around which data buffer a given >> PyArrayObject refers to (hi Theano!). And supporting this means that data >> buffers and PyArrayObjects need to be in separate memory regions. > > > Are you sure that Theano "swap" the data ptr of an ndarray? When we play > with that, it is on a newly create ndarray. So a node in our graph, won't > change the input ndarray structure. It will create a new ndarray structure > with new shape/strides and pass a data ptr and we flag the new ndarray with > own_data correctly to my knowledge. > > If Theano pose a problem here, I'll suggest that I fix Theano. But currently > I don't see the problem. So if this make you change your mind about this > optimization, tell me. I don't want Theano to prevent optimization in NumPy. It's entirely possible I misunderstood, so let's see if we can work it out. I know that you want to assign to the ->data pointer in a PyArrayObject, right? That's what caused some trouble with the 1.7 API deprecations, which were trying to prevent direct access to this field? Creating a new array given a pointer to a memory region is no problem, and obviously will be supported regardless of any optimizations. But if that's all you were doing then you shouldn't have run into the deprecation problem. Or maybe I'm misremembering! The problem is if one wants to (a) create a PyArrayObject, which will by default allocate a new memory region and assign a pointer to it to the ->data field, and *then* (b) "steal" that memory region and replace it with another one, while keeping the same PyArrayObject. This is technically possible right now (though I wouldn't say it was necessarily a good idea!), but it would become impossible if we allocated the PyArrayObject and data into a single region. The profiles suggest that this would only make allocation of arrays maybe 15% faster, with probably a similar effect on deallocation. And I'm not sure how often array allocation per se is actually a bottleneck -- usually you also do things with the arrays, which is more expensive :-). But hey, 15% is nothing to sneeze at. 
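(If anyone wants a rough feel for how much of their own small-array workload is pure allocation overhead, here is a completely unscientific sketch -- the array size, the repeat count, and of course whatever numbers come back are all made up / machine-dependent:

import numpy as np, timeit
setup = "import numpy as np; a = np.empty(3)"
print(timeit.timeit("np.empty(3)", setup, number=100000))  # allocate a tiny array, throw it away
print(timeit.timeit("a + a", setup, number=100000))        # one allocation *plus* real ufunc work

The second statement also has to allocate its output array, so the gap between the two is roughly the non-allocation part of the work.)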
-n From njs at pobox.com Wed Jul 17 11:18:07 2013 From: njs at pobox.com (Nathaniel Smith) Date: Wed, 17 Jul 2013 16:18:07 +0100 Subject: [Numpy-discussion] empty_like for masked arrays In-Reply-To: References: Message-ID: On Mon, Jul 15, 2013 at 2:33 PM, Gregorio Bastardo wrote: > Hi, > > On Mon, Jun 10, 2013 at 3:47 PM, Nathaniel Smith wrote: >> Hi all, >> >> Is there anyone out there using numpy masked arrays, who has an >> opinion on how empty_like (and its friends ones_like, zeros_like) >> should handle the mask? >> >> Right now apparently if you call np.ma.empty_like on a masked array, >> you get a new masked array that shares the original array's mask, so >> modifying one modifies the other. That's almost certainly wrong. This >> PR: >> https://github.com/numpy/numpy/pull/3404 >> makes it so instead the new array has values that are all set to >> empty/zero/one, and a mask which is set to match the input array's >> mask (so whenever something was masked in the original array, the >> empty/zero/one in that place is also masked). We don't know if this is >> the desired behaviour for these functions, though. Maybe it's more >> intuitive for the new array to match the original array in shape and >> dtype, but to always have an empty mask. Or maybe not. None of us >> really use np.ma, so if you do and have an opinion then please speak >> up... > > I recently joined the mailing list, so the message might not reach the > original thread, sorry for that. > > I use masked arrays extensively, and would vote for the first option, > as I use the *_like operations with the assumption that the resulting > array has the same mask as the original. I think it's more intuitive > than selecting between all masked or all unmasked behaviour. If it's > not too late, please consider my use case. The original submitter of that PR has been silent since then, so so far nothing has happened. So that's 2 votes for copying the mask and 3 against, I guess. That's not very consensus-ful. If there's really a lot of confusion here, then it's possible the answer is that np.ma.empty_like should just raise an error or not be defined. Or can you all agree? -n From nouiz at nouiz.org Wed Jul 17 12:57:16 2013 From: nouiz at nouiz.org (=?ISO-8859-1?Q?Fr=E9d=E9ric_Bastien?=) Date: Wed, 17 Jul 2013 12:57:16 -0400 Subject: [Numpy-discussion] Speedup by avoiding memory alloc twice in scalar array In-Reply-To: References: Message-ID: On Wed, Jul 17, 2013 at 10:39 AM, Nathaniel Smith wrote: > On Tue, Jul 16, 2013 at 7:53 PM, Fr?d?ric Bastien wrote: > > Hi, > > > > > > On Tue, Jul 16, 2013 at 11:55 AM, Nathaniel Smith wrote: > >> > >> On Tue, Jul 16, 2013 at 2:34 PM, Arink Verma > wrote: > >>> > >>> >Each ndarray does two mallocs, for the obj and buffer. These could be > >>> > combined into 1 - just allocate the total size and do some pointer > >>> > >arithmetic, then set OWNDATA to false. > >>> So, that two mallocs has been mentioned in project introduction. I got > >>> that wrong. > >> > >> > >> On further thought/reading the code, it appears to be more complicated > >> than that, actually. > >> > >> It looks like (for a non-scalar array) we have 2 calls to PyMem_Malloc: > 1 > >> for the array object itself, and one for the shapes + strides. And, one > call > >> to regular-old malloc: for the data buffer. > >> > >> (Mysteriously, shapes + strides together have 2*ndim elements, but to > hold > >> them we allocate a memory region sized to hold 3*ndim elements. I'm not > sure > >> why.) 
> >> > >> And contrary to what I said earlier, this is about as optimized as it > can > >> be without breaking ABI. We need at least 2 calls to > malloc/PyMem_Malloc, > >> because the shapes+strides may need to be resized without affecting the > much > >> larger data area. But it's tempting to allocate the array object and the > >> data buffer in a single memory region, like I suggested earlier. And > this > >> would ALMOST work. But, it turns out there is code out there which > assumes > >> (whether wisely or not) that you can swap around which data buffer a > given > >> PyArrayObject refers to (hi Theano!). And supporting this means that > data > >> buffers and PyArrayObjects need to be in separate memory regions. > > > > > > Are you sure that Theano "swap" the data ptr of an ndarray? When we play > > with that, it is on a newly create ndarray. So a node in our graph, won't > > change the input ndarray structure. It will create a new ndarray > structure > > with new shape/strides and pass a data ptr and we flag the new ndarray > with > > own_data correctly to my knowledge. > > > > If Theano pose a problem here, I'll suggest that I fix Theano. But > currently > > I don't see the problem. So if this make you change your mind about this > > optimization, tell me. I don't want Theano to prevent optimization in > NumPy. > > It's entirely possible I misunderstood, so let's see if we can work it > out. I know that you want to assign to the ->data pointer in a > PyArrayObject, right? That's what caused some trouble with the 1.7 API > deprecations, which were trying to prevent direct access to this > field? Creating a new array given a pointer to a memory region is no > problem, and obviously will be supported regardless of any > optimizations. But if that's all you were doing then you shouldn't > have run into the deprecation problem. Or maybe I'm misremembering! > What is currently done at only 1 place is to create a new PyArrayObject with a given ptr. So NumPy don't do the allocation. We later change that ptr to another one. It is the change to the ptr of the just created PyArrayObject that caused problem with the interface deprecation. I fixed all other problem releated to the deprecation (mostly just rename of function/macro). But I didn't fixed this one yet. I would need to change the logic to compute the final ptr before creating the PyArrayObject object and create it with the final data ptr. But in call cases, NumPy didn't allocated data memory for this object, so this case don't block your optimization. One thing in our optimization "wish list" is to reuse allocated PyArrayObject between Theano function call for intermediate results(so completly under Theano control). This could be useful in particular for reshape/transpose/subtensor. Those functions are pretty fast and from memory, I already found the allocation time was significant. But in those cases, it is on PyArrayObject that are views, so the metadata and the data would be in different memory region in all cases. The other cases of optimization "wish list" is if we want to reuse the PyArrayObject when the shape isn't the good one (but the number of dimensions is the same). If we do that for operation like addition, we will need to use PyArray_Resize(). This will be done on PyArrayObject whose data memory was allocated by NumPy. So if you do one memory allowcation for metadata and data, just make sure that PyArray_Resize() will handle that correctly. 
On the usefulness of doing only 1 memory allocation, on our old gpu ndarray, we where doing 2 alloc on the GPU, one for metadata and one for data. I removed this, as this was a bottleneck. allocation on the CPU are faster the on the GPU, but this is still something that is slow except if you reuse memory. Do PyMem_Malloc, reuse previous small allocation? For those that read up all this, the conclusion is that Theano should block this optimization. If you optimize the allocation of new PyArrayObject, they will be less incentive to do the "wish list" optimization. One last thing to keep in mind is that you should keep the data segment aligned. I would arg that alignment on the datatype size isn't enough, so I would suggest on cache line size or something like this. But I don't have number to base this one. This would also help in the case of resize that change the number of dimensions. Fred -------------- next part -------------- An HTML attachment was scrubbed... URL: From brady.mccary at gmail.com Wed Jul 17 13:21:43 2013 From: brady.mccary at gmail.com (Brady McCary) Date: Wed, 17 Jul 2013 12:21:43 -0500 Subject: [Numpy-discussion] Size/Shape Message-ID: NumPy Folks, Would someone please discuss or point me to a discussion about the discrepancy in size vs shape in the following MWE? In this example I have used a grayscale PNG version of the ImageMagick logo, but any image which is not square will do. $ python Python 2.7.4 (default, Apr 19 2013, 18:28:01) [GCC 4.7.3] on linux2 Type "help", "copyright", "credits" or "license" for more information. >>> import Image >>> import matplotlib.pyplot as plt >>> >>> s = 'logo.png' >>> >>> im = Image.open(s) >>> ar = plt.imread(s) >>> >>> im.size (640, 480) >>> >>> ar.shape (480, 640) >>> The extents/shape of the NumPy array (as loaded by matplotlib, but this convention seems uniform through NumPy) are transposed from what seems to be the usual convention. Why was this choice made? Brady From robert.kern at gmail.com Wed Jul 17 13:41:37 2013 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 17 Jul 2013 18:41:37 +0100 Subject: [Numpy-discussion] Size/Shape In-Reply-To: References: Message-ID: On Wed, Jul 17, 2013 at 6:21 PM, Brady McCary wrote: > > NumPy Folks, > > Would someone please discuss or point me to a discussion about the > discrepancy in size vs shape in the following MWE? In this example I > have used a grayscale PNG version of the ImageMagick logo, but any > image which is not square will do. > > $ python > Python 2.7.4 (default, Apr 19 2013, 18:28:01) > [GCC 4.7.3] on linux2 > Type "help", "copyright", "credits" or "license" for more information. > >>> import Image > >>> import matplotlib.pyplot as plt > >>> > >>> s = 'logo.png' > >>> > >>> im = Image.open(s) > >>> ar = plt.imread(s) > >>> > >>> im.size > (640, 480) > >>> > >>> ar.shape > (480, 640) > >>> > > The extents/shape of the NumPy array (as loaded by matplotlib, but > this convention seems uniform through NumPy) are transposed from what > seems to be the usual convention. Why was this choice made? It matches Python sequence semantics better. ar[i] will index along the first axis to return an array of one less dimension, which itself can be indexed (ar[i])[j]. Try using a list of lists to see what we are trying to be consistent with. To extend this to multidimensional indexing, we want ar[i,j] to give the same thing as ar[i][j]. The .shape attribute needs to be given in the same order that indexing happens. 
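Concretely -- using a throwaway all-zeros array of the same shape, rather than your actual logo.png:

>>> ar = np.zeros((480, 640))
>>> ar.shape
(480, 640)
>>> ar[0].shape              # the first index peels off the first axis
(640,)
>>> ar[10, 20] == ar[10][20]
True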
Note that what you call the "usual convention" isn't all that standard for general multidimensional arrays. It's just one of two fairly arbitrary choices, usually derived from the default memory layout at a very low level. Fortran picked one convention, C picked another; numpy and Python are built with C so we use its default conventions. Now, you are right that image dimensions are usually quoted as (width, height), but numpy arrays represent a much broader range of objects than images. -- Robert Kern -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Wed Jul 17 17:42:57 2013 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 17 Jul 2013 15:42:57 -0600 Subject: [Numpy-discussion] Speedup by avoiding memory alloc twice in scalar array In-Reply-To: References: Message-ID: On Wed, Jul 17, 2013 at 10:57 AM, Fr?d?ric Bastien wrote: > > > > On Wed, Jul 17, 2013 at 10:39 AM, Nathaniel Smith wrote: > >> On Tue, Jul 16, 2013 at 7:53 PM, Fr?d?ric Bastien >> wrote: >> > Hi, >> > >> > >> > On Tue, Jul 16, 2013 at 11:55 AM, Nathaniel Smith >> wrote: >> >> >> >> On Tue, Jul 16, 2013 at 2:34 PM, Arink Verma >> wrote: >> >>> >> >>> >Each ndarray does two mallocs, for the obj and buffer. These could be >> >>> > combined into 1 - just allocate the total size and do some pointer >> >>> > >arithmetic, then set OWNDATA to false. >> >>> So, that two mallocs has been mentioned in project introduction. I got >> >>> that wrong. >> >> >> >> >> >> On further thought/reading the code, it appears to be more complicated >> >> than that, actually. >> >> >> >> It looks like (for a non-scalar array) we have 2 calls to >> PyMem_Malloc: 1 >> >> for the array object itself, and one for the shapes + strides. And, >> one call >> >> to regular-old malloc: for the data buffer. >> >> >> >> (Mysteriously, shapes + strides together have 2*ndim elements, but to >> hold >> >> them we allocate a memory region sized to hold 3*ndim elements. I'm >> not sure >> >> why.) >> >> >> >> And contrary to what I said earlier, this is about as optimized as it >> can >> >> be without breaking ABI. We need at least 2 calls to >> malloc/PyMem_Malloc, >> >> because the shapes+strides may need to be resized without affecting >> the much >> >> larger data area. But it's tempting to allocate the array object and >> the >> >> data buffer in a single memory region, like I suggested earlier. And >> this >> >> would ALMOST work. But, it turns out there is code out there which >> assumes >> >> (whether wisely or not) that you can swap around which data buffer a >> given >> >> PyArrayObject refers to (hi Theano!). And supporting this means that >> data >> >> buffers and PyArrayObjects need to be in separate memory regions. >> > >> > >> > Are you sure that Theano "swap" the data ptr of an ndarray? When we play >> > with that, it is on a newly create ndarray. So a node in our graph, >> won't >> > change the input ndarray structure. It will create a new ndarray >> structure >> > with new shape/strides and pass a data ptr and we flag the new ndarray >> with >> > own_data correctly to my knowledge. >> > >> > If Theano pose a problem here, I'll suggest that I fix Theano. But >> currently >> > I don't see the problem. So if this make you change your mind about this >> > optimization, tell me. I don't want Theano to prevent optimization in >> NumPy. >> >> It's entirely possible I misunderstood, so let's see if we can work it >> out. 
I know that you want to assign to the ->data pointer in a >> PyArrayObject, right? That's what caused some trouble with the 1.7 API >> deprecations, which were trying to prevent direct access to this >> field? Creating a new array given a pointer to a memory region is no >> problem, and obviously will be supported regardless of any >> optimizations. But if that's all you were doing then you shouldn't >> have run into the deprecation problem. Or maybe I'm misremembering! >> > > What is currently done at only 1 place is to create a new PyArrayObject > with a given ptr. So NumPy don't do the allocation. We later change that > ptr to another one. > > It is the change to the ptr of the just created PyArrayObject that caused > problem with the interface deprecation. I fixed all other problem releated > to the deprecation (mostly just rename of function/macro). But I didn't > fixed this one yet. I would need to change the logic to compute the final > ptr before creating the PyArrayObject object and create it with the final > data ptr. But in call cases, NumPy didn't allocated data memory for this > object, so this case don't block your optimization. > > One thing in our optimization "wish list" is to reuse allocated > PyArrayObject between Theano function call for intermediate results(so > completly under Theano control). This could be useful in particular for > reshape/transpose/subtensor. Those functions are pretty fast and from > memory, I already found the allocation time was significant. But in those > cases, it is on PyArrayObject that are views, so the metadata and the data > would be in different memory region in all cases. > > The other cases of optimization "wish list" is if we want to reuse the > PyArrayObject when the shape isn't the good one (but the number of > dimensions is the same). If we do that for operation like addition, we will > need to use PyArray_Resize(). This will be done on PyArrayObject whose data > memory was allocated by NumPy. So if you do one memory allowcation for > metadata and data, just make sure that PyArray_Resize() will handle that > correctly. > > On the usefulness of doing only 1 memory allocation, on our old gpu > ndarray, we where doing 2 alloc on the GPU, one for metadata and one for > data. I removed this, as this was a bottleneck. allocation on the CPU are > faster the on the GPU, but this is still something that is slow except if > you reuse memory. Do PyMem_Malloc, reuse previous small allocation? > > For those that read up all this, the conclusion is that Theano should > block this optimization. If you optimize the allocation of new > PyArrayObject, they will be less incentive to do the "wish list" > optimization. > > One last thing to keep in mind is that you should keep the data segment > aligned. I would arg that alignment on the datatype size isn't enough, so I > would suggest on cache line size or something like this. But I don't have > number to base this one. This would also help in the case of resize that > change the number of dimensions. > > There is a similar thing done in f2py which is still keeping it from being current with the 1.7 macro replacement by functions. I'd like to add a 'swap' type function and would welcome discussion/implementation fo such. Chuck -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From jrocher at enthought.com Wed Jul 17 18:43:47 2013 From: jrocher at enthought.com (Jonathan Rocher) Date: Wed, 17 Jul 2013 17:43:47 -0500 Subject: [Numpy-discussion] [ANN] 4th Python Symposium at AMS2014 Message-ID: [Apologies for the cross-post] Dear all, If you work with Python around themes like big data, climate, meteorological or oceanic science, and/or GIS, you should come present at the 4th Python Symposium, as part of the American Meteorological Society conference in Atlanta in Feb 2014: http://annual.ametsoc.org/2014/index.cfm/programs-and-events/conferences-and-symposia/fourth-symposium-on-advances-in-modeling-and-analysis-using-python/ The *abstract deadline is Aug 1st*! Jonathan -- Jonathan Rocher, PhD Scientific software developer SciPy2013 conference co-chair Enthought, Inc. jrocher at enthought.com 1-512-536-1057 http://www.enthought.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From grb at skogoglandskap.no Thu Jul 18 03:32:31 2013 From: grb at skogoglandskap.no (Graeme B. Bell) Date: Thu, 18 Jul 2013 07:32:31 +0000 Subject: [Numpy-discussion] np.select use case In-Reply-To: References: Message-ID: Quick question: Can anyone think of a realistic/real-world use case for array broadcasting and np.select, (other than scalar to ndarray broadcasting)? e.g. differently shaped arrays with matching lower dimensions. (I don't know if a use case even exists). Graeme. From njs at pobox.com Thu Jul 18 08:52:00 2013 From: njs at pobox.com (Nathaniel Smith) Date: Thu, 18 Jul 2013 13:52:00 +0100 Subject: [Numpy-discussion] Bringing order to higher dimensional operations Message-ID: Hi all, I hadn't realized until Pauli just pointed it out that np.dot and the new gufuncs actually have different rules for how they handle extra axes: https://github.com/numpy/numpy/pull/3524#issuecomment-21117601 This is turning into a big mess, and it may take some hard decisions to fix it. Before I explain the problem, a bit of terminology: a "vectorized operation" is built by taking an "intrinsic operation" and putting a loop around it, to apply it many times. So for np.add for example, the intrinsic operation is scalar addition, and if you put a loop around this you get vectorized addition. The question then is, given some input arrays: which parts do you loop over, and which things you pass to the intrinsic operation? In the case of scalar operations (like classic ufuncs), this is pretty straightforward: we broadcast the input arrays together, and loop over them in parallel. So np.add(ones((2, 3, 4)), ones((3, 4))).shape == (2, 3, 4) The other obvious option is, instead of looping over the two arrays in parallel, instead find all combinations. This is what the .outer method on ufuncs does, so np.add.outer(ones((2, 3)), ones((4, 5))).shape == (2, 3, 4, 5) Now, that's for vectorized versions scalar operations. We also have a bunch of vectorized operations whose "intrinsic operation" is itself a function over multidimensional arrays. For example, reduction operations like 'sum' intrinsically take a 1-dim array and return a 0-dim (scalar), but they can be vectorized to apply to a single axis of a >1-dim array. Or matrix multiplication intrinsically takes two 2-dim arrays and returns a 2-dim array; it can be vectorized to apply to >2 dim inputs. (As shorthand I'll write "takes two 2-dim arrays and returns a 2-dim array" as "2,2->2"; so 'sum' is 1->0 and 'cumsum' is 1->1.) ----- Okay, now I can explain the problem. 
For vectorized multidimensional operations, we have four (!) different conventions deployed: Convention 1: Ufunc .reduce (1->0), .accumulate (1->1), .reduceat (1,1->1): These pick the 0th axis for the intrinsic axis and loop over the rest. (By default; can be modified by specifying axis.) np.add.reduce(np.ones((2, 3, 4))).shape == (3, 4) np.add.accumulate(np.ones((2, 3, 4))).shape == (2, 3, 4) Convention 2: Reduction (1->0) and accumulation (1->1) operations defined as top-level functions and ndarray methods (np.sum, ndarray.sum, np.mean, np.cumprod, etc.): These flatten the array and use the whole thing as the intrinsic axis. (By default; can be modified by specifying axis=.) np.sum(np.ones((2, 3, 4))).shape == () np.cumsum(np.ones((2, 3, 4))).shape == (24,) Convention 3: gufuncs (any->any): These take the *last* k axes for the intrinsic axes, and then broadcast and parallel-loop over the rest. Cannot currently be modified. gu_dot = np.linalg._umath_linalg.matrix_multiply # requires current master gu_dot(np.ones((2, 3, 4, 5)), np.ones((1, 3, 5, 6))).shape == (2, 3, 4, 6) (So what's happened here is that the gufunc pulled off the last two axes of each array, so the intrinsic operation is always going to be a matrix multiply of a (4, 5) array by a (5, 6) array, producing a (4, 6) array. Then it broadcast the remaining axes together: (2, 3) and (1, 3) broadcast to (2, 3), and did a parallel iteration over them: output[0, 0, :, :] is the result of dot(input1[0, 0, :, :], input2[0, 0, :, :]).) Convention 4: np.dot (2,2->2): this one is bizarre: np.dot(np.ones((1, 2, 10, 11)), np.ones((101, 102, 11, 12))).shape == (1, 2, 10, 101, 102, 12) So what it's done is picked the last two axes to be the intrinsic axes, just like the gufunc -- so it always does a bunch of matrix multiplies of a (10, 11) array with an (11, 12) array. But then it didn't do a ufunc-style parallel loop. Instead it did a ufunc.outer-style outer loop, in which it found all n^2 ways of matching up a matrix in the first input with a matrix in the second input, and took the dot of each. And then it packed these up into an array with a rather weird shape: first all the looping axes from the first input, then the first axis of the output matrix, then all the looping axes from the second input, and then finally the second axis of the output matrix. ----- There are plenty of general reasons to want to simplify this -- it'd make numpy easier to explain and understand, simplify the code, etc. -- but also a few more specific reasons that make it urgent: - gufuncs haven't really been used much yet, so maybe it'd be easy to change how they work now. But in the next release, everything in np.linalg will become a gufunc, so it'll become much harder to change after that. - we'd really like np.dot to become a gufunc -- in particular, in combination with Blake's work on ufunc overrides, this would allow np.dot() to work on scipy.sparse matrices. - pretty soon we'll want to turn things like 'mean' into gufuncs too, for the same reason. ----- Okay, what to do? The gufunc convention actually seems like the right one to me. This is unfortunate, because it's also the only one we could easily change :-(. But we obviously want our vectorized operations to do broadcasting by default, both for consistency with scalar ufuncs, and because it just seems to be the most useful convention. So that rules out the np.dot convention.
Then given that we're broadcasting, the two options are to pick intrinsic axes from the right like gufuncs do, or to follow the ufunc.reduce convention and pick intrinsic axes from the left. But picking from the left seems confusing to me, because broadcasting is itself a right-to-left operation. This doesn't matter for .reduce and such because they only take one input, but for something like 'dot', it means you can have 1's inserted in the "middle" of the array, and then broadcast up to a higher dimension. Compare: gu_dot_leftwards(ones((10, 11, 4)), ones((11, 12, 3, 4))) -> (10, 12, 3, 4) versus gu_dot_rightwards(ones((4, 10, 11)), ones((3, 4, 11, 12))) -> (3, 4, 10, 12) To me, it's easier to figure out which axes end up where in the second case. Working from the right, we take two axes to be the intrinsic axes, then we match up the next axis (4 matches 4), then we append a 1 and match up the last axis (1 broadcasts to match 3). So: QUESTION 1: does that sound right: that in a perfect world, the current gufunc convention would be the only one, and that's what we should work towards, at least in the cases where that's possible? QUESTION 2: Assuming that's right, it would be *really nice* if we could at least get np.dot onto our new convention, for consistency with the rest of np.linalg, and to allow it to be overridden. I'm sort of terrified to touch np.dot's API, but the only cases where it would act differently is when *both* arguments have *3 or more dimensions*, and I guess there are very very few users who fall into that category. So maybe we could start raising some big FutureWarnings for this case in the next release, and eventually switch? (I'm even more terrified of trying to mess with np.sum or np.add.reduce, so I'll leave that alone for now -- maybe we're just stuck with them. And at least they do already go through the ufunc machinery.) -n From cjwilliams43 at gmail.com Thu Jul 18 09:18:59 2013 From: cjwilliams43 at gmail.com (Colin J. Williams) Date: Thu, 18 Jul 2013 09:18:59 -0400 Subject: [Numpy-discussion] User Guide Message-ID: <51E7EB43.7010904@gmail.com> An HTML attachment was scrubbed... URL: From sebastian at sipsolutions.net Thu Jul 18 09:23:39 2013 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Thu, 18 Jul 2013 15:23:39 +0200 Subject: [Numpy-discussion] Bringing order to higher dimensional operations In-Reply-To: References: Message-ID: <1374153819.14751.17.camel@sebastian-laptop> On Thu, 2013-07-18 at 13:52 +0100, Nathaniel Smith wrote: > Hi all, > > > So: > > QUESTION 1: does that sound right: that in a perfect world, the > current gufunc convention would be the only one, and that's what we > should work towards, at least in the cases where that's possible? > Sounds right to me, ufunc/gufunc broadcasting assumes the "inner" dimensions are the right-most. Since we are normally in C-order arrays, this also seems the sensible way if you consider the memory layout. > QUESTION 2: Assuming that's right, it would be *really nice* if we > could at least get np.dot onto our new convention, for consistency > with the rest of np.linalg, and to allow it to be overridden. I'm sort > of terrified to touch np.dot's API, but the only cases where it would > act differently is when *both* arguments have *3 or more dimensions*, > and I guess there are very very few users who fall into that category. > So maybe we could start raising some big FutureWarnings for this case > in the next release, and eventually switch? 
> It is noble to try to get dot to use the gufunc convention, but if you look at the new gufunc linalg functions, they already have to have some weird tricks in the case of np.linalg.solve. It is so difficult because of the fact that dot is basically a combination of many functions: o vector * vector -> vector o vector * matrix -> matrix (add dimensions to vector on right) o matrix * vector -> matrix (add dimensions to vector on left) o matrix * matrix -> matrix plus scalar cases. I somewhat believe we should not touch dot, or deprecate anything but the most basic dot functionality. Then we can point to matrix_multiply, inner1d, etc. which are gufuncs (even if they are not exposed at this time). The whole dance that is already done for np.linalg.solve right now is not pretty there, and it will be worse for dot. Because dot is basically overloaded, marrying it with the broadcasting machinery in a general way is impossible. > (I'm even more terrified of trying to mess with np.sum or > np.add.reduce, so I'll leave that alone for now -- maybe we're just > stuck with them. And at least they do already go through the ufunc > machinery.) I did not understand where the inconsistency/problem for the reductions is. - Sebastian > > -n > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From andrew.collette at gmail.com Thu Jul 18 09:23:59 2013 From: andrew.collette at gmail.com (Andrew Collette) Date: Thu, 18 Jul 2013 07:23:59 -0600 Subject: [Numpy-discussion] ANN: HDF5 for Python (h5py) 2.2 BETA Message-ID: Announcing HDF5 for Python (h5py) 2.2.0 BETA ============================================ We are proud to announce that HDF5 for Python 2.2.0 (beta) is now available. Because of the large number of new features in this release, we are actively seeking community feedback over the (2-week) beta period. The h5py package is a Pythonic interface to the HDF5 binary data format. It lets you store huge amounts of numerical data, and easily manipulate that data from NumPy. For example, you can slice into multi-terabyte datasets stored on disk, as if they were real NumPy arrays. Thousands of datasets can be stored in a single file, categorized and tagged however you want. H5py uses straightforward NumPy and Python metaphors, like dictionary and NumPy array syntax. For example, you can iterate over datasets in a file, or check out the .shape or .dtype attributes of datasets. You don't need to know anything special about HDF5 to get started. Documentation and download links are available at: http://www.h5py.org Parallel HDF5 ============= This version of h5py introduces support for MPI/Parallel HDF5, using the mpi4py package. Parallel HDF5 is the native method for sharing files and objects across multiple processes. Unlike "multiprocessing" based solutions, all processes in an MPI-based program can read from and write to the same shared HDF5 file.
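To give a flavor of what this looks like, here is a minimal sketch (it assumes h5py was built against an MPI-enabled HDF5, and that you launch it with something like "mpiexec -n 4 python demo.py"; the file and dataset names are just placeholders):

from mpi4py import MPI
import h5py

rank = MPI.COMM_WORLD.rank   # each process writes one element of the shared dataset

f = h5py.File('parallel_test.hdf5', 'w', driver='mpio', comm=MPI.COMM_WORLD)
dset = f.create_dataset('test', (4,), dtype='i')
dset[rank] = rank
f.close()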
There is a guide to using Parallel HDF5 at the h5py web site: http://h5py.org/docs/build/html/topics/mpi.html Other new features ================== * Support for Python 3.3 * Support for 16-bit "mini" floats * Access to the HDF5 scale-offset filter * Field names are now allowed when writing to a dataset * Region references now preserve the shape of their selections * File-resident "committed" types can be linked to datasets and attributes * A new "move" method on Group objects * Many new options for Group.copy From njs at pobox.com Thu Jul 18 09:36:50 2013 From: njs at pobox.com (Nathaniel Smith) Date: Thu, 18 Jul 2013 14:36:50 +0100 Subject: [Numpy-discussion] Speedup by avoiding memory alloc twice in scalar array In-Reply-To: References: Message-ID: On Wed, Jul 17, 2013 at 5:57 PM, Fr?d?ric Bastien wrote: > On Wed, Jul 17, 2013 at 10:39 AM, Nathaniel Smith wrote: >> > >> > On Tue, Jul 16, 2013 at 11:55 AM, Nathaniel Smith wrote: >> It's entirely possible I misunderstood, so let's see if we can work it >> out. I know that you want to assign to the ->data pointer in a >> PyArrayObject, right? That's what caused some trouble with the 1.7 API >> deprecations, which were trying to prevent direct access to this >> field? Creating a new array given a pointer to a memory region is no >> problem, and obviously will be supported regardless of any >> optimizations. But if that's all you were doing then you shouldn't >> have run into the deprecation problem. Or maybe I'm misremembering! > > What is currently done at only 1 place is to create a new PyArrayObject with > a given ptr. So NumPy don't do the allocation. We later change that ptr to > another one. Hmm, OK, so that would still work. If the array has the OWNDATA flag set (or you otherwise know where the data came from), then swapping the data pointer would still work. The change would be that in most cases when asking numpy to allocate a new array from scratch, the OWNDATA flag would not be set. That's because the OWNDATA flag really means "when this object is deallocated, call free(self->data)", but if we allocate the array struct and the data buffer together in a single memory region, then deallocating the object will automatically cause the data buffer to be deallocated as well, without the array destructor having to take any special effort. > It is the change to the ptr of the just created PyArrayObject that caused > problem with the interface deprecation. I fixed all other problem releated > to the deprecation (mostly just rename of function/macro). But I didn't > fixed this one yet. I would need to change the logic to compute the final > ptr before creating the PyArrayObject object and create it with the final > data ptr. But in call cases, NumPy didn't allocated data memory for this > object, so this case don't block your optimization. Right. > One thing in our optimization "wish list" is to reuse allocated > PyArrayObject between Theano function call for intermediate results(so > completly under Theano control). This could be useful in particular for > reshape/transpose/subtensor. Those functions are pretty fast and from > memory, I already found the allocation time was significant. But in those > cases, it is on PyArrayObject that are views, so the metadata and the data > would be in different memory region in all cases. > > The other cases of optimization "wish list" is if we want to reuse the > PyArrayObject when the shape isn't the good one (but the number of > dimensions is the same). 
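(Side note for anyone following along from Python rather than C: the flag in question is the one you can see as .flags.owndata. A throwaway example:

>>> a = np.zeros(3)    # numpy allocated this buffer, so it will also free it
>>> a.flags.owndata
True
>>> b = a[::2]         # a view borrows someone else's buffer
>>> b.flags.owndata
False

Under the scheme sketched above, most freshly created arrays would stop reporting True here, even though their memory would still be released correctly when the array object goes away.)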
If we do that for operation like addition, we will > need to use PyArray_Resize(). This will be done on PyArrayObject whose data > memory was allocated by NumPy. So if you do one memory allowcation for > metadata and data, just make sure that PyArray_Resize() will handle that > correctly. I'm not sure I follow the details here, but it does turn out that a really surprising amount of time in PyArray_NewFromDescr is spent in just calculating and writing out the shape and strides buffers, so for programs that e.g. use hundreds of small 3-element arrays to represent points in space, re-using even these buffers might be a big win... > On the usefulness of doing only 1 memory allocation, on our old gpu ndarray, > we where doing 2 alloc on the GPU, one for metadata and one for data. I > removed this, as this was a bottleneck. allocation on the CPU are faster the > on the GPU, but this is still something that is slow except if you reuse > memory. Do PyMem_Malloc, reuse previous small allocation? Yes, at least in theory PyMem_Malloc is highly-optimized for small buffer re-use. (For requests >256 bytes it just calls malloc().) And it's possible to define type-specific freelists; not sure if there's any value in doing that for PyArrayObjects. See Objects/obmalloc.c in the Python source tree. -n From mdroe at stsci.edu Thu Jul 18 09:42:42 2013 From: mdroe at stsci.edu (Michael Droettboom) Date: Thu, 18 Jul 2013 09:42:42 -0400 Subject: [Numpy-discussion] Results of matplotlib user survey 2013 Message-ID: <51E7F0D2.8040807@stsci.edu> We have had 508 responses to the matplotlib user survey. Quite a nice turnout! You can view the results here: https://docs.google.com/spreadsheet/viewanalytics?key=0AjrPjlTMRTwTdHpQS25pcTZIRWdqX0pNckNSU01sMHc&gridId=0#chart and from there, you can access the complete raw results. I will be doing more analysis of the results over the coming days and weeks, including dedup'ing some of the responses and converting some of the free-form responses into github issues etc. Volunteers to help with this are of course welcome! Cheers, Mike -------------- next part -------------- An HTML attachment was scrubbed... URL: From stefan at sun.ac.za Thu Jul 18 10:18:56 2013 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Thu, 18 Jul 2013 16:18:56 +0200 Subject: [Numpy-discussion] add .H attribute? In-Reply-To: References: <51D98902.1090403@gmail.com> <51E17280.2030105@gmail.com> Message-ID: On Sat, Jul 13, 2013 at 7:46 PM, Nathaniel Smith wrote: > Why not just write > > def H(a): > return a.conj().T It's hard to convince students that this is the Best Way of doing things in NumPy. Why, they ask, can you do it using a' in MATLAB, then? I've tripped over this one before, since it's not the kind of thing you imagine would be unimplemented, and then spend some time trying to find it. St?fan From mdroe at stsci.edu Thu Jul 18 10:20:00 2013 From: mdroe at stsci.edu (Michael Droettboom) Date: Thu, 18 Jul 2013 10:20:00 -0400 Subject: [Numpy-discussion] Results of matplotlib user survey 2013 In-Reply-To: <51E7F0D2.8040807@stsci.edu> References: <51E7F0D2.8040807@stsci.edu> Message-ID: <51E7F990.3060307@stsci.edu> Apologies: I didn't realize the link to the raw results only exists for users with edit permissions. The public URL for the raw results is: https://docs.google.com/spreadsheet/ccc?key=0AjrPjlTMRTwTdHpQS25pcTZIRWdqX0pNckNSU01sMHc&usp=sharing Mike On 07/18/2013 09:42 AM, Michael Droettboom wrote: > We have had 508 responses to the matplotlib user survey. 
Quite a nice > turnout! > > You can view the results here: > > https://docs.google.com/spreadsheet/viewanalytics?key=0AjrPjlTMRTwTdHpQS25pcTZIRWdqX0pNckNSU01sMHc&gridId=0#chart > > and from there, you can access the complete raw results. > > I will be doing more analysis of the results over the coming days and > weeks, including dedup'ing some of the responses and converting some > of the free-form responses into github issues etc. Volunteers to help > with this are of course welcome! > > Cheers, > Mike -------------- next part -------------- An HTML attachment was scrubbed... URL: From alan.isaac at gmail.com Thu Jul 18 12:57:19 2013 From: alan.isaac at gmail.com (Alan G Isaac) Date: Thu, 18 Jul 2013 12:57:19 -0400 Subject: [Numpy-discussion] azip Message-ID: <51E81E6F.8010302@gmail.com> I'm floating this thought even though it is not fleshed out. On occasion, I run into the following problem: I have a rectangular array A to which I want to append a (probably) one dimensional vector b to make [A|b]. Of course this can be done as np.hstack((x,b[:,None])) (or obscurely np.r_['1,2,0',x,b]), but this has the following issues: - what if ``b`` turns out to be a list? - what if ``b`` turns out to be 2d (e.g., a column vector)? - it's a bit ugly - it is not obvious when read by others (e.g., students) (The last is a key motivation for me to talk about this.) All of which leads me to wonder if there might be profit in a numpy.azip function that takes as arguments - a tuple of arraylike iterables - an axis along which to concatenate (say, like r_ does) iterated items To make that a little clearer (but not to provide a suggested implementation), it might behave something like def azip(alst, axis=1): results = [] for tpl in zip(*alst): results.append(np.r_[tpl]) return np.rollaxis(np.array(results), axis-1) Alan Isaac From robert.kern at gmail.com Thu Jul 18 13:03:06 2013 From: robert.kern at gmail.com (Robert Kern) Date: Thu, 18 Jul 2013 18:03:06 +0100 Subject: [Numpy-discussion] azip In-Reply-To: <51E81E6F.8010302@gmail.com> References: <51E81E6F.8010302@gmail.com> Message-ID: On Thu, Jul 18, 2013 at 5:57 PM, Alan G Isaac wrote: > > I'm floating this thought even though it is not fleshed out. > > On occasion, I run into the following problem: > I have a rectangular array A to which I want to append > a (probably) one dimensional vector b to make [A|b]. > Of course this can be done as np.hstack((x,b[:,None])) > (or obscurely np.r_['1,2,0',x,b]), but this has the following issues: > > - what if ``b`` turns out to be a list? > - what if ``b`` turns out to be 2d (e.g., a column vector)? > - it's a bit ugly > - it is not obvious when read by others (e.g., students) np.column_stack([x, b]) does everything you need. -- Robert Kern -------------- next part -------------- An HTML attachment was scrubbed... URL: From alan.isaac at gmail.com Thu Jul 18 13:06:59 2013 From: alan.isaac at gmail.com (Alan G Isaac) Date: Thu, 18 Jul 2013 13:06:59 -0400 Subject: [Numpy-discussion] azip In-Reply-To: References: <51E81E6F.8010302@gmail.com> Message-ID: <51E820B3.8000208@gmail.com> On 7/18/2013 1:03 PM, Robert Kern wrote: > np.column_stack([x, b]) does everything you need. So it does. It's not referenced from the hstack or concatenate documentation. Thanks! 
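A quick sanity check with throwaway inputs (not my real data), in case anyone finds this thread later:

>>> A = np.zeros((3, 2))
>>> np.column_stack((A, [1, 2, 3])).shape         # a plain list is fine
(3, 3)
>>> np.column_stack((A, np.ones((3, 1)))).shape   # so is a 2d column vector
(3, 3)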
Alan From stefan at sun.ac.za Thu Jul 18 13:14:06 2013 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Thu, 18 Jul 2013 19:14:06 +0200 Subject: [Numpy-discussion] azip In-Reply-To: <51E820B3.8000208@gmail.com> References: <51E81E6F.8010302@gmail.com> <51E820B3.8000208@gmail.com> Message-ID: On Thu, Jul 18, 2013 at 7:06 PM, Alan G Isaac wrote: > On 7/18/2013 1:03 PM, Robert Kern wrote: >> np.column_stack([x, b]) does everything you need. > > So it does. > > It's not referenced from the hstack or concatenate documentation. A pull request would fix all of that in seconds! GitHub now allows online editing, and provides a one-click option for creating the PR. St?fan From ben.root at ou.edu Thu Jul 18 13:18:06 2013 From: ben.root at ou.edu (Benjamin Root) Date: Thu, 18 Jul 2013 13:18:06 -0400 Subject: [Numpy-discussion] azip In-Reply-To: References: <51E81E6F.8010302@gmail.com> <51E820B3.8000208@gmail.com> Message-ID: Forgive my ignorance, but has numpy and scipy stopped doing that weird doc editing thing that existed back in the days of Trac? I have actually held back on submitting doc edits because I hated using that thing so much. -------------- next part -------------- An HTML attachment was scrubbed... URL: From stefan at sun.ac.za Thu Jul 18 13:27:03 2013 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Thu, 18 Jul 2013 19:27:03 +0200 Subject: [Numpy-discussion] azip In-Reply-To: References: <51E81E6F.8010302@gmail.com> <51E820B3.8000208@gmail.com> Message-ID: Hi Ben On Thu, Jul 18, 2013 at 7:18 PM, Benjamin Root wrote: > Forgive my ignorance, but has numpy and scipy stopped doing that weird doc > editing thing that existed back in the days of Trac? I have actually held > back on submitting doc edits because I hated using that thing so much. That thing helps people without hacking experience to contribute, but you are welcome to issue pull-requests instead. St?fan From alan.isaac at gmail.com Thu Jul 18 13:50:42 2013 From: alan.isaac at gmail.com (Alan G Isaac) Date: Thu, 18 Jul 2013 13:50:42 -0400 Subject: [Numpy-discussion] azip In-Reply-To: References: <51E81E6F.8010302@gmail.com> Message-ID: <51E82AF2.60905@gmail.com> On 7/18/2013 1:03 PM, Robert Kern wrote: > np.column_stack([x, b]) does everything you need. I am curious: why is column_stack in numpy/lib/shape_base.py while hstack and vstack are in numpy/core/shape_base.py ? Thanks, Alan From pav at iki.fi Thu Jul 18 13:51:08 2013 From: pav at iki.fi (Pauli Virtanen) Date: Thu, 18 Jul 2013 20:51:08 +0300 Subject: [Numpy-discussion] azip In-Reply-To: References: <51E81E6F.8010302@gmail.com> <51E820B3.8000208@gmail.com> Message-ID: 18.07.2013 20:18, Benjamin Root kirjoitti: > Forgive my ignorance, but has numpy and scipy stopped doing that weird > doc editing thing that existed back in the days of Trac? I have actually > held back on submitting doc edits because I hated using that thing so much. You were never required to use it. -- Pauli Virtanen From ben.root at ou.edu Thu Jul 18 14:11:46 2013 From: ben.root at ou.edu (Benjamin Root) Date: Thu, 18 Jul 2013 14:11:46 -0400 Subject: [Numpy-discussion] azip In-Reply-To: References: <51E81E6F.8010302@gmail.com> <51E820B3.8000208@gmail.com> Message-ID: Well, that's nice to know now. However, I distinctly remember being told that any changes made to the docstrings directly in the source would end up getting replaced by whatever was in the doc edit system whenever a merge from it happens. 
Therefore, if one wanted their edits to be persistent, they had to submit it through the doc edit system. Note, much of my animosity towards the doc edit system was due to issues with the scipy.org being so sluggish back then, and the length of time it took for any edits to finally make it down to the docstrings. Now that scipy.org is much more responsive, and that numpy and scipy has moved on to git, perhaps those two issues are gone now? Sorry for hijacking the thread, this is just the first I am hearing that one can submit documentation edits via PRs and was surprised. Cheers! Ben Root On Thu, Jul 18, 2013 at 1:51 PM, Pauli Virtanen wrote: > 18.07.2013 20:18, Benjamin Root kirjoitti: > > Forgive my ignorance, but has numpy and scipy stopped doing that weird > > doc editing thing that existed back in the days of Trac? I have actually > > held back on submitting doc edits because I hated using that thing so > much. > > You were never required to use it. > > -- > Pauli Virtanen > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From pav at iki.fi Thu Jul 18 14:21:19 2013 From: pav at iki.fi (Pauli Virtanen) Date: Thu, 18 Jul 2013 21:21:19 +0300 Subject: [Numpy-discussion] azip In-Reply-To: References: <51E81E6F.8010302@gmail.com> <51E820B3.8000208@gmail.com> Message-ID: 18.07.2013 21:11, Benjamin Root kirjoitti: > Well, that's nice to know now. However, I distinctly remember being told > that any changes made to the docstrings directly in the source would end > up getting replaced by whatever was in the doc edit system whenever a > merge from it happens. Therefore, if one wanted their edits to be > persistent, they had to submit it through the doc edit system. I think there must have been some misunderstanding here: the doc editor works similarly to VCS, in that it will detect merge conflicts and require someone to manually resolve conflicts if the docstring in the source code has been changed. -- Pauli Virtanen From smortaz at exchange.microsoft.com Thu Jul 18 18:06:40 2013 From: smortaz at exchange.microsoft.com (Shahrokh Mortazavi) Date: Thu, 18 Jul 2013 22:06:40 +0000 Subject: [Numpy-discussion] Mixed Python + C debugging support in Visual Studio Message-ID: Hi folks, 1st time poster - apologies if I'm breaking any protocols... We were told that this would be a good alias to announce this on: a few Python & OSS enthusiasts and Microsoft have created a plug-in for Visual Studio that enables Python <-> C/C++ debugging. You may find this useful for debugging your extension modules. A quick video overview of the mixed mode debugging feature: http://www.youtube.com/watch?v=wvJaKQ94lBY&hd=1 (HD) Documentation: https://pytools.codeplex.com/wikipage?title=Mixed-mode%20debugging Python Tools for Visual Studio is free (and OSS): http://pytools.codeplex.com Cheers, s -------------- next part -------------- An HTML attachment was scrubbed... URL: From rob.clewley at gmail.com Thu Jul 18 18:21:24 2013 From: rob.clewley at gmail.com (Rob Clewley) Date: Thu, 18 Jul 2013 18:21:24 -0400 Subject: [Numpy-discussion] User Guide In-Reply-To: <51E7EB43.7010904@gmail.com> References: <51E7EB43.7010904@gmail.com> Message-ID: Hi, I see the desire for stylistic improvement by removing the awkward parens but your correction has incorrect grammar. 
One cannot have "arrays of Python," nor are Numpy objects a subset of "Python" (because Python is not a set) -- both of which are what your sentence technically states. I.e., the commas are in the wrong place. You could say "The exception: one can have arrays of python objects (including those from numpy) thereby allowing for arrays of different sized elements." but I think it is even clear to just unpack this a bit more with "The exception: one can have arrays of python objects, including numpy objects, which allows arrays to contain different sized elements." In my experience, attempting to be extremely concise in technical writing is a common cause of awkward grammar problems like this. I do it all the time :) -Rob On Thu, Jul 18, 2013 at 9:18 AM, Colin J. Williams wrote: > Returning to numpy after a while away, I'm impressed with the style and > content of the User Guide and the Reference. This is to offer a Guide > correction - I couldn't figure out how to offer the correction on-line. > > What is Numpy? > > > Suggest: > > "The exception: one can have arrays of (Python, including NumPy) objects, > thereby allowing for arrays of different sized elements." > > to: > > The exception: one can have arrays of Python, including NumPy objects, > thereby allowing for arrays of different sized elements. > > Colin W. > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -- Robert Clewley, Ph.D. Assistant Professor Neuroscience Institute and Department of Mathematics and Statistics Georgia State University PO Box 5030 Atlanta, GA 30302, USA tel: 404-413-6420 fax: 404-413-5446 http://neuroscience.gsu.edu/rclewley.html From lists at onerussian.com Thu Jul 18 22:49:20 2013 From: lists at onerussian.com (Yaroslav Halchenko) Date: Thu, 18 Jul 2013 22:49:20 -0400 Subject: [Numpy-discussion] the mean, var, std of non-arrays Message-ID: <20130719024920.GO27621@onerussian.com> Hi everyone, Some of my elderly code stopped working upon upgrades of numpy and upcoming pandas: https://github.com/pydata/pandas/issues/4290 so I have looked at the code of 2481 def mean(a, axis=None, dtype=None, out=None, keepdims=False): 2482 """ ... 2489 Parameters 2490 ---------- 2491 a : array_like 2492 Array containing numbers whose mean is desired. If `a` is not an 2493 array, a conversion is attempted. ... 2555 """ 2556 if type(a) is not mu.ndarray: 2557 try: 2558 mean = a.mean 2559 return mean(axis=axis, dtype=dtype, out=out) 2560 except AttributeError: 2561 pass 2562 2563 return _methods._mean(a, axis=axis, dtype=dtype, 2564 out=out, keepdims=keepdims) here 'array_like'ness is checked by a having mean function. Then it is assumed that it has the same definition as ndarray, including dtype keyword argument. Not sure anyways if my direct numpy.mean application to pandas DataFrame is "kosher" -- initially I just assumed that any argument is asanyarray'ed first -- but I think here catching TypeError for those incompatible .mean's would not hurt either. What do you think? Similar logic applies to mean cousins (var, std, ...?) decorated around _methods implementations. -- Yaroslav O. Halchenko, Ph.D. http://neuro.debian.net http://www.pymvpa.org http://www.fail2ban.org Senior Research Associate, Psychological and Brain Sciences Dept. 
Dartmouth College, 419 Moore Hall, Hinman Box 6207, Hanover, NH 03755 Phone: +1 (603) 646-9834 Fax: +1 (603) 646-1419 WWW: http://www.linkedin.com/in/yarik From jsseabold at gmail.com Thu Jul 18 23:18:49 2013 From: jsseabold at gmail.com (Skipper Seabold) Date: Thu, 18 Jul 2013 23:18:49 -0400 Subject: [Numpy-discussion] the mean, var, std of non-arrays In-Reply-To: <20130719024920.GO27621@onerussian.com> References: <20130719024920.GO27621@onerussian.com> Message-ID: On Thu, Jul 18, 2013 at 10:49 PM, Yaroslav Halchenko wrote: > Hi everyone, > > Some of my elderly code stopped working upon upgrades of numpy and > upcoming pandas: https://github.com/pydata/pandas/issues/4290 so I have > looked at the code of > > 2481 def mean(a, axis=None, dtype=None, out=None, keepdims=False): > 2482 """ > ... > 2489 Parameters > 2490 ---------- > 2491 a : array_like > 2492 Array containing numbers whose mean is desired. If `a` is > not an > 2493 array, a conversion is attempted. > ... > 2555 """ > 2556 if type(a) is not mu.ndarray: > 2557 try: > 2558 mean = a.mean > 2559 return mean(axis=axis, dtype=dtype, out=out) > 2560 except AttributeError: > 2561 pass > 2562 > 2563 return _methods._mean(a, axis=axis, dtype=dtype, > 2564 out=out, keepdims=keepdims) > > here 'array_like'ness is checked by a having mean function. Then it is > assumed > that it has the same definition as ndarray, including dtype keyword > argument. > > Not sure anyways if my direct numpy.mean application to pandas DataFrame is > "kosher" -- initially I just assumed that any argument is asanyarray'ed > first > -- but I think here catching TypeError for those incompatible .mean's > would not > hurt either. What do you think? Similar logic applies to mean cousins > (var, > std, ...?) decorated around _methods implementations. Related? From a while ago. https://github.com/numpy/numpy/pull/160 Skipper -------------- next part -------------- An HTML attachment was scrubbed... URL: From lists at onerussian.com Thu Jul 18 23:24:42 2013 From: lists at onerussian.com (Yaroslav Halchenko) Date: Thu, 18 Jul 2013 23:24:42 -0400 Subject: [Numpy-discussion] the mean, var, std of non-arrays In-Reply-To: References: <20130719024920.GO27621@onerussian.com> Message-ID: <20130719032442.GP27621@onerussian.com> On Thu, 18 Jul 2013, Skipper Seabold wrote: > Not sure anyways if my direct numpy.mean application to pandas DataFrame > is > "kosher" -- initially I just assumed that any argument is asanyarray'ed > first > -- but I think here catching TypeError for those incompatible .mean's > would not > hurt either. ?What do you think? ?Similar logic applies to mean cousins > (var, > std, ...?) decorated around _methods implementations. > Related? From a while ago. > [3]https://github.com/numpy/numpy/pull/160 yeah... That is how I thought "it is working", but I guess it was left without asanyarraying for additional flexibility/performance so any array-like object could be used, not just ndarray derived classes. -- Yaroslav O. Halchenko, Ph.D. http://neuro.debian.net http://www.pymvpa.org http://www.fail2ban.org Senior Research Associate, Psychological and Brain Sciences Dept. 
Dartmouth College, 419 Moore Hall, Hinman Box 6207, Hanover, NH 03755 Phone: +1 (603) 646-9834 Fax: +1 (603) 646-1419 WWW: http://www.linkedin.com/in/yarik From charlesr.harris at gmail.com Fri Jul 19 00:12:45 2013 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 18 Jul 2013 22:12:45 -0600 Subject: [Numpy-discussion] the mean, var, std of non-arrays In-Reply-To: <20130719032442.GP27621@onerussian.com> References: <20130719024920.GO27621@onerussian.com> <20130719032442.GP27621@onerussian.com> Message-ID: On Thu, Jul 18, 2013 at 9:24 PM, Yaroslav Halchenko wrote: > > On Thu, 18 Jul 2013, Skipper Seabold wrote: > > > Not sure anyways if my direct numpy.mean application to pandas > DataFrame > > is > > "kosher" -- initially I just assumed that any argument is > asanyarray'ed > > first > > -- but I think here catching TypeError for those incompatible > .mean's > > would not > > hurt either. ?What do you think? ?Similar logic applies to mean > cousins > > (var, > > std, ...?) decorated around _methods implementations. > > > Related? From a while ago. > > [3]https://github.com/numpy/numpy/pull/160 > > yeah... That is how I thought "it is working", but I guess it was left > without asanyarraying for additional flexibility/performance so any > array-like object could be used, not just ndarray derived classes. > Speaking of which, there is a PR for nan{mean, var, std) that you might want to check before it gets committed. There might be some modifications that you would want to add. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From stefan at sun.ac.za Fri Jul 19 04:20:49 2013 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Fri, 19 Jul 2013 10:20:49 +0200 Subject: [Numpy-discussion] User Guide In-Reply-To: References: <51E7EB43.7010904@gmail.com> Message-ID: On Fri, Jul 19, 2013 at 12:21 AM, Rob Clewley wrote: > "The exception: one can have arrays of python objects, including numpy > objects, which allows arrays to contain different sized elements." What are numpy objects? "numpy objects" -> "numpy ndarrays" or "numpy ndarray objects"? St?fan From njs at pobox.com Fri Jul 19 11:14:27 2013 From: njs at pobox.com (Nathaniel Smith) Date: Fri, 19 Jul 2013 16:14:27 +0100 Subject: [Numpy-discussion] Bringing order to higher dimensional operations In-Reply-To: <1374153819.14751.17.camel@sebastian-laptop> References: <1374153819.14751.17.camel@sebastian-laptop> Message-ID: On Thu, Jul 18, 2013 at 2:23 PM, Sebastian Berg wrote: > On Thu, 2013-07-18 at 13:52 +0100, Nathaniel Smith wrote: >> Hi all, >> > >> >> So: >> >> QUESTION 1: does that sound right: that in a perfect world, the >> current gufunc convention would be the only one, and that's what we >> should work towards, at least in the cases where that's possible? >> > > Sounds right to me, ufunc/gufunc broadcasting assumes the "inner" > dimensions are the right-most. Since we are normally in C-order arrays, > this also seems the sensible way if you consider the memory layout. > >> QUESTION 2: Assuming that's right, it would be *really nice* if we >> could at least get np.dot onto our new convention, for consistency >> with the rest of np.linalg, and to allow it to be overridden. I'm sort >> of terrified to touch np.dot's API, but the only cases where it would >> act differently is when *both* arguments have *3 or more dimensions*, >> and I guess there are very very few users who fall into that category. 
>> So maybe we could start raising some big FutureWarnings for this case >> in the next release, and eventually switch? >> > > It is noble to try to get do to use the gufunc convention, but if you > look at the new gufunc linalg functions, they already have to have some > weird tricks in the case of np.linalg.solve. > It is so difficult because of the fact that dot is basically a > combination of many functions: > o vector * vector -> vector > o vector * matrix -> matrix (add dimensions to vector on right) > o matrix * vector -> matrix (add dimensions to vector on left) > o matrix * matrix -> matrix > plus scalar cases. Oh ugh, I forgot about all those special cases. > I somewhat believe we should not touch dot, or deprecate anything but > the most basic dot functionality. Then we can point to matrix_multiply, > inner1d, etc. which are gufuncs (even if they are not exposed at this > time). The whole dance that is already done for np.linalg.solve right > now is not pretty there, and it will be worse for dot. Because dot is > basically overloaded, marrying it with the broadcasting machinery in a > general way is impossible. While it would be kind of nice if we could eventually make isinstance(np.dot, np.ufunc) be True, I'm not so bothered if it remains a wrapper around gufuncs (like the np.linalg wrappers currently are). Most of these special cases, while ugly, can be handled perfectly well by this sort of mechanism. What I'm most bothered about is pseudo-outer case: np.dot(array with ndim >2, array with ndim >2) This simply *can't* be emulated with a gufunc. And as long as that's true it's going to be very hard to get 'dot' to play along with the general ufunc machinery. So that's specifically the case I was talking about in question 2. >> (I'm even more terrified of trying to mess with np.sum or >> np.add.reduce, so I'll leave that alone for now -- maybe we're just >> stuck with them. And at least they do already go through the ufunc >> machinery.) > > I did not understand where the inconsistency/problem for the reductions > is. What I mean is: Suppose we wrote a gufunc for 'sum', where the intrinsic operation took a vector and returned a scalar. (E.g. we want to implement one of the specialized algorithms for vector summation, like Kahan summation, which can be more accurate than applying scalar addition repeatedly.) Then we'd have: np.sum(ones((2, 3))).shape == () np.add.reduce(ones((2, 3))).shape == (3,) gufunc_sum(ones((2, 3))).shape == (2,) These are three names for exactly the same underlying function... but they all have different defaults for how they vectorize. -n From njs at pobox.com Fri Jul 19 11:31:31 2013 From: njs at pobox.com (Nathaniel Smith) Date: Fri, 19 Jul 2013 16:31:31 +0100 Subject: [Numpy-discussion] Bringing order to higher dimensional operations In-Reply-To: <1374153819.14751.17.camel@sebastian-laptop> References: <1374153819.14751.17.camel@sebastian-laptop> Message-ID: On Thu, Jul 18, 2013 at 2:23 PM, Sebastian Berg wrote: > It is so difficult because of the fact that dot is basically a > combination of many functions: > o vector * vector -> vector > o vector * matrix -> matrix (add dimensions to vector on right) > o matrix * vector -> matrix (add dimensions to vector on left) > o matrix * matrix -> matrix > plus scalar cases. Though, just throwing this out there for the archives since I was thinking about it... 
I think we *could* consolidate all dot's functionality into a single gufunc, with a few small changes: 1) Deprecate and get rid of the scalar special cases. (For those following along: right now, np.dot(10, array) does scalar multiplication, but this doesn't make much sense conceptually, it's not documented, and I don't think anyone uses it. Except maybe np.matrix.__mul__, but that could be fixed.) 2) Deprecate the strange "broadcasting" behaviour for high-dimensional inputs, in favor of the gufunc version suggested in the previous email. That leaves the vector * vector, vector * matrix, matrix * vector, matrix * matrix cases. To handle these: 3) Extend the gufunc machinery to understand the idea that some core dimensions are allowed to take on a special "nonexistent" size. So the signature for dot would be: (m*,k) x (k, n*) -> (m*, n*) where '*' denotes dimensions who are allowed to take on the "nonexistent" size if necessary. So dot(ones((2, 3)), ones((3, 4))) would have m = 2 k = 3 n = 4 and produce an output with shape (m, n) = (2, 4). But dot(ones((2, 3)), ones((3,))) would have m = 2 k = 3 n = and produce an output with shape (m, n) = (2, ) = (2,). And dot(ones((3,)), ones((3,))) would have m = k = 3 n = and produce an output with shape (m, n) = (, ) = (), i.e., dot(vector, vector) would return a scalar. I'm not sure if there are any other cases where this would be useful, but even if it were just for 'dot', that's still a pretty important case that might justify the mechanism all on its own. -n From stefan at sun.ac.za Fri Jul 19 11:32:11 2013 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Fri, 19 Jul 2013 17:32:11 +0200 Subject: [Numpy-discussion] Bringing order to higher dimensional operations In-Reply-To: References: Message-ID: On Thu, Jul 18, 2013 at 2:52 PM, Nathaniel Smith wrote: > Compare: > gu_dot_leftwards(ones((10, 11, 4)), ones((11, 12, 3, 4))) -> (10, 12, 3, 4) > versus > gu_dot_rightwards(ones((4, 10, 11)), ones((3, 4, 11, 12))) -> (3, 4, 10, 12) The second makes quite a bit more sense to me, and fits with the current way we match broadcasting dimensions (align to the right, match right to left). The np.dot outer example you gave and other exceptions like that will probably give us headaches in the future, so I'd opt for moving away from them. The way ellipses are broadcast, well that's a battle for another day. St?fan From stefan at sun.ac.za Fri Jul 19 11:36:56 2013 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Fri, 19 Jul 2013 17:36:56 +0200 Subject: [Numpy-discussion] Bringing order to higher dimensional operations In-Reply-To: References: <1374153819.14751.17.camel@sebastian-laptop> Message-ID: On Fri, Jul 19, 2013 at 5:31 PM, Nathaniel Smith wrote: > 3) Extend the gufunc machinery to understand the idea that some core > dimensions are allowed to take on a special "nonexistent" size. So the > signature for dot would be: > (m*,k) x (k, n*) -> (m*, n*) > where '*' denotes dimensions who are allowed to take on the > "nonexistent" size if necessary. So dot(ones((2, 3)), ones((3, 4))) > would have > m = 2 > k = 3 > n = 4 > and produce an output with shape (m, n) = (2, 4). But dot(ones((2, > 3)), ones((3,))) would have > m = 2 > k = 3 > n = > and produce an output with shape (m, n) = (2, ) = (2,). And > dot(ones((3,)), ones((3,))) would have > m = > k = 3 > n = > and produce an output with shape (m, n) = (, ) = (), > i.e., dot(vector, vector) would return a scalar. 
This looks like a fairly clean solution; it could be implemented in a shape pre- and post-processing step, where we pad the array dimensions to match the full signature, and remove it again afterwards. St?fan From sebastian at sipsolutions.net Fri Jul 19 12:05:00 2013 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Fri, 19 Jul 2013 18:05:00 +0200 Subject: [Numpy-discussion] Bringing order to higher dimensional operations In-Reply-To: References: <1374153819.14751.17.camel@sebastian-laptop> Message-ID: <1374249900.3254.28.camel@sebastian-laptop> On Fri, 2013-07-19 at 16:31 +0100, Nathaniel Smith wrote: > On Thu, Jul 18, 2013 at 2:23 PM, Sebastian Berg > wrote: > > It is so difficult because of the fact that dot is basically a > > combination of many functions: > > o vector * vector -> vector > > o vector * matrix -> matrix (add dimensions to vector on right) > > o matrix * vector -> matrix (add dimensions to vector on left) > > o matrix * matrix -> matrix > > plus scalar cases. > > Though, just throwing this out there for the archives since I was > thinking about it... > > I think we *could* consolidate all dot's functionality into a single > gufunc, with a few small changes: > > 1) Deprecate and get rid of the scalar special cases. (For those > following along: right now, np.dot(10, array) does scalar > multiplication, but this doesn't make much sense conceptually, it's > not documented, and I don't think anyone uses it. Except maybe > np.matrix.__mul__, but that could be fixed.) > > 2) Deprecate the strange "broadcasting" behaviour for high-dimensional > inputs, in favor of the gufunc version suggested in the previous > email. > > That leaves the vector * vector, vector * matrix, matrix * vector, > matrix * matrix cases. To handle these: > > 3) Extend the gufunc machinery to understand the idea that some core > dimensions are allowed to take on a special "nonexistent" size. So the > signature for dot would be: > (m*,k) x (k, n*) -> (m*, n*) > where '*' denotes dimensions who are allowed to take on the > "nonexistent" size if necessary. So dot(ones((2, 3)), ones((3, 4))) > would have > m = 2 > k = 3 > n = 4 > and produce an output with shape (m, n) = (2, 4). But dot(ones((2, > 3)), ones((3,))) would have > m = 2 > k = 3 > n = > and produce an output with shape (m, n) = (2, ) = (2,). And > dot(ones((3,)), ones((3,))) would have > m = > k = 3 > n = > and produce an output with shape (m, n) = (, ) = (), > i.e., dot(vector, vector) would return a scalar. > > I'm not sure if there are any other cases where this would be useful, > but even if it were just for 'dot', that's still a pretty important > case that might justify the mechanism all on its own. > Yeah this would work. It is basically what np.linalg.solve currently does in the preparation step. So maybe this is not that bad implemented in the machinery. The logic itself is pretty simple after all. Though it would be one of those features I would probably not want to see used a lot ;). 
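For the archives, the wrapper-level version of that pad-and-strip dance is only a handful of
lines of pure Python. Below is just a sketch of the shape handling being discussed -- the
helper name is made up, and np.einsum merely stands in for the core matrix-matrix gufunc, so
this is not how the machinery itself would implement it:

import numpy as np

def dot_padded(a, b):
    # emulate a hypothetical "(m*,k),(k,n*)->(m*,n*)" signature by padding
    # missing core dimensions to length 1 and stripping them off afterwards
    a = np.asanyarray(a)
    b = np.asanyarray(b)
    a_vec = (a.ndim == 1)
    b_vec = (b.ndim == 1)
    if a_vec:
        a = a[np.newaxis, :]      # (k,) -> (1, k): m is "nonexistent"
    if b_vec:
        b = b[:, np.newaxis]      # (k,) -> (k, 1): n is "nonexistent"
    out = np.einsum('...ij,...jk->...ik', a, b)   # stand-in for the core loop
    if a_vec:
        out = out[..., 0, :]      # strip the padded m axis again
    if b_vec:
        out = out[..., 0]         # strip the padded n axis again
    return out

print(dot_padded(np.ones(3), np.ones(3)).shape)                  # ()
print(dot_padded(np.ones((2, 3)), np.ones(3)).shape)             # (2,)
print(dot_padded(np.ones(3), np.ones((3, 4))).shape)             # (4,)
print(dot_padded(np.ones((5, 2, 3)), np.ones((5, 3, 4))).shape)  # (5, 2, 4)

(The vector-vector case comes back as a 0-d array rather than a true scalar, but the point
here is only the shape bookkeeping.)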
- Sebastian > -n > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From sebastian at sipsolutions.net Fri Jul 19 12:10:34 2013 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Fri, 19 Jul 2013 18:10:34 +0200 Subject: [Numpy-discussion] Bringing order to higher dimensional operations In-Reply-To: References: <1374153819.14751.17.camel@sebastian-laptop> Message-ID: <1374250234.3254.33.camel@sebastian-laptop> On Fri, 2013-07-19 at 16:14 +0100, Nathaniel Smith wrote: > On Thu, Jul 18, 2013 at 2:23 PM, Sebastian Berg > wrote: > > On Thu, 2013-07-18 at 13:52 +0100, Nathaniel Smith wrote: > >> Hi all, > >> > > What I mean is: Suppose we wrote a gufunc for 'sum', where the > intrinsic operation took a vector and returned a scalar. (E.g. we want > to implement one of the specialized algorithms for vector summation, > like Kahan summation, which can be more accurate than applying scalar > addition repeatedly.) > > Then we'd have: > > np.sum(ones((2, 3))).shape == () > np.add.reduce(ones((2, 3))).shape == (3,) > gufunc_sum(ones((2, 3))).shape == (2,) > Ah, indeed! So we have a different default behaviour for ufunc.reduce and all other reduce-like functions, didn't realize that. Changing that would be one huge thing... As to implementing such thing as a Kahan summation, it is true, I also can't see how it fits into the machinery. Maybe it shouldn't even be a gufunc, but we rather need a way to specialize the reduction, or tag on more information into the ufunc itself? - Sebastian > These are three names for exactly the same underlying function... but > they all have different defaults for how they vectorize. > > -n > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From cjwilliams43 at gmail.com Fri Jul 19 13:16:52 2013 From: cjwilliams43 at gmail.com (Colin J. Williams) Date: Fri, 19 Jul 2013 13:16:52 -0400 Subject: [Numpy-discussion] NumPy-Discussion Digest, Vol 82, Issue 34 In-Reply-To: References: Message-ID: <51E97484.1020001@gmail.com> An HTML attachment was scrubbed... URL: From lists at onerussian.com Fri Jul 19 13:04:12 2013 From: lists at onerussian.com (Yaroslav Halchenko) Date: Fri, 19 Jul 2013 13:04:12 -0400 Subject: [Numpy-discussion] the mean, var, std of non-arrays In-Reply-To: References: <20130719024920.GO27621@onerussian.com> <20130719032442.GP27621@onerussian.com> Message-ID: <20130719170412.GQ27621@onerussian.com> On Thu, 18 Jul 2013, Charles R Harris wrote: > yeah... ?That is how I thought "it is working", but I guess it was left > without asanyarraying for additional flexibility/performance so any > array-like object could be used, not just ndarray derived classes. > Speaking of which, there is a PR for [3]nan{mean, var, std)? that you > might want to check before it gets committed. There might be some > modifications that you would want to add. well -- the only modifications to non-nan mean was a docstring's see also. there though input is explicitly converted/copied to ndarray so no custom .mean() functions would be called, thus issue a bit orthogonal as far as I see -- Yaroslav O. Halchenko, Ph.D. http://neuro.debian.net http://www.pymvpa.org http://www.fail2ban.org Senior Research Associate, Psychological and Brain Sciences Dept. 
Dartmouth College, 419 Moore Hall, Hinman Box 6207, Hanover, NH 03755 Phone: +1 (603) 646-9834 Fax: +1 (603) 646-1419 WWW: http://www.linkedin.com/in/yarik From grb at skogoglandskap.no Fri Jul 19 14:23:00 2013 From: grb at skogoglandskap.no (Graeme B. Bell) Date: Fri, 19 Jul 2013 18:23:00 +0000 Subject: [Numpy-discussion] simpleselect v2.0 Message-ID: Hi all, I've just released version 2.0 of simple select. In brief this is a drop-in replacement for numpy.select with the following qualities: - Faster! (benchmarks 2-5x faster than numpy.select depending on use case and faster than v1.0 simpleselect) - Full broadcasting. - All bugs in numpy.select fixed, tested. - Better documented code. - Improvements to the test harness. I'm also submitting this as a pull request to the main numpy distribution, since it now covers all the functionality of numpy.select, but is faster and fixed long-standing bugs. I've tested with the numpy runtests.py as well as a test harness and unit tests of my own. Hope it's useful. Drop me a mail if you have any feedback. Have a nice weekend all, Graeme Bell From grb at skogoglandskap.no Fri Jul 19 14:23:52 2013 From: grb at skogoglandskap.no (Graeme B. Bell) Date: Fri, 19 Jul 2013 18:23:52 +0000 Subject: [Numpy-discussion] simpleselect v2.0 In-Reply-To: References: Message-ID: URL: https://github.com/gbb/numpy-simple-select > I've just released version 2.0 of simple select. In brief this is a drop-in replacement for numpy.select with the following qualities: > > - Faster! (benchmarks 2-5x faster than numpy.select depending on use case and faster than v1.0 simpleselect) > - Full broadcasting. > - All bugs in numpy.select fixed, tested. > - Better documented code. > - Improvements to the test harness. From lists at onerussian.com Fri Jul 19 18:07:54 2013 From: lists at onerussian.com (Yaroslav Halchenko) Date: Fri, 19 Jul 2013 18:07:54 -0400 Subject: [Numpy-discussion] fresh performance hits: numpy.linalg.pinv >30% slowdown In-Reply-To: <20130717035348.GN27621@onerussian.com> References: <20130506143241.GV5140@onerussian.com> <1367856232.2506.31.camel@sebastian-laptop> <20130506161153.GW5140@onerussian.com> <1367927238.23010.12.camel@sebastian-laptop> <20130701193006.GC27621@onerussian.com> <20130701215804.GG27621@onerussian.com> <20130709161007.GL27621@onerussian.com> <20130717035348.GN27621@onerussian.com> Message-ID: <20130719220754.GR27621@onerussian.com> I have just added a few more benchmarks, and here they come http://www.onerussian.com/tmp/numpy-vbench/vb_vb_linalg.html#numpy-linalg-pinv-a-float32 it seems to be very recent so my only check based on 10 commits didn't pick it up yet so they are not present in the summary table. could well be related to 80% faster det()? ;) norm was hit as well a bit earlier, might well be within these commits: https://github.com/numpy/numpy/compare/24a0aa5...29dcc54 I will rerun now benchmarking for the rest of commits (was running last in the day iirc) Cheers, On Tue, 16 Jul 2013, Yaroslav Halchenko wrote: > and to put so far reported findings into some kind of automated form, > please welcome > http://www.onerussian.com/tmp/numpy-vbench/#benchmarks-performance-analysis > This is based on a simple 1-way anova of last 10 commits and some point > in the past where 10 other commits had smallest timing and were significantly > different from the last 10 commits. 
> "Possible recent" is probably too noisy and not sure if useful -- it should > point to a closest in time (to the latest commits) diff where a > significant excursion from current performance was detected. So per se it has > nothing to do with the initial detected performance hit, but in some cases > seems still to reasonably locate commits hitting on performance. > Enjoy, -- Yaroslav O. Halchenko, Ph.D. http://neuro.debian.net http://www.pymvpa.org http://www.fail2ban.org Senior Research Associate, Psychological and Brain Sciences Dept. Dartmouth College, 419 Moore Hall, Hinman Box 6207, Hanover, NH 03755 Phone: +1 (603) 646-9834 Fax: +1 (603) 646-1419 WWW: http://www.linkedin.com/in/yarik From njs at pobox.com Fri Jul 19 18:38:14 2013 From: njs at pobox.com (Nathaniel Smith) Date: Fri, 19 Jul 2013 23:38:14 +0100 Subject: [Numpy-discussion] fresh performance hits: numpy.linalg.pinv >30% slowdown In-Reply-To: <20130719220754.GR27621@onerussian.com> References: <20130506143241.GV5140@onerussian.com> <1367856232.2506.31.camel@sebastian-laptop> <20130506161153.GW5140@onerussian.com> <1367927238.23010.12.camel@sebastian-laptop> <20130701193006.GC27621@onerussian.com> <20130701215804.GG27621@onerussian.com> <20130709161007.GL27621@onerussian.com> <20130717035348.GN27621@onerussian.com> <20130719220754.GR27621@onerussian.com> Message-ID: The biggest ~recent change in master's linalg was the switch to gufunc back ends - you might want to check for that event in your commit log. On 19 Jul 2013 23:08, "Yaroslav Halchenko" wrote: > I have just added a few more benchmarks, and here they come > > http://www.onerussian.com/tmp/numpy-vbench/vb_vb_linalg.html#numpy-linalg-pinv-a-float32 > it seems to be very recent so my only check based on 10 commits > didn't pick it up yet so they are not present in the summary table. > > could well be related to 80% faster det()? ;) > > norm was hit as well a bit earlier, might well be within these commits: > https://github.com/numpy/numpy/compare/24a0aa5...29dcc54 > I will rerun now benchmarking for the rest of commits (was running last > in the day iirc) > > Cheers, > > On Tue, 16 Jul 2013, Yaroslav Halchenko wrote: > > > and to put so far reported findings into some kind of automated form, > > please welcome > > > > http://www.onerussian.com/tmp/numpy-vbench/#benchmarks-performance-analysis > > > This is based on a simple 1-way anova of last 10 commits and some point > > in the past where 10 other commits had smallest timing and were > significantly > > different from the last 10 commits. > > > "Possible recent" is probably too noisy and not sure if useful -- it > should > > point to a closest in time (to the latest commits) diff where a > > significant excursion from current performance was detected. So per se > it has > > nothing to do with the initial detected performance hit, but in some > cases > > seems still to reasonably locate commits hitting on performance. > > > Enjoy, > -- > Yaroslav O. Halchenko, Ph.D. > http://neuro.debian.net http://www.pymvpa.org http://www.fail2ban.org > Senior Research Associate, Psychological and Brain Sciences Dept. > Dartmouth College, 419 Moore Hall, Hinman Box 6207, Hanover, NH 03755 > Phone: +1 (603) 646-9834 Fax: +1 (603) 646-1419 > WWW: http://www.linkedin.com/in/yarik > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From warren.weckesser at gmail.com Fri Jul 19 21:27:40 2013 From: warren.weckesser at gmail.com (Warren Weckesser) Date: Fri, 19 Jul 2013 21:27:40 -0400 Subject: [Numpy-discussion] fresh performance hits: numpy.linalg.pinv >30% slowdown In-Reply-To: <20130719220754.GR27621@onerussian.com> References: <20130506143241.GV5140@onerussian.com> <1367856232.2506.31.camel@sebastian-laptop> <20130506161153.GW5140@onerussian.com> <1367927238.23010.12.camel@sebastian-laptop> <20130701193006.GC27621@onerussian.com> <20130701215804.GG27621@onerussian.com> <20130709161007.GL27621@onerussian.com> <20130717035348.GN27621@onerussian.com> <20130719220754.GR27621@onerussian.com> Message-ID: On 7/19/13, Yaroslav Halchenko wrote: > I have just added a few more benchmarks, and here they come > http://www.onerussian.com/tmp/numpy-vbench/vb_vb_linalg.html#numpy-linalg-pinv-a-float32 > it seems to be very recent so my only check based on 10 commits > didn't pick it up yet so they are not present in the summary table. > > could well be related to 80% faster det()? ;) > > norm was hit as well a bit earlier, Well, this is embarrassing: https://github.com/numpy/numpy/pull/3539 Thanks for benchmarks! I'm now an even bigger fan. :) Warren might well be within these commits: > https://github.com/numpy/numpy/compare/24a0aa5...29dcc54 > I will rerun now benchmarking for the rest of commits (was running last > in the day iirc) > > Cheers, > > On Tue, 16 Jul 2013, Yaroslav Halchenko wrote: > >> and to put so far reported findings into some kind of automated form, >> please welcome > >> http://www.onerussian.com/tmp/numpy-vbench/#benchmarks-performance-analysis > >> This is based on a simple 1-way anova of last 10 commits and some point >> in the past where 10 other commits had smallest timing and were >> significantly >> different from the last 10 commits. > >> "Possible recent" is probably too noisy and not sure if useful -- it >> should >> point to a closest in time (to the latest commits) diff where a >> significant excursion from current performance was detected. So per se it >> has >> nothing to do with the initial detected performance hit, but in some >> cases >> seems still to reasonably locate commits hitting on performance. > >> Enjoy, > -- > Yaroslav O. Halchenko, Ph.D. > http://neuro.debian.net http://www.pymvpa.org http://www.fail2ban.org > Senior Research Associate, Psychological and Brain Sciences Dept. > Dartmouth College, 419 Moore Hall, Hinman Box 6207, Hanover, NH 03755 > Phone: +1 (603) 646-9834 Fax: +1 (603) 646-1419 > WWW: http://www.linkedin.com/in/yarik > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From josef.pktd at gmail.com Sat Jul 20 01:44:18 2013 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Sat, 20 Jul 2013 01:44:18 -0400 Subject: [Numpy-discussion] Bringing order to higher dimensional operations In-Reply-To: <1374250234.3254.33.camel@sebastian-laptop> References: <1374153819.14751.17.camel@sebastian-laptop> <1374250234.3254.33.camel@sebastian-laptop> Message-ID: On Fri, Jul 19, 2013 at 12:10 PM, Sebastian Berg wrote: > On Fri, 2013-07-19 at 16:14 +0100, Nathaniel Smith wrote: >> On Thu, Jul 18, 2013 at 2:23 PM, Sebastian Berg >> wrote: >> > On Thu, 2013-07-18 at 13:52 +0100, Nathaniel Smith wrote: >> >> Hi all, >> >> > >> >> What I mean is: Suppose we wrote a gufunc for 'sum', where the >> intrinsic operation took a vector and returned a scalar. 
(E.g. we want >> to implement one of the specialized algorithms for vector summation, >> like Kahan summation, which can be more accurate than applying scalar >> addition repeatedly.) >> >> Then we'd have: >> >> np.sum(ones((2, 3))).shape == () >> np.add.reduce(ones((2, 3))).shape == (3,) >> gufunc_sum(ones((2, 3))).shape == (2,) >> > > Ah, indeed! So we have a different default behaviour for ufunc.reduce > and all other reduce-like functions, didn't realize that. Changing that > would be one huge thing... I thought reduce, accumulate and reduceat (and map in python) are functions on iterators, and numpy still uses axis=0 to iterate over. related: is there any advantage to np.add.reduce? I find it more difficult to read than sum() and still see it used sometimes. (dot with more than 3 dimension is weird, and I never found a use for it.) Josef > As to implementing such thing as a Kahan summation, it is true, I also > can't see how it fits into the machinery. Maybe it shouldn't even be a > gufunc, but we rather need a way to specialize the reduction, or tag on > more information into the ufunc itself? > > - Sebastian > >> These are three names for exactly the same underlying function... but >> they all have different defaults for how they vectorize. >> >> -n >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From ralf.gommers at gmail.com Sat Jul 20 06:36:57 2013 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Sat, 20 Jul 2013 12:36:57 +0200 Subject: [Numpy-discussion] add .H attribute? In-Reply-To: References: <51D98902.1090403@gmail.com> <51E17280.2030105@gmail.com> Message-ID: On Thu, Jul 18, 2013 at 4:18 PM, St?fan van der Walt wrote: > On Sat, Jul 13, 2013 at 7:46 PM, Nathaniel Smith wrote: > > Why not just write > > > > def H(a): > > return a.conj().T > > It's hard to convince students that this is the Best Way of doing > things in NumPy. Why, they ask, can you do it using a' in MATLAB, > then? > > I've tripped over this one before, since it's not the kind of thing > you imagine would be unimplemented, and then spend some time trying to > find it. > +1 for adding a H attribute. Here's the end of the old discussion Chuck referred to: http://thread.gmane.org/gmane.comp.python.numeric.general/6637. No strong arguments against and then several more votes in favor. Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Sat Jul 20 06:58:08 2013 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Sat, 20 Jul 2013 12:58:08 +0200 Subject: [Numpy-discussion] subtypes of ndarray and round() In-Reply-To: <51D5D058.9090502@gmail.com> References: <51D5D058.9090502@gmail.com> Message-ID: On Thu, Jul 4, 2013 at 9:43 PM, Matti Picus wrote: > round() does not consistently preserve subtype of the ndarray, > is this known behaviour or should I file a bug for it? > That looks like a bug to me. The docstring explicitly says that return type equals input type. Ralf > Python 2.7.3 (default, Sep 26 2012, 21:51:14) > [GCC 4.7.2] on linux2 > Type "help", "copyright", "credits" or "license" for more information. 
> >>> import numpy as np > >>> np.version.version > '1.7.0' > >>> a=np.matrix(range(10)) > >>> a.round(decimals=10) > matrix([[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]]) > >>> a.round(decimals=-10) > array([[0, 0, 0, 0, 0, 0, 0, 0, 0, 0]]) > > Matti > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From pav at iki.fi Sat Jul 20 09:28:11 2013 From: pav at iki.fi (Pauli Virtanen) Date: Sat, 20 Jul 2013 16:28:11 +0300 Subject: [Numpy-discussion] fresh performance hits: numpy.linalg.pinv >30% slowdown In-Reply-To: References: <20130506143241.GV5140@onerussian.com> <1367856232.2506.31.camel@sebastian-laptop> <20130506161153.GW5140@onerussian.com> <1367927238.23010.12.camel@sebastian-laptop> <20130701193006.GC27621@onerussian.com> <20130701215804.GG27621@onerussian.com> <20130709161007.GL27621@onerussian.com> <20130717035348.GN27621@onerussian.com> <20130719220754.GR27621@onerussian.com> Message-ID: 20.07.2013 01:38, Nathaniel Smith kirjoitti: > The biggest ~recent change in master's linalg was the switch to gufunc > back ends - you might want to check for that event in your commit log. That was in mid-April, which doesn't match with the location of the uptick in the graph. Pauli From seb.haase at gmail.com Sat Jul 20 09:30:48 2013 From: seb.haase at gmail.com (Sebastian Haase) Date: Sat, 20 Jul 2013 15:30:48 +0200 Subject: [Numpy-discussion] add .H attribute? In-Reply-To: References: <51D98902.1090403@gmail.com> <51E17280.2030105@gmail.com> Message-ID: On Sat, Jul 20, 2013 at 12:36 PM, Ralf Gommers wrote: > > > > > On Thu, Jul 18, 2013 at 4:18 PM, St?fan van der Walt wrote: >> >> On Sat, Jul 13, 2013 at 7:46 PM, Nathaniel Smith wrote: >> > Why not just write >> > >> > def H(a): >> > return a.conj().T >> >> It's hard to convince students that this is the Best Way of doing >> things in NumPy. Why, they ask, can you do it using a' in MATLAB, >> then? >> >> I've tripped over this one before, since it's not the kind of thing >> you imagine would be unimplemented, and then spend some time trying to >> find it. > > > +1 for adding a H attribute. > > Here's the end of the old discussion Chuck referred to: >http://thread.gmane.org/gmane.comp.python.numeric.general/6637. No strong arguments against and then > several more votes in favor. Are there other precedents where an attribute would involve data-copying ? I'm thinking that numpy generally does better than matlab by being more explicit about it's memory usage... (But, I'm no mathematician and I could see it beeing much of a convenience to have .H ) My two cents, Sebastian Haase From ralf.gommers at gmail.com Sat Jul 20 09:49:34 2013 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Sat, 20 Jul 2013 15:49:34 +0200 Subject: [Numpy-discussion] add .H attribute? In-Reply-To: References: <51D98902.1090403@gmail.com> <51E17280.2030105@gmail.com> Message-ID: On Sat, Jul 20, 2013 at 3:30 PM, Sebastian Haase wrote: > On Sat, Jul 20, 2013 at 12:36 PM, Ralf Gommers > wrote: > > > > > > > > > > On Thu, Jul 18, 2013 at 4:18 PM, St?fan van der Walt > wrote: > >> > >> On Sat, Jul 13, 2013 at 7:46 PM, Nathaniel Smith wrote: > >> > Why not just write > >> > > >> > def H(a): > >> > return a.conj().T > >> > >> It's hard to convince students that this is the Best Way of doing > >> things in NumPy. Why, they ask, can you do it using a' in MATLAB, > >> then? 
> >> > >> I've tripped over this one before, since it's not the kind of thing > >> you imagine would be unimplemented, and then spend some time trying to > >> find it. > > > > > > +1 for adding a H attribute. > > > > Here's the end of the old discussion Chuck referred to: > > http://thread.gmane.org/gmane.comp.python.numeric.general/6637. No > strong arguments against and then > > several more votes in favor. > > Are there other precedents where an attribute would involve > data-copying ? np.matrix.H for example. If you meant ndarray attributes and not attributes of numpy objects, I guess no. I don't think that matters much compared to having an intuitive and consistent API though. Ralf > I'm thinking that numpy generally does better than > matlab by being more explicit about it's memory usage... > (But, I'm no mathematician and I could see it beeing much of a > convenience to have .H ) > > My two cents, > Sebastian Haase > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From matti.picus at gmail.com Sat Jul 20 14:09:09 2013 From: matti.picus at gmail.com (Matti Picus) Date: Sat, 20 Jul 2013 21:09:09 +0300 Subject: [Numpy-discussion] subtypes of ndarray and round() In-Reply-To: References: Message-ID: <51EAD245.7030605@gmail.com> An HTML attachment was scrubbed... URL: From magawake at gmail.com Sun Jul 21 02:24:16 2013 From: magawake at gmail.com (Mag Gam) Date: Sun, 21 Jul 2013 02:24:16 -0400 Subject: [Numpy-discussion] Mag Gam Message-ID: http://houtwormbestrijding-houtwormbestrijding.nl/mlwoeh/ibivodpmj.ikuklorzxzgycwtj Mag Gam 7/21/2013 7:24:11 AM From stefan at sun.ac.za Sun Jul 21 18:37:42 2013 From: stefan at sun.ac.za (=?iso-8859-1?Q?St=E9fan?= van der Walt) Date: Mon, 22 Jul 2013 00:37:42 +0200 Subject: [Numpy-discussion] add .H attribute? In-Reply-To: References: <51D98902.1090403@gmail.com> <51E17280.2030105@gmail.com> Message-ID: <20130721223742.GA20415@shinobi> On Sat, 20 Jul 2013 15:30:48 +0200, Sebastian Haase wrote: > Are there other precedents where an attribute would involve > data-copying ? I'm thinking that numpy generally does better than > matlab by being more explicit about it's memory usage... > (But, I'm no mathematician and I could see it beeing much of a > convenience to have .H ) Hopefully we'll eventually have lazily evaluated arrays so that we can do things like views of ufuncs on data. Unfortunately, this is not doable with the current ndarray, since its structure is tied to a pointer and strides. St?fan From anubhab91 at gmail.com Mon Jul 22 00:04:57 2013 From: anubhab91 at gmail.com (Anubhab Baksi) Date: Mon, 22 Jul 2013 09:34:57 +0530 Subject: [Numpy-discussion] Mag Gam In-Reply-To: References: Message-ID: I don't know, but it is redirected to https://www.google.co.in/?gws_rd=cr . On Sun, Jul 21, 2013 at 11:54 AM, Mag Gam wrote: > > http://houtwormbestrijding-houtwormbestrijding.nl/mlwoeh/ibivodpmj.ikuklorzxzgycwtj > > > > > > Mag Gam > > > 7/21/2013 7:24:11 AM > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From stefan at sun.ac.za Mon Jul 22 05:02:19 2013 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Mon, 22 Jul 2013 11:02:19 +0200 Subject: [Numpy-discussion] Bringing order to higher dimensional operations In-Reply-To: References: <1374153819.14751.17.camel@sebastian-laptop> <1374250234.3254.33.camel@sebastian-laptop> Message-ID: On Sat, Jul 20, 2013 at 7:44 AM, wrote: > related: is there any advantage to np.add.reduce? > I find it more difficult to read than sum() and still see it used sometimes. I think ``np.add.reduce`` just falls out of the ufunc implementation--there's no "per ufunc" choice to remove certain parts of the API, if I recall correctly. Stéfan From lists at onerussian.com Mon Jul 22 10:51:58 2013 From: lists at onerussian.com (Yaroslav Halchenko) Date: Mon, 22 Jul 2013 10:51:58 -0400 Subject: [Numpy-discussion] fresh performance hits: numpy.linalg.pinv >30% slowdown In-Reply-To: References: <1367856232.2506.31.camel@sebastian-laptop> <20130506161153.GW5140@onerussian.com> <1367927238.23010.12.camel@sebastian-laptop> <20130701193006.GC27621@onerussian.com> <20130701215804.GG27621@onerussian.com> <20130709161007.GL27621@onerussian.com> <20130717035348.GN27621@onerussian.com> <20130719220754.GR27621@onerussian.com> Message-ID: <20130722145157.GS27621@onerussian.com> On Fri, 19 Jul 2013, Warren Weckesser wrote: > Well, this is embarrassing: https://github.com/numpy/numpy/pull/3539 > Thanks for benchmarks! I'm now an even bigger fan. :) Great to see that those came of help! I thought to provide a detailed breakdown (benchmarking all recent commits) to pin down the exact point of regression, but embarrassingly I made that run outside of the benchmarking chroot, so consistency was not guaranteed. Anyways -- rerunning it correctly now (with recent commits included). -- Yaroslav O. Halchenko, Ph.D. http://neuro.debian.net http://www.pymvpa.org http://www.fail2ban.org Senior Research Associate, Psychological and Brain Sciences Dept. 
Dartmouth College, 419 Moore Hall, Hinman Box 6207, Hanover, NH 03755 Phone: +1 (603) 646-9834 Fax: +1 (603) 646-1419 WWW: http://www.linkedin.com/in/yarik From ben.root at ou.edu Mon Jul 22 13:16:22 2013 From: ben.root at ou.edu (Benjamin Root) Date: Mon, 22 Jul 2013 13:16:22 -0400 Subject: [Numpy-discussion] fresh performance hits: numpy.linalg.pinv >30% slowdown In-Reply-To: <20130722145514.GT27621@onerussian.com> References: <20130506161153.GW5140@onerussian.com> <1367927238.23010.12.camel@sebastian-laptop> <20130701193006.GC27621@onerussian.com> <20130701215804.GG27621@onerussian.com> <20130709161007.GL27621@onerussian.com> <20130717035348.GN27621@onerussian.com> <20130719220754.GR27621@onerussian.com> <20130722145514.GT27621@onerussian.com> Message-ID: On Mon, Jul 22, 2013 at 10:55 AM, Yaroslav Halchenko wrote: > At some point I hope to tune up the report with an option of viewing the > plot using e.g. nvd3 JS so it could be easier to pin point/analyze > interactively. > > shameless plug... the soon-to-be-finalized matplotlib-1.3 has a WebAgg backend that allows for interactivity. Cheers! Ben Root -------------- next part -------------- An HTML attachment was scrubbed... URL: From lists at onerussian.com Mon Jul 22 13:28:28 2013 From: lists at onerussian.com (Yaroslav Halchenko) Date: Mon, 22 Jul 2013 13:28:28 -0400 Subject: [Numpy-discussion] fresh performance hits: numpy.linalg.pinv >30% slowdown In-Reply-To: References: <20130701193006.GC27621@onerussian.com> <20130701215804.GG27621@onerussian.com> <20130709161007.GL27621@onerussian.com> <20130717035348.GN27621@onerussian.com> <20130719220754.GR27621@onerussian.com> <20130722145514.GT27621@onerussian.com> Message-ID: <20130722172828.GU27621@onerussian.com> On Mon, 22 Jul 2013, Benjamin Root wrote: > At some point I hope to tune up the report with an option of viewing the > plot using e.g. nvd3 JS so it could be easier to pin point/analyze > interactively. > shameless plug... the soon-to-be-finalized matplotlib-1.3 has a WebAgg > backend that allows for interactivity. "that's just sick!" do you know about any motion in python-sphinx world on supporting it? is there any demo page you would recommend to assess what to expect supported in upcoming webagg? -- Yaroslav O. Halchenko, Ph.D. http://neuro.debian.net http://www.pymvpa.org http://www.fail2ban.org Senior Research Associate, Psychological and Brain Sciences Dept. Dartmouth College, 419 Moore Hall, Hinman Box 6207, Hanover, NH 03755 Phone: +1 (603) 646-9834 Fax: +1 (603) 646-1419 WWW: http://www.linkedin.com/in/yarik From ben.root at ou.edu Mon Jul 22 13:43:56 2013 From: ben.root at ou.edu (Benjamin Root) Date: Mon, 22 Jul 2013 13:43:56 -0400 Subject: [Numpy-discussion] fresh performance hits: numpy.linalg.pinv >30% slowdown In-Reply-To: <20130722172828.GU27621@onerussian.com> References: <20130701193006.GC27621@onerussian.com> <20130701215804.GG27621@onerussian.com> <20130709161007.GL27621@onerussian.com> <20130717035348.GN27621@onerussian.com> <20130719220754.GR27621@onerussian.com> <20130722145514.GT27621@onerussian.com> <20130722172828.GU27621@onerussian.com> Message-ID: On Mon, Jul 22, 2013 at 1:28 PM, Yaroslav Halchenko wrote: > > On Mon, 22 Jul 2013, Benjamin Root wrote: > > At some point I hope to tune up the report with an option of > viewing the > > plot using e.g. nvd3 JS so it could be easier to pin point/analyze > > interactively. > > shameless plug... the soon-to-be-finalized matplotlib-1.3 has a WebAgg > > backend that allows for interactivity. 
> > "that's just sick!" > > do you know about any motion in python-sphinx world on supporting it? > > is there any demo page you would recommend to assess what to expect > supported in upcoming webagg? > > Oldie but goodie: http://mdboom.github.io/blog/2012/10/11/matplotlib-in-the-browser-its-coming/ Official Announcement: http://matplotlib.org/1.3.0/users/whats_new.html#webagg-backend Note, this is different than what is now available in IPython Notebook (it isn't really interactive there). As for what is supported, just about everything you can do normally, can be done in WebAgg. I have no clue about sphinx-level support. Now, back to your regularly scheduled program. Cheers! Ben Root -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Mon Jul 22 15:10:42 2013 From: njs at pobox.com (Nathaniel Smith) Date: Mon, 22 Jul 2013 20:10:42 +0100 Subject: [Numpy-discussion] add .H attribute? In-Reply-To: References: <51D98902.1090403@gmail.com> <51E17280.2030105@gmail.com> Message-ID: On Thu, Jul 18, 2013 at 3:18 PM, St?fan van der Walt wrote: > On Sat, Jul 13, 2013 at 7:46 PM, Nathaniel Smith wrote: >> Why not just write >> >> def H(a): >> return a.conj().T > > It's hard to convince students that this is the Best Way of doing > things in NumPy. Why, they ask, can you do it using a' in MATLAB, > then? I guess I'd try to treat it as a teachable moment... the answer points to a basic difference in numpy versus MATLAB. Numpy operates at a slightly lower level of abstraction. In MATLAB you're encouraged to think of arrays as just mathematical matrices and let MATLAB worry about how to actually represent those inside the computer. Sometimes it does a good job, sometimes not. In numpy you need to think of arrays as structured representations of a chunk of memory. There disadvantages to this -- e.g. keeping track of which arrays return view and which return copies can be tricky -- but it also gives a lot of power: views are awesome, you get better interoperability with C libraries/Cython, better ability to predict which operations are expensive or cheap, more opportunities to use clever tricks when you need to, etc. And one example of this is that transpose and conjugate transpose really are very different at this level, because one is a cheap stride manipulation that returns a view, and the other is a (relatively) expensive data copying operation. The convention in Python is that attribute access is supposed to be cheap, while function calls serve as a warning that something expensive might be going on. So in short: MATLAB is optimized for doing linear algebra and not thinking too hard about programming; numpy is optimized for writing good programs. Having .T but not .H is an example of this split. Also it's a good opportunity to demonstrate the value of making little helper functions, which is a powerful technique that students generally need to be taught ;-). -n From evgeny.toder at jpmorgan.com Mon Jul 22 15:39:47 2013 From: evgeny.toder at jpmorgan.com (Toder, Evgeny) Date: Mon, 22 Jul 2013 19:39:47 +0000 Subject: [Numpy-discussion] add .H attribute? In-Reply-To: References: <51D98902.1090403@gmail.com> <51E17280.2030105@gmail.com> Message-ID: What if .H is not an attribute, but a method? Is this enough of a warning about copying? 
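For what it's worth, the copy/view asymmetry that such a warning would be about is easy to
check at the prompt -- a quick illustration with plain ndarrays, nothing new:

import numpy as np

a = np.arange(6, dtype=complex).reshape(2, 3)

t = a.T          # plain transpose: only a re-striding, no data copied
h = a.conj().T   # "Hermitian": conj() has to materialize a new array first

print(t.base is a)                 # True  -- .T is a view of a
print(np.may_share_memory(h, a))   # False -- the conjugate was copied
print(np.allclose(h, a.T.conj()))  # True  -- same values either way

So whatever the spelling ends up being, it is the conj() step, not the transpose, that costs
the copy.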
Eugene -----Original Message----- From: numpy-discussion-bounces at scipy.org [mailto:numpy-discussion-bounces at scipy.org] On Behalf Of Nathaniel Smith Sent: Monday, July 22, 2013 3:11 PM To: Discussion of Numerical Python Subject: Re: [Numpy-discussion] add .H attribute? On Thu, Jul 18, 2013 at 3:18 PM, St?fan van der Walt wrote: > On Sat, Jul 13, 2013 at 7:46 PM, Nathaniel Smith wrote: >> Why not just write >> >> def H(a): >> return a.conj().T > > It's hard to convince students that this is the Best Way of doing > things in NumPy. Why, they ask, can you do it using a' in MATLAB, > then? I guess I'd try to treat it as a teachable moment... the answer points to a basic difference in numpy versus MATLAB. Numpy operates at a slightly lower level of abstraction. In MATLAB you're encouraged to think of arrays as just mathematical matrices and let MATLAB worry about how to actually represent those inside the computer. Sometimes it does a good job, sometimes not. In numpy you need to think of arrays as structured representations of a chunk of memory. There disadvantages to this -- e.g. keeping track of which arrays return view and which return copies can be tricky -- but it also gives a lot of power: views are awesome, you get better interoperability with C libraries/Cython, better ability to predict which operations are expensive or cheap, more opportunities to use clever tricks when you need to, etc. And one example of this is that transpose and conjugate transpose really are very different at this level, because one is a cheap stride manipulation that returns a view, and the other is a (relatively) expensive data copying operation. The convention in Python is that attribute access is supposed to be cheap, while function calls serve as a warning that something expensive might be going on. So in short: MATLAB is optimized for doing linear algebra and not thinking too hard about programming; numpy is optimized for writing good programs. Having .T but not .H is an example of this split. Also it's a good opportunity to demonstrate the value of making little helper functions, which is a powerful technique that students generally need to be taught ;-). -n _______________________________________________ NumPy-Discussion mailing list NumPy-Discussion at scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion This email is confidential and subject to important disclaimers and conditions including on offers for the purchase or sale of securities, accuracy and completeness of information, viruses, confidentiality, legal privilege, and legal entity disclaimers, available at http://www.jpmorgan.com/pages/disclosures/email. From bryanv at continuum.io Mon Jul 22 16:04:47 2013 From: bryanv at continuum.io (Bryan Van de Ven) Date: Mon, 22 Jul 2013 16:04:47 -0400 Subject: [Numpy-discussion] add .H attribute? In-Reply-To: References: <51D98902.1090403@gmail.com> <51E17280.2030105@gmail.com> Message-ID: On the other hand, the most salient quality an unavoidable copy is that it is unavoidable. For people for whom using Hermitian conjugates is common, it's not like they won't do it just because they can't avoid a copy that can't be avoided. Given that if a problem dictates a Hermitian conjugate be taken, then it will be taken, then: a.H is closer to the mathematical notation, eases migration for matlab users, and does not require everyone to reinvent their own little version of the same function over and over. All of that seems more compelling that this particular arbitrary convention, personally. 
Bryan On Jul 22, 2013, at 3:10 PM, Nathaniel Smith wrote: > On Thu, Jul 18, 2013 at 3:18 PM, St?fan van der Walt wrote: >> On Sat, Jul 13, 2013 at 7:46 PM, Nathaniel Smith wrote: >>> Why not just write >>> >>> def H(a): >>> return a.conj().T >> >> It's hard to convince students that this is the Best Way of doing >> things in NumPy. Why, they ask, can you do it using a' in MATLAB, >> then? > > I guess I'd try to treat it as a teachable moment... the answer points > to a basic difference in numpy versus MATLAB. Numpy operates at a > slightly lower level of abstraction. In MATLAB you're encouraged to > think of arrays as just mathematical matrices and let MATLAB worry > about how to actually represent those inside the computer. Sometimes > it does a good job, sometimes not. In numpy you need to think of > arrays as structured representations of a chunk of memory. There > disadvantages to this -- e.g. keeping track of which arrays return > view and which return copies can be tricky -- but it also gives a lot > of power: views are awesome, you get better interoperability with C > libraries/Cython, better ability to predict which operations are > expensive or cheap, more opportunities to use clever tricks when you > need to, etc. > > And one example of this is that transpose and conjugate transpose > really are very different at this level, because one is a cheap stride > manipulation that returns a view, and the other is a (relatively) > expensive data copying operation. The convention in Python is that > attribute access is supposed to be cheap, while function calls serve > as a warning that something expensive might be going on. So in short: > MATLAB is optimized for doing linear algebra and not thinking too hard > about programming; numpy is optimized for writing good programs. > Having .T but not .H is an example of this split. > > Also it's a good opportunity to demonstrate the value of making little > helper functions, which is a powerful technique that students > generally need to be taught ;-). > > -n > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From alan.isaac at gmail.com Mon Jul 22 16:07:28 2013 From: alan.isaac at gmail.com (Alan G Isaac) Date: Mon, 22 Jul 2013 16:07:28 -0400 Subject: [Numpy-discussion] add .H attribute? In-Reply-To: References: <51D98902.1090403@gmail.com> <51E17280.2030105@gmail.com> Message-ID: <51ED9100.8040108@gmail.com> On 7/22/2013 3:10 PM, Nathaniel Smith wrote: > Having .T but not .H is an example of this split. Hate to do this but ... Readability counts. Special cases aren't special enough to break the rules. Although practicality beats purity. How much is the split a rule or "just" a convention, and is there enough practicality here to beat the purity of the split? Note: this is not a rhetorical question. However: if you propose A.conjugate().transpose() as providing a teachable moment about why to use NumPy instead of A' in Matlab, I conclude you do not ever teach most of my students. The real world matters. Since practicality beats purity, we do have A.conj().T, which is better but still not as readable as A.H would be. Or even A.H(), should that satisfy your objections (and still provide a teachable moment). Alan From dave.hirschfeld at gmail.com Tue Jul 23 03:35:41 2013 From: dave.hirschfeld at gmail.com (Dave Hirschfeld) Date: Tue, 23 Jul 2013 07:35:41 +0000 (UTC) Subject: [Numpy-discussion] add .H attribute? 
References: <51D98902.1090403@gmail.com> <51E17280.2030105@gmail.com> <51ED9100.8040108@gmail.com> Message-ID: Alan G Isaac gmail.com> writes: > > On 7/22/2013 3:10 PM, Nathaniel Smith wrote: > > Having .T but not .H is an example of this split. > > Hate to do this but ... > > Readability counts. +10! A.conjugate().transpose() is unspeakably horrible IMHO. Since there's no way to avoid a copy you gain nothing by not providing the convenience function. It should be fairly obvious that an operation which changes the values of an array (and doesn't work in-place) necessarily takes a copy. I think it's more than sufficient to simply document the fact that A.H will return a copy. A user coming from Matlab probably doesn't care that it takes a copy but you'd be hard pressed to convince them there's any benefit of writing A.conjugate().transpose() over exactly what it looks like in textbooks - A.H Regards, Dave From fperez.net at gmail.com Tue Jul 23 04:07:27 2013 From: fperez.net at gmail.com (Fernando Perez) Date: Tue, 23 Jul 2013 01:07:27 -0700 Subject: [Numpy-discussion] add .H attribute? In-Reply-To: References: <51D98902.1090403@gmail.com> <51E17280.2030105@gmail.com> <51ED9100.8040108@gmail.com> Message-ID: On Tue, Jul 23, 2013 at 12:35 AM, Dave Hirschfeld wrote: > Alan G Isaac gmail.com> writes: > >> >> On 7/22/2013 3:10 PM, Nathaniel Smith wrote: >> > Having .T but not .H is an example of this split. >> >> Hate to do this but ... >> >> Readability counts. > > +10! > > A.conjugate().transpose() is unspeakably horrible IMHO. Since there's no way > to avoid a copy you gain nothing by not providing the convenience function. Silly suggestion: why not just make .H a callable? a.H() is nearly as short/handy as .H, it fits easily into the mnemonic pattern suggested by .T, yet the extra () are indicative that something potentially big/expensive is happening... Cheers, f -- Fernando Perez (@fperez_org; http://fperez.org) fperez.net-at-gmail: mailing lists only (I ignore this when swamped!) fernando.perez-at-berkeley: contact me here for any direct mail From d.s.seljebotn at astro.uio.no Tue Jul 23 04:35:16 2013 From: d.s.seljebotn at astro.uio.no (Dag Sverre Seljebotn) Date: Tue, 23 Jul 2013 10:35:16 +0200 Subject: [Numpy-discussion] add .H attribute? In-Reply-To: References: <51D98902.1090403@gmail.com> <51E17280.2030105@gmail.com> <51ED9100.8040108@gmail.com> Message-ID: <51EE4044.5060509@astro.uio.no> On 07/23/2013 09:35 AM, Dave Hirschfeld wrote: > Alan G Isaac gmail.com> writes: > >> >> On 7/22/2013 3:10 PM, Nathaniel Smith wrote: >>> Having .T but not .H is an example of this split. >> >> Hate to do this but ... >> >> Readability counts. > > +10! > > A.conjugate().transpose() is unspeakably horrible IMHO. Since there's no way > to avoid a copy you gain nothing by not providing the convenience function. > > It should be fairly obvious that an operation which changes the values of an > array (and doesn't work in-place) necessarily takes a copy. I think it's more > than sufficient to simply document the fact that A.H will return a copy. I don't think this is obvious at all. In fact, I'd fully expect A.H to return a view that conjugates the values on the fly as they are read/written (just the same way the array is "transposed on the fly" or "sliced on the fly" with other views). There's lots of uses for A.H to be a conjugating-view, e.g., np.dot(A.H, A) can be done on-the-fly by BLAS at no extra cost, and so on. 
These are currently not possible with pure NumPy without a copy, which is a pretty big defect IMO (and one reason I'd call BLAS myself using Cython rather than use np.dot...) So -1 on using A.H for anything but a proper view, and "A.conjt()" or something similar for a method that does a copy. Dag Sverre From d.s.seljebotn at astro.uio.no Tue Jul 23 04:36:33 2013 From: d.s.seljebotn at astro.uio.no (Dag Sverre Seljebotn) Date: Tue, 23 Jul 2013 10:36:33 +0200 Subject: [Numpy-discussion] add .H attribute? In-Reply-To: <51EE4044.5060509@astro.uio.no> References: <51D98902.1090403@gmail.com> <51E17280.2030105@gmail.com> <51ED9100.8040108@gmail.com> <51EE4044.5060509@astro.uio.no> Message-ID: <51EE4091.1080009@astro.uio.no> On 07/23/2013 10:35 AM, Dag Sverre Seljebotn wrote: > On 07/23/2013 09:35 AM, Dave Hirschfeld wrote: >> Alan G Isaac gmail.com> writes: >> >>> >>> On 7/22/2013 3:10 PM, Nathaniel Smith wrote: >>>> Having .T but not .H is an example of this split. >>> >>> Hate to do this but ... >>> >>> Readability counts. >> >> +10! >> >> A.conjugate().transpose() is unspeakably horrible IMHO. Since there's >> no way >> to avoid a copy you gain nothing by not providing the convenience >> function. >> >> It should be fairly obvious that an operation which changes the values >> of an >> array (and doesn't work in-place) necessarily takes a copy. I think >> it's more >> than sufficient to simply document the fact that A.H will return a copy. > > I don't think this is obvious at all. In fact, I'd fully expect A.H to > return a view that conjugates the values on the fly as they are > read/written (just the same way the array is "transposed on the fly" or > "sliced on the fly" with other views). > > There's lots of uses for A.H to be a conjugating-view, e.g., np.dot(A.H, > A) can be done on-the-fly by BLAS at no extra cost, and so on. These are > currently not possible with pure NumPy without a copy, which is a pretty > big defect IMO (and one reason I'd call BLAS myself using Cython rather > than use np.dot...) > > So -1 on using A.H for anything but a proper view, and "A.conjt()" or > something similar for a method that does a copy. Sorry: I'm +1 on another name for a method that does a copy. Which can eventually be made redundant with A.H.copy(), if somebody ever takes on the work to make that happen...but at least I think the path to that should be kept open. Dag Sverre From josef.pktd at gmail.com Tue Jul 23 04:46:44 2013 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Tue, 23 Jul 2013 04:46:44 -0400 Subject: [Numpy-discussion] what is data attribute of numpy.str_ ? Message-ID: python 3.3 I have a bug because we have a check for a .data attribute, that is not supposed to be available for a string. 
(Pdb) dir(data) ['T', '__abs__', '__add__', '__and__', '__array__', '__array_interface__', '__array_priority__', '__array_struct__', '__array_wrap__', '__bool__', '__class__', '__contains__', '__copy__', '__deepcopy__', '__delattr__', '__dir__', '__divmod__', '__doc__', '__eq__', '__float__', '__floordiv__', '__format__', '__ge__', '__getattribute__', '__getitem__', '__getnewargs__', '__gt__', '__hash__', '__init__', '__int__', '__invert__', '__iter__', '__le__', '__len__', '__lshift__', '__lt__', '__mod__', '__mul__', '__ne__', '__neg__', '__new__', '__or__', '__pos__', '__pow__', '__radd__', '__rand__', '__rdivmod__', '__reduce__', '__reduce_ex__', '__repr__', '__rfloordiv__', '__rlshift__', '__rmod__', '__rmul__', '__ror__', '__round__', '__rpow__', '__rrshift__', '__rshift__', '__rsub__', '__rtruediv__', '__rxor__', '__setattr__', '__setstate__', '__sizeof__', '__str__', '__sub__', '__subclasshook__', '__truediv__', '__xor__', 'all', 'any', 'argmax', 'argmin', 'argsort', 'astype', 'base', 'byteswap', 'capitalize', 'casefold', 'center', 'choose', 'clip', 'compress', 'conj', 'conjugate', 'copy', 'count', 'cumprod', 'cumsum', 'data', 'diagonal', 'dtype', 'dump', 'dumps', 'encode', 'endswith', 'expandtabs', 'fill', 'find', 'flags', 'flat', 'flatten', 'format', 'format_map', 'getfield', 'imag', 'index', 'isalnum', 'isalpha', 'isdecimal', 'isdigit', 'isidentifier', 'islower', 'isnumeric', 'isprintable', 'isspace', 'istitle', 'isupper', 'item', 'itemset', 'itemsize', 'join', 'ljust', 'lower', 'lstrip', 'maketrans', 'max', 'mean', 'min', 'nbytes', 'ndim', 'newbyteorder', 'nonzero', 'partition', 'prod', 'ptp', 'put', 'ravel', 'real', 'repeat', 'replace', 'reshape', 'resize', 'rfind', 'rindex', 'rjust', 'round', 'rpartition', 'rsplit', 'rstrip', 'searchsorted', 'setfield', 'setflags', 'shape', 'size', 'sort', 'split', 'splitlines', 'squeeze', 'startswith', 'std', 'strides', 'strip', 'sum', 'swapaxes', 'swapcase', 'take', 'title', 'tofile', 'tolist', 'tostring', 'trace', 'translate', 'transpose', 'upper', 'var', 'view', 'zfill'] (Pdb) data '0' (Pdb) type(data) (Pdb) data.data *** TypeError: memoryview: numpy.str_ object does not have the buffer interface (Pdb) Josef From alan.isaac at gmail.com Tue Jul 23 08:35:01 2013 From: alan.isaac at gmail.com (Alan G Isaac) Date: Tue, 23 Jul 2013 08:35:01 -0400 Subject: [Numpy-discussion] add .H attribute? In-Reply-To: <51EE4091.1080009@astro.uio.no> References: <51D98902.1090403@gmail.com> <51E17280.2030105@gmail.com> <51ED9100.8040108@gmail.com> <51EE4044.5060509@astro.uio.no> <51EE4091.1080009@astro.uio.no> Message-ID: <51EE7875.1080803@gmail.com> On 7/23/2013 4:36 AM, Dag Sverre Seljebotn wrote: > I'm +1 on another name for a method that does a copy. Which can > eventually be made redundant with A.H.copy(), if somebody ever takes on > the work to make that happen...but at least I think the path to that > should be kept open. If that is the decision, I would suggest A.ct(). But, it this really necessary? An obvious path is to introduce A.H now, document that it makes a copy, and document that it may eventually produce an iterative view. Think how much nicer things would be evolving if diagonal had been implemented as an attribute with documentation that it would eventually be a writable view. Isn't there some analogy with this situation? Alan From stefan at sun.ac.za Tue Jul 23 08:42:49 2013 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Tue, 23 Jul 2013 14:42:49 +0200 Subject: [Numpy-discussion] add .H attribute? 
In-Reply-To: <51EE4044.5060509@astro.uio.no> References: <51D98902.1090403@gmail.com> <51E17280.2030105@gmail.com> <51ED9100.8040108@gmail.com> <51EE4044.5060509@astro.uio.no> Message-ID: On Tue, Jul 23, 2013 at 10:35 AM, Dag Sverre Seljebotn wrote: > So -1 on using A.H for anything but a proper view, and "A.conjt()" or > something similar for a method that does a copy. "A.T.conj()" is just as clear, so my feeling is that we should either add A.H / A.H() or leave it be. St?fan From pav at iki.fi Tue Jul 23 09:09:27 2013 From: pav at iki.fi (Pauli Virtanen) Date: Tue, 23 Jul 2013 16:09:27 +0300 Subject: [Numpy-discussion] add .H attribute? In-Reply-To: References: <51D98902.1090403@gmail.com> <51E17280.2030105@gmail.com> <51ED9100.8040108@gmail.com> <51EE4044.5060509@astro.uio.no> Message-ID: 23.07.2013 15:42, St?fan van der Walt kirjoitti: > On Tue, Jul 23, 2013 at 10:35 AM, Dag Sverre Seljebotn > wrote: >> So -1 on using A.H for anything but a proper view, and "A.conjt()" or >> something similar for a method that does a copy. > > "A.T.conj()" is just as clear, so my feeling is that we should either > add A.H / A.H() or leave it be. The .H property has been implemented in Numpy matrices and Scipy's sparse matrices for many years, and AFAIK the view issue apparently hasn't caused much confusion. I think having it return an iterator (similarly to .flat which I think is rarely used) that is not compatible with ndarrays would be quite confusing. Implementing a full complex-conjugating ndarray view for this purpose on the other hand seems quite a large hassle, for somewhat dubious gains. If it is implemented as returning a copy, it can be documented in a way that leaves leeway for changing the implementation to a view later on. -- Pauli Virtanen From Jerome.Kieffer at esrf.fr Tue Jul 23 09:37:34 2013 From: Jerome.Kieffer at esrf.fr (Jerome Kieffer) Date: Tue, 23 Jul 2013 15:37:34 +0200 Subject: [Numpy-discussion] add .H attribute? In-Reply-To: References: <51D98902.1090403@gmail.com> <51E17280.2030105@gmail.com> <51ED9100.8040108@gmail.com> Message-ID: <20130723153734.3a235e49.Jerome.Kieffer@esrf.fr> On Tue, 23 Jul 2013 01:07:27 -0700 Fernando Perez wrote: > Silly suggestion: why not just make .H a callable? > > a.H() +1 -- J?r?me Kieffer On-Line Data analysis / Software Group ISDD / ESRF tel +33 476 882 445 From alan.isaac at gmail.com Tue Jul 23 09:39:08 2013 From: alan.isaac at gmail.com (Alan G Isaac) Date: Tue, 23 Jul 2013 09:39:08 -0400 Subject: [Numpy-discussion] .flat (was: add .H attribute?) In-Reply-To: References: <51D98902.1090403@gmail.com> <51E17280.2030105@gmail.com> <51ED9100.8040108@gmail.com> <51EE4044.5060509@astro.uio.no> Message-ID: <51EE877C.4080504@gmail.com> On 7/23/2013 9:09 AM, Pauli Virtanen wrote: > .flat which I think > is rarely used Until ``diagonal`` completes its transition, use of ``flat`` seems the best way to reset the diagonal on an array. Am I wrong? I use it that way all the time. Alan Isaac From stefan at sun.ac.za Tue Jul 23 10:11:47 2013 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Tue, 23 Jul 2013 16:11:47 +0200 Subject: [Numpy-discussion] .flat (was: add .H attribute?) 
In-Reply-To: <51EE877C.4080504@gmail.com> References: <51D98902.1090403@gmail.com> <51E17280.2030105@gmail.com> <51ED9100.8040108@gmail.com> <51EE4044.5060509@astro.uio.no> <51EE877C.4080504@gmail.com> Message-ID: On Tue, Jul 23, 2013 at 3:39 PM, Alan G Isaac wrote: > On 7/23/2013 9:09 AM, Pauli Virtanen wrote: >> .flat which I think >> is rarely used > > Until ``diagonal`` completes its transition, > use of ``flat`` seems the best way to reset > the diagonal on an array. Am I wrong? > I use it that way all the time. I usually write x[np.diag_indices_from(x)] = [1,2,3] St?fan From sebastian at sipsolutions.net Tue Jul 23 10:29:44 2013 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Tue, 23 Jul 2013 16:29:44 +0200 Subject: [Numpy-discussion] Allow == and != to raise errors In-Reply-To: References: <1373632688.13968.13.camel@sebastian-laptop> Message-ID: <1374589784.13486.32.camel@sebastian-laptop> On Sat, 2013-07-13 at 11:28 -0400, josef.pktd at gmail.com wrote: > On Sat, Jul 13, 2013 at 9:14 AM, Nathaniel Smith wrote: > > I'm now +1 on the exception that Sebastian proposed. > > I like consistency, and having a more straightforward mental model of > the numpy behavior for elementwise operations, that don't pretend > sometimes to be "python" (when I'm doing array math), like this > I am not sure what the result of this discussion is. As far as I see Benjamin and Fr?d?ric were opposing and overall it seemed pretty mixed, so unless you two changed your mind or say that it was just a small personal preference I would drop it for now. I obviously think the current behaviour is inconsistent to buggy and am really only afraid of possibly breaking code out there. Which is why I think I maybe should first add a FutureWarning if we decide on changing it. Regards, Sebastian > >>> [1,2,3] < [1,2] > False > >>> [1,2,3] > [1,2] > True > > Josef > > > > > > -n > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > http://mail.scipy.org/mailman/listinfo/numpy-discussion > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From ben.root at ou.edu Tue Jul 23 10:34:22 2013 From: ben.root at ou.edu (Benjamin Root) Date: Tue, 23 Jul 2013 10:34:22 -0400 Subject: [Numpy-discussion] .flat (was: add .H attribute?) In-Reply-To: References: <51D98902.1090403@gmail.com> <51E17280.2030105@gmail.com> <51ED9100.8040108@gmail.com> <51EE4044.5060509@astro.uio.no> <51EE877C.4080504@gmail.com> Message-ID: On Tue, Jul 23, 2013 at 10:11 AM, St?fan van der Walt wrote: > On Tue, Jul 23, 2013 at 3:39 PM, Alan G Isaac > wrote: > > On 7/23/2013 9:09 AM, Pauli Virtanen wrote: > >> .flat which I think > >> is rarely used > > > Don't assume .flat is not commonly used. A common idiom in matlab is "a[:]" to flatten an array. When porting code over from matlab, it is typical to replace that with either "a.flat" or "a.flatten()", depending on whether an iterator or an array is needed. Cheers! Ben Root -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From pav at iki.fi Tue Jul 23 10:46:20 2013 From: pav at iki.fi (Pauli Virtanen) Date: Tue, 23 Jul 2013 17:46:20 +0300 Subject: [Numpy-discussion] .flat In-Reply-To: References: <51D98902.1090403@gmail.com> <51E17280.2030105@gmail.com> <51ED9100.8040108@gmail.com> <51EE4044.5060509@astro.uio.no> <51EE877C.4080504@gmail.com> Message-ID: 23.07.2013 17:34, Benjamin Root kirjoitti: [clip] > Don't assume .flat is not commonly used. A common idiom in matlab is > "a[:]" to flatten an array. When porting code over from matlab, it is > typical to replace that with either "a.flat" or "a.flatten()", depending > on whether an iterator or an array is needed. It is much more rarely used than `ravel()` and `flatten()`, as can be verified by grepping e.g. the matplotlib source code. -- Pauli Virtanen From njs at pobox.com Tue Jul 23 10:51:46 2013 From: njs at pobox.com (Nathaniel Smith) Date: Tue, 23 Jul 2013 15:51:46 +0100 Subject: [Numpy-discussion] add .H attribute? In-Reply-To: <51EE4044.5060509@astro.uio.no> References: <51D98902.1090403@gmail.com> <51E17280.2030105@gmail.com> <51ED9100.8040108@gmail.com> <51EE4044.5060509@astro.uio.no> Message-ID: On Tue, Jul 23, 2013 at 9:35 AM, Dag Sverre Seljebotn wrote: > I don't think this is obvious at all. In fact, I'd fully expect A.H to > return a view that conjugates the values on the fly as they are > read/written (just the same way the array is "transposed on the fly" or > "sliced on the fly" with other views). > > There's lots of uses for A.H to be a conjugating-view, e.g., np.dot(A.H, > A) can be done on-the-fly by BLAS at no extra cost, and so on. These are > currently not possible with pure NumPy without a copy, which is a pretty > big defect IMO (and one reason I'd call BLAS myself using Cython rather > than use np.dot...) I was skeptical about this at first on the grounds that yeah, it'd be nice if at some point we allowed for on-the-fly transformations, it isn't happening anytime soon. But on second thought, we actually could implement this pretty easily -- just define a new dtype "conjcomplex" that stores the value x+iy as two doubles (x, -y). Then complex_arr.view(conjcomplex) would preserve memory contents but invert the numeric sign of all imaginary components, while complex_arr.astype(conjcomplex) would preserve numeric value but alter the memory representation. Because this latter cast is safe, all the existing ufuncs would automatically work fine on conjcomplex arrays. But we could also define conjcomplex-specific ufunc loops for cases like dot() where a more efficient implementation is possible (using the above-mentioned BLAS flags). Don't know if we want to actually do this, but it's doable. (I don't have any in-principle objection to .H(), but won't it just lead to more threads complaining about how confusing it is that .T and .H() are different?) -n From stefan at sun.ac.za Tue Jul 23 10:54:23 2013 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Tue, 23 Jul 2013 16:54:23 +0200 Subject: [Numpy-discussion] add .H attribute? In-Reply-To: References: <51D98902.1090403@gmail.com> <51E17280.2030105@gmail.com> <51ED9100.8040108@gmail.com> <51EE4044.5060509@astro.uio.no> Message-ID: On Tue, Jul 23, 2013 at 4:51 PM, Nathaniel Smith wrote: > Don't know if we want to actually do this, but it's doable. Would we need a matching conjugate data-type for each complex data-type then, or can the data-type be "parameterized"? 
Stéfan From ben.root at ou.edu Tue Jul 23 10:57:58 2013 From: ben.root at ou.edu (Benjamin Root) Date: Tue, 23 Jul 2013 10:57:58 -0400 Subject: [Numpy-discussion] .flat In-Reply-To: References: <51D98902.1090403@gmail.com> <51E17280.2030105@gmail.com> <51ED9100.8040108@gmail.com> <51EE4044.5060509@astro.uio.no> <51EE877C.4080504@gmail.com> Message-ID: On Tue, Jul 23, 2013 at 10:46 AM, Pauli Virtanen wrote: > 23.07.2013 17:34, Benjamin Root wrote: > [clip] > > Don't assume .flat is not commonly used. A common idiom in matlab is > > "a[:]" to flatten an array. When porting code over from matlab, it is > > typical to replace that with either "a.flat" or "a.flatten()", depending > > on whether an iterator or an array is needed. > > It is much more rarely used than `ravel()` and `flatten()`, as can be > verified by grepping e.g. the matplotlib source code. > > The matplotlib source code is not a port from Matlab, so grepping that wouldn't prove anything. Meanwhile, the "NumPy for Matlab users" page notes that a.flatten() makes a copy. A newbie to NumPy would then (correctly) look up the documentation for a.flatten() and see in the "See Also" section that "a.flat" is just an iterator rather than a copy, and would often use that to avoid the copy. Cheers! Ben Root From pav at iki.fi Tue Jul 23 11:02:13 2013 From: pav at iki.fi (Pauli Virtanen) Date: Tue, 23 Jul 2013 18:02:13 +0300 Subject: [Numpy-discussion] add .H attribute? In-Reply-To: References: <51D98902.1090403@gmail.com> <51E17280.2030105@gmail.com> <51ED9100.8040108@gmail.com> <51EE4044.5060509@astro.uio.no> Message-ID: 23.07.2013 17:51, Nathaniel Smith wrote: [clip: conjcomplex dtype] > Because this latter cast is safe, all the existing ufuncs would > automatically work fine on conjcomplex arrays. But we could also > define conjcomplex-specific ufunc loops for cases like dot() where a > more efficient implementation is possible (using the above-mentioned > BLAS flags). > > Don't know if we want to actually do this, but it's doable. There's quite a lot of 3rd party code that doesn't do automatic casting (e.g. all of Cython code interfacing with Numpy, C extensions, f2py I think), but rather fails for incompatible input dtypes. Having arrays with a new complex dtype around would require changes in this sort of code. In this sense having an iterator of some sort with an __array__ attribute would work. However, an iterator doesn't support (without a lot of work) the various ndarray attributes, which would be confusing. -- Pauli Virtanen From nouiz at nouiz.org Tue Jul 23 11:10:51 2013 From: nouiz at nouiz.org (Frédéric Bastien) Date: Tue, 23 Jul 2013 11:10:51 -0400 Subject: [Numpy-discussion] Allow == and != to raise errors In-Reply-To: <1374589784.13486.32.camel@sebastian-laptop> References: <1373632688.13968.13.camel@sebastian-laptop> <1374589784.13486.32.camel@sebastian-laptop> Message-ID: I'm mixed, because I see the good value, but I'm not able to guess the consequences of the interface change. So doing your FutureWarning would allow us to gather some data about this, and if it seems to cause too many problems, we could cancel the change. Also, if there is some software that depends on the old behaviour, this will cause a crash (except if it has a catch-all Exception handler), which is not a bad result. I think it is always hard to predict the consequences of an interface change in NumPy.
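(For readers skimming the archive, a small illustration of the behaviour under discussion; the exact output is version dependent, but on the 1.7-era releases discussed in this thread the comparison collapses to a single Python bool instead of raising:)

    import numpy as np

    a = np.arange(3)
    b = np.arange(4)

    # Shapes that cannot be broadcast: rather than raising, == and !=
    # currently fall back to a scalar answer on these releases.
    print(a == b)   # False
    print(a != b)   # True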
To help measure it, we could make/as people to contribute to a collection of software that use NumPy with a good tests suites. We could test interface change on them by running there tests suites to try to have a guess of the impact of those change. What do you think of that? I think it was already discussed on the mailing list, but not acted upon. Fred On Tue, Jul 23, 2013 at 10:29 AM, Sebastian Berg wrote: > On Sat, 2013-07-13 at 11:28 -0400, josef.pktd at gmail.com wrote: > > On Sat, Jul 13, 2013 at 9:14 AM, Nathaniel Smith wrote: > > > > > I'm now +1 on the exception that Sebastian proposed. > > > > I like consistency, and having a more straightforward mental model of > > the numpy behavior for elementwise operations, that don't pretend > > sometimes to be "python" (when I'm doing array math), like this > > > > I am not sure what the result of this discussion is. As far as I see > Benjamin and Fr?d?ric were opposing and overall it seemed pretty mixed, > so unless you two changed your mind or say that it was just a small > personal preference I would drop it for now. > I obviously think the current behaviour is inconsistent to buggy and am > really only afraid of possibly breaking code out there. Which is why I > think I maybe should first add a FutureWarning if we decide on changing > it. > > Regards, > > Sebastian > > > >>> [1,2,3] < [1,2] > > False > > >>> [1,2,3] > [1,2] > > True > > > > Josef > > > > > > > > > > -n > > > _______________________________________________ > > > NumPy-Discussion mailing list > > > NumPy-Discussion at scipy.org > > > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Tue Jul 23 12:10:52 2013 From: njs at pobox.com (Nathaniel Smith) Date: Tue, 23 Jul 2013 17:10:52 +0100 Subject: [Numpy-discussion] add .H attribute? In-Reply-To: References: <51D98902.1090403@gmail.com> <51E17280.2030105@gmail.com> <51ED9100.8040108@gmail.com> <51EE4044.5060509@astro.uio.no> Message-ID: On 23 Jul 2013 16:03, "Pauli Virtanen" wrote: > > 23.07.2013 17:51, Nathaniel Smith kirjoitti: > [clip: conjcomplex dtype] > > Because this latter cast is safe, all the existing ufuncs would > > automatically work fine on conjcomplex arrays. But we could also > > define conjcomplex-specific ufunc loops for cases like dot() where a > > more efficient implementation is possible (using the above-mentioned > > BLAS flags). > > > > Don't know if we want to actually do this, but it's doable. > > There's somewhat a lot of 3rd party code that doesn't do automatic > casting (e.g. all of Cython code interfacing with Numpy, C extensions, > f2py I think), but rather fails for incompatible input dtypes. Having > arrays with a new complex dtype around would require changes in this > sort of code. > > In this sense having an iterator of some sort with an __array__ > attribute would work. However, an iterator doesn't support (without a > lot of work) the various ndarray attributes which would be confusing. 
Surely there's more code that handles unusual but correctly castable dtypes dtypes than there is code that handles custom iterator objects that are missing ndarray attributes? -n -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Tue Jul 23 12:13:40 2013 From: njs at pobox.com (Nathaniel Smith) Date: Tue, 23 Jul 2013 17:13:40 +0100 Subject: [Numpy-discussion] add .H attribute? In-Reply-To: References: <51D98902.1090403@gmail.com> <51E17280.2030105@gmail.com> <51ED9100.8040108@gmail.com> <51EE4044.5060509@astro.uio.no> Message-ID: On 23 Jul 2013 15:55, "St?fan van der Walt" wrote: > > On Tue, Jul 23, 2013 at 4:51 PM, Nathaniel Smith wrote: > > Don't know if we want to actually do this, but it's doable. > > Would we need a matching conjugate data-type for each complex > data-type then, or can the data-type be "parameterized"? Right now dtypes can't be parametrized. In this particular case it doesn't matter a whole lot anyway I think - you'd have to write basically the same code to handle different width complex types in either case, the difference is just whether that code got called at runtime or build time. -n -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Tue Jul 23 12:22:13 2013 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 23 Jul 2013 10:22:13 -0600 Subject: [Numpy-discussion] .flat In-Reply-To: References: <51D98902.1090403@gmail.com> <51E17280.2030105@gmail.com> <51ED9100.8040108@gmail.com> <51EE4044.5060509@astro.uio.no> <51EE877C.4080504@gmail.com> Message-ID: On Tue, Jul 23, 2013 at 8:46 AM, Pauli Virtanen wrote: > 23.07.2013 17:34, Benjamin Root kirjoitti: > [clip] > > Don't assume .flat is not commonly used. A common idiom in matlab is > > "a[:]" to flatten an array. When porting code over from matlab, it is > > typical to replace that with either "a.flat" or "a.flatten()", depending > > on whether an iterator or an array is needed. > > It is much more rarely used than `ravel()` and `flatten()`, as can be > verified by grepping e.g. the matplotlib source code. > Grepping in my code, I find a lot of things like dfx = van.dot((ax2 - ax1).flat) IIRC, the flat version was faster than other methods. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From sebastian at sipsolutions.net Tue Jul 23 12:35:54 2013 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Tue, 23 Jul 2013 18:35:54 +0200 Subject: [Numpy-discussion] .flat In-Reply-To: References: <51D98902.1090403@gmail.com> <51E17280.2030105@gmail.com> <51ED9100.8040108@gmail.com> <51EE4044.5060509@astro.uio.no> <51EE877C.4080504@gmail.com> Message-ID: <1374597354.13486.37.camel@sebastian-laptop> On Tue, 2013-07-23 at 10:22 -0600, Charles R Harris wrote: > > > On Tue, Jul 23, 2013 at 8:46 AM, Pauli Virtanen wrote: > 23.07.2013 17:34, Benjamin Root kirjoitti: > [clip] > > Don't assume .flat is not commonly used. A common idiom in > matlab is > > "a[:]" to flatten an array. When porting code over from > matlab, it is > > typical to replace that with either "a.flat" or > "a.flatten()", depending > > on whether an iterator or an array is needed. > > It is much more rarely used than `ravel()` and `flatten()`, as > can be > verified by grepping e.g. the matplotlib source code. > > Grepping in my code, I find a lot of things like > > dfx = van.dot((ax2 - ax1).flat) > > IIRC, the flat version was faster than other methods. 
> Faster than flatten certainly (since flatten forces a copy), I would be quite surprised if it is faster than ravel, and since dot can't make use of the iterator, that seems more natural to me. - Sebastian > Chuck > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From pav at iki.fi Tue Jul 23 12:36:08 2013 From: pav at iki.fi (Pauli Virtanen) Date: Tue, 23 Jul 2013 19:36:08 +0300 Subject: [Numpy-discussion] .flat In-Reply-To: References: <51D98902.1090403@gmail.com> <51E17280.2030105@gmail.com> <51ED9100.8040108@gmail.com> <51EE4044.5060509@astro.uio.no> <51EE877C.4080504@gmail.com> Message-ID: 23.07.2013 19:22, Charles R Harris wrote: [clip] > Grepping in my code, I find a lot of things like > > dfx = van.dot((ax2 - ax1).flat) > > IIRC, the flat version was faster than other methods. That goes through the same code path as `van.dot(np.asarray((ax2 - ax1).flat))`, which calls the `__array__` attribute of the flatiter object. If it's faster than .ravel(), that is surprising. -- Pauli Virtanen From charlesr.harris at gmail.com Tue Jul 23 13:05:06 2013 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 23 Jul 2013 11:05:06 -0600 Subject: [Numpy-discussion] .flat In-Reply-To: References: <51D98902.1090403@gmail.com> <51E17280.2030105@gmail.com> <51ED9100.8040108@gmail.com> <51EE4044.5060509@astro.uio.no> <51EE877C.4080504@gmail.com> Message-ID: On Tue, Jul 23, 2013 at 10:36 AM, Pauli Virtanen wrote: > 23.07.2013 19:22, Charles R Harris wrote: > [clip] > > Grepping in my code, I find a lot of things like > > > > dfx = van.dot((ax2 - ax1).flat) > > > > IIRC, the flat version was faster than other methods. > > That goes through the same code path as > `van.dot(np.asarray((ax2 - ax1).flat))`, which calls the `__array__` > attribute of the flatiter object. If it's faster than .ravel(), that is > surprising. > > Well, I never use ravel, there are zero examples in my code ;) So you may be correct. I'm not sure the example I gave is the one where '*.flat' wins, but I recall such a case and have just used flat a lot ever since.
> > Chuck just another survey scipy: ravel: 136 (including stats) flat: 6 flatten: 37 (not current master) statsmodels ravel: 137, flat: 0 flatten: 9 I only use ravel (what am I supposed to do with an iterator if I want a view?) (I think the equivalent of matlab x(:) is x.ravel("F") not flat or flatten) Josef > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From alan.isaac at gmail.com Tue Jul 23 13:53:54 2013 From: alan.isaac at gmail.com (Alan G Isaac) Date: Tue, 23 Jul 2013 13:53:54 -0400 Subject: [Numpy-discussion] add .H attribute? In-Reply-To: References: <51D98902.1090403@gmail.com> <51E17280.2030105@gmail.com> <51ED9100.8040108@gmail.com> <51EE4044.5060509@astro.uio.no> Message-ID: <51EEC332.7070805@gmail.com> I'm trying to understand the state of this discussion. I believe that propoents of adding a .H attribute have primarily emphasized - readability (and general ease of use) - consistency with matrix and masked array - forward looking (to a future when .H can be a view) The opponents have primarily emphasized - inconsistency with convention that for arrays instance attributes should return views Is this a correct summary? If it is correct, I believe the proponents' case is stronger. All the considerations are valid, so it is a matter of deciding how to weight them. The alternative of offering a new method seems inferior in terms of readability and consistency, and it is not adequately forward looking. If the alternative is nevertheless chosen, I suggest that it should definitely *not* be .H(), both because of the conflict with uses by matrix and masked array, and because I expect that eventually the desire for an attribute will win the day, and it would be a shame for the obvious notation to be lost. Alan Isaac From d.s.seljebotn at astro.uio.no Tue Jul 23 17:08:11 2013 From: d.s.seljebotn at astro.uio.no (Dag Sverre Seljebotn) Date: Tue, 23 Jul 2013 23:08:11 +0200 Subject: [Numpy-discussion] add .H attribute? In-Reply-To: <51EEC332.7070805@gmail.com> References: <51D98902.1090403@gmail.com> <51E17280.2030105@gmail.com> <51ED9100.8040108@gmail.com> <51EE4044.5060509@astro.uio.no> <51EEC332.7070805@gmail.com> Message-ID: <51EEF0BB.60508@astro.uio.no> On 07/23/2013 07:53 PM, Alan G Isaac wrote: > I'm trying to understand the state of this discussion. > I believe that propoents of adding a .H attribute have > primarily emphasized > > - readability (and general ease of use) > - consistency with matrix and masked array > - forward looking (to a future when .H can be a view) I disagree with this being forward looking, as it explicitly creates a situation where code will break if .H becomes a view, e.g.: xh = x.H x *= 2 assert np.all(2 * xh == x.H) > > The opponents have primarily emphasized > > - inconsistency with convention that for arrays > instance attributes should return views I'd formulate this as simply "inconsistency with .T"; they are both motivated primarily as notational shorthands. Dag Sverre From josef.pktd at gmail.com Tue Jul 23 17:32:52 2013 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Tue, 23 Jul 2013 17:32:52 -0400 Subject: [Numpy-discussion] add .H attribute? 
In-Reply-To: <51EEF0BB.60508@astro.uio.no> References: <51D98902.1090403@gmail.com> <51E17280.2030105@gmail.com> <51ED9100.8040108@gmail.com> <51EE4044.5060509@astro.uio.no> <51EEC332.7070805@gmail.com> <51EEF0BB.60508@astro.uio.no> Message-ID: On Tue, Jul 23, 2013 at 5:08 PM, Dag Sverre Seljebotn wrote: > On 07/23/2013 07:53 PM, Alan G Isaac wrote: >> I'm trying to understand the state of this discussion. >> I believe that propoents of adding a .H attribute have >> primarily emphasized >> >> - readability (and general ease of use) >> - consistency with matrix and masked array >> - forward looking (to a future when .H can be a view) > > I disagree with this being forward looking, as it explicitly creates a > situation where code will break if .H becomes a view, e.g.: > > xh = x.H > x *= 2 > assert np.all(2 * xh == x.H) > >> >> The opponents have primarily emphasized >> >> - inconsistency with convention that for arrays >> instance attributes should return views > > I'd formulate this as simply "inconsistency with .T"; they are both > motivated primarily as notational shorthands. Do we really need a one letter shorthand for `a.conj().T` ? I don't. Josef (The one who wrote np.max(np.abs(y - x)) and np.max(np.abs(y / x - 1)) 30 or more times in the last 24 hours, in pdb.) > > Dag Sverre > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From alan.isaac at gmail.com Tue Jul 23 18:22:15 2013 From: alan.isaac at gmail.com (Alan G Isaac) Date: Tue, 23 Jul 2013 18:22:15 -0400 Subject: [Numpy-discussion] add .H attribute? In-Reply-To: <51EEF0BB.60508@astro.uio.no> References: <51D98902.1090403@gmail.com> <51E17280.2030105@gmail.com> <51ED9100.8040108@gmail.com> <51EE4044.5060509@astro.uio.no> <51EEC332.7070805@gmail.com> <51EEF0BB.60508@astro.uio.no> Message-ID: <51EF0217.209@gmail.com> On 7/23/2013 5:08 PM, Dag Sverre Seljebotn wrote: > I disagree with this being forward looking, as it explicitly creates a > situation where code will break if .H becomes a view Well yes, we cannot have everything. Just like it is taking a while for ``diagonal`` to transition to providing a view, this would be true for .H when the time comes. Naturally, this would be documented (that it may change to a view). Just as it is documented with ``diagonal``. But it is nevertheless forward looking in an obvious sense: it provides access to an extremely convenient and much more readable notation that will in any case eventually be available. Also, the current context is the matrices and masked arrays have this attribute, so this transitional issue already exists. Out of curiosity: do you use NumPy to work with complex arrays? Alan From alan.isaac at gmail.com Tue Jul 23 18:30:42 2013 From: alan.isaac at gmail.com (Alan G Isaac) Date: Tue, 23 Jul 2013 18:30:42 -0400 Subject: [Numpy-discussion] add .H attribute? In-Reply-To: References: <51D98902.1090403@gmail.com> <51E17280.2030105@gmail.com> <51ED9100.8040108@gmail.com> <51EE4044.5060509@astro.uio.no> <51EEC332.7070805@gmail.com> <51EEF0BB.60508@astro.uio.no> Message-ID: <51EF0412.5030003@gmail.com> On 7/23/2013 5:32 PM, josef.pktd at gmail.com wrote: > Do we really need a one letter shorthand for `a.conj().T` ? One way to assess this would be to propose removing it from matrix and masked array objects. If the yelping is loud enough, there is apparently need. I suspect the yelping would be pretty loud. 
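(As a reminder of what those objects already provide -- a minimal check, assuming a current NumPy and SciPy, where both the matrix class and scipy.sparse matrices expose an .H property while plain ndarrays do not:)

    import numpy as np
    import scipy.sparse as sp

    m = np.matrix([[1 + 2j, 3j], [4, 5 - 1j]])
    print(m.H)               # conjugate transpose, available on matrix today

    s = sp.csr_matrix(m)
    print(s.H.toarray())     # scipy.sparse matrices offer the same shorthand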
Indeed, the reason I started this thread is that I'm using the matrix object less and less, and I definitely miss the .H attribute it offers. In any case, need is the wrong criterion. The question is, do the gains in readability, consistency (across objects), convenience, and advertising appeal (e.g., to those used to other languages) outweigh the costs? It's a cost benefit analysis. Obviously some people think the costs outweigh the benefits and others say they do not. We should look for a ways to determine which group has the better case. This discussion has made me much more inclined to believe it is a good idea to add this attribute. I agree that it would be an even better idea to add it as an iterative view, but nobody seems to feel that can happen quickly. Alan From alan.isaac at gmail.com Tue Jul 23 19:57:39 2013 From: alan.isaac at gmail.com (Alan G Isaac) Date: Tue, 23 Jul 2013 19:57:39 -0400 Subject: [Numpy-discussion] add .H attribute? In-Reply-To: <51EF078E.8050603@astro.uio.no> References: <51D98902.1090403@gmail.com> <51E17280.2030105@gmail.com> <51ED9100.8040108@gmail.com> <51EE4044.5060509@astro.uio.no> <51EEC332.7070805@gmail.com> <51EEF0BB.60508@astro.uio.no> <51EF0217.209@gmail.com> <51EF078E.8050603@astro.uio.no> Message-ID: <51EF1873.7050202@gmail.com> On 7/23/2013 6:45 PM, Dag Sverre Seljebotn wrote: > It'd be great if you could try to incorporate it to create a more balanced overview Attempt 2: I believe that propoents of adding a .H attribute have primarily emphasized - readability (and general ease of use, including in teaching) - consistency with matrix and masked array - forward looking (to a future when .H can be a view) in the following sense: it gives access now to the conjugate transpose via .H, which is likely to be implemented in the future, so as long as we document (as with ``diagonal``) that this may change, it gives a large chunk of the desired benefit now. The opponents have primarily emphasized - inconsistency with convention that for arrays instance attributes should return views - NOT forward looking (to a future when .H can be a view) in the following sense: it gives access now to the conjugate transpose via .H but NOT as a view, which is likely to be the preferred implementation in the future, and if the implementation changes in this preferred way then code that relied on behavior rather than documentation will break Finally, I think (?) everyone (proponents and opponents) would be happy if .H could provide access to an iterative view of the conjugate transpose. (Any objections?) Better? Alan From chris.barker at noaa.gov Tue Jul 23 20:15:25 2013 From: chris.barker at noaa.gov (Chris Barker - NOAA Federal) Date: Tue, 23 Jul 2013 17:15:25 -0700 Subject: [Numpy-discussion] add .H attribute? In-Reply-To: References: <51D98902.1090403@gmail.com> <51E17280.2030105@gmail.com> <51ED9100.8040108@gmail.com> <51EE4044.5060509@astro.uio.no> Message-ID: On Tue, Jul 23, 2013 at 6:09 AM, Pauli Virtanen wrote: > The .H property has been implemented in Numpy matrices and Scipy's > sparse matrices for many years. Then we're done. Numpy is an array package, NOT a matrix package, and while you can implement matrix math with arrays (and we do), having quick and easy mnemonics for common matrix math operations (but uncommon general purpose array operations) is not eh job of numpy. That's what the matrix object is for. 
Yes, I know the matrix object isn't really what it should be, and doesn't get much use, but if you want something that is natural for doing matrix math, and particularly natural for teaching it -- that's what it's for -- work to make it what it could be, rather than polluting numpy with this stuff. One of the things I've loved about numpy after moving from MATLAB is that matrixes are second-class citizens, not the other way around. (OK, I'll go away now....) -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From stefan at sun.ac.za Wed Jul 24 02:53:56 2013 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Wed, 24 Jul 2013 08:53:56 +0200 Subject: [Numpy-discussion] add .H attribute? In-Reply-To: References: <51D98902.1090403@gmail.com> <51E17280.2030105@gmail.com> <51ED9100.8040108@gmail.com> <51EE4044.5060509@astro.uio.no> Message-ID: On Wed, Jul 24, 2013 at 2:15 AM, Chris Barker - NOAA Federal wrote: > > On Tue, Jul 23, 2013 at 6:09 AM, Pauli Virtanen wrote: > > > The .H property has been implemented in Numpy matrices and Scipy's > > sparse matrices for many years. > > Then we're done. Numpy is an array package, NOT a matrix package, and > while you can implement matrix math with arrays (and we do), having > quick and easy mnemonics for common matrix math operations (but > uncommon general purpose array operations) is not eh job of numpy. > That's what the matrix object is for. I would argue that the ship sailed when we added .T already. Most users see no difference between the addition of .T and .H. The matrix class should probably be deprecated and removed from NumPy in the long run--being a second class citizen not used by the developers themselves is not sustainable. And, now that we have "dot" as a method, there's very little advantage to it. St?fan From seb.haase at gmail.com Wed Jul 24 03:15:56 2013 From: seb.haase at gmail.com (Sebastian Haase) Date: Wed, 24 Jul 2013 09:15:56 +0200 Subject: [Numpy-discussion] add .H attribute? In-Reply-To: References: <51D98902.1090403@gmail.com> <51E17280.2030105@gmail.com> <51ED9100.8040108@gmail.com> <51EE4044.5060509@astro.uio.no> Message-ID: On Wed, Jul 24, 2013 at 8:53 AM, St?fan van der Walt wrote: > On Wed, Jul 24, 2013 at 2:15 AM, Chris Barker - NOAA Federal > wrote: >> >> On Tue, Jul 23, 2013 at 6:09 AM, Pauli Virtanen wrote: >> >> > The .H property has been implemented in Numpy matrices and Scipy's >> > sparse matrices for many years. >> >> Then we're done. Numpy is an array package, NOT a matrix package, and >> while you can implement matrix math with arrays (and we do), having >> quick and easy mnemonics for common matrix math operations (but >> uncommon general purpose array operations) is not eh job of numpy. >> That's what the matrix object is for. > > I would argue that the ship sailed when we added .T already. Most > users see no difference between the addition of .T and .H. > > The matrix class should probably be deprecated and removed from NumPy > in the long run--being a second class citizen not used by the > developers themselves is not sustainable. And, now that we have "dot" > as a method, there's very little advantage to it. > > St?fan Maybe this is the point where one just needs to do a poll. And finally someone has to make the decision. I feel that adding a method .H() would be the compromise ! 
Alan, could you live with that ? It is short enough and still emphasises the fact that it is NOT a view and therefore behaves sensitively different in certain scenarios as .T . It also leaves the door open to adding an iterator .H attribute later on without introducing the above mentioned code breaks. Who could make (i.e. is willing to make) the decision ? (( I would not open the discussion about ndarray vs. matrix -- it gets far to involving and we would be talking about far-future directions instead of "a single letter addition", which abvious already has big enough support and had so years ago)) Regards, Sebastian Haase From stefan at sun.ac.za Wed Jul 24 03:30:55 2013 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Wed, 24 Jul 2013 09:30:55 +0200 Subject: [Numpy-discussion] add .H attribute? In-Reply-To: References: <51D98902.1090403@gmail.com> <51E17280.2030105@gmail.com> <51ED9100.8040108@gmail.com> <51EE4044.5060509@astro.uio.no> Message-ID: On Wed, Jul 24, 2013 at 9:15 AM, Sebastian Haase wrote: > I feel that adding a method > .H() > would be the compromise ! Thinking about this more, I think it would just confuse most users... why .T and not .H; then you have to start explaining the underlying implementation detail. For users who already understand the implementation detail, finding .T.conj() would not be too hard. > (( I would not open the discussion about ndarray vs. matrix -- it gets > far to involving and we would be talking about far-future directions > instead of "a single letter addition", which abvious already has big > enough support and had so years ago)) I am willing to write up a NEP if there's any interest. The plan would be to remove the Matrix class from numpy over two or three releases, and publish it as a separate package on PyPi. St?fan From josef.pktd at gmail.com Wed Jul 24 03:40:58 2013 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 24 Jul 2013 03:40:58 -0400 Subject: [Numpy-discussion] add .H attribute? In-Reply-To: References: <51D98902.1090403@gmail.com> <51E17280.2030105@gmail.com> <51ED9100.8040108@gmail.com> <51EE4044.5060509@astro.uio.no> Message-ID: I think a H is feature creep and too specialized What's .H of a int a str a bool ? It's just .T and a view, so you cannot rely that conj() makes a copy if you don't work with complex. .T is just a reshape function and has **nothing** to do with matrix algebra. >>> x = np.arange(12).reshape(3,4) >>> x array([[ 0, 1, 2, 3], [ 4, 5, 6, 7], [ 8, 9, 10, 11]]) >>> np.may_share_memory(x, x.T) True >>> np.may_share_memory(x, x.conj()) True >>> y = x + 1j >>> np.may_share_memory(y, y.conj()) False >>> y.dtype dtype('complex128') >>> x.conj().dtype dtype('int32') Josef On Wed, Jul 24, 2013 at 3:30 AM, St?fan van der Walt wrote: > On Wed, Jul 24, 2013 at 9:15 AM, Sebastian Haase wrote: >> I feel that adding a method >> .H() >> would be the compromise ! > > Thinking about this more, I think it would just confuse most users... > why .T and not .H; then you have to start explaining the underlying > implementation detail. For users who already understand the > implementation detail, finding .T.conj() would not be too hard. > >> (( I would not open the discussion about ndarray vs. matrix -- it gets >> far to involving and we would be talking about far-future directions >> instead of "a single letter addition", which abvious already has big >> enough support and had so years ago)) > > I am willing to write up a NEP if there's any interest. 
The plan > would be to remove the Matrix class from numpy over two or three > releases, and publish it as a separate package on PyPi. > > St?fan > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From josef.pktd at gmail.com Wed Jul 24 04:05:24 2013 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 24 Jul 2013 04:05:24 -0400 Subject: [Numpy-discussion] add .H attribute? In-Reply-To: <51EE4044.5060509@astro.uio.no> References: <51D98902.1090403@gmail.com> <51E17280.2030105@gmail.com> <51ED9100.8040108@gmail.com> <51EE4044.5060509@astro.uio.no> Message-ID: On Tue, Jul 23, 2013 at 4:35 AM, Dag Sverre Seljebotn wrote: ... > There's lots of uses for A.H to be a conjugating-view, e.g., np.dot(A.H, > A) can be done on-the-fly by BLAS at no extra cost, and so on. These are > currently not possible with pure NumPy without a copy, which is a pretty > big defect IMO (and one reason I'd call BLAS myself using Cython rather > than use np.dot...) Wouldn't the simpler way not just be to expose those linalg functions? hdot(X, Y) == dot(X.T, Y) (if not complex) == dot(X.H, Y) (if complex) Josef From dave.hirschfeld at gmail.com Wed Jul 24 04:23:09 2013 From: dave.hirschfeld at gmail.com (Dave Hirschfeld) Date: Wed, 24 Jul 2013 08:23:09 +0000 (UTC) Subject: [Numpy-discussion] add .H attribute? References: <51D98902.1090403@gmail.com> <51E17280.2030105@gmail.com> <51ED9100.8040108@gmail.com> <51EE4044.5060509@astro.uio.no> Message-ID: gmail.com> writes: > > I think a H is feature creep and too specialized > > What's .H of a int a str a bool ? > > It's just .T and a view, so you cannot rely that conj() makes a copy > if you don't work with complex. > > .T is just a reshape function and has **nothing** to do with matrix algebra. > It seems to me that that ship has already sailed - i.e. conj doesn't make much sense for str arrays, but it still works in the sense that it's a nop In [16]: A = asarray(list('abcdefghi')).reshape(3,3) ...: np.all(A.T == A.conj().T) ...: Out[16]: True If we're voting my vote goes to add the .H attribute for all the reasons Alan has specified. Document that it returns a copy but that it may in future return a view so it it not future proof to operate on the result inplace. I'm -1 on .H() as it will require code changes if it ever changes to a property and it will simply result in questions about why .T is a property and .H is a function (and why it's a property for (sparse) matrices) Regarding Dag's example: xh = x.H x *= 2 assert np.all(2 * xh == x.H) I'm sceptical that there's much code out there actually relying on the fact that a transpose is a view with the specified intention of altering the original array inplace. I work with a lot of beginners and whenever I've seen them operate inplace on a transpose it has been a bug in the code, leading to a discussion of how, for performance reasons, numpy will return a view where possible, leading to yet further discussion of when it is and isn't possible to return a view. The third option of .H returning a view would probably be agreeable to everyone but I don't think we should punt on this decision for something that if it does happen is likely years away. It seems that work on this front is happening in different projects to numpy. 
Even if for example sometime in the future numpy's internals were replaced with libdynd or other expression graph engine surely this would result in more breaking changes than .H returning a view rather than a copy?! IANAD so I'm happy with whatever the consensus is I just thought I'd put forward the view from a (specific type of) user perspective. Regards, Dave From klemm at phys.ethz.ch Wed Jul 24 05:46:25 2013 From: klemm at phys.ethz.ch (Hanno Klemm) Date: Wed, 24 Jul 2013 11:46:25 +0200 Subject: [Numpy-discussion] Question regarding documentation of structured arrays Message-ID: <45e848b3afcc062d96726186d2cb827b@phys.ethz.ch> Hi, I found the following inconsistency between the advertised and the actual behviour of structured arrays: on http://docs.scipy.org/doc/numpy/user/basics.rec.html it says in the section "Accessing multiple fields at once" Notice that the fields are always returned in the same order regardless of the sequence they are asked for. Fortunately that does not seem to be the case in my simple test (see below). Is that a change in behaviour I can rely on or am I somehow lucky in this particular example? Thanks, Hanno In [596]: test_array = np.ones((10),dtype=[('a', float), ('b',float)]) In [597]: test_array Out[597]: array([(1.0, 1.0), (1.0, 1.0), (1.0, 1.0), (1.0, 1.0), (1.0, 1.0), (1.0, 1.0), (1.0, 1.0), (1.0, 1.0), (1.0, 1.0), (1.0, 1.0)], dtype=[('a', ' References: <45e848b3afcc062d96726186d2cb827b@phys.ethz.ch> Message-ID: Hi Hanno On Wed, Jul 24, 2013 at 11:46 AM, Hanno Klemm wrote: > I found the following inconsistency between the advertised and the > actual behviour of structured arrays: > > on http://docs.scipy.org/doc/numpy/user/basics.rec.html it says in the > section > > "Accessing multiple fields at once" > Notice that the fields are always returned in the same order regardless > of the sequence they are asked for. I can confirm the behavior you see under the latest development version. Would you mind filing a pull request against the docs? St?fan From njs at pobox.com Wed Jul 24 06:54:28 2013 From: njs at pobox.com (Nathaniel Smith) Date: Wed, 24 Jul 2013 11:54:28 +0100 Subject: [Numpy-discussion] add .H attribute? In-Reply-To: References: <51D98902.1090403@gmail.com> <51E17280.2030105@gmail.com> <51ED9100.8040108@gmail.com> <51EE4044.5060509@astro.uio.no> Message-ID: On Wed, Jul 24, 2013 at 9:23 AM, Dave Hirschfeld wrote: > If we're voting my vote goes to add the .H attribute for all the reasons > Alan has specified. Document that it returns a copy but that it may in > future return a view so it it not future proof to operate on the result > inplace. As soon as you talk about attributes "returning" things you've already broken Python's mental model... attributes are things that sit there, not things that execute arbitrary code. Of course this is not how the actual implementation works, attribute access *can* in fact execute arbitrary code, but the mental model is important, so we should preserve it where-ever we can. Just mentioning an attribute should not cause unbounded memory allocations. Consider these two expressions: x = solve(dot(arr, arr.T), arr.T) x = solve(dot(arr, arr.H), arr.H) Mathematically, they're very similar, and the mathematics-like notation does a good job of expressing that similarity while hiding mathematically irrelevant details. Which is what mathematical notation is for. 
But numpy isn't a toolkit for writing mathematical formula, it's a toolkit for writing computational algorithms that implement mathematical formula, and algorithmically, those two expressions are radically different. The first one allocates one temporary (the result from 'dot'); the second one allocates 3 temporaries. The second one is gratuitously inefficient, since two of those temporaries are identical, but they're being computed twice anyway. > I'm sceptical that there's much code out there actually relying on the fact > that a transpose is a view with the specified intention of altering the > original array inplace. > > I work with a lot of beginners and whenever I've seen them operate inplace > on a transpose it has been a bug in the code, leading to a discussion of > how, for performance reasons, numpy will return a view where possible, > leading to yet further discussion of when it is and isn't possible to return > a view. The point isn't that there's code that relies specifically on .T returning a view. It's that to be a good programmer, you need to *know whether* it returns a view -- exactly as you say in the second paragraph. And a library should not hide these kinds of details. -n From njs at pobox.com Wed Jul 24 06:56:38 2013 From: njs at pobox.com (Nathaniel Smith) Date: Wed, 24 Jul 2013 11:56:38 +0100 Subject: [Numpy-discussion] add .H attribute? In-Reply-To: References: <51D98902.1090403@gmail.com> <51E17280.2030105@gmail.com> <51ED9100.8040108@gmail.com> <51EE4044.5060509@astro.uio.no> Message-ID: On Wed, Jul 24, 2013 at 8:30 AM, St?fan van der Walt wrote: > I am willing to write up a NEP if there's any interest. The plan > would be to remove the Matrix class from numpy over two or three > releases, and publish it as a separate package on PyPi. Please do! There are some sticky issues to work through (e.g. how to deprecate the "matrix" entry in the numpy namespace, what to do with scipy.sparse), and I don't know whether we'll decide to go through with it in the end, but the way to figure that out is to, you know, work through them :-). -n From dave.hirschfeld at gmail.com Wed Jul 24 07:08:29 2013 From: dave.hirschfeld at gmail.com (Dave Hirschfeld) Date: Wed, 24 Jul 2013 11:08:29 +0000 (UTC) Subject: [Numpy-discussion] add .H attribute? References: <51D98902.1090403@gmail.com> <51E17280.2030105@gmail.com> <51ED9100.8040108@gmail.com> <51EE4044.5060509@astro.uio.no> Message-ID: Nathaniel Smith pobox.com> writes: > > > As soon as you talk about attributes "returning" things you've already > broken Python's mental model... attributes are things that sit there, > not things that execute arbitrary code. Of course this is not how the > actual implementation works, attribute access *can* in fact execute > arbitrary code, but the mental model is important, so we should > preserve it where-ever we can. Just mentioning an attribute should not > cause unbounded memory allocations. > Yep, sorry - sloppy use of terminology which I agree is important in helping understand what's happening. -Dave From klemm at phys.ethz.ch Wed Jul 24 07:29:36 2013 From: klemm at phys.ethz.ch (Hanno Klemm) Date: Wed, 24 Jul 2013 13:29:36 +0200 Subject: [Numpy-discussion] Question regarding documentation of structured arrays In-Reply-To: References: <45e848b3afcc062d96726186d2cb827b@phys.ethz.ch> Message-ID: <0b2c3a3a2e18264ff5bc72e0e16b0d14@phys.ethz.ch> Hi Stefan, I would be happy to file a pull request against the docs if you (or somebody) could point me to a document on how and where to do that. 
Hanno On 24.07.2013 12:31, St?fan van der Walt wrote: > Hi Hanno > > On Wed, Jul 24, 2013 at 11:46 AM, Hanno Klemm > wrote: >> I found the following inconsistency between the advertised and the >> actual behviour of structured arrays: >> >> on http://docs.scipy.org/doc/numpy/user/basics.rec.html it says in the >> section >> >> "Accessing multiple fields at once" >> Notice that the fields are always returned in the same order >> regardless >> of the sequence they are asked for. > > I can confirm the behavior you see under the latest development > version. Would you mind filing a pull request against the docs? > > St?fan > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion -- Hanno Klemm klemm at phys.ethz.ch From davidmenhur at gmail.com Wed Jul 24 08:47:59 2013 From: davidmenhur at gmail.com (=?UTF-8?B?RGHPgGlk?=) Date: Wed, 24 Jul 2013 14:47:59 +0200 Subject: [Numpy-discussion] add .H attribute? In-Reply-To: References: <51D98902.1090403@gmail.com> <51E17280.2030105@gmail.com> <51ED9100.8040108@gmail.com> <51EE4044.5060509@astro.uio.no> Message-ID: An idea: If .H is ideally going to be a view, and we want to keep it this way, we could have a .h() method with the present implementation. This would preserve the name .H for the conjugate view --when someone finds the way to do it. This way we would increase the readability, simplify some matrix algebra code, and keep the API consistency. On 24 July 2013 13:08, Dave Hirschfeld wrote: > Nathaniel Smith pobox.com> writes: > >> >> >> As soon as you talk about attributes "returning" things you've already >> broken Python's mental model... attributes are things that sit there, >> not things that execute arbitrary code. Of course this is not how the >> actual implementation works, attribute access *can* in fact execute >> arbitrary code, but the mental model is important, so we should >> preserve it where-ever we can. Just mentioning an attribute should not >> cause unbounded memory allocations. >> > > Yep, sorry - sloppy use of terminology which I agree is important in helping > understand what's happening. > > -Dave > > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From alan.isaac at gmail.com Wed Jul 24 08:58:34 2013 From: alan.isaac at gmail.com (Alan G Isaac) Date: Wed, 24 Jul 2013 08:58:34 -0400 Subject: [Numpy-discussion] add .H attribute? In-Reply-To: References: <51D98902.1090403@gmail.com> <51E17280.2030105@gmail.com> <51ED9100.8040108@gmail.com> <51EE4044.5060509@astro.uio.no> Message-ID: <51EFCF7A.6030805@gmail.com> On 7/24/2013 3:15 AM, Sebastian Haase wrote: > I feel that adding a method > .H() > would be the compromise ! > > Alan, could you live with that ? I feel .H() now would get in the way of a .H attribute later, which some have indicated could be added as an iterative view in a future numpy. I'd rather wait for that. My assessment of the conversation so far: there is not adequate support for a .H attribute until it can be an iterative view. I believe that almost everyone (possibly not Josef) would accept or want a .H attribute if it could provide an iterative view. (Is that correct?) So I'll drop out of the conversation, but I hope the user interest that has been displayed stimulates interest in that feature request. Thanks to everyone who shared their perspective on this issue. 
And my apologies to those (e.g., Dag) whom I annoyed by being too bullheaded. Cheers, Alan From ben.root at ou.edu Wed Jul 24 09:54:52 2013 From: ben.root at ou.edu (Benjamin Root) Date: Wed, 24 Jul 2013 09:54:52 -0400 Subject: [Numpy-discussion] add .H attribute? In-Reply-To: References: <51D98902.1090403@gmail.com> <51E17280.2030105@gmail.com> <51ED9100.8040108@gmail.com> <51EE4044.5060509@astro.uio.no> Message-ID: On Wed, Jul 24, 2013 at 8:47 AM, Da?id wrote: > An idea: > > If .H is ideally going to be a view, and we want to keep it this way, > we could have a .h() method with the present implementation. This > would preserve the name .H for the conjugate view --when someone finds > the way to do it. > > This way we would increase the readability, simplify some matrix > algebra code, and keep the API consistency. > > I could get behind a .h() method until .H attribute is ready. +1 Cheers! Ben Root -------------- next part -------------- An HTML attachment was scrubbed... URL: From stefan at sun.ac.za Wed Jul 24 10:57:01 2013 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Wed, 24 Jul 2013 16:57:01 +0200 Subject: [Numpy-discussion] Question regarding documentation of structured arrays In-Reply-To: <0b2c3a3a2e18264ff5bc72e0e16b0d14@phys.ethz.ch> References: <45e848b3afcc062d96726186d2cb827b@phys.ethz.ch> <0b2c3a3a2e18264ff5bc72e0e16b0d14@phys.ethz.ch> Message-ID: Hallo Hanno On Wed, Jul 24, 2013 at 1:29 PM, Hanno Klemm wrote: > I would be happy to file a pull request against the docs if you (or > somebody) could point me to a document on how and where to do that. The file you want to edit is here: https://github.com/numpy/numpy/blob/master/numpy/doc/structured_arrays.py#L194 You can click on the "edit" button, then GitHub will help you to make a pull request. Thanks! St?fan From lists at onerussian.com Wed Jul 24 11:00:37 2013 From: lists at onerussian.com (Yaroslav Halchenko) Date: Wed, 24 Jul 2013 11:00:37 -0400 Subject: [Numpy-discussion] fresh performance boosts and elderly hits e.g. identity, ones In-Reply-To: <20130719220754.GR27621@onerussian.com> References: <20130506143241.GV5140@onerussian.com> <1367856232.2506.31.camel@sebastian-laptop> <20130506161153.GW5140@onerussian.com> <1367927238.23010.12.camel@sebastian-laptop> <20130701193006.GC27621@onerussian.com> <20130701215804.GG27621@onerussian.com> <20130709161007.GL27621@onerussian.com> <20130717035348.GN27621@onerussian.com> <20130719220754.GR27621@onerussian.com> Message-ID: <20130724150037.GW27621@onerussian.com> Added some basic constructors benchmarks: http://www.onerussian.com/tmp/numpy-vbench/vb_vb_core.html quite a bit of fresh enhancements are present (cool) but also some freshly discovered elderly hits, e.g. http://www.onerussian.com/tmp/numpy-vbench/vb_vb_core.html#numpy-identity-100 http://www.onerussian.com/tmp/numpy-vbench/vb_vb_core.html#numpy-ones-100 Cheers, On Fri, 19 Jul 2013, Yaroslav Halchenko wrote: > I have just added a few more benchmarks, and here they come > http://www.onerussian.com/tmp/numpy-vbench/vb_vb_linalg.html#numpy-linalg-pinv-a-float32 > it seems to be very recent so my only check based on 10 commits > didn't pick it up yet so they are not present in the summary table. > could well be related to 80% faster det()? ;) > norm was hit as well a bit earlier, might well be within these commits: > https://github.com/numpy/numpy/compare/24a0aa5...29dcc54 > I will rerun now benchmarking for the rest of commits (was running last > in the day iirc) > Cheers, -- Yaroslav O. 
Halchenko, Ph.D. http://neuro.debian.net http://www.pymvpa.org http://www.fail2ban.org Senior Research Associate, Psychological and Brain Sciences Dept. Dartmouth College, 419 Moore Hall, Hinman Box 6207, Hanover, NH 03755 Phone: +1 (603) 646-9834 Fax: +1 (603) 646-1419 WWW: http://www.linkedin.com/in/yarik From stefan at sun.ac.za Wed Jul 24 11:02:52 2013 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Wed, 24 Jul 2013 17:02:52 +0200 Subject: [Numpy-discussion] add .H attribute? In-Reply-To: References: <51D98902.1090403@gmail.com> <51E17280.2030105@gmail.com> <51ED9100.8040108@gmail.com> <51EE4044.5060509@astro.uio.no> Message-ID: On Wed, Jul 24, 2013 at 12:54 PM, Nathaniel Smith wrote: > The point isn't that there's code that relies specifically on .T > returning a view. It's that to be a good programmer, you need to *know > whether* it returns a view -- exactly as you say in the second > paragraph. And a library should not hide these kinds of details. After listening to the arguments by yourself and Dag, I think I buy into the idea that we should hold off on this until we have ufunc views or something similar implemented. Also, if we split off the matrix package, we can give other people who really care about that (perhaps Alan is interested?) ownership, and let them run with it (I mainly use ndarrays myself). St?fan From chris.barker at noaa.gov Wed Jul 24 11:24:00 2013 From: chris.barker at noaa.gov (Chris Barker - NOAA Federal) Date: Wed, 24 Jul 2013 08:24:00 -0700 Subject: [Numpy-discussion] add .H attribute? In-Reply-To: References: <51D98902.1090403@gmail.com> <51E17280.2030105@gmail.com> <51ED9100.8040108@gmail.com> <51EE4044.5060509@astro.uio.no> Message-ID: <-132241265923687312@unknownmsgid> On Jul 23, 2013, at 11:54 PM, "St?fan van der Walt" wrote: >>> The .H property has been implemented in Numpy matrices and Scipy's >>> sparse matrices for many years. >> >> Then we're done. Numpy is an array package, NOT a matrix package, and >> while you can implement matrix math with arrays (and we do), having >> quick and easy mnemonics for common matrix math operations (but >> uncommon general purpose array operations) is not eh job of numpy. >> That's what the matrix object is for. > > I would argue that the ship sailed when we added .T already. Most > users see no difference between the addition of .T and .H. I don't know who can speak for "most users", but I see them quite differently. Transposing is a common operation outside of linear algebra--I, for one, use it to work with image arrays, which are often stored in a way by image libraries that is the transpose of the "natural" numpy way. But anyway, just because we have one domain-specific convenience attribute, doesn't mean we should add them all. > The matrix class should probably be deprecated and removed from NumPy > in the long run--being a second class citizen not used by the > developers themselves is not sustainable. I agree, but the fact that no one has stepped up to maintain and improve it tells me that there is not a very large community that wants a clean linear algebra interface, not that we should try to build such an interface directly into numpy. Is there really a point to a clean interface to the Hermetian transpose, but not plain old matrix multiply? 
> And, now that we have "dot" > as a method, Agh, but "dot" is a method--so we still don't have a clean relationship with the math in text books: AB => A.dot(B) Anyway, adding .H is clearly not a big deal, I just don't think it's going to satisfy anyone anyway. -Chris From chris.barker at noaa.gov Wed Jul 24 11:29:16 2013 From: chris.barker at noaa.gov (Chris Barker - NOAA Federal) Date: Wed, 24 Jul 2013 08:29:16 -0700 Subject: [Numpy-discussion] add .H attribute? In-Reply-To: References: <51D98902.1090403@gmail.com> <51E17280.2030105@gmail.com> <51ED9100.8040108@gmail.com> <51EE4044.5060509@astro.uio.no> Message-ID: <-3986359840241221136@unknownmsgid> >> >> plan >> would be to remove the Matrix class from numpy over two or three >> releases, and publish it as a separate package on PyPi. Anyone willing to take ownership of it? Maybe we should still do it of not-- at least it will make it clear that it is orphaned. Though one plus to having matrix in numpy is that it was a testbed for ndarray subclassing... -Chris > Please do! There are some sticky issues to work through (e.g. how to > deprecate the "matrix" entry in the numpy namespace, what to do with > scipy.sparse), and I don't know whether we'll decide to go through > with it in the end, but the way to figure that out is to, you know, > work through them :-). > > -n > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From pav at iki.fi Wed Jul 24 11:33:16 2013 From: pav at iki.fi (Pauli Virtanen) Date: Wed, 24 Jul 2013 18:33:16 +0300 Subject: [Numpy-discussion] Splitting numpydoc to a separate repo Message-ID: Hi, How about splitting doc/sphinxext out from the main Numpy repository to a separate `numpydoc` repo under Numpy project? It's a separate Python package, after all. Moreover, this would make it easier to use it as a git submodule (e.g. in Scipy). Moreover, its release cycle is not in any way tied to that of Numpy. Pauli From stefan at sun.ac.za Wed Jul 24 11:35:24 2013 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Wed, 24 Jul 2013 17:35:24 +0200 Subject: [Numpy-discussion] Splitting numpydoc to a separate repo In-Reply-To: References: Message-ID: On Wed, Jul 24, 2013 at 5:33 PM, Pauli Virtanen wrote: > How about splitting doc/sphinxext out from the main Numpy repository to > a separate `numpydoc` repo under Numpy project? That would be great, also for scikits that rely on these extensions. St?fan From robert.kern at gmail.com Wed Jul 24 11:35:42 2013 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 24 Jul 2013 16:35:42 +0100 Subject: [Numpy-discussion] Splitting numpydoc to a separate repo In-Reply-To: References: Message-ID: On Wed, Jul 24, 2013 at 4:33 PM, Pauli Virtanen wrote: > > Hi, > > How about splitting doc/sphinxext out from the main Numpy repository to > a separate `numpydoc` repo under Numpy project? > > It's a separate Python package, after all. Moreover, this would make it > easier to use it as a git submodule (e.g. in Scipy). Moreover, its > release cycle is not in any way tied to that of Numpy. Works for me. -- Robert Kern -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From lists at onerussian.com Wed Jul 24 11:36:58 2013 From: lists at onerussian.com (Yaroslav Halchenko) Date: Wed, 24 Jul 2013 11:36:58 -0400 Subject: [Numpy-discussion] Splitting numpydoc to a separate repo In-Reply-To: References: Message-ID: <20130724153658.GX27621@onerussian.com> On Wed, 24 Jul 2013, Pauli Virtanen wrote: > How about splitting doc/sphinxext out from the main Numpy repository to > a separate `numpydoc` repo under Numpy project? +1 > It's a separate Python package, after all. Moreover, this would make it > easier to use it as a git submodule (e.g. in Scipy). Moreover, its > release cycle is not in any way tied to that of Numpy. yeap -- it has a life of its own -- Yaroslav O. Halchenko, Ph.D. http://neuro.debian.net http://www.pymvpa.org http://www.fail2ban.org Senior Research Associate, Psychological and Brain Sciences Dept. Dartmouth College, 419 Moore Hall, Hinman Box 6207, Hanover, NH 03755 Phone: +1 (603) 646-9834 Fax: +1 (603) 646-1419 WWW: http://www.linkedin.com/in/yarik From tantczak at operasolutions.com Wed Jul 24 11:36:53 2013 From: tantczak at operasolutions.com (Trevor Antczak) Date: Wed, 24 Jul 2013 11:36:53 -0400 Subject: [Numpy-discussion] Casting Errors in AIX Message-ID: Hello Numpy Discussion List, So I'm trying to get numpy working on an AIX 6.1 system. Initially I had a lot of problems trying to compile the package because the xlc compiler weren't installed on this machine, but apparently the Python package we installed had been built with them. Once we got xlc installed the process seemed to work pretty well until we got to compiling heapsort.c. At this point I began to get a huge number of errors in the form: compile options: '-Inumpy/core/include -Ibuild/src.aix-6.1-2.7/numpy/core/include/numpy -Inumpy/core/src/private -Inumpy/core/src -Inumpy/core -Inumpy/core/src/npymath -Inumpy/core/src/multiarray -Inumpy/core/src/umath -Inumpy/core/src/npysort -Inumpy/core/include -I/opt/freeware/include/python2.7 -Ibuild/src.aix-6.1-2.7/numpy/core/src/multiarray -Ibuild/src.aix-6.1-2.7/numpy/core/src/umath -c' xlc_r: build/src.aix-6.1-2.7/numpy/core/src/npysort/heapsort.c "/usr/include/stdio.h", line 528.12: 1506-343 (S) Redeclaration of fgetpos64 differs from previous declaration on line 323 of "/usr/include/stdio.h". "/usr/include/stdio.h", line 528.12: 1506-377 (I) The type "long long*" of parameter 2 differs from the previous type "long* restrict". "/usr/include/stdio.h", line 531.12: 1506-343 (S) Redeclaration of fseeko64 differs from previous declaration on line 471 of "/usr/include/stdio.h". "/usr/include/stdio.h", line 531.12: 1506-377 (I) The type "long long" of parameter 2 differs from the previous type "long". "/usr/include/stdio.h", line 532.12: 1506-343 (S) Redeclaration of fsetpos64 differs from previous declaration on line 325 of "/usr/include/stdio.h". "/usr/include/stdio.h", line 532.12: 1506-377 (I) The type "const long long*" of parameter 2 differs from the previous type "const long*". "/usr/include/stdio.h", line 533.16: 1506-343 (S) Redeclaration of ftello64 differs from previous declaration on line 472 of "/usr/include/stdio.h". "/usr/include/stdio.h", line 533.16: 1506-050 (I) Return type "long long" in redeclaration is not compatible with the previous return type "long". "/usr/include/unistd.h", line 171.17: 1506-343 (S) Redeclaration of lseek64 differs from previous declaration on line 169 of "/usr/include/unistd.h". 
"/usr/include/unistd.h", line 171.17: 1506-050 (I) Return type "long long" in redeclaration is not compatible with the previous return type "long". "/usr/include/unistd.h", line 171.17: 1506-377 (I) The type "long long" of parameter 2 differs from the previous type "long". "/usr/include/sys/lockf.h", line 64.20: 1506-343 (S) Redeclaration of lockf64 differs from previous declaration on line 62 of "/usr/include/sys/lockf.h". ................................................... "/usr/include/unistd.h", line 942.25: 1506-050 (I) Return type "long long" in redeclaration is not compatible with the previous return type "long". "/usr/include/unistd.h", line 942.25: 1506-377 (I) The type "long long" of parameter 2 differs from the previous type "long". "/usr/include/unistd.h", line 943.25: 1506-343 (S) Redeclaration of fsync_range64 differs from previous declaration on line 940 of "/usr/include/unistd.h". "/usr/include/unistd.h", line 943.25: 1506-377 (I) The type "long long" of parameter 3 differs from the previous type "long". error: Command "/usr/vac/bin/xlc_r -DAIX_GENUINE_CPLUSCPLUS -D_LINUX_SOURCE_COMPAT -q32 -qbitfields=signed -qmaxmem=70000 -qalloca -bmaxdata:0x80000000 -Wl,-brtl -I/usr/include -I/opt/freeware/include -I/opt/freeware/include/ncurses -DNDEBUG -O2 -Inumpy/core/include -Ibuild/src.aix-6.1-2.7/numpy/core/include/numpy -Inumpy/core/src/private -Inumpy/core/src -Inumpy/core -Inumpy/core/src/npymath -Inumpy/core/src/multiarray -Inumpy/core/src/umath -Inumpy/core/src/npysort -Inumpy/core/include -I/opt/freeware/include/python2.7 -Ibuild/src.aix-6.1-2.7/numpy/core/src/multiarray -Ibuild/src.aix-6.1-2.7/numpy/core/src/umath -c build/src.aix-6.1-2.7/numpy/core/src/npysort/heapsort.c -o build/temp.aix-6.1-2.7/build/src.aix-6.1-2.7/numpy/core/src/npysort/heapsort.o" failed with exit status 1 There are a lot more than this. Probably in neighborhood of 40 lines all told. I spent some time doing research and this appears to be something not terribly uncommon when compiling F/OSS on AIX. Most of the instances appeared to involve either sshd or smb. Unfortunately the most commonly cited solution (using the --disable-largefile to configure) won't work for compiling a Python module. One suggestion I found that did help was to explicitly include some of the standard libraries in the .c file. So I added: #include #include To heapsort.c. That dramatically reduced the error messages. Now I get: compile options: '-Inumpy/core/include -Ibuild/src.aix-6.1-2.7/numpy/core/include/numpy -Inumpy/core/src/private -Inumpy/core/src -Inumpy/core -Inumpy/core/src/npymath -Inumpy/core/src/multiarray -Inumpy/core/src/umath -Inumpy/core/src/npysort -Inumpy/core/include -I/opt/freeware/include/python2.7 -Ibuild/src.aix-6.1-2.7/numpy/core/src/multiarray -Ibuild/src.aix-6.1-2.7/numpy/core/src/umath -c' xlc_r: build/src.aix-6.1-2.7/numpy/core/src/npysort/heapsort.c "/usr/include/stdio.h", line 528.12: 1506-343 (S) Redeclaration of fgetpos64 differs from previous declaration on line 323 of "/usr/include/stdio.h". "/usr/include/stdio.h", line 528.12: 1506-377 (I) The type "long long*" of parameter 2 differs from the previous type "long* restrict". "/usr/include/stdio.h", line 531.12: 1506-343 (S) Redeclaration of fseeko64 differs from previous declaration on line 471 of "/usr/include/stdio.h". "/usr/include/stdio.h", line 531.12: 1506-377 (I) The type "long long" of parameter 2 differs from the previous type "long". 
"/usr/include/stdio.h", line 532.12: 1506-343 (S) Redeclaration of fsetpos64 differs from previous declaration on line 325 of "/usr/include/stdio.h". "/usr/include/stdio.h", line 532.12: 1506-377 (I) The type "const long long*" of parameter 2 differs from the previous type "const long*". "/usr/include/stdio.h", line 533.16: 1506-343 (S) Redeclaration of ftello64 differs from previous declaration on line 472 of "/usr/include/stdio.h". "/usr/include/stdio.h", line 533.16: 1506-050 (I) Return type "long long" in redeclaration is not compatible with the previous return type "long". "/usr/include/stdio.h", line 528.12: 1506-343 (S) Redeclaration of fgetpos64 differs from previous declaration on line 323 of "/usr/include/stdio.h". "/usr/include/stdio.h", line 528.12: 1506-377 (I) The type "long long*" of parameter 2 differs from the previous type "long* restrict". "/usr/include/stdio.h", line 531.12: 1506-343 (S) Redeclaration of fseeko64 differs from previous declaration on line 471 of "/usr/include/stdio.h". "/usr/include/stdio.h", line 531.12: 1506-377 (I) The type "long long" of parameter 2 differs from the previous type "long". "/usr/include/stdio.h", line 532.12: 1506-343 (S) Redeclaration of fsetpos64 differs from previous declaration on line 325 of "/usr/include/stdio.h". "/usr/include/stdio.h", line 532.12: 1506-377 (I) The type "const long long*" of parameter 2 differs from the previous type "const long*". "/usr/include/stdio.h", line 533.16: 1506-343 (S) Redeclaration of ftello64 differs from previous declaration on line 472 of "/usr/include/stdio.h". "/usr/include/stdio.h", line 533.16: 1506-050 (I) Return type "long long" in redeclaration is not compatible with the previous return type "long". error: Command "/usr/vac/bin/xlc_r -DAIX_GENUINE_CPLUSCPLUS -D_LINUX_SOURCE_COMPAT -q32 -qbitfields=signed -qmaxmem=70000 -qalloca -bmaxdata:0x80000000 -Wl,-brtl -I/usr/include -I/opt/freeware/include -I/opt/freeware/include/ncurses -DNDEBUG -O2 -Inumpy/core/include -Ibuild/src.aix-6.1-2.7/numpy/core/include/numpy -Inumpy/core/src/private -Inumpy/core/src -Inumpy/core -Inumpy/core/src/npymath -Inumpy/core/src/multiarray -Inumpy/core/src/umath -Inumpy/core/src/npysort -Inumpy/core/include -I/opt/freeware/include/python2.7 -Ibuild/src.aix-6.1-2.7/numpy/core/src/multiarray -Ibuild/src.aix-6.1-2.7/numpy/core/src/umath -c build/src.aix-6.1-2.7/numpy/core/src/npysort/heapsort.c -o build/temp.aix-6.1-2.7/build/src.aix-6.1-2.7/numpy/core/src/npysort/heapsort.o" failed with exit status 1 And that's all of them, and all related to stdio.h. Unfortunately the obvious solution of explicitly including stdio.h didn't help. It also seems really odd that I would have to explicitly include standard system libraries. I'm hoping there some sort of solution to this that doesn't involve a massive amount of recoding. Thanks in advance for your help! Trevor -------------- next part -------------- An HTML attachment was scrubbed... URL: From jjhelmus at gmail.com Wed Jul 24 11:37:30 2013 From: jjhelmus at gmail.com (Jonathan J. Helmus) Date: Wed, 24 Jul 2013 10:37:30 -0500 Subject: [Numpy-discussion] Splitting numpydoc to a separate repo In-Reply-To: References: Message-ID: On Jul 24, 2013, at 10:33 AM, Pauli Virtanen wrote: > Hi, > > How about splitting doc/sphinxext out from the main Numpy repository to > a separate `numpydoc` repo under Numpy project? > > It's a separate Python package, after all. Moreover, this would make it > easier to use it as a git submodule (e.g. in Scipy). 
Moreover, its > release cycle is not in any way tied to that of Numpy. > > Pauli > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion I'm a big +1 on this idea. I've used the numpydoc sphinx extensions in a number of package I've worked on, having them as a separate git repo would make these even easier to use. - Jonathan Helmus From lists at hilboll.de Wed Jul 24 11:39:18 2013 From: lists at hilboll.de (Andreas Hilboll) Date: Wed, 24 Jul 2013 17:39:18 +0200 Subject: [Numpy-discussion] Splitting numpydoc to a separate repo In-Reply-To: References: Message-ID: <51EFF526.80306@hilboll.de> On 24.07.2013 17:33, Pauli Virtanen wrote: > Hi, > > How about splitting doc/sphinxext out from the main Numpy repository to > a separate `numpydoc` repo under Numpy project? +1 -- Andreas From pav at iki.fi Wed Jul 24 12:32:58 2013 From: pav at iki.fi (Pauli Virtanen) Date: Wed, 24 Jul 2013 19:32:58 +0300 Subject: [Numpy-discussion] Splitting numpydoc to a separate repo In-Reply-To: References: Message-ID: 24.07.2013 18:33, Pauli Virtanen kirjoitti: > How about splitting doc/sphinxext out from the main Numpy repository to > a separate `numpydoc` repo under Numpy project? Done: https://github.com/numpy/numpydoc https://github.com/numpy/numpy/pull/3547 https://github.com/scipy/scipy/pull/2657 From cjwilliams43 at gmail.com Wed Jul 24 14:13:58 2013 From: cjwilliams43 at gmail.com (Colin J. Williams) Date: Wed, 24 Jul 2013 14:13:58 -0400 Subject: [Numpy-discussion] Treatment of the Matrix by Numpy Message-ID: <51F01966.6080603@gmail.com> An HTML attachment was scrubbed... URL: From grb at skogoglandskap.no Thu Jul 25 04:47:03 2013 From: grb at skogoglandskap.no (Graeme B. Bell) Date: Thu, 25 Jul 2013 08:47:03 +0000 Subject: [Numpy-discussion] unit tests / developing numpy In-Reply-To: References: Message-ID: <7B2B6664-7E15-4F65-9366-06898322F773@skogoglandskap.no> Does anyone know how to get the unit tests to run on a local fork, without doing a complete install of numpy? If so, please can you describe it, or better still, update: http://docs.scipy.org/doc/numpy/dev/gitwash/development_workflow.html It seems strange that the development workflow doesn't mention running any tests before committing/pushing/pulling. Graeme. From grb at skogoglandskap.no Thu Jul 25 05:09:51 2013 From: grb at skogoglandskap.no (Graeme B. Bell) Date: Thu, 25 Jul 2013 09:09:51 +0000 Subject: [Numpy-discussion] unit tests / developing numpy In-Reply-To: <7B2B6664-7E15-4F65-9366-06898322F773@skogoglandskap.no> References: <7B2B6664-7E15-4F65-9366-06898322F773@skogoglandskap.no> Message-ID: <80BE42C2-3DAF-4BA8-A271-5A3DD2E098DE@skogoglandskap.no> To answer my own question in a clumsy way: To run unit tests on a dev version of numpy: python setup.py build python setup.py install --prefix=/tmp/numpy export PYTHONPATH="/tmp/numpy/lib64/python2.7/site-packages/" python >>> import numpy >>> print numpy.version.version >>> numpy.test() Adjust according to your version of python. On Jul 25, 2013, at 10:47 AM, Graeme Bell wrote: > > Does anyone know how to get the unit tests to run on a local fork, without doing a complete install of numpy? > > If so, please can you describe it, or better still, update: http://docs.scipy.org/doc/numpy/dev/gitwash/development_workflow.html > > It seems strange that the development workflow doesn't mention running any tests before committing/pushing/pulling. > > Graeme. 
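A related sanity check before trusting the results is to confirm which numpy the interpreter actually picked up (a small sketch; the expected values in the comments are only illustrative):

import numpy
print(numpy.version.version)   # should show the development version, not the released one
print(numpy.__file__)          # should point into the temporary prefix, not the system site-packages
numpy.test()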
From grb at skogoglandskap.no Thu Jul 25 05:17:19 2013 From: grb at skogoglandskap.no (Graeme B. Bell) Date: Thu, 25 Jul 2013 09:17:19 +0000 Subject: [Numpy-discussion] unit tests / developing numpy In-Reply-To: <80BE42C2-3DAF-4BA8-A271-5A3DD2E098DE@skogoglandskap.no> References: <7B2B6664-7E15-4F65-9366-06898322F773@skogoglandskap.no> <80BE42C2-3DAF-4BA8-A271-5A3DD2E098DE@skogoglandskap.no> Message-ID: <0634E4DC-15CA-4134-A4D0-CA54ADFA3A33@skogoglandskap.no> To run unit tests on a dev version of numpy: It won't run if you start in the source directory, so a cd is also needed: python setup.py build python setup.py install --prefix=/tmp/numpy export PYTHONPATH="/tmp/numpy/lib64/python2.7/site-packages/" cd .. python >>> import numpy >>> print numpy.version.version >>> numpy.test() Adjust according to your version of python. From njs at pobox.com Thu Jul 25 06:02:10 2013 From: njs at pobox.com (Nathaniel Smith) Date: Thu, 25 Jul 2013 11:02:10 +0100 Subject: [Numpy-discussion] unit tests / developing numpy In-Reply-To: <0634E4DC-15CA-4134-A4D0-CA54ADFA3A33@skogoglandskap.no> References: <7B2B6664-7E15-4F65-9366-06898322F773@skogoglandskap.no> <80BE42C2-3DAF-4BA8-A271-5A3DD2E098DE@skogoglandskap.no> <0634E4DC-15CA-4134-A4D0-CA54ADFA3A33@skogoglandskap.no> Message-ID: A cleaner option is to use virtualenv: virtualenv --python=/usr/bin/python2.7 my-test-env cd my-test-env bin/pip install $NUMPY_SRCDIR bin/python $NUMPY_SRCDIR/tools/test-installed-numpy.py --mode=full Or you can install 'tox', and then running 'tox -e py27' will do the above, 'tox -e py27,py33' will check both 2.7 and 3.3, plain 'tox' will test all supported configurations (this requires you have lots of python interpreter versions installed), etc. Those docs are in doc/source/dev/gitwash in the numpy source tree -- if you have any thoughts on how to make them clearer than pull requests are appreciated :-). You probably have a better idea than us how to put things clearly to someone who's just starting... -n On Thu, Jul 25, 2013 at 10:17 AM, Graeme B. Bell wrote: > > To run unit tests on a dev version of numpy: > It won't run if you start in the source directory, so a cd is also needed: > > > > python setup.py build > python setup.py install --prefix=/tmp/numpy > export PYTHONPATH="/tmp/numpy/lib64/python2.7/site-packages/" > cd .. > python > >>>> import numpy >>>> print numpy.version.version >>>> numpy.test() > > > Adjust according to your version of python. > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From pav at iki.fi Thu Jul 25 06:25:51 2013 From: pav at iki.fi (Pauli Virtanen) Date: Thu, 25 Jul 2013 13:25:51 +0300 Subject: [Numpy-discussion] unit tests / developing numpy In-Reply-To: References: <7B2B6664-7E15-4F65-9366-06898322F773@skogoglandskap.no> <80BE42C2-3DAF-4BA8-A271-5A3DD2E098DE@skogoglandskap.no> <0634E4DC-15CA-4134-A4D0-CA54ADFA3A33@skogoglandskap.no> Message-ID: 25.07.2013 13:02, Nathaniel Smith kirjoitti: [clip] > Or you can install 'tox', and then running 'tox -e py27' will do the > above, 'tox -e py27,py33' will check both 2.7 and 3.3, plain 'tox' > will test all supported configurations (this requires you have lots of > python interpreter versions installed), etc. 
Or:

python runtests.py

-- 
Pauli Virtanen

From njs at pobox.com  Thu Jul 25 07:48:10 2013
From: njs at pobox.com (Nathaniel Smith)
Date: Thu, 25 Jul 2013 12:48:10 +0100
Subject: [Numpy-discussion] Allow == and != to raise errors
In-Reply-To: 
References: <1373632688.13968.13.camel@sebastian-laptop> <1374589784.13486.32.camel@sebastian-laptop>
Message-ID: 

On Tue, Jul 23, 2013 at 4:10 PM, Frédéric Bastien wrote:
> I'm mixed, because I see the good value, but I'm not able to guess the
> consequence of the interface change.
>
> So doing your FutureWarning would allow to gather some data about this, and
> if it seems to cause too much of a problem, we could cancel the change.
>
> Also, in the case there is a few software that depend on the old behaviour,
> this will cause a crash (except if they have a catch-all Exception case), not
> a bad result.

I think we have to be willing to fix bugs, even if we can't be sure what all the consequences are.
Carefully of course, and with due > consideration to possible compatibility consequences, but if we > rejected every change that might have unforeseen effects then we'd > have to stop accepting changes altogether. (And anyway the > show-stopper regressions that make it into releases always seem to be > the ones we didn't anticipate at all, so I doubt that being 50% more > careful with obscure corner cases like this will have any measurable > impact in our overall release-to-release compatibility.) So I'd > consider Fred's comments above to be a vote for the change, in > practice... > > > I think it is always hard to predict the consequence of interface change > in > > NumPy. To help measure it, we could make/as people to contribute to a > > collection of software that use NumPy with a good tests suites. We could > > test interface change on them by running there tests suites to try to > have a > > guess of the impact of those change. What do you think of that? I think > it > > was already discussed on the mailing list, but not acted upon. > > Yeah, if we want to be careful then it never hurts to run other > projects test suites to flush out bugs :-). > > We don't do this systematically right now. Maybe we should stick some > precompiled copies of scipy and other core numpy-dependants up on a > host somewhere and then pull them down and run their test suite as > part of the Travis tests? We have maybe 10 minutes of CPU budget for > tests still. Theano tests will be too long. I'm not sure that doing this on travis-ci is the right place. Doing this for each version of a PR will be too long for travis and will limit the project that we will test on. What about doing a vagrant VM that update/install the development version of NumPy and then reinstall some predetermined version of other project and run there tests? I started playing with vagrant VM to help test differente OS configuration for Theano. I haven't finished this, but it seam to do the job well. People just cd in a directory, then run "vagrant up" and then all is automatic. They just wait and read the output. Other idea? I know some other project used jenkins. Would this be a better idea? Fred -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris.barker at noaa.gov Thu Jul 25 10:49:44 2013 From: chris.barker at noaa.gov (Chris Barker - NOAA Federal) Date: Thu, 25 Jul 2013 07:49:44 -0700 Subject: [Numpy-discussion] add .H attribute? In-Reply-To: <51EF1873.7050202@gmail.com> References: <51D98902.1090403@gmail.com> <51E17280.2030105@gmail.com> <51ED9100.8040108@gmail.com> <51EE4044.5060509@astro.uio.no> <51EEC332.7070805@gmail.com> <51EEF0BB.60508@astro.uio.no> <51EF0217.209@gmail.com> <51EF078E.8050603@astro.uio.no> <51EF1873.7050202@gmail.com> Message-ID: <-29565752093052626@unknownmsgid> On Jul 23, 2013, at 4:57 PM, Alan G Isaac wrote: > Finally, I think (?) everyone (proponents and opponents) > would be happy if .H could provide access to an iterative > view of the conjugate transpose. Except those of us that don't think numpy needs it at all. But I'll call it a -0 -Chris From grb at skogoglandskap.no Thu Jul 25 10:52:07 2013 From: grb at skogoglandskap.no (Graeme B. Bell) Date: Thu, 25 Jul 2013 14:52:07 +0000 Subject: [Numpy-discussion] unit tests / developing numpy In-Reply-To: References: Message-ID: <20D22199-8C71-49A0-8292-CEBBBB70B902@skogoglandskap.no> Nathaniel, Pauli: Thanks for the suggestions! 
= runtests.py is a nice solution, but unless you also set up your PYTHONPATH and install the code you've been working on, you're going to run with whichever version of numpy you have installed normally rather than the code you've just been working on (e.g. 1.7.1 rather than 1.8dev).

Unfortunately, this caused me to submit a buggy set of commits, thinking they had passed the tests.

= env approach: thanks, I'll give that a try and compare it with the /tmp install approach.

Can we add this to the dev workflow web pages?

Graeme.

From pav at iki.fi  Thu Jul 25 11:38:11 2013
From: pav at iki.fi (Pauli Virtanen)
Date: Thu, 25 Jul 2013 18:38:11 +0300
Subject: [Numpy-discussion] unit tests / developing numpy
In-Reply-To: <20D22199-8C71-49A0-8292-CEBBBB70B902@skogoglandskap.no>
References: <20D22199-8C71-49A0-8292-CEBBBB70B902@skogoglandskap.no>
Message-ID: 

25.07.2013 17:52, Graeme B. Bell kirjoitti:
[clip]
> = runtests.py is a nice solution, but unless you also set up your
> PYTHONPATH and install the code you've been working on, you're going
> to run with whichever version of numpy you have installed normally
> rather than the code you've just been working on (e.g. 1.7.1 rather than 1.8dev).

No, runtests builds the code and sets PYTHONPATH accordingly.

-- 
Pauli Virtanen

From grb at skogoglandskap.no  Thu Jul 25 13:15:22 2013
From: grb at skogoglandskap.no (Graeme B. Bell)
Date: Thu, 25 Jul 2013 17:15:22 +0000
Subject: [Numpy-discussion] unit tests / developing numpy (Pauli Virtanen)
In-Reply-To: 
References: 
Message-ID: 

> 
> No, runtests builds the code and sets PYTHONPATH accordingly.
> 
> --
> Pauli Virtanen

Hello Pauli,

Thanks again for writing back. I agree that may be what runtests.py is intended to do, but it is unfortunately not what it actually does in its default configuration, at least on my computer. I got burned by this a few nights ago.

$ pwd
/home/X/github/numpy

$ more numpy/version.py
# THIS FILE IS GENERATED FROM NUMPY SETUP.PY
short_version = '1.8.0'

$ echo $PYTHONPATH

$ python runtests.py
Building, see build.log...
Build OK
Running unit tests for numpy
NumPy version 1.7.1
NumPy is installed in /usr/lib64/python2.7/site-packages/numpy
Python version 2.7.3 (default, Aug 9 2012, 17:23:57) [GCC 4.7.1 20120720 (Red Hat 4.7.1-5)]
nose version 1.3.0

*note the version number and directory used by runtests.py*

It reported 100% tests passed (unsurprising since it was testing the release version!). In reality, the current directory at that time failed a test when I pushed it to the main repository.

Can you suggest anything that I may be doing wrong here?

Graeme
From pav at iki.fi  Thu Jul 25 13:49:33 2013
From: pav at iki.fi (Pauli Virtanen)
Date: Thu, 25 Jul 2013 20:49:33 +0300
Subject: [Numpy-discussion] unit tests / developing numpy (Pauli Virtanen)
In-Reply-To: 
References: 
Message-ID: 

25.07.2013 20:15, Graeme B. Bell kirjoitti:
[clip]
> I agree that may be what runtests.py is intended to do, but it is
> unfortunately not what it actually does in its default configuration,
> at least on my computer. I got burned by this a few nights ago.

That is interesting, as it has worked for me on all configurations.

You can check under 'build/testenv/' --- does your Python version by chance install it to a `lib64` directory instead of `lib`?

(i)

What does

import os
from distutils.sysconfig import get_python_lib
get_python_lib(prefix=os.path.abspath('build/testenv'))

report?
Is there a 'numpy' directory below the reported directory after running runtests.py? (ii) Start a fresh Python interpreter and run import sys print sys.modules.get('numpy') (Note: no "import numpy" above). Does it print `None` or something else? -- Pauli Virtanen From stefan at sun.ac.za Wed Jul 24 22:53:19 2013 From: stefan at sun.ac.za (=?iso-8859-1?Q?St=E9fan?= van der Walt) Date: Thu, 25 Jul 2013 04:53:19 +0200 Subject: [Numpy-discussion] unit tests / developing numpy In-Reply-To: <7B2B6664-7E15-4F65-9366-06898322F773@skogoglandskap.no> References: <7B2B6664-7E15-4F65-9366-06898322F773@skogoglandskap.no> Message-ID: <20130725025319.GC7821@shinobi> On Thu, 25 Jul 2013 08:47:03 +0000, Graeme B. Bell wrote: > > Does anyone know how to get the unit tests to run on a local fork, without doing a complete install of numpy? > I usually do an in-place build with either bentomaker build -i -j or python setup.py build_ext -i Then export PYTHONPATH=$PYTHONPATH:/path/to/numpy and nosetests numpy St?fan From grb at skogoglandskap.no Mon Jul 29 03:43:28 2013 From: grb at skogoglandskap.no (Graeme B. Bell) Date: Mon, 29 Jul 2013 07:43:28 +0000 Subject: [Numpy-discussion] unit tests / developing numpy (Pauli Virtanen) In-Reply-To: References: Message-ID: Hi Pauli, Thanks for looking into this. Apologies for mangling the email subject line in my previous reply. Answers are below, inline: > That is interesting, as it has worked for me on all configurations. > You can check under 'build/testenv/' --- does your Python version by > chance install it to a `lib64` directory instead of `lib`? $ ls build/testenv/ bin lib64 > (i) > > What does > > import os > from distutils.sysconfig import get_python_lib > get_python_lib(prefix=os.path.abspath('build/testenv')) > > report? Is there a 'numpy' directory below the reported directory after > running runtests.py? >>> import os >>> from distutils.sysconfig import get_python_lib >>> get_python_lib(prefix=os.path.abspath('build/testenv')) '/ssd-space/home/X/github/numpy/build/testenv/lib/python2.7/site-packages' I'm running this in a fresh python client in the directory 'github/numpy'. > (ii) > > Start a fresh Python interpreter and run > > import sys > print sys.modules.get('numpy') > > (Note: no "import numpy" above). Does it print `None` or something else? >>> import sys >>> print sys.modules.get('numpy') None Thanks, Pauli and Stefan. Graeme. From ncreati at inogs.it Mon Jul 29 08:50:03 2013 From: ncreati at inogs.it (Nicola Creati) Date: Mon, 29 Jul 2013 14:50:03 +0200 Subject: [Numpy-discussion] Search array in array Message-ID: <51F664FB.5000509@inogs.it> Hello, I'm wondering if there is a fast way to solve the following problem. I have two arrays: A = [[ 4, 9, 10], [ 7, 4, 17], [12, 21, 14], [12, 24, 11], [18, 21, 3], [16, 3, 7], [17, 21, 5], [24, 3, 14]] B = [[17, 5], [14, 21]] I need to search rows of A that contain elements of each row of B regardless of the order of the elements in B. The searched results is: [2, 6] . Thanks. Nicola -- _____________________________________________________________________ Nicola Creati Istituto Nazionale di Oceanografia e di Geofisica Sperimentale - OGS IRI (Ricerca Teconologica e Infrastrutture) Department B.go Grotta Gigante - Brisciki 42/c 34010 Sgonico - Zgonik (TS) - Italy Tel. 
+39-0402140213 Fax +39-040327307

From gbs25 at drexel.edu  Mon Jul 29 16:27:47 2013
From: gbs25 at drexel.edu (Gabe Schwartz)
Date: Mon, 29 Jul 2013 20:27:47 +0000 (UTC)
Subject: [Numpy-discussion] Search array in array
References: <51F664FB.5000509@inogs.it>
Message-ID: 

Nicola Creati <ncreati at inogs.it> writes:
>
> I need to search rows of A that contain elements of each row of B
> regardless of the order of the elements in B.
>

I don't know how fast this is, but it is fairly short:

C = (A[..., np.newaxis, np.newaxis] == B)
rows = (C.sum(axis=(1,2,3)) >= B.shape[1]).nonzero()[0]

From ncreati at inogs.it  Tue Jul 30 02:57:13 2013
From: ncreati at inogs.it (Nicola Creati)
Date: Tue, 30 Jul 2013 08:57:13 +0200
Subject: [Numpy-discussion] Search array in array
In-Reply-To: 
References: <51F664FB.5000509@inogs.it>
Message-ID: <51F763C9.4030203@inogs.it>

On 07/29/2013 10:27 PM, Gabe Schwartz wrote:
> C = (A[..., np.newaxis, np.newaxis] == B)
> rows = (C.sum(axis=(1,2,3)) >= B.shape[1]).nonzero()[0]

Hello, thank you, it's not fast but really nice.

Nicola

-- 
_____________________________________________________________________
Nicola Creati
Istituto Nazionale di Oceanografia e di Geofisica Sperimentale - OGS
IRI (Ricerca Teconologica e Infrastrutture) Department
B.go Grotta Gigante - Brisciki 42/c
34010 Sgonico - Zgonik (TS) - Italy
Tel. +39-0402140213 Fax +39-040327307
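For completeness, the same broadcasting idea can be turned into an exact per-row test (a sketch; it assumes the values within each row of B are distinct, and it is not necessarily faster):

import numpy as np

A = np.array([[ 4,  9, 10], [ 7,  4, 17], [12, 21, 14], [12, 24, 11],
              [18, 21,  3], [16,  3,  7], [17, 21,  5], [24,  3, 14]])
B = np.array([[17, 5], [14, 21]])

# eq[i, j, k, l] is True when A[i, k] == B[j, l]
eq = A[:, None, :, None] == B[None, :, None, :]

# row j of B matches row i of A when every element of B[j] occurs somewhere in A[i]
match = eq.any(axis=2).all(axis=2)       # shape (len(A), len(B))
rows = np.nonzero(match.any(axis=1))[0]
print(rows)                              # [2 6]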