From hannesschoenberger at gmail.com Tue Jan 1 09:50:13 2013 From: hannesschoenberger at gmail.com (Schönberger Johannes) Date: Tue, 1 Jan 2013 15:50:13 +0100 Subject: [Numpy-discussion] Conversion functions Message-ID: <85ADC5DF-F17D-41EE-84D6-DA1F22015D3C@gmail.com> Hello everyone, I recently opened a new pull request which adds the functionality to convert between degrees and degrees, minutes and seconds (https://github.com/numpy/numpy/pull/2869). The discussion is about whether such conversion functionality should be integrated into numpy at all or whether it belongs in scipy. I suggest moving the most common conversion functions (deg2rad, rad2deg, deg2dms, dms2deg, and some more could be added) to a separate `conversion.py` file in `numpy/lib`. I could implement this in a new pull request if the general consensus is in favor of it. What are your thoughts? Regards, Johannes From ralf.gommers at gmail.com Tue Jan 1 12:23:26 2013 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Tue, 1 Jan 2013 18:23:26 +0100 Subject: [Numpy-discussion] Conversion functions In-Reply-To: <85ADC5DF-F17D-41EE-84D6-DA1F22015D3C@gmail.com> References: <85ADC5DF-F17D-41EE-84D6-DA1F22015D3C@gmail.com> Message-ID: On Tue, Jan 1, 2013 at 3:50 PM, Schönberger Johannes < hannesschoenberger at gmail.com> wrote: > Hello everyone, > > I recently opened a new pull request which adds the functionality to > convert between degrees and degrees, minutes and seconds > (https://github.com/numpy/numpy/pull/2869). > > The discussion is about whether such conversion functionality should > be integrated into numpy at all or whether it belongs in scipy. I > suggest moving the most common conversion functions (deg2rad, > rad2deg, deg2dms, dms2deg, and some more could be added) to a separate > `conversion.py` file in `numpy/lib`. > > I could implement this in a new pull request if the general consensus > is in favor of it. What are your thoughts? > After checking what's in scipy.constants now (degree/arcminute/arcsecond and, for example, temperature and frequency conversion functions), I think that's where it belongs. A separate new numpy submodule with a bunch of these types of conversion utilities would be my second choice. I'm -1 on adding such small and fairly domain-specific functions to the main numpy namespace. Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris.barker at noaa.gov Tue Jan 1 22:41:19 2013 From: chris.barker at noaa.gov (Chris Barker - NOAA Federal) Date: Tue, 1 Jan 2013 19:41:19 -0800 Subject: [Numpy-discussion] Conversion functions In-Reply-To: <85ADC5DF-F17D-41EE-84D6-DA1F22015D3C@gmail.com> References: <85ADC5DF-F17D-41EE-84D6-DA1F22015D3C@gmail.com> Message-ID: On Tue, Jan 1, 2013 at 6:50 AM, Schönberger Johannes wrote: > I recently opened a new pull request which adds the functionality to > convert between degrees and degrees, minutes and seconds > (https://github.com/numpy/numpy/pull/2869). > > The discussion is about whether such conversion functionality should > be integrated into numpy at all or whether it belongs in scipy. Handy functions, yes, but certainly not something to put in numpy -- maybe scipy, though I'm not sure of the best place. I see someone (Chuck?) on github suggested a "conversion.py" module -- that should be in scipy, not numpy, but I'm wary -- where would it stop? Rather, perhaps the quantities package should be adopted. 
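As a concrete illustration of Ralf's scipy.constants suggestion above: that module already ships the angle units degree, arcmin and arcsec as radian conversion factors, so a degrees/minutes/seconds value can be converted without adding any new function. A minimal sketch (dms_to_radians is just an illustrative helper, not an existing scipy or numpy function):

from scipy import constants

def dms_to_radians(degrees, minutes, seconds):
    # degree/arcmin/arcsec are the sizes of those units in radians,
    # so a d/m/s triple is just a weighted sum.
    return (degrees * constants.degree
            + minutes * constants.arcmin
            + seconds * constants.arcsec)

dms_to_radians(30, 30, 0)   # 30 deg 30' 0" -> about 0.5323 radians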
But another note: conversion to deg.min.sec with floating point is a bit less trivial than you'd think, you can end up with results like: xdegrees, 60 minutes.... if you're not careful -- it looks from a first glance that the pull request does not address this. Note: I suppose we could consider it technically OK, but it's certainly not aesthetically pleasing. Example: import numpy as np def deg2dms(x): out = [0,0,0] out[0] = np.floor(x) out[1] = np.floor((x - out[0]) * 60) out[2] = ((x - out[0]) * 60 - out[1]) * 60 return out print deg2dms(1.0) print deg2dms(1.1) print deg2dms(45.05) In [60]: run deg2dms.py [1.0, 0.0, 0.0] [1.0, 6.0, 3.1974423109204508e-13] [45.0, 2.0, 59.999999999989768] you'd really want that to be: 1degree 6 minutes, 0 seconds and 45 degrees 3 minutes, zero seconds Here's the code I used: @classmethod def ToDegMin(self, DecDegrees, ustring = False): """ Converts from decimal (binary float) degrees to: Degrees, Minutes If the optional parameter: "ustring" is True, a Unicode string is returned """ if signbit(DecDegrees): Sign = -1 DecDegrees = abs(DecDegrees) else: Sign = 1 Degrees = int(DecDegrees) DecMinutes = round((DecDegrees - Degrees + 1e-14) * 60, 10)# add a tiny bit then round to avoid binary rounding issues if ustring: if Sign == 1: return u"%i\xb0 %.3f'"%(Degrees, DecMinutes) else: return u"-%i\xb0 %.3f'"%(Degrees, DecMinutes) else: return (Sign*float(Degrees), DecMinutes) # float to preserve -0.0 perhaps ugly but it results in pretty output -- someone smart here could probably offer a cleaner solution. -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From chris.barker at noaa.gov Tue Jan 1 22:53:24 2013 From: chris.barker at noaa.gov (Chris Barker - NOAA Federal) Date: Tue, 1 Jan 2013 19:53:24 -0800 Subject: [Numpy-discussion] 3D array problem challenging in Python In-Reply-To: <1356921614.45639691@f150.mail.ru> References: <1356867686.200644432@f373.mail.ru> <1356921614.45639691@f150.mail.ru> Message-ID: On Sun, Dec 30, 2012 at 6:40 PM, Happyman wrote: > Again the same problem here I want to optimize my codes in order to avoid > "Loop" as well as to get quick response as much as possible. BUT, it seems > really confusing, would be great to get help from Python programmers !!! > ================================== > The codes here: > ================================================================= > > import numpy as np > import scipy.special as ss > > from scipy.special import sph_jnyn,sph_jn,jv,yv > from scipy import integrate > > import time > import os > > --------------------------- > 1) Problem: no problem in this F0() function > --------------------------- > Inputs: m=5+0.4j - complex number as an example! > x= one value - float! > --------------------------- > #This function returns an, bn coefficients I don't want it to be vectorized > because it is already done. it is working well! 
> > def F0(m, x): > > nmax = np.round(2.0+x+4.0*x**(1.0/3.0)) > mx = m * x > > j_x,jd_x,y_x,yd_x = ss.sph_jnyn(nmax, x) # sph_jnyn - from > scipy special functions > > j_x = j_x[1:] > jd_x = jd_x[1:] > y_x = y_x[1:] > yd_x = yd_x[1:] > > h1_x = j_x + 1.0j*y_x > h1d_x = jd_x + 1.0j*yd_x > > j_mx,jd_mx = ss.sph_jn(nmax, mx) # sph_jn - from scipy > special functions > j_mx = j_mx[1:] > jd_mx = jd_mx[1:] > > j_xp = j_x + x*jd_x > j_mxp = j_mx + mx*jd_mx > h1_xp = h1_x + x*h1d_x > > m2 = m * m > an = (m2 * j_mx * j_xp - j_x * j_mxp)/(m2 * j_mx * h1_xp - h1_x * j_mxp) > bn = (j_mx * j_xp - j_x * j_mxp)/(j_mx * h1_xp - h1_x * j_mxp) > > return an, bn > > -------------------------------------- > 2) Problem: 1) To avoid loop > 2) To return values from the function (below) no > matter whether 'a' array or scalar! > -------------------------------------- > m=5+0.4j - for example > L = 30 - for example > a - array(one dimensional) > -------------------------------------- > > def F1(m,L,a): > > xs = pi * a / L > if(m.imag < 0.0): > m = conj(m) in this case, you can do things like: m = np.where(m.imag < 0.0, np.conj(m), m) to vectorize. > # Want to make sure we can accept single arguments or arrays > try: > xs.size > xlist = xs > except: > xlist = array(xs) here I use: xs = np.asarray(xs, dtype-the_dtype_you_want) it is essentially a no-op if xs is already an array, and will convert it if it isn't. > q=[ ] > for i,s in enumerate(xlist.flat): > > if float(s)==0.0: # To avoid a singularity at x=0 > q.append(0.0) again, look to use np.where, or "fancy indexing": ind = xs == 0.0 q[xs==0.0] = 0.0 > q.append(((L*L)/(2*pi) * (c * (an.real + bn.real > )).sum())) even if you do need the loop -- pre-allocate the result array (with np.zeros() ), and then put stuf in it -- it will should be faster than using a list. > 3) Problem: 1) I used "try" to avoid whether 'D' is singular or not!!! IS > there better way beside this? The other option is an if test -- try is faster if it's a rare occurrence, slower if it's common. > def F2(a,s): > for i,d in enumerate(Dslist.flat): # IS there any wayy to avoid from the > loop here in this case??? see above. note that using the where() or fancy indexing does mean you need to go through the loop multiple times, but still probably much faster then looping. For full-on speed for this sort of thing, Cython is a nice option. -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From njs at pobox.com Wed Jan 2 06:24:10 2013 From: njs at pobox.com (Nathaniel Smith) Date: Wed, 2 Jan 2013 11:24:10 +0000 Subject: [Numpy-discussion] Do we want scalar casting to behave as it does at the moment? In-Reply-To: References: Message-ID: This discussion seems to have petered out without reaching consensus one way or another. This seems like an important issue, so I've opened a bug: https://github.com/numpy/numpy/issues/2878 Hopefully this way we'll at least not forget about it; also I tried to summarize the main issues there and would welcome comments. -n On Mon, Nov 12, 2012 at 7:54 PM, Matthew Brett wrote: > Hi, > > I wanted to check that everyone knows about and is happy with the > scalar casting changes from 1.6.0. > > Specifically, the rules for (array, scalar) casting have changed such > that the resulting dtype depends on the _value_ of the scalar. 
> > Mark W has documented these changes here: > > http://docs.scipy.org/doc/numpy/reference/ufuncs.html#casting-rules > http://docs.scipy.org/doc/numpy/reference/generated/numpy.result_type.html > http://docs.scipy.org/doc/numpy/reference/generated/numpy.promote_types.html > > Specifically, as of 1.6.0: > > In [19]: arr = np.array([1.], dtype=np.float32) > > In [20]: (arr + (2**16-1)).dtype > Out[20]: dtype('float32') > > In [21]: (arr + (2**16)).dtype > Out[21]: dtype('float64') > > In [25]: arr = np.array([1.], dtype=np.int8) > > In [26]: (arr + 127).dtype > Out[26]: dtype('int8') > > In [27]: (arr + 128).dtype > Out[27]: dtype('int16') > > There's discussion about the changes here: > > http://mail.scipy.org/pipermail/numpy-discussion/2011-September/058563.html > http://mail.scipy.org/pipermail/numpy-discussion/2011-March/055156.html > http://mail.scipy.org/pipermail/numpy-discussion/2012-February/060381.html > > It seems to me that this change is hard to explain, and does what you > want only some of the time, making it a false friend. > > Is it the right behavior for numpy 2.0? > > Cheers, > > Matthew > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From njs at pobox.com Wed Jan 2 09:56:15 2013 From: njs at pobox.com (Nathaniel Smith) Date: Wed, 2 Jan 2013 14:56:15 +0000 Subject: [Numpy-discussion] Numpy speed ups to simple tasks - final findings and suggestions In-Reply-To: <50D4B69A.7000409@virtualmaterials.com> References: <50D4B69A.7000409@virtualmaterials.com> Message-ID: On Fri, Dec 21, 2012 at 7:20 PM, Raul Cota wrote: > Hello, > > > On Dec/2/2012 I sent an email about some meaningful speed problems I was > facing when porting our core program from Numeric (Python 2.2) to Numpy > (Python 2.6). Some of our tests went from 30 seconds to 90 seconds for > example. Hi Raul, This is great work! Sorry you haven't gotten any feedback yet -- I guess it's a busy time of year for most people; and, the way you've described your changes makes it hard for us to use our usual workflow to discuss them. > These are the actual changes to the C code, > For bottleneck (a) > > In general, > - avoid calls to PyObject_GetAttrString when I know the type is > List, None, Tuple, Float, Int, String or Unicode > > - avoid calls to PyObject_GetBuffer when I know the type is > List, None or Tuple This definitely seems like a worthwhile change. There are possible quibbles about coding style -- the macros could have better names, and would probably be better as (inline) functions instead of macros -- but that can be dealt with. Can you make a pull request on github with these changes? I guess you haven't used git before, but I think you'll find it makes things *much* easier (in particular, you'll never have to type out long awkward english descriptions of the changes you made ever again!) We have docs here: http://docs.scipy.org/doc/numpy/dev/gitwash/git_development.html and your goal is to get to the point where you can file a "pull request": http://docs.scipy.org/doc/numpy/dev/gitwash/development_workflow.html#asking-for-your-changes-to-be-merged-with-the-main-repo Feel free to ask on the list if you get stuck of course. > For bottleneck (b) > > b.1) > I noticed that PyFloat * Float64 resulted in an unnecessary "on the fly" > conversion of the PyFloat into a Float64 to extract its underlying C > double value. 
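One rough way to see the conversion overhead described in the quoted passage is to time scalar multiplications for the different operand combinations; a sketch using timeit, where the absolute and relative numbers will vary with machine and NumPy version, so treat it as illustrative only:

import timeit

setup = "import numpy as np; py_f = 3.14; np_f = np.float64(3.14)"
for label, stmt in [("pyfloat * pyfloat", "py_f * py_f"),
                    ("pyfloat * float64", "py_f * np_f"),
                    ("float64 * float64", "np_f * np_f")]:
    print(label, timeit.timeit(stmt, setup=setup, number=1000000))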
This happened in the function > _double_convert_to_ctype which comes from the pattern, > _ at name@_convert_to_ctype This also sounds like an excellent change, and perhaps should be extended to ints and bools as well... again, can you file a pull request? > b.2) This is the change that may not be very popular among Numpy users. > I modified Float64 operations to return a Float instead of Float64. I > could not think or see any ill effects and I got a fairly decent speed > boost. Yes, unfortunately, there's no way we'll be able to make this change upstream -- there's too much chance of it breaking people's code. (And numpy float64's do act different than python floats in at least some cases, e.g., numpy gives more powerful control over floating point error handling, see np.seterr.) But, it's almost certainly possible to optimize numpy's float64 (and friends), so that they are themselves (almost) as fast as the native python objects. And that would help all the code that uses them, not just the ones where regular python floats could be substituted instead. Have you tried profiling, say, float64 * float64 to figure out where the bottlenecks are? -n From njs at pobox.com Wed Jan 2 09:58:32 2013 From: njs at pobox.com (Nathaniel Smith) Date: Wed, 2 Jan 2013 14:58:32 +0000 Subject: [Numpy-discussion] Numpy speed ups to simple tasks - final findings and suggestions In-Reply-To: References: <50D4B69A.7000409@virtualmaterials.com> Message-ID: On Wed, Jan 2, 2013 at 2:56 PM, Nathaniel Smith wrote: > On Fri, Dec 21, 2012 at 7:20 PM, Raul Cota wrote: >> b.1) >> I noticed that PyFloat * Float64 resulted in an unnecessary "on the fly" >> conversion of the PyFloat into a Float64 to extract its underlying C >> double value. This happened in the function >> _double_convert_to_ctype which comes from the pattern, >> _ at name@_convert_to_ctype > > This also sounds like an excellent change, and perhaps should be > extended to ints and bools as well... again, can you file a pull > request? Immediately after I hit 'send' I realized this might be unclear... what I mean is, please file two separate pull requests, one for the (a) changes and one for the (b.1) changes. They're logically separate so it'll be easier to review and merge them separately. -n From bahtiyor_zohidov at mail.ru Wed Jan 2 20:27:26 2013 From: bahtiyor_zohidov at mail.ru (=?UTF-8?B?SGFwcHltYW4=?=) Date: Thu, 03 Jan 2013 05:27:26 +0400 Subject: [Numpy-discussion] =?utf-8?q?3D_array_problem_challenging_in_Pyth?= =?utf-8?q?on?= In-Reply-To: References: <1356867686.200644432@f373.mail.ru> <1356921614.45639691@f150.mail.ru> Message-ID: <1357176446.246671186@f220.mail.ru> Hi Chris, Thanks a lot. I did as you advised..but, unfortunately, I could not "negotiate" with "quad" function at all. what do you think if quad function can get arrays or not? ???????, 1 ?????? 2013, 19:53 -08:00 ?? Chris Barker - NOAA Federal : >On Sun, Dec 30, 2012 at 6:40 PM, Happyman < bahtiyor_zohidov at mail.ru > wrote: > >> Again the same problem here I want to optimize my codes in order to avoid >> "Loop" as well as to get quick response as much as possible. BUT, it seems >> really confusing, would be great to get help from Python programmers !!! 
>> ================================== >> The codes here: >> ================================================================= >> >> import numpy as np >> import scipy.special as ss >> >> from scipy.special import sph_jnyn,sph_jn,jv,yv >> from scipy import integrate >> >> import time >> import os >> >> --------------------------- >> 1) Problem: no problem in this F0() function >> --------------------------- >> Inputs: m=5+0.4j - complex number as an example! >> x= one value - float! >> --------------------------- >> #This function returns an, bn coefficients I don't want it to be vectorized >> because it is already done. it is working well! >> >> def F0(m, x): >> >> nmax = np.round(2.0+x+4.0*x**(1.0/3.0)) >> mx = m * x >> >> j_x,jd_x,y_x,yd_x = ss.sph_jnyn(nmax, x) # sph_jnyn - from >> scipy special functions >> >> j_x = j_x[1:] >> jd_x = jd_x[1:] >> y_x = y_x[1:] >> yd_x = yd_x[1:] >> >> h1_x = j_x + 1.0j*y_x >> h1d_x = jd_x + 1.0j*yd_x >> >> j_mx,jd_mx = ss.sph_jn(nmax, mx) # sph_jn - from scipy >> special functions >> j_mx = j_mx[1:] >> jd_mx = jd_mx[1:] >> >> j_xp = j_x + x*jd_x >> j_mxp = j_mx + mx*jd_mx >> h1_xp = h1_x + x*h1d_x >> >> m2 = m * m >> an = (m2 * j_mx * j_xp - j_x * j_mxp)/(m2 * j_mx * h1_xp - h1_x * j_mxp) >> bn = (j_mx * j_xp - j_x * j_mxp)/(j_mx * h1_xp - h1_x * j_mxp) >> >> return an, bn >> >> -------------------------------------- >> 2) Problem: 1) To avoid loop >> 2) To return values from the function (below) no >> matter whether 'a' array or scalar! >> -------------------------------------- >> m=5+0.4j - for example >> L = 30 - for example >> a - array(one dimensional) >> -------------------------------------- >> >> def F1(m,L,a): >> >> xs = pi * a / L >> if(m.imag < 0.0): >> m = conj(m) > >in this case, you can do things like: > >m = np.where(m.imag < 0.0, np.conj(m), m) > >to vectorize. > > > > >> # Want to make sure we can accept single arguments or arrays >> try: >> xs.size >> xlist = xs >> except: >> xlist = array(xs) > >here I use: > >xs = np.asarray(xs, dtype-the_dtype_you_want) > >it is essentially a no-op if xs is already an array, and will convert >it if it isn't. > >> q=[ ] >> for i,s in enumerate(xlist.flat): >> >> if float(s)==0.0: # To avoid a singularity at x=0 >> q.append(0.0) > >again, look to use np.where, or "fancy indexing": > >ind = xs == 0.0 >q[xs==0.0] = 0.0 > >> q.append(((L*L)/(2*pi) * (c * (an.real + bn.real >> )).sum())) > >even if you do need the loop -- pre-allocate the result array (with >np.zeros() ), and then put stuf in it -- it will should be faster than >using a list. > >> 3) Problem: 1) I used "try" to avoid whether 'D' is singular or not!!! IS >> there better way beside this? > >The other option is an if test -- try is faster if it's a rare >occurrence, slower if it's common. > >> def F2(a,s): >> for i,d in enumerate(Dslist.flat): # IS there any wayy to avoid from the >> loop here in this case??? > >see above. > >note that using the where() or fancy indexing does mean you need to go >through the loop multiple times, but still probably much faster then >looping. For full-on speed for this sort of thing, Cython is a nice >option. > >-Chris > > >-- > >Christopher Barker, Ph.D. 
>Oceanographer > >Emergency Response Division >NOAA/NOS/OR&R (206) 526-6959 voice >7600 Sand Point Way NE (206) 526-6329 fax >Seattle, WA 98115 (206) 526-6317 main reception > >Chris.Barker at noaa.gov >_______________________________________________ >NumPy-Discussion mailing list >NumPy-Discussion at scipy.org >http://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- An HTML attachment was scrubbed... URL: From robince at gmail.com Thu Jan 3 10:54:25 2013 From: robince at gmail.com (Robin) Date: Thu, 3 Jan 2013 15:54:25 +0000 Subject: [Numpy-discussion] test failures when embedded (in matlab) Message-ID: Hi All, When using Numpy from an embedded Python (Python embedded in a Matlab mex function) I get a lot of test failures (see attached log). I am using CentOS 6.3, distribution packaged Python (2.6) and Numpy (1.4.1). Running numpy tests from a normal Python interpreter results in no errors or failures. Most of the failures look to be to do with errors calling Fortran functions - is it possible there is some linking / ABI problem? Would there be any way to overcome it? I get similar errors using EPD 7.3. Any advice appreciated, Cheers Robin -------------- next part -------------- A non-text attachment was scrubbed... Name: embed_numpy_test.log Type: application/octet-stream Size: 78041 bytes Desc: not available URL: From njs at pobox.com Thu Jan 3 15:06:57 2013 From: njs at pobox.com (Nathaniel Smith) Date: Thu, 3 Jan 2013 20:06:57 +0000 Subject: [Numpy-discussion] Do we want scalar casting to behave as it does at the moment? In-Reply-To: References: Message-ID: On Wed, Jan 2, 2013 at 11:24 AM, Nathaniel Smith wrote: > This discussion seems to have petered out without reaching consensus > one way or another. This seems like an important issue, so I've opened > a bug: > https://github.com/numpy/numpy/issues/2878 > Hopefully this way we'll at least not forget about it; also I tried to > summarize the main issues there and would welcome comments. Consensus in that bug report seems to be that for array/scalar operations like: np.array([1], dtype=np.int8) + 1000 # can't be represented as an int8! we should raise an error, rather than either silently upcasting the result (as in 1.6 and 1.7) or silently downcasting the scalar (as in 1.5 and earlier). The next question is how to handle the warning period, or if there should be a warning period, given that we've already silently changed the semantics of this operation, so raising a warning now is perhaps like noticing that the horses are gone and putting up a notice warning that we plan to close the barn door shortly. But then again, people who have already adjusted their code for 1.6 may appreciate such a warning. Or maybe no-one actually writes dumb things like int8-plus-1000 so it doesn't matter, but anyway I thought the list should have a heads-up :-) -n From andrew.collette at gmail.com Thu Jan 3 18:39:37 2013 From: andrew.collette at gmail.com (Andrew Collette) Date: Thu, 3 Jan 2013 16:39:37 -0700 Subject: [Numpy-discussion] Do we want scalar casting to behave as it does at the moment? In-Reply-To: References: Message-ID: > Consensus in that bug report seems to be that for array/scalar operations like: > np.array([1], dtype=np.int8) + 1000 # can't be represented as an int8! > we should raise an error, rather than either silently upcasting the > result (as in 1.6 and 1.7) or silently downcasting the scalar (as in > 1.5 and earlier). 
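A short snippet shows which of these behaviours the installed NumPy actually implements; the comments summarise the cases discussed in this thread rather than guaranteed output:

import numpy as np

a = np.array([1], dtype=np.int8)
r = a + 1000   # 1000 cannot be represented as int8
# numpy <= 1.5: the scalar is silently wrapped and the result stays int8;
# numpy 1.6/1.7: the result is upcast (to int16 here);
# under the proposal above, this addition would raise an error instead.
print(r, r.dtype)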
I have run into this a few times as a NumPy user, and I just wanted to comment that (in my opinion), having this case generate an error is the worst of both worlds. The reason people can't decide between rollover and promotion is because neither is objectively better. One avoids memory inflation, and the other avoids losing precision. You just need to pick one and document it. Kicking the can down the road to the user, and making him/her explicitly test for this condition, is not a very good solution. What does this mean in practical terms for NumPy users? I personally don't relish the choice of always using numpy.add, or always wrapping my additions in checks for ValueError. Andrew From d.s.seljebotn at astro.uio.no Thu Jan 3 19:11:21 2013 From: d.s.seljebotn at astro.uio.no (Dag Sverre Seljebotn) Date: Fri, 04 Jan 2013 01:11:21 +0100 Subject: [Numpy-discussion] Do we want scalar casting to behave as it does at the moment? In-Reply-To: References: Message-ID: <50E61E29.1020709@astro.uio.no> On 01/04/2013 12:39 AM, Andrew Collette wrote: >> Consensus in that bug report seems to be that for array/scalar operations like: >> np.array([1], dtype=np.int8) + 1000 # can't be represented as an int8! >> we should raise an error, rather than either silently upcasting the >> result (as in 1.6 and 1.7) or silently downcasting the scalar (as in >> 1.5 and earlier). > > I have run into this a few times as a NumPy user, and I just wanted to > comment that (in my opinion), having this case generate an error is > the worst of both worlds. The reason people can't decide between > rollover and promotion is because neither is objectively better. One If neither is objectively better, I think that is a very good reason to kick it down to the user. "Explicit is better than implicit". > avoids memory inflation, and the other avoids losing precision. You > just need to pick one and document it. Kicking the can down the road > to the user, and making him/her explicitly test for this condition, is > not a very good solution. It's a good solution to encourage bug-free code. It may not be a good solution to avoid typing. > What does this mean in practical terms for NumPy users? I personally > don't relish the choice of always using numpy.add, or always wrapping > my additions in checks for ValueError. I think you usually have a bug in your program when this happens, since either the dtype is wrong, or the value one is trying to store is wrong. I know that's true for myself, though I don't claim to know everybody elses usecases. Dag Sverre From njs at pobox.com Thu Jan 3 19:26:46 2013 From: njs at pobox.com (Nathaniel Smith) Date: Fri, 4 Jan 2013 00:26:46 +0000 Subject: [Numpy-discussion] Do we want scalar casting to behave as it does at the moment? In-Reply-To: References: Message-ID: On 3 Jan 2013 23:39, "Andrew Collette" wrote: > > > Consensus in that bug report seems to be that for array/scalar operations like: > > np.array([1], dtype=np.int8) + 1000 # can't be represented as an int8! > > we should raise an error, rather than either silently upcasting the > > result (as in 1.6 and 1.7) or silently downcasting the scalar (as in > > 1.5 and earlier). > > I have run into this a few times as a NumPy user, and I just wanted to > comment that (in my opinion), having this case generate an error is > the worst of both worlds. The reason people can't decide between > rollover and promotion is because neither is objectively better. One > avoids memory inflation, and the other avoids losing precision. 
You > just need to pick one and document it. Kicking the can down the road > to the user, and making him/her explicitly test for this condition, is > not a very good solution. > > What does this mean in practical terms for NumPy users? I personally > don't relish the choice of always using numpy.add, or always wrapping > my additions in checks for ValueError. To be clear: we're only talking here about the case where you have a mix of a narrow dtype in an array and a scalar value that cannot be represented in that narrow dtype. If both sides are arrays then we continue to upcast as normal. So my impression is that this means very little in practical terms, because this is a rare and historically poorly supported situation. But if this is something you're running into in practice then you may have a better idea than us about the practical effects. Do you have any examples where this has come up that you can share? -n -------------- next part -------------- An HTML attachment was scrubbed... URL: From p.j.a.cock at googlemail.com Thu Jan 3 19:39:50 2013 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Fri, 4 Jan 2013 00:39:50 +0000 Subject: [Numpy-discussion] Do we want scalar casting to behave as it does at the moment? In-Reply-To: <50E61E29.1020709@astro.uio.no> References: <50E61E29.1020709@astro.uio.no> Message-ID: On Fri, Jan 4, 2013 at 12:11 AM, Dag Sverre Seljebotn wrote: > On 01/04/2013 12:39 AM, Andrew Collette wrote: > > Nathaniel Smith wrote: > >> Consensus in that bug report seems to be that for array/scalar operations like: > >> np.array([1], dtype=np.int8) + 1000 # can't be represented as an int8! > >> we should raise an error, rather than either silently upcasting the > >> result (as in 1.6 and 1.7) or silently downcasting the scalar (as in > >> 1.5 and earlier). > > > > I have run into this a few times as a NumPy user, and I just wanted to > > comment that (in my opinion), having this case generate an error is > > the worst of both worlds. The reason people can't decide between > > rollover and promotion is because neither is objectively better. One > > If neither is objectively better, I think that is a very good reason to > kick it down to the user. "Explicit is better than implicit". > > > avoids memory inflation, and the other avoids losing precision. You > > just need to pick one and document it. Kicking the can down the road > > to the user, and making him/her explicitly test for this condition, is > > not a very good solution. > > It's a good solution to encourage bug-free code. It may not be a good > solution to avoid typing. > > > What does this mean in practical terms for NumPy users? I personally > > don't relish the choice of always using numpy.add, or always wrapping > > my additions in checks for ValueError. > > I think you usually have a bug in your program when this happens, since > either the dtype is wrong, or the value one is trying to store is wrong. > I know that's true for myself, though I don't claim to know everybody > elses usecases. I agree with Dag rather than Andrew, "Explicit is better than implicit". i.e. What Nathaniel described earlier as the apparent consensus. Since I've actually used NumPy arrays with specific low memory types, I thought I should comment about my use case if case it is helpful: I've only used the low precision types like np.uint8 (unsigned) where I needed to limit my memory usage. In this case, the topology of a graph allowing multiple edges held as an integer adjacency matrix, A. 
I would calculate things like A^n for paths of length n, and also make changes to A directly (e.g. adding edges). So an overflow was always possible, and neither the old behaviour (type preserving but wrapping on overflow giving data corruption) nor the current behaviour (type promotion overriding my deliberate memory management) are nice. My preferences here would be for an exception, so I knew right away. The other use case which comes to mind is dealing with low level libraries and/or file formats, and here automagic type promotion would probably be unwelcome. Regards, Peter From njs at pobox.com Thu Jan 3 19:49:24 2013 From: njs at pobox.com (Nathaniel Smith) Date: Fri, 4 Jan 2013 00:49:24 +0000 Subject: [Numpy-discussion] Do we want scalar casting to behave as it does at the moment? In-Reply-To: References: <50E61E29.1020709@astro.uio.no> Message-ID: On 4 Jan 2013 00:39, "Peter Cock" wrote: > I agree with Dag rather than Andrew, "Explicit is better than implicit". > i.e. What Nathaniel described earlier as the apparent consensus. > > Since I've actually used NumPy arrays with specific low memory > types, I thought I should comment about my use case if case it > is helpful: > > I've only used the low precision types like np.uint8 (unsigned) where > I needed to limit my memory usage. In this case, the topology of a > graph allowing multiple edges held as an integer adjacency matrix, A. > I would calculate things like A^n for paths of length n, and also make > changes to A directly (e.g. adding edges). So an overflow was always > possible, and neither the old behaviour (type preserving but wrapping > on overflow giving data corruption) nor the current behaviour (type > promotion overriding my deliberate memory management) are nice. > My preferences here would be for an exception, so I knew right away. I don't think the changes we're talking about here will help your use case actually; this is only about the specific case where one of your operands, itself, cannot be cleanly cast to the types being used for the operation - it won't detect overflow in general. For that you want #593: https://github.com/numpy/numpy/issues/593 On another note, while you're here, perhaps I can tempt you into having a go at fixing #593? :-) -n -------------- next part -------------- An HTML attachment was scrubbed... URL: From p.j.a.cock at googlemail.com Thu Jan 3 20:04:16 2013 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Fri, 4 Jan 2013 01:04:16 +0000 Subject: [Numpy-discussion] Do we want scalar casting to behave as it does at the moment? In-Reply-To: References: <50E61E29.1020709@astro.uio.no> Message-ID: On Fri, Jan 4, 2013 at 12:39 AM, Peter Cock wrote: >> Since I've actually used NumPy arrays with specific low memory >> types, I thought I should comment about my use case if case it >> is helpful: >> >> I've only used the low precision types like np.uint8 (unsigned) where >> I needed to limit my memory usage. In this case, the topology of a >> graph allowing multiple edges held as an integer adjacency matrix, A. >> I would calculate things like A^n for paths of length n, and also make >> changes to A directly (e.g. adding edges). So an overflow was always >> possible, and neither the old behaviour (type preserving but wrapping >> on overflow giving data corruption) nor the current behaviour (type >> promotion overriding my deliberate memory management) are nice. >> My preferences here would be for an exception, so I knew right away. 
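Until NumPy can raise on integer overflow (issue #593 mentioned above), one workaround for this kind of adjacency-matrix calculation is to repeat a step in a wider dtype and compare against the narrow type's range; a sketch only, not a general or efficient solution:

import numpy as np

A = np.random.randint(0, 3, size=(100, 100)).astype(np.uint8)

paths = A.dot(A)   # number of length-2 paths; may silently wrap in uint8
wide = A.astype(np.uint64).dot(A.astype(np.uint64))
if (wide > np.iinfo(np.uint8).max).any():
    raise OverflowError("uint8 adjacency product overflowed")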
>> >> The other use case which comes to mind is dealing with low level >> libraries and/or file formats, and here automagic type promotion >> would probably be unwelcome. > > Regards, > > Peter Elsewhere on the thread, Nathaniel Smith wrote: > > To be clear: we're only talking here about the case where you have a mix of > a narrow dtype in an array and a scalar value that cannot be represented in > that narrow dtype. If both sides are arrays then we continue to upcast as > normal. So my impression is that this means very little in practical terms, > because this is a rare and historically poorly supported situation. > > But if this is something you're running into in practice then you may have a > better idea than us about the practical effects. Do you have any examples > where this has come up that you can share? > > -n Clarification appreciated - on closer inspection for my adjacency matrix example I would not fall over the issue in https://github.com/numpy/numpy/issues/2878 >>> import numpy as np >>> np.__version__ '1.6.1' >>> A = np.zeros((100,100), np.uint8) # Matrix could be very big >>> A[3,4] = 255 # Max value, setting up next step in example >>> A[3,4] 255 >>> A[3,4] += 1 # Silently overflows on NumPy 1.6 >>> A[3,4] 0 To trigger the contentious behaviour I'd have to do something like this: >>> A = np.zeros((100,100), np.uint8) >>> B = A + 256 >>> B array([[256, 256, 256, ..., 256, 256, 256], [256, 256, 256, ..., 256, 256, 256], [256, 256, 256, ..., 256, 256, 256], ..., [256, 256, 256, ..., 256, 256, 256], [256, 256, 256, ..., 256, 256, 256], [256, 256, 256, ..., 256, 256, 256]], dtype=uint16) I wasn't doing anything like that in my code though, just simple matrix multiplication and in situ element modification, for example A[i,j] += 1 to add an edge. I still agree that for https://github.com/numpy/numpy/issues/2878 an exception sounds sensible. Peter From andrew.collette at gmail.com Thu Jan 3 20:15:41 2013 From: andrew.collette at gmail.com (Andrew Collette) Date: Thu, 3 Jan 2013 18:15:41 -0700 Subject: [Numpy-discussion] Do we want scalar casting to behave as it does at the moment? In-Reply-To: <50E61E29.1020709@astro.uio.no> References: <50E61E29.1020709@astro.uio.no> Message-ID: Hi Dag, > If neither is objectively better, I think that is a very good reason to > kick it down to the user. "Explicit is better than implicit". I agree with you, up to a point. However, we are talking about an extremely common operation that I think most people (myself included) would not expect to raise an exception: namely, adding a number to an array. > It's a good solution to encourage bug-free code. It may not be a good > solution to avoid typing. Ha! But seriously, checking every time I make an addition? And in the current version of numpy it's not buggy code to add 128 to an int8 array; it's documented to give you an int16 with the result of the addition. Maybe it shouldn't, but that's what it does. > I think you usually have a bug in your program when this happens, since > either the dtype is wrong, or the value one is trying to store is wrong. > I know that's true for myself, though I don't claim to know everybody > elses usecases. I don't think it's unreasonable to add a number to an int16 array (or int32), and rely on specific, documented behavior if the number is outside the range. For example, IDL will clip the value. Up until 1.6, in NumPy it would roll over. Currently it upcasts. 
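Whichever default NumPy settles on, each of the three behaviours mentioned here (upcast, rollover, IDL-style clipping) can be requested explicitly; a sketch, with int8 and the value 130 as arbitrary examples:

import numpy as np

a = np.array([100, 120], dtype=np.int8)
lo, hi = np.iinfo(np.int8).min, np.iinfo(np.int8).max

up = a.astype(np.int16) + 130                    # explicit upcast: int16 result
wrap = a + np.array(130).astype(np.int8)         # explicit rollover: 130 wraps to -126
clip = np.clip(a.astype(np.int16) + 130, lo, hi).astype(np.int8)   # explicit clip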
I won't make the case for upcasting vs rollover again, as I think that's dealt with extensively in the threads linked in the bug. I am concerned about the tests I need to add wherever I might have a scalar, or the program blows up. It occurs to me that, if I have "a = b + c" in my code, and "c" is sometimes a scalar and sometimes an array, I will get different behavior. If I have this right, if "c" is an array of larger dtype, including a 1-element array, it will upcast, if it's the same dtype, it will roll over regardless, but if it's a scalar and the result won't fit, it will raise ValueError. By the way, how do I test for this? I can't test just the scalar because the proposed behavior (as I understand it) considers the result of the addition. Should I always compute amax (nanmax)? Do I need to try adding them and look for ValueError? And things like this suddenly become dangerous: try: some_function(myarray + something) except ValueError: print "Problem in some_function!" Nathaniel asked: > But if this is something you're running into in practice then you may have a better idea than us about the practical effects. Do you have any examples where this has come up that you can share? The only time I really ran into the 1.5/1.6 change was some old code ported from IDL which did odd things with the wrapping behavior. But what I'm really trying to get a handle on here is the proposed future behavior. I am coming to this from the perspective of both a user and a library developer (h5py) trying to work out what if anything I have to do when handling arrays and values I get from users. Andrew From p.j.a.cock at googlemail.com Thu Jan 3 20:17:42 2013 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Fri, 4 Jan 2013 01:17:42 +0000 Subject: [Numpy-discussion] Do we want scalar casting to behave as it does at the moment? In-Reply-To: References: <50E61E29.1020709@astro.uio.no> Message-ID: On Fri, Jan 4, 2013 at 12:49 AM, Nathaniel Smith wrote: > On 4 Jan 2013 00:39, "Peter Cock" wrote: >> I agree with Dag rather than Andrew, "Explicit is better than implicit". >> i.e. What Nathaniel described earlier as the apparent consensus. >> >> Since I've actually used NumPy arrays with specific low memory >> types, I thought I should comment about my use case if case it >> is helpful: >> >> I've only used the low precision types like np.uint8 (unsigned) where >> I needed to limit my memory usage. In this case, the topology of a >> graph allowing multiple edges held as an integer adjacency matrix, A. >> I would calculate things like A^n for paths of length n, and also make >> changes to A directly (e.g. adding edges). So an overflow was always >> possible, and neither the old behaviour (type preserving but wrapping >> on overflow giving data corruption) nor the current behaviour (type >> promotion overriding my deliberate memory management) are nice. >> My preferences here would be for an exception, so I knew right away. > > I don't think the changes we're talking about here will help your use case > actually; this is only about the specific case where one of your operands, > itself, cannot be cleanly cast to the types being used for the operation - Understood - I replied to your other message before I saw this one. > it won't detect overflow in general. For that you want #593: > https://github.com/numpy/numpy/issues/593 > > On another note, while you're here, perhaps I can tempt you into having a go > at fixing #593? :-) > > -n I agree, and have commented on that issue. 
Thanks for pointing me to that separate issue. Peter From shish at keba.be Thu Jan 3 21:39:05 2013 From: shish at keba.be (Olivier Delalleau) Date: Thu, 3 Jan 2013 21:39:05 -0500 Subject: [Numpy-discussion] Do we want scalar casting to behave as it does at the moment? In-Reply-To: References: <50E61E29.1020709@astro.uio.no> Message-ID: 2013/1/3 Andrew Collette : > Hi Dag, > >> If neither is objectively better, I think that is a very good reason to >> kick it down to the user. "Explicit is better than implicit". > > I agree with you, up to a point. However, we are talking about an > extremely common operation that I think most people (myself included) > would not expect to raise an exception: namely, adding a number to an > array. > >> It's a good solution to encourage bug-free code. It may not be a good >> solution to avoid typing. > > Ha! But seriously, checking every time I make an addition? And in > the current version of numpy it's not buggy code to add 128 to an int8 > array; it's documented to give you an int16 with the result of the > addition. Maybe it shouldn't, but that's what it does. > >> I think you usually have a bug in your program when this happens, since >> either the dtype is wrong, or the value one is trying to store is wrong. >> I know that's true for myself, though I don't claim to know everybody >> elses usecases. > > I don't think it's unreasonable to add a number to an int16 array (or > int32), and rely on specific, documented behavior if the number is > outside the range. For example, IDL will clip the value. Up until > 1.6, in NumPy it would roll over. Currently it upcasts. > > I won't make the case for upcasting vs rollover again, as I think > that's dealt with extensively in the threads linked in the bug. I am > concerned about the tests I need to add wherever I might have a > scalar, or the program blows up. > > It occurs to me that, if I have "a = b + c" in my code, and "c" is > sometimes a scalar and sometimes an array, I will get different > behavior. If I have this right, if "c" is an array of larger dtype, > including a 1-element array, it will upcast, if it's the same dtype, > it will roll over regardless, but if it's a scalar and the result > won't fit, it will raise ValueError. > > By the way, how do I test for this? I can't test just the scalar > because the proposed behavior (as I understand it) considers the > result of the addition. Should I always compute amax (nanmax)? Do I > need to try adding them and look for ValueError? > > And things like this suddenly become dangerous: > > try: > some_function(myarray + something) > except ValueError: > print "Problem in some_function!" Actually, the proposed behavior considers only the value of the scalar, not the result of the addition. So the correct way to do things with this proposal would be to be sure you don't add to an array a scalar value that can't fit in the array's dtype. In 1.6.1, you should make this check anyway, since otherwise your computation can be doing something completely different without telling you (and I doubt it's what you'd want): In [50]: np.array([2], dtype='int8') + 127 Out[50]: array([-127], dtype=int8) In [51]: np.array([2], dtype='int8') + 128 Out[51]: array([130], dtype=int16) If the decision is to always roll-over, the first thing to decide is whether this means the scalar is downcasted, or the output of the computation. 
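The distinction drawn here (downcast the scalar before the operation, versus compute at higher precision and downcast the result) is easiest to see with the "maximum" example discussed next; a sketch of the two candidate semantics:

import numpy as np

a = np.ones(1, dtype=np.int8)

# Downcast the scalar first: 128 becomes -128 as int8, so the result is [1].
np.maximum(a, np.array(128).astype(np.int8))

# Compute wide, then downcast the output: max(1, 128) is 128,
# which wraps to -128 when forced back into int8, giving [-128].
np.maximum(a.astype(np.int16), 128).astype(np.int8)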
It doesn't matter for +, but for instance for the "maximum" ufunc, I don't think it makes sense to perform the computation at higher precision then downcast the output, as you would otherwise have: np.maximum(np.ones(1, dtype='int8'), 128)) == [-128] So out of consistency (across ufuncs) I think it should always downcast the scalar (it has the advantage of being more efficient too, since you don't need to do an upcast to perform the computation). But then you're up for some nasty surprise if your scalar overflows and you didn't expect it. For instance the "maximum" example above would return [1], which may be expected... or not (maybe you wanted to obtain [128] instead?). Another solution is to forget about trying to be smart and always upcast the operation. That would be my 2nd preferred solution, but it would make it very annoying to deal with Python scalars (typically int64 / float64) that would be upcasting lots of things, potentially breaking a significant amount of existing code. So, personally, I don't see a straightforward solution without warning/error, that would be safe enough for programmers. -=- Olivier > > Nathaniel asked: > >> But if this is something you're running into in practice then you may have a better idea than us about the practical effects. Do you have any examples where this has come up that you can share? > > The only time I really ran into the 1.5/1.6 change was some old code > ported from IDL which did odd things with the wrapping behavior. But > what I'm really trying to get a handle on here is the proposed future > behavior. I am coming to this from the perspective of both a user and > a library developer (h5py) trying to work out what if anything I have > to do when handling arrays and values I get from users. > > Andrew From andrew.collette at gmail.com Thu Jan 3 22:35:57 2013 From: andrew.collette at gmail.com (Andrew Collette) Date: Thu, 3 Jan 2013 20:35:57 -0700 Subject: [Numpy-discussion] Do we want scalar casting to behave as it does at the moment? In-Reply-To: References: <50E61E29.1020709@astro.uio.no> Message-ID: Hi Olivier, > Another solution is to forget about trying to be smart and always > upcast the operation. That would be my 2nd preferred solution, but it > would make it very annoying to deal with Python scalars (typically > int64 / float64) that would be upcasting lots of things, potentially > breaking a significant amount of existing code. > > So, personally, I don't see a straightforward solution without > warning/error, that would be safe enough for programmers. I guess what's really confusing me here is that I had assumed that this: result = myarray + scalar was equivalent to this: result = myarray + numpy.array(scalar) where the dtype of the converted scalar was chosen to be "just big enough" for it to fit. Then you proceed using the normal rules for array addition. Yes, you can have upcasting or rollover depending on the values involved, but you have that anyway with array addition; it's just how arrays work in NumPy. Also, have I got this (proposed behavior) right? array([127], dtype=int8) + 128 -> ValueError array([127], dtype=int8) + 127 -> -2 It seems like all this does is raise an error when the current rules would require upcasting, but still allows rollover for smaller values. What error condition, specifically, is the ValueError designed to tell me about? You can still get "unexpected" data (if you're not expecting rollover) with no exception. 
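On the practical question of how to test a scalar before adding it: under the proposal the check depends only on the scalar's value and the array's dtype, not on the result of the addition, so a guard can be written without computing anything. A sketch for integer dtypes (scalar_fits is just an illustrative helper):

import numpy as np

def scalar_fits(value, dtype):
    # True if `value` is representable in the integer dtype.
    info = np.iinfo(dtype)
    return info.min <= value <= info.max

a = np.array([1, 2, 3], dtype=np.int16)
scalar_fits(32767, a.dtype)   # True: a + 32767 stays int16 (and may wrap)
scalar_fits(32768, a.dtype)   # False: upcasts today, would raise under the proposal

# With the value-based casting rules of numpy 1.6/1.7, np.can_cast
# expresses the same test directly:
np.can_cast(32768, a.dtype)   # False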
Andrew From ondrej.certik at gmail.com Fri Jan 4 00:16:33 2013 From: ondrej.certik at gmail.com (=?UTF-8?B?T25kxZllaiDEjGVydMOtaw==?=) Date: Thu, 3 Jan 2013 21:16:33 -0800 Subject: [Numpy-discussion] test failures when embedded (in matlab) In-Reply-To: References: Message-ID: On Thu, Jan 3, 2013 at 7:54 AM, Robin wrote: > Hi All, > > When using Numpy from an embedded Python (Python embedded in a Matlab > mex function) I get a lot of test failures (see attached log). > > I am using CentOS 6.3, distribution packaged Python (2.6) and Numpy > (1.4.1). Running numpy tests from a normal Python interpreter results > in no errors or failures. > > Most of the failures look to be to do with errors calling Fortran > functions - is it possible there is some linking / ABI problem? Would > there be any way to overcome it? > > I get similar errors using EPD 7.3. > > Any advice appreciated, In your log I can see failures of the type: ====================================================================== ERROR: test_cdouble (test_linalg.TestDet) ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/lib64/python2.6/site-packages/numpy/linalg/tests/test_linalg.py", line 44, in test_cdouble self.do(a, b) File "/usr/lib64/python2.6/site-packages/numpy/linalg/tests/test_linalg.py", line 129, in do d = linalg.det(a) File "/usr/lib64/python2.6/site-packages/numpy/linalg/linalg.py", line 1503, in det raise TypeError, "Illegal input to Fortran routine" TypeError: Illegal input to Fortran routine So I can only offer a general advice, that I learned while fixing release critical bugs in NumPy: I would look into the source file numpy/linalg/linalg.py, line 1503 and start debugging to figure out why the TypeError is raised. Which exact numpy do you use? In the latest master, the line numbers are different and the det() routine seems to be reworked. But in general, you can see there a code like this: results = lapack_routine(n, n, a, n, pivots, 0) info = results['info'] if (info < 0): raise TypeError("Illegal input to Fortran routine") so that typically means that some wrong argument are being passed to the Lapack routine. Try to print the "info" variable and then lookup the Lapack documentation, it should say more (e.g. which exact argument is wrong). Then you can go from there, e.g. I would put some debug print statements into the code which gets called in lapack_routine(), i.e. is it lapack_lite from NumPy, or some other Lapack implementation? And so on. Ondrej From mike.r.anderson.13 at gmail.com Fri Jan 4 01:29:39 2013 From: mike.r.anderson.13 at gmail.com (Mike Anderson) Date: Fri, 4 Jan 2013 14:29:39 +0800 Subject: [Numpy-discussion] Insights / lessons learned from NumPy design Message-ID: Hello all, In the Clojure community there has been some discussion about creating a common matrix maths library / API. Currently there are a few different fledgeling matrix libraries in Clojure, so it seemed like a worthwhile effort to unify them and have a common base on which to build on. NumPy has been something of an inspiration for this, so I though I'd ask here to see what lessons have been learned. We're thinking of a matrix library with roughly the following design (subject to change!) - Support for multi-dimensional matrices (but with fast paths for 1D vectors and 2D matrices as the common cases) - Immutability by default, i.e. matrix operations are pure functions that create new matrices. 
There could be a "backdoor" option to mutate matrices, but that would be unidiomatic in Clojure - Support for 64-bit double precision floats only (this is the standard float type in Clojure) - Ability to support multiple different back-end matrix implementations (JBLAS, Colt, EJML, Vectorz, javax.vecmath etc.) - A full range of matrix operations. Operations would be delegated to back end implementations where they are supported, otherwise generic implementations could be used. Any thoughts on this topic based on the NumPy experience? In particular would be very interesting to know: - Features in NumPy which proved to be redundant / not worth the effort - Features that you wish had been designed in at the start - Design decisions that turned out to be a particularly big mistake / success Would love to hear your insights, any ideas+advice greatly appreciated! Mike. -------------- next part -------------- An HTML attachment was scrubbed... URL: From raul at virtualmaterials.com Fri Jan 4 01:50:37 2013 From: raul at virtualmaterials.com (Raul Cota) Date: Thu, 03 Jan 2013 23:50:37 -0700 Subject: [Numpy-discussion] Numpy speed ups to simple tasks - final findings and suggestions In-Reply-To: References: <50D4B69A.7000409@virtualmaterials.com> Message-ID: <50E67BBD.7090804@virtualmaterials.com> On 02/01/2013 7:56 AM, Nathaniel Smith wrote: > On Fri, Dec 21, 2012 at 7:20 PM, Raul Cota wrote: >> Hello, >> >> >> On Dec/2/2012 I sent an email about some meaningful speed problems I was >> facing when porting our core program from Numeric (Python 2.2) to Numpy >> (Python 2.6). Some of our tests went from 30 seconds to 90 seconds for >> example. > > Hi Raul, > > This is great work! Sorry you haven't gotten any feedback yet -- I > guess it's a busy time of year for most people; and, the way you've > described your changes makes it hard for us to use our usual workflow > to discuss them. > Sorry about that. >> These are the actual changes to the C code, >> For bottleneck (a) >> >> In general, >> - avoid calls to PyObject_GetAttrString when I know the type is >> List, None, Tuple, Float, Int, String or Unicode >> >> - avoid calls to PyObject_GetBuffer when I know the type is >> List, None or Tuple > > This definitely seems like a worthwhile change. There are possible > quibbles about coding style -- the macros could have better names, and > would probably be better as (inline) functions instead of macros -- > but that can be dealt with. > > Can you make a pull request on github with these changes? I guess you > haven't used git before, but I think you'll find it makes things > *much* easier (in particular, you'll never have to type out long > awkward english descriptions of the changes you made ever again!) We > have docs here: > http://docs.scipy.org/doc/numpy/dev/gitwash/git_development.html > and your goal is to get to the point where you can file a "pull request": > http://docs.scipy.org/doc/numpy/dev/gitwash/development_workflow.html#asking-for-your-changes-to-be-merged-with-the-main-repo > Feel free to ask on the list if you get stuck of course. > >> For bottleneck (b) >> >> b.1) >> I noticed that PyFloat * Float64 resulted in an unnecessary "on the fly" >> conversion of the PyFloat into a Float64 to extract its underlying C >> double value. This happened in the function >> _double_convert_to_ctype which comes from the pattern, >> _ at name@_convert_to_ctype > > This also sounds like an excellent change, and perhaps should be > extended to ints and bools as well... 
again, can you file a pull > request? > >> b.2) This is the change that may not be very popular among Numpy users. >> I modified Float64 operations to return a Float instead of Float64. I >> could not think or see any ill effects and I got a fairly decent speed >> boost. > > Yes, unfortunately, there's no way we'll be able to make this change > upstream -- there's too much chance of it breaking people's code. (And > numpy float64's do act different than python floats in at least some > cases, e.g., numpy gives more powerful control over floating point > error handling, see np.seterr.) I thought so. I may keep a fork of the changes for myself. > > But, it's almost certainly possible to optimize numpy's float64 (and > friends), so that they are themselves (almost) as fast as the native > python objects. And that would help all the code that uses them, not > just the ones where regular python floats could be substituted > instead. Have you tried profiling, say, float64 * float64 to figure > out where the bottlenecks are? > Seems to be split between - (primarily) the memory allocation/deallocation of the float64 that is created from the operation float64 * float64. This is the reason why float64 * Pyfloat got improved with one of my changes because PyFloat was being internally converted into a float64 before doing the multiplication. - the rest of the time is the actual multiplication path way. I attach an image of the profiler using the original numpy code with a loop on val = float64 * float64 * float64 * float64 Let me know if something is not clear. Raul > -n > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- A non-text attachment was scrubbed... Name: numpy_prof.png Type: image/png Size: 39190 bytes Desc: not available URL: From raul at virtualmaterials.com Fri Jan 4 01:56:03 2013 From: raul at virtualmaterials.com (Raul Cota) Date: Thu, 03 Jan 2013 23:56:03 -0700 Subject: [Numpy-discussion] Numpy speed ups to simple tasks - final findings and suggestions In-Reply-To: References: <50D4B69A.7000409@virtualmaterials.com> Message-ID: <50E67D03.6000403@virtualmaterials.com> On 02/01/2013 7:58 AM, Nathaniel Smith wrote: > On Wed, Jan 2, 2013 at 2:56 PM, Nathaniel Smith wrote: >> On Fri, Dec 21, 2012 at 7:20 PM, Raul Cota wrote: >>> b.1) >>> I noticed that PyFloat * Float64 resulted in an unnecessary "on the fly" >>> conversion of the PyFloat into a Float64 to extract its underlying C >>> double value. This happened in the function >>> _double_convert_to_ctype which comes from the pattern, >>> _ at name@_convert_to_ctype >> >> This also sounds like an excellent change, and perhaps should be >> extended to ints and bools as well... again, can you file a pull >> request? > > Immediately after I hit 'send' I realized this might be unclear... > what I mean is, please file two separate pull requests, one for the > (a) changes and one for the (b.1) changes. They're logically separate > so it'll be easier to review and merge them separately. > I understood it like that :) I will give it a try. 
Thanks for the feedback Raul > -n > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From d.s.seljebotn at astro.uio.no Fri Jan 4 03:00:42 2013 From: d.s.seljebotn at astro.uio.no (Dag Sverre Seljebotn) Date: Fri, 04 Jan 2013 09:00:42 +0100 Subject: [Numpy-discussion] Insights / lessons learned from NumPy design In-Reply-To: References: Message-ID: <50E68C2A.9060400@astro.uio.no> On 01/04/2013 07:29 AM, Mike Anderson wrote: > Hello all, > > In the Clojure community there has been some discussion about creating a > common matrix maths library / API. Currently there are a few different > fledgeling matrix libraries in Clojure, so it seemed like a worthwhile > effort to unify them and have a common base on which to build on. > > NumPy has been something of an inspiration for this, so I though I'd ask > here to see what lessons have been learned. > > We're thinking of a matrix library with roughly the following design > (subject to change!) > - Support for multi-dimensional matrices (but with fast paths for 1D > vectors and 2D matrices as the common cases) Food for thought: Myself I have vectors that are naturally stored in 2D, "matrices" that can be naturally stored in 4D and so on (you can't view them that way when doing linear algebra, it's just that the indices can have multiple components) -- I like that NumPy calls everything "array"; I think vector and matrix are higher-level mathematical concepts. > - Immutability by default, i.e. matrix operations are pure functions > that create new matrices. There could be a "backdoor" option to mutate > matrices, but that would be unidiomatic in Clojure Sounds very promising (assuming you can reuse the buffer if the input matrix had no other references and is not used again?). It's very common for NumPy arrays to fill a large chunk of the available memory (think 20-100 GB), so for those users this would need to be coupled with buffer reuse and good diagnostics that help remove references to old generations of a matrix. > - Support for 64-bit double precision floats only (this is the standard > float type in Clojure) > - Ability to support multiple different back-end matrix implementations > (JBLAS, Colt, EJML, Vectorz, javax.vecmath etc.) > - A full range of matrix operations. Operations would be delegated to > back end implementations where they are supported, otherwise generic > implementations could be used. > > Any thoughts on this topic based on the NumPy experience? In particular > would be very interesting to know: > - Features in NumPy which proved to be redundant / not worth the effort > - Features that you wish had been designed in at the start > - Design decisions that turned out to be a particularly big mistake / > success > > Would love to hear your insights, any ideas+advice greatly appreciated! Travis Oliphant noted some of his thoughts on this in the recent thread "DARPA funding for Blaze and passing the NumPy torch" which is a must-read. 
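(For what it's worth, the closest NumPy itself gets to that kind of
explicit buffer reuse is the out= argument of the ufuncs -- a rough
sketch:

import numpy as np
a = np.ones((1000, 1000))
b = np.ones((1000, 1000))
c = a + b            # allocates a brand new result array
np.add(a, b, out=a)  # writes the result into a's existing buffer

-- by default every operation allocates a fresh result, which is
exactly what hurts at the 20-100 GB scale.)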
Dag Sverre From d.s.seljebotn at astro.uio.no Fri Jan 4 03:13:06 2013 From: d.s.seljebotn at astro.uio.no (Dag Sverre Seljebotn) Date: Fri, 04 Jan 2013 09:13:06 +0100 Subject: [Numpy-discussion] Insights / lessons learned from NumPy design In-Reply-To: <50E68C2A.9060400@astro.uio.no> References: <50E68C2A.9060400@astro.uio.no> Message-ID: <50E68F12.90804@astro.uio.no> On 01/04/2013 09:00 AM, Dag Sverre Seljebotn wrote: > On 01/04/2013 07:29 AM, Mike Anderson wrote: >> Hello all, >> >> In the Clojure community there has been some discussion about creating a >> common matrix maths library / API. Currently there are a few different >> fledgeling matrix libraries in Clojure, so it seemed like a worthwhile >> effort to unify them and have a common base on which to build on. >> >> NumPy has been something of an inspiration for this, so I though I'd ask >> here to see what lessons have been learned. >> >> We're thinking of a matrix library with roughly the following design >> (subject to change!) >> - Support for multi-dimensional matrices (but with fast paths for 1D >> vectors and 2D matrices as the common cases) > > Food for thought: Myself I have vectors that are naturally stored in 2D, > "matrices" that can be naturally stored in 4D and so on (you can't view > them that way when doing linear algebra, it's just that the indices can > have multiple components) -- I like that NumPy calls everything "array"; > I think vector and matrix are higher-level mathematical concepts. > >> - Immutability by default, i.e. matrix operations are pure functions >> that create new matrices. There could be a "backdoor" option to mutate >> matrices, but that would be unidiomatic in Clojure > > Sounds very promising (assuming you can reuse the buffer if the input > matrix had no other references and is not used again?). It's very common > for NumPy arrays to fill a large chunk of the available memory (think > 20-100 GB), so for those users this would need to be coupled with buffer > reuse and good diagnostics that help remove references to old > generations of a matrix. Oh: Depending on your amibitions, it's worth thinking hard about i) storage format, and ii) lazy evaluation. Storage format: The new trend is for more flexible formats than just column-major/row-major, e.g., storing cache-sized n-dimensional tiles. Lazy evaluation: The big problem with numpy is that "a + b + np.sqrt(c)" will first make a temporary result for "a + b", rather than doing the whole expression on the fly, which is *very* bad for performance. So if you want immutability, I urge you to consider every operation to build up an expression tree/"program", and then either find out the smart points where you interpret that program automatically, or make explicit eval() of an expression tree the default mode. Of course this depends all on how ambitious you are. It's probably best to have a look at all the projects designed in order to get around NumPy's short-comings: - Blaze (in development, continuum.io) - Theano - Numexpr Related: - HDF chunks - To some degree Cython Dag Sverre > >> - Support for 64-bit double precision floats only (this is the standard >> float type in Clojure) >> - Ability to support multiple different back-end matrix implementations >> (JBLAS, Colt, EJML, Vectorz, javax.vecmath etc.) >> - A full range of matrix operations. Operations would be delegated to >> back end implementations where they are supported, otherwise generic >> implementations could be used. >> >> Any thoughts on this topic based on the NumPy experience? 
In particular >> would be very interesting to know: >> - Features in NumPy which proved to be redundant / not worth the effort >> - Features that you wish had been designed in at the start >> - Design decisions that turned out to be a particularly big mistake / >> success >> >> Would love to hear your insights, any ideas+advice greatly appreciated! > > Travis Oliphant noted some of his thoughts on this in the recent thread > "DARPA funding for Blaze and passing the NumPy torch" which is a must-read. > > Dag Sverre From matthew.brett at gmail.com Fri Jan 4 06:09:23 2013 From: matthew.brett at gmail.com (Matthew Brett) Date: Fri, 4 Jan 2013 11:09:23 +0000 Subject: [Numpy-discussion] Scalar casting rules use-case reprise Message-ID: Hi, Reading the discussion on the scalar casting rule change I realized I was hazy on the use-cases that led to the rule that scalars cast differently from arrays. My impression was that the primary use-case was for lower-precision floats. That is, when you have a large float32 arr, you do not want to double your memory use with: >>> large_float32 + 1.0 # please no float64 here Probably also: >>> large_int8 + 1 # please no int32 / int64 here. That makes sense. On the other hand these are more ambiguous: >>> large_float32 + np.float64(1) # really - you don't want float64? >>> large_int8 + np.int32(1) # ditto I wonder whether the main use-case was to deal with the automatic types of Python floats and scalars? That is, I wonder whether it would be worth considering (in the distant long term), doing fancy guess-what-you-mean stuff with Python scalars, on the basis that they are of unspecified dtype, and make 0 dimensional scalars follow the array casting rules. As in: >>> large_float32 + 1.0 # no upcast - we don't know what float type you meant for the scalar >>> large_float32 + np.float64(1) # upcast - you clearly meant the scalar to be float64 In any case, can anyone remember the original use-cases well enough to record them for future decision making? Best, Matthew From njs at pobox.com Fri Jan 4 08:46:40 2013 From: njs at pobox.com (Nathaniel Smith) Date: Fri, 4 Jan 2013 13:46:40 +0000 Subject: [Numpy-discussion] Scalar casting rules use-case reprise In-Reply-To: References: Message-ID: On Fri, Jan 4, 2013 at 11:09 AM, Matthew Brett wrote: > Hi, > > Reading the discussion on the scalar casting rule change I realized I > was hazy on the use-cases that led to the rule that scalars cast > differently from arrays. > > My impression was that the primary use-case was for lower-precision > floats. That is, when you have a large float32 arr, you do not want to > double your memory use with: > >>>> large_float32 + 1.0 # please no float64 here > > Probably also: > >>>> large_int8 + 1 # please no int32 / int64 here. > > That makes sense. On the other hand these are more ambiguous: > >>>> large_float32 + np.float64(1) # really - you don't want float64? > >>>> large_int8 + np.int32(1) # ditto > > I wonder whether the main use-case was to deal with the automatic > types of Python floats and scalars? That is, I wonder whether it > would be worth considering (in the distant long term), doing fancy > guess-what-you-mean stuff with Python scalars, on the basis that they > are of unspecified dtype, and make 0 dimensional scalars follow the > array casting rules. 
As in: > >>>> large_float32 + 1.0 > # no upcast - we don't know what float type you meant for the scalar >>>> large_float32 + np.float64(1) > # upcast - you clearly meant the scalar to be float64 Hmm, but consider this, which is exactly the operation in your example: In [9]: a = np.arange(3, dtype=np.float32) In [10]: a / np.mean(a) # normalize Out[10]: array([ 0., 1., 2.], dtype=float32) In [11]: type(np.mean(a)) Out[11]: numpy.float64 Obviously the most common situation where it's useful to have the rule to ignore scalar width is for avoiding "width contamination" from Python float and int literals. But you can easily end up with numpy scalars from indexing, high-precision operations like np.mean, etc., where you don't "really mean" you want high-precision. And at least it's easy to understand the rule: same-kind scalars don't affect precision. ...Though arguably the bug here is that np.mean actually returns a value with higher precision. Interestingly, we seem to have some special cases so that if you want to normalize each row of a matrix, then again the dtype is preserved, but for a totally different reasons. In a = np.arange(4, dtype=np.float32).reshape((2, 2)) a / np.mean(a, axis=0, keepdims=True) the result has float32 type, even though this is an array/array operation, not an array/scalar operation. The reason is: In [32]: np.mean(a).dtype Out[32]: dtype('float64') But: In [33]: np.mean(a, axis=0).dtype Out[33]: dtype('float32') In this respect np.var and np.std behave like np.mean, but np.sum always preserves the input dtype. (Which is curious because np.sum is just like np.mean in terms of potential loss of precision, right? The problem in np.mean is the accumulating error over many addition operations, not the divide-by-n at the end.) It is very disturbing that even after this discussion none of us here seem to actually have a precise understanding of how the numpy type selection system actually works :-(. We really need a formal description... -n From d.s.seljebotn at astro.uio.no Fri Jan 4 09:01:15 2013 From: d.s.seljebotn at astro.uio.no (Dag Sverre Seljebotn) Date: Fri, 04 Jan 2013 15:01:15 +0100 Subject: [Numpy-discussion] Scalar casting rules use-case reprise In-Reply-To: References: Message-ID: <50E6E0AB.5090904@astro.uio.no> On 01/04/2013 02:46 PM, Nathaniel Smith wrote: > On Fri, Jan 4, 2013 at 11:09 AM, Matthew Brett wrote: >> Hi, >> >> Reading the discussion on the scalar casting rule change I realized I >> was hazy on the use-cases that led to the rule that scalars cast >> differently from arrays. >> >> My impression was that the primary use-case was for lower-precision >> floats. That is, when you have a large float32 arr, you do not want to >> double your memory use with: >> >>>>> large_float32 + 1.0 # please no float64 here >> >> Probably also: >> >>>>> large_int8 + 1 # please no int32 / int64 here. >> >> That makes sense. On the other hand these are more ambiguous: >> >>>>> large_float32 + np.float64(1) # really - you don't want float64? >> >>>>> large_int8 + np.int32(1) # ditto >> >> I wonder whether the main use-case was to deal with the automatic >> types of Python floats and scalars? That is, I wonder whether it >> would be worth considering (in the distant long term), doing fancy >> guess-what-you-mean stuff with Python scalars, on the basis that they >> are of unspecified dtype, and make 0 dimensional scalars follow the >> array casting rules. 
As in: >> >>>>> large_float32 + 1.0 >> # no upcast - we don't know what float type you meant for the scalar >>>>> large_float32 + np.float64(1) >> # upcast - you clearly meant the scalar to be float64 > > Hmm, but consider this, which is exactly the operation in your example: > > In [9]: a = np.arange(3, dtype=np.float32) > > In [10]: a / np.mean(a) # normalize > Out[10]: array([ 0., 1., 2.], dtype=float32) > > In [11]: type(np.mean(a)) > Out[11]: numpy.float64 > > Obviously the most common situation where it's useful to have the rule > to ignore scalar width is for avoiding "width contamination" from > Python float and int literals. But you can easily end up with numpy > scalars from indexing, high-precision operations like np.mean, etc., > where you don't "really mean" you want high-precision. And at least > it's easy to understand the rule: same-kind scalars don't affect > precision. > > ...Though arguably the bug here is that np.mean actually returns a > value with higher precision. Interestingly, we seem to have some > special cases so that if you want to normalize each row of a matrix, > then again the dtype is preserved, but for a totally different > reasons. In > > a = np.arange(4, dtype=np.float32).reshape((2, 2)) > a / np.mean(a, axis=0, keepdims=True) > > the result has float32 type, even though this is an array/array > operation, not an array/scalar operation. The reason is: > > In [32]: np.mean(a).dtype > Out[32]: dtype('float64') > > But: > > In [33]: np.mean(a, axis=0).dtype > Out[33]: dtype('float32') > > In this respect np.var and np.std behave like np.mean, but np.sum > always preserves the input dtype. (Which is curious because np.sum is > just like np.mean in terms of potential loss of precision, right? The > problem in np.mean is the accumulating error over many addition > operations, not the divide-by-n at the end.) > > It is very disturbing that even after this discussion none of us here > seem to actually have a precise understanding of how the numpy type > selection system actually works :-(. We really need a formal > description... I think this is a usability wart -- if you don't understand, then newcomers certainly don't. Very naive question: If one is re-doing this anyway, how important are the primitive (non-record) NumPy scalars at all? How much would break if one simply always uses Python's int and double, declare that scalars never interacts with the dtype? a) any computation returning scalars can return float()/int() b) float() are silently truncated to float32 c) integral values that don't fit either wrap around/truncates/raises error d) the only things that determines dtype is the dtypes of arrays, never scalars Too naive? I guess the opposite idea is what Travis mentioned in his passing-the-torch post, about making scalars and 0-d-arrays the same. Dag Sverre From shish at keba.be Fri Jan 4 09:03:02 2013 From: shish at keba.be (Olivier Delalleau) Date: Fri, 4 Jan 2013 09:03:02 -0500 Subject: [Numpy-discussion] Do we want scalar casting to behave as it does at the moment? In-Reply-To: References: <50E61E29.1020709@astro.uio.no> Message-ID: 2013/1/3 Andrew Collette : >> Another solution is to forget about trying to be smart and always >> upcast the operation. That would be my 2nd preferred solution, but it >> would make it very annoying to deal with Python scalars (typically >> int64 / float64) that would be upcasting lots of things, potentially >> breaking a significant amount of existing code. 
>> >> So, personally, I don't see a straightforward solution without >> warning/error, that would be safe enough for programmers. > > I guess what's really confusing me here is that I had assumed that this: > > result = myarray + scalar > > was equivalent to this: > > result = myarray + numpy.array(scalar) > > where the dtype of the converted scalar was chosen to be "just big > enough" for it to fit. Then you proceed using the normal rules for > array addition. Yes, you can have upcasting or rollover depending on > the values involved, but you have that anyway with array addition; > it's just how arrays work in NumPy. A key difference is that with arrays, the dtype is not chosen "just big enough" for your data to fit. Either you set the dtype yourself, or you're using the default inferred dtype (int/float). In both cases you should know what to expect, and it doesn't depend on the actual numeric values (except for the auto int/float distinction). > > Also, have I got this (proposed behavior) right? > > array([127], dtype=int8) + 128 -> ValueError > array([127], dtype=int8) + 127 -> -2 > > It seems like all this does is raise an error when the current rules > would require upcasting, but still allows rollover for smaller values. > What error condition, specifically, is the ValueError designed to > tell me about? You can still get "unexpected" data (if you're not > expecting rollover) with no exception. The ValueError is here to warn you that the operation may not be doing what you want. The rollover for smaller values would be the documented (and thus hopefully expected) behavior. Taking the addition as an example may be misleading, as it makes it look like we could just "always rollover" to obtain consistent behavior, and programmers are to some extent used to integer rollover on this kind of operation. However, I gave examples with "maximum" that I believe show it's not that easy (this behavior would just appear "wrong"). Another example is with the integer division, where casting the scalar silently would result in array([-128], dtype=int8) // 128 -> [1] which is unlikely to be something someone would like to obtain. To summarize the goals of the proposal (in my mind): 1. Low cognitive load (simple and consistent across ufuncs). 2. Low risk of doing something unexpected. 3. Efficient by default. 4. Most existing (non buggy) code should not be affected. If we always do the silent cast, it will significantly break existing code relying on the 1.6 behavior, and increases the risk of doing something unexpected (bad on #2 & #4) If we always upcast, we may break existing code and lose efficiency (bad on #3 and #4). If we keep current behavior, we stay with something that's difficult to understand and has high risk of doing weird things (bad on #1 and #2). -=- Olivier From shish at keba.be Fri Jan 4 09:34:29 2013 From: shish at keba.be (Olivier Delalleau) Date: Fri, 4 Jan 2013 09:34:29 -0500 Subject: [Numpy-discussion] Scalar casting rules use-case reprise In-Reply-To: References: Message-ID: 2013/1/4 Nathaniel Smith : > On Fri, Jan 4, 2013 at 11:09 AM, Matthew Brett wrote: >> Hi, >> >> Reading the discussion on the scalar casting rule change I realized I >> was hazy on the use-cases that led to the rule that scalars cast >> differently from arrays. >> >> My impression was that the primary use-case was for lower-precision >> floats. 
That is, when you have a large float32 arr, you do not want to >> double your memory use with: >> >>>>> large_float32 + 1.0 # please no float64 here >> >> Probably also: >> >>>>> large_int8 + 1 # please no int32 / int64 here. >> >> That makes sense. On the other hand these are more ambiguous: >> >>>>> large_float32 + np.float64(1) # really - you don't want float64? >> >>>>> large_int8 + np.int32(1) # ditto >> >> I wonder whether the main use-case was to deal with the automatic >> types of Python floats and scalars? That is, I wonder whether it >> would be worth considering (in the distant long term), doing fancy >> guess-what-you-mean stuff with Python scalars, on the basis that they >> are of unspecified dtype, and make 0 dimensional scalars follow the >> array casting rules. As in: >> >>>>> large_float32 + 1.0 >> # no upcast - we don't know what float type you meant for the scalar >>>>> large_float32 + np.float64(1) >> # upcast - you clearly meant the scalar to be float64 > > Hmm, but consider this, which is exactly the operation in your example: > > In [9]: a = np.arange(3, dtype=np.float32) > > In [10]: a / np.mean(a) # normalize > Out[10]: array([ 0., 1., 2.], dtype=float32) > > In [11]: type(np.mean(a)) > Out[11]: numpy.float64 > > Obviously the most common situation where it's useful to have the rule > to ignore scalar width is for avoiding "width contamination" from > Python float and int literals. But you can easily end up with numpy > scalars from indexing, high-precision operations like np.mean, etc., > where you don't "really mean" you want high-precision. And at least > it's easy to understand the rule: same-kind scalars don't affect > precision. > > ...Though arguably the bug here is that np.mean actually returns a > value with higher precision. Interestingly, we seem to have some > special cases so that if you want to normalize each row of a matrix, > then again the dtype is preserved, but for a totally different > reasons. In > > a = np.arange(4, dtype=np.float32).reshape((2, 2)) > a / np.mean(a, axis=0, keepdims=True) > > the result has float32 type, even though this is an array/array > operation, not an array/scalar operation. The reason is: > > In [32]: np.mean(a).dtype > Out[32]: dtype('float64') > > But: > > In [33]: np.mean(a, axis=0).dtype > Out[33]: dtype('float32') > > In this respect np.var and np.std behave like np.mean, but np.sum > always preserves the input dtype. (Which is curious because np.sum is > just like np.mean in terms of potential loss of precision, right? The > problem in np.mean is the accumulating error over many addition > operations, not the divide-by-n at the end.) IMO having a different dtype depending on whether or not you provide the "axis" argument to mean() should be considered as a bug. As to what the correct dtype should be... it's not such an easy question. Personally I would go with float64 by default to be consistent across all int / float dtypes. Then someone who wants to downcast it can use the "out" argument to mean(). To come back to Matthew's use-case question, I agree the most common use case is to prevent a float32 or small int array from being upcasted, and most of the time this would come from Python scalars. However I don't think it's a good idea to have a behavior that is different between Python and Numpy scalars, because it's a subtle difference that users could have trouble understanding & foreseeing. 
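Concretely, under such a rule something like

a = np.zeros(3, dtype=np.float32)
a + 1.0              # would stay float32 (Python float)
a + np.float64(1.0)  # would be upcast to float64 (NumPy scalar)

would give two different result dtypes, even though numpy.asarray(1.0)
is itself a float64 array.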
The expected behavior of numpy functions when providing them with non-numpy objects is they should behave the same as if we had called numpy.asarray() on these objects, and straying away from this behavior seems dangerous to me. As far as I'm concerned, in a world where numpy would be brand new with no existing codebase using it, I would probably prefer to use the same casting rules for array/array and array/scalar operations. It may cause some unwanted array upcasting, but it's a lot simpler to understand. However, given that there may be a lot of code relying on the current dtype-preserving behavior, doing it now doesn't sound like a good idea to me. -=- Olivier From williamj at tenbase2.com Fri Jan 4 10:26:09 2013 From: williamj at tenbase2.com (William Johnston) Date: Fri, 4 Jan 2013 10:26:09 -0500 Subject: [Numpy-discussion] still need DLR support Message-ID: <8BA33575ACC64F408670B645AD89C584@leviathan> Hello, I posted some time ago that I need Numpy for .NET for a C# DLR app. Has anyone made any progress on this? May I suggest this as a project? Thank you. Sincerely, William Johnston -------------- next part -------------- An HTML attachment was scrubbed... URL: From andrew.collette at gmail.com Fri Jan 4 11:01:23 2013 From: andrew.collette at gmail.com (Andrew Collette) Date: Fri, 4 Jan 2013 09:01:23 -0700 Subject: [Numpy-discussion] Do we want scalar casting to behave as it does at the moment? In-Reply-To: References: <50E61E29.1020709@astro.uio.no> Message-ID: Hi Olivier, > A key difference is that with arrays, the dtype is not chosen "just > big enough" for your data to fit. Either you set the dtype yourself, > or you're using the default inferred dtype (int/float). In both cases > you should know what to expect, and it doesn't depend on the actual > numeric values (except for the auto int/float distinction). Yes, certainly; for example, you would get an int32/int64 if you simply do "array(4)". What I mean is, when you do "a+b" and b is a scalar, I had assumed that the normal array rules for addition apply, if you treat the dtype of b as being the smallest precision possible which can hold that value. E.g. 1 (int8) + 42 would treat 42 as an int8, and 1 (int8) + 200 would treat 200 as an int16. If I'm not mistaken, this is what happens currently. As far as knowing what to expect, well, as a library author I don't control what my users supply. I have to write conditional code to deal with things like this, and that's my interest in this issue. One way or another I have to handle it, correctly, and I'm trying to get a handle on what that means. > The ValueError is here to warn you that the operation may not be doing > what you want. The rollover for smaller values would be the documented > (and thus hopefully expected) behavior. Right, but what confuses me is that the only thing this prevents is the current upcast behavior. Why is that so evil it should be replaced with an exception? > Taking the addition as an example may be misleading, as it makes it > look like we could just "always rollover" to obtain consistent > behavior, and programmers are to some extent used to integer rollover > on this kind of operation. However, I gave examples with "maximum" > that I believe show it's not that easy (this behavior would just > appear "wrong"). Another example is with the integer division, where > casting the scalar silently would result in > array([-128], dtype=int8) // 128 -> [1] > which is unlikely to be something someone would like to obtain. 
But with the rule I outlined, this would be treated as: array([-128], dtype=int8) // array([128], dtype=int16) -> -1 (int16) > To summarize the goals of the proposal (in my mind): > 1. Low cognitive load (simple and consistent across ufuncs). > 2. Low risk of doing something unexpected. > 3. Efficient by default. > 4. Most existing (non buggy) code should not be affected. > > If we always do the silent cast, it will significantly break existing > code relying on the 1.6 behavior, and increases the risk of doing > something unexpected (bad on #2 & #4) > If we always upcast, we may break existing code and lose efficiency > (bad on #3 and #4). > If we keep current behavior, we stay with something that's difficult > to understand and has high risk of doing weird things (bad on #1 and > #2). I suppose what really concerns me here is, with respect to #2, addition raising ValueError is really unexpected (at least to me). I don't have control over the values my users pass to me, which means that I am going to have to carefully check for the presence of scalars and use either numpy.add or explicitly cast to a single-element array before performing addition (or, as you point out, any similar operation). >From a more basic perspective, I think that adding a number to an array should never raise an exception. I've not used any other language in which this behavior takes place. In C, you have rollover behavior, in IDL you roll over or clip, and in NumPy you either roll or upcast, depending on the version. IDL, etc. manage to handle things like max() or total() in a sensible (or at least defensible) fashion, and without raising an error. Andrew From shish at keba.be Fri Jan 4 11:34:34 2013 From: shish at keba.be (Olivier Delalleau) Date: Fri, 4 Jan 2013 11:34:34 -0500 Subject: [Numpy-discussion] Do we want scalar casting to behave as it does at the moment? In-Reply-To: References: <50E61E29.1020709@astro.uio.no> Message-ID: (sorry, no time for full reply, so for now just answering what I believe is the main point) 2013/1/4 Andrew Collette : >> The ValueError is here to warn you that the operation may not be doing >> what you want. The rollover for smaller values would be the documented >> (and thus hopefully expected) behavior. > > Right, but what confuses me is that the only thing this prevents is > the current upcast behavior. Why is that so evil it should be > replaced with an exception? The evilness lies in the silent switch between the rollover and upcast behavior, as in the example I gave previously: In [50]: np.array([2], dtype='int8') + 127 Out[50]: array([-127], dtype=int8) In [51]: np.array([2], dtype='int8') + 128 Out[51]: array([130], dtype=int16) If the scalar is the user-supplied value, it's likely you actually want a fixed behavior (either rollover or upcast) regardless of the numeric value being provided. Looking at what other numeric libraries are doing is definitely a good suggestion. -=- Olivier From matthew.brett at gmail.com Fri Jan 4 11:54:09 2013 From: matthew.brett at gmail.com (Matthew Brett) Date: Fri, 4 Jan 2013 16:54:09 +0000 Subject: [Numpy-discussion] Do we want scalar casting to behave as it does at the moment? In-Reply-To: References: <50E61E29.1020709@astro.uio.no> Message-ID: Hi, On Fri, Jan 4, 2013 at 4:01 PM, Andrew Collette wrote: > >From a more basic perspective, I think that adding a number to an > array should never raise an exception. I've not used any other > language in which this behavior takes place. 
In C, you have rollover > behavior, in IDL you roll over or clip, and in NumPy you either roll > or upcast, depending on the version. IDL, etc. manage to handle > things like max() or total() in a sensible (or at least defensible) > fashion, and without raising an error. That's a reasonable point. Looks like we lost consensus. What about returning to the 1.5 behavior instead? Best, Matthew From njs at pobox.com Fri Jan 4 11:54:27 2013 From: njs at pobox.com (Nathaniel Smith) Date: Fri, 4 Jan 2013 16:54:27 +0000 Subject: [Numpy-discussion] Do we want scalar casting to behave as it does at the moment? In-Reply-To: References: <50E61E29.1020709@astro.uio.no> Message-ID: On Fri, Jan 4, 2013 at 4:01 PM, Andrew Collette wrote: > Hi Olivier, > >> A key difference is that with arrays, the dtype is not chosen "just >> big enough" for your data to fit. Either you set the dtype yourself, >> or you're using the default inferred dtype (int/float). In both cases >> you should know what to expect, and it doesn't depend on the actual >> numeric values (except for the auto int/float distinction). > > Yes, certainly; for example, you would get an int32/int64 if you > simply do "array(4)". What I mean is, when you do "a+b" and b is a > scalar, I had assumed that the normal array rules for addition apply, > if you treat the dtype of b as being the smallest precision possible > which can hold that value. E.g. 1 (int8) + 42 would treat 42 as an > int8, and 1 (int8) + 200 would treat 200 as an int16. If I'm not > mistaken, this is what happens currently. Well, that's the thing... there is actually *no* version of numpy where the "normal rules" apply to scalars. If a = np.array([1, 2, 3], dtype=np.uint8) then in numpy 1.5 and earlier we had # Python scalars (a / 1).dtype == np.uint8 (a / 300).dtype == np.uint8 # Numpy scalars (a / np.int_(1)) == np.uint8 (a / np.int_(300)) == np.uint8 # Arrays (a / [1]).dtype == np.int_ (a / [300]).dtype == np.int_ In 1.6 we have: # Python scalars (a / 1).dtype == np.uint8 (a / 300).dtype == np.uint16 # Numpy scalars (a / np.int_(1)) == np.uint8 (a / np.int_(300)) == np.uint16 # Arrays (a / [1]).dtype == np.int_ (a / [1]).dtype == np.int_ In fact in 1.6 there is no assignment of a dtype to '1' which makes the way 1.6 handles it consistent with the array rules: # Ah-hah, it looks like '1' has a uint8 dtype: (np.ones(2, dtype=np.uint8) / np.ones(2, dtype=np.uint8)).dtype == np.uint8 (np.ones(2, dtype=np.uint8) / 1).dtype == np.uint8 # But wait! No it doesn't! (np.ones(2, dtype=np.int8) / np.ones(2, dtype=np.uint8)).dtype == np.int16 (np.ones(2, dtype=np.int8) / 1).dtype == np.int8 # Apparently in this case it has an int8 dtype instead. (np.ones(2, dtype=np.int8) / np.ones(2, dtype=np.int8)).dtype == np.int8 In 1.5, the special rule for (same-kind) scalars is that we always cast them to the array's type. In 1.6, the special rule for (same-kind) scalars is that we cast them to some type which is a function of the array's type, and the scalar's value, but not the scalar's type. This is especially confusing because normally in numpy the *only* way to get a dtype that is not in the set [np.bool, np.int_, np.float64, np.complex128, np.object_] (the dtypes produced by np.array(pyobj)) is to explicitly request it by name. So if you're memory-constrained, a useful mental model is to think that there are two types of arrays: your compact ones that use the specific limited-precision type you've picked (uint8, float32, whichever), and "regular" arrays, which use machine precision. 
And all you have to keep track of is the interaction between these. But in 1.6, as soon as you have a uint8 array, suddenly all the other precisions might spring magically into being at any moment. So options: If we require that new dtypes shouldn't be suddenly introduced then we have to pick from: 1) a / 300 silently rolls over the 300 before attempting the operation (1.5-style) 2) a / 300 upcasts to machine precision (use the same rules for arrays and scalars) 3) a / 300 gives an error (the proposal you don't like) If we instead treat a Python scalar like 1 as having the smallest precision dtype that can hold its value, then we have to accept either uint8 + 1 -> uint16 or int8 + 1 -> int16 Or there's the current code, whose behaviour no-one actually understands. (And I mean that both figuratively -- it's clearly confusing enough that people won't be able to remember it well in practice -- and literally -- even we developers don't know what it will do without running it to see.) -n From andrew.collette at gmail.com Fri Jan 4 11:57:05 2013 From: andrew.collette at gmail.com (Andrew Collette) Date: Fri, 4 Jan 2013 09:57:05 -0700 Subject: [Numpy-discussion] Do we want scalar casting to behave as it does at the moment? In-Reply-To: References: <50E61E29.1020709@astro.uio.no> Message-ID: Hi, > (sorry, no time for full reply, so for now just answering what I > believe is the main point) Thanks for taking the time to discuss/explain this at all... I appreciate it. > The evilness lies in the silent switch between the rollover and upcast > behavior, as in the example I gave previously: > > In [50]: np.array([2], dtype='int8') + 127 > Out[50]: array([-127], dtype=int8) > In [51]: np.array([2], dtype='int8') + 128 > Out[51]: array([130], dtype=int16) Right, but for better or for worse this is how *array* addition works. If I have an int16 array in my program, and I add a user-supplied array to it, I get rollover if they supply an int16 array and upcasting if they provide an int32. The answer may simply be that we consider scalar addition a special case; I think that's really what tripping me up here. Granted, one is a type-dependent change while the other is a value-dependent change; but in my head they were connected by the rules for choosing a "effective" dtype for a scalar based on its value. > If the scalar is the user-supplied value, it's likely you actually > want a fixed behavior (either rollover or upcast) regardless of the > numeric value being provided. This is a good point; thanks. > Looking at what other numeric libraries are doing is definitely a good > suggestion. I just double-checked IDL, and for addition it seems to convert to the larger type: a = bytarr(10) help, a+fix(0) INT = Array[10] help, a+long(0) LONG = Array[10] Of course, IDL and Python scalars likely work differently. Andrew From andrew.collette at gmail.com Fri Jan 4 12:25:07 2013 From: andrew.collette at gmail.com (Andrew Collette) Date: Fri, 4 Jan 2013 10:25:07 -0700 Subject: [Numpy-discussion] Do we want scalar casting to behave as it does at the moment? In-Reply-To: References: <50E61E29.1020709@astro.uio.no> Message-ID: Hi, > In fact in 1.6 there is no assignment of a dtype to '1' which makes > the way 1.6 handles it consistent with the array rules: I guess I'm a little out of my depth here... what are the array rules? > # Ah-hah, it looks like '1' has a uint8 dtype: > (np.ones(2, dtype=np.uint8) / np.ones(2, dtype=np.uint8)).dtype == np.uint8 > (np.ones(2, dtype=np.uint8) / 1).dtype == np.uint8 > # But wait! 
No it doesn't! > (np.ones(2, dtype=np.int8) / np.ones(2, dtype=np.uint8)).dtype == np.int16 > (np.ones(2, dtype=np.int8) / 1).dtype == np.int8 > # Apparently in this case it has an int8 dtype instead. > (np.ones(2, dtype=np.int8) / np.ones(2, dtype=np.int8)).dtype == np.int8 Yes, this is a good point... I hadn't thought about whether it should be unsigned or signed. In the case of something like "1", where it's ambiguous, couldn't we prefer the sign of the other participant in the addition? > interaction between these. But in 1.6, as soon as you have a uint8 > array, suddenly all the other precisions might spring magically into > being at any moment. I can see how this would be really annoying for someone close to the max memory on their machine. > So options: > If we require that new dtypes shouldn't be suddenly introduced then we > have to pick from: > 1) a / 300 silently rolls over the 300 before attempting the > operation (1.5-style) Were people really not happy with this behavior? My reading of this thread: http://thread.gmane.org/gmane.comp.python.numeric.general/47986 was that the change was, although not an accident, certainly unexpected for most people. I don't have a strong preference either way, but I'm interested in why we're so eager to keep the "corrected" behavior. > 2) a / 300 upcasts to machine precision (use the same rules for > arrays and scalars) > 3) a / 300 gives an error (the proposal you don't like) > > If we instead treat a Python scalar like 1 as having the smallest > precision dtype that can hold its value, then we have to accept either > uint8 + 1 -> uint16 > or > int8 + 1 -> int16 Is there any consistent way we could prefer the "signedness" of the other participant? That would lead to both uint8 +1 -> uint8 and int8 + 1 -> int8. > Or there's the current code, whose behaviour no-one actually > understands. (And I mean that both figuratively -- it's clearly > confusing enough that people won't be able to remember it well in > practice -- and literally -- even we developers don't know what it > will do without running it to see.) I agree the current behavior is confusing. Regardless of the details of what to do, I suppose my main objection is that, to me, it's really unexpected that adding a number to an array could result in an exception. Andrew From raul at virtualmaterials.com Fri Jan 4 14:10:02 2013 From: raul at virtualmaterials.com (Raul Cota) Date: Fri, 04 Jan 2013 12:10:02 -0700 Subject: [Numpy-discussion] Numpy speed ups to simple tasks - final findings and suggestions In-Reply-To: References: <50D4B69A.7000409@virtualmaterials.com> Message-ID: <50E7290A.2000705@virtualmaterials.com> In my previous email I sent an image but I just thought that maybe the mailing list does not accept attachments or need approval. I put a couple of images related to my profiling results (referenced to my previous email) here. Sorted by time per function with a graph of calls at the bottom http://raul-playground.appspot.com/static/images/numpy-profile-time.png Sorted by Time with Children http://raul-playground.appspot.com/static/images/numpy-profile-timewchildren.png The test is a loop of val = float64 * float64 * float64 * float64 Raul On 02/01/2013 7:56 AM, Nathaniel Smith wrote: > On Fri, Dec 21, 2012 at 7:20 PM, Raul Cota wrote: >> Hello, >> >> >> On Dec/2/2012 I sent an email about some meaningful speed problems I was >> facing when porting our core program from Numeric (Python 2.2) to Numpy >> (Python 2.6). 
Some of our tests went from 30 seconds to 90 seconds for >> example. > > Hi Raul, > > This is great work! Sorry you haven't gotten any feedback yet -- I > guess it's a busy time of year for most people; and, the way you've > described your changes makes it hard for us to use our usual workflow > to discuss them. > >> These are the actual changes to the C code, >> For bottleneck (a) >> >> In general, >> - avoid calls to PyObject_GetAttrString when I know the type is >> List, None, Tuple, Float, Int, String or Unicode >> >> - avoid calls to PyObject_GetBuffer when I know the type is >> List, None or Tuple > > This definitely seems like a worthwhile change. There are possible > quibbles about coding style -- the macros could have better names, and > would probably be better as (inline) functions instead of macros -- > but that can be dealt with. > > Can you make a pull request on github with these changes? I guess you > haven't used git before, but I think you'll find it makes things > *much* easier (in particular, you'll never have to type out long > awkward english descriptions of the changes you made ever again!) We > have docs here: > http://docs.scipy.org/doc/numpy/dev/gitwash/git_development.html > and your goal is to get to the point where you can file a "pull request": > http://docs.scipy.org/doc/numpy/dev/gitwash/development_workflow.html#asking-for-your-changes-to-be-merged-with-the-main-repo > Feel free to ask on the list if you get stuck of course. > >> For bottleneck (b) >> >> b.1) >> I noticed that PyFloat * Float64 resulted in an unnecessary "on the fly" >> conversion of the PyFloat into a Float64 to extract its underlying C >> double value. This happened in the function >> _double_convert_to_ctype which comes from the pattern, >> _ at name@_convert_to_ctype > > This also sounds like an excellent change, and perhaps should be > extended to ints and bools as well... again, can you file a pull > request? > >> b.2) This is the change that may not be very popular among Numpy users. >> I modified Float64 operations to return a Float instead of Float64. I >> could not think or see any ill effects and I got a fairly decent speed >> boost. > > Yes, unfortunately, there's no way we'll be able to make this change > upstream -- there's too much chance of it breaking people's code. (And > numpy float64's do act different than python floats in at least some > cases, e.g., numpy gives more powerful control over floating point > error handling, see np.seterr.) > > But, it's almost certainly possible to optimize numpy's float64 (and > friends), so that they are themselves (almost) as fast as the native > python objects. And that would help all the code that uses them, not > just the ones where regular python floats could be substituted > instead. Have you tried profiling, say, float64 * float64 to figure > out where the bottlenecks are? > > -n > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From njs at pobox.com Fri Jan 4 14:59:38 2013 From: njs at pobox.com (Nathaniel Smith) Date: Fri, 4 Jan 2013 19:59:38 +0000 Subject: [Numpy-discussion] Scalar casting rules use-case reprise In-Reply-To: References: Message-ID: On Fri, Jan 4, 2013 at 11:09 AM, Matthew Brett wrote: > In any case, can anyone remember the original use-cases well enough to > record them for future decision making? Heh. Everything new is old again. 
Here's a discussion from 2002 which quotes the rationale: http://mail.scipy.org/pipermail/numpy-discussion/2002-September/014002.html Note that in context: - numpy means the old Numeric library - AFAICT neither numeric nor numarray had special "scalar" types at this point, and they didn't have 0d arrays either, so in fact indexing an array would just return the closest python type (int or float). In fact this is a thread about the problems this causes. (So the question Dag raised downthread was prescient! Or, well, postscient, I guess.) So it looks like the main reason was actually that back then, you *couldn't* preserve non-native widths in operations involving scalars, because there was no such thing as a non-native width scalar. As soon as you called 'sum' or indexed an array, you reverted to native width. -n From mw at eml.cc Fri Jan 4 15:42:52 2013 From: mw at eml.cc (mw at eml.cc) Date: Fri, 04 Jan 2013 21:42:52 +0100 Subject: [Numpy-discussion] Embedded NumPy LAPACK errors Message-ID: <50E73ECC.8050803@eml.cc> Hiall, I am trying to embed numerical code in a mexFunction, as called by MATLAB, written as a Cython function. NumPy core functions and BLAS work fine, but calls to LAPACK function such as SVD seem to be made against to MATLAB's linked MKL, and this generates MKL errors. When I try this with Octave, it works fine, presumably because it is compiled against the same LAPACK as the NumPy I am embedding. Assuming I haven't made big mistakes up to here, I have the following questions: Is there a way to request numpy.linalg to use a particular LAPACK library, e.g. /usr/lib/liblapack.so ? If not, is there a reasonable way to build numpy.linalg such that it interfaces with MKL correctly ? thanks in advance for any help, Marmaduke [1] The Cython code in question : https://gist.github.com/4433635 Please see the mexFunction at the bottom. From njs at pobox.com Fri Jan 4 16:33:25 2013 From: njs at pobox.com (Nathaniel Smith) Date: Fri, 4 Jan 2013 21:33:25 +0000 Subject: [Numpy-discussion] Numpy speed ups to simple tasks - final findings and suggestions In-Reply-To: <50E67BBD.7090804@virtualmaterials.com> References: <50D4B69A.7000409@virtualmaterials.com> <50E67BBD.7090804@virtualmaterials.com> Message-ID: On Fri, Jan 4, 2013 at 6:50 AM, Raul Cota wrote: > > On 02/01/2013 7:56 AM, Nathaniel Smith wrote: >> But, it's almost certainly possible to optimize numpy's float64 (and >> friends), so that they are themselves (almost) as fast as the native >> python objects. And that would help all the code that uses them, not >> just the ones where regular python floats could be substituted >> instead. Have you tried profiling, say, float64 * float64 to figure >> out where the bottlenecks are? > > Seems to be split between > - (primarily) the memory allocation/deallocation of the float64 that is > created from the operation float64 * float64. This is the reason why float64 > * Pyfloat got improved with one of my changes because PyFloat was being > internally converted into a float64 before doing the multiplication. > > - the rest of the time is the actual multiplication path way. Running a quick profile on Linux x86-64 of x = np.float64(5.5) for i in xrange(n): x * x I find that ~50% of the total CPU time is inside feclearexcept(), the function which resets the floating point error checking registers -- and most of this is inside a single instruction, stmxcsr ("store sse control register"). It's possible that this is different on windows (esp. 
since apparently our fpe exception handling apparently doesn't work on windows[1]), but the total time you measure for both PyFloat*PyFloat and Float64*Float64 match mine almost exactly, so most likely we have similar CPUs that are doing a similar amount of work in both cases. The way we implement floating point error checking is basically: PyUFunc_clearfperr() if (PyUFunc_getfperror() & BAD_STUFF) { } Some points that you may find interesting though: - The way we define these functions, both PyUFunc_clearfperr() and PyUFunc_getfperror() clear the flags. However, for PyUFunc_getfperror, this is just pointless. We could simply remove this, and expect to see a ~25% speedup in Float64*Float64 without any downside. - Numpy's default behaviour is to always check for an warn on floating point errors. This seems like it's probably the correct default. However, if you aren't worried about this for your use code, you could disable these warnings with np.seterr(all="ignore"). (And you'll get similar error-checking to what PyFloat does.) At the moment, that won't speed anything up. But we could easily then fix it so that the PyUFunc_clearfperr/PyUFunc_getfperror code checks for whether errors are ignored, and disables itself. This together with the previous change should get you a ~50% speedup in Float64*Float64, without having to change any of numpy's semantics. - Bizarrely, Numpy still checks the floating point flags on integer operations, at least for integer scalars. So 50% of the time in Int64*Int64 is also spent in fiddling with floating point exception flags. That's also some low-hanging fruit right there... (to be fair, this isn't *quite* as trivial to fix as it could be, because the integer overflow checking code sets the floating point unit's "overflow" flag to signal a problem, and we'd need to pull this out to a thread-local variable or something before disabling the floating point checks entirely in integer code. But still, not a huge problem.) -n [1] https://github.com/numpy/numpy/issues/2350 From sebastian at sipsolutions.net Fri Jan 4 18:17:47 2013 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Sat, 05 Jan 2013 00:17:47 +0100 Subject: [Numpy-discussion] Howto bisect old commits correctly Message-ID: <1357341467.12993.6.camel@sebastian-laptop> Hey, this is probably just because I do not have any experience with bisect and the like, but when I try running a bisect keep running into: ImportError: /home/sebastian/.../lib/python2.7/site-packages/numpy/core/multiarray.so: undefined symbol: PyDataMem_NEW or: RuntimeError: module compiled against API version 8 but this version of numpy is 7 I am sure I am missing something simple, but I have no idea where to look. Am I just forgetting to delete some things and my version is not clean!? Regards, Sebastian From raul at virtualmaterials.com Fri Jan 4 18:36:28 2013 From: raul at virtualmaterials.com (Raul Cota) Date: Fri, 04 Jan 2013 16:36:28 -0700 Subject: [Numpy-discussion] Numpy speed ups to simple tasks - final findings and suggestions In-Reply-To: References: <50D4B69A.7000409@virtualmaterials.com> <50E67BBD.7090804@virtualmaterials.com> Message-ID: <50E7677C.9090008@virtualmaterials.com> On 04/01/2013 2:33 PM, Nathaniel Smith wrote: > On Fri, Jan 4, 2013 at 6:50 AM, Raul Cota wrote: >> On 02/01/2013 7:56 AM, Nathaniel Smith wrote: >>> But, it's almost certainly possible to optimize numpy's float64 (and >>> friends), so that they are themselves (almost) as fast as the native >>> python objects. 
And that would help all the code that uses them, not >>> just the ones where regular python floats could be substituted >>> instead. Have you tried profiling, say, float64 * float64 to figure >>> out where the bottlenecks are? >> Seems to be split between >> - (primarily) the memory allocation/deallocation of the float64 that is >> created from the operation float64 * float64. This is the reason why float64 >> * Pyfloat got improved with one of my changes because PyFloat was being >> internally converted into a float64 before doing the multiplication. >> >> - the rest of the time is the actual multiplication path way. > Running a quick profile on Linux x86-64 of > x = np.float64(5.5) > for i in xrange(n): > x * x > I find that ~50% of the total CPU time is inside feclearexcept(), the > function which resets the floating point error checking registers -- > and most of this is inside a single instruction, stmxcsr ("store sse > control register"). I find strange you don't see bottleneck in allocation of a float64. is it easy for you to profile this ? x = np.float64(5.5) y = 5.5 for i in xrange(n): x * y numpy internally translates y into a float64 temporarily and then discards it and I seem to remember is a bit over two times slower than x * x I will try to do your suggestions on PyUFunc_clearfperr/PyUFunc_getfperror and see what I get. Haven't gotten around to get going with being able to do a pull request for the previous stuff. if changes are worth while would it be ok if I also create one for this ? Thanks again, Raul > It's possible that this is different on windows > (esp. since apparently our fpe exception handling apparently doesn't > work on windows[1]), but the total time you measure for both > PyFloat*PyFloat and Float64*Float64 match mine almost exactly, so most > likely we have similar CPUs that are doing a similar amount of work in > both cases. > > The way we implement floating point error checking is basically: > PyUFunc_clearfperr() > > if (PyUFunc_getfperror() & BAD_STUFF) { > > } > > Some points that you may find interesting though: > > - The way we define these functions, both PyUFunc_clearfperr() and > PyUFunc_getfperror() clear the flags. However, for PyUFunc_getfperror, > this is just pointless. We could simply remove this, and expect to see > a ~25% speedup in Float64*Float64 without any downside. > > - Numpy's default behaviour is to always check for an warn on floating > point errors. This seems like it's probably the correct default. > However, if you aren't worried about this for your use code, you could > disable these warnings with np.seterr(all="ignore"). (And you'll get > similar error-checking to what PyFloat does.) At the moment, that > won't speed anything up. But we could easily then fix it so that the > PyUFunc_clearfperr/PyUFunc_getfperror code checks for whether errors > are ignored, and disables itself. This together with the previous > change should get you a ~50% speedup in Float64*Float64, without > having to change any of numpy's semantics. > > - Bizarrely, Numpy still checks the floating point flags on integer > operations, at least for integer scalars. So 50% of the time in > Int64*Int64 is also spent in fiddling with floating point exception > flags. That's also some low-hanging fruit right there... 
(to be fair, > this isn't *quite* as trivial to fix as it could be, because the > integer overflow checking code sets the floating point unit's > "overflow" flag to signal a problem, and we'd need to pull this out to > a thread-local variable or something before disabling the floating > point checks entirely in integer code. But still, not a huge problem.) > > -n > > [1] https://github.com/numpy/numpy/issues/2350 > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > From njs at pobox.com Fri Jan 4 19:44:28 2013 From: njs at pobox.com (Nathaniel Smith) Date: Sat, 5 Jan 2013 00:44:28 +0000 Subject: [Numpy-discussion] Numpy speed ups to simple tasks - final findings and suggestions In-Reply-To: <50E7677C.9090008@virtualmaterials.com> References: <50D4B69A.7000409@virtualmaterials.com> <50E67BBD.7090804@virtualmaterials.com> <50E7677C.9090008@virtualmaterials.com> Message-ID: On Fri, Jan 4, 2013 at 11:36 PM, Raul Cota wrote: > On 04/01/2013 2:33 PM, Nathaniel Smith wrote: >> On Fri, Jan 4, 2013 at 6:50 AM, Raul Cota wrote: >>> On 02/01/2013 7:56 AM, Nathaniel Smith wrote: >>>> But, it's almost certainly possible to optimize numpy's float64 (and >>>> friends), so that they are themselves (almost) as fast as the native >>>> python objects. And that would help all the code that uses them, not >>>> just the ones where regular python floats could be substituted >>>> instead. Have you tried profiling, say, float64 * float64 to figure >>>> out where the bottlenecks are? >>> Seems to be split between >>> - (primarily) the memory allocation/deallocation of the float64 that is >>> created from the operation float64 * float64. This is the reason why float64 >>> * Pyfloat got improved with one of my changes because PyFloat was being >>> internally converted into a float64 before doing the multiplication. >>> >>> - the rest of the time is the actual multiplication path way. >> Running a quick profile on Linux x86-64 of >> x = np.float64(5.5) >> for i in xrange(n): >> x * x >> I find that ~50% of the total CPU time is inside feclearexcept(), the >> function which resets the floating point error checking registers -- >> and most of this is inside a single instruction, stmxcsr ("store sse >> control register"). > > I find strange you don't see bottleneck in allocation of a float64. > > is it easy for you to profile this ? > > x = np.float64(5.5) > y = 5.5 > for i in xrange(n): > x * y > > numpy internally translates y into a float64 temporarily and then > discards it and I seem to remember is a bit over two times slower than x * x Yeah, seems to be dramatically slower. Using ipython's handy interface to the timeit[1] library: In [1]: x = np.float64(5.5) In [2]: y = 5.5 In [3]: timeit x * y 1000000 loops, best of 3: 725 ns per loop In [4]: timeit x * x 1000000 loops, best of 3: 283 ns per loop But we already figured out how to (mostly) fix this part, right? I was curious about the Float64*Float64 case, because that's the one that was still slow after those first two patches. (And, yes, like you say, when I run x*y in the profiler then there's a huge amount of overhead in PyArray_GetPriority and object allocation/deallocation). > I will try to do your suggestions on > > PyUFunc_clearfperr/PyUFunc_getfperror > > and see what I get. Haven't gotten around to get going with being able > to do a pull request for the previous stuff. 
if changes are worth while > would it be ok if I also create one for this ? First, to be clear, it's always OK to do a pull request -- the worst that can happen is that we all look it over carefully and decide that it's the wrong approach and don't merge. In my email before I just wanted to give you some clear suggestions on a good way to get started, we wouldn't have like kicked you out or something if you did it differently :-) And, yes, assuming my analysis so far is correct we would definitely be interested in major speedups that have no other user-visible effects... ;-) -n [1] http://docs.python.org/2/library/timeit.html From sebastian at sipsolutions.net Fri Jan 4 20:29:51 2013 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Sat, 05 Jan 2013 02:29:51 +0100 Subject: [Numpy-discussion] Howto bisect old commits correctly In-Reply-To: <1357341467.12993.6.camel@sebastian-laptop> References: <1357341467.12993.6.camel@sebastian-laptop> Message-ID: <1357349391.12993.8.camel@sebastian-laptop> On Sat, 2013-01-05 at 00:17 +0100, Sebastian Berg wrote: > Hey, > > this is probably just because I do not have any experience with bisect > and the like, but when I try running a bisect keep running into: > Nevermind that. Probably I just stumbled on some bad versions... > ImportError: /home/sebastian/.../lib/python2.7/site-packages/numpy/core/multiarray.so: undefined symbol: PyDataMem_NEW > or: > RuntimeError: module compiled against API version 8 but this version of numpy is 7 > > I am sure I am missing something simple, but I have no idea where to > look. Am I just forgetting to delete some things and my version is not > clean!? > > Regards, > > Sebastian > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From raul at virtualmaterials.com Sat Jan 5 01:09:17 2013 From: raul at virtualmaterials.com (Raul Cota) Date: Fri, 04 Jan 2013 23:09:17 -0700 Subject: [Numpy-discussion] Numpy speed ups to simple tasks - final findings and suggestions In-Reply-To: References: <50D4B69A.7000409@virtualmaterials.com> <50E67BBD.7090804@virtualmaterials.com> <50E7677C.9090008@virtualmaterials.com> Message-ID: <50E7C38D.2040306@virtualmaterials.com> On 04/01/2013 5:44 PM, Nathaniel Smith wrote: > On Fri, Jan 4, 2013 at 11:36 PM, Raul Cota wrote: >> On 04/01/2013 2:33 PM, Nathaniel Smith wrote: >>> On Fri, Jan 4, 2013 at 6:50 AM, Raul Cota wrote: >>>> On 02/01/2013 7:56 AM, Nathaniel Smith wrote: >>>>> But, it's almost certainly possible to optimize numpy's float64 (and >>>>> friends), so that they are themselves (almost) as fast as the native >>>>> python objects. And that would help all the code that uses them, not >>>>> just the ones where regular python floats could be substituted >>>>> instead. Have you tried profiling, say, float64 * float64 to figure >>>>> out where the bottlenecks are? >>>> Seems to be split between >>>> - (primarily) the memory allocation/deallocation of the float64 that is >>>> created from the operation float64 * float64. This is the reason why float64 >>>> * Pyfloat got improved with one of my changes because PyFloat was being >>>> internally converted into a float64 before doing the multiplication. >>>> >>>> - the rest of the time is the actual multiplication path way. 
>>> Running a quick profile on Linux x86-64 of >>> x = np.float64(5.5) >>> for i in xrange(n): >>> x * x >>> I find that ~50% of the total CPU time is inside feclearexcept(), the >>> function which resets the floating point error checking registers -- >>> and most of this is inside a single instruction, stmxcsr ("store sse >>> control register"). >> I find strange you don't see bottleneck in allocation of a float64. >> >> is it easy for you to profile this ? >> >> x = np.float64(5.5) >> y = 5.5 >> for i in xrange(n): >> x * y >> >> numpy internally translates y into a float64 temporarily and then >> discards it and I seem to remember is a bit over two times slower than x * x > Yeah, seems to be dramatically slower. Using ipython's handy interface > to the timeit[1] library: > > In [1]: x = np.float64(5.5) > > In [2]: y = 5.5 > > In [3]: timeit x * y > 1000000 loops, best of 3: 725 ns per loop > > In [4]: timeit x * x > 1000000 loops, best of 3: 283 ns per loop I haven't been using timeit because the bulk of what we are doing includes comparing against Python 2.2 and Numeric and timeit did not exist then. Can't wait to finally officially upgrade our main product. > But we already figured out how to (mostly) fix this part, right? Correct Cheers, Raul > I was > curious about the Float64*Float64 case, because that's the one that > was still slow after those first two patches. (And, yes, like you say, > when I run x*y in the profiler then there's a huge amount of overhead > in PyArray_GetPriority and object allocation/deallocation). > >> I will try to do your suggestions on >> >> PyUFunc_clearfperr/PyUFunc_getfperror >> >> and see what I get. Haven't gotten around to get going with being able >> to do a pull request for the previous stuff. if changes are worth while >> would it be ok if I also create one for this ? > First, to be clear, it's always OK to do a pull request -- the worst > that can happen is that we all look it over carefully and decide that > it's the wrong approach and don't merge. In my email before I just > wanted to give you some clear suggestions on a good way to get > started, we wouldn't have like kicked you out or something if you did > it differently :-) > > And, yes, assuming my analysis so far is correct we would definitely > be interested in major speedups that have no other user-visible > effects... ;-) > > -n > > [1] http://docs.python.org/2/library/timeit.html > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > From paul.anton.letnes at gmail.com Sat Jan 5 05:42:13 2013 From: paul.anton.letnes at gmail.com (Paul Anton Letnes) Date: Sat, 5 Jan 2013 11:42:13 +0100 Subject: [Numpy-discussion] Embedded NumPy LAPACK errors In-Reply-To: <50E73ECC.8050803@eml.cc> References: <50E73ECC.8050803@eml.cc> Message-ID: <3FF2E38B-6A93-4AC6-B28B-CD1C50784AD5@gmail.com> On 4. jan. 2013, at 21:42, mw at eml.cc wrote: > Hiall, > > > I am trying to embed numerical code in a mexFunction, > as called by MATLAB, written as a Cython function. > > NumPy core functions and BLAS work fine, but calls to LAPACK > function such as SVD seem to be made against to MATLAB's linked > MKL, and this generates MKL errors. When I try this with > Octave, it works fine, presumably because it is compiled against > the same LAPACK as the NumPy I am embedding. 
> > > Assuming I haven't made big mistakes up to here, I have the > following questions: > > Is there a way to request numpy.linalg to use a particular > LAPACK library, e.g. /usr/lib/liblapack.so ? > > If not, is there a reasonable way to build numpy.linalg such that > it interfaces with MKL correctly ? It's possible, but it's much easier to install one of the pre-built python distributions. Enthought, WinPython and others include precompiled python/numpy/scipy/etc with MKL. If that works for you, I'd recommend that route, as it involves less work. Good luck, Paul From matthew.brett at gmail.com Sat Jan 5 07:15:55 2013 From: matthew.brett at gmail.com (Matthew Brett) Date: Sat, 5 Jan 2013 12:15:55 +0000 Subject: [Numpy-discussion] Rank-0 arrays - reprise Message-ID: Hi, Following on from Nathaniel's explorations of the scalar - array casting rules, some resources on rank-0 arrays. The discussion that Nathaniel tracked down on "rank-0 arrays"; it also makes reference to casting. The rank-0 arrays seem to have been one way of solving the problem of maintaining array dtypes other than bool / float / int: http://mail.scipy.org/pipermail/numpy-discussion/2002-September/001612.html Quoting from an email from Travis in that thread, replying to an email from Tim Hochberg: http://mail.scipy.org/pipermail/numpy-discussion/2002-September/001647.html > Frankly, I have no idea what the implimentation details would be, but > could we get rid of rank-0 arrays altogether? I have always simply found > them strange and confusing... What are they really neccesary for > (besides holding scalar values of different precision that standard > Pyton scalars)? With new coercion rules this becomes a possibility. Arguments against it are that special rank-0 arrays behave as more consistent numbers with the rest of Numeric than Python scalars. In other words they have a length and a shape and one can right N-dimensional code that works the same even when the result is a scalar. Another advantage of having a Numeric scalar is that we can control the behavior of floating point operations better. e.g. if only Python scalars were available and sum(a) returned 0, then 1 / sum(a) would behave as Python behaves (always raises error). while with our own scalars 1 / sum(a) could potentially behave however the user wanted. There seemed then to be some impetus to remove rank-0 arrays and replace them with Python scalar types with the various numpy precisions : http://mail.scipy.org/pipermail/numpy-discussion/2002-September/013983.html Travis' recent email hints at something that seems similar, but I don't understand what he means: http://mail.scipy.org/pipermail/numpy-discussion/2012-December/064795.html Don't create array-scalars. Instead, make the data-type object a meta-type object whose instances are the items returned from NumPy arrays. There is no need for a separate array-scalar object and in fact it's confusing to the type-system. I understand that now. I did not understand that 5 years ago. Travis - can you expand? I remember rank-0 arrays being confusing in that I sometimes get a python scalar and sometimes a numpy scalar, and I may want a python scalar, and have to special-case the rank-0 array, but I don't remember precisely why I needed the python scalar. Any other comments / records of rank-0 arrays being confusing? 
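For concreteness, a minimal sketch of the three kinds of object in play here (plain numpy, nothing version-specific assumed):

import numpy as np

a = np.arange(3.0)

s = a[0]                  # "array scalar": an instance of np.float64
print type(s)             # <type 'numpy.float64'>

p = a[0].item()           # plain Python float; float(a[0]) works too
print type(p)             # <type 'float'>

z = np.array(1.5)         # rank-0 array: an ndarray with ndim 0, shape ()
print type(z), z.ndim, z.shape    # <type 'numpy.ndarray'> 0 ()

Getting back to a plain Python scalar is always explicit, via .item() (or
float() / int()).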
Best, Matthew From matthew.brett at gmail.com Sat Jan 5 07:27:02 2013 From: matthew.brett at gmail.com (Matthew Brett) Date: Sat, 5 Jan 2013 12:27:02 +0000 Subject: [Numpy-discussion] Rank-0 arrays - reprise In-Reply-To: References: Message-ID: Hi, On Sat, Jan 5, 2013 at 12:15 PM, Matthew Brett wrote: > Hi, > > Following on from Nathaniel's explorations of the scalar - array > casting rules, some resources on rank-0 arrays. > > The discussion that Nathaniel tracked down on "rank-0 arrays"; it also > makes reference to casting. The rank-0 arrays seem to have been one > way of solving the problem of maintaining array dtypes other than bool > / float / int: > > http://mail.scipy.org/pipermail/numpy-discussion/2002-September/001612.html > > Quoting from an email from Travis in that thread, replying to an email > from Tim Hochberg: > > http://mail.scipy.org/pipermail/numpy-discussion/2002-September/001647.html > > >> Frankly, I have no idea what the implimentation details would be, but >> could we get rid of rank-0 arrays altogether? I have always simply found >> them strange and confusing... What are they really neccesary for >> (besides holding scalar values of different precision that standard >> Pyton scalars)? > > With new coercion rules this becomes a possibility. Arguments against it > are that special rank-0 arrays behave as more consistent numbers with the > rest of Numeric than Python scalars. In other words they have a length > and a shape and one can right N-dimensional code that works the same even > when the result is a scalar. > > Another advantage of having a Numeric scalar is that we can control the > behavior of floating point operations better. > > e.g. > > if only Python scalars were available and sum(a) returned 0, then > > 1 / sum(a) would behave as Python behaves (always raises error). > > while with our own scalars > > 1 / sum(a) could potentially behave however the user wanted. > > > There seemed then to be some impetus to remove rank-0 arrays and > replace them with Python scalar types with the various numpy > precisions : > > http://mail.scipy.org/pipermail/numpy-discussion/2002-September/013983.html > > Travis' recent email hints at something that seems similar, but I > don't understand what he means: > > http://mail.scipy.org/pipermail/numpy-discussion/2012-December/064795.html > > > Don't create array-scalars. Instead, make the data-type object a > meta-type object whose instances are the items returned from NumPy > arrays. There is no need for a separate array-scalar object and in > fact it's confusing to the type-system. I understand that now. I > did not understand that 5 years ago. > > > Travis - can you expand? > > I remember rank-0 arrays being confusing in that I sometimes get a > python scalar and sometimes a numpy scalar, and I may want a python > scalar, and have to special-case the rank-0 array, but I don't > remember precisely why I needed the python scalar. Any other comments > / records of rank-0 arrays being confusing? Adding: Comments by Konrad Hinsen on desirable methods for rank-0 arrays, all of which seem to have got into numpy: http://mail.scipy.org/pipermail/numpy-discussion/2002-September/001635.html Best, Matthew From matthew.brett at gmail.com Sat Jan 5 07:32:09 2013 From: matthew.brett at gmail.com (Matthew Brett) Date: Sat, 5 Jan 2013 12:32:09 +0000 Subject: [Numpy-discussion] Do we want scalar casting to behave as it does at the moment? 
In-Reply-To: References: <50E61E29.1020709@astro.uio.no> Message-ID: Hi, On Fri, Jan 4, 2013 at 4:54 PM, Matthew Brett wrote: > Hi, > > On Fri, Jan 4, 2013 at 4:01 PM, Andrew Collette > wrote: >> >From a more basic perspective, I think that adding a number to an >> array should never raise an exception. I've not used any other >> language in which this behavior takes place. In C, you have rollover >> behavior, in IDL you roll over or clip, and in NumPy you either roll >> or upcast, depending on the version. IDL, etc. manage to handle >> things like max() or total() in a sensible (or at least defensible) >> fashion, and without raising an error. > > That's a reasonable point. > > Looks like we lost consensus. > > What about returning to the 1.5 behavior instead? If we do return to the 1.5 behavior, we would need to think about doing this in 1.7. If there are a large number of 1.5.x and previous users who would upgrade to 1.7, leaving the 1.6 behavior in 1.7 will mean that they will get double the confusion: 1) The behavior has changed to something they weren't expecting 2) The behavior is going to change back very soon Best, Matthew From robince at gmail.com Sat Jan 5 08:03:41 2013 From: robince at gmail.com (Robin) Date: Sat, 5 Jan 2013 13:03:41 +0000 Subject: [Numpy-discussion] Embedded NumPy LAPACK errors In-Reply-To: <3FF2E38B-6A93-4AC6-B28B-CD1C50784AD5@gmail.com> References: <50E73ECC.8050803@eml.cc> <3FF2E38B-6A93-4AC6-B28B-CD1C50784AD5@gmail.com> Message-ID: Coincidently I have been having the same problem this week. Unrelated to the problem, I would suggest looking at pymex which 'wraps' python inside Matlab very nicely, although it has the same problem with duplicate lapack symbols. https://github.com/kw/pymex I have the same problem with Enthough EPD which is built against MKL - but I think the problem is that Intel provide two different interfaces - ILP64 with 64 bit integer indices and LP64 with 32 bit integers. Matlab link against the ILP64 version, whereas Enthought use the LP64 version - so there are still incompatible. Cheers Robin On Sat, Jan 5, 2013 at 10:42 AM, Paul Anton Letnes wrote: > > On 4. jan. 2013, at 21:42, mw at eml.cc wrote: > >> Hiall, >> >> >> I am trying to embed numerical code in a mexFunction, >> as called by MATLAB, written as a Cython function. >> >> NumPy core functions and BLAS work fine, but calls to LAPACK >> function such as SVD seem to be made against to MATLAB's linked >> MKL, and this generates MKL errors. When I try this with >> Octave, it works fine, presumably because it is compiled against >> the same LAPACK as the NumPy I am embedding. >> >> >> Assuming I haven't made big mistakes up to here, I have the >> following questions: >> >> Is there a way to request numpy.linalg to use a particular >> LAPACK library, e.g. /usr/lib/liblapack.so ? >> >> If not, is there a reasonable way to build numpy.linalg such that >> it interfaces with MKL correctly ? > > It's possible, but it's much easier to install one of the pre-built python distributions. Enthought, WinPython and others include precompiled python/numpy/scipy/etc with MKL. If that works for you, I'd recommend that route, as it involves less work. 
> > Good luck, > Paul > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From pierre.raybaut at gmail.com Sat Jan 5 08:38:06 2013 From: pierre.raybaut at gmail.com (Pierre Raybaut) Date: Sat, 5 Jan 2013 14:38:06 +0100 Subject: [Numpy-discussion] ANN: Spyder v2.1.13 Message-ID: Hi all, On the behalf of Spyder's development team (http://code.google.com/p/spyderlib/people/list), I'm pleased to announce that Spyder v2.1.13 has been released and is available for Windows XP/Vista/7, GNU/Linux and MacOS X: http://code.google.com/p/spyderlib/ This is a pure maintenance release -- a lot of bugs were fixed since v2.1.11 (v2.1.12 was released exclusively inside WinPython distribution): http://code.google.com/p/spyderlib/wiki/ChangeLog Spyder is a free, open-source (MIT license) interactive development environment for the Python language with advanced editing, interactive testing, debugging and introspection features. Originally designed to provide MATLAB-like features (integrated help, interactive console, variable explorer with GUI-based editors for dictionaries, NumPy arrays, ...), it is strongly oriented towards scientific computing and software development. Thanks to the `spyderlib` library, Spyder also provides powerful ready-to-use widgets: embedded Python console (example: http://packages.python.org/guiqwt/_images/sift3.png), NumPy array editor (example: http://packages.python.org/guiqwt/_images/sift2.png), dictionary editor, source code editor, etc. Description of key features with tasty screenshots can be found at: http://code.google.com/p/spyderlib/wiki/Features On Windows platforms, Spyder is also available as a stand-alone executable (don't forget to disable UAC on Vista/7). This all-in-one portable version is still experimental (for example, it does not embed sphinx -- meaning no rich text mode for the object inspector) but it should provide a working version of Spyder for Windows platforms without having to install anything else (except Python 2.x itself, of course). Don't forget to follow Spyder updates/news: * on the project website: http://code.google.com/p/spyderlib/ * and on our official blog: http://spyder-ide.blogspot.com/ Last, but not least, we welcome any contribution that helps making Spyder an efficient scientific development/computing environment. Join us to help creating your favourite environment! (http://code.google.com/p/spyderlib/wiki/NoteForContributors) Enjoy! -Pierre From eric.emsellem at eso.org Sat Jan 5 09:15:02 2013 From: eric.emsellem at eso.org (Eric Emsellem) Date: Sat, 05 Jan 2013 15:15:02 +0100 Subject: [Numpy-discussion] Invalid value encoutered : how to prevent numpy.where to do this? Message-ID: <50E83566.2080404@eso.org> Dear all, I have a code using lots of "numpy.where" to make some constrained calculations as in: data = arange(10) result = np.where(data == 0, 0., 1./data) # or data1 = arange(10) data2 = arange(10)+1.0 result = np.where(data1 > data2, np.sqrt(data1-data2), np.sqrt(data2-data2)) which then produces warnings like: /usr/bin/ipython:1: RuntimeWarning: invalid value encountered in sqrt or for the first example: /usr/bin/ipython:1: RuntimeWarning: divide by zero encountered in divide How do I avoid these messages to appear? I know that I could in principle use numpy.seterr. However, I do NOT want to remove these warnings for other potential divide/multiply/sqrt etc errors. 
Only when I am using a "where", to in fact avoid such warnings! Note that the warnings only happen once, but since I am going to release that code, I would like to avoid the user to get such messages which are irrelevant here (because I am testing, with the where, when NOT to divide by zero or take a sqrt of a negative number). thanks! Eric From njs at pobox.com Sat Jan 5 09:27:42 2013 From: njs at pobox.com (Nathaniel Smith) Date: Sat, 5 Jan 2013 14:27:42 +0000 Subject: [Numpy-discussion] Invalid value encoutered : how to prevent numpy.where to do this? In-Reply-To: <50E83566.2080404@eso.org> References: <50E83566.2080404@eso.org> Message-ID: On Sat, Jan 5, 2013 at 2:15 PM, Eric Emsellem wrote: > Dear all, > > I have a code using lots of "numpy.where" to make some constrained > calculations as in: > > data = arange(10) > result = np.where(data == 0, 0., 1./data) > > # or > data1 = arange(10) > data2 = arange(10)+1.0 > result = np.where(data1 > data2, np.sqrt(data1-data2), np.sqrt(data2-data2)) > > which then produces warnings like: > /usr/bin/ipython:1: RuntimeWarning: invalid value encountered in sqrt > > or for the first example: > > /usr/bin/ipython:1: RuntimeWarning: divide by zero encountered in divide > > How do I avoid these messages to appear? > > I know that I could in principle use numpy.seterr. However, I do NOT > want to remove these warnings for other potential divide/multiply/sqrt > etc errors. Only when I am using a "where", to in fact avoid such > warnings! Note that the warnings only happen once, but since I am going > to release that code, I would like to avoid the user to get such > messages which are irrelevant here (because I am testing, with the > where, when NOT to divide by zero or take a sqrt of a negative number). You can't avoid it while using np.where like this, because the warning is being issued before np.where is even called. It's basically doing: # Calculate all possible sqrts tmp1 = np.sqrt(data1-data2) tmp2 = np.sqrt(data2-data2) # let's pretend this isn't just all zeros... # Use np.where to pick out the useful ones and put them together into one array mashed_up = np.where(data1 > data2, tmp1, tmp2) So you need to somehow apply the indexing while doing the sqrt. In this case the easiest way would just be np.sqrt(np.where(data1 > data2, data1 - data2, data2 - data2)) Or, slightly faster (avoiding some temporaries): np.sqrt(np.where(data1 > data2, data1, data2) - data2) If your operation doesn't factor like this though then you can always use something more cumbersome like result = np.empty_like(data) mask = (data == 0) result[mask] = 0 result[~mask] = 1.0/data[~mask] Or in 1.7 this could be written result = np.zeros_like(data) np.divide(1.0, data, where=(data != 0), out=result) -n From njs at pobox.com Sat Jan 5 09:38:07 2013 From: njs at pobox.com (Nathaniel Smith) Date: Sat, 5 Jan 2013 14:38:07 +0000 Subject: [Numpy-discussion] Do we want scalar casting to behave as it does at the moment? In-Reply-To: References: <50E61E29.1020709@astro.uio.no> Message-ID: On Sat, Jan 5, 2013 at 12:32 PM, Matthew Brett wrote: > Hi, > > On Fri, Jan 4, 2013 at 4:54 PM, Matthew Brett wrote: >> Hi, >> >> On Fri, Jan 4, 2013 at 4:01 PM, Andrew Collette >> wrote: >>> >From a more basic perspective, I think that adding a number to an >>> array should never raise an exception. I've not used any other >>> language in which this behavior takes place. 
In C, you have rollover >>> behavior, in IDL you roll over or clip, and in NumPy you either roll >>> or upcast, depending on the version. IDL, etc. manage to handle >>> things like max() or total() in a sensible (or at least defensible) >>> fashion, and without raising an error. >> >> That's a reasonable point. >> >> Looks like we lost consensus. >> >> What about returning to the 1.5 behavior instead? > > If we do return to the 1.5 behavior, we would need to think about > doing this in 1.7. > > If there are a large number of 1.5.x and previous users who would > upgrade to 1.7, leaving the 1.6 behavior in 1.7 will mean that they > will get double the confusion: > > 1) The behavior has changed to something they weren't expecting > 2) The behavior is going to change back very soon I disagree. 1.7 is basically done, the 1.6 changes are out there already, and we still have work to do just to get consensus on how we want to handle this, plus implement the changes. Basically, the way I think about it in general is, you have the first release that contains some bug, and then you have the first release that doesn't contain it. Minimizing the amount of *time* between those releases is important. Minimizing the *number of releases* in between does not -- according to that logic, we shouldn't have released 1.6.1 and 1.6.2 until we were confident that we'd fixed *all* the bugs, because otherwise they might have misled people into upgrading too soon. Holding 1.7 back for this isn't going to get this change done or to users any faster; it's just going to hold back all the other changes in 1.7. I do think we ought to aim to shorten our release cycle drastically. Like release 1.8 within 2-3 months after 1.7. But let's talk about that after 1.7 is out. -n From ralf.gommers at gmail.com Sat Jan 5 10:55:33 2013 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Sat, 5 Jan 2013 16:55:33 +0100 Subject: [Numpy-discussion] Do we want scalar casting to behave as it does at the moment? In-Reply-To: References: <50E61E29.1020709@astro.uio.no> Message-ID: On Sat, Jan 5, 2013 at 3:38 PM, Nathaniel Smith wrote: > On Sat, Jan 5, 2013 at 12:32 PM, Matthew Brett > wrote: > > Hi, > > > > On Fri, Jan 4, 2013 at 4:54 PM, Matthew Brett > wrote: > >> Hi, > >> > >> On Fri, Jan 4, 2013 at 4:01 PM, Andrew Collette > >> wrote: > >>> >From a more basic perspective, I think that adding a number to an > >>> array should never raise an exception. I've not used any other > >>> language in which this behavior takes place. In C, you have rollover > >>> behavior, in IDL you roll over or clip, and in NumPy you either roll > >>> or upcast, depending on the version. IDL, etc. manage to handle > >>> things like max() or total() in a sensible (or at least defensible) > >>> fashion, and without raising an error. > >> > >> That's a reasonable point. > >> > >> Looks like we lost consensus. > >> > >> What about returning to the 1.5 behavior instead? > > > > If we do return to the 1.5 behavior, we would need to think about > > doing this in 1.7. > > > > If there are a large number of 1.5.x and previous users who would > > upgrade to 1.7, leaving the 1.6 behavior in 1.7 will mean that they > > will get double the confusion: > > > > 1) The behavior has changed to something they weren't expecting > > 2) The behavior is going to change back very soon > > I disagree. 1.7 is basically done, the 1.6 changes are out there > already, and we still have work to do just to get consensus on how we > want to handle this, plus implement the changes. 
> I agree with Nathaniel. 1.7.0rc1 is out, so all that should go into 1.7.x from now on is bug fixes. Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From matthew.brett at gmail.com Sat Jan 5 10:59:25 2013 From: matthew.brett at gmail.com (Matthew Brett) Date: Sat, 5 Jan 2013 15:59:25 +0000 Subject: [Numpy-discussion] Do we want scalar casting to behave as it does at the moment? In-Reply-To: References: <50E61E29.1020709@astro.uio.no> Message-ID: Hi, On Sat, Jan 5, 2013 at 2:38 PM, Nathaniel Smith wrote: > On Sat, Jan 5, 2013 at 12:32 PM, Matthew Brett wrote: >> Hi, >> >> On Fri, Jan 4, 2013 at 4:54 PM, Matthew Brett wrote: >>> Hi, >>> >>> On Fri, Jan 4, 2013 at 4:01 PM, Andrew Collette >>> wrote: >>>> >From a more basic perspective, I think that adding a number to an >>>> array should never raise an exception. I've not used any other >>>> language in which this behavior takes place. In C, you have rollover >>>> behavior, in IDL you roll over or clip, and in NumPy you either roll >>>> or upcast, depending on the version. IDL, etc. manage to handle >>>> things like max() or total() in a sensible (or at least defensible) >>>> fashion, and without raising an error. >>> >>> That's a reasonable point. >>> >>> Looks like we lost consensus. >>> >>> What about returning to the 1.5 behavior instead? >> >> If we do return to the 1.5 behavior, we would need to think about >> doing this in 1.7. >> >> If there are a large number of 1.5.x and previous users who would >> upgrade to 1.7, leaving the 1.6 behavior in 1.7 will mean that they >> will get double the confusion: >> >> 1) The behavior has changed to something they weren't expecting >> 2) The behavior is going to change back very soon > > I disagree. 1.7 is basically done, the 1.6 changes are out there > already, and we still have work to do just to get consensus on how we > want to handle this, plus implement the changes. > > Basically, the way I think about it in general is, you have the first > release that contains some bug, and then you have the first release > that doesn't contain it. Minimizing the amount of *time* between those > releases is important. Minimizing the *number of releases* in between > does not -- according to that logic, we shouldn't have released 1.6.1 > and 1.6.2 until we were confident that we'd fixed *all* the bugs, > because otherwise they might have misled people into upgrading too > soon. Holding 1.7 back for this isn't going to get this change done or > to users any faster; it's just going to hold back all the other > changes in 1.7. > > I do think we ought to aim to shorten our release cycle drastically. > Like release 1.8 within 2-3 months after 1.7. But let's talk about > that after 1.7 is out. Yes, I was imagining that resolving this question would be rather quick, and therefore any delay to 1.7 would be very small, but if it takes more than a few days to come to a solution, it's possible there would not be net benefit. To Ralf - I think a 'bugfix only' metric doesn't help all that much in this case, because if we revert to 1.5 behavior, this could very reasonably be described as a bugfix. Cheers, Matthew From njs at pobox.com Sat Jan 5 11:16:25 2013 From: njs at pobox.com (Nathaniel Smith) Date: Sat, 5 Jan 2013 16:16:25 +0000 Subject: [Numpy-discussion] Do we want scalar casting to behave as it does at the moment? 
In-Reply-To: References: <50E61E29.1020709@astro.uio.no> Message-ID: On 5 Jan 2013 15:59, "Matthew Brett" wrote: > > Hi, > > On Sat, Jan 5, 2013 at 2:38 PM, Nathaniel Smith wrote: > > On Sat, Jan 5, 2013 at 12:32 PM, Matthew Brett wrote: > >> Hi, > >> > >> On Fri, Jan 4, 2013 at 4:54 PM, Matthew Brett wrote: > >>> Hi, > >>> > >>> On Fri, Jan 4, 2013 at 4:01 PM, Andrew Collette > >>> wrote: > >>>> >From a more basic perspective, I think that adding a number to an > >>>> array should never raise an exception. I've not used any other > >>>> language in which this behavior takes place. In C, you have rollover > >>>> behavior, in IDL you roll over or clip, and in NumPy you either roll > >>>> or upcast, depending on the version. IDL, etc. manage to handle > >>>> things like max() or total() in a sensible (or at least defensible) > >>>> fashion, and without raising an error. > >>> > >>> That's a reasonable point. > >>> > >>> Looks like we lost consensus. > >>> > >>> What about returning to the 1.5 behavior instead? > >> > >> If we do return to the 1.5 behavior, we would need to think about > >> doing this in 1.7. > >> > >> If there are a large number of 1.5.x and previous users who would > >> upgrade to 1.7, leaving the 1.6 behavior in 1.7 will mean that they > >> will get double the confusion: > >> > >> 1) The behavior has changed to something they weren't expecting > >> 2) The behavior is going to change back very soon > > > > I disagree. 1.7 is basically done, the 1.6 changes are out there > > already, and we still have work to do just to get consensus on how we > > want to handle this, plus implement the changes. > > > > Basically, the way I think about it in general is, you have the first > > release that contains some bug, and then you have the first release > > that doesn't contain it. Minimizing the amount of *time* between those > > releases is important. Minimizing the *number of releases* in between > > does not -- according to that logic, we shouldn't have released 1.6.1 > > and 1.6.2 until we were confident that we'd fixed *all* the bugs, > > because otherwise they might have misled people into upgrading too > > soon. Holding 1.7 back for this isn't going to get this change done or > > to users any faster; it's just going to hold back all the other > > changes in 1.7. > > > > I do think we ought to aim to shorten our release cycle drastically. > > Like release 1.8 within 2-3 months after 1.7. But let's talk about > > that after 1.7 is out. > > Yes, I was imagining that resolving this question would be rather > quick, and therefore any delay to 1.7 would be very small, but if it > takes more than a few days to come to a solution, it's possible there > would not be net benefit. > > To Ralf - I think a 'bugfix only' metric doesn't help all that much in > this case, because if we revert to 1.5 behavior, this could very > reasonably be described as a bugfix. It's not just the time to make the change, it's the time to make sure that we haven't created any new unexpected problems in the process. 1.7's already gone through many weeks of stabilization and testing. Really at this point the criterion isn't really even bug fixes only, but release critical bugs and doc fixes only (and the only RC bugs left should be ones discovered through the beta/rc cycle). -n -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From matthew.brett at gmail.com Sat Jan 5 13:56:13 2013 From: matthew.brett at gmail.com (Matthew Brett) Date: Sat, 5 Jan 2013 18:56:13 +0000 Subject: [Numpy-discussion] Do we want scalar casting to behave as it does at the moment? In-Reply-To: References: <50E61E29.1020709@astro.uio.no> Message-ID: Hi, On Sat, Jan 5, 2013 at 4:16 PM, Nathaniel Smith wrote: > On 5 Jan 2013 15:59, "Matthew Brett" wrote: >> >> Hi, >> >> On Sat, Jan 5, 2013 at 2:38 PM, Nathaniel Smith wrote: >> > On Sat, Jan 5, 2013 at 12:32 PM, Matthew Brett >> > wrote: >> >> Hi, >> >> >> >> On Fri, Jan 4, 2013 at 4:54 PM, Matthew Brett >> >> wrote: >> >>> Hi, >> >>> >> >>> On Fri, Jan 4, 2013 at 4:01 PM, Andrew Collette >> >>> wrote: >> >>>> >From a more basic perspective, I think that adding a number to an >> >>>> array should never raise an exception. I've not used any other >> >>>> language in which this behavior takes place. In C, you have rollover >> >>>> behavior, in IDL you roll over or clip, and in NumPy you either roll >> >>>> or upcast, depending on the version. IDL, etc. manage to handle >> >>>> things like max() or total() in a sensible (or at least defensible) >> >>>> fashion, and without raising an error. >> >>> >> >>> That's a reasonable point. >> >>> >> >>> Looks like we lost consensus. >> >>> >> >>> What about returning to the 1.5 behavior instead? >> >> >> >> If we do return to the 1.5 behavior, we would need to think about >> >> doing this in 1.7. >> >> >> >> If there are a large number of 1.5.x and previous users who would >> >> upgrade to 1.7, leaving the 1.6 behavior in 1.7 will mean that they >> >> will get double the confusion: >> >> >> >> 1) The behavior has changed to something they weren't expecting >> >> 2) The behavior is going to change back very soon >> > >> > I disagree. 1.7 is basically done, the 1.6 changes are out there >> > already, and we still have work to do just to get consensus on how we >> > want to handle this, plus implement the changes. >> > >> > Basically, the way I think about it in general is, you have the first >> > release that contains some bug, and then you have the first release >> > that doesn't contain it. Minimizing the amount of *time* between those >> > releases is important. Minimizing the *number of releases* in between >> > does not -- according to that logic, we shouldn't have released 1.6.1 >> > and 1.6.2 until we were confident that we'd fixed *all* the bugs, >> > because otherwise they might have misled people into upgrading too >> > soon. Holding 1.7 back for this isn't going to get this change done or >> > to users any faster; it's just going to hold back all the other >> > changes in 1.7. >> > >> > I do think we ought to aim to shorten our release cycle drastically. >> > Like release 1.8 within 2-3 months after 1.7. But let's talk about >> > that after 1.7 is out. >> >> Yes, I was imagining that resolving this question would be rather >> quick, and therefore any delay to 1.7 would be very small, but if it >> takes more than a few days to come to a solution, it's possible there >> would not be net benefit. >> >> To Ralf - I think a 'bugfix only' metric doesn't help all that much in >> this case, because if we revert to 1.5 behavior, this could very >> reasonably be described as a bugfix. > > It's not just the time to make the change, it's the time to make sure that > we haven't created any new unexpected problems in the process. 1.7's already > gone through many weeks of stabilization and testing. 
Really at this point > the criterion isn't really even bug fixes only, but release critical bugs > and doc fixes only (and the only RC bugs left should be ones discovered > through the beta/rc cycle). OK, I understand. This must influence the decision on what to do about the scalar casting. Further from 1.5.x makes reverting to 1.5.x less attractive. The longer the 1.6.x changes have been in the wild, the stronger the argument for leaving things as they are. Best, Matthew From nouiz at nouiz.org Sat Jan 5 15:36:01 2013 From: nouiz at nouiz.org (=?ISO-8859-1?Q?Fr=E9d=E9ric_Bastien?=) Date: Sat, 5 Jan 2013 15:36:01 -0500 Subject: [Numpy-discussion] Howto bisect old commits correctly In-Reply-To: <1357349391.12993.8.camel@sebastian-laptop> References: <1357341467.12993.6.camel@sebastian-laptop> <1357349391.12993.8.camel@sebastian-laptop> Message-ID: Hi, I had many error when tring to the checkedout version and recompile. the problem I had is that I didn't erased the build directory each time. This cause some problem as not all is recompiled correctly in that case. Just deleting this directory manually fixed my problem. HTH Fred On Fri, Jan 4, 2013 at 8:29 PM, Sebastian Berg wrote: > On Sat, 2013-01-05 at 00:17 +0100, Sebastian Berg wrote: >> Hey, >> >> this is probably just because I do not have any experience with bisect >> and the like, but when I try running a bisect keep running into: >> > > Nevermind that. Probably I just stumbled on some bad versions... > >> ImportError: /home/sebastian/.../lib/python2.7/site-packages/numpy/core/multiarray.so: undefined symbol: PyDataMem_NEW >> or: >> RuntimeError: module compiled against API version 8 but this version of numpy is 7 >> >> I am sure I am missing something simple, but I have no idea where to >> look. Am I just forgetting to delete some things and my version is not >> clean!? >> >> Regards, >> >> Sebastian >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From njs at pobox.com Sat Jan 5 16:31:12 2013 From: njs at pobox.com (Nathaniel Smith) Date: Sat, 5 Jan 2013 21:31:12 +0000 Subject: [Numpy-discussion] Rank-0 arrays - reprise In-Reply-To: References: Message-ID: On 5 Jan 2013 12:16, "Matthew Brett" wrote: > > Hi, > > Following on from Nathaniel's explorations of the scalar - array > casting rules, some resources on rank-0 arrays. > > The discussion that Nathaniel tracked down on "rank-0 arrays"; it also > makes reference to casting. The rank-0 arrays seem to have been one > way of solving the problem of maintaining array dtypes other than bool > / float / int: > > http://mail.scipy.org/pipermail/numpy-discussion/2002-September/001612.html > > Quoting from an email from Travis in that thread, replying to an email > from Tim Hochberg: > > http://mail.scipy.org/pipermail/numpy-discussion/2002-September/001647.html > > > > Frankly, I have no idea what the implimentation details would be, but > > could we get rid of rank-0 arrays altogether? I have always simply found > > them strange and confusing... What are they really neccesary for > > (besides holding scalar values of different precision that standard > > Pyton scalars)? > > With new coercion rules this becomes a possibility. 
Arguments against it > are that special rank-0 arrays behave as more consistent numbers with the > rest of Numeric than Python scalars. In other words they have a length > and a shape and one can right N-dimensional code that works the same even > when the result is a scalar. > > Another advantage of having a Numeric scalar is that we can control the > behavior of floating point operations better. > > e.g. > > if only Python scalars were available and sum(a) returned 0, then > > 1 / sum(a) would behave as Python behaves (always raises error). > > while with our own scalars > > 1 / sum(a) could potentially behave however the user wanted. > > > There seemed then to be some impetus to remove rank-0 arrays and > replace them with Python scalar types with the various numpy > precisions : > > http://mail.scipy.org/pipermail/numpy-discussion/2002-September/013983.html > > Travis' recent email hints at something that seems similar, but I > don't understand what he means: > > http://mail.scipy.org/pipermail/numpy-discussion/2012-December/064795.html > > > Don't create array-scalars. Instead, make the data-type object a > meta-type object whose instances are the items returned from NumPy > arrays. There is no need for a separate array-scalar object and in > fact it's confusing to the type-system. I understand that now. I > did not understand that 5 years ago. > > > Travis - can you expand? Numpy has 3 partially overlapping concepts: A) scalars (what Travis calls "array scalars"): Things like "float64", "int32". These are ordinary Python classes; usually when you subscript an array, what you get back is an instance of one of these classes: In [1]: a = np.array([1, 2, 3]) In [2]: a[0] Out[2]: 1 In [3]: type(a[0]) Out[3]: numpy.int64 Note that even though they are called "array scalars", they have nothing to do with the actual ndarray type -- they are totally separate objects. B) dtypes: These are instances of class np.dtype. For every scalar type, there is a corresponding dtype object; plus you can create new dtype objects for things like record arrays (which correspond to scalars of type "np.void"; I don't really understand how void scalars work in detail): In [8]: int64_dtype = np.dtype(np.int64) In [9]: int64_dtype Out[9]: dtype('int64') In [10]: type(int64_dtype) Out[10]: numpy.dtype In [11]: int64_dtype.type Out[11]: numpy.int64 C) rank-0 arrays: Plain old ndarray objects that happen to have ndim == 0, shape == (). These are arrays which are scalars, but they are not array scalars. Arrays HAVE-A dtype. In [15]: int64_arr = np.array(1) In [16]: int64_arr Out[16]: array(1) In [17]: int64_arr.dtype Out[17]: dtype('int64') ------------ Okay given that background: What Travis was saying in that email was that he thought (A) and (B) should be combined. Instead of having np.float64-the-class and dtype(np.float64)-the-dtype-object, we should make dtype objects actually *be* the scalar classes. (They would still be dtype objects, which means they would be "metaclasses", which is just a fancy way to say, dtype would be a subclass of the Python class "type", and dtype objects would be class objects that had extra functionality.) Those old mailing list threads are debating about (A) versus (C). What we ended up with is what I described above -- we have "rank-0" (0-dimensional) arrays, and we have array scalar objects that are a different set of python types and objects entirely. 
The actual implementation is totally different -- to the point that we a 35,000 line auto-generated C file implementing arithmetic for scalars, *and* a 10,000 line auto-generated C file implementing arithmetic for arrays (including 0-dim arrays), and these have different functionality and bugs: https://github.com/numpy/numpy/issues/593 However, the actual goal of all this code is to make array scalars and 0-dim arrays entirely indistinguishable. Supposedly they have the same APIs and generally behave exactly the same, modulo bugs (but surely there can't be many of those...), and two things: 1) isinstance(scalar, np.int64) is a sorta-legitimate way to do a type check. But isinstance(zerodim_arr, np.int64) is always false. Instead you have to use issubdtype(zerodim_arr, np.int64). (I mean, obviously, right?) 2) Scalars are always read-only, like regular Python scalars. 0-dim arrays are in general writeable... unless you set them to read-only. I think the only behavioural difference between an array scalar and a read-only 0-dim array is that for read-only 0-dim arrays, in-place operations raise an exception: In [5]: scalar = np.int64(1) # same as 'scalar = scalar + 2', i.e., creates a new object In [6]: scalar += 2 In [7]: scalar Out[7]: 3 In [10]: zerodim = np.array(1) In [11]: zerodim.flags.writeable = False In [12]: zerodim += 2 ValueError: return array is not writeable Also, scalar indexing of ndarrays returns scalar objects. Except when it returns a 0-dim array -- I'm pretty sure this can happen when the moon is right, though I forget the details. ndarray subclasses? custom dtypes? Maybe someone will remember. Q: We could make += work on read-only arrays with, like, a 2 line fix. So wouldn't it be simpler to throw away the tens of thousands of lines of code used to implement scalars, and just use 0-dim arrays everywhere instead? So like, np.array([1, 2, 3])[1] would return a read-only 0-dim array, which acted just like the current scalar objects in basically every way? A: Excellent question! So ndarrays would be similar to Python strings -- indexing an ndarray would return another ndarray, just like indexing a string returns another string? Q: Yeah. I mean, I remember that seemed weird when I first learned Python, but when have you ever felt the Python was really missing a "character" type like C has? A: That's true, I don't think I ever have. Plus if you wanted a "real" float/int/whatever object you could just call float() or int() or use .item(), just like now. Can you think any problems this would cause, though? Q: Well, what about speed? 0-dim arrays are stupidly slow: In [2]: x = 1.5 In [3]: zerodim = np.array(x) In [4]: scalar = zerodim[()] In [5]: timeit x * x 10000000 loops, best of 3: 64.2 ns per loop In [6]: timeit scalar * scalar 1000000 loops, best of 3: 299 ns per loop In [7]: timeit zerodim * zerodim 1000000 loops, best of 3: 1.78 us per loop A: True! Q: So before we could throw away that code, we'd have to make arrays faster? A: Is that an objection? Q: Well, maybe they're already going as fast as they possibly can be? Part of the motivation for having array scalars in the first place was that they could be more optimized. A: It's true, reducing overhead might be hard! For example, with arrays, you have to look up which ufunc inner loop to use. That requires considering all kinds of different casts (like it has to consider, maybe we should cast both arrays to integers and then multiply those?), and this currently takes up about 700 ns all by itself! 
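A rough way to see the size of that lookup problem from Python is to
introspect the ufunc itself -- nothing below is new API, just the loop
table every ufunc already advertises:

import numpy as np

# Each ufunc carries a table of typed inner loops; the type-resolution
# step has to pick one of these on every call, possibly after
# considering casts between the built-in dtypes.
print np.multiply.ntypes              # number of built-in inner loops
print 'dd->d' in np.multiply.types    # True -- the double*double loop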
Q: It takes 700 ns to figure out that to multiply two arrays of doubles you should use the double-multiplication loop? A: Well, we support 24 different dtypes out-of-the-box. Caching all the different combinations so we could skip the ufunc lookup time would create memory overhead of nearly *600 bytes per ufunc!* So instead we re-do it from scratch each time. Q: Uh.... A: C'mon, that's not a question. Q: Right, okay, how about the isinstance() thing. There are probably people relying on isinstance(scalar, np.float64) working (even if this is unwise) -- but if we get rid of scalars, then how could we possibly make isinstance(zerodim_array, np.float64) work? All 0-dim arrays have the same type -- ndarray! A: Well, it turns out that starting in Python 2.6 -- which, coincidentally, is now our minimum required version! -- you can make isinstance() and issubclass() do whatever arbitrary checks you want. Check it out: class MetaEven(type): def __instancecheck__(self, obj): return obj % 2 == 0 class Even(object): __metaclass__ = MetaEven assert not isinstance(1, Even) assert isinstance(2, Even) So we could just decide that isinstance(foo, some_dtype) returns True whenever foo is an array with the given dtype, and define np.float64 to be correct dtype. (Thus also fulfilling Travis's idea of getting rid of the distinction between scalar types and dtypes.) Q: So basically all the dtypes, including the weird ones like 'np.integer' and 'np.number'[1], would use the standard Python abstract base class machinery, and we could throw out all the issubdtype/issubsctype/issctype nonsense, and just use isinstance/issubclass everywhere instead? [1] http://docs.scipy.org/doc/numpy/reference/arrays.scalars.html A: Yeah. Q: Huh. That does sound nice. I don't know. What other problems can you think of with this scheme? -n From eric.emsellem at eso.org Sat Jan 5 17:04:34 2013 From: eric.emsellem at eso.org (Eric Emsellem) Date: Sat, 05 Jan 2013 23:04:34 +0100 Subject: [Numpy-discussion] Invalid value encoutered : how to, prevent numpy.where to do this? Message-ID: <50E8A372.5080808@eso.org> Thanks! This makes sense of course. And yes the operation I am trying to do is rather complicated so I need to rely on a prior selection. Now I would need to optimise this for large arrays and the code does go through these command line many many times. When I have to operate on the two different parts of the array, I guess just using the following is the fastest way (as you indicated) : result = np.empty_like(data) mask = (data == 0) result[mask] = 0.0 result[~mask] = 1.0/data[~mask] But if I only need to do this on one side of the selection, I guess I would just do: result = np.empty_like(data) mask = (data != 0) result[mask] += 1.0 / data[mask] I have tried using three version of "mask = " with the rest of the code being the same: 1- mask = where(data != 0) 2- mask = np.where(data != 0) 3- mask = (data != 0) and it looks like #3 is the fastest, then #2 (20% slower) then #1 (50% slower than #3). I am not sure why, but Is that making sense? Or is there even a faster way (for large data arrays, and complicated operations)? 
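One more option worth noting, since the original worry was about np.seterr
being global: np.errstate is a context manager that scopes the change to a
single block and restores the previous settings on exit -- a minimal sketch
(any reasonably recent numpy):

import numpy as np

data = np.arange(10.)

# mask-based version (no warnings at all)
result = np.zeros_like(data)
mask = (data != 0)
result[mask] = 1.0 / data[mask]

# np.where version, with the divide warning silenced only inside this block
with np.errstate(divide='ignore'):
    result2 = np.where(data == 0, 0.0, 1.0 / data)

On older versions, np.seterr returns the previous settings, so they can be
saved and restored by hand for the same effect.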
thanks Eric > If your operation doesn't factor like this though then you can always > use something more cumbersome like > result = np.empty_like(data) > mask = (data == 0) > result[mask] = 0 > result[~mask] = 1.0/data[~mask] > > Or in 1.7 this could be written > result = np.zeros_like(data) > np.divide(1.0, data, where=(data != 0), out=result) > > -n > From eric.emsellem at eso.org Sat Jan 5 17:04:55 2013 From: eric.emsellem at eso.org (Eric Emsellem) Date: Sat, 05 Jan 2013 23:04:55 +0100 Subject: [Numpy-discussion] Invalid value encoutered : how to, prevent numpy.where to do this? Message-ID: <50E8A387.7050706@eso.org> Thanks! This makes sense of course. And yes the operation I am trying to do is rather complicated so I need to rely on a prior selection. Now I would need to optimise this for large arrays and the code does go through these command line many many times. When I have to operate on the two different parts of the array, I guess just using the following is the fastest way (as you indicated) : result = np.empty_like(data) mask = (data == 0) result[mask] = 0.0 result[~mask] = 1.0/data[~mask] But if I only need to do this on one side of the selection, I guess I would just do: result = np.empty_like(data) mask = (data != 0) result[mask] += 1.0 / data[mask] I have tried using three version of "mask = " with the rest of the code being the same: 1- mask = where(data != 0) 2- mask = np.where(data != 0) 3- mask = (data != 0) and it looks like #3 is the fastest, then #2 (20% slower) then #1 (50% slower than #3). I am not sure why, but Is that making sense? Or is there even a faster way (for large data arrays, and complicated operations)? thanks Eric > If your operation doesn't factor like this though then you can always > use something more cumbersome like > result = np.empty_like(data) > mask = (data == 0) > result[mask] = 0 > result[~mask] = 1.0/data[~mask] > > Or in 1.7 this could be written > result = np.zeros_like(data) > np.divide(1.0, data, where=(data != 0), out=result) > > -n > From eric.emsellem at eso.org Sat Jan 5 17:07:24 2013 From: eric.emsellem at eso.org (Eric Emsellem) Date: Sat, 05 Jan 2013 23:07:24 +0100 Subject: [Numpy-discussion] Invalid value encoutered : how to, prevent numpy.where to do this? Message-ID: <50E8A41C.7090201@eso.org> Thanks! This makes sense of course. And yes the operation I am trying to do is rather complicated so I need to rely on a prior selection. Now I would need to optimise this for large arrays and the code does go through these command line many many times. When I have to operate on the two different parts of the array, I guess just using the following is the fastest way (as you indicated) : result = np.empty_like(data) mask = (data == 0) result[mask] = 0.0 result[~mask] = 1.0/data[~mask] But if I only need to do this on one side of the selection, I guess I would just do: result = np.empty_like(data) mask = (data != 0) result[mask] += 1.0 / data[mask] I have tried using three version of "mask = " with the rest of the code being the same: 1- mask = where(data != 0) 2- mask = np.where(data != 0) 3- mask = (data != 0) and it looks like #3 is the fastest, then #2 (20% slower) then #1 (50% slower than #3). I am not sure why, but Is that making sense? Or is there even a faster way (for large data arrays, and complicated operations)? 
thanks Eric > If your operation doesn't factor like this though then you can always > use something more cumbersome like > result = np.empty_like(data) > mask = (data == 0) > result[mask] = 0 > result[~mask] = 1.0/data[~mask] > > Or in 1.7 this could be written > result = np.zeros_like(data) > np.divide(1.0, data, where=(data != 0), out=result) > > -n > From cournape at gmail.com Sat Jan 5 17:10:04 2013 From: cournape at gmail.com (David Cournapeau) Date: Sat, 5 Jan 2013 16:10:04 -0600 Subject: [Numpy-discussion] Rank-0 arrays - reprise In-Reply-To: References: Message-ID: On Sat, Jan 5, 2013 at 3:31 PM, Nathaniel Smith wrote: > On 5 Jan 2013 12:16, "Matthew Brett" wrote: >> >> Hi, >> >> Following on from Nathaniel's explorations of the scalar - array >> casting rules, some resources on rank-0 arrays. >> >> The discussion that Nathaniel tracked down on "rank-0 arrays"; it also >> makes reference to casting. The rank-0 arrays seem to have been one >> way of solving the problem of maintaining array dtypes other than bool >> / float / int: >> >> http://mail.scipy.org/pipermail/numpy-discussion/2002-September/001612.html >> >> Quoting from an email from Travis in that thread, replying to an email >> from Tim Hochberg: >> >> http://mail.scipy.org/pipermail/numpy-discussion/2002-September/001647.html >> >> >> > Frankly, I have no idea what the implimentation details would be, but >> > could we get rid of rank-0 arrays altogether? I have always simply found >> > them strange and confusing... What are they really neccesary for >> > (besides holding scalar values of different precision that standard >> > Pyton scalars)? >> >> With new coercion rules this becomes a possibility. Arguments against it >> are that special rank-0 arrays behave as more consistent numbers with the >> rest of Numeric than Python scalars. In other words they have a length >> and a shape and one can right N-dimensional code that works the same even >> when the result is a scalar. >> >> Another advantage of having a Numeric scalar is that we can control the >> behavior of floating point operations better. >> >> e.g. >> >> if only Python scalars were available and sum(a) returned 0, then >> >> 1 / sum(a) would behave as Python behaves (always raises error). >> >> while with our own scalars >> >> 1 / sum(a) could potentially behave however the user wanted. >> >> >> There seemed then to be some impetus to remove rank-0 arrays and >> replace them with Python scalar types with the various numpy >> precisions : >> >> http://mail.scipy.org/pipermail/numpy-discussion/2002-September/013983.html >> >> Travis' recent email hints at something that seems similar, but I >> don't understand what he means: >> >> http://mail.scipy.org/pipermail/numpy-discussion/2012-December/064795.html >> >> >> Don't create array-scalars. Instead, make the data-type object a >> meta-type object whose instances are the items returned from NumPy >> arrays. There is no need for a separate array-scalar object and in >> fact it's confusing to the type-system. I understand that now. I >> did not understand that 5 years ago. >> >> >> Travis - can you expand? > > Numpy has 3 partially overlapping concepts: > > A) scalars (what Travis calls "array scalars"): Things like "float64", > "int32". 
These are ordinary Python classes; usually when you subscript > an array, what you get back is an instance of one of these classes: > > In [1]: a = np.array([1, 2, 3]) > > In [2]: a[0] > Out[2]: 1 > > In [3]: type(a[0]) > Out[3]: numpy.int64 > > Note that even though they are called "array scalars", they have > nothing to do with the actual ndarray type -- they are totally > separate objects. > > B) dtypes: These are instances of class np.dtype. For every scalar > type, there is a corresponding dtype object; plus you can create new > dtype objects for things like record arrays (which correspond to > scalars of type "np.void"; I don't really understand how void scalars > work in detail): > > In [8]: int64_dtype = np.dtype(np.int64) > > In [9]: int64_dtype > Out[9]: dtype('int64') > > In [10]: type(int64_dtype) > Out[10]: numpy.dtype > > In [11]: int64_dtype.type > Out[11]: numpy.int64 > > C) rank-0 arrays: Plain old ndarray objects that happen to have ndim > == 0, shape == (). These are arrays which are scalars, but they are > not array scalars. Arrays HAVE-A dtype. > > In [15]: int64_arr = np.array(1) > > In [16]: int64_arr > Out[16]: array(1) > > In [17]: int64_arr.dtype > Out[17]: dtype('int64') > > ------------ > > Okay given that background: > > What Travis was saying in that email was that he thought (A) and (B) > should be combined. Instead of having np.float64-the-class and > dtype(np.float64)-the-dtype-object, we should make dtype objects > actually *be* the scalar classes. (They would still be dtype objects, > which means they would be "metaclasses", which is just a fancy way to > say, dtype would be a subclass of the Python class "type", and dtype > objects would be class objects that had extra functionality.) > > Those old mailing list threads are debating about (A) versus (C). What > we ended up with is what I described above -- we have "rank-0" > (0-dimensional) arrays, and we have array scalar objects that are a > different set of python types and objects entirely. The actual > implementation is totally different -- to the point that we a 35,000 > line auto-generated C file implementing arithmetic for scalars, *and* > a 10,000 line auto-generated C file implementing arithmetic for arrays > (including 0-dim arrays), and these have different functionality and > bugs: > https://github.com/numpy/numpy/issues/593 > > However, the actual goal of all this code is to make array scalars and > 0-dim arrays entirely indistinguishable. Supposedly they have the same > APIs and generally behave exactly the same, modulo bugs (but surely > there can't be many of those...), and two things: > > 1) isinstance(scalar, np.int64) is a sorta-legitimate way to do a type > check. But isinstance(zerodim_arr, np.int64) is always false. Instead > you have to use issubdtype(zerodim_arr, np.int64). (I mean, obviously, > right?) > > 2) Scalars are always read-only, like regular Python scalars. 0-dim > arrays are in general writeable... unless you set them to read-only. I > think the only behavioural difference between an array scalar and a > read-only 0-dim array is that for read-only 0-dim arrays, in-place > operations raise an exception: > > In [5]: scalar = np.int64(1) > > # same as 'scalar = scalar + 2', i.e., creates a new object > In [6]: scalar += 2 > > In [7]: scalar > Out[7]: 3 > > In [10]: zerodim = np.array(1) > > In [11]: zerodim.flags.writeable = False > > In [12]: zerodim += 2 > ValueError: return array is not writeable > > Also, scalar indexing of ndarrays returns scalar objects. 
Except when > it returns a 0-dim array -- I'm pretty sure this can happen when the > moon is right, though I forget the details. ndarray subclasses? custom > dtypes? Maybe someone will remember. > > Q: We could make += work on read-only arrays with, like, a 2 line fix. > So wouldn't it be simpler to throw away the tens of thousands of lines > of code used to implement scalars, and just use 0-dim arrays > everywhere instead? So like, np.array([1, 2, 3])[1] would return a > read-only 0-dim array, which acted just like the current scalar > objects in basically every way? > > A: Excellent question! So ndarrays would be similar to Python strings > -- indexing an ndarray would return another ndarray, just like > indexing a string returns another string? > > Q: Yeah. I mean, I remember that seemed weird when I first learned > Python, but when have you ever felt the Python was really missing a > "character" type like C has? > > A: That's true, I don't think I ever have. Plus if you wanted a "real" > float/int/whatever object you could just call float() or int() or use > .item(), just like now. Can you think any problems this would cause, > though? > > Q: Well, what about speed? 0-dim arrays are stupidly slow: > > In [2]: x = 1.5 > > In [3]: zerodim = np.array(x) > > In [4]: scalar = zerodim[()] > > In [5]: timeit x * x > 10000000 loops, best of 3: 64.2 ns per loop > > In [6]: timeit scalar * scalar > 1000000 loops, best of 3: 299 ns per loop > > In [7]: timeit zerodim * zerodim > 1000000 loops, best of 3: 1.78 us per loop > > A: True! > > Q: So before we could throw away that code, we'd have to make arrays faster? > > A: Is that an objection? > > Q: Well, maybe they're already going as fast as they possibly can be? > Part of the motivation for having array scalars in the first place was > that they could be more optimized. > > A: It's true, reducing overhead might be hard! For example, with > arrays, you have to look up which ufunc inner loop to use. That > requires considering all kinds of different casts (like it has to > consider, maybe we should cast both arrays to integers and then > multiply those?), and this currently takes up about 700 ns all by > itself! > > Q: It takes 700 ns to figure out that to multiply two arrays of > doubles you should use the double-multiplication loop? > > A: Well, we support 24 different dtypes out-of-the-box. Caching all > the different combinations so we could skip the ufunc lookup time > would create memory overhead of nearly *600 bytes per ufunc!* So > instead we re-do it from scratch each time. > > Q: Uh.... > > A: C'mon, that's not a question. > > Q: Right, okay, how about the isinstance() thing. There are probably > people relying on isinstance(scalar, np.float64) working (even if this > is unwise) -- but if we get rid of scalars, then how could we possibly > make isinstance(zerodim_array, np.float64) work? All 0-dim arrays have > the same type -- ndarray! > > A: Well, it turns out that starting in Python 2.6 -- which, > coincidentally, is now our minimum required version! -- you can make > isinstance() and issubclass() do whatever arbitrary checks you want. > Check it out: > > class MetaEven(type): > def __instancecheck__(self, obj): > return obj % 2 == 0 > > class Even(object): > __metaclass__ = MetaEven > > assert not isinstance(1, Even) > assert isinstance(2, Even) > > So we could just decide that isinstance(foo, some_dtype) returns True > whenever foo is an array with the given dtype, and define np.float64 > to be correct dtype. 
(Thus also fulfilling Travis's idea of getting > rid of the distinction between scalar types and dtypes.) > > Q: So basically all the dtypes, including the weird ones like > 'np.integer' and 'np.number'[1], would use the standard Python > abstract base class machinery, and we could throw out all the > issubdtype/issubsctype/issctype nonsense, and just use > isinstance/issubclass everywhere instead? > [1] http://docs.scipy.org/doc/numpy/reference/arrays.scalars.html > > A: Yeah. > > Q: Huh. That does sound nice. I don't know. What other problems can > you think of with this scheme? Thanks for the entertaining explanation. I don't think 0-dim array being slow is such a big drawback. I would be really surprised if there was no way to make them faster, and having unspecified, nearly duplicated type handling code in multiple places is likely one reason why nobody took time to really make them faster. Regarding ufunc combination caching, couldn't we do the caching on demand ? I am not sure how you arrived at a 600 bytes per ufunc, but in many real world use cases, I would suspect only a few combinations would be used. Scalar arrays are ones of the most esoteric feature of numpy, and a fairly complex one in terms of implementation. Getting rid of it would be a net plus on that side. Of course, there is the issue of backward compatibility, whose extend is hard to assess. cheers, David From njs at pobox.com Sat Jan 5 17:14:47 2013 From: njs at pobox.com (Nathaniel Smith) Date: Sat, 5 Jan 2013 22:14:47 +0000 Subject: [Numpy-discussion] Do we want scalar casting to behave as it does at the moment? In-Reply-To: References: <50E61E29.1020709@astro.uio.no> Message-ID: On Fri, Jan 4, 2013 at 5:25 PM, Andrew Collette wrote: > I agree the current behavior is confusing. Regardless of the details > of what to do, I suppose my main objection is that, to me, it's really > unexpected that adding a number to an array could result in an > exception. I think the main objection to the 1.5 behaviour was that it violated "Errors should never pass silently." (from 'import this'). Granted there are tons of places where numpy violates this but this is the one we're thinking about right now... Okay, here's another idea I'll throw out, maybe it's a good compromise: 1) We go back to the 1.5 behaviour. 2) If this produces a rollover/overflow/etc., we signal that using the standard mechanisms (whatever is configured via np.seterr). So by default things like np.maximum(np.array([1, 2, 3], dtype=uint8), 256) would succeed (and produce [1, 2, 3] with dtype uint8), but also issue a warning that 256 had rolled over to become 0. Alternatively those who want to be paranoid could call np.seterr(overflow="raise") and then it would be an error. -n From njs at pobox.com Sat Jan 5 17:20:11 2013 From: njs at pobox.com (Nathaniel Smith) Date: Sat, 5 Jan 2013 22:20:11 +0000 Subject: [Numpy-discussion] Invalid value encoutered : how to, prevent numpy.where to do this? In-Reply-To: <50E8A41C.7090201@eso.org> References: <50E8A41C.7090201@eso.org> Message-ID: On Sat, Jan 5, 2013 at 10:07 PM, Eric Emsellem wrote: > Thanks! > > This makes sense of course. And yes the operation I am trying to do is > rather complicated so I need to rely on a prior selection. > > Now I would need to optimise this for large arrays and the code does go > through these command line many many times. 
> > When I have to operate on the two different parts of the array, I guess > just using the following is the fastest way (as you indicated) : > > result = np.empty_like(data) > mask = (data == 0) > result[mask] = 0.0 > result[~mask] = 1.0/data[~mask] > > But if I only need to do this on one side of the selection, I guess I > would just do: > > result = np.empty_like(data) > mask = (data != 0) > result[mask] += 1.0 / data[mask] Note that np.empty_like will return an array full of random memory contents, and this will leave those random values anywhere that mask == False. This may or may not be a problem for you. > I have tried using three version of "mask = " with the rest of the code > being the same: > > 1- mask = where(data != 0) > 2- mask = np.where(data != 0) > 3- mask = (data != 0) > > and it looks like #3 is the fastest, then #2 (20% slower) then #1 (50% > slower than #3). > > I am not sure why, but Is that making sense? Or is there even a faster > way (for large data arrays, and complicated operations)? Yes, these should all do the same thing. And calling a function is slower than not calling a function, and normal Python 'where' is slower (for numpy arrays) than the numpy 'where'. Once you can count on 1.7, using the new where= argument should be the fastest way to do this (since it totally avoids making temporary arrays). -n From njs at pobox.com Sat Jan 5 17:58:04 2013 From: njs at pobox.com (Nathaniel Smith) Date: Sat, 5 Jan 2013 22:58:04 +0000 Subject: [Numpy-discussion] Rank-0 arrays - reprise In-Reply-To: References: Message-ID: On Sat, Jan 5, 2013 at 10:10 PM, David Cournapeau wrote: > Thanks for the entertaining explanation. Procrastination is a hell of a drug. > I don't think 0-dim array being slow is such a big drawback. I would > be really surprised if there was no way to make them faster, and > having unspecified, nearly duplicated type handling code in multiple > places is likely one reason why nobody took time to really make them > faster. I agree! > Regarding ufunc combination caching, couldn't we do the caching on > demand ? I am not sure how you arrived at a 600 bytes per ufunc, but > in many real world use cases, I would suspect only a few combinations > would be used. 600 bytes is for an implementation that just kept a table like chosen_ufunc_offset = np.empty((24, 24), dtype=uint8) # 576 bytes and looked up the proper ufunc loop by doing ufunc_loops[chosen_ufunc_offset[left_arg_typenum, right_arg_typenum]] I suspect we could pre-fill such tables extremely quickly (basically just fill in all the exact matches, and then do a flood-fill along the can_cast graph), or we could fill them on-demand. (Also even 24 * 24 is currently an over-estimate since some of those 24 types are parametrized, and currently ufuncs can't handle parametrized types, but hopefully that will get fixed at some point.) The numpy main namespace and scipy.special together contain only 35 + 47 = 82 ufuncs that take 2 arguments[1], so loading those two modules using this scheme would add a total of *50 kilobytes* to numpy's memory overhead... (Interestingly, scipy.special does include 48 three-argument ufuncs, 15 four-argument ufuncs, and 6 five-argument ufuncs, which obviously cannot use a table lookup scheme. Maybe we can add a check for symmetry -- if all the loops are defined on matching types, like "dd->d" and "ff->f", then really it's a one-dimensional lookup problem -- find a common type for the inputs ("d" or "f") and then find the best loop for that one type.) 
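For reference, a small sketch of the kind of count behind those figures (the totals depend on the installed NumPy and SciPy versions, so the numbers quoted above are a 2013 snapshot):

import collections
import numpy as np
import scipy.special

def ufunc_arity_counts(namespace):
    counts = collections.Counter()
    for obj in vars(namespace).values():
        if isinstance(obj, np.ufunc):
            counts[obj.nin] += 1   # nin = number of input arguments
    return dict(sorted(counts.items()))

print("numpy:         ", ufunc_arity_counts(np))
print("scipy.special: ", ufunc_arity_counts(scipy.special))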
Another option, like you suggest (?), would be to keep a little fixed-size LRU cache for each ufunc and do lookups in it by linear search. It's hard to know how many different types get used in real-world programs, though, and I'd worry about performance falling off a cliff as soon as someone tweaked their inner loop so it used 6 different (type1, type2) combinations instead of 5 or whatever (maybe in one place they do int * float and in another float * int, etc.). Anyway the point is yes, this particular thing is eminently fixable. > Scalar arrays are ones of the most esoteric feature of numpy, and a > fairly complex one in terms of implementation. Getting rid of it would > be a net plus on that side. Of course, there is the issue of backward > compatibility, whose extend is hard to assess. You mean array scalars, not scalar arrays, right?[2] -n [1] len([v for v in np.__dict__.values() if isinstance(v, np.ufunc) and v.nin == 2]) [2] The fact that this sentence means something is certainly evidence for... something. From ondrej.certik at gmail.com Sat Jan 5 21:21:04 2013 From: ondrej.certik at gmail.com (=?UTF-8?B?T25kxZllaiDEjGVydMOtaw==?=) Date: Sat, 5 Jan 2013 18:21:04 -0800 Subject: [Numpy-discussion] Which Python to use for Mac binaries Message-ID: Hi, Currently the NumPy binaries are built using the pavement.py script, which uses the following Pythons: MPKG_PYTHON = { "2.5": ["/Library/Frameworks/Python.framework/Versions/2.5/bin/python"], "2.6": ["/Library/Frameworks/Python.framework/Versions/2.6/bin/python"], "2.7": ["/Library/Frameworks/Python.framework/Versions/2.7/bin/python"], "3.1": ["/Library/Frameworks/Python.framework/Versions/3.1/bin/python3"], "3.2": ["/Library/Frameworks/Python.framework/Versions/3.2/bin/python3"], "3.3": ["/Library/Frameworks/Python.framework/Versions/3.3/bin/python3"], } So for example I can easily create the 2.6 binary if that Python is pre-installed on the Mac box that I am using. On one of the Mac boxes that I am using, the 2.7 is missing, so are 3.1, 3.2 and 3.3. So I was thinking of updating my Fabric fab file to automatically install all Pythons from source and build against that, just like I do for Wine. Which exact Python do we need to use on Mac? Do we need to use the binary installer from python.org? Or can I install it from source? Finally, for which Python versions should we provide binary installers for Mac? For reference, the 1.6.2 had installers for 2.5, 2.6 and 2.7 only for OS X 10.3. There is only 2.7 version for OS X 10.6. 
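As a throwaway helper (hypothetical, not part of pavement.py), something along these lines could report which of the python.org framework interpreters listed in MPKG_PYTHON are actually present on a build box before starting:

import os

MPKG_PYTHON = {
    "2.6": "/Library/Frameworks/Python.framework/Versions/2.6/bin/python",
    "2.7": "/Library/Frameworks/Python.framework/Versions/2.7/bin/python",
    "3.2": "/Library/Frameworks/Python.framework/Versions/3.2/bin/python3",
    "3.3": "/Library/Frameworks/Python.framework/Versions/3.3/bin/python3",
}

for version, path in sorted(MPKG_PYTHON.items()):
    status = "found" if os.path.exists(path) else "MISSING"
    print(version, status, path)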
Also, what is the meaning of the following piece of code in pavement.py: def _build_mpkg(pyver): # account for differences between Python 2.7.1 versions from python.org if os.environ.get('MACOSX_DEPLOYMENT_TARGET', None) == "10.6": ldflags = "-undefined dynamic_lookup -bundle -arch i386 -arch x86_64 -Wl,-search_paths_first" else: ldflags = "-undefined dynamic_lookup -bundle -arch i386 -arch ppc -Wl,-search_paths_first" ldflags += " -L%s" % os.path.join(os.path.dirname(__file__), "build") if pyver == "2.5": sh("CC=gcc-4.0 LDFLAGS='%s' %s setupegg.py bdist_mpkg" % (ldflags, " ".join(MPKG_PYTHON[pyver]))) else: sh("LDFLAGS='%s' %s setupegg.py bdist_mpkg" % (ldflags, " ".join(MPKG_PYTHON[pyver]))) In particular, the last line gets executed and it then fails with: paver dmg -p 2.6 ---> pavement.dmg ---> pavement.clean LDFLAGS='-undefined dynamic_lookup -bundle -arch i386 -arch ppc -Wl,-search_paths_first -Lbuild' /Library/Frameworks/Python.framework/Versions/2.6/bin/python setupegg.py bdist_mpkg Traceback (most recent call last): File "setupegg.py", line 17, in from setuptools import setup ImportError: No module named setuptools The reason is (I think) that if the Python binary is called explicitly with /Library/Frameworks/Python.framework/Versions/2.6/bin/python, then the paths are not setup properly in virtualenv, and thus setuptools (which is only installed in virtualenv, but not in system Python) fails to import. The solution is to simply apply this patch: diff --git a/pavement.py b/pavement.py index e693016..0c637f8 100644 --- a/pavement.py +++ b/pavement.py @@ -449,7 +449,7 @@ def _build_mpkg(pyver): if pyver == "2.5": sh("CC=gcc-4.0 LDFLAGS='%s' %s setupegg.py bdist_mpkg" % (ldflags, " ".join(MPKG_PYTHON[pyver]))) else: - sh("LDFLAGS='%s' %s setupegg.py bdist_mpkg" % (ldflags, " ".join(MPKG_PYTHON[pyver]))) + sh("python setupegg.py bdist_mpkg") @task def simple_dmg(): and then things work. So an obvious question is --- why do we need to fiddle with LDFLAGS and paths to the exact Python version? Here is a proposed simpler version of the build_mpkg() function: def _build_mpkg(pyver): sh("python setupegg.py bdist_mpkg") Thanks for any tips. Ondrej From charlesr.harris at gmail.com Sat Jan 5 21:38:40 2013 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 5 Jan 2013 19:38:40 -0700 Subject: [Numpy-discussion] Remove support for numeric and numarray in 1.8 Message-ID: Thoughts? Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From d.s.seljebotn at astro.uio.no Sun Jan 6 02:58:58 2013 From: d.s.seljebotn at astro.uio.no (Dag Sverre Seljebotn) Date: Sun, 06 Jan 2013 08:58:58 +0100 Subject: [Numpy-discussion] Rank-0 arrays - reprise In-Reply-To: References: Message-ID: <50E92EC2.10404@astro.uio.no> On 01/05/2013 10:31 PM, Nathaniel Smith wrote: > On 5 Jan 2013 12:16, "Matthew Brett" wrote: >> >> Hi, >> >> Following on from Nathaniel's explorations of the scalar - array >> casting rules, some resources on rank-0 arrays. >> >> The discussion that Nathaniel tracked down on "rank-0 arrays"; it also >> makes reference to casting. 
The rank-0 arrays seem to have been one >> way of solving the problem of maintaining array dtypes other than bool >> / float / int: >> >> http://mail.scipy.org/pipermail/numpy-discussion/2002-September/001612.html >> >> Quoting from an email from Travis in that thread, replying to an email >> from Tim Hochberg: >> >> http://mail.scipy.org/pipermail/numpy-discussion/2002-September/001647.html >> >> >>> Frankly, I have no idea what the implimentation details would be, but >>> could we get rid of rank-0 arrays altogether? I have always simply found >>> them strange and confusing... What are they really neccesary for >>> (besides holding scalar values of different precision that standard >>> Pyton scalars)? >> >> With new coercion rules this becomes a possibility. Arguments against it >> are that special rank-0 arrays behave as more consistent numbers with the >> rest of Numeric than Python scalars. In other words they have a length >> and a shape and one can right N-dimensional code that works the same even >> when the result is a scalar. >> >> Another advantage of having a Numeric scalar is that we can control the >> behavior of floating point operations better. >> >> e.g. >> >> if only Python scalars were available and sum(a) returned 0, then >> >> 1 / sum(a) would behave as Python behaves (always raises error). >> >> while with our own scalars >> >> 1 / sum(a) could potentially behave however the user wanted. >> >> >> There seemed then to be some impetus to remove rank-0 arrays and >> replace them with Python scalar types with the various numpy >> precisions : >> >> http://mail.scipy.org/pipermail/numpy-discussion/2002-September/013983.html >> >> Travis' recent email hints at something that seems similar, but I >> don't understand what he means: >> >> http://mail.scipy.org/pipermail/numpy-discussion/2012-December/064795.html >> >> >> Don't create array-scalars. Instead, make the data-type object a >> meta-type object whose instances are the items returned from NumPy >> arrays. There is no need for a separate array-scalar object and in >> fact it's confusing to the type-system. I understand that now. I >> did not understand that 5 years ago. >> >> >> Travis - can you expand? > > Numpy has 3 partially overlapping concepts: > > A) scalars (what Travis calls "array scalars"): Things like "float64", > "int32". These are ordinary Python classes; usually when you subscript > an array, what you get back is an instance of one of these classes: > > In [1]: a = np.array([1, 2, 3]) > > In [2]: a[0] > Out[2]: 1 > > In [3]: type(a[0]) > Out[3]: numpy.int64 > > Note that even though they are called "array scalars", they have > nothing to do with the actual ndarray type -- they are totally > separate objects. > > B) dtypes: These are instances of class np.dtype. For every scalar > type, there is a corresponding dtype object; plus you can create new > dtype objects for things like record arrays (which correspond to > scalars of type "np.void"; I don't really understand how void scalars > work in detail): > > In [8]: int64_dtype = np.dtype(np.int64) > > In [9]: int64_dtype > Out[9]: dtype('int64') > > In [10]: type(int64_dtype) > Out[10]: numpy.dtype > > In [11]: int64_dtype.type > Out[11]: numpy.int64 > > C) rank-0 arrays: Plain old ndarray objects that happen to have ndim > == 0, shape == (). These are arrays which are scalars, but they are > not array scalars. Arrays HAVE-A dtype. 
> > In [15]: int64_arr = np.array(1) > > In [16]: int64_arr > Out[16]: array(1) > > In [17]: int64_arr.dtype > Out[17]: dtype('int64') > > ------------ > > Okay given that background: > > What Travis was saying in that email was that he thought (A) and (B) > should be combined. Instead of having np.float64-the-class and > dtype(np.float64)-the-dtype-object, we should make dtype objects > actually *be* the scalar classes. (They would still be dtype objects, > which means they would be "metaclasses", which is just a fancy way to > say, dtype would be a subclass of the Python class "type", and dtype > objects would be class objects that had extra functionality.) > > Those old mailing list threads are debating about (A) versus (C). What > we ended up with is what I described above -- we have "rank-0" > (0-dimensional) arrays, and we have array scalar objects that are a > different set of python types and objects entirely. The actual > implementation is totally different -- to the point that we a 35,000 > line auto-generated C file implementing arithmetic for scalars, *and* > a 10,000 line auto-generated C file implementing arithmetic for arrays > (including 0-dim arrays), and these have different functionality and > bugs: > https://github.com/numpy/numpy/issues/593 > > However, the actual goal of all this code is to make array scalars and > 0-dim arrays entirely indistinguishable. Supposedly they have the same > APIs and generally behave exactly the same, modulo bugs (but surely > there can't be many of those...), and two things: > > 1) isinstance(scalar, np.int64) is a sorta-legitimate way to do a type > check. But isinstance(zerodim_arr, np.int64) is always false. Instead > you have to use issubdtype(zerodim_arr, np.int64). (I mean, obviously, > right?) > > 2) Scalars are always read-only, like regular Python scalars. 0-dim > arrays are in general writeable... unless you set them to read-only. I > think the only behavioural difference between an array scalar and a > read-only 0-dim array is that for read-only 0-dim arrays, in-place > operations raise an exception: > > In [5]: scalar = np.int64(1) > > # same as 'scalar = scalar + 2', i.e., creates a new object > In [6]: scalar += 2 > > In [7]: scalar > Out[7]: 3 > > In [10]: zerodim = np.array(1) > > In [11]: zerodim.flags.writeable = False > > In [12]: zerodim += 2 > ValueError: return array is not writeable > > Also, scalar indexing of ndarrays returns scalar objects. Except when > it returns a 0-dim array -- I'm pretty sure this can happen when the > moon is right, though I forget the details. ndarray subclasses? custom > dtypes? Maybe someone will remember. > > Q: We could make += work on read-only arrays with, like, a 2 line fix. > So wouldn't it be simpler to throw away the tens of thousands of lines > of code used to implement scalars, and just use 0-dim arrays > everywhere instead? So like, np.array([1, 2, 3])[1] would return a > read-only 0-dim array, which acted just like the current scalar > objects in basically every way? > > A: Excellent question! So ndarrays would be similar to Python strings > -- indexing an ndarray would return another ndarray, just like > indexing a string returns another string? > > Q: Yeah. I mean, I remember that seemed weird when I first learned > Python, but when have you ever felt the Python was really missing a > "character" type like C has? str is immutable which makes this a lot easier to deal with without getting confused. 
So basically you have: a[0:1] # read-write view a[[0]] # read-write copy a[0] # read-only view AND, += are allowed on all read-only arrays, they just transparently create a copy instead of doing the operation in-place. Try to enumerate all the fundamentally different things (if you count memory use/running time) that can happen for ndarrays a, b, and arbitrary x here: a += b[x] That's already quite a lot, your proposal adds even more options. It's certainly a lot more complicated than str. To me it all sounds like a lot of rules introduced just to have the result of a[0] be "kind of a scalar" without actually choosing that option. BUT I should read up on that thread you posted on why that won't work, didn't have time yet... Dag Sverre From sebastian at sipsolutions.net Sun Jan 6 04:41:16 2013 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Sun, 06 Jan 2013 10:41:16 +0100 Subject: [Numpy-discussion] Rank-0 arrays - reprise In-Reply-To: <50E92EC2.10404@astro.uio.no> References: <50E92EC2.10404@astro.uio.no> Message-ID: <1357465276.12993.20.camel@sebastian-laptop> On Sun, 2013-01-06 at 08:58 +0100, Dag Sverre Seljebotn wrote: > On 01/05/2013 10:31 PM, Nathaniel Smith wrote: > > On 5 Jan 2013 12:16, "Matthew Brett" wrote: > >> > >> Hi, > >> > >> Following on from Nathaniel's explorations of the scalar - array > >> casting rules, some resources on rank-0 arrays. > >> > > Q: Yeah. I mean, I remember that seemed weird when I first learned > > Python, but when have you ever felt the Python was really missing a > > "character" type like C has? > > str is immutable which makes this a lot easier to deal with without > getting confused. So basically you have: > > a[0:1] # read-write view > a[[0]] # read-write copy > a[0] # read-only view > > AND, += are allowed on all read-only arrays, they just transparently > create a copy instead of doing the operation in-place. > > Try to enumerate all the fundamentally different things (if you count > memory use/running time) that can happen for ndarrays a, b, and > arbitrary x here: > > a += b[x] > > That's already quite a lot, your proposal adds even more options. It's > certainly a lot more complicated than str. > > To me it all sounds like a lot of rules introduced just to have the > result of a[0] be "kind of a scalar" without actually choosing that option. > Yes, but I don't think there is an option to making the elements of an array being immutable. Firstly if you switch normal python code to numpy code you suddenly get numpy data types spilled into your code, and mutable objects are simply very different (also true for code updating to this new version). Do you expect: array = np.zeros(10, dtype=np.intp) b = arr[5] while condition: # might change the array?! b += 1 # This would not be possible and break: dictionary[b] = b**2 Because mutable objects are not hashable which important considering that dictionaries are a very central data type, making an element return mutable would be a bad idea. One could argue about structured datatypes, but maybe then it should be a datatype property whether its mutable or not, and even then the element should probably be a copy (though I did not check what happens here right now). > BUT I should read up on that thread you posted on why that won't work, > didn't have time yet... 
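A quick check of that hashability point (illustrative names; this is how current NumPy behaves):

import numpy as np

scalar = np.int64(5)    # array scalar, what indexing an array returns today
zerodim = np.array(5)   # 0-d ndarray

d = {scalar: scalar ** 2}    # fine: array scalars are hashable
print(d)

try:
    d[zerodim] = zerodim ** 2
except TypeError as exc:     # ndarrays, 0-d included, are not hashable
    print("unhashable:", exc)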
> > Dag Sverre > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From ralf.gommers at gmail.com Sun Jan 6 05:04:20 2013 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Sun, 6 Jan 2013 11:04:20 +0100 Subject: [Numpy-discussion] Which Python to use for Mac binaries In-Reply-To: References: Message-ID: On Sun, Jan 6, 2013 at 3:21 AM, Ond?ej ?ert?k wrote: > Hi, > > Currently the NumPy binaries are built using the pavement.py script, > which uses the following Pythons: > > MPKG_PYTHON = { > "2.5": > ["/Library/Frameworks/Python.framework/Versions/2.5/bin/python"], > "2.6": > ["/Library/Frameworks/Python.framework/Versions/2.6/bin/python"], > "2.7": > ["/Library/Frameworks/Python.framework/Versions/2.7/bin/python"], > "3.1": > ["/Library/Frameworks/Python.framework/Versions/3.1/bin/python3"], > "3.2": > ["/Library/Frameworks/Python.framework/Versions/3.2/bin/python3"], > "3.3": > ["/Library/Frameworks/Python.framework/Versions/3.3/bin/python3"], > } > > So for example I can easily create the 2.6 binary if that Python is > pre-installed on the Mac box that I am using. > On one of the Mac boxes that I am using, the 2.7 is missing, so are > 3.1, 3.2 and 3.3. So I was thinking > of updating my Fabric fab file to automatically install all Pythons > from source and build against that, just like I do for Wine. > > Which exact Python do we need to use on Mac? Do we need to use the > binary installer from python.org? > Yes, the one from python.org. > Or can I install it from source? Finally, for which Python versions > should we provide binary installers for Mac? > For reference, the 1.6.2 had installers for 2.5, 2.6 and 2.7 only for > OS X 10.3. There is only 2.7 version for OS X 10.6. > The provided installers and naming scheme should match what's done for Python itself on python.org. The 10.3 installers for 2.5, 2.6 and 2.7 should be compiled on OS X 10.5. This is kind of hard to come by these days, but Vincent Davis maintains a build machine for numpy and scipy. That's already set up correctly, so all you have to do is connect to it via ssh, check out v.17.0 in ~/Code/numpy, check in release.sh that the section for OS X 10.6 is disabled and for 10.5 enabled and run it. OS X 10.6 broke support for previous versions in some subtle ways, so even when using the 10.4 SDK numpy compiled on 10.6 won't run on 10.5. As long as we're supporting 10.5 you therefore need to compile on it. The 10.7 --> 10.6 support hasn't been checked, but I wouldn't trust it. I have a 10.6 machine, so I can compile those binaries if needed. > Also, what is the meaning of the following piece of code in pavement.py: > > def _build_mpkg(pyver): > # account for differences between Python 2.7.1 versions from > python.org > if os.environ.get('MACOSX_DEPLOYMENT_TARGET', None) == "10.6": > ldflags = "-undefined dynamic_lookup -bundle -arch i386 -arch > x86_64 -Wl,-search_paths_first" > else: > ldflags = "-undefined dynamic_lookup -bundle -arch i386 -arch > ppc -Wl,-search_paths_first" > ldflags += " -L%s" % os.path.join(os.path.dirname(__file__), "build") The 10.6 binaries support only Intel Macs, both 32-bit and 64-bit. The 10.3 binaries support PPC Macs and 32-bit Intel. That's what the above does. Note that we simply follow the choice made by the Python release managers here. 
> if pyver == "2.5": > sh("CC=gcc-4.0 LDFLAGS='%s' %s setupegg.py bdist_mpkg" % > (ldflags, " ".join(MPKG_PYTHON[pyver]))) > else: > sh("LDFLAGS='%s' %s setupegg.py bdist_mpkg" % (ldflags, " > ".join(MPKG_PYTHON[pyver]))) > This is necessary because in Python 2.5, distutils asks for "gcc" instead of "gcc-4.0", so you may get the wrong one without CC=gcc-4.0. From Python 2.6 on this was fixed. > In particular, the last line gets executed and it then fails with: > > paver dmg -p 2.6 > ---> pavement.dmg > ---> pavement.clean > LDFLAGS='-undefined dynamic_lookup -bundle -arch i386 -arch ppc > -Wl,-search_paths_first -Lbuild' > /Library/Frameworks/Python.framework/Versions/2.6/bin/python > setupegg.py bdist_mpkg > Traceback (most recent call last): > File "setupegg.py", line 17, in > from setuptools import setup > ImportError: No module named setuptools > > > The reason is (I think) that if the Python binary is called explicitly > with /Library/Frameworks/Python.framework/Versions/2.6/bin/python, > then the paths are not setup properly in virtualenv, and thus > setuptools (which is only installed in virtualenv, but not in system > Python) fails to import. The solution is to simply apply this patch: > Avoid using system Python for anything. The first thing to do on any new OS X system is install Python some other way, preferably from python.org. > diff --git a/pavement.py b/pavement.py > index e693016..0c637f8 100644 > --- a/pavement.py > +++ b/pavement.py > @@ -449,7 +449,7 @@ def _build_mpkg(pyver): > if pyver == "2.5": > sh("CC=gcc-4.0 LDFLAGS='%s' %s setupegg.py bdist_mpkg" % > (ldflags, " ".join(MPKG_PYTHON[pyver]))) > else: > - sh("LDFLAGS='%s' %s setupegg.py bdist_mpkg" % (ldflags, " > ".join(MPKG_PYTHON[pyver]))) > + sh("python setupegg.py bdist_mpkg") > This doesn't work unless using virtualenvs, you're just throwing away the version selection here. If you can support virtualenvs in addition to python.org pythons, that would be useful. But being able to build binaries when needed simply by "paver dmg -p 2.x" is quite useful. > > @task > def simple_dmg(): > > > and then things work. So an obvious question is --- why do we need to > fiddle with LDFLAGS and paths to the exact Python version? Here is a > proposed simpler version of the build_mpkg() function: > > def _build_mpkg(pyver): > sh("python setupegg.py bdist_mpkg") > > Thanks for any tips. > Did you see the release.sh script? Some of the answers to your questions were already documented there, and it should do the job out of the box. Last note: bdist_mpkg is unmaintained and doesn't support Python 3.x. Most recent version is at: https://github.com/matthew-brett/bdist_mpkg, for previous versions numpy releases I've used that at commit e81a58a471 If we want 3.x binaries, then we should fix that or (preferably) build binaries with Bento. Bento has grown support for mpkg's; I'm not sure how robust that is. Cheers, Ralf -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From njs at pobox.com Sun Jan 6 05:16:13 2013 From: njs at pobox.com (Nathaniel Smith) Date: Sun, 6 Jan 2013 10:16:13 +0000 Subject: [Numpy-discussion] Rank-0 arrays - reprise In-Reply-To: <50E92EC2.10404@astro.uio.no> References: <50E92EC2.10404@astro.uio.no> Message-ID: On 6 Jan 2013 07:59, "Dag Sverre Seljebotn" wrote: > Try to enumerate all the fundamentally different things (if you count > memory use/running time) that can happen for ndarrays a, b, and > arbitrary x here: > > a += b[x] > > That's already quite a lot, your proposal adds even more options. It's > certainly a lot more complicated than str. I agree it's complicated, but all the complications and options already exist - they're just split across two similar-but-not-quite-identical sets of data types. > To me it all sounds like a lot of rules introduced just to have the > result of a[0] be "kind of a scalar" without actually choosing that option. Not sure what you mean here. We know that whatever object a[0] returns is going to have scalar behaviour. Right now we have two totally different implementations of scalars. I'm not suggesting changing any (or hardly any) existing behaviour, just that we switch which implementation of that behavior we use. I actually wrote that email as kind of amusing exercise in "what if...?", but even after sleeping on it I'm still not thinking of any terrible downsides... -n -------------- next part -------------- An HTML attachment was scrubbed... URL: From d.s.seljebotn at astro.uio.no Sun Jan 6 05:35:21 2013 From: d.s.seljebotn at astro.uio.no (Dag Sverre Seljebotn) Date: Sun, 06 Jan 2013 11:35:21 +0100 Subject: [Numpy-discussion] Rank-0 arrays - reprise In-Reply-To: <1357465276.12993.20.camel@sebastian-laptop> References: <50E92EC2.10404@astro.uio.no> <1357465276.12993.20.camel@sebastian-laptop> Message-ID: <50E95369.70602@astro.uio.no> On 01/06/2013 10:41 AM, Sebastian Berg wrote: > On Sun, 2013-01-06 at 08:58 +0100, Dag Sverre Seljebotn wrote: >> On 01/05/2013 10:31 PM, Nathaniel Smith wrote: >>> On 5 Jan 2013 12:16, "Matthew Brett" wrote: >>>> >>>> Hi, >>>> >>>> Following on from Nathaniel's explorations of the scalar - array >>>> casting rules, some resources on rank-0 arrays. >>>> > > > >>> Q: Yeah. I mean, I remember that seemed weird when I first learned >>> Python, but when have you ever felt the Python was really missing a >>> "character" type like C has? >> >> str is immutable which makes this a lot easier to deal with without >> getting confused. So basically you have: >> >> a[0:1] # read-write view >> a[[0]] # read-write copy >> a[0] # read-only view >> >> AND, += are allowed on all read-only arrays, they just transparently >> create a copy instead of doing the operation in-place. >> >> Try to enumerate all the fundamentally different things (if you count >> memory use/running time) that can happen for ndarrays a, b, and >> arbitrary x here: >> >> a += b[x] >> >> That's already quite a lot, your proposal adds even more options. It's >> certainly a lot more complicated than str. >> >> To me it all sounds like a lot of rules introduced just to have the >> result of a[0] be "kind of a scalar" without actually choosing that option. >> > > Yes, but I don't think there is an option to making the elements of an > array being immutable. Firstly if you switch normal python code to numpy > code you suddenly get numpy data types spilled into your code, and > mutable objects are simply very different (also true for code updating > to this new version). 
Do you expect: > > array = np.zeros(10, dtype=np.intp) > b = arr[5] > while condition: > # might change the array?! > b += 1 > # This would not be possible and break: > dictionary[b] = b**2 > > Because mutable objects are not hashable which important considering > that dictionaries are a very central data type, making an element return > mutable would be a bad idea. Indeed, this would be completely crazy. I should have been more precise: I like the proposal, but also believe the additional complexity introduced have significant costs that must be considered. a) Making += behave differently for readonly arrays should be carefully considered. If I have a 10 GB read-only array, I prefer an error to a copy for +=. (One could use an ISSCALAR flag instead that only affected +=...) b) Things seems simpler since "indexing away the last index" is no longer a special case, it is always true for a.ndim > 0 that "a[i]" is a new array such that a[i].ndim == a.ndim - 1 But in exchange, a new special-case is introduced since READONLY is only set when ndim becomes 0, so it doesn't really help with the learning curve IMO. In some ways I believe the "scalar-indexing" special case is simpler for newcomers to understand, and is what people already assume, and that a "readonly-indexing" special case is more complicated. It's dangerous to have a library which people only use correctly by accident, so to speak, it's much better if what people think they see is how things are. (With respect to arr[5] returning a good old Python scalar for floats and ints -- Travis' example from 2002 is division, and at least that example is much less serious now with the introduction of the // operator in Python.) > One could argue about structured datatypes, but maybe then it should be > a datatype property whether its mutable or not, and even then the > element should probably be a copy (though I did not check what happens > here right now). Elements from arrays with structured dtypes are already mutable (*and*, at least until recently, could still be used as dict keys...). This was discussed on the list a couple of months back I think. Dag Sverre > >> BUT I should read up on that thread you posted on why that won't work, >> didn't have time yet... >> >> Dag Sverre >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From d.s.seljebotn at astro.uio.no Sun Jan 6 05:40:00 2013 From: d.s.seljebotn at astro.uio.no (Dag Sverre Seljebotn) Date: Sun, 06 Jan 2013 11:40:00 +0100 Subject: [Numpy-discussion] Rank-0 arrays - reprise In-Reply-To: References: <50E92EC2.10404@astro.uio.no> Message-ID: <50E95480.7050103@astro.uio.no> On 01/06/2013 11:16 AM, Nathaniel Smith wrote: > On 6 Jan 2013 07:59, "Dag Sverre Seljebotn" > wrote: > > Try to enumerate all the fundamentally different things (if you count > > memory use/running time) that can happen for ndarrays a, b, and > > arbitrary x here: > > > > a += b[x] > > > > That's already quite a lot, your proposal adds even more options. It's > > certainly a lot more complicated than str. > > I agree it's complicated, but all the complications and options already > exist - they're just split across two similar-but-not-quite-identical > sets of data types. 
> > > To me it all sounds like a lot of rules introduced just to have the > > result of a[0] be "kind of a scalar" without actually choosing that > option. > > Not sure what you mean here. We know that whatever object a[0] returns > is going to have scalar behaviour. Right now we have two totally > different implementations of scalars. I'm not suggesting changing any > (or hardly any) existing behaviour, just that we switch which > implementation of that behavior we use. In that case, how about not changing += for READONLY but instead have a new ISSCALAR flag for that? I.e. semantics stay mostly as today, it's just about removing those 10,000 lines of C code. > I actually wrote that email as kind of amusing exercise in "what > if...?", but even after sleeping on it I'm still not thinking of any > terrible downsides... I should say that I am really happy with the direction it is taking though. (I wish I understood why using Python floats and ints is so horrible though, but I've probably not written enough library NumPy code that needs to consider all ndims and dtypes, just final-end-user-code where the array vs. scalar distinction is more clear.) Dag Sverre From njs at pobox.com Sun Jan 6 09:42:24 2013 From: njs at pobox.com (Nathaniel Smith) Date: Sun, 6 Jan 2013 14:42:24 +0000 Subject: [Numpy-discussion] Remove support for numeric and numarray in 1.8 In-Reply-To: References: Message-ID: On Sun, Jan 6, 2013 at 2:38 AM, Charles R Harris wrote: > Thoughts? To be clear, what you're talking about is basically deleting these two packages: numpy.oldnumeric numpy.numarray plus the compatibility C API in numpy/numarray/include ? So this would only affect Python code which explicitly imported one of those two packages (neither is imported by default), or C code which did #include "numpy/numarray/..."? (I'm not even sure how you would build such a C module, these headers are distributed in a weird directory not accessible via np.get_include(). So unless your build system does some special work to access it, you can't even see these headers.) -n From charlesr.harris at gmail.com Sun Jan 6 10:09:28 2013 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sun, 6 Jan 2013 08:09:28 -0700 Subject: [Numpy-discussion] Remove support for numeric and numarray in 1.8 In-Reply-To: References: Message-ID: On Sun, Jan 6, 2013 at 7:42 AM, Nathaniel Smith wrote: > On Sun, Jan 6, 2013 at 2:38 AM, Charles R Harris > wrote: > > Thoughts? > > To be clear, what you're talking about is basically deleting these two > packages: > numpy.oldnumeric > numpy.numarray > plus the compatibility C API in > numpy/numarray/include > ? > > Yep. > So this would only affect Python code which explicitly imported one of > those two packages (neither is imported by default), or C code which > did #include "numpy/numarray/..."? > > Those packages were intended to be an easy path for folks to port their numeric and numarray code to numpy. During the 2.4 discussion there was a fellow who said his group was just now moving their code from numeric to numpy, but I had the feeling they were rewriting it in the process. > (I'm not even sure how you would build such a C module, these headers > are distributed in a weird directory not accessible via > np.get_include(). So unless your build system does some special work > to access it, you can't even see these headers.) > > Never tried it myself. There is some C code in those packages and it easy to overlook its maintenance, so I'd like to solve the problem by nuking it. 
Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Sun Jan 6 11:52:13 2013 From: njs at pobox.com (Nathaniel Smith) Date: Sun, 6 Jan 2013 16:52:13 +0000 Subject: [Numpy-discussion] Rank-0 arrays - reprise In-Reply-To: <50E95369.70602@astro.uio.no> References: <50E92EC2.10404@astro.uio.no> <1357465276.12993.20.camel@sebastian-laptop> <50E95369.70602@astro.uio.no> Message-ID: On Sun, Jan 6, 2013 at 10:35 AM, Dag Sverre Seljebotn wrote: > I should have been more precise: I like the proposal, but also believe > the additional complexity introduced have significant costs that must be > considered. > > a) Making += behave differently for readonly arrays should be > carefully considered. If I have a 10 GB read-only array, I prefer an > error to a copy for +=. (One could use an ISSCALAR flag instead that > only affected +=...) Yes, definitely we would need to nail down the exact semantics here. My feeling is that we should see start by seeing if we can come up with a set of coherent rules for read-only arrays that does what we want before we add an ACT_LIKE_OLD_SCALARS flag, but either way is viable. (Or we could start with a PRETEND_TO_BE_SCALAR flag and then gradually migrate away from it.) > b) Things seems simpler since "indexing away the last index" is no > longer a special case, it is always true for a.ndim > 0 that "a[i]" is a > new array such that > > a[i].ndim == a.ndim - 1 > > But in exchange, a new special-case is introduced since READONLY is only > set when ndim becomes 0, so it doesn't really help with the learning > curve IMO. Yes, indexing with a scalar (as opposed to slicing or fancy-indexing) remains a special case just like now. And not just because the result is read-only -- it also returns a copy, not a view. I don't think the comparison to the a[i] special-case is very useful, really. Scalar indexing and the wacky one-dimensional indexing thing where a[i] -> a[i, ..] (unless a is one-dimensional) would still be different in general, even aside from the READONLY part, because the one-dimensional indexing thing only applies to one-dimensional indexes. For a 3-d array, a[i, j] gives an error; it's not the same as a[i, j, ...]. And while I understand why numpy does what it does for len() and __getitem__(int) on multi-dimensional arrays (it's to make multi-dimensional arrays act more like list-of-lists), this is IMO a confusing special case that we might be better off without, and in any case shouldn't be used as a guide for how to make the rest of the indexing system work. > In some ways I believe the "scalar-indexing" special case is simpler for > newcomers to understand, and is what people already assume, and that a > "readonly-indexing" special case is more complicated. It's dangerous to > have a library which people only use correctly by accident, so to speak, > it's much better if what people think they see is how things are. This is all true, but current scalars *are* readonly arrays, just weird ones with some limitations and that people don't realize are there. Heck, you can even reshape scalars: In [10]: a = np.float64(0) In [11]: a.reshape((1, 1)) Out[11]: array([[ 0.]]) And resizing is allowed... but silently does nothing: In [12]: a.resize((1, 1)) In [13]: a Out[13]: 0.0 > (With respect to arr[5] returning a good old Python scalar for floats > and ints -- Travis' example from 2002 is division, and at least that > example is much less serious now with the introduction of the // > operator in Python.) 
I thought Travis's example was (in current numpy terms): In [1]: a = np.array([-1.0, 1.0]) # Pretend that np.sum() returns a float, which uses Python's arithmetic: In [2]: 1 / float(np.sum(a)) ZeroDivisionError: float division by zero # It actually returns a numpy scalar, which uses numpy's arithmetic: In [3]: 1 / np.sum(a) /home/njs/.user-python2.7-64bit/bin/ipython:1: RuntimeWarning: divide by zero encountered in double_scalars #!/home/njs/.user-python2.7-64bit/bin/python Out[3]: inf Anyway, you still need to return some sort of special object for anything that's not part of python's type system (structured arrays, custom dtypes like enumerated values, etc.). So returning good-old Python scalars (GOPS?) for floats/ints/bools actually introduces a new special case. >> One could argue about structured datatypes, but maybe then it should be >> a datatype property whether its mutable or not, and even then the >> element should probably be a copy (though I did not check what happens >> here right now). > > Elements from arrays with structured dtypes are already mutable (*and*, > at least until recently, could still be used as dict keys...). This was > discussed on the list a couple of months back I think. Yeah, this is another weird wart we could fix up in the process... -n From charlesr.harris at gmail.com Sun Jan 6 12:53:47 2013 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sun, 6 Jan 2013 10:53:47 -0700 Subject: [Numpy-discussion] Rank-0 arrays - reprise In-Reply-To: References: Message-ID: On Sat, Jan 5, 2013 at 2:31 PM, Nathaniel Smith wrote: > On 5 Jan 2013 12:16, "Matthew Brett" wrote: > > > > Hi, > > > > Following on from Nathaniel's explorations of the scalar - array > > casting rules, some resources on rank-0 arrays. > > > > The discussion that Nathaniel tracked down on "rank-0 arrays"; it also > > makes reference to casting. The rank-0 arrays seem to have been one > > way of solving the problem of maintaining array dtypes other than bool > > / float / int: > > > > > http://mail.scipy.org/pipermail/numpy-discussion/2002-September/001612.html > > > > Quoting from an email from Travis in that thread, replying to an email > > from Tim Hochberg: > > > > > http://mail.scipy.org/pipermail/numpy-discussion/2002-September/001647.html > > > > > > > Frankly, I have no idea what the implimentation details would be, but > > > could we get rid of rank-0 arrays altogether? I have always simply > found > > > them strange and confusing... What are they really neccesary for > > > (besides holding scalar values of different precision that standard > > > Pyton scalars)? > > > > With new coercion rules this becomes a possibility. Arguments against it > > are that special rank-0 arrays behave as more consistent numbers with > the > > rest of Numeric than Python scalars. In other words they have a length > > and a shape and one can right N-dimensional code that works the same even > > when the result is a scalar. > > > > Another advantage of having a Numeric scalar is that we can control the > > behavior of floating point operations better. > > > > e.g. > > > > if only Python scalars were available and sum(a) returned 0, then > > > > 1 / sum(a) would behave as Python behaves (always raises error). > > > > while with our own scalars > > > > 1 / sum(a) could potentially behave however the user wanted. 
> > > > > > There seemed then to be some impetus to remove rank-0 arrays and > > replace them with Python scalar types with the various numpy > > precisions : > > > > > http://mail.scipy.org/pipermail/numpy-discussion/2002-September/013983.html > > > > Travis' recent email hints at something that seems similar, but I > > don't understand what he means: > > > > > http://mail.scipy.org/pipermail/numpy-discussion/2012-December/064795.html > > > > > > Don't create array-scalars. Instead, make the data-type object a > > meta-type object whose instances are the items returned from NumPy > > arrays. There is no need for a separate array-scalar object and in > > fact it's confusing to the type-system. I understand that now. I > > did not understand that 5 years ago. > > > > > > Travis - can you expand? > > Numpy has 3 partially overlapping concepts: > > A) scalars (what Travis calls "array scalars"): Things like "float64", > "int32". These are ordinary Python classes; usually when you subscript > an array, what you get back is an instance of one of these classes: > > In [1]: a = np.array([1, 2, 3]) > > In [2]: a[0] > Out[2]: 1 > > In [3]: type(a[0]) > Out[3]: numpy.int64 > > Note that even though they are called "array scalars", they have > nothing to do with the actual ndarray type -- they are totally > separate objects. > > B) dtypes: These are instances of class np.dtype. For every scalar > type, there is a corresponding dtype object; plus you can create new > dtype objects for things like record arrays (which correspond to > scalars of type "np.void"; I don't really understand how void scalars > work in detail): > While thinking about dtypes I started a post proposing that *all* arrays be considered as special cases of void arrays. A void array is basically a memory indexing construct combined with a view. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From sebastian at sipsolutions.net Sun Jan 6 12:57:21 2013 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Sun, 06 Jan 2013 18:57:21 +0100 Subject: [Numpy-discussion] high dimensional array -> python scalar/index Message-ID: <1357495041.3537.7.camel@sebastian-laptop> Question for everyone, is this really reasonable: >>> import numpy as np >>> from operator import index >>> index(np.array([[5]])) 5 >>> int(np.array([[5]])) 5 >>> [0,1,2,3][np.array([[2]])] 2 To me, this does not make sense, why should we allow to use a high dimensional object like a normal scalar (its ok for 0-d arrays I guess)? Personally I would be for deprecating these usages, even if that (probably) means you cannot reshape your array with a matrix (as it is 2D) ;-): >>> np.arange(10).reshape(np.matrix([5,-1]).T) array([[0, 1], [2, 3], [4, 5], [6, 7], [8, 9]]) From matthew.brett at gmail.com Sun Jan 6 13:16:32 2013 From: matthew.brett at gmail.com (Matthew Brett) Date: Sun, 6 Jan 2013 18:16:32 +0000 Subject: [Numpy-discussion] Rank-0 arrays - reprise In-Reply-To: References: Message-ID: Hi, On Sun, Jan 6, 2013 at 5:53 PM, Charles R Harris wrote: > > > On Sat, Jan 5, 2013 at 2:31 PM, Nathaniel Smith wrote: >> >> On 5 Jan 2013 12:16, "Matthew Brett" wrote: >> > >> > Hi, >> > >> > Following on from Nathaniel's explorations of the scalar - array >> > casting rules, some resources on rank-0 arrays. >> > >> > The discussion that Nathaniel tracked down on "rank-0 arrays"; it also >> > makes reference to casting. 
The rank-0 arrays seem to have been one >> > way of solving the problem of maintaining array dtypes other than bool >> > / float / int: >> > >> > >> > http://mail.scipy.org/pipermail/numpy-discussion/2002-September/001612.html >> > >> > Quoting from an email from Travis in that thread, replying to an email >> > from Tim Hochberg: >> > >> > >> > http://mail.scipy.org/pipermail/numpy-discussion/2002-September/001647.html >> > >> > >> > > Frankly, I have no idea what the implimentation details would be, but >> > > could we get rid of rank-0 arrays altogether? I have always simply >> > > found >> > > them strange and confusing... What are they really neccesary for >> > > (besides holding scalar values of different precision that standard >> > > Pyton scalars)? >> > >> > With new coercion rules this becomes a possibility. Arguments against >> > it >> > are that special rank-0 arrays behave as more consistent numbers with >> > the >> > rest of Numeric than Python scalars. In other words they have a length >> > and a shape and one can right N-dimensional code that works the same >> > even >> > when the result is a scalar. >> > >> > Another advantage of having a Numeric scalar is that we can control the >> > behavior of floating point operations better. >> > >> > e.g. >> > >> > if only Python scalars were available and sum(a) returned 0, then >> > >> > 1 / sum(a) would behave as Python behaves (always raises error). >> > >> > while with our own scalars >> > >> > 1 / sum(a) could potentially behave however the user wanted. >> > >> > >> > There seemed then to be some impetus to remove rank-0 arrays and >> > replace them with Python scalar types with the various numpy >> > precisions : >> > >> > >> > http://mail.scipy.org/pipermail/numpy-discussion/2002-September/013983.html >> > >> > Travis' recent email hints at something that seems similar, but I >> > don't understand what he means: >> > >> > >> > http://mail.scipy.org/pipermail/numpy-discussion/2012-December/064795.html >> > >> > >> > Don't create array-scalars. Instead, make the data-type object a >> > meta-type object whose instances are the items returned from NumPy >> > arrays. There is no need for a separate array-scalar object and in >> > fact it's confusing to the type-system. I understand that now. I >> > did not understand that 5 years ago. >> > >> > >> > Travis - can you expand? >> >> Numpy has 3 partially overlapping concepts: >> >> A) scalars (what Travis calls "array scalars"): Things like "float64", >> "int32". These are ordinary Python classes; usually when you subscript >> an array, what you get back is an instance of one of these classes: >> >> In [1]: a = np.array([1, 2, 3]) >> >> In [2]: a[0] >> Out[2]: 1 >> >> In [3]: type(a[0]) >> Out[3]: numpy.int64 >> >> Note that even though they are called "array scalars", they have >> nothing to do with the actual ndarray type -- they are totally >> separate objects. >> >> B) dtypes: These are instances of class np.dtype. For every scalar >> type, there is a corresponding dtype object; plus you can create new >> dtype objects for things like record arrays (which correspond to >> scalars of type "np.void"; I don't really understand how void scalars >> work in detail): > > > While thinking about dtypes I started a post proposing that *all* arrays be > considered as special cases of void arrays. A void array is basically a > memory indexing construct combined with a view. 
> > I'd be really interested to read that, I'm sure others would too, Cheers, Matthew From josef.pktd at gmail.com Sun Jan 6 13:28:40 2013 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Sun, 6 Jan 2013 13:28:40 -0500 Subject: [Numpy-discussion] high dimensional array -> python scalar/index In-Reply-To: <1357495041.3537.7.camel@sebastian-laptop> References: <1357495041.3537.7.camel@sebastian-laptop> Message-ID: On Sun, Jan 6, 2013 at 12:57 PM, Sebastian Berg wrote: > Question for everyone, is this really reasonable: > >>>> import numpy as np >>>> from operator import index >>>> index(np.array([[5]])) > 5 >>>> int(np.array([[5]])) > 5 >>>> [0,1,2,3][np.array([[2]])] > 2 Not sure I understand the point looks reasonable to my int has an implied squeeze, if it succeeds not so python lists >>> int([[1]]) Traceback (most recent call last): File "", line 1, in TypeError: int() argument must be a string or a number, not 'list' >>> [0,1,2,3][np.array([[2, 2], [0, 1]])] Traceback (most recent call last): File "", line 1, in TypeError: only integer arrays with one element can be converted to an index but we can to more fun things with numpy >>> np.array([0,1,2,3])[np.array([[2, 2], [0, 1]])] array([[2, 2], [0, 1]]) Josef > > To me, this does not make sense, why should we allow to use a high > dimensional object like a normal scalar (its ok for 0-d arrays I guess)? > Personally I would be for deprecating these usages, even if that > (probably) means you cannot reshape your array with a matrix (as it is > 2D) ;-): >>>> np.arange(10).reshape(np.matrix([5,-1]).T) > array([[0, 1], > [2, 3], > [4, 5], > [6, 7], > [8, 9]]) > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From d.s.seljebotn at astro.uio.no Sun Jan 6 13:36:04 2013 From: d.s.seljebotn at astro.uio.no (Dag Sverre Seljebotn) Date: Sun, 06 Jan 2013 19:36:04 +0100 Subject: [Numpy-discussion] Rank-0 arrays - reprise In-Reply-To: References: <50E92EC2.10404@astro.uio.no> <1357465276.12993.20.camel@sebastian-laptop> <50E95369.70602@astro.uio.no> Message-ID: <50E9C414.8010406@astro.uio.no> On 01/06/2013 05:52 PM, Nathaniel Smith wrote: > On Sun, Jan 6, 2013 at 10:35 AM, Dag Sverre Seljebotn > wrote: >> I should have been more precise: I like the proposal, but also believe >> the additional complexity introduced have significant costs that must be >> considered. >> >> a) Making += behave differently for readonly arrays should be >> carefully considered. If I have a 10 GB read-only array, I prefer an >> error to a copy for +=. (One could use an ISSCALAR flag instead that >> only affected +=...) > > Yes, definitely we would need to nail down the exact semantics here. > My feeling is that we should see start by seeing if we can come up > with a set of coherent rules for read-only arrays that does what we > want before we add an ACT_LIKE_OLD_SCALARS flag, but either way is > viable. (Or we could start with a PRETEND_TO_BE_SCALAR flag and then > gradually migrate away from it.) Sounds like a good plan. > >> b) Things seems simpler since "indexing away the last index" is no >> longer a special case, it is always true for a.ndim > 0 that "a[i]" is a >> new array such that >> >> a[i].ndim == a.ndim - 1 >> >> But in exchange, a new special-case is introduced since READONLY is only >> set when ndim becomes 0, so it doesn't really help with the learning >> curve IMO. 
> > Yes, indexing with a scalar (as opposed to slicing or fancy-indexing) > remains a special case just like now. And not just because the result > is read-only -- it also returns a copy, not a view. > > I don't think the comparison to the a[i] special-case is very useful, > really. Scalar indexing and the wacky one-dimensional indexing thing > where a[i] -> a[i, ..] (unless a is one-dimensional) would still be > different in general, even aside from the READONLY part, because the > one-dimensional indexing thing only applies to one-dimensional > indexes. For a 3-d array, > a[i, j] > gives an error; it's not the same as a[i, j, ...]. And while I > understand why numpy does what it does for len() and __getitem__(int) > on multi-dimensional arrays (it's to make multi-dimensional arrays act > more like list-of-lists), this is IMO a confusing special case that we > might be better off without, and in any case shouldn't be used as a > guide for how to make the rest of the indexing system work. Removing the single-index special case would be great. I see people doing stuff like a[i][j][k] all the time, just because that's what they tried first when they came to NumPy and then the habit sticks for years. OTOH, that means that it might have to stay for backwards compatability reasons. Dag Sverre From sebastian at sipsolutions.net Sun Jan 6 13:56:22 2013 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Sun, 06 Jan 2013 19:56:22 +0100 Subject: [Numpy-discussion] high dimensional array -> python scalar/index In-Reply-To: References: <1357495041.3537.7.camel@sebastian-laptop> Message-ID: <1357498582.3537.18.camel@sebastian-laptop> On Sun, 2013-01-06 at 13:28 -0500, josef.pktd at gmail.com wrote: > On Sun, Jan 6, 2013 at 12:57 PM, Sebastian Berg > wrote: > > Question for everyone, is this really reasonable: > > > >>>> import numpy as np > >>>> from operator import index > >>>> index(np.array([[5]])) > > 5 > >>>> int(np.array([[5]])) > > 5 > >>>> [0,1,2,3][np.array([[2]])] > > 2 > > Not sure I understand the point > > looks reasonable to my > > int has an implied squeeze, if it succeeds > Exactly *why* should it have an implied squeeze? Note I agree, the int(np.array([3])) is OK, since also int('10') works, however for index I think it is not OK, you simply cannot do list['10']. > not so python lists > > >>> int([[1]]) > Traceback (most recent call last): > File "", line 1, in > TypeError: int() argument must be a string or a number, not 'list' Exactly, so why should numpy be much more forgiving? > > >>> [0,1,2,3][np.array([[2, 2], [0, 1]])] > Traceback (most recent call last): > File "", line 1, in > TypeError: only integer arrays with one element can be converted to an index > > > but we can to more fun things with numpy > > >>> np.array([0,1,2,3])[np.array([[2, 2], [0, 1]])] > array([[2, 2], > [0, 1]]) > Of course... But if you compare to lists, thats actually a point why index should fail: >>> np.array([0,1,2,3])[np.array([[3]])] is very different from: >>> [0,1,2,3][np.array([[3]])] and in my opinion there is no reason why the latter should not simply fail. > Josef > > > > > To me, this does not make sense, why should we allow to use a high > > dimensional object like a normal scalar (its ok for 0-d arrays I guess)? 
> > Personally I would be for deprecating these usages, even if that > > (probably) means you cannot reshape your array with a matrix (as it is > > 2D) ;-): > >>>> np.arange(10).reshape(np.matrix([5,-1]).T) > > array([[0, 1], > > [2, 3], > > [4, 5], > > [6, 7], > > [8, 9]]) > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > http://mail.scipy.org/mailman/listinfo/numpy-discussion > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From charlesr.harris at gmail.com Sun Jan 6 17:36:43 2013 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sun, 6 Jan 2013 15:36:43 -0700 Subject: [Numpy-discussion] Remove support for numeric and numarray in 1.8 In-Reply-To: References: Message-ID: On Sun, Jan 6, 2013 at 8:09 AM, Charles R Harris wrote: > > > On Sun, Jan 6, 2013 at 7:42 AM, Nathaniel Smith wrote: > >> On Sun, Jan 6, 2013 at 2:38 AM, Charles R Harris >> wrote: >> > Thoughts? >> >> To be clear, what you're talking about is basically deleting these two >> packages: >> numpy.oldnumeric >> numpy.numarray >> plus the compatibility C API in >> numpy/numarray/include >> ? >> >> > Yep. > > >> So this would only affect Python code which explicitly imported one of >> those two packages (neither is imported by default), or C code which >> did #include "numpy/numarray/..."? >> >> > Those packages were intended to be an easy path for folks to port their > numeric and numarray code to numpy. During the 2.4 discussion there was a > fellow who said his group was just now moving their code from numeric to > numpy, but I had the feeling they were rewriting it in the process. > > >> (I'm not even sure how you would build such a C module, these headers >> are distributed in a weird directory not accessible via >> np.get_include(). So unless your build system does some special work >> to access it, you can't even see these headers.) >> >> > Never tried it myself. There is some C code in those packages and it easy > to overlook its maintenance, so I'd like to solve the problem by nuking it. > > Oops. The proposal is to only remove numarray support. The functions in oldnumeric have been taken over into numpy and we need to keep them. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris.barker at noaa.gov Sun Jan 6 17:40:28 2013 From: chris.barker at noaa.gov (Chris Barker - NOAA Federal) Date: Sun, 6 Jan 2013 14:40:28 -0800 Subject: [Numpy-discussion] Which Python to use for Mac binaries In-Reply-To: References: Message-ID: On Sun, Jan 6, 2013 at 2:04 AM, Ralf Gommers wrote: >> Which exact Python do we need to use on Mac? Do we need to use the >> binary installer from python.org? > > Yes, the one from python.org. > >> Or can I install it from source? you could install from source using the same method that the python.org binaries are built -- there is a script with the source to do that, though I'm not sure what the point of that would be. > The 10.3 installers for 2.5, 2.6 and 2.7 should be compiled on OS X 10.5. It would be great to continue support for that, though I wonder how many people still need it -- I don't think Apple supports 10.5 anymore, for instance. > The 10.7 --> 10.6 support hasn't been checked, but I wouldn't trust it. I > have a 10.6 machine, so I can compile those binaries if needed. 
That would be better, but it would also be nice to check how building on 10.7 works. > Avoid using system Python for anything. The first thing to do on any new OS > X system is install Python some other way, preferably from python.org. +1 > Last note: bdist_mpkg is unmaintained and doesn't support Python 3.x. Most > recent version is at: https://github.com/matthew-brett/bdist_mpkg, for > previous versions numpy releases I've used that at commit e81a58a471 There has been recent discussion on the pythonmac list about this -- some waffling about how important it is -- though I think it would be good to keep it up to date. > If we want 3.x binaries, then we should fix that or (preferably) build > binaries with Bento. Bento has grown support for mpkg's; I'm not sure how > robust that is. So maybe bento is a better route than bdist_mpkg -- this is worth discussion on teh pythonmac list. -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From njs at pobox.com Sun Jan 6 19:07:21 2013 From: njs at pobox.com (Nathaniel Smith) Date: Mon, 7 Jan 2013 00:07:21 +0000 Subject: [Numpy-discussion] Remove support for numeric and numarray in 1.8 In-Reply-To: References: Message-ID: On Sun, Jan 6, 2013 at 10:36 PM, Charles R Harris wrote: > > > On Sun, Jan 6, 2013 at 8:09 AM, Charles R Harris > wrote: >> >> >> >> On Sun, Jan 6, 2013 at 7:42 AM, Nathaniel Smith wrote: >>> >>> On Sun, Jan 6, 2013 at 2:38 AM, Charles R Harris >>> wrote: >>> > Thoughts? >>> >>> To be clear, what you're talking about is basically deleting these two >>> packages: >>> numpy.oldnumeric >>> numpy.numarray >>> plus the compatibility C API in >>> numpy/numarray/include >>> ? >>> >> >> Yep. >> >>> >>> So this would only affect Python code which explicitly imported one of >>> those two packages (neither is imported by default), or C code which >>> did #include "numpy/numarray/..."? >>> >> >> Those packages were intended to be an easy path for folks to port their >> numeric and numarray code to numpy. During the 2.4 discussion there was a >> fellow who said his group was just now moving their code from numeric to >> numpy, but I had the feeling they were rewriting it in the process. >> >>> >>> (I'm not even sure how you would build such a C module, these headers >>> are distributed in a weird directory not accessible via >>> np.get_include(). So unless your build system does some special work >>> to access it, you can't even see these headers.) >>> >> >> Never tried it myself. There is some C code in those packages and it easy >> to overlook its maintenance, so I'd like to solve the problem by nuking it. >> > > Oops. The proposal is to only remove numarray support. The functions in > oldnumeric have been taken over into numpy and we need to keep them. ...huh? The package name is mentioned nowhere in the numpy sources... 
~/src/numpy/numpy$ find -type f | grep -Ev '^\./(numarray|oldnumeric)/' | xargs grep oldnumeric ./setupscons.py: config.add_subpackage('oldnumeric') ./bento.info: oldnumeric, ./core/setup.py: join('include', 'numpy', 'oldnumeric.h'), ./setup.py: config.add_subpackage('oldnumeric') ...and it's not even available unless a user explicitly does 'import numpy.oldnumeric' or 'import numpy.numarray', so no-one's using this stuff without knowing it: In [2]: import numpy In [3]: [m for m in sys.modules if m.startswith("numpy.oldnumeric")] Out[3]: [] -n From charlesr.harris at gmail.com Sun Jan 6 19:42:30 2013 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sun, 6 Jan 2013 17:42:30 -0700 Subject: [Numpy-discussion] Remove support for numeric and numarray in 1.8 In-Reply-To: References: Message-ID: On Sun, Jan 6, 2013 at 5:07 PM, Nathaniel Smith wrote: > On Sun, Jan 6, 2013 at 10:36 PM, Charles R Harris > wrote: > > > > > > On Sun, Jan 6, 2013 at 8:09 AM, Charles R Harris < > charlesr.harris at gmail.com> > > wrote: > >> > >> > >> > >> On Sun, Jan 6, 2013 at 7:42 AM, Nathaniel Smith wrote: > >>> > >>> On Sun, Jan 6, 2013 at 2:38 AM, Charles R Harris > >>> wrote: > >>> > Thoughts? > >>> > >>> To be clear, what you're talking about is basically deleting these two > >>> packages: > >>> numpy.oldnumeric > >>> numpy.numarray > >>> plus the compatibility C API in > >>> numpy/numarray/include > >>> ? > >>> > >> > >> Yep. > >> > >>> > >>> So this would only affect Python code which explicitly imported one of > >>> those two packages (neither is imported by default), or C code which > >>> did #include "numpy/numarray/..."? > >>> > >> > >> Those packages were intended to be an easy path for folks to port their > >> numeric and numarray code to numpy. During the 2.4 discussion there was > a > >> fellow who said his group was just now moving their code from numeric to > >> numpy, but I had the feeling they were rewriting it in the process. > >> > >>> > >>> (I'm not even sure how you would build such a C module, these headers > >>> are distributed in a weird directory not accessible via > >>> np.get_include(). So unless your build system does some special work > >>> to access it, you can't even see these headers.) > >>> > >> > >> Never tried it myself. There is some C code in those packages and it > easy > >> to overlook its maintenance, so I'd like to solve the problem by nuking > it. > >> > > > > Oops. The proposal is to only remove numarray support. The functions in > > oldnumeric have been taken over into numpy and we need to keep them. > > ...huh? The package name is mentioned nowhere in the numpy sources... > > ~/src/numpy/numpy$ find -type f | grep -Ev > '^\./(numarray|oldnumeric)/' | xargs grep oldnumeric > ./setupscons.py: config.add_subpackage('oldnumeric') > ./bento.info: oldnumeric, > ./core/setup.py: join('include', 'numpy', 'oldnumeric.h'), > ./setup.py: config.add_subpackage('oldnumeric') > > ...and it's not even available unless a user explicitly does 'import > numpy.oldnumeric' or 'import numpy.numarray', so no-one's using this > stuff without knowing it: > > In [2]: import numpy > > In [3]: [m for m in sys.modules if m.startswith("numpy.oldnumeric")] > Out[3]: [] > > Right. I mistakenly looked at numpy/core/fromoldnumeric.py. So yes, the oldnumeric directory and include also. Chuck -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From shish at keba.be Sun Jan 6 20:43:42 2013 From: shish at keba.be (Olivier Delalleau) Date: Sun, 6 Jan 2013 20:43:42 -0500 Subject: [Numpy-discussion] Do we want scalar casting to behave as it does at the moment? In-Reply-To: References: <50E61E29.1020709@astro.uio.no> Message-ID: 2013/1/5 Nathaniel Smith : > On Fri, Jan 4, 2013 at 5:25 PM, Andrew Collette > wrote: >> I agree the current behavior is confusing. Regardless of the details >> of what to do, I suppose my main objection is that, to me, it's really >> unexpected that adding a number to an array could result in an >> exception. > > I think the main objection to the 1.5 behaviour was that it violated > "Errors should never pass silently." (from 'import this'). Granted > there are tons of places where numpy violates this but this is the one > we're thinking about right now... > > Okay, here's another idea I'll throw out, maybe it's a good compromise: > > 1) We go back to the 1.5 behaviour. > > 2) If this produces a rollover/overflow/etc., we signal that using the > standard mechanisms (whatever is configured via np.seterr). So by > default things like > np.maximum(np.array([1, 2, 3], dtype=uint8), 256) > would succeed (and produce [1, 2, 3] with dtype uint8), but also issue > a warning that 256 had rolled over to become 0. Alternatively those > who want to be paranoid could call np.seterr(overflow="raise") and > then it would be an error. That'd work for me as well. Although I'm not sure about the name "overflow", it sounds generic enough that it may be associated to many different situations. If I want to have an error but only for this very specific scenario (an "unsafe" cast in a mixed scalar/array operation), would that be possible? Also, do we all agree that "float32 array + float64 scalar" should cast the scalar to float32 (thus resulting in a float32 array as output) without warning, even if the scalar can't be represented exactly in float32? -=- Olivier From njs at pobox.com Sun Jan 6 21:01:07 2013 From: njs at pobox.com (Nathaniel Smith) Date: Mon, 7 Jan 2013 02:01:07 +0000 Subject: [Numpy-discussion] Do we want scalar casting to behave as it does at the moment? In-Reply-To: References: <50E61E29.1020709@astro.uio.no> Message-ID: On Mon, Jan 7, 2013 at 1:43 AM, Olivier Delalleau wrote: > 2013/1/5 Nathaniel Smith : >> On Fri, Jan 4, 2013 at 5:25 PM, Andrew Collette >> wrote: >>> I agree the current behavior is confusing. Regardless of the details >>> of what to do, I suppose my main objection is that, to me, it's really >>> unexpected that adding a number to an array could result in an >>> exception. >> >> I think the main objection to the 1.5 behaviour was that it violated >> "Errors should never pass silently." (from 'import this'). Granted >> there are tons of places where numpy violates this but this is the one >> we're thinking about right now... >> >> Okay, here's another idea I'll throw out, maybe it's a good compromise: >> >> 1) We go back to the 1.5 behaviour. >> >> 2) If this produces a rollover/overflow/etc., we signal that using the >> standard mechanisms (whatever is configured via np.seterr). So by >> default things like >> np.maximum(np.array([1, 2, 3], dtype=uint8), 256) >> would succeed (and produce [1, 2, 3] with dtype uint8), but also issue >> a warning that 256 had rolled over to become 0. Alternatively those >> who want to be paranoid could call np.seterr(overflow="raise") and >> then it would be an error. > > That'd work for me as well. 
Although I'm not sure about the name > "overflow", it sounds generic enough that it may be associated to many > different situations. If I want to have an error but only for this > very specific scenario (an "unsafe" cast in a mixed scalar/array > operation), would that be possible? I suggested "overflow" because that's how we signal rollover in general right now: In [5]: np.int8(100) * np.int8(2) /home/njs/.user-python2.7-64bit/bin/ipython:1: RuntimeWarning: overflow encountered in byte_scalars #!/home/njs/.user-python2.7-64bit/bin/python Out[5]: -56 Two caveats on this: One, right now this is only implemented for scalars, not arrays -- which is bug #593 -- and two, I actually agree (?) that integer rollover and float overflow are different things we should probably add a new category to np.seterr() for integer rollover specifically. But the proposal here is that we not add a specific category for "unsafe cast" (which we would then have to define!), but instead just signal it using the standard mechanisms for the particular kind of corruption that happened. (Which right now is overflow, and might become something else later.) > Also, do we all agree that "float32 array + float64 scalar" should > cast the scalar to float32 (thus resulting in a float32 array as > output) without warning, even if the scalar can't be represented > exactly in float32? I guess for consistency, if this proposal is adopted then a float64 which ends up getting cast to 'inf' or 0.0 should trigger an overflow or underflow warning respectively... e.g.: In [12]: np.float64(1e300) Out[12]: 1.0000000000000001e+300 In [13]: np.float32(_12) Out[13]: inf ...but otherwise I think yes we agree. -n From shish at keba.be Sun Jan 6 21:17:14 2013 From: shish at keba.be (Olivier Delalleau) Date: Sun, 6 Jan 2013 21:17:14 -0500 Subject: [Numpy-discussion] Do we want scalar casting to behave as it does at the moment? In-Reply-To: References: <50E61E29.1020709@astro.uio.no> Message-ID: 2013/1/6 Nathaniel Smith : > On Mon, Jan 7, 2013 at 1:43 AM, Olivier Delalleau wrote: >> 2013/1/5 Nathaniel Smith : >>> On Fri, Jan 4, 2013 at 5:25 PM, Andrew Collette >>> wrote: >>>> I agree the current behavior is confusing. Regardless of the details >>>> of what to do, I suppose my main objection is that, to me, it's really >>>> unexpected that adding a number to an array could result in an >>>> exception. >>> >>> I think the main objection to the 1.5 behaviour was that it violated >>> "Errors should never pass silently." (from 'import this'). Granted >>> there are tons of places where numpy violates this but this is the one >>> we're thinking about right now... >>> >>> Okay, here's another idea I'll throw out, maybe it's a good compromise: >>> >>> 1) We go back to the 1.5 behaviour. >>> >>> 2) If this produces a rollover/overflow/etc., we signal that using the >>> standard mechanisms (whatever is configured via np.seterr). So by >>> default things like >>> np.maximum(np.array([1, 2, 3], dtype=uint8), 256) >>> would succeed (and produce [1, 2, 3] with dtype uint8), but also issue >>> a warning that 256 had rolled over to become 0. Alternatively those >>> who want to be paranoid could call np.seterr(overflow="raise") and >>> then it would be an error. >> >> That'd work for me as well. Although I'm not sure about the name >> "overflow", it sounds generic enough that it may be associated to many >> different situations. 
If I want to have an error but only for this >> very specific scenario (an "unsafe" cast in a mixed scalar/array >> operation), would that be possible? > > I suggested "overflow" because that's how we signal rollover in > general right now: > > In [5]: np.int8(100) * np.int8(2) > /home/njs/.user-python2.7-64bit/bin/ipython:1: RuntimeWarning: > overflow encountered in byte_scalars > #!/home/njs/.user-python2.7-64bit/bin/python > Out[5]: -56 > > Two caveats on this: One, right now this is only implemented for > scalars, not arrays -- which is bug #593 -- and two, I actually agree > (?) that integer rollover and float overflow are different things we > should probably add a new category to np.seterr() for integer rollover > specifically. > > But the proposal here is that we not add a specific category for > "unsafe cast" (which we would then have to define!), but instead just > signal it using the standard mechanisms for the particular kind of > corruption that happened. (Which right now is overflow, and might > become something else later.) Hehe, I didn't even know there was supposed to be a warning for arrays... Ok. But I'm not convinced that re-using the "overflow" category is a good idea, because to me the overflow is typically associated to the result of an operation (when it goes beyond the dtype's supported range), while here the problem is with the unsafe cast an input (even if it makes no difference for addition, it does for some other ufuncs). I may also want to have different error settings for operation overflow vs. input overflow. It may just be me though... let's see what others think about it. > >> Also, do we all agree that "float32 array + float64 scalar" should >> cast the scalar to float32 (thus resulting in a float32 array as >> output) without warning, even if the scalar can't be represented >> exactly in float32? > > I guess for consistency, if this proposal is adopted then a float64 > which ends up getting cast to 'inf' or 0.0 should trigger an overflow > or underflow warning respectively... e.g.: > > In [12]: np.float64(1e300) > Out[12]: 1.0000000000000001e+300 > > In [13]: np.float32(_12) > Out[13]: inf > > ...but otherwise I think yes we agree. Sounds good to me. -=- Olivier From raul at virtualmaterials.com Sun Jan 6 23:15:24 2013 From: raul at virtualmaterials.com (Raul Cota) Date: Sun, 06 Jan 2013 21:15:24 -0700 Subject: [Numpy-discussion] Remove support for numeric and numarray in 1.8 In-Reply-To: References: Message-ID: <50EA4BDC.2060108@virtualmaterials.com> An HTML attachment was scrubbed... URL: From matthew.brett at gmail.com Mon Jan 7 06:37:09 2013 From: matthew.brett at gmail.com (Matthew Brett) Date: Mon, 7 Jan 2013 11:37:09 +0000 Subject: [Numpy-discussion] Do we want scalar casting to behave as it does at the moment? In-Reply-To: References: <50E61E29.1020709@astro.uio.no> Message-ID: Hi, On Fri, Jan 4, 2013 at 5:25 PM, Andrew Collette wrote: > I agree the current behavior is confusing. Regardless of the details > of what to do, I suppose my main objection is that, to me, it's really > unexpected that adding a number to an array could result in an > exception. I realized when I thought about it, that I did not have a clear idea of your exact use case. How does the user specify the thing to add, and why do you need to avoid an error in the case that adding would overflow the type? Would you mind giving an idiot-level explanation? 
Best, Matthew From matthew.brett at gmail.com Mon Jan 7 08:31:59 2013 From: matthew.brett at gmail.com (Matthew Brett) Date: Mon, 7 Jan 2013 13:31:59 +0000 Subject: [Numpy-discussion] Which Python to use for Mac binaries In-Reply-To: References: Message-ID: Hi, On Sun, Jan 6, 2013 at 10:40 PM, Chris Barker - NOAA Federal wrote: > On Sun, Jan 6, 2013 at 2:04 AM, Ralf Gommers wrote: >>> Which exact Python do we need to use on Mac? Do we need to use the >>> binary installer from python.org? >> >> Yes, the one from python.org. >> >>> Or can I install it from source? > > you could install from source using the same method that the > python.org binaries are built -- there is a script with the source to > do that, though I'm not sure what the point of that would be. > >> The 10.3 installers for 2.5, 2.6 and 2.7 should be compiled on OS X 10.5. > > It would be great to continue support for that, though I wonder how > many people still need it -- I don't think Apple supports 10.5 > anymore, for instance. > >> The 10.7 --> 10.6 support hasn't been checked, but I wouldn't trust it. I >> have a 10.6 machine, so I can compile those binaries if needed. > > That would be better, but it would also be nice to check how building > on 10.7 works. > >> Avoid using system Python for anything. The first thing to do on any new OS >> X system is install Python some other way, preferably from python.org. > > +1 > >> Last note: bdist_mpkg is unmaintained and doesn't support Python 3.x. Most >> recent version is at: https://github.com/matthew-brett/bdist_mpkg, for >> previous versions numpy releases I've used that at commit e81a58a471 > > There has been recent discussion on the pythonmac list about this -- > some waffling about how important it is -- though I think it would be > good to keep it up to date. I updated my fork of bdist_mpkg with Python 3k support. It doesn't have any tests that I could see, but I've run it on python 2.6 and 3.2 and 3.3 on one of my packages as a first pass. >> If we want 3.x binaries, then we should fix that or (preferably) build >> binaries with Bento. Bento has grown support for mpkg's; I'm not sure how >> robust that is. > > So maybe bento is a better route than bdist_mpkg -- this is worth > discussion on teh pythonmac list. David - can you give a status update on that? Cheers, Matthew From charlesr.harris at gmail.com Mon Jan 7 11:22:51 2013 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 7 Jan 2013 09:22:51 -0700 Subject: [Numpy-discussion] Remove support for numeric and numarray in 1.8 In-Reply-To: <50EA4BDC.2060108@virtualmaterials.com> References: <50EA4BDC.2060108@virtualmaterials.com> Message-ID: On Sun, Jan 6, 2013 at 9:15 PM, Raul Cota wrote: > I realize we may be a minority but it would be very nice if support for > numeric could be kept for a few more versions. We don't have any particular > needs for numarray. > > We just under went through an extremely long and painful process to > upgrade our software from Numeric to numpy and everything hinges on the > "oldnumeric" stuff. This was the classical 80-20 scenario where we got most > of the stuff to work in a just a few days and then we had to revisit > several areas of our software to iron out all the bugs and subtle but > meaningful differences between numpy and Numeric. The last round of work > was related to speed. We still have not released the upgrade to our > costumers therefore we expect still a few more bugs to surface. 
> > Bottom line, we are still about one or two years away from changing all > our imports to numpy. Yes, I know, we move fairly slowly but that is our > reality. > > Good to know. Have you tested oldnumeric in the upcoming 1.7 release? Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From andrew.collette at gmail.com Mon Jan 7 11:33:48 2013 From: andrew.collette at gmail.com (Andrew Collette) Date: Mon, 7 Jan 2013 09:33:48 -0700 Subject: [Numpy-discussion] Do we want scalar casting to behave as it does at the moment? In-Reply-To: References: <50E61E29.1020709@astro.uio.no> Message-ID: Hi Matthew, > I realized when I thought about it, that I did not have a clear idea > of your exact use case. How does the user specify the thing to add, > and why do you need to avoid an error in the case that adding would > overflow the type? Would you mind giving an idiot-level explanation? There isn't a specific use case I had in mind... from a developer's perspective, what bothers me about the proposed behavior is that every use of "+" on user-generated input becomes a time bomb. Since h5py deals with user-generated files, I have to deal with all kinds of dtypes, including low-precision ones like int8/uint8. They come from user-supplied function and methods arguments, sure, but also from datasets in files; attributes; virtually everywhere. I suppose what I'm really asking is that numpy provides (continues to provide) a default rule in this situation, as does every other scientific language I've used. One reason to avoid a ValueError in favor of default behavior (in addition to the large amount of work required to check every use of "+") is so there's an established behavior users know to expect. For example, one feature we're thinking of implementing involves adding an offset to a dataset when it's read. Should we roll over? Upcast? It seems to me there's great value in being able to say "We do what numpy does." If numpy doesn't answer the question, everybody makes up their own rules. There are certainly cases where the answer is obvious to the application: you have a huge number of int8's and don't want to upcast. Or you don't want to lose precision. But if numpy provides a default rule, nobody is prevented from making careful choices based on their application's requirements, and there's the additional value of having an common, documented default behavior. Andrew From andrew.collette at gmail.com Mon Jan 7 11:38:44 2013 From: andrew.collette at gmail.com (Andrew Collette) Date: Mon, 7 Jan 2013 09:38:44 -0700 Subject: [Numpy-discussion] ANN: HDF5 for Python (h5py) 2.1.1 Message-ID: Announcing HDF5 for Python (h5py) 2.1.1 ======================================= HDF5 for Python 2.1.1 is now available! This bugfix release also marks a number of changes for the h5py project intended to make the development process more responsive, including a move to GitHub and a switch to a rapid release model. Development has moved over to GitHub at http://github.com/h5py/h5py. We welcome bug reports and pull requests from anyone interested in contributing. Releases will now be made every 4-6 weeks, in order to get bugfixes and new features out to users quickly while still leaving time for testing. * New main website: http://www.h5py.org * Mailing list: http://groups.google.com/group/h5py What is h5py? ============= The h5py package is a Pythonic interface to the HDF5 binary data format. It lets you store huge amounts of numerical data, and easily manipulate that data from NumPy. 
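To give a flavour of the API, a minimal session looks roughly like this (the file and dataset names here are just examples):

>>> import numpy as np
>>> import h5py
>>> f = h5py.File('example.hdf5', 'w')
>>> dset = f.create_dataset('mydata', data=np.arange(100))
>>> dset[10:20]
array([10, 11, 12, 13, 14, 15, 16, 17, 18, 19])
>>> dset.shape
(100,)
>>> f.close()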
For example, you can slice into multi-terabyte datasets stored on disk, as if they were real NumPy arrays. Thousands of datasets can be stored in a single file, categorized and tagged however you want. H5py uses straightforward NumPy and Python metaphors, like dictionary and NumPy array syntax. For example, you can iterate over datasets in a file, or check out the .shape or .dtype attributes of datasets. You don't need to know anything special about HDF5 to get started. In addition to the easy-to-use high level interface, h5py rests on a object-oriented Cython wrapping of the HDF5 C API. Almost anything you can do from C in HDF5, you can do from h5py. Best of all, the files you create are in a widely-used standard binary format, which you can exchange with other people, including those who use programs like IDL and MATLAB. What's new in 2.1.1? ==================== This is a bugfix release. The most substantial changes were: * Fixed a memory leak related to variable-length strings (Thanks to Luke Campbell for extensive testing and bug reports) * Fixed a threading deadlock related to the use of H5Aiterate * Fixed a double INCREF memory leak affecting Unicode variable-length strings * Fixed an exception when taking the repr() of objects with non-ASCII names From raul at virtualmaterials.com Mon Jan 7 11:47:26 2013 From: raul at virtualmaterials.com (Raul Cota) Date: Mon, 07 Jan 2013 09:47:26 -0700 Subject: [Numpy-discussion] Remove support for numeric and numarray in 1.8 In-Reply-To: References: <50EA4BDC.2060108@virtualmaterials.com> Message-ID: <50EAFC1E.1010608@virtualmaterials.com> An HTML attachment was scrubbed... URL: From chris.barker at noaa.gov Mon Jan 7 12:48:24 2013 From: chris.barker at noaa.gov (Chris Barker - NOAA Federal) Date: Mon, 7 Jan 2013 09:48:24 -0800 Subject: [Numpy-discussion] Which Python to use for Mac binaries In-Reply-To: References: Message-ID: On Mon, Jan 7, 2013 at 5:31 AM, Matthew Brett wrote: > I updated my fork of bdist_mpkg with Python 3k support. It doesn't > have any tests that I could see, but I've run it on python 2.6 and 3.2 > and 3.3 on one of my packages as a first pass. Have you been in communication with Ronald Oussoren about this? I'm sure he'd be interested in bringing into the "official"repository. -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From chris.barker at noaa.gov Mon Jan 7 13:08:03 2013 From: chris.barker at noaa.gov (Chris Barker - NOAA Federal) Date: Mon, 7 Jan 2013 10:08:03 -0800 Subject: [Numpy-discussion] Insights / lessons learned from NumPy design In-Reply-To: References: Message-ID: On Thu, Jan 3, 2013 at 10:29 PM, Mike Anderson wrote: > In the Clojure community there has been some discussion about creating a > common matrix maths library / API. Currently there are a few different > fledgeling matrix libraries in Clojure, so it seemed like a worthwhile > effort to unify them and have a common base on which to build on. > > NumPy has been something of an inspiration for this, so I though I'd ask > here to see what lessons have been learned. A few thoughts: > We're thinking of a matrix library First -- is this a "matrix" library, or a general use nd-array library? That will drive your design a great deal. For my part, I came from MATLAB, which started our very focused on matrixes, then extended to be more generally useful. 
Personally, I found the matrix-focus to get in the way more than help -- in any "real" code, the actual matrix operations are likely to be a tiny fraction of the code. One reason I like numpy is that it is array-first, with secondary support for matrix stuff. That being said, there is the numpy matrix type, and there are those that find it very useful, particularly in teaching situations, though it feels a bit "tacked-on", and that does get in the way, so if you want a "real" matrix object, but also a general purpose array lib, thinking about both up front will be helpful. > - Support for multi-dimensional matrices (but with fast paths for 1D vectors > and 2D matrices as the common cases) what is a multi-dimensional matrix? -- is a 3-d something, a stack of matrixes? or something else? (note, numpy lacks this kind of object, but it is sometimes asked for -- i.e. a way to do fast matrix multiplication with a lot of small matrixes) I think fast paths for 1-D and 2-D are secondary, though you may want "easy paths" for those. In particular, if you want good support for linear algebra (matrixes), then having a clean and natural "row vector" and "column vector" would be nice. See the archives of this list for a bunch of discussion about that -- and what the weaknesses are of the numpy matrix object. > - Immutability by default, i.e. matrix operations are pure functions that > create new matrices. I'd be careful about this -- the purity and predictability is nice, but these days a lot of time is spent allocating and moving memory around -- numpy array's mutability is a key feature -- indeed, the key issues with performance with numpy surround the fact that many copies may be made unnecessarily (note, Dag's suggestion of lazy evaluation may mitigate this to some extent). > - Support for 64-bit double precision floats only (this is the standard > float type in Clojure) not a bad start, but another major strength of numpy is the multiple data types - you may want to design that concept in from the start. > - Ability to support multiple different back-end matrix implementations > (JBLAS, Colt, EJML, Vectorz, javax.vecmath etc.) This ties in to another major strength of numpy -- ndarrays are both powerful python objects, and wrappers around standard C arrays -- that makes it pretty darn easy to interface with external libraries for core computation. HTH, -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From rhl at astro.princeton.edu Mon Jan 7 13:19:00 2013 From: rhl at astro.princeton.edu (Robert Lupton the Good) Date: Mon, 7 Jan 2013 13:19:00 -0500 Subject: [Numpy-discussion] Which Python to use for Mac binaries In-Reply-To: References: Message-ID: <35A75EE5-AE01-4144-B7CB-54D492077E21@astro.princeton.edu> I am sympathetic with this attitude ("Avoid using system Python for anything"), but I don't think it's the right one. For example, the project I'm working on (HSC/LSST for astrofolk) is using python/C++ for astronomical imaging, and we expect to have the code running on a significant number of end-user laptops. If the instructions start out with "0. Install a new version of python", it's a significant barrier. What if they've already installed other packages into the system python? Python is a central part of modern operating systems, and people should not have to manage two versions of python to use numpy.
It's tempting to say, "First install g++ 4.7 so we can use C++11 features" it's simply not viable, and I think that saying, "first install a new python" is comparable. Yes, I know that you can have more than one python/compiler suite installed simultaneously, but that's not something for casual users to have to get involved in. R > On Sun, Jan 6, 2013 at 2:04 AM, Ralf Gommers wrote: >>> Which exact Python do we need to use on Mac? Do we need to use the >>> binary installer from python.org? ... > >> Avoid using system Python for anything. The first thing to do on any new OS >> X system is install Python some other way, preferably from python.org. > > +1 From chris.barker at noaa.gov Mon Jan 7 13:35:28 2013 From: chris.barker at noaa.gov (Chris Barker - NOAA Federal) Date: Mon, 7 Jan 2013 10:35:28 -0800 Subject: [Numpy-discussion] Which Python to use for Mac binaries In-Reply-To: <35A75EE5-AE01-4144-B7CB-54D492077E21@astro.princeton.edu> References: <35A75EE5-AE01-4144-B7CB-54D492077E21@astro.princeton.edu> Message-ID: On Mon, Jan 7, 2013 at 10:19 AM, Robert Lupton the Good wrote: > I am sympathetic with this attitude ("Avoid using system Python for anything"), but I don't think it's the right one. For example, the project I'm working on (HSC/LSST for astrofolk) is using python/C++ for astronomical imaging, and we expect to have the code running on a significant number of end-user laptops. If the instructions start out with: > 0. Install a new version of python > it's a significant barrier. What if they've already involved other packages into the system python? What if they've already installed other packages into the python.org python? or fink? macports? or homebrew? or build-your own? Unfortunately, python on the Mac is a bit of a mess--there are WAY too many ways to get Python working. It's great that Apple provides python as part of the system, but unfortunately: * Apple has NEVER upgraded python within a OS-X version. * Apple includes proprietary code with their build, so you are not allowed to re-distribute it (i.e. py2app). As a result, the MacPython community can not declare that Apple's build is the primary one we want to support. This has been hashed out a bunch on the PythonMac list, and there was more or less consensus that the python.org builds would be the ones that we as a community try to support with binaries, etc. All that being said, for the most part, the Apple builds and python.org builds are compatible. Robin Dunn has worked out a way to build installers for wxPython that work with both the python.org and Apple builds -- putting everything in /usr/local, and *.pth files in both of the python builds -- it's a hack, but it works, that may be an approach worth taking. It also wouldn't be that hard to build a duplicate set of binaries, but it does get to be a lot for users to figure out what they need. > Yes, I know that you can have more than one python/compiler suite installed simultaneously, but that's not something for casual users to have to get involved in. Installing a binary from python.org is not much of a challenge for anyone that is installing anything, actually. It's true that it's a pain for users to get a system all set up, then find out that to use the numpy binaries (or anything else...) they need to start over with a new python -- that's why we in the MacPython community encourage everyone to build binaries for the python.org builds -- standards are good, but standardizing on the Apple builds isn't viable. 
(NOTE: the "decision" was made a few years back -- it may be worth re-visiting, but I'm pretty sure that Apple's build is still not suitable as the default choice) -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From matthew.brett at gmail.com Mon Jan 7 15:01:06 2013 From: matthew.brett at gmail.com (Matthew Brett) Date: Mon, 7 Jan 2013 20:01:06 +0000 Subject: [Numpy-discussion] Which Python to use for Mac binaries In-Reply-To: References: Message-ID: Hi, On Mon, Jan 7, 2013 at 5:48 PM, Chris Barker - NOAA Federal wrote: > On Mon, Jan 7, 2013 at 5:31 AM, Matthew Brett wrote: > >> I updated my fork of bdist_mpkg with Python 3k support. It doesn't >> have any tests that I could see, but I've run it on python 2.6 and 3.2 >> and 3.3 on one of my packages as a first pass. > > Have you been in communication with Ronald Oussoren about this? I'm > sure he'd be interested in bringing into the "official"repository. I just emailed him, thanks for the suggestion. Best, Matthew From matthew.brett at gmail.com Mon Jan 7 15:12:51 2013 From: matthew.brett at gmail.com (Matthew Brett) Date: Mon, 7 Jan 2013 20:12:51 +0000 Subject: [Numpy-discussion] Do we want scalar casting to behave as it does at the moment? In-Reply-To: References: <50E61E29.1020709@astro.uio.no> Message-ID: Hi, On Mon, Jan 7, 2013 at 4:33 PM, Andrew Collette wrote: > Hi Matthew, > >> I realized when I thought about it, that I did not have a clear idea >> of your exact use case. How does the user specify the thing to add, >> and why do you need to avoid an error in the case that adding would >> overflow the type? Would you mind giving an idiot-level explanation? > > There isn't a specific use case I had in mind... from a developer's > perspective, what bothers me about the proposed behavior is that every > use of "+" on user-generated input becomes a time bomb. Since h5py > deals with user-generated files, I have to deal with all kinds of > dtypes, including low-precision ones like int8/uint8. They come from > user-supplied function and methods arguments, sure, but also from > datasets in files; attributes; virtually everywhere. > > I suppose what I'm really asking is that numpy provides (continues to > provide) a default rule in this situation, as does every other > scientific language I've used. One reason to avoid a ValueError in > favor of default behavior (in addition to the large amount of work > required to check every use of "+") is so there's an established > behavior users know to expect. > > For example, one feature we're thinking of implementing involves > adding an offset to a dataset when it's read. Should we roll over? > Upcast? It seems to me there's great value in being able to say "We > do what numpy does." If numpy doesn't answer the question, everybody > makes up their own rules. There are certainly cases where the answer > is obvious to the application: you have a huge number of int8's and > don't want to upcast. Or you don't want to lose precision. But if > numpy provides a default rule, nobody is prevented from making careful > choices based on their application's requirements, and there's the > additional value of having an common, documented default behavior. Just to be clear, you mean you might have something like this? 
def my_func('array_name', some_offset): arr = load_somehow('array_name') # dtype hitherto unknown return arr + some_offset ? And the problem is that it fails late? Is it really better that something bad happens for the addition than that it raises an error? You'll also often get an error when trying to add structured dtypes, but maybe you cant return these from a 'load'? Best, Matthew From njs at pobox.com Mon Jan 7 15:16:03 2013 From: njs at pobox.com (Nathaniel Smith) Date: Mon, 7 Jan 2013 20:16:03 +0000 Subject: [Numpy-discussion] Do we want scalar casting to behave as it does at the moment? In-Reply-To: References: <50E61E29.1020709@astro.uio.no> Message-ID: On Mon, Jan 7, 2013 at 2:17 AM, Olivier Delalleau wrote: > Hehe, I didn't even know there was supposed to be a warning for arrays... Ok. > > But I'm not convinced that re-using the "overflow" category is a good > idea, because to me the overflow is typically associated to the result > of an operation (when it goes beyond the dtype's supported range), > while here the problem is with the unsafe cast an input (even if it > makes no difference for addition, it does for some other ufuncs). Right, there are two operations: casting the inputs to a common type, and then performing the addition. It's the first operation that rolls over and would trigger a warning/error/whatever, not the second. -n From andrew.collette at gmail.com Mon Jan 7 15:50:12 2013 From: andrew.collette at gmail.com (Andrew Collette) Date: Mon, 7 Jan 2013 13:50:12 -0700 Subject: [Numpy-discussion] Do we want scalar casting to behave as it does at the moment? In-Reply-To: References: <50E61E29.1020709@astro.uio.no> Message-ID: Hi Matthew, > Just to be clear, you mean you might have something like this? > > def my_func('array_name', some_offset): > arr = load_somehow('array_name') # dtype hitherto unknown > return arr + some_offset > > ? And the problem is that it fails late? Is it really better that > something bad happens for the addition than that it raises an error? > > You'll also often get an error when trying to add structured dtypes, > but maybe you cant return these from a 'load'? In this specific case I would like to just use "+" and say "We add your offset using the NumPy rules," which is a problem if there are no NumPy rules for addition in the specific case where some_offset happens to be a scalar and not an array, and also slightly larger than arr.dtype can hold. I personally prefer upcasting to some reasonable type big enough to hold some_offset, as I described earlier, although that's not crucial. But I think we're getting a little caught up in the details of this example. My basic point is: yes, people should be careful to check dtypes, etc. where it's important to their application; but people who want to rely on some reasonable NumPy-supplied default behavior should be excused from doing so. Andrew From matthew.brett at gmail.com Mon Jan 7 15:55:25 2013 From: matthew.brett at gmail.com (Matthew Brett) Date: Mon, 7 Jan 2013 20:55:25 +0000 Subject: [Numpy-discussion] Do we want scalar casting to behave as it does at the moment? In-Reply-To: References: <50E61E29.1020709@astro.uio.no> Message-ID: Hi, On Mon, Jan 7, 2013 at 8:50 PM, Andrew Collette wrote: > Hi Matthew, > >> Just to be clear, you mean you might have something like this? >> >> def my_func('array_name', some_offset): >> arr = load_somehow('array_name') # dtype hitherto unknown >> return arr + some_offset >> >> ? And the problem is that it fails late? 
Is it really better that >> something bad happens for the addition than that it raises an error? >> >> You'll also often get an error when trying to add structured dtypes, >> but maybe you cant return these from a 'load'? > > In this specific case I would like to just use "+" and say "We add > your offset using the NumPy rules," which is a problem if there are no > NumPy rules for addition in the specific case where some_offset > happens to be a scalar and not an array, and also slightly larger than > arr.dtype can hold. I personally prefer upcasting to some reasonable > type big enough to hold some_offset, as I described earlier, although > that's not crucial. > > But I think we're getting a little caught up in the details of this > example. My basic point is: yes, people should be careful to check > dtypes, etc. where it's important to their application; but people who > want to rely on some reasonable NumPy-supplied default behavior should > be excused from doing so. For myself, I find detailed examples helpful, because I find it difficult to think about more general rules without applying them to practical cases. In this case I think you'd probably agree it would be reasonable to raise an error - all other things being equal? Can you think of another practical case where it would be reasonably clear that it was the wrong thing to do? Cheers, Matthew From d.s.seljebotn at astro.uio.no Mon Jan 7 16:17:45 2013 From: d.s.seljebotn at astro.uio.no (Dag Sverre Seljebotn) Date: Mon, 07 Jan 2013 22:17:45 +0100 Subject: [Numpy-discussion] =?utf-8?q?Do_we_want_scalar_casting_to_behave_?= =?utf-8?q?as_it_does_at_the_moment=3F?= In-Reply-To: References: <50E61E29.1020709@astro.uio.no> Message-ID: On 2013-01-07 21:50, Andrew Collette wrote: > Hi Matthew, > >> Just to be clear, you mean you might have something like this? >> >> def my_func('array_name', some_offset): >> arr = load_somehow('array_name') # dtype hitherto unknown >> return arr + some_offset >> >> ? And the problem is that it fails late? Is it really better that >> something bad happens for the addition than that it raises an error? >> >> You'll also often get an error when trying to add structured dtypes, >> but maybe you cant return these from a 'load'? > > In this specific case I would like to just use "+" and say "We add > your offset using the NumPy rules," which is a problem if there are > no > NumPy rules for addition in the specific case where some_offset > happens to be a scalar and not an array, and also slightly larger > than > arr.dtype can hold. I personally prefer upcasting to some reasonable > type big enough to hold some_offset, as I described earlier, although > that's not crucial. > > But I think we're getting a little caught up in the details of this > example. My basic point is: yes, people should be careful to check > dtypes, etc. where it's important to their application; but people > who > want to rely on some reasonable NumPy-supplied default behavior > should > be excused from doing so. But the default float dtype is double, and default integer dtype is at least int32. So if you rely on NumPy-supplied default behaviour you are fine! If you specify a smaller dtype for your arrays, you have some reason to do that. If you had enough memory to not worry about automatic conversion from int8 to int16, you would have specified it as int16 in the first place when you created the array. 
Dag Sverre From andrew.collette at gmail.com Mon Jan 7 16:18:52 2013 From: andrew.collette at gmail.com (Andrew Collette) Date: Mon, 7 Jan 2013 14:18:52 -0700 Subject: [Numpy-discussion] Do we want scalar casting to behave as it does at the moment? In-Reply-To: References: <50E61E29.1020709@astro.uio.no> Message-ID: Hi Matthew, > In this case I think you'd probably agree it would be reasonable to > raise an error - all other things being equal? No, I don't agree. I want there to be some default semantics I can rely on. Preferably, I want it to do the same thing it would do if some_offset were an array with element-by-element offsets, which is the current behavior of numpy 1.6 if you assume a reasonable dtype for some_offset. > Can you think of another practical case where it would be reasonably > clear that it was the wrong thing to do? I consider "myarray + constant -> Error" clearly wrong no matter what the context. I've never seen it in any other analysis language I've used. But it's also possible that I'm alone in this... I haven't seen many other people here arguing against the change. Andrew From andrew.collette at gmail.com Mon Jan 7 16:24:15 2013 From: andrew.collette at gmail.com (Andrew Collette) Date: Mon, 7 Jan 2013 14:24:15 -0700 Subject: [Numpy-discussion] Do we want scalar casting to behave as it does at the moment? In-Reply-To: References: <50E61E29.1020709@astro.uio.no> Message-ID: Hi Dag, > But the default float dtype is double, and default integer dtype is at > least > int32. > > So if you rely on NumPy-supplied default behaviour you are fine! As I mentioned, this caught my interest because people routinely save data in HDF5 as int8 or int16 to save disk space. It's not at all unusual to end up with these precisions when you read from a file. > If you specify a smaller dtype for your arrays, you have some reason to do that. In this case, the reason is that the person who gave me the file chose to store the data as e.g. int16. Good default semantics for things like addition make it easy to write generic code. Andrew From matthew.brett at gmail.com Mon Jan 7 16:31:43 2013 From: matthew.brett at gmail.com (Matthew Brett) Date: Mon, 7 Jan 2013 21:31:43 +0000 Subject: [Numpy-discussion] Do we want scalar casting to behave as it does at the moment? In-Reply-To: References: <50E61E29.1020709@astro.uio.no> Message-ID: Hi, On Mon, Jan 7, 2013 at 9:18 PM, Andrew Collette wrote: > Hi Matthew, > >> In this case I think you'd probably agree it would be reasonable to >> raise an error - all other things being equal? > > No, I don't agree. I want there to be some default semantics I can > rely on. Preferably, I want it to do the same thing it would do if > some_offset were an array with element-by-element offsets, which is > the current behavior of numpy 1.6 if you assume a reasonable dtype for > some_offset. Ah - well - I only meant that raising an error in the example would be no more surprising than raising an error at the python prompt. Do you agree with that? I mean, if the user knew that: >>> np.array([1], dtype=np.int8) + 128 would raise an error, they'd probably expect your offset routine to do the same. >> Can you think of another practical case where it would be reasonably >> clear that it was the wrong thing to do? > > I consider "myarray + constant -> Error" clearly wrong no matter what > the context. I've never seen it in any other analysis language I've > used. But it's also possible that I'm alone in this... 
I haven't seen > many other people here arguing against the change. I agree it kind of feels funny, but that's why I wanted to ask you for some silly but specific example where the funniness would be more apparent. Cheers, Matthew From andrew.collette at gmail.com Mon Jan 7 17:03:49 2013 From: andrew.collette at gmail.com (Andrew Collette) Date: Mon, 7 Jan 2013 15:03:49 -0700 Subject: [Numpy-discussion] Do we want scalar casting to behave as it does at the moment? In-Reply-To: References: <50E61E29.1020709@astro.uio.no> Message-ID: Hi Matthew, > Ah - well - I only meant that raising an error in the example would be > no more surprising than raising an error at the python prompt. Do you > agree with that? I mean, if the user knew that: > >>>> np.array([1], dtype=np.int8) + 128 > > would raise an error, they'd probably expect your offset routine to do the same. I think they would be surprised in both cases, considering this works fine: np.array([1], dtype=np.int8) + np.array([128]) > I agree it kind of feels funny, but that's why I wanted to ask you for > some silly but specific example where the funniness would be more > apparent. Here are a couple of examples I slapped together, specifically highlighting the value of the present (or similar) upcasting behavior. Granted, they are contrived and can all be fixed by conditional code, but this is my best effort at illustrating the "real-world" problems people may run into. Note that there is no easy way for the user to force upcasting to avoid the error, unless e.g. an "upcast" keyword were added to these functions, or code added to inspect the data dtype and use numpy.add to simulate the current behavior. def map_heights(self, dataset_name, heightmap): """ Correct altitudes by adding a custom heightmap dataset_name: Name of HDF5 dataset containing altitude data heightmap: Corrections in meters. Must match shape of the dataset (or be a scalar). """ # TODO: scattered reports of errors when a constant heightmap value is used return self.f[dataset_name][...] + heightmap def perform_analysis(self, dataset_name, kernel_offset=128): """ Apply Frobnication analysis, using optional linear offset dataset_name: Name of dataset in file kernel_offset: Optional sequencing parameter. Must be a power of 2 and at least 16 (default 128) """ # TODO: people report certain files frobnicate fine in IDL but not in Python... import frob data = self.f[dataset_name][...] try: return frob.frobnicate(data + kernel_offset) except ValueError: raise AnalysisFailed("Invalid input data") From matthew.brett at gmail.com Mon Jan 7 17:26:08 2013 From: matthew.brett at gmail.com (Matthew Brett) Date: Mon, 7 Jan 2013 22:26:08 +0000 Subject: [Numpy-discussion] Do we want scalar casting to behave as it does at the moment? In-Reply-To: References: <50E61E29.1020709@astro.uio.no> Message-ID: Hi, On Mon, Jan 7, 2013 at 10:03 PM, Andrew Collette wrote: > Hi Matthew, > >> Ah - well - I only meant that raising an error in the example would be >> no more surprising than raising an error at the python prompt. Do you >> agree with that? I mean, if the user knew that: >> >>>>> np.array([1], dtype=np.int8) + 128 >> >> would raise an error, they'd probably expect your offset routine to do the same. > > I think they would be surprised in both cases, considering this works fine: > > np.array([1], dtype=np.int8) + np.array([128]) > >> I agree it kind of feels funny, but that's why I wanted to ask you for >> some silly but specific example where the funniness would be more >> apparent. 
> > Here are a couple of examples I slapped together, specifically > highlighting the value of the present (or similar) upcasting behavior. > Granted, they are contrived and can all be fixed by conditional code, > but this is my best effort at illustrating the "real-world" problems > people may run into. > > Note that there is no easy way for the user to force upcasting to > avoid the error, unless e.g. an "upcast" keyword were added to these > functions, or code added to inspect the data dtype and use numpy.add > to simulate the current behavior. > > def map_heights(self, dataset_name, heightmap): > """ Correct altitudes by adding a custom heightmap > > dataset_name: Name of HDF5 dataset containing altitude data > heightmap: Corrections in meters. Must match shape of the > dataset (or be a scalar). > """ > # TODO: scattered reports of errors when a constant heightmap value is used > > return self.f[dataset_name][...] + heightmap > > def perform_analysis(self, dataset_name, kernel_offset=128): > """ Apply Frobnication analysis, using optional linear offset > > dataset_name: Name of dataset in file > kernel_offset: Optional sequencing parameter. Must be a power of > 2 and at least 16 (default 128) > """ > # TODO: people report certain files frobnicate fine in IDL but not > in Python... > > import frob > data = self.f[dataset_name][...] > try: > return frob.frobnicate(data + kernel_offset) > except ValueError: > raise AnalysisFailed("Invalid input data") Thanks - I know it seems silly - but it is helpful. There are two separate issues though: 1) Is the upcasting behavior of 1.6 better than the overflow behavior of 1.5? 2) If the upcasting of 1.6 is bad, is it better to raise an error or silently overflow, as in 1.5? Taking 2) first, in this example: > return self.f[dataset_name][...] + heightmap assuming it is not going to upcast, would you rather it overflow than raise an error? Why? The second seems more explicit and sensible to me. For 1) - of course the upcasting in 1.6 is only going to work some of the time. For example: In [2]: np.array([127], dtype=np.int8) * 1000 Out[2]: array([-4072], dtype=int16) So - you'll get something, but there's a reasonable chance you won't get what you were expecting. Of course that is true for 1.5 as well, but at least the rule there is simpler and so easier - in my opinion - to think about. Best, Matthew From andrew.collette at gmail.com Mon Jan 7 17:58:12 2013 From: andrew.collette at gmail.com (Andrew Collette) Date: Mon, 7 Jan 2013 15:58:12 -0700 Subject: [Numpy-discussion] Do we want scalar casting to behave as it does at the moment? In-Reply-To: References: <50E61E29.1020709@astro.uio.no> Message-ID: Hi, > Taking 2) first, in this example: > >> return self.f[dataset_name][...] + heightmap > > assuming it is not going to upcast, would you rather it overflow than > raise an error? Why? The second seems more explicit and sensible to > me. Yes, I think this (the 1.5 overflow behavior) was a bit odd, if easy to understand. > For 1) - of course the upcasting in 1.6 is only going to work some of > the time. For example: > > In [2]: np.array([127], dtype=np.int8) * 1000 > Out[2]: array([-4072], dtype=int16) > > So - you'll get something, but there's a reasonable chance you won't > get what you were expecting. Of course that is true for 1.5 as well, > but at least the rule there is simpler and so easier - in my opinion - > to think about. 
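(A back-of-the-envelope check on where that -4072 comes from, just to make the
failure mode concrete:

    127 * 1000 = 127000
    127000 - 2 * 65536 = -4072    # int16 wraps modulo 2**16

so the upcast to int16 only postpones the overflow, it doesn't remove it.)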
Part of what my first example was trying to demonstrate was that the function author assumed arrays and scalars obeyed the same rules for addition. For example, if data were int8 and heightmap were an int16 array with a max value of 32767, and the data had a max value in the same spot with e.g. 10, then the addition would overflow at that position, even with the int16 result. That's how array addition works in numpy, and as I understand it that's not slated to change. But when we have a scalar of value 32767 (which fits in int16 but not int8), we are proposing instead to do nothing under the assumption that it's an error. In summary: yes, there are some odd results, but they're consistent with the rules for addition elsewhere in numpy, and I would prefer that to treating this case as an error. Out of curiosity, I checked what IDL did, and it overflows using something like the numpy 1.6 rules: IDL> print, byte(1) + fix(32767) -32768 and in other places with 1.5-like behavior: IDL> print, byte(1) ^ fix(1000) 1 Of course, I don't hold up IDL as a shining example of good analysis software. :) Andrew From raul at virtualmaterials.com Mon Jan 7 18:08:58 2013 From: raul at virtualmaterials.com (Raul Cota) Date: Mon, 07 Jan 2013 16:08:58 -0700 Subject: [Numpy-discussion] Remove support for numeric and numarray in 1.8 In-Reply-To: References: <50EA4BDC.2060108@virtualmaterials.com> Message-ID: <50EB558A.1030306@virtualmaterials.com> An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Mon Jan 7 19:14:57 2013 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 7 Jan 2013 17:14:57 -0700 Subject: [Numpy-discussion] Remove support for numeric and numarray in 1.8 In-Reply-To: <50EB558A.1030306@virtualmaterials.com> References: <50EA4BDC.2060108@virtualmaterials.com> <50EB558A.1030306@virtualmaterials.com> Message-ID: On Mon, Jan 7, 2013 at 4:08 PM, Raul Cota wrote: > Ran a fair bit of our test suite using numpy 1.7 compiling against the > corresponding 'numpy/oldnumeric.h' and everything worked well . > > All I saw was the warning below which is obviously expected: > """ > Warning 23 warning Msg: Using deprecated NumPy API, disable it by > #defining NPY_NO_DEPRECATED_API NPY_1_7_API_VERSION > c:\python27\lib\site-packages\numpy\core\include\numpy\npy_deprecated_api.h > 8 > """ > > Great! Thanks, not many are in a position to check that part of numpy. Looks like we will need to keep the deprecated api around for a while also. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From raul at virtualmaterials.com Mon Jan 7 20:38:54 2013 From: raul at virtualmaterials.com (Raul Cota) Date: Mon, 07 Jan 2013 18:38:54 -0700 Subject: [Numpy-discussion] Remove support for numeric and numarray in 1.8 In-Reply-To: References: <50EA4BDC.2060108@virtualmaterials.com> <50EB558A.1030306@virtualmaterials.com> Message-ID: <50EB78AE.3070001@virtualmaterials.com> An HTML attachment was scrubbed... URL: From ondrej.certik at gmail.com Mon Jan 7 21:09:50 2013 From: ondrej.certik at gmail.com (=?UTF-8?B?T25kxZllaiDEjGVydMOtaw==?=) Date: Mon, 7 Jan 2013 18:09:50 -0800 Subject: [Numpy-discussion] Which Python to use for Mac binaries In-Reply-To: References: Message-ID: On Sun, Jan 6, 2013 at 2:40 PM, Chris Barker - NOAA Federal wrote: > On Sun, Jan 6, 2013 at 2:04 AM, Ralf Gommers wrote: >>> Which exact Python do we need to use on Mac? Do we need to use the >>> binary installer from python.org? >> >> Yes, the one from python.org. 
>> >>> Or can I install it from source? > > you could install from source using the same method that the > python.org binaries are built -- there is a script with the source to > do that, though I'm not sure what the point of that would be. Is it possible to install the dmg images without root access from the command line? I know how to access the contents: $ hdiutil attach python-2.7.3-macosx10.6.dmg $ ls /Volumes/Python\ 2.7.3/ Build.txt License.txt Python.mpkg ReadMe.txt But I am not currently sure what to do with it. The Python.mpkg directory seems to contain the sources. I have access to Vincent's computer, as suggested by Ralf and it is already setup, so I am using it. But I am not able (so far) to replicate the setup there so that I can create the binaries on any other Mac computer, which makes me feel really uneasy. By replicating the setup, at least once (preferably automated) would make me understand things much better. If possible, I would prefer to just use a command line (ssh) to do all that. (So that's maybe building from source is the only option.) Ondrej From ondrej.certik at gmail.com Mon Jan 7 21:12:58 2013 From: ondrej.certik at gmail.com (=?UTF-8?B?T25kxZllaiDEjGVydMOtaw==?=) Date: Mon, 7 Jan 2013 18:12:58 -0800 Subject: [Numpy-discussion] Which Python to use for Mac binaries In-Reply-To: References: Message-ID: On Sun, Jan 6, 2013 at 2:04 AM, Ralf Gommers wrote: > > > > On Sun, Jan 6, 2013 at 3:21 AM, Ond?ej ?ert?k > wrote: >> >> Hi, >> >> Currently the NumPy binaries are built using the pavement.py script, >> which uses the following Pythons: >> >> MPKG_PYTHON = { >> "2.5": >> ["/Library/Frameworks/Python.framework/Versions/2.5/bin/python"], >> "2.6": >> ["/Library/Frameworks/Python.framework/Versions/2.6/bin/python"], >> "2.7": >> ["/Library/Frameworks/Python.framework/Versions/2.7/bin/python"], >> "3.1": >> ["/Library/Frameworks/Python.framework/Versions/3.1/bin/python3"], >> "3.2": >> ["/Library/Frameworks/Python.framework/Versions/3.2/bin/python3"], >> "3.3": >> ["/Library/Frameworks/Python.framework/Versions/3.3/bin/python3"], >> } >> >> So for example I can easily create the 2.6 binary if that Python is >> pre-installed on the Mac box that I am using. >> On one of the Mac boxes that I am using, the 2.7 is missing, so are >> 3.1, 3.2 and 3.3. So I was thinking >> of updating my Fabric fab file to automatically install all Pythons >> from source and build against that, just like I do for Wine. >> >> Which exact Python do we need to use on Mac? Do we need to use the >> binary installer from python.org? > > > Yes, the one from python.org. > >> >> Or can I install it from source? Finally, for which Python versions >> should we provide binary installers for Mac? >> For reference, the 1.6.2 had installers for 2.5, 2.6 and 2.7 only for >> OS X 10.3. There is only 2.7 version for OS X 10.6. > > > The provided installers and naming scheme should match what's done for > Python itself on python.org. > > The 10.3 installers for 2.5, 2.6 and 2.7 should be compiled on OS X 10.5. > This is kind of hard to come by these days, but Vincent Davis maintains a > build machine for numpy and scipy. That's already set up correctly, so all > you have to do is connect to it via ssh, check out v.17.0 in ~/Code/numpy, > check in release.sh that the section for OS X 10.6 is disabled and for 10.5 > enabled and run it. > > OS X 10.6 broke support for previous versions in some subtle ways, so even > when using the 10.4 SDK numpy compiled on 10.6 won't run on 10.5. 
As long as > we're supporting 10.5 you therefore need to compile on it. > > The 10.7 --> 10.6 support hasn't been checked, but I wouldn't trust it. I > have a 10.6 machine, so I can compile those binaries if needed. > >> >> Also, what is the meaning of the following piece of code in pavement.py: >> >> def _build_mpkg(pyver): >> # account for differences between Python 2.7.1 versions from >> python.org >> if os.environ.get('MACOSX_DEPLOYMENT_TARGET', None) == "10.6": >> ldflags = "-undefined dynamic_lookup -bundle -arch i386 -arch >> x86_64 -Wl,-search_paths_first" >> else: >> ldflags = "-undefined dynamic_lookup -bundle -arch i386 -arch >> ppc -Wl,-search_paths_first" >> ldflags += " -L%s" % os.path.join(os.path.dirname(__file__), "build") > > > The 10.6 binaries support only Intel Macs, both 32-bit and 64-bit. The 10.3 > binaries support PPC Macs and 32-bit Intel. That's what the above does. Note > that we simply follow the choice made by the Python release managers here. > >> >> if pyver == "2.5": >> sh("CC=gcc-4.0 LDFLAGS='%s' %s setupegg.py bdist_mpkg" % >> (ldflags, " ".join(MPKG_PYTHON[pyver]))) >> else: >> sh("LDFLAGS='%s' %s setupegg.py bdist_mpkg" % (ldflags, " >> ".join(MPKG_PYTHON[pyver]))) > > > This is necessary because in Python 2.5, distutils asks for "gcc" instead of > "gcc-4.0", so you may get the wrong one without CC=gcc-4.0. From Python 2.6 > on this was fixed. > >> >> In particular, the last line gets executed and it then fails with: >> >> paver dmg -p 2.6 >> ---> pavement.dmg >> ---> pavement.clean >> LDFLAGS='-undefined dynamic_lookup -bundle -arch i386 -arch ppc >> -Wl,-search_paths_first -Lbuild' >> /Library/Frameworks/Python.framework/Versions/2.6/bin/python >> setupegg.py bdist_mpkg >> Traceback (most recent call last): >> File "setupegg.py", line 17, in >> from setuptools import setup >> ImportError: No module named setuptools >> >> >> The reason is (I think) that if the Python binary is called explicitly >> with /Library/Frameworks/Python.framework/Versions/2.6/bin/python, >> then the paths are not setup properly in virtualenv, and thus >> setuptools (which is only installed in virtualenv, but not in system >> Python) fails to import. The solution is to simply apply this patch: > > > Avoid using system Python for anything. The first thing to do on any new OS > X system is install Python some other way, preferably from python.org. > >> >> diff --git a/pavement.py b/pavement.py >> index e693016..0c637f8 100644 >> --- a/pavement.py >> +++ b/pavement.py >> @@ -449,7 +449,7 @@ def _build_mpkg(pyver): >> if pyver == "2.5": >> sh("CC=gcc-4.0 LDFLAGS='%s' %s setupegg.py bdist_mpkg" % >> (ldflags, " ".join(MPKG_PYTHON[pyver]))) >> else: >> - sh("LDFLAGS='%s' %s setupegg.py bdist_mpkg" % (ldflags, " >> ".join(MPKG_PYTHON[pyver]))) >> + sh("python setupegg.py bdist_mpkg") > > > This doesn't work unless using virtualenvs, you're just throwing away the > version selection here. If you can support virtualenvs in addition to > python.org pythons, that would be useful. But being able to build binaries > when needed simply by "paver dmg -p 2.x" is quite useful. Absolutely. I was following the release.sh in the numpy git repository, which contains: paver bootstrap source bootstrap/bin/activate python setupsconsegg.py install paver pdf paver dmg -p 2.7 So it is using the virtualenv and it works on Vincent's computer, but it doesn't work on my other computer. I wanted to make the steps somehow reproducible. 
I started adding the commands needed to setup the Mac (any Mac) into my Fabfile here: https://github.com/certik/numpy-vendor/blob/master/fabfile.py#L98 but I run into the issues above. Of course, I'll try to just use Vincent's computer, but I would feel much better if the numpy release process for Mac didn't depend on one particular computer, but rather could be quite easily reproduced on any Mac OS X of the right version. Ondrej From chris.barker at noaa.gov Mon Jan 7 23:41:14 2013 From: chris.barker at noaa.gov (Chris Barker - NOAA Federal) Date: Mon, 7 Jan 2013 20:41:14 -0800 Subject: [Numpy-discussion] Which Python to use for Mac binaries In-Reply-To: References: Message-ID: On Mon, Jan 7, 2013 at 6:09 PM, Ond?ej ?ert?k wrote: > Is it possible to install the dmg images without root access from the > command line? I've never tried, but it looks like you can: http://www.commandlinefu.com/commands/view/2031/install-an-mpkg-from-the-command-line-on-osx > But I am not currently sure what to do with it. The Python.mpkg > directory seems to contain the sources. yup -- that's where everything is. the "installer" command should be able to unpack it. > By replicating the setup, at least once (preferably automated) would > make me understand things much better. > If possible, I would prefer to just use a command line (ssh) to do all > that. (So that's maybe building from source > is the only option.) If you ndo need to build from source, see this message for a bit more info: http://mail.python.org/pipermail/pythonmac-sig/2012-October/023742.html there are a few prerequisites you need to install first... Either way, you should be able to build a start-to-finish build script. -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From ondrej.certik at gmail.com Tue Jan 8 01:23:08 2013 From: ondrej.certik at gmail.com (=?UTF-8?B?T25kxZllaiDEjGVydMOtaw==?=) Date: Mon, 7 Jan 2013 22:23:08 -0800 Subject: [Numpy-discussion] Which Python to use for Mac binaries In-Reply-To: References: Message-ID: On Mon, Jan 7, 2013 at 8:41 PM, Chris Barker - NOAA Federal wrote: > On Mon, Jan 7, 2013 at 6:09 PM, Ond?ej ?ert?k wrote: >> Is it possible to install the dmg images without root access from the >> command line? > > I've never tried, but it looks like you can: > > http://www.commandlinefu.com/commands/view/2031/install-an-mpkg-from-the-command-line-on-osx This requires root access. Without sudo, I get: $ installer -pkg /Volumes/Python\ 2.7.3/Python.mpkg/ -target ondrej installer: This package requires authentication to install. and since I don't have root access, it doesn't work. So one way around it would be to install python from source, that shouldn't require root access. > > >> But I am not currently sure what to do with it. The Python.mpkg >> directory seems to contain the sources. > > yup -- that's where everything is. the "installer" command should be > able to unpack it. Ok. > >> By replicating the setup, at least once (preferably automated) would >> make me understand things much better. >> If possible, I would prefer to just use a command line (ssh) to do all >> that. (So that's maybe building from source >> is the only option.) 
> > If you ndo need to build from source, see this message for a bit more info: > > http://mail.python.org/pipermail/pythonmac-sig/2012-October/023742.html > > there are a few prerequisites you need to install first... > > Either way, you should be able to build a start-to-finish build script. Yes, that would be my goal eventually. Without root access. But right now, I am not even sure it's possible. So for now I'll simply use already pre-configured box. Ondrej From ralf.gommers at gmail.com Tue Jan 8 02:36:16 2013 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Tue, 8 Jan 2013 08:36:16 +0100 Subject: [Numpy-discussion] Which Python to use for Mac binaries In-Reply-To: References: Message-ID: On Tue, Jan 8, 2013 at 3:12 AM, Ond?ej ?ert?k wrote: > On Sun, Jan 6, 2013 at 2:04 AM, Ralf Gommers > wrote: > > > > > > > > On Sun, Jan 6, 2013 at 3:21 AM, Ond?ej ?ert?k > > wrote: > >> > >> Hi, > >> > >> Currently the NumPy binaries are built using the pavement.py script, > >> which uses the following Pythons: > >> > >> MPKG_PYTHON = { > >> "2.5": > >> ["/Library/Frameworks/Python.framework/Versions/2.5/bin/python"], > >> "2.6": > >> ["/Library/Frameworks/Python.framework/Versions/2.6/bin/python"], > >> "2.7": > >> ["/Library/Frameworks/Python.framework/Versions/2.7/bin/python"], > >> "3.1": > >> ["/Library/Frameworks/Python.framework/Versions/3.1/bin/python3"], > >> "3.2": > >> ["/Library/Frameworks/Python.framework/Versions/3.2/bin/python3"], > >> "3.3": > >> ["/Library/Frameworks/Python.framework/Versions/3.3/bin/python3"], > >> } > >> > >> So for example I can easily create the 2.6 binary if that Python is > >> pre-installed on the Mac box that I am using. > >> On one of the Mac boxes that I am using, the 2.7 is missing, so are > >> 3.1, 3.2 and 3.3. So I was thinking > >> of updating my Fabric fab file to automatically install all Pythons > >> from source and build against that, just like I do for Wine. > >> > >> Which exact Python do we need to use on Mac? Do we need to use the > >> binary installer from python.org? > > > > > > Yes, the one from python.org. > > > >> > >> Or can I install it from source? Finally, for which Python versions > >> should we provide binary installers for Mac? > >> For reference, the 1.6.2 had installers for 2.5, 2.6 and 2.7 only for > >> OS X 10.3. There is only 2.7 version for OS X 10.6. > > > > > > The provided installers and naming scheme should match what's done for > > Python itself on python.org. > > > > The 10.3 installers for 2.5, 2.6 and 2.7 should be compiled on OS X 10.5. > > This is kind of hard to come by these days, but Vincent Davis maintains a > > build machine for numpy and scipy. That's already set up correctly, so > all > > you have to do is connect to it via ssh, check out v.17.0 in > ~/Code/numpy, > > check in release.sh that the section for OS X 10.6 is disabled and for > 10.5 > > enabled and run it. > > > > OS X 10.6 broke support for previous versions in some subtle ways, so > even > > when using the 10.4 SDK numpy compiled on 10.6 won't run on 10.5. As > long as > > we're supporting 10.5 you therefore need to compile on it. > > > > The 10.7 --> 10.6 support hasn't been checked, but I wouldn't trust it. I > > have a 10.6 machine, so I can compile those binaries if needed. 
> > > >> > >> Also, what is the meaning of the following piece of code in pavement.py: > >> > >> def _build_mpkg(pyver): > >> # account for differences between Python 2.7.1 versions from > >> python.org > >> if os.environ.get('MACOSX_DEPLOYMENT_TARGET', None) == "10.6": > >> ldflags = "-undefined dynamic_lookup -bundle -arch i386 -arch > >> x86_64 -Wl,-search_paths_first" > >> else: > >> ldflags = "-undefined dynamic_lookup -bundle -arch i386 -arch > >> ppc -Wl,-search_paths_first" > >> ldflags += " -L%s" % os.path.join(os.path.dirname(__file__), > "build") > > > > > > The 10.6 binaries support only Intel Macs, both 32-bit and 64-bit. The > 10.3 > > binaries support PPC Macs and 32-bit Intel. That's what the above does. > Note > > that we simply follow the choice made by the Python release managers > here. > > > >> > >> if pyver == "2.5": > >> sh("CC=gcc-4.0 LDFLAGS='%s' %s setupegg.py bdist_mpkg" % > >> (ldflags, " ".join(MPKG_PYTHON[pyver]))) > >> else: > >> sh("LDFLAGS='%s' %s setupegg.py bdist_mpkg" % (ldflags, " > >> ".join(MPKG_PYTHON[pyver]))) > > > > > > This is necessary because in Python 2.5, distutils asks for "gcc" > instead of > > "gcc-4.0", so you may get the wrong one without CC=gcc-4.0. From Python > 2.6 > > on this was fixed. > > > >> > >> In particular, the last line gets executed and it then fails with: > >> > >> paver dmg -p 2.6 > >> ---> pavement.dmg > >> ---> pavement.clean > >> LDFLAGS='-undefined dynamic_lookup -bundle -arch i386 -arch ppc > >> -Wl,-search_paths_first -Lbuild' > >> /Library/Frameworks/Python.framework/Versions/2.6/bin/python > >> setupegg.py bdist_mpkg > >> Traceback (most recent call last): > >> File "setupegg.py", line 17, in > >> from setuptools import setup > >> ImportError: No module named setuptools > >> > >> > >> The reason is (I think) that if the Python binary is called explicitly > >> with /Library/Frameworks/Python.framework/Versions/2.6/bin/python, > >> then the paths are not setup properly in virtualenv, and thus > >> setuptools (which is only installed in virtualenv, but not in system > >> Python) fails to import. The solution is to simply apply this patch: > > > > > > Avoid using system Python for anything. The first thing to do on any new > OS > > X system is install Python some other way, preferably from python.org. > > > >> > >> diff --git a/pavement.py b/pavement.py > >> index e693016..0c637f8 100644 > >> --- a/pavement.py > >> +++ b/pavement.py > >> @@ -449,7 +449,7 @@ def _build_mpkg(pyver): > >> if pyver == "2.5": > >> sh("CC=gcc-4.0 LDFLAGS='%s' %s setupegg.py bdist_mpkg" % > >> (ldflags, " ".join(MPKG_PYTHON[pyver]))) > >> else: > >> - sh("LDFLAGS='%s' %s setupegg.py bdist_mpkg" % (ldflags, " > >> ".join(MPKG_PYTHON[pyver]))) > >> + sh("python setupegg.py bdist_mpkg") > > > > > > This doesn't work unless using virtualenvs, you're just throwing away the > > version selection here. If you can support virtualenvs in addition to > > python.org pythons, that would be useful. But being able to build > binaries > > when needed simply by "paver dmg -p 2.x" is quite useful. > > > Absolutely. I was following the release.sh in the numpy git > repository, which contains: > > paver bootstrap > source bootstrap/bin/activate > python setupsconsegg.py install > paver pdf > paver dmg -p 2.7 > > So it is using the virtualenv and it works on Vincent's computer, but > it doesn't work on my > other computer. > Note that it's only using a virtualenv for this one step (building the docs). 
This is because building the docs requires installing numpy first to be able to extract the docstrings. > I wanted to make the steps somehow reproducible. I started adding the > commands needed to setup the Mac (any Mac) > into my Fabfile here: > > https://github.com/certik/numpy-vendor/blob/master/fabfile.py#L98 > > but I run into the issues above. > > Of course, I'll try to just use Vincent's computer, but I would feel > much better if the numpy release process for Mac didn't depend on one > particular computer, but rather could be quite easily reproduced on > any Mac OS X of the right version. > It doesn't depend on that one computer of course, it takes only a few minutes to set up a new Mac. But yes, currently it does require admin rights to install a framework Python. Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From robince at gmail.com Tue Jan 8 09:06:17 2013 From: robince at gmail.com (Robin) Date: Tue, 8 Jan 2013 14:06:17 +0000 Subject: [Numpy-discussion] Embedded NumPy LAPACK errors In-Reply-To: References: <50E73ECC.8050803@eml.cc> <3FF2E38B-6A93-4AC6-B28B-CD1C50784AD5@gmail.com> Message-ID: On Sat, Jan 5, 2013 at 1:03 PM, Robin wrote: >>> If not, is there a reasonable way to build numpy.linalg such that >>> it interfaces with MKL correctly ? I managed to get this to work in the end. Since Matlab uses MKL with ILP64 interface it is not possible to get Numpy to use that without modifications to all the lapack calls. However, I was able to keep the two different versions of lapack seperate. The first step is to build numpy to link statically against MKL. I wasn't sure how to get distutils to do this so I copied all the mkl static .a libaries to a temporary directory and pointed numpy to that to force the issue (so dynamic linking wasn't an option). Even with that it still uses the Lapack from the Matlab dynamic global symbols. The trick was adding the linker flag "-Bsymbolic" which means lapack_lite calls to lapack use the statically linked local copies. With these changes everything appears to work. There are two test failures (below) which do not appear when running the same Numpy build outside of Matlab but they don't seem so severe. So: [robini at robini2-pc numpy]$ cat site.cfg [mkl] search_static_first = true library_dirs = /tmp/intel64 include_dirs = /opt/intel/mkl/include #mkl_libs = mkl_sequential, mkl_intel_lp64, mkl_core, mkl_lapack95_lp64, mkl_blas95_lp64 mkl_libs = mkl_lapack95, mkl_blas95, mkl_intel_lp64, mkl_sequential, mkl_core, svml, imf, irc lapack_libs = [robini at robini2-pc numpy]$ ls /tmp/intel64/ libimf.a libmkl_gnu_thread.a libirc.a libmkl_intel_ilp64.a libmkl_blacs_ilp64.a libmkl_intel_lp64.a libmkl_blacs_intelmpi_ilp64.a libmkl_intel_sp2dp.a libmkl_blacs_intelmpi_lp64.a libmkl_intel_thread.a libmkl_blacs_lp64.a libmkl_lapack95_ilp64.a libmkl_blacs_openmpi_ilp64.a libmkl_lapack95_lp64.a libmkl_blacs_openmpi_lp64.a libmkl_pgi_thread.a libmkl_blacs_sgimpt_ilp64.a libmkl_scalapack_ilp64.a libmkl_blacs_sgimpt_lp64.a libmkl_scalapack_lp64.a libmkl_blas95_ilp64.a libmkl_sequential.a libmkl_blas95_lp64.a libmkl_solver_ilp64.a libmkl_cdft_core.a libmkl_solver_ilp64_sequential.a libmkl_core.a libmkl_solver_lp64.a libmkl_gf_ilp64.a libmkl_solver_lp64_sequential.a libmkl_gf_lp64.a libsvml.a in numpy/distutils/intelccompiler.py: class IntelEM64TCCompiler(UnixCCompiler): """ A modified Intel x86_64 compiler compatible with a 64bit gcc built Python. 
""" compiler_type = 'intelem' cc_exe = 'icc -m64 -fPIC' cc_args = "-fPIC" def __init__ (self, verbose=0, dry_run=0, force=0): UnixCCompiler.__init__ (self, verbose,dry_run, force) self.cc_exe = 'icc -m64 -fPIC -O3 -fomit-frame-pointer' compiler = self.cc_exe self.set_executables(compiler=compiler, compiler_so=compiler, compiler_cxx=compiler, linker_exe=compiler, linker_so=compiler + ' -shared -static-intel -Bsymbolic') Test failures (test_special_values also fails outside Matlab, but the other 2 only occur when embedded): ====================================================================== FAIL: test_umath.test_nextafterl ---------------------------------------------------------------------- Traceback (most recent call last): File "/opt/epd-7.3/lib/python2.7/site-packages/nose/case.py", line 197, in runTest self.test(*self.arg) File "/home/robini/slash/lib/python2.7/site-packages/numpy/testing/decorators.py", line 215, in knownfailer return f(*args, **kwargs) File "/home/robini/slash/lib/python2.7/site-packages/numpy/core/tests/test_umath.py", line 1123, in test_nextafterl return _test_nextafter(np.longdouble) File "/home/robini/slash/lib/python2.7/site-packages/numpy/core/tests/test_umath.py", line 1108, in _test_nextafter assert np.nextafter(one, two) - one == eps AssertionError ====================================================================== FAIL: test_umath.test_spacingl ---------------------------------------------------------------------- Traceback (most recent call last): File "/opt/epd-7.3/lib/python2.7/site-packages/nose/case.py", line 197, in runTest self.test(*self.arg) File "/home/robini/slash/lib/python2.7/site-packages/numpy/testing/decorators.py", line 215, in knownfailer return f(*args, **kwargs) File "/home/robini/slash/lib/python2.7/site-packages/numpy/core/tests/test_umath.py", line 1149, in test_spacingl return _test_spacing(np.longdouble) File "/home/robini/slash/lib/python2.7/site-packages/numpy/core/tests/test_umath.py", line 1132, in _test_spacing assert np.spacing(one) == eps AssertionError ====================================================================== FAIL: test_special_values (test_umath_complex.TestClog) ---------------------------------------------------------------------- Traceback (most recent call last): File "/home/robini/slash/lib/python2.7/site-packages/numpy/testing/decorators.py", line 146, in skipper_func return f(*args, **kwargs) File "/home/robini/slash/lib/python2.7/site-packages/numpy/core/tests/test_umath_complex.py", line 299, in test_special_values assert_almost_equal(np.log(np.conj(xa[i])), np.conj(np.log(xa[i]))) File "/home/robini/slash/lib/python2.7/site-packages/numpy/testing/utils.py", line 448, in assert_almost_equal raise AssertionError(msg) AssertionError: Arrays are not almost equal to 7 decimals ACTUAL: array([-inf+3.14159265j]) DESIRED: array([-inf-3.14159265j]) ---------------------------------------------------------------------- Ran 3571 tests in 10.897s FAILED (KNOWNFAIL=5, SKIP=1, failures=3) Cheers Robin From cournape at gmail.com Tue Jan 8 10:20:51 2013 From: cournape at gmail.com (David Cournapeau) Date: Tue, 8 Jan 2013 09:20:51 -0600 Subject: [Numpy-discussion] Which Python to use for Mac binaries In-Reply-To: References: Message-ID: On Mon, Jan 7, 2013 at 7:31 AM, Matthew Brett wrote: > Hi, > > On Sun, Jan 6, 2013 at 10:40 PM, Chris Barker - NOAA Federal > wrote: >> On Sun, Jan 6, 2013 at 2:04 AM, Ralf Gommers wrote: >>>> Which exact Python do we need to use on Mac? 
Do we need to use the >>>> binary installer from python.org? >>> >>> Yes, the one from python.org. >>> >>>> Or can I install it from source? >> >> you could install from source using the same method that the >> python.org binaries are built -- there is a script with the source to >> do that, though I'm not sure what the point of that would be. >> >>> The 10.3 installers for 2.5, 2.6 and 2.7 should be compiled on OS X 10.5. >> >> It would be great to continue support for that, though I wonder how >> many people still need it -- I don't think Apple supports 10.5 >> anymore, for instance. >> >>> The 10.7 --> 10.6 support hasn't been checked, but I wouldn't trust it. I >>> have a 10.6 machine, so I can compile those binaries if needed. >> >> That would be better, but it would also be nice to check how building >> on 10.7 works. >> >>> Avoid using system Python for anything. The first thing to do on any new OS >>> X system is install Python some other way, preferably from python.org. >> >> +1 >> >>> Last note: bdist_mpkg is unmaintained and doesn't support Python 3.x. Most >>> recent version is at: https://github.com/matthew-brett/bdist_mpkg, for >>> previous versions numpy releases I've used that at commit e81a58a471 >> >> There has been recent discussion on the pythonmac list about this -- >> some waffling about how important it is -- though I think it would be >> good to keep it up to date. > > I updated my fork of bdist_mpkg with Python 3k support. It doesn't > have any tests that I could see, but I've run it on python 2.6 and 3.2 > and 3.3 on one of my packages as a first pass. > >>> If we want 3.x binaries, then we should fix that or (preferably) build >>> binaries with Bento. Bento has grown support for mpkg's; I'm not sure how >>> robust that is. >> >> So maybe bento is a better route than bdist_mpkg -- this is worth >> discussion on teh pythonmac list. > > David - can you give a status update on that? It is more a starting point than anything else, and barely tested. I would advise against using it ATM. thanks, David From ondrej.certik at gmail.com Tue Jan 8 11:24:10 2013 From: ondrej.certik at gmail.com (=?UTF-8?B?T25kxZllaiDEjGVydMOtaw==?=) Date: Tue, 8 Jan 2013 08:24:10 -0800 Subject: [Numpy-discussion] Which Python to use for Mac binaries In-Reply-To: References: Message-ID: On Mon, Jan 7, 2013 at 11:36 PM, Ralf Gommers wrote: > > > > On Tue, Jan 8, 2013 at 3:12 AM, Ond?ej ?ert?k > wrote: >> >> On Sun, Jan 6, 2013 at 2:04 AM, Ralf Gommers >> wrote: >> > >> > >> > >> > On Sun, Jan 6, 2013 at 3:21 AM, Ond?ej ?ert?k >> > wrote: >> >> >> >> Hi, >> >> >> >> Currently the NumPy binaries are built using the pavement.py script, >> >> which uses the following Pythons: >> >> >> >> MPKG_PYTHON = { >> >> "2.5": >> >> ["/Library/Frameworks/Python.framework/Versions/2.5/bin/python"], >> >> "2.6": >> >> ["/Library/Frameworks/Python.framework/Versions/2.6/bin/python"], >> >> "2.7": >> >> ["/Library/Frameworks/Python.framework/Versions/2.7/bin/python"], >> >> "3.1": >> >> ["/Library/Frameworks/Python.framework/Versions/3.1/bin/python3"], >> >> "3.2": >> >> ["/Library/Frameworks/Python.framework/Versions/3.2/bin/python3"], >> >> "3.3": >> >> ["/Library/Frameworks/Python.framework/Versions/3.3/bin/python3"], >> >> } >> >> >> >> So for example I can easily create the 2.6 binary if that Python is >> >> pre-installed on the Mac box that I am using. >> >> On one of the Mac boxes that I am using, the 2.7 is missing, so are >> >> 3.1, 3.2 and 3.3. 
So I was thinking >> >> of updating my Fabric fab file to automatically install all Pythons >> >> from source and build against that, just like I do for Wine. >> >> >> >> Which exact Python do we need to use on Mac? Do we need to use the >> >> binary installer from python.org? >> > >> > >> > Yes, the one from python.org. >> > >> >> >> >> Or can I install it from source? Finally, for which Python versions >> >> should we provide binary installers for Mac? >> >> For reference, the 1.6.2 had installers for 2.5, 2.6 and 2.7 only for >> >> OS X 10.3. There is only 2.7 version for OS X 10.6. >> > >> > >> > The provided installers and naming scheme should match what's done for >> > Python itself on python.org. >> > >> > The 10.3 installers for 2.5, 2.6 and 2.7 should be compiled on OS X >> > 10.5. >> > This is kind of hard to come by these days, but Vincent Davis maintains >> > a >> > build machine for numpy and scipy. That's already set up correctly, so >> > all >> > you have to do is connect to it via ssh, check out v.17.0 in >> > ~/Code/numpy, >> > check in release.sh that the section for OS X 10.6 is disabled and for >> > 10.5 >> > enabled and run it. >> > >> > OS X 10.6 broke support for previous versions in some subtle ways, so >> > even >> > when using the 10.4 SDK numpy compiled on 10.6 won't run on 10.5. As >> > long as >> > we're supporting 10.5 you therefore need to compile on it. >> > >> > The 10.7 --> 10.6 support hasn't been checked, but I wouldn't trust it. >> > I >> > have a 10.6 machine, so I can compile those binaries if needed. >> > >> >> >> >> Also, what is the meaning of the following piece of code in >> >> pavement.py: >> >> >> >> def _build_mpkg(pyver): >> >> # account for differences between Python 2.7.1 versions from >> >> python.org >> >> if os.environ.get('MACOSX_DEPLOYMENT_TARGET', None) == "10.6": >> >> ldflags = "-undefined dynamic_lookup -bundle -arch i386 -arch >> >> x86_64 -Wl,-search_paths_first" >> >> else: >> >> ldflags = "-undefined dynamic_lookup -bundle -arch i386 -arch >> >> ppc -Wl,-search_paths_first" >> >> ldflags += " -L%s" % os.path.join(os.path.dirname(__file__), >> >> "build") >> > >> > >> > The 10.6 binaries support only Intel Macs, both 32-bit and 64-bit. The >> > 10.3 >> > binaries support PPC Macs and 32-bit Intel. That's what the above does. >> > Note >> > that we simply follow the choice made by the Python release managers >> > here. >> > >> >> >> >> if pyver == "2.5": >> >> sh("CC=gcc-4.0 LDFLAGS='%s' %s setupegg.py bdist_mpkg" % >> >> (ldflags, " ".join(MPKG_PYTHON[pyver]))) >> >> else: >> >> sh("LDFLAGS='%s' %s setupegg.py bdist_mpkg" % (ldflags, " >> >> ".join(MPKG_PYTHON[pyver]))) >> > >> > >> > This is necessary because in Python 2.5, distutils asks for "gcc" >> > instead of >> > "gcc-4.0", so you may get the wrong one without CC=gcc-4.0. From Python >> > 2.6 >> > on this was fixed. 
>> > >> >> >> >> In particular, the last line gets executed and it then fails with: >> >> >> >> paver dmg -p 2.6 >> >> ---> pavement.dmg >> >> ---> pavement.clean >> >> LDFLAGS='-undefined dynamic_lookup -bundle -arch i386 -arch ppc >> >> -Wl,-search_paths_first -Lbuild' >> >> /Library/Frameworks/Python.framework/Versions/2.6/bin/python >> >> setupegg.py bdist_mpkg >> >> Traceback (most recent call last): >> >> File "setupegg.py", line 17, in >> >> from setuptools import setup >> >> ImportError: No module named setuptools >> >> >> >> >> >> The reason is (I think) that if the Python binary is called explicitly >> >> with /Library/Frameworks/Python.framework/Versions/2.6/bin/python, >> >> then the paths are not setup properly in virtualenv, and thus >> >> setuptools (which is only installed in virtualenv, but not in system >> >> Python) fails to import. The solution is to simply apply this patch: >> > >> > >> > Avoid using system Python for anything. The first thing to do on any new >> > OS >> > X system is install Python some other way, preferably from python.org. >> > >> >> >> >> diff --git a/pavement.py b/pavement.py >> >> index e693016..0c637f8 100644 >> >> --- a/pavement.py >> >> +++ b/pavement.py >> >> @@ -449,7 +449,7 @@ def _build_mpkg(pyver): >> >> if pyver == "2.5": >> >> sh("CC=gcc-4.0 LDFLAGS='%s' %s setupegg.py bdist_mpkg" % >> >> (ldflags, " ".join(MPKG_PYTHON[pyver]))) >> >> else: >> >> - sh("LDFLAGS='%s' %s setupegg.py bdist_mpkg" % (ldflags, " >> >> ".join(MPKG_PYTHON[pyver]))) >> >> + sh("python setupegg.py bdist_mpkg") >> > >> > >> > This doesn't work unless using virtualenvs, you're just throwing away >> > the >> > version selection here. If you can support virtualenvs in addition to >> > python.org pythons, that would be useful. But being able to build >> > binaries >> > when needed simply by "paver dmg -p 2.x" is quite useful. >> >> >> Absolutely. I was following the release.sh in the numpy git >> repository, which contains: >> >> paver bootstrap >> source bootstrap/bin/activate >> python setupsconsegg.py install >> paver pdf >> paver dmg -p 2.7 >> >> So it is using the virtualenv and it works on Vincent's computer, but >> it doesn't work on my >> other computer. > > > Note that it's only using a virtualenv for this one step (building the > docs). This is because building the docs requires installing numpy first to > be able to extract the docstrings. Ah, I missed this important part. Since I generate the pdf files in linux, I can just copy them on Mac and thus don't need any of the virtualenv part. > >> >> I wanted to make the steps somehow reproducible. I started adding the >> commands needed to setup the Mac (any Mac) >> into my Fabfile here: >> >> https://github.com/certik/numpy-vendor/blob/master/fabfile.py#L98 >> >> but I run into the issues above. >> >> Of course, I'll try to just use Vincent's computer, but I would feel >> much better if the numpy release process for Mac didn't depend on one >> particular computer, but rather could be quite easily reproduced on >> any Mac OS X of the right version. > > > It doesn't depend on that one computer of course, it takes only a few > minutes to set up a new Mac. But yes, currently it does require admin rights > to install a framework Python. Ok, that's what I wanted to know. 
Thanks, Ondrej From chris.barker at noaa.gov Tue Jan 8 11:45:25 2013 From: chris.barker at noaa.gov (Chris Barker - NOAA Federal) Date: Tue, 8 Jan 2013 08:45:25 -0800 Subject: [Numpy-discussion] Which Python to use for Mac binaries In-Reply-To: References: Message-ID: On Mon, Jan 7, 2013 at 10:23 PM, Ond?ej ?ert?k wrote: >> http://www.commandlinefu.com/commands/view/2031/install-an-mpkg-from-the-command-line-on-osx > > This requires root access. Without sudo, I get: > > $ installer -pkg /Volumes/Python\ 2.7.3/Python.mpkg/ -target ondrej > installer: This package requires authentication to install. > > and since I don't have root access, it doesn't work. > > So one way around it would be to install python from source, that > shouldn't require root access. hmm -- this all may be a trick -- both the *.mpkg and the standard build put everything in /Library/Frameworks/Python -- which is where it belongs. Bu tif you need root access to write there, then there is a problem. I'm sure a non-root build could put everything in the users' home directory, then packages built against that would have their paths messed up. What's odd is that I'm pretty sure I've been able to point+click install those without sudo...(I could recall incorrectly). This would be a good question for the pythonmac list -- low traffic, but there are some very smart and helpful folks there: http://mail.python.org/mailman/listinfo/pythonmac-sig >>> But I am not currently sure what to do with it. The Python.mpkg >>> directory seems to contain the sources. It should be possible to unpack a mpkg by hand, but it contains both the contents, and various instal scripts, so that seems like a really ugly solution. -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From matthew.brett at gmail.com Tue Jan 8 11:59:17 2013 From: matthew.brett at gmail.com (Matthew Brett) Date: Tue, 8 Jan 2013 16:59:17 +0000 Subject: [Numpy-discussion] Do we want scalar casting to behave as it does at the moment? In-Reply-To: References: <50E61E29.1020709@astro.uio.no> Message-ID: Hi, On Mon, Jan 7, 2013 at 10:58 PM, Andrew Collette wrote: > Hi, > >> Taking 2) first, in this example: >> >>> return self.f[dataset_name][...] + heightmap >> >> assuming it is not going to upcast, would you rather it overflow than >> raise an error? Why? The second seems more explicit and sensible to >> me. > > Yes, I think this (the 1.5 overflow behavior) was a bit odd, if easy > to understand. > >> For 1) - of course the upcasting in 1.6 is only going to work some of >> the time. For example: >> >> In [2]: np.array([127], dtype=np.int8) * 1000 >> Out[2]: array([-4072], dtype=int16) >> >> So - you'll get something, but there's a reasonable chance you won't >> get what you were expecting. Of course that is true for 1.5 as well, >> but at least the rule there is simpler and so easier - in my opinion - >> to think about. > > Part of what my first example was trying to demonstrate was that the > function author assumed arrays and scalars obeyed the same rules for > addition. > > For example, if data were int8 and heightmap were an int16 array with > a max value of 32767, and the data had a max value in the same spot > with e.g. 10, then the addition would overflow at that position, even > with the int16 result. That's how array addition works in numpy, and > as I understand it that's not slated to change. 
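To keep the alternatives straight, here is a minimal sketch - the outcomes in
the comments are as I understand them from this thread, so treat them as
illustrative rather than definitive:

import numpy as np

a = np.array([1], dtype=np.int8)
b = a + 128

# 1.5-style rule: 128 is silently treated as an int8, so b stays int8 and
# wraps around to array([-127], dtype=int8)
# 1.6-style rule: 128 does not fit in int8, so the result is upcast just far
# enough, giving array([129], dtype=int16)
# proposed rule: the implicit down-cast of 128 to int8 is unsafe, so the
# addition raises an error instead of returning anything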
> > But when we have a scalar of value 32767 (which fits in int16 but not > int8), we are proposing instead to do nothing under the assumption > that it's an error. > > In summary: yes, there are some odd results, but they're consistent > with the rules for addition elsewhere in numpy, and I would prefer > that to treating this case as an error. I think you are voting strongly for the current casting rules, because they make it less obvious to the user that scalars are different from arrays. Returning to the question of 1.5 behavior vs the error - I think you are saying you prefer the 1.5 silent-but-deadly approach to the error, but I think I still haven't grasped why. Maybe someone else can explain it? The holiday has not been good to my brain. Best, Matthew From andrew.collette at gmail.com Tue Jan 8 12:20:53 2013 From: andrew.collette at gmail.com (Andrew Collette) Date: Tue, 8 Jan 2013 10:20:53 -0700 Subject: [Numpy-discussion] Do we want scalar casting to behave as it does at the moment? In-Reply-To: References: <50E61E29.1020709@astro.uio.no> Message-ID: Hi, > I think you are voting strongly for the current casting rules, because > they make it less obvious to the user that scalars are different from > arrays. Maybe this is the source of my confusion... why should scalars be different from arrays? They should follow the same rules, as closely as possible. If a scalar value would fit in an int16, why not add it using the rules for an int16 array? > Returning to the question of 1.5 behavior vs the error - I think you > are saying you prefer the 1.5 silent-but-deadly approach to the error, > but I think I still haven't grasped why. Maybe someone else can > explain it? The holiday has not been good to my brain. In a strict choice between 1.5-behavior and errors, I'm not sure which one I would pick. I don't think either is particularly useful. Of course, other members of the community would likely have a different view, especially those who got used to the 1.5 behavior. Andrew From mail.till at gmx.de Tue Jan 8 13:17:39 2013 From: mail.till at gmx.de (Till Stensitz) Date: Tue, 8 Jan 2013 18:17:39 +0000 (UTC) Subject: [Numpy-discussion] Linear least squares Message-ID: Hi, i did some profiling and testing of my data-fitting code. One of its core parts is doing some linear least squares, until now i used np.linalg.lstsq. Most of time the size a is (250, 7) and of b is (250, 800). Today i compared it to using pinv manually, to my surprise, it is much faster. I taught, both are svd based? Too check another computer i also run my test on wakari: https://www.wakari.io/nb/tillsten/linear_least_squares Also using scipy.linalg instead of np.linalg is slower for both cases. My numpy and scipy are both from C. Gohlkes website. If my result is valid in general, maybe the lstsq function should be changed or a hint should be added to the documentation. greetings Till From shish at keba.be Tue Jan 8 13:48:42 2013 From: shish at keba.be (Olivier Delalleau) Date: Tue, 8 Jan 2013 13:48:42 -0500 Subject: [Numpy-discussion] Do we want scalar casting to behave as it does at the moment? In-Reply-To: References: <50E61E29.1020709@astro.uio.no> Message-ID: 2013/1/8 Andrew Collette : > Hi, > >> I think you are voting strongly for the current casting rules, because >> they make it less obvious to the user that scalars are different from >> arrays. > > Maybe this is the source of my confusion... why should scalars be > different from arrays? They should follow the same rules, as closely > as possible. 
If a scalar value would fit in an int16, why not add it > using the rules for an int16 array? As I mentioned in another post, I also agree that it would make things simpler and safer to just yield the same result as if we were using a one-element array. My understanding of the motivation for the rule "scalars do not upcast arrays unless they are of a fundamentally different type" is that it avoids accidentally upcasting arrays in operations like "x + 1" (for instance if x is a float32 array, the upcast would yield a float64 result, and if x is an int16, it would yield int64), which may waste memory. I find it a useful feature, however I'm not sure it's worth the headaches it can lead to. However, my first reaction at the idea of dropping this rule altogether is that it would lead to a long and painful deprecation process. I may be wrong though, I really haven't thought about it much. -=- Olivier From alan.isaac at gmail.com Tue Jan 8 14:28:51 2013 From: alan.isaac at gmail.com (Alan G Isaac) Date: Tue, 08 Jan 2013 14:28:51 -0500 Subject: [Numpy-discussion] Do we want scalar casting to behave as it does at the moment? In-Reply-To: References: Message-ID: <50EC7373.7010407@gmail.com> On 1/8/2013 1:48 PM, Olivier Delalleau wrote: > As I mentioned in another post, I also agree that it would make things > simpler and safer to just yield the same result as if we were using a > one-element array. Yes! Anything else is going to drive people insane, especially new users. Alan Isaac From njs at pobox.com Tue Jan 8 14:59:16 2013 From: njs at pobox.com (Nathaniel Smith) Date: Tue, 8 Jan 2013 19:59:16 +0000 Subject: [Numpy-discussion] Do we want scalar casting to behave as it does at the moment? In-Reply-To: References: <50E61E29.1020709@astro.uio.no> Message-ID: On 8 Jan 2013 17:24, "Andrew Collette" wrote: > > Hi, > > > I think you are voting strongly for the current casting rules, because > > they make it less obvious to the user that scalars are different from > > arrays. > > Maybe this is the source of my confusion... why should scalars be > different from arrays? They should follow the same rules, as closely > as possible. If a scalar value would fit in an int16, why not add it > using the rules for an int16 array? The problem is that rule for arrays - and for every other party of numpy in general - are that we *don't* pick types based on values. Numpy always uses input types to determine output types, not input values. # This value fits in an int8 In [5]: a = np.array([1]) # And yet... In [6]: a.dtype Out[6]: dtype('int64') In [7]: small = np.array([1], dtype=np.int8) # Computing 1 + 1 doesn't need a large integer... but we use one In [8]: (small + a).dtype Out[8]: dtype('int64') Python scalars have an unambiguous types: a Python 'int' is a C 'long', and a Python 'float' is a C 'double'. And these are the types that np.array() converts them to. So it's pretty unambiguous that "using the same rules for arrays and scalars" would mean, ignore the value of the scalar, and in expressions like np.array([1], dtype=np.int8) + 1 we should always upcast to int32/int64. The problem is that this makes working with narrow types very awkward for no real benefit, so everyone pretty much seems to want *some* kind of special case. These are both absolutely special cases: numarray through 1.5: in a binary operation, if one operand has ndim==0 and the other has ndim>0, ignore the width of the ndim==0 operand. 
1.6, your proposal: in a binary operation, if one operand has ndim==0 and the other has ndim>0, downcast the ndim==0 item to the smallest width that is consistent with its value and the other operand's type. -n From njs at pobox.com Tue Jan 8 15:04:09 2013 From: njs at pobox.com (Nathaniel Smith) Date: Tue, 8 Jan 2013 20:04:09 +0000 Subject: [Numpy-discussion] Do we want scalar casting to behave as it does at the moment? In-Reply-To: <50EC7373.7010407@gmail.com> References: <50EC7373.7010407@gmail.com> Message-ID: On Tue, Jan 8, 2013 at 7:28 PM, Alan G Isaac wrote: > On 1/8/2013 1:48 PM, Olivier Delalleau wrote: >> As I mentioned in another post, I also agree that it would make things >> simpler and safer to just yield the same result as if we were using a >> one-element array. > > Yes! > Anything else is going to drive people insane, > especially new users. New users don't use narrow-width dtypes... it's important to remember in this discussion that in numpy, non-standard dtypes only arise when users explicitly request them, so there's some expressed intention there that we want to try and respect. (As opposed to the type associated with Python manifest constants like the "2" in "2 * a", which probably no programmer looked at and thought "hmm, what I want here is 2-as-an-int64".) -n From alan.isaac at gmail.com Tue Jan 8 15:43:52 2013 From: alan.isaac at gmail.com (Alan G Isaac) Date: Tue, 08 Jan 2013 15:43:52 -0500 Subject: [Numpy-discussion] Do we want scalar casting to behave as it does at the moment? In-Reply-To: References: <50EC7373.7010407@gmail.com> Message-ID: <50EC8508.6090307@gmail.com> On 1/8/2013 3:04 PM, Nathaniel Smith wrote: > New users don't use narrow-width dtypes... it's important to remember > in this discussion that in numpy, non-standard dtypes only arise when > users explicitly request them, so there's some expressed intention > there that we want to try and respect. 1. I think the first statement is wrong. Control over dtypes is a good reason for a new use to consider NumPy. 2. You cannot treat the intention as separate from the rules. Users want to play by the rules. Because NumPy supports broadcasting, it is natural for array-array operations and scalar-array operations to be consistent. I believe anything else will be too confusing. I do not recall an example yet that clearly demonstrates a case where a single user would want two different behaviors for a scalar operation and an analogous broadcasting operation. Was there one? Alan Isaac From andrew.collette at gmail.com Tue Jan 8 16:14:12 2013 From: andrew.collette at gmail.com (Andrew Collette) Date: Tue, 8 Jan 2013 14:14:12 -0700 Subject: [Numpy-discussion] Do we want scalar casting to behave as it does at the moment? In-Reply-To: References: <50E61E29.1020709@astro.uio.no> Message-ID: Hi Nathaniel, (Responding to both your emails) > The problem is that rule for arrays - and for every other party of > numpy in general - are that we *don't* pick types based on values. > Numpy always uses input types to determine output types, not input > values. Yes, of course... array operations are governed exclusively by their dtypes. It seems to me that, using the language of the bug report (2878), if we have this: result = arr + scalar I would argue that our job is, rather than to pick result.dtype, to pick scalar.dtype, and apply the normal rules for array operations. 
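To make that two-step reading concrete, here is a minimal sketch in terms of
existing helpers (np.min_scalar_type and np.result_type). It glosses over
details such as min_scalar_type preferring unsigned types for non-negative
values, so take it as an illustration of the idea rather than a description
of what the C implementation actually does:

import numpy as np

def proposed_dtype(arr, py_scalar):
    # Step 1: pick a dtype for the bare Python scalar from its *value*
    # (e.g. 1000 needs at least a 16-bit integer).
    scalar_dtype = np.min_scalar_type(py_scalar)
    # Step 2: apply the ordinary array/array promotion rules.
    return np.result_type(arr.dtype, scalar_dtype)

a = np.zeros(3, dtype=np.int8)
print(proposed_dtype(a, 1), proposed_dtype(a, 1000))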
> So it's pretty unambiguous that > "using the same rules for arrays and scalars" would mean, ignore the > value of the scalar, and in expressions like > np.array([1], dtype=np.int8) + 1 > we should always upcast to int32/int64. Ah, but that's my point: we already, in 1.6, ignore the intrinsic width of the scalar and effectively substitute one based on it's value: >>> a = np.array([1], dtype=int8) >>> (a + 1).dtype dtype('int8') >>> (a + 1000).dtype dtype('int16') >>> (a + 90000).dtype dtype('int32') >>> (a + 2**40).dtype dtype('int64') > 1.6, your proposal: in a binary operation, if one operand has ndim==0 > and the other has ndim>0, downcast the ndim==0 item to the smallest > width that is consistent with its value and the other operand's type. Yes, exactly. I'm not trying to propose a completely new behavior: as I mentioned (although very far upthread), this is the mental model I had of how things worked in 1.6 already. > New users don't use narrow-width dtypes... it's important to remember > in this discussion that in numpy, non-standard dtypes only arise when > users explicitly request them, so there's some expressed intention > there that we want to try and respect. I would respectfully disagree. One example I cited was that when dealing with HDF5, it's very common to get int16's (and even int8's) when reading from a file because they are used to save disk space. All a new user has to do to get int8's from a file they got from someone else is: >>> data = some_hdf5_file['MyDataset'][...] This is a general issue applying to data which is read from real-world external sources. For example, digitizers routinely represent their samples as int8's or int16's, and you apply a scale and offset to get a reading in volts. As you say, the proposed change will prevent accidental upcasting by people who selected int8/int16 on purpose to save memory, by notifying them with a ValueError. But another assumption we could make is that people who choose to use narrow types for performance reasons should be expected to use caution when performing operations that might upcast, and that the default behavior should be to follow the normal array rules as closely as possible, as is done in 1.6. Andrew From sebastian at sipsolutions.net Tue Jan 8 16:15:38 2013 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Tue, 08 Jan 2013 22:15:38 +0100 Subject: [Numpy-discussion] Do we want scalar casting to behave as it does at the moment? In-Reply-To: References: <50E61E29.1020709@astro.uio.no> Message-ID: <1357679738.2754.3.camel@sebastian-laptop> On Tue, 2013-01-08 at 19:59 +0000, Nathaniel Smith wrote: > On 8 Jan 2013 17:24, "Andrew Collette" wrote: > > > > Hi, > > > > > I think you are voting strongly for the current casting rules, because > > > they make it less obvious to the user that scalars are different from > > > arrays. > > > > Maybe this is the source of my confusion... why should scalars be > > different from arrays? They should follow the same rules, as closely > > as possible. If a scalar value would fit in an int16, why not add it > > using the rules for an int16 array? > > The problem is that rule for arrays - and for every other party of > numpy in general - are that we *don't* pick types based on values. > Numpy always uses input types to determine output types, not input > values. > > # This value fits in an int8 > In [5]: a = np.array([1]) > > # And yet... 
> In [6]: a.dtype > Out[6]: dtype('int64') > > In [7]: small = np.array([1], dtype=np.int8) > > # Computing 1 + 1 doesn't need a large integer... but we use one > In [8]: (small + a).dtype > Out[8]: dtype('int64') > > Python scalars have an unambiguous types: a Python 'int' is a C > 'long', and a Python 'float' is a C 'double'. And these are the types > that np.array() converts them to. So it's pretty unambiguous that > "using the same rules for arrays and scalars" would mean, ignore the > value of the scalar, and in expressions like > np.array([1], dtype=np.int8) + 1 > we should always upcast to int32/int64. The problem is that this makes > working with narrow types very awkward for no real benefit, so > everyone pretty much seems to want *some* kind of special case. These > are both absolutely special cases: > > numarray through 1.5: in a binary operation, if one operand has > ndim==0 and the other has ndim>0, ignore the width of the ndim==0 > operand. > > 1.6, your proposal: in a binary operation, if one operand has ndim==0 > and the other has ndim>0, downcast the ndim==0 item to the smallest > width that is consistent with its value and the other operand's type. > Well, that leaves the maybe not quite implausible proposal of saying that numpy scalars behave like arrays with ndim>0, but python scalars behave like they do in 1.6. to allow for easier working with narrow types. Sebastian > -n > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From chris.barker at noaa.gov Tue Jan 8 16:17:58 2013 From: chris.barker at noaa.gov (Chris Barker - NOAA Federal) Date: Tue, 8 Jan 2013 13:17:58 -0800 Subject: [Numpy-discussion] Do we want scalar casting to behave as it does at the moment? In-Reply-To: <50EC8508.6090307@gmail.com> References: <50EC7373.7010407@gmail.com> <50EC8508.6090307@gmail.com> Message-ID: On Tue, Jan 8, 2013 at 12:43 PM, Alan G Isaac wrote: >> New users don't use narrow-width dtypes... it's important to remember > 1. I think the first statement is wrong. > Control over dtypes is a good reason for > a new use to consider NumPy. Absolutely. > Because NumPy supports broadcasting, > it is natural for array-array operations and > scalar-array operations to be consistent. > I believe anything else will be too confusing. Theoretically true -- but in practice, the problem arrises because it is easy to write literals with the standard python scalars, so one is very likely to want to do: arr = np.zeros((m,n), dtype=np.uint8) arr += 3 and not want an upcast. I don't think we want to require that to be spelled: arr += np.array(3, dtype=np.uint8) so that defines desired behaviour for array<->scalar. but what should this do? arr1 = np.zeros((m,n), dtype=np.uint8) arr2 = np.zeros((m,n), dtype=np.uint16) arr1 + arr2 or arr2 + arr1 upcast in both cases? use the type of the left operand? raise an exception? matching the array<-> scalar approach would mean always keeping the smallest type, which is unlikely to be what is wanted. Having it be dependent on order would be really ripe fro confusion. raising an exception might have been the best idea from the beginning. (though I wouldn't want that in the array<-> scalar case). So perhaps having a scalar array distinction, while quite impure, is the best compromise. 
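For what it's worth, the array<->array case above is at least symmetric as
far as I can tell -- both orders upcast to the wider type -- which is quick
to check:

import numpy as np

arr1 = np.zeros((3, 4), dtype=np.uint8)
arr2 = np.zeros((3, 4), dtype=np.uint16)

# both operand orders give the same upcast result
print((arr1 + arr2).dtype, (arr2 + arr1).dtype)

# the promotion can also be queried without doing any arithmetic
print(np.promote_types(np.uint8, np.uint16))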
NOTE: no matter how you slice it, at some point reducing operations produce something different (that can no longer be reduced), so I do think it would be nice for rank-zero arrays and scalars to be the same thing (in this regard and others) -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From shish at keba.be Tue Jan 8 16:24:52 2013 From: shish at keba.be (Olivier Delalleau) Date: Tue, 8 Jan 2013 16:24:52 -0500 Subject: [Numpy-discussion] Do we want scalar casting to behave as it does at the moment? In-Reply-To: <1357679738.2754.3.camel@sebastian-laptop> References: <50E61E29.1020709@astro.uio.no> <1357679738.2754.3.camel@sebastian-laptop> Message-ID: 2013/1/8 Sebastian Berg : > On Tue, 2013-01-08 at 19:59 +0000, Nathaniel Smith wrote: >> On 8 Jan 2013 17:24, "Andrew Collette" wrote: >> > >> > Hi, >> > >> > > I think you are voting strongly for the current casting rules, because >> > > they make it less obvious to the user that scalars are different from >> > > arrays. >> > >> > Maybe this is the source of my confusion... why should scalars be >> > different from arrays? They should follow the same rules, as closely >> > as possible. If a scalar value would fit in an int16, why not add it >> > using the rules for an int16 array? >> >> The problem is that rule for arrays - and for every other party of >> numpy in general - are that we *don't* pick types based on values. >> Numpy always uses input types to determine output types, not input >> values. >> >> # This value fits in an int8 >> In [5]: a = np.array([1]) >> >> # And yet... >> In [6]: a.dtype >> Out[6]: dtype('int64') >> >> In [7]: small = np.array([1], dtype=np.int8) >> >> # Computing 1 + 1 doesn't need a large integer... but we use one >> In [8]: (small + a).dtype >> Out[8]: dtype('int64') >> >> Python scalars have an unambiguous types: a Python 'int' is a C >> 'long', and a Python 'float' is a C 'double'. And these are the types >> that np.array() converts them to. So it's pretty unambiguous that >> "using the same rules for arrays and scalars" would mean, ignore the >> value of the scalar, and in expressions like >> np.array([1], dtype=np.int8) + 1 >> we should always upcast to int32/int64. The problem is that this makes >> working with narrow types very awkward for no real benefit, so >> everyone pretty much seems to want *some* kind of special case. These >> are both absolutely special cases: >> >> numarray through 1.5: in a binary operation, if one operand has >> ndim==0 and the other has ndim>0, ignore the width of the ndim==0 >> operand. >> >> 1.6, your proposal: in a binary operation, if one operand has ndim==0 >> and the other has ndim>0, downcast the ndim==0 item to the smallest >> width that is consistent with its value and the other operand's type. >> > > Well, that leaves the maybe not quite implausible proposal of saying > that numpy scalars behave like arrays with ndim>0, but python scalars > behave like they do in 1.6. to allow for easier working with narrow > types. I know I already said it, but I really think it'd be a bad idea to have a different behavior between Python scalars and Numpy scalars, because I think most people would expect them to behave the same (when knowing what dtype is a Python float / int). It could lead to very tricky bugs to handle them differently. 
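A quick way to probe, on any given numpy version, whether the two kinds of
scalars are promoted the same way (just a check, not a claim about what the
rules should be):

import numpy as np

a = np.zeros(3, dtype=np.int8)

# plain Python int vs. numpy scalars with an explicit width
print((a + 1000).dtype)
print((a + np.int16(1000)).dtype)
print((a + np.int64(1000)).dtype)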
-=- Olivier From shish at keba.be Tue Jan 8 16:29:45 2013 From: shish at keba.be (Olivier Delalleau) Date: Tue, 8 Jan 2013 16:29:45 -0500 Subject: [Numpy-discussion] Do we want scalar casting to behave as it does at the moment? In-Reply-To: References: <50EC7373.7010407@gmail.com> <50EC8508.6090307@gmail.com> Message-ID: 2013/1/8 Chris Barker - NOAA Federal : > On Tue, Jan 8, 2013 at 12:43 PM, Alan G Isaac wrote: >>> New users don't use narrow-width dtypes... it's important to remember > >> 1. I think the first statement is wrong. >> Control over dtypes is a good reason for >> a new use to consider NumPy. > > Absolutely. > >> Because NumPy supports broadcasting, >> it is natural for array-array operations and >> scalar-array operations to be consistent. >> I believe anything else will be too confusing. > > Theoretically true -- but in practice, the problem arrises because it > is easy to write literals with the standard python scalars, so one is > very likely to want to do: > > arr = np.zeros((m,n), dtype=np.uint8) > arr += 3 > > and not want an upcast. Note that the behavior with in-place operations is also an interesting topic, but slightly different, since there is no ambiguity on the dtype of the output (which is required to match that of the input). I was actually thinking about this earlier today but decided not to mention it yet to avoid making the discussion even more complex ;) The key question is whether the operand should be cast before the operation, or whether to perform the operation in an upcasted array, then downcast it back into the original version. I actually thnk the latter makes more sense (and that's actually what's being done I think in 1.6.1 from a few tests I tried), and to me this is an argument in favor of the upcast behavior for non-inplace operations. -=- Olivier From d.s.seljebotn at astro.uio.no Tue Jan 8 16:32:42 2013 From: d.s.seljebotn at astro.uio.no (Dag Sverre Seljebotn) Date: Tue, 08 Jan 2013 22:32:42 +0100 Subject: [Numpy-discussion] Do we want scalar casting to behave as it does at the moment? In-Reply-To: References: Message-ID: <50EC907A.8040003@astro.uio.no> On 01/08/2013 06:20 PM, Andrew Collette wrote: > Hi, > >> I think you are voting strongly for the current casting rules, because >> they make it less obvious to the user that scalars are different from >> arrays. > > Maybe this is the source of my confusion... why should scalars be > different from arrays? They should follow the same rules, as closely Scalars (as in, Python float/int) are inherently different because the user didn't specify a dtype. For an array, there was always *some* point where the user chose, explicitly or implicitly, a dtype. > as possible. If a scalar value would fit in an int16, why not add it > using the rules for an int16 array? So you are saying that, for an array x, you want x + random.randint(100000) to produce an array with a random dtype? So that after carefully testing that your code works, suddenly a different draw (or user input, or whatever) causes a different set of dtypes to ripple through your entire program? To me this is something that must be avoided at all costs. It's hard enough to reason about the code one writes without throwing in complete randomness (by which I mean, types determined by values). Dag Sverre > >> Returning to the question of 1.5 behavior vs the error - I think you >> are saying you prefer the 1.5 silent-but-deadly approach to the error, >> but I think I still haven't grasped why. Maybe someone else can >> explain it? 
The holiday has not been good to my brain. > > In a strict choice between 1.5-behavior and errors, I'm not sure which > one I would pick. I don't think either is particularly useful. Of > course, other members of the community would likely have a different > view, especially those who got used to the 1.5 behavior. > > Andrew > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From d.s.seljebotn at astro.uio.no Tue Jan 8 16:37:51 2013 From: d.s.seljebotn at astro.uio.no (Dag Sverre Seljebotn) Date: Tue, 08 Jan 2013 22:37:51 +0100 Subject: [Numpy-discussion] Do we want scalar casting to behave as it does at the moment? In-Reply-To: <50EC907A.8040003@astro.uio.no> References: <50EC907A.8040003@astro.uio.no> Message-ID: <50EC91AF.8030908@astro.uio.no> On 01/08/2013 10:32 PM, Dag Sverre Seljebotn wrote: > On 01/08/2013 06:20 PM, Andrew Collette wrote: >> Hi, >> >>> I think you are voting strongly for the current casting rules, because >>> they make it less obvious to the user that scalars are different from >>> arrays. >> >> Maybe this is the source of my confusion... why should scalars be >> different from arrays? They should follow the same rules, as closely > > Scalars (as in, Python float/int) are inherently different because the > user didn't specify a dtype. > > For an array, there was always *some* point where the user chose, > explicitly or implicitly, a dtype. > >> as possible. If a scalar value would fit in an int16, why not add it >> using the rules for an int16 array? > > So you are saying that, for an array x, you want > > x + random.randint(100000) > > to produce an array with a random dtype? > > So that after carefully testing that your code works, suddenly a > different draw (or user input, or whatever) causes a different set of > dtypes to ripple through your entire program? > > To me this is something that must be avoided at all costs. It's hard > enough to reason about the code one writes without throwing in complete > randomness (by which I mean, types determined by values). Oh, sorry, given that this is indeed the present behaviour, this just sounds silly. I should have said it's something I dislike about the present behaviour then. Dag Sverre > > Dag Sverre > > > > >> >>> Returning to the question of 1.5 behavior vs the error - I think you >>> are saying you prefer the 1.5 silent-but-deadly approach to the error, >>> but I think I still haven't grasped why. Maybe someone else can >>> explain it? The holiday has not been good to my brain. >> >> In a strict choice between 1.5-behavior and errors, I'm not sure which >> one I would pick. I don't think either is particularly useful. Of >> course, other members of the community would likely have a different >> view, especially those who got used to the 1.5 behavior. >> >> Andrew >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From andrew.collette at gmail.com Tue Jan 8 17:30:33 2013 From: andrew.collette at gmail.com (Andrew Collette) Date: Tue, 8 Jan 2013 15:30:33 -0700 Subject: [Numpy-discussion] Do we want scalar casting to behave as it does at the moment? 
In-Reply-To: <50EC907A.8040003@astro.uio.no> References: <50EC907A.8040003@astro.uio.no> Message-ID: Hi Dag, > So you are saying that, for an array x, you want > > x + random.randint(100000) > > to produce an array with a random dtype? Under the proposed behavior, depending on the dtype of x and the value from random, this would sometimes add-with-rollover and sometimes raise ValueError. Under the 1.5 behavior, it would always add-with-rollover and preserve the type of x. Under the 1.6 behavior, it produces a range of dtypes, each of which is at least large enough to hold the random int. Personally, I prefer the third option, but I strongly prefer either the second or the third to the first. Andrew From josef.pktd at gmail.com Tue Jan 8 17:35:12 2013 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Tue, 8 Jan 2013 17:35:12 -0500 Subject: [Numpy-discussion] Linear least squares In-Reply-To: References: Message-ID: On Tue, Jan 8, 2013 at 1:17 PM, Till Stensitz wrote: > Hi, > i did some profiling and testing of my data-fitting code. > One of its core parts is doing some linear least squares, > until now i used np.linalg.lstsq. Most of time the size > a is (250, 7) and of b is (250, 800). My guess is that this depends a lot on the shape try a is (10000, 7) and b is (10000, 1) Josef > > Today i compared it to using pinv manually, > to my surprise, it is much faster. I taught, > both are svd based? Too check another computer > i also run my test on wakari: > > https://www.wakari.io/nb/tillsten/linear_least_squares > > Also using scipy.linalg instead of np.linalg is > slower for both cases. My numpy and scipy > are both from C. Gohlkes website. If my result > is valid in general, maybe the lstsq function > should be changed or a hint should be added > to the documentation. > > greetings > Till > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From shish at keba.be Tue Jan 8 17:41:34 2013 From: shish at keba.be (Olivier Delalleau) Date: Tue, 8 Jan 2013 17:41:34 -0500 Subject: [Numpy-discussion] Do we want scalar casting to behave as it does at the moment? In-Reply-To: References: <50EC907A.8040003@astro.uio.no> Message-ID: Le mardi 8 janvier 2013, Andrew Collette a ?crit : > Hi Dag, > > > So you are saying that, for an array x, you want > > > > x + random.randint(100000) > > > > to produce an array with a random dtype? > > Under the proposed behavior, depending on the dtype of x and the value > from random, this would sometimes add-with-rollover and sometimes > raise ValueError. > > Under the 1.5 behavior, it would always add-with-rollover and preserve > the type of x. > > Under the 1.6 behavior, it produces a range of dtypes, each of which > is at least large enough to hold the random int. > > Personally, I prefer the third option, but I strongly prefer either > the second or the third to the first. > > Andrew > Keep in mind that in the third option (current 1.6 behavior) the dtype is large enough to hold the random number, but not necessarily to hold the result. So for instance if x is an int16 array with only positive values, the result of this addition may contain negative values (or not, depending on the number being drawn). That's the part I feel is flawed with this behavior, it is quite unpredictable. -=- Olivier -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From andrew.collette at gmail.com Tue Jan 8 17:51:28 2013 From: andrew.collette at gmail.com (Andrew Collette) Date: Tue, 8 Jan 2013 15:51:28 -0700 Subject: [Numpy-discussion] Do we want scalar casting to behave as it does at the moment? In-Reply-To: References: <50EC907A.8040003@astro.uio.no> Message-ID: Hi, > Keep in mind that in the third option (current 1.6 behavior) the dtype is > large enough to hold the random number, but not necessarily to hold the > result. So for instance if x is an int16 array with only positive values, > the result of this addition may contain negative values (or not, depending > on the number being drawn). That's the part I feel is flawed with this > behavior, it is quite unpredictable. Yes, certainly. But in either the proposed or 1.5 behavior, if the values in x are close to the limits of the type, this can happen also. Andrew From shish at keba.be Tue Jan 8 18:36:03 2013 From: shish at keba.be (Olivier Delalleau) Date: Tue, 8 Jan 2013 18:36:03 -0500 Subject: [Numpy-discussion] Do we want scalar casting to behave as it does at the moment? In-Reply-To: References: <50EC907A.8040003@astro.uio.no> Message-ID: Le mardi 8 janvier 2013, Andrew Collette a ?crit : > Hi, > > > Keep in mind that in the third option (current 1.6 behavior) the dtype is > > large enough to hold the random number, but not necessarily to hold the > > result. So for instance if x is an int16 array with only positive values, > > the result of this addition may contain negative values (or not, > depending > > on the number being drawn). That's the part I feel is flawed with this > > behavior, it is quite unpredictable. > > Yes, certainly. But in either the proposed or 1.5 behavior, if the > values in x are close to the limits of the type, this can happen also. > My previous email may not have been clear enough, so to be sure: in my above example, if the random number is 30000, then the result may contain negative values (int16). If the random number is 50000, then the result will only contain positive values (upcast to int32). Do you believe it is a good behavior? -=- Olivier -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Tue Jan 8 18:47:06 2013 From: njs at pobox.com (Nathaniel Smith) Date: Tue, 8 Jan 2013 23:47:06 +0000 Subject: [Numpy-discussion] Linear least squares In-Reply-To: References: Message-ID: On Tue, Jan 8, 2013 at 6:17 PM, Till Stensitz wrote: > Hi, > i did some profiling and testing of my data-fitting code. > One of its core parts is doing some linear least squares, > until now i used np.linalg.lstsq. Most of time the size > a is (250, 7) and of b is (250, 800). > > Today i compared it to using pinv manually, > to my surprise, it is much faster. I taught, > both are svd based? np.linalg.lstsq is written in Python (calling LAPACK for the SVD), so you could run the line_profiler over it and see where the slowdown is. An obvious thing is that it always computes residuals, which could be costly; if your pinv code isn't doing that then it's not really comparable. (Though might still be well-suited for your actual problem.) Depending on how well-conditioned your problems are, and how much speed you need, there are faster ways than pinv as well. (Going via qr might or might not, going via cholesky almost certainly will be.) 
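For reference, a rough sketch of the three routes on a problem of that shape
-- meant only as a benchmarking aid, and the normal-equations/Cholesky route
assumes the design matrix is well conditioned, since forming a.T*a squares
the condition number:

import numpy as np
from scipy.linalg import cho_factor, cho_solve

a = np.random.rand(250, 7)
b = np.random.rand(250, 800)

# 1) current approach (also returns summed residuals, rank, singular values)
x1 = np.linalg.lstsq(a, b)[0]

# 2) pseudo-inverse: one SVD of a, no residual computation
x2 = np.dot(np.linalg.pinv(a), b)

# 3) normal equations + Cholesky
c_and_lower = cho_factor(np.dot(a.T, a))
x3 = cho_solve(c_and_lower, np.dot(a.T, b))

# the per-point residuals needed for optimize.leastsq are the same cheap
# extra step whichever route is used
resid = b - np.dot(a, x3)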
-n From andrew.collette at gmail.com Tue Jan 8 19:35:42 2013 From: andrew.collette at gmail.com (Andrew Collette) Date: Tue, 8 Jan 2013 17:35:42 -0700 Subject: [Numpy-discussion] Do we want scalar casting to behave as it does at the moment? In-Reply-To: References: <50EC907A.8040003@astro.uio.no> Message-ID: Hi Olivier, >> Yes, certainly. But in either the proposed or 1.5 behavior, if the >> values in x are close to the limits of the type, this can happen also. > > > My previous email may not have been clear enough, so to be sure: in my above > example, if the random number is 30000, then the result may contain negative > values (int16). If the random number is 50000, then the result will only > contain positive values (upcast to int32). Do you believe it is a good > behavior? Under the proposed behavior, if the random number is 30000, then you *still* may have negative values, and if it's 50000, you get ValueError. No, I don't think the behavior you outlined is particularly nice, but (1) it's consistent with array addition elsewhere, at least in my mind, and (2) I don't think that sometimes getting a ValueError is a big improvement. Although I still prefer automatic upcasting, this discussion is really making me see the value of a nice, simple rule like in 1.5. :) Andrew From charlesr.harris at gmail.com Wed Jan 9 00:11:30 2013 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 8 Jan 2013 22:11:30 -0700 Subject: [Numpy-discussion] Linear least squares In-Reply-To: References: Message-ID: On Tue, Jan 8, 2013 at 11:17 AM, Till Stensitz wrote: > Hi, > i did some profiling and testing of my data-fitting code. > One of its core parts is doing some linear least squares, > until now i used np.linalg.lstsq. Most of time the size > a is (250, 7) and of b is (250, 800). > > Today i compared it to using pinv manually, > to my surprise, it is much faster. I taught, > both are svd based? Too check another computer > i also run my test on wakari: > > https://www.wakari.io/nb/tillsten/linear_least_squares > > Also using scipy.linalg instead of np.linalg is > slower for both cases. My numpy and scipy > are both from C. Gohlkes website. If my result > is valid in general, maybe the lstsq function > should be changed or a hint should be added > to the documentation. > > Do you know if both are using Atlas (MKL)? Numpy will compile a default unoptimized version if there is no Atlas (or MKL). Also, lstsq is a direct call to an LAPACK least squares function, so the underlying functions themselves are probably different for lstsq and pinv. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From brenbarn at brenbarn.net Wed Jan 9 02:23:05 2013 From: brenbarn at brenbarn.net (OKB (not okblacke)) Date: Wed, 9 Jan 2013 07:23:05 +0000 (UTC) Subject: [Numpy-discussion] Bug with ufuncs made with frompyfunc Message-ID: A bug causing errors with using methods of ufuncs created with frompyfunc was mentioned on the list over a year ago: http://mail.scipy.org/pipermail/numpy-discussion/2011- September/058501.html Is there any word on the status of this bug? I wasn't able to find a ticket in the bug tracker. 
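For anyone who wants to check a particular numpy version, this is the kind of
usage in question -- a two-argument ufunc built with frompyfunc, followed by
calls to its methods, which is the part reported to fail (whether it errors
will depend on the version):

import numpy as np

def py_add(x, y):
    return x + y

# wrap the Python function as a ufunc taking 2 inputs, returning 1 output
uadd = np.frompyfunc(py_add, 2, 1)

# calling the ufunc itself is fine; the reported problem is with its methods
print(uadd(np.arange(4), 10))
print(uadd.outer(np.arange(3), np.arange(3)))
print(uadd.reduce(np.arange(5)))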
From mail.till at gmx.de Wed Jan 9 03:29:20 2013 From: mail.till at gmx.de (Till Stensitz) Date: Wed, 9 Jan 2013 08:29:20 +0000 (UTC) Subject: [Numpy-discussion] Linear least squares References: Message-ID: Nathaniel Smith pobox.com> writes: > > An obvious thing is that it always computes residuals, which could be > costly; if your pinv code isn't doing that then it's not really > comparable. (Though might still be well-suited for your actual > problem.) > > Depending on how well-conditioned your problems are, and how much > speed you need, there are faster ways than pinv as well. (Going via qr > might or might not, going via cholesky almost certainly will be.) > > -n > You are right. With calculating the residuals, the speedup goes down to a factor of 2. I had to calculate the residuals anyways because lstsq only returns the squared sum of the residuals, while i need every residual (as an input to optimize.leastsq). Josef is also right, it is shape depended. For his example, lstsq is faster. Maybe it is possible to make lstsq to choose its method automatically? Or some keyword to set the method and making other decompositions available. From mike.r.anderson.13 at gmail.com Wed Jan 9 05:35:29 2013 From: mike.r.anderson.13 at gmail.com (Mike Anderson) Date: Wed, 9 Jan 2013 18:35:29 +0800 Subject: [Numpy-discussion] Insights / lessons learned from NumPy design In-Reply-To: References: Message-ID: On 8 January 2013 02:08, Chris Barker - NOAA Federal wrote: > On Thu, Jan 3, 2013 at 10:29 PM, Mike Anderson > wrote: > > In the Clojure community there has been some discussion about creating a > > common matrix maths library / API. Currently there are a few different > > fledgeling matrix libraries in Clojure, so it seemed like a worthwhile > > effort to unify them and have a common base on which to build on. > > > > NumPy has been something of an inspiration for this, so I though I'd ask > > here to see what lessons have been learned. > > A few thoughts: > > > We're thinking of a matrix library > > First -- is this a "matrix" library, or a general use nd-array > library? That will drive your design a great deal. For my part, I came > from MATLAB, which started our very focused on matrixes, then extended > to be more generally useful. Personally, I found the matrix-focus to > get in the way more than help -- in any "real" code, you're the actual > matrix operations are likely to be a tiny fraction of the code. > > One reason I like numpy is that it is array-first, with secondary > support for matrix stuff. > > That being said, there is the numpy matrix type, and there are those > that find it very useful. particularly in teaching situations, though > it feels a bit "tacked-on", and that does get in the way, so if you > want a "real" matrix object, but also a general purpose array lib, > thinking about both up front will be helpful. > This is very useful context - thanks! I've had opinions in favour of both an nd-array style library and a matrix library. I guess it depends on your use case which one you are more inclined to think in. I'm hoping that it should be possible for the same API to support both, i.e. you should be able to use a 2D array of numbers as a matrix, and vice-versa. > > > - Support for multi-dimensional matrices (but with fast paths for 1D > vectors > > and 2D matrices as the common cases) > > what is a multi-dimensional matrix? -- is a 3-d something, a stack of > matrixes? or something else? 
(note, numpy lacks this kind of object, > but it is sometimes asked for -- i.e a way to do fast matrix > multiplication with a lot of small matrixes) > > I think fast paths for 1-D and 2-D is secondary, though you may want > "easy paths" for those. IN particular, if you want good support for > linear algebra (matrixes), then having a clean and natural "row vector > and "column vector" would be nice. See the archives of this list for > a bunch of discussion about that -- and what the weaknesses are of the > numpy matrix object. > > > - Immutability by default, i.e. matrix operations are pure functions that > > create new matrices. > > I'd be careful about this -- the purity and predictability is nice, > but these days a lot of time is spend allocating and moving memory > around -- numpy array's mutability is a major key feature -- indeed, > the key issues with performance with numpy surrond the fact that many > copies may be made unnecessarily (note, Dag's suggesting of lazy > evaluation may mitigate this to some extent). > Interesting and very useful to know. Sounds like we should definitely allow for mutable arrays / zero-copy operations in that case if that is proving to be a big bottleneck. > > > - Support for 64-bit double precision floats only (this is the standard > > float type in Clojure) > > not a bad start, but another major strength of numpy is the multiple > data types - you may wantt to design that concept in from the start. > Sounds like good advice and that should be possible to accomodate in the design. But I'm curious: what is the main use case for the alternative data types in NumPy? Is it for columns of data of heterogeneous types? or something else? > > > - Ability to support multiple different back-end matrix implementations > > (JBLAS, Colt, EJML, Vectorz, javax.vecmath etc.) > > This ties in to another major strength of numpy -- ndarrays are both > powerful python objects, and wrappers around standard C arrays -- that > makes it pretty darn easy to interface with external libraries for > core computation. Great - good to know we are on the right track with this one. Thanks Chris for all your comments / suggestions! -------------- next part -------------- An HTML attachment was scrubbed... URL: From mike.r.anderson.13 at gmail.com Wed Jan 9 05:49:06 2013 From: mike.r.anderson.13 at gmail.com (Mike Anderson) Date: Wed, 9 Jan 2013 18:49:06 +0800 Subject: [Numpy-discussion] Insights / lessons learned from NumPy design In-Reply-To: <50E68C2A.9060400@astro.uio.no> References: <50E68C2A.9060400@astro.uio.no> Message-ID: On 4 January 2013 16:00, Dag Sverre Seljebotn wrote: > On 01/04/2013 07:29 AM, Mike Anderson wrote: > > Hello all, > > > > In the Clojure community there has been some discussion about creating a > > common matrix maths library / API. Currently there are a few different > > fledgeling matrix libraries in Clojure, so it seemed like a worthwhile > > effort to unify them and have a common base on which to build on. > > > > NumPy has been something of an inspiration for this, so I though I'd ask > > here to see what lessons have been learned. > > > > We're thinking of a matrix library with roughly the following design > > (subject to change!) 
> > - Support for multi-dimensional matrices (but with fast paths for 1D > > vectors and 2D matrices as the common cases) > > Food for thought: Myself I have vectors that are naturally stored in 2D, > "matrices" that can be naturally stored in 4D and so on (you can't view > them that way when doing linear algebra, it's just that the indices can > have multiple components) -- I like that NumPy calls everything "array"; > I think vector and matrix are higher-level mathematical concepts. > Very interesting. Can I ask what the application is? And is it equivalent from a mathematical perspective to flattening the 2D vectors into very long 1D vectors? > > > - Immutability by default, i.e. matrix operations are pure functions > > that create new matrices. There could be a "backdoor" option to mutate > > matrices, but that would be unidiomatic in Clojure > > Sounds very promising (assuming you can reuse the buffer if the input > matrix had no other references and is not used again?). It's very common > for NumPy arrays to fill a large chunk of the available memory (think > 20-100 GB), so for those users this would need to be coupled with buffer > reuse and good diagnostics that help remove references to old > generations of a matrix. > Yes it should be possible to re-use buffers, though to some extent that would depend on the underlying matrix library implementation. The JVM makes things a bit interesting here - the GC is extremely good but it doesn't play particularly nicely with non-Java native code. 20-100GB is pretty ambitious and I guess reflects the maturity of NumPy - I'd be happy with good handling of 100MB matrices right now..... > > > - Support for 64-bit double precision floats only (this is the standard > > float type in Clojure) > > - Ability to support multiple different back-end matrix implementations > > (JBLAS, Colt, EJML, Vectorz, javax.vecmath etc.) > > - A full range of matrix operations. Operations would be delegated to > > back end implementations where they are supported, otherwise generic > > implementations could be used. > > > > Any thoughts on this topic based on the NumPy experience? In particular > > would be very interesting to know: > > - Features in NumPy which proved to be redundant / not worth the effort > > - Features that you wish had been designed in at the start > > - Design decisions that turned out to be a particularly big mistake / > > success > > > > Would love to hear your insights, any ideas+advice greatly appreciated! > > Travis Oliphant noted some of his thoughts on this in the recent thread > "DARPA funding for Blaze and passing the NumPy torch" which is a must-read. > Great link. Thanks for this and all your other comments! -------------- next part -------------- An HTML attachment was scrubbed... URL: From mike.r.anderson.13 at gmail.com Wed Jan 9 05:57:27 2013 From: mike.r.anderson.13 at gmail.com (Mike Anderson) Date: Wed, 9 Jan 2013 18:57:27 +0800 Subject: [Numpy-discussion] Insights / lessons learned from NumPy design In-Reply-To: <50E68F12.90804@astro.uio.no> References: <50E68C2A.9060400@astro.uio.no> <50E68F12.90804@astro.uio.no> Message-ID: On 4 January 2013 16:13, Dag Sverre Seljebotn wrote: > On 01/04/2013 09:00 AM, Dag Sverre Seljebotn wrote: > > On 01/04/2013 07:29 AM, Mike Anderson wrote: > > > Oh: Depending on your amibitions, it's worth thinking hard about i) > storage format, and ii) lazy evaluation. 
> > Storage format: The new trend is for more flexible formats than just > column-major/row-major, e.g., storing cache-sized n-dimensional tiles. > I'm hoping the API will be independent of storage format - i.e. the underlying implementations can store the data any way they like. So the API will be written in terms of abstractions, and the user will have the choice of whatever concrete implementation best fits the specific needs. Sparse matrices, tiled matrices etc. should all be possible options. Has this kind of approach been used much with NumPy? > > Lazy evaluation: The big problem with numpy is that "a + b + np.sqrt(c)" > will first make a temporary result for "a + b", rather than doing the > whole expression on the fly, which is *very* bad for performance. > > So if you want immutability, I urge you to consider every operation to > build up an expression tree/"program", and then either find out the > smart points where you interpret that program automatically, or make > explicit eval() of an expression tree the default mode. > Very interesting. Seems like this could be layered on top though? i.e. have a separate DSL for building up the expression tree, then compile this down to the optimal set of underlying operations? > > Of course this depends all on how ambitious you are. > A little ambitious, though mostly I'll be glad to get something working that people find useful :-) Thanks again for your comments Dag! -------------- next part -------------- An HTML attachment was scrubbed... URL: From d.s.seljebotn at astro.uio.no Wed Jan 9 07:30:24 2013 From: d.s.seljebotn at astro.uio.no (Dag Sverre Seljebotn) Date: Wed, 09 Jan 2013 13:30:24 +0100 Subject: [Numpy-discussion] Insights / lessons learned from NumPy design In-Reply-To: References: <50E68C2A.9060400@astro.uio.no> Message-ID: <50ED62E0.1010502@astro.uio.no> On 01/09/2013 11:49 AM, Mike Anderson wrote: > On 4 January 2013 16:00, Dag Sverre Seljebotn > > wrote: > > On 01/04/2013 07:29 AM, Mike Anderson wrote: > > Hello all, > > > > In the Clojure community there has been some discussion about > creating a > > common matrix maths library / API. Currently there are a few > different > > fledgeling matrix libraries in Clojure, so it seemed like a > worthwhile > > effort to unify them and have a common base on which to build on. > > > > NumPy has been something of an inspiration for this, so I though > I'd ask > > here to see what lessons have been learned. > > > > We're thinking of a matrix library with roughly the following design > > (subject to change!) > > - Support for multi-dimensional matrices (but with fast paths for 1D > > vectors and 2D matrices as the common cases) > > Food for thought: Myself I have vectors that are naturally stored in 2D, > "matrices" that can be naturally stored in 4D and so on (you can't view > them that way when doing linear algebra, it's just that the indices can > have multiple components) -- I like that NumPy calls everything "array"; > I think vector and matrix are higher-level mathematical concepts. > > > Very interesting. Can I ask what the application is? And is it > equivalent from a mathematical perspective to flattening the 2D vectors > into very long 1D vectors? For instance, if you are solving an equation for one value per grid point on a 2D or 3D grid. 
In PDE problems this occurs all the time, though normally the flattening is treated explicitly before one gets to solving the equation, and when not a reshape operation like you say is usually OK (but the very concept for flattening/reshaping is something that's inherent to arrays, not matrices). Chris also mentioned the case where you have lots of small matrices (say, A[i,j,k] is element (i,j) in matrix k), and you want to multiply all matrices by the same vector, or all matrices by different vectors, and so on. > > - Immutability by default, i.e. matrix operations are pure functions > > that create new matrices. There could be a "backdoor" option to > mutate > > matrices, but that would be unidiomatic in Clojure > > Sounds very promising (assuming you can reuse the buffer if the input > matrix had no other references and is not used again?). It's very common > for NumPy arrays to fill a large chunk of the available memory (think > 20-100 GB), so for those users this would need to be coupled with buffer > reuse and good diagnostics that help remove references to old > generations of a matrix. > > > Yes it should be possible to re-use buffers, though to some extent that > would depend on the underlying matrix library implementation. The JVM > makes things a bit interesting here - the GC is extremely good but it > doesn't play particularly nicely with non-Java native code. My hunch is that you rely on the GC I think you'll get nowhere (though if you're happy to treat 100 MB matrices then that may not matter so much). > 20-100GB is pretty ambitious and I guess reflects the maturity of NumPy > - I'd be happy with good handling of 100MB matrices right now..... Still, if you copy 100 MB every time you assign to a single element, performance won't be stellar to say the least. I don't know Clojure but I'm thinking that an immutable design would be something like b = a but with 1.0 in position (0, 3) c = b + (3.2 in position (3, 4) however you want to express that syntax-wise. Pasting in your other post: On 01/09/2013 11:57 AM, Mike Anderson wrote:> On 4 January 2013 16:13, > I'm hoping the API will be independent of storage format - i.e. the > underlying implementations can store the data any way they like. So the > API will be written in terms of abstractions, and the user will have the > choice of whatever concrete implementation best fits the specific needs. > Sparse matrices, tiled matrices etc. should all be possible options. > > Has this kind of approach been used much with NumPy? No, NumPy only supports strided arrays. SciPy has sparse matrices using a different API (which is a pain point). > Lazy evaluation: The big problem with numpy is that "a + b + np.sqrt(c)" > will first make a temporary result for "a + b", rather than doing the > whole expression on the fly, which is *very* bad for performance. > > So if you want immutability, I urge you to consider every operation to > build up an expression tree/"program", and then either find out the > smart points where you interpret that program automatically, or make > explicit eval() of an expression tree the default mode. > > > Very interesting. Seems like this could be layered on top though? i.e. > have a separate DSL for building up the expression tree, then compile > this down to the optimal set of underlying operations? That's what Theano/Numexpr does on NumPy. But it does mean that users have to deal with 2-3 different APIs and ways of doing things rather than one. 
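As a concrete picture of that "second API", this is roughly what the numexpr
route looks like -- just a sketch, assuming numexpr is installed:

import numpy as np
import numexpr as ne

a = np.random.rand(1000000)
b = np.random.rand(1000000)
c = np.random.rand(1000000)

# plain numpy: builds a full-size temporary for a + b, then another one
# when np.sqrt(c) is added
r1 = a + b + np.sqrt(c)

# numexpr: the whole expression is handed over as a string and evaluated
# in one blocked pass over the inputs, without full-size temporaries
r2 = ne.evaluate("a + b + sqrt(c)")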
If you start out with only the lazy API then users only have to use 1 API everywhere, instead of 2-3. If you want immutability, I don't think you can get around laziness, because it will allow you to have a "journal" rather than copying the 100 MB array all the time. I.e., my example would become b = a but with 1.0 in position (0, 3) c = b + (3.2 in position (3, 4) print c[0,0] # looks up memory in a print c[3,4] # hits a "journal" of dirty values that's not yet committed # to linear memory d = eval(c) # copy the 100MB Combined with a tiled storage scheme so that the last step only needs to copy a few dirty blocks, immutable arrays may be within reach. > Of course this depends all on how ambitious you are. > > > A little ambitious, though mostly I'll be glad to get something working > that people find useful :-) I'd advise you to either go for something really simple and clean (which would almost certainly involve directly mutable arrays), or something very powerful (probably with only the abstraction of immutable arrays, with multiple back-end strategies for how to deal with that, but certainly buffering up single-element-updates). The latter is more of a full research project. Having both immutable and mutable matrices in the same API doesn't sound ideal to me at least. Dag Sverre From davidmenhur at gmail.com Wed Jan 9 07:59:57 2013 From: davidmenhur at gmail.com (=?UTF-8?B?RGHPgGlk?=) Date: Wed, 9 Jan 2013 13:59:57 +0100 Subject: [Numpy-discussion] Insights / lessons learned from NumPy design In-Reply-To: References: Message-ID: On Jan 9, 2013 11:35 AM, "Mike Anderson" wrote: > But I'm curious: what is the main use case for the alternative data types in NumPy? Is it for columns of data of heterogeneous types? or something else? In my case, I have used 32 bit (or lower) arrays due to memory limitations and some significant speedups in certain situations. This was particularly useful when I was preprocessing numerous arrays to especially Boolean data, saved a lot of hd space and I/O. I have used 128 bits when precision was critical, as I was dealing with very small differences. It is also nice to be able to repeat your computation with different precision in order to spot possible numerical instabilities, even if the performance is not great.l David. -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Wed Jan 9 09:29:25 2013 From: njs at pobox.com (Nathaniel Smith) Date: Wed, 9 Jan 2013 14:29:25 +0000 Subject: [Numpy-discussion] Bug with ufuncs made with frompyfunc In-Reply-To: References: Message-ID: On Wed, Jan 9, 2013 at 7:23 AM, OKB (not okblacke) wrote: > A bug causing errors with using methods of ufuncs created with > frompyfunc was mentioned on the list over a year ago: > http://mail.scipy.org/pipermail/numpy-discussion/2011- > September/058501.html > > Is there any word on the status of this bug? I wasn't able to find > a ticket in the bug tracker. That thread says that it had already been fixed in the development version of numpy, so it should be fixed in the upcoming 1.7. If you want to be sure then you try it on the 1.7 release candidate. -n From heng at cantab.net Wed Jan 9 09:30:22 2013 From: heng at cantab.net (Henry Gomersall) Date: Wed, 09 Jan 2013 14:30:22 +0000 Subject: [Numpy-discussion] natural alignment Message-ID: <1357741822.3475.23.camel@farnsworth> Further to my previous emails about getting SIMD aligned arrays, I've noticed that numpy arrays aren't always naturally aligned either. 
For example, numpy.float96 arrays are not always aligned on 12-byte boundaries under 32-bit linux/gcc. Indeed, .alignment on the array always seems to return 4 (with 64-bit, .alignment returns 4, 8, and 16 for float32, float64 and longdouble respectively). Can I assume _anything_ in general about the alignment of a numpy array? (I mean, based on what all implementations of the underlying malloc etc will return). Should I rely on what is returned from .alignment? cheers, Henry From alan.isaac at gmail.com Wed Jan 9 09:53:14 2013 From: alan.isaac at gmail.com (Alan G Isaac) Date: Wed, 09 Jan 2013 09:53:14 -0500 Subject: [Numpy-discussion] Insights / lessons learned from NumPy design In-Reply-To: <50ED62E0.1010502@astro.uio.no> References: <50E68C2A.9060400@astro.uio.no> <50ED62E0.1010502@astro.uio.no> Message-ID: <50ED845A.7080402@gmail.com> I'm just a Python+NumPy user and not a CS type. May I ask a naive question on this thread? Given the work that has (as I understand it) gone into making NumPy usable as a C library, why is the discussion not going in a direction like the following: What changes to the NumPy code base would be required for it to provide useful ndarray functionality in a C extension to Clojure? Is this simply incompatible with the goal that Clojure compile to JVM byte code? Thanks, Alan Isaac From njs at pobox.com Wed Jan 9 09:58:24 2013 From: njs at pobox.com (Nathaniel Smith) Date: Wed, 9 Jan 2013 14:58:24 +0000 Subject: [Numpy-discussion] Insights / lessons learned from NumPy design In-Reply-To: <50ED845A.7080402@gmail.com> References: <50E68C2A.9060400@astro.uio.no> <50ED62E0.1010502@astro.uio.no> <50ED845A.7080402@gmail.com> Message-ID: On Wed, Jan 9, 2013 at 2:53 PM, Alan G Isaac wrote: > I'm just a Python+NumPy user and not a CS type. > May I ask a naive question on this thread? > > Given the work that has (as I understand it) gone into > making NumPy usable as a C library, why is the discussion not > going in a direction like the following: > What changes to the NumPy code base would be required for it > to provide useful ndarray functionality in a C extension > to Clojure? Is this simply incompatible with the goal that > Clojure compile to JVM byte code? IIUC that work was done on a fork of numpy which has since been abandoned by its authors, so... yeah, numpy itself doesn't have much to offer in this area right now. It could in principle with a bunch of refactoring (ideally not on a fork, since we saw how well that went), but I don't think most happy current numpy users are wishing they could switch to writing Lisp on the JVM or vice-versa, so I don't think it's surprising that no-one's jumped up to do this work. -n From njs at pobox.com Wed Jan 9 10:09:21 2013 From: njs at pobox.com (Nathaniel Smith) Date: Wed, 9 Jan 2013 15:09:21 +0000 Subject: [Numpy-discussion] Do we want scalar casting to behave as it does at the moment? In-Reply-To: References: <50E61E29.1020709@astro.uio.no> Message-ID: On Tue, Jan 8, 2013 at 9:14 PM, Andrew Collette wrote: > Hi Nathaniel, > > (Responding to both your emails) > >> The problem is that rule for arrays - and for every other party of >> numpy in general - are that we *don't* pick types based on values. >> Numpy always uses input types to determine output types, not input >> values. > > Yes, of course... array operations are governed exclusively by their > dtypes. 
It seems to me that, using the language of the bug report > (2878), if we have this: > > result = arr + scalar > > I would argue that our job is, rather than to pick result.dtype, to > pick scalar.dtype, and apply the normal rules for array operations. Okay, but we already have unambiguous rules for picking scalar.dtype: you use whatever width the underlying type has, so it'd always be np.int_ or np.float64. Those are the normal rules for picking dtypes. I'm just trying to make clear that what you're arguing for is also a very special case, which also violates the rules numpy uses everywhere else. That doesn't mean we should rule it out ("Special cases aren't special enough to break the rules. / Although practicality beats purity."), but claiming that it is just "the normal rules" while everything else is a "special case" is rhetorically unhelpful. >> So it's pretty unambiguous that >> "using the same rules for arrays and scalars" would mean, ignore the >> value of the scalar, and in expressions like >> np.array([1], dtype=np.int8) + 1 >> we should always upcast to int32/int64. > > Ah, but that's my point: we already, in 1.6, ignore the intrinsic > width of the scalar and effectively substitute one based on it's > value: > >>>> a = np.array([1], dtype=int8) >>>> (a + 1).dtype > dtype('int8') >>>> (a + 1000).dtype > dtype('int16') >>>> (a + 90000).dtype > dtype('int32') >>>> (a + 2**40).dtype > dtype('int64') Sure. But the only reason this is in 1.6 is that the person who made the change never mentioned it to anyone else, so it wasn't noticed until after 1.6 came out. If it had gone through proper review/mailing list discussion (like we're doing now) then it's very unlikely it would have gone in in its present form. >> 1.6, your proposal: in a binary operation, if one operand has ndim==0 >> and the other has ndim>0, downcast the ndim==0 item to the smallest >> width that is consistent with its value and the other operand's type. > > Yes, exactly. I'm not trying to propose a completely new behavior: as > I mentioned (although very far upthread), this is the mental model I > had of how things worked in 1.6 already. > >> New users don't use narrow-width dtypes... it's important to remember >> in this discussion that in numpy, non-standard dtypes only arise when >> users explicitly request them, so there's some expressed intention >> there that we want to try and respect. > > I would respectfully disagree. One example I cited was that when > dealing with HDF5, it's very common to get int16's (and even int8's) > when reading from a file because they are used to save disk space. > All a new user has to do to get int8's from a file they got from > someone else is: > >>>> data = some_hdf5_file['MyDataset'][...] > > This is a general issue applying to data which is read from real-world > external sources. For example, digitizers routinely represent their > samples as int8's or int16's, and you apply a scale and offset to get > a reading in volts. This particular case is actually handled fine by 1.5, because int array + float scalar *does* upcast to float. It's width that's ignored (int8 versus int32), not the basic "kind" of data (int versus float). But overall this does sound like a problem -- but it's not a problem with the scalar/array rules, it's a problem with working with narrow width data in general. 
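A tiny made-up illustration of that general problem -- digitizer-style counts
stored as int16, scaled by a small integer:

import numpy as np

# counts stored as int16 to save space; the scale easily fits in an int8
counts = np.array([30000, 31000, 32000], dtype=np.int16)
scale = 100

# the *result* does not fit in int16, so arithmetic done at the storage
# width wraps around no matter how the scalar operand is typed...
print(counts * scale)

# ...whereas widening the data first gives the intended values
print(counts.astype(np.int64) * scale)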
There's a good argument to be made that data files should be stored in compressed form, but read in in full-width form, exactly to avoid the problems that arise when trying to manipulate narrow-width representations. Suppose your scale and offset *were* integers, so that the "kind" casting rules didn't get invoked. Even if this were the case, then the rules you're arguing for would not actually solve your problem at all. It'd be very easy to have, say, scale=100, offset=100, both of which fit fine in an int8... but actually performing the scaling/offseting in an int8 would still be a terrible idea! The problem you're talking about is picking the correct width for an *operation*, and futzing about with picking the dtypes of *one input* to that operation is not going to help; it's like trying to ensure your house won't fall down by making sure the doors are really sturdy. -n From alan.isaac at gmail.com Wed Jan 9 10:09:46 2013 From: alan.isaac at gmail.com (Alan G Isaac) Date: Wed, 09 Jan 2013 10:09:46 -0500 Subject: [Numpy-discussion] Insights / lessons learned from NumPy design In-Reply-To: References: <50E68C2A.9060400@astro.uio.no> <50ED62E0.1010502@astro.uio.no> <50ED845A.7080402@gmail.com> Message-ID: <50ED883A.6060605@gmail.com> On 1/9/2013 9:58 AM, Nathaniel Smith wrote: > I don't think most happy current numpy users are wishing they > could switch to writing Lisp on the JVM or vice-versa, so I don't > think it's surprising that no-one's jumped up to do this work. Sure. I'm trying to look at this more from the Clojure end. Is it really better to start from scratch than to attempt a contribution to NumPy that would make it useful to Clojure. Given the amount of work that has gone into making NumPy what it is, it seems a huge project for the Clojure people to hope to produce anything comparable starting from scratch. Thanks, Alan From ben.root at ou.edu Wed Jan 9 10:41:34 2013 From: ben.root at ou.edu (Benjamin Root) Date: Wed, 9 Jan 2013 10:41:34 -0500 Subject: [Numpy-discussion] Insights / lessons learned from NumPy design In-Reply-To: References: <50E68C2A.9060400@astro.uio.no> <50ED62E0.1010502@astro.uio.no> <50ED845A.7080402@gmail.com> Message-ID: On Wed, Jan 9, 2013 at 9:58 AM, Nathaniel Smith wrote: > On Wed, Jan 9, 2013 at 2:53 PM, Alan G Isaac wrote: > > I'm just a Python+NumPy user and not a CS type. > > May I ask a naive question on this thread? > > > > Given the work that has (as I understand it) gone into > > making NumPy usable as a C library, why is the discussion not > > going in a direction like the following: > > What changes to the NumPy code base would be required for it > > to provide useful ndarray functionality in a C extension > > to Clojure? Is this simply incompatible with the goal that > > Clojure compile to JVM byte code? > > IIUC that work was done on a fork of numpy which has since been > abandoned by its authors, so... yeah, numpy itself doesn't have much > to offer in this area right now. It could in principle with a bunch of > refactoring (ideally not on a fork, since we saw how well that went), > but I don't think most happy current numpy users are wishing they > could switch to writing Lisp on the JVM or vice-versa, so I don't > think it's surprising that no-one's jumped up to do this work. > > If I could just point out that the attempt to fork numpy for the .NET work was done back in the subversion days, and there was little-to-no effort to incrementally merge back changes to master, and vice-versa. 
With git as our repository now, such work may be more feasible. Ben Root -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Wed Jan 9 11:42:30 2013 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 9 Jan 2013 09:42:30 -0700 Subject: [Numpy-discussion] Linear least squares In-Reply-To: References: Message-ID: On Wed, Jan 9, 2013 at 1:29 AM, Till Stensitz wrote: > Nathaniel Smith pobox.com> writes: > > > > > > An obvious thing is that it always computes residuals, which could be > > costly; if your pinv code isn't doing that then it's not really > > comparable. (Though might still be well-suited for your actual > > problem.) > > > > Depending on how well-conditioned your problems are, and how much > > speed you need, there are faster ways than pinv as well. (Going via qr > > might or might not, going via cholesky almost certainly will be.) > > > > -n > > > > > You are right. With calculating the residuals, the speedup goes > down to a factor of 2. I had to calculate the residuals anyways because > lstsq only returns the squared sum of the residuals, while i need every > residual (as an input to optimize.leastsq). > > Same here. Unfortunately the residuals computed by the LAPACK function are in a different basis so aren't directly usable. I'd support adding a keyword to disable the usual computation of the sum of squares. Josef is also right, it is shape depended. For his example, lstsq is faster. > > Maybe it is possible to make lstsq to choose its method automatically? > Or some keyword to set the method and making other decompositions > available. > QR without column pivoting is a nice option for "safe" problems, but it doesn't provide a reliable indication of rank reduction. I also don't find pinv useful once the rank goes down, since it relies on Euclidean distance having relevance in parameter space and that is seldom a sound assumption, usually it is better to reformulate the problem or remove a column from the design matrix. So maybe an 'unsafe', or less suggestively, 'fast' keyword could also be an option. IIRC, this was discussed on the scipy mailing list a year or two ago. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From nouiz at nouiz.org Wed Jan 9 11:47:40 2013 From: nouiz at nouiz.org (=?ISO-8859-1?Q?Fr=E9d=E9ric_Bastien?=) Date: Wed, 9 Jan 2013 11:47:40 -0500 Subject: [Numpy-discussion] ANN: NumPy 1.7.0rc1 release In-Reply-To: References: Message-ID: Hi, Congratulation for the release and a big thanks for the hard work. I tested it with our software and all work fine. thanks! Fr?d?ric On Sun, Dec 30, 2012 at 7:17 PM, Sandro Tosi wrote: > Hi Ondrej & al, > > On Sat, Dec 29, 2012 at 1:02 AM, Ond?ej ?ert?k wrote: >> I'm pleased to announce the availability of the first release candidate of >> NumPy 1.7.0rc1. > > Congrats on this RC release! > > I've uploaded this version to Debian and updated some of the issues > related to it. There are also a couple of minor PR you might want to > consider for 1.7: 2872 and 2873. 
> > Cheers, > -- > Sandro Tosi (aka morph, morpheus, matrixhasu) > My website: http://matrixhasu.altervista.org/ > Me at Debian: http://wiki.debian.org/SandroTosi > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From chris.barker at noaa.gov Wed Jan 9 12:22:19 2013 From: chris.barker at noaa.gov (Chris Barker - NOAA Federal) Date: Wed, 9 Jan 2013 09:22:19 -0800 Subject: [Numpy-discussion] Do we want scalar casting to behave as it does at the moment? In-Reply-To: References: <50E61E29.1020709@astro.uio.no> Message-ID: On Wed, Jan 9, 2013 at 7:09 AM, Nathaniel Smith wrote: >> This is a general issue applying to data which is read from real-world >> external sources. For example, digitizers routinely represent their >> samples as int8's or int16's, and you apply a scale and offset to get >> a reading in volts. > > This particular case is actually handled fine by 1.5, because int > array + float scalar *does* upcast to float. It's width that's ignored > (int8 versus int32), not the basic "kind" of data (int versus float). > > But overall this does sound like a problem -- but it's not a problem > with the scalar/array rules, it's a problem with working with narrow > width data in general. Exactly -- this is key. details asside, we essentially have a choice between an approach that makes it easy to preserver your values -- upcasting liberally, or making it easy to preserve your dtype -- requiring users to specifically upcast where needed. IIRC, our experience with earlier versions of numpy (and Numeric before that) is that all too often folks would choose a small dtype quite deliberately, then have it accidentally upcast for them -- this was determined to be not-so-good behavior. I think the HDF (and also netcdf...) case is a special case -- the small dtype+scaling has been chosen deliberately by whoever created the data file (to save space), but we would want it generally opaque to the consumer of the file -- to me, that means the issue should be adressed by the file reading tools, not numpy. If your HDF5 reader chooses the the resulting dtype explicitly, it doesn't matter what numpy's defaults are. If the user wants to work with the raw, unscaled arrays, then they should know what they are doing. -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From jaakko.luttinen at aalto.fi Wed Jan 9 12:32:06 2013 From: jaakko.luttinen at aalto.fi (Jaakko Luttinen) Date: Wed, 9 Jan 2013 19:32:06 +0200 Subject: [Numpy-discussion] numpydoc for python 3? Message-ID: <50EDA996.6090806@aalto.fi> Hi! I'm trying to use numpydoc (Sphinx extension) for my project written in Python 3.2. However, installing numpydoc gives errors shown at http://pastebin.com/MPED6v9G and although it says "Successfully installed numpydoc", trying to import numpydoc raises errors.. Could this be fixed or am I doing something wrong? Thanks! Jaakko From chris.barker at noaa.gov Wed Jan 9 12:38:46 2013 From: chris.barker at noaa.gov (Chris Barker - NOAA Federal) Date: Wed, 9 Jan 2013 09:38:46 -0800 Subject: [Numpy-discussion] Insights / lessons learned from NumPy design In-Reply-To: References: Message-ID: On Wed, Jan 9, 2013 at 2:35 AM, Mike Anderson >> First -- is this a "matrix" library, or a general use nd-array >> library? 
That will drive your design a great deal. > This is very useful context - thanks! I've had opinions in favour of both an > nd-array style library and a matrix library. I guess it depends on your use > case which one you are more inclined to think in. > > I'm hoping that it should be possible for the same API to support both, i.e. > you should be able to use a 2D array of numbers as a matrix, and vice-versa. sure, but the API can/should be differnent -- in some sense, the numpy matrix object is really just syntactic sugar -- you can use a 2-d array as a matrix, but then you have to explicilty call linear algebra functions to get things like matrix multiplication, etc. and do some hand work to make sure you're got things the right shape -- i.e a column or row vector where called for. tacking on the matrix object helped this, but in practice, it gets tricky to prevent operations from accidentally returning a plan array from operations on a matrix. Also numpy's matrix concept does not include the concept of a row or column vector, just 1XN or NX1 matrixes -- which works OK, but then when you iterate through a vector, you get 1X1 matrixes, rather than scalars -- a bit odd. Anyway, it takes some though to have two clean APIs sharing one core object. >> not a bad start, but another major strength of numpy is the multiple >> data types - you may wantt to design that concept in from the start. > But I'm curious: what is the main use case for the alternative data types in > NumPy? Is it for columns of data of heterogeneous types? or something else? heterogeneous data types were added relatively recently in numpy, and are great mostly for interacting with other libraries (and some syntactic sugar uses...) that may store data in arrays of structures. But multiple homogenous data types are critical for saving memory, speeding operations, doing integer math when that's really called for, manipulating images, etc, etc..... > 20-100GB is pretty ambitious and I guess reflects the maturity of > NumPy - I'd be happy with good handling of 100MB matrices right > now..... 100MB is prety darn small these days -- if you're only interested in smallish problems, then you can probably forget about performance issues, and focus on a really nice API. But I"m not sure I'd bother with that -- once people start using it, they'll want to use it for big problems! -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From andrew.collette at gmail.com Wed Jan 9 12:58:39 2013 From: andrew.collette at gmail.com (Andrew Collette) Date: Wed, 9 Jan 2013 10:58:39 -0700 Subject: [Numpy-discussion] Do we want scalar casting to behave as it does at the moment? In-Reply-To: References: <50E61E29.1020709@astro.uio.no> Message-ID: Hi Nathaniel, > Sure. But the only reason this is in 1.6 is that the person who made > the change never mentioned it to anyone else, so it wasn't noticed > until after 1.6 came out. If it had gone through proper review/mailing > list discussion (like we're doing now) then it's very unlikely it > would have gone in in its present form. This is also a good point; I didn't realize that was how it was handled. Ultimately, the people who have to make this decision are the people who actually do the work -- and that means the core numpy maintainers. We've had a great discussion and I certainly feel like my input has been respected. 
Although I still disagree with the change, I certainly see that it's not as simple as I first thought. At this point the discussion has gone on for about 70 emails so far and I think I've said all I can. Thanks again for being willing to engage with users like this... numpy is an unusual project in that regard. I imagine that once the change is released (scheduled for 2.0?) the broader community will also be happy to provide input. Andrew From d.s.seljebotn at astro.uio.no Wed Jan 9 13:04:23 2013 From: d.s.seljebotn at astro.uio.no (Dag Sverre Seljebotn) Date: Wed, 09 Jan 2013 19:04:23 +0100 Subject: [Numpy-discussion] Insights / lessons learned from NumPy design In-Reply-To: References: <50E68C2A.9060400@astro.uio.no> <50ED62E0.1010502@astro.uio.no> <50ED845A.7080402@gmail.com> Message-ID: <50EDB127.30602@astro.uio.no> On 01/09/2013 04:41 PM, Benjamin Root wrote: > > > On Wed, Jan 9, 2013 at 9:58 AM, Nathaniel Smith > wrote: > > On Wed, Jan 9, 2013 at 2:53 PM, Alan G Isaac > wrote: > > I'm just a Python+NumPy user and not a CS type. > > May I ask a naive question on this thread? > > > > Given the work that has (as I understand it) gone into > > making NumPy usable as a C library, why is the discussion not > > going in a direction like the following: > > What changes to the NumPy code base would be required for it > > to provide useful ndarray functionality in a C extension > > to Clojure? Is this simply incompatible with the goal that > > Clojure compile to JVM byte code? > > IIUC that work was done on a fork of numpy which has since been > abandoned by its authors, so... yeah, numpy itself doesn't have much > to offer in this area right now. It could in principle with a bunch of > refactoring (ideally not on a fork, since we saw how well that went), > but I don't think most happy current numpy users are wishing they > could switch to writing Lisp on the JVM or vice-versa, so I don't > think it's surprising that no-one's jumped up to do this work. > > > If I could just point out that the attempt to fork numpy for the .NET > work was done back in the subversion days, and there was little-to-no > effort to incrementally merge back changes to master, and vice-versa. > With git as our repository now, such work may be more feasible. This is a matter of personal software design taste I guess, so the following is very subjective. I don't think there's anything at all to gain from this. In 2013 (and presumably, the future), a static C or C++ library is IMO fundamentally incompatible with achieving optimal performance. Going through a major refactor simply to end up with something that's no faster and no more flexible than what NumPy is today seems sort of pointless to me. What one wants is to generate ufuncs etc. on the fly using LLVM that are tuned to the specific tiling pattern of a specific operation, not a static C or C++ library (even with C++ meta-programming, the combinatorial explosion kills you if you do it all at compile-time). Granted, one could probably write a C++ library that was more of a compiler, using LLVM to emit code. But that's starting all over so not really relevant to the question of a NumPy refactor. This is how I understand Continuum thinks too, with Numba as a back-end for Blaze. (And Travis also spoke about this in his "farewell address".) 
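For concreteness, this is the sort of thing meant by generating a ufunc on the fly rather than shipping it in a static library -- a minimal sketch using Numba's vectorize decorator (assuming a reasonably recent Numba install; the exact decorator API has moved around between releases):

import numpy as np
from numba import vectorize

@vectorize(['float64(float64, float64)'])
def scaled_sum(a, b):
    # compiled to native code through LLVM for the float64 signature above,
    # instead of being picked from a fixed menu of precompiled loops
    return 0.5 * a + b

x = np.linspace(0.0, 1.0, 1000000)
y = np.ones_like(x)
z = scaled_sum(x, y)   # behaves like an ordinary numpy ufunc, broadcasting included

A static C/C++ library has to ship (or template-instantiate) every loop it might ever need; the JIT route only pays for the combinations actually used and can, in principle, tune them for the machine and tiling pattern at hand.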
Finally, Mark Florisson sort of started this with the 'minivect' library last summer which could as a "ufunc" backend both for Cython and Numba (which for this purpose are different languages), however as I understand it focus is now more on developing Numba directly rather than minivect (which is understandable as that's quicker). Dag Sverre From d.s.seljebotn at astro.uio.no Wed Jan 9 13:07:16 2013 From: d.s.seljebotn at astro.uio.no (Dag Sverre Seljebotn) Date: Wed, 09 Jan 2013 19:07:16 +0100 Subject: [Numpy-discussion] Do we want scalar casting to behave as it does at the moment? In-Reply-To: References: Message-ID: <50EDB1D4.5090909@astro.uio.no> On 01/09/2013 06:22 PM, Chris Barker - NOAA Federal wrote: > On Wed, Jan 9, 2013 at 7:09 AM, Nathaniel Smith wrote: >>> This is a general issue applying to data which is read from real-world >>> external sources. For example, digitizers routinely represent their >>> samples as int8's or int16's, and you apply a scale and offset to get >>> a reading in volts. >> >> This particular case is actually handled fine by 1.5, because int >> array + float scalar *does* upcast to float. It's width that's ignored >> (int8 versus int32), not the basic "kind" of data (int versus float). >> >> But overall this does sound like a problem -- but it's not a problem >> with the scalar/array rules, it's a problem with working with narrow >> width data in general. > > Exactly -- this is key. details asside, we essentially have a choice > between an approach that makes it easy to preserver your values -- > upcasting liberally, or making it easy to preserve your dtype -- > requiring users to specifically upcast where needed. > > IIRC, our experience with earlier versions of numpy (and Numeric > before that) is that all too often folks would choose a small dtype > quite deliberately, then have it accidentally upcast for them -- this > was determined to be not-so-good behavior. > > I think the HDF (and also netcdf...) case is a special case -- the > small dtype+scaling has been chosen deliberately by whoever created > the data file (to save space), but we would want it generally opaque > to the consumer of the file -- to me, that means the issue should be > adressed by the file reading tools, not numpy. If your HDF5 reader > chooses the the resulting dtype explicitly, it doesn't matter what > numpy's defaults are. If the user wants to work with the raw, unscaled > arrays, then they should know what they are doing. +1. I think h5py should consider: File("my.h5")['int8_dset'].dtype == int64 File("my.h5", preserve_dtype=True)['int8_dset'].dtype == int8 Dag Sverre From pierre.raybaut at gmail.com Wed Jan 9 16:05:23 2013 From: pierre.raybaut at gmail.com (Pierre Raybaut) Date: Wed, 9 Jan 2013 22:05:23 +0100 Subject: [Numpy-discussion] ANN: first previews of WinPython for Python 3 32/64bit Message-ID: Hi all, I'm pleased to announce that the first previews of WinPython for Python 3 32bit and 64bit are available (WinPython v3.3.0.0alpha1): http://code.google.com/p/winpython/ This first release based on Python 3 required to migrate the following libraries which were only available for Python 2: * formlayout 1.0.12 * guidata 1.6.0dev1 * guiqwt 2.3.0dev1 * Spyder 2.1.14dev Please note that these libraries are still development release. [Special thanks to Christoph Gohlke for patching and building a version of PyQwt compatible with Python 3.3] WinPython is a free open-source portable distribution of Python for Windows, designed for scientists. 
It is a full-featured (see http://code.google.com/p/winpython/wiki/PackageIndex) Python-based scientific environment: * Designed for scientists (thanks to the integrated libraries NumPy, SciPy, Matplotlib, guiqwt, etc.: * Regular *scientific users*: interactive data processing and visualization using Python with Spyder * *Advanced scientific users and software developers*: Python applications development with Spyder, version control with Mercurial and other development tools (like gettext) * *Portable*: preconfigured, it should run out of the box on any machine under Windows (without any installation requirements) and the folder containing WinPython can be moved to any location (local, network or removable drive) * *Flexible*: one can install (or should I write "use" as it's portable) as many WinPython versions as necessary (like isolated and self-consistent environments), even if those versions are running different versions of Python (2.7, 3.x in the near future) or different architectures (32bit or 64bit) on the same machine * *Customizable*: using the integrated package manager (wppm, as WinPython Package Manager), it's possible to install, uninstall or upgrade Python packages (see http://code.google.com/p/winpython/wiki/WPPM for more details on supported package formats). *WinPython is not an attempt to replace Python(x,y)*, this is just something different (see http://code.google.com/p/winpython/wiki/Roadmap): more flexible, easier to maintain, movable and less invasive for the OS, but certainly less user-friendly, with less packages/contents and without any integration to Windows explorer [*]. [*] Actually there is an optional integration into Windows explorer, providing the same features as the official Python installer regarding file associations and context menu entry (this option may be activated through the WinPython Control Panel), and adding shortcuts to Windows Start menu. Enjoy! -Pierre From chris.barker at noaa.gov Wed Jan 9 16:19:49 2013 From: chris.barker at noaa.gov (Chris Barker - NOAA Federal) Date: Wed, 9 Jan 2013 13:19:49 -0800 Subject: [Numpy-discussion] Insights / lessons learned from NumPy design In-Reply-To: References: <50E68C2A.9060400@astro.uio.no> <50E68F12.90804@astro.uio.no> Message-ID: On Wed, Jan 9, 2013 at 2:57 AM, Mike Anderson > I'm hoping the API will be independent of storage format - i.e. the > underlying implementations can store the data any way they like. So the API > will be written in terms of abstractions, and the user will have the choice > of whatever concrete implementation best fits the specific needs. Sparse > matrices, tiled matrices etc. should all be possible options. A note about that -- as I think if it, numpy arrays are two things: 1) a python object for working with numbers, in a wide variety of ways 2) a wrapper around a C-array (or data block) that can be used to provide an easyway for Python to interact with C (and Fortran, and...) libraries, etc. As it turns out a LOT of people use numpy for (2) -- what this means is that while you could change the underlying data representation, etc, and keep the same Python API -- such changes would break a lot of non-pure-python code that relies on that data representation. This is a big issue with the numpy-for-PyPy project -- they could write a numpy clone, but it would only be useful for the pure-python stuff. Even then, a number of folks do tricks with numpy arrays in python that rely on the underlying structure. 
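As a small sketch of what "wrapper around a C data block" means in practice (pure illustration; the names are made up):

import ctypes
import numpy as np

a = np.arange(5, dtype=np.float64)

# the ndarray is a thin Python wrapper around one contiguous C buffer;
# the raw address and a typed pointer to it are an attribute away
print(a.ctypes.data)                                     # integer address of element 0
ptr = a.ctypes.data_as(ctypes.POINTER(ctypes.c_double))
print(ptr[3])                                            # 3.0, read straight from the buffer

ptr[0] = 99.0                                            # a C extension could do the same in place
print(a[0])                                              # 99.0 -- the Python-level view sees it

Which is why a reimplementation with a different storage layout can look API-compatible from Python and still break this kind of code.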
Not sure how all this would play out for Clojure, but it's something to keep in mind. -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From jrocher at enthought.com Wed Jan 9 17:32:28 2013 From: jrocher at enthought.com (Jonathan Rocher) Date: Wed, 9 Jan 2013 16:32:28 -0600 Subject: [Numpy-discussion] [SCIPY2013] Feedback on mini-symposia themes Message-ID: Dear community members, We are working hard to organize the SciPy2013 conference (Scientific Computing with Python) , this June 24th-29th in Austin, TX. We would like to probe the community about the themes you would be interested in contributing to or participating in for the mini-symposia at SciPy2013. These mini-symposia are held to discuss scientific computing applied to a specific *scientific domain/industry* during a half afternoon after the general conference. Their goal is to promote industry specific libraries and tools, and gather people with similar interests for discussions. For example, the SciPy2012 edition successfully hosted 4 mini-symposia on Astronomy/Astrophysics, Bio-informatics, Meteorology, and Geophysics. Please join us and voice your opinion to shape the next SciPy conference at: http://www.surveygizmo.com/s3/1114631/SciPy-2013-Themes Thanks, The Scipy2013 organizers -- Jonathan Rocher, PhD Scientific software developer Enthought, Inc. jrocher at enthought.com 1-512-536-1057 http://www.enthought.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From chanley at gmail.com Wed Jan 9 18:21:44 2013 From: chanley at gmail.com (Christopher Hanley) Date: Wed, 9 Jan 2013 18:21:44 -0500 Subject: [Numpy-discussion] Remove support for numeric and numarray in 1.8 In-Reply-To: References: Message-ID: After poking around our code base and talking to a few folks I predict that we at STScI can remove our dependence on the numpy-numarray compatibility layer by the end of this calendar year. I'm unsure of what the timeline for numpy 1.8 is so I don't know if this schedule supports removal of the compatibility layer from 1.8 or not. Thanks, Chris On Sat, Jan 5, 2013 at 9:38 PM, Charles R Harris wrote: > Thoughts? > > Chuck > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Wed Jan 9 18:38:56 2013 From: njs at pobox.com (Nathaniel Smith) Date: Wed, 9 Jan 2013 23:38:56 +0000 Subject: [Numpy-discussion] Remove support for numeric and numarray in 1.8 In-Reply-To: References: Message-ID: On Wed, Jan 9, 2013 at 11:21 PM, Christopher Hanley wrote: > After poking around our code base and talking to a few folks I predict that > we at STScI can remove our dependence on the numpy-numarray compatibility > layer by the end of this calendar year. I'm unsure of what the timeline for > numpy 1.8 is so I don't know if this schedule supports removal of the > compatibility layer from 1.8 or not. It'd be nice if 1.8 were out before that, but that doesn't really matter -- let us know when you get it sorted? Also, would it help if we added a big scary warning at import time to annoy your more recalcitrant developers with? 
:-) The basic issue is that none of us actually use this stuff, it has no tests, the rest of numpy is changing around it, and we have no idea if it works, so at some point it makes more sense for us to just stop shipping the compat layer and let anyone who still needs it maintain their own copy of the code. -n From charlesr.harris at gmail.com Wed Jan 9 18:41:52 2013 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 9 Jan 2013 16:41:52 -0700 Subject: [Numpy-discussion] Remove support for numeric and numarray in 1.8 In-Reply-To: References: Message-ID: On Wed, Jan 9, 2013 at 4:21 PM, Christopher Hanley wrote: > After poking around our code base and talking to a few folks I predict > that we at STScI can remove our dependence on the numpy-numarray > compatibility layer by the end of this calendar year. I'm unsure of what > the timeline for numpy 1.8 is so I don't know if this schedule supports > removal of the compatibility layer from 1.8 or not. > > Together with the previous post that puts the kibosh on removing either numeric or numarray support from 1.8, at least if we get 1.8 before the end of summer. It's good to know where folks stand with regard to those packages, we'll give it another shot next year. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From ondrej.certik at gmail.com Wed Jan 9 21:55:39 2013 From: ondrej.certik at gmail.com (=?UTF-8?B?T25kxZllaiDEjGVydMOtaw==?=) Date: Wed, 9 Jan 2013 18:55:39 -0800 Subject: [Numpy-discussion] Which Python to use for Mac binaries In-Reply-To: References: Message-ID: On Tue, Jan 8, 2013 at 8:45 AM, Chris Barker - NOAA Federal wrote: > On Mon, Jan 7, 2013 at 10:23 PM, Ond?ej ?ert?k wrote: >>> http://www.commandlinefu.com/commands/view/2031/install-an-mpkg-from-the-command-line-on-osx >> >> This requires root access. Without sudo, I get: >> >> $ installer -pkg /Volumes/Python\ 2.7.3/Python.mpkg/ -target ondrej >> installer: This package requires authentication to install. >> >> and since I don't have root access, it doesn't work. >> >> So one way around it would be to install python from source, that >> shouldn't require root access. > > hmm -- this all may be a trick -- both the *.mpkg and the standard > build put everything in /Library/Frameworks/Python -- which is where > it belongs. Bu tif you need root access to write there, then there is > a problem. I'm sure a non-root build could put everything in the > users' home directory, then packages built against that would have > their paths messed up. Right. > > What's odd is that I'm pretty sure I've been able to point+click > install those without sudo...(I could recall incorrectly). > > This would be a good question for the pythonmac list -- low traffic, > but there are some very smart and helpful folks there: > > http://mail.python.org/mailman/listinfo/pythonmac-sig > > >>>> But I am not currently sure what to do with it. The Python.mpkg >>>> directory seems to contain the sources. > > It should be possible to unpack a mpkg by hand, but it contains both > the contents, and various instal scripts, so that seems like a really > ugly solution. Yep. In the meantime, the hard drive on Vincent's box failed, so he reinstalled the box completely. Also he explained to me a lot of Mac things over the phone, so I think I now understand what is going on with the dmg. 
As such, I have updated my instructions in my release helper repo: https://github.com/certik/numpy-vendor by the following paragraph: """ First prepare the Mac build box as follows: * Install Python 2.5, 2.6, 2.7 from python.org using the dmg disk image * Install setuptools and bdist_mpkg into all these Pythons * Install Paver into the default Python Tip: Add the /Library/Frameworks/Python.framework directory into git and commit after each installation of any package or Python. That way you can easily remove temporary installations. """ And you need sudo access to do those. If your user is an admin, then it can do it, otherwise it can't. So one can only use a Mac, which has the above setup installed. With that, my Fabfile can then do the rest. So I just built the following binaries: numpy-1.7.0rc1-py2.5-python.org-macosx10.3.dmg numpy-1.7.0rc1-py2.6-python.org-macosx10.3.dmg numpy-1.7.0rc1-py2.7-python.org-macosx10.3.dmg and uploaded to: https://sourceforge.net/projects/numpy/files/NumPy/1.7.0rc1/ So I think we are all set here. Ralf, would you be willing to build the final binary on 10.6? I don't think you have to do it for this rc1, but I am going to release rc2 now and for that it would be nice to have it. Ondrej From mike.r.anderson.13 at gmail.com Wed Jan 9 23:06:37 2013 From: mike.r.anderson.13 at gmail.com (Mike Anderson) Date: Thu, 10 Jan 2013 12:06:37 +0800 Subject: [Numpy-discussion] Insights / lessons learned from NumPy design In-Reply-To: <50ED883A.6060605@gmail.com> References: <50E68C2A.9060400@astro.uio.no> <50ED62E0.1010502@astro.uio.no> <50ED845A.7080402@gmail.com> <50ED883A.6060605@gmail.com> Message-ID: On 9 January 2013 23:09, Alan G Isaac wrote: > On 1/9/2013 9:58 AM, Nathaniel Smith wrote: > > I don't think most happy current numpy users are wishing they > > could switch to writing Lisp on the JVM or vice-versa, so I don't > > think it's surprising that no-one's jumped up to do this work. > > > Sure. I'm trying to look at this more from the Clojure end. > Is it really better to start from scratch than to attempt > a contribution to NumPy that would make it useful to Clojure. > Given the amount of work that has gone into making NumPy > what it is, it seems a huge project for the Clojure people > to hope to produce anything comparable starting from scratch. > > Thanks, > Alan Currently I expect that the Clojure community will produce an abstraction / API for matrices / ndarrays that supports multiple implementations. It's fairly idiomatic in Clojure to work in abstractions, and the language offers good tools for making different concrete abstractions work with a common API, so it's less hard to make this work than it might sound. An interface to NumPy could certainly be one of the implementations of this API - I'm sure people would find this very useful given the maturity on NumPy and the need for integration in environments with heterogeneous systems. At the same time, there will be people in the Clojure world who will want to stay 100% on the JVM for certain projects. For them I don't see how NumPy could be used, unless it can be made to run well on Jython perhaps? -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From ralf.gommers at gmail.com Thu Jan 10 02:21:01 2013 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Thu, 10 Jan 2013 08:21:01 +0100 Subject: [Numpy-discussion] Which Python to use for Mac binaries In-Reply-To: References: Message-ID: On Thu, Jan 10, 2013 at 3:55 AM, Ond?ej ?ert?k wrote: > On Tue, Jan 8, 2013 at 8:45 AM, Chris Barker - NOAA Federal > wrote: > > On Mon, Jan 7, 2013 at 10:23 PM, Ond?ej ?ert?k > wrote: > >>> > http://www.commandlinefu.com/commands/view/2031/install-an-mpkg-from-the-command-line-on-osx > >> > >> This requires root access. Without sudo, I get: > >> > >> $ installer -pkg /Volumes/Python\ 2.7.3/Python.mpkg/ -target ondrej > >> installer: This package requires authentication to install. > >> > >> and since I don't have root access, it doesn't work. > >> > >> So one way around it would be to install python from source, that > >> shouldn't require root access. > > > > hmm -- this all may be a trick -- both the *.mpkg and the standard > > build put everything in /Library/Frameworks/Python -- which is where > > it belongs. Bu tif you need root access to write there, then there is > > a problem. I'm sure a non-root build could put everything in the > > users' home directory, then packages built against that would have > > their paths messed up. > > Right. > > > > > What's odd is that I'm pretty sure I've been able to point+click > > install those without sudo...(I could recall incorrectly). > > > > This would be a good question for the pythonmac list -- low traffic, > > but there are some very smart and helpful folks there: > > > > http://mail.python.org/mailman/listinfo/pythonmac-sig > > > > > >>>> But I am not currently sure what to do with it. The Python.mpkg > >>>> directory seems to contain the sources. > > > > It should be possible to unpack a mpkg by hand, but it contains both > > the contents, and various instal scripts, so that seems like a really > > ugly solution. > > Yep. > > In the meantime, the hard drive on Vincent's box failed, so he > reinstalled the box completely. > Also he explained to me a lot of Mac things over the phone, so I think > I now understand what is going on with the dmg. > > As such, I have updated my instructions in my release helper repo: > > https://github.com/certik/numpy-vendor > > by the following paragraph: > > """ > First prepare the Mac build box as follows: > > * Install Python 2.5, 2.6, 2.7 from python.org using the dmg disk image > * Install setuptools and bdist_mpkg into all these Pythons > * Install Paver into the default Python > > Tip: Add the /Library/Frameworks/Python.framework directory into git > and commit after each installation of any package or Python. That way > you can easily remove temporary installations. > """ > > And you need sudo access to do those. If your user is an admin, then > it can do it, otherwise it can't. > So one can only use a Mac, which has the above setup installed. With > that, my Fabfile can then do the rest. > > So I just built the following binaries: > > numpy-1.7.0rc1-py2.5-python.org-macosx10.3.dmg > numpy-1.7.0rc1-py2.6-python.org-macosx10.3.dmg > numpy-1.7.0rc1-py2.7-python.org-macosx10.3.dmg > > and uploaded to: > > https://sourceforge.net/projects/numpy/files/NumPy/1.7.0rc1/ > > > So I think we are all set here. Ralf, would you be willing to build > the final binary on 10.6? I don't think you have to do it for this > rc1, but I am going to release rc2 now and for that it would be nice > to have it. > Sure, no problem. For the part that needs to be built on 10.6 that is. 
Vincent's box still has 10.5, right? Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From chaoyuejoy at gmail.com Thu Jan 10 04:40:52 2013 From: chaoyuejoy at gmail.com (Chao YUE) Date: Thu, 10 Jan 2013 10:40:52 +0100 Subject: [Numpy-discussion] return index of maximum value in an array easily? Message-ID: Dear all, Are we going to consider returning the index of the maximum value in an array easily, without calling np.argmax and np.unravel_index consecutively? I saw a few posts in the mailing archive and on Stack Overflow about this when I tried to return the index of the maximum value of a 2d array. It seems that I am not the first to be confused by this.
> > > http://stackoverflow.com/questions/11377028/getting-index-of-numpy-ndarray > [1] > > http://old.nabble.com/maximum-value-and-corresponding-index-td24834930.html > [2] > > http://stackoverflow.com/questions/5469286/how-to-get-the-index-of-a-maximum-element-in-a-numpy-array > [3] > > > http://stackoverflow.com/questions/4150542/determine-index-of-highest-value-in-pythons-numpy > [4] > > cheers, > > Chao > -- > > > *********************************************************************************** > > Chao YUE > Laboratoire des Sciences du Climat et de l'Environnement (LSCE-IPSL) > UMR 1572 CEA-CNRS-UVSQ > Batiment 712 - Pe 119 > 91191 GIF Sur YVETTE Cedex > Tel: (33) 01 69 08 29 02; Fax:01.69.08.77.16 > > > ************************************************************************************ > > Links: > ------ > [1] > http://stackoverflow.com/questions/11377028/getting-index-of-numpy-ndarray > [2] > http://old.nabble.com/maximum-value-and-corresponding-index-td24834930.html > [3] > > http://stackoverflow.com/questions/5469286/how-to-get-the-index-of-a-maximum-element-in-a-numpy-array > [4] > > http://stackoverflow.com/questions/4150542/determine-index-of-highest-value-in-pythons-numpy > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion -- Hanno Klemm klemm at phys.ethz.ch From madsipsen at gmail.com Thu Jan 10 05:32:45 2013 From: madsipsen at gmail.com (Mads Ipsen) Date: Thu, 10 Jan 2013 11:32:45 +0100 Subject: [Numpy-discussion] int and long issues Message-ID: <50EE98CD.8090700@gmail.com> Hi, I find this to be a little strange: x = numpy.arange(10) isinstance(x[0],int) gives True y = numpy.where(x < 5)[0] isinstance(y[0],int) gives False isinstance(y[0],long) gives True Specs: Python 2.7.2, numpy-1.6.1, Win7, 64 bit Best regards, Mads -- +-----------------------------------------------------+ | Mads Ipsen | +----------------------+------------------------------+ | G?seb?ksvej 7, 4. tv | | | DK-2500 Valby | phone: +45-29716388 | | Denmark | email: mads.ipsen at gmail.com | +----------------------+------------------------------+ -------------- next part -------------- An HTML attachment was scrubbed... URL: From mail.till at gmx.de Thu Jan 10 06:01:01 2013 From: mail.till at gmx.de (Till Stensitzki) Date: Thu, 10 Jan 2013 11:01:01 +0000 (UTC) Subject: [Numpy-discussion] =?utf-8?q?ANN=3A_first_previews_of_WinPython_f?= =?utf-8?q?or_Python_3=0932/64bit?= References: Message-ID: Pierre Raybaut gmail.com> writes: > > Hi all, > > I'm pleased to announce that the first previews of WinPython for > Python 3 32bit and 64bit are available (WinPython v3.3.0.0alpha1): > http://code.google.com/p/winpython/ > This first release based on Python 3 required to migrate the following > libraries which were only available for Python 2: > * formlayout 1.0.12 > * guidata 1.6.0dev1 > * guiqwt 2.3.0dev1 > * Spyder 2.1.14dev > Please note that these libraries are still development release. > [Special thanks to Christoph Gohlke for patching and building a > version of PyQwt compatible with Python 3.3] > Hey Pierre, i just want to say thanks for your work. I use spyder, winpython (no more hassle with administration) and guiqwt (fastest plotting library under pyqt) daily and love them. 
greetings Till From sebastian at sipsolutions.net Thu Jan 10 06:06:39 2013 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Thu, 10 Jan 2013 12:06:39 +0100 Subject: [Numpy-discussion] int and long issues In-Reply-To: <50EE98CD.8090700@gmail.com> References: <50EE98CD.8090700@gmail.com> Message-ID: <1357815999.2516.6.camel@sebastian-laptop> On Thu, 2013-01-10 at 11:32 +0100, Mads Ipsen wrote: > Hi, > > I find this to be a little strange: > > x = numpy.arange(10) > isinstance(x[0],int) > > gives True > > y = numpy.where(x < 5)[0] > isinstance(y[0],int) > > gives False > > isinstance(y[0],long) > Check what type(x[0])/type(y[0]) prints, I expect these are very different, because the default integer type and the integer type used for indexing (addressing memory in general) are not necessarily the same. And because of that, `y[0]` probably simply isn't compatible to the datatype of a python integer for your hardware and OS (for example for me, your code works). So on python 2 (python 3 abolishes int and makes long the only integer, so this should work as expected there) you have to just check both even in the python context, because you can never really know (there may be some nice trick for that, but not sure). And if you want to allow for rare 0d arrays as well (well they are very rare admittingly)... it gets even a bit hairier. > gives True > > Specs: Python 2.7.2, numpy-1.6.1, Win7, 64 bit > > Best regards, > > Mads > -- > +-----------------------------------------------------+ > | Mads Ipsen | > +----------------------+------------------------------+ > | G?seb?ksvej 7, 4. tv | | > | DK-2500 Valby | phone: +45-29716388 | > | Denmark | email: mads.ipsen at gmail.com | > +----------------------+------------------------------+ > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From sebastian at sipsolutions.net Thu Jan 10 06:06:54 2013 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Thu, 10 Jan 2013 12:06:54 +0100 Subject: [Numpy-discussion] int and long issues In-Reply-To: <50EE98CD.8090700@gmail.com> References: <50EE98CD.8090700@gmail.com> Message-ID: <1357816014.2516.7.camel@sebastian-laptop> On Thu, 2013-01-10 at 11:32 +0100, Mads Ipsen wrote: > Hi, > > I find this to be a little strange: > > x = numpy.arange(10) > isinstance(x[0],int) > > gives True > > y = numpy.where(x < 5)[0] > isinstance(y[0],int) > > gives False > > isinstance(y[0],long) > Check what type(x[0])/type(y[0]) prints, I expect these are very different, because the default integer type and the integer type used for indexing (addressing memory in general) are not necessarily the same. And because of that, `y[0]` probably simply isn't compatible to the datatype of a python integer for your hardware and OS (for example for me, your code works). So on python 2 (python 3 abolishes int and makes long the only integer, so this should work as expected there) you have to just check both even in the python context, because you can never really know (there may be some nice trick for that, but not sure). And if you want to allow for rare 0d arrays as well (well they are very rare admittingly)... it gets even a bit hairier. Regards, Sebastian > gives True > > Specs: Python 2.7.2, numpy-1.6.1, Win7, 64 bit > > Best regards, > > Mads > -- > +-----------------------------------------------------+ > | Mads Ipsen | > +----------------------+------------------------------+ > | G?seb?ksvej 7, 4. 
tv | | > | DK-2500 Valby | phone: +45-29716388 | > | Denmark | email: mads.ipsen at gmail.com | > +----------------------+------------------------------+ > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From mail.till at gmx.de Thu Jan 10 06:05:52 2013 From: mail.till at gmx.de (Till Stensitzki) Date: Thu, 10 Jan 2013 11:05:52 +0000 (UTC) Subject: [Numpy-discussion] Linear least squares References: Message-ID: > > QR without column pivoting is a nice option for >"safe" problems, but it doesn't >provide a reliable indication of rank >reduction. I also don't find pinv useful >once the rank goes down, since it relies on > Euclidean distance having relevance in >parameter space and that is seldom a sound >assumption, usually it is better to >reformulate the problem or remove a column > from the design matrix. Oh, i always taught that lstsq is more or less using the same procedure as pinv. Maybe you can give me a hint which algorithm is the most stable, as system (sum of expontials) is not very stable? My numerical lectures were some year ago. greetings Till From madsipsen at gmail.com Thu Jan 10 06:29:41 2013 From: madsipsen at gmail.com (Mads Ipsen) Date: Thu, 10 Jan 2013 12:29:41 +0100 Subject: [Numpy-discussion] int and long issues In-Reply-To: <1357815999.2516.6.camel@sebastian-laptop> References: <50EE98CD.8090700@gmail.com> <1357815999.2516.6.camel@sebastian-laptop> Message-ID: <50EEA625.1050305@gmail.com> Sebastian - thanks - very helpful. Best regards, Mads On 10/01/2013 12:06, Sebastian Berg wrote: > On Thu, 2013-01-10 at 11:32 +0100, Mads Ipsen wrote: >> Hi, >> >> I find this to be a little strange: >> >> x = numpy.arange(10) >> isinstance(x[0],int) >> >> gives True >> >> y = numpy.where(x < 5)[0] >> isinstance(y[0],int) >> >> gives False >> >> isinstance(y[0],long) >> > Check what type(x[0])/type(y[0]) prints, I expect these are very > different, because the default integer type and the integer type used > for indexing (addressing memory in general) are not necessarily the > same. And because of that, `y[0]` probably simply isn't compatible to > the datatype of a python integer for your hardware and OS (for example > for me, your code works). So on python 2 (python 3 abolishes int and > makes long the only integer, so this should work as expected there) you > have to just check both even in the python context, because you can > never really know (there may be some nice trick for that, but not sure). > And if you want to allow for rare 0d arrays as well (well they are very > rare admittingly)... it gets even a bit hairier. > > >> gives True >> >> Specs: Python 2.7.2, numpy-1.6.1, Win7, 64 bit >> >> Best regards, >> >> Mads >> -- >> +-----------------------------------------------------+ >> | Mads Ipsen | >> +----------------------+------------------------------+ >> | G?seb?ksvej 7, 4. 
tv | | >> | DK-2500 Valby | phone: +45-29716388 | >> | Denmark | email: mads.ipsen at gmail.com | >> +----------------------+------------------------------+ >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion -- +-----------------------------------------------------+ | Mads Ipsen | +----------------------+------------------------------+ | G?seb?ksvej 7, 4. tv | | | DK-2500 Valby | phone: +45-29716388 | | Denmark | email: mads.ipsen at gmail.com | +----------------------+------------------------------+ From pav at iki.fi Thu Jan 10 07:04:07 2013 From: pav at iki.fi (Pauli Virtanen) Date: Thu, 10 Jan 2013 12:04:07 +0000 (UTC) Subject: [Numpy-discussion] numpydoc for python 3? References: <50EDA996.6090806@aalto.fi> Message-ID: Hi, Jaakko Luttinen aalto.fi> writes: > I'm trying to use numpydoc (Sphinx extension) for my project written in > Python 3.2. However, installing numpydoc gives errors shown at > http://pastebin.com/MPED6v9G and although it says "Successfully > installed numpydoc", trying to import numpydoc raises errors.. > > Could this be fixed or am I doing something wrong? Numpydoc hasn't been ported to Python 3 so far. This probably wouldn't a very large amount of work --- patches are accepted! -- Pauli Virtanen From nouiz at nouiz.org Thu Jan 10 09:28:45 2013 From: nouiz at nouiz.org (=?ISO-8859-1?Q?Fr=E9d=E9ric_Bastien?=) Date: Thu, 10 Jan 2013 09:28:45 -0500 Subject: [Numpy-discussion] Remove support for numeric and numarray in 1.8 In-Reply-To: References: Message-ID: Hi, Just to note, as they plan to remove there dependency on it this year, is it bad that they can't use 1.8 for a few mounts until they finish the conversion? They already have a working version. They can continue to use it for as long they want. The only advantage for them if the compat layers are kept is the abbility for them to use the new numpy 1.8 a few monts earlier. I don't know enough about this issue, but from Nathaniel description, the consequence of dropping it in 1.8 seam light compared to the potential problem in my view. But the question is, how many other group are in there situation? Can we make a big warning printed when we compile again those compatibility layer to make it clear they will get removed? (probably it is already like that) my 2 cents Fred On Wed, Jan 9, 2013 at 6:41 PM, Charles R Harris wrote: > > > On Wed, Jan 9, 2013 at 4:21 PM, Christopher Hanley > wrote: >> >> After poking around our code base and talking to a few folks I predict >> that we at STScI can remove our dependence on the numpy-numarray >> compatibility layer by the end of this calendar year. I'm unsure of what >> the timeline for numpy 1.8 is so I don't know if this schedule supports >> removal of the compatibility layer from 1.8 or not. >> > > Together with the previous post that puts the kibosh on removing either > numeric or numarray support from 1.8, at least if we get 1.8 before the end > of summer. It's good to know where folks stand with regard to those > packages, we'll give it another shot next year. 
> > Chuck > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From jaakko.luttinen at aalto.fi Thu Jan 10 09:54:35 2013 From: jaakko.luttinen at aalto.fi (Jaakko Luttinen) Date: Thu, 10 Jan 2013 16:54:35 +0200 Subject: [Numpy-discussion] numpydoc for python 3? In-Reply-To: References: <50EDA996.6090806@aalto.fi> Message-ID: <50EED62B.9010105@aalto.fi> The files in numpy/doc/sphinxext/ and numpydoc/ (from PyPI) are a bit different. Which ones should be modified? -Jaakko On 01/10/2013 02:04 PM, Pauli Virtanen wrote: > Hi, > > Jaakko Luttinen aalto.fi> writes: >> I'm trying to use numpydoc (Sphinx extension) for my project written in >> Python 3.2. However, installing numpydoc gives errors shown at >> http://pastebin.com/MPED6v9G and although it says "Successfully >> installed numpydoc", trying to import numpydoc raises errors.. >> >> Could this be fixed or am I doing something wrong? > > Numpydoc hasn't been ported to Python 3 so far. This probably > wouldn't a very large amount of work --- patches are accepted! > From pav at iki.fi Thu Jan 10 10:04:40 2013 From: pav at iki.fi (Pauli Virtanen) Date: Thu, 10 Jan 2013 15:04:40 +0000 (UTC) Subject: [Numpy-discussion] numpydoc for python 3? References: <50EDA996.6090806@aalto.fi> <50EED62B.9010105@aalto.fi> Message-ID: Jaakko Luttinen aalto.fi> writes: > The files in numpy/doc/sphinxext/ and numpydoc/ (from PyPI) are a bit > different. Which ones should be modified? The stuff in sphinxext/ is the development version of the package on PyPi, so the changes should be made in sphinxext/ -- Pauli Virtanen From jaakko.luttinen at aalto.fi Thu Jan 10 10:16:32 2013 From: jaakko.luttinen at aalto.fi (Jaakko Luttinen) Date: Thu, 10 Jan 2013 17:16:32 +0200 Subject: [Numpy-discussion] numpydoc for python 3? In-Reply-To: References: <50EDA996.6090806@aalto.fi> <50EED62B.9010105@aalto.fi> Message-ID: <50EEDB50.5000902@aalto.fi> On 01/10/2013 05:04 PM, Pauli Virtanen wrote: > Jaakko Luttinen aalto.fi> writes: >> The files in numpy/doc/sphinxext/ and numpydoc/ (from PyPI) are a bit >> different. Which ones should be modified? > > The stuff in sphinxext/ is the development version of the package on > PyPi, so the changes should be made in sphinxext/ > Thanks! I'm trying to run the tests with Python 2 using nosetests, but I get some errors http://pastebin.com/Mp9i8T2f . Am I doing something wrong? How should I run the tests? If I run nosetests on the numpydoc folder from PyPI, all the tests are successful. -Jaakko From klonuo at gmail.com Thu Jan 10 10:56:38 2013 From: klonuo at gmail.com (klo) Date: Thu, 10 Jan 2013 16:56:38 +0100 Subject: [Numpy-discussion] Building Numpy 1.6.2 for Python 3.3 on Windows Message-ID: <147978185.20130110165638@gmail.com> Hi, I run `python3 setup.py config` and then python3 setup.py build --compiler=mingw32 but it picks that I have MSVC 10 and complains about manifests. Why, or even better, how to compile with available MinGW compilers? Here is log: ======================================== C:\src\numpy-1.6.2>python3 setup.py --compiler=mingw32 Converting to Python3 via 2to3... Running from numpy source directory.usage: setup.py [global_opts] cmd1 [cmd1_opts] [cmd2 [cmd2_opts] ...] or: setup.py --help [cmd1 cmd2 ...] or: setup.py --help-commands or: setup.py cmd --help error: option --compiler not recognized C:\src\numpy-1.6.2>python3 setup.py build --compiler=mingw32 Converting to Python3 via 2to3... 
F2PY Version 2 blas_opt_info: blas_mkl_info: FOUND: libraries = ['mkl_rt'] library_dirs = ['C:/Progra~1/Intel/Compos~1/mkl/lib/ia32'] include_dirs = ['C:/Progra~1/Intel/Compos~1/mkl/include'] define_macros = [('SCIPY_MKL_H', None)] FOUND: libraries = ['mkl_rt'] library_dirs = ['C:/Progra~1/Intel/Compos~1/mkl/lib/ia32'] include_dirs = ['C:/Progra~1/Intel/Compos~1/mkl/include'] define_macros = [('SCIPY_MKL_H', None)] non-existing path in 'numpy\\lib': 'benchmarks' lapack_opt_info: lapack_mkl_info: mkl_info: FOUND: libraries = ['mkl_rt'] library_dirs = ['C:/Progra~1/Intel/Compos~1/mkl/lib/ia32'] include_dirs = ['C:/Progra~1/Intel/Compos~1/mkl/include'] define_macros = [('SCIPY_MKL_H', None)] FOUND: libraries = ['mkl_rt'] library_dirs = ['C:/Progra~1/Intel/Compos~1/mkl/lib/ia32'] include_dirs = ['C:/Progra~1/Intel/Compos~1/mkl/include'] define_macros = [('SCIPY_MKL_H', None)] FOUND: libraries = ['mkl_rt'] library_dirs = ['C:/Progra~1/Intel/Compos~1/mkl/lib/ia32'] include_dirs = ['C:/Progra~1/Intel/Compos~1/mkl/include'] define_macros = [('SCIPY_MKL_H', None)] running build running config_cc unifing config_cc, config, build_clib, build_ext, build commands --compiler options running config_fc unifing config_fc, config, build_clib, build_ext, build commands --fcompiler options running build_src build_src building py_modules sources building library "npymath" sources customize GnuFCompiler Could not locate executable g77 Could not locate executable f77 customize IntelVisualFCompiler Could not locate executable ifort Could not locate executable ifl customize AbsoftFCompiler Could not locate executable f90 customize CompaqVisualFCompiler Could not locate executable DF customize IntelItaniumVisualFCompiler Could not locate executable efl customize Gnu95FCompiler Found executable C:\MinGW\bin\gfortran.exe Found executable C:\MinGW\bin\gfortran.exe Running from numpy source directory.customize Gnu95FCompiler customize Gnu95FCompiler using config Traceback (most recent call last): File "C:\src\numpy-1.6.2\build\py3k\numpy\distutils\mingw32ccompiler.py", line 399, in msvc_manifest_xml fullver = _MSVCRVER_TO_FULLVER[str(maj * 10 + min)] KeyError: '100' During handling of the above exception, another exception occurred: Traceback (most recent call last): File "setup.py", line 214, in setup_package() File "setup.py", line 207, in setup_package configuration=configuration ) File "C:\src\numpy-1.6.2\build\py3k\numpy\distutils\core.py", line 186, in setup return old_setup(**new_attr) File "c:\python33\lib\distutils\core.py", line 148, in setup dist.run_commands() File "c:\python33\lib\distutils\dist.py", line 917, in run_commands self.run_command(cmd) File "c:\python33\lib\distutils\dist.py", line 936, in run_command cmd_obj.run() File "C:\src\numpy-1.6.2\build\py3k\numpy\distutils\command\build.py", line 37, in run old_build.run(self) File "c:\python33\lib\distutils\command\build.py", line 126, in run self.run_command(cmd_name) File "c:\python33\lib\distutils\cmd.py", line 313, in run_command self.distribution.run_command(command) File "c:\python33\lib\distutils\dist.py", line 936, in run_command cmd_obj.run() File "C:\src\numpy-1.6.2\build\py3k\numpy\distutils\command\build_src.py", line 152, in run self.build_sources() File "C:\src\numpy-1.6.2\build\py3k\numpy\distutils\command\build_src.py", line 163, in build_sources self.build_library_sources(*libname_info) File "C:\src\numpy-1.6.2\build\py3k\numpy\distutils\command\build_src.py", line 298, in build_library_sources sources = 
self.generate_sources(sources, (lib_name, build_info)) File "C:\src\numpy-1.6.2\build\py3k\numpy\distutils\command\build_src.py", line 385, in generate_sources source = func(extension, build_dir) File "numpy\core\setup.py", line 694, in get_mathlib_info st = config_cmd.try_link('int main(void) { return 0;}') File "c:\python33\lib\distutils\command\config.py", line 246, in try_link libraries, library_dirs, lang) File "C:\src\numpy-1.6.2\build\py3k\numpy\distutils\command\config.py", line 146, in _link generate_manifest(self) File "C:\src\numpy-1.6.2\build\py3k\numpy\distutils\mingw32ccompiler.py", line 484, in generate_manifest manxml = msvc_manifest_xml(ma, mi) File "C:\src\numpy-1.6.2\build\py3k\numpy\distutils\mingw32ccompiler.py", line 402, in msvc_manifest_xml % (maj, min)) ValueError: Version 10,0 of MSVCRT not supported yet ======================================== From chanley at gmail.com Thu Jan 10 11:02:45 2013 From: chanley at gmail.com (Christopher Hanley) Date: Thu, 10 Jan 2013 11:02:45 -0500 Subject: [Numpy-discussion] Remove support for numeric and numarray in 1.8 In-Reply-To: References: Message-ID: I'm all for a big scary warning on import. Fair warning is good for everyone, not just our developers. As for testing, our software that uses the API is tested nightly. So if our software stops working, and the compatibility layer is the cause, we would definitely be looking into what happened. :-) In any case, fair warning of dropped support in 1.8 and removal in 1.9 is fine with us. Thanks, Chris On Wed, Jan 9, 2013 at 6:38 PM, Nathaniel Smith wrote: > On Wed, Jan 9, 2013 at 11:21 PM, Christopher Hanley > wrote: > > After poking around our code base and talking to a few folks I predict > that > > we at STScI can remove our dependence on the numpy-numarray compatibility > > layer by the end of this calendar year. I'm unsure of what the timeline > for > > numpy 1.8 is so I don't know if this schedule supports removal of the > > compatibility layer from 1.8 or not. > > It'd be nice if 1.8 were out before that, but that doesn't really > matter -- let us know when you get it sorted? > > Also, would it help if we added a big scary warning at import time to > annoy your more recalcitrant developers with? :-) > > The basic issue is that none of us actually use this stuff, it has no > tests, the rest of numpy is changing around it, and we have no idea if > it works, so at some point it makes more sense for us to just stop > shipping the compat layer and let anyone who still needs it maintain > their own copy of the code. > > -n > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From p.j.a.cock at googlemail.com Thu Jan 10 11:09:53 2013 From: p.j.a.cock at googlemail.com (Peter Cock) Date: Thu, 10 Jan 2013 16:09:53 +0000 Subject: [Numpy-discussion] Building Numpy 1.6.2 for Python 3.3 on Windows In-Reply-To: <147978185.20130110165638@gmail.com> References: <147978185.20130110165638@gmail.com> Message-ID: On Thu, Jan 10, 2013 at 3:56 PM, klo wrote: > Hi, > > I run `python3 setup.py config` and then > > python3 setup.py build --compiler=mingw32 > > but it picks that I have MSVC 10 and complains about manifests. > Why, or even better, how to compile with available MinGW compilers? 
I reported this issue/bug to the mailing list recently as part of a discussion with Ralf which lead to various fixes being made to get NumPy to compile with either mingw32 or MSCV 10. http://mail.scipy.org/pipermail/numpy-discussion/2012-November/064454.html My workaround is to change the default compiler for Python 3, by creating C:\Python33\Lib\distutils\distutils.cfg containing: [build] compiler=mingw32 Peter From raul at virtualmaterials.com Thu Jan 10 11:15:55 2013 From: raul at virtualmaterials.com (Raul Cota) Date: Thu, 10 Jan 2013 09:15:55 -0700 Subject: [Numpy-discussion] Remove support for numeric and numarray in 1.8 In-Reply-To: References: Message-ID: <50EEE93B.4040206@virtualmaterials.com> An HTML attachment was scrubbed... URL: From klonuo at gmail.com Thu Jan 10 11:35:31 2013 From: klonuo at gmail.com (klo) Date: Thu, 10 Jan 2013 17:35:31 +0100 Subject: [Numpy-discussion] Building Numpy 1.6.2 for Python 3.3 on Windows In-Reply-To: References: <147978185.20130110165638@gmail.com> Message-ID: <1074049520.20130110173531@gmail.com> > I reported this issue/bug to the mailing list recently as part of > a discussion with Ralf which lead to various fixes being made > to get NumPy to compile with either mingw32 or MSCV 10. > http://mail.scipy.org/pipermail/numpy-discussion/2012-November/064454.html > My workaround is to change the default compiler for Python 3, > by creating C:\Python33\Lib\distutils\distutils.cfg containing: > [build] > compiler=mingw32 Thanks, but I have already set C:\Python33\Lib\distutils\distutils.cfg: ======================================== [build] compiler=mingw32 [build_ext] compiler=mingw32 ======================================== From cgohlke at uci.edu Thu Jan 10 12:08:50 2013 From: cgohlke at uci.edu (Christoph Gohlke) Date: Thu, 10 Jan 2013 09:08:50 -0800 Subject: [Numpy-discussion] Building Numpy 1.6.2 for Python 3.3 on Windows In-Reply-To: <1074049520.20130110173531@gmail.com> References: <147978185.20130110165638@gmail.com> <1074049520.20130110173531@gmail.com> Message-ID: <50EEF5A2.9090600@uci.edu> On 1/10/2013 8:35 AM, klo wrote: >> I reported this issue/bug to the mailing list recently as part of >> a discussion with Ralf which lead to various fixes being made >> to get NumPy to compile with either mingw32 or MSCV 10. > >> http://mail.scipy.org/pipermail/numpy-discussion/2012-November/064454.html > >> My workaround is to change the default compiler for Python 3, >> by creating C:\Python33\Lib\distutils\distutils.cfg containing: > >> [build] >> compiler=mingw32 > > Thanks, but I have already set C:\Python33\Lib\distutils\distutils.cfg: > > ======================================== > [build] > compiler=mingw32 > > [build_ext] > compiler=mingw32 > ======================================== > Numpy <= 1.6 is not compatible with Python 3.3. Use numpy >= 1.7.0rc1. Christoph From chris.barker at noaa.gov Thu Jan 10 12:26:16 2013 From: chris.barker at noaa.gov (Chris Barker - NOAA Federal) Date: Thu, 10 Jan 2013 09:26:16 -0800 Subject: [Numpy-discussion] Which Python to use for Mac binaries In-Reply-To: References: Message-ID: Ond?ej, Vincent, and Ralf (and others..) Thank you so much for doing all this -- it's a great service to the MacPython community. -Chris -- Christopher Barker, Ph.D. 
Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From klonuo at gmail.com Thu Jan 10 14:14:35 2013 From: klonuo at gmail.com (klo) Date: Thu, 10 Jan 2013 20:14:35 +0100 Subject: [Numpy-discussion] Building Numpy 1.6.2 for Python 3.3 on Windows In-Reply-To: <50EEF5A2.9090600@uci.edu> References: <147978185.20130110165638@gmail.com> <1074049520.20130110173531@gmail.com> <50EEF5A2.9090600@uci.edu> Message-ID: <443695837.20130110201435@gmail.com> > Numpy <= 1.6 is not compatible with Python 3.3. Use numpy >= 1.7.0rc1. Thanks for the tip 1.7.0rc builds without issue From klonuo at gmail.com Thu Jan 10 14:57:50 2013 From: klonuo at gmail.com (klo) Date: Thu, 10 Jan 2013 20:57:50 +0100 Subject: [Numpy-discussion] Building Numpy 1.6.2 for Python 3.3 on Windows In-Reply-To: <443695837.20130110201435@gmail.com> References: <147978185.20130110165638@gmail.com> <1074049520.20130110173531@gmail.com> <50EEF5A2.9090600@uci.edu> <443695837.20130110201435@gmail.com> Message-ID: <485515987.20130110205750@gmail.com> >> Numpy <= 1.6 is not compatible with Python 3.3. Use numpy >= 1.7.0rc1. > Thanks for the tip > 1.7.0rc builds without issue Actually, this isn't over. It builds fine, but when I try to import numpy I get error: ======================================== ... from numpy.linalg import lapack_lite ImportError: DLL load failed: The specified module could not be found. ======================================== Google reveals that PATH has to be updated with "C:\Python33\Scripts" path, but then when I run `python3 setup.py build` I get another error: ======================================== C:\src\numpy-1.7.0rc1>python3 setup.py build Converting to Python3 via 2to3... Running from numpy source directory. 
F2PY Version 2 blas_opt_info: blas_mkl_info: FOUND: define_macros = [('SCIPY_MKL_H', None)] libraries = ['mkl_rt'] include_dirs = ['C:/Progra~1/Intel/Compos~1/mkl/include'] library_dirs = ['C:/Progra~1/Intel/Compos~1/mkl/lib/ia32'] FOUND: define_macros = [('SCIPY_MKL_H', None)] libraries = ['mkl_rt'] include_dirs = ['C:/Progra~1/Intel/Compos~1/mkl/include'] library_dirs = ['C:/Progra~1/Intel/Compos~1/mkl/lib/ia32'] non-existing path in 'numpy\\lib': 'benchmarks' lapack_opt_info: lapack_mkl_info: mkl_info: FOUND: define_macros = [('SCIPY_MKL_H', None)] libraries = ['mkl_rt'] include_dirs = ['C:/Progra~1/Intel/Compos~1/mkl/include'] library_dirs = ['C:/Progra~1/Intel/Compos~1/mkl/lib/ia32'] FOUND: define_macros = [('SCIPY_MKL_H', None)] libraries = ['mkl_rt'] include_dirs = ['C:/Progra~1/Intel/Compos~1/mkl/include'] library_dirs = ['C:/Progra~1/Intel/Compos~1/mkl/lib/ia32'] FOUND: define_macros = [('SCIPY_MKL_H', None)] libraries = ['mkl_rt'] include_dirs = ['C:/Progra~1/Intel/Compos~1/mkl/include'] library_dirs = ['C:/Progra~1/Intel/Compos~1/mkl/lib/ia32'] running build running config_cc unifing config_cc, config, build_clib, build_ext, build commands --compiler options running config_fc unifing config_fc, config, build_clib, build_ext, build commands --fcompiler options running build_src build_src building py_modules sources creating build creating build\src.win32-3.3 creating build\src.win32-3.3\numpy creating build\src.win32-3.3\numpy\distutils building library "npymath" sources Building import library (ARCH=x86): "c:\python33\libs\libpython33.a" Traceback (most recent call last): File "setup.py", line 214, in setup_package() File "setup.py", line 207, in setup_package configuration=configuration ) File "C:\src\numpy-1.7.0rc1\build\py3k\numpy\distutils\core.py", line 186, in setup return old_setup(**new_attr) File "c:\python33\lib\distutils\core.py", line 148, in setup dist.run_commands() File "c:\python33\lib\distutils\dist.py", line 917, in run_commands self.run_command(cmd) File "c:\python33\lib\distutils\dist.py", line 936, in run_command cmd_obj.run() File "C:\src\numpy-1.7.0rc1\build\py3k\numpy\distutils\command\build.py", line 37, in run old_build.run(self) File "c:\python33\lib\distutils\command\build.py", line 126, in run self.run_command(cmd_name) File "c:\python33\lib\distutils\cmd.py", line 313, in run_command self.distribution.run_command(command) File "c:\python33\lib\distutils\dist.py", line 936, in run_command cmd_obj.run() File "C:\src\numpy-1.7.0rc1\build\py3k\numpy\distutils\command\build_src.py", line 152, in run self.build_sources() File "C:\src\numpy-1.7.0rc1\build\py3k\numpy\distutils\command\build_src.py", line 163, in build_sources self.build_library_sources(*libname_info) File "C:\src\numpy-1.7.0rc1\build\py3k\numpy\distutils\command\build_src.py", line 298, in build_library_sources sources = self.generate_sources(sources, (lib_name, build_info)) File "C:\src\numpy-1.7.0rc1\build\py3k\numpy\distutils\command\build_src.py", line 385, in generate_sources source = func(extension, build_dir) File "numpy\core\setup.py", line 646, in get_mathlib_info st = config_cmd.try_link('int main(void) { return 0;}') File "c:\python33\lib\distutils\command\config.py", line 243, in try_link self._check_compiler() File "C:\src\numpy-1.7.0rc1\build\py3k\numpy\distutils\command\config.py", line 45, in _check_compiler old_config._check_compiler(self) File "c:\python33\lib\distutils\command\config.py", line 98, in _check_compiler dry_run=self.dry_run, force=1) File 
"C:\src\numpy-1.7.0rc1\build\py3k\numpy\distutils\ccompiler.py", line 560, in new_compiler compiler = klass(None, dry_run, force) File "C:\src\numpy-1.7.0rc1\build\py3k\numpy\distutils\mingw32ccompiler.py", line 91, in __init__ build_import_library() File "C:\src\numpy-1.7.0rc1\build\py3k\numpy\distutils\mingw32ccompiler.py", line 383, in build_import_library return _build_import_library_x86() File "C:\src\numpy-1.7.0rc1\build\py3k\numpy\distutils\mingw32ccompiler.py", line 428, in _build_import_library_x86 dlist, flist = lib2def.parse_nm(nm_output) File "C:\src\numpy-1.7.0rc1\build\py3k\numpy\distutils\lib2def.py", line 77, in parse_nm data = DATA_RE.findall(nm_output) TypeError: can't use a string pattern on a bytes-like object ======================================== Any ideas? From klonuo at gmail.com Thu Jan 10 15:20:17 2013 From: klonuo at gmail.com (klo) Date: Thu, 10 Jan 2013 21:20:17 +0100 Subject: [Numpy-discussion] Building Numpy 1.6.2 for Python 3.3 on Windows In-Reply-To: <485515987.20130110205750@gmail.com> References: <147978185.20130110165638@gmail.com> <1074049520.20130110173531@gmail.com> <50EEF5A2.9090600@uci.edu> <443695837.20130110201435@gmail.com> <485515987.20130110205750@gmail.com> Message-ID: <15313110.20130110212017@gmail.com> > Actually, this isn't over. It builds fine, but when I try to import > numpy I get error: > ======================================== > ... Sorry for the noise, after re-reading tracelog, I realized that I accidentally removed "c:\python33\libs\libpython33.a" while removing previous non-working numpy build Cheers From ondrej.certik at gmail.com Thu Jan 10 21:14:19 2013 From: ondrej.certik at gmail.com (=?UTF-8?B?T25kxZllaiDEjGVydMOtaw==?=) Date: Thu, 10 Jan 2013 18:14:19 -0800 Subject: [Numpy-discussion] Which Python to use for Mac binaries In-Reply-To: References: Message-ID: On Wed, Jan 9, 2013 at 11:21 PM, Ralf Gommers wrote: > Sure, no problem. For the part that needs to be built on 10.6 that is. > Vincent's box still has 10.5, right? Yes. Ondrej From ondrej.certik at gmail.com Thu Jan 10 21:14:50 2013 From: ondrej.certik at gmail.com (=?UTF-8?B?T25kxZllaiDEjGVydMOtaw==?=) Date: Thu, 10 Jan 2013 18:14:50 -0800 Subject: [Numpy-discussion] Which Python to use for Mac binaries In-Reply-To: References: Message-ID: On Thu, Jan 10, 2013 at 9:26 AM, Chris Barker - NOAA Federal wrote: > Ond?ej, Vincent, and Ralf (and others..) > > Thank you so much for doing all this -- it's a great service to the > MacPython community. Chris, thank you for your help as well! Ondrej From ondrej.certik at gmail.com Thu Jan 10 21:21:15 2013 From: ondrej.certik at gmail.com (=?UTF-8?B?T25kxZllaiDEjGVydMOtaw==?=) Date: Thu, 10 Jan 2013 18:21:15 -0800 Subject: [Numpy-discussion] Building Numpy 1.6.2 for Python 3.3 on Windows In-Reply-To: <15313110.20130110212017@gmail.com> References: <147978185.20130110165638@gmail.com> <1074049520.20130110173531@gmail.com> <50EEF5A2.9090600@uci.edu> <443695837.20130110201435@gmail.com> <485515987.20130110205750@gmail.com> <15313110.20130110212017@gmail.com> Message-ID: On Thu, Jan 10, 2013 at 12:20 PM, klo wrote: >> Actually, this isn't over. It builds fine, but when I try to import >> numpy I get error: > >> ======================================== >> ... > > Sorry for the noise, after re-reading tracelog, I realized that I accidentally > removed "c:\python33\libs\libpython33.a" while removing previous non-working > numpy build Cool, I am glad to hear that 1.7.0rc1 works great. Thanks for letting us know. 
Ondrej From ndbecker2 at gmail.com Fri Jan 11 10:40:39 2013 From: ndbecker2 at gmail.com (Neal Becker) Date: Fri, 11 Jan 2013 10:40:39 -0500 Subject: [Numpy-discussion] phase unwrapping (1d) Message-ID: np.unwrap was too slow, so I rolled by own (in c++). I wanted to be able to handle the case of unwrap (arg (x1) + arg (x2)) Here, phase can change by more than 2pi. I came up with the following algorithm, any thoughts? In the following, y is normally set to pi. o points to output i points to input nint1 finds nearest integer value_t prev_o = init; for (; i != e; ++i, ++o) { *o = cnt * 2 * y + *i; value_t delta = *o - prev_o; if (delta / y > 1 or delta / y < -1) { int i = nint1 (delta / (2*y)); *o -= 2*y*i; cnt -= i; } prev_o = *o; } From njs at pobox.com Fri Jan 11 17:00:55 2013 From: njs at pobox.com (Nathaniel Smith) Date: Fri, 11 Jan 2013 22:00:55 +0000 Subject: [Numpy-discussion] return index of maximum value in an array easily? In-Reply-To: References: Message-ID: On Thu, Jan 10, 2013 at 9:40 AM, Chao YUE wrote: > Dear all, > > Are we going to consider returning the index of maximum value in an array > easily > without calling np.argmax and np.unravel_index consecutively? This does seem like a good thing to support somehow. What would a good interface look like? Something like np.nonzero(a == np.max(a))? Should we support vectorized operation (e.g. argmax of each 2-d subarray of a 3-d array along some axis)? -n From chaoyuejoy at gmail.com Fri Jan 11 18:26:13 2013 From: chaoyuejoy at gmail.com (Chao YUE) Date: Sat, 12 Jan 2013 00:26:13 +0100 Subject: [Numpy-discussion] return index of maximum value in an array easily? In-Reply-To: References: Message-ID: Hi, I don't know how others think about this. Like you point out, one can use np.nonzero(a==np.max(a)) as a workaround. For the second point, in case I have an array: a = np.arange(24.).reshape(2,3,4) suppose I want to find the index for maximum value of each 2X3 array along the 3rd dimension, what I can think of will be: index_list = [] for i in range(a.shape[-1]): data = a[...,i] index_list.append(np.nonzero(data==np.max(data))) In [87]: index_list Out[87]: [(array([1]), array([2])), (array([1]), array([2])), (array([1]), array([2])), (array([1]), array([2]))] If we want to make the np.argmax function doing the job of this part of code, could we add another some kind of boolean keyword argument, for example, "exclude" to the function? [this is only my thinking, and I am only a beginner, maybe it's stupid!!!] np.argmax(a,axis=2,exclude=True) (default value for exclude is False) it will give the index of maximum value along all other axis except the axis=2 (which is acutally the 3rd axis) The output will be: np.array(index_list).squeeze() array([[1, 2], [1, 2], [1, 2], [1, 2]]) and one can use a[1,2,i] (i=1,2,3,4) to extract the maximum value. I doubt this is really useful...... too complicated...... Chao On Fri, Jan 11, 2013 at 11:00 PM, Nathaniel Smith wrote: > On Thu, Jan 10, 2013 at 9:40 AM, Chao YUE wrote: > > Dear all, > > > > Are we going to consider returning the index of maximum value in an array > > easily > > without calling np.argmax and np.unravel_index consecutively? > > This does seem like a good thing to support somehow. What would a good > interface look like? Something like np.nonzero(a == np.max(a))? Should > we support vectorized operation (e.g. argmax of each 2-d subarray of a > 3-d array along some axis)? 
> > -n > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -- *********************************************************************************** Chao YUE Laboratoire des Sciences du Climat et de l'Environnement (LSCE-IPSL) UMR 1572 CEA-CNRS-UVSQ Batiment 712 - Pe 119 91191 GIF Sur YVETTE Cedex Tel: (33) 01 69 08 29 02; Fax:01.69.08.77.16 ************************************************************************************ -------------- next part -------------- An HTML attachment was scrubbed... URL: From sebastian at sipsolutions.net Fri Jan 11 20:33:07 2013 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Sat, 12 Jan 2013 02:33:07 +0100 Subject: [Numpy-discussion] return index of maximum value in an array easily? In-Reply-To: References: Message-ID: <1357954387.8396.6.camel@sebastian-laptop> On Sat, 2013-01-12 at 00:26 +0100, Chao YUE wrote: > Hi, > > I don't know how others think about this. Like you point out, one can > use > np.nonzero(a==np.max(a)) as a workaround. > > For the second point, in case I have an array: > a = np.arange(24.).reshape(2,3,4) > > suppose I want to find the index for maximum value of each 2X3 array > along > the 3rd dimension, what I can think of will be: > > index_list = [] > for i in range(a.shape[-1]): > data = a[...,i] > index_list.append(np.nonzero(data==np.max(data))) > To keep being close to min/max (and other ufunc based reduce operations), it would seem consistent to allow something like np.argmax(array, axis=(1, 2)), which would give a tuple of arrays as result such that array[np.argmax(array, axis=(1,2))] == np.max(array, axis=(1,2)) But apart from consistency, I am not sure anyone would get the idea of giving multiple axes into the function... > > In [87]: > > > index_list > Out[87]: > [(array([1]), array([2])), > (array([1]), array([2])), > (array([1]), array([2])), > (array([1]), array([2]))] > > > If we want to make the np.argmax function doing the job of this part > of code, > could we add another some kind of boolean keyword argument, for > example, > "exclude" to the function? > [this is only my thinking, and I am only a beginner, maybe it's > stupid!!!] > > np.argmax(a,axis=2,exclude=True) (default value for exclude is False) > > it will give the index of maximum value along all other axis except > the axis=2 > (which is acutally the 3rd axis) > > The output will be: > > np.array(index_list).squeeze() > > array([[1, 2], > [1, 2], > [1, 2], > [1, 2]]) > > and one can use a[1,2,i] (i=1,2,3,4) to extract the maximum value. > > I doubt this is really useful...... too complicated...... > > Chao > > On Fri, Jan 11, 2013 at 11:00 PM, Nathaniel Smith > wrote: > On Thu, Jan 10, 2013 at 9:40 AM, Chao YUE > wrote: > > Dear all, > > > > Are we going to consider returning the index of maximum > value in an array > > easily > > without calling np.argmax and np.unravel_index > consecutively? > > > This does seem like a good thing to support somehow. What > would a good > interface look like? Something like np.nonzero(a == > np.max(a))? Should > we support vectorized operation (e.g. argmax of each 2-d > subarray of a > 3-d array along some axis)? 
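For concreteness, a minimal sketch of how both cases can already be handled with the existing API (np.argmax plus np.unravel_index for the global maximum, and a plain loop for the per-slice case); the 2x3x4 example array is the same one used above:

import numpy as np

a = np.arange(24.).reshape(2, 3, 4)

# Global maximum: take argmax of the flattened array, then unravel
# the flat index back into an N-d index tuple.
flat_idx = np.argmax(a)
nd_idx = np.unravel_index(flat_idx, a.shape)   # -> (1, 2, 3)
assert a[nd_idx] == a.max()

# Per-slice maxima along the last axis: one (row, col) pair for each
# 2x3 sub-array a[..., i], as in the loop above.
indices = [np.unravel_index(np.argmax(a[..., i]), a[..., i].shape)
           for i in range(a.shape[-1])]
# indices == [(1, 2), (1, 2), (1, 2), (1, 2)]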
> > -n > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > > -- > *********************************************************************************** > Chao YUE > Laboratoire des Sciences du Climat et de l'Environnement (LSCE-IPSL) > UMR 1572 CEA-CNRS-UVSQ > Batiment 712 - Pe 119 > 91191 GIF Sur YVETTE Cedex > Tel: (33) 01 69 08 29 02; Fax:01.69.08.77.16 > > ************************************************************************************ > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From charlesr.harris at gmail.com Sun Jan 13 00:34:32 2013 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 12 Jan 2013 22:34:32 -0700 Subject: [Numpy-discussion] How many build systems do we need? Message-ID: Hi All, In the continuing proposal for cleanups, note that we currently support three (3!) build systems, distutils, scons, and bento. That's a bit much to maintain when contemplating changes, and scons and bento both have external dependencies. Can we dispense with any of these? Thoughts? Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From mike.r.anderson.13 at gmail.com Sun Jan 13 07:08:18 2013 From: mike.r.anderson.13 at gmail.com (Mike Anderson) Date: Sun, 13 Jan 2013 20:08:18 +0800 Subject: [Numpy-discussion] Insights / lessons learned from NumPy design In-Reply-To: References: <50E68C2A.9060400@astro.uio.no> <50E68F12.90804@astro.uio.no> Message-ID: On 10 January 2013 05:19, Chris Barker - NOAA Federal wrote: > On Wed, Jan 9, 2013 at 2:57 AM, Mike Anderson > > > I'm hoping the API will be independent of storage format - i.e. the > > underlying implementations can store the data any way they like. So the > API > > will be written in terms of abstractions, and the user will have the > choice > > of whatever concrete implementation best fits the specific needs. Sparse > > matrices, tiled matrices etc. should all be possible options. > > A note about that -- as I think if it, numpy arrays are two things: > > 1) a python object for working with numbers, in a wide variety of ways > > 2) a wrapper around a C-array (or data block) that can be used to > provide an easyway for Python to interact with C (and Fortran, and...) > libraries, etc. > > As it turns out a LOT of people use numpy for (2) -- what this means > is that while you could change the underlying data representation, > etc, and keep the same Python API -- such changes would break a lot of > non-pure-python code that relies on that data representation. > > This is a big issue with the numpy-for-PyPy project -- they could > write a numpy clone, but it would only be useful for the pure-python > stuff. > > Even then, a number of folks do tricks with numpy arrays in python > that rely on the underlying structure. > > Not sure how all this would play out for Clojure, but it's something > to keep in mind. Thanks Chris - this is a really helpful insight. Trying to translate that into the Clojure world, I think that's roughly equivalent to the separation between the API (roughly equivalent to the methods in the ndarray referred to in 1) from the specific implementations (which will probably include a data block ndarray-style wrapper like 2, but would also leave open other implementation options). 
That way the majority of users can code purely against the API, and they won't be affected if (when?) the underlying implementation changes. In this way, they should be able to get the benefits of 2) without building a direct dependency on it. Of course, I still expect some users to circumvent the API and build a dependency on the underlying implementation. Nothing we can do to stop that, and they may even have good reasons like hardcore performance optimization. We have to assume at that point they know what they are doing and are prepared to live with the consequences :-) -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Sun Jan 13 08:29:20 2013 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Sun, 13 Jan 2013 14:29:20 +0100 Subject: [Numpy-discussion] How many build systems do we need? In-Reply-To: References: Message-ID: On Sun, Jan 13, 2013 at 6:34 AM, Charles R Harris wrote: > Hi All, > > In the continuing proposal for cleanups, note that we currently support > three (3!) build systems, distutils, scons, and bento. That's a bit much to > maintain when contemplating changes, and scons and bento both have external > dependencies. Can we dispense with any of these? Thoughts? > Numscons is the only one that can be dropped. I'm still using it regularly, but the few things it does better than bento can be easily improved in bento. So if removing numscons support from master saves some developer hours, +1 from me. Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From cournape at gmail.com Sun Jan 13 08:44:31 2013 From: cournape at gmail.com (David Cournapeau) Date: Sun, 13 Jan 2013 07:44:31 -0600 Subject: [Numpy-discussion] How many build systems do we need? In-Reply-To: References: Message-ID: On Sun, Jan 13, 2013 at 7:29 AM, Ralf Gommers wrote: > > > > On Sun, Jan 13, 2013 at 6:34 AM, Charles R Harris > wrote: >> >> Hi All, >> >> In the continuing proposal for cleanups, note that we currently support >> three (3!) build systems, distutils, scons, and bento. That's a bit much to >> maintain when contemplating changes, and scons and bento both have external >> dependencies. Can we dispense with any of these? Thoughts? > > > Numscons is the only one that can be dropped. I'm still using it regularly, > but the few things it does better than bento can be easily improved in > bento. So if removing numscons support from master saves some developer > hours, +1 from me. I think numscons was already scheduled to be dropped in 1.7 (and next version of scipy as well) ? I am certainly in favor of dropping it as well. David From ralf.gommers at gmail.com Sun Jan 13 09:03:51 2013 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Sun, 13 Jan 2013 15:03:51 +0100 Subject: [Numpy-discussion] How many build systems do we need? In-Reply-To: References: Message-ID: On Sun, Jan 13, 2013 at 2:44 PM, David Cournapeau wrote: > On Sun, Jan 13, 2013 at 7:29 AM, Ralf Gommers > wrote: > > > > > > > > On Sun, Jan 13, 2013 at 6:34 AM, Charles R Harris > > wrote: > >> > >> Hi All, > >> > >> In the continuing proposal for cleanups, note that we currently support > >> three (3!) build systems, distutils, scons, and bento. That's a bit > much to > >> maintain when contemplating changes, and scons and bento both have > external > >> dependencies. Can we dispense with any of these? Thoughts? > > > > > > Numscons is the only one that can be dropped. 
I'm still using it > regularly, > > but the few things it does better than bento can be easily improved in > > bento. So if removing numscons support from master saves some developer > > hours, +1 from me. > > I think numscons was already scheduled to be dropped in 1.7 (and next > version of scipy as well) ? I am certainly in favor of dropping it as > well. > It was deprecated when we added Bento support, but we never decided on a timeline. Scipy is a different story, since Bento doesn't build it correctly the last few times I checked (there are some tickets by Pauli and me on the Bento issue list). Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Sun Jan 13 09:30:24 2013 From: njs at pobox.com (Nathaniel Smith) Date: Sun, 13 Jan 2013 14:30:24 +0000 Subject: [Numpy-discussion] How many build systems do we need? In-Reply-To: References: Message-ID: On Sun, Jan 13, 2013 at 5:34 AM, Charles R Harris wrote: > Hi All, > > In the continuing proposal for cleanups, note that we currently support > three (3!) build systems, distutils, scons, and bento. That's a bit much to > maintain when contemplating changes, and scons and bento both have external > dependencies. Can we dispense with any of these? Thoughts? I think it's actually 6 build systems, because each build system supports two modes: compiling each source file separately before linking, and concatenating everything into one big file and compiling that. It's been proposed that we phase out the one-file build (which is currently the default): http://mail.scipy.org/pipermail/numpy-discussion/2012-June/063015.html https://github.com/numpy/numpy/issues/315 The separate compilation approach is superior in every way, so long as it works. There is a theory that on some system somewhere there might be a broken compiler/linker which make it not work[1], but we don't actually know of any such system. Shall we switch the default to separate compilation for 1.8 and see if anyone notices? -n [1] The problem is that we need to make sure that symbols defined in numpy .c files are visible to other numpy .c files, but not to non-numpy code linked into the same process; this is a problem that the C standard didn't consider, so it requires system-specific fiddling. However that fiddling is pretty standard these days. From charlesr.harris at gmail.com Sun Jan 13 10:47:24 2013 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sun, 13 Jan 2013 08:47:24 -0700 Subject: [Numpy-discussion] How many build systems do we need? In-Reply-To: References: Message-ID: On Sun, Jan 13, 2013 at 7:30 AM, Nathaniel Smith wrote: > On Sun, Jan 13, 2013 at 5:34 AM, Charles R Harris > wrote: > > Hi All, > > > > In the continuing proposal for cleanups, note that we currently support > > three (3!) build systems, distutils, scons, and bento. That's a bit much > to > > maintain when contemplating changes, and scons and bento both have > external > > dependencies. Can we dispense with any of these? Thoughts? > > I think it's actually 6 build systems, because each build system > supports two modes: compiling each source file separately before > linking, and concatenating everything into one big file and compiling > that. > > It's been proposed that we phase out the one-file build (which is > currently the default): > http://mail.scipy.org/pipermail/numpy-discussion/2012-June/063015.html > https://github.com/numpy/numpy/issues/315 > The separate compilation approach is superior in every way, so long as > it works. 
There is a theory that on some system somewhere there might > be a broken compiler/linker which make it not work[1], but we don't > actually know of any such system. > > Shall we switch the default to separate compilation for 1.8 and see if > anyone notices? > > +1 > -n > > [1] The problem is that we need to make sure that symbols defined in > numpy .c files are visible to other numpy .c files, but not to > non-numpy code linked into the same process; this is a problem that > the C standard didn't consider, so it requires system-specific > fiddling. However that fiddling is pretty standard these days. > And do we really care? I've compiled and statically linked libraries using setup.py because it is more easily portable than make, and on windows few symbols are exposed by default while on linux most are, but who looks? Exposing unneeded symbols is a bit of a wart but I don't think it matters that much for most things. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Sun Jan 13 10:50:15 2013 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sun, 13 Jan 2013 08:50:15 -0700 Subject: [Numpy-discussion] How many build systems do we need? In-Reply-To: References: Message-ID: On Sun, Jan 13, 2013 at 6:44 AM, David Cournapeau wrote: > On Sun, Jan 13, 2013 at 7:29 AM, Ralf Gommers > wrote: > > > > > > > > On Sun, Jan 13, 2013 at 6:34 AM, Charles R Harris > > wrote: > >> > >> Hi All, > >> > >> In the continuing proposal for cleanups, note that we currently support > >> three (3!) build systems, distutils, scons, and bento. That's a bit > much to > >> maintain when contemplating changes, and scons and bento both have > external > >> dependencies. Can we dispense with any of these? Thoughts? > > > > > > Numscons is the only one that can be dropped. I'm still using it > regularly, > > but the few things it does better than bento can be easily improved in > > bento. So if removing numscons support from master saves some developer > > hours, +1 from me. > > I think numscons was already scheduled to be dropped in 1.7 (and next > version of scipy as well) ? I am certainly in favor of dropping it as > well. > Is bento documented anywhere or can you commit to keeping it working for numpy? Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From cournape at gmail.com Sun Jan 13 11:25:00 2013 From: cournape at gmail.com (David Cournapeau) Date: Sun, 13 Jan 2013 10:25:00 -0600 Subject: [Numpy-discussion] How many build systems do we need? In-Reply-To: References: Message-ID: On Sun, Jan 13, 2013 at 9:50 AM, Charles R Harris wrote: > > > On Sun, Jan 13, 2013 at 6:44 AM, David Cournapeau > wrote: >> >> On Sun, Jan 13, 2013 at 7:29 AM, Ralf Gommers >> wrote: >> > >> > >> > >> > On Sun, Jan 13, 2013 at 6:34 AM, Charles R Harris >> > wrote: >> >> >> >> Hi All, >> >> >> >> In the continuing proposal for cleanups, note that we currently support >> >> three (3!) build systems, distutils, scons, and bento. That's a bit >> >> much to >> >> maintain when contemplating changes, and scons and bento both have >> >> external >> >> dependencies. Can we dispense with any of these? Thoughts? >> > >> > >> > Numscons is the only one that can be dropped. I'm still using it >> > regularly, >> > but the few things it does better than bento can be easily improved in >> > bento. So if removing numscons support from master saves some developer >> > hours, +1 from me. 
>> >> I think numscons was already scheduled to be dropped in 1.7 (and next >> version of scipy as well) ? I am certainly in favor of dropping it as >> well. > > > Is bento documented anywhere or can you commit to keeping it working for > numpy? Both. Doc: http://bento.readthedocs.org/en/latest/ and tests (https://travis-ci.org/cournape/Bento) are continuously run/updated. David From ralf.gommers at gmail.com Sun Jan 13 12:11:18 2013 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Sun, 13 Jan 2013 18:11:18 +0100 Subject: [Numpy-discussion] How many build systems do we need? In-Reply-To: References: Message-ID: On Sun, Jan 13, 2013 at 5:25 PM, David Cournapeau wrote: > On Sun, Jan 13, 2013 at 9:50 AM, Charles R Harris > wrote: > > > > > > On Sun, Jan 13, 2013 at 6:44 AM, David Cournapeau > > wrote: > >> > >> On Sun, Jan 13, 2013 at 7:29 AM, Ralf Gommers > >> wrote: > >> > > >> > > >> > > >> > On Sun, Jan 13, 2013 at 6:34 AM, Charles R Harris > >> > wrote: > >> >> > >> >> Hi All, > >> >> > >> >> In the continuing proposal for cleanups, note that we currently > support > >> >> three (3!) build systems, distutils, scons, and bento. That's a bit > >> >> much to > >> >> maintain when contemplating changes, and scons and bento both have > >> >> external > >> >> dependencies. Can we dispense with any of these? Thoughts? > >> > > >> > > >> > Numscons is the only one that can be dropped. I'm still using it > >> > regularly, > >> > but the few things it does better than bento can be easily improved in > >> > bento. So if removing numscons support from master saves some > developer > >> > hours, +1 from me. > >> > >> I think numscons was already scheduled to be dropped in 1.7 (and next > >> version of scipy as well) ? I am certainly in favor of dropping it as > >> well. > > > > > > Is bento documented anywhere or can you commit to keeping it working for > > numpy? > > Both. Doc: http://bento.readthedocs.org/en/latest/ and tests > (https://travis-ci.org/cournape/Bento) are continuously run/updated. > That's bento's own test suite only - we should add a numpy build with Bento for at least Python 2.7 to the numpy Travis config. Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Sun Jan 13 12:11:47 2013 From: njs at pobox.com (Nathaniel Smith) Date: Sun, 13 Jan 2013 17:11:47 +0000 Subject: [Numpy-discussion] How many build systems do we need? In-Reply-To: References: Message-ID: On Sun, Jan 13, 2013 at 3:47 PM, Charles R Harris wrote: > > > On Sun, Jan 13, 2013 at 7:30 AM, Nathaniel Smith wrote: >> >> On Sun, Jan 13, 2013 at 5:34 AM, Charles R Harris >> wrote: >> > Hi All, >> > >> > In the continuing proposal for cleanups, note that we currently support >> > three (3!) build systems, distutils, scons, and bento. That's a bit much >> > to >> > maintain when contemplating changes, and scons and bento both have >> > external >> > dependencies. Can we dispense with any of these? Thoughts? >> >> I think it's actually 6 build systems, because each build system >> supports two modes: compiling each source file separately before >> linking, and concatenating everything into one big file and compiling >> that. >> >> It's been proposed that we phase out the one-file build (which is >> currently the default): >> http://mail.scipy.org/pipermail/numpy-discussion/2012-June/063015.html >> https://github.com/numpy/numpy/issues/315 >> The separate compilation approach is superior in every way, so long as >> it works. 
There is a theory that on some system somewhere there might >> be a broken compiler/linker which make it not work[1], but we don't >> actually know of any such system. >> >> Shall we switch the default to separate compilation for 1.8 and see if >> anyone notices? >> > > +1 https://github.com/numpy/numpy/issues/2913 >> [1] The problem is that we need to make sure that symbols defined in >> numpy .c files are visible to other numpy .c files, but not to >> non-numpy code linked into the same process; this is a problem that >> the C standard didn't consider, so it requires system-specific >> fiddling. However that fiddling is pretty standard these days. > > And do we really care? I've compiled and statically linked libraries using > setup.py because it is more easily portable than make, and on windows few > symbols are exposed by default while on linux most are, but who looks? > Exposing unneeded symbols is a bit of a wart but I don't think it matters > that much for most things. I guess it's like many things in programming -- it doesn't matter until it does. (And then you realize that you should have done things properly from the start because you have a horrible hairball of "eh, does it really matter?" to sort out.) OTOH it's easy to build a static object without unneeded symbols, at least if you have access to gnu ld. I assume that most of these systems that don't have dynamic linkers still use some variant of binutils: # Let's take the .o files from separate compilation and make a single .o file suitable for static linking $ NPY_SEPARATE_COMPILATION=1 python setup.py build $ cd build/temp.*/numpy/core/src/multiarray # Combine all .o files into a single .o file, resolving all internal symbols: $ ld -r *.o -o multiarray-all.o # This file still exports a ton of junk... $ nm multiarray-all.o | wc -l 2541 # But now, we can strip out all the stuff we don't want to be public $ strip --strip-all --keep-symbol initmultiarray multiarray-all.o # Ta-da, a single-file static Python module that exports only the module setup symbol: $ nm multiarray-all.o 0000000000047a40 T initmultiarray -n From cournape at gmail.com Sun Jan 13 12:26:33 2013 From: cournape at gmail.com (David Cournapeau) Date: Sun, 13 Jan 2013 11:26:33 -0600 Subject: [Numpy-discussion] How many build systems do we need? In-Reply-To: References: Message-ID: On Sun, Jan 13, 2013 at 11:11 AM, Ralf Gommers wrote: > > > > On Sun, Jan 13, 2013 at 5:25 PM, David Cournapeau > wrote: >> >> On Sun, Jan 13, 2013 at 9:50 AM, Charles R Harris >> wrote: >> > >> > >> > On Sun, Jan 13, 2013 at 6:44 AM, David Cournapeau >> > wrote: >> >> >> >> On Sun, Jan 13, 2013 at 7:29 AM, Ralf Gommers >> >> wrote: >> >> > >> >> > >> >> > >> >> > On Sun, Jan 13, 2013 at 6:34 AM, Charles R Harris >> >> > wrote: >> >> >> >> >> >> Hi All, >> >> >> >> >> >> In the continuing proposal for cleanups, note that we currently >> >> >> support >> >> >> three (3!) build systems, distutils, scons, and bento. That's a bit >> >> >> much to >> >> >> maintain when contemplating changes, and scons and bento both have >> >> >> external >> >> >> dependencies. Can we dispense with any of these? Thoughts? >> >> > >> >> > >> >> > Numscons is the only one that can be dropped. I'm still using it >> >> > regularly, >> >> > but the few things it does better than bento can be easily improved >> >> > in >> >> > bento. So if removing numscons support from master saves some >> >> > developer >> >> > hours, +1 from me. 
>> >> >> >> I think numscons was already scheduled to be dropped in 1.7 (and next >> >> version of scipy as well) ? I am certainly in favor of dropping it as >> >> well. >> > >> > >> > Is bento documented anywhere or can you commit to keeping it working for >> > numpy? >> >> Both. Doc: http://bento.readthedocs.org/en/latest/ and tests >> (https://travis-ci.org/cournape/Bento) are continuously run/updated. > > > That's bento's own test suite only - we should add a numpy build with Bento > for at least Python 2.7 to the numpy Travis config. Definitely. I was merely answering Chuck's worries that bento may just be a one man, undocumented, bitrotted thing :) David From njs at pobox.com Sun Jan 13 12:27:54 2013 From: njs at pobox.com (Nathaniel Smith) Date: Sun, 13 Jan 2013 17:27:54 +0000 Subject: [Numpy-discussion] New numpy functions: filled, filled_like Message-ID: Hi all, PR 2875 adds two new functions, that generalize zeros(), ones(), zeros_like(), ones_like(), by simply taking an arbitrary fill value: https://github.com/numpy/numpy/pull/2875 So np.ones((10, 10)) is the same as np.filled((10, 10), 1) The implementations are trivial, but the API seems useful because it provides an idiomatic way of efficiently creating an array full of inf, or nan, or None, whatever funny value you need. All the alternatives are either inefficient (np.ones(...) * np.inf) or cumbersome (a = np.empty(...); a.fill(...)). Or so it seems to me. But there's a question of taste here; one could argue instead that these just add more clutter to the numpy namespace. So, before we merge, anyone want to chime in? (Bonus, extra bike-sheddy survey: do people prefer np.filled((10, 10), np.nan) np.filled_like(my_arr, np.nan) or np.filled(np.nan, (10, 10)) np.filled_like(np.nan, my_arr) ?) -n From josef.pktd at gmail.com Sun Jan 13 12:44:11 2013 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Sun, 13 Jan 2013 12:44:11 -0500 Subject: [Numpy-discussion] New numpy functions: filled, filled_like In-Reply-To: References: Message-ID: On Sun, Jan 13, 2013 at 12:27 PM, Nathaniel Smith wrote: > Hi all, > > PR 2875 adds two new functions, that generalize zeros(), ones(), > zeros_like(), ones_like(), by simply taking an arbitrary fill value: > https://github.com/numpy/numpy/pull/2875 > So > np.ones((10, 10)) > is the same as > np.filled((10, 10), 1) > > The implementations are trivial, but the API seems useful because it > provides an idiomatic way of efficiently creating an array full of > inf, or nan, or None, whatever funny value you need. All the > alternatives are either inefficient (np.ones(...) * np.inf) or > cumbersome (a = np.empty(...); a.fill(...)). Or so it seems to me. But > there's a question of taste here; one could argue instead that these > just add more clutter to the numpy namespace. So, before we merge, > anyone want to chime in? +1 I find it useful. I do the indirect way very often, or write matlab style helper functions. def nanes: .... problem dtype: inf and nan only makes sense for float I don't think I used many besides those two. > > (Bonus, extra bike-sheddy survey: do people prefer > np.filled((10, 10), np.nan) > np.filled_like(my_arr, np.nan) + 0.5 > or > np.filled(np.nan, (10, 10)) > np.filled_like(np.nan, my_arr) > ?) 
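For concreteness, the proposed helpers amount to the empty-plus-fill two-liner wrapped up; a rough pure-Python sketch (filled/filled_like here follow the names in the PR and are not an existing numpy API):

import numpy as np

def filled(shape, fill_value, dtype=float):
    # Sketch of the proposed np.filled(): allocate uninitialized
    # memory, then overwrite every element with fill_value.
    a = np.empty(shape, dtype=dtype)
    a.fill(fill_value)
    return a

def filled_like(arr, fill_value, dtype=None):
    # Sketch of the proposed np.filled_like(); dtype=None keeps
    # the dtype of the prototype array.
    a = np.empty_like(arr, dtype=dtype)
    a.fill(fill_value)
    return a

nans = filled((10, 10), np.nan)      # a 10x10 array full of nan
infs = filled_like(nans, np.inf)     # same shape and dtype, full of inf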
Josef > > -n > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From e.antero.tammi at gmail.com Sun Jan 13 13:27:42 2013 From: e.antero.tammi at gmail.com (eat) Date: Sun, 13 Jan 2013 20:27:42 +0200 Subject: [Numpy-discussion] New numpy functions: filled, filled_like In-Reply-To: References: Message-ID: Hi, On Sun, Jan 13, 2013 at 7:27 PM, Nathaniel Smith wrote: > Hi all, > > PR 2875 adds two new functions, that generalize zeros(), ones(), > zeros_like(), ones_like(), by simply taking an arbitrary fill value: > https://github.com/numpy/numpy/pull/2875 > So > np.ones((10, 10)) > is the same as > np.filled((10, 10), 1) > > The implementations are trivial, but the API seems useful because it > provides an idiomatic way of efficiently creating an array full of > inf, or nan, or None, whatever funny value you need. All the > alternatives are either inefficient (np.ones(...) * np.inf) or > cumbersome (a = np.empty(...); a.fill(...)). Or so it seems to me. But > there's a question of taste here; one could argue instead that these > just add more clutter to the numpy namespace. So, before we merge, > anyone want to chime in? > > (Bonus, extra bike-sheddy survey: do people prefer > np.filled((10, 10), np.nan) > np.filled_like(my_arr, np.nan) > +0 OTOH, it might also be handy to let val to be an array as well, which is then repeated to fill the array. My 2 cents. -eat > or > np.filled(np.nan, (10, 10)) > np.filled_like(np.nan, my_arr) > ?) > > -n > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From matthew.brett at gmail.com Sun Jan 13 13:30:46 2013 From: matthew.brett at gmail.com (Matthew Brett) Date: Sun, 13 Jan 2013 18:30:46 +0000 Subject: [Numpy-discussion] New numpy functions: filled, filled_like In-Reply-To: References: Message-ID: Hi, On Sun, Jan 13, 2013 at 5:27 PM, Nathaniel Smith wrote: > Hi all, > > PR 2875 adds two new functions, that generalize zeros(), ones(), > zeros_like(), ones_like(), by simply taking an arbitrary fill value: > https://github.com/numpy/numpy/pull/2875 > So > np.ones((10, 10)) > is the same as > np.filled((10, 10), 1) > > The implementations are trivial, but the API seems useful because it > provides an idiomatic way of efficiently creating an array full of > inf, or nan, or None, whatever funny value you need. All the > alternatives are either inefficient (np.ones(...) * np.inf) or > cumbersome (a = np.empty(...); a.fill(...)). Or so it seems to me. But > there's a question of taste here; one could argue instead that these > just add more clutter to the numpy namespace. So, before we merge, > anyone want to chime in? > > (Bonus, extra bike-sheddy survey: do people prefer > np.filled((10, 10), np.nan) > np.filled_like(my_arr, np.nan) > or > np.filled(np.nan, (10, 10)) > np.filled_like(np.nan, my_arr) > ?) I remember there has been a reluctance in the past to add functions that were two-liners. I guess the problem might be that the namespace fills up with many similar things. Is this a worry? Best, Matthew From charlesr.harris at gmail.com Sun Jan 13 13:33:51 2013 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sun, 13 Jan 2013 11:33:51 -0700 Subject: [Numpy-discussion] How many build systems do we need? 
In-Reply-To: References: Message-ID: On Sun, Jan 13, 2013 at 10:26 AM, David Cournapeau wrote: > On Sun, Jan 13, 2013 at 11:11 AM, Ralf Gommers > wrote: > > > > > > > > On Sun, Jan 13, 2013 at 5:25 PM, David Cournapeau > > wrote: > >> > >> On Sun, Jan 13, 2013 at 9:50 AM, Charles R Harris > >> wrote: > >> > > >> > > >> > On Sun, Jan 13, 2013 at 6:44 AM, David Cournapeau > > >> > wrote: > >> >> > >> >> On Sun, Jan 13, 2013 at 7:29 AM, Ralf Gommers < > ralf.gommers at gmail.com> > >> >> wrote: > >> >> > > >> >> > > >> >> > > >> >> > On Sun, Jan 13, 2013 at 6:34 AM, Charles R Harris > >> >> > wrote: > >> >> >> > >> >> >> Hi All, > >> >> >> > >> >> >> In the continuing proposal for cleanups, note that we currently > >> >> >> support > >> >> >> three (3!) build systems, distutils, scons, and bento. That's a > bit > >> >> >> much to > >> >> >> maintain when contemplating changes, and scons and bento both have > >> >> >> external > >> >> >> dependencies. Can we dispense with any of these? Thoughts? > >> >> > > >> >> > > >> >> > Numscons is the only one that can be dropped. I'm still using it > >> >> > regularly, > >> >> > but the few things it does better than bento can be easily improved > >> >> > in > >> >> > bento. So if removing numscons support from master saves some > >> >> > developer > >> >> > hours, +1 from me. > >> >> > >> >> I think numscons was already scheduled to be dropped in 1.7 (and next > >> >> version of scipy as well) ? I am certainly in favor of dropping it as > >> >> well. > >> > > >> > > >> > Is bento documented anywhere or can you commit to keeping it working > for > >> > numpy? > >> > >> Both. Doc: http://bento.readthedocs.org/en/latest/ and tests > >> (https://travis-ci.org/cournape/Bento) are continuously run/updated. > > > > > > That's bento's own test suite only - we should add a numpy build with > Bento > > for at least Python 2.7 to the numpy Travis config. > > Definitely. I was merely answering Chuck's worries that bento may just > be a one man, undocumented, bitrotted thing :) > > Tsk, tsk, I would never use such extreme language ;) I put up a PR expunging SCons support from numpy, could you take a look at it? There was also a file related to mingw builds on windows (in cpu_id I think) that has no bento equivalent. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Sun Jan 13 14:03:22 2013 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sun, 13 Jan 2013 12:03:22 -0700 Subject: [Numpy-discussion] 1.8 release Message-ID: Now that 1.7 is nearing release, it's time to look forward to the 1.8 release. I'd like us to get back to the twice yearly schedule that we tried to maintain through the 1.3 - 1.6 releases, so I propose a June release as a goal. Call it the Spring Cleaning release. As to content, I'd like to see the following. Removal of Python 2.4-2.5 support. Removal of SCons support. The index work consolidated. Initial stab at removing the need for 2to3. See Pauli's PR for scipy. Miscellaneous enhancements and fixes. Thoughts? Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From jaakko.luttinen at aalto.fi Sun Jan 13 17:46:49 2013 From: jaakko.luttinen at aalto.fi (Jaakko Luttinen) Date: Mon, 14 Jan 2013 00:46:49 +0200 Subject: [Numpy-discussion] numpydoc for python 3? 
In-Reply-To: <50EEDB50.5000902@aalto.fi> References: <50EDA996.6090806@aalto.fi> <50EED62B.9010105@aalto.fi> <50EEDB50.5000902@aalto.fi> Message-ID: <50F33959.7030203@aalto.fi> On 2013-01-10 17:16, Jaakko Luttinen wrote: > On 01/10/2013 05:04 PM, Pauli Virtanen wrote: >> Jaakko Luttinen aalto.fi> writes: >>> The files in numpy/doc/sphinxext/ and numpydoc/ (from PyPI) are a bit >>> different. Which ones should be modified? >> >> The stuff in sphinxext/ is the development version of the package on >> PyPi, so the changes should be made in sphinxext/ >> > > Thanks! > > I'm trying to run the tests with Python 2 using nosetests, but I get > some errors http://pastebin.com/Mp9i8T2f . Am I doing something wrong? > How should I run the tests? > If I run nosetests on the numpydoc folder from PyPI, all the tests are > successful. I'm a bit stuck trying to make numpydoc Python 3 compatible. I made setup.py try to use distutils.command.build_py.build_py_2to3 in order to transform installed code automatically to Python 3. However, the tests (in tests folder) are not part of the package but rather package_data, so they won't get transformed. How can I automatically transform the tests too? Probably there is some easy and "right" solution to this, but I haven't been able to figure out a nice and simple solution.. Any ideas? Thanks. -Jaakko From matthew.brett at gmail.com Sun Jan 13 17:53:12 2013 From: matthew.brett at gmail.com (Matthew Brett) Date: Sun, 13 Jan 2013 22:53:12 +0000 Subject: [Numpy-discussion] numpydoc for python 3? In-Reply-To: <50F33959.7030203@aalto.fi> References: <50EDA996.6090806@aalto.fi> <50EED62B.9010105@aalto.fi> <50EEDB50.5000902@aalto.fi> <50F33959.7030203@aalto.fi> Message-ID: Hi, On Sun, Jan 13, 2013 at 10:46 PM, Jaakko Luttinen wrote: > On 2013-01-10 17:16, Jaakko Luttinen wrote: >> On 01/10/2013 05:04 PM, Pauli Virtanen wrote: >>> Jaakko Luttinen aalto.fi> writes: >>>> The files in numpy/doc/sphinxext/ and numpydoc/ (from PyPI) are a bit >>>> different. Which ones should be modified? >>> >>> The stuff in sphinxext/ is the development version of the package on >>> PyPi, so the changes should be made in sphinxext/ >>> >> >> Thanks! >> >> I'm trying to run the tests with Python 2 using nosetests, but I get >> some errors http://pastebin.com/Mp9i8T2f . Am I doing something wrong? >> How should I run the tests? >> If I run nosetests on the numpydoc folder from PyPI, all the tests are >> successful. > > I'm a bit stuck trying to make numpydoc Python 3 compatible. I made > setup.py try to use distutils.command.build_py.build_py_2to3 in order to > transform installed code automatically to Python 3. However, the tests > (in tests folder) are not part of the package but rather package_data, > so they won't get transformed. How can I automatically transform the > tests too? Probably there is some easy and "right" solution to this, but > I haven't been able to figure out a nice and simple solution.. Any > ideas? Thanks. Can you add tests as a package 'numpydoc.tests' and add an __init__.py file to the 'tests' directory? You might be able to get away without 2to3, using the kind of stuff that Pauli has used for scipy recently: https://github.com/scipy/scipy/pull/397 I'm happy to help over email or chat, just let me know. 
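A minimal setup.py sketch of that arrangement, with the package layout and names only illustrative: once the tests live in a numpydoc.tests sub-package listed under packages, build_py_2to3 converts them together with the rest of the code instead of copying them verbatim as package_data.

# setup.py sketch -- package layout and names are illustrative
import sys
from distutils.core import setup

if sys.version_info[0] >= 3:
    # build_py_2to3 runs 2to3 over every module listed in `packages`,
    # so the tests are converted along with the main code.
    from distutils.command.build_py import build_py_2to3 as build_py
else:
    from distutils.command.build_py import build_py

setup(
    name='numpydoc',
    version='0.0',                               # placeholder
    packages=['numpydoc', 'numpydoc.tests'],     # tests/ gains an __init__.py
    cmdclass={'build_py': build_py},
)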
Best, Matthew From efiring at hawaii.edu Sun Jan 13 18:02:01 2013 From: efiring at hawaii.edu (Eric Firing) Date: Sun, 13 Jan 2013 13:02:01 -1000 Subject: [Numpy-discussion] New numpy functions: filled, filled_like In-Reply-To: References: Message-ID: <50F33CE9.6020707@hawaii.edu> On 2013/01/13 7:27 AM, Nathaniel Smith wrote: > Hi all, > > PR 2875 adds two new functions, that generalize zeros(), ones(), > zeros_like(), ones_like(), by simply taking an arbitrary fill value: > https://github.com/numpy/numpy/pull/2875 > So > np.ones((10, 10)) > is the same as > np.filled((10, 10), 1) > > The implementations are trivial, but the API seems useful because it > provides an idiomatic way of efficiently creating an array full of > inf, or nan, or None, whatever funny value you need. All the > alternatives are either inefficient (np.ones(...) * np.inf) or > cumbersome (a = np.empty(...); a.fill(...)). Or so it seems to me. But > there's a question of taste here; one could argue instead that these > just add more clutter to the numpy namespace. So, before we merge, > anyone want to chime in? I'm neutral to negative as to whether it is worth adding these to the namespace; I don't mind using the "cumbersome" alternative. Note also that there is already a numpy.ma.filled() function for quite a different purpose, so putting a filled() in numpy breaks the pattern that ma has masked versions of most numpy functions. This consideration actually tips me quite a bit toward the negative side. I don't think I am unique in relying heavily on masked arrays. > > (Bonus, extra bike-sheddy survey: do people prefer > np.filled((10, 10), np.nan) > np.filled_like(my_arr, np.nan) +1 for this form if you decide to do it despite the problem mentioned above. > or > np.filled(np.nan, (10, 10)) > np.filled_like(np.nan, my_arr) This one is particularly bad for filled_like, therefore bad for both. Eric > ?) > > -n > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From robert.kern at gmail.com Sun Jan 13 18:24:48 2013 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 14 Jan 2013 00:24:48 +0100 Subject: [Numpy-discussion] New numpy functions: filled, filled_like In-Reply-To: References: Message-ID: On Sun, Jan 13, 2013 at 6:27 PM, Nathaniel Smith wrote: > Hi all, > > PR 2875 adds two new functions, that generalize zeros(), ones(), > zeros_like(), ones_like(), by simply taking an arbitrary fill value: > https://github.com/numpy/numpy/pull/2875 > So > np.ones((10, 10)) > is the same as > np.filled((10, 10), 1) > > The implementations are trivial, but the API seems useful because it > provides an idiomatic way of efficiently creating an array full of > inf, or nan, or None, whatever funny value you need. All the > alternatives are either inefficient (np.ones(...) * np.inf) or > cumbersome (a = np.empty(...); a.fill(...)). Or so it seems to me. But > there's a question of taste here; one could argue instead that these > just add more clutter to the numpy namespace. So, before we merge, > anyone want to chime in? 
One alternative that does not expand the API with two-liners is to let the ndarray.fill() method return self: a = np.empty(...).fill(20.0) -- Robert Kern From njs at pobox.com Sun Jan 13 18:26:43 2013 From: njs at pobox.com (Nathaniel Smith) Date: Sun, 13 Jan 2013 23:26:43 +0000 Subject: [Numpy-discussion] 1.8 release In-Reply-To: References: Message-ID: On Sun, Jan 13, 2013 at 7:03 PM, Charles R Harris wrote: > Now that 1.7 is nearing release, it's time to look forward to the 1.8 > release. I'd like us to get back to the twice yearly schedule that we tried > to maintain through the 1.3 - 1.6 releases, so I propose a June release as a > goal. Call it the Spring Cleaning release. As to content, I'd like to see > the following. > > Removal of Python 2.4-2.5 support. > Removal of SCons support. > The index work consolidated. > Initial stab at removing the need for 2to3. See Pauli's PR for scipy. > Miscellaneous enhancements and fixes. I'd actually like to propose a faster release cycle than this, even. Perhaps 3 months between releases; 2 months from release n to the first beta of n+1? The consequences would be: * Changes get out to users faster. * Each release is smaller, so it's easier for downstream projects to adjust to each release -- instead of having this giant pile of changes to work through all at once every 6-12 months * End-users are less scared of updating, because the changes aren't so overwhelming, so they end up actually testing (and getting to take advantage of) the new stuff more. * We get feedback more quickly, so we can fix up whatever we break while we still know what we did. * And for larger changes, if we release them incrementally, we can get feedback before we've gone miles down the wrong path. * Releases come out on time more often -- sort of paradoxical, but with small, frequent releases, beta cycles go smoother, and it's easier to say "don't worry, I'll get it ready for next time", or "right, that patch was less done than we thought, let's take it out for now" (also this is much easier if we don't have another years worth of changes committed on top of the patch!). * If your schedule does slip, then you still end up with a <6 month release cycle. 1.6.x was branched from master in March 2011 and released in May 2011. 1.7.x was branched from master in July 2012 and still isn't out. But at least we've finally found and fixed the second to last bug! Wouldn't it be nice to have a 2-4 week beta cycle that only found trivial and expected problems? We *already* have 6 months worth of feature work in master that won't be in the *next* release. Note 1: if we do do this, then we'll also want to rethink the deprecation cycle a bit -- right now we've sort of vaguely been saying "well, we'll deprecate it in release n and take it out in n+1. Whenever that is". 3 months definitely isn't long enough for a deprecation period, so if we do do this then we'll want to deprecate things for multiple releases before actually removing them. Details to be determined. Note 2: in this kind of release schedule, you definitely don't want to say "here are the features that will be in the next release!", because then you end up slipping and sliding all over the place. Instead you say "here are some things that I want to work on next, and we'll see which release they end up in". Since we're already following the rule that nothing goes into master until it's done and tested and ready for release anyway, this doesn't really change much. Thoughts? 
-n From matthew.brett at gmail.com Sun Jan 13 18:28:05 2013 From: matthew.brett at gmail.com (Matthew Brett) Date: Sun, 13 Jan 2013 23:28:05 +0000 Subject: [Numpy-discussion] New numpy functions: filled, filled_like In-Reply-To: References: Message-ID: On Sun, Jan 13, 2013 at 11:24 PM, Robert Kern wrote: > On Sun, Jan 13, 2013 at 6:27 PM, Nathaniel Smith wrote: >> Hi all, >> >> PR 2875 adds two new functions, that generalize zeros(), ones(), >> zeros_like(), ones_like(), by simply taking an arbitrary fill value: >> https://github.com/numpy/numpy/pull/2875 >> So >> np.ones((10, 10)) >> is the same as >> np.filled((10, 10), 1) >> >> The implementations are trivial, but the API seems useful because it >> provides an idiomatic way of efficiently creating an array full of >> inf, or nan, or None, whatever funny value you need. All the >> alternatives are either inefficient (np.ones(...) * np.inf) or >> cumbersome (a = np.empty(...); a.fill(...)). Or so it seems to me. But >> there's a question of taste here; one could argue instead that these >> just add more clutter to the numpy namespace. So, before we merge, >> anyone want to chime in? > > One alternative that does not expand the API with two-liners is to let > the ndarray.fill() method return self: > > a = np.empty(...).fill(20.0) Nice. Matthew From njs at pobox.com Sun Jan 13 18:39:09 2013 From: njs at pobox.com (Nathaniel Smith) Date: Sun, 13 Jan 2013 23:39:09 +0000 Subject: [Numpy-discussion] New numpy functions: filled, filled_like In-Reply-To: References: Message-ID: On Sun, Jan 13, 2013 at 11:24 PM, Robert Kern wrote: > On Sun, Jan 13, 2013 at 6:27 PM, Nathaniel Smith wrote: >> Hi all, >> >> PR 2875 adds two new functions, that generalize zeros(), ones(), >> zeros_like(), ones_like(), by simply taking an arbitrary fill value: >> https://github.com/numpy/numpy/pull/2875 >> So >> np.ones((10, 10)) >> is the same as >> np.filled((10, 10), 1) >> >> The implementations are trivial, but the API seems useful because it >> provides an idiomatic way of efficiently creating an array full of >> inf, or nan, or None, whatever funny value you need. All the >> alternatives are either inefficient (np.ones(...) * np.inf) or >> cumbersome (a = np.empty(...); a.fill(...)). Or so it seems to me. But >> there's a question of taste here; one could argue instead that these >> just add more clutter to the numpy namespace. So, before we merge, >> anyone want to chime in? > > One alternative that does not expand the API with two-liners is to let > the ndarray.fill() method return self: > > a = np.empty(...).fill(20.0) This violates the convention that in-place operations never return self, to avoid confusion with out-of-place operations. E.g. ndarray.resize() versus ndarray.reshape(), ndarray.sort() versus np.sort(), and in the broader Python world, list.sort() versus sorted(), list.reverse() versus reversed(). (This was an explicit reason given for list.sort to not return self, even.) Maybe enabling this idiom is a good enough reason to break the convention ("Special cases aren't special enough to break the rules. / Although practicality beats purity"), but it at least makes me -0 on this... (The nice thing about np.filled() is that it makes np.zeros() and np.ones() feel like clutter, rather than the reverse... not that I'm suggesting ever getting rid of them, but it makes the API conceptually feel smaller, not larger.) 
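For reference, a minimal sketch of what the proposed pair boils down to (the actual PR may differ in signature details such as dtype and order handling):

import numpy as np

def filled(shape, fill_value, dtype=None):
    # allocate without initializing, then fill in a single pass
    a = np.empty(shape, dtype=dtype)
    a.fill(fill_value)
    return a

def filled_like(arr, fill_value, dtype=None):
    a = np.empty_like(arr, dtype=dtype)
    a.fill(fill_value)
    return a

# zeros() and ones() then read as special cases:
assert np.array_equal(filled((10, 10), 1), np.ones((10, 10)))
assert np.isnan(filled((3, 3), np.nan)).all()

The helpers are deliberately thin wrappers around empty() plus ndarray.fill(), which is why the thread is mostly about naming and namespace rather than implementation.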
-n From jsseabold at gmail.com Sun Jan 13 18:48:10 2013 From: jsseabold at gmail.com (Skipper Seabold) Date: Sun, 13 Jan 2013 18:48:10 -0500 Subject: [Numpy-discussion] New numpy functions: filled, filled_like In-Reply-To: References: Message-ID: On Sun, Jan 13, 2013 at 6:39 PM, Nathaniel Smith wrote: > On Sun, Jan 13, 2013 at 11:24 PM, Robert Kern > wrote: > > On Sun, Jan 13, 2013 at 6:27 PM, Nathaniel Smith wrote: > >> Hi all, > >> > >> PR 2875 adds two new functions, that generalize zeros(), ones(), > >> zeros_like(), ones_like(), by simply taking an arbitrary fill value: > >> https://github.com/numpy/numpy/pull/2875 > >> So > >> np.ones((10, 10)) > >> is the same as > >> np.filled((10, 10), 1) > >> > >> The implementations are trivial, but the API seems useful because it > >> provides an idiomatic way of efficiently creating an array full of > >> inf, or nan, or None, whatever funny value you need. All the > >> alternatives are either inefficient (np.ones(...) * np.inf) or > >> cumbersome (a = np.empty(...); a.fill(...)). Or so it seems to me. But > >> there's a question of taste here; one could argue instead that these > >> just add more clutter to the numpy namespace. So, before we merge, > >> anyone want to chime in? > > > > One alternative that does not expand the API with two-liners is to let > > the ndarray.fill() method return self: > > > > a = np.empty(...).fill(20.0) > > This violates the convention that in-place operations never return > self, to avoid confusion with out-of-place operations. E.g. > ndarray.resize() versus ndarray.reshape(), ndarray.sort() versus > np.sort(), and in the broader Python world, list.sort() versus > sorted(), list.reverse() versus reversed(). (This was an explicit > reason given for list.sort to not return self, even.) > > Maybe enabling this idiom is a good enough reason to break the > convention ("Special cases aren't special enough to break the rules. / > Although practicality beats purity"), but it at least makes me -0 on > this... > > I tend to agree with the notion that inplace operations shouldn't return self, but I don't know if it's just because I've been conditioned this way. Not returning self breaks the fluid interface pattern [1], as noted in a similar discussion on pandas [2], FWIW, though there's likely some way to have both worlds. Skipper [1] https://en.wikipedia.org/wiki/Fluent_interface [2] https://github.com/pydata/pandas/issues/1893 -------------- next part -------------- An HTML attachment was scrubbed... URL: From alan.isaac at gmail.com Sun Jan 13 18:54:50 2013 From: alan.isaac at gmail.com (Alan G Isaac) Date: Sun, 13 Jan 2013 18:54:50 -0500 Subject: [Numpy-discussion] New numpy functions: filled, filled_like In-Reply-To: References: Message-ID: <50F3494A.7000602@gmail.com> > On Sun, Jan 13, 2013 at 11:24 PM, Robert Kern wrote: >> One alternative that does not expand the API with two-liners is to let >> the ndarray.fill() method return self: >> >> a = np.empty(...).fill(20.0) > On 1/13/2013 6:39 PM, Nathaniel Smith wrote: > This violates the convention that in-place operations never return > self, to avoid confusion with out-of-place operations. Strongly agree. It is not worth a violation to save two keystrokes: "\na". (Three or four for a longer name, given name completion.) 
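For context on the alternative being debated, a small sketch of current behaviour: ndarray.fill() operates in place and returns None, so the one-liner only becomes possible if that return value is changed.

import numpy as np

a = np.empty((2, 2)).fill(20.0)
print(a)         # prints None: fill() works in place and returns nothing

b = np.empty((2, 2))
b.fill(20.0)     # the two-step spelling that works today
print(b)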
Alan Isaac From njs at pobox.com Sun Jan 13 19:04:59 2013 From: njs at pobox.com (Nathaniel Smith) Date: Mon, 14 Jan 2013 00:04:59 +0000 Subject: [Numpy-discussion] New numpy functions: filled, filled_like In-Reply-To: References: Message-ID: On Sun, Jan 13, 2013 at 11:48 PM, Skipper Seabold wrote: > On Sun, Jan 13, 2013 at 6:39 PM, Nathaniel Smith wrote: >> >> On Sun, Jan 13, 2013 at 11:24 PM, Robert Kern >> wrote: >> > On Sun, Jan 13, 2013 at 6:27 PM, Nathaniel Smith wrote: >> >> Hi all, >> >> >> >> PR 2875 adds two new functions, that generalize zeros(), ones(), >> >> zeros_like(), ones_like(), by simply taking an arbitrary fill value: >> >> https://github.com/numpy/numpy/pull/2875 >> >> So >> >> np.ones((10, 10)) >> >> is the same as >> >> np.filled((10, 10), 1) >> >> >> >> The implementations are trivial, but the API seems useful because it >> >> provides an idiomatic way of efficiently creating an array full of >> >> inf, or nan, or None, whatever funny value you need. All the >> >> alternatives are either inefficient (np.ones(...) * np.inf) or >> >> cumbersome (a = np.empty(...); a.fill(...)). Or so it seems to me. But >> >> there's a question of taste here; one could argue instead that these >> >> just add more clutter to the numpy namespace. So, before we merge, >> >> anyone want to chime in? >> > >> > One alternative that does not expand the API with two-liners is to let >> > the ndarray.fill() method return self: >> > >> > a = np.empty(...).fill(20.0) >> >> This violates the convention that in-place operations never return >> self, to avoid confusion with out-of-place operations. E.g. >> ndarray.resize() versus ndarray.reshape(), ndarray.sort() versus >> np.sort(), and in the broader Python world, list.sort() versus >> sorted(), list.reverse() versus reversed(). (This was an explicit >> reason given for list.sort to not return self, even.) >> >> Maybe enabling this idiom is a good enough reason to break the >> convention ("Special cases aren't special enough to break the rules. / >> Although practicality beats purity"), but it at least makes me -0 on >> this... >> > > I tend to agree with the notion that inplace operations shouldn't return > self, but I don't know if it's just because I've been conditioned this way. > Not returning self breaks the fluid interface pattern [1], as noted in a > similar discussion on pandas [2], FWIW, though there's likely some way to > have both worlds. Ah-hah, here's the email where Guide officially proclaims that there shall be no "fluent interface" nonsense applied to in-place operators in Python, because it hurts readability (at least for Dutch people ;-)): http://mail.python.org/pipermail/python-dev/2003-October/038855.html -n From charlesr.harris at gmail.com Sun Jan 13 19:14:27 2013 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sun, 13 Jan 2013 17:14:27 -0700 Subject: [Numpy-discussion] 1.8 release In-Reply-To: References: Message-ID: On Sun, Jan 13, 2013 at 4:26 PM, Nathaniel Smith wrote: > On Sun, Jan 13, 2013 at 7:03 PM, Charles R Harris > wrote: > > Now that 1.7 is nearing release, it's time to look forward to the 1.8 > > release. I'd like us to get back to the twice yearly schedule that we > tried > > to maintain through the 1.3 - 1.6 releases, so I propose a June release > as a > > goal. Call it the Spring Cleaning release. As to content, I'd like to see > > the following. > > > > Removal of Python 2.4-2.5 support. > > Removal of SCons support. > > The index work consolidated. 
> > Initial stab at removing the need for 2to3. See Pauli's PR for scipy. > > Miscellaneous enhancements and fixes. > > I'd actually like to propose a faster release cycle than this, even. > Perhaps 3 months between releases; 2 months from release n to the > first beta of n+1? > > The consequences would be: > * Changes get out to users faster. > * Each release is smaller, so it's easier for downstream projects to > adjust to each release -- instead of having this giant pile of changes > to work through all at once every 6-12 months > * End-users are less scared of updating, because the changes aren't so > overwhelming, so they end up actually testing (and getting to take > advantage of) the new stuff more. > * We get feedback more quickly, so we can fix up whatever we break > while we still know what we did. > * And for larger changes, if we release them incrementally, we can get > feedback before we've gone miles down the wrong path. > * Releases come out on time more often -- sort of paradoxical, but > with small, frequent releases, beta cycles go smoother, and it's > easier to say "don't worry, I'll get it ready for next time", or > "right, that patch was less done than we thought, let's take it out > for now" (also this is much easier if we don't have another years > worth of changes committed on top of the patch!). > * If your schedule does slip, then you still end up with a <6 month > release cycle. > > 1.6.x was branched from master in March 2011 and released in May 2011. > 1.7.x was branched from master in July 2012 and still isn't out. But > at least we've finally found and fixed the second to last bug! > > Actually, the first branch was late Dec 2011, IIRC, maybe Feb 2012. We've had about a year delay and I'm not convinced it was worth it. > Wouldn't it be nice to have a 2-4 week beta cycle that only found > trivial and expected problems? We *already* have 6 months worth of > feature work in master that won't be in the *next* release. > > Note 1: if we do do this, then we'll also want to rethink the > deprecation cycle a bit -- right now we've sort of vaguely been saying > "well, we'll deprecate it in release n and take it out in n+1. > Whenever that is". 3 months definitely isn't long enough for a > deprecation period, so if we do do this then we'll want to deprecate > things for multiple releases before actually removing them. Details to > be determined. > > Deprecations should probably be time based. > Note 2: in this kind of release schedule, you definitely don't want to > say "here are the features that will be in the next release!", because > then you end up slipping and sliding all over the place. Instead you > say "here are some things that I want to work on next, and we'll see > which release they end up in". Since we're already following the rule > that nothing goes into master until it's done and tested and ready for > release anyway, this doesn't really change much. > > Thoughts? > > I think three months is a bit short. Much will depend on the release manager and I not sure what Andrej's plans are. I'd happily nominate you for that role ;) Chuck -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From cournape at gmail.com Sun Jan 13 19:19:59 2013 From: cournape at gmail.com (David Cournapeau) Date: Sun, 13 Jan 2013 18:19:59 -0600 Subject: [Numpy-discussion] 1.8 release In-Reply-To: References: Message-ID: On Sun, Jan 13, 2013 at 5:26 PM, Nathaniel Smith wrote: > On Sun, Jan 13, 2013 at 7:03 PM, Charles R Harris > wrote: >> Now that 1.7 is nearing release, it's time to look forward to the 1.8 >> release. I'd like us to get back to the twice yearly schedule that we tried >> to maintain through the 1.3 - 1.6 releases, so I propose a June release as a >> goal. Call it the Spring Cleaning release. As to content, I'd like to see >> the following. >> >> Removal of Python 2.4-2.5 support. >> Removal of SCons support. >> The index work consolidated. >> Initial stab at removing the need for 2to3. See Pauli's PR for scipy. >> Miscellaneous enhancements and fixes. > > I'd actually like to propose a faster release cycle than this, even. > Perhaps 3 months between releases; 2 months from release n to the > first beta of n+1? > > The consequences would be: > * Changes get out to users faster. > * Each release is smaller, so it's easier for downstream projects to > adjust to each release -- instead of having this giant pile of changes > to work through all at once every 6-12 months > * End-users are less scared of updating, because the changes aren't so > overwhelming, so they end up actually testing (and getting to take > advantage of) the new stuff more. > * We get feedback more quickly, so we can fix up whatever we break > while we still know what we did. > * And for larger changes, if we release them incrementally, we can get > feedback before we've gone miles down the wrong path. > * Releases come out on time more often -- sort of paradoxical, but > with small, frequent releases, beta cycles go smoother, and it's > easier to say "don't worry, I'll get it ready for next time", or > "right, that patch was less done than we thought, let's take it out > for now" (also this is much easier if we don't have another years > worth of changes committed on top of the patch!). > * If your schedule does slip, then you still end up with a <6 month > release cycle. > > 1.6.x was branched from master in March 2011 and released in May 2011. > 1.7.x was branched from master in July 2012 and still isn't out. But > at least we've finally found and fixed the second to last bug! > > Wouldn't it be nice to have a 2-4 week beta cycle that only found > trivial and expected problems? We *already* have 6 months worth of > feature work in master that won't be in the *next* release. > > Note 1: if we do do this, then we'll also want to rethink the > deprecation cycle a bit -- right now we've sort of vaguely been saying > "well, we'll deprecate it in release n and take it out in n+1. > Whenever that is". 3 months definitely isn't long enough for a > deprecation period, so if we do do this then we'll want to deprecate > things for multiple releases before actually removing them. Details to > be determined. > > Note 2: in this kind of release schedule, you definitely don't want to > say "here are the features that will be in the next release!", because > then you end up slipping and sliding all over the place. Instead you > say "here are some things that I want to work on next, and we'll see > which release they end up in". Since we're already following the rule > that nothing goes into master until it's done and tested and ready for > release anyway, this doesn't really change much. > > Thoughts? 
Hey, my time to have a time-machine: http://mail.scipy.org/pipermail/numpy-discussion/2008-May/033754.html I still think it is a good idea :) cheers, David From nadavh at visionsense.com Mon Jan 14 01:21:36 2013 From: nadavh at visionsense.com (Nadav Horesh) Date: Mon, 14 Jan 2013 06:21:36 +0000 Subject: [Numpy-discussion] phase unwrapping (1d) In-Reply-To: References: Message-ID: There is an unwrap function in numpy. Doesn't it work for you? Nadav ________________________________________ From: numpy-discussion-bounces at scipy.org [numpy-discussion-bounces at scipy.org] on behalf of Neal Becker [ndbecker2 at gmail.com] Sent: 11 January 2013 17:40 To: numpy-discussion at scipy.org Subject: [Numpy-discussion] phase unwrapping (1d) np.unwrap was too slow, so I rolled by own (in c++). I wanted to be able to handle the case of unwrap (arg (x1) + arg (x2)) Here, phase can change by more than 2pi. I came up with the following algorithm, any thoughts? In the following, y is normally set to pi. o points to output i points to input nint1 finds nearest integer value_t prev_o = init; for (; i != e; ++i, ++o) { *o = cnt * 2 * y + *i; value_t delta = *o - prev_o; if (delta / y > 1 or delta / y < -1) { int i = nint1 (delta / (2*y)); *o -= 2*y*i; cnt -= i; } prev_o = *o; } _______________________________________________ NumPy-Discussion mailing list NumPy-Discussion at scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion From robert.kern at gmail.com Mon Jan 14 01:59:24 2013 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 14 Jan 2013 07:59:24 +0100 Subject: [Numpy-discussion] New numpy functions: filled, filled_like In-Reply-To: References: Message-ID: On Mon, Jan 14, 2013 at 1:04 AM, Nathaniel Smith wrote: > On Sun, Jan 13, 2013 at 11:48 PM, Skipper Seabold wrote: >> On Sun, Jan 13, 2013 at 6:39 PM, Nathaniel Smith wrote: >>> >>> On Sun, Jan 13, 2013 at 11:24 PM, Robert Kern >>> wrote: >>> > On Sun, Jan 13, 2013 at 6:27 PM, Nathaniel Smith wrote: >>> >> Hi all, >>> >> >>> >> PR 2875 adds two new functions, that generalize zeros(), ones(), >>> >> zeros_like(), ones_like(), by simply taking an arbitrary fill value: >>> >> https://github.com/numpy/numpy/pull/2875 >>> >> So >>> >> np.ones((10, 10)) >>> >> is the same as >>> >> np.filled((10, 10), 1) >>> >> >>> >> The implementations are trivial, but the API seems useful because it >>> >> provides an idiomatic way of efficiently creating an array full of >>> >> inf, or nan, or None, whatever funny value you need. All the >>> >> alternatives are either inefficient (np.ones(...) * np.inf) or >>> >> cumbersome (a = np.empty(...); a.fill(...)). Or so it seems to me. But >>> >> there's a question of taste here; one could argue instead that these >>> >> just add more clutter to the numpy namespace. So, before we merge, >>> >> anyone want to chime in? >>> > >>> > One alternative that does not expand the API with two-liners is to let >>> > the ndarray.fill() method return self: >>> > >>> > a = np.empty(...).fill(20.0) >>> >>> This violates the convention that in-place operations never return >>> self, to avoid confusion with out-of-place operations. E.g. >>> ndarray.resize() versus ndarray.reshape(), ndarray.sort() versus >>> np.sort(), and in the broader Python world, list.sort() versus >>> sorted(), list.reverse() versus reversed(). (This was an explicit >>> reason given for list.sort to not return self, even.) 
>>> >>> Maybe enabling this idiom is a good enough reason to break the >>> convention ("Special cases aren't special enough to break the rules. / >>> Although practicality beats purity"), but it at least makes me -0 on >>> this... >>> >> >> I tend to agree with the notion that inplace operations shouldn't return >> self, but I don't know if it's just because I've been conditioned this way. >> Not returning self breaks the fluid interface pattern [1], as noted in a >> similar discussion on pandas [2], FWIW, though there's likely some way to >> have both worlds. > > Ah-hah, here's the email where Guide officially proclaims that there > shall be no "fluent interface" nonsense applied to in-place operators > in Python, because it hurts readability (at least for Dutch people > ;-)): > http://mail.python.org/pipermail/python-dev/2003-October/038855.html That's a statement about the policy for the stdlib, and just one person's opinion. You, and numpy, are permitted to have a different opinion. In any case, I'm not strongly advocating for it. It's violation of principle ("no fluent interfaces") is roughly in the same ballpark as np.filled() ("not every two-liner needs its own function"), so I thought I would toss it out there for consideration. -- Robert Kern From dave.hirschfeld at gmail.com Mon Jan 14 04:02:36 2013 From: dave.hirschfeld at gmail.com (Dave Hirschfeld) Date: Mon, 14 Jan 2013 09:02:36 +0000 (UTC) Subject: [Numpy-discussion] =?utf-8?q?New_numpy_functions=3A_filled=2C_fil?= =?utf-8?q?led=5Flike?= References: Message-ID: Robert Kern gmail.com> writes: > > >>> > > >>> > One alternative that does not expand the API with two-liners is to let > >>> > the ndarray.fill() method return self: > >>> > > >>> > a = np.empty(...).fill(20.0) > >>> > >>> This violates the convention that in-place operations never return > >>> self, to avoid confusion with out-of-place operations. E.g. > >>> ndarray.resize() versus ndarray.reshape(), ndarray.sort() versus > >>> np.sort(), and in the broader Python world, list.sort() versus > >>> sorted(), list.reverse() versus reversed(). (This was an explicit > >>> reason given for list.sort to not return self, even.) > >>> > >>> Maybe enabling this idiom is a good enough reason to break the > >>> convention ("Special cases aren't special enough to break the rules. / > >>> Although practicality beats purity"), but it at least makes me -0 on > >>> this... > >>> > >> > >> I tend to agree with the notion that inplace operations shouldn't return > >> self, but I don't know if it's just because I've been conditioned this way. > >> Not returning self breaks the fluid interface pattern [1], as noted in a > >> similar discussion on pandas [2], FWIW, though there's likely some way to > >> have both worlds. > > > > Ah-hah, here's the email where Guide officially proclaims that there > > shall be no "fluent interface" nonsense applied to in-place operators > > in Python, because it hurts readability (at least for Dutch people > > ): > > http://mail.python.org/pipermail/python-dev/2003-October/038855.html > > That's a statement about the policy for the stdlib, and just one > person's opinion. You, and numpy, are permitted to have a different > opinion. > > In any case, I'm not strongly advocating for it. It's violation of > principle ("no fluent interfaces") is roughly in the same ballpark as > np.filled() ("not every two-liner needs its own function"), so I > thought I would toss it out there for consideration. > > -- > Robert Kern > FWIW I'm +1 on the idea. 
Perhaps because I just don't see many practical downsides to breaking the convention but I regularly see a big issue with there being no way to instantiate an array with a particular value. The one obvious way to do it is use ones and multiply by the value you want. I work with a lot of inexperienced programmers and I see this idiom all the time. It takes a fair amount of numpy knowledge to know that you should do it in two lines by using empty and setting a slice. In [1]: %timeit NaN*ones(10000) 1000 loops, best of 3: 1.74 ms per loop In [2]: %%timeit ...: x = empty(10000, dtype=float) ...: x[:] = NaN ...: 10000 loops, best of 3: 28 us per loop In [3]: 1.74e-3/28e-6 Out[3]: 62.142857142857146 Even when not in the mythical "tight loop" setting an array to one and then multiplying uses up a lot of cycles - it's nearly 2 orders of magnitude slower than what we know they *should* be doing. I'm agnostic as to whether fill should be modified or new functions provided but I think numpy is currently missing this functionality and that providing it would save a lot of new users from shooting themselves in the foot performance- wise. -Dave From jaakko.luttinen at aalto.fi Mon Jan 14 05:35:43 2013 From: jaakko.luttinen at aalto.fi (Jaakko Luttinen) Date: Mon, 14 Jan 2013 12:35:43 +0200 Subject: [Numpy-discussion] numpydoc for python 3? In-Reply-To: References: <50EDA996.6090806@aalto.fi> <50EED62B.9010105@aalto.fi> <50EEDB50.5000902@aalto.fi> <50F33959.7030203@aalto.fi> Message-ID: <50F3DF7F.3060600@aalto.fi> On 01/14/2013 12:53 AM, Matthew Brett wrote: > On Sun, Jan 13, 2013 at 10:46 PM, Jaakko Luttinen > wrote: >> I'm a bit stuck trying to make numpydoc Python 3 compatible. I made >> setup.py try to use distutils.command.build_py.build_py_2to3 in order to >> transform installed code automatically to Python 3. However, the tests >> (in tests folder) are not part of the package but rather package_data, >> so they won't get transformed. How can I automatically transform the >> tests too? Probably there is some easy and "right" solution to this, but >> I haven't been able to figure out a nice and simple solution.. Any >> ideas? Thanks. > > Can you add tests as a package 'numpydoc.tests' and add an __init__.py > file to the 'tests' directory? I thought there is some reason why the 'tests' directory is not added as a package 'numpydoc.tests', so I didn't want to take that route. > You might be able to get away without 2to3, using the kind of stuff > that Pauli has used for scipy recently: > > https://github.com/scipy/scipy/pull/397 Ok, thanks, maybe I'll try to make the tests valid in all Python versions. It seems there's only one line which I'm not able to transform. In doc/sphinxext/tests/test_docscrape.py, on line 559: assert doc['Summary'][0] == u'?????????????'.encode('utf-8') This is invalid in Python 3.0-3.2. How could I write this in such a way that it is valid in all Python versions? I'm a bit lost with these unicode encodings in Python (and in general).. And I didn't want to add dependency on 'six' package. Regards, Jaakko From pierre.haessig at crans.org Mon Jan 14 06:04:57 2013 From: pierre.haessig at crans.org (Pierre Haessig) Date: Mon, 14 Jan 2013 12:04:57 +0100 Subject: [Numpy-discussion] numpydoc for python 3? 
In-Reply-To: <50F3DF7F.3060600@aalto.fi> References: <50EDA996.6090806@aalto.fi> <50EED62B.9010105@aalto.fi> <50EEDB50.5000902@aalto.fi> <50F33959.7030203@aalto.fi> <50F3DF7F.3060600@aalto.fi> Message-ID: <50F3E659.9070402@crans.org> Hi, Le 14/01/2013 11:35, Jaakko Luttinen a ?crit : > Ok, thanks, maybe I'll try to make the tests valid in all Python > versions. It seems there's only one line which I'm not able to transform. > > In doc/sphinxext/tests/test_docscrape.py, on line 559: > assert doc['Summary'][0] == u'?????????????'.encode('utf-8') > > This is invalid in Python 3.0-3.2. How could I write this in such a way > that it is valid in all Python versions? I'm a bit lost with these > unicode encodings in Python (and in general).. And I didn't want to add > dependency on 'six' package. Just as a side note about Python and encodings, I found great help in watching (by chance) the PyCon 2012 presentation "Pragmatic Unicode or How do I stop the Pain ?" by Ned Batchelder : http://nedbatchelder.com/text/unipain.html Now, if I understand the problem correctly, the u'xxx' syntax was reintroduced in Python 3.3 specifically to enhance the 2to3 compatibility (http://docs.python.org/3/whatsnew/3.3.html#pep-414-explicit-unicode-literals). Maybe the question is then whether it's worth supporting Python 3.0-3.2 or not ? Also, one possible rewrite of the test could be to replace the unicode string with the corresponding utf8-encoded bytes : assert doc['Summary'][0] == b'\xc3\xb6\xc3\xa4\xc3\xb6\xc3\xa4\xc3\xb6\xc3\xa4\xc3\xb6\xc3\xa4\xc3\xb6\xc3\xa5\xc3\xa5\xc3\xa5\xc3\xa5' # output of '?????????????'.encode('utf-8') (One restriction : I think the b'' prefix was introduced in Python 2.6) I'm not sure for the readability though... Best, Pierre -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 900 bytes Desc: OpenPGP digital signature URL: From pelson.pub at gmail.com Mon Jan 14 06:59:28 2013 From: pelson.pub at gmail.com (Phil Elson) Date: Mon, 14 Jan 2013 11:59:28 +0000 Subject: [Numpy-discussion] 1.8 release In-Reply-To: References: Message-ID: I tried to suggest this for our matplotlib development cycle, but it didn't get the roaring response I was hoping for (even though I was being conservative by suggesting a 8-9 month release time): http://matplotlib.1069221.n5.nabble.com/strategy-for-1-2-x-master-PEP8-changes-tp39453p39465.html In essence, I think there is a lot of benefit in getting releases out quicker. The biggest downside, IMHO, is that those who package the binary releases have to work more frequently on what is not a particularly glamorous task. For those who are worried about the quality of releases being diminished by releasing more frequently, an LTS approach could also work. Good luck on getting these frequent releases going, IMHO there is a lot to be said for having users on the latest and greatest, rather than have users on old versions & still finding bug which were introduced 24 months ago and fixed 12 months ago on master... Cheers, Phil On 14 January 2013 00:19, David Cournapeau wrote: > On Sun, Jan 13, 2013 at 5:26 PM, Nathaniel Smith wrote: > > On Sun, Jan 13, 2013 at 7:03 PM, Charles R Harris > > wrote: > >> Now that 1.7 is nearing release, it's time to look forward to the 1.8 > >> release. I'd like us to get back to the twice yearly schedule that we > tried > >> to maintain through the 1.3 - 1.6 releases, so I propose a June release > as a > >> goal. Call it the Spring Cleaning release. 
As to content, I'd like to > see > >> the following. > >> > >> Removal of Python 2.4-2.5 support. > >> Removal of SCons support. > >> The index work consolidated. > >> Initial stab at removing the need for 2to3. See Pauli's PR for scipy. > >> Miscellaneous enhancements and fixes. > > > > I'd actually like to propose a faster release cycle than this, even. > > Perhaps 3 months between releases; 2 months from release n to the > > first beta of n+1? > > > > The consequences would be: > > * Changes get out to users faster. > > * Each release is smaller, so it's easier for downstream projects to > > adjust to each release -- instead of having this giant pile of changes > > to work through all at once every 6-12 months > > * End-users are less scared of updating, because the changes aren't so > > overwhelming, so they end up actually testing (and getting to take > > advantage of) the new stuff more. > > * We get feedback more quickly, so we can fix up whatever we break > > while we still know what we did. > > * And for larger changes, if we release them incrementally, we can get > > feedback before we've gone miles down the wrong path. > > * Releases come out on time more often -- sort of paradoxical, but > > with small, frequent releases, beta cycles go smoother, and it's > > easier to say "don't worry, I'll get it ready for next time", or > > "right, that patch was less done than we thought, let's take it out > > for now" (also this is much easier if we don't have another years > > worth of changes committed on top of the patch!). > > * If your schedule does slip, then you still end up with a <6 month > > release cycle. > > > > 1.6.x was branched from master in March 2011 and released in May 2011. > > 1.7.x was branched from master in July 2012 and still isn't out. But > > at least we've finally found and fixed the second to last bug! > > > > Wouldn't it be nice to have a 2-4 week beta cycle that only found > > trivial and expected problems? We *already* have 6 months worth of > > feature work in master that won't be in the *next* release. > > > > Note 1: if we do do this, then we'll also want to rethink the > > deprecation cycle a bit -- right now we've sort of vaguely been saying > > "well, we'll deprecate it in release n and take it out in n+1. > > Whenever that is". 3 months definitely isn't long enough for a > > deprecation period, so if we do do this then we'll want to deprecate > > things for multiple releases before actually removing them. Details to > > be determined. > > > > Note 2: in this kind of release schedule, you definitely don't want to > > say "here are the features that will be in the next release!", because > > then you end up slipping and sliding all over the place. Instead you > > say "here are some things that I want to work on next, and we'll see > > which release they end up in". Since we're already following the rule > > that nothing goes into master until it's done and tested and ready for > > release anyway, this doesn't really change much. > > > > Thoughts? > > Hey, my time to have a time-machine: > http://mail.scipy.org/pipermail/numpy-discussion/2008-May/033754.html > > I still think it is a good idea :) > > cheers, > David > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From matthew.brett at gmail.com Mon Jan 14 07:18:49 2013 From: matthew.brett at gmail.com (Matthew Brett) Date: Mon, 14 Jan 2013 12:18:49 +0000 Subject: [Numpy-discussion] 1.8 release In-Reply-To: References: Message-ID: Hi, On Mon, Jan 14, 2013 at 12:19 AM, David Cournapeau wrote: > On Sun, Jan 13, 2013 at 5:26 PM, Nathaniel Smith wrote: >> On Sun, Jan 13, 2013 at 7:03 PM, Charles R Harris >> wrote: >>> Now that 1.7 is nearing release, it's time to look forward to the 1.8 >>> release. I'd like us to get back to the twice yearly schedule that we tried >>> to maintain through the 1.3 - 1.6 releases, so I propose a June release as a >>> goal. Call it the Spring Cleaning release. As to content, I'd like to see >>> the following. >>> >>> Removal of Python 2.4-2.5 support. >>> Removal of SCons support. >>> The index work consolidated. >>> Initial stab at removing the need for 2to3. See Pauli's PR for scipy. >>> Miscellaneous enhancements and fixes. >> >> I'd actually like to propose a faster release cycle than this, even. >> Perhaps 3 months between releases; 2 months from release n to the >> first beta of n+1? >> >> The consequences would be: >> * Changes get out to users faster. >> * Each release is smaller, so it's easier for downstream projects to >> adjust to each release -- instead of having this giant pile of changes >> to work through all at once every 6-12 months >> * End-users are less scared of updating, because the changes aren't so >> overwhelming, so they end up actually testing (and getting to take >> advantage of) the new stuff more. >> * We get feedback more quickly, so we can fix up whatever we break >> while we still know what we did. >> * And for larger changes, if we release them incrementally, we can get >> feedback before we've gone miles down the wrong path. >> * Releases come out on time more often -- sort of paradoxical, but >> with small, frequent releases, beta cycles go smoother, and it's >> easier to say "don't worry, I'll get it ready for next time", or >> "right, that patch was less done than we thought, let's take it out >> for now" (also this is much easier if we don't have another years >> worth of changes committed on top of the patch!). >> * If your schedule does slip, then you still end up with a <6 month >> release cycle. >> >> 1.6.x was branched from master in March 2011 and released in May 2011. >> 1.7.x was branched from master in July 2012 and still isn't out. But >> at least we've finally found and fixed the second to last bug! >> >> Wouldn't it be nice to have a 2-4 week beta cycle that only found >> trivial and expected problems? We *already* have 6 months worth of >> feature work in master that won't be in the *next* release. >> >> Note 1: if we do do this, then we'll also want to rethink the >> deprecation cycle a bit -- right now we've sort of vaguely been saying >> "well, we'll deprecate it in release n and take it out in n+1. >> Whenever that is". 3 months definitely isn't long enough for a >> deprecation period, so if we do do this then we'll want to deprecate >> things for multiple releases before actually removing them. Details to >> be determined. >> >> Note 2: in this kind of release schedule, you definitely don't want to >> say "here are the features that will be in the next release!", because >> then you end up slipping and sliding all over the place. Instead you >> say "here are some things that I want to work on next, and we'll see >> which release they end up in". 
Since we're already following the rule >> that nothing goes into master until it's done and tested and ready for >> release anyway, this doesn't really change much. >> >> Thoughts? > > Hey, my time to have a time-machine: > http://mail.scipy.org/pipermail/numpy-discussion/2008-May/033754.html > > I still think it is a good idea :) I guess it is the release manager who has by far the largest say in this. Who will that be for the next year or so? Best, Matthew From pierre.haessig at crans.org Mon Jan 14 07:38:46 2013 From: pierre.haessig at crans.org (Pierre Haessig) Date: Mon, 14 Jan 2013 13:38:46 +0100 Subject: [Numpy-discussion] New numpy functions: filled, filled_like In-Reply-To: References: Message-ID: <50F3FC56.8000100@crans.org> Hi, Le 14/01/2013 00:39, Nathaniel Smith a ?crit : > (The nice thing about np.filled() is that it makes np.zeros() and > np.ones() feel like clutter, rather than the reverse... not that I'm > suggesting ever getting rid of them, but it makes the API conceptually > feel smaller, not larger.) Coming from the Matlab syntax, I feel that np.zeros and np.ones are in numpy for Matlab (and maybe others ?) compatibilty and are useful for that. Now that I've been "enlightened" by Python, I think that those functions (especially np.ones) are indeed clutter. Therefore I favor the introduction of these two new functions. However, I think Eric's remark about masked array API compatibility is important. I don't know what other names are possible ? np.const ? Or maybe np.tile is also useful for that same purpose ? In that case adding a dtype argument to np.tile would be useful. best, Pierre -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 900 bytes Desc: OpenPGP digital signature URL: From matthew.brett at gmail.com Mon Jan 14 07:44:40 2013 From: matthew.brett at gmail.com (Matthew Brett) Date: Mon, 14 Jan 2013 12:44:40 +0000 Subject: [Numpy-discussion] numpydoc for python 3? In-Reply-To: <50F3DF7F.3060600@aalto.fi> References: <50EDA996.6090806@aalto.fi> <50EED62B.9010105@aalto.fi> <50EEDB50.5000902@aalto.fi> <50F33959.7030203@aalto.fi> <50F3DF7F.3060600@aalto.fi> Message-ID: Hi, On Mon, Jan 14, 2013 at 10:35 AM, Jaakko Luttinen wrote: > On 01/14/2013 12:53 AM, Matthew Brett wrote: >> On Sun, Jan 13, 2013 at 10:46 PM, Jaakko Luttinen >> wrote: >>> I'm a bit stuck trying to make numpydoc Python 3 compatible. I made >>> setup.py try to use distutils.command.build_py.build_py_2to3 in order to >>> transform installed code automatically to Python 3. However, the tests >>> (in tests folder) are not part of the package but rather package_data, >>> so they won't get transformed. How can I automatically transform the >>> tests too? Probably there is some easy and "right" solution to this, but >>> I haven't been able to figure out a nice and simple solution.. Any >>> ideas? Thanks. >> >> Can you add tests as a package 'numpydoc.tests' and add an __init__.py >> file to the 'tests' directory? > > I thought there is some reason why the 'tests' directory is not added as > a package 'numpydoc.tests', so I didn't want to take that route. I think the only reason is so that people can't import 'numpydoc.tests' in case they get confused. We (nipy.org/nipy etc) used to use packagedata for tests, but then we lost interest in preventing people doing the import, and started to enjoy being able to port things across as packages, do relative imports, run 2to3 and so on. So, I'd just go for it. 
>> You might be able to get away without 2to3, using the kind of stuff >> that Pauli has used for scipy recently: >> >> https://github.com/scipy/scipy/pull/397 > > Ok, thanks, maybe I'll try to make the tests valid in all Python > versions. It seems there's only one line which I'm not able to transform. > > In doc/sphinxext/tests/test_docscrape.py, on line 559: > assert doc['Summary'][0] == u'?????????????'.encode('utf-8') > > This is invalid in Python 3.0-3.2. How could I write this in such a way > that it is valid in all Python versions? I'm a bit lost with these > unicode encodings in Python (and in general).. And I didn't want to add > dependency on 'six' package. Pierre's suggestion is good; you can also do something like this: # -*- coding: utf8 -*- import sys if sys.version_info[0] >= 3: a = '?????????????' else: a = unicode('?????????????', 'utf8') The 'coding' line has to be the first or second line in the file. Best, Matthew From ndbecker2 at gmail.com Mon Jan 14 07:56:59 2013 From: ndbecker2 at gmail.com (Neal Becker) Date: Mon, 14 Jan 2013 07:56:59 -0500 Subject: [Numpy-discussion] phase unwrapping (1d) References: Message-ID: Nadav Horesh wrote: > There is an unwrap function in numpy. Doesn't it work for you? > Like I had said, np.unwrap was too slow. Profiling showed it eating up an absurd proportion of time. My c++ code was much better (although still surprisingly slow). From pierre.haessig at crans.org Mon Jan 14 08:08:34 2013 From: pierre.haessig at crans.org (Pierre Haessig) Date: Mon, 14 Jan 2013 14:08:34 +0100 Subject: [Numpy-discussion] phase unwrapping (1d) In-Reply-To: References: Message-ID: <50F40352.9090603@crans.org> Hi Neal, Le 11/01/2013 16:40, Neal Becker a ?crit : > I wanted to be able to handle the case of > > unwrap (arg (x1) + arg (x2)) > > Here, phase can change by more than 2pi. It's not clear to me what you mean by "change more than 2pi" ? Do you mean that the consecutive points of in input can increase by more than 2pi ? If that's the case, I feel like there is no a priori information in the data to detect such a "giant leap". Also, I copy-paste here for reference the numpy.wrap code from [1] : def unwrap(p, discont=pi, axis=-1): p = asarray(p) nd = len(p.shape) dd = diff(p, axis=axis) slice1 = [slice(None, None)]*nd # full slices slice1[axis] = slice(1, None) ddmod = mod(dd+pi, 2*pi)-pi _nx.copyto(ddmod, pi, where=(ddmod==-pi) & (dd > 0)) ph_correct = ddmod - dd; _nx.copyto(ph_correct, 0, where=abs(dd) From mike.r.anderson.13 at gmail.com Mon Jan 14 08:56:35 2013 From: mike.r.anderson.13 at gmail.com (Mike Anderson) Date: Mon, 14 Jan 2013 21:56:35 +0800 Subject: [Numpy-discussion] Insights / lessons learned from NumPy design In-Reply-To: References: Message-ID: Just wanted to say a big thanks to everyone in the NumPy community who has commented on this topic - it's given us a lot to think about and a lot of good ideas to work into the design! Best regards, Mike. On 4 January 2013 14:29, Mike Anderson wrote: > Hello all, > > In the Clojure community there has been some discussion about creating a > common matrix maths library / API. Currently there are a few different > fledgeling matrix libraries in Clojure, so it seemed like a worthwhile > effort to unify them and have a common base on which to build on. > > NumPy has been something of an inspiration for this, so I though I'd ask > here to see what lessons have been learned. > > We're thinking of a matrix library with roughly the following design > (subject to change!) 
> - Support for multi-dimensional matrices (but with fast paths for 1D > vectors and 2D matrices as the common cases) > - Immutability by default, i.e. matrix operations are pure functions that > create new matrices. There could be a "backdoor" option to mutate matrices, > but that would be unidiomatic in Clojure > - Support for 64-bit double precision floats only (this is the standard > float type in Clojure) > - Ability to support multiple different back-end matrix implementations > (JBLAS, Colt, EJML, Vectorz, javax.vecmath etc.) > - A full range of matrix operations. Operations would be delegated to back > end implementations where they are supported, otherwise generic > implementations could be used. > > Any thoughts on this topic based on the NumPy experience? In particular > would be very interesting to know: > - Features in NumPy which proved to be redundant / not worth the effort > - Features that you wish had been designed in at the start > - Design decisions that turned out to be a particularly big mistake / > success > > Would love to hear your insights, any ideas+advice greatly appreciated! > > Mike. > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ndbecker2 at gmail.com Mon Jan 14 09:39:34 2013 From: ndbecker2 at gmail.com (Neal Becker) Date: Mon, 14 Jan 2013 09:39:34 -0500 Subject: [Numpy-discussion] phase unwrapping (1d) References: <50F40352.9090603@crans.org> Message-ID: This code should explain all: -------------------------------- import numpy as np arg = np.angle def nint (x): return int (x + 0.5) if x >= 0 else int (x - 0.5) def unwrap (inp, y=np.pi, init=0, cnt=0): o = np.empty_like (inp) prev_o = init for i in range (len (inp)): o[i] = cnt * 2 * y + inp[i] delta = o[i] - prev_o if delta / y > 1 or delta / y < -1: n = nint (delta / (2*y)) o[i] -= 2*y*n cnt -= n prev_o = o[i] return o u = np.linspace (0, 400, 100) * np.pi/100 v = np.cos (u) + 1j * np.sin (u) plot (arg(v)) plot (arg(v) + arg (v)) plot (unwrap (arg (v))) plot (unwrap (arg (v) + arg (v))) ------------------------------- Pierre Haessig wrote: > Hi Neal, > > Le 11/01/2013 16:40, Neal Becker a ?crit : >> I wanted to be able to handle the case of >> >> unwrap (arg (x1) + arg (x2)) >> >> Here, phase can change by more than 2pi. > It's not clear to me what you mean by "change more than 2pi" ? Do you > mean that the consecutive points of in input can increase by more than > 2pi ? If that's the case, I feel like there is no a priori information > in the data to detect such a "giant leap". > > Also, I copy-paste here for reference the numpy.wrap code from [1] : > > def unwrap(p, discont=pi, axis=-1): > p = asarray(p) > nd = len(p.shape) > dd = diff(p, axis=axis) > slice1 = [slice(None, None)]*nd # full slices > slice1[axis] = slice(1, None) > ddmod = mod(dd+pi, 2*pi)-pi > _nx.copyto(ddmod, pi, where=(ddmod==-pi) & (dd > 0)) > ph_correct = ddmod - dd; > _nx.copyto(ph_correct, 0, where=abs(dd) up = array(p, copy=True, dtype='d') > up[slice1] = p[slice1] + ph_correct.cumsum(axis) > return up > > I don't know why it's too slow though. It looks well vectorized. > > Coming back to your C algorithm, I'm not C guru so that I don't have a > clear picture of what it's doing. Do you have a Python prototype ? 
> > Best, > Pierre > > [1] > https://github.com/numpy/numpy/blob/master/numpy/lib/function_base.py#L1117 From ben.root at ou.edu Mon Jan 14 09:57:17 2013 From: ben.root at ou.edu (Benjamin Root) Date: Mon, 14 Jan 2013 09:57:17 -0500 Subject: [Numpy-discussion] New numpy functions: filled, filled_like In-Reply-To: <50F3FC56.8000100@crans.org> References: <50F3FC56.8000100@crans.org> Message-ID: On Mon, Jan 14, 2013 at 7:38 AM, Pierre Haessig wrote: > Hi, > > Le 14/01/2013 00:39, Nathaniel Smith a ?crit : > > (The nice thing about np.filled() is that it makes np.zeros() and > > np.ones() feel like clutter, rather than the reverse... not that I'm > > suggesting ever getting rid of them, but it makes the API conceptually > > feel smaller, not larger.) > Coming from the Matlab syntax, I feel that np.zeros and np.ones are in > numpy for Matlab (and maybe others ?) compatibilty and are useful for > that. Now that I've been "enlightened" by Python, I think that those > functions (especially np.ones) are indeed clutter. Therefore I favor the > introduction of these two new functions. > > However, I think Eric's remark about masked array API compatibility is > important. I don't know what other names are possible ? np.const ? > > Or maybe np.tile is also useful for that same purpose ? In that case > adding a dtype argument to np.tile would be useful. > > best, > Pierre > > I am also +1 on the idea of having a filled() and filled_like() function (I learned a long time ago to just do a = np.empty() and a.fill() rather than the multiplication trick I learned from Matlab). However, the collision with the masked array API is a non-starter for me. np.const() and np.const_like() probably make the most sense, but I would prefer a verb over a noun. Ben Root -------------- next part -------------- An HTML attachment was scrubbed... URL: From nouiz at nouiz.org Mon Jan 14 10:12:47 2013 From: nouiz at nouiz.org (=?ISO-8859-1?Q?Fr=E9d=E9ric_Bastien?=) Date: Mon, 14 Jan 2013 10:12:47 -0500 Subject: [Numpy-discussion] New numpy functions: filled, filled_like In-Reply-To: References: <50F3FC56.8000100@crans.org> Message-ID: Why not optimize NumPy to detect a mul of an ndarray by a scalar to call fill? That way, "np.empty * 2" will be as fast as "x=np.empty; x.fill(2)"? Fred On Mon, Jan 14, 2013 at 9:57 AM, Benjamin Root wrote: > > > On Mon, Jan 14, 2013 at 7:38 AM, Pierre Haessig > wrote: >> >> Hi, >> >> Le 14/01/2013 00:39, Nathaniel Smith a ?crit : >> > (The nice thing about np.filled() is that it makes np.zeros() and >> > np.ones() feel like clutter, rather than the reverse... not that I'm >> > suggesting ever getting rid of them, but it makes the API conceptually >> > feel smaller, not larger.) >> Coming from the Matlab syntax, I feel that np.zeros and np.ones are in >> numpy for Matlab (and maybe others ?) compatibilty and are useful for >> that. Now that I've been "enlightened" by Python, I think that those >> functions (especially np.ones) are indeed clutter. Therefore I favor the >> introduction of these two new functions. >> >> However, I think Eric's remark about masked array API compatibility is >> important. I don't know what other names are possible ? np.const ? >> >> Or maybe np.tile is also useful for that same purpose ? In that case >> adding a dtype argument to np.tile would be useful. 
>> >> best, >> Pierre >> > > I am also +1 on the idea of having a filled() and filled_like() function (I > learned a long time ago to just do a = np.empty() and a.fill() rather than > the multiplication trick I learned from Matlab). However, the collision > with the masked array API is a non-starter for me. np.const() and > np.const_like() probably make the most sense, but I would prefer a verb over > a noun. > > Ben Root > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From robince at gmail.com Mon Jan 14 10:21:57 2013 From: robince at gmail.com (Robin) Date: Mon, 14 Jan 2013 15:21:57 +0000 Subject: [Numpy-discussion] New numpy functions: filled, filled_like In-Reply-To: References: <50F3FC56.8000100@crans.org> Message-ID: On Mon, Jan 14, 2013 at 2:57 PM, Benjamin Root wrote: > I am also +1 on the idea of having a filled() and filled_like() function (I > learned a long time ago to just do a = np.empty() and a.fill() rather than > the multiplication trick I learned from Matlab). However, the collision > with the masked array API is a non-starter for me. np.const() and > np.const_like() probably make the most sense, but I would prefer a verb over > a noun. To get an array of 1's, you call np.ones(shape), to get an array of 0's you call np.zeros(shape) so to get an array of val's why not call np.vals(shape, val)? Cheers Robins From robert.kern at gmail.com Mon Jan 14 10:32:27 2013 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 14 Jan 2013 16:32:27 +0100 Subject: [Numpy-discussion] New numpy functions: filled, filled_like In-Reply-To: References: <50F3FC56.8000100@crans.org> Message-ID: On Mon, Jan 14, 2013 at 4:12 PM, Fr?d?ric Bastien wrote: > Why not optimize NumPy to detect a mul of an ndarray by a scalar to > call fill? That way, "np.empty * 2" will be as fast as "x=np.empty; > x.fill(2)"? In general, each element of an array will be different, so the result of the multiplication will be different, so fill can not be used. -- Robert Kern From matthew.brett at gmail.com Mon Jan 14 10:35:49 2013 From: matthew.brett at gmail.com (Matthew Brett) Date: Mon, 14 Jan 2013 15:35:49 +0000 Subject: [Numpy-discussion] New numpy functions: filled, filled_like In-Reply-To: References: Message-ID: Hi, On Mon, Jan 14, 2013 at 9:02 AM, Dave Hirschfeld wrote: > Robert Kern gmail.com> writes: > >> >> >>> > >> >>> > One alternative that does not expand the API with two-liners is to let >> >>> > the ndarray.fill() method return self: >> >>> > >> >>> > a = np.empty(...).fill(20.0) >> >>> >> >>> This violates the convention that in-place operations never return >> >>> self, to avoid confusion with out-of-place operations. E.g. >> >>> ndarray.resize() versus ndarray.reshape(), ndarray.sort() versus >> >>> np.sort(), and in the broader Python world, list.sort() versus >> >>> sorted(), list.reverse() versus reversed(). (This was an explicit >> >>> reason given for list.sort to not return self, even.) >> >>> >> >>> Maybe enabling this idiom is a good enough reason to break the >> >>> convention ("Special cases aren't special enough to break the rules. / >> >>> Although practicality beats purity"), but it at least makes me -0 on >> >>> this... >> >>> >> >> >> >> I tend to agree with the notion that inplace operations shouldn't return >> >> self, but I don't know if it's just because I've been conditioned this way. 
>> >> Not returning self breaks the fluid interface pattern [1], as noted in a >> >> similar discussion on pandas [2], FWIW, though there's likely some way to >> >> have both worlds. >> > >> > Ah-hah, here's the email where Guide officially proclaims that there >> > shall be no "fluent interface" nonsense applied to in-place operators >> > in Python, because it hurts readability (at least for Dutch people >> > ): >> > http://mail.python.org/pipermail/python-dev/2003-October/038855.html >> >> That's a statement about the policy for the stdlib, and just one >> person's opinion. You, and numpy, are permitted to have a different >> opinion. >> >> In any case, I'm not strongly advocating for it. It's violation of >> principle ("no fluent interfaces") is roughly in the same ballpark as >> np.filled() ("not every two-liner needs its own function"), so I >> thought I would toss it out there for consideration. >> >> -- >> Robert Kern >> > > FWIW I'm +1 on the idea. Perhaps because I just don't see many practical > downsides to breaking the convention but I regularly see a big issue with there > being no way to instantiate an array with a particular value. > > The one obvious way to do it is use ones and multiply by the value you want. I > work with a lot of inexperienced programmers and I see this idiom all the time. > It takes a fair amount of numpy knowledge to know that you should do it in two > lines by using empty and setting a slice. > > In [1]: %timeit NaN*ones(10000) > 1000 loops, best of 3: 1.74 ms per loop > > In [2]: %%timeit > ...: x = empty(10000, dtype=float) > ...: x[:] = NaN > ...: > 10000 loops, best of 3: 28 us per loop > > In [3]: 1.74e-3/28e-6 > Out[3]: 62.142857142857146 > > > Even when not in the mythical "tight loop" setting an array to one and then > multiplying uses up a lot of cycles - it's nearly 2 orders of magnitude slower > than what we know they *should* be doing. > > I'm agnostic as to whether fill should be modified or new functions provided but > I think numpy is currently missing this functionality and that providing it > would save a lot of new users from shooting themselves in the foot performance- > wise. Is this a fair summary? => fill(shape, val), fill_like(arr, val) - new functions, as proposed For: readable, seems to fit a pattern often used, presence in namespace may clue people into using the 'fill' rather than * val or + val Con: a very simple alias for a = ones(shape) ; a.fill(val), maybe cluttering already full namespace. => empty(shape).fill(val) - by allowing return value from arr.fill(val) For: readable Con: breaks guideline not to return anything from in-place operations, no presence in namespace means users may not find this pattern. => no new API For : easy maintenance Con : harder for users to discover fill pattern, filling a new array requires two lines instead of one. So maybe the decision rests on: How important is it that users see these function names in the namespace in order to discover the pattern "a = ones(shape) ; a.fill(val)"? How important is it to obey guidelines for no-return-from-in-place? How important is it to avoid expanding the namespace? How common is this pattern? On the last, I'd say that the only common use I have for this pattern is to fill an array with NaN. 
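To make that NaN case concrete, here is a minimal sketch of the three spellings discussed
in this thread -- the last one is only the API proposed in PR 2875 and does not exist in
numpy today:

import numpy as np

shape = (1000, 1000)

# the obvious one-liner: allocates an array of ones, then does a full multiply pass
a = np.nan * np.ones(shape)

# the fast spelling available today: allocate uninitialized, then fill in place
b = np.empty(shape)
b.fill(np.nan)                # or equivalently: b[:] = np.nan

# the proposed one-liner (commented out, since np.filled() is not in numpy yet)
# c = np.filled(shape, np.nan)

All three produce the same array; the difference is only how discoverable and how fast
the idiom is.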
Cheers, Matthew From shish at keba.be Mon Jan 14 11:15:30 2013 From: shish at keba.be (Olivier Delalleau) Date: Mon, 14 Jan 2013 11:15:30 -0500 Subject: [Numpy-discussion] New numpy functions: filled, filled_like In-Reply-To: References: Message-ID: 2013/1/14 Matthew Brett : > Hi, > > On Mon, Jan 14, 2013 at 9:02 AM, Dave Hirschfeld > wrote: >> Robert Kern gmail.com> writes: >> >>> >>> >>> > >>> >>> > One alternative that does not expand the API with two-liners is to let >>> >>> > the ndarray.fill() method return self: >>> >>> > >>> >>> > a = np.empty(...).fill(20.0) >>> >>> >>> >>> This violates the convention that in-place operations never return >>> >>> self, to avoid confusion with out-of-place operations. E.g. >>> >>> ndarray.resize() versus ndarray.reshape(), ndarray.sort() versus >>> >>> np.sort(), and in the broader Python world, list.sort() versus >>> >>> sorted(), list.reverse() versus reversed(). (This was an explicit >>> >>> reason given for list.sort to not return self, even.) >>> >>> >>> >>> Maybe enabling this idiom is a good enough reason to break the >>> >>> convention ("Special cases aren't special enough to break the rules. / >>> >>> Although practicality beats purity"), but it at least makes me -0 on >>> >>> this... >>> >>> >>> >> >>> >> I tend to agree with the notion that inplace operations shouldn't return >>> >> self, but I don't know if it's just because I've been conditioned this way. >>> >> Not returning self breaks the fluid interface pattern [1], as noted in a >>> >> similar discussion on pandas [2], FWIW, though there's likely some way to >>> >> have both worlds. >>> > >>> > Ah-hah, here's the email where Guide officially proclaims that there >>> > shall be no "fluent interface" nonsense applied to in-place operators >>> > in Python, because it hurts readability (at least for Dutch people >>> > ): >>> > http://mail.python.org/pipermail/python-dev/2003-October/038855.html >>> >>> That's a statement about the policy for the stdlib, and just one >>> person's opinion. You, and numpy, are permitted to have a different >>> opinion. >>> >>> In any case, I'm not strongly advocating for it. It's violation of >>> principle ("no fluent interfaces") is roughly in the same ballpark as >>> np.filled() ("not every two-liner needs its own function"), so I >>> thought I would toss it out there for consideration. >>> >>> -- >>> Robert Kern >>> >> >> FWIW I'm +1 on the idea. Perhaps because I just don't see many practical >> downsides to breaking the convention but I regularly see a big issue with there >> being no way to instantiate an array with a particular value. >> >> The one obvious way to do it is use ones and multiply by the value you want. I >> work with a lot of inexperienced programmers and I see this idiom all the time. >> It takes a fair amount of numpy knowledge to know that you should do it in two >> lines by using empty and setting a slice. >> >> In [1]: %timeit NaN*ones(10000) >> 1000 loops, best of 3: 1.74 ms per loop >> >> In [2]: %%timeit >> ...: x = empty(10000, dtype=float) >> ...: x[:] = NaN >> ...: >> 10000 loops, best of 3: 28 us per loop >> >> In [3]: 1.74e-3/28e-6 >> Out[3]: 62.142857142857146 >> >> >> Even when not in the mythical "tight loop" setting an array to one and then >> multiplying uses up a lot of cycles - it's nearly 2 orders of magnitude slower >> than what we know they *should* be doing. 
>> >> I'm agnostic as to whether fill should be modified or new functions provided but >> I think numpy is currently missing this functionality and that providing it >> would save a lot of new users from shooting themselves in the foot performance- >> wise. > > Is this a fair summary? > > => fill(shape, val), fill_like(arr, val) - new functions, as proposed > For: readable, seems to fit a pattern often used, presence in > namespace may clue people into using the 'fill' rather than * val or + > val > Con: a very simple alias for a = ones(shape) ; a.fill(val), maybe > cluttering already full namespace. > > => empty(shape).fill(val) - by allowing return value from arr.fill(val) > For: readable > Con: breaks guideline not to return anything from in-place operations, > no presence in namespace means users may not find this pattern. > > => no new API > For : easy maintenance > Con : harder for users to discover fill pattern, filling a new array > requires two lines instead of one. > > So maybe the decision rests on: > > How important is it that users see these function names in the > namespace in order to discover the pattern "a = ones(shape) ; > a.fill(val)"? > > How important is it to obey guidelines for no-return-from-in-place? > > How important is it to avoid expanding the namespace? > > How common is this pattern? > > On the last, I'd say that the only common use I have for this pattern > is to fill an array with NaN. My 2 cts from a user perspective: - +1 to have such a function. I usually use numpy.ones * scalar because honestly, spending two lines of code for such a basic operations seems like a waste. Even if it's slower and potentially dangerous due to casting rules. - I think having a noun rather than a verb makes more sense since we have numpy.ones and numpy.zeros (and I always read "numpy.empty" as "give me an empty array", not "empty an array"). - I agree the name collision with np.ma.filled is a problem. I have no better suggestion though at this point. -=- Olivier From josef.pktd at gmail.com Mon Jan 14 11:22:40 2013 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Mon, 14 Jan 2013 11:22:40 -0500 Subject: [Numpy-discussion] New numpy functions: filled, filled_like In-Reply-To: References: Message-ID: On Mon, Jan 14, 2013 at 11:15 AM, Olivier Delalleau wrote: > 2013/1/14 Matthew Brett : >> Hi, >> >> On Mon, Jan 14, 2013 at 9:02 AM, Dave Hirschfeld >> wrote: >>> Robert Kern gmail.com> writes: >>> >>>> >>>> >>> > >>>> >>> > One alternative that does not expand the API with two-liners is to let >>>> >>> > the ndarray.fill() method return self: >>>> >>> > >>>> >>> > a = np.empty(...).fill(20.0) >>>> >>> >>>> >>> This violates the convention that in-place operations never return >>>> >>> self, to avoid confusion with out-of-place operations. E.g. >>>> >>> ndarray.resize() versus ndarray.reshape(), ndarray.sort() versus >>>> >>> np.sort(), and in the broader Python world, list.sort() versus >>>> >>> sorted(), list.reverse() versus reversed(). (This was an explicit >>>> >>> reason given for list.sort to not return self, even.) >>>> >>> >>>> >>> Maybe enabling this idiom is a good enough reason to break the >>>> >>> convention ("Special cases aren't special enough to break the rules. / >>>> >>> Although practicality beats purity"), but it at least makes me -0 on >>>> >>> this... >>>> >>> >>>> >> >>>> >> I tend to agree with the notion that inplace operations shouldn't return >>>> >> self, but I don't know if it's just because I've been conditioned this way. 
>>>> >> Not returning self breaks the fluid interface pattern [1], as noted in a >>>> >> similar discussion on pandas [2], FWIW, though there's likely some way to >>>> >> have both worlds. >>>> > >>>> > Ah-hah, here's the email where Guide officially proclaims that there >>>> > shall be no "fluent interface" nonsense applied to in-place operators >>>> > in Python, because it hurts readability (at least for Dutch people >>>> > ): >>>> > http://mail.python.org/pipermail/python-dev/2003-October/038855.html >>>> >>>> That's a statement about the policy for the stdlib, and just one >>>> person's opinion. You, and numpy, are permitted to have a different >>>> opinion. >>>> >>>> In any case, I'm not strongly advocating for it. It's violation of >>>> principle ("no fluent interfaces") is roughly in the same ballpark as >>>> np.filled() ("not every two-liner needs its own function"), so I >>>> thought I would toss it out there for consideration. >>>> >>>> -- >>>> Robert Kern >>>> >>> >>> FWIW I'm +1 on the idea. Perhaps because I just don't see many practical >>> downsides to breaking the convention but I regularly see a big issue with there >>> being no way to instantiate an array with a particular value. >>> >>> The one obvious way to do it is use ones and multiply by the value you want. I >>> work with a lot of inexperienced programmers and I see this idiom all the time. >>> It takes a fair amount of numpy knowledge to know that you should do it in two >>> lines by using empty and setting a slice. >>> >>> In [1]: %timeit NaN*ones(10000) >>> 1000 loops, best of 3: 1.74 ms per loop >>> >>> In [2]: %%timeit >>> ...: x = empty(10000, dtype=float) >>> ...: x[:] = NaN >>> ...: >>> 10000 loops, best of 3: 28 us per loop >>> >>> In [3]: 1.74e-3/28e-6 >>> Out[3]: 62.142857142857146 >>> >>> >>> Even when not in the mythical "tight loop" setting an array to one and then >>> multiplying uses up a lot of cycles - it's nearly 2 orders of magnitude slower >>> than what we know they *should* be doing. >>> >>> I'm agnostic as to whether fill should be modified or new functions provided but >>> I think numpy is currently missing this functionality and that providing it >>> would save a lot of new users from shooting themselves in the foot performance- >>> wise. >> >> Is this a fair summary? >> >> => fill(shape, val), fill_like(arr, val) - new functions, as proposed >> For: readable, seems to fit a pattern often used, presence in >> namespace may clue people into using the 'fill' rather than * val or + >> val >> Con: a very simple alias for a = ones(shape) ; a.fill(val), maybe >> cluttering already full namespace. >> >> => empty(shape).fill(val) - by allowing return value from arr.fill(val) >> For: readable >> Con: breaks guideline not to return anything from in-place operations, >> no presence in namespace means users may not find this pattern. >> >> => no new API >> For : easy maintenance >> Con : harder for users to discover fill pattern, filling a new array >> requires two lines instead of one. >> >> So maybe the decision rests on: >> >> How important is it that users see these function names in the >> namespace in order to discover the pattern "a = ones(shape) ; >> a.fill(val)"? >> >> How important is it to obey guidelines for no-return-from-in-place? >> >> How important is it to avoid expanding the namespace? >> >> How common is this pattern? >> >> On the last, I'd say that the only common use I have for this pattern >> is to fill an array with NaN. 
> > My 2 cts from a user perspective: > > - +1 to have such a function. I usually use numpy.ones * scalar > because honestly, spending two lines of code for such a basic > operations seems like a waste. Even if it's slower and potentially > dangerous due to casting rules. > - I think having a noun rather than a verb makes more sense since we > have numpy.ones and numpy.zeros (and I always read "numpy.empty" as > "give me an empty array", not "empty an array"). > - I agree the name collision with np.ma.filled is a problem. I have no > better suggestion though at this point. np.array_filled(shape, value, dtype) ? maybe more verbose, but unambiguous AFAICS BTW GAUSS http://en.wikipedia.org/wiki/GAUSS_(software) also has zeros and ones. 1st release 1984 np.array_filled((100, 2), -999, int) ? Josef > > -=- Olivier > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From nouiz at nouiz.org Mon Jan 14 11:45:29 2013 From: nouiz at nouiz.org (=?ISO-8859-1?Q?Fr=E9d=E9ric_Bastien?=) Date: Mon, 14 Jan 2013 11:45:29 -0500 Subject: [Numpy-discussion] 1.8 release In-Reply-To: References: Message-ID: Hi, I don't volontear for the next release manager, but +1 for shorter releases. I heard just good comments from that. Also, I'm not sure it would ask more from the release manager. Do someone have an idea? The most work I do as a release manager for theano is the preparation/tests/release notes and this depend on the amont of new stuff. And this seam exponential on the number of new changes in the release, not linear (no data, just an impression...). Making smaller release make this easier. But yes, this mean more announces. But this isn't what take the most times. Also, doing the release notes more frequently mean it is more recent in memory when you check the PR merged, so it make it easier to do. But what prevent us from making shorter release? Oother priorities that can't wait, like work for papers to submit, or for collaboration with partners. just my 2cents. Fred On Mon, Jan 14, 2013 at 7:18 AM, Matthew Brett wrote: > Hi, > > On Mon, Jan 14, 2013 at 12:19 AM, David Cournapeau wrote: >> On Sun, Jan 13, 2013 at 5:26 PM, Nathaniel Smith wrote: >>> On Sun, Jan 13, 2013 at 7:03 PM, Charles R Harris >>> wrote: >>>> Now that 1.7 is nearing release, it's time to look forward to the 1.8 >>>> release. I'd like us to get back to the twice yearly schedule that we tried >>>> to maintain through the 1.3 - 1.6 releases, so I propose a June release as a >>>> goal. Call it the Spring Cleaning release. As to content, I'd like to see >>>> the following. >>>> >>>> Removal of Python 2.4-2.5 support. >>>> Removal of SCons support. >>>> The index work consolidated. >>>> Initial stab at removing the need for 2to3. See Pauli's PR for scipy. >>>> Miscellaneous enhancements and fixes. >>> >>> I'd actually like to propose a faster release cycle than this, even. >>> Perhaps 3 months between releases; 2 months from release n to the >>> first beta of n+1? >>> >>> The consequences would be: >>> * Changes get out to users faster. >>> * Each release is smaller, so it's easier for downstream projects to >>> adjust to each release -- instead of having this giant pile of changes >>> to work through all at once every 6-12 months >>> * End-users are less scared of updating, because the changes aren't so >>> overwhelming, so they end up actually testing (and getting to take >>> advantage of) the new stuff more. 
>>> * We get feedback more quickly, so we can fix up whatever we break >>> while we still know what we did. >>> * And for larger changes, if we release them incrementally, we can get >>> feedback before we've gone miles down the wrong path. >>> * Releases come out on time more often -- sort of paradoxical, but >>> with small, frequent releases, beta cycles go smoother, and it's >>> easier to say "don't worry, I'll get it ready for next time", or >>> "right, that patch was less done than we thought, let's take it out >>> for now" (also this is much easier if we don't have another years >>> worth of changes committed on top of the patch!). >>> * If your schedule does slip, then you still end up with a <6 month >>> release cycle. >>> >>> 1.6.x was branched from master in March 2011 and released in May 2011. >>> 1.7.x was branched from master in July 2012 and still isn't out. But >>> at least we've finally found and fixed the second to last bug! >>> >>> Wouldn't it be nice to have a 2-4 week beta cycle that only found >>> trivial and expected problems? We *already* have 6 months worth of >>> feature work in master that won't be in the *next* release. >>> >>> Note 1: if we do do this, then we'll also want to rethink the >>> deprecation cycle a bit -- right now we've sort of vaguely been saying >>> "well, we'll deprecate it in release n and take it out in n+1. >>> Whenever that is". 3 months definitely isn't long enough for a >>> deprecation period, so if we do do this then we'll want to deprecate >>> things for multiple releases before actually removing them. Details to >>> be determined. >>> >>> Note 2: in this kind of release schedule, you definitely don't want to >>> say "here are the features that will be in the next release!", because >>> then you end up slipping and sliding all over the place. Instead you >>> say "here are some things that I want to work on next, and we'll see >>> which release they end up in". Since we're already following the rule >>> that nothing goes into master until it's done and tested and ready for >>> release anyway, this doesn't really change much. >>> >>> Thoughts? >> >> Hey, my time to have a time-machine: >> http://mail.scipy.org/pipermail/numpy-discussion/2008-May/033754.html >> >> I still think it is a good idea :) > > I guess it is the release manager who has by far the largest say in > this. Who will that be for the next year or so? > > Best, > > Matthew > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From josef.pktd at gmail.com Mon Jan 14 11:55:39 2013 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Mon, 14 Jan 2013 11:55:39 -0500 Subject: [Numpy-discussion] New numpy functions: filled, filled_like In-Reply-To: References: Message-ID: On Mon, Jan 14, 2013 at 11:22 AM, wrote: > On Mon, Jan 14, 2013 at 11:15 AM, Olivier Delalleau wrote: >> 2013/1/14 Matthew Brett : >>> Hi, >>> >>> On Mon, Jan 14, 2013 at 9:02 AM, Dave Hirschfeld >>> wrote: >>>> Robert Kern gmail.com> writes: >>>> >>>>> >>>>> >>> > >>>>> >>> > One alternative that does not expand the API with two-liners is to let >>>>> >>> > the ndarray.fill() method return self: >>>>> >>> > >>>>> >>> > a = np.empty(...).fill(20.0) >>>>> >>> >>>>> >>> This violates the convention that in-place operations never return >>>>> >>> self, to avoid confusion with out-of-place operations. E.g. 
>>>>> >>> ndarray.resize() versus ndarray.reshape(), ndarray.sort() versus >>>>> >>> np.sort(), and in the broader Python world, list.sort() versus >>>>> >>> sorted(), list.reverse() versus reversed(). (This was an explicit >>>>> >>> reason given for list.sort to not return self, even.) >>>>> >>> >>>>> >>> Maybe enabling this idiom is a good enough reason to break the >>>>> >>> convention ("Special cases aren't special enough to break the rules. / >>>>> >>> Although practicality beats purity"), but it at least makes me -0 on >>>>> >>> this... >>>>> >>> >>>>> >> >>>>> >> I tend to agree with the notion that inplace operations shouldn't return >>>>> >> self, but I don't know if it's just because I've been conditioned this way. >>>>> >> Not returning self breaks the fluid interface pattern [1], as noted in a >>>>> >> similar discussion on pandas [2], FWIW, though there's likely some way to >>>>> >> have both worlds. >>>>> > >>>>> > Ah-hah, here's the email where Guide officially proclaims that there >>>>> > shall be no "fluent interface" nonsense applied to in-place operators >>>>> > in Python, because it hurts readability (at least for Dutch people >>>>> > ): >>>>> > http://mail.python.org/pipermail/python-dev/2003-October/038855.html >>>>> >>>>> That's a statement about the policy for the stdlib, and just one >>>>> person's opinion. You, and numpy, are permitted to have a different >>>>> opinion. >>>>> >>>>> In any case, I'm not strongly advocating for it. It's violation of >>>>> principle ("no fluent interfaces") is roughly in the same ballpark as >>>>> np.filled() ("not every two-liner needs its own function"), so I >>>>> thought I would toss it out there for consideration. >>>>> >>>>> -- >>>>> Robert Kern >>>>> >>>> >>>> FWIW I'm +1 on the idea. Perhaps because I just don't see many practical >>>> downsides to breaking the convention but I regularly see a big issue with there >>>> being no way to instantiate an array with a particular value. >>>> >>>> The one obvious way to do it is use ones and multiply by the value you want. I >>>> work with a lot of inexperienced programmers and I see this idiom all the time. >>>> It takes a fair amount of numpy knowledge to know that you should do it in two >>>> lines by using empty and setting a slice. >>>> >>>> In [1]: %timeit NaN*ones(10000) >>>> 1000 loops, best of 3: 1.74 ms per loop >>>> >>>> In [2]: %%timeit >>>> ...: x = empty(10000, dtype=float) >>>> ...: x[:] = NaN >>>> ...: >>>> 10000 loops, best of 3: 28 us per loop >>>> >>>> In [3]: 1.74e-3/28e-6 >>>> Out[3]: 62.142857142857146 >>>> >>>> >>>> Even when not in the mythical "tight loop" setting an array to one and then >>>> multiplying uses up a lot of cycles - it's nearly 2 orders of magnitude slower >>>> than what we know they *should* be doing. >>>> >>>> I'm agnostic as to whether fill should be modified or new functions provided but >>>> I think numpy is currently missing this functionality and that providing it >>>> would save a lot of new users from shooting themselves in the foot performance- >>>> wise. >>> >>> Is this a fair summary? >>> >>> => fill(shape, val), fill_like(arr, val) - new functions, as proposed >>> For: readable, seems to fit a pattern often used, presence in >>> namespace may clue people into using the 'fill' rather than * val or + >>> val >>> Con: a very simple alias for a = ones(shape) ; a.fill(val), maybe >>> cluttering already full namespace. 
>>> >>> => empty(shape).fill(val) - by allowing return value from arr.fill(val) >>> For: readable >>> Con: breaks guideline not to return anything from in-place operations, >>> no presence in namespace means users may not find this pattern. >>> >>> => no new API >>> For : easy maintenance >>> Con : harder for users to discover fill pattern, filling a new array >>> requires two lines instead of one. >>> >>> So maybe the decision rests on: >>> >>> How important is it that users see these function names in the >>> namespace in order to discover the pattern "a = ones(shape) ; >>> a.fill(val)"? >>> >>> How important is it to obey guidelines for no-return-from-in-place? >>> >>> How important is it to avoid expanding the namespace? >>> >>> How common is this pattern? >>> >>> On the last, I'd say that the only common use I have for this pattern >>> is to fill an array with NaN. >> >> My 2 cts from a user perspective: >> >> - +1 to have such a function. I usually use numpy.ones * scalar >> because honestly, spending two lines of code for such a basic >> operations seems like a waste. Even if it's slower and potentially >> dangerous due to casting rules. >> - I think having a noun rather than a verb makes more sense since we >> have numpy.ones and numpy.zeros (and I always read "numpy.empty" as >> "give me an empty array", not "empty an array"). >> - I agree the name collision with np.ma.filled is a problem. I have no >> better suggestion though at this point. > > np.array_filled(shape, value, dtype) ? > maybe more verbose, but unambiguous AFAICS > > BTW > GAUSS http://en.wikipedia.org/wiki/GAUSS_(software) > also has zeros and ones. 1st release 1984 > > np.array_filled((100, 2), -999, int) ? A quick check of the statsmodels source 20 occassions of np.nan * np.ones(...) 50 occassions of np.emtpy a few filled with other values than nan many filled in a loop (optimistically, more often used by new contributers) It's just a two-liner, but if it's a function it hopefully produces better code. David's argument looks plausible to me. Josef > > Josef > > >> >> -=- Olivier >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion From alan.isaac at gmail.com Mon Jan 14 12:15:12 2013 From: alan.isaac at gmail.com (Alan G Isaac) Date: Mon, 14 Jan 2013 12:15:12 -0500 Subject: [Numpy-discussion] New numpy functions: vals and vals_like or filled, filled_like? In-Reply-To: References: Message-ID: <50F43D20.5070907@gmail.com> Just changing the subject line so a good suggestion does not get lost ... Alan From efiring at hawaii.edu Mon Jan 14 12:27:43 2013 From: efiring at hawaii.edu (Eric Firing) Date: Mon, 14 Jan 2013 07:27:43 -1000 Subject: [Numpy-discussion] New numpy functions: filled, filled_like In-Reply-To: References: Message-ID: <50F4400F.4040709@hawaii.edu> On 2013/01/14 6:15 AM, Olivier Delalleau wrote: > - I agree the name collision with np.ma.filled is a problem. I have no > better suggestion though at this point. How about "initialized()"? 
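Whatever name wins, the helper itself would presumably stay a very thin wrapper around
empty() plus fill(). A minimal sketch, with purely illustrative names (none of this
exists in numpy yet):

import numpy as np

def initialized(shape, fill_value, dtype=float):
    # allocate without initializing, then fill in place
    a = np.empty(shape, dtype=dtype)
    a.fill(fill_value)
    return a

def initialized_like(arr, fill_value, dtype=None):
    # same, but take the shape (and by default the dtype) from an existing array
    a = np.empty_like(arr, dtype=dtype)
    a.fill(fill_value)
    return a

# example use: an integer array of -999 placeholders, as suggested earlier in the thread
missing = initialized((100, 2), -999, dtype=int)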
From ben.root at ou.edu Mon Jan 14 12:33:52 2013 From: ben.root at ou.edu (Benjamin Root) Date: Mon, 14 Jan 2013 12:33:52 -0500 Subject: [Numpy-discussion] New numpy functions: filled, filled_like In-Reply-To: <50F4400F.4040709@hawaii.edu> References: <50F4400F.4040709@hawaii.edu> Message-ID: On Mon, Jan 14, 2013 at 12:27 PM, Eric Firing wrote: > On 2013/01/14 6:15 AM, Olivier Delalleau wrote: > > - I agree the name collision with np.ma.filled is a problem. I have no > > better suggestion though at this point. > > How about "initialized()"? > A verb! +1 from me! For those wondering, I have a personal rule that because functions *do* something, they really should have verbs for their names. I have to learn to read functions like "ones" and "empty" like "give me ones" or "give me an empty array". Ben Root -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Mon Jan 14 12:56:35 2013 From: charlesr.harris at gmail.com (Charles R Harris) Date: Mon, 14 Jan 2013 10:56:35 -0700 Subject: [Numpy-discussion] New numpy functions: filled, filled_like In-Reply-To: References: Message-ID: On Sun, Jan 13, 2013 at 4:24 PM, Robert Kern wrote: > On Sun, Jan 13, 2013 at 6:27 PM, Nathaniel Smith wrote: > > Hi all, > > > > PR 2875 adds two new functions, that generalize zeros(), ones(), > > zeros_like(), ones_like(), by simply taking an arbitrary fill value: > > https://github.com/numpy/numpy/pull/2875 > > So > > np.ones((10, 10)) > > is the same as > > np.filled((10, 10), 1) > > > > The implementations are trivial, but the API seems useful because it > > provides an idiomatic way of efficiently creating an array full of > > inf, or nan, or None, whatever funny value you need. All the > > alternatives are either inefficient (np.ones(...) * np.inf) or > > cumbersome (a = np.empty(...); a.fill(...)). Or so it seems to me. But > > there's a question of taste here; one could argue instead that these > > just add more clutter to the numpy namespace. So, before we merge, > > anyone want to chime in? > > One alternative that does not expand the API with two-liners is to let > the ndarray.fill() method return self: > > a = np.empty(...).fill(20.0) > > My thought also. Shades of the Python `.sort` method... Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From pierre.haessig at crans.org Mon Jan 14 13:12:37 2013 From: pierre.haessig at crans.org (Pierre Haessig) Date: Mon, 14 Jan 2013 19:12:37 +0100 Subject: [Numpy-discussion] New numpy functions: filled, filled_like In-Reply-To: References: <50F4400F.4040709@hawaii.edu> Message-ID: <50F44A95.2030202@crans.org> Le 14/01/2013 18:33, Benjamin Root a ?crit : > > > How about "initialized()"? > > > A verb! +1 from me! Shouldn't it be "initialize()" then ? I'm not so fond of it though, because initialize is pretty broad in the field of programming. What about "refurbishing" the already existing "tile()" function ? As of now it almost does the job : In [8]: tile(nan, (3,3)) # (it's a verb ! ) Out[8]: array([[ nan, nan, nan], [ nan, nan, nan], [ nan, nan, nan]]) though with two restrictions: * tile doesn't have a dtype keyword. Could this be added ? * tile performance on my computer seems to be twice as bad as "ones() * val" Best, Pierre -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 900 bytes Desc: OpenPGP digital signature URL: From d.warde.farley at gmail.com Mon Jan 14 13:49:19 2013 From: d.warde.farley at gmail.com (David Warde-Farley) Date: Mon, 14 Jan 2013 13:49:19 -0500 Subject: [Numpy-discussion] New numpy functions: filled, filled_like In-Reply-To: References: <50F3FC56.8000100@crans.org> Message-ID: On Mon, Jan 14, 2013 at 9:57 AM, Benjamin Root wrote: > > > On Mon, Jan 14, 2013 at 7:38 AM, Pierre Haessig > wrote: >> >> Hi, >> >> Le 14/01/2013 00:39, Nathaniel Smith a ?crit : >> > (The nice thing about np.filled() is that it makes np.zeros() and >> > np.ones() feel like clutter, rather than the reverse... not that I'm >> > suggesting ever getting rid of them, but it makes the API conceptually >> > feel smaller, not larger.) >> Coming from the Matlab syntax, I feel that np.zeros and np.ones are in >> numpy for Matlab (and maybe others ?) compatibilty and are useful for >> that. Now that I've been "enlightened" by Python, I think that those >> functions (especially np.ones) are indeed clutter. Therefore I favor the >> introduction of these two new functions. >> >> However, I think Eric's remark about masked array API compatibility is >> important. I don't know what other names are possible ? np.const ? >> >> Or maybe np.tile is also useful for that same purpose ? In that case >> adding a dtype argument to np.tile would be useful. >> >> best, >> Pierre >> > > I am also +1 on the idea of having a filled() and filled_like() function (I > learned a long time ago to just do a = np.empty() and a.fill() rather than > the multiplication trick I learned from Matlab). However, the collision > with the masked array API is a non-starter for me. np.const() and > np.const_like() probably make the most sense, but I would prefer a verb over > a noun. Definitely -1 on const. Falsely implies immutability, to my mind. David From d.warde.farley at gmail.com Mon Jan 14 13:56:54 2013 From: d.warde.farley at gmail.com (David Warde-Farley) Date: Mon, 14 Jan 2013 13:56:54 -0500 Subject: [Numpy-discussion] New numpy functions: filled, filled_like In-Reply-To: <50F44A95.2030202@crans.org> References: <50F4400F.4040709@hawaii.edu> <50F44A95.2030202@crans.org> Message-ID: On Mon, Jan 14, 2013 at 1:12 PM, Pierre Haessig wrote: > In [8]: tile(nan, (3,3)) # (it's a verb ! ) tile, in my opinion, is useful in some cases (for people who think in terms of repmat()) but not very NumPy-ish. What I'd like is a function that takes - an initial array_like "a" - a shape "s" - optionally, a dtype (otherwise inherit from a) and broadcasts "a" to the shape "s". In the case of scalars this is just a fill. In the case of, say, a (5,) vector and a (10, 5) shape, this broadcasts across rows, etc. I don't think it's worth special-casing scalar fills (except perhaps as an implementation detail) when you have rich broadcasting semantics that are already a fundamental part of NumPy, allowing for a much handier primitive. David From ben.root at ou.edu Mon Jan 14 14:05:21 2013 From: ben.root at ou.edu (Benjamin Root) Date: Mon, 14 Jan 2013 14:05:21 -0500 Subject: [Numpy-discussion] New numpy functions: filled, filled_like In-Reply-To: References: <50F4400F.4040709@hawaii.edu> <50F44A95.2030202@crans.org> Message-ID: On Mon, Jan 14, 2013 at 1:56 PM, David Warde-Farley < d.warde.farley at gmail.com> wrote: > On Mon, Jan 14, 2013 at 1:12 PM, Pierre Haessig > wrote: > > In [8]: tile(nan, (3,3)) # (it's a verb ! 
) > > tile, in my opinion, is useful in some cases (for people who think in > terms of repmat()) but not very NumPy-ish. What I'd like is a function > that takes > > - an initial array_like "a" > - a shape "s" > - optionally, a dtype (otherwise inherit from a) > > and broadcasts "a" to the shape "s". In the case of scalars this is > just a fill. In the case of, say, a (5,) vector and a (10, 5) shape, > this broadcasts across rows, etc. > > I don't think it's worth special-casing scalar fills (except perhaps > as an implementation detail) when you have rich broadcasting semantics > that are already a fundamental part of NumPy, allowing for a much > handier primitive. > I have similar problems with "tile". I learned it for a particular use in numpy, and it would be hard for me to see it for another (contextually) different use. I do like the way you are thinking in terms of the broadcasting semantics, but I wonder if that is a bit awkward. What I mean is, if one were to use broadcasting semantics for creating an array, wouldn't one have just simply used broadcasting anyway? The point of broadcasting is to _avoid_ the creation of unneeded arrays. But maybe I can be convinced with some examples. Ben Root -------------- next part -------------- An HTML attachment was scrubbed... URL: From alan.isaac at gmail.com Mon Jan 14 14:17:51 2013 From: alan.isaac at gmail.com (Alan G Isaac) Date: Mon, 14 Jan 2013 14:17:51 -0500 Subject: [Numpy-discussion] New numpy functions: filled, filled_like In-Reply-To: References: <50F4400F.4040709@hawaii.edu> <50F44A95.2030202@crans.org> Message-ID: <50F459DF.2070900@gmail.com> Thanks Pierre for noting that np.tile already provides a chunk of this functionality: >>> a = np.tile(5,(1,2,3)) >>> a array([[[5, 5, 5], [5, 5, 5]]]) >>> np.tile(1,a.shape) array([[[1, 1, 1], [1, 1, 1]]]) I had not realized a scalar first argument was possible. Alan Isaac From ralf.gommers at gmail.com Mon Jan 14 16:26:31 2013 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Mon, 14 Jan 2013 22:26:31 +0100 Subject: [Numpy-discussion] 1.8 release In-Reply-To: References: Message-ID: On Mon, Jan 14, 2013 at 1:19 AM, David Cournapeau wrote: > On Sun, Jan 13, 2013 at 5:26 PM, Nathaniel Smith wrote: > > On Sun, Jan 13, 2013 at 7:03 PM, Charles R Harris > > wrote: > >> Now that 1.7 is nearing release, it's time to look forward to the 1.8 > >> release. I'd like us to get back to the twice yearly schedule that we > tried > >> to maintain through the 1.3 - 1.6 releases, so I propose a June release > as a > >> goal. Call it the Spring Cleaning release. As to content, I'd like to > see > >> the following. > >> > >> Removal of Python 2.4-2.5 support. > >> Removal of SCons support. > >> The index work consolidated. > >> Initial stab at removing the need for 2to3. See Pauli's PR for scipy. > >> Miscellaneous enhancements and fixes. > > > > I'd actually like to propose a faster release cycle than this, even. > > Perhaps 3 months between releases; 2 months from release n to the > > first beta of n+1? > > > > The consequences would be: > > * Changes get out to users faster. > > * Each release is smaller, so it's easier for downstream projects to > > adjust to each release -- instead of having this giant pile of changes > > to work through all at once every 6-12 months > > * End-users are less scared of updating, because the changes aren't so > > overwhelming, so they end up actually testing (and getting to take > > advantage of) the new stuff more. 
> > * We get feedback more quickly, so we can fix up whatever we break > > while we still know what we did. > > * And for larger changes, if we release them incrementally, we can get > > feedback before we've gone miles down the wrong path. > > * Releases come out on time more often -- sort of paradoxical, but > > with small, frequent releases, beta cycles go smoother, and it's > > easier to say "don't worry, I'll get it ready for next time", or > > "right, that patch was less done than we thought, let's take it out > > for now" (also this is much easier if we don't have another years > > worth of changes committed on top of the patch!). > > * If your schedule does slip, then you still end up with a <6 month > > release cycle. > > > > 1.6.x was branched from master in March 2011 and released in May 2011. > > 1.7.x was branched from master in July 2012 and still isn't out. But > > at least we've finally found and fixed the second to last bug! > > > > Wouldn't it be nice to have a 2-4 week beta cycle that only found > > trivial and expected problems? We *already* have 6 months worth of > > feature work in master that won't be in the *next* release. > > > > Note 1: if we do do this, then we'll also want to rethink the > > deprecation cycle a bit -- right now we've sort of vaguely been saying > > "well, we'll deprecate it in release n and take it out in n+1. > > Whenever that is". 3 months definitely isn't long enough for a > > deprecation period, so if we do do this then we'll want to deprecate > > things for multiple releases before actually removing them. Details to > > be determined. > > > > Note 2: in this kind of release schedule, you definitely don't want to > > say "here are the features that will be in the next release!", because > > then you end up slipping and sliding all over the place. Instead you > > say "here are some things that I want to work on next, and we'll see > > which release they end up in". Since we're already following the rule > > that nothing goes into master until it's done and tested and ready for > > release anyway, this doesn't really change much. > > > > Thoughts? > > Hey, my time to have a time-machine: > http://mail.scipy.org/pipermail/numpy-discussion/2008-May/033754.html > > I still think it is a good idea :) > +1 for faster and time-based releases. 3 months does sound a little too short to me (5 or 6 would be better), since a release cycle typically doesn't fit in one month. Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Mon Jan 14 17:08:50 2013 From: njs at pobox.com (Nathaniel Smith) Date: Mon, 14 Jan 2013 22:08:50 +0000 Subject: [Numpy-discussion] 1.8 release In-Reply-To: References: Message-ID: On Mon, Jan 14, 2013 at 4:45 PM, Fr?d?ric Bastien wrote: > I don't volontear for the next release manager, but +1 for shorter > releases. I heard just good comments from that. Also, I'm not sure it > would ask more from the release manager. Do someone have an idea? The > most work I do as a release manager for theano is the > preparation/tests/release notes and this depend on the amont of new > stuff. And this seam exponential on the number of new changes in the > release, not linear (no data, just an impression...). Making smaller > release make this easier. > > But yes, this mean more announces. But this isn't what take the most > times. Also, doing the release notes more frequently mean it is more > recent in memory when you check the PR merged, so it make it easier to > do. 
Right, this is my experience too -- that it's actually easier to put out more releases, because each one is manageable and you get a routine going. ("Oops, it's March, better find an hour this week to check the release notes and run the 'release beta1' script.") It becomes almost boring, which is awesome. Putting out 5 small releases is much, MUCH easier than putting out one giant 5x bigger release. On Mon, Jan 14, 2013 at 9:26 PM, Ralf Gommers wrote: > +1 for faster and time-based releases. > > 3 months does sound a little too short to me (5 or 6 would be better), since > a release cycle typically doesn't fit in one month. The release cycle for 6-12+ months of changes doesn't typically fit in one month, but we've never tried for a smaller release, so who knows. I suppose that theoretically, as scientists, what we ought to do is to attempt 1-2 releases at as aggressive a pace as we can imagine to see how it goes, and then we'll have the data to interpolate the correct speed instead of extrapolating... ;-) On Mon, Jan 14, 2013 at 12:14 AM, Charles R Harris wrote: > I think three months is a bit short. Much will depend on the release manager > and I not sure what Andrej's plans are. I'd happily nominate you for that > role ;) Careful, or I'll nominate you back! ;-) Seriously, though, Ondrej is doing a great job, I doubt I'd do as well... Ondrej: I know you're still doing heroic work getting 1.7 pulled together, but if you have a moment-- Are you planning to stick around as release manager after 1.7? And if so, what are your thoughts on attempting such a short cycle? -n From madsipsen at gmail.com Tue Jan 15 06:50:20 2013 From: madsipsen at gmail.com (Mads Ipsen) Date: Tue, 15 Jan 2013 12:50:20 +0100 Subject: [Numpy-discussion] argsort Message-ID: <50F5427C.8060006@gmail.com> Hi, I simply can't understand this. I'm trying to use argsort to produce indices that can be used to sort an array: from numpy import * indices = array([[4,3],[1,12],[23,7],[11,6],[8,9]]) args = argsort(indices, axis=0) print indices[args] gives: [[[ 1 12] [ 4 3]] [[ 4 3] [11 6]] [[ 8 9] [23 7]] [[11 6] [ 8 9]] [[23 7] [ 1 12]]] I thought this should produce a sorted version of the indices array. Any help is appreciated. Best regards, Mads -- +-----------------------------------------------------+ | Mads Ipsen | +----------------------+------------------------------+ | G?seb?ksvej 7, 4. tv | | | DK-2500 Valby | phone: +45-29716388 | | Denmark | email: mads.ipsen at gmail.com | +----------------------+------------------------------+ -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Tue Jan 15 09:44:19 2013 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 15 Jan 2013 07:44:19 -0700 Subject: [Numpy-discussion] argsort In-Reply-To: <50F5427C.8060006@gmail.com> References: <50F5427C.8060006@gmail.com> Message-ID: On Tue, Jan 15, 2013 at 4:50 AM, Mads Ipsen wrote: > Hi, > > I simply can't understand this. I'm trying to use argsort to produce > indices that can be used to sort an array: > > from numpy import * > > indices = array([[4,3],[1,12],[23,7],[11,6],[8,9]]) > args = argsort(indices, axis=0) > print indices[args] > > gives: > > [[[ 1 12] > [ 4 3]] > > [[ 4 3] > [11 6]] > > [[ 8 9] > [23 7]] > > [[11 6] > [ 8 9]] > > [[23 7] > [ 1 12]]] > > I thought this should produce a sorted version of the indices array. > > Any help is appreciated. > > Fancy indexing is a funny creature and not easy to understand in more than one dimension. 
What is happening is that each index is replaced by the corresponding row of a and the result is of shape (5,2,2). To do what you want to do: In [20]: a[i, [[0,1]]*5] Out[20]: array([[ 1, 3], [ 4, 6], [ 8, 7], [11, 9], [23, 12]]) I agree that there should be an easier way to do this. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Tue Jan 15 09:56:10 2013 From: robert.kern at gmail.com (Robert Kern) Date: Tue, 15 Jan 2013 15:56:10 +0100 Subject: [Numpy-discussion] argsort In-Reply-To: References: <50F5427C.8060006@gmail.com> Message-ID: On Tue, Jan 15, 2013 at 3:44 PM, Charles R Harris wrote: > Fancy indexing is a funny creature and not easy to understand in more than > one dimension. What is happening is that each index is replaced by the > corresponding row of a and the result is of shape (5,2,2). To do what you > want to do: > > In [20]: a[i, [[0,1]]*5] > Out[20]: > array([[ 1, 3], > [ 4, 6], > [ 8, 7], > [11, 9], > [23, 12]]) > > I agree that there should be an easier way to do this. Slightly easier, though no more transparent: a[i, [0,1]] http://docs.scipy.org/doc/numpy/user/basics.indexing.html#indexing-multi-dimensional-arrays -- Robert Kern From Nicolas.Rougier at inria.fr Tue Jan 15 12:37:52 2013 From: Nicolas.Rougier at inria.fr (Nicolas Rougier) Date: Tue, 15 Jan 2013 18:37:52 +0100 Subject: [Numpy-discussion] dtype "reduction" [SOLVED] In-Reply-To: References: <88940E94-C5A2-4CBB-8D44-554193B9CF05@inria.fr> Message-ID: <23DC4442-3DCC-411A-AB9D-69C3FDD5CCD5@inria.fr> I ended coding the dtype reduction, it's not foolproof but it might be useful for others as well. Nicolas import numpy as np def dtype_reduce(dtype, level=0, depth=0): """ Try to reduce dtype up to a given level when it is possible dtype = [ ('vertex', [('x', 'f4'), ('y', 'f4'), ('z', 'f4')]), ('normal', [('x', 'f4'), ('y', 'f4'), ('z', 'f4')]), ('color', [('r', 'f4'), ('g', 'f4'), ('b', 'f4'), ('a', 'f4')])] level 0: ['color,vertex,normal,', 10, 'float32'] level 1: [['color', 4, 'float32'] ['normal', 3, 'float32'] ['vertex', 3, 'float32']] """ dtype = np.dtype(dtype) fields = dtype.fields # No fields if fields is None: if dtype.shape: count = reduce(mul, dtype.shape) else: count = 1 size = dtype.itemsize/count if dtype.subdtype: name = str( dtype.subdtype[0] ) else: name = str( dtype ) return ['', count, name] else: items = [] name = '' # Get reduced fields for key,value in fields.items(): l = dtype_reduce(value[0], level, depth+1) if type(l[0]) is str: items.append( [key, l[1], l[2]] ) else: items.append( l ) name += key+',' # Check if we can reduce item list ctype = None count = 0 for i,item in enumerate(items): # One item is a list, we cannot reduce if type(item[0]) is not str: return items else: if i==0: ctype = item[2] count += item[1] else: if item[2] != ctype: return items count += item[1] if depth >= level: return [name, count, ctype] else: return items if __name__ == '__main__': # Fully reductible dtype = [ ('vertex', [('x', 'f4'), ('y', 'f4'), ('z', 'f4')]), ('normal', [('x', 'f4'), ('y', 'f4'), ('z', 'f4')]), ('color', [('r', 'f4'), ('g', 'f4'), ('b', 'f4'), ('a', 'f4')])] print 'level 0:' print dtype_reduce(dtype,level=0) print 'level 1:' print dtype_reduce(dtype,level=1) print # Not fully reductible dtype = [ ('vertex', [('x', 'i4'), ('y', 'i4'), ('z', 'i4')]), ('normal', [('x', 'f4'), ('y', 'f4'), ('z', 'f4')]), ('color', [('r', 'f4'), ('g', 'f4'), ('b', 'f4'), ('a', 'f4')])] print 'level 0:' print dtype_reduce(dtype,level=0) print # 
Not reductible at all dtype = [ ('vertex', [('x', 'f4'), ('y', 'f4'), ('z', 'i4')]), ('normal', [('x', 'f4'), ('y', 'f4'), ('z', 'i4')]), ('color', [('r', 'f4'), ('g', 'f4'), ('b', 'i4'), ('a', 'f4')])] print 'level 0:' print dtype_reduce(dtype,level=0) On Dec 27, 2012, at 9:11 , Nicolas Rougier wrote: > > Yep, I'm trying to construct dtype2 programmaticaly and was hoping for some function giving me a "canonical" expression of the dtype. I've started playing with fields but it's just a bit harder than I though (lot of different cases and recursion). > > Thanks for the answer. > > > Nicolas > > On Dec 27, 2012, at 1:32 , Nathaniel Smith wrote: > >> On Wed, Dec 26, 2012 at 8:09 PM, Nicolas Rougier >> wrote: >>> >>> >>> Hi all, >>> >>> >>> I'm looking for a way to "reduce" dtype1 into dtype2 (when it is possible of course). >>> Is there some easy way to do that by any chance ? >>> >>> >>> dtype1 = np.dtype( [ ('vertex', [('x', 'f4'), >>> ('y', 'f4'), >>> ('z', 'f4')]), >>> ('normal', [('x', 'f4'), >>> ('y', 'f4'), >>> ('z', 'f4')]), >>> ('color', [('r', 'f4'), >>> ('g', 'f4'), >>> ('b', 'f4'), >>> ('a', 'f4')]) ] ) >>> >>> dtype2 = np.dtype( [ ('vertex', 'f4', 3), >>> ('normal', 'f4', 3), >>> ('color', 'f4', 4)] ) >>> >> >> If you have an array whose dtype is dtype1, and you want to convert it >> into an array with dtype2, then you just do >> my_dtype2_array = my_dtype1_array.view(dtype2) >> >> If you have dtype1 and you want to programmaticaly construct dtype2, >> then that's a little more fiddly and depends on what exactly you're >> trying to do, but start by poking around with dtype1.names and >> dtype1.fields, which contain information on how dtype1 is put together >> in the form of regular python structures. >> >> -n >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From jerome_caron_astro at ymail.com Tue Jan 15 14:31:25 2013 From: jerome_caron_astro at ymail.com (Jerome Caron) Date: Tue, 15 Jan 2013 19:31:25 +0000 (GMT) Subject: [Numpy-discussion] algorithm for faster median calculation ? Message-ID: <1358278285.14438.YahooMailNeo@web171404.mail.ir2.yahoo.com> Dear all, I am new to the Numpy-discussion list. I would like to follow up some possibly useful information about calculating median. The message below was posted today on the AstroPy mailing list. Kind regards Jerome Caron #---------------------------------------- I think the calculation of median values in Numpy is not optimal. I don't know if there are other libraries that do better? On my machine I get these results: >>> data = numpy.random.rand(5000,5000) >>> t0=time.time();print numpy.ma.median(data);print time.time()-t0 0.499845739822 15.1949999332 >>> t0=time.time();print numpy.median(data);print time.time()-t0 0.499845739822 4.32100009918 >>> t0=time.time();print aspylib.astro.get_median(data);print time.time()-t0 [ 0.49984574] 0.90499997139 >>> The median calculation in Aspylib is using C code from Nicolas Devillard (can be found here: http://ndevilla.free.fr/median/index.html) interfaced with ctypes. It could be easily re-used for other, more official packages. I think the code also finds quantiles efficiently. See: http://www.aspylib.com/ -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From deil.christoph at googlemail.com Tue Jan 15 14:49:47 2013 From: deil.christoph at googlemail.com (Christoph Deil) Date: Tue, 15 Jan 2013 20:49:47 +0100 Subject: [Numpy-discussion] algorithm for faster median calculation ? In-Reply-To: <1358278285.14438.YahooMailNeo@web171404.mail.ir2.yahoo.com> References: <1358278285.14438.YahooMailNeo@web171404.mail.ir2.yahoo.com> Message-ID: On Jan 15, 2013, at 8:31 PM, Jerome Caron wrote: > Dear all, > I am new to the Numpy-discussion list. > I would like to follow up some possibly useful information about calculating median. > The message below was posted today on the AstroPy mailing list. > Kind regards > Jerome Caron > > #---------------------------------------- > I think the calculation of median values in Numpy is not optimal. I don't know if there are other libraries that do better? > On my machine I get these results: > >>> data = numpy.random.rand(5000,5000) > >>> t0=time.time();print numpy.ma.median(data);print time.time()-t0 > 0.499845739822 > 15.1949999332 > >>> t0=time.time();print numpy.median(data);print time.time()-t0 > 0.499845739822 > 4.32100009918 > >>> t0=time.time();print aspylib.astro.get_median(data);print time.time()-t0 > [ 0.49984574] > 0.90499997139 > >>> > The median calculation in Aspylib is using C code from Nicolas Devillard (can be found here: http://ndevilla.free.fr/median/index.html) interfaced with ctypes. > It could be easily re-used for other, more official packages. I think the code also finds quantiles efficiently. > See: http://www.aspylib.com/ > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion Hi Jerome, some of the numpy devs are already discussing how to best implement the fast median for numpy here: https://github.com/numpy/numpy/issues/1811 "median in average O(n) time" If you want to get an email when someone posts a comment on that github ticket, sign up for a free github account, then click on "watch tread" at the bottom of that issue. Note that numpy is BSD-licensed, so they can't take GPL-licensed code. But I think looking at the method you have in aspylib is OK, so thanks for sharing! Christoph -------------- next part -------------- An HTML attachment was scrubbed... URL: From sturla at molden.no Tue Jan 15 14:50:21 2013 From: sturla at molden.no (Sturla Molden) Date: Tue, 15 Jan 2013 20:50:21 +0100 Subject: [Numpy-discussion] algorithm for faster median calculation ? In-Reply-To: <1358278285.14438.YahooMailNeo@web171404.mail.ir2.yahoo.com> References: <1358278285.14438.YahooMailNeo@web171404.mail.ir2.yahoo.com> Message-ID: <50F5B2FD.6070504@molden.no> You might want to look at this first: https://github.com/numpy/numpy/issues/1811 Yes it is possible to compute the median faster by doing quickselect instead of quicksort. Best case O(n) for quickselect, O(n log n) for quicksort. But adding selection and partial sorting to NumPy is a bigger issue than just computing medians and percentiles faster. If we are to do this I think we should add partial sorting and selection to npysort, not patch in some C or Cython quickselect just for the median. When npysort has quickselect, changing the Python code to use it for medians and percentiles is a nobrainer. https://github.com/numpy/numpy/tree/master/numpy/core/src/npysort Sturla On 15.01.2013 20:31, Jerome Caron wrote: > Dear all, > I am new to the Numpy-discussion list. 
> I would like to follow up some possibly useful information about > calculating median. > The message below was posted today on the AstroPy mailing list. > Kind regards > Jerome Caron > #---------------------------------------- > I think the calculation of median values in Numpy is not optimal. I > don't know if there are other libraries that do better? > On my machine I get these results: > >>> data = numpy.random.rand(5000,5000) > >>> t0=time.time();print numpy.ma.median(data);print time.time()-t0 > 0.499845739822 > 15.1949999332 > >>> t0=time.time();print numpy.median(data);print time.time()-t0 > 0.499845739822 > 4.32100009918 > >>> t0=time.time();print aspylib.astro.get_median(data);print > time.time()-t0 > [ 0.49984574] > 0.90499997139 > >>> > The median calculation in Aspylib is using C code from Nicolas Devillard > (can be found here: http://ndevilla.free.fr/median/index.html > ) interfaced with ctypes. > It could be easily re-used for other, more official packages. I think > the code also finds quantiles efficiently. > See: http://www.aspylib.com/ > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From e.antero.tammi at gmail.com Tue Jan 15 15:53:39 2013 From: e.antero.tammi at gmail.com (eat) Date: Tue, 15 Jan 2013 22:53:39 +0200 Subject: [Numpy-discussion] argsort In-Reply-To: <50F5427C.8060006@gmail.com> References: <50F5427C.8060006@gmail.com> Message-ID: Hi, On Tue, Jan 15, 2013 at 1:50 PM, Mads Ipsen wrote: > Hi, > > I simply can't understand this. I'm trying to use argsort to produce > indices that can be used to sort an array: > > from numpy import * > > indices = array([[4,3],[1,12],[23,7],[11,6],[8,9]]) > args = argsort(indices, axis=0) > print indices[args] > > gives: > > [[[ 1 12] > [ 4 3]] > > [[ 4 3] > [11 6]] > > [[ 8 9] > [23 7]] > > [[11 6] > [ 8 9]] > > [[23 7] > [ 1 12]]] > > I thought this should produce a sorted version of the indices array. > > Any help is appreciated. > Perhaps these three different point of views will help you a little bit more to move on: In []: x Out[]: array([[ 4, 3], [ 1, 12], [23, 7], [11, 6], [ 8, 9]]) In []: ind= x.argsort(axis= 0) In []: ind Out[]: array([[1, 0], [0, 3], [4, 2], [3, 4], [2, 1]]) In []: x[ind[:, 0]] Out[]: array([[ 1, 12], [ 4, 3], [ 8, 9], [11, 6], [23, 7]]) In []: x[ind[:, 1]] Out[]: array([[ 4, 3], [11, 6], [23, 7], [ 8, 9], [ 1, 12]]) In []: x[ind, [0, 1]] Out[]: array([[ 1, 3], [ 4, 6], [ 8, 7], [11, 9], [23, 12]]) -eat > > Best regards, > > Mads > > -- > +-----------------------------------------------------+ > | Mads Ipsen | > +----------------------+------------------------------+ > | G?seb?ksvej 7, 4. tv | | > | DK-2500 Valby | phone: +45-29716388 | > | Denmark | email: mads.ipsen at gmail.com | > +----------------------+------------------------------+ > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From sturla at molden.no Tue Jan 15 16:18:19 2013 From: sturla at molden.no (Sturla Molden) Date: Tue, 15 Jan 2013 22:18:19 +0100 Subject: [Numpy-discussion] algorithm for faster median calculation ? 
In-Reply-To: <50F5B2FD.6070504@molden.no> References: <1358278285.14438.YahooMailNeo@web171404.mail.ir2.yahoo.com> <50F5B2FD.6070504@molden.no> Message-ID: <50F5C79B.80309@molden.no> On 15.01.2013 20:50, Sturla Molden wrote: > You might want to look at this first: > > https://github.com/numpy/numpy/issues/1811 > > Yes it is possible to compute the median faster by doing quickselect > instead of quicksort. Best case O(n) for quickselect, O(n log n) for > quicksort. But adding selection and partial sorting to NumPy is a bigger > issue than just computing medians and percentiles faster. Anyway, here is the code, a bit updated. I prefer quickselect with a better pivot though. Sturla -------------- next part -------------- A non-text attachment was scrubbed... Name: median.py Type: text/x-python Size: 5604 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: quickselect.pyx Type: / Size: 3346 bytes Desc: not available URL: From madsipsen at gmail.com Wed Jan 16 03:30:55 2013 From: madsipsen at gmail.com (Mads Ipsen) Date: Wed, 16 Jan 2013 09:30:55 +0100 Subject: [Numpy-discussion] argsort In-Reply-To: References: <50F5427C.8060006@gmail.com> Message-ID: <50F6653F.8060409@gmail.com> Hi, Thanks everybody for all the answers that make perfect sense when axis=0. Now suppose I want to sort the array in such a way that each row is sorted individually. Then I suppose I should do this: from numpy import * v = array([[4,3], [1,12], [23,7], [11,6], [8,9]]) idx = argsort(v, axis=1) idx is then [[1 0] [0 1] [1 0] [1 0] [0 1]] which makes sense, since these are the indices in an order that would sort each row. But when I try a[idx, variuos_additional_arguments] I just get strange results. Anybody that can point me towards the correct solution. Best regards, Mads On 01/15/2013 09:53 PM, eat wrote: > Hi, > > On Tue, Jan 15, 2013 at 1:50 PM, Mads Ipsen > wrote: > > Hi, > > I simply can't understand this. I'm trying to use argsort to > produce indices that can be used to sort an array: > > from numpy import * > > indices = array([[4,3],[1,12],[23,7],[11,6],[8,9]]) > args = argsort(indices, axis=0) > print indices[args] > > gives: > > [[[ 1 12] > [ 4 3]] > > [[ 4 3] > [11 6]] > > [[ 8 9] > [23 7]] > > [[11 6] > [ 8 9]] > > [[23 7] > [ 1 12]]] > > I thought this should produce a sorted version of the indices array. > > Any help is appreciated. > > Perhaps these three different point of views will help you a little > bit more to move on: > In []: x > Out[]: > array([[ 4, 3], > [ 1, 12], > [23, 7], > [11, 6], > [ 8, 9]]) > In []: ind= x.argsort(axis= 0) > In []: ind > Out[]: > array([[1, 0], > [0, 3], > [4, 2], > [3, 4], > [2, 1]]) > > In []: x[ind[:, 0]] > Out[]: > array([[ 1, 12], > [ 4, 3], > [ 8, 9], > [11, 6], > [23, 7]]) > > In []: x[ind[:, 1]] > Out[]: > array([[ 4, 3], > [11, 6], > [23, 7], > [ 8, 9], > [ 1, 12]]) > > In []: x[ind, [0, 1]] > Out[]: > array([[ 1, 3], > [ 4, 6], > [ 8, 7], > [11, 9], > [23, 12]]) > -eat > > > Best regards, > > Mads > > -- > +-----------------------------------------------------+ > | Mads Ipsen | > +----------------------+------------------------------+ > | G?seb?ksvej 7, 4. 
tv | | > | DK-2500 Valby | phone:+45-29716388 | > | Denmark | email:mads.ipsen at gmail.com | > +----------------------+------------------------------+ > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion -- +-----------------------------------------------------+ | Mads Ipsen | +----------------------+------------------------------+ | G?seb?ksvej 7, 4. tv | | | DK-2500 Valby | phone: +45-29716388 | | Denmark | email: mads.ipsen at gmail.com | +----------------------+------------------------------+ -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Wed Jan 16 03:39:10 2013 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 16 Jan 2013 09:39:10 +0100 Subject: [Numpy-discussion] argsort In-Reply-To: <50F6653F.8060409@gmail.com> References: <50F5427C.8060006@gmail.com> <50F6653F.8060409@gmail.com> Message-ID: On Wed, Jan 16, 2013 at 9:30 AM, Mads Ipsen wrote: > Hi, > > Thanks everybody for all the answers that make perfect sense when axis=0. > > Now suppose I want to sort the array in such a way that each row is sorted > individually. Then I suppose I should do this: > > from numpy import * > > > v = array([[4,3], > [1,12], > [23,7], > [11,6], > [8,9]]) > idx = argsort(v, axis=1) > > idx is then > > [[1 0] > [0 1] > [1 0] > [1 0] > [0 1]] > > which makes sense, since these are the indices in an order that would sort > each row. But when I try > > a[idx, variuos_additional_arguments] > > I just get strange results. Anybody that can point me towards the correct > solution. Please have a look at the documentation again. If idx has indices for the second axis, you need to put it into the second place. http://docs.scipy.org/doc/numpy/user/basics.indexing.html#indexing-multi-dimensional-arrays http://docs.scipy.org/doc/numpy/reference/arrays.indexing.html#advanced-indexing [~] |4> idx0 = np.arange(v.shape[0])[:,np.newaxis] [~] |5> idx0 array([[0], [1], [2], [3], [4]]) [~] |7> v[idx0, idx] array([[ 3, 4], [ 1, 12], [ 7, 23], [ 6, 11], [ 8, 9]]) -- Robert Kern From jaakko.luttinen at aalto.fi Wed Jan 16 06:32:33 2013 From: jaakko.luttinen at aalto.fi (Jaakko Luttinen) Date: Wed, 16 Jan 2013 13:32:33 +0200 Subject: [Numpy-discussion] numpydoc for python 3? In-Reply-To: References: <50EDA996.6090806@aalto.fi> <50EED62B.9010105@aalto.fi> <50EEDB50.5000902@aalto.fi> <50F33959.7030203@aalto.fi> <50F3DF7F.3060600@aalto.fi> Message-ID: <50F68FD1.70302@aalto.fi> On 01/14/2013 02:44 PM, Matthew Brett wrote: > On Mon, Jan 14, 2013 at 10:35 AM, Jaakko Luttinen > wrote: >> On 01/14/2013 12:53 AM, Matthew Brett wrote: >>> You might be able to get away without 2to3, using the kind of stuff >>> that Pauli has used for scipy recently: >>> >>> https://github.com/scipy/scipy/pull/397 >> >> Ok, thanks, maybe I'll try to make the tests valid in all Python >> versions. It seems there's only one line which I'm not able to transform. >> >> In doc/sphinxext/tests/test_docscrape.py, on line 559: >> assert doc['Summary'][0] == u'?????????????'.encode('utf-8') >> >> This is invalid in Python 3.0-3.2. How could I write this in such a way >> that it is valid in all Python versions? I'm a bit lost with these >> unicode encodings in Python (and in general).. 
And I didn't want to add >> dependency on 'six' package. > > Pierre's suggestion is good; you can also do something like this: > > # -*- coding: utf8 -*- > import sys > > if sys.version_info[0] >= 3: > a = '?????????????' > else: > a = unicode('?????????????', 'utf8') > > The 'coding' line has to be the first or second line in the file. Thanks for all the comments! I reported an issue and made a pull request: https://github.com/numpy/numpy/pull/2919 However, I haven't been able to make nosetests work. I get error: "ValueError: Attempted relative import in non-package" Don't know how to fix it properly.. -Jaakko From ondrej.certik at gmail.com Wed Jan 16 12:51:42 2013 From: ondrej.certik at gmail.com (=?UTF-8?B?T25kxZllaiDEjGVydMOtaw==?=) Date: Wed, 16 Jan 2013 09:51:42 -0800 Subject: [Numpy-discussion] Travis failures with no errors In-Reply-To: References: Message-ID: On Thu, Dec 20, 2012 at 6:32 PM, Charles R Harris wrote: > > > On Thu, Dec 20, 2012 at 6:25 PM, Ond?ej ?ert?k > wrote: >> >> On Thu, Dec 13, 2012 at 4:39 PM, Ond?ej ?ert?k >> wrote: >> > Hi, >> > >> > I found these recent weird "failures" in Travis, but I can't find any >> > problem with the log and all tests pass. Any ideas what is going on? >> > >> > https://travis-ci.org/numpy/numpy/jobs/3570123 >> > https://travis-ci.org/numpy/numpy/jobs/3539549 >> > https://travis-ci.org/numpy/numpy/jobs/3369629 >> >> And here is another one: >> >> https://travis-ci.org/numpy/numpy/jobs/3768782 > > > Hmm, that is strange indeed. The first three are old, >= 12 days, but the > last is new, although the run time was getting up there. Might try running > the last one again. I don't know if the is an easy way to do that. And another one from 3 days ago: https://travis-ci.org/numpy/numpy/jobs/4118113 Ondrej From nouiz at nouiz.org Wed Jan 16 12:55:08 2013 From: nouiz at nouiz.org (=?ISO-8859-1?Q?Fr=E9d=E9ric_Bastien?=) Date: Wed, 16 Jan 2013 12:55:08 -0500 Subject: [Numpy-discussion] Travis failures with no errors In-Reply-To: References: Message-ID: Hi, go to the site tracis-ci(the the next.travis-ci.org part): https://next.travis-ci.org/numpy/numpy/jobs/4118113 When you go that way, in a drop-down menu in the screen, when you are autorized, you can ask travis-ci to rerun the tests. You can do it in the particular test or in the commit page too to rerun all test for that commit. I find this usefull to rerun failed tests caused by VM errors... HTH Fred On Wed, Jan 16, 2013 at 12:51 PM, Ond?ej ?ert?k wrote: > On Thu, Dec 20, 2012 at 6:32 PM, Charles R Harris > wrote: >> >> >> On Thu, Dec 20, 2012 at 6:25 PM, Ond?ej ?ert?k >> wrote: >>> >>> On Thu, Dec 13, 2012 at 4:39 PM, Ond?ej ?ert?k >>> wrote: >>> > Hi, >>> > >>> > I found these recent weird "failures" in Travis, but I can't find any >>> > problem with the log and all tests pass. Any ideas what is going on? >>> > >>> > https://travis-ci.org/numpy/numpy/jobs/3570123 >>> > https://travis-ci.org/numpy/numpy/jobs/3539549 >>> > https://travis-ci.org/numpy/numpy/jobs/3369629 >>> >>> And here is another one: >>> >>> https://travis-ci.org/numpy/numpy/jobs/3768782 >> >> >> Hmm, that is strange indeed. The first three are old, >= 12 days, but the >> last is new, although the run time was getting up there. Might try running >> the last one again. I don't know if the is an easy way to do that. 
> > And another one from 3 days ago: > > https://travis-ci.org/numpy/numpy/jobs/4118113 > > Ondrej > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From ndbecker2 at gmail.com Wed Jan 16 14:04:41 2013 From: ndbecker2 at gmail.com (Neal Becker) Date: Wed, 16 Jan 2013 14:04:41 -0500 Subject: [Numpy-discussion] find points unique within some epsilon Message-ID: Any suggestion how to take a 2d complex array and find the set of points that are unique within some tolerance? (My preferred metric here would be Euclidean distance) From pav at iki.fi Wed Jan 16 15:36:13 2013 From: pav at iki.fi (Pauli Virtanen) Date: Wed, 16 Jan 2013 22:36:13 +0200 Subject: [Numpy-discussion] numpydoc for python 3? In-Reply-To: References: <50EDA996.6090806@aalto.fi> <50EED62B.9010105@aalto.fi> <50EEDB50.5000902@aalto.fi> <50F33959.7030203@aalto.fi> <50F3DF7F.3060600@aalto.fi> Message-ID: 14.01.2013 14:44, Matthew Brett kirjoitti: [clip] > Pierre's suggestion is good; you can also do something like this: > > # -*- coding: utf8 -*- > import sys > > if sys.version_info[0] >= 3: > a = '?????????????' > else: > a = unicode('?????????????', 'utf8') > > The 'coding' line has to be the first or second line in the file. Another useful option would be from __future__ import unicode_literals This makes the literal 'spam' be unicode also on Python 2, so that b'spam' is bytes. This might make unicode unification easier. OTOH, it might open some cans of worms. -- Pauli Virtanen From e.antero.tammi at gmail.com Wed Jan 16 19:11:33 2013 From: e.antero.tammi at gmail.com (eat) Date: Thu, 17 Jan 2013 02:11:33 +0200 Subject: [Numpy-discussion] Shouldn't all in-place operations simply return self? Message-ID: Hi, In a recent thread http://article.gmane.org/gmane.comp.python.numeric.general/52772 it was proposed that .fill(.) should return self as an alternative for a trivial two-liner. I'm raising now the question: what if all in-place operations indeed could return self? How bad this would be? A 'strong' counter argument may be found at http://mail.python.org/pipermail/python-dev/2003-October/038855.html. But anyway, at least for me. it would be much more straightforward to implement simple mini dsl's ( http://en.wikipedia.org/wiki/Domain-specific_language) a much more straightforward manner. What do you think? -eat P.S. FWIW, if this idea really gains momentum obviously I'm volunteering to create a PR of it. -------------- next part -------------- An HTML attachment was scrubbed... URL: From patrickmarshwx at gmail.com Wed Jan 16 19:16:44 2013 From: patrickmarshwx at gmail.com (Patrick Marsh) Date: Wed, 16 Jan 2013 18:16:44 -0600 Subject: [Numpy-discussion] Casting Bug or a "Feature"? Message-ID: Greetings, I spent a couple hours today tracking down a bug in one of my programs. I was getting different answers depending on whether I passed in a numpy array or a single number. Ultimately, I tracked it down to something I would consider a bug, but I'm not sure if others do. The case comes from taking a numpy integer array and adding a float to it. When doing var = np.array(ints) + float, var is cast to an array of floats, which is what I would expect. However, if I do np.array(ints) += float, the result is an array of integers. 
I can understand why this happens -- you are shoving the sum back into an integer array -- but without thinking through that I would expect the behavior of the two additions to be equal...or at least be consistent with what occurs with numbers, instead of arrays. Here's a trivial example demonstrating this import numpy as np a = np.arange(10) print a.dtype b = a + 0.5 print b.dtype a += 0.5 print a.dtype >> int64 >> float64 >> int64 >> >> >> An implication of this arrises from a simple function that "does math". The function returns different values depending on whether a number or array was passed in. def add_n_multiply(var): var += 0.5 var *= 10 return var aaa = np.arange(5) print aaa print add_n_multiply(aaa.copy()) print [add_n_multiply(x) for x in aaa.copy()] >> [0 1 2 3 4] >> [ 0 10 20 30 40] >> [5.0, 15.0, 25.0, 35.0, 45.0] Am I alone in thinking this is a bug? Or is this the behavior that others would have expected? Cheers, Patrick --- Patrick Marsh Ph.D. Candidate / Liaison to the HWT School of Meteorology / University of Oklahoma Cooperative Institute for Mesoscale Meteorological Studies National Severe Storms Laboratory http://www.patricktmarsh.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From brad.froehle at gmail.com Wed Jan 16 19:39:56 2013 From: brad.froehle at gmail.com (Bradley M. Froehle) Date: Wed, 16 Jan 2013 16:39:56 -0800 Subject: [Numpy-discussion] Casting Bug or a "Feature"? In-Reply-To: References: Message-ID: Hi Patrick: I think it is the behavior I have come to expect. The only "gotcha" here might be the difference between "var = var + 0.5" and "var += 0.5" For example: >>> import numpy as np >>> x = np.arange(5); x += 0.5; x array([0, 1, 2, 3, 4]) >>> x = np.arange(5); x = x + 0.5; x array([ 0.5, 1.5, 2.5, 3.5, 4.5]) The first line is definitely what I expect. The second, the automatic casting from int64 -> double, is documented and generally desirable. It's hard to avoid these casting issues without making code unnecessarily complex or allowing only one data type (e.g., as MATLAB does). If you worry about standardizing behavior you can always use `var = np.array(var, dtype=np.double, copy=True)` or similar at the start of your function. -Brad On Wed, Jan 16, 2013 at 4:16 PM, Patrick Marsh wrote: > Greetings, > > I spent a couple hours today tracking down a bug in one of my programs. I > was getting different answers depending on whether I passed in a numpy > array or a single number. Ultimately, I tracked it down to something I > would consider a bug, but I'm not sure if others do. The case comes from > taking a numpy integer array and adding a float to it. When doing var = > np.array(ints) + float, var is cast to an array of floats, which is what I > would expect. However, if I do np.array(ints) += float, the result is an > array of integers. I can understand why this happens -- you are shoving the > sum back into an integer array -- but without thinking through that I would > expect the behavior of the two additions to be equal...or at least be > consistent with what occurs with numbers, instead of arrays. Here's a > trivial example demonstrating this > > > import numpy as np > a = np.arange(10) > print a.dtype > b = a + 0.5 > print b.dtype > a += 0.5 > print a.dtype > > >> int64 > >> float64 > >> int64 > >> > >> > >> > > > An implication of this arrises from a simple function that "does math". > The function returns different values depending on whether a number or > array was passed in. 
> > > def add_n_multiply(var): > var += 0.5 > var *= 10 > return var > > aaa = np.arange(5) > print aaa > print add_n_multiply(aaa.copy()) > print [add_n_multiply(x) for x in aaa.copy()] > > > >> [0 1 2 3 4] > >> [ 0 10 20 30 40] > >> [5.0, 15.0, 25.0, 35.0, 45.0] > > > > > Am I alone in thinking this is a bug? Or is this the behavior that others > would have expected? > > > > Cheers, > Patrick > --- > Patrick Marsh > Ph.D. Candidate / Liaison to the HWT > School of Meteorology / University of Oklahoma > Cooperative Institute for Mesoscale Meteorological Studies > National Severe Storms Laboratory > http://www.patricktmarsh.com > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris.barker at noaa.gov Wed Jan 16 19:42:09 2013 From: chris.barker at noaa.gov (Chris Barker - NOAA Federal) Date: Wed, 16 Jan 2013 16:42:09 -0800 Subject: [Numpy-discussion] Casting Bug or a "Feature"? In-Reply-To: References: Message-ID: Patrick, Not a bug but is it a mis-feature? See the recent thread: "Do we want scalar casting to behave as it does at the moment" In short, this is an complex issue with no easy answer... -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From njs at pobox.com Wed Jan 16 20:24:19 2013 From: njs at pobox.com (Nathaniel Smith) Date: Thu, 17 Jan 2013 01:24:19 +0000 Subject: [Numpy-discussion] Casting Bug or a "Feature"? In-Reply-To: References: Message-ID: This is separate from the scalar casting thing. This is a disguised version of the discussion about what we should do with implicit casts caused by assignment: into_array[i] = 0.5 Traditionally numpy just happily casts this stuff, possibly mangling data in the process, and this has caused many actual bugs in user code. In 1.6 some of these assignments cause errors, but we reverted this in 1.7 because this was also breaking things. Supposedly we also deprecated these at the same time, with an eye towards making them errors eventually, but I'm not sure we did this properly, and our carrying rules need revisiting in any case. (Sorry for lack of links to earlier discussion; traveling and on my phone.) -n On 16 Jan 2013 16:42, "Chris Barker - NOAA Federal" wrote: > Patrick, > > Not a bug but is it a mis-feature? > > See the recent thread: "Do we want scalar casting to behave as it does > at the moment" > > In short, this is an complex issue with no easy answer... > > -Chris > > > -- > > Christopher Barker, Ph.D. > Oceanographer > > Emergency Response Division > NOAA/NOS/OR&R (206) 526-6959 voice > 7600 Sand Point Way NE (206) 526-6329 fax > Seattle, WA 98115 (206) 526-6317 main reception > > Chris.Barker at noaa.gov > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Wed Jan 16 20:53:58 2013 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 16 Jan 2013 20:53:58 -0500 Subject: [Numpy-discussion] Shouldn't all in-place operations simply return self? 
In-Reply-To: 
References: 
Message-ID: 

On Wed, Jan 16, 2013 at 7:11 PM, eat wrote:
> Hi,
>
> In a recent thread
> http://article.gmane.org/gmane.comp.python.numeric.general/52772 it was
> proposed that .fill(.) should return self as an alternative for a trivial
> two-liner.
>
> I'm raising now the question: what if all in-place operations indeed could
> return self? How bad this would be? A 'strong' counter argument may be found
> at http://mail.python.org/pipermail/python-dev/2003-October/038855.html.
>
> But anyway, at least for me. it would be much more straightforward to
> implement simple mini dsl's
> (http://en.wikipedia.org/wiki/Domain-specific_language) a much more
> straightforward manner.
>
> What do you think?

I'm against it. I think it requires too much thinking by users and developers.
The functions in numpy are conceptually much closer to basic python, not some
heavy object-oriented framework where we need lots of chaining.
(I thought I remembered some discussion and justification for returning self
in sqlalchemy for this, but couldn't find it.)

I'm chasing quite a few bugs with inplace operations:

>>> a = np.arange(10)
>>> a *= np.pi
>>> a
???

>>> a = np.random.random_integers(0, 5, size=5)
>>> b = a.sort()
>>> b
>>> a
array([0, 1, 2, 5, 5])

>>> b = np.random.shuffle(a)
>>> b
>>> b = np.random.permutation(a)
>>> b
array([0, 5, 5, 2, 1])

How do I remember if shuffle shuffles or permutes?

Do we have a list of functions that are inplace?

Josef

>
>
> -eat
>
> P.S. FWIW, if this idea really gains momentum obviously I'm volunteering to
> create a PR of it.
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>

From patrickmarshwx at gmail.com Wed Jan 16 22:43:22 2013
From: patrickmarshwx at gmail.com (Patrick Marsh)
Date: Wed, 16 Jan 2013 21:43:22 -0600
Subject: [Numpy-discussion] Casting Bug or a "Feature"?
In-Reply-To: 
References: 
Message-ID: 

Thanks, everyone, for chiming in. Now that I know this behavior exists, I can
explicitly prevent it in my code. However, it would be nice if a warning or
something was generated to alert users about the inconsistency between
var += ... and var = var + ...

Patrick

---
Patrick Marsh
Ph.D. Candidate / Liaison to the HWT
School of Meteorology / University of Oklahoma
Cooperative Institute for Mesoscale Meteorological Studies
National Severe Storms Laboratory
http://www.patricktmarsh.com

On Wed, Jan 16, 2013 at 7:24 PM, Nathaniel Smith wrote:
> This is separate from the scalar casting thing. This is a disguised
> version of the discussion about what we should do with implicit casts
> caused by assignment:
> into_array[i] = 0.5
>
> Traditionally numpy just happily casts this stuff, possibly mangling data
> in the process, and this has caused many actual bugs in user code. In 1.6
> some of these assignments cause errors, but we reverted this in 1.7 because
> this was also breaking things. Supposedly we also deprecated these at the
> same time, with an eye towards making them errors eventually, but I'm not
> sure we did this properly, and our carrying rules need revisiting in any
> case.
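A minimal way to reproduce the assignment cast described above (a sketch; the
output assumes a default 64-bit integer array and a numpy of the 1.6/1.7 era
discussed here -- newer releases may warn about or refuse the in-place cast):

>>> import numpy as np
>>> into_array = np.arange(3)   # integer dtype
>>> into_array[1] = 0.5         # the float is silently truncated to 0
>>> into_array
array([0, 0, 2])
>>> into_array += 0.5           # the in-place add is silently downcast the same way
>>> into_array
array([0, 0, 2])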
>> >> See the recent thread: "Do we want scalar casting to behave as it does >> at the moment" >> >> In short, this is an complex issue with no easy answer... >> >> -Chris >> >> >> -- >> >> Christopher Barker, Ph.D. >> Oceanographer >> >> Emergency Response Division >> NOAA/NOS/OR&R (206) 526-6959 voice >> 7600 Sand Point Way NE (206) 526-6329 fax >> Seattle, WA 98115 (206) 526-6317 main reception >> >> Chris.Barker at noaa.gov >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Wed Jan 16 22:54:40 2013 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 16 Jan 2013 22:54:40 -0500 Subject: [Numpy-discussion] Casting Bug or a "Feature"? In-Reply-To: References: Message-ID: On Wed, Jan 16, 2013 at 10:43 PM, Patrick Marsh wrote: > Thanks, everyone for chiming in. Now that I know this behavior exists, I > can explicitly prevent it in my code. However, it would be nice if a warning > or something was generated to alert users about the inconsistency between > var += ... and var = var + ... Since I also got bitten by this recently in my code, I fully agree. I could live with an exception for lossy down casting in this case. Josef > > > > Patrick > > > --- > Patrick Marsh > Ph.D. Candidate / Liaison to the HWT > School of Meteorology / University of Oklahoma > Cooperative Institute for Mesoscale Meteorological Studies > National Severe Storms Laboratory > http://www.patricktmarsh.com > > > On Wed, Jan 16, 2013 at 7:24 PM, Nathaniel Smith wrote: >> >> This is separate from the scalar casting thing. This is a disguised >> version of the discussion about what we should do with implicit casts caused >> by assignment: >> into_array[i] = 0.5 >> >> Traditionally numpy just happily casts this stuff, possibly mangling data >> in the process, and this has caused many actual bugs in user code. In 1.6 >> some of these assignments cause errors, but we reverted this in 1.7 because >> this was also breaking things. Supposedly we also deprecated these at the >> same time, with an eye towards making them errors eventually, but I'm not >> sure we did this properly, and our carrying rules need revisiting in any >> case. >> >> (Sorry for lack of links to earlier discussion; traveling and on my >> phone.) >> >> -n >> >> On 16 Jan 2013 16:42, "Chris Barker - NOAA Federal" >> wrote: >>> >>> Patrick, >>> >>> Not a bug but is it a mis-feature? >>> >>> See the recent thread: "Do we want scalar casting to behave as it does >>> at the moment" >>> >>> In short, this is an complex issue with no easy answer... >>> >>> -Chris >>> >>> >>> -- >>> >>> Christopher Barker, Ph.D. 
>>> Oceanographer >>> >>> Emergency Response Division >>> NOAA/NOS/OR&R (206) 526-6959 voice >>> 7600 Sand Point Way NE (206) 526-6329 fax >>> Seattle, WA 98115 (206) 526-6317 main reception >>> >>> Chris.Barker at noaa.gov >>> _______________________________________________ >>> NumPy-Discussion mailing list >>> NumPy-Discussion at scipy.org >>> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From njs at pobox.com Thu Jan 17 01:41:29 2013 From: njs at pobox.com (Nathaniel Smith) Date: Thu, 17 Jan 2013 06:41:29 +0000 Subject: [Numpy-discussion] Shouldn't all in-place operations simply return self? In-Reply-To: References: Message-ID: On 16 Jan 2013 17:54, wrote: > >>> a = np.random.random_integers(0, 5, size=5) > >>> b = a.sort() > >>> b > >>> a > array([0, 1, 2, 5, 5]) > > >>> b = np.random.shuffle(a) > >>> b > >>> b = np.random.permutation(a) > >>> b > array([0, 5, 5, 2, 1]) > > How do I remember if shuffle shuffles or permutes ? > > Do we have a list of functions that are inplace? I rather like the convention used elsewhere in Python of naming in-place operations with present tense imperative verbs, and out-of-place operations with past participles. So you have sort/sorted, reverse/reversed, etc. Here this would suggest we name these two operations as either shuffle() and shuffled(), or permute() and permuted(). -n -------------- next part -------------- An HTML attachment was scrubbed... URL: From paul.anton.letnes at gmail.com Thu Jan 17 02:14:47 2013 From: paul.anton.letnes at gmail.com (Paul Anton Letnes) Date: Thu, 17 Jan 2013 08:14:47 +0100 Subject: [Numpy-discussion] Casting Bug or a "Feature"? In-Reply-To: References: Message-ID: <50F7A4E7.7050608@gmail.com> On 17.01.2013 04:43, Patrick Marsh wrote: > Thanks, everyone for chiming in. Now that I know this behavior > exists, I can explicitly prevent it in my code. However, it would be > nice if a warning or something was generated to alert users about the > inconsistency between var += ... and var = var + ... > > > Patrick > I agree wholeheartedly. I actually, for a long time, used to believe that python would translate a += b to a = a + b and was bitten several times by this bug. A warning (which can be silenced if you desperately want to) would be really nice, imho. Keep up the good work, Paul From matthieu.brucher at gmail.com Thu Jan 17 02:34:27 2013 From: matthieu.brucher at gmail.com (Matthieu Brucher) Date: Thu, 17 Jan 2013 08:34:27 +0100 Subject: [Numpy-discussion] Casting Bug or a "Feature"? In-Reply-To: <50F7A4E7.7050608@gmail.com> References: <50F7A4E7.7050608@gmail.com> Message-ID: Hi, Actually, this behavior is already present in other languages, so I'm -1 on additional verbosity. Of course a += b is not the same as a = a + b. The first one modifies the object a, the second one creates a new object and puts it inside a. The behavior IS consistent. Cheers, Matthieu 2013/1/17 Paul Anton Letnes > On 17.01.2013 04:43, Patrick Marsh wrote: > > Thanks, everyone for chiming in. Now that I know this behavior > > exists, I can explicitly prevent it in my code. 
However, it would be > > nice if a warning or something was generated to alert users about the > > inconsistency between var += ... and var = var + ... > > > > > > Patrick > > > > I agree wholeheartedly. I actually, for a long time, used to believe > that python would translate > a += b > to > a = a + b > and was bitten several times by this bug. A warning (which can be > silenced if you desperately want to) would be really nice, imho. > > Keep up the good work, > Paul > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -- Information System Engineer, Ph.D. Blog: http://matt.eifelle.com LinkedIn: http://www.linkedin.com/in/matthieubrucher Music band: http://liliejay.com/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From burger.ga at gmail.com Thu Jan 17 05:01:11 2013 From: burger.ga at gmail.com (Gerhard Burger) Date: Thu, 17 Jan 2013 11:01:11 +0100 Subject: [Numpy-discussion] Fwd: numpy test fails with "Illegal instruction' In-Reply-To: References: Message-ID: Dear numpy users, I am trying to get numpy to work on my computer, but so far no luck. When I run `numpy.test(verbose=10)` it crashes with test_polyfit (test_polynomial.TestDocs) ... Illegal instruction In the FAQ it states that I should provide the following information (running Ubuntu 12.04 64bit): os.name = 'posix' uname -r = 3.2.0-35-generic sys.platform = 'linux2' sys.version = '2.7.3 (default, Aug 1 2012, 05:14:39) \n[GCC 4.6.3]' Atlas is not installed (not required for numpy, only for scipy right?) It fails both when I install numpy 1.6.2 with `pip install numpy` and if I install the latest dev version from git. Can someone give me some pointers on how to solve this? I will be grateful for any help you can provide. Kind regards, Gerhard Burger -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Thu Jan 17 07:27:44 2013 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 17 Jan 2013 07:27:44 -0500 Subject: [Numpy-discussion] Casting Bug or a "Feature"? In-Reply-To: References: <50F7A4E7.7050608@gmail.com> Message-ID: On Thu, Jan 17, 2013 at 2:34 AM, Matthieu Brucher wrote: > Hi, > > Actually, this behavior is already present in other languages, so I'm -1 on > additional verbosity. > Of course a += b is not the same as a = a + b. The first one modifies the > object a, the second one creates a new object and puts it inside a. The > behavior IS consistent. The inplace operation is standard, but my guess is that the silent downcasting is not. in python >>> a = 1 >>> a += 5.3 >>> a 6.2999999999999998 >>> a = 1 >>> a *= 1j >>> a 1j I have no idea about other languages. Josef > > Cheers, > > Matthieu > > > 2013/1/17 Paul Anton Letnes >> >> On 17.01.2013 04:43, Patrick Marsh wrote: >> > Thanks, everyone for chiming in. Now that I know this behavior >> > exists, I can explicitly prevent it in my code. However, it would be >> > nice if a warning or something was generated to alert users about the >> > inconsistency between var += ... and var = var + ... >> > >> > >> > Patrick >> > >> >> I agree wholeheartedly. I actually, for a long time, used to believe >> that python would translate >> a += b >> to >> a = a + b >> and was bitten several times by this bug. A warning (which can be >> silenced if you desperately want to) would be really nice, imho. 
>> >> Keep up the good work, >> Paul >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > > -- > Information System Engineer, Ph.D. > Blog: http://matt.eifelle.com > LinkedIn: http://www.linkedin.com/in/matthieubrucher > Music band: http://liliejay.com/ > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From d.s.seljebotn at astro.uio.no Thu Jan 17 07:49:09 2013 From: d.s.seljebotn at astro.uio.no (Dag Sverre Seljebotn) Date: Thu, 17 Jan 2013 13:49:09 +0100 Subject: [Numpy-discussion] Casting Bug or a "Feature"? In-Reply-To: References: <50F7A4E7.7050608@gmail.com> Message-ID: <50F7F345.4040504@astro.uio.no> On 01/17/2013 01:27 PM, josef.pktd at gmail.com wrote: > On Thu, Jan 17, 2013 at 2:34 AM, Matthieu Brucher > wrote: >> Hi, >> >> Actually, this behavior is already present in other languages, so I'm -1 on >> additional verbosity. >> Of course a += b is not the same as a = a + b. The first one modifies the >> object a, the second one creates a new object and puts it inside a. The >> behavior IS consistent. > > The inplace operation is standard, but my guess is that the silent > downcasting is not. > > in python > >>>> a = 1 >>>> a += 5.3 >>>> a > 6.2999999999999998 >>>> a = 1 >>>> a *= 1j >>>> a > 1j > > I have no idea about other languages. I don't think the comparison with Python scalars is relevant since they are immutable: In [9]: a = 1 In [10]: b = a In [11]: a *= 1j In [12]: b Out[12]: 1 In-place operators exists for lists, but I don't know what the equivalent of a down-cast would be... In [3]: a = [0, 1] In [4]: b = a In [5]: a *= 2 In [6]: b Out[6]: [0, 1, 0, 1] Dag Sverre From jim.vickroy at noaa.gov Thu Jan 17 08:54:03 2013 From: jim.vickroy at noaa.gov (Jim Vickroy) Date: Thu, 17 Jan 2013 06:54:03 -0700 Subject: [Numpy-discussion] Shouldn't all in-place operations simply return self? In-Reply-To: References: Message-ID: <50F8027B.5040301@noaa.gov> On 1/16/2013 11:41 PM, Nathaniel Smith wrote: > > On 16 Jan 2013 17:54, > wrote: > > >>> a = np.random.random_integers(0, 5, size=5) > > >>> b = a.sort() > > >>> b > > >>> a > > array([0, 1, 2, 5, 5]) > > > > >>> b = np.random.shuffle(a) > > >>> b > > >>> b = np.random.permutation(a) > > >>> b > > array([0, 5, 5, 2, 1]) > > > > How do I remember if shuffle shuffles or permutes ? > > > > Do we have a list of functions that are inplace? > > I rather like the convention used elsewhere in Python of naming > in-place operations with present tense imperative verbs, and > out-of-place operations with past participles. So you have > sort/sorted, reverse/reversed, etc. > > Here this would suggest we name these two operations as either > shuffle() and shuffled(), or permute() and permuted(). > I like this (tense) suggestion. It seems easy to remember. --jv > -n > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From pierre.haessig at crans.org Thu Jan 17 09:02:51 2013 From: pierre.haessig at crans.org (Pierre Haessig) Date: Thu, 17 Jan 2013 15:02:51 +0100 Subject: [Numpy-discussion] New numpy functions: filled, filled_like In-Reply-To: <50F459DF.2070900@gmail.com> References: <50F4400F.4040709@hawaii.edu> <50F44A95.2030202@crans.org> <50F459DF.2070900@gmail.com> Message-ID: <50F8048B.6070506@crans.org> Hi, Le 14/01/2013 20:17, Alan G Isaac a ?crit : > >>> a = np.tile(5,(1,2,3)) > >>> a > array([[[5, 5, 5], > [5, 5, 5]]]) > >>> np.tile(1,a.shape) > array([[[1, 1, 1], > [1, 1, 1]]]) > > I had not realized a scalar first argument was possible. I didn't know either ! I discovered this use in the thread of this discussion. Just like Ben, I've almost never used "np.tile" neither its cousin "np.repeat"... Now, in the process of rediscovering those two functions, I was just wondering whether it would make sense to repackage them in order to allow the simple functionality of initializing a non-empty array. In term of choosing the name (or actually the verb), I prefer "repeat" because it's a more familiar concept than "tile". However, repeat may need more changes to make it work than tile. Indeed we currently have : >>> tile(nan, (3,3)) # works fine, but is pretty slow for that purpose, And doesn't accept a dtype arg array([[ nan, nan, nan], [ nan, nan, nan], [ nan, nan, nan]]) Doesn't work for that purpose: >>>repeat(nan, (3,3)) [...] ValueError: a.shape[axis] != len(repeats) So what people think of this "green" approach of recycling existing API into a slightly different function (without breaking current behavior of course) Best, Pierre -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 900 bytes Desc: OpenPGP digital signature URL: From scott.sinclair.za at gmail.com Thu Jan 17 09:12:53 2013 From: scott.sinclair.za at gmail.com (Scott Sinclair) Date: Thu, 17 Jan 2013 16:12:53 +0200 Subject: [Numpy-discussion] Fwd: numpy test fails with "Illegal instruction' In-Reply-To: References: Message-ID: On 17 January 2013 12:01, Gerhard Burger wrote: > When I run `numpy.test(verbose=10)` it crashes with > > test_polyfit (test_polynomial.TestDocs) ... Illegal instruction > > In the FAQ it states that I should provide the following information > (running Ubuntu 12.04 64bit): > > os.name = 'posix' > uname -r = 3.2.0-35-generic > sys.platform = 'linux2' > sys.version = '2.7.3 (default, Aug 1 2012, 05:14:39) \n[GCC 4.6.3]' > > Atlas is not installed (not required for numpy, only for scipy right?) > > It fails both when I install numpy 1.6.2 with `pip install numpy` and if I > install the latest dev version from git. Very strange. I tried to reproduce this on 64-bit Ubuntu 12.04 (by removing my ATLAS, BLAS, LAPACK etc..) but couldn't: $ python -c "import numpy; numpy.test()" Running unit tests for numpy NumPy version 1.6.2 NumPy is installed in /home/scott/.virtualenvs/numpy-tmp/local/lib/python2.7/site-packages/numpy Python version 2.7.3 (default, Aug 1 2012, 05:14:39) [GCC 4.6.3] nose version 1.2.1 ......... 
---------------------------------------------------------------------- Ran 3568 tests in 14.170s OK (KNOWNFAIL=5, SKIP=5) $ python -c "import numpy; numpy.show_config()" blas_info: NOT AVAILABLE lapack_info: NOT AVAILABLE atlas_threads_info: NOT AVAILABLE blas_src_info: NOT AVAILABLE lapack_src_info: NOT AVAILABLE atlas_blas_threads_info: NOT AVAILABLE lapack_opt_info: NOT AVAILABLE blas_opt_info: NOT AVAILABLE atlas_info: NOT AVAILABLE lapack_mkl_info: NOT AVAILABLE blas_mkl_info: NOT AVAILABLE atlas_blas_info: NOT AVAILABLE mkl_info: NOT AVAILABLE Cheers, Scott From pierre.haessig at crans.org Thu Jan 17 09:13:37 2013 From: pierre.haessig at crans.org (Pierre Haessig) Date: Thu, 17 Jan 2013 15:13:37 +0100 Subject: [Numpy-discussion] New numpy functions: filled, filled_like In-Reply-To: References: <50F4400F.4040709@hawaii.edu> <50F44A95.2030202@crans.org> Message-ID: <50F80711.9010204@crans.org> Hi, Le 14/01/2013 20:05, Benjamin Root a ?crit : > I do like the way you are thinking in terms of the broadcasting > semantics, but I wonder if that is a bit awkward. What I mean is, if > one were to use broadcasting semantics for creating an array, wouldn't > one have just simply used broadcasting anyway? The point of > broadcasting is to _avoid_ the creation of unneeded arrays. But maybe > I can be convinced with some examples. I feel that one of the point of the discussion is : although a new (or not so new...) function to create a filled array would be more elegant than the existing pair of functions "np.zeros" and "np.ones", there are maybe not so many usecases for filled arrays *other than zeros values*. I can remember having initialized a non-zero array *some months ago*. For the anecdote it was a vector of discretized vehicule speed values which I wanted to be initialized with a predefined mean speed value prior to some optimization. In that usecase, I really didn't care about the performance of this initialization step. So my overall feeling after this thread is - *yes* a single dedicated fill/init/someverb function would give a slightly better API, - but *no* it's not important because np.empty and np.zeros covers 95 % usecases ! best, Pierre -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 900 bytes Desc: OpenPGP digital signature URL: From burger.ga at gmail.com Thu Jan 17 09:18:04 2013 From: burger.ga at gmail.com (Gerhard Burger) Date: Thu, 17 Jan 2013 15:18:04 +0100 Subject: [Numpy-discussion] Fwd: numpy test fails with "Illegal instruction' In-Reply-To: References: Message-ID: I read somewhere that it could have to do with the sse instructions that your processor is capable of, but my processor is not that old, so I would think that is not the problem... On Thu, Jan 17, 2013 at 3:12 PM, Scott Sinclair wrote: > On 17 January 2013 12:01, Gerhard Burger wrote: > > When I run `numpy.test(verbose=10)` it crashes with > > > > test_polyfit (test_polynomial.TestDocs) ... Illegal instruction > > > > In the FAQ it states that I should provide the following information > > (running Ubuntu 12.04 64bit): > > > > os.name = 'posix' > > uname -r = 3.2.0-35-generic > > sys.platform = 'linux2' > > sys.version = '2.7.3 (default, Aug 1 2012, 05:14:39) \n[GCC 4.6.3]' > > > > Atlas is not installed (not required for numpy, only for scipy right?) > > > > It fails both when I install numpy 1.6.2 with `pip install numpy` and if > I > > install the latest dev version from git. > > Very strange. 
I tried to reproduce this on 64-bit Ubuntu 12.04 (by > removing my ATLAS, BLAS, LAPACK etc..) but couldn't: > > $ python -c "import numpy; numpy.test()" > Running unit tests for numpy > NumPy version 1.6.2 > NumPy is installed in > /home/scott/.virtualenvs/numpy-tmp/local/lib/python2.7/site-packages/numpy > Python version 2.7.3 (default, Aug 1 2012, 05:14:39) [GCC 4.6.3] > nose version 1.2.1 > ......... > ---------------------------------------------------------------------- > Ran 3568 tests in 14.170s > > OK (KNOWNFAIL=5, SKIP=5) > > $ python -c "import numpy; numpy.show_config()" > blas_info: > NOT AVAILABLE > lapack_info: > NOT AVAILABLE > atlas_threads_info: > NOT AVAILABLE > blas_src_info: > NOT AVAILABLE > lapack_src_info: > NOT AVAILABLE > atlas_blas_threads_info: > NOT AVAILABLE > lapack_opt_info: > NOT AVAILABLE > blas_opt_info: > NOT AVAILABLE > atlas_info: > NOT AVAILABLE > lapack_mkl_info: > NOT AVAILABLE > blas_mkl_info: > NOT AVAILABLE > atlas_blas_info: > NOT AVAILABLE > mkl_info: > NOT AVAILABLE > > Cheers, > Scott > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From matthew.brett at gmail.com Thu Jan 17 09:26:16 2013 From: matthew.brett at gmail.com (Matthew Brett) Date: Thu, 17 Jan 2013 14:26:16 +0000 Subject: [Numpy-discussion] Do we want scalar casting to behave as it does at the moment? In-Reply-To: <50EDB1D4.5090909@astro.uio.no> References: <50EDB1D4.5090909@astro.uio.no> Message-ID: Hi, On Wed, Jan 9, 2013 at 6:07 PM, Dag Sverre Seljebotn wrote: > On 01/09/2013 06:22 PM, Chris Barker - NOAA Federal wrote: >> On Wed, Jan 9, 2013 at 7:09 AM, Nathaniel Smith wrote: >>>> This is a general issue applying to data which is read from real-world >>>> external sources. For example, digitizers routinely represent their >>>> samples as int8's or int16's, and you apply a scale and offset to get >>>> a reading in volts. >>> >>> This particular case is actually handled fine by 1.5, because int >>> array + float scalar *does* upcast to float. It's width that's ignored >>> (int8 versus int32), not the basic "kind" of data (int versus float). >>> >>> But overall this does sound like a problem -- but it's not a problem >>> with the scalar/array rules, it's a problem with working with narrow >>> width data in general. >> >> Exactly -- this is key. details asside, we essentially have a choice >> between an approach that makes it easy to preserver your values -- >> upcasting liberally, or making it easy to preserve your dtype -- >> requiring users to specifically upcast where needed. >> >> IIRC, our experience with earlier versions of numpy (and Numeric >> before that) is that all too often folks would choose a small dtype >> quite deliberately, then have it accidentally upcast for them -- this >> was determined to be not-so-good behavior. >> >> I think the HDF (and also netcdf...) case is a special case -- the >> small dtype+scaling has been chosen deliberately by whoever created >> the data file (to save space), but we would want it generally opaque >> to the consumer of the file -- to me, that means the issue should be >> adressed by the file reading tools, not numpy. If your HDF5 reader >> chooses the the resulting dtype explicitly, it doesn't matter what >> numpy's defaults are. If the user wants to work with the raw, unscaled >> arrays, then they should know what they are doing. 
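A sketch of the reader-side upcast being described (scale, offset and the
values here are made-up calibration metadata, not an actual h5py or netCDF
attribute API):

import numpy as np

raw = np.array([1, 2, 3], dtype=np.int8)          # narrow samples as stored on disk
scale, offset = 0.01, -1.2                        # calibration chosen by the file writer
volts = raw.astype(np.float64) * scale + offset   # the reader picks the wide dtype explicitly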
> > +1. I think h5py should consider: > > File("my.h5")['int8_dset'].dtype == int64 > File("my.h5", preserve_dtype=True)['int8_dset'].dtype == int8 Returning to this thread - did we have a decision? With further reflection, it seems to me we will have a tough time going back to the 1.5 behavior now - we might be shutting the stable door after the cat is out of the bag, if you see what I mean. Maybe we should change the question to the desirable behavior in the long term. I am starting to wonder if we should aim for making * scalar and array casting rules the same; * Python int / float scalars become int32 / 64 or float64; This has the benefit of being very easy to understand and explain. It makes dtypes predictable in the sense they don't depend on value. Those wanting to maintain - say - float32 will need to cast scalars to float32. Maybe the use-cases motivating the scalar casting rules - maintaining float32 precision in particular - can be dealt with by careful casting of scalars, throwing the burden onto the memory-conscious to maintain their dtypes. Or is there a way of using flags to ufuncs to emulate the 1.5 casting rules? Do y'all agree this is desirable in the long term? If so, how should we get there? It seems to me we're about 25 percent of the way there with the current scalar casting rule. Cheers, Matthew From alan.isaac at gmail.com Thu Jan 17 09:32:34 2013 From: alan.isaac at gmail.com (Alan G Isaac) Date: Thu, 17 Jan 2013 09:32:34 -0500 Subject: [Numpy-discussion] Shouldn't all in-place operations simply return self? In-Reply-To: <50F8027B.5040301@noaa.gov> References: <50F8027B.5040301@noaa.gov> Message-ID: <50F80B82.9080606@gmail.com> Is it really better to have `permute` and `permuted` than to add a keyword? (Note that these are actually still ambiguous, except by convention.) Btw, two separate issues seem to be running side by side. i. should in-place operations return their result? ii. how can we signal that an operation is inplace? I expect NumPy to do inplace operations when feasible, so maybe they could take an `out` keyword with a None default. Possibly recognize `out=True` as asking for the original array object to be returned (mutated); `out='copy'` as asking for a copy to be created, operated upon, and returned; and `out=a` to ask for array `a` to be used for the output (without changing the original object, and with a return value of None). Alan Isaac From ben.root at ou.edu Thu Jan 17 09:49:51 2013 From: ben.root at ou.edu (Benjamin Root) Date: Thu, 17 Jan 2013 09:49:51 -0500 Subject: [Numpy-discussion] Shouldn't all in-place operations simply return self? In-Reply-To: <50F8027B.5040301@noaa.gov> References: <50F8027B.5040301@noaa.gov> Message-ID: On Thu, Jan 17, 2013 at 8:54 AM, Jim Vickroy wrote: > On 1/16/2013 11:41 PM, Nathaniel Smith wrote: > > On 16 Jan 2013 17:54, wrote: > > >>> a = np.random.random_integers(0, 5, size=5) > > >>> b = a.sort() > > >>> b > > >>> a > > array([0, 1, 2, 5, 5]) > > > > >>> b = np.random.shuffle(a) > > >>> b > > >>> b = np.random.permutation(a) > > >>> b > > array([0, 5, 5, 2, 1]) > > > > How do I remember if shuffle shuffles or permutes ? > > > > Do we have a list of functions that are inplace? > > I rather like the convention used elsewhere in Python of naming in-place > operations with present tense imperative verbs, and out-of-place operations > with past participles. So you have sort/sorted, reverse/reversed, etc. 
> > Here this would suggest we name these two operations as either shuffle() > and shuffled(), or permute() and permuted(). > > > I like this (tense) suggestion. It seems easy to remember. --jv > > > And another score for functions as verbs! :-P Ben Root -------------- next part -------------- An HTML attachment was scrubbed... URL: From burger.ga at gmail.com Thu Jan 17 09:59:36 2013 From: burger.ga at gmail.com (Gerhard Burger) Date: Thu, 17 Jan 2013 15:59:36 +0100 Subject: [Numpy-discussion] Fwd: numpy test fails with "Illegal instruction' In-Reply-To: References: Message-ID: Solved it, did a backtrace with gdb and the error came somewhere from an old lapack version that was installed on my machine (I thought I wouldn't have these issues in a virtualenv). but anyway after I removed it, and installed numpy again, it ran without problems! On Thu, Jan 17, 2013 at 3:18 PM, Gerhard Burger wrote: > I read somewhere that it could have to do with the sse instructions that > your processor is capable of, but my processor is not that old, so I would > think that is not the problem... > > > > On Thu, Jan 17, 2013 at 3:12 PM, Scott Sinclair < > scott.sinclair.za at gmail.com> wrote: > >> On 17 January 2013 12:01, Gerhard Burger wrote: >> > When I run `numpy.test(verbose=10)` it crashes with >> > >> > test_polyfit (test_polynomial.TestDocs) ... Illegal instruction >> > >> > In the FAQ it states that I should provide the following information >> > (running Ubuntu 12.04 64bit): >> > >> > os.name = 'posix' >> > uname -r = 3.2.0-35-generic >> > sys.platform = 'linux2' >> > sys.version = '2.7.3 (default, Aug 1 2012, 05:14:39) \n[GCC 4.6.3]' >> > >> > Atlas is not installed (not required for numpy, only for scipy right?) >> > >> > It fails both when I install numpy 1.6.2 with `pip install numpy` and >> if I >> > install the latest dev version from git. >> >> Very strange. I tried to reproduce this on 64-bit Ubuntu 12.04 (by >> removing my ATLAS, BLAS, LAPACK etc..) but couldn't: >> >> $ python -c "import numpy; numpy.test()" >> Running unit tests for numpy >> NumPy version 1.6.2 >> NumPy is installed in >> /home/scott/.virtualenvs/numpy-tmp/local/lib/python2.7/site-packages/numpy >> Python version 2.7.3 (default, Aug 1 2012, 05:14:39) [GCC 4.6.3] >> nose version 1.2.1 >> ......... >> ---------------------------------------------------------------------- >> Ran 3568 tests in 14.170s >> >> OK (KNOWNFAIL=5, SKIP=5) >> >> $ python -c "import numpy; numpy.show_config()" >> blas_info: >> NOT AVAILABLE >> lapack_info: >> NOT AVAILABLE >> atlas_threads_info: >> NOT AVAILABLE >> blas_src_info: >> NOT AVAILABLE >> lapack_src_info: >> NOT AVAILABLE >> atlas_blas_threads_info: >> NOT AVAILABLE >> lapack_opt_info: >> NOT AVAILABLE >> blas_opt_info: >> NOT AVAILABLE >> atlas_info: >> NOT AVAILABLE >> lapack_mkl_info: >> NOT AVAILABLE >> blas_mkl_info: >> NOT AVAILABLE >> atlas_blas_info: >> NOT AVAILABLE >> mkl_info: >> NOT AVAILABLE >> >> Cheers, >> Scott >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Thu Jan 17 10:24:29 2013 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 17 Jan 2013 10:24:29 -0500 Subject: [Numpy-discussion] Shouldn't all in-place operations simply return self? 
In-Reply-To: References: <50F8027B.5040301@noaa.gov> Message-ID: On Thu, Jan 17, 2013 at 9:49 AM, Benjamin Root wrote: > > > On Thu, Jan 17, 2013 at 8:54 AM, Jim Vickroy wrote: >> >> On 1/16/2013 11:41 PM, Nathaniel Smith wrote: >> >> On 16 Jan 2013 17:54, wrote: >> > >>> a = np.random.random_integers(0, 5, size=5) >> > >>> b = a.sort() >> > >>> b >> > >>> a >> > array([0, 1, 2, 5, 5]) >> > >> > >>> b = np.random.shuffle(a) >> > >>> b >> > >>> b = np.random.permutation(a) >> > >>> b >> > array([0, 5, 5, 2, 1]) >> > >> > How do I remember if shuffle shuffles or permutes ? >> > >> > Do we have a list of functions that are inplace? >> >> I rather like the convention used elsewhere in Python of naming in-place >> operations with present tense imperative verbs, and out-of-place operations >> with past participles. So you have sort/sorted, reverse/reversed, etc. >> >> Here this would suggest we name these two operations as either shuffle() >> and shuffled(), or permute() and permuted(). >> >> >> I like this (tense) suggestion. It seems easy to remember. --jv >> >> > > And another score for functions as verbs! I don't thing the filled we discuss here is an action. The current ``fill`` is an inplace operation, operating on an existing array. ``filled`` would be the analog that returns a copy. However ``filled`` here is creating an object I still think ``array_filled`` is the most precise '''Create an array and initialize it with the ``value``, returning the array ''' my 2.5c Josef > > :-P > > Ben Root > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From charlesr.harris at gmail.com Thu Jan 17 10:27:25 2013 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 17 Jan 2013 08:27:25 -0700 Subject: [Numpy-discussion] Shouldn't all in-place operations simply return self? In-Reply-To: References: Message-ID: On Wed, Jan 16, 2013 at 5:11 PM, eat wrote: > Hi, > > In a recent thread > http://article.gmane.org/gmane.comp.python.numeric.general/52772 it was > proposed that .fill(.) should return self as an alternative for a trivial > two-liner. > > I'm raising now the question: what if all in-place operations indeed could > return self? How bad this would be? A 'strong' counter argument may be > found at > http://mail.python.org/pipermail/python-dev/2003-October/038855.html. > > But anyway, at least for me. it would be much more straightforward to > implement simple mini dsl's ( > http://en.wikipedia.org/wiki/Domain-specific_language) a much more > straightforward manner. > > What do you think? > > I've read Guido about why he didn't like inplace operations returning self and found him convincing for a while. And then I listened to other folks express a preference for the freight train style and found them convincing also. I think it comes down to a preference for one style over another and I go back and forth myself. If I had to vote, I'd go for returning self, but I'm not sure it's worth breaking python conventions to do so. Chuck > > -eat > > P.S. FWIW, if this idea really gains momentum obviously I'm volunteering > to create a PR of it. > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From josef.pktd at gmail.com Thu Jan 17 10:28:20 2013 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 17 Jan 2013 10:28:20 -0500 Subject: [Numpy-discussion] Shouldn't all in-place operations simply return self? In-Reply-To: References: <50F8027B.5040301@noaa.gov> Message-ID: On Thu, Jan 17, 2013 at 10:24 AM, wrote: > On Thu, Jan 17, 2013 at 9:49 AM, Benjamin Root wrote: >> >> >> On Thu, Jan 17, 2013 at 8:54 AM, Jim Vickroy wrote: >>> >>> On 1/16/2013 11:41 PM, Nathaniel Smith wrote: >>> >>> On 16 Jan 2013 17:54, wrote: >>> > >>> a = np.random.random_integers(0, 5, size=5) >>> > >>> b = a.sort() >>> > >>> b >>> > >>> a >>> > array([0, 1, 2, 5, 5]) >>> > >>> > >>> b = np.random.shuffle(a) >>> > >>> b >>> > >>> b = np.random.permutation(a) >>> > >>> b >>> > array([0, 5, 5, 2, 1]) >>> > >>> > How do I remember if shuffle shuffles or permutes ? >>> > >>> > Do we have a list of functions that are inplace? >>> >>> I rather like the convention used elsewhere in Python of naming in-place >>> operations with present tense imperative verbs, and out-of-place operations >>> with past participles. So you have sort/sorted, reverse/reversed, etc. >>> >>> Here this would suggest we name these two operations as either shuffle() >>> and shuffled(), or permute() and permuted(). >>> >>> >>> I like this (tense) suggestion. It seems easy to remember. --jv >>> >>> >> >> And another score for functions as verbs! > > I don't thing the filled we discuss here is an action. > > The current ``fill`` is an inplace operation, operating on an existing array. > ``filled`` would be the analog that returns a copy. > > However ``filled`` here is creating an object > > I still think ``array_filled`` is the most precise > > '''Create an array and initialize it with the ``value``, returning the array ''' > > > my 2.5c > > Josef Sorry, completely out of context. I shouldn't write emails, when I'm running in and out the office. Josef > >> >> :-P >> >> Ben Root >> >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> From pierre.haessig at crans.org Thu Jan 17 10:48:36 2013 From: pierre.haessig at crans.org (Pierre Haessig) Date: Thu, 17 Jan 2013 16:48:36 +0100 Subject: [Numpy-discussion] phase unwrapping (1d) In-Reply-To: References: <50F40352.9090603@crans.org> Message-ID: <50F81D54.7020105@crans.org> Hi Neal, Le 14/01/2013 15:39, Neal Becker a ?crit : > This code should explain all: > -------------------------------- > import numpy as np > arg = np.angle > > def nint (x): > return int (x + 0.5) if x >= 0 else int (x - 0.5) > > def unwrap (inp, y=np.pi, init=0, cnt=0): > o = np.empty_like (inp) > prev_o = init > for i in range (len (inp)): > o[i] = cnt * 2 * y + inp[i] > delta = o[i] - prev_o > > if delta / y > 1 or delta / y < -1: > n = nint (delta / (2*y)) > o[i] -= 2*y*n > cnt -= n > > prev_o = o[i] > > return o > > > u = np.linspace (0, 400, 100) * np.pi/100 > v = np.cos (u) + 1j * np.sin (u) > plot (arg(v)) > plot (arg(v) + arg (v)) > plot (unwrap (arg (v))) > plot (unwrap (arg (v) + arg (v))) I think your code does the job. 
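For reference, a quick illustrative check of the built-in np.unwrap on the same 0-to-4*pi test ramp used above (a minimal sketch, not part of the code under discussion):

import numpy as np

# the same test signal: a phase ramp from 0 to 4*pi, wrapped into (-pi, pi]
u = np.linspace(0, 400, 100) * np.pi / 100
wrapped = np.angle(np.cos(u) + 1j * np.sin(u))

# np.unwrap removes the 2*pi jumps and recovers the original ramp
recovered = np.unwrap(wrapped)
print(np.allclose(recovered, u))   # True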
I tried the following simplification, without the use of nint (which by the way could be replaced by int(floor(x)), I think):

def unwrap (inp, y=np.pi, init=0, cnt=0):
    o = np.empty_like (inp)
    prev_o = init
    for i in range (len (inp)):
        o[i] = cnt * 2 * y + inp[i]
        delta = o[i] - prev_o

        if delta / y > 1:
            o[i] -= 2*y
            cnt -= 1
        elif delta / y < -1:
            o[i] += 2*y
            cnt += 1

        prev_o = o[i]

    return o

And now I understand the issue you described of "phase changes of more than 2pi", because the above indeed fails to unwrap (arg (v) + arg (v)). On the other hand, np.unwrap handles it correctly. (I still don't know about the speed issue.)

Best,
Pierre
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 900 bytes
Desc: OpenPGP digital signature
URL:

From chris.barker at noaa.gov  Thu Jan 17 11:21:26 2013
From: chris.barker at noaa.gov (Chris Barker - NOAA Federal)
Date: Thu, 17 Jan 2013 08:21:26 -0800
Subject: [Numpy-discussion] Casting Bug or a "Feature"?
In-Reply-To:
References: <50F7A4E7.7050608@gmail.com>
Message-ID:

On Wed, Jan 16, 2013 at 11:34 PM, Matthieu Brucher wrote:

> Of course a += b is not the same as a = a + b. The first one modifies the
> object a, the second one creates a new object and puts it inside a. The
> behavior IS consistent.

Exactly -- if you ask me, the bug is that Python allows "in_place" operators for immutable objects -- they should be more than syntactic sugar.

Of course, the temptation for += on regular numbers was just too much to resist.

-Chris

--

Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker at noaa.gov

From njs at pobox.com  Thu Jan 17 11:33:47 2013
From: njs at pobox.com (Nathaniel Smith)
Date: Thu, 17 Jan 2013 16:33:47 +0000
Subject: [Numpy-discussion] Shouldn't all in-place operations simply return self?
In-Reply-To: <50F80B82.9080606@gmail.com>
References: <50F8027B.5040301@noaa.gov> <50F80B82.9080606@gmail.com>
Message-ID:

On Thu, Jan 17, 2013 at 2:32 PM, Alan G Isaac wrote:
> Is it really better to have `permute` and `permuted`
> than to add a keyword? (Note that these are actually
> still ambiguous, except by convention.)

The convention in question, though, is that of English grammar. In practice everyone who uses numpy is a more-or-less skilled English speaker in any case, so re-using the conventions is helpful!

"Shake the martini!" <- an imperative command

This is a complete statement all by itself. You can't say "Hand me the shake the martini". In procedural languages like Python, there's a strong distinction between statements (whole lines, a = 1), which only matter because of their side-effects, and expressions (a + b) which have a value and can be embedded into a larger statement or expression ((a + b) + c). "Shake the martini" is clearly a statement, not an expression, and therefore clearly has a side-effect.

"shaken martini" <- a noun phrase

Grammatically, this is like plain "martini": you can use it anywhere you can use a noun. "Hand me the martini", "Hand me the shaken martini". In programming terms, it's an expression, not a statement. And side-effecting expressions are poor style, because when you read procedural code, you know each statement contains at least 1 side-effect, and it's much easier to figure out what's going on if each statement contains *exactly* one side-effect, and it's the top-most operation.
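To make the sort/sorted convention concrete, a minimal sketch (the shuffled() spelling is only the proposal under discussion; np.random.permutation is the closest existing out-of-place counterpart):

import numpy as np

lst = [3, 1, 2]
lst.sort()                     # imperative verb: mutates lst, returns None
new = sorted([3, 1, 2])        # past participle: returns a new list

a = np.arange(5)
np.random.shuffle(a)           # in place, returns None
b = np.random.permutation(a)   # out of place: returns a new array, a is untouched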
This underlying readability guideline is actually baked much more deeply into Python than the sort/sorted distinction -- this is why in Python, 'a = 1' is *not* an expression, but a statement. C allows you to say things like "b = (a = 1)", but in Python you have to say "a = 1; b = a". > Btw, two separate issues seem to be running side by side. > > i. should in-place operations return their result? > ii. how can we signal that an operation is inplace? > > I expect NumPy to do inplace operations when feasible, > so maybe they could take an `out` keyword with a None default. > Possibly recognize `out=True` as asking for the original array > object to be returned (mutated); `out='copy'` as asking for a copy to > be created, operated upon, and returned; and `out=a` to ask > for array `a` to be used for the output (without changing > the original object, and with a return value of None). Good point that numpy also has a nice convention with out= arguments for ufuncs. I guess that convention is, by default return a new array, but also allow one to modify the same (or another!) array in-place, by passing out=. So this would suggest that we'd have b = shuffled(a) shuffled(a, out=a) shuffled(a, out=b) shuffle(a) # same as shuffled(a, out=a) and if people are bothered by having both 'shuffled' and 'shuffle', then we drop 'shuffle'. (And the decision about whether to include the imperative form can be made on a case-by-case basis; having both shuffled and shuffle seems fine to me, but probably there are other cases where this is less clear.) There is also an argument that if out= is given, then we should always return None, in general. I'm having a lot of trouble thinking of any situation where it would be acceptable style (or even useful) to write something like: c = np.add(a, b, out=a) + 1 But, 'out=' is very large and visible (which makes the readability less terrible than it could be). And np.add always returns the out array when working out-of-place (so there's at least a weak countervailing convention). So I feel much more strongly that shuffle() should return None, than I do that np.add(out=...) should return None. A compromise position would be to make all new functions that take out= return None when out= is given, while leaving existing ufuncs and such as they are for now. -n From d.s.seljebotn at astro.uio.no Thu Jan 17 13:08:36 2013 From: d.s.seljebotn at astro.uio.no (Dag Sverre Seljebotn) Date: Thu, 17 Jan 2013 19:08:36 +0100 Subject: [Numpy-discussion] Shouldn't all in-place operations simply return self? In-Reply-To: References: <50F8027B.5040301@noaa.gov> <50F80B82.9080606@gmail.com> Message-ID: <50F83E24.9030808@astro.uio.no> On 01/17/2013 05:33 PM, Nathaniel Smith wrote: > On Thu, Jan 17, 2013 at 2:32 PM, Alan G Isaac wrote: >> Is it really better to have `permute` and `permuted` >> than to add a keyword? (Note that these are actually >> still ambiguous, except by convention.) > > The convention in question, though, is that of English grammar. In > practice everyone who uses numpy is a more-or-less skilled English > speaker in any case, so re-using the conventions is helpful! > > "Shake the martini!" <- an imperative command > > This is a complete statement all by itself. You can't say "Hand me the > shake the martini". 
In procedural languages like Python, there's a > strong distinction between statements (whole lines, a = 1), which only > matter because of their side-effects, and expressions (a + b) which > have a value and can be embedded into a larger statement or expression > ((a + b) + c). "Shake the martini" is clearly a statement, not an > expression, and therefore clearly has a side-effect. > > "shaken martini" <- a noun phrase > > Grammatically, this is like plain "martini", you can use it anywhere > you can use a noun. "Hand me the martini", "Hand me the shaken > martini". In programming terms, it's an expression, not a statement. > And side-effecting expressions are poor style, because when you read > procedural code, you know each statement contains at least 1 > side-effect, and it's much easier to figure out what's going on if > each statement contains *exactly* one side-effect, and it's the > top-most operation. > > This underlying readability guideline is actually baked much more > deeply into Python than the sort/sorted distinction -- this is why in > Python, 'a = 1' is *not* an expression, but a statement. C allows you > to say things like "b = (a = 1)", but in Python you have to say "a = > 1; b = a". > >> Btw, two separate issues seem to be running side by side. >> >> i. should in-place operations return their result? >> ii. how can we signal that an operation is inplace? >> >> I expect NumPy to do inplace operations when feasible, >> so maybe they could take an `out` keyword with a None default. >> Possibly recognize `out=True` as asking for the original array >> object to be returned (mutated); `out='copy'` as asking for a copy to >> be created, operated upon, and returned; and `out=a` to ask >> for array `a` to be used for the output (without changing >> the original object, and with a return value of None). > > Good point that numpy also has a nice convention with out= arguments > for ufuncs. I guess that convention is, by default return a new array, > but also allow one to modify the same (or another!) array in-place, by > passing out=. So this would suggest that we'd have > b = shuffled(a) > shuffled(a, out=a) > shuffled(a, out=b) > shuffle(a) # same as shuffled(a, out=a) > and if people are bothered by having both 'shuffled' and 'shuffle', > then we drop 'shuffle'. (And the decision about whether to include the > imperative form can be made on a case-by-case basis; having both > shuffled and shuffle seems fine to me, but probably there are other > cases where this is less clear.) In addition to the verb tense, I think it's important that mutators are methods whereas functions do not mutate their arguments: lst.sort() sorted(lst) So -1 on shuffle(a) and a.shuffled(). Dag Sverre > > There is also an argument that if out= is given, then we should always > return None, in general. I'm having a lot of trouble thinking of any > situation where it would be acceptable style (or even useful) to write > something like: > c = np.add(a, b, out=a) + 1 > But, 'out=' is very large and visible (which makes the readability > less terrible than it could be). And np.add always returns the out > array when working out-of-place (so there's at least a weak > countervailing convention). So I feel much more strongly that > shuffle() should return None, than I do that np.add(out=...) should > return None. > > A compromise position would be to make all new functions that take > out= return None when out= is given, while leaving existing ufuncs and > such as they are for now. 
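A small sketch of the existing ufunc out= behaviour referred to in the quoted text (np.add hands back the array it was asked to write into):

import numpy as np

a = np.arange(3)
b = np.ones(3, dtype=a.dtype)

c = np.add(a, b)            # out of place: allocates and returns a fresh array
d = np.add(a, b, out=a)     # in place: the result is written into a ...
print(d is a)               # ... and that same array is returned (True)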
> > -n > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From njs at pobox.com Thu Jan 17 13:29:08 2013 From: njs at pobox.com (Nathaniel Smith) Date: Thu, 17 Jan 2013 18:29:08 +0000 Subject: [Numpy-discussion] Shouldn't all in-place operations simply return self? In-Reply-To: <50F83E24.9030808@astro.uio.no> References: <50F8027B.5040301@noaa.gov> <50F80B82.9080606@gmail.com> <50F83E24.9030808@astro.uio.no> Message-ID: On Thu, Jan 17, 2013 at 6:08 PM, Dag Sverre Seljebotn < d.s.seljebotn at astro.uio.no> wrote: > In addition to the verb tense, I think it's important that mutators are > methods whereas functions do not mutate their arguments: > > lst.sort() > sorted(lst) Unfortunately this isn't really viable in a language like Python where you can't add methods to a class. (list.sort() versus sorted() has as much or more to do with the fact that sort's implementation only works on lists, while sorted takes an arbitrary iterable.) Even core python provides a function for in-place list randomization, not a method. Following the proposed rule would just mean that we couldn't provide in-place shuffles at all, which is clearly not going to be acceptable. -n -------------- next part -------------- An HTML attachment was scrubbed... URL: From mwwiebe at gmail.com Thu Jan 17 14:04:24 2013 From: mwwiebe at gmail.com (Mark Wiebe) Date: Thu, 17 Jan 2013 11:04:24 -0800 Subject: [Numpy-discussion] memory leak in 1.7 Message-ID: I've tracked down and fixed a memory leak in 1.7 and master. The pull request to check and backport is here: https://github.com/numpy/numpy/pull/2928 Thanks, Mark -------------- next part -------------- An HTML attachment was scrubbed... URL: From efiring at hawaii.edu Thu Jan 17 17:04:43 2013 From: efiring at hawaii.edu (Eric Firing) Date: Thu, 17 Jan 2013 12:04:43 -1000 Subject: [Numpy-discussion] New numpy functions: filled, filled_like In-Reply-To: <50F80711.9010204@crans.org> References: <50F4400F.4040709@hawaii.edu> <50F44A95.2030202@crans.org> <50F80711.9010204@crans.org> Message-ID: <50F8757B.7060008@hawaii.edu> On 2013/01/17 4:13 AM, Pierre Haessig wrote: > Hi, > > Le 14/01/2013 20:05, Benjamin Root a ?crit : >> I do like the way you are thinking in terms of the broadcasting >> semantics, but I wonder if that is a bit awkward. What I mean is, if >> one were to use broadcasting semantics for creating an array, wouldn't >> one have just simply used broadcasting anyway? The point of >> broadcasting is to _avoid_ the creation of unneeded arrays. But maybe >> I can be convinced with some examples. > > I feel that one of the point of the discussion is : although a new (or > not so new...) function to create a filled array would be more elegant > than the existing pair of functions "np.zeros" and "np.ones", there are > maybe not so many usecases for filled arrays *other than zeros values*. > > I can remember having initialized a non-zero array *some months ago*. > For the anecdote it was a vector of discretized vehicule speed values > which I wanted to be initialized with a predefined mean speed value > prior to some optimization. In that usecase, I really didn't care about > the performance of this initialization step. > > So my overall feeling after this thread is > - *yes* a single dedicated fill/init/someverb function would give a > slightly better API, > - but *no* it's not important because np.empty and np.zeros covers 95 > % usecases ! 
I agree with your summary and conclusion. Eric > > best, > Pierre > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From ben.root at ou.edu Thu Jan 17 17:10:14 2013 From: ben.root at ou.edu (Benjamin Root) Date: Thu, 17 Jan 2013 17:10:14 -0500 Subject: [Numpy-discussion] New numpy functions: filled, filled_like In-Reply-To: <50F8757B.7060008@hawaii.edu> References: <50F4400F.4040709@hawaii.edu> <50F44A95.2030202@crans.org> <50F80711.9010204@crans.org> <50F8757B.7060008@hawaii.edu> Message-ID: On Thu, Jan 17, 2013 at 5:04 PM, Eric Firing wrote: > On 2013/01/17 4:13 AM, Pierre Haessig wrote: > > Hi, > > > > Le 14/01/2013 20:05, Benjamin Root a ?crit : > >> I do like the way you are thinking in terms of the broadcasting > >> semantics, but I wonder if that is a bit awkward. What I mean is, if > >> one were to use broadcasting semantics for creating an array, wouldn't > >> one have just simply used broadcasting anyway? The point of > >> broadcasting is to _avoid_ the creation of unneeded arrays. But maybe > >> I can be convinced with some examples. > > > > I feel that one of the point of the discussion is : although a new (or > > not so new...) function to create a filled array would be more elegant > > than the existing pair of functions "np.zeros" and "np.ones", there are > > maybe not so many usecases for filled arrays *other than zeros values*. > > > > I can remember having initialized a non-zero array *some months ago*. > > For the anecdote it was a vector of discretized vehicule speed values > > which I wanted to be initialized with a predefined mean speed value > > prior to some optimization. In that usecase, I really didn't care about > > the performance of this initialization step. > > > > So my overall feeling after this thread is > > - *yes* a single dedicated fill/init/someverb function would give a > > slightly better API, > > - but *no* it's not important because np.empty and np.zeros covers 95 > > % usecases ! > > I agree with your summary and conclusion. > > Eric > > Can we at least have a np.nans() and np.infs() functions? This should cover an additional 4% of use-cases. Ben Root P.S. - I know they aren't verbs... -------------- next part -------------- An HTML attachment was scrubbed... URL: From thouis at gmail.com Thu Jan 17 17:13:44 2013 From: thouis at gmail.com (Thouis (Ray) Jones) Date: Thu, 17 Jan 2013 17:13:44 -0500 Subject: [Numpy-discussion] Shouldn't all in-place operations simply return self? In-Reply-To: References: Message-ID: On Thu, Jan 17, 2013 at 10:27 AM, Charles R Harris wrote: > > > On Wed, Jan 16, 2013 at 5:11 PM, eat wrote: >> >> Hi, >> >> In a recent thread >> http://article.gmane.org/gmane.comp.python.numeric.general/52772 it was >> proposed that .fill(.) should return self as an alternative for a trivial >> two-liner. >> >> I'm raising now the question: what if all in-place operations indeed could >> return self? How bad this would be? A 'strong' counter argument may be found >> at http://mail.python.org/pipermail/python-dev/2003-October/038855.html. >> >> But anyway, at least for me. it would be much more straightforward to >> implement simple mini dsl's >> (http://en.wikipedia.org/wiki/Domain-specific_language) a much more >> straightforward manner. >> >> What do you think? >> > > I've read Guido about why he didn't like inplace operations returning self > and found him convincing for a while. 
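The two styles at issue, in miniature (the chained form is hypothetical; today ndarray.fill() returns None):

import numpy as np

# statement style (current behaviour): in-place methods return None
a = np.empty(5)
a.fill(0.5)

# "freight train" / chained style (hypothetical, only if fill() returned self):
# b = np.empty(5).fill(0.5)
# as things stand this binds b to None, which is the pitfall behind the convention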
And then I listened to other folks > express a preference for the freight train style and found them convincing > also. I think it comes down to a preference for one style over another and I > go back and forth myself. If I had to vote, I'd go for returning self, but > I'm not sure it's worth breaking python conventions to do so. > > Chuck I'm -1 on breaking with Python convention without very good reasons. Ray From matthew.brett at gmail.com Thu Jan 17 17:23:50 2013 From: matthew.brett at gmail.com (Matthew Brett) Date: Thu, 17 Jan 2013 22:23:50 +0000 Subject: [Numpy-discussion] New numpy functions: filled, filled_like In-Reply-To: References: <50F4400F.4040709@hawaii.edu> <50F44A95.2030202@crans.org> <50F80711.9010204@crans.org> <50F8757B.7060008@hawaii.edu> Message-ID: Hi, On Thu, Jan 17, 2013 at 10:10 PM, Benjamin Root wrote: > > > On Thu, Jan 17, 2013 at 5:04 PM, Eric Firing wrote: >> >> On 2013/01/17 4:13 AM, Pierre Haessig wrote: >> > Hi, >> > >> > Le 14/01/2013 20:05, Benjamin Root a ?crit : >> >> I do like the way you are thinking in terms of the broadcasting >> >> semantics, but I wonder if that is a bit awkward. What I mean is, if >> >> one were to use broadcasting semantics for creating an array, wouldn't >> >> one have just simply used broadcasting anyway? The point of >> >> broadcasting is to _avoid_ the creation of unneeded arrays. But maybe >> >> I can be convinced with some examples. >> > >> > I feel that one of the point of the discussion is : although a new (or >> > not so new...) function to create a filled array would be more elegant >> > than the existing pair of functions "np.zeros" and "np.ones", there are >> > maybe not so many usecases for filled arrays *other than zeros values*. >> > >> > I can remember having initialized a non-zero array *some months ago*. >> > For the anecdote it was a vector of discretized vehicule speed values >> > which I wanted to be initialized with a predefined mean speed value >> > prior to some optimization. In that usecase, I really didn't care about >> > the performance of this initialization step. >> > >> > So my overall feeling after this thread is >> > - *yes* a single dedicated fill/init/someverb function would give a >> > slightly better API, >> > - but *no* it's not important because np.empty and np.zeros covers 95 >> > % usecases ! >> >> I agree with your summary and conclusion. >> >> Eric >> > > Can we at least have a np.nans() and np.infs() functions? This should cover > an additional 4% of use-cases. I'm a -0.5 on the new functions, just because they only save one line of code, and the use-case is fairly rare in my experience.. Cheers, Matthew From mwwiebe at gmail.com Thu Jan 17 17:27:13 2013 From: mwwiebe at gmail.com (Mark Wiebe) Date: Thu, 17 Jan 2013 14:27:13 -0800 Subject: [Numpy-discussion] New numpy functions: filled, filled_like In-Reply-To: References: <50F4400F.4040709@hawaii.edu> <50F44A95.2030202@crans.org> <50F80711.9010204@crans.org> <50F8757B.7060008@hawaii.edu> Message-ID: On Thu, Jan 17, 2013 at 2:10 PM, Benjamin Root wrote: > > > On Thu, Jan 17, 2013 at 5:04 PM, Eric Firing wrote: > >> On 2013/01/17 4:13 AM, Pierre Haessig wrote: >> > Hi, >> > >> > Le 14/01/2013 20:05, Benjamin Root a ?crit : >> >> I do like the way you are thinking in terms of the broadcasting >> >> semantics, but I wonder if that is a bit awkward. What I mean is, if >> >> one were to use broadcasting semantics for creating an array, wouldn't >> >> one have just simply used broadcasting anyway? 
The point of >> >> broadcasting is to _avoid_ the creation of unneeded arrays. But maybe >> >> I can be convinced with some examples. >> > >> > I feel that one of the point of the discussion is : although a new (or >> > not so new...) function to create a filled array would be more elegant >> > than the existing pair of functions "np.zeros" and "np.ones", there are >> > maybe not so many usecases for filled arrays *other than zeros values*. >> > >> > I can remember having initialized a non-zero array *some months ago*. >> > For the anecdote it was a vector of discretized vehicule speed values >> > which I wanted to be initialized with a predefined mean speed value >> > prior to some optimization. In that usecase, I really didn't care about >> > the performance of this initialization step. >> > >> > So my overall feeling after this thread is >> > - *yes* a single dedicated fill/init/someverb function would give a >> > slightly better API, >> > - but *no* it's not important because np.empty and np.zeros covers 95 >> > % usecases ! >> >> I agree with your summary and conclusion. >> >> Eric >> >> > Can we at least have a np.nans() and np.infs() functions? This should > cover an additional 4% of use-cases. > > Ben Root > > P.S. - I know they aren't verbs... > Would it be too weird or clumsy to extend the empty and empty_like functions to do the filling? np.empty((10, 10), fill=np.nan) np.empty_like(my_arr, fill=np.nan) -Mark > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From matthew.brett at gmail.com Thu Jan 17 17:31:04 2013 From: matthew.brett at gmail.com (Matthew Brett) Date: Thu, 17 Jan 2013 22:31:04 +0000 Subject: [Numpy-discussion] New numpy functions: filled, filled_like In-Reply-To: References: <50F4400F.4040709@hawaii.edu> <50F44A95.2030202@crans.org> <50F80711.9010204@crans.org> <50F8757B.7060008@hawaii.edu> Message-ID: Hi, On Thu, Jan 17, 2013 at 10:27 PM, Mark Wiebe wrote: > > On Thu, Jan 17, 2013 at 2:10 PM, Benjamin Root wrote: >> >> >> >> On Thu, Jan 17, 2013 at 5:04 PM, Eric Firing wrote: >>> >>> On 2013/01/17 4:13 AM, Pierre Haessig wrote: >>> > Hi, >>> > >>> > Le 14/01/2013 20:05, Benjamin Root a ?crit : >>> >> I do like the way you are thinking in terms of the broadcasting >>> >> semantics, but I wonder if that is a bit awkward. What I mean is, if >>> >> one were to use broadcasting semantics for creating an array, wouldn't >>> >> one have just simply used broadcasting anyway? The point of >>> >> broadcasting is to _avoid_ the creation of unneeded arrays. But maybe >>> >> I can be convinced with some examples. >>> > >>> > I feel that one of the point of the discussion is : although a new (or >>> > not so new...) function to create a filled array would be more elegant >>> > than the existing pair of functions "np.zeros" and "np.ones", there are >>> > maybe not so many usecases for filled arrays *other than zeros values*. >>> > >>> > I can remember having initialized a non-zero array *some months ago*. >>> > For the anecdote it was a vector of discretized vehicule speed values >>> > which I wanted to be initialized with a predefined mean speed value >>> > prior to some optimization. In that usecase, I really didn't care about >>> > the performance of this initialization step. 
>>> > >>> > So my overall feeling after this thread is >>> > - *yes* a single dedicated fill/init/someverb function would give a >>> > slightly better API, >>> > - but *no* it's not important because np.empty and np.zeros covers >>> > 95 >>> > % usecases ! >>> >>> I agree with your summary and conclusion. >>> >>> Eric >>> >> >> Can we at least have a np.nans() and np.infs() functions? This should >> cover an additional 4% of use-cases. >> >> Ben Root >> >> P.S. - I know they aren't verbs... > > > Would it be too weird or clumsy to extend the empty and empty_like functions > to do the filling? > > np.empty((10, 10), fill=np.nan) > np.empty_like(my_arr, fill=np.nan) That sounds like a good idea to me. Someone wanting a fast way to fill an array will probably check out the 'empty' docstring first. See you, Matthew From shish at keba.be Thu Jan 17 20:01:26 2013 From: shish at keba.be (Olivier Delalleau) Date: Thu, 17 Jan 2013 20:01:26 -0500 Subject: [Numpy-discussion] New numpy functions: filled, filled_like In-Reply-To: References: <50F4400F.4040709@hawaii.edu> <50F44A95.2030202@crans.org> <50F80711.9010204@crans.org> <50F8757B.7060008@hawaii.edu> Message-ID: 2013/1/17 Matthew Brett : > Hi, > > On Thu, Jan 17, 2013 at 10:27 PM, Mark Wiebe wrote: >> >> On Thu, Jan 17, 2013 at 2:10 PM, Benjamin Root wrote: >>> >>> >>> >>> On Thu, Jan 17, 2013 at 5:04 PM, Eric Firing wrote: >>>> >>>> On 2013/01/17 4:13 AM, Pierre Haessig wrote: >>>> > Hi, >>>> > >>>> > Le 14/01/2013 20:05, Benjamin Root a ?crit : >>>> >> I do like the way you are thinking in terms of the broadcasting >>>> >> semantics, but I wonder if that is a bit awkward. What I mean is, if >>>> >> one were to use broadcasting semantics for creating an array, wouldn't >>>> >> one have just simply used broadcasting anyway? The point of >>>> >> broadcasting is to _avoid_ the creation of unneeded arrays. But maybe >>>> >> I can be convinced with some examples. >>>> > >>>> > I feel that one of the point of the discussion is : although a new (or >>>> > not so new...) function to create a filled array would be more elegant >>>> > than the existing pair of functions "np.zeros" and "np.ones", there are >>>> > maybe not so many usecases for filled arrays *other than zeros values*. >>>> > >>>> > I can remember having initialized a non-zero array *some months ago*. >>>> > For the anecdote it was a vector of discretized vehicule speed values >>>> > which I wanted to be initialized with a predefined mean speed value >>>> > prior to some optimization. In that usecase, I really didn't care about >>>> > the performance of this initialization step. >>>> > >>>> > So my overall feeling after this thread is >>>> > - *yes* a single dedicated fill/init/someverb function would give a >>>> > slightly better API, >>>> > - but *no* it's not important because np.empty and np.zeros covers >>>> > 95 >>>> > % usecases ! >>>> >>>> I agree with your summary and conclusion. >>>> >>>> Eric >>>> >>> >>> Can we at least have a np.nans() and np.infs() functions? This should >>> cover an additional 4% of use-cases. >>> >>> Ben Root >>> >>> P.S. - I know they aren't verbs... >> >> >> Would it be too weird or clumsy to extend the empty and empty_like functions >> to do the filling? >> >> np.empty((10, 10), fill=np.nan) >> np.empty_like(my_arr, fill=np.nan) > > That sounds like a good idea to me. Someone wanting a fast way to > fill an array will probably check out the 'empty' docstring first. > > See you, > > Matthew +1 from me. 
Even though it *is* weird to have both "empty" and "fill" ;) -=- Olivier From chris.barker at noaa.gov Thu Jan 17 20:04:15 2013 From: chris.barker at noaa.gov (Chris Barker - NOAA Federal) Date: Thu, 17 Jan 2013 17:04:15 -0800 Subject: [Numpy-discussion] Do we want scalar casting to behave as it does at the moment? In-Reply-To: References: <50EDB1D4.5090909@astro.uio.no> Message-ID: On Thu, Jan 17, 2013 at 6:26 AM, Matthew Brett wrote: > I am starting to wonder if we should aim for making > > * scalar and array casting rules the same; > * Python int / float scalars become int32 / 64 or float64; aren't they already? I'm not sure what you are proposing. > This has the benefit of being very easy to understand and explain. It > makes dtypes predictable in the sense they don't depend on value. That is key -- I don't think casting should ever depend on value. > Those wanting to maintain - say - float32 will need to cast scalars to float32. > > Maybe the use-cases motivating the scalar casting rules - maintaining > float32 precision in particular - can be dealt with by careful casting > of scalars, throwing the burden onto the memory-conscious to maintain > their dtypes. IIRC this is how it worked "back in the day" (the Numeric day? -- and I'm pretty sure that in the long run it worked out badly. the core problem is that there are only python literals for a couple types, and it was oh so easy to do things like: my_arr = np,zeros(shape, dtype-float32) another_array = my_array * 4.0 and you'd suddenly get a float64 array. (of course, we already know all that..) I suppose this has the up side of being safe, and having scalar and array casting rules be the same is of course appealing, but you use a particular size dtype for a reason,and it's a real pain to maintain it. Casual users will use the defaults that match the Python types anyway. So in the in the spirit of "practicality beats purity" -- I"d like accidental upcasting to be hard to do. -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From chris.barker at noaa.gov Thu Jan 17 20:05:54 2013 From: chris.barker at noaa.gov (Chris Barker - NOAA Federal) Date: Thu, 17 Jan 2013 17:05:54 -0800 Subject: [Numpy-discussion] Do we want scalar casting to behave as it does at the moment? In-Reply-To: References: <50EDB1D4.5090909@astro.uio.no> Message-ID: On Thu, Jan 17, 2013 at 5:04 PM, Chris Barker - NOAA Federal wrote: > So in the in the spirit of "practicality beats purity" -- I"d like > accidental upcasting to be hard to do. and then: arr = arr + scalar would yield the same type as: arr += scalar so we buy some consistency! -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From matthew.brett at gmail.com Thu Jan 17 20:18:14 2013 From: matthew.brett at gmail.com (Matthew Brett) Date: Fri, 18 Jan 2013 01:18:14 +0000 Subject: [Numpy-discussion] Do we want scalar casting to behave as it does at the moment? 
In-Reply-To: References: <50EDB1D4.5090909@astro.uio.no> Message-ID: Hi, On Fri, Jan 18, 2013 at 1:04 AM, Chris Barker - NOAA Federal wrote: > On Thu, Jan 17, 2013 at 6:26 AM, Matthew Brett wrote: > >> I am starting to wonder if we should aim for making >> >> * scalar and array casting rules the same; >> * Python int / float scalars become int32 / 64 or float64; > > aren't they already? I'm not sure what you are proposing. Sorry - yes that is what they are already, this sentence refers back to an earlier suggestion of mine on the thread, which I am discarding. >> This has the benefit of being very easy to understand and explain. It >> makes dtypes predictable in the sense they don't depend on value. > > That is key -- I don't think casting should ever depend on value. > >> Those wanting to maintain - say - float32 will need to cast scalars to float32. >> >> Maybe the use-cases motivating the scalar casting rules - maintaining >> float32 precision in particular - can be dealt with by careful casting >> of scalars, throwing the burden onto the memory-conscious to maintain >> their dtypes. > > IIRC this is how it worked "back in the day" (the Numeric day? -- and > I'm pretty sure that in the long run it worked out badly. the core > problem is that there are only python literals for a couple types, and > it was oh so easy to do things like: > > my_arr = np,zeros(shape, dtype-float32) > > another_array = my_array * 4.0 > > and you'd suddenly get a float64 array. (of course, we already know > all that..) I suppose this has the up side of being safe, and having > scalar and array casting rules be the same is of course appealing, but > you use a particular size dtype for a reason,and it's a real pain to > maintain it. Yes, I do understand that. The difference - as I understand it - is that back in the day, numeric did not have the the float32 etc scalars, so you could not do: another_array = my_array * np.float32(4.0) (please someone correct me if I'm wrong). > Casual users will use the defaults that match the Python types anyway. I think what we are reading in this thread is that even experienced numpy users can find the scalar casting rules surprising, and that's a real problem, it seems to me. The person with a massive float32 array certainly should have the ability to control upcasting, but I think the default should be the least surprising thing, and that, it seems to me, is for the casting rules to be the same for arrays and scalars. In the very long term. Cheers, Matthew From shish at keba.be Thu Jan 17 20:19:50 2013 From: shish at keba.be (Olivier Delalleau) Date: Thu, 17 Jan 2013 20:19:50 -0500 Subject: [Numpy-discussion] Casting Bug or a "Feature"? In-Reply-To: References: Message-ID: 2013/1/16 : > On Wed, Jan 16, 2013 at 10:43 PM, Patrick Marsh > wrote: >> Thanks, everyone for chiming in. Now that I know this behavior exists, I >> can explicitly prevent it in my code. However, it would be nice if a warning >> or something was generated to alert users about the inconsistency between >> var += ... and var = var + ... > > Since I also got bitten by this recently in my code, I fully agree. > I could live with an exception for lossy down casting in this case. About exceptions: someone mentioned in another thread about casting how having exceptions can make it difficult to write code. I've thought a bit more about this issue and I tend to agree, especially on code that used to "work" (in the sense of doing something -- not necessarily what you'd want -- without complaining). 
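For concreteness, the inconsistency behind this sub-thread in two lines (the exact behaviour of the in-place form depends on the NumPy version, which is precisely the warning-versus-exception question):

import numpy as np

a = np.arange(3)     # integer array

b = a + 0.5          # out of place: promotes, b.dtype is float64
# a += 0.5           # in place: the float result must be squeezed back into a's
#                    # integer dtype -- silently truncated in older releases,
#                    # a casting error in later ones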
Don't get me wrong, when I write code I love when a library crashes and forces me to be more explicit about what I want, thus saving me the trouble of hunting down a tricky overflow / casting bug. However, in a production environment for instance, such an unexpected crash could have much worse consequences than an incorrect output. And although you may blame the programmer for not being careful enough about types, he couldn't expect it might crash the application back when this code was written.... Long story short, +1 for warning, -1 for exception, and +1 for a config flag that allows one to change to exceptions by default, if desired. -=- Olivier From stsci.perry at gmail.com Thu Jan 17 20:24:18 2013 From: stsci.perry at gmail.com (Perry Greenfield) Date: Thu, 17 Jan 2013 20:24:18 -0500 Subject: [Numpy-discussion] Do we want scalar casting to behave as it does at the moment? In-Reply-To: References: <50EDB1D4.5090909@astro.uio.no> Message-ID: I'd like to echo what Chris is saying. It was a big annoyance with Numeric to make it so hard to preserve the array type in ordinary expressions. Perry On Jan 17, 2013, at 8:04 PM, Chris Barker - NOAA Federal wrote: > On Thu, Jan 17, 2013 at 6:26 AM, Matthew Brett wrote: > >> I am starting to wonder if we should aim for making >> >> * scalar and array casting rules the same; >> * Python int / float scalars become int32 / 64 or float64; > > aren't they already? I'm not sure what you are proposing. > >> This has the benefit of being very easy to understand and explain. It >> makes dtypes predictable in the sense they don't depend on value. > > That is key -- I don't think casting should ever depend on value. > >> Those wanting to maintain - say - float32 will need to cast scalars to float32. >> >> Maybe the use-cases motivating the scalar casting rules - maintaining >> float32 precision in particular - can be dealt with by careful casting >> of scalars, throwing the burden onto the memory-conscious to maintain >> their dtypes. > > IIRC this is how it worked "back in the day" (the Numeric day? -- and > I'm pretty sure that in the long run it worked out badly. the core > problem is that there are only python literals for a couple types, and > it was oh so easy to do things like: > > my_arr = np,zeros(shape, dtype-float32) > > another_array = my_array * 4.0 > > and you'd suddenly get a float64 array. (of course, we already know > all that..) I suppose this has the up side of being safe, and having > scalar and array casting rules be the same is of course appealing, but > you use a particular size dtype for a reason,and it's a real pain to > maintain it. > > Casual users will use the defaults that match the Python types anyway. > > So in the in the spirit of "practicality beats purity" -- I"d like > accidental upcasting to be hard to do. > > -Chris > > -- > > Christopher Barker, Ph.D. > Oceanographer > > Emergency Response Division > NOAA/NOS/OR&R (206) 526-6959 voice > 7600 Sand Point Way NE (206) 526-6329 fax > Seattle, WA 98115 (206) 526-6317 main reception > > Chris.Barker at noaa.gov > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From shish at keba.be Thu Jan 17 20:34:13 2013 From: shish at keba.be (Olivier Delalleau) Date: Thu, 17 Jan 2013 20:34:13 -0500 Subject: [Numpy-discussion] Do we want scalar casting to behave as it does at the moment? 
In-Reply-To: References: <50EDB1D4.5090909@astro.uio.no> Message-ID: 2013/1/17 Matthew Brett : > Hi, > > On Fri, Jan 18, 2013 at 1:04 AM, Chris Barker - NOAA Federal > wrote: >> On Thu, Jan 17, 2013 at 6:26 AM, Matthew Brett wrote: >> >>> I am starting to wonder if we should aim for making >>> >>> * scalar and array casting rules the same; >>> * Python int / float scalars become int32 / 64 or float64; >> >> aren't they already? I'm not sure what you are proposing. > > Sorry - yes that is what they are already, this sentence refers back > to an earlier suggestion of mine on the thread, which I am discarding. > >>> This has the benefit of being very easy to understand and explain. It >>> makes dtypes predictable in the sense they don't depend on value. >> >> That is key -- I don't think casting should ever depend on value. >> >>> Those wanting to maintain - say - float32 will need to cast scalars to float32. >>> >>> Maybe the use-cases motivating the scalar casting rules - maintaining >>> float32 precision in particular - can be dealt with by careful casting >>> of scalars, throwing the burden onto the memory-conscious to maintain >>> their dtypes. >> >> IIRC this is how it worked "back in the day" (the Numeric day? -- and >> I'm pretty sure that in the long run it worked out badly. the core >> problem is that there are only python literals for a couple types, and >> it was oh so easy to do things like: >> >> my_arr = np,zeros(shape, dtype-float32) >> >> another_array = my_array * 4.0 >> >> and you'd suddenly get a float64 array. (of course, we already know >> all that..) I suppose this has the up side of being safe, and having >> scalar and array casting rules be the same is of course appealing, but >> you use a particular size dtype for a reason,and it's a real pain to >> maintain it. > > Yes, I do understand that. The difference - as I understand it - is > that back in the day, numeric did not have the the float32 etc > scalars, so you could not do: > > another_array = my_array * np.float32(4.0) > > (please someone correct me if I'm wrong). > >> Casual users will use the defaults that match the Python types anyway. > > I think what we are reading in this thread is that even experienced > numpy users can find the scalar casting rules surprising, and that's a > real problem, it seems to me. > > The person with a massive float32 array certainly should have the > ability to control upcasting, but I think the default should be the > least surprising thing, and that, it seems to me, is for the casting > rules to be the same for arrays and scalars. In the very long term. That would also be my preference, after banging my head against this problem for a while now, because it's simple and consistent. Since most of the related issues seem to come from integer arrays, a middle-ground may be the following: - Integer-type arrays get upcasted by scalars as in usual array / array operations. - Float/Complex-type arrays don't get upcasted by scalars except when the scalar is complex and the array is float. It makes the rule a bit more complex, but has the advantage of better preserving float types while getting rid of most issues related to integer overflows. 
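A tiny sketch of the float32-preservation point (dtype outcomes as discussed in this thread; the explicit scalar cast is the defensive idiom being recommended):

import numpy as np

a = np.zeros(3, dtype=np.float32)

print((a * 4.0).dtype)               # float32: a Python float scalar does not upcast
print((a * np.float32(4.0)).dtype)   # float32: explicit scalar cast, same result, clearer intent
print((a + np.zeros(3)).dtype)       # float64: a float64 *array* does upcast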
-=- Olivier From thouis at gmail.com Thu Jan 17 23:05:14 2013 From: thouis at gmail.com (Thouis (Ray) Jones) Date: Thu, 17 Jan 2013 23:05:14 -0500 Subject: [Numpy-discussion] New numpy functions: filled, filled_like In-Reply-To: References: <50F4400F.4040709@hawaii.edu> <50F44A95.2030202@crans.org> <50F80711.9010204@crans.org> <50F8757B.7060008@hawaii.edu> Message-ID: On Jan 17, 2013 8:01 PM, "Olivier Delalleau" wrote: > > 2013/1/17 Matthew Brett : > > Hi, > > > > On Thu, Jan 17, 2013 at 10:27 PM, Mark Wiebe wrote: > >> > >> On Thu, Jan 17, 2013 at 2:10 PM, Benjamin Root wrote: > >>> > >>> > >>> > >>> On Thu, Jan 17, 2013 at 5:04 PM, Eric Firing wrote: > >>>> > >>>> On 2013/01/17 4:13 AM, Pierre Haessig wrote: > >>>> > Hi, > >>>> > > >>>> > Le 14/01/2013 20:05, Benjamin Root a ?crit : > >>>> >> I do like the way you are thinking in terms of the broadcasting > >>>> >> semantics, but I wonder if that is a bit awkward. What I mean is, if > >>>> >> one were to use broadcasting semantics for creating an array, wouldn't > >>>> >> one have just simply used broadcasting anyway? The point of > >>>> >> broadcasting is to _avoid_ the creation of unneeded arrays. But maybe > >>>> >> I can be convinced with some examples. > >>>> > > >>>> > I feel that one of the point of the discussion is : although a new (or > >>>> > not so new...) function to create a filled array would be more elegant > >>>> > than the existing pair of functions "np.zeros" and "np.ones", there are > >>>> > maybe not so many usecases for filled arrays *other than zeros values*. > >>>> > > >>>> > I can remember having initialized a non-zero array *some months ago*. > >>>> > For the anecdote it was a vector of discretized vehicule speed values > >>>> > which I wanted to be initialized with a predefined mean speed value > >>>> > prior to some optimization. In that usecase, I really didn't care about > >>>> > the performance of this initialization step. > >>>> > > >>>> > So my overall feeling after this thread is > >>>> > - *yes* a single dedicated fill/init/someverb function would give a > >>>> > slightly better API, > >>>> > - but *no* it's not important because np.empty and np.zeros covers > >>>> > 95 > >>>> > % usecases ! > >>>> > >>>> I agree with your summary and conclusion. > >>>> > >>>> Eric > >>>> > >>> > >>> Can we at least have a np.nans() and np.infs() functions? This should > >>> cover an additional 4% of use-cases. > >>> > >>> Ben Root > >>> > >>> P.S. - I know they aren't verbs... > >> > >> > >> Would it be too weird or clumsy to extend the empty and empty_like functions > >> to do the filling? > >> > >> np.empty((10, 10), fill=np.nan) > >> np.empty_like(my_arr, fill=np.nan) > > > > That sounds like a good idea to me. Someone wanting a fast way to > > fill an array will probably check out the 'empty' docstring first. > > > > See you, > > > > Matthew > > +1 from me. Even though it *is* weird to have both "empty" and "fill" ;) I'd almost prefer such a keyword be added to np.ones() to avoid that weirdness. (something like "an array of ones where one equals X") Ray -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris.barker at noaa.gov Fri Jan 18 01:23:10 2013 From: chris.barker at noaa.gov (Chris Barker - NOAA Federal) Date: Thu, 17 Jan 2013 22:23:10 -0800 Subject: [Numpy-discussion] Casting Bug or a "Feature"? 
In-Reply-To: References: Message-ID: On Thu, Jan 17, 2013 at 5:19 PM, Olivier Delalleau wrote: > 2013/1/16 : >> On Wed, Jan 16, 2013 at 10:43 PM, Patrick Marsh >> wrote: >> I could live with an exception for lossy down casting in this case. I'm not sure what the idea here is -- would you only get an exception if the value was such that the downcast would be lossy? If so, a major -1 The other option would be to always raise an exception if types would cause a downcast, i.e: arr = np.zeros(shape, dtype-uint8) arr2 = arr + 30 # this would raise an exception arr2 = arr + np.uint8(30) # you'd have to do this That sure would be clear and result if few errors of this type, but sure seems verbose and "static language like" to me. > Long story short, +1 for warning, -1 for exception, and +1 for a > config flag that allows one to change to exceptions by default, if > desired. is this for value-dependent or any casting of this sort? -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From scott.sinclair.za at gmail.com Fri Jan 18 01:32:06 2013 From: scott.sinclair.za at gmail.com (Scott Sinclair) Date: Fri, 18 Jan 2013 08:32:06 +0200 Subject: [Numpy-discussion] Fwd: numpy test fails with "Illegal instruction' In-Reply-To: References: Message-ID: On 17 January 2013 16:59, Gerhard Burger wrote: > Solved it, did a backtrace with gdb and the error came somewhere from an old > lapack version that was installed on my machine (I thought I wouldn't have > these issues in a virtualenv). but anyway after I removed it, and installed > numpy again, it ran without problems! Virtualenv only creates an isolated Python install, it doesn't trick the Numpy build process into ignoring system libraries like LAPACK, ATLAS etc. Glad it's fixed. Cheers, Scott From chris.barker at noaa.gov Fri Jan 18 01:32:55 2013 From: chris.barker at noaa.gov (Chris Barker - NOAA Federal) Date: Thu, 17 Jan 2013 22:32:55 -0800 Subject: [Numpy-discussion] Do we want scalar casting to behave as it does at the moment? In-Reply-To: References: <50EDB1D4.5090909@astro.uio.no> Message-ID: On Thu, Jan 17, 2013 at 5:34 PM, Olivier Delalleau wrote: >> Yes, I do understand that. The difference - as I understand it - is >> that back in the day, numeric did not have the the float32 etc >> scalars, so you could not do: >> >> another_array = my_array * np.float32(4.0) >> >> (please someone correct me if I'm wrong). correct, it didn't have any scalars, but you could (and had to) still do something like: another_array = my_array * np.array(4.0, dtype=np.float32) a bit more verbose, but the verbosity wasn't the key issue -- it was doing anything special at all. >>> Casual users will use the defaults that match the Python types anyway. >> >> I think what we are reading in this thread is that even experienced >> numpy users can find the scalar casting rules surprising, and that's a >> real problem, it seems to me. for sure -- but it's still relevant -- if you want non-default types, you need to understand the rules an be more careful. >> The person with a massive float32 array certainly should have the >> ability to control upcasting, but I think the default should be the >> least surprising thing, and that, it seems to me, is for the casting >> rules to be the same for arrays and scalars. In the very long term. "A foolish consistency is the hobgoblin of little minds" -- just kidding. 
But in all seriousness -- accidental upcasting really was a big old pain back in the day -- we are not making this up. We re using the term "least surprising", but I now I was often surprised that I had lost my nice compact array. The user will need to think about it no matter how you slice it. > Since most of the related issues seem to come from integer arrays, a > middle-ground may be the following: > - Integer-type arrays get upcasted by scalars as in usual array / > array operations. > - Float/Complex-type arrays don't get upcasted by scalars except when > the scalar is complex and the array is float. I'm not sure that integer arrays are any more of an an issue, and having integer types and float typed behave differently is really asking for trouble! -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From g.brandl at gmx.net Fri Jan 18 03:31:23 2013 From: g.brandl at gmx.net (Georg Brandl) Date: Fri, 18 Jan 2013 09:31:23 +0100 Subject: [Numpy-discussion] Casting Bug or a "Feature"? In-Reply-To: References: <50F7A4E7.7050608@gmail.com> Message-ID: <50F9085B.3010908@gmx.net> Am 17.01.2013 17:21, schrieb Chris Barker - NOAA Federal: > On Wed, Jan 16, 2013 at 11:34 PM, Matthieu Brucher > >> Of course a += b is not the same as a = a + b. The first one modifies the >> object a, the second one creates a new object and puts it inside a. The >> behavior IS consistent. > > Exactly -- if you ask me, the bug is that Python allows "in_place" > operators for immutable objects -- they should be more than syntactic > sugar. They are not -- the "+=" translation is well defined: the equivalents are a += b a = a.__iadd__(b) Now __iadd__ can choose to return self (for mutable objects) or a new object (for immutable objects). The confusion about immutables is simply the "usual" confusion about "=" assigning names, not variable space. > Of course, the temptation for += on regular numbers was just too much to resist. And probably 95% of the use of +=/-= *is* with regular numbers. Georg -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 198 bytes Desc: OpenPGP digital signature URL: From daniele at grinta.net Fri Jan 18 03:44:03 2013 From: daniele at grinta.net (Daniele Nicolodi) Date: Fri, 18 Jan 2013 09:44:03 +0100 Subject: [Numpy-discussion] New numpy functions: filled, filled_like In-Reply-To: References: <50F4400F.4040709@hawaii.edu> <50F44A95.2030202@crans.org> <50F80711.9010204@crans.org> <50F8757B.7060008@hawaii.edu> Message-ID: <50F90B53.4030000@grinta.net> On 17/01/2013 23:27, Mark Wiebe wrote: > Would it be too weird or clumsy to extend the empty and empty_like > functions to do the filling? > > np.empty((10, 10), fill=np.nan) > np.empty_like(my_arr, fill=np.nan) Wouldn't it be more natural to extend the ndarray constructor? np.ndarray((10, 10), fill=np.nan) It looks more natural to me. In this way it is not possible to have the _like extension, but I don't see it as a major drawback. Cheers, Daniele From shish at keba.be Fri Jan 18 07:26:57 2013 From: shish at keba.be (Olivier Delalleau) Date: Fri, 18 Jan 2013 07:26:57 -0500 Subject: [Numpy-discussion] Casting Bug or a "Feature"? 
In-Reply-To: References: Message-ID: Le vendredi 18 janvier 2013, Chris Barker - NOAA Federal a ?crit : > On Thu, Jan 17, 2013 at 5:19 PM, Olivier Delalleau > > wrote: > > 2013/1/16 >: > >> On Wed, Jan 16, 2013 at 10:43 PM, Patrick Marsh > >> > wrote: > > >> I could live with an exception for lossy down casting in this case. > > I'm not sure what the idea here is -- would you only get an exception > if the value was such that the downcast would be lossy? If so, a major > -1 > > The other option would be to always raise an exception if types would > cause a downcast, i.e: > > arr = np.zeros(shape, dtype-uint8) > > arr2 = arr + 30 # this would raise an exception > > arr2 = arr + np.uint8(30) # you'd have to do this > > That sure would be clear and result if few errors of this type, but > sure seems verbose and "static language like" to me. > > > Long story short, +1 for warning, -1 for exception, and +1 for a > > config flag that allows one to change to exceptions by default, if > > desired. > > is this for value-dependent or any casting of this sort? What I had in mind here is the situation where the scalar's dtype is fundamentally different from the array's dtype (i.e. float vs int, complex vs float) and can't be cast exactly into the array's dtype (so, value-dependent), which is the situation that originated this thread. I don't mind removing the second part ("and can't be cast exactly...") to have it value-independent. Other tricky situations with integer arrays are to some extent related to how regular (not in-place) additions are handled, something that should probably be settled first. -=- Olivier -------------- next part -------------- An HTML attachment was scrubbed... URL: From shish at keba.be Fri Jan 18 07:39:01 2013 From: shish at keba.be (Olivier Delalleau) Date: Fri, 18 Jan 2013 07:39:01 -0500 Subject: [Numpy-discussion] Do we want scalar casting to behave as it does at the moment? In-Reply-To: References: <50EDB1D4.5090909@astro.uio.no> Message-ID: Le vendredi 18 janvier 2013, Chris Barker - NOAA Federal a ?crit : > On Thu, Jan 17, 2013 at 5:34 PM, Olivier Delalleau > > wrote: > >> Yes, I do understand that. The difference - as I understand it - is > >> that back in the day, numeric did not have the the float32 etc > >> scalars, so you could not do: > >> > >> another_array = my_array * np.float32(4.0) > >> > >> (please someone correct me if I'm wrong). > > correct, it didn't have any scalars, but you could (and had to) still > do something like: > > another_array = my_array * np.array(4.0, dtype=np.float32) > > a bit more verbose, but the verbosity wasn't the key issue -- it was > doing anything special at all. > > >>> Casual users will use the defaults that match the Python types anyway. > >> > >> I think what we are reading in this thread is that even experienced > >> numpy users can find the scalar casting rules surprising, and that's a > >> real problem, it seems to me. > > for sure -- but it's still relevant -- if you want non-default types, > you need to understand the rules an be more careful. > > >> The person with a massive float32 array certainly should have the > >> ability to control upcasting, but I think the default should be the > >> least surprising thing, and that, it seems to me, is for the casting > >> rules to be the same for arrays and scalars. In the very long term. > > "A foolish consistency is the hobgoblin of little minds" > > -- just kidding. 
> > But in all seriousness -- accidental upcasting really was a big old > pain back in the day -- we are not making this up. We re using the > term "least surprising", but I now I was often surprised that I had > lost my nice compact array. > > The user will need to think about it no matter how you slice it. > > > Since most of the related issues seem to come from integer arrays, a > > middle-ground may be the following: > > - Integer-type arrays get upcasted by scalars as in usual array / > > array operations. > > - Float/Complex-type arrays don't get upcasted by scalars except when > > the scalar is complex and the array is float. > > I'm not sure that integer arrays are any more of an an issue, and > having integer types and float typed behave differently is really > asking for trouble! "A foolish consistency is the hobgoblin of little minds" :P If you check again the examples in this thread exhibiting surprising / unexpected behavior, you'll notice most of them are with integers. The tricky thing about integers is that downcasting can dramatically change your result. With floats, not so much: you get approximation errors (usually what you want) and the occasional nan / inf creeping in (usally noticeable). I too would prefer similar rules between ints & floats, but after all these discussions I'm starting to think it may be worth acknowledging they are different beasts. Anyway, in my mind we were discussing what might be the desired behavior in the long term, and my suggestion isn't practical in the short term since it may break a significant amount of code. So I'm still in favor of Nathaniel's proposal, except with exceptions replaced by warnings by default (and no warning for lossy downcasting of e.g. float64 -> float32 except for zero / inf, as discussed at some point in the thread). -=- Olivier -------------- next part -------------- An HTML attachment was scrubbed... URL: From pierre.haessig at crans.org Fri Jan 18 08:48:36 2013 From: pierre.haessig at crans.org (Pierre Haessig) Date: Fri, 18 Jan 2013 14:48:36 +0100 Subject: [Numpy-discussion] New numpy functions: filled, filled_like In-Reply-To: References: <50F4400F.4040709@hawaii.edu> <50F44A95.2030202@crans.org> <50F80711.9010204@crans.org> <50F8757B.7060008@hawaii.edu> Message-ID: <50F952B4.4070208@crans.org> Hi, Le 17/01/2013 23:31, Matthew Brett a ?crit : >> Would it be too weird or clumsy to extend the empty and empty_like functions >> >to do the filling? >> > >> >np.empty((10, 10), fill=np.nan) >> >np.empty_like(my_arr, fill=np.nan) > That sounds like a good idea to me. Someone wanting a fast way to > fill an array will probably check out the 'empty' docstring first. Oh, that sounds very good to me. There is indeed a bit of contradictions between "empty" and "fill" but maybe not that strong if we think of "empty" as a "void of actual information". (Especially true when the fill value is nan or inf, which, as Ben just mentionned are probably the most commonly used fill value after zero.) Maybe a keyword named "value" instead of "fill" may help soften the semantic opposition with "empty" ? 
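For concreteness, the fill= keyword discussed here is only a proposal and does not
exist in any released numpy; a minimal sketch of the two-liner it would be shorthand
for (the helper name `filled` below is purely illustrative):

import numpy as np

def filled(shape, fill_value, dtype=None):
    # what the proposed np.empty(shape, fill=fill_value) would amount to:
    # allocate without initialising, then fill in place
    a = np.empty(shape, dtype=dtype)
    a.fill(fill_value)
    return a

a = filled((10, 10), np.nan)   # stand-in for the proposed np.empty((10, 10), fill=np.nan)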
best, Pierre From ben.root at ou.edu Fri Jan 18 09:19:35 2013 From: ben.root at ou.edu (Benjamin Root) Date: Fri, 18 Jan 2013 09:19:35 -0500 Subject: [Numpy-discussion] New numpy functions: filled, filled_like In-Reply-To: <50F90B53.4030000@grinta.net> References: <50F4400F.4040709@hawaii.edu> <50F44A95.2030202@crans.org> <50F80711.9010204@crans.org> <50F8757B.7060008@hawaii.edu> <50F90B53.4030000@grinta.net> Message-ID: On Fri, Jan 18, 2013 at 3:44 AM, Daniele Nicolodi wrote: > On 17/01/2013 23:27, Mark Wiebe wrote: > > Would it be too weird or clumsy to extend the empty and empty_like > > functions to do the filling? > > > > np.empty((10, 10), fill=np.nan) > > np.empty_like(my_arr, fill=np.nan) > > Wouldn't it be more natural to extend the ndarray constructor? > > np.ndarray((10, 10), fill=np.nan) > > It looks more natural to me. In this way it is not possible to have the > _like extension, but I don't see it as a major drawback. > > > Cheers, > Daniele > > This isn't a bad idea. Although, I would wager that most people, like myself, use np.array() and np.array_like() instead of np.ndarray(). We should also double-check and see how well that would fit in with the other contructors like masked arrays and matrix objects. Ben Root -------------- next part -------------- An HTML attachment was scrubbed... URL: From daniele at grinta.net Fri Jan 18 11:36:12 2013 From: daniele at grinta.net (Daniele Nicolodi) Date: Fri, 18 Jan 2013 17:36:12 +0100 Subject: [Numpy-discussion] New numpy functions: filled, filled_like In-Reply-To: References: <50F4400F.4040709@hawaii.edu> <50F44A95.2030202@crans.org> <50F80711.9010204@crans.org> <50F8757B.7060008@hawaii.edu> <50F90B53.4030000@grinta.net> Message-ID: <50F979FC.6020700@grinta.net> On 18/01/2013 15:19, Benjamin Root wrote: > > > On Fri, Jan 18, 2013 at 3:44 AM, Daniele Nicolodi > wrote: > > On 17/01/2013 23:27, Mark Wiebe wrote: > > Would it be too weird or clumsy to extend the empty and empty_like > > functions to do the filling? > > > > np.empty((10, 10), fill=np.nan) > > np.empty_like(my_arr, fill=np.nan) > > Wouldn't it be more natural to extend the ndarray constructor? > > np.ndarray((10, 10), fill=np.nan) > > It looks more natural to me. In this way it is not possible to have the > _like extension, but I don't see it as a major drawback. > > > Cheers, > Daniele > > > This isn't a bad idea. Although, I would wager that most people, like > myself, use np.array() and np.array_like() instead of np.ndarray(). We > should also double-check and see how well that would fit in with the > other contructors like masked arrays and matrix objects. Hello Ben, I don't really get what you mean with this. np.array() construct a numpy array from an array-like object, np.ndarray() accepts a dimensions tuple as first parameter, I don't see any np.array_like in the current numpy release. 
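A quick illustrative aside on the distinction being drawn here, using only
constructors that exist in released numpy (a rough sketch, not part of the
original mail):

import numpy as np

np.array([[1., 2.], [3., 4.]])     # builds an array from data (an array-like)
np.ndarray((10, 10))               # low-level constructor: takes a shape, contents left uninitialised
np.empty((10, 10))                 # the usual spelling for an uninitialised array of a given shape
np.empty_like(np.zeros((10, 10)))  # shape and dtype taken from an existing array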
Cheers, Daniele From ben.root at ou.edu Fri Jan 18 11:46:31 2013 From: ben.root at ou.edu (Benjamin Root) Date: Fri, 18 Jan 2013 11:46:31 -0500 Subject: [Numpy-discussion] New numpy functions: filled, filled_like In-Reply-To: <50F979FC.6020700@grinta.net> References: <50F4400F.4040709@hawaii.edu> <50F44A95.2030202@crans.org> <50F80711.9010204@crans.org> <50F8757B.7060008@hawaii.edu> <50F90B53.4030000@grinta.net> <50F979FC.6020700@grinta.net> Message-ID: On Fri, Jan 18, 2013 at 11:36 AM, Daniele Nicolodi wrote: > On 18/01/2013 15:19, Benjamin Root wrote: > > > > > > On Fri, Jan 18, 2013 at 3:44 AM, Daniele Nicolodi > > wrote: > > > > On 17/01/2013 23:27, Mark Wiebe wrote: > > > Would it be too weird or clumsy to extend the empty and empty_like > > > functions to do the filling? > > > > > > np.empty((10, 10), fill=np.nan) > > > np.empty_like(my_arr, fill=np.nan) > > > > Wouldn't it be more natural to extend the ndarray constructor? > > > > np.ndarray((10, 10), fill=np.nan) > > > > It looks more natural to me. In this way it is not possible to have > the > > _like extension, but I don't see it as a major drawback. > > > > > > Cheers, > > Daniele > > > > > > This isn't a bad idea. Although, I would wager that most people, like > > myself, use np.array() and np.array_like() instead of np.ndarray(). We > > should also double-check and see how well that would fit in with the > > other contructors like masked arrays and matrix objects. > > Hello Ben, > > I don't really get what you mean with this. np.array() construct a numpy > array from an array-like object, np.ndarray() accepts a dimensions tuple > as first parameter, I don't see any np.array_like in the current numpy > release. > > Cheers, > Daniele > > My bad, I had a brain-fart and got mixed up. I was thinking of np.empty(). In fact, I never use np.ndarray(), I use np.empty(). Besides np.ndarray() being the actual constructor, what is the difference between them? Ben Root -------------- next part -------------- An HTML attachment was scrubbed... URL: From daniele at grinta.net Fri Jan 18 11:57:36 2013 From: daniele at grinta.net (Daniele Nicolodi) Date: Fri, 18 Jan 2013 17:57:36 +0100 Subject: [Numpy-discussion] New numpy functions: filled, filled_like In-Reply-To: References: <50F4400F.4040709@hawaii.edu> <50F44A95.2030202@crans.org> <50F80711.9010204@crans.org> <50F8757B.7060008@hawaii.edu> <50F90B53.4030000@grinta.net> <50F979FC.6020700@grinta.net> Message-ID: <50F97F00.8080802@grinta.net> On 18/01/2013 17:46, Benjamin Root wrote: > > > On Fri, Jan 18, 2013 at 11:36 AM, Daniele Nicolodi > wrote: > > On 18/01/2013 15:19, Benjamin Root wrote: > > > > > > On Fri, Jan 18, 2013 at 3:44 AM, Daniele Nicolodi > > > >> wrote: > > > > On 17/01/2013 23:27, Mark Wiebe wrote: > > > Would it be too weird or clumsy to extend the empty and > empty_like > > > functions to do the filling? > > > > > > np.empty((10, 10), fill=np.nan) > > > np.empty_like(my_arr, fill=np.nan) > > > > Wouldn't it be more natural to extend the ndarray constructor? > > > > np.ndarray((10, 10), fill=np.nan) > > > > It looks more natural to me. In this way it is not possible to > have the > > _like extension, but I don't see it as a major drawback. > > > > > > Cheers, > > Daniele > > > > > > This isn't a bad idea. Although, I would wager that most people, like > > myself, use np.array() and np.array_like() instead of > np.ndarray(). We > > should also double-check and see how well that would fit in with the > > other contructors like masked arrays and matrix objects. 
> > Hello Ben, > > I don't really get what you mean with this. np.array() construct a numpy > array from an array-like object, np.ndarray() accepts a dimensions tuple > as first parameter, I don't see any np.array_like in the current numpy > release. > > Cheers, > Daniele > > > My bad, I had a brain-fart and got mixed up. I was thinking of > np.empty(). In fact, I never use np.ndarray(), I use np.empty(). > Besides np.ndarray() being the actual constructor, what is the > difference between them? I was also wondering what's the difference between np.ndarray() and np.empty(). I thought the second was a wrapper around the first, but it looks like both of them are actually implemented in C... Cheers, Daniele From chris.barker at noaa.gov Fri Jan 18 14:58:50 2013 From: chris.barker at noaa.gov (Chris Barker - NOAA Federal) Date: Fri, 18 Jan 2013 11:58:50 -0800 Subject: [Numpy-discussion] Do we want scalar casting to behave as it does at the moment? In-Reply-To: References: <50EDB1D4.5090909@astro.uio.no> Message-ID: On Fri, Jan 18, 2013 at 4:39 AM, Olivier Delalleau wrote: > Le vendredi 18 janvier 2013, Chris Barker - NOAA Federal a ?crit : > If you check again the examples in this thread exhibiting surprising / > unexpected behavior, you'll notice most of them are with integers. > The tricky thing about integers is that downcasting can dramatically change > your result. With floats, not so much: you get approximation errors (usually > what you want) and the occasional nan / inf creeping in (usally noticeable). fair enough. However my core argument is that people use non-standard (usually smaller) dtypes for a reason, and it should be hard to accidentally up-cast. This is in contrast with the argument that accidental down-casting can produce incorrect results, and thus it should be hard to accidentally down-cast -- same argument whether the incorrect results are drastic or not.... It's really a question of which of these we think should be prioritized. -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From matthew.brett at gmail.com Fri Jan 18 17:22:36 2013 From: matthew.brett at gmail.com (Matthew Brett) Date: Fri, 18 Jan 2013 22:22:36 +0000 Subject: [Numpy-discussion] New numpy functions: filled, filled_like In-Reply-To: <50F952B4.4070208@crans.org> References: <50F4400F.4040709@hawaii.edu> <50F44A95.2030202@crans.org> <50F80711.9010204@crans.org> <50F8757B.7060008@hawaii.edu> <50F952B4.4070208@crans.org> Message-ID: Hi, On Fri, Jan 18, 2013 at 1:48 PM, Pierre Haessig wrote: > Hi, > Le 17/01/2013 23:31, Matthew Brett a ?crit : >>> Would it be too weird or clumsy to extend the empty and empty_like functions >>> >to do the filling? >>> > >>> >np.empty((10, 10), fill=np.nan) >>> >np.empty_like(my_arr, fill=np.nan) >> That sounds like a good idea to me. Someone wanting a fast way to >> fill an array will probably check out the 'empty' docstring first. > Oh, that sounds very good to me. There is indeed a bit of contradictions > between "empty" and "fill" but maybe not that strong if we think of > "empty" as a "void of actual information". (Especially true when the > fill value is nan or inf, which, as Ben just mentionned are probably the > most commonly used fill value after zero.) > > Maybe a keyword named "value" instead of "fill" may help soften the > semantic opposition with "empty" ? I personally find 'fill' OK. 
I'd read: a = np.empty((10, 10), fill=np.nan) as "make an empty array shape (10, 10) and fill with nans" Which would indeed be what the code was doing :) So I doubt that the semantic clash would cause any long term problems, Best, Matthew From chris.barker at noaa.gov Fri Jan 18 17:31:14 2013 From: chris.barker at noaa.gov (Chris Barker - NOAA Federal) Date: Fri, 18 Jan 2013 14:31:14 -0800 Subject: [Numpy-discussion] New numpy functions: filled, filled_like In-Reply-To: References: <50F4400F.4040709@hawaii.edu> <50F44A95.2030202@crans.org> <50F80711.9010204@crans.org> <50F8757B.7060008@hawaii.edu> <50F952B4.4070208@crans.org> Message-ID: On Fri, Jan 18, 2013 at 2:22 PM, Matthew Brett wrote: > I personally find 'fill' OK. I'd read: > > a = np.empty((10, 10), fill=np.nan) > > as > > "make an empty array shape (10, 10) and fill with nans" +1 simple, does the job, and doesn't bloat the API. -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From ralf.gommers at gmail.com Fri Jan 18 17:35:04 2013 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Fri, 18 Jan 2013 23:35:04 +0100 Subject: [Numpy-discussion] New numpy functions: filled, filled_like In-Reply-To: References: <50F4400F.4040709@hawaii.edu> <50F44A95.2030202@crans.org> <50F80711.9010204@crans.org> <50F8757B.7060008@hawaii.edu> <50F952B4.4070208@crans.org> Message-ID: On Fri, Jan 18, 2013 at 11:31 PM, Chris Barker - NOAA Federal < chris.barker at noaa.gov> wrote: > On Fri, Jan 18, 2013 at 2:22 PM, Matthew Brett > wrote: > > > I personally find 'fill' OK. I'd read: > > > > a = np.empty((10, 10), fill=np.nan) > > > > as > > > > "make an empty array shape (10, 10) and fill with nans" > > +1 > > simple, does the job, and doesn't bloat the API. > +1 from me too. Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From matthew.brett at gmail.com Fri Jan 18 17:35:59 2013 From: matthew.brett at gmail.com (Matthew Brett) Date: Fri, 18 Jan 2013 22:35:59 +0000 Subject: [Numpy-discussion] Do we want scalar casting to behave as it does at the moment? In-Reply-To: References: <50EDB1D4.5090909@astro.uio.no> Message-ID: Hi, On Fri, Jan 18, 2013 at 7:58 PM, Chris Barker - NOAA Federal wrote: > On Fri, Jan 18, 2013 at 4:39 AM, Olivier Delalleau wrote: >> Le vendredi 18 janvier 2013, Chris Barker - NOAA Federal a ?crit : > >> If you check again the examples in this thread exhibiting surprising / >> unexpected behavior, you'll notice most of them are with integers. >> The tricky thing about integers is that downcasting can dramatically change >> your result. With floats, not so much: you get approximation errors (usually >> what you want) and the occasional nan / inf creeping in (usally noticeable). > > fair enough. > > However my core argument is that people use non-standard (usually > smaller) dtypes for a reason, and it should be hard to accidentally > up-cast. > > This is in contrast with the argument that accidental down-casting can > produce incorrect results, and thus it should be hard to accidentally > down-cast -- same argument whether the incorrect results are drastic > or not.... > > It's really a question of which of these we think should be prioritized. After thinking about it for a while, it seems to me Olivier's suggestion is a good one. 
The rule becomes the following: array + scalar casting is the same as array + array casting except array + scalar casting does not upcast floating point precision of the array. Am I right (Chris, Perry?) that this deals with almost all your cases? Meaning that it is upcasting of floats that is the main problem, not upcasting of (u)ints? This rule seems to me not very far from the current 1.6 behavior; it upcasts more - but the dtype is now predictable. It's easy to explain. It avoids the obvious errors that the 1.6 rules were trying to avoid. It doesn't seem too far to stretch to make a distinction between rules about range (ints) and rules about precision (float, complex). What do you'all think? Best, Matthew From ralf.gommers at gmail.com Fri Jan 18 19:08:22 2013 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Sat, 19 Jan 2013 01:08:22 +0100 Subject: [Numpy-discussion] Shouldn't all in-place operations simply return self? In-Reply-To: References: Message-ID: On Thu, Jan 17, 2013 at 11:13 PM, Thouis (Ray) Jones wrote: > On Thu, Jan 17, 2013 at 10:27 AM, Charles R Harris > wrote: > > > > > > On Wed, Jan 16, 2013 at 5:11 PM, eat wrote: > >> > >> Hi, > >> > >> In a recent thread > >> http://article.gmane.org/gmane.comp.python.numeric.general/52772 it was > >> proposed that .fill(.) should return self as an alternative for a > trivial > >> two-liner. > >> > >> I'm raising now the question: what if all in-place operations indeed > could > >> return self? How bad this would be? A 'strong' counter argument may be > found > >> at http://mail.python.org/pipermail/python-dev/2003-October/038855.html > . > >> > >> But anyway, at least for me. it would be much more straightforward to > >> implement simple mini dsl's > >> (http://en.wikipedia.org/wiki/Domain-specific_language) a much more > >> straightforward manner. > >> > >> What do you think? > >> > > > > I've read Guido about why he didn't like inplace operations returning > self > > and found him convincing for a while. And then I listened to other folks > > express a preference for the freight train style and found them > convincing > > also. I think it comes down to a preference for one style over another > and I > > go back and forth myself. If I had to vote, I'd go for returning self, > but > > I'm not sure it's worth breaking python conventions to do so. > > > > Chuck > > I'm -1 on breaking with Python convention without very good reasons. Three times -1: on breaking Python conventions, on changing any existing numpy functions/methods for something like this, and on having similarly named functions like shuffle/shuffled that basically do the same thing. +1 on using out= more, and on some general guideline on function-naming-grammar. Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From fperez.net at gmail.com Sat Jan 19 02:28:52 2013 From: fperez.net at gmail.com (Fernando Perez) Date: Fri, 18 Jan 2013 23:28:52 -0800 Subject: [Numpy-discussion] New numpy functions: filled, filled_like In-Reply-To: References: <50F4400F.4040709@hawaii.edu> <50F44A95.2030202@crans.org> <50F80711.9010204@crans.org> <50F8757B.7060008@hawaii.edu> <50F952B4.4070208@crans.org> Message-ID: On Fri, Jan 18, 2013 at 2:22 PM, Matthew Brett wrote: > I personally find 'fill' OK. 
I'd read: > > a = np.empty((10, 10), fill=np.nan) > > as > > "make an empty array shape (10, 10) and fill with nans" > > Which would indeed be what the code was doing :) So I doubt that the > semantic clash would cause any long term problems, +1, practicality beats purity... From e.antero.tammi at gmail.com Sat Jan 19 06:35:27 2013 From: e.antero.tammi at gmail.com (eat) Date: Sat, 19 Jan 2013 13:35:27 +0200 Subject: [Numpy-discussion] Shouldn't all in-place operations simply return self? In-Reply-To: References: Message-ID: Hi, On Fri, Jan 18, 2013 at 12:13 AM, Thouis (Ray) Jones wrote: > On Thu, Jan 17, 2013 at 10:27 AM, Charles R Harris > wrote: > > > > > > On Wed, Jan 16, 2013 at 5:11 PM, eat wrote: > >> > >> Hi, > >> > >> In a recent thread > >> http://article.gmane.org/gmane.comp.python.numeric.general/52772 it was > >> proposed that .fill(.) should return self as an alternative for a > trivial > >> two-liner. > >> > >> I'm raising now the question: what if all in-place operations indeed > could > >> return self? How bad this would be? A 'strong' counter argument may be > found > >> at http://mail.python.org/pipermail/python-dev/2003-October/038855.html > . > >> > >> But anyway, at least for me. it would be much more straightforward to > >> implement simple mini dsl's > >> (http://en.wikipedia.org/wiki/Domain-specific_language) a much more > >> straightforward manner. > >> > >> What do you think? > >> > > > > I've read Guido about why he didn't like inplace operations returning > self > > and found him convincing for a while. And then I listened to other folks > > express a preference for the freight train style and found them > convincing > > also. I think it comes down to a preference for one style over another > and I > > go back and forth myself. If I had to vote, I'd go for returning self, > but > > I'm not sure it's worth breaking python conventions to do so. > > > > Chuck > > I'm -1 on breaking with Python convention without very good reasons. > As an example I personally find following behavior highly counter intuitive. In []: p, P= rand(3, 1), rand(3, 5) In []: ((p- P)** 2).sum(0).argsort() Out[]: array([2, 4, 1, 3, 0]) In []: ((p- P)** 2).sum(0).sort().diff() ------------------------------------------------------------ Traceback (most recent call last): File "", line 1, in AttributeError: 'NoneType' object has no attribute 'diff' Regards, -eat > > Ray > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From shish at keba.be Sun Jan 20 21:10:30 2013 From: shish at keba.be (Olivier Delalleau) Date: Sun, 20 Jan 2013 21:10:30 -0500 Subject: [Numpy-discussion] Do we want scalar casting to behave as it does at the moment? In-Reply-To: References: <50EDB1D4.5090909@astro.uio.no> Message-ID: 2013/1/18 Matthew Brett : > Hi, > > On Fri, Jan 18, 2013 at 7:58 PM, Chris Barker - NOAA Federal > wrote: >> On Fri, Jan 18, 2013 at 4:39 AM, Olivier Delalleau wrote: >>> Le vendredi 18 janvier 2013, Chris Barker - NOAA Federal a ?crit : >> >>> If you check again the examples in this thread exhibiting surprising / >>> unexpected behavior, you'll notice most of them are with integers. >>> The tricky thing about integers is that downcasting can dramatically change >>> your result. 
With floats, not so much: you get approximation errors (usually >>> what you want) and the occasional nan / inf creeping in (usally noticeable). >> >> fair enough. >> >> However my core argument is that people use non-standard (usually >> smaller) dtypes for a reason, and it should be hard to accidentally >> up-cast. >> >> This is in contrast with the argument that accidental down-casting can >> produce incorrect results, and thus it should be hard to accidentally >> down-cast -- same argument whether the incorrect results are drastic >> or not.... >> >> It's really a question of which of these we think should be prioritized. > > After thinking about it for a while, it seems to me Olivier's > suggestion is a good one. > > The rule becomes the following: > > array + scalar casting is the same as array + array casting except > array + scalar casting does not upcast floating point precision of the > array. > > Am I right (Chris, Perry?) that this deals with almost all your cases? > Meaning that it is upcasting of floats that is the main problem, not > upcasting of (u)ints? > > This rule seems to me not very far from the current 1.6 behavior; it > upcasts more - but the dtype is now predictable. It's easy to > explain. It avoids the obvious errors that the 1.6 rules were trying > to avoid. It doesn't seem too far to stretch to make a distinction > between rules about range (ints) and rules about precision (float, > complex). > > What do you'all think? Personally, I think the main issue with my suggestion is that it seems hard to go there from the current behavior -- without potentially breaking existing code in non-obvious ways. The main problematic case I foresee is the typical "small_int_array + 1", which would get upcasted while it wasn't the case before (neither in 1.5 nor in 1.6). That's why I think Nathaniel's proposal is more practical. -=- Olivier From pierre.haessig at crans.org Mon Jan 21 03:07:09 2013 From: pierre.haessig at crans.org (Pierre Haessig) Date: Mon, 21 Jan 2013 09:07:09 +0100 Subject: [Numpy-discussion] New numpy functions: filled, filled_like In-Reply-To: References: <50F4400F.4040709@hawaii.edu> <50F44A95.2030202@crans.org> <50F80711.9010204@crans.org> <50F8757B.7060008@hawaii.edu> <50F952B4.4070208@crans.org> Message-ID: <50FCF72D.6040903@crans.org> Le 18/01/2013 23:22, Matthew Brett a ?crit : > I personally find 'fill' OK. I'd read: > > a = np.empty((10, 10), fill=np.nan) > > as > > "make an empty array shape (10, 10) and fill with nans" +1 (and now we have *two* verbs ! ) -- Pierre -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 900 bytes Desc: OpenPGP digital signature URL: From ndbecker2 at gmail.com Mon Jan 21 08:41:52 2013 From: ndbecker2 at gmail.com (Neal Becker) Date: Mon, 21 Jan 2013 08:41:52 -0500 Subject: [Numpy-discussion] another little index puzzle Message-ID: I have an array to be used for indexing. It is 2d, where the rows are all the permutations of some numbers. So: array([[-2, -2, -2], [-2, -2, -1], [-2, -2, 0], [-2, -2, 1], [-2, -2, 2], ... [ 2, 1, 2], [ 2, 2, -2], [ 2, 2, -1], [ 2, 2, 0], [ 2, 2, 1], [ 2, 2, 2]]) Here the array is 125x3 I want to select all the rows of the array in which all the 3 elements are equal, so I can remove them. So for example, the 1st and last row. 
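One minimal sketch of a way to do this (the replies that follow give other,
equivalent approaches); the 125x3 array is rebuilt here with a list comprehension
as a stand-in for however the real permutation array was generated:

import numpy as np

vals = np.arange(-2, 3)
arr = np.array([(i, j, k) for i in vals for j in vals for k in vals])  # 125 x 3

# a row has all of its elements equal exactly when every column matches the first
all_equal = (arr == arr[:, :1]).all(axis=1)
trimmed = arr[~all_equal]    # drops rows like [-2, -2, -2] and [2, 2, 2]
print(trimmed.shape)         # (120, 3): the 5 all-equal rows are gone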
From robert.kern at gmail.com Mon Jan 21 08:50:53 2013 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 21 Jan 2013 14:50:53 +0100 Subject: [Numpy-discussion] another little index puzzle In-Reply-To: References: Message-ID: On Mon, Jan 21, 2013 at 2:41 PM, Neal Becker wrote: > I have an array to be used for indexing. It is 2d, where the rows are all the > permutations of some numbers. So: > > array([[-2, -2, -2], > [-2, -2, -1], > [-2, -2, 0], > [-2, -2, 1], > [-2, -2, 2], > ... > [ 2, 1, 2], > [ 2, 2, -2], > [ 2, 2, -1], > [ 2, 2, 0], > [ 2, 2, 1], > [ 2, 2, 2]]) > > Here the array is 125x3 > > I want to select all the rows of the array in which all the 3 elements are > equal, so I can remove them. So for example, the 1st and last row. all_equal_mask = np.logical_and.reduce(arr[:,1:] == arr[:,:-1], axis=1) some_unequal = arr[~all_equal_mask] -- Robert Kern From heng at cantab.net Mon Jan 21 09:02:14 2013 From: heng at cantab.net (Henry Gomersall) Date: Mon, 21 Jan 2013 14:02:14 +0000 Subject: [Numpy-discussion] another little index puzzle In-Reply-To: References: Message-ID: <1358776934.25855.49.camel@farnsworth> On Mon, 2013-01-21 at 08:41 -0500, Neal Becker wrote: > I have an array to be used for indexing. It is 2d, where the rows are > all the > permutations of some numbers. So: > > array([[-2, -2, -2], > [-2, -2, -1], > [-2, -2, 0], > [-2, -2, 1], > [-2, -2, 2], > ... > [ 2, 1, 2], > [ 2, 2, -2], > [ 2, 2, -1], > [ 2, 2, 0], > [ 2, 2, 1], > [ 2, 2, 2]]) > > Here the array is 125x3 > > I want to select all the rows of the array in which all the 3 elements > are > equal, so I can remove them. So for example, the 1st and last row. You can use a convolution to pick out the changes... conv_arr = numpy.array([[1, -1, 0], [0, 1, -1]]) equal_selector = ~numpy.any(numpy.dot(b, numpy.transpose(a)), 0) or unequal_selector = numpy.any(numpy.dot(b, numpy.transpose(a)), 0) hen From matthew.brett at gmail.com Mon Jan 21 17:46:55 2013 From: matthew.brett at gmail.com (Matthew Brett) Date: Mon, 21 Jan 2013 14:46:55 -0800 Subject: [Numpy-discussion] Do we want scalar casting to behave as it does at the moment? In-Reply-To: References: <50EDB1D4.5090909@astro.uio.no> Message-ID: Hi, On Sun, Jan 20, 2013 at 6:10 PM, Olivier Delalleau wrote: > 2013/1/18 Matthew Brett : >> Hi, >> >> On Fri, Jan 18, 2013 at 7:58 PM, Chris Barker - NOAA Federal >> wrote: >>> On Fri, Jan 18, 2013 at 4:39 AM, Olivier Delalleau wrote: >>>> Le vendredi 18 janvier 2013, Chris Barker - NOAA Federal a ?crit : >>> >>>> If you check again the examples in this thread exhibiting surprising / >>>> unexpected behavior, you'll notice most of them are with integers. >>>> The tricky thing about integers is that downcasting can dramatically change >>>> your result. With floats, not so much: you get approximation errors (usually >>>> what you want) and the occasional nan / inf creeping in (usally noticeable). >>> >>> fair enough. >>> >>> However my core argument is that people use non-standard (usually >>> smaller) dtypes for a reason, and it should be hard to accidentally >>> up-cast. >>> >>> This is in contrast with the argument that accidental down-casting can >>> produce incorrect results, and thus it should be hard to accidentally >>> down-cast -- same argument whether the incorrect results are drastic >>> or not.... >>> >>> It's really a question of which of these we think should be prioritized. >> >> After thinking about it for a while, it seems to me Olivier's >> suggestion is a good one. 
>> >> The rule becomes the following: >> >> array + scalar casting is the same as array + array casting except >> array + scalar casting does not upcast floating point precision of the >> array. >> >> Am I right (Chris, Perry?) that this deals with almost all your cases? >> Meaning that it is upcasting of floats that is the main problem, not >> upcasting of (u)ints? >> >> This rule seems to me not very far from the current 1.6 behavior; it >> upcasts more - but the dtype is now predictable. It's easy to >> explain. It avoids the obvious errors that the 1.6 rules were trying >> to avoid. It doesn't seem too far to stretch to make a distinction >> between rules about range (ints) and rules about precision (float, >> complex). >> >> What do you'all think? > > Personally, I think the main issue with my suggestion is that it seems > hard to go there from the current behavior -- without potentially > breaking existing code in non-obvious ways. The main problematic case > I foresee is the typical "small_int_array + 1", which would get > upcasted while it wasn't the case before (neither in 1.5 nor in 1.6). > That's why I think Nathaniel's proposal is more practical. It's important to establish the behavior we want in the long term, because it will likely affect the stop-gap solution we choose now. For example, let's say we think that the 1.5 behavior is desired in the long term - in that case Nathaniel's solution seems good (although it will change behavior from 1.6.x) If we think that your suggestion is preferable for the long term, sticking with 1.6. behavior is more attractive. It seems to me we need the use-cases laid out properly in order to decide, at the moment we are working somewhat blind, at least in my opinion. Cheers, Matthew From amueller at ais.uni-bonn.de Mon Jan 21 18:02:24 2013 From: amueller at ais.uni-bonn.de (Andreas Mueller) Date: Tue, 22 Jan 2013 00:02:24 +0100 Subject: [Numpy-discussion] ANN: scikit-learn 0.13 released! Message-ID: <50FDC900.9000600@ais.uni-bonn.de> Hi all. I am very happy to announce the release of scikit-learn 0.13. New features in this release include feature hashing for text processing, passive-agressive classifiers, faster random forests and many more. There have also been countless improvements in stability, consistency and usability. Details can be found on the what's new page. Sources and windows binaries are available on sourceforge, through pypi (http://pypi.python.org/pypi/scikit-learn/0.13) or can be installed directly using pip: pip install -U scikit-learn A big "thank you" to all the contributors who made this release possible! In parallel to the release, we started a small survey to get to know our user base a bit more. If you are using scikit-learn, it would be great if you could give us your input. Best, Andy -------------- next part -------------- An HTML attachment was scrubbed... URL: From olivier.grisel at ensta.org Mon Jan 21 18:19:58 2013 From: olivier.grisel at ensta.org (Olivier Grisel) Date: Tue, 22 Jan 2013 00:19:58 +0100 Subject: [Numpy-discussion] ANN: scikit-learn 0.13 released! In-Reply-To: <50FDC900.9000600@ais.uni-bonn.de> References: <50FDC900.9000600@ais.uni-bonn.de> Message-ID: Congrats and thanks to Andreas and everyone involved in the release, the website fixes and the online survey setup. 
I posted Andreas blog post on HN and reddit: - http://news.ycombinator.com/item?id=5094319 - http://www.reddit.com/r/programming/comments/170oty/scikitlearn_013_is_out_machine_learning_in_python/ We might get some user feedback in the comments there as well. From toddrjen at gmail.com Tue Jan 22 04:21:03 2013 From: toddrjen at gmail.com (Todd) Date: Tue, 22 Jan 2013 10:21:03 +0100 Subject: [Numpy-discussion] Subclassing ndarray with concatenate Message-ID: I am trying to create a subclass of ndarray that has additional attributes. These attributes are maintained with most numpy functions if __array_finalize__ is used. The main exception I have found is concatenate (and hstack/vstack, which just wrap concatenate). In this case, __array_finalize__ is passed an array that has already been stripped of the additional attributes, and I don't see a way to recover this information. In my particular case at least, there are clear ways to handle corner cases (like being passed a class that lacks these attributes), so in principle there no problem handling concatenate in a general way, assuming I can get access to the attributes. So is there any way to subclass ndarray in such a way that concatenate can be handled properly? I have been looking extensively online, but have not been able to find a clear answer on how to do this, or if there even is a way. -------------- next part -------------- An HTML attachment was scrubbed... URL: From sebastian at sipsolutions.net Tue Jan 22 07:44:33 2013 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Tue, 22 Jan 2013 13:44:33 +0100 Subject: [Numpy-discussion] Subclassing ndarray with concatenate In-Reply-To: References: Message-ID: <1358858673.24631.20.camel@sebastian-laptop> Hey, On Tue, 2013-01-22 at 10:21 +0100, Todd wrote: > I am trying to create a subclass of ndarray that has additional > attributes. These attributes are maintained with most numpy functions > if __array_finalize__ is used. > You can cover a bit more if you also implement `__array_wrap__`, though unless you want to do something fancy, that just replaces the `__array_finalize__` for the most part. But some (very few) functions currently call `__array_wrap__` explicitly. > The main exception I have found is concatenate (and hstack/vstack, > which just wrap concatenate). In this case, __array_finalize__ is > passed an array that has already been stripped of the additional > attributes, and I don't see a way to recover this information. > There are quite a few functions that simply do not preserve subclasses (though I think more could/should call `__array_wrap__` probably, even if the documentation may say that it is about ufuncs, there are some example of this already). `np.concatenate` is one of these. It always returns a base array. In any case it gets a bit difficult if you have multiple input arrays (which may not matter for you). > In my particular case at least, there are clear ways to handle corner > cases (like being passed a class that lacks these attributes), so in > principle there no problem handling concatenate in a general way, > assuming I can get access to the attributes. > > > So is there any way to subclass ndarray in such a way that concatenate > can be handled properly? > Quite simply, no. If you compare masked arrays, they also provide their own concatenate for this reason. I hope that helps a bit... Regards, Sebastian > I have been looking extensively online, but have not been able to find > a clear answer on how to do this, or if there even is a way. 
> > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From sebastian at sipsolutions.net Tue Jan 22 10:56:05 2013 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Tue, 22 Jan 2013 16:56:05 +0100 Subject: [Numpy-discussion] Subclassing ndarray with concatenate In-Reply-To: <1358858673.24631.20.camel@sebastian-laptop> References: <1358858673.24631.20.camel@sebastian-laptop> Message-ID: <1358870165.1679.1.camel@sebastian-laptop> On Tue, 2013-01-22 at 13:44 +0100, Sebastian Berg wrote: > Hey, > > On Tue, 2013-01-22 at 10:21 +0100, Todd wrote: > > I am trying to create a subclass of ndarray that has additional > > attributes. These attributes are maintained with most numpy functions > > if __array_finalize__ is used. > > > You can cover a bit more if you also implement `__array_wrap__`, though > unless you want to do something fancy, that just replaces the > `__array_finalize__` for the most part. But some (very few) functions > currently call `__array_wrap__` explicitly. > Actually have to correct myself here. The default __array_wrap__ causes __array_finalize__ to be called as you would expect, so there is no need to use it unless you want to do something fancy. > > The main exception I have found is concatenate (and hstack/vstack, > > which just wrap concatenate). In this case, __array_finalize__ is > > passed an array that has already been stripped of the additional > > attributes, and I don't see a way to recover this information. > > > There are quite a few functions that simply do not preserve subclasses > (though I think more could/should call `__array_wrap__` probably, even > if the documentation may say that it is about ufuncs, there are some > example of this already). > `np.concatenate` is one of these. It always returns a base array. In any > case it gets a bit difficult if you have multiple input arrays (which > may not matter for you). > > > In my particular case at least, there are clear ways to handle corner > > cases (like being passed a class that lacks these attributes), so in > > principle there no problem handling concatenate in a general way, > > assuming I can get access to the attributes. > > > > > > So is there any way to subclass ndarray in such a way that concatenate > > can be handled properly? > > > Quite simply, no. If you compare masked arrays, they also provide their > own concatenate for this reason. > > I hope that helps a bit... > > Regards, > > Sebastian > > > I have been looking extensively online, but have not been able to find > > a clear answer on how to do this, or if there even is a way. > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at scipy.org > > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From wesmckinn at gmail.com Tue Jan 22 11:32:03 2013 From: wesmckinn at gmail.com (Wes McKinney) Date: Tue, 22 Jan 2013 11:32:03 -0500 Subject: [Numpy-discussion] ANN: pandas 0.10.1 is released Message-ID: hi all, We've released pandas 0.10.1 which includes many bug fixes from 0.10.0 (including a number of issues with the new file parser, e.g. 
reading multiple files in separate threads), various performance improvements, and major new PyTables/HDF5-based functionality contributed by Jeff Reback. I strongly recommend that all users upgrade. Thanks to all who contributed to this release, especially Chang She, Jeff Reback, and Yoval P. As always source archives and Windows installers are on PyPI. What's new: http://pandas.pydata.org/pandas-docs/stable/whatsnew.html Installers: http://pypi.python.org/pypi/pandas $ git log v0.10.0..v0.10.1 --pretty=format:%aN | sort | uniq -c | sort -rn 66 jreback 59 Wes McKinney 43 Chang She 12 y-p 5 Vincent Arel-Bundock 4 Damien Garaud 3 Christopher Whelan 3 Andy Hayden 2 Jay Parlar 2 Dan Allan 1 Thouis (Ray) Jones 1 svaksha 1 herrfz 1 Garrett Drapala 1 elpres 1 Dieter Vandenbussche 1 Anton I. Sipos Happy data hacking! - Wes What is it ========== pandas is a Python package providing fast, flexible, and expressive data structures designed to make working with relational, time series, or any other kind of labeled data both easy and intuitive. It aims to be the fundamental high-level building block for doing practical, real world data analysis in Python. Links ===== Release Notes: http://github.com/pydata/pandas/blob/master/RELEASE.rst Documentation: http://pandas.pydata.org Installers: http://pypi.python.org/pypi/pandas Code Repository: http://github.com/pydata/pandas Mailing List: http://groups.google.com/group/pydata From jrocher at enthought.com Wed Jan 23 15:16:06 2013 From: jrocher at enthought.com (Jonathan Rocher) Date: Wed, 23 Jan 2013 14:16:06 -0600 Subject: [Numpy-discussion] [SCIPY2013] Feedback on mini-symposia themes In-Reply-To: References: Message-ID: Dear community members, [Sorry for the cross-post] We are making progress and building an awesome organization team for the SciPy2013 conference (Scientific Computing with Python) this June 24th-29th in Austin, TX. More on that later. Following my previous email, we have gotten lots of good answers to our survey about the themes the community would like to see at the for the mini-symposia *[1]*. We will leave *this survey open until Feb 7th*. So if you haven't done so, and would like to discuss scientific python tools with peers from the same industry/field, take a second to voice your opinion: http://www.surveygizmo.com/s3/1114631/SciPy-2013-Themes Thanks, The SciPy2013 organizers *[1] These mini-symposia are held to discuss scientific computing applied to a specific scientific domain/industry during a half afternoon after the general conference. Their goal is to promote industry specific libraries and tools, and gather people with similar interests for discussions. For example, the SciPy2012 edition successfully hosted 4 mini-symposia on Astronomy/Astrophysics, Bio-informatics, Meteorology, and Geophysics.* * * On Wed, Jan 9, 2013 at 4:32 PM, Jonathan Rocher wrote: > Dear community members, > > We are working hard to organize the SciPy2013 conference (Scientific > Computing with Python) , > this June 24th-29th in Austin, TX. We would like to probe the community > about the themes you would be interested in contributing to or > participating in for the mini-symposia at SciPy2013. > > These mini-symposia are held to discuss scientific computing applied to a > specific *scientific domain/industry* during a half afternoon after the > general conference. Their goal is to promote industry specific libraries > and tools, and gather people with similar interests for discussions. 
For > example, the SciPy2012 edition > successfully hosted 4 mini-symposia on Astronomy/Astrophysics, > Bio-informatics, Meteorology, and Geophysics. > > Please join us and voice your opinion to shape the next SciPy conference > at: > > http://www.surveygizmo.com/s3/1114631/SciPy-2013-Themes > > Thanks, > > The Scipy2013 organizers > > -- > Jonathan Rocher, PhD > Scientific software developer > Enthought, Inc. > jrocher at enthought.com > 1-512-536-1057 > http://www.enthought.com > > -- Jonathan Rocher, PhD Scientific software developer Enthought, Inc. jrocher at enthought.com 1-512-536-1057 http://www.enthought.com -------------- next part -------------- An HTML attachment was scrubbed... URL: From raul at virtualmaterials.com Sat Jan 26 12:56:34 2013 From: raul at virtualmaterials.com (Raul Cota) Date: Sat, 26 Jan 2013 10:56:34 -0700 Subject: [Numpy-discussion] NumPy int32 array to Excel through COM server is failing Message-ID: <510418D2.3010706@virtualmaterials.com> Hello, We came across a problem trying to get an array across COM when wrapped as a COM server using the win32com extension. What caught our attention is that arrays of type float64 work fine but do not work for any other array. Does anyone know if there is something we could do at the NumPy level to make it work ? Arrays are mapped onto Variant/SafeArray and the conversion to a Variant seems to be failing for anything that is not a float64. It is not a huge deal for us to workaround the problem but it is kind of ugly and I just wanted to make sure there was not something simple that could be done (particularly if something like this would be considered a bug). I include working sample code below that reproduces the problem where Excel instantiates a com server and requests an array of size and dtype. On the Python side, the code sets up the com server and exposes the function that just returns an array of ones. Code in Excel to get an array: ======================================== Public Sub NumPyVariantTest() Dim npcom As Object Dim arr '... instantiate com object Set npcom = CreateObject("NPTest.COM") size = 7 '... test float64. Works ! arr = npcom.GetNPArray(size, "float64") '... test int32. Fails ! 
arr = npcom.GetNPArray(size, "int32") End Sub ======================================== Code in Python to set up com server and expose the GetNPArray function ======================================== import win32com.server.util import win32com.client from pythoncom import CLSCTX_LOCAL_SERVER, CLSCTX_INPROC import sys, os import numpy from numpy import zeros, ones, array class NPTestCOM(object): """COM accessible version of CommandInterface""" _reg_clsid_ = "{A0E551F5-2F22-4FB4-B28E-FF1B6809D21C}" _reg_desc_ = "NumPy COM Test" _reg_progid_ = "NPTest.COM" _reg_clsctx_ = CLSCTX_INPROC _public_methods_ = ['GetNPArray'] _public_attrs_ = [] _readonly_attrs_ = [] def GetNPArray(self, size, dtype): """ Return an arbitrary NumPy array of type dtype to check conversion to Variant""" return ones(size, dtype=dtype) if __name__ == '__main__': import win32com.server.register import _winreg dllkey = 'nptestdll' if len(sys.argv) > 1 and sys.argv[1] == 'unregister': win32com.server.register.UnregisterClasses(NPTestCOM) software_key = _winreg.OpenKey(_winreg.HKEY_LOCAL_MACHINE, 'SOFTWARE') vmg_key = _winreg.OpenKey(software_key, 'VMG') _winreg.DeleteKey(vmg_key, dllkey) _winreg.CloseKey(vmg_key) _winreg.CloseKey(software_key) else: win32com.server.register.UseCommandLine(NPTestCOM) software_key = _winreg.OpenKey(_winreg.HKEY_LOCAL_MACHINE, 'SOFTWARE') vmg_key = _winreg.CreateKey(software_key, 'VMG') _winreg.SetValue(vmg_key, dllkey, _winreg.REG_SZ, os.path.abspath(os.curdir)) _winreg.CloseKey(vmg_key) _winreg.CloseKey(software_key) ======================================== Regards, Raul -- Raul Cota (P.Eng., Ph.D. Chemical Engineering) Research & Development Manager Phone: (403) 457 4598 Fax: (403) 457 4637 Virtual Materials Group - Canada www.virtualmaterials.com From olli.wallin at elisanet.fi Sun Jan 27 14:40:49 2013 From: olli.wallin at elisanet.fi (olli.wallin at elisanet.fi) Date: Sun, 27 Jan 2013 21:40:49 +0200 (EET) Subject: [Numpy-discussion] Installing numpy-mkl binary on top of Python(x, y) Message-ID: <26797036.26795851359315649926.JavaMail.olli.wallin@elisanet.fi> Hi, if I want to have a painless Python installation build against Intel MKL on Windows, one obvious choice is to just buy the EPD package. However, as I already do have a C++ licence of the MKL library I was wondering if I could just install the Python(x,y) -distribution and then take one of the NumPy-MKL binaries provided by Christoph Gohlke. Is it simple as that? Any downsides, will SciPy work as well? On the plus side, I would get Spyder2 without hassle and it looks nice to a former Matlab user. I apologize for such a simple question, I would have tried it myself but this is for my work where only IT support has the admin rights and I have mac at home. I want it to be as clearcut for them as possible so I get things up and running. I did try to search the internet and the list but did not find a conclusive answer. Many thanks in advance for any help. 
All the best, Olli -- From cgohlke at uci.edu Sun Jan 27 14:54:44 2013 From: cgohlke at uci.edu (Christoph Gohlke) Date: Sun, 27 Jan 2013 11:54:44 -0800 Subject: [Numpy-discussion] Installing numpy-mkl binary on top of Python(x, y) In-Reply-To: <26797036.26795851359315649926.JavaMail.olli.wallin@elisanet.fi> References: <26797036.26795851359315649926.JavaMail.olli.wallin@elisanet.fi> Message-ID: <51058604.3000906@uci.edu> On 1/27/2013 11:40 AM, olli.wallin at elisanet.fi wrote: > Hi, > > if I want to have a painless Python installation build against Intel MKL on Windows, one obvious choice is to just buy the EPD package. However, > as I already do have a C++ licence of the MKL library I was wondering if I could just install the Python(x,y) -distribution and then take one of the NumPy-MKL binaries provided > by Christoph Gohlke. Is it simple as that? Any downsides, will SciPy work as well? On the plus side, I would get Spyder2 without hassle and it looks nice to a former Matlab user. > > I apologize for such a simple question, I would have tried it myself but this is for my work where only IT support has the admin rights and I have mac at home. I want it to be as > clearcut for them as possible so I get things up and running. I did try to search the internet and the list but did not find a conclusive answer. > > Many thanks in advance for any help. > > All the best, > > Olli > Try WinPython . It repackages numpy-MKL and other packages from , contains Spyder and all dependencies, is available as 64 bit, and does not require admin rights to install. Christoph From josef.pktd at gmail.com Sun Jan 27 16:34:16 2013 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Sun, 27 Jan 2013 16:34:16 -0500 Subject: [Numpy-discussion] Installing numpy-mkl binary on top of Python(x, y) In-Reply-To: <51058604.3000906@uci.edu> References: <26797036.26795851359315649926.JavaMail.olli.wallin@elisanet.fi> <51058604.3000906@uci.edu> Message-ID: On Sun, Jan 27, 2013 at 2:54 PM, Christoph Gohlke wrote: > On 1/27/2013 11:40 AM, olli.wallin at elisanet.fi wrote: >> Hi, >> >> if I want to have a painless Python installation build against Intel MKL on Windows, one obvious choice is to just buy the EPD package. However, >> as I already do have a C++ licence of the MKL library I was wondering if I could just install the Python(x,y) -distribution and then take one of the NumPy-MKL binaries provided >> by Christoph Gohlke. Is it simple as that? Any downsides, will SciPy work as well? On the plus side, I would get Spyder2 without hassle and it looks nice to a former Matlab user. >> >> I apologize for such a simple question, I would have tried it myself but this is for my work where only IT support has the admin rights and I have mac at home. I want it to be as >> clearcut for them as possible so I get things up and running. I did try to search the internet and the list but did not find a conclusive answer. >> >> Many thanks in advance for any help. >> >> All the best, >> >> Olli >> > > Try WinPython . It repackages > numpy-MKL and other packages from > , contains Spyder and all > dependencies, is available as 64 bit, and does not require admin rights > to install. You can replace python xy installed packages but it's necessary to watch out for dependencies. If you replace numpy with the mkl version, then you also have to replace scipy with the mkl version, as far as I understand. I initially installed python xy on a new computer and updated many packages since, using standard python not the python xy updates. 
The only problem I have is that I have some incompatibilities between QT, pyQT, pyside, spyder and the ipython qt console, the later doesn't work in my current setup. Josef > > Christoph > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From zdoor at xs4all.nl Mon Jan 28 10:06:59 2013 From: zdoor at xs4all.nl (Alex) Date: Mon, 28 Jan 2013 15:06:59 +0000 (UTC) Subject: [Numpy-discussion] =?utf-8?q?Merging_structured_arrays_with_mixed?= =?utf-8?q?_dtypes_including_=27=7CO4=27?= Message-ID: Let's say I have two structured arrays with dtypes as per below >>> getdat.dtype dtype([('Tstamp', '|O4'), ('Vf', '>> out.dtype dtype([('Viscosity_cSt', '>> rfn.merge_arrays((getdat, out), flatten = True, usemask = False, asrecarray=False) Traceback (most recent call last): File "", line 1, in rfn.merge_arrays((getdat, out), flatten = True, usemask = False, asrecarray=False) File "C:\Python27\lib\site-packages\numpy\lib\recfunctions.py", line 458, in merge_arrays dtype=newdtype, count=maxlength) ValueError: cannot create object arrays from iterator The issue seems to be object field 'Tstamp' which contains python datetime objects. I can merge structured arrays with numeric formats. Any help much appreciated. Alex van der Spek From mail.till at gmx.de Mon Jan 28 11:31:26 2013 From: mail.till at gmx.de (Till Stensitzki) Date: Mon, 28 Jan 2013 16:31:26 +0000 (UTC) Subject: [Numpy-discussion] Matrix Expontial for differenr t. Message-ID: Hi group, is there a faster way to calculate the matrix exponential for different t's than this: def sol_matexp(A, tlist, y0): w, v = np.linalg.eig(A) out = np.zeros((tlist.size, y0.size)) for i, t in enumerate(tlist): sol_t = np.dot(v,np.diag(np.exp(-w*t))).dot(np.linalg.inv(v)).dot(y0) out[i, :] = sol_t return out This is the calculates exp(-Kt).dot(y0) for a list a ts. greetings Till From robert.kern at gmail.com Mon Jan 28 11:42:18 2013 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 28 Jan 2013 17:42:18 +0100 Subject: [Numpy-discussion] Matrix Expontial for differenr t. In-Reply-To: References: Message-ID: On Mon, Jan 28, 2013 at 5:31 PM, Till Stensitzki wrote: > Hi group, > is there a faster way to calculate the > matrix exponential for different t's > than this: > > def sol_matexp(A, tlist, y0): > w, v = np.linalg.eig(A) > out = np.zeros((tlist.size, y0.size)) > for i, t in enumerate(tlist): > sol_t = np.dot(v,np.diag(np.exp(-w*t))).dot(np.linalg.inv(v)).dot(y0) > out[i, :] = sol_t > return out > > This is the calculates exp(-Kt).dot(y0) for a list a ts. You can precalculate the latter part of the expression and avoid the inv() by using solve(). viy0 = np.linalg.solve(v, y0) for i, t in enumerate(tlist): # And no need to dot() the first part. Broadcasting works just fine. sol_t = (v * np.exp(-w*t)).dot(viy0) ... -- Robert Kern From nadavh at visionsense.com Mon Jan 28 11:48:27 2013 From: nadavh at visionsense.com (Nadav Horesh) Date: Mon, 28 Jan 2013 16:48:27 +0000 Subject: [Numpy-discussion] Matrix Expontial for differenr t. In-Reply-To: References: Message-ID: I did not try it, but I assume that you can build a stack of diagonal matrices as a MxNxN array and use tensordot with the matrix v (and it's inverse). The trivial way to accelerate the loop is to calculate in inverse of v before the loop. 
Nadav ________________________________________ From: numpy-discussion-bounces at scipy.org [numpy-discussion-bounces at scipy.org] on behalf of Till Stensitzki [mail.till at gmx.de] Sent: 28 January 2013 18:31 To: numpy-discussion at scipy.org Subject: [Numpy-discussion] Matrix Expontial for differenr t. Hi group, is there a faster way to calculate the matrix exponential for different t's than this: def sol_matexp(A, tlist, y0): w, v = np.linalg.eig(A) out = np.zeros((tlist.size, y0.size)) for i, t in enumerate(tlist): sol_t = np.dot(v,np.diag(np.exp(-w*t))).dot(np.linalg.inv(v)).dot(y0) out[i, :] = sol_t return out This is the calculates exp(-Kt).dot(y0) for a list a ts. greetings Till _______________________________________________ NumPy-Discussion mailing list NumPy-Discussion at scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion From pierre.haessig at crans.org Mon Jan 28 11:52:21 2013 From: pierre.haessig at crans.org (Pierre Haessig) Date: Mon, 28 Jan 2013 17:52:21 +0100 Subject: [Numpy-discussion] Matrix Expontial for differenr t. In-Reply-To: References: Message-ID: <5106ACC5.5010107@crans.org> Hi, Le 28/01/2013 17:31, Till Stensitzki a ?crit : > This is the calculates exp(-Kt).dot(y0) for a list a ts. If your time vector ts is *regularly* discretized with a timestep h, you could try an iterative computation I would (roughly) write this as : Ah = np.expm(A*h) # or use the "diagonalization + np.exp" method you mentionned y[0] = y0 for i in range(len(tlist)-1): y[i+1] = Ah*y[i] best, Pierre -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 900 bytes Desc: OpenPGP digital signature URL: From mail.till at gmx.de Mon Jan 28 12:14:49 2013 From: mail.till at gmx.de (Till Stensitzki) Date: Mon, 28 Jan 2013 17:14:49 +0000 (UTC) Subject: [Numpy-discussion] Matrix Expontial for differenr t. References: Message-ID: Thanks for hints so far, i am especially searching for a way to get rid of the t loop. Making a NxMxM Matrix is quite memory inefficient in my case (N > M). On way would be just use cython, but i think this problem common enough to have a solution into scipy. (Solution of a simple compartment model.) thanks, Till From pierre.haessig at crans.org Mon Jan 28 12:24:57 2013 From: pierre.haessig at crans.org (Pierre Haessig) Date: Mon, 28 Jan 2013 18:24:57 +0100 Subject: [Numpy-discussion] Matrix Expontial for differenr t. In-Reply-To: References: Message-ID: <5106B469.6030206@crans.org> Hi, Le 28/01/2013 18:14, Till Stensitzki a ?crit : > On way would be just use cython, but i think this problem > common enough to have a solution into scipy. > (Solution of a simple compartment model.) I see the solution you propose as a specialized ODE solver for linear systems. Then, what about using a general purpose ODE ? I guess there would be some integration errors as opposed to the exact integration method but errors bounds should be manageable. Maybe the performance would be increased thanks to ODE solvers being already written in C or Fortran? (At this point, please note that I'm handwaving a lot !) Best, Pierre -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 900 bytes Desc: OpenPGP digital signature URL: From mesanthu at gmail.com Mon Jan 28 12:58:56 2013 From: mesanthu at gmail.com (santhu kumar) Date: Mon, 28 Jan 2013 11:58:56 -0600 Subject: [Numpy-discussion] Numpy multiple instances Message-ID: Hello, I have embedded python/numpy scripts in an application that runs in parallel. But the python code is always invoked on the master node. So it could be assumed that at some point, there could be multiple instances of script being invoked and run. I have commented out import numpy part to see if both get the same sys.path and here are the sys.path: >From C : Replica ID value 0 >From Python : The replicaId 0 of 2 simulations running at 0.0 with 27319 atoms ['/usr/lib64/python24.zip', '/usr/lib64/python2.4', '/usr/lib64/python2.4/plat-linux2', '/usr/lib64/python2.4/lib-tk', '/usr/lib64/python2.4/lib-dynload', '/usr/lib64/python2.4/site-packages', '/usr/lib64/python2.4/site-packages/Numeric', '/usr/lib64/python2.4/site-packages/gtk-2.0', '/usr/lib/python2.4/site-packages', 'python_custom'] >From C : Replica ID value 1 >From Python : The replicaId 1 of 2 simulations running at 0.0 with 27319 atoms ['/usr/lib64/python24.zip', '/usr/lib64/python2.4', '/usr/lib64/python2.4/plat-linux2', '/usr/lib64/python2.4/lib-tk', '/usr/lib64/python2.4/lib-dynload', '/usr/lib64/python2.4/site-packages', '/usr/lib64/python2.4/site-packages/Numeric', '/usr/lib64/python2.4/site-packages/gtk-2.0', '/usr/lib/python2.4/site-packages', 'python_custom'] But once I uncomment, import numpy as np part in the script, >From C : Replica ID value 0 >From Python : The replicaId 0 of 2 simulations running at 0.0 with 27319 atoms ['/usr/lib64/python24.zip', '/usr/lib64/python2.4', '/usr/lib64/python2.4/plat-linux2', '/usr/lib64/python2.4/lib-tk', '/usr/lib64/python2.4/lib-dynload', '/usr/lib64/python2.4/site-packages', '/usr/lib64/python2.4/site-packages/Numeric', '/usr/lib64/python2.4/site-packages/gtk-2.0', '/usr/lib/python2.4/site-packages', 'python_custom'] >From C : Replica ID value 1 Traceback (most recent call last): File "python_custom/customF.py", line 3, in ? import numpy ImportError: No module named numpy Just giving some more information : When I embedded the python call in C, I had to comment out Py_Finalize() as numpy was throwing an error when trying to finalize. Any ideas/suggestions on whats happening ? Do I need to do something special to have multiple instances of python/numpy in C? Thanks Santhosh -------------- next part -------------- An HTML attachment was scrubbed... URL: From mesanthu at gmail.com Mon Jan 28 13:23:55 2013 From: mesanthu at gmail.com (santhu kumar) Date: Mon, 28 Jan 2013 12:23:55 -0600 Subject: [Numpy-discussion] Numpy multiple instances Message-ID: Please ignore the previous message. I have done some testing and found it to be running on a client node instead of the master node. The problem might be because node2, does not have numpy installed. Thanks. -------------- next part -------------- An HTML attachment was scrubbed... URL: From xscript at gmx.net Mon Jan 28 17:15:00 2013 From: xscript at gmx.net (=?utf-8?Q?Llu=C3=ADs?=) Date: Mon, 28 Jan 2013 23:15:00 +0100 Subject: [Numpy-discussion] numpythonically getting elements with the minimum sum Message-ID: <871ud5x8d7.fsf@fimbulvetr.bsc.es> Hi, I have a somewhat convoluted N-dimensional array that contains information of a set of experiments. 
The last dimension has as many entries as iterations in the experiment (an iterative application), and the penultimate dimension has as many entries as times I have run that experiment; the rest of dimensions describe the features of the experiment: data.shape == (... indefinite amount of dimensions ..., NUM_RUNS, NUM_ITERATIONS) So, what I want is to get the data for the best run of each experiment: best.shape == (... indefinite amount of dimensions ..., NUM_ITERATIONS) by selecting, for each experiment, the run with the lowest total time (sum of the time of all iterations for that experiment). So far I've got the trivial part, but not the final indexing into "data": dsum = data.sum(axis = -1) dmin = dsum.min(axis = -1) best = data[???] I'm sure there must be some numpythonic and generic way to get what I want, but fancy indexing is beating me here :) Thanks a lot! Lluis -- "And it's much the same thing with knowledge, for whenever you learn something new, the whole world becomes that much richer." -- The Princess of Pure Reason, as told by Norton Juster in The Phantom Tollbooth From irving at naml.us Mon Jan 28 18:48:36 2013 From: irving at naml.us (Geoffrey Irving) Date: Mon, 28 Jan 2013 15:48:36 -0800 Subject: [Numpy-discussion] PyArray_FromAny silently converts None to a singleton nan Message-ID: I discovered this from C via the PyArray_FromAny function, but here it is in Python: >>> asarray(None,dtype=float) array(nan) Is this expected or documented behavior? It seems quite unintuitive and surprising that this wouldn't throw an exception. Is there a way to disable this behavior in PyArray_FromAny in order to catch bugs earlier on? In the situation where I discovered this I actually passed None to a wrapped C routine, and it complained that it didn't have rank 2 (since the resulting nan singleton had rank 0). It'd be much nicer to get something mentioning NoneType. I suppose I could check for None manually as long there aren't any other weird cases. Geoffrey From brad.froehle at gmail.com Mon Jan 28 20:09:50 2013 From: brad.froehle at gmail.com (Bradley M. Froehle) Date: Mon, 28 Jan 2013 17:09:50 -0800 Subject: [Numpy-discussion] PyArray_FromAny silently converts None to a singleton nan In-Reply-To: References: Message-ID: >>> import numpy as np >>> np.double(None) nan On Mon, Jan 28, 2013 at 3:48 PM, Geoffrey Irving wrote: > I discovered this from C via the PyArray_FromAny function, but here it > is in Python: > > >>> asarray(None,dtype=float) > array(nan) > > Is this expected or documented behavior? -------------- next part -------------- An HTML attachment was scrubbed... URL: From irving at naml.us Mon Jan 28 20:27:45 2013 From: irving at naml.us (Geoffrey Irving) Date: Mon, 28 Jan 2013 17:27:45 -0800 Subject: [Numpy-discussion] PyArray_FromAny silently converts None to a singleton nan In-Reply-To: References: Message-ID: For comparison: >>> float32(None) nan >>> float(None) Traceback (most recent call last): File "", line 1, in TypeError: float() argument must be a string or a number On Mon, Jan 28, 2013 at 5:09 PM, Bradley M. Froehle wrote: >>>> import numpy as np >>>> np.double(None) > nan > > On Mon, Jan 28, 2013 at 3:48 PM, Geoffrey Irving wrote: >> >> I discovered this from C via the PyArray_FromAny function, but here it >> is in Python: >> >> >>> asarray(None,dtype=float) >> array(nan) >> >> Is this expected or documented behavior? 
> > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From gregor.thalhammer at gmail.com Tue Jan 29 03:49:55 2013 From: gregor.thalhammer at gmail.com (Gregor Thalhammer) Date: Tue, 29 Jan 2013 09:49:55 +0100 Subject: [Numpy-discussion] numpythonically getting elements with the minimum sum In-Reply-To: <871ud5x8d7.fsf@fimbulvetr.bsc.es> References: <871ud5x8d7.fsf@fimbulvetr.bsc.es> Message-ID: Am 28.1.2013 um 23:15 schrieb Llu?s: > Hi, > > I have a somewhat convoluted N-dimensional array that contains information of a > set of experiments. > > The last dimension has as many entries as iterations in the experiment (an > iterative application), and the penultimate dimension has as many entries as > times I have run that experiment; the rest of dimensions describe the features > of the experiment: > > data.shape == (... indefinite amount of dimensions ..., NUM_RUNS, NUM_ITERATIONS) > > So, what I want is to get the data for the best run of each experiment: > > best.shape == (... indefinite amount of dimensions ..., NUM_ITERATIONS) > > by selecting, for each experiment, the run with the lowest total time (sum of > the time of all iterations for that experiment). > > > So far I've got the trivial part, but not the final indexing into "data": > > dsum = data.sum(axis = -1) > dmin = dsum.min(axis = -1) > best = data[???] > > > I'm sure there must be some numpythonic and generic way to get what I want, but > fancy indexing is beating me here :) Did you have a look at the argmin function? It delivers the indices of the minimum values along an axis. Untested guess: dmin_idx = argmin(dsum, axis = -1) best = data[..., dmin_idx, :] Gregor From valentin at haenel.co Tue Jan 29 04:49:07 2013 From: valentin at haenel.co (Valentin Haenel) Date: Tue, 29 Jan 2013 10:49:07 +0100 Subject: [Numpy-discussion] Question about documentation for SWIG and ctypes numpy support Message-ID: <20130129094907.GA30692@kudu.in-berlin.de> Hi, I need to link the documentation on ctypes and SWIG support for Numpy. For ctypes I found: http://www.scipy.org/Cookbook/Ctypes Which seems to be reasonably up-to-date. There are of course also: http://docs.scipy.org/doc/numpy/reference/routines.ctypeslib.html There are also the corresponding section from the API docs: http://docs.scipy.org/doc/numpy/reference/routines.ctypeslib.html http://docs.scipy.org/doc/numpy/reference/generated/numpy.ndarray.ctypes.html So for numpy ctypes support, I would link those three. For SWIG I found: http://www.scipy.org/Cookbook/SWIG_NumPy_examples And this seems to be somewhat outdated, at least it references files from the numpy svn... :( There is also: http://docs.scipy.org/doc/numpy/reference/swig.interface-file.html Which seems to be more up-to-date, although it doesn't contain much information about the compilation procedure, like the cookbook does. I would probably only link that last one for numpy swig support. Is there any other documentation I should be aware of? V- From denis-bz-gg at t-online.de Tue Jan 29 06:16:43 2013 From: denis-bz-gg at t-online.de (denis) Date: Tue, 29 Jan 2013 11:16:43 +0000 (UTC) Subject: [Numpy-discussion] np.where: x and y need to have the same shape as condition ? 
Message-ID: Folks, the doc for `where` says "x and y need to have the same shape as condition" http://docs.scipy.org/doc/numpy-dev/reference/generated/numpy.where.html But surely "where is equivalent to: [xv if c else yv for (c,xv,yv) in zip(condition,x,y)]" holds as long as len(condition) == len(x) == len(y) ? And `condition` can be broadcast ? n = 3 all01 = np.array([ t for t in np.ndindex( n * (2,) )]) # 000 001 ... x = np.zeros(n) y = np.ones(n) w = np.where( all01, y, x ) # 2^n x n Can anyone please help me understand `where` / extend "where is equivalent to ..." ? Thanks, cheers -- denis From xscript at gmx.net Tue Jan 29 08:53:29 2013 From: xscript at gmx.net (=?utf-8?Q?Llu=C3=ADs?=) Date: Tue, 29 Jan 2013 14:53:29 +0100 Subject: [Numpy-discussion] numpythonically getting elements with the minimum sum In-Reply-To: (Gregor Thalhammer's message of "Tue, 29 Jan 2013 09:49:55 +0100") References: <871ud5x8d7.fsf@fimbulvetr.bsc.es> Message-ID: <87txq0t7s6.fsf@fimbulvetr.bsc.es> Gregor Thalhammer writes: > Am 28.1.2013 um 23:15 schrieb Llu?s: >> Hi, >> >> I have a somewhat convoluted N-dimensional array that contains information of a >> set of experiments. >> >> The last dimension has as many entries as iterations in the experiment (an >> iterative application), and the penultimate dimension has as many entries as >> times I have run that experiment; the rest of dimensions describe the features >> of the experiment: >> >> data.shape == (... indefinite amount of dimensions ..., NUM_RUNS, NUM_ITERATIONS) >> >> So, what I want is to get the data for the best run of each experiment: >> >> best.shape == (... indefinite amount of dimensions ..., NUM_ITERATIONS) >> >> by selecting, for each experiment, the run with the lowest total time (sum of >> the time of all iterations for that experiment). >> >> >> So far I've got the trivial part, but not the final indexing into "data": >> >> dsum = data.sum(axis = -1) >> dmin = dsum.min(axis = -1) >> best = data[???] >> >> >> I'm sure there must be some numpythonic and generic way to get what I want, but >> fancy indexing is beating me here :) > Did you have a look at the argmin function? It delivers the indices of the minimum values along an axis. Untested guess: > dmin_idx = argmin(dsum, axis = -1) > best = data[..., dmin_idx, :] Ah, sorry, my example is incorrect. I was actually using 'argmin', but indexing with it does not exactly work as I expected: >>> d1.shape (2, 5, 10) >>> dsum = d1.sum(axis = -1) >>> dmin = d1.argmin(axis = -1) >>> dmin.shape (2,) >>> d1_best = d1[...,dmin,:] >>> d1_best.shape (2, 2, 10) Assuming 1st dimension is the test, 2nd the run and 10th the iterations, using this previous code with some example values: >>> dmin [4 3] >>> d1_best [[[ ... contents of d1[0,4,:] ...] [ ... contents of d1[0,3,:] ...]] [[ ... contents of d1[1,4,:] ...] [ ... contents of d1[1,3,:] ...]]] While I actually want this: [[ ... contents of d1[0,4,:] ...] [ ... contents of d1[1,3,:] ...]] Thanks, Lluis -- "And it's much the same thing with knowledge, for whenever you learn something new, the whole world becomes that much richer." 
-- The Princess of Pure Reason, as told by Norton Juster in The Phantom Tollbooth From sebastian at sipsolutions.net Tue Jan 29 09:11:55 2013 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Tue, 29 Jan 2013 15:11:55 +0100 Subject: [Numpy-discussion] numpythonically getting elements with the minimum sum In-Reply-To: <87txq0t7s6.fsf@fimbulvetr.bsc.es> References: <871ud5x8d7.fsf@fimbulvetr.bsc.es> <87txq0t7s6.fsf@fimbulvetr.bsc.es> Message-ID: <1359468715.3559.10.camel@sebastian-laptop> On Tue, 2013-01-29 at 14:53 +0100, Llu?s wrote: > Gregor Thalhammer writes: > > > Am 28.1.2013 um 23:15 schrieb Llu?s: > > >> Hi, > >> > >> I have a somewhat convoluted N-dimensional array that contains information of a > >> set of experiments. > >> > >> The last dimension has as many entries as iterations in the experiment (an > >> iterative application), and the penultimate dimension has as many entries as > >> times I have run that experiment; the rest of dimensions describe the features > >> of the experiment: > >> > >> data.shape == (... indefinite amount of dimensions ..., NUM_RUNS, NUM_ITERATIONS) > >> > >> So, what I want is to get the data for the best run of each experiment: > >> > >> best.shape == (... indefinite amount of dimensions ..., NUM_ITERATIONS) > >> > >> by selecting, for each experiment, the run with the lowest total time (sum of > >> the time of all iterations for that experiment). > >> > >> > >> So far I've got the trivial part, but not the final indexing into "data": > >> > >> dsum = data.sum(axis = -1) > >> dmin = dsum.min(axis = -1) > >> best = data[???] > >> > >> > >> I'm sure there must be some numpythonic and generic way to get what I want, but > >> fancy indexing is beating me here :) > > > Did you have a look at the argmin function? It delivers the indices of the minimum values along an axis. Untested guess: > > > dmin_idx = argmin(dsum, axis = -1) > > best = data[..., dmin_idx, :] > > Ah, sorry, my example is incorrect. I was actually using 'argmin', but indexing > with it does not exactly work as I expected: > > >>> d1.shape > (2, 5, 10) > >>> dsum = d1.sum(axis = -1) > >>> dmin = d1.argmin(axis = -1) > >>> dmin.shape > (2,) > >>> d1_best = d1[...,dmin,:] You need to use fancy indexing. Something like: >>> d1_best = d1[np.arange(2), dmin,:] Because the Ellipsis takes everything from the axis, while you want to pick from multiple axes at the same time. That can be achieved with fancy indexing (indexing with arrays). From another perspective, you want to get rid of two axes in favor of a new one, but a slice/Ellipsis always preserves the axis it works on. > >>> d1_best.shape > (2, 2, 10) > > > Assuming 1st dimension is the test, 2nd the run and 10th the iterations, using > this previous code with some example values: > > >>> dmin > [4 3] > >>> d1_best > [[[ ... contents of d1[0,4,:] ...] > [ ... contents of d1[0,3,:] ...]] > [[ ... contents of d1[1,4,:] ...] > [ ... contents of d1[1,3,:] ...]]] > > > While I actually want this: > > [[ ... contents of d1[0,4,:] ...] > [ ... 
contents of d1[1,3,:] ...]] > > > Thanks, > Lluis > From xscript at gmx.net Tue Jan 29 10:56:47 2013 From: xscript at gmx.net (=?utf-8?Q?Llu=C3=ADs?=) Date: Tue, 29 Jan 2013 16:56:47 +0100 Subject: [Numpy-discussion] numpythonically getting elements with the minimum sum In-Reply-To: <1359468715.3559.10.camel@sebastian-laptop> (Sebastian Berg's message of "Tue, 29 Jan 2013 15:11:55 +0100") References: <871ud5x8d7.fsf@fimbulvetr.bsc.es> <87txq0t7s6.fsf@fimbulvetr.bsc.es> <1359468715.3559.10.camel@sebastian-laptop> Message-ID: <87sj5kq8xs.fsf@fimbulvetr.bsc.es> Sebastian Berg writes: > On Tue, 2013-01-29 at 14:53 +0100, Llu?s wrote: >> Gregor Thalhammer writes: >> >> > Am 28.1.2013 um 23:15 schrieb Llu?s: >> >> >> Hi, >> >> >> >> I have a somewhat convoluted N-dimensional array that contains information of a >> >> set of experiments. >> >> >> >> The last dimension has as many entries as iterations in the experiment (an >> >> iterative application), and the penultimate dimension has as many entries as >> >> times I have run that experiment; the rest of dimensions describe the features >> >> of the experiment: >> >> >> >> data.shape == (... indefinite amount of dimensions ..., NUM_RUNS, NUM_ITERATIONS) >> >> >> >> So, what I want is to get the data for the best run of each experiment: >> >> >> >> best.shape == (... indefinite amount of dimensions ..., NUM_ITERATIONS) >> >> >> >> by selecting, for each experiment, the run with the lowest total time (sum of >> >> the time of all iterations for that experiment). >> >> >> >> >> >> So far I've got the trivial part, but not the final indexing into "data": >> >> >> >> dsum = data.sum(axis = -1) >> >> dmin = dsum.min(axis = -1) >> >> best = data[???] >> >> >> >> >> >> I'm sure there must be some numpythonic and generic way to get what I want, but >> >> fancy indexing is beating me here :) >> >> > Did you have a look at the argmin function? It delivers the indices of the minimum values along an axis. Untested guess: >> >> > dmin_idx = argmin(dsum, axis = -1) >> > best = data[..., dmin_idx, :] >> >> Ah, sorry, my example is incorrect. I was actually using 'argmin', but indexing >> with it does not exactly work as I expected: >> >> >>> d1.shape >> (2, 5, 10) >> >>> dsum = d1.sum(axis = -1) >> >>> dmin = d1.argmin(axis = -1) >> >>> dmin.shape >> (2,) >> >>> d1_best = d1[...,dmin,:] > You need to use fancy indexing. Something like: >>>> d1_best = d1[np.arange(2), dmin,:] > Because the Ellipsis takes everything from the axis, while you want to > pick from multiple axes at the same time. That can be achieved with > fancy indexing (indexing with arrays). From another perspective, you > want to get rid of two axes in favor of a new one, but a slice/Ellipsis > always preserves the axis it works on. Nice, thanks. That works for this specific example, but I couldn't get it to work with "d1.shape == (1, 2, 16, 5, 10)" (thus "dmin.shape == (1, 2, 16)"): >>> def get_best_run (data, field): ... """Returns the best run.""" ... data = data.view(np.ndarray) ... assert data.ndim >= 2 ... dsum = data[field].sum(axis=-1) ... dmin = dsum.argmin(axis=-1) ... idxs = [ np.arange(dlen) for dlen in data.shape[:-2] ] ... idxs += [ dmin ] ... idxs += [ slice(None) ] ... return data[tuple(idxs)] >>> d1.shape (2, 5, 10) >>> get_best_run(d1, "time") (2, 10) >>> d2.shape (1, 2, 16, 5, 10) >>> get_best_run(d2, "time") Traceback (most recent call last): ... 
File "./plot-user.py", line 89, in get_best_run res = data.view(np.ndarray)[tuple(idxs)] ValueError: shape mismatch: objects cannot be broadcast to a single shape After reading the "Advanced indexing section", my understanding is that the elements in "idxs" are not broadcastable to the same shape, but I'm not sure how I should build them to be broadcastable to what specific shape. Thanks a lot, Lluis >> >>> d1_best.shape >> (2, 2, 10) >> >> >> Assuming 1st dimension is the test, 2nd the run and 10th the iterations, using >> this previous code with some example values: >> >> >>> dmin >> [4 3] >> >>> d1_best >> [[[ ... contents of d1[0,4,:] ...] >> [ ... contents of d1[0,3,:] ...]] >> [[ ... contents of d1[1,4,:] ...] >> [ ... contents of d1[1,3,:] ...]]] >> >> >> While I actually want this: >> >> [[ ... contents of d1[0,4,:] ...] >> [ ... contents of d1[1,3,:] ...]] -- "And it's much the same thing with knowledge, for whenever you learn something new, the whole world becomes that much richer." -- The Princess of Pure Reason, as told by Norton Juster in The Phantom Tollbooth From xscript at gmx.net Tue Jan 29 13:07:03 2013 From: xscript at gmx.net (=?utf-8?Q?Llu=C3=ADs?=) Date: Tue, 29 Jan 2013 19:07:03 +0100 Subject: [Numpy-discussion] numpythonically getting elements with the minimum sum In-Reply-To: <87sj5kq8xs.fsf@fimbulvetr.bsc.es> (=?utf-8?Q?=22Llu=C3=ADs?= =?utf-8?Q?=22's?= message of "Tue, 29 Jan 2013 16:56:47 +0100") References: <871ud5x8d7.fsf@fimbulvetr.bsc.es> <87txq0t7s6.fsf@fimbulvetr.bsc.es> <1359468715.3559.10.camel@sebastian-laptop> <87sj5kq8xs.fsf@fimbulvetr.bsc.es> Message-ID: <87y5fbq2wo.fsf@fimbulvetr.bsc.es> Llu?s writes: > Sebastian Berg writes: >> On Tue, 2013-01-29 at 14:53 +0100, Llu?s wrote: >>> Gregor Thalhammer writes: >>> >>> > Am 28.1.2013 um 23:15 schrieb Llu?s: >>> >>> >> Hi, >>> >> >>> >> I have a somewhat convoluted N-dimensional array that contains information of a >>> >> set of experiments. >>> >> >>> >> The last dimension has as many entries as iterations in the experiment (an >>> >> iterative application), and the penultimate dimension has as many entries as >>> >> times I have run that experiment; the rest of dimensions describe the features >>> >> of the experiment: >>> >> >>> >> data.shape == (... indefinite amount of dimensions ..., NUM_RUNS, NUM_ITERATIONS) >>> >> >>> >> So, what I want is to get the data for the best run of each experiment: >>> >> >>> >> best.shape == (... indefinite amount of dimensions ..., NUM_ITERATIONS) >>> >> >>> >> by selecting, for each experiment, the run with the lowest total time (sum of >>> >> the time of all iterations for that experiment). >>> >> >>> >> >>> >> So far I've got the trivial part, but not the final indexing into "data": >>> >> >>> >> dsum = data.sum(axis = -1) >>> >> dmin = dsum.min(axis = -1) >>> >> best = data[???] >>> >> >>> >> >>> >> I'm sure there must be some numpythonic and generic way to get what I want, but >>> >> fancy indexing is beating me here :) >>> >>> > Did you have a look at the argmin function? It delivers the indices of the minimum values along an axis. Untested guess: >>> >>> > dmin_idx = argmin(dsum, axis = -1) >>> > best = data[..., dmin_idx, :] >>> >>> Ah, sorry, my example is incorrect. I was actually using 'argmin', but indexing >>> with it does not exactly work as I expected: >>> >>> >>> d1.shape >>> (2, 5, 10) >>> >>> dsum = d1.sum(axis = -1) >>> >>> dmin = d1.argmin(axis = -1) >>> >>> dmin.shape >>> (2,) >>> >>> d1_best = d1[...,dmin,:] >> You need to use fancy indexing. 
Something like: >>>>> d1_best = d1[np.arange(2), dmin,:] >> Because the Ellipsis takes everything from the axis, while you want to >> pick from multiple axes at the same time. That can be achieved with >> fancy indexing (indexing with arrays). From another perspective, you >> want to get rid of two axes in favor of a new one, but a slice/Ellipsis >> always preserves the axis it works on. > Nice, thanks. That works for this specific example, but I couldn't get it to > work with "d1.shape == (1, 2, 16, 5, 10)" (thus "dmin.shape == (1, 2, 16)"): >>>> def get_best_run (data, field): > ... """Returns the best run.""" > ... data = data.view(np.ndarray) > ... assert data.ndim >= 2 > ... dsum = data[field].sum(axis=-1) > ... dmin = dsum.argmin(axis=-1) > ... idxs = [ np.arange(dlen) for dlen in data.shape[:-2] ] > ... idxs += [ dmin ] > ... idxs += [ slice(None) ] > ... return data[tuple(idxs)] >>>> d1.shape > (2, 5, 10) >>>> get_best_run(d1, "time") > (2, 10) >>>> d2.shape > (1, 2, 16, 5, 10) >>>> get_best_run(d2, "time") > Traceback (most recent call last): > ... > File "./plot-user.py", line 89, in get_best_run > res = data.view(np.ndarray)[tuple(idxs)] > ValueError: shape mismatch: objects cannot be broadcast to a single shape > After reading the "Advanced indexing section", my understanding is that the > elements in "idxs" are not broadcastable to the same shape, but I'm not sure how > I should build them to be broadcastable to what specific shape. BTW, here's an equivalent that seems to work on all cases, although I would prefer to avoid control code to manually fill-in the result: >>> def get_best_run (data, field): ... """Returns the best run.""" ... data = data.view(np.ndarray) ... assert data.ndim >= 2 ... dsum = data[field].sum(axis=-1) ... dmin = dsum.argmin(axis=-1) ... ... res_shape = list(data.shape) ... del res_shape[-2] ... res = np.ndarray(res_shape, dtype = data.dtype) ... ... idxs = np.unravel_index(np.arange(dmin.size), dmin.shape) ... for idx in itertools.izip(*idxs): ... isum = dsum[idx] ... imin = dmin[idx] ... idata = data[idx] ... res[idx] = data[tuple(list(idx) + [imin])] ... ... return res >>> d1.shape (2, 5, 10) >>> get_best_run(d1, "time") (2, 10) >>> d2.shape (1, 2, 16, 5, 10) >>> get_best_run(d2, "time") (1, 2, 16, 10) Thanks, Lluis >>> >>> d1_best.shape >>> (2, 2, 10) >>> >>> >>> Assuming 1st dimension is the test, 2nd the run and 10th the iterations, using >>> this previous code with some example values: >>> >>> >>> dmin >>> [4 3] >>> >>> d1_best >>> [[[ ... contents of d1[0,4,:] ...] >>> [ ... contents of d1[0,3,:] ...]] >>> [[ ... contents of d1[1,4,:] ...] >>> [ ... contents of d1[1,3,:] ...]]] >>> >>> >>> While I actually want this: >>> >>> [[ ... contents of d1[0,4,:] ...] >>> [ ... contents of d1[1,3,:] ...]] -- "And it's much the same thing with knowledge, for whenever you learn something new, the whole world becomes that much richer." -- The Princess of Pure Reason, as told by Norton Juster in The Phantom Tollbooth From ben.root at ou.edu Tue Jan 29 17:19:53 2013 From: ben.root at ou.edu (Benjamin Root) Date: Tue, 29 Jan 2013 17:19:53 -0500 Subject: [Numpy-discussion] np.where: x and y need to have the same shape as condition ? 
In-Reply-To: References: Message-ID: On Tue, Jan 29, 2013 at 6:16 AM, denis wrote: > Folks, > the doc for `where` says "x and y need to have the same shape as > condition" > http://docs.scipy.org/doc/numpy-dev/reference/generated/numpy.where.html > But surely > "where is equivalent to: > [xv if c else yv for (c,xv,yv) in zip(condition,x,y)]" > holds as long as len(condition) == len(x) == len(y) ? > And `condition` can be broadcast ? > n = 3 > all01 = np.array([ t for t in np.ndindex( n * (2,) )]) # 000 001 ... > x = np.zeros(n) > y = np.ones(n) > w = np.where( all01, y, x ) # 2^n x n > > Can anyone please help me understand `where` > / extend "where is equivalent to ..." ? > Thanks, > cheers > -- denis > > Do keep in mind the difference between len() and shape (they aren't the same for 2 and greater dimension arrays). But, ultimately, yes, the arrays have to have the same shape, or use scalars. I haven't checked broadcast-ability though. Perhaps a note should be added into the documentation to explicitly say whether the arrays can be broadcastable. Ben Root -------------- next part -------------- An HTML attachment was scrubbed... URL: From toddrjen at gmail.com Wed Jan 30 04:24:22 2013 From: toddrjen at gmail.com (Todd) Date: Wed, 30 Jan 2013 10:24:22 +0100 Subject: [Numpy-discussion] Subclassing ndarray with concatenate In-Reply-To: <1358858673.24631.20.camel@sebastian-laptop> References: <1358858673.24631.20.camel@sebastian-laptop> Message-ID: On Tue, Jan 22, 2013 at 1:44 PM, Sebastian Berg wrote: > Hey, > > On Tue, 2013-01-22 at 10:21 +0100, Todd wrote: > > > The main exception I have found is concatenate (and hstack/vstack, > > which just wrap concatenate). In this case, __array_finalize__ is > > passed an array that has already been stripped of the additional > > attributes, and I don't see a way to recover this information. > > > There are quite a few functions that simply do not preserve subclasses > (though I think more could/should call `__array_wrap__` probably, even > if the documentation may say that it is about ufuncs, there are some > example of this already). > `np.concatenate` is one of these. It always returns a base array. In any > case it gets a bit difficult if you have multiple input arrays (which > may not matter for you). > I don't think this is right. I tried it and it doesn't return a base array, it returns an instance of the original array subclass. > > > In my particular case at least, there are clear ways to handle corner > > cases (like being passed a class that lacks these attributes), so in > > principle there no problem handling concatenate in a general way, > > assuming I can get access to the attributes. > > > > > > So is there any way to subclass ndarray in such a way that concatenate > > can be handled properly? > > > Quite simply, no. If you compare masked arrays, they also provide their > own concatenate for this reason. > > I hope that helps a bit... > > Is this something that should be available? For instance a method that provides both the new array and the arrays that were used to construct it. This would seem to be an extremely common use-case for array subclasses, so letting them gracefully handle this would seem to be very important. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From sebastian at sipsolutions.net Wed Jan 30 05:20:39 2013 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Wed, 30 Jan 2013 11:20:39 +0100 Subject: [Numpy-discussion] Subclassing ndarray with concatenate In-Reply-To: References: <1358858673.24631.20.camel@sebastian-laptop> Message-ID: <1359541239.2496.14.camel@sebastian-laptop> On Wed, 2013-01-30 at 10:24 +0100, Todd wrote: > On Tue, Jan 22, 2013 at 1:44 PM, Sebastian Berg > wrote: > Hey, > > On Tue, 2013-01-22 at 10:21 +0100, Todd wrote: > > > > The main exception I have found is concatenate (and > hstack/vstack, > > which just wrap concatenate). In this case, > __array_finalize__ is > > passed an array that has already been stripped of the > additional > > attributes, and I don't see a way to recover this > information. > > > > There are quite a few functions that simply do not preserve > subclasses > (though I think more could/should call `__array_wrap__` > probably, even > if the documentation may say that it is about ufuncs, there > are some > example of this already). > `np.concatenate` is one of these. It always returns a base > array. In any > case it gets a bit difficult if you have multiple input arrays > (which > may not matter for you). > > > > I don't think this is right. I tried it and it doesn't return a base > array, it returns an instance of the original array subclass. Yes you are right it preserves type, I was fooled by `__array_priority__` being 0 as default, thought it defaulted to more then 0 (for ufuncs everything beats arrays, not sure if it really should) but so I missed. In any case, yes, it calls __array_finalize__, but as you noticed, it calls it without the original array. Now it would be very easy and harmless to change that, however I am not sure if giving only the parent array is very useful (ie. you only get the one with highest array priority). Another way to get around it would be maybe to call __array_wrap__ like ufuncs do (with a context, so you get all inputs, but then the non-array axis argument may not be reasonably placed into the context). In any case, if you think it would be helpful to at least get the single parent array, that would be a very simple change, but I feel the whole subclassing could use a bit thinking and quite a bit of work probably, since I am not quite convinced that calling __array_wrap__ with a complicated context from as many functions as possible is the right approach for allowing more complex subclasses. > > > > In my particular case at least, there are clear ways to > handle corner > > cases (like being passed a class that lacks these > attributes), so in > > principle there no problem handling concatenate in a general > way, > > assuming I can get access to the attributes. > > > > > > So is there any way to subclass ndarray in such a way that > concatenate > > can be handled properly? > > > > Quite simply, no. If you compare masked arrays, they also > provide their > own concatenate for this reason. > > I hope that helps a bit... > > > > Is this something that should be available? For instance a method > that provides both the new array and the arrays that were used to > construct it. This would seem to be an extremely common use-case for > array subclasses, so letting them gracefully handle this would seem to > be very important. 
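For what it's worth, the per-class workaround mentioned above (what numpy.ma does) can be quite small. A minimal, untested sketch, with a made-up `meta` attribute standing in for whatever extra state the subclass carries:

import numpy as np

class MyArray(np.ndarray):
    def __new__(cls, input_array, meta=None):
        obj = np.asarray(input_array).view(cls)
        obj.meta = meta
        return obj

    def __array_finalize__(self, obj):
        # obj is None for explicit construction; otherwise inherit what we can.
        self.meta = getattr(obj, 'meta', None)

def my_concatenate(arrays, axis=0):
    # Do the work on plain ndarrays, then re-attach the attributes from the
    # first input. Taking them from the first input is just one possible policy.
    plain = [np.asarray(a) for a in arrays]
    out = np.concatenate(plain, axis=axis).view(MyArray)
    out.meta = getattr(arrays[0], 'meta', None)
    return out

None of this addresses the general problem discussed above (np.concatenate itself passing all inputs on to the subclass); it only shows the kind of wrapper a subclass can ship today.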
> > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From opossumnano at gmail.com Thu Jan 31 05:56:28 2013 From: opossumnano at gmail.com (Tiziano Zito) Date: Thu, 31 Jan 2013 11:56:28 +0100 (CET) Subject: [Numpy-discussion] =?utf-8?q?=5BANN=5D_Summer_School_=22Advanced_?= =?utf-8?q?Scientific_Programming_in_Python=22_in_Z=C3=BCrich=2C_Switzerla?= =?utf-8?q?nd?= Message-ID: <20130131105628.5223E12E00D8@comms.bccn-berlin.de> Advanced Scientific Programming in Python ========================================= a Summer School by the G-Node and the Physik-Institut, University of Zurich Scientists spend more and more time writing, maintaining, and debugging software. While techniques for doing this efficiently have evolved, only few scientists actually use them. As a result, instead of doing their research, they spend far too much time writing deficient code and reinventing the wheel. In this course we will present a selection of advanced programming techniques, incorporating theoretical lectures and practical exercises tailored to the needs of a programming scientist. New skills will be tested in a real programming project: we will team up to develop an entertaining scientific computer game. We use the Python programming language for the entire course. Python works as a simple programming language for beginners, but more importantly, it also works great in scientific simulations and data analysis. We show how clean language design, ease of extensibility, and the great wealth of open source libraries for scientific computing and data visualization are driving Python to become a standard tool for the programming scientist. This school is targeted at Master or PhD students and Post-docs from all areas of science. Competence in Python or in another language such as Java, C/C++, MATLAB, or Mathematica is absolutely required. Basic knowledge of Python is assumed. Participants without any prior experience with Python should work through the proposed introductory materials before the course. Date and Location ================= September 1?6, 2013. Z?rich, Switzerlandi. Preliminary Program =================== Day 0 (Sun Sept 1) ? Best Programming Practices - Best Practices, Development Methodologies and the Zen of Python - Version control with git - Object-oriented programming & design patterns Day 1 (Mon Sept 2) ? Software Carpentry - Test-driven development, unit testing & quality assurance - Debugging, profiling and benchmarking techniques - Best practices in data visualization - Programming in teams Day 2 (Tue Sept 3) ? Scientific Tools for Python - Advanced NumPy - The Quest for Speed (intro): Interfacing to C with Cython - Advanced Python I: idioms, useful built-in data structures, generators Day 3 (Wed Sept 4) ? The Quest for Speed - Writing parallel applications in Python - Programming project Day 4 (Thu Sept 5) ? Efficient Memory Management - When parallelization does not help: the starving CPUs problem - Advanced Python II: decorators and context managers - Programming project Day 5 (Fri Sept 6) ? Practical Software Development - Programming project - The Pelita Tournament Every evening we will have the tutors' consultation hour : Tutors will answer your questions and give suggestions for your own projects. Applications ============ You can apply on-line at http://python.g-node.org Applications must be submitted before 23:59 CEST, May 1, 2013. 
Notifications of acceptance will be sent by June 1, 2013. No fee is charged but participants should take care of travel, living, and accommodation expenses. Candidates will be selected on the basis of their profile. Places are limited: acceptance rate is usually around 20%. Prerequisites: You are supposed to know the basics of Python to participate in the lectures. You are encouraged to go through the introductory material available on the website.
Faculty
=======
- Francesc Alted, Continuum Analytics Inc., USA
- Pietro Berkes, Enthought Inc., UK
- Valentin Haenel, freelance developer and consultant, Berlin, Germany
- Zbigniew Jędrzejewski-Szmek, Krasnow Institute, George Mason University, USA
- Eilif Muller, Blue Brain Project, École Polytechnique Fédérale de Lausanne, Switzerland
- Emanuele Olivetti, NeuroInformatics Laboratory, Fondazione Bruno Kessler and University of Trento, Italy
- Rike-Benjamin Schuppner, Technologit GbR, Germany
- Bartosz Teleńczuk, Unité de Neurosciences Information et Complexité, CNRS, France
- Stéfan van der Walt, Applied Mathematics, Stellenbosch University, South Africa
- Bastian Venthur, Berlin Institute of Technology and Bernstein Focus Neurotechnology, Germany
- Niko Wilbert, TNG Technology Consulting GmbH, Germany
- Tiziano Zito, Institute for Theoretical Biology, Humboldt-Universität zu Berlin, Germany
Organized by Nicola Chiapolini and colleagues of the Physik-Institut, University of Zurich, and by Zbigniew Jędrzejewski-Szmek and Tiziano Zito for the German Neuroinformatics Node of the INCF. Website: http://python.g-node.org Contact: python-info at g-node.org
From oscar.villellas at continuum.io Thu Jan 31 11:43:23 2013 From: oscar.villellas at continuum.io (Oscar Villellas) Date: Thu, 31 Jan 2013 17:43:23 +0100 Subject: [Numpy-discussion] pull request: generalized ufunc signature fix and linear algebra generalized ufuncs Message-ID: Hello, At Continuum Analytics we've been working on a submodule implementing a set of linear algebra operations as generalized ufuncs. This allows specifying arrays of linear algebra problems to be computed with a single Python call, allowing broadcasting as well. As the vectorization is handled in the kernel, this gives a speed edge on the operations. We think this could be useful to the community and we want to share the work done. I've created a couple of pull-requests: The first one contains a fix for a bug in the handling of certain signatures in the gufuncs. This was found while building the submodule. The fix was done by Mark Wiebe, so credit should go to him :). https://github.com/numpy/numpy/pull/2953 The second pull request contains the submodule itself and builds on top of the previous fix. It contains a rst file that explains the submodule, enumerates the functions implemented and details some implementation bits. The entry point to the module is written in Python and contains detailed docstrings. https://github.com/numpy/numpy/pull/2954 We are open to discussion and to make improvements to the code if needed, in order to adapt to NumPy standards. Thanks, Oscar.
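To make the broadcasting point concrete, here is the kind of call pattern such gufuncs enable. The `gufunc_solve` name below is a placeholder, not the actual entry point of the submodule; see the pull request for the real interface:

import numpy as np

# A stack of 10000 independent 3x3 systems.
rng = np.random.RandomState(0)
A = rng.rand(10000, 3, 3) + 3 * np.eye(3)   # diagonally dominant, hence non-singular
b = rng.rand(10000, 3)

# Status quo: one Python-level call per matrix.
x_loop = np.array([np.linalg.solve(A[i], b[i]) for i in range(len(A))])

# With a generalized ufunc of signature (m,m),(m)->(m), the whole stack is
# handled in a single call and the leading dimensions broadcast like any ufunc:
# x_stacked = gufunc_solve(A, b)
# np.testing.assert_allclose(x_stacked, x_loop)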
From njs at pobox.com Thu Jan 31 14:44:05 2013 From: njs at pobox.com (Nathaniel Smith) Date: Thu, 31 Jan 2013 11:44:05 -0800 Subject: [Numpy-discussion] pull request: generalized ufunc signature fix and linear algebra generalized ufuncs In-Reply-To: References: Message-ID: On Thu, Jan 31, 2013 at 8:43 AM, Oscar Villellas wrote: > Hello, > > At Continuum Analytics we've been working on a submodule implementing > a set of linear algebra operations as generalized ufuncs. This allows > specifying arrays of linear algebra problems to be computed with a > single Python call, allowing broadcasting as well. As the > vectorization is handled in the kernel, this gives a speed edge on the > operations. We think this could be useful to the community and we want > to share the work done. It certainly does look useful. My question is -- why do we need two complete copies of the linear algebra routine interfaces? Can we just replace the existing linalg functions with these new implementations? Or if not, what prevents it? -n
From robert.kern at gmail.com Thu Jan 31 15:35:22 2013 From: robert.kern at gmail.com (Robert Kern) Date: Thu, 31 Jan 2013 20:35:22 +0000 Subject: [Numpy-discussion] pull request: generalized ufunc signature fix and linear algebra generalized ufuncs In-Reply-To: References: Message-ID: On Thu, Jan 31, 2013 at 7:44 PM, Nathaniel Smith wrote: > On Thu, Jan 31, 2013 at 8:43 AM, Oscar Villellas > wrote: >> Hello, >> >> At Continuum Analytics we've been working on a submodule implementing >> a set of linear algebra operations as generalized ufuncs. This allows >> specifying arrays of linear algebra problems to be computed with a >> single Python call, allowing broadcasting as well. As the >> vectorization is handled in the kernel, this gives a speed edge on the >> operations. We think this could be useful to the community and we want >> to share the work done. > > It certainly does look useful. My question is -- why do we need two > complete copies of the linear algebra routine interfaces? Can we just > replace the existing linalg functions with these new implementations? > Or if not, what prevents it? The error reporting would have to be bodged back in. -- Robert Kern
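For context, "bodging the error reporting back in" would presumably mean a thin Python layer on top of the gufunc kernels. A minimal sketch, assuming (purely for illustration) a kernel that signals a failed factorization by returning NaNs; the actual submodule may use a different convention:

import numpy as np
from numpy.linalg import LinAlgError

def solve_with_errors(a, b, gufunc_solve):
    # gufunc_solve is the hypothetical low-level kernel; it is passed in here
    # rather than imported because its real name and location are not assumed.
    x = gufunc_solve(a, b)
    if not np.all(np.isfinite(x)):
        raise LinAlgError("Singular matrix")
    return x

The point is only that exceptions like LinAlgError fit naturally in a Python wrapper rather than in the gufunc inner loop, which is presumably what would need to be restored if the gufuncs replaced the existing linalg functions.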