From Nicolas.Rougier at inria.fr Fri May 1 03:49:50 2015
From: Nicolas.Rougier at inria.fr (Nicolas P. Rougier)
Date: Fri, 1 May 2015 09:49:50 +0200
Subject: [Numpy-discussion] EuroScipy 2015: Extended deadline (15/05/2015)
Message-ID: <763F7E6D-7850-403E-ADEC-79167A31FD41@inria.fr>

--------------------------------
Extended deadline: 15th May 2015
--------------------------------

EuroScipy 2015, the annual conference on Python in science, will take place in Cambridge, UK on 26-30 August 2015. The conference features two days of tutorials followed by two days of scientific talks & posters and an extra day dedicated to developer sprints. It is the major event in Europe in the field of technical/scientific computing within the Python ecosystem. Data scientists, analysts, quants, PhDs, scientists and students from more than 20 countries attended the conference last year.

The topics presented at EuroSciPy are very diverse, with a focus on advanced software engineering and original uses of Python and its scientific libraries, either in theoretical or experimental research, from both academia and industry. Submissions for posters, talks & tutorials (beginner and advanced) are welcome on our website at http://www.euroscipy.org/2015/

Sprint proposals should be addressed directly to the organisation at euroscipy-org at python.org

Important dates
===============
Mar 24, 2015      Call for talks, posters & tutorials
Apr 30, 2015      Talk and tutorials submission deadline
May 15, 2015      EXTENDED DEADLINE
May 1, 2015       Registration opens
May 30, 2015      Final program announced
Jun 15, 2015      Early-bird registration ends
Aug 26-27, 2015   Tutorials
Aug 28-29, 2015   Main conference
Aug 30, 2015      Sprints

We look forward to an exciting conference and hope to see you in Cambridge.

The EuroSciPy 2015 Team - http://www.euroscipy.org/2015/

From faltet at gmail.com Fri May 1 05:26:50 2015
From: faltet at gmail.com (Francesc Alted)
Date: Fri, 01 May 2015 11:26:50 +0200
Subject: [Numpy-discussion] ANN: PyTables 3.2.0 RC2 is out
Message-ID: <554346DA.5000309@gmail.com>

============================
Announcing PyTables 3.2.0rc2
============================

We are happy to announce PyTables 3.2.0rc2.

*******************************
IMPORTANT NOTICE:

If you are a user of PyTables, it needs your help to keep going. Please read the following thread, as it contains important information about the future (or lack of it) of the project:

https://groups.google.com/forum/#!topic/pytables-users/yY2aUa4H7W4

Thanks!
*******************************

What's new
==========

This is a major release of PyTables, the result of more than a year of accumulated patches; most importantly, it fixes a couple of nasty problems with indexed queries not returning the correct results in some scenarios (mainly affecting pandas users). There are many usability and performance improvements too.

In case you want to know in more detail what has changed in this version, please refer to: http://www.pytables.org/release_notes.html

You can install it via pip or download a source package with generated PDF and HTML docs from: http://sourceforge.net/projects/pytables/files/pytables/3.2.0rc2

For an online version of the manual, visit: http://www.pytables.org/usersguide/index.html

What is it?
===========

PyTables is a library for managing hierarchical datasets, designed to efficiently cope with extremely large amounts of data, with support for full 64-bit file addressing.
PyTables runs on top of the HDF5 library and the NumPy package to achieve maximum throughput and convenient use. PyTables includes OPSI, a new indexing technology that allows data lookups in tables exceeding 10 gigarows (10**10 rows) to be performed in less than a tenth of a second.

Resources
=========

About PyTables: http://www.pytables.org
About the HDF5 library: http://hdfgroup.org/HDF5/
About NumPy: http://numpy.scipy.org/

Acknowledgments
===============

Thanks to the many users who provided feature improvements, patches, bug reports, support and suggestions. See the ``THANKS`` file in the distribution package for an (incomplete) list of contributors. Most especially, a lot of kudos go to the HDF5 and NumPy makers. Without them, PyTables simply would not exist.

Share your experience
=====================

Let us know of any bugs, suggestions, gripes, kudos, etc. you may have.

----

**Enjoy data!**

-- The PyTables Developers

From aymeric.rateau at gmail.com Mon May 4 16:17:42 2015
From: aymeric.rateau at gmail.com (Gmail)
Date: Mon, 04 May 2015 22:17:42 +0200
Subject: [Numpy-discussion] read not byte aligned records
Message-ID: <5547D3E6.9080400@gmail.com>

Hi,

I am developing code to read binary files (MDF, Measurement Data File). In the previous version 3, data was always byte aligned. I made wide use of the numpy.core.records module (fromstring, fromfile), which showed good performance for reading and unpacking data on the fly.

However, the latest version 4 allows data that is not byte aligned. This reduces file size, especially when the raw data does not fill whole bytes, like 10 bits from an analog converter. For instance, a record structure could be: uint64, float32, uint8, uint10, padding 6 bits, uint9, padding 7 bits, uint24, uint24, uint24, etc.

I found a way to read these unaligned records using the bitstring module instead of numpy.core.records, but performance is much worse (around 10x slower in pure Python; I did not try the Cython implementation, though). Is there a pure numpy way to do this?

Regards
Aymeric

From Jerome.Kieffer at esrf.fr Tue May 5 01:21:24 2015
From: Jerome.Kieffer at esrf.fr (Jerome Kieffer)
Date: Tue, 5 May 2015 07:21:24 +0200
Subject: [Numpy-discussion] read not byte aligned records
In-Reply-To: <5547D3E6.9080400@gmail.com>
References: <5547D3E6.9080400@gmail.com>
Message-ID: <20150505072124.aa8746c35d26992bb5f16ec2@esrf.fr>

Hi,

If you want to play with 10-bit data blocks, read 5 bytes and work with 4 entries at a time...

-- Jérôme Kieffer, Data analysis unit - ESRF

From njs at pobox.com Tue May 5 02:15:46 2015
From: njs at pobox.com (Nathaniel Smith)
Date: Mon, 4 May 2015 23:15:46 -0700
Subject: [Numpy-discussion] read not byte aligned records
In-Reply-To: <20150505072124.aa8746c35d26992bb5f16ec2@esrf.fr>
References: <5547D3E6.9080400@gmail.com> <20150505072124.aa8746c35d26992bb5f16ec2@esrf.fr>

On Mon, May 4, 2015 at 10:21 PM, Jerome Kieffer wrote:
> Hi,
> If you want to play with 10-bit data blocks, read 5 bytes and work with 4 entries at a time...

NumPy arrays don't have any support for sub-byte alignment. So if you want to handle such data, you either need to write some manual packing/unpacking code (using bitshift operators, or perhaps np.unpackbits, or whatever), or use another library designed for doing this. You may find Cython useful to write the core packing/unpacking, since bit-by-bit processing in a for loop is not something that CPython is super well suited to.

Good luck,
-n

-- Nathaniel J. Smith -- http://vorpus.org
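To make the bitshift approach above concrete, here is a minimal sketch of extracting one unaligned field with plain numpy. The file name, record size and field offset are invented for illustration, and little-endian bit packing is assumed:

import numpy as np

RECORD_BYTES = 8   # assumed fixed record size
BIT_OFFSET = 4     # assumed start bit of the field within each record
WIDTH = 10         # field width in bits

raw = np.fromfile("data.bin", dtype=np.uint8).reshape(-1, RECORD_BYTES)

# Widen the two bytes containing the field, combine them little-endian,
# then shift and mask. This works whenever BIT_OFFSET % 8 + WIDTH <= 16.
lo = raw[:, BIT_OFFSET // 8].astype(np.uint16)
hi = raw[:, BIT_OFFSET // 8 + 1].astype(np.uint16)
word = lo | (hi << 8)
field = (word >> (BIT_OFFSET % 8)) & ((1 << WIDTH) - 1)

The shifts and masks run vectorized over all records at once, which is the main advantage over per-record bit handling in a Python loop.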
From aymeric.rateau at gmail.com Tue May 5 07:07:42 2015
From: aymeric.rateau at gmail.com (aymeric.rateau at gmail.com)
Date: Tue, 05 May 2015 11:07:42 +0000
Subject: [Numpy-discussion] read not byte aligned records
In-Reply-To: References: <5547D3E6.9080400@gmail.com> <20150505072124.aa8746c35d26992bb5f16ec2@esrf.fr>
Message-ID: <080ff0d475a9941f5f752078524158b7@ratal.org>

Hi,

To answer Jerome (I hope): data is sometimes spread over bytes shared with other data in the record. 10 bits was just an example; sometimes it is 24, 2, 8, 7, etc., all combined, including some padding between them. I am not sure I have understood...

To Nathaniel: yes, indeed, I could read the records as big/long byte words and apply the right_shift and bitwise_and functions to extract each channel. I am a bit afraid of the performance, though. I am currently using the bitstring module, which does exactly this bit handling. It is implemented in both pure Python and Cython. I tried the pure Python one, and the performance penalty compared to byte-aligned data is around 2-3x for similar file sizes.

--> I will try bitstring's Cython implementation.
--> I will also try the approach using right_shift and bitwise_and.

The best one will win, but at least I know from your answers that I am not missing any trick or optimisation and that I am heading in the right direction. Thanks!

Regards
Aymeric

On 5 May 2015 at 08:15, "Nathaniel Smith" wrote:
> [...]

From ben.root at ou.edu Tue May 5 09:39:45 2015
From: ben.root at ou.edu (Benjamin Root)
Date: Tue, 5 May 2015 09:39:45 -0400
Subject: [Numpy-discussion] read not byte aligned records
In-Reply-To: <080ff0d475a9941f5f752078524158b7@ratal.org>
References: <5547D3E6.9080400@gmail.com> <20150505072124.aa8746c35d26992bb5f16ec2@esrf.fr> <080ff0d475a9941f5f752078524158b7@ratal.org>

I have been very happy with the bitarray package. I don't know if it is faster than bitstring, but it is worth a mention. Just watch out for any hashing operations on its objects; it doesn't seem to do them right (set(), dict(), etc.), but comparison operations work just fine.

Ben Root

From allanhaldane at gmail.com Tue May 5 11:13:22 2015
From: allanhaldane at gmail.com (Allan Haldane)
Date: Tue, 05 May 2015 11:13:22 -0400
Subject: [Numpy-discussion] Should ndarray subclasses support the keepdims arg?
Message-ID: <5548DE12.2060606@gmail.com>

Hello all,

A question:

Many ndarray methods (e.g. sum, mean, any, min) have a "keepdims" keyword argument, but ndarray subclass methods sometimes don't.
The 'matrix' subclass doesn't, and numpy functions like 'np.sum' intentionally drop/ignore the keepdims argument when called with an ndarray subclass as first argument.

This means you can't always use ndarray subclasses as a 'drop-in' replacement for ndarrays if the code uses keepdims (even indirectly), and it means code that deals with keepdims (e.g. np.sum and more) has to detect ndarray subclasses and drop keepdims even if the subclass supports it (since there is no good way to detect support). It seems to me that if we are going to use inheritance, subclass methods should keep the signature of the parent class methods. What does the list think?

---- Details: ----

This problem comes up in a PR I'm working on (#5706) to add the keepdims arg to masked array methods. In order to support masked matrices (which a lot of unit tests check), I would have to detect and drop the keepdims arg to avoid an exception. This would be solved if the matrix class supported keepdims (plus an update to np.sum). Similarly, `np.sum(mymaskedarray, keepdims=True)` does not respect keepdims, but it could work if all subclasses supported keepdims.

I do not foresee immediate problems with adding keepdims to the matrix methods, except that it would be an unused argument. Modifying `np.sum` to always pass on the keepdims arg is trickier, since it would break any code that tried to np.sum a subclass that doesn't support keepdims, e.g. pandas.DataFrame. **kwargs tricks might work. But if it's permissible, I think it would be better to require subclasses to support all the keyword args ndarray supports.

Allan

From njs at pobox.com Tue May 5 13:55:07 2015
From: njs at pobox.com (Nathaniel Smith)
Date: Tue, 5 May 2015 10:55:07 -0700
Subject: [Numpy-discussion] Should ndarray subclasses support the keepdims arg?
In-Reply-To: <5548DE12.2060606@gmail.com>
References: <5548DE12.2060606@gmail.com>

AFAICT the only real solution here is for np.sum and friends to propagate the keepdims argument if and only if it was explicitly passed to them (or, under the slightly different rule, if and only if it has a non-default value). If we just started requiring code to handle it and passing it unconditionally, then as soon as someone upgraded numpy all their existing code might break for no good reason.

On May 5, 2015 8:13 AM, "Allan Haldane" wrote:
> [...]
From mistersheik at gmail.com Tue May 5 14:05:08 2015
From: mistersheik at gmail.com (Neil Girdhar)
Date: Tue, 5 May 2015 14:05:08 -0400
Subject: [Numpy-discussion] Should ndarray subclasses support the keepdims arg?
In-Reply-To: References: <5548DE12.2060606@gmail.com>

Maybe they should have written their code with **kwargs that consumes all keyword arguments, rather than assuming that no keyword arguments would be added? The problem with this approach in general is that it makes writing code unnecessarily convoluted.

On Tue, May 5, 2015 at 1:55 PM, Nathaniel Smith wrote:
> AFAICT the only real solution here is for np.sum and friends to propagate
> the keepdims argument if and only if it was explicitly passed to them (or,
> under the slightly different rule, if and only if it has a non-default value).
> If we just started requiring code to handle it and passing it
> unconditionally, then as soon as someone upgraded numpy all their existing
> code might break for no good reason.
> On May 5, 2015 8:13 AM, "Allan Haldane" wrote:
>> [...]
From sebastian at sipsolutions.net Tue May 5 13:41:51 2015
From: sebastian at sipsolutions.net (Sebastian Berg)
Date: Tue, 05 May 2015 19:41:51 +0200
Subject: [Numpy-discussion] Should ndarray subclasses support the keepdims arg?
In-Reply-To: <5548DE12.2060606@gmail.com>
References: <5548DE12.2060606@gmail.com>
Message-ID: <1430847711.2930.4.camel@sipsolutions.net>

On Tue, 2015-05-05 at 11:13 -0400, Allan Haldane wrote:
> [...]

What is the advantage over having an error raised due to the invalid **kwargs trick when the subclass does not support it? At first sight, a hard requirement seems like a long shot. The transition period alone seems hard, unless we add magic to test the subclass upon creation, and I am not sure that is easy to do (something like an ABC conformance test).

- Sebastian
From alexander.brezinov at mmbresearch.com Tue May 5 15:52:44 2015
From: alexander.brezinov at mmbresearch.com (Alexander Brezinov)
Date: Tue, 5 May 2015 15:52:44 -0400
Subject: [Numpy-discussion] import scipy.linalg is hanging on Marvell armada 370

Hello

The import of scipy.linalg hangs in the DOUBLE_multiply function (BINARY_LOOP) in umath.so. After attaching gdb and dumping the local variables, the args are empty strings. Could you please advise whether this is a known issue? I searched the mailing list and could not find any solution to the problem.

I am running:

kernel 3.2.36 + Debian wheezy on ARMv7l armhf
CPU Armada 370 Marvell
python 2.7.3
scipy 0.15.1
numpy 1.9.2

The problem can be reproduced by launching python and importing scipy.linalg (import scipy.linalg).

I also ran the same OS on qemu and was not able to reproduce the issue. A similar architecture such as the Raspberry Pi (ARMv7 armhf) is fine. Also, using software floating point instead of hardware floating point on the same Armada 370 (ARMv7) works just fine.

Thank you for any comments or suggestions in advance,
Alex

From njs at pobox.com Tue May 5 16:39:13 2015
From: njs at pobox.com (Nathaniel Smith)
Date: Tue, 5 May 2015 13:39:13 -0700
Subject: [Numpy-discussion] Should ndarray subclasses support the keepdims arg?
In-Reply-To: References: <5548DE12.2060606@gmail.com>

On May 5, 2015 11:05 AM, "Neil Girdhar" wrote:
>
> Maybe they should have written their code with **kwargs that consumes all keyword arguments, rather than assuming that no keyword arguments would be added? The problem with this approach in general is that it makes writing code unnecessarily convoluted.

If the user asked for keepdims=True, then silently ignoring this is worse than raising an error. And I guess I would call this making code necessarily convoluted :-).

There are not that many options for evolving an interface shared by multiple unrelated libraries.

-n

From allanhaldane at gmail.com Tue May 5 18:46:24 2015
From: allanhaldane at gmail.com (Allan Haldane)
Date: Tue, 05 May 2015 18:46:24 -0400
Subject: [Numpy-discussion] Should ndarray subclasses support the keepdims arg?
In-Reply-To: References: <5548DE12.2060606@gmail.com>
Message-ID: <55494840.4060705@gmail.com>

That makes sense. I think it's the way to go, thanks. The downside is that using **kwargs instead of an explicit keepdims arg gives a more obscure signature, but using the new __signature__ attribute this could be hidden for Python 3 users and Python 2 users on IPython 3+.

On 05/05/2015 01:55 PM, Nathaniel Smith wrote:
> AFAICT the only real solution here is for np.sum and friends to
> propagate the keepdims argument if and only if it was explicitly passed
> to them (or, under the slightly different rule, if and only if it has a
> non-default value). If we just started requiring code to handle it and
> passing it unconditionally, then as soon as someone upgraded numpy all
> their existing code might break for no good reason.
> On May 5, 2015 8:13 AM, "Allan Haldane" wrote:
> > [...]

From charlesr.harris at gmail.com Tue May 5 20:18:02 2015
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Tue, 5 May 2015 18:18:02 -0600
Subject: [Numpy-discussion] import scipy.linalg is hanging on Marvell armada 370

On Tue, May 5, 2015 at 1:52 PM, Alexander Brezinov <alexander.brezinov at mmbresearch.com> wrote:
> Hello
>
> The import of scipy.linalg hangs in the DOUBLE_multiply function
> (BINARY_LOOP) in umath.so. After attaching gdb and dumping the local
> variables, the args are empty strings. Could you please advise whether
> this is a known issue?
> [...]
This almost sounds like a compiler problem; are you using a correctly compiled version of umath.so? Not that there couldn't be other sources of the problem...

Chuck

From njs at pobox.com Wed May 6 05:59:16 2015
From: njs at pobox.com (Nathaniel Smith)
Date: Wed, 6 May 2015 02:59:16 -0700
Subject: [Numpy-discussion] Dispatch rules for binary operations on ndarrays

I just wanted to draw the list's attention to a discussion happening on the tracker, about the details of how methods like ndarray.__add__ are implemented, and how this interacts with the new __numpy_ufunc__ method that will make it possible for third party libraries to override arbitrary ufuncs starting in (hopefully) 1.10:

https://github.com/numpy/numpy/issues/5844

The details are somewhat arcane, but very important for anyone who implements ndarray-like objects or (to a lesser extent) anyone who subclasses ndarray. So feedback is very welcome.

-n

From faltet at gmail.com Wed May 6 06:11:10 2015
From: faltet at gmail.com (Francesc Alted)
Date: Wed, 6 May 2015 12:11:10 +0200
Subject: [Numpy-discussion] ANN: python-blosc 1.2.7 released

=============================
Announcing python-blosc 1.2.7
=============================

What is new?
============

Updated to use c-blosc v1.6.1. Although c-blosc now supports AVX2, it is not enabled in python-blosc because we still need to devise a way to detect AVX2 on the underlying platform. At any rate, c-blosc 1.6.1 fixed an important bug in the blosclz codec, so a new release was deemed important.

For more info, you can have a look at the release notes in:

https://github.com/Blosc/python-blosc/wiki/Release-notes

More docs and examples are available in the documentation site: http://python-blosc.blosc.org

What is it?
===========

Blosc (http://www.blosc.org) is a high performance compressor optimized for binary data. It has been designed to transmit data to the processor cache faster than the traditional, non-compressed, direct memory fetch approach via a memcpy() call. Blosc is the first compressor that is meant not only to reduce the size of large datasets on-disk or in-memory, but also to accelerate object manipulations that are memory-bound (http://www.blosc.org/docs/StarvingCPUs.pdf). See http://www.blosc.org/synthetic-benchmarks.html for some benchmarks on how much speed it can achieve on some datasets.

Blosc works well for compressing numerical arrays that contain data with relatively low entropy, like sparse data, time series, grids with regularly-spaced values, etc.

python-blosc (http://python-blosc.blosc.org/) is the Python wrapper for the Blosc compression library.

There is also a handy tool built on Blosc called Bloscpack (https://github.com/Blosc/bloscpack). It features a command line interface that allows you to compress large binary datafiles on-disk. It also comes with a Python API that has built-in support for serializing and deserializing NumPy arrays both on-disk and in-memory at speeds that are competitive with the regular Pickle/cPickle machinery.
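As a quick sketch of the in-memory Python API (assuming the compress/decompress and pack_array/unpack_array helpers of current python-blosc):

import numpy as np
import blosc

a = np.linspace(0, 100, 10**6)

# Compress the raw buffer; typesize tells the shuffle filter the item width.
packed = blosc.compress(a.tobytes(), typesize=a.dtype.itemsize)
restored = np.frombuffer(blosc.decompress(packed), dtype=a.dtype)
assert np.array_equal(a, restored)

# pack_array/unpack_array also round-trip dtype and shape metadata.
assert np.array_equal(a, blosc.unpack_array(blosc.pack_array(a)))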
Installing
==========

python-blosc is in the PyPI repository, so installing it is easy:

$ pip install -U blosc  # yes, you should omit the python- prefix

Download sources
================

The sources are managed through github services at: http://github.com/Blosc/python-blosc

Documentation
=============

There is a Sphinx-based documentation site at: http://python-blosc.blosc.org/

Mailing list
============

There is an official mailing list for Blosc at: blosc at googlegroups.com http://groups.google.es/group/blosc

Licenses
========

Both Blosc and its Python wrapper are distributed using the MIT license. See: https://github.com/Blosc/python-blosc/blob/master/LICENSES for more details.

----

**Enjoy data!**

-- Francesc Alted

From faltet at gmail.com Wed May 6 12:37:24 2015
From: faltet at gmail.com (Francesc Alted)
Date: Wed, 6 May 2015 18:37:24 +0200
Subject: [Numpy-discussion] ANN: PyTables 3.2.0 (final) released!

=========================
Announcing PyTables 3.2.0
=========================

We are happy to announce PyTables 3.2.0.

*******************************
IMPORTANT NOTICE:

If you are a user of PyTables, it needs your help to keep going. Please read the following thread, as it contains important information about the future (or the lack of it) of the project:

https://groups.google.com/forum/#!topic/pytables-users/yY2aUa4H7W4

Thanks!
*******************************

What's new
==========

This is a major release of PyTables, the result of more than a year of accumulated patches; most importantly, it fixes a couple of nasty problems with indexed queries not returning the correct results in some scenarios. There are many usability and performance improvements too.

In case you want to know in more detail what has changed in this version, please refer to: http://www.pytables.org/release_notes.html

You can install it via pip or download a source package with generated PDF and HTML docs from: http://sourceforge.net/projects/pytables/files/pytables/3.2.0

For an online version of the manual, visit: http://www.pytables.org/usersguide/index.html

What is it?
===========

PyTables is a library for managing hierarchical datasets, designed to efficiently cope with extremely large amounts of data, with support for full 64-bit file addressing. PyTables runs on top of the HDF5 library and the NumPy package to achieve maximum throughput and convenient use. PyTables includes OPSI, a new indexing technology that allows data lookups in tables exceeding 10 gigarows (10**10 rows) to be performed in less than a tenth of a second.

Resources
=========

About PyTables: http://www.pytables.org
About the HDF5 library: http://hdfgroup.org/HDF5/
About NumPy: http://numpy.scipy.org/

Acknowledgments
===============

Thanks to the many users who provided feature improvements, patches, bug reports, support and suggestions. See the ``THANKS`` file in the distribution package for an (incomplete) list of contributors. Most especially, a lot of kudos go to the HDF5 and NumPy makers. Without them, PyTables simply would not exist.

Share your experience
=====================

Let us know of any bugs, suggestions, gripes, kudos, etc. you may have.

----

**Enjoy data!**

-- The PyTables Developers
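To illustrate the kind of indexed query described above, a minimal sketch; the file name and column layout are invented, and create_csindex builds a completely sorted index on the column:

import tables as tb

class Reading(tb.IsDescription):
    time = tb.Float64Col()
    value = tb.Int32Col()

with tb.open_file("readings.h5", "w") as f:
    table = f.create_table("/", "readings", Reading)
    table.append([(float(i), i % 100) for i in range(100000)])
    table.cols.value.create_csindex()        # index the 'value' column
    hits = table.read_where("value > 97")    # the query uses the index
    print(len(hits))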
From damilarefagbemi at gmail.com Wed May 6 20:26:02 2015
From: damilarefagbemi at gmail.com (Dammy)
Date: Wed, 6 May 2015 17:26:02 -0700 (MST)
Subject: [Numpy-discussion] Using genfromtxt to import a csv with a string class label and hundreds of integer features
Message-ID: <1430958362815-40319.post@n7.nabble.com>

Hi,

I am trying to use numpy.genfromtxt to import a csv for classification using scikit-learn. The first column in the csv is a string-type class label, while the 200+ remaining columns are integer features. I wish to find out how I can use the genfromtxt function to specify a dtype of string for the first column while specifying an int type for all other columns.
> > I have tried using "dtype=None" as shown below, but when I print > dataset.shape, I get (number_or_rows,) i.e no columns are read in: > dataset = np.genfromtxt(file,delimiter=',', skip_header=True) > > I also tried setting the dtypes as shown in the examples below, but I get > the same error as dtype=None: these dtypes will create structured arrays: http://docs.scipy.org/doc/numpy/user/basics.rec.html so it is expected that the shape is the number of rows, the colums are part of the dtype and can be accessed like a dictionary: In [21]: d = np.ones(3, dtype='S2, int8') In [22]: d Out[22]: array([('1', 1), ('1', 1), ('1', 1)], dtype=[('f0', 'S2'), ('f1', 'i1')]) In [23]: d.shape Out[23]: (3,) In [24]: d.dtype.names Out[24]: ('f0', 'f1') In [25]: d[0] Out[25]: ('1', 1) In [26]: d['f0'] Out[26]: array(['1', '1', '1'], dtype='|S2') In [27]: d['f1'] Out[27]: array([1, 1, 1], dtype=int8) From jaime.frio at gmail.com Sat May 9 13:48:46 2015 From: jaime.frio at gmail.com (=?UTF-8?Q?Jaime_Fern=C3=A1ndez_del_R=C3=ADo?=) Date: Sat, 9 May 2015 10:48:46 -0700 Subject: [Numpy-discussion] Bug in np.nonzero / Should index returning functions return ndarray subclasses? Message-ID: There is a reported bug (issue #5837 ) regarding different returns from np.nonzero with 1-D vs higher dimensional arrays. A full summary of the differences can be seen from the following output: >>> class C(np.ndarray): pass ... >>> a = np.arange(6).view(C) >>> b = np.arange(6).reshape(2, 3).view(C) >>> anz = a.nonzero() >>> bnz = b.nonzero() >>> type(anz[0]) >>> anz[0].flags C_CONTIGUOUS : True F_CONTIGUOUS : True OWNDATA : True WRITEABLE : True ALIGNED : True UPDATEIFCOPY : False >>> anz[0].base >>> type(bnz[0]) >>> bnz[0].flags C_CONTIGUOUS : False F_CONTIGUOUS : False OWNDATA : False WRITEABLE : False ALIGNED : True UPDATEIFCOPY : False >>> bnz[0].base array([[0, 1], [0, 2], [1, 0], [1, 1], [1, 2]]) The original bug report was only concerned with the non-writeability of higher dimensional array returns, but there are more differences: 1-D always returns an ndarray that owns its memory and is writeable, but higher dimensional arrays return views, of the type of the original array, that are non-writeable. I have a branch that attempts to fix this by making both 1-D and n-D arrays: 1. return a view, never the base array, 2. return an ndarray, never a subclass, and 3. return a writeable view. I guess the most controversial choice is #2, and in fact making that change breaks a few tests. I nevertheless think that all of the index returning functions (nonzero, argsort, argmin, argmax, argpartition) should always return a bare ndarray, not a subclass. I'd be happy to be corrected, but I can't think of any situation in which preserving the subclass would be needed for these functions. Since we are changing the returns of a few other functions in 1.10 (diagonal, diag, ravel), it may be a good moment to revisit the behavior for these other functions. Any thoughts? Jaime -- (\__/) ( O.o) ( > <) Este es Conejo. Copia a Conejo en tu firma y ay?dale en sus planes de dominaci?n mundial. -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Sat May 9 14:42:50 2015 From: njs at pobox.com (Nathaniel Smith) Date: Sat, 9 May 2015 11:42:50 -0700 Subject: [Numpy-discussion] Bug in np.nonzero / Should index returning functions return ndarray subclasses? 
From jaime.frio at gmail.com Sat May 9 13:48:46 2015
From: jaime.frio at gmail.com (Jaime Fernández del Río)
Date: Sat, 9 May 2015 10:48:46 -0700
Subject: [Numpy-discussion] Bug in np.nonzero / Should index returning functions return ndarray subclasses?

There is a reported bug (issue #5837) regarding different returns from np.nonzero with 1-D vs higher dimensional arrays. A full summary of the differences can be seen from the following output:

>>> class C(np.ndarray): pass
...
>>> a = np.arange(6).view(C)
>>> b = np.arange(6).reshape(2, 3).view(C)
>>> anz = a.nonzero()
>>> bnz = b.nonzero()
>>> type(anz[0])
<type 'numpy.ndarray'>
>>> anz[0].flags
  C_CONTIGUOUS : True
  F_CONTIGUOUS : True
  OWNDATA : True
  WRITEABLE : True
  ALIGNED : True
  UPDATEIFCOPY : False
>>> anz[0].base
>>> type(bnz[0])
<class '__main__.C'>
>>> bnz[0].flags
  C_CONTIGUOUS : False
  F_CONTIGUOUS : False
  OWNDATA : False
  WRITEABLE : False
  ALIGNED : True
  UPDATEIFCOPY : False
>>> bnz[0].base
array([[0, 1],
       [0, 2],
       [1, 0],
       [1, 1],
       [1, 2]])

The original bug report was only concerned with the non-writeability of higher dimensional array returns, but there are more differences: 1-D always returns an ndarray that owns its memory and is writeable, but higher dimensional arrays return views, of the type of the original array, that are non-writeable.

I have a branch that attempts to fix this by making both 1-D and n-D arrays:

1. return a view, never the base array,
2. return an ndarray, never a subclass, and
3. return a writeable view.

I guess the most controversial choice is #2, and in fact making that change breaks a few tests. I nevertheless think that all of the index returning functions (nonzero, argsort, argmin, argmax, argpartition) should always return a bare ndarray, not a subclass. I'd be happy to be corrected, but I can't think of any situation in which preserving the subclass would be needed for these functions.

Since we are changing the returns of a few other functions in 1.10 (diagonal, diag, ravel), it may be a good moment to revisit the behavior for these other functions.

Any thoughts?

Jaime

-- (\__/) ( O.o) ( > <) This is Bunny. Copy Bunny into your signature and help him with his plans for world domination.

From njs at pobox.com Sat May 9 14:42:50 2015
From: njs at pobox.com (Nathaniel Smith)
Date: Sat, 9 May 2015 11:42:50 -0700
Subject: [Numpy-discussion] Bug in np.nonzero / Should index returning functions return ndarray subclasses?

On May 9, 2015 10:48 AM, "Jaime Fernández del Río" wrote:
> There is a reported bug (issue #5837) regarding different returns from np.nonzero with 1-D vs higher dimensional arrays.
> [...]
> I have a branch that attempts to fix this by making both 1-D and n-D arrays:
> return a view, never the base array,

This doesn't matter, does it? "View" isn't a thing, only "view of" is meaningful. And in this case, none of the returned arrays share any memory with any other arrays that the user has access to... so whether they were created as a view or not should be an implementation detail that's transparent to the user?

> return an ndarray, never a subclass, and
> return a writeable view.
> I guess the most controversial choice is #2, and in fact making that change breaks a few tests. I nevertheless think that all of the index returning functions (nonzero, argsort, argmin, argmax, argpartition) should always return a bare ndarray, not a subclass. I'd be happy to be corrected, but I can't think of any situation in which preserving the subclass would be needed for these functions.

I also can't see any logical reason why the return type of these functions has anything to do with the type of the inputs. You can index me with my phone number, but my phone number is not a person. OTOH logic and ndarray subclassing don't have much to do with each other; the practical effect is probably more important. Looking at the subclasses I know about (masked arrays, np.matrix, and astropy quantities), though, I also can't see much benefit in copying the subclass of the input, and the fact that we were never consistent about this suggests that people probably aren't depending on it too much.

So in summary my feeling is: +1 to making them writable, no objection to the view thing (though I don't see how it matters), and provisional +1 to consistently returning ndarray (to be revised if the people who use the subclassing functionality disagree).

-n

From ben.root at ou.edu Sat May 9 15:53:31 2015
From: ben.root at ou.edu (Benjamin Root)
Date: Sat, 9 May 2015 15:53:31 -0400
Subject: [Numpy-discussion] Bug in np.nonzero / Should index returning functions return ndarray subclasses?

Absolutely, it should be writable. As for subclassing, that might be messy.
Consider the following:

inds = np.where(data > 5)

In that case, I'd expect a normal, bog-standard ndarray, because that is what you use for indexing (although pandas might have a good argument for having it return one of their special indexing types if "data" was a pandas array...). Next:

foobar = np.where(data > 5, 1, 2)

Again, I'd expect a normal, bog-standard ndarray, because the scalar elements are very simple. This question gets very complicated when considering array arguments. Consider:

merged_data = np.where(data > 5, data, data2)

So, what should "merged_data" be? If both "data" and "data2" are of the same type, then it would be reasonable to return that type, if possible. But what if they aren't the same? Maybe use array_priority to determine the return type? Or perhaps it does make sense to say "sod it all" and always return an ndarray?

I don't know the answer. I do find it interesting that the result from a multi-dimensional array is not writable. I don't know why I have never encountered that.

Ben Root

On Sat, May 9, 2015 at 2:42 PM, Nathaniel Smith wrote:
> [...]
From njs at pobox.com Sat May 9 16:03:07 2015
From: njs at pobox.com (Nathaniel Smith)
Date: Sat, 9 May 2015 13:03:07 -0700
Subject: [Numpy-discussion] Bug in np.nonzero / Should index returning functions return ndarray subclasses?

On May 9, 2015 12:54 PM, "Benjamin Root" wrote:
>
> Absolutely, it should be writable. As for subclassing, that might be messy. Consider the following:
>
> inds = np.where(data > 5)
>
> In that case, I'd expect a normal, bog-standard ndarray, because that is what you use for indexing (although pandas might have a good argument for having it return one of their special indexing types if "data" was a pandas array...).

Pandas doesn't subclass ndarray (anymore), so they're irrelevant to this particular discussion :-). Of course they're an argument for having a cleaner, more general way of allowing non-ndarray array-like objects, but the legacy subclassing system will never be that.

> Next:
>
> foobar = np.where(data > 5, 1, 2)
>
> [...]
>
> Or perhaps it does make sense to say "sod it all" and always return an ndarray?

Not sure what this has to do with Jaime's post about nonzero? There is indeed a potential question about what 3-argument where() should do with subclasses, but that's effectively a different operation entirely, and to discuss it we'd need to know things like what it has historically done and why that was causing problems.

-n

From njs at pobox.com Sat May 9 16:26:58 2015
From: njs at pobox.com (Nathaniel Smith)
Date: Sat, 9 May 2015 13:26:58 -0700
Subject: [Numpy-discussion] Proposed deprecations for 1.10: dot corner cases

Hi all,

I'd like to suggest that we go ahead and add deprecation warnings to the following operations. This doesn't commit us to changing anything on any particular time scale, but it gives us more options later.

1) dot(A, B) where A and B *both* have *3 or more dimensions*: currently, this does a weird "outer product" thing, where it computes all pairwise matrix products.
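To see the two behaviors concretely, a quick shape-level sketch (the @ result is what PEP 465 broadcasting semantics would give, shown as a comment since @ is not yet available):

import numpy as np

A = np.ones((2, 5, 3, 4))
B = np.ones((2, 5, 4, 6))

np.dot(A, B).shape  # (2, 5, 3, 2, 5, 6): all pairwise matrix products
# Under PEP 465, A @ B would instead broadcast over the leading
# dimensions and return shape (2, 5, 3, 6).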
We've had numerous discussions about why this is suboptimal, and it contradicts the PEP 465 semantics for @, which broadcast + vectorize over extra dimensions. (If you have the vectorized version, then the outer product one is easy to derive; if you have only the outer product version, there is no good way to recover the vectorized one.) While dot() is widely used in general, this particular variant is very, very rarely used. I propose we issue a FutureWarning here, so as to lay the groundwork for someday eventually making dot() and @ the same.

2) dot(A, B) where one of the arguments is a scalar: currently, this does scalar multiplication. There is no logically consistent motivation for this, it violates TOOWTDI, and again it is inconsistent with the PEP semantics for @ (which are that this case should be an error). (NB for those still using np.matrix: scalar * np.matrix will still be supported regardless; this would only affect expressions where you actually call the dot() function.) I propose to make this a DeprecationWarning.

-- Nathaniel J. Smith -- http://vorpus.org

From ben.root at ou.edu Sat May 9 16:27:03 2015
From: ben.root at ou.edu (Benjamin Root)
Date: Sat, 9 May 2015 16:27:03 -0400
Subject: [Numpy-discussion] Bug in np.nonzero / Should index returning functions return ndarray subclasses?

On Sat, May 9, 2015 at 4:03 PM, Nathaniel Smith wrote:
> Not sure what this has to do with Jaime's post about nonzero? There is
> indeed a potential question about what 3-argument where() should do with
> subclasses, but that's effectively a different operation entirely, and to
> discuss it we'd need to know things like what it has historically done and
> why that was causing problems.

Because my train of thought started at np.nonzero(), which I have always just mentally mapped to np.where(), and then... squirrel!

Indeed, np.where() has no bearing here.

Ben Root

From njs at pobox.com Sat May 9 16:56:24 2015
From: njs at pobox.com (Nathaniel Smith)
Date: Sat, 9 May 2015 13:56:24 -0700
Subject: [Numpy-discussion] Bug in np.nonzero / Should index returning functions return ndarray subclasses?

On Sat, May 9, 2015 at 1:27 PM, Benjamin Root wrote:
> Because my train of thought started at np.nonzero(), which I have always
> just mentally mapped to np.where(), and then... squirrel!
>
> Indeed, np.where() has no bearing here.

Ah, gotcha :-). There is an argument that we should try to reduce this confusion by nudging people to use np.nonzero() consistently instead of np.where(), via the documentation and/or a warning message...

-- Nathaniel J. Smith -- http://vorpus.org
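For readers following along, the equivalence in question: with a single argument, where() returns exactly what nonzero() returns, while the three-argument form is a different operation altogether:

import numpy as np

cond = np.array([[0, 3], [5, 0]]) > 2

np.nonzero(cond)       # (array([0, 1]), array([1, 0]))
np.where(cond)         # same result: 1-argument where is nonzero
np.where(cond, 1, -1)  # 3-argument form: elementwise selection instead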
From shoyer at gmail.com Sat May 9 21:53:42 2015
From: shoyer at gmail.com (Stephan Hoyer)
Date: Sat, 09 May 2015 18:53:42 -0700 (PDT)
Subject: [Numpy-discussion] Bug in np.nonzero / Should index returning functions return ndarray subclasses?
Message-ID: <1431222821751.71592119@Nodemailer>

With regards to np.where -- shouldn't where be a ufunc, so subclasses or other array-likes can control its behavior with __numpy_ufunc__?

As for the other indexing functions, I don't have a strong opinion about how they should handle subclasses. But it is certainly tricky to attempt to handle arbitrary subclasses. I would agree that the least error-prone thing to do is usually to return base ndarrays. Better to force subclasses to override methods explicitly.

From stefan.otte at gmail.com Sun May 10 06:33:30 2015
From: stefan.otte at gmail.com (Stefan Otte)
Date: Sun, 10 May 2015 10:33:30 +0000
Subject: [Numpy-discussion] Generalize hstack/vstack --> stack; Block matrices like in matlab
In-Reply-To: References: <101656916431878296.890307sturla.molden-gmail.com@news.gmane.org>

Hey,

Just a quick update. I updated the pull request and renamed `stack` to `block`. Have a look: https://github.com/numpy/numpy/pull/5057

I'm sticking with the simple initial implementation because it's simple and does what you think it does.

Cheers,
Stefan

On Fri, Oct 31, 2014 at 2:13 PM Stefan Otte wrote:
> To make the last point more concrete, the implementation could look
> something like this (note that I didn't test it and that it still
> takes some work):
>
> def bmat(obj, ldict=None, gdict=None):
>     return matrix(stack(obj, ldict, gdict))
>
> > > > > > Best, > > Stefan > > > > > > > > On Tue, Oct 28, 2014 at 7:46 PM, Nathaniel Smith wrote: > >> On 28 Oct 2014 18:34, "Stefan Otte" wrote: > >>> > >>> Hey, > >>> > >>> In the last weeks I tested `np.asarray(np.bmat(....))` as `stack` > >>> function and it works quite well. So the question persits: If `bmat` > >>> already offers something like `stack` should we even bother > >>> implementing `stack`? More code leads to more > >>> bugs and maintenance work. (However, the current implementation is > >>> only 5 lines and by using `bmat` which would reduce that even more.) > >> > >> In the long run we're trying to reduce usage of np.matrix and ideally > >> deprecate it entirely. So yes, providing ndarray equivalents of matrix > >> functionality (like bmat) is valuable. > >> > >> -n > >> > >> > >> _______________________________________________ > >> NumPy-Discussion mailing list > >> NumPy-Discussion at scipy.org > >> http://mail.scipy.org/mailman/listinfo/numpy-discussion > >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From stefan.otte at gmail.com Sun May 10 07:40:52 2015 From: stefan.otte at gmail.com (Stefan Otte) Date: Sun, 10 May 2015 11:40:52 +0000 Subject: [Numpy-discussion] Create a n-D grid; meshgrid alternative Message-ID: Hey, quite often I want to evaluate a function on a grid in a n-D space. What I end up doing (and what I really dislike) looks something like this: x = np.linspace(0, 5, 20) M1, M2 = np.meshgrid(x, x) X = np.column_stack([M1.flatten(), M2.flatten()]) X.shape # (400, 2) fancy_function(X) I don't think I ever used `meshgrid` in any other way. Is there a better way to create such a grid space? I wrote myself a little helper function: def gridspace(linspaces): return np.column_stack([space.flatten() for space in np.meshgrid(*linspaces)]) But maybe something like this should be part of numpy? Best, Stefan -------------- next part -------------- An HTML attachment was scrubbed... URL: From stefan.otte at gmail.com Sun May 10 10:05:02 2015 From: stefan.otte at gmail.com (Stefan Otte) Date: Sun, 10 May 2015 16:05:02 +0200 Subject: [Numpy-discussion] Create a n-D grid; meshgrid alternative In-Reply-To: References: Message-ID: I just drafted different versions of the `gridspace` function: https://tmp23.tmpnb.org/user/1waoqQ8PJBJ7/notebooks/2015-05%20gridspace.ipynb Beste Gr??e, Stefan On Sun, May 10, 2015 at 1:40 PM, Stefan Otte wrote: > Hey, > > quite often I want to evaluate a function on a grid in a n-D space. > What I end up doing (and what I really dislike) looks something like this: > > x = np.linspace(0, 5, 20) > M1, M2 = np.meshgrid(x, x) > X = np.column_stack([M1.flatten(), M2.flatten()]) > X.shape # (400, 2) > > fancy_function(X) > > I don't think I ever used `meshgrid` in any other way. > Is there a better way to create such a grid space? > > I wrote myself a little helper function: > > def gridspace(linspaces): > return np.column_stack([space.flatten() > for space in np.meshgrid(*linspaces)]) > > But maybe something like this should be part of numpy? 
> >
> Best,
> Stefan
>

From jaime.frio at gmail.com Sun May 10 12:22:51 2015
From: jaime.frio at gmail.com (Jaime Fernández del Río)
Date: Sun, 10 May 2015 09:22:51 -0700
Subject: [Numpy-discussion] Create a n-D grid; meshgrid alternative
In-Reply-To: References: Message-ID: 

On Sun, May 10, 2015 at 7:05 AM, Stefan Otte wrote:

> I just drafted different versions of the `gridspace` function:
>
> https://tmp23.tmpnb.org/user/1waoqQ8PJBJ7/notebooks/2015-05%20gridspace.ipynb

The link seems to be broken...

Jaime

--
(\__/)
( O.o)
( > <) Este es Conejo. Copia a Conejo en tu firma y ayúdale en sus planes
de dominación mundial.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From aymeric.rateau at gmail.com Sun May 10 15:11:29 2015
From: aymeric.rateau at gmail.com (Gmail)
Date: Sun, 10 May 2015 21:11:29 +0200
Subject: [Numpy-discussion] read not byte aligned records
In-Reply-To: References: <5547D3E6.9080400@gmail.com> <20150505072124.aa8746c35d26992bb5f16ec2@esrf.fr> <080ff0d475a9941f5f752078524158b7@ratal.org> Message-ID: <554FAD61.9000809@gmail.com>

For the archive, I tried bitarray instead of bitstring, and parsing of
the same file went from 180 ms to 60 ms. The code ended up shorter and
simpler, but it is less easy to jump into (the documentation is thin).
Performance is still far from fromstring or fromfile, which take about
5 ms for a similar file size when the data is byte aligned.

Aymeric

My code is below:

# assumes: from numpy import recarray, asarray

def readBitarray(self, bita, channelList=None):
    """ reads a stream of record bytes with the bitarray module;
    needed for data that is not byte aligned

    Parameters
    ----------
    bita : bytes
        stream of record bytes
    channelList : list of str, optional

    Returns
    -------
    rec : numpy recarray
        contains a matrix of raw data in a recarray (attributes
        corresponding to channel name)
    """
    from bitarray import bitarray
    B = bitarray(endian="little")  # little endian by default
    B.frombytes(bytes(bita))
    # initialise data structure
    if channelList is None:
        channelList = self.channelNames
    format = []
    for channel in self:
        if channel.name in channelList:
            format.append(channel.RecordFormat)
    buf = recarray(self.numberOfRecords, format)
    # read data
    for chan in range(len(self)):
        if self[chan].name in channelList:
            record_bit_size = self.CGrecordLength * 8
            temp = [B[self[chan].posBitBeg + record_bit_size * i:
                      self[chan].posBitEnd + record_bit_size * i]
                    for i in range(self.numberOfRecords)]
            nbytes = len(temp[0].tobytes())
            if nbytes != self[chan].nBytes and \
                    self[chan].signalDataType not in (6, 7, 8, 9, 10, 11, 12):
                # not Ctype byte length
                byte = 8 * (self[chan].nBytes - nbytes) * bitarray([False])
                for i in range(self.numberOfRecords):
                    # pad with zero bits to match numpy's byte-length
                    # requirement (extend, not append: we add many bits)
                    temp[i].extend(byte)
            temp = [self[chan].CFormat.unpack(temp[i].tobytes())[0]
                    for i in range(self.numberOfRecords)]
            buf[self[chan].name] = asarray(temp)
    return buf

On 05/05/15 15:39, Benjamin Root wrote:
> I have been very happy with the bitarray package. I don't know if it
> is faster than bitstring, but it is worth a mention. Just watch out
> for any hashing operations on its objects, it doesn't seem to do them
> right (set(), dict(), etc...), but comparison operations work just fine.
>
> Ben Root
>
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
-------------- next part --------------
An HTML attachment was scrubbed...
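For fields that occupy a fixed, small span of bytes within each record,
a pure-numpy alternative to the bitarray loop above is to load the raw
bytes and recover the field with vectorized shifts and masks. A minimal
sketch, assuming 3-byte records whose first two little-endian bytes
carry an unsigned 10-bit field in their low bits; the record layout and
the file name "data.bin" are made up for illustration:

import numpy as np

raw = np.fromfile("data.bin", dtype=np.uint8)
rec = raw.reshape(-1, 3)              # one row per 3-byte record
lo = rec[:, 0].astype(np.uint16)
hi = rec[:, 1].astype(np.uint16)
value = (lo | (hi << 8)) & 0x3FF      # keep the low 10 bits

This stays vectorized end to end, which is exactly what the per-record
Python loops in the bitstring/bitarray versions give up; the cost is
that the shift/mask bookkeeping has to be worked out per channel.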
URL: From jaime.frio at gmail.com Sun May 10 17:46:12 2015 From: jaime.frio at gmail.com (=?UTF-8?Q?Jaime_Fern=C3=A1ndez_del_R=C3=ADo?=) Date: Sun, 10 May 2015 14:46:12 -0700 Subject: [Numpy-discussion] Create a n-D grid; meshgrid alternative In-Reply-To: References: Message-ID: On Sun, May 10, 2015 at 4:40 AM, Stefan Otte wrote: > Hey, > > quite often I want to evaluate a function on a grid in a n-D space. > What I end up doing (and what I really dislike) looks something like this: > > x = np.linspace(0, 5, 20) > M1, M2 = np.meshgrid(x, x) > X = np.column_stack([M1.flatten(), M2.flatten()]) > X.shape # (400, 2) > > fancy_function(X) > > I don't think I ever used `meshgrid` in any other way. > Is there a better way to create such a grid space? > > I wrote myself a little helper function: > > def gridspace(linspaces): > return np.column_stack([space.flatten() > for space in np.meshgrid(*linspaces)]) > > But maybe something like this should be part of numpy? > Isn't what you are trying to build a cartesian product function? There is a neat, efficient implementation of such a function in StackOverflow, by our own pv.: http://stackoverflow.com/questions/1208118/using-numpy-to-build-an-array-of-all-combinations-of-two-arrays/1235363#1235363 Perhaps we could make this part of numpy.lib.arraysetops? Isthere room for other combinatoric generators, i.e. combinations, permutations... as in itertools? Jaime -- (\__/) ( O.o) ( > <) Este es Conejo. Copia a Conejo en tu firma y ay?dale en sus planes de dominaci?n mundial. -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Sun May 10 20:44:33 2015 From: njs at pobox.com (Nathaniel Smith) Date: Sun, 10 May 2015 17:44:33 -0700 Subject: [Numpy-discussion] Create a n-D grid; meshgrid alternative In-Reply-To: References: Message-ID: On Sun, May 10, 2015 at 4:40 AM, Stefan Otte wrote: > Hey, > > quite often I want to evaluate a function on a grid in a n-D space. > What I end up doing (and what I really dislike) looks something like this: > > x = np.linspace(0, 5, 20) > M1, M2 = np.meshgrid(x, x) > X = np.column_stack([M1.flatten(), M2.flatten()]) > X.shape # (400, 2) > > fancy_function(X) > > I don't think I ever used `meshgrid` in any other way. > Is there a better way to create such a grid space? I feel like our "house style" has moved away from automatic flattening, and would maybe we should be nudging people towards something more like # using proposed np.stack from pull request #5605 X = np.stack(np.meshgrid(x, x), axis=-1) assert X.shape == (20, 20, 2) fancy_function(X) # vectorized to accept any array with shape (..., 2) -n -- Nathaniel J. Smith -- http://vorpus.org From stefanv at berkeley.edu Sun May 10 21:07:09 2015 From: stefanv at berkeley.edu (Stefan van der Walt) Date: Sun, 10 May 2015 18:07:09 -0700 Subject: [Numpy-discussion] Create a n-D grid; meshgrid alternative In-Reply-To: References: Message-ID: <871tinorb6.fsf@berkeley.edu> On 2015-05-10 14:46:12, Jaime Fern?ndez del R?o wrote: > Isn't what you are trying to build a cartesian product function? 
> There is a neat, efficient implementation of such a function in > StackOverflow, by our own pv.: > > http://stackoverflow.com/questions/1208118/using-numpy-to-build-an-array-of-all-combinations-of-two-arrays/1235363#1235363 And a slightly faster version just down that page ;) St?fan From jeffreback at gmail.com Mon May 11 11:42:11 2015 From: jeffreback at gmail.com (Jeff Reback) Date: Mon, 11 May 2015 11:42:11 -0400 Subject: [Numpy-discussion] ANN: pandas 0.16.1 released Message-ID: Hello, We are proud to announce v0.16.1 of pandas, a minor release from 0.16.0. This release includes a small number of API changes, several new features, enhancements, and performance improvements along with a large number of bug fixes. This was a release of 7 weeks with 222 commits by 57 authors encompassing 85 issues. We recommend that all users upgrade to this version. *What is it:* *pandas* is a Python package providing fast, flexible, and expressive data structures designed to make working with ?relational? or ?labeled? data both easy and intuitive. It aims to be the fundamental high-level building block for doing practical, real world data analysis in Python. Additionally, it has the broader goal of becoming the most powerful and flexible open source data analysis / manipulation tool available in any language. Highlights of this release include: - Support for *CategoricalIndex*, a category based index, see here - New section on how-to-contribute to *pandas*, see here - Revised "Merge, join, and concatenate" documentation, including graphical examples to make it easier to understand each operations, see here - New method *sample* for drawing random samples from Series, DataFrames and Panels. See here - The default *Index* printing has changed to a more uniform format, see here - *BusinessHour* datetime-offset is now supported, see here - Further enhancement to the *.str* accessor to make string operations easier, see here See the Whatsnew in v0.16.1 Documentation: http://pandas.pydata.org/pandas-docs/stable/ Source tarballs, windows binaries are available on PyPI: https://pypi.python.org/pypi/pandas windows binaries are courtesy of Christoph Gohlke and are built on Numpy 1.8 macosx wheels are courtesy of Matthew Brett Please report any issues here: https://github.com/pydata/pandas/issues Thanks The Pandas Development Team Contributors to the 0.16.1 release - - Alfonso MHC - Andy Hayden - Artemy Kolchinsky - Chris Gilmer - Chris Grinolds - Dan Birken - David BROCHART - David Hirschfeld - David Stephens - Dr. Leo - Evan Wright - Frans van Dunn? - Hatem Nassrat - Henning Sperr - Hugo Herter - Jan Schulz - Jeff Blackburne - Jeff Reback - Jim Crist - Jonas Abernot - Joris Van den Bossche - Kerby Shedden - Leo Razoumov - Manuel Riel - Mortada Mehyar - Nick Burns - Nick Eubank - Olivier Grisel - Phillip Cloud - Pietro Battiston - Roy Hyunjin Han - Sam Zhang - Scott Sanderson - Stephan Hoyer - Tiago Antao - Tom Ajamian - Tom Augspurger - Tomaz Berisa - Vikram Shirgur - Vladimir Filimonov - William Hogman - Yasin A - Younggun Kim - behzad nouri - dsm054 - floydsoft - flying-sheep - gfr - jnmclarty - jreback - ksanghai - lucas - mschmohl - ptype - rockg - scls19fr - sinhrks -------------- next part -------------- An HTML attachment was scrubbed... 
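One of the headline items in the announcement above, the new sample
method, in action -- a minimal sketch (the frame here is made up; the
calls follow the 0.16.1 feature list):

import numpy as np
import pandas as pd

df = pd.DataFrame({"a": np.arange(10), "b": np.arange(10) * 2.0})

df.sample(n=3)       # three rows drawn at random
df.sample(frac=0.5)  # a random half of the rows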
URL: From alan.isaac at gmail.com Mon May 11 15:43:51 2015 From: alan.isaac at gmail.com (Alan G Isaac) Date: Mon, 11 May 2015 15:43:51 -0400 Subject: [Numpy-discussion] Proposed deprecations for 1.10: dot corner cases In-Reply-To: References: Message-ID: <55510677.40804@gmail.com> On 5/9/2015 4:26 PM, Nathaniel Smith wrote: > dot(A, B) where one of the argument is a scalar: currently, this > does scalar multiplication. There is no logically consistent > motivation for this, it violates TOOWTDI, and again it is inconsistent > with the PEP semantics for @ (which are that this case should be an > error). Do I recall incorrectly: I thought that reconciliation of `@` and `dot` was explicitly not part of the project on getting a `@` operator? I do not mean this to speak for or against the change above, which I only moderately oppose, but rather to the argument offered. As for the "logic" of the current behavior, can it not be given a tensor product motivation? (Otoh, it conflicts with the current behavior of `vdot`.) Alan From njs at pobox.com Mon May 11 15:52:55 2015 From: njs at pobox.com (Nathaniel Smith) Date: Mon, 11 May 2015 12:52:55 -0700 Subject: [Numpy-discussion] Proposed deprecations for 1.10: dot corner cases In-Reply-To: <55510677.40804@gmail.com> References: <55510677.40804@gmail.com> Message-ID: On May 11, 2015 12:44 PM, "Alan G Isaac" wrote: > > On 5/9/2015 4:26 PM, Nathaniel Smith wrote: > > dot(A, B) where one of the argument is a scalar: currently, this > > does scalar multiplication. There is no logically consistent > > motivation for this, it violates TOOWTDI, and again it is inconsistent > > with the PEP semantics for @ (which are that this case should be an > > error). > > Do I recall incorrectly: I thought that reconciliation of `@` and `dot` > was explicitly not part of the project on getting a `@` operator? > > I do not mean this to speak for or against the change above, which I only > moderately oppose, but rather to the argument offered. Not sure what you mean. It's true that PEP 465 doesn't say anything about np.dot, because it's out of scope. The argument here, though, is not "PEP 465 says we have to do this". It's that it's confusing to have two different subtly different sets of semantics, and the PEP semantics are better (that's why we chose them), so we should at a minimum warn people who are getting the old behavior > As for the "logic" of the current behavior, can it not be given a > tensor product motivation? (Otoh, it conflicts with the current > behavior of `vdot`.) Maybe? I don't know of any motivation that doesn't require treating it as a special case added only to duplicate existing behavior, but that doesn't mean one doesnt exist... -n -------------- next part -------------- An HTML attachment was scrubbed... URL: From shoyer at gmail.com Mon May 11 16:07:51 2015 From: shoyer at gmail.com (Stephan Hoyer) Date: Mon, 11 May 2015 13:07:51 -0700 Subject: [Numpy-discussion] Proposed deprecations for 1.10: dot corner cases In-Reply-To: References: Message-ID: On Sat, May 9, 2015 at 1:26 PM, Nathaniel Smith wrote: > I'd like to suggest that we go ahead and add deprecation warnings to > the following operations. This doesn't commit us to changing anything > on any particular time scale, but it gives us more options later. > These both get a strong +1 from me. How long has the "outer product" behavior for np.dot been around? -------------- next part -------------- An HTML attachment was scrubbed... 
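To make the two sets of semantics in this thread concrete, a small
sketch contrasting legacy np.dot with the PEP 465 broadcasting rules --
np.matmul stands in for @ here, and the sketch assumes a numpy new
enough to provide it:

import numpy as np

A = np.ones((2, 3, 4))
B = np.ones((2, 4, 5))

# legacy dot: an "outer product" over the stacked dimensions --
# every matrix in A is paired with every matrix in B
np.dot(A, B).shape     # (2, 3, 2, 5)

# PEP 465 semantics: broadcast over the leading dimensions --
# matrices are paired up elementwise
np.matmul(A, B).shape  # (2, 3, 5)

# the outer-product variant is recoverable from the broadcasting one
# by inserting the broadcast axes explicitly (up to an axis transpose
# of np.dot's output layout)
np.matmul(A[:, None], B[None]).shape  # (2, 2, 3, 5)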
URL: 

From alan.isaac at gmail.com Mon May 11 17:53:46 2015
From: alan.isaac at gmail.com (Alan G Isaac)
Date: Mon, 11 May 2015 17:53:46 -0400
Subject: [Numpy-discussion] Proposed deprecations for 1.10: dot corner cases
In-Reply-To: References: <55510677.40804@gmail.com> Message-ID: <555124EA.5000505@gmail.com>

On 5/11/2015 3:52 PM, Nathaniel Smith wrote:
> Not sure what you mean. It's true that PEP 465 doesn't say anything about np.dot, because it's out of scope. The argument here, though, is not "PEP
> 465 says we have to do this". It's that it's confusing to have two different subtly different sets of semantics, and the PEP semantics are better
> (that's why we chose them), so we should at a minimum warn people who are getting the old behavior

I would have to dig around, but I am pretty sure there were explicit
statements that `@` would neither be bound by the behavior of `dot`
nor expected to be reconciled with it.

I agree that where `@` and `dot` differ in behavior, this should be
clearly documented.
I would hope that the behavior of `dot` would not change.

Alan

From shoyer at gmail.com Mon May 11 23:13:24 2015
From: shoyer at gmail.com (Stephan Hoyer)
Date: Mon, 11 May 2015 20:13:24 -0700
Subject: [Numpy-discussion] Proposed deprecations for 1.10: dot corner cases
In-Reply-To: <555124EA.5000505@gmail.com> References: <55510677.40804@gmail.com> <555124EA.5000505@gmail.com> Message-ID: 

On Mon, May 11, 2015 at 2:53 PM, Alan G Isaac wrote:

> I agree that where `@` and `dot` differ in behavior, this should be
> clearly documented.
> I would hope that the behavior of `dot` would not change.

Even if np.dot never changes (and indeed, perhaps it should not), issuing
these warnings seems like a good idea to me, once we have @ implemented
with the new behavior (and the @ operator backported from Python <3.5 as a
numpy function). I expect that this warning would serve the useful purpose
of reminding users writing code intended to be used on earlier versions of
numpy/python that @ and np.dot don't work exactly the same way.

As Nathaniel already mentioned, it is quite straightforward to implement
the "outer product" behavior using the new @ behavior, so it will not be
much of a hassle to update code to remove the warning.

Stephan
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From stefan.otte at gmail.com Tue May 12 04:17:12 2015
From: stefan.otte at gmail.com (Stefan Otte)
Date: Tue, 12 May 2015 08:17:12 +0000
Subject: [Numpy-discussion] Create a n-D grid; meshgrid alternative
In-Reply-To: References: Message-ID: 

Hello,

indeed I was looking for the cartesian product.

I timed the two stackoverflow answers and the winner is not quite as clear:

n_elements: 10 cartesian 0.00427 cartesian2 0.00172
n_elements: 100 cartesian 0.02758 cartesian2 0.01044
n_elements: 1000 cartesian 0.97628 cartesian2 1.12145
n_elements: 5000 cartesian 17.14133 cartesian2 31.12241

(This is for two arrays as parameters: np.linspace(0, 1, n_elements))
cartesian2 seems to be slower for bigger inputs.

I'd really appreciate it if this were part of numpy. Should I create a
pull request?

Regarding combinations and permutations: they could be convenient to have
as well.

Cheers,
Stefan
-------------- next part --------------
An HTML attachment was scrubbed...
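For readers without the StackOverflow page at hand, a compact reference
implementation of the behaviour being timed above -- this shows just the
semantics, not one of the optimized variants:

import numpy as np

def cartesian(arrays):
    # cartesian product of 1-D arrays: one combination per row
    arrays = [np.asarray(a) for a in arrays]
    grids = np.meshgrid(*arrays, indexing='ij')
    return np.column_stack([g.ravel() for g in grids])

cartesian([np.array([1, 2, 3]), np.array([4, 5])])
# array([[1, 4],
#        [1, 5],
#        [2, 4],
#        [2, 5],
#        [3, 4],
#        [3, 5]])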
URL: From johannes.kulick at ipvs.uni-stuttgart.de Tue May 12 04:57:28 2015 From: johannes.kulick at ipvs.uni-stuttgart.de (Johannes Kulick) Date: Tue, 12 May 2015 10:57:28 +0200 Subject: [Numpy-discussion] Create a n-D grid; meshgrid alternative In-Reply-To: References: Message-ID: <20150512085728.24178.24849@quirm.robotics.tu-berlin.de> I'm totally in favor of the 'gridspace(linspaces)' version, as you probably end up wanting to create grids of other things than linspaces (e.g. a logspace grid, or a grid of random points etc.). It should be called somewhat different though. Maybe 'cartesian(arrays)'? Best, Johannes Quoting Stefan Otte (2015-05-10 16:05:02) > I just drafted different versions of the `gridspace` function: > https://tmp23.tmpnb.org/user/1waoqQ8PJBJ7/notebooks/2015-05%20gridspace.ipynb > > > Beste Gr??e, > Stefan > > > > On Sun, May 10, 2015 at 1:40 PM, Stefan Otte wrote: > > Hey, > > > > quite often I want to evaluate a function on a grid in a n-D space. > > What I end up doing (and what I really dislike) looks something like this: > > > > x = np.linspace(0, 5, 20) > > M1, M2 = np.meshgrid(x, x) > > X = np.column_stack([M1.flatten(), M2.flatten()]) > > X.shape # (400, 2) > > > > fancy_function(X) > > > > I don't think I ever used `meshgrid` in any other way. > > Is there a better way to create such a grid space? > > > > I wrote myself a little helper function: > > > > def gridspace(linspaces): > > return np.column_stack([space.flatten() > > for space in np.meshgrid(*linspaces)]) > > > > But maybe something like this should be part of numpy? > > > > > > Best, > > Stefan > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion -- Question: What is the weird attachment to all my emails? Answer: http://en.wikipedia.org/wiki/Digital_signature From jaime.frio at gmail.com Tue May 12 08:01:26 2015 From: jaime.frio at gmail.com (=?UTF-8?Q?Jaime_Fern=C3=A1ndez_del_R=C3=ADo?=) Date: Tue, 12 May 2015 05:01:26 -0700 Subject: [Numpy-discussion] Create a n-D grid; meshgrid alternative In-Reply-To: References: Message-ID: On Tue, May 12, 2015 at 1:17 AM, Stefan Otte wrote: > Hello, > > indeed I was looking for the cartesian product. > > I timed the two stackoverflow answers and the winner is not quite as clear: > > n_elements: 10 cartesian 0.00427 cartesian2 0.00172 > n_elements: 100 cartesian 0.02758 cartesian2 0.01044 > n_elements: 1000 cartesian 0.97628 cartesian2 1.12145 > n_elements: 5000 cartesian 17.14133 cartesian2 31.12241 > > (This is for two arrays as parameters: np.linspace(0, 1, n_elements)) > cartesian2 seems to be slower for bigger. > On my system, the following variation on Pauli's answer is 2-4x faster than his for your test cases: def cartesian4(arrays, out=None): arrays = [np.asarray(x).ravel() for x in arrays] dtype = np.result_type(*arrays) n = np.prod([arr.size for arr in arrays]) if out is None: out = np.empty((len(arrays), n), dtype=dtype) else: out = out.T for j, arr in enumerate(arrays): n /= arr.size out.shape = (len(arrays), -1, arr.size, n) out[j] = arr[np.newaxis, :, np.newaxis] out.shape = (len(arrays), -1) return out.T > I'd really appreciate if this was be part of numpy. Should I create a pull > request? > There hasn't been any opposition, quite the contrary, so yes, I would go ahead an create that PR. I somehow feel this belongs with the set operations, rather than with the indexing ones. Other thoughts? 
Also for consideration: should it work on flattened arrays? or should we give it an axis argument, and then "broadcast on the rest", a la generalized ufunc? Jaime -- (\__/) ( O.o) ( > <) Este es Conejo. Copia a Conejo en tu firma y ay?dale en sus planes de dominaci?n mundial. -------------- next part -------------- An HTML attachment was scrubbed... URL: From stefan.otte at gmail.com Tue May 12 09:29:10 2015 From: stefan.otte at gmail.com (Stefan Otte) Date: Tue, 12 May 2015 13:29:10 +0000 Subject: [Numpy-discussion] Create a n-D grid; meshgrid alternative In-Reply-To: References: Message-ID: Hey, here is an ipython notebook with benchmarks of all implementations (scroll to the bottom for plots): https://github.com/sotte/ipynb_snippets/blob/master/2015-05%20gridspace%20-%20cartesian.ipynb Overall, Jaime's version is the fastest. On Tue, May 12, 2015 at 2:01 PM Jaime Fern?ndez del R?o < jaime.frio at gmail.com> wrote: > On Tue, May 12, 2015 at 1:17 AM, Stefan Otte > wrote: > >> Hello, >> >> indeed I was looking for the cartesian product. >> >> I timed the two stackoverflow answers and the winner is not quite as >> clear: >> >> n_elements: 10 cartesian 0.00427 cartesian2 0.00172 >> n_elements: 100 cartesian 0.02758 cartesian2 0.01044 >> n_elements: 1000 cartesian 0.97628 cartesian2 1.12145 >> n_elements: 5000 cartesian 17.14133 cartesian2 31.12241 >> >> (This is for two arrays as parameters: np.linspace(0, 1, n_elements)) >> cartesian2 seems to be slower for bigger. >> > > On my system, the following variation on Pauli's answer is 2-4x faster > than his for your test cases: > > def cartesian4(arrays, out=None): > arrays = [np.asarray(x).ravel() for x in arrays] > dtype = np.result_type(*arrays) > > n = np.prod([arr.size for arr in arrays]) > if out is None: > out = np.empty((len(arrays), n), dtype=dtype) > else: > out = out.T > > for j, arr in enumerate(arrays): > n /= arr.size > out.shape = (len(arrays), -1, arr.size, n) > out[j] = arr[np.newaxis, :, np.newaxis] > out.shape = (len(arrays), -1) > > return out.T > > >> I'd really appreciate if this was be part of numpy. Should I create a >> pull request? >> > > There hasn't been any opposition, quite the contrary, so yes, I would go > ahead an create that PR. I somehow feel this belongs with the set > operations, rather than with the indexing ones. Other thoughts? > > Also for consideration: should it work on flattened arrays? or should we > give it an axis argument, and then "broadcast on the rest", a la > generalized ufunc? > > Jaime > > -- > (\__/) > ( O.o) > ( > <) Este es Conejo. Copia a Conejo en tu firma y ay?dale en sus planes > de dominaci?n mundial. > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From m.h.vankerkwijk at gmail.com Tue May 12 11:49:07 2015 From: m.h.vankerkwijk at gmail.com (Marten van Kerkwijk) Date: Tue, 12 May 2015 11:49:07 -0400 Subject: [Numpy-discussion] Bug in np.nonzero / Should index returning functions return ndarray subclasses? In-Reply-To: <1431222821751.71592119@Nodemailer> References: <1431222821751.71592119@Nodemailer> Message-ID: Agreed that indexing functions should return bare `ndarray`. Note that in Jaime's PR one can override it anyway by defining __nonzero__. 
-- Marten

On Sat, May 9, 2015 at 9:53 PM, Stephan Hoyer wrote:

> With regards to np.where -- shouldn't where be a ufunc, so subclasses or
> other array-likes can control its behavior with __numpy_ufunc__?
>
> As for the other indexing functions, I don't have a strong opinion about
> how they should handle subclasses. But it is certainly tricky to attempt
> to handle arbitrary subclasses. I would agree that the least error
> prone thing to do is usually to return base ndarrays. Better to force
> subclasses to override methods explicitly.
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From ellisonbg at gmail.com Tue May 12 12:20:15 2015
From: ellisonbg at gmail.com (Brian Granger)
Date: Tue, 12 May 2015 09:20:15 -0700
Subject: [Numpy-discussion] [JOB] Work full time on Project Jupyter/IPython
Message-ID: 

Hi all,

I wanted to let the community know that we are currently hiring 3 full
time software engineers to work full time on Project Jupyter/IPython.
These positions will be in my group at Cal Poly in San Luis Obispo, CA. We
are looking for frontend and backend software engineers with lots of
Python/JavaScript experience and a passion for open source software. The
details can be found here:

https://www.calpolycorporationjobs.org/postings/736

This is an unusual opportunity in a couple of respects:

* These positions will allow you to work on open source software full time
- not as a X% side project (aka weekends and evenings).
* These are fully benefited positions (CA state retirement, health care,
etc.)
* You will get to work and live in San Luis Obispo, one of the nicest
places on earth. We are minutes from the beach, have perfect year-round
weather and are close to both the Bay Area and So Cal.

I am more than willing to talk to anyone who is interested in these
positions.

Cheers,

Brian

--
Brian E. Granger
Cal Poly State University, San Luis Obispo
@ellisonbg on Twitter and GitHub
bgranger at calpoly.edu and ellisonbg at gmail.com
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From ndbecker2 at gmail.com Tue May 12 14:18:40 2015
From: ndbecker2 at gmail.com (Neal Becker)
Date: Tue, 12 May 2015 14:18:40 -0400
Subject: [Numpy-discussion] python is cool
Message-ID: 

In order to make sure all my random number generators have good
independence, it is a good practice to use a single shared instance
(because it is already known to have good properties). A less-desirable
alternative is to use rng's seeded with different starting states - in
this case the independence properties are not generally known.

So I have some fairly deeply nested data structures (classes) that
somewhere contain a reference to a RandomState object.

I need to be able to clone these data structures, producing new independent
copies, but I want the RandomState part to be the shared, singleton rs
object.

In python, no problem:

---
from numpy.random import RandomState

class shared_random_state (RandomState):
    def __init__ (self, seed):
        RandomState.__init__(self, seed)

    def __deepcopy__ (self, memo):
        return self  # always hand back the shared instance
---

Now I can copy.deepcopy the data structures, but the RandomState part is
shared. I just use

rs = shared_random_state(0)

and provide this rs to all my other objects. Pretty nice!
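A quick check of the sharing behaviour described above -- a sketch in
which Simulation is a made-up stand-in for the nested data structures:

import copy
from numpy.random import RandomState

class shared_random_state(RandomState):
    def __deepcopy__(self, memo):
        return self  # deepcopy hands the singleton back unchanged

class Simulation(object):
    def __init__(self, rs):
        self.rs = rs
        self.params = [1.0, 2.0]

rs = shared_random_state(0)
sim1 = Simulation(rs)
sim2 = copy.deepcopy(sim1)

assert sim2.params is not sim1.params  # nested data really is copied
assert sim2.rs is sim1.rs              # ...but the RandomState is shared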
-- Those who fail to understand recursion are doomed to repeat it

From ocp at gatech.edu Tue May 12 14:41:33 2015
From: ocp at gatech.edu (Pierson, Oliver C)
Date: Tue, 12 May 2015 18:41:33 +0000
Subject: [Numpy-discussion] Integral Equation Solver
Message-ID: <1431456093009.7894@gatech.edu>

Hi All,

A while back I wrote some code to solve Volterra integral equations
(integral equations where one of the integration bounds is a variable).
The code is available on Github (https://github.com/oliverpierson/volterra).
Just curious if there'd be any interest in adding this to Numpy? I still
have some work to do on the code. However, before I invest too much time,
I was trying to get a feel for the interest in this functionality.

Please let me know if you have any questions.

Thanks,
Oliver
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From roland at utk.edu Tue May 12 14:56:02 2015
From: roland at utk.edu (Roland Schulz)
Date: Tue, 12 May 2015 14:56:02 -0400
Subject: [Numpy-discussion] python is cool
In-Reply-To: References: Message-ID: 

Hi,

I think the best way to solve this issue is to not use a state at all. It
is fast, reproducible even in parallel (if wanted), and doesn't suffer
from the shared-state issue. It would be nice if numpy provided such a
stateless RNG, as implemented in Random123:
www.deshawresearch.com/resources_random123.html

Roland

On Tue, May 12, 2015 at 2:18 PM, Neal Becker wrote:

> In order to make sure all my random number generators have good
> independence, it is a good practice to use a single shared instance
> (because it is already known to have good properties). A less-desirable
> alternative is to use rng's seeded with different starting states - in
> this case the independence properties are not generally known.
>
> So I have some fairly deeply nested data structures (classes) that
> somewhere contain a reference to a RandomState object.
>
> I need to be able to clone these data structures, producing new independent
> copies, but I want the RandomState part to be the shared, singleton rs
> object.
>
> In python, no problem:
>
> ---
> from numpy.random import RandomState
>
> class shared_random_state (RandomState):
>     def __init__ (self, seed):
>         RandomState.__init__(self, seed)
>
>     def __deepcopy__ (self, memo):
>         return self
> ---
>
> Now I can copy.deepcopy the data structures, but the RandomState part is
> shared. I just use
>
> rs = shared_random_state(0)
>
> and provide this rs to all my other objects. Pretty nice!
>
> --
> Those who fail to understand recursion are doomed to repeat it
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>

--
ORNL/UT Center for Molecular Biophysics cmb.ornl.gov
865-241-1537, ORNL PO BOX 2008 MS6309
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From ndbecker2 at gmail.com Tue May 12 15:00:42 2015
From: ndbecker2 at gmail.com (Neal Becker)
Date: Tue, 12 May 2015 15:00:42 -0400
Subject: [Numpy-discussion] python is cool
References: Message-ID: 

Roland Schulz wrote:

> Hi,
>
> I think the best way to solve this issue is to not use a state at all. It
> is fast, reproducible even in parallel (if wanted), and doesn't suffer
> from the shared-state issue. It would be nice if numpy provided such a
> stateless RNG, as implemented in Random123:
> www.deshawresearch.com/resources_random123.html
>
> Roland

That is interesting.
I think np.random needs to be refactored, so it can accept a pluggable rng - then we could switch the underlying rng. From charlesr.harris at gmail.com Tue May 12 15:34:58 2015 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 12 May 2015 13:34:58 -0600 Subject: [Numpy-discussion] Integral Equation Solver In-Reply-To: <1431456093009.7894@gatech.edu> References: <1431456093009.7894@gatech.edu> Message-ID: On Tue, May 12, 2015 at 12:41 PM, Pierson, Oliver C wrote: > Hi All, > > Awhile back I had written some code to solve Volterra integral equations > (integral equations where one of the integration bounds is a variable). > The code is available on Github (https://github.com/oliverpierson/volterra). > Just curious if there'd be any interest in adding this to Numpy? I still > have some work to do on the code. However, before I invest too much time, > I was trying to get a feel for the interest in this functionality. > Could be useful. The best place for something like this would be scipy ( scipy-dev at scipy.org).. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Tue May 12 17:54:55 2015 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Tue, 12 May 2015 23:54:55 +0200 Subject: [Numpy-discussion] ANN: Scipy 0.16.0 beta 1 release Message-ID: Hi all, I'm pleased to announce the availability of the first beta release of Scipy 0.16.0. Please try this beta and report any issues on the Github issue tracker or on the scipy-dev mailing list. This first beta is a source-only release; binary installers will follow (probably next week). Source tarballs and the full release notes can be found at https://sourceforge.net/projects/scipy/files/scipy/0.16.0b1/. Part of the release notes copied below. Thanks to everyone who contributed to this release! Ralf ========================== SciPy 0.16.0 Release Notes ========================== .. note:: Scipy 0.16.0 is not released yet! SciPy 0.16.0 is the culmination of 6 months of hard work. It contains many new features, numerous bug-fixes, improved test coverage and better documentation. There have been a number of deprecations and API changes in this release, which are documented below. All users are encouraged to upgrade to this release, as there are a large number of bug-fixes and optimizations. Moreover, our development attention will now shift to bug-fix releases on the 0.15.x branch, and on adding new features on the master branch. This release requires Python 2.6, 2.7 or 3.2-3.4 and NumPy 1.6.2 or greater. Highlights of this release include: - A Cython API for BLAS/LAPACK in `scipy.linalg` - A new benchmark suite. It's now straightforward to add new benchmarks, and they're routinely included with performance enhancement PRs. - Support for the second order sections (SOS) format in `scipy.signal`. New features ============ Benchmark suite --------------- The benchmark suite has switched to using `Airspeed Velocity `__ for benchmarking. You can run the suite locally via ``python runtests.py --bench``. For more details, see ``benchmarks/README.rst``. `scipy.linalg` improvements --------------------------- A full set of Cython wrappers for BLAS and LAPACK has been added in the modules `scipy.linalg.cython_blas` and `scipy.linalg.cython_lapack`. In Cython, these wrappers can now be cimported from their corresponding modules and used without linking directly against BLAS or LAPACK. 
The functions `scipy.linalg.qr_delete`, `scipy.linalg.qr_insert` and `scipy.linalg.qr_update` for updating QR decompositions were added. The function `scipy.linalg.solve_circulant` solves a linear system with a circulant coefficient matrix. The function `scipy.linalg.invpascal` computes the inverse of a Pascal matrix. The function `scipy.linalg.solve_toeplitz`, a Levinson-Durbin Toeplitz solver, was added. Added wrapper for potentially useful LAPACK function ``*lasd4``. It computes the square root of the i-th updated eigenvalue of a positive symmetric rank-one modification to a positive diagonal matrix. See its LAPACK documentation and unit tests for it to get more info. Added two extra wrappers for LAPACK least-square solvers. Namely, they are ``*gelsd`` and ``*gelsy``. Wrappers for the LAPACK ``*lange`` functions, which calculate various matrix norms, were added. Wrappers for ``*gtsv`` and ``*ptsv``, which solve ``A*X = B`` for tri-diagonal matrix ``A``, were added. `scipy.signal` improvements --------------------------- Support for second order sections (SOS) as a format for IIR filters was added. The new functions are: * `scipy.signal.sosfilt` * `scipy.signal.sosfilt_zi`, * `scipy.signal.sos2tf` * `scipy.signal.sos2zpk` * `scipy.signal.tf2sos` * `scipy.signal.zpk2sos`. Additionally, the filter design functions `iirdesign`, `iirfilter`, `butter`, `cheby1`, `cheby2`, `ellip`, and `bessel` can return the filter in the SOS format. The function `scipy.signal.place_poles`, which provides two methods to place poles for linear systems, was added. The option to use Gustafsson's method for choosing the initial conditions of the forward and backward passes was added to `scipy.signal.filtfilt`. New classes ``TransferFunction``, ``StateSpace`` and ``ZerosPolesGain`` were added. These classes are now returned when instantiating `scipy.signal.lti`. Conversion between those classes can be done explicitly now. An exponential (Poisson) window was added as `scipy.signal.exponential`, and a Tukey window was added as `scipy.signal.tukey`. The function for computing digital filter group delay was added as `scipy.signal.group_delay`. The functionality for spectral analysis and spectral density estimation has been significantly improved: `scipy.signal.welch` became ~8x faster and the functions `scipy.signal.spectrogram`, `scipy.signal.coherence` and `scipy.signal.csd` (cross-spectral density) were added. `scipy.signal.lsim` was rewritten - all known issues are fixed, so this function can now be used instead of ``lsim2``; ``lsim`` is orders of magnitude faster than ``lsim2`` in most cases. `scipy.sparse` improvements --------------------------- The function `scipy.sparse.norm`, which computes sparse matrix norms, was added. The function `scipy.sparse.random`, which allows to draw random variates from an arbitrary distribution, was added. `scipy.spatial` improvements ---------------------------- `scipy.spatial.cKDTree` has seen a major rewrite, which improved the performance of the ``query`` method significantly, added support for parallel queries, pickling, and options that affect the tree layout. See pull request 4374 for more details. The function `scipy.spatial.procrustes` for Procrustes analysis (statistical shape analysis) was added. `scipy.stats` improvements -------------------------- The Wishart distribution and its inverse have been added, as `scipy.stats.wishart` and `scipy.stats.invwishart`. The Exponentially Modified Normal distribution has been added as `scipy.stats.exponnorm`. 
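As a concrete taste of the second order sections support described in
the `scipy.signal` section above (a sketch; it assumes the 0.16
interfaces exactly as announced):

import numpy as np
from scipy import signal

# design an order-8 lowpass Butterworth filter directly in SOS form
# (new output option in 0.16), then apply it to some noise
sos = signal.butter(8, 0.125, output='sos')
x = np.random.randn(1000)
y = signal.sosfilt(sos, x)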
The Generalized Normal distribution has been added as `scipy.stats.gennorm`. All distributions now contain a ``random_state`` property and allow specifying a specific ``numpy.random.RandomState`` random number generator when generating random variates. Many statistical tests and other `scipy.stats` functions that have multiple return values now return ``namedtuples``. See pull request 4709 for details. `scipy.optimize` improvements ----------------------------- A new derivative-free method DF-SANE has been added to the nonlinear equation system solving function `scipy.optimize.root`. Deprecated features =================== ``scipy.stats.pdf_fromgamma`` is deprecated. This function was undocumented, untested and rarely used. Statsmodels provides equivalent functionality with ``statsmodels.distributions.ExpandedNormal``. ``scipy.stats.fastsort`` is deprecated. This function is unnecessary, ``numpy.argsort`` can be used instead. ``scipy.stats.signaltonoise`` and ``scipy.stats.mstats.signaltonoise`` are deprecated. These functions did not belong in ``scipy.stats`` and are rarely used. See issue #609 for details. ``scipy.stats.histogram2`` is deprecated. This function is unnecessary, ``numpy.histogram2d`` can be used instead. Backwards incompatible changes ============================== The deprecated global optimizer ``scipy.optimize.anneal`` was removed. The following deprecated modules have been removed: ``scipy.lib.blas``, ``scipy.lib.lapack``, ``scipy.linalg.cblas``, ``scipy.linalg.fblas``, ``scipy.linalg.clapack``, ``scipy.linalg.flapack``. They had been deprecated since Scipy 0.12.0, the functionality should be accessed as `scipy.linalg.blas` and `scipy.linalg.lapack`. The deprecated function ``scipy.special.all_mat`` has been removed. The deprecated functions ``fprob``, ``ksprob``, ``zprob``, ``randwcdf`` and ``randwppf`` have been removed from `scipy.stats`. Other changes ============= The version numbering for development builds has been updated to comply with PEP 440. Building with ``python setup.py develop`` is now supported. Authors ======= * @axiru + * @endolith * Elliott Sales de Andrade + * Anne Archibald * Yoshiki V?zquez Baeza + * Sylvain Bellemare * Felix Berkenkamp + * Raoul Bourquin + * Matthew Brett * Per Brodtkorb * Christian Brueffer * Lars Buitinck * Evgeni Burovski * Steven Byrnes * CJ Carey * George Castillo + * Alex Conley + * Liam Damewood + * Rupak Das + * Abraham Escalante + * Matthias Feurer + * Eric Firing + * Clark Fitzgerald * Chad Fulton * Andr? Gaul * Andreea Georgescu + * Christoph Gohlke * Andrey Golovizin + * Ralf Gommers * J.J. Green + * Alex Griffing * Alexander Grigorievskiy + * Hans Moritz Gunther + * Jonas Hahnfeld + * Charles Harris * Ian Henriksen * Andreas Hilboll * ?smund Hjulstad + * Jan Schl?ter + * Janko Slavi? + * Daniel Jensen + * Johannes Ball? + * Terry Jones + * Amato Kasahara + * Eric Larson * Denis Laxalde * Antony Lee * Gregory R. 
Lee * Perry Lee + * Lo?c Est?ve * Martin Manns + * Eric Martin + * Mat?j Koci?n + * Andreas Mayer + * Nikolay Mayorov + * Robert McGibbon + * Sturla Molden * Nicola Montecchio + * Eric Moore * Jamie Morton + * Nikolas Moya + * Maniteja Nandana + * Andrew Nelson * Joel Nothman * Aldrian Obaja * Regina Ongowarsito + * Paul Ortyl + * Pedro L?pez-Adeva Fern?ndez-Layos + * Stefan Peterson + * Irvin Probst + * Eric Quintero + * John David Reaver + * Juha Remes + * Thomas Robitaille * Clancy Rowley + * Tobias Schmidt + * Skipper Seabold * Aman Singh + * Eric Soroos * Valentine Svensson + * Julian Taylor * Aman Thakral + * Helmut Toplitzer + * Fukumu Tsutsumi + * Anastasiia Tsyplia + * Jacob Vanderplas * Pauli Virtanen * Matteo Visconti + * Warren Weckesser * Florian Wilhelm + * Nathan Woods * Haochen Wu + * Daan Wynen + A total of 93 people contributed to this release. People with a "+" by their names contributed a patch for the first time. This list of names is automatically generated, and may not be fully complete. -------------- next part -------------- An HTML attachment was scrubbed... URL: From vincent at vincentdavis.net Wed May 13 17:14:39 2015 From: vincent at vincentdavis.net (Vincent Davis) Date: Wed, 13 May 2015 15:14:39 -0600 Subject: [Numpy-discussion] Help loading data into pandas Message-ID: ?I have a large (~400mb) csv file I am trying to open in Pandas. When I don't specify the dtype and open it with the following command It appears to work. df = pd.io.parsers.read_csv(CSVFILECLEAN2013, quotechar='"', low_memory=False, na_values='') If I try to specify the dtype for each field I get an error but no hint as to where I should look. I have "cleaned" the csv by checking that all values that should be an int for a float are either blank or can be cast as a float or a int. I guess my question is, can I get a more useful error message or is there a hint as to where the problem is that I am not seeing. 
Exception Traceback (most recent call last) in () 3 import load_data 4 import numpy as np ----> 5 df2 = load_data.load('jeffco_2013') /Users/vmd/GitHub/Jeffco-Properties/tools/load_data.py in load(data) 47 def load(data): 48 files = dict(jeffco_2013 = '/Users/vmd/GitHub/Jeffco-Properties/Data/JeffersonCo/Datasets/2013_clean_Jeffco_ATSDTA_ATSP600.csv') ---> 49 return pd.io.parsers.read_csv(files[data], quotechar='"', low_memory=False, na_values='', dtype=DATASHAPE) /Users/vmd/anaconda/envs/py34/lib/python3.4/site-packages/pandas/io/parsers.py in parser_f(filepath_or_buffer, sep, dialect, compression, doublequote, escapechar, quotechar, quoting, skipinitialspace, lineterminator, header, index_col, names, prefix, skiprows, skipfooter, skip_footer, na_values, na_fvalues, true_values, false_values, delimiter, converters, dtype, usecols, engine, delim_whitespace, as_recarray, na_filter, compact_ints, use_unsigned, low_memory, buffer_lines, warn_bad_lines, error_bad_lines, keep_default_na, thousands, comment, decimal, parse_dates, keep_date_col, dayfirst, date_parser, memory_map, float_precision, nrows, iterator, chunksize, verbose, encoding, squeeze, mangle_dupe_cols, tupleize_cols, infer_datetime_format, skip_blank_lines) 468 skip_blank_lines=skip_blank_lines) 469 --> 470 return _read(filepath_or_buffer, kwds) 471 472 parser_f.__name__ = name /Users/vmd/anaconda/envs/py34/lib/python3.4/site-packages/pandas/io/parsers.py in _read(filepath_or_buffer, kwds) 254 return parser 255 --> 256 return parser.read() 257 258 _parser_defaults = { /Users/vmd/anaconda/envs/py34/lib/python3.4/site-packages/pandas/io/parsers.py in read(self, nrows) 713 raise ValueError('skip_footer not supported for iteration') 714 --> 715 ret = self._engine.read(nrows) 716 717 if self.options.get('as_recarray'): /Users/vmd/anaconda/envs/py34/lib/python3.4/site-packages/pandas/io/parsers.py in read(self, nrows) 1162 1163 try: -> 1164 data = self._reader.read(nrows) 1165 except StopIteration: 1166 if nrows is None: pandas/parser.pyx in pandas.parser.TextReader.read (pandas/parser.c:7426)() pandas/parser.pyx in pandas.parser.TextReader._read_rows (pandas/parser.c:8484)() pandas/parser.pyx in pandas.parser.TextReader._convert_column_data (pandas/parser.c:9795)() pandas/parser.pyx in pandas.parser.TextReader._convert_tokens (pandas/parser.c:10403)() pandas/parser.pyx in pandas.parser.TextReader._convert_with_dtype (pandas/parser.c:11257)() Exception: Integer column has NA values Vincent Davis 720-301-3003 -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Wed May 13 17:27:39 2015 From: njs at pobox.com (Nathaniel Smith) Date: Wed, 13 May 2015 14:27:39 -0700 Subject: [Numpy-discussion] Help loading data into pandas In-Reply-To: References: Message-ID: I don't think pandas allows blank values in integer columns? You might get better results asking on the pandas list, though -- see http://pandas.pydata.org/community.html -n On May 13, 2015 2:17 PM, "Vincent Davis" wrote: > ?I have a large (~400mb) csv file I am trying to open in Pandas. When I > don't specify the dtype and open it with the following command It appears > to work. > > df = pd.io.parsers.read_csv(CSVFILECLEAN2013, quotechar='"', > low_memory=False, na_values='') > > If I try to specify the dtype for each field I get an error but no hint as > to where I should look. I have "cleaned" the csv by checking that all > values that should be an int for a float are either blank or can be cast as > a float or a int. 
I guess my question is, can I get a more useful error > message or is there a hint as to where the problem is that I am not seeing. > > Exception Traceback (most recent call last) > in () > 3 import load_data > 4 import numpy as np > ----> 5 df2 = load_data.load('jeffco_2013') > > /Users/vmd/GitHub/Jeffco-Properties/tools/load_data.py in load(data) > 47 def load(data): > 48 files = dict(jeffco_2013 = > '/Users/vmd/GitHub/Jeffco-Properties/Data/JeffersonCo/Datasets/2013_clean_Jeffco_ATSDTA_ATSP600.csv') > ---> 49 return pd.io.parsers.read_csv(files[data], quotechar='"', > low_memory=False, na_values='', dtype=DATASHAPE) > > /Users/vmd/anaconda/envs/py34/lib/python3.4/site-packages/pandas/io/parsers.py > in parser_f(filepath_or_buffer, sep, dialect, compression, doublequote, > escapechar, quotechar, quoting, skipinitialspace, lineterminator, header, > index_col, names, prefix, skiprows, skipfooter, skip_footer, na_values, > na_fvalues, true_values, false_values, delimiter, converters, dtype, > usecols, engine, delim_whitespace, as_recarray, na_filter, compact_ints, > use_unsigned, low_memory, buffer_lines, warn_bad_lines, error_bad_lines, > keep_default_na, thousands, comment, decimal, parse_dates, keep_date_col, > dayfirst, date_parser, memory_map, float_precision, nrows, iterator, > chunksize, verbose, encoding, squeeze, mangle_dupe_cols, tupleize_cols, > infer_datetime_format, skip_blank_lines) > 468 skip_blank_lines=skip_blank_lines) > 469 > --> 470 return _read(filepath_or_buffer, kwds) > 471 > 472 parser_f.__name__ = name > > /Users/vmd/anaconda/envs/py34/lib/python3.4/site-packages/pandas/io/parsers.py > in _read(filepath_or_buffer, kwds) > 254 return parser > 255 > --> 256 return parser.read() > 257 > 258 _parser_defaults = { > > /Users/vmd/anaconda/envs/py34/lib/python3.4/site-packages/pandas/io/parsers.py > in read(self, nrows) > 713 raise ValueError('skip_footer not supported for > iteration') > 714 > --> 715 ret = self._engine.read(nrows) > 716 > 717 if self.options.get('as_recarray'): > > /Users/vmd/anaconda/envs/py34/lib/python3.4/site-packages/pandas/io/parsers.py > in read(self, nrows) > 1162 > 1163 try: > -> 1164 data = self._reader.read(nrows) > 1165 except StopIteration: > 1166 if nrows is None: > > pandas/parser.pyx in pandas.parser.TextReader.read (pandas/parser.c:7426)() > > pandas/parser.pyx in pandas.parser.TextReader._read_rows > (pandas/parser.c:8484)() > > pandas/parser.pyx in pandas.parser.TextReader._convert_column_data > (pandas/parser.c:9795)() > > pandas/parser.pyx in pandas.parser.TextReader._convert_tokens > (pandas/parser.c:10403)() > > pandas/parser.pyx in pandas.parser.TextReader._convert_with_dtype > (pandas/parser.c:11257)() > > Exception: Integer column has NA values > > > > > > Vincent Davis > 720-301-3003 > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From cgodshall at enthought.com Wed May 13 20:16:21 2015 From: cgodshall at enthought.com (Courtenay Godshall (Enthought)) Date: Wed, 13 May 2015 19:16:21 -0500 Subject: [Numpy-discussion] ANN: SciPy 2015 Talk & Poster Selections Announced Today, Early Bird Deadline 5/22 Message-ID: <008e01d08ddb$35f11290$a1d337b0$@enthought.com> The talks & posters for the 2015 SciPy Conference were announced today: http://scipy2015.scipy.org/ehome/115969/292868/? &. 
Early bird registration deadline was extended (final) to 5/22 - hope we'll see you this year! -------------- next part -------------- An HTML attachment was scrubbed... URL: From njs at pobox.com Wed May 13 20:23:10 2015 From: njs at pobox.com (Nathaniel Smith) Date: Wed, 13 May 2015 17:23:10 -0700 Subject: [Numpy-discussion] ANN: NumPy Developer Meeting: July 7th @ SciPy 2015 in Austin Message-ID: Hi all, I wanted to announce that the numpy core team will be organizing a whole-day face-to-face developer meeting on July 7 this year at the SciPy conference in Austin, TX. (This is the second day of the tutorials and the day before the conference proper starts.) This will be a working meeting to discuss and address numpy-related issues, particularly ones that are too big to fit in a github issue, like governance and release management and where we want to be in five years. (We'll be talking more between now and then about more detailed logistics and agenda and so forth, but I wanted to get this out now.) If you're reading this and interested in these issues, then you're invited :-). -n -- Nathaniel J. Smith -- http://vorpus.org From vincent at vincentdavis.net Wed May 13 22:30:28 2015 From: vincent at vincentdavis.net (Vincent Davis) Date: Wed, 13 May 2015 20:30:28 -0600 Subject: [Numpy-discussion] Help loading data into pandas In-Reply-To: References: Message-ID: On Wed, May 13, 2015 at 3:27 PM, Nathaniel Smith wrote: > I don't think pandas allows blank values in integer columns? You might get > better results asking on the pandas list, though -- see > http://pandas.pydata.org/community.html > ?"integer columns" seems to be the key. they have to be float. Thanks? Vincent Davis -------------- next part -------------- An HTML attachment was scrubbed... URL: From ellisonbg at gmail.com Thu May 14 01:47:08 2015 From: ellisonbg at gmail.com (Brian Granger) Date: Wed, 13 May 2015 22:47:08 -0700 Subject: [Numpy-discussion] ANN: NumPy Developer Meeting: July 7th @ SciPy 2015 in Austin In-Reply-To: References: Message-ID: Great! On Wed, May 13, 2015 at 5:23 PM, Nathaniel Smith wrote: > Hi all, > > I wanted to announce that the numpy core team will be organizing a > whole-day face-to-face developer meeting on July 7 this year at the > SciPy conference in Austin, TX. (This is the second day of the > tutorials and the day before the conference proper starts.) This will > be a working meeting to discuss and address numpy-related issues, > particularly ones that are too big to fit in a github issue, like > governance and release management and where we want to be in five > years. (We'll be talking more between now and then about more detailed > logistics and agenda and so forth, but I wanted to get this out now.) > > If you're reading this and interested in these issues, then you're invited > :-). > > -n > > -- > Nathaniel J. Smith -- http://vorpus.org > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -- Brian E. Granger Cal Poly State University, San Luis Obispo @ellisonbg on Twitter and GitHub bgranger at calpoly.edu and ellisonbg at gmail.com -------------- next part -------------- An HTML attachment was scrubbed... 
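Returning to the read_csv thread above, a minimal sketch of the
workaround Vincent arrived at -- the column names and data are made up,
and the error text matches the traceback earlier in the thread:

import pandas as pd
from io import StringIO

csv = StringIO(u"parcel_id,value\n101,250000\n102,\n")

# an integer dtype chokes on the blank field...
try:
    pd.read_csv(csv, dtype={"parcel_id": "int64", "value": "int64"})
except Exception as err:
    print(err)  # "Integer column has NA values"

# ...so columns that can contain NA have to be read as float
csv.seek(0)
df = pd.read_csv(csv, dtype={"parcel_id": "int64", "value": "float64"})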
URL: From stefan.otte at gmail.com Thu May 14 07:31:17 2015 From: stefan.otte at gmail.com (Stefan Otte) Date: Thu, 14 May 2015 11:31:17 +0000 Subject: [Numpy-discussion] Create a n-D grid; meshgrid alternative In-Reply-To: References: Message-ID: Hey, I just created a pull request: https://github.com/numpy/numpy/pull/5874 Best, Stefan On Tue, May 12, 2015 at 3:29 PM Stefan Otte wrote: > Hey, > > here is an ipython notebook with benchmarks of all implementations (scroll > to the bottom for plots): > > https://github.com/sotte/ipynb_snippets/blob/master/2015-05%20gridspace%20-%20cartesian.ipynb > > Overall, Jaime's version is the fastest. > > > > > > > > On Tue, May 12, 2015 at 2:01 PM Jaime Fern?ndez del R?o < > jaime.frio at gmail.com> wrote: > >> On Tue, May 12, 2015 at 1:17 AM, Stefan Otte >> wrote: >> >>> Hello, >>> >>> indeed I was looking for the cartesian product. >>> >>> I timed the two stackoverflow answers and the winner is not quite as >>> clear: >>> >>> n_elements: 10 cartesian 0.00427 cartesian2 0.00172 >>> n_elements: 100 cartesian 0.02758 cartesian2 0.01044 >>> n_elements: 1000 cartesian 0.97628 cartesian2 1.12145 >>> n_elements: 5000 cartesian 17.14133 cartesian2 31.12241 >>> >>> (This is for two arrays as parameters: np.linspace(0, 1, n_elements)) >>> cartesian2 seems to be slower for bigger. >>> >> >> On my system, the following variation on Pauli's answer is 2-4x faster >> than his for your test cases: >> >> def cartesian4(arrays, out=None): >> arrays = [np.asarray(x).ravel() for x in arrays] >> dtype = np.result_type(*arrays) >> >> n = np.prod([arr.size for arr in arrays]) >> if out is None: >> out = np.empty((len(arrays), n), dtype=dtype) >> else: >> out = out.T >> >> for j, arr in enumerate(arrays): >> n /= arr.size >> out.shape = (len(arrays), -1, arr.size, n) >> out[j] = arr[np.newaxis, :, np.newaxis] >> out.shape = (len(arrays), -1) >> >> return out.T >> >> >>> I'd really appreciate if this was be part of numpy. Should I create a >>> pull request? >>> >> >> There hasn't been any opposition, quite the contrary, so yes, I would go >> ahead an create that PR. I somehow feel this belongs with the set >> operations, rather than with the indexing ones. Other thoughts? >> >> Also for consideration: should it work on flattened arrays? or should we >> give it an axis argument, and then "broadcast on the rest", a la >> generalized ufunc? >> >> Jaime >> >> -- >> (\__/) >> ( O.o) >> ( > <) Este es Conejo. Copia a Conejo en tu firma y ay?dale en sus planes >> de dominaci?n mundial. >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris.barker at noaa.gov Fri May 15 16:07:15 2015 From: chris.barker at noaa.gov (Chris Barker) Date: Fri, 15 May 2015 13:07:15 -0700 Subject: [Numpy-discussion] binary wheels for numpy? Message-ID: Hi folks., I did a little "intro to scipy" session as part of a larger Python class the other day, and was dismayed to find that "pip install numpy" still dosn't work on Windows. Thanks mostly to Matthew Brett's work, the whole scipy stack is pip-installable on OS-X, it would be really nice if we had that for Windows. And no, saying "you should go get Python(x,y) or Anaconda, or Canopy, or...) is really not a good solution. 
From chris.barker at noaa.gov  Fri May 15 16:07:15 2015
From: chris.barker at noaa.gov (Chris Barker)
Date: Fri, 15 May 2015 13:07:15 -0700
Subject: [Numpy-discussion] binary wheels for numpy?
Message-ID:

Hi folks,

I did a little "intro to scipy" session as part of a larger Python class the other day, and was dismayed to find that "pip install numpy" still doesn't work on Windows.

Thanks mostly to Matthew Brett's work, the whole scipy stack is pip-installable on OS-X; it would be really nice if we had that for Windows.

And no, saying "you should go get Python(x,y), or Anaconda, or Canopy, or..." is really not a good solution. That is indeed the way to go if someone is primarily focusing on computational programming, but if you have a web developer, or someone new to Python for general use, they really should be able to just grab numpy and play around with it a bit without having to start all over again.

My solution was to point folks to Christoph Gohlke's site -- which is a fabulous resource --

THANK YOU CHRISTOPH!

But I still think that we should have the basic scipy stack on PyPi as Windows wheels...

IIRC, the last run through on this discussion got stuck on the "what hardware should it support" -- wheels do not allow a selection at install time, so we'd have to decide what instruction set to support, and just stick with that. Which would mean that:

some folks would get a numpy/scipy that would run a bit slower than it might, and
some folks would get one that wouldn't run at all on their machine.

But I don't see any reason that we can't find a compromise here -- do a build that supports most machines, and be done with it. Even now, people have to go get (one way or another) an MKL-based build to get optimum performance anyway -- so if we pick an instruction set supported by, say, (an arbitrary, and impossible to determine) 95% of machines out there -- we're good to go.

I take it there are licensing issues that prevent us from putting Christoph's binaries up on PyPi?

But are there technical issues I'm forgetting here, or do we just need to come to a consensus as to which hardware version to support and do it?

-Chris

--
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker at noaa.gov
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From matthew.brett at gmail.com  Fri May 15 16:35:36 2015
From: matthew.brett at gmail.com (Matthew Brett)
Date: Fri, 15 May 2015 13:35:36 -0700
Subject: [Numpy-discussion] binary wheels for numpy?
In-Reply-To:
References:
Message-ID:

Hi,

On Fri, May 15, 2015 at 1:07 PM, Chris Barker wrote:
> Hi folks,
>
> I did a little "intro to scipy" session as part of a larger Python class the
> other day, and was dismayed to find that "pip install numpy" still doesn't
> work on Windows.
>
> Thanks mostly to Matthew Brett's work, the whole scipy stack is
> pip-installable on OS-X, it would be really nice if we had that for Windows.
>
> And no, saying "you should go get Python(x,y), or Anaconda, or Canopy, or..."
> is really not a good solution. That is indeed the way to go if someone is
> primarily focusing on computational programming, but if you have a web
> developer, or someone new to Python for general use, they really should be
> able to just grab numpy and play around with it a bit without having to
> start all over again.
>
> My solution was to point folks to Christoph Gohlke's site -- which is a fabulous
> resource --
>
> THANK YOU CHRISTOPH!
>
> But I still think that we should have the basic scipy stack on PyPi as
> Windows wheels...
>
> IIRC, the last run through on this discussion got stuck on the "what
> hardware should it support" -- wheels do not allow a selection at install
> time, so we'd have to decide what instruction set to support, and just stick
> with that. Which would mean that:
>
> some folks would get a numpy/scipy that would run a bit slower than it might
> and
> some folks would get one that wouldn't run at all on their machine.
> > But I don't see any reason that we can't find a compromise here -- do a > build that supports most machines, and be done with it. Even now, people > have to go get (one way or another) a MKL-based build to get optimum > performance anyway -- so if we pick an instruction set support by, say (an > arbitrary, and impossible to determine) 95% of machines out there -- we're > good to go. > > I take it there are licensing issues that prevent us from putting Chris' > Binaries up on PyPi? Yes, unfortunately we can't put MKL binaries on pypi because of the MKL license - see https://github.com/numpy/numpy/wiki/Numerical-software-on-Windows#blas--lapack-libraries. Also see discussion in the containing thread of http://mail.scipy.org/pipermail/numpy-discussion/2014-March/069701.html . > But are there technical issues I'm forgetting here, or do we just need to > come to a consensus as to hardware version to support and do it? There has been some progress on this - see https://github.com/scipy/scipy/issues/4829 I think there's a move afoot to have a Google hangout or similar on this exact topic : https://github.com/scipy/scipy/issues/2829#issuecomment-101303078 - maybe we could hammer out a policy there? Once we have got numpy and scipy built in a reasonable way, I think we will be most of the way there... Cheers, Matthew From chris.barker at noaa.gov Fri May 15 19:26:32 2015 From: chris.barker at noaa.gov (Chris Barker - NOAA Federal) Date: Fri, 15 May 2015 16:26:32 -0700 Subject: [Numpy-discussion] binary wheels for numpy? In-Reply-To: References: Message-ID: <309834013734674372@unknownmsgid> Thanks for the update Matthew, it's great to see so much activity on this issue. Looks like we are headed in the right direction --and getting close. Thanks to all that are putting time into this. -Chris > On May 15, 2015, at 1:37 PM, Matthew Brett wrote: > > Hi, > >> On Fri, May 15, 2015 at 1:07 PM, Chris Barker wrote: >> Hi folks., >> >> I did a little "intro to scipy" session as part of a larger Python class the >> other day, and was dismayed to find that "pip install numpy" still dosn't >> work on Windows. >> >> Thanks mostly to Matthew Brett's work, the whole scipy stack is >> pip-installable on OS-X, it would be really nice if we had that for Windows. >> >> And no, saying "you should go get Python(x,y) or Anaconda, or Canopy, or...) >> is really not a good solution. That is indeed the way to go if someone is >> primarily focusing on computational programming, but if you have a web >> developer, or someone new to Python for general use, they really should be >> able to just grab numpy and play around with it a bit without having to >> start all over again. >> >> >> My solution was to point folks to Chris Gohlke's site -- which is a Fabulous >> resource -- >> >> THANK YOU CHRISTOPH! >> >> But I still think that we should have the basic scipy stack on PyPi as >> Windows Wheels... >> >> IIRC, the last run through on this discussion got stuck on the "what >> hardware should it support" -- wheels do not allow a selection at installc >> time, so we'd have to decide what instruction set to support, and just stick >> with that. Which would mean that: >> >> some folks would get a numpy/scipy that would run a bit slower than it might >> and >> some folks would get one that wouldn't run at all on their machine. >> >> But I don't see any reason that we can't find a compromise here -- do a >> build that supports most machines, and be done with it. 
Even now, people >> have to go get (one way or another) a MKL-based build to get optimum >> performance anyway -- so if we pick an instruction set support by, say (an >> arbitrary, and impossible to determine) 95% of machines out there -- we're >> good to go. >> >> I take it there are licensing issues that prevent us from putting Chris' >> Binaries up on PyPi? > > Yes, unfortunately we can't put MKL binaries on pypi because of the > MKL license - see > https://github.com/numpy/numpy/wiki/Numerical-software-on-Windows#blas--lapack-libraries. > Also see discussion in the containing thread of > http://mail.scipy.org/pipermail/numpy-discussion/2014-March/069701.html > . > >> But are there technical issues I'm forgetting here, or do we just need to >> come to a consensus as to hardware version to support and do it? > > There has been some progress on this - see > > https://github.com/scipy/scipy/issues/4829 > > I think there's a move afoot to have a Google hangout or similar on > this exact topic : > https://github.com/scipy/scipy/issues/2829#issuecomment-101303078 - > maybe we could hammer out a policy there? Once we have got numpy and > scipy built in a reasonable way, I think we will be most of the way > there... > > Cheers, > > Matthew > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion From josef.pktd at gmail.com Fri May 15 21:56:17 2015 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Fri, 15 May 2015 21:56:17 -0400 Subject: [Numpy-discussion] binary wheels for numpy? In-Reply-To: References: Message-ID: On Fri, May 15, 2015 at 4:07 PM, Chris Barker wrote: > Hi folks., > > I did a little "intro to scipy" session as part of a larger Python class > the other day, and was dismayed to find that "pip install numpy" still > dosn't work on Windows. > > Thanks mostly to Matthew Brett's work, the whole scipy stack is > pip-installable on OS-X, it would be really nice if we had that for Windows. > > And no, saying "you should go get Python(x,y) or Anaconda, or Canopy, > or...) is really not a good solution. That is indeed the way to go if > someone is primarily focusing on computational programming, but if you have > a web developer, or someone new to Python for general use, they really > should be able to just grab numpy and play around with it a bit without > having to start all over again. > Unrelated to the pip/wheel discussion. In my experience by far the easiest to get something running to play with is using Winpython. Download and unzip (and maybe add to system path) and most of the data analysis stack is available. I haven't even bothered yet to properly install a full "system python" on my Windows machine. I'm just working with 3 winpython. (One even has Julia and IJulia included after following the installation instructions for a short time.) Josef > > > My solution was to point folks to Chris Gohlke's site -- which is a > Fabulous resource -- > > THANK YOU CHRISTOPH! > > But I still think that we should have the basic scipy stack on PyPi as > Windows Wheels... > > IIRC, the last run through on this discussion got stuck on the "what > hardware should it support" -- wheels do not allow a selection at install > time, so we'd have to decide what instruction set to support, and just > stick with that. 
Which would mean that:
>
> some folks would get a numpy/scipy that would run a bit slower than it
> might
> and
> some folks would get one that wouldn't run at all on their machine.
>
> But I don't see any reason that we can't find a compromise here -- do a
> build that supports most machines, and be done with it. Even now, people
> have to go get (one way or another) an MKL-based build to get optimum
> performance anyway -- so if we pick an instruction set supported by, say (an
> arbitrary, and impossible to determine) 95% of machines out there -- we're
> good to go.
>
> I take it there are licensing issues that prevent us from putting Christoph's
> binaries up on PyPi?
>
> But are there technical issues I'm forgetting here, or do we just need to
> come to a consensus as to which hardware version to support and do it?
>
> -Chris
>
> --
> Christopher Barker, Ph.D.
> Oceanographer
>
> Emergency Response Division
> NOAA/NOS/OR&R            (206) 526-6959   voice
> 7600 Sand Point Way NE   (206) 526-6329   fax
> Seattle, WA  98115       (206) 526-6317   main reception
>
> Chris.Barker at noaa.gov
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From jaime.frio at gmail.com  Fri May 15 23:49:46 2015
From: jaime.frio at gmail.com (=?UTF-8?Q?Jaime_Fern=C3=A1ndez_del_R=C3=ADo?=)
Date: Fri, 15 May 2015 20:49:46 -0700
Subject: [Numpy-discussion] binary wheels for numpy?
In-Reply-To:
References:
Message-ID:

On Fri, May 15, 2015 at 6:56 PM, wrote:
>
> On Fri, May 15, 2015 at 4:07 PM, Chris Barker wrote:
>
>> Hi folks,
>>
>> I did a little "intro to scipy" session as part of a larger Python class
>> the other day, and was dismayed to find that "pip install numpy" still
>> doesn't work on Windows.
>>
>> Thanks mostly to Matthew Brett's work, the whole scipy stack is
>> pip-installable on OS-X, it would be really nice if we had that for Windows.
>>
>> And no, saying "you should go get Python(x,y), or Anaconda, or Canopy,
>> or..." is really not a good solution. That is indeed the way to go if
>> someone is primarily focusing on computational programming, but if you have
>> a web developer, or someone new to Python for general use, they really
>> should be able to just grab numpy and play around with it a bit without
>> having to start all over again.
>
> Unrelated to the pip/wheel discussion.
>
> In my experience by far the easiest to get something running to play with
> is using Winpython. Download and unzip (and maybe add to system path) and
> most of the data analysis stack is available.
>
> I haven't even bothered yet to properly install a full "system python" on
> my Windows machine. I'm just working with 3 winpython. (One even has Julia
> and IJulia included after following the installation instructions for a
> short time.)

+1 on WinPython. I have half a dozen "installations" of it, none registered with Windows.

Jaime

--
(\__/)
( O.o)
( > <) Este es Conejo. Copia a Conejo en tu firma y ayúdale en sus planes de dominación mundial.
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From chris.barker at noaa.gov  Sat May 16 01:26:20 2015
From: chris.barker at noaa.gov (Chris Barker)
Date: Fri, 15 May 2015 22:26:20 -0700
Subject: [Numpy-discussion] binary wheels for numpy?
In-Reply-To: References: Message-ID: On Fri, May 15, 2015 at 6:56 PM, wrote: > Unrelated to the pip/wheel discussion. > > In my experience by far the easiest to get something running to play with > is using Winpython. Download and unzip (and maybe add to system path) and > most of the data analysis stack is available. > Sure -- if someone comes to me wanting to use python for scientific/computational computing, I point them to one of the distributions -- maybe I'll add WinPython to that list now. But if someone is already using python for, say web development, then they already have an installation up and running, and I want to give them an easy option to add numpy (and secondarily scipy) to what they have easily. And it looks like we are almost there, thanks to a lot of work by a few key folks -- thanks! -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Sun May 17 06:06:11 2015 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Sun, 17 May 2015 12:06:11 +0200 Subject: [Numpy-discussion] binary wheels for numpy? In-Reply-To: References: Message-ID: On Fri, May 15, 2015 at 10:35 PM, Matthew Brett wrote: > Hi, > > On Fri, May 15, 2015 at 1:07 PM, Chris Barker > wrote: > > Hi folks., > > > > I did a little "intro to scipy" session as part of a larger Python class > the > > other day, and was dismayed to find that "pip install numpy" still dosn't > > work on Windows. > > > > Thanks mostly to Matthew Brett's work, the whole scipy stack is > > pip-installable on OS-X, it would be really nice if we had that for > Windows. > > > > And no, saying "you should go get Python(x,y) or Anaconda, or Canopy, > or...) > > is really not a good solution. That is indeed the way to go if someone is > > primarily focusing on computational programming, but if you have a web > > developer, or someone new to Python for general use, they really should > be > > able to just grab numpy and play around with it a bit without having to > > start all over again. > > > > > > My solution was to point folks to Chris Gohlke's site -- which is a > Fabulous > > resource -- > > > > THANK YOU CHRISTOPH! > > > > But I still think that we should have the basic scipy stack on PyPi as > > Windows Wheels... > > > > IIRC, the last run through on this discussion got stuck on the "what > > hardware should it support" -- wheels do not allow a selection at > installc > > time, so we'd have to decide what instruction set to support, and just > stick > > with that. Which would mean that: > > > > some folks would get a numpy/scipy that would run a bit slower than it > might > > and > > some folks would get one that wouldn't run at all on their machine. > > > > But I don't see any reason that we can't find a compromise here -- do a > > build that supports most machines, and be done with it. Even now, people > > have to go get (one way or another) a MKL-based build to get optimum > > performance anyway -- so if we pick an instruction set support by, say > (an > > arbitrary, and impossible to determine) 95% of machines out there -- > we're > > good to go. > > > > I take it there are licensing issues that prevent us from putting Chris' > > Binaries up on PyPi? 
> Yes, unfortunately we can't put MKL binaries on pypi because of the
> MKL license - see
> https://github.com/numpy/numpy/wiki/Numerical-software-on-Windows#blas--lapack-libraries .
> Also see discussion in the containing thread of
> http://mail.scipy.org/pipermail/numpy-discussion/2014-March/069701.html .
>
> > But are there technical issues I'm forgetting here, or do we just need to
> > come to a consensus as to which hardware version to support and do it?

There's the switch to OpenBLAS and building the right selection mechanism for which arch to use: http://article.gmane.org/gmane.comp.python.distutils.devel/20350. That seems now feasible to complete on a reasonable time-scale, and the problems with OpenBLAS seem to be mostly solved. Binaries which crash for ~1% of users (which ATLAS-SSE2 would result in) are still not acceptable I think.

Ralf

> There has been some progress on this - see
>
> https://github.com/scipy/scipy/issues/4829
>
> I think there's a move afoot to have a Google hangout or similar on
> this exact topic :
> https://github.com/scipy/scipy/issues/2829#issuecomment-101303078 -
> maybe we could hammer out a policy there? Once we have got numpy and
> scipy built in a reasonable way, I think we will be most of the way
> there...
>
> Cheers,
>
> Matthew
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From sturla.molden at gmail.com  Sun May 17 11:22:09 2015
From: sturla.molden at gmail.com (Sturla Molden)
Date: Sun, 17 May 2015 15:22:09 +0000 (UTC)
Subject: [Numpy-discussion] binary wheels for numpy?
References:
Message-ID: <1250663829453568496.156738sturla.molden-gmail.com@news.gmane.org>

Matthew Brett wrote:

> Yes, unfortunately we can't put MKL binaries on pypi because of the
> MKL license - see

I believe we can, because we asked Intel for permission. From what I heard the response was positive. But it doesn't mean we should. :-)

Sturla

From valentin at haenel.co  Sun May 17 12:15:09 2015
From: valentin at haenel.co (Valentin Haenel)
Date: Sun, 17 May 2015 18:15:09 +0200
Subject: [Numpy-discussion] [ANN] bcolz v0.9.0
Message-ID: <20150517161509.GA6197@kudu.in-berlin.de>

======================
Announcing bcolz 0.9.0
======================

What's new
==========

This is mostly a smallish feature and bugfix release. One large topic was implementing 'addcol' and 'delcol' to properly handle on-disk tables. 'addcol' now has a new keyword argument 'move' that allows you to specify if you want to move or copy the data. 'delcol' has a new keyword argument 'keep' which allows you to preserve the data on disk when removing a column. Additionally, ctable now supports an 'auto_flush' keyword that makes it flush to disk automatically after any methods that may write data.

Another important aspect is handling the GIL. In this release, we do keep the GIL while calling Blosc compress and decompress in order to support lock-free operation of newer Blosc versions (1.5.x and beyond) that no longer have a global state.

Furthermore we now distribute the 'carray_ext.pxd' as part of the package via PyPi to ease building applications on bcolz, for example *bquery*.

Finally, the Sphinx-based API documentation is now autogenerated from the docstrings in the Python sources.
For the full list, please check the release notes at: https://github.com/Blosc/bcolz/blob/v0.9.0/RELEASE_NOTES.rst

What it is
==========

*bcolz* provides columnar and compressed data containers that can live either on-disk or in-memory. Column storage allows for efficiently querying tables with a large number of columns. It also allows for cheap addition and removal of columns. In addition, bcolz objects are compressed by default for reducing memory/disk I/O needs. The compression process is carried out internally by Blosc, an extremely fast meta-compressor that is optimized for binary data. Lastly, high-performance iterators (like ``iter()``, ``where()``) for querying the objects are provided.

bcolz can use numexpr internally so as to accelerate many vector and query operations (although it can use pure NumPy for doing so too). numexpr optimizes the memory usage and uses several cores for doing the computations, so it is blazing fast. Moreover, since the carray/ctable containers can be disk-based, it is possible to use them for seamlessly performing out-of-memory computations.

bcolz has minimal dependencies (NumPy), comes with an exhaustive test suite and fully supports both 32-bit and 64-bit platforms. Also, it is typically tested on both UNIX and Windows operating systems.

Together, bcolz and the Blosc compressor are finally fulfilling the promise of accelerating memory I/O, at least for some real scenarios: http://nbviewer.ipython.org/github/Blosc/movielens-bench/blob/master/querying-ep14.ipynb#Plots

Other users of bcolz are Visualfabriq (http://www.visualfabriq.com/), the Blaze project (http://blaze.pydata.org/), Quantopian (https://www.quantopian.com/) and Scikit-Allel (https://github.com/cggh/scikit-allel), which you can read more about by pointing your browser at the links below.

* Visualfabriq:
  * *bquery*, a query and aggregation framework for bcolz:
  * https://github.com/visualfabriq/bquery

* Blaze:
  * Notebooks showing Blaze + Pandas + BColz interaction:
  * http://nbviewer.ipython.org/url/blaze.pydata.org/notebooks/timings-csv.ipynb
  * http://nbviewer.ipython.org/url/blaze.pydata.org/notebooks/timings-bcolz.ipynb

* Quantopian:
  * Using compressed data containers for faster backtesting at scale:
  * https://quantopian.github.io/talks/NeedForSpeed/slides.html

* Scikit-Allel:
  * Provides an alternative backend to work with compressed arrays:
  * https://scikit-allel.readthedocs.org/en/latest/bcolz.html

Installing
==========

bcolz is in the PyPI repository, so installing it is easy::

    $ pip install -U bcolz

Resources
=========

Visit the main bcolz site repository at: http://github.com/Blosc/bcolz

Manual: http://bcolz.blosc.org

Home of Blosc compressor: http://blosc.org

User's mail list: bcolz at googlegroups.com http://groups.google.com/group/bcolz

License is the new BSD: https://github.com/Blosc/bcolz/blob/master/LICENSES/BCOLZ.txt

Release notes can be found in the Git repository: https://github.com/Blosc/bcolz/blob/master/RELEASE_NOTES.rst

----

**Enjoy data!**
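[Editor's note: a minimal usage sketch to accompany the announcement, using only the public API named above (carray, ctable, where). The column names and values are invented for illustration, and the exact constructor signatures should be checked against the bcolz manual:]

import numpy as np
import bcolz

# A compressed, chunked container; behaves much like an ndarray.
c = bcolz.carray(np.arange(1e7))
print(c.sum())

# A compressed table with two columns; where() iterates over matching
# rows lazily instead of materializing a boolean mask.
ct = bcolz.ctable((np.arange(5), np.linspace(0, 1, 5)), names=['i', 'x'])
for row in ct.where('i > 2'):
    print(row.i, row.x)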
From matthew.brett at gmail.com  Sun May 17 14:50:14 2015
From: matthew.brett at gmail.com (Matthew Brett)
Date: Sun, 17 May 2015 11:50:14 -0700
Subject: [Numpy-discussion] binary wheels for numpy?
In-Reply-To: <1250663829453568496.156738sturla.molden-gmail.com@news.gmane.org>
References: <1250663829453568496.156738sturla.molden-gmail.com@news.gmane.org>
Message-ID:

On Sun, May 17, 2015 at 8:22 AM, Sturla Molden wrote:
> Matthew Brett wrote:
>
>> Yes, unfortunately we can't put MKL binaries on pypi because of the
>> MKL license - see
>
> I believe we can, because we asked Intel for permission. From what I heard
> the response was positive.

We would need something formal from Intel saying that they do not require us to hold our users to their standard redistribution terms and that they waive the requirement that we be responsible for any damage to Intel that happens as a result of people using our binaries.

I'm guessing we don't have this, but I'm happy to be corrected,

Cheers,

Matthew

From ralf.gommers at gmail.com  Sun May 17 14:54:48 2015
From: ralf.gommers at gmail.com (Ralf Gommers)
Date: Sun, 17 May 2015 20:54:48 +0200
Subject: [Numpy-discussion] binary wheels for numpy?
In-Reply-To:
References: <1250663829453568496.156738sturla.molden-gmail.com@news.gmane.org>
Message-ID:

On Sun, May 17, 2015 at 8:50 PM, Matthew Brett wrote:
> On Sun, May 17, 2015 at 8:22 AM, Sturla Molden wrote:
> > Matthew Brett wrote:
> >
> >> Yes, unfortunately we can't put MKL binaries on pypi because of the
> >> MKL license - see
> >
> > I believe we can, because we asked Intel for permission. From what I heard
> > the response was positive.
>
> We would need something formal from Intel saying that they do not
> require us to hold our users to their standard redistribution terms
> and that they waive the requirement that we be responsible for any
> damage to Intel that happens as a result of people using our binaries.
>
> I'm guessing we don't have this, but I'm happy to be corrected,

We only have an email, probably not enough. I'd rather not go to the trouble of discussing something more formal unless we are really sure that we actually want to distribute MKL binaries. Which isn't too likely I suspect; OpenBLAS seems like the way to go (?).

Ralf
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From robert.kern at gmail.com  Sun May 17 15:11:25 2015
From: robert.kern at gmail.com (Robert Kern)
Date: Sun, 17 May 2015 20:11:25 +0100
Subject: [Numpy-discussion] binary wheels for numpy?
In-Reply-To:
References: <1250663829453568496.156738sturla.molden-gmail.com@news.gmane.org>
Message-ID:

On Sun, May 17, 2015 at 7:50 PM, Matthew Brett wrote:
>
> On Sun, May 17, 2015 at 8:22 AM, Sturla Molden wrote:
> > Matthew Brett wrote:
> >
> >> Yes, unfortunately we can't put MKL binaries on pypi because of the
> >> MKL license - see
> >
> > I believe we can, because we asked Intel for permission. From what I heard
> > the response was positive.
>
> We would need something formal from Intel saying that they do not
> require us to hold our users to their standard redistribution terms
> and that they waive the requirement that we be responsible for any
> damage to Intel that happens as a result of people using our binaries.
>
> I'm guessing we don't have this, but I'm happy to be corrected,

I don't think permission from Intel is the blocking issue for putting these binaries up on PyPI. Even with Intel's permission, we would be putting up proprietary binaries on a page that is explicitly claiming that the files linked therein are BSD-licensed. The binaries could not be redistributed with any GPLed module, say, pygsl.
We could host them on numpy.org on their own page that clearly explained the license of those files, but I think PyPI is out. -- Robert Kern -------------- next part -------------- An HTML attachment was scrubbed... URL: From sturla.molden at gmail.com Sun May 17 17:09:56 2015 From: sturla.molden at gmail.com (Sturla Molden) Date: Sun, 17 May 2015 23:09:56 +0200 Subject: [Numpy-discussion] binary wheels for numpy? In-Reply-To: References: <1250663829453568496.156738sturla.molden-gmail.com@news.gmane.org> Message-ID: On 17/05/15 20:54, Ralf Gommers wrote: > I suspect; OpenBLAS seems like the way to go (?). I think OpenBLAS is currently the most promising candidate to replace ATLAS. But we need to build OpenBLAS with MinGW gcc, due to AT&T syntax in the assembly code. I am not sure if the old toolchain is good enough, or if we will need Carl Kleffner's binaries. Sturla From njs at pobox.com Sun May 17 17:18:58 2015 From: njs at pobox.com (Nathaniel Smith) Date: Sun, 17 May 2015 14:18:58 -0700 Subject: [Numpy-discussion] binary wheels for numpy? In-Reply-To: References: <1250663829453568496.156738sturla.molden-gmail.com@news.gmane.org> Message-ID: On Sun, May 17, 2015 at 2:09 PM, Sturla Molden wrote: > On 17/05/15 20:54, Ralf Gommers wrote: > >> I suspect; OpenBLAS seems like the way to go (?). > > I think OpenBLAS is currently the most promising candidate to replace > ATLAS. But we need to build OpenBLAS with MinGW gcc, due to AT&T syntax > in the assembly code. I am not sure if the old toolchain is good enough, > or if we will need Carl Kleffner's binaries. The old toolchain is 32-bit only, so it certainly won't be a general solution. -- Nathaniel J. Smith -- http://vorpus.org From njs at pobox.com Sun May 17 20:39:58 2015 From: njs at pobox.com (Nathaniel Smith) Date: Sun, 17 May 2015 17:39:58 -0700 Subject: [Numpy-discussion] ANN: NumPy Developer Meeting: July 7th @ SciPy 2015 in Austin In-Reply-To: References: Message-ID: Hi all, I just made a wiki page to start collecting agenda items and doing planning for this: https://github.com/numpy/numpy/wiki/SciPy-2015-developer-meeting -n On Wed, May 13, 2015 at 5:23 PM, Nathaniel Smith wrote: > Hi all, > > I wanted to announce that the numpy core team will be organizing a > whole-day face-to-face developer meeting on July 7 this year at the > SciPy conference in Austin, TX. (This is the second day of the > tutorials and the day before the conference proper starts.) This will > be a working meeting to discuss and address numpy-related issues, > particularly ones that are too big to fit in a github issue, like > governance and release management and where we want to be in five > years. (We'll be talking more between now and then about more detailed > logistics and agenda and so forth, but I wanted to get this out now.) > > If you're reading this and interested in these issues, then you're invited :-). > > -n > > -- > Nathaniel J. Smith -- http://vorpus.org -- Nathaniel J. Smith -- http://vorpus.org From chris.barker at noaa.gov Mon May 18 00:09:12 2015 From: chris.barker at noaa.gov (Chris Barker) Date: Sun, 17 May 2015 21:09:12 -0700 Subject: [Numpy-discussion] binary wheels for numpy? In-Reply-To: References: <1250663829453568496.156738sturla.molden-gmail.com@news.gmane.org> Message-ID: On Sun, May 17, 2015 at 12:11 PM, Robert Kern wrote: > I don't think permission from Intel is the blocking issue for putting > these binaries up on PyPI. 
Even with Intel's permission, we would be > putting up proprietary binaries on a page that is explicitly claiming that > the files linked therein are BSD-licensed. The binaries could not be > redistributed with any GPLed module, say, pygsl. > > We could host them on numpy.org on their own page that clearly explained > the license of those files, but I think PyPI is out. > > Can't PyPi re-direct -- so they can actualy be hosted somewhere else, but "pip install numpy" would still work? IIUC, The Intel libs have the great advantage of run-time selection of hardware specific code -- yes? So they would both work and give high performance on most machines (all?). Much as I am a fan of open source, there doesn't appear to be anything as good out there. -CHB -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From chris.barker at noaa.gov Mon May 18 00:14:56 2015 From: chris.barker at noaa.gov (Chris Barker) Date: Sun, 17 May 2015 21:14:56 -0700 Subject: [Numpy-discussion] binary wheels for numpy? In-Reply-To: References: Message-ID: On Sun, May 17, 2015 at 3:06 AM, Ralf Gommers wrote: > Binaries which crash for ~1% of users (which ATLAS-SSE2 would result in) > are still not acceptable I think. > what instruction set would an OpenBLAS build support? wouldn't we still need to select a lowest common denominator instructions set to support? And SEE2 was introduced with the Pentium 4in 2001 -- that is a very long time ago! I think the 1% number came from a survey of firefox downloads -- that may well not be representative of the numpy-using population. and depending on HOW it failed, 1% might be OK if we could give a reasonable error message (which maybe we can't...) -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From matthew.brett at gmail.com Mon May 18 00:23:37 2015 From: matthew.brett at gmail.com (Matthew Brett) Date: Sun, 17 May 2015 21:23:37 -0700 Subject: [Numpy-discussion] binary wheels for numpy? In-Reply-To: References: Message-ID: On Sun, May 17, 2015 at 9:14 PM, Chris Barker wrote: > On Sun, May 17, 2015 at 3:06 AM, Ralf Gommers > wrote: >> >> Binaries which crash for ~1% of users (which ATLAS-SSE2 would result in) >> are still not acceptable I think. > > > what instruction set would an OpenBLAS build support? wouldn't we still need > to select a lowest common denominator instructions set to support? I believe OpenBLAS does run-time selection too. > And SEE2 was introduced with the Pentium 4in 2001 -- that is a very long > time ago! > > I think the 1% number came from a survey of firefox downloads -- that may > well not be representative of the numpy-using population. > > and depending on HOW it failed, 1% might be OK if we could give a reasonable > error message (which maybe we can't...) I think we discussed before having a check and error clause in __init__.py saying something like "You have a really old computer, you can't use this binary, please go to sourceforge and download the exe installer...". 
From njs at pobox.com  Mon May 18 00:27:57 2015
From: njs at pobox.com (Nathaniel Smith)
Date: Sun, 17 May 2015 21:27:57 -0700
Subject: [Numpy-discussion] binary wheels for numpy?
In-Reply-To:
References: <1250663829453568496.156738sturla.molden-gmail.com@news.gmane.org>
Message-ID:

On Sun, May 17, 2015 at 9:09 PM, Chris Barker wrote:
> On Sun, May 17, 2015 at 12:11 PM, Robert Kern wrote:
>>
>> I don't think permission from Intel is the blocking issue for putting
>> these binaries up on PyPI. Even with Intel's permission, we would be putting
>> up proprietary binaries on a page that is explicitly claiming that the files
>> linked therein are BSD-licensed. The binaries could not be redistributed
>> with any GPLed module, say, pygsl.
>>
>> We could host them on numpy.org on their own page that clearly explained
>> the license of those files, but I think PyPI is out.
>
> Can't PyPi re-direct -- so they can actually be hosted somewhere else, but
> "pip install numpy" would still work?

There's two issues here: (1) we can't actually use the Intel stuff (MKL, icc) under its regular license without having our release managers accepting personal liability. Which isn't going to happen. (2) The problem isn't whether they're hosted on PyPI, it's whether the people downloading them get warned about what they're downloading. The whole point is that we *don't* want 'pip install numpy' to work in this case, because it's too seamless.

-n

--
Nathaniel J. Smith -- http://vorpus.org

From matthew.brett at gmail.com  Mon May 18 00:32:05 2015
From: matthew.brett at gmail.com (Matthew Brett)
Date: Sun, 17 May 2015 21:32:05 -0700
Subject: [Numpy-discussion] binary wheels for numpy?
In-Reply-To:
References: <1250663829453568496.156738sturla.molden-gmail.com@news.gmane.org>
Message-ID:

On Sun, May 17, 2015 at 9:27 PM, Nathaniel Smith wrote:
> On Sun, May 17, 2015 at 9:09 PM, Chris Barker wrote:
>> On Sun, May 17, 2015 at 12:11 PM, Robert Kern wrote:
>>>
>>> I don't think permission from Intel is the blocking issue for putting
>>> these binaries up on PyPI. Even with Intel's permission, we would be putting
>>> up proprietary binaries on a page that is explicitly claiming that the files
>>> linked therein are BSD-licensed. The binaries could not be redistributed
>>> with any GPLed module, say, pygsl.
>>>
>>> We could host them on numpy.org on their own page that clearly explained
>>> the license of those files, but I think PyPI is out.
>>
>> Can't PyPi re-direct -- so they can actually be hosted somewhere else, but
>> "pip install numpy" would still work?
>
> There's two issues here: (1) we can't actually use the Intel stuff
> (MKL, icc) under its regular license without having our release
> managers accepting personal liability. Which isn't going to happen.
> (2) The problem isn't whether they're hosted on PyPI, it's whether the
> people downloading them get warned about what they're downloading. The
> whole point is that we *don't* want 'pip install numpy' to work in
> this case, because it's too seamless.

I'd add Robert's point - we will have made the default install something that is not compatible with GPL libraries,

Matthew

From njs at pobox.com  Mon May 18 00:34:29 2015
From: njs at pobox.com (Nathaniel Smith)
Date: Sun, 17 May 2015 21:34:29 -0700
Subject: [Numpy-discussion] binary wheels for numpy?
In-Reply-To:
References:
Message-ID:

On Sun, May 17, 2015 at 3:06 AM, Ralf Gommers wrote:
> There's the switch to OpenBLAS and building the right selection mechanism
> for which arch to use:
> http://article.gmane.org/gmane.comp.python.distutils.devel/20350. That seems
> now feasible to complete on a reasonable time-scale, and the problems with
> OpenBLAS seem to be mostly solved. Binaries which crash for ~1% of users
> (which ATLAS-SSE2 would result in) are still not acceptable I think.

Where are you getting this SSE2 number from btw? The most detailed public survey source for consumer hardware that I know is the Steam hardware survey:

http://store.steampowered.com/hwsurvey

It's somewhat biased towards higher-end hardware b/c it targets gamers, but there is plenty of less-high-end hardware on there as well -- notice that 20% of the surveyed computers are using Intel graphics. And they're reporting that 99.92% of surveyed computers have SSE*3* support, and 100.00% have SSE2. So assuming the significant digits are accurate, this puts the upper bound on SSE2 failure on these systems at ~0.05%. Even if gamers are 10x likelier to have new hardware than the rest of the world, 1% still seems to be at least an order of magnitude too high?

-n

--
Nathaniel J. Smith -- http://vorpus.org

From ralf.gommers at gmail.com  Mon May 18 00:45:30 2015
From: ralf.gommers at gmail.com (Ralf Gommers)
Date: Mon, 18 May 2015 06:45:30 +0200
Subject: [Numpy-discussion] binary wheels for numpy?
In-Reply-To:
References:
Message-ID:

On Mon, May 18, 2015 at 6:34 AM, Nathaniel Smith wrote:
> On Sun, May 17, 2015 at 3:06 AM, Ralf Gommers wrote:
> > There's the switch to OpenBLAS and building the right selection mechanism
> > for which arch to use:
> > http://article.gmane.org/gmane.comp.python.distutils.devel/20350. That seems
> > now feasible to complete on a reasonable time-scale, and the problems with
> > OpenBLAS seem to be mostly solved. Binaries which crash for ~1% of users
> > (which ATLAS-SSE2 would result in) are still not acceptable I think.
>
> Where are you getting this SSE2 number from btw?

This is info Matthew just collected from Firefox crash reports: https://github.com/scipy/scipy/issues/4829#issuecomment-100354752

> The most detailed
> public survey source for consumer hardware that I know is the Steam
> hardware survey:
>
> http://store.steampowered.com/hwsurvey
>
> It's somewhat biased towards higher-end hardware b/c it targets
> gamers, but there is plenty of less-high-end hardware on there as well
> -- notice that 20% of the surveyed computers are using Intel graphics.
> And they're reporting that 99.92% of surveyed computers have SSE*3*
> support, and 100.00% have SSE2. So assuming the significant digits are
> accurate, this puts the upper bound on SSE2 failure on these systems
> at ~0.05%. Even if gamers are 10x likelier to have new hardware than
> the rest of the world, 1% still seems to be at least an order of
> magnitude too high?

That would make life easier.....

Ralf
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From njs at pobox.com  Mon May 18 00:56:47 2015
From: njs at pobox.com (Nathaniel Smith)
Date: Sun, 17 May 2015 21:56:47 -0700
Subject: [Numpy-discussion] binary wheels for numpy?
In-Reply-To: References: Message-ID: On Sun, May 17, 2015 at 9:45 PM, Ralf Gommers wrote: > > On Mon, May 18, 2015 at 6:34 AM, Nathaniel Smith wrote: >> >> On Sun, May 17, 2015 at 3:06 AM, Ralf Gommers >> wrote: >> > There's the switch to OpenBLAS and building the right selection >> > mechanism >> > for which arch to use: >> > http://article.gmane.org/gmane.comp.python.distutils.devel/20350. That >> > seems >> > now feasible to complete on a reasonable time-scale, and the problems >> > with >> > OpenBLAS seem to be mostly solved. Binaries which crash for ~1% of users >> > (which ATLAS-SSE2 would result in) are still not acceptable I think. >> >> Where are you getting this SSE2 number from btw? > > This is info Matthew just collected from Firefox crash reports: > https://github.com/scipy/scipy/issues/4829#issuecomment-100354752 Ah, hmm. I guess it's possible that decade-old machines are less reliable and overrepresented in crash reports, but who knows :-) It might become reasonable at some point to just go ahead and put up binaries (ideally with some check so that they fail in a human-readable way), and see how many people email us. If it's too many we can always take the wheels down again. -n -- Nathaniel J. Smith -- http://vorpus.org From ralf.gommers at gmail.com Mon May 18 01:08:10 2015 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Mon, 18 May 2015 07:08:10 +0200 Subject: [Numpy-discussion] binary wheels for numpy? In-Reply-To: References: Message-ID: On Mon, May 18, 2015 at 6:56 AM, Nathaniel Smith wrote: > On Sun, May 17, 2015 at 9:45 PM, Ralf Gommers > wrote: > > > > On Mon, May 18, 2015 at 6:34 AM, Nathaniel Smith wrote: > >> > >> On Sun, May 17, 2015 at 3:06 AM, Ralf Gommers > >> wrote: > >> > There's the switch to OpenBLAS and building the right selection > >> > mechanism > >> > for which arch to use: > >> > http://article.gmane.org/gmane.comp.python.distutils.devel/20350. > That > >> > seems > >> > now feasible to complete on a reasonable time-scale, and the problems > >> > with > >> > OpenBLAS seem to be mostly solved. Binaries which crash for ~1% of > users > >> > (which ATLAS-SSE2 would result in) are still not acceptable I think. > >> > >> Where are you getting this SSE2 number from btw? > > > > This is info Matthew just collected from Firefox crash reports: > > https://github.com/scipy/scipy/issues/4829#issuecomment-100354752 > > Ah, hmm. I guess it's possible that decade-old machines are less > reliable and overrepresented in crash reports, but who knows :-) > > It might become reasonable at some point to just go ahead and put up > binaries (ideally with some check so that they fail in a > human-readable way), and see how many people email us. If it's too > many we can always take the wheels down again. > We should probably do that for the next release, if and only if we cannot make the switch to OpenBLAS in time. Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From sturla.molden at gmail.com Mon May 18 07:47:53 2015 From: sturla.molden at gmail.com (Sturla Molden) Date: Mon, 18 May 2015 13:47:53 +0200 Subject: [Numpy-discussion] binary wheels for numpy? In-Reply-To: References: <1250663829453568496.156738sturla.molden-gmail.com@news.gmane.org> Message-ID: On 18/05/15 06:09, Chris Barker wrote: > IIUC, The Intel libs have the great advantage of run-time selection of > hardware specific code -- yes? So they would both work and give high > performance on most machines (all?). 
OpenBLAS can also be built for dynamic architecture with hardware auto-detection. IIRC you build with DYNAMIC_ARCH=1 instead of specifying TARGET. Apple Accelerate Framework does this as well.

Sturla

From fomcl at yahoo.com  Mon May 18 13:08:57 2015
From: fomcl at yahoo.com (Albert-Jan Roskam)
Date: Mon, 18 May 2015 17:08:57 +0000 (UTC)
Subject: [Numpy-discussion] binary wheels for numpy?
In-Reply-To:
References:
Message-ID: <1302776353.586284.1431968937387.JavaMail.yahoo@mail.yahoo.com>

----- Original Message -----
> From: Matthew Brett
> To: Discussion of Numerical Python
> Cc:
> Sent: Monday, May 18, 2015 6:32 AM
> Subject: Re: [Numpy-discussion] binary wheels for numpy?
>
> On Sun, May 17, 2015 at 9:27 PM, Nathaniel Smith wrote:
>> On Sun, May 17, 2015 at 9:09 PM, Chris Barker wrote:
>>> On Sun, May 17, 2015 at 12:11 PM, Robert Kern wrote:
>>>>
>>>> I don't think permission from Intel is the blocking issue for putting
>>>> these binaries up on PyPI. Even with Intel's permission, we would be putting
>>>> up proprietary binaries on a page that is explicitly claiming that the files
>>>> linked therein are BSD-licensed. The binaries could not be redistributed
>>>> with any GPLed module, say, pygsl.
>>>>
>>>> We could host them on numpy.org on their own page that clearly explained
>>>> the license of those files, but I think PyPI is out.
>>>
>>> Can't PyPi re-direct -- so they can actually be hosted somewhere else, but
>>> "pip install numpy" would still work?
>>
>> There's two issues here: (1) we can't actually use the Intel stuff
>> (MKL, icc) under its regular license without having our release
>> managers accepting personal liability. Which isn't going to happen.
>> (2) The problem isn't whether they're hosted on PyPI, it's whether the
>> people downloading them get warned about what they're downloading. The
>> whole point is that we *don't* want 'pip install numpy' to work in
>> this case, because it's too seamless.

But you could use allow-external or allow-all-external:

--allow-external
    Allow the installation of a package even if it is externally hosted
--allow-all-external
    Allow the installation of all packages that are externally hosted

https://pip.pypa.io/en/latest/reference/pip_wheel.html#allow-external

From jgoutin at users.sourceforge.net  Mon May 18 13:49:58 2015
From: jgoutin at users.sourceforge.net (J.Goutin)
Date: Mon, 18 May 2015 17:49:58 +0000 (UTC)
Subject: [Numpy-discussion] MaskedArray compatibility decorators
Message-ID:

Hello,

I created 2 decorators to improve compatibility of "numpy.ma.MaskedArray" with functions that don't support them. These simply convert the MaskedArray to a classical ndarray with masked values converted to NaN, call the function, and reconvert the result to a MaskedArray.

@MaArrayToNaNKeepMask : Re-use the source mask on the output.
@MaArrayToNaNFixInvalid : Replace invalid values by mask on the output.

Source:

import numpy as np

def MaArrayToNaNKeepMask(func):
    """
    MaArray to ndArray with nan decorator.
    Keep mask from original ndArray.
    """
    def wrapper(MaArray, *args, **kwargs):
        try:
            Mask = MaArray.mask
            fill = MaArray.fill_value
            return np.ma.masked_array(
                func(MaArray.filled(np.NaN), *args, **kwargs),
                mask=Mask, fill_value=fill)
        except:
            return func(MaArray, *args, **kwargs)
    return wrapper

def MaArrayToNaNFixInvalid(func):
    """
    MaArray to ndArray with nan decorator.
    Recreate mask from invalid points.
    """
    def wrapper(MaArray, *args, **kwargs):
        try:
            fill = MaArray.fill_value
            return np.ma.fix_invalid(
                func(MaArray.filled(np.NaN), *args, **kwargs),
                fill_value=fill)
        except:
            return func(MaArray, *args, **kwargs)
    return wrapper

Example:

import skimage.transform

@MaArrayToNaNFixInvalid
def maresize(image, output_shape, order=1, mode='constant', cval=0,
             clip=True, preserve_range=True):
    return skimage.transform.resize(image, output_shape, order, mode,
                                    cval, clip, preserve_range)

I think it may be useful to include this directly in numpy. There are a lot of functions that don't work directly with MaskedArray.
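[Editor's note: a short sketch of how the decorators above might be exercised with a plain NumPy function; the function and values are invented for illustration and assume the decorators from the message above are in scope:]

import numpy as np

# The masked entry goes into the wrapped function as NaN and comes back
# out masked again, via np.ma.fix_invalid.
@MaArrayToNaNFixInvalid
def double(arr):
    return arr * 2.0

masked = np.ma.masked_array([1.0, 2.0, 3.0], mask=[False, True, False])
result = double(masked)
print(result)       # [2.0 -- 6.0]
print(result.mask)  # [False  True False]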
""" def wrapper(MaArray, *args, **kwargs): try: fill = MaArray.fill_value return np.ma.fix_invalid(func(MaArray.filled(np.NaN), *args, **kwargs), fill_value=fill) except: return func(MaArray, *args, **kwargs) return wrapper Exemple: import skimage.transform @MaArrayToNaNFixInvalid def maresize(image, output_shape, order=1, mode='constant', cval=0, clip=True, preserve_range=True): return skimage.transform.resize(image, output_shape, order, mode, cval, clip, preserve_range) I think it may be usefull to include this directly on numpy. There is a lot of functions that don't work directly with MaskedArray. From chris.barker at noaa.gov Mon May 18 15:57:41 2015 From: chris.barker at noaa.gov (Chris Barker) Date: Mon, 18 May 2015 12:57:41 -0700 Subject: [Numpy-discussion] binary wheels for numpy? In-Reply-To: References: Message-ID: On Sun, May 17, 2015 at 9:23 PM, Matthew Brett wrote: > I believe OpenBLAS does run-time selection too. very cool! then an excellent option if we can get it to work (make that you can get it to work, I'm not doing squat in this effort other than nudging...) I think we discussed before having a check and error clause in > __init__.py saying something like "You have a really old computer, you > can't use this binary, please go to sourceforge and download the exe > installer...". If we can to that, then there is NO reason not to put up binaries that _may_ not support some tiny percentage of users. though maybe with OpenBLAS we don't need to anyway. Thanks again to y'all for working on this. -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov -------------- next part -------------- An HTML attachment was scrubbed... URL: From sturla.molden at gmail.com Mon May 18 17:28:06 2015 From: sturla.molden at gmail.com (Sturla Molden) Date: Mon, 18 May 2015 23:28:06 +0200 Subject: [Numpy-discussion] binary wheels for numpy? In-Reply-To: References: Message-ID: On 18/05/15 21:57, Chris Barker wrote: > On Sun, May 17, 2015 at 9:23 PM, Matthew Brett > wrote: > > I believe OpenBLAS does run-time selection too. > > > very cool! then an excellent option if we can get it to work (make that > you can get it to work, I'm not doing squat in this effort other than > nudging...) Carl Kleffner has built binary wheels for NumPy and SciPy with OpenBLAS configured for run-time hardware detection. I don't remember at the top of my head where you can download them for testing. IIRC there remaining test failures were not related to OpenBLAS. Sturla From cmkleffner at gmail.com Tue May 19 10:04:50 2015 From: cmkleffner at gmail.com (Carl Kleffner) Date: Tue, 19 May 2015 16:04:50 +0200 Subject: [Numpy-discussion] binary wheels for numpy? In-Reply-To: References: Message-ID: numpy and scipy wheels for python2.6-3.4 have been uploaded on binstar last month and are installable with pip: https://binstar.org/carlkl/numpy https://binstar.org/carlkl/scipy The toolchains can be downloaded from https://bitbucket.org/carlkl/mingw-w64-for-python/downloads with some explanations given in https://bitbucket.org/carlkl/mingw-w64-for-python/downloads/mingwpy-2015-04-readme.pdf Carl 2015-05-18 23:28 GMT+02:00 Sturla Molden : > On 18/05/15 21:57, Chris Barker wrote: > > On Sun, May 17, 2015 at 9:23 PM, Matthew Brett > > wrote: > > > > I believe OpenBLAS does run-time selection too. > > > > > > very cool! 
then an excellent option if we can get it to work (make that you can get it to work, I'm not doing squat in this effort other than nudging...)

> I think we discussed before having a check and error clause in
> __init__.py saying something like "You have a really old computer, you
> can't use this binary, please go to sourceforge and download the exe
> installer...".

If we can do that, then there is NO reason not to put up binaries that _may_ not support some tiny percentage of users. Though maybe with OpenBLAS we don't need to anyway.

Thanks again to y'all for working on this.

-Chris

--
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker at noaa.gov
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From sturla.molden at gmail.com  Mon May 18 17:28:06 2015
From: sturla.molden at gmail.com (Sturla Molden)
Date: Mon, 18 May 2015 23:28:06 +0200
Subject: [Numpy-discussion] binary wheels for numpy?
In-Reply-To:
References:
Message-ID:

On 18/05/15 21:57, Chris Barker wrote:
> On Sun, May 17, 2015 at 9:23 PM, Matthew Brett wrote:
>
>     I believe OpenBLAS does run-time selection too.
>
> very cool! then an excellent option if we can get it to work (make that
> you can get it to work, I'm not doing squat in this effort other than
> nudging...)

Carl Kleffner has built binary wheels for NumPy and SciPy with OpenBLAS configured for run-time hardware detection. I don't remember off the top of my head where you can download them for testing. IIRC the remaining test failures were not related to OpenBLAS.

Sturla

From cmkleffner at gmail.com  Tue May 19 10:04:50 2015
From: cmkleffner at gmail.com (Carl Kleffner)
Date: Tue, 19 May 2015 16:04:50 +0200
Subject: [Numpy-discussion] binary wheels for numpy?
In-Reply-To:
References:
Message-ID:

numpy and scipy wheels for python2.6-3.4 have been uploaded to binstar last month and are installable with pip:

https://binstar.org/carlkl/numpy
https://binstar.org/carlkl/scipy

The toolchains can be downloaded from https://bitbucket.org/carlkl/mingw-w64-for-python/downloads with some explanations given in https://bitbucket.org/carlkl/mingw-w64-for-python/downloads/mingwpy-2015-04-readme.pdf

Carl

2015-05-18 23:28 GMT+02:00 Sturla Molden :
> On 18/05/15 21:57, Chris Barker wrote:
> > On Sun, May 17, 2015 at 9:23 PM, Matthew Brett wrote:
> >
> >     I believe OpenBLAS does run-time selection too.
> >
> > very cool! then an excellent option if we can get it to work (make that
> > you can get it to work, I'm not doing squat in this effort other than
> > nudging...)
>
> Carl Kleffner has built binary wheels for NumPy and SciPy with OpenBLAS
> configured for run-time hardware detection. I don't remember off the top
> of my head where you can download them for testing. IIRC the remaining
> test failures were not related to OpenBLAS.
>
> Sturla
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From ndarray at mac.com  Thu May 21 21:06:46 2015
From: ndarray at mac.com (Alexander Belopolsky)
Date: Thu, 21 May 2015 21:06:46 -0400
Subject: [Numpy-discussion] Two questions about PEP 465 dot product
Message-ID:

1. Is there a simple expression using existing numpy functions that implements PEP 465 semantics for @?

2. Suppose I have a function that takes two vectors x and y, and a matrix M and returns x.dot(M.dot(y)). I would like to "vectorize" this function so that it works with x and y of any ndim >= 1 and M of any ndim >= 2 treating multi-dimensional x and y as arrays of vectors and M as an array of matrices (broadcasting as necessary). The result should be an array of xMy products. How would I achieve that using PEP 465's @?
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From njs at pobox.com  Thu May 21 21:37:04 2015
From: njs at pobox.com (Nathaniel Smith)
Date: Thu, 21 May 2015 18:37:04 -0700
Subject: [Numpy-discussion] Two questions about PEP 465 dot product
In-Reply-To:
References:
Message-ID:

On Thu, May 21, 2015 at 6:06 PM, Alexander Belopolsky wrote:
> 1. Is there a simple expression using existing numpy functions that
> implements PEP 465 semantics for @?

Not yet.

> 2. Suppose I have a function that takes two vectors x and y, and a matrix M
> and returns x.dot(M.dot(y)). I would like to "vectorize" this function so
> that it works with x and y of any ndim >= 1 and M of any ndim >= 2 treating
> multi-dimensional x and y as arrays of vectors and M as an array of matrices
> (broadcasting as necessary). The result should be an array of xMy products.
> How would I achieve that using PEP 465's @?

(x[..., np.newaxis, :] @ M @ y[..., :, np.newaxis])[..., 0, 0]

Alternatively, you might prefer something like this (though it won't yet take advantage of BLAS):

np.einsum("...i,...ij,...j", x, M, y)

Alternatively, there's been some discussion of the possibility of adding specialized gufuncs for broadcasted vector-vector, vector-matrix, matrix-vector multiplication, which wouldn't do the magic vector promotion that dot and @ do.

-n

--
Nathaniel J. Smith -- http://vorpus.org
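[Editor's note: a quick numerical sanity check of the einsum form above, for readers who want to verify the broadcasting semantics against a plain Python loop; the shapes are arbitrary:]

import numpy as np

# A stack of 4 vectors x, 4 matrices M, and 4 vectors y: the einsum form
# computes one x.dot(M.dot(y)) scalar per stack element.
x = np.random.rand(4, 3)
M = np.random.rand(4, 3, 3)
y = np.random.rand(4, 3)

vectorized = np.einsum("...i,...ij,...j", x, M, y)
looped = np.array([x[k].dot(M[k].dot(y[k])) for k in range(4)])
print(np.allclose(vectorized, looped))  # True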
I would like to "vectorize" this function > so that it works with x and y of any ndim >= 1 and M of any ndim >= 2 > treating multi-dimensional x and y as arrays of vectors and M as an array > of matrices (broadcasting as necessary). The result should be an array of > xMy products. How would I achieve that using PEP 465's @? > > If you are willing to run Python 3.5 (use 3.6.0a3, a4 crawls with the bugs), you can use gh-5878 . The override mechanisms are still in process in Nathaniel's PR, so that may change. I'd welcome any feedback. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Fri May 22 02:03:55 2015 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri, 22 May 2015 00:03:55 -0600 Subject: [Numpy-discussion] Two questions about PEP 465 dot product In-Reply-To: References: Message-ID: On Fri, May 22, 2015 at 12:02 AM, Charles R Harris < charlesr.harris at gmail.com> wrote: > > > On Thu, May 21, 2015 at 7:06 PM, Alexander Belopolsky > wrote: > >> 1. Is there a simple expression using existing numpy functions that >> implements PEP 465 semantics for @? >> >> 2. Suppose I have a function that takes two vectors x and y, and a matrix >> M and returns x.dot(M.dot(y)). I would like to "vectorize" this function >> so that it works with x and y of any ndim >= 1 and M of any ndim >= 2 >> treating multi-dimensional x and y as arrays of vectors and M as an array >> of matrices (broadcasting as necessary). The result should be an array of >> xMy products. How would I achieve that using PEP 465's @? >> >> > If you are willing to run Python 3.5 (use 3.6.0a3, a4 crawls with the > bugs), you can use gh-5878 . > The override mechanisms are still in process in Nathaniel's PR, so that may > change. I'd welcome any feedback. > > Oops, make the 3.5.0a3. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From mathieu at mblondel.org Fri May 22 04:39:00 2015 From: mathieu at mblondel.org (Mathieu Blondel) Date: Fri, 22 May 2015 17:39:00 +0900 Subject: [Numpy-discussion] np.diag(np.dot(A, B)) Message-ID: Hi, I often need to compute the equivalent of np.diag(np.dot(A, B)). Computing np.dot(A, B) is highly inefficient if you only need the diagonal entries. Two more efficient ways of computing the same thing are np.sum(A * B.T, axis=1) and np.einsum("ij,ji->i", A, B). The first can allocate quite a lot of temporary memory. The second can be quite cryptic for someone not familiar with einsum. I assume that einsum does not compute np.dot(A, B), but I haven't verified. Since this is is quite a recurrent pattern, I was wondering if it would be worth adding a dedicated function to NumPy and SciPy's sparse module. A possible name would be "diagdot". The best performance would be obtained when A is C-style and B fortran-style. Best, Mathieu -------------- next part -------------- An HTML attachment was scrubbed... URL: From cournape at gmail.com Fri May 22 04:58:15 2015 From: cournape at gmail.com (David Cournapeau) Date: Fri, 22 May 2015 17:58:15 +0900 Subject: [Numpy-discussion] np.diag(np.dot(A, B)) In-Reply-To: References: Message-ID: On Fri, May 22, 2015 at 5:39 PM, Mathieu Blondel wrote: > Hi, > > I often need to compute the equivalent of > > np.diag(np.dot(A, B)). > > Computing np.dot(A, B) is highly inefficient if you only need the diagonal > entries. Two more efficient ways of computing the same thing are > > np.sum(A * B.T, axis=1) > > and > > np.einsum("ij,ji->i", A, B). 
From cournape at gmail.com  Fri May 22 04:58:15 2015
From: cournape at gmail.com (David Cournapeau)
Date: Fri, 22 May 2015 17:58:15 +0900
Subject: Re: [Numpy-discussion] np.diag(np.dot(A, B))

On Fri, May 22, 2015 at 5:39 PM, Mathieu Blondel wrote:
> Since this is quite a recurrent pattern, I was wondering if it would
> be worth adding a dedicated function to NumPy and SciPy's sparse
> module. A possible name would be "diagdot".

Does your implementation use BLAS, or is it just a wrapper around
einsum?

David


From nadavh at visionsense.com  Fri May 22 05:50:46 2015
From: nadavh at visionsense.com (Nadav Horesh)
Date: Fri, 22 May 2015 09:50:46 +0000
Subject: Re: [Numpy-discussion] np.diag(np.dot(A, B))

There was an idea on this list to provide a function that runs multiple
dot products on several vectors/matrices. This seems to be a particular
case of that proposed function.

Nadav.


From mathieu at mblondel.org  Fri May 22 06:15:10 2015
From: mathieu at mblondel.org (Mathieu Blondel)
Date: Fri, 22 May 2015 19:15:10 +0900
Subject: Re: [Numpy-discussion] np.diag(np.dot(A, B))

Right now I am using np.sum(A * B.T, axis=1) for dense data, and I have
implemented a Cython routine for sparse data.

I haven't benched np.sum(A * B.T, axis=1) vs. np.einsum("ij,ji->i", A, B)
yet, since I am mostly interested in the sparse case right now.

When A is C-style and B is Fortran-style, the optimal algorithm should
compute the inner products along the diagonal using BLAS. If not, I
guess this will need some benchmarking.

Another use for this is to compute the row-wise L2 norms:
np.diagdot(A, A.T).

Mathieu
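As an aside, the row-norm use case is already covered today by the
einsum spelling, with no new function needed ("diagdot" itself remains
hypothetical). A sketch:

import numpy as np

A = np.random.RandomState(0).randn(5, 3)

# Squared row-wise L2 norms, without forming the full A.dot(A.T):
row_sq_norms = np.einsum("ij,ij->i", A, A)
assert np.allclose(row_sq_norms, (A * A).sum(axis=1))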
From davidmenhur at gmail.com  Fri May 22 06:53:26 2015
From: davidmenhur at gmail.com (Daπid)
Date: Fri, 22 May 2015 12:53:26 +0200
Subject: Re: [Numpy-discussion] np.diag(np.dot(A, B))

On 22 May 2015 at 12:15, Mathieu Blondel wrote:
> I haven't benched np.sum(A * B.T, axis=1) vs.
> np.einsum("ij,ji->i", A, B) yet, since I am mostly interested in the
> sparse case right now.

In my system, einsum seems to be faster.

In [3]: N = 256
In [4]: A = np.random.random((N, N))
In [5]: B = np.random.random((N, N))
In [6]: %timeit np.sum(A * B.T, axis=1)
1000 loops, best of 3: 260 µs per loop
In [7]: %timeit np.einsum("ij,ji->i", A, B)
10000 loops, best of 3: 147 µs per loop

In [9]: N = 1023
In [10]: A = np.random.random((N, N))
In [11]: B = np.random.random((N, N))
In [12]: %timeit np.sum(A * B.T, axis=1)
100 loops, best of 3: 14 ms per loop
In [13]: %timeit np.einsum("ij,ji->i", A, B)
100 loops, best of 3: 10.7 ms per loop

I have ATLAS installed from the Fedora repos, so not tuned; but einsum
is only using one thread anyway, so it is probably not using it (and
definitely not computing the full dot, because that alone already takes
200 ms).

If B is in FORTRAN order, it is much faster (for N=5000).

In [25]: Bf = B.copy(order='F')
In [26]: %timeit np.einsum("ij,ji->i", A, Bf)
10 loops, best of 3: 25.7 ms per loop
In [27]: %timeit np.einsum("ij,ji->i", A, B)
1 loops, best of 3: 404 ms per loop
In [29]: %timeit np.sum(A * Bf.T, axis=1)
10 loops, best of 3: 118 ms per loop
In [30]: %timeit np.sum(A * B.T, axis=1)
1 loops, best of 3: 517 ms per loop

But the copy is not worth it:

In [31]: %timeit Bf = B.copy(order='F')
1 loops, best of 3: 463 ms per loop

/David.


From ndarray at mac.com  Fri May 22 13:57:17 2015
From: ndarray at mac.com (Alexander Belopolsky)
Date: Fri, 22 May 2015 13:57:17 -0400
Subject: Re: [Numpy-discussion] Two questions about PEP 465 dot product

On Thu, May 21, 2015 at 9:37 PM, Nathaniel Smith wrote:
> .. there's been some discussion of the possibility of adding
> specialized gufuncs for broadcasted vector-vector, vector-matrix,
> matrix-vector multiplication, which wouldn't do the magic vector
> promotion that dot and @ do.

This would be nice. What I would like to see is some consistency
between multi-matrix support in linalg methods and dot.

For example, when A is a matrix and b is a vector, and

a = linalg.solve(A, b)

then dot(A, a) returns b -- but if either or both of A and b are
stacks, this invariant does not hold. I would like to see a function
(say xdot) that I can use instead of dot and have xdot(A, a) return b
whenever a = linalg.solve(A, b).

Similarly, if w, v = linalg.eig(A), then dot(A, v) returns w * v, but
only if A is 2d.
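The stacked version of the solve invariant is easy to check with the
gufunc-style tools that already exist (a sketch: np.linalg.solve
broadcasts over leading dimensions since numpy 1.8, and np.matmul from
the gh-5878 work plays the role of @):

import numpy as np

rng = np.random.RandomState(0)
A = rng.randn(4, 3, 3)   # a stack of four 3x3 matrices
b = rng.randn(4, 3)      # a matching stack of 3-vectors

a = np.linalg.solve(A, b)                             # shape (4, 3)
recovered = np.matmul(A, a[..., np.newaxis])[..., 0]  # stacked matvec
assert np.allclose(recovered, b)                      # invariant holds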
From njs at pobox.com  Fri May 22 14:23:02 2015
From: njs at pobox.com (Nathaniel Smith)
Date: Fri, 22 May 2015 11:23:02 -0700
Subject: Re: [Numpy-discussion] Two questions about PEP 465 dot product

On May 22, 2015 11:00 AM, "Alexander Belopolsky" wrote:
> I would like to see a function (say xdot) that I can use instead of
> dot and have xdot(A, a) return b whenever a = linalg.solve(A, b).

I believe this equivalence holds if xdot(x, y) = x @ y, because solve()
does follow the pep 465 semantics for shape handling. Or at least, it's
intended to. Of course we will also expose pep 465 matmul semantics
under some name that doesn't require the new syntax (probably not
"xdot" though ;-)).

> Similarly, if w, v = linalg.eig(A), then dot(A, v) returns w * v, but
> only if A is 2d.

Again, A @ v I believe does the right thing, though I'm not positive --
you might need a swapaxes or matvec or something. Let us know if you
work it out :-).

Note that it still won't be equivalent to w * v, because w * v doesn't
broadcast the way you want :-). You need w[..., np.newaxis, :] * v, I
think.

-n
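Nathaniel's guess checks out numerically (a sketch; np.linalg.eig also
broadcasts over leading dimensions since numpy 1.8):

import numpy as np

rng = np.random.RandomState(0)
A = rng.randn(4, 3, 3)

w, v = np.linalg.eig(A)           # w: (4, 3), v: (4, 3, 3)
lhs = np.matmul(A, v)             # stacked matrix product, i.e. A @ v
rhs = w[..., np.newaxis, :] * v   # scales column j of each v by w[..., j]
assert np.allclose(lhs, rhs)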
From ben.root at ou.edu  Fri May 22 14:30:42 2015
From: ben.root at ou.edu (Benjamin Root)
Date: Fri, 22 May 2015 14:30:42 -0400
Subject: Re: [Numpy-discussion] Two questions about PEP 465 dot product

At some point, someone is going to make a single documentation page
describing all of this, right? Tables, mathtex, and such? I get woozy
whenever I see this discussion go on.

Ben Root


From njs at pobox.com  Fri May 22 16:05:08 2015
From: njs at pobox.com (Nathaniel Smith)
Date: Fri, 22 May 2015 13:05:08 -0700
Subject: Re: [Numpy-discussion] Two questions about PEP 465 dot product

On May 22, 2015 11:34 AM, "Benjamin Root" wrote:
> At some point, someone is going to make a single documentation page
> describing all of this, right? Tables, mathtex, and such? I get woozy
> whenever I see this discussion go on.

That does seem like a good idea, doesn't it. Following the principle
that recently-confused users write the best docs, any interest in
taking a shot at writing such a thing?

-n


From ben.root at ou.edu  Fri May 22 16:22:46 2015
From: ben.root at ou.edu (Benjamin Root)
Date: Fri, 22 May 2015 16:22:46 -0400
Subject: Re: [Numpy-discussion] Two questions about PEP 465 dot product

That assumes that the said recently-confused ever get to the point of
understanding it... and I personally don't do much matrix math work, so
I don't have the proper mental context. I just know that coworkers are
going to be coming to me asking questions because I am the de facto
"python guy". So, having a page I can point them to would be extremely
valuable.

Ben Root
From njs at pobox.com  Fri May 22 16:58:45 2015
From: njs at pobox.com (Nathaniel Smith)
Date: Fri, 22 May 2015 13:58:45 -0700
Subject: Re: [Numpy-discussion] Two questions about PEP 465 dot product

On May 22, 2015 1:26 PM, "Benjamin Root" wrote:
> That assumes that the said recently-confused ever get to the point of
> understanding it...

Well, I don't think it's that complicated really. For whatever that's
worth :-). My best attempt is here, anyway:

https://www.python.org/dev/peps/pep-0465/#semantics

The short version is: for 1d and 2d inputs it acts just like dot(). For
higher-dimension inputs like (i, j, n, m) it acts like any other gufunc
(e.g., everything in np.linalg) -- it treats this as an i-by-j stack of
n-by-m matrices and is vectorized over the i, j dimensions. And 0d
inputs are an error.

-n


From ben.root at ou.edu  Fri May 22 17:37:11 2015
From: ben.root at ou.edu (Benjamin Root)
Date: Fri, 22 May 2015 17:37:11 -0400
Subject: Re: [Numpy-discussion] Two questions about PEP 465 dot product

Then add in broadcasting behavior...


From ndarray at mac.com  Fri May 22 17:40:09 2015
From: ndarray at mac.com (Alexander Belopolsky)
Date: Fri, 22 May 2015 17:40:09 -0400
Subject: Re: [Numpy-discussion] Two questions about PEP 465 dot product

On Fri, May 22, 2015 at 4:58 PM, Nathaniel Smith wrote:
> For higher-dimension inputs like (i, j, n, m) it acts like any other
> gufunc (e.g., everything in np.linalg)

Unfortunately, not everything in linalg acts the same way. For example,
matrix_rank and lstsq don't.


From njs at pobox.com  Fri May 22 17:47:54 2015
From: njs at pobox.com (Nathaniel Smith)
Date: Fri, 22 May 2015 14:47:54 -0700
Subject: Re: [Numpy-discussion] Two questions about PEP 465 dot product

On May 22, 2015 2:40 PM, "Benjamin Root" wrote:
> Then add in broadcasting behavior...

Vectorized functions broadcast over the vectorized dimensions; there's
nothing special about @ in this regard.

-n
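The shape rules are easy to see interactively (a sketch, again using
np.matmul as a stand-in for @; the shapes are arbitrary):

import numpy as np

v = np.ones(3)
M = np.ones((3, 3))
S = np.ones((2, 4, 3, 3))   # a 2x4 stack of 3x3 matrices

np.matmul(v, M).shape   # (3,)          1d @ 2d: promote, multiply, squeeze
np.matmul(M, v).shape   # (3,)
np.matmul(S, M).shape   # (2, 4, 3, 3)  M broadcasts against the stack
np.matmul(S, v).shape   # (2, 4, 3)     vector promotion vectorizes too
# np.matmul(2, M) raises an error: 0d inputs are not allowed.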
From antony.lee at berkeley.edu  Sun May 24 04:22:21 2015
From: antony.lee at berkeley.edu (Antony Lee)
Date: Sun, 24 May 2015 01:22:21 -0700
Subject: [Numpy-discussion] Backwards-incompatible improvements to numpy.random.RandomState

Hi,

As mentioned in

#1450: Patch with Ziggurat method for Normal distribution
#5158: ENH: More efficient algorithm for unweighted random choice
without replacement
#5299: using `random.choice` to sample integers in a large range
#5851: Bug in np.random.dirichlet for small alpha parameters

some methods on np.random.RandomState are implemented either
non-optimally (#1450, #5158, #5299) or have outright bugs (#5851), but
cannot easily be changed due to backwards compatibility concerns. While
some have suggested new methods deprecating the old ones (see e.g.
#5872), some consensus has formed around the following ideas (see #5299
for the original discussion, followed by private discussions with
@njsmith):

- Backwards compatibility should only be provided to those who were
explicitly instantiating a seeded RandomState object, or reseeding a
RandomState object to a given value, and drawing variates from it:
using the global methods (or a None-seeded RandomState) was already
non-reproducible anyway, as e.g. other libraries could be drawing
variates from the global RandomState (of which the free functions in
np.random are actually methods). Thus, the global RandomState object
should use the latest implementation of the methods.

- "RandomState(seed)" and "r = RandomState(...); r.seed(seed)" should
offer backwards-compatibility guarantees (see e.g.
https://docs.python.org/3.4/library/random.html#notes-on-reproducibility).

As such, we propose the following improvements to the API:

- RandomState gains a (keyword-only) parameter, "version", also
accessible as a read-only attribute. This indicates the version of the
methods on the object. The current version of RandomState is
retroactively assigned version 0. The latest available version is
available as np.random.LATEST_VERSION. Backwards-incompatible
improvements to RandomState methods can be introduced, but increment
LATEST_VERSION.

- The global RandomState is instantiated as
RandomState(version=LATEST_VERSION).

- RandomState() and rs.seed() set the version to LATEST_VERSION.

- RandomState(seed[!=None]) and rs.seed(seed[!=None]) set the version
to 0.

A proof-of-concept implementation, still missing tests, is tracked as
#5911. It includes the patch proposed in #5158 as an example of how to
include an improved version of random.choice.

Comments, and help for writing tests (in particular to make sure
backwards compatibility is maintained) are welcome.

Antony Lee
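To make the proposed API concrete, a hypothetical sketch of how it
would be used. None of this exists in released NumPy: the "version"
keyword, the "version" attribute, and np.random.LATEST_VERSION are the
names from the proposal above, shown purely for illustration:

import numpy as np

# Seeding without a version pins version 0, i.e. bit-for-bit
# backwards-compatible streams:
rs = np.random.RandomState(1234)
assert rs.version == 0

# Opting in to the improved implementations is explicit:
rs_new = np.random.RandomState(1234, version=np.random.LATEST_VERSION)

# Unseeded objects (and hence the global np.random functions, which
# wrap a hidden unseeded RandomState) track the latest version:
assert np.random.RandomState().version == np.random.LATEST_VERSION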
From ralf.gommers at gmail.com  Sun May 24 04:59:49 2015
From: ralf.gommers at gmail.com (Ralf Gommers)
Date: Sun, 24 May 2015 10:59:49 +0200
Subject: Re: [Numpy-discussion] Backwards-incompatible improvements to numpy.random.RandomState

On Sun, May 24, 2015 at 10:22 AM, Antony Lee wrote:
> - Backwards compatibility should only be provided to those who were
> explicitly instantiating a seeded RandomState object, or reseeding a
> RandomState object to a given value, and drawing variates from it
> [...] Thus, the global RandomState object should use the latest
> implementation of the methods.

The rest of the proposal looks good to me, but the reasoning on this
point is shaky. np.random.seed() is *very* widely used, and works fine
for a test suite where each test that needs random numbers calls
seed(...) and is run with nose. Can you explain why you need to touch
the behavior of the global methods in order to make
RandomState(version=) work?

Ralf
From njs at pobox.com  Sun May 24 05:30:31 2015
From: njs at pobox.com (Nathaniel Smith)
Date: Sun, 24 May 2015 02:30:31 -0700
Subject: Re: [Numpy-discussion] Backwards-incompatible improvements to numpy.random.RandomState

On May 24, 2015 2:03 AM, "Ralf Gommers" wrote:
> Can you explain why you need to touch the behavior of the global
> methods in order to make RandomState(version=) work?

You're absolutely right about it being important to preserve the
behavior of the global functions when seeded, but I think this is just
a bug in the description of the proposal here, not in the proposal
itself :-). If you look at the PR, there's no change to how the global
functions work -- they're still just a transparently thin wrapper
around a hidden, global RandomState object, and thus IIUC changes to
RandomState will automatically apply to the global functions as well.

So with this proposal, an unseeded RandomState uses the latest version
-> therefore the global functions, which start out unseeded, start out
using the latest version. If you call .seed() on an existing
RandomState object and pass in a seed but no version= argument, the
version gets reset to 0 -> therefore if you call the global seed()
function and pass in a seed but no version= argument, the global
RandomState gets reset to version 0 (at least until the next time
seed() is called), and backcompat is preserved.

-n
From ralf.gommers at gmail.com  Sun May 24 05:54:24 2015
From: ralf.gommers at gmail.com (Ralf Gommers)
Date: Sun, 24 May 2015 11:54:24 +0200
Subject: Re: [Numpy-discussion] Backwards-incompatible improvements to numpy.random.RandomState

On Sun, May 24, 2015 at 11:30 AM, Nathaniel Smith wrote:
> So with this proposal, an unseeded RandomState uses the latest
> version -> therefore the global functions, which start out unseeded,
> start out using the latest version. [...] and backcompat is
> preserved.

Thanks for the clarification. Then +1 from me for this proposal.

Ralf
From alan.isaac at gmail.com  Sun May 24 08:41:18 2015
From: alan.isaac at gmail.com (Alan G Isaac)
Date: Sun, 24 May 2015 08:41:18 -0400
Subject: Re: [Numpy-discussion] Backwards-incompatible improvements to numpy.random.RandomState

I echo Ralf's question. For those who need replicability, the proposed
upgrade path seems quite radical.

Also, I would prefer to have the new functionality introduced beside
the existing implementation of RandomState, with an announcement that
RandomState will change in the next major numpy version. This would
allow everyone who wants to change now to do so, without requiring that
users attend to minor numpy version numbers if they want replicability.

I think this is what is required by semantic versioning.

Alan Isaac


From ralf.gommers at gmail.com  Sun May 24 08:47:34 2015
From: ralf.gommers at gmail.com (Ralf Gommers)
Date: Sun, 24 May 2015 14:47:34 +0200
Subject: Re: [Numpy-discussion] Backwards-incompatible improvements to numpy.random.RandomState

On Sun, May 24, 2015 at 2:41 PM, Alan G Isaac wrote:
> I echo Ralf's question. For those who need replicability, the
> proposed upgrade path seems quite radical.

It's not radical, and my question was already answered. Nothing changes
if you are doing:

np.random.seed(1234)
np.random.any_random_sample_generator_func()

Values only change if you leave out the call to seed(), which you
should never do if you care about replicability.

Ralf
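That guarantee is easy to state as runnable code (a sketch; these
function names are real, current NumPy):

import numpy as np

np.random.seed(1234)
first = np.random.random(3)

np.random.seed(1234)
second = np.random.random(3)

# An explicit seed pins the stream exactly; under the proposal this
# stays true across numpy versions, because seeding selects version 0.
assert (first == second).all()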
From alan.isaac at gmail.com  Sun May 24 09:08:12 2015
From: alan.isaac at gmail.com (Alan G Isaac)
Date: Sun, 24 May 2015 09:08:12 -0400
Subject: Re: [Numpy-discussion] Backwards-incompatible improvements to numpy.random.RandomState

On 5/24/2015 8:47 AM, Ralf Gommers wrote:
> Values only change if you leave out the call to seed()

OK, but this claim seems to conflict with the following language: "the
global RandomState object should use the latest implementation of the
methods". I take it that this is what Nathaniel meant by "I think this
is just a bug in the description of the proposal here, not in the
proposal itself".

So, is the correct phrasing "the global RandomState object should use
the latest implementation of the methods, unless explicitly seeded"?

Thanks,
Alan


From ralf.gommers at gmail.com  Sun May 24 10:55:31 2015
From: ralf.gommers at gmail.com (Ralf Gommers)
Date: Sun, 24 May 2015 16:55:31 +0200
Subject: [Numpy-discussion] ANN: Scipy 0.16.0 beta release 2

Hi all,

The second beta for Scipy 0.16.0 is now available. After beta 1 a
couple of critical issues on Windows were solved, and there are now
also 32-bit Windows binaries (along with the sources and release notes)
available on
https://sourceforge.net/projects/scipy/files/scipy/0.16.0b2/.

Please try this release and report any issues on the scipy-dev mailing
list.

Cheers,
Ralf


From josef.pktd at gmail.com  Sun May 24 11:04:11 2015
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Sun, 24 May 2015 11:04:11 -0400
Subject: Re: [Numpy-discussion] Backwards-incompatible improvements to numpy.random.RandomState

On Sun, May 24, 2015 at 9:08 AM, Alan G Isaac wrote:
> So, is the correct phrasing "the global RandomState object should use
> the latest implementation of the methods, unless explicitly seeded"?

That's how I understand it.

I don't see any problems with the clarified proposal for the use cases
that I know of.

Can we choose the version also for the global random state, for
example to fix both version and seed in unit tests, with version > 0?

BTW: I would expect that bug fixes are still exempt from backwards
compatibility. Fixing #5851 should be independent of the version
(without having looked at the issue). If you need to replicate bugs,
then use an old version of the package.

Josef
From archibald at astron.nl  Sun May 24 11:13:49 2015
From: archibald at astron.nl (Anne Archibald)
Date: Sun, 24 May 2015 15:13:49 +0000
Subject: Re: [Numpy-discussion] Backwards-incompatible improvements to numpy.random.RandomState

Do we want a deprecation-like approach, so that eventually people who
want replicability will specify versions, and everyone else gets bug
fixes and improvements? This would presumably take several major
versions, but it might avoid people getting unintentionally trapped on
this version.

Incidentally, bug fixes are complicated: if a bug fix uses more or
fewer raw random numbers, it breaks repeatability not just for the call
that got fixed but for all successive random number generations.

Anne
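Anne's point can be illustrated with a toy example (a sketch: the two
"implementations" below are made-up stand-ins, not actual NumPy code).
If a fix changes how many raw draws a method consumes, every draw that
comes after it shifts, even under a fixed seed:

import numpy as np

def variate_v0(rs):
    # old implementation: consumes two raw uniforms (say, by rejection)
    u1, u2 = rs.random_sample(2)
    return u1

def variate_v1(rs):
    # "fixed" implementation: consumes only one raw uniform
    return rs.random_sample()

for impl in (variate_v0, variate_v1):
    rs = np.random.RandomState(1234)
    impl(rs)                   # the call whose implementation changed
    print(rs.random_sample())  # an unrelated later draw -- it differs,
                               # because the stream position differs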
From josef.pktd at gmail.com  Sun May 24 11:40:06 2015
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Sun, 24 May 2015 11:40:06 -0400
Subject: Re: [Numpy-discussion] Backwards-incompatible improvements to numpy.random.RandomState

On Sun, May 24, 2015 at 11:13 AM, Anne Archibald wrote:
> Do we want a deprecation-like approach, so that eventually people who
> want replicability will specify versions, and everyone else gets bug
> fixes and improvements?

Reminder: we are bottom or inline posting.

> > fixing #5851 should be independent of the version (without having
> > looked at the issue)

I skimmed the issue. In a strict sense it's not really a bug: the user
doesn't get wrong numbers, he or she gets Not A Number (NaN). So there
are no current usages that use the function in that range.

Josef


From njs at pobox.com  Sun May 24 13:49:22 2015
From: njs at pobox.com (Nathaniel Smith)
Date: Sun, 24 May 2015 10:49:22 -0700
Subject: Re: [Numpy-discussion] Backwards-incompatible improvements to numpy.random.RandomState

On May 24, 2015 8:43 AM, wrote:
> Reminder: we are bottom or inline posting

Can we stop hassling people about this? Inline replies are a great tool
to have in your toolkit for complicated technical discussions, but I
feel like our weird insistence on them has turned into a pointless and
exclusionary thing. It's not like bottom replying is even any better --
the traditional mailing list rule is that you trim quotes to just the
part you're replying to (like this message); quoting the whole thing
and replying underneath, just to give people a bit of exercise for
their scrolling finger, would totally have gotten you flamed too.

But email etiquette has moved on since the 90s; even regular posters to
this list violate this "rule" all the time. It's time to let it go.

-n
From josef.pktd at gmail.com  Sun May 24 14:01:28 2015
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Sun, 24 May 2015 14:01:28 -0400
Subject: Re: [Numpy-discussion] Backwards-incompatible improvements to numpy.random.RandomState

On Sun, May 24, 2015 at 1:49 PM, Nathaniel Smith wrote:
> But email etiquette has moved on since the 90s; even regular posters
> to this list violate this "rule" all the time. It's time to let it go.

It's not a 90s thing; I learned about it around 2009 when I started
here. I find it very annoying trying to catch up with a longer thread
when the replies are all over the place.

Anne is a few years older than I am in terms of numpy and scipy
participation, and this was just intended to be a friendly reminder.

And BTW: I'm glad Anne is back with scipy.

Josef


From njs at pobox.com  Sun May 24 14:04:39 2015
From: njs at pobox.com (Nathaniel Smith)
Date: Sun, 24 May 2015 11:04:39 -0700
Subject: Re: [Numpy-discussion] Backwards-incompatible improvements to numpy.random.RandomState

On May 24, 2015 8:15 AM, "Anne Archibald" wrote:
> Do we want a deprecation-like approach, so that eventually people who
> want replicability will specify versions, and everyone else gets bug
> fixes and improvements?

I'm not sure what you're envisioning as needing a deprecation cycle?
The neat thing about random is that we already have a way for users to
say that they want replicability -- the use of an explicit seed -- so
we can just immediately go to the world you describe, where people who
seed get to pick their version (or default to version 0 for
backcompat), and everyone else gets the improvements automatically. Or
is this different from what you meant somehow?

Fortunately we haven't yet run into any really serious bugs in random,
like "oops we're sampling from the wrong distribution" type bugs.
Mostly it's more like "oops this is really inefficient" or "oops this
crashes in this edge case", so there's no real harm in letting people
use old versions.
If we did run into a case where we were giving flat out wrong results,
then I guess we'd still want to keep the code around, because
reproducibility is still important, but perhaps with a requirement that
you pass an extra argument like I_know_its_broken=True, so that people
couldn't end up running the broken code accidentally. I guess we'll
cross that bridge when we come to it.

> Incidentally, bug fixes are complicated: if a bug fix uses more or
> fewer raw random numbers, it breaks repeatability not just for the
> call that got fixed but for all successive random number generations.

Yep. This is why we mostly haven't been able to change behavior at
*all*, except in cases where there was a clear error, so we know no-one
was using something.

-n


From sturla.molden at gmail.com  Sun May 24 14:46:50 2015
From: sturla.molden at gmail.com (Sturla Molden)
Date: Sun, 24 May 2015 20:46:50 +0200
Subject: Re: [Numpy-discussion] Backwards-incompatible improvements to numpy.random.RandomState

On 24/05/15 17:13, Anne Archibald wrote:
> Incidentally, bug fixes are complicated: if a bug fix uses more or
> fewer raw random numbers, it breaks repeatability not just for the
> call that got fixed but for all successive random number generations.

If a function has a bug, changing it will change the output of the
function. This is not special to random numbers. If not retaining the
old erroneous output means we break backwards compatibility, then no
bugs can ever be fixed, anywhere in NumPy. I think we need to clarify
what we mean by backwards compatibility for random numbers. What
guarantees should we make from one version to another?

Sturla
Sturla From robert.kern at gmail.com Sun May 24 15:22:32 2015 From: robert.kern at gmail.com (Robert Kern) Date: Sun, 24 May 2015 20:22:32 +0100 Subject: [Numpy-discussion] Backwards-incompatible improvements to numpy.random.RandomState In-Reply-To: References: <5561C6EE.1080000@gmail.com> <5561CD3C.8020401@gmail.com> Message-ID: On Sun, May 24, 2015 at 7:46 PM, Sturla Molden wrote: > > On 24/05/15 17:13, Anne Archibald wrote: > > Do we want a deprecation-like approach, so that eventually people who > > want replicability will specify versions, and everyone else gets bug > > fixes and improvements? This would presumably take several major > > versions, but it might avoid people getting unintentionally trapped on > > this version. > > > > Incidentally, bug fixes are complicated: if a bug fix uses more or fewer > > raw random numbers, it breaks repeatability not just for the call that > > got fixed but for all successive random number generations. > > If a function has a bug, changing it will change the output of the > function. This is not special for random numbers. If not retaining the > old erroneous output means we break-backwards compatibility, then no > bugs can ever be fixed, anywhere in NumPy. I think we need to clarify > what we mean by backwards compatibility for random numbers. What > guarantees should we make from one version to another? The policy thus far has been that we will fix bugs in the distributions and make changes that allow a strictly wider domain of distribution parameters (e.g. allowing b==0 where before we only allowed b>0), but we will not make other enhancements that would change existing good output. -- Robert Kern -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Sun May 24 15:25:39 2015 From: robert.kern at gmail.com (Robert Kern) Date: Sun, 24 May 2015 20:25:39 +0100 Subject: [Numpy-discussion] Backwards-incompatible improvements to numpy.random.RandomState In-Reply-To: References: <5561C6EE.1080000@gmail.com> <5561CD3C.8020401@gmail.com> Message-ID: On Sun, May 24, 2015 at 7:56 PM, Sturla Molden wrote: > > On 24/05/15 20:04, Nathaniel Smith wrote: > > > I'm not sure what you're envisioning as needing a deprecation cycle? The > > neat thing about random is that we already have a way for users to say > > that they want replicability -- the use of an explicit seed -- > > No, this is not sufficient for random numbers. Random sampling and > ziggurat generators are examples. If we introduce a change (e.g. a > bugfix) that will affect the number of calls to the entropy source, just > setting the seed will in general not be enough to ensure backwards > compatibility. That is e.g. the case with using ziggurat samplers > instead of the current transcendental transforms for normal, exponential > and gamma distributions. While ziggurat is faster (and to my knowledge) > more accurate, it will also make a different number of calls to the > entropy source, and hence the whole sequence will be affected, even if > you do set a random seed. Please reread the proposal at the top of the thread. -- Robert Kern -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From antony.lee at berkeley.edu Sun May 24 16:15:04 2015 From: antony.lee at berkeley.edu (Antony Lee) Date: Sun, 24 May 2015 13:15:04 -0700 Subject: [Numpy-discussion] Backwards-incompatible improvements to numpy.random.RandomState In-Reply-To: References: <5561C6EE.1080000@gmail.com> <5561CD3C.8020401@gmail.com> Message-ID: Thanks to Nathaniel who has indeed clarified my intent, i.e. "the global RandomState should use the latest implementation, unless explicitly seeded". More generally, the `RandomState` constructor is just a thin wrapper around `seed` with the same signature, so one can swap the version of the global functions with a call to `np.random.seed(version=...)`. -------------- next part -------------- An HTML attachment was scrubbed... URL: From sturla.molden at gmail.com Sun May 24 16:30:59 2015 From: sturla.molden at gmail.com (Sturla Molden) Date: Sun, 24 May 2015 22:30:59 +0200 Subject: [Numpy-discussion] Backwards-incompatible improvements to numpy.random.RandomState In-Reply-To: References: Message-ID: On 24/05/15 10:22, Antony Lee wrote: > Comments, and help for writing tests (in particular to make sure > backwards compatibility is maintained) are welcome. I have one comment, and that is what makes random numbers so special? This applies to the rest of NumPy too, fixing a bug can sometimes change the output of a function. Personally I think we should only make guarantees about the data types, array shapes, and things like that, but not about the values. Those who need a particular version of NumPy for exact reproducibility should install the version of Python and NumPy they need. That is why virtual environments exist. I am sure a lot will disagree with me on this. So please don't take this as flamebait. Sturla From njs at pobox.com Sun May 24 16:39:52 2015 From: njs at pobox.com (Nathaniel Smith) Date: Sun, 24 May 2015 13:39:52 -0700 Subject: [Numpy-discussion] Backwards-incompatible improvements to numpy.random.RandomState In-Reply-To: References: <5561C6EE.1080000@gmail.com> <5561CD3C.8020401@gmail.com> Message-ID: On May 24, 2015 11:04 AM, wrote: > > On Sun, May 24, 2015 at 1:49 PM, Nathaniel Smith wrote: >> >> On May 24, 2015 8:43 AM, wrote: >> > >> > Reminder: we are bottom or inline posting >> >> Can we stop hassling people about this? Inline replies are a great tool to have in your toolkit for complicated technical discussions, but I feel like our weird insistence on them has turned into a pointless and exclusionary thing. It's not like bottom replying is even any better -- the traditional mailing list rule is you trim quotes to just the part you're replying to (like this message); quoting the whole thing and replying underneath just to give people a bit of exercise for their scrolling finger would totally have gotten you flamed too. >> >> But email etiquette has moved on since the 90s, even regular posters to this list violate this "rule" all the time, it's time to let it go. > > > It's not a 90's thing and I learned about it around 2009 when I started in here. > I find it very annoying trying to catch up with a longer thread and the replies are all over the place. > > > Anne is a few years older than I in terms of numpy and scipy participation and this was just intended to be a friendly reminder. 
And while I know you didn't mean it this way, I'm guessing that being immediately greeted by criticism for failing to follow some arbitrary and inconsistently-applied rule was indeed a strong reminder of what a unpleasant place FOSS mailing lists can sometimes be, and why someone might disappear from them for a few years. I think we can do better. This is pretty off-topic for this thread, though, see so let's let it lie here. If anyone desperately needs to comment further please email me off-list. -n -------------- next part -------------- An HTML attachment was scrubbed... URL: From antony.lee at berkeley.edu Sun May 24 17:09:43 2015 From: antony.lee at berkeley.edu (Antony Lee) Date: Sun, 24 May 2015 14:09:43 -0700 Subject: [Numpy-discussion] Backwards-incompatible improvements to numpy.random.RandomState In-Reply-To: References: Message-ID: 2015-05-24 13:30 GMT-07:00 Sturla Molden : > On 24/05/15 10:22, Antony Lee wrote: > > > Comments, and help for writing tests (in particular to make sure > > backwards compatibility is maintained) are welcome. > > I have one comment, and that is what makes random numbers so special? > This applies to the rest of NumPy too, fixing a bug can sometimes change > the output of a function. > > Personally I think we should only make guarantees about the data types, > array shapes, and things like that, but not about the values. Those who > need a particular version of NumPy for exact reproducibility should > install the version of Python and NumPy they need. That is why virtual > environments exist. I personally agree with this point of view (see original discussion in #5299, for example); if it was only up to me at least I'd make RandomState(seed) default to the latest version rather than the original one (whether to keep the old versions around is another question). On the other hand, I see that this long-standing debate has prevented obvious improvements from being added sometimes for years (e.g. a patch for Ziggurat normal variates has been lying around since 2010), or led to potential API duplication in order to fix some clearly undesirable behavior (dirichlet returning "nan" being described as "in a strict sense not really a bug"(!)), so I'm willing to compromise to get this moving forward. Antony -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Sun May 24 17:49:17 2015 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Sun, 24 May 2015 17:49:17 -0400 Subject: [Numpy-discussion] Backwards-incompatible improvements to numpy.random.RandomState In-Reply-To: References: Message-ID: On Sun, May 24, 2015 at 5:09 PM, Antony Lee wrote: > 2015-05-24 13:30 GMT-07:00 Sturla Molden : > >> On 24/05/15 10:22, Antony Lee wrote: >> >> > Comments, and help for writing tests (in particular to make sure >> > backwards compatibility is maintained) are welcome. >> >> I have one comment, and that is what makes random numbers so special? >> This applies to the rest of NumPy too, fixing a bug can sometimes change >> the output of a function. >> >> Personally I think we should only make guarantees about the data types, >> array shapes, and things like that, but not about the values. Those who >> need a particular version of NumPy for exact reproducibility should >> install the version of Python and NumPy they need. That is why virtual >> environments exist. 
> > > I personally agree with this point of view (see original discussion in > #5299, for example); if it were only up to me, at least, I'd make > RandomState(seed) default to the latest version rather than the original > one (whether to keep the old versions around is another question). On the > other hand, I see that this long-standing debate has prevented obvious > improvements from being added sometimes for years (e.g. a patch for > Ziggurat normal variates has been lying around since 2010), or led to > potential API duplication in order to fix some clearly undesirable behavior > (dirichlet returning "nan" being described as "in a strict sense not really > a bug"(!)), so I'm willing to compromise to get this moving forward. > It's clearly a different kind of "bug" than some of the ones we fixed in the past without any backwards-compatibility discussion, where the distribution itself was wrong, i.e. some values were shifted so that parts had more weight and parts had less. As I mentioned, I don't see any real problem with the proposal. Josef > > Antony > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From andyfaff at gmail.com Mon May 25 07:02:42 2015 From: andyfaff at gmail.com (Andrew Nelson) Date: Mon, 25 May 2015 21:02:42 +1000 Subject: [Numpy-discussion] Chaining apply_over_axis for multiple axes. Message-ID: I have a function that operates over a 1D array, to return an array of a similar size. To use it in a 2D fashion I would have to do something like the following:

for row in range(np.size(arr, 0)):
    arr_out[row] = func(arr[row])
for col in range(np.size(arr, 1)):
    arr_out[:, col] = func(arr[:, col])

I would like to generalise this to N dimensions. Does anyone have any suggestions of how to achieve this? Presumably what I need to do is build an iterator, and then remove an axis:

# arr.shape = (2, 3, 4)
it = np.nditer(arr, flags=['multi_index'])
it.remove_axis(2)
while not it.finished:
    arr_out[it.multi_index] = func(arr[it.multi_index])
    it.iternext()

If I have an array with shape (2, 3, 4) this would allow me to iterate over the 6 1D arrays that are 4 elements long. However, how do I then construct the iterator for the preceding axes? -------------- next part -------------- An HTML attachment was scrubbed... URL: From davidmenhur at gmail.com Mon May 25 07:14:08 2015 From: davidmenhur at gmail.com (=?UTF-8?B?RGHPgGlk?=) Date: Mon, 25 May 2015 13:14:08 +0200 Subject: [Numpy-discussion] Backwards-incompatible improvements to numpy.random.RandomState In-Reply-To: References: Message-ID: On 24 May 2015 at 22:30, Sturla Molden wrote: > Personally I think we should only make guarantees about the data types, > array shapes, and things like that, but not about the values. Those who > need a particular version of NumPy for exact reproducibility should > install the version of Python and NumPy they need. That is why virtual > environments exist. > But there is a lot of legacy code out there that doesn't specify the version required; and in most cases the original author cannot even be asked. Tests are a particularly annoying case. For example, when testing an algorithm, it is usually good practice to record the number of iterations as well as the result; consider it an early warning that we have changed something we possibly didn't mean to, even if the result is correct.
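A minimal sketch of such an iteration-count regression test; the bisection routine, the tolerances, and the recorded count of 41 are all invented purely for illustration:

import numpy as np

def bisect_root(f, lo, hi, tol=1e-12):
    # stand-in for an iterative algorithm whose behaviour we want to pin down;
    # assumes f(lo) > 0 > f(hi)
    n = 0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(mid) < 0:
            hi = mid
        else:
            lo = mid
        n += 1
    return 0.5 * (lo + hi), n

root, n_iter = bisect_root(np.sin, 2.0, 4.0)
assert abs(root - np.pi) < 1e-9
assert n_iter == 41  # recorded once when the test was written; a change is the early warning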
If we want to support several NumPy versions, and the algorithm has any randomness, the tests would either have to be duplicated, or we would have to find a seed that gives the exact same results. Thus, keeping different versions lets us compare the results against the old API, without needing to duplicate the tests. A lot fewer people will get annoyed. /David. -------------- next part -------------- An HTML attachment was scrubbed... URL: From sebastian at sipsolutions.net Mon May 25 11:21:29 2015 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Mon, 25 May 2015 09:21:29 -0600 Subject: [Numpy-discussion] Chaining apply_over_axis for multiple axes. In-Reply-To: References: Message-ID: <1432567289.2764.7.camel@sipsolutions.net> On Mo, 2015-05-25 at 21:02 +1000, Andrew Nelson wrote: > I have a function that operates over a 1D array, to return an array of > a similar size. To use it in a 2D fashion I would have to do > something like the following: > > > for row in range(np.size(arr, 0)): > arr_out[row] = func(arr[row]) > for col in range(np.size(arr, 1)): > arr_out[:, col] = func(arr[:, col]) > > > I would like to generalise this to N dimensions. Does anyone have any > suggestions of how to achieve this? Presumably what I need to do is > build an iterator, and then remove an axis: > > > # arr.shape=(2, 3, 4) > it = np.nditer(arr, flags=['multi_index']) > it.remove_axis(2) > while not it.finished: > arr_out[it.multi_index] = func(arr[it.multi_index]) > it.iternext() > Just a warning that nditer is pretty low level (i.e. it can be a bit mind-boggling, since it is close to the C side of things). Anyway, you can of course do this by just iterating like that. Since you have no buffering, etc., this should work fine. There is also `np.nested_iters`, but since I am a bit too lazy to look it up, you would probably have to check some examples in the numpy tests to see how it works. - Sebastian > > If I have an array with shape (2, 3, 4) this would allow me to iterate > over the 6 1D arrays that are 4 elements long. However, how do I then > construct the iterator for the preceding axes? > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 819 bytes Desc: This is a digitally signed message part URL: From njs at pobox.com Mon May 25 11:38:26 2015 From: njs at pobox.com (Nathaniel Smith) Date: Mon, 25 May 2015 08:38:26 -0700 Subject: [Numpy-discussion] Chaining apply_over_axis for multiple axes. In-Reply-To: References: Message-ID: On May 25, 2015 4:05 AM, "Andrew Nelson" wrote: > > I have a function that operates over a 1D array, to return an array of a similar size. To use it in a 2D fashion I would have to do something like the following: > > for row in range(np.size(arr, 0)): > arr_out[row] = func(arr[row]) > for col in range(np.size(arr, 1)): > arr_out[:, col] = func(arr[:, col]) > > I would like to generalise this to N dimensions. Does anyone have any suggestions of how to achieve this? The crude but effective way is

tmp_in = arr.reshape((-1, arr.shape[-1]))
tmp_out = np.empty(tmp_in.shape)
for i in range(tmp_in.shape[0]):
    tmp_out[i, :] = func(tmp_in[i, :])
out = tmp_out.reshape(arr.shape)

This won't produce any unnecessary copies if your input array is contiguous. This also assumes you want to apply the function on the last axis.
If not you can do something like arr = arr.swapaxes(axis, -1) ... call the code above ... out = out.swapaxes(axis, -1) This will result in an extra copy of the input array though if it's >2d and the requested axis is not the last one. -n -------------- next part -------------- An HTML attachment was scrubbed... URL: From matthew.brett at gmail.com Tue May 26 10:56:43 2015 From: matthew.brett at gmail.com (Matthew Brett) Date: Tue, 26 May 2015 10:56:43 -0400 Subject: [Numpy-discussion] Strategy for OpenBLAS Message-ID: Hi, This morning I was wondering whether we ought to plan to devote some resources to collaborating with the OpenBLAS team. Summary: we should explore ways of setting up numpy as a test engine for OpenBLAS development. Detail: I am getting the impression that OpenBLAS is looking like the most likely medium term solution for open-source stack builds of numpy and scipy on Linux and Windows at least. ATLAS has been our choice for this up until now, but it is designed for optimizing to a particular CPU configuration, which will likely make it slow on some or even most of the machines a general installer gets installed on. This is only likely to get more severe over time, because current ATLAS development is on multi-core optimization, where the number of cores may need to be set at compile time. The worry about OpenBLAS has always been that it is hard to maintain, and fixes don't always have tests. There might be other alternatives that are a better bet technically, but don't currently have OpenBLAS' dynamic selection features or CPU support. It is relatively easy to add tests using Python / numpy. We like tests. Why don't we propose a collaboration with OpenBLAS where we build and test numpy with every / most / some commits of OpenBLAS, and try to make it easy for the OpenBLAS team to add tests. Maybe we can use and add to the list of machines on which OpenBLAS is tested [1]? We Berkeley Pythonistas can certainly add the machines at our buildbot farm [2]. Maybe the Julia / R developers would be interested to help too? Cheers, Matthew [1] https://github.com/xianyi/OpenBLAS/wiki/Machine-List [2] http://nipy.bic.berkeley.edu/buildslaves From thomas.p.krauss at gmail.com Tue May 26 10:59:25 2015 From: thomas.p.krauss at gmail.com (Tom Krauss) Date: Tue, 26 May 2015 09:59:25 -0500 Subject: [Numpy-discussion] addition to numpy.i Message-ID: Hi folks, After some discussion with Bill Spotz I decided to try to submit my new typemap to numpy.i that allows in-place arrays of an arbitrary number of dimensions to be passed in as a "flat" array with a single "size". To that end I created my first pull request https://github.com/numpy/numpy/pull/5914 sorry if I missed any steps or procedures - I noticed only after I did the commit and pull request that I should have created a new feature branch, sorry about that. Anyway I noticed the pull request initiated a series of tests and one of them failed. How do I go about debugging and resolving the failure? Thanks for your help, Tom Krauss -------------- next part -------------- An HTML attachment was scrubbed... 
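Returning to the apply-over-axes thread above: a self-contained sketch of the reshape/swapaxes recipe. The helper name apply_to_axis is invented here, func1d is assumed to map a 1-d array to a same-length 1-d array, and numpy's own np.apply_along_axis covers much of the same ground:

import numpy as np

def apply_to_axis(func1d, arr, axis=-1):
    # move the target axis last, collapse the others, loop, then undo
    arr = np.asarray(arr).swapaxes(axis, -1)
    flat = arr.reshape(-1, arr.shape[-1])
    out = np.empty_like(flat, dtype=np.result_type(flat, np.float64))
    for i in range(flat.shape[0]):
        out[i] = func1d(flat[i])
    return out.reshape(arr.shape).swapaxes(axis, -1)

a = np.arange(24.).reshape(2, 3, 4)
assert np.allclose(apply_to_axis(np.cumsum, a, axis=1),
                   np.apply_along_axis(np.cumsum, 1, a))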
URL: From jtaylor.debian at googlemail.com Tue May 26 12:53:08 2015 From: jtaylor.debian at googlemail.com (Julian Taylor) Date: Tue, 26 May 2015 18:53:08 +0200 Subject: [Numpy-discussion] Strategy for OpenBLAS In-Reply-To: References: Message-ID: <5564A4F4.8050803@googlemail.com> On 05/26/2015 04:56 PM, Matthew Brett wrote: > Hi, > > This morning I was wondering whether we ought to plan to devote some > resources to collaborating with the OpenBLAS team. > > > > It is relatively easy to add tests using Python / numpy. We like > tests. Why don't we propose a collaboration with OpenBLAS where we > build and test numpy with every / most / some commits of OpenBLAS, and > try to make it easy for the OpenBLAS team to add tests. Maybe we > can use and add to the list of machines on which OpenBLAS is tested > [1]? We Berkeley Pythonistas can certainly add the machines at our > buildbot farm [2]. Maybe the Julia / R developers would be interested > to help too? > Technically we only need a single machine with the newest instruction set available. All other cases could then be tested via a virtual machine that only exposes specific instruction sets (e.g. qemu, which could technically also emulate stuff the host does not have). Concerning test generation, there is a huge parameter space that needs testing with openblas; at least some of it would need to be automated/fuzzed. We also need specific preconditioning of memory to test failure cases openblas had in the past, e.g. filling memory around the matrices with nans and also somehow filling openblas's own temporary buffers with some signaling values (might require a specially built openblas if MALLOC_PERTURB_ does not work). Maybe it would be feasible to write a hypothesis [0] strategy for some of the blas stuff to automate the parameter exploration. And then we'd need to run everything under valgrind since, due to the assembler implementation of openblas, we can't use the faster address sanitizers gcc and clang now provide. [0] https://hypothesis.readthedocs.org/en/latest/ From sturla.molden at gmail.com Tue May 26 13:02:33 2015 From: sturla.molden at gmail.com (Sturla Molden) Date: Tue, 26 May 2015 17:02:33 +0000 (UTC) Subject: [Numpy-discussion] Strategy for OpenBLAS References: Message-ID: <1788986294454351015.246752sturla.molden-gmail.com@news.gmane.org> Matthew Brett wrote: > I am getting the impression that OpenBLAS is looking like the most > likely medium term solution for open-source stack builds of numpy and > scipy on Linux and Windows at least. I think you are right. OpenBLAS might even be a long-term solution. We should also consider that GotoBLAS (and GotoBLAS2) powered some of the world's most expensive supercomputers for a decade. It is not like this is untested software. The remaining test errors on Windows are also due to MSVC and MinGW-w64 differences, not due to OpenBLAS itself, and those are not relevant on Linux. On Apple, I am not sure which is better. Accelerate is faster in some corner cases (level-1 BLAS with AVX, operations on very small matrices), but it has issues with multiprocessing (GCD's threadpool is not fork-safe). Apart from that, OpenBLAS and Accelerate are about equivalent in performance. I have built OpenBLAS on OSX with clang and gfortran; it works like a charm. So it might be worth considering for binary wheels on OSX as well.
Sturla From charlesr.harris at gmail.com Tue May 26 13:16:52 2015 From: charlesr.harris at gmail.com (Charles R Harris) Date: Tue, 26 May 2015 11:16:52 -0600 Subject: [Numpy-discussion] addition to numpy.i In-Reply-To: References: Message-ID: On Tue, May 26, 2015 at 8:59 AM, Tom Krauss wrote: > Hi folks, > > After some discussion with Bill Spotz I decided to try to submit my new > typemap to numpy.i that allows in-place arrays of an arbitrary number of > dimensions to be passed in as a "flat" array with a single "size". > > To that end I created my first pull request > https://github.com/numpy/numpy/pull/5914 > sorry if I missed any steps or procedures - I noticed only after I did the > commit and pull request that I should have created a new feature branch, > sorry about that. > > Anyway I noticed the pull request initiated a series of tests and one of > them failed. How do I go about debugging and resolving the failure? > Looks like it passed the tests, in fact, I don't think travis tests numpy.i, but I could be wrong about that. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From matthew.brett at gmail.com Tue May 26 14:59:54 2015 From: matthew.brett at gmail.com (Matthew Brett) Date: Tue, 26 May 2015 14:59:54 -0400 Subject: [Numpy-discussion] Strategy for OpenBLAS In-Reply-To: <5564A4F4.8050803@googlemail.com> References: <5564A4F4.8050803@googlemail.com> Message-ID: Hi, On Tue, May 26, 2015 at 12:53 PM, Julian Taylor wrote: > On 05/26/2015 04:56 PM, Matthew Brett wrote: >> Hi, >> >> This morning I was wondering whether we ought to plan to devote some >> resources to collaborating with the OpenBLAS team. >> >> >> >> It is relatively easy to add tests using Python / numpy. We like >> tests. Why don't we propose a collaboration with OpenBLAS where we >> build and test numpy with every / most / some commits of OpenBLAS, and >> try to make it easy for the OpenBLAS team to add tests. Maybe we >> can use and add to the list of machines on which OpenBLAS is tested >> [1]? We Berkeley Pythonistas can certainly add the machines at our >> buildbot farm [2]. Maybe the Julia / R developers would be interested >> to help too? >> > > Technically we only need a single machine with the newest instruction > set available. All other cases could then be tested via a virtual > machine that only exposes specific instruction sets (e.g. qemu which > could technically also emulate stuff the host does not have). > > Concerning test generation there is a huge parameter space that needs > testing due with openblas, at least some of it would need to be > automated/fuzzed. We also need specific preconditioning of memory to > test failure cases openblas had in the past, E.g. filling memory around > the matrices with nans and also somehow filling openblas own temporary > buffers with some signaling values (might require special built openblas > if _MALLOC_PERTURB does not work). > > Maybe it would be feasible to write a hypothesis [0] strategy for some > of the blas stuff to automate the parameter exploration. > > And then we'd need to run everything under valgrind as due to the > assembler implementation of openblas we can't use the faster address > sanitizers gcc and clang now provide. > > [0] https://hypothesis.readthedocs.org/en/latest/ All this sounds extremely useful. What do you think we should do next? How feasible is it to start to set this kind of thing up for our own use, and then offer to integrate with OpenBLAS? 
Is there anyone out there who knows the Julia and / or R community well enough to know if they would be interested to help? What kind of help do you think we need? Money for a machine? Cheers, Matthew From njs at pobox.com Wed May 27 04:13:04 2015 From: njs at pobox.com (Nathaniel Smith) Date: Wed, 27 May 2015 01:13:04 -0700 Subject: [Numpy-discussion] Strategy for OpenBLAS In-Reply-To: <5564A4F4.8050803@googlemail.com> References: <5564A4F4.8050803@googlemail.com> Message-ID: On Tue, May 26, 2015 at 9:53 AM, Julian Taylor wrote: > On 05/26/2015 04:56 PM, Matthew Brett wrote: >> Hi, >> >> This morning I was wondering whether we ought to plan to devote some >> resources to collaborating with the OpenBLAS team. >> >> >> >> It is relatively easy to add tests using Python / numpy. We like >> tests. Why don't we propose a collaboration with OpenBLAS where we >> build and test numpy with every / most / some commits of OpenBLAS, and >> try to make it easy for the OpenBLAS team to add tests. Maybe we >> can use and add to the list of machines on which OpenBLAS is tested >> [1]? We Berkeley Pythonistas can certainly add the machines at our >> buildbot farm [2]. Maybe the Julia / R developers would be interested >> to help too? >> > > Technically we only need a single machine with the newest instruction > set available. All other cases could then be tested via a virtual > machine that only exposes specific instruction sets (e.g. qemu which > could technically also emulate stuff the host does not have). > > Concerning test generation there is a huge parameter space that needs > testing due with openblas, at least some of it would need to be > automated/fuzzed. We also need specific preconditioning of memory to > test failure cases openblas had in the past, E.g. filling memory around > the matrices with nans and also somehow filling openblas own temporary > buffers with some signaling values (might require special built openblas > if _MALLOC_PERTURB does not work). A lot of this stuff is easier if we take a white-box instead of black-box approach -- adding hooks in OpenBLAS to override the CPU-based kernel-autoselection sounds a lot easier than creating unnatural machines in qemu, and similarly for initializing temporary buffers. (I would be really unsurprised if OpenBLAS re-uses temporary buffers across calls instead of doing a free/re-malloc, for example.) > Maybe it would be feasible to write a hypothesis [0] strategy for some > of the blas stuff to automate the parameter exploration. Or if this is daunting, you can get pretty far just sitting down and writing some for loops... I think this is a case where something is a lot better than nothing :-). -n -- Nathaniel J. Smith -- http://vorpus.org From njs at pobox.com Wed May 27 04:25:50 2015 From: njs at pobox.com (Nathaniel Smith) Date: Wed, 27 May 2015 01:25:50 -0700 Subject: [Numpy-discussion] Strategy for OpenBLAS In-Reply-To: References: Message-ID: On Tue, May 26, 2015 at 7:56 AM, Matthew Brett wrote: > Hi, > > This morning I was wondering whether we ought to plan to devote some > resources to collaborating with the OpenBLAS team. Sounds like a great idea to me. Even a bit familiar :-) http://thread.gmane.org/gmane.comp.python.numeric.general/57498 The lead developers of both OpenBLAS and BLIS are currently at UT Austin: http://shpc.ices.utexas.edu/people.html ...and it turns out that this is also where SciPy will be held in July. 
Might be a good opportunity for numpy/scipy folks interested in these matters to sit down in the same room as them and hash out some kind of shared plan of action. (NB: I'm told that BLIS now has full multi-threading support, and that they are working on runtime CPU detection and kernel auto-selection right now.) -n -- Nathaniel J. Smith -- http://vorpus.org From cmkleffner at gmail.com Wed May 27 04:26:07 2015 From: cmkleffner at gmail.com (Carl Kleffner) Date: Wed, 27 May 2015 10:26:07 +0200 Subject: [Numpy-discussion] Strategy for OpenBLAS In-Reply-To: References: <5564A4F4.8050803@googlemail.com> Message-ID: 2015-05-27 10:13 GMT+02:00 Nathaniel Smith : > On Tue, May 26, 2015 at 9:53 AM, Julian Taylor > wrote: > > On 05/26/2015 04:56 PM, Matthew Brett wrote: > >> Hi, > >> > >> This morning I was wondering whether we ought to plan to devote some > >> resources to collaborating with the OpenBLAS team. > >> > >> > >> > >> It is relatively easy to add tests using Python / numpy. We like > >> tests. Why don't we propose a collaboration with OpenBLAS where we > >> build and test numpy with every / most / some commits of OpenBLAS, and > >> try to make it easy for the OpenBLAS team to add tests. Maybe we > >> can use and add to the list of machines on which OpenBLAS is tested > >> [1]? We Berkeley Pythonistas can certainly add the machines at our > >> buildbot farm [2]. Maybe the Julia / R developers would be interested > >> to help too? > >> > > > > Technically we only need a single machine with the newest instruction > > set available. All other cases could then be tested via a virtual > > machine that only exposes specific instruction sets (e.g. qemu which > > could technically also emulate stuff the host does not have). > > > > Concerning test generation there is a huge parameter space that needs > > testing due with openblas, at least some of it would need to be > > automated/fuzzed. We also need specific preconditioning of memory to > > test failure cases openblas had in the past, E.g. filling memory around > > the matrices with nans and also somehow filling openblas own temporary > > buffers with some signaling values (might require special built openblas > > if _MALLOC_PERTURB does not work). > > A lot of this stuff is easier if we take a white-box instead of > black-box approach -- adding hooks in OpenBLAS to override the > CPU-based kernel-autoselection sounds a lot easier than creating > unnatural machines in qemu, and similarly for initializing temporary > buffers. (I would be really unsurprised if OpenBLAS re-uses temporary > buffers across calls instead of doing a free/re-malloc, for example.) > > Manually overwriting the OpenBLAS CPU autoselection can easily be done by setting the OPENBLAS_CORETYPE environment variable, i.e. export OPENBLAS_CORETYPE=Nehalem > > Maybe it would be feasible to write a hypothesis [0] strategy for some > > of the blas stuff to automate the parameter exploration. > > Or if this is daunting, you can get pretty far just sitting down and > writing some for loops... I think this is a case where something is a > lot better than nothing :-). > > -n > > -- > Nathaniel J. Smith -- http://vorpus.org > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... 
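A rough sketch of the property-based testing idea from this thread, using a hypothesis strategy to explore matrix shapes and checking a BLAS-backed np.dot against an einsum reference. The shape bounds, test data, and tolerances are illustrative only:

import numpy as np
from hypothesis import given, strategies as st

@given(st.integers(1, 32), st.integers(1, 32), st.integers(1, 32))
def test_dot_matches_naive_reference(m, k, n):
    a = np.linspace(-1.0, 1.0, m * k).reshape(m, k)
    b = np.linspace(-2.0, 2.0, k * n).reshape(k, n)
    # einsum sums in plain C, independently of the gemm that backs
    # np.dot for 2-d float arrays, so it serves as a slow reference
    np.testing.assert_allclose(np.dot(a, b),
                               np.einsum('ik,kj->ij', a, b),
                               rtol=1e-10, atol=1e-12)

test_dot_matches_naive_reference()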
URL: From cmkleffner at gmail.com Wed May 27 04:41:15 2015 From: cmkleffner at gmail.com (Carl Kleffner) Date: Wed, 27 May 2015 10:41:15 +0200 Subject: [Numpy-discussion] Strategy for OpenBLAS In-Reply-To: References: <5564A4F4.8050803@googlemail.com> Message-ID: 2015-05-27 10:26 GMT+02:00 Carl Kleffner : > > > 2015-05-27 10:13 GMT+02:00 Nathaniel Smith : > >> On Tue, May 26, 2015 at 9:53 AM, Julian Taylor >> wrote: >> > On 05/26/2015 04:56 PM, Matthew Brett wrote: >> >> Hi, >> >> >> >> This morning I was wondering whether we ought to plan to devote some >> >> resources to collaborating with the OpenBLAS team. >> >> >> >> >> >> >> >> It is relatively easy to add tests using Python / numpy. We like >> >> tests. Why don't we propose a collaboration with OpenBLAS where we >> >> build and test numpy with every / most / some commits of OpenBLAS, and >> >> try to make it easy for the OpenBLAS team to add tests. Maybe we >> >> can use and add to the list of machines on which OpenBLAS is tested >> >> [1]? We Berkeley Pythonistas can certainly add the machines at our >> >> buildbot farm [2]. Maybe the Julia / R developers would be interested >> >> to help too? >> >> >> > >> Some benchmark results made by @wernsaar can be found at http://sourceforge.net/p/slurm-roll/code/HEAD/tree/branches/benchmark/. I guess this was made on Linux, so it cannot be directly applied to Windows. See e.g. https://github.com/xianyi/OpenBLAS/issues/532. In general the OpenBLAS development trunk runs smoothly on Windows now. > > Technically we only need a single machine with the newest instruction >> > set available. All other cases could then be tested via a virtual >> > machine that only exposes specific instruction sets (e.g. qemu which >> > could technically also emulate stuff the host does not have). >> > >> > Concerning test generation, there is a huge parameter space that needs >> > testing with openblas; at least some of it would need to be >> > automated/fuzzed. We also need specific preconditioning of memory to >> > test failure cases openblas had in the past, e.g. filling memory around >> > the matrices with nans and also somehow filling openblas's own temporary >> > buffers with some signaling values (might require a specially built openblas >> > if MALLOC_PERTURB_ does not work). >> >> A lot of this stuff is easier if we take a white-box instead of >> black-box approach -- adding hooks in OpenBLAS to override the >> CPU-based kernel-autoselection sounds a lot easier than creating >> unnatural machines in qemu, and similarly for initializing temporary >> buffers. (I would be really unsurprised if OpenBLAS re-uses temporary >> buffers across calls instead of doing a free/re-malloc, for example.) >> >> Manually overwriting the OpenBLAS CPU autoselection can easily be done by > setting the OPENBLAS_CORETYPE environment variable, i.e. > export OPENBLAS_CORETYPE=Nehalem > > >> > Maybe it would be feasible to write a hypothesis [0] strategy for some >> > of the blas stuff to automate the parameter exploration. >> >> Or if this is daunting, you can get pretty far just sitting down and >> writing some for loops... I think this is a case where something is a >> lot better than nothing :-). >> >> -n >> >> -- >> Nathaniel J. Smith -- http://vorpus.org >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> > > -------------- next part -------------- An HTML attachment was scrubbed...
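The OPENBLAS_CORETYPE override mentioned above already allows a crude sweep over the kernel-selection part of the test matrix. A sketch, assuming a numpy linked against OpenBLAS; the core names depend on the targets compiled into the particular build:

import os
import subprocess
import sys

for core in ["Nehalem", "Sandybridge", "Haswell"]:  # illustrative names only
    env = dict(os.environ, OPENBLAS_CORETYPE=core)
    print("running numpy tests with OPENBLAS_CORETYPE=%s" % core)
    subprocess.check_call(
        [sys.executable, "-c",
         "import sys, numpy; sys.exit(not numpy.test().wasSuccessful())"],
        env=env)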
URL: From mailinglists at xgm.de Wed May 27 10:15:55 2015 From: mailinglists at xgm.de (Florian Lindner) Date: Wed, 27 May 2015 16:15:55 +0200 Subject: [Numpy-discussion] MPI: Sendrecv blocks Message-ID: Hello, I have this piece of code:

comm = MPI.COMM_WORLD
temp = np.zeros(blockSize*blockSize)

PrintNB("Communicate A to", get_left_rank())
comm.Sendrecv(sendbuf=np.ascontiguousarray(lA), dest=get_left_rank(), recvbuf=temp)
lA = np.reshape(temp, (blockSize, blockSize))
PrintNB("Finished sending")

lA being a numpy array. Output is:

[0] Communicate A to 1
[2] Communicate A to 3
[3] Communicate A to 2
[1] Communicate A to 0
[1] Finished sending   # here it blocks

[n] is the rank. I have a circular send: 0>1, 1>0 and 2>3, 3>2. I understood Sendrecv to be made specifically for these cases, but still it blocks. What is the problem here? Thanks! Florian From matthew.brett at gmail.com Wed May 27 14:27:40 2015 From: matthew.brett at gmail.com (Matthew Brett) Date: Wed, 27 May 2015 14:27:40 -0400 Subject: [Numpy-discussion] Strategy for OpenBLAS In-Reply-To: References: Message-ID: Hi, On 5/27/15, Nathaniel Smith wrote: > On Tue, May 26, 2015 at 7:56 AM, Matthew Brett > wrote: >> Hi, >> >> This morning I was wondering whether we ought to plan to devote some >> resources to collaborating with the OpenBLAS team. > > Sounds like a great idea to me. Even a bit familiar :-) > http://thread.gmane.org/gmane.comp.python.numeric.general/57498 I had forgotten that thread, thanks for reminding me. I guess my idea arose from my forgotten memory of that thread, but I was thinking that it may be less of a burden, and allow more sharing of work, if we concentrate on testing. For example, if we have a testing repo on the OpenBLAS org, I can imagine Julia / R developers finding bugs, and adding tests to the Python repo, because the machinery to do that is already built and documented (in my perfect world). > The lead developers of both OpenBLAS and BLIS are currently at UT > Austin: http://shpc.ices.utexas.edu/people.html > ...and it turns out that this is also where SciPy will be held in > July. Might be a good opportunity for numpy/scipy folks interested in > these matters to sit down in the same room as them and hash out some > kind of shared plan of action. I'm afraid I'm not going to Scipy this year. Nathaniel - would you consider organizing something like this, with able help from those of us, going and not going, who can contribute some time? > (NB: I'm told that BLIS now has full multi-threading support, and that > they are working on runtime CPU detection and kernel auto-selection > right now.) I can well imagine that BLIS will be a good option at some point, but I'm guessing that it is unlikely we will be able to use BLIS for our default BLAS / LAPACK library on Linux / Windows / Mac in the near future. Is that right? Cheers, Matthew From thomas.p.krauss at gmail.com Wed May 27 14:37:23 2015 From: thomas.p.krauss at gmail.com (Tom Krauss) Date: Wed, 27 May 2015 13:37:23 -0500 Subject: [Numpy-discussion] addition to numpy.i In-Reply-To: References: Message-ID: Thanks for merging! After the merge, it's kind of a moot point now, but there was a failed build at one point. The Python 2.7 test timed out. Then mid-day yesterday the tests got run again and all passed. Not sure what's going on. I'm not seeing a record of the failed build now.
On Tue, May 26, 2015 at 12:16 PM, Charles R Harris < charlesr.harris at gmail.com> wrote: > > > On Tue, May 26, 2015 at 8:59 AM, Tom Krauss > wrote: > >> Hi folks, >> >> After some discussion with Bill Spotz I decided to try to submit my new >> typemap to numpy.i that allows in-place arrays of an arbitrary number of >> dimensions to be passed in as a "flat" array with a single "size". >> >> To that end I created my first pull request >> https://github.com/numpy/numpy/pull/5914 >> sorry if I missed any steps or procedures - I noticed only after I did >> the commit and pull request that I should have created a new feature >> branch, sorry about that. >> >> Anyway I noticed the pull request initiated a series of tests and one of >> them failed. How do I go about debugging and resolving the failure? >> > > Looks like it passed the tests, in fact, I don't think travis tests > numpy.i, but I could be wrong about that. > > Chuck > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jtaylor.debian at googlemail.com Thu May 28 08:46:00 2015 From: jtaylor.debian at googlemail.com (Julian Taylor) Date: Thu, 28 May 2015 14:46:00 +0200 Subject: [Numpy-discussion] Verify your sourceforge windows installer downloads Message-ID: hi, It has been reported that sourceforge has taken over the unofficial gimp windows download page and temporarily bundled the installer with unauthorized adware: https://plus.google.com/+gimp/posts/cxhB1PScFpe As NumPy is also distributing windows installers via sourceforge, I recommend that when you download the files you verify the downloads via the checksums in the README.txt before using them. The README.txt is clearsigned with my gpg key so it should be safe from tampering. Unfortunately as I don't use windows I cannot give any advice on how to do the verification on these platforms.
Maybe someone familiar with available tools can chime in. I have checked the numpy downloads and they still match what I uploaded, but as sourceforge does redirect based on OS and geolocation this may not mean much. Cheers, Julian Taylor _______________________________________________ NumPy-Discussion mailing list NumPy-Discussion at scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion From cournape at gmail.com Thu May 28 09:35:55 2015 From: cournape at gmail.com (David Cournapeau) Date: Thu, 28 May 2015 22:35:55 +0900 Subject: [Numpy-discussion] Verify your sourceforge windows installer downloads In-Reply-To: References: Message-ID: IMO, this really raises the question of whether we still want to use sourceforge at all. At this point I just don't trust the service at all anymore. Could we use some resources (e.g. rackspace?) to host those files? Do we know how much traffic they get, so we can estimate the cost? David On Thu, May 28, 2015 at 9:46 PM, Julian Taylor < jtaylor.debian at googlemail.com> wrote: > hi, > It has been reported that sourceforge has taken over the unofficial gimp > windows download page and temporarily bundled the > installer with unauthorized adware: > https://plus.google.com/+gimp/posts/cxhB1PScFpe > > As NumPy is also distributing windows installers via sourceforge, I > recommend that when you download the files you verify the downloads > via the checksums in the README.txt before using them. The README.txt > is clearsigned with my gpg key so it should be safe from tampering. > Unfortunately as I don't use windows I cannot give any advice on how > to do the verification on these platforms. Maybe someone familiar with > available tools can chime in. > > I have checked the numpy downloads and they still match what I > uploaded, but as sourceforge does redirect based on OS and geolocation > this may not mean much.
>> >> Cheers, >> Julian Taylor >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From sturla.molden at gmail.com Thu May 28 10:07:37 2015 From: sturla.molden at gmail.com (Sturla Molden) Date: Thu, 28 May 2015 14:07:37 +0000 (UTC) Subject: [Numpy-discussion] Verify your sourceforge windows installer downloads References: Message-ID: <31698217454514411.075227sturla.molden-gmail.com@news.gmane.org> David Cournapeau wrote: > IMO, this really begs the question on whether we still want to use > sourceforge at all. At this point I just don't trust the service at all > anymore. Here is their lame excuse: https://sourceforge.net/blog/gimp-win-project-wasnt-hijacked-just-abandoned/ It probably means this: If NumPy installers are moved away from Sourceforge, they will set up a mirror and load the mirrored installers with all sorts of crapware. It is some sort of racket the mob couldn't do better. Sturla From andrew.collette at gmail.com Thu May 28 13:00:08 2015 From: andrew.collette at gmail.com (Andrew Collette) Date: Thu, 28 May 2015 11:00:08 -0600 Subject: [Numpy-discussion] Verify your sourceforge windows installer downloads In-Reply-To: <31698217454514411.075227sturla.molden-gmail.com@news.gmane.org> References: <31698217454514411.075227sturla.molden-gmail.com@news.gmane.org> Message-ID: > Here is their lame excuse: > > https://sourceforge.net/blog/gimp-win-project-wasnt-hijacked-just-abandoned/ > > It probably means this: > > If NumPy installers are moved away from Sourceforge, they will set up a > mirror and load the mirrored installers with all sorts of crapware. It is > some sort of racket the mob couldn't do better. I noticed that like most BSD-licensed software, NumPy's license includes this clause: "Neither the name of the NumPy Developers nor the names of any contributors may be used to endorse or promote products derived from this software without specific prior written permission." There's an argument to be made that SF isn't legally permitted to distribute poisoned installers under the name "NumPy" without permission. I recall a similar dust-up a while ago about "Standard Markdown" using the name "Markdown"; the original author (John Gruber) took action and got them to change the name. In any case I've always been surprised that NumPy is distributed through SourceForge, which has been sketchy for years now. Could it simply be hosted on PyPI? Andrew From cournape at gmail.com Thu May 28 13:05:57 2015 From: cournape at gmail.com (David Cournapeau) Date: Fri, 29 May 2015 02:05:57 +0900 Subject: [Numpy-discussion] Verify your sourceforge windows installer downloads In-Reply-To: References: <31698217454514411.075227sturla.molden-gmail.com@news.gmane.org> Message-ID: On Fri, May 29, 2015 at 2:00 AM, Andrew Collette wrote: > > Here is their lame excuse: > > > > > https://sourceforge.net/blog/gimp-win-project-wasnt-hijacked-just-abandoned/ > > > > It probably means this: > > > > If NumPy installers are moved away from Sourceforge, they will set up a > > mirror and load the mirrored installers with all sorts of crapware. It is > > some sort of racket the mob couldn't do better. 
> > I noticed that like most BSD-licensed software, NumPy's license > includes this clause: > > "Neither the name of the NumPy Developers nor the names of any > contributors may be used to endorse or promote products derived from > this software without specific prior written permission." > > There's an argument to be made that SF isn't legally permitted to > distribute poisoned installers under the name "NumPy" without > permission. I recall a similar dust-up a while ago about "Standard > Markdown" using the name "Markdown"; the original author (John Gruber) > took action and got them to change the name. > > In any case I've always been surprised that NumPy is distributed > through SourceForge, which has been sketchy for years now. Could it > simply be hosted on PyPI? > They don't accept arbitrary binaries like SF does, and some of our installer formats can't be uploaded there. David > > Andrew > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From pav at iki.fi Thu May 28 13:20:27 2015 From: pav at iki.fi (Pauli Virtanen) Date: Thu, 28 May 2015 20:20:27 +0300 Subject: [Numpy-discussion] Verify your sourceforge windows installer downloads In-Reply-To: References: <31698217454514411.075227sturla.molden-gmail.com@news.gmane.org> Message-ID: 28.05.2015, 20:05, David Cournapeau kirjoitti: [clip] >> In any case I've always been surprised that NumPy is distributed >> through SourceForge, which has been sketchy for years now. Could it >> simply be hosted on PyPI? >> > > They don't accept arbitrary binaries like SF does, and some of our > installer formats can't be uploaded there. Is it possible to host them on github? I think there's an option to add release notes and (apparently) to upload binaries if you go to the "Releases" section --- there's one for each tag. Pauli From sturla.molden at gmail.com Thu May 28 13:35:25 2015 From: sturla.molden at gmail.com (Sturla Molden) Date: Thu, 28 May 2015 17:35:25 +0000 (UTC) Subject: [Numpy-discussion] Verify your sourceforge windows installer downloads References: <31698217454514411.075227sturla.molden-gmail.com@news.gmane.org> Message-ID: <1872293233454527203.780948sturla.molden-gmail.com@news.gmane.org> Pauli Virtanen wrote: > Is it possible to host them on github? I think there's an option to add > release notes and (apparently) to upload binaries if you go to the > "Releases" section --- there's one for each tag. And then Sourceforge will put up tainted installers "for the benefit of NumPy users". :) From pav at iki.fi Thu May 28 13:46:29 2015 From: pav at iki.fi (Pauli Virtanen) Date: Thu, 28 May 2015 20:46:29 +0300 Subject: [Numpy-discussion] Verify your sourceforge windows installer downloads In-Reply-To: <1872293233454527203.780948sturla.molden-gmail.com@news.gmane.org> References: <31698217454514411.075227sturla.molden-gmail.com@news.gmane.org> <1872293233454527203.780948sturla.molden-gmail.com@news.gmane.org> Message-ID: 28.05.2015, 20:35, Sturla Molden kirjoitti: > Pauli Virtanen wrote: > >> Is it possible to host them on github? I think there's an option to add >> release notes and (apparently) to upload binaries if you go to the >> "Releases" section --- there's one for each tag. > > And then Sourceforge will put up tainted installers "for the benefit of > NumPy users". :) Well, let them. They may already be tainted, who knows. 
It's phishing and malware distribution at that point, and there are some ways to deal with that (safe browsing, AV etc). From jtaylor.debian at googlemail.com Thu May 28 14:52:01 2015 From: jtaylor.debian at googlemail.com (Julian Taylor) Date: Thu, 28 May 2015 20:52:01 +0200 Subject: [Numpy-discussion] Verify your sourceforge windows installer downloads In-Reply-To: References: <31698217454514411.075227sturla.molden-gmail.com@news.gmane.org> <1872293233454527203.780948sturla.molden-gmail.com@news.gmane.org> Message-ID: <556763D1.9060806@googlemail.com> On 28.05.2015 19:46, Pauli Virtanen wrote: > 28.05.2015, 20:35, Sturla Molden kirjoitti: >> Pauli Virtanen wrote: >> >>> Is it possible to host them on github? I think there's an option to add >>> release notes and (apparently) to upload binaries if you go to the >>> "Releases" section --- there's one for each tag. >> >> And then Sourceforge will put up tainted installers "for the benefit of >> NumPy users". :) > > Well, let them. They may already be tainted, who knows. It's phishing > and malware distribution at that point, and there are some ways to deal > with that (safe browsing, AV etc). > > there is no guarantee that github will not do this stuff in the future too, and PyPI or self-hosting do not necessarily help either, as those resources can also be compromised. The main thing that should be learned from this and the many similar incidents in the past is that binaries from the internet need to be verified not to have been modified from their original state; otherwise they cannot be trusted. With my mail I wanted to bring to your attention that both numpy (since 1.7.2) and scipy (since 0.14.1) allow users to do so via the signed README.txt containing checksums. From pav at iki.fi Thu May 28 15:05:29 2015 From: pav at iki.fi (Pauli Virtanen) Date: Thu, 28 May 2015 22:05:29 +0300 Subject: [Numpy-discussion] Verify your sourceforge windows installer downloads In-Reply-To: <556763D1.9060806@googlemail.com> References: <31698217454514411.075227sturla.molden-gmail.com@news.gmane.org> <1872293233454527203.780948sturla.molden-gmail.com@news.gmane.org> <556763D1.9060806@googlemail.com> Message-ID: 28.05.2015, 21:52, Julian Taylor kirjoitti: > there is no guarantee that github will not do this stuff in the future too, > and PyPI or self-hosting do not necessarily help either, as those resources > can also be compromised. > The main thing that should be learned from this and the many similar > incidents in the past is that binaries from the internet need to be > verified not to have been modified from their original state; otherwise > they cannot be trusted. Indeed, but on the other hand, there's no reason for us to continue cooperating with shady partners, especially when there are easy alternatives. We can just quietly change the main binary distribution channel and be done with it. From toddrjen at gmail.com Fri May 29 01:43:34 2015 From: toddrjen at gmail.com (Todd) Date: Fri, 29 May 2015 07:43:34 +0200 Subject: [Numpy-discussion] Verify your sourceforge windows installer downloads In-Reply-To: References: <31698217454514411.075227sturla.molden-gmail.com@news.gmane.org> Message-ID: On May 28, 2015 7:06 PM, "David Cournapeau" wrote: > On Fri, May 29, 2015 at 2:00 AM, Andrew Collette < andrew.collette at gmail.com> wrote: >> >> In any case I've always been surprised that NumPy is distributed >> through SourceForge, which has been sketchy for years now. Could it >> simply be hosted on PyPI?
> > > They don't accept arbitrary binaries like SF does, and some of our > installer formats can't be uploaded there. > > > > David Is that something that could be fixed? Has anyone asked the pypi maintainers whether they could change those rules, either in general or by granting exceptions on a case-by-case basis to projects that have proven track records and importance? It would seem to me that if the rules on pypi are forcing critical projects like numpy to host elsewhere, then the rules are flawed and are preventing pypi from serving its intended purpose. -------------- next part -------------- An HTML attachment was scrubbed... URL: From cimrman3 at ntc.zcu.cz Fri May 29 11:24:22 2015 From: cimrman3 at ntc.zcu.cz (Robert Cimrman) Date: Fri, 29 May 2015 17:24:22 +0200 Subject: [Numpy-discussion] ANN: SfePy 2015.2 Message-ID: <556884A6.1010600@ntc.zcu.cz> I am pleased to announce release 2015.2 of SfePy. Description ----------- SfePy (simple finite elements in Python) is software for solving systems of coupled partial differential equations by the finite element method or by isogeometric analysis (preliminary support). It is distributed under the new BSD license. Home page: http://sfepy.org Mailing list: http://groups.google.com/group/sfepy-devel Git (source) repository, issue tracker, wiki: http://github.com/sfepy Highlights of this release -------------------------- - major code simplification (removed element groups) - time stepping solvers updated for interactive use - improved finding of reference element coordinates of physical points - reorganized examples - reorganized installation on POSIX systems (sfepy-run script) For full release notes see http://docs.sfepy.org/doc/release_notes.html#id1 (rather long and technical). Best regards, Robert Cimrman and Contributors (*) (*) Contributors to this release (alphabetical order): Lubos Kejzlar, Vladimir Lukes, Anton Gladky, Matyas Novak From ben.root at ou.edu Fri May 29 13:28:05 2015 From: ben.root at ou.edu (Benjamin Root) Date: Fri, 29 May 2015 13:28:05 -0400 Subject: [Numpy-discussion] Verify your sourceforge windows installer downloads In-Reply-To: References: <31698217454514411.075227sturla.molden-gmail.com@news.gmane.org> Message-ID: Speaking from the matplotlib project, our binaries are substantial due to our suite of test images. PyPI worked with us on relaxing size constraints. Also, I think the new cheese shop/warehouse server they are using scales better, so size is not nearly the same concern as before. Ben Root On May 29, 2015 1:43 AM, "Todd" wrote: > On May 28, 2015 7:06 PM, "David Cournapeau" wrote: > > On Fri, May 29, 2015 at 2:00 AM, Andrew Collette < > andrew.collette at gmail.com> wrote: > >> > >> In any case I've always been surprised that NumPy is distributed > >> through SourceForge, which has been sketchy for years now. Could it > >> simply be hosted on PyPI? > > > > > > They don't accept arbitrary binaries like SF does, and some of our > installer formats can't be uploaded there. > > > > David > > Is that something that could be fixed? Has anyone asked the pypi > maintainers whether they could change those rules, either in general or by > granting exceptions on a case-by-case basis to projects that have proven > track records and importance? > > It would seem to me that if the rules on pypi are forcing critical > projects like numpy to host elsewhere, then the rules are flawed and are > preventing pypi from serving its intended purpose.
> > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From saketkc at gmail.com Fri May 29 13:52:26 2015 From: saketkc at gmail.com (Saket Choudhary) Date: Fri, 29 May 2015 10:52:26 -0700 Subject: [Numpy-discussion] Verify your sourceforge windows installer downloads In-Reply-To: References: <31698217454514411.075227sturla.molden-gmail.com@news.gmane.org> Message-ID: On 28 May 2015 at 10:05, David Cournapeau wrote: > > > On Fri, May 29, 2015 at 2:00 AM, Andrew Collette > wrote: >> >> > Here is their lame excuse: >> > >> > >> > https://sourceforge.net/blog/gimp-win-project-wasnt-hijacked-just-abandoned/ >> > >> > It probably means this: >> > >> > If NumPy installers are moved away from Sourceforge, they will set up a >> > mirror and load the mirrored installers with all sorts of crapware. It >> > is >> > some sort of racket the mob couldn't do better. >> >> I noticed that like most BSD-licensed software, NumPy's license >> includes this clause: >> >> "Neither the name of the NumPy Developers nor the names of any >> contributors may be used to endorse or promote products derived from >> this software without specific prior written permission." >> >> There's an argument to be made that SF isn't legally permitted to >> distribute poisoned installers under the name "NumPy" without >> permission. I recall a similar dust-up a while ago about "Standard >> Markdown" using the name "Markdown"; the original author (John Gruber) >> took action and got them to change the name. >> >> In any case I've always been surprised that NumPy is distributed >> through SourceForge, which has been sketchy for years now. Could it >> simply be hosted on PyPI? > > > They don't accept arbitrary binaries like SF does, and some of our installer > formats can't be uploaded there. > Bintray [1] has been providing a free service for hosting 'bottles'(compiled binaries) for the Homebrew project [2]. Probably an option to look at. [1] https://bintray.com/ [2] http://brew.sh/ > David > >> >> >> Andrew >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > From antony.lee at berkeley.edu Fri May 29 17:06:39 2015 From: antony.lee at berkeley.edu (Antony Lee) Date: Fri, 29 May 2015 14:06:39 -0700 Subject: [Numpy-discussion] Backwards-incompatible improvements to numpy.random.RandomState In-Reply-To: References: Message-ID: > > A proof-of-concept implementation, still missing tests, is tracked as > #5911. It includes the patch proposed in #5158 as an example of how to > include an improved version of random.choice. > Tests are in now (whether we should bundle in pickles of old versions to make sure they are still unpickled correctly and outputs of old random streams to make sure they are still reproduced is a good question, though). Comments welcome. Antony -------------- next part -------------- An HTML attachment was scrubbed... 
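For concreteness, the kind of pinned-stream test at stake in the RandomState thread might look as follows. This is a sketch: the literal 0.5488135039273248 is the well-known first MT19937 draw for seed 0, and the version=... selector from the pull request is proposed API, not part of any released numpy; it would decide which stream such pinned values belong to:

import numpy as np

# re-seeding must reproduce the identical stream within one numpy version
rs = np.random.RandomState(12345)
expected = rs.random_sample(4)
assert np.array_equal(np.random.RandomState(12345).random_sample(4), expected)

# cross-version guarantees instead pin the values themselves
assert abs(np.random.RandomState(0).random_sample() - 0.5488135039273248) < 1e-15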
URL: From ralf.gommers at gmail.com Sun May 31 21:43:11 2015 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Mon, 1 Jun 2015 03:43:11 +0200 Subject: [Numpy-discussion] Verify your sourceforge windows installer downloads In-Reply-To: References: <31698217454514411.075227sturla.molden-gmail.com@news.gmane.org> Message-ID: On Fri, May 29, 2015 at 7:28 PM, Benjamin Root wrote: > Speaking from the matplotlib project, our binaries are substantial due to > our suite of test images. PyPI worked with us on relaxing size constraints. > Also, I think the new cheese shop/warehouse server they are using scales > better, so size is not nearly the same concern as before. > > Ben Root > On May 29, 2015 1:43 AM, "Todd" wrote: > >> On May 28, 2015 7:06 PM, "David Cournapeau" wrote: >> > On Fri, May 29, 2015 at 2:00 AM, Andrew Collette < >> andrew.collette at gmail.com> wrote: >> >> >> >> In any case I've always been surprised that NumPy is distributed >> >> through SourceForge, which has been sketchy for years now. Could it >> >> simply be hosted on PyPI? >> > >> > >> > They don't accept arbitrary binaries like SF does, and some of our >> installer formats can't be uploaded there. >> > >> > David >> >> Is that something that could be fixed? >> > For the current .exe installers that cannot be fixed, because neither pip nor easy_install can handle those. We actually have to ensure that we don't link from pypi directly to the sourceforge folder with the latest release, because then easy_install will follow the link, download the .exe and fail. Dmgs were another unsupported format, but we'll stop using those. So if/when it's SSE2 .exe installers only (made with bdist_wininst and no NSIS) then PyPI works. Size constraints are not an issue for NumPy, I think. Ralf Has anyone asked the pypi maintainers whether they could change those >> rules, either in general or by granting exceptions on a case-by-case basis >> to projects that have proven track records and importance? >> >> It would seem to me that if the rules on pypi are forcing critical >> projects like numpy to host elsewhere, then the rules are flawed and are >> preventing pypi from serving its intended purpose. >> >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Sat May 30 18:23:47 2015 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 30 May 2015 16:23:47 -0600 Subject: [Numpy-discussion] matmul needs some clarification. Message-ID: Hi All, The problem arises when multiplying a stack of matrices by a vector. PEP 465 defines this as appending a '1' to the dimensions of the vector and doing the defined stacked matrix multiply, then removing the last dimension from the result. Note that in the middle step we have a stack of matrices, and after removing the last dimension we will still have a stack of matrices. What we want is a stack of vectors, but we can't have those with our conventions. This makes the result somewhat unexpected. How should we resolve this? Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL:
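To make the shapes in that question concrete, here is a sketch of the PEP 465 rule as later implemented in np.matmul (numpy 1.10); np.matmul did not exist in any released numpy at the time of this thread, so this encodes the behaviour the PEP specifies:

import numpy as np

A = np.ones((2, 3, 4, 4))  # a 2x3 stack of 4x4 matrices
v = np.ones(4)             # a plain 1-d vector

# v is promoted to (4, 1), the broadcast matrix multiply gives
# (2, 3, 4, 1), and the appended dimension is removed again:
print(np.matmul(A, v).shape)               # -> (2, 3, 4)

# nothing in that result distinguishes a 2x3 stack of length-4 vectors
# from a 2-stack of 3x4 matrices; an explicit column-vector stack
# keeps the trailing 1 instead:
print(np.matmul(A, v.reshape(4, 1)).shape) # -> (2, 3, 4, 1)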