From sebastian at sipsolutions.net  Tue Oct  1 13:30:05 2019
From: sebastian at sipsolutions.net (Sebastian Berg)
Date: Tue, 01 Oct 2019 10:30:05 -0700
Subject: [Numpy-discussion] NumPy Community Meeting Wednesday, Oct. 2
Message-ID: <4762d6fe1bdf419d3f77956cac72fa7f65b935dd.camel@sipsolutions.net>

Hi all,

There will be a NumPy Community meeting Wednesday, October 2 at 11 am
Pacific Time. Everyone is invited to join in and edit the
work-in-progress meeting topics and notes:

https://hackmd.io/76o-IxCjQX2mOXO_wwkcpg?both

Best wishes

Sebastian

From sebastian at sipsolutions.net  Thu Oct  3 17:54:19 2019
From: sebastian at sipsolutions.net (Sebastian Berg)
Date: Thu, 03 Oct 2019 14:54:19 -0700
Subject: [Numpy-discussion] Accepting NEP 29 — Recommend Python and Numpy
 version support as a community policy standard
Message-ID:

Hi all,

we propose formally accepting NumPy enhancement proposal 29,
"Recommend Python and Numpy version support as a community policy
standard", available at:

https://numpy.org/neps/nep-0029-deprecation_policy.html

If there are no objections within a week it may be accepted. This
proposal is a recommendation to the larger ecosystem and thus should
receive attention and acceptance from a wide audience. However, let's
try to keep discussions on the NumPy mailing list.

The most important points from the Abstract and Implementation
sections are:

"This NEP recommends that all projects across the Scientific Python
ecosystem adopt a common “time window-based” policy for support of
Python and NumPy versions. Standardizing a recommendation for project
support of minimum Python and NumPy versions will improve downstream
project planning. …"

and:

"We suggest that all projects adopt the following language into their
development guidelines:

This project supports:

* All minor versions of Python released 42 months prior to the
  project, and at minimum the two latest minor versions.
* All minor versions of numpy released in the 24 months prior to the
  project, and at minimum the last thee minor versions."

For the full text, please refer to the link above.

Cheers,

Sebastian

From jni at fastmail.com  Thu Oct  3 21:01:02 2019
From: jni at fastmail.com (Juan Nunez-Iglesias)
Date: Thu, 03 Oct 2019 20:01:02 -0500
Subject: [Numpy-discussion] Accepting NEP 29 — Recommend Python and Numpy
 version support as a community policy standard
In-Reply-To:
References:
Message-ID: <7ce1db7a-d5dd-41ff-9f75-c90199599e01@www.fastmail.com>

We're behind this at scikit-image! Thank you to all who worked on this
proposal!

Minor typo: "at minimum the last THREE minor versions"

Juan.
On Thu, 3 Oct 2019, at 4:54 PM, Sebastian Berg wrote:
> "We suggest that all projects adopt the following language into their
> development guidelines:
>
> This project supports:
>
> * All minor versions of Python released 42 months prior to the
>   project, and at minimum the two latest minor versions.
> * All minor versions of numpy released in the 24 months prior to the
>   project, and at minimum the last thee minor versions."

From s.molnar at sbcglobal.net  Fri Oct  4 13:31:34 2019
From: s.molnar at sbcglobal.net (Stephen P. Molnar)
Date: Fri, 4 Oct 2019 13:31:34 -0400
Subject: [Numpy-discussion] Np.genfromtxt Problem
Message-ID: <5D9781F6.2000102@sbcglobal.net>

I have a snippet of code:

#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""

Created on Tue Sep 24 07:51:11 2019

"""
import numpy as np

files = []

data = np.genfromtxt(files, usecols=(3), dtype=None, skip_header=8,
                     skip_footer=1, encoding=None)

print(data)

If files is a single file, the code generates the data that I want.
However, I have a list of files that I want to process. According to
the numpy.genfromtxt documentation, fname can be a "File, filename,
list, or generator to read." If I use

[13-7a_apo-1acl.RMSD 13-7_apo-1acl.RMSD 14-7_apo-1acl.RMSD
 15-7_apo-1acl.RMSD 17-7_apo-1acl.RMSD]

I get the error:

runfile('/home/comp/Apps/Models/1-PhosphorusLigands/CombinedLigands/MOL/Docking/Results/RMSDTable/Test/DeltaGTable_s.py',
wdir='/home/comp/Apps/Models/1-PhosphorusLigands/CombinedLigands/MOL/Docking/Results/RMSDTable/Test',
current_namespace=True)
Traceback (most recent call last):

  File "/home/comp/Apps/Models/1-PhosphorusLigands/CombinedLigands/MOL/Docking/Results/RMSDTable/Test/DeltaGTable_s.py",
line 12, in <module>
    data = np.genfromtxt(files, usecols=(3), dtype=None, skip_header=8,
skip_footer=1, encoding=None)

  File "/home/comp/Apps/Miniconda3/lib/python3.7/site-packages/numpy/lib/npyio.py",
line 1762, in genfromtxt
    next(fhd)

StopIteration

I have tried every combination of search terms that I can think of in
order to find an example of how to make this work, without success.

How can I make this work?

Thanks in advance.

-- 
Stephen P. Molnar, Ph.D.
www.molecular-modeling.net
614.312.7528 (c)
Skype: smolnar1

From deak.andris at gmail.com  Fri Oct  4 18:15:37 2019
From: deak.andris at gmail.com (Andras Deak)
Date: Sat, 5 Oct 2019 00:15:37 +0200
Subject: [Numpy-discussion] Np.genfromtxt Problem
In-Reply-To: <5D9781F6.2000102@sbcglobal.net>
References: <5D9781F6.2000102@sbcglobal.net>
Message-ID:

On Fri, Oct 4, 2019 at 7:31 PM Stephen P. Molnar wrote:
>
> data = np.genfromtxt(files, usecols=(3), dtype=None, skip_header=8,
>                      skip_footer=1, encoding=None)
>
> If files is a single file, the code generates the data that I want.
> However, I have a list of files that I want to process. According to
> the numpy.genfromtxt documentation, fname can be a "File, filename,
> list, or generator to read."

Hi Stephen,

As far as I know genfromtxt is designed to read the contents of a
single file. Consider this quote from the docs for the first
parameter:
"The strings in a list or produced by a generator are treated as lines."
And the general description of the function says
"Load data from a text file, with missing values handled as specified."
("a text file", singular)
So if I understand correctly, the list case is there so that you can
pass `f.readlines()` or equivalent into genfromtxt. From a
higher-level standpoint, how would reading multiple files behave if
the files have different structure, and what type and shape should the
function return in that case?
If one file can be read just fine, then I suggest looping over them to
read each, one after the other. You can then tell python what to do
with each returned array, so it doesn't have to guess.
Regards,

Andrés
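A minimal sketch of the loop suggested above, assuming every file
shares the same layout (the genfromtxt arguments are copied from the
original post; the file list is illustrative):

    import numpy as np

    files = ["13-7a_apo-1acl.RMSD", "13-7_apo-1acl.RMSD",
             "14-7_apo-1acl.RMSD"]

    # One genfromtxt call per file; each call reads a single file.
    arrays = [np.genfromtxt(f, usecols=(3), dtype=None, skip_header=8,
                            skip_footer=1, encoding=None)
              for f in files]

    # Combine as appropriate, e.g. into one long 1-D array:
    data = np.concatenate(arrays)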
From derek at astro.physik.uni-goettingen.de  Fri Oct  4 18:38:16 2019
From: derek at astro.physik.uni-goettingen.de (Derek Homeier)
Date: Sat, 5 Oct 2019 00:38:16 +0200
Subject: [Numpy-discussion] Np.genfromtxt Problem
In-Reply-To:
References: <5D9781F6.2000102@sbcglobal.net>
Message-ID: <6E13B4DE-E155-47E6-B6C6-3A052AD8B66E@astro.physik.uni-goettingen.de>

On 5 Oct 2019, at 12:15 am, Andras Deak wrote:
>
> As far as I know genfromtxt is designed to read the contents of a
> single file. [...]
> If one file can be read just fine, then I suggest looping over them to
> read each, one after the other. You can then tell python what to do
> with each returned array, so it doesn't have to guess.

The above is correct in that genfromtxt expects a single file or
file-like object. That said, assuming all input files have compatible
format (i.e. identical no. of columns with matching dtypes), which
really is the only case that would make sense to pass to genfromtxt,
you could try creating a pipe to concatenate all input files into a
single object. Something like this might work:

fobj = os.popen('cat 1[3457]-7a_apo-1acl.RMSD')
data = np.genfromtxt(fobj, usecols=(3), dtype=None, ...)

However the multiple headers and footers in your concatenated file may
cause trouble here - maybe you find a way to remove them in the popen
call with some '[e]grep -v' artistry. Depending on this, the loop over
input files might be the easier solution.

HTH,
					Derek
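For reference, a pure-Python variant of the pipe idea that also strips
each file's header and footer lines, so nothing shell-specific is
needed (a sketch only, assuming 8 header lines and 1 footer line per
file as in the original call; file names are illustrative):

    from itertools import chain
    import numpy as np

    def data_lines(fname, skip_header=8, skip_footer=1):
        # Return only the data lines of one file.
        with open(fname) as f:
            lines = f.readlines()
        return lines[skip_header:len(lines) - skip_footer]

    files = ["13-7a_apo-1acl.RMSD", "14-7_apo-1acl.RMSD"]
    merged = chain.from_iterable(data_lines(f) for f in files)

    # genfromtxt treats the strings of a generator as lines.
    data = np.genfromtxt(merged, usecols=(3), dtype=None, encoding=None)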
From stefanv at berkeley.edu  Fri Oct  4 20:19:46 2019
From: stefanv at berkeley.edu (Stefan van der Walt)
Date: Fri, 04 Oct 2019 17:19:46 -0700
Subject: [Numpy-discussion] Np.genfromtxt Problem
In-Reply-To: <5D9781F6.2000102@sbcglobal.net>
References: <5D9781F6.2000102@sbcglobal.net>
Message-ID: <6b85b8cc-d64a-41f2-ab49-f7b72ab04f60@www.fastmail.com>

On Fri, Oct 4, 2019, at 10:31, Stephen P. Molnar wrote:
> data = np.genfromtxt(files, usecols=(3), dtype=None, skip_header=8,
>                      skip_footer=1, encoding=None)

This seems like a good use case for `dask.dataframe.read_csv` [0].

Stéfan

[0] https://examples.dask.org/dataframes/01-data-access.html#Read-CSV-files

From s.molnar at sbcglobal.net  Sat Oct  5 04:16:12 2019
From: s.molnar at sbcglobal.net (Stephen P. Molnar)
Date: Sat, 5 Oct 2019 04:16:12 -0400
Subject: [Numpy-discussion] Np.genfromtxt Problem
In-Reply-To: <6b85b8cc-d64a-41f2-ab49-f7b72ab04f60@www.fastmail.com>
References: <5D9781F6.2000102@sbcglobal.net>
 <6b85b8cc-d64a-41f2-ab49-f7b72ab04f60@www.fastmail.com>
Message-ID: <5D98514C.1020609@sbcglobal.net>

On 10/04/2019 08:19 PM, Stefan van der Walt wrote:
> This seems like a good use case for `dask.dataframe.read_csv` [0].
>
> Stéfan
>
> [0] https://examples.dask.org/dataframes/01-data-access.html#Read-CSV-files

I appreciate the responses that I've received. I feel that I must
apologize for one important fact that, it would appear, I failed to
mention - all of the files that I wish to process are formatted
identically.

-- 
Stephen P. Molnar, Ph.D.
www.molecular-modeling.net
614.312.7528 (c)
Skype: smolnar1

From charlesr.harris at gmail.com  Mon Oct  7 21:08:39 2019
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Mon, 7 Oct 2019 19:08:39 -0600
Subject: [Numpy-discussion] Python 3.8 numpy wheels
Message-ID:

Hi All,

Thanks to the work of Matti Picus and Matthew Brett, manylinux1 numpy
wheels are now available for testing.

Chuck

From ralf.gommers at gmail.com  Tue Oct  8 05:21:35 2019
From: ralf.gommers at gmail.com (Ralf Gommers)
Date: Tue, 8 Oct 2019 02:21:35 -0700
Subject: [Numpy-discussion] Python 3.8 numpy wheels
In-Reply-To:
References:
Message-ID:

On Mon, Oct 7, 2019 at 6:09 PM Charles R Harris wrote:
> Thanks to the work of Matti Picus and Matthew Brett, manylinux1 numpy
> wheels are now available for testing.

Thanks so much for working on that Matti, Matthew and Chuck!

Any pointer on where the wheels are hosted? Don't see them on PyPI, nor
on http://wheels.scipy.org/.

Cheers,
Ralf

From charlesr.harris at gmail.com  Tue Oct  8 08:25:46 2019
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Tue, 8 Oct 2019 06:25:46 -0600
Subject: [Numpy-discussion] Python 3.8 numpy wheels
In-Reply-To:
References:
Message-ID:

On Tue, Oct 8, 2019 at 3:22 AM Ralf Gommers wrote:
> Any pointer on where the wheels are hosted? Don't see them on PyPI,
> nor on http://wheels.scipy.org/.

They are with the pre-release wheels.

Chuck

From s.molnar at sbcglobal.net  Tue Oct  8 09:16:50 2019
From: s.molnar at sbcglobal.net (Stephen P. Molnar)
Date: Tue, 8 Oct 2019 09:16:50 -0400
Subject: [Numpy-discussion] Problem with np.savetxt
Message-ID: <5D9C8C42.8010006@sbcglobal.net>

I am embarrassed to be asking this question, but I have exhausted
Google at this point.

I have a number of identically formatted text files from which I want
to extract data, as an example (hopefully, putting these in as quotes
will preserve the format):

> =======================================================================
>      PSOVina version 2.0
>      Giotto H. K. Tai & Shirley W. I. Siu
>
>      Computational Biology and Bioinformatics Lab
>      University of Macau
>
>      Visit http://cbbio.cis.umac.mo for more information.
>
>      PSOVina was developed based on the framework of AutoDock Vina.
>
>      For more information about Vina, please visit http://vina.scripps.edu.
>
> =======================================================================
>
> Output will be 13-7_out.pdbqt
> Reading input ... done.
> Setting up the scoring function ... done.
> Analyzing the binding site ... done.
> Using random seed: 1828390527
> Performing search ... done.
>
> Refining results ... done.
>
> mode |   affinity | dist from best mode
>      | (kcal/mol) | rmsd l.b.| rmsd u.b.
> -----+------------+----------+----------
>    1    -8.862004149      0.000      0.000
>    2    -8.403522829      2.992      6.553
>    3    -8.401384636      2.707      5.220
>    4    -7.886402037      4.907      6.862
>    5    -7.845519031      3.233      5.915
>    6    -7.837434227      3.954      5.641
>    7    -7.834584887      3.188      7.294
>    8    -7.694395765      3.746      7.553
>    9    -7.691211177      3.536      5.745
>   10    -7.670759445      3.698      7.587
>   11    -7.661882758      4.882      7.044
>   12    -7.636280303      2.347      3.284
>   13    -7.635788052      3.511      6.250
>   14    -7.611175249      2.427      3.449
>   15    -7.586368357      2.142      2.864
>   16    -7.531307666      2.976      4.980
>   17    -7.520501084      3.085      5.775
>   18    -7.512906514      4.220      7.672
>   19    -7.307403528      3.240      4.354
>   20    -7.256063348      3.694      7.252
> Writing output ... done.

At this point, my python script consists of only the following:

#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""

Created on Tue Sep 24 07:51:11 2019

"""
import numpy as np

data = []

data = np.genfromtxt("13-7.log", usecols=(1), dtype=None,
                     skip_header=27, skip_footer=1, encoding=None)

print(data)

np.savetxt('13-7', [data], fmt='%15.9f', header='13-7')

The problem lies in the np.savetxt line; on execution I get:

runfile('/home/comp/Apps/Models/1-PhosphorusLigands/CombinedLigands/MOL/Docking/VINA20/SmallSet/DeltaGTable_V_s.py',
wdir='/home/comp/Apps/Models/1-PhosphorusLigands/CombinedLigands/MOL/Docking/VINA20/SmallSet',
current_namespace=True)
['-8.839713733' '-8.743377250' '-8.151051167' '-8.090452911'
 '-7.967494477' '-7.854890056' '-7.757417879' '-7.741557490'
 '-7.643885488' '-7.611595767' '-7.507605524' '-7.413920814'
 '-7.389408331' '-7.384446364' '-7.374206276' '-7.368808179'
 '-7.346641418' '-7.325037898' '-7.309614787' '-7.113209147']
Traceback (most recent call last):

  File "/home/comp/Apps/Models/1-PhosphorusLigands/CombinedLigands/MOL/Docking/VINA20/SmallSet/DeltaGTable_V_s.py",
line 16, in <module>
    np.savetxt('13-7', [data], fmt='%16.9f', header='13-7')

  File "<__array_function__ internals>", line 6, in savetxt

  File "/home/comp/Apps/Miniconda3/lib/python3.7/site-packages/numpy/lib/npyio.py",
line 1438, in savetxt
    % (str(X.dtype), format))

TypeError: Mismatch between array dtype ('<U12') and format specifier
('%16.9f %16.9f %16.9f %16.9f %16.9f %16.9f %16.9f %16.9f %16.9f
%16.9f %16.9f %16.9f %16.9f %16.9f %16.9f %16.9f %16.9f %16.9f %16.9f
%16.9f')

The data is in the data file, but the only entry in '13-7', the saved
file, is the label. Obviously, the error is in the format argument.

Help will be much appreciated.

Thanks in advance.

-- 
Stephen P. Molnar, Ph.D.
www.molecular-modeling.net
614.312.7528 (c)
Skype: smolnar1
From deak.andris at gmail.com  Tue Oct  8 09:42:49 2019
From: deak.andris at gmail.com (Andras Deak)
Date: Tue, 8 Oct 2019 15:42:49 +0200
Subject: [Numpy-discussion] Problem with np.savetxt
In-Reply-To: <5D9C8C42.8010006@sbcglobal.net>
References: <5D9C8C42.8010006@sbcglobal.net>
Message-ID:

On Tue, Oct 8, 2019 at 3:17 PM Stephen P. Molnar wrote:
>
> [...]
>
> data = np.genfromtxt("13-7.log", usecols=(1), dtype=None,
>                      skip_header=27, skip_footer=1, encoding=None)
>
> print(data)
>
> np.savetxt('13-7', [data], fmt='%15.9f', header='13-7')
>
> The problem lies in the np.savetxt line; on execution I get:
>
> ['-8.839713733' '-8.743377250' '-8.151051167' '-8.090452911'
>  '-7.967494477' '-7.854890056' '-7.757417879' '-7.741557490'
>  '-7.643885488' '-7.611595767' '-7.507605524' '-7.413920814'
>  '-7.389408331' '-7.384446364' '-7.374206276' '-7.368808179'
>  '-7.346641418' '-7.325037898' '-7.309614787' '-7.113209147']
>
> TypeError: Mismatch between array dtype ('<U12') and format specifier
> ('%16.9f %16.9f [...] %16.9f')
>
> The data is in the data file, but the only entry in '13-7', the saved
> file, is the label.
> Obviously, the error is in the format argument.
Hi,

One problem is the format: the error is telling you that you have
strings in your array (compare the `'<U12'` dtype in the error message
with your `print(data)` output, which shows strings inside), whereas
%16.9f can only be used to format floats (f for float). You would
first have to convert your array of strings to an array of numbers. I
don't usually use genfromtxt so I'm not sure how you can make it
return floats in the first place, but I suspect `dtype=None` in the
call to genfromtxt might be responsible. In any case, making it return
numbers should be the easier case.
The second problem is that you should make sure you mean `[data]` in
the call to savetxt. As it is now, this would give you a 2d array of
shape (1, 20), and the output would correspondingly contain a single
row of 20 values (hence the 20 instances of '%16.9f' in the error
message). In case you meant to print one value per row in a single
column, you should drop the brackets around `data`:

np.savetxt('13-7', data, fmt='%16.9f', header='13-7')

And just a personal note, but I'd find an output file named '13-7' to
be a bit surprising. Perhaps some extension or prefix would help
organize these files?
Regards,

Andrés
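Putting both fixes together, a minimal sketch (dtype is left at its
float default so genfromtxt returns numbers, and the brackets are
dropped so savetxt writes one value per line; the '.dat' extension is
only illustrative, following the naming suggestion above):

    import numpy as np

    # Without dtype=None the result is a float array, not strings.
    data = np.genfromtxt("13-7.log", usecols=(1), skip_header=27,
                         skip_footer=1, encoding=None)

    # data (not [data]) writes a single column, one value per row.
    np.savetxt("13-7.dat", data, fmt="%15.9f", header="13-7")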
From deak.andris at gmail.com  Tue Oct  8 09:44:30 2019
From: deak.andris at gmail.com (Andras Deak)
Date: Tue, 8 Oct 2019 15:44:30 +0200
Subject: [Numpy-discussion] Problem with np.savetxt
In-Reply-To:
References: <5D9C8C42.8010006@sbcglobal.net>
Message-ID:

PS. if you just want to specify the width of the fields you wouldn't
have to convert anything, because you can specify the size and
justification of a %s format. But arguably having float data as floats
is more natural anyway.

On Tue, Oct 8, 2019 at 3:42 PM Andras Deak wrote:
> One problem is the format: the error is telling you that you have
> strings in your array [...], whereas %16.9f can only be used to
> format floats (f for float).
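For completeness, the %s route from the PS would look something like
this (a sketch; it assumes `data` still holds the strings returned by
the original genfromtxt call, and the file name is illustrative):

    # '%16s' right-justifies each string in a 16-character field,
    # so no conversion to float is needed.
    np.savetxt("13-7.txt", data, fmt="%16s", header="13-7")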
From s.molnar at sbcglobal.net  Tue Oct  8 10:49:31 2019
From: s.molnar at sbcglobal.net (Stephen P. Molnar)
Date: Tue, 8 Oct 2019 10:49:31 -0400
Subject: [Numpy-discussion] Problem with np.savetxt
In-Reply-To:
References: <5D9C8C42.8010006@sbcglobal.net>
Message-ID: <5D9CA1FB.4020607@sbcglobal.net>

Many thanks for your kind replies. I really appreciate your
suggestions.

On 10/08/2019 09:44 AM, Andras Deak wrote:
> PS. if you just want to specify the width of the fields you wouldn't
> have to convert anything, because you can specify the size and
> justification of a %s format. But arguably having float data as
> floats is more natural anyway.
-- 
Stephen P. Molnar, Ph.D.
www.molecular-modeling.net
614.312.7528 (c)
Skype: smolnar1

From silva at lma.cnrs-mrs.fr  Tue Oct  8 10:42:20 2019
From: silva at lma.cnrs-mrs.fr (Fabrice Silva)
Date: Tue, 08 Oct 2019 16:42:20 +0200
Subject: [Numpy-discussion] Problem with np.savetxt
In-Reply-To: <5D9C8C42.8010006@sbcglobal.net>
References: <5D9C8C42.8010006@sbcglobal.net>
Message-ID:

On Tuesday 8 October 2019, Stephen P. Molnar wrote:
> data = np.genfromtxt("13-7.log", usecols=(1), dtype=None,
>                      skip_header=27, skip_footer=1, encoding=None)
>
> print(data)
> [...]
> ['-8.839713733' '-8.743377250' '-8.151051167' '-8.090452911'
>  '-7.967494477' '-7.854890056' '-7.757417879' '-7.741557490'
>  '-7.643885488' '-7.611595767' '-7.507605524' '-7.413920814'
>  '-7.389408331' '-7.384446364' '-7.374206276' '-7.368808179'
>  '-7.346641418' '-7.325037898' '-7.309614787' '-7.113209147']

Hi,
Note that your data array is made of strings and not floats. The
default value of the dtype argument is float, which you override with
None. Remove the 'dtype=None' part to load the data correctly.

You then have no problem saving your data with the format you want.

Fabrice

PS : be aware that [data] is a 2D row array that will end up inlined
with the command

np.savetxt('13-7', [data], fmt='%15.9f', header='13-7')

Remove the brackets for a one-per-line formatted output:

np.savetxt('13-7', data, fmt='%15.9f', header='13-7')
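A tiny demonstration of the PS above, using two of the values from the
thread (the output file names are illustrative):

    import numpy as np

    data = np.array([-8.862004149, -8.403522829])

    # One row with all values:
    np.savetxt('row.txt', [data], fmt='%15.9f')

    # One value per line:
    np.savetxt('col.txt', data, fmt='%15.9f')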
From s.molnar at sbcglobal.net  Tue Oct  8 13:53:51 2019
From: s.molnar at sbcglobal.net (Stephen P. Molnar)
Date: Tue, 8 Oct 2019 13:53:51 -0400
Subject: [Numpy-discussion] Problem with np.savetxt
In-Reply-To:
References: <5D9C8C42.8010006@sbcglobal.net>
Message-ID: <5D9CCD2F.1060504@sbcglobal.net>

Thanks for the replies. All is now well! I'm thankful that this list
is so very patient with ROFs (retired old fools) struggling to learn a
new programming language.

On 10/08/2019 10:42 AM, Fabrice Silva wrote:
> Note that your data array is made of strings and not floats. The
> default value of the dtype argument is float, which you override with
> None. Remove the 'dtype=None' part to load the data correctly.
>
> You then have no problem saving your data with the format you want.

-- 
Stephen P. Molnar, Ph.D.
www.molecular-modeling.net
614.312.7528 (c)
Skype: smolnar1

From sebastian at sipsolutions.net  Tue Oct  8 14:04:24 2019
From: sebastian at sipsolutions.net (Sebastian Berg)
Date: Tue, 08 Oct 2019 11:04:24 -0700
Subject: [Numpy-discussion] NumPy Community Meeting Wednesday, Oct. 9
Message-ID: <3e9a164ebfb52fe5a31dd8bc623f360d32ff80f5.camel@sipsolutions.net>

Hi all,

There will be a NumPy Community meeting Wednesday October 9 at 11 am
Pacific Time. Everyone is invited to join in and edit the
work-in-progress meeting topics and notes:

https://hackmd.io/76o-IxCjQX2mOXO_wwkcpg?both

Best wishes

Sebastian

From einstein.edison at gmail.com  Wed Oct  9 13:30:26 2019
From: einstein.edison at gmail.com (Hameer Abbasi)
Date: Wed, 9 Oct 2019 17:30:26 +0000
Subject: [Numpy-discussion] NEP 31 — Context-local and global overrides
 of the NumPy API
Message-ID:

Thanks to all the feedback, we have a new PR of NEP-31. Please find the
full text quoted below:

============================================================
NEP 31 — Context-local and global overrides of the NumPy API
============================================================

:Author: Hameer Abbasi
:Author: Ralf Gommers
:Author: Peter Bell
:Status: Draft
:Type: Standards Track
:Created: 2019-08-22

Abstract
--------

This NEP proposes to make all of NumPy's public API overridable via an
extensible backend mechanism.

Acceptance of this NEP means NumPy would provide global and
context-local overrides, as well as a dispatch mechanism similar to
NEP-18 [2]_. First experiences with ``__array_function__`` show that
it is necessary to be able to override NumPy functions that *do not
take an array-like argument*, and hence aren't overridable via
``__array_function__``. The most pressing need is array creation and
coercion functions, such as ``numpy.zeros`` or ``numpy.asarray``; see
e.g. NEP-30 [9]_.

This NEP proposes to allow, in an opt-in fashion, overriding any part
of the NumPy API. It is intended as a comprehensive resolution to
NEP-22 [3]_, and obviates the need to add an ever-growing list of new
protocols for each new type of function or object that needs to become
overridable.
Motivation and Scope
--------------------

The motivation behind ``uarray`` is manifold: First, there have been
several attempts to allow dispatch of parts of the NumPy API,
including (most prominently) the ``__array_ufunc__`` protocol in
NEP-13 [4]_ and the ``__array_function__`` protocol in NEP-18 [2]_,
but this has shown the need for further protocols to be developed,
including a protocol for coercion (see [5]_, [9]_). The reasons these
overrides are needed have been extensively discussed in the
references, and this NEP will not attempt to go into the details of
why these are needed; but in short: It is necessary for library
authors to be able to coerce arbitrary objects into arrays of their
own types, such as CuPy needing to coerce to a CuPy array, for
example, instead of a NumPy array.

These kinds of overrides are useful for both end-users and library
authors. End-users may have written or wish to write code that they
then later speed up or move to a different implementation, say
PyData/Sparse. They can do this simply by setting a backend. Library
authors may also wish to write code that is portable across array
implementations, for example ``sklearn`` may wish to write code for a
machine learning algorithm that is portable across array
implementations while also using array creation functions.

This NEP takes a holistic approach: It assumes that there are parts of
the API that need to be overridable, and that these will grow over
time. It provides a general framework and a mechanism to avoid a
design of a new protocol each time this is required. This was the goal
of ``uarray``: to allow for overrides in an API without needing the
design of a new protocol.

This NEP proposes the following: That ``unumpy`` [8]_ becomes the
recommended override mechanism for the parts of the NumPy API not yet
covered by ``__array_function__`` or ``__array_ufunc__``, and that
``uarray`` is vendored into a new namespace within NumPy to give users
and downstream dependencies access to these overrides. This vendoring
mechanism is similar to what SciPy decided to do for making
``scipy.fft`` overridable (see [10]_).

Detailed description
--------------------

Using overrides
~~~~~~~~~~~~~~~

The way we propose the overrides will be used by end users is::

    # On the library side
    import numpy.overridable as unp

    def library_function(array):
        array = unp.asarray(array)
        # Code using unumpy as usual
        return array

    # On the user side:
    import numpy.overridable as unp
    import uarray as ua
    import dask.array as da

    ua.register_backend(da)

    library_function(dask_array)  # works and returns dask_array

    with unp.set_backend(da):
        library_function([1, 2, 3, 4])  # actually returns a Dask array.

Here, ``backend`` can be any compatible object defined either by NumPy
or an external library, such as Dask or CuPy. Ideally, it should be
the module ``dask.array`` or ``cupy`` itself.

Composing backends
~~~~~~~~~~~~~~~~~~

There are some backends which may depend on other backends, for
example xarray depending on ``numpy.fft``, and transforming a time
axis into a frequency axis, or Dask/xarray holding an array other than
a NumPy array inside it. This would be handled in the following manner
inside code::

    with ua.set_backend(cupy), ua.set_backend(dask.array):
        # Code that has distributed GPU arrays here

Proposals
~~~~~~~~~

The only change this NEP proposes at its acceptance is to make
``unumpy`` the officially recommended way to override NumPy, along
with making some submodules overridable by default via ``uarray``.
``unumpy`` will remain a separate repository/package, which we propose
to vendor to avoid a hard dependency, using the separate ``unumpy``
package only if it is installed, rather than depending on it for the
time being. In concrete terms, ``numpy.overridable`` becomes an alias
for ``unumpy``, if available, with a fallback to a vendored version if
not. ``uarray`` and ``unumpy`` will be developed primarily with the
input of duck-array authors and secondarily, custom dtype authors, via
the usual GitHub workflow. There are a few reasons for this:

* Faster iteration in the case of bugs or issues.
* Faster design changes, in the case of needed functionality.
* ``unumpy`` will work with older versions of NumPy as well.
* The user and library author opt in to the override process, rather
  than breakages happening when it is least expected. In simple terms,
  bugs in ``unumpy`` mean that ``numpy`` remains unaffected.
* For ``numpy.fft``, ``numpy.linalg`` and ``numpy.random``, the
  functions in the main namespace will mirror those in the
  ``numpy.overridable`` namespace. The reason for this is that there
  may exist functions in these submodules that need backends, even for
  ``numpy.ndarray`` inputs.

Advantages of ``unumpy`` over other solutions
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

``unumpy`` offers a number of advantages over the approach of defining
a new protocol for every problem encountered: Whenever there is
something requiring an override, ``unumpy`` will be able to offer a
unified API with very minor changes. For example:

* ``ufunc`` objects can be overridden via their ``__call__``,
  ``reduce`` and other methods.
* Other functions can be overridden in a similar fashion.
* ``np.asduckarray`` goes away, and becomes
  ``np.overridable.asarray`` with a backend set.
* The same holds for array creation functions such as ``np.zeros``,
  ``np.empty`` and so on.

This also holds for the future: Making something overridable would
require only minor changes to ``unumpy``.

Another promise ``unumpy`` holds is one of default implementations.
Default implementations can be provided for any multimethod, in terms
of others. This allows one to override a large part of the NumPy API
by defining only a small part of it. This is to ease the creation of
new duck-arrays, by providing default implementations of many
functions that can be easily expressed in terms of others, as well as
a repository of utility functions that help in the implementation of
duck-arrays that most duck-arrays would require. This would allow us
to avoid designing entire protocols, e.g., a protocol for stacking and
concatenating would be replaced by simply implementing ``stack``
and/or ``concatenate`` and then providing default implementations for
everything else in that class. The same applies for transposing, and
many other functions for which protocols haven't been proposed, such
as ``isin`` in terms of ``in1d``, ``setdiff1d`` in terms of
``unique``, and so on.

It also allows one to override functions in a manner which
``__array_function__`` simply cannot, such as overriding ``np.einsum``
with the version from the ``opt_einsum`` package, or Intel MKL
overriding FFT, BLAS or ``ufunc`` objects. They would define a backend
with the appropriate multimethods, and the user would select them via
a ``with`` statement, or register them as a backend.
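As a rough illustration (a sketch only, not part of the proposal: it
assumes ``unumpy`` exposes an ``einsum`` multimethod, and uses the
``__ua_domain__``/``__ua_function__`` protocol described under
"``uarray`` Primer" below), such a backend could look like::

    import numpy as np
    import opt_einsum
    import uarray as ua
    import unumpy as unp

    class OptEinsumBackend:
        __ua_domain__ = "numpy"

        @staticmethod
        def __ua_function__(func, args, kwargs):
            # Override only einsum; let everything else fall through.
            if func is unp.einsum:
                return opt_einsum.contract(*args, **kwargs)
            return NotImplemented

    a = np.random.rand(4, 5)
    b = np.random.rand(5, 6)

    with ua.set_backend(OptEinsumBackend):
        c = unp.einsum("ij,jk->ik", a, b)  # goes to opt_einsum.contract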
The last benefit is a clear way to coerce to a given backend (via the
``coerce`` keyword in ``ua.set_backend``), and a protocol for coercing
not only arrays, but also ``dtype`` objects and ``ufunc`` objects with
similar ones from other libraries. This is due to the existence of
actual, third-party dtype packages, and their desire to blend into the
NumPy ecosystem (see [6]_). This is a separate issue compared to the
C-level dtype redesign proposed in [7]_; it's about allowing
third-party dtype implementations to work with NumPy, much like
third-party array implementations. These can provide features such as,
for example, units, jagged arrays or other such features that are
outside the scope of NumPy.

Mixing NumPy and ``unumpy`` in the same file
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Normally, one would want to import only one of ``unumpy`` or
``numpy``, and would import it as ``np`` for familiarity. However,
there may be situations where one wishes to mix NumPy and the
overrides, and there are a few ways to do this, depending on the
user's style::

    from numpy import overridable as unp
    import numpy as np

or::

    import numpy as np

    # Use unumpy via np.overridable

Duck-array coercion
~~~~~~~~~~~~~~~~~~~

There are inherent problems with returning objects that are not NumPy
arrays from ``numpy.array`` or ``numpy.asarray``, particularly in the
context of C/C++ or Cython code that may get an object with a
different memory layout than the one it expects. However, we believe
this problem may apply not only to these two functions but to all
functions that return NumPy arrays. For this reason, overrides are
opt-in for the user, by using the submodule ``numpy.overridable``
rather than ``numpy``. NumPy will continue to work unaffected by
anything in ``numpy.overridable``.

If the user wishes to obtain a NumPy array, there are two ways of
doing it:

1. Use ``numpy.asarray`` (the non-overridable version).
2. Use ``numpy.overridable.asarray`` with the NumPy backend set and
   coercion enabled.

Aliases outside of the ``numpy.overridable`` namespace
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

All functionality in ``numpy.random``, ``numpy.linalg`` and
``numpy.fft`` will be aliased to their respective overridable versions
inside ``numpy.overridable``. The reason for this is that there are
alternative implementations of RNGs (``mkl-random``), linear algebra
routines (``eigen``, ``blis``) and FFT routines (``mkl-fft``,
``pyFFTW``) that need to operate on ``numpy.ndarray`` inputs, but
still need the ability to switch behaviour.

This is different from monkeypatching in a few different ways:

* The caller-facing signature of the function is always the same, so
  there is at least the loose sense of an API contract.
  Monkeypatching does not provide this ability.
* There is the ability of locally switching the backend.
* It has been suggested that the reason that 1.17 hasn't landed in the
  Anaconda defaults channel is due to the incompatibility between
  monkeypatching and ``__array_function__``, as monkeypatching would
  bypass the protocol completely.
* Statements of the form ``from numpy import x; x`` and ``np.x`` would
  have different results depending on whether the import was made
  before or after monkeypatching happened.

All this isn't possible at all with ``__array_function__`` or
``__array_ufunc__``.

It has been formally realised (at least in part) that a backend system
is needed for this, in the NumPy roadmap.
For ``numpy.random``, it's still necessary to make the C-API fit the
one proposed in NEP-19. This is impossible for ``mkl-random``, because
then it would need to be rewritten to fit that framework. The
guarantees on stream compatibility will be the same as before, but if
there's a backend that affects ``numpy.random`` set, we make no
guarantees about stream compatibility, and it is up to the backend
author to provide their own guarantees.

Providing a way for implicit dispatch
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

It has been suggested that the ability to dispatch methods which do
not take a dispatchable is needed, while guessing that backend from
another dispatchable. As a concrete example, consider the following:

.. code:: python

    with unumpy.determine_backend(array_like, np.ndarray):
        unumpy.arange(len(array_like))

While this does not exist yet in ``uarray``, it is trivial to add. The
need for this kind of code exists because one might want an
alternative to the proposed ``*_like`` functions, or the ``like=``
keyword argument: there are functions in the NumPy API that do not
take a dispatchable argument, but one still needs to select a backend
based on a different dispatchable.

The need for an opt-in module
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

An opt-in module is needed for a few reasons:

* There are parts of the API (like ``numpy.asarray``) that simply
  cannot be overridden due to incompatibility concerns with C/Cython
  extensions; however, one may want to coerce to a duck-array using
  ``asarray`` with a backend set.
* There are possible issues around an implicit option and
  monkeypatching, such as those mentioned above.

NEP 18 notes that this may require maintenance of two separate APIs.
However, this burden may be lessened by, for example, parametrizing
all tests over ``numpy.overridable`` separately via a fixture. This
also has the side-effect of thoroughly testing it, unlike
``__array_function__``. We also feel that it provides an opportunity
to separate the NumPy API contract properly from the implementation.

Benefits to end-users and mixing backends
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Mixing backends is easy in ``uarray``, one only has to do:

.. code:: python

    # Explicitly say which backends you want to mix
    ua.register_backend(backend1)
    ua.register_backend(backend2)
    ua.register_backend(backend3)

    # Freely use code that mixes backends here.

The benefits to end-users extend beyond just writing new code. Old
code (usually in the form of scripts) can be easily ported to
different backends by a simple import switch and a line adding the
preferred backend. This way, users may find it easier to port existing
code to GPU or distributed computing.

Related Work
------------

Other override mechanisms
~~~~~~~~~~~~~~~~~~~~~~~~~

* NEP-18, the ``__array_function__`` protocol. [2]_
* NEP-13, the ``__array_ufunc__`` protocol. [4]_
* NEP-30, the ``__duck_array__`` protocol. [9]_
Existing NumPy-like array implementations
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

* Dask: https://dask.org/
* CuPy: https://cupy.chainer.org/
* PyData/Sparse: https://sparse.pydata.org/
* Xnd: https://xnd.readthedocs.io/
* Astropy's Quantity: https://docs.astropy.org/en/stable/units/

Existing and potential consumers of alternative arrays
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

* Dask: https://dask.org/
* scikit-learn: https://scikit-learn.org/
* xarray: https://xarray.pydata.org/
* TensorLy: http://tensorly.org/

Existing alternate dtype implementations
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

* ``ndtypes``: https://ndtypes.readthedocs.io/en/latest/
* Datashape: https://datashape.readthedocs.io
* Plum: https://plum-py.readthedocs.io/

Alternate implementations of parts of the NumPy API
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

* ``mkl_random``: https://github.com/IntelPython/mkl_random
* ``mkl_fft``: https://github.com/IntelPython/mkl_fft
* ``bottleneck``: https://github.com/pydata/bottleneck
* ``opt_einsum``: https://github.com/dgasmith/opt_einsum

Implementation
--------------

The implementation of this NEP will require the following steps:

* Implementation of ``uarray`` multimethods corresponding to the NumPy
  API, including classes for overriding ``dtype``, ``ufunc`` and
  ``array`` objects, in the ``unumpy`` repository.
* Moving backends from ``unumpy`` into the respective array libraries.

``uarray`` Primer
~~~~~~~~~~~~~~~~~

**Note:** *This section will not attempt to go into too much detail
about uarray; that is the purpose of the uarray documentation* [1]_.
*However, the NumPy community will have input into the design of
uarray, via the issue tracker.*

``unumpy`` is the interface that defines a set of overridable
functions (multimethods) compatible with the NumPy API. To do this, it
uses the ``uarray`` library. ``uarray`` is a general purpose tool for
creating multimethods that dispatch to one of multiple different
possible backend implementations. In this sense, it is similar to the
``__array_function__`` protocol, but with the key difference that the
backend is explicitly installed by the end-user and not coupled to the
array type.

Decoupling the backend from the array type gives much more flexibility
to end-users and backend authors. For example, it is possible to:

* override functions not taking arrays as arguments
* create backends out of source from the array type
* install multiple backends for the same array type

This decoupling also means that ``uarray`` is not constrained to
dispatching over array-like types. The backend is free to inspect the
entire set of function arguments to determine if it can implement the
function, e.g. ``dtype`` parameter dispatching.

Defining backends
^^^^^^^^^^^^^^^^^

``uarray`` consists of two main protocols: ``__ua_convert__`` and
``__ua_function__``, called in that order, along with
``__ua_domain__``. ``__ua_convert__`` is for conversion and coercion.
It has the signature ``(dispatchables, coerce)``, where
``dispatchables`` is an iterable of ``ua.Dispatchable`` objects and
``coerce`` is a boolean indicating whether or not to force the
conversion. ``ua.Dispatchable`` is a simple class holding three
values: ``type``, ``value``, and ``coercible``. ``__ua_convert__``
returns an iterable of the converted values, or ``NotImplemented`` in
the case of failure.

``__ua_function__`` has the signature ``(func, args, kwargs)`` and
defines the actual implementation of the function. It receives the
function and its arguments. Returning ``NotImplemented`` will cause a
move to the default implementation of the function if one exists, and
failing that, the next backend.
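For instance, a backend that only understands NumPy arrays might
implement ``__ua_convert__`` roughly like this (a sketch, with error
handling omitted)::

    import numpy as np

    def __ua_convert__(dispatchables, coerce):
        converted = []
        for d in dispatchables:
            if d.type is np.ndarray:
                if isinstance(d.value, np.ndarray):
                    converted.append(d.value)
                elif coerce and d.coercible:
                    # Force the conversion when coercion is requested.
                    converted.append(np.asarray(d.value))
                else:
                    return NotImplemented
            else:
                # Leave non-array dispatchables untouched.
                converted.append(d.value)
        return converted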
Here is what will happen assuming a ``uarray`` multimethod is called:

1. We canonicalise the arguments so any arguments without a default are placed
   in ``*args`` and those with one are placed in ``**kwargs``.
2. We check the list of backends.

   a. If it is empty, we try the default implementation.

3. We check if the backend's ``__ua_convert__`` method exists. If it exists:

   a. We pass it the output of the dispatcher, which is an iterable of
      ``ua.Dispatchable`` objects.
   b. We feed this output, along with the arguments, to the argument replacer.
      ``NotImplemented`` means we move to 3 with the next backend.
   c. We store the replaced arguments as the new arguments.

4. We feed the arguments into ``__ua_function__``, and if the output isn't
   ``NotImplemented``, we return it and exit.
5. If the default implementation exists, we try it with the current backend.
6. On failure, we move to 3 with the next backend. If there are no more
   backends, we move to 7.
7. We raise a ``ua.BackendNotImplementedError``.

Defining overridable multimethods
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

To define an overridable function (a multimethod), one needs a few things:

1. A dispatcher that returns an iterable of ``ua.Dispatchable`` objects.
2. A reverse dispatcher that replaces dispatchable values with the supplied
   ones.
3. A domain.
4. Optionally, a default implementation, which can be provided in terms of
   other multimethods.

As an example, consider the following::

    import numpy as np
    import uarray as ua

    def full_argreplacer(args, kwargs, dispatchables):
        def full(shape, fill_value, dtype=None, order='C'):
            return (shape, fill_value), dict(
                dtype=dispatchables[0],
                order=order
            )

        return full(*args, **kwargs)

    @ua.create_multimethod(full_argreplacer, domain="numpy")
    def full(shape, fill_value, dtype=None, order='C'):
        return (ua.Dispatchable(dtype, np.dtype),)

A large set of examples can be found in the ``unumpy`` repository [8]_. This
simple act of overriding callables allows us to override:

* Methods
* Properties, via ``fget`` and ``fset``
* Entire objects, via ``__get__``.

Examples for NumPy
^^^^^^^^^^^^^^^^^^

A library that implements a NumPy-like API will use it in the following manner
(as an example)::

    import numpy.overridable as unp

    _ua_implementations = {}

    __ua_domain__ = "numpy"

    def __ua_function__(func, args, kwargs):
        fn = _ua_implementations.get(func, None)
        return fn(*args, **kwargs) if fn is not None else NotImplemented

    def implements(ua_func):
        def inner(func):
            _ua_implementations[ua_func] = func
            return func

        return inner

    @implements(unp.asarray)
    def asarray(a, dtype=None, order=None):
        # Code here
        # Either this method or __ua_convert__ must
        # return NotImplemented for unsupported types,
        # or they shouldn't be marked as dispatchable.
        ...

    # Provides a default implementation for ones and zeros.
    @implements(unp.full)
    def full(shape, fill_value, dtype=None, order='C'):
        # Code here
        ...

Backward compatibility
----------------------

There are no backward incompatible changes proposed in this NEP.

Alternatives
------------

The current alternative to this problem is a combination of NEP-18 [2]_,
NEP-13 [4]_ and NEP-30 [9]_ plus adding more protocols (not yet specified) in
addition to it. Even then, some parts of the NumPy API will remain
non-overridable, so it's a partial alternative.
The main alternative to vendoring ``unumpy`` is to simply move it into NumPy
completely and not distribute it as a separate package. This would also
achieve the proposed goals; however, we prefer to keep it a separate package
for now, for reasons already stated above.

The third alternative is to move ``unumpy`` into the NumPy organisation and
develop it as a NumPy project. This will also achieve the said goals, and is
also a possibility that can be considered by this NEP. However, the act of
doing an extra ``pip install`` or ``conda install`` may discourage some users
from adopting this method.

An alternative to requiring opt-in is mainly to *not* override ``np.asarray``
and ``np.array``, making the rest of the NumPy API surface overridable, and
instead providing ``np.duckarray`` and ``np.asduckarray`` as duck-array
friendly alternatives that use the respective overrides. However, this has
the downside of adding a minor overhead to NumPy calls.

Discussion
----------

* ``uarray`` blogpost: https://labs.quansight.org/blog/2019/07/uarray-update-api-changes-overhead-and-comparison-to-__array_function__/
* The discussion section of NEP-18: https://numpy.org/neps/nep-0018-array-function-protocol.html#discussion
* NEP-22: https://numpy.org/neps/nep-0022-ndarray-duck-typing-overview.html
* Dask issue #4462: https://github.com/dask/dask/issues/4462
* PR #13046: https://github.com/numpy/numpy/pull/13046
* Dask issue #4883: https://github.com/dask/dask/issues/4883
* Issue #13831: https://github.com/numpy/numpy/issues/13831
* Discussion PR 1: https://github.com/hameerabbasi/numpy/pull/3
* Discussion PR 2: https://github.com/hameerabbasi/numpy/pull/4
* Discussion PR 3: https://github.com/numpy/numpy/pull/14389

References and Footnotes
------------------------

.. [1] uarray, A general dispatch mechanism for Python: https://uarray.readthedocs.io
.. [2] NEP 18 - A dispatch mechanism for NumPy's high level array functions: https://numpy.org/neps/nep-0018-array-function-protocol.html
.. [3] NEP 22 - Duck typing for NumPy arrays - high level overview: https://numpy.org/neps/nep-0022-ndarray-duck-typing-overview.html
.. [4] NEP 13 - A Mechanism for Overriding Ufuncs: https://numpy.org/neps/nep-0013-ufunc-overrides.html
.. [5] Reply to Adding to the non-dispatched implementation of NumPy methods: http://numpy-discussion.10968.n7.nabble.com/Adding-to-the-non-dispatched-implementation-of-NumPy-methods-tp46816p46874.html
.. [6] Custom Dtype/Units discussion: http://numpy-discussion.10968.n7.nabble.com/Custom-Dtype-Units-discussion-td43262.html
.. [7] The epic dtype cleanup plan: https://github.com/numpy/numpy/issues/2899
.. [8] unumpy: NumPy, but implementation-independent: https://unumpy.readthedocs.io
.. [9] NEP 30 - Duck Typing for NumPy Arrays - Implementation: https://www.numpy.org/neps/nep-0030-duck-array-protocol.html
.. [10] SciPy's ``scipy.fft`` backend control: http://scipy.github.io/devdocs/fft.html#backend-control

Copyright
---------

This document has been placed in the public domain.

From: NumPy-Discussion on behalf of Hameer Abbasi
Reply to: Discussion of Numerical Python
Date: Thursday, 5. September 2019 at 17:12
To:
Subject: Re: [Numpy-discussion] NEP 31 - Context-local and global overrides of the NumPy API

Hello everyone; Thanks to all the feedback from the community, in particular
Sebastian Berg, we have a new draft of NEP-31. Please find the full text
quoted below for discussion and reference. Any feedback and discussion is
welcome.

=============================================================
NEP 31 - Context-local and global overrides of the NumPy API
=============================================================

:Author: Hameer Abbasi
:Author: Ralf Gommers
:Author: Peter Bell
:Status: Draft
:Type: Standards Track
:Created: 2019-08-22

Abstract
--------

This NEP proposes to make all of NumPy's public API overridable via an
extensible backend mechanism.

Acceptance of this NEP means NumPy would provide global and context-local
overrides, as well as a dispatch mechanism similar to NEP-18 [2]_. First
experiences with ``__array_function__`` show that it is necessary to be able
to override NumPy functions that *do not take an array-like argument*, and
hence aren't overridable via ``__array_function__``. The most pressing need is
array creation and coercion functions, such as ``numpy.zeros`` or
``numpy.asarray``; see e.g. NEP-30 [9]_.

This NEP proposes to allow, in an opt-in fashion, overriding any part of the
NumPy API. It is intended as a comprehensive resolution to NEP-22 [3]_, and
obviates the need to add an ever-growing list of new protocols for each new
type of function or object that needs to become overridable.

Motivation and Scope
--------------------

The motivation behind ``uarray`` is manifold: First, there have been several
attempts to allow dispatch of parts of the NumPy API, including (most
prominently) the ``__array_ufunc__`` protocol in NEP-13 [4]_, and the
``__array_function__`` protocol in NEP-18 [2]_, but this has shown the need
for further protocols to be developed, including a protocol for coercion (see
[5]_, [9]_). The reasons these overrides are needed have been extensively
discussed in the references, and this NEP will not attempt to go into the
details of why these are needed; but in short: It is necessary for library
authors to be able to coerce arbitrary objects into arrays of their own types,
such as CuPy needing to coerce to a CuPy array, for example, instead of a
NumPy array.

These kinds of overrides are useful for both the end-user as well as library
authors. End-users may have written or wish to write code that they then later
speed up or move to a different implementation, say PyData/Sparse. They can do
this simply by setting a backend. Library authors may also wish to write code
that is portable across array implementations, for example ``sklearn`` may
wish to write code for a machine learning algorithm that is portable across
array implementations while also using array creation functions.

This NEP takes a holistic approach: It assumes that there are parts of the API
that need to be overridable, and that these will grow over time. It provides a
general framework and a mechanism to avoid designing a new protocol each time
this is required. This was the goal of ``uarray``: to allow for overrides in
an API without needing the design of a new protocol.

This NEP proposes the following: That ``unumpy`` [8]_ becomes the recommended
override mechanism for the parts of the NumPy API not yet covered by
``__array_function__`` or ``__array_ufunc__``, and that ``uarray`` is vendored
into a new namespace within NumPy to give users and downstream dependencies
access to these overrides. This vendoring mechanism is similar to what SciPy
decided to do for making ``scipy.fft`` overridable (see [10]_).
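For context, the ``scipy.fft`` mechanism referenced above already works along
these lines. A sketch of its usage, assuming SciPy 1.4 or later (not a
requirement of this NEP), looks like:

.. code:: python

    import numpy as np
    import scipy.fft

    x = np.arange(8, dtype=float)

    # Locally select who computes the FFT; 'scipy' names the built-in
    # backend, and third-party backend objects (for example pyFFTW's
    # scipy_fft interface) can be passed in the same way.
    with scipy.fft.set_backend('scipy'):
        y = scipy.fft.fft(x)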
Detailed description
--------------------

Using overrides
~~~~~~~~~~~~~~~

The way we propose the overrides will be used by end users is::

    # On the library side
    import numpy.overridable as unp

    def library_function(array):
        array = unp.asarray(array)
        # Code using unumpy as usual
        return array

    # On the user side:
    import numpy.overridable as unp
    import uarray as ua
    import dask.array as da

    ua.register_backend(da)

    library_function(dask_array)  # works and returns dask_array

    with unp.set_backend(da):
        library_function([1, 2, 3, 4])  # actually returns a Dask array.

Here, ``backend`` can be any compatible object defined either by NumPy or an
external library, such as Dask or CuPy. Ideally, it should be the module
``dask.array`` or ``cupy`` itself.

Composing backends
~~~~~~~~~~~~~~~~~~

There are some backends which may depend on other backends, for example xarray
depending on ``numpy.fft`` to transform a time axis into a frequency axis, or
Dask/xarray holding an array other than a NumPy array inside it. This would be
handled in the following manner inside code::

    with ua.set_backend(cupy), ua.set_backend(dask.array):
        # Code that has distributed GPU arrays here

Proposals
~~~~~~~~~

The only change this NEP proposes at its acceptance is to make ``unumpy`` the
officially recommended way to override NumPy. ``unumpy`` will remain a
separate repository/package, which we propose to vendor rather than depend on
for the time being, to avoid a hard dependency; the separate ``unumpy``
package would be used only if it is installed. In concrete terms,
``numpy.overridable`` becomes an alias for ``unumpy`` if it is available, with
a fallback to the vendored version if not. ``uarray`` and ``unumpy`` will be
developed primarily with the input of duck-array authors and secondarily,
custom dtype authors, via the usual GitHub workflow. There are a few reasons
for this:

* Faster iteration in the case of bugs or issues.
* Faster design changes, in the case of needed functionality.
* ``unumpy`` will work with older versions of NumPy as well.
* The user and library author opt in to the override process, rather than
  breakages happening when it is least expected. In simple terms, bugs in
  ``unumpy`` mean that ``numpy`` remains unaffected.

Advantages of ``unumpy`` over other solutions
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

``unumpy`` offers a number of advantages over the approach of defining a new
protocol for every problem encountered: Whenever there is something requiring
an override, ``unumpy`` will be able to offer a unified API with very minor
changes. For example:

* ``ufunc`` objects can be overridden via their ``__call__``, ``reduce`` and
  other methods.
* Other functions can be overridden in a similar fashion.
* ``np.asduckarray`` goes away, and becomes ``np.overridable.asarray`` with a
  backend set.
* The same holds for array creation functions such as ``np.zeros``,
  ``np.empty`` and so on.

This also holds for the future: Making something overridable would require
only minor changes to ``unumpy``.

Another promise ``unumpy`` holds is one of default implementations. Default
implementations can be provided for any multimethod, in terms of others. This
allows one to override a large part of the NumPy API by defining only a small
part of it. This is to ease the creation of new duck-arrays, by providing
default implementations of many functions that can be easily expressed in
terms of others, as well as a repository of utility functions that help in the
implementation of duck-arrays that most duck-arrays would require.
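As a sketch of what such a default implementation could look like, consider
defining ``ones`` in terms of a ``full`` multimethod (like the one defined in
the primer later in this document). The ``default=`` keyword to
``ua.create_multimethod``, the argument replacer, and the function names are
all illustrative assumptions here, not a fixed API:

.. code:: python

    import numpy as np
    import uarray as ua

    def ones_argreplacer(args, kwargs, dispatchables):
        def ones(shape, dtype=None, order='C'):
            return (shape,), dict(dtype=dispatchables[0], order=order)

        return ones(*args, **kwargs)

    def ones_default(shape, dtype=None, order='C'):
        # Expressed entirely in terms of another multimethod, so any
        # backend implementing ``full`` gets ``ones`` for free.
        return full(shape, 1, dtype=dtype, order=order)

    @ua.create_multimethod(ones_argreplacer, domain="numpy",
                           default=ones_default)
    def ones(shape, dtype=None, order='C'):
        return (ua.Dispatchable(dtype, np.dtype),)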
It also allows one to override functions in a manner which
``__array_function__`` simply cannot, such as overriding ``np.einsum`` with
the version from the ``opt_einsum`` package, or Intel MKL overriding FFT, BLAS
or ``ufunc`` objects. They would define a backend with the appropriate
multimethods, and the user would select them via a ``with`` statement, or by
registering them as a backend.

The last benefit is a clear way to coerce to a given backend (via the
``coerce`` keyword in ``ua.set_backend``), and a protocol for coercing not
only arrays, but also ``dtype`` objects and ``ufunc`` objects with similar
ones from other libraries. This is due to the existence of actual, third-party
dtype packages, and their desire to blend into the NumPy ecosystem (see
[6]_). This is a separate issue compared to the C-level dtype redesign
proposed in [7]_; it's about allowing third-party dtype implementations to
work with NumPy, much like third-party array implementations. These can
provide features such as, for example, units, jagged arrays or other such
features that are outside the scope of NumPy.

Mixing NumPy and ``unumpy`` in the same file
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Normally, one would want to import only one of ``unumpy`` or ``numpy``, and
would import it as ``np`` for familiarity. However, there may be situations
where one wishes to mix NumPy and the overrides, and there are a few ways to
do this, depending on the user's style::

    from numpy import overridable as unp
    import numpy as np

or::

    import numpy as np

    # Use unumpy via np.overridable

Duck-array coercion
~~~~~~~~~~~~~~~~~~~

There are inherent problems with returning objects that are not NumPy arrays
from ``numpy.array`` or ``numpy.asarray``, particularly in the context of
C/C++ or Cython code that may get an object with a different memory layout
than the one it expects. However, we believe this problem may apply not only
to these two functions but to all functions that return NumPy arrays. For this
reason, overrides are opt-in for the user, by using the submodule
``numpy.overridable`` rather than ``numpy``. NumPy will continue to work
unaffected by anything in ``numpy.overridable``.

If the user wishes to obtain a NumPy array, there are two ways of doing it:

1. Use ``numpy.asarray`` (the non-overridable version).
2. Use ``numpy.overridable.asarray`` with the NumPy backend set and coercion
   enabled.

Related Work
------------

Other override mechanisms
~~~~~~~~~~~~~~~~~~~~~~~~~

* NEP-18, the ``__array_function__`` protocol. [2]_
* NEP-13, the ``__array_ufunc__`` protocol. [4]_
* NEP-30, the ``__duck_array__`` protocol. [9]_
Existing NumPy-like array implementations
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

* Dask: https://dask.org/
* CuPy: https://cupy.chainer.org/
* PyData/Sparse: https://sparse.pydata.org/
* Xnd: https://xnd.readthedocs.io/
* Astropy's Quantity: https://docs.astropy.org/en/stable/units/

Existing and potential consumers of alternative arrays
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

* Dask: https://dask.org/
* scikit-learn: https://scikit-learn.org/
* xarray: https://xarray.pydata.org/
* TensorLy: http://tensorly.org/

Existing alternate dtype implementations
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

* ``ndtypes``: https://ndtypes.readthedocs.io/en/latest/
* Datashape: https://datashape.readthedocs.io
* Plum: https://plum-py.readthedocs.io/

Implementation
--------------

The implementation of this NEP will require the following steps:

* Implementation of ``uarray`` multimethods corresponding to the NumPy API,
  including classes for overriding ``dtype``, ``ufunc`` and ``array`` objects,
  in the ``unumpy`` repository.
* Moving backends from ``unumpy`` into the respective array libraries.

``uarray`` Primer
~~~~~~~~~~~~~~~~~

**Note:** *This section will not attempt to go into too much detail about
uarray; that is the purpose of the uarray documentation.* [1]_ *However, the
NumPy community will have input into the design of uarray, via the issue
tracker.*

``unumpy`` is the interface that defines a set of overridable functions
(multimethods) compatible with the NumPy API. To do this, it uses the
``uarray`` library. ``uarray`` is a general-purpose tool for creating
multimethods that dispatch to one of multiple different possible backend
implementations. In this sense, it is similar to the ``__array_function__``
protocol, but with the key difference that the backend is explicitly installed
by the end-user and not coupled to the array type.

Decoupling the backend from the array type gives much more flexibility to
end-users and backend authors. For example, it is possible to:

* override functions not taking arrays as arguments
* create backends out of source from the array type
* install multiple backends for the same array type

This decoupling also means that ``uarray`` is not constrained to dispatching
over array-like types. The backend is free to inspect the entire set of
function arguments to determine if it can implement the function, e.g. for
``dtype`` parameter dispatching.

Defining backends
^^^^^^^^^^^^^^^^^

``uarray`` consists of two main protocols: ``__ua_convert__`` and
``__ua_function__``, called in that order, along with ``__ua_domain__``.
``__ua_convert__`` is for conversion and coercion. It has the signature
``(dispatchables, coerce)``, where ``dispatchables`` is an iterable of
``ua.Dispatchable`` objects and ``coerce`` is a boolean indicating whether or
not to force the conversion. ``ua.Dispatchable`` is a simple class consisting
of three values: ``type``, ``value``, and ``coercible``. ``__ua_convert__``
returns an iterable of the converted values, or ``NotImplemented`` in the case
of failure.

``__ua_function__`` has the signature ``(func, args, kwargs)`` and defines the
actual implementation of the function. It receives the function and its
arguments. Returning ``NotImplemented`` will cause a move to the default
implementation of the function if one exists, and failing that, to the next
backend.

Here is what will happen assuming a ``uarray`` multimethod is called:
1. We canonicalise the arguments so any arguments without a default are placed
   in ``*args`` and those with one are placed in ``**kwargs``.
2. We check the list of backends.

   a. If it is empty, we try the default implementation.

3. We check if the backend's ``__ua_convert__`` method exists. If it exists:

   a. We pass it the output of the dispatcher, which is an iterable of
      ``ua.Dispatchable`` objects.
   b. We feed this output, along with the arguments, to the argument replacer.
      ``NotImplemented`` means we move to 3 with the next backend.
   c. We store the replaced arguments as the new arguments.

4. We feed the arguments into ``__ua_function__``, and if the output isn't
   ``NotImplemented``, we return it and exit.
5. If the default implementation exists, we try it with the current backend.
6. On failure, we move to 3 with the next backend. If there are no more
   backends, we move to 7.
7. We raise a ``ua.BackendNotImplementedError``.

Defining overridable multimethods
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

To define an overridable function (a multimethod), one needs a few things:

1. A dispatcher that returns an iterable of ``ua.Dispatchable`` objects.
2. A reverse dispatcher that replaces dispatchable values with the supplied
   ones.
3. A domain.
4. Optionally, a default implementation, which can be provided in terms of
   other multimethods.

As an example, consider the following::

    import numpy as np
    import uarray as ua

    def full_argreplacer(args, kwargs, dispatchables):
        def full(shape, fill_value, dtype=None, order='C'):
            return (shape, fill_value), dict(
                dtype=dispatchables[0],
                order=order
            )

        return full(*args, **kwargs)

    @ua.create_multimethod(full_argreplacer, domain="numpy")
    def full(shape, fill_value, dtype=None, order='C'):
        return (ua.Dispatchable(dtype, np.dtype),)

A large set of examples can be found in the ``unumpy`` repository [8]_. This
simple act of overriding callables allows us to override:

* Methods
* Properties, via ``fget`` and ``fset``
* Entire objects, via ``__get__``.

Examples for NumPy
^^^^^^^^^^^^^^^^^^

A library that implements a NumPy-like API will use it in the following manner
(as an example)::

    import numpy.overridable as unp

    _ua_implementations = {}

    __ua_domain__ = "numpy"

    def __ua_function__(func, args, kwargs):
        fn = _ua_implementations.get(func, None)
        return fn(*args, **kwargs) if fn is not None else NotImplemented

    def implements(ua_func):
        def inner(func):
            _ua_implementations[ua_func] = func
            return func

        return inner

    @implements(unp.asarray)
    def asarray(a, dtype=None, order=None):
        # Code here
        # Either this method or __ua_convert__ must
        # return NotImplemented for unsupported types,
        # or they shouldn't be marked as dispatchable.
        ...

    # Provides a default implementation for ones and zeros.
    @implements(unp.full)
    def full(shape, fill_value, dtype=None, order='C'):
        # Code here
        ...

Backward compatibility
----------------------

There are no backward incompatible changes proposed in this NEP.

Alternatives
------------

The current alternative to this problem is a combination of NEP-18 [2]_,
NEP-13 [4]_ and NEP-30 [9]_ plus adding more protocols (not yet specified) in
addition to it. Even then, some parts of the NumPy API will remain
non-overridable, so it's a partial alternative.

The main alternative to vendoring ``unumpy`` is to simply move it into NumPy
completely and not distribute it as a separate package. This would also
achieve the proposed goals; however, we prefer to keep it a separate package
for now, for reasons already stated above.
The third alternative is to move ``unumpy`` into the NumPy organisation and
develop it as a NumPy project. This will also achieve the said goals, and is
also a possibility that can be considered by this NEP. However, the act of
doing an extra ``pip install`` or ``conda install`` may discourage some users
from adopting this method.

Discussion
----------

* ``uarray`` blogpost: https://labs.quansight.org/blog/2019/07/uarray-update-api-changes-overhead-and-comparison-to-__array_function__/
* The discussion section of NEP-18: https://numpy.org/neps/nep-0018-array-function-protocol.html#discussion
* NEP-22: https://numpy.org/neps/nep-0022-ndarray-duck-typing-overview.html
* Dask issue #4462: https://github.com/dask/dask/issues/4462
* PR #13046: https://github.com/numpy/numpy/pull/13046
* Dask issue #4883: https://github.com/dask/dask/issues/4883
* Issue #13831: https://github.com/numpy/numpy/issues/13831
* Discussion PR 1: https://github.com/hameerabbasi/numpy/pull/3
* Discussion PR 2: https://github.com/hameerabbasi/numpy/pull/4
* Discussion PR 3: https://github.com/numpy/numpy/pull/14389

References and Footnotes
------------------------

.. [1] uarray, A general dispatch mechanism for Python: https://uarray.readthedocs.io
.. [2] NEP 18 - A dispatch mechanism for NumPy's high level array functions: https://numpy.org/neps/nep-0018-array-function-protocol.html
.. [3] NEP 22 - Duck typing for NumPy arrays - high level overview: https://numpy.org/neps/nep-0022-ndarray-duck-typing-overview.html
.. [4] NEP 13 - A Mechanism for Overriding Ufuncs: https://numpy.org/neps/nep-0013-ufunc-overrides.html
.. [5] Reply to Adding to the non-dispatched implementation of NumPy methods: http://numpy-discussion.10968.n7.nabble.com/Adding-to-the-non-dispatched-implementation-of-NumPy-methods-tp46816p46874.html
.. [6] Custom Dtype/Units discussion: http://numpy-discussion.10968.n7.nabble.com/Custom-Dtype-Units-discussion-td43262.html
.. [7] The epic dtype cleanup plan: https://github.com/numpy/numpy/issues/2899
.. [8] unumpy: NumPy, but implementation-independent: https://unumpy.readthedocs.io
.. [9] NEP 30 - Duck Typing for NumPy Arrays - Implementation: https://www.numpy.org/neps/nep-0030-duck-array-protocol.html
.. [10] SciPy's ``scipy.fft`` backend control: http://scipy.github.io/devdocs/fft.html#backend-control

Copyright
---------

This document has been placed in the public domain.

_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion at python.org
https://mail.python.org/mailman/listinfo/numpy-discussion
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From hameerabbasi at yahoo.com Wed Oct 9 13:28:36 2019
From: hameerabbasi at yahoo.com (Hameer Abbasi)
Date: Wed, 09 Oct 2019 22:28:36 +0500
Subject: [Numpy-discussion]
 =?utf-8?q?NEP_31_=E2=80=94_Context-local_and_?=
 =?utf-8?q?global_overrides_of_the_NumPy_API?=
In-Reply-To:
References:
Message-ID: <4C75888A-994F-4F63-9414-0842C8E1C396@yahoo.com>

Thanks to all the feedback, we have a new PR of NEP-31. Please find the full
text quoted below:

=============================================================
NEP 31 - Context-local and global overrides of the NumPy API
=============================================================

:Author: Hameer Abbasi
:Author: Ralf Gommers
:Author: Peter Bell
:Status: Draft
:Type: Standards Track
:Created: 2019-08-22

Abstract
--------

This NEP proposes to make all of NumPy's public API overridable via an
extensible backend mechanism.
Acceptance of this NEP means NumPy would provide global and context-local
overrides, as well as a dispatch mechanism similar to NEP-18 [2]_. First
experiences with ``__array_function__`` show that it is necessary to be able
to override NumPy functions that *do not take an array-like argument*, and
hence aren't overridable via ``__array_function__``. The most pressing need is
array creation and coercion functions, such as ``numpy.zeros`` or
``numpy.asarray``; see e.g. NEP-30 [9]_.

This NEP proposes to allow, in an opt-in fashion, overriding any part of the
NumPy API. It is intended as a comprehensive resolution to NEP-22 [3]_, and
obviates the need to add an ever-growing list of new protocols for each new
type of function or object that needs to become overridable.

Motivation and Scope
--------------------

The motivation behind ``uarray`` is manifold: First, there have been several
attempts to allow dispatch of parts of the NumPy API, including (most
prominently) the ``__array_ufunc__`` protocol in NEP-13 [4]_, and the
``__array_function__`` protocol in NEP-18 [2]_, but this has shown the need
for further protocols to be developed, including a protocol for coercion (see
[5]_, [9]_). The reasons these overrides are needed have been extensively
discussed in the references, and this NEP will not attempt to go into the
details of why these are needed; but in short: It is necessary for library
authors to be able to coerce arbitrary objects into arrays of their own types,
such as CuPy needing to coerce to a CuPy array, for example, instead of a
NumPy array.

These kinds of overrides are useful for both the end-user as well as library
authors. End-users may have written or wish to write code that they then later
speed up or move to a different implementation, say PyData/Sparse. They can do
this simply by setting a backend. Library authors may also wish to write code
that is portable across array implementations, for example ``sklearn`` may
wish to write code for a machine learning algorithm that is portable across
array implementations while also using array creation functions.

This NEP takes a holistic approach: It assumes that there are parts of the API
that need to be overridable, and that these will grow over time. It provides a
general framework and a mechanism to avoid designing a new protocol each time
this is required. This was the goal of ``uarray``: to allow for overrides in
an API without needing the design of a new protocol.

This NEP proposes the following: That ``unumpy`` [8]_ becomes the recommended
override mechanism for the parts of the NumPy API not yet covered by
``__array_function__`` or ``__array_ufunc__``, and that ``uarray`` is vendored
into a new namespace within NumPy to give users and downstream dependencies
access to these overrides. This vendoring mechanism is similar to what SciPy
decided to do for making ``scipy.fft`` overridable (see [10]_).

Detailed description
--------------------

Using overrides
~~~~~~~~~~~~~~~

The way we propose the overrides will be used by end users is::

    # On the library side
    import numpy.overridable as unp

    def library_function(array):
        array = unp.asarray(array)
        # Code using unumpy as usual
        return array

    # On the user side:
    import numpy.overridable as unp
    import uarray as ua
    import dask.array as da

    ua.register_backend(da)

    library_function(dask_array)  # works and returns dask_array

    with unp.set_backend(da):
        library_function([1, 2, 3, 4])  # actually returns a Dask array.

Here, ``backend`` can be any compatible object defined either by NumPy or an
external library, such as Dask or CuPy. Ideally, it should be the module
``dask.array`` or ``cupy`` itself.
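Conversely, if a plain ``numpy.ndarray`` is required at a boundary, the
coercion route described under "Duck-array coercion" below can be sketched as
follows; this assumes ``ua.set_backend`` accepts the ``coerce`` keyword
discussed later in this NEP, and ``to_numpy`` is an illustrative helper:

.. code:: python

    import numpy as np
    import numpy.overridable as unp
    import uarray as ua

    def to_numpy(maybe_duck_array):
        # With the NumPy backend set and coercion enabled, asarray is
        # expected to hand back a plain numpy.ndarray.
        with ua.set_backend(np, coerce=True):
            return unp.asarray(maybe_duck_array)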
Composing backends
~~~~~~~~~~~~~~~~~~

There are some backends which may depend on other backends, for example xarray
depending on ``numpy.fft`` to transform a time axis into a frequency axis, or
Dask/xarray holding an array other than a NumPy array inside it. This would be
handled in the following manner inside code::

    with ua.set_backend(cupy), ua.set_backend(dask.array):
        # Code that has distributed GPU arrays here

Proposals
~~~~~~~~~

The only change this NEP proposes at its acceptance is to make ``unumpy`` the
officially recommended way to override NumPy, along with making some
submodules overridable by default via ``uarray``. ``unumpy`` will remain a
separate repository/package, which we propose to vendor rather than depend on
for the time being, to avoid a hard dependency; the separate ``unumpy``
package would be used only if it is installed. In concrete terms,
``numpy.overridable`` becomes an alias for ``unumpy`` if it is available, with
a fallback to the vendored version if not. ``uarray`` and ``unumpy`` will be
developed primarily with the input of duck-array authors and secondarily,
custom dtype authors, via the usual GitHub workflow. There are a few reasons
for this:

* Faster iteration in the case of bugs or issues.
* Faster design changes, in the case of needed functionality.
* ``unumpy`` will work with older versions of NumPy as well.
* The user and library author opt in to the override process, rather than
  breakages happening when it is least expected. In simple terms, bugs in
  ``unumpy`` mean that ``numpy`` remains unaffected.
* For ``numpy.fft``, ``numpy.linalg`` and ``numpy.random``, the functions in
  the main namespace will mirror those in the ``numpy.overridable`` namespace.
  The reason for this is that there may exist functions in these submodules
  that need backends, even for ``numpy.ndarray`` inputs.

Advantages of ``unumpy`` over other solutions
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

``unumpy`` offers a number of advantages over the approach of defining a new
protocol for every problem encountered: Whenever there is something requiring
an override, ``unumpy`` will be able to offer a unified API with very minor
changes. For example:

* ``ufunc`` objects can be overridden via their ``__call__``, ``reduce`` and
  other methods.
* Other functions can be overridden in a similar fashion.
* ``np.asduckarray`` goes away, and becomes ``np.overridable.asarray`` with a
  backend set.
* The same holds for array creation functions such as ``np.zeros``,
  ``np.empty`` and so on.

This also holds for the future: Making something overridable would require
only minor changes to ``unumpy``.

Another promise ``unumpy`` holds is one of default implementations. Default
implementations can be provided for any multimethod, in terms of others. This
allows one to override a large part of the NumPy API by defining only a small
part of it. This is to ease the creation of new duck-arrays, by providing
default implementations of many functions that can be easily expressed in
terms of others, as well as a repository of utility functions that help in the
implementation of duck-arrays that most duck-arrays would require.
This would allow us to avoid designing entire protocols, e.g., a protocol for
stacking and concatenating would be replaced by simply implementing ``stack``
and/or ``concatenate`` and then providing default implementations for
everything else in that class. The same applies for transposing, and many
other functions for which protocols haven't been proposed, such as ``isin`` in
terms of ``in1d``, ``setdiff1d`` in terms of ``unique``, and so on.

It also allows one to override functions in a manner which
``__array_function__`` simply cannot, such as overriding ``np.einsum`` with
the version from the ``opt_einsum`` package, or Intel MKL overriding FFT, BLAS
or ``ufunc`` objects. They would define a backend with the appropriate
multimethods, and the user would select them via a ``with`` statement, or by
registering them as a backend.

The last benefit is a clear way to coerce to a given backend (via the
``coerce`` keyword in ``ua.set_backend``), and a protocol for coercing not
only arrays, but also ``dtype`` objects and ``ufunc`` objects with similar
ones from other libraries. This is due to the existence of actual, third-party
dtype packages, and their desire to blend into the NumPy ecosystem (see
[6]_). This is a separate issue compared to the C-level dtype redesign
proposed in [7]_; it's about allowing third-party dtype implementations to
work with NumPy, much like third-party array implementations. These can
provide features such as, for example, units, jagged arrays or other such
features that are outside the scope of NumPy.

Mixing NumPy and ``unumpy`` in the same file
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Normally, one would want to import only one of ``unumpy`` or ``numpy``, and
would import it as ``np`` for familiarity. However, there may be situations
where one wishes to mix NumPy and the overrides, and there are a few ways to
do this, depending on the user's style::

    from numpy import overridable as unp
    import numpy as np

or::

    import numpy as np

    # Use unumpy via np.overridable

Duck-array coercion
~~~~~~~~~~~~~~~~~~~

There are inherent problems with returning objects that are not NumPy arrays
from ``numpy.array`` or ``numpy.asarray``, particularly in the context of
C/C++ or Cython code that may get an object with a different memory layout
than the one it expects. However, we believe this problem may apply not only
to these two functions but to all functions that return NumPy arrays. For this
reason, overrides are opt-in for the user, by using the submodule
``numpy.overridable`` rather than ``numpy``. NumPy will continue to work
unaffected by anything in ``numpy.overridable``.

If the user wishes to obtain a NumPy array, there are two ways of doing it:

1. Use ``numpy.asarray`` (the non-overridable version).
2. Use ``numpy.overridable.asarray`` with the NumPy backend set and coercion
   enabled.

Aliases outside of the ``numpy.overridable`` namespace
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

All functionality in ``numpy.random``, ``numpy.linalg`` and ``numpy.fft`` will
be aliased to their respective overridable versions inside
``numpy.overridable``. The reason for this is that there are alternative
implementations of RNGs (``mkl-random``), linear algebra routines (``eigen``,
``blis``) and FFT routines (``mkl-fft``, ``pyFFTW``) that need to operate on
``numpy.ndarray`` inputs, but still need the ability to switch behaviour.

This is different from monkeypatching in a few different ways:

* The caller-facing signature of the function is always the same, so there is
  at least the loose sense of an API contract. Monkeypatching does not provide
  this ability.
* There is the ability to locally switch the backend.
* It has been `suggested `_ that the reason that 1.17 hasn't landed in the
  Anaconda defaults channel is due to the incompatibility between
  monkeypatching and ``__array_function__``, as monkeypatching would bypass
  the protocol completely.
* Statements of the form ``from numpy import x; x`` and ``np.x`` would have
  different results depending on whether the import was made before or after
  monkeypatching happened.

None of this is possible with ``__array_function__`` or ``__array_ufunc__``.
It has been formally realised (at least in part) that a backend system is
needed for this, in the `NumPy roadmap `_.

For ``numpy.random``, it's still necessary to make the C-API fit the one
proposed in `NEP-19 `_. This is impossible for `mkl-random`, because it would
then need to be rewritten to fit that framework. The guarantees on stream
compatibility will be the same as before, but if a backend that affects
``numpy.random`` is set, we make no guarantees about stream compatibility, and
it is up to the backend author to provide their own guarantees.

Providing a way for implicit dispatch
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

It has been suggested that there is a need to dispatch functions that do not
take a dispatchable argument, inferring the backend from another dispatchable
instead. As a concrete example, consider the following:

.. code:: python

    with unumpy.determine_backend(array_like, np.ndarray):
        unumpy.arange(len(array_like))

While this does not exist yet in ``uarray``, it is trivial to add it. The need
for this kind of code exists because one might want to have an alternative for
the proposed ``*_like`` functions, or the ``like=`` keyword argument. The need
for these exists because there are functions in the NumPy API that do not take
a dispatchable argument, but there is still the need to select a backend based
on a different dispatchable.

The need for an opt-in module
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The need for an opt-in module arises for a few reasons:

* There are parts of the API (like ``numpy.asarray``) that simply cannot be
  overridden due to incompatibility concerns with C/Cython extensions;
  however, one may want to coerce to a duck-array using ``asarray`` with a
  backend set.
* There are possible issues around an implicit option and monkeypatching, such
  as those mentioned above.

NEP 18 notes that this may require maintenance of two separate APIs. However,
this burden may be lessened by, for example, parametrizing all tests over
``numpy.overridable`` separately via a fixture. This also has the side effect
of thoroughly testing it, unlike ``__array_function__``. We also feel that it
provides an opportunity to properly separate the NumPy API contract from the
implementation.

Benefits to end-users and mixing backends
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Mixing backends is easy in ``uarray``; one only has to do the following:

.. code:: python

    # Explicitly say which backends you want to mix
    ua.register_backend(backend1)
    ua.register_backend(backend2)
    ua.register_backend(backend3)

    # Freely use code that mixes backends here.

The benefits to end-users extend beyond just writing new code. Old code
(usually in the form of scripts) can be easily ported to different backends by
a simple import switch and a line adding the preferred backend, as sketched
below. This way, users may find it easier to port existing code to GPU or
distributed computing.
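A hedged sketch of such a port, under the NEP's own semantics; ``dask.array``
acting directly as a registrable backend is the ideal case described earlier,
not a shipped feature:

.. code:: python

    # Before: import numpy as np
    import numpy.overridable as np  # the import switch
    import uarray as ua
    import dask.array as da

    ua.register_backend(da)  # the one added line

    # Existing script logic continues unchanged, and is now eligible
    # for dispatch to the Dask backend.
    x = np.arange(1_000_000)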
Related Work
------------

Other override mechanisms
~~~~~~~~~~~~~~~~~~~~~~~~~

* NEP-18, the ``__array_function__`` protocol. [2]_
* NEP-13, the ``__array_ufunc__`` protocol. [4]_
* NEP-30, the ``__duck_array__`` protocol. [9]_

Existing NumPy-like array implementations
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

* Dask: https://dask.org/
* CuPy: https://cupy.chainer.org/
* PyData/Sparse: https://sparse.pydata.org/
* Xnd: https://xnd.readthedocs.io/
* Astropy's Quantity: https://docs.astropy.org/en/stable/units/

Existing and potential consumers of alternative arrays
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

* Dask: https://dask.org/
* scikit-learn: https://scikit-learn.org/
* xarray: https://xarray.pydata.org/
* TensorLy: http://tensorly.org/

Existing alternate dtype implementations
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

* ``ndtypes``: https://ndtypes.readthedocs.io/en/latest/
* Datashape: https://datashape.readthedocs.io
* Plum: https://plum-py.readthedocs.io/

Alternate implementations of parts of the NumPy API
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

* ``mkl_random``: https://github.com/IntelPython/mkl_random
* ``mkl_fft``: https://github.com/IntelPython/mkl_fft
* ``bottleneck``: https://github.com/pydata/bottleneck
* ``opt_einsum``: https://github.com/dgasmith/opt_einsum

Implementation
--------------

The implementation of this NEP will require the following steps:

* Implementation of ``uarray`` multimethods corresponding to the NumPy API,
  including classes for overriding ``dtype``, ``ufunc`` and ``array`` objects,
  in the ``unumpy`` repository.
* Moving backends from ``unumpy`` into the respective array libraries.

``uarray`` Primer
~~~~~~~~~~~~~~~~~

**Note:** *This section will not attempt to go into too much detail about
uarray; that is the purpose of the uarray documentation.* [1]_ *However, the
NumPy community will have input into the design of uarray, via the issue
tracker.*

``unumpy`` is the interface that defines a set of overridable functions
(multimethods) compatible with the NumPy API. To do this, it uses the
``uarray`` library. ``uarray`` is a general-purpose tool for creating
multimethods that dispatch to one of multiple different possible backend
implementations. In this sense, it is similar to the ``__array_function__``
protocol, but with the key difference that the backend is explicitly installed
by the end-user and not coupled to the array type.

Decoupling the backend from the array type gives much more flexibility to
end-users and backend authors. For example, it is possible to:

* override functions not taking arrays as arguments
* create backends out of source from the array type
* install multiple backends for the same array type

This decoupling also means that ``uarray`` is not constrained to dispatching
over array-like types. The backend is free to inspect the entire set of
function arguments to determine if it can implement the function, e.g. for
``dtype`` parameter dispatching.

Defining backends
^^^^^^^^^^^^^^^^^

``uarray`` consists of two main protocols: ``__ua_convert__`` and
``__ua_function__``, called in that order, along with ``__ua_domain__``.
``__ua_convert__`` is for conversion and coercion. It has the signature
``(dispatchables, coerce)``, where ``dispatchables`` is an iterable of
``ua.Dispatchable`` objects and ``coerce`` is a boolean indicating whether or
not to force the conversion.
``ua.Dispatchable`` is a simple class consisting of three values: ``type``,
``value``, and ``coercible``. ``__ua_convert__`` returns an iterable of the
converted values, or ``NotImplemented`` in the case of failure.

``__ua_function__`` has the signature ``(func, args, kwargs)`` and defines the
actual implementation of the function. It receives the function and its
arguments. Returning ``NotImplemented`` will cause a move to the default
implementation of the function if one exists, and failing that, to the next
backend.

Here is what will happen assuming a ``uarray`` multimethod is called:

1. We canonicalise the arguments so any arguments without a default are placed
   in ``*args`` and those with one are placed in ``**kwargs``.
2. We check the list of backends.

   a. If it is empty, we try the default implementation.

3. We check if the backend's ``__ua_convert__`` method exists. If it exists:

   a. We pass it the output of the dispatcher, which is an iterable of
      ``ua.Dispatchable`` objects.
   b. We feed this output, along with the arguments, to the argument replacer.
      ``NotImplemented`` means we move to 3 with the next backend.
   c. We store the replaced arguments as the new arguments.

4. We feed the arguments into ``__ua_function__``, and if the output isn't
   ``NotImplemented``, we return it and exit.
5. If the default implementation exists, we try it with the current backend.
6. On failure, we move to 3 with the next backend. If there are no more
   backends, we move to 7.
7. We raise a ``ua.BackendNotImplementedError``.

Defining overridable multimethods
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

To define an overridable function (a multimethod), one needs a few things:

1. A dispatcher that returns an iterable of ``ua.Dispatchable`` objects.
2. A reverse dispatcher that replaces dispatchable values with the supplied
   ones.
3. A domain.
4. Optionally, a default implementation, which can be provided in terms of
   other multimethods.

As an example, consider the following::

    import numpy as np
    import uarray as ua

    def full_argreplacer(args, kwargs, dispatchables):
        def full(shape, fill_value, dtype=None, order='C'):
            return (shape, fill_value), dict(
                dtype=dispatchables[0],
                order=order
            )

        return full(*args, **kwargs)

    @ua.create_multimethod(full_argreplacer, domain="numpy")
    def full(shape, fill_value, dtype=None, order='C'):
        return (ua.Dispatchable(dtype, np.dtype),)

A large set of examples can be found in the ``unumpy`` repository [8]_. This
simple act of overriding callables allows us to override:

* Methods
* Properties, via ``fget`` and ``fset``
* Entire objects, via ``__get__``.

Examples for NumPy
^^^^^^^^^^^^^^^^^^

A library that implements a NumPy-like API will use it in the following manner
(as an example)::

    import numpy.overridable as unp

    _ua_implementations = {}

    __ua_domain__ = "numpy"

    def __ua_function__(func, args, kwargs):
        fn = _ua_implementations.get(func, None)
        return fn(*args, **kwargs) if fn is not None else NotImplemented

    def implements(ua_func):
        def inner(func):
            _ua_implementations[ua_func] = func
            return func

        return inner

    @implements(unp.asarray)
    def asarray(a, dtype=None, order=None):
        # Code here
        # Either this method or __ua_convert__ must
        # return NotImplemented for unsupported types,
        # or they shouldn't be marked as dispatchable.
        ...
    # Provides a default implementation for ones and zeros.
    @implements(unp.full)
    def full(shape, fill_value, dtype=None, order='C'):
        # Code here
        ...

Backward compatibility
----------------------

There are no backward incompatible changes proposed in this NEP.

Alternatives
------------

The current alternative to this problem is a combination of NEP-18 [2]_,
NEP-13 [4]_ and NEP-30 [9]_ plus adding more protocols (not yet specified) in
addition to it. Even then, some parts of the NumPy API will remain
non-overridable, so it's a partial alternative.

The main alternative to vendoring ``unumpy`` is to simply move it into NumPy
completely and not distribute it as a separate package. This would also
achieve the proposed goals; however, we prefer to keep it a separate package
for now, for reasons already stated above.

The third alternative is to move ``unumpy`` into the NumPy organisation and
develop it as a NumPy project. This will also achieve the said goals, and is
also a possibility that can be considered by this NEP. However, the act of
doing an extra ``pip install`` or ``conda install`` may discourage some users
from adopting this method.

An alternative to requiring opt-in is mainly to *not* override ``np.asarray``
and ``np.array``, making the rest of the NumPy API surface overridable, and
instead providing ``np.duckarray`` and ``np.asduckarray`` as duck-array
friendly alternatives that use the respective overrides. However, this has
the downside of adding a minor overhead to NumPy calls.

Discussion
----------

* ``uarray`` blogpost: https://labs.quansight.org/blog/2019/07/uarray-update-api-changes-overhead-and-comparison-to-__array_function__/
* The discussion section of NEP-18: https://numpy.org/neps/nep-0018-array-function-protocol.html#discussion
* NEP-22: https://numpy.org/neps/nep-0022-ndarray-duck-typing-overview.html
* Dask issue #4462: https://github.com/dask/dask/issues/4462
* PR #13046: https://github.com/numpy/numpy/pull/13046
* Dask issue #4883: https://github.com/dask/dask/issues/4883
* Issue #13831: https://github.com/numpy/numpy/issues/13831
* Discussion PR 1: https://github.com/hameerabbasi/numpy/pull/3
* Discussion PR 2: https://github.com/hameerabbasi/numpy/pull/4
* Discussion PR 3: https://github.com/numpy/numpy/pull/14389

References and Footnotes
------------------------

.. [1] uarray, A general dispatch mechanism for Python: https://uarray.readthedocs.io
.. [2] NEP 18 - A dispatch mechanism for NumPy's high level array functions: https://numpy.org/neps/nep-0018-array-function-protocol.html
.. [3] NEP 22 - Duck typing for NumPy arrays - high level overview: https://numpy.org/neps/nep-0022-ndarray-duck-typing-overview.html
.. [4] NEP 13 - A Mechanism for Overriding Ufuncs: https://numpy.org/neps/nep-0013-ufunc-overrides.html
.. [5] Reply to Adding to the non-dispatched implementation of NumPy methods: http://numpy-discussion.10968.n7.nabble.com/Adding-to-the-non-dispatched-implementation-of-NumPy-methods-tp46816p46874.html
.. [6] Custom Dtype/Units discussion: http://numpy-discussion.10968.n7.nabble.com/Custom-Dtype-Units-discussion-td43262.html
.. [7] The epic dtype cleanup plan: https://github.com/numpy/numpy/issues/2899
.. [8] unumpy: NumPy, but implementation-independent: https://unumpy.readthedocs.io
.. [9] NEP 30 - Duck Typing for NumPy Arrays - Implementation: https://www.numpy.org/neps/nep-0030-duck-array-protocol.html
.. [10] SciPy's ``scipy.fft`` backend control: http://scipy.github.io/devdocs/fft.html#backend-control

Copyright
---------

This document has been placed in the public domain.
From: NumPy-Discussion on behalf of Hameer Abbasi Reply to: Discussion of Numerical Python Date: Thursday, 5. September 2019 at 17:12 To: Subject: Re: [Numpy-discussion] NEP 31 ? Context-local and global overrides of the NumPy API Hello everyone; Thanks to all the feedback from the community, in particular Sebastian Berg, we have a new draft of NEP-31. Please find the full text quoted below for discussion and reference. Any feedback and discussion is welcome. ============================================================ NEP 31 ? Context-local and global overrides of the NumPy API ============================================================ :Author: Hameer Abbasi :Author: Ralf Gommers :Author: Peter Bell :Status: Draft :Type: Standards Track :Created: 2019-08-22 Abstract -------- This NEP proposes to make all of NumPy's public API overridable via an extensible backend mechanism. Acceptance of this NEP means NumPy would provide global and context-local overrides, as well as a dispatch mechanism similar to NEP-18 [2]_. First experiences with ``__array_function__`` show that it is necessary to be able to override NumPy functions that *do not take an array-like argument*, and hence aren't overridable via ``__array_function__``. The most pressing need is array creation and coercion functions, such as ``numpy.zeros`` or ``numpy.asarray``; see e.g. NEP-30 [9]_. This NEP proposes to allow, in an opt-in fashion, overriding any part of the NumPy API. It is intended as a comprehensive resolution to NEP-22 [3]_, and obviates the need to add an ever-growing list of new protocols for each new type of function or object that needs to become overridable. Motivation and Scope -------------------- The motivation behind ``uarray`` is manyfold: First, there have been several attempts to allow dispatch of parts of the NumPy API, including (most prominently), the ``__array_ufunc__`` protocol in NEP-13 [4]_, and the ``__array_function__`` protocol in NEP-18 [2]_, but this has shown the need for further protocols to be developed, including a protocol for coercion (see [5]_, [9]_). The reasons these overrides are needed have been extensively discussed in the references, and this NEP will not attempt to go into the details of why these are needed; but in short: It is necessary for library authors to be able to coerce arbitrary objects into arrays of their own types, such as CuPy needing to coerce to a CuPy array, for example, instead of a NumPy array. These kinds of overrides are useful for both the end-user as well as library authors. End-users may have written or wish to write code that they then later speed up or move to a different implementation, say PyData/Sparse. They can do this simply by setting a backend. Library authors may also wish to write code that is portable across array implementations, for example ``sklearn`` may wish to write code for a machine learning algorithm that is portable across array implementations while also using array creation functions. This NEP takes a holistic approach: It assumes that there are parts of the API that need to be overridable, and that these will grow over time. It provides a general framework and a mechanism to avoid a design of a new protocol each time this is required. This was the goal of ``uarray``: to allow for overrides in an API without needing the design of a new protocol. This NEP proposes the following: That ``unumpy`` [8]_? 
becomes the recommended override mechanism for the parts of the NumPy API not yet covered by ``__array_function__`` or ``__array_ufunc__``, and that ``uarray`` is vendored into a new namespace within NumPy to give users and downstream dependencies access to these overrides.? This vendoring mechanism is similar to what SciPy decided to do for making ``scipy.fft`` overridable (see [10]_). Detailed description -------------------- Using overrides ~~~~~~~~~~~~~~~ The way we propose the overrides will be used by end users is:: ??? # On the library side ??? import numpy.overridable as unp ??? def library_function(array): ??????? array = unp.asarray(array) ??????? # Code using unumpy as usual ??????? return array ??? # On the user side: ??? import numpy.overridable as unp ??? import uarray as ua ??? import dask.array as da ??? ua.register_backend(da) ??? library_function(dask_array)? # works and returns dask_array ??? with unp.set_backend(da): ??????? library_function([1, 2, 3, 4])? # actually returns a Dask array. Here, ``backend`` can be any compatible object defined either by NumPy or an external library, such as Dask or CuPy. Ideally, it should be the module ``dask.array`` or ``cupy`` itself. Composing backends ~~~~~~~~~~~~~~~~~~ There are some backends which may depend on other backends, for example xarray depending on `numpy.fft`, and transforming a time axis into a frequency axis, or Dask/xarray holding an array other than a NumPy array inside it. This would be handled in the following manner inside code:: ??? with ua.set_backend(cupy), ua.set_backend(dask.array): ??????? # Code that has distributed GPU arrays here Proposals ~~~~~~~~~ The only change this NEP proposes at its acceptance, is to make ``unumpy`` the officially recommended way to override NumPy. ``unumpy`` will remain a separate repository/package (which we propose to vendor to avoid a hard dependency, and use the separate ``unumpy`` package only if it is installed, rather than depend on for the time being). In concrete terms, ``numpy.overridable`` becomes an alias for ``unumpy``, if available with a fallback to the a vendored version if not. ``uarray`` and ``unumpy`` and will be developed primarily with the input of duck-array authors and secondarily, custom dtype authors, via the usual GitHub workflow. There are a few reasons for this: * Faster iteration in the case of bugs or issues. * Faster design changes, in the case of needed functionality. * ``unumpy`` will work with older versions of NumPy as well. * The user and library author opt-in to the override process, ? rather than breakages happening when it is least expected. ? In simple terms, bugs in ``unumpy`` mean that ``numpy`` remains ? unaffected. Advantanges of ``unumpy`` over other solutions ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ ``unumpy`` offers a number of advantanges over the approach of defining a new protocol for every problem encountered: Whenever there is something requiring an override, ``unumpy`` will be able to offer a unified API with very minor changes. For example: * ``ufunc`` objects can be overridden via their ``__call__``, ``reduce`` and ? other methods. * Other functions can be overridden in a similar fashion. * ``np.asduckarray`` goes away, and becomes ``np.overridable.asarray`` with a ? backend set. * The same holds for array creation functions such as ``np.zeros``, ? ``np.empty`` and so on. This also holds for the future: Making something overridable would require only minor changes to ``unumpy``. 
The mechanism also allows one to override functions in a manner which
``__array_function__`` simply cannot, such as overriding ``np.einsum``
with the version from the ``opt_einsum`` package, or Intel MKL overriding
FFT, BLAS or ``ufunc`` objects. They would define a backend with the
appropriate multimethods, and the user would select them via a ``with``
statement, or by registering them as a backend.

The last benefit is a clear way to coerce to a given backend (via the
``coerce`` keyword in ``ua.set_backend``), and a protocol for coercing not
only arrays, but also ``dtype`` objects and ``ufunc`` objects with similar
ones from other libraries. This is due to the existence of actual,
third-party dtype packages, and their desire to blend into the NumPy
ecosystem (see [6]_). This is a separate issue from the C-level dtype
redesign proposed in [7]_: it is about allowing third-party dtype
implementations to work with NumPy, much like third-party array
implementations. These can provide features that are outside the scope of
NumPy, such as units or jagged arrays.

Mixing NumPy and ``unumpy`` in the same file
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Normally, one would want to import only one of ``unumpy`` or ``numpy``,
and would import it as ``np`` for familiarity. However, there may be
situations where one wishes to mix NumPy and the overrides, and there are
a few ways to do this, depending on the user's style::

    from numpy import overridable as unp
    import numpy as np

or::

    import numpy as np

    # Use unumpy via np.overridable

Duck-array coercion
~~~~~~~~~~~~~~~~~~~

There are inherent problems with returning objects that are not NumPy
arrays from ``numpy.array`` or ``numpy.asarray``, particularly in the
context of C/C++ or Cython code that may get an object with a different
memory layout than the one it expects. However, we believe this problem
may apply not only to these two functions but to all functions that
return NumPy arrays. For this reason, overrides are opt-in for the user,
by using the submodule ``numpy.overridable`` rather than ``numpy``. NumPy
will continue to work unaffected by anything in ``numpy.overridable``.

If the user wishes to obtain a NumPy array, there are two ways of doing
it:

1. Use ``numpy.asarray`` (the non-overridable version).
2. Use ``numpy.overridable.asarray`` with the NumPy backend set and
   coercion enabled.
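A sketch of option 2 follows. Note that the ``numpy.overridable``
namespace is proposed by this NEP and does not exist yet, so this is
illustrative rather than runnable today; treating NumPy itself as a
backend is likewise an assumption consistent with the proposal::

    import numpy as np
    import numpy.overridable as unp  # proposed namespace (assumption)
    import dask.array as da

    duck = da.ones((10, 10))  # some non-NumPy duck array

    # NumPy acts as the backend; coerce=True forces conversion, so
    # ``arr`` comes back as a plain numpy.ndarray.
    with unp.set_backend(np, coerce=True):
        arr = unp.asarray(duck)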
Related Work
------------

Other override mechanisms
~~~~~~~~~~~~~~~~~~~~~~~~~

* NEP-18, the ``__array_function__`` protocol. [2]_
* NEP-13, the ``__array_ufunc__`` protocol. [4]_
* NEP-30, the ``__duck_array__`` protocol. [9]_

Existing NumPy-like array implementations
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

* Dask: https://dask.org/
* CuPy: https://cupy.chainer.org/
* PyData/Sparse: https://sparse.pydata.org/
* Xnd: https://xnd.readthedocs.io/
* Astropy's Quantity: https://docs.astropy.org/en/stable/units/

Existing and potential consumers of alternative arrays
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

* Dask: https://dask.org/
* scikit-learn: https://scikit-learn.org/
* xarray: https://xarray.pydata.org/
* TensorLy: http://tensorly.org/

Existing alternate dtype implementations
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

* ``ndtypes``: https://ndtypes.readthedocs.io/en/latest/
* Datashape: https://datashape.readthedocs.io
* Plum: https://plum-py.readthedocs.io/

Implementation
--------------

The implementation of this NEP will require the following steps:

* Implementation of ``uarray`` multimethods corresponding to the NumPy
  API, including classes for overriding ``dtype``, ``ufunc`` and ``array``
  objects, in the ``unumpy`` repository.
* Moving backends from ``unumpy`` into the respective array libraries.

``uarray`` Primer
~~~~~~~~~~~~~~~~~

**Note:** *This section will not attempt to go into too much detail about
uarray; that is the purpose of the uarray documentation* [1]_. *However,
the NumPy community will have input into the design of uarray, via the
issue tracker.*

``unumpy`` is the interface that defines a set of overridable functions
(multimethods) compatible with the NumPy API. To do this, it uses the
``uarray`` library. ``uarray`` is a general-purpose tool for creating
multimethods that dispatch to one of multiple different possible backend
implementations. In this sense, it is similar to the
``__array_function__`` protocol, but with the key difference that the
backend is explicitly installed by the end-user and not coupled to the
array type.

Decoupling the backend from the array type gives much more flexibility to
end-users and backend authors. For example, it is possible to:

* override functions not taking arrays as arguments
* create backends out of source from the array type
* install multiple backends for the same array type

This decoupling also means that ``uarray`` is not constrained to
dispatching over array-like types. The backend is free to inspect the
entire set of function arguments to determine if it can implement the
function, e.g. ``dtype`` parameter dispatching.

Defining backends
^^^^^^^^^^^^^^^^^

``uarray`` consists of two main protocols: ``__ua_convert__`` and
``__ua_function__``, called in that order, along with ``__ua_domain__``.

``__ua_convert__`` is for conversion and coercion. It has the signature
``(dispatchables, coerce)``, where ``dispatchables`` is an iterable of
``ua.Dispatchable`` objects and ``coerce`` is a boolean indicating whether
or not to force the conversion. ``ua.Dispatchable`` is a simple class
consisting of three simple values: ``type``, ``value``, and
``coercible``. ``__ua_convert__`` returns an iterable of the converted
values, or ``NotImplemented`` in the case of failure.

``__ua_function__`` has the signature ``(func, args, kwargs)`` and defines
the actual implementation of the function. It receives the function and
its arguments. Returning ``NotImplemented`` will cause a move to the
default implementation of the function if one exists, and failing that,
the next backend.
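Putting the two protocols together, a compressed sketch of a module-level
backend might look as follows. The style mirrors the "Examples for NumPy"
section below; the conversion details (which types are dispatched, the
``my_asarray`` helper) are assumptions made purely for illustration::

    import numpy as np
    import uarray as ua

    __ua_domain__ = "numpy"
    _implementations = {}  # multimethod -> this backend's implementation

    def __ua_convert__(dispatchables, coerce):
        converted = []
        for d in dispatchables:
            if d.type is np.dtype:
                # Accept dtype arguments unchanged in this sketch.
                converted.append(d.value)
            elif coerce and d.coercible:
                # Hypothetical helper converting to this backend's type.
                converted.append(my_asarray(d.value))
            else:
                return NotImplemented
        return converted

    def __ua_function__(func, args, kwargs):
        impl = _implementations.get(func)
        if impl is None:
            # Fall back to the default implementation, then the next
            # backend, as described above.
            return NotImplemented
        return impl(*args, **kwargs)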
Here is what will happen, assuming a ``uarray`` multimethod is called:

1. We canonicalise the arguments so any arguments without a default are
   placed in ``*args`` and those with one are placed in ``**kwargs``.
2. We check the list of backends.

   a. If it is empty, we try the default implementation.

3. We check if the backend's ``__ua_convert__`` method exists. If it
   exists:

   a. We pass it the output of the dispatcher, which is an iterable of
      ``ua.Dispatchable`` objects.
   b. We feed this output, along with the arguments, to the argument
      replacer. ``NotImplemented`` means we move to 3 with the next
      backend.
   c. We store the replaced arguments as the new arguments.

4. We feed the arguments into ``__ua_function__``, return the output, and
   exit if it isn't ``NotImplemented``.
5. If the default implementation exists, we try it with the current
   backend.
6. On failure, we move to 3 with the next backend. If there are no more
   backends, we move to 7.
7. We raise a ``ua.BackendNotImplementedError``.

Defining overridable multimethods
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

To define an overridable function (a multimethod), one needs a few things:

1. A dispatcher that returns an iterable of ``ua.Dispatchable`` objects.
2. A reverse dispatcher that replaces dispatchable values with the
   supplied ones.
3. A domain.
4. Optionally, a default implementation, which can be provided in terms of
   other multimethods.

As an example, consider the following::

    import numpy as np
    import uarray as ua

    def full_argreplacer(args, kwargs, dispatchables):
        def full(shape, fill_value, dtype=None, order='C'):
            return (shape, fill_value), dict(
                dtype=dispatchables[0],
                order=order
            )
        return full(*args, **kwargs)

    @ua.create_multimethod(full_argreplacer, domain="numpy")
    def full(shape, fill_value, dtype=None, order='C'):
        return (ua.Dispatchable(dtype, np.dtype),)

A large set of examples can be found in the ``unumpy`` repository [8]_.
This simple act of overriding callables allows us to override:

* Methods
* Properties, via ``fget`` and ``fset``
* Entire objects, via ``__get__``.

Examples for NumPy
^^^^^^^^^^^^^^^^^^

A library that implements a NumPy-like API will use it in the following
manner (as an example)::

    import numpy.overridable as unp

    _ua_implementations = {}

    __ua_domain__ = "numpy"

    def __ua_function__(func, args, kwargs):
        fn = _ua_implementations.get(func, None)
        return fn(*args, **kwargs) if fn is not None else NotImplemented

    def implements(ua_func):
        def inner(func):
            _ua_implementations[ua_func] = func
            return func
        return inner

    @implements(unp.asarray)
    def asarray(a, dtype=None, order=None):
        # Code here
        # Either this method or __ua_convert__ must
        # return NotImplemented for unsupported types,
        # or they shouldn't be marked as dispatchable.

    # Provides a default implementation for ones and zeros.
    @implements(unp.full)
    def full(shape, fill_value, dtype=None, order='C'):
        # Code here

Backward compatibility
----------------------

There are no backward incompatible changes proposed in this NEP.

Alternatives
------------

The current alternative to this problem is a combination of NEP-18 [2]_,
NEP-13 [4]_ and NEP-30 [9]_, plus adding more protocols (not yet
specified) in addition to them. Even then, some parts of the NumPy API
will remain non-overridable, so it's a partial alternative.

The main alternative to vendoring ``unumpy`` is to simply move it into
NumPy completely and not distribute it as a separate package.
This would also achieve the proposed goals; however, we prefer to keep it
a separate package for now, for the reasons already stated above.

The third alternative is to move ``unumpy`` into the NumPy organisation
and develop it as a NumPy project. This would also achieve the said goals,
and is also a possibility that can be considered by this NEP. However, the
act of doing an extra ``pip install`` or ``conda install`` may discourage
some users from adopting this method.

Discussion
----------

* ``uarray`` blogpost: https://labs.quansight.org/blog/2019/07/uarray-update-api-changes-overhead-and-comparison-to-__array_function__/
* The discussion section of NEP-18: https://numpy.org/neps/nep-0018-array-function-protocol.html#discussion
* NEP-22: https://numpy.org/neps/nep-0022-ndarray-duck-typing-overview.html
* Dask issue #4462: https://github.com/dask/dask/issues/4462
* PR #13046: https://github.com/numpy/numpy/pull/13046
* Dask issue #4883: https://github.com/dask/dask/issues/4883
* Issue #13831: https://github.com/numpy/numpy/issues/13831
* Discussion PR 1: https://github.com/hameerabbasi/numpy/pull/3
* Discussion PR 2: https://github.com/hameerabbasi/numpy/pull/4
* Discussion PR 3: https://github.com/numpy/numpy/pull/14389

References and Footnotes
------------------------

.. [1] uarray, A general dispatch mechanism for Python: https://uarray.readthedocs.io
.. [2] NEP 18 - A dispatch mechanism for NumPy's high level array functions: https://numpy.org/neps/nep-0018-array-function-protocol.html
.. [3] NEP 22 - Duck typing for NumPy arrays, high level overview: https://numpy.org/neps/nep-0022-ndarray-duck-typing-overview.html
.. [4] NEP 13 - A Mechanism for Overriding Ufuncs: https://numpy.org/neps/nep-0013-ufunc-overrides.html
.. [5] Reply to Adding to the non-dispatched implementation of NumPy methods: http://numpy-discussion.10968.n7.nabble.com/Adding-to-the-non-dispatched-implementation-of-NumPy-methods-tp46816p46874.html
.. [6] Custom Dtype/Units discussion: http://numpy-discussion.10968.n7.nabble.com/Custom-Dtype-Units-discussion-td43262.html
.. [7] The epic dtype cleanup plan: https://github.com/numpy/numpy/issues/2899
.. [8] unumpy: NumPy, but implementation-independent: https://unumpy.readthedocs.io
.. [9] NEP 30 - Duck Typing for NumPy Arrays - Implementation: https://www.numpy.org/neps/nep-0030-duck-array-protocol.html
.. [10] http://scipy.github.io/devdocs/fft.html#backend-control

Copyright
---------

This document has been placed in the public domain.

_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion at python.org
https://mail.python.org/mailman/listinfo/numpy-discussion
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From sebastian at sipsolutions.net  Wed Oct  9 20:55:53 2019
From: sebastian at sipsolutions.net (Sebastian Berg)
Date: Wed, 09 Oct 2019 17:55:53 -0700
Subject: [Numpy-discussion] "Spring cleaning" sprint
Message-ID: <81ed0f30ab4f83fa28557b0032d447f1e36922ec.camel@sipsolutions.net>

Hi all,

we are planning a remote sprint to try to reduce the number of open
PRs and issues, and do triage work.

This is planned for next Tuesday, October 14th, between 9:00 and 15:00
Pacific time. Everyone is invited to join in for as long as you wish.

I assume we will have a video chat up and running at:
https://berkeley.zoom.us/j/762261535

However, we will send a reminder/logistics email when we start with the
sprint.

Cheers,

Sebastian
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 833 bytes
Desc: This is a digitally signed message part
URL: 

From jni at fastmail.com  Wed Oct  9 20:59:50 2019
From: jni at fastmail.com (Juan Nunez-Iglesias)
Date: Wed, 09 Oct 2019 19:59:50 -0500
Subject: [Numpy-discussion] NEP 31 - Context-local and global overrides of the NumPy API
In-Reply-To: <4C75888A-994F-4F63-9414-0842C8E1C396@yahoo.com>
References: <4C75888A-994F-4F63-9414-0842C8E1C396@yahoo.com>
Message-ID: <06a272bf-71a9-4f0f-951a-fd0c25e341ba@www.fastmail.com>

Hi all, and thank you for all your hard work with this.

I wanted to provide more of an "end user" perspective than I think has
been present in this discussion so far. Over the past month, I've quickly
skimmed some emails on this thread and skipped others altogether. I am far
from a NumPy novice, but essentially *all* of the discussion went over my
head. For a while my attitude was "Oh well, far smarter people than me are
dealing with this, I'll let them figure it out." Looking at the
participants in the thread, I worry that this is the attitude almost
everyone has taken, and that the solution proposed will not be easy enough
to deal with for any meaningful adoption. Certainly with
`__array_function__` I only took interest when our tests broke with
1.17rc1.

Today I was particularly interested because I'm working to improve
scikit-image support for pyopencl.Array inputs. I went back and read the
original NEP and the latest iteration. Thank you again for the discussion,
because the latest is indeed a vast improvement over the original.

I think the very motivation has the wrong focus. I would summarise it as
"we've been coming up with all kinds of ways to do multiple dispatch for
array-likes, and we've found that we need more ways, so let's come up with
the One True Way." I think the focus should be on the users and community.
Something along the lines of: "New implementations of array computing are
cropping up left, right, and centre in Python (not to speak of other
languages!). There are good reasons for this (GPUs, distributed computing,
sparse data, etc), but it leaves users and library authors in a pickle:
how can they ensure that their functions, written with NumPy array inputs
and outputs in mind, work well in this ecosystem?"

With this new motivation in mind, I think that the user story below is
(a) the best part of the NEP, but (b) underdeveloped. The NEP is all about
"if I want my array implementation to work with this fancy dispatch
system, what do I need to do?". But there should be more of "in user
situations X, Y, and Z, what is the desired behaviour?"

> The way we propose the overrides will be used by end users is::
>
>     # On the library side
>     import numpy.overridable as unp
>
>     def library_function(array):
>         array = unp.asarray(array)
>         # Code using unumpy as usual
>         return array
>
>     # On the user side:
>     import numpy.overridable as unp
>     import uarray as ua
>     import dask.array as da
>
>     ua.register_backend(da)
>
>     library_function(dask_array)  # works and returns dask_array
>
>     with unp.set_backend(da):
>         library_function([1, 2, 3, 4])  # actually returns a Dask array.
>
> Here, ``backend`` can be any compatible object defined either by NumPy
> or an external library, such as Dask or CuPy. Ideally, it should be the
> module ``dask.array`` or ``cupy`` itself.

Some questions about the above:

- What happens if I call `library_function(dask_array)` without
registering `da` as a backend first?
Will `unp.asarray` try to instantiate a potentially 100GB array? This
seems bad.
- To get `library_function`, I presumably have to do `from
fancy_array_library import library_function`. Can the code in
`fancy_array_library` itself register backends, and if so, should/would
fancy array libraries that want to maximise compatibility pre-register a
bunch of backends so that users don't have to?

Here are a couple of code snippets that I would *want* to "just work".
Maybe it's unreasonable, but imho the NEP should provide these as use
cases (specifically: how library_function should be written so that they
work, and what dask.array and pytorch would need to do so that they work,
OR, why the NEP doesn't solve them).

1.
from dask import array as da
from fancy_array_library import library_function  # hopefully skimage one day ;)

data = da.from_zarr('myfile.zarr')
result = library_function(data)  # result should still be dask, all things being equal
result.to_zarr('output.zarr')

2.
from dask import array as da
from magic_library import pytorch_predict

data = da.from_zarr('myfile.zarr')
result = pytorch_predict(data)  # normally here I would use e.g. data.map_overlap, but could this be done magically?
result.to_zarr('output.zarr')

There's probably a whole bunch of other "user stories" one can concoct,
and no doubt many from the authors of the NEP themselves, but they don't
come through in the NEP text. My apologies that I haven't read *all* the
references: I understand that it is frustrating if the above are addressed
there, but I think it's important to have this kind of context in the NEP
itself.

Thank you again, and I hope the above is helpful rather than feels like
more unnecessary churn.

Juan.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From poh.zijie at gmail.com  Thu Oct 10 00:26:41 2019
From: poh.zijie at gmail.com (Zijie Poh)
Date: Wed, 9 Oct 2019 21:26:41 -0700
Subject: [Numpy-discussion] "Spring cleaning" sprint
In-Reply-To: <81ed0f30ab4f83fa28557b0032d447f1e36922ec.camel@sipsolutions.net>
References: <81ed0f30ab4f83fa28557b0032d447f1e36922ec.camel@sipsolutions.net>
Message-ID: 

Hi Sebastian,

Is it Tuesday October 15 or Monday October 14?

Regards,
ZJ

On Wed, Oct 9, 2019 at 5:57 PM Sebastian Berg wrote:

> Hi all,
>
> we are planning a remote sprint to try to reduce the number of open
> PRs and issues, and do triage work.
>
> This is planned for next Tuesday, October 14th, between 9:00 and 15:00
> Pacific time. Everyone is invited to join in for as long as you
> wish.
>
> I assume we will have a video chat up and running at:
> https://berkeley.zoom.us/j/762261535
>
> However, we will send a reminder/logistics email when we start with the
> sprint.
>
> Cheers,
>
> Sebastian
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From sebastian at sipsolutions.net  Thu Oct 10 00:43:57 2019
From: sebastian at sipsolutions.net (Sebastian Berg)
Date: Wed, 09 Oct 2019 21:43:57 -0700
Subject: [Numpy-discussion] "Spring cleaning" sprint
In-Reply-To: 
References: <81ed0f30ab4f83fa28557b0032d447f1e36922ec.camel@sipsolutions.net>
Message-ID: 

On Wed, 2019-10-09 at 21:26 -0700, Zijie Poh wrote:
> Hi Sebastian,
>
> Is it Tuesday October 15 or Monday October 14?
>

Sorry, it's Tuesday the 15th [0]. Monday is a holiday in California at
least.
Cheers,

Sebastian

[0] Probably happens to me because I am still used to weeks starting with
Monday and not Sunday (as my calendar now shows).

> Regards,
> ZJ
>
> On Wed, Oct 9, 2019 at 5:57 PM Sebastian Berg <
> sebastian at sipsolutions.net> wrote:
> > Hi all,
> >
> > we are planning a remote sprint to try to reduce the number of
> > open
> > PRs and issues, and do triage work.
> >
> > This is planned for next Tuesday, October 14th, between 9:00 and
> > 15:00
> > Pacific time. Everyone is invited to join in for as long as you
> > wish.
> >
> > I assume we will have a video chat up and running at:
> > https://berkeley.zoom.us/j/762261535
> >
> > However, we will send a reminder/logistics email when we start with
> > the
> > sprint.
> >
> > Cheers,
> >
> > Sebastian
> > _______________________________________________
> > NumPy-Discussion mailing list
> > NumPy-Discussion at python.org
> > https://mail.python.org/mailman/listinfo/numpy-discussion
> >
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 833 bytes
Desc: This is a digitally signed message part
URL: 

From ralf.gommers at gmail.com  Thu Oct 10 01:37:43 2019
From: ralf.gommers at gmail.com (Ralf Gommers)
Date: Wed, 9 Oct 2019 22:37:43 -0700
Subject: [Numpy-discussion] NEP 31 - Context-local and global overrides of the NumPy API
In-Reply-To: <06a272bf-71a9-4f0f-951a-fd0c25e341ba@www.fastmail.com>
References: <4C75888A-994F-4F63-9414-0842C8E1C396@yahoo.com>
 <06a272bf-71a9-4f0f-951a-fd0c25e341ba@www.fastmail.com>
Message-ID: 

On Wed, Oct 9, 2019 at 6:00 PM Juan Nunez-Iglesias wrote:

> Hi all, and thank you for all your hard work with this.
>
> I wanted to provide more of an "end user" perspective than I think has
> been present in this discussion so far. Over the past month, I've quickly
> skimmed some emails on this thread and skipped others altogether. I am far
> from a NumPy novice, but essentially *all* of the discussion went over my
> head. For a while my attitude was "Oh well, far smarter people than me are
> dealing with this, I'll let them figure it out." Looking at the
> participants in the thread, I worry that this is the attitude almost
> everyone has taken, and that the solution proposed will not be easy enough
> to deal with for any meaningful adoption. Certainly with
> `__array_function__` I only took interest when our tests broke with
> 1.17rc1.
>
> Today I was particularly interested because I'm working to improve
> scikit-image support for pyopencl.Array inputs. I went back and read the
> original NEP and the latest iteration. Thank you again for the discussion,
> because the latest is indeed a vast improvement over the original.
>
> I think the very motivation has the wrong focus. I would summarise it as
> "we've been coming up with all kinds of ways to do multiple dispatch for
> array-likes, and we've found that we need more ways, so let's come up with
> the One True Way." I think the focus should be on the users and community.
> Something along the lines of: "New implementations of array computing are
> cropping up left, right, and centre in Python (not to speak of other
> languages!).
> There are good reasons for this (GPUs, distributed computing, sparse
> data, etc), but it leaves users and library authors in a pickle: how can
> they ensure that their functions, written with NumPy array inputs and
> outputs in mind, work well in this ecosystem?"
>
> With this new motivation in mind, I think that the user story below is
> (a) the best part of the NEP, but (b) underdeveloped. The NEP is all
> about "if I want my array implementation to work with this fancy dispatch
> system, what do I need to do?". But there should be more of "in user
> situations X, Y, and Z, what is the desired behaviour?"
>
> The way we propose the overrides will be used by end users is::
>
>     # On the library side
>     import numpy.overridable as unp
>
>     def library_function(array):
>         array = unp.asarray(array)
>         # Code using unumpy as usual
>         return array
>
>     # On the user side:
>     import numpy.overridable as unp
>     import uarray as ua
>     import dask.array as da
>
>     ua.register_backend(da)
>
>     library_function(dask_array)  # works and returns dask_array
>
>     with unp.set_backend(da):
>         library_function([1, 2, 3, 4])  # actually returns a Dask array.
>
> Here, ``backend`` can be any compatible object defined either by NumPy
> or an external library, such as Dask or CuPy. Ideally, it should be the
> module ``dask.array`` or ``cupy`` itself.
>
> Some questions about the above:
>
> - What happens if I call `library_function(dask_array)` without
> registering `da` as a backend first? Will `unp.asarray` try to
> instantiate a potentially 100GB array? This seems bad.
> - To get `library_function`, I presumably have to do `from
> fancy_array_library import library_function`. Can the code in
> `fancy_array_library` itself register backends, and if so, should/would
> fancy array libraries that want to maximise compatibility pre-register a
> bunch of backends so that users don't have to?
>
> Here are a couple of code snippets that I would *want* to "just work".
> Maybe it's unreasonable, but imho the NEP should provide these as use
> cases (specifically: how library_function should be written so that they
> work, and what dask.array and pytorch would need to do so that they work,
> OR, why the NEP doesn't solve them).
>
> 1.
> from dask import array as da
> from fancy_array_library import library_function  # hopefully skimage one day ;)
>
> data = da.from_zarr('myfile.zarr')
> result = library_function(data)  # result should still be dask, all things being equal
> result.to_zarr('output.zarr')
>
> 2.
> from dask import array as da
> from magic_library import pytorch_predict
>
> data = da.from_zarr('myfile.zarr')
> result = pytorch_predict(data)  # normally here I would use e.g.
> data.map_overlap, but could this be done magically?
> result.to_zarr('output.zarr')
>
> There's probably a whole bunch of other "user stories" one can concoct,
> and no doubt many from the authors of the NEP themselves, but they don't
> come through in the NEP text. My apologies that I haven't read *all* the
> references: I understand that it is frustrating if the above are
> addressed there, but I think it's important to have this kind of context
> in the NEP itself.
>
> Thank you again, and I hope the above is helpful rather than feels like
> more unnecessary churn.
>

Thanks Juan, this feedback is amazing and I couldn't agree more. I think
we have to have this "end user focus" for this NEP, as well as for other
large-scope design efforts: we should do, or have done, this for
__array_ufunc__, __array_function__, the dtype redesign, etc.
I think in this case, the user stories and a "vision" on the whole topic
don't belong inside this NEP. Rather, it should be a separate one like
https://numpy.org/neps/nep-0022-ndarray-duck-typing-overview.html. If you
read the first paragraph of that NEP, it actually starts out exactly right
in the detailed description. But then it dives straight into design. In my
experience, when you mix user stories or external requirements with
design, it's extremely easy to ignore or be super brief about the former,
and let design considerations/details lead rather than follow from those
external requirements. Note that this is also why I wanted to update the
NEP template. We've done a tweak by adding the "Motivation and Scope"
section, but that doesn't go nearly far enough.

Back to this NEP: I don't think we should significantly extend it, we
should write a new separate one. Rationale: these user stories apply
equally to __array_function__ et al., and will have to guide how NumPy as
a whole and the usage of the NumPy API evolves over the next couple of
years.

Cheers,
Ralf
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From s.molnar at sbcglobal.net  Thu Oct 10 10:10:58 2019
From: s.molnar at sbcglobal.net (Stephen P. Molnar)
Date: Thu, 10 Oct 2019 10:10:58 -0400
Subject: [Numpy-discussion] Problem with np.savetxt
In-Reply-To: <5D9CA1FB.4020607@sbcglobal.net>
References: <5D9C8C42.8010006@sbcglobal.net> <5D9CA1FB.4020607@sbcglobal.net>
Message-ID: <5D9F3BF2.5020008@sbcglobal.net>

I am slowly and not quickly stumbling forward, but at this point my
degree of mental entropy (confusion) is monumental.

This works:

> import numpy as np
>
> print('${d}')
>
> data = np.genfromtxt("14-7.log", usecols=(1), skip_header=27,
> skip_footer=1, encoding=None)
>
> print(data)
>
> np.savetxt('14-7.dG', data, fmt='%12.9f', header='14-7')
> print(data)

which produces:

> runfile('/home/comp/Apps/Python/PsoVina/DeltaGTable_V_s.py',
> wdir='/home/comp/Apps/Python/PsoVina', current_namespace=True)
> ${d}
> [-9.96090267 -8.97950478 -8.94261136 -8.91552301 -8.73650883 -8.66338714
>  -8.41073971 -8.38914635 -8.29679891 -8.16845411 -8.12799082 -8.12710377
>  -7.97909074 -7.94187268 -7.90076621 -7.88148523 -7.83782648 -7.8159095
>  -7.72254029 -7.72034674]
> [-9.96090267 -8.97950478 -8.94261136 -8.91552301 -8.73650883 -8.66338714
>  -8.41073971 -8.38914635 -8.29679891 -8.16845411 -8.12799082 -8.12710377
>  -7.97909074 -7.94187268 -7.90076621 -7.88148523 -7.83782648 -7.8159095
>  -7.72254029 -7.72034674]

Note: the print statements are for a quick check of the output, which is:

> # 14-7
> -9.960902669
> -8.979504781
> -8.942611364
> -8.915523010
> -8.736508831
> -8.663387139
> -8.410739711
> -8.389146347
> -8.296798909
> -8.168454106
> -8.127990818
> -8.127103774
> -7.979090739
> -7.941872682
> -7.900766215
> -7.881485228
> -7.837826485
> -7.815909505
> -7.722540286
> -7.720346742

Also, this bash script works:

> #!/bin/bash
>
> # Run.dG.list_1
>
> while IFS= read -r d
> do
>   echo "${d}.log"
>
> done

which returns the log file names:

> 14-7.log
> 15-7.log
> 18-7.log
> C-VX3.log

But, if I run this bash script:

> #!/bin/bash
>
> # Run.dG.list_1
>
> while IFS= read -r d
> do
>   echo "${d}.log"
>   python3 DeltaGTable_V_sl.py
>
> done

where DeltaGTable_V_sl.py is:

> import numpy as np
>
> print('${d}')
>
> data = np.genfromtxt('${d}.log', usecols=(1), skip_header=27,
> skip_footer=1, encoding=None)
> print(data)
>
> np.savetxt('${d}.dG', data, fmt='%12.9f', header='${d}')
> print(data.dG)

I get:
> (base) comp at AbNormal:~/Apps/Python/PsoVina$ sh ./Run.dG.list_1.sh
> 14-7.log
> python3: can't open file 'DeltaGTable_V_sl.py': [Errno 2] No such file
> or directory
> 15-7.log
> python3: can't open file 'DeltaGTable_V_sl.py': [Errno 2] No such file
> or directory
> 18-7.log
> python3: can't open file 'DeltaGTable_V_sl.py': [Errno 2] No such file
> or directory
> C-VX3.log
> python3: can't open file 'DeltaGTable_V_sl.py': [Errno 2] No such file
> or directory

So, it would appear that the log file labels are in the workspace, but
'${d}.log' is not being recognized as fname by genfromtxt. Although I
have googled every combination of search terms I can think of, I am
obviously missing something.

As I have potentially hundreds of files to process, I would appreciate
pointers towards a solution to the problem.

Thanks in advance.

On 10/08/2019 10:49 AM, Stephen P. Molnar wrote:
> Many thanks for your kind replies.
>
> I really appreciate your suggestions.
>
> On 10/08/2019 09:44 AM, Andras Deak wrote:
>> PS. if you just want to specify the width of the fields you wouldn't
>> have to convert anything, because you can specify the size and
>> justification of a %s format. But arguably having float data as floats
>> is more natural anyway.
>>
>> On Tue, Oct 8, 2019 at 3:42 PM Andras Deak wrote:
>>> On Tue, Oct 8, 2019 at 3:17 PM Stephen P. Molnar wrote:
>>>> I am embarrassed to be asking this question, but I have exhausted
>>>> Google at this point.
>>>>
>>>> I have a number of identically formatted text files from which I
>>>> want to extract data, as an example (hopefully, putting these in as
>>>> quotes will preserve the format):
>>>>
>>>>> =======================================================================
>>>>> PSOVina version 2.0
>>>>> Giotto H. K. Tai & Shirley W. I. Siu
>>>>>
>>>>> Computational Biology and Bioinformatics Lab
>>>>> University of Macau
>>>>>
>>>>> Visit http://cbbio.cis.umac.mo for more information.
>>>>>
>>>>> PSOVina was developed based on the framework of AutoDock Vina.
>>>>>
>>>>> For more information about Vina, please visit http://vina.scripps.edu.
>>>>> =======================================================================
>>>>>
>>>>> Output will be 13-7_out.pdbqt
>>>>> Reading input ... done.
>>>>> Setting up the scoring function ... done.
>>>>> Analyzing the binding site ... done.
>>>>> Using random seed: 1828390527
>>>>> Performing search ... done.
>>>>>
>>>>> Refining results ... done.
>>>>>
>>>>> mode |   affinity | dist from best mode
>>>>>      | (kcal/mol) | rmsd l.b.| rmsd u.b.
>>>>> -----+------------+----------+----------
>>>>>    1   -8.862004149      0.000      0.000
>>>>>    2   -8.403522829      2.992      6.553
>>>>>    3   -8.401384636      2.707      5.220
>>>>>    4   -7.886402037      4.907      6.862
>>>>>    5   -7.845519031      3.233      5.915
>>>>>    6   -7.837434227      3.954      5.641
>>>>>    7   -7.834584887      3.188      7.294
>>>>>    8   -7.694395765      3.746      7.553
>>>>>    9   -7.691211177      3.536      5.745
>>>>>   10   -7.670759445      3.698      7.587
>>>>>   11   -7.661882758      4.882      7.044
>>>>>   12   -7.636280303      2.347      3.284
>>>>>   13   -7.635788052      3.511      6.250
>>>>>   14   -7.611175249      2.427      3.449
>>>>>   15   -7.586368357      2.142      2.864
>>>>>   16   -7.531307666      2.976      4.980
>>>>>   17   -7.520501084      3.085      5.775
>>>>>   18   -7.512906514      4.220      7.672
>>>>>   19   -7.307403528      3.240      4.354
>>>>>   20   -7.256063348      3.694      7.252
>>>>> Writing output ... done.
>>>> At this point, my python script consists of only the following:
>>>>
>>>>> #!/usr/bin/env python3
>>>>> # -*- coding: utf-8 -*-
>>>>> """
>>>>>
>>>>> Created on Tue Sep 24 07:51:11 2019
>>>>>
>>>>> """
>>>>> import numpy as np
>>>>>
>>>>> data = []
>>>>>
>>>>> data = np.genfromtxt("13-7.log", usecols=(1), dtype=None,
>>>>> skip_header=27, skip_footer=1, encoding=None)
>>>>>
>>>>> print(data)
>>>>>
>>>>> np.savetxt('13-7', [data], fmt='%15.9f', header='13-7')
>>>>
>>>> The problem lies in the np.savetxt line; on execution I get:
>>>>
>>>>> runfile('/home/comp/Apps/Models/1-PhosphorusLigands/CombinedLigands/MOL/Docking/VINA20/SmallSet/DeltaGTable_V_s.py',
>>>>> wdir='/home/comp/Apps/Models/1-PhosphorusLigands/CombinedLigands/MOL/Docking/VINA20/SmallSet',
>>>>> current_namespace=True)
>>>>> ['-8.839713733' '-8.743377250' '-8.151051167' '-8.090452911'
>>>>> '-7.967494477' '-7.854890056' '-7.757417879' '-7.741557490'
>>>>> '-7.643885488' '-7.611595767' '-7.507605524' '-7.413920814'
>>>>> '-7.389408331' '-7.384446364' '-7.374206276' '-7.368808179'
>>>>> '-7.346641418' '-7.325037898' '-7.309614787' '-7.113209147']
>>>>> Traceback (most recent call last):
>>>>>
>>>>>   File
>>>>> "/home/comp/Apps/Models/1-PhosphorusLigands/CombinedLigands/MOL/Docking/VINA20/SmallSet/DeltaGTable_V_s.py",
>>>>> line 16, in <module>
>>>>>     np.savetxt('13-7', [data], fmt='%16.9f', header='13-7')
>>>>>
>>>>>   File "<__array_function__ internals>", line 6, in savetxt
>>>>>
>>>>>   File
>>>>> "/home/comp/Apps/Miniconda3/lib/python3.7/site-packages/numpy/lib/npyio.py",
>>>>> line 1438, in savetxt
>>>>>     % (str(X.dtype), format))
>>>>>
>>>>> TypeError: Mismatch between array dtype ('<U12') and format specifier
>>>>> ('%16.9f %16.9f %16.9f %16.9f %16.9f %16.9f %16.9f %16.9f %16.9f
>>>>> %16.9f %16.9f %16.9f %16.9f %16.9f %16.9f %16.9f %16.9f %16.9f %16.9f
>>>>> %16.9f')
>>>>
>>>> The data is in the data file, but the only entry in '13-7', the saved
>>>> file, is the label. Obviously, the error is in the format argument.
>>>
>>> Hi,
>>>
>>> One problem is the format: the error is telling you that you have
>>> strings in your array (compare the `'<U12'` in the error with the
>>> output of your `print(data)` call, which has strings inside), whereas
>>> %16.9f can only be used to format floats (f for float). You would
>>> first have to convert your array of strings to an array of numbers. I
>>> don't usually use genfromtxt so I'm not sure how you can make it
>>> return floats for you in the first place, but I suspect `dtype=None`
>>> in the call to genfromtxt might be responsible. In any case making it
>>> return numbers should be the easier case.
>>> The second problem is that you should make sure you mean `[data]` in
>>> the call to savetxt. As it is now this would give you a 2d array of
>>> shape (1, 20), and the output would correspondingly contain a single
>>> row of 20 values (hence the 20 instances of '%16.9f' in the error
>>> message). In case you meant to print one value per row in a single
>>> column, you should drop the brackets around `data`:
>>> np.savetxt('13-7', data, fmt='%16.9f', header='13-7')
>>>
>>> And just a personal note, but I'd find an output file named '13-7' to
>>> be a bit surprising. Perhaps some extension or prefix would help
>>> organize these files?
>>> Regards,
>>>
>>> András
>>>
>>>> Help will be much appreciated.
>>>>
>>>> Thanks in advance.
>>>>
>>>> --
>>>> Stephen P. Molnar, Ph.D.
>>>> www.molecular-modeling.net
>>>> 614.312.7528 (c)
>>>> Skype: smolnar1
>>>>
>>>> _______________________________________________
>>>> NumPy-Discussion mailing list
>>>> NumPy-Discussion at python.org
>>>> https://mail.python.org/mailman/listinfo/numpy-discussion
>> _______________________________________________
>> NumPy-Discussion mailing list
>> NumPy-Discussion at python.org
>> https://mail.python.org/mailman/listinfo/numpy-discussion
>

--
Stephen P. Molnar, Ph.D.
www.molecular-modeling.net
614.312.7528 (c)
Skype: smolnar1

From wieser.eric+numpy at gmail.com  Thu Oct 10 10:22:46 2019
From: wieser.eric+numpy at gmail.com (Eric Wieser)
Date: Thu, 10 Oct 2019 15:22:46 +0100
Subject: [Numpy-discussion] Problem with np.savetxt
In-Reply-To: <5D9F3BF2.5020008@sbcglobal.net>
References: <5D9C8C42.8010006@sbcglobal.net> <5D9CA1FB.4020607@sbcglobal.net>
 <5D9F3BF2.5020008@sbcglobal.net>
Message-ID: 

You're trying to read a file with a name of literally `${d}.log`, which is
unlikely to be the name of your file. `${}` is bash syntax, not python
syntax.

This has drifted out of numpy territory and into "how to coordinate
between bash and python" territory - I'd perhaps recommend you ask this to
a wider python audience on StackOverflow, where you'll get a faster
response.

Eric
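For illustration, a minimal sketch of that bash/Python coordination is to
pass the name in via ``sys.argv``. The script and file names follow the
ones used in this thread; this sketch is an editorial illustration, not
code from the original message:

    # DeltaGTable_V_sl.py (sketch)
    import sys
    import numpy as np

    # The bash loop would invoke it as: python3 DeltaGTable_V_sl.py "${d}"
    stem = sys.argv[1]

    # Read column 1 of e.g. 14-7.log, skipping the banner and final line.
    data = np.genfromtxt(stem + '.log', usecols=(1),
                         skip_header=27, skip_footer=1, encoding=None)

    # Write e.g. 14-7.dG with the stem as the header line.
    np.savetxt(stem + '.dG', data, fmt='%12.9f', header=stem)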
On Thu, 10 Oct 2019 at 15:11, Stephen P. Molnar wrote:

> I am slowly and not quickly stumbling forward, but at this point my
> degree of mental entropy (confusion) is monumental.
>
> This works:
>
> > import numpy as np
> >
> > print('${d}')
> >
> > data = np.genfromtxt("14-7.log", usecols=(1), skip_header=27,
> > skip_footer=1, encoding=None)
> >
> > print(data)
> >
> > np.savetxt('14-7.dG', data, fmt='%12.9f', header='14-7')
> > print(data)
>
> which produces:
>
> > runfile('/home/comp/Apps/Python/PsoVina/DeltaGTable_V_s.py',
> > wdir='/home/comp/Apps/Python/PsoVina', current_namespace=True)
> > ${d}
> > [-9.96090267 -8.97950478 -8.94261136 -8.91552301 -8.73650883 -8.66338714
> >  -8.41073971 -8.38914635 -8.29679891 -8.16845411 -8.12799082 -8.12710377
> >  -7.97909074 -7.94187268 -7.90076621 -7.88148523 -7.83782648 -7.8159095
> >  -7.72254029 -7.72034674]
> > [-9.96090267 -8.97950478 -8.94261136 -8.91552301 -8.73650883 -8.66338714
> >  -8.41073971 -8.38914635 -8.29679891 -8.16845411 -8.12799082 -8.12710377
> >  -7.97909074 -7.94187268 -7.90076621 -7.88148523 -7.83782648 -7.8159095
> >  -7.72254029 -7.72034674]
>
> Note: the print statements are for a quick check of the output, which is:
>
> > # 14-7
> > -9.960902669
> > -8.979504781
> > -8.942611364
> > -8.915523010
> > -8.736508831
> > -8.663387139
> > -8.410739711
> > -8.389146347
> > -8.296798909
> > -8.168454106
> > -8.127990818
> > -8.127103774
> > -7.979090739
> > -7.941872682
> > -7.900766215
> > -7.881485228
> > -7.837826485
> > -7.815909505
> > -7.722540286
> > -7.720346742
>
> Also, this bash script works:
>
> > #!/bin/bash
> >
> > # Run.dG.list_1
> >
> > while IFS= read -r d
> > do
> >   echo "${d}.log"
> >
> > done
>
> which returns the log file names:
>
> > 14-7.log
> > 15-7.log
> > 18-7.log
> > C-VX3.log
>
> But, if I run this bash script:
>
> > #!/bin/bash
> >
> > # Run.dG.list_1
> >
> > while IFS= read -r d
> > do
> >   echo "${d}.log"
> >   python3 DeltaGTable_V_sl.py
> >
> > done
>
> where DeltaGTable_V_sl.py is:
>
> > import numpy as np
> >
> > print('${d}')
> >
> > data = np.genfromtxt('${d}.log', usecols=(1), skip_header=27,
> > skip_footer=1, encoding=None)
> > print(data)
> >
> > np.savetxt('${d}.dG', data, fmt='%12.9f', header='${d}')
> > print(data.dG)
>
> I get:
>
> > (base) comp at AbNormal:~/Apps/Python/PsoVina$ sh ./Run.dG.list_1.sh
> > 14-7.log
> > python3: can't open file 'DeltaGTable_V_sl.py': [Errno 2] No such file
> > or directory
> > 15-7.log
> > python3: can't open file 'DeltaGTable_V_sl.py': [Errno 2] No such file
> > or directory
> > 18-7.log
> > python3: can't open file 'DeltaGTable_V_sl.py': [Errno 2] No such file
> > or directory
> > C-VX3.log
> > python3: can't open file 'DeltaGTable_V_sl.py': [Errno 2] No such file
> > or directory
>
> So, it would appear that the log file labels are in the workspace, but
> '${d}.log' is not being recognized as fname by genfromtxt. Although I
> have googled every combination of search terms I can think of, I am
> obviously missing something.
>
> As I have potentially hundreds of files to process, I would appreciate
> pointers towards a solution to the problem.
>
> Thanks in advance.
>
> On 10/08/2019 10:49 AM, Stephen P. Molnar wrote:
> > Many thanks for your kind replies.
> >
> > I really appreciate your suggestions.
> >
> > On 10/08/2019 09:44 AM, Andras Deak wrote:
> >> PS. if you just want to specify the width of the fields you wouldn't
> >> have to convert anything, because you can specify the size and
> >> justification of a %s format. But arguably having float data as floats
> >> is more natural anyway.
> >>
> >> On Tue, Oct 8, 2019 at 3:42 PM Andras Deak wrote:
> >>> On Tue, Oct 8, 2019 at 3:17 PM Stephen P. Molnar wrote:
> >>>> I am embarrassed to be asking this question, but I have exhausted
> >>>> Google at this point.
> >>>>
> >>>> I have a number of identically formatted text files from which I
> >>>> want to extract data, as an example (hopefully, putting these in as
> >>>> quotes will preserve the format):
> >>>>
> >>>>> =======================================================================
> >>>>> PSOVina version 2.0
> >>>>> Giotto H. K. Tai & Shirley W. I. Siu
> >>>>>
> >>>>> Computational Biology and Bioinformatics Lab
> >>>>> University of Macau
> >>>>>
> >>>>> Visit http://cbbio.cis.umac.mo for more information.
> >>>>>
> >>>>> PSOVina was developed based on the framework of AutoDock Vina.
> >>>>>
> >>>>> For more information about Vina, please visit http://vina.scripps.edu.
> >>>>> =======================================================================
> >>>>>
> >>>>> Output will be 13-7_out.pdbqt
> >>>>> Reading input ... done.
> >>>>> Setting up the scoring function ... done.
> >>>>> Analyzing the binding site ... done.
> >>>>> Using random seed: 1828390527
> >>>>> Performing search ... done.
> >>>>>
> >>>>> Refining results ... done.
> >>>>>
> >>>>> mode |   affinity | dist from best mode
> >>>>>      | (kcal/mol) | rmsd l.b.| rmsd u.b.
> >>>>> -----+------------+----------+----------
> >>>>>    1   -8.862004149      0.000      0.000
> >>>>>    2   -8.403522829      2.992      6.553
> >>>>>    3   -8.401384636      2.707      5.220
> >>>>>    4   -7.886402037      4.907      6.862
> >>>>>    5   -7.845519031      3.233      5.915
> >>>>>    6   -7.837434227      3.954      5.641
> >>>>>    7   -7.834584887      3.188      7.294
> >>>>>    8   -7.694395765      3.746      7.553
> >>>>>    9   -7.691211177      3.536      5.745
> >>>>>   10   -7.670759445      3.698      7.587
> >>>>>   11   -7.661882758      4.882      7.044
> >>>>>   12   -7.636280303      2.347      3.284
> >>>>>   13   -7.635788052      3.511      6.250
> >>>>>   14   -7.611175249      2.427      3.449
> >>>>>   15   -7.586368357      2.142      2.864
> >>>>>   16   -7.531307666      2.976      4.980
> >>>>>   17   -7.520501084      3.085      5.775
> >>>>>   18   -7.512906514      4.220      7.672
> >>>>>   19   -7.307403528      3.240      4.354
> >>>>>   20   -7.256063348      3.694      7.252
> >>>>> Writing output ... done.
> >>>> At this point, my python script consists of only the following:
> >>>>
> >>>>> #!/usr/bin/env python3
> >>>>> # -*- coding: utf-8 -*-
> >>>>> """
> >>>>>
> >>>>> Created on Tue Sep 24 07:51:11 2019
> >>>>>
> >>>>> """
> >>>>> import numpy as np
> >>>>>
> >>>>> data = []
> >>>>>
> >>>>> data = np.genfromtxt("13-7.log", usecols=(1), dtype=None,
> >>>>> skip_header=27, skip_footer=1, encoding=None)
> >>>>>
> >>>>> print(data)
> >>>>>
> >>>>> np.savetxt('13-7', [data], fmt='%15.9f', header='13-7')
> >>>>
> >>>> The problem lies in the np.savetxt line; on execution I get:
> >>>>
> >>>>> runfile('/home/comp/Apps/Models/1-PhosphorusLigands/CombinedLigands/MOL/Docking/VINA20/SmallSet/DeltaGTable_V_s.py',
> >>>>> wdir='/home/comp/Apps/Models/1-PhosphorusLigands/CombinedLigands/MOL/Docking/VINA20/SmallSet',
> >>>>> current_namespace=True)
> >>>>> ['-8.839713733' '-8.743377250' '-8.151051167' '-8.090452911'
> >>>>> '-7.967494477' '-7.854890056' '-7.757417879' '-7.741557490'
> >>>>> '-7.643885488' '-7.611595767' '-7.507605524' '-7.413920814'
> >>>>> '-7.389408331' '-7.384446364' '-7.374206276' '-7.368808179'
> >>>>> '-7.346641418' '-7.325037898' '-7.309614787' '-7.113209147']
> >>>>> Traceback (most recent call last):
> >>>>>
> >>>>>   File
> >>>>> "/home/comp/Apps/Models/1-PhosphorusLigands/CombinedLigands/MOL/Docking/VINA20/SmallSet/DeltaGTable_V_s.py",
> >>>>> line 16, in <module>
> >>>>>     np.savetxt('13-7', [data], fmt='%16.9f', header='13-7')
> >>>>>
> >>>>>   File "<__array_function__ internals>", line 6, in savetxt
> >>>>>
> >>>>>   File
> >>>>> "/home/comp/Apps/Miniconda3/lib/python3.7/site-packages/numpy/lib/npyio.py",
> >>>>> line 1438, in savetxt
> >>>>>     % (str(X.dtype), format))
> >>>>>
> >>>>> TypeError: Mismatch between array dtype ('<U12') and format specifier
> >>>>> ('%16.9f %16.9f %16.9f %16.9f %16.9f %16.9f %16.9f %16.9f %16.9f
> >>>>> %16.9f %16.9f %16.9f %16.9f %16.9f %16.9f %16.9f %16.9f %16.9f %16.9f
> >>>>> %16.9f')
> >>>> The data is in the data file, but the only entry in '13-7', the saved
> >>>> file, is the label. Obviously, the error is in the format argument.
> >>> Hi,
> >>>
> >>> One problem is the format: the error is telling you that you have
> >>> strings in your array (compare the `'<U12'` in the error with the
> >>> output of your `print(data)` call, which has strings inside), whereas
> >>> %16.9f can only be used to format floats (f for float). You would
> >>> first have to convert your array of strings to an array of numbers. I
> >>> don't usually use genfromtxt so I'm not sure how you can make it
> >>> return floats for you in the first place, but I suspect `dtype=None`
> >>> in the call to genfromtxt might be responsible. In any case making it
> >>> return numbers should be the easier case.
> >>> The second problem is that you should make sure you mean `[data]` in
> >>> the call to savetxt. As it is now this would give you a 2d array of
> >>> shape (1, 20), and the output would correspondingly contain a single
> >>> row of 20 values (hence the 20 instances of '%16.9f' in the error
> >>> message). In case you meant to print one value per row in a single
> >>> column, you should drop the brackets around `data`:
> >>> np.savetxt('13-7', data, fmt='%16.9f', header='13-7')
> >>>
> >>> And just a personal note, but I'd find an output file named '13-7' to
> >>> be a bit surprising. Perhaps some extension or prefix would help
> >>> organize these files?
> >>> Regards,
> >>>
> >>> András
> >>>
> >>>> Help will be much appreciated.
> >>>>
> >>>> Thanks in advance.
> >>>>
> >>>> --
> >>>> Stephen P. Molnar, Ph.D.
> >>>> www.molecular-modeling.net
> >>>> 614.312.7528 (c)
> >>>> Skype: smolnar1
> >>>>
> >>>> _______________________________________________
> >>>> NumPy-Discussion mailing list
> >>>> NumPy-Discussion at python.org
> >>>> https://mail.python.org/mailman/listinfo/numpy-discussion
> >> _______________________________________________
> >> NumPy-Discussion mailing list
> >> NumPy-Discussion at python.org
> >> https://mail.python.org/mailman/listinfo/numpy-discussion
> >
>
> --
> Stephen P. Molnar, Ph.D.
> www.molecular-modeling.net
> 614.312.7528 (c)
> Skype: smolnar1
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From matti.picus at gmail.com  Thu Oct 10 11:31:13 2019
From: matti.picus at gmail.com (Matti Picus)
Date: Thu, 10 Oct 2019 18:31:13 +0300
Subject: [Numpy-discussion] Unsupporting python3.5
Message-ID: <21b78142-1733-1c3d-f64d-5141470c8b11@gmail.com>

An HTML attachment was scrubbed...
URL: 

From charlesr.harris at gmail.com  Thu Oct 10 12:34:55 2019
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Thu, 10 Oct 2019 10:34:55 -0600
Subject: [Numpy-discussion] Unsupporting python3.5
In-Reply-To: <21b78142-1733-1c3d-f64d-5141470c8b11@gmail.com>
References: <21b78142-1733-1c3d-f64d-5141470c8b11@gmail.com>
Message-ID: 

On Thu, Oct 10, 2019 at 9:31 AM Matti Picus wrote:

> According to NEP 29, numpy 1.18 will be released after Sept 2019, which
> as I understand it is the cutoff for Python 3.5. In PR 14673 I proposed
> removing it from the test matrix and also removing some shims in the code
> to support it - meaning that in order to use Numpy 1.18+ you will need to
> use Python 3.6+. Is this the intention of the NEP or is the intention
> only that we no longer test it and no longer supply wheels? If the
> latter, at what point do we remove support code for unsupported versions?
>
> Note the cost of this code is negligible, it is more a question of what
> is the correct/desired approach.
>

I think we can support 3.5 as long as we please, the question is how long
we *want* to support it. I don't plan to release 1.18 wheels for 3.5, but
I'm concerned about making 1.18 outright incompatible with 3.5. I would
like to see the random interface settle before we do that. So my
preference would be to drop 3.5 in 1.19 with a future warning in the 1.18
release notes. Alternatively, we could backport all the random changes to
1.17, but I would rather not do that.

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 
This works: > import numpy as np > > print('${d}') > > data = np.genfromtxt("14-7.log", usecols=(1), skip_header=27, > skip_footer=1, encoding=None) > > print(data) > > np.savetxt('14-7.dG', data, fmt='%12.9f', header='14-7') > print(data) which produces: > runfile('/home/comp/Apps/Python/PsoVina/DeltaGTable_V_s.py', > wdir='/home/comp/Apps/Python/PsoVina', current_namespace=True) > ${d} > [-9.96090267 -8.97950478 -8.94261136 -8.91552301 -8.73650883 -8.66338714 > -8.41073971 -8.38914635 -8.29679891 -8.16845411 -8.12799082 -8.12710377 > -7.97909074 -7.94187268 -7.90076621 -7.88148523 -7.83782648 -7.8159095 > -7.72254029 -7.72034674] > [-9.96090267 -8.97950478 -8.94261136 -8.91552301 -8.73650883 -8.66338714 > -8.41073971 -8.38914635 -8.29679891 -8.16845411 -8.12799082 -8.12710377 > -7.97909074 -7.94187268 -7.90076621 -7.88148523 -7.83782648 -7.8159095 > -7.72254029 -7.72034674] Note; the print statements are for a quick check o the output, which is: > # 14-7 > -9.960902669 > -8.979504781 > -8.942611364 > -8.915523010 > -8.736508831 > -8.663387139 > -8.410739711 > -8.389146347 > -8.296798909 > -8.168454106 > -8.127990818 > -8.127103774 > -7.979090739 > -7.941872682 > -7.900766215 > -7.881485228 > -7.837826485 > -7.815909505 > -7.722540286 > -7.720346742 Also, this bash script works: > #!/bin/bash > > # Run.dG.list_1 > > while IFS= read -r d > do > echo "${d}.log" > > done 14-7.log > 15-7.log > 18-7.log > C-VX3.log But, if I run this bash script: > #!/bin/bash > > # Run.dG.list_1 > > while IFS= read -r d > do > echo "${d}.log" > python3 DeltaGTable_V_sl.py > > > done where DeltaGTable_V_sl.py is: > import numpy as np > > print('${d}') > > data = np.genfromtxt('${d}.log', usecols=(1), skip_header=27, > skip_footer=1, encoding=None) > print(data) > > np.savetxt('${d}.dG', data, fmt='%12.9f', header='${d}') > print(data.dG) I get: > (base) comp at AbNormal:~/Apps/Python/PsoVina$ sh ./Run.dG.list_1.sh > 14-7.log > python3: can't open file 'DeltaGTable_V_sl.py': [Errno 2] No such file > or directory > 15-7.log > python3: can't open file 'DeltaGTable_V_sl.py': [Errno 2] No such file > or directory > 18-7.log > python3: can't open file 'DeltaGTable_V_sl.py': [Errno 2] No such file > or directory > C-VX3.log > python3: can't open file 'DeltaGTable_V_sl.py': [Errno 2] No such file > or directory So, it would appear that the log file labels are in the workspace, but '${d}.log' is not being recognized as fname by genfromtxt. Although i have googled every combination of terms I can think of I am obviously missing something. As I have potentially hundreds of files to process, I would appreciate pointers towards a solution to the problem. Thanks in advance. On 10/08/2019 10:49 AM, Stephen P. Molnar wrote: > Many thanks or your kind replies. > > I really appreciate your suggestions. > > On 10/08/2019 09:44 AM, Andras Deak wrote: >> PS. if you just want to specify the width of the fields you wouldn't >> have to convert anything, because you can specify the size and >> justification of a %s format. But arguably having float data as floats >> is more natural anyway. >> >> On Tue, Oct 8, 2019 at 3:42 PM Andras Deak >> wrote: >>> On Tue, Oct 8, 2019 at 3:17 PM Stephen P. Molnar >>> wrote: >>>> I am embarrassed to be asking this question, but I have exhausted >>>> Google >>>> at this point . 
>>>> >>>> I have a number of identically formatted text files from which I >>>> want to >>>> extract data, as an example (hopefully, putting these in as quotes >>>> will >>>> persevere the format): >>>> >>>>> ======================================================================= >>>>> >>>>> PSOVina version 2.0 >>>>> Giotto H. K. Tai & Shirley W. I. Siu >>>>> >>>>> Computational Biology and Bioinformatics Lab >>>>> University of Macau >>>>> >>>>> Visit http://cbbio.cis.umac.mo for more information. >>>>> >>>>> PSOVina was developed based on the framework of AutoDock Vina. >>>>> >>>>> For more information about Vina, please visit >>>>> http://vina.scripps.edu. >>>>> >>>>> ======================================================================= >>>>> >>>>> >>>>> Output will be 13-7_out.pdbqt >>>>> Reading input ... done. >>>>> Setting up the scoring function ... done. >>>>> Analyzing the binding site ... done. >>>>> Using random seed: 1828390527 >>>>> Performing search ... done. >>>>> >>>>> Refining results ... done. >>>>> >>>>> mode | affinity | dist from best mode >>>>> | (kcal/mol) | rmsd l.b.| rmsd u.b. >>>>> -----+------------+----------+---------- >>>>> 1 -8.862004149 0.000 0.000 >>>>> 2 -8.403522829 2.992 6.553 >>>>> 3 -8.401384636 2.707 5.220 >>>>> 4 -7.886402037 4.907 6.862 >>>>> 5 -7.845519031 3.233 5.915 >>>>> 6 -7.837434227 3.954 5.641 >>>>> 7 -7.834584887 3.188 7.294 >>>>> 8 -7.694395765 3.746 7.553 >>>>> 9 -7.691211177 3.536 5.745 >>>>> 10 -7.670759445 3.698 7.587 >>>>> 11 -7.661882758 4.882 7.044 >>>>> 12 -7.636280303 2.347 3.284 >>>>> 13 -7.635788052 3.511 6.250 >>>>> 14 -7.611175249 2.427 3.449 >>>>> 15 -7.586368357 2.142 2.864 >>>>> 16 -7.531307666 2.976 4.980 >>>>> 17 -7.520501084 3.085 5.775 >>>>> 18 -7.512906514 4.220 7.672 >>>>> 19 -7.307403528 3.240 4.354 >>>>> 20 -7.256063348 3.694 7.252 >>>>> Writing output ... done. 
>>>> At this point, my python script consists of only the following:
>>>>
>>>>> #!/usr/bin/env python3
>>>>> # -*- coding: utf-8 -*-
>>>>> """
>>>>>
>>>>> Created on Tue Sep 24 07:51:11 2019
>>>>>
>>>>> """
>>>>> import numpy as np
>>>>>
>>>>> data = []
>>>>>
>>>>> data = np.genfromtxt("13-7.log", usecols=(1), dtype=None,
>>>>> skip_header=27, skip_footer=1, encoding=None)
>>>>>
>>>>> print(data)
>>>>>
>>>>> np.savetxt('13-7', [data], fmt='%15.9f', header='13-7')
>>>> The problem lies in the np.savetxt line; on execution I get:
>>>>
>>>>> runfile('/home/comp/Apps/Models/1-PhosphorusLigands/CombinedLigands/MOL/Docking/VINA20/SmallSet/DeltaGTable_V_s.py',
>>>>>
>>>>> wdir='/home/comp/Apps/Models/1-PhosphorusLigands/CombinedLigands/MOL/Docking/VINA20/SmallSet',
>>>>>
>>>>> current_namespace=True)
>>>>> ['-8.839713733' '-8.743377250' '-8.151051167' '-8.090452911'
>>>>> '-7.967494477' '-7.854890056' '-7.757417879' '-7.741557490'
>>>>> '-7.643885488' '-7.611595767' '-7.507605524' '-7.413920814'
>>>>> '-7.389408331' '-7.384446364' '-7.374206276' '-7.368808179'
>>>>> '-7.346641418' '-7.325037898' '-7.309614787' '-7.113209147']
>>>>> Traceback (most recent call last):
>>>>>
>>>>>   File
>>>>> "/home/comp/Apps/Models/1-PhosphorusLigands/CombinedLigands/MOL/Docking/VINA20/SmallSet/DeltaGTable_V_s.py",
>>>>> line 16, in <module>
>>>>>     np.savetxt('13-7', [data], fmt='%16.9f', header='13-7')
>>>>>
>>>>>   File "<__array_function__ internals>", line 6, in savetxt
>>>>>
>>>>>   File
>>>>> "/home/comp/Apps/Miniconda3/lib/python3.7/site-packages/numpy/lib/npyio.py",
>>>>> line 1438, in savetxt
>>>>>     % (str(X.dtype), format))
>>>>>
>>>>> TypeError: Mismatch between array dtype ('<U12') and format specifier
>>>>> ('%16.9f %16.9f %16.9f %16.9f %16.9f %16.9f %16.9f %16.9f %16.9f
>>>>> %16.9f %16.9f %16.9f %16.9f %16.9f %16.9f %16.9f %16.9f %16.9f %16.9f
>>>>> %16.9f')
>>>> The data is in the data file, but the only entry in '13-7', the saved
>>>> file, is the label. Obviously, the error is in the format argument.
>>> Hi,
>>>
>>> One problem is the format: the error is telling you that you have
>>> strings in your array (compare the `'<U12'` dtype in
>>> your `print(data)` call with strings inside), whereas %16.9f can only
>>> be used to format floats (f for float). You would first have to
>>> convert your array of strings to an array of numbers. I don't usually
>>> use genfromtxt so I'm not sure how you can make it return floats for
>>> you in the first place, but I suspect `dtype=None` in the call to
>>> genfromtxt might be responsible. In any case making it return numbers
>>> should be the easier case.
>>> The second problem is that you should make sure you mean `[data]` in
>>> the call to savetxt. As it is now this would give you a 2d array of
>>> shape (1, 20), and the output would correspondingly contain a single
>>> row of 20 values (hence the 20 instances of '%16.9f' in the error
>>> message). In case you meant to print one value per row in a single
>>> column, you should drop the brackets around `data`:
>>> np.savetxt('13-7', data, fmt='%16.9f', header='13-7')
>>>
>>> And just a personal note, but I'd find an output file named '13-7' to
>>> be a bit surprising. Perhaps some extension or prefix would help
>>> organize these files?
>>> Regards,
>>>
>>> András
>>>
>>>> Help will be much appreciated.
>>>>
>>>> Thanks in advance.
>>>>
>>>> --
>>>> Stephen P. Molnar, Ph.D.
>>>> www.molecular-modeling.net
>>>> 614.312.7528 (c)
>>>> Skype: smolnar1
>>>>
>>>> _______________________________________________
>>>> NumPy-Discussion mailing list
>>>> NumPy-Discussion at python.org
>>>> https://mail.python.org/mailman/listinfo/numpy-discussion
>> _______________________________________________
>> NumPy-Discussion mailing list
>> NumPy-Discussion at python.org
>> https://mail.python.org/mailman/listinfo/numpy-discussion
>

--
Stephen P. Molnar, Ph.D.
www.molecular-modeling.net
614.312.7528 (c)
Skype: smolnar1

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From stefanv at berkeley.edu  Thu Oct 10 20:01:43 2019
From: stefanv at berkeley.edu (Stefan van der Walt)
Date: Thu, 10 Oct 2019 17:01:43 -0700
Subject: [Numpy-discussion] Unsupporting python3.5
In-Reply-To:
References: <21b78142-1733-1c3d-f64d-5141470c8b11@gmail.com>
Message-ID: <2a7e6e0f-6b81-40f2-87be-d64b52dfa07c@www.fastmail.com>

On Thu, Oct 10, 2019, at 09:34, Charles R Harris wrote:

I think we can support 3.5 as long as we please, the question is how long
we *want* to support it. I don't plan to release 1.18 wheels for 3.5, but
I'm concerned about making 1.18 outright incompatible with 3.5. I would
like to see the random interface settle before we do that. So my preference
would be to drop 3.5 in 1.19 with a future warning in the 1.18 release
notes. Alternatively, we could backport all the random changes to 1.17, but
I would rather not do that.

The language in the NEP is:

"we recommend that they support at least all minor versions of Python
introduced and released in the prior 42 months"

i.e., a lower bound for support, such that compatibility with more
versions of Python is perfectly OK and, in the case of NumPy, probably
encouraged. The NEP is meant to lift the burden of very long support
cycles from smaller projects.

Stéfan

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From charlesr.harris at gmail.com  Thu Oct 10 22:17:37 2019
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Thu, 10 Oct 2019 20:17:37 -0600
Subject: [Numpy-discussion] Unsupporting python3.5
In-Reply-To: <2a7e6e0f-6b81-40f2-87be-d64b52dfa07c@www.fastmail.com>
References: <21b78142-1733-1c3d-f64d-5141470c8b11@gmail.com>
 <2a7e6e0f-6b81-40f2-87be-d64b52dfa07c@www.fastmail.com>
Message-ID:

On Thu, Oct 10, 2019 at 6:02 PM Stefan van der Walt wrote:

> On Thu, Oct 10, 2019, at 09:34, Charles R Harris wrote:
>
> I think we can support 3.5 as long as we please, the question is how long
> we *want* to support it. I don't plan to release 1.18 wheels for 3.5, but
> I'm concerned about making 1.18 outright incompatible with 3.5. I would
> like to see the random interface settle before we do that. So my preference
> would be to drop 3.5 in 1.19 with a future warning in the 1.18 release
> notes. Alternatively, we could backport all the random changes to 1.17, but
> I would rather not do that.
>
>
> The language in the NEP is:
>
> "we recommend that they support at least all minor versions of Python
> introduced and released in the prior 42 months"
>
> i.e., a lower bound for support, such that compatibility with more
> versions of Python is perfectly OK and, in the case of NumPy, probably
> encouraged. The NEP is meant to lift the burden of very long support
> cycles from smaller projects.
>
> Stéfan
>
>
The 1.18.0rc1 is about one month out, so we should spend some effort on
those PRs and issues with the 1.18 milestone.
Dealing with issues and milestones, plus putting together the release
notes, is the major pain point in making releases these days.

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From charlesr.harris at gmail.com  Thu Oct 10 22:21:27 2019
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Thu, 10 Oct 2019 20:21:27 -0600
Subject: [Numpy-discussion] "Spring cleaning" sprint
In-Reply-To:
References: <81ed0f30ab4f83fa28557b0032d447f1e36922ec.camel@sipsolutions.net>
Message-ID:

On Wed, Oct 9, 2019 at 10:44 PM Sebastian Berg wrote:

> On Wed, 2019-10-09 at 21:26 -0700, Zijie Poh wrote:
> > Hi Sebastian,
> >
> > It is Tuesday October 15 or Monday October 14?
> >
>
> Sorry, it's Tuesday the 15th [0]. Monday is a holiday in California at
> least.
>
> Cheers,
>
> Sebastian
>
>
> [0] Probably happens to me because I am still used to weeks starting
> with Monday and not Sunday (as my calendar now shows).
>
>
> > Regards,
> > ZJ
> >
> > On Wed, Oct 9, 2019 at 5:57 PM Sebastian Berg <
> > sebastian at sipsolutions.net> wrote:
> > > Hi all,
> > >
> > > we are planning a remote sprint to try to reduce the number of
> > > open
> > > PRs and issues, and do triage work.
> > >
> > > This is planned for next Tuesday, October 14th, between 9:00 and
> > > 15:00
> > > Pacific time. And everyone is invited to join in for as long as you
> > > wish.
> > >
> > > I assume we will have a video chat up and running at:
> > > https://berkeley.zoom.us/j/762261535
> > >
> > > However, we will send a reminder/logistics email when we start with
> > > the
> > > sprint.
> > >
> > > Cheers,
> > >
> > > Sebastian

I would like to suggest that we concentrate on the issues and PRs with
the 1.18 milestone. The 1.18.0rc1 should come out sometime in the last
half of November and hopefully the final release will be before the end
of the year.

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From s.molnar at sbcglobal.net  Fri Oct 11 12:41:04 2019
From: s.molnar at sbcglobal.net (Stephen P. Molnar)
Date: Fri, 11 Oct 2019 12:41:04 -0400
Subject: [Numpy-discussion] np.genfromtxt StopIteration Error
Message-ID: <5DA0B0A0.4070007@sbcglobal.net>

I have been fighting with the genfromtxt function in numpy for a while
now and am trying a slightly different approach.

Here is the code:

>
> import os
> import glob
> import numpy as np
>
> fileList = []
> filesList = []
>
> for files in glob.glob("*.log"):
>     fileName, fileExtension = os.path.splitext(files)
>     fileList.append(fileName)
>     filesList.append(files)
>
> print('fileList = ', fileList)
> print('filesList = ', filesList)
>
> fname = filesList
> print('fname = ', fname)
> data = np.genfromtxt(fname, usecols=(1), skip_header=27,
> skip_footer=1, encoding=None)
> print(data)
>
> np.savetxt('fileList.dG', data, fmt='%12.9f', header='${d}')
> print(data.dG)

I am using the Spyder IDE which has a variable explorer which shows:

filesList = ['C-VX3.log', '18-7.log', '14-7.log', '15-7.log']
fileList = ['C-VX3', '18-7', '14-7', '15-7']

so the lists that genfromtxt needs are being generated.

Googling 'numpy genfromtxt stopiteration error' does not seem to address
this problem. At least, I didn't find anything that I thought applied.

I would greatly appreciate some assistance here.

Thanks in advance.

--
Stephen P. Molnar, Ph.D.
www.molecular-modeling.net
614.312.7528 (c)
Skype: smolnar1

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From bennet at umich.edu  Fri Oct 11 13:12:05 2019
From: bennet at umich.edu (Bennet Fauber)
Date: Fri, 11 Oct 2019 13:12:05 -0400
Subject: [Numpy-discussion] np.genfromtxt StopIteration Error
In-Reply-To: <5DA0B0A0.4070007@sbcglobal.net>
References: <5DA0B0A0.4070007@sbcglobal.net>
Message-ID:

I think genfromtxt() wants a filename as the first argument, and you
have to tell it the entries in the file are strings not numerics.

test.py
------------------------------------------
import os
import glob
import numpy as np

fileList = []
filesList = []

for files in glob.glob("*.log"):
    fileName, fileExtension = os.path.splitext(files)
    fileList.append(fileName)
    filesList.append(files)

print('fileList = ', fileList)
print('filesList = ', filesList)

fname = '/tmp/foo.txt'
print('fname = ', fname)
data = np.genfromtxt(fname, dtype=str)
print(data)
------------------------------------------

Contents of /tmp/foo.txt
------------------------------------------
15-7.log
18-7.log
14-7.log
C-VX3.log
------------------------------------------

Sample run

$ python --version
Python 2.7.15+

$ python t.py
('fileList = ', ['15-7', '18-7', '14-7', 'C-VX3'])
('filesList = ', ['15-7.log', '18-7.log', '14-7.log', 'C-VX3.log'])
('fname = ', '/tmp/foo.txt')
['15-7.log' '18-7.log' '14-7.log' 'C-VX3.log']

Is that any help?

On Fri, Oct 11, 2019 at 12:41 PM Stephen P. Molnar
wrote:
>
> I have been fighting with the genfromtxt function in numpy for a while now and am trying a slightly different approach.
>
> Here is the code:
>
>
> import os
> import glob
> import numpy as np
>
> fileList = []
> filesList = []
>
> for files in glob.glob("*.log"):
>     fileName, fileExtension = os.path.splitext(files)
>     fileList.append(fileName)
>     filesList.append(files)
>
> print('fileList = ', fileList)
> print('filesList = ', filesList)
>
> fname = filesList
> print('fname = ', fname)
> data = np.genfromtxt(fname, usecols=(1), skip_header=27, skip_footer=1, encoding=None)
> print(data)
>
> np.savetxt('fileList.dG', data, fmt='%12.9f', header='${d}')
> print(data.dG)
>
> I am using the Spyder IDE which has a variable explorer which shows:
>
> filesList = ['C-VX3.log', '18-7.log', '14-7.log', '15-7.log']
> fileList = ['C-VX3', '18-7', '14-7', '15-7']
>
> so the lists that genfromtxt needs are being generated.
>
> Googling 'numpy genfromtxt stopiteration error' does not seem to address this problem. At least, I didn't find anything that I thought applied.
>
> I would greatly appreciate some assistance here.
>
> Thanks in advance.
>
> --
> Stephen P. Molnar, Ph.D.
> www.molecular-modeling.net
> 614.312.7528 (c)
> Skype: smolnar1
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion

From s.molnar at sbcglobal.net  Fri Oct 11 14:41:19 2019
From: s.molnar at sbcglobal.net (Stephen P. Molnar)
Date: Fri, 11 Oct 2019 14:41:19 -0400
Subject: [Numpy-discussion] np.genfromtxt StopIteration Error
In-Reply-To:
References: <5DA0B0A0.4070007@sbcglobal.net>
Message-ID: <5DA0CCCF.6030903@sbcglobal.net>

Thanks for the reply.

Keep in mind that I am a Chemist, not an IT person. I used to be a
marginally proficient FORTRAN II user in the ancient past.

I tried running your code. Please see my comments/questions below:

On 10/11/2019 01:12 PM, Bennet Fauber wrote:
> I think genfromtxt() wants a filename as the first argument, and you
> have to tell it the entries in the file are strings not numerics.
>
> test.py
> ------------------------------------------
> import os
> import glob
> import numpy as np
>
> fileList = []
> filesList = []
>
> for files in glob.glob("*.log"):
>     fileName, fileExtension = os.path.splitext(files)
>     fileList.append(fileName)
>     filesList.append(files)
>
> print('fileList = ', fileList)
> print('filesList = ', filesList)
>
> fname = '/tmp/foo.txt'

There is no '/tmp/foo.txt'. Where did it come from in your example?

> print('fname = ', fname)
> data = np.genfromtxt(fname, dtype=str)
> print(data)
> ------------------------------------------
>
> Contents of /tmp/foo.txt
> ------------------------------------------
> 15-7.log
> 18-7.log
> 14-7.log
> C-VX3.log
> ------------------------------------------
>
> Sample run

I'm using Python 3.7.3; should this make a difference?

>
> $ python --version
> Python 2.7.15+
>
> $ python t.py
> ('fileList = ', ['15-7', '18-7', '14-7', 'C-VX3'])
> ('filesList = ', ['15-7.log', '18-7.log', '14-7.log', 'C-VX3.log'])
> ('fname = ', '/tmp/foo.txt')
> ['15-7.log' '18-7.log' '14-7.log' 'C-VX3.log']
>
> Is that any help?

If I use data = np.genfromtxt('14-7.log', dtype=str, usecols=(1),
skip_header=27, skip_footer=1, encoding=None) with a specific file name,
in this example 14-7, I get the result I desired:

# 14-7
-9.960902669
-8.979504781
-8.942611364
-8.915523010
-8.736508831
-8.663387139
-8.410739711
-8.389146347
-8.296798909
-8.168454106
-8.127990818
-8.127103774
-7.979090739
-7.941872682
-7.900766215
-7.881485228
-7.837826485
-7.815909505
-7.722540286
-7.720346742

So, my question is: why the StopIteration error message in my original
query? Why is the script not iterating over the log files?

Sorry to be so dense.

>
> On Fri, Oct 11, 2019 at 12:41 PM Stephen P. Molnar
> wrote:
>> I have been fighting with the genfromtxt function in numpy for a while now and am trying a slightly different approach.
www.molecular-modeling.net 614.312.7528 (c) Skype: smolnar1 From deak.andris at gmail.com Fri Oct 11 17:11:53 2019 From: deak.andris at gmail.com (Andras Deak) Date: Fri, 11 Oct 2019 23:11:53 +0200 Subject: [Numpy-discussion] np.genfromtxt StopIteration Error In-Reply-To: <5DA0CCCF.6030903@sbcglobal.net> References: <5DA0B0A0.4070007@sbcglobal.net> <5DA0CCCF.6030903@sbcglobal.net> Message-ID: Hi Stephen, Is this not what your original question to this list was about? See https://mail.python.org/pipermail/numpy-discussion/2019-October/080130.html and replies. I still believe that you _can't_ give genfromtxt file names in an iterable. The iterable input is only inteded to contain the contents of a single file, probably as if read using `f.readlines()`. Genfromtxt seems to read data from at most one file with each call. Regards, Andr?s On Fri, Oct 11, 2019 at 8:41 PM Stephen P. Molnar wrote: > > Thanks for the reply. > > Keep in mind that i am a Chemist, not an IT person. I used to be a > marginally proficient FORTRAN II user in the ancient past. > > I tried running your code. Please see my comments/questing below: > > On 10/11/2019 01:12 PM, Bennet Fauber wrote: > > I think genfromtxt() wants a filename as the first argument, and you > > have to tell it the entries in the file are strings not numerics. > > > > test.py > > ------------------------------------------ > > import os > > import glob > > import numpy as np > > > > fileList = [] > > filesList = [] > > > > for files in glob.glob("*.log"): > > fileName, fileExtension = os.path.splitext(files) > > fileList.append(fileName) > > filesList.append(files) > > > > print('fileList = ', fileList) > > print('filesList = ', filesList > > > > fname = '/tmp/foo.txt' > There is no '/temp/foo.txt' Where did it come from in your example? > > print('fname = ', fname) > > data = np.genfromtxt(fname, dtype=str) > > print(data) > > ------------------------------------------ > > > > Contents of /tmp/foo.txt > > ------------------------------------------ > > 15-7.log > > 18-7.log > > 14-7.log > > C-VX3.log > > ------------------------------------------ > > > > Sample run > I'm using python 3.7.3, should this make a difference? > > > > $ python --version > > Python 2.7.15+ > > > > $ python t.py > > ('fileList = ', ['15-7', '18-7', '14-7', 'C-VX3']) > > ('filesList = ', ['15-7.log', '18-7.log', '14-7.log', 'C-VX3.log']) > > ('fname = ', '/tmp/foo.txt') > > ['15-7.log' '18-7.log' '14-7.log' 'C-VX3.log'] > > > > Is that any help? > if I use data = np.genfromtxt('14-7.log', dtype=str, usecols=(1), > skip_header=27, skip_footer=1, encoding=None) with a specific file name. > in this example 14-7, I get the resutt I desired: > > # 14-7 > -9.960902669 > -8.979504781 > -8.942611364 > -8.915523010 > -8.736508831 > -8.663387139 > -8.410739711 > -8.389146347 > -8.296798909 > -8.168454106 > -8.127990818 > -8.127103774 > -7.979090739 > -7.941872682 > -7.900766215 > -7.881485228 > -7.837826485 > -7.815909505 > -7.722540286 > -7.720346742 > > so, my question is; why the StopIteration error message in my original > query? Why is the dcrtipt not iterating over the log files? > > Sorry to be so dense. > > > > > > On Fri, Oct 11, 2019 at 12:41 PM Stephen P. Molnar > > wrote: > >> I have been fighting with the genfromtxt function in numpy for a while now and am trying a slightly different approach. 
> >> > >> Here is the code: > >> > >> > >> import os > >> import glob > >> import numpy as np > >> > >> fileList = [] > >> filesList = [] > >> > >> for files in glob.glob("*.log"): > >> ?????? fileName, fileExtension = os.path.splitext(files) > >> ?????? fileList.append(fileName) > >> ?????? filesList.append(files) > >> > >> print('fileList = ', fileList) > >> print('filesList = ', filesList) > >> > >> fname = filesList > >> print('fname = ', fname) > >> data = np.genfromtxt(fname, usecols=(1), skip_header=27, skip_footer=1, encoding=None) > >> print(data) > >> > >> np.savetxt('fileList.dG', data, fmt='%12.9f', header='${d}') > >> print(data.dG) > >> > >> I am using the Spyder IDE which has a variable explorer which shows: > >> > >> filesList = ['C-VX3.log', '18-7.log', '14-7.log', '15-7.log'] > >> fileList = ['C-VX3', '18-7', '14-7', '15-7'] > >> > >> so the lists that genfromtxt needs are being generated. > >> > >> Goggling 'numpy genfromtxt stopiteration error' does not seem to address this problem. At least, I didn't find plaything that I thought applied. > >> > >> I would greatly appreciate some assistance here. > >> > >> Thanks is advance. > >> > >> -- > >> Stephen P. Molnar, Ph.D. > >> www.molecular-modeling.net > >> 614.312.7528 (c) > >> Skype: smolnar1 > >> > >> _______________________________________________ > >> NumPy-Discussion mailing list > >> NumPy-Discussion at python.org > >> https://mail.python.org/mailman/listinfo/numpy-discussion > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at python.org > > https://mail.python.org/mailman/listinfo/numpy-discussion > > > > -- > Stephen P. Molnar, Ph.D. > www.molecular-modeling.net > 614.312.7528 (c) > Skype: smolnar1 > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion From bennet at umich.edu Fri Oct 11 18:45:51 2019 From: bennet at umich.edu (Bennet Fauber) Date: Fri, 11 Oct 2019 18:45:51 -0400 Subject: [Numpy-discussion] np.genfromtxt StopIteration Error In-Reply-To: <5DA0CCCF.6030903@sbcglobal.net> References: <5DA0B0A0.4070007@sbcglobal.net> <5DA0CCCF.6030903@sbcglobal.net> Message-ID: What happens if you use this? for txtfile in filesList: data = np.genfromtxt(txtfile, dtype=str, usecols=(1), skip_header=27, skip_footer=1, encoding=None) print(data) That pulls one file name from filesList, stuffs the name into txtfile, which is then provided to genfromtxt(). The point is that genfromtxt() wants one thing; you have to give it one thing. The above gives it one thing, but runs it for however many things are in the filesList. Does that help? On Fri, Oct 11, 2019 at 2:41 PM Stephen P. Molnar wrote: > > Thanks for the reply. > > Keep in mind that i am a Chemist, not an IT person. I used to be a > marginally proficient FORTRAN II user in the ancient past. > > I tried running your code. Please see my comments/questing below: > > On 10/11/2019 01:12 PM, Bennet Fauber wrote: > > I think genfromtxt() wants a filename as the first argument, and you > > have to tell it the entries in the file are strings not numerics. 
> > > > test.py > > ------------------------------------------ > > import os > > import glob > > import numpy as np > > > > fileList = [] > > filesList = [] > > > > for files in glob.glob("*.log"): > > fileName, fileExtension = os.path.splitext(files) > > fileList.append(fileName) > > filesList.append(files) > > > > print('fileList = ', fileList) > > print('filesList = ', filesList > > > > fname = '/tmp/foo.txt' > There is no '/temp/foo.txt' Where did it come from in your example? > > print('fname = ', fname) > > data = np.genfromtxt(fname, dtype=str) > > print(data) > > ------------------------------------------ > > > > Contents of /tmp/foo.txt > > ------------------------------------------ > > 15-7.log > > 18-7.log > > 14-7.log > > C-VX3.log > > ------------------------------------------ > > > > Sample run > I'm using python 3.7.3, should this make a difference? > > > > $ python --version > > Python 2.7.15+ > > > > $ python t.py > > ('fileList = ', ['15-7', '18-7', '14-7', 'C-VX3']) > > ('filesList = ', ['15-7.log', '18-7.log', '14-7.log', 'C-VX3.log']) > > ('fname = ', '/tmp/foo.txt') > > ['15-7.log' '18-7.log' '14-7.log' 'C-VX3.log'] > > > > Is that any help? > if I use data = np.genfromtxt('14-7.log', dtype=str, usecols=(1), > skip_header=27, skip_footer=1, encoding=None) with a specific file name. > in this example 14-7, I get the resutt I desired: > > # 14-7 > -9.960902669 > -8.979504781 > -8.942611364 > -8.915523010 > -8.736508831 > -8.663387139 > -8.410739711 > -8.389146347 > -8.296798909 > -8.168454106 > -8.127990818 > -8.127103774 > -7.979090739 > -7.941872682 > -7.900766215 > -7.881485228 > -7.837826485 > -7.815909505 > -7.722540286 > -7.720346742 > > so, my question is; why the StopIteration error message in my original > query? Why is the dcrtipt not iterating over the log files? > > Sorry to be so dense. > > > > > > On Fri, Oct 11, 2019 at 12:41 PM Stephen P. Molnar > > wrote: > >> I have been fighting with the genfromtxt function in numpy for a while now and am trying a slightly different approach. > >> > >> Here is the code: > >> > >> > >> import os > >> import glob > >> import numpy as np > >> > >> fileList = [] > >> filesList = [] > >> > >> for files in glob.glob("*.log"): > >> ?????? fileName, fileExtension = os.path.splitext(files) > >> ?????? fileList.append(fileName) > >> ?????? filesList.append(files) > >> > >> print('fileList = ', fileList) > >> print('filesList = ', filesList) > >> > >> fname = filesList > >> print('fname = ', fname) > >> data = np.genfromtxt(fname, usecols=(1), skip_header=27, skip_footer=1, encoding=None) > >> print(data) > >> > >> np.savetxt('fileList.dG', data, fmt='%12.9f', header='${d}') > >> print(data.dG) > >> > >> I am using the Spyder IDE which has a variable explorer which shows: > >> > >> filesList = ['C-VX3.log', '18-7.log', '14-7.log', '15-7.log'] > >> fileList = ['C-VX3', '18-7', '14-7', '15-7'] > >> > >> so the lists that genfromtxt needs are being generated. > >> > >> Goggling 'numpy genfromtxt stopiteration error' does not seem to address this problem. At least, I didn't find plaything that I thought applied. > >> > >> I would greatly appreciate some assistance here. > >> > >> Thanks is advance. > >> > >> -- > >> Stephen P. Molnar, Ph.D. 
> >> www.molecular-modeling.net > >> 614.312.7528 (c) > >> Skype: smolnar1 > >> > >> _______________________________________________ > >> NumPy-Discussion mailing list > >> NumPy-Discussion at python.org > >> https://mail.python.org/mailman/listinfo/numpy-discussion > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at python.org > > https://mail.python.org/mailman/listinfo/numpy-discussion > > > > -- > Stephen P. Molnar, Ph.D. > www.molecular-modeling.net > 614.312.7528 (c) > Skype: smolnar1 > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion From charlesr.harris at gmail.com Fri Oct 11 21:20:37 2019 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri, 11 Oct 2019 19:20:37 -0600 Subject: [Numpy-discussion] 1.19 release Message-ID: Hi All, Thought I'd raise the option of trying to put together an NEP for the 1.18 release like Python does PEPs. If that is considered too procedural for releases that come out every six months or so, are there any suggestions for an alternative? About 1.19 itself, I expect to fork 1.18.x in the middle of next month aiming at a release in late December. The main task I currently see for 1.19 is to remove the shims for Python 2.7 and 3.5, there are already a couple of delayed PRs along that line. If there are other things that folks think should be on the todo list, suggestions are welcome. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Sat Oct 12 11:48:00 2019 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Sat, 12 Oct 2019 17:48:00 +0200 Subject: [Numpy-discussion] SciPy user documentation survey - please participate Message-ID: Hi everyone, Maja, our technical writer for Season of Docs for SciPy, created a survey specifically about how users use the SciPy docs and what they would like to see improved. She sent out the link to scipy-dev before and it was shared on Twitter. We've received some really valuable responses, and would love to get more. If you're a SciPy user, this is a very easy way to help the project! https://docs.google.com/forms/d/e/1FAIpQLSeBAO0UFKDZyKpg2XzRslsLJVHU61ugjc18-2PVEabTQg2_6g/viewform?usp=sf_link Apologies for the cross-post. We do a survey about once a decade and Maja is putting in a lot of work creating and analyzing the survey, so it's worth some more exposure and a couple of minutes of your time! Cheers, Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Sat Oct 12 11:57:25 2019 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Sat, 12 Oct 2019 17:57:25 +0200 Subject: [Numpy-discussion] 1.19 release In-Reply-To: References: Message-ID: On Sat, Oct 12, 2019 at 3:21 AM Charles R Harris wrote: > Hi All, > > Thought I'd raise the option of trying to put together an NEP for the 1.18 > release like Python does PEPs. If that is considered too procedural for > releases that come out every six months or so, are there any suggestions > for an alternative? > The Python one only contains a release schedule, and gets updated later with a small subset of the release notes: https://www.python.org/dev/peps/pep-0494/. I guess its main audience is packagers and companies needing to plan ahead supporting a new Python version. What would you like to put in such a NEP? 
> About 1.19 itself, I expect to fork 1.18.x in the middle of next month > aiming at a release in late December. The main task I currently see for > 1.19 is to remove the shims for Python 2.7 and 3.5, there are already a > couple of delayed PRs along that line. If there are other things that folks > think should be on the todo list, suggestions are welcome. > For 1.18 I think the main things are further changes to numpy.random and the dispatch system. For 1.19 not sure, that still feels far away. Cheers, Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Sat Oct 12 13:13:46 2019 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 12 Oct 2019 11:13:46 -0600 Subject: [Numpy-discussion] 1.19 release In-Reply-To: References: Message-ID: On Sat, Oct 12, 2019 at 9:58 AM Ralf Gommers wrote: > > > On Sat, Oct 12, 2019 at 3:21 AM Charles R Harris < > charlesr.harris at gmail.com> wrote: > >> Hi All, >> >> Thought I'd raise the option of trying to put together an NEP for the >> 1.18 release like Python does PEPs. If that is considered too procedural >> for releases that come out every six months or so, are there any >> suggestions for an alternative? >> > > The Python one only contains a release schedule, and gets updated later > with a small subset of the release notes: > https://www.python.org/dev/peps/pep-0494/. I guess its main audience is > packagers and companies needing to plan ahead supporting a new Python > version. > > What would you like to put in such a NEP? > > >> About 1.19 itself, I expect to fork 1.18.x in the middle of next month >> aiming at a release in late December. The main task I currently see for >> 1.19 is to remove the shims for Python 2.7 and 3.5, there are already a >> couple of delayed PRs along that line. If there are other things that folks >> think should be on the todo list, suggestions are welcome. >> > > For 1.18 I think the main things are further changes to numpy.random and > the dispatch system. For 1.19 not sure, that still feels far away. > Agree about numpy.random. What changes are you looking for in the dispatch system? Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Sun Oct 13 05:36:50 2019 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Sun, 13 Oct 2019 11:36:50 +0200 Subject: [Numpy-discussion] 1.19 release In-Reply-To: References: Message-ID: On Sat, Oct 12, 2019 at 7:14 PM Charles R Harris wrote: > > > On Sat, Oct 12, 2019 at 9:58 AM Ralf Gommers > wrote: > >> >> >> On Sat, Oct 12, 2019 at 3:21 AM Charles R Harris < >> charlesr.harris at gmail.com> wrote: >> >>> Hi All, >>> >>> Thought I'd raise the option of trying to put together an NEP for the >>> 1.18 release like Python does PEPs. If that is considered too procedural >>> for releases that come out every six months or so, are there any >>> suggestions for an alternative? >>> >> >> The Python one only contains a release schedule, and gets updated later >> with a small subset of the release notes: >> https://www.python.org/dev/peps/pep-0494/. I guess its main audience is >> packagers and companies needing to plan ahead supporting a new Python >> version. >> >> What would you like to put in such a NEP? >> >> >>> About 1.19 itself, I expect to fork 1.18.x in the middle of next month >>> aiming at a release in late December. 
The main task I currently see for
>>> 1.19 is to remove the shims for Python 2.7 and 3.5, there are already a
>>> couple of delayed PRs along that line. If there are other things that folks
>>> think should be on the todo list, suggestions are welcome.
>>>
>>
>> For 1.18 I think the main things are further changes to numpy.random and
>> the dispatch system. For 1.19 not sure, that still feels far away.
>>
>
> Agree about numpy.random. What changes are you looking for in the dispatch
> system?
>

Plugging some of the gaps, like the array creation and asarray stuff at a
minimum, since those are the most painful missing capabilities. Peter
Entschev has given us some of the clearest use cases. There's still quite
a bit of work to do and decisions to make. If it doesn't materialize in
time then so be it, but it's a good goal for the next release.

Cheers,
Ralf

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From sebastian at sipsolutions.net  Sun Oct 13 22:13:38 2019
From: sebastian at sipsolutions.net (Sebastian Berg)
Date: Sun, 13 Oct 2019 19:13:38 -0700
Subject: [Numpy-discussion] 1.19 release
In-Reply-To:
References:
Message-ID: <51645f4a022726bdb3647f8d4aab79f59ca980ea.camel@sipsolutions.net>

On Fri, 2019-10-11 at 19:20 -0600, Charles R Harris wrote:
> Hi All,
>
> Thought I'd raise the option of trying to put together an NEP for the
> 1.18 release like Python does PEPs. If that is considered too
> procedural for releases that come out every six months or so, are
> there any suggestions for an alternative?
>
> About 1.19 itself, I expect to fork 1.18.x in the middle of next
> month aiming at a release in late December. The main task I currently
> see for 1.19 is to remove the shims for Python 2.7 and 3.5, there are
> already a couple of delayed PRs along that line. If there are other
> things that folks think should be on the todo list, suggestions are
> welcome.

I am thinking whether there may be some more deprecations to put on the
list. These are mostly only mildly annoying, so it is also not a high
priority.

The first thing is the CHAR dtype, which seems already deprecated on the
C side, but `np.dtype("c")` still works [0]. It is not a big hassle, but
it seems like an oversight that it is not deprecated? (A short
demonstration of the current behaviour is sketched below.)

The other thing I would like to bring up is deprecating the
`PyArray_GetArrayParamsFromObject` function:

1. I have difficulty seeing anyone using it, and it has some strange
behaviour: the dtype it returns (if you pass one in) depends on whether
or not the output is a numpy scalar. I think everyone would simply use
`PyArray_FromAny`, since they probably want an array in either case.
2. It is not really useful for flexible dtypes at all currently, since
you need AdaptFlexibleDType in that case, and that is not exposed.
3. We need to replace array coercion; keeping this interface around
means extra compatibility guarantees. I currently reproduce it
faithfully, but if we disable the function, it means we can delete all
the code and also not worry about adding tests to ensure compatibility
indefinitely.

However, we have no replacement except conversion to array. Experience
tells us that every dark corner is used by someone, but maybe we can
risk breaking that (hopefully) one person?

Looking through the C-API some more, we could also consider deprecating
and removing (in the sense of always error return) these:

* PyArray_ObjectType
* PyArray_ArrayType
* PyArray_ScalarKind

which have been superseded since NumPy 1.6.
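Coming back to the CHAR dtype point, the short demonstration referred to
above (a sketch; the reprs in the comments assume a recent NumPy and are
not taken from the message itself):

import numpy as np

# "c" still parses, and the resulting dtype is essentially "S1":
print(np.dtype("c"))                 # dtype('S1')

# ...but unlike "S1", coercing a string splits it into its characters,
# which is the special parsing behaviour footnote [0] below refers to:
print(np.array("abcd", dtype="c"))   # array([b'a', b'b', b'c', b'd'], dtype='|S1')
print(np.array("abcd", dtype="S1"))  # array(b'a', dtype='|S1')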
`PyArray_MinScalarType` we probably should keep around, although we need
a replacement, which should just be a replacement for
`PyArray_GetArrayParamsFromObject` [1].

I have to look at the scalar API, but there may be some candidates there
as well.

Best,

Sebastian

[0] The only thing it seems to do is change how strings are parsed in
`np.asarray` calls. Otherwise it is pretty much equivalent to "S1", but
who knows, maybe people use it...

[1] I personally would like us to stop doing value based casting for
typed 0-D arrays and scalars. So at that point a "minimal scalar" dtype
only makes sense for python types. I feel if you know that you want to
downcast, you can use "same_kind" casting.

>
> Chuck
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion

-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 833 bytes
Desc: This is a digitally signed message part
URL:

From einstein.edison at gmail.com  Tue Oct 15 10:15:18 2019
From: einstein.edison at gmail.com (Hameer Abbasi)
Date: Tue, 15 Oct 2019 14:15:18 +0000
Subject: [Numpy-discussion] "Spring cleaning" sprint
In-Reply-To: <81ed0f30ab4f83fa28557b0032d447f1e36922ec.camel@sipsolutions.net>
References: <81ed0f30ab4f83fa28557b0032d447f1e36922ec.camel@sipsolutions.net>
Message-ID:

Is the meeting ongoing? I tried to join but there were no participants.

On 10.10.19, 02:56, "NumPy-Discussion on behalf of Sebastian Berg" wrote:

    Hi all,

    we are planning a remote sprint to try to reduce the number of open
    PRs and issues, and do triage work.

    This is planned for next Tuesday, October 14th, between 9:00 and
    15:00 Pacific time. And everyone is invited to join in for as long
    as you wish.

    I assume we will have a video chat up and running at:
    https://berkeley.zoom.us/j/762261535

    However, we will send a reminder/logistics email when we start with
    the sprint.
Cheers, Sebastian _______________________________________________ NumPy-Discussion mailing list NumPy-Discussion at python.org https://mail.python.org/mailman/listinfo/numpy-discussion From sebastian at sipsolutions.net Tue Oct 15 12:00:34 2019 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Tue, 15 Oct 2019 09:00:34 -0700 Subject: [Numpy-discussion] "Spring cleaning" sprint In-Reply-To: References: <81ed0f30ab4f83fa28557b0032d447f1e36922ec.camel@sipsolutions.net> Message-ID: Hi all, I have to see if that zoom link works, and we probably have to switch over the day and need some place to write down things, so lets put any updates here: https://hackmd.io/kyMr_SHJTAaWsHQGRhJ7hQ Best, Sebastian On Tue, 2019-10-15 at 14:15 +0000, Hameer Abbasi wrote: > Is the meeting ongoing? I tried to join but there were no > participants. > > ?On 10.10.19, 02:56, "NumPy-Discussion on behalf of Sebastian Berg" < > numpy-discussion-bounces+einstein.edison=gmail.com at python.org on > behalf of sebastian at sipsolutions.net> wrote: > > Hi all, > > we are planning to remote spring to try to reduce the number of > open > PRs and issues, and do triage work. > > This is planned for next Tuesday, October 14th, between 9:00 and > 15:00 > Pacific time. And everyone is invited to join in for as long as > you > wish. > > I assume we will have a video chat up and running at: > https://berkeley.zoom.us/j/762261535 > > However, we will send a reminder/logistics email when we start > with the > sprint. > > Cheers, > > Sebastian > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: This is a digitally signed message part URL: From b.sipocz+numpylist at gmail.com Tue Oct 15 12:01:29 2019 From: b.sipocz+numpylist at gmail.com (Brigitta Sipocz) Date: Tue, 15 Oct 2019 09:01:29 -0700 Subject: [Numpy-discussion] "Spring cleaning" sprint In-Reply-To: References: <81ed0f30ab4f83fa28557b0032d447f1e36922ec.camel@sipsolutions.net> Message-ID: I'm unable to connect, "The host has another meeting in progress". Brigitta On Tue, 15 Oct 2019 at 07:20, Hameer Abbasi wrote: > My bad... It's not until later. > > > > ?On 15.10.19, 16:16, "NumPy-Discussion on behalf of Hameer Abbasi" > einstein.edison at gmail.com> wrote: > > Is the meeting ongoing? I tried to join but there were no participants. > > ?On 10.10.19, 02:56, "NumPy-Discussion on behalf of Sebastian Berg" > of sebastian at sipsolutions.net> wrote: > > Hi all, > > we are planning to remote spring to try to reduce the number of > open > PRs and issues, and do triage work. > > This is planned for next Tuesday, October 14th, between 9:00 and > 15:00 > Pacific time. And everyone is invited to join in for as long as you > wish. > > I assume we will have a video chat up and running at: > https://berkeley.zoom.us/j/762261535 > > However, we will send a reminder/logistics email when we start > with the > sprint. 
> > Cheers, > > Sebastian > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From sebastian at sipsolutions.net Tue Oct 15 12:10:03 2019 From: sebastian at sipsolutions.net (Sebastian Berg) Date: Tue, 15 Oct 2019 09:10:03 -0700 Subject: [Numpy-discussion] "Spring cleaning" sprint In-Reply-To: References: <81ed0f30ab4f83fa28557b0032d447f1e36922ec.camel@sipsolutions.net> Message-ID: <3e992db01ab67a86511f21602be9105fa975d13a.camel@sipsolutions.net> Sorry for the confusion, we will be using this link: https://berkeley.zoom.us/j/5416755993 On Tue, 2019-10-15 at 09:00 -0700, Sebastian Berg wrote: > Hi all, > > I have to see if that zoom link works, and we probably have to switch > over the day and need some place to write down things, so lets put > any > updates here: > > https://hackmd.io/kyMr_SHJTAaWsHQGRhJ7hQ > > Best, > > Sebastian > > > On Tue, 2019-10-15 at 14:15 +0000, Hameer Abbasi wrote: > > Is the meeting ongoing? I tried to join but there were no > > participants. > > > > ?On 10.10.19, 02:56, "NumPy-Discussion on behalf of Sebastian Berg" > > < > > numpy-discussion-bounces+einstein.edison=gmail.com at python.org on > > behalf of sebastian at sipsolutions.net> wrote: > > > > Hi all, > > > > we are planning to remote spring to try to reduce the number of > > open > > PRs and issues, and do triage work. > > > > This is planned for next Tuesday, October 14th, between 9:00 > > and > > 15:00 > > Pacific time. And everyone is invited to join in for as long as > > you > > wish. > > > > I assume we will have a video chat up and running at: > > https://berkeley.zoom.us/j/762261535 > > > > However, we will send a reminder/logistics email when we start > > with the > > sprint. > > > > Cheers, > > > > Sebastian > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at python.org > > https://mail.python.org/mailman/listinfo/numpy-discussion > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 833 bytes Desc: This is a digitally signed message part URL: From peter at entschev.com Tue Oct 15 17:02:44 2019 From: peter at entschev.com (Peter Andreas Entschev) Date: Tue, 15 Oct 2019 23:02:44 +0200 Subject: [Numpy-discussion] =?utf-8?q?NEP_33_=E2=80=94_Array_Creation_Dis?= =?utf-8?q?patching_With_=5F=5Farray=5Ffunction=5F=5F?= Message-ID: Hello everyone, I've put together a new proposal for array creation dispatching with the __array_function__ protocol [1], based on a discussion that occurred some time ago in [2]. It would be great if people could take the time to review and comment on that. 
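For readers who have not followed the protocol discussions, a toy
illustration (not taken from the NEP itself) of why creation functions
are the hard case: with NumPy 1.17+, `__array_function__` dispatches on
array-like arguments, and creation functions typically receive none.
The class below is a made-up minimal stand-in for a duck array.

import numpy as np

class MyArray:
    # Minimal sketch of the __array_function__ protocol.
    def __array_function__(self, func, types, args, kwargs):
        return "dispatched " + func.__name__

a = MyArray()
print(np.concatenate([a, a]))  # "dispatched concatenate": an array-like is present
print(np.ones(3))              # plain ndarray: there is nothing to dispatch on

The proposal above is about giving creation functions like `np.ones` a
way to dispatch as well.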
[1] https://github.com/numpy/numpy/pull/14715
[2] https://github.com/numpy/numpy/issues/14441

Best regards,
Peter

From albuscode at gmail.com  Tue Oct 15 22:21:37 2019
From: albuscode at gmail.com (Inessa Pawson)
Date: Tue, 15 Oct 2019 22:21:37 -0400
Subject: [Numpy-discussion] educational resources recommendations are wanted
Message-ID:

I'm working on creating a curated collection of NumPy related educational
resources (tutorials, articles, books, presentations, courses, etc.).
Your recommendations would be much appreciated, especially in languages
other than English. Please include in your submission a brief description
why it deserves mention on numpy.org and what audience would benefit from
it the most.
--
Every good wish,
*Inessa Pawson*
NumPy Web Team
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From sebastian at sipsolutions.net  Tue Oct 15 23:54:20 2019
From: sebastian at sipsolutions.net (Sebastian Berg)
Date: Tue, 15 Oct 2019 20:54:20 -0700
Subject: [Numpy-discussion] "Spring cleaning" sprint
In-Reply-To: <81ed0f30ab4f83fa28557b0032d447f1e36922ec.camel@sipsolutions.net>
References: <81ed0f30ab4f83fa28557b0032d447f1e36922ec.camel@sipsolutions.net>
Message-ID:

Thanks everyone who joined in today!

Together with Brigitta, ZJ, Eric, and the crowd from BIDS, today 15 PRs
were opened and 18 (also old ones) merged in total. Including unsticking
a few trickier ones. While we did not look into triaging old issues much
(aside from looking at release blockers for 1.18), 16 issues were closed.
And there are still a couple of PRs hanging which can get wrapped up soon.

Cheers,

Sebastian

On Wed, 2019-10-09 at 17:55 -0700, Sebastian Berg wrote:
> Hi all,
>
> we are planning a remote sprint to try to reduce the number of open
> PRs and issues, and do triage work.
>
> This is planned for next Tuesday, October 14th, between 9:00 and
> 15:00
> Pacific time. And everyone is invited to join in for as long as you
> wish.
>
> I assume we will have a video chat up and running at:
> https://berkeley.zoom.us/j/762261535
>
> However, we will send a reminder/logistics email when we start with
> the
> sprint.
>
> Cheers,
>
> Sebastian
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion at python.org
> https://mail.python.org/mailman/listinfo/numpy-discussion

-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 833 bytes
Desc: This is a digitally signed message part
URL:

From matti.picus at gmail.com  Wed Oct 16 01:43:47 2019
From: matti.picus at gmail.com (Matti Picus)
Date: Wed, 16 Oct 2019 08:43:47 +0300
Subject: [Numpy-discussion] Deprecate norm of 3d and more ndarrays
Message-ID: <47310069-078d-8fa2-4ad0-7445d6f9a10a@gmail.com>

np.linalg.norm(a, axis=None) where a.ndim > 2 calls
np.linalg.norm(np.ravel(a)). PR 14719 proposes deprecating this
unexpected behavior, with the suggestion that people who need this
should call ravel themselves.

Thoughts?

Matti

From sebastian at sipsolutions.net  Wed Oct 16 14:00:22 2019
From: sebastian at sipsolutions.net (Sebastian Berg)
Date: Wed, 16 Oct 2019 11:00:22 -0700
Subject: [Numpy-discussion] NumPy Community Meeting Wednesday now...
Message-ID:

Hi all,

I am very sorry, I forgot to send out a reminder yesterday with the
sprint going on all day.
We will be meeting now, the meeting topics and notes are:

https://hackmd.io/76o-IxCjQX2mOXO_wwkcpg?both

Best wishes

Sebastian
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 833 bytes
Desc: This is a digitally signed message part
URL:

From ralf.gommers at gmail.com  Wed Oct 16 18:47:44 2019
From: ralf.gommers at gmail.com (Ralf Gommers)
Date: Thu, 17 Oct 2019 00:47:44 +0200
Subject: [Numpy-discussion] educational resources recommendations are wanted
In-Reply-To:
References:
Message-ID:

On Wed, Oct 16, 2019 at 4:22 AM Inessa Pawson wrote:

> I'm working on creating a curated collection of NumPy related educational
> resources (tutorials, articles, books, presentations, courses, etc.).
> Your recommendations would be much appreciated, especially in languages
> other than English. Please include in your submission a brief description
> why it deserves mention on numpy.org and what audience would benefit from
> it the most.
>

Just in case I didn't mention it before, there's a list at the bottom of
https://github.com/numfocus/gsod/blob/master/2019/NumPy_ideas_list.md

And I see I missed Nicolas Rougier's book on vectorization there, which is
really nice: https://www.labri.fr/perso/nrougier/from-python-to-numpy/

Audience for all those materials is beginning users. SciPy Lecture Notes
are specifically designed to be good to teach with, which may be good to
point out (so second audience is educators).

Cheers,
Ralf
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From charlesr.harris at gmail.com  Thu Oct 17 11:36:45 2019
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Thu, 17 Oct 2019 09:36:45 -0600
Subject: [Numpy-discussion] NumPy 1.17.3 released
Message-ID:

Hi All,

On behalf of the NumPy team I am pleased to announce that NumPy 1.17.3 has
been released. This is a bugfix release. The Python versions supported in
this release are 3.5-3.8. Downstream developers should use Cython >=
0.29.13 for Python 3.8 support and OpenBLAS >= 3.7 to avoid wrong results
on the Skylake architecture. The NumPy wheels for this release can be
downloaded from PyPI; source archives and release notes are available
from GitHub.

*Highlights*

- Wheels for Python 3.8
- Boolean matmul fixed to use booleans instead of integers.

*Compatibility notes*

- The seldom used PyArray_DescrCheck macro has been changed/fixed.

*Contributors*

A total of 7 people contributed to this release. People with a "+" by
their names contributed a patch for the first time.

- Allan Haldane
- Charles Harris
- Kevin Sheppard
- Matti Picus
- Ralf Gommers
- Sebastian Berg
- Warren Weckesser

*Pull requests merged*

A total of 12 pull requests were merged for this release.

- gh-14456: MAINT: clean up pocketfft modules inside numpy.fft namespace.
- gh-14463: BUG: random.hypergeometic assumes npy_long is npy_int64, hung...
- gh-14502: BUG: random: Revert gh-14458 and refix gh-14557.
- gh-14504: BUG: add a specialized loop for boolean matmul.
- gh-14506: MAINT: Update pytest version for Python 3.8
- gh-14512: DOC: random: fix doc linking, was referencing private submodules.
- gh-14513: BUG,MAINT: Some fixes and minor cleanup based on clang analysis
- gh-14515: BUG: Fix randint when range is 2**32
- gh-14519: MAINT: remove the entropy c-extension module
- gh-14563: DOC: remove note about Pocketfft license file (non-existing here).
- gh-14578: BUG: random: Create a legacy implementation of random.binomial.
- gh-14687: BUG: properly define PyArray_DescrCheck Cheers, Charles Harris -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Thu Oct 17 11:46:20 2019 From: charlesr.harris at gmail.com (Charles R Harris) Date: Thu, 17 Oct 2019 09:46:20 -0600 Subject: [Numpy-discussion] educational resources recommendations are wanted In-Reply-To: References: Message-ID: On Tue, Oct 15, 2019 at 8:23 PM Inessa Pawson wrote: > I?m working on creating a curated collection of NumPy related educational > resources (tutorials, articles, books, presentations, courses, etc.). > Your recommendations would be much appreciated, especially in languages > other than English. Please include in your submission a brief description > why it deserves mention on numpy.org and what audience would benefit from > it the most. > -- > Every good wish, > *Inessa Pawson* > NumPy Web Team > There are some tutorials on youtube, https://www.youtube.com/watch?v=ZB7BZMhfPgk&t=90s for instance. Asking Fernando Perez if there are some online resources at Berkeley might also be useful, Berkeley has an ongoing program teaching data science. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Fri Oct 18 05:05:55 2019 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Fri, 18 Oct 2019 11:05:55 +0200 Subject: [Numpy-discussion] Unsupporting python3.5 In-Reply-To: References: <21b78142-1733-1c3d-f64d-5141470c8b11@gmail.com> <2a7e6e0f-6b81-40f2-87be-d64b52dfa07c@www.fastmail.com> Message-ID: On Fri, Oct 11, 2019 at 4:18 AM Charles R Harris wrote: > > > On Thu, Oct 10, 2019 at 6:02 PM Stefan van der Walt > wrote: > >> On Thu, Oct 10, 2019, at 09:34, Charles R Harris wrote: >> >> I think we can support 3.5 as long as we please, the question is how long >> we *want* to support it. I don't plan to release 1.18 wheels for 3.5, but >> I'm concerned about making 1.18 outright incompatible with 3.5. I would >> like to see the random interface settle before we do that. So my preference >> would be to drop 3.5 in 1.19 with a future warning in the 1.18 release >> notes. Alternatively, we could backport all the random changes to 1.17, but >> I would rather not do that. >> >> I'm not sure I like this approach, which is reflected in the current version of https://github.com/numpy/numpy/pull/14673. If we stop testing 3.5 in CI, don't release 3.5 wheels and remove the PyPI trove classifier for it (so installation tools may not install it anymore), then what's the point of saying we "don't drop it"? I'd rather keep it fully supported for one more month and release a couple of wheels, or just drop it completely. I don't have much of a preference which option is better, just would like to avoid half-dropping it. Cheers, Ralf > >> The language in the NEP is: >> >> "we recommend that they support at least all minor versions of Python >> introduced and released in the prior 42 months" >> >> i.e., a lower bound for support, such that compatibility with more >> versions of Python is perfectly OK and, in the case of NumPy, probably >> encouraged. The NEP is meant to lift the burden of very long support >> cycles from smaller projects. >> >> St?fan >> >> > The 1.18.0rc1 is about one month out, so we should spend some effort on > those PRs and issues with the 1.18 milestone. Dealing with issues and > milestones, plus putting together the release notes, is the major pain > point in making releases these days. 
> > Chuck > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From pankaj.jangid at gmail.com Fri Oct 18 06:51:43 2019 From: pankaj.jangid at gmail.com (Pankaj Jangid) Date: Fri, 18 Oct 2019 16:21:43 +0530 Subject: [Numpy-discussion] Recommended way to utilize GPUs via OpenCL, ROCm Message-ID: Is there an officially recommended way to utilize AMD GPUs via OpenCL, ROCm? I came across the ROCm website https://rocm.github.io/. This has Tensorflow and PyTorch versions for using AMD GPUs. Just wanted to know if there is a way to use my AMD GPUs for NumPy calculations. -- Regards, Pankaj Jangid From einstein.edison at gmail.com Fri Oct 18 06:56:37 2019 From: einstein.edison at gmail.com (Hameer Abbasi) Date: Fri, 18 Oct 2019 10:56:37 +0000 Subject: [Numpy-discussion] Recommended way to utilize GPUs via OpenCL, ROCm In-Reply-To: References: Message-ID: Hello Pankaj, There's ClPy for OpenCL: https://github.com/fixstars/clpy Also this pull request for CuPy (merged, but as yet unreleased): https://github.com/cupy/cupy/pull/1094 Best regards, Hameer Abbasi On 18.10.19, 12:53, "NumPy-Discussion on behalf of Pankaj Jangid" wrote: Is there an officially recommended way to utilize AMD GPUs via OpenCL, ROCm? I came across the ROCm website https://rocm.github.io/. This has Tensorflow and PyTorch versions for using AMD GPUs. Just wanted to know if there is a way to use my AMD GPUs for NumPy calculations. -- Regards, Pankaj Jangid _______________________________________________ NumPy-Discussion mailing list NumPy-Discussion at python.org https://mail.python.org/mailman/listinfo/numpy-discussion From matti.picus at gmail.com Fri Oct 18 07:43:57 2019 From: matti.picus at gmail.com (mattip) Date: Fri, 18 Oct 2019 04:43:57 -0700 (MST) Subject: [Numpy-discussion] numpy C-API :: use numpy's random number generator in a ufunc In-Reply-To: <9370f92917dc8b54c3273eacf2fbabe9b28fc090.camel@daknuett.eu> References: <9370f92917dc8b54c3273eacf2fbabe9b28fc090.camel@daknuett.eu> Message-ID: <1571399037202-0.post@n7.nabble.com> Hi Daniel. Usually one would use Python, something like `rng = np.random.Generator(np.random.PCG64(seed)); a = rng.uniform(0, 10, size=(3, 4))` to get a 3 by 4 array of uniform random numbers in the range of 0 to 10.
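Spelled out as a runnable sketch (the seed value is arbitrary, for illustration only; this assumes NumPy >= 1.17, where the `Generator`/`PCG64` API is available):

    import numpy as np

    # a seeded Generator backed by the PCG64 bit generator
    rng = np.random.Generator(np.random.PCG64(12345))

    # 3 by 4 array of uniform random numbers drawn from [0, 10)
    a = rng.uniform(0, 10, size=(3, 4))
    print(a)
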
Is there a reason you need to do this from C? Matti -- Sent from: http://numpy-discussion.10968.n7.nabble.com/ From charlesr.harris at gmail.com Fri Oct 18 08:36:23 2019 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri, 18 Oct 2019 06:36:23 -0600 Subject: [Numpy-discussion] Unsupporting python3.5 In-Reply-To: References: <21b78142-1733-1c3d-f64d-5141470c8b11@gmail.com> <2a7e6e0f-6b81-40f2-87be-d64b52dfa07c@www.fastmail.com> Message-ID: On Fri, Oct 18, 2019 at 3:06 AM Ralf Gommers wrote: > > > On Fri, Oct 11, 2019 at 4:18 AM Charles R Harris < > charlesr.harris at gmail.com> wrote: > >> >> >> On Thu, Oct 10, 2019 at 6:02 PM Stefan van der Walt >> wrote: >> >>> On Thu, Oct 10, 2019, at 09:34, Charles R Harris wrote: >>> >>> I think we can support 3.5 as long as we please, the question is how >>> long we *want* to support it. I don't plan to release 1.18 wheels for 3.5, >>> but I'm concerned about making 1.18 outright incompatible with 3.5. I would >>> like to see the random interface settle before we do that. So my preference >>> would be to drop 3.5 in 1.19 with a future warning in the 1.18 release >>> notes. Alternatively, we could backport all the random changes to 1.17, but >>> I would rather not do that. >>> >>> > I'm not sure I like this approach, which is reflected in the current > version of https://github.com/numpy/numpy/pull/14673. If we stop testing > 3.5 in CI, don't release 3.5 wheels and remove the PyPI trove classifier > for it (so installation tools may not install it anymore), then what's the > point of saying we "don't drop it"? I'd rather keep it fully supported for > one more month and release a couple of wheels, or just drop it completely. > I don't have much of a preference which option is better, just would like > to avoid half-dropping it. > > That's fair. The 1.17.3 release supporting Python 3.5--3.8 went well > enough. The only glitch was that I had to explicitly use OSX 10.6 and Xcode > 6.4 for the 3.5 wheels on the Mac. > So do you have a preference for dropping or not dropping for 1.18? > Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From pankaj.jangid at gmail.com Fri Oct 18 12:39:32 2019 From: pankaj.jangid at gmail.com (Pankaj Jangid) Date: Fri, 18 Oct 2019 22:09:32 +0530 Subject: [Numpy-discussion] Recommended way to utilize GPUs via OpenCL, ROCm In-Reply-To: (Hameer Abbasi's message of "Fri, 18 Oct 2019 10:56:37 +0000") References: Message-ID: Hameer Abbasi writes: > There's ClPy for OpenCL: https://github.com/fixstars/clpy > Also this pull request for CuPy (merged, but as yet unreleased): https://github.com/cupy/cupy/pull/1094 > This gives great hope. Thanks for sharing this. I wonder why NVIDIA's approach is so widely accepted. Sometimes, I regret purchasing AMD GPUs. Not much support for them. -- Regards, Pankaj Jangid From ralf.gommers at gmail.com Sat Oct 19 06:41:29 2019 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Sat, 19 Oct 2019 12:41:29 +0200 Subject: [Numpy-discussion] Unsupporting python3.5 In-Reply-To: References: <21b78142-1733-1c3d-f64d-5141470c8b11@gmail.com> <2a7e6e0f-6b81-40f2-87be-d64b52dfa07c@www.fastmail.com> Message-ID: On Fri, Oct 18, 2019 at 2:36 PM Charles R Harris wrote: > > > On Fri, Oct 18, 2019 at 3:06 AM Ralf Gommers > wrote: > >> >> >> On Fri, Oct 11, 2019 at 4:18 AM Charles R Harris < >> charlesr.harris at gmail.com> wrote: >> >>> >>> >>> On Thu, Oct 10, 2019 at 6:02 PM Stefan van der Walt < >>> stefanv at berkeley.edu> wrote: >>> >>>> On Thu, Oct 10, 2019, at 09:34, Charles R Harris wrote: >>>> >>>> I think we can support 3.5 as long as we please, the question is how >>>> long we *want* to support it. I don't plan to release 1.18 wheels for 3.5, >>>> but I'm concerned about making 1.18 outright incompatible with 3.5. I would >>>> like to see the random interface settle before we do that. So my preference >>>> would be to drop 3.5 in 1.19 with a future warning in the 1.18 release >>>> notes. Alternatively, we could backport all the random changes to 1.17, but >>>> I would rather not do that. >>>> >>>> >> I'm not sure I like this approach, which is reflected in the current >> version of https://github.com/numpy/numpy/pull/14673. If we stop >> testing 3.5 in CI, don't release 3.5 wheels and remove the PyPI trove >> classifier for it (so installation tools may not install it anymore), then >> what's the point of saying we "don't drop it"? I'd rather keep it fully >> supported for one more month and release a couple of wheels, or just drop >> it completely. I don't have much of a preference which option is better, >> just would like >> to avoid half-dropping it.
>> >> > That's fair. The 1.17.3 release supporting Python 3.5--3.8 went well >> enough. The only glitch was that I had to explicitly use OSX 10.6 and Xcode >> 6.4 for the 3.5 wheels on the Mac. >> > > So do you have a preference for dropping or not dropping for 1.18? > Let's not drop it. Four weeks isn't that long to wait. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From evgeny.burovskiy at gmail.com Sat Oct 19 07:40:38 2019 From: evgeny.burovskiy at gmail.com (Evgeni Burovski) Date: Sat, 19 Oct 2019 14:40:38 +0300 Subject: [Numpy-discussion] Unsupporting python3.5 In-Reply-To: References: <21b78142-1733-1c3d-f64d-5141470c8b11@gmail.com> <2a7e6e0f-6b81-40f2-87be-d64b52dfa07c@www.fastmail.com> Message-ID: > > >>>> >>> That's fair. The 1.17.3 release supporting Python 3.5--3.8 went well >>> enough. The only glitch was that I had to explicitly use OSX 10.6 and Xcode >>> 6.4 for the 3.5 wheels on the Mac. >>> >> >> So do you have a preference for dropping or not dropping for 1.18? >> > > Let's not drop it. Four weeks isn't that long to wait. > +1 from the peanut gallery. Cheers, Evgeni > -------------- next part -------------- An HTML attachment was scrubbed...
URL: From jni at fastmail.com Mon Oct 21 03:33:22 2019 From: jni at fastmail.com (Juan Nunez-Iglesias) Date: Mon, 21 Oct 2019 18:33:22 +1100 Subject: [Numpy-discussion] Recommended way to utilize GPUs via OpenCL, ROCm In-Reply-To: References: Message-ID: I have also used PyOpenCL quite profitably: https://github.com/inducer/pyopencl I philosophically prefer it to ROCm because it targets *all* GPUs, including Intel integrated graphics on most laptops, which can actually get quite decent (30x) speedups.
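For a taste of it (a minimal sketch, assuming pyopencl is installed and an OpenCL driver is present; `create_some_context` may interactively ask you to choose a device):

    import numpy as np
    import pyopencl as cl
    import pyopencl.array as cl_array

    ctx = cl.create_some_context()        # pick an OpenCL device
    queue = cl.CommandQueue(ctx)

    a = np.random.rand(1000000).astype(np.float32)
    a_dev = cl_array.to_device(queue, a)  # copy the NumPy array to the device

    # elementwise arithmetic runs on the device; .get() copies back to host
    result = (2 * a_dev + 1).get()
    np.testing.assert_allclose(result, 2 * a + 1)
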
> On 19 Oct 2019, at 3:39 am, Pankaj Jangid wrote: > I wonder why NVIDIA's approach is so widely accepted. Sometimes, I > regret purchasing AMD GPUs. Not much support for them. I agree. I am very disappointed by the NVIDIA monopoly in scientific computing. Resist! Juan. -------------- next part -------------- An HTML attachment was scrubbed... URL: From pankaj.jangid at gmail.com Mon Oct 21 06:27:11 2019 From: pankaj.jangid at gmail.com (Pankaj Jangid) Date: Mon, 21 Oct 2019 15:57:11 +0530 Subject: [Numpy-discussion] Recommended way to utilize GPUs via OpenCL, ROCm In-Reply-To: (Juan Nunez-Iglesias's message of "Mon, 21 Oct 2019 18:33:22 +1100") References: Message-ID: Juan Nunez-Iglesias writes: > I have also used PyOpenCL quite profitably: > > https://github.com/inducer/pyopencl > > I philosophically prefer it to ROCm because it targets *all* GPUs, including Intel integrated graphics on most laptops, which can actually get quite decent (30x) speedups. > This is a good find. There is some work involved, but it is good: it gives transparent access to the underlying hardware. I wish NumPy operations automatically used the available resources; that would be more concise and would give the scientific community an edge. I am not saying they are not good programmers, but it would let them focus on the main problem at hand. Let me explore it further. Thanks for sharing. >> On 19 Oct 2019, at 3:39 am, Pankaj Jangid wrote: >> I wonder why NVIDIA's approach is so widely accepted. Sometimes, I >> regret purchasing AMD GPUs. Not much support for them. > > I agree. I am very disappointed by the NVIDIA monopoly in scientific computing. Resist! > Really, very disappointing. :-( Regards, -- Pankaj Jangid From matti.picus at gmail.com Wed Oct 23 05:23:33 2019 From: matti.picus at gmail.com (Matti Picus) Date: Wed, 23 Oct 2019 12:23:33 +0300 Subject: [Numpy-discussion] Numpy community meeting Wed Oct 23 Message-ID: Hi all, There will be a NumPy Community meeting Wednesday Oct 23 at 11 am Pacific Time. Everyone is invited to join in and edit the work-in-progress meeting topics and notes: https://hackmd.io/76o-IxCjQX2mOXO_wwkcpg?both Matti From ralf.gommers at gmail.com Thu Oct 24 05:03:40 2019 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Thu, 24 Oct 2019 11:03:40 +0200 Subject: [Numpy-discussion] proposed NEP template change Message-ID: Hi all, In https://github.com/numpy/numpy/pull/14734 we're working on a NEP template update, with the goal of better separating content that's relevant for users and authors of packages that depend on NumPy from implementation details. Please have a look and comment on the PR if you're interested in this topic. Cheers, Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From einstein.edison at gmail.com Mon Oct 28 15:23:02 2019 From: einstein.edison at gmail.com (Hameer Abbasi) Date: Mon, 28 Oct 2019 19:23:02 +0000 Subject: [Numpy-discussion] =?windows-1252?q?New_draft_of_NEP_31_=97_Cont?= =?windows-1252?q?ext-local_and_global_overrides_of_the_NumPy_API?= Message-ID: Hello everyone, I've improved upon the content of NEP 31 to make it simpler, and, following the new NEP template, only part of the NEP is being sent out to the mailing list. For the full NEP, please see PR 14793.

============================================================
NEP 31 — Context-local and global overrides of the NumPy API
============================================================

:Author: Hameer Abbasi
:Author: Ralf Gommers
:Author: Peter Bell
:Status: Draft
:Type: Standards Track
:Created: 2019-08-22

Abstract
--------

This NEP proposes to make all of NumPy's public API overridable via an extensible backend mechanism. Acceptance of this NEP means NumPy would provide global and context-local overrides in a separate namespace, as well as a dispatch mechanism similar to NEP-18 [2]_. First experiences with ``__array_function__`` show that it is necessary to be able to override NumPy functions that *do not take an array-like argument*, and hence aren't overridable via ``__array_function__``. The most pressing need is array creation and coercion functions, such as ``numpy.zeros`` or ``numpy.asarray``; see e.g. NEP-30 [9]_. This NEP proposes to allow, in an opt-in fashion, overriding any part of the NumPy API. It is intended as a comprehensive resolution to NEP-22 [3]_, and obviates the need to add an ever-growing list of new protocols for each new type of function or object that needs to become overridable.

Motivation and Scope
--------------------

The primary end-goal of this NEP is to make the following possible:

.. code:: python

    # On the library side
    import numpy.overridable as unp

    def library_function(array):
        array = unp.asarray(array)
        # Code using unumpy as usual
        return array

    # On the user side:
    import numpy.overridable as unp
    import uarray as ua
    import dask.array as da

    ua.register_backend(da)  # Can be done within Dask itself

    library_function(dask_array)  # works and returns dask_array

    with unp.set_backend(da):
        library_function([1, 2, 3, 4])  # actually returns a Dask array.

Here, ``backend`` can be any compatible object defined either by NumPy or an external library, such as Dask or CuPy. Ideally, it should be the module ``dask.array`` or ``cupy`` itself. These kinds of overrides are useful for both the end-user as well as library authors. End-users may have written or wish to write code that they then later speed up or move to a different implementation, say PyData/Sparse. They can do this simply by setting a backend. Library authors may also wish to write code that is portable across array implementations, for example ``sklearn`` may wish to write code for a machine learning algorithm that is portable across array implementations while also using array creation functions. This NEP takes a holistic approach: It assumes that there are parts of the API that need to be overridable, and that these will grow over time. It provides a general framework and a mechanism to avoid a design of a new protocol each time this is required. This was the goal of ``uarray``: to allow for overrides in an API without needing the design of a new protocol.
This NEP proposes the following: That ``unumpy`` [8]_ becomes the recommended override mechanism for the parts of the NumPy API not yet covered by ``__array_function__`` or ``__array_ufunc__``, and that ``uarray`` is vendored into a new namespace within NumPy to give users and downstream dependencies access to these overrides. This vendoring mechanism is similar to what SciPy decided to do for making ``scipy.fft`` overridable (see [10]_). The motivation behind ``uarray`` is manifold: First, there have been several attempts to allow dispatch of parts of the NumPy API, including (most prominently) the ``__array_ufunc__`` protocol in NEP-13 [4]_, and the ``__array_function__`` protocol in NEP-18 [2]_, but this has shown the need for further protocols to be developed, including a protocol for coercion (see [5]_, [9]_). The reasons these overrides are needed have been extensively discussed in the references, and this NEP will not attempt to go into the details of why these are needed; but in short: It is necessary for library authors to be able to coerce arbitrary objects into arrays of their own types, such as CuPy needing to coerce to a CuPy array, for example, instead of a NumPy array. In simpler words, one needs things like ``np.asarray(...)`` or an alternative to "just work" and return duck-arrays.

Usage and Impact
----------------

This NEP allows for global and context-local overrides, as well as automatic overrides a-la ``__array_function__``. Here are some use-cases this NEP would enable, besides the first one stated in the motivation section: The first is allowing alternate dtypes to return their respective arrays.

.. code:: python

    # Returns an XND array
    x = unp.ones((5, 5), dtype=xnd_dtype)  # Or torch dtype

The second is allowing overrides for parts of the API. This is to allow alternate and/or optimised implementations for ``np.linalg``, BLAS, and ``np.random``.

.. code:: python

    import numpy as np
    import pyfftw  # Or mkl_fft

    # Makes pyfftw the default for FFT
    np.set_global_backend(pyfftw)

    # Uses pyfftw without monkeypatching
    np.fft.fft(numpy_array)

    with np.set_backend(pyfftw):  # Or mkl_fft, or numpy
        # Uses the backend you specified
        np.fft.fft(numpy_array)

This will allow an official way for overrides to work with NumPy without monkeypatching or distributing a modified version of NumPy. Here are a few other use-cases, implied but not already stated:

.. code:: python

    data = da.from_zarr('myfile.zarr')
    # result should still be dask, all things being equal
    result = library_function(data)
    result.to_zarr('output.zarr')

This second one would work if ``magic_library`` was built on top of ``unumpy``.

.. code:: python

    from dask import array as da
    from magic_library import pytorch_predict

    data = da.from_zarr('myfile.zarr')
    # normally here one would use e.g. data.map_overlap
    result = pytorch_predict(data)
    result.to_zarr('output.zarr')

Backward compatibility
----------------------

There are no backward incompatible changes proposed in this NEP.

-------------- next part -------------- An HTML attachment was scrubbed... URL: From albuscode at gmail.com Tue Oct 29 00:21:51 2019 From: albuscode at gmail.com (Inessa Pawson) Date: Tue, 29 Oct 2019 00:21:51 -0400 Subject: [Numpy-discussion] NumPy amuse-bouche Message-ID: We are looking for code snippets that illustrate NumPy's unique capabilities.
To submit your ideas or for further info please refer to the following GitHub discussion: github.com/numpy/numpy.org/issues/40#issuecomment-534303380 -- Every good wish, Inessa Pawson NumPy Web Team -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Tue Oct 29 06:21:59 2019 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Tue, 29 Oct 2019 11:21:59 +0100 Subject: [Numpy-discussion] NumPy amuse-bouche In-Reply-To: References: Message-ID: On Tue, Oct 29, 2019 at 5:22 AM Inessa Pawson wrote: > We are looking for code snippets that illustrate NumPy's unique > capabilities. > Maybe good to clarify a bit: we are looking for a short snippet to put at the top of the main page of the new website. Say 3-10 lines of code. It should illustrate the power and elegance of the NumPy API - whatever that means to you.
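For example, something in this spirit (just one possible flavour, not a prescription):

    import numpy as np

    # pairwise distances between five random points, via broadcasting alone
    points = np.random.rand(5, 2)
    distances = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    print(distances.round(2))
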
Cheers, Ralf To submit your ideas or for further info please refer to the following > GitHub discussion: > github.com/numpy/numpy.org/issues/40#issuecomment-534303380 > > -- > Every good wish, > Inessa Pawson > NumPy Web Team > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From matti.picus at gmail.com Tue Oct 29 09:01:54 2019 From: matti.picus at gmail.com (Matti Picus) Date: Tue, 29 Oct 2019 15:01:54 +0200 Subject: [Numpy-discussion] NEP 34 - deprecate automatic dtype=object on ragged arrays Message-ID: <143ca54d-5252-6639-cb2a-e387497e1161@gmail.com> After a few iterations by reviewers, I would like to submit NEP 34 to deprecate automatically using dtype=object for ragged arrays. https://github.com/numpy/numpy/pull/14674 and an associated PR for the implementation https://github.com/numpy/numpy/pull/14794 Comments? Matti

Abstract
--------

When users create arrays with sequences-of-sequences, they sometimes err in matching the lengths of the nested sequences_, commonly called "ragged arrays". Here we will refer to them as ragged nested sequences. Creating such arrays via ``np.array([])`` with no ``dtype`` keyword argument will today default to an ``object``-dtype array. Change the behaviour to raise a ``ValueError`` instead.

Motivation and Scope
--------------------

Users who specify lists-of-lists when creating a `numpy.ndarray` via ``np.array`` may mistakenly pass in lists of different lengths. Currently we accept this input and automatically create an array with ``dtype=object``. This can be confusing, since it is rarely what is desired. Changing the automatic dtype detection to never return ``object`` for ragged nested sequences (defined as a recursive sequence of sequences, where not all the sequences on the same level have the same length) will force users who actually wish to create ``object`` arrays to specify that explicitly. Note that ``lists``, ``tuples``, and ``np.ndarrays`` are all sequences [0]_. See for instance `issue 5303`_.

Usage and Impact
----------------

After this change, array creation with ragged nested sequences must explicitly define a dtype:

    >>> np.array([[1, 2], [1]])
    ValueError: cannot guess the desired dtype from the input

    >>> np.array([[1, 2], [1]], dtype=object)
    # succeeds, with no change from current behaviour

The deprecation will affect any call that internally calls ``np.asarray``. For instance, the ``assert_equal`` family of functions calls ``np.asarray``, so users will have to change code like::

    np.assert_equal(a, [[1, 2], 3])

to::

    np.assert_equal(a, np.array([[1, 2], 3], dtype=object))

From matti.picus at gmail.com Tue Oct 29 14:01:10 2019 From: matti.picus at gmail.com (Matti Picus) Date: Tue, 29 Oct 2019 20:01:10 +0200 Subject: [Numpy-discussion] Numpy community meeting Wed Oct 30 Message-ID: <0d7420cb-3d14-3bdd-ef83-93146446eec9@gmail.com> Hi all, There will be a NumPy Community meeting Wednesday Oct 30 at 11 am Pacific Time. Everyone is invited to join in and edit the work-in-progress meeting topics and notes: https://hackmd.io/76o-IxCjQX2mOXO_wwkcpg?both Matti From daniele at grinta.net Wed Oct 30 19:23:05 2019 From: daniele at grinta.net (Daniele Nicolodi) Date: Wed, 30 Oct 2019 17:23:05 -0600 Subject: [Numpy-discussion] argmax() indexes to value Message-ID: <3c8ed265-06bb-c527-f103-af827ee6597d@grinta.net> Hello, this is a very basic question, but I cannot find a satisfying answer. Assume a is a 2D array and that I get the index of the maximum value along the second dimension: i = a.argmax(axis=1) Is there a better way to get the value of the maximum array entries along the second axis other than: v = a[np.arange(len(a)), i] ?? Thank you. Cheers, Daniele From ndbecker2 at gmail.com Wed Oct 30 21:10:01 2019 From: ndbecker2 at gmail.com (Neal Becker) Date: Wed, 30 Oct 2019 21:10:01 -0400 Subject: [Numpy-discussion] argmax() indexes to value In-Reply-To: <3c8ed265-06bb-c527-f103-af827ee6597d@grinta.net> References: <3c8ed265-06bb-c527-f103-af827ee6597d@grinta.net> Message-ID: max(axis=1)? On Wed, Oct 30, 2019, 7:33 PM Daniele Nicolodi wrote: > Hello, > > this is a very basic question, but I cannot find a satisfying answer. > Assume a is a 2D array and that I get the index of the maximum value > along the second dimension: > > i = a.argmax(axis=1) > > Is there a better way to get the value of the maximum array entries > along the second axis other than: > > v = a[np.arange(len(a)), i] > > ?? > > Thank you.
> > Cheers, > Daniele > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From daniele at grinta.net Wed Oct 30 23:31:46 2019 From: daniele at grinta.net (Daniele Nicolodi) Date: Wed, 30 Oct 2019 21:31:46 -0600 Subject: [Numpy-discussion] argmax() indexes to value In-Reply-To: References: <3c8ed265-06bb-c527-f103-af827ee6597d@grinta.net> Message-ID: <526001fa-32b9-a948-7101-e742ee1f0d7d@grinta.net> On 30/10/2019 19:10, Neal Becker wrote: > max(axis=1)? Hi Neal, I should have been more precise in stating the problem. Getting the values in the array for which I'm looking at the maxima is only one step in a more complex piece of code for which I need the indexes along the second axis of the array. I would like to avoid to have to iterate the array more than once. Thank you! Cheers, Dan > On Wed, Oct 30, 2019, 7:33 PM Daniele Nicolodi > wrote: > > Hello, > > this is a very basic question, but I cannot find a satisfying answer. > Assume a is a 2D array and that I get the index of the maximum value > along the second dimension: > > i = a.argmax(axis=1) > > Is there a better way to get the value of the maximum array entries > along the second axis other than: > > v = a[np.arange(len(a)), i] > > ?? > > Thank you. > > Cheers, > Daniele > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > From Permafacture at gmail.com Thu Oct 31 00:42:34 2019 From: Permafacture at gmail.com (Elliot Hallmark) Date: Wed, 30 Oct 2019 23:42:34 -0500 Subject: [Numpy-discussion] argmax() indexes to value In-Reply-To: <526001fa-32b9-a948-7101-e742ee1f0d7d@grinta.net> References: <3c8ed265-06bb-c527-f103-af827ee6597d@grinta.net> <526001fa-32b9-a948-7101-e742ee1f0d7d@grinta.net> Message-ID: I wouldn't be surprised at all if calling max in addition to argmax wasn't as fast or faster than indexing the array using argmax. Regardless, just use that then profile when you're done with the whole thing and see if there's any gains to be made. Very likely not here. -elliot On Wed, Oct 30, 2019, 10:32 PM Daniele Nicolodi wrote: > On 30/10/2019 19:10, Neal Becker wrote: > > max(axis=1)? > > Hi Neal, > > I should have been more precise in stating the problem. Getting the > values in the array for which I'm looking at the maxima is only one step > in a more complex piece of code for which I need the indexes along the > second axis of the array. I would like to avoid to have to iterate the > array more than once. > > Thank you! > > Cheers, > Dan > > > > On Wed, Oct 30, 2019, 7:33 PM Daniele Nicolodi > > wrote: > > > > Hello, > > > > this is a very basic question, but I cannot find a satisfying answer. > > Assume a is a 2D array and that I get the index of the maximum value > > along the second dimension: > > > > i = a.argmax(axis=1) > > > > Is there a better way to get the value of the maximum array entries > > along the second axis other than: > > > > v = a[np.arange(len(a)), i] > > > > ?? > > > > Thank you. > > > > Cheers, > > Daniele > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at python.org > > https://mail.python.org/mailman/listinfo/numpy-discussion > > > > > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at python.org > > https://mail.python.org/mailman/listinfo/numpy-discussion > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From daniele at grinta.net Thu Oct 31 01:34:20 2019 From: daniele at grinta.net (Daniele Nicolodi) Date: Wed, 30 Oct 2019 23:34:20 -0600 Subject: [Numpy-discussion] argmax() indexes to value In-Reply-To: References: <3c8ed265-06bb-c527-f103-af827ee6597d@grinta.net> <526001fa-32b9-a948-7101-e742ee1f0d7d@grinta.net> Message-ID: <7ea8b98b-5c72-5475-025b-3b86292fec0b@grinta.net> On 30/10/2019 22:42, Elliot Hallmark wrote: > I wouldn't be surprised at all if calling max in addition to argmax > wasn't as fast or faster than indexing the array using argmax. > Regardless, just use that then profile when you're done with the > whole thing and see if there's any gains to be made. Very likely not here. Hi Elliot, how do you arrive at this conclusion?
np.argmax() and np.max() are O(N) while indexing is O(1) thus I don't see how you can conclude that running both np.argmax() and np.max() on the input array is going to incur in a small penalty compared to running np.argmax() and then indexing. Cheers, Dan > > -elliot > > On Wed, Oct 30, 2019, 10:32 PM Daniele Nicolodi > wrote: > > On 30/10/2019 19:10, Neal Becker wrote: > > max(axis=1)? > > Hi Neal, > > I should have been more precise in stating the problem. Getting the > values in the array for which I'm looking at the maxima is only one step > in a more complex piece of code for which I need the indexes along the > second axis of the array. I would like to avoid to have to iterate the > array more than once. > > Thank you! > > Cheers, > Dan > > > > On Wed, Oct 30, 2019, 7:33 PM Daniele Nicolodi > > wrote: > > > > Hello, > > > > this is a very basic question, but I cannot find a satisfying > answer. > > Assume a is a 2D array and that I get the index of the maximum > value > > along the second dimension: > > > > i = a.argmax(axis=1) > > > > Is there a better way to get the value of the maximum array > entries > > along the second axis other than: > > > > v = a[np.arange(len(a)), i] > > > > ?? > > > > Thank you. > > > > Cheers, > > Daniele > > _______________________________________________ > > NumPy-Discussion mailing list > > NumPy-Discussion at python.org > > https://mail.python.org/mailman/listinfo/numpy-discussion > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > From Permafacture at gmail.com Thu Oct 31 03:44:29 2019 From: Permafacture at gmail.com (Elliot Hallmark) Date: Thu, 31 Oct 2019 02:44:29 -0500 Subject: [Numpy-discussion] argmax() indexes to value In-Reply-To: <7ea8b98b-5c72-5475-025b-3b86292fec0b@grinta.net> References: <3c8ed265-06bb-c527-f103-af827ee6597d@grinta.net> <526001fa-32b9-a948-7101-e742ee1f0d7d@grinta.net> <7ea8b98b-5c72-5475-025b-3b86292fec0b@grinta.net> Message-ID: Depends on how big your array is. Numpy C code is 150x+ faster than python overhead. Fancy indexing can be expensive in my experience. Without trying I'd guess arr[:, argmax(arr, axis=1)] does what you want, but even if it is, try profiling the two and see. I highly doubt such would be even 1% of your run time, but it depends on what you're doing. Part of python with numpy is slightly not caring about big O because trying to be clever is rarely worth it in my experience.
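For reference, a sketch of the same pattern via `np.take_along_axis` (available since NumPy 1.15), which pairs each row with its own argmax index:

    import numpy as np

    a = np.random.rand(4, 5)
    i = a.argmax(axis=1)
    # equivalent to a[np.arange(len(a)), i], without building the range array
    v = np.take_along_axis(a, i[:, None], axis=1).squeeze(axis=1)
    assert np.array_equal(v, a.max(axis=1))

Whether that beats the fancy-indexing version is, again, something to profile.
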
On Thu, Oct 31, 2019 at 12:35 AM Daniele Nicolodi wrote: > On 30/10/2019 22:42, Elliot Hallmark wrote: > > I wouldn't be surprised at all if calling max in addition to argmax > > wasn't as fast or faster than indexing the array using argmax. > > Regardless, just use that then profile when you're done with the > > whole thing and see if there's any gains to be made. Very likely not > here. > > Hi Elliot, > > how do you arrive at this conclusion? np.argmax() and np.max() are O(N) > while indexing is O(1) thus I don't see how you can conclude that > running both np.argmax() and np.max() on the input array is going to > incur in a small penalty compared to running np.argmax() and then indexing. > > Cheers, > Dan > > > > > -elliot > > > > On Wed, Oct 30, 2019, 10:32 PM Daniele Nicolodi > > wrote: > > > > On 30/10/2019 19:10, Neal Becker wrote: > > > max(axis=1)? > > > > Hi Neal, > > > > I should have been more precise in stating the problem. Getting the > > values in the array for which I'm looking at the maxima is only one > step > > in a more complex piece of code for which I need the indexes along > the > > second axis of the array. I would like to avoid to have to iterate > the > > array more than once. > > > > Thank you! > > > > Cheers, > > Dan > > > > > > > On Wed, Oct 30, 2019, 7:33 PM Daniele Nicolodi > > > > wrote: > > > > > > Hello, > > > > > > this is a very basic question, but I cannot find a satisfying > > answer. > > > Assume a is a 2D array and that I get the index of the maximum > > value > > > along the second dimension: > > > > > > i = a.argmax(axis=1) > > > > > > Is there a better way to get the value of the maximum array > > entries > > > along the second axis other than: > > > > > > v = a[np.arange(len(a)), i] > > > > > > ?? > > > > > > Thank you. > > > > > > Cheers, > > > Daniele > > > _______________________________________________ > > > NumPy-Discussion mailing list > > > NumPy-Discussion at python.org > > > https://mail.python.org/mailman/listinfo/numpy-discussion > > > > > > _______________________________________________ > > > NumPy-Discussion mailing list > > > NumPy-Discussion at python.org > > > https://mail.python.org/mailman/listinfo/numpy-discussion > > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at python.org > https://mail.python.org/mailman/listinfo/numpy-discussion > -------------- next part -------------- An HTML attachment was scrubbed... URL: From pierre.augier at univ-grenoble-alpes.fr Thu Oct 31 16:16:06 2019 From: pierre.augier at univ-grenoble-alpes.fr (PIERRE AUGIER) Date: Thu, 31 Oct 2019 21:16:06 +0100 (CET) Subject: [Numpy-discussion] Transonic Vision: unifying Python-Numpy accelerators Message-ID: <1080118635.5930814.1572552966711.JavaMail.zimbra@univ-grenoble-alpes.fr> Dear Python-Numpy community, A few years ago I started to use Python and NumPy a lot for science. I'd like to thank all the people who contribute to this fantastic community. I have used Cython, Pythran and Numba a lot, and for the FluidDyn project we created Transonic, a pure Python package to easily accelerate modern Python-Numpy code with different accelerators. We wrote a long and serious text to explain why we think Transonic could have a positive impact on the scientific Python ecosystem. Here it is: http://tiny.cc/transonic-vision Feedback and discussions would be greatly appreciated!
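To give a first taste of the approach here (a small sketch in the spirit of the Transonic documentation; the `boost` decorator and the string type annotations follow the project's docs, so please treat the exact API as an assumption and check the linked text):

    import numpy as np
    from transonic import boost  # assumed import, per the Transonic docs

    @boost
    def row_sums(a: "float64[:, :]"):
        # plain Python-NumPy code; Transonic can hand it to Pythran,
        # Cython or Numba without a rewrite
        return a.sum(axis=1)

    print(row_sums(np.ones((3, 4))))
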
Pierre -- Pierre Augier - CR CNRS http://www.legi.grenoble-inp.fr LEGI (UMR 5519) Laboratoire des Ecoulements Geophysiques et Industriels BP53, 38041 Grenoble Cedex, France tel:+33.4.56.52.86.16