From wesmckinn at gmail.com  Thu Dec  1 10:06:46 2011
From: wesmckinn at gmail.com (Wes McKinney)
Date: Thu, 1 Dec 2011 10:06:46 -0500
Subject: [SciPy-User] [Numpy-discussion] what statistical module to use for python?
In-Reply-To:
References:
Message-ID:

On Wed, Nov 30, 2011 at 8:41 PM, wrote:
>
> On Wed, Nov 30, 2011 at 1:16 PM, Chao YUE wrote:
>> Hi all,
>
> This is more a question for the scipy-user mailing list since that is
> for more general questions.
>
> I would also like to know since I have a biased or selective view.
>
>> I just want to broadly ask what statistical package are you guys using? I
>> mean routine statistical functions like linear regression, GLM, ANOVA... etc.
>>
>> I know there are SciKits packages like statsmodels, but are there more
>> general and complete ones?
>
> (Not counting rpy2 since it's not available on Windows anymore.)
>
> I think there are more complete packages on specific topics, but
> nothing in python that is complete and general; that's where
> statsmodels tries to be.
>
> sklearn is machine learning oriented but also covers a large area of
> statistical methods.
>
> Besides scipy.stats, statsmodels and sklearn, I don't know any that
> aim to be general and not field specific. (scipy and numpy also have
> features that make do-it-yourself easy.)
>
> But there are many more field or topic specific packages ...
> (Bayesian, spatial, discrete choice (transport), and then by scientific field.)
>
> http://www.scipy.org/Topical_Software - doesn't include a statistics section
>
> An overview or survey of packages and statistical methods (in a very
> broad definition) would be useful.
>
> Thanks,
>
> Josef
>
>> thanks to all,
>>
>> Chao
>> --
>> ***********************************************************************************
>> Chao YUE
>> Laboratoire des Sciences du Climat et de l'Environnement (LSCE-IPSL)
>> UMR 1572 CEA-CNRS-UVSQ
>> Batiment 712 - Pe 119
>> 91191 GIF Sur YVETTE Cedex
>> Tel: (33) 01 69 08 29 02; Fax: 01.69.08.77.16
>> ************************************************************************************

I think that statsmodels is the right place for the kinds of models
and analysis you're referring to. We would love more contributors to
make it more complete, e.g. I don't think it has much in the way of
ANOVA yet (probably needs the formula framework to be set up first).

From aronne.merrelli at gmail.com  Thu Dec  1 17:36:04 2011
From: aronne.merrelli at gmail.com (Aronne Merrelli)
Date: Thu, 1 Dec 2011 16:36:04 -0600
Subject: [SciPy-User] Subclassing scipy sparse matrix class
In-Reply-To:
References:
Message-ID:

On Wed, Nov 30, 2011 at 3:01 AM, Per Nielsen wrote:

> Hi all
>
> I am trying to create a subclass of the sparse matrix class in scipy, to
> add some extra methods I need.
>
> I have tried to follow the guide on: http://www.scipy.org/Subclasses but
> without much luck, the view method does not exist for the sparse matrix
> class.
Below is a script I have created.

Hi,

It appears that sparse matrices do not inherit from numpy.ndarray:

In [5]: sparse_mat = csr_matrix( np.ones(3) )
In [7]: isinstance( sparse_mat, np.ndarray )
Out[7]: False

So much of the numpy-specific information on that page at scipy.org is
not relevant for a sparse matrix subclass. I would assume subclassing
csr_matrix would essentially look more like plain python subclassing.
However, playing around with this, I quickly found what appears to be a
sparse matrix-specific aspect. The sparse matrix format is based on the
name of the class - so if you want this to work you have to name the
subclass with the same 3 letters as the desired format ("csr" in this
case). Here is a minimal example that works - note the fail_matrix doesn't
work, and causes an attribute error just because of the name:

import numpy as np
from scipy.sparse.csr import csr_matrix

class csr_matrix_alt(csr_matrix):
    def __init__(self, *args, **kwargs):
        csr_matrix.__init__(self, *args, **kwargs)
    def square_spmat(self):
        return self ** 2

class fail_matrix(csr_matrix):
    pass

x = np.array( [[1, 0], [1, 3]] )
xsparse = csr_matrix_alt(x)
xsparse_sq = xsparse.square_spmat()

print xsparse.todense()
print xsparse_sq.todense()

xfail = fail_matrix(x)

Here is the output I get, running from ipython:

In [2]: execfile('spsub_example.py')
[[1 0]
 [1 3]]
[[1 0]
 [4 9]]
---------------------------------------------------------------------------
AttributeError                 Traceback (most recent call last)

AttributeError: tofai not found

From pav at iki.fi  Thu Dec  1 19:25:45 2011
From: pav at iki.fi (Pauli Virtanen)
Date: Fri, 02 Dec 2011 01:25:45 +0100
Subject: [SciPy-User] No Scipy 0.10 reference manual
In-Reply-To: <4ED6A935.4040506@gmail.com>
References: <4ED6A935.4040506@gmail.com>
Message-ID:

30.11.2011 23:07, Miha Marolt kirjoitti:
> There is no link to the Scipy 0.10 reference guide on the documentation
> page (http://docs.scipy.org/doc/).

Fixed.

--
Pauli Virtanen

From devdoer2 at gmail.com  Fri Dec  2 03:40:48 2011
From: devdoer2 at gmail.com (devdoer bird)
Date: Fri, 2 Dec 2011 16:40:48 +0800
Subject: [SciPy-User] could you explain the line_search method?
Message-ID:

Hi:

Sorry to post here if it's not a proper question.

I'm reading the source code of the line_search function in optimize.py,
and there's a statement to compute the initial step alpha, listed below:

alpha1 = pymin(1.0, 1.01*2*(phi0-old_old_fval)/derphi0)

I don't know what the physical meaning of
1.01*2*(phi0-old_old_fval)/derphi0 is.

Can someone give me an explanation?

Thanks!

From evilper at gmail.com  Fri Dec  2 03:47:41 2011
From: evilper at gmail.com (Per Nielsen)
Date: Fri, 2 Dec 2011 09:47:41 +0100
Subject: [SciPy-User] Subclassing scipy sparse matrix class
In-Reply-To:
References:
Message-ID:

Hi all,

Indeed it seems to work as I wanted it to. Thanks a lot for the help :)

Per

On Thu, Dec 1, 2011 at 23:36, Aronne Merrelli wrote:

> On Wed, Nov 30, 2011 at 3:01 AM, Per Nielsen wrote:
>
>> Hi all
>>
>> I am trying to create a subclass of the sparse matrix class in scipy, to
>> add some extra methods I need.
>>
>> I have tried to follow the guide on: http://www.scipy.org/Subclasses but
>> without much luck, the view method does not exist for the sparse matrix
>> class.
Below is a script I have created.
>
> Hi,
>
> It appears that sparse matrices do not inherit from numpy.ndarray:
>
> In [5]: sparse_mat = csr_matrix( np.ones(3) )
> In [7]: isinstance( sparse_mat, np.ndarray )
> Out[7]: False
>
> So much of the numpy-specific information on that page at scipy.org is
> not relevant for a sparse matrix subclass. I would assume subclassing
> csr_matrix would essentially look more like plain python subclassing.
> However, playing around with this, I quickly found what appears to be a
> sparse matrix-specific aspect. The sparse matrix format is based on the
> name of the class - so if you want this to work you have to name the
> subclass with the same 3 letters as the desired format ("csr" in this
> case). Here is a minimal example that works - note the fail_matrix
> doesn't work, and causes an attribute error just because of the name:
>
> import numpy as np
> from scipy.sparse.csr import csr_matrix
>
> class csr_matrix_alt(csr_matrix):
>     def __init__(self, *args, **kwargs):
>         csr_matrix.__init__(self, *args, **kwargs)
>     def square_spmat(self):
>         return self ** 2
>
> class fail_matrix(csr_matrix):
>     pass
>
> x = np.array( [[1, 0], [1, 3]] )
> xsparse = csr_matrix_alt(x)
> xsparse_sq = xsparse.square_spmat()
>
> print xsparse.todense()
> print xsparse_sq.todense()
>
> xfail = fail_matrix(x)
>
> Here is the output I get, running from ipython:
>
> In [2]: execfile('spsub_example.py')
> [[1 0]
>  [1 3]]
> [[1 0]
>  [4 9]]
> ---------------------------------------------------------------------------
> AttributeError                 Traceback (most recent call last)
>
> AttributeError: tofai not found

From bsouthey at gmail.com  Fri Dec  2 09:39:24 2011
From: bsouthey at gmail.com (Bruce Southey)
Date: Fri, 02 Dec 2011 08:39:24 -0600
Subject: [SciPy-User] Subclassing scipy sparse matrix class
In-Reply-To:
References:
Message-ID: <4ED8E31C.3040406@gmail.com>

On 12/01/2011 04:36 PM, Aronne Merrelli wrote:
>
> On Wed, Nov 30, 2011 at 3:01 AM, Per Nielsen wrote:
>
>     Hi all
>
>     I am trying to create a subclass of the sparse matrix class in
>     scipy, to add some extra methods I need.
>
>     I have tried to follow the guide on:
>     http://www.scipy.org/Subclasses but without much luck, the view
>     method does not exist for the sparse matrix class. Below is a
>     script I have created.
>
> Hi,
>
> It appears that sparse matrices do not inherit from numpy.ndarray:

Surely you did notice that sparse is part of scipy, not numpy, or even
the C++ usage when looking at the code? :-)

As far as I know (which is not much) scipy.sparse is essentially
self-contained in the scipy/sparse directory. So you are better off just
working with those files directly.

A common thought that I have when I'm reading about 'extra methods' is
that other people could have them or would like them. So perhaps think
about making a contribution.

Bruce

> In [5]: sparse_mat = csr_matrix( np.ones(3) )
> In [7]: isinstance( sparse_mat, np.ndarray )
> Out[7]: False
>
> So much of the numpy-specific information on that page at scipy.org
> is not relevant for a sparse matrix subclass. I
> would assume subclassing csr_matrix would essentially look more like
> plain python subclassing. However, playing around with this, I quickly
> found what appears to be a sparse matrix-specific aspect.
> The sparse matrix format is based on the name of the class - so if you
> want this to work you have to name the subclass with the same 3 letters
> as the desired format ("csr" in this case). Here is a minimal example
> that works - note the fail_matrix doesn't work, and causes an attribute
> error just because of the name:
>
> import numpy as np
> from scipy.sparse.csr import csr_matrix
>
> class csr_matrix_alt(csr_matrix):
>     def __init__(self, *args, **kwargs):
>         csr_matrix.__init__(self, *args, **kwargs)
>     def square_spmat(self):
>         return self ** 2
>
> class fail_matrix(csr_matrix):
>     pass
>
> x = np.array( [[1, 0], [1, 3]] )
> xsparse = csr_matrix_alt(x)
> xsparse_sq = xsparse.square_spmat()
>
> print xsparse.todense()
> print xsparse_sq.todense()
>
> xfail = fail_matrix(x)
>
> Here is the output I get, running from ipython:
>
> In [2]: execfile('spsub_example.py')
> [[1 0]
>  [1 3]]
> [[1 0]
>  [4 9]]
> ---------------------------------------------------------------------------
> AttributeError                 Traceback (most recent call last)
>
> AttributeError: tofai not found

From likchuan at gmail.com  Fri Dec  2 04:04:16 2011
From: likchuan at gmail.com (LLCB)
Date: Fri, 2 Dec 2011 01:04:16 -0800 (PST)
Subject: [SciPy-User] [SciPy-user] NameError: name 'common' is not defined Scipy 10.0
Message-ID: <32901047.post@talk.nabble.com>

Hi,

I have installed scipy 0.10.0 and numpy 1.5.1 on RHEL5.

When I run a script, I got the following error:

File "/home/cbl/Research/code/closer/closerv2/closerv2source/getcenter.py",
line 5, in <module>
    from scipy.optimize import leastsq
File "/usr/lib64/python2.6/site-packages/scipy/optimize/__init__.py", line
135, in <module>
    from nonlin import *
File "/usr/lib64/python2.6/site-packages/scipy/optimize/nonlin.py", line
116, in <module>
    from scipy.linalg import norm, solve, inv, qr, svd, lstsq, LinAlgError
File "/usr/lib64/python2.6/site-packages/scipy/linalg/__init__.py", line
120, in <module>
    from decomp_qr import *
File "/usr/lib64/python2.6/site-packages/scipy/linalg/decomp_qr.py", line
7, in <module>
    import special_matrices
File
"/usr/lib64/python2.6/site-packages/scipy/linalg/special_matrices.py", line
4, in <module>
    from scipy.misc import comb
File "/usr/lib64/python2.6/site-packages/scipy/misc/__init__.py", line 22,
in <module>
    __all__ += common.__all__
NameError: name 'common' is not defined

It works in my other Ubuntu workstation with an older version of numpy and
scipy.

Any help is greatly appreciated!

--
View this message in context: http://old.nabble.com/NameError%3A-name-%27common%27-is-not-defined--Scipy-10.0-tp32901047p32901047.html
Sent from the Scipy-User mailing list archive at Nabble.com.

From bsouthey at gmail.com  Fri Dec  2 12:45:06 2011
From: bsouthey at gmail.com (Bruce Southey)
Date: Fri, 02 Dec 2011 11:45:06 -0600
Subject: [SciPy-User] [SciPy-user] NameError: name 'common' is not defined Scipy 10.0
In-Reply-To: <32901047.post@talk.nabble.com>
References: <32901047.post@talk.nabble.com>
Message-ID: <4ED90EA2.7030304@gmail.com>

On 12/02/2011 03:04 AM, LLCB wrote:
> Hi,
>
> I have installed scipy 0.10.0 and numpy 1.5.1 on RHEL5.
> When I run a script, I got the following error:
>
> File "/home/cbl/Research/code/closer/closerv2/closerv2source/getcenter.py",
> line 5, in <module>
>     from scipy.optimize import leastsq
> File "/usr/lib64/python2.6/site-packages/scipy/optimize/__init__.py", line
> 135, in <module>
>     from nonlin import *
> File "/usr/lib64/python2.6/site-packages/scipy/optimize/nonlin.py", line
> 116, in <module>
>     from scipy.linalg import norm, solve, inv, qr, svd, lstsq, LinAlgError
> File "/usr/lib64/python2.6/site-packages/scipy/linalg/__init__.py", line
> 120, in <module>
>     from decomp_qr import *
> File "/usr/lib64/python2.6/site-packages/scipy/linalg/decomp_qr.py", line
> 7, in <module>
>     import special_matrices
> File
> "/usr/lib64/python2.6/site-packages/scipy/linalg/special_matrices.py", line
> 4, in <module>
>     from scipy.misc import comb
> File "/usr/lib64/python2.6/site-packages/scipy/misc/__init__.py", line 22,
> in <module>
>     __all__ += common.__all__
> NameError: name 'common' is not defined
>
> It works in my other Ubuntu workstation with an older version of numpy and
> scipy.
>
> Any help is greatly appreciated!

Your scipy installation may not be correct, so I would suggest
reinstalling the existing scipy. Just make sure that
/usr/lib64/python2.6/site-packages/scipy is gone or at least totally
empty before you do the actual installation.

Also: If you installed a scipy package then make sure you got all the
correct packages for your system. If you built it from source then first
make sure to clean out your build directory.

Bruce

From tloramus at gmail.com  Fri Dec  2 14:04:20 2011
From: tloramus at gmail.com (Miha Marolt)
Date: Fri, 2 Dec 2011 19:04:20 +0000 (UTC)
Subject: [SciPy-User] No Scipy 0.10 reference manual
References: <4ED6A935.4040506@gmail.com>
Message-ID:

Pauli Virtanen <pav at iki.fi> writes:
>
> 30.11.2011 23:07, Miha Marolt kirjoitti:
> > There is no link to the Scipy 0.10 reference guide on the documentation
> > page (http://docs.scipy.org/doc/).
>
> Fixed.
>

Almost :) Now the documentation is available
(http://docs.scipy.org/doc/scipy-0.10.0/reference/), but there is still
no link on the front page (http://docs.scipy.org/doc/).

From pav at iki.fi  Fri Dec  2 14:24:28 2011
From: pav at iki.fi (Pauli Virtanen)
Date: Fri, 02 Dec 2011 20:24:28 +0100
Subject: [SciPy-User] No Scipy 0.10 reference manual
In-Reply-To:
References: <4ED6A935.4040506@gmail.com>
Message-ID:

Hi,

02.12.2011 20:04, Miha Marolt kirjoitti:
[clip]
> Almost :) Now the documentation is available
> (http://docs.scipy.org/doc/scipy-0.10.0/reference/), but there is still
> no link on the front page (http://docs.scipy.org/doc/).

Also that should be there (worksforme), you'll need to force the browser
to refresh :)

From Wolfgang.Mader at fdm.uni-freiburg.de  Fri Dec  2 14:41:42 2011
From: Wolfgang.Mader at fdm.uni-freiburg.de (Wolfgang Mader)
Date: Fri, 02 Dec 2011 14:41:42 -0500
Subject: [SciPy-User] ipython, numpy, matplotlib, scipy
Message-ID: <2285220.Oys9cVK3bF@killbill>

Hello list,

what is the best way to use ipython, numpy, matplotlib, and scipy
together?

I start ipython using

ipython --pylab

in order to get an interactive matplotlib and pull numpy and matplotlib
functions into the workspace. But some functions like fft, ifft, ... are
reimplemented in scipy. I have read that it is proposed to use

from scipy import *

for interactive shells. How does this interfere with --pylab?

Thank you for your advice.
Wolfgang

From josef.pktd at gmail.com  Fri Dec  2 14:58:39 2011
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Fri, 2 Dec 2011 14:58:39 -0500
Subject: [SciPy-User] ipython, numpy, matplotlib, scipy
In-Reply-To: <2285220.Oys9cVK3bF@killbill>
References: <2285220.Oys9cVK3bF@killbill>
Message-ID:

On Fri, Dec 2, 2011 at 2:41 PM, Wolfgang Mader wrote:
> Hello list,
>
> what is the best way to use ipython, numpy, matplotlib, and scipy
> together?
>
> I start ipython using
>
> ipython --pylab
>
> in order to get an interactive matplotlib and pull numpy and matplotlib
> functions into the workspace. But some functions like fft, ifft, ... are
> reimplemented in scipy. I have read that it is proposed to use
>
> from scipy import *
>
> for interactive shells. How does this interfere with --pylab?

Sorry, I don't know the answer, but even in interactive work I would
recommend using namespaces or explicit imports. I find this "matlabism"
of using no namespaces very confusing. (And it makes it harder
converting interactive code to clean scripts.)

(And I'm not a representative scipy user, and it's only the last of
the Zen.)

Josef
"Where did the behavior of this function come from, I thought this was
something else."

> Thank you for your advice.
> Wolfgang

From robert.kern at gmail.com  Fri Dec  2 15:20:23 2011
From: robert.kern at gmail.com (Robert Kern)
Date: Fri, 2 Dec 2011 20:20:23 +0000
Subject: [SciPy-User] ipython, numpy, matplotlib, scipy
In-Reply-To: <2285220.Oys9cVK3bF@killbill>
References: <2285220.Oys9cVK3bF@killbill>
Message-ID:

On Fri, Dec 2, 2011 at 19:41, Wolfgang Mader wrote:
> Hello list,
>
> what is the best way to use ipython, numpy, matplotlib, and scipy
> together?
>
> I start ipython using
>
> ipython --pylab
>
> in order to get an interactive matplotlib and pull numpy and matplotlib
> functions into the workspace. But some functions like fft, ifft, ... are
> reimplemented in scipy. I have read that it is proposed to use
>
> from scipy import *
>
> for interactive shells. How does this interfere with --pylab?

It does nothing that you want. "from scipy import *" does not import
anything from its subpackages. Just pretend that it doesn't exist. It
exists only for backwards compatibility.

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
  -- Umberto Eco
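In concrete terms, the explicit-namespace style Josef and Robert Kern
recommend might look like the sketch below. The example signal is invented
purely for illustration; the point is that scipy's fft lives in the
scipy.fftpack subpackage, which is exactly why "from scipy import *" does
not pull it in.

import numpy as np
import scipy.fftpack
import matplotlib.pyplot as plt

# a made-up test signal; only the import pattern matters here
t = np.linspace(0.0, 1.0, 512)
x = np.sin(2 * np.pi * 5.0 * t)

X = scipy.fftpack.fft(x)   # unambiguous: scipy's fft, not numpy's

plt.plot(t, x)
plt.show()

The same lines work unchanged inside a script, which is the point Josef
makes about converting interactive code.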
From guyer at nist.gov  Fri Dec  2 18:06:44 2011
From: guyer at nist.gov (Jonathan Guyer)
Date: Fri, 2 Dec 2011 18:06:44 -0500
Subject: [SciPy-User] No Scipy 0.10 reference manual
In-Reply-To:
References: <4ED6A935.4040506@gmail.com>
Message-ID: <194EA6C8-B009-4388-9B44-8BB60A691F8E@nist.gov>

On Dec 2, 2011, at 2:24 PM, Pauli Virtanen wrote:

> 02.12.2011 20:04, Miha Marolt kirjoitti:
> [clip]
>> Almost :) Now the documentation is available
>> (http://docs.scipy.org/doc/scipy-0.10.0/reference/), but there is still
>> no link on the front page (http://docs.scipy.org/doc/).
>
> Also that should be there (worksforme), you'll need to force the browser
> to refresh :)

Maybe he missed it because it's up in the middle of all the NumPy
documentation instead of with all the other SciPy docs?

From aronne.merrelli at gmail.com  Fri Dec  2 22:52:07 2011
From: aronne.merrelli at gmail.com (Aronne Merrelli)
Date: Fri, 2 Dec 2011 21:52:07 -0600
Subject: [SciPy-User] Subclassing scipy sparse matrix class
In-Reply-To: <4ED8E31C.3040406@gmail.com>
References: <4ED8E31C.3040406@gmail.com>
Message-ID:

On Fri, Dec 2, 2011 at 8:39 AM, Bruce Southey wrote:

> On 12/01/2011 04:36 PM, Aronne Merrelli wrote:
>
> On Wed, Nov 30, 2011 at 3:01 AM, Per Nielsen wrote:
>
>> Hi all
>>
>> I am trying to create a subclass of the sparse matrix class in scipy,
>> to add some extra methods I need.
>>
>> I have tried to follow the guide on: http://www.scipy.org/Subclasses but
>> without much luck, the view method does not exist for the sparse matrix
>> class. Below is a script I have created.
>
> Hi,
>
> It appears that sparse matrices do not inherit from numpy.ndarray:
>
> Surely you did notice that sparse is part of scipy, not numpy, or even
> the C++ usage when looking at the code? :-)
> As far as I know (which is not much) scipy.sparse is essentially
> self-contained in the scipy/sparse directory. So you are better off just
> working with those files directly.

Well, not exactly - it looks like the actual values and indices defining
the sparse matrix are stored inside numpy ndarrays in separate object
attributes, even though the sparse matrix itself is just a "plain" python
object. So it is not quite "self-contained". I'm sure there are good
reasons for implementing it that way, but it isn't obvious without knowing
those reasons why it couldn't be a direct subclass of ndarray.

Aronne
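A quick look at those attributes makes Aronne's point concrete. This is
only an illustrative sketch; the matrix values are arbitrary:

import numpy as np
from scipy.sparse import csr_matrix

# the payload of a csr_matrix lives in three plain ndarrays, even though
# the matrix object itself does not inherit from numpy.ndarray
m = csr_matrix(np.array([[1, 0], [1, 3]]))
print type(m.data)   # <type 'numpy.ndarray'>
print m.data         # the nonzero values:          [1 1 3]
print m.indices      # column index of each value:  [0 0 1]
print m.indptr       # row start offsets into data: [0 1 3]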
From tloramus at gmail.com  Sat Dec  3 04:38:03 2011
From: tloramus at gmail.com (Miha Marolt)
Date: Sat, 3 Dec 2011 09:38:03 +0000 (UTC)
Subject: [SciPy-User] No Scipy 0.10 reference manual
References: <4ED6A935.4040506@gmail.com>
Message-ID:

Pauli Virtanen <pav at iki.fi> writes:
>
> Also that should be there (worksforme), you'll need to force the browser
> to refresh :)
>

Hehe, yes, I had to clear the history :) It works now.

From stefan at sun.ac.za  Sat Dec  3 17:43:05 2011
From: stefan at sun.ac.za (Stéfan van der Walt)
Date: Sat, 3 Dec 2011 14:43:05 -0800
Subject: [SciPy-User] scikits-image 0.4 release
Message-ID:

Announcement: scikits-image 0.4
===============================

We're happy to announce the 0.4 release of scikits-image, an image
processing toolbox for SciPy.

Please visit our examples gallery to see what we've been up to:

   http://scikits-image.org/docs/0.3/auto_examples/

Note that, in this release, we renamed the module from ``scikits.image``
to ``skimage``, to work around name space conflicts with other scikits
(similarly, the machine learning scikit is now imported as ``sklearn``).

A big shout-out also to everyone currently at SciPy India; have fun, and
remember to join the scikits-image sprint!

This release runs under all major operating systems where Python
(>=2.6 or 3.x), NumPy and SciPy can be installed.

For more information, visit our website

   http://scikits-image.org

New Features
------------

- Module rename from ``scikits.image`` to ``skimage``
- Contour finding
- Grey-level co-occurrence matrices
- Skeletonization and medial axis transform
- Convex hull images
- New test data sets
- GDAL I/O plugin

... as well as some bug fixes.

Contributors to this release
----------------------------

* Andreas Mueller
* Christopher Gohlke
* Emmanuelle Gouillart
* Neil Yager
* Nelle Varoquaux
* Riaan van den Dool
* Stefan van der Walt
* Thouis (Ray) Jones
* Tony S Yu
* Zachary Pincus

From kevin.gullikson at gmail.com  Sat Dec  3 21:53:59 2011
From: kevin.gullikson at gmail.com (Kevin Gullikson)
Date: Sat, 3 Dec 2011 20:53:59 -0600
Subject: [SciPy-User] scipy.optimize.leastsq not varying fit parameters
Message-ID:

I am trying to do a nonlinear least squares fit using
scipy.optimize.leastsq. I have done this several times pretty easily for
fitting some analytical function. However, now I am fitting the parameters
of a numerical model to my data. As far as I can tell, the only thing that
should change is that the fitting function gets much more complicated
(since I have to generate one of these models at every iteration).

My problem is this: leastsq is not varying the parameters, so I keep
getting the exact same model, and eventually it tells me that 'The cosine
of the angle between func(x) and any column of the\n  Jacobian is at most
0.000000 in absolute value'.

I have played with the epsfcn keyword, increasing it to 0.1, and it still
never changes the parameters. What am I doing wrong? I will include some
code, but not all since it is quite long.

The function call:

    fitout = leastsq(ErrorFunction, pars, args=(chips[i], const_pars),
                     full_output=True, epsfcn=0.1)

The Error Function:

def ErrorFunction(pars, chip, const_pars):
    all_pars = const_pars
    all_pars.extend(pars)
    print "Pars: ", pars
    model = FitFunction(chip, all_pars)
    return chip.y - model.y

The Fit Function:

def FitFunction(chip, pars):
    print pars
    temperature = pars[0]
    pressure = pars[1]
    co2 = pars[2]
    co = pars[3]
    o3 = pars[4]
    wavenum_start = pars[5]
    wavenum_end = pars[6]
    ch4 = pars[7]
    humidity = pars[8]
    resolution = pars[9]
    angle = pars[10]

    # Generate the model:
    model = MakeModel.Main(pressure, temperature, humidity, wavenum_start,
                           wavenum_end, angle, co2, o3, ch4, co, chip.x,
                           resolution)
    if "FullSpectrum.freq" in os.listdir(TelluricModelDir):
        cmd = "rm " + TelluricModelDir + "FullSpectrum.freq"
        command = subprocess.check_call(cmd, shell=True)

    return model

Kevin Gullikson

From josef.pktd at gmail.com  Sat Dec  3 22:46:57 2011
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Sat, 3 Dec 2011 22:46:57 -0500
Subject: [SciPy-User] scipy.optimize.leastsq not varying fit parameters
In-Reply-To:
References:
Message-ID:

On Sat, Dec 3, 2011 at 9:53 PM, Kevin Gullikson wrote:
> I am trying to do a nonlinear least squares fit using
> scipy.optimize.leastsq. I have done this several times pretty easily for
> fitting some analytical function. However, now I am fitting the parameters
> of a numerical model to my data. As far as I can tell, the only thing that
> should change is that the fitting function gets much more complicated
> (since I have to generate one of these models at every iteration).
>
> My problem is this: leastsq is not varying the parameters, so I keep
> getting the exact same model, and eventually it tells me that 'The cosine
> of the angle between func(x) and any column of the\n  Jacobian is at most
> 0.000000 in absolute value'.
>
> I have played with the epsfcn keyword, increasing it to 0.1, and it still
> never changes the parameters. What am I doing wrong? I will include some
> code, but not all since it is quite long.
>
> The function call:
>
>     fitout = leastsq(ErrorFunction, pars, args=(chips[i], const_pars),
>                      full_output=True, epsfcn=0.1)
>
> The Error Function:
>
> def ErrorFunction(pars, chip, const_pars):
>     all_pars = const_pars
>     all_pars.extend(pars)

can you check all_pars here? To me this looks like const_pars is
mutable and constantly extended.

Josef

>     print "Pars: ", pars
>     model = FitFunction(chip, all_pars)
>     return chip.y - model.y
>
> The Fit Function:
>
> def FitFunction(chip, pars):
>     print pars
>     temperature = pars[0]
>     pressure = pars[1]
>     co2 = pars[2]
>     co = pars[3]
>     o3 = pars[4]
>     wavenum_start = pars[5]
>     wavenum_end = pars[6]
>     ch4 = pars[7]
>     humidity = pars[8]
>     resolution = pars[9]
>     angle = pars[10]
>
>     # Generate the model:
>     model = MakeModel.Main(pressure, temperature, humidity, wavenum_start,
>                            wavenum_end, angle, co2, o3, ch4, co, chip.x,
>                            resolution)
>     if "FullSpectrum.freq" in os.listdir(TelluricModelDir):
>         cmd = "rm " + TelluricModelDir + "FullSpectrum.freq"
>         command = subprocess.check_call(cmd, shell=True)
>
>     return model
>
> Kevin Gullikson

From cimrman3 at ntc.zcu.cz  Mon Dec  5 06:08:24 2011
From: cimrman3 at ntc.zcu.cz (Robert Cimrman)
Date: Mon, 05 Dec 2011 12:08:24 +0100
Subject: [SciPy-User] ANN: SfePy 2011.4
Message-ID: <4EDCA628.1030407@ntc.zcu.cz>

I am pleased to announce release 2011.4 of SfePy.

Description
-----------

SfePy (simple finite elements in Python) is software for solving systems
of coupled partial differential equations by the finite element method.
The code is based on the NumPy and SciPy packages. It is distributed
under the new BSD license.

Home page: http://sfepy.org
Mailing lists, issue tracking: http://code.google.com/p/sfepy/
Git (source) repository: http://github.com/sfepy
Documentation: http://docs.sfepy.org/doc

Highlights of this release
--------------------------

- cython used instead of swig to interface C code
- many terms unified thanks to new optional material term argument type
- updated Lagrangian formulation for large deformations
- automatic generation of gallery of examples

For more information on this release, see
http://sfepy.googlecode.com/svn/web/releases/2011.4_RELEASE_NOTES.txt
(full release notes, rather long and technical).

Best regards,
Robert Cimrman and Contributors (*)

(*) Contributors to this release (alphabetical order):

Vladimír Lukeš, Matyáš Novák

From gilles.rochefort at gmail.com  Mon Dec  5 09:59:10 2011
From: gilles.rochefort at gmail.com (Gilles Rochefort)
Date: Mon, 5 Dec 2011 15:59:10 +0100
Subject: [SciPy-User] could you explain the line_search method?
In-Reply-To:
References:
Message-ID:

Hi,

The equation you are speaking of is just an approximation of the step
size that occurred in the preceding estimate of the solution.

To make a short answer, I would say that it is just a hint to save some
iterations in the line search process by starting with an initial alpha
which is a good guess from the previous one (twice the previous one) and
not necessarily restarting from 1.0 every time.

This is particularly interesting when your current estimate gets closer
and closer to the optimal solution.

Best regards,
Gilles.

2011/12/2 devdoer bird

> Hi:
>
> Sorry to post here if it's not a proper question.
>
> I'm reading the source code of the line_search function in optimize.py,
> and there's a statement to compute the initial step alpha, listed below:
>
> alpha1 = pymin(1.0, 1.01*2*(phi0-old_old_fval)/derphi0)
>
> I don't know what the physical meaning of
> 1.01*2*(phi0-old_old_fval)/derphi0 is.
>
> Can someone give me an explanation?
>
> Thanks!
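In symbols, the hint works like this: after a step, phi0 is the current
function value and old_old_fval is the value before it, so
phi0 - old_old_fval is (minus) the previous decrease. Modeling the new
line as phi(alpha) ~ phi0 + alpha*derphi0 and asking for the same
decrease again gives alpha = 2*(phi0 - old_old_fval)/derphi0, the
classic initial-step rule from Nocedal and Wright's Numerical
Optimization (sec. 3.5); the 1.01 pads the guess slightly and the
min(1.0, ...) keeps a full Newton-like step available. A small sketch
with made-up numbers:

# derphi0 is the directional derivative at alpha=0 (negative along a
# descent direction); phi0 < old_old_fval after a successful previous
# step, so the ratio below is positive
def initial_alpha(phi0, old_old_fval, derphi0):
    return min(1.0, 1.01 * 2 * (phi0 - old_old_fval) / derphi0)

print initial_alpha(phi0=4.9, old_old_fval=5.0, derphi0=-2.0)  # 0.101
print initial_alpha(phi0=4.0, old_old_fval=5.0, derphi0=-2.0)  # capped at 1.0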
From kevin.gullikson at gmail.com  Sat Dec  3 23:05:19 2011
From: kevin.gullikson at gmail.com (Kevin Gullikson)
Date: Sat, 3 Dec 2011 22:05:19 -0600
Subject: [SciPy-User] scipy.optimize.leastsq not varying fit parameters
In-Reply-To:
References:
Message-ID:

> can you check all_pars here? To me this looks like const_pars is
> mutable and constantly extended.

Ah, yes that was the problem. I just made my Fit Function take both
lists separately and it seems to work now. Thanks for the speedy reply!

Kevin

On Sat, Dec 3, 2011 at 9:46 PM, wrote:

> On Sat, Dec 3, 2011 at 9:53 PM, Kevin Gullikson wrote:
> > I am trying to do a nonlinear least squares fit using
> > scipy.optimize.leastsq. I have done this several times pretty easily
> > for fitting some analytical function. However, now I am fitting the
> > parameters of a numerical model to my data. As far as I can tell, the
> > only thing that should change is that the fitting function gets much
> > more complicated (since I have to generate one of these models at
> > every iteration).
> >
> > My problem is this: leastsq is not varying the parameters, so I keep
> > getting the exact same model, and eventually it tells me that 'The
> > cosine of the angle between func(x) and any column of the\n  Jacobian
> > is at most 0.000000 in absolute value'.
> >
> > I have played with the epsfcn keyword, increasing it to 0.1, and it
> > still never changes the parameters. What am I doing wrong? I will
> > include some code, but not all since it is quite long.
> >
> > The function call:
> >
> >     fitout = leastsq(ErrorFunction, pars, args=(chips[i], const_pars),
> >                      full_output=True, epsfcn=0.1)
> >
> > The Error Function:
> >
> > def ErrorFunction(pars, chip, const_pars):
> >     all_pars = const_pars
> >     all_pars.extend(pars)
>
> can you check all_pars here? To me this looks like const_pars is
> mutable and constantly extended.
>
> Josef
>
> >     print "Pars: ", pars
> >     model = FitFunction(chip, all_pars)
> >     return chip.y - model.y
> >
> > The Fit Function:
> >
> > def FitFunction(chip, pars):
> >     print pars
> >     temperature = pars[0]
> >     pressure = pars[1]
> >     co2 = pars[2]
> >     co = pars[3]
> >     o3 = pars[4]
> >     wavenum_start = pars[5]
> >     wavenum_end = pars[6]
> >     ch4 = pars[7]
> >     humidity = pars[8]
> >     resolution = pars[9]
> >     angle = pars[10]
> >
> >     # Generate the model:
> >     model = MakeModel.Main(pressure, temperature, humidity,
> >                            wavenum_start, wavenum_end, angle, co2, o3,
> >                            ch4, co, chip.x, resolution)
> >     if "FullSpectrum.freq" in os.listdir(TelluricModelDir):
> >         cmd = "rm " + TelluricModelDir + "FullSpectrum.freq"
> >         command = subprocess.check_call(cmd, shell=True)
> >
> >     return model
> >
> > Kevin Gullikson
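The pitfall is easy to reproduce outside of leastsq. A minimal sketch
with invented values, not the thread's actual code:

c1 = [1.0, 2.0]
def bad(pars):
    all_pars = c1            # binds the same list object, no copy
    all_pars.extend(pars)    # mutates c1 as a side effect
    return all_pars

c2 = [1.0, 2.0]
def good(pars):
    return c2 + list(pars)   # fresh list every call; c2 untouched

print bad([3.0])    # [1.0, 2.0, 3.0]
print bad([3.0])    # [1.0, 2.0, 3.0, 3.0] -- the "constant" list grew
print good([3.0])   # [1.0, 2.0, 3.0]
print good([3.0])   # [1.0, 2.0, 3.0] -- stable on every call

Because leastsq calls the objective function many times, the growing
list means the model gets garbage parameters from the second call on.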
From rashed.golam at gmail.com  Mon Dec  5 09:20:21 2011
From: rashed.golam at gmail.com (Md. Golam Rashed)
Date: Mon, 5 Dec 2011 06:20:21 -0800 (PST)
Subject: [SciPy-User] ANN: SfePy 2011.4
In-Reply-To: <4EDCA628.1030407@ntc.zcu.cz>
References: <4EDCA628.1030407@ntc.zcu.cz>
Message-ID: <1347089.271.1323094821654.JavaMail.geo-discussion-forums@yqko12>

59 test file(s) executed in 776.93 s, 0 failure(s) of 100 test(s) - on an
intel atom n570 netbook.

From gustavo.goretkin at gmail.com  Mon Dec  5 21:19:13 2011
From: gustavo.goretkin at gmail.com (Gustavo Goretkin)
Date: Mon, 5 Dec 2011 21:19:13 -0500
Subject: [SciPy-User] where is searchsorted implementation?
Message-ID:

Hi folks,

I see the searchsorted function in numpy/core/fromnumeric.py, but that
function seems to call a method -- and I can't find the implementation
of the method. Where is it, and how can I find stuff like this in
general?

Thanks,
Gustavo

From charlesr.harris at gmail.com  Mon Dec  5 21:32:00 2011
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Mon, 5 Dec 2011 19:32:00 -0700
Subject: [SciPy-User] where is searchsorted implementation?
In-Reply-To:
References:
Message-ID:

On Mon, Dec 5, 2011 at 7:19 PM, Gustavo Goretkin wrote:

> Hi folks,
> I see the searchsorted function in numpy/core/fromnumeric.py, but that
> function seems to call a method -- and I can't find the implementation
> of the method. Where is it, and how can I find stuff like this in
> general?

The implementation is in numpy/core/src/multiarray/item_selection.c, the
method is in numpy/core/src/multiarray/methods.c. Generally, one needs to
start in methods.c and slowly track down the call chain. It can be a long
trek, and grep is your best companion.

Chuck

From charlesr.harris at gmail.com  Mon Dec  5 21:50:23 2011
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Mon, 5 Dec 2011 19:50:23 -0700
Subject: [SciPy-User] where is searchsorted implementation?
In-Reply-To:
References:
Message-ID:

On Mon, Dec 5, 2011 at 7:32 PM, Charles R Harris wrote:

> On Mon, Dec 5, 2011 at 7:19 PM, Gustavo Goretkin wrote:
>
>> Hi folks,
>> I see the searchsorted function in numpy/core/fromnumeric.py, but that
>> function seems to call a method -- and I can't find the implementation
>> of the method. Where is it, and how can I find stuff like this in
>> general?
>
> The implementation is in numpy/core/src/multiarray/item_selection.c, the
> method is in numpy/core/src/multiarray/methods.c. Generally, one needs
> to start in methods.c and slowly track down the call chain. It can be a
> long trek, and grep is your best companion.

Hmm, but it might not be too difficult to put together a function index
using ctags or doxygen.

Chuck

From devdoer2 at gmail.com  Tue Dec  6 01:26:04 2011
From: devdoer2 at gmail.com (devdoer bird)
Date: Tue, 6 Dec 2011 14:26:04 +0800
Subject: [SciPy-User] could you explain the line_search method?
In-Reply-To:
References:
Message-ID:

Thanks.

2011/12/5 Gilles Rochefort

> Hi,
>
> The equation you are speaking of is just an approximation of the step
> size that occurred in the preceding estimate of the solution.
>
> To make a short answer, I would say that it is just a hint to save some
> iterations in the line search process by starting with an initial alpha
> which is a good guess from the previous one (twice the previous one) and
> not necessarily restarting from 1.0 every time.
>
> This is particularly interesting when your current estimate gets closer
> and closer to the optimal solution.
>
> Best regards,
> Gilles.
>
> 2011/12/2 devdoer bird
>
>> Hi:
>>
>> Sorry to post here if it's not a proper question.
>>
>> I'm reading the source code of the line_search function in optimize.py,
>> and there's a statement to compute the initial step alpha, listed below:
>>
>> alpha1 = pymin(1.0, 1.01*2*(phi0-old_old_fval)/derphi0)
>>
>> I don't know what the physical meaning of
>> 1.01*2*(phi0-old_old_fval)/derphi0 is.
>>
>> Can someone give me an explanation?
>>
>> Thanks!

From _kfj at yahoo.com  Wed Dec  7 05:56:29 2011
From: _kfj at yahoo.com (Kay F. Jahnke)
Date: Wed, 7 Dec 2011 10:56:29 +0000 (UTC)
Subject: [SciPy-User] magnifying image patches, ndimage.zoom() shouldn't use boundary conditions
Message-ID:

Hi group!

I want to use ndimage.zoom to magnify small patches of a larger image.
This works just fine, except for the patch boundaries. Here is what
happens: if I have an image img (let's assume it's 1000x1000) and I want
to magnify the patch img[500:600, 500:600], zoom applies boundary
conditions on the margin of the patch, i.e. assumes surrounding pixels
are the same, mirror-imaged, etc. In this special use scenario, though,
the surrounding pixels aren't unknown, and the application of the
boundary condition leads to false results.

I can work around the problem by zooming a larger patch and taking only
the central section of the zoom's result, but this is wasteful.
A way to avoid the problem would be to set up the spline underlying the
zoom operation on a larger patch, but only evaluate it for the desired
output patch area. I can do this using routines from other packages, but
there I am missing a routine to evaluate the spline over a regularly
spaced grid, which I suppose is most efficient - instead I have to
calculate x and y vectors and feed them into the spline evaluation
routine. (I may have overlooked something here; if so, please let me
know.)

I'd wish for a variant of the zoom routine which lets me specify a source
image and a rectangular patch on it to be zoomed. This way, an
appropriately enlarged spline could be used, and boundary conditions
would only be applied when the actual image border is hit. This would be
the prototype:

scipy.ndimage.interpolation.zoom(input, zoom, patch=None, ...)

with patch being a tuple of (min,max) pairs to define the slice of the
input to operate on (the default being None to signify that the whole
image should be zoomed and to keep the call signature compatible), so for
my example 2D image I could call it like

zoom ( img , ((500,600), (500,600)) , ... )

Since I suppose that zooming patches of images is a fairly common
application, I feel this addition would be sensible.

Kay
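For reference, the padded-patch workaround Kay mentions fits in a few
lines. A sketch only -- the pad width and zoom factor are illustrative
choices, and the image is random stand-in data:

import numpy as np
from scipy import ndimage

img = np.random.rand(1000, 1000)

# zoom a patch padded on all sides, then crop the center so the patch
# edges never see the spline's boundary handling
zoomf, pad = 4, 3                      # pad of at least the spline order
patch = img[500 - pad:600 + pad, 500 - pad:600 + pad]
big = ndimage.zoom(patch, zoomf, order=3)
core = big[pad * zoomf:-pad * zoomf, pad * zoomf:-pad * zoomf]
# core is the 100x100 patch magnified to 400x400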
From rlnewman at ucsd.edu  Wed Dec  7 19:44:09 2011
From: rlnewman at ucsd.edu (Rob Newman)
Date: Wed, 7 Dec 2011 16:44:09 -0800
Subject: [SciPy-User] Signal processing 101: creating and applying a bandpass filter
Message-ID: <02A28AB4-3A22-4275-97B1-694BA5F4EA21@ucsd.edu>

Dear SciPy gurus,

I have a list of values that I wish to apply a bandpass filter to.
Looking at the docs
(http://docs.scipy.org/doc/scipy/reference/signal.html), I can see how to
design a filter
(http://docs.scipy.org/doc/scipy/reference/generated/scipy.signal.butter.html),
but I don't see any examples of using your custom designed filter in
practice with lists or arrays. Are there any online resources that show
step-by-step how to build a filter and apply it to a list of values?

For example, here is a Python list:

[1.2711705420245039e-05,
-2.7804792241996774e-05,
3.6106973477276575e-05,
-1.3862942711545279e-05,
-4.5308686748537353e-06,
3.9977066695205231e-06,
-1.4130285261627493e-06,
-3.992578793440835e-07,
8.7310453451921392e-07,
-1.2458364494266352e-06,
1.972281982220939e-06,
-2.5749765923225825e-06,
3.4605357526068371e-06,
-7.4952550588555781e-06]

It is 512 in length.

I want to apply a Butterworth-style filter to this data, using a similar
format to the string below (or translated to how the SciPy bandpass
filtering works):

0.02_5_0.10_5

where a 0.02Hz to 0.10Hz bandpass filter is applied, with the 5's
representing the poles of the filters.

I hope this makes sense. Please let me know if I need to clarify in any
way.

Thanks in advance,
- Rob

________________________________________________________
Rob Newman
Institute of Geophysics and Planetary Physics
Scripps Institution of Oceanography
University of California, San Diego
9500 Gilman Drive, La Jolla, CA 92093-0225, USA

From chris.felton at gmail.com  Thu Dec  8 09:58:08 2011
From: chris.felton at gmail.com (Christopher Felton)
Date: Thu, 08 Dec 2011 08:58:08 -0600
Subject: [SciPy-User] Signal processing 101: creating and applying a bandpass filter
In-Reply-To: <02A28AB4-3A22-4275-97B1-694BA5F4EA21@ucsd.edu>
References: <02A28AB4-3A22-4275-97B1-694BA5F4EA21@ucsd.edu>
Message-ID:

On 12/7/2011 6:44 PM, Rob Newman wrote:
> Dear SciPy gurus,
>
> I have a list of values that I wish to apply a bandpass filter to.
> Looking at the docs
> (http://docs.scipy.org/doc/scipy/reference/signal.html), I can see how
> to design a filter
> (http://docs.scipy.org/doc/scipy/reference/generated/scipy.signal.butter.html),
> but I don't see any examples of using your custom designed filter in
> practice with lists or arrays. Are there any online resources that show
> step-by-step how to build a filter and apply it to a list of values?
>
> For example, here is a Python list:
>
> [1.2711705420245039e-05,
> -2.7804792241996774e-05,
> 3.6106973477276575e-05,
> -1.3862942711545279e-05,
> -4.5308686748537353e-06,
> 3.9977066695205231e-06,
> -1.4130285261627493e-06,
> -3.992578793440835e-07,
> 8.7310453451921392e-07,
> -1.2458364494266352e-06,
> 1.972281982220939e-06,
> -2.5749765923225825e-06,
> 3.4605357526068371e-06,
> -7.4952550588555781e-06]
>
> It is 512 in length.
>
> I want to apply a Butterworth-style filter to this data, using a
> similar format to the string below (or translated to how the SciPy
> bandpass filtering works):
>
> 0.02_5_0.10_5
>
> where a 0.02Hz to 0.10Hz bandpass filter is applied, with the 5's
> representing the poles of the filters.
>
> I hope this makes sense. Please let me know if I need to clarify in
> any way.
>
> Thanks in advance,
> - Rob

You can use lfilter; there is an example on the cookbook page,
http://www.scipy.org/Cookbook/FIRFilter. This cookbook page has a good
review of the additional methods that can be used and which are faster
in particular cases: http://www.scipy.org/Cookbook/ApplyFIRFilter

Regards,
Chris
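Putting Chris's pointer together with Rob's 0.02_5_0.10_5 spec, a minimal
sketch might look like the following. The sample rate fs is an assumption
(it is not stated in the post), and reading the 5's as the Butterworth
order is one interpretation of that string; the corner frequencies must
be normalized by the Nyquist frequency before calling butter():

import numpy as np
from scipy.signal import butter, lfilter

fs = 1.0                       # assumed sample rate in Hz
nyq = 0.5 * fs

# 5th-order Butterworth bandpass, 0.02-0.10 Hz
b, a = butter(5, [0.02 / nyq, 0.10 / nyq], btype='band')

data = np.random.randn(512)    # stand-in for the 512-sample list
filtered = lfilter(b, a, data)

If phase distortion matters, scipy.signal.filtfilt(b, a, data) applies
the same filter forward and backward for a zero-phase result.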
From blackvladimir at gmail.com  Thu Dec  8 15:23:05 2011
From: blackvladimir at gmail.com (Vladimír Černý)
Date: Thu, 08 Dec 2011 21:23:05 +0100
Subject: [SciPy-User] RuntimeError: cannot unmarshal code objects in restricted execution mode
Message-ID:

Hi,

when I import scipy in restricted mode (eval function, with changed
builtins) it ends with an exception:

RuntimeError: cannot unmarshal code objects in restricted execution mode.

On the second try it works without an exception.

Code to reproduce the error:

myb = {}
for b in dir(__builtins__):
    myb[b] = getattr(__builtins__, b)
gl = dict()
gl['__builtins__'] = myb
c = compile('import scipy', 'file', 'exec')
eval(c, gl)

This works, but I don't know why:

myb = {}
for b in dir(__builtins__):
    myb[b] = getattr(__builtins__, b)
gl = dict()
gl['__builtins__'] = myb
c = compile('import scipy', 'file', 'exec')
try:
    eval(c, gl)
except RuntimeError:
    eval(c, gl)

How can I do it without the second try?

Vladimir

From robert.kern at gmail.com  Thu Dec  8 15:36:29 2011
From: robert.kern at gmail.com (Robert Kern)
Date: Thu, 8 Dec 2011 20:36:29 +0000
Subject: [SciPy-User] RuntimeError: cannot unmarshal code objects in restricted execution mode
In-Reply-To:
References:
Message-ID:

2011/12/8 Vladimír Černý:
> Hi,
>
> when I import scipy in restricted mode (eval function, with changed
> builtins) it ends with an exception:
>
> RuntimeError: cannot unmarshal code objects in restricted execution mode.
>
> On the second try it works without an exception.

The reason it "works" the second time is that the scipy module has
already been added to sys.modules, so it does not try to import
anything again. You will see that sys.modules['scipy'] exists after
the first failing eval(). However, I'm certain that the RuntimeError
prevented a number of things from being initialized, so I doubt that
it is fully functional. The second import does not really succeed;
it's a no-op.

> Code to reproduce the error:
>
> myb = {}
> for b in dir(__builtins__):
>     myb[b] = getattr(__builtins__, b)
> gl = dict()
> gl['__builtins__'] = myb
> c = compile('import scipy', 'file', 'exec')
> eval(c, gl)

By the way, you should be using "exec c in gl" here since the code
contains a statement, not an expression.

I'm afraid that I don't have any insight into the ultimate cause of
the error or how to fix it.

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
  -- Umberto Eco

From blackvladimir at gmail.com  Thu Dec  8 16:06:56 2011
From: blackvladimir at gmail.com (Vladimír Černý)
Date: Thu, 08 Dec 2011 22:06:56 +0100
Subject: [SciPy-User] RuntimeError: cannot unmarshal code objects in restricted execution mode
In-Reply-To:
References:
Message-ID:

Thanks.

The code which fails can look like this:

myb = {}
for b in dir(__builtins__):
    myb[b] = getattr(__builtins__, b)
gl = dict()
gl['__builtins__'] = myb
exec('import scipy', gl)

On Thu, 08 Dec 2011 21:36:29 +0100, Robert Kern wrote:

> 2011/12/8 Vladimír Černý:
>> Hi,
>>
>> when I import scipy in restricted mode (eval function, with changed
>> builtins) it ends with an exception:
>>
>> RuntimeError: cannot unmarshal code objects in restricted execution
>> mode.
>>
>> On the second try it works without an exception.
>
> The reason it "works" the second time is that the scipy module has
> already been added to sys.modules, so it does not try to import
> anything again. You will see that sys.modules['scipy'] exists after
> the first failing eval(). However, I'm certain that the RuntimeError
> prevented a number of things from being initialized, so I doubt that
> it is fully functional. The second import does not really succeed;
> it's a no-op.
>
>> Code to reproduce the error:
>>
>> myb = {}
>> for b in dir(__builtins__):
>>     myb[b] = getattr(__builtins__, b)
>> gl = dict()
>> gl['__builtins__'] = myb
>> c = compile('import scipy', 'file', 'exec')
>> eval(c, gl)
>
> By the way, you should be using "exec c in gl" here since the code
> contains a statement, not an expression.
>
> I'm afraid that I don't have any insight into the ultimate cause of
> the error or how to fix it.

From stefan at sun.ac.za  Thu Dec  8 19:10:10 2011
From: stefan at sun.ac.za (Stéfan van der Walt)
Date: Thu, 8 Dec 2011 16:10:10 -0800
Subject: [SciPy-User] scikits-image 0.4 release
In-Reply-To:
References:
Message-ID:

2011/12/3 Stéfan van der Walt:
> Announcement: scikits-image 0.4
> ===============================
>
> We're happy to announce the 0.4 release of scikits-image, an image
> processing toolbox for SciPy.
>
> Please visit our examples gallery to see what we've been up to:
>
>    http://scikits-image.org/docs/0.3/auto_examples/

Thanks to Michael Aye for pointing out the typo here. That should be

http://scikits-image.org/docs/0.4/auto_examples/

(more examples!)

Cheers
Stéfan

From josef.pktd at gmail.com  Fri Dec  9 01:53:36 2011
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Fri, 9 Dec 2011 01:53:36 -0500
Subject: [SciPy-User] an exercise in spline basis functions
Message-ID:

Trying to understand spline basis functions, I always wanted to have
something simple to play with (that is not hidden in C or Fortran code
behind a lot of numerical sophistication).

Here is a very simple example, coded straight from a beginner's
explanation. And it even works.
https://picasaweb.google.com/106983885143680349926/Joepy#5684006069657083426
https://picasaweb.google.com/106983885143680349926/Joepy#5684006072774057874

Motivation:
If we have the basis functions directly, then we can just treat them
like regular regressors, e.g. for robust fitting, have more control
and information over variable selection than using scipy's splines, or
include them at the same time as other regressors.
(I'm thinking mainly of noisy data with a small number of breaks/knots.)
(And because I was looking at what's left of the old stats.models
spline code, where most of it got removed because it was crashing C
code.)

This was mainly to see if it works.
Is there better code to get the spline basis functions (and maybe the
derivatives, ...) available somewhere?

Josef
I might not understand it, but it works -- maybe.

-------------- next part --------------
A non-text attachment was scrubbed...
Name: try_spline_basis_1.py
Type: text/x-python
Size: 2662 bytes
Desc: not available

From Troy_Mullins at nps.gov  Fri Dec  9 09:07:33 2011
From: Troy_Mullins at nps.gov (Troy)
Date: Fri, 9 Dec 2011 14:07:33 +0000 (UTC)
Subject: [SciPy-User] Coding Distribution in Python
Message-ID:

Can anyone assist in how to code this Gaussian normal in Python?

I have m, std (standard deviation) and y ...

EQUATION 1:

f(x) = 1/sqrt(2*pi*std**2) * exp(-(x - m)**2 / (2*std**2))

Then take the output and put it into:

EQUATION 2:

P = F(y) = integral from 0 to y of f(x) dx

I can send an image of the equations to anyone if this does not make
sense.

From jsseabold at gmail.com  Fri Dec  9 09:24:15 2011
From: jsseabold at gmail.com (Skipper Seabold)
Date: Fri, 9 Dec 2011 09:24:15 -0500
Subject: [SciPy-User] Coding Distribution in Python
In-Reply-To:
References:
Message-ID:

On Fri, Dec 9, 2011 at 9:07 AM, Troy wrote:
> Can anyone assist in how to code this Gaussian normal in Python?
>
> I have m, std (standard deviation) and y ...
>
> EQUATION 1:
>
> f(x) = 1/sqrt(2*pi*std**2) * exp(-(x - m)**2 / (2*std**2))
>
> Then take the output and put it into:
>
> EQUATION 2:
>
> P = F(y) = integral from 0 to y of f(x) dx
>
> I can send an image of the equations to anyone if this does not make
> sense.

You just want the normal pdf and cdf?

http://docs.scipy.org/scipy/docs/scipy-docs/tutorial/stats.rst/#stats
http://docs.scipy.org/scipy/docs/scipy-docs/tutorial/stats/continuous.rst/

from scipy import stats

m = 0
std = 1

norm_dist = stats.norm(loc=m, scale=std)
f = norm_dist.pdf
F = norm_dist.cdf

print f(1)
print F(1) - F(0)

hth,
Skipper

From josef.pktd at gmail.com  Fri Dec  9 09:24:59 2011
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Fri, 9 Dec 2011 09:24:59 -0500
Subject: [SciPy-User] Coding Distribution in Python
In-Reply-To:
References:
Message-ID:

On Fri, Dec 9, 2011 at 9:07 AM, Troy wrote:
> Can anyone assist in how to code this Gaussian normal in Python?
>
> I have m, std (standard deviation) and y ...
>
> EQUATION 1:
>
> f(x) = 1/sqrt(2*pi*std**2) * exp(-(x - m)**2 / (2*std**2))

from scipy import stats
stats.norm.pdf(x, loc=m, scale=std)

When I need the formula for a distribution directly in numpy/python, I
usually just copy it from the scipy.stats.distributions.py file.

> Then take the output and put it into:
>
> EQUATION 2:
>
> P = F(y) = integral from 0 to y of f(x) dx

stats.norm.cdf(y, loc=m, scale=std)

(or the raw function from scipy.special)

Josef

> I can send an image of the equations to anyone if this does not make
> sense.
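As a sanity check, equation 2 can also be evaluated by numerical
integration, which should agree with the cdf difference Skipper computes.
A sketch with assumed values m=0, std=1, y=1:

from scipy import stats
from scipy.integrate import quad

m, std, y = 0.0, 1.0, 1.0
f = stats.norm(loc=m, scale=std).pdf

# integrate the pdf from 0 to y; this equals F(y) - F(0)
P, abserr = quad(f, 0, y)
print P                                   # ~0.34134
print stats.norm.cdf(y, m, std) - stats.norm.cdf(0, m, std)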
From charlesr.harris at gmail.com  Fri Dec  9 12:00:56 2011
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Fri, 9 Dec 2011 10:00:56 -0700
Subject: [SciPy-User] an exercise in spline basis functions
In-Reply-To:
References:
Message-ID:

On Thu, Dec 8, 2011 at 11:53 PM, wrote:

> Trying to understand spline basis functions, I always wanted to have
> something simple to play with (that is not hidden in C or Fortran code
> behind a lot of numerical sophistication).
>
> Here is a very simple example, coded straight from a beginner's
> explanation. And it even works.
>
> https://picasaweb.google.com/106983885143680349926/Joepy#5684006069657083426
> https://picasaweb.google.com/106983885143680349926/Joepy#5684006072774057874
>
> Motivation:
> If we have the basis functions directly, then we can just treat them
> like regular regressors, e.g. for robust fitting, have more control
> and information over variable selection than using scipy's splines, or
> include them at the same time as other regressors.
> (I'm thinking mainly of noisy data with a small number of breaks/knots.)
> (And because I was looking at what's left of the old stats.models
> spline code, where most of it got removed because it was crashing C
> code.)
>
> This was mainly to see if it works.
> Is there better code to get the spline basis functions (and maybe the
> derivatives, ...) available somewhere?

Looks like you are using uniform b-splines. You might want to look at the
non-uniform variety also. If you want a complete set, both require extra
knot points outside the interior of the domain, but for the non-uniform
variety the added knot points are just repeats of the end points.

Chuck

From josef.pktd at gmail.com  Fri Dec  9 12:53:45 2011
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Fri, 9 Dec 2011 12:53:45 -0500
Subject: [SciPy-User] an exercise in spline basis functions
In-Reply-To:
References:
Message-ID:

On Fri, Dec 9, 2011 at 12:00 PM, Charles R Harris wrote:
>
> On Thu, Dec 8, 2011 at 11:53 PM, wrote:
>>
>> Trying to understand spline basis functions, I always wanted to have
>> something simple to play with (that is not hidden in C or Fortran code
>> behind a lot of numerical sophistication).
>>
>> Here is a very simple example, coded straight from a beginner's
>> explanation. And it even works.
>>
>> https://picasaweb.google.com/106983885143680349926/Joepy#5684006069657083426
>> https://picasaweb.google.com/106983885143680349926/Joepy#5684006072774057874
>>
>> Motivation:
>> If we have the basis functions directly, then we can just treat them
>> like regular regressors, e.g. for robust fitting, have more control
>> and information over variable selection than using scipy's splines, or
>> include them at the same time as other regressors.
>> (I'm thinking mainly of noisy data with a small number of breaks/knots.)
>> (And because I was looking at what's left of the old stats.models
>> spline code, where most of it got removed because it was crashing C
>> code.)
>>
>> This was mainly to see if it works.
>> Is there better code to get the spline basis functions (and maybe the
From josef.pktd at gmail.com  Fri Dec  9 12:53:45 2011
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Fri, 9 Dec 2011 12:53:45 -0500
Subject: [SciPy-User] an exercise in spline basis functions
In-Reply-To:
References:
Message-ID:

On Fri, Dec 9, 2011 at 12:00 PM, Charles R Harris
wrote:
>
> On Thu, Dec 8, 2011 at 11:53 PM, wrote:
>>
>> Trying to understand spline basis functions, I always wanted to have
>> something simple to play with (that is not hidden in C or Fortran code
>> behind a lot of numerical sophistication).
>>
>> Here is a very simple example, coded straight from a beginner's
>> explanation. And it even works.
>>
>> https://picasaweb.google.com/106983885143680349926/Joepy#5684006069657083426
>> https://picasaweb.google.com/106983885143680349926/Joepy#5684006072774057874
>>
>> Motivation
>> If we have the basis functions directly, then we can just treat them
>> like regular regressors, e.g. for robust fitting, have more control
>> and information over variable selection than using scipy's splines, or
>> include them at the same time as other regressors.
>> (I'm thinking mainly of noisy data with a small number of breaks/knots.)
>> (and because I was looking at what's left of the old stats.models
>> spline code, where most of it got removed because it was crashing C
>> code.)
>>
>> This was mainly to see if it works.
>> Is there better code to get the spline basis functions (and maybe the
>> derivatives, ...) available somewhere?
>>
>
> Looks like you are using uniform b-splines. You might want to look at the
> non-uniform variety also. If you want a complete set both require extra knot
> points outside the interior of the domain, but for the non-uniform variety
> the added knot points are just repeats of the end points.

Uniform (integer) knots were the easiest way to get started. I will extend
it to non-uniform knots, but I wasn't sure what the support of each
basis function is. (But I just found it in
http://www.cs.mtu.edu/~shene/COURSES/cs3621/NOTES/spline/B-spline/bspline-basis.html )

Also the recursive function calls repeat some calculations and won't
be memory friendly.

I was hoping someone has already done a more general version in Python.

(BTW: I figured out how your multivariate Bernstein polynomials work
but didn't find an application for it yet.)

Thanks,

Josef

>
> Chuck
>
>
> _______________________________________________
> SciPy-User mailing list
> SciPy-User at scipy.org
> http://mail.scipy.org/mailman/listinfo/scipy-user
>

From charlesr.harris at gmail.com  Fri Dec  9 13:19:27 2011
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Fri, 9 Dec 2011 11:19:27 -0700
Subject: [SciPy-User] an exercise in spline basis functions
In-Reply-To:
References:
Message-ID:

On Fri, Dec 9, 2011 at 10:53 AM, wrote:

> On Fri, Dec 9, 2011 at 12:00 PM, Charles R Harris
> wrote:
> >
> > On Thu, Dec 8, 2011 at 11:53 PM, wrote:
> >>
> >> Trying to understand spline basis functions, I always wanted to have
> >> something simple to play with (that is not hidden in C or Fortran code
> >> behind a lot of numerical sophistication).
> >>
> >> Here is a very simple example, coded straight from a beginner's
> >> explanation. And it even works.
> >>
> >> https://picasaweb.google.com/106983885143680349926/Joepy#5684006069657083426
> >> https://picasaweb.google.com/106983885143680349926/Joepy#5684006072774057874
> >>
> >> Motivation
> >> If we have the basis functions directly, then we can just treat them
> >> like regular regressors, e.g. for robust fitting, have more control
> >> and information over variable selection than using scipy's splines, or
> >> include them at the same time as other regressors.
> >> (I'm thinking mainly of noisy data with a small number of breaks/knots.)
> >> (and because I was looking at what's left of the old stats.models
> >> spline code, where most of it got removed because it was crashing C
> >> code.)
> >>
> >> This was mainly to see if it works.
> >> Is there better code to get the spline basis functions (and maybe the
> >> derivatives, ...) available somewhere?
> >>
> >
> > Looks like you are using uniform b-splines. You might want to look at the
> > non-uniform variety also. If you want a complete set both require extra
> > knot points outside the interior of the domain, but for the non-uniform
> > variety the added knot points are just repeats of the end points.
>
> uniform (integer) knots was the easiest to get started. I will extend
> it to non-uniform knots, but I wasn't sure what the support of each
> basis function is. (But I just found it in
> http://www.cs.mtu.edu/~shene/COURSES/cs3621/NOTES/spline/B-spline/bspline-basis.html )
>
> Also the recursive function calls repeat some calculations and won't
> be memory friendly.
>
> I was hoping someone already has done a more general version in Python.
>
> (BTW: I figured out how your multivariate Bernstein polynomials work
> but didn't find an application for it yet.)
> > I'm going to extend that capability to the polynomials in numpy. Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Fri Dec 9 13:32:39 2011 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Fri, 9 Dec 2011 13:32:39 -0500 Subject: [SciPy-User] an exercise in spline basis functions In-Reply-To: References: Message-ID: On Fri, Dec 9, 2011 at 1:19 PM, Charles R Harris wrote: > > > On Fri, Dec 9, 2011 at 10:53 AM, wrote: >> >> On Fri, Dec 9, 2011 at 12:00 PM, Charles R Harris >> wrote: >> > >> > >> > On Thu, Dec 8, 2011 at 11:53 PM, wrote: >> >> >> >> Trying to understand spline basis functions, I always wanted to have >> >> something simple to play with (that is not hidden in C or Fortran code >> >> behind a lot of numerical sophistication). >> >> >> >> Here is a very simple example, coded straight from a beginner's >> >> explanation. And it even works. >> >> >> >> >> >> >> >> https://picasaweb.google.com/106983885143680349926/Joepy#5684006069657083426 >> >> >> >> >> >> https://picasaweb.google.com/106983885143680349926/Joepy#5684006072774057874 >> >> >> >> Motivation >> >> If we have the basis functions directly, then we can just treat them >> >> like regular regressors, e.g. for robust fitting, have more control >> >> and information over variable selection than using scipy's splines, or >> >> include them at the same time as other regressors. >> >> (I'm thinking mainly of noisy data with a small number of >> >> breaks/knots.) >> >> (and because I was looking at what's left of the old stats.models >> >> spline code, where most of it got removed because it was crashing C >> >> code.) >> >> >> >> This was mainly to see if it works. >> >> Is there better code to get the spline basis functions (and maybe the >> >> derivatives, ...) available somewhere? >> >> >> > >> > Looks like you are using uniform b-splines. You might want to look at >> > the >> > non-uniform variety also. If you want a complete set both require extra >> > knot >> > points outside the interior of the domain, but for the non-uniform >> > variety >> > the added knot points are just repeats of the end points. >> >> uniform (integer) knots was the easiest to get started. I will extend >> it to non-uniform knots, but I wasn't sure what the support of each >> basis function is. (But I just found it in >> >> http://www.cs.mtu.edu/~shene/COURSES/cs3621/NOTES/spline/B-spline/bspline-basis.html >> ) based on this the non-uniform case was actually easy https://picasaweb.google.com/106983885143680349926/Joepy#5684197321179952226 Josef >> >> Also the recursive function calls repeat some calculations and won't >> be memory friendly. >> >> I was hoping someone already has done a more general version in Python. >> >> (BTW: I figured out how your multivariate Bernstein polynomials work >> but didn't find an application for it yet.) >> > > I'm going to extend that capability to the polynomials in numpy. 
> > Chuck > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From josef.pktd at gmail.com Fri Dec 9 20:05:46 2011 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Fri, 9 Dec 2011 20:05:46 -0500 Subject: [SciPy-User] an exercise in spline basis functions In-Reply-To: References: Message-ID: On Fri, Dec 9, 2011 at 1:32 PM, wrote: > On Fri, Dec 9, 2011 at 1:19 PM, Charles R Harris > wrote: >> >> >> On Fri, Dec 9, 2011 at 10:53 AM, wrote: >>> >>> On Fri, Dec 9, 2011 at 12:00 PM, Charles R Harris >>> wrote: >>> > >>> > >>> > On Thu, Dec 8, 2011 at 11:53 PM, wrote: >>> >> >>> >> Trying to understand spline basis functions, I always wanted to have >>> >> something simple to play with (that is not hidden in C or Fortran code >>> >> behind a lot of numerical sophistication). >>> >> >>> >> Here is a very simple example, coded straight from a beginner's >>> >> explanation. And it even works. >>> >> >>> >> >>> >> >>> >> https://picasaweb.google.com/106983885143680349926/Joepy#5684006069657083426 >>> >> >>> >> >>> >> https://picasaweb.google.com/106983885143680349926/Joepy#5684006072774057874 >>> >> >>> >> Motivation >>> >> If we have the basis functions directly, then we can just treat them >>> >> like regular regressors, e.g. for robust fitting, have more control >>> >> and information over variable selection than using scipy's splines, or >>> >> include them at the same time as other regressors. >>> >> (I'm thinking mainly of noisy data with a small number of >>> >> breaks/knots.) >>> >> (and because I was looking at what's left of the old stats.models >>> >> spline code, where most of it got removed because it was crashing C >>> >> code.) >>> >> >>> >> This was mainly to see if it works. >>> >> Is there better code to get the spline basis functions (and maybe the >>> >> derivatives, ...) available somewhere? >>> >> >>> > >>> > Looks like you are using uniform b-splines. You might want to look at >>> > the >>> > non-uniform variety also. If you want a complete set both require extra >>> > knot >>> > points outside the interior of the domain, but for the non-uniform >>> > variety >>> > the added knot points are just repeats of the end points. >>> >>> uniform (integer) knots was the easiest to get started. I will extend >>> it to non-uniform knots, but I wasn't sure what the support of each >>> basis function is. (But I just found it in >>> >>> http://www.cs.mtu.edu/~shene/COURSES/cs3621/NOTES/spline/B-spline/bspline-basis.html >>> ) > based on this the non-uniform case was actually easy > https://picasaweb.google.com/106983885143680349926/Joepy#5684197321179952226 one last plot (prototype seems to work) robust least squares spline fitting (with outliers in data) https://picasaweb.google.com/106983885143680349926/Joepy#5684292228451302850 Josef > > Josef >>> >>> Also the recursive function calls repeat some calculations and won't >>> be memory friendly. >>> >>> I was hoping someone already has done a more general version in Python. >>> >>> (BTW: I figured out how your multivariate Bernstein polynomials work >>> but didn't find an application for it yet.) >>> >> >> I'm going to extend that capability to the polynomials in numpy. 
>> >> Chuck
>> >>
>> _______________________________________________
>> SciPy-User mailing list
>> SciPy-User at scipy.org
>> http://mail.scipy.org/mailman/listinfo/scipy-user
>>

From ashashiwa at gmail.com  Sun Dec 11 07:15:51 2011
From: ashashiwa at gmail.com (Jean-Baptiste BUTET)
Date: Sun, 11 Dec 2011 13:15:51 +0100
Subject: [SciPy-User] noob Needs guru for array manipulation
Message-ID:

Hi all :)

I have this array :
[[  0.00000000e+00 ,  2.26757370e-05 ,  4.53514739e-05 ,
    2.27548753e+00 ,  2.27551020e+00 ,  2.27553288e+00]
 [ -1.38700000e+03 , -1.51300000e+03 , -1.52600000e+03 ,
    3.39000000e+02 ,  1.11000000e+02 ,  0.00000000e+00]]

I would like to remove couples where the result in the 2nd row is under 120 :

[[  0.00000000e+00 ,  2.26757370e-05 ,  4.53514739e-05 ,
    2.27548753e+00 ,  2.27553288e+00]
 [ -1.38700000e+03 , -1.51300000e+03 , -1.52600000e+03 ,
    3.39000000e+02 ,  0.00000000e+00]]

I don't really understand the numpy array philosophy... so I need help here :)

Thanks.

JB

From emmanuelle.gouillart at normalesup.org  Sun Dec 11 07:31:37 2011
From: emmanuelle.gouillart at normalesup.org (Emmanuelle Gouillart)
Date: Sun, 11 Dec 2011 13:31:37 +0100
Subject: [SciPy-User] noob Needs guru for array manipulation
In-Reply-To:
References:
Message-ID:

Hi,

I'm not sure I understand your example (under 120 in algebraic or
absolute value? Why don't you remove the last column which has 0 on the
second line?), but the syntax would be

>>> import numpy as np
>>> a = np.array([[1, 2, 3], [160, 30, 125]])
>>> b = a[:, a[1] > 120]
>>> b
array([[  1,   3],
       [160, 125]])

b = a[:, a[1] > 120] is called fancy indexing. For an introduction to
numpy arrays, see for example
http://scipy-lectures.github.com/intro/numpy/numpy.html

Cheers,
Emmanuelle

2011/12/11 Jean-Baptiste BUTET
> Hi all :)
>
> I have this array :
> [[  0.00000000e+00 ,  2.26757370e-05 ,  4.53514739e-05 ,
>     2.27548753e+00 ,  2.27551020e+00 ,  2.27553288e+00]
>  [ -1.38700000e+03 , -1.51300000e+03 , -1.52600000e+03 ,
>     3.39000000e+02 ,  1.11000000e+02 ,  0.00000000e+00]]
>
> I would like to remove couples where the result in the 2nd row is under 120 :
>
> [[  0.00000000e+00 ,  2.26757370e-05 ,  4.53514739e-05 ,
>     2.27548753e+00 ,  2.27553288e+00]
>  [ -1.38700000e+03 , -1.51300000e+03 , -1.52600000e+03 ,
>     3.39000000e+02 ,  0.00000000e+00]]
>
> I don't really understand the numpy array philosophy... so I need help here :)
>
> Thanks.
>
> JB
>
> _______________________________________________
> SciPy-User mailing list
> SciPy-User at scipy.org
> http://mail.scipy.org/mailman/listinfo/scipy-user
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
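Applied to the two-row array from the question itself -- a small sketch
covering both readings of "under 120" (algebraic and absolute value);
neither reproduces the sample output in the question exactly, which is
why the question is ambiguous:

import numpy as np

a = np.array([[ 0.00000000e+00,  2.26757370e-05,  4.53514739e-05,
                2.27548753e+00,  2.27551020e+00,  2.27553288e+00],
              [-1.38700000e+03, -1.51300000e+03, -1.52600000e+03,
                3.39000000e+02,  1.11000000e+02,  0.00000000e+00]])

b = a[:, a[1] >= 120]          # keep columns whose 2nd-row value is >= 120
c = a[:, np.abs(a[1]) >= 120]  # "under 120" read in absolute value
print b                        # only the 3.39e+02 column survives
print c                        # drops the 1.11e+02 and the 0.0 columns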
From josef.pktd at gmail.com  Sun Dec 11 07:35:24 2011
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Sun, 11 Dec 2011 07:35:24 -0500
Subject: [SciPy-User] noob Needs guru for array manipulation
In-Reply-To:
References:
Message-ID:

On Sun, Dec 11, 2011 at 7:31 AM, Emmanuelle Gouillart wrote:
> Hi,
>
> I'm not sure I understand your example (under 120 in algebraic or
> absolute value? Why don't you remove the last column which has 0 on the
> second line?), but the syntax would be
>
>>>> import numpy as np
>>>> a = np.array([[1, 2, 3], [160, 30, 125]])
>>>> b = a[:, a[1] > 120]
>>>> b
> array([[  1,   3],
>        [160, 125]])
>
> b = a[:, a[1] > 120] is called fancy indexing. For an introduction to
> numpy arrays, see for example
> http://scipy-lectures.github.com/intro/numpy/numpy.html

and

http://docs.scipy.org/doc/numpy/reference/arrays.indexing.html
http://mentat.za.net/numpy/intro/intro.html

Josef

>
> Cheers,
> Emmanuelle
>
> 2011/12/11 Jean-Baptiste BUTET
>>
>> Hi all :)
>>
>> I have this array :
>> [[  0.00000000e+00 ,  2.26757370e-05 ,  4.53514739e-05 ,
>>     2.27548753e+00 ,  2.27551020e+00 ,  2.27553288e+00]
>>  [ -1.38700000e+03 , -1.51300000e+03 , -1.52600000e+03 ,
>>     3.39000000e+02 ,  1.11000000e+02 ,  0.00000000e+00]]
>>
>> I would like to remove couples where the result in the 2nd row is under 120 :
>>
>> [[  0.00000000e+00 ,  2.26757370e-05 ,  4.53514739e-05 ,
>>     2.27548753e+00 ,  2.27553288e+00]
>>  [ -1.38700000e+03 , -1.51300000e+03 , -1.52600000e+03 ,
>>     3.39000000e+02 ,  0.00000000e+00]]
>>
>> I don't really understand the numpy array philosophy... so I need help here :)
>>
>> Thanks.
>>
>> JB
>>
>> _______________________________________________
>> SciPy-User mailing list
>> SciPy-User at scipy.org
>> http://mail.scipy.org/mailman/listinfo/scipy-user
>
> _______________________________________________
> SciPy-User mailing list
> SciPy-User at scipy.org
> http://mail.scipy.org/mailman/listinfo/scipy-user

From ashashiwa at gmail.com  Sun Dec 11 07:47:33 2011
From: ashashiwa at gmail.com (Jean-Baptiste BUTET)
Date: Sun, 11 Dec 2011 13:47:33 +0100
Subject: [SciPy-User] noob Needs guru for array manipulation
In-Reply-To:
References:
Message-ID:

thanks Emmanuelle ... this is exactly that.

JB

2011/12/11
> On Sun, Dec 11, 2011 at 7:31 AM, Emmanuelle Gouillart
> wrote:
> > Hi,
> >
> > I'm not sure I understand your example (under 120 in algebraic or
> > absolute value? Why don't you remove the last column which has 0 on the
> > second line?), but the syntax would be
> >
> >>>> import numpy as np
> >>>> a = np.array([[1, 2, 3], [160, 30, 125]])
> >>>> b = a[:, a[1] > 120]
> >>>> b
> > array([[  1,   3],
> >        [160, 125]])
> >
> > b = a[:, a[1] > 120] is called fancy indexing. For an introduction to
> > numpy arrays, see for example
> > http://scipy-lectures.github.com/intro/numpy/numpy.html
>
> and
>
> http://docs.scipy.org/doc/numpy/reference/arrays.indexing.html
> http://mentat.za.net/numpy/intro/intro.html
>
> Josef
>
> > Cheers,
> > Emmanuelle
> >
> > 2011/12/11 Jean-Baptiste BUTET
> >> Hi all :)
> >>
> >> I have this array :
> >> [[  0.00000000e+00 ,  2.26757370e-05 ,  4.53514739e-05 ,
> >>     2.27548753e+00 ,  2.27551020e+00 ,  2.27553288e+00]
> >>  [ -1.38700000e+03 , -1.51300000e+03 , -1.52600000e+03 ,
> >>     3.39000000e+02 ,  1.11000000e+02 ,  0.00000000e+00]]
> >>
> >> I would like to remove couples where the result in the 2nd row is under 120 :
> >>
> >> [[  0.00000000e+00 ,  2.26757370e-05 ,  4.53514739e-05 ,
> >>     2.27548753e+00 ,  2.27553288e+00]
> >>  [ -1.38700000e+03 , -1.51300000e+03 , -1.52600000e+03 ,
> >>     3.39000000e+02 ,  0.00000000e+00]]
> >>
> >> I don't really understand the numpy array philosophy... so I need help here :)
> >>
> >> Thanks.
> >>
> >> JB
> >>
> >> _______________________________________________
> >> SciPy-User mailing list
> >> SciPy-User at scipy.org
> >> http://mail.scipy.org/mailman/listinfo/scipy-user
> >
> > _______________________________________________
> > SciPy-User mailing list
> > SciPy-User at scipy.org
> > http://mail.scipy.org/mailman/listinfo/scipy-user
>
> _______________________________________________
> SciPy-User mailing list
> SciPy-User at scipy.org
> http://mail.scipy.org/mailman/listinfo/scipy-user
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From Dharhas.Pothina at twdb.state.tx.us  Mon Dec 12 13:59:43 2011
From: Dharhas.Pothina at twdb.state.tx.us (Dharhas Pothina)
Date: Mon, 12 Dec 2011 12:59:43 -0600
Subject: [SciPy-User] Is there a way to append points to a scipy.spatial.kdtree?
Message-ID: <4EE5FABF0200009B000419CE@GWWEB.twdb.state.tx.us>

Hi All,

Wasn't able to figure out from the documentation if this is possible. I
have a large 2D KDTree and I was wondering if there is a way to append
new points to the original KDTree without having to recreate it from
scratch, i.e.

tree1 = scipy.spatial.KDTree(xy1)

now I have a few new points, say xy2

is there a way to get

tree2 = tree1.append(xy2)

rather than:

xy3 = np.vstack((xy1,xy2))
tree2 = scipy.spatial.KDTree(xy3)

thanks

- dharhas
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From cpblpublic at gmail.com  Mon Dec 12 20:05:54 2011
From: cpblpublic at gmail.com (C Barrington-Leigh)
Date: Mon, 12 Dec 2011 17:05:54 -0800 (PST)
Subject: [SciPy-User] matplotlib: Simple legend code no longer works after upgrade to Ubuntu 11.10
Message-ID:

Oops; I just posted this to comp.lang.python, but I wonder whether
matplotlib questions are supposed to go to scipy-user?
Here it is: """ Before I upgraded to 2.7.2+ / 4 OCt 2011, the following code added a comment line to an axis legend using matplotlib / pylab. Now, the same code makes the legend appear "off-screen", ie way outside the axes limits. Can anyone help? And/or is there a new way to add a title and footer to the legend? Thanks! """ from pylab import * plot([0,0],[1,1],label='Ubuntu 11.10') lh=legend(fancybox=True,shadow=False) lh.get_frame().set_alpha(0.5) from matplotlib.offsetbox import TextArea, VPacker fontsize=lh.get_texts()[0].get_fontsize() legendcomment=TextArea('extra comments here', textprops=dict(size=fontsize)) show() # Looks fine here lh._legend_box = VPacker(pad=5, sep=0, children=[lh._legend_box,legendcomment], align="left") lh._legend_box.set_figure(gcf()) draw() From warren.weckesser at enthought.com Tue Dec 13 10:09:53 2011 From: warren.weckesser at enthought.com (Warren Weckesser) Date: Tue, 13 Dec 2011 09:09:53 -0600 Subject: [SciPy-User] matplotlib: Simple legend code no longer works after upgrade to Ubuntu 11.10 In-Reply-To: References: Message-ID: On Mon, Dec 12, 2011 at 7:05 PM, C Barrington-Leigh wrote: > Oops; I just posted this to comp.lang.python, but I wonder whether > matplotlib questions are supposed to go to scipy-user? How about matplotlib-users at lists.sourceforge.net? I've cc'ed to that list. Warren > Here it is: > """ > Before I upgraded to 2.7.2+ / 4 OCt 2011, the following code added a > comment line to an axis legend using matplotlib / pylab. > Now, the same code makes the legend appear "off-screen", ie way > outside the axes limits. > > Can anyone help? And/or is there a new way to add a title and footer > to the legend? > > Thanks! > """ > > from pylab import * > plot([0,0],[1,1],label='Ubuntu 11.10') > lh=legend(fancybox=True,shadow=False) > lh.get_frame().set_alpha(0.5) > > from matplotlib.offsetbox import TextArea, VPacker > fontsize=lh.get_texts()[0].get_fontsize() > legendcomment=TextArea('extra comments here', > textprops=dict(size=fontsize)) > show() > # Looks fine here > lh._legend_box = VPacker(pad=5, > sep=0, > children=[lh._legend_box,legendcomment], > align="left") > lh._legend_box.set_figure(gcf()) > draw() > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Tue Dec 13 15:40:43 2011 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Tue, 13 Dec 2011 15:40:43 -0500 Subject: [SciPy-User] an exercise in spline basis functions In-Reply-To: References: Message-ID: On Fri, Dec 9, 2011 at 8:05 PM, wrote: > On Fri, Dec 9, 2011 at 1:32 PM, ? wrote: >> On Fri, Dec 9, 2011 at 1:19 PM, Charles R Harris >> wrote: >>> >>> >>> On Fri, Dec 9, 2011 at 10:53 AM, wrote: >>>> >>>> On Fri, Dec 9, 2011 at 12:00 PM, Charles R Harris >>>> wrote: >>>> > >>>> > >>>> > On Thu, Dec 8, 2011 at 11:53 PM, wrote: >>>> >> >>>> >> Trying to understand spline basis functions, I always wanted to have >>>> >> something simple to play with (that is not hidden in C or Fortran code >>>> >> behind a lot of numerical sophistication). >>>> >> >>>> >> Here is a very simple example, coded straight from a beginner's >>>> >> explanation. And it even works. 
>>>> >>
>>>> >> https://picasaweb.google.com/106983885143680349926/Joepy#5684006069657083426
>>>> >>
>>>> >> https://picasaweb.google.com/106983885143680349926/Joepy#5684006072774057874
>>>> >>
>>>> >> Motivation
>>>> >> If we have the basis functions directly, then we can just treat them
>>>> >> like regular regressors, e.g. for robust fitting, have more control
>>>> >> and information over variable selection than using scipy's splines, or
>>>> >> include them at the same time as other regressors.
>>>> >> (I'm thinking mainly of noisy data with a small number of
>>>> >> breaks/knots.)
>>>> >> (and because I was looking at what's left of the old stats.models
>>>> >> spline code, where most of it got removed because it was crashing C
>>>> >> code.)
>>>> >>
>>>> >> This was mainly to see if it works.
>>>> >> Is there better code to get the spline basis functions (and maybe the
>>>> >> derivatives, ...) available somewhere?
>>>> >>
>>>> >
>>>> > Looks like you are using uniform b-splines. You might want to look at the
>>>> > non-uniform variety also. If you want a complete set both require extra
>>>> > knot points outside the interior of the domain, but for the non-uniform
>>>> > variety the added knot points are just repeats of the end points.
>>>>
>>>> uniform (integer) knots was the easiest to get started. I will extend
>>>> it to non-uniform knots, but I wasn't sure what the support of each
>>>> basis function is. (But I just found it in
>>>> http://www.cs.mtu.edu/~shene/COURSES/cs3621/NOTES/spline/B-spline/bspline-basis.html
>>>> )
>> based on this the non-uniform case was actually easy
>> https://picasaweb.google.com/106983885143680349926/Joepy#5684197321179952226
>
> one last plot (prototype seems to work)
>
> robust least squares spline fitting (with outliers in data)
> https://picasaweb.google.com/106983885143680349926/Joepy#5684292228451302850

one more prototype that I haven't seen yet:

using scipy.interpolate and finding the smoothing parameter s that
minimizes the Bayesian Information Criterion, BIC

https://picasaweb.google.com/106983885143680349926/Joepy#5685704025775485730

Josef
(one of these days I'll get a blog)

>
> Josef
>
>>
>> Josef
>>>>
>>>> Also the recursive function calls repeat some calculations and won't
>>>> be memory friendly.
>>>>
>>>> I was hoping someone already has done a more general version in Python.
>>>>
>>>> (BTW: I figured out how your multivariate Bernstein polynomials work
>>>> but didn't find an application for it yet.)
>>>>
>>>
>>> I'm going to extend that capability to the polynomials in numpy.
>>>
>>> Chuck
>>>
>>> _______________________________________________
>>> SciPy-User mailing list
>>> SciPy-User at scipy.org
>>> http://mail.scipy.org/mailman/listinfo/scipy-user
>>>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: try_spline_ic.py
Type: text/x-python
Size: 2194 bytes
Desc: not available
URL:
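The scrubbed try_spline_ic.py is not recoverable from the archive, but
the idea can be sketched roughly as follows -- an assumption about the
approach, not the actual attachment: loop over candidate smoothing
values s, refit scipy.interpolate.UnivariateSpline, and score each fit
by BIC, with the number of spline coefficients as a crude
effective-parameter count:

import numpy as np
from scipy import interpolate

def spline_by_bic(x, y, s_grid, k=3):
    # refit for every candidate s and keep the fit with the lowest BIC
    n = len(y)
    best = None
    for s in s_grid:
        spl = interpolate.UnivariateSpline(x, y, k=k, s=s)
        rss = np.sum((y - spl(x))**2)
        df = len(spl.get_coeffs())      # crude count of free parameters
        bic = n*np.log(rss/n) + df*np.log(n)
        if best is None or bic < best[0]:
            best = (bic, s, spl)
    return best

x = np.linspace(0., 10., 100)
y = np.sin(x) + 0.3*np.random.randn(100)     # noisy toy data
bic, s_opt, spl = spline_by_bic(x, y, s_grid=np.linspace(0.5, 20., 40))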
From ralf.gommers at googlemail.com  Tue Dec 13 15:52:18 2011
From: ralf.gommers at googlemail.com (Ralf Gommers)
Date: Tue, 13 Dec 2011 21:52:18 +0100
Subject: [SciPy-User] an exercise in spline basis functions
In-Reply-To:
References:
Message-ID:

On Tue, Dec 13, 2011 at 9:40 PM, wrote:

> On Fri, Dec 9, 2011 at 8:05 PM, wrote:
> >
> > one last plot (prototype seems to work)
> >
> > robust least squares spline fitting (with outliers in data)
> >
> > https://picasaweb.google.com/106983885143680349926/Joepy#5684292228451302850
>
> one more prototype that I haven't seen yet:
>
> using scipy.interpolate and finding the smoothing parameter s that
> minimizes the Bayesian Information Criterion, BIC
>
> https://picasaweb.google.com/106983885143680349926/Joepy#5685704025775485730
>

These look quite useful. Are you aiming for inclusion in statsmodels or
scipy?

> Josef
> (one of these days I'll get a blog)
>

Good idea, then I would have checked it out before attachment 6 :)

Cheers,
Ralf

> _______________________________________________
> SciPy-User mailing list
> SciPy-User at scipy.org
> http://mail.scipy.org/mailman/listinfo/scipy-user
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From wesmckinn at gmail.com  Tue Dec 13 18:29:45 2011
From: wesmckinn at gmail.com (Wes McKinney)
Date: Tue, 13 Dec 2011 18:29:45 -0500
Subject: [SciPy-User] ANN: pandas 0.6.1
Message-ID:

I'm pleased to announce the pandas 0.6.1 release. It's been a busy 3
weeks since the last release. This upgrade is recommended for all
users and should not cause any API breakage for 0.6.0 users. As usual
there is a lot of new and improved functionality, performance
enhancements (very significant in some cases), and a number of bug
fixes. See the full release notes below and on GitHub.

Many thanks to all the users who contributed code, bug reports, and
suggestions for new features.

best,
Wes

What is it
==========
pandas is a Python package providing fast, flexible, and expressive
data structures designed to make working with "relational" or
"labeled" data both easy and intuitive. It aims to be the fundamental
high-level building block for doing practical, real world data
analysis in Python. Additionally, it has the broader goal of becoming
the most powerful and flexible open source data analysis /
manipulation tool available in any language.

Links
=====
Release Notes: https://github.com/wesm/pandas/blob/master/RELEASE.rst
Documentation: http://pandas.sourceforge.net
Installers: http://pypi.python.org/pypi/pandas
Code Repository: http://github.com/wesm/pandas
Mailing List: http://groups.google.com/group/pystatsmodels
Blog: http://blog.wesmckinney.com

pandas 0.6.1
============

**Release date:** 12/13/2011

**API Changes**

  - Rename `names` argument in DataFrame.from_records to `columns`. Add
    deprecation warning
  - Boolean get/set operations on Series with boolean Series will
    reindex instead of requiring that the indexes be exactly equal (GH #429)

**New features / modules**

  - Can pass Series to DataFrame.append with ignore_index=True for
    appending a single row (GH #430)
  - Add Spearman and Kendall correlation options to Series.corr and
    DataFrame.corr (GH #428)
  - Add new `get_value` and `set_value` methods to Series, DataFrame,
    and Panel for very low-overhead access to scalar elements.
df.get_value(row, column) is about 3x faster than df[column][row] by handling fewer cases (GH #437, #438). Add similar methods to sparse data structures for compatibility - Add Qt table widget to sandbox (PR #435) - DataFrame.align can accept Series arguments, add axis keyword (GH #461) - Implement new SparseList and SparseArray data structures. SparseSeries now derives from SparseArray (GH #463) - max_columns / max_rows options in set_printoptions (PR #453) - Implement Series.rank and DataFrame.rank, fast versions of scipy.stats.rankdata (GH #428) - Implement DataFrame.from_items alternate constructor (GH #444) - DataFrame.convert_objects method for inferring better dtypes for object columns (GH #302) - Add rolling_corr_pairwise function for computing Panel of correlation matrices (GH #189) - Add `margins` option to `pivot_table` for computing subgroup aggregates (GH #114) - Add `Series.from_csv` function (PR #482) **Improvements to existing features** - Improve memory usage of `DataFrame.describe` (do not copy data unnecessarily) (PR #425) - Use same formatting function for outputting floating point Series to console as in DataFrame (PR #420) - DataFrame.delevel will try to infer better dtype for new columns (GH #440) - Exclude non-numeric types in DataFrame.{corr, cov} - Override Index.astype to enable dtype casting (GH #412) - Use same float formatting function for Series.__repr__ (PR #420) - Use available console width to output DataFrame columns (PR #453) - Accept ndarrays when setting items in Panel (GH #452) - Infer console width when printing __repr__ of DataFrame to console (PR #453) - Optimize scalar value lookups in the general case by 25% or more in Series and DataFrame - Can pass DataFrame/DataFrame and DataFrame/Series to rolling_corr/rolling_cov (GH #462) - Fix performance regression in cross-sectional count in DataFrame, affecting DataFrame.dropna speed - Column deletion in DataFrame copies no data (computes views on blocks) (GH #158) - MultiIndex.get_level_values can take the level name - More helpful error message when DataFrame.plot fails on one of the columns (GH #478) **Bug fixes** - Fix O(K^2) memory leak caused by inserting many columns without consolidating, had been present since 0.4.0 (GH #467) - `DataFrame.count` should return Series with zero instead of NA with length-0 axis (GH #423) - Fix Yahoo! Finance API usage in pandas.io.data (GH #419, PR #427) - Fix upstream bug causing failure in Series.align with empty Series (GH #434) - Function passed to DataFrame.apply can return a list, as long as it's the right length. 
Regression from 0.4 (GH #432) - Don't "accidentally" upcast scalar values when indexing using .ix (GH #431) - Fix groupby exception raised with as_index=False and single column selected (GH #421) - Implement DateOffset.__ne__ causing downstream bug (GH #456) - Fix __doc__-related issue when converting py -> pyo with py2exe - Bug fix in left join Cython code with duplicate monotonic labels - Fix bug when unstacking multiple levels described in #451 - Exclude NA values in dtype=object arrays, regression from 0.5.0 (GH #469) - Use Cython map_infer function in DataFrame.applymap to properly infer output type, handle tuple return values and other things that were breaking (GH #465) - Handle floating point index values in HDFStore (GH #454) - Fixed stale column reference bug (cached Series object) caused by type change / item deletion in DataFrame (GH #473) - Index.get_loc should always raise Exception when there are duplicates - Handle differently-indexed Series input to DataFrame constructor (GH #475) - Omit nuisance columns in multi-groupby with Python function - Buglet in handling of single grouping in general apply - Handle type inference properly when passing list of lists or tuples to DataFrame constructor (GH #484) - Preserve Index / MultiIndex names in GroupBy.apply concatenation step (GH #481) Thanks ------ - Ralph Bean - Luca Beltrame - Marius Cobzarenco - Andreas Hilboll - Jev Kuznetsov - Adam Lichtenstein - Wouter Overmeire - Fernando Perez - Nathan Pinger - Christian Prinoth - Alex Reyfman - Joon Ro - Chang She - Ted Square - Chris Uga - Dieter Vandenbussche From josef.pktd at gmail.com Tue Dec 13 20:43:20 2011 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Tue, 13 Dec 2011 20:43:20 -0500 Subject: [SciPy-User] an exercise in spline basis functions In-Reply-To: References: Message-ID: On Tue, Dec 13, 2011 at 3:52 PM, Ralf Gommers wrote: > > > On Tue, Dec 13, 2011 at 9:40 PM, wrote: >> >> On Fri, Dec 9, 2011 at 8:05 PM, ? wrote: >> > >> > one last plot (prototype seems to work) >> > >> > robust least squares spline fitting (with outliers in data) >> > >> > https://picasaweb.google.com/106983885143680349926/Joepy#5684292228451302850 >> >> one more prototype that I haven't seen yet: >> >> using scipy.interpolate and find smoothing parameter s that minimizes >> Bayesian Information Criterium, BIC >> >> >> https://picasaweb.google.com/106983885143680349926/Joepy#5685704025775485730 > > > These look quite useful. Are you aiming for inclusion in statsmodels or > scipy? My main target is statsmodels. The current version of this is more scipy style, just a nested function, the statsmodels version will be a class with an API that is not clear yet. If this is useful, I don't see a reason not to put it in scipy. For statsmodels it will be more useful if the splines can be tied in with other models, for example regression and exploratory data analysis. For more there are also https://github.com/jjstickel/scikit-datasmooth/ https://github.com/ludwigschwardt/scikits.fitting I didn't find a lot in the scipy-user archives for automatic selection this is useful http://mail.scipy.org/pipermail/scipy-user/2008-September/018200.html this I also would like to have http://mail.scipy.org/pipermail/scipy-user/2011-August/030295.html >> >> >> Josef >> (one of these days I'll get a blog) > > > Good idea, then I would have checked it out before attachment 6:) Blame gmail that it doesn't have a view attachment (anymore ?) 
http://mail.scipy.org/pipermail/scipy-user/attachments/20111213/37f96c26/attachment.py If you downloaded it, you can run some nice graphs, it's standalone with numpy and scipy :) the cool title of the blog article: "Don't use s=0 if k>1" Cheers, Josef > > Cheers, > Ralf > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From kevin.gullikson at gmail.com Tue Dec 13 10:59:57 2011 From: kevin.gullikson at gmail.com (Kevin Gullikson) Date: Tue, 13 Dec 2011 09:59:57 -0600 Subject: [SciPy-User] numpy.convolve needs to be normalized? Message-ID: Hi all, I have been playing with the numpy.convolve function. It seems that the general behavior is what I expect, but the values are all too high. Does the result need to be normalized in some way? Here is some example code that should replicate the figure in the wikipedia page for convolution : import numpy import pylab #Make some square data x = numpy.arange(0,10,0.01) left1 = 2 right1=3 left2=4 right2=5 y1=numpy.zeros(x.size) y2=numpy.zeros(x.size) for i in range(x.size): if x[i] >=left1 and x[i] <= right1: y1[i] = 1 if x[i] >=left2 and x[i] <= right2: y2[i] = 1 #Convolve conv = numpy.convolve(y1,y2,mode="same") pylab.plot(x,conv) pylab.show() The peak should be at x=2 (and it is), and should have a height of 1, since the maximum area under the two curves is 1. However, the height of the peak is about 101 Kevin Gullikson -------------- next part -------------- An HTML attachment was scrubbed... URL: From warren.weckesser at enthought.com Wed Dec 14 09:46:25 2011 From: warren.weckesser at enthought.com (Warren Weckesser) Date: Wed, 14 Dec 2011 08:46:25 -0600 Subject: [SciPy-User] numpy.convolve needs to be normalized? In-Reply-To: References: Message-ID: On Tue, Dec 13, 2011 at 9:59 AM, Kevin Gullikson wrote: > Hi all, > > I have been playing with the numpy.convolve function. It seems that the > general behavior is what I expect, but the values are all too high. Does > the result need to be normalized in some way? Here is some example code > that should replicate the figure in the wikipedia page for convolution > : > > import numpy > import pylab > > #Make some square data > x = numpy.arange(0,10,0.01) > left1 = 2 > right1=3 > left2=4 > right2=5 > y1=numpy.zeros(x.size) > y2=numpy.zeros(x.size) > for i in range(x.size): > if x[i] >=left1 and x[i] <= right1: > y1[i] = 1 > if x[i] >=left2 and x[i] <= right2: > y2[i] = 1 > > #Convolve > conv = numpy.convolve(y1,y2,mode="same") > pylab.plot(x,conv) > pylab.show() > > > The peak should be at x=2 (and it is), and should have a height of 1, > since the maximum area under the two curves is 1. However, the height of > the peak is about 101 > The peak of 101 is correct. numpy.convolve is a *discrete* convolution--scroll down a bit in the wikipedia article to see the definition. Your y1 and y2 each contain 101 consecutive 1s, so the peak value in the convolution should be 101, as you observed. Warren -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Wed Dec 14 09:47:44 2011 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 14 Dec 2011 09:47:44 -0500 Subject: [SciPy-User] numpy.convolve needs to be normalized? In-Reply-To: References: Message-ID: On Tue, Dec 13, 2011 at 10:59 AM, Kevin Gullikson wrote: > Hi all, > > I have been playing with the numpy.convolve function. 
It seems that the
> general behavior is what I expect, but the values are all too high. Does the
> result need to be normalized in some way? Here is some example code that
> should replicate the figure in the wikipedia page for convolution:
>
> import numpy
> import pylab
>
> #Make some square data
> x = numpy.arange(0,10,0.01)
> left1 = 2
> right1 = 3
> left2 = 4
> right2 = 5
> y1 = numpy.zeros(x.size)
> y2 = numpy.zeros(x.size)
> for i in range(x.size):
>     if x[i] >= left1 and x[i] <= right1:
>         y1[i] = 1
>     if x[i] >= left2 and x[i] <= right2:
>         y2[i] = 1
>
> #Convolve
> conv = numpy.convolve(y1,y2,mode="same")
> pylab.plot(x,conv)
> pylab.show()
>
> The peak should be at x=2 (and it is), and should have a height of 1, since
> the maximum area under the two curves is 1. However, the height of the peak
> is about 101

convolve is discrete, just the convolution sum of points. If the
window sums to 1, then you just get a weighted average.

conv = numpy.convolve(y1, y2/y2.sum(), mode="same")

>>> conv.max()
1.0000000000000002

Josef

>
> Kevin Gullikson
>
> _______________________________________________
> SciPy-User mailing list
> SciPy-User at scipy.org
> http://mail.scipy.org/mailman/listinfo/scipy-user
>
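To make the discrete definition concrete, a small check along the lines
of the example above -- only a sketch; the pulses are built by index so
that each contains exactly 101 ones, matching the 0.01 grid:

import numpy as np

y1 = np.zeros(1000)
y1[200:301] = 1.0          # 101 ones, like the x >= 2, x <= 3 pulse above
y2 = np.zeros(1000)
y2[400:501] = 1.0

conv = np.convolve(y1, y2, mode="full")
k = conv.argmax()
# discrete definition: conv[k] = sum_j y1[j] * y2[k - j]
direct = sum(y1[j]*y2[k - j] for j in range(len(y1)) if 0 <= k - j < len(y2))
assert conv[k] == direct == 101.0

# normalizing one input turns the sum into a weighted average
conv_w = np.convolve(y1, y2/y2.sum(), mode="full")
assert abs(conv_w.max() - 1.0) < 1e-12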
From jjstickel at vcn.com  Wed Dec 14 10:08:37 2011
From: jjstickel at vcn.com (Jonathan Stickel)
Date: Wed, 14 Dec 2011 08:08:37 -0700
Subject: [SciPy-User] an exercise in spline basis functions
In-Reply-To:
References:
Message-ID: <4EE8BBF5.4060507@vcn.com>

On 12/14/11 07:28, scipy-user-request at scipy.org wrote:
>
> one more prototype that I haven't seen yet:
>
> using scipy.interpolate and finding the smoothing parameter s that
> minimizes the Bayesian Information Criterion, BIC
>
> https://picasaweb.google.com/106983885143680349926/Joepy#5685704025775485730

FYI, I have a regularization-based smoothing method in scikits that can
"automatically" determine the smoothing parameter by generalized cross
validation. I am not a statistician, but I think this is analogous to
the spline smoothing example that you show.

I'd like to see my code incorporated into a larger package (e.g.
scipy.interpolate or scikits.statsmodels), but I haven't received
definitive feedback about this when I have asked in the past.

Regards,
Jonathan

From james.yoo at gmail.com  Wed Dec 14 11:20:14 2011
From: james.yoo at gmail.com (James Yoo)
Date: Wed, 14 Dec 2011 10:20:14 -0600
Subject: [SciPy-User] test_arpack.test_symmetric_modes hangs, scipy 0.10
Message-ID:

Hello,

Tests for scipy 0.10 hang at test_arpack.test_symmetric_modes

Python 2.7.2 (default, Dec 13 2011, 14:11:54)
[GCC 4.1.2 20080704 (Red Hat 4.1.2-50)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import scipy
>>> scipy.test(verbose=2)
Running unit tests for scipy
NumPy version 1.6.1
NumPy is installed in /.../lib/python2.7/site-packages/numpy
SciPy version 0.10.0
SciPy is installed in /a.../lib/python2.7/site-packages/scipy
Python version 2.7.2 (default, Dec 13 2011, 14:11:54) [GCC 4.1.2
20080704 (Red Hat 4.1.2-50)]
nose version 1.0.0

....
test_arpack.test_symmetric_modes(True, , 'f', 2, 'LM', None, None, , None, 'normal') ... FAIL
test_arpack.test_symmetric_modes(True, , 'f', 2, 'LM', None, 0.5, , None, 'normal') ... FAIL
test_arpack.test_symmetric_modes(True, , 'f', 2, 'LM', None, 0.5, , None, 'buckling') ... FAIL
test_arpack.test_symmetric_modes(True, , 'f', 2, 'LM', None, 0.5, , None, 'cayley') ... FAIL
test_arpack.test_symmetric_modes(True, , 'f', 2, 'LM', None, None, , None, 'normal') ...
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From rlnewman at ucsd.edu  Wed Dec 14 15:14:45 2011
From: rlnewman at ucsd.edu (Rob Newman)
Date: Wed, 14 Dec 2011 12:14:45 -0800
Subject: [SciPy-User] Understanding the cross-correlation function numpy.correlate & how to use it properly with real and synthetic data
Message-ID: <7E392AAF-F24D-4E25-A8A0-F028FBC88CA5@ucsd.edu>

Hi SciPy gurus,

First up - I am not a physicist, so please be gentle!

I have an array of real data and an array of synthetic data. I am
trying to determine the cross-correlation of the two signals and the
timeshift that needs to be applied to the real data to best match the
synthetic data. I also want to only use the real data later in the
script if the cross correlation result is above some level of
confidence.

I have read the man page on numpy.correlate, but I am not entirely sure
of what that function returns to me, and how I should use it. I have
looked at James Battat's website that has a useful script on the
discrete correlation function of two functions
(https://www.cfa.harvard.edu/~jbattat/computer/python/science/#correlation)
but I think his example is more complicated than my needs.

I understand that the correlate function returns an array that is twice
the size of both the input arrays minus 1 (when using mode='full'), but
what do I need to do to that resulting array to get the correlation
value (if there is indeed a value to be returned) and the timeshift
that needs to be applied to the real data to match the synthetic data.

Thanks in advance,
- Rob

From josef.pktd at gmail.com  Wed Dec 14 15:50:21 2011
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Wed, 14 Dec 2011 15:50:21 -0500
Subject: [SciPy-User] Fwd: an exercise in spline basis functions
In-Reply-To:
References: <4EE8BBF5.4060507@vcn.com>
Message-ID:

On Wed, Dec 14, 2011 at 10:08 AM, Jonathan Stickel wrote:
> On 12/14/11 07:28, scipy-user-request at scipy.org wrote:
>>
>> one more prototype that I haven't seen yet:
>>
>> using scipy.interpolate and finding the smoothing parameter s that
>> minimizes the Bayesian Information Criterion, BIC
>>
>> https://picasaweb.google.com/106983885143680349926/Joepy#5685704025775485730
>
> FYI, I have a regularization-based smoothing method in scikits that can
> "automatically" determine the smoothing parameter by generalized cross
> validation. I am not a statistician, but I think this is analogous to
> the spline smoothing example that you show.
>
> I'd like to see my code incorporated into a larger package (e.g.
> scipy.interpolate or scikits.statsmodels), but I haven't received
> definitive feedback about this when I have asked in the past.

A scipy.smooth package would be a good addition (as we discussed
before) but someone would have to push for it. Similar for
statsmodels: smoothers would be a good addition, but the topic is
lacking a "champion".

Your smoothing package would make a good addition (especially if
cvxopt can be replaced with fmin_slsqp, for example).

I'm not a smoother person, but I bump into it every once in a while:
Chris added lowess to statsmodels, Ralf is working on functional
boxplots that sometimes require pre-smoothing, and we have various
non-parametric pieces, so I'm working my way *slowly* to add smoothers
(and polynomial fitting).
Josef https://picasaweb.google.com/106983885143680349926/Joepy#5686027329247551410 > > Regards, > Jonathan From ralf.gommers at googlemail.com Wed Dec 14 16:21:43 2011 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Wed, 14 Dec 2011 22:21:43 +0100 Subject: [SciPy-User] test_arpack.test_symmetric_modes hangs, scipy 0.10 In-Reply-To: References: Message-ID: On Wed, Dec 14, 2011 at 5:20 PM, James Yoo wrote: > Hello, > > Test for scipy 0.10 hang at test_arpack.test_symmetric_modes > > Hmm, this is really turning into a never-ending story. We were thinking all test suite crashes/hangs were fixed at least. Can you provide details on your OS, compilers, where you got python, numpy, scipy? Related issues: http://projects.scipy.org/scipy/ticket/1515 http://projects.scipy.org/scipy/ticket/1523 http://projects.scipy.org/scipy/ticket/1472 Ralf Python 2.7.2 (default, Dec 13 2011, 14:11:54) > [GCC 4.1.2 20080704 (Red Hat 4.1.2-50)] on linux2 > Type "help", "copyright", "credits" or "license" for more information. > >>> import scipy > >>> scipy.test(verbose=2) > Running unit tests for scipy > NumPy version 1.6.1 > NumPy is installed in /.../lib/python2.7/site-packages/numpy > SciPy version 0.10.0 > SciPy is installed in /a.../lib/python2.7/site-packages/scipy > Python version 2.7.2 (default, Dec 13 2011, 14:11:54) [GCC 4.1.2 20080704 > (Red Hat 4.1.2-50)] > nose version 1.0.0 > > .... > test_arpack.test_symmetric_modes(True, , 'f', 2, 'LM', > None, None, , None, 'normal') ... FAIL > test_arpack.test_symmetric_modes(True, , 'f', 2, 'LM', > None, 0.5, , None, 'normal') ... > FAIL > test_arpack.test_symmetric_modes(True, , 'f', 2, 'LM', > None, 0.5, , None, 'buckling') ... > FAIL > test_arpack.test_symmetric_modes(True, , 'f', 2, 'LM', > None, 0.5, , None, 'cayley') ... > FAIL > test_arpack.test_symmetric_modes(True, , 'f', 2, 'LM', > None, None, , None, 'normal') ... > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From james.yoo at gmail.com Wed Dec 14 16:29:25 2011 From: james.yoo at gmail.com (James Yoo) Date: Wed, 14 Dec 2011 15:29:25 -0600 Subject: [SciPy-User] test_arpack.test_symmetric_modes hangs, scipy 0.10 In-Reply-To: References: Message-ID: here's my compiler and python/numpy/scipy info... note we're also using intel mkl (8.1.014) libs. Linux 2.6.18-238.3.1.el5 #1 SMP Tue Jan 25 18:05:40 EST 2011 x86_64 x86_64 x86_64 GNU/Linux gcc -v Using built-in specs. Target: x86_64-redhat-linux Configured with: ../configure --prefix=/usr --mandir=/usr/share/man --infodir=/usr/share/info --enable-shared --enable-threads=posix --enable-checking=release --with-system-zlib --enable-__cxa_atexit --disable-libunwind-exceptions --enable-libgcj-multifile --enable-languages=c,c++,objc,obj-c++,java,fortran,ada --enable-java-awt=gtk --disable-dssi --disable-plugin --with-java-home=/usr/lib/jvm/java-1.4.2-gcj-1.4.2.0/jre --with-cpu=generic --host=x86_64-redhat-linux Thread model: posix gcc version 4.1.2 20080704 (Red Hat 4.1.2-50) python from python.org numpy 1.6.1 and scipy 0.10 from sourceforge On Wed, Dec 14, 2011 at 3:21 PM, Ralf Gommers wrote: > > > On Wed, Dec 14, 2011 at 5:20 PM, James Yoo wrote: > >> Hello, >> >> Test for scipy 0.10 hang at test_arpack.test_symmetric_modes >> >> Hmm, this is really turning into a never-ending story. We were thinking > all test suite crashes/hangs were fixed at least. 
Can you provide details > on your OS, compilers, where you got python, numpy, scipy? > > Related issues: > http://projects.scipy.org/scipy/ticket/1515 > http://projects.scipy.org/scipy/ticket/1523 > http://projects.scipy.org/scipy/ticket/1472 > > Ralf > > > Python 2.7.2 (default, Dec 13 2011, 14:11:54) >> [GCC 4.1.2 20080704 (Red Hat 4.1.2-50)] on linux2 >> Type "help", "copyright", "credits" or "license" for more information. >> >>> import scipy >> >>> scipy.test(verbose=2) >> Running unit tests for scipy >> NumPy version 1.6.1 >> NumPy is installed in /.../lib/python2.7/site-packages/numpy >> SciPy version 0.10.0 >> SciPy is installed in /a.../lib/python2.7/site-packages/scipy >> Python version 2.7.2 (default, Dec 13 2011, 14:11:54) [GCC 4.1.2 20080704 >> (Red Hat 4.1.2-50)] >> nose version 1.0.0 >> >> .... >> test_arpack.test_symmetric_modes(True, , 'f', 2, 'LM', >> None, None, , None, 'normal') ... FAIL >> test_arpack.test_symmetric_modes(True, , 'f', 2, 'LM', >> None, 0.5, , None, 'normal') ... >> FAIL >> test_arpack.test_symmetric_modes(True, , 'f', 2, 'LM', >> None, 0.5, , None, 'buckling') ... >> FAIL >> test_arpack.test_symmetric_modes(True, , 'f', 2, 'LM', >> None, 0.5, , None, 'cayley') ... >> FAIL >> test_arpack.test_symmetric_modes(True, , 'f', 2, 'LM', >> None, None, , None, 'normal') ... >> >> >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user >> >> > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at googlemail.com Wed Dec 14 16:48:40 2011 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Wed, 14 Dec 2011 22:48:40 +0100 Subject: [SciPy-User] test_arpack.test_symmetric_modes hangs, scipy 0.10 In-Reply-To: References: Message-ID: On Wed, Dec 14, 2011 at 10:29 PM, James Yoo wrote: > here's my compiler and python/numpy/scipy info... note we're also using > intel mkl (8.1.014) libs. > > Linux 2.6.18-238.3.1.el5 #1 SMP Tue Jan 25 18:05:40 EST 2011 > x86_64 x86_64 x86_64 GNU/Linux > > gcc -v > Using built-in specs. > Target: x86_64-redhat-linux > Configured with: ../configure --prefix=/usr --mandir=/usr/share/man > --infodir=/usr/share/info --enable-shared --enable-threads=posix > --enable-checking=release --with-system-zlib --enable-__cxa_atexit > --disable-libunwind-exceptions --enable-libgcj-multifile > --enable-languages=c,c++,objc,obj-c++,java,fortran,ada > --enable-java-awt=gtk --disable-dssi --disable-plugin > --with-java-home=/usr/lib/jvm/java-1.4.2-gcj-1.4.2.0/jre --with-cpu=generic > --host=x86_64-redhat-linux > Thread model: posix > gcc version 4.1.2 20080704 (Red Hat 4.1.2-50) > > > python from python.org > numpy 1.6.1 and scipy 0.10 from sourceforge > > Thanks. Would anyone object to marking all single-precision Arpack tests as knownfail until someone with the motivation and skills to get this fixed comes along? Ralf > > > On Wed, Dec 14, 2011 at 3:21 PM, Ralf Gommers > wrote: > >> >> >> On Wed, Dec 14, 2011 at 5:20 PM, James Yoo wrote: >> >>> Hello, >>> >>> Test for scipy 0.10 hang at test_arpack.test_symmetric_modes >>> >>> Hmm, this is really turning into a never-ending story. We were thinking >> all test suite crashes/hangs were fixed at least. Can you provide details >> on your OS, compilers, where you got python, numpy, scipy? 
>> >> Related issues:
>> http://projects.scipy.org/scipy/ticket/1515
>> http://projects.scipy.org/scipy/ticket/1523
>> http://projects.scipy.org/scipy/ticket/1472
>>
>> Ralf
>>
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From kevin.gullikson at gmail.com  Wed Dec 14 15:49:30 2011
From: kevin.gullikson at gmail.com (Kevin Gullikson)
Date: Wed, 14 Dec 2011 14:49:30 -0600
Subject: [SciPy-User] Understanding the cross-correlation function numpy.correlate & how to use it properly with real and synthetic data
In-Reply-To: <7E392AAF-F24D-4E25-A8A0-F028FBC88CA5@ucsd.edu>
References: <7E392AAF-F24D-4E25-A8A0-F028FBC88CA5@ucsd.edu>
Message-ID:

Rob,

I understand that the correlate function returns an array that is twice
the size of both the input arrays minus 1 (when using mode='full'), but
what do I need to do to that resulting array to get the correlation
value (if there is indeed a value to be returned) and the timeshift that
needs to be applied to the real data to match the synthetic data.

numpy.correlate returns an array of correlation values, so you don't
need to do anything to get that. Getting the timeshift is the somewhat
tricky part. Here is some code that I use (I stole it from somewhere,
but don't remember where...):

#Do the correlation. x and y are the x and y components of your data (so
#I guess x is time and y is whatever you are modeling); template is what
#you are cross-correlating with
ycorr = scipy.correlate(y, template, mode="full")

#Generate an x axis
xcorr = numpy.arange(ycorr.size)

#Convert this into lag units, but still not really physical
lags = xcorr - (y.size-1)
distancePerLag = (x[-1] - x[0])/float(x.size)  #This is just the
#x-spacing (or for you, the timestep) in your data

#Convert your lags into physical units
offsets = -lags*distancePerLag

You can then use numpy.argmax() to find the index in ycorr that has the
highest cross-correlation value, and do whatever you want with the
cross-correlation.

Cheers,
Kevin Gullikson

On Wed, Dec 14, 2011 at 2:14 PM, Rob Newman wrote:

> Hi SciPy gurus,
>
> First up - I am not a physicist, so please be gentle!
>
> I have an array of real data and an array of synthetic data. I am trying
> to determine the cross-correlation of the two signals and the timeshift
> that needs to be applied to the real data to best match the synthetic data.
> I also want to only use the real data later on the script if the cross
> correlation result is above some level of confidence.
>
> I have read the man page on numpy.correlate, but I am not entirely sure of
> what that function returns to me, and how I should use it. I have looked at
> James Battat's website that has a useful script on the discrete correlation
> function of two functions (
> https://www.cfa.harvard.edu/~jbattat/computer/python/science/#correlation)
> but I think his example is more complicated than my needs.
>
> I understand that the correlate function returns an array that is twice
> the size of both the input arrays minus 1 (when using mode='full'), but
> what do I need to do to that resulting array to get the correlation value
> (if there is indeed a value to be returned) and the timeshift that needs to
> be applied to the real data to match the synthetic data.
>
> Thanks in advance,
> - Rob
> _______________________________________________
> SciPy-User mailing list
> SciPy-User at scipy.org
> http://mail.scipy.org/mailman/listinfo/scipy-user
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
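A quick synthetic check of this recipe -- toy data made up here, two
Gaussian pulses 1.5 time units apart; the argmax of the correlation
recovers the separation:

import numpy as np

dt = 0.01
x = np.arange(0., 10., dt)
y = np.exp(-(x - 4.0)**2 / 0.05)          # reference pulse
template = np.exp(-(x - 5.5)**2 / 0.05)   # same pulse, 1.5 time units later

ycorr = np.correlate(y, template, mode="full")
lags = np.arange(ycorr.size) - (y.size - 1)
offsets = -lags * dt

best = ycorr.argmax()
# offsets[best] comes out as 1.5 here, the separation between the pulses
assert abs(offsets[best] - 1.5) < dt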
From rlnewman at ucsd.edu  Thu Dec 15 11:38:48 2011
From: rlnewman at ucsd.edu (Rob Newman)
Date: Thu, 15 Dec 2011 08:38:48 -0800
Subject: [SciPy-User] Understanding the cross-correlation function numpy.correlate & how to use it properly with real and synthetic data
In-Reply-To:
References: <7E392AAF-F24D-4E25-A8A0-F028FBC88CA5@ucsd.edu>
Message-ID: <9CDF2BD0-7B3E-46E7-BC92-F7A04C6F84D4@ucsd.edu>

Hi Kevin,

Thanks for that chunk of code and the explanation - it's a great help.

Happy holidays.
- Rob Newman

On Dec 14, 2011, at 12:49 PM, Kevin Gullikson wrote:

> Rob,
>
> I understand that the correlate function returns an array that is twice
> the size of both the input arrays minus 1 (when using mode='full'), but
> what do I need to do to that resulting array to get the correlation
> value (if there is indeed a value to be returned) and the timeshift
> that needs to be applied to the real data to match the synthetic data.
>
> numpy.correlate returns an array of correlation values, so you don't
> need to do anything to get that. Getting the timeshift is the somewhat
> tricky part. Here is some code that I use (I stole it from somewhere,
> but don't remember where...):
>
> #Do the correlation. x and y are the x and y components of your data (so
> #I guess x is time and y is whatever you are modeling); template is what
> #you are cross-correlating with
> ycorr = scipy.correlate(y, template, mode="full")
>
> #Generate an x axis
> xcorr = numpy.arange(ycorr.size)
>
> #Convert this into lag units, but still not really physical
> lags = xcorr - (y.size-1)
> distancePerLag = (x[-1] - x[0])/float(x.size)  #This is just the
> #x-spacing (or for you, the timestep) in your data
>
> #Convert your lags into physical units
> offsets = -lags*distancePerLag
>
> You can then use numpy.argmax() to find the index in ycorr that has the
> highest cross-correlation value, and do whatever you want with the
> cross-correlation.
>
> Cheers,
> Kevin Gullikson
>
> On Wed, Dec 14, 2011 at 2:14 PM, Rob Newman wrote:
>
> Hi SciPy gurus,
>
> First up - I am not a physicist, so please be gentle!
>
> I have an array of real data and an array of synthetic data. I am trying
> to determine the cross-correlation of the two signals and the timeshift
> that needs to be applied to the real data to best match the synthetic
> data. I also want to only use the real data later on the script if the
> cross correlation result is above some level of confidence.
>
> I have read the man page on numpy.correlate, but I am not entirely sure
> of what that function returns to me, and how I should use it. I have
> looked at James Battat's website that has a useful script on the
> discrete correlation function of two functions
> (https://www.cfa.harvard.edu/~jbattat/computer/python/science/#correlation)
> but I think his example is more complicated than my needs.
>
> I understand that the correlate function returns an array that is twice
> the size of both the input arrays minus 1 (when using mode='full'), but
> what do I need to do to that resulting array to get the correlation
> value (if there is indeed a value to be returned) and the timeshift that
> needs to be applied to the real data to match the synthetic data.
> > Thanks in advance,
> > - Rob
> > _______________________________________________
> > SciPy-User mailing list
> > SciPy-User at scipy.org
> > http://mail.scipy.org/mailman/listinfo/scipy-user
>
> _______________________________________________
> SciPy-User mailing list
> SciPy-User at scipy.org
> http://mail.scipy.org/mailman/listinfo/scipy-user

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From tmp50 at ukr.net  Thu Dec 15 11:41:25 2011
From: tmp50 at ukr.net (Dmitrey)
Date: Thu, 15 Dec 2011 18:41:25 +0200
Subject: [SciPy-User] Ann: OpenOpt and FuncDesigner 0.37
Message-ID: <68842.1323967285.16181933434562084864@ffe15.ukr.net>

Hi all,
I'm glad to inform you about new release 0.37 (2011-Dec-15) of our free software:

OpenOpt (numerical optimization):
- IPOPT initialization time gap (time till first iteration) for FuncDesigner models has been decreased
- Some improvements and bugfixes for interalg, especially for "search all SNLE solutions" mode (Systems of Non Linear Equations)
- Eigenvalue problems (EIG) (in both OpenOpt and FuncDesigner)
- Equality constraints for the GLP (global) solver de
- Some changes for the goldenSection ftol stop criterion
- GUI func "manage" - now the "Enough" button works in Python 3, but "Run/Pause" does not yet (probably something with threading; it will be fixed in Python instead)

FuncDesigner:
- Major sparse automatic differentiation improvements for badly-vectorized or unvectorized problems with lots of constraints (except box bounds); some problems now work many times or orders faster (of course not faster than vectorized problems with insufficient number of variable arrays). It is recommended to retest your large-scale problems with useSparse = 'auto' | True | False
- Two new methods for splines to check their quality: plot and residual
- Solving ODE dy/dt = f(t) with specifiable accuracy by interalg
- Speedup for solving 1-dimensional IP by interalg

SpaceFuncs and DerApproximator:
- Some code cleanup

You may trace OpenOpt development information in our recently created entries in Twitter and Facebook, see http://openopt.org for details.

See also: FuturePlans, this release announcement in the OpenOpt forum

Regards, D.
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From darribas at asu.edu  Thu Dec 15 20:07:37 2011
From: darribas at asu.edu (Daniel Arribas-Bel)
Date: Fri, 16 Dec 2011 02:07:37 +0100
Subject: [SciPy-User] fmin_l_bfgs_b issues with small numbers
Message-ID:

Hi all!

[I hope this hasn't been asked before on the list, I haven't found anything similar at least.]

I have a simple optimization problem that involves two very small arrays of dimension (2, 2) and (2, 1):

aa = np.array([[ 0.00030763, -0.00011521], \
               [ 0.00093007, -0.00015189]])
a = np.array([[ 2.54854751e-05], [ -3.93219333e-05]])

The function to optimize is:

optim_par = lambda par: np.sum((np.dot(aa, np.array([par, par**2])) - a)**2)

Now, if I just run optimize.fmin_l_bfgs_b with the default parameters:

start = [0.0]
bounds=[(-1.0,1.0)]
ll = op.fmin_l_bfgs_b(optim_par, start, approx_grad = True, bounds = bounds)

I get a parameter of exactly 0.0, same as the starting value.

If instead I either modify the parameters 'pgtol' and 'factr':

l = op.fmin_l_bfgs_b(optim_par, start, approx_grad = True, bounds = bounds, pgtol = 1e-50, factr = 10.0)

or simply re-scale the arrays multiplying them by 10,000 and run it as initially, the parameter I get is in either case -0.0296399132248. The attached script runs the three options.
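(The opt_ex.py attachment itself was scrubbed from the archive; what follows is only a sketch reconstructed from the description above, with guessed helper names, not the original script.)

import numpy as np
import scipy.optimize as op

aa = np.array([[ 0.00030763, -0.00011521],
               [ 0.00093007, -0.00015189]])
a = np.array([[ 2.54854751e-05], [ -3.93219333e-05]])

optim_par = lambda par: np.sum((np.dot(aa, np.array([par, par**2])) - a)**2)
# the same objective with both arrays re-scaled by 10,000
optim_par_scaled = lambda par: np.sum((np.dot(aa * 1e4, np.array([par, par**2])) - a * 1e4)**2)

start = [0.0]
bounds = [(-1.0, 1.0)]

# option 1: default tolerances (returns the starting value, 0.0)
x1 = op.fmin_l_bfgs_b(optim_par, start, approx_grad=True, bounds=bounds)[0]
# option 2: tightened 'pgtol' and 'factr'
x2 = op.fmin_l_bfgs_b(optim_par, start, approx_grad=True, bounds=bounds,
                      pgtol=1e-50, factr=10.0)[0]
# option 3: re-scaled arrays, default tolerances
x3 = op.fmin_l_bfgs_b(optim_par_scaled, start, approx_grad=True, bounds=bounds)[0]

print x1, x2, x3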
A couple of questions about this: - Is this caused because of the way the optimizer handles relatively small numbers or am I missing something else? - I want this to run inside an app that will take different datasets even though the arrays 'a' and 'aa' will always be of that shape. Which of the two solutions would you recommend as more stable, robust and fast? Why? Any other comment/suggestion is of course most welcome. Thank you very much in advance, ]d[ -- ============================================================ Daniel Arribas-Bel, PhD. Url: darribas.org Mail: darribas at asu.edu GeoDa Center for Geospatial Analysis and Computation (geodacenter.asu.edu) Arizona State University (USA) ============================================================ -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: opt_ex.py Type: text/x-python Size: 1024 bytes Desc: not available URL: From schut at sarvision.nl Fri Dec 16 04:45:11 2011 From: schut at sarvision.nl (Vincent Schut) Date: Fri, 16 Dec 2011 10:45:11 +0100 Subject: [SciPy-User] Understanding the cross-correlation function numpy.correlate & how to use it properly with real and synthetic data In-Reply-To: <7E392AAF-F24D-4E25-A8A0-F028FBC88CA5@ucsd.edu> References: <7E392AAF-F24D-4E25-A8A0-F028FBC88CA5@ucsd.edu> Message-ID: On 12/14/2011 09:14 PM, Rob Newman wrote: > Hi SciPy gurus, > > First up - I am not a physicist, so please be gentle! > > I have an array of real data and an array of synthetic data. I am trying to determine the cross-correlation of the two signals and the timeshift that needs to be applied to the real data to best match the synthetic data. I also want to only use the real data later on the script if the cross correlation result is above some level of confidence. > > I have read the man page on numpy.correlate, but I am not entirely sure of what that function returns to me, and how I should use it. I have looked at James Battat's website that has a useful script on the discrete correlation function of two functions (https://www.cfa.harvard.edu/~jbattat/computer/python/science/#correlation) but I think his example is more complicated than my needs. > > I understand that the correlate function returns an array that is twice the size of both the input arrays minus 1 (when using mode='full'), but what do I need to do to that resulting array to get the correlation value (if there is indeed a value to be returned) and the timeshift that needs to be applied to the real data to match the synthetic data. > > Thanks in advance, > - Rob Hi, if you just need to find the time-shift, another approach could be fft phase correlation. I have successfully used that to co-register images (satellite images) together, but I suppose it would apply in the 1-d case as well. Unfortunately I don't have any code ready, but you just might want to check some info on the subject on the internet to see if it would fit your needs. Best, Vincent. From warren.weckesser at enthought.com Fri Dec 16 05:19:43 2011 From: warren.weckesser at enthought.com (Warren Weckesser) Date: Fri, 16 Dec 2011 04:19:43 -0600 Subject: [SciPy-User] fmin_l_bfgs_b issues with small numbers In-Reply-To: References: Message-ID: On Thu, Dec 15, 2011 at 7:07 PM, Daniel Arribas-Bel wrote: > Hi all! > > [I hope this hasn't been asked before on the list, I haven't found > anything similar at least.] 
> I have a simple optimization problem that involves two very small arrays
> of dimension (2, 2) and (2, 1):
>
> aa = np.array([[ 0.00030763, -0.00011521], \
>                [ 0.00093007, -0.00015189]])
> a = np.array([[ 2.54854751e-05], [ -3.93219333e-05]])
>
> The function to optimize is:
>
> optim_par = lambda par: np.sum((np.dot(aa, np.array([par, par**2])) - a)**2)
>
> Now, if I just run optimize.fmin_l_bfgs_b with the default parameters:
>
>> start = [0.0]
>> bounds=[(-1.0,1.0)]
>> ll = op.fmin_l_bfgs_b(optim_par, start, approx_grad = True, bounds = bounds)
>
> I get a parameter of exactly 0.0, same as the starting value.
>
> If instead I either modify the parameters 'pgtol' and 'factr':
>
>> l = op.fmin_l_bfgs_b(optim_par, start, approx_grad = True, bounds = bounds, pgtol = 1e-50, factr = 10.0)
>
> or simply re-scale the arrays multiplying them by 10,000 and run it as initially, the parameter I get is in either case -0.0296399132248. The attached script runs the three options.
>
> A couple of questions about this:
>
> - Is this caused because of the way the optimizer handles relatively small numbers or am I missing something else?
> - I want this to run inside an app that will take different datasets even though the arrays 'a' and 'aa' will always be of that shape. Which of the two solutions would you recommend as more stable, robust and fast? Why?
>
> Any other comment/suggestion is of course most welcome. Thank you very much in advance,

Hi Daniel,

I can reproduce the behavior that you see with fmin_l_bfgs_b, but I don't know why it is behaving that way. Do you have a compelling reason for using fmin_l_bfgs_b? I get a much more accurate answer if I use fminbound:

In [12]: fminbound(optim_par, -1, 1)
Out[12]: -0.0089143209088738753

In [13]: fminbound(optim_par, -1, 1, xtol=1e-10)
Out[13]: -0.0089129582673392344

Warren
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From gilles.rochefort at gmail.com  Fri Dec 16 05:55:15 2011
From: gilles.rochefort at gmail.com (Gilles Rochefort)
Date: Fri, 16 Dec 2011 11:55:15 +0100
Subject: [SciPy-User] fmin_l_bfgs_b issues with small numbers
In-Reply-To:
References:
Message-ID:

Hi,

My very first question is: why are you using a multivariate optimizer such as L-BFGS for your problem? From what I understood, your problem is just optimizing a function with only one scalar unknown x.

L-BFGS is designed for large-size problems, where you cannot explicitly compute the Hessian. The L stands for limited memory ... where the Hessian is approximated by means of a suitable decomposition. Also, here, you are asking lbfgs to approximate your first and second derivative.
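For what it's worth, the first derivative of this particular objective is simple to write down, so the finite-difference approximation can be skipped entirely. A minimal sketch (`f` and `fprime` are hypothetical names, not from the thread; note that even with an exact gradient, the default pgtol is far larger than the gradient of this tiny-scale problem at the starting point, so the tolerances still have to be tightened or the data re-scaled):

import numpy as np
import scipy.optimize as op

aa = np.array([[ 0.00030763, -0.00011521],
               [ 0.00093007, -0.00015189]])
a = np.array([[ 2.54854751e-05], [ -3.93219333e-05]])

def f(par):
    p = par[0]
    r = np.dot(aa, np.array([[p], [p**2]])) - a   # residual, shape (2, 1)
    return np.sum(r**2)

def fprime(par):
    p = par[0]
    r = np.dot(aa, np.array([[p], [p**2]])) - a
    dr = aa[:, 0:1] + 2.0 * p * aa[:, 1:2]        # d r_i / d p, shape (2, 1)
    return np.array([2.0 * np.sum(r * dr)])       # chain rule: sum of 2 r_i dr_i/dp

x, fval, info = op.fmin_l_bfgs_b(f, [0.0], fprime=fprime, bounds=[(-1.0, 1.0)],
                                 pgtol=1e-12, factr=10.0)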
By the way, I notice that the function you give is convex between [-1;1], so you should use fminbound to find the solution:

In [1]: from scipy.optimize import *

In [2]: fminbound(optim_par, -1.0, 1.0, disp=3)

 Func-count     x            f(x)          Procedure
    1        -0.236068      1.12379e-07    initial
    2         0.236068      1.09953e-07    golden
    3         0.527864      4.56894e-07    golden
    4         0.00164419    4.45293e-09    parabolic
    5         0.0013422     4.44086e-09    parabolic
    6        -0.0099419     4.2406e-09     parabolic
    7        -0.0963144     1.94664e-08    golden
    8        -0.00892899    4.23856e-09    parabolic
    9        -0.00891099    4.23856e-09    parabolic
   10        -0.00891432    4.23856e-09    parabolic
   11        -0.00891765    4.23856e-09    parabolic

In [3]: optim_par( -0.00891296 )
Out[3]: 4.2385592303771674e-09

In [4]: optim_par( -0.0296399132248 )
Out[4]: 5.0744070563855195e-09

Obviously, you should also use another algorithm which would be really more efficient if you provide the first and second derivative of your function.

Regards,
Gilles.

2011/12/16 Daniel Arribas-Bel

> Hi all!
>
> [I hope this hasn't been asked before on the list, I haven't found anything similar at least.]
>
> I have a simple optimization problem that involves two very small arrays of dimension (2, 2) and (2, 1):
>
> aa = np.array([[ 0.00030763, -0.00011521], \
>                [ 0.00093007, -0.00015189]])
> a = np.array([[ 2.54854751e-05], [ -3.93219333e-05]])
>
> The function to optimize is:
>
> optim_par = lambda par: np.sum((np.dot(aa, np.array([par, par**2])) - a)**2)
>
> Now, if I just run optimize.fmin_l_bfgs_b with the default parameters:
>
>> start = [0.0]
>> bounds=[(-1.0,1.0)]
>> ll = op.fmin_l_bfgs_b(optim_par, start, approx_grad = True, bounds = bounds)
>
> I get a parameter of exactly 0.0, same as the starting value.
>
> If instead I either modify the parameters 'pgtol' and 'factr':
>
>> l = op.fmin_l_bfgs_b(optim_par, start, approx_grad = True, bounds = bounds, pgtol = 1e-50, factr = 10.0)
>
> or simply re-scale the arrays multiplying them by 10,000 and run it as initially, the parameter I get is in either case -0.0296399132248. The attached script runs the three options.
>
> A couple of questions about this:
>
> - Is this caused because of the way the optimizer handles relatively small numbers or am I missing something else?
> - I want this to run inside an app that will take different datasets even though the arrays 'a' and 'aa' will always be of that shape. Which of the two solutions would you recommend as more stable, robust and fast? Why?
>
> Any other comment/suggestion is of course most welcome. Thank you very much in advance,
>
> ]d[
>
> --
> ============================================================
> Daniel Arribas-Bel, PhD.
> Url: darribas.org
> Mail: darribas at asu.edu
>
> GeoDa Center for Geospatial Analysis and Computation (geodacenter.asu.edu)
> Arizona State University (USA)
> ============================================================
>
> _______________________________________________
> SciPy-User mailing list
> SciPy-User at scipy.org
> http://mail.scipy.org/mailman/listinfo/scipy-user

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From dave.j.thornton at gmail.com  Thu Dec 15 17:27:58 2011
From: dave.j.thornton at gmail.com (Dave)
Date: Thu, 15 Dec 2011 14:27:58 -0800 (PST)
Subject: [SciPy-User] How do I use ols.py?
Message-ID:

I found OLS.py in the Cookbook (http://www.scipy.org/Cookbook/OLS), and would like to use it. But I don't know how.

Numpy and scipy came with full installers. OLS.py is just a .py file. I'm totally at a loss.
I tried putting OLS.py in various Python directories and then doing an "import ols" from the commandline, but Python yelled at me. I'm not a programmer - can anyone help me out? Where do I put OLS.py such that I can import it? Or, what tools do I use to "install" it the way I'm used to installing a python module? I'm running a 64-bit Windows Vista machine, and Python 2.7. From robert.kern at gmail.com Fri Dec 16 09:24:14 2011 From: robert.kern at gmail.com (Robert Kern) Date: Fri, 16 Dec 2011 14:24:14 +0000 Subject: [SciPy-User] How do I use ols.py? In-Reply-To: References: Message-ID: On Thu, Dec 15, 2011 at 22:27, Dave wrote: > I found OLS.py in the Cookbook (http://www.scipy.org/Cookbook/OLS), > and would like to use it. ?But I don't know how. > > Numpy and scipy came with full installers. ?OLS.py is just a .py > file. ?I'm totally at a loss. > > I tried putting OLS.py in various Python directories and then doing an > "import ols" from the commandline, but Python yelled at me. > > I'm not a programmer - can anyone help me out? ?Where do I put OLS.py > such that I can import it? ?Or, what tools do I use to "install" it > the way I'm used to installing a python module? ?I'm running a 64-bit > Windows Vista machine, and Python 2.7. You want to put it into c:\Python27\Lib\site-packages\ (assuming that you installed Python into the default location of c:\Python27\). Then use "import OLS" with the same capitalization as the filename. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." ? -- Umberto Eco From robert.kern at gmail.com Fri Dec 16 09:25:25 2011 From: robert.kern at gmail.com (Robert Kern) Date: Fri, 16 Dec 2011 14:25:25 +0000 Subject: [SciPy-User] How do I use ols.py? In-Reply-To: References: Message-ID: On Fri, Dec 16, 2011 at 14:24, Robert Kern wrote: > On Thu, Dec 15, 2011 at 22:27, Dave wrote: >> I found OLS.py in the Cookbook (http://www.scipy.org/Cookbook/OLS), >> and would like to use it. ?But I don't know how. >> >> Numpy and scipy came with full installers. ?OLS.py is just a .py >> file. ?I'm totally at a loss. >> >> I tried putting OLS.py in various Python directories and then doing an >> "import ols" from the commandline, but Python yelled at me. >> >> I'm not a programmer - can anyone help me out? ?Where do I put OLS.py >> such that I can import it? ?Or, what tools do I use to "install" it >> the way I'm used to installing a python module? ?I'm running a 64-bit >> Windows Vista machine, and Python 2.7. > > You want to put it into c:\Python27\Lib\site-packages\ ?(assuming that > you installed Python into the default location of c:\Python27\). Then > use "import OLS" with the same capitalization as the filename. Or rather (having just looked at the page), download it as "ols.py" and use "import ols" as in the example. Your filesystem may not care about capitalization, but Python does. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." ? 
-- Umberto Eco

From jjstickel at vcn.com  Fri Dec 16 09:50:04 2011
From: jjstickel at vcn.com (Jonathan Stickel)
Date: Fri, 16 Dec 2011 07:50:04 -0700
Subject: [SciPy-User] an exercise in spline basis functions
In-Reply-To:
References: <4EE8BBF5.4060507@vcn.com>
Message-ID: <4EEB5A9C.9070504@vcn.com>

On 12/14/11 13:48, josef.pktd at gmail.com wrote:
> On Wed, Dec 14, 2011 at 10:08 AM, Jonathan Stickel wrote:
>> On 12/14/11 07:28, scipy-user-request at scipy.org wrote:
>>>
>>> one more prototype that I haven't seen yet:
>>>
>>> using scipy.interpolate and find smoothing parameter s that minimizes
>>> Bayesian Information Criterion, BIC
>>>
>>> https://picasaweb.google.com/106983885143680349926/Joepy#5685704025775485730
>>
>> FYI, I have a regularization based smoothing method in scikits that can
>> "automatically" determine the smoothing parameter by generalized cross
>> validation.  I am not a statistician, but I think this is analogous to the
>> spline smoothing example that you show.
>>
>> I'd like to see my code incorporated into a larger package (e.g.
>> scipy.interpolate or scikits.statsmodels), but I haven't received definitive
>> feedback about this when I have asked in the past.
>
> A scipy.smooth package would be a good addition (as we discussed
> before) but someone would have to push for it.
>
> Similar for statsmodels, smoothers would be a good addition, but it's
> lacking a "champion". Your smoothing package would make a good
> addition (especially if cvxopt can be replaced with fmin_slsqp for
> example).
>
> I'm not a smoother person, but I bump into it every once in a while,
> Chris added lowess to statsmodels, Ralf is working on functional
> boxplots that sometimes require pre-smoothing, and we have various
> non-parametric pieces, so I'm working my way *slowly* to add smoothers
> (and polynomial fitting).
>
> Josef
> https://picasaweb.google.com/106983885143680349926/Joepy#5686027329247551410

Thank you for your response.  I'll try to find a little time to play with statsmodels to get a feel for how scikits.datasmooth would fit in there.

Do you have code posted somewhere for all the examples you have shown in this thread?

By the way, cvxopt is only needed for solving the quadratic program (QP) that arises when smoothing is used with constraints.  Smoothing without constraints is still available if cvxopt is not installed.  Per your suggestion, I tried yesterday to re-implement using fmin_slsqp, but I couldn't get it to work.  It seems fmin_slsqp requires the same arguments to be passed to the constraint functions as the objective function, which is quite limiting.  Since smoothing with constraints can be reduced specifically to a QP, a QP solver would be most efficient anyway.  It seems strange that QP (and LP) solvers are not available in core scipy.
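For concreteness, the QP reduction looks roughly like the following. This is only a minimal sketch assuming cvxopt is installed; the second-difference roughness penalty and the monotonicity constraint are illustrative choices, not necessarily the exact formulation used in scikits.datasmooth:

import numpy as np
from cvxopt import matrix, solvers

n = 50
x = np.linspace(0.0, 1.0, n)
y = x**2 + 0.05 * np.random.randn(n)   # noisy, increasing data

D1 = np.diff(np.eye(n), axis=0)        # first-difference operator, (n-1) x n
D2 = np.diff(np.eye(n), 2, axis=0)     # second-difference operator, (n-2) x n
lam = 1.0                              # smoothing parameter

# minimize ||yhat - y||^2 + lam*||D2 yhat||^2  subject to  D1 yhat >= 0,
# which is the standard QP form:  min 0.5*z'Pz + q'z  s.t.  Gz <= h
P = matrix(2.0 * (np.eye(n) + lam * np.dot(D2.T, D2)))
q = matrix(-2.0 * y.reshape(n, 1))
G = matrix(-D1)
h = matrix(np.zeros((n - 1, 1)))

sol = solvers.qp(P, q, G, h)
yhat = np.array(sol['x']).ravel()      # the constrained smooth

A dedicated QP solver exploits this structure directly; a general SQP routine like fmin_slsqp can in principle be pointed at the same problem, but, as noted above, its calling convention makes that awkward.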
Regards, Jonathan From josef.pktd at gmail.com Fri Dec 16 10:30:27 2011 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Fri, 16 Dec 2011 10:30:27 -0500 Subject: [SciPy-User] an exercise in spline basis functions In-Reply-To: <4EEB5A9C.9070504@vcn.com> References: <4EE8BBF5.4060507@vcn.com> <4EEB5A9C.9070504@vcn.com> Message-ID: On Fri, Dec 16, 2011 at 9:50 AM, Jonathan Stickel wrote: > On 12/14/11 13:48 , josef.pktd at gmail.com wrote: >> >> On Wed, Dec 14, 2011 at 10:08 AM, Jonathan Stickel >> ?wrote: >>> >>> On 12/14/11 07:28 , scipy-user-request at scipy.org wrote: >>>> >>>> >>>> one more prototype that I haven't seen yet: >>>> >>>> using scipy.interpolate and find smoothing parameter s that minimizes >>>> Bayesian Information Criterium, BIC >>>> >>>> >>>> >>>> https://picasaweb.google.com/106983885143680349926/Joepy#5685704025775485730 >>> >>> >>> >>> FYI, I have a regularization based smoothing method in scikits that can >>> "automatically" determine the smoothing parameter by generalized cross >>> validation. ?I am not a statistician, but I think this is analogous to >>> the >>> spline smoothing example that you show. >>> >>> I'd like to see my code incorporated into a larger package (e.g. >>> scipy.interpolate or scikits.statsmodels), but I haven't received >>> definitive >>> feedback about this when I have asked in the past. >> >> >> A scipy.smooth package would be a good addition (as we discussed >> before) but someone would have to push for it. >> >> Similar for statsmodels, smoothers would be a good addition, but it's >> lacking a "champion". Your smoothing package would make a good >> addition (especially if cvxopt can be replaced with fmin_slsqp for >> example). >> >> I'm not a smoother person, but I bump into it every once in a while, >> Chris added lowess to statsmodels, Ralf is working on functional >> boxplots that sometimes require pre-smoothing, and we have various >> non-parametric pieces, so I'm working my way *slowly* to add smoothers >> (and polynomial fitting). >> >> Josef >> >> https://picasaweb.google.com/106983885143680349926/Joepy#5686027329247551410 >> > > Thank you for your response. ?I'll try to find a little time to play with > statsmodels to get a fell for how scikits.datasmooth would fit in there. > > Do you have code posted somewhere for all the examples you have shown in > this thread? I added the two experimental spline modules to a new branch of statsmodels https://github.com/josef-pkt/statsmodels/tree/smooth/scikits/statsmodels/sandbox/nonparametric try_spline_ic.py does the comparison with your regularsmooth if that module is in the same directory (I didn't add regularsmooth.py to the repository) The other related things in the directory are kernel regression, and smoothers.py which just contains a (global) polynomial fitter that I needed as drop-in replacement for any smoothers while working on other things. (smoothers would need wrappers with a, still undecided, standardized interface so they can be interchangeably used as part of other models.) Fitting by (orthogonal) polynomials is also for nonparametrics but is also in largely experimental state. lowess and some kernel density estimation is out of the sandbox in https://github.com/josef-pkt/statsmodels/tree/smooth/scikits/statsmodels/nonparametric Except for my two spline modules, everything else is also in statsmodels master. > > By the way, cvxopt is only needed for solving the quadratic program (QP) > that arises when smoothing is used with constraints. 
?Smoothing without > constraints is still available if cvxopt is not installed. ?Per your > suggestion, I tried yesterday to re-implement using fmin_slsqp, but I > couldn't get it to work. ?It seems fmin_slsqp requires the same arguments to > be passed to the constraint functions as the objective function, which is > quite limiting. ?Since smoothing with constraints can be reduced > specifically to a QP, a QP solver would be most efficient anyway. ?It seems > strange that QP (and LP) solvers are not available in core scipy. I don't have currently cvxopt installed, so I could only run your unconstrained version. Spline fitting with constraints would be a good addition, since scipy doesn't have anything like that either, as far as I know. I also think not having QP is a big gap, but I never looked at how well any of the other solvers could substitute for it. Thanks, Josef > > Regards, > Jonathan From niki.spahiev at gmail.com Fri Dec 16 04:23:34 2011 From: niki.spahiev at gmail.com (Niki Spahiev) Date: Fri, 16 Dec 2011 11:23:34 +0200 Subject: [SciPy-User] Ann: OpenOpt and FuncDesigner 0.37 In-Reply-To: <68842.1323967285.16181933434562084864@ffe15.ukr.net> References: <68842.1323967285.16181933434562084864@ffe15.ukr.net> Message-ID: <4EEB0E16.2060202@gmail.com> On 15.12.2011 18:41, Dmitrey wrote: > > Hi all, > I'm glad to inform you about new release 0.37 (2011-Dec-15) of our free > software: > > OpenOpt (numerical optimization): Hello Dmitrey, Can OpenOpt be used to approximate 2D splines with arcs and lines? Best Regards, Nikolay Spahiev From warren.weckesser at enthought.com Sat Dec 17 11:20:13 2011 From: warren.weckesser at enthought.com (Warren Weckesser) Date: Sat, 17 Dec 2011 10:20:13 -0600 Subject: [SciPy-User] zscore axis functionality is borked In-Reply-To: References: Message-ID: On Wed, Nov 30, 2011 at 3:25 PM, wrote: > On Wed, Nov 30, 2011 at 4:10 PM, Warren Weckesser > wrote: > > > > > > On Wed, Nov 30, 2011 at 3:05 PM, wrote: > >> > >> On Wed, Nov 30, 2011 at 4:02 PM, Warren Weckesser > >> wrote: > >> > > >> > > >> > On Wed, Nov 30, 2011 at 2:54 PM, wrote: > >> >> > >> >> On Wed, Nov 30, 2011 at 3:45 PM, wrote: > >> >> > On Wed, Nov 30, 2011 at 3:25 PM, Alacast > wrote: > >> >> >> axis=0 (the default) works fine. axis=1, etc, is clearly wrong. > Am I > >> >> >> misunderstanding how to use this, or is this a bug? 
> >> >> >> > >> >> >> In [16]: i = rand(4,4) > >> >> >> > >> >> >> In [17]: i > >> >> >> Out[17]: > >> >> >> array([[ 0.85367762, 0.25348857, 0.23572615, 0.50403358], > >> >> >> [ 0.70199066, 0.81872151, 0.47357357, 0.20425537], > >> >> >> [ 0.31042673, 0.25837984, 0.73550134, 0.57970176], > >> >> >> [ 0.42828877, 0.60988596, 0.04059321, 0.73944219]]) > >> >> >> > >> >> >> In [18]: zscore(i, axis=0) > >> >> >> Out[18]: > >> >> >> array([[ 1.30128758, -0.96195723, -0.52119142, -0.01453907], > >> >> >> [ 0.59653471, 1.38544585, 0.39284654, -1.55756529], > >> >> >> [-1.22271057, -0.94164388, 1.39942427, 0.37494213], > >> >> >> [-0.67511172, 0.51815526, -1.27107939, 1.19716222]]) > >> >> >> > >> >> >> In [19]: zscore(i[:,0]) > >> >> >> Out[19]: array([ 1.30128758, 0.59653471, -1.22271057, > -0.67511172]) > >> >> >> > >> >> >> In [20]: zscore(i[:,0])==zscore(i,axis=0)[:,0] > >> >> >> Out[20]: array([ True, True, True, True], dtype=bool) > >> >> >> > >> >> >> In [21]: zscore(i, axis=1) > >> >> >> Out[21]: > >> >> >> array([[-0.99378502, -1.59397407, -1.61173649, -1.34342906], > >> >> >> [-1.6379836 , -1.52125275, -1.86640069, -2.13571889], > >> >> >> [-2.09968257, -2.15172946, -1.67460796, -1.83040754], > >> >> >> [-1.29796925, -1.11637205, -1.68566481, -0.98681582]]) > >> >> >> #The above is obviously wrong, as everything has a negative z > score > >> >> >> > >> >> >> In [22]: zscore(i[0,:]) > >> >> >> Out[22]: array([ 1.56824016, -0.83321371, -0.90428403, > 0.16925757]) > >> >> >> > >> >> >> In [23]: zscore(i[0,:])==zscore(i,axis=1)[0,:] > >> >> >> Out[23]: array([False, False, False, False], dtype=bool) > >> >> >> #Using axis=1 produces different results from taking a row > directly. > >> >> >> > >> >> >> In [24]: zscore(i, axis=-1) > >> >> >> Out[24]: > >> >> >> array([[-0.99378502, -1.59397407, -1.61173649, -1.34342906], > >> >> >> [-1.6379836 , -1.52125275, -1.86640069, -2.13571889], > >> >> >> [-2.09968257, -2.15172946, -1.67460796, -1.83040754], > >> >> >> [-1.29796925, -1.11637205, -1.68566481, -0.98681582]]) > >> >> >> #Getting rows by using axis=-1 is no better (this is the same > result > >> >> >> as > >> >> >> axis=1 > >> >> > > >> >> > This looks like a serious bug to me. I don't know what happened > here > >> >> > (. > >> >> > > >> >> > The docstring example also has negative numbers only. > >> >> > > >> >> > ??? > >> >> > > >> >> > I'm looking into it > >> >> > > >> >> > Thanks for reporting > >> >> > >> >> a misplaced axis: if axis>0 > >> >> then it calculates x - mean/std instead of (x - mean) / std > >> >> > >> >> now, how did this go through the testing ? > >> > > >> > > >> > > >> > > >> > There is only one test for zscore, on a 1-d sample without the axis > >> > keyword. > >> > >> which just show that we shouldn't trust changesets that say > >> > >> "stats: rewrite of zscore functions, ticket:1083 regression tests > >> pass, still need tests for enhancements" > >> > >> http://projects.scipy.org/scipy/changeset/6169 > >> > >> my mistake (maybe January 2nd wasn't a good day.) > >> > >> Josef > >> > > > > > > Thanks for the link. Looks like zmap has the same bug. :( > > copy paste errors? 
> > I just don't know why I didn't do basic checks like this in the final > version > > >>> assert_equal(zscore(x.T, axis=0).T, zscore(x, axis=1)) > >>> a = zscore(x, axis=1) > >>> a.var(1) > array([ 1., 1., 1., 1.]) > >>> a.mean(1) > array([ 0.00000000e+00, -1.11022302e-16, 0.00000000e+00, > 1.94289029e-16]) > > Josef > > Ticket: http://projects.scipy.org/scipy/ticket/1575 Pull request: https://github.com/scipy/scipy/pull/116 Warren -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Sat Dec 17 12:08:22 2011 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Sat, 17 Dec 2011 12:08:22 -0500 Subject: [SciPy-User] zscore axis functionality is borked In-Reply-To: References: Message-ID: On Sat, Dec 17, 2011 at 11:20 AM, Warren Weckesser wrote: > > > On Wed, Nov 30, 2011 at 3:25 PM, wrote: >> >> On Wed, Nov 30, 2011 at 4:10 PM, Warren Weckesser >> wrote: >> > >> > >> > On Wed, Nov 30, 2011 at 3:05 PM, wrote: >> >> >> >> On Wed, Nov 30, 2011 at 4:02 PM, Warren Weckesser >> >> wrote: >> >> > >> >> > >> >> > On Wed, Nov 30, 2011 at 2:54 PM, wrote: >> >> >> >> >> >> On Wed, Nov 30, 2011 at 3:45 PM, ? wrote: >> >> >> > On Wed, Nov 30, 2011 at 3:25 PM, Alacast >> >> >> > wrote: >> >> >> >> axis=0 (the default) works fine. axis=1, etc, is clearly wrong. >> >> >> >> Am I >> >> >> >> misunderstanding how to use this, or is this a bug? >> >> >> >> >> >> >> >> In [16]: i = rand(4,4) >> >> >> >> >> >> >> >> In [17]: i >> >> >> >> Out[17]: >> >> >> >> array([[ 0.85367762, ?0.25348857, ?0.23572615, ?0.50403358], >> >> >> >> ? ? ? ?[ 0.70199066, ?0.81872151, ?0.47357357, ?0.20425537], >> >> >> >> ? ? ? ?[ 0.31042673, ?0.25837984, ?0.73550134, ?0.57970176], >> >> >> >> ? ? ? ?[ 0.42828877, ?0.60988596, ?0.04059321, ?0.73944219]]) >> >> >> >> >> >> >> >> In [18]: zscore(i, axis=0) >> >> >> >> Out[18]: >> >> >> >> array([[ 1.30128758, -0.96195723, -0.52119142, -0.01453907], >> >> >> >> ? ? ? ?[ 0.59653471, ?1.38544585, ?0.39284654, -1.55756529], >> >> >> >> ? ? ? ?[-1.22271057, -0.94164388, ?1.39942427, ?0.37494213], >> >> >> >> ? ? ? ?[-0.67511172, ?0.51815526, -1.27107939, ?1.19716222]]) >> >> >> >> >> >> >> >> In [19]: zscore(i[:,0]) >> >> >> >> Out[19]: array([ 1.30128758, ?0.59653471, -1.22271057, >> >> >> >> -0.67511172]) >> >> >> >> >> >> >> >> In [20]: zscore(i[:,0])==zscore(i,axis=0)[:,0] >> >> >> >> Out[20]: array([ True, ?True, ?True, ?True], dtype=bool) >> >> >> >> >> >> >> >> In [21]: zscore(i, axis=1) >> >> >> >> Out[21]: >> >> >> >> array([[-0.99378502, -1.59397407, -1.61173649, -1.34342906], >> >> >> >> ? ? ? ?[-1.6379836 , -1.52125275, -1.86640069, -2.13571889], >> >> >> >> ? ? ? ?[-2.09968257, -2.15172946, -1.67460796, -1.83040754], >> >> >> >> ? ? ? ?[-1.29796925, -1.11637205, -1.68566481, -0.98681582]]) >> >> >> >> #The above is obviously wrong, as everything has a negative z >> >> >> >> score >> >> >> >> >> >> >> >> In [22]: zscore(i[0,:]) >> >> >> >> Out[22]: array([ 1.56824016, -0.83321371, -0.90428403, >> >> >> >> ?0.16925757]) >> >> >> >> >> >> >> >> In [23]: zscore(i[0,:])==zscore(i,axis=1)[0,:] >> >> >> >> Out[23]: array([False, False, False, False], dtype=bool) >> >> >> >> #Using axis=1 produces different results from taking a row >> >> >> >> directly. >> >> >> >> >> >> >> >> In [24]: zscore(i, axis=-1) >> >> >> >> Out[24]: >> >> >> >> array([[-0.99378502, -1.59397407, -1.61173649, -1.34342906], >> >> >> >> ? ? ? ?[-1.6379836 , -1.52125275, -1.86640069, -2.13571889], >> >> >> >> ? ? ? 
?[-2.09968257, -2.15172946, -1.67460796, -1.83040754], >> >> >> >> ? ? ? ?[-1.29796925, -1.11637205, -1.68566481, -0.98681582]]) >> >> >> >> #Getting rows by using axis=-1 is no better (this is the same >> >> >> >> result >> >> >> >> as >> >> >> >> axis=1 >> >> >> > >> >> >> > This looks like a serious bug to me. I don't know what happened >> >> >> > here >> >> >> > (. >> >> >> > >> >> >> > The docstring example also has negative numbers only. >> >> >> > >> >> >> > ??? >> >> >> > >> >> >> > I'm looking into it >> >> >> > >> >> >> > Thanks for reporting >> >> >> >> >> >> a misplaced axis: if axis>0 >> >> >> then it calculates ? x - mean/std instead of (x - mean) / std >> >> >> >> >> >> now, how did this go through the testing ? >> >> > >> >> > >> >> > >> >> > >> >> > There is only one test for zscore, on a 1-d sample without the axis >> >> > keyword. >> >> >> >> which just show that we shouldn't trust changesets that say >> >> >> >> "stats: rewrite of zscore functions, ticket:1083 regression tests >> >> pass, still need tests for enhancements" >> >> >> >> http://projects.scipy.org/scipy/changeset/6169 >> >> >> >> my mistake ?(maybe January 2nd wasn't a good day.) >> >> >> >> Josef >> >> >> > >> > >> > Thanks for the link.? Looks like zmap has the same bug. :( >> >> copy paste errors? >> >> I just don't know why I didn't do basic checks like this in the final >> version >> >> >>> assert_equal(zscore(x.T, axis=0).T, zscore(x, axis=1)) >> >>> a = zscore(x, axis=1) >> >>> a.var(1) >> array([ 1., ?1., ?1., ?1.]) >> >>> a.mean(1) >> array([ ?0.00000000e+00, ?-1.11022302e-16, ? 0.00000000e+00, >> ? ? ? ? 1.94289029e-16]) >> >> Josef >> > > > Ticket: http://projects.scipy.org/scipy/ticket/1575 > Pull request: https://github.com/scipy/scipy/pull/116 Thanks Warren, good to see you and Ralf taking care of stats. Josef > > Warren > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From fperez.net at gmail.com Mon Dec 19 04:49:07 2011 From: fperez.net at gmail.com (Fernando Perez) Date: Mon, 19 Dec 2011 01:49:07 -0800 Subject: [SciPy-User] [ANN] IPython 0.12 is out! Message-ID: Hi all, on behalf of the IPython development team, I'm thrilled to announce, after an intense 4 1/2 months of work, the official release of IPython 0.12. This is a very important release for IPython, for several reasons. First and foremost, we have a major new feature, our interactive web-based notebook, that has been in our sights for a very long time. We tried to build one years ago (with WX) as a Google SoC project in 2005, had other prototypes later on, but things never quite worked. Finally the refactoring effort started two years ago, the communications architecture we built in 2010, and the advances of modern browsers, gave us all the necessary pieces. With this foundation in place, while part of the team worked on the 0.11 release, Brian Granger had already started quietly building the web notebook, which we demoed in early-alpha mode at the SciPy 2011 conference (http://www.archive.org/details/Wednesday-203-6-IpythonANewArchitectureForInteractiveAndParallel). By the EuroScipy conference in August we had merged Brian's amazing effort into our master branch, and after that multiple people (old and new) jumped in to make all kinds of improvements, leaving us today with something that is an excellent foundation. 
It's still the first release of the notebook, and as such we know it has a number of rough edges, but several of us have been using it as a daily research tool for the last few months. Do not hesitate to file issues for any problems you encounter with it, and we even have an 'open issue' for general discussion of ideas and features for the notebook at: https://github.com/ipython/ipython/issues/977. Furthermore, it is clear that our big refactoring work, combined with the amazing facilities at Github, are paying off. The 0.11 series was a major amount of work, with 511 issues closed over almost two years. But that pales in comparison to this cycle: in only 4 1/2 months we closed 515 issues, with 50% being Pull Requests. And very importantly, our list of contributors includes many new faces (see the credits section in our release notes for full details), which is the best thing that can happen to an open source project. We hope you will find the new features (the notebook isn't the only one! see below) compelling, and that many more will not only use IPython but will join the project; there's plenty to do and now there are tasks for many different skill sets (web, javascript, gui work, low-level networking, parallel machinery, console apps, etc). *Downloads* Download links and instructions are at: http://ipython.org/download.html And IPython is also on PyPI: http://pypi.python.org/pypi/ipython Those contain a built version of the HTML docs; if you want pure source downloads with no docs, those are available on github: Tarball: https://github.com/ipython/ipython/tarball/rel-0.12 Zipball: https://github.com/ipython/ipython/zipball/rel-0.12 * Features * Here is a quick listing of the major new features: - An interactive browser-based Notebook with rich media support - Two-process terminal console - Tabbed QtConsole - Full Python 3 compatibility - Standalone Kernel - PyPy support And many more... We closed over 500 tickets, merged over 200 pull requests, and more than 45 people contributed commits for the final release. Please see our release notes for the full details on everything about this release: http://ipython.org/ipython-doc/stable/whatsnew/version0.12.html * IPython tutorial at PyCon 2012 * Those of you attending (or planning on it) PyCon 2012 in Santa Clara, CA, may be interested in attending a hands-on tutorial we will be presenting on the many faces of IPython. See https://us.pycon.org/2012/schedule/presentation/121/ for full details. * Errata * This was caught by Matthias Bussionnier's (one of our great new contributors) sharp eyes while I was writing these release notes: In the example notebook called display_protocol, the first cell starts with: from IPython.lib.pylabtools import print_figure which should instead be: from IPython.core.pylabtools import print_figure This has already been fixed on master, but since the final 0.12 files have been uploaded to github and PyPI, we'll let them be. As usual, if you find any other problem, please file a ticket --or even better, a pull request fixing it-- on our github issues site (https://github.com/ipython/ipython/issues/). Many thanks to all who contributed! Fernando, on behalf of the IPython development team. http://ipython.org From sponsfreixes at gmail.com Mon Dec 19 04:59:16 2011 From: sponsfreixes at gmail.com (Sergi Pons Freixes) Date: Mon, 19 Dec 2011 10:59:16 +0100 Subject: [SciPy-User] Removing duplicate cols/rows Message-ID: Hi All, I'm using a 2D shape array to store pairs of longitudes+latitudes. 
At one point, I have to merge two of those 2D arrays, and then remove any duplicate entry. I've been searching for a function similar to numpy.unique, but I've had no luck. Any implementation I've been thinking of looks very "unoptimized". Is there any existing solution, so I do not reinvent the wheel?

To make it clear, I'm looking for:
>>> a = np.array([[1, 1], [2, 3], [1, 1], [5, 4], [2, 3]])
>>> unique_rows(a)
array([[1, 1], [2, 3], [5, 4]])

BTW, I wanted to use just a list of tuples for it, but the lists were so big that they consumed my 4Gb RAM + 4Gb swap (numpy arrays are more memory efficient).

Regards,
Sergi

From peter.prettenhofer at gmail.com  Mon Dec 19 07:00:20 2011
From: peter.prettenhofer at gmail.com (Peter Prettenhofer)
Date: Mon, 19 Dec 2011 13:00:20 +0100
Subject: [SciPy-User] Dot product of sparse vector and dense matrix
Message-ID:

Hi,

I'd like to compute the dot product between a sparse vector `x` and a dense (numpy) matrix `W` as fast as possible. My `x` is extremely sparse (dimensionality is about 2**20, nnz about 7) and my `W` is 2**20 by 20. After some benchmarks I found that for `x` being a csr matrix, `x * W` is much slower than it should be (about 52.7 ms), whereas if I represent `x` as a dense numpy array I get 4.5 ms. If I convert both `x` and `W` to csc format I'm in the same ballpark as the all-dense version (5.12 ms).

The dot product of a sparse vector and a dense matrix should be much faster than the dot product of a sparse vector and a sparse matrix - does scipy provide a low-level API for sparse-dense dot products?

thanks,
 Peter

--
Peter Prettenhofer

From peter.prettenhofer at gmail.com  Mon Dec 19 08:35:16 2011
From: peter.prettenhofer at gmail.com (Peter Prettenhofer)
Date: Mon, 19 Dec 2011 14:35:16 +0100
Subject: [SciPy-User] Dot product of sparse vector and dense matrix
In-Reply-To:
References:
Message-ID:

Here's another benchmark that shows how severe the problem is (`W.shape = (2**18, 20)`):

%timeit x * W
10 loops, best of 3: 65.6 ms per loop

%timeit (W[x.indices, :] * x.data[:, np.newaxis]).sum(axis=0)
10000 loops, best of 3: 20.5 us per loop   <---- That's a speedup by three orders of magnitude!

BTW: does anybody know if there's something like `sparse.sparsetools.csr_matvec` for sparse vec and dense mat?

best,
 Peter

2011/12/19 Peter Prettenhofer :
> Hi,
>
> I'd like to compute the dot product between a sparse vector `x` and a
> dense (numpy) matrix `W` as fast as possible. My `x` is extremely
> sparse (dimensionality is about 2**20, nnz about 7) and my `W` is
> 2**20 by 20. After some benchmarks I found that for `x` being a csr
> matrix, `x * W` is much slower than it should be (about 52.7 ms),
> whereas if I represent `x` as a dense numpy array I get 4.5 ms. If I
> convert both `x` and `W` to csc format I'm in the same ballpark as
> the all-dense version (5.12 ms).
>
> The dot product of a sparse vector and a dense matrix should be much
> faster than the dot product of a sparse vector and a sparse matrix -
> does scipy provide a low-level API for sparse-dense dot products?
>
> thanks,
>  Peter
>
> --
> Peter Prettenhofer

--
Peter Prettenhofer

From cplusplusdeveloper at gmail.com  Sun Dec 18 20:55:49 2011
From: cplusplusdeveloper at gmail.com (cplusplusdeveloper at gmail.com)
Date: Sun, 18 Dec 2011 17:55:49 -0800
Subject: [SciPy-User] scipy.test() FAILED: Please help
Message-ID:

scipy.test() failed miserably with 11 failures. Please help.
Information about my installation: os.name = posix sys.platform = linux2 sys.version = 2.6.5 (r265:79063, Jul 14 2010, 11:36:05) [GCC 4.4.4 20100630 (Red Hat 4.4.4-10)] uname -a = Linux myhost 2.6.32-71.el6.x86_64 #1 SMP Wed Sep 1 01:33:01 EDT 2010 x86_64 x86_64 x86_64 GNU/Linux numpy.version = 1.6.1 scipy.version = 0.10.0 ATLAS version = 3.9.51 LAPACK version = 3.2.1 CPU = Intel Xeon x5680 (Westmere EP) 3.33GHz Some failures look like: ====================================================================== FAIL: test_arpack.test_symmetric_modes(True, , 'f', 2, 'SA', None, 0.5, , None, 'cayley') ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/lib/python2.6/site-packages/nose/case.py", line 182, in runTest self.test(*self.arg) File "/net/nwfs001/vol/vol2/PLL/project/zhang/opt/lib64/python2.6/site-packages/scipy/sparse/linalg/eigen/arpack/tests/test_arpack.py", line 235, in eval_evec assert_allclose(LHS, RHS, rtol=rtol, atol=atol, err_msg=err) File "/net/nwfs001/vol/vol2/PLL/project/zhang/opt/lib64/python2.6/site-packages/numpy/testing/utils.py", line 1168, in assert_allclose verbose=verbose, header=header) File "/net/nwfs001/vol/vol2/PLL/project/zhang/opt/lib64/python2.6/site-packages/numpy/testing/utils.py", line 636, in assert_array_compare raise AssertionError(msg) AssertionError: Not equal to tolerance rtol=0.00178814, atol=0.000357628 error for eigsh:general, typ=f, which=SA, sigma=0.5, mattype=aslinearoperator, OPpart=None, mode=cayley (mismatch 100.0%) x: array([[ 0.4389545 , 0.01935703], [ 0.26099187, 0.11053154], [ 0.37254444, 0.13223575],... y: array([[ 0.44009471, 0.01935698], [ 0.25523719, 0.1105317 ], [ 0.36819276, 0.13223571],... ---------------------------------------------------------------------- Thanks, A C++ developer -------------- next part -------------- An HTML attachment was scrubbed... URL: From zhazhang at gmail.com Sun Dec 18 21:00:23 2011 From: zhazhang at gmail.com (Zhang Zhang) Date: Sun, 18 Dec 2011 18:00:23 -0800 Subject: [SciPy-User] scipy.test() FAILED: Please help Message-ID: scipy.test() failed miserably with 11 failures. Please help. 
Information about my installation: os.name = posix sys.platform = linux2 sys.version = 2.6.5 (r265:79063, Jul 14 2010, 11:36:05) [GCC 4.4.4 20100630 (Red Hat 4.4.4-10)] uname -a = Linux myhost 2.6.32-71.el6.x86_64 #1 SMP Wed Sep 1 01:33:01 EDT 2010 x86_64 x86_64 x86_64 GNU/Linux numpy.version = 1.6.1 scipy.version = 0.10.0 ATLAS version = 3.9.51 LAPACK version = 3.2.1 C compiler = gcc (4.4.4) Fortran compiler = gfortran (4.4.4) CPU = Intel Xeon x5680 (Westmere EP) 3.33GHz Some failures look like: ============================== ======================================== FAIL: test_arpack.test_symmetric_modes(True, , 'f', 2, 'SA', None, 0.5, , None, 'cayley') ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/lib/python2.6/site-packages/nose/case.py", line 182, in runTest self.test(*self.arg) File "/net/nwfs001/vol/vol2/PLL/project/zhang/opt/lib64/python2.6/site-packages/scipy/sparse/linalg/eigen/arpack/tests/test_arpack.py", line 235, in eval_evec assert_allclose(LHS, RHS, rtol=rtol, atol=atol, err_msg=err) File "/net/nwfs001/vol/vol2/PLL/project/zhang/opt/lib64/python2.6/site-packages/numpy/testing/utils.py", line 1168, in assert_allclose verbose=verbose, header=header) File "/net/nwfs001/vol/vol2/PLL/project/zhang/opt/lib64/python2.6/site-packages/numpy/testing/utils.py", line 636, in assert_array_compare raise AssertionError(msg) AssertionError: Not equal to tolerance rtol=0.00178814, atol=0.000357628 error for eigsh:general, typ=f, which=SA, sigma=0.5, mattype=aslinearoperator, OPpart=None, mode=cayley (mismatch 100.0%) x: array([[ 0.4389545 , 0.01935703], [ 0.26099187, 0.11053154], [ 0.37254444, 0.13223575],... y: array([[ 0.44009471, 0.01935698], [ 0.25523719, 0.1105317 ], [ 0.36819276, 0.13223571],... ---------------------------------------------------------------------- Thanks, A C++ developer -------------- next part -------------- An HTML attachment was scrubbed... URL: From e.antero.tammi at gmail.com Mon Dec 19 14:41:56 2011 From: e.antero.tammi at gmail.com (eat) Date: Mon, 19 Dec 2011 21:41:56 +0200 Subject: [SciPy-User] Removing duplicate cols/rows In-Reply-To: References: Message-ID: Hi, On Mon, Dec 19, 2011 at 11:59 AM, Sergi Pons Freixes wrote: > Hi All, > > I'm using a 2D shape array to store pairs of longitudes+latitudes. At > one point, I have to merge two of those 2D arrays, and then remove any > duplicate entry. I've been searching for a function similar to > numpy.unique, but I've had no luck. Any implementation I've been > thinking on looks very "unoptimizied". Is there anything existing > solution, so I do not reinvent the wheel? > > To make it clear, I'm looking for: > >>> a = np.array([[1, 1], [2, 3], [1, 1], [5, 4], [2, 3]]) > >>> unique_rows(a) > array([[1, 1], [2, 3],[5, 4]]) > A dot product with a random vector may do the trick. like: In []: a Out[]: array([[1, 1], [2, 3], [1, 1], [5, 4], [2, 3]]) In []: unique_index= np.unique(a.dot(np.random.rand(2)), return_index= True)[1] In []: a[unique_index] Out[]: array([[1, 1], [2, 3], [5, 4]]) (and for cols use just transpose of a) My 2 cents, eat > > BTW, I wanted to use just a list of tuples for it, but the lists were > so big that they consumed my 4Gb RAM + 4Gb swap (numpy arrays are more > memory efficient). 
> > Regards, > Sergi > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > -------------- next part -------------- An HTML attachment was scrubbed... URL: From wesmckinn at gmail.com Mon Dec 19 14:49:09 2011 From: wesmckinn at gmail.com (Wes McKinney) Date: Mon, 19 Dec 2011 14:49:09 -0500 Subject: [SciPy-User] Removing duplicate cols/rows In-Reply-To: References: Message-ID: On Mon, Dec 19, 2011 at 2:41 PM, eat wrote: > Hi, > > On Mon, Dec 19, 2011 at 11:59 AM, Sergi Pons Freixes > wrote: >> >> Hi All, >> >> I'm using a 2D shape array to store pairs of longitudes+latitudes. At >> one point, I have to merge two of those 2D arrays, and then remove any >> duplicate entry. I've been searching for a function similar to >> numpy.unique, but I've had no luck. Any implementation I've been >> thinking on looks very "unoptimizied". Is there anything existing >> solution, so I do not reinvent the wheel? >> >> To make it clear, I'm looking for: >> >>> a = np.array([[1, 1], [2, 3], [1, 1], [5, 4], [2, 3]]) >> >>> unique_rows(a) >> array([[1, 1], [2, 3],[5, 4]]) > > A dot product with a random vector may do the trick. like: > In []: a > Out[]: > array([[1, 1], > ? ? ? ?[2, 3], > ? ? ? ?[1, 1], > ? ? ? ?[5, 4], > ? ? ? ?[2, 3]]) > In []: unique_index= np.unique(a.dot(np.random.rand(2)), return_index= > True)[1] > In []: a[unique_index] > Out[]: > array([[1, 1], > ? ? ? ?[2, 3], > ? ? ? ?[5, 4]]) > > (and for cols use just transpose of a) > > > My 2 cents, > eat >> >> >> BTW, I wanted to use just a list of tuples for it, but the lists were >> so big that they consumed my 4Gb RAM + 4Gb swap (numpy arrays are more >> memory efficient). >> >> Regards, >> Sergi >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user > > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > I implemented an efficient function for this in pandas: In [1]: a = np.array([[1, 1], [2, 3], [1, 1], [5, 4], [2, 3]]) In [2]: df = DataFrame(a) In [3]: df Out[3]: 0 1 0 1 1 1 2 3 2 1 1 3 5 4 4 2 3 In [4]: df.drop_duplicates() Out[4]: 0 1 0 1 1 1 2 3 3 5 4 you can get just the ndarray back by df.drop_duplicates().values - Wes From warren.weckesser at enthought.com Mon Dec 19 14:58:02 2011 From: warren.weckesser at enthought.com (Warren Weckesser) Date: Mon, 19 Dec 2011 13:58:02 -0600 Subject: [SciPy-User] Removing duplicate cols/rows In-Reply-To: References: Message-ID: On Mon, Dec 19, 2011 at 1:49 PM, Wes McKinney wrote: > On Mon, Dec 19, 2011 at 2:41 PM, eat wrote: > > Hi, > > > > On Mon, Dec 19, 2011 at 11:59 AM, Sergi Pons Freixes > > wrote: > >> > >> Hi All, > >> > >> I'm using a 2D shape array to store pairs of longitudes+latitudes. At > >> one point, I have to merge two of those 2D arrays, and then remove any > >> duplicate entry. I've been searching for a function similar to > >> numpy.unique, but I've had no luck. Any implementation I've been > >> thinking on looks very "unoptimizied". Is there anything existing > >> solution, so I do not reinvent the wheel? > >> > >> To make it clear, I'm looking for: > >> >>> a = np.array([[1, 1], [2, 3], [1, 1], [5, 4], [2, 3]]) > >> >>> unique_rows(a) > >> array([[1, 1], [2, 3],[5, 4]]) > > > > A dot product with a random vector may do the trick. 
like: > > In []: a > > Out[]: > > array([[1, 1], > > [2, 3], > > [1, 1], > > [5, 4], > > [2, 3]]) > > In []: unique_index= np.unique(a.dot(np.random.rand(2)), return_index= > > True)[1] > > In []: a[unique_index] > > Out[]: > > array([[1, 1], > > [2, 3], > > [5, 4]]) > > > > (and for cols use just transpose of a) > > > > > > My 2 cents, > > eat > >> > >> > >> BTW, I wanted to use just a list of tuples for it, but the lists were > >> so big that they consumed my 4Gb RAM + 4Gb swap (numpy arrays are more > >> memory efficient). > >> > >> Regards, > >> Sergi > >> _______________________________________________ > >> SciPy-User mailing list > >> SciPy-User at scipy.org > >> http://mail.scipy.org/mailman/listinfo/scipy-user > > > > > > > > _______________________________________________ > > SciPy-User mailing list > > SciPy-User at scipy.org > > http://mail.scipy.org/mailman/listinfo/scipy-user > > > > I implemented an efficient function for this in pandas: > > In [1]: a = np.array([[1, 1], [2, 3], [1, 1], [5, 4], [2, 3]]) > > In [2]: df = DataFrame(a) > > In [3]: df > Out[3]: > 0 1 > 0 1 1 > 1 2 3 > 2 1 1 > 3 5 4 > 4 2 3 > > In [4]: df.drop_duplicates() > Out[4]: > 0 1 > 0 1 1 > 1 2 3 > 3 5 4 > > you can get just the ndarray back by df.drop_duplicates().values > > - Wes > Or... In [44]: x Out[44]: array([[3, 3], [3, 2], [2, 1], [3, 3], [1, 2], [3, 1], [1, 3], [1, 1], [2, 3], [3, 2], [1, 1], [3, 3], [1, 1], [3, 2], [3, 2]]) In [45]: u = unique(x.view(dtype=dtype([('a',x.dtype),('b',x.dtype)]))).view(x.dtype).reshape(-1,2) In [46]: u Out[46]: array([[1, 1], [1, 2], [1, 3], [2, 1], [2, 3], [3, 1], [3, 2], [3, 3]]) The 'one-liner' above converts x to a 1D structured array with two fields, then applies numpy.unique to the 1D array, and then converts that result back to a 2D array. Warren -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at googlemail.com Mon Dec 19 16:24:50 2011 From: ralf.gommers at googlemail.com (Ralf Gommers) Date: Mon, 19 Dec 2011 22:24:50 +0100 Subject: [SciPy-User] scipy.test() FAILED: Please help In-Reply-To: References: Message-ID: On Mon, Dec 19, 2011 at 3:00 AM, Zhang Zhang wrote: > scipy.test() failed miserably with 11 failures. Please help. Information > about my installation: > > os.name = posix > sys.platform = linux2 > sys.version = 2.6.5 (r265:79063, Jul 14 2010, 11:36:05) [GCC 4.4.4 > 20100630 (Red Hat 4.4.4-10)] > uname -a = Linux myhost 2.6.32-71.el6.x86_64 #1 SMP Wed Sep 1 01:33:01 > EDT 2010 x86_64 x86_64 x86_64 GNU/Linux > numpy.version = 1.6.1 > scipy.version = 0.10.0 > ATLAS version = 3.9.51 > LAPACK version = 3.2.1 > > C compiler = gcc (4.4.4) > Fortran compiler = gfortran (4.4.4) > > CPU = Intel Xeon x5680 (Westmere EP) 3.33GHz > > This has been reported a few times now, unfortunately it's not easy to reproduce and hard to debug. It shouldn't stop you from using the rest of scipy though. For completeness, could you send the full test log? 
Ralf

> Some failures look like:
>
> ======================================================================
> FAIL: test_arpack.test_symmetric_modes(True, , 'f', 2,
> 'SA', None, 0.5, , None, 'cayley')
> ----------------------------------------------------------------------
> Traceback (most recent call last):
>   File "/usr/lib/python2.6/site-packages/nose/case.py", line 182, in runTest
>     self.test(*self.arg)
>   File "/net/nwfs001/vol/vol2/PLL/project/zhang/opt/lib64/python2.6/site-packages/scipy/sparse/linalg/eigen/arpack/tests/test_arpack.py", line 235, in eval_evec
>     assert_allclose(LHS, RHS, rtol=rtol, atol=atol, err_msg=err)
>   File "/net/nwfs001/vol/vol2/PLL/project/zhang/opt/lib64/python2.6/site-packages/numpy/testing/utils.py", line 1168, in assert_allclose
>     verbose=verbose, header=header)
>   File "/net/nwfs001/vol/vol2/PLL/project/zhang/opt/lib64/python2.6/site-packages/numpy/testing/utils.py", line 636, in assert_array_compare
>     raise AssertionError(msg)
> AssertionError:
> Not equal to tolerance rtol=0.00178814, atol=0.000357628
> error for eigsh:general, typ=f, which=SA, sigma=0.5,
> mattype=aslinearoperator, OPpart=None, mode=cayley
> (mismatch 100.0%)
>  x: array([[ 0.4389545 ,  0.01935703],
>        [ 0.26099187,  0.11053154],
>        [ 0.37254444,  0.13223575],...
>  y: array([[ 0.44009471,  0.01935698],
>        [ 0.25523719,  0.1105317 ],
>        [ 0.36819276,  0.13223571],...
>
> ----------------------------------------------------------------------
>
> Thanks,
> A C++ developer

From wesmckinn at gmail.com Mon Dec 19 19:58:35 2011
From: wesmckinn at gmail.com (Wes McKinney)
Date: Mon, 19 Dec 2011 19:58:35 -0500
Subject: [SciPy-User] Removing duplicate cols/rows
In-Reply-To:
References:
Message-ID:

On Mon, Dec 19, 2011 at 2:58 PM, Warren Weckesser wrote:
> [...]
> In [45]: u =
> unique(x.view(dtype=dtype([('a',x.dtype),('b',x.dtype)]))).view(x.dtype).reshape(-1,2)
> [...]
> The 'one-liner' above converts x to a 1D structured array with two
> fields, then applies numpy.unique to the 1D array, and then converts
> that result back to a 2D array.
>
> Warren

That is cool. I found it interesting that np.unique is really slow on
record arrays (the DataFrame method, dict-based under the hood, is
about 5x faster). Is it doing tuple comparison?
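A rough sketch for reproducing that comparison locally (this is not a
benchmark from the thread; it assumes pandas is installed, the test data
is made up, and the "about 5x" figure above is Wes's own measurement,
not something this snippet guarantees):

import numpy as np
from timeit import timeit
from pandas import DataFrame

# arbitrary test data: 100000 rows with many duplicate pairs
a = np.random.randint(0, 100, size=(100000, 2))
rowtype = np.dtype([('a', a.dtype), ('b', a.dtype)])
df = DataFrame(a)

# structured-view + np.unique approach
t_unique = timeit(lambda: np.unique(a.view(rowtype)), number=10)
# dict-based drop_duplicates approach
t_pandas = timeit(lambda: df.drop_duplicates(), number=10)
print(t_unique, t_pandas)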
From josef.pktd at gmail.com Mon Dec 19 20:25:17 2011
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Mon, 19 Dec 2011 20:25:17 -0500
Subject: [SciPy-User] Removing duplicate cols/rows
In-Reply-To:
References:
Message-ID:

On Mon, Dec 19, 2011 at 7:58 PM, Wes McKinney wrote:
> On Mon, Dec 19, 2011 at 2:58 PM, Warren Weckesser wrote:
>> [...]
>> In [45]: u =
>> unique(x.view(dtype=dtype([('a',x.dtype),('b',x.dtype)]))).view(x.dtype).reshape(-1,2)

If I remember the discussion correctly, then you need to make sure
that x is C-contiguous when creating the view.

Josef

> That is cool. I found it interesting that np.unique is really slow on
> record arrays (the DataFrame method, dict-based under the hood, is
> about 5x faster). Is it doing tuple comparison?

does your dict-based method preserve order?

Josef
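A minimal illustration of the C-contiguity caveat above (a sketch, not
from the thread): a column slice is not C-contiguous, and numpy will
typically refuse a dtype-changing view on it, so copy first with
np.ascontiguousarray, which is a no-op when the input is already
contiguous:

import numpy as np

a = np.array([[1, 1, 9],
              [2, 3, 9],
              [1, 1, 9],
              [5, 4, 9]])
x = a[:, :2]                   # a slice: not C-contiguous
x = np.ascontiguousarray(x)    # copies only when necessary
pairs = np.dtype([('a', x.dtype), ('b', x.dtype)])
u = np.unique(x.view(pairs)).view(x.dtype).reshape(-1, 2)
# u is now [[1, 1], [2, 3], [5, 4]]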
From e.antero.tammi at gmail.com Mon Dec 19 20:32:42 2011
From: e.antero.tammi at gmail.com (eat)
Date: Tue, 20 Dec 2011 03:32:42 +0200
Subject: [SciPy-User] Removing duplicate cols/rows
In-Reply-To:
References:
Message-ID:

Hi,

On Tue, Dec 20, 2011 at 2:58 AM, Wes McKinney wrote:
> That is cool. I found it interesting that np.unique is really slow on
> record arrays (the DataFrame method, dict-based under the hood, is
> about 5x faster). Is it doing tuple comparison?

np.unique seems to be quite slow indeed. Also, the number of columns
seems to need to be hardcoded. A slightly off-topic issue is that it
doesn't even preserve the order of 'first occurrences' of the
duplicate rows. Does your dict-based implementation respect this
requirement?

Regards,
eat

From josef.pktd at gmail.com Mon Dec 19 20:39:10 2011
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Mon, 19 Dec 2011 20:39:10 -0500
Subject: [SciPy-User] Removing duplicate cols/rows
In-Reply-To:
References:
Message-ID:

On Mon, Dec 19, 2011 at 8:32 PM, eat wrote:
> np.unique seems to be quite slow indeed. Also, the number of columns
> seems to need to be hardcoded.

It doesn't need to be hardcoded: since an array is homogeneous, we can
just use [('', a.dtype)]*a.shape[1] or something like this. (that was
one of my first experiments with structured dtypes)

Josef
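A sketch of that generalization, with nothing hardcoded (it assumes a 2D
input; the anonymous '' fields are auto-named f0, f1, ... by numpy):

import numpy as np

def unique_rows(a):
    a = np.ascontiguousarray(a)
    # one field per column, built from the array's own dtype and width
    rowtype = np.dtype([('', a.dtype)] * a.shape[1])
    u = np.unique(a.view(rowtype))
    return u.view(a.dtype).reshape(-1, a.shape[1])

print(unique_rows(np.array([[1, 1], [2, 3], [1, 1], [5, 4], [2, 3]])))
# -> [[1 1] [2 3] [5 4]], sorted rather than in first-seen order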
From wesmckinn at gmail.com Mon Dec 19 20:39:12 2011
From: wesmckinn at gmail.com (Wes McKinney)
Date: Mon, 19 Dec 2011 20:39:12 -0500
Subject: [SciPy-User] Removing duplicate cols/rows
In-Reply-To:
References:
Message-ID:

On Mon, Dec 19, 2011 at 8:32 PM, eat wrote:
> A slightly off-topic issue is that it doesn't even preserve the order
> of 'first occurrences' of the duplicate rows. Does your dict-based
> implementation respect this requirement?

Yes-- it also has the option to use the last observation too.
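Not the library's actual code (pandas implements this in Cython), but a
pure-Python sketch of the dict-based idea: remember the first (or last)
index seen for each row key, then take those rows in their original
order:

import numpy as np

def drop_duplicate_rows(a, take_last=False):
    # rows become tuples so they can serve as dict keys
    seen = {}
    for i, row in enumerate(map(tuple, a)):
        if take_last or row not in seen:
            seen[row] = i
    keep = sorted(seen.values())   # preserve the original row order
    return a[keep]

a = np.array([[1, 1], [2, 3], [1, 1], [5, 4], [2, 3]])
print(drop_duplicate_rows(a))                  # first occurrences
print(drop_duplicate_rows(a, take_last=True))  # last occurrences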
From e.antero.tammi at gmail.com Mon Dec 19 20:55:35 2011
From: e.antero.tammi at gmail.com (eat)
Date: Tue, 20 Dec 2011 03:55:35 +0200
Subject: [SciPy-User] Removing duplicate cols/rows
In-Reply-To:
References:
Message-ID:

Hi,

On Tue, Dec 20, 2011 at 3:39 AM, Wes McKinney wrote:
> Yes-- it also has the option to use the last observation too.

Very cool indeed. Does it make any (significant) difference,
performance-wise, to choose either the first or the last occurrence?

Regards,
eat

From wesmckinn at gmail.com Mon Dec 19 21:17:58 2011
From: wesmckinn at gmail.com (Wes McKinney)
Date: Mon, 19 Dec 2011 21:17:58 -0500
Subject: [SciPy-User] Removing duplicate cols/rows
In-Reply-To:
References:
Message-ID:

On Mon, Dec 19, 2011 at 8:55 PM, eat wrote:
> Very cool indeed. Does it make any (significant) difference,
> performance-wise, to choose either the first or the last occurrence?

Nope, no significant difference. Here's the algorithm:

https://github.com/wesm/pandas/blob/master/pandas/src/groupby.pyx#L487
From charlesr.harris at gmail.com Mon Dec 19 21:37:43 2011
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Mon, 19 Dec 2011 19:37:43 -0700
Subject: [SciPy-User] Removing duplicate cols/rows
In-Reply-To:
References:
Message-ID:

On Mon, Dec 19, 2011 at 7:17 PM, Wes McKinney wrote:
> Nope, no significant difference. Here's the algorithm:
>
> https://github.com/wesm/pandas/blob/master/pandas/src/groupby.pyx#L487

I proposed the following on stackoverflow for a similar problem, and it
can probably be adapted. It might be a bit trickier if you want to
preserve the original row order.

In [1]: a = array([[0, 0, 1], [1, 1, 1], [1, 1, 1], [1, 0, 1]])

In [2]: b = a[lexsort(a.T)]

In [3]: b
Out[3]:
array([[0, 0, 1],
       [1, 0, 1],
       [1, 1, 1],
       [1, 1, 1]])
...
In [5]: (b[1:] - b[:-1]).any(-1)
Out[5]: array([ True,  True, False], dtype=bool)

Chuck
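A sketch (not from the thread) of how the lexsort idea can be finished
while keeping the original row order: sort, mark the first row of each
run of equal rows, then scatter the mask back through the sort order.
Because lexsort is stable, the row kept from each run is the first
occurrence.

import numpy as np

def unique_rows_ordered(a):
    order = np.lexsort(a.T)
    b = a[order]
    # True where a sorted row differs from its predecessor
    is_first = np.ones(len(a), dtype=bool)
    is_first[1:] = (b[1:] != b[:-1]).any(axis=1)
    keep = np.zeros(len(a), dtype=bool)
    keep[order] = is_first          # map back to original positions
    return a[keep]

a = np.array([[1, 1], [2, 3], [1, 1], [5, 4], [2, 3]])
print(unique_rows_ordered(a))   # [[1 1] [2 3] [5 4]]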
From josef.pktd at gmail.com Mon Dec 19 21:38:17 2011
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Mon, 19 Dec 2011 21:38:17 -0500
Subject: [SciPy-User] Removing duplicate cols/rows
In-Reply-To:
References:
Message-ID:

On Mon, Dec 19, 2011 at 9:17 PM, Wes McKinney wrote:
> Nope, no significant difference. Here's the algorithm:
>
> https://github.com/wesm/pandas/blob/master/pandas/src/groupby.pyx#L487

As far as I understand this requires hashability, so you still need to
convert rows to tuples first, and it wouldn't work with text as object
arrays.

Or do I misread this?

Josef
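A small sketch of the tuple conversion Josef mentions (the example data
is made up): converting each row to a plain Python tuple makes it
hashable even when the columns are objects holding strings, so the
dict/set approach still applies.

import numpy as np

a = np.array([['hi there', 5],
              ['foo', 3],
              ['hi there', 5]], dtype=object)
seen = set()
keep = []
for i, row in enumerate(map(tuple, a)):
    if row not in seen:
        seen.add(row)
        keep.append(i)
print(a[keep])   # keeps ['hi there', 5] and ['foo', 3]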
From wesmckinn at gmail.com Mon Dec 19 21:40:49 2011
From: wesmckinn at gmail.com (Wes McKinney)
Date: Mon, 19 Dec 2011 21:40:49 -0500
Subject: [SciPy-User] Removing duplicate cols/rows
In-Reply-To:
References:
Message-ID:

On Mon, Dec 19, 2011 at 9:38 PM, josef.pktd at gmail.com wrote:
> As far as I understand this requires hashability, so you still need to
> convert rows to tuples first, and it wouldn't work with text as object
> arrays.
>
> Or do I misread this?

What would you store in an object array that is not hashable?
>> >> Or do I misread this? >> >> Josef >> >> >>> _______________________________________________ >>> SciPy-User mailing list >>> SciPy-User at scipy.org >>> http://mail.scipy.org/mailman/listinfo/scipy-user >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user > > What would you store in an object array that is not hashable? numpy.void maybe I have no idea what this means, but I wouldn't have expected that this works >>> a = np.array([('hi there', 5)], [('mytext', object),('a_int', int)]) >>> a[0] ('hi there', 5) >>> mydict = {} >>> mydict[a[0]] = 1 >>> mydict {('hi there', 5): 1} >>> a[0][0] = 'hi too' >>> mydict {('hi too', 5): 1} >>> mydict[['hi there', 5]] = 2 Traceback (most recent call last): File "", line 1, in TypeError: unhashable type: 'list' >>> mydict[a] = 2 Traceback (most recent call last): File "", line 1, in TypeError: unhashable type: 'numpy.ndarray' >>> type(a[0]) >>> mydict[(a[0][0], 5)] = 2 >>> mydict {('hi too', 5): 1, ('hi too', 5): 2} so much for immutable unique keys Josef > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From sponsfreixes at gmail.com Tue Dec 20 04:09:05 2011 From: sponsfreixes at gmail.com (Sergi Pons Freixes) Date: Tue, 20 Dec 2011 10:09:05 +0100 Subject: [SciPy-User] Removing duplicate cols/rows In-Reply-To: References: Message-ID: Thank you very much for all the responses, I wasn't expecting that such an interesting discussion would arise from it (it surpasses my current knowledge about numpy and python internals). FYI, I also asked it in stackoverflow and some more solutions have appeared then. Now I only have to evaluate them and choose the best one! I'm going to link both discussions for community's sake :). The stackoverflow question is http://stackoverflow.com/questions/8560440/removing-duplicate-cols-or-rows-on-2d-array . From pierre.raybaut at gmail.com Tue Dec 20 05:17:50 2011 From: pierre.raybaut at gmail.com (Pierre Raybaut) Date: Tue, 20 Dec 2011 11:17:50 +0100 Subject: [SciPy-User] New Doodle poll: "Scientific Python packages: Popularity check" Message-ID: Hi all, Three years ago (day for day... that's weird!), I made a Doodle poll for estimating ~scientific Python packages popularity. Things have changed since then and I propose a new poll here: http://www.doodle.com/rzssq2dbnus4a34r This poll is intended to identify the most popular scientific Python packages to be included in the Python(x,y) distribution. However, even if the package list is *not* exhaustive, I'm sure that people who are not interested in Python(x,y) will find the poll results interesting anyway. Thank you for your participation, and please spread the word! -Pierre From 6b656d70 at gmail.com Mon Dec 19 10:02:54 2011 From: 6b656d70 at gmail.com (sk) Date: Mon, 19 Dec 2011 07:02:54 -0800 (PST) Subject: [SciPy-User] Using numpy.interp in PyInstaller distribution Message-ID: <55a8b7c3-4c60-49c8-b899-7fd4933b2c50@z25g2000vbs.googlegroups.com> Hi, I am trying to package a single executable version of a tool that will be used by some people who do not have python installed on windows machines. Therefore the tool is being converted into a single executable using PyInstaller. Realise this isn't a PyInstaller discussion group and hope that this question is to do with numpy dependencies rather than pyinstaller specifics. 
I am using one function from numpy: numpy.interp. I am importing it using:

from numpy import interp

If I include this single import in my script then I get a huge number
of dependencies and files included in the Windows executable (in the
order of 130MB!). If I exclude this line (and the functionality) then
the file comes in at more like 12MB. A significant difference.

My question is: is it possible to import a stripped-down version of
this function, or use the C function directly, without the many
additional dependencies?

For information, the dependencies include a huge number of numpy and
scipy files as well as PyQt (and a couple of other GUI tools I have
installed) when, for the GUI at least, these are not used in the
project at all.

Many Thanks
Stephen

From ralf.gommers at googlemail.com Tue Dec 20 16:40:24 2011
From: ralf.gommers at googlemail.com (Ralf Gommers)
Date: Tue, 20 Dec 2011 22:40:24 +0100
Subject: [SciPy-User] Using numpy.interp in PyInstaller distribution
In-Reply-To: <55a8b7c3-4c60-49c8-b899-7fd4933b2c50@z25g2000vbs.googlegroups.com>
References: <55a8b7c3-4c60-49c8-b899-7fd4933b2c50@z25g2000vbs.googlegroups.com>
Message-ID:
On Mon, Dec 19, 2011 at 4:02 PM, sk <6b656d70 at gmail.com> wrote:
> [full question quoted in the original; trimmed]
>
> My question is: is it possible to import a stripped-down version of
> this function, or use the C function directly, without the many
> additional dependencies?
>
Neither scipy nor PyQt is a dependency of numpy, so either you are using
them somewhere unrelated to numpy.interp, or this is a PyInstaller issue. I
don't know anything about PyInstaller, but I would expect that including
numpy in a binary wouldn't add more than 5 Mb.

Ralf

From dtustudy68 at hotmail.com Tue Dec 20 18:46:40 2011
From: dtustudy68 at hotmail.com (Jack Bryan)
Date: Tue, 20 Dec 2011 16:46:40 -0700
Subject: [SciPy-User] scipy 0.10.0 install error on linux redhat
Message-ID:
Hi, I am trying to install scipy 0.10.0 on GNU/Linux Red Hat 2.6.18 x86_64.
I have installed python 2.4 and numpy 1.6.1.
But, I got error when installing scipy : python setup.py build output: blas_opt_info:blas_mkl_info: libraries mkl,vml,guide not found in /usr/local/lib64 libraries mkl,vml,guide not found in /usr/local/lib libraries mkl,vml,guide not found in /usr/lib64 libraries mkl,vml,guide not found in /usr/lib NOT AVAILABLE atlas_blas_threads_info:Setting PTATLAS=ATLAS libraries ptf77blas,ptcblas,atlas not found in /usr/local/lib64 libraries ptf77blas,ptcblas,atlas not found in /usr/local/lib libraries ptf77blas,ptcblas,atlas not found in /usr/lib64/sse2 libraries ptf77blas,ptcblas,atlas not found in /usr/lib64 libraries ptf77blas,ptcblas,atlas not found in /usr/lib/sse2 libraries ptf77blas,ptcblas,atlas not found in /usr/lib NOT AVAILABLE atlas_blas_info: libraries f77blas,cblas,atlas not found in /usr/local/lib64 libraries f77blas,cblas,atlas not found in /usr/local/lib libraries f77blas,cblas,atlas not found in /usr/lib64/sse2 libraries f77blas,cblas,atlas not found in /usr/lib64 libraries f77blas,cblas,atlas not found in /usr/lib/sse2 libraries f77blas,cblas,atlas not found in /usr/lib NOT AVAILABLE UserWarning: Atlas (http://math-atlas.sourceforge.net/) libraries not found. Directories to search for the libraries can be specified in the numpy/distutils/site.cfg file (section [atlas]) or by setting the ATLAS environment variable. warnings.warn(AtlasNotFoundError.__doc__)blas_info: libraries blas not found in /remote/dcnl/Ding/backup_20100716/balsf77/BLAS libraries blas not found in /usr/local/lib64 libraries blas not found in /usr/local/lib libraries blas not found in /usr/lib64 libraries blas not found in /usr/lib NOT AVAILABLE UserWarning: Blas (http://www.netlib.org/blas/) libraries not found. Directories to search for the libraries can be specified in the numpy/distutils/site.cfg file (section [blas]) or by setting the BLAS environment variable. warnings.warn(BlasNotFoundError.__doc__)blas_src_info: NOT AVAILABLE /mypath/numpy161/lib64/python2.4/site-packages/numpy/distutils/system_info.py:1426: UserWarning: Blas (http://www.netlib.org/blas/) sources not found. Directories to search for the sources can be specified in the numpy/distutils/site.cfg file (section [blas_src]) or by setting the BLAS_SRC environment variable. warnings.warn(BlasSrcNotFoundError.__doc__)Traceback (most recent call last): File "setup.py", line 196, in ? 
setup_package() File "setup.py", line 187, in setup_package configuration=configuration ) File "/mypath/numpy161/lib64/python2.4/site-packages/numpy/distutils/core.py", line 152, in setup config = configuration() File "setup.py", line 138, in configuration config.add_subpackage('scipy') File "/mypath/numpy161/lib64/python2.4/site-packages/numpy/distutils/misc_util.py", line 1002, in add_subpackage caller_level = 2) File "/mypath/numpy161/lib64/python2.4/site-packages/numpy/distutils/misc_util.py", line 971, in get_subpackage caller_level = caller_level + 1) File "/mypath/numpy161/lib64/python2.4/site-packages/numpy/distutils/misc_util.py", line 908, in _get_configuration_from_setup_py config = setup_module.configuration(*args) File "./scipy/setup.py", line 8, in configuration config.add_subpackage('integrate') File "/ mypath/numpy161/lib64/python2.4/site-packages/numpy/distutils/misc_util.py", line 1002, in add_subpackage caller_level = 2) File "/mypath/numpy161/lib64/python2.4/site-packages/numpy/distutils/misc_util.py", line 971, in get_subpackage caller_level = caller_level + 1) File "/mypath/numpy161/lib64/python2.4/site-packages/numpy/distutils/misc_util.py", line 908, in _get_configuration_from_setup_py config = setup_module.configuration(*args) File "scipy/integrate/setup.py", line 10, in configuration blas_opt = get_info('blas_opt',notfound_action=2) File "/mypath/numpy161/lib64/python2.4/site-packages/numpy/distutils/system_info.py", line 311, in get_info return cl().get_info(notfound_action) File "/mypath/numpy161/lib64/python2.4/site-packages/numpy/distutils/system_info.py", line 462, in get_info raise self.notfounderror(self.notfounderror.__doc__)numpy.distutils.system_info.BlasNotFoundError: Blas (http://www.netlib.org/blas/) libraries not found. Directories to search for thhe libraries can be specified in the numpy/distutils/site.cfg file (section [blas]) or by setting the BLAS environment variable. then, i installed BLAS from http://netlib.org/blas/blas.tgz.and update BLAS environment variable BLASPATH. But, for installing scipy, i got error: Traceback (most recent call last): File "setup.py", line 196, in ? 
setup_package()
[... the rest of this traceback is identical to the first one above, ending with:]
numpy.distutils.system_info.BlasNotFoundError:
Blas (http://www.netlib.org/blas/) libraries not found.
Directories to search for the libraries can be specified in the
numpy/distutils/site.cfg file (section [blas]) or by setting
the BLAS environment variable.

Any help is really appreciated. thanks

From klonuo at gmail.com Tue Dec 20 19:29:14 2011
From: klonuo at gmail.com (klo uo)
Date: Wed, 21 Dec 2011 01:29:14 +0100
Subject: Re: [SciPy-User] [ANN] IPython 0.12 is out!
In-Reply-To: References: Message-ID:
Congratulations! I'm just a speechless regular Python user, excited mainly
by numpy/scipy/matplotlib, and you guys interface it so beautifully, as if
you have a genius's vision: not copying $$$ products but providing your own
unique environment, getting bolder with every release.
Thank you for your gifts. Best wishes and happy holidays

From zhazhang at gmail.com Tue Dec 20 16:13:03 2011
From: zhazhang at gmail.com (Zhang Zhang)
Date: Tue, 20 Dec 2011 13:13:03 -0800
Subject: [SciPy-User] scipy.test() FAILED: Please help
Message-ID:
Please see the complete test result attached. Thanks for your help.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: scipy_test_failed.log.tgz Type: application/x-gzip Size: 3402 bytes Desc: not available URL: From bojan.durickovic at gmail.com Wed Dec 21 09:20:59 2011 From: bojan.durickovic at gmail.com (=?utf-8?B?Ym9qYW4uZHVyaWNrb3ZpY0BnbWFpbC5jb20=?=) Date: Wed, 21 Dec 2011 09:20:59 -0500 Subject: [SciPy-User] =?utf-8?q?SciPy-User_Digest=2C_Vol_99=2C_Issue_50?= Message-ID: <1620247990.144107.1324477233973.JavaMail.webspher@njbbicssmp03> qYr Sent from my Verizon Wireless 4G LTE Smartphone ----- Reply message ----- From: scipy-user-request at scipy.org To: Subject: SciPy-User Digest, Vol 99, Issue 50 Date: Sun, Nov 27, 2011 13:00 Send SciPy-User mailing list submissions to scipy-user at scipy.org To subscribe or unsubscribe via the World Wide Web, visit http://mail.scipy.org/mailman/listinfo/scipy-user or, via email, send a message with subject or body 'help' to scipy-user-request at scipy.org You can reach the person managing the list at scipy-user-owner at scipy.org When replying, please edit your Subject line so it is more specific than "Re: Contents of SciPy-User digest..." Today's Topics: 1. Re: Power Spectral Density in SciPy, not pylab (Stefan Krastanov) 2. Confusion about lognormal distribution functions (tazz_ben) 3. Re: Problem with ODEINT (Lofgren, Eric) 4. Re: Confusion about lognormal distribution functions (Robert Kern) 5. Re: Confusion about lognormal distribution functions (josef.pktd at gmail.com) ---------------------------------------------------------------------- Message: 1 Date: Sat, 26 Nov 2011 06:32:47 -0800 (PST) From: Stefan Krastanov Subject: Re: [SciPy-User] Power Spectral Density in SciPy, not pylab To: scipy-user at googlegroups.com Cc: SciPy Users List Message-ID: <17684978.61.1322317967157.JavaMail.geo-discussion-forums at yqcw10> Content-Type: text/plain; charset="utf-8" A very old question but I had the same problem and google pointed me here. Use mlab. from matplotlib import mlab powers, freqs = mlab.psd(blah_blah) -------------- next part -------------- An HTML attachment was scrubbed... URL: http://mail.scipy.org/pipermail/scipy-user/attachments/20111126/bf9faf4c/attachment-0001.html ------------------------------ Message: 2 Date: Sat, 26 Nov 2011 16:52:46 +0000 From: tazz_ben Subject: [SciPy-User] Confusion about lognormal distribution functions To: "scipy-user at scipy.org" Message-ID: Content-Type: text/plain; charset="us-ascii" Hi Group - So, what I'm trying to do is draw a firm size from a lognormal distribution in a simulation (I'm using a fortuna RNG outside of the scope of this question -- why instead of twister deals with my research question, for this purposes it is just important to say using the built in random draw from a specific distribution wouldn't work). But when I do something like this: from scipy.stats import lognorm lognorm.ppf(.5,1,50,50) The numbers that come out make no sense (I'm right in believing "loc" = "mean" and "scale" = "standard deviation"?). I've tried logging the numbers, un-logging the numbers, etc. I'm very confused on what it is doing. ------------------------------ Message: 3 Date: Sun, 27 Nov 2011 01:41:08 +0000 From: "Lofgren, Eric" Subject: Re: [SciPy-User] Problem with ODEINT To: "" Message-ID: <81CB87CA-D183-4399-8675-79BFEFDCC175 at unc.edu> Content-Type: text/plain; charset="us-ascii" > Eric, > You have given odeint an initial condition of length 5, but the function > that defines your system is returning a vector of only length 3. Don't do > that. > ... 
> Warren Warren- This does indeed seem to solve the problem, I haven't hit any errors in 1,000 or so runs, and it does indeed make the code run considerably faster. Thank you for the advice and help. Eric ------------------------------ Message: 4 Date: Sun, 27 Nov 2011 17:39:04 +0000 From: Robert Kern Subject: Re: [SciPy-User] Confusion about lognormal distribution functions To: SciPy Users List Message-ID: Content-Type: text/plain; charset=UTF-8 On Sat, Nov 26, 2011 at 16:52, tazz_ben wrote: > Hi Group - > > So, what I'm trying to do is draw a firm size from a lognormal > distribution in a simulation (I'm using a fortuna RNG outside of the scope > of this question -- why instead of twister deals with my research > question, for this purposes it is just important to say using the built in > random draw from a specific distribution wouldn't work). > > But when I do something like this: > > from scipy.stats import lognorm > > lognorm.ppf(.5,1,50,50) > > The numbers that come out make no sense (I'm right in believing "loc" = > "mean" and "scale" = "standard deviation"?). I've tried logging the > numbers, un-logging the numbers, etc. ?I'm very confused on what it is > doing. No, loc and scale mean exactly the same thing for every distribution. loc translates the distribution linearly and scale scales it. lognorm.pdf(x, s, loc=loc, scale=scale) == lognorm.pdf((x-loc)/scale, s)/scale They don't always map to particular parameters in standard parameterizations. However, they often do, so doing this lets us share the code for shifting and scaling in the base class rather than implementing it slightly differently for every distribution. In this case, you want to ignore the loc parameter entirely. The scale parameter corresponds to exp(mu) where mu is the mean of the underlying normal distribution. The shape parameter is the standard deviation of the underlying normal distribution. log(lognorm.ppf(p, s, scale=scale)) == norm.ppf(p, loc=log(scale), scale=s) -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." ? -- Umberto Eco ------------------------------ Message: 5 Date: Sun, 27 Nov 2011 12:49:24 -0500 From: josef.pktd at gmail.com Subject: Re: [SciPy-User] Confusion about lognormal distribution functions To: SciPy Users List Message-ID: Content-Type: text/plain; charset=ISO-8859-1 On Sun, Nov 27, 2011 at 12:39 PM, Robert Kern wrote: > On Sat, Nov 26, 2011 at 16:52, tazz_ben wrote: >> Hi Group - >> >> So, what I'm trying to do is draw a firm size from a lognormal >> distribution in a simulation (I'm using a fortuna RNG outside of the scope >> of this question -- why instead of twister deals with my research >> question, for this purposes it is just important to say using the built in >> random draw from a specific distribution wouldn't work). >> >> But when I do something like this: >> >> from scipy.stats import lognorm >> >> lognorm.ppf(.5,1,50,50) >> >> The numbers that come out make no sense (I'm right in believing "loc" = >> "mean" and "scale" = "standard deviation"?). I've tried logging the >> numbers, un-logging the numbers, etc. ?I'm very confused on what it is >> doing. > > No, loc and scale mean exactly the same thing for every distribution. > loc translates the distribution linearly and scale scales it. 
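For a quick numerical check of the two identities quoted just below (the numbers are picked arbitrarily for illustration; each pair of prints should agree):

from scipy.stats import lognorm, norm
import numpy as np

# generic scipy.stats shifting/scaling: loc translates, scale scales
x, s, loc, scale = 3.0, 0.5, 1.0, 2.0
print lognorm.pdf(x, s, loc=loc, scale=scale)
print lognorm.pdf((x - loc) / scale, s) / scale

# lognormal whose underlying normal has mean mu and std sigma:
# shape s = sigma, scale = exp(mu), loc left at its default 0
mu, sigma = 1.5, 0.25
print np.log(lognorm.ppf(0.75, sigma, scale=np.exp(mu)))
print norm.ppf(0.75, loc=mu, scale=sigma)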
> > ?lognorm.pdf(x, s, loc=loc, scale=scale) == lognorm.pdf((x-loc)/scale, s)/scale > > They don't always map to particular parameters in standard > parameterizations. However, they often do, so doing this lets us share > the code for shifting and scaling in the base class rather than > implementing it slightly differently for every distribution. > > In this case, you want to ignore the loc parameter entirely. The scale > parameter corresponds to exp(mu) where mu is the mean of the > underlying normal distribution. The shape parameter is the standard > deviation of the underlying normal distribution. > > log(lognorm.ppf(p, s, scale=scale)) == norm.ppf(p, loc=log(scale), scale=s) just as background http://projects.scipy.org/scipy/ticket/1502 and several mailing list threads. It's a FAQ. It might be a case for writing a reparameterized wrapper class. Josef > > -- > Robert Kern > > "I have come to believe that the whole world is an enigma, a harmless > enigma that is made terrible by our own mad attempt to interpret it as > though it had an underlying truth." > ? -- Umberto Eco > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > ------------------------------ _______________________________________________ SciPy-User mailing list SciPy-User at scipy.org http://mail.scipy.org/mailman/listinfo/scipy-user End of SciPy-User Digest, Vol 99, Issue 50 ****************************************** -------------- next part -------------- An HTML attachment was scrubbed... URL: From alan.isaac at gmail.com Wed Dec 21 15:01:11 2011 From: alan.isaac at gmail.com (Alan G Isaac) Date: Wed, 21 Dec 2011 15:01:11 -0500 Subject: [SciPy-User] How do I use ols.py? In-Reply-To: References: Message-ID: <4EF23B07.8000304@gmail.com> On 12/15/2011 5:27 PM, Dave wrote: > I found OLS.py in the Cookbook (http://www.scipy.org/Cookbook/OLS), > and would like to use it. But I don't know how. Consider Statsmodels: http://statsmodels.sourceforge.net/ fwiw, Alan Isaac From klonuo at gmail.com Wed Dec 21 18:32:57 2011 From: klonuo at gmail.com (klo uo) Date: Thu, 22 Dec 2011 00:32:57 +0100 Subject: [SciPy-User] Possible to band process in one pass? Message-ID: I want to calculate RMS in specified bands on input signal Is there some way I can do this in one pass, instead applying separate bandpass filter then calculate RMS for each band? 
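One way to avoid running a separate bandpass filter per band is to transform the signal once and read the band powers off the spectrum, via Parseval's theorem. A rough sketch -- the band edges, sampling rate and test tones below are made up for illustration:

import numpy as np

def band_rms(x, fs, bands):
    # a single FFT of the whole signal; per-band RMS then follows
    # from Parseval's theorem
    n = len(x)
    power = np.abs(np.fft.rfft(x)) ** 2
    # double all bins except DC (and Nyquist for even n) to account
    # for the negative frequencies that rfft drops
    weights = 2.0 * np.ones(len(power))
    weights[0] = 1.0
    if n % 2 == 0:
        weights[-1] = 1.0
    freqs = np.arange(len(power)) * fs / n
    out = []
    for f_lo, f_hi in bands:
        m = (freqs >= f_lo) & (freqs < f_hi)
        out.append(np.sqrt(np.sum(weights[m] * power[m]) / n ** 2))
    return out

fs = 1024.0
t = np.arange(8192) / fs
x = np.sin(2 * np.pi * 50 * t) + 0.5 * np.sin(2 * np.pi * 200 * t)
print band_rms(x, fs, [(0.0, 100.0), (100.0, 300.0)])
# -> approximately [0.707, 0.354], i.e. amplitude / sqrt(2) per tone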
From klonuo at gmail.com Wed Dec 21 18:47:43 2011 From: klonuo at gmail.com (klo uo) Date: Thu, 22 Dec 2011 00:47:43 +0100 Subject: [SciPy-User] scipy.test('full') (errors=8, failures=3) Message-ID: >>> scipy.test('full') Running unit tests for scipy NumPy version 1.6.1 NumPy is installed in /usr/local/lib/python2.7/dist-packages/numpy SciPy version 0.10.0 SciPy is installed in /usr/local/lib/python2.7/dist-packages/scipy Python version 2.7.1+ (r271:86832, Apr 11 2011, 18:05:24) [GCC 4.5.2] nose version 1.1.2 {...} ====================================================================== ERROR: test_iv_cephes_vs_amos_mass_test (test_basic.TestBessel) ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/local/lib/python2.7/dist-packages/scipy/special/tests/test_basic.py", line 1642, in test_iv_cephes_vs_amos_mass_test c1 = special.iv(v, x) RuntimeWarning: divide by zero encountered in iv ====================================================================== ERROR: test_fdtri (test_basic.TestCephes) ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/local/lib/python2.7/dist-packages/scipy/special/tests/test_basic.py", line 153, in test_fdtri cephes.fdtri(1,1,0.5) RuntimeWarning: invalid value encountered in fdtri ====================================================================== ERROR: test_continuous_extra.test_cont_extra(, (0.4141193182605212,), 'loggamma loc, scale test') ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/local/lib/python2.7/dist-packages/nose/case.py", line 197, in runTest self.test(*self.arg) File "/usr/local/lib/python2.7/dist-packages/scipy/stats/tests/test_continuous_extra.py", line 78, in check_loc_scale m,v = distfn.stats(*arg) File "/usr/local/lib/python2.7/dist-packages/scipy/stats/distributions.py", line 1631, in stats mu = self._munp(1.0,*goodargs) File "/usr/local/lib/python2.7/dist-packages/scipy/stats/distributions.py", line 4119, in _munp return self._mom0_sc(n,*args) File "/usr/local/lib/python2.7/dist-packages/scipy/stats/distributions.py", line 1165, in _mom0_sc self.b, args=(m,)+args)[0] File "/usr/local/lib/python2.7/dist-packages/scipy/integrate/quadpack.py", line 247, in quad retval = _quad(func,a,b,args,full_output,epsabs,epsrel,limit,points) File "/usr/local/lib/python2.7/dist-packages/scipy/integrate/quadpack.py", line 314, in _quad return _quadpack._qagie(func,bound,infbounds,args,full_output,epsabs,epsrel,limit) File "/usr/local/lib/python2.7/dist-packages/scipy/stats/distributions.py", line 1162, in _mom_integ0 return x**m * self.pdf(x,*args) File "/usr/local/lib/python2.7/dist-packages/scipy/stats/distributions.py", line 1262, in pdf place(output,cond,self._pdf(*goodargs) / scale) File "/usr/local/lib/python2.7/dist-packages/scipy/stats/distributions.py", line 4112, in _pdf return exp(c*x-exp(x)-gamln(c)) RuntimeWarning: overflow encountered in exp ====================================================================== ERROR: test_continuous_extra.test_cont_extra(, (1.8771398388773268,), 'lomax loc, scale test') ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/local/lib/python2.7/dist-packages/nose/case.py", line 197, in runTest self.test(*self.arg) File "/usr/local/lib/python2.7/dist-packages/scipy/stats/tests/test_continuous_extra.py", line 78, in check_loc_scale m,v = 
distfn.stats(*arg) File "/usr/local/lib/python2.7/dist-packages/scipy/stats/distributions.py", line 1617, in stats mu, mu2, g1, g2 = self._stats(*args) File "/usr/local/lib/python2.7/dist-packages/scipy/stats/distributions.py", line 4643, in _stats mu, mu2, g1, g2 = pareto.stats(c, loc=-1.0, moments='mvsk') File "/usr/local/lib/python2.7/dist-packages/scipy/stats/distributions.py", line 1615, in stats mu, mu2, g1, g2 = self._stats(*args,**{'moments':moments}) File "/usr/local/lib/python2.7/dist-packages/scipy/stats/distributions.py", line 4594, in _stats vals = 2*(bt+1.0)*sqrt(b-2.0)/((b-3.0)*sqrt(b)) RuntimeWarning: invalid value encountered in sqrt ====================================================================== ERROR: test_discrete_basic.test_discrete_extra(, (30, 12, 6), 'hypergeom entropy nan test') ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/local/lib/python2.7/dist-packages/nose/case.py", line 197, in runTest self.test(*self.arg) File "/usr/local/lib/python2.7/dist-packages/scipy/stats/tests/test_discrete_basic.py", line 199, in check_entropy ent = distfn.entropy(*arg) File "/usr/local/lib/python2.7/dist-packages/scipy/stats/distributions.py", line 6314, in entropy place(output,cond0,self.vecentropy(*goodargs)) File "/usr/local/lib/python2.7/dist-packages/numpy/lib/function_base.py", line 1862, in __call__ theout = self.thefunc(*newargs) File "/usr/local/lib/python2.7/dist-packages/scipy/stats/distributions.py", line 6668, in _entropy lvals = where(vals==0.0,0.0,log(vals)) RuntimeWarning: divide by zero encountered in log ====================================================================== ERROR: test_discrete_basic.test_discrete_extra(, (21, 3, 12), 'hypergeom entropy nan test') ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/local/lib/python2.7/dist-packages/nose/case.py", line 197, in runTest self.test(*self.arg) File "/usr/local/lib/python2.7/dist-packages/scipy/stats/tests/test_discrete_basic.py", line 199, in check_entropy ent = distfn.entropy(*arg) File "/usr/local/lib/python2.7/dist-packages/scipy/stats/distributions.py", line 6314, in entropy place(output,cond0,self.vecentropy(*goodargs)) File "/usr/local/lib/python2.7/dist-packages/numpy/lib/function_base.py", line 1862, in __call__ theout = self.thefunc(*newargs) File "/usr/local/lib/python2.7/dist-packages/scipy/stats/distributions.py", line 6668, in _entropy lvals = where(vals==0.0,0.0,log(vals)) RuntimeWarning: divide by zero encountered in log ====================================================================== ERROR: test_fit (test_distributions.TestFitMethod) ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/local/lib/python2.7/dist-packages/scipy/stats/tests/test_distributions.py", line 439, in test_fit vals2 = distfunc.fit(res, optimizer='powell') File "/usr/local/lib/python2.7/dist-packages/scipy/stats/distributions.py", line 1874, in fit vals = optimizer(func,x0,args=(ravel(data),),disp=0) File "/usr/local/lib/python2.7/dist-packages/scipy/optimize/optimize.py", line 1621, in fmin_powell fval, x, direc1 = _linesearch_powell(func, x, direc1, tol=xtol*100) File "/usr/local/lib/python2.7/dist-packages/scipy/optimize/optimize.py", line 1491, in _linesearch_powell alpha_min, fret, iter, num = brent(myfunc, full_output=1, tol=tol) File "/usr/local/lib/python2.7/dist-packages/scipy/optimize/optimize.py", 
line 1312, in brent brent.optimize() File "/usr/local/lib/python2.7/dist-packages/scipy/optimize/optimize.py", line 1213, in optimize tmp2 = (x-v)*(fx-fw) RuntimeWarning: invalid value encountered in double_scalars ====================================================================== ERROR: test_fix_fit (test_distributions.TestFitMethod) ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/local/lib/python2.7/dist-packages/scipy/stats/tests/test_distributions.py", line 460, in test_fix_fit vals2 = distfunc.fit(res,fscale=1) File "/usr/local/lib/python2.7/dist-packages/scipy/stats/distributions.py", line 1874, in fit vals = optimizer(func,x0,args=(ravel(data),),disp=0) File "/usr/local/lib/python2.7/dist-packages/scipy/optimize/optimize.py", line 301, in fmin and max(abs(fsim[0]-fsim[1:])) <= ftol): RuntimeWarning: invalid value encountered in subtract ====================================================================== FAIL: test_mio.test_mat4_3d ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/local/lib/python2.7/dist-packages/nose/case.py", line 197, in runTest self.test(*self.arg) File "/usr/local/lib/python2.7/dist-packages/scipy/io/matlab/tests/test_mio.py", line 740, in test_mat4_3d stream, {'a': arr}, True, '4') File "/usr/local/lib/python2.7/dist-packages/numpy/testing/utils.py", line 1008, in assert_raises return nose.tools.assert_raises(*args,**kwargs) AssertionError: DeprecationWarning not raised ====================================================================== FAIL: test_datatypes.test_uint64_max ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/local/lib/python2.7/dist-packages/nose/case.py", line 197, in runTest self.test(*self.arg) File "/usr/local/lib/python2.7/dist-packages/scipy/ndimage/tests/test_datatypes.py", line 57, in test_uint64_max assert_true(x[1] > (2**63)) AssertionError: False is not true ====================================================================== FAIL: Regression test for #651: better handling of badly conditioned ---------------------------------------------------------------------- Traceback (most recent call last): File "/usr/local/lib/python2.7/dist-packages/scipy/signal/tests/test_filter_design.py", line 34, in test_bad_filter assert_raises(BadCoefficients, tf2zpk, [1e-15], [1.0, 1.0]) File "/usr/local/lib/python2.7/dist-packages/numpy/testing/utils.py", line 1008, in assert_raises return nose.tools.assert_raises(*args,**kwargs) AssertionError: BadCoefficients not raised ---------------------------------------------------------------------- Ran 5830 tests in 1198.157s FAILED (KNOWNFAIL=14, SKIP=36, errors=8, failures=3) Any cause for concern? -------------- next part -------------- An HTML attachment was scrubbed... 
URL:
From josef.pktd at gmail.com Wed Dec 21 18:58:51 2011
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Wed, 21 Dec 2011 18:58:51 -0500
Subject: Re: [SciPy-User] scipy.test('full') (errors=8, failures=3)
In-Reply-To: References: Message-ID:
On Wed, Dec 21, 2011 at 6:47 PM, klo uo wrote:
>>>> scipy.test('full')
>
> [the full test log, quoted verbatim in the original, is trimmed here]
> File "/usr/local/lib/python2.7/dist-packages/scipy/ndimage/tests/test_datatypes.py",
> line 57, in test_uint64_max
> ?
?assert_true(x[1] > (2**63)) > AssertionError: False is not true > > ====================================================================== > FAIL: Regression test for #651: better handling of badly conditioned > ---------------------------------------------------------------------- > Traceback (most recent call last): > ?File > "/usr/local/lib/python2.7/dist-packages/scipy/signal/tests/test_filter_design.py", > line 34, in test_bad_filter > ? ?assert_raises(BadCoefficients, tf2zpk, [1e-15], [1.0, 1.0]) > ?File "/usr/local/lib/python2.7/dist-packages/numpy/testing/utils.py", line > 1008, in assert_raises > ? ?return nose.tools.assert_raises(*args,**kwargs) > AssertionError: BadCoefficients not raised > > ---------------------------------------------------------------------- > Ran 5830 tests in 1198.157s > > FAILED (KNOWNFAIL=14, SKIP=36, errors=8, failures=3) > > > Any cause for concern? > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > Did you set "raise on warnings"? The distribution failures all look like RuntimeWarnings for floating point calculations, that might be just a byproduct of some of the calculations and doesn't say anything about the actual tested result. most of them look familiar: 0log0, or optimizers trying invalid numbers. the lomax Warning looks strange and might indicate some problems with the test (not with lomax, I think) Josef From klonuo at gmail.com Wed Dec 21 19:58:59 2011 From: klonuo at gmail.com (klo uo) Date: Thu, 22 Dec 2011 01:58:59 +0100 Subject: [SciPy-User] scipy.test('full') (errors=8, failures=3) In-Reply-To: References: Message-ID: > > Did you set "raise on warnings"? > The distribution failures all look like RuntimeWarnings for floating > point calculations, that might be just a byproduct of some of the > calculations and doesn't say anything about the actual tested result. > > most of them look familiar: 0log0, or optimizers trying invalid numbers. > the lomax Warning looks strange and might indicate some problems with > the test (not with lomax, I think) > Thanks for your reply I just installed latest version some hours ago (with ATLAS support) and run "scipy.test('full')" without setting anything (not that I know about the "raise on warnings" option) So, I guess it's nothing wrong with my install -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Wed Dec 21 20:39:55 2011 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 21 Dec 2011 20:39:55 -0500 Subject: [SciPy-User] scipy.test('full') (errors=8, failures=3) In-Reply-To: References: Message-ID: On Wed, Dec 21, 2011 at 7:58 PM, klo uo wrote: >> Did you set "raise on warnings"? >> The distribution failures all look like RuntimeWarnings for floating >> point calculations, that might be just a byproduct of some of the >> calculations and doesn't say anything about the actual tested result. >> >> most of them look familiar: 0log0, or optimizers trying invalid numbers. 
>> the lomax Warning looks strange and might indicate some problems with >> the test (not with lomax, I think) > > > Thanks for your reply > > I just installed latest version some hours ago (with ATLAS support) and run > "scipy.test('full')" without setting anything (not that I know about the > "raise on warnings" option) > > So, I guess it's nothing wrong with my install The only one that might be a real failure is scipy/ndimage/tests/test_datatypes.py the rest looks all like noise. I don't know what the default setting for the warnings are for different numpy or python versions. http://docs.scipy.org/doc/numpy/reference/generated/numpy.seterr.html and pythons warnings.simplefilter options nothing wrong with your install, but if the warnings raise an error, then you might not be able to do some calculations that have floating point problems. (Better than mine, I have an ancient crash/segfault in scipy.weave, I'm running scipy 0.9 though) Josef > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From charlesr.harris at gmail.com Wed Dec 21 22:52:59 2011 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 21 Dec 2011 20:52:59 -0700 Subject: [SciPy-User] scipy.test('full') (errors=8, failures=3) In-Reply-To: References: Message-ID: On Wed, Dec 21, 2011 at 4:47 PM, klo uo wrote: > Any cause for concern? > > What os/hardware are you running on? Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From klonuo at gmail.com Thu Dec 22 00:39:31 2011 From: klonuo at gmail.com (klo uo) Date: Thu, 22 Dec 2011 06:39:31 +0100 Subject: [SciPy-User] scipy.test('full') (errors=8, failures=3) In-Reply-To: References: Message-ID: Thanks for your comments > What os/hardware are you running on? > > Chuck > I'm on Ubuntu 11.04 32bit on single core P4 3GHz, 2GB RAM -------------- next part -------------- An HTML attachment was scrubbed... URL: From nico.schloemer at gmail.com Sat Dec 24 17:01:08 2011 From: nico.schloemer at gmail.com (=?ISO-8859-1?Q?Nico_Schl=F6mer?=) Date: Sat, 24 Dec 2011 23:01:08 +0100 Subject: [SciPy-User] Sparse and dense linalg code compatibility: *, dot() Message-ID: Hi, I'm trying to a write a numerical algorithm that includes matrix-vector products and dot-products, and I'd like to make it work for sparse and dense matrices at the same time. For sparse matrices, I found it most convenient to use scipy.sparse.linalg.LinearOperator numpy.array and employ the syntax A*x for matrix-vector products and numpy.vdot(x, y) for dot-products. Works like charm. Until I plugged at dense numpy.matrix into the routines. It seems that numpy.matrix() * numpy.array() yields a numpy.matrix() of size n\times 1 which is not compatible with vdot: r = b - A * x np.vdot( r, r ) ==> ValueError: vectors have different lengths The following little snippet describes the problem I'm having. ============ *snip* ============ num_unknowns = 5 A = np.random.rand(num_unknowns, num_unknowns) rhs = np.random.rand(num_unknowns, 1) x0 = np.zeros( (num_unknowns, 1) ) r = rhs - A * x0 #x1 = np.zeros( num_unknowns ) print repr( rhs ) print np.vdot( rhs, rhs ) print repr( x0 ) print np.vdot( x0, x0 ) print repr( r ) print np.vdot( r, r ) ============ *snap* ============ I'd work with arrays instead of matrices if I didn't have to use dot() for matrix-vector multiplication. I'd appreciate suggestions how to best handle this situation. 
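One workaround I can think of -- just a sketch, and part of what I am asking about -- is to funnel every matrix-vector product through a helper that always returns a plain 1D array:

import numpy as np

def matvec(A, x):
    # works for numpy.matrix, scipy.sparse matrices and LinearOperator:
    # feed a column vector in, flatten whatever comes out
    y = A * np.asarray(x).reshape(-1, 1)
    return np.asarray(y).ravel()

num_unknowns = 5
A = np.matrix(np.random.rand(num_unknowns, num_unknowns))
rhs = np.random.rand(num_unknowns)
x0 = np.zeros(num_unknowns)
r = rhs - matvec(A, x0)
print np.vdot(r, r)

np.vdot is then happy again, at the cost of a reshape per product.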
Cheers,
Nico

From jerome.kieffer at esrf.fr Mon Dec 26 12:35:18 2011
From: jerome.kieffer at esrf.fr (Jerome Kieffer)
Date: Mon, 26 Dec 2011 18:35:18 +0100
Subject: [SciPy-User] sparse array with ndim=3 or 4 ...
Message-ID: <20111226183518.25f21a93.jerome.kieffer@esrf.fr>
Dear Scipy users,

I wonder if there is a way to handle sparse nd-arrays? I am looking for
3D and 4D arrays to store coefficients for geometric transformations
2D -> 1D or 2D -> 2D, with typical shape (4000x4000)x(2000x500). My idea
is to construct the array from cython (that is the time-consuming part),
then re-use it many times (with something similar to a scalar product).

scipy.sparse seems limited to matrices (i.e. 2D arrays); is this true?
Flattening each array would be possible, but there is probably a better
solution...

Thanks
Season's greetings
--
Jerome Kieffer
Online Data Analysis / SoftGroup

From wesmckinn at gmail.com Tue Dec 27 15:55:36 2011
From: wesmckinn at gmail.com (Wes McKinney)
Date: Tue, 27 Dec 2011 15:55:36 -0500
Subject: [SciPy-User] PyData GitHub organization for pandas, other data-related projects
Message-ID:
I've been long overdue in creating a GitHub organization for pandas,
especially since a significant number of people have become engaged in
its development and use in recent times. I figured rather than create a
pandas-only organization, the community would benefit from a Python for
Data Analysis / Data Science organization to help organize our
development efforts:

http://github.com/pydata

I will be moving the main pandas repo there soon, away from my personal
GitHub account (wesm), and will adopt a somewhat more pull-request
oriented approach to my development work there going forward. I'll be
happy to give push privileges on pandas and other projects of mine to
anyone who's interested, as soon as you "earn your wings" (make some
pull requests and demonstrate general git competency).

I think the pydata org would be a good home for other projects in the
same problem domain -- I look forward to fruitful collaborations going
forward.

best,
Wes

From nico.schloemer at gmail.com Fri Dec 30 09:30:52 2011
From: nico.schloemer at gmail.com (=?ISO-8859-1?Q?Nico_Schl=F6mer?=)
Date: Fri, 30 Dec 2011 15:30:52 +0100
Subject: [SciPy-User] SciPy bug tracker?
Message-ID:
Hi,

where can I register SciPy bugs? The website
https://www.scipy.org/BugReport has dead links.

--Nico

From ralf.gommers at googlemail.com Fri Dec 30 09:48:59 2011
From: ralf.gommers at googlemail.com (Ralf Gommers)
Date: Fri, 30 Dec 2011 15:48:59 +0100
Subject: Re: [SciPy-User] SciPy bug tracker?
In-Reply-To: References: Message-ID:
On Fri, Dec 30, 2011 at 3:30 PM, Nico Schlömer wrote:
> Hi,
>
> where can I register SciPy bugs? The website
> https://www.scipy.org/BugReport has dead links.
>
http://projects.scipy.org/scipy is not a dead link, but unfortunately that
server or Trac instance is not very reliable. After registering there you
get a "new ticket" option.
From cpeters at edisonmission.com Sat Dec 24 22:02:18 2011
From: cpeters at edisonmission.com (Christopher Peters)
Date: Sat, 24 Dec 2011 22:02:18 -0500
Subject: [SciPy-User] AUTO: Christopher Peters is out of the office (returning 01/03/2012)

I am out of the office until 01/03/2012. Please email urgent requests
to Mike McDonald.

Note: This is an automated response to your message "[SciPy-User]
Sparse and dense linalg code compatibility: *, dot()" sent on
12/24/2011 5:01:08 PM. This is the only notification you will receive
while this person is away.

From jerome.kieffer at esrf.fr Mon Dec 26 12:35:18 2011
From: jerome.kieffer at esrf.fr (Jerome Kieffer)
Date: Mon, 26 Dec 2011 18:35:18 +0100
Subject: [SciPy-User] sparse array with ndim=3 or 4 ...
Message-ID: <20111226183518.25f21a93.jerome.kieffer@esrf.fr>

Dear Scipy users,

I wonder if there is a way to handle sparse nd-arrays. I am looking for
3D and 4D arrays to store coefficients for geometric transformations
2D -> 1D or 2D -> 2D, with typical shape (4000x4000)x(2000x500). My idea
is to construct the array from cython (which is the time-consuming
part), then re-use it many times (with something similar to a scalar
product).

scipy.sparse seems limited to matrices (i.e. 2D arrays); is this true?
Flattening each array would be possible, but there is probably a better
solution...

Thanks, and season's greetings,

--
Jerome Kieffer
Online Data Analysis / SoftGroup
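The flattening route may be less painful than it sounds: scipy.sparse
can stay 2-D if the two input dimensions are raveled into the column
index and the two output dimensions into the row index, e.g. with
numpy.ravel_multi_index. A toy sketch of the 2D -> 2D case, with
made-up sizes, indices and coefficient values (the real shapes would be
the (2000, 500) and (4000, 4000) ones above):

============ *snip* ============
import numpy as np
import scipy.sparse as sp

in_shape = (5, 4)    # input grid; think (4000, 4000)
out_shape = (3, 2)   # output grid; think (2000, 500)

# Three sample coefficients T[i, j, k, l], given as parallel arrays of
# output indices (i, j), input indices (k, l) and values; this is the
# triplet form a cython construction pass could emit directly.
out_idx = np.array([[0, 0], [1, 1], [2, 0]])
in_idx = np.array([[0, 0], [2, 3], [4, 1]])
coef = np.array([0.5, 1.0, 0.25])

rows = np.ravel_multi_index(out_idx.T, out_shape)
cols = np.ravel_multi_index(in_idx.T, in_shape)
T = sp.coo_matrix((coef, (rows, cols)),
                  shape=(np.prod(out_shape), np.prod(in_shape))).tocsr()

# Re-use many times: each application is one sparse matvec.
image = np.random.rand(*in_shape)
result = (T * image.ravel()).reshape(out_shape)
============ *snap* ============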
From wesmckinn at gmail.com Tue Dec 27 15:55:36 2011
From: wesmckinn at gmail.com (Wes McKinney)
Date: Tue, 27 Dec 2011 15:55:36 -0500
Subject: [SciPy-User] PyData GitHub organization for pandas, other data-related projects

I've been long overdue in creating a GitHub organization for pandas,
especially since a significant number of people have become engaged in
its development and use in recent times. I figured that rather than
create a pandas-only organization, the community would benefit from a
Python for Data Analysis / Data Science organization to help organize
our development efforts:

http://github.com/pydata

I will be moving the main pandas repo there soon, away from my personal
GitHub account (wesm), and will adopt a somewhat more pull-request
oriented approach to my development work there going forward. I'll be
happy to give push privileges on pandas and other projects of mine to
anyone who's interested, as soon as you "earn your wings" (make some
pull requests and demonstrate general git competency).

I think the pydata org would be a good home for other projects in the
same problem domain. I look forward to fruitful collaborations going
forward.

best,
Wes

From nico.schloemer at gmail.com Fri Dec 30 09:30:52 2011
From: nico.schloemer at gmail.com (Nico Schlömer)
Date: Fri, 30 Dec 2011 15:30:52 +0100
Subject: [SciPy-User] SciPy bug tracker?

Hi,

where can I register SciPy bugs? The website
https://www.scipy.org/BugReport has dead links.

--Nico

From ralf.gommers at googlemail.com Fri Dec 30 09:48:59 2011
From: ralf.gommers at googlemail.com (Ralf Gommers)
Date: Fri, 30 Dec 2011 15:48:59 +0100
Subject: [SciPy-User] SciPy bug tracker?

On Fri, Dec 30, 2011 at 3:30 PM, Nico Schlömer wrote:
> where can I register SciPy bugs? The website
> https://www.scipy.org/BugReport has dead links.

http://projects.scipy.org/scipy is not a dead link, but unfortunately
that server or Trac instance is not very reliable. After registering
there you get a "new ticket" option.

Ralf

From nico.schloemer at gmail.com Fri Dec 30 10:12:49 2011
From: nico.schloemer at gmail.com (Nico Schlömer)
Date: Fri, 30 Dec 2011 16:12:49 +0100
Subject: [SciPy-User] SciPy bug tracker?

All I get is

============= *snip* =============
Not Found

The requested URL /scipy was not found on this server.

Apache/2.2.3 (CentOS) Server at conference.scipy.org Port 80
============= *snap* =============

Well, I might try later.

--Nico

From ognen at enthought.com Fri Dec 30 19:49:32 2011
From: ognen at enthought.com (Ognen Duzlevski)
Date: Fri, 30 Dec 2011 18:49:32 -0600
Subject: [SciPy-User] SciPy bug tracker?

On Fri, Dec 30, 2011 at 9:12 AM, Nico Schlömer wrote:
> All I get is
>
> Not Found
> The requested URL /scipy was not found on this server.

I just visited the link that Ralf posted and it works. What do you
actually mean when you say "All I get is..."? If you want help, please
be more specific - what did you click on that prompted the "Not found"
output? :-)

Thanks,
Ognen

From nico.schloemer at gmail.com Sat Dec 31 07:17:17 2011
From: nico.schloemer at gmail.com (Nico Schlömer)
Date: Sat, 31 Dec 2011 13:17:17 +0100
Subject: [SciPy-User] SciPy bug tracker?

Well, I paste this

    http://projects.scipy.org/scipy

into my browser (Chromium on Linux), then I get redirected to

    https://projects.scipy.org/scipy

and I get the warning message

======== *snip* ========
The site's security certificate has expired!
You attempted to reach projects.scipy.org, but the server presented an
expired certificate. No information is available to indicate whether
that certificate has been compromised since its expiration. This means
Chromium cannot guarantee that you are communicating with
projects.scipy.org and not an attacker. You should not proceed.
======== *snap* ========

I then click on "Proceed anyway", and the next page I see is

======== *snip* ========
Not Found

The requested URL /scipy was not found on this server.

Apache/2.2.3 (CentOS) Server at conference.scipy.org Port 80
======== *snap* ========

I've got no idea why you guys don't get the error message. Maybe the
automatic forward to https isn't what I want? If the https option
doesn't work, it should be removed from the server anyway.

--Nico

From josef.pktd at gmail.com Sat Dec 31 08:30:15 2011
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Sat, 31 Dec 2011 08:30:15 -0500
Subject: [SciPy-User] SciPy bug tracker?

On Sat, Dec 31, 2011 at 7:17 AM, Nico Schlömer wrote:
> I've got no idea why you guys don't get the error message. Maybe the
> automatic forward to https isn't what I want? If the https option
> doesn't work, it should be removed from the server anyway.

I get the same, security warning and Not Found.

However, the direct links to the Trac pages seem to work; you could try

http://projects.scipy.org/scipy/register
http://projects.scipy.org/scipy/report

Josef
From josef.pktd at gmail.com Sat Dec 31 08:35:12 2011
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Sat, 31 Dec 2011 08:35:12 -0500
Subject: [SciPy-User] SciPy bug tracker?

On Sat, Dec 31, 2011 at 8:30 AM, josef.pktd at gmail.com wrote:
> I get the same, security warning and Not Found.

correction:

I get it with https: https://projects.scipy.org/scipy
but not with http: http://projects.scipy.org/scipy
and I don't get redirected.

Josef

From gael.varoquaux at normalesup.org Sat Dec 31 10:26:01 2011
From: gael.varoquaux at normalesup.org (Gael Varoquaux)
Date: Sat, 31 Dec 2011 16:26:01 +0100
Subject: [SciPy-User] SciPy bug tracker?
Message-ID: <20111231152601.GA23638@phare.normalesup.org>

On Sat, Dec 31, 2011 at 08:35:12AM -0500, josef.pktd at gmail.com wrote:
> correction:
> I get it with https: https://projects.scipy.org/scipy
> but not with http: http://projects.scipy.org/scipy
> and I don't get redirected

Same thing here.

Gael
From nico.schloemer at gmail.com Sat Dec 31 10:37:15 2011
From: nico.schloemer at gmail.com (Nico Schlömer)
Date: Sat, 31 Dec 2011 16:37:15 +0100
Subject: [SciPy-User] SciPy bug tracker?
In-Reply-To: <20111231152601.GA23638@phare.normalesup.org>

I've got the KB SSL Enforcer extension installed,
https://chrome.google.com/webstore/detail/flcpelgcagfhfoegekianiofphddckof
which makes the browser use https over http whenever possible. Well,
disabling it lets me access the site, but I think the https site should
be removed if it's not working.

Cheers,
Nico

From e.antero.tammi at gmail.com Sat Dec 31 12:56:28 2011
From: e.antero.tammi at gmail.com (eat)
Date: Sat, 31 Dec 2011 19:56:28 +0200
Subject: [SciPy-User] SciPy bug tracker?

Hi,

On Sat, Dec 31, 2011 at 3:35 PM, josef.pktd at gmail.com wrote:
> correction:
> I get it with https: https://projects.scipy.org/scipy
> but not with http: http://projects.scipy.org/scipy
> and I don't get redirected

With my latest (Windows) Chrome I can access
https://projects.scipy.org/scipy, but there seem to be problems
(expired certificate) after once accessing
http://projects.scipy.org/scipy.

Regards,
eat
From ognen at enthought.com Sat Dec 31 17:06:02 2011
From: ognen at enthought.com (Ognen Duzlevski)
Date: Sat, 31 Dec 2011 16:06:02 -0600
Subject: [SciPy-User] SciPy bug tracker?

On Sat, Dec 31, 2011 at 11:56 AM, eat wrote:
> With my latest (Windows) Chrome I can access
> https://projects.scipy.org/scipy, but there seem to be problems
> (expired certificate) after once accessing
> http://projects.scipy.org/scipy.

I will look into this.

Ognen