From blake.a.griffith at gmail.com Wed May 1 14:13:39 2013 From: blake.a.griffith at gmail.com (Blake Griffith) Date: Wed, 1 May 2013 13:13:39 -0500 Subject: [SciPy-Dev] GSoC Proposal draft -- Improvements to the sparse package of Scipy: support for bool dtype and better interaction with Numpy In-Reply-To: References: Message-ID: I've posted the proposal to melange. https://google-melange.appspot.com/gsoc/proposal/review/google/gsoc2013/cowlicks/1 On Tue, Apr 30, 2013 at 2:02 PM, Pauli Virtanen wrote: > Hi, > > 27.04.2013 10:17, Blake Griffith kirjoitti: > [clip] > > https://github.com/cowlicks/GSoC-proposal/blob/master/proposal.markdown > > Comments: > > - The second point in the abstract can probably be rephrased to be > easier to understand. > > "allow use of XXX" -> "make XXX work with sparse matrices" > > - "adding new functions/methods" --- these would for a large part > not be new user-visible methods, right? __add__, __minus__, multiply, > etc. are already there, although their functionality is at the moment > more restricted than it should be, which is to be fixed. > > - The schedule for the second project in the text says 2+5+3+1 weeks, > but the headline 7. I think you want 7 here in total. > > Is the part "Modify and add binary operation ..." leftover > from previous edits? > > - The schedule looks good to me, and the details written in the plan > make sense. > > - As Daniel notes, the second part may also involve some refactoring of > the scipy.sparse code base, so that the same logic is not repeated in > several places in the code. > > Looking for instance at CSR multiply() method: Doing this special > casing for each binop can be a bit inefficient --- there may be a > better way to write it in a generic form (and maybe on the C++ level). > > - You may want to ping also the Numpy discussion list on this, as > the proposal partly concerns Numpy. > > > -- > Pauli Virtanen > > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ndbecker2 at gmail.com Thu May 2 10:32:11 2013 From: ndbecker2 at gmail.com (Neal Becker) Date: Thu, 02 May 2013 10:32:11 -0400 Subject: [SciPy-Dev] playing with intel ipp (signal processing) library Message-ID: I spent a little time wrapping some of intel ipp signal processing library and benchmarking against my hand-written c++ code. So far I've tried some of the FIR filtering (single and multi-rate), and correlation. I was surprised to find that the ipp components were 3-10 times the speed of my c++ code. Perhaps others might want to look at using these components. Of course, they are not free - but this is in the same class as building numpy/scipy using MKL. From robert.kern at gmail.com Thu May 2 10:49:38 2013 From: robert.kern at gmail.com (Robert Kern) Date: Thu, 2 May 2013 15:49:38 +0100 Subject: [SciPy-Dev] playing with intel ipp (signal processing) library In-Reply-To: References: Message-ID: On Thu, May 2, 2013 at 3:32 PM, Neal Becker wrote: > I spent a little time wrapping some of intel ipp signal processing library and > benchmarking against my hand-written c++ code. So far I've tried some of the > FIR filtering (single and multi-rate), and correlation. > > I was surprised to find that the ipp components were 3-10 times the speed of my > c++ code. > > Perhaps others might want to look at using these components. Of course, they > are not free - but this is in the same class as building numpy/scipy using MKL. It's not quite the same class as building numpy/scipy with MKL. The parts of the MKL that numpy/scipy use are the well-standardized, vendor-neutral BLAS/LAPACK interfaces. We can make use of the MKL mostly by linking unmodified numpy/scipy sources with the MKL. But if you are just talking about whether individuals might make the decision to use it in their own application code, yeah, the decision is about the same. FWIW, AMD has an Apache-licensed competitor to the IPP that seems to closely follow the IPP API with mostly just name changes: http://framewave.sourceforge.net/ I had a devil of a time getting it built, though (i.e. I didn't succeed), so I cannot comment much beyond that. -- Robert Kern From mhpc.edas at gmail.com Fri May 3 10:26:14 2013 From: mhpc.edas at gmail.com (MHPC 2013) Date: Fri, 3 May 2013 16:26:14 +0200 Subject: [SciPy-Dev] CfP 2013 Workshop on Middleware for HPC and Big Data Systems (MHPC'13) Message-ID: we apologize if you receive multiple copies of this message =================================================================== CALL FOR PAPERS 2013 Workshop on Middleware for HPC and Big Data Systems MHPC '13 as part of Euro-Par 2013, Aachen, Germany =================================================================== Date: August 27, 2012 Workshop URL: http://m-hpc.org Springer LNCS SUBMISSION DEADLINE: May 31, 2013 - LNCS Full paper submission (rolling abstract submission) June 28, 2013 - Lightning Talk abstracts SCOPE Extremely large, diverse, and complex data sets are generated from scientific applications, the Internet, social media and other applications. Data may be physically distributed and shared by an ever larger community. Collecting, aggregating, storing and analyzing large data volumes presents major challenges. Processing such amounts of data efficiently has been an issue to scientific discovery and technological advancement. In addition, making the data accessible, understandable and interoperable includes unsolved problems. Novel middleware architectures, algorithms, and application development frameworks are required. In this workshop we are particularly interested in original work at the intersection of HPC and Big Data with regard to middleware handling and optimizations. Scope is existing and proposed middleware for HPC and big data, including analytics libraries and frameworks. The goal of this workshop is to bring together software architects, middleware and framework developers, data-intensive application developers as well as users from the scientific and engineering community to exchange their experience in processing large datasets and to report their scientific achievement and innovative ideas. The workshop also offers a dedicated forum for these researchers to access the state of the art, to discuss problems and requirements, to identify gaps in current and planned designs, and to collaborate in strategies for scalable data-intensive computing. The workshop will be one day in length, composed of 20 min paper presentations, each followed by 10 min discussion sections. Presentations may be accompanied by interactive demonstrations. TOPICS Topics of interest include, but are not limited to: - Middleware including: Hadoop, Apache Drill, YARN, Spark/Shark, Hive, Pig, Sqoop, HBase, HDFS, S4, CIEL, Oozie, Impala, Storm and Hyrack - Data intensive middleware architecture - Libraries/Frameworks including: Apache Mahout, Giraph, UIMA and GraphLab - NG Databases including Apache Cassandra, MongoDB and CouchDB/Couchbase - Schedulers including Cascading - Middleware for optimized data locality/in-place data processing - Data handling middleware for deployment in virtualized HPC environments - Parallelization and distributed processing architectures at the middleware level - Integration with cloud middleware and application servers - Runtime environments and system level support for data-intensive computing - Skeletons and patterns - Checkpointing - Programming models and languages - Big Data ETL - Stream processing middleware - In-memory databases for HPC - Scalability and interoperability - Large-scale data storage and distributed file systems - Content-centric addressing and networking - Execution engines, languages and environments including CIEL/Skywriting - Performance analysis, evaluation of data-intensive middleware - In-depth analysis and performance optimizations in existing data-handling middleware, focusing on indexing/fast storing or retrieval between compute and storage nodes - Highly scalable middleware optimized for minimum communication - Use cases and experience for popular Big Data middleware - Middleware security, privacy and trust architectures DATES Papers: Rolling abstract submission May 31, 2013 - Full paper submission July 8, 2013 - Acceptance notification October 3, 2013 - Camera-ready version due Lightning Talks: June 28, 2013 - Deadline for lightning talk abstracts July 15, 2013 - Lightning talk notification August 27, 2013 - Workshop Date TPC CHAIR Michael Alexander (chair), TU Wien, Austria Anastassios Nanos (co-chair), NTUA, Greece Jie Tao (co-chair), Karlsruhe Institut of Technology, Germany Lizhe Wang (co-chair), Chinese Academy of Sciences, China Gianluigi Zanetti (co-chair), CRS4, Italy PROGRAM COMMITTEE Amitanand Aiyer, Facebook, USA Costas Bekas, IBM, Switzerland Jakob Blomer, CERN, Switzerland William Gardner, University of Guelph, Canada Jos? Gracia, HPC Center of the University of Stuttgart, Germany Zhenghua Guom, Indiana University, USA Marcus Hardt, Karlsruhe Institute of Technology, Germany Sverre Jarp, CERN, Switzerland Christopher Jung, Karlsruhe Institute of Technology, Germany Andreas Kn?pfer - Technische Universit?t Dresden, Germany Nectarios Koziris, National Technical University of Athens, Greece Yan Ma, Chinese Academy of Sciences, China Martin Schulz - Lawrence Livermore National Laboratory Viral Shah, MIT Julia Group, USA Dimitrios Tsoumakos, Ionian University, Greece Zhifeng Yun, Louisiana State University, USA PAPER PUBLICATION Accepted full papers will be published in the Springer LNCS series. The best papers of the workshop -- after extension and revision -- will be published in a Special Issue of the Springer Journal of Scalable Computing. PAPER SUBMISSION Papers submitted to the workshop will be reviewed by at least two members of the program committee and external reviewers. Submissions should include abstract, key words, the e-mail address of the corresponding author, and must not exceed 10 pages, including tables and figures at a main font size no smaller than 11 point. Submission of a paper should be regarded as a commitment that, should the paper be accepted, at least one of the authors will register and attend the conference to present the work. The format must be according to the Springer LNCS Style. Initial submissions are in PDF; authors of accepted papers will be requested to provide source files. Format Guidelines: http://www.springer.de/comp/lncs/authors.html Style template: ftp://ftp.springer.de/pub/tex/latex/llncs/latex2e/llncs2e.zip Abstract Registration - Submission Link: http://edas.info/newPaper.php?c=14763 LIGHTNING TALKS Talks are strictly limited to 5 minutes. They can be used to gain early feedback on ongoing research, for demonstrations, to present research results, early research ideas, perspectives and positions of interest to the community. Lightning talks should spark discussion with presenters making themselves available following the lightning talk track. DURATION: Workshop Duration is one day. GENERAL INFORMATION The workshop will be held as part of Euro-Par 2013. Euro-Par 2013: http://www.europar2013.org From razimantv at gmail.com Fri May 3 15:50:27 2013 From: razimantv at gmail.com (Raziman T V) Date: Fri, 3 May 2013 21:50:27 +0200 Subject: [SciPy-Dev] Licensing issue with AMOS library Message-ID: Hi, The amos library for computing special functions is part of scipy.special. If I understand correctly, routines in the library are being used to compute Bessel function values for complex arguments. However, a version of the algorithm is seen to have been published on TOMS ( http://dl.acm.org/citation.cfm?id=214331). According to the ACM software license, the software isn't really "free" since commercial use is not allowed. How does this work with scipy? Any clarification on the license status of the library will be greatly appreciated since I plan to use it in a routine I intend to publish. Thank you Yours Raziman -------------- next part -------------- An HTML attachment was scrubbed... URL: From pav at iki.fi Fri May 3 17:11:51 2013 From: pav at iki.fi (Pauli Virtanen) Date: Sat, 04 May 2013 00:11:51 +0300 Subject: [SciPy-Dev] Licensing issue with AMOS library In-Reply-To: References: Message-ID: 03.05.2013 22:50, Raziman T V kirjoitti: > The amos library for computing special functions is part of > scipy.special. If I understand correctly, routines in the library are > being used to compute Bessel function values for complex arguments. > > However, a version of the algorithm is seen to have been published on > TOMS (http://dl.acm.org/citation.cfm?id=214331). According to the ACM > software license, the software isn't really "free" since commercial use > is not allowed. How does this work with scipy? > > Any clarification on the license status of the library will be greatly > appreciated since I plan to use it in a routine I intend to publish. The same routines are also relesed as a part of the SLATEC library, which is explicitly stated to be in the public domain: http://netlib.org/slatec/guide http://netlib.org/slatec/ http://netlib.org/slatec/src/ -- Pauli Virtanen From razimantv at gmail.com Sat May 4 04:06:56 2013 From: razimantv at gmail.com (Raziman T V) Date: Sat, 4 May 2013 10:06:56 +0200 Subject: [SciPy-Dev] Licensing issue with AMOS library In-Reply-To: References: Message-ID: Thanks. I will use the SLATEC version in my code then -Raziman On 3 May 2013 23:11, Pauli Virtanen wrote: > > The same routines are also relesed as a part of the SLATEC library, > which is explicitly stated to be in the public domain: > > http://netlib.org/slatec/guide > http://netlib.org/slatec/ > http://netlib.org/slatec/src/ > > -- > Pauli Virtanen > > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev > -------------- next part -------------- An HTML attachment was scrubbed... URL: From nils106 at googlemail.com Sat May 4 04:22:45 2013 From: nils106 at googlemail.com (Nils Wagner) Date: Sat, 4 May 2013 10:22:45 +0200 Subject: [SciPy-Dev] FAIL: test_qz_complex64 (test_decomp.TestQZ) Message-ID: Hi all, I found a (new) test failure. Seems to be an accuracy issue. >>> scipy.__version__ '0.13.0.dev-ccbdff8' >>> numpy.__version__ '1.8.0.dev-004ce27' >>> scipy.show_config() umfpack_info: NOT AVAILABLE atlas_threads_info: libraries = ['lapack', 'ptf77blas', 'ptcblas', 'atlas'] library_dirs = ['/home/nils/local/lib/'] define_macros = [('ATLAS_INFO', '"\\"3.10.1\\""')] language = f77 blas_opt_info: libraries = ['ptf77blas', 'ptcblas', 'atlas'] library_dirs = ['/home/nils/local/lib/'] define_macros = [('ATLAS_INFO', '"\\"3.10.1\\""')] language = c atlas_blas_threads_info: libraries = ['ptf77blas', 'ptcblas', 'atlas'] library_dirs = ['/home/nils/local/lib/'] define_macros = [('ATLAS_INFO', '"\\"3.10.1\\""')] language = c lapack_opt_info: libraries = ['lapack', 'ptf77blas', 'ptcblas', 'atlas'] library_dirs = ['/home/nils/local/lib/'] define_macros = [('ATLAS_INFO', '"\\"3.10.1\\""')] language = f77 lapack_mkl_info: NOT AVAILABLE blas_mkl_info: NOT AVAILABLE mkl_info: NOT AVAILABLE ====================================================================== FAIL: test_qz_complex64 (test_decomp.TestQZ) ---------------------------------------------------------------------- Traceback (most recent call last): File "/home/nils/local/lib64/python2.7/site-packages/scipy/linalg/tests/test_decomp.py", line 1765, in test_qz_complex64 assert_array_almost_equal(dot(dot(Q,AA),Z.conjugate().T), A) File "/home/nils/local/lib64/python2.7/site-packages/numpy/testing/utils.py", line 818, in assert_array_almost_equal header=('Arrays are not almost equal to %d decimals' % decimal)) File "/home/nils/local/lib64/python2.7/site-packages/numpy/testing/utils.py", line 651, in assert_array_compare raise AssertionError(msg) AssertionError: Arrays are not almost equal to 6 decimals (mismatch 4.0%) x: array([[ 0.92961562+0.72968888j, 0.31637603+0.99401349j, 0.18391919+0.67687315j, 0.20456034+0.79082197j, 0.56772506+0.17091417j],... y: array([[ 0.92961609+0.72968906j, 0.31637555+0.99401456j, 0.18391882+0.67687368j, 0.20456028+0.79082251j, 0.56772500+0.17091426j],... ---------------------------------------------------------------------- Ran 6411 tests in 208.475s FAILED (KNOWNFAIL=28, SKIP=158, failures=1) -------------- next part -------------- An HTML attachment was scrubbed... URL: From drazen.lucanin at gmail.com Sat May 4 07:33:20 2013 From: drazen.lucanin at gmail.com (=?ISO-8859-2?Q?Dra=BEen_Lu=E8anin?=) Date: Sat, 4 May 2013 13:33:20 +0200 Subject: [SciPy-Dev] scipy.integrate optimisation for pandas.TimeSeries Message-ID: Hi all, I wrote a GSoC project proposal. Unfortunately I didn't manage to get through a feedback loop to improve it based on your comments - had some trouble registering for the mailing list before. It is up on Melange as "SciPy: Improving Numerical Integration of Time Series" - probably under this link: https://google-melange.appspot.com/gsoc/proposal/review/google/gsoc2013/kermit666/2# My main motivation is that the current way to integrate a time series in Python (due to Pandas using miliseconds as its underlying structure [1]): integrate.simps(ts, ts.index.astype(np.int64) / 10**9) executes with a big overhead (of first having to divide every element to get a 1 integer unit = 1 second representation) and feels somewhat unpythonic. This gist illustrates the performance overhead that's troubling me: http://nbviewer.ipython.org/5512857 I would like to explore ways to rely on the basic timestamp arithmetic in scipy (dynamically, without introducing any dependencies), instead of forcing the user to transform the whole data domain. If there is any time left after this, the usability of scipy.integrate for time series integration could be further improved by adding some new features to Pandas too [2]. Is there perhaps anyone willing to mentor such work? Regards, Dra?en Lu?anin [1]: http://stackoverflow.com/questions/15203623/convert-pandas-datetimeindex-to-unix-time [2]: https://github.com/pydata/pandas/issues/2704 -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Sun May 5 05:07:41 2013 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Sun, 5 May 2013 11:07:41 +0200 Subject: [SciPy-Dev] Request for extension to scipy.integrate In-Reply-To: <1F2B5A95-7C99-4B0A-AF0D-CFBBCF250147@gmail.com> References: <1F2B5A95-7C99-4B0A-AF0D-CFBBCF250147@gmail.com> Message-ID: Hi Nathan, On Fri, Apr 26, 2013 at 2:53 AM, Nathan Woods wrote: > SciPy's multidimensional integration capabilities are somewhat limited, as > mentioned previously: https://github.com/scipy/scipy/issues/2098. > Although dblquad and tplquad attempt to address this problem, they do not > allow any detailed control over the underlying quad algorithm and there is > no option for 4-dimensional integration at all. > > One obvious instance where this functionality would be desired is in the > evaluation of volume integrals in 4-dimensional space-time. Another > instance is the integration of discontinuous functions, where access to the > "points" option of quad is critical. As it stands, SciPy does not provide > any good functionality for either of these situations. > > I've written a recursive wrapper function for quad that resolves both of > these problems, allowing n-dimensional integration with complete > specification of options for quad. I've included a test problem that > demonstrates the resolution of both of the above problems (just execute the > script). I've done my best to document everything, and I've verified the > result of the test problem against Mathematica. > > I would like this code (or equivalent functionality) to be included in the > integrate package. I'm open to any feedback or suggestions, particularly > with the interface. > Including an n-dim integration function sounds good to me. I've tried your implementation with a few simple functions and it seems to give the right results. Some comments on the API and code: - the debug keyword arguments should be removed - function name: maybe `nquad` or `ndimquad`? "mul" can mean multiple or multiply - don't use a global for abserr - why must `opts` be of the same length as `ranges`? Repeating it in case it has only one element would make sense. For example, this should work: ``opts={'full_output', True})``, now I have to write ``opts=[{'full_output', True}, {}, {}, {}])``. - why is full_output disabled? Next steps would be to add good unit tests, adapt the documentation to Numpy format (see https://github.com/numpy/numpy/blob/master/doc/HOWTO_DOCUMENT.rst.txt), add an example or two to the docstring, and submit a PR for this. Cheers, Ralf > > Nathan Woods > > """ > Recursive integration using SciPy's quad function. > > Contains the following: > > class Error - module wrapper for Exception. For future expansion only. > > function mul_quad - Evaluate multiple integrals using recursive calls to > quad. > """ > from scipy.integrate import quad > import numpy > class Error(Exception): > pass > def mul_quad(func,ranges,args=(),opts=(),_depth=0,_int_vars=()): > """ > Evaluate multiple integrals through recursive calls to > scipy.integrate.quad. > > mul_quad takes care of the programming required to perform nested > integration using the 1-dimensional integration routines provided > by SciPy's adaptive quadrature function. It extends the capabilities > of dblquad and tplquad by allowing for more levels of integration, > and allowing the user to specify nearly the full range of options > allowed by quad, for each level of integration. Users are cautioned > that nested Gaussian integration of this kind is computationally > intensive, and may be unsuitable for many nested integrals. > > Usage: mul_quad(func,ranges,args=(),opts=(),_depth=0,_int_vars=()) > > Inputs: > func - callable, acceptable to SciPy's quad, returning a number. > Should accept a float, followed by the contents of _int_vars and > args, e.g. if x is a float, args=(1.4,string), and _int_vars = > (1,1,3), then func should be of the form > func(x,1,1,3,1.4,string). > ranges - sequence describing the ranges of integration. Integrals > are performed in order, so ranges[0] corresponds to the first > argument of func, ranges[1] to the second, and so on. Each > element of ranges may be either a constant sequence of length 2 > or else a function that returns such a sequence. If a function, > then it will be called with all of the integration arguments > available to that point. e.g. for func = f(x0,x1,x2,x3), the > range of integration for x0 may be defined as either a constant > such as (0,1) or as a function range0(x1,x2,x3). The functional > range of integration for x1 will be range1(x2,x3), x2 will be > range2(x3), and so on. > args - optional sequence of arguments. Contains only arguments of > func beyond those over which the integration is being performed. > opts - optional sequence of options for SciPy's quad. Each element > of opts may be specified as either a dictionary or as a function > that returns a dictionary similarly to ranges. opts must either > be left empty (), or it must be the same length as ranges. > Options are passed in the same order as ranges, so opts[0] > corresponds to integration over the first argument of func, and > so on. The full_output option from quad is not available, due to > the difficulty of consolidating the large number of additional > outputs. For reference, the default options from quad are: > - epsabs = 1.49e-08 > - epsrel = 1.49e-08 > - limit = 50 > - points = None > - weight = None > - wvar = None > - wopts = None > (As of Apr 2013) > _depth - used to determine level of integration. Should be omitted > by the user, except for debugging purposes. > _int_vars - contains values of integration variables in inner > integration loops. Should not be used manually except for > debugging. > > Returns: > out - value of multiple integral in the specified range. > abserr - estimate of the absolute error in the result. The > maximum value of abserr among all the SciPy quad evaluations. > > """ > global abserr > if _depth == 0: > abserr = None > if not (len(opts) in [0,len(ranges)]): > raise Error('opts must be given for all integration levels or > none!') > total_args = _int_vars+args > # Select the range and opts for the given depth of integration. > ind = -_depth-1 > if callable(ranges[ind]): > current_range = ranges[ind](*total_args) > else: > current_range = ranges[ind] > if len(opts) != 0: > if callable(opts[ind]): > current_opts = opts[ind](*total_args) > else: > current_opts = opts[ind] > else: > current_opts = {} > try: > if current_opts["full_output"] != 0: > raise Error('full_output option is disabled!') > except(KeyError): > pass > # Check to make sure that all points lie within the specified range. > try: > for point in current_opts["points"]: > if point < current_range[0] or point > current_range[1]: > current_opts["points"].remove(point) > except(KeyError): > pass > if current_range is ranges[0]:# Check to see if down to 1-D integral. > func_new = func > else: > # Define a new integrand. > def func_new(*_int_vars): > return mul_quad(func,ranges,args=args,opts=opts, > _depth=_depth+1,_int_vars=_int_vars) > out = quad(func_new,*current_range,args=_int_vars+args,**current_opts) > # Track the maximum value of abserr from quad > if abserr is None: > abserr = out[1] > if out[1] > abserr: > abserr = out[1] > if _depth == 0: > return out[0],abserr > else: > return out[0] > > if __name__=='__main__': > func = lambda x0,x1,x2,x3 : x0**2+x1*x2-x3**3+numpy.sin(x0)+( > 1 if (x0-.2*x3-.5-.25*x1>0) else 0) > points=[[lambda (x1,x2,x3) : .2*x3+.5+.25*x1],[],[],[]] > def opts0(*args,**kwargs): > return {'points':[.2*args[2]+.5+.25*args[0]]} > print mul_quad(func,[[0,1],[-1,1],[.13,.8],[-.15,1]], > opts=[opts0,{},{},{}]) > > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Sun May 5 05:37:37 2013 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Sun, 5 May 2013 11:37:37 +0200 Subject: [SciPy-Dev] scipy.integrate optimisation for pandas.TimeSeries In-Reply-To: References: Message-ID: On Sat, May 4, 2013 at 1:33 PM, Dra?en Lu?anin wrote: > Hi all, > > I wrote a GSoC project proposal. Unfortunately I didn't manage to get > through a feedback loop to improve it based on your comments - had some > trouble registering for the mailing list before. It is up on Melange as > "SciPy: Improving Numerical Integration of Time Series" - probably under > this link: > > > https://google-melange.appspot.com/gsoc/proposal/review/google/gsoc2013/kermit666/2# > > My main motivation is that the current way to integrate a time series in > Python (due to Pandas using miliseconds as its underlying structure [1]): > > integrate.simps(ts, ts.index.astype(np.int64) / 10**9) > > executes with a big overhead (of first having to divide every element to > get a 1 integer unit = 1 second representation) and feels somewhat > unpythonic. This gist illustrates the performance overhead that's troubling > me: > > http://nbviewer.ipython.org/5512857 > > I would like to explore ways to rely on the basic timestamp arithmetic in > scipy (dynamically, without introducing any dependencies), instead of > forcing the user to transform the whole data domain. > Hi Drazen, it seems to me that making scipy.integrate time series aware is a significant enlargement of the scope of that module, and I'm not sure if this is the best way to go. Wouldn't it make more sense to add this in pandas itself? > If there is any time left after this, the usability of scipy.integrate for > time series integration could be further improved by adding some new > features to Pandas too [2]. > I think your proposal, even including the pandas issue [2], at the moment isn't enough for 3 months of work. For example, it should take much less than the two weeks you now have for implementing the trapezoidal rule integration (also, why even do that?). Scipy.integrate actually does need some TLC (it's unfortunately one of the least well maintained modules in scipy), but my feeling is that this wouldn't be high on the list of prios even if we did want to add time series functionality. Ralf Is there perhaps anyone willing to mentor such work? > > Regards, > Dra?en Lu?anin > > [1]: > http://stackoverflow.com/questions/15203623/convert-pandas-datetimeindex-to-unix-time > [2]: https://github.com/pydata/pandas/issues/2704 > > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Sun May 5 05:41:32 2013 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Sun, 5 May 2013 11:41:32 +0200 Subject: [SciPy-Dev] FAIL: test_qz_complex64 (test_decomp.TestQZ) In-Reply-To: References: Message-ID: On Sat, May 4, 2013 at 10:22 AM, Nils Wagner wrote: > Hi all, > > I found a (new) test failure. > > Seems to be an accuracy issue. > For what precision does the failure disappear? Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Sun May 5 06:02:02 2013 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Sun, 5 May 2013 12:02:02 +0200 Subject: [SciPy-Dev] build systems - removing numscons support Message-ID: Hi all, I spent some time fixing building with Bento: https://github.com/scipy/scipy/pull/2447. Try it out if you regularly build scipy - for me it's 5x faster than a distutils build, and the difference is even larger when disregarding the Cythonizing step. We still have support for Numscons, although it has been broken for several months. Supporting three build systems is at least one too many, so I propose to remove the numscons support. Any objections? Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Sun May 5 06:04:29 2013 From: robert.kern at gmail.com (Robert Kern) Date: Sun, 5 May 2013 11:04:29 +0100 Subject: [SciPy-Dev] build systems - removing numscons support In-Reply-To: References: Message-ID: On Sun, May 5, 2013 at 11:02 AM, Ralf Gommers wrote: > Hi all, > > I spent some time fixing building with Bento: > https://github.com/scipy/scipy/pull/2447. Try it out if you regularly build > scipy - for me it's 5x faster than a distutils build, and the difference is > even larger when disregarding the Cythonizing step. > > We still have support for Numscons, although it has been broken for several > months. Supporting three build systems is at least one too many, so I > propose to remove the numscons support. Any objections? +1 -- Robert Kern From charlesnwoods at gmail.com Sun May 5 08:42:39 2013 From: charlesnwoods at gmail.com (Nathan Woods) Date: Sun, 5 May 2013 06:42:39 -0600 Subject: [SciPy-Dev] Request for extension to scipy.integrate In-Reply-To: References: <1F2B5A95-7C99-4B0A-AF0D-CFBBCF250147@gmail.com> Message-ID: <-6331144580724006782@unknownmsgid> Ralf, Thank you for the feedback and suggestions. I'll get started on a few of them right away. There are a few questions you brought up that I'll try and answer, while I think about ways to improve them. First, the debug arguments and the global variable abserr. The debug arguments are actually required by the code, but they are not intended to be used by the user. They carry around information that is required by the nested integration, and that would otherwise have to be included as class variables. They are not intended to be used by the user at all, really. I could certainly get rid of them if I wrote this routine as a class instead of a def, but I've heard that there are performance reasons not to use classes in a nested, iterative, routine like this one. Thoughts? For options specifications, it sounds like you're advocating a dict if options containing keys like "all", 1, 2, ... to allow the user to specify options only at a particular level of integration, or at them all. Did I understand you right? Or were you just suggesting that a single opts specification for everything should be a viable input? As for the full-output option, there would need to be some way to reduce the potentially thousands of outputs in a total integration to something manageable. I'll certainly look at it again, though. Thanks again for your feedback. N On May 5, 2013, at 3:08, Ralf Gommers wrote: Hi Nathan, On Fri, Apr 26, 2013 at 2:53 AM, Nathan Woods wrote: > SciPy's multidimensional integration capabilities are somewhat limited, as > mentioned previously: https://github.com/scipy/scipy/issues/2098. > Although dblquad and tplquad attempt to address this problem, they do not > allow any detailed control over the underlying quad algorithm and there is > no option for 4-dimensional integration at all. > > One obvious instance where this functionality would be desired is in the > evaluation of volume integrals in 4-dimensional space-time. Another > instance is the integration of discontinuous functions, where access to the > "points" option of quad is critical. As it stands, SciPy does not provide > any good functionality for either of these situations. > > I've written a recursive wrapper function for quad that resolves both of > these problems, allowing n-dimensional integration with complete > specification of options for quad. I've included a test problem that > demonstrates the resolution of both of the above problems (just execute the > script). I've done my best to document everything, and I've verified the > result of the test problem against Mathematica. > > I would like this code (or equivalent functionality) to be included in the > integrate package. I'm open to any feedback or suggestions, particularly > with the interface. > Including an n-dim integration function sounds good to me. I've tried your implementation with a few simple functions and it seems to give the right results. Some comments on the API and code: - the debug keyword arguments should be removed - function name: maybe `nquad` or `ndimquad`? "mul" can mean multiple or multiply - don't use a global for abserr - why must `opts` be of the same length as `ranges`? Repeating it in case it has only one element would make sense. For example, this should work: ``opts={'full_output', True})``, now I have to write ``opts=[{'full_output', True}, {}, {}, {}])``. - why is full_output disabled? Next steps would be to add good unit tests, adapt the documentation to Numpy format (see https://github.com/numpy/numpy/blob/master/doc/HOWTO_DOCUMENT.rst.txt), add an example or two to the docstring, and submit a PR for this. Cheers, Ralf > > Nathan Woods > > """ > Recursive integration using SciPy's quad function. > > Contains the following: > > class Error - module wrapper for Exception. For future expansion only. > > function mul_quad - Evaluate multiple integrals using recursive calls to > quad. > """ > from scipy.integrate import quad > import numpy > class Error(Exception): > pass > def mul_quad(func,ranges,args=(),opts=(),_depth=0,_int_vars=()): > """ > Evaluate multiple integrals through recursive calls to > scipy.integrate.quad. > > mul_quad takes care of the programming required to perform nested > integration using the 1-dimensional integration routines provided > by SciPy's adaptive quadrature function. It extends the capabilities > of dblquad and tplquad by allowing for more levels of integration, > and allowing the user to specify nearly the full range of options > allowed by quad, for each level of integration. Users are cautioned > that nested Gaussian integration of this kind is computationally > intensive, and may be unsuitable for many nested integrals. > > Usage: mul_quad(func,ranges,args=(),opts=(),_depth=0,_int_vars=()) > > Inputs: > func - callable, acceptable to SciPy's quad, returning a number. > Should accept a float, followed by the contents of _int_vars and > args, e.g. if x is a float, args=(1.4,string), and _int_vars = > (1,1,3), then func should be of the form > func(x,1,1,3,1.4,string). > ranges - sequence describing the ranges of integration. Integrals > are performed in order, so ranges[0] corresponds to the first > argument of func, ranges[1] to the second, and so on. Each > element of ranges may be either a constant sequence of length 2 > or else a function that returns such a sequence. If a function, > then it will be called with all of the integration arguments > available to that point. e.g. for func = f(x0,x1,x2,x3), the > range of integration for x0 may be defined as either a constant > such as (0,1) or as a function range0(x1,x2,x3). The functional > range of integration for x1 will be range1(x2,x3), x2 will be > range2(x3), and so on. > args - optional sequence of arguments. Contains only arguments of > func beyond those over which the integration is being performed. > opts - optional sequence of options for SciPy's quad. Each element > of opts may be specified as either a dictionary or as a function > that returns a dictionary similarly to ranges. opts must either > be left empty (), or it must be the same length as ranges. > Options are passed in the same order as ranges, so opts[0] > corresponds to integration over the first argument of func, and > so on. The full_output option from quad is not available, due to > the difficulty of consolidating the large number of additional > outputs. For reference, the default options from quad are: > - epsabs = 1.49e-08 > - epsrel = 1.49e-08 > - limit = 50 > - points = None > - weight = None > - wvar = None > - wopts = None > (As of Apr 2013) > _depth - used to determine level of integration. Should be omitted > by the user, except for debugging purposes. > _int_vars - contains values of integration variables in inner > integration loops. Should not be used manually except for > debugging. > > Returns: > out - value of multiple integral in the specified range. > abserr - estimate of the absolute error in the result. The > maximum value of abserr among all the SciPy quad evaluations. > > """ > global abserr > if _depth == 0: > abserr = None > if not (len(opts) in [0,len(ranges)]): > raise Error('opts must be given for all integration levels or > none!') > total_args = _int_vars+args > # Select the range and opts for the given depth of integration. > ind = -_depth-1 > if callable(ranges[ind]): > current_range = ranges[ind](*total_args) > else: > current_range = ranges[ind] > if len(opts) != 0: > if callable(opts[ind]): > current_opts = opts[ind](*total_args) > else: > current_opts = opts[ind] > else: > current_opts = {} > try: > if current_opts["full_output"] != 0: > raise Error('full_output option is disabled!') > except(KeyError): > pass > # Check to make sure that all points lie within the specified range. > try: > for point in current_opts["points"]: > if point < current_range[0] or point > current_range[1]: > current_opts["points"].remove(point) > except(KeyError): > pass > if current_range is ranges[0]:# Check to see if down to 1-D integral. > func_new = func > else: > # Define a new integrand. > def func_new(*_int_vars): > return mul_quad(func,ranges,args=args,opts=opts, > _depth=_depth+1,_int_vars=_int_vars) > out = quad(func_new,*current_range,args=_int_vars+args,**current_opts) > # Track the maximum value of abserr from quad > if abserr is None: > abserr = out[1] > if out[1] > abserr: > abserr = out[1] > if _depth == 0: > return out[0],abserr > else: > return out[0] > > if __name__=='__main__': > func = lambda x0,x1,x2,x3 : x0**2+x1*x2-x3**3+numpy.sin(x0)+( > 1 if (x0-.2*x3-.5-.25*x1>0) else 0) > points=[[lambda (x1,x2,x3) : .2*x3+.5+.25*x1],[],[],[]] > def opts0(*args,**kwargs): > return {'points':[.2*args[2]+.5+.25*args[0]]} > print mul_quad(func,[[0,1],[-1,1],[.13,.8],[-.15,1]], > opts=[opts0,{},{},{}]) > > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev > > _______________________________________________ SciPy-Dev mailing list SciPy-Dev at scipy.org http://mail.scipy.org/mailman/listinfo/scipy-dev -------------- next part -------------- An HTML attachment was scrubbed... URL: From cournape at gmail.com Sun May 5 11:55:58 2013 From: cournape at gmail.com (David Cournapeau) Date: Sun, 5 May 2013 16:55:58 +0100 Subject: [SciPy-Dev] build systems - removing numscons support In-Reply-To: References: Message-ID: On Sun, May 5, 2013 at 11:02 AM, Ralf Gommers wrote: > Hi all, > > I spent some time fixing building with Bento: > https://github.com/scipy/scipy/pull/2447. Try it out if you regularly build > scipy - for me it's 5x faster than a distutils build, and the difference is > even larger when disregarding the Cythonizing step. > > We still have support for Numscons, although it has been broken for several > months. Supporting three build systems is at least one too many, so I > propose to remove the numscons support. Any objections? I thought the decision was already made to remove it ! So +1 from me as well David From ralf.gommers at gmail.com Sun May 5 12:27:21 2013 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Sun, 5 May 2013 18:27:21 +0200 Subject: [SciPy-Dev] build systems - removing numscons support In-Reply-To: References: Message-ID: On Sun, May 5, 2013 at 5:55 PM, David Cournapeau wrote: > On Sun, May 5, 2013 at 11:02 AM, Ralf Gommers > wrote: > > Hi all, > > > > I spent some time fixing building with Bento: > > https://github.com/scipy/scipy/pull/2447. Try it out if you regularly > build > > scipy - for me it's 5x faster than a distutils build, and the difference > is > > even larger when disregarding the Cythonizing step. > > > > We still have support for Numscons, although it has been broken for > several > > months. Supporting three build systems is at least one too many, so I > > propose to remove the numscons support. Any objections? > > I thought the decision was already made to remove it ! So +1 from me as > well That was for numpy. PR: https://github.com/scipy/scipy/pull/2450 Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From drazen.lucanin at gmail.com Sun May 5 17:24:01 2013 From: drazen.lucanin at gmail.com (=?ISO-8859-2?Q?Dra=BEen_Lu=E8anin?=) Date: Sun, 5 May 2013 23:24:01 +0200 Subject: [SciPy-Dev] scipy.integrate optimisation for pandas.TimeSeries In-Reply-To: References:

Message-ID: On Sun, May 5, 2013 at 11:37 AM, Ralf Gommers wrote: > > Hi Drazen, it seems to me that making scipy.integrate time series aware is > a significant enlargement of the scope of that module, and I'm not sure if > this is the best way to go. Wouldn't it make more sense to add this in > pandas itself? > Hi Ralf, well, I see the relation scipy : pandas practically the same as scipy : numpy. Pandas offers basic functionality for dealing with data (time-stamped in contrast to numpy, but still low-level), while scipy adds a layer of logic useful in specific scientific domains. I guess numerical integration functionality isn't present in numpy's arrays for the same reason. If nothing else, maintaining numerical integration methods in two places seems a bit redundant. > > >> If there is any time left after this, the usability of scipy.integrate >> for time series integration could be further improved by adding some new >> features to Pandas too [2]. >> > > I think your proposal, even including the pandas issue [2], at the moment > isn't enough for 3 months of work. For example, it should take much less > than the two weeks you now have for implementing the trapezoidal rule > integration (also, why even do that?). > Yes, it is a bit hard for me to predict the amount of work necessary for the tasks I suggested, since I don't yet have a realistic picture of how everything works together. For this, we could have used an additional couple of feedback iterations. With the trapoezoidal rule milestone I meant a very minimalistic working version, on top of which additional iterations can be made to cover the complete feature set (all the rules, all kinds of data, missing values etc.) I suggested the pandas issue more as an example. Of course, if the original aims were satisfied sooner than expected, I would be glad to continue working on other things as well. There is plenty of work to choose from, I'd say. One direction might be to see if time series support would make sense in other modules as well (interpolation (although pandas has its own version - it would be interesting to see if the two could be merged maybe), FFT, ...) > > Scipy.integrate actually does need some TLC (it's unfortunately one of the > least well maintained modules in scipy), but my feeling is that this > wouldn't be high on the list of prios even if we did want to add time > series functionality. > Another direction, as you say, might be to go deeper into the scipy.integrate module itself. There is plenty of code regarding function integration that could maybe be reorganised or even ported to something more modern and maintainable than Fortran - maybe Cython. Integrating functions doesn't perform quite well from what I heard (I think it was a StackOverflow question, can't find it now), so it might make sense to brainstorm on how to improve that. Regarding the priorities, that's for you guys to decide. I am interested in the idea, as it's something that would help me with the way I'm using scipy. Scratching my itch :) http://www.catb.org/esr/writings/homesteading/cathedral-bazaar/ar01s02.html Cheers, Dra?en > > Ralf > > > Is there perhaps anyone willing to mentor such work? >> >> Regards, >> Dra?en Lu?anin >> >> [1]: >> http://stackoverflow.com/questions/15203623/convert-pandas-datetimeindex-to-unix-time >> [2]: https://github.com/pydata/pandas/issues/2704 >> >> _______________________________________________ >> SciPy-Dev mailing list >> SciPy-Dev at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-dev >> >> > > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Sun May 5 18:12:46 2013 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Mon, 6 May 2013 00:12:46 +0200 Subject: [SciPy-Dev] Request for extension to scipy.integrate In-Reply-To: <-6331144580724006782@unknownmsgid> References: <1F2B5A95-7C99-4B0A-AF0D-CFBBCF250147@gmail.com> <-6331144580724006782@unknownmsgid> Message-ID: On Sun, May 5, 2013 at 2:42 PM, Nathan Woods wrote: > Ralf, > > Thank you for the feedback and suggestions. I'll get started on a few of > them right away. There are a few questions you brought up that I'll try and > answer, while I think about ways to improve them. > > First, the debug arguments and the global variable abserr. The debug > arguments are actually required by the code, but they are not intended to > be used by the user. They carry around information that is required by the > nested integration, and that would otherwise have to be included as class > variables. They are not intended to be used by the user at all, really. I > could certainly get rid of them if I wrote this routine as a class instead > of a def, but I've heard that there are performance reasons not to use > classes in a nested, iterative, routine like this one. Thoughts? > A class would be preferable to globals and keyword arguments not intended for the user. I don't see a performance issue with that here. > For options specifications, it sounds like you're advocating a dict if > options containing keys like "all", 1, 2, ... to allow the user to specify > options only at a particular level of integration, or at them all. Did I > understand you right? Or were you just suggesting that a single opts > specification for everything should be a viable input? > The latter, in addition to what you currently have. If I want to pass the same keyword to each integration steps, then I shouldn't be forced to create N times the same dict. As for the full-output option, there would need to be some way to reduce > the potentially thousands of outputs in a total integration to something > manageable. I'll certainly look at it again, though. > Maybe it's unavoidable that it becomes a mess - dblquad/tplquad also don't have a full_output kw. Not sure if there's any point in just returning the same thing as quad for the outer integration step maybe. Ralf > Thanks again for your feedback. > > N > > On May 5, 2013, at 3:08, Ralf Gommers wrote: > > > Hi Nathan, > > On Fri, Apr 26, 2013 at 2:53 AM, Nathan Woods wrote: > >> SciPy's multidimensional integration capabilities are somewhat limited, >> as mentioned previously: https://github.com/scipy/scipy/issues/2098. >> Although dblquad and tplquad attempt to address this problem, they do not >> allow any detailed control over the underlying quad algorithm and there is >> no option for 4-dimensional integration at all. >> >> One obvious instance where this functionality would be desired is in the >> evaluation of volume integrals in 4-dimensional space-time. Another >> instance is the integration of discontinuous functions, where access to the >> "points" option of quad is critical. As it stands, SciPy does not provide >> any good functionality for either of these situations. >> >> I've written a recursive wrapper function for quad that resolves both of >> these problems, allowing n-dimensional integration with complete >> specification of options for quad. I've included a test problem that >> demonstrates the resolution of both of the above problems (just execute the >> script). I've done my best to document everything, and I've verified the >> result of the test problem against Mathematica. >> >> I would like this code (or equivalent functionality) to be included in >> the integrate package. I'm open to any feedback or suggestions, >> particularly with the interface. >> > > Including an n-dim integration function sounds good to me. I've tried your > implementation with a few simple functions and it seems to give the right > results. > > Some comments on the API and code: > - the debug keyword arguments should be removed > - function name: maybe `nquad` or `ndimquad`? "mul" can mean multiple or > multiply > - don't use a global for abserr > - why must `opts` be of the same length as `ranges`? Repeating it in case > it has only one element would make sense. For example, this should work: > ``opts={'full_output', True})``, now I have to write > ``opts=[{'full_output', True}, {}, {}, {}])``. > - why is full_output disabled? > > Next steps would be to add good unit tests, adapt the documentation to > Numpy format (see > https://github.com/numpy/numpy/blob/master/doc/HOWTO_DOCUMENT.rst.txt), > add an example or two to the docstring, and submit a PR for this. > > Cheers, > Ralf > > > >> >> Nathan Woods >> >> """ >> Recursive integration using SciPy's quad function. >> >> Contains the following: >> >> class Error - module wrapper for Exception. For future expansion only. >> >> function mul_quad - Evaluate multiple integrals using recursive calls to >> quad. >> """ >> from scipy.integrate import quad >> import numpy >> class Error(Exception): >> pass >> def mul_quad(func,ranges,args=(),opts=(),_depth=0,_int_vars=()): >> """ >> Evaluate multiple integrals through recursive calls to >> scipy.integrate.quad. >> >> mul_quad takes care of the programming required to perform nested >> integration using the 1-dimensional integration routines provided >> by SciPy's adaptive quadrature function. It extends the capabilities >> of dblquad and tplquad by allowing for more levels of integration, >> and allowing the user to specify nearly the full range of options >> allowed by quad, for each level of integration. Users are cautioned >> that nested Gaussian integration of this kind is computationally >> intensive, and may be unsuitable for many nested integrals. >> >> Usage: mul_quad(func,ranges,args=(),opts=(),_depth=0,_int_vars=()) >> >> Inputs: >> func - callable, acceptable to SciPy's quad, returning a number. >> Should accept a float, followed by the contents of _int_vars and >> args, e.g. if x is a float, args=(1.4,string), and _int_vars = >> (1,1,3), then func should be of the form >> func(x,1,1,3,1.4,string). >> ranges - sequence describing the ranges of integration. Integrals >> are performed in order, so ranges[0] corresponds to the first >> argument of func, ranges[1] to the second, and so on. Each >> element of ranges may be either a constant sequence of length 2 >> or else a function that returns such a sequence. If a function, >> then it will be called with all of the integration arguments >> available to that point. e.g. for func = f(x0,x1,x2,x3), the >> range of integration for x0 may be defined as either a constant >> such as (0,1) or as a function range0(x1,x2,x3). The functional >> range of integration for x1 will be range1(x2,x3), x2 will be >> range2(x3), and so on. >> args - optional sequence of arguments. Contains only arguments of >> func beyond those over which the integration is being performed. >> opts - optional sequence of options for SciPy's quad. Each element >> of opts may be specified as either a dictionary or as a function >> that returns a dictionary similarly to ranges. opts must either >> be left empty (), or it must be the same length as ranges. >> Options are passed in the same order as ranges, so opts[0] >> corresponds to integration over the first argument of func, and >> so on. The full_output option from quad is not available, due to >> the difficulty of consolidating the large number of additional >> outputs. For reference, the default options from quad are: >> - epsabs = 1.49e-08 >> - epsrel = 1.49e-08 >> - limit = 50 >> - points = None >> - weight = None >> - wvar = None >> - wopts = None >> (As of Apr 2013) >> _depth - used to determine level of integration. Should be omitted >> by the user, except for debugging purposes. >> _int_vars - contains values of integration variables in inner >> integration loops. Should not be used manually except for >> debugging. >> >> Returns: >> out - value of multiple integral in the specified range. >> abserr - estimate of the absolute error in the result. The >> maximum value of abserr among all the SciPy quad evaluations. >> >> """ >> global abserr >> if _depth == 0: >> abserr = None >> if not (len(opts) in [0,len(ranges)]): >> raise Error('opts must be given for all integration levels or >> none!') >> total_args = _int_vars+args >> # Select the range and opts for the given depth of integration. >> ind = -_depth-1 >> if callable(ranges[ind]): >> current_range = ranges[ind](*total_args) >> else: >> current_range = ranges[ind] >> if len(opts) != 0: >> if callable(opts[ind]): >> current_opts = opts[ind](*total_args) >> else: >> current_opts = opts[ind] >> else: >> current_opts = {} >> try: >> if current_opts["full_output"] != 0: >> raise Error('full_output option is disabled!') >> except(KeyError): >> pass >> # Check to make sure that all points lie within the specified range. >> try: >> for point in current_opts["points"]: >> if point < current_range[0] or point > current_range[1]: >> current_opts["points"].remove(point) >> except(KeyError): >> pass >> if current_range is ranges[0]:# Check to see if down to 1-D integral. >> func_new = func >> else: >> # Define a new integrand. >> def func_new(*_int_vars): >> return mul_quad(func,ranges,args=args,opts=opts, >> _depth=_depth+1,_int_vars=_int_vars) >> out = quad(func_new,*current_range,args=_int_vars+args,**current_opts) >> # Track the maximum value of abserr from quad >> if abserr is None: >> abserr = out[1] >> if out[1] > abserr: >> abserr = out[1] >> if _depth == 0: >> return out[0],abserr >> else: >> return out[0] >> >> if __name__=='__main__': >> func = lambda x0,x1,x2,x3 : x0**2+x1*x2-x3**3+numpy.sin(x0)+( >> 1 if (x0-.2*x3-.5-.25*x1>0) else 0) >> points=[[lambda (x1,x2,x3) : .2*x3+.5+.25*x1],[],[],[]] >> def opts0(*args,**kwargs): >> return {'points':[.2*args[2]+.5+.25*args[0]]} >> print mul_quad(func,[[0,1],[-1,1],[.13,.8],[-.15,1]], >> opts=[opts0,{},{},{}]) >> >> _______________________________________________ >> SciPy-Dev mailing list >> SciPy-Dev at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-dev >> >> > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev > > > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From ralf.gommers at gmail.com Sun May 5 18:25:47 2013 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Mon, 6 May 2013 00:25:47 +0200 Subject: [SciPy-Dev] scipy.integrate optimisation for pandas.TimeSeries In-Reply-To: References:

Message-ID: On Sun, May 5, 2013 at 11:24 PM, Dra?en Lu?anin wrote: > On Sun, May 5, 2013 at 11:37 AM, Ralf Gommers wrote: > >> >> Hi Drazen, it seems to me that making scipy.integrate time series aware >> is a significant enlargement of the scope of that module, and I'm not sure >> if this is the best way to go. Wouldn't it make more sense to add this in >> pandas itself? >> > Hi Ralf, > > well, I see the relation scipy : pandas practically the same as scipy : > numpy. Pandas offers basic functionality for dealing with data > (time-stamped in contrast to numpy, but still low-level), > Hmm, not sure I agree with that - pandas is in some respects more high-level than scipy. It includes some plotting tools for example. To build on pandas, statsmodels is a better candidate than scipy (given that statsmodels already has pandas as a dependency). I don't see us adding a dependency on pandas any time soon. > while scipy adds a layer of logic useful in specific scientific domains. I > guess numerical integration functionality isn't present in numpy's arrays > for the same reason. > > If nothing else, maintaining numerical integration methods in two places > seems a bit redundant. > >> >> >>> If there is any time left after this, the usability of scipy.integrate >>> for time series integration could be further improved by adding some new >>> features to Pandas too [2]. >>> >> >> I think your proposal, even including the pandas issue [2], at the moment >> isn't enough for 3 months of work. For example, it should take much less >> than the two weeks you now have for implementing the trapezoidal rule >> integration (also, why even do that?). >> > Yes, it is a bit hard for me to predict the amount of work necessary for > the tasks I suggested, since I don't yet have a realistic picture of how > everything works together. For this, we could have used an additional > couple of feedback iterations. > We can still do that. You can't really edit your proposal on Melange anymore, but it would still be helpful to improve it. You can keep a copy on your blog or on Github for example. > With the trapoezoidal rule milestone I meant a very minimalistic working > version, on top of which additional iterations can be made to cover the > complete feature set (all the rules, all kinds of data, missing values etc.) > > I suggested the pandas issue more as an example. Of course, if the > original aims were satisfied sooner than expected, I would be glad to > continue working on other things as well. There is plenty of work to choose > from, I'd say. One direction might be to see if time series support would > make sense in other modules as well (interpolation (although pandas has its > own version - it would be interesting to see if the two could be merged > maybe), FFT, ...) > >> >> Scipy.integrate actually does need some TLC (it's unfortunately one of >> the least well maintained modules in scipy), but my feeling is that this >> wouldn't be high on the list of prios even if we did want to add time >> series functionality. >> > Another direction, as you say, might be to go deeper into the > scipy.integrate module itself. There is plenty of code regarding function > integration that could maybe be reorganised or even ported to something > more modern and maintainable than Fortran - maybe Cython. Integrating > functions doesn't perform quite well from what I heard (I think it was a > StackOverflow question, can't find it now), so it might make sense to > brainstorm on how to improve that. > See this thread for example: http://thread.gmane.org/gmane.comp.python.scientific.devel/16667/focus=16668. There were other ones as well on scipy-dev and scipy-user. Ralf > > Regarding the priorities, that's for you guys to decide. I am interested > in the idea, as it's something that would help me with the way I'm using > scipy. Scratching my itch :) > http://www.catb.org/esr/writings/homesteading/cathedral-bazaar/ar01s02.html > > Cheers, > Dra?en > >> >> Ralf >> >> >> Is there perhaps anyone willing to mentor such work? >>> >>> Regards, >>> Dra?en Lu?anin >>> >>> [1]: >>> http://stackoverflow.com/questions/15203623/convert-pandas-datetimeindex-to-unix-time >>> [2]: https://github.com/pydata/pandas/issues/2704 >>> >>> _______________________________________________ >>> SciPy-Dev mailing list >>> SciPy-Dev at scipy.org >>> http://mail.scipy.org/mailman/listinfo/scipy-dev >>> >>> >> >> _______________________________________________ >> SciPy-Dev mailing list >> SciPy-Dev at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-dev >> >> > > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From drazen.lucanin at gmail.com Mon May 6 03:47:32 2013 From: drazen.lucanin at gmail.com (=?ISO-8859-2?Q?Dra=BEen_Lu=E8anin?=) Date: Mon, 6 May 2013 09:47:32 +0200 Subject: [SciPy-Dev] scipy.integrate optimisation for pandas.TimeSeries In-Reply-To: References:

Message-ID: On Mon, May 6, 2013 at 12:25 AM, Ralf Gommers wrote: > > > > On Sun, May 5, 2013 at 11:24 PM, Dra?en Lu?anin wrote: > >> On Sun, May 5, 2013 at 11:37 AM, Ralf Gommers wrote: >> >>> >>> Hi Drazen, it seems to me that making scipy.integrate time series aware >>> is a significant enlargement of the scope of that module, and I'm not sure >>> if this is the best way to go. Wouldn't it make more sense to add this in >>> pandas itself? >>> >> Hi Ralf, >> >> well, I see the relation scipy : pandas practically the same as scipy : >> numpy. Pandas offers basic functionality for dealing with data >> (time-stamped in contrast to numpy, but still low-level), >> > > Hmm, not sure I agree with that - pandas is in some respects more > high-level than scipy. It includes some plotting tools for example. To > build on pandas, statsmodels is a better candidate than scipy (given that > statsmodels already has pandas as a dependency). I don't see us adding a > dependency on pandas any time soon. > Yes, in a broader sense. But that's the dynamic nature of pandas that I quite like - ti recognizes and uses other libraries if they're available, but it only really requires numpy, python-dateutil and pytz. https://github.com/pydata/pandas#dependencies That's exactly the sort of pattern I had in mind for adding to scipy.integrate - detect at runtime if pandas is installed and if so, work with it - check the data type and adapt to it. If it is a pandas.TimeSeries use the datetime arithmetic, instead of the lower-level numpy.ndarray integer index that requires a domain transformation in order to integrate based on seconds. > > >> while scipy adds a layer of logic useful in specific scientific domains. >> I guess numerical integration functionality isn't present in numpy's arrays >> for the same reason. >> >> If nothing else, maintaining numerical integration methods in two places >> seems a bit redundant. >> >>> >>> >>>> If there is any time left after this, the usability of scipy.integrate >>>> for time series integration could be further improved by adding some new >>>> features to Pandas too [2]. >>>> >>> >>> I think your proposal, even including the pandas issue [2], at the >>> moment isn't enough for 3 months of work. For example, it should take much >>> less than the two weeks you now have for implementing the trapezoidal rule >>> integration (also, why even do that?). >>> >> Yes, it is a bit hard for me to predict the amount of work necessary for >> the tasks I suggested, since I don't yet have a realistic picture of how >> everything works together. For this, we could have used an additional >> couple of feedback iterations. >> > > We can still do that. You can't really edit your proposal on Melange > anymore, but it would still be helpful to improve it. You can keep a copy > on your blog or on Github for example. > If you agree, we could do that, yes. I'll put up a new version somewhere tonight. > > >> With the trapoezoidal rule milestone I meant a very minimalistic working >> version, on top of which additional iterations can be made to cover the >> complete feature set (all the rules, all kinds of data, missing values etc.) >> >> I suggested the pandas issue more as an example. Of course, if the >> original aims were satisfied sooner than expected, I would be glad to >> continue working on other things as well. There is plenty of work to choose >> from, I'd say. One direction might be to see if time series support would >> make sense in other modules as well (interpolation (although pandas has its >> own version - it would be interesting to see if the two could be merged >> maybe), FFT, ...) >> >>> >>> Scipy.integrate actually does need some TLC (it's unfortunately one of >>> the least well maintained modules in scipy), but my feeling is that this >>> wouldn't be high on the list of prios even if we did want to add time >>> series functionality. >>> >> Another direction, as you say, might be to go deeper into the >> scipy.integrate module itself. There is plenty of code regarding function >> integration that could maybe be reorganised or even ported to something >> more modern and maintainable than Fortran - maybe Cython. Integrating >> functions doesn't perform quite well from what I heard (I think it was a >> StackOverflow question, can't find it now), so it might make sense to >> brainstorm on how to improve that. >> > > See this thread for example: > http://thread.gmane.org/gmane.comp.python.scientific.devel/16667/focus=16668. > There were other ones as well on scipy-dev and scipy-user. > Sorry, I don't quite follow what you mean by that thread? So, this efficient CVODE library is wrapped in scipy already? I see that there is this interface that dives deep into the Fortran code right now - I guess that's it? https://github.com/scipy/scipy/blob/master/scipy/integrate/_ode.py Dra?en > > > Ralf > > > >> >> Regarding the priorities, that's for you guys to decide. I am interested >> in the idea, as it's something that would help me with the way I'm using >> scipy. Scratching my itch :) >> http://www.catb.org/esr/writings/homesteading/cathedral-bazaar/ar01s02.html >> >> Cheers, >> Dra?en >> >>> >>> Ralf >>> >>> >>> Is there perhaps anyone willing to mentor such work? >>>> >>>> Regards, >>>> Dra?en Lu?anin >>>> >>>> [1]: >>>> http://stackoverflow.com/questions/15203623/convert-pandas-datetimeindex-to-unix-time >>>> [2]: https://github.com/pydata/pandas/issues/2704 >>>> >>>> _______________________________________________ >>>> SciPy-Dev mailing list >>>> SciPy-Dev at scipy.org >>>> http://mail.scipy.org/mailman/listinfo/scipy-dev >>>> >>>> >>> >>> _______________________________________________ >>> SciPy-Dev mailing list >>> SciPy-Dev at scipy.org >>> http://mail.scipy.org/mailman/listinfo/scipy-dev >>> >>> >> >> _______________________________________________ >> SciPy-Dev mailing list >> SciPy-Dev at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-dev >> >> > > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From pablo.winant at gmail.com Mon May 6 11:58:34 2013 From: pablo.winant at gmail.com (Pablo Winant) Date: Mon, 06 May 2013 17:58:34 +0200 Subject: [SciPy-Dev] Multilinear interpolation In-Reply-To: <517990E8.2040206@crans.org> References: <51260E4E.4010405@gmail.com> <51263127.7070306@gmail.com> <512B533D.1000105@gmail.com> <517990E8.2040206@crans.org> Message-ID: <5187D32A.50004@gmail.com> Hi Pierre, Sorry for the late answer, I was travelling. Yes, I am still willing to propose it for integration into scipy. I develop it in another github directory, and have synchronized the "dolo" version where I use it and where you can take it in the meantime. It is in a subdirectory that is supposed to be completely independent from the rest of the library. Here are some infos about the state of the code: - So far I have only regular grids, but extension to rectangular grids is pending. - It uses Cython code generated by Tempita templates. There seem to be a bug with tempita which makes the implementation messier than it should be. I haven't found the time to dig into that bug, and didn't receive much echo on that from the tempita developers. - I don't have extensive benchmarks yet. To interpolate a grid with 500 points at 10000 intermediate points, I find my Cython version to be around 10 time faster than the original scipy version. I recall it was beating Matlab too but I don't have it on my laptop to test. Best, Pablo On 25/04/2013 16:24, Pierre Haessig wrote: > Hi Pablo, > > Le 25/02/2013 13:04, Pablo Winant a ?crit : >> So I made an attempt using tempita in the case of regularly spaced >> grids. For now, it is in a temporary repository in github: >> >> https://github.com/albop/python_interpolation >> >> There is a file multilinear_cython.pyx.in which must be interpreted to >> produce multilinear_cython.pyx . There is also a multilinear.py file >> containing an object oriented wrapper for the compiled routines. >> >> Here is an example: >> >> http://nbviewer.ipython.org/urls/raw.github.com/albop/python_interpolation/master/interpolation.ipynb >> >> Before, I do the same for irregularly spaced grid, I have some questions: >> - I included the templates as Python strings in the .pyx.in file . Was >> there a better way ? >> - I was not sure about how to deal with single/double precision in the >> cython code. > I was just wondering what was the status of your work on a multilinear > interpolation. I'm currently working on a small stochastic dynamic > programming code where an interpolator is a key underlying component. > I've been using scipy.interpolate.RectBivariateSpline so far because I > was solving only a 2D state-space problem, but for genericity, a N-D > interpolator is required (although the "curse of dimensionality" > probably implies N <= 3 !) > > By looking at you're Githup repositories, it seems you've integrated the > interpolation code inhttps://github.com/albop/dolo. Is that indeed the > case ? > > This makes me ask two questions : > 1) do you plan to integrate it in scipy.interpolate ? > 2) did you benchmark your code against existing N-D interpolator like > LinearNDInterpolator ? > > best, > Pierre > > > > > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev -------------- next part -------------- An HTML attachment was scrubbed... URL: From drazen.lucanin at gmail.com Mon May 6 14:27:22 2013 From: drazen.lucanin at gmail.com (=?ISO-8859-2?Q?Dra=BEen_Lu=E8anin?=) Date: Mon, 6 May 2013 20:27:22 +0200 Subject: [SciPy-Dev] scipy.integrate optimisation for pandas.TimeSeries In-Reply-To: References:

Message-ID: On Mon, May 6, 2013 at 9:47 AM, Dra?en Lu?anin wrote: > > On Mon, May 6, 2013 at 12:25 AM, Ralf Gommers wrote: > >> >> We can still do that. You can't really edit your proposal on Melange >> anymore, but it would still be helpful to improve it. You can keep a copy >> on your blog or on Github for example. >> > If you agree, we could do that, yes. I'll put up a new version somewhere > tonight. > OK, I copied the proposal to GitHub: https://gist.github.com/kermit666/5526790 We can comment there to avoid too much noise on the mailing list. -------------- next part -------------- An HTML attachment was scrubbed... URL: From nils106 at googlemail.com Mon May 6 14:31:11 2013 From: nils106 at googlemail.com (Nils Wagner) Date: Mon, 6 May 2013 20:31:11 +0200 Subject: [SciPy-Dev] FAIL: test_binom_2 (test_basic.TestCephes) Message-ID: >>> scipy.__version__ '0.13.0.dev-024abe7' ====================================================================== FAIL: test_binom_2 (test_basic.TestCephes) ---------------------------------------------------------------------- Traceback (most recent call last): File "/home/nils/local/lib64/python2.7/site-packages/scipy/special/tests/test_basic.py", line 85, in test_binom_2 atol=1e-10, rtol=1e-10) File "/home/nils/local/lib64/python2.7/site-packages/scipy/special/_testutils.py", line 87, in assert_func_equal fdata.check() File "/home/nils/local/lib64/python2.7/site-packages/scipy/special/_testutils.py", line 292, in check assert_(False, "\n".join(msg)) File "/home/nils/local/lib64/python2.7/site-packages/numpy/testing/utils.py", line 41, in assert_ raise AssertionError(msg) AssertionError: Max |adiff|: 2.14065e+273 Max |rdiff|: 1 Bad results (19 out of 2040) for the following points (in output 0): 5.4555947811685144e+16 19.0 => 8.220635246624177e+300 != inf (rdiff 0.0) 2.9763514416313132e+32 9.0 => 5.051203457981723e+286 != inf (rdiff 0.0) 1.6237767391887178e+48 6.0 => 2.5458065428227896e+286 != inf (rdiff 0.0) 8.858667904100795e+63 4.0 => 2.5660342127750743e+254 != inf (rdiff 0.0) 4.832930238571653e+79 3.0 => 1.8813964861410318e+238 != inf (rdiff 0.0) 2.6366508987303447e+95 3.0 => 3.054967851387346e+285 != inf (rdiff 0.0) 1.4384498882876777e+111 2.0 => 1.0345690405574163e+222 != inf (rdiff 0.0) 7.847599703514559e+126 2.0 => 3.07924105533009e+253 != inf (rdiff 0.0) 4.2813323987192907e+142 2.0 => 9.164903554161738e+284 != inf (rdiff 0.0) 2.3357214690900255e+158 1.0 => 2.3357214690900255e+158 != inf (rdiff 0.0) 1.2742749857031426e+174 1.0 => 1.2742749857031426e+174 != inf (rdiff 0.0) 6.951927961775534e+189 1.0 => 6.951927961775534e+189 != inf (rdiff 0.0) 3.792690190732145e+205 1.0 => 3.792690190732145e+205 != inf (rdiff 0.0) 2.0691380811148325e+221 1.0 => 2.0691380811148325e+221 != inf (rdiff 0.0) 1.128837891684693e+237 1.0 => 1.128837891684693e+237 != inf (rdiff 0.0) 6.158482110660179e+252 1.0 => 6.158482110660179e+252 != inf (rdiff 0.0) 3.359818286283898e+268 1.0 => 3.359818286283898e+268 != inf (rdiff 0.0) 1.8329807108323476e+284 1.0 => 1.8329807108323476e+284 != inf (rdiff 0.0) 1e+300 1.0 => 1e+300 != inf (rdiff 0.0) ---------------------------------------------------------------------- Ran 6446 tests in 203.406s FAILED (KNOWNFAIL=27, SKIP=158, failures=2) -------------- next part -------------- An HTML attachment was scrubbed... URL: From jsseabold at gmail.com Mon May 6 15:25:41 2013 From: jsseabold at gmail.com (Skipper Seabold) Date: Mon, 6 May 2013 15:25:41 -0400 Subject: [SciPy-Dev] scipy.integrate optimisation for pandas.TimeSeries In-Reply-To: References:

Message-ID: On Mon, May 6, 2013 at 3:47 AM, Dra?en Lu?anin wrote: > On Mon, May 6, 2013 at 12:25 AM, Ralf Gommers > wrote: >> On Sun, May 5, 2013 at 11:24 PM, Dra?en Lu?anin >> wrote: >>> >>> On Sun, May 5, 2013 at 11:37 AM, Ralf Gommers >>> wrote: >>>> >>>> >>>> Hi Drazen, it seems to me that making scipy.integrate time series aware >>>> is a significant enlargement of the scope of that module, and I'm not sure >>>> if this is the best way to go. Wouldn't it make more sense to add this in >>>> pandas itself? >>> >>> Hi Ralf, >>> >>> well, I see the relation scipy : pandas practically the same as scipy : >>> numpy. Pandas offers basic functionality for dealing with data (time-stamped >>> in contrast to numpy, but still low-level), >> >> >> Hmm, not sure I agree with that - pandas is in some respects more >> high-level than scipy. It includes some plotting tools for example. To build >> on pandas, statsmodels is a better candidate than scipy (given that >> statsmodels already has pandas as a dependency). I don't see us adding a >> dependency on pandas any time soon. > > Yes, in a broader sense. But that's the dynamic nature of pandas that I > quite like - ti recognizes and uses other libraries if they're available, > but it only really requires numpy, python-dateutil and pytz. > > https://github.com/pydata/pandas#dependencies > > That's exactly the sort of pattern I had in mind for adding to > scipy.integrate - detect at runtime if pandas is installed and if so, work > with it - check the data type and adapt to it. If it is a pandas.TimeSeries > use the datetime arithmetic, instead of the lower-level numpy.ndarray > integer index that requires a domain transformation in order to integrate > based on seconds. IIUC, you would never use this on anything that's not from pandas, so I still don't see why this functionality should go into scipy with a runtime detection for pandas and not in pandas itself? You should reach out to the pandas devs, who likely don't monitor this list. Skipper From ralf.gommers at gmail.com Mon May 6 17:37:43 2013 From: ralf.gommers at gmail.com (Ralf Gommers) Date: Mon, 6 May 2013 23:37:43 +0200 Subject: [SciPy-Dev] FAIL: test_binom_2 (test_basic.TestCephes) In-Reply-To: References: Message-ID: On Mon, May 6, 2013 at 8:31 PM, Nils Wagner wrote: > >>> scipy.__version__ > '0.13.0.dev-024abe7' > > ====================================================================== > FAIL: test_binom_2 (test_basic.TestCephes) > ---------------------------------------------------------------------- > I just saw this too, but a clean rebuild fixed it. Ralf -------------- next part -------------- An HTML attachment was scrubbed... URL: From tim.leslie at gmail.com Mon May 6 19:50:14 2013 From: tim.leslie at gmail.com (Tim Leslie) Date: Tue, 7 May 2013 09:50:14 +1000 Subject: [SciPy-Dev] PEP8 conformance Message-ID: Hi All, We've just merged a couple of pull requests which greatly improve the conformance of the scipy code to the pep8 style guidelines: http://www.python.org/dev/peps/pep-0008/ In order to maintain consistency of code, I would like to recommend that before developers commit (or before they submit a PR) they run the "pep8" script over the code to make sure no errors are flagged. The pep8 script can be found here: https://github.com/jcrocholl/pep8 The file tox.ini has been updated to contain a [pep8] section, which contains the current configuration for scipy. In particular, it lists certain error codes which we currently ignore, and automatically generated python files, which we also ignore. NOTE: The latest stable release of pep8 (1.4.5) has a bug which breaks the ignore-file functionality. If you use this version you can expect to see a bunch of errors from the files in sparsetools. The current master branch of pep8 on github has this issue fixed, so it is recommended to use this version. If everything is correctly installed and there are no pep8 errors you should see the following behaviour: ~/src/scipy$ pep8 scipy ~/src/scipy$ Cheers, Tim -------------- next part -------------- An HTML attachment was scrubbed... URL: From warren.weckesser at gmail.com Mon May 6 20:13:14 2013 From: warren.weckesser at gmail.com (Warren Weckesser) Date: Mon, 6 May 2013 20:13:14 -0400 Subject: [SciPy-Dev] PEP8 conformance In-Reply-To: References: Message-ID: On 5/6/13, Tim Leslie wrote: > Hi All, > > We've just merged a couple of pull requests which greatly improve the > conformance of the scipy code to the pep8 style guidelines: > > http://www.python.org/dev/peps/pep-0008/ > > In order to maintain consistency of code, I would like to recommend that > before developers commit (or before they submit a PR) they run the "pep8" > script over the code to make sure no errors are flagged. > > The pep8 script can be found here: > > https://github.com/jcrocholl/pep8 > > The file tox.ini has been updated to contain a [pep8] section, which > contains the current configuration for scipy. In particular, it lists > certain error codes which we currently ignore, and automatically generated > python files, which we also ignore. > > NOTE: The latest stable release of pep8 (1.4.5) has a bug which breaks the > ignore-file functionality. If you use this version you can expect to see a > bunch of errors from the files in sparsetools. The current master branch of > pep8 on github has this issue fixed, so it is recommended to use this > version. > > If everything is correctly installed and there are no pep8 errors you > should see the following behaviour: > > ~/src/scipy$ pep8 scipy > ~/src/scipy$ > > Cheers, > > Tim > Great work, Tim, thanks! Warren From blake.a.griffith at gmail.com Tue May 7 23:25:09 2013 From: blake.a.griffith at gmail.com (Blake Griffith) Date: Tue, 7 May 2013 22:25:09 -0500 Subject: [SciPy-Dev] GSoC Proposal draft -- Improvements to the sparse package of Scipy: support for bool dtype and better interaction with Numpy In-Reply-To: References: Message-ID: On Tue, Apr 30, 2013 at 2:02 PM, Pauli Virtanen wrote: > Hi, > > 27.04.2013 10:17, Blake Griffith kirjoitti: > [clip] > > https://github.com/cowlicks/GSoC-proposal/blob/master/proposal.markdown > > Comments: > > > - "adding new functions/methods" --- these would for a large part > not be new user-visible methods, right? __add__, __minus__, multiply, > etc. are already there, although their functionality is at the moment > more restricted than it should be, which is to be fixed. > > -- > Pauli Virtanen > > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev > I was considering adding user visible methods that would have similar behavior to existing ufuncs but for sparse matricies. Although I have not looked at exactly what could be added. Although improving the functionality of __add__, __minus__, etc is probably more useful. -------------- next part -------------- An HTML attachment was scrubbed... URL: From pierre.haessig at crans.org Fri May 10 06:31:06 2013 From: pierre.haessig at crans.org (Pierre Haessig) Date: Fri, 10 May 2013 12:31:06 +0200 Subject: [SciPy-Dev] Multilinear interpolation In-Reply-To: <5187D32A.50004@gmail.com> References: <51260E4E.4010405@gmail.com> <51263127.7070306@gmail.com> <512B533D.1000105@gmail.com> <517990E8.2040206@crans.org> <5187D32A.50004@gmail.com> Message-ID: <518CCC6A.2000506@crans.org> Hi Pablo, Thank you very much for the feedback. I'll take the interpolation from "dolo" then. My work on Stochastic Dynamic Programming is paused for the coming 2-3 weeks but I'll dig again in interpolation routines once I come back to it. I'll certainly do a few more performance tests. best, Pierre -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 900 bytes Desc: OpenPGP digital signature URL: From pav at iki.fi Fri May 10 09:33:24 2013 From: pav at iki.fi (Pauli Virtanen) Date: Fri, 10 May 2013 16:33:24 +0300 Subject: [SciPy-Dev] New scipy.org site Message-ID: Dear all, We are working on migrating the front of http://scipy.org away from the wiki to a static site, which also should address the performance problems the site has been encountering recently. Please take a look at what is to come: http://scipy.github.io/ The content currently on the wiki will not go away, of course. If you have any comments, please reply on scipy-dev. Thanks, -- Pauli Virtanen From robert.kern at gmail.com Fri May 10 09:45:24 2013 From: robert.kern at gmail.com (Robert Kern) Date: Fri, 10 May 2013 14:45:24 +0100 Subject: [SciPy-Dev] New scipy.org site In-Reply-To: References: Message-ID: On Fri, May 10, 2013 at 2:33 PM, Pauli Virtanen wrote: > Dear all, > > We are working on migrating the front of http://scipy.org away from the > wiki to a static site, which also should address the performance > problems the site has been encountering recently. Having disabled new logins and deleting all of the spammy users, the performance issues should now basically be resolved. ;-) The static site will have a better collaboration model through Github PRs, of course. How is the conversion process going? Are there people working on migrating the Topical Software and Cookbook pages, yet? -- Robert Kern From lists at hilboll.de Fri May 10 09:50:02 2013 From: lists at hilboll.de (Andreas Hilboll) Date: Fri, 10 May 2013 15:50:02 +0200 Subject: [SciPy-Dev] New scipy.org site In-Reply-To: References: Message-ID: <518CFB0A.5080704@hilboll.de> On 10.05.2013 15:45, Robert Kern wrote: > On Fri, May 10, 2013 at 2:33 PM, Pauli Virtanen wrote: >> Dear all, >> >> We are working on migrating the front of http://scipy.org away from the >> wiki to a static site, which also should address the performance >> problems the site has been encountering recently. > > Having disabled new logins and deleting all of the spammy users, the > performance issues should now basically be resolved. ;-) > > The static site will have a better collaboration model through Github > PRs, of course. How is the conversion process going? Are there people > working on migrating the Topical Software and Cookbook pages, yet? I took a look into the Cookbook and noticed that there's already quite a bit in the new repo. Can anyone say something about why only parts have been converted (probably: time), or rather how the pages which already are converted were chosen? Before I potentially start working on this, I'd like to be sure that there hasn't been any quality screening / whatever other screening involved. -- Andreas. From robert.kern at gmail.com Fri May 10 09:55:03 2013 From: robert.kern at gmail.com (Robert Kern) Date: Fri, 10 May 2013 14:55:03 +0100 Subject: [SciPy-Dev] New scipy.org site In-Reply-To: <518CFB0A.5080704@hilboll.de> References: <518CFB0A.5080704@hilboll.de> Message-ID: On Fri, May 10, 2013 at 2:50 PM, Andreas Hilboll wrote: > On 10.05.2013 15:45, Robert Kern wrote: >> On Fri, May 10, 2013 at 2:33 PM, Pauli Virtanen wrote: >>> Dear all, >>> >>> We are working on migrating the front of http://scipy.org away from the >>> wiki to a static site, which also should address the performance >>> problems the site has been encountering recently. >> >> Having disabled new logins and deleting all of the spammy users, the >> performance issues should now basically be resolved. ;-) >> >> The static site will have a better collaboration model through Github >> PRs, of course. How is the conversion process going? Are there people >> working on migrating the Topical Software and Cookbook pages, yet? > > I took a look into the Cookbook and noticed that there's already quite a > bit in the new repo. There is? I see the Topical Software page has been converted, but none of the Cookbook. https://github.com/scipy/scipy.org-new/tree/master/www -- Robert Kern From lists at hilboll.de Fri May 10 10:01:13 2013 From: lists at hilboll.de (Andreas Hilboll) Date: Fri, 10 May 2013 16:01:13 +0200 Subject: [SciPy-Dev] New scipy.org site In-Reply-To: References: <518CFB0A.5080704@hilboll.de> Message-ID: <518CFDA9.7000506@hilboll.de> On 10.05.2013 15:55, Robert Kern wrote: > On Fri, May 10, 2013 at 2:50 PM, Andreas Hilboll wrote: >> On 10.05.2013 15:45, Robert Kern wrote: >>> On Fri, May 10, 2013 at 2:33 PM, Pauli Virtanen wrote: >>>> Dear all, >>>> >>>> We are working on migrating the front of http://scipy.org away from the >>>> wiki to a static site, which also should address the performance >>>> problems the site has been encountering recently. >>> >>> Having disabled new logins and deleting all of the spammy users, the >>> performance issues should now basically be resolved. ;-) >>> >>> The static site will have a better collaboration model through Github >>> PRs, of course. How is the conversion process going? Are there people >>> working on migrating the Topical Software and Cookbook pages, yet? >> >> I took a look into the Cookbook and noticed that there's already quite a >> bit in the new repo. > > There is? I see the Topical Software page has been converted, but none > of the Cookbook. > > https://github.com/scipy/scipy.org-new/tree/master/www > > -- > Robert Kern > My fault. Apparently, I mixed up the two, and when I looked the topical software was only partly converted. Thanks to Pauli, this is now done. Sorry for the noise. -- Andreas. From pav at iki.fi Fri May 10 10:08:10 2013 From: pav at iki.fi (Pauli Virtanen) Date: Fri, 10 May 2013 17:08:10 +0300 Subject: [SciPy-Dev] New scipy.org site In-Reply-To: References: Message-ID: 10.05.2013 16:45, Robert Kern kirjoitti: [clip] > The static site will have a better collaboration model through Github > PRs, of course. How is the conversion process going? Are there people > working on migrating the Topical Software and Cookbook pages, yet? Topical Software is done. I don't know if there's active work on Cookbook, and it is not clear that a static site is the best solution for that. However, this does not block the migration. For now, we can let the Cookbook live on the wiki and figure out what to do with it afterward. Anyway, tools: https://github.com/pv/moin2rst (fixed version of https://github.com/dwf/moin2rst for Moin 1.9) -- Pauli Virtanen From lists at hilboll.de Fri May 10 10:21:04 2013 From: lists at hilboll.de (Andreas Hilboll) Date: Fri, 10 May 2013 16:21:04 +0200 Subject: [SciPy-Dev] New scipy.org site In-Reply-To: References: Message-ID: <518D0250.8050707@hilboll.de> On 10.05.2013 16:08, Pauli Virtanen wrote: > 10.05.2013 16:45, Robert Kern kirjoitti: > [clip] >> The static site will have a better collaboration model through Github >> PRs, of course. How is the conversion process going? Are there people >> working on migrating the Topical Software and Cookbook pages, yet? > > Topical Software is done. > > I don't know if there's active work on Cookbook, and it is not clear > that a static site is the best solution for that. IMO, it would be ideal if the cookbook examples were converted to notebooks which are then put on Scipy-Central. > However, this does not block the migration. For now, we can let the > Cookbook live on the wiki and figure out what to do with it afterward. +1 > Anyway, tools: > > https://github.com/pv/moin2rst > > (fixed version of https://github.com/dwf/moin2rst for Moin 1.9) > -- Andreas From robert.kern at gmail.com Fri May 10 10:39:44 2013 From: robert.kern at gmail.com (Robert Kern) Date: Fri, 10 May 2013 15:39:44 +0100 Subject: [SciPy-Dev] New scipy.org site In-Reply-To: References: Message-ID: On Fri, May 10, 2013 at 3:08 PM, Pauli Virtanen wrote: > 10.05.2013 16:45, Robert Kern kirjoitti: > [clip] >> The static site will have a better collaboration model through Github >> PRs, of course. How is the conversion process going? Are there people >> working on migrating the Topical Software and Cookbook pages, yet? > > Topical Software is done. > > I don't know if there's active work on Cookbook, and it is not clear > that a static site is the best solution for that. > > However, this does not block the migration. For now, we can let the > Cookbook live on the wiki and figure out what to do with it afterward. The wiki is currently essentially a static site (at least in terms of drawbacks, not as many of the benefits). For those who want to work on it, here is the latest dump of all of the wiki pages: https://www.dropbox.com/s/8p5q2ijol7zet2d/pages.tar.bz2 -- Robert Kern From suryak at ieee.org Fri May 10 11:13:37 2013 From: suryak at ieee.org (Surya Kasturi) Date: Fri, 10 May 2013 20:43:37 +0530 Subject: [SciPy-Dev] New scipy.org site In-Reply-To: <518D0250.8050707@hilboll.de> References: <518D0250.8050707@hilboll.de> Message-ID: On Fri, May 10, 2013 at 7:51 PM, Andreas Hilboll wrote: > On 10.05.2013 16:08, Pauli Virtanen wrote: > > 10.05.2013 16:45, Robert Kern kirjoitti: > > [clip] > >> The static site will have a better collaboration model through Github > >> PRs, of course. How is the conversion process going? Are there people > >> working on migrating the Topical Software and Cookbook pages, yet? > > > > Topical Software is done. > > > > I don't know if there's active work on Cookbook, and it is not clear > > that a static site is the best solution for that. > > IMO, it would be ideal if the cookbook examples were converted to > notebooks which are then put on Scipy-Central. > > > However, this does not block the migration. For now, we can let the > > Cookbook live on the wiki and figure out what to do with it afterward. > > +1 > > > Anyway, tools: > > > > https://github.com/pv/moin2rst > > > > (fixed version of https://github.com/dwf/moin2rst for Moin 1.9) > > > > -- Andreas > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev > Hey, I just went through the new site.. and I see that bootstrap has been used. Actually, do we need bootstrap.min.js script? I don't think so. Also, there is some unessential js at the bottom of the page too. If you guys agree, can I look into the html and do some clean up if required (most of the html doesn't need though). Thanks Surya -------------- next part -------------- An HTML attachment was scrubbed... URL: From pav at iki.fi Fri May 10 11:31:47 2013 From: pav at iki.fi (Pauli Virtanen) Date: Fri, 10 May 2013 18:31:47 +0300 Subject: [SciPy-Dev] New scipy.org site In-Reply-To: References: <518D0250.8050707@hilboll.de> Message-ID: Hi Surya, 10.05.2013 18:13, Surya Kasturi kirjoitti: [clip] > Hey, I just went through the new site.. and I see that bootstrap has > been used. > > Actually, do we need bootstrap.min.js script? I don't think so. Also, > there is some unessential js at the bottom of the page too. > > If you guys agree, can I look into the html and do some clean up if > required (most of the html doesn't need though). It's actually based on your Scipy Central layout, many thanks! :) If you want to make some changes to the styling, the place to do it is here: https://github.com/scipy/scipy-sphinx-theme Namely, the site is Sphinx-generated, and we want to abstract the layout of the site to a Sphinx theme, so that we can easily reuse it in the documentation etc. Best, Pauli From warren.weckesser at gmail.com Fri May 10 17:25:14 2013 From: warren.weckesser at gmail.com (Warren Weckesser) Date: Fri, 10 May 2013 17:25:14 -0400 Subject: [SciPy-Dev] scipy github slow? Message-ID: Has anyone else noticed github being very slow this afternoon? Maybe it's the fault of Starbucks' wifi, but other sites don't seem particularly slow. Warren -------------- next part -------------- An HTML attachment was scrubbed... URL: From warren.weckesser at gmail.com Fri May 10 17:43:11 2013 From: warren.weckesser at gmail.com (Warren Weckesser) Date: Fri, 10 May 2013 17:43:11 -0400 Subject: [SciPy-Dev] scipy github slow? In-Reply-To: References: Message-ID: On Fri, May 10, 2013 at 5:25 PM, Warren Weckesser < warren.weckesser at gmail.com> wrote: > Has anyone else noticed github being very slow this afternoon? Maybe it's > the fault of Starbucks' wifi, but other sites don't seem particularly slow. > Of course, after sending an email, it starts responding reasonably fast. Well, never mind then. Warren > > Warren > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Fri May 10 18:11:57 2013 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri, 10 May 2013 16:11:57 -0600 Subject: [SciPy-Dev] scipy github slow? In-Reply-To: References:

Message-ID: On Fri, May 10, 2013 at 3:43 PM, Warren Weckesser < warren.weckesser at gmail.com> wrote: > > On Fri, May 10, 2013 at 5:25 PM, Warren Weckesser < > warren.weckesser at gmail.com> wrote: > >> Has anyone else noticed github being very slow this afternoon? Maybe >> it's the fault of Starbucks' wifi, but other sites don't seem particularly >> slow. >> > > > Of course, after sending an email, it starts responding reasonably fast. > Well, never mind then. > > Too much coffee can make things seem slow ;) Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Sat May 11 08:54:31 2013 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Sat, 11 May 2013 08:54:31 -0400 Subject: [SciPy-Dev] issue comments out of order Message-ID: Just an observation (too late to do anything about it) Some comments to trac tickets are in a different sequence in github issues https://github.com/scipy/scipy/issues/2108 http://projects.scipy.org/scipy/ticket/1583 But maybe that's a special case, because the trac ticket comments might have been simultaneously edited during a time when trac was very slow. Josef From vanforeest at gmail.com Sat May 11 16:20:27 2013 From: vanforeest at gmail.com (nicky van foreest) Date: Sat, 11 May 2013 22:20:27 +0200 Subject: [SciPy-Dev] doc string of brentq Message-ID: Hi, Perhaps I am just checking an old documentation site... Anyway, on this page: http://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.brentq.html#scipy.optimize.brentq the argument rtol is mentioned as an valid argument to brenqt, but it is not explained on the page. Is this a bug? Nicky -------------- next part -------------- An HTML attachment was scrubbed... URL: From charlesr.harris at gmail.com Sat May 11 18:31:33 2013 From: charlesr.harris at gmail.com (Charles R Harris) Date: Sat, 11 May 2013 16:31:33 -0600 Subject: [SciPy-Dev] doc string of brentq In-Reply-To: References: Message-ID: On Sat, May 11, 2013 at 2:20 PM, nicky van foreest wrote: > Hi, > > Perhaps I am just checking an old documentation site... Anyway, on this > page: > > > http://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.brentq.html#scipy.optimize.brentq > > the argument rtol is mentioned as an valid argument to brenqt, but it is > not explained on the page. Is this a bug? > > See PR #2462 Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From vanforeest at gmail.com Sun May 12 15:53:42 2013 From: vanforeest at gmail.com (nicky van foreest) Date: Sun, 12 May 2013 21:53:42 +0200 Subject: [SciPy-Dev] doc string of brentq In-Reply-To: References: Message-ID: Thanks. On 12 May 2013 00:31, Charles R Harris wrote: > > > On Sat, May 11, 2013 at 2:20 PM, nicky van foreest wrote: > >> Hi, >> >> Perhaps I am just checking an old documentation site... Anyway, on this >> page: >> >> >> http://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.brentq.html#scipy.optimize.brentq >> >> the argument rtol is mentioned as an valid argument to brenqt, but it is >> not explained on the page. Is this a bug? >> >> > See PR #2462 > > Chuck > > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From thouis at gmail.com Mon May 13 10:11:16 2013 From: thouis at gmail.com (Thouis (Ray) Jones) Date: Mon, 13 May 2013 10:11:16 -0400 Subject: [SciPy-Dev] issues trac migration review In-Reply-To: References:

Message-ID: On Mon, Apr 22, 2013 at 4:35 PM, Pauli Virtanen wrote: > 22.04.2013 22:42, Skipper Seabold kirjoitti: > [clip] >> As for new data submissions...well...open issue I guess. Github >> accepts image attachments and posts them to their server on amazon. I >> assume this functionality will be extended at some point. They've been >> talking about it for 4 years now. We could instruct people to use >> e-mail attachments and post them to the list then link to the message >> in the issue? > > E-mail has the problem that it spams N+1 people who are not interested > in the potentially big data file, plus it's a hassle to subscribe to the > list so that you can post. > > It probably needs to be something simple, otherwise what happens is > dropbox/mediafire/ad-supported-upload-service-of-your-choice which do > not necessarily have a very long lifetimes. > > A third-party web app can use Github for authentication: > http://developer.github.com/v3/oauth/ > But nobody seems to have written so far (or at least I can't find) > something that does this. The CellProfiler project uses exactly this approach for handling attachments to issues: http://cellprofiler.org/issues/ The code behind the page is here: http://www.broadinstitute.org/~ljosa/issues-with-attachments/ Ray Jones From pav at iki.fi Wed May 15 08:40:48 2013 From: pav at iki.fi (Pauli Virtanen) Date: Wed, 15 May 2013 12:40:48 +0000 (UTC) Subject: [SciPy-Dev] New scipy.org site References: <518D0250.8050707@hilboll.de> Message-ID: Andreas Hilboll hilboll.de> writes: > On 10.05.2013 16:08, Pauli Virtanen wrote: [clip] > > I don't know if there's active work on Cookbook, and it is not clear > > that a static site is the best solution for that. > > IMO, it would be ideal if the cookbook examples were converted to > notebooks which are then put on Scipy-Central. That would be useful. If someone wants to give a hand, dumping the Cookbook pages at http://scipy.org/Cookbook to IPython notebooks is something that probably won't go to waste in any case. This format could be better than e.g. RST. -- Pauli Virtanen From thouis at gmail.com Wed May 15 09:40:22 2013 From: thouis at gmail.com (Thouis (Ray) Jones) Date: Wed, 15 May 2013 09:40:22 -0400 Subject: [SciPy-Dev] issues trac migration review In-Reply-To: References:

Message-ID: On Mon, May 13, 2013 at 10:11 AM, Thouis (Ray) Jones wrote: > The CellProfiler project uses exactly this approach for handling > attachments to issues: > http://cellprofiler.org/issues/ > > The code behind the page is here: > http://www.broadinstitute.org/~ljosa/issues-with-attachments/ > > Ray Jones Now on github: https://github.com/ljosa/issues-with-attachments From orion at cora.nwra.com Wed May 15 23:24:12 2013 From: orion at cora.nwra.com (Orion Poplawski) Date: Wed, 15 May 2013 21:24:12 -0600 Subject: [SciPy-Dev] Unbundling arpack Message-ID: <5194515C.5080807@cora.nwra.com> I'm starting to take a look at the bundled libraries in scipy from the Fedora package, starting with arpack and qhull. Are there any mechanisms in place at the moment for building scipy with the system versions of these libraries? I don't see any. I'm attaching my first attempt at a patch to allow this by specifying options in the site.cfg file. But it appears that get_info only works with certain pre-defined strings. Is that right? This seems pretty cumbersome. Can anyone help me with this? Let me know if I'm on the right track or not with this. Thanks! PS - If anyone can think of other libraries bundled with scipy, please let me know. -- Orion Poplawski Technical Manager 303-415-9701 x222 NWRA/CoRA Division FAX: 303-415-9702 3380 Mitchell Lane orion at cora.nwra.com Boulder, CO 80301 http://www.cora.nwra.com -------------- next part -------------- A non-text attachment was scrubbed... Name: scipy-arpack.patch Type: text/x-patch Size: 2855 bytes Desc: not available URL: From jakevdp at cs.washington.edu Thu May 16 00:46:36 2013 From: jakevdp at cs.washington.edu (Jacob Vanderplas) Date: Wed, 15 May 2013 21:46:36 -0700 Subject: [SciPy-Dev] Unbundling arpack In-Reply-To: <5194515C.5080807@cora.nwra.com> References: <5194515C.5080807@cora.nwra.com> Message-ID: Hi Orion, One potential issue is that there were some small patches required for ARPACK that we've made within the version bundled with scipy. Since there is no official ARPACK repository anywhere, it's unclear whether these bugs have been fixed in the other versions available out there. If you dig into the scipy source, you'll see that because of these bugs we actually prepend "scipy" to the ARPACK install to make sure that scipy *won't* use the system version if it's available. Jake On Wed, May 15, 2013 at 8:24 PM, Orion Poplawski wrote: > I'm starting to take a look at the bundled libraries in scipy from the > Fedora package, starting with arpack and qhull. Are there any mechanisms > in place at the moment for building scipy with the system versions of these > libraries? I don't see any. > > I'm attaching my first attempt at a patch to allow this by specifying > options in the site.cfg file. But it appears that get_info only works with > certain pre-defined strings. Is that right? This seems pretty cumbersome. > > Can anyone help me with this? Let me know if I'm on the right track or > not with this. > > Thanks! > > PS - If anyone can think of other libraries bundled with scipy, please let > me know. > > -- > Orion Poplawski > Technical Manager 303-415-9701 x222 > NWRA/CoRA Division FAX: 303-415-9702 > 3380 Mitchell Lane orion at cora.nwra.com > Boulder, CO 80301 http://www.cora.nwra.com > > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From cournape at gmail.com Thu May 16 03:18:48 2013 From: cournape at gmail.com (David Cournapeau) Date: Thu, 16 May 2013 08:18:48 +0100 Subject: [SciPy-Dev] Unbundling arpack In-Reply-To: <5194515C.5080807@cora.nwra.com> References: <5194515C.5080807@cora.nwra.com> Message-ID: On Thu, May 16, 2013 at 4:24 AM, Orion Poplawski wrote: > I'm starting to take a look at the bundled libraries in scipy from the > Fedora package, starting with arpack and qhull. Are there any mechanisms in > place at the moment for building scipy with the system versions of these > libraries? I don't see any. Unfortunately, most the bundled libraries are not proper libraries as we are used to in open source, but are really just a bunch of files with no real notion of upstream (no versioning, etc...). In practice, unbundling them would be quite tedious for that reason alone. They also tend to be quite buggy, and because there is no real 'upstream' here in most cases, we just patch our own version. I don't think we have the bandwidth to actually take care of those separately and 'become' the upstream. That's the case for arpack, but also fftpack, quadpack, specfunc, etc... David From matti.pastell at helsinki.fi Thu May 16 04:13:52 2013 From: matti.pastell at helsinki.fi (Matti Pastell) Date: Thu, 16 May 2013 11:13:52 +0300 Subject: [SciPy-Dev] New scipy.org site Message-ID: <51949540.7040404@helsinki.fi> Hello all, I did some work in converting the Cookbook to Sphinx: I scraped the Cookbook from the archive Robert sent to the list earlier and used a couple of scripts to make the conversion, it's not perfect but not that bad either. You can see the results at: http://mpastell.github.io/ I put the various stages of conversion to GitHub with a brief explanation: https://github.com/mpastell/SciPy-CookBook Conversion scripts currently require a forked version of Pweave http://mpastell.com/pweave (I'm the author) that only I have, but of course I can share it. I'd like to get some feedback before I do any more work and I think a lot of the pages will require a bit of manual editing in the end. Also what do you think about using Pweave for the Cookbook? It can be used to capture figures and code and can be intgrated to sphinx build process. An example (http://mpastell.com/pweave/examples.html). Matti From pav at iki.fi Thu May 16 04:37:28 2013 From: pav at iki.fi (Pauli Virtanen) Date: Thu, 16 May 2013 08:37:28 +0000 (UTC) Subject: [SciPy-Dev] Unbundling arpack References: <5194515C.5080807@cora.nwra.com> Message-ID: Jacob Vanderplas cs.washington.edu> writes: > One potential issue is that there were some small patches required > for ARPACK that we've made within the version bundled with scipy. > Since there is no official ARPACK repository anywhere, it's unclear > whether these bugs have been fixed in the other versions available out > there. ?If you dig into the scipy source, you'll see that because of > these bugs we actually prepend "scipy" to the ARPACK install to make > sure that scipy *won't* use the system version if it's available. We are using almost unmodified arpack-ng, so if the system version is also that, things should be fine. I wrote in README.scipy under scipy/sparse/linalg/eigen/arpack/ARPACK: """ This directory contains a bundled version of ARPACK-NG 3.1.2, http://forge.scilab.org/index.php/p/arpack-ng/ NOTE FOR VENDORS: it is in general safe to use a system version of ARPACK instead. Note, however, that earlier versions of ARPACK and ARPACK-NG had certain bugs, so using those over the bundled version is not recommended. The bundled version has the following patches applied: (i) Replacing calls to certain Fortran functions with wrapper functions, to avoid various ABI mismatches on OSX. These changes are made with the following command: perl -pi -e ' s@\bsdot\b at wsdot@g; s@\bcdotc\b at wcdotc@g; s@\bzdotc\b at wzdotc@g; s@\bcdotu\b at wcdotu@g; s@\bzdotu\b at wzdotu@g; s@\bcladiv\b at wcladiv@g; s@\bzladiv\b at wzladiv@g; ' \ SRC/*.f UTIL/*.f (ii) Using a UTIL/second_cputime.f which calls the Fortran intrinsic CPU_TIME function instead of ETIME. """ System Qhull is also fine --- however, it must be compiled with QHpointer enabled. Qhull will abort() on initialization if this is not the case, so it should be easy to spot this. However, system SuperLU is *not* fine --- the upstream contains some crash-bugs that we fixed (and TBH should send the patches to upstream). Also, more importantly: the SuperLU error reporting mechanism is to abort on error, which can be overridden at compile time to do something more sensible. We make use of this, but this is not possible without bundling SuperLU. Pauli From pav at iki.fi Thu May 16 04:49:36 2013 From: pav at iki.fi (Pauli Virtanen) Date: Thu, 16 May 2013 08:49:36 +0000 (UTC) Subject: [SciPy-Dev] Unbundling arpack References: <5194515C.5080807@cora.nwra.com> Message-ID: David Cournapeau gmail.com> writes: [clip] > That's the case for arpack, but also fftpack, quadpack, specfunc, etc... ARPACK has nowadays a semi-alive fork with patches pooled from Scilab, Debian, Octave, and also from us: http://forge.scilab.org/index.php/p/arpack-ng/ The bundled Qhull is not modified. The bundled SuperLU has small modifications, and cannot be unbundled due to its error reporting mechanism. Unbundling anything under scipy.special is out of the question --- with the exception of AMOS, the original libraries are too buggy. I'm not sure what the patch situation is with fftpack, quadpack, fitpack, and the others. -- Pauli Virtanen From blake.a.griffith at gmail.com Fri May 17 00:45:46 2013 From: blake.a.griffith at gmail.com (Blake Griffith) Date: Thu, 16 May 2013 23:45:46 -0500 Subject: [SciPy-Dev] Sparse Boolean Specification Message-ID: Hello SciPy, I've been writing up how I think adding support for boolean operations and the bool dtype should work. You can read the document on my GitHub, here https://github.com/cowlicks/scipy-sparse-boolean-spec/blob/master/spec.rst I'd love some feedback. Thanks! -------------- next part -------------- An HTML attachment was scrubbed... URL: From matti.pastell at helsinki.fi Fri May 17 03:05:13 2013 From: matti.pastell at helsinki.fi (Matti Pastell) Date: Fri, 17 May 2013 10:05:13 +0300 Subject: [SciPy-Dev] New scipy.org site In-Reply-To: <51949540.7040404@helsinki.fi> References: <51949540.7040404@helsinki.fi> Message-ID: <5195D6A9.9050901@helsinki.fi> Hi, I've added a conversion to IPython notebook script format (http://ipython.org/ipython-doc/stable/interactive/htmlnotebook.html) in ipython folder in my repository(https://github.com/mpastell/SciPy-CookBook). The script files can be imported to IPython notebook http://ipython.org/ipython-doc/stable/interactive/htmlnotebook.html#exporting-a-notebook-and-importing-existing-scripts. Let me know what you think, it would be nice to know what format will used for the Cookbook in the future before doing any more work. Best, Matti On 16.5.2013 11:13, Matti Pastell wrote: > > Hello all, > I did some work in converting the Cookbook to Sphinx: I scraped the > Cookbook from the archive Robert sent to the list earlier and used a > couple of scripts to make the conversion, it's not perfect but not that > bad either. > > You can see the results at: http://mpastell.github.io/ > > I put the various stages of conversion to GitHub with a brief explanation: > https://github.com/mpastell/SciPy-CookBook > > Conversion scripts currently require a forked version of Pweave > http://mpastell.com/pweave (I'm the author) that only I have, but of > course I can share it. > > I'd like to get some feedback before I do any more work and I think a > lot of the pages will require a bit of manual editing in the end. > > Also what do you think about using Pweave for the Cookbook? It can be > used to capture figures and code and can be intgrated to sphinx build > process. An example (http://mpastell.com/pweave/examples.html). > > Matti > > > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev > -- Matti Pastell, Dr. Agric. & For. University lecturer, Livestock technology Department of Agricultural Sciences P.O Box 28, FI-00014 University of Helsinki, Finland Tel. +358504150612 http://mpastell.com From matthew.brett at gmail.com Sat May 18 03:11:09 2013 From: matthew.brett at gmail.com (Matthew Brett) Date: Sat, 18 May 2013 00:11:09 -0700 Subject: [SciPy-Dev] Circular reference in interp1d causing memory errors Message-ID: Hi, In a class I'm teaching, Dan Bliss (Cc'ed) noticed that he was getting memory errors while running multiple interpolations on large matrices. After a little futzing, it came down to something like this: import numpy as np import scipy.interpolate as spi N_ITERS = 50 N = 1000 for i in range(N_ITERS): x = np.arange(N) y = np.empty((N, N)) interp = spi.interp1d(x, y) Running this through ``time`` for a snapshot of memory usage gives: /usr/bin/time --format='(%DK data %MK max)' python try_interpolators.py (0K data 410088K max) If I add a ``gc.collect()`` as the last line of the loop above I get: [mb312 at angela ~/tmp/dan_code]$ /usr/bin/time --format='(%DK data %MK max)' python try_interpolators.py (0K data 42396K max) - a ten-fold memory difference. Investigating scipy.interpolate.interp1d, it holds a reference to self, for the function it will call for the interpolation. Thus in the default case: interp = spi.interp1d(x, y) interp._call holds a reference to self via self._call_linear I think (but I'm happy to be corrected) that this is why the interp1d objects are not getting freed without explicit garbage collection. Applying the attached patch to break this cycle results in the original (no garbage collection) script giving: (scipy-dev)[mb312 at angela ~/tmp/dan_code]$ /usr/bin/time --format='(%DK data %MK max)' python try_interpolators.py (0K data 43648K max) Is this the right analysis? Is this the right patch? Cheers, Matthew -------------- next part -------------- A non-text attachment was scrubbed... Name: interp1d.patch Type: application/octet-stream Size: 1394 bytes Desc: not available URL: From pav at iki.fi Sat May 18 05:20:29 2013 From: pav at iki.fi (Pauli Virtanen) Date: Sat, 18 May 2013 12:20:29 +0300 Subject: [SciPy-Dev] Circular reference in interp1d causing memory errors In-Reply-To: References: Message-ID: 18.05.2013 10:11, Matthew Brett kirjoitti: [clip] > I think (but I'm happy to be corrected) that this is why the interp1d > objects are not getting freed without explicit garbage collection. They will get freed, but the Python gc only counts the number of allocations, not their size. If you can change the code so that it doesn't rely on the cyclic gc, that's OK. -- Pauli Virtanen From robert.kern at gmail.com Sat May 18 05:23:11 2013 From: robert.kern at gmail.com (Robert Kern) Date: Sat, 18 May 2013 10:23:11 +0100 Subject: [SciPy-Dev] Circular reference in interp1d causing memory errors In-Reply-To: References: Message-ID: On Sat, May 18, 2013 at 8:11 AM, Matthew Brett wrote: > Is this the right analysis? Is this the right patch? Sounds about right. The patch needs a comment explaining why, though, because the current code is a common pattern. -- Robert Kern From robert.kern at gmail.com Sat May 18 05:23:11 2013 From: robert.kern at gmail.com (Robert Kern) Date: Sat, 18 May 2013 10:23:11 +0100 Subject: [SciPy-Dev] Circular reference in interp1d causing memory errors In-Reply-To: References: Message-ID: On Sat, May 18, 2013 at 8:11 AM, Matthew Brett wrote: > Is this the right analysis? Is this the right patch? Sounds about right. The patch needs a comment explaining why, though, because the current code is a common pattern. -- Robert Kern From pav at iki.fi Sat May 18 05:31:44 2013 From: pav at iki.fi (Pauli Virtanen) Date: Sat, 18 May 2013 12:31:44 +0300 Subject: [SciPy-Dev] Sparse Boolean Specification In-Reply-To: References: <5196731F.3000103@iki.fi> Message-ID: <51974A80.7080002@iki.fi> 18.05.2013 05:23, Blake Griffith kirjoitti: > Pauli, > Previously you said: > > - The numerical type used for booleans internally inside sparsetools > should be int8/char (the builtin C++ bool type can be bigger than int8) [clip] Yes. int8 is also the data type of numpy.bool_ and the sparsetools data type should correspond to this. -- Pauli Virtanen From josef.pktd at gmail.com Sat May 18 07:14:54 2013 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Sat, 18 May 2013 07:14:54 -0400 Subject: [SciPy-Dev] Circular reference in interp1d causing memory errors In-Reply-To: References: Message-ID: On Sat, May 18, 2013 at 3:11 AM, Matthew Brett wrote: > Hi, > > In a class I'm teaching, Dan Bliss (Cc'ed) noticed that he was getting > memory errors while running multiple interpolations on large matrices. > > After a little futzing, it came down to something like this: > > > > import numpy as np > > import scipy.interpolate as spi > > N_ITERS = 50 > N = 1000 > > for i in range(N_ITERS): > x = np.arange(N) > y = np.empty((N, N)) > interp = spi.interp1d(x, y) > > > > Running this through ``time`` for a snapshot of memory usage gives: > > /usr/bin/time --format='(%DK data %MK max)' python try_interpolators.py > (0K data 410088K max) > > If I add a ``gc.collect()`` as the last line of the loop above I get: > > [mb312 at angela ~/tmp/dan_code]$ /usr/bin/time --format='(%DK data %MK > max)' python try_interpolators.py > (0K data 42396K max) > > - a ten-fold memory difference. > > Investigating scipy.interpolate.interp1d, it holds a reference to > self, for the function it will call for the interpolation. Thus in > the default case: > > interp = spi.interp1d(x, y) > > interp._call holds a reference to self via self._call_linear more general question (I think patch is fine) interp._call is an instance method assigned in the __init__ This is a pattern we use quite a few times (stats.distribution does it a lot, statsmodels in some models). I know we have problems with pickling in these cases, but didn't know about the garbage collection. I didn't find much explicit information after googling for an hour. http://stackoverflow.com/questions/9329232/internal-reference-prevents-garbage-collection looks similar but doesn't use create instance methods. The solution with calling through the class might be interesting in some cases. I found a few more threads on various mailing lists but never a clear answer or link. Does anyone know a more explicit explanation? Or, is it really just the cyclical reference, even though it's a method (which always hold self) searching for "python pickle instance method" gives a lot more direct information on that problem. Thanks, Josef > > I think (but I'm happy to be corrected) that this is why the interp1d > objects are not getting freed without explicit garbage collection. > > Applying the attached patch to break this cycle results in the > original (no garbage collection) script giving: > > (scipy-dev)[mb312 at angela ~/tmp/dan_code]$ /usr/bin/time --format='(%DK > data %MK max)' python try_interpolators.py > (0K data 43648K max) > > Is this the right analysis? Is this the right patch? > > Cheers, > > Matthew > > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev > From robert.kern at gmail.com Sat May 18 08:17:33 2013 From: robert.kern at gmail.com (Robert Kern) Date: Sat, 18 May 2013 13:17:33 +0100 Subject: [SciPy-Dev] Circular reference in interp1d causing memory errors In-Reply-To: References:

Message-ID: On Sat, May 18, 2013 at 12:14 PM, wrote: > interp._call is an instance method assigned in the __init__ > This is a pattern we use quite a few times (stats.distribution does it > a lot, statsmodels in some models). > > I know we have problems with pickling in these cases, but didn't know > about the garbage collection. > > I didn't find much explicit information after googling for an hour. > http://stackoverflow.com/questions/9329232/internal-reference-prevents-garbage-collection > looks similar but doesn't use create instance methods. Yes, this is the same issue, just storing the bound methods in a dict rather than directly as an attribute on the instance. > The solution > with calling through the class might be interesting in some cases. > I found a few more threads on various mailing lists but never a clear > answer or link. > > Does anyone know a more explicit explanation? > Or, is it really just the cyclical reference, even though it's a > method (which always hold self) The bound method is not created with the .im_self reference until it is requested. Until then, the *unbound* method lives on the class without a reference to any particular instance. [~] |1> from scipy.interpolate import interp1d [~] |2> f = interp1d(np.linspace(0,1), np.linspace(0,1)) [~] |3> f._call_linear is f._call_linear False Requesting the method creates a new bound method that refers to the instance. [~] |4> f._call_linear.im_self [~] |5> f Storing that newly created bound method as a new attribute on itself creates a cycle. -- Robert Kern From pav at iki.fi Sat May 18 08:24:19 2013 From: pav at iki.fi (Pauli Virtanen) Date: Sat, 18 May 2013 15:24:19 +0300 Subject: [SciPy-Dev] Circular reference in interp1d causing memory errors In-Reply-To: References:

Message-ID: 18.05.2013 14:14, josef.pktd at gmail.com kirjoitti: [clip] > Does anyone know a more explicit explanation? > Or, is it really just the cyclical reference, even though it's a > method (which always hold self) Python gc is capable of breaking most reference cycles automatically. The point however is that it does not run on every allocation. If you allocate big objects that are kept around by a reference cycle at a too rapid rate, you run into a memoryerror before the gc gets around to breaking the cycles. -- Pauli Virtanen From josef.pktd at gmail.com Sat May 18 09:14:23 2013 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Sat, 18 May 2013 09:14:23 -0400 Subject: [SciPy-Dev] Circular reference in interp1d causing memory errors In-Reply-To: References:

Message-ID: On Sat, May 18, 2013 at 8:17 AM, Robert Kern wrote: > On Sat, May 18, 2013 at 12:14 PM, wrote: > > > interp._call is an instance method assigned in the __init__ > > This is a pattern we use quite a few times (stats.distribution does it > > a lot, statsmodels in some models). > > > > I know we have problems with pickling in these cases, but didn't know > > about the garbage collection. > > > > I didn't find much explicit information after googling for an hour. > > > http://stackoverflow.com/questions/9329232/internal-reference-prevents-garbage-collection > > looks similar but doesn't use create instance methods. > > Yes, this is the same issue, just storing the bound methods in a dict > rather than directly as an attribute on the instance. > > > The solution > > with calling through the class might be interesting in some cases. > > I found a few more threads on various mailing lists but never a clear > > answer or link. > > > > Does anyone know a more explicit explanation? > > Or, is it really just the cyclical reference, even though it's a > > method (which always hold self) > > The bound method is not created with the .im_self reference until it > is requested. Until then, the *unbound* method lives on the class > without a reference to any particular instance. > > [~] > |1> from scipy.interpolate import interp1d > > [~] > |2> f = interp1d(np.linspace(0,1), np.linspace(0,1)) > > [~] > |3> f._call_linear is f._call_linear > False > > > Requesting the method creates a new bound method that refers to the > instance. > > [~] > |4> f._call_linear.im_self > > > [~] > |5> f > > > > Storing that newly created bound method as a new attribute on itself > creates a cycle. > Thank, took me a while but I think I understand now >>> f._call is f._call True >>> f._call_linear is f._call_linear False I didn't understand or looked at this part of the python documentation before (which I guess is the relevant part): "Note that the transformation from function object to (unbound or bound) method object happens each time the attribute is retrieved from the class or instance. In some cases, a fruitful optimization is to assign the attribute to a local variable and call that local variable. " Josef > > -- > Robert Kern > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev > -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Sat May 18 09:27:39 2013 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Sat, 18 May 2013 09:27:39 -0400 Subject: [SciPy-Dev] Circular reference in interp1d causing memory errors In-Reply-To: References:

Message-ID: On Sat, May 18, 2013 at 8:24 AM, Pauli Virtanen wrote: > 18.05.2013 14:14, josef.pktd at gmail.com kirjoitti: > [clip] > > Does anyone know a more explicit explanation? > > Or, is it really just the cyclical reference, even though it's a > > method (which always hold self) > > Python gc is capable of breaking most reference cycles automatically. > The point however is that it does not run on every allocation. If you > allocate big objects that are kept around by a reference cycle at a too > rapid rate, you run into a memoryerror before the gc gets around to > breaking the cycles. > That's the problem with applications with large numpy arrays, especially in loops, cyclic garbage collection doesn't kick in soon enough. We didn't realize this in statsmodels until a user pointed out that we have an (unnecessary) cyclical reference. scipy.stats.distributions are safe, there shouldn't be memory problems, since they reuse the instance and don't hold any data. But we need to check this in statsmodels, where we still use instance method assignment in the __init__ of one or three models. Thanks, Josef > > -- > Pauli Virtanen > > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev > -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Sat May 18 09:56:34 2013 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Sat, 18 May 2013 09:56:34 -0400 Subject: [SciPy-Dev] Circular reference in interp1d causing memory errors In-Reply-To: References:

Message-ID: On Sat, May 18, 2013 at 9:27 AM, wrote: > > > On Sat, May 18, 2013 at 8:24 AM, Pauli Virtanen wrote: > >> 18.05.2013 14:14, josef.pktd at gmail.com kirjoitti: >> [clip] >> > Does anyone know a more explicit explanation? >> > Or, is it really just the cyclical reference, even though it's a >> > method (which always hold self) >> >> Python gc is capable of breaking most reference cycles automatically. >> The point however is that it does not run on every allocation. If you >> allocate big objects that are kept around by a reference cycle at a too >> rapid rate, you run into a memoryerror before the gc gets around to >> breaking the cycles. >> > > That's the problem with applications with large numpy arrays, especially > in loops, cyclic garbage collection doesn't kick in soon enough. > > We didn't realize this in statsmodels until a user pointed out that we > have an (unnecessary) cyclical reference. > > scipy.stats.distributions are safe, there shouldn't be memory problems, > since they reuse the instance and don't hold any data. > But we need to check this in statsmodels, where we still use instance > method assignment in the __init__ of one or three models. > Follow-up question: Is there a way to check whether a class gets garbage collected by reference counting, and gets immediately collected when the reference count is zero? Something that can be used in unittests. I never went over the details of gc. Josef > > Thanks, > > Josef > > >> >> -- >> Pauli Virtanen >> >> _______________________________________________ >> SciPy-Dev mailing list >> SciPy-Dev at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-dev >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Sat May 18 13:28:46 2013 From: robert.kern at gmail.com (Robert Kern) Date: Sat, 18 May 2013 18:28:46 +0100 Subject: [SciPy-Dev] Circular reference in interp1d causing memory errors In-Reply-To: References:

Message-ID: On Sat, May 18, 2013 at 2:56 PM, wrote: > Follow-up question: > > Is there a way to check whether a class gets garbage collected by reference > counting, and gets immediately collected when the reference count is zero? > Something that can be used in unittests. Sure, you can use weakrefs to determine if something has been actually collected, as long as the object is weakrefable. Most of the objects you are concerned with testing will be, just not objects like tuples. http://docs.python.org/2/library/weakref#weakref.ref import gc import weakref # You will probably want to disable cyclical garbage collection. gc.disable() try: # Create the object you want to test. obj = ... # Create a weak reference to the object. ref = weakref.ref(obj) # Do stuff to the object that you want to test. do_stuff(obj) # Do whatever is necessary to dispose of the object in normal use, usually nothing. dispose_of_object(obj) # But you must at least delete it's name from the current namespace. del obj # Now see if the weak reference still can access it. assert ref() is None finally: # Restore garbage collection. gc.enable() -- Robert Kern From josef.pktd at gmail.com Sat May 18 14:39:10 2013 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Sat, 18 May 2013 14:39:10 -0400 Subject: [SciPy-Dev] Circular reference in interp1d causing memory errors In-Reply-To: References:

Message-ID: On Sat, May 18, 2013 at 1:28 PM, Robert Kern wrote: > > On Sat, May 18, 2013 at 2:56 PM, wrote: > > > Follow-up question: > > > > Is there a way to check whether a class gets garbage collected by reference > > counting, and gets immediately collected when the reference count is zero? > > Something that can be used in unittests. > > Sure, you can use weakrefs to determine if something has been actually > collected, as long as the object is weakrefable. Most of the objects > you are concerned with testing will be, just not objects like tuples. > > http://docs.python.org/2/library/weakref#weakref.ref > > > import gc > import weakref > > > # You will probably want to disable cyclical garbage collection. > gc.disable() > try: > # Create the object you want to test. > obj = ... > # Create a weak reference to the object. > ref = weakref.ref(obj) > # Do stuff to the object that you want to test. > do_stuff(obj) > # Do whatever is necessary to dispose of the object in normal use, > usually nothing. > dispose_of_object(obj) > # But you must at least delete it's name from the current namespace. > del obj > # Now see if the weak reference still can access it. > assert ref() is None > finally: > # Restore garbage collection. > gc.enable() > Thank you, I will try this out I got started with len(gc.get_objects()) which shows that a simple loop with 20 OLS instantiations leaves 320 uncollected objects for delayed garbage collection if there is a reference cycle. (and zero with the current OLS class) https://github.com/statsmodels/statsmodels/issues/839 the main parts of my script: --- import gc import numpy as np import statsmodels.api as sm def func(nrep): for i in range(nrep): res = sm.OLS(np.random.randn(100), np.vander(np.linspace(-1,1,100), 4)).fit(); #res.model.res = res def func_cycle(nrep): for i in range(nrep): res = sm.OLS(np.random.randn(100), np.vander(np.linspace(-1,1,100), 4)).fit(); res.model.res = res nrep = 20 g6 = len(gc.get_objects()) func(nrep) g6a = len(gc.get_objects()) func_cycle(nrep) g7 = len(gc.get_objects()) print g6a, g6a - g6, "OLS func" print g7, g7 - g6, "OLS func cycle" --- 106722 0 OLS func 107042 320 OLS func cycle -- Matthew, Thanks for raising the point, this thread was very informative for me. Josef > > > -- > Robert Kern > _______________________________________________ > SciPy-Dev mailing list > SciPy-Dev at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-dev From matthew.brett at gmail.com Sat May 18 16:13:08 2013 From: matthew.brett at gmail.com (Matthew Brett) Date: Sat, 18 May 2013 13:13:08 -0700 Subject: [SciPy-Dev] Circular reference in interp1d causing memory errors In-Reply-To: References:

Message-ID: Hi, On Sat, May 18, 2013 at 5:17 AM, Robert Kern wrote: > On Sat, May 18, 2013 at 12:14 PM, wrote: > >> interp._call is an instance method assigned in the __init__ >> This is a pattern we use quite a few times (stats.distribution does it >> a lot, statsmodels in some models). >> >> I know we have problems with pickling in these cases, but didn't know >> about the garbage collection. >> >> I didn't find much explicit information after googling for an hour. >> http://stackoverflow.com/questions/9329232/internal-reference-prevents-garbage-collection >> looks similar but doesn't use create instance methods. > > Yes, this is the same issue, just storing the bound methods in a dict > rather than directly as an attribute on the instance. > >> The solution >> with calling through the class might be interesting in some cases. >> I found a few more threads on various mailing lists but never a clear >> answer or link. >> >> Does anyone know a more explicit explanation? >> Or, is it really just the cyclical reference, even though it's a >> method (which always hold self) > > The bound method is not created with the .im_self reference until it > is requested. Until then, the *unbound* method lives on the class > without a reference to any particular instance. > > [~] > |1> from scipy.interpolate import interp1d > > [~] > |2> f = interp1d(np.linspace(0,1), np.linspace(0,1)) > > [~] > |3> f._call_linear is f._call_linear > False > > > Requesting the method creates a new bound method that refers to the instance. > > [~] > |4> f._call_linear.im_self > > > [~] > |5> f > Thanks - I didn't realize this - that each time I do this: obj.method - I get a new method object. Because I had to explain this to myself, and more or less repeating Robert's exposition: In [1]: class C(object): ...: def method(self): ...: pass ...: In [2]: c = C() In [3]: m1 = c.method In [4]: m2 = c.method In [5]: m1 Out[5]: > In [6]: m2 Out[6]: > In [7]: m1 is m2 Out[7]: False In [8]: id(m1) Out[8]: 26609456 In [9]: id(m2) Out[9]: 26609696 http://docs.python.org/2/reference/datamodel.html Thanks for the discussion, Matthew From matthew.brett at gmail.com Sat May 18 16:22:09 2013 From: matthew.brett at gmail.com (Matthew Brett) Date: Sat, 18 May 2013 13:22:09 -0700 Subject: [SciPy-Dev] Circular reference in interp1d causing memory errors In-Reply-To: References:

Message-ID: Hi, On Sat, May 18, 2013 at 2:23 AM, Robert Kern wrote: > On Sat, May 18, 2013 at 8:11 AM, Matthew Brett wrote: > >> Is this the right analysis? Is this the right patch? > > Sounds about right. The patch needs a comment explaining why, though, > because the current code is a common pattern. Yes - maybe it's easier to make that clear by using unbound references rather than doing it the crude way as in the previous patch. Cheers, Matthew -------------- next part -------------- A non-text attachment was scrubbed... Name: interp1d.patch.again Type: application/octet-stream Size: 1838 bytes Desc: not available URL: From matthew.brett at gmail.com Sat May 18 16:22:09 2013 From: matthew.brett at gmail.com (Matthew Brett) Date: Sat, 18 May 2013 13:22:09 -0700 Subject: [SciPy-Dev] Circular reference in interp1d causing memory errors In-Reply-To: References:

Message-ID: Hi, On Sat, May 18, 2013 at 2:23 AM, Robert Kern wrote: > On Sat, May 18, 2013 at 8:11 AM, Matthew Brett wrote: > >> Is this the right analysis? Is this the right patch? > > Sounds about right. The patch needs a comment explaining why, though, > because the current code is a common pattern. Yes - maybe it's easier to make that clear by using unbound references rather than doing it the crude way as in the previous patch. Cheers, Matthew -------------- next part -------------- A non-text attachment was scrubbed... Name: interp1d.patch.again Type: application/octet-stream Size: 1838 bytes Desc: not available URL: From matthew.brett at gmail.com Sat May 18 20:00:06 2013 From: matthew.brett at gmail.com (Matthew Brett) Date: Sat, 18 May 2013 17:00:06 -0700 Subject: [SciPy-Dev] Circular reference in interp1d causing memory errors In-Reply-To: References:

Message-ID: Hi, On Sat, May 18, 2013 at 1:22 PM, Matthew Brett wrote: > Hi, > > On Sat, May 18, 2013 at 2:23 AM, Robert Kern wrote: >> On Sat, May 18, 2013 at 8:11 AM, Matthew Brett wrote: >> >>> Is this the right analysis? Is this the right patch? >> >> Sounds about right. The patch needs a comment explaining why, though, >> because the current code is a common pattern. > > Yes - maybe it's easier to make that clear by using unbound references > rather than doing it the crude way as in the previous patch. Following on from Josef's and Robert's posts, I wrote a `check_refs_for` context manger that does this: >>> class C(object): pass >>> with check_refs_for(C) as c: ... # do something ... del c >>> class C(object): ... def __init__(self): ... self._circular = self # Make circular reference >>> with check_refs_for(C) as c: ... # do something ... del c Traceback (most recent call last): ... ReferenceError: Remaining reference(s) to object I was thinking of proposing it for scipy.misc in order to test my interpolate.py patch, but I wonder if it belongs in numpy.testing. What do y'all think? Cheers, Matthew From matthew.brett at gmail.com Sat May 18 20:00:06 2013 From: matthew.brett at gmail.com (Matthew Brett) Date: Sat, 18 May 2013 17:00:06 -0700 Subject: [SciPy-Dev] Circular reference in interp1d causing memory errors In-Reply-To: References:

Message-ID: Hi, On Sat, May 18, 2013 at 1:22 PM, Matthew Brett wrote: > Hi, > > On Sat, May 18, 2013 at 2:23 AM, Robert Kern wrote: >> On Sat, May 18, 2013 at 8:11 AM, Matthew Brett wrote: >> >>> Is this the right analysis? Is this the right patch? >> >> Sounds about right. The patch needs a comment explaining why, though, >> because the current code is a common pattern. > > Yes - maybe it's easier to make that clear by using unbound references > rather than doing it the crude way as in the previous patch. Following on from Josef's and Robert's posts, I wrote a `check_refs_for` context manger that does this: >>> class C(object): pass >>> with check_refs_for(C) as c: ... # do something ... del c >>> class C(object): ... def __init__(self): ... self._circular = self # Make circular reference >>> with check_refs_for(C) as c: ... # do something ... del c Traceback (most recent call last): ... ReferenceError: Remaining reference(s) to object I was thinking of proposing it for scipy.misc in order to test my interpolate.py patch, but I wonder if it belongs in numpy.testing. What do y'all think? Cheers, Matthew From aron at ahmadia.net Sat May 18 20:07:32 2013 From: aron at ahmadia.net (Aron Ahmadia) Date: Sun, 19 May 2013 01:07:32 +0100 Subject: [SciPy-Dev] Circular reference in interp1d causing memory errors In-Reply-To: References:

Message-ID: On Sun, May 19, 2013 at 1:00 AM, Matthew Brett wrote: > Following on from Josef's and Robert's posts, I wrote a > `check_refs_for` context manger that does this: > Newbie question but what is context manager in this, ermnn, context? That's a cool test, maybe rename it to: assert_no_refs_to(C): And I haven't seen the code, but I assume it... - asserts no C refs - disables garbage collector - code - asserts no C refs - re-enables garbage collector Cheers, Aron -------------- next part -------------- An HTML attachment was scrubbed... URL: From aron at ahmadia.net Sat May 18 20:07:32 2013 From: aron at ahmadia.net (Aron Ahmadia) Date: Sun, 19 May 2013 01:07:32 +0100 Subject: [SciPy-Dev] Circular reference in interp1d causing memory errors In-Reply-To: References:

Message-ID: On Sun, May 19, 2013 at 1:00 AM, Matthew Brett wrote: > Following on from Josef's and Robert's posts, I wrote a > `check_refs_for` context manger that does this: > Newbie question but what is context manager in this, ermnn, context? That's a cool test, maybe rename it to: assert_no_refs_to(C): And I haven't seen the code, but I assume it... - asserts no C refs - disables garbage collector - code - asserts no C refs - re-enables garbage collector Cheers, Aron -------------- next part -------------- An HTML attachment was scrubbed... URL: From matthew.brett at gmail.com Sat May 18 20:17:06 2013 From: matthew.brett at gmail.com (Matthew Brett) Date: Sat, 18 May 2013 17:17:06 -0700 Subject: [SciPy-Dev] Circular reference in interp1d causing memory errors In-Reply-To: References:

Message-ID: Hi, On Sat, May 18, 2013 at 5:07 PM, Aron Ahmadia wrote: > > On Sun, May 19, 2013 at 1:00 AM, Matthew Brett > wrote: >> >> Following on from Josef's and Robert's posts, I wrote a >> `check_refs_for` context manger that does this: > > > Newbie question but what is context manager in this, ermnn, context? :) - I mean a context manager in the sense of a thing that does the with statement: http://docs.python.org/2/library/stdtypes.html#typecontextmanager Is that what you meant? > That's a cool test, maybe rename it to: > > assert_no_refs_to(C): Not a bad name - thanks. > And I haven't seen the code, but I assume it... > > - asserts no C refs > - disables garbage collector > > - code > > - asserts no C refs > - re-enables garbage collector Well - on entry to the with-block it: Turns off garbage collection Creates the object using the input arguments to 'check_refs_for' Collects a weak reference to the new object Passes the new object into the context of the with-block - the 'as obj' part of the with statement After the with-block is done it deletes its own reference to the object asserts that the weakref is now None turns garbage collection back on (or rather, restores it to the previous state). I've attached the code too ... Cheers, Matthew -------------- next part -------------- A non-text attachment was scrubbed... Name: check_refs.py Type: application/octet-stream Size: 1951 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: test_check_refs_for.py Type: application/octet-stream Size: 2976 bytes Desc: not available URL: From matthew.brett at gmail.com Sat May 18 20:17:06 2013 From: matthew.brett at gmail.com (Matthew Brett) Date: Sat, 18 May 2013 17:17:06 -0700 Subject: [SciPy-Dev] Circular reference in interp1d causing memory errors In-Reply-To: References: