From contact at pythonxy.com Fri Jan 1 08:43:17 2010
From: contact at pythonxy.com (Pierre Raybaut)
Date: Fri, 1 Jan 2010 14:43:17 +0100
Subject: [SciPy-User] [Numpy-discussion] Announcing toydist, improving distribution and packaging situation
Message-ID: <629b08a41001010543r193acb2bk3290b6458f97c596@mail.gmail.com>

Hi David,

Following your announcement of the 'toydist' module, I think that your project is very promising: this is certainly a great idea, and it will be very controversial, but that's because people's expectations are high on this matter (distutils is so disappointing indeed).

Anyway, if I may be useful, I'll gladly contribute to it. In time, I could change the whole Python(x,y) packaging system (which is currently quite ugly... but easy/quick to manage/maintain) to use/promote this new module.

Happy New Year! and Long Live Scientific Python! ;-)

Cheers,
Pierre

From cournape at gmail.com Sat Jan 2 02:51:38 2010
From: cournape at gmail.com (David Cournapeau)
Date: Sat, 2 Jan 2010 16:51:38 +0900
Subject: [SciPy-User] [Numpy-discussion] Announcing toydist, improving distribution and packaging situation
In-Reply-To: <629b08a41001010543r193acb2bk3290b6458f97c596@mail.gmail.com>
References: <629b08a41001010543r193acb2bk3290b6458f97c596@mail.gmail.com>
Message-ID: <5b8d13221001012351w4feda89bj13e67d102318076d@mail.gmail.com>

On Fri, Jan 1, 2010 at 10:43 PM, Pierre Raybaut wrote:
> [...]
> Anyway, if I may be useful, I'll gladly contribute to it.
> In time, I could change the whole Python(x,y) packaging system (which
> is currently quite ugly... but easy/quick to manage/maintain) to
> use/promote this new module.

That would be a good way to test toydist on a real, complex package. I am not familiar at all with python(x,y) internals. Do you have some explanation I could look at somewhere?

In the meantime, I will try to clean up the code to have a first experimental release.

cheers,
David

From contact at pythonxy.com Sat Jan 2 05:40:16 2010
From: contact at pythonxy.com (Pierre Raybaut)
Date: Sat, 2 Jan 2010 11:40:16 +0100
Subject: [SciPy-User] [SPAM] Re: [Numpy-discussion] Announcing toydist, improving distribution and packaging situation
In-Reply-To: <5b8d13221001012351w4feda89bj13e67d102318076d@mail.gmail.com>
References: <629b08a41001010543r193acb2bk3290b6458f97c596@mail.gmail.com> <5b8d13221001012351w4feda89bj13e67d102318076d@mail.gmail.com>
Message-ID: <629b08a41001020240y642518f4r68f4a6a3860a3eee@mail.gmail.com>

2010/1/2 David Cournapeau:
> That would be a good way to test toydist on a real, complex package. I
> am not familiar at all with python(x,y) internals.
> Do you have some explanation I could look at somewhere?

Honestly, let's assume that there is currently no packaging system... it would not be very far from the truth. I did it when I was young and naive regarding Python. Actually I almost did it without having written any code in Python (approx. two months after hearing about the Python language for the first time): it's an ugly collection of AutoIt, NSIS and PHP scripts -- most of the tasks are automated, like updating the generated website pages and so on. So I'm not proud of it at all, but it was easy and very quick to do as it is, and it's still quite easy to maintain. But it's not satisfying in terms of code "purity" -- I've been wanting to rewrite all this in Python for a year and a half, but since the features are there, there is no real motivation to do the work (in other words, Python(x,y) users would not see the difference, at least at the beginning).

Another thing: Python(x,y) plugins are not built from source but from existing binaries (it's a pity, I know, but it was incredibly faster to do it this way). For example, eggs or distutils .exe installers may be converted into Python(x,y) plugins directly (same internal directory structure). So it may be different from the idea you had in mind (it's not like EPD, which is entirely generated from source, AFAIK).

> In the meantime, I will try to clean-up the code to have a first
> experimental release.

Ok, keep up the good work!

Cheers,
Pierre

From tpk at kraussfamily.org Sat Jan 2 15:42:30 2010
From: tpk at kraussfamily.org (Tom K.)
Date: Sat, 2 Jan 2010 12:42:30 -0800 (PST)
Subject: [SciPy-User] [SciPy-user] [ANN] upfirdn 0.2.0
Message-ID: <26996317.post@talk.nabble.com>

ANNOUNCEMENT

I am pleased to announce a new release of "upfirdn" - version 0.2.0. This package provides an efficient polyphase FIR resampler object (SWIG-ed C++) and some Python wrappers. This release greatly improves installation with distutils relative to the initial 0.1.0 release. 0.2.0 includes no functional changes relative to 0.1.0. Also, the source code is now browsable online through a Google Code site with a Mercurial repository.

https://opensource.motorola.com/sf/projects/upfirdn
http://code.google.com/p/upfirdn/

Thanks to Google for providing this hosting service!

From peter.shepard at gmail.com Sat Jan 2 18:16:27 2010
From: peter.shepard at gmail.com (Pete Shepard)
Date: Sat, 2 Jan 2010 15:16:27 -0800
Subject: [SciPy-User] fisher's exact.py stalls?
Message-ID: <5c2c43621001021516w1285568ci18c75db9e54409a5@mail.gmail.com>

Hello,

I am using "fishersexact.py" to compare two long (~10,000-entry) lists of ratios. Each time I do this, the program gets stuck. I can print the two ratios that come just before the program stalls, and if I give these numbers directly to the subroutine it seems to process them just fine. I am wondering if anyone has had a similar issue with the "fishersexact.py" subroutine?

Thanks,

From josef.pktd at gmail.com Sat Jan 2 18:42:38 2010
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Sat, 2 Jan 2010 18:42:38 -0500
Subject: [SciPy-User] fisher's exact.py stalls?
In-Reply-To: <5c2c43621001021516w1285568ci18c75db9e54409a5@mail.gmail.com>
References: <5c2c43621001021516w1285568ci18c75db9e54409a5@mail.gmail.com>
Message-ID: <1cd32cbb1001021542iafa2d7fn7c8a09694cbd466@mail.gmail.com>

On Sat, Jan 2, 2010 at 6:16 PM, Pete Shepard wrote:
> I am using "fishersexact.py" to compare two long (~10,000-entry) lists of
> ratios. Each time I do this, the program gets stuck. [...]

Do you mean fisherexact in the scipy trac?

If yes, did you apply http://projects.scipy.org/scipy/ticket/956#comment:10 by tkharris, which found and fixes one endless loop?

I was working on the ticket, but got stuck with some test failures that I haven't figured out.

Do you know for which table the function gets stuck?

Josef

From peter.shepard at gmail.com Sat Jan 2 19:10:58 2010
From: peter.shepard at gmail.com (Pete Shepard)
Date: Sat, 2 Jan 2010 16:10:58 -0800
Subject: [SciPy-User] fisher's exact.py stalls?
In-Reply-To: <1cd32cbb1001021542iafa2d7fn7c8a09694cbd466@mail.gmail.com>
References: <5c2c43621001021516w1285568ci18c75db9e54409a5@mail.gmail.com> <1cd32cbb1001021542iafa2d7fn7c8a09694cbd466@mail.gmail.com>
Message-ID: <5c2c43621001021610g26b4746cu6a77659b91aa2d79@mail.gmail.com>

That did the trick, thanks.

On Sat, Jan 2, 2010 at 3:42 PM, josef.pktd wrote:
> [...]

From timmichelsen at gmx-topmail.de Mon Jan 4 12:11:42 2010
From: timmichelsen at gmx-topmail.de (Tim Michelsen)
Date: Mon, 4 Jan 2010 17:11:42 +0000 (UTC)
Subject: [SciPy-User] scikits.timeseries.tsfromtxt & guess

Hello,
I first want to stress again that tsfromtxt in the timeseries scikit is a real killer function. Once one has understood the ease of the "dateconverter" function, it becomes a quick exercise to read in time series from ASCII files.
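To illustrate the pattern, a minimal converter-based read might look like the sketch below (this is not from Tim's mail; the file name and column layout are invented for illustration):

import scikits.timeseries as ts

# Hypothetical file whose first three columns hold year, month and day.
def daily_converter(year, month, day):
    return ts.Date('D', year=int(year), month=int(month), day=int(day))

series = ts.tsfromtxt('measurements.txt', skiprows=1, datecols=(0, 1, 2),
                      dateconverter=daily_converter)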
As I am currently predefining a set of dateconverters for frequently used date-time combinations in different formats, I have the following question: is it possible to integrate "ts.extras.guess_freq(dates)" into the function scikits.timeseries.tsfromtxt? Currently, I would need to read a file twice: once to guess the frequency based on a created list of dates, and then again to create the timeseries.

Ideally, I would like to do:

def mydateconverter(year, month, day, hour):
    freq = ts.extras.guess_freq(year, month, day, hour)
    ts_date = ts.Date(freq, year=int(year), month=int(month), day=int(day))
    return ts_date

myts = ts.tsfromtxt(datafile, skiprows=1, names=None,
                    datecols=(1,2,3), guess_freq=True,
                    dateconverter=mydateconverter)

Or is this already possible and I am just not getting this right?

How can I pass a frequency value to the dateconverter argument? Like:

def mydateconverter(year, month, day, hour, freq='T'):
    freq = ts.extras.guess_freq(year, month, day, hour)
    ts_date = ts.Date(freq, year=int(year), month=int(month), day=int(day))
    return ts_date

myts = ts.tsfromtxt(datafile, skiprows=1, names=None,
                    datecols=(1,2,3), guess_freq=True,
                    dateconverter=mydateconverter(freq='H'))

I get this error then:
TypeError: mydateconverter() takes at least 2 non-keyword arguments (0 given)

Thanks in advance for any hints,
Timmie

From timmichelsen at gmx-topmail.de Mon Jan 4 12:19:20 2010
From: timmichelsen at gmx-topmail.de (Tim Michelsen)
Date: Mon, 4 Jan 2010 17:19:20 +0000 (UTC)
Subject: [SciPy-User] How to concatenate timeseries

Hello,
I am reading timeseries data from different files covering various successive time intervals. What is the best method to concatenate these into one long-running time series?

I tried:

import scikits.timeseries as ts
series = ts.time_series([0,1,2,3], start_date=ts.Date(freq='A', year=2005))
series1 = ts.time_series([0,1,2,3], start_date=ts.Date(freq='A', year=2009))

import numpy as np
full = np.concatenate([series, series1])

But the full series then has the frequency 'U' for undefined. What am I missing?

Thanks,
Timmie

From pgmdevlist at gmail.com Mon Jan 4 12:50:34 2010
From: pgmdevlist at gmail.com (Pierre GM)
Date: Mon, 4 Jan 2010 12:50:34 -0500
Subject: [SciPy-User] How to concatenate timeseries
Message-ID: <255AD4CD-3076-4032-AA76-692E19B2A945@gmail.com>

On Jan 4, 2010, at 12:19 PM, Tim Michelsen wrote:
> What is the best method to concatenate these into one long-running time series?
> [...]

Use the concatenate function that comes with scikits.timeseries:

>>> ts.concatenate([series,series1])
timeseries([0 1 2 3 0 1 2 3],
   dates = [2005 ... 2012],
   freq  = A-DEC)

ts.concatenate tests whether the series have the same frequency, and optional parameters let you decide what to do with duplicates.
From pgmdevlist at gmail.com Mon Jan 4 13:21:44 2010
From: pgmdevlist at gmail.com (Pierre GM)
Date: Mon, 4 Jan 2010 13:21:44 -0500
Subject: [SciPy-User] scikits.timeseries.tsfromtxt & guess
Message-ID: <8DEEF015-ECDF-49F5-9B1B-8E940A7249D2@gmail.com>

On Jan 4, 2010, at 12:11 PM, Tim Michelsen wrote:
> Hello,
> I first want to stress again that the tsfromtxt in the timeseries scikit is a
> real killer function.

My, thanks a lot.

> Once one has understood the ease of the "dateconverter" function it becomes
> a quick exercise to read in time series from ASCII files.
>
> As I am currently predefining a set of dateconverters for frequently used
> date-time combinations in different formats, I have the following question:
> is it possible to integrate "ts.extras.guess_freq(dates)" into the function
> scikits.timeseries.tsfromtxt?

Probably, but I doubt it'll be very different from the current behavior. See, the dateconverter function transforms a series of strings into a unique Date, independently for each row of the input. You can't guess the frequency of an individual Date; you need several Dates to compare their lags. That means that, no matter what, you'll have to reprocess the array. I'd prefer to leave this operation up to the user...

> Currently, I would need to read a file twice: once for guessing the frequency
> based on a created list of dates and then read the file again to create the
> timeseries.

I'd first create the time series from the input, then try to guess the frequency from the DateArray.

> How can I pass a frequency value to the dateconverter argument?
> [...]
> I get this error then:
> TypeError: mydateconverter() takes at least 2 non-keyword arguments (0 given)

Please send a small example of datafile so that I can test what goes wrong. If I have to guess: the line `dateconverter=mydateconverter(freq='H')` forces a call to mydateconverter without any argument (but for the frequency). Of course, that won't fly. What you want is to have `mydateconverter(freq='H')` callable. You should probably create a class that takes a frequency as instantiation input and that has a __call__ method, something like:

class myconverter(object):
    def __init__(self, freq='D'):
        self.freq = freq
    def __call__(self, y, m, d, h):
        return ts.Date(self.freq, year=int(y), month=int(m),
                       day=int(d), hour=int(h))

That way, myconverter(freq='T') becomes a valid function (you can call it).
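A usage sketch of the class above (hedged: the file name and column layout are assumptions, not from the thread):

# Instantiate once per frequency, then hand the instance to tsfromtxt;
# the four datecols feed the y, m, d, h arguments of __call__.
hourly = myconverter(freq='H')
series = ts.tsfromtxt('logger.csv', skiprows=1, datecols=(0, 1, 2, 3),
                      dateconverter=hourly)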
From timmichelsen at gmx-topmail.de Mon Jan 4 14:58:33 2010
From: timmichelsen at gmx-topmail.de (Tim Michelsen)
Date: Mon, 04 Jan 2010 20:58:33 +0100
Subject: [SciPy-User] How to concatenate timeseries
In-Reply-To: <255AD4CD-3076-4032-AA76-692E19B2A945@gmail.com>
References: <255AD4CD-3076-4032-AA76-692E19B2A945@gmail.com>

> Use the concatenate function that comes with scikits.timeseries
>>>> ts.concatenate([series,series1])
> timeseries([0 1 2 3 0 1 2 3], dates = [2005 ... 2012], freq = A-DEC)
>
> ts.concatenate tests whether the series have the same frequency, and
> optional parameters let you decide what to do with duplicates.

Must have overlooked that. But it isn't in the docs either:
http://pytseries.sourceforge.net/search.html?q=concatenate

From timmichelsen at gmx-topmail.de Mon Jan 4 15:25:30 2010
From: timmichelsen at gmx-topmail.de (Tim Michelsen)
Date: Mon, 04 Jan 2010 21:25:30 +0100
Subject: [SciPy-User] scikits.timeseries.tsfromtxt & guess
In-Reply-To: <8DEEF015-ECDF-49F5-9B1B-8E940A7249D2@gmail.com>
References: <8DEEF015-ECDF-49F5-9B1B-8E940A7249D2@gmail.com>

>> I first want to stress again that the tsfromtxt in the timeseries scikit is a
>> real killer function.
>
> My, thanks a lot

Yes, you may remember all my questions (still at the beginning of my scipy learning curve) on the data loading and creation of masked time series... This is now all obsolete. And as I receive (logger) data in wicked formats, counting hours not from 0-23 but rather from 1-24, I appreciate the dateconverters, which are based on robust datetime manipulations.

> I'd first create the time series from the input, then try to guess the frequency from the DateArray

So you'd recommend creating the timeseries using the user-defined frequency ('U'):

def mydateconverter(year, month, day, hour, freq='U'):
    freq = ts.extras.guess_freq(year, month, day, hour)
    ts_date = ts.Date(freq, year=int(year), month=int(month), day=int(day))
    return ts_date

and then using guess_freq to assign the correct one?

I want to have the dateconverters in a flexible style, varying only by input format and columns used. They should work regardless of the frequency (be the data set hourly or minutely).

> What you want is to have `mydateconverter(freq='H')` callable. You should
> probably create a class that takes a frequency as instantiation input and
> that has a __call__ method.
> [...]
> That way, myconverter(freq='T') becomes a valid function (you can call it).

Thanks. I will try this way.

Best regards,
Timmie
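One possible shape for the two-pass approach discussed above (a sketch under stated assumptions only: that guess_freq accepts a list of datetime objects, and that ts.date_array can rebuild the dates at the guessed frequency; datafile and mydateconverter as in Tim's mails):

import scikits.timeseries as ts

# Pass 1: read with an undefined ('U') frequency.
raw = ts.tsfromtxt(datafile, skiprows=1, datecols=(1, 2, 3),
                   dateconverter=mydateconverter)
# Pass 2: guess the frequency from the resulting dates and rebuild.
dates = raw.dates.tolist()
freq = ts.extras.guess_freq(dates)
fixed = ts.time_series(raw.series, dates=ts.date_array(dates, freq=freq))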
From pgmdevlist at gmail.com Mon Jan 4 15:35:57 2010
From: pgmdevlist at gmail.com (Pierre GM)
Date: Mon, 4 Jan 2010 15:35:57 -0500
Subject: [SciPy-User] scikits.timeseries.tsfromtxt & guess
References: <8DEEF015-ECDF-49F5-9B1B-8E940A7249D2@gmail.com>
Message-ID: <3CD2A9F8-78D1-4FFF-901E-BFB60B25414F@gmail.com>

On Jan 4, 2010, at 3:25 PM, Tim Michelsen wrote:
>> I'd first create the time series from the input, then try to guess the frequency from the DateArray
> So you'd recommend creating the timeseries using the user-defined frequency ('U')
> [...]
> and then using guess_freq to assign the correct one?

Basically, yes. Note that guess_freq is only for convenience; it might not be fool-proof...

> I want to have the dateconverters in a flexible style, varying only by
> input format and columns used. They should work regardless of the
> frequency (be the data set hourly or minutely).

Well, you could define a converter class that takes freq as input and test in the __call__ for the value of the freq. You could have a variable number of inputs in __call__ and test for the number of parameters (year, month, day...). It won't be as efficient as defining a specific converter for your data, though...

From dpfrota at yahoo.com.br Wed Jan 6 00:15:37 2010
From: dpfrota at yahoo.com.br (dpfrota)
Date: Tue, 5 Jan 2010 21:15:37 -0800 (PST)
Subject: [SciPy-User] [SciPy-user] Audiolab on Py2.6
In-Reply-To: <3d375d730911172231i4cf42760l80038a00f84fa7c8@mail.gmail.com>
References: <4AE5DEDF.7070701@asu.edu> <26402986.post@talk.nabble.com> <3d375d730911172231i4cf42760l80038a00f84fa7c8@mail.gmail.com>
Message-ID: <27026778.post@talk.nabble.com>

Robert Kern-2 wrote:
> On Wed, Nov 18, 2009 at 00:29, dpfrota wrote:
>> What is the meaning of these addresses? I opened these files, and they
>> have some strange lines. The first file has only
>> " __import__('pkg_resources').declare_namespace(__name__) ". Is module
>> PKG necessary?
>
> These enable the scikits namespace such that you can have multiple
> scikits packages installed (possibly to separate locations).

I made some tests and I am almost sure the problem is with this file: "C:\Python26\Lib\site-packages\scikits\audiolab\pysndfile\_sndfile.pyd". But I don't know how to see its contents or fix the problem. Any more tips? (Please!)

From j33433 at gmail.com Wed Jan 6 22:43:39 2010
From: j33433 at gmail.com (James)
Date: Wed, 6 Jan 2010 22:43:39 -0500
Subject: [SciPy-User] timeseries and candlestick()

Has anyone managed to plot a candlestick chart with a timeseries? Is there an easy way to wrap the matplotlib.finance.candlestick call?

James
From jordi_molins at hotmail.com Thu Jan 7 04:09:55 2010
From: jordi_molins at hotmail.com (Jordi Molins Coronado)
Date: Thu, 7 Jan 2010 10:09:55 +0100
Subject: [SciPy-User] [SciPy-user] Maximum entropy distribution for Ising model - setup?

Hello, I am new to this forum. I am looking for a numerical solution to the inverse problem of an Ising model (or a model not unlike the Ising model, see below). I have seen an old discussion, but a very interesting one, about this subject on this forum (http://mail.scipy.org/pipermail/scipy-user/2006-October/009703.html). I would like to pose my problem (which is quite similar to the problem discussed in the thread above) and kindly ask your opinion on it.

My space is a set of discrete nodes s_i, where i=1,...,N, which can take two values, {0,1}. Empirically I have the following information: <s_i>_emp and <s_i s_j>_emp, where i,j=1,...,N with i!=j.

It is well known in the literature that the Ising model

P(s_1, s_2, ..., s_N) = 1/Z * exp( sum_i(h_i*s_i) + 0.5*sum_{i!=j}(J_ij*s_i*s_j) )

maximizes entropy with the constraints given above (in fact, this is not the Ising model, because the Ising model assumes only nearest-neighbour interactions, and I have interactions with all other nodes, but I believe it is still true that the above P(s_1,...,s_N) maximizes entropy given the constraints).

What I would like is to solve the inverse problem of finding the h_i and J_ij which maximize entropy given my constraints. However, I would like to restrict the number of h_i and J_ij possible, since having complete freedom could become an unwieldy problem. For example, I could restrict h_i = H and J_ij = J for all i,j=1,...,N, i!=j; or I could have a partition of my nodes, say nodes from 1 to M having h_i = H1 and J_ij = J1 for i,j=1,...,M, i!=j, and h_i = H2 and J_ij = J2 for i,j=M+1,...,N, i!=j.

If I understand correctly the discussion in the thread shown above, a numerical solution for the inverse problem would be:

hi_{new} = hi_{old} + K * (<s_i> - <s_i>_{emp})
Jij_{new} = Jij_{old} + K' * (<s_i s_j> - <s_i s_j>_{emp})

where K and K' are positive "step size" constants. (On the RHS, <s_i> and <s_i s_j> are w.r.t. hi_{old} and Jij_{old}.)

Have I understood all this correctly? In particular, for the case h_i = H and J_ij = J for all i,j=1,...,N, i!=j, could I simplify the previous algorithm by restricting the calculations to, say, i=1 only (i=2,...,N should be the same?), and for the partitioned case, simplify it by restricting the calculations to, say, i=1 and i=M+1?

Thank you for your help, and sorry if I am new here and have committed some "etiquette" mistake.

Jordi

From jordi_molins at hotmail.com Thu Jan 7 04:19:30 2010
From: jordi_molins at hotmail.com (Jordi Molins Coronado)
Date: Thu, 7 Jan 2010 10:19:30 +0100
Subject: [SciPy-User] [SciPy-user] Maximum entropy distribution for Ising model - setup?

Sorry, I see my previous message has been a disaster in formatting. I will try now in a different way. Sorry for the inconvenience.

Hello, I am new to this forum. I am looking for a numerical solution to the inverse problem of an Ising model (or a model not unlike the Ising model, see below). I have seen an old discussion, but a very interesting one, about this subject on this forum (http://mail.scipy.org/pipermail/scipy-user/2006-October/009703.html).
I would like to pose my problem (which is quite similar to the problem discussed in the thread above) and kindly ask your opinion on it.

My space is a set of discrete nodes s_i, where i=1,...,N, which can take two values, {0,1}. Empirically I have the following information: <s_i>_emp and <s_i s_j>_emp, where i,j=1,...,N with i!=j.

It is well known in the literature that the Ising model

P(s_1, s_2, ..., s_N) = 1/Z * exp( sum{for all i}(h_i*s_i) + 0.5*sum{for all i!=j}(J_ij*s_i*s_j) )

maximizes entropy with the constraints given above (in fact, this is not the Ising model, because the Ising model assumes only nearest-neighbour interactions, and I have interactions with all other nodes, but I believe it is still true that the above P(s_1,...,s_N) still maximizes entropy given the constraints above).

What I would like is to solve the inverse problem of finding the h_i and J_ij which maximize entropy given my constraints. However, I would like to restrict the number of h_i and J_ij possible, since having complete freedom could become an unwieldy problem. For example, I could restrict h_i = H and J_ij = J for all i,j=1,...,N, i!=j; or I could have a partition of my nodes, say nodes from 1 to M having h_i = H1 and J_ij = J1 for i,j=1,...,M, i!=j, and h_i = H2 and J_ij = J2 for i,j=M+1,...,N, i!=j, and J_ij = J3 for i=1,...,M and j=M+1,...,N.

If I understand correctly the discussion in the thread shown above, a numerical solution for the inverse problem would be:

hi_{new} = hi_{old} + K * (<s_i> - <s_i>_{emp})
Jij_{new} = Jij_{old} + K' * (<s_i s_j> - <s_i s_j>_{emp})

where K and K' are positive "step size" constants. (On the RHS, <s_i> and <s_i s_j> are w.r.t. hi_{old} and Jij_{old}.)

Have I understood all this correctly? In particular, for the case h_i = H and J_ij = J for all i,j=1,...,N, i!=j, could I simplify the previous algorithm by restricting the calculations to, say, i=1 only (i=2,...,N should be the same?), and for the partitioned case, simplify it by restricting the calculations to, say, i=1 and i=M+1?

Thank you for your help, and sorry if I am new here and have committed some "etiquette" mistake.

Jordi

From josef.pktd at gmail.com Thu Jan 7 12:04:59 2010
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Thu, 7 Jan 2010 12:04:59 -0500
Subject: [SciPy-User] multidimensional signal.convolve semivalid
Message-ID: <1cd32cbb1001070904u53b07fe5laa7654446b2b5a5c@mail.gmail.com>

Simplest case: I have two signals and I want to apply two linear filters with convolve. As a result, I want to get two signals, given by the convolution of the input signal with each of the filter arrays.

I can either loop over the filter arrays with valid mode, which produces the desired result:

signal.convolve(x, a3f[:,:,0], mode='valid')
signal.convolve(x, a3f[:,:,1], mode='valid')

or I can do one 3-dimensional convolution and throw away two thirds of the calculation:

signal.convolve(x[:,:,None], a3f)[:,1,:]

I didn't manage to get valid or same mode to return the results that I wanted. Is there a way to do it without a loop or redundant calculations?

Background: this will be the fastest way to filter and work with vector autoregressive processes. Example below.

Thanks,
Josef

>>> x = np.arange(40).reshape((2,20)).T
>>> a3f[:,:,0]
array([[ 0.5,  1. ],
       [ 0.5,  1. ]])
>>> a3f[:,:,1]
array([[ 1. ,  0.5],
       [ 1. ,  0.5]])
>>> signal.convolve(x[:,:,None],a3f)[:,1,:]
array([[ 10. ,  20. ],
       [ 21.5,  41.5],
       [ 24.5,  44.5],
       [ 27.5,  47.5],
       [ 30.5,  50.5],
       [ 33.5,  53.5],
       [ 36.5,  56.5],
       [ 39.5,  59.5],
       [ 42.5,  62.5],
       [ 45.5,  65.5],
       [ 48.5,  68.5],
       [ 51.5,  71.5],
       [ 54.5,  74.5],
       [ 57.5,  77.5],
       [ 60.5,  80.5],
       [ 63.5,  83.5],
       [ 66.5,  86.5],
       [ 69.5,  89.5],
       [ 72.5,  92.5],
       [ 75.5,  95.5],
       [ 38.5,  48.5]])
>>> signal.fftconvolve(x[:,:,None],a3f).shape
(21, 3, 2)
>>> signal.fftconvolve(x[:,:,None],a3f)[:,1,:]
array([[ 10. ,  20. ],
       [ 21.5,  41.5],
       [ 24.5,  44.5],
       [ 27.5,  47.5],
       [ 30.5,  50.5],
       [ 33.5,  53.5],
       [ 36.5,  56.5],
       [ 39.5,  59.5],
       [ 42.5,  62.5],
       [ 45.5,  65.5],
       [ 48.5,  68.5],
       [ 51.5,  71.5],
       [ 54.5,  74.5],
       [ 57.5,  77.5],
       [ 60.5,  80.5],
       [ 63.5,  83.5],
       [ 66.5,  86.5],
       [ 69.5,  89.5],
       [ 72.5,  92.5],
       [ 75.5,  95.5],
       [ 38.5,  48.5]])
>>> signal.fftconvolve(x[:,:],a3f[:,:,0]).shape
(21, 3)
>>> signal.fftconvolve(x[:,:],a3f[:,:,0], mode='valid')
array([[ 21.5],
       [ 24.5],
       [ 27.5],
       [ 30.5],
       [ 33.5],
       [ 36.5],
       [ 39.5],
       [ 42.5],
       [ 45.5],
       [ 48.5],
       [ 51.5],
       [ 54.5],
       [ 57.5],
       [ 60.5],
       [ 63.5],
       [ 66.5],
       [ 69.5],
       [ 72.5],
       [ 75.5]])
>>> signal.fftconvolve(x[:,:],a3f[:,:,1], mode='valid')
array([[ 41.5],
       [ 44.5],
       [ 47.5],
       [ 50.5],
       [ 53.5],
       [ 56.5],
       [ 59.5],
       [ 62.5],
       [ 65.5],
       [ 68.5],
       [ 71.5],
       [ 74.5],
       [ 77.5],
       [ 80.5],
       [ 83.5],
       [ 86.5],
       [ 89.5],
       [ 92.5],
       [ 95.5]])

From dwf at cs.toronto.edu Thu Jan 7 14:35:48 2010
From: dwf at cs.toronto.edu (David Warde-Farley)
Date: Thu, 7 Jan 2010 14:35:48 -0500
Subject: [SciPy-User] [SciPy-user] Maximum entropy distribution for Ising model - setup?
Message-ID: <90B92C0A-E286-4E7E-8BBA-E3DAC0792E28@cs.toronto.edu>

On 7-Jan-10, at 4:19 AM, Jordi Molins Coronado wrote:
> However, I would like to restrict the number of h_i and J_ij
> possible, since having complete freedom could become an unwieldy
> problem. [...]
> If I understand correctly the discussion in the thread shown above,
> a numerical solution for the inverse problem would be:
> hi_{new} = hi_{old} + K * (<s_i> - <s_i>_{emp})
> Jij_{new} = Jij_{old} + K' * (<s_i s_j> - <s_i s_j>_{emp})

That's correct; the way that you'd usually calculate <s_i s_j> is by starting from some state and running several iterations of Gibbs sampling to generate a new state, measuring your s_i * s_j in that state, then running it for a whole bunch more steps and gathering the s_i * s_j, etc., until you have enough measurements for a decent Monte Carlo approximation. The Gibbs iterations form a Markov chain whose equilibrium distribution is P(s_1, s_2, ..., s_N), the distribution of interest; the problem is that there's no good way to know when you've run sufficiently many Gibbs steps so that the sample you draw is from the equilibrium distribution P. However, one can often get away with just running a small fixed number of steps. There is some analysis of the convergence properties of this trick here: http://www.cs.toronto.edu/~hinton/absps/cdmiguel.pdf (refer to the sections on "Visible Boltzmann machines")
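For concreteness, one Gibbs sweep for the 0/1 model above could look like the following rough sketch (this is not code from the thread; all names are illustrative):

import numpy as np

def gibbs_sweep(s, h, J):
    # One sweep over all units of P(s) ~ exp(sum_i h_i*s_i
    # + 0.5*sum_{i!=j} J_ij*s_i*s_j); J symmetric with zero diagonal.
    for i in range(len(s)):
        a = h[i] + np.dot(J[i], s)        # field on unit i from the others
        p = 1.0 / (1.0 + np.exp(-a))      # P(s_i = 1 | all other units)
        s[i] = float(np.random.rand() < p)
    return s

Averaging s_i * s_j over states collected from many such sweeps gives the Monte Carlo estimate of <s_i s_j> used in the updates.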
I've never really heard of a situation where you'd really want to tie together parameters like you're suggesting, but it's possible and quite trivial to implement. Let's say you wanted to constrain hi and hj to be the same. Then you'd start them off at the same initial value and, at every update, use the following equation instead:

hi_{new} = hj_{new} = hi_{old} + K/2 * (<s_i> - <s_i>_{emp}) + K/2 * (<s_j> - <s_j>_{emp})

If you wanted Jij = Jkl, set them to the same initial value and use the update:

Jij_{new} = Jkl_{new} = Jij_{old} + K'/2 * (<s_i s_j> - <s_i s_j>_{emp}) + K'/2 * (<s_k s_l> - <s_k s_l>_{emp})

Similarly, if you wanted to tie a whole set of these together, you'd just average the updates and apply the result to all of them at once.

David

From mattknox.ca at gmail.com Thu Jan 7 17:37:15 2010
From: mattknox.ca at gmail.com (Matt Knox)
Date: Thu, 7 Jan 2010 22:37:15 +0000 (UTC)
Subject: [SciPy-User] timeseries and candlestick()

James <j33433 at gmail.com> writes:
> Has anyone managed to plot a candlestick chart with a timeseries? Is there an
> easy way to wrap the matplotlib.finance.candlestick call?

I don't know anything about the candlestick function, but you can get the underlying MaskedArray for a TimeSeries object with the .series property of the TimeSeries object. You can get raw datetime objects for the time axis by doing mytimeseries.dates.tolist(), so from there it should be fairly straightforward to pass the data into matplotlib functions, I think.

- Matt
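A hedged sketch of that suggestion (assumptions: a structured TimeSeries named myseries with 'open', 'close', 'high' and 'low' fields, and the matplotlib-0.99-era finance API, where quotes are (time, open, close, high, low) tuples):

import matplotlib.pyplot as plt
from matplotlib.dates import date2num
from matplotlib.finance import candlestick

# Convert the TimeSeries dates to matplotlib date numbers.
dates = [date2num(d) for d in myseries.dates.tolist()]
quotes = [(t, o, c, h, l) for t, o, c, h, l in
          zip(dates, myseries['open'], myseries['close'],
              myseries['high'], myseries['low'])]

fig = plt.figure()
ax = fig.add_subplot(111)
candlestick(ax, quotes)
ax.xaxis_date()
plt.show()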
From bruce at clearscienceinc.com Thu Jan 7 19:43:21 2010
From: bruce at clearscienceinc.com (Bruce Ford)
Date: Thu, 7 Jan 2010 19:43:21 -0500
Subject: [SciPy-User] 2D Interpolation

All,

I'm endeavoring to interpolate global 2.5 degree data (73x144) onto a 1 degree grid (181x360). I'm not sure if I'm barking up the right tree with a cubic spline interpolation.

Below is my code, based on an example I found at http://docs.scipy.org/doc/scipy/reference/tutorial/interpolate.html

The process hangs at the line "tck = interpolate.bisplrep(x,y,z,s=0)". I'm unsure whether this is a bug or there is an error in my code (which is below).

Secondly, has anyone done a similar interpolation (from a lower-resolution regular 2D grid to a higher-resolution regular 2D grid) who would share a little code? All my efforts have been fruitless!

Thanks in advance!

Bruce

(code follows...)
*****************************************************************
import matplotlib
import matplotlib.pyplot as pyplot    # used to build contour and wind barbs plots
import matplotlib.colors as pycolors  # used to build color schemes for plots
import numpy.ma as M                  # matrix manipulation functions
import numpy as np                    # used to perform simple math functions on data
from numpy import *
import cgi                            # used to easily parse form variables
from sys import exit as die           # used to kill the python script early
from netCDF4 import Dataset           # interprets NetCDF files
import Nio
from scipy import interpolate

filepath = "/media/BACKUP1/reanal-2/6hr/pgb/pgb.197901"
grb_file = Nio.open_file(filepath, mode='r', options=None, history='', format='grb')

z = grb_file.variables["HGT_2_ISBL_10"][1,1,:,:]
print z.shape

x,y = np.mgrid[90:-90:73j,0:357.5:144j]

print x.shape  # (73,144)
print y.shape  # (73,144)
print z.shape  # (73,144)

xnew,ynew = np.mgrid[-90:90:180j,0:359:360j]
print xnew.shape  # (180,360)
tck = interpolate.bisplrep(x,y,z,s=0)
# python freezes on the above line
znew = interpolate.bisplev(xnew[:,0],ynew[0,:],tck)

---------------------------------------
Bruce W. Ford
Clear Science, Inc.
bruce at clearscienceinc.com
http://www.ClearScienceInc.com
8241 Parkridge Circle N.
Jacksonville, FL 32211
Skype: bruce.w.ford
Google Talk: fordbw at gmail.com

From burak.o.cankurtaran at alumni.uts.edu.au Thu Jan 7 20:09:09 2010
From: burak.o.cankurtaran at alumni.uts.edu.au (Burak1327)
Date: Thu, 7 Jan 2010 17:09:09 -0800 (PST)
Subject: [SciPy-User] [SciPy-user] 2D Interpolation
Message-ID: <27069900.post@talk.nabble.com>

Hi Bruce,

I recently received help for what you need. I've listed the code that does the interpolation from a coarse grid to a finer (regular) grid. It uses an image manipulation package, "ndimage":

pes40 = ReadPES("pes-0.5.xsf", 40)
# Interpolate
newx, newy, newz = mgrid[0:40:0.5, 0:40:0.5, 0:40:0.5]
coords = array([newx, newy, newz])
pes80 = ndimage.map_coordinates(pes40, coords, order=1)

So the coarse grid is 40x40x40 and I'm interpolating it onto an 80x80x80 grid. This is done with the notation 0:40:0.5. You can change the 0.5 interval length to a complex number if you want to specify the number of intervals instead of the interval length. Basically, the coords array holds the actual positions at which the interpolation should occur. Obviously, get rid of the third dimension for 2D. To do higher-order interpolation, change the "order" parameter in the map_coordinates function. This is a link to a quick tutorial on 2D: http://www.scipy.org/Cookbook/Interpolation

Thanks
Burak
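In 2D, the same recipe reduces to something like the sketch below (illustrative only; here pes40 would be a 40x40 array rather than Burak's 40x40x40 volume):

from numpy import mgrid, array
from scipy import ndimage

newx, newy = mgrid[0:40:0.5, 0:40:0.5]   # 80x80 grid of target positions
coords = array([newx, newy])             # index coordinates into pes40
pes80 = ndimage.map_coordinates(pes40, coords, order=1)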
From timmichelsen at gmx-topmail.de Fri Jan 8 07:49:21 2010
From: timmichelsen at gmx-topmail.de (Tim Michelsen)
Date: Fri, 8 Jan 2010 12:49:21 +0000 (UTC)
Subject: [SciPy-User] scikits.timeseries: moving difference

Hello,
the scikits.timeseries module has implemented some moving window functions: http://pytseries.sourceforge.net/lib.moving_funcs.html

I would like to expand these to include the (absolute and relative) difference between one value and its successor in a time series. How could I do this?

Best regards,
Timmie

From pgmdevlist at gmail.com Fri Jan 8 08:07:08 2010
From: pgmdevlist at gmail.com (Pierre GM)
Date: Fri, 8 Jan 2010 08:07:08 -0500
Subject: [SciPy-User] scikits.timeseries: moving difference
Message-ID: <175355BA-2C0F-425E-BAED-F8540213920D@gmail.com>

On Jan 8, 2010, at 7:49 AM, Tim Michelsen wrote:
> I would like to expand these to include the (absolute and relative) difference
> between one value and its successor in a time series.

Like np.diff? Could you give us a short example of what you have in mind?
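If np.diff is indeed what is wanted, the two quantities might be as simple as this (a guess at the intent, not code from the thread; series is assumed to be a 1-D TimeSeries):

import numpy as np

absdiff = np.diff(series.series)          # each value minus its predecessor
reldiff = absdiff / series.series[:-1]    # the same, relative to the predecessor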
From robince at gmail.com Fri Jan 8 08:17:25 2010
From: robince at gmail.com (Robin)
Date: Fri, 8 Jan 2010 13:17:25 +0000
Subject: [SciPy-User] [SciPy-user] Maximum entropy distribution for Ising model - setup?
Message-ID: <2d5132a51001080517l35cd8020n55062c1b436f87aa@mail.gmail.com>

On Thu, Jan 7, 2010 at 9:19 AM, Jordi Molins Coronado wrote:
> [...]

Hi,

I'm not so familiar with the statistical mechanics notation, but you might be interested in the maxent module of the pyentropy package I have produced as part of my PhD: http://code.google.com/p/pyentropy/

The main purpose of pyentropy is the calculation of bias-corrected entropy and information values from limited data sets, but it includes the maxent module, which computes maximum entropy distributions over finite-alphabet spaces with marginal constraints of up to any order. (I am working in computational neuroscience, so much of the notation will probably be a bit different.) So I think a second-order solution from this framework over a binary space is the same as the Ising model. You can get the h_i's and J's directly from the results (they are called theta in the code), although I think they have a slightly different normalisation because of Ising being {-1,1} and this being {0,1}... http://pyentropy.googlecode.com/svn/docs/api.html#module-pyentropy.maxent

With this, on a normal computer, I can solve for about 18 binary variables in a reasonable amount of time (i.e. less than an hour for several runs), but it becomes highly exponential with more vectors. (I have a much more efficient but more hackish version of the same algorithm that I haven't added to pyentropy yet, but will in the next few weeks.)

In the case you describe, where the thetas are constrained to be equal at each order, the system can be solved much more efficiently. I have code to do this which is not released (and a bit messy), but if you are interested I could send it to you. In neuroscience this situation is called the 'pooled model'. You can see a description of how to solve this reduced case here: http://rsta.royalsocietypublishing.org/content/367/1901/3297.short

The method uses information geometry from Amari: by transforming between the P space (probability vector), eta space (marginals) and theta space (the h's, J's, etc.) it is possible to find the maximum entropy solution as a projection.

Anyway, I'm not sure if that helps... probably the documentation on how to get the thetas and the ordering of the vector might not be so clear, so give me a shout if you start using it and have any questions. If you don't have the full probability distribution, you can pass the eta vector to the solve function (which would be a vector of <s_i> and <s_i s_j>) and set the optional argument eta_given=True (hopefully clear from the option handling in the code).
Cheers,
Robin

From jgomezdans at gmail.com Fri Jan 8 10:08:16 2010
From: jgomezdans at gmail.com (Jose Gomez-Dans)
Date: Fri, 8 Jan 2010 15:08:16 +0000
Subject: [SciPy-User] 2D Interpolation
Message-ID: <91d218431001080708q5bdceb2cgf950bd2cb29dd5ad@mail.gmail.com>

Hi,

2010/1/8 Bruce Ford:
> I'm endeavoring to interpolate global 2.5 degree data (73x144) onto a
> 1 degree grid (181x360). I'm not sure if I'm barking up the right
> tree for a cubic spline interpolation.

Looks suspiciously like NCEP reanalysis data ;) For this task, you could use map_coordinates: http://docs.scipy.org/doc/scipy/reference/generated/scipy.ndimage.interpolation.map_coordinates.html

Hope that helps,
Jose

From Dharhas.Pothina at twdb.state.tx.us Fri Jan 8 11:16:20 2010
From: Dharhas.Pothina at twdb.state.tx.us (Dharhas Pothina)
Date: Fri, 08 Jan 2010 10:16:20 -0600
Subject: [SciPy-User] Masking multiple fields in a structured timeseries object.
Message-ID: <4B4705F40200009B0002638B@GWWEB.twdb.state.tx.us>

Hi,

I have a structured time series object I have read in from a file. I am providing my script the following parameters: filename, startdate, enddate, parameter (All, Salinity, Temp., etc.), Max, Min, instrument type (2-digit code contained in the filename).

My timeseries is structured like:

timeseries([ ('JOB_20090812_CXT_MW9999.csv', 0, --, --, --, --, --, 22.0, 13.199999999999999, 28.949999999999999, --, 0.39928999999999998, --, --)
 ('JOB_20090812_CXT_MW9999.csv', 0, --, --, --, --, --, 22.100000000000001, 13.199999999999999, 28.690000000000001, --, 0.35965999999999998, --, --)
 ('JOB_20090812_CXT_MW9999.csv', 0, --, --, --, --, --, 22.100000000000001, 13.300000000000001, 28.420000000000002, --, 0.32917999999999997, --, --)
 ...,
 ('JOB_20090812_CXT_MW9999.csv', 0, --, --, --, --, 3.6699999999999999, 25.800000000000001, 15.699999999999999, 25.600000000000001, --, 1.5514300000000001, --, --)
 ('JOB_20090812_CXT_MW9999.csv', 0, --, --, --, --, 3.8900000000000001, 25.800000000000001, 15.699999999999999, 25.710000000000001, --, 1.61849, --, --)
 ('JOB_20090812_CXT_MW9999.csv', 0, --, --, --, --, 3.5899999999999999, 25.899999999999999, 15.699999999999999, 25.859999999999999, --, 1.6398200000000001, --, --)],
 dtype = [('Filename', '|S27'), ('Year', ...), ...],
 dates = [11-Jun-1996 21:00 11-Jun-1996 22:00 11-Jun-1996 23:00 ...,
          05-Oct-2000 09:00 05-Oct-2000 10:00 05-Oct-2000 11:00],
 freq = T)

I want to mask the data in the following way:

Mask all values between start & end dates that meet the following criteria:

1) selected parameter (mask all if blank)
2) selected filename (mask all if blank)
3) selected instrument (mask all if blank). Note the instrument is the 18th & 19th character in the filename, i.e. 'MW' in the example above.
4) parameter value lies between the given max and min values.

I'm having trouble working out how to check all these conditions at once or sequentially before masking.

From bruce at clearscienceinc.com Fri Jan 8 2010
From: bruce at clearscienceinc.com (Bruce Ford)
Date: Fri, 8 Jan 2010
Subject: [SciPy-User] 2D Interpolation
References: <91d218431001080708q5bdceb2cgf950bd2cb29dd5ad@mail.gmail.com>

Jose and Burak, thanks for the lead. Yes, Jose, this is NCEP Reanalysis-2 data. I'm trying to interpolate it to a 1 degree grid to plot alongside other model data that is 1 degree. I'm not quite there yet, but close. It's possible I'm not visualizing what this process is doing correctly. Below is a small, trimmed-down script that doesn't use external data, so you can see what's happening. It should work for you. The first plot is of a 73x144 grid, all of the value 1. This represents a 2.5 degree global grid. The second plot is what I get following the interpolation to a 1 degree grid (181x360). The value of one only appears in a region of the plot, instead of being interpolated across the whole domain. If you have experience with this, I'm hoping you can tell me what I'm doing wrong. I'm stumped and I've spent many hours on this problem. Any assistance would be appreciated!
Here's the script:
*****************************************
import matplotlib.pyplot as pyplot   # used to build contour and wind barbs plots
from numpy import *
from sys import exit as die          # used to kill the python script early
from scipy import interpolate, ndimage

x,y = mgrid[-90:90:2.5,0:357.5:2.5]
test_array = ones_like(x)   # a test array to interpolate from

print "********Shape of Test Array *********", test_array.shape
print "********Shape of X array *********", x.shape    # (73,144)
print "********Shape of Y Array *********", y.shape    # (73,144)

pyplot.figure()
pyplot.pcolor(y,x,test_array)
pyplot.colorbar()
pyplot.title("Sparsely sampled function.")
pyplot.show()

xnew,ynew = mgrid[-90:91:1,0:360:1]
coords = array([xnew,ynew])
print "********Shape of Coordinate Array *********", coords.shape

interpolated = ndimage.map_coordinates(test_array, coords, order=3)
print "********Shape of Interpolated Array *********", interpolated.shape   # (181,360)

pyplot.figure()
pyplot.pcolor(ynew,xnew,interpolated)
pyplot.colorbar()
pyplot.title("Interpolated function.")
pyplot.show()

---------------------------------------
Bruce W. Ford
Clear Science, Inc.
bruce at clearscienceinc.com
bruce.w.ford.ctr at navy.smil.mil
http://www.ClearScienceInc.com
Phone/Fax: 904-379-9704
8241 Parkridge Circle N.
Jacksonville, FL 32211
Skype: bruce.w.ford
Google Talk: fordbw at gmail.com

On Fri, Jan 8, 2010 at 10:08 AM, Jose Gomez-Dans wrote:
> [...]

From pgmdevlist at gmail.com Fri Jan 8 13:20:42 2010
From: pgmdevlist at gmail.com (Pierre GM)
Date: Fri, 8 Jan 2010 13:20:42 -0500
Subject: [SciPy-User] Masking multiple fields in a structured timeseries object.
In-Reply-To: <4B4705F40200009B0002638B@GWWEB.twdb.state.tx.us>
References: <4B4705F40200009B0002638B@GWWEB.twdb.state.tx.us>

On Jan 8, 2010, at 11:16 AM, Dharhas Pothina wrote:
> [...]
> I'm having trouble working out how to check all these conditions at once
> or sequentially before masking.
Step by step, it's gonna be easier to debug. Take the simpler example:

>>> ndtype = [('name','|S3'),('v1',float),('v2',float)]
>>> series = ts.time_series([("ABC",1.1,10.),("ABD",2.2,20.),("ABE",3.3,30)],
...                         dtype=ndtype, start_date=ts.now('D'))
>>> _series = series.series

_series is only a masked array; that's gonna keep things nice and easy (no need to carry the dates).

Mask a record (viz, a full row) if v2 > 25:

>>> series[_series['v2'] > 25] = ma.masked

Mask a record if the last character of the name is "C". This one is trickier, as we need to test whether the field 'name' is masked:

>>> maskonnames = []
>>> for _ in _series['name']:
...     if _ is ma.masked:
...         maskonnames.append(False)
...     else:
...         maskonnames.append(_[-1] == 'C')
>>> series[np.array(maskonnames)] = ma.masked

(maskonnames is a list that we need to transform into a bool ndarray to have fancy indexing. Otherwise, we're just gonna take the first or second record (depending on whether maskonnames is False (0) or True (1)), and that's not what we want.)

So, so far:

>>> series
timeseries([(--, --, --) ('ABD', 2.2000000000000002, 20.0) (--, --, --)],
    dtype = [('name', '|S3'), ('v1', '<f8'), ('v2', '<f8')],
    dates = [...],
    freq  = D)

To mask only a field (and not the whole record):

>>> _series['v1'][_series['v1'] < 3] = ma.masked
>>> series
timeseries([(--, --, --) ('ABD', --, 20.0) (--, --, --)],
    dtype = [('name', '|S3'), ('v1', '<f8'), ('v2', '<f8')],
    dates = [...],
    freq  = D)

If you prefer to combine the conditions before masking:

>>> global_condition = np.zeros(len(series), dtype=bool)
>>> global_condition |= (_series['v2'] > 25)
>>> global_condition |= maskonnames
>>> series[global_condition] = ma.masked

HIH,
P.

From Dharhas.Pothina at twdb.state.tx.us Fri Jan 8 16:33:58 2010
From: Dharhas.Pothina at twdb.state.tx.us (Dharhas Pothina)
Date: Fri, 08 Jan 2010 15:33:58 -0600
Subject: [SciPy-User] Masking multiple fields in a structured timeseries object.
References: <4B4705F40200009B0002638B@GWWEB.twdb.state.tx.us>
Message-ID: <4B475066.63BA.009B.0@twdb.state.tx.us>

Thanks. I'll try implementing your approaches on Tuesday and will probably be back with questions after that.

- dharhas

>>> Pierre GM 1/8/2010 12:20 PM >>>
> [...]
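Applying the same maskonnames idea to criterion 3 above (the two-character instrument code at positions 18-19 of the filename) might look like this sketch, which assumes series and _series built as in Pierre's example but with a 'Filename' field as in Dharhas's data:

import numpy as np
import numpy.ma as ma

instrument = 'MW'
# f[17:19] picks out the 18th and 19th characters of each filename.
oninstrument = np.array([False if f is ma.masked else f[17:19] == instrument
                         for f in _series['Filename']])
series[oninstrument] = ma.masked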
> > I'm having trouble working out how to check all these conditions at once or sequentially before masking. Step by step, it's gonna be easier to debug. Take the simpler example: >>> ndtype=[('name','|S3'),('v1',float),('v2',float)] >>> series=ts.time_series([("ABC",1.1,10.),("ABD",2.2,20.),("ABE",3.3,30)], dtype=ndtype, start_date=ts.now('D')) >>> _series=series.series _series is only a masked array, that's gonna keep things nice and easy (no need to carry the dates) Mask a record (viz, a full row) if v2>25 >>> series[_series['v2']>25]=ma.masked Mask a record if the last character of the name is "C". This one is trickier, as we need to test whether the field 'name' is masked >>> maskonnames = [] >>> for _ in _series['name']: >>> if _ is ma.masked: >>> maskonnames.append(False) >>> else: >>> maskonnames.append(_[-1]=='C') >>> series[np.array(maskonnames)] = ma.masked (maskonnames is a list that we need to transform into a bool ndarray to have fancy indexing. Otherwise, we just gonna take the first or second record (depending on whether maskonnames is False (0) or True (1)), and that's not what we want. So, so far >>> series timeseries([(--, --, --) ('ABD', 2.2000000000000002, 20.0) (--, --, --)], dtype = [('name', '|S3'), ('v1', '>> _series['v1'][_series['v1']<3]=ma.masked >>> series timeseries([(--, --, --) ('ABD', --, 20.0) (--, --, --)], dtype = [('name', '|S3'), ('v1', '>> global_condition = np.zeros(len(series), dtype=bool) >>> global_condition |= _series[_series['v2']>25]=ma.masked >>> global_condition |= maskonnames >>> series[global_condition]=ma.masked HIH P. _______________________________________________ SciPy-User mailing list SciPy-User at scipy.org http://mail.scipy.org/mailman/listinfo/scipy-user From jgomezdans at gmail.com Sat Jan 9 11:18:22 2010 From: jgomezdans at gmail.com (Jose Gomez-Dans) Date: Sat, 9 Jan 2010 16:18:22 +0000 Subject: [SciPy-User] 2D Interpolation In-Reply-To: References: <91d218431001080708q5bdceb2cgf950bd2cb29dd5ad@mail.gmail.com> Message-ID: <91d218431001090818g791027d1q9276e06fc3d66a7@mail.gmail.com> Hi Bruce! 2010/1/8 Bruce Ford Here's your problem: > coords = array([xnew,ynew]) > interpolated = ndimage.map_coordinates(test_array, coords, order=3) > coords (and hence xnew and ynew) need to be specified in array units, so you need to calculate where your 1 degree grid falls within the 2.5 degree original grid, so you can define ynew = numpy.linspace (0,360,360)/2.5 xnew = numpy.linspace (0,180, 180)/2.5 coords = numpy.array([xnew, ynew]) and feed that into map_coordinates. Jose -------------- next part -------------- An HTML attachment was scrubbed... URL: From silva at lma.cnrs-mrs.fr Sat Jan 9 13:22:39 2010 From: silva at lma.cnrs-mrs.fr (Fabricio Silva) Date: Sat, 09 Jan 2010 19:22:39 +0100 Subject: [SciPy-User] spline interpolation and matplotlib interaction Message-ID: <1263061359.31641.9.camel@PCTerrusse> Hello folks, has anyone ever thought of a piece of code that - use a parametrization of a function with B-splines, i.e. with knots and coefficients, - let a user manipulate this parametrization through matplotlib (and event handling mechanisms) ? Within a scientific app, I would like to be able to handle simplified representation of time-varying quantities that could be obtained from measurement or from academic signals. 
I thought scipy.interpolation module could help me in such a way : - measurement of time signals - using splrep to least-square fitting and identification of knots and coefficients (with respect to a smoothing factor and a polynomial order) - checking : show the approximated function with matplotlib and let the user manually modify the control points. It seems that I have not the sufficient understanding of b-splines as the values output by splrep look strange even if the result of splev is nice... -- Fabrice Silva Laboratory of Mechanics and Acoustics (CNRS, UPR 7051) From contact at pythonxy.com Sun Jan 10 10:50:20 2010 From: contact at pythonxy.com (Pierre Raybaut) Date: Sun, 10 Jan 2010 16:50:20 +0100 Subject: [SciPy-User] [ANN] Spyder v1.0.2 released Message-ID: <4B49F73C.9060402@pythonxy.com> Hi all, I'm pleased to announce here that Spyder version 1.0.2 has been released: http://packages.python.org/spyder Previously known as Pydee, Spyder (Scientific PYthon Development EnviRonment) is a free open-source Python development environment providing MATLAB-like features in a simple and light-weighted software, available for Windows XP/Vista/7, GNU/Linux and MacOS X: * advanced code editing features (code analysis, ...) * interactive console with MATLAB-like workpace (with GUI-based list, dictionary, tuple, text and array editors -- screenshots: http://packages.python.org/spyder/console.html#the-workspace) and integrated matplotlib figures * external console to open an interpreter or run a script in a separate process (with a global variable explorer providing the same features as the interactive console's workspace) * code analysis with pyflakes and pylint * search in files features * documentation viewer: automatically retrieves docstrings or source code of the function/class called in the interactive/external console * integrated file/directories explorer * MATLAB-like path management ...and more! Spyder is part of spyderlib, a Python module based on PyQt4 and QScintilla2 which provides powerful console-related PyQt4 widgets. Spyder v1.0.2 is a bugfix release: * External console: subprocess python calls were using the external console's sitecustomize.py (instead of system sitecustomize.py) * Added workaround for PyQt4 v4.6+ major bug with matplotlib * Added option to customize the way matplotlib figures are embedded (docked or floating window) * Matplotlib's "Option" dialog box is now supporting subplots * Array editor now supports complex arrays * Editor: replaced "Run selection or current line" option by "Run selection or current block" (without selection, this feature is similar to MATLAB's cell mode) * ...and a lot of minor bugfixes. - Pierre From totalbull at mac.com Sun Jan 10 16:35:35 2010 From: totalbull at mac.com (totalbull at mac.com) Date: Sun, 10 Jan 2010 21:35:35 +0000 Subject: [SciPy-User] StdErr Problem with Gary Strangman's linregress function References: Message-ID: <92B7777B-E679-4A4F-8867-D91A2ED85FA9@mac.com> Hello, Excel and scipy.stats.linregress are disagreeing on the standard error of a regression. I need to find the standard errors of a bunch of regressions, and prefer to use pure Python than RPy. 
So I am going to scipy.stats.linregress, as advised at: http://www2.warwick.ac.uk/fac/sci/moac/currentstudents/peter_cock/python/lin_reg/#linregress >>> from scipy import stats >>> x = [5.05, 6.75, 3.21, 2.66] >>> y = [1.65, 26.5, -5.93, 7.96] >>> gradient, intercept, r_value, p_value, std_err = stats.linregress(x,y) >>> gradient 5.3935773611970186 >>> intercept -16.281127993087829 >>> r_value 0.72443514211849758 >>> r_value**2 0.52480627513624778 >>> std_err 3.6290901222878866 The problem is that the std error calculation does not agree with what is returned in Microsoft Excel's STEYX function (whereas all the other output does). From Excel: Anybody knows what's going on? Any alternative way of getting the standard error without going to R? -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: PastedGraphic-1.tiff Type: image/tiff Size: 33948 bytes Desc: not available URL: From jsseabold at gmail.com Sun Jan 10 16:59:43 2010 From: jsseabold at gmail.com (Skipper Seabold) Date: Sun, 10 Jan 2010 16:59:43 -0500 Subject: [SciPy-User] StdErr Problem with Gary Strangman's linregress function In-Reply-To: <92B7777B-E679-4A4F-8867-D91A2ED85FA9@mac.com> References: <92B7777B-E679-4A4F-8867-D91A2ED85FA9@mac.com> Message-ID: On Sun, Jan 10, 2010 at 4:35 PM, wrote: > > Hello, Excel and scipy.stats.linregress are disagreeing on the standard > error of a regression. > > I need to find the standard errors of a bunch of regressions, and prefer to > use pure Python than RPy. So I am going to scipy.stats.linregress, as > advised at: > > http://www2.warwick.ac.uk/fac/sci/moac/currentstudents/peter_cock/python/lin_reg/#linregress > > from scipy import stats > > x = [5.05, 6.75, 3.21, 2.66] > > y = [1.65, 26.5, -5.93, 7.96] > > gradient, intercept, r_value, p_value, std_err = stats.linregress(x,y) > > gradient > > 5.3935773611970186 > > intercept > > -16.281127993087829 > > r_value > > 0.72443514211849758 > > r_value**2 > > 0.52480627513624778 > > std_err > > 3.6290901222878866 > > > The problem is that the std error calculation does not agree with what is > returned in Microsoft Excel's STEYX function (whereas all the other output > does). From Excel: > > > > > Anybody knows what's going on? Any alternative way of getting the standard > error without going to R? > > > > 'std_err' is the standard error of 'gradient' above, not the standard error of the regression as reported in Excel. You might want to have a look at the statsmodels scikit as a possible alternative to R. I recommend getting the trunk source until the next release, which should be soon. http://statsmodels.sourceforge.net/ Skipper -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: PastedGraphic-1.tiff Type: image/tiff Size: 33948 bytes Desc: not available URL: From bsouthey at gmail.com Sun Jan 10 20:21:17 2010 From: bsouthey at gmail.com (Bruce Southey) Date: Sun, 10 Jan 2010 19:21:17 -0600 Subject: [SciPy-User] StdErr Problem with Gary Strangman's linregress function In-Reply-To: <92B7777B-E679-4A4F-8867-D91A2ED85FA9@mac.com> References: <92B7777B-E679-4A4F-8867-D91A2ED85FA9@mac.com> Message-ID: On Sun, Jan 10, 2010 at 3:35 PM, wrote: > > Hello, Excel and scipy.stats.linregress are disagreeing on the standard > error of a regression. 
> > I need to find the standard errors of a bunch of regressions, and prefer to > use pure Python than RPy. So I am going to scipy.stats.linregress, as > advised at: > > http://www2.warwick.ac.uk/fac/sci/moac/currentstudents/peter_cock/python/lin_reg/#linregress > > from scipy import stats > > x = [5.05, 6.75, 3.21, 2.66] > > y = [1.65, 26.5, -5.93, 7.96] > > gradient, intercept, r_value, p_value, std_err = stats.linregress(x,y) > > gradient > > 5.3935773611970186 > > intercept > > -16.281127993087829 > > r_value > > 0.72443514211849758 > > r_value**2 > > 0.52480627513624778 > > std_err > > 3.6290901222878866 > > > The problem is that the std error calculation does not agree with what is > returned in Microsoft Excel's STEYX function (whereas all the other output > does). From Excel: > > > > > Anybody knows what's going on? Any alternative way of getting the standard > error without going to R? > > > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > The Excel help is rather cryptic by :"Returns the standard error of the predicted y-value for each x in the regression. The standard error is a measure of the amount of error in the prediction of y for an individual x." But clearly this is not the same as the standard error of the 'gradient' (slope) returned by linregress. Without checking the formula, STEYX appears returns the square root what most people call the mean square error (MSE). Bruce -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: PastedGraphic-1.tiff Type: image/tiff Size: 33948 bytes Desc: not available URL: From josef.pktd at gmail.com Sun Jan 10 20:41:29 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Sun, 10 Jan 2010 20:41:29 -0500 Subject: [SciPy-User] StdErr Problem with Gary Strangman's linregress function In-Reply-To: References: <92B7777B-E679-4A4F-8867-D91A2ED85FA9@mac.com> Message-ID: <1cd32cbb1001101741t205f2fe2icbc6b10bf61c0be9@mail.gmail.com> On Sun, Jan 10, 2010 at 8:21 PM, Bruce Southey wrote: > > > On Sun, Jan 10, 2010 at 3:35 PM, wrote: > >> >> Hello, Excel and scipy.stats.linregress are disagreeing on the standard >> error of a regression. >> >> I need to find the standard errors of a bunch of regressions, and prefer >> to use pure Python than RPy. So I am going to scipy.stats.linregress, as >> advised at: >> >> http://www2.warwick.ac.uk/fac/sci/moac/currentstudents/peter_cock/python/lin_reg/#linregress >> >> from scipy import stats >> >> x = [5.05, 6.75, 3.21, 2.66] >> >> y = [1.65, 26.5, -5.93, 7.96] >> >> gradient, intercept, r_value, p_value, std_err = stats.linregress(x,y) >> >> gradient >> >> 5.3935773611970186 >> >> intercept >> >> -16.281127993087829 >> >> r_value >> >> 0.72443514211849758 >> >> r_value**2 >> >> 0.52480627513624778 >> >> std_err >> >> 3.6290901222878866 >> >> >> The problem is that the std error calculation does not agree with what is >> returned in Microsoft Excel's STEYX function (whereas all the other output >> does). From Excel: >> >> >> >> >> Anybody knows what's going on? Any alternative way of getting the standard >> error without going to R? 
>> >> >> >> >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user >> >> > The Excel help is rather cryptic by :"Returns the standard error of the > predicted y-value for each x in the regression. The standard error is a > measure of the amount of error in the prediction of y for an individual x." > But clearly this is not the same as the standard error of the 'gradient' > (slope) returned by linregress. Without checking the formula, STEYX appears > returns the square root what most people call the mean square error (MSE). > > Bruce > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > >>> gradient, intercept, r_value, p_value, std_err = stats.linregress(x,y) >>> ((y-intercept-np.array(x)*gradient)**2).sum()/(4.-2.) 136.80611125682617 >>> np.sqrt(_) 11.6964144615701 I think this should be the estimate of the standard deviation of the noise/error term. Josef -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: PastedGraphic-1.tiff Type: image/tiff Size: 33948 bytes Desc: not available URL: From timmichelsen at gmx-topmail.de Mon Jan 11 05:11:37 2010 From: timmichelsen at gmx-topmail.de (Tim Michelsen) Date: Mon, 11 Jan 2010 10:11:37 +0000 (UTC) Subject: [SciPy-User] scikits.timeseries: moving difference References: <175355BA-2C0F-425E-BAED-F8540213920D@gmail.com> Message-ID: > > the scikits.timeseries has implemented some moving windows functions: > > http://pytseries.sourceforge.net/lib.moving_funcs.html > > > > I would like to expand these to include the (absolute and relative) difference > > between one value and its successor in a time series. > > Like np.diff ? Could you give us a short example of what you have in mind ? Yes, thanks. This was what I was searching for. Regards, Timmie From bnuttall at uky.edu Mon Jan 11 10:47:44 2010 From: bnuttall at uky.edu (Nuttall, Brandon C) Date: Mon, 11 Jan 2010 10:47:44 -0500 Subject: [SciPy-User] StdErr Problem with Gary Strangman's linregress function In-Reply-To: References: <92B7777B-E679-4A4F-8867-D91A2ED85FA9@mac.com> Message-ID: For what it's worth, using by the definition of standard error of the estimate in Crow, Davis, and Maxfield, 1960, Statistics Manual: Dover Publications (p. 156), the Excel function provides the "correct" standard error of the estimate. Using notation from Crow, Davis, and Maxfield: import numpy as np n = 4.0 x = np.array([5.05, 6.75, 3.21, 2.66]) y = np.array([1.65, 26.5, -5.93, 7.96]) x2 = x*x y2 = y*y s2x = (4.0*x2.sum()-x.sum()*x.sum())/(n*(n-1.0)) s2y = (4.0*y2.sum()-y.sum()*y.sum())/(n*(n-1.0)) xy = x * y b = (4.0*xy.sum()-x.sum()*y.sum())/(4.0*x2.sum()-x.sum()*x.sum()) a = (y.sum()-b*x.sum())/n s2xy = ((n-1.0)/(n-2.0))*(s2y-b*b*s2x) ste = np.sqrt(s2xy) r=b*np.sqrt(s2x)/np.sqrt(s2y) print "intercept: ",a print "gradient (slope): ",b print "correlation coefficient, r: ",r print "std err est: ",ste Produces the output : intercept: -16.2811279931 gradient (slope): 5.3935773612 correlation coefficient, r: 0.724435142118 std err est: 11.6964144616 This same value for the standard error of the estimate is reported with the sample x,y data at the VassarStats, Statistical Computation Web Site, http://faculty.vassar.edu/lowry/VassarStats.html. 
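
Incidentally, the two numbers in this thread appear to be consistent with each other: linregress's std_err is the standard error of the slope, and the standard error of the estimate (Excel's STEYX) is that same value multiplied by sqrt(Sxx). A quick sketch to check this (only verified against the x, y from this thread):

import numpy as np
from scipy import stats

x = np.array([5.05, 6.75, 3.21, 2.66])
y = np.array([1.65, 26.5, -5.93, 7.96])
slope, intercept, r, p, se_slope = stats.linregress(x, y)

# standard error of the estimate, i.e. Excel's STEYX: sqrt(SSE/(n-2))
resid = y - (intercept + slope*x)
steyx = np.sqrt((resid**2).sum()/(len(x) - 2.0))

# the slope's standard error differs from STEYX by a factor of sqrt(Sxx)
sxx = ((x - x.mean())**2).sum()
print "STEYX:               ", steyx                    # ~11.696, matches Excel
print "se_slope*sqrt(Sxx):  ", se_slope*np.sqrt(sxx)    # same number

So both tools are computing sensible quantities; they just report different ones.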
Brandon Nuttall, KRPG-1364 Kentucky Geological Survey www.uky.edu/kgs bnuttall at uky.edu (KGS, Mo-We) Brandon.nuttall at ky.gov (EEC, Th-Fr) 859-257-5500 ext 30544 (main) 859-323-0544 (direct) 859-684-7473 (cell) 859-257-1147 (FAX) From: scipy-user-bounces at scipy.org [mailto:scipy-user-bounces at scipy.org] On Behalf Of Bruce Southey Sent: Sunday, January 10, 2010 8:21 PM To: SciPy Users List Subject: Re: [SciPy-User] StdErr Problem with Gary Strangman's linregress function On Sun, Jan 10, 2010 at 3:35 PM, > wrote: Hello, Excel and scipy.stats.linregress are disagreeing on the standard error of a regression. I need to find the standard errors of a bunch of regressions, and prefer to use pure Python than RPy. So I am going to scipy.stats.linregress, as advised at: http://www2.warwick.ac.uk/fac/sci/moac/currentstudents/peter_cock/python/lin_reg/#linregress from scipy import stats x = [5.05, 6.75, 3.21, 2.66] y = [1.65, 26.5, -5.93, 7.96] gradient, intercept, r_value, p_value, std_err = stats.linregress(x,y) gradient 5.3935773611970186 intercept -16.281127993087829 r_value 0.72443514211849758 r_value**2 0.52480627513624778 std_err 3.6290901222878866 The problem is that the std error calculation does not agree with what is returned in Microsoft Excel's STEYX function (whereas all the other output does). From Excel: [cid:image001.png at 01CA92A7.C1C66980] Anybody knows what's going on? Any alternative way of getting the standard error without going to R? _______________________________________________ SciPy-User mailing list SciPy-User at scipy.org http://mail.scipy.org/mailman/listinfo/scipy-user The Excel help is rather cryptic by :"Returns the standard error of the predicted y-value for each x in the regression. The standard error is a measure of the amount of error in the prediction of y for an individual x." But clearly this is not the same as the standard error of the 'gradient' (slope) returned by linregress. Without checking the formula, STEYX appears returns the square root what most people call the mean square error (MSE). Bruce -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image001.png Type: image/png Size: 1973 bytes Desc: image001.png URL: From bruce at clearscienceinc.com Mon Jan 11 11:40:42 2010 From: bruce at clearscienceinc.com (Bruce Ford) Date: Mon, 11 Jan 2010 11:40:42 -0500 Subject: [SciPy-User] ValueError: setting and array element with a sequence Message-ID: I'm new at this and I'm getting this error. It looks straightforward enough. Any ideas? ynew = numpy.linspace (0,360,360)/2.5 xnew = numpy.linspace (0,180, 180)/2.5 coords = numpy.array([xnew, ynew]) yeilds: ValueError: setting and array element with a sequence Bruce --------------------------------------- Bruce W. Ford Clear Science, Inc. bruce at clearscienceinc.com bruce.w.ford.ctr at navy.smil.mil http://www.ClearScienceInc.com Phone/Fax: 904-379-9704 8241 Parkridge Circle N. Jacksonville, FL 32211 Skype: bruce.w.ford Google Talk: fordbw at gmail.com From robert.kern at gmail.com Mon Jan 11 11:42:27 2010 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 11 Jan 2010 10:42:27 -0600 Subject: [SciPy-User] ValueError: setting and array element with a sequence In-Reply-To: References: Message-ID: <3d375d731001110842x1ff2d488i63284f57004d66e6@mail.gmail.com> On Mon, Jan 11, 2010 at 10:40, Bruce Ford wrote: > I'm new at this and I'm getting this error. ?It looks straightforward > enough. 
?Any ideas? > > ynew = numpy.linspace (0,360,360)/2.5 > xnew = numpy.linspace (0,180, 180)/2.5 > coords = numpy.array([xnew, ynew]) > > yeilds: ?ValueError: ?setting and array element with a sequence ynew has 360 elements. xnew has 180. They need to be the same if you want to make an (2,N)-shape array from them. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From Chris.Barker at noaa.gov Mon Jan 11 14:41:31 2010 From: Chris.Barker at noaa.gov (Christopher Barker) Date: Mon, 11 Jan 2010 11:41:31 -0800 Subject: [SciPy-User] ValueError: setting and array element with a sequence In-Reply-To: <3d375d731001110842x1ff2d488i63284f57004d66e6@mail.gmail.com> References: <3d375d731001110842x1ff2d488i63284f57004d66e6@mail.gmail.com> Message-ID: <4B4B7EEB.9050600@noaa.gov> Robert Kern wrote: > On Mon, Jan 11, 2010 at 10:40, Bruce Ford wrote: >> I'm new at this and I'm getting this error. It looks straightforward >> enough. Any ideas? >> >> ynew = numpy.linspace (0,360,360)/2.5 >> xnew = numpy.linspace (0,180, 180)/2.5 >> coords = numpy.array([xnew, ynew]) >> >> yeilds: ValueError: setting and array element with a sequence > > ynew has 360 elements. xnew has 180. They need to be the same if you > want to make an (2,N)-shape array from them. if you want a 360x180 (or 180x360) arrays, then you can do: In [22]: X,Y = np.meshgrid(xnew, ynew) In [23]: X.shape Out[23]: (360, 180) In [24]: Y.shape Out[24]: (360, 180) or, better yet rely on numpy broadcasting: In [25]: xnew.shape = (1, -1) # make x a single row In [26]: ynew.shape = (-1, 1) # make y a single column In [27]: z = xnew * ynew**2 # they are then broadcast when combined In [28]: z.shape Out[28]: (360, 180) -Chris -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/OR&R (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception Chris.Barker at noaa.gov From bnuttall at uky.edu Mon Jan 11 15:07:02 2010 From: bnuttall at uky.edu (Nuttall, Brandon C) Date: Mon, 11 Jan 2010 15:07:02 -0500 Subject: [SciPy-User] StdErr Problem with Gary Strangman's linregress function In-Reply-To: <1cd32cbb1001101741t205f2fe2icbc6b10bf61c0be9@mail.gmail.com> References: <92B7777B-E679-4A4F-8867-D91A2ED85FA9@mac.com> <1cd32cbb1001101741t205f2fe2icbc6b10bf61c0be9@mail.gmail.com> Message-ID: OK, I think I've figured it out. The numpy covariance function doesn't seem to return the actual sample variances (it returns a population variance?). What this means is that for the linregress() function in the stats.py source file, the quantity sterrest is not calculated correctly and needs to be adjusted to the sample variance. In addition, it includes the quantity ssxm, sum of squares for x (?) and I can't find documentation for its inclusion. # as implemented # sterrest = np.sqrt((1-r*r)*ssym / ssxm / df) # should be corrected to sterrest = np.sqrt((1-r*r)*(ssym*n)/df) Having made this correction, both the example provided and the example in Crow, Davis, and Maxfield (Table 6.1, p. 154) provide the same value for the standard error of the estimate and the value matches what is calculated by Excel. I don't know anything about SVN or submitting a correction, so someone will have to help me out or do it for me. Thanks. 
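
In the meantime, here is a self-contained way to compare the two formulas side by side without touching stats.py (a sketch; I'm assuming ssxm and ssym are the biased variances that np.cov(x, y, bias=1) returns, which is how I read the source):

import numpy as np

x = np.array([5.05, 6.75, 3.21, 2.66])
y = np.array([1.65, 26.5, -5.93, 7.96])
n = len(x)
df = n - 2.0
# biased (population) variances/covariance, unpacked in the same layout
# I believe stats.py uses (an assumption on my part)
ssxm, ssxym, ssyxm, ssym = np.cov(x, y, bias=1).flat
r = ssxym/np.sqrt(ssxm*ssym)
print "as implemented:", np.sqrt((1 - r*r)*ssym/ssxm/df)  # ~3.629, current std_err
print "proposed:      ", np.sqrt((1 - r*r)*ssym*n/df)     # ~11.696, Excel's STEYX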
Brandon Brandon Nuttall, KRPG-1364 Kentucky Geological Survey www.uky.edu/kgs bnuttall at uky.edu (KGS, Mo-We) Brandon.nuttall at ky.gov (EEC, Th-Fr) 859-257-5500 ext 30544 (main) 859-323-0544 (direct) 859-684-7473 (cell) 859-257-1147 (FAX) From: scipy-user-bounces at scipy.org [mailto:scipy-user-bounces at scipy.org] On Behalf Of josef.pktd at gmail.com Sent: Sunday, January 10, 2010 8:41 PM To: SciPy Users List Subject: Re: [SciPy-User] StdErr Problem with Gary Strangman's linregress function On Sun, Jan 10, 2010 at 8:21 PM, Bruce Southey > wrote: On Sun, Jan 10, 2010 at 3:35 PM, > wrote: Hello, Excel and scipy.stats.linregress are disagreeing on the standard error of a regression. I need to find the standard errors of a bunch of regressions, and prefer to use pure Python than RPy. So I am going to scipy.stats.linregress, as advised at: http://www2.warwick.ac.uk/fac/sci/moac/currentstudents/peter_cock/python/lin_reg/#linregress from scipy import stats x = [5.05, 6.75, 3.21, 2.66] y = [1.65, 26.5, -5.93, 7.96] gradient, intercept, r_value, p_value, std_err = stats.linregress(x,y) gradient 5.3935773611970186 intercept -16.281127993087829 r_value 0.72443514211849758 r_value**2 0.52480627513624778 std_err 3.6290901222878866 The problem is that the std error calculation does not agree with what is returned in Microsoft Excel's STEYX function (whereas all the other output does). From Excel: [cid:image001.png at 01CA92CD.B1C81030] Anybody knows what's going on? Any alternative way of getting the standard error without going to R? _______________________________________________ SciPy-User mailing list SciPy-User at scipy.org http://mail.scipy.org/mailman/listinfo/scipy-user The Excel help is rather cryptic by :"Returns the standard error of the predicted y-value for each x in the regression. The standard error is a measure of the amount of error in the prediction of y for an individual x." But clearly this is not the same as the standard error of the 'gradient' (slope) returned by linregress. Without checking the formula, STEYX appears returns the square root what most people call the mean square error (MSE). Bruce _______________________________________________ SciPy-User mailing list SciPy-User at scipy.org http://mail.scipy.org/mailman/listinfo/scipy-user >>> gradient, intercept, r_value, p_value, std_err = stats.linregress(x,y) >>> ((y-intercept-np.array(x)*gradient)**2).sum()/(4.-2.) 136.80611125682617 >>> np.sqrt(_) 11.6964144615701 I think this should be the estimate of the standard deviation of the noise/error term. Josef -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image001.png Type: image/png Size: 1973 bytes Desc: image001.png URL: From totalbull at mac.com Mon Jan 11 15:08:46 2010 From: totalbull at mac.com (totalbull at mac.com) Date: Mon, 11 Jan 2010 20:08:46 +0000 Subject: [SciPy-User] StdErr Problem with Gary Strangman's linregress function In-Reply-To: References: <92B7777B-E679-4A4F-8867-D91A2ED85FA9@mac.com> Message-ID: <003BDC42-BD00-440A-A237-1F240BDE1AD1@mac.com> Thanks very much to all who have helped with this. I am going to go with the first-principles formulae as per below. 
Otherwise I also asked on Stack Overflow and one person answered with a scikits example: http://stackoverflow.com/questions/2038667/scipy-linregress-function-erroneous-standard-error-return On 11 Jan 2010, at 15:47, Nuttall, Brandon C wrote: > For what it?s worth, using by the definition of standard error of the estimate in Crow, Davis, and Maxfield, 1960, Statistics Manual: Dover Publications (p. 156), the Excel function provides the ?correct? standard error of the estimate. Using notation from Crow, Davis, and Maxfield: > > import numpy as np > n = 4.0 > x = np.array([5.05, 6.75, 3.21, 2.66]) > y = np.array([1.65, 26.5, -5.93, 7.96]) > x2 = x*x > y2 = y*y > s2x = (4.0*x2.sum()-x.sum()*x.sum())/(n*(n-1.0)) > s2y = (4.0*y2.sum()-y.sum()*y.sum())/(n*(n-1.0)) > xy = x * y > b = (4.0*xy.sum()-x.sum()*y.sum())/(4.0*x2.sum()-x.sum()*x.sum()) > a = (y.sum()-b*x.sum())/n > s2xy = ((n-1.0)/(n-2.0))*(s2y-b*b*s2x) > ste = np.sqrt(s2xy) > r=b*np.sqrt(s2x)/np.sqrt(s2y) > print "intercept: ",a > print "gradient (slope): ",b > print "correlation coefficient, r: ",r > print "std err est: ",ste > > Produces the output : > > intercept: -16.2811279931 > gradient (slope): 5.3935773612 > correlation coefficient, r: 0.724435142118 > std err est: 11.6964144616 > > This same value for the standard error of the estimate is reported with the sample x,y data at the VassarStats, Statistical Computation Web Site,http://faculty.vassar.edu/lowry/VassarStats.html. > > Brandon Nuttall, KRPG-1364 > Kentucky Geological Survey > www.uky.edu/kgs > bnuttall at uky.edu (KGS, Mo-We) > Brandon.nuttall at ky.gov (EEC, Th-Fr) > 859-257-5500 ext 30544 (main) > 859-323-0544 (direct) > 859-684-7473 (cell) > 859-257-1147 (FAX) > > From: scipy-user-bounces at scipy.org [mailto:scipy-user-bounces at scipy.org] On Behalf Of Bruce Southey > Sent: Sunday, January 10, 2010 8:21 PM > To: SciPy Users List > Subject: Re: [SciPy-User] StdErr Problem with Gary Strangman's linregress function > > > > On Sun, Jan 10, 2010 at 3:35 PM, wrote: > > Hello, Excel and scipy.stats.linregress are disagreeing on the standard error of a regression. > > I need to find the standard errors of a bunch of regressions, and prefer to use pure Python than RPy. So I am going to scipy.stats.linregress, as advised at: > http://www2.warwick.ac.uk/fac/sci/moac/currentstudents/peter_cock/python/lin_reg/#linregress > > > from scipy import stats > x = [5.05, 6.75, 3.21, 2.66] > y = [1.65, 26.5, -5.93, 7.96] > gradient, intercept, r_value, p_value, std_err = stats.linregress(x,y) > gradient > 5.3935773611970186 > > intercept > -16.281127993087829 > > r_value > 0.72443514211849758 > > r_value**2 > 0.52480627513624778 > > std_err > 3.6290901222878866 > > > The problem is that the std error calculation does not agree with what is returned in Microsoft Excel's STEYX function (whereas all the other output does). From Excel: > > > > > Anybody knows what's going on? Any alternative way of getting the standard error without going to R? > > > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > > The Excel help is rather cryptic by :"Returns the standard error of the predicted y-value for each x in the regression. The standard error is a measure of the amount of error in the prediction of y for an individual x." But clearly this is not the same as the standard error of the 'gradient' (slope) returned by linregress. 
Without checking the formula, STEYX appears returns the square root what most people call the mean square error (MSE). > > Bruce > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Mon Jan 11 16:19:48 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Mon, 11 Jan 2010 16:19:48 -0500 Subject: [SciPy-User] StdErr Problem with Gary Strangman's linregress function In-Reply-To: <003BDC42-BD00-440A-A237-1F240BDE1AD1@mac.com> References: <92B7777B-E679-4A4F-8867-D91A2ED85FA9@mac.com> <003BDC42-BD00-440A-A237-1F240BDE1AD1@mac.com> Message-ID: <1cd32cbb1001111319x19e9b5bfveb4ca55258a55592@mail.gmail.com> On Mon, Jan 11, 2010 at 3:08 PM, wrote: > Thanks very much to all who have helped with this. > I am going to go with the first-principles formulae as per below. > Otherwise I also asked on Stack Overflow and one person answered with a > scikits example: > http://stackoverflow.com/questions/2038667/scipy-linregress-function-erroneous-standard-error-return If the old version of linregress matched excel, as you say, then I unintentionally changed the meaning of this value in response to a previous bug report (see http://projects.scipy.org/scipy/ticket/874 ) It's sometimes difficult to figure out what a value is supposed to be, if there are neither sufficient documentation nor tests for it. I had the numbers of linregress verified against statsmodels, but the standard error just means something different than the definition in excel. But as Skipper said, for all but the simplest regression case, scikits.statsmodels is much more general and produces more results. Josef > > On 11 Jan 2010, at 15:47, Nuttall, Brandon C wrote: > > For what it?s worth, using by the definition of standard error of the > estimate in Crow, Davis, and Maxfield, 1960, Statistics Manual: Dover > Publications (p. 156), the Excel function provides the ?correct? standard > error of the estimate. ?Using notation from Crow, Davis, and Maxfield: > > import numpy as np > n = 4.0 > x = np.array([5.05, 6.75, 3.21, 2.66]) > y = np.array([1.65, 26.5, -5.93, 7.96]) > x2 = x*x > y2 = y*y > s2x = (4.0*x2.sum()-x.sum()*x.sum())/(n*(n-1.0)) > s2y = (4.0*y2.sum()-y.sum()*y.sum())/(n*(n-1.0)) > xy = x * y > b = (4.0*xy.sum()-x.sum()*y.sum())/(4.0*x2.sum()-x.sum()*x.sum()) > a = (y.sum()-b*x.sum())/n > s2xy = ((n-1.0)/(n-2.0))*(s2y-b*b*s2x) > ste = np.sqrt(s2xy) > r=b*np.sqrt(s2x)/np.sqrt(s2y) > print "intercept: ",a > print "gradient (slope): ",b > print "correlation coefficient, r: ",r > print "std err est: ",ste > > Produces the output : > > intercept:? -16.2811279931 > gradient (slope):? 5.3935773612 > correlation coefficient, r:? 0.724435142118 > std err est:? 11.6964144616 > > This same value for the standard error of the estimate is reported with the > sample x,y data at the VassarStats, Statistical Computation Web > Site,http://faculty.vassar.edu/lowry/VassarStats.html. 
> > Brandon Nuttall, KRPG-1364 > Kentucky Geological Survey > www.uky.edu/kgs > bnuttall at uky.edu?(KGS, Mo-We) > Brandon.nuttall at ky.gov?(EEC, Th-Fr) > 859-257-5500 ext 30544 (main) > 859-323-0544 (direct) > 859-684-7473 (cell) > 859-257-1147 (FAX) > > From:?scipy-user-bounces at scipy.org?[mailto:scipy-user-bounces at scipy.org]?On > Behalf Of?Bruce Southey > Sent:?Sunday, January 10, 2010 8:21 PM > To:?SciPy Users List > Subject:?Re: [SciPy-User] StdErr Problem with Gary Strangman's linregress > function > > > > > On Sun, Jan 10, 2010 at 3:35 PM, wrote: > > Hello, Excel and scipy.stats.linregress are disagreeing on the standard > error of a regression. > > I need to find the standard errors of a bunch of regressions, and prefer to > use pure Python than RPy. So I am going to scipy.stats.linregress, as > advised at: > http://www2.warwick.ac.uk/fac/sci/moac/currentstudents/peter_cock/python/lin_reg/#linregress > > > from scipy import stats > > x = [5.05, 6.75, 3.21, 2.66] > > y = [1.65, 26.5, -5.93, 7.96] > > gradient, intercept, r_value, p_value, std_err = stats.linregress(x,y) > > gradient > > 5.3935773611970186 > > intercept > > -16.281127993087829 > > r_value > > 0.72443514211849758 > > r_value**2 > > 0.52480627513624778 > > std_err > > 3.6290901222878866 > > > The problem is that the std error calculation does not agree with what is > returned in Microsoft Excel's STEYX function (whereas all the other output > does). From Excel: > > > > > Anybody knows what's going on? Any alternative way of getting the standard > error without going to R? > > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > The Excel help is rather cryptic by ? :"Returns the standard error of the > predicted y-value for each x in the regression. The standard error is a > measure of the amount of error in the prediction of y for an individual x." > But clearly this is not the same as the standard error of the 'gradient' > (slope) returned by linregress. Without checking the formula, STEYX appears > returns the square root what most people call the mean square error (MSE). > > Bruce > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > From totalbull at mac.com Mon Jan 11 16:34:19 2010 From: totalbull at mac.com (totalbull at mac.com) Date: Mon, 11 Jan 2010 21:34:19 +0000 Subject: [SciPy-User] StdErr Problem with Gary Strangman's linregress function In-Reply-To: <1cd32cbb1001111319x19e9b5bfveb4ca55258a55592@mail.gmail.com> References: <92B7777B-E679-4A4F-8867-D91A2ED85FA9@mac.com> <003BDC42-BD00-440A-A237-1F240BDE1AD1@mac.com> <1cd32cbb1001111319x19e9b5bfveb4ca55258a55592@mail.gmail.com> Message-ID: <07D7CD52-2453-4EE1-AA2D-0F5F66FD738C@mac.com> not a problem Josef. The new output was luckily wildly different enough so that "that something had happened" was easy to spot. Again thanks for all the help. Tom On 11 Jan 2010, at 21:19, josef.pktd at gmail.com wrote: > On Mon, Jan 11, 2010 at 3:08 PM, wrote: >> Thanks very much to all who have helped with this. >> I am going to go with the first-principles formulae as per below. 
>> Otherwise I also asked on Stack Overflow and one person answered with a >> scikits example: >> http://stackoverflow.com/questions/2038667/scipy-linregress-function-erroneous-standard-error-return > > If the old version of linregress matched excel, as you say, then I > unintentionally changed the meaning of this value in response to a > previous bug report (see http://projects.scipy.org/scipy/ticket/874 ) > > It's sometimes difficult to figure out what a value is supposed to be, > if there are neither sufficient documentation nor tests for it. I had > the numbers of linregress verified against statsmodels, but the > standard error just means something different than the definition in > excel. > > But as Skipper said, for all but the simplest regression case, > scikits.statsmodels is much more general and produces more results. > > Josef > > > >> >> On 11 Jan 2010, at 15:47, Nuttall, Brandon C wrote: >> >> For what it?s worth, using by the definition of standard error of the >> estimate in Crow, Davis, and Maxfield, 1960, Statistics Manual: Dover >> Publications (p. 156), the Excel function provides the ?correct? standard >> error of the estimate. Using notation from Crow, Davis, and Maxfield: >> >> import numpy as np >> n = 4.0 >> x = np.array([5.05, 6.75, 3.21, 2.66]) >> y = np.array([1.65, 26.5, -5.93, 7.96]) >> x2 = x*x >> y2 = y*y >> s2x = (4.0*x2.sum()-x.sum()*x.sum())/(n*(n-1.0)) >> s2y = (4.0*y2.sum()-y.sum()*y.sum())/(n*(n-1.0)) >> xy = x * y >> b = (4.0*xy.sum()-x.sum()*y.sum())/(4.0*x2.sum()-x.sum()*x.sum()) >> a = (y.sum()-b*x.sum())/n >> s2xy = ((n-1.0)/(n-2.0))*(s2y-b*b*s2x) >> ste = np.sqrt(s2xy) >> r=b*np.sqrt(s2x)/np.sqrt(s2y) >> print "intercept: ",a >> print "gradient (slope): ",b >> print "correlation coefficient, r: ",r >> print "std err est: ",ste >> >> Produces the output : >> >> intercept: -16.2811279931 >> gradient (slope): 5.3935773612 >> correlation coefficient, r: 0.724435142118 >> std err est: 11.6964144616 >> >> This same value for the standard error of the estimate is reported with the >> sample x,y data at the VassarStats, Statistical Computation Web >> Site,http://faculty.vassar.edu/lowry/VassarStats.html. >> >> Brandon Nuttall, KRPG-1364 >> Kentucky Geological Survey >> www.uky.edu/kgs >> bnuttall at uky.edu (KGS, Mo-We) >> Brandon.nuttall at ky.gov (EEC, Th-Fr) >> 859-257-5500 ext 30544 (main) >> 859-323-0544 (direct) >> 859-684-7473 (cell) >> 859-257-1147 (FAX) >> >> From: scipy-user-bounces at scipy.org [mailto:scipy-user-bounces at scipy.org] On >> Behalf Of Bruce Southey >> Sent: Sunday, January 10, 2010 8:21 PM >> To: SciPy Users List >> Subject: Re: [SciPy-User] StdErr Problem with Gary Strangman's linregress >> function >> >> >> >> >> On Sun, Jan 10, 2010 at 3:35 PM, wrote: >> >> Hello, Excel and scipy.stats.linregress are disagreeing on the standard >> error of a regression. >> >> I need to find the standard errors of a bunch of regressions, and prefer to >> use pure Python than RPy. 
So I am going to scipy.stats.linregress, as >> advised at: >> http://www2.warwick.ac.uk/fac/sci/moac/currentstudents/peter_cock/python/lin_reg/#linregress >> >> >> from scipy import stats >> >> x = [5.05, 6.75, 3.21, 2.66] >> >> y = [1.65, 26.5, -5.93, 7.96] >> >> gradient, intercept, r_value, p_value, std_err = stats.linregress(x,y) >> >> gradient >> >> 5.3935773611970186 >> >> intercept >> >> -16.281127993087829 >> >> r_value >> >> 0.72443514211849758 >> >> r_value**2 >> >> 0.52480627513624778 >> >> std_err >> >> 3.6290901222878866 >> >> >> The problem is that the std error calculation does not agree with what is >> returned in Microsoft Excel's STEYX function (whereas all the other output >> does). From Excel: >> >> >> >> >> Anybody knows what's going on? Any alternative way of getting the standard >> error without going to R? >> >> >> >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user >> >> The Excel help is rather cryptic by :"Returns the standard error of the >> predicted y-value for each x in the regression. The standard error is a >> measure of the amount of error in the prediction of y for an individual x." >> But clearly this is not the same as the standard error of the 'gradient' >> (slope) returned by linregress. Without checking the formula, STEYX appears >> returns the square root what most people call the mean square error (MSE). >> >> Bruce >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user >> >> >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user >> >> > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From henrylindsaysmith at gmail.com Mon Jan 11 18:06:16 2010 From: henrylindsaysmith at gmail.com (henry lindsay smith) Date: Mon, 11 Jan 2010 23:06:16 +0000 Subject: [SciPy-User] [SciPy-user] Audiolab on Py2.6 In-Reply-To: <27026778.post@talk.nabble.com> References: <4AE5DEDF.7070701@asu.edu> <26402986.post@talk.nabble.com> <3d375d730911172231i4cf42760l80038a00f84fa7c8@mail.gmail.com> <27026778.post@talk.nabble.com> Message-ID: <6f0383341001111506w31a07522xbc869c94fb0bcd04@mail.gmail.com> On Wed, Jan 6, 2010 at 5:15 AM, dpfrota wrote: > > > > Robert Kern-2 wrote: > > > > On Wed, Nov 18, 2009 at 00:29, dpfrota wrote: > >> > >> What is the meaning of these adresses? > >> I opened these files, and they has some strange lines. The first file > has > >> only " __import__('pkg_resources').declare_namespace(__name__) ". Is > >> module > >> PKG necessary? > > > > These enable the scikits namespace such that you can have multiple > > scikits packages installed (possibly to separate locations). > > > > -- > > Robert Kern > > > > "I have come to believe that the whole world is an enigma, a harmless > > enigma that is made terrible by our own mad attempt to interpret it as > > though it had an underlying truth." > > -- Umberto Eco > > _______________________________________________ > > SciPy-User mailing list > > SciPy-User at scipy.org > > http://mail.scipy.org/mailman/listinfo/scipy-user > > > > > > I made some tests and I am almost sure the problem is with this file: > "C:\Python26\Lib\site-packages\scikits\audiolab\pysndfile\_sndfile.pyd". 
>
> But I don't know how to see its contents or fix the problem.
>
> Any more tips? (Please!)
> --
> View this message in context:
> http://old.nabble.com/Audiolab-on-Py2.6-tp26064218p27026778.html
> Sent from the Scipy-User mailing list archive at Nabble.com.
>
> _______________________________________________
> SciPy-User mailing list
> SciPy-User at scipy.org
> http://mail.scipy.org/mailman/listinfo/scipy-user
>

I have this problem as well; as far as I am aware it's not fixed and is a problem linking to sndfile.dll. What are you using audiolab for? I have got round the problem by using wavfile in scipy to open and read wav files and pyaudiere to play audio. I even got 24-bit wav files to read, though I had to alter wavfile.py in my scipy distribution, which is not advisable.
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From alan at ajackson.org Mon Jan 11 18:22:54 2010
From: alan at ajackson.org (alan at ajackson.org)
Date: Mon, 11 Jan 2010 17:22:54 -0600
Subject: [SciPy-User] Trying to use PIL and numpy
Message-ID: <20100111172254.692ed02a@ajackson.org>

I'm having some issues trying to use PIL and numpy (for the first time). It's probably something simple, it usually is.

When I run the following, the output is all buggered up. It looks like the array indices got switched about somewhere.

import Image
import numpy as np  # needed for np.asarray below

im = Image.open('test.ppm')
im2 = im.convert(mode='F')

a = np.asarray(im2)
imback2 = Image.fromarray(a)

imback = imback2.convert(mode='RGB')
imback.save('testout.png')

I tried removing bits, and it is the asarray -> fromarray sequence that messes stuff up.

I'm running Karmic Koala with
Python 2.6.4 (r264:75706, Dec 7 2009, 18:45:15)
numpy 1.3.0
Image 1.1.6

--
-----------------------------------------------------------------------
| Alan K. Jackson   | To see a World in a Grain of Sand       |
| alan at ajackson.org | And a Heaven in a Wild Flower,          |
| www.ajackson.org  | Hold Infinity in the palm of your hand  |
| Houston, Texas    | And Eternity in an hour. - Blake        |
-----------------------------------------------------------------------

From afraser at lanl.gov Mon Jan 11 18:47:46 2010
From: afraser at lanl.gov (Andy Fraser)
Date: Mon, 11 Jan 2010 16:47:46 -0700
Subject: [SciPy-User] Trying to use PIL and numpy
In-Reply-To: <20100111172254.692ed02a@ajackson.org> (alan@ajackson.org's message of "Mon\, 11 Jan 2010 17\:22\:54 -0600")
References: <20100111172254.692ed02a@ajackson.org>
Message-ID: <87ocl0gq59.fsf@lanl.gov>

>>>>> "AJ" == writes:

AJ> I'm having some issues trying to use PIL and numpy (for the
AJ> first time). It's probably something simple, it usually is.

I'm learning to work with images too. I started with PIL.Image for looking at images, but now I am moving towards pyFltk and ImageMagick. I find it difficult to keep track of how bits in arrays get mapped to pixels on the screen. I end up with lines like A = A.transpose((1,0,2))[::-1,:,:] in my code.
Here is a utility that depends on PIL that I use to look at data:

import numpy
from PIL import Image # /usr/share/pyshared/PIL/Image.py

def display(A, msg='Default msg for displaying an array', MAX=None):
    import tempfile, os
    if A.ndim == 3:
        if A.shape[0] == 3: # This is gdal format
            A = A.transpose((1,2,0))
        else:
            A = A.transpose((1,0,2))[::-1,:,:]
    if A.dtype == numpy.dtype(numpy.bool):
        A = numpy.array(A*255, numpy.uint8)
    if A.dtype != numpy.dtype(numpy.uint8):
        # rescale to the 0-255 range for display
        if MAX == None:
            MAX = A.max(0).max(0)
        MIN = A.min(0).min(0)
        scale = 1.0/(MAX-MIN)
        T = (A-MIN)*scale
        A = numpy.array(T*256, numpy.uint8)
    Name = tempfile.mktemp(dir='temp')
    image = Image.fromarray(A)
    image.save(Name, 'PPM')
    #os.system('eog %s'%Name) # eog is eye of gnome
    os.system('display %s'%Name) # display from ImageMagick
    print msg
    os.system('rm %s'%Name)
    return

From joebarfett at yahoo.ca Mon Jan 11 19:18:07 2010
From: joebarfett at yahoo.ca (Joe Barfett)
Date: Mon, 11 Jan 2010 16:18:07 -0800 (PST)
Subject: [SciPy-User] ifft on images, symmetry artifacts?
Message-ID: <440587.18866.qm@web59411.mail.ac4.yahoo.com>

Hello,
I'm using scipy (numpy.fft.fft2) to transform an image into the frequency domain. Then by using numpy.fft.ifft2 to transform the same image back into the spatial domain, I find that I get symmetry in the image around a reflection line (and not the original image).
Google has revealed websites like this one:
http://www.rzuser.uni-heidelberg.de/~ge6/Programing/convolution.html
This is the code snippet they use:

def Convolution(image1,image2):
    """ Simple convolution example """
    fftimage = fft2(image1)*fft2(image2)
    return ifft2(fftimage).real
#end of Convolution

which uses ifft but generates appropriate output. They do however only use the real component of the frequency domain image.
I find the exact same > approach does not work in my case, but rather gives these weird symmetries. > It's been a few weeks of hacking and I would really appreciate the guidance > of someone more experienced than me. Thanks a great deal if you know the > answer! > joe > Something I recently found that might be helpful http://blogs.mathworks.com/steve/2009/12/04/fourier-transform-visualization-using-windowing/ http://shalin.wordpress.com/2009/12/06/fftifft/ (I know next to nothing about image processing but struggle with fft in general) Josef > > > ------------------------------ > The new Internet Explorer? 8 - Faster, safer, easier. Optimized for Yahoo! > *Get it Now for Free!* > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From david_baddeley at yahoo.com.au Mon Jan 11 20:21:25 2010 From: david_baddeley at yahoo.com.au (David Baddeley) Date: Mon, 11 Jan 2010 17:21:25 -0800 (PST) Subject: [SciPy-User] ifft on images, symmetry artifacts? In-Reply-To: <440587.18866.qm@web59411.mail.ac4.yahoo.com> References: <440587.18866.qm@web59411.mail.ac4.yahoo.com> Message-ID: <243264.91705.qm@web33001.mail.mud.yahoo.com> There are few possibilities - the most likely is that you are taking either the real part or the absolute value in the frequency domain. This kills all the phase information and results in a symmetric image. Note that the code snippet you cite takes the real part AFTER the inverse transformation, which is perfectly legit. hope this helps, David ________________________________ From: Joe Barfett To: scipy-user at scipy.org Sent: Tue, 12 January, 2010 1:18:07 PM Subject: [SciPy-User] ifft on images, symmetry artifacts? Hello, I'm using scipy (numpy.fft.fft2) to transform an image into the frequency domain. Then by using numpy.fft.ifft2 to transform the same image back into the spatial domain, I find that I get symmetry in the image around a reflection line (and not the original image). Google has revealed websites like this one: http://www.rzuser.uni-heidelberg.de/~ge6/Programing/convolution.html This is the code snippet they use: def Convolution(image1,image2): """ Simple convolution example """ fftimage = fft2(image1)*fft2(image2) return ifft2(fftimage).real #end of Convolution which uses ifft but generates appropriate output. They do however only use the real component of the frequency domain image. I find the exact same approach does not work in my case, but rather gives these weird symmetries. It's been a few weeks of hacking and I would really appreciate the guidance of someone more experienced than me. Thanks a great deal if you know the answer! joe ________________________________ The new Internet Explorer? 8 - Faster, safer, easier. Optimized for Yahoo! Get it Now for Free! -------------- next part -------------- An HTML attachment was scrubbed... URL: From koepsell at gmail.com Mon Jan 11 21:02:53 2010 From: koepsell at gmail.com (Kilian Koepsell) Date: Mon, 11 Jan 2010 18:02:53 -0800 Subject: [SciPy-User] [SciPy-user] Maximum entropy distribution for Ising model - setup? In-Reply-To: References: Message-ID: <32EE5911-8078-4F1B-887E-00FBE702057E@gmail.com> Jordi, > On Jan 7, 2010, at 1:09 AM, Jordi Molins Coronado wrote: >> >> Hello, I am new to this forum. 
I am looking for a numerical >> solution to the inverse problem of an Ising model (or a model not- >> unlike the Ising model, see below). I have seen an old discussion, >> but very interesting, about this subject on this forum (http://mail.scipy.org/pipermail/scipy-user/2006-October/009703.html >> ). >> You might want to check out a recent method developed in our group, called "Minimum Probability Flow Learning" that allows very fast parameter estimation of basically any distribution -- including the Ising model. A 100 unit ising model can be fitted within about 1 minute (see Fig. 3). The paper is here: http://arxiv.org/abs/0906.4779 Kilian -- Kilian Koepsell, PhD Redwood Center for Theoretical Neuroscience Helen Wills Neuroscience Institute, UC Berkeley 156 Stanley Hall, MC# 3220 , Berkeley, CA 94720 From Jim.Vickroy at noaa.gov Tue Jan 12 10:27:53 2010 From: Jim.Vickroy at noaa.gov (Jim Vickroy) Date: Tue, 12 Jan 2010 08:27:53 -0700 Subject: [SciPy-User] Trying to use PIL and numpy In-Reply-To: <20100111172254.692ed02a@ajackson.org> References: <20100111172254.692ed02a@ajackson.org> Message-ID: <4B4C94F9.9060207@noaa.gov> alan at ajackson.org wrote: > I'm having some issues trying to use PIL and numpy (for the first time). > It's probably something simple, it usually is. > > When I run the following, the output is all buggered up. It looks like > the array indicies got switched about somewhere. > > import Image > im = Image.open('test.ppm') > im2 = im.convert(mode='F') > > a = np.asarray(im2) > imback2 = Image.fromarray(a) > > imback = imback2.convert(mode='RGB') > imback.save('testout.png') > > I tried removing bits, and it is the asarray -> fromarray sequence that > messes stuff up. > > I'm running Karmic Koala with > Python 2.6.4 (r264:75706, Dec 7 2009, 18:45:15) > numpy 1.3.0 > Image 1.1.6 > > > I believe there is a logic error in the PIL 1.1.6 fromarray() procedure (see http://mail.scipy.org/pipermail/numpy-discussion/2006-December/024903.html) that may be relevant. Try explicitly specifying the mode parameter in the fromarray(...) call. -- jv -------------- next part -------------- An HTML attachment was scrubbed... URL: From peter.shepard at gmail.com Tue Jan 12 10:50:10 2010 From: peter.shepard at gmail.com (Pete Shepard) Date: Tue, 12 Jan 2010 07:50:10 -0800 Subject: [SciPy-User] dendogram axis display Message-ID: <5c2c43621001120750n7e064488n5dbbff1bd0b46a6c@mail.gmail.com> Hello, I am making a dendogram of clusters using "hcluster.py", the x-axis contains the distances between each cluster. I would like for the y-axis to also display the distances between the clusters, is this possible? Also, can the scale of the graph be controlled, eg display clusters that are separated by distances of <100? Thanks -------------- next part -------------- An HTML attachment was scrubbed... URL: From Dharhas.Pothina at twdb.state.tx.us Tue Jan 12 13:39:13 2010 From: Dharhas.Pothina at twdb.state.tx.us (Dharhas Pothina) Date: Tue, 12 Jan 2010 12:39:13 -0600 Subject: [SciPy-User] Masking multiple fields in a structured timeseriesobject. Message-ID: <4B4C6D710200009B00026515@GWWEB.twdb.state.tx.us> Sorry I'm still having trouble figuring out how to do multiple masking on a limited date range rather than the entire series. 
For a simpler example, look at the below ts construct:

>>> ndtype=[('name','|S3'),('v1',float),('v2',float)]
>>> series=ts.time_series([("ABBC",1.1,10.),("ABD",2.2,20.),("ABBE",3.3,30),("ABBF",4.4,40),("ABG",5.5,50),("ABH",6.6,60)],dtype=ndtype, start_date=ts.now('D'))
>>> sdate = series.dates[1]
>>> edate = series.dates[4]

now I want to mask the v1 value between sdate and edate for entries that contain 'BB' in the name and have v1<4 and v2>10. ie the 3rd element ("ABBE",3.3,30) would become ("ABBE",--,30)

thanks,

- dharhas

From pgmdevlist at gmail.com Tue Jan 12 15:10:48 2010
From: pgmdevlist at gmail.com (Pierre GM)
Date: Tue, 12 Jan 2010 15:10:48 -0500
Subject: [SciPy-User] Masking multiple fields in a structured timeseriesobject.
In-Reply-To: <4B4C6D710200009B00026515@GWWEB.twdb.state.tx.us>
References: <4B4C6D710200009B00026515@GWWEB.twdb.state.tx.us>
Message-ID:

On Jan 12, 2010, at 1:39 PM, Dharhas Pothina wrote:
> Sorry I'm still having trouble figuring out how to do multiple masking on a limited date range rather than the entire series. For a simpler example, look at the below ts construct:
>
>>>> ndtype=[('name','|S3'),('v1',float),('v2',float)]
>>>> series=ts.time_series([("ABBC",1.1,10.),("ABD",2.2,20.),("ABBE",3.3,30),("ABBF",4.4,40),("ABG",5.5,50),("ABH",6.6,60)],dtype=ndtype, start_date=ts.now('D'))
>>>> sdate = series.dates[1]
>>>> edate = series.dates[4]
>
> now I want to mask the v1 value between sdate and edate for entries that contain 'BB' in the name and have v1<4 and v2>10. ie the 3rd element ("ABBE",3.3,30) would become ("ABBE",--,30)

Well, if I do your job for you, where's the fun ;) ? Seriously, why don't you build several masks and combine them as you want ?
* Make a mask M1 for the 'BB' in name (use an approach similar to the one I posted last time)
* Make a mask M2 that tests the values:
>>> M2=(_series['v1']<4)&(_series['v2']>10)
* Make a mask M3 that tests for the dates:
>>> M3=(series.dates>=sdate)&(series.dates<=edate)
* Combine the masks:
>>> Mall=np.array(M1&M2&M3, dtype=bool)
(we need to make sure that Mall is a boolean ndarray, and not an array of 0 and 1, else we mess up fancy indexing)
* Mask 'v1' according to the new mask:
>>> series['v1'][Mall]=ma.masked

Notes:
* you use 3 characters for name, but try to put strings with 4 characters. Expect problems.
* When you build the masks, use series.series as much as you can (that'll save you some time)

From Dharhas.Pothina at twdb.state.tx.us Tue Jan 12 15:33:47 2010
From: Dharhas.Pothina at twdb.state.tx.us (Dharhas Pothina)
Date: Tue, 12 Jan 2010 14:33:47 -0600
Subject: [SciPy-User] Masking multiple fields in a structuredtimeseriesobject.
Message-ID: <4B4C884B0200009B0002653A@GWWEB.twdb.state.tx.us>

Thank you, I finally got it. I guess I had difficulty in conceptually treating the series and dates separately. I kept trying to apply the masks using 'series[start:end]' and ended up with my indices mismatching.
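
For the archive, here's roughly what I ended up with on the toy series above (a sketch; M1/M2/M3 are just my labels from Pierre's outline, and I haven't run this against the real data yet):

import numpy as np
import numpy.ma as ma

_series = series.series  # plain masked array, no dates to carry around
M1 = np.array(['BB' in name for name in _series['name']])
M2 = (_series['v1'] < 4) & (_series['v2'] > 10)
M3 = (series.dates >= sdate) & (series.dates <= edate)
Mall = np.array(M1 & M2 & M3, dtype=bool)  # force a bool ndarray for fancy indexing
series['v1'][Mall] = ma.masked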
For a simpler example, look at the below ts construct: > >>>> ndtype=[('name','|S3'),('v1',float),('v2',float)] >>>> series=ts.time_series([("ABBC",1.1,10.),("ABD",2.2,20.),("ABBE",3.3,30),("ABBF",4.4,40),("ABG",5.5,50),("ABH",6.6,60)],dtype=ndtype, start_date=ts.now('D')) >>>> sdate = series.dates[1] >>>> edate = series.dates[4] > > Now I want to mask the v1 value between sdate and edate for rows that contain 'BB' in the name and have v1<4 and v2>10, i.e. the 3rd element ("ABBE",3.3,30) would become ("ABBE",--,30). Well, if I do your job for you, where's the fun ;) ? Seriously, why don't you build several masks and combine them as you want? * Make a mask M1 for the 'BB' in name (use an approach similar to the one I posted last time) * Make a mask M2 that tests the values: >>> M2=(_series['v1']<4)&(_series['v2']>10) * Make a mask M3 that tests for the dates: >>> M3=(series.dates>=sdate)&(series.dates<=edate) * Combine the three masks: >>> Mall=np.array(M1&M2&M3, dtype=bool) (we need to make sure that Mall is a boolean ndarray, and not an array of 0s and 1s, else we mess up fancy indexing) * Mask 'v1' according to the new mask: >>> series['v1'][Mall]=ma.masked Notes: * you use 3 characters for name, but try to put in strings with 4 characters. Expect problems. * When you build the masks, use series.series as much as you can (that'll save you some time) _______________________________________________ SciPy-User mailing list SciPy-User at scipy.org http://mail.scipy.org/mailman/listinfo/scipy-user From pgmdevlist at gmail.com Tue Jan 12 15:56:30 2010 From: pgmdevlist at gmail.com (Pierre GM) Date: Tue, 12 Jan 2010 15:56:30 -0500 Subject: [SciPy-User] Masking multiple fields in a structuredtimeseriesobject. In-Reply-To: <4B4C884B0200009B0002653A@GWWEB.twdb.state.tx.us> References: <4B4C884B0200009B0002653A@GWWEB.twdb.state.tx.us> Message-ID: <7720AAA9-CAA2-454D-A62B-63EFAEDF7A81@gmail.com> On Jan 12, 2010, at 3:33 PM, Dharhas Pothina wrote: > Thank you, I finally got it. I guess I had difficulty in conceptually treating the series and dates separately. I kept trying to apply the masks using 'series[start:end]' and ended up with my indices mismatching. > > On a related note, is there any way to do the following without using a loop? > > _series['name'][1:3] == 'BB' > > right now this gives me the 1st and 2nd entries in _series['name'] rather than the 1st and 2nd characters of all entries in _series['name'] _series['name'] is a 1D array w/ dtype '|S3'. What you'd want is to transform it into a 2D array of '|S1'. You could try to look at chararray, but I'm not sure it'll help you. I'm afraid you're gonna have to stick w/ the for loop. You may get it inlined, though: ['BB' in _ for _ in _series['name']] > thanks. From Dharhas.Pothina at twdb.state.tx.us Tue Jan 12 16:01:47 2010 From: Dharhas.Pothina at twdb.state.tx.us (Dharhas Pothina) Date: Tue, 12 Jan 2010 15:01:47 -0600 Subject: [SciPy-User] Masking multiple fields in astructuredtimeseriesobject. In-Reply-To: <7720AAA9-CAA2-454D-A62B-63EFAEDF7A81@gmail.com> References: <4B4C884B0200009B0002653A@GWWEB.twdb.state.tx.us> <7720AAA9-CAA2-454D-A62B-63EFAEDF7A81@gmail.com> Message-ID: <4B4C8EDA.63BA.009B.0@twdb.state.tx.us> _series['name'] is a 1D array w/ dtype '|S3'. What you'd want is to transform it into a 2D array of '|S1'. You could try to look at chararray, but I'm not sure it'll help you. I'm afraid you're gonna have to stick w/ the for loop. You may get it inlined, though: ['BB' in _ for _ in _series['name']] thanks. Would the inline version be any faster or is it pretty much equivalent to an ordinary loop?
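For what it's worth, numpy's vectorized string routines offer an alternative to the Python-level loop. A minimal sketch (requires a reasonably recent numpy; the names array below is hypothetical, not the series from this thread):

import numpy as np

names = np.array(['ABBC', 'ABD', 'ABBE', 'ABG'])
# element-wise substring search; np.char.find returns -1 where 'BB' is absent
mask = np.char.find(names, 'BB') >= 0
# mask is now array([ True, False,  True, False], dtype=bool)

np.char pushes the loop into compiled code, which mostly matters for large arrays.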
- dharhas From pgmdevlist at gmail.com Tue Jan 12 16:06:10 2010 From: pgmdevlist at gmail.com (Pierre GM) Date: Tue, 12 Jan 2010 16:06:10 -0500 Subject: [SciPy-User] Masking multiple fields in astructuredtimeseriesobject. In-Reply-To: <4B4C8EDA.63BA.009B.0@twdb.state.tx.us> References: <4B4C884B0200009B0002653A@GWWEB.twdb.state.tx.us> <7720AAA9-CAA2-454D-A62B-63EFAEDF7A81@gmail.com> <4B4C8EDA.63BA.009B.0@twdb.state.tx.us> Message-ID: On Jan 12, 2010, at 4:01 PM, Dharhas Pothina wrote: > > > > _series['name'] is a 1D array w/ dtype '|S3'. What you'd want is to transform it into a 2D array of '|S1'. You could try to look chararray, but I'm not sure it'll help you. I'm afraid you gonna have to stick w/ the for loop. You may get it inlined, though: > ['BB' in _ for _ in _series['name']] > > thanks. Would the inline version be any faster or is it pretty much equivalent to an ordinary loop? I think inlined loops are a tad faster than the regular ones (they get optimized by the interpreter, if I understand correctly). Not 100% sure, though. From emmanuelle.gouillart at normalesup.org Tue Jan 12 17:02:51 2010 From: emmanuelle.gouillart at normalesup.org (Emmanuelle Gouillart) Date: Tue, 12 Jan 2010 23:02:51 +0100 Subject: [SciPy-User] is it worth working on ndimage documentation? Message-ID: <20100112220251.GC7417@phare.normalesup.org> Hello, as I'm using quite frequently some functions in scipy.ndimage (mostly mathematical morphology operations), I was considering working on their docstrings on the doc wiki. Docstrings indeed don't conform to the documentation standard and are often quite terse. However, I would like to know beforehand whether ndimage has a future in scipy, or whether if will be replaced at some point by the scikit image? So, it is worth improving the docstrings in ndimage? Cheers, Emmanuelle From dwf at cs.toronto.edu Tue Jan 12 17:25:41 2010 From: dwf at cs.toronto.edu (David Warde-Farley) Date: Tue, 12 Jan 2010 17:25:41 -0500 Subject: [SciPy-User] is it worth working on ndimage documentation? In-Reply-To: <20100112220251.GC7417@phare.normalesup.org> References: <20100112220251.GC7417@phare.normalesup.org> Message-ID: <26A3BF94-13B2-464E-8133-65BF5EE9F98A@cs.toronto.edu> Hi Emmanuelle, I think it certainly does. If scikits.image does ever supersede ndimage (and I don't think it will - scikits.image is mainly focused on 2D images whereas I think ndimage is used for lots of 3D and 4D voxel images too?), it will likely take on functions from ndimage as well... in fact I think there is a ticket somewhere that contains parts of ndimage rewritten in Cython by the CellProfiler people (I don't have time to dig through my email to find it). Needless to say I think there is enough current use of ndimage that it's not going anywhere any time soon. David On 12-Jan-10, at 5:02 PM, Emmanuelle Gouillart wrote: > Hello, > > as I'm using quite frequently some functions in scipy.ndimage > (mostly mathematical morphology operations), I was considering > working on > their docstrings on the doc wiki. Docstrings indeed don't conform to > the > documentation standard and are often quite terse. > > However, I would like to know beforehand whether ndimage has a > future in scipy, or whether if will be replaced at some point by the > scikit image? So, it is worth improving the docstrings in ndimage? 
> > Cheers, > > Emmanuelle > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From cycomanic at gmail.com Tue Jan 12 17:54:09 2010 From: cycomanic at gmail.com (Jochen Schroeder) Date: Wed, 13 Jan 2010 09:54:09 +1100 Subject: [SciPy-User] ifft on images, symmetry artifacts? In-Reply-To: <440587.18866.qm@web59411.mail.ac4.yahoo.com> References: <440587.18866.qm@web59411.mail.ac4.yahoo.com> Message-ID: <20100112225407.GA2238@cudos0803> On 01/11/10 16:18, Joe Barfett wrote: > Hello, > I'm using scipy (numpy.fft.fft2) to transform an image into the frequency > domain. Then by using numpy.fft.ifft2 to transform the same image back into the > spatial domain, I find that I get symmetry in the image around a reflection > line (and not the original image). I'm struggling a bit to understand what exactly you're doing. In general you have to be careful when you plot your resulting function, i.e. do you want to plot the real part or the absolute value of the image? Anyway can you maybe post your code and the image you're converting? can sometimes lead to weird symmetry artifacts, e.g if you > Google has revealed websites like this one: http://www.rzuser.uni-heidelberg.de > /~ge6/Programing/convolution.html > This is the code snippet they use: > > def Convolution(image1,image2): > """ Simple convolution example """ > fftimage = fft2(image1)*fft2(image2) > return ifft2(fftimage).real > #end of Convolution > > which uses ifft but generates appropriate output. They do however only use the > real component of the frequency domain image. I find the exact same approach > does not work in my case, but rather gives these weird symmetries. > It's been a few weeks of hacking and I would really appreciate the guidance of > someone more experienced than me. Thanks a great deal if you know the answer! > joe > > > ??????????????????????????????????????????????????????????????????????????????? > The new Internet Explorer 8 - Faster, safer, easier. Optimized for Yahoo! Get > it Now for Free! > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From emmanuelle.gouillart at normalesup.org Wed Jan 13 02:56:19 2010 From: emmanuelle.gouillart at normalesup.org (Emmanuelle Gouillart) Date: Wed, 13 Jan 2010 08:56:19 +0100 Subject: [SciPy-User] is it worth working on ndimage documentation? In-Reply-To: <26A3BF94-13B2-464E-8133-65BF5EE9F98A@cs.toronto.edu> References: <20100112220251.GC7417@phare.normalesup.org> <26A3BF94-13B2-464E-8133-65BF5EE9F98A@cs.toronto.edu> Message-ID: <20100113075619.GA6894@phare.normalesup.org> Thanks for your answer, David! Emmanuelle On Tue, Jan 12, 2010 at 05:25:41PM -0500, David Warde-Farley wrote: > Hi Emmanuelle, > I think it certainly does. If scikits.image does ever supersede > ndimage (and I don't think it will - scikits.image is mainly focused > on 2D images whereas I think ndimage is used for lots of 3D and 4D > voxel images too?), it will likely take on functions from ndimage as > well... in fact I think there is a ticket somewhere that contains > parts of ndimage rewritten in Cython by the CellProfiler people (I > don't have time to dig through my email to find it). > Needless to say I think there is enough current use of ndimage that > it's not going anywhere any time soon. 
> David > On 12-Jan-10, at 5:02 PM, Emmanuelle Gouillart wrote: > > Hello, > > as I'm using quite frequently some functions in scipy.ndimage > > (mostly mathematical morphology operations), I was considering > > working on > > their docstrings on the doc wiki. Docstrings indeed don't conform to > > the > > documentation standard and are often quite terse. > > However, I would like to know beforehand whether ndimage has a > > future in scipy, or whether if will be replaced at some point by the > > scikit image? So, it is worth improving the docstrings in ndimage? > > Cheers, > > Emmanuelle > > _______________________________________________ > > SciPy-User mailing list > > SciPy-User at scipy.org > > http://mail.scipy.org/mailman/listinfo/scipy-user > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From jordi_molins at hotmail.com Wed Jan 13 03:42:07 2010 From: jordi_molins at hotmail.com (Jordi Molins Coronado) Date: Wed, 13 Jan 2010 09:42:07 +0100 Subject: [SciPy-User] [SciPy-user] Maximum entropy distribution for Ising model - setup? In-Reply-To: <32EE5911-8078-4F1B-887E-00FBE702057E@gmail.com> References: , <32EE5911-8078-4F1B-887E-00FBE702057E@gmail.com> Message-ID: Hello, I find all the ideas posted in reply to my message very interesting, thank you very much to all who have answered to my question. Especially, I would like to know more about Kilian's and Robin's suggestions. In particular, I find difficult to understand and translate the ideas posted by them into my background. Of course, this is not Kilian's or Robin's fault, but my complete fault due to lack of knowledge. To Robin: - Is there a paper covering your package, but explained in layman's terms, not requiring previous knowledge on the subject? Or maybe a simple but fully-worked example (ideally closely related to the Ising model) that can be used in your package to see how everything works. To Kilian: - Do you have a computer package that covers that computations in your paper? Or do you have the Ising code available to distribution? I would be very interested to know more about the Ising implementation of your paper. Kind regards Jordi > CC: jordi_molins at hotmail.com > From: koepsell at gmail.com > To: scipy-user at scipy.org > Subject: Re: [SciPy-User] [SciPy-user] Maximum entropy distribution for Ising model - setup? > Date: Mon, 11 Jan 2010 18:02:53 -0800 > > Jordi, > > > On Jan 7, 2010, at 1:09 AM, Jordi Molins Coronado wrote: > >> > >> Hello, I am new to this forum. I am looking for a numerical > >> solution to the inverse problem of an Ising model (or a model not- > >> unlike the Ising model, see below). I have seen an old discussion, > >> but very interesting, about this subject on this forum (http://mail.scipy.org/pipermail/scipy-user/2006-October/009703.html > >> ). > >> > > You might want to check out a recent method developed in our group, > called "Minimum Probability Flow Learning" that allows very fast > parameter > estimation of basically any distribution -- including the Ising model. > A 100 unit ising model can be fitted within about 1 minute (see Fig. 3). > The paper is here: http://arxiv.org/abs/0906.4779 > > Kilian > > -- > Kilian Koepsell, PhD > Redwood Center for Theoretical Neuroscience > Helen Wills Neuroscience Institute, UC Berkeley > 156 Stanley Hall, MC# 3220 , Berkeley, CA 94720 > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From gorkypl at gmail.com Wed Jan 13 18:15:59 2010 From: gorkypl at gmail.com (=?UTF-8?Q?Pawe=C5=82_Rumian?=) Date: Thu, 14 Jan 2010 00:15:59 +0100 Subject: [SciPy-User] scikits.timeseries or matplotlib plotting problem? Message-ID: <5158a0651001131515r3996331eue85ac3164987e5f0@mail.gmail.com> hello, I'm doing some research on climate data, using Python with NumPy. Yesterday I started implementing the scikits.timeseries package in my work, which occured to be almost perfect idea, but recently I ran into a problem with data visualisation. After some (not many) tests it seems that something weird happens when there is a gap in the data - the drawing is stopped there. To be more clear - after compiling the first example from the page: http://pytseries.sourceforge.net/lib.plotting.examples.html the result is: http://img191.imageshack.us/img191/5506/testg.png So it looks like the plotting was somehow 'stopped' after the first occurence of a hole in the data. As you can see the horizontal scale is correct (it's the same as on the webpage), but the one along the y-axis seems to be aligned to fit the broken plot. The other two examples (with consistent datasets) are plotted without a problem. Do you have any idea what could be the reason of this? What settings/packages should I check? greetings, Pawe? Rumian From pgmdevlist at gmail.com Wed Jan 13 19:00:51 2010 From: pgmdevlist at gmail.com (Pierre GM) Date: Wed, 13 Jan 2010 19:00:51 -0500 Subject: [SciPy-User] scikits.timeseries or matplotlib plotting problem? In-Reply-To: <5158a0651001131515r3996331eue85ac3164987e5f0@mail.gmail.com> References: <5158a0651001131515r3996331eue85ac3164987e5f0@mail.gmail.com> Message-ID: <67D67277-03EF-43DF-8FFD-2D42C727544E@gmail.com> On Jan 13, 2010, at 6:15 PM, Pawe? Rumian wrote: > hello, > > I'm doing some research on climate data, using Python with NumPy. Cool ! You can also check scikits.hydroclimpy, a set of extensions to scikits.timeseries with focus on climate analysis. > Yesterday I started implementing the scikits.timeseries package in my > work, which occured to be almost perfect idea, but recently I ran into > a problem with data visualisation. > > After some (not many) tests it seems that something weird happens when > there is a gap in the data - the drawing is stopped there. Ah. You're using the same data, right ? > > > The other two examples (with consistent datasets) are plotted without a problem. > > Do you have any idea what could be the reason of this? Might be a bug recently introduced. Let me check and get back to you. Note that this should not deter you from using scikits.timeseries. You can always plot your data using the regular matplotlib options (using your dates as x and your series as y). Lemme know if you need more help or if the doc is lacking on some aspects. From davide_fiocco at yahoo.it Wed Jan 13 19:47:44 2010 From: davide_fiocco at yahoo.it (davide_fiocco at yahoo.it) Date: Wed, 13 Jan 2010 16:47:44 -0800 (PST) Subject: [SciPy-User] Get array of separation vectors from an array a vectors Message-ID: <31b649b6-b1c0-4905-aa5a-e49e25e0fc62@z41g2000yqz.googlegroups.com> Hi folks, I'm new to Python and I'm trying to implement a basic molecular dynamics code. The problem I have is the following: Suppose you have an array of N vectors in R^3 like: A = [ [x1,y1,z1], [x2,y2,z2], ..., [xN,yN,zN] ] what I need is to get N!/(2! (N-2)!) separation vectors between the vectors in A, i.e. 
D = [ [x1-x2,y1-y2,z1-z2], [x1-x3,y1-y3,z1-z3], ..., [x2-x3,y2-y3,z2- z3], ..., [x_i-x_j,y_i-y_j,z_i-z_j],...] and I need the code to be FAST! Else I think I'll switch to a Fortran/ F2Py implementation. I'd say this task is not too different to what scipy.spatial.distance.pdist() does, with the difference that i don't need (the euclidean, say) distance but the differences between all the pairs of vectors in A. All suggestions will be very welcome, and I apologize if this is too trivial! Thank you. Davide From josef.pktd at gmail.com Wed Jan 13 20:34:11 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 13 Jan 2010 20:34:11 -0500 Subject: [SciPy-User] Get array of separation vectors from an array a vectors In-Reply-To: <31b649b6-b1c0-4905-aa5a-e49e25e0fc62@z41g2000yqz.googlegroups.com> References: <31b649b6-b1c0-4905-aa5a-e49e25e0fc62@z41g2000yqz.googlegroups.com> Message-ID: <1cd32cbb1001131734o1414e0b3r6dea85940ee8349@mail.gmail.com> On Wed, Jan 13, 2010 at 7:47 PM, davide_fiocco at yahoo.it wrote: > Hi folks, > I'm new to Python and I'm trying to implement a basic molecular > dynamics code. > > The problem I have is the following: > Suppose you have an array of N vectors in R^3 like: > A = [ [x1,y1,z1], [x2,y2,z2], ..., [xN,yN,zN] ] > > what I need is to get N!/(2! (N-2)!) separation vectors between the > vectors in A, i.e. > D = [ [x1-x2,y1-y2,z1-z2], [x1-x3,y1-y3,z1-z3], ..., [x2-x3,y2-y3,z2- > z3], ..., ?[x_i-x_j,y_i-y_j,z_i-z_j],...] > > and I need the code to be FAST! Else I think I'll switch to a Fortran/ > F2Py implementation. > > I'd say this task is not too different to what > scipy.spatial.distance.pdist() does, with the difference that i don't > need (the euclidean, say) distance but the differences between all the > pairs of vectors in A. > > All suggestions will be very welcome, and I apologize if this is too > trivial! Thank you. Something along the following, is the only thing I can come up with. Still requires intermediate arrays, and I thought I saw somewhere in numpy a function that creates the indices for a triu (but don't remember where) import numpy as np n = 5 #4 a = np.arange(n*3).reshape(n,3) print a #full ind0, ind1 = np.mgrid[0:n,0:n] ind0, ind1 = ind0.ravel(), ind1.ravel() d = a[ind1,:]-a[ind0,:] print d #reduced triuind0, triuind1 = np.nonzero(np.triu(np.ones((n,n)),k=1)) dr = a[triuind0,:]-a[triuind1,:] print dr ''' >>> import scipy >>> scipy.comb(4,2,exact=1) 6L >>> scipy.comb(5,2,exact=1) 10L ''' Warning quickly written and untested. >>> a array([[ 0, 1, 2], [ 3, 4, 5], [ 6, 7, 8], [ 9, 10, 11], [12, 13, 14]]) >>> dr array([[ -3, -3, -3], [ -6, -6, -6], [ -9, -9, -9], [-12, -12, -12], [ -3, -3, -3], [ -6, -6, -6], [ -9, -9, -9], [ -3, -3, -3], [ -6, -6, -6], [ -3, -3, -3]]) Josef > > Davide > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From j33433 at gmail.com Wed Jan 13 20:34:56 2010 From: j33433 at gmail.com (James) Date: Wed, 13 Jan 2010 20:34:56 -0500 Subject: [SciPy-User] scikits.timeseries or matplotlib plotting problem? In-Reply-To: <5158a0651001131515r3996331eue85ac3164987e5f0@mail.gmail.com> References: <5158a0651001131515r3996331eue85ac3164987e5f0@mail.gmail.com> Message-ID: This is purely a guess, but I wonder if quotes_historical_yahoo failed to fully fetch the quotes, then perhaps cached the bad data. On Wed, Jan 13, 2010 at 6:15 PM, Pawe? 
Rumian wrote: > hello, > > I'm doing some research on climate data, using Python with NumPy. > Yesterday I started implementing the scikits.timeseries package in my > work, which occured to be almost perfect idea, but recently I ran into > a problem with data visualisation. > > After some (not many) tests it seems that something weird happens when > there is a gap in the data - the drawing is stopped there. > > To be more clear - after compiling the first example from the page: > http://pytseries.sourceforge.net/lib.plotting.examples.html > the result is: > http://img191.imageshack.us/img191/5506/testg.png > > So it looks like the plotting was somehow 'stopped' after the first > occurence of a hole in the data. > > As you can see the horizontal scale is correct (it's the same as on > the webpage), but the one along the y-axis seems to be aligned to fit > the broken plot. > > The other two examples (with consistent datasets) are plotted without a problem. > > Do you have any idea what could be the reason of this? > What settings/packages should I check? > > greetings, > Pawe? Rumian > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From alan at ajackson.org Wed Jan 13 22:32:04 2010 From: alan at ajackson.org (alan at ajackson.org) Date: Wed, 13 Jan 2010 21:32:04 -0600 Subject: [SciPy-User] Trying to use PIL and numpy - SOLVED- In-Reply-To: <4B4C94F9.9060207@noaa.gov> References: <20100111172254.692ed02a@ajackson.org> <4B4C94F9.9060207@noaa.gov> Message-ID: <20100113213204.2057ab13@ajackson.org> >alan at ajackson.org wrote: >> I'm having some issues trying to use PIL and numpy (for the first time). >> It's probably something simple, it usually is. >> >> When I run the following, the output is all buggered up. It looks like >> the array indicies got switched about somewhere. >> >> import Image >> im = Image.open('test.ppm') >> im2 = im.convert(mode='F') >> >> a = np.asarray(im2) >> imback2 = Image.fromarray(a) >> >> imback = imback2.convert(mode='RGB') >> imback.save('testout.png') >> >> I tried removing bits, and it is the asarray -> fromarray sequence that >> messes stuff up. >> >> I'm running Karmic Koala with >> Python 2.6.4 (r264:75706, Dec 7 2009, 18:45:15) >> numpy 1.3.0 >> Image 1.1.6 >> >> >> >I believe there is a logic error in the PIL 1.1.6 fromarray() procedure >(see >http://mail.scipy.org/pipermail/numpy-discussion/2006-December/024903.html) >that may be relevant. >Try explicitly specifying the mode parameter in the fromarray(...) call. > -- jv Bingo! editing that line to imback2 = Image.fromarray(a, mode='F') fixes the problem. -- ----------------------------------------------------------------------- | Alan K. Jackson | To see a World in a Grain of Sand | | alan at ajackson.org | And a Heaven in a Wild Flower, | | www.ajackson.org | Hold Infinity in the palm of your hand | | Houston, Texas | And Eternity in an hour. - Blake | ----------------------------------------------------------------------- From gorkypl at gmail.com Thu Jan 14 03:15:19 2010 From: gorkypl at gmail.com (=?UTF-8?Q?Pawe=C5=82_Rumian?=) Date: Thu, 14 Jan 2010 09:15:19 +0100 Subject: [SciPy-User] scikits.timeseries or matplotlib plotting problem? 
In-Reply-To: <67D67277-03EF-43DF-8FFD-2D42C727544E@gmail.com> References: <5158a0651001131515r3996331eue85ac3164987e5f0@mail.gmail.com> <67D67277-03EF-43DF-8FFD-2D42C727544E@gmail.com> Message-ID: <5158a0651001140015j253df04y94879107bb902046@mail.gmail.com> > Cool ! You can also check scikits.hydroclimpy, a set of extensions to > scikits.timeseries with focus on climate analysis. I'm reading the docs right now - seems that I've reinvented the wheel sometimes... The good side is that I've started two weeks ago, so I didn't manage to waste too much time. > Might be a bug recently introduced. Let me check and get back to you. After more testing it seems to me more like a bug in matplotlib. It occurs only when plotting lines, using '-' or '--'. When I changed marks to '.', it worked. > Lemme know if you need more help or if the doc is lacking on some aspects. I will be playing with this stuff for at least a year, so I probably will :) Anyway - great job! greetings, Pawe? Rumian From qa at takb.net Thu Jan 14 04:23:28 2010 From: qa at takb.net (Torsten Andre) Date: Thu, 14 Jan 2010 10:23:28 +0100 Subject: [SciPy-User] Integration of double integral with integration variable as Message-ID: <4B4EE290.7070600@takb.net> Hey everyone, I am new to SciPy, but need to integrate something like this, where the boundaries of the inner integral are terms of outer variable's integration variable: \int{\int{sin(y)dy}_{-x}^{+x}dx}_0^1 Is this feasible in SciPy? I tried using quad but it only complains that x is not defined. Unfortunately I was unable to find anything on the list or in the documentation. Thanks for your time. Cheers, Torsten From ljmamoreira at gmail.com Thu Jan 14 08:21:04 2010 From: ljmamoreira at gmail.com (Jose Amoreira) Date: Thu, 14 Jan 2010 13:21:04 +0000 Subject: [SciPy-User] Integration of double integral with integration variable as In-Reply-To: <4B4EE290.7070600@takb.net> References: <4B4EE290.7070600@takb.net> Message-ID: <201001141321.04818.ljmamoreira@gmail.com> Torsten, Your example is easy! Since sin(y) is an odd function, integrating over [-x,x] gives zero and that's it. For a more illustrative example, replace sin with cos (still easy enough to do it quicker analytically). Using scipy.integrate.quad, you do it like this (excerpt from an idle session): >>> def g(x): return quad(cos,-x,x)[0] >>> quad(g,0.,1.)[0] 0.91939538826372047 The reason I take the zero-th element of the quad output is that the remaining is an estimate of the error. Maybe you should also look into scipy.integrate.dblquad. Hope this helps. jose On Thursday 14 January 2010 09:23:28 am Torsten Andre wrote: > Hey everyone, > > I am new to SciPy, but need to integrate something like this, where the > boundaries of the inner integral are terms of outer variable's > integration variable: > > \int{\int{sin(y)dy}_{-x}^{+x}dx}_0^1 > > Is this feasible in SciPy? I tried using quad but it only complains that > x is not defined. Unfortunately I was unable to find anything on the > list or in the documentation. > > Thanks for your time. > > Cheers, > Torsten > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From Dharhas.Pothina at twdb.state.tx.us Thu Jan 14 09:02:16 2010 From: Dharhas.Pothina at twdb.state.tx.us (Dharhas Pothina) Date: Thu, 14 Jan 2010 08:02:16 -0600 Subject: [SciPy-User] scikits.timeseries or matplotlib plotting problem? 
In-Reply-To: <5158a0651001140015j253df04y94879107bb902046@mail.gmail.com> References: <5158a0651001131515r3996331eue85ac3164987e5f0@mail.gmail.com> <67D67277-03EF-43DF-8FFD-2D42C727544E@gmail.com> <5158a0651001140015j253df04y94879107bb902046@mail.gmail.com> Message-ID: <4B4ECF88.63BA.009B.0@twdb.state.tx.us> I've had this problem before. From the email exchange I had on this list about a year ago, I basically worked out that any symbols work fine, i.e. dot, circle, diamond, etc. When you use line types like '-' or '--' and have missing or masked data in the timeseries, the plotting functions don't know what to do and just fail. From what I understand this is a matplotlib issue, and a workaround is to compress the array to remove the missing values before plotting. See: http://old.nabble.com/Re%3A-Still-having-plotting-issue-with-latest%09svnscikits.timeseries-ts20941722.html#a20944512 - dharhas >>> Paweł Rumian 1/14/2010 2:15 AM >>> > Cool ! You can also check scikits.hydroclimpy, a set of extensions to > scikits.timeseries with focus on climate analysis. I'm reading the docs right now - seems that I've reinvented the wheel sometimes... The good side is that I've started two weeks ago, so I didn't manage to waste too much time. > Might be a bug recently introduced. Let me check and get back to you. After more testing it seems to me more like a bug in matplotlib. It occurs only when plotting lines, using '-' or '--'. When I changed marks to '.', it worked. > Lemme know if you need more help or if the doc is lacking on some aspects. I will be playing with this stuff for at least a year, so I probably will :) Anyway - great job! greetings, Paweł Rumian _______________________________________________ SciPy-User mailing list SciPy-User at scipy.org http://mail.scipy.org/mailman/listinfo/scipy-user From gorkypl at gmail.com Thu Jan 14 09:15:18 2010 From: gorkypl at gmail.com (=?UTF-8?Q?Pawe=C5=82_Rumian?=) Date: Thu, 14 Jan 2010 15:15:18 +0100 Subject: [SciPy-User] scikits.timeseries or matplotlib plotting problem? In-Reply-To: <4B4ECF88.63BA.009B.0@twdb.state.tx.us> References: <5158a0651001131515r3996331eue85ac3164987e5f0@mail.gmail.com> <67D67277-03EF-43DF-8FFD-2D42C727544E@gmail.com> <5158a0651001140015j253df04y94879107bb902046@mail.gmail.com> <4B4ECF88.63BA.009B.0@twdb.state.tx.us> Message-ID: <5158a0651001140615o41a9028dse24ebae157c0c069@mail.gmail.com> 2010/1/14 Dharhas Pothina : > > I've had this problem before. From the email exchange I had on this > list about a year ago, I basically worked out that any symbols work fine, > i.e. dot, circle, diamond, etc. When you use line types like '-' or '--' > and have missing or masked data in the timeseries, the plotting > functions don't know what to do and just fail. From what I understand > this is a matplotlib issue, and a workaround is to compress the array to > remove the missing values before plotting. See: > > http://old.nabble.com/Re%3A-Still-having-plotting-issue-with-latest%09svnscikits.timeseries-ts20941722.html#a20944512 That's exactly the point! I've just written this to matplotlib-users http://old.nabble.com/line-drawing-bug-or-it's-me-doing-something-wrong--td27159104.html So thank you very much - I've already almost gone mad with this, now I can cool down :) greetings, Paweł From Dharhas.Pothina at twdb.state.tx.us Thu Jan 14 10:26:22 2010 From: Dharhas.Pothina at twdb.state.tx.us (Dharhas Pothina) Date: Thu, 14 Jan 2010 09:26:22 -0600 Subject: [SciPy-User] Timseries ts.tofile() Remove brackets.
Message-ID: <4B4EE33E0200009B0002667D@GWWEB.twdb.state.tx.us> Hi, I'm trying to format the output of ts.tofile() and I can't find anyway to suppress the use of open and close brackets on each line. ie using tseries.tofile(cleanfile,format='%Y,%m,%d,%H,%M,%S',separator=',') saves as : 1996,06,11,21,00,00,('JOB_20090812_CXT_MW9999.csv', 0, 13.199999999999999, 28.949999999999999) 1996,06,11,22,00,00,('JOB_20090812_CXT_MW9999.csv', 0, 13.199999999999999, 28.690000000000001) ... etc While what I want is: 1996,06,11,21,00,00,'JOB_20090812_CXT_MW9999.csv', 0, 13.199999999999999, 28.949999999999999 1996,06,11,22,00,00,'JOB_20090812_CXT_MW9999.csv', 0, 13.199999999999999, 28.690000000000001 ... etc anyway of doing this without reopening the file and removing the brackets. thanks - dharhas From gorkypl at gmail.com Thu Jan 14 13:48:04 2010 From: gorkypl at gmail.com (=?UTF-8?Q?Pawe=C5=82_Rumian?=) Date: Thu, 14 Jan 2010 19:48:04 +0100 Subject: [SciPy-User] scikits.timeseries or matplotlib plotting problem? In-Reply-To: <5158a0651001140615o41a9028dse24ebae157c0c069@mail.gmail.com> References: <5158a0651001131515r3996331eue85ac3164987e5f0@mail.gmail.com> <67D67277-03EF-43DF-8FFD-2D42C727544E@gmail.com> <5158a0651001140015j253df04y94879107bb902046@mail.gmail.com> <4B4ECF88.63BA.009B.0@twdb.state.tx.us> <5158a0651001140615o41a9028dse24ebae157c0c069@mail.gmail.com> Message-ID: <5158a0651001141048y40a01d91uec7dc0200af29068@mail.gmail.com> However not as good as I supposed... Someone in the (mentioned above) matplotlib-users group redirected me to another example: http://matplotlib.sourceforge.net/examples/pylab_examples/masked_demo.html and it doesn't work - the green line is not being drawn, until the line is changed to marks. So it looks like there is still something wrong with handling masked arrays by my instance of matplotlib... Anyway - it's probably not scikits related, but if someone would know any solution I'd be very thankful - I hope it wouldn't be considered a big offtopic... greetings, Pawe? From totalbull at mac.com Thu Jan 14 14:44:13 2010 From: totalbull at mac.com (totalbull at mac.com) Date: Thu, 14 Jan 2010 19:44:13 +0000 Subject: [SciPy-User] Seasonal adjustment in scipy/python? References: <60BE5A67-DB97-4B52-A281-AC67E82B3339@me.com> Message-ID: <891B0F5A-52D1-4086-BC65-E32749E7C6D7@mac.com> Hello, I am looking to seasonally adjust some data series in Python - specifically economics in emerging markets. As you can see on the charts (www.emconfidential.com) there is a lot of seasonality to monthly data series. Example 1 (retail sales) is obvious. Example 2, CPI, is somewhat less so, but there is still some seasonality here with price falls around January and fairly high prices in December. How would I go about seasonally adjusting this data using Python and Scipy? Any canned functions? Tom -------------- next part -------------- An HTML attachment was scrubbed... URL: From pgmdevlist at gmail.com Thu Jan 14 14:53:33 2010 From: pgmdevlist at gmail.com (Pierre GM) Date: Thu, 14 Jan 2010 14:53:33 -0500 Subject: [SciPy-User] Timseries ts.tofile() Remove brackets. In-Reply-To: <4B4EE33E0200009B0002667D@GWWEB.twdb.state.tx.us> References: <4B4EE33E0200009B0002667D@GWWEB.twdb.state.tx.us> Message-ID: <5585199C-A83E-4E4D-B3D3-29B8DDA62BBC@gmail.com> On Jan 14, 2010, at 10:26 AM, Dharhas Pothina wrote: > Hi, > > I'm trying to format the output of ts.tofile() and I can't find anyway to suppress the use of open and close brackets on each line. ie using > ... 
> anyway of doing this without reopening the file and removing the brackets. Fixing the code :) Could you file a ticket? Thanks a lot in advance. But here's a workaround: >>> _tmp=ts.time_series([('AAA',1,1.),('BBB',1,2.)],dtype=[('a','|S3'),('b',int),('c',float)],start_date=ts.now('D')) >>> [tuple([d]+list(s)) for (d,s) in zip(_tmp.dates,_tmp.series)] [(<D : 14-Jan-2010>, 'AAA', 1, 1.0), (<D : 15-Jan-2010>, 'BBB', 1, 2.0)] From josef.pktd at gmail.com Thu Jan 14 14:59:57 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 14 Jan 2010 14:59:57 -0500 Subject: [SciPy-User] Seasonal adjustment in scipy/python? In-Reply-To: <891B0F5A-52D1-4086-BC65-E32749E7C6D7@mac.com> References: <60BE5A67-DB97-4B52-A281-AC67E82B3339@me.com> <891B0F5A-52D1-4086-BC65-E32749E7C6D7@mac.com> Message-ID: <1cd32cbb1001141159o6b410655l26a44dc9a602f754@mail.gmail.com> On Thu, Jan 14, 2010 at 2:44 PM, wrote: > Hello, > > I am looking to seasonally adjust some data series in Python - specifically > economics in emerging markets. As you can see on the charts > (www.emconfidential.com) there is a lot of seasonality to monthly data > series. Example 1 (retail sales) is obvious. Example 2, CPI, is somewhat > less so, but there is still some seasonality here with price falls around > January and fairly high prices in December. > > How would I go about seasonally adjusting this data using Python and Scipy? > Any canned functions? I haven't seen any canned functions; the simplest would be to use annual differences, or to estimate the monthly base level with a regression on month dummy variables and take the residual. From the graph, it doesn't look like assuming a functional form for the monthly base level (seasonal trend) would be useful. There should be more sophisticated ways of filtering, but nothing canned, and I don't think X11 is available in Python. It would also depend on how long your timeseries is and what you want to do with it. Josef > > Tom > > > > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > From pgmdevlist at gmail.com Thu Jan 14 15:03:25 2010 From: pgmdevlist at gmail.com (Pierre GM) Date: Thu, 14 Jan 2010 15:03:25 -0500 Subject: [SciPy-User] Seasonal adjustment in scipy/python? In-Reply-To: <891B0F5A-52D1-4086-BC65-E32749E7C6D7@mac.com> References: <60BE5A67-DB97-4B52-A281-AC67E82B3339@me.com> <891B0F5A-52D1-4086-BC65-E32749E7C6D7@mac.com> Message-ID: On Jan 14, 2010, at 2:44 PM, totalbull at mac.com wrote: > > Hello, > > I am looking to seasonally adjust some data series in Python - specifically economics in emerging markets. As you can see on the charts (www.emconfidential.com) there is a lot of seasonality to monthly data series. Example 1 (retail sales) is obvious. Example 2, CPI, is somewhat less so, but there is still some seasonality here with price falls around January and fairly high prices in December. > > How would I go about seasonally adjusting this data using Python and Scipy? Any canned functions? Have a look at scikits.timeseries; the package was designed to simplify the handling of a series. You can also check scikits.hydroclimpy, a derived package: there's a 'deseasonalize' function in the second package that makes it easy to compute seasonal anomalies and normalize them. You may not have to install the whole package, just check the source and copy the function.
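A minimal numpy-only sketch of the monthly-anomaly idea described above (subtract each calendar month's sample average). It assumes a monthly series whose length is a multiple of 12 and that starts in January; this is an illustration, not the scikits 'deseasonalize' function itself:

import numpy as np

# hypothetical ten years of monthly data with an additive seasonal pattern
x = np.random.randn(120) + np.tile(np.arange(12.), 10)
# average level of each calendar month across the years
monthly_mean = x.reshape(-1, 12).mean(axis=0)
# seasonally adjusted series: deviation from the month's average
adjusted = (x.reshape(-1, 12) - monthly_mean).ravel()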
pytseries.sourceforge.net hydroclimpy.sourceforge.net From aisaac at american.edu Thu Jan 14 15:10:32 2010 From: aisaac at american.edu (Alan G Isaac) Date: Thu, 14 Jan 2010 15:10:32 -0500 Subject: [SciPy-User] Seasonal adjustment in scipy/python? In-Reply-To: <891B0F5A-52D1-4086-BC65-E32749E7C6D7@mac.com> References: <60BE5A67-DB97-4B52-A281-AC67E82B3339@me.com> <891B0F5A-52D1-4086-BC65-E32749E7C6D7@mac.com> Message-ID: <4B4F7A38.50408@american.edu> On 1/14/2010 2:44 PM, totalbull at mac.com wrote: > I am looking to seasonally adjust some data series in Python - >>> help(np.diff) Help on function diff in module numpy.lib.function_base: diff(a, n=1, axis=-1) Calculate the nth order discrete difference along given axis. hth, Alan Isaac From josef.pktd at gmail.com Thu Jan 14 15:20:53 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 14 Jan 2010 15:20:53 -0500 Subject: [SciPy-User] Seasonal adjustment in scipy/python? In-Reply-To: <4B4F7A38.50408@american.edu> References: <60BE5A67-DB97-4B52-A281-AC67E82B3339@me.com> <891B0F5A-52D1-4086-BC65-E32749E7C6D7@mac.com> <4B4F7A38.50408@american.edu> Message-ID: <1cd32cbb1001141220g2b15d081nb3075e2e8f8ec319@mail.gmail.com> On Thu, Jan 14, 2010 at 3:10 PM, Alan G Isaac wrote: > On 1/14/2010 2:44 PM, totalbull at mac.com wrote: >> I am looking to seasonally adjust some data series in Python - > >>>> help(np.diff) > Help on function diff in module numpy.lib.function_base: > > diff(a, n=1, axis=-1) > Calculate the nth order discrete difference along given axis. diff doesn't work for seasonal adjustment: the order is (1-L)^n, not (1-L^n) (although the latter is possible after reshaping). BTW: the residual of the regression on month dummies is just the same as subtracting the sample average for that month. Josef > > hth, > Alan Isaac > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From Dharhas.Pothina at twdb.state.tx.us Thu Jan 14 15:21:51 2010 From: Dharhas.Pothina at twdb.state.tx.us (Dharhas Pothina) Date: Thu, 14 Jan 2010 14:21:51 -0600 Subject: [SciPy-User] Timseries ts.tofile() Remove brackets. In-Reply-To: <5585199C-A83E-4E4D-B3D3-29B8DDA62BBC@gmail.com> References: <4B4EE33E0200009B0002667D@GWWEB.twdb.state.tx.us> <5585199C-A83E-4E4D-B3D3-29B8DDA62BBC@gmail.com> Message-ID: <4B4F287F.63BA.009B.0@twdb.state.tx.us> Thanks, I created a ticket. - d >>> Pierre GM 1/14/2010 1:53 PM >>> On Jan 14, 2010, at 10:26 AM, Dharhas Pothina wrote: > Hi, > > I'm trying to format the output of ts.tofile() and I can't find anyway to suppress the use of open and close brackets on each line. ie using > ... > anyway of doing this without reopening the file and removing the brackets. Fixing the code :) Could you file a ticket? Thanks a lot in advance. But here's a workaround: >>> _tmp=ts.time_series([('AAA',1,1.),('BBB',1,2.)],dtype=[('a','|S3'),('b',int),('c',float)],start_date=ts.now('D')) >>> [tuple([d]+list(s)) for (d,s) in zip(_tmp.dates,_tmp.series)] [(<D : 14-Jan-2010>, 'AAA', 1, 1.0), (<D : 15-Jan-2010>, 'BBB', 1, 2.0)] _______________________________________________ SciPy-User mailing list SciPy-User at scipy.org http://mail.scipy.org/mailman/listinfo/scipy-user From pgmdevlist at gmail.com Thu Jan 14 15:27:06 2010 From: pgmdevlist at gmail.com (Pierre GM) Date: Thu, 14 Jan 2010 15:27:06 -0500 Subject: [SciPy-User] Timseries ts.tofile() Remove brackets.
In-Reply-To: <4B4F287F.63BA.009B.0@twdb.state.tx.us> References: <4B4EE33E0200009B0002667D@GWWEB.twdb.state.tx.us> <5585199C-A83E-4E4D-B3D3-29B8DDA62BBC@gmail.com> <4B4F287F.63BA.009B.0@twdb.state.tx.us> Message-ID: On Jan 14, 2010, at 3:21 PM, Dharhas Pothina wrote: > > Thanks, I created a ticket. Got it ! Thanks again for reporting From aisaac at american.edu Thu Jan 14 15:41:18 2010 From: aisaac at american.edu (Alan G Isaac) Date: Thu, 14 Jan 2010 15:41:18 -0500 Subject: [SciPy-User] Seasonal adjustment in scipy/python? In-Reply-To: <1cd32cbb1001141220g2b15d081nb3075e2e8f8ec319@mail.gmail.com> References: <60BE5A67-DB97-4B52-A281-AC67E82B3339@me.com> <891B0F5A-52D1-4086-BC65-E32749E7C6D7@mac.com> <4B4F7A38.50408@american.edu> <1cd32cbb1001141220g2b15d081nb3075e2e8f8ec319@mail.gmail.com> Message-ID: <4B4F816E.3060501@american.edu> On 1/14/2010 3:20 PM, josef.pktd at gmail.com wrote: > diff doesn't work for seasonal, order is (1-L)^n not (1-L^n) > (although possible after reshaping) Yep. Engaged fingers before brain... Alan From gokhansever at gmail.com Thu Jan 14 17:30:20 2010 From: gokhansever at gmail.com (=?UTF-8?Q?G=C3=B6khan_Sever?=) Date: Thu, 14 Jan 2010 16:30:20 -0600 Subject: [SciPy-User] Wording question regarding to distributions Message-ID: <49d6b3501001141430i6dd7155ah9ea1b181404d46fc@mail.gmail.com> Hello, What is the right way to express: Do we fit data to a distribution or distribution to data? Thanks. G?khan -------------- next part -------------- An HTML attachment was scrubbed... URL: From dwf at cs.toronto.edu Thu Jan 14 17:37:01 2010 From: dwf at cs.toronto.edu (David Warde-Farley) Date: Thu, 14 Jan 2010 17:37:01 -0500 Subject: [SciPy-User] Wording question regarding to distributions In-Reply-To: <49d6b3501001141430i6dd7155ah9ea1b181404d46fc@mail.gmail.com> References: <49d6b3501001141430i6dd7155ah9ea1b181404d46fc@mail.gmail.com> Message-ID: <49577BC2-5B2B-4E1B-8E8D-01920E2A1BD4@cs.toronto.edu> On 14-Jan-10, at 5:30 PM, G?khan Sever wrote: > Hello, > > What is the right way to express: > > Do we fit data to a distribution or distribution to data? > > Thanks. I would say the latter. Assuming we are talking about the same scenario, the data follow some unknown distribution, which you try to approximate with some parametric form using maximum likelihood estimators and such. So you are fitting a (particular) distribution (or more specifically, a model of the underlying process which *uses* that particular distribution) to observed data. My $0.02, David From josef.pktd at gmail.com Thu Jan 14 18:13:54 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 14 Jan 2010 18:13:54 -0500 Subject: [SciPy-User] Wording question regarding to distributions In-Reply-To: <49577BC2-5B2B-4E1B-8E8D-01920E2A1BD4@cs.toronto.edu> References: <49d6b3501001141430i6dd7155ah9ea1b181404d46fc@mail.gmail.com> <49577BC2-5B2B-4E1B-8E8D-01920E2A1BD4@cs.toronto.edu> Message-ID: <1cd32cbb1001141513t4599e9a3nde663feb58454aaa@mail.gmail.com> On Thu, Jan 14, 2010 at 5:37 PM, David Warde-Farley wrote: > > On 14-Jan-10, at 5:30 PM, G?khan Sever wrote: > >> Hello, >> >> What is the right way to express: >> >> Do we fit data to a distribution or distribution to data? >> >> Thanks. > > I would say the latter. Assuming we are talking about the same > scenario, the data follow some unknown distribution, which you try to > approximate with some parametric form using maximum likelihood > estimators and such. 
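For instance, a minimal sketch of such a fit with scipy.stats (the sample here is synthetic; the continuous distributions expose a .fit method that returns maximum-likelihood estimates of shape, location and scale):

import numpy as np
from scipy import stats

# synthetic stand-in for observed data
data = stats.lognorm.rvs(0.5, scale=np.exp(1.0), size=1000)
# maximum-likelihood fit; returns (shape, loc, scale)
shape, loc, scale = stats.lognorm.fit(data)
# the usual log-normal parameters (approximate when loc != 0)
mu, sigma = np.log(scale), shape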
So you are fitting a (particular) distribution > (or more specifically, a model of the underlying process which *uses* > that particular distribution) to observed data. I agree, unless you are "massaging" your data to fit the distribution to get nicer results. Josef > > My $0.02, > > David > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From d.l.goldsmith at gmail.com Thu Jan 14 18:48:14 2010 From: d.l.goldsmith at gmail.com (David Goldsmith) Date: Thu, 14 Jan 2010 15:48:14 -0800 Subject: [SciPy-User] Wording question regarding to distributions In-Reply-To: <1cd32cbb1001141513t4599e9a3nde663feb58454aaa@mail.gmail.com> References: <49d6b3501001141430i6dd7155ah9ea1b181404d46fc@mail.gmail.com> <49577BC2-5B2B-4E1B-8E8D-01920E2A1BD4@cs.toronto.edu> <1cd32cbb1001141513t4599e9a3nde663feb58454aaa@mail.gmail.com> Message-ID: <45d1ab481001141548ka4c2c94ka6f88b97c27cd6ee@mail.gmail.com> On Thu, Jan 14, 2010 at 3:13 PM, wrote: > On Thu, Jan 14, 2010 at 5:37 PM, David Warde-Farley wrote: >> >> On 14-Jan-10, at 5:30 PM, G?khan Sever wrote: >> >>> Hello, >>> >>> What is the right way to express: >>> >>> Do we fit data to a distribution or distribution to data? >>> >>> Thanks. >> >> I would say the latter. Assuming we are talking about the same > > I agree, unless you are "massaging" your data to fit the distribution > to get nicer results. > > Josef Exactly: you're only "fitting data to a distribution" if you're fiddling w/ the data to make it fit; otherwise, your "fitting the distribution to the data." My $2e6. DG From davide_fiocco at yahoo.it Thu Jan 14 19:26:33 2010 From: davide_fiocco at yahoo.it (davide_fiocco at yahoo.it) Date: Thu, 14 Jan 2010 16:26:33 -0800 (PST) Subject: [SciPy-User] Get array of separation vectors from an array of vectors and use it to compute the force in a MD code In-Reply-To: <1cd32cbb1001131734o1414e0b3r6dea85940ee8349@mail.gmail.com> References: <31b649b6-b1c0-4905-aa5a-e49e25e0fc62@z41g2000yqz.googlegroups.com> <1cd32cbb1001131734o1414e0b3r6dea85940ee8349@mail.gmail.com> Message-ID: <4645dbd7-c7f6-41be-88d8-fb99436f7f3a@r24g2000yqd.googlegroups.com> Thanks Josef! I post here the code i wrote to compute the matrix ff of the forces between all the pairs of particles in a given set interacting under the Lennard-Jones potential. Note that: - coordinates of the i-th particle is stored in self.txyz[i,1:]. - the returned matrix ff contains at f[i,j,:] the three components of the force due to the interaction between i and j. - the for loop is the way I used to rebuild a triangular matrix from its reduced representation I guess it can't be considered good code...and it'd be cool if someone could point out its major flaws! Thanks a lot again! 
Davide def get_forces(self): if self.pair_style == 'lj/cut': #Josef suggestion to get the reduced array of separation vectors R I, J = numpy.nonzero(numpy.triu(numpy.ones((self.natoms, self.natoms)), k=1)) R = self.atoms.txyz[I,1:] - self.atoms.txyz[J,1:] #invoking a vectorized function to apply the minimum image convention to the separation vectors R = minimum_image(R, self.boxes[-1].bounds) #compute the array of inverse distances S = 1/numpy.sqrt(numpy.add.reduce((R*R).transpose())) #in f I will store the information about the upper triangular part of the matrix of forces f = numpy.zeros((S.size, 3)) invcut = 1./2.5 #compute Lennard Jones force for distances below a given cutoff f[S > invcut, :] = (R[S > invcut,:])*((24.*(-2.*S[S > invcut]**13 + S[S > invcut]**7))*S[S > invcut]).reshape(-1,1) ff = numpy.zeros((self.natoms, self.natoms, 3)) #convert reduced array of forces into an antisymmetric matrix ff (f contains all the information about its triu) for i in range(self.natoms): ff[i,i+1:,:] = f[self.natoms*i - i*(i+1)/2:self.natoms*(i+1) - (i + 1)*(i + 2)/2,:] ff[i+1:,i,:] = -f[self.natoms*i - i*(i+1)/2:self.natoms*(i+1) - (i + 1)*(i + 2)/2,:] return ff #apply the minimum image convention def minimum_image_scalar(dx, box): dx = dx - int(round(dx/box))*box return dx minimum_image = numpy.vectorize(minimum_image_scalar) From josef.pktd at gmail.com Thu Jan 14 20:01:13 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 14 Jan 2010 20:01:13 -0500 Subject: [SciPy-User] Get array of separation vectors from an array of vectors and use it to compute the force in a MD code In-Reply-To: <4645dbd7-c7f6-41be-88d8-fb99436f7f3a@r24g2000yqd.googlegroups.com> References: <31b649b6-b1c0-4905-aa5a-e49e25e0fc62@z41g2000yqz.googlegroups.com> <1cd32cbb1001131734o1414e0b3r6dea85940ee8349@mail.gmail.com> <4645dbd7-c7f6-41be-88d8-fb99436f7f3a@r24g2000yqd.googlegroups.com> Message-ID: <1cd32cbb1001141701t21a8d6d3hcb6048c77336dc21@mail.gmail.com> On Thu, Jan 14, 2010 at 7:26 PM, davide_fiocco at yahoo.it wrote: > Thanks Josef! > I post here the code i wrote to compute the matrix ff of the forces > between all the pairs of particles in a given set interacting under > the Lennard-Jones potential. > Note that: > - coordinates of the i-th particle is stored in self.txyz[i,1:]. > - the returned matrix ff contains at f[i,j,:] the three components of > the force due to the interaction between i and j. > - the for loop is the way I used to rebuild a triangular matrix from > its reduced representation When you are rebuilding the triu, or the full symmetric distance matrix ff from the vectorized version then you can use again the intial triu indices I,J, and inplace add the transpose. might require a bit of thinking to get the 3rd axis right, but something like this: ff[I,J,:] = f # unless numpy switches axis ff += np.swapaxis(ff,2,1) # diagonal is zero so not duplicate to worry about You might want to try on a simple example, but I'm pretty sure something like this should work Josef > > I guess it can't be considered good code...and it'd be cool if someone > could point out its major flaws! > Thanks a lot again! > > Davide > > ? ? ? ?def get_forces(self): > ? ? ? ? ? ? ? ?if self.pair_style == 'lj/cut': > ? ? ? ? ? ? ? ? ? ? ? ?#Josef suggestion to get the reduced array of separation vectors R > ? ? ? ? ? ? ? ? ? ? ? ?I, J = numpy.nonzero(numpy.triu(numpy.ones((self.natoms, > self.natoms)), k=1)) > ? ? ? ? ? ? ? ? ? ? ? ?R = self.atoms.txyz[I,1:] - self.atoms.txyz[J,1:] > ? ? ? ? ? ? ? ? ? ? ? 
?#invoking a vectorized function to apply the > minimum image convention to the separation vectors > ? ? ? ? ? ? ? ? ? ? ? ?R = minimum_image(R, self.boxes[-1].bounds) > ? ? ? ? ? ? ? ? ? ? ? ?#compute the array of inverse distances > ? ? ? ? ? ? ? ? ? ? ? ?S = 1/numpy.sqrt(numpy.add.reduce((R*R).transpose())) isn't the transpose here just choosing the axis ? 1/numpy.sqrt(((R*R).sum(0))) it won't make much difference but I find it easier to read > ? ? ? ? ? ? ? ? ? ? ? ?#in f I will store the information about the upper triangular part > of the matrix of forces > ? ? ? ? ? ? ? ? ? ? ? ?f = numpy.zeros((S.size, 3)) > ? ? ? ? ? ? ? ? ? ? ? ?invcut = 1./2.5 > ? ? ? ? ? ? ? ? ? ? ? ?#compute Lennard Jones force for distances > below a given cutoff > ? ? ? ? ? ? ? ? ? ? ? ?f[S > invcut, :] = (R[S > invcut,:])*((24.*(-2.*S[S > invcut]**13 + > S[S > invcut]**7))*S[S > invcut]).reshape(-1,1) you might want to replace the repeated comparison with a temp variable: mask = S > invcut Josef > ? ? ? ? ? ? ? ? ? ? ? ?ff = numpy.zeros((self.natoms, self.natoms, 3)) > ? ? ? ? ? ? ? ? ? ? ? ?#convert reduced array of forces into an > antisymmetric matrix ff (f contains all the information about its > triu) > ? ? ? ? ? ? ? ? ? ? ? ?for i in range(self.natoms): > ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?ff[i,i+1:,:] = ?f[self.natoms*i - i*(i+1)/2:self.natoms*(i+1) - (i > + 1)*(i + 2)/2,:] > ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?ff[i+1:,i,:] = -f[self.natoms*i - i*(i+1)/2:self.natoms*(i+1) - (i > + 1)*(i + 2)/2,:] > > ? ? ? ? ? ? ? ? ? ? ? ?return ff > ? ? ? ?#apply the minimum image convention > ? ? ? ?def minimum_image_scalar(dx, box): > ? ? ? ? ? ? ? ?dx = dx - int(round(dx/box))*box > ? ? ? ? ? ? ? ?return dx > ? ? ? ?minimum_image = numpy.vectorize(minimum_image_scalar) > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From gokhansever at gmail.com Thu Jan 14 20:10:17 2010 From: gokhansever at gmail.com (=?UTF-8?Q?G=C3=B6khan_Sever?=) Date: Thu, 14 Jan 2010 19:10:17 -0600 Subject: [SciPy-User] Wording question regarding to distributions In-Reply-To: <49d6b3501001141430i6dd7155ah9ea1b181404d46fc@mail.gmail.com> References: <49d6b3501001141430i6dd7155ah9ea1b181404d46fc@mail.gmail.com> Message-ID: <49d6b3501001141710r4719d184tf826dfd31c28ba2@mail.gmail.com> On Thu, Jan 14, 2010 at 4:30 PM, G?khan Sever wrote: > Hello, > > What is the right way to express: > > Do we fit data to a distribution or distribution to data? > > Thanks. > > > > G?khan > Here is how the question arise in my mind. Previously, I had asked a question to fit a log-normal distribution on my data on this thread http://mail.scipy.org/pipermail/scipy-user/2009-November/023320.html Well the work is unfinished there, and I started to dig-in to the same subject again. 
For R, I have found a function that lets me estimate parameters from my binned data pair (i.e. bin sizes - measurements) to construct a log-normal fit:

http://www.exposurescience.org/heR.doc/library/heR.Misc/html/bin2lnorm.html

The description given for the function is in conflict with itself:

The title says: "Fit binned data to a log-normal distribution"

However, the description says otherwise:

"This function takes binned data and fits a lognormal model to it, using weighted least squares, and optionally plotting the fit and the data together"

I couldn't find a way to estimate log-normal parameters in Python (maybe I will need the same for the gamma distributions as well) given in the form that bin2lnorm expects (i.e. l - bin limits, and h - corresponding heights; measurements in my case); that is the reason I use that R function. Any alternative suggestions are welcome at this point.

Similarly, while studying my Cloud and Precipitation Parameterizations book today (distributions are extremely important in the bulk parameterization of clouds and cloud constituents/products, e.g. aerosols, cloud droplets, rain, hail etc.), I see a couple of figures (please see the book review at http://www.cambridge.org/catalogue/catalogue.asp?isbn=9780521883382&ss=exc and go to pg 9, Figure 1.2) with captions like: "gamma curves fit to data."

It's clearer now after reading your inputs. Thanks again.

--
Gökhan
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From josef.pktd at gmail.com Thu Jan 14 20:12:34 2010
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Thu, 14 Jan 2010 20:12:34 -0500
Subject: [SciPy-User] Get array of separation vectors from an array of vectors and use it to compute the force in a MD code
In-Reply-To: <1cd32cbb1001141701t21a8d6d3hcb6048c77336dc21@mail.gmail.com>
References: <31b649b6-b1c0-4905-aa5a-e49e25e0fc62@z41g2000yqz.googlegroups.com> <1cd32cbb1001131734o1414e0b3r6dea85940ee8349@mail.gmail.com> <4645dbd7-c7f6-41be-88d8-fb99436f7f3a@r24g2000yqd.googlegroups.com> <1cd32cbb1001141701t21a8d6d3hcb6048c77336dc21@mail.gmail.com>
Message-ID: <1cd32cbb1001141712q66f43e5bg7663ec4d274d9ef4@mail.gmail.com>

On Thu, Jan 14, 2010 at 8:01 PM, wrote:
> On Thu, Jan 14, 2010 at 7:26 PM, davide_fiocco at yahoo.it wrote:
>> Thanks Josef!
>> I post here the code I wrote to compute the matrix ff of the forces between all the pairs of particles in a given set interacting under the Lennard-Jones potential.
>> Note that:
>> - the coordinates of the i-th particle are stored in self.txyz[i,1:].
>> - the returned matrix ff contains at f[i,j,:] the three components of the force due to the interaction between i and j.
>> - the for loop is the way I used to rebuild a triangular matrix from its reduced representation
>
> When you are rebuilding the triu, or the full symmetric distance matrix ff from the vectorized version, then you can use again the initial triu indices I, J, and inplace add the transpose.
> It might require a bit of thinking to get the 3rd axis right, but something like this:
>
> ff[I,J,:] = f    # unless numpy switches axis
> ff += np.swapaxes(ff,2,1)  # diagonal is zero so there are no duplicates to worry about
>
> You might want to try it on a simple example, but I'm pretty sure something like this should work
>
> Josef
>
>>
>> I guess it can't be considered good code... and it'd be cool if someone could point out its major flaws!
>> Thanks a lot again!
>>
>> Davide
>>
>>        def get_forces(self):
>>                if self.pair_style == 'lj/cut':
>>                        #Josef's suggestion to get the reduced array of separation vectors R
>>                        I, J = numpy.nonzero(numpy.triu(numpy.ones((self.natoms, self.natoms)), k=1))
>>                        R = self.atoms.txyz[I,1:] - self.atoms.txyz[J,1:]
>>                        #invoke a vectorized function to apply the minimum image convention to the separation vectors
>>                        R = minimum_image(R, self.boxes[-1].bounds)
>>                        #compute the array of inverse distances
>>                        S = 1/numpy.sqrt(numpy.add.reduce((R*R).transpose()))
>
> isn't the transpose here just choosing the axis ? 1/numpy.sqrt(((R*R).sum(0)))
> it won't make much difference but I find it easier to read

Typo: I think it's axis=1 in the sum.

And thanks for posting, it's nice to see whether my answers are helpful or not.

Josef

>
>>                        #in f I will store the information about the upper triangular part of the matrix of forces
>>                        f = numpy.zeros((S.size, 3))
>>                        invcut = 1./2.5
>>                        #compute the Lennard-Jones force for distances below a given cutoff
>>                        f[S > invcut, :] = (R[S > invcut,:])*((24.*(-2.*S[S > invcut]**13 + S[S > invcut]**7))*S[S > invcut]).reshape(-1,1)
>
> you might want to replace the repeated comparison with a temp variable:  mask = S > invcut
>
> Josef
>
>>                        ff = numpy.zeros((self.natoms, self.natoms, 3))
>>                        #convert the reduced array of forces into an antisymmetric matrix ff (f contains all the information about its triu)
>>                        for i in range(self.natoms):
>>                                ff[i,i+1:,:] =  f[self.natoms*i - i*(i+1)/2:self.natoms*(i+1) - (i + 1)*(i + 2)/2,:]
>>                                ff[i+1:,i,:] = -f[self.natoms*i - i*(i+1)/2:self.natoms*(i+1) - (i + 1)*(i + 2)/2,:]
>>
>>                        return ff
>>        #apply the minimum image convention
>>        def minimum_image_scalar(dx, box):
>>                dx = dx - int(round(dx/box))*box
>>                return dx
>>        minimum_image = numpy.vectorize(minimum_image_scalar)
>> _______________________________________________
>> SciPy-User mailing list
>> SciPy-User at scipy.org
>> http://mail.scipy.org/mailman/listinfo/scipy-user
>>
>

From bsouthey at gmail.com Thu Jan 14 22:14:21 2010
From: bsouthey at gmail.com (Bruce Southey)
Date: Thu, 14 Jan 2010 21:14:21 -0600
Subject: [SciPy-User] Wording question regarding to distributions
In-Reply-To: <49d6b3501001141710r4719d184tf826dfd31c28ba2@mail.gmail.com>
References: <49d6b3501001141430i6dd7155ah9ea1b181404d46fc@mail.gmail.com> <49d6b3501001141710r4719d184tf826dfd31c28ba2@mail.gmail.com>
Message-ID:

On Thu, Jan 14, 2010 at 7:10 PM, Gökhan Sever wrote:
>
> On Thu, Jan 14, 2010 at 4:30 PM, Gökhan Sever wrote:
>>
>> Hello,
>>
>> What is the right way to express:
>>
>> Do we fit data to a distribution or a distribution to data?
>>
>> Thanks.
>>
>> Gökhan
>
> Here is how the question arose in my mind.
>
> Previously, I had asked a question about fitting a log-normal distribution to my data, in this thread:
> http://mail.scipy.org/pipermail/scipy-user/2009-November/023320.html
>
> Well, the work is unfinished there, and I started to dig in to the same
> subject again.
> For R, I have found a function that lets me estimate parameters from my binned data pair (i.e. bin sizes - measurements) to construct a log-normal fit:
>
> http://www.exposurescience.org/heR.doc/library/heR.Misc/html/bin2lnorm.html
>
> The description given for the function is in conflict with itself:
>
> The title says: "Fit binned data to a log-normal distribution"
>
> However, the description says otherwise:
>
> "This function takes binned data and fits a lognormal model to it, using weighted least squares, and optionally plotting the fit and the data together"
>
> I couldn't find a way to estimate log-normal parameters in Python (maybe I will need the same for the gamma distributions as well) given in the form that bin2lnorm expects (i.e. l - bin limits, and h - corresponding heights; measurements in my case); that is the reason I use that R function. Any alternative suggestions are welcome at this point.
>
> Similarly, while studying my Cloud and Precipitation Parameterizations book today (distributions are extremely important in the bulk parameterization of clouds and cloud constituents/products, e.g. aerosols, cloud droplets, rain, hail etc.), I see a couple of figures (please see the book review at
> http://www.cambridge.org/catalogue/catalogue.asp?isbn=9780521883382&ss=exc
> and go to pg 9, Figure 1.2) with captions like: "gamma curves fit to data."
>
> It's clearer now after reading your inputs.
>
> Thanks again.
> --
> Gökhan

Depends on what you mean by 'data'. However, like many things, terminology is rather flexible, misused or just incomplete.

Typically you have random variables (http://en.wikipedia.org/wiki/Random_variables) from some distribution, such as the multivariate normal. Note that a distribution is a rather complex thing which has various properties (http://en.wikipedia.org/wiki/Probability_distribution).

When you want to see if the data are from some distribution that you do not know, then you are testing the hypothesis that your data, as a whole, have certain characteristics of random variables from that distribution. The central limit theorem makes many distributions very similar (i.e. like a normal distribution) with sufficient observations, when it holds. However, you cannot say that the data are random variables from that distribution, nor that all data points are from the distribution.

So if your data are random variables, then neither phrasing is correct.

Bruce

From qa at takb.net Fri Jan 15 03:19:33 2010
From: qa at takb.net (Torsten Andre)
Date: Fri, 15 Jan 2010 09:19:33 +0100
Subject: [SciPy-User] Integration of double integral with integration variable as
In-Reply-To: <201001141321.04818.ljmamoreira@gmail.com>
References: <4B4EE290.7070600@takb.net> <201001141321.04818.ljmamoreira@gmail.com>
Message-ID: <4B502515.3020809@takb.net>

Jose,

I thank you very much for your help. Well, my example is easy to solve, indeed. Though this was only an example. But the trick with the functions does it. Something one could have figured out... You bet it helped ;)

Torsten

Jose Amoreira wrote:
> Torsten,
> Your example is easy!
> Since sin(y) is an odd function, integrating over [-x,x] gives zero and that's it. For a more illustrative example, replace sin with cos (still easy enough to do it quicker analytically).
> Using scipy.integrate.quad, you do it like this (excerpt from an idle session):
>
> >>> def g(x):
>         return quad(cos, -x, x)[0]
>
> >>> quad(g, 0., 1.)[0]
> 0.91939538826372047
>
> The reason I take the zero-th element of the quad output is that the remainder is an estimate of the error.
> Maybe you should also look into scipy.integrate.dblquad.
> Hope this helps.
> jose
>
> On Thursday 14 January 2010 09:23:28 am Torsten Andre wrote:
>> Hey everyone,
>>
>> I am new to SciPy, but need to integrate something like this, where the
>> boundaries of the inner integral depend on the outer integral's
>> integration variable:
>>
>> \int_0^1 \int_{-x}^{+x} \sin(y) \, dy \, dx
>>
>> Is this feasible in SciPy? I tried using quad but it only complains that
>> x is not defined. Unfortunately I was unable to find anything on the
>> list or in the documentation.
>>
>> Thanks for your time.
>>
>> Cheers,
>> Torsten
>> _______________________________________________
>> SciPy-User mailing list
>> SciPy-User at scipy.org
>> http://mail.scipy.org/mailman/listinfo/scipy-user
>>
> _______________________________________________
> SciPy-User mailing list
> SciPy-User at scipy.org
> http://mail.scipy.org/mailman/listinfo/scipy-user

From gorkypl at gmail.com Fri Jan 15 09:24:46 2010
From: gorkypl at gmail.com (=?UTF-8?Q?Pawe=C5=82_Rumian?=)
Date: Fri, 15 Jan 2010 15:24:46 +0100
Subject: [SciPy-User] Quick question about selecting periodical data with scikits.timeseries
Message-ID: <5158a0651001150624vcc4a6f6y61ba4a2c201818a@mail.gmail.com>

hello,

Working more and more with scikits.timeseries and hydroclimpy, I'm still impressed by their performance and abilities.

I cannot find, however, a native method of selecting the data included in a given (regular) period. Is there one?

In my particular case, I have daily data for the last fifteen years, and I'd like to split them into fifteen annual series, or 15*12 monthly series, and so on...

I know I can select data from one period using something like:
series['1996-01-01':'1996-12-31']
and of course I can write a function that will iterate over all years - but since I've found that many of the functions I wrote were already included in the package, I don't want to make this mistake once more ;)

greetings,
Paweł

From kwgoodman at gmail.com Fri Jan 15 13:07:28 2010
From: kwgoodman at gmail.com (Keith Goodman)
Date: Fri, 15 Jan 2010 10:07:28 -0800
Subject: [SciPy-User] scipy.stats.nanstd, bias and ddof
Message-ID:

By default np.std and scipy.std normalize by N.
But scipy.stats.nanstd normalizes by N-1.

>> x = np.random.rand(4)
>> np.std(x)
   0.12006913635950889
>> scipy.std(x)
   0.12006913635950889
>> scipy.stats.nanstd(x)
   0.13864389639705668
>> scipy.stats.nanstd(x, bias=True)
   0.12006913635950889

Can the default for nanstd be changed to bias=True? Or would that break code?

Even better, I guess, would be to replace the bias keyword with ddof as used in np.std and scipy.std. So

    if bias:
        m2c = m2 / n
    else:
        m2c = m2 / (n - 1.)

in scipy.stats.nanstd would become

    m2c = m2 / (n - ddof)

For me it doesn't matter if the default ddof is 0 or 1. But it is nice when all std functions use the same default.

From josef.pktd at gmail.com Fri Jan 15 13:48:11 2010
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Fri, 15 Jan 2010 13:48:11 -0500
Subject: [SciPy-User] scipy.stats.nanstd, bias and ddof
In-Reply-To: References: Message-ID: <1cd32cbb1001151048t824458if34e85b3464e380@mail.gmail.com>

On Fri, Jan 15, 2010 at 1:07 PM, Keith Goodman wrote:
> By default np.std and scipy.std normalize by N. But scipy.stats.nanstd
> normalizes by N-1.
>
>>> x = np.random.rand(4)
>>> np.std(x)
>   0.12006913635950889
>>> scipy.std(x)
>   0.12006913635950889
>>> scipy.stats.nanstd(x)
>   0.13864389639705668
>>> scipy.stats.nanstd(x, bias=True)
>   0.12006913635950889
>
> Can the default for nanstd be changed to bias=True? Or would that break code?
>
> Even better, I guess, would be to replace the bias keyword with ddof as
> used in np.std and scipy.std. So
>
>    if bias:
>        m2c = m2 / n
>    else:
>        m2c = m2 / (n - 1.)
>
> in scipy.stats.nanstd would become
>
>     m2c = m2 / (n - ddof)
>
> For me it doesn't matter if the default ddof is 0 or 1. But it is nice
> when all std functions use the same default.

I agree with the consistency across function arguments. But changing the degrees of freedom will affect user code, and we would have to go through a warning period, and maybe add the ddof argument in the meantime. (But having both bias and ddof as arguments would be a bit messy.)

Or maybe numpy should get a nanmean and nanvar, nanstd, similar to nansum? Then it would be easier to deprecate them like the other stats functions that moved to numpy.

Josef

> _______________________________________________
> SciPy-User mailing list
> SciPy-User at scipy.org
> http://mail.scipy.org/mailman/listinfo/scipy-user
>

From kwgoodman at gmail.com Fri Jan 15 13:56:10 2010
From: kwgoodman at gmail.com (Keith Goodman)
Date: Fri, 15 Jan 2010 10:56:10 -0800
Subject: [SciPy-User] scipy.stats.nanstd, bias and ddof
In-Reply-To: <1cd32cbb1001151048t824458if34e85b3464e380@mail.gmail.com>
References: <1cd32cbb1001151048t824458if34e85b3464e380@mail.gmail.com>
Message-ID:

On Fri, Jan 15, 2010 at 10:48 AM, wrote:
> Or maybe numpy should get a nanmean and nanvar, nanstd, similar to
> nansum? Then it would be easier to deprecate them like the other stats
> functions that moved to numpy.

That's a great idea. Adding nanstd to numpy would not break any code.

It would also be nice to have a nanmedian in numpy, one that doesn't do a full sort. A pony would be nice too.

From Dharhas.Pothina at twdb.state.tx.us Fri Jan 15 15:11:49 2010
From: Dharhas.Pothina at twdb.state.tx.us (Dharhas Pothina)
Date: Fri, 15 Jan 2010 14:11:49 -0600
Subject: [SciPy-User] timeseries tsfromtxt missing_values bug?
Message-ID: <4B5077A4.63BA.009B.0@twdb.state.tx.us>

Hi,

I'm having issues with tsfromtxt masking fields using the missing_values parameter.

>>> dateconverter = lambda y, m, d, hh, mm : datetime(year=int(y), month=int(m), day=int(d), hour=int(hh), minute=int(mm))
>>> rseries = ts.tsfromtxt('test.csv', freq='T', comments='#', dateconverter=dateconverter, datecols=(1,2,3,4,5), usecols=(1,2,3,4,5,8), delimiter=',', missing_values=-999.0)

gives:

timeseries([(-999.0,) (-999.0,) (-999.0,)],
           dtype = [('f5', '<f8')])

>>> rseries = ts.tsfromtxt('test.csv', freq='T', comments='#', dateconverter=dateconverter, datecols=(1,2,3,4,5), usecols=(1,2,3,4,5,8), delimiter=',', missing_values=-999.0, names='data')

gives:

timeseries([(--,) (--,) (--,)],
           dtype = [('_tmp4', '<f8')])

From pgmdevlist at gmail.com Fri Jan 15 15:56:47 2010
From: pgmdevlist at gmail.com (Pierre GM)
Date: Fri, 15 Jan 2010 15:56:47 -0500
Subject: [SciPy-User] Quick question about selecting periodical data with scikits.timeseries
In-Reply-To: <5158a0651001150624vcc4a6f6y61ba4a2c201818a@mail.gmail.com>
References: <5158a0651001150624vcc4a6f6y61ba4a2c201818a@mail.gmail.com>
Message-ID: <94C2AFA5-4418-4F10-9E05-43377C73929D@gmail.com>

On Jan 15, 2010, at 9:24 AM, Paweł
Rumian wrote:
> hello,
>
> Working more and more with scikits.timeseries and hydroclimpy, I'm
> still impressed by their performance and abilities.
>
> I cannot find, however, a native method of selecting the data included in
> a given (regular) period. Is there one?
>
> In my particular case, I have daily data for the last fifteen years,
> and I'd like to split them into fifteen annual series, or 15*12
> monthly series, and so on...
>
> I know I can select data from one period using something like:
> series['1996-01-01':'1996-12-31']
> and of course I can write a function that will iterate over all years
> - but since I've found that many of the functions I wrote were already
> included in the package, I don't want to make this mistake once more
> ;)

The easiest is to use the .convert method described here:
http://pytseries.sourceforge.net/generated/scikits.timeseries.TimeSeries.convert.html

In your case, choose 'A' for the output frequency and func=None (the default) to get a Nx366 array of data; each row will correspond to a year, each column to a day of the year (hence 366 columns, to keep track of leap years; for non-leap years, the 366th element is masked).

A second possibility is to convert first to monthly frequency with an aggregation function (e.g., func=sum or func=mean) to get a series of monthly aggregated data, then convert to 'A' to get a Nx12 series.

Let me know if it helps or if you have more specific questions.

Cheers
P.

From gorkypl at gmail.com Fri Jan 15 17:16:17 2010
From: gorkypl at gmail.com (=?UTF-8?Q?Pawe=C5=82_Rumian?=)
Date: Fri, 15 Jan 2010 23:16:17 +0100
Subject: [SciPy-User] Quick question about selecting periodical data with scikits.timeseries
In-Reply-To: <94C2AFA5-4418-4F10-9E05-43377C73929D@gmail.com>
References: <5158a0651001150624vcc4a6f6y61ba4a2c201818a@mail.gmail.com> <94C2AFA5-4418-4F10-9E05-43377C73929D@gmail.com>
Message-ID: <5158a0651001151416x74b8c110t5f786d79c766e821@mail.gmail.com>

2010/1/15 Pierre GM :
> the easiest is to use the .convert method described here:
> http://pytseries.sourceforge.net/generated/scikits.timeseries.TimeSeries.convert.html

I don't know how I could miss it - while already using ts.convert and np.ma.mean to convert hourly data to daily averages, I totally overlooked the fact that this method doesn't have to use any interpolation.

> let me know if it helps or if you have more specific questions.

Of course it works perfectly, and so far this package has everything I need - you've done a great job.

Anyway, one more question - is there any convenient method to plot such converted data?
If I use simple tsplot, the dates are aligned horizontally, which is not what I want. Now I'm viewing them with something like:

for row in converted_data:
    plot(row)

but I wonder if there is a more 'proper' way of handling this.

greetings,
Paweł

From pgmdevlist at gmail.com Fri Jan 15 18:34:44 2010
From: pgmdevlist at gmail.com (Pierre GM)
Date: Fri, 15 Jan 2010 18:34:44 -0500
Subject: [SciPy-User] Quick question about selecting periodical data with scikits.timeseries
In-Reply-To: <5158a0651001151416x74b8c110t5f786d79c766e821@mail.gmail.com>
References: <5158a0651001150624vcc4a6f6y61ba4a2c201818a@mail.gmail.com> <94C2AFA5-4418-4F10-9E05-43377C73929D@gmail.com> <5158a0651001151416x74b8c110t5f786d79c766e821@mail.gmail.com>
Message-ID: <0D540F50-39DB-49A6-8B55-1388E594801B@gmail.com>

On Jan 15, 2010, at 5:16 PM, Paweł Rumian wrote:
>
> Anyway, one more question - is there any convenient method to plot
> such converted data? If I use simple tsplot, the dates are aligned
> horizontally, which is not what I want. Now I'm viewing them with
> something like:
> for row in converted_data: plot(row)
> but I wonder if there is a more 'proper' way of handling this.

Ah, you wanna plot your data year by year, right? In that case, you don't really need the dates anymore, and I suggest you plot the .series attribute instead (that's the masked array that stores only the data; it has faster access than the whole timeseries because you don't have to access the dates anymore).

Now, your problem simplifies into: how to plot multiple rows at once. You can loop on the rows, or check in matplotlib if there's not another trick (try a LineCollection if you don't need different colors for different years/rows).

From gorkypl at gmail.com Sat Jan 16 04:20:03 2010
From: gorkypl at gmail.com (=?UTF-8?Q?Pawe=C5=82_Rumian?=)
Date: Sat, 16 Jan 2010 10:20:03 +0100
Subject: [SciPy-User] Quick question about selecting periodical data with scikits.timeseries
In-Reply-To: <0D540F50-39DB-49A6-8B55-1388E594801B@gmail.com>
References: <5158a0651001150624vcc4a6f6y61ba4a2c201818a@mail.gmail.com> <94C2AFA5-4418-4F10-9E05-43377C73929D@gmail.com> <5158a0651001151416x74b8c110t5f786d79c766e821@mail.gmail.com> <0D540F50-39DB-49A6-8B55-1388E594801B@gmail.com>
Message-ID: <5158a0651001160120r55e6c219h1c732dfc045ed7a1@mail.gmail.com>

> Ah, you wanna plot your data year by year, right? In that case, you don't really need the dates anymore, and I suggest you plot the .series attribute instead (that's the masked array that stores only the data; it has faster access than the whole timeseries because you don't have to access the dates anymore).
> Now, your problem simplifies into: how to plot multiple rows at once. You can loop on the rows, or check in matplotlib if there's not another trick (try a LineCollection if you don't need different colors for different years/rows)

That clears everything - no more questions for now :)

Paweł

From resurgo at gmail.com Sat Jan 16 09:37:13 2010
From: resurgo at gmail.com (Peter Clarke)
Date: Sat, 16 Jan 2010 14:37:13 +0000
Subject: [SciPy-User] Python coders for Haiti disaster relief
Message-ID:

Apologies for off-topic posting, but I think this is an important project.

Python programmers are required immediately for assistance in coding a disaster management framework for the earthquake in Haiti.

From http://wiki.python.org/moin/VolunteerOpportunities:

-----------------
URGENT REQUEST, Sahana Disaster Management System, Haiti Earthquake

*Job Description*: This is an urgent call for experienced Python programmers to help in the Sahana Disaster Management System immediately - knowledge of the Web2Py platform would be best. The Sahana Disaster Management System is used to coordinate relief efforts. Please recruit any available programmers for the Haiti effort as quickly as possible and have them contact me immediately so that I can put them in touch with the correct people. Thank you kindly and I do hope that we can quickly identify some contributors for this monumental effort - they are needed ASAP. http://sahanapy.org/ is the developer site and the demo is http://demo.sahanapy.org/

- *Contact*: Connie White, PhD, Institute for Emergency Preparedness, Jacksonville State University
- *E-mail contact*: connie.m.white at gmail.com
- *Web*: http://sahanapy.org/
-----------------------------

Please help if you can.

-Peter Clarke
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From rchrdlyon1 at gmail.com Sat Jan 16 10:33:32 2010
From: rchrdlyon1 at gmail.com (Richard Lyon)
Date: Sun, 17 Jan 2010 02:33:32 +1100
Subject: [SciPy-User] Issues with lfilter after version upgrade
Message-ID: <4B51DC4C.4090906@googlemail.com>

Hi,

Problem:
=======

Been successfully using scipy to run various signal processing simulations. Recently upgraded python, numpy and scipy. Now I find that lfilter in signal processing appears to crash the Python interpreter.

Details:
========

Windows Vista
python 2.6.4
pywin32 214
numpy 1.4.0
scipy 0.7.1

The following code crashes

----------------------------------------------------------
from numpy import zeros
from scipy.signal import lfilter

print 'Testing lfilter'
B = [ +9.9416310E-01, -1.9883262E+00, +9.9416310E-01 ]
A = [ +1.0000000E+00, -1.9882920E+00, +9.8836030E-01 ]
z = zeros(2)
x0 = zeros(80)
# fails on this next line
(x1, z) = lfilter(B, A, x0, -1, z)
print 'Finished'
----------------------------------------------------------

Haven't seen any other problems yet.

Regards
RLYON

From josef.pktd at gmail.com Sat Jan 16 12:57:17 2010
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Sat, 16 Jan 2010 12:57:17 -0500
Subject: [SciPy-User] Issues with lfilter after version upgrade
In-Reply-To: <4B51DC4C.4090906@googlemail.com>
References: <4B51DC4C.4090906@googlemail.com>
Message-ID: <1cd32cbb1001160957v2e77436dn7e7a78ad90425ef3@mail.gmail.com>

On Sat, Jan 16, 2010 at 10:33 AM, Richard Lyon wrote:
> Hi,
>
> Problem:
> =======
>
> Been successfully using scipy to run various signal processing
> simulations. Recently upgraded python, numpy and scipy. Now I find that
> lfilter in signal processing appears to crash the Python interpreter.
>
> Details:
> ========
>
> Windows Vista
> python 2.6.4
> pywin32 214
> numpy 1.4.0
> scipy 0.7.1
>
> The following code crashes
>
> ----------------------------------------------------------
> from numpy import zeros
> from scipy.signal import lfilter
>
> print 'Testing lfilter'
> B = [ +9.9416310E-01, -1.9883262E+00, +9.9416310E-01 ]
> A = [ +1.0000000E+00, -1.9882920E+00, +9.8836030E-01 ]
> z = zeros(2)
> x0 = zeros(80)
> # fails on this next line
> (x1, z) = lfilter(B, A, x0, -1, z)
> print 'Finished'
> ----------------------------------------------------------

scipy 0.7.x has binary incompatibility problems if it has been compiled against numpy 1.3 and is run against numpy 1.4.

When I run your script with scipy (trunk) compiled against numpy 1.4.0, I don't have any problem.

When I run your script with scipy-0.7.1.dev5744 compiled against numpy 1.3, and run it with the release version of numpy 1.4.0, then I also have the crash. (I don't have a virtualenv with the scipy-0.7.1 release to try out.)

There are two options: either you recompile scipy against numpy 1.4, or downgrade to numpy 1.3 until numpy 1.4 compatible scipy binaries are available.

(I still think there are other binary incompatibilities besides the cython problem.)

I'm on Windows XP.

Josef

>
> Haven't seen any other problems yet.
>
> Regards
> RLYON
>
>
> _______________________________________________
> SciPy-User mailing list
> SciPy-User at scipy.org
> http://mail.scipy.org/mailman/listinfo/scipy-user
>

From perfreem at gmail.com Sat Jan 16 17:28:37 2010
From: perfreem at gmail.com (per freem)
Date: Sat, 16 Jan 2010 17:28:37 -0500
Subject: [SciPy-User] smoothing in scipy/matplotlib
Message-ID:

hi all,

i am using gaussian_kde to fit a gaussian kernel estimator to a bunch of data.
the lines i get are often quite jaggy and very sensitive to fluctuations in the data. is there a way to "smooth" the estimate more? typically in gaussian kdes there is a smoothing parameter, but i do not see one in the documentation.

is there a way to do this?

thanks.

From josef.pktd at gmail.com Sat Jan 16 18:33:50 2010
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Sat, 16 Jan 2010 18:33:50 -0500
Subject: [SciPy-User] smoothing in scipy/matplotlib
In-Reply-To: References: Message-ID: <1cd32cbb1001161533p3f732ccdl6a0fb2333728791a@mail.gmail.com>

On Sat, Jan 16, 2010 at 5:28 PM, per freem wrote:
> hi all,
>
> i am using gaussian_kde to fit a gaussian kernel estimator to a bunch
> of data. the lines i get are often quite jaggy and very sensitive to
> fluctuations in the data. is there a way to "smooth" the estimate
> more? typically in gaussian kdes there is a smoothing parameter, but i
> do not see one in the documentation.
>
> is there a way to do this?

Not yet, I never committed the change. The cleanest way currently is by subclassing gaussian_kde; the dirtier version is by monkey patching. I can look for my example scripts for both later tonight. There is also some information on the mailing list, e.g. a subclassing example by Anne (maybe one and a half years ago).

I'm a bit surprised about undersmoothing, because I did the changes for the case of oversmoothing by gaussian_kde.

Josef

>
> thanks.
> _______________________________________________
> SciPy-User mailing list
> SciPy-User at scipy.org
> http://mail.scipy.org/mailman/listinfo/scipy-user
>

From josef.pktd at gmail.com Sat Jan 16 23:37:16 2010
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Sat, 16 Jan 2010 23:37:16 -0500
Subject: [SciPy-User] smoothing in scipy/matplotlib
In-Reply-To: <1cd32cbb1001161533p3f732ccdl6a0fb2333728791a@mail.gmail.com>
References: <1cd32cbb1001161533p3f732ccdl6a0fb2333728791a@mail.gmail.com>
Message-ID: <1cd32cbb1001162037r5808ccd7v7c61966df662adad@mail.gmail.com>

On Sat, Jan 16, 2010 at 6:33 PM, wrote:
> On Sat, Jan 16, 2010 at 5:28 PM, per freem wrote:
>> hi all,
>>
>> i am using gaussian_kde to fit a gaussian kernel estimator to a bunch
>> of data. the lines i get are often quite jaggy and very sensitive to
>> fluctuations in the data. is there a way to "smooth" the estimate
>> more? typically in gaussian kdes there is a smoothing parameter, but i
>> do not see one in the documentation.
>>
>> is there a way to do this?
>
> Not yet, I never committed the change. The cleanest way currently is
> by subclassing gaussian_kde; the dirtier version is by monkey
> patching. I can look for my example scripts for both later tonight.
> There is also some information on the mailing list, e.g. a subclassing
> example by Anne (maybe one and a half years ago).
>
> I'm a bit surprised about undersmoothing, because I did the changes
> for the case of oversmoothing by gaussian_kde.
>
> Josef

In the attachment is my subclass of stats.gaussian_kde. The main change is to allow setting or resetting the smoothing factor to a float. It plots several examples.

Initially this was intended to be a continuation to this story, but I never got around to finishing it (my file is dated May, and I haven't looked at it in a long time):

http://jpktd.blogspot.com/2009/03/using-gaussian-kernel-density.html

I hope this helps, ask if something is not clear.
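In case it helps, a minimal usage sketch for the attached subclass (untested as written; the sample and the 0.25 factor below are just placeholders, not recommendations):

import numpy as np
# gaussian_kde_covfact is defined in the attached script
data = np.random.randn(500)              # placeholder sample
gkde = gaussian_kde_covfact(data, 0.25)  # fix the covariance factor to a float
grid = np.linspace(-4, 4, 201)
dens = gkde.evaluate(grid)               # density estimate on the grid
gkde.reset_covfact('scotts')             # back to the default rule of thumb

A smaller factor follows the data more closely (less smoothing), a larger one gives a smoother estimate.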
I don't find a ticket or mailing list thread on my draft for the enhancement (keyword option for the bandwidth) to gaussian_kde; the initial monkey patch version is here:
http://mail.scipy.org/pipermail/scipy-user/2009-January/019201.html

Josef

>
>
>>
>>
>>>
>>> thanks.
>>> _______________________________________________
>>> SciPy-User mailing list
>>> SciPy-User at scipy.org
>>> http://mail.scipy.org/mailman/listinfo/scipy-user
>>>
>>
>
-------------- next part --------------
'''subclassing kde

Author: josef pktd
'''

import numpy as np
import scipy
from scipy import stats
import matplotlib.pylab as plt


class gaussian_kde_set_covariance(stats.gaussian_kde):
    '''
    from Anne Archibald in mailinglist:
    http://www.nabble.com/Width-of-the-gaussian-in-stats.kde.gaussian_kde---td19558924.html#a19558924
    '''
    def __init__(self, dataset, covariance):
        self.covariance = covariance
        scipy.stats.gaussian_kde.__init__(self, dataset)

    def _compute_covariance(self):
        self.inv_cov = np.linalg.inv(self.covariance)
        self._norm_factor = np.sqrt(np.linalg.det(2*np.pi*self.covariance)) * self.n


class gaussian_kde_covfact(stats.gaussian_kde):
    def __init__(self, dataset, covfact='scotts'):
        self.covfact = covfact
        scipy.stats.gaussian_kde.__init__(self, dataset)

    def _compute_covariance_(self):
        '''not used'''
        self.inv_cov = np.linalg.inv(self.covariance)
        self._norm_factor = np.sqrt(np.linalg.det(2*np.pi*self.covariance)) * self.n

    def covariance_factor(self):
        if self.covfact in ['sc', 'scotts']:
            return self.scotts_factor()
        if self.covfact in ['si', 'silverman']:
            return self.silverman_factor()
        elif self.covfact:
            return float(self.covfact)
        else:
            raise ValueError, \
                'covariance factor has to be scotts, silverman or a number'

    def reset_covfact(self, covfact):
        self.covfact = covfact
        self.covariance_factor()
        self._compute_covariance()


def plotkde(covfact):
    gkde.reset_covfact(covfact)
    kdepdf = gkde.evaluate(ind)
    plt.figure()
    # plot histogram of sample
    plt.hist(xn, bins=20, normed=1)
    # plot estimated density
    plt.plot(ind, kdepdf, label='kde', color="g")
    # plot data generating density
    plt.plot(ind, alpha * stats.norm.pdf(ind, loc=mlow)
                  + (1-alpha) * stats.norm.pdf(ind, loc=mhigh),
             color="r", label='DGP: normal mix')
    plt.title('Kernel Density Estimation - ' + str(gkde.covfact))
    plt.legend()


from numpy.testing import assert_array_almost_equal, \
    assert_almost_equal, assert_

def test_kde_1d():
    np.random.seed(8765678)
    n_basesample = 500
    xn = np.random.randn(n_basesample)
    xnmean = xn.mean()
    xnstd = xn.std(ddof=1)
    print xnmean, xnstd

    # get kde for original sample
    gkde = stats.gaussian_kde(xn)

    # evaluate the density function for the kde for some points
    xs = np.linspace(-7, 7, 501)
    kdepdf = gkde.evaluate(xs)
    normpdf = stats.norm.pdf(xs, loc=xnmean, scale=xnstd)
    print 'MSE', np.sum((kdepdf - normpdf)**2)
    print 'maxabserror', np.max(np.abs(kdepdf - normpdf))
    intervall = xs[1] - xs[0]
    assert_(np.sum((kdepdf - normpdf)**2)*intervall < 0.01)
    #assert_array_almost_equal(kdepdf, normpdf, decimal=2)
    print gkde.integrate_gaussian(0.0, 1.0)
    print gkde.integrate_box_1d(-np.inf, 0.0)
    print gkde.integrate_box_1d(0.0, np.inf)
    print gkde.integrate_box_1d(-np.inf, xnmean)
    print gkde.integrate_box_1d(xnmean, np.inf)

    assert_almost_equal(gkde.integrate_box_1d(xnmean, np.inf), 0.5, decimal=1)
    assert_almost_equal(gkde.integrate_box_1d(-np.inf, xnmean), 0.5, decimal=1)
    assert_almost_equal(gkde.integrate_box(xnmean, np.inf), 0.5, decimal=1)
    assert_almost_equal(gkde.integrate_box(-np.inf, xnmean), 0.5, decimal=1)

    assert_almost_equal(gkde.integrate_kde(gkde),
                        (kdepdf**2).sum()*intervall, decimal=2)
    assert_almost_equal(gkde.integrate_gaussian(xnmean, xnstd**2),
                        (kdepdf*normpdf).sum()*intervall, decimal=2)
##    assert_almost_equal(gkde.integrate_gaussian(0.0, 1.0),
##                        (kdepdf*normpdf).sum()*intervall, decimal=2)


if __name__ == '__main__':
    # generate a sample
    n_basesample = 1000
    np.random.seed(8765678)
    alpha = 0.6  # weight for (prob of) lower distribution
    mlow, mhigh = (-3, 3)  # mean locations for gaussian mixture
    xn = np.concatenate([mlow + np.random.randn(alpha * n_basesample),
                         mhigh + np.random.randn((1-alpha) * n_basesample)])

    # get kde for original sample
    #gkde = stats.gaussian_kde(xn)
    gkde = gaussian_kde_covfact(xn, 0.1)
    # evaluate the density function for the kde for some points
    ind = np.linspace(-7, 7, 101)
    kdepdf = gkde.evaluate(ind)

    plt.figure()
    # plot histogram of sample
    plt.hist(xn, bins=20, normed=1)
    # plot estimated density
    plt.plot(ind, kdepdf, label='kde', color="g")
    # plot data generating density
    plt.plot(ind, alpha * stats.norm.pdf(ind, loc=mlow)
                  + (1-alpha) * stats.norm.pdf(ind, loc=mhigh),
             color="r", label='DGP: normal mix')
    plt.title('Kernel Density Estimation')
    plt.legend()

    gkde = gaussian_kde_covfact(xn, 'scotts')
    kdepdf = gkde.evaluate(ind)
    plt.figure()
    # plot histogram of sample
    plt.hist(xn, bins=20, normed=1)
    # plot estimated density
    plt.plot(ind, kdepdf, label='kde', color="g")
    # plot data generating density
    plt.plot(ind, alpha * stats.norm.pdf(ind, loc=mlow)
                  + (1-alpha) * stats.norm.pdf(ind, loc=mhigh),
             color="r", label='DGP: normal mix')
    plt.title('Kernel Density Estimation')
    plt.legend()
    #plt.show()

    for cv in ['scotts', 'silverman', 0.05, 0.1, 0.5]:
        plotkde(cv)

    test_kde_1d()

    np.random.seed(8765678)
    n_basesample = 1000
    xn = np.random.randn(n_basesample)
    xnmean = xn.mean()
    xnstd = xn.std(ddof=1)

    # get kde for original sample
    gkde = stats.gaussian_kde(xn)

From josef.pktd at gmail.com Sun Jan 17 00:03:12 2010
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Sun, 17 Jan 2010 00:03:12 -0500
Subject: [SciPy-User] smoothing in scipy/matplotlib
In-Reply-To: <1cd32cbb1001162037r5808ccd7v7c61966df662adad@mail.gmail.com>
References: <1cd32cbb1001161533p3f732ccdl6a0fb2333728791a@mail.gmail.com> <1cd32cbb1001162037r5808ccd7v7c61966df662adad@mail.gmail.com>
Message-ID: <1cd32cbb1001162103o4510ecf2gb227ed20ee11a679@mail.gmail.com>

On Sat, Jan 16, 2010 at 11:37 PM, wrote:
> On Sat, Jan 16, 2010 at 6:33 PM, wrote:
>> On Sat, Jan 16, 2010 at 5:28 PM, per freem wrote:
>>> hi all,
>>>
>>> i am using gaussian_kde to fit a gaussian kernel estimator to a bunch
>>> of data. the lines i get are often quite jaggy and very sensitive to
>>> fluctuations in the data. is there a way to "smooth" the estimate
>>> more? typically in gaussian kdes there is a smoothing parameter, but i
>>> do not see one in the documentation.
>>>
>>> is there a way to do this?
>>
>> Not yet, I never committed the change. The cleanest way currently is
>> by subclassing gaussian_kde; the dirtier version is by monkey
>> patching. I can look for my example scripts for both later tonight.
>> There is also some information on the mailing list, e.g. a subclassing
>> example by Anne (maybe one and a half years ago).
>>
>> I'm a bit surprised about undersmoothing, because I did the changes
>> for the case of oversmoothing by gaussian_kde.
>>
>> Josef
>
> In the attachment is my subclass of stats.gaussian_kde. The main
> change is to allow setting or resetting the smoothing factor to a
> float.
> It plots several examples.
>
> Initially this was intended to be a continuation to this story, but I
> never got around to finishing it (my file is dated May, and I haven't
> looked at it in a long time):
>
> http://jpktd.blogspot.com/2009/03/using-gaussian-kernel-density.html
>
> I hope this helps, ask if something is not clear.
>
> I don't find a ticket or mailing list thread on my draft for the
> enhancement (keyword option for the bandwidth) to gaussian_kde; the
> initial monkey patch version is here:
> http://mail.scipy.org/pipermail/scipy-user/2009-January/019201.html
>
> Josef

I just created http://projects.scipy.org/scipy/ticket/1092 so I don't forget about it again.

I appreciate any comments about what changes would be useful for the bandwidth choice.

Josef

>
>
>>
>>
>>
>>>
>>> thanks.
>>> _______________________________________________
>>> SciPy-User mailing list
>>> SciPy-User at scipy.org
>>> http://mail.scipy.org/mailman/listinfo/scipy-user
>>>
>>
>

From yves.frederix at gmail.com Sun Jan 17 05:25:40 2010
From: yves.frederix at gmail.com (Yves Frederix)
Date: Sun, 17 Jan 2010 11:25:40 +0100
Subject: [SciPy-User] Return type of scipy.interpolate.splev for input array of length 1
Message-ID: <62e6eafb1001170225h49632e1bw3c47c3d62f0cce2f@mail.gmail.com>

Hi,

I stumbled upon the following illogical behavior of scipy.interpolate.splev. When presented with a length-1 array, the output is converted to a scalar.

import scipy.interpolate
import numpy as N

x = N.arange(5.)
y = N.arange(5.)
tck = scipy.interpolate.splrep(x,y)
x_eval = N.asarray([1.])
y_eval = scipy.interpolate.splev(x_eval, tck)
print 'scipy.interpolate.splev(x_eval, tck):', y_eval
print 'type(x_eval):', type(x_eval)
print 'type(y_eval):', type(y_eval)

with output

scipy.interpolate.splev(x_eval, tck): 1.0
type(x_eval): <type 'numpy.ndarray'>
type(y_eval): <type 'numpy.float64'>

It was rather unexpected that the types of the input and output data are different. After checking interpolate/fitpack.py, it seems that this behavior results from the fact that the length-1 case is explicitly treated differently (probably to be able to deal with the case of scalar input, for which scalar output is expected):

434 def splev(x,tck,der=0):
...
487     if ier: raise TypeError,"An error occurred"
488     if len(y)>1: return y
489     return y[0]
490

Wouldn't it be less confusing to have the return value always have the same type as the input data?

Cheers,
YVES

From perfreem at gmail.com Sun Jan 17 08:39:35 2010
From: perfreem at gmail.com (per freem)
Date: Sun, 17 Jan 2010 08:39:35 -0500
Subject: [SciPy-User] smoothing in scipy/matplotlib
In-Reply-To: <1cd32cbb1001162103o4510ecf2gb227ed20ee11a679@mail.gmail.com>
References: <1cd32cbb1001161533p3f732ccdl6a0fb2333728791a@mail.gmail.com> <1cd32cbb1001162037r5808ccd7v7c61966df662adad@mail.gmail.com> <1cd32cbb1001162103o4510ecf2gb227ed20ee11a679@mail.gmail.com>
Message-ID:

hi josef,

thank you so much - your patch worked brilliantly. i simply changed the smoothing factor to 0.25 and got the correct result. it was very straightforward to use!

it would be great if your subclass of kde were incorporated into scipy. if you're interested in seeing the graphs before (with the default kde) and with your version, i can send you those.

thanks again.

On Sun, Jan 17, 2010 at 12:03 AM, wrote:
> On Sat, Jan 16, 2010 at 11:37 PM, wrote:
>> On Sat, Jan 16, 2010 at 6:33 PM, wrote:
>>> On Sat, Jan 16, 2010 at 5:28 PM, per freem wrote:
>>>> hi all,
>>>>
>>>> i am using gaussian_kde to fit a gaussian kernel estimator to a bunch
>>>> of data.
>>>> the lines i get are often quite jaggy and very sensitive to
>>>> fluctuations in the data. is there a way to "smooth" the estimate
>>>> more? typically in gaussian kdes there is a smoothing parameter, but i
>>>> do not see one in the documentation.
>>>>
>>>> is there a way to do this?
>>>
>>> Not yet, I never committed the change. The cleanest way currently is
>>> by subclassing gaussian_kde; the dirtier version is by monkey
>>> patching. I can look for my example scripts for both later tonight.
>>> There is also some information on the mailing list, e.g. a subclassing
>>> example by Anne (maybe one and a half years ago).
>>>
>>> I'm a bit surprised about undersmoothing, because I did the changes
>>> for the case of oversmoothing by gaussian_kde.
>>>
>>> Josef
>>
>> In the attachment is my subclass of stats.gaussian_kde. The main
>> change is to allow setting or resetting the smoothing factor to a
>> float. It plots several examples.
>>
>> Initially this was intended to be a continuation to this story, but I
>> never got around to finishing it (my file is dated May, and I haven't
>> looked at it in a long time):
>>
>> http://jpktd.blogspot.com/2009/03/using-gaussian-kernel-density.html
>>
>> I hope this helps, ask if something is not clear.
>>
>> I don't find a ticket or mailing list thread on my draft for the
>> enhancement (keyword option for the bandwidth) to gaussian_kde; the
>> initial monkey patch version is here:
>> http://mail.scipy.org/pipermail/scipy-user/2009-January/019201.html
>>
>> Josef
>
> I just created http://projects.scipy.org/scipy/ticket/1092 so I don't
> forget about it again.
>
> I appreciate any comments about what changes would be useful for the
> bandwidth choice.
>
> Josef
>
>
>>
>> _______________________________________________
> SciPy-User mailing list
> SciPy-User at scipy.org
> http://mail.scipy.org/mailman/listinfo/scipy-user

From gorkypl at gmail.com Sun Jan 17 10:15:00 2010
From: gorkypl at gmail.com (=?UTF-8?Q?Pawe=C5=82_Rumian?=)
Date: Sun, 17 Jan 2010 16:15:00 +0100
Subject: [SciPy-User] scikits.timeseries plot and utf8 fonts
Message-ID: <5158a0651001170715o687dedcbs9c5487c2a5975ff1@mail.gmail.com>

hello,

Still working with scikits.timeseries and matplotlib, I've encountered another problem - this time with displaying month names in UTF-8.

In my language (Polish), month names contain non-ASCII characters (like ń or ź).
When plotting data series with matplotlib methods they are displayed correctly, but when using the scikits tsplot I get rectangles in those places.
The other texts (axis titles, legends and so on) are OK.

For example - there is no problem in this demo:
http://matplotlib.sourceforge.net/examples/pylab_examples/date_demo2.html
But the names are not displayed correctly when plotting the first example from:
http://pytseries.sourceforge.net/lib.plotting.examples.html

I wonder if there is a configuration problem or an issue with TimeSeriesPlot?

greetings,
Paweł
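A quick way to narrow this down is to force a font that is known to carry the Polish glyphs and re-run the failing example; if the rectangles disappear, it is a font-selection problem in the plotting code rather than an encoding one. A minimal sketch ('DejaVu Sans' is only an example name and assumes that font is installed):

import matplotlib
# assumption: a font covering Polish glyphs is installed under this name
matplotlib.rcParams['font.family'] = 'DejaVu Sans'
# ...then run the first example from
# http://pytseries.sourceforge.net/lib.plotting.examples.html unchanged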
From contact at pythonxy.com Sun Jan 17 12:07:08 2010
From: contact at pythonxy.com (Pierre Raybaut)
Date: Sun, 17 Jan 2010 18:07:08 +0100
Subject: [SciPy-User] [ANN] Spyder v1.0.3 released
Message-ID: <4B5343BC.3070703@pythonxy.com>

Hi all,

I'm pleased to announce here that Spyder version 1.0.3 has been released:
http://packages.python.org/spyder

__Important__

Spyder v1.0.3 is a *critical* bugfix release (bonus: new "Apply" button in matplotlib's figure options editor).

Previously known as Pydee, Spyder (Scientific PYthon Development EnviRonment) is a free open-source Python development environment providing MATLAB-like features in a simple and lightweight package, available for Windows XP/Vista/7, GNU/Linux and MacOS X:

* advanced code editing features (code analysis, ...)
* interactive console with MATLAB-like workspace (with GUI-based list, dictionary, tuple, text and array editors -- screenshots: http://packages.python.org/spyder/console.html#the-workspace) and integrated matplotlib figures
* external console to open an interpreter or run a script in a separate process (with a global variable explorer providing the same features as the interactive console's workspace)
* code analysis with pyflakes and pylint
* search in files features
* documentation viewer: automatically retrieves docstrings or source code of the function/class called in the interactive/external console
* integrated file/directories explorer
* MATLAB-like path management

...and more!

Spyder is part of spyderlib, a Python module based on PyQt4 and QScintilla2 which provides powerful console-related PyQt4 widgets.

- Pierre

From fiolj at yahoo.com Sun Jan 17 13:28:23 2010
From: fiolj at yahoo.com (Juan)
Date: Sun, 17 Jan 2010 15:28:23 -0300
Subject: [SciPy-User] f2py segfault
Message-ID: <4B5356C7.80908@yahoo.com>

Hi, I don't know if this is the right place (if it is not, please point me in the right direction).

I am using f2py with some of my own programs and I am going insane with a segmentation fault. It is probably a problem in my code, but I'd like to know if someone has any hint to give me, since I've been trying different things for two days already.

I've got a few routines in fortran with in/out arrays. When I call one of the routines it works well. The second routine I call crashes the program. I've been changing routines and it seems that it does not matter which routines I use.

Basically, the fortran routines have the signature:

subroutine sub1(y, Np, Nt)
   integer(4), intent(IN) :: Np
   integer(4), intent(IN) :: Nt
   real(8), intent(INOUT), dimension(6*Np, Nt) :: y

and I call them from python as:

import mymod
r = np.zeros((Ncoord, Ntraj), dtype=np.float64, order='Fortran')
mymod.sub1(r)

I am using python 2.6. Probably the statement of the problem is too vague to get an answer, but I'll settle for just some ideas on how to proceed. I've used the debugging option --debug-capi, but it does not provide more information: it only tells me that it checks the array and segfaults (before analyzing the integer arguments Np, Nt).

Thanks, Juan

From pgmdevlist at gmail.com Sun Jan 17 15:37:21 2010
From: pgmdevlist at gmail.com (Pierre GM)
Date: Sun, 17 Jan 2010 15:37:21 -0500
Subject: [SciPy-User] scikits.timeseries plot and utf8 fonts
In-Reply-To: <5158a0651001170715o687dedcbs9c5487c2a5975ff1@mail.gmail.com>
References: <5158a0651001170715o687dedcbs9c5487c2a5975ff1@mail.gmail.com>
Message-ID:

On Jan 17, 2010, at 10:15 AM, Paweł
Rumian wrote:
> hello,
>
> Still working with scikits.timeseries and matplotlib, I've encountered
> another problem - this time with displaying month names in UTF-8.
>
> In my language (Polish), month names contain non-ASCII characters
> (like ń or ź).
> When plotting data series with matplotlib methods they are displayed
> correctly, but when using the scikits tsplot I get rectangles in those
> places.
> The other texts (axis titles, legends and so on) are OK.
>
> For example - there is no problem in this demo:
> http://matplotlib.sourceforge.net/examples/pylab_examples/date_demo2.html
> But the names are not displayed correctly when plotting the first example from:
> http://pytseries.sourceforge.net/lib.plotting.examples.html
>
> I wonder if there is a configuration problem or an issue with TimeSeriesPlot?

Oh, I'm afraid it's just bugs on the scikits part; the lib.plotlib section has been lagging behind matplotlib. Let me check and get back to you next week (meanwhile, please open a ticket at http://projects.scipy.org/scikits, that'll be easier to manage).

Sorry for the inconvenience.
P.

From dagss at student.matnat.uio.no Sun Jan 17 16:28:27 2010
From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn)
Date: Sun, 17 Jan 2010 22:28:27 +0100
Subject: [SciPy-User] f2py segfault
In-Reply-To: <4B5356C7.80908@yahoo.com>
References: <4B5356C7.80908@yahoo.com>
Message-ID: <4B5380FB.2000509@student.matnat.uio.no>

Juan wrote:
> Hi, I don't know if this is the right place (if it is not, please point me in
> the right direction).
> I am using f2py with some of my own programs and I am going insane with a
> segmentation fault. It is probably a problem in my code, but I'd like to know
> if someone has any hint to give me, since I've been trying different things
> for two days already.
>
> I've got a few routines in fortran with in/out arrays. When I call one of the
> routines it works well. The second routine I call crashes the program. I've been
> changing routines and it seems that it does not matter which routines I use.
>
> Basically, the fortran routines have the signature:
>
> subroutine sub1(y, Np, Nt)
>   integer(4), intent(IN) :: Np
>   integer(4), intent(IN) :: Nt
>   real(8), intent(INOUT), dimension(6*Np, Nt) :: y
>
> and I call them from python as:
>
> import mymod
> r = np.zeros((Ncoord, Ntraj), dtype=np.float64, order='Fortran')
> mymod.sub1(r)
>
> I am using python 2.6. Probably the statement of the problem is too vague to get
> an answer, but I'll settle for just some ideas on how to proceed. I've used the
> debugging option --debug-capi, but it does not provide more information: it only
> tells me that it checks the array and segfaults (before analyzing the integer
> arguments Np, Nt).

From what little I know of f2py, the "6*Np" seems like the problematic part. If f2py isn't smart enough to take the array shape and divide by 6 (which, in general, requires solving a symbolic equation, and somehow I doubt f2py is that smart, though perhaps it deals with simple things like this explicitly? *shrug*), then Np is going to be passed as too big a number (try to print out Np from your Fortran program to confirm...).
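An untested sketch of what that means on the calling side: the generated wrapper exposes Np and Nt as optional arguments derived from y's shape, so you can pass them explicitly instead of relying on the inferred value (mymod and sub1 as in this thread; the sizes are made up):

import numpy as np
import mymod                   # the f2py-built module from this thread

Np, Nt = 4, 100                # made-up sizes
y = np.zeros((6 * Np, Nt), dtype=np.float64, order='F')
mymod.sub1(y, Np, Nt)          # both extents passed, nothing left to infer

If the explicit call behaves while the bare mymod.sub1(y) does not, the shape inference is the culprit.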
Dag Sverre

From gorkypl at gmail.com Sun Jan 17 16:39:15 2010
From: gorkypl at gmail.com (=?UTF-8?Q?Pawe=C5=82_Rumian?=)
Date: Sun, 17 Jan 2010 22:39:15 +0100
Subject: [SciPy-User] scikits.timeseries plot and utf8 fonts
In-Reply-To: References: <5158a0651001170715o687dedcbs9c5487c2a5975ff1@mail.gmail.com>
Message-ID: <5158a0651001171339r7ec20d09kcef9f5ec5f8457c2@mail.gmail.com>

2010/1/17 Pierre GM :
> Oh, I'm afraid it's just bugs on the scikits part; the lib.plotlib section has been lagging behind matplotlib. Let me check and get back to you next week (meanwhile, please open a ticket at http://projects.scipy.org/scikits, that'll be easier to manage).
> Sorry for the inconvenience.

No problem, thanks for the response :)

Paweł

From kwmsmith at gmail.com Sun Jan 17 16:43:44 2010
From: kwmsmith at gmail.com (Kurt Smith)
Date: Sun, 17 Jan 2010 15:43:44 -0600
Subject: [SciPy-User] f2py segfault
In-Reply-To: <4B5380FB.2000509@student.matnat.uio.no>
References: <4B5356C7.80908@yahoo.com> <4B5380FB.2000509@student.matnat.uio.no>
Message-ID:

On Sun, Jan 17, 2010 at 3:28 PM, Dag Sverre Seljebotn wrote:
> Juan wrote:
>> Hi, I don't know if this is the right place (if it is not, please point me in
>> the right direction).
>> I am using f2py with some of my own programs and I am going insane with a
>> segmentation fault. It is probably a problem in my code, but I'd like to know
>> if someone has any hint to give me, since I've been trying different things
>> for two days already.
>>
>> I've got a few routines in fortran with in/out arrays. When I call one of the
>> routines it works well. The second routine I call crashes the program. I've been
>> changing routines and it seems that it does not matter which routines I use.
>>
>> Basically, the fortran routines have the signature:
>>
>> subroutine sub1(y, Np, Nt)
>>   integer(4), intent(IN) :: Np
>>   integer(4), intent(IN) :: Nt
>>   real(8), intent(INOUT), dimension(6*Np, Nt) :: y
>>
>> and I call them from python as:
>>
>> import mymod
>> r = np.zeros((Ncoord, Ntraj), dtype=np.float64, order='Fortran')
>> mymod.sub1(r)
>>
>> I am using python 2.6. Probably the statement of the problem is too vague to get
>> an answer, but I'll settle for just some ideas on how to proceed. I've used the
>> debugging option --debug-capi, but it does not provide more information: it only
>> tells me that it checks the array and segfaults (before analyzing the integer
>> arguments Np, Nt).
>
> From what little I know of f2py, the "6*Np" seems like the problematic
> part. If f2py isn't smart enough to take the array shape and divide by 6
> (which, in general, requires solving a symbolic equation, and somehow I
> doubt f2py is that smart, though perhaps it deals with simple things
> like this explicitly? *shrug*), then Np is going to be passed as too big
> a number (try to print out Np from your Fortran program to confirm...).

That's what I suspected at first, too. f2py tries to handle this, although it uses integer division, and I think that leads to a bug. I don't get a segfault, though, so your problem might be something else. If you can redo things to get rid of the '6*Np', it might be worth trying.
ksmith at lothario:~/test-f2py$ cat foo.f90 subroutine sub1(y, Np, Nt) integer(4), intent(in) :: Np integer(4), intent(in) :: Nt real(8), intent(inout), dimension(6*Np, Nt) :: y y = 1.0 end subroutine sub1 ksmith at lothario:~/test-f2py$ f2py --debug-capi -c foo.f90 [f2py output] ksmith at lothario:~/test-f2py$ cat test_foo.py import numpy as np import untitled print untitled.sub1.__doc__ # this works -- note the array extents -- multiples of 6, so the integer division works... r = np.zeros((6, 6), dtype=np.float64, order="Fortran") untitled.sub1(r) print r assert np.all(r == 1.0) # this doesn't work -- note the array extents. r = np.zeros((5, 5), dtype=np.float64, order="Fortran") untitled.sub1(r) print r assert np.all(r == 1.0) ksmith at lothario:~/test-f2py$ python test_foo.py sub1 - Function signature: sub1(y,[np,nt]) Required arguments: y : in/output rank-2 array('d') with bounds (6 * np,nt) Optional arguments: np := (shape(y,0))/(6) input int nt := shape(y,1) input int debug-capi:Python C/API function untitled.sub1(y,np=(shape(y,0))/(6),nt=shape(y,1)) debug-capi:double y=:inoutput,required,array,dims(6 * np|6 * np,nt|nt) debug-capi:int np=(shape(y,0))/(6):input,optional,scalar debug-capi:np=1 debug-capi:Checking `(shape(y,0))/(6)==np' debug-capi:int nt=shape(y,1):input,optional,scalar debug-capi:nt=6 debug-capi:Checking `shape(y,1)==nt' debug-capi:Fortran subroutine `sub1(y,&np,&nt)' debug-capi:np=1 debug-capi:nt=6 debug-capi:Building return value. debug-capi:Python C/API function untitled.sub1: successful. debug-capi:Freeing memory. [[ 1. 1. 1. 1. 1. 1.] [ 1. 1. 1. 1. 1. 1.] [ 1. 1. 1. 1. 1. 1.] [ 1. 1. 1. 1. 1. 1.] [ 1. 1. 1. 1. 1. 1.] [ 1. 1. 1. 1. 1. 1.]] debug-capi:Python C/API function untitled.sub1(y,np=(shape(y,0))/(6),nt=shape(y,1)) debug-capi:double y=:inoutput,required,array,dims(6 * np|6 * np,nt|nt) debug-capi:int np=(shape(y,0))/(6):input,optional,scalar debug-capi:np=0 debug-capi:Checking `(shape(y,0))/(6)==np' debug-capi:int nt=shape(y,1):input,optional,scalar debug-capi:nt=5 debug-capi:Checking `shape(y,1)==nt' debug-capi:Fortran subroutine `sub1(y,&np,&nt)' debug-capi:np=0 debug-capi:nt=5 debug-capi:Building return value. debug-capi:Python C/API function untitled.sub1: successful. debug-capi:Freeing memory. [[ 0. 0. 0. 0. 0.] [ 0. 0. 0. 0. 0.] [ 0. 0. 0. 0. 0.] [ 0. 0. 0. 0. 0.] [ 0. 0. 0. 0. 0.]] Traceback (most recent call last): File "test_foo.py", line 16, in assert np.all(r == 1.0) AssertionError From sfalsharif at gmail.com Sun Jan 17 18:44:56 2010 From: sfalsharif at gmail.com (Sharaf Al-Sharif) Date: Sun, 17 Jan 2010 23:44:56 +0000 Subject: [SciPy-User] Amplitude scaling in fft Message-ID: <57c570d21001171544p5f9b3835g513b2575adc6af33@mail.gmail.com> Hi, I'm a bit confused regarding how the amplitudes returned by np.fft.fft (or np.fft.rfft) relate to the amplitudes of the original signal in time domain. If: A = np.fft.rfft(a,n=2048) but, n_pts = len(a) < 2048, will the physical amplitudes in time domain be np.abs(A)*2/2048 , or np.abs(A)*2/n_pts? Or something else? Thank you for your help. Sharaf -------------- next part -------------- An HTML attachment was scrubbed... URL: From fiolj at yahoo.com Mon Jan 18 07:19:48 2010 From: fiolj at yahoo.com (Juan) Date: Mon, 18 Jan 2010 09:19:48 -0300 Subject: [SciPy-User] Fwd: f2py segfault Message-ID: <4B5451E4.5020806@yahoo.com> Hi, thanks for the advice. I did not notice that the integer division could be a source for trouble. Now I changed all the routines. However, I still have the same segmentation fault. 
debug-capi:Python C/API function mymod.sub0(state,ndim=shape(state,0),ntrajectories=shape(state,1))
debug-capi:double state=:inoutput,required,array,dims(ndim|ndim,ntrajectories|ntrajectories)
debug-capi:int ndim=shape(state,0):input,optional,scalar
debug-capi:ndim=24
debug-capi:Checking `shape(state,0)==ndim'
debug-capi:int ntrajectories=shape(state,1):input,optional,scalar
debug-capi:ntrajectories=100
debug-capi:Checking `shape(state,1)==ntrajectories'
debug-capi:Fortran subroutine `sub0(state,&ndim,&ntrajectories)'
debug-capi:ndim=24
debug-capi:ntrajectories=100
debug-capi:Building return value.
debug-capi:Python C/API function mymod.sub0: successful.
debug-capi:Freeing memory.
debug-capi:Python C/API function mymod.sub1(state,d_i,d_f,ndim=shape(state,0),ntrajectories=shape(state,1))
debug-capi:double state=:inoutput,required,array,dims(ndim|ndim,ntrajectories|ntrajectories)
Segmentation fault

The current sub1 has two other arguments, d_i and d_f, which are real scalars; the full signatures are:

subroutine sub0(state, Ndim, Ntrajectories)
   integer(I32), intent(IN) :: Ndim
   integer(I32), intent(IN) :: Ntrajectories
   real(R64), intent(INOUT), dimension(Ndim,Ntrajectories) :: state
   ...
end subroutine sub0

subroutine sub1(state, d_i, d_f, Ndim, Ntrajectories)
   integer(4), intent(IN) :: Ndim
   integer(4), intent(IN) :: Ntrajectories
   real(8), intent(INOUT), dimension(Ndim, Ntrajectories) :: state
   real(8), intent(IN) :: d_i
   real(8), intent(IN) :: d_f
   print *, shape(state), Ndim, Ntrajectories
   ...
end subroutine sub1

and I am calling from my script as:

import mymod
Ndim = 24
Ntrajectories = 10
di, df = 0., 10.
r = np.zeros((Ndim, Ntrajectories), dtype=np.float64, order='Fortran')

mymod.sub0(r)
mymod.sub1(r, di, df)

As can be seen from the debug output, f2py is checking the arguments for sub0, but it segfaults before checking the args in sub1 (with no very informative messages).

It may well be a problem related to the workings of the routines, but they work when I use them in tests on pure fortran code. Additionally, I get a very similar error message if I call sub0 (mymod.sub0(r)) instead of sub1 (mymod.sub1(r, di, df)) the second time in the python script.

Any ideas? Thanks again. Juan

-------- Original Message --------
Subject: f2py segfault
Date: Sun, 17 Jan 2010 15:28:23 -0300
From: Juan
To: scipy-user at scipy.org

Hi, I don't know if this is the right place (if it is not, please point me in the right direction).
I am using f2py with some of my own programs and I am going insane with a segmentation fault. It is probably a problem in my code, but I'd like to know if someone has any hint to give me, since I've been trying different things for two days already.

I've got a few routines in fortran with in/out arrays. When I call one of the routines it works well. The second routine I call crashes the program. I've been changing routines and it seems that it does not matter which routines I use.

Basically, the fortran routines have the signature:

From dagss at student.matnat.uio.no Mon Jan 18 10:55:34 2010
From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn)
Date: Mon, 18 Jan 2010 16:55:34 +0100
Subject: [SciPy-User] Fwd: f2py segfault
In-Reply-To: <4B5451E4.5020806@yahoo.com>
References: <4B5451E4.5020806@yahoo.com>
Message-ID: <4B548476.9060400@student.matnat.uio.no>

Juan wrote:
> Hi, thanks for the advice. I did not notice that the integer division could be a
> source for trouble. Now I changed all the routines. However, I still have the
> same segmentation fault.
> [clip]
>
> Any ideas? Thanks again. Juan
>
Did you say which Fortran compiler you were using? f2py makes some
blatant assumptions about the Fortran compiler which are nowhere in any
standard. If you don't use gfortran you may get problems.

Dag Sverre

From dagss at student.matnat.uio.no  Mon Jan 18 10:56:42 2010
From: dagss at student.matnat.uio.no (Dag Sverre Seljebotn)
Date: Mon, 18 Jan 2010 16:56:42 +0100
Subject: [SciPy-User] Fwd: f2py segfault
In-Reply-To: <4B548476.9060400@student.matnat.uio.no>
References: <4B5451E4.5020806@yahoo.com> <4B548476.9060400@student.matnat.uio.no>
Message-ID: <4B5484BA.7050109@student.matnat.uio.no>

Dag Sverre Seljebotn wrote:
> Juan wrote:
>> Hi, thanks for the advice. I did not notice that the integer division
>> could be a source for trouble. Now I changed all the routines. However,
>> I still have the same segmentation fault.
>>
>> [clip]
>>
>> Any ideas? Thanks again. Juan
>>
> Did you say which Fortran compiler you were using? f2py makes some
> blatant assumptions about the Fortran compiler which are nowhere in any
> standard. If you don't use gfortran you may get problems.
>
> Dag Sverre
>
Actually, other compilers may work very well -- I just don't know
myself, but know that that is a possible source of problems...

Dag Sverre

From josef.pktd at gmail.com  Mon Jan 18 10:59:46 2010
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Mon, 18 Jan 2010 10:59:46 -0500
Subject: [SciPy-User] Return type of scipy.interpolate.splev for input array of length 1
In-Reply-To: <62e6eafb1001170225h49632e1bw3c47c3d62f0cce2f@mail.gmail.com>
References: <62e6eafb1001170225h49632e1bw3c47c3d62f0cce2f@mail.gmail.com>
Message-ID: <1cd32cbb1001180759k74c06be8q276128724cce61ed@mail.gmail.com>

On Sun, Jan 17, 2010 at 5:25 AM, Yves Frederix wrote:
> Hi,
>
> I stumbled upon the following illogical behavior of
> scipy.interpolate.splev. When presented with a length-1 array, the
> output is converted to a scalar.
>
> import scipy.interpolate
> import numpy as N
>
> x = N.arange(5.)
> y = N.arange(5.)
> tck = scipy.interpolate.splrep(x,y)
>
> x_eval = N.asarray([1.])
> y_eval = scipy.interpolate.splev(x_eval, tck)
>
> print 'scipy.interpolate.splev(x_eval, tck):', y_eval
> print 'type(x_eval):', type(x_eval)
> print 'type(y_eval):', type(y_eval)
>
> with output
>
> scipy.interpolate.splev(x_eval, tck): 1.0
> type(x_eval): <type 'numpy.ndarray'>
> type(y_eval): <type 'numpy.float64'>
>
> It was rather unexpected that the type of input and output data are
> different. After checking interpolate/fitpack.py it seems that this
> behavior results from the fact that the length-1 case is explicitly
> treated differently (probably to be able to deal with the case of
> scalar input, for which scalar output is expected):
>
> 434 def splev(x,tck,der=0):
> ...
> 487         if ier: raise TypeError,"An error occurred"
> 488         if len(y)>1: return y
> 489         return y[0]
> 490
>
> Wouldn't it be less confusing to have the return value always have the
> same type as the input data?

I don't know of any "official" policy.

scipy.stats has switched for the most part to the same behavior. I
think, mainly it is just a convention to have a nicer output when the
return value is a scalar.

One problem with making the output depend on the input type or shape
is that in most functions I know, this information is not kept inside
the function. Usually the input of array_like (arrays, lists, tuples,
scalar numbers) is converted to an ndarray with np.asarray or
np.array.
The output then is independent of the input type (which also hurts if
a user wants to work with matrices or other subclasses of ndarrays).

On the other hand, if I want to use a list as input for convenience, I
don't really want a list as output, I want an ndarray.

That's my view, I don't really care in which direction the convention
goes, but I like the consistency.

Josef

> Cheers,
> YVES
> _______________________________________________
> SciPy-User mailing list
> SciPy-User at scipy.org
> http://mail.scipy.org/mailman/listinfo/scipy-user
>

From amenity at enthought.com  Mon Jan 18 11:55:20 2010
From: amenity at enthought.com (Amenity Applewhite)
Date: Mon, 18 Jan 2010 10:55:20 -0600
Subject: [SciPy-User] EPD 6.0 and IPython Webinar Friday
References: <0AE0D056-D7BB-498B-A14D-AAF9A90ED8F2@enthought.com>
Message-ID: <94400778-AE3F-46A4-8B92-C86CB6DAD95A@enthought.com>

Email not displaying correctly? View it in your browser.

Happy 2010! To start the year off, we've released a new version of EPD
and lined up a solid set of training options.

Scientific Computing with Python Webinar
This Friday, Travis Oliphant will provide an introduction to
multiprocessing and IPython.kernel.

Scientific Computing with Python Webinar
Multiprocessing and IPython.kernel
Friday, January 22: 1pm CST/7pm UTC
Register

Enthought Live Training
Enthought's intensive training courses are offered in 3-5 day sessions.
The Python skills you'll acquire will save you and your organization
time and money in 2010.

Enthought Open Course
February 22-26, Austin, TX
* Python for Scientists and Engineers
* Interfacing with C / C++ and Fortran
* Introduction to UIs and Visualization

Enjoy!
The Enthought Team

EPD 6.0 Released
Now available in our repository, EPD 6.0 includes Python 2.6, PiCloud's
cloud library, and NumPy 1.4... Not to mention 64-bit support for
Windows, OSX, and Linux. Details. Download now.

New: Enthought channel on YouTube
Short instructional videos straight from the desktops of our developers.
Get started with a 4-part series on interpolation with SciPy.

Our mailing address is:
Enthought, Inc.
515 Congress Ave.
Austin, TX 78701

Copyright (C) 2009 Enthought, Inc. All rights reserved.
Forward this email to a friend
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From cycomanic at gmail.com  Mon Jan 18 20:19:36 2010
From: cycomanic at gmail.com (Jochen Schroeder)
Date: Tue, 19 Jan 2010 12:19:36 +1100
Subject: [SciPy-User] Amplitude scaling in fft
In-Reply-To: <57c570d21001171544p5f9b3835g513b2575adc6af33@mail.gmail.com>
References: <57c570d21001171544p5f9b3835g513b2575adc6af33@mail.gmail.com>
Message-ID: <20100119011934.GA2266@cudos0803>

Hi,

your question doesn't really have a clear answer. Say raw_fft/raw_ifft is
an fft/ifft pair without normalization; then:

A = raw_ifft(raw_fft(a, n=2**11), n=2**11)
  = N*a

where N=2**11, not len(a). However numpy does perform a normalization step
in the ifft part, so that

numpy.fft.ifft = raw_ifft / N

This way we can use the fft just as a Fourier transform and also
fft(\delta) is constant 1.

Hope that explains things a bit.

Cheers
Jochen

On 01/17/10 23:44, Sharaf Al-Sharif wrote:
> Hi,
> I'm a bit confused regarding how the amplitudes returned by np.fft.fft (or
> np.fft.rfft) relate to the amplitudes of the original signal in time domain.
> If:
> A = np.fft.rfft(a,n=2048)
> but,
> n_pts = len(a) < 2048,
>
> will the physical amplitudes in time domain be np.abs(A)*2/2048, or
> np.abs(A)*2/n_pts? Or something else?
> Thank you for your help.
>
> Sharaf

> _______________________________________________
> SciPy-User mailing list
> SciPy-User at scipy.org
> http://mail.scipy.org/mailman/listinfo/scipy-user

From pav+sp at iki.fi  Tue Jan 19 04:41:23 2010
From: pav+sp at iki.fi (Pauli Virtanen)
Date: Tue, 19 Jan 2010 09:41:23 +0000 (UTC)
Subject: [SciPy-User] Return type of scipy.interpolate.splev for input array of length 1
References: <62e6eafb1001170225h49632e1bw3c47c3d62f0cce2f@mail.gmail.com> <1cd32cbb1001180759k74c06be8q276128724cce61ed@mail.gmail.com>
Message-ID:

Mon, 18 Jan 2010 10:59:46 -0500, josef.pktd wrote:
> On Sun, Jan 17, 2010 at 5:25 AM, Yves Frederix
> wrote:
[clip]
>>> It was rather unexpected that the type of input and output data are
>>> different. After checking interpolate/fitpack.py it seems that this
>>> behavior results from the fact that the length-1 case is explicitly
>>> treated differently (probably to be able to deal with the case of
>>> scalar input, for which scalar output is expected):
>>>
>>> 434 def splev(x,tck,der=0):
>>> ...
>>> 487         if ier: raise TypeError,"An error occurred"
>>> 488         if len(y)>1: return y
>>> 489         return y[0]
>>> 490
>>>
>>> Wouldn't it be less confusing to have the return value always have the
>>> same type as the input data?
>>
>> I don't know of any "official" policy.

I think (unstructured) interpolation should respect

    input.shape == output.shape

also for 0-d. So yes, it's a wart, IMHO.

Another question is: how many people actually have code that depends on
this wart, and can it be fixed? I'd guess there's not much problem: (1,)
arrays function nicely as scalars, but not vice versa because of
mutability.
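For concreteness, a minimal script exercising the behavior in question (a
sketch only; with the length-1 special case quoted above, the n == 1 call
is the one that loses its shape):

import numpy as np
from scipy import interpolate

x = np.arange(5.)
y = np.cos(x)
tck = interpolate.splrep(x, y)

for n in (3, 2, 1):
    x_eval = x[:n]
    y_eval = interpolate.splev(x_eval, tck)
    # With the special case described above, n == 1 yields a 0-d scalar
    # instead of a shape-(1,) array, so input and output shapes disagree.
    print n, x_eval.shape, np.shape(y_eval)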
-- Pauli Virtanen From yves.frederix at gmail.com Tue Jan 19 04:45:11 2010 From: yves.frederix at gmail.com (Yves Frederix) Date: Tue, 19 Jan 2010 10:45:11 +0100 Subject: [SciPy-User] Return type of scipy.interpolate.splev for input array of length 1 In-Reply-To: <1cd32cbb1001180759k74c06be8q276128724cce61ed@mail.gmail.com> References: <62e6eafb1001170225h49632e1bw3c47c3d62f0cce2f@mail.gmail.com> <1cd32cbb1001180759k74c06be8q276128724cce61ed@mail.gmail.com> Message-ID: <62e6eafb1001190145s6847b08ald13ec237cdbca9c@mail.gmail.com> Hi, In fact, I totally agree with you. Full matching of output to the type of the input does not make sense. But one could expect that array_like input results in ndarray output and scalar input in scalar output. As far as I can see, scipy.stats behaves exactly in this way. Anyway, I checked some other files and, e.g., in scipy/interpolate/polyint.py the input is explicitly tested to be scalar. In attachment you can find a patch for scipy/interpolate/fitpack.py so that it behaves 'correctly'. Regards, YVES > scipy.stats has switched for the most part to the same behavior. I > think, mainly it is just a convention to have a nicer output when the > return value is a scalar. > > One problem with making the output depend on the input type or shape > is that in most functions I know, this information is not kept inside > the function. Usually the input of array_like (arrays, lists, tuples, > scalar numbers) is converted to an ndarray with np.asarray or > np.array. > The output then is independent of the input type (which hurts also if > a user wants to work with matrices or other subclasses of ndarrays). > > On the other hand, if I want to use a list as input for convenience, I > don't really want a list as output, I want an ndarray. > > That's my view, I don't really care in which direction the convention > goes, but I like the consistency. > > Josef > >> Cheers, >> YVES >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user >> > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > -------------- next part -------------- A non-text attachment was scrubbed... Name: splev_patch.diff Type: application/octet-stream Size: 1112 bytes Desc: not available URL: From pgmdevlist at gmail.com Wed Jan 20 03:53:34 2010 From: pgmdevlist at gmail.com (Pierre GM) Date: Wed, 20 Jan 2010 03:53:34 -0500 Subject: [SciPy-User] timeseries tsfromtxt missing_values bug? In-Reply-To: <4B5077A4.63BA.009B.0@twdb.state.tx.us> References: <4B5077A4.63BA.009B.0@twdb.state.tx.us> Message-ID: <1C4012B8-4C61-4C97-9DE0-F4619383BC94@gmail.com> On Jan 15, 2010, at 3:11 PM, Dharhas Pothina wrote: > Hi, > > I'm having issues with tsfromtxt masking fields using the missing_values parameter. 
>
>>>> dateconverter = lambda y, m, d, hh, mm : datetime(year=int(y), month=int(m), day=int(d), hour=int(hh), minute=int(mm))
>>>> rseries = ts.tsfromtxt('test.csv',freq='T',comments='#',dateconverter=dateconverter,datecols=(1,2,3,4,5),usecols=(1,2,3,4,5,8),delimiter=',',missing_values=-999.0)
>
> gives :
>
> timeseries([(-999.0,) (-999.0,) (-999.0,)],
>            dtype = [('f5', '<f8')],
>            dates = [02-May-2000 06:00 12-May-2000 08:00 13-May-2000 00:00],
>            freq = T)
>
> While :
>
>>>> rseries = ts.tsfromtxt('test.csv',freq='T',comments='#',dateconverter=dateconverter,datecols=(1,2,3,4,5),usecols=(1,2,3,4,5,8),delimiter=',',missing_values=-999.0,names='data')
>
> gives :
>
> timeseries([(--,) (--,) (--,)],
>            dtype = [('_tmp4', '<f8')],
>            dates = [02-May-2000 06:00 12-May-2000 08:00 13-May-2000 00:00],
>            freq = T)
>
> So if I use the 'names' argument the missing values are masked correctly
> but the field name is set to '_tmp4' rather than 'data'. If I don't use
> the 'names' argument the missing values are not masked. I've attached a
> small file to demonstrate. Am I doing something wrong or is this a bug.
>

Dharhas,
Sorry for the delay. So yes, you uncovered two bugs:
(1) when no names were given, the missing values were skipped (if they
were not strings);
(2) when using usecols, the names were not properly propagated.
I fixed them on SVN, would you mind giving it a try?

From icy.flame.gm at gmail.com  Wed Jan 20 10:20:10 2010
From: icy.flame.gm at gmail.com (iCy-fLaME)
Date: Wed, 20 Jan 2010 15:20:10 +0000
Subject: [SciPy-User] How to do symmetry detection?
Message-ID:

Hello,

I have some signals in mirror pairs in an 1D/2D array, and I am trying
to identify the symmetry axis.

A simplified example of the signal pair can look like this:
[0, 0, 0, 0, 2, 3, 4, 0, 0, 0, 4, 3, 2, 0]

The ideal output in this case will probably be:
[0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0]

As long as the symmetry point has the largest value, it will be fine.

There can be multiple pairs of signals in the array, and the length of
separation and duration of the signal can vary from pair to pair. The
overall length of the array is about 1k points. The output array
should reflect the level of likeness between the two sides of the
array.

I tried doing a loop as follows:

############ Begin ############
from numpy import array
from numpy import zeros
from numpy import arange

data = array([0,0,0,0,2,3,4,0,0,0,4,3,2,0])
length = len(data)
result = zeros(length)

left = arange(length)
left[0] = 0             # Index to be used for the end of the left portion

right = arange(length) + 1
right[-1] = length - 1  # Index to be used for the beginning of the right hand portion

for i in range(length):
    l_part = zeros(length)  # Default values to be zero, so non-overlapping region will
    r_part = zeros(length)  #   return zero after the multiplication.

    l_part[:left[i]] = data[:left[i]][::-1]      # Take the left hand side and mirror it
    r_part[:length-right[i]] = data[right[i]:]   # Take the right hand side
    result[i] = sum(l_part*r_part)/length  # Use the product and integral to find the similarity metric.

    print l_part
    print r_part
    print "===============================", result[i]


print result


############ END ############

But it is rather slow for a 1000x1000 2D array, anyone got any
suggestion for a more elegant solution?

Thanks in advance!

From josef.pktd at gmail.com  Wed Jan 20 10:43:57 2010
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Wed, 20 Jan 2010 10:43:57 -0500
Subject: [SciPy-User] How to do symmetry detection?
In-Reply-To:
References:
Message-ID: <1cd32cbb1001200743w20a36140h9421a201b2a39330@mail.gmail.com>

On Wed, Jan 20, 2010 at 10:20 AM, iCy-fLaME wrote:
> Hello,
>
> I have some signals in mirror pairs in an 1D/2D array, and I am trying
> to identify the symmetry axis.
>
> [clip]
>
> But it is rather slow for a 1000x1000 2D array, anyone got any
> suggestion for a more elegant solution?

not as general and flexible but fast

>>> a=np.array([0, 0, 0, 0, 2, 3, 4, 0, 0, 0, 4, 3, 2, 0])
>>> kw = [-1.,-1,-1,0,1,1,1]

>>> (signal.convolve(a,kw,'valid')==0).astype(int)
array([0, 0, 0, 0, 0, 1, 0, 0])

convolve can handle also nd

One idea might be to use something like this in a first round, and use
the more correct loop solution only if there are several shorter
mirrors found by convolve. Also a guess on the likely length might
improve the choice of window.
Distance measure is additive not multiplicative.

Josef

> Thanks in advance!
> _______________________________________________
> SciPy-User mailing list
> SciPy-User at scipy.org
> http://mail.scipy.org/mailman/listinfo/scipy-user
>

From josef.pktd at gmail.com  Wed Jan 20 10:47:40 2010
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Wed, 20 Jan 2010 10:47:40 -0500
Subject: [SciPy-User] How to do symmetry detection?
>> >> A simplified example of the signal pair can look like this: >> [0, 0, 0, 0, 2, 3, 4, 0, 0, 0, 4, 3, 2, 0] >> >> The ideal output in this case will probably be: >> [0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0] >> >> As long as the symmetry point has the largest value, it will be fine. >> >> There can be multiple pairs of signals in the array, and the length of >> separation and duration of the signal can vary from pair to pair. The >> overall length of the array is about 1k points. The output array >> should reflect the level of likeness between the two sides of the >> array. >> >> I tried doing a loop as follows: >> >> ############ Begin ############ >> from numpy import array >> from numpy import zeros >> from numpy import arange >> >> data = ?array([0,0,0,0,2,3,4,0,0,0,4,3,2,0]) >> length = len(data) >> result = zeros(length) >> >> left = arange(length) >> left[0] = 0 ? ? ? ? ? ? ? ? # Index to be used for the end of the left portion >> >> right = arange(length) + 1 >> right[-1] = length - 1 ? ? ?# Index to be used for the begining of the >> right hand portion >> >> for i in range(length): >> ? ?l_part = zeros(length) ?# Default values to be zero, so >> non-overlapping region will >> ? ?r_part = zeros(length) ?# ? return zero after the multiplication. >> >> ? ?l_part[:left[i]] = data[:left[i]][::-1] ? ? # Take the left hand >> side and mirror it >> ? ?r_part[:length-right[i]] = data[right[i]:] ?# Take the right hand side >> ? ?result[i] = sum(l_part*r_part)/length ? # Use the product and >> integral to find the similarity metric. >> >> ? ?print l_part >> ? ?print r_part >> ? ?print "===============================", result[i] >> >> >> print result >> >> >> ############ END ############ >> >> >> But it is rather slow for a 1000x1000 2D array, anyone got any >> suggestion for a more elegant solution? > > > not as general and flexible but fast > >>>> a=np.array([0, 0, 0, 0, 2, 3, 4, 0, 0, 0, 4, 3, 2, 0]) >>>> kw = [-1.,-1,-1,0,1,1,1] > >>>> (signal.convolve(a,kw,'valid')==0).astype(int) > array([0, 0, 0, 0, 0, 1, 0, 0]) > > convolve can handle also nd > > One idea might be to use something like this in a first round, and use > the more correct loop solution only if there are several shorter > mirrors found by convolve. Also a guess on the likely length might > improve the choice of window. > Distance measure is additive not multiplicative. or maybe this is not a great idea. if you have integers, there might be many cancellations and wrong detections. Josef > Josef > > >> >> Thanks in advance! >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user >> > From josef.pktd at gmail.com Wed Jan 20 10:59:12 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 20 Jan 2010 10:59:12 -0500 Subject: [SciPy-User] How to do symmetry detection? In-Reply-To: <1cd32cbb1001200747t19f6y117390d7800b835e@mail.gmail.com> References: <1cd32cbb1001200743w20a36140h9421a201b2a39330@mail.gmail.com> <1cd32cbb1001200747t19f6y117390d7800b835e@mail.gmail.com> Message-ID: <1cd32cbb1001200759r422d3b9cvd8ebbbdaddfe85c0@mail.gmail.com> On Wed, Jan 20, 2010 at 10:47 AM, wrote: > On Wed, Jan 20, 2010 at 10:43 AM, ? wrote: >> On Wed, Jan 20, 2010 at 10:20 AM, iCy-fLaME wrote: >>> Hello, >>> >>> I have some signals in mirror pairs in an 1D/2D array, and I am trying >>> to identify the symmetry axis. 
>>> >>> A simplified example of the signal pair can look like this: >>> [0, 0, 0, 0, 2, 3, 4, 0, 0, 0, 4, 3, 2, 0] >>> >>> The ideal output in this case will probably be: >>> [0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0] >>> >>> As long as the symmetry point has the largest value, it will be fine. >>> >>> There can be multiple pairs of signals in the array, and the length of >>> separation and duration of the signal can vary from pair to pair. The >>> overall length of the array is about 1k points. The output array >>> should reflect the level of likeness between the two sides of the >>> array. >>> >>> I tried doing a loop as follows: >>> >>> ############ Begin ############ >>> from numpy import array >>> from numpy import zeros >>> from numpy import arange >>> >>> data = ?array([0,0,0,0,2,3,4,0,0,0,4,3,2,0]) >>> length = len(data) >>> result = zeros(length) >>> >>> left = arange(length) >>> left[0] = 0 ? ? ? ? ? ? ? ? # Index to be used for the end of the left portion >>> >>> right = arange(length) + 1 >>> right[-1] = length - 1 ? ? ?# Index to be used for the begining of the >>> right hand portion >>> >>> for i in range(length): >>> ? ?l_part = zeros(length) ?# Default values to be zero, so >>> non-overlapping region will >>> ? ?r_part = zeros(length) ?# ? return zero after the multiplication. >>> >>> ? ?l_part[:left[i]] = data[:left[i]][::-1] ? ? # Take the left hand >>> side and mirror it >>> ? ?r_part[:length-right[i]] = data[right[i]:] ?# Take the right hand side >>> ? ?result[i] = sum(l_part*r_part)/length ? # Use the product and >>> integral to find the similarity metric. >>> >>> ? ?print l_part >>> ? ?print r_part >>> ? ?print "===============================", result[i] >>> >>> >>> print result >>> >>> >>> ############ END ############ >>> >>> >>> But it is rather slow for a 1000x1000 2D array, anyone got any >>> suggestion for a more elegant solution? >> >> >> not as general and flexible but fast >> >>>>> a=np.array([0, 0, 0, 0, 2, 3, 4, 0, 0, 0, 4, 3, 2, 0]) >>>>> kw = [-1.,-1,-1,0,1,1,1] >> >>>>> (signal.convolve(a,kw,'valid')==0).astype(int) >> array([0, 0, 0, 0, 0, 1, 0, 0]) >> >> convolve can handle also nd >> >> One idea might be to use something like this in a first round, and use >> the more correct loop solution only if there are several shorter >> mirrors found by convolve. Also a guess on the likely length might >> improve the choice of window. >> Distance measure is additive not multiplicative. > > or maybe this is not a great idea. if you have integers, there might > be many cancellations and wrong detections. or maybe not with a window liike >>> ws=3;kw = (np.pi/3.15)**np.abs(np.arange(-ws,ws+1))*np.sign(np.arange(-ws,ws+1)) >>> kw array([-0.99201436, -0.99466913, -0.997331 , 0. , 0.997331 , 0.99466913, 0.99201436]) Josef From charlesr.harris at gmail.com Wed Jan 20 12:15:47 2010 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 20 Jan 2010 10:15:47 -0700 Subject: [SciPy-User] How to do symmetry detection? In-Reply-To: References: Message-ID: On Wed, Jan 20, 2010 at 8:20 AM, iCy-fLaME wrote: > Hello, > > I have some signals in mirror pairs in an 1D/2D array, and I am trying > to identify the symmetry axis. 
>
> A simplified example of the signal pair can look like this:
> [0, 0, 0, 0, 2, 3, 4, 0, 0, 0, 4, 3, 2, 0]
>

In [8]: a=np.array([0, 0, 0, 0, 2, 3, 4, 0, 0, 0, 4, 3, 2, 0])

In [9]: center = np.convolve(a,a).argmax()*.5

In [10]: center
Out[10]: 8.0

In [11]: a[center - 4: center + 5]
Out[11]: array([2, 3, 4, 0, 0, 0, 4, 3, 2])

Essentially this computes the component of the original along the
reversed version for different shifts, looking for the best match. The
center can be between two indices, which is why it is computed as a
float.

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From icy.flame.gm at gmail.com  Wed Jan 20 14:26:04 2010
From: icy.flame.gm at gmail.com (iCy-fLaME)
Date: Wed, 20 Jan 2010 19:26:04 +0000
Subject: [SciPy-User] How to do symmetry detection?
In-Reply-To:
References:
Message-ID:

Thanks for the replies!

Perhaps I should clarify that the input data can be int or float, and
most of them will have a very large DC offset (i.e. sum(data) >> 0),
and no, the signal duration can be anything, I cannot "guess".

The problem with convolution (scipy.signal.convolve) with self is that
it will only produce one "valid" point in the middle, because anywhere
else there is a mis-match of array shape.

I believe scipy.signal.convolve does not take into account the number
of points being integrated, and in the case of a large DC offset, any
matches far from the middle of the data will be drowned by other areas
which have more points to integrate over.

Self convolution also has a problem of signal features matching
itself.
Imagine the input of the following:

data:       ______W____M_____
data[::-1]: _____M____W______

As you do the convolution, feature W will match itself first, then the
W-M pair matching, then the M-M matching. Whereas a valid algorithm
should only produce results for the W-M pair matching.

I hope I am making the problem more clear now, but it's not the easiest
concept to describe for me.

Thanks!

From charlesr.harris at gmail.com  Wed Jan 20 14:39:06 2010
From: charlesr.harris at gmail.com (Charles R Harris)
Date: Wed, 20 Jan 2010 12:39:06 -0700
Subject: [SciPy-User] How to do symmetry detection?
In-Reply-To:
References:
Message-ID:

On Wed, Jan 20, 2010 at 12:26 PM, iCy-fLaME wrote:

> Thanks for the replies!
>
> Perhaps I should clarify that the input data can be int or float, and
> most of them will have a very large DC offset (i.e. sum(data) >> 0),
> and no, the signal duration can be anything, I cannot "guess".
>
You should remove the offset, it is translation invariant anyway and gives
no symmetry information.

> The problem with convolution (scipy.signal.convolve) with self is that
> it will only produce one "valid" point in the middle, because anywhere
> else there is a mis-match of array shape.
>
This can be a problem if the symmetry is near an end, but won't matter much
if the relevant part is short or near the middle. The end effect will be a
problem no matter what method you use. Think of convolution as a matched
filter.

> I believe scipy.signal.convolve does not take into account the number
> of points being integrated, and in the case of a large DC offset, any
> matches far from the middle of the data will be drowned by other areas
> which have more points to integrate over.
>
> Self convolution also has a problem of signal features matching
> itself. Imagine the input of the following:
>
> data:       ______W____M_____
> data[::-1]: _____M____W______
>
> As you do the convolution, feature W will match itself first, then the
> W-M pair matching, then the M-M matching. Whereas a valid algorithm
> should only produce results for the W-M pair matching.
>
Well, there is no symmetry in that example. If you don't know if there is
symmetry then you have to account for that possibility in setting up the
statistics. I'm thinking Bayesian here.

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From josef.pktd at gmail.com  Wed Jan 20 15:00:46 2010
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Wed, 20 Jan 2010 15:00:46 -0500
Subject: [SciPy-User] How to do symmetry detection?
In-Reply-To:
References:
Message-ID: <1cd32cbb1001201200qb76fa6i841af63ff959fff7@mail.gmail.com>

On Wed, Jan 20, 2010 at 2:39 PM, Charles R Harris wrote:
> [clip]
>
> Well, there is no symmetry in that example. If you don't know if there is
> symmetry then you have to account for that possibility in setting up the
> statistics. I'm thinking Bayesian here.
>
> Chuck

And I think that convolve, especially fftconvolve for longer series has
such a large speed advantage that running your loop to confirm the
results (or several candidates) will still be much faster than the
python loop over the entire array.

Also, if the series is normalized to mean zero then the out-of-bounds
effect of the full self convolution will not matter so much.
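As a rough sketch of how those pieces fit together on the toy signal from
the start of the thread (illustrative only; the demeaning step is the one
Chuck recommended above):

import numpy as np
from scipy import signal

a = np.array([0, 0, 0, 0, 2, 3, 4, 0, 0, 0, 4, 3, 2, 0], dtype=float)

# Remove the DC offset first; it is translation invariant and would
# otherwise dominate the match scores.
a0 = a - a.mean()

# Convolving a signal with itself correlates it against its mirror
# image, so the peak of the full convolution marks the mirror axis.
score = signal.fftconvolve(a0, a0, mode='full')
center = score.argmax() / 2.0

print center  # 8.0, the symmetry axis of the example above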
Josef > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > From peridot.faceted at gmail.com Wed Jan 20 15:00:47 2010 From: peridot.faceted at gmail.com (Anne Archibald) Date: Wed, 20 Jan 2010 15:00:47 -0500 Subject: [SciPy-User] Return type of scipy.interpolate.splev for input array of length 1 In-Reply-To: References: <62e6eafb1001170225h49632e1bw3c47c3d62f0cce2f@mail.gmail.com> <1cd32cbb1001180759k74c06be8q276128724cce61ed@mail.gmail.com> Message-ID: 2010/1/19 Pauli Virtanen : > Mon, 18 Jan 2010 10:59:46 -0500, josef.pktd wrote: >> On Sun, Jan 17, 2010 at 5:25 AM, Yves Frederix >> wrote: > [clip] >>> It was rather unexpected that the type of input and output data are >>> different. After checking interpolate/fitpack.py it seems that this >>> behavior results from the fact that the length-1 case is explicitly >>> treated differently (probably to be able to deal with the case of >>> scalar input, for which scalar output is expected): >>> >>> 434 def splev(x,tck,der=0): >>> >>> 487 if ier: raise TypeError,"An error occurred" 488 >>> if len(y)>1: return y 489 return y[0] >>> 490 >>> >>> Wouldn't it be less confusing to have the return value always have the >>> same type as the input data? >> >> I don't know of any "official" policy. > > I think (unstructured) interpolation should respect > > input.shape == output.shape > > also for 0-d. So yes, it's a wart, IMHO. > > Another question is: how many people actually have code that depends on > this wart, and can it be fixed? I'd guess there's not much problem: (1,) > arrays function nicely as scalars, but not vice versa because of > mutability. More generally, I think many functions should preserve the shape of the input array. Unfortunately it's often a hassle to do this: a few functions I have written start by checking whether the input is a scalar, setting a boolean and converting it to an array of size one; then at the end, I check the boolean and strip the array wrapping if the input is a scalar. It's annoying boilerplate, and I suspect that many functions don't handle this just because it's a nuisance. Some handy utility code might help. It would also be good to have a generic test one could apply to many functions to check that they preserve array shapes (0-d, 1-d of size 1, many-dimensional, many-dimensional with a zero dimension), and scalarness. Together with a test for preservation of arbitrary array subclasses (and correct functioning when handed matrices), one might be able to shake out a lot of minor easy-to-fix nuisances. Anne From charlesr.harris at gmail.com Wed Jan 20 15:39:05 2010 From: charlesr.harris at gmail.com (Charles R Harris) Date: Wed, 20 Jan 2010 13:39:05 -0700 Subject: [SciPy-User] How to do symmetry detection? In-Reply-To: References: Message-ID: On Wed, Jan 20, 2010 at 12:39 PM, Charles R Harris < charlesr.harris at gmail.com> wrote: > > > On Wed, Jan 20, 2010 at 12:26 PM, iCy-fLaME wrote: > >> Thanks for the replies! >> >> Perhaps I should clarify that the input data can be int or float, and >> most of them will have a very large DC offset (i.e. sum(data) >> 0), >> and no, the signal duration can be anything, I can not "guess" >> >> > You should remove the offset, it is translation invariant anyway and gives > no symmetry information. 
>
>> [clip]
>
> Well, there is no symmetry in that example. If you don't know if there is
> symmetry then you have to account for that possibility in setting up the
> statistics. I'm thinking Bayesian here.
>
In particular, there should be some sort of threshold for detecting
symmetry, some fraction of the signal variance, for instance. That
assumes the data has been demeaned. The symmetry detection problem can
be pretty difficult: noise can be a problem, the end effects can be a
problem, etc., etc. Any a priori information about the nature of the
signal can be useful.

Chuck
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From josef.pktd at gmail.com  Wed Jan 20 16:18:39 2010
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Wed, 20 Jan 2010 16:18:39 -0500
Subject: [SciPy-User] Return type of scipy.interpolate.splev for input array of length 1
In-Reply-To:
References: <62e6eafb1001170225h49632e1bw3c47c3d62f0cce2f@mail.gmail.com> <1cd32cbb1001180759k74c06be8q276128724cce61ed@mail.gmail.com>
Message-ID: <1cd32cbb1001201318l4b6a8822j34cec23bce33a263@mail.gmail.com>

On Wed, Jan 20, 2010 at 3:00 PM, Anne Archibald wrote:
> 2010/1/19 Pauli Virtanen:
> [clip]
>
> More generally, I think many functions should preserve the shape of
> the input array. [clip]
>
> Anne
> _______________________________________________
> SciPy-User mailing list
> SciPy-User at scipy.org
> http://mail.scipy.org/mailman/listinfo/scipy-user

I just checked again, the conversion in the distributions is weaker:

    if output.ndim == 0:
        return output[()]

as a result:

>>> stats.norm.pdf(np.array([1]))
array([ 0.24197072])
>>> stats.norm.pdf(np.array(1))
0.24197072451914337

I just followed the pattern of Travis in this.

Handling and preserving array subclasses is a lot of work and increases
the size of simple functions considerably and triples (? not checked)
the number of required tests (I just tried with stats.gmean, hmean and
zscore).

I don't see a way to write generic tests that would work across
different signatures and argument types.

Josef

From tonyyu at MIT.EDU  Wed Jan 20 17:56:33 2010
From: tonyyu at MIT.EDU (Tony S Yu)
Date: Wed, 20 Jan 2010 17:56:33 -0500
Subject: [SciPy-User] Splines in scipy.signal vs scipy.interpolation
Message-ID: <9AF13441-AFE5-4568-9438-4E98D6E99EDF@mit.edu>

I'm having trouble making splines from scipy.signal work with those in
scipy.interpolate. Both packages have functions for creating
(`signal.cspline1d`/`interpolate.splrep`) and evaluating
(`signal.cspline1d_eval`/`interpolate.splev`) splines.

There are, of course, huge differences between these functions, which is
why I'm trying to get them to talk to each other. In particular, I'd like
to create a smoothing spline using `cspline1d` (which allows easier
smoothing) and evaluate using `splev` (which allows me to get derivatives
of the spline).

I believe the main difference between the two spline representations
(assuming cubic splines with no smoothing) is their boundary conditions
(right?). Is there any way to condition the inputs such that I can feed
in a spline from `cspline1d` and get a sensible result from `splev`?
(Below is an example of what I mean by "conditioning the inputs").
Alternatively, is there another way to get functionality similar to
matlab's `spaps` function?
Thanks,
-Tony

#---- Failed attempt to get cspline1d/splev roundtrip

import numpy as np
from scipy import signal, interpolate

x = np.linspace(1, 10, 20)
y = np.cos(x)

tck_interp = interpolate.splrep(x, y)
c_signal = signal.cspline1d(y, 0) # set lambda to zero to eliminate smoothing

# knots and coefficients from splrep have more values at boundaries
t_match = np.hstack(([x[0]]*4, x[2:-2], [x[-1]]*4))
c_match = np.hstack((c_signal, [0]*4))
tck_signal = [t_match, c_match, 3]

y_signal = signal.cspline1d_eval(c_signal, x, dx=x[1]-x[0], x0=x[0])
y_signal_interp = interpolate.splev(x, tck_signal, der=0)
y_interp = interpolate.splev(x, tck_interp, der=0)

print 'spline orders match? ', np.allclose(tck_signal[2], tck_interp[2]) #True
print 'knots match? ', np.allclose(tck_signal[0], tck_interp[0]) #True
print 'spline coefficients match? ', np.allclose(tck_signal[1], tck_interp[1]) #False
print 'y (signal roundtrip) matches? ', np.allclose(y, y_signal) #True
print 'y (interp roundtrip) matches? ', np.allclose(y, y_interp) #True
print 'y (signal in/interp out) matches? ', np.allclose(y, y_signal_interp) #False

From sfalsharif at gmail.com  Thu Jan 21 14:09:25 2010
From: sfalsharif at gmail.com (Sharaf Al-Sharif)
Date: Thu, 21 Jan 2010 19:09:25 +0000
Subject: [SciPy-User] Amplitude scaling in fft
In-Reply-To: <20100119011934.GA2266@cudos0803>
References: <57c570d21001171544p5f9b3835g513b2575adc6af33@mail.gmail.com> <20100119011934.GA2266@cudos0803>
Message-ID: <57c570d21001211109t2570e1b7wcbec53f1db1069c6@mail.gmail.com>

Thank you for your answer.

Sharaf

2010/1/19 Jochen Schroeder
> Hi,
>
> your question doesn't really have a clear answer. Say raw_fft/raw_ifft
> is an fft/ifft pair without normalization; then:
>
> A = raw_ifft(raw_fft(a, n=2**11), n=2**11)
>   = N*a
>
> where N=2**11, not len(a). However numpy does perform a normalization
> step in the ifft part, so that
>
> numpy.fft.ifft = raw_ifft / N
>
> This way we can use the fft just as a Fourier transform and also
> fft(\delta) is constant 1.
>
> Hope that explains things a bit.
>
> Cheers
> Jochen
>
> On 01/17/10 23:44, Sharaf Al-Sharif wrote:
> > Hi,
> > I'm a bit confused regarding how the amplitudes returned by np.fft.fft (or
> > np.fft.rfft) relate to the amplitudes of the original signal in time domain.
>> nanmedian(np.array(1)) --------------------------------------------------------------------------- ValueError: axis must be less than arr.ndim; axis=0, rank=0. >> nanmedian(np.array([1, 2, 3])) array(2.0) Changing the function from the original: def nanmedian(x, axis=0): x, axis = _chk_asarray(x,axis) x = x.copy() return np.apply_along_axis(_nanmedian,axis,x) to this (I know, it is not pretty): def nanmedian(x, axis=0): if np.isscalar(x): return float(x) x, axis = _chk_asarray(x, axis) if x.ndim == 0: return float(x.tolist()) x = x.copy() x = np.apply_along_axis(_nanmedian, axis, x) if x.ndim == 0: x = float(x.tolist()) return x gives the expected results: >> nanmedian(1) 1.0 >> nanmedian(True) 1.0 >> nanmedian(np.array(1)) 1.0 >> nanmedian(np.array([1, 2, 3])) 2.0 which agree with numpy: >> np.median(1) 1.0 >> np.median(True) 1.0 >> np.median(np.array(1)) 1.0 >> np.median(np.array([1, 2, 3])) 2.0 I'm keeping a local copy of the changes I made for my own package. But it would be nice (for me) if this was fixed upstream. Are the changes above good enough for scipy? (Another difference from np.median that I noticed is that the default axis for np.median is None and for scipy.stats.nanmean it is 0. But perhaps it is too late to change that.) From josef.pktd at gmail.com Thu Jan 21 21:15:59 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 21 Jan 2010 21:15:59 -0500 Subject: [SciPy-User] scipy.stats.nanmedian In-Reply-To: References: Message-ID: <1cd32cbb1001211815n7d6b1289w2e758c359051b271@mail.gmail.com> On Thu, Jan 21, 2010 at 8:54 PM, Keith Goodman wrote: > I noticed a couple of issues with nanmedian in scipy.stats: > >>> from scipy.stats import nanmedian >>> nanmedian(1) > --------------------------------------------------------------------------- > ValueError: axis must be less than arr.ndim; axis=0, rank=0. >>> nanmedian(True) > --------------------------------------------------------------------------- > ValueError: axis must be less than arr.ndim; axis=0, rank=0. >>> nanmedian(np.array(1)) > --------------------------------------------------------------------------- > ValueError: axis must be less than arr.ndim; axis=0, rank=0. >>> nanmedian(np.array([1, 2, 3])) > ? array(2.0) > > Changing the function from the original: > > def nanmedian(x, axis=0): > ? ?x, axis = _chk_asarray(x,axis) > ? ?x = x.copy() > ? ?return np.apply_along_axis(_nanmedian,axis,x) > > to this (I know, it is not pretty): > > def nanmedian(x, axis=0): > ? ?if np.isscalar(x): > ? ? ? ?return float(x) > ? ?x, axis = _chk_asarray(x, axis) > ? ?if x.ndim == 0: > ? ? ? ?return float(x.tolist()) > ? ?x = x.copy() > ? ?x = np.apply_along_axis(_nanmedian, axis, x) > ? ?if x.ndim == 0: > ? ? ? ?x = float(x.tolist()) > ? ?return x Can you open a ticket, so that I don't forget to look at it? I will need to play with it. There are some things that I don't understand right away. What's the difference between isscalar and ndim=0 ? Why do you have the tolist() Thanks, Josef > gives the expected results: > >>> nanmedian(1) > ? 1.0 >>> nanmedian(True) > ? 1.0 >>> nanmedian(np.array(1)) > ? 1.0 >>> nanmedian(np.array([1, 2, 3])) > ? 2.0 > > which agree with numpy: > >>> np.median(1) > ? 1.0 >>> np.median(True) > ? 1.0 >>> np.median(np.array(1)) > ? 1.0 >>> np.median(np.array([1, 2, 3])) > ? 2.0 > > I'm keeping a local copy of the changes I made for my own package. But > it would be nice (for me) if this was fixed upstream. Are the changes > above good enough for scipy? 
> > (Another difference from np.median that I noticed is that the default > axis for np.median is None and for scipy.stats.nanmean it is 0. But > perhaps it is too late to change that.) > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From kwgoodman at gmail.com Thu Jan 21 21:28:47 2010 From: kwgoodman at gmail.com (Keith Goodman) Date: Thu, 21 Jan 2010 18:28:47 -0800 Subject: [SciPy-User] scipy.stats.nanmedian In-Reply-To: <1cd32cbb1001211815n7d6b1289w2e758c359051b271@mail.gmail.com> References: <1cd32cbb1001211815n7d6b1289w2e758c359051b271@mail.gmail.com> Message-ID: On Thu, Jan 21, 2010 at 6:15 PM, wrote: > Can you open a ticket, so that I don't forget to look at it? Sure. > I will need to play with it. There are some things that I don't > understand right away. > What's the difference between isscalar and ndim=0 ? A scalar doesn't have a ndim method. But now I see that there is a ndim function. I'll use that instead. > Why do you have the tolist() That's the only was I was able to figure out how to pull 1.0 out of np.array(1.0). Is there a better way? >> np.array(1.0).tolist() 1.0 From pgmdevlist at gmail.com Thu Jan 21 21:41:37 2010 From: pgmdevlist at gmail.com (Pierre GM) Date: Thu, 21 Jan 2010 21:41:37 -0500 Subject: [SciPy-User] scipy.stats.nanmedian In-Reply-To: References: <1cd32cbb1001211815n7d6b1289w2e758c359051b271@mail.gmail.com> Message-ID: <16A50238-D3F1-4F51-A229-4FCD8267F320@gmail.com> On Jan 21, 2010, at 9:28 PM, Keith Goodman wrote: > On Thu, Jan 21, 2010 at 6:15 PM, wrote: >> Can you open a ticket, so that I don't forget to look at it? > > Sure. > >> I will need to play with it. There are some things that I don't >> understand right away. >> What's the difference between isscalar and ndim=0 ? > > A scalar doesn't have a ndim method. But now I see that there is a > ndim function. I'll use that instead. > >> Why do you have the tolist() > > That's the only was I was able to figure out how to pull 1.0 out of > np.array(1.0). Is there a better way? .item() From josef.pktd at gmail.com Thu Jan 21 21:42:14 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 21 Jan 2010 21:42:14 -0500 Subject: [SciPy-User] scipy.stats.nanmedian In-Reply-To: References: <1cd32cbb1001211815n7d6b1289w2e758c359051b271@mail.gmail.com> Message-ID: <1cd32cbb1001211842j2a955c8fg2a768c4894b90ed6@mail.gmail.com> On Thu, Jan 21, 2010 at 9:28 PM, Keith Goodman wrote: > On Thu, Jan 21, 2010 at 6:15 PM, ? wrote: >> Can you open a ticket, so that I don't forget to look at it? > > Sure. > >> I will need to play with it. There are some things that I don't >> understand right away. >> What's the difference between isscalar and ndim=0 ? > > A scalar doesn't have a ndim method. But now I see that there is a > ndim function. I'll use that instead. > >> Why do you have the tolist() > > That's the only was I was able to figure out how to pull 1.0 out of > np.array(1.0). Is there a better way? > >>> np.array(1.0).tolist() > ? 
1.0 >>> np.array(1.0) array(1.0) >>> np.array(1.0)[()] 1.0 Josef > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From josef.pktd at gmail.com Thu Jan 21 21:56:53 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 21 Jan 2010 21:56:53 -0500 Subject: [SciPy-User] Return type of scipy.interpolate.splev for input array of length 1 In-Reply-To: <62e6eafb1001190145s6847b08ald13ec237cdbca9c@mail.gmail.com> References: <62e6eafb1001170225h49632e1bw3c47c3d62f0cce2f@mail.gmail.com> <1cd32cbb1001180759k74c06be8q276128724cce61ed@mail.gmail.com> <62e6eafb1001190145s6847b08ald13ec237cdbca9c@mail.gmail.com> Message-ID: <1cd32cbb1001211856l342c5477x283fa72ce2b2e3ba@mail.gmail.com> On Tue, Jan 19, 2010 at 4:45 AM, Yves Frederix wrote: > Hi, > > In fact, I totally agree with you. Full matching of output to the type > of the input does not make sense. But one could expect that array_like > input results in ndarray output and scalar input in scalar output. As > far as I can see, scipy.stats behaves exactly in this way. > > Anyway, I checked some other files and, e.g., in > scipy/interpolate/polyint.py the input is explicitly tested to be > scalar. In attachment you can find a patch for > scipy/interpolate/fitpack.py so that it behaves 'correctly'. I also found a related http://projects.scipy.org/scipy/ticket/600 I don't know what the status of it is. Josef > Regards, > YVES > >> scipy.stats has switched for the most part to the same behavior. I >> think, mainly it is just a convention to have a nicer output when the >> return value is a scalar. >> >> One problem with making the output depend on the input type or shape >> is that in most functions I know, this information is not kept inside >> the function. Usually the input of array_like (arrays, lists, tuples, >> scalar numbers) is converted to an ndarray with np.asarray or >> np.array. >> The output then is independent of the input type (which hurts also if >> a user wants to work with matrices or other subclasses of ndarrays). >> >> On the other hand, if I want to use a list as input for convenience, I >> don't really want a list as output, I want an ndarray. >> >> That's my view, I don't really care in which direction the convention >> goes, but I like the consistency. >> >> Josef >> >>> Cheers, >>> YVES From kwgoodman at gmail.com Thu Jan 21 22:01:17 2010 From: kwgoodman at gmail.com (Keith Goodman) Date: Thu, 21 Jan 2010 19:01:17 -0800 Subject: [SciPy-User] scipy.stats.nanmedian In-Reply-To: <16A50238-D3F1-4F51-A229-4FCD8267F320@gmail.com> References: <1cd32cbb1001211815n7d6b1289w2e758c359051b271@mail.gmail.com> <16A50238-D3F1-4F51-A229-4FCD8267F320@gmail.com> Message-ID: On Thu, Jan 21, 2010 at 6:41 PM, Pierre GM wrote: > On Jan 21, 2010, at 9:28 PM, Keith Goodman wrote: >> That's the only was I was able to figure out how to pull 1.0 out of >> np.array(1.0). Is there a better way? > > > .item() Thanks. item() looks better than tolist(). 
I simplified the function:

def nanmedian(x, axis=0):
    x, axis = _chk_asarray(x,axis)
    if x.ndim == 0:
        return float(x.item())
    x = x.copy()
    x = np.apply_along_axis(_nanmedian,axis,x)
    if x.ndim == 0:
        x = float(x.item())
    return x

and opened a ticket:

http://projects.scipy.org/scipy/ticket/1098

From josef.pktd at gmail.com Thu Jan 21 23:18:31 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 21 Jan 2010 23:18:31 -0500 Subject: [SciPy-User] scipy.stats.nanmedian In-Reply-To: References: <1cd32cbb1001211815n7d6b1289w2e758c359051b271@mail.gmail.com> <16A50238-D3F1-4F51-A229-4FCD8267F320@gmail.com> Message-ID: <1cd32cbb1001212018s21e892f8rb7b210033e9d2fa6@mail.gmail.com> On Thu, Jan 21, 2010 at 10:01 PM, Keith Goodman wrote: > On Thu, Jan 21, 2010 at 6:41 PM, Pierre GM wrote: >> On Jan 21, 2010, at 9:28 PM, Keith Goodman wrote: >>> That's the only was I was able to figure out how to pull 1.0 out of >>> np.array(1.0). Is there a better way? >> >> >> .item() > > Thanks. item() looks better than tolist(). > > I simplified the function:
>
> def nanmedian(x, axis=0):
>     x, axis = _chk_asarray(x,axis)
>     if x.ndim == 0:
>         return float(x.item())
>     x = x.copy()
>     x = np.apply_along_axis(_nanmedian,axis,x)
>     if x.ndim == 0:
>         x = float(x.item())
>     return x
>
> and opened a ticket:
>
> http://projects.scipy.org/scipy/ticket/1098

How about getting rid of apply_along_axis? see attachment I don't know whether or how much faster it is, but there is a ticket that the current version is slow. No hidden bug or corner case guarantee yet. Josef

-------------- next part --------------
# -*- coding: utf-8 -*-
"""
Created on Wed Jan 20 10:18:32 2010
Author: josef-pktd
"""
import numpy as np
from scipy import stats

def nanmedian(x, axis = 0):
    x, axis = stats.stats._chk_asarray(x, axis)
    if x.ndim == 0:
        return 1.0*x[()]
    x = np.sort(x, axis=axis)
    nall = x.shape[axis]
    notnancount = nall - np.isnan(x).sum(axis=axis)
    (idx, rmd) = divmod(notnancount, 2)
    indx = [np.arange(x.shape[ax]) for ax in range(x.ndim)]
    indxlo = indx[:]
    indxlo[axis] = idx
    indxhi = indx[:]
    indxhi[axis] = idx - (1-rmd)
    nanmed = (x[indxlo] + x[indxhi])/2.
    if nanmed.ndim == 0:
        return nanmed[()]
    return nanmed

for axis in [0,1]:
    for i in range(5):
        # for complex
        #x = 1j+np.arange(20).reshape(4,5)
        x = np.arange(20).reshape(4,5).astype(float)
        x[zip(np.random.randint(4, size=(2,5)))] = np.nan
        print nanmedian(x, axis=0)
        print stats.nanmedian(x, axis=0)

print nanmedian(1)
print nanmedian(np.array(1))
print nanmedian(np.array([1]))

From Dharhas.Pothina at twdb.state.tx.us Fri Jan 22 10:48:39 2010 From: Dharhas.Pothina at twdb.state.tx.us (Dharhas Pothina) Date: Fri, 22 Jan 2010 09:48:39 -0600 Subject: [SciPy-User] timeseries tsfromtxt missing_values bug? In-Reply-To: <1C4012B8-4C61-4C97-9DE0-F4619383BC94@gmail.com> References: <4B5077A4.63BA.009B.0@twdb.state.tx.us> <1C4012B8-4C61-4C97-9DE0-F4619383BC94@gmail.com> Message-ID: <4B597476.63BA.009B.0@twdb.state.tx.us> Is there any way to install the svn version on windows? This script is being primarily used on a windows box. If not I'll test on linux. - dharhas >>> Pierre GM 1/20/2010 2:53 AM >>> On Jan 15, 2010, at 3:11 PM, Dharhas Pothina wrote: > Hi, > > I'm having issues with tsfromtxt masking fields using the missing_values parameter.
> >>>> dateconverter = lambda y, m, d, hh, mm : datetime(year=int(y), month=int(m), day=int(d), hour=int(hh), minute=int(mm)) >>>> rseries = ts.tsfromtxt('test.csv',freq='T',comments='#',dateconverter=dateconverter,datecols=(1,2,3,4,5),usecols=(1,2,3,4,5,8),delimiter=',',missing_values=-999.0) > > gives : > > timeseries([(-999.0,) (-999.0,) (-999.0,)], > dtype = [('f5', ' dates = [02-May-2000 06:00 12-May-2000 08:00 13-May-2000 00:00], > freq = T) > > While : > >>>> rseries = ts.tsfromtxt('test.csv',freq='T',comments='#',dateconverter=dateconverter,datecols=(1,2,3,4,5),usecols=(1,2,3,4,5,8),delimiter=',',missing_values=-999.0,names='data') > > gives : > > timeseries([(--,) (--,) (--,)], > dtype = [('_tmp4', ' dates = [02-May-2000 06:00 12-May-2000 08:00 13-May-2000 00:00], > freq = T) > > So if I uses the 'names' argument the missing values are masked correctly but the field name is set to '_tmp4' rather than 'data'. If I don't use the 'names' argument the missing values are not masked. I've attached a small file to demonstrate. Am I doing something wrong or is this a bug. > Dharhas, Sorry for the delay. So yes, you uncovered two bugs: (1) when no names were given, the missing values were skipped (if they were not strings); (2) when using usecols, the names were properly propagated. I fixed them on SVN, would you mind giving a try ? _______________________________________________ SciPy-User mailing list SciPy-User at scipy.org http://mail.scipy.org/mailman/listinfo/scipy-user From bsouthey at gmail.com Fri Jan 22 10:58:07 2010 From: bsouthey at gmail.com (Bruce Southey) Date: Fri, 22 Jan 2010 09:58:07 -0600 Subject: [SciPy-User] scipy.stats.nanmedian In-Reply-To: <1cd32cbb1001212018s21e892f8rb7b210033e9d2fa6@mail.gmail.com> References: <1cd32cbb1001211815n7d6b1289w2e758c359051b271@mail.gmail.com> <16A50238-D3F1-4F51-A229-4FCD8267F320@gmail.com> <1cd32cbb1001212018s21e892f8rb7b210033e9d2fa6@mail.gmail.com> Message-ID: <4B59CB0F.8060502@gmail.com> On 01/21/2010 10:18 PM, josef.pktd at gmail.com wrote: > On Thu, Jan 21, 2010 at 10:01 PM, Keith Goodman wrote: > >> On Thu, Jan 21, 2010 at 6:41 PM, Pierre GM wrote: >> >>> On Jan 21, 2010, at 9:28 PM, Keith Goodman wrote: >>> >>>> That's the only was I was able to figure out how to pull 1.0 out of >>>> np.array(1.0). Is there a better way? >>>> >>> >>> .item() >>> >> Thanks. item() looks better than tolist(). >> >> I simplified the function: >> >> def nanmedian(x, axis=0): >> x, axis = _chk_asarray(x,axis) >> if x.ndim == 0: >> return float(x.item()) >> x = x.copy() >> x = np.apply_along_axis(_nanmedian,axis,x) >> if x.ndim == 0: >> x = float(x.item()) >> return x >> >> and opened a ticket: >> >> http://projects.scipy.org/scipy/ticket/1098 >> > > How about getting rid of apply_along_axis? see attachment > > I don't know whether or how much faster it is, but there is a ticket > that the current version is slow. > No hidden bug or corner case guarantee yet. > > > Josef > > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > Personally, I think using masked arrays is a far better solution than the various nan methods in stats.py. Is the call to _chk_asarray that much different than the call to ma? Both require conversion or checking if the input is a np array. As stated in the documentation, _nanmedian only works on 1d arrays. So any 'axis' argument without changing the main function is perhaps a hack at best. 
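(A minimal sketch of the masked-array route mentioned above, added for illustration; it is not code from the thread.)

# Mask the NaNs and let np.ma.median do the per-axis bookkeeping;
# all-masked slices are refilled with NaN at the end.
import numpy as np

def nanmedian_ma(x, axis=0):
    x = np.asarray(x, dtype=float)
    med = np.ma.median(np.ma.masked_array(x, np.isnan(x)), axis=axis)
    return np.ma.filled(med, np.nan)

x = np.array([[1.0, np.nan], [3.0, 4.0]])
print nanmedian_ma(x, axis=0)   # [ 2.  4.]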
Is it possible to adapt Sturla's version? http://projects.scipy.org/numpy/ticket/1213 I do not know the algorithm to suggest anything but perhaps the select method could be adapted to handle nan. Bruce -------------- next part -------------- An HTML attachment was scrubbed... URL: From kwgoodman at gmail.com Fri Jan 22 11:09:17 2010 From: kwgoodman at gmail.com (Keith Goodman) Date: Fri, 22 Jan 2010 08:09:17 -0800 Subject: [SciPy-User] scipy.stats.nanmedian In-Reply-To: <1cd32cbb1001212018s21e892f8rb7b210033e9d2fa6@mail.gmail.com> References: <1cd32cbb1001211815n7d6b1289w2e758c359051b271@mail.gmail.com> <16A50238-D3F1-4F51-A229-4FCD8267F320@gmail.com> <1cd32cbb1001212018s21e892f8rb7b210033e9d2fa6@mail.gmail.com> Message-ID: On Thu, Jan 21, 2010 at 8:18 PM, wrote: > On Thu, Jan 21, 2010 at 10:01 PM, Keith Goodman wrote: >> On Thu, Jan 21, 2010 at 6:41 PM, Pierre GM wrote: >>> On Jan 21, 2010, at 9:28 PM, Keith Goodman wrote: >>>> That's the only was I was able to figure out how to pull 1.0 out of >>>> np.array(1.0). Is there a better way? >>> >>> >>> .item() >> >> Thanks. item() looks better than tolist(). >> >> I simplified the function: >> >> def nanmedian(x, axis=0): >> ? ?x, axis = _chk_asarray(x,axis) >> ? ?if x.ndim == 0: >> ? ? ? ?return float(x.item()) >> ? ?x = x.copy() >> ? ?x = np.apply_along_axis(_nanmedian,axis,x) >> ? ?if x.ndim == 0: >> ? ? ? ?x = float(x.item()) >> ? ?return x >> >> and opened a ticket: >> >> http://projects.scipy.org/scipy/ticket/1098 > > > How about getting rid of apply_along_axis? ? ?see attachment > > I don't know whether or how much faster it is, but there is a ticket > that the current version is slow. > No hidden bug or corner case guarantee yet. It is faster. But here is one case it does not handle: >> nanmedian([1, 2]) array([ 1.5]) >> np.median([1, 2]) 1.5 I'm sure it could be fixed. But having to fix it, and the fact that it is a larger change, decreases the likelihood that it will make it into the next version of scipy. One option is to make the small bug fix I suggested (ticket #1098) and add the corresponding unit tests. Then we can take our time to design a better version of nanmedian. From josef.pktd at gmail.com Fri Jan 22 11:14:56 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Fri, 22 Jan 2010 11:14:56 -0500 Subject: [SciPy-User] scipy.stats.nanmedian In-Reply-To: <4B59CB0F.8060502@gmail.com> References: <1cd32cbb1001211815n7d6b1289w2e758c359051b271@mail.gmail.com> <16A50238-D3F1-4F51-A229-4FCD8267F320@gmail.com> <1cd32cbb1001212018s21e892f8rb7b210033e9d2fa6@mail.gmail.com> <4B59CB0F.8060502@gmail.com> Message-ID: <1cd32cbb1001220814x565d38cx6ef0bd029136b58f@mail.gmail.com> On Fri, Jan 22, 2010 at 10:58 AM, Bruce Southey wrote: > On 01/21/2010 10:18 PM, josef.pktd at gmail.com wrote: > > On Thu, Jan 21, 2010 at 10:01 PM, Keith Goodman wrote: > > > On Thu, Jan 21, 2010 at 6:41 PM, Pierre GM wrote: > > > On Jan 21, 2010, at 9:28 PM, Keith Goodman wrote: > > > That's the only was I was able to figure out how to pull 1.0 out of > np.array(1.0). Is there a better way? > > > .item() > > > Thanks. item() looks better than tolist(). > > I simplified the function: > > def nanmedian(x, axis=0): > ? ?x, axis = _chk_asarray(x,axis) > ? ?if x.ndim == 0: > ? ? ? ?return float(x.item()) > ? ?x = x.copy() > ? ?x = np.apply_along_axis(_nanmedian,axis,x) > ? ?if x.ndim == 0: > ? ? ? ?x = float(x.item()) > ? ?return x > > and opened a ticket: > > http://projects.scipy.org/scipy/ticket/1098 > > > How about getting rid of apply_along_axis? 
see attachment > > I don't know whether or how much faster it is, but there is a ticket > that the current version is slow. > No hidden bug or corner case guarantee yet. > > > Josef > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > > Personally, I think using masked arrays is a far better solution than the > various nan methods in stats.py. Is the call to _chk_asarray that much > different than the call to ma? Both require conversion or checking if the > input is a np array. I like arrays with nans better than masked arrays, and I checked, np.ma.median also uses apply_along_axis and I wanted to see if I can get a vectorized version. > > As stated in the documentation, _nanmedian only works on 1d arrays. So any > 'axis' argument without changing the main function is perhaps a hack at > best. _nanmedian is an internal function, stats.nanmedian and my rewritten version are supposed to handle any dimension and any axis. > > Is it possible to adapt Sturla's version? > http://projects.scipy.org/numpy/ticket/1213 > I do not know the algorithm to suggest anything but perhaps the select > method could be adapted to handle nan. I guess Sturla's version is a lot better, but not my kind of fish, it's more for the algorithm and c experts. I will gladly use it once it or something similar is in numpy. Josef > > Bruce > > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > From kwgoodman at gmail.com Fri Jan 22 11:17:48 2010 From: kwgoodman at gmail.com (Keith Goodman) Date: Fri, 22 Jan 2010 08:17:48 -0800 Subject: [SciPy-User] scipy.stats.nanmedian In-Reply-To: <4B59CB0F.8060502@gmail.com> References: <1cd32cbb1001211815n7d6b1289w2e758c359051b271@mail.gmail.com> <16A50238-D3F1-4F51-A229-4FCD8267F320@gmail.com> <1cd32cbb1001212018s21e892f8rb7b210033e9d2fa6@mail.gmail.com> <4B59CB0F.8060502@gmail.com> Message-ID: On Fri, Jan 22, 2010 at 7:58 AM, Bruce Southey wrote: > Is it possible to adapt Sturla's version? > http://projects.scipy.org/numpy/ticket/1213 > I do not know the algorithm to suggest anything but perhaps the select > method could be adapted to handle nan. I recently needed to calculate the median in an inner loop. It would have been nice to have a median function that doesn't do a full sort. I wanted to compile Sturla's version, but I didn't even know which of the attachments to download. I've never compiled a cython function. From josef.pktd at gmail.com Fri Jan 22 11:46:10 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Fri, 22 Jan 2010 11:46:10 -0500 Subject: [SciPy-User] scipy.stats.nanmedian In-Reply-To: References: <1cd32cbb1001211815n7d6b1289w2e758c359051b271@mail.gmail.com> <16A50238-D3F1-4F51-A229-4FCD8267F320@gmail.com> <1cd32cbb1001212018s21e892f8rb7b210033e9d2fa6@mail.gmail.com> Message-ID: <1cd32cbb1001220846g26603687lc8f4524655b2e7d5@mail.gmail.com> On Fri, Jan 22, 2010 at 11:09 AM, Keith Goodman wrote: > On Thu, Jan 21, 2010 at 8:18 PM, ? wrote: >> On Thu, Jan 21, 2010 at 10:01 PM, Keith Goodman wrote: >>> On Thu, Jan 21, 2010 at 6:41 PM, Pierre GM wrote: >>>> On Jan 21, 2010, at 9:28 PM, Keith Goodman wrote: >>>>> That's the only was I was able to figure out how to pull 1.0 out of >>>>> np.array(1.0). Is there a better way? >>>> >>>> >>>> .item() >>> >>> Thanks. item() looks better than tolist(). 
>>> >>> I simplified the function: >>> >>> def nanmedian(x, axis=0): >>> ? ?x, axis = _chk_asarray(x,axis) >>> ? ?if x.ndim == 0: >>> ? ? ? ?return float(x.item()) >>> ? ?x = x.copy() >>> ? ?x = np.apply_along_axis(_nanmedian,axis,x) >>> ? ?if x.ndim == 0: >>> ? ? ? ?x = float(x.item()) >>> ? ?return x >>> >>> and opened a ticket: >>> >>> http://projects.scipy.org/scipy/ticket/1098 >> >> >> How about getting rid of apply_along_axis? ? ?see attachment >> >> I don't know whether or how much faster it is, but there is a ticket >> that the current version is slow. >> No hidden bug or corner case guarantee yet. > > It is faster. But here is one case it does not handle: > >>> nanmedian([1, 2]) > ? array([ 1.5]) >>> np.median([1, 2]) > ? 1.5 > > I'm sure it could be fixed. But having to fix it, and the fact that it > is a larger change, decreases the likelihood that it will make it into > the next version of scipy. One option is to make the small bug fix I > suggested (ticket #1098) and add the corresponding unit tests. Then we > can take our time to design a better version of nanmedian. I didn't see the difference to np.median for this case, I think I was taking the shape answer from the other thread on the return of splines and interpolation. If I change the last 3 lines to if nanmed.size == 1: return nanmed.item() return nanmed then I get agreement with numpy for the following test cases print nanmedian(1), np.median(1) print nanmedian(np.array(1)), np.median(1) print nanmedian(np.array([1])), np.median(np.array([1])) print nanmedian(np.array([[1]])), np.median(np.array([[1]])) print nanmedian(np.array([1,2])), np.median(np.array([1,2])) print nanmedian(np.array([[1,2]])), np.median(np.array([[1,2]]),axis=0) print nanmedian([1]), np.median([1]) print nanmedian([[1]]), np.median([[1]]) print nanmedian([1,2]), np.median([1,2]) print nanmedian([[1,2]]), np.median([[1,2]],axis=0) print nanmedian([1j,2]), np.median([1j,2]) Am I still missing any cases? The vectorized version should be faster for this case http://projects.scipy.org/scipy/ticket/740 but maybe not for long and narrow arrays. Josef > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From kwgoodman at gmail.com Fri Jan 22 11:52:50 2010 From: kwgoodman at gmail.com (Keith Goodman) Date: Fri, 22 Jan 2010 08:52:50 -0800 Subject: [SciPy-User] scipy.stats.nanmedian In-Reply-To: <1cd32cbb1001220846g26603687lc8f4524655b2e7d5@mail.gmail.com> References: <1cd32cbb1001211815n7d6b1289w2e758c359051b271@mail.gmail.com> <16A50238-D3F1-4F51-A229-4FCD8267F320@gmail.com> <1cd32cbb1001212018s21e892f8rb7b210033e9d2fa6@mail.gmail.com> <1cd32cbb1001220846g26603687lc8f4524655b2e7d5@mail.gmail.com> Message-ID: On Fri, Jan 22, 2010 at 8:46 AM, wrote: > On Fri, Jan 22, 2010 at 11:09 AM, Keith Goodman wrote: >> On Thu, Jan 21, 2010 at 8:18 PM, ? wrote: >>> On Thu, Jan 21, 2010 at 10:01 PM, Keith Goodman wrote: >>>> On Thu, Jan 21, 2010 at 6:41 PM, Pierre GM wrote: >>>>> On Jan 21, 2010, at 9:28 PM, Keith Goodman wrote: >>>>>> That's the only was I was able to figure out how to pull 1.0 out of >>>>>> np.array(1.0). Is there a better way? >>>>> >>>>> >>>>> .item() >>>> >>>> Thanks. item() looks better than tolist(). >>>> >>>> I simplified the function: >>>> >>>> def nanmedian(x, axis=0): >>>> ? ?x, axis = _chk_asarray(x,axis) >>>> ? ?if x.ndim == 0: >>>> ? ? ? ?return float(x.item()) >>>> ? ?x = x.copy() >>>> ? ?x = np.apply_along_axis(_nanmedian,axis,x) >>>> ? 
?if x.ndim == 0: >>>> ? ? ? ?x = float(x.item()) >>>> ? ?return x >>>> >>>> and opened a ticket: >>>> >>>> http://projects.scipy.org/scipy/ticket/1098 >>> >>> >>> How about getting rid of apply_along_axis? ? ?see attachment >>> >>> I don't know whether or how much faster it is, but there is a ticket >>> that the current version is slow. >>> No hidden bug or corner case guarantee yet. >> >> It is faster. But here is one case it does not handle: >> >>>> nanmedian([1, 2]) >> ? array([ 1.5]) >>>> np.median([1, 2]) >> ? 1.5 >> >> I'm sure it could be fixed. But having to fix it, and the fact that it >> is a larger change, decreases the likelihood that it will make it into >> the next version of scipy. One option is to make the small bug fix I >> suggested (ticket #1098) and add the corresponding unit tests. Then we >> can take our time to design a better version of nanmedian. > > I didn't see the difference to np.median for this case, I think I was > taking the shape answer from the other thread on the return of splines > and interpolation. > > If I change the last 3 lines to > ? ?if nanmed.size == 1: > ? ? ? return nanmed.item() > ? ?return nanmed > > then I get agreement with numpy for the following test cases > > print nanmedian(1), np.median(1) > print nanmedian(np.array(1)), np.median(1) > print nanmedian(np.array([1])), np.median(np.array([1])) > print nanmedian(np.array([[1]])), np.median(np.array([[1]])) > print nanmedian(np.array([1,2])), np.median(np.array([1,2])) > print nanmedian(np.array([[1,2]])), np.median(np.array([[1,2]]),axis=0) > print nanmedian([1]), np.median([1]) > print nanmedian([[1]]), np.median([[1]]) > print nanmedian([1,2]), np.median([1,2]) > print nanmedian([[1,2]]), np.median([[1,2]],axis=0) > print nanmedian([1j,2]), np.median([1j,2]) > > Am I still missing any cases? > > The vectorized version should be faster for this case > http://projects.scipy.org/scipy/ticket/740 > but maybe not for long and narrow arrays. Here is an odd one: >> nanmedian(True) 1.0 >> nanmedian([True]) 0.5 # <--- strange >> np.median(True) 1.0 >> np.median([True]) 1.0 From kwgoodman at gmail.com Fri Jan 22 11:58:21 2010 From: kwgoodman at gmail.com (Keith Goodman) Date: Fri, 22 Jan 2010 08:58:21 -0800 Subject: [SciPy-User] scipy.stats.nanmedian In-Reply-To: References: <1cd32cbb1001211815n7d6b1289w2e758c359051b271@mail.gmail.com> <16A50238-D3F1-4F51-A229-4FCD8267F320@gmail.com> <1cd32cbb1001212018s21e892f8rb7b210033e9d2fa6@mail.gmail.com> <1cd32cbb1001220846g26603687lc8f4524655b2e7d5@mail.gmail.com> Message-ID: On Fri, Jan 22, 2010 at 8:52 AM, Keith Goodman wrote: > On Fri, Jan 22, 2010 at 8:46 AM, ? wrote: >> On Fri, Jan 22, 2010 at 11:09 AM, Keith Goodman wrote: >>> On Thu, Jan 21, 2010 at 8:18 PM, ? wrote: >>>> On Thu, Jan 21, 2010 at 10:01 PM, Keith Goodman wrote: >>>>> On Thu, Jan 21, 2010 at 6:41 PM, Pierre GM wrote: >>>>>> On Jan 21, 2010, at 9:28 PM, Keith Goodman wrote: >>>>>>> That's the only was I was able to figure out how to pull 1.0 out of >>>>>>> np.array(1.0). Is there a better way? >>>>>> >>>>>> >>>>>> .item() >>>>> >>>>> Thanks. item() looks better than tolist(). >>>>> >>>>> I simplified the function: >>>>> >>>>> def nanmedian(x, axis=0): >>>>> ? ?x, axis = _chk_asarray(x,axis) >>>>> ? ?if x.ndim == 0: >>>>> ? ? ? ?return float(x.item()) >>>>> ? ?x = x.copy() >>>>> ? ?x = np.apply_along_axis(_nanmedian,axis,x) >>>>> ? ?if x.ndim == 0: >>>>> ? ? ? ?x = float(x.item()) >>>>> ? 
?return x >>>>> >>>>> and opened a ticket: >>>>> >>>>> http://projects.scipy.org/scipy/ticket/1098 >>>> >>>> >>>> How about getting rid of apply_along_axis? ? ?see attachment >>>> >>>> I don't know whether or how much faster it is, but there is a ticket >>>> that the current version is slow. >>>> No hidden bug or corner case guarantee yet. >>> >>> It is faster. But here is one case it does not handle: >>> >>>>> nanmedian([1, 2]) >>> ? array([ 1.5]) >>>>> np.median([1, 2]) >>> ? 1.5 >>> >>> I'm sure it could be fixed. But having to fix it, and the fact that it >>> is a larger change, decreases the likelihood that it will make it into >>> the next version of scipy. One option is to make the small bug fix I >>> suggested (ticket #1098) and add the corresponding unit tests. Then we >>> can take our time to design a better version of nanmedian. >> >> I didn't see the difference to np.median for this case, I think I was >> taking the shape answer from the other thread on the return of splines >> and interpolation. >> >> If I change the last 3 lines to >> ? ?if nanmed.size == 1: >> ? ? ? return nanmed.item() >> ? ?return nanmed >> >> then I get agreement with numpy for the following test cases >> >> print nanmedian(1), np.median(1) >> print nanmedian(np.array(1)), np.median(1) >> print nanmedian(np.array([1])), np.median(np.array([1])) >> print nanmedian(np.array([[1]])), np.median(np.array([[1]])) >> print nanmedian(np.array([1,2])), np.median(np.array([1,2])) >> print nanmedian(np.array([[1,2]])), np.median(np.array([[1,2]]),axis=0) >> print nanmedian([1]), np.median([1]) >> print nanmedian([[1]]), np.median([[1]]) >> print nanmedian([1,2]), np.median([1,2]) >> print nanmedian([[1,2]]), np.median([[1,2]],axis=0) >> print nanmedian([1j,2]), np.median([1j,2]) >> >> Am I still missing any cases? >> >> The vectorized version should be faster for this case >> http://projects.scipy.org/scipy/ticket/740 >> but maybe not for long and narrow arrays. > > Here is an odd one: > >>> nanmedian(True) > ? 1.0 >>> nanmedian([True]) > ? 0.5 ?# <--- strange > >>> np.median(True) > ? 1.0 >>> np.median([True]) > ? 1.0 Another one: >> x = np.random.randn(3,4,5) >> nanmedian(x) ValueError: shape mismatch: objects cannot be broadcast to a single shape If anything we should add a full set of unit tests for nanmedian. One reason why the current unit tests did not catch the problem I ran into is that >> np.array(2.0) == 2.0 True So nanmedian was returning np.array(2.0) and np.median was returning 2.0 which when compared passed the unit test. From josef.pktd at gmail.com Fri Jan 22 12:03:26 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Fri, 22 Jan 2010 12:03:26 -0500 Subject: [SciPy-User] scipy.stats.nanmedian In-Reply-To: References: <1cd32cbb1001211815n7d6b1289w2e758c359051b271@mail.gmail.com> <16A50238-D3F1-4F51-A229-4FCD8267F320@gmail.com> <1cd32cbb1001212018s21e892f8rb7b210033e9d2fa6@mail.gmail.com> <1cd32cbb1001220846g26603687lc8f4524655b2e7d5@mail.gmail.com> Message-ID: <1cd32cbb1001220903h3a5e0d17k91045d0d36963983@mail.gmail.com> On Fri, Jan 22, 2010 at 11:52 AM, Keith Goodman wrote: > On Fri, Jan 22, 2010 at 8:46 AM, ? wrote: >> On Fri, Jan 22, 2010 at 11:09 AM, Keith Goodman wrote: >>> On Thu, Jan 21, 2010 at 8:18 PM, ? 
wrote: >>>> On Thu, Jan 21, 2010 at 10:01 PM, Keith Goodman wrote: >>>>> On Thu, Jan 21, 2010 at 6:41 PM, Pierre GM wrote: >>>>>> On Jan 21, 2010, at 9:28 PM, Keith Goodman wrote: >>>>>>> That's the only was I was able to figure out how to pull 1.0 out of >>>>>>> np.array(1.0). Is there a better way? >>>>>> >>>>>> >>>>>> .item() >>>>> >>>>> Thanks. item() looks better than tolist(). >>>>> >>>>> I simplified the function: >>>>> >>>>> def nanmedian(x, axis=0): >>>>> ? ?x, axis = _chk_asarray(x,axis) >>>>> ? ?if x.ndim == 0: >>>>> ? ? ? ?return float(x.item()) >>>>> ? ?x = x.copy() >>>>> ? ?x = np.apply_along_axis(_nanmedian,axis,x) >>>>> ? ?if x.ndim == 0: >>>>> ? ? ? ?x = float(x.item()) >>>>> ? ?return x >>>>> >>>>> and opened a ticket: >>>>> >>>>> http://projects.scipy.org/scipy/ticket/1098 >>>> >>>> >>>> How about getting rid of apply_along_axis? ? ?see attachment >>>> >>>> I don't know whether or how much faster it is, but there is a ticket >>>> that the current version is slow. >>>> No hidden bug or corner case guarantee yet. >>> >>> It is faster. But here is one case it does not handle: >>> >>>>> nanmedian([1, 2]) >>> ? array([ 1.5]) >>>>> np.median([1, 2]) >>> ? 1.5 >>> >>> I'm sure it could be fixed. But having to fix it, and the fact that it >>> is a larger change, decreases the likelihood that it will make it into >>> the next version of scipy. One option is to make the small bug fix I >>> suggested (ticket #1098) and add the corresponding unit tests. Then we >>> can take our time to design a better version of nanmedian. >> >> I didn't see the difference to np.median for this case, I think I was >> taking the shape answer from the other thread on the return of splines >> and interpolation. >> >> If I change the last 3 lines to >> ? ?if nanmed.size == 1: >> ? ? ? return nanmed.item() >> ? ?return nanmed >> >> then I get agreement with numpy for the following test cases >> >> print nanmedian(1), np.median(1) >> print nanmedian(np.array(1)), np.median(1) >> print nanmedian(np.array([1])), np.median(np.array([1])) >> print nanmedian(np.array([[1]])), np.median(np.array([[1]])) >> print nanmedian(np.array([1,2])), np.median(np.array([1,2])) >> print nanmedian(np.array([[1,2]])), np.median(np.array([[1,2]]),axis=0) >> print nanmedian([1]), np.median([1]) >> print nanmedian([[1]]), np.median([[1]]) >> print nanmedian([1,2]), np.median([1,2]) >> print nanmedian([[1,2]]), np.median([[1,2]],axis=0) >> print nanmedian([1j,2]), np.median([1j,2]) >> >> Am I still missing any cases? >> >> The vectorized version should be faster for this case >> http://projects.scipy.org/scipy/ticket/740 >> but maybe not for long and narrow arrays. > > Here is an odd one: > >>> nanmedian(True) > ? 1.0 >>> nanmedian([True]) > ? 0.5 ?# <--- strange > >>> np.median(True) > ? 1.0 >>> np.median([True]) > ? 1.0 definitely weird >>> (np.array(True)+np.array(True))/2. 
0.5 >>> np.array([True, True]).sum() 2 >>> np.array([True, True]).mean() 1.0 I assumed mean (is used by np.ma.median) is the same as adding and dividing by 2 Josef > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From thomas.robitaille at gmail.com Fri Jan 22 12:17:20 2010 From: thomas.robitaille at gmail.com (Thomas Robitaille) Date: Fri, 22 Jan 2010 12:17:20 -0500 Subject: [SciPy-User] interp1d and out of bounds values Message-ID: <0C3454D1-56E0-4D6F-B556-11367E2519F2@gmail.com> Hello, I've been using scipy.interpolate.interp1d to interpolate values in a number of different projects. However, something I often need is the following: if the interpolating function is defined as f(x) from xmin to xmax, if I specify an x value smaller than xmin, I would like the value set to f(xmin), and if the value is above xmax, I would like the value set to xmax. While this is strictly extrapolation, I'm wondering if there is a way that fill_value could be set to a certain string value, for example 'nearest', to indicate that this is the desired behavior? I could see this being commonly used. If it is not possible to modify scipy directly, what would be the best way to wrap interp1d to allow this? Since interp1d(x,y) takes y as an n-dimensional array, I'm not sure how I could code this up. Thanks in advance for any advice, Thomas From pgmdevlist at gmail.com Fri Jan 22 15:29:14 2010 From: pgmdevlist at gmail.com (Pierre GM) Date: Fri, 22 Jan 2010 15:29:14 -0500 Subject: [SciPy-User] timeseries tsfromtxt missing_values bug? In-Reply-To: <4B597476.63BA.009B.0@twdb.state.tx.us> References: <4B5077A4.63BA.009B.0@twdb.state.tx.us> <1C4012B8-4C61-4C97-9DE0-F4619383BC94@gmail.com> <4B597476.63BA.009B.0@twdb.state.tx.us> Message-ID: <8E6CFE0C-B843-4B13-87A0-D3F22CAC5DD8@gmail.com> On Jan 22, 2010, at 10:48 AM, Dharhas Pothina wrote: > > Is there any way to install the svn version on windows? This script is being primarily used on a windows box. If not I'll test on linux. Of course, just download the code from the SVN server (as described in the doc) and install it (python setup.py install). if you have any problem, contact me offline. You may not have have to compile if you're ready to get your hands dirty: edit the source files on your machine directly. From josef.pktd at gmail.com Fri Jan 22 16:08:23 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Fri, 22 Jan 2010 16:08:23 -0500 Subject: [SciPy-User] scipy.stats.nanmedian In-Reply-To: <1cd32cbb1001220903h3a5e0d17k91045d0d36963983@mail.gmail.com> References: <1cd32cbb1001211815n7d6b1289w2e758c359051b271@mail.gmail.com> <16A50238-D3F1-4F51-A229-4FCD8267F320@gmail.com> <1cd32cbb1001212018s21e892f8rb7b210033e9d2fa6@mail.gmail.com> <1cd32cbb1001220846g26603687lc8f4524655b2e7d5@mail.gmail.com> <1cd32cbb1001220903h3a5e0d17k91045d0d36963983@mail.gmail.com> Message-ID: <1cd32cbb1001221308s14935c60sc37636eaa35ff47e@mail.gmail.com> On Fri, Jan 22, 2010 at 12:03 PM, wrote: > On Fri, Jan 22, 2010 at 11:52 AM, Keith Goodman wrote: >> On Fri, Jan 22, 2010 at 8:46 AM, ? wrote: >>> On Fri, Jan 22, 2010 at 11:09 AM, Keith Goodman wrote: >>>> On Thu, Jan 21, 2010 at 8:18 PM, ? wrote: >>>>> On Thu, Jan 21, 2010 at 10:01 PM, Keith Goodman wrote: >>>>>> On Thu, Jan 21, 2010 at 6:41 PM, Pierre GM wrote: >>>>>>> On Jan 21, 2010, at 9:28 PM, Keith Goodman wrote: >>>>>>>> That's the only was I was able to figure out how to pull 1.0 out of >>>>>>>> np.array(1.0). 
Is there a better way? >>>>>>> >>>>>>> >>>>>>> .item() >>>>>> >>>>>> Thanks. item() looks better than tolist(). >>>>>> >>>>>> I simplified the function: >>>>>> >>>>>> def nanmedian(x, axis=0): >>>>>> ? ?x, axis = _chk_asarray(x,axis) >>>>>> ? ?if x.ndim == 0: >>>>>> ? ? ? ?return float(x.item()) >>>>>> ? ?x = x.copy() >>>>>> ? ?x = np.apply_along_axis(_nanmedian,axis,x) >>>>>> ? ?if x.ndim == 0: >>>>>> ? ? ? ?x = float(x.item()) >>>>>> ? ?return x >>>>>> >>>>>> and opened a ticket: >>>>>> >>>>>> http://projects.scipy.org/scipy/ticket/1098 >>>>> >>>>> >>>>> How about getting rid of apply_along_axis? ? ?see attachment >>>>> >>>>> I don't know whether or how much faster it is, but there is a ticket >>>>> that the current version is slow. >>>>> No hidden bug or corner case guarantee yet. >>>> >>>> It is faster. But here is one case it does not handle: >>>> >>>>>> nanmedian([1, 2]) >>>> ? array([ 1.5]) >>>>>> np.median([1, 2]) >>>> ? 1.5 >>>> >>>> I'm sure it could be fixed. But having to fix it, and the fact that it >>>> is a larger change, decreases the likelihood that it will make it into >>>> the next version of scipy. One option is to make the small bug fix I >>>> suggested (ticket #1098) and add the corresponding unit tests. Then we >>>> can take our time to design a better version of nanmedian. >>> >>> I didn't see the difference to np.median for this case, I think I was >>> taking the shape answer from the other thread on the return of splines >>> and interpolation. >>> >>> If I change the last 3 lines to >>> ? ?if nanmed.size == 1: >>> ? ? ? return nanmed.item() >>> ? ?return nanmed >>> >>> then I get agreement with numpy for the following test cases >>> >>> print nanmedian(1), np.median(1) >>> print nanmedian(np.array(1)), np.median(1) >>> print nanmedian(np.array([1])), np.median(np.array([1])) >>> print nanmedian(np.array([[1]])), np.median(np.array([[1]])) >>> print nanmedian(np.array([1,2])), np.median(np.array([1,2])) >>> print nanmedian(np.array([[1,2]])), np.median(np.array([[1,2]]),axis=0) >>> print nanmedian([1]), np.median([1]) >>> print nanmedian([[1]]), np.median([[1]]) >>> print nanmedian([1,2]), np.median([1,2]) >>> print nanmedian([[1,2]]), np.median([[1,2]],axis=0) >>> print nanmedian([1j,2]), np.median([1j,2]) >>> >>> Am I still missing any cases? >>> >>> The vectorized version should be faster for this case >>> http://projects.scipy.org/scipy/ticket/740 >>> but maybe not for long and narrow arrays. >> >> Here is an odd one: >> >>>> nanmedian(True) >> ? 1.0 >>>> nanmedian([True]) >> ? 0.5 ?# <--- strange >> >>>> np.median(True) >> ? 1.0 >>>> np.median([True]) >> ? 1.0 > > definitely weird > >>>> (np.array(True)+np.array(True))/2. > 0.5 >>>> np.array([True, True]).sum() > 2 >>>> np.array([True, True]).mean() > 1.0 > > I assumed mean (is used by np.ma.median) is the same as adding and dividing by 2 > > Josef > >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user >> > It got a bit ugly, too many shapes for a single number, and tricky axis handling. Your idea of making just the smallish changes looks more attractive now. 
and almost correct

>>> np.median(np.array([[[1]]]),axis=0).shape
(1, 1)
>>> nanmedian(np.array([[[1]]]),axis=0).shape
(1, 1)
>>> np.median(np.array([[[1]]]),axis=-1).shape
(1, 1)
>>> nanmedian(np.array([[[1]]]),axis=-1).shape
(1, 1)

but there slipped a python in

>>> nanmedian(np.array([[[1]]]),axis=None).__class__
>>> np.median(np.array([[[1]]]),axis=None).__class__

.item() returns a python number not a numpy number

>>> np.array([[[1]]]).item().__class__
>>> np.array([[[1]]]).flat[0].__class__

I also didn't know this:

>>> None < 0
True

Josef

-------------- next part --------------
# -*- coding: utf-8 -*-
"""
Created on Wed Jan 20 10:18:32 2010
Author: josef-pktd
"""
import numpy as np
from scipy import stats

def nanmedian(x, axis = 0):
    keepshape = list(np.shape(x))
    x, axis2 = stats.stats._chk_asarray(x, axis)
    if (not axis is None) and axis<0 : # and x.ndim>2:
        axis = x.ndim + axis
    #print 'axis', axis
    #print x, keepshape
    if x.ndim == 0 or (x.size==1 and axis is None):
        return 1.0*x.item()
    if keepshape and not axis is None :
        keepshape.pop(axis)
    if x.size == 1: #x.ndim == 0:
        return 1.0*x.reshape(keepshape)
    if x.dtype == np.bool:
        x = x.astype(int)
    if axis is None:
        axis = 0
    x = np.sort(x, axis=axis)
    nall = x.shape[axis]
    notnancount = nall - np.isnan(x).sum(axis=axis)
    #print 'notnancount', notnancount
    (idx, rmd) = divmod(notnancount, 2)
    #idx = np.atleast_1d(idx)
    #print 'idx', idx
    #print idx.shape
    indx = np.ogrid[[slice(i) for i in x.shape]]
    indxlo = indx[:]
    idxslice = map(slice, idx.shape)
    idxslice.insert(axis, None)
    #print idxslice
    idx = np.atleast_1d(idx)
    indxlo[axis] = idx[idxslice]
    #print indxlo
    indxhi = indx[:]
    indxhi[axis] = (idx - (1-rmd))[idxslice] #[idx.shape[:axis]+(None,)+idx.shape[axis:]]
    #print map(np.shape, indxhi)
    #print indxhi
    nanmed = (x[indxlo] + x[indxhi])/2.
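    # (editor's note, not in the original attachment) indxlo and indxhi
    # are open-grid index lists; along `axis` they pick the two middle
    # order statistics of the sorted slice. NaNs sort to the end, so
    # averaging these two entries yields the median of the non-NaN
    # values, with notnancount controlling which positions are "middle".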
    #print 'keepshape, nanmed.shape',keepshape, nanmed.shape
    if np.ndim == 0:
        return nanmed.item()
        #return nanmed.reshape(keepshape)
        return np.squeeze(nanmed) #.reshape(keepshape)
    if nanmed.size == 1:
        return nanmed.reshape(keepshape)
        return nanmed.item()
    return nanmed

from numpy.testing import assert_equal, assert_almost_equal

for axis in [0,1, None, -1]:
    for i in range(5):
        # for complex
        #x = 1j+np.arange(20).reshape(4,5)
        x = np.arange(20).reshape(4,5).astype(float)
        x[zip(np.random.randint(4, size=(2,5)))] = np.nan
        assert_equal(nanmedian(x, axis=0), stats.nanmedian(x, axis=0))

for axis in [0,1,2, None, -1]:
    for i in range(5):
        x = np.arange(3*4*5).reshape(3,4,5).astype(float)
        x[np.random.randint(3, size=(3,4,5))] = np.nan
        assert_equal(nanmedian(x, axis=0), stats.nanmedian(x, axis=0))

xli = [[1], [[1]], [1,2], [1j], [1j,2], [True], [False], [True,False],
       [True, False, True], np.round(np.random.randn(2,4,5),4)]
xxli = xli + map(np.array, xli)

for axis in [0, -1, None]:
    print '\n next case', axis
    for x in xxli:
        try:
            assert_equal(nanmedian(x, axis=axis), np.median(x, axis=axis))
            assert_equal(np.shape(nanmedian(x, axis=axis)), np.shape(np.median(x, axis=axis)))
        except:
            print 'failure with', x
            print nanmedian(x, axis=axis), np.median(x, axis=axis)
            raise

y = np.round(np.random.randn(2,3,5),4)
axis = -1 #None
print np.median(y,axis=axis)
nm = nanmedian(y,axis=axis)
print nm
print np.median(y,axis=axis) - nm

#np.broadcast(array([[[0]],[[1]]]), array([[[1, 1, 1, 1, 1]], [[1, 1, 1, 1, 1]]]), array([[[0, 1, 2, 3, 4]]]))

From kwgoodman at gmail.com Fri Jan 22 16:44:56 2010 From: kwgoodman at gmail.com (Keith Goodman) Date: Fri, 22 Jan 2010 13:44:56 -0800 Subject: [SciPy-User] scipy.stats.nanmedian In-Reply-To: <1cd32cbb1001221308s14935c60sc37636eaa35ff47e@mail.gmail.com> References: <16A50238-D3F1-4F51-A229-4FCD8267F320@gmail.com> <1cd32cbb1001212018s21e892f8rb7b210033e9d2fa6@mail.gmail.com> <1cd32cbb1001220846g26603687lc8f4524655b2e7d5@mail.gmail.com> <1cd32cbb1001220903h3a5e0d17k91045d0d36963983@mail.gmail.com> <1cd32cbb1001221308s14935c60sc37636eaa35ff47e@mail.gmail.com> Message-ID: On Fri, Jan 22, 2010 at 1:08 PM, wrote: > .item() returns a python number not a numpy number > >>>> np.array([[[1]]]).item().__class__ > >>>> np.array([[[1]]]).flat[0].__class__ > Good catch. Thanks. I'll update my local copy of nanmedian. From kwgoodman at gmail.com Fri Jan 22 16:52:04 2010 From: kwgoodman at gmail.com (Keith Goodman) Date: Fri, 22 Jan 2010 13:52:04 -0800 Subject: [SciPy-User] scipy.stats.nanmedian In-Reply-To: References: <16A50238-D3F1-4F51-A229-4FCD8267F320@gmail.com> <1cd32cbb1001212018s21e892f8rb7b210033e9d2fa6@mail.gmail.com> <1cd32cbb1001220846g26603687lc8f4524655b2e7d5@mail.gmail.com> <1cd32cbb1001220903h3a5e0d17k91045d0d36963983@mail.gmail.com> <1cd32cbb1001221308s14935c60sc37636eaa35ff47e@mail.gmail.com> Message-ID: On Fri, Jan 22, 2010 at 1:44 PM, Keith Goodman wrote: > On Fri, Jan 22, 2010 at 1:08 PM, wrote: >> .item() returns a python number not a numpy number >> >>>>> np.array([[[1]]]).item().__class__ >> >>>>> np.array([[[1]]]).flat[0].__class__ >> > > Good catch. Thanks. I'll update my local copy of nanmedian. Looks like we don't even need .tolist(), .item(), or [()]. This should do the trick: >> np.float64(np.array(1)) 1.0 >> type(np.float64(np.array(1))) And what if the input is float32?
Well, numpy turns that into float64 so nothing to worry about: >> x = np.array(1, dtype=np.float32) >> m = np.median(x) >> type(m) From josef.pktd at gmail.com Fri Jan 22 16:55:03 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Fri, 22 Jan 2010 16:55:03 -0500 Subject: [SciPy-User] scipy.stats.nanmedian In-Reply-To: References: <1cd32cbb1001212018s21e892f8rb7b210033e9d2fa6@mail.gmail.com> <1cd32cbb1001220846g26603687lc8f4524655b2e7d5@mail.gmail.com> <1cd32cbb1001220903h3a5e0d17k91045d0d36963983@mail.gmail.com> <1cd32cbb1001221308s14935c60sc37636eaa35ff47e@mail.gmail.com> Message-ID: <1cd32cbb1001221355n43777036pbc4b74298002a92@mail.gmail.com> On Fri, Jan 22, 2010 at 4:52 PM, Keith Goodman wrote: > On Fri, Jan 22, 2010 at 1:44 PM, Keith Goodman wrote: >> On Fri, Jan 22, 2010 at 1:08 PM, ? wrote: >>> .item() returns a python number not a numpy number >>> >>>>>> np.array([[[1]]]).item().__class__ >>> >>>>>> np.array([[[1]]]).flat[0].__class__ >>> >> >> Good catch. Thanks. I'll update my local copy of nanmedian. > > Looks like we don't even need .tolist(), .item(), or [()]. This should > do the trick: > >>> np.float64(np.array(1)) > ? 1.0 >>> type(np.float64(np.array(1))) > ? > > And what if the input is float32? Well, numpy turns that into float64 > so nothing to worry about: > >>> x = np.array(1, dtype=np.float32) >>> m = np.median(x) >>> type(m) > ? but it breaks complex >>> np.float64(np.array(1.j)) 0.0 I did one more shape correction in my version Josef > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From kwgoodman at gmail.com Fri Jan 22 17:19:10 2010 From: kwgoodman at gmail.com (Keith Goodman) Date: Fri, 22 Jan 2010 14:19:10 -0800 Subject: [SciPy-User] scipy.stats.nanmedian In-Reply-To: <1cd32cbb1001221355n43777036pbc4b74298002a92@mail.gmail.com> References: <1cd32cbb1001212018s21e892f8rb7b210033e9d2fa6@mail.gmail.com> <1cd32cbb1001220846g26603687lc8f4524655b2e7d5@mail.gmail.com> <1cd32cbb1001220903h3a5e0d17k91045d0d36963983@mail.gmail.com> <1cd32cbb1001221308s14935c60sc37636eaa35ff47e@mail.gmail.com> <1cd32cbb1001221355n43777036pbc4b74298002a92@mail.gmail.com> Message-ID: On Fri, Jan 22, 2010 at 1:55 PM, wrote: > On Fri, Jan 22, 2010 at 4:52 PM, Keith Goodman wrote: >> On Fri, Jan 22, 2010 at 1:44 PM, Keith Goodman wrote: >>> On Fri, Jan 22, 2010 at 1:08 PM, ? wrote: >>>> .item() returns a python number not a numpy number >>>> >>>>>>> np.array([[[1]]]).item().__class__ >>>> >>>>>>> np.array([[[1]]]).flat[0].__class__ >>>> >>> >>> Good catch. Thanks. I'll update my local copy of nanmedian. >> >> Looks like we don't even need .tolist(), .item(), or [()]. This should >> do the trick: >> >>>> np.float64(np.array(1)) >> ? 1.0 >>>> type(np.float64(np.array(1))) >> ? >> >> And what if the input is float32? Well, numpy turns that into float64 >> so nothing to worry about: >> >>>> x = np.array(1, dtype=np.float32) >>>> m = np.median(x) >>>> type(m) >> ? > > but it breaks complex >>>> np.float64(np.array(1.j)) > 0.0 > > I did one more shape correction in my version Crap. So many corner cases. After collecting all the input cases (all without NaNs) we could extend your automated test by adding an outer loop over [nanmean, nanmedian, nanstd] and make sure it gives the same results as the numpy versions. It might be good to check for dtype too since 1 == 1.0. 
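(A sketch of the test loop described above, added for illustration; the pairing of each scipy.stats nan-function with a numpy counterpart is an assumption, and nanstd is left out because its default normalization need not match np.std's.)

# Compare scipy.stats nan-functions against their numpy counterparts
# on NaN-free inputs, checking values, shapes, and dtypes. With the
# scipy of this thread some cases are expected to fail -- catching
# exactly those corner cases is the point of the test.
import numpy as np
from numpy.testing import assert_equal
from scipy import stats

cases = [[1], [[1]], [1, 2], [1j, 2], [True], [True, False],
         np.round(np.random.randn(2, 4, 5), 4)]
pairs = [(stats.nanmean, np.mean), (stats.nanmedian, np.median)]

for nanfunc, npfunc in pairs:
    for x in cases:
        for axis in [0, None]:
            ref = npfunc(x, axis=axis)
            got = nanfunc(x, axis=axis)
            assert_equal(got, ref)
            assert_equal(np.shape(got), np.shape(ref))
            # 1 == 1.0 passes assert_equal, so compare dtypes as well
            assert_equal(np.asarray(got).dtype, np.asarray(ref).dtype)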
From dgorman at berkeley.edu Fri Jan 22 19:17:19 2010 From: dgorman at berkeley.edu (Dylan Gorman) Date: Fri, 22 Jan 2010 16:17:19 -0800 Subject: [SciPy-User] Matrix Exponentials For Very Large Sparse Matrices Message-ID: Hi Folks, I'd like to exponentiate very large sparse matrices with scipy, and I would be very grateful for any suggestions. Currently, I'm running into memory errors exponentiating random matrices of order 10^3x10^3 with the standard linalg.expm() routine. However, I suspect that I may realistically be able to handle somewhat larger matrices since the actual matrices I will be using are quite sparse. Ideally, I'd like to be able to exponentiate matrices of size 10^5-10^6 x 10^5 - 10^6. However, there does not seem to be any linalg.sparse.expm() function-- is this because there is in fact no advantage to exponentiating sparse matrices? Or would I need to implement something by hand? Thank you very much, Dylan Gorman From josef.pktd at gmail.com Fri Jan 22 19:38:08 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Fri, 22 Jan 2010 19:38:08 -0500 Subject: [SciPy-User] Matrix Exponentials For Very Large Sparse Matrices In-Reply-To: References: Message-ID: <1cd32cbb1001221638n5d8b455bjca56996ba99622cf@mail.gmail.com> On Fri, Jan 22, 2010 at 7:17 PM, Dylan Gorman wrote: > Hi Folks, > > I'd like to exponentiate very large sparse matrices with scipy, and I > would be very grateful for any suggestions. Currently, I'm running > into memory errors exponentiating random matrices of order 10^3x10^3 > with the standard linalg.expm() routine. However, I suspect that I may > realistically be able to handle somewhat larger matrices since the > actual matrices I will be using are quite sparse. Ideally, I'd like to > be able to exponentiate matrices of size 10^5-10^6 x 10^5 - 10^6. > However, there does not seem to be any linalg.sparse.expm() function-- > is this because there is in fact no advantage to exponentiating sparse > matrices? Or would I need to implement something by hand? Just an idea until the experts come along. I would try for a sparse eigenvector decomposition and then only the eigenvalues need to be exponentiated. However, I never tried it with sparse, and I don't know how sparse the eigenvectors would be. Josef > > Thank you very much, > Dylan Gorman > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From charlesr.harris at gmail.com Fri Jan 22 20:04:06 2010 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri, 22 Jan 2010 18:04:06 -0700 Subject: [SciPy-User] Matrix Exponentials For Very Large Sparse Matrices In-Reply-To: References: Message-ID: On Fri, Jan 22, 2010 at 5:17 PM, Dylan Gorman wrote: > Hi Folks, > > I'd like to exponentiate very large sparse matrices with scipy, and I > would be very grateful for any suggestions. Currently, I'm running > into memory errors exponentiating random matrices of order 10^3x10^3 > with the standard linalg.expm() routine. However, I suspect that I may > realistically be able to handle somewhat larger matrices since the > actual matrices I will be using are quite sparse. Ideally, I'd like to > be able to exponentiate matrices of size 10^5-10^6 x 10^5 - 10^6. > However, there does not seem to be any linalg.sparse.expm() function-- > is this because there is in fact no advantage to exponentiating sparse > matrices? Or would I need to implement something by hand? 
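(An editor's sketch of the eigendecomposition idea Josef floats above; it is not tested code from the thread, and it assumes a symmetric sparse L and the ARPACK wrapper scipy.sparse.linalg.eigsh. Truncating to k eigenpairs makes it an approximation, not an exact matrix exponential.)

# Diagonalize L once, exponentiate only the eigenvalues, and apply the
# result to rho0: exp(L*t)*rho0 ~= V * diag(exp(w*t)) * V.T * rho0
import numpy as np
import scipy.sparse.linalg as spla

def expm_action(L, rho0, t, k=50):
    w, V = spla.eigsh(L, k=k)   # k eigenpairs of the sparse, symmetric L
    return V.dot(np.exp(w * t) * V.T.dot(rho0))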
> > Out of curiosity, do the matrices have any special structure? For instance, are they banded or symmetric? Also, why to you want to exponentiate them? Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From joshua.stults at gmail.com Fri Jan 22 20:16:12 2010 From: joshua.stults at gmail.com (Joshua Stults) Date: Fri, 22 Jan 2010 20:16:12 -0500 Subject: [SciPy-User] Matrix Exponentials For Very Large Sparse Matrices In-Reply-To: References: Message-ID: On Fri, Jan 22, 2010 at 8:04 PM, Charles R Harris wrote: > > > On Fri, Jan 22, 2010 at 5:17 PM, Dylan Gorman wrote: >> >> Hi Folks, >> >> I'd like to exponentiate very large sparse matrices with scipy, and I >> would be very grateful for any suggestions. Currently, I'm running >> into memory errors exponentiating random matrices of order 10^3x10^3 >> with the standard linalg.expm() routine. However, I suspect that I may >> realistically be able to handle somewhat larger matrices since the >> actual matrices I will be using are quite sparse. Ideally, I'd like to >> be able to exponentiate matrices of size 10^5-10^6 x 10^5 - 10^6. >> However, there does not seem to be any linalg.sparse.expm() function-- >> is this because there is in fact no advantage to exponentiating sparse >> matrices? Or would I need to implement something by hand? >> > > Out of curiosity, do the matrices have any special structure? For instance, > are they banded or symmetric? Also, why to you want to exponentiate them? > Another question: do you need the matrix exponential explicitly, or do you just need it's action on a vector? > Chuck > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > -- Joshua Stults Website: http://j-stults.blogspot.com From dgorman at berkeley.edu Fri Jan 22 20:24:00 2010 From: dgorman at berkeley.edu (Dylan Gorman) Date: Fri, 22 Jan 2010 17:24:00 -0800 Subject: [SciPy-User] Matrix Exponentials For Very Large Sparse Matrices In-Reply-To: References: Message-ID: <6F8E0DFF-23B9-443B-BE6E-98B58AE0AC7A@berkeley.edu> Dear Chuck and Joshua, It's a problem in quantum simulation. I'm trying to solve d(rho)/dt = L*rho for a sparse matrix L. L should be symmetric, and in principle I just need to compute (e^L*t)*(rho(0)) or something--not e^(L*t) explicitly. Regards, Dylan On Jan 22, 2010, at 5:16 PM, Joshua Stults wrote: > On Fri, Jan 22, 2010 at 8:04 PM, Charles R Harris > wrote: >> >> >> On Fri, Jan 22, 2010 at 5:17 PM, Dylan Gorman >> wrote: >>> >>> Hi Folks, >>> >>> I'd like to exponentiate very large sparse matrices with scipy, >>> and I >>> would be very grateful for any suggestions. Currently, I'm running >>> into memory errors exponentiating random matrices of order 10^3x10^3 >>> with the standard linalg.expm() routine. However, I suspect that I >>> may >>> realistically be able to handle somewhat larger matrices since the >>> actual matrices I will be using are quite sparse. Ideally, I'd >>> like to >>> be able to exponentiate matrices of size 10^5-10^6 x 10^5 - 10^6. >>> However, there does not seem to be any linalg.sparse.expm() >>> function-- >>> is this because there is in fact no advantage to exponentiating >>> sparse >>> matrices? Or would I need to implement something by hand? >>> >> >> Out of curiosity, do the matrices have any special structure? For >> instance, >> are they banded or symmetric? Also, why to you want to exponentiate >> them? 
>> > Another question: do you need the matrix exponential explicitly, or do > you just need it's action on a vector? > >> Chuck >> >> >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user >> >> > > > > -- > Joshua Stults > Website: http://j-stults.blogspot.com > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From joshua.stults at gmail.com Fri Jan 22 20:33:15 2010 From: joshua.stults at gmail.com (Joshua Stults) Date: Fri, 22 Jan 2010 20:33:15 -0500 Subject: [SciPy-User] Matrix Exponentials For Very Large Sparse Matrices In-Reply-To: <6F8E0DFF-23B9-443B-BE6E-98B58AE0AC7A@berkeley.edu> References: <6F8E0DFF-23B9-443B-BE6E-98B58AE0AC7A@berkeley.edu> Message-ID: On Fri, Jan 22, 2010 at 8:24 PM, Dylan Gorman wrote: > Dear Chuck and Joshua, > > It's a problem in quantum simulation. I'm trying to solve d(rho)/dt = > L*rho for a sparse matrix L. L should be symmetric, and in principle I > just need to compute (e^L*t)*(rho(0)) or something--not e^(L*t) > explicitly. If you end up rolling your own, it sounds like 'Method 20: Krylov space methods', in 'Nineteen Dubious Ways to Compute the Exponential of a Matrix' is for you: http://www.cs.cornell.edu/cv/researchpdf/19ways+.pdf > > Regards, > Dylan > > On Jan 22, 2010, at 5:16 PM, Joshua Stults wrote: > >> On Fri, Jan 22, 2010 at 8:04 PM, Charles R Harris >> wrote: >>> >>> >>> On Fri, Jan 22, 2010 at 5:17 PM, Dylan Gorman >>> wrote: >>>> >>>> Hi Folks, >>>> >>>> I'd like to exponentiate very large sparse matrices with scipy, >>>> and I >>>> would be very grateful for any suggestions. Currently, I'm running >>>> into memory errors exponentiating random matrices of order 10^3x10^3 >>>> with the standard linalg.expm() routine. However, I suspect that I >>>> may >>>> realistically be able to handle somewhat larger matrices since the >>>> actual matrices I will be using are quite sparse. Ideally, I'd >>>> like to >>>> be able to exponentiate matrices of size 10^5-10^6 x 10^5 - 10^6. >>>> However, there does not seem to be any linalg.sparse.expm() >>>> function-- >>>> is this because there is in fact no advantage to exponentiating >>>> sparse >>>> matrices? Or would I need to implement something by hand? >>>> >>> >>> Out of curiosity, do the matrices have any special structure? For >>> instance, >>> are they banded or symmetric? Also, why to you want to exponentiate >>> them? >>> >> Another question: do you need the matrix exponential explicitly, or do >> you just need it's action on a vector? >> >>> Chuck Maybe there's something already lurking in Scipy that's appropriate? -- Joshua Stults Website: http://j-stults.blogspot.com From burak.o.cankurtaran at alumni.uts.edu.au Fri Jan 22 20:58:33 2010 From: burak.o.cankurtaran at alumni.uts.edu.au (Burak1327) Date: Fri, 22 Jan 2010 17:58:33 -0800 (PST) Subject: [SciPy-User] [SciPy-user] Matrix Exponentials For Very Large Sparse Matrices In-Reply-To: References: <6F8E0DFF-23B9-443B-BE6E-98B58AE0AC7A@berkeley.edu> Message-ID: <27282461.post@talk.nabble.com> It looks like you are propagating the density on a real-space grid. 
Don't know the specifics to your problem, but generally calculating the exponential of the matrix happens thousands of times for the whole evolution time, so you need to approximate the exponential with a method like Joshua referenced or you will have to let your future grandchildren finish the simulation :) The most simple approximations are polynomial expansions. Chebychev polynomials are GREAT, even a 2nd order Taylor expansion is good enough in a lot of cases, specific to your type of problem. Which leads to actual scipy discussion. I'm no scipy expert, but the above mentioned methods are probably in the library. Thanks Burak Joshua Stults wrote: > > On Fri, Jan 22, 2010 at 8:24 PM, Dylan Gorman > wrote: >> Dear Chuck and Joshua, >> >> It's a problem in quantum simulation. I'm trying to solve d(rho)/dt = >> L*rho for a sparse matrix L. L should be symmetric, and in principle I >> just need to compute (e^L*t)*(rho(0)) or something--not e^(L*t) >> explicitly. > > If you end up rolling your own, it sounds like 'Method 20: Krylov > space methods', in 'Nineteen Dubious Ways to Compute the Exponential > of a Matrix' is for you: > http://www.cs.cornell.edu/cv/researchpdf/19ways+.pdf > >> >> Regards, >> Dylan >> >> On Jan 22, 2010, at 5:16 PM, Joshua Stults wrote: >> >>> On Fri, Jan 22, 2010 at 8:04 PM, Charles R Harris >>> wrote: >>>> >>>> >>>> On Fri, Jan 22, 2010 at 5:17 PM, Dylan Gorman >>>> wrote: >>>>> >>>>> Hi Folks, >>>>> >>>>> I'd like to exponentiate very large sparse matrices with scipy, >>>>> and I >>>>> would be very grateful for any suggestions. Currently, I'm running >>>>> into memory errors exponentiating random matrices of order 10^3x10^3 >>>>> with the standard linalg.expm() routine. However, I suspect that I >>>>> may >>>>> realistically be able to handle somewhat larger matrices since the >>>>> actual matrices I will be using are quite sparse. Ideally, I'd >>>>> like to >>>>> be able to exponentiate matrices of size 10^5-10^6 x 10^5 - 10^6. >>>>> However, there does not seem to be any linalg.sparse.expm() >>>>> function-- >>>>> is this because there is in fact no advantage to exponentiating >>>>> sparse >>>>> matrices? Or would I need to implement something by hand? >>>>> >>>> >>>> Out of curiosity, do the matrices have any special structure? For >>>> instance, >>>> are they banded or symmetric? Also, why to you want to exponentiate >>>> them? >>>> >>> Another question: do you need the matrix exponential explicitly, or do >>> you just need it's action on a vector? >>> >>>> Chuck > > Maybe there's something already lurking in Scipy that's appropriate? > > -- > Joshua Stults > Website: http://j-stults.blogspot.com > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > -- View this message in context: http://old.nabble.com/Matrix-Exponentials-For-Very-Large-Sparse-Matrices-tp27281924p27282461.html Sent from the Scipy-User mailing list archive at Nabble.com. From charlesr.harris at gmail.com Fri Jan 22 21:18:16 2010 From: charlesr.harris at gmail.com (Charles R Harris) Date: Fri, 22 Jan 2010 19:18:16 -0700 Subject: [SciPy-User] Matrix Exponentials For Very Large Sparse Matrices In-Reply-To: <6F8E0DFF-23B9-443B-BE6E-98B58AE0AC7A@berkeley.edu> References: <6F8E0DFF-23B9-443B-BE6E-98B58AE0AC7A@berkeley.edu> Message-ID: On Fri, Jan 22, 2010 at 6:24 PM, Dylan Gorman wrote: > Dear Chuck and Joshua, > > It's a problem in quantum simulation. 
I'm trying to solve d(rho)/dt = > L*rho for a sparse matrix L. L should be symmetric, and in principle I > just need to compute (e^L*t)*(rho(0)) or something--not e^(L*t) > explicitly. > > Does L approximate a differential operator or is it an expansion in some other basis set? Chuck -------------- next part -------------- An HTML attachment was scrubbed... URL: From joshua.stults at gmail.com Fri Jan 22 21:47:18 2010 From: joshua.stults at gmail.com (Joshua Stults) Date: Fri, 22 Jan 2010 21:47:18 -0500 Subject: [SciPy-User] [SciPy-user] Matrix Exponentials For Very Large Sparse Matrices In-Reply-To: <27282461.post@talk.nabble.com> References: <6F8E0DFF-23B9-443B-BE6E-98B58AE0AC7A@berkeley.edu> <27282461.post@talk.nabble.com> Message-ID: On Fri, Jan 22, 2010 at 8:58 PM, Burak1327 wrote: > > > The most simple approximations are polynomial expansions. > Chebychev polynomials are GREAT, even a 2nd order > Taylor expansion is good enough in a lot of cases, specific to > your type of problem. > > Which leads to actual scipy discussion. I'm no scipy expert, but > the above mentioned methods are probably in the library. > Here's an example of using f2py to compile expokit (see slides 15 - 21): http://sf.anu.edu.au/~mhk900/Python_Workshop/short.pdf Expokit website: http://www.maths.uq.edu.au/expokit/ Uses Krylov methods for sparse matrices; these will use more memory than the polynomial expansion methods that Burak mentioned. -- Joshua Stults Website: http://j-stults.blogspot.com From josef.pktd at gmail.com Fri Jan 22 22:39:57 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Fri, 22 Jan 2010 22:39:57 -0500 Subject: [SciPy-User] 2d array to Latex Message-ID: <1cd32cbb1001221939k1f21930eu351b6f4b8836b271@mail.gmail.com> Is there a function somewhere to print a 2d array with Latex markup? No fancy table required, just for a quick copy and paste into a Latex document. Thanks, Josef From pgmdevlist at gmail.com Fri Jan 22 23:59:51 2010 From: pgmdevlist at gmail.com (Pierre GM) Date: Fri, 22 Jan 2010 23:59:51 -0500 Subject: [SciPy-User] 2d array to Latex In-Reply-To: <1cd32cbb1001221939k1f21930eu351b6f4b8836b271@mail.gmail.com> References: <1cd32cbb1001221939k1f21930eu351b6f4b8836b271@mail.gmail.com> Message-ID: <362F5EA2-6B64-4AC4-8ACC-E8D25C30876A@gmail.com> On Jan 22, 2010, at 10:39 PM, josef.pktd at gmail.com wrote: > Is there a function somewhere to print a 2d array with Latex markup? > > No fancy table required, just for a quick copy and paste into a Latex document. Check scikits.timeseries.lib.reportlib, I think Matt coded something to that effect. You'll probably have to adapt it a bit, but that should get you started. Or you could do it yourself: * print the \begin{table} header * print "\\\n".join(["&".join(map(str,line)) for line in array]) * print the \end footer From josef.pktd at gmail.com Sat Jan 23 00:37:22 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Sat, 23 Jan 2010 00:37:22 -0500 Subject: [SciPy-User] 2d array to Latex In-Reply-To: <362F5EA2-6B64-4AC4-8ACC-E8D25C30876A@gmail.com> References: <1cd32cbb1001221939k1f21930eu351b6f4b8836b271@mail.gmail.com> <362F5EA2-6B64-4AC4-8ACC-E8D25C30876A@gmail.com> Message-ID: <1cd32cbb1001222137v50bf95dai5982e6dd87a8729f@mail.gmail.com> On Fri, Jan 22, 2010 at 11:59 PM, Pierre GM wrote: > On Jan 22, 2010, at 10:39 PM, josef.pktd at gmail.com wrote: >> Is there a function somewhere to print a 2d array with Latex markup? >> >> No fancy table required, just for a quick copy and paste into a Latex document. 
> > Check scikits.timeseries.lib.reportlib, I think Matt coded something to that effect. You'll probably have to adapt it a bit, but that should get you started. > Or you could do it yourself: > * print the \begin{table} header > * print "\\\n".join(["&".join(map(str,line)) for line in array]) > * print the \end footer Thanks, I will look at the scikits. Your example looks simple enough, but I don't feel like debugging latex (and struggle with formatting) if I can avoid it. Josef > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From aisaac at american.edu Sat Jan 23 08:35:00 2010 From: aisaac at american.edu (Alan G Isaac) Date: Sat, 23 Jan 2010 08:35:00 -0500 Subject: [SciPy-User] 2d array to Latex In-Reply-To: <1cd32cbb1001221939k1f21930eu351b6f4b8836b271@mail.gmail.com> References: <1cd32cbb1001221939k1f21930eu351b6f4b8836b271@mail.gmail.com> Message-ID: <4B5AFB04.2020501@american.edu> On 1/22/2010 10:39 PM, josef.pktd at gmail.com wrote: > Is there a function somewhere to print a 2d array with Latex markup? Try SimpleTable: http://econpy.googlecode.com/svn/trunk/utilities/text.py Alan Isaac From josef.pktd at gmail.com Sat Jan 23 08:45:26 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Sat, 23 Jan 2010 08:45:26 -0500 Subject: [SciPy-User] 2d array to Latex In-Reply-To: <4B5AFB04.2020501@american.edu> References: <1cd32cbb1001221939k1f21930eu351b6f4b8836b271@mail.gmail.com> <4B5AFB04.2020501@american.edu> Message-ID: <1cd32cbb1001230545j3ddb6f78gc159597e5b532fe0@mail.gmail.com> On Sat, Jan 23, 2010 at 8:35 AM, Alan G Isaac wrote: > On 1/22/2010 10:39 PM, josef.pktd at gmail.com wrote: >> Is there a function somewhere to print a 2d array with Latex markup? > > Try SimpleTable: > http://econpy.googlecode.com/svn/trunk/utilities/text.py from econpy.utilities.text import SimpleTable print SimpleTable(np.eye(3*2,3*2,k=1),fmt={'data_fmt':["%d"]}).as_latex_tabular() Thanks, I had looked in econpy, but I didn't see it. Josef > > Alan Isaac > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From tmp50 at ukr.net Sat Jan 23 11:33:32 2010 From: tmp50 at ukr.net (Dmitrey) Date: Sat, 23 Jan 2010 18:33:32 +0200 Subject: [SciPy-User] annoying bug in isfinite - it yields warning each time Message-ID: Why numpy.isfinite yields warning like this: >>> __version__ '1.5.0.dev8078' (latest, as well as lots of previous) >>> isfinite(inf) Warning: invalid value encountered in isfinite False >>> isfinite([inf,1.0]) Warning: invalid value encountered in isfinite array([False,? True], dtype=bool) >>> isfinite(array([inf,1.0])) Warning: invalid value encountered in isfinite array([False,? True], dtype=bool) -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From jsseabold at gmail.com Sat Jan 23 12:21:37 2010 From: jsseabold at gmail.com (Skipper Seabold) Date: Sat, 23 Jan 2010 12:21:37 -0500 Subject: [SciPy-User] 2d array to Latex In-Reply-To: <1cd32cbb1001230545j3ddb6f78gc159597e5b532fe0@mail.gmail.com> References: <1cd32cbb1001221939k1f21930eu351b6f4b8836b271@mail.gmail.com> <4B5AFB04.2020501@american.edu> <1cd32cbb1001230545j3ddb6f78gc159597e5b532fe0@mail.gmail.com> Message-ID: On Sat, Jan 23, 2010 at 8:45 AM, wrote: > On Sat, Jan 23, 2010 at 8:35 AM, Alan G Isaac wrote: >> On 1/22/2010 10:39 PM, josef.pktd at gmail.com wrote: >>> Is there a function somewhere to print a 2d array with Latex markup? >> >> Try SimpleTable: >> http://econpy.googlecode.com/svn/trunk/utilities/text.py > > from econpy.utilities.text import SimpleTable > print SimpleTable(np.eye(3*2,3*2,k=1),fmt={'data_fmt':["%d"]}).as_latex_tabular() > > Thanks, I had looked in econpy, but I didn't see it. > FYI, this is also in the statsmodels sandbox for when we get around to having nicer output. Skipper From josef.pktd at gmail.com Sat Jan 23 13:43:11 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Sat, 23 Jan 2010 13:43:11 -0500 Subject: [SciPy-User] 2d array to Latex In-Reply-To: References: <1cd32cbb1001221939k1f21930eu351b6f4b8836b271@mail.gmail.com> <4B5AFB04.2020501@american.edu> <1cd32cbb1001230545j3ddb6f78gc159597e5b532fe0@mail.gmail.com> Message-ID: <1cd32cbb1001231043t1a7abaf3te1c6903d20f86f12@mail.gmail.com> On Sat, Jan 23, 2010 at 12:21 PM, Skipper Seabold wrote: > On Sat, Jan 23, 2010 at 8:45 AM, wrote: >> On Sat, Jan 23, 2010 at 8:35 AM, Alan G Isaac wrote: >>> On 1/22/2010 10:39 PM, josef.pktd at gmail.com wrote: >>>> Is there a function somewhere to print a 2d array with Latex markup? >>> >>> Try SimpleTable: >>> http://econpy.googlecode.com/svn/trunk/utilities/text.py >> >> from econpy.utilities.text import SimpleTable >> print SimpleTable(np.eye(3*2,3*2,k=1),fmt={'data_fmt':["%d"]}).as_latex_tabular() >> >> Thanks, I had looked in econpy, but I didn't see it. >> > > FYI, this is also in the statsmodels sandbox for when we get around to > having nicer output. ouch, Josef "ouch: Ornstein-Uhlenbeck models for phylogenetic comparative hypotheses" > > Skipper > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From g.statkute at gmail.com Sat Jan 23 16:03:25 2010 From: g.statkute at gmail.com (gintare statkute) Date: Sat, 23 Jan 2010 23:03:25 +0200 Subject: [SciPy-User] ATLAS Message-ID: <8e7295c71001231303l62de53e2ubad0a716959b33ca@mail.gmail.com> Hello, I cannot install ATLAS and would like to ask if I could get a working and installable version of ATLAS from somebody. In the tar archives which I download from http://math-atlas.sourceforge.net/ (tried several different versions), header files are missing from the source directory in the ./configure step, i.e. in the folders which I downloaded in tar format. With the tar archives which I download from http://www.netlib.org/atlas/index.html the installation does not stop. I left it for 5 hours and the installation continued with the message 0: NFLOP=0, tim=0.000000, without errors. My OS is Linux. I posted to the ATLAS newsgroup about the missing header files, but most probably people in this group have installed ATLAS and I could borrow a *.tar.bz2 file from them.
regards, gintare From nwagner at iam.uni-stuttgart.de Sun Jan 24 04:11:19 2010 From: nwagner at iam.uni-stuttgart.de (Nils Wagner) Date: Sun, 24 Jan 2010 10:11:19 +0100 Subject: [SciPy-User] Matrix Exponentials For Very Large Sparse Matrices In-Reply-To: References: <6F8E0DFF-23B9-443B-BE6E-98B58AE0AC7A@berkeley.edu> Message-ID: On Fri, 22 Jan 2010 20:33:15 -0500 Joshua Stults wrote: > On Fri, Jan 22, 2010 at 8:24 PM, Dylan Gorman > wrote: >> Dear Chuck and Joshua, >> >> It's a problem in quantum simulation. I'm trying to >>solve d(rho)/dt = >> L*rho for a sparse matrix L. L should be symmetric, and >>in principle I >> just need to compute (e^L*t)*(rho(0)) or something--not >>e^(L*t) >> explicitly. > > If you end up rolling your own, it sounds like 'Method >20: Krylov > space methods', in 'Nineteen Dubious Ways to Compute the >Exponential > of a Matrix' is for you: > http://www.cs.cornell.edu/cv/researchpdf/19ways+.pdf > You might be interested in http://dx.doi.org/10.1137/S0036142995280572 http://dx.doi.org/10.1137/S1064827595295337 Nils From gokhansever at gmail.com Sun Jan 24 12:54:37 2010 From: gokhansever at gmail.com (=?UTF-8?Q?G=C3=B6khan_Sever?=) Date: Sun, 24 Jan 2010 11:54:37 -0600 Subject: [SciPy-User] Interact with matplotlib in Sage Message-ID: <49d6b3501001240954l42188cb9m6af15d9e11bf1af3@mail.gmail.com> Hello, I have thought of this might interesting to share. Register at www.sagenb.org or try on your local Sage-notebook and using the following code: # Simple example demonstrating how to interact with matplotlib directly. # Comment plt.clf() to get the plots overlay in each update. # Gokhan Sever & Harald Schilly (2010-01-24) from scipy import stats import numpy as np import matplotlib.pyplot as plt @interact def plot_norm(loc=(0,(0,10)), scale=(1,(1,10))): rv = stats.norm(loc, scale) x = np.linspace(-10,10,1000) plt.plot(x,rv.pdf(x)) plt.grid(True) plt.savefig('plt.png') plt.clf() A very easy to use example, also well-suited for learning and demonstration purposes. Posted at: http://wiki.sagemath.org/interact/graphics#Interactwithmatplotlib Have fun ;) -- G?khan -------------- next part -------------- An HTML attachment was scrubbed... URL: From sebastian.walter at gmail.com Sun Jan 24 11:53:36 2010 From: sebastian.walter at gmail.com (Sebastian Walter) Date: Sun, 24 Jan 2010 17:53:36 +0100 Subject: [SciPy-User] Matrix Exponentials For Very Large Sparse Matrices In-Reply-To: <6F8E0DFF-23B9-443B-BE6E-98B58AE0AC7A@berkeley.edu> References: <6F8E0DFF-23B9-443B-BE6E-98B58AE0AC7A@berkeley.edu> Message-ID: For me your problem looks like an ODE and therefore you should use an ODE integrator. This should work also for very large problems in principle. For quantum mechanics there are special ODE integrators that have special invariances, e.g. that the total probability is fixed 1. If I remember correctly symplectic integration schemes are quite useful: http://en.wikipedia.org/wiki/Symplectic_integrator On Sat, Jan 23, 2010 at 2:24 AM, Dylan Gorman wrote: > Dear Chuck and Joshua, > > It's a problem in quantum simulation. I'm trying to solve d(rho)/dt = > L*rho for a sparse matrix L. L should be symmetric, and in principle I > just need to compute (e^L*t)*(rho(0)) or something--not e^(L*t) > explicitly. 
> > Regards, > Dylan > > On Jan 22, 2010, at 5:16 PM, Joshua Stults wrote: > >> On Fri, Jan 22, 2010 at 8:04 PM, Charles R Harris >> wrote: >>> >>> >>> On Fri, Jan 22, 2010 at 5:17 PM, Dylan Gorman >>> wrote: >>>> >>>> Hi Folks, >>>> >>>> I'd like to exponentiate very large sparse matrices with scipy, >>>> and I >>>> would be very grateful for any suggestions. Currently, I'm running >>>> into memory errors exponentiating random matrices of order 10^3x10^3 >>>> with the standard linalg.expm() routine. However, I suspect that I >>>> may >>>> realistically be able to handle somewhat larger matrices since the >>>> actual matrices I will be using are quite sparse. Ideally, I'd >>>> like to >>>> be able to exponentiate matrices of size 10^5-10^6 x 10^5 - 10^6. >>>> However, there does not seem to be any linalg.sparse.expm() >>>> function-- >>>> is this because there is in fact no advantage to exponentiating >>>> sparse >>>> matrices? Or would I need to implement something by hand? >>>> >>> >>> Out of curiosity, do the matrices have any special structure? For >>> instance, >>> are they banded or symmetric? Also, why to you want to exponentiate >>> them? >>> >> Another question: do you need the matrix exponential explicitly, or do >> you just need it's action on a vector? >> >>> Chuck >>> >>> >>> _______________________________________________ >>> SciPy-User mailing list >>> SciPy-User at scipy.org >>> http://mail.scipy.org/mailman/listinfo/scipy-user >>> >>> >> >> >> >> -- >> Joshua Stults >> Website: http://j-stults.blogspot.com >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From g.statkute at gmail.com Sun Jan 24 04:22:07 2010 From: g.statkute at gmail.com (gintare statkute) Date: Sun, 24 Jan 2010 11:22:07 +0200 Subject: [SciPy-User] Fwd: ATLAS, addition Message-ID: <8e7295c71001240122w2f6dbc12h812ff539ad0a7a51@mail.gmail.com> Hello, 3) The same message. 0: NFLOP=0, tim=0.000000 and endless instaliation happenes with *.tar.gz from http://packages.debian.org/source/stable/atlas I am installing without F77, since in Lapack installation help it was written that i should remove all f77 compiler and use F95 instead. There are empty lines in "Make.Linux_Debian86SSE2_2", which i left unfilled. This file "Make.Linux_Debian86SSE2_2" is generated automaticallly during ATLAS make in command line with user prompt. # ------------------------------------ # Reference and system libraries # ------------------------------------ BLASlib = FBLASlib = FLAPACKlib = LIBS = -lpthread -lm # ---------------------------------------------------------- # ATLAS install resources (include arch default directories) # ---------------------------------------------------------- ARCHDEF = MMDEF = INSTFLAGS = 4) One more repository: http://debs.astraw.com/dapper/ apt-get can not access src from it. ---------- Forwarded message ---------- From: gintare statkute Date: Sat, 23 Jan 2010 23:03:25 +0200 Subject: ATLAS To: scipy-user at scipy.org Hello, I can not install ATLAS and would like to ask if i could get from somebody working and installable version of ATLAS. 1) In tar archives, which i download from http://math-atlas.sourceforge.net/ (tried sevral different versions) - in the ./configure step header files are missing in source directory, i.e. 
in folders which i downloaded in tar format. 2) In tar archives, which i download from http://www.netlib.org/atlas/index.html installaiton do not stop. I left it for 5 hours and installation continues with message. 0: NFLOP=0, tim=0.000000 without errors. My OS is linux. I posted to ATLAS newsgroup about missing header files. Nevertheless most probably people in this scipy group have installed ATLAS and i could borriw *.tar.bz2 file form them. regards, gintare From gnurser at googlemail.com Sun Jan 24 06:56:27 2010 From: gnurser at googlemail.com (George Nurser) Date: Sun, 24 Jan 2010 11:56:27 +0000 Subject: [SciPy-User] Fwd: f2py segfault In-Reply-To: <4B5451E4.5020806@yahoo.com> References: <4B5451E4.5020806@yahoo.com> Message-ID: <1d1e6ea71001240356n420574efgcf681b3daf436727@mail.gmail.com> Juan, If I replace I32 with 4 and R64 with 8 in sub0 your code works OK for me. This is with gfortran 4.3.3, python 2.5.2 and numpy v 1.4.0 --George. 2010/1/18 Juan : > Hi, thanks for the advice. I did not notice that the integer division could be a > source for trouble. Now I changed all the routines. However, I still have the > same segmentation fault. > > > debug-capi:Python C/API function > mymod.sub0(state,ndim=shape(state,0),ntrajectories=shape(state,1)) > debug-capi:double > state=:inoutput,required,array,dims(ndim|ndim,ntrajectories|ntrajectories) > debug-capi:int ndim=shape(state,0):input,optional,scalar > debug-capi:ndim=24 > debug-capi:Checking `shape(state,0)==ndim' > debug-capi:int ntrajectories=shape(state,1):input,optional,scalar > debug-capi:ntrajectories=100 > debug-capi:Checking `shape(state,1)==ntrajectories' > debug-capi:Fortran subroutine `sub0(state,&ndim,&ntrajectories)' > debug-capi:ndim=24 > debug-capi:ntrajectories=100 > debug-capi:Building return value. > debug-capi:Python C/API function mymod.sub0: successful. > debug-capi:Freeing memory. > debug-capi:Python C/API function > mymod.sub1(state,d_i,d_f,ndim=shape(state,0),ntrajectories=shape(state,1)) > debug-capi:double > state=:inoutput,required,array,dims(ndim|ndim,ntrajectories|ntrajectories) > Segmentation fault > > The working sub1 has two other arguments d_i and d_f which are real scalars, the > full signatures are: > > ?subroutine sub0(state, Ndim, Ntrajectories) > ? ?integer(I32), intent(IN) :: Ndim > ? ?integer(I32), intent(IN) :: Ntrajectories > ? ?real(R64), intent(INOUT), dimension(Ndim,Ntrajectories) :: state > ? ?... > ?end subroutine sub0 > > ?subroutine sub1(state,d_i,d_f, Ndim,Ntrajectories) > ? ?integer(4), intent(IN) :: Ndim > ? ?integer(4), intent(IN) :: Ntrajectories > ? ?real(8), intent(INOUT), dimension(Ndim, Ntrajectories) :: state > ? ?real(8), intent(IN) :: d_i > ? ?real(8), intent(IN) :: d_f > ? ?print *, shape(state), Ndim, Ntrajectories > ? ?... > ?end subroutine sub1 > > and I am calling from my script as: > > import mymod > Ndim=24 > Ntrajectories=10 > di=0., df=10. > r= np.zeros((Ndim,Ntrajectories),dtype=np.float64, order='Fortran') > > mymod.sub0(r) > mymod.sub1(r, di, df) > > As it can be seen from the debug output, f2py is checking the arguments for sub0 > but it segfault before checking the args in sub1 (with no very informative > messages). > > It may well be a problem related to theworkings of the routines but they work > when I use them in tests on pure fortran code. Additionally I get a very similar > error message if I call sub0 (mymod.sub0(r)) instead of sub1 (mymod.sub1(r, di, > df)) the second time in the python script. > > Any ideas? Thanks again. 
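A side note on those named kinds, for readers hitting the same wall: f2py does not know what integer(I32) or real(R64) mean unless it is told, which fits George's observation that literal kinds work. One documented way to keep the named kinds is a .f2py_f2cmap file in the directory where f2py is invoked; a minimal sketch, assuming I32/R64 are the only custom kind parameters in the source:

# .f2py_f2cmap -- a plain-text file that f2py reads from the build
# directory. It maps Fortran kind parameters to C types, so the
# generated wrappers use the right sizes instead of guessing.
dict(real=dict(R64='double'),
     integer=dict(I32='int'))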
Juan > > > -------- Original Message -------- > Subject: f2py segfault > Date: Sun, 17 Jan 2010 15:28:23 -0300 > From: Juan > To: scipy-user at scipy.org > > Hi, I don't know if this is the right place (if it is not, please point me in > the right direction). > I am using f2py with some own programs and I am going insane with a segmentation > fault. It is probably a problem in my code but I'd like to know if someone has > any hint > to give me since I've been trying different things for two days already. > > I've got a few routines in fortran with in/out arrays. When I call one of the > routines it works well. The second routine I call crashes the program. I've been > changing routines and it seems that it does not matter with routines I use. > > Basically, the fortran routines have the signature: > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From cournape at gmail.com Sun Jan 24 22:31:07 2010 From: cournape at gmail.com (David Cournapeau) Date: Mon, 25 Jan 2010 12:31:07 +0900 Subject: [SciPy-User] ATLAS In-Reply-To: <8e7295c71001231303l62de53e2ubad0a716959b33ca@mail.gmail.com> References: <8e7295c71001231303l62de53e2ubad0a716959b33ca@mail.gmail.com> Message-ID: <5b8d13221001241931gbfde798u4002e7dfd2b15009@mail.gmail.com> On Sun, Jan 24, 2010 at 6:03 AM, gintare statkute wrote: > Hello, > > I can not install ATLAS and would like to ask if i could get from > somebody working and installable version of ATLAS. Just use the one packaged from Debian: apt-get install libatlas-base-dev libatlas3gf-base David From jeremy at jeremysanders.net Mon Jan 25 15:26:21 2010 From: jeremy at jeremysanders.net (Jeremy Sanders) Date: Mon, 25 Jan 2010 20:26:21 +0000 Subject: [SciPy-User] ANN: Veusz 1.6 Message-ID: Veusz 1.6 --------- Velvet Ember Under Sky Zenith ----------------------------- http://home.gna.org/veusz/ Veusz is Copyright (C) 2003-2010 Jeremy Sanders Licenced under the GPL (version 2 or greater). Veusz is a Qt4 based scientific plotting package. It is written in Python, using PyQt4 for display and user-interfaces, and numpy for handling the numeric data. Veusz is designed to produce publication-ready Postscript/PDF/SVG output. The user interface aims to be simple, consistent and powerful. Veusz provides a GUI, command line, embedding and scripting interface (based on Python) to its plotting facilities. It also allows for manipulation and editing of datasets. Data can be captured from external sources such as internet sockets or other programs. Changes in 1.6: * User defined constants, functions or external Python imports can be defined for use when evaluating expressions. * Import descriptor is much more tolerant of syntax, e.g. "x,+- y,+,-" can now be specified as "x +- y + -". * New SVG export (PyQt >= 4.6). Supports clipping and exports text as paths for full WYSIWYG. * Dataset names can now contain any character except "`". Names containing non-alphanumeric characters can be quoted in expressions `like so`*1.23 * Widget names can contain any character except "/" * A transparency dataset can be provided to specify the per-pixel transparency of the image widget. * A polygon widget has been added. * There is a new option to place axis ticks outside the plot (outer ticks setting on axis widget) * Several new line styles have been added. * Several new plotting markers have been added. * The capture dialog can optionally retain the last N values captured. 
Minor changes: * Use of flat cap line style for plotting error bars for exactness. * Add fixes for saving imported unicode text. * Fix image colors for big endian systems (e.g. Mac PPC). * Add boxfill error bar style, plotting errors as filled boxes. * Positive and negative error bars are forced to have the correct sign. Features of package: * X-Y plots (with errorbars) * Line and function plots * Contour plots * Images (with colour mappings and colorbars) * Stepped plots (for histograms) * Bar graphs * Plotting dates * Fitting functions to data * Stacked plots and arrays of plots * Plot keys * Plot labels * Shapes and arrows on plots * LaTeX-like formatting for text * EPS/PDF/PNG/SVG/EMF export * Scripting interface * Dataset creation/manipulation * Embed Veusz within other programs * Text, CSV and FITS importing * Data can be captured from external sources Requirements for source install: Python (2.4 or greater required) http://www.python.org/ Qt >= 4.3 (free edition) http://www.trolltech.com/products/qt/ PyQt >= 4.3 (SIP is required to be installed first) http://www.riverbankcomputing.co.uk/pyqt/ http://www.riverbankcomputing.co.uk/sip/ numpy >= 1.0 http://numpy.scipy.org/ Optional: Microsoft Core Fonts (recommended for nice output) http://corefonts.sourceforge.net/ PyFITS >= 1.1 (optional for FITS import) http://www.stsci.edu/resources/software_hardware/pyfits pyemf >= 2.0.0 (optional for EMF export) http://pyemf.sourceforge.net/ For EMF and better SVG export, PyQt >= 4.6 or better is required, to fix a bug in the C++ wrapping For documentation on using Veusz, see the "Documents" directory. The manual is in PDF, HTML and text format (generated from docbook). The examples are also useful documentation. Issues with the current version: * Due to Qt, hatched regions sometimes look rather poor when exported to PostScript, PDF or SVG. * Due to a bug in Qt, some long lines, or using log scales, can lead to very slow plot times under X11. It is fixed by upgrading to Qt-4.5.1 (or using a binary). Switching off antialiasing in the options may help. If you enjoy using Veusz, I would love to hear from you. Please join the mailing lists at https://gna.org/mail/?group=veusz to discuss new features or if you'd like to contribute code. The latest code can always be found in the SVN repository. Jeremy Sanders From dgorman at berkeley.edu Mon Jan 25 17:04:53 2010 From: dgorman at berkeley.edu (Dylan Gorman) Date: Mon, 25 Jan 2010 14:04:53 -0800 Subject: [SciPy-User] [SciPy-user] Matrix Exponentials For Very Large Sparse Matrices In-Reply-To: References: <6F8E0DFF-23B9-443B-BE6E-98B58AE0AC7A@berkeley.edu> <27282461.post@talk.nabble.com> Message-ID: <42EC8E70-685F-4AB9-9B32-AF83435053C9@berkeley.edu> Joshua, Thanks for the expokit suggestion. I seem to have gotten it installed on my Mac OS X Leopard system. The presentation you linked indicated that I need to change the call to matvec in expokit.f to include n, so I first replaced all instances of 'call matvec(' in expokit.f with 'call matvec(n,', but I'm not sure exactly when this should be done. I then executed: f2py -m expokit -h expokit.pyf expokit.f f2py -c expokit.pyf expokit.f --link-lapack-opt which produced the expokit.so file. However, now I'm trying to reproduce the example given in the presentation you linked, and the call to dmexpv() does not seem to work. 
It wants a lot of arguments: >>> from scipy import * >>> from expokit import dmexpv >>> dmexpv() Traceback (most recent call last): File "<stdin>", line 1, in <module> TypeError: expokit.dmexpv() takes at least 11 arguments (0 given) The most confusing thing it wants me to pass is matvec, which is apparently an external matrix-vector multiplication function. I can't seem to figure out how to get this function to work. The presentation uses a much simpler call: dmexpv(m, t, v, wsp, iwsp, A) Can you offer any insight into this problem? Thank you, Dylan On Jan 22, 2010, at 6:47 PM, Joshua Stults wrote: > On Fri, Jan 22, 2010 at 8:58 PM, Burak1327 > wrote: >> >> >> The simplest approximations are polynomial expansions. >> Chebyshev polynomials are GREAT; even a 2nd-order >> Taylor expansion is good enough in a lot of cases, depending on >> your type of problem. >> >> Which leads to the actual scipy discussion. I'm no scipy expert, but >> the above-mentioned methods are probably in the library. >> > > Here's an example of using f2py to compile expokit (see slides 15 - > 21): > http://sf.anu.edu.au/~mhk900/Python_Workshop/short.pdf > > Expokit website: http://www.maths.uq.edu.au/expokit/ > > Uses Krylov methods for sparse matrices; these will use more memory > than the polynomial expansion methods that Burak mentioned. > > -- > Joshua Stults > Website: http://j-stults.blogspot.com > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From joshua.stults at gmail.com Mon Jan 25 17:18:20 2010 From: joshua.stults at gmail.com (Joshua Stults) Date: Mon, 25 Jan 2010 17:18:20 -0500 Subject: [SciPy-User] [SciPy-user] Matrix Exponentials For Very Large Sparse Matrices In-Reply-To: <42EC8E70-685F-4AB9-9B32-AF83435053C9@berkeley.edu> References: <6F8E0DFF-23B9-443B-BE6E-98B58AE0AC7A@berkeley.edu> <27282461.post@talk.nabble.com> <42EC8E70-685F-4AB9-9B32-AF83435053C9@berkeley.edu> Message-ID: On Mon, Jan 25, 2010 at 5:04 PM, Dylan Gorman wrote: > Joshua, > > Thanks for the expokit suggestion. I seem to have gotten it installed > on my Mac OS X Leopard system. The presentation you linked indicated > that I need to change the call to matvec in expokit.f to include n, so > I first replaced all instances of 'call matvec(' in expokit.f with > 'call matvec(n,', but I'm not sure exactly when this should be done. I > then executed: > > f2py -m expokit -h expokit.pyf expokit.f > f2py -c expokit.pyf expokit.f --link-lapack-opt > > which produced the expokit.so file. > > However, now I'm trying to reproduce the example given in the > presentation you linked, and the call to dmexpv() does not seem to > work. It wants a lot of arguments: > > >>> from scipy import * > >>> from expokit import dmexpv > >>> dmexpv() > Traceback (most recent call last): > File "<stdin>", line 1, in <module> > TypeError: expokit.dmexpv() takes at least 11 arguments (0 given) > > The most confusing thing it wants me to pass is matvec, which is > apparently an external matrix-vector multiplication function. I can't > seem to figure out how to get this function to work. The presentation > uses a much simpler call: > > dmexpv(m, t, v, wsp, iwsp, A) > > Can you offer any insight into this problem? > Maybe a little: the Krylov methods will require a function that gives the action of your matrix on a vector, just like a Krylov method for solving a linear system would.
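To make the "action on a vector" point concrete, here is a minimal sketch (not code from this thread) that approximates exp(t*L)*v using only sparse mat-vec products: substepping plus a truncated Taylor series, the crudest member of the polynomial-expansion family Burak mentioned. The function name and the substeps/order defaults are illustrative rather than tuned; the Krylov route in expokit is the robust version of the same idea.

import numpy as np
import scipy.sparse as sp

def expm_taylor_apply(L, v, t, substeps=100, order=4):
    # Approximate exp(t*L)*v without ever forming exp(t*L): split t
    # into small substeps and apply a truncated Taylor series of
    # exp(dt*L) at each substep, using nothing but sparse mat-vecs.
    dt = t / float(substeps)
    w = np.asarray(v, dtype=float)
    for _ in range(substeps):
        term = w.copy()
        total = w.copy()
        for k in range(1, order + 1):
            term = (dt / k) * L.dot(term)   # k-th Taylor term
            total = total + term
        w = total
    return w

# tiny smoke test on a random symmetric sparse matrix
n = 1000
A = sp.rand(n, n, density=1e-3, format='csr')
L = 0.5 * (A + A.T)
v = np.ones(n) / np.sqrt(n)
w = expm_taylor_apply(L, v, 0.1)

Accuracy and stability depend on dt and on the spectrum of L, so the substep count has to shrink as the norm of L grows.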
F2py usually does a pretty good job of generating doc strings that give all the arguments and their dimensions, have you taken a look at print dmexpv.__doc__ ? > Thank you, > Dylan > > On Jan 22, 2010, at 6:47 PM, Joshua Stults wrote: > >> On Fri, Jan 22, 2010 at 8:58 PM, Burak1327 >> wrote: >>> >>> >>> The most simple approximations are polynomial expansions. >>> Chebychev polynomials are GREAT, even a 2nd order >>> Taylor expansion is good enough in a lot of cases, specific to >>> your type of problem. >>> >>> Which leads to actual scipy discussion. I'm no scipy expert, but >>> the above mentioned methods are probably in the library. >>> >> >> Here's an example of using f2py to compile expokit (see slides 15 - >> 21): >> http://sf.anu.edu.au/~mhk900/Python_Workshop/short.pdf >> >> Expokit website: http://www.maths.uq.edu.au/expokit/ >> >> Uses Krylov methods for sparse matrices; these will use more memory >> than the polynomial expansion methods that Burak mentioned. >> -- Joshua Stults Website: http://j-stults.blogspot.com From ggellner at uoguelph.ca Mon Jan 25 22:01:30 2010 From: ggellner at uoguelph.ca (Gabriel Gellner) Date: Mon, 25 Jan 2010 22:01:30 -0500 Subject: [SciPy-User] indexing array without changing the ndims Message-ID: I really want an easy way to index an array but not have numpy simplify the shape (if you know R I want their drop=FALSE behavior). np.newaxis only seems useful when you need some simple fixes, I want something that works without having to check the index to figure out which parts of shape need a 1. Any ideas? thanks, Gabriel From warren.weckesser at enthought.com Mon Jan 25 22:08:34 2010 From: warren.weckesser at enthought.com (Warren Weckesser) Date: Mon, 25 Jan 2010 21:08:34 -0600 Subject: [SciPy-User] indexing array without changing the ndims In-Reply-To: References: Message-ID: <4B5E5CB2.9010700@enthought.com> Gabriel Gellner wrote: > I really want an easy way to index an array but not have numpy > simplify the shape (if you know R I want their drop=FALSE behavior). > For those of us who aren't familiar with R, could you give a concrete example of what you want to do? Warren > np.newaxis only seems useful when you need some simple fixes, I want > something that works without having to check the index to figure out > which parts of shape need a 1. Any ideas? > > thanks, > Gabriel > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From joshua.stults at gmail.com Mon Jan 25 22:09:07 2010 From: joshua.stults at gmail.com (Joshua Stults) Date: Mon, 25 Jan 2010 22:09:07 -0500 Subject: [SciPy-User] [SciPy-user] Matrix Exponentials For Very Large Sparse Matrices In-Reply-To: References: <6F8E0DFF-23B9-443B-BE6E-98B58AE0AC7A@berkeley.edu> <27282461.post@talk.nabble.com> <42EC8E70-685F-4AB9-9B32-AF83435053C9@berkeley.edu> Message-ID: On Mon, Jan 25, 2010 at 5:18 PM, Joshua Stults wrote: > On Mon, Jan 25, 2010 at 5:04 PM, Dylan Gorman wrote: >> Joshua, >> >> Thanks for the expokit suggestion. I seem to have gotten it installed >> on my Mac OS X Leopard system. The presentation you linked indicated >> that I need to change the call to matvec in expokit.f to include n, so >> I first replaced all instances of 'call matvec(' in expokit.f with >> 'call matvec(n,', but I'm not sure exactly when this should be done. 
I >> then executed: The expokit site also gives some tips about defining 'matvecs': http://www.maths.uq.edu.au/expokit/support.html >> >> f2py -m expokit -h expokit.pyf expokit.f >> f2py -c expokit.pyf expokit.f --link-lapack-opt >> >> which produced the expokit.so file. >> >> However, now I'm trying to reproduce the example given in the >> presentation you linked, and the call to dmexpv() does not seem to >> work. It wants a lot of arguments: >> >> ?>>> from scipy import * >> ?>>> from expokit import dmexpv >> ?>>> dmexpv() >> Traceback (most recent call last): >> ? File "", line 1, in >> TypeError: expokit.dmexpv() takes at least 11 arguments (0 given) >> >> The most confusing thing it wants me to pass is matvec, which is >> apparently an external matrix-vector multiplication function. I can't >> seem to figure out how to get this function to work. Tthe presentation >> uses a much simpler call: >> >> dmexpv(m, t, v, wsp, iwsp, A) >> >> Can you offer any insight into this problem? >> > > Maybe a little, the Krylov methods will require a function that gives > the action of your matrix on a vector, just like a Krylov method for > solving a linear system would. ?F2py usually does a pretty good job of > generating doc strings that give all the arguments and their > dimensions, have you taken a look at > > print dmexpv.__doc__ > > ? > >> Thank you, >> Dylan >> >> On Jan 22, 2010, at 6:47 PM, Joshua Stults wrote: >> >>> On Fri, Jan 22, 2010 at 8:58 PM, Burak1327 >>> wrote: >>>> >>>> >>>> The most simple approximations are polynomial expansions. >>>> Chebychev polynomials are GREAT, even a 2nd order >>>> Taylor expansion is good enough in a lot of cases, specific to >>>> your type of problem. >>>> >>>> Which leads to actual scipy discussion. I'm no scipy expert, but >>>> the above mentioned methods are probably in the library. >>>> >>> >>> Here's an example of using f2py to compile expokit (see slides 15 - >>> 21): >>> http://sf.anu.edu.au/~mhk900/Python_Workshop/short.pdf >>> >>> Expokit website: http://www.maths.uq.edu.au/expokit/ >>> >>> Uses Krylov methods for sparse matrices; these will use more memory >>> than the polynomial expansion methods that Burak mentioned. >>> > > > -- > Joshua Stults > Website: http://j-stults.blogspot.com > -- Joshua Stults Website: http://j-stults.blogspot.com From ggellner at uoguelph.ca Tue Jan 26 01:39:55 2010 From: ggellner at uoguelph.ca (Gabriel Gellner) Date: Tue, 26 Jan 2010 01:39:55 -0500 Subject: [SciPy-User] indexing array without changing the ndims In-Reply-To: <4B5E5CB2.9010700@enthought.com> References: <4B5E5CB2.9010700@enthought.com> Message-ID: On Mon, Jan 25, 2010 at 10:08 PM, Warren Weckesser wrote: > Gabriel Gellner wrote: >> I really want an easy way to index an array but not have numpy >> simplify the shape (if you know R I want their drop=FALSE behavior). >> > > For those of us who aren't familiar with R, could you give a concrete > example of what you want to do? > It would be the similar to what the numpy.matrix class does, namely when you use an index like `mat[0, :]` you still have ndims == 2 (a column matrix in this case). So I want this behavior for an ndarray so I could be certain that if I do any indexing the ndims of the returned array is the same as the original array. In R any array can be indexed with an extra keyword argument drop=FALSE to give this behavior so for the above I would have `mat[0, :, drop=False]` (in pretend python notation, in R we would write mat[1,, drop=F]) and it would do the right thing. 
An even more extreme example would be to do something like `zeros((3, 3, 3))[0, 0, 0, drop=False]` (In R `array(0, c(3, 3, 3))[1, 1, 1, drop=F]`) which would return an array with shape == (1, 1, 1) instead of (). Now this drop notation is just for explanation, I know it is not possible in python, but I was hoping their is some equivalent way of getting this nice behavior. Looking at the matrix source code suggest this is not the case and it needs to be coded by hand, I was hoping this is not the case! thanks, Gabriel From josef.pktd at gmail.com Tue Jan 26 01:59:46 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Tue, 26 Jan 2010 01:59:46 -0500 Subject: [SciPy-User] indexing array without changing the ndims In-Reply-To: References: <4B5E5CB2.9010700@enthought.com> Message-ID: <1cd32cbb1001252259g189ba814v6ea2908a9bb9fac2@mail.gmail.com> On Tue, Jan 26, 2010 at 1:39 AM, Gabriel Gellner wrote: > On Mon, Jan 25, 2010 at 10:08 PM, Warren Weckesser > wrote: >> Gabriel Gellner wrote: >>> I really want an easy way to index an array but not have numpy >>> simplify the shape (if you know R I want their drop=FALSE behavior). >>> >> >> For those of us who aren't familiar with R, could you give a concrete >> example of what you want to do? >> > It would be the similar to what the numpy.matrix class does, namely > when you use an index like > `mat[0, :]` you still have ndims == 2 (a column matrix in this case). > So I want this behavior for an ndarray so I could be certain that if I > do any indexing the ndims of the returned array is the same as the > original array. > > In R any array can be indexed with an extra keyword argument > drop=FALSE to give this behavior so for the above I would have > `mat[0, :, drop=False]` (in pretend python notation, in R we would > write mat[1,, drop=F]) and it would do the right thing. An even more > extreme example would be to do something like > > `zeros((3, 3, 3))[0, 0, 0, drop=False]` (In R `array(0, c(3, 3, 3))[1, > 1, 1, drop=F]`) which would return an array with shape == (1, 1, 1) > instead of (). > Now this drop notation is just for explanation, I know it is not > possible in python, but I was hoping their is some equivalent way of > getting this nice behavior. Looking at the matrix source code suggest > this is not the case and it needs to be coded by hand, I was hoping > this is not the case! I use slice indices to keep (or np.newaxis, or np.expand_dims to add the dropped dim back in) >>> i,j,k = 0,3,2; np.arange(2*3*4).reshape(2,3,4)[i:i+1, j:j+1, k:k+1] array([], shape=(1, 0, 1), dtype=int32) >>> i,j,k = 0,2,2; np.arange(2*3*4).reshape(2,3,4)[i:i+1, j:j+1, k:k+1] array([[[10]]]) >>> _.shape (1, 1, 1) Josef > > thanks, > Gabriel > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From warren.weckesser at enthought.com Tue Jan 26 02:00:42 2010 From: warren.weckesser at enthought.com (Warren Weckesser) Date: Tue, 26 Jan 2010 01:00:42 -0600 Subject: [SciPy-User] indexing array without changing the ndims In-Reply-To: References: <4B5E5CB2.9010700@enthought.com> Message-ID: <4B5E931A.2050209@enthought.com> Gabriel Gellner wrote: > On Mon, Jan 25, 2010 at 10:08 PM, Warren Weckesser > wrote: > >> Gabriel Gellner wrote: >> >>> I really want an easy way to index an array but not have numpy >>> simplify the shape (if you know R I want their drop=FALSE behavior). 
>>> >>> >> For those of us who aren't familiar with R, could you give a concrete >> example of what you want to do? >> >> > It would be the similar to what the numpy.matrix class does, namely > when you use an index like > `mat[0, :]` you still have ndims == 2 (a column matrix in this case). > So I want this behavior for an ndarray so I could be certain that if I > do any indexing the ndims of the returned array is the same as the > original array. > One way you could do this is to always use a slice instead of a single number as the index: mat[0:1, :]. Or as in this example, where a[:, 1:2] pulls out the second column as a 2D numpy array with shape (3,1): ----- In [1]: import numpy as np In [2]: a = np.arange(12).reshape(3,4) In [3]: a Out[3]: array([[ 0, 1, 2, 3], [ 4, 5, 6, 7], [ 8, 9, 10, 11]]) In [4]: a[:,1] # a 1D slice, but not what you want. Out[4]: array([1, 5, 9]) In [5]: a[:,1:2] # a slice with shape (3,1) Out[5]: array([[1], [5], [9]]) ----- Warren > In R any array can be indexed with an extra keyword argument > drop=FALSE to give this behavior so for the above I would have > `mat[0, :, drop=False]` (in pretend python notation, in R we would > write mat[1,, drop=F]) and it would do the right thing. An even more > extreme example would be to do something like > > `zeros((3, 3, 3))[0, 0, 0, drop=False]` (In R `array(0, c(3, 3, 3))[1, > 1, 1, drop=F]`) which would return an array with shape == (1, 1, 1) > instead of (). > Now this drop notation is just for explanation, I know it is not > possible in python, but I was hoping their is some equivalent way of > getting this nice behavior. Looking at the matrix source code suggest > this is not the case and it needs to be coded by hand, I was hoping > this is not the case! > > thanks, > Gabriel > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From massimodisasha at yahoo.it Tue Jan 26 05:15:10 2010 From: massimodisasha at yahoo.it (Massimo Di Stefano) Date: Tue, 26 Jan 2010 11:15:10 +0100 Subject: [SciPy-User] find intervall in array Message-ID: Hi All, I have an Nx2 array where : - first column = integer values sorted from min to max - second column = float values (range 0 - 1) an example array can look like : from numpy import zeros, array import random a = zeros((100,2),float) for i in range(100): a[i,0] = random.randrange(1000,2000,10) a[i,1] = random.random() the first column represents "Z" (elevation values) the second column represents a percentage, 0 = 0% , 1 = 100% (pixel % coverage in a map at a given Z) I need to detect the Z value corresponding to a precise percentage (25%, 50%, 75%) The Z value I need to find is derived from a formula like : z = z1 + ((z2 - z1) / (f2 - f1)) * (f - f1) where : f = precise percentage (known value) -> (0.25, 0.50, 0.75) [this value may not be present in the a[i,1] array] f1, f2 = the a[i,1] values near the " f " value, where [ f1 <= f <= f2 ] z1, z2 = the Z values corresponding to f1, f2 as example : array a = z f 1234 0.03 1345 0.58 1456 0.24 1457 0.63 1458 0.41 1459 0.78 1365 0.7 1468 0.56 1545 0.54 if f = 0.5 : z = 1545 + ((1468 - 1545) / (0.56 - 0.54)) * (0.5 - 0.54) any suggestion on how I can find "f1 , f2" ? thanks!!! Massimo.
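A minimal sketch of one way to get the bracketing pair with numpy (not from the thread itself; it assumes the rows are first sorted on the f column, and that f falls strictly inside the sorted f range so that ind-1 and ind are both valid indices). The reply below arrives at the same searchsorted idea.

import numpy as np

# the example array from the question: column 0 = z, column 1 = f
a = np.array([[1234, 0.03], [1345, 0.58], [1456, 0.24],
              [1457, 0.63], [1458, 0.41], [1459, 0.78],
              [1365, 0.70], [1468, 0.56], [1545, 0.54]])

s = a[np.argsort(a[:, 1])]        # sort rows on the f column
f_col, z_col = s[:, 1], s[:, 0]

f = 0.5
ind = np.searchsorted(f_col, f)   # f_col[ind-1] < f <= f_col[ind]
f1, f2 = f_col[ind - 1], f_col[ind]
z1, z2 = z_col[ind - 1], z_col[ind]
z = z1 + (z2 - z1) / (f2 - f1) * (f - f1)

# np.interp does the same linear interpolation in one call
assert abs(z - np.interp(f, f_col, z_col)) < 1e-12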
From josef.pktd at gmail.com Tue Jan 26 05:31:31 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Tue, 26 Jan 2010 05:31:31 -0500 Subject: [SciPy-User] find intervall in array In-Reply-To: References: Message-ID: <1cd32cbb1001260231y30eba903u9eebee909c29fc43@mail.gmail.com> On Tue, Jan 26, 2010 at 5:15 AM, Massimo Di Stefano wrote: > Hi All, > > I have an Nx2 array where : > > - first column = integer values sorted from min to max > - second column = float values (range 0 - 1) > > an example array can look like : > > from numpy import zeros, array > import random > a = zeros((100,2),float) > for i in range(100): > a[i,0] = random.randrange(1000,2000,10) > a[i,1] = random.random() > > > > the first column represents "Z" (elevation values) > the second column represents a percentage, 0 = 0% , 1 = 100% (pixel % coverage in a map at a given Z) > > I need to detect the Z value corresponding to a precise percentage (25%, 50%, 75%) > > The Z value I need to find is derived from a formula like : > > z = z1 + ((z2 - z1) / (f2 - f1)) * (f - f1) > > where : > > f = precise percentage (known value) -> (0.25, 0.50, 0.75) [this value may not be present in the a[i,1] array] > > f1, f2 = the a[i,1] values near the " f " value, where [ f1 <= f <= f2 ] > z1, z2 = the Z values corresponding to f1, f2 > > as example : > > array a = > > z f > 1234 0.03 > 1345 0.58 > 1456 0.24 > 1457 0.63 > 1458 0.41 > 1459 0.78 > 1365 0.7 > 1468 0.56 > 1545 0.54 > > if > f = 0.5 : > z = 1545 + ((1468 - 1545) / (0.56 - 0.54)) * (0.5 - 0.54) > > any suggestion on how I can find "f1 , f2" ? > > > thanks!!! > > Massimo. > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > maybe something like : ind = np.searchsorted(a[:,1], f) f1,f2 = a[[ind, ind+1], 1] after fixing the indexing. Josef From ggellner at uoguelph.ca Tue Jan 26 07:14:27 2010 From: ggellner at uoguelph.ca (Gabriel Gellner) Date: Tue, 26 Jan 2010 07:14:27 -0500 Subject: [SciPy-User] indexing array without changing the ndims In-Reply-To: References: <4B5E5CB2.9010700@enthought.com> Message-ID: Perfect! I always wondered what the use for i:i+1 slices is! It will take a little bit of thinking on how to use this for an arbitrary key, but you guys rock! Thanks. Gabriel On Tue, Jan 26, 2010 at 1:39 AM, Gabriel Gellner wrote: > On Mon, Jan 25, 2010 at 10:08 PM, Warren Weckesser > wrote: >> Gabriel Gellner wrote: >>> I really want an easy way to index an array but not have numpy >>> simplify the shape (if you know R I want their drop=FALSE behavior). >>> >> >> For those of us who aren't familiar with R, could you give a concrete >> example of what you want to do? >> > It would be the similar to what the numpy.matrix class does, namely > when you use an index like > `mat[0, :]` you still have ndims == 2 (a column matrix in this case). > So I want this behavior for an ndarray so I could be certain that if I > do any indexing the ndims of the returned array is the same as the > original array. > > In R any array can be indexed with an extra keyword argument > drop=FALSE to give this behavior so for the above I would have > `mat[0, :, drop=False]` (in pretend python notation, in R we would > write mat[1,, drop=F]) and it would do the right thing.
An even more > extreme example would be to do something like > > `zeros((3, 3, 3))[0, 0, 0, drop=False]` (In R `array(0, c(3, 3, 3))[1, > 1, 1, drop=F]`) which would return an array with shape == (1, 1, 1) > instead of (). > Now this drop notation is just for explanation, I know it is not > possible in python, but I was hoping their is some equivalent way of > getting this nice behavior. Looking at the matrix source code suggest > this is not the case and it needs to be coded by hand, I was hoping > this is not the case! > > thanks, > Gabriel > From eadrogue at gmx.net Tue Jan 26 08:36:08 2010 From: eadrogue at gmx.net (Ernest =?iso-8859-1?Q?Adrogu=E9?=) Date: Tue, 26 Jan 2010 14:36:08 +0100 Subject: [SciPy-User] ATLAS In-Reply-To: <5b8d13221001241931gbfde798u4002e7dfd2b15009@mail.gmail.com> References: <8e7295c71001231303l62de53e2ubad0a716959b33ca@mail.gmail.com> <5b8d13221001241931gbfde798u4002e7dfd2b15009@mail.gmail.com> Message-ID: <20100126133608.GB7938@doriath.local> 25/01/10 @ 12:31 (+0900), thus spake David Cournapeau: > On Sun, Jan 24, 2010 at 6:03 AM, gintare statkute wrote: > > Hello, > > > > I can not install ATLAS and would like to ask if i could get from > > somebody working and installable version of ATLAS. > > Just use the one packaged from Debian: apt-get install > libatlas-base-dev libatlas3gf-base Last time I checked (about 2 weeks ago) it crashed with a segmentation fault while running numpy's tests. Bye. Ernest From cournape at gmail.com Tue Jan 26 20:23:57 2010 From: cournape at gmail.com (David Cournapeau) Date: Wed, 27 Jan 2010 10:23:57 +0900 Subject: [SciPy-User] ATLAS In-Reply-To: <20100126133608.GB7938@doriath.local> References: <8e7295c71001231303l62de53e2ubad0a716959b33ca@mail.gmail.com> <5b8d13221001241931gbfde798u4002e7dfd2b15009@mail.gmail.com> <20100126133608.GB7938@doriath.local> Message-ID: <5b8d13221001261723h24f364f6m84619548deada8f3@mail.gmail.com> 2010/1/26 Ernest Adrogu? : > 25/01/10 @ 12:31 (+0900), thus spake David Cournapeau: >> On Sun, Jan 24, 2010 at 6:03 AM, gintare statkute wrote: >> > Hello, >> > >> > I can not install ATLAS and would like to ask if i could get from >> > somebody working and installable version of ATLAS. >> >> Just use the one packaged from Debian: apt-get install >> libatlas-base-dev libatlas3gf-base > > Last time I checked (about 2 weeks ago) it crashed with a segmentation > fault while running numpy's tests. Then please file a bug report with the exact crash output. David From cournape at gmail.com Tue Jan 26 20:31:41 2010 From: cournape at gmail.com (David Cournapeau) Date: Wed, 27 Jan 2010 10:31:41 +0900 Subject: [SciPy-User] Fwd: ATLAS, addition In-Reply-To: <8e7295c71001240122w2f6dbc12h812ff539ad0a7a51@mail.gmail.com> References: <8e7295c71001240122w2f6dbc12h812ff539ad0a7a51@mail.gmail.com> Message-ID: <5b8d13221001261731u3464de8bw71f04b98b496d10c@mail.gmail.com> On Sun, Jan 24, 2010 at 6:22 PM, gintare statkute wrote: > Hello, > > 3) The same message. 0: NFLOP=0, tim=0.000000 > and endless instaliation happenes with *.tar.gz from > http://packages.debian.org/source/stable/atlas Just install the binary package, using apt-get as said in my previous email. Do not try to build from the sources, especially the patched debian sources (which should not be built using make but with dpkg-buildpackage instead). 
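For anyone following the ATLAS thread who wants to confirm which BLAS/LAPACK their numpy actually linked against before blaming the packages, a quick check; both calls below are standard numpy:

import numpy as np

np.show_config()   # prints the BLAS/LAPACK sections found at build time

# small smoke test that exercises the linked LAPACK
a = np.random.rand(500, 500)
m = np.dot(a, a.T) + 500 * np.eye(500)   # symmetric positive definite
np.linalg.cholesky(m)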
David From markus.proeller at ifm.com Wed Jan 27 02:59:44 2010 From: markus.proeller at ifm.com (markus.proeller at ifm.com) Date: Wed, 27 Jan 2010 08:59:44 +0100 Subject: [SciPy-User] error import scipy.spatial Message-ID: Hi, I get an error message when importing scipy.spatial: >>> import scipy.spatial Traceback (most recent call last): File "", line 1, in File "C:\Python26\lib\site-packages\scipy\spatial\__init__.py", line 7, in from ckdtree import * File "numpy.pxd", line 30, in scipy.spatial.ckdtree (scipy\spatial\ckdtree.c:6087) ValueError: numpy.dtype does not appear to be the correct type object when I try to import it again, it seems to work: >>> import scipy.spatial >>> scipy.spatial.distance.cdist(array([[0,0,0]]),array([[1,1,1]])) array([[ 1.73205081]]) Any idea? Markus -------------- next part -------------- An HTML attachment was scrubbed... URL: From sebastian.walter at gmail.com Wed Jan 27 03:36:15 2010 From: sebastian.walter at gmail.com (Sebastian Walter) Date: Wed, 27 Jan 2010 09:36:15 +0100 Subject: [SciPy-User] good LGPL/BSD licensed NLP solvers? Message-ID: I'm looking for a good SQP NLP solver that can solve problems of the form: min_x f(x) s.t. 0 <= h(x) 0 = g(x) L <= x <= U I've been using SNOPT so far but it would be nice if I wouldn't have to rely on proprietary software. Does anyone know a good LGPL or BSD licensed SQP solver? regards, Sebastian From josef.pktd at gmail.com Wed Jan 27 10:45:47 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 27 Jan 2010 10:45:47 -0500 Subject: [SciPy-User] error import scipy.spatial In-Reply-To: References: Message-ID: <1cd32cbb1001270745t117e3a9dh440c1c4d9de8770a@mail.gmail.com> On Wed, Jan 27, 2010 at 2:59 AM, wrote: > > Hi, > > I get an error message when importing scipy.spatial: > >>>> import scipy.spatial > Traceback (most recent call last): > ? File "", line 1, in > ? File "C:\Python26\lib\site-packages\scipy\spatial\__init__.py", line 7, in > > ? ? from ckdtree import * > ? File "numpy.pxd", line 30, in scipy.spatial.ckdtree > (scipy\spatial\ckdtree.c:6087) > ValueError: numpy.dtype does not appear to be the correct type object > > when I try to import it again, it seems to work: >>>> import scipy.spatial >>>> scipy.spatial.distance.cdist(array([[0,0,0]]),array([[1,1,1]])) > array([[ 1.73205081]]) > > Any idea? scipy 0.7.x has binary incompatibility problems if it has been compiled against numpy 1.3 and is run against numpy 1.4 There are 3 options, either you recompile scipy against numpy 1.4., or downgrade to numpy 1.3 until numpy 1.4 compatible scipy binaries are available, or hope that you don't run into a case where python crashes. More details are in several threads on the mailing lists. Josef > > Markus > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > From eadrogue at gmx.net Wed Jan 27 13:46:32 2010 From: eadrogue at gmx.net (Ernest =?iso-8859-1?Q?Adrogu=E9?=) Date: Wed, 27 Jan 2010 19:46:32 +0100 Subject: [SciPy-User] ATLAS In-Reply-To: <5b8d13221001261723h24f364f6m84619548deada8f3@mail.gmail.com> References: <8e7295c71001231303l62de53e2ubad0a716959b33ca@mail.gmail.com> <5b8d13221001241931gbfde798u4002e7dfd2b15009@mail.gmail.com> <20100126133608.GB7938@doriath.local> <5b8d13221001261723h24f364f6m84619548deada8f3@mail.gmail.com> Message-ID: <20100127184632.GA3882@doriath.local> 27/01/10 @ 10:23 (+0900), thus spake David Cournapeau: > 2010/1/26 Ernest Adrogu? 
: > > 25/01/10 @ 12:31 (+0900), thus spake David Cournapeau: > >> On Sun, Jan 24, 2010 at 6:03 AM, gintare statkute wrote: > >> > Hello, > >> > > >> > I can not install ATLAS and would like to ask if i could get from > >> > somebody working and installable version of ATLAS. > >> > >> Just use the one packaged from Debian: apt-get install > >> libatlas-base-dev libatlas3gf-base > > > > Last time I checked (about 2 weeks ago) it crashed with a segmentation > > fault while running numpy's tests. > > Then please file a bug report with the exact crash output. It's already been reported, here: http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=544274 I was wrong about being a segfault, it simply hangs with a *** glibc detected *** message. It happens with Python 2.5 and 2.6, and Numpy 1.3.0. I have not checked other versions. Bye. Ernest From ferrell at diablotech.com Wed Jan 27 16:18:09 2010 From: ferrell at diablotech.com (Robert Ferrell) Date: Wed, 27 Jan 2010 14:18:09 -0700 Subject: [SciPy-User] find particular day_of_week Message-ID: I have a time series, I want to operate (read only) on all the items with a particular day_of_week. What is an efficient way to get at those? I've tried: desiredDates = [dt for dt in myTS.dates if dt.day_of_week == day_of_week] desiredSeries = myTS[desiredDates] but that seems quite slow. The second line is the one which is taking all the time. Is there a faster way? I don't need a copy, just a view, if that helps. thanks, -robert From josef.pktd at gmail.com Wed Jan 27 16:25:45 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 27 Jan 2010 16:25:45 -0500 Subject: [SciPy-User] find particular day_of_week In-Reply-To: References: Message-ID: <1cd32cbb1001271325g45277f8ak6bad918c2080a0a9@mail.gmail.com> On Wed, Jan 27, 2010 at 4:18 PM, Robert Ferrell wrote: > I have a time series, I want to operate (read only) on all the items > with a particular day_of_week. ?What is an efficient way to get at > those? > > I've tried: > > ? ? ? ? ? ? ? ?desiredDates = [dt for dt in myTS.dates if dt.day_of_week == > day_of_week] > > ? ? ? ? ? ? ? ? desiredSeries = myTS[desiredDates] > > but that seems quite slow. ?The second line is the one which is taking > all the time. ?Is there a faster way? ?I don't need a copy, just a > view, if that helps. I think numpy-discussion jan 20th "dates, np.where finding months" answers this. Josef > > thanks, > -robert > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From ferrell at diablotech.com Wed Jan 27 16:37:46 2010 From: ferrell at diablotech.com (Robert Ferrell) Date: Wed, 27 Jan 2010 14:37:46 -0700 Subject: [SciPy-User] find particular day_of_week In-Reply-To: <1cd32cbb1001271325g45277f8ak6bad918c2080a0a9@mail.gmail.com> References: <1cd32cbb1001271325g45277f8ak6bad918c2080a0a9@mail.gmail.com> Message-ID: <31FF4CE5-5DDE-49C6-AC74-D5427B91BC9A@diablotech.com> On Jan 27, 2010, at 2:25 PM, josef.pktd at gmail.com wrote: > On Wed, Jan 27, 2010 at 4:18 PM, Robert Ferrell > wrote: >> I have a time series, I want to operate (read only) on all the items >> with a particular day_of_week. What is an efficient way to get at >> those? >> >> I've tried: >> >> desiredDates = [dt for dt in myTS.dates if >> dt.day_of_week == >> day_of_week] >> >> desiredSeries = myTS[desiredDates] >> >> but that seems quite slow. The second line is the one which is >> taking >> all the time. Is there a faster way? 
I don't need a copy, just a >> view, if that helps. > > I think numpy-discussion jan 20th "dates, np.where finding months" > answers this. > You're right, of course. I thought this sounded familiar, but I couldn't think of the right search terms. Luckily, I came up with a solution I like even better just after I sent this email: desiredSeries = myTS[myTS.day_of_week == day_of_week] -robert From pgmdevlist at gmail.com Wed Jan 27 16:42:09 2010 From: pgmdevlist at gmail.com (Pierre GM) Date: Wed, 27 Jan 2010 16:42:09 -0500 Subject: [SciPy-User] find particular day_of_week In-Reply-To: References: Message-ID: On Jan 27, 2010, at 4:18 PM, Robert Ferrell wrote: > I have a time series, I want to operate (read only) on all the items > with a particular day_of_week. What is an efficient way to get at > those? >>> import scikits.timeseries as ts >>> s = ts.time_series(np.arange(365), start_date="2010-01-01", freq="D") >>> s[s.day_of_week == 1] That'll return you the points falling on Tuesdays (Mon=0, Sun=6) From ferrell at diablotech.com Wed Jan 27 16:48:31 2010 From: ferrell at diablotech.com (Robert Ferrell) Date: Wed, 27 Jan 2010 14:48:31 -0700 Subject: [SciPy-User] find particular day_of_week In-Reply-To: References: Message-ID: <875BD29B-CAA9-4FE9-B987-6478A9754C0B@diablotech.com> On Jan 27, 2010, at 2:42 PM, Pierre GM wrote: > On Jan 27, 2010, at 4:18 PM, Robert Ferrell wrote: >> I have a time series, I want to operate (read only) on all the items >> with a particular day_of_week. What is an efficient way to get at >> those? > > >>>> import scikits.timeseries as ts >>>> s = ts.time_series(np.arange(365), start_date="2010-01-01", >>>> freq="D") >>>> s[s.day_of_week == 1] > That'll return you the points falling on Tuesdays (Mon=0, Sun=6) Thanks. That's the solution I eventually found. I need to remember to look for the cool stuff in time series first, since it's usually in there. Fast, and easy to understand, too. thanks, -robert From kwgoodman at gmail.com Wed Jan 27 21:10:43 2010 From: kwgoodman at gmail.com (Keith Goodman) Date: Wed, 27 Jan 2010 18:10:43 -0800 Subject: [SciPy-User] [ANN] New open source project for labeled arrays Message-ID: I recently opened sourced one of my packages. It is a labeled array that I call larry. A two-dimensional larry, for example, contains a 2d NumPy array with labels on each row and column. A larry can have any dimension. Alignment by label is automatic when you add (or subtract, multiply, divide) two larrys. larry has built-in methods such as movingsum, ranking, merge, shuffle, zscore, demean, lag as well as typical NumPy methods like sum, max, std, sign, clip. NaNs are treated as missing data. You can archive larrys in HDF5 format using save and load or using a dictionary-like interface. I'm working towards a 0.1 release. In the meantime, comments, suggestions, critiques are all appreciated. To use larry you need Python and NumPy 1.4 or newer. To save and load larrys in HDF5 format, you need h5py with HDF5 1.8. larry currently contains no extensions, just Python code, so there is nothing to compile. Just save the la package and make sure Python can find it. 
docs http://larry.sourceforge.net code https://launchpad.net/larry From wesmckinn at gmail.com Wed Jan 27 21:33:32 2010 From: wesmckinn at gmail.com (Wes McKinney) Date: Wed, 27 Jan 2010 21:33:32 -0500 Subject: [SciPy-User] [Numpy-discussion] [ANN] New open source project for labeled arrays In-Reply-To: References: Message-ID: <6c476c8a1001271833m331828a1sfde1c8fe27a67ea6@mail.gmail.com> On Wed, Jan 27, 2010 at 9:10 PM, Keith Goodman wrote: > I recently opened sourced one of my packages. It is a labeled array > that I call larry. > > A two-dimensional larry, for example, contains a 2d NumPy array with > labels on each row and column. A larry can have any dimension. > > Alignment by label is automatic when you add (or subtract, multiply, > divide) two larrys. > > larry has built-in methods such as movingsum, ranking, merge, shuffle, > zscore, demean, lag as well as typical NumPy methods like sum, max, > std, sign, clip. NaNs are treated as missing data. > > You can archive larrys in HDF5 format using save and load or using a > dictionary-like interface. > > I'm working towards a 0.1 release. In the meantime, comments, > suggestions, critiques are all appreciated. > > To use larry you need Python and NumPy 1.4 or newer. To save and load > larrys in HDF5 format, you need h5py with HDF5 1.8. > > larry currently contains no extensions, just Python code, so there is > nothing to compile. Just save the la package and make sure Python can > find it. > > docs ?http://larry.sourceforge.net > code ?https://launchpad.net/larry > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion at scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > Cool! Thanks for releasing. Looks like you're solving some similar problems to the ones I built pandas for (http://pandas.sourceforge.net). I'll have to have a closer look at the implementation to see if there are some design commonalities we can benefit from. - Wes From kwgoodman at gmail.com Wed Jan 27 21:57:41 2010 From: kwgoodman at gmail.com (Keith Goodman) Date: Wed, 27 Jan 2010 18:57:41 -0800 Subject: [SciPy-User] [Numpy-discussion] [ANN] New open source project for labeled arrays In-Reply-To: <6c476c8a1001271833m331828a1sfde1c8fe27a67ea6@mail.gmail.com> References: <6c476c8a1001271833m331828a1sfde1c8fe27a67ea6@mail.gmail.com> Message-ID: On Wed, Jan 27, 2010 at 6:33 PM, Wes McKinney wrote: > On Wed, Jan 27, 2010 at 9:10 PM, Keith Goodman wrote: >> I recently opened sourced one of my packages. It is a labeled array >> that I call larry. >> >> A two-dimensional larry, for example, contains a 2d NumPy array with >> labels on each row and column. A larry can have any dimension. >> >> Alignment by label is automatic when you add (or subtract, multiply, >> divide) two larrys. >> >> larry has built-in methods such as movingsum, ranking, merge, shuffle, >> zscore, demean, lag as well as typical NumPy methods like sum, max, >> std, sign, clip. NaNs are treated as missing data. >> >> You can archive larrys in HDF5 format using save and load or using a >> dictionary-like interface. >> >> I'm working towards a 0.1 release. In the meantime, comments, >> suggestions, critiques are all appreciated. >> >> To use larry you need Python and NumPy 1.4 or newer. To save and load >> larrys in HDF5 format, you need h5py with HDF5 1.8. >> >> larry currently contains no extensions, just Python code, so there is >> nothing to compile. Just save the la package and make sure Python can >> find it. 
>> >> docs ?http://larry.sourceforge.net >> code ?https://launchpad.net/larry >> _______________________________________________ >> NumPy-Discussion mailing list >> NumPy-Discussion at scipy.org >> http://mail.scipy.org/mailman/listinfo/numpy-discussion >> > > Cool! Thanks for releasing. > > Looks like you're solving some similar problems to the ones I built > pandas for (http://pandas.sourceforge.net). I'll have to have a closer > look at the implementation to see if there are some design > commonalities we can benefit from. Yes, I hope we have some overlap so that we can share code. As far as design goes, larry contains a Numpy array for the data and a list of lists (one list for each dimension) for the labels. Most of the larry methods have underlying Numpy array functions that could easily be used by other projects. There are also functions for repacking HDF5 archives and for creating intermediate HDF5 Groups when saving a Dataset inside nested Groups. All this is transparent to the user but hopefully useful for other projects. From denis-bz-gg at t-online.de Thu Jan 28 07:04:50 2010 From: denis-bz-gg at t-online.de (denis) Date: Thu, 28 Jan 2010 04:04:50 -0800 (PST) Subject: [SciPy-User] interp1d and out of bounds values In-Reply-To: <0C3454D1-56E0-4D6F-B556-11367E2519F2@gmail.com> References: <0C3454D1-56E0-4D6F-B556-11367E2519F2@gmail.com> Message-ID: On Jan 22, 6:17?pm, Thomas Robitaille wrote: > Hello, > > I've been using scipy.interpolate.interp1d to interpolate values in a number of different projects. However, something I often need is the following: if the interpolating function is defined as f(x) from xmin to xmax, if I specify an x value smaller than xmin, I would like the value set to f(xmin), and if the value is above xmax, I would like the value set to xmax. Thomas, interp1d( np.clip(x, xmin, xmax), y ) ? cheers -- denis From denis-bz-gg at t-online.de Thu Jan 28 09:20:20 2010 From: denis-bz-gg at t-online.de (denis) Date: Thu, 28 Jan 2010 06:20:20 -0800 (PST) Subject: [SciPy-User] Splines in scipy.signal vs scipy.interpolation In-Reply-To: <9AF13441-AFE5-4568-9438-4E98D6E99EDF@mit.edu> References: <9AF13441-AFE5-4568-9438-4E98D6E99EDF@mit.edu> Message-ID: <8b9578e2-7308-4c8b-95cf-57bb49572029@v25g2000yqk.googlegroups.com> On Jan 20, 11:56?pm, Tony S Yu wrote: > I'm having trouble making splines from scipy.signal work with those in scipy.interpolation. > > Both packages have functions for creating (`signal.cspline1d`/`interpolate.splrep`) and evaluating (`signal.cspline1d_eval`/`interpolate.splev`) splines. There are, of course, huge differences between these functions, which is why I'm trying to get them to talk to each other. > > In particular, I'd like to create a smoothing spline using `cspline1d` (which allows easier smoothing) and evaluate using `splev` (which allows me to get derivatives of the spline). Tony, bouncing between two murky packages doesn't sound as though it'll converge ... interpolate though has both smoothing and derivs -- interpolator = interpolate.UnivariateSpline( x, y, k=3, s=s ) # s=0 interpolates yy = interpolator( xx ) y1 = interpolator( xx, 1 ) # deriv Just curious, are your real knots uniform, how many ? See also http://projects.scipy.org/scipy/ticket/864 "The documentation for class scipy.interpolate.UnivariateSpline? is misleading, and maybe completely wrong. UnivariateSpline? behaves in ways that are unpredictable ... (Fitpack is just a big dense package => big dense doc.) 
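A runnable version of that UnivariateSpline recipe, with made-up noisy data (s trades smoothness against fidelity; s=0 interpolates):

import numpy as np
from scipy import interpolate

x = np.linspace(0, 2*np.pi, 50)
y = np.sin(x) + 0.1*np.random.randn(50)

# s=0 interpolates; larger s gives a smoother curve.
spline = interpolate.UnivariateSpline(x, y, k=3, s=0.5)

xx = np.linspace(0, 2*np.pi, 200)
yy = spline(xx)      # smoothed values
y1 = spline(xx, 1)   # first derivative, as in interpolator(xx, 1) above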
cheers -- denis From gokhansever at gmail.com Thu Jan 28 12:36:14 2010 From: gokhansever at gmail.com (=?UTF-8?Q?G=C3=B6khan_Sever?=) Date: Thu, 28 Jan 2010 11:36:14 -0600 Subject: [SciPy-User] Gamma distribution in scipy.stats Message-ID: <49d6b3501001280936u7e92401ap279baf5c662426a6@mail.gmail.com> Hello, Could someone explain to me why doesn't scipy explicitly use the location and scaling parameters representing its PDF? From http://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.gamma.html#scipy.stats.gamma gamma.pdf(x,a) = x**(a-1)*exp(-x)/gamma(a) for x >= 0, a > 0. Gamma is represented with two or three parameters; as the function prototype shows correctly. Also both GSL and R is fine representing the Gamma PDF with two parameters: http://www.gnu.org/software/gsl/manual/html_node/The-Gamma-Distribution.html http://stat.ethz.ch/R-manual/R-patched/library/stats/html/GammaDist.html Should these instances need to be updated? PS: I am wrapping the distribution definitions given in the GSL library for SAGE. While doing this I want to make sure my understanding and use of the terms and concepts are correct. Thanks. -- G?khan -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Thu Jan 28 12:45:01 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 28 Jan 2010 12:45:01 -0500 Subject: [SciPy-User] Gamma distribution in scipy.stats In-Reply-To: <49d6b3501001280936u7e92401ap279baf5c662426a6@mail.gmail.com> References: <49d6b3501001280936u7e92401ap279baf5c662426a6@mail.gmail.com> Message-ID: <1cd32cbb1001280945u785c2f55ud450d567088f8281@mail.gmail.com> On Thu, Jan 28, 2010 at 12:36 PM, G?khan Sever wrote: > Hello, > > Could someone explain to me why doesn't scipy explicitly use the location > and scaling parameters representing its PDF? > > From > http://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.gamma.html#scipy.stats.gamma > gamma.pdf(x,a) = x**(a-1)*exp(-x)/gamma(a) for x >= 0, a > 0. > > Gamma is represented with two or three parameters; as the function prototype > shows correctly. > > Also both GSL and R is fine representing the Gamma PDF with two parameters: > > http://www.gnu.org/software/gsl/manual/html_node/The-Gamma-Distribution.html > http://stat.ethz.ch/R-manual/R-patched/library/stats/html/GammaDist.html > > Should these instances need to be updated? This should be the same if you replace `x` in the pdf by `(x-loc)/scale` loc and scale are handled generically, while the individual _pdf method only defines the standardized distribution. I think I checked the algebra once for gamma, but I'm not sure. Josef > > PS: I am wrapping the distribution definitions given in the GSL library for > SAGE. While doing this I want to make sure my understanding and use of the > terms and concepts are correct. > > Thanks. 
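A numerical check of the loc/scale handling josef describes -- note that the transformed density also picks up a factor 1/scale. A sketch assuming scipy.stats is importable; the parameter values are arbitrary:

from scipy import stats
import numpy as np

a, loc, scale = 2.5, 1.0, 3.0
x = np.linspace(1.5, 20.0, 5)

# Generic loc/scale handling: pdf(x; a, loc, scale)
# equals pdf((x - loc)/scale; a) / scale.
p1 = stats.gamma.pdf(x, a, loc=loc, scale=scale)
p2 = stats.gamma.pdf((x - loc) / scale, a) / scale
print(np.allclose(p1, p2))   # True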
> > -- > G?khan > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > From robert.kern at gmail.com Thu Jan 28 12:50:13 2010 From: robert.kern at gmail.com (Robert Kern) Date: Thu, 28 Jan 2010 11:50:13 -0600 Subject: [SciPy-User] Gamma distribution in scipy.stats In-Reply-To: <49d6b3501001280936u7e92401ap279baf5c662426a6@mail.gmail.com> References: <49d6b3501001280936u7e92401ap279baf5c662426a6@mail.gmail.com> Message-ID: <3d375d731001280950j184142a2o8e1e75d20b64a81c@mail.gmail.com> On Thu, Jan 28, 2010 at 11:36, G?khan Sever wrote: > Hello, > > Could someone explain to me why doesn't scipy explicitly use the location > and scaling parameters representing its PDF? Because the transformation for the location and scale parameters are the same for every PDF and is well known. However, including them in the formula often clutters it up and obscures the differences between PDFs. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From gokhansever at gmail.com Thu Jan 28 13:08:19 2010 From: gokhansever at gmail.com (=?UTF-8?Q?G=C3=B6khan_Sever?=) Date: Thu, 28 Jan 2010 12:08:19 -0600 Subject: [SciPy-User] Gamma distribution in scipy.stats In-Reply-To: <3d375d731001280950j184142a2o8e1e75d20b64a81c@mail.gmail.com> References: <49d6b3501001280936u7e92401ap279baf5c662426a6@mail.gmail.com> <3d375d731001280950j184142a2o8e1e75d20b64a81c@mail.gmail.com> Message-ID: <49d6b3501001281008u14d26b42saa2f5583ce92b018@mail.gmail.com> On Thu, Jan 28, 2010 at 11:50 AM, Robert Kern wrote: > On Thu, Jan 28, 2010 at 11:36, G?khan Sever wrote: > > Hello, > > > > Could someone explain to me why doesn't scipy explicitly use the location > > and scaling parameters representing its PDF? > > Because the transformation for the location and scale parameters are > the same for every PDF and is well known. However, including them in > the formula often clutters it up and obscures the differences between > PDFs. > > GSL and R doesn't use the location parameter then. And Numpy's Gamma PDF includes scaling in the formulae ( http://docs.scipy.org/doc/numpy/reference/generated/numpy.random.gamma.html#numpy.random.gamma). I am guessing that everyone has its own style when it comes to represent the distributions. Not so surprisingly it is shown in a different form in my Cloud and Precipitation Parametrization book. I suggest to add a statement like: Gamma distribution is mainly used to represent precipitation distribution in bulk cloud parametrization schemes. > -- > Robert Kern > > "I have come to believe that the whole world is an enigma, a harmless > enigma that is made terrible by our own mad attempt to interpret it as > though it had an underlying truth." > -- Umberto Eco > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > -- G?khan -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From josef.pktd at gmail.com Thu Jan 28 13:16:51 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 28 Jan 2010 13:16:51 -0500 Subject: [SciPy-User] Gamma distribution in scipy.stats In-Reply-To: <49d6b3501001281008u14d26b42saa2f5583ce92b018@mail.gmail.com> References: <49d6b3501001280936u7e92401ap279baf5c662426a6@mail.gmail.com> <3d375d731001280950j184142a2o8e1e75d20b64a81c@mail.gmail.com> <49d6b3501001281008u14d26b42saa2f5583ce92b018@mail.gmail.com> Message-ID: <1cd32cbb1001281016l249ff126n402317a1365cf24d@mail.gmail.com> On Thu, Jan 28, 2010 at 1:08 PM, G?khan Sever wrote: > > > On Thu, Jan 28, 2010 at 11:50 AM, Robert Kern wrote: >> >> On Thu, Jan 28, 2010 at 11:36, G?khan Sever wrote: >> > Hello, >> > >> > Could someone explain to me why doesn't scipy explicitly use the >> > location >> > and scaling parameters representing its PDF? >> >> Because the transformation for the location and scale parameters are >> the same for every PDF and is well known. However, including them in >> the formula often clutters it up and obscures the differences between >> PDFs. >> > > GSL and R doesn't use the location parameter then. And Numpy's Gamma PDF > includes scaling in the formulae > (http://docs.scipy.org/doc/numpy/reference/generated/numpy.random.gamma.html#numpy.random.gamma). > I am guessing that everyone has its own style when it comes to represent the > distributions. Not so surprisingly it is shown in a different form in my > Cloud and Precipitation Parametrization book. > > I suggest to add a statement like: Gamma distribution is mainly used to > represent precipitation distribution in bulk cloud parametrization schemes. I don't think "mainly" is a correct description, a random quote after a short google search "The Gamma distribution is widely used in engineering, science, and business, to model continuous variables that are always positive and have skewed distributions" It's a pretty common distribution. Josef > > >> >> -- >> Robert Kern >> >> "I have come to believe that the whole world is an enigma, a harmless >> enigma that is made terrible by our own mad attempt to interpret it as >> though it had an underlying truth." 
>> ?-- Umberto Eco >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user > > > > -- > G?khan > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > From gokhansever at gmail.com Thu Jan 28 13:24:49 2010 From: gokhansever at gmail.com (=?UTF-8?Q?G=C3=B6khan_Sever?=) Date: Thu, 28 Jan 2010 12:24:49 -0600 Subject: [SciPy-User] Gamma distribution in scipy.stats In-Reply-To: <1cd32cbb1001281016l249ff126n402317a1365cf24d@mail.gmail.com> References: <49d6b3501001280936u7e92401ap279baf5c662426a6@mail.gmail.com> <3d375d731001280950j184142a2o8e1e75d20b64a81c@mail.gmail.com> <49d6b3501001281008u14d26b42saa2f5583ce92b018@mail.gmail.com> <1cd32cbb1001281016l249ff126n402317a1365cf24d@mail.gmail.com> Message-ID: <49d6b3501001281024j3a707dd3p1cbd537a56e180c1@mail.gmail.com> On Thu, Jan 28, 2010 at 12:16 PM, wrote: > On Thu, Jan 28, 2010 at 1:08 PM, G?khan Sever > wrote: > > > > > > On Thu, Jan 28, 2010 at 11:50 AM, Robert Kern > wrote: > >> > >> On Thu, Jan 28, 2010 at 11:36, G?khan Sever > wrote: > >> > Hello, > >> > > >> > Could someone explain to me why doesn't scipy explicitly use the > >> > location > >> > and scaling parameters representing its PDF? > >> > >> Because the transformation for the location and scale parameters are > >> the same for every PDF and is well known. However, including them in > >> the formula often clutters it up and obscures the differences between > >> PDFs. > >> > > > > GSL and R doesn't use the location parameter then. And Numpy's Gamma PDF > > includes scaling in the formulae > > ( > http://docs.scipy.org/doc/numpy/reference/generated/numpy.random.gamma.html#numpy.random.gamma > ). > > I am guessing that everyone has its own style when it comes to represent > the > > distributions. Not so surprisingly it is shown in a different form in my > > Cloud and Precipitation Parametrization book. > > > > I suggest to add a statement like: Gamma distribution is mainly used to > > represent precipitation distribution in bulk cloud parametrization > schemes. > > I don't think "mainly" is a correct description, > I wanted to say that it is the most commonly used distribution in cloud modelling. Most of time Gamma is used but sometimes Log-normal distribution and other distribution forms are used as well. Maybe adding an "also" -- is also mainly used" makes it clearer? > > a random quote after a short google search > "The Gamma distribution is widely used in engineering, science, and > business, to model continuous variables that are always positive and > have skewed distributions" > > It's a pretty common distribution. > > Josef > > > > > > > >> > >> -- > >> Robert Kern > >> > >> "I have come to believe that the whole world is an enigma, a harmless > >> enigma that is made terrible by our own mad attempt to interpret it as > >> though it had an underlying truth." 
> >> -- Umberto Eco > >> _______________________________________________ > >> SciPy-User mailing list > >> SciPy-User at scipy.org > >> http://mail.scipy.org/mailman/listinfo/scipy-user > > > > > > > > -- > > G?khan > > > > _______________________________________________ > > SciPy-User mailing list > > SciPy-User at scipy.org > > http://mail.scipy.org/mailman/listinfo/scipy-user > > > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > -- G?khan -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Thu Jan 28 13:36:26 2010 From: robert.kern at gmail.com (Robert Kern) Date: Thu, 28 Jan 2010 12:36:26 -0600 Subject: [SciPy-User] Gamma distribution in scipy.stats In-Reply-To: <49d6b3501001281024j3a707dd3p1cbd537a56e180c1@mail.gmail.com> References: <49d6b3501001280936u7e92401ap279baf5c662426a6@mail.gmail.com> <3d375d731001280950j184142a2o8e1e75d20b64a81c@mail.gmail.com> <49d6b3501001281008u14d26b42saa2f5583ce92b018@mail.gmail.com> <1cd32cbb1001281016l249ff126n402317a1365cf24d@mail.gmail.com> <49d6b3501001281024j3a707dd3p1cbd537a56e180c1@mail.gmail.com> Message-ID: <3d375d731001281036h1c0639daq4974643bff732432@mail.gmail.com> On Thu, Jan 28, 2010 at 12:24, G?khan Sever wrote: > I wanted to say that it is the most commonly used distribution in cloud > modelling. Most of time Gamma is used but sometimes Log-normal distribution > and other distribution forms are used as well. > > Maybe adding an "also" -- is also mainly used" makes it clearer? I don't think that adding such specific use cases is beneficial. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From bsouthey at gmail.com Thu Jan 28 14:07:45 2010 From: bsouthey at gmail.com (Bruce Southey) Date: Thu, 28 Jan 2010 13:07:45 -0600 Subject: [SciPy-User] Gamma distribution in scipy.stats In-Reply-To: <1cd32cbb1001281016l249ff126n402317a1365cf24d@mail.gmail.com> References: <49d6b3501001280936u7e92401ap279baf5c662426a6@mail.gmail.com> <3d375d731001280950j184142a2o8e1e75d20b64a81c@mail.gmail.com> <49d6b3501001281008u14d26b42saa2f5583ce92b018@mail.gmail.com> <1cd32cbb1001281016l249ff126n402317a1365cf24d@mail.gmail.com> Message-ID: <4B61E081.1090009@gmail.com> On 01/28/2010 12:16 PM, josef.pktd at gmail.com wrote: > On Thu, Jan 28, 2010 at 1:08 PM, G?khan Sever wrote: > >> >> On Thu, Jan 28, 2010 at 11:50 AM, Robert Kern wrote: >> >>> On Thu, Jan 28, 2010 at 11:36, G?khan Sever wrote: >>> >>>> Hello, >>>> >>>> Could someone explain to me why doesn't scipy explicitly use the >>>> location >>>> and scaling parameters representing its PDF? >>>> >>> Because the transformation for the location and scale parameters are >>> the same for every PDF and is well known. However, including them in >>> the formula often clutters it up and obscures the differences between >>> PDFs. >>> >>> >> GSL and R doesn't use the location parameter then. And Numpy's Gamma PDF >> includes scaling in the formulae >> (http://docs.scipy.org/doc/numpy/reference/generated/numpy.random.gamma.html#numpy.random.gamma). >> I am guessing that everyone has its own style when it comes to represent the >> distributions. Not so surprisingly it is shown in a different form in my >> Cloud and Precipitation Parametrization book. 
>> >> I suggest to add a statement like: Gamma distribution is mainly used to >> represent precipitation distribution in bulk cloud parametrization schemes. >> > I don't think "mainly" is a correct description, > > a random quote after a short google search > "The Gamma distribution is widely used in engineering, science, and > business, to model continuous variables that are always positive and > have skewed distributions" > > It's a pretty common distribution. > > Josef > > > I think what G?khan is getting at is that limited description provided in the documentation link: "The Gamma distribution is often used to model the times to failure of electronic components, and arises naturally in processes for which the waiting times between Poisson distributed events are relevant." I actually think that part should be removed from the documentation. Bruce -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Thu Jan 28 14:12:48 2010 From: robert.kern at gmail.com (Robert Kern) Date: Thu, 28 Jan 2010 13:12:48 -0600 Subject: [SciPy-User] Gamma distribution in scipy.stats In-Reply-To: <4B61E081.1090009@gmail.com> References: <49d6b3501001280936u7e92401ap279baf5c662426a6@mail.gmail.com> <3d375d731001280950j184142a2o8e1e75d20b64a81c@mail.gmail.com> <49d6b3501001281008u14d26b42saa2f5583ce92b018@mail.gmail.com> <1cd32cbb1001281016l249ff126n402317a1365cf24d@mail.gmail.com> <4B61E081.1090009@gmail.com> Message-ID: <3d375d731001281112h7a6fb436w8384bb163a9d94b0@mail.gmail.com> On Thu, Jan 28, 2010 at 13:07, Bruce Southey wrote: > I think what? G?khan is getting at is that limited description provided in > the documentation link: > "The Gamma distribution is often used to model the times to failure of > electronic components, and arises naturally in processes for which the > waiting times between Poisson distributed events are relevant." > > I actually think that part should be removed from the documentation. I think "of electronic components" should be removed. I think the rest is fine and useful. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From gokhansever at gmail.com Thu Jan 28 15:22:40 2010 From: gokhansever at gmail.com (=?UTF-8?Q?G=C3=B6khan_Sever?=) Date: Thu, 28 Jan 2010 14:22:40 -0600 Subject: [SciPy-User] Gamma distribution in scipy.stats In-Reply-To: <4B61E081.1090009@gmail.com> References: <49d6b3501001280936u7e92401ap279baf5c662426a6@mail.gmail.com> <3d375d731001280950j184142a2o8e1e75d20b64a81c@mail.gmail.com> <49d6b3501001281008u14d26b42saa2f5583ce92b018@mail.gmail.com> <1cd32cbb1001281016l249ff126n402317a1365cf24d@mail.gmail.com> <4B61E081.1090009@gmail.com> Message-ID: <49d6b3501001281222i79fe1af5rf3c3536959ef59d2@mail.gmail.com> On Thu, Jan 28, 2010 at 1:07 PM, Bruce Southey wrote: > On 01/28/2010 12:16 PM, josef.pktd at gmail.com wrote: > > On Thu, Jan 28, 2010 at 1:08 PM, G?khan Sever wrote: > > > On Thu, Jan 28, 2010 at 11:50 AM, Robert Kern wrote: > > > On Thu, Jan 28, 2010 at 11:36, G?khan Sever wrote: > > > Hello, > > Could someone explain to me why doesn't scipy explicitly use the > location > and scaling parameters representing its PDF? > > > Because the transformation for the location and scale parameters are > the same for every PDF and is well known. However, including them in > the formula often clutters it up and obscures the differences between > PDFs. 
> > > > GSL and R doesn't use the location parameter then. And Numpy's Gamma PDF > includes scaling in the formulae > (http://docs.scipy.org/doc/numpy/reference/generated/numpy.random.gamma.html#numpy.random.gamma). > I am guessing that everyone has its own style when it comes to represent the > distributions. Not so surprisingly it is shown in a different form in my > Cloud and Precipitation Parametrization book. > > I suggest to add a statement like: Gamma distribution is mainly used to > represent precipitation distribution in bulk cloud parametrization schemes. > > > I don't think "mainly" is a correct description, > > a random quote after a short google search > "The Gamma distribution is widely used in engineering, science, and > business, to model continuous variables that are always positive and > have skewed distributions" > > It's a pretty common distribution. > > Josef > > > > > I think what G?khan is getting at is that limited description provided in > the documentation link: > "The Gamma distribution is often used to model the times to failure of > electronic components, and arises naturally in processes for which the > waiting times between Poisson distributed events are relevant." > > I actually think that part should be removed from the documentation. > > Bruce > > It doesn't really too much matter to me to include some extra information or not to this description, but for the sake of consistency it might be a better idea to leave this description from the Gamma distribution page. Since neither it nor my proposed addition properly generalizes the use of the distribution. Additionally, if such examples exist in other distributions they should be removed as well. > > > > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > -- G?khan -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Thu Jan 28 15:34:22 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 28 Jan 2010 15:34:22 -0500 Subject: [SciPy-User] Gamma distribution in scipy.stats In-Reply-To: <49d6b3501001281222i79fe1af5rf3c3536959ef59d2@mail.gmail.com> References: <49d6b3501001280936u7e92401ap279baf5c662426a6@mail.gmail.com> <3d375d731001280950j184142a2o8e1e75d20b64a81c@mail.gmail.com> <49d6b3501001281008u14d26b42saa2f5583ce92b018@mail.gmail.com> <1cd32cbb1001281016l249ff126n402317a1365cf24d@mail.gmail.com> <4B61E081.1090009@gmail.com> <49d6b3501001281222i79fe1af5rf3c3536959ef59d2@mail.gmail.com> Message-ID: <1cd32cbb1001281234k35bdf1b7u1969d881fb12f9ed@mail.gmail.com> On Thu, Jan 28, 2010 at 3:22 PM, G?khan Sever wrote: > > > On Thu, Jan 28, 2010 at 1:07 PM, Bruce Southey wrote: >> >> On 01/28/2010 12:16 PM, josef.pktd at gmail.com wrote: >> >> On Thu, Jan 28, 2010 at 1:08 PM, G?khan Sever >> wrote: >> >> >> On Thu, Jan 28, 2010 at 11:50 AM, Robert Kern >> wrote: >> >> >> On Thu, Jan 28, 2010 at 11:36, G?khan Sever wrote: >> >> >> Hello, >> >> Could someone explain to me why doesn't scipy explicitly use the >> location >> and scaling parameters representing its PDF? >> >> >> Because the transformation for the location and scale parameters are >> the same for every PDF and is well known. However, including them in >> the formula often clutters it up and obscures the differences between >> PDFs. >> >> >> >> GSL and R doesn't use the location parameter then. 
And Numpy's Gamma PDF >> includes scaling in the formulae >> >> (http://docs.scipy.org/doc/numpy/reference/generated/numpy.random.gamma.html#numpy.random.gamma). >> I am guessing that everyone has its own style when it comes to represent >> the >> distributions. Not so surprisingly it is shown in a different form in my >> Cloud and Precipitation Parametrization book. >> >> I suggest to add a statement like: Gamma distribution is mainly used to >> represent precipitation distribution in bulk cloud parametrization >> schemes. >> >> >> I don't think "mainly" is a correct description, >> >> a random quote after a short google search >> "The Gamma distribution is widely used in engineering, science, and >> business, to model continuous variables that are always positive and >> have skewed distributions" >> >> It's a pretty common distribution. >> >> Josef >> >> >> >> >> I think what? G?khan is getting at is that limited description provided in >> the documentation link: >> "The Gamma distribution is often used to model the times to failure of >> electronic components, and arises naturally in processes for which the >> waiting times between Poisson distributed events are relevant." >> >> I actually think that part should be removed from the documentation. >> >> Bruce >> > > It doesn't really too much matter to me to include some extra information or > not to this description, but for the sake of consistency it might be a > better idea to leave this description from the Gamma distribution page. > Since neither it nor my proposed addition properly generalizes the use of > the distribution. Additionally, if such examples exist in other > distributions they should be removed as well. this reminded me of a thread a while ago where I looked this up: For a given Poisson arrival process, the time between two arrivals is exponentially distributed, the time between k arrivals is gamma distributed. For many distribution, there are tables how different distributions are linked. I don't know whether some of this would be useful information in the docs. In many cases a quick look on Wikipedia is very informative about common application and the relationship between distributions. Josef >> >> >> >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user >> > > > > -- > G?khan > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > From ryanlists at gmail.com Thu Jan 28 15:39:21 2010 From: ryanlists at gmail.com (Ryan Krauss) Date: Thu, 28 Jan 2010 14:39:21 -0600 Subject: [SciPy-User] bug in signal.lsim2 In-Reply-To: References: Message-ID: I believe I have discovered a bug in signal.lsim2. ?I believe the short attached script illustrates the problem. ?I was trying to predict the response of a transfer function with a pure integrator: ? ? ? ? ? ? ?g G = ------------- ? ? ? ? ?s(s+p) to a finite width pulse. ?lsim2 seems to handle the step response just fine, but says that the pulse response is exactly 0.0 for the entire time of the simulation. ?Obviously, this isn't the right answer. I am running scipy 0.7.0 and numpy 1.2.1 on Ubuntu 9.04, but I also have the same problem on Windows running 0.7.1 and 1.4.0. Thanks, Ryan -------------- next part -------------- A non-text attachment was scrubbed... 
Name: lsim2_problem.py Type: text/x-python Size: 360 bytes Desc: not available URL: From afraser at lanl.gov Thu Jan 28 15:37:09 2010 From: afraser at lanl.gov (Andy Fraser) Date: Thu, 28 Jan 2010 13:37:09 -0700 Subject: [SciPy-User] I want something like numpy.put Message-ID: <87vdemezka.fsf@lanl.gov> I want to "paint" a distorted image onto a background. The distorted map is described by an array of ordered pairs called "ij". I get the effect that I want from the following loop:

for x in xrange(w):
    for y in xrange(h):
        dest[ij[x,y,0], ij[x,y,1]] = source[x,y]

Each assignment operates on an RGB pixel vector. Is there a single fast numpy call that achieves the same effect? Andy From josef.pktd at gmail.com Thu Jan 28 16:05:53 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 28 Jan 2010 16:05:53 -0500 Subject: [SciPy-User] bug in signal.lsim2 In-Reply-To: References: Message-ID: <1cd32cbb1001281305r38e56f1cp3d7269abe61c7302@mail.gmail.com> On Thu, Jan 28, 2010 at 3:39 PM, Ryan Krauss wrote:
> I believe I have discovered a bug in signal.lsim2. I believe the
> short attached script illustrates the problem. I was trying to
> predict the response of a transfer function with a pure integrator:
>
>              g
> G = -------------
>          s(s+p)
>
> to a finite width pulse. lsim2 seems to handle the step response just
> fine, but says that the pulse response is exactly 0.0 for the entire
> time of the simulation. Obviously, this isn't the right answer.
>
> I am running scipy 0.7.0 and numpy 1.2.1 on Ubuntu 9.04, but I also
> have the same problem on Windows running 0.7.1 and 1.4.0.
>
> Thanks,
>
> Ryan

When I add a small noise

u2 = zeros(N) + 1e-14

or for

u2[:50] = amp

or for

u2[50:200] = amp

it seems to work. This might be a tricky bug. Josef > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > From bsouthey at gmail.com Thu Jan 28 16:14:50 2010 From: bsouthey at gmail.com (Bruce Southey) Date: Thu, 28 Jan 2010 15:14:50 -0600 Subject: [SciPy-User] Gamma distribution in scipy.stats In-Reply-To: <1cd32cbb1001281234k35bdf1b7u1969d881fb12f9ed@mail.gmail.com> References: <49d6b3501001280936u7e92401ap279baf5c662426a6@mail.gmail.com> <3d375d731001280950j184142a2o8e1e75d20b64a81c@mail.gmail.com> <49d6b3501001281008u14d26b42saa2f5583ce92b018@mail.gmail.com> <1cd32cbb1001281016l249ff126n402317a1365cf24d@mail.gmail.com> <4B61E081.1090009@gmail.com> <49d6b3501001281222i79fe1af5rf3c3536959ef59d2@mail.gmail.com> <1cd32cbb1001281234k35bdf1b7u1969d881fb12f9ed@mail.gmail.com> Message-ID: <4B61FE4A.6030003@gmail.com> On 01/28/2010 02:34 PM, josef.pktd at gmail.com wrote: > [snip] > > For many distribution, there are tables how different distributions > are linked. I don't know whether some of this would be useful > information in the docs. In many cases a quick look on Wikipedia is > very informative about common application and the relationship between > distributions. > > Josef Somewhat off topic, but see Leemis (1986), 'Relationships among common univariate distributions', American Statistician 40:143-146, http://www.jstor.org/stable/2684876 (preprint: www.math.wm.edu/~leemis/2008amstat.pdf). Also see: http://www.johndcook.com/distribution_chart.html Bruce -------------- next part -------------- An HTML attachment was scrubbed...
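One vectorized candidate for Andy's double loop is a single fancy-indexed assignment -- a sketch with made-up shapes, assuming ij has shape (w, h, 2) and dest/source hold RGB vectors:

import numpy as np

w, h = 4, 3
source = np.random.randint(0, 255, size=(w, h, 3))
dest = np.zeros((10, 10, 3), dtype=source.dtype)
ij = np.random.randint(0, 10, size=(w, h, 2))

# One assignment replaces the double loop; if several source pixels
# map to the same destination pixel, only one of them is kept.
dest[ij[..., 0], ij[..., 1]] = source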
URL: From josef.pktd at gmail.com Thu Jan 28 16:29:52 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 28 Jan 2010 16:29:52 -0500 Subject: [SciPy-User] Gamma distribution in scipy.stats In-Reply-To: <4B61FE4A.6030003@gmail.com> References: <49d6b3501001280936u7e92401ap279baf5c662426a6@mail.gmail.com> <3d375d731001280950j184142a2o8e1e75d20b64a81c@mail.gmail.com> <49d6b3501001281008u14d26b42saa2f5583ce92b018@mail.gmail.com> <1cd32cbb1001281016l249ff126n402317a1365cf24d@mail.gmail.com> <4B61E081.1090009@gmail.com> <49d6b3501001281222i79fe1af5rf3c3536959ef59d2@mail.gmail.com> <1cd32cbb1001281234k35bdf1b7u1969d881fb12f9ed@mail.gmail.com> <4B61FE4A.6030003@gmail.com> Message-ID: <1cd32cbb1001281329g36d4d265o91f0487649626cc0@mail.gmail.com> On Thu, Jan 28, 2010 at 4:14 PM, Bruce Southey wrote: > On 01/28/2010 02:34 PM, josef.pktd at gmail.com wrote: > > [snip] > > For many distribution, there are tables how different distributions > are linked. I don't know whether some of this would be useful > information in the docs. In many cases a quick look on Wikipedia is > very informative about common application and the relationship between > distributions. > > Josef > > > Some what off topic but see Leemis (1986) 'relationships among common > univariate distributions' in American Statistician 40:143-146 > http://www.jstor.org/stable/2684876 > www.math.wm.edu/~leemis/2008amstat.pdf > > > Also see: > http://www.johndcook.com/distribution_chart.html Yes that's what I was thinking of, I have the first article, but I don't think I have seen johndcook before. Josef > > Bruce > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > From tsyu80 at gmail.com Thu Jan 28 17:14:23 2010 From: tsyu80 at gmail.com (Tony S Yu) Date: Thu, 28 Jan 2010 17:14:23 -0500 Subject: [SciPy-User] Splines in scipy.signal vs scipy.interpolation In-Reply-To: <8b9578e2-7308-4c8b-95cf-57bb49572029@v25g2000yqk.googlegroups.com> References: <9AF13441-AFE5-4568-9438-4E98D6E99EDF@mit.edu> <8b9578e2-7308-4c8b-95cf-57bb49572029@v25g2000yqk.googlegroups.com> Message-ID: <733616BB-EC5C-43E5-B8F9-FA26EB538B9E@gmail.com> On Jan 28, 2010, at 9:20 AM, denis wrote: > > > On Jan 20, 11:56 pm, Tony S Yu wrote: >> I'm having trouble making splines from scipy.signal work with those in scipy.interpolation. >> >> Both packages have functions for creating (`signal.cspline1d`/`interpolate.splrep`) and evaluating (`signal.cspline1d_eval`/`interpolate.splev`) splines. There are, of course, huge differences between these functions, which is why I'm trying to get them to talk to each other. >> >> In particular, I'd like to create a smoothing spline using `cspline1d` (which allows easier smoothing) and evaluate using `splev` (which allows me to get derivatives of the spline). > > Tony, > bouncing between two murky packages doesn't sound as though it'll > converge ... Agreed. This was more of a naive attempt to try and get the results that I wanted. > interpolate though has both smoothing and derivs -- > interpolator = interpolate.UnivariateSpline( x, y, k=3, s=s ) > # s=0 interpolates > yy = interpolator( xx ) > y1 = interpolator( xx, 1 ) # deriv You're right. When I originally read the docs for splrep, I had it in my head that the splines in scipy.interpolation didn't provide the "right" type of smoothing (don't ask me what "right" means---I have no idea). 
After taking some time to understand the interpolation module, I realize it does what I want. Thanks, Denis!

> Just curious, are your real knots uniform, how many ?

I'm actually converting some matlab code which tries to find the optimal smoothed spline, so the number of knots really depends on the data (the data is uniformly spaced with about 1000 points, but the knots depend on the smoothing---if I understand your question correctly).

BTW, if anyone else ever needs to compare results from matlab's `spaps` with smoothed splines using splrep or UnivariateSpline: note there's a big difference in the **default** error calculations in matlab and scipy. If you want to match matlab's error calculation, you need to pass in weights ("w") that match the weights used for the trapezoidal rule. Also there's a subtle but important difference between the error equations: in matlab "w" is outside the square of the differences; in scipy "w" is inside the square of the differences. In short, to match matlab's error calculation, you need to pass "w" to splrep or UnivariateSpline, where w = np.sqrt(trapz_weights(x))

>>> def trapz_weights(x):
>>>     dx = np.diff(x)
>>>     w = np.empty(x.shape)
>>>     w[1:-1] = (dx[1:] + dx[:-1])/2.
>>>     w[0] = dx[0] / 2.
>>>     w[-1] = dx[-1] / 2.
>>>     return w

Unfortunately, the splines produced by matlab and scipy don't really match (not sure why---different smoothing algorithms?), but at least their errors are the same. Cheers, -T

> See also http://projects.scipy.org/scipy/ticket/864
> "The documentation for class scipy.interpolate.UnivariateSpline is
> misleading, and maybe completely wrong.
> UnivariateSpline behaves in ways that are unpredictable ...
> (Fitpack is just a big dense package => big dense doc.)
>
> cheers
> -- denis
>
> _______________________________________________
> SciPy-User mailing list
> SciPy-User at scipy.org
> http://mail.scipy.org/mailman/listinfo/scipy-user

From ryanlists at gmail.com Thu Jan 28 18:00:19 2010 From: ryanlists at gmail.com (Ryan Krauss) Date: Thu, 28 Jan 2010 17:00:19 -0600 Subject: [SciPy-User] bug in signal.lsim2 In-Reply-To: <1cd32cbb1001281305r38e56f1cp3d7269abe61c7302@mail.gmail.com> References: <1cd32cbb1001281305r38e56f1cp3d7269abe61c7302@mail.gmail.com> Message-ID: Hmmm. Thanks. That solves the immediate problem. I am letting my students choose between Matlab and Python for projects in my course. This one may erode their confidence in Python/Scipy a bit.
> > Josef > > >> >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user >> >> > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From zachary.pincus at yale.edu Thu Jan 28 19:24:41 2010 From: zachary.pincus at yale.edu (Zachary Pincus) Date: Thu, 28 Jan 2010 19:24:41 -0500 Subject: [SciPy-User] I want something like numpy.put In-Reply-To: <87vdemezka.fsf@lanl.gov> References: <87vdemezka.fsf@lanl.gov> Message-ID: scipy.ndimage.map_coordinates is probably the closest you'll get, but it doesn't appear to be an exact match for your use case... It's a bit tricky to figure out, but in might be useful. I think there are a few tutorials about how it works online. Zach On Jan 28, 2010, at 3:37 PM, Andy Fraser wrote: > I want to "paint" a distorted image onto a background. The distorted > map is described by an array of ordered pairs called "ij". I get the > effect that I want from the following loop: > > > > for x in xrange(w): > for y in xrange(h): > dest[ij[x,y,0],ij[x,y,1]] = source[x,y] > > Each assignment operates on an rgb pixel vector. Is there a single > fast numpy call that achieves the same effect? > > Andy > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user From m.abdollahi at gmail.com Thu Jan 28 19:51:58 2010 From: m.abdollahi at gmail.com (persia) Date: Thu, 28 Jan 2010 16:51:58 -0800 (PST) Subject: [SciPy-User] [SciPy-user] help about lfilter ???? Message-ID: <27363601.post@talk.nabble.com> lfilter is the linear filtering function in the scipy.signal module. can someone tell me what is wrong with this function please ? I used this function in the following way and it gave me that error !! : In [204]: lfilter(array([1,2,3]),array([-4,5,6]),array([9,8,7,6,5,4,3,2,1]),zi=array([0,0])) --------------------------------------------------------------------------- Traceback (most recent call last) : linear_filter not available for this type why is it so ??!!!! what is wrong with this "type " ? what is it meant by this type anyway ?! i tried with the same arguments in matlab with the corresponding function and it worked !! and if i multiply the second argument vector say by .4, this will happen : lfilter(array([1,2,3]),.4*array([-4,5,6]),array([9,8,7,6,5,4,3,2,1]),zi=array([0,0])) (array([ -5.625 , -23.28125 , -68.7890625 , -148.40820312, -312.44384766, -633.16711426, -1276.37466431, -2557.71900177, -5120.46074867]), array([-10242.1544385 , -7682.56612301])) see ? no problem then ?!! isnt this wierd ? please help ! thanx -- View this message in context: http://old.nabble.com/help-about-lfilter------tp27363601p27363601.html Sent from the Scipy-User mailing list archive at Nabble.com. From josef.pktd at gmail.com Thu Jan 28 20:03:17 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 28 Jan 2010 20:03:17 -0500 Subject: [SciPy-User] bug in signal.lsim2 In-Reply-To: References: <1cd32cbb1001281305r38e56f1cp3d7269abe61c7302@mail.gmail.com> Message-ID: <1cd32cbb1001281703h26f1f4f9y9355a7192a156188@mail.gmail.com> On Thu, Jan 28, 2010 at 6:00 PM, Ryan Krauss wrote: > Hmmm. ?Thanks. ?That solves the immediate problem. ?I am letting my > students choose between Matlab and Python for projects in my course. > This one my erode their confidence in Python/Scipy a bit. 
It's always good to have a rough idea about whether the results are correct and not trust the computer too much, whether it's matlab or scipy. But if some of your students are willing to submit bug reports or tests, then the next generation of students can be more confident that it's the bugs in their own program that might be causing problems and not the code in scipy. Josef > > On Thu, Jan 28, 2010 at 3:05 PM, ? wrote: >> On Thu, Jan 28, 2010 at 3:39 PM, Ryan Krauss wrote: >>> I believe I have discovered a bug in signal.lsim2. ?I believe the >>> short attached script illustrates the problem. ?I was trying to >>> predict the response of a transfer function with a pure integrator: >>> >>> ? ? ? ? ? ? ?g >>> G = ------------- >>> ? ? ? ? ?s(s+p) >>> >>> to a finite width pulse. ?lsim2 seems to handle the step response just >>> fine, but says that the pulse response is exactly 0.0 for the entire >>> time of the simulation. ?Obviously, this isn't the right answer. >>> >>> I am running scipy 0.7.0 and numpy 1.2.1 on Ubuntu 9.04, but I also >>> have the same problem on Windows running 0.7.1 and 1.4.0. >>> >>> Thanks, >>> >>> Ryan >> >> When I add a small noise >> >> u2 = zeros(N) + 1e-14 >> >> or for >> u2[:50] = amp >> >> or for >> u2[50:200] = amp >> >> it seems to work. >> >> This might be a tricky bug. >> >> Josef >> >> >>> >>> _______________________________________________ >>> SciPy-User mailing list >>> SciPy-User at scipy.org >>> http://mail.scipy.org/mailman/listinfo/scipy-user >>> >>> >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user >> > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From gokhansever at gmail.com Thu Jan 28 20:10:26 2010 From: gokhansever at gmail.com (=?UTF-8?Q?G=C3=B6khan_Sever?=) Date: Thu, 28 Jan 2010 19:10:26 -0600 Subject: [SciPy-User] Gamma distribution in scipy.stats In-Reply-To: <4B61FE4A.6030003@gmail.com> References: <49d6b3501001280936u7e92401ap279baf5c662426a6@mail.gmail.com> <3d375d731001280950j184142a2o8e1e75d20b64a81c@mail.gmail.com> <49d6b3501001281008u14d26b42saa2f5583ce92b018@mail.gmail.com> <1cd32cbb1001281016l249ff126n402317a1365cf24d@mail.gmail.com> <4B61E081.1090009@gmail.com> <49d6b3501001281222i79fe1af5rf3c3536959ef59d2@mail.gmail.com> <1cd32cbb1001281234k35bdf1b7u1969d881fb12f9ed@mail.gmail.com> <4B61FE4A.6030003@gmail.com> Message-ID: <49d6b3501001281710p17af8b60w1394aa6308331b58@mail.gmail.com> On Thu, Jan 28, 2010 at 3:14 PM, Bruce Southey wrote: > On 01/28/2010 02:34 PM, josef.pktd at gmail.com wrote: > > [snip] > > For many distribution, there are tables how different distributions > are linked. I don't know whether some of this would be useful > information in the docs. In many cases a quick look on Wikipedia is > very informative about common application and the relationship between > distributions. > > Josef > > > > Some what off topic but see Leemis (1986) 'relationships among common > univariate distributions' in American Statistician 40:143-146 > http://www.jstor.org/stable/2684876 > www.math.wm.edu/~leemis/2008amstat.pdf > > > Also see: > http://www.johndcook.com/distribution_chart.html > > Bruce > Thanks Bruce, Very useful sources for me. 
> > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > -- G?khan -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Thu Jan 28 20:10:40 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 28 Jan 2010 20:10:40 -0500 Subject: [SciPy-User] [SciPy-user] help about lfilter ???? In-Reply-To: <27363601.post@talk.nabble.com> References: <27363601.post@talk.nabble.com> Message-ID: <1cd32cbb1001281710m3ddd7e26x309295a5b235fd50@mail.gmail.com> On Thu, Jan 28, 2010 at 7:51 PM, persia wrote: > > lfilter is the linear filtering function in the scipy.signal module. can > someone tell me what is wrong with this function please ? > > I used this function in the following way and it gave me that error !! : > > In [204]: > lfilter(array([1,2,3]),array([-4,5,6]),array([9,8,7,6,5,4,3,2,1]),zi=array([0,0])) > --------------------------------------------------------------------------- > ? ? ? ? ? ?Traceback (most recent call last) > > > > : linear_filter not available for this type > > > why is it so ??!!!! what is wrong with this "type " ? what is it meant by > this type anyway ?! i tried with the same arguments in matlab with the > corresponding function and it worked !! > and if i multiply the second argument vector say by .4, this will happen : > > lfilter(array([1,2,3]),.4*array([-4,5,6]),array([9,8,7,6,5,4,3,2,1]),zi=array([0,0])) > > > (array([ ? -5.625 ? ? , ? -23.28125 ? , ? -68.7890625 , ?-148.40820312, > ? ? ? ?-312.44384766, ?-633.16711426, -1276.37466431, -2557.71900177, > ? ? ? -5120.46074867]), > ?array([-10242.1544385 , ?-7682.56612301])) > > > see ? no problem then ?!! isnt this wierd ? please help ! it looks like lfilter doesn't like integers >>> signal.lfilter(array([1,2,3]),array([-4.,5,6]),array([9,8,7,6,5,4,3,2,1]),zi=array([0,0])) (array([ -2.25 , -9.3125 , -27.515625 , -59.36328125, -124.97753906, -253.2668457 , -510.54986572, -1023.08760071, -2048.18429947]), array([-4096.8617754, -3073.0264492])) note I only changed a 4 to 4. can you file a ticket? Josef > > thanx > -- > View this message in context: http://old.nabble.com/help-about-lfilter------tp27363601p27363601.html > Sent from the Scipy-User mailing list archive at Nabble.com. > > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From cournape at gmail.com Thu Jan 28 20:14:45 2010 From: cournape at gmail.com (David Cournapeau) Date: Fri, 29 Jan 2010 10:14:45 +0900 Subject: [SciPy-User] [SciPy-user] help about lfilter ???? In-Reply-To: <1cd32cbb1001281710m3ddd7e26x309295a5b235fd50@mail.gmail.com> References: <27363601.post@talk.nabble.com> <1cd32cbb1001281710m3ddd7e26x309295a5b235fd50@mail.gmail.com> Message-ID: <5b8d13221001281714v722efe93tb61a7674e3212741@mail.gmail.com> On Fri, Jan 29, 2010 at 10:10 AM, wrote: > On Thu, Jan 28, 2010 at 7:51 PM, persia wrote: >> >> lfilter is the linear filtering function in the scipy.signal module. can >> someone tell me what is wrong with this function please ? >> >> I used this function in the following way and it gave me that error !! : >> >> In [204]: >> lfilter(array([1,2,3]),array([-4,5,6]),array([9,8,7,6,5,4,3,2,1]),zi=array([0,0])) >> --------------------------------------------------------------------------- >> ? ? ? ? ? 
?Traceback (most recent call last) >> >> >> >> : linear_filter not available for this type >> >> >> why is it so ??!!!! what is wrong with this "type " ? what is it meant by >> this type anyway ?! i tried with the same arguments in matlab with the >> corresponding function and it worked !! >> and if i multiply the second argument vector say by .4, this will happen : >> >> lfilter(array([1,2,3]),.4*array([-4,5,6]),array([9,8,7,6,5,4,3,2,1]),zi=array([0,0])) >> >> >> (array([ ? -5.625 ? ? , ? -23.28125 ? , ? -68.7890625 , ?-148.40820312, >> ? ? ? ?-312.44384766, ?-633.16711426, -1276.37466431, -2557.71900177, >> ? ? ? -5120.46074867]), >> ?array([-10242.1544385 , ?-7682.56612301])) >> >> >> see ? no problem then ?!! isnt this wierd ? please help ! > > it looks like lfilter doesn't like integers Yes - the error message could be improved at least. Integers are not supported as is because the difference equation requires division (for IIR). I wonder whether it would make sense to automatically convert to floating point if everything is integer. cheers, David From josef.pktd at gmail.com Thu Jan 28 20:20:36 2010 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 28 Jan 2010 20:20:36 -0500 Subject: [SciPy-User] [SciPy-user] help about lfilter ???? In-Reply-To: <5b8d13221001281714v722efe93tb61a7674e3212741@mail.gmail.com> References: <27363601.post@talk.nabble.com> <1cd32cbb1001281710m3ddd7e26x309295a5b235fd50@mail.gmail.com> <5b8d13221001281714v722efe93tb61a7674e3212741@mail.gmail.com> Message-ID: <1cd32cbb1001281720s58d937ffofb5b4859ce93b111@mail.gmail.com> On Thu, Jan 28, 2010 at 8:14 PM, David Cournapeau wrote: > On Fri, Jan 29, 2010 at 10:10 AM, ? wrote: >> On Thu, Jan 28, 2010 at 7:51 PM, persia wrote: >>> >>> lfilter is the linear filtering function in the scipy.signal module. can >>> someone tell me what is wrong with this function please ? >>> >>> I used this function in the following way and it gave me that error !! : >>> >>> In [204]: >>> lfilter(array([1,2,3]),array([-4,5,6]),array([9,8,7,6,5,4,3,2,1]),zi=array([0,0])) >>> --------------------------------------------------------------------------- >>> ? ? ? ? ? ?Traceback (most recent call last) >>> >>> >>> >>> : linear_filter not available for this type >>> >>> >>> why is it so ??!!!! what is wrong with this "type " ? what is it meant by >>> this type anyway ?! i tried with the same arguments in matlab with the >>> corresponding function and it worked !! >>> and if i multiply the second argument vector say by .4, this will happen : >>> >>> lfilter(array([1,2,3]),.4*array([-4,5,6]),array([9,8,7,6,5,4,3,2,1]),zi=array([0,0])) >>> >>> >>> (array([ ? -5.625 ? ? , ? -23.28125 ? , ? -68.7890625 , ?-148.40820312, >>> ? ? ? ?-312.44384766, ?-633.16711426, -1276.37466431, -2557.71900177, >>> ? ? ? -5120.46074867]), >>> ?array([-10242.1544385 , ?-7682.56612301])) >>> >>> >>> see ? no problem then ?!! isnt this wierd ? please help ! >> >> it looks like lfilter doesn't like integers > > Yes - the error message could be improved at least. > > Integers are not supported as is because the difference equation > requires division (for IIR). I wonder whether it would make sense to > automatically convert to floating point if everything is integer. Yes, I would think so, if it doesn't work or doesn't make sense with integers than automatic conversion is the best in these cases, e.g. fftconvolve, np.mean, ... 
Josef > > cheers, > > David > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > From m.abdollahi at gmail.com Thu Jan 28 20:26:37 2010 From: m.abdollahi at gmail.com (persia) Date: Thu, 28 Jan 2010 17:26:37 -0800 (PST) Subject: [SciPy-User] [SciPy-user] help about lfilter ???? In-Reply-To: <1cd32cbb1001281720s58d937ffofb5b4859ce93b111@mail.gmail.com> References: <27363601.post@talk.nabble.com> <1cd32cbb1001281710m3ddd7e26x309295a5b235fd50@mail.gmail.com> <5b8d13221001281714v722efe93tb61a7674e3212741@mail.gmail.com> <1cd32cbb1001281720s58d937ffofb5b4859ce93b111@mail.gmail.com> Message-ID: <27365988.post@talk.nabble.com> Thank you so much for what you pointed out, and I agree with you on automatic type conversion. persia josef.pktd wrote: > > On Thu, Jan 28, 2010 at 8:14 PM, David Cournapeau > wrote: >> On Fri, Jan 29, 2010 at 10:10 AM, ? wrote: >>> On Thu, Jan 28, 2010 at 7:51 PM, persia wrote: >>>> >>>> lfilter is the linear filtering function in the scipy.signal module. >>>> can >>>> someone tell me what is wrong with this function please ? >>>> >>>> I used this function in the following way and it gave me that error !! >>>> : >>>> >>>> In [204]: >>>> lfilter(array([1,2,3]),array([-4,5,6]),array([9,8,7,6,5,4,3,2,1]),zi=array([0,0])) >>>> --------------------------------------------------------------------------- >>>> ? ? ? ? ? ?Traceback (most recent call >>>> last) >>>> >>>> >>>> >>>> : linear_filter not available for this >>>> type >>>> >>>> >>>> why is it so ??!!!! what is wrong with this "type " ? what is it meant >>>> by >>>> this type anyway ?! i tried with the same arguments in matlab with the >>>> corresponding function and it worked !! >>>> and if i multiply the second argument vector say by .4, this will >>>> happen : >>>> >>>> lfilter(array([1,2,3]),.4*array([-4,5,6]),array([9,8,7,6,5,4,3,2,1]),zi=array([0,0])) >>>> >>>> >>>> (array([ ? -5.625 ? ? , ? -23.28125 ? , ? -68.7890625 , ?-148.40820312, >>>> ? ? ? ?-312.44384766, ?-633.16711426, -1276.37466431, -2557.71900177, >>>> ? ? ? -5120.46074867]), >>>> ?array([-10242.1544385 , ?-7682.56612301])) >>>> >>>> >>>> see ? no problem then ?!! isnt this wierd ? please help ! >>> >>> it looks like lfilter doesn't like integers >> >> Yes - the error message could be improved at least. >> >> Integers are not supported as is because the difference equation >> requires division (for IIR). I wonder whether it would make sense to >> automatically convert to floating point if everything is integer. > > Yes, I would think so, if it doesn't work or doesn't make sense with > integers than automatic conversion is the best in these cases, e.g. > fftconvolve, np.mean, ... > > Josef > >> >> cheers, >> >> David >> _______________________________________________ >> SciPy-User mailing list >> SciPy-User at scipy.org >> http://mail.scipy.org/mailman/listinfo/scipy-user >> > _______________________________________________ > SciPy-User mailing list > SciPy-User at scipy.org > http://mail.scipy.org/mailman/listinfo/scipy-user > > -- View this message in context: http://old.nabble.com/help-about-lfilter------tp27363601p27365988.html Sent from the Scipy-User mailing list archive at Nabble.com. 
From warren.weckesser at enthought.com  Thu Jan 28 20:50:05 2010
From: warren.weckesser at enthought.com (Warren Weckesser)
Date: Thu, 28 Jan 2010 19:50:05 -0600
Subject: [SciPy-User] bug in signal.lsim2
In-Reply-To:
References:
Message-ID: <4B623ECD.4040403@enthought.com>

Ryan,

The problem is that the ODE solver used by lsim2 is too good. :)

It uses scipy.integrate.odeint, which in turn uses the Fortran library
LSODA.  Like any good solver, LSODA is an adaptive solver--it adjusts its
step size to be as large as possible while keeping estimates of the error
bounded.  For the problem you are solving, with initial condition 0, the
exact solution is initially exactly 0.  This is such a nice smooth solution
that the solver's step size quickly grows--so big, in fact, that it skips
right over your pulse and never sees it.

So how does it create all those intermediate points at the requested time
values?  It uses interpolation between the steps that it computed to create
the solution values at the times that you requested.  So using a finer grid
of time values won't help.  (If lsim2 gave you a hook into the parameters
passed to odeint, you could set odeint's 'hmax' to a value smaller than your
pulse width, which would force the solver to see the pulse.  But there is no
way to set that parameter from lsim2.)

The basic problem is you are passing in a discontinuous function to a solver
that expects a smooth function.  A better way to solve this problem is to
explicitly account for the discontinuity.  One possibility is the attached
script.

This is an excellent "learning opportunity" for your students on the hazards
of numerical computing!

Warren

Ryan Krauss wrote:
> I believe I have discovered a bug in signal.lsim2.  I believe the
> short attached script illustrates the problem.  I was trying to
> predict the response of a transfer function with a pure integrator:
>
>             g
> G = -------------
>          s(s+p)
>
> to a finite width pulse.  lsim2 seems to handle the step response just
> fine, but says that the pulse response is exactly 0.0 for the entire
> time of the simulation.  Obviously, this isn't the right answer.
>
> I am running scipy 0.7.0 and numpy 1.2.1 on Ubuntu 9.04, but I also
> have the same problem on Windows running 0.7.1 and 1.4.0.
>
> Thanks,
>
> Ryan
>
> ------------------------------------------------------------------------
>
> _______________________________________________
> SciPy-User mailing list
> SciPy-User at scipy.org
> http://mail.scipy.org/mailman/listinfo/scipy-user

-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: lsim2_solution.py
URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: lsim2_solution.png
Type: image/png
Size: 13212 bytes
Desc: not available
URL:

From josef.pktd at gmail.com  Thu Jan 28 20:57:31 2010
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Thu, 28 Jan 2010 20:57:31 -0500
Subject: [SciPy-User] bug in signal.lsim2
In-Reply-To: <4B623ECD.4040403@enthought.com>
References: <4B623ECD.4040403@enthought.com>
Message-ID: <1cd32cbb1001281757x14b7b581xba02f4ae21d3acfb@mail.gmail.com>

On Thu, Jan 28, 2010 at 8:50 PM, Warren Weckesser wrote:
> Ryan,
>
> The problem is that the ODE solver used by lsim2 is too good. :)
> [explanation quoted above - snipped]
> (If lsim2 gave you a hook into the parameters
> passed to odeint, you could set odeint's 'hmax' to a value smaller than your
> pulse width, which would force the solver to see the pulse.  But there is no
> way to set that parameter from lsim2.)

It's what I suspected. I don't know much about odeint, but do you
think it would be useful to let lsim2 pass through some parameters to
odeint?

Josef

> The basic problem is you are passing in a discontinuous function to a solver
> that expects a smooth function.  A better way to solve this problem is to
> explicitly account for the discontinuity.  One possibility is the attached
> script.
>
> This is an excellent "learning opportunity" for your students on the hazards
> of numerical computing!
>
> Warren
>
> Ryan Krauss wrote:
>> [Ryan's original report quoted - snipped]
>
> from pylab import *
> from scipy import signal
>
> g = 100.0
> p = 15.0
> G = signal.ltisys.lti(g, [1,p,0])
>
> t = arange(0, 1.0, 0.002)
> N = len(t)
>
> # u for the whole interval (not used in lsim2, only for plotting later).
> amp = 50.0
> u = zeros(N)
> k1 = 50
> k2 = 100
> u[k1:k2] = amp
>
> # Create input functions for each smooth interval. (This could be
> # simpler, since u is constant on each interval.)
> a = float(k1)/N
> b = float(k2)/N
> T1 = linspace(0, a, 201)
> u1 = zeros_like(T1)
> T2 = linspace(a, b, 201)
> u2 = amp*ones_like(T2)
> T3 = linspace(b, 1.0, 201)
> u3 = zeros_like(T3)
>
> # Solve on each interval; use the final value of one solution as the
> # starting point of the next solution.
> # (We could skip the first calculation, since we know the solution will be 0.)
> (t1, y1, x1) = signal.lsim2(G, u1, T1)
> (t2, y2, x2) = signal.lsim2(G, u2, T2, X0=x1[-1])
> (t3, y3, x3) = signal.lsim2(G, u3, T3, X0=x2[-1])
>
> figure(1)
> clf()
> plot(t, u, 'k', linewidth=3)
> plot(t1, y1, 'y', linewidth=3)
> plot(t2, y2, 'b', linewidth=3)
> plot(t3, y3, 'g', linewidth=3)
>
> show()
>
> _______________________________________________
> SciPy-User mailing list
> SciPy-User at scipy.org
> http://mail.scipy.org/mailman/listinfo/scipy-user

From warren.weckesser at enthought.com  Thu Jan 28 21:05:28 2010
From: warren.weckesser at enthought.com (Warren Weckesser)
Date: Thu, 28 Jan 2010 20:05:28 -0600
Subject: [SciPy-User] bug in signal.lsim2
In-Reply-To: <4B623ECD.4040403@enthought.com>
References: <4B623ECD.4040403@enthought.com>
Message-ID: <4B624268.2030408@enthought.com>

Or, you could modify your script lsim2_problem.py to eliminate the
initial interval during which the solution is exactly zero, by changing
this line:

    u2[50:100] = amp

to this:

    u2[:50] = amp

This just translates the pulse so that it starts at t=0.  Since the
default initial condition used by lsim2 is zero, it will give the same
solution, just translated in time.

Warren

Warren Weckesser wrote:
> Ryan,
>
> The problem is that the ODE solver used by lsim2 is too good. :)
> [previous message quoted in full - snipped]
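To make the 'hmax' remark above concrete, here is a small sketch (an
editorial illustration, not code from the thread): it bypasses lsim2,
builds the same system with signal.tf2ss, and calls odeint directly so
that hmax can be capped below the pulse width. With hmax at its default,
odeint should reproduce lsim2's behavior and step over the pulse.

import numpy as np
from scipy import signal
from scipy.integrate import odeint

# State-space form of G(s) = g / (s*(s+p)) with g=100, p=15.
A, B, C, D = signal.tf2ss([100.0], [1.0, 15.0, 0.0])

def u(t):
    # The rectangular pulse: on for 0.1 <= t < 0.2, as in Warren's script.
    return 50.0 if 0.1 <= t < 0.2 else 0.0

def rhs(x, t):
    # x' = A x + B u(t)
    return np.dot(A, x) + np.dot(B, [u(t)])

T = np.arange(0.0, 1.0, 0.002)
# hmax=0.05 keeps every internal step shorter than the 0.1 s pulse,
# forcing the adaptive solver to "see" the discontinuity.
X = odeint(rhs, np.zeros(2), T, hmax=0.05)
Y = np.dot(X, C.T).ravel()   # output y = C x  (D is zero for this system)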
From cournape at gmail.com  Thu Jan 28 21:21:57 2010
From: cournape at gmail.com (David Cournapeau)
Date: Fri, 29 Jan 2010 11:21:57 +0900
Subject: [SciPy-User] [SciPy-user] help about lfilter ????
In-Reply-To: <5b8d13221001281714v722efe93tb61a7674e3212741@mail.gmail.com>
References: <27363601.post@talk.nabble.com>
	<1cd32cbb1001281710m3ddd7e26x309295a5b235fd50@mail.gmail.com>
	<5b8d13221001281714v722efe93tb61a7674e3212741@mail.gmail.com>
Message-ID: <5b8d13221001281821g68a028cbic666bf45856e6ba3@mail.gmail.com>

On Fri, Jan 29, 2010 at 10:14 AM, David Cournapeau wrote:
>
> Yes - the error message could be improved at least.

This is done. Now, your error message would be:

NotImplementedError: input type 'int64' not supported

Or something like that depending on your platform - I think this is
much more informative.

I am still not sure about whether implicit conversion is a good idea,
so this will wait.

David

From warren.weckesser at enthought.com  Thu Jan 28 22:33:09 2010
From: warren.weckesser at enthought.com (Warren Weckesser)
Date: Thu, 28 Jan 2010 21:33:09 -0600
Subject: [SciPy-User] bug in signal.lsim2
In-Reply-To: <1cd32cbb1001281757x14b7b581xba02f4ae21d3acfb@mail.gmail.com>
References: <4B623ECD.4040403@enthought.com>
	<1cd32cbb1001281757x14b7b581xba02f4ae21d3acfb@mail.gmail.com>
Message-ID: <4B6256F5.3050802@enthought.com>

josef.pktd at gmail.com wrote:
> On Thu, Jan 28, 2010 at 8:50 PM, Warren Weckesser wrote:
>> [Warren's explanation quoted in full - snipped]
>
> It's what I suspected. I don't know much about odeint, but do you
> think it would be useful to let lsim2 pass through some parameters to
> odeint?

Sounds useful to me.  A simple implementation is an optional keyword
argument that is a dict of odeint arguments.  But this would almost
certainly break if lsim2 were ever reimplemented with a different
solver.  So perhaps it should allow a common set of ODE solver
parameters (e.g. absolute and relative error tolerances, max and min
step sizes, others?).

Perhaps this should wait until after the ODE solver redesign that is
occasionally discussed:
    http://projects.scipy.org/scipy/wiki/OdeintRedesign
Then the solver itself could be an optional argument to lsim2.

Warren

> [rest of the quoted message and attached script - snipped]

From josef.pktd at gmail.com  Thu Jan 28 23:00:21 2010
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Thu, 28 Jan 2010 23:00:21 -0500
Subject: [SciPy-User] bug in signal.lsim2
In-Reply-To: <4B6256F5.3050802@enthought.com>
References: <4B623ECD.4040403@enthought.com>
	<1cd32cbb1001281757x14b7b581xba02f4ae21d3acfb@mail.gmail.com>
	<4B6256F5.3050802@enthought.com>
Message-ID: <1cd32cbb1001282000x332df230n3242e7e1063ee303@mail.gmail.com>

On Thu, Jan 28, 2010 at 10:33 PM, Warren Weckesser wrote:
> josef.pktd at gmail.com wrote:
>> [earlier exchange quoted - snipped]
>
> Sounds useful to me.  A simple implementation is an optional keyword
> argument that is a dict of odeint arguments.  But this would almost
> certainly break if lsim2 were ever reimplemented with a different
> solver.  So perhaps it should allow a common set of ODE solver
> parameters (e.g. absolute and relative error tolerances, max and min
> step sizes, others?).
>
> Perhaps this should wait until after the ODE solver redesign that is
> occasionally discussed:
>     http://projects.scipy.org/scipy/wiki/OdeintRedesign
> Then the solver itself could be an optional argument to lsim2.

I was just thinking of adding to the argument list a **kwds argument
that is directly passed on to whatever ODE solver is used. This should
be pretty flexible for any changes and be backwards compatible.

I've seen and used it in a similar way for calls to optimization
routines; optimize.curve_fit, for example, also does it. Which
keywords are actually valid would depend on which function is called.

(But I'm not a user of lsim, I'm just stealing some ideas from lti and
friends for time series analysis.)

Josef

> [remainder of the quoted thread and script - snipped]
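As a sketch of the pass-through idea josef describes: a variant of lsim2
that forwards extra keyword arguments straight to odeint. This is an
editorial illustration only -- lsim2_kw is a hypothetical name, not scipy
API, and it uses a zero-order hold on the input samples where lsim2 proper
interpolates.

import numpy as np
from scipy import signal
from scipy.integrate import odeint

def lsim2_kw(num, den, U, T, X0=None, **odeint_kwargs):
    """Hypothetical lsim2 variant: **odeint_kwargs (e.g. hmax, rtol,
    atol) are passed straight through to scipy.integrate.odeint."""
    A, B, C, D = signal.tf2ss(num, den)
    U = np.asarray(U, dtype=float)
    T = np.asarray(T, dtype=float)
    if X0 is None:
        X0 = np.zeros(A.shape[0])

    def fprime(x, t):
        # Zero-order hold: use the most recent input sample at time t.
        k = np.searchsorted(T, t, side='right') - 1
        k = min(max(k, 0), len(T) - 1)
        return np.dot(A, x) + np.dot(B, [U[k]])

    X = odeint(fprime, X0, T, **odeint_kwargs)
    Y = np.dot(X, C.T).ravel() + float(D) * U
    return T, Y, X

# With hmax smaller than the pulse width, the solver no longer skips the
# pulse, e.g.:  t, y, x = lsim2_kw([100.0], [1.0, 15.0, 0.0], u, t, hmax=0.05)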
From ryanlists at gmail.com  Fri Jan 29 07:44:48 2010
From: ryanlists at gmail.com (Ryan Krauss)
Date: Fri, 29 Jan 2010 06:44:48 -0600
Subject: [SciPy-User] bug in signal.lsim2
In-Reply-To: <1cd32cbb1001282000x332df230n3242e7e1063ee303@mail.gmail.com>
References: <4B623ECD.4040403@enthought.com>
	<1cd32cbb1001281757x14b7b581xba02f4ae21d3acfb@mail.gmail.com>
	<4B6256F5.3050802@enthought.com>
	<1cd32cbb1001282000x332df230n3242e7e1063ee303@mail.gmail.com>
Message-ID:

Thanks to Warren and Josef for their time and thoughts. I feel like I
now understand the underlying problem and have some good options to
solve my short-term issues (I assigned the project last night and they
need to be able to start working on it immediately).

I actually use a TransferFunction class that derives from ltisys. I
could override its lsim2 method to try out some of these solutions
quickly and fairly easily.

Ryan

On Thu, Jan 28, 2010 at 10:00 PM, josef.pktd at gmail.com wrote:
> [the full thread quoted - snipped]
_______________________________________________
SciPy-User mailing list
SciPy-User at scipy.org
http://mail.scipy.org/mailman/listinfo/scipy-user

From rcsqtc at iqac.csic.es  Fri Jan 29 07:00:55 2010
From: rcsqtc at iqac.csic.es (Ramon Crehuet)
Date: Fri, 29 Jan 2010 13:00:55 +0100
Subject: [SciPy-User] F_CONTIGUOUS and C_CONTIGUOUS
Message-ID: <4B62CDF7.8070509@iqac.csic.es>

Hi all,
I have some doubts about the meaning of F_CONTIGUOUS and C_CONTIGUOUS. I
thought they referred to storing matrices "in rows" or "in columns",
but... Imagine 2 arrays:

y=np.zeros((10000, 10))
y2=np.zeros((10000, 10), order='F')

I can understand y.flags and y2.flags; however, I would expect
y[0,:].flags to show F_CONTIGUOUS False, because it is the last index
which is changing, and y[:,0].flags to show C_CONTIGUOUS False, because
this is a column of that matrix. I am wrong in both.
Similarly, I don't understand why:

In [114]: y2[:,0].flags
Out[114]:
  C_CONTIGUOUS : True
  F_CONTIGUOUS : True

and:

In [113]: y2[0,:].flags
Out[113]:
  C_CONTIGUOUS : False
  F_CONTIGUOUS : False

So I guess I have some deep misunderstanding about the meaning of these
flags and I would appreciate some enlightenment. Thanks!
Ramon

From pav+sp at iki.fi  Fri Jan 29 08:32:24 2010
From: pav+sp at iki.fi (Pauli Virtanen)
Date: Fri, 29 Jan 2010 13:32:24 +0000 (UTC)
Subject: [SciPy-User] F_CONTIGUOUS and C_CONTIGUOUS
References: <4B62CDF7.8070509@iqac.csic.es>
Message-ID:

Fri, 29 Jan 2010 13:00:55 +0100, Ramon Crehuet wrote:
> I have some doubts about the meaning of F_CONTIGUOUS and C_CONTIGUOUS. I
> thought they referred to storing matrices "in rows" or "in columns",
> but... Imagine 2 arrays:
> y=np.zeros((10000, 10))
> y2=np.zeros((10000, 10), order='F')
>
> I can understand y.flags and y2.flags; however, I would expect
> y[0,:].flags to show F_CONTIGUOUS False, because it is the last index
> which is changing, and y[:,0].flags to show C_CONTIGUOUS False, because
> this is a column of that matrix. I am wrong in both.

Both y[0,:] and y[:,0] are 1-d arrays. For 1-d arrays, there is no
distinction between Fortran-contiguous and C-contiguous: a 1-d array is
either contiguous or not.

> Similarly, I don't understand why:
> In [114]: y2[:,0].flags
> Out[114]:
>   C_CONTIGUOUS : True
>   F_CONTIGUOUS : True

Here, the elements span a contiguous block of memory. The memory layout
of y2 is (r1c2 = element (row=1, column=2))

    [ r0c0 r1c0 ... r0c1 r1c1 ... ]

Note that with order='C' it would be instead

    [ r0c0 r0c1 ... r1c0 r1c1 ... ]

Consequently, y2[:,0] has here the layout

    [ r0c0 r1c0 ... rnc0 ]

where all elements occur immediately after each other. Hence, it's
contiguous.

> and:
> In [113]: y2[0,:].flags
> Out[113]:
>   C_CONTIGUOUS : False
>   F_CONTIGUOUS : False

Now y2[0,:] has the layout

    [ r0c0 ### ... r0c1 ### ... ]

i.e., all elements except those belonging to the first row are skipped.
The memory layout is discontiguous.

> So I guess I have some deep misunderstanding about the meaning of these
> flags and I would appreciate some enlightenment. Thanks!

The main point is (very tersely) explained here:
http://docs.scipy.org/doc/numpy/reference/arrays.ndarray.html#internal-memory-layout-of-an-ndarray
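Pauli's point can also be checked directly through the strides (a short
editorial aside; the byte counts assume 8-byte floats):

>>> import numpy as np
>>> y2 = np.zeros((10000, 10), order='F')
>>> y2[:, 0].strides    # adjacent 8-byte steps: this slice is contiguous
(8,)
>>> y2[0, :].strides    # 80000-byte steps: one element per column, far apart
(80000,)
>>> y2[:, 0].flags['C_CONTIGUOUS'], y2[:, 0].flags['F_CONTIGUOUS']
(True, True)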
From ferrell at diablotech.com  Fri Jan 29 12:58:13 2010
From: ferrell at diablotech.com (Robert Ferrell)
Date: Fri, 29 Jan 2010 10:58:13 -0700
Subject: [SciPy-User] Align empty time series
Message-ID:

ts.align_series fails if both series are empty. As a feature request,
could this special case be handled and return an empty series? As it
is, I have to special-case this in my code, which adds some clutter.

thanks,
-robert

> In [1566]: e1 = ts.time_series(freq='d', data=[], dates=[])
>
> In [1567]: e2 = ts.time_series(freq='d', data=[], dates=[])
>
> In [1568]: ts.align_series(e1,e2)
> ---------------------------------------------------------------------------
> ValueError                                Traceback (most recent call last)
>
> /Users/Shared/Develop/Financial/LakMerc/Reports/maFiltersTest.py in ()
> ----> 1
>       2
>       3
>       4
>       5
>
> /Library/Python/2.6/site-packages/scikits.timeseries-0.91.3-py2.6-macosx-10.6-universal.egg/scikits/timeseries/tseries.pyc
> in align_series(*series, **kwargs)
>    1841         start_date = kwargs.pop('start_date',
>    1842                         min([x.start_date for x in filled_series
> -> 1843                              if x.start_date is not None]))
>    1844     if isinstance(start_date, str):
>    1845         start_date = Date(common_freq, string=start_date)
>
> ValueError: min() arg is an empty sequence
>
> /Library/Python/2.6/site-packages/scikits.timeseries-0.91.3-py2.6-macosx-10.6-universal.egg/scikits/timeseries/tseries.py(1843)align_series()
>    1842                         min([x.start_date for x in filled_series
> -> 1843                              if x.start_date is not None]))
>    1844     if isinstance(start_date, str):

From pgmdevlist at gmail.com  Fri Jan 29 13:09:05 2010
From: pgmdevlist at gmail.com (Pierre GM)
Date: Fri, 29 Jan 2010 13:09:05 -0500
Subject: [SciPy-User] Align empty time series
In-Reply-To:
References:
Message-ID:

On Jan 29, 2010, at 12:58 PM, Robert Ferrell wrote:
> ts.align_series fails if both series are empty. As a feature request,
> could this special case be handled and return an empty series? As it
> is, I have to special-case this in my code, which adds some clutter.

That sounds reasonable. You could open a ticket and suggest a patch;
that'll make the issue easier to track down.

From ferrell at diablotech.com  Fri Jan 29 13:16:16 2010
From: ferrell at diablotech.com (Robert Ferrell)
Date: Fri, 29 Jan 2010 11:16:16 -0700
Subject: [SciPy-User] Sum duplicate dates in a series
Message-ID:

How can I sum data for duplicate dates in a time series? I can do it
with a loop, but I wonder if there is some tricky magic I might use.

For instance, I've got a series:

> In [1597]: s
> Out[1597]:
> timeseries([ 10.  11.   1.   2.   3.],
>    dates = [12-Jan-2010 12-Jan-2010 22-Jan-2010 22-Jan-2010 22-Jan-2010],
>    freq  = D)

and I'd like to sum the Jan 12 data together, and the Jan 22 data
together, and return a new series with just two dates.

> timeseries([ 21.   6.],
>    dates = [12-Jan-2010 22-Jan-2010],
>    freq  = D)

Is there an easy way?

Thanks,
-robert

From pgmdevlist at gmail.com  Fri Jan 29 13:42:42 2010
From: pgmdevlist at gmail.com (Pierre GM)
Date: Fri, 29 Jan 2010 13:42:42 -0500
Subject: [SciPy-User] Sum duplicate dates in a series
In-Reply-To:
References:
Message-ID:

On Jan 29, 2010, at 1:16 PM, Robert Ferrell wrote:
> How can I sum data for duplicate dates in a time series? I can do it
> with a loop, but I wonder if there is some tricky magic I might use.
> [example snipped]

Unfortunately, not that easy. You can use ts.find_duplicated_dates to
get a dictionary (duplicated dates, indices in the series). From there,
you can easily get a dictionary (dates, sum of the series for those
dates).

>>> s = ts.time_series([1,2,3,4,5],dates=ts.date_array(["2001-01","2001-01","2001-02","2001-03","2001-03"],freq="M"))
>>> summed = dict((k,s._series[v].sum()) for (k,v) in ts.find_duplicated_dates(s).items())

You can then reinject summed into a new series

>>> dropped = ts.remove_duplicated_dates(s)
>>> import operator
>>> [operator.setitem(dropped,k,v) for (k,v) in summed.items()]

Thinking about it, we could probably overload ts.remove_duplicated_dates
to accept a func argument that tells how to deal with those duplicated
dates... Do you mind opening a ticket?

From jdh2358 at gmail.com  Fri Jan 29 14:00:06 2010
From: jdh2358 at gmail.com (John Hunter)
Date: Fri, 29 Jan 2010 13:00:06 -0600
Subject: [SciPy-User] Sum duplicate dates in a series
In-Reply-To:
References:
Message-ID: <88e473831001291100j66f111ddh9e6e94583fb4bb02@mail.gmail.com>

On Fri, Jan 29, 2010 at 12:42 PM, Pierre GM wrote:
> On Jan 29, 2010, at 1:16 PM, Robert Ferrell wrote:
>> How can I sum data for duplicate dates in a time series? I can do it
>> with a loop, but I wonder if there is some tricky magic I might use.

If you can put your data in a record array, you can use
matplotlib.mlab.rec_groupby

http://matplotlib.sourceforge.net/api/mlab_api.html#matplotlib.mlab.rec_groupby

http://matplotlib.sourceforge.net/examples/misc/rec_groupby_demo.html

JDH

From pgmdevlist at gmail.com  Fri Jan 29 14:13:44 2010
From: pgmdevlist at gmail.com (Pierre GM)
Date: Fri, 29 Jan 2010 14:13:44 -0500
Subject: [SciPy-User] Sum duplicate dates in a series
In-Reply-To: <88e473831001291100j66f111ddh9e6e94583fb4bb02@mail.gmail.com>
References: <88e473831001291100j66f111ddh9e6e94583fb4bb02@mail.gmail.com>
Message-ID: <7A4DA2E1-1A50-4792-9ADB-1117036420C7@gmail.com>

On Jan 29, 2010, at 2:00 PM, John Hunter wrote:
> [rec_groupby suggestion quoted - snipped]

John,
Could you have a look into numpy.lib.recfunctions? That's an attempt to
homogenize what you did for matplotlib, and it'd be great if you could
help.
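For completeness, the summing step can also be written with plain numpy
(an editorial sketch, assuming the dates are already sorted, as they are
in a time series; the day numbers here are hypothetical):

import numpy as np

dates = np.array([14987, 14987, 14997, 14997, 14997])   # placeholder day numbers
values = np.array([10., 11., 1., 2., 3.])

# return_index gives the first position of each distinct date;
# add.reduceat then sums each run of duplicates in one shot.
uniq_dates, first_idx = np.unique(dates, return_index=True)
sums = np.add.reduceat(values, first_idx)
# uniq_dates -> [14987 14997],  sums -> [ 21.   6.]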
From josef.pktd at gmail.com  Fri Jan 29 14:36:42 2010
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Fri, 29 Jan 2010 14:36:42 -0500
Subject: [SciPy-User] Sum duplicate dates in a series
In-Reply-To: <7A4DA2E1-1A50-4792-9ADB-1117036420C7@gmail.com>
References: <88e473831001291100j66f111ddh9e6e94583fb4bb02@mail.gmail.com>
	<7A4DA2E1-1A50-4792-9ADB-1117036420C7@gmail.com>
Message-ID: <1cd32cbb1001291136g279f65f4ifa84e7d343295626@mail.gmail.com>

On Fri, Jan 29, 2010 at 2:13 PM, Pierre GM wrote:
> [exchange about rec_groupby and numpy.lib.recfunctions - snipped]

I just wanted to show that there will be some advantages when it is
possible to easily move between packages

>>> import scikits.timeseries as ts
>>> import la
>>> s = ts.time_series([1,2,3,4,5],dates=ts.date_array(["2001-01","2001-01","2001-02","2001-03","2001-03"],freq="M"))
>>> dta = la.larry(s.data, label=[range(len(s.data))])
>>> dat = la.larry(s.dates.tolist(), label=[range(len(s.data))])
>>> s2 = ts.time_series(dta.group_mean(dat).x,dates=ts.date_array(dat.x,freq="M"))
>>> s
timeseries([1 2 3 4 5],
   dates = [Jan-2001 Jan-2001 Feb-2001 Mar-2001 Mar-2001],
   freq  = M)
>>> s2
timeseries([ 1.5  1.5  3.   4.5  4.5],
   dates = [Jan-2001 Jan-2001 Feb-2001 Mar-2001 Mar-2001],
   freq  = M)
>>> s2u = ts.remove_duplicated_dates(s2)
>>> s2u
timeseries([ 1.5  3.   4.5],
   dates = [Jan-2001 ... Mar-2001],
   freq  = M)
>>> s2u.dates
DateArray([Jan-2001, Feb-2001, Mar-2001], freq='M')

It's not so easy yet. But it would be nice if we could use timeseries,
pandas and la for different things, depending on which representation
is more convenient.

Josef

From kwgoodman at gmail.com  Fri Jan 29 15:09:42 2010
From: kwgoodman at gmail.com (Keith Goodman)
Date: Fri, 29 Jan 2010 12:09:42 -0800
Subject: [SciPy-User] Sum duplicate dates in a series
In-Reply-To: <1cd32cbb1001291136g279f65f4ifa84e7d343295626@mail.gmail.com>
References: <88e473831001291100j66f111ddh9e6e94583fb4bb02@mail.gmail.com>
	<7A4DA2E1-1A50-4792-9ADB-1117036420C7@gmail.com>
	<1cd32cbb1001291136g279f65f4ifa84e7d343295626@mail.gmail.com>
Message-ID:

On Fri, Jan 29, 2010 at 11:36 AM, josef.pktd at gmail.com wrote:
> I just wanted to show that there will be some advantages when it is
> possible to easily move between packages
>
>>>> dta = la.larry(s.data, label=[range(len(s.data))])
>>>> dat = la.larry(s.dates.tolist(), label=[range(len(s.data))])

Clever use of larry. The default label is range(n), so you can just do

>>> dta = la.larry(s.data)
>>> dat = la.larry(s.dates.tolist())

From bala1486 at gmail.com  Fri Jan 29 20:30:03 2010
From: bala1486 at gmail.com (Balachandar)
Date: Fri, 29 Jan 2010 20:30:03 -0500
Subject: [SciPy-User] Filter problem
Message-ID: <844404061001291730r40c7b9bdtf12c368809e5540c@mail.gmail.com>

Hello,

I am a newbie to SciPy and also to filtering. I have data from a
pressure sensor that needs to be filtered. The graph can be seen in the
attachment: the top one is unfiltered and the bottom one is filtered. As
you can see, the shape looks the same. I need a good smooth line. My
code goes like this:

b, a = butter(10, 0.2, 'low')
ICPfiltered = lfilter(b, a, ICPunfiltered)

where ICPfiltered is the filtered value and ICPunfiltered is the data
from the sensor (a 1-dimensional array). Am I doing anything wrong with
the usage of the APIs? I have tried to change the cut-off frequency too,
but it isn't of any use. Should I use a filter other than a Butterworth
IIR to get a smooth curve? Thank you...

Thanks,
Bala
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Screenshot.png
Type: image/png
Size: 31409 bytes
Desc: not available
URL:

From warren.weckesser at enthought.com  Sat Jan 30 14:55:30 2010
From: warren.weckesser at enthought.com (Warren Weckesser)
Date: Sat, 30 Jan 2010 13:55:30 -0600
Subject: [SciPy-User] Filter problem
In-Reply-To: <844404061001291730r40c7b9bdtf12c368809e5540c@mail.gmail.com>
References: <844404061001291730r40c7b9bdtf12c368809e5540c@mail.gmail.com>
Message-ID: <4B648EB2.5040605@enthought.com>

Balachandar wrote:
> [question quoted - snipped]

Bala,

The second argument to butter() is (roughly) the cutoff frequency Wn,
expressed as a fraction of the Nyquist rate, which is half the sampling
rate of the original signal. Wn=0.2 implies that the cutoff frequency of
the lowpass filter is 0.1 times the sampling rate of the signal. If most
of the noise that you want to remove is below this frequency, you won't
see much difference in the filtered signal.

The attached script provides a demonstration of how altering Wn affects
the result of filtering with butter(). It creates a test signal as the
sum of several sinusoidal functions, and then applies a Butterworth
filter for several values of Wn. The attached plot shows the result.

Whether you should use butter() or some other filtering algorithm
depends on your objectives. "A good smooth line" does not give a
well-defined set of criteria for choosing an appropriate filter. :)

Warren
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: filter_demo.py
URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: filter_demo.png
Type: image/png
Size: 69967 bytes
Desc: not available
URL:
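A compact sketch of the normalization Warren describes (an editorial
illustration; the sampling rate fs is an assumption, since the thread
never states one):

import numpy as np
from scipy.signal import butter, filtfilt

fs = 125.0                      # assumed sampling rate in Hz (placeholder)
nyq = fs / 2.0                  # Nyquist rate
cutoff_hz = 5.0                 # keep everything below 5 Hz
b, a = butter(4, cutoff_hz / nyq, 'low')   # Wn = cutoff as a fraction of Nyquist

t = np.arange(0.0, 2.0, 1.0 / fs)
x = np.sin(2 * np.pi * 1.2 * t) + 0.3 * np.sin(2 * np.pi * 30.0 * t)
y = filtfilt(b, a, x)           # zero-phase variant of lfilter: no time lag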
From robert.pickel at gmail.com  Sat Jan 30 21:04:17 2010
From: robert.pickel at gmail.com (Robert Pickel)
Date: Sat, 30 Jan 2010 21:04:17 -0500
Subject: [SciPy-User] C-API - Dealing with pointers to cfloat type
Message-ID: <4b64e4ef.9653f10a.0899.ffffa99c@mx.google.com>

Hello,

I'm attempting to integrate an algorithm in a numpy extension written
in C. The extension compiles, but testing shows the data is not being
transferred properly.

On the Python side:

thing = array([1.0+1j, 1.0+1j, 1.0+1j], dtype=cfloat)
mymodule.dothing(thing)

On the C side:

npy_cfloat *mydat;
mydat = (npy_cfloat)PyArray_GETPTR1(passedobj, 1);
printf("%f", mydat->real);
...
return(...)

I always get 0.000 printed... Modifying the code to pass an array of
floats from Python, with mydat of type float, seems to work as
expected. The problem seems to be dealing with complex data types.

My build environment is Win7 (64), VS2008, Python 2.6 (win32) + numpy 1.4.0.

I'd appreciate any advice I can get on this.

Thanks,
Bob
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From jsseabold at gmail.com  Sun Jan 31 23:39:21 2010
From: jsseabold at gmail.com (Skipper Seabold)
Date: Sun, 31 Jan 2010 23:39:21 -0500
Subject: [SciPy-User] all() and any() with array-like input
Message-ID:

I'm sure this has come up before, but it's difficult to search for
"any" and "all"...

This is tripping me up, but maybe there is just something I am missing.

In [1]: import numpy as np

In [2]: X = [.2,.2,.2,.2,.2]

In [3]: np.all(X <= 1)
Out[3]: False

In [4]: np.all(X >= 0)
Out[4]: True

In [5]: np.all(np.asarray(X) <= 1)
Out[5]: True

In [6]: np.any(X>1)
Out[6]: True

I guess it's simple enough to use asarray, but I was just curious what
drives this behavior, since the docs indicate that it should work with
array-like structures.

Skipper
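A likely explanation, consistent with the outputs above: X is a plain
Python list, so the comparison X <= 1 is evaluated by Python 2's
mixed-type comparison rules before numpy ever sees it. On CPython 2,
comparing a list to an int falls back to an arbitrary but consistent
type-based ordering, so the comparison yields a single bool rather than
an elementwise array, and np.all() merely reduces that scalar:

>>> X = [.2, .2, .2, .2, .2]
>>> X <= 1        # list vs. int: one bool, not five (CPython 2 behavior)
False
>>> X >= 0        # same rule, hence the confusing True
True
>>> import numpy as np
>>> np.all(np.asarray(X) <= 1)   # converting first makes <= elementwise
True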