From simpson at math.toronto.edu Sun Feb 1 00:37:48 2009 From: simpson at math.toronto.edu (Gideon Simpson) Date: Sun, 1 Feb 2009 00:37:48 -0500 Subject: [SciPy-user] shared memory machines Message-ID: <2AE6D153-799C-450E-8E69-CA80D12E2FF5@math.toronto.edu> Has anyone been able to take advantage of shared memory machines with scipy? How did you do it? -gideon From karl.young at ucsf.edu Sun Feb 1 00:45:59 2009 From: karl.young at ucsf.edu (Young, Karl) Date: Sat, 31 Jan 2009 21:45:59 -0800 Subject: [SciPy-user] Automating Matlab References: <4984F58C.5070605@gmail.com> <3d375d730901311734o388adf56y9f3241032ed409c2@mail.gmail.com> Message-ID: <9D202D4E86A4BF47BA6943ABDF21BE78058FAB62@EXVS06.net.ucsf.edu> >> Is there strong interest in automating matlab to numpy conversion? > Yes! Please post your code somewhere! seconded !!!!! I'm currently working on a grant that has turned out to involve porting a lot of matlab code to python; you will be gratefully acknowledged in whatever comes of the work of the grant. -- KY From gokhansever at gmail.com Sun Feb 1 02:49:32 2009 From: gokhansever at gmail.com (gsever) Date: Sat, 31 Jan 2009 23:49:32 -0800 (PST) Subject: [SciPy-user] Automating Matlab In-Reply-To: <4984F58C.5070605@gmail.com> References: <4984F58C.5070605@gmail.com> Message-ID: <27b4bf7a-bb75-457d-8ed0-eec3465b92f1@t13g2000yqc.googlegroups.com> I am interested with this project, too. Would be much better to have an automated tool than doing manual conversations. Just for your information, there is a IDL-to-Python conversation tool named i2py @ http://code.google.com/p/i2py/ On Jan 31, 7:06?pm, Eric Schug wrote: > Is there strong interest in automating matlab to numpy conversion? > > I have a working version of a matlab to python translator. > It allows translation of matlab scripts into numpy constructs, > supporting most of the matlab language. ?The parser is nearly complete. ? > Most of the remaining work involves providing a robust translation. Such as > ? ? * making sure that copies on assign are done when needed. > ? ? * correct indexing a(:) becomes a.flatten(1) when on the left hand > side (lhs) of equals > ? ? ? ?and a[:] when on the right hand side > > I've seen a few projects attempt to do this, but for one reason or > another have stopped it. > > _______________________________________________ > SciPy-user mailing list > SciPy-u... at scipy.orghttp://projects.scipy.org/mailman/listinfo/scipy-user From s.mientki at ru.nl Sun Feb 1 04:27:11 2009 From: s.mientki at ru.nl (Stef Mientki) Date: Sun, 01 Feb 2009 10:27:11 +0100 Subject: [SciPy-user] Automating Matlab In-Reply-To: <3d375d730901311734o388adf56y9f3241032ed409c2@mail.gmail.com> References: <4984F58C.5070605@gmail.com> <3d375d730901311734o388adf56y9f3241032ed409c2@mail.gmail.com> Message-ID: <49856AEF.9050605@ru.nl> Robert Kern wrote: > On Sat, Jan 31, 2009 at 19:06, Eric Schug wrote: > >> Is there strong interest in automating matlab to numpy conversion? >> > > Yes! Please post your code somewhere! > > +1 And this is a very good moment for the persons who are creating a Matlab like environment, including the Matlab-like workspace, to show there creations. 
cheers, Stef From gael.varoquaux at normalesup.org Sun Feb 1 04:57:46 2009 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Sun, 1 Feb 2009 10:57:46 +0100 Subject: [SciPy-user] shared memory machines In-Reply-To: <2AE6D153-799C-450E-8E69-CA80D12E2FF5@math.toronto.edu> References: <2AE6D153-799C-450E-8E69-CA80D12E2FF5@math.toronto.edu> Message-ID: <20090201095746.GA1099@phare.normalesup.org> On Sun, Feb 01, 2009 at 12:37:48AM -0500, Gideon Simpson wrote: > Has anyone been able to take advantage of shared memory machines with > scipy? How did you do it? I am not sure I understand your question. You want to do parallel computing and share the arrays between processes, is that it? Ga?l From simpson at math.toronto.edu Sun Feb 1 10:03:30 2009 From: simpson at math.toronto.edu (Gideon Simpson) Date: Sun, 1 Feb 2009 10:03:30 -0500 Subject: [SciPy-user] shared memory machines In-Reply-To: <20090201095746.GA1099@phare.normalesup.org> References: <2AE6D153-799C-450E-8E69-CA80D12E2FF5@math.toronto.edu> <20090201095746.GA1099@phare.normalesup.org> Message-ID: Yes, but I'm talking about when you have a multiprocessor/multicore system, not a commodity cluster. In these shared memory configurations, were I using compiled code, I'd be able to use OpenMP to take advantage of the additional cores/processors. I'm wondering if anyone has looked at ways to take advantage of such configurations with scipy. -gideon On Feb 1, 2009, at 4:57 AM, Gael Varoquaux wrote: > On Sun, Feb 01, 2009 at 12:37:48AM -0500, Gideon Simpson wrote: >> Has anyone been able to take advantage of shared memory machines with >> scipy? How did you do it? > > I am not sure I understand your question. You want to do parallel > computing and share the arrays between processes, is that it? > > Ga?l > _______________________________________________ > SciPy-user mailing list > SciPy-user at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-user From gael.varoquaux at normalesup.org Sun Feb 1 10:29:40 2009 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Sun, 1 Feb 2009 16:29:40 +0100 Subject: [SciPy-user] shared memory machines In-Reply-To: References: <2AE6D153-799C-450E-8E69-CA80D12E2FF5@math.toronto.edu> <20090201095746.GA1099@phare.normalesup.org> Message-ID: <20090201152940.GD9757@phare.normalesup.org> On Sun, Feb 01, 2009 at 10:03:30AM -0500, Gideon Simpson wrote: > Yes, but I'm talking about when you have a multiprocessor/multicore > system, not a commodity cluster. In these shared memory > configurations, were I using compiled code, I'd be able to use OpenMP > to take advantage of the additional cores/processors. I'm wondering > if anyone has looked at ways to take advantage of such configurations > with scipy. I use the multiprocessing module: http://docs.python.org/library/multiprocessing.html I also have some code to share arrays between processes. I'd love to submit it for integration with numpy, but first I'd like it to get more exposure so that the eventual flaws in the APIs are found. I am attaching it. Actually I wrote this code a few months ago, and now that I am looking at it, I realise that the SharedMemArray should probably be a subclass of numpy.ndarray, and implement the full array signature. I am not sure if this is possible or not (ie if it will still be easy to have multiprocessing share the data between processes or not). I don't really have time for polishing this right, anybody wants to have a go? 
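For the simple case where copying the arrays to the workers is acceptable, a plain multiprocessing.Pool already goes a long way. A rough sketch (the worker count, chunking and the row_norms function below are arbitrary, illustrative choices, not part of the attached module):

import numpy as np
from multiprocessing import Pool

def row_norms(block):
    # Each worker receives a pickled copy of its chunk of rows.
    return np.sqrt((block * block).sum(axis=1))

if __name__ == '__main__':
    data = np.random.random((10000, 50))
    chunks = np.array_split(data, 4)
    pool = Pool(processes=4)
    result = np.concatenate(pool.map(row_norms, chunks))
    pool.close()
    pool.join()

This copies the data to each process; the attached module below is aimed at the case where that copy is too expensive and the memory really has to be shared.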
Ga?l > On Feb 1, 2009, at 4:57 AM, Gael Varoquaux wrote: > > On Sun, Feb 01, 2009 at 12:37:48AM -0500, Gideon Simpson wrote: > >> Has anyone been able to take advantage of shared memory machines with > >> scipy? How did you do it? > > I am not sure I understand your question. You want to do parallel > > computing and share the arrays between processes, is that it? -------------- next part -------------- """ Small helper module to share arrays between processes without copying data. Numpy arrays can be converted to shared memory arrays, which implement the array protocole, but are allocated in memory that can be share transparently by the multiprocessing module. """ # Author: Gael Varoquaux # Copyright: Gael Varoquaux # License: BSD import numpy as np import multiprocessing import ctypes _ctypes_to_numpy = { ctypes.c_char : np.int8, ctypes.c_wchar : np.int16, ctypes.c_byte : np.int8, ctypes.c_ubyte : np.uint8, ctypes.c_short : np.int16, ctypes.c_ushort : np.uint16, ctypes.c_int : np.int32, ctypes.c_uint : np.int32, ctypes.c_long : np.int32, ctypes.c_ulong : np.int32, ctypes.c_float : np.float32, ctypes.c_double : np.float64 } _numpy_to_ctypes = dict((value, key) for key, value in _ctypes_to_numpy.iteritems()) def shmem_as_ndarray(data, dtype=float): """ Given a multiprocessing.Array object, as created by ndarray_to_shmem, returns an ndarray view on the data. """ dtype = np.dtype(dtype) size = data._wrapper.get_size()/dtype.itemsize arr = np.frombuffer(buffer=data, dtype=dtype, count=size) return arr def ndarray_to_shmem(arr): """ Converts a numpy.ndarray to a multiprocessing.Array object. The memory is copied, and the array is flattened. """ arr = arr.reshape((-1, )) data = multiprocessing.RawArray(_numpy_to_ctypes[arr.dtype.type], arr.size) ctypes.memmove(data, arr.data[:], len(arr.data)) return data def test_ndarray_conversion(): """ Check that the conversion to multiprocessing.Array and back works. """ a = np.random.random((100, )) a_sh = ndarray_to_shmem(a) b = shmem_as_ndarray(a_sh) np.testing.assert_almost_equal(a, b) def test_conversion_non_flat(): """ Check that the conversion also works with non-flat arrays. """ a = np.random.random((100, 2)) a_flat = a.flatten() a_sh = ndarray_to_shmem(a) b = shmem_as_ndarray(a_sh) np.testing.assert_almost_equal(a_flat, b) def test_conversion_non_contiguous(): """ Check that the conversion also works with non-contiguous arrays. """ a = np.indices((3, 3, 3)) a = a.T a_flat = a.flatten() a_sh = ndarray_to_shmem(a) b = shmem_as_ndarray(a_sh, dtype=a.dtype) np.testing.assert_almost_equal(a_flat, b) def test_no_copy(): """ Check that the data is not copied from the multiprocessing.Array. """ a = np.random.random((100, )) a_sh = ndarray_to_shmem(a) a = shmem_as_ndarray(a_sh) b = shmem_as_ndarray(a_sh) a[0] = 1 np.testing.assert_equal(a[0], b[0]) a[0] = 0 np.testing.assert_equal(a[0], b[0]) ################################################################################ # A class to carry around the relevant information ################################################################################ class SharedMemArray(object): """ Wrapper around multiprocessing.Array to share an array accross processes. """ def __init__(self, arr): """ Initialize a shared array from a numpy array. The data is copied. """ self.data = ndarray_to_shmem(arr) self.dtype = arr.dtype self.shape = arr.shape def __array__(self): """ Implement the array protocole. 
""" arr = shmem_as_ndarray(self.data, dtype=self.dtype) arr.shape = self.shape return arr def asarray(self): return self.__array__() def test_sharing_array(): """ Check that a SharedMemArray shared between processes is indeed modified in place. """ # Our worker function def f(arr): a = arr.asarray() a *= -1 a = np.random.random((10, 3, 1)) arr = SharedMemArray(a) # b is a copy of a b = arr.asarray() np.testing.assert_array_equal(a, b) multiprocessing.Process(target=f, args=(arr,)).run() np.testing.assert_equal(-b, a) if __name__ == '__main__': import nose nose.runmodule() From josef.pktd at gmail.com Sun Feb 1 12:02:38 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Sun, 1 Feb 2009 12:02:38 -0500 Subject: [SciPy-user] bug in stats.pdfapprox/stats.pdf_moment and new Gram-Charlier distribution Message-ID: <1cd32cbb0902010902ga75134dxac08747aa19d524c@mail.gmail.com> I wanted to create a new distribution by wrapping stats.pdfapprox and stats.pdf_moment in stats.morestats.py. However, these two function do not create a normal expansion if the first four moments are given. The inner loop in stats.pdf_moment is never entered when four moments are given. As a consequence the pdf that is returned is the unexpanded normal distribution. (There is also a small mistake in stats.pdfapprox in how the variance is calculated.) I didn't find any information which type of expansion is used by pdf_moment. I assume it is Gram-Charlier, but I didn't find any formulas to make sense out of the inner loop that calculates the coefficients (for multiplying with the Hermite polynomials). If someone could provide a (understandable) reference for these calculations or figure out what the loop is supposed to do, then we could correct the expansion for the general case. Since I couldn't fix `pdf_moment`, I wrote a new function that calculates the pdf for the Gram-Charlier expansion when the first four moments (or mean, variance, skew, kurtosis) are given. This uses the explicit formula for this expansion, and doesn't allow for higher order expansion. pdf_mvsk: get pdf of G-Ch normal expansion using mean, variance, skew, and excess kurtosis This I wrapped in a subclass of _distributionsrv_continuous: NormExpan_gen It works in the examples that I tried but is not fully tested or cleaned up yet. attachment: * try_pdfapprox.py shows problem with current function * distr_gch.py new expansion pdf, and NormExpan distribution I also wrote a skew normal and skew t distribution (as defined by Azzalini, A. & Capitanio, A., univariate only), which is not attached. 
Josef -------------- next part -------------- from scipy import stats, special from scipy.stats import distributions import numpy as np def mvsk2cm(args): mu,sig,sk,kur = args # Get central moments cnt = [None]*4 cnt[0] = mu cnt[1] = sig #*sig cnt[2] = sk * sig**1.5 cnt[3] = (kur+3.0) * sig**2.0 return cnt rvs = stats.norm.rvs(5,size=(2,100)).max(axis=0) mvsk = stats.describe(rvs)[2:] print 'sample: mu,sig,sk,kur' print mvsk mc = mvsk2cm(mvsk) pdffn1 = stats.pdfapprox(rvs) print '\npdf approximation from sample' print 'pdf at mean-1, mean+1', mc[0]-1,mc[0]+1 print pdffn1([mc[0]-1,mc[0]+1]) pdffn2 = stats.pdf_moments(mc) print '\npdf approximation from moments' print 'pdf at mean-1, mean+1', mc[0]-1,mc[0]+1 print pdffn2([mc[0]-1,mc[0]+1]) -------------- next part -------------- '''Gram-Charlier distribution, four-moment expansion of normal distribution ''' from scipy import stats, special from scipy.stats import distributions import numpy as np def mvsk2cm(*args): mu,sig2,sk,kur = args # Get central moments cnt = [None]*4 cnt[0] = mu cnt[1] = sig2 #*sig wrong in stats.pdfapprox cnt[2] = sk * sig2**1.5 cnt[3] = (kur+3.0) * sig2**2.0 return cnt def mc2mvsk(args): mc, mc2, mc3, mc4 = args skew = mc3 / mc2**1.5 kurt = mc4 / mc2**2.0 - 3.0 return (mc, mc2, skew, kurt) def pdf_mvsk(mvsk): """Return the Gaussian expanded pdf function given the list of 1st, 2nd moment and skew and Fisher (excess) kurtosis. Parameters ---------- mvsk : list of mu, mc2, skew, kurt distribution is matched to these four moments Returns ------- pdffunc : function function that evaluates the pdf(x), where x is the non-standardized random variable. Notes ----- Changed so it works only if four arguments are given. Uses explicit formula, not loop. This implements a Gram-Charlier expansion of the normal distribution where the first 2 moments coincide with those of the normal distribution but skew and kurtosis can deviate from it. In the Gram-Charlier distribution it is possible that the density becomes negative. This is the case when the deviation from the normal distribution is too large. References ---------- http://en.wikipedia.org/wiki/Edgeworth_series Johnson N.L., S. Kotz, N. Balakrishnan: Continuous Univariate Distributions, Volume 1, 2nd ed., p.30 """ N = len(mvsk) if N < 4: raise ValueError, "Four moments must be given to" + \ "approximate the pdf." mu, mc2, skew, kurt = mvsk totp = poly1d(1) sig = sqrt(mc2) if N > 2: Dvals = stats.morestats._hermnorm(N+1) C3 = skew/6.0 C4 = kurt/24.0 # Note: Hermite polynomial for order 3 in _hermnorm is negative # instead of positive totp = totp - C3*Dvals[3] + C4*Dvals[4] def pdffunc(x): xn = (x-mu)/sig return totp(xn)*np.exp(-xn*xn/2.0)/np.sqrt(2*np.pi)/sig return pdffunc class NormExpan_gen(distributions.rv_continuous): def __init__(self,args, **kwds): distributions.rv_continuous.__init__(self, name = 'Normal Expansion distribution', extradoc = ''' The distribution is defined as the Gram-Charlier expansion of the normal distribution using the first four moments. The pdf is given by pdf(x) = (1+ skew/6.0 * H(xc,3) + kurt/24.0 * H(xc,4))*normpdf(xc) where xc = (x-mu)/sig is the standardized value of the random variable and H(xc,3) and H(xc,4) are Hermite polynomials. Note: This distribution has to be parameterized during initialization and instantiation, and does not have a shape parameter after instantiation (similar to frozen distribution except for location and scale.) 
Location and scale can be used as with other distributions, however note, that they are relative to the initialized distribution. ''' ) mode = kwds.get('mode', 'sample') if mode == 'sample': mu,sig2,sk,kur = stats.describe(args)[2:] self.mvsk = (mu,sig2,sk,kur) cnt = mvsk2cm(mu,sig2,sk,kur) elif mode == 'mvsk': cnt = mvsk2cm(args) self.mvsk = args elif mode == 'centmom': cnt = args self.mvsk = mc2mvsk(cnt) else: raise ValueError, "mode must be 'mvsk' or centmom" self.cnt = cnt self._pdf = pdf_mvsk(self.mvsk) def _munp(self,n): # use pdf integration with _mom0_sc if only _pdf is defined. # default stats calculation uses ppf return self._mom0_sc(n) def _stats_skip(self): # skip for now to force numerical integration of pdf for testing return self.mvsk if __name__ == '__main__': rvs = skewnorm.rvs(5,size=100) normexpan = NormExpan_gen(rvs, mode='sample') smvsk = stats.describe(rvs)[2:] print 'sample: mu,sig2,sk,kur' print smvsk dmvsk = normexpan.stats(moments='mvsk') print 'normexpan: mu,sig2,sk,kur' print dmvsk print 'mvsk diff distribution - sample' print np.array(dmvsk) - np.array(smvsk) print '\nnormexpan attributes mvsk' print mc2mvsk(normexpan.cnt) print normexpan.mvsk print '\nusing methods' print normexpan.rvs(size=5) #slow print normexpan.cdf([-1,0,1,2]) print normexpan.pdf([-1,0,1,2]) print normexpan.ppf([0,0.1,0.5,0.9,0.95,1]) print normexpan.sf([0,0.1,0.5,0.9,0.95,1]) From timmichelsen at gmx-topmail.de Sun Feb 1 15:48:32 2009 From: timmichelsen at gmx-topmail.de (Tim Michelsen) Date: Sun, 01 Feb 2009 21:48:32 +0100 Subject: [SciPy-user] SciPy and GUI In-Reply-To: <20090125100556.GA29918@phare.normalesup.org> References: <20090125100556.GA29918@phare.normalesup.org> Message-ID: <49860AA0.1090300@gmx-topmail.de> Hello, Some ideas on adding a GUI to scientif scripts can be found in the following book: Python Scripting for Computational Science, by H. P. Langtangen. http://folk.uio.no/hpl/scripting/ I am currently as well at a point within my developments where user interaction is needed. Currently, I see three options with different levels of complexity. 1) use commandline (OptParse) with config files 2) add some simple GUIs that pop up where user input is needed. 3) code a real GUI with a Toolkit. > I would use traits (see > http://code.enthought.com/projects/traits/documentation.php, and > http://code.enthought.com/projects/traits/docs/html/tutorials/traits_ui_scientific_app.html > for documentation and a tutorial) I read your tutorial. I think it is one of the best I read that are targeting non-programmer scientists who need to to task specific coding. Your Physics Lab background shows that you know the difficulties of your readers. Well done! I shows that GUIs can be created with as little overhead code as possible. Nevertheless, I have some questions: * Where is the "science" in TraitsUI? (Why do you call it a scientific GUI?) E.g. I could also build a Wizard directly with wxPython. So why with Traits? * I tried the examples. What I did not understand is how one can control the buttons below the Traits objects. For the first example (section "An object and its representation"), there are 6 buttons in your image: Undo, Redo, Revert, OK, Cancel, Help. When I execute the code I only get OK, Cancel. May you tell how or where to find information how buttons can be contolled? * Input validation: I remember to have seen a example where a Traits Window was used to validate (numeric) input. If the user puts in invalid numers, it would turn read.. Do you know about this? 
* Is there a feature roadmap for traits? I would like to know where you intend it to develop it to before I settle on it. Others users may also be interested, so I relink to an earlier post: example application for a starter with TraitsUI http://thread.gmane.org/gmane.comp.python.enthought.devel/18246 It maybe of interest for many prospective beginners to see example applications. Why not listing all accessible applications built with TraitsUI on a website? I think that Enthought should put a strong pointer on their website (http://code.enthought.com/) indicating that actually a lot of documentation can also be found on the Trac wiki (https://svn.enthought.com/enthought/wiki). Kind regards, Timmie From marko.loparic at gmail.com Sun Feb 1 15:55:44 2009 From: marko.loparic at gmail.com (Marko Loparic) Date: Sun, 1 Feb 2009 21:55:44 +0100 Subject: [SciPy-user] python alternative to java rich ajax platform (RAP) for a thin client of a mathematical model? Message-ID: Hi, Do you know a python alternative to rich ajax platform (RAP)? For the development of a user interface for a mathematical model someone suggested me to use eclipse and that tool: http://www.eclipse.org/rap/ http://www.eclipse.org/rap/about.php If I understand correctly it allows you to design very easily a web client interacting with your application (in particular you write no javascript and the powerful javascript routines from qooxdoo are called for you). It seems to be a very interesting package except that... it is in java. So I would like to know if there is something with similar power made in python. Alternatively I would like to know if there is a way to use this tool doing the minimum in java and the most in python (probably not...). Thanks! Marko (sorry for crossposting with comp.lang.python, but it seems that scipy community is quite different) From alex.liberzon at gmail.com Sun Feb 1 16:27:10 2009 From: alex.liberzon at gmail.com (Alex Liberzon) Date: Sun, 1 Feb 2009 23:27:10 +0200 Subject: [SciPy-user] Automating Matlab Message-ID: <775f17a80902011327x225b9d7ve28eeff39f3024@mail.gmail.com> +1 On Sun, Feb 1, 2009 at 5:29 PM, wrote: > Send SciPy-user mailing list submissions to > scipy-user at scipy.org > > To subscribe or unsubscribe via the World Wide Web, visit > http://projects.scipy.org/mailman/listinfo/scipy-user > or, via email, send a message with subject or body 'help' to > scipy-user-request at scipy.org > > You can reach the person managing the list at > scipy-user-owner at scipy.org > > When replying, please edit your Subject line so it is more specific > than "Re: Contents of SciPy-user digest..." > > > Today's Topics: > > 1. Automating Matlab (Eric Schug) > 2. Re: Automating Matlab (Robert Kern) > 3. Re: Automating Matlab (David Warde-Farley) > 4. shared memory machines (Gideon Simpson) > 5. Re: Automating Matlab (Young, Karl) > 6. Re: Automating Matlab (gsever) > 7. Re: Automating Matlab (Stef Mientki) > 8. Re: shared memory machines (Gael Varoquaux) > 9. Re: shared memory machines (Gideon Simpson) > 10. Re: shared memory machines (Gael Varoquaux) > > > ---------------------------------------------------------------------- > > Message: 1 > Date: Sat, 31 Jan 2009 20:06:20 -0500 > From: Eric Schug > Subject: [SciPy-user] Automating Matlab > To: scipy-user at scipy.org > Message-ID: <4984F58C.5070605 at gmail.com> > Content-Type: text/plain; charset=ISO-8859-1; format=flowed > > Is there strong interest in automating matlab to numpy conversion? 
> [...]
URL: From gael.varoquaux at normalesup.org Sun Feb 1 16:35:17 2009 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Sun, 1 Feb 2009 22:35:17 +0100 Subject: [SciPy-user] SciPy and GUI In-Reply-To: <49860AA0.1090300@gmx-topmail.de> References: <20090125100556.GA29918@phare.normalesup.org> <49860AA0.1090300@gmx-topmail.de> Message-ID: <20090201213517.GG931@phare.normalesup.org> On Sun, Feb 01, 2009 at 09:48:32PM +0100, Tim Michelsen wrote: > > I would use traits (see > > http://code.enthought.com/projects/traits/documentation.php, and > > http://code.enthought.com/projects/traits/docs/html/tutorials/traits_ui_scientific_app.html > > for documentation and a tutorial) > * Where is the "science" in TraitsUI? (Why do you call it a scientific GUI?) > E.g. I could also build a Wizard directly with wxPython. So why with > Traits? There are two questions here: 1. What is scientific with Traits? 2. Why Traits rather than raw WxPython? Answer to 1): Per se Traits has nothing scientific and can be used for non-scientific applications. Now the people behind Traits do scientific computing. As a results Traits integrates perfectly with numpy, or Mayavi, or the Chaco visualization library. In addition there are plenty of widgets that are very relevant to scientific applications (such as slider bars). Answer to 2): Its a question of using the right abstraction level. WxPython is a library of widgets, events and eventloops. It forces you to think in these terms and not in terms of models and views. Traits makes you think in terms of building a model, making it live with a set of callbacks, and adding a view on top of it. The code is much clearer because it is not riddled with references to 'wx.TextField', and the reactive-programming model is much easier to follow than explicit registering of callbacks (it is interesting to note that Qt has started to move in this direction in Qt4, although the corresponding PyQt code is not terribly Pythonic). Moreover, the event loop is mostly hidden to the user, in Traits. This is possible because of the implicit View/Model separation and the 'message passing' programming style that comes from heavy use of callbacks on attribute modification. As a result, threading issues with event loops (which are a really bitch) are hidden with Traits: Traits, and TraitsUI is mostly thread-safe. In Wx, you will quickly have to understand the fine details of the event loop, which is interesting, but quite off-topic for the scientific programmer. But the really important thing about Traits is that is folds together a set of patterns and best-practices, such as validation, model-view separation, default-initialization, cheap callbacks/the observer pattern. Using Traits puts you on a good path building a good architecture to your application. If you are using the raw toolkit you can still architecture your application right, but you need more experience, more knowledge. It is so easy to mix model and view when manipulating widgets, and not an abstraction to them (I did this this summer without realizing it, and regretted it a lot much later). > * I tried the examples. > What I did not understand is how one can control the buttons below the > Traits objects. > For the first example (section "An object and its representation"), > there are 6 buttons in your image: > Undo, Redo, Revert, OK, Cancel, Help. > When I execute the code I only get OK, Cancel. > May you tell how or where to find information how buttons can be contolled? I can tell you this. 
You need to write a handler for your view: http://code.enthought.com/projects/traits/docs/html/TUIUG/handler.html To give you a short example to do this: from enthought.traits.api import HasTraits, Int from enthought.traits.ui.api import View, Handler class MyHandler(Handler): def closed(self, info, is_ok): if is_ok: print 'User closed the window, and is happy' else: print 'User closed the window, and is unhappy' class Model(HasTraits): a = Int view = View('a', handler=MyHandler(), buttons=['OK', 'Cancel']) model = Model() model.configure_traits(view=view) However, if you are not programming a reactive application, I would try to put as little code as possible in the handler, and put the logics in the code following the 'configure_traits' call. If you need to know if the user pressed 'OK' or 'Cancel', I would capture this and store it in the Handler, but I would put the processing logics later on. That's another case of separating the core logics (called 'model') from the view-related logics. > * Input validation: I remember to have seen a example where a Traits > Window was used to validate (numeric) input. If the user puts in invalid > numers, it would turn read.. Do you know about this? Sure, that's easy: when you specify the traits, you specify its type (in the above example it is an int), if the user enters a wrong type, the text box turns read, and the corresponding attribute is not changed. > * Is there a feature roadmap for traits? > I would like to know where you intend it to develop it to before I > settle on it. Traits 3 was release 6 months ago. It was a major overhaul (although the API didn't change much). Ever since development has been fairly limited. It seems that people are mainly happy with what we have right now. Of course Traits has limitations (including some design issues, nobody is perfect). In addition some specific needs might arise. Remember, there is a company behind Traits. Thus you may see some new developments, or additions. I don't expect a major change anytime soon. > It maybe of interest for many prospective beginners to see example > applications. Why not listing all accessible applications built with > TraitsUI on a website? Most of them are not open source. The open source ones (SciPyLab, Mayavi) are fairly complex, and I would advise a beginner to look into them. > I think that Enthought should put a strong pointer on their website > (http://code.enthought.com/) indicating that actually a lot of > documentation can also be found on the Trac wiki > (https://svn.enthought.com/enthought/wiki). You probably have a point. Documenting a beast like that is not easy, believe me :). HTH, Ga?l From sturla at molden.no Sun Feb 1 19:51:08 2009 From: sturla at molden.no (Sturla Molden) Date: Mon, 2 Feb 2009 01:51:08 +0100 (CET) Subject: [SciPy-user] shared memory machines In-Reply-To: <2AE6D153-799C-450E-8E69-CA80D12E2FF5@math.toronto.edu> References: <2AE6D153-799C-450E-8E69-CA80D12E2FF5@math.toronto.edu> Message-ID: <747c5db37a4e870a8e8f562a4636c6e7.squirrel@webmail.uio.no> > Has anyone been able to take advantage of shared memory machines with > scipy? How did you do it? I have either used OpenML in C or Fortran 90 extension modules, or multiprocessing in Python. If you have lengthy calculations in extension libraries you can also use Python threads, given that your extension releases the GIL. I have been working on a multiprocessing + NumPy cookbook tutorial. 
For now the unfinished draft is here: http://folk.uio.no/sturlamo/python/multiprocessing-tutorial.pdf It only covers shared memory programming, though. I will also add how to use Queues for message-passing. Many prefer message-passing to shared memory. You avoid performance problems due to 'false sharing', and there is often less resource contention. The difference betwwen threading and multiprocessing should also be better covered. Both are applicable, but in defferent contexts. And with threading you can also choose between 'shared-objects' and message-passing (Queue.Queue). Sturla Molden From sturla at molden.no Sun Feb 1 20:07:22 2009 From: sturla at molden.no (Sturla Molden) Date: Mon, 2 Feb 2009 02:07:22 +0100 (CET) Subject: [SciPy-user] Automating Matlab In-Reply-To: <3d375d730901311734o388adf56y9f3241032ed409c2@mail.gmail.com> References: <4984F58C.5070605@gmail.com> <3d375d730901311734o388adf56y9f3241032ed409c2@mail.gmail.com> Message-ID: > On Sat, Jan 31, 2009 at 19:06, Eric Schug wrote: >> Is there strong interest in automating matlab to numpy conversion? > > Yes! Please post your code somewhere! For those who are interested, there are two ways of doing this: The most portable is to call the 'Matlab engine', which is a C and Fortran library for automating Matlab. This can be done using f2py or ctypes (wrap libeng.dll and libmx.dll). http://www.mathworks.com/access/helpdesk/help/techdoc/index.html?/access/helpdesk/help/techdoc/matlab_external/f29148.html&http://www.google.no/search?rlz=1C1GGLS_noNO291NO303&aq=f&sourceid=chrome&ie=UTF-8&q=matlab+engine The other option (Windows only) is to use Matlab as an outproc COM server. This will require pywin32. S.M. From sturla at molden.no Sun Feb 1 20:33:38 2009 From: sturla at molden.no (Sturla Molden) Date: Mon, 2 Feb 2009 02:33:38 +0100 (CET) Subject: [SciPy-user] shared memory machines In-Reply-To: <20090201152940.GD9757@phare.normalesup.org> References: <2AE6D153-799C-450E-8E69-CA80D12E2FF5@math.toronto.edu> <20090201095746.GA1099@phare.normalesup.org> <20090201152940.GD9757@phare.normalesup.org> Message-ID: <39cf90a17c7a1ea627e6273ce09e0da2.squirrel@webmail.uio.no> > On Sun, Feb 01, 2009 at 10:03:30AM -0500, Gideon Simpson wrote: > Actually I wrote this code a few months ago, and now that I am looking at > it, I realise that the SharedMemArray should probably be a subclass of > numpy.ndarray, and implement the full array signature. I am not sure if > this is possible or not (ie if it will still be easy to have > multiprocessing share the data between processes or not). ? You can use multiprocessing.Array to allocate shared memory, and use its buffer to create an ndarray with numpy.frombuffer. Basically multiprocessing can use whatever can be pickled. ndarrays copy their contents when pickled, and subclasses seem to inherit this behaviour. Note that this is perfectly okay if you are happy with a message-passing approach to parallel computing. When using mp.Array as shared memory, the object must be passed to multiprocessing.Process on instantiation. This is because of handle inheritance. Therefore you cannot pass an instance of mp.Array through a mp.Queue or mp.Pipe. If we use named share memory (System V IPC) instead of BSD mmap, we can probably get around this. But support for this li lacking in Python and SciPy. 
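To make the first point concrete, a minimal sketch of the RawArray + numpy.frombuffer pattern described above (dtype and length are hard-coded for brevity, and the names are illustrative; the key point is that the buffer is created before, and passed to, mp.Process):

import numpy as np
import multiprocessing as mp
import ctypes

def worker(shared, n):
    # Re-wrap the inherited shared buffer as an ndarray view (no copy).
    a = np.frombuffer(shared, dtype=np.float64, count=n)
    a *= -1

if __name__ == '__main__':
    n = 10
    # Allocate the shared buffer *before* creating the child process,
    # and pass it as an argument to mp.Process (it cannot go through a Queue).
    shared = mp.RawArray(ctypes.c_double, n)
    a = np.frombuffer(shared, dtype=np.float64, count=n)
    a[:] = np.arange(n)
    p = mp.Process(target=worker, args=(shared, n))
    p.start()
    p.join()
    print a   # negated in place by the child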
Sturla Molden From david at ar.media.kyoto-u.ac.jp Sun Feb 1 22:43:27 2009 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Mon, 02 Feb 2009 12:43:27 +0900 Subject: [SciPy-user] Automating Matlab In-Reply-To: References: <4984F58C.5070605@gmail.com> <3d375d730901311734o388adf56y9f3241032ed409c2@mail.gmail.com> Message-ID: <49866BDF.2000809@ar.media.kyoto-u.ac.jp> Sturla Molden wrote: > > For those who are interested, there are two ways of doing this: > I think Eric talked about source code translation, that is .m to .py. > The most portable is to call the 'Matlab engine', which is a C and Fortran > library for automating Matlab. This can be done using f2py or ctypes (wrap > libeng.dll and libmx.dll). > If you are not aware of it, there is already code for it: http://svn.scipy.org/svn/scikits/trunk/mlabwrap/ cheers, David From gael.varoquaux at normalesup.org Mon Feb 2 01:38:33 2009 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Mon, 2 Feb 2009 07:38:33 +0100 Subject: [SciPy-user] shared memory machines In-Reply-To: <747c5db37a4e870a8e8f562a4636c6e7.squirrel@webmail.uio.no> References: <2AE6D153-799C-450E-8E69-CA80D12E2FF5@math.toronto.edu> <747c5db37a4e870a8e8f562a4636c6e7.squirrel@webmail.uio.no> Message-ID: <20090202063833.GB9627@phare.normalesup.org> On Mon, Feb 02, 2009 at 01:51:08AM +0100, Sturla Molden wrote: > I have been working on a multiprocessing + NumPy cookbook tutorial. For > now the unfinished draft is here: > http://folk.uio.no/sturlamo/python/multiprocessing-tutorial.pdf Hey, it's a very interested document. It seems that you have quite a lot of insight on these problems. I hadn't realized that a numpy array with the memory alocated as shared memory would be automaticaly shared by multiprocessing (I tried, and to my surprise, it works). So it seems that shmem_as_ndarray (the implementation of which is fairly similar in your code and in mine), and probably probably some array creation helper like empty_shmem, is all we need to use multiprocessing with numpy. Do you concur? I also like a lot your code to figure out the number of processor. It is very useful in a multiprocessing scientific computing package. However my limitation is more often than not memory. Do you have cross platform code to analyse the percent of memory used, and the absolute amount of memory free? I think I should write empty_shmem, to complete hide the multiprocessing Array, delete my useless SharedMemArray class, integrate your number of processor function, and recirculate my code, if it is OK with you. In a few iterations we can propose this for integration in numpy. Cheers, Ga?l From robert.kern at gmail.com Mon Feb 2 01:51:51 2009 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 2 Feb 2009 00:51:51 -0600 Subject: [SciPy-user] shared memory machines In-Reply-To: <20090202063833.GB9627@phare.normalesup.org> References: <2AE6D153-799C-450E-8E69-CA80D12E2FF5@math.toronto.edu> <747c5db37a4e870a8e8f562a4636c6e7.squirrel@webmail.uio.no> <20090202063833.GB9627@phare.normalesup.org> Message-ID: <3d375d730902012251y1159737fk325a923a344f25cf@mail.gmail.com> On Mon, Feb 2, 2009 at 00:38, Gael Varoquaux wrote: > I think I should write empty_shmem, to complete hide the multiprocessing > Array, delete my useless SharedMemArray class, integrate your number of > processor function, and recirculate my code, if it is OK with you. In a > few iterations we can propose this for integration in numpy. Here's mine, FWIW. 
It goes down directly to the multiprocessing.heap code underlying the Array stuff. On Windows, the objects transfer via pickle while under UNIX, they must be inherited. Windows mmap objects can be pickled while UNIX mmap objects can't. Like Sturla says, we'd have to use named shared memory to get around this on UNIX. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco -------------- next part -------------- A non-text attachment was scrubbed... Name: shared_array.py Type: text/x-python-script Size: 2728 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: example.py Type: text/x-python-script Size: 1406 bytes Desc: not available URL: From bgbg.bg at gmail.com Mon Feb 2 03:12:33 2009 From: bgbg.bg at gmail.com (bgbg bg) Date: Mon, 2 Feb 2009 10:12:33 +0200 Subject: [SciPy-user] concatenating arrays of different dimensions Message-ID: <57b9201a0902020012o31524f04ic34cda82c27920b@mail.gmail.com> Hello, Consider an Octave code that concatenates an array and a vector: octave:1> a = [1, 2, 3]; octave:2> b = [ 11, 22, 33; 44, 55 66]; octave:3> c = [a; b] c = 1 2 3 11 22 33 44 55 66 octave:4> How do I emulate this behavior in Python (scipy)? This is what i tried: In [37]: from scipy import array In [38]: a = array([1,2,3]) In [39]: b = array([ [11,22,33], [44, 55, 66]]) In [40]: c = [a, b] In [41]: print c [array([1, 2, 3]), array([[11, 22, 33], [44, 55, 66]])] In [42]: # not good In [43]: c = concatenate((a,b)) --------------------------------------------------------------------------- ValueError Traceback (most recent call last) ValueError: arrays must have same number of dimensions In [44]: c = concatenate((a,b),1) --------------------------------------------------------------------------- ValueError Traceback (most recent call last) ValueError: arrays must have same number of dimensions In [45]: From pgmdevlist at gmail.com Mon Feb 2 03:23:26 2009 From: pgmdevlist at gmail.com (Pierre GM) Date: Mon, 2 Feb 2009 03:23:26 -0500 Subject: [SciPy-user] concatenating arrays of different dimensions In-Reply-To: <57b9201a0902020012o31524f04ic34cda82c27920b@mail.gmail.com> References: <57b9201a0902020012o31524f04ic34cda82c27920b@mail.gmail.com> Message-ID: <3777FF91-1F96-4179-9599-73A3607591B7@gmail.com> On Feb 2, 2009, at 3:12 AM, bgbg bg wrote: > Hello, > Consider an Octave code that concatenates an array and a vector: > octave:1> a = [1, 2, 3]; > octave:2> b = [ 11, 22, 33; 44, 55 66]; > octave:3> c = [a; b] > > > How do I emulate this behavior in Python (scipy)? 
This is what i > tried: > c = np.vstack((a,b)) For more info: http://www.scipy.org/Numpy_Functions_by_Category#head-ca5d5fe8c131a7ab8f7d7d38796ff84dbf4a2bd0 From ludovic.drouineau at ifremer.fr Mon Feb 2 04:41:50 2009 From: ludovic.drouineau at ifremer.fr (Ludovic DROUINEAU) Date: Mon, 02 Feb 2009 10:41:50 +0100 Subject: [SciPy-user] Problem reading NetCDF File In-Reply-To: <6a17e9ee0901300607l5345ca65oe927f32e48462592@mail.gmail.com> References: <4982F0EB.1000102@ifremer.fr> <6a17e9ee0901300607l5345ca65oe927f32e48462592@mail.gmail.com> Message-ID: <4986BFDE.2030007@ifremer.fr> Scott Sinclair a ?crit : >> 2009/1/30 Ludovic DROUINEAU : >> Hi all, >> >> When I try to open a NetCDF file, I have the following error: >> File "C:\Python25\lib\site-packages\scipy\io\netcdf.py", line 194, in >> _read_values >> count = n * bytes[nc_type-1] >> IndexError: list index out of range >> >> My code is: >> from scipy.io import netcdf >> >> nc = netcdf.netcdf_file ('test.nc', 'r') >> > > I'm not sure if anyone is actively maintaining scipy.io.netcdf (you'll > find out if there is a response to your query). In case there isn't, > you might have better luck with one of the following: > > http://code.google.com/p/netcdf4-python/ > http://matplotlib.sourceforge.net/basemap/doc/html/api/basemap_api.html#mpl_toolkits.basemap.NetCDFFile > http://www.pyngl.ucar.edu/Nio.shtml > http://pypi.python.org/pypi/pupynere/1.0 > > Cheers, > Scott > _______________________________________________ > SciPy-user mailing list > SciPy-user at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-user > Hi all, I'm quite new to python and I had many problems installing netcdf4-python. I have installed netcdf, hdf5, szlib, zlib And I try to install netcdf4-python with: python setup.py install running install running build running config_cc unifing config_cc, config, build_clib, build_ext, build commands --compiler options running config_fc unifing config_fc, config, build_clib, build_ext, build commands --fcompiler options running build_src building py_modules sources building extension "netCDF4" sources running build_py running build_ext No module named msvccompiler in numpy.distutils; trying from distutils error: Python was built with Visual Studio 2003; extensions must be built with a compiler than can generate compatible binaries. Visual Studio 2003 was not found on this system. If you have Cygwin installed, you can try compiling with MingW32, by passing "-c mingw32" to setup.py. Anyway, I have installed pupynere (with the tar.gz) With the egg Do I just have to run "easy-install pupinere.egg" ? And when I tried to test it with this line: |>>> from pupynere import NetCDFFile >>> f = nc('example.nc', 'w') | It failed with: NameError: name 'nc' is not defined May be the documentation in http://dealmeida.net/2008/07/14/pupynere is old But when I tried: f = netcdf_file('test.nc', 'r') print f.history time = f.variables['time'][:] lat = f.variables['lat'][:] Everything works fine Regards Ludovic -- Ludovic DROUINEAU NSE/ILE Ifremer Centre de Brest BP 70 - 29280 Plouzan? t?l. 
33 (0)2 98 22 40 94 email Ludovic.Drouineau at ifremer.fr From gael.varoquaux at normalesup.org Mon Feb 2 05:53:16 2009 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Mon, 2 Feb 2009 11:53:16 +0100 Subject: [SciPy-user] shared memory machines In-Reply-To: <3d375d730902012251y1159737fk325a923a344f25cf@mail.gmail.com> References: <2AE6D153-799C-450E-8E69-CA80D12E2FF5@math.toronto.edu> <747c5db37a4e870a8e8f562a4636c6e7.squirrel@webmail.uio.no> <20090202063833.GB9627@phare.normalesup.org> <3d375d730902012251y1159737fk325a923a344f25cf@mail.gmail.com> Message-ID: <20090202105316.GE11955@phare.normalesup.org> On Mon, Feb 02, 2009 at 12:51:51AM -0600, Robert Kern wrote: > On Mon, Feb 2, 2009 at 00:38, Gael Varoquaux > wrote: > > I think I should write empty_shmem, to complete hide the multiprocessing > > Array, delete my useless SharedMemArray class, integrate your number of > > processor function, and recirculate my code, if it is OK with you. In a > > few iterations we can propose this for integration in numpy. > Here's mine, FWIW. It goes down directly to the multiprocessing.heap > code underlying the Array stuff. On Windows, the objects transfer via > pickle while under UNIX, they must be inherited. Windows mmap objects > can be pickled while UNIX mmap objects can't. Like Sturla says, we'd > have to use named shared memory to get around this on UNIX. Well, you know way more than I do about this. But I fear I am miss-understanding something. Does what you are saying means that an 'empty_shmem', that would create a multiprocessing Array, and expose it as a numpy array, is bound to fail under windows? My experiments seem to show that this works under Linux, and this would be a very simple way of doing shared memory. We could have a numpy.multiprocessing, with all kind of constructors for arrays (empty, zeros, ones, *_like, and maybe 'array') that would be shared between process. Am I out of my mind, and will this fail utterly? Cheers, Ga?l From david_baddeley at yahoo.com.au Mon Feb 2 06:36:11 2009 From: david_baddeley at yahoo.com.au (David Baddeley) Date: Mon, 2 Feb 2009 03:36:11 -0800 (PST) Subject: [SciPy-user] Matlab style line based profiling Message-ID: <365959.22116.qm@web33006.mail.mud.yahoo.com> Hi all, a while ago I drummed up a matlab like profiling module which gives times for individual lines. Since then I've found http://packages.python.org/line_profiler/ which seems to be a bit more refined and should be somewhat faster (mine is pure python, both hook the tracing functions). Where my code does have an advantage is that I've got it producing syntax highlighted html with the most expensive lines highlighted in red with the times in the margin, like the matlab profiler. I've also used a variant of the profile on, profile off, profile report syntax which should be familiar to matlab users. Would like to make it available, but am not sure how much demand there would be for a second line profiler module and whether it wouldn't be more sensible to see if the report generation couldn't be adapted to work with the aforementioned line_profiler module (there might be licensing issues here as my html generation borrows heavily from a GPL licensed python syntax highlighter and I'm not sure if Robert would be keen on having his module tainted). have attached the current code and would welcome any input, Cheers, David Get the world's best email - http://nz.mail.yahoo.com/ -------------- next part -------------- A non-text attachment was scrubbed... 
Name: colorize_db_t.py Type: text/x-python Size: 7971 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: mProfile.py Type: text/x-python Size: 2710 bytes Desc: not available URL: From robert.kern at gmail.com Mon Feb 2 11:48:44 2009 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 2 Feb 2009 10:48:44 -0600 Subject: [SciPy-user] Matlab style line based profiling In-Reply-To: <365959.22116.qm@web33006.mail.mud.yahoo.com> References: <365959.22116.qm@web33006.mail.mud.yahoo.com> Message-ID: <3d375d730902020848w13cf82aan1c8b024c029294a2@mail.gmail.com> On Mon, Feb 2, 2009 at 05:36, David Baddeley wrote: > Hi all, > > a while ago I drummed up a matlab like profiling module which gives times for individual lines. Since then I've found http://packages.python.org/line_profiler/ which seems to be a bit more refined and should be somewhat faster (mine is pure python, both hook the tracing functions). Where my code does have an advantage is that I've got it producing syntax highlighted html with the most expensive lines highlighted in red with the times in the margin, like the matlab profiler. I've also used a variant of the profile on, profile off, profile report syntax which should be familiar to matlab users. > > Would like to make it available, but am not sure how much demand there would be for a second line profiler module and whether it wouldn't be more sensible to see if the report generation couldn't be adapted to work with the aforementioned line_profiler module (there might be licensing issues here as my html generation borrows heavily from a GPL licensed python syntax highlighter and I'm not sure if Robert would be keen on having his module tainted). Not much! :-) However, there is a version of the colorization code that you used that is more palatably licensed: http://code.activestate.com/recipes/52298/ Here is IPython's version, which uses ANSI escape sequences for terminal color output, which would also be a nice addition to the text output: http://bazaar.launchpad.net/~ipython-dev/ipython/trunk/annotate/head%3A/IPython/PyColorize.py I would be happy to accept contributions in this vein. I was gearing up to release 1.0b2 in the next day or two, but if you would like to get a patch together in the next week, I can wait. Let me know if the default line_profiler workflow doesn't work well for you. If you have suggestions for alternatives, I'm happy to listen. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From sturla at molden.no Mon Feb 2 12:24:18 2009 From: sturla at molden.no (Sturla Molden) Date: Mon, 2 Feb 2009 18:24:18 +0100 (CET) Subject: [SciPy-user] shared memory machines In-Reply-To: <20090202105316.GE11955@phare.normalesup.org> References: <2AE6D153-799C-450E-8E69-CA80D12E2FF5@math.toronto.edu> <747c5db37a4e870a8e8f562a4636c6e7.squirrel@webmail.uio.no> <20090202063833.GB9627@phare.normalesup.org> <3d375d730902012251y1159737fk325a923a344f25cf@mail.gmail.com> <20090202105316.GE11955@phare.normalesup.org> Message-ID: <786d3e06228152ae2c30291b139983e4.squirrel@webmail.uio.no> > On Mon, Feb 02, 2009 at 12:51:51AM -0600, Robert Kern wrote: > Well, you know way more than I do about this. But I fear I am > miss-understanding something. 
Does what you are saying means that an > 'empty_shmem', that would create a multiprocessing Array, and expose it > as a numpy array, is bound to fail under windows? Linux: You can create shared memory using BSD mmap or System V IPC. Multiprocessing does the former. Shared memory created via BSD mmap is "unnamed". Thus, it has to be created in the parent prior prior to the call to fork(), otherwise the child cannot get access to it. That is why mp.Array must be created prior to mp.Process (the latter calls os.fork). Windows: There is no fork(). Shared memory can be named or unnamed. In the second case, it is passed to the spawned process via handle inheritance. This is what multiprocessing does. Again, the consequence is that it must med created prior to the creation of mp.Process. In this case it must actually be passed as an argument to to mp.Process when it is instantiated. However: If we had an ndarray that used named shared memory as buffer, it would be more convinient on Windows and Linux alike. Any process can map this segment if it knows its name. It would only pickle the name of the shared segment (as well as dtype and shape), and could thus be messaged between processes using mp.Queue. Currently we can only send copies of private memory arrays via mp.Queue. > My experiments seem to > show that this works under Linux, and this would be a very simple way of > doing shared memory. We could have a numpy.multiprocessing, with all kind > of constructors for arrays (empty, zeros, ones, *_like, and maybe > 'array') that would be shared between process. > > Am I out of my mind, and will this fail utterly? It will work. But we should use named shared memory (which requires some C or Cython coding), not BSD mmap as mp.Array currently does. Also we must override how ndarrays are pickled. Sturla Molden From robert.kern at gmail.com Mon Feb 2 12:29:06 2009 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 2 Feb 2009 11:29:06 -0600 Subject: [SciPy-user] shared memory machines In-Reply-To: <20090202105316.GE11955@phare.normalesup.org> References: <2AE6D153-799C-450E-8E69-CA80D12E2FF5@math.toronto.edu> <747c5db37a4e870a8e8f562a4636c6e7.squirrel@webmail.uio.no> <20090202063833.GB9627@phare.normalesup.org> <3d375d730902012251y1159737fk325a923a344f25cf@mail.gmail.com> <20090202105316.GE11955@phare.normalesup.org> Message-ID: <3d375d730902020929q68d1b163t5bcc424f52459c8a@mail.gmail.com> On Mon, Feb 2, 2009 at 04:53, Gael Varoquaux wrote: > On Mon, Feb 02, 2009 at 12:51:51AM -0600, Robert Kern wrote: >> On Mon, Feb 2, 2009 at 00:38, Gael Varoquaux >> wrote: >> > I think I should write empty_shmem, to complete hide the multiprocessing >> > Array, delete my useless SharedMemArray class, integrate your number of >> > processor function, and recirculate my code, if it is OK with you. In a >> > few iterations we can propose this for integration in numpy. > >> Here's mine, FWIW. It goes down directly to the multiprocessing.heap >> code underlying the Array stuff. On Windows, the objects transfer via >> pickle while under UNIX, they must be inherited. Windows mmap objects >> can be pickled while UNIX mmap objects can't. Like Sturla says, we'd >> have to use named shared memory to get around this on UNIX. > > Well, you know way more than I do about this. But I fear I am > miss-understanding something. Does what you are saying means that an > 'empty_shmem', that would create a multiprocessing Array, and expose it > as a numpy array, is bound to fail under windows? 
[These first two paragraphs are basically what Sturla says in his response. He's faster on the Send button than I am. :-)] Almost. On Windows, the subprocesses inherit nothing. All objects must be passed through pickles. Passing the Array works, but passing the ndarray won't because the ndarray pickler will pass-by-value. My approach registers a new pickler for ndarrays that recognizes my shared-memory ndarrays and makes a pickle that just references the shared memory. You could replicate that using Array as the memory allocator, but I think my approach which uses the "raw" allocators underneath Array is more straightforward. On UNIX, Arrays and the stuff underneath it don't pickle because the underlying mmap is not named. We'd need to wrap the appropriate APIs in order to do this. If you can arrange your program such that the arrays get inherited, you're fine because you don't need to pickle anything, but you can't pass these ndarrays through Queues and such. I've tried using the shm module, which does wrap those APIs, but I've never been able to get the memory to actually share unless if the subprocess inherits it. http://nikitathespider.com/python/shm/ -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From sturla at molden.no Mon Feb 2 12:41:32 2009 From: sturla at molden.no (Sturla Molden) Date: Mon, 2 Feb 2009 18:41:32 +0100 (CET) Subject: [SciPy-user] shared memory machines In-Reply-To: <3d375d730902020929q68d1b163t5bcc424f52459c8a@mail.gmail.com> References: <2AE6D153-799C-450E-8E69-CA80D12E2FF5@math.toronto.edu> <747c5db37a4e870a8e8f562a4636c6e7.squirrel@webmail.uio.no> <20090202063833.GB9627@phare.normalesup.org> <3d375d730902012251y1159737fk325a923a344f25cf@mail.gmail.com> <20090202105316.GE11955@phare.normalesup.org> <3d375d730902020929q68d1b163t5bcc424f52459c8a@mail.gmail.com> Message-ID: > On Mon, Feb 2, 2009 at 04:53, Gael Varoquaux > Almost. On Windows, the subprocesses inherit nothing. All objects must > be passed through pickles. Passing the Array works, but passing the > ndarray won't because the ndarray pickler will pass-by-value. Almost. A subprocess can be specified to inherit its parent's handles. The parent must then pass the value of the handle to the subprocess, e.g. via the stdin pipe. This is how mp.Array works on Windows. > On UNIX, Arrays and the stuff underneath it don't pickle because the > underlying mmap is not named. It is the same on Windows. Named shared memory is the cure in both cases. The advantage of named shared memory is that it can be created after the subprocesses are spawned/forked. Sturla Molden From timmichelsen at gmx-topmail.de Mon Feb 2 04:04:28 2009 From: timmichelsen at gmx-topmail.de (Tim Michelsen) Date: Mon, 02 Feb 2009 10:04:28 +0100 Subject: [SciPy-user] SciPy and GUI In-Reply-To: <20090201213517.GG931@phare.normalesup.org> References: <20090125100556.GA29918@phare.normalesup.org> <49860AA0.1090300@gmx-topmail.de> <20090201213517.GG931@phare.normalesup.org> Message-ID: Hello! > Answer to 2): > > Its a question of using the right abstraction level. WxPython is a [...] > thread-safe. In Wx, you will quickly have to understand the fine > details of the event loop, which is interesting, but quite off-topic > for the scientific programmer. [...] 
> But the really important thing about Traits is that is folds together > a set of patterns and best-practices, such as validation, model-view > separation, default-initialization, cheap callbacks/the observer > pattern. Using Traits puts you on a good path building a good > architecture to your application. If you are using the raw toolkit Hey, these well formulated explanations really convinced me to look more closely into ETS and GUI building! > However, if you are not programming a reactive application, I would try > to put as little code as possible in the handler, and put the logics in > the code following the 'configure_traits' call. If you need to know if > the user pressed 'OK' or 'Cancel', I would capture this and store it in > the Handler, but I would put the processing logics later on. That's > another case of separating the core logics (called 'model') from the > view-related logics. This is still something I have to discover closer. I hop to understand this once I digg deeper. > Sure, that's easy: when you specify the traits, you specify its type (in > the above example it is an int), if the user enters a wrong type, the > text box turns read, and the corresponding attribute is not changed. And can there also appear a message like: Please enter only data of "type"? >> It maybe of interest for many prospective beginners to see example >> applications. Why not listing all accessible applications built with >> TraitsUI on a website? > > Most of them are not open source. The open source ones (SciPyLab, Mayavi) > are fairly complex, and I would advise a beginner to look into them. > >> I think that Enthought should put a strong pointer on their website >> (http://code.enthought.com/) indicating that actually a lot of >> documentation can also be found on the Trac wiki >> (https://svn.enthought.com/enthought/wiki). > You probably have a point. Documenting a beast like that is not easy, > believe me :). I looked at all examples and demos in the ETS folder within the Python XY documentation folder. There are so many. I really think that the spread if ETS could benefit from a better advertisement of these demos. Look at the matplotlib gallery. The new user could quickly imagine why he/she should ponder about using the library. Thanks again & kind regards, Timmie From timmichelsen at gmx-topmail.de Mon Feb 2 13:23:29 2009 From: timmichelsen at gmx-topmail.de (Tim Michelsen) Date: Mon, 02 Feb 2009 19:23:29 +0100 Subject: [SciPy-user] timeseries: logging of defective time series Message-ID: Hello, I have a question on how to effectively log invalid timeseries. Such series may return may have one or more of the following properties: * duplicate dates (ts.time_series.has_duplicated_dates() ) * missing dates (ts.time_series.has_missing_dates() ) * masked values (ts.time_series.mask) The functions above in brackets return either "True" or "False" or the boolean mask array. But would be interested in the dates that my series are missing or the data points that are duplicated or masked (from input). May you give me an example how to retrieve these? I put some demo code with comments below. Example use cases: Someone sends you a data file from a datalogger or sensor recording device. * Due to battery problems, the logger did stop recording for some time (=> missing dates). It is important for inspection of the device setup to know when this happend or how long that period lasted. * The data file may have been reformatted or treated before sent to you. 
Due to this processing, some timestamps have been saved twice or more (=> duplicated dates). For a correction, one would like to know where to search in the input files.
* The input file has already NoData markers. They were used to mask data during loading in python (=> masked data). For error analysis the date and length of the masked period are important.

I would appreciate a pointer here.

Regards,
Timmie

#### demo code:
### using the examples from http://pytseries.sourceforge.net/core/TimeSeries.html
import numpy as np
import scikits.timeseries as ts

mlist_1 = ['2005-%02i' % i for i in range(1,10)]
mlist_1 += ['2006-%02i' % i for i in range(2,13)]
mdata_1 = np.arange(len(mlist_1))
mser_1 = ts.time_series(mdata_1, mlist_1, freq='M')

mser_1.has_missing_dates()
#<55> True
### how do I retrieve a new series which contains only the dates that are missing?

## a series with masked
mser_1_fill = mser_1.fill_missing_dates()
mser_1_fill.mask
# I tried "mser_1_fill.mask" but it returns the masked array. The timedate information is lost here.
### how do I retrieve a new series which contains only the dates that are masked?
### Basically it seems that I am looking for the opposite of mser_1_fill.compressed()

mser_1_annual = ts.time_series(mdata_1, mlist_1, freq='A')
mser_daily = mser_1.asfreq('D')
### how do I retrieve a new series which contains only the dates that are duplicated?
mser_daily.has_duplicated_dates()
#<53> True

From pgmdevlist at gmail.com Mon Feb 2 14:02:57 2009 From: pgmdevlist at gmail.com (Pierre GM) Date: Mon, 2 Feb 2009 14:02:57 -0500 Subject: [SciPy-user] timeseries: logging of defective time series In-Reply-To: References: Message-ID: <0946AD7A-E329-420B-BDD2-4550D1835783@gmail.com>

Timmie,
Remember that the mask is an array of booleans and can be used for indexing. I will also assume that your data is 1D.

* To find the dates corresponding to the missing values in your series:
>>> series.dates[series.mask]

* To find the missing dates, use fill_missing_dates first (to make sure the dates are continuous) and get the missing dates by
>>> series.dates[series.mask]
With your example:
>>> mser_1_filled = ts.fill_missing_dates(mser_1)
>>> missing_dates = mser_1_filled.dates[mser_1_filled.mask]
Note that if your initial `series` has already some missing dates, you'll pick those ones up as well. You should then check whether you have missing values in the first place, find the corresponding dates, fill the dates, recheck the missing ones, and take the difference between the two sets.

* To find duplicated dates: Things get a tad more complicated:
1. make sure that your `series` is sorted chronologically first
2. construct the following array:
>>> d = series.dates
>>> dupcheck = np.r_[False, (d[1:]==d[:-1])]
dupcheck is an ndarray of booleans with True values where the corresponding date is the same as the previous one. Note that the first occurrence of a duplicated date is flagged as False.

Gimme a few days to whip up a more usable function that would reproduce that (I think I already have something along those lines somewhere on my HD).
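A rough helper that pulls these steps together might look like the sketch below. It is illustrative only (not from the original thread): it assumes a 1-D, chronologically sorted scikits.timeseries series, it uses np.ma.getmaskarray so that a completely unmasked series is handled as well, and the function and variable names are made up for the example.

import numpy as np
import scikits.timeseries as ts

def report_defects(series):
    # dates whose values are masked in the input
    mask = np.ma.getmaskarray(series)           # full boolean mask, even when nothing is masked
    masked_dates = series.dates[mask]

    # missing dates: fill the series first, then look at the mask of the filled series
    filled = ts.fill_missing_dates(series)
    filled_mask = np.ma.getmaskarray(filled)
    missing_dates = filled.dates[filled_mask]   # note: also picks up the originally masked dates

    # duplicated dates: True where a date equals its predecessor
    d = series.dates
    dupcheck = np.r_[False, (d[1:] == d[:-1])]
    duplicated_dates = d[dupcheck]

    return masked_dates, missing_dates, duplicated_dates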
> > Such series may return may have one or more of the following > properties: > > * duplicate dates (ts.time_series.has_duplicated_dates() ) > * missing dates (ts.time_series.has_missing_dates() ) > * masked values (ts.time_series.mask) has_duplicated_dates and has_missing_dates were not really meant to be used directly, but more internally to keep track of some info on the distribution of dates From gael.varoquaux at normalesup.org Mon Feb 2 14:30:27 2009 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Mon, 2 Feb 2009 20:30:27 +0100 Subject: [SciPy-user] SciPy and GUI In-Reply-To: References: <20090125100556.GA29918@phare.normalesup.org> <49860AA0.1090300@gmx-topmail.de> <20090201213517.GG931@phare.normalesup.org> Message-ID: <20090202193027.GB7568@phare.normalesup.org> On Mon, Feb 02, 2009 at 10:04:28AM +0100, Tim Michelsen wrote: > Hello! > > Answer to 2): > > Its a question of using the right abstraction level. WxPython is a > [...] > > thread-safe. In Wx, you will quickly have to understand the fine > > details of the event loop, which is interesting, but quite off-topic > > for the scientific programmer. > [...] > > But the really important thing about Traits is that is folds together > > a set of patterns and best-practices, such as validation, model-view > > separation, default-initialization, cheap callbacks/the observer > > pattern. Using Traits puts you on a good path building a good > > architecture to your application. If you are using the raw toolkit > Hey, these well formulated explanations really convinced me to look more > closely into ETS and GUI building! Well thanks. I actually find that these problems are hard to understand and to explain and that I do not have enough insight on them, and thus my explanations are confused and go into circles. But thanks for the encouragement. Ga?l From pav at iki.fi Mon Feb 2 14:34:07 2009 From: pav at iki.fi (Pauli Virtanen) Date: Mon, 2 Feb 2009 19:34:07 +0000 (UTC) Subject: [SciPy-user] concatenating arrays of different dimensions References: <57b9201a0902020012o31524f04ic34cda82c27920b@mail.gmail.com> <3777FF91-1F96-4179-9599-73A3607591B7@gmail.com> Message-ID: Mon, 02 Feb 2009 03:23:26 -0500, Pierre GM wrote: > On Feb 2, 2009, at 3:12 AM, bgbg bg wrote: > >> Hello, >> Consider an Octave code that concatenates an array and a vector: >> octave:1> a = [1, 2, 3]; >> octave:2> b = [ 11, 22, 33; 44, 55 66]; octave:3> c = [a; b] >> >> >> How do I emulate this behavior in Python (scipy)? This is what i tried: >> >> > c = np.vstack((a,b)) > > For more info: > http://www.scipy.org/Numpy_Functions_by_Category#head- ca5d5fe8c131a7ab8f7d7d38796ff84dbf4a2bd0 And http://docs.scipy.org/doc/numpy/reference/routines.array-manipulation.html#joining-arrays -- Pauli Virtanen From mhhohn at gmail.com Mon Feb 2 15:40:03 2009 From: mhhohn at gmail.com (michael hohn) Date: Mon, 2 Feb 2009 20:40:03 +0000 (UTC) Subject: [SciPy-user] SciPy and GUI References: Message-ID: Lorenzo Isella gmail.com> writes: > > Dear All, > I hope this is not too off-topic. Given you Python code, relying on > SciPy for number-crunching, which tools would you use to create a GUI > in order to allow someone else to use it, without his knowing much (or > anything) about scipy and programming?I know Python is great for this, > but I do not know of anything specific. > Cheers > > Lorenzo > If you are interested in a general-purpose worksheet-style interface to Python, you can have a look at the l3gui at http://l3lang.sourceforge.net, especially example 2.3. 
Cheers, Michael From dwf at cs.toronto.edu Tue Feb 3 02:12:32 2009 From: dwf at cs.toronto.edu (David Warde-Farley) Date: Tue, 3 Feb 2009 02:12:32 -0500 Subject: [SciPy-user] "clustergrams"/hierarchical clustering heat maps Message-ID: <8E0AFD62-64B7-435F-B80F-298C702BF771@cs.toronto.edu> Hi all, I was recently asked to cluster some data and I know from experience that people use these heat maps to look for patterns in multivariate data, often with a dendrogram off to the side. This involves sorting the rows and columns in a certain fashion, the details of which are somewhat fuzzy to me (and, truthfully, I'm happy with it staying that way for now). I notice that dendrogram plotting is available in scipy.cluster.hierarchy, and was wondering if the something for producing the associated sorted heat maps is available anywhere (within SciPy or otherwise). Many thanks, David From ondrej at certik.cz Tue Feb 3 04:08:51 2009 From: ondrej at certik.cz (Ondrej Certik) Date: Tue, 3 Feb 2009 01:08:51 -0800 Subject: [SciPy-user] Matlab style line based profiling In-Reply-To: <3d375d730902020848w13cf82aan1c8b024c029294a2@mail.gmail.com> References: <365959.22116.qm@web33006.mail.mud.yahoo.com> <3d375d730902020848w13cf82aan1c8b024c029294a2@mail.gmail.com> Message-ID: <85b5c3130902030108l5c3d6d4cm7fe62da80e5d87f9@mail.gmail.com> On Mon, Feb 2, 2009 at 8:48 AM, Robert Kern wrote: > On Mon, Feb 2, 2009 at 05:36, David Baddeley > wrote: >> Hi all, >> >> a while ago I drummed up a matlab like profiling module which gives times for individual lines. Since then I've found http://packages.python.org/line_profiler/ which seems to be a bit more refined and should be somewhat faster (mine is pure python, both hook the tracing functions). Where my code does have an advantage is that I've got it producing syntax highlighted html with the most expensive lines highlighted in red with the times in the margin, like the matlab profiler. I've also used a variant of the profile on, profile off, profile report syntax which should be familiar to matlab users. >> >> Would like to make it available, but am not sure how much demand there would be for a second line profiler module and whether it wouldn't be more sensible to see if the report generation couldn't be adapted to work with the aforementioned line_profiler module (there might be licensing issues here as my html generation borrows heavily from a GPL licensed python syntax highlighter and I'm not sure if Robert would be keen on having his module tainted). > > Not much! :-) > > However, there is a version of the colorization code that you used > that is more palatably licensed: > > http://code.activestate.com/recipes/52298/ > > Here is IPython's version, which uses ANSI escape sequences for > terminal color output, which would also be a nice addition to the text > output: > > http://bazaar.launchpad.net/~ipython-dev/ipython/trunk/annotate/head%3A/IPython/PyColorize.py > > I would be happy to accept contributions in this vein. I was gearing > up to release 1.0b2 in the next day or two, but if you would like to > get a patch together in the next week, I can wait. Great, I am glad you are maintaining it. Your line profiler is very useful. 
Ondrej From cimrman3 at ntc.zcu.cz Tue Feb 3 04:46:03 2009 From: cimrman3 at ntc.zcu.cz (Robert Cimrman) Date: Tue, 03 Feb 2009 10:46:03 +0100 Subject: [SciPy-user] SciPy and GUI In-Reply-To: References: Message-ID: <4988125B.80201@ntc.zcu.cz> michael hohn wrote: > If you are interested in a general-purpose worksheet-style interface to > Python, you can have a look at the l3gui at > http://l3lang.sourceforge.net, especially example 2.3. Very nice! r. From zachary.pincus at yale.edu Tue Feb 3 09:43:26 2009 From: zachary.pincus at yale.edu (Zachary Pincus) Date: Tue, 3 Feb 2009 09:43:26 -0500 Subject: [SciPy-user] "clustergrams"/hierarchical clustering heat maps In-Reply-To: <8E0AFD62-64B7-435F-B80F-298C702BF771@cs.toronto.edu> References: <8E0AFD62-64B7-435F-B80F-298C702BF771@cs.toronto.edu> Message-ID: Hi David, I don't know about making heat-maps in Python, but what I recently used for the task was the combination of "Cluster 3" (an update of Mike Eisen's original hierarchical-clustering-for-microarrays tool) to do the clustering, and "Java TreeView" to draw the heatmap/dendrogram. Cluster 3 is a bit annoying to one used to scripting analyses (lots of GUI button-pressing), but there's also a python library. Or you could just scrutinize the output format (it barfs out a few text files) and use your own clustering tools. TreeView then accepts these text files and lets you manipulate the heatmap / dendrograms (e.g. flipping nodes to get visually better results). You can then export to PS or other formats. (The PS output is pretty clean, so you can edit in Illustrator or whatnot easily.) Zach On Feb 3, 2009, at 2:12 AM, David Warde-Farley wrote: > Hi all, > > I was recently asked to cluster some data and I know from experience > that people use these heat maps to look for patterns in multivariate > data, often with a dendrogram off to the side. This involves sorting > the rows and columns in a certain fashion, the details of which are > somewhat fuzzy to me (and, truthfully, I'm happy with it staying that > way for now). > > I notice that dendrogram plotting is available in > scipy.cluster.hierarchy, and was wondering if the something for > producing the associated sorted heat maps is available anywhere > (within SciPy or otherwise). > > Many thanks, > > David > _______________________________________________ > SciPy-user mailing list > SciPy-user at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-user From gaedol at gmail.com Tue Feb 3 10:50:08 2009 From: gaedol at gmail.com (Marco) Date: Tue, 3 Feb 2009 16:50:08 +0100 Subject: [SciPy-user] FFT Filtering Message-ID: Hi list! Has anyone pointers to "applying a low pass filter to a signal's FFT" in scipy (and not only...)? Thanks, marco -- Quando sei una human pignata e la pazzo jacket si ? accorciata e non ti puoi liberare dai colpi di legno e di bastone dai petardi sul groppone Vinicio Capossela From wizzard028wise at gmail.com Tue Feb 3 19:23:53 2009 From: wizzard028wise at gmail.com (Dorian) Date: Wed, 4 Feb 2009 01:23:53 +0100 Subject: [SciPy-user] FFT Filtering In-Reply-To: References: Message-ID: <674a602a0902031623j1dd653b4jd77f3f990f3d12c5@mail.gmail.com> Could you rephrase your question ? Cheers 2009/2/3 Marco > Hi list! > > Has anyone pointers to "applying a low pass filter to a signal's FFT" > in scipy (and not only...)? > > Thanks, > > marco > > -- > > Quando sei una human pignata > e la pazzo jacket si ? 
accorciata > e non ti puoi liberare > dai colpi di legno e di bastone > dai petardi sul groppone > > Vinicio Capossela > _______________________________________________ > SciPy-user mailing list > SciPy-user at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-user > -------------- next part -------------- An HTML attachment was scrubbed... URL: From arokem at berkeley.edu Tue Feb 3 19:30:52 2009 From: arokem at berkeley.edu (Ariel Rokem) Date: Tue, 3 Feb 2009 16:30:52 -0800 Subject: [SciPy-user] FFT Filtering In-Reply-To: <674a602a0902031623j1dd653b4jd77f3f990f3d12c5@mail.gmail.com> References: <674a602a0902031623j1dd653b4jd77f3f990f3d12c5@mail.gmail.com> Message-ID: <79E709DF-013D-4863-A3B2-CF184E45B79B@berkeley.edu> Chapters 12 and 13 here: http://www.nrbook.com/nr3/ are a good place to start. On Feb 3, 2009, at 4:23 PM, Dorian wrote: > Could you rephrase your question ? > > Cheers > > 2009/2/3 Marco > Hi list! > > Has anyone pointers to "applying a low pass filter to a signal's FFT" > in scipy (and not only...)? > > Thanks, > > marco > > -- > > Quando sei una human pignata > e la pazzo jacket si ? accorciata > e non ti puoi liberare > dai colpi di legno e di bastone > dai petardi sul groppone > > Vinicio Capossela > _______________________________________________ > SciPy-user mailing list > SciPy-user at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-user > > _______________________________________________ > SciPy-user mailing list > SciPy-user at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-user -------------- next part -------------- An HTML attachment was scrubbed... URL: From scott.sinclair.za at gmail.com Wed Feb 4 00:38:18 2009 From: scott.sinclair.za at gmail.com (Scott Sinclair) Date: Wed, 4 Feb 2009 07:38:18 +0200 Subject: [SciPy-user] FFT Filtering In-Reply-To: References: Message-ID: <6a17e9ee0902032138y390d7025mb23b5bc8b30b41cb@mail.gmail.com> > 2009/2/3 Marco : > Has anyone pointers to "applying a low pass filter to a signal's FFT" > in scipy (and not only...)? The suggestion to read the relevant sections in Numerical Recipes is a good start. After that, you can probably find the tools you need in scipy.signal http://docs.scipy.org/doc/scipy/reference/tutorial/signal.html http://docs.scipy.org/doc/scipy/reference/signal.html also see this cookbook entry http://www.scipy.org/Cookbook/SavitzkyGolay Cheers, Scott From starsareblueandfaraway at gmail.com Wed Feb 4 11:31:27 2009 From: starsareblueandfaraway at gmail.com (Roy H. Han) Date: Wed, 4 Feb 2009 11:31:27 -0500 Subject: [SciPy-user] Mysterious kmeans() error Message-ID: <6a5569ec0902040831i39e6e683y6ab8f97d8363b606@mail.gmail.com> Has anyone seen this error before? I have no idea what it means. I'm using version 0.6.0 packaged for Fedora. 
I'm getting this error using the kmeans2() implementation in scipy.cluster.vq File "/mnt/windows/svn/networkPlanner/acquisition/libraries/probability_process.py", line 55, in grapeCluster assignments = scipy.cluster.vq.kmeans2(globalCluster, k=2, iter=iterationCountPerBurst)[1] File "/usr/lib64/python2.5/site-packages/scipy/cluster/vq.py", line 563, in kmeans2 clusters = init(data, k) File "/usr/lib64/python2.5/site-packages/scipy/cluster/vq.py", line 469, in _krandinit x = N.dot(x, N.linalg.cholesky(cov).T) + mu File "/usr/lib64/python2.5/site-packages/numpy/linalg/linalg.py", line 418, in cholesky Cholesky decomposition cannot be computed' numpy.linalg.linalg.LinAlgError: Matrix is not positive definite - Cholesky decomposition cannot be computed Thanks, RHH From starsareblueandfaraway at gmail.com Wed Feb 4 12:28:35 2009 From: starsareblueandfaraway at gmail.com (Roy H. Han) Date: Wed, 4 Feb 2009 12:28:35 -0500 Subject: [SciPy-user] Mysterious kmeans() error In-Reply-To: <6a5569ec0902040831i39e6e683y6ab8f97d8363b606@mail.gmail.com> References: <6a5569ec0902040831i39e6e683y6ab8f97d8363b606@mail.gmail.com> Message-ID: <6a5569ec0902040928l3e48680co404c9861ee6e067b@mail.gmail.com> As a side comment, if I use Pycluster, then the clustering proceeds without error. On Wed, Feb 4, 2009 at 11:31 AM, Roy H. Han wrote: > Has anyone seen this error before? I have no idea what it means. I'm > using version 0.6.0 packaged for Fedora. > I'm getting this error using the kmeans2() implementation in scipy.cluster.vq > > > File "/mnt/windows/svn/networkPlanner/acquisition/libraries/probability_process.py", > line 55, in grapeCluster > assignments = scipy.cluster.vq.kmeans2(globalCluster, k=2, > iter=iterationCountPerBurst)[1] > File "/usr/lib64/python2.5/site-packages/scipy/cluster/vq.py", line > 563, in kmeans2 > clusters = init(data, k) > File "/usr/lib64/python2.5/site-packages/scipy/cluster/vq.py", line > 469, in _krandinit > x = N.dot(x, N.linalg.cholesky(cov).T) + mu > File "/usr/lib64/python2.5/site-packages/numpy/linalg/linalg.py", > line 418, in cholesky > Cholesky decomposition cannot be computed' > numpy.linalg.linalg.LinAlgError: Matrix is not positive definite - > Cholesky decomposition cannot be computed > > > Thanks, > RHH > From josef.pktd at gmail.com Wed Feb 4 12:44:27 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 4 Feb 2009 12:44:27 -0500 Subject: [SciPy-user] Mysterious kmeans() error In-Reply-To: <6a5569ec0902040928l3e48680co404c9861ee6e067b@mail.gmail.com> References: <6a5569ec0902040831i39e6e683y6ab8f97d8363b606@mail.gmail.com> <6a5569ec0902040928l3e48680co404c9861ee6e067b@mail.gmail.com> Message-ID: <1cd32cbb0902040944m306bbf0bia357c01d0f97fe6d@mail.gmail.com> On Wed, Feb 4, 2009 at 12:28 PM, Roy H. Han wrote: > As a side comment, if I use Pycluster, then the clustering proceeds > without error. > > On Wed, Feb 4, 2009 at 11:31 AM, Roy H. Han > wrote: >> Has anyone seen this error before? I have no idea what it means. I'm >> using version 0.6.0 packaged for Fedora. 
>> I'm getting this error using the kmeans2() implementation in scipy.cluster.vq >> >> >> File "/mnt/windows/svn/networkPlanner/acquisition/libraries/probability_process.py", >> line 55, in grapeCluster >> assignments = scipy.cluster.vq.kmeans2(globalCluster, k=2, >> iter=iterationCountPerBurst)[1] >> File "/usr/lib64/python2.5/site-packages/scipy/cluster/vq.py", line >> 563, in kmeans2 >> clusters = init(data, k) >> File "/usr/lib64/python2.5/site-packages/scipy/cluster/vq.py", line >> 469, in _krandinit >> x = N.dot(x, N.linalg.cholesky(cov).T) + mu >> File "/usr/lib64/python2.5/site-packages/numpy/linalg/linalg.py", >> line 418, in cholesky >> Cholesky decomposition cannot be computed' >> numpy.linalg.linalg.LinAlgError: Matrix is not positive definite - >> Cholesky decomposition cannot be computed This is just a general answer, I never used scipy.cluster The error message means that the covariance matrix of your np.cov(data) is not positive definite. Check your data, whether there is any linear dependence, eg. look at eigenvalues of np.cov(data). If that's not the source of the error, then a cluster expert is needed. Josef >> >> >> Thanks, >> RHH >> > _______________________________________________ > SciPy-user mailing list > SciPy-user at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-user > From grh at mur.at Wed Feb 4 15:04:42 2009 From: grh at mur.at (Georg Holzmann) Date: Wed, 04 Feb 2009 21:04:42 +0100 Subject: [SciPy-user] audiolab Problem Message-ID: <4989F4DA.7050509@mur.at> Hallo David! I have a problem with using the scikits.audiolab package (from svn) on latest ubuntu (python 2.5.2, numpy 1.1.1). When I want to import the module I get the following error: import scikits.audiolab File "/usr/lib/python2.5/site-packages/scikits/audiolab/__init__.py", line 37, in from numpy.testing import Tester ImportError: cannot import name Tester At least in my numpy version there is no Tester class in numpy.testing. Ok, so I just removed this line in the __init__ file ... However, I wanted to run the tests, because you wrote on the audiolab website, that there can be a nasty bug with integers which corrupts the audio ... But when I try to run the tests I get the next errors: Traceback (most recent call last): File "test_matapi.py", line 22, in class test_audiolab(TestCase): NameError: name 'TestCase' is not defined And there are a few more ... Do you have a clue whats the problem ? In my numpy version the test cases have a different name ... Or do I need to run these tests on latest Ubuntu - is the integer bug still a problem ? Thanks for any hints ! LG Georg From stefan at sun.ac.za Wed Feb 4 15:42:10 2009 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Wed, 4 Feb 2009 22:42:10 +0200 Subject: [SciPy-user] audiolab Problem In-Reply-To: <4989F4DA.7050509@mur.at> References: <4989F4DA.7050509@mur.at> Message-ID: <9457e7c80902041242n6b2e85bat8cf0c9cdd5da1788@mail.gmail.com> Hi George 2009/2/4 Georg Holzmann : > I have a problem with using the scikits.audiolab package (from svn) on > latest ubuntu (python 2.5.2, numpy 1.1.1). > > When I want to import the module I get the following error: > import scikits.audiolab > File "/usr/lib/python2.5/site-packages/scikits/audiolab/__init__.py", > line 37, in > from numpy.testing import Tester > ImportError: cannot import name Tester > > At least in my numpy version there is no Tester class in numpy.testing. > Ok, so I just removed this line in the __init__ file ... "Tester" has been added after 1.1. 
Your workaround is OK. > However, I wanted to run the tests, because you wrote on the audiolab > website, that there can be a nasty bug with integers which corrupts the > audio ... > But when I try to run the tests I get the next errors: > Traceback (most recent call last): > File "test_matapi.py", line 22, in > class test_audiolab(TestCase): > NameError: name 'TestCase' is not defined > > And there are a few more ... Just replace TestCase with unittest.TestCase (you'll have to import unittest, too). You should then be able to run the tests with: nosetests scikits.audiolab Cheers St?fan From grh at mur.at Wed Feb 4 16:19:04 2009 From: grh at mur.at (Georg Holzmann) Date: Wed, 04 Feb 2009 22:19:04 +0100 Subject: [SciPy-user] audiolab Problem In-Reply-To: <9457e7c80902041242n6b2e85bat8cf0c9cdd5da1788@mail.gmail.com> References: <4989F4DA.7050509@mur.at> <9457e7c80902041242n6b2e85bat8cf0c9cdd5da1788@mail.gmail.com> Message-ID: <498A0648.6000409@mur.at> Hallo! >> At least in my numpy version there is no Tester class in numpy.testing. >> Ok, so I just removed this line in the __init__ file ... > > "Tester" has been added after 1.1. Your workaround is OK. Hm, I see ... >> However, I wanted to run the tests, because you wrote on the audiolab >> website, that there can be a nasty bug with integers which corrupts the >> audio ... >> But when I try to run the tests I get the next errors: >> Traceback (most recent call last): >> File "test_matapi.py", line 22, in >> class test_audiolab(TestCase): >> NameError: name 'TestCase' is not defined >> >> And there are a few more ... > > Just replace TestCase with unittest.TestCase (you'll have to import > unittest, too). Yes thanks, but there are more problems - not only TestCase ... Are there somewhere older packages of audiolab, which compile on standard systems ? (latest ubuntu, so I think this is quite up to date) I used this package a few months ago without problems ... Thanks, LG Georg From david at ar.media.kyoto-u.ac.jp Thu Feb 5 05:30:02 2009 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Thu, 05 Feb 2009 19:30:02 +0900 Subject: [SciPy-user] audiolab Problem In-Reply-To: <4989F4DA.7050509@mur.at> References: <4989F4DA.7050509@mur.at> Message-ID: <498ABFAA.8090508@ar.media.kyoto-u.ac.jp> Georg Holzmann wrote: > Hallo David! > > I have a problem with using the scikits.audiolab package (from svn) on > latest ubuntu (python 2.5.2, numpy 1.1.1). > Hi Georg, Audiolab requires numpy 1.2 or above. I think we should push numpy 1.2 to Ubuntu 9.04 while there is still time - 1.2 was a significant release. > When I want to import the module I get the following error: > import scikits.audiolab > File "/usr/lib/python2.5/site-packages/scikits/audiolab/__init__.py", > line 37, in > from numpy.testing import Tester > ImportError: cannot import name Tester > This has nothing to do with the problem, but may I suggest not to install anything from sources into /usr/ ? You should install either in /usr/local or somewhere else, because if audiolab becomes packaged by Ubuntu, you will mess up things. It is a good idea to never install anything from sources in /usr, > At least in my numpy version there is no Tester class in numpy.testing. > Ok, so I just removed this line in the __init__ file ... > > However, I wanted to run the tests, because you wrote on the audiolab > website, that there can be a nasty bug with integers which corrupts the > audio ... > Yes, there was a ctypes bug in old Ubuntu. 
The new audiolab version should avoid this problem altogether. > Do you have a clue whats the problem ? > In my numpy version the test cases have a different name ... > Or do I need to run these tests on latest Ubuntu - is the integer bug > still a problem ? > Not on recent Ubuntu releases, no. David From bloodearnest at gmail.com Thu Feb 5 06:37:47 2009 From: bloodearnest at gmail.com (Wavy Davy) Date: Thu, 5 Feb 2009 11:37:47 +0000 Subject: [SciPy-user] help with scipy.stats.mannwhitneyu Message-ID: <5063d0650902050337s3a4b656k71f9a5634c589589@mail.gmail.com> Hi all I am using the mannwhitneyu in the stats module, and I was looking the code and I see this notice in the docstring. "Use only when the n in each condition is < 20 and you have 2 independent samples of ranks. " Am I reading it correctly that this test should only be used with sample sizes less than 20? I am not a statistican, more a python coder. I have been pointed and this test as a more robust version of the t-test, so forgive my ignorance. Any help would be much appreciated. -- Simon From alexander.borghgraef.rma at gmail.com Thu Feb 5 09:09:28 2009 From: alexander.borghgraef.rma at gmail.com (Alexander Borghgraef) Date: Thu, 5 Feb 2009 15:09:28 +0100 Subject: [SciPy-user] Vector field filtering Message-ID: <9e8c52a20902050609pe71a53fl970b1123d3a5c374@mail.gmail.com> Hi all, I'm trying to implement a mean shift filtering algorithm, and for that I need to apply a sliding window to a vector field or image, possibly with as output a vector field of different dimensions. So for example I could be filtering an RGB image of shape (3, height, width) and returning a (x, y)+RGB vectorfield containing the mean shift vector as wel as the color date, resulting in a shape (5,height, width). For solving this, I looked into scipy.ndimage.generic_filter, but that doesn't seem to do the trick. For one it can't handle input and output being of different shape (easy to circumvent by adding dummy (x,y) to the input), and more importantly, it doesn't feature an axis option, meaning that it shifts the filter footprint not only across the width and height, but also across the vector dimension, which is not what I need. So generic_filter is out, any suggestions for an alternative ready-made numpy solution? Anything in scikits? Or should I implement my own sliding window for vectorfields? -- Alex Borghgraef From josef.pktd at gmail.com Thu Feb 5 09:33:05 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 5 Feb 2009 09:33:05 -0500 Subject: [SciPy-user] help with scipy.stats.mannwhitneyu In-Reply-To: <5063d0650902050337s3a4b656k71f9a5634c589589@mail.gmail.com> References: <5063d0650902050337s3a4b656k71f9a5634c589589@mail.gmail.com> Message-ID: <1cd32cbb0902050633v91c8785s6ae993d6eb63aa01@mail.gmail.com> On Thu, Feb 5, 2009 at 6:37 AM, Wavy Davy wrote: > Hi all > > I am using the mannwhitneyu in the stats module, and I was looking the > code and I see this notice in the docstring. > > "Use only when the n in each condition is < 20 and you have 2 > independent samples of ranks. " > > Am I reading it correctly that this test should only be used with > sample sizes less than 20? > > I am not a statistican, more a python coder. I have been pointed and > this test as a more robust version of the t-test, so forgive my > ignorance. > > Any help would be much appreciated. 
> > -- > Simon

I briefly looked at the test, the implementation of the test statistic is mostly as described in http://en.wikipedia.org/wiki/Mann-Whitney_U_test
It seems the test statistic is defined with the opposite sign from the definition in wikipedia.
The doc string statement "Use only when the n in each condition is < 20", I think should be >20, since the pvalue is based on the asymptotic distribution, which is only correct in larger samples.
I didn't see any unit tests for this test, but I will try to verify the results later today.
wilcoxon is a similar test for paired instead of independent samples, and there the recommendation in the docstring is for N>20.

Josef

From sturla at molden.no Thu Feb 5 09:32:59 2009 From: sturla at molden.no (Sturla Molden) Date: Thu, 05 Feb 2009 15:32:59 +0100 Subject: [SciPy-user] help with scipy.stats.mannwhitneyu In-Reply-To: <5063d0650902050337s3a4b656k71f9a5634c589589@mail.gmail.com> References: <5063d0650902050337s3a4b656k71f9a5634c589589@mail.gmail.com> Message-ID: <498AF89B.6070404@molden.no>

On 2/5/2009 12:37 PM, Wavy Davy wrote:
> I am using the mannwhitneyu in the stats module, and I was looking the
> code and I see this notice in the docstring.
>
> "Use only when the n in each condition is < 20 and you have 2
> independent samples of ranks. "
>
> Am I reading it correctly that this test should only be used with
> sample sizes less than 20?

First of all, the Mann-Whitney U-test should NEVER be used. It has assumptions that are mathematically problematic, known as the "Behrens-Fisher problem". What you probably want to use is the "Wilcoxon rank-sum test". Despite common belief, Mann-Whitney U and Wilcoxon rank-sum are not the same test. The latter assumes equal variance, the former does not. The Mann-Whitney U has even been shown to fail when distributions have unequal variance (Journal of Experimental Education, Vol. 60, 1992), so its justification over the Wilcoxon rank-sum test is questionable. Wikipedia says the Wilcoxon rank-sum test assumes equal sample sizes; this is not correct.

I would vote for the immediate removal of the Mann-Whitney U-test from SciPy. The only thing it should do is raise an exception and instruct the user to apply a t-test or Wilcoxon rank-sum test instead. As a side note, if you request a Mann-Whitney test in MINITAB, you actually get a Wilcoxon rank-sum test instead.

Then for your question: If N > 20, you can just as well use a t-test. Its assumptions will be asymptotically valid due to the central limit theorem, even though the data are not normally distributed. If you are worried about outliers, as opposed to systematic deviation from normality, use the Wilcoxon rank-sum test instead. When the data is transformed to rank scale and the two sample sizes are M and N respectively, the Mann-Whitney U-statistic has O(N*M) complexity whereas the Wilcoxon rank-sum statistic only has O(N+M) complexity. O(N*M) behaviour makes the Mann-Whitney U-statistic intractable for large samples.
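A minimal illustration of the two alternatives mentioned above (a t-test vs. a rank-based test); this sketch is not part of the original message, the data are made up, and it assumes scipy.stats.ttest_ind and scipy.stats.ranksums:

import numpy as np
from scipy import stats

np.random.seed(1)
x = np.random.standard_t(3, size=50)        # heavy-tailed sample
y = np.random.standard_t(3, size=50) + 0.5  # same shape, shifted by 0.5

t_stat, t_p = stats.ttest_ind(x, y)         # t-test, reasonable for large-ish N
z_stat, z_p = stats.ranksums(x, y)          # Wilcoxon rank-sum, normal approximation

print("t-test:   t=%.3f  p=%.4f" % (t_stat, t_p))
print("rank-sum: z=%.3f  p=%.4f" % (z_stat, z_p))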
Sturla Molden From sturla at molden.no Thu Feb 5 09:38:31 2009 From: sturla at molden.no (Sturla Molden) Date: Thu, 05 Feb 2009 15:38:31 +0100 Subject: [SciPy-user] help with scipy.stats.mannwhitneyu In-Reply-To: <1cd32cbb0902050633v91c8785s6ae993d6eb63aa01@mail.gmail.com> References: <5063d0650902050337s3a4b656k71f9a5634c589589@mail.gmail.com> <1cd32cbb0902050633v91c8785s6ae993d6eb63aa01@mail.gmail.com> Message-ID: <498AF9E7.90200@molden.no> On 2/5/2009 3:33 PM, josef.pktd at gmail.com wrote: > wilcoxon is a similar test for paired instead of independent samples, > and there the recommendation in the docstring is for N>20. There are two Wilcoxon tests. The signed-rank test for paired samples and the rank-sum test for independent samples. S.M. From bsouthey at gmail.com Thu Feb 5 09:56:08 2009 From: bsouthey at gmail.com (Bruce Southey) Date: Thu, 05 Feb 2009 08:56:08 -0600 Subject: [SciPy-user] help with scipy.stats.mannwhitneyu In-Reply-To: <5063d0650902050337s3a4b656k71f9a5634c589589@mail.gmail.com> References: <5063d0650902050337s3a4b656k71f9a5634c589589@mail.gmail.com> Message-ID: <498AFE08.8020402@gmail.com> Wavy Davy wrote: > Hi all > > I am using the mannwhitneyu in the stats module, and I was looking the > code and I see this notice in the docstring. > > "Use only when the n in each condition is < 20 and you have 2 > independent samples of ranks. " > > Am I reading it correctly that this test should only be used with > sample sizes less than 20? > > I am not a statistican, more a python coder. I have been pointed and > this test as a more robust version of the t-test, so forgive my > ignorance. > > Any help would be much appreciated. > > I think the docstring is referring to the distribution of the actual U-test. So for small samples typically the pvalue is directly computed from the sampling distribution. However, Scipy is using the normal approximation is which is not meant to be that great. http://faculty.vassar.edu/lowry/utest.html http://faculty.vassar.edu/lowry/ch11a.html http://www.alglib.net/statistics/hypothesistesting/mannwhitneyu.php Bruce From josef.pktd at gmail.com Thu Feb 5 10:36:08 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 5 Feb 2009 10:36:08 -0500 Subject: [SciPy-user] help with scipy.stats.mannwhitneyu In-Reply-To: <498AF9E7.90200@molden.no> References: <5063d0650902050337s3a4b656k71f9a5634c589589@mail.gmail.com> <1cd32cbb0902050633v91c8785s6ae993d6eb63aa01@mail.gmail.com> <498AF9E7.90200@molden.no> Message-ID: <1cd32cbb0902050736v5ae55230l215910f312562f2a@mail.gmail.com> On Thu, Feb 5, 2009 at 9:38 AM, Sturla Molden wrote: > On 2/5/2009 3:33 PM, josef.pktd at gmail.com wrote: > >> wilcoxon is a similar test for paired instead of independent samples, >> and there the recommendation in the docstring is for N>20. > > There are two Wilcoxon tests. The signed-rank test for paired samples > and the rank-sum test for independent samples. > According to wikipedia, Mann-Whitney-U is the Wilcoxon rank-sum test for independent samples, just a different name. 
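For what it is worth, the relationship can be checked numerically: the rank-sum statistic W and the Mann-Whitney U differ only by the constant n1*(n1+1)/2, so the two carry the same information. The sketch below is illustrative only (not part of the original message) and assumes scipy.stats.rankdata and scipy.stats.mannwhitneyu:

import numpy as np
from scipy import stats

np.random.seed(0)
x = np.random.normal(size=30)
y = np.random.normal(loc=0.3, size=40)
n1, n2 = len(x), len(y)

ranks = stats.rankdata(np.concatenate((x, y)))
W = ranks[:n1].sum()               # Wilcoxon rank-sum statistic for x
U_x = W - n1 * (n1 + 1) / 2.0      # Mann-Whitney U for x
U_y = n1 * n2 - U_x                # Mann-Whitney U for y

u_small, p_one_sided = stats.mannwhitneyu(x, y)
# scipy returns min(U_x, U_y), so one of the two hand-computed values matches
print("U_x=%g  U_y=%g  scipy smallu=%g" % (U_x, U_y, u_small))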
Josef From sturla at molden.no Thu Feb 5 10:59:04 2009 From: sturla at molden.no (Sturla Molden) Date: Thu, 05 Feb 2009 16:59:04 +0100 Subject: [SciPy-user] help with scipy.stats.mannwhitneyu In-Reply-To: <1cd32cbb0902050736v5ae55230l215910f312562f2a@mail.gmail.com> References: <5063d0650902050337s3a4b656k71f9a5634c589589@mail.gmail.com> <1cd32cbb0902050633v91c8785s6ae993d6eb63aa01@mail.gmail.com> <498AF9E7.90200@molden.no> <1cd32cbb0902050736v5ae55230l215910f312562f2a@mail.gmail.com> Message-ID: <498B0CC8.90007@molden.no> On 2/5/2009 4:36 PM, josef.pktd at gmail.com wrote: > According to wikipedia, Mann-Whitney-U is the Wilcoxon rank-sum test > for independent samples, just a different name. You should not trust Wikipedia. From stefan at sun.ac.za Thu Feb 5 11:09:43 2009 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Thu, 5 Feb 2009 18:09:43 +0200 Subject: [SciPy-user] help with scipy.stats.mannwhitneyu In-Reply-To: <498B0CC8.90007@molden.no> References: <5063d0650902050337s3a4b656k71f9a5634c589589@mail.gmail.com> <1cd32cbb0902050633v91c8785s6ae993d6eb63aa01@mail.gmail.com> <498AF9E7.90200@molden.no> <1cd32cbb0902050736v5ae55230l215910f312562f2a@mail.gmail.com> <498B0CC8.90007@molden.no> Message-ID: <9457e7c80902050809wf101707y9ac654f64d86e8c4@mail.gmail.com> 2009/2/5 Sturla Molden : > On 2/5/2009 4:36 PM, josef.pktd at gmail.com wrote: > >> According to wikipedia, Mann-Whitney-U is the Wilcoxon rank-sum test >> for independent samples, just a different name. > > You should not trust Wikipedia. Or, you can fix the entry on Wikipedia... St?fan From grh at mur.at Thu Feb 5 11:13:53 2009 From: grh at mur.at (Georg Holzmann) Date: Thu, 05 Feb 2009 17:13:53 +0100 Subject: [SciPy-user] audiolab Problem In-Reply-To: <498ABFAA.8090508@ar.media.kyoto-u.ac.jp> References: <4989F4DA.7050509@mur.at> <498ABFAA.8090508@ar.media.kyoto-u.ac.jp> Message-ID: <498B1041.2060807@mur.at> Hallo! > Audiolab requires numpy 1.2 or above. I think we should push numpy > 1.2 to Ubuntu 9.04 while there is still time - 1.2 was a significant > release. OK, I see. However, it would be nice to have the old working audiolab code somewhere, which can be used on recent systems ... > >> When I want to import the module I get the following error: >> import scikits.audiolab >> File "/usr/lib/python2.5/site-packages/scikits/audiolab/__init__.py", >> line 37, in >> from numpy.testing import Tester >> ImportError: cannot import name Tester >> > > This has nothing to do with the problem, but may I suggest not to > install anything from sources into /usr/ ? Hm ... I didn't notice that ... (was not my intention) I just typed 'python setup.py install'. >> Do you have a clue whats the problem ? >> In my numpy version the test cases have a different name ... >> Or do I need to run these tests on latest Ubuntu - is the integer bug >> still a problem ? >> > > Not on recent Ubuntu releases, no. OK - thanks for your feedback ! LG Georg From bloodearnest at gmail.com Thu Feb 5 11:24:45 2009 From: bloodearnest at gmail.com (Wavy Davy) Date: Thu, 5 Feb 2009 16:24:45 +0000 Subject: [SciPy-user] help with scipy.stats.mannwhitneyu In-Reply-To: <498AFE08.8020402@gmail.com> References: <5063d0650902050337s3a4b656k71f9a5634c589589@mail.gmail.com> <498AFE08.8020402@gmail.com> Message-ID: <5063d0650902050824m16bfc86aia38245113f448ef4@mail.gmail.com> 2009/2/5 Bruce Southey : > I think the docstring is referring to the distribution of the actual > U-test. 
So for small samples typically the pvalue is directly computed
> from the sampling distribution. However, Scipy is using the normal
> approximation which is not meant to be that great.
>
> http://faculty.vassar.edu/lowry/utest.html
> http://faculty.vassar.edu/lowry/ch11a.html
> http://www.alglib.net/statistics/hypothesistesting/mannwhitneyu.php

OK - that makes more sense. Thanks. I've ended up using the Kruskal-Wallis extension to Mann-Whitney anyway, as I have multiple data samples. Which of course scipy provides with the kruskal function. Confusing docstrings aside, it's been a pleasure to use :)

-- Simon

From bsouthey at gmail.com Thu Feb 5 11:32:51 2009 From: bsouthey at gmail.com (Bruce Southey) Date: Thu, 05 Feb 2009 10:32:51 -0600 Subject: [SciPy-user] help with scipy.stats.mannwhitneyu In-Reply-To: <9457e7c80902050809wf101707y9ac654f64d86e8c4@mail.gmail.com> References: <5063d0650902050337s3a4b656k71f9a5634c589589@mail.gmail.com> <1cd32cbb0902050633v91c8785s6ae993d6eb63aa01@mail.gmail.com> <498AF9E7.90200@molden.no> <1cd32cbb0902050736v5ae55230l215910f312562f2a@mail.gmail.com> <498B0CC8.90007@molden.no> <9457e7c80902050809wf101707y9ac654f64d86e8c4@mail.gmail.com> Message-ID: <498B14B3.2070105@gmail.com>

Stéfan van der Walt wrote:
> 2009/2/5 Sturla Molden :
>> On 2/5/2009 4:36 PM, josef.pktd at gmail.com wrote:
>>> According to wikipedia, Mann-Whitney-U is the Wilcoxon rank-sum test
>>> for independent samples, just a different name.
>> You should not trust Wikipedia.
>
> Or, you can fix the entry on Wikipedia...
>
> Stéfan
> _______________________________________________
> SciPy-user mailing list
> SciPy-user at scipy.org
> http://projects.scipy.org/mailman/listinfo/scipy-user

Or perhaps it is actually correct. My understanding (because I don't want to do it) is that these are equivalent and all major stats packages provide just one test. For example, Prof Brian Ripley's reply on the R list https://stat.ethz.ch/pipermail/r-help/2005-May/071544.html

> I am hoping someone could shed some light into the Wilcoxon Rank Sum Test
> for me? In looking through Stats references, the Mann-Whitney U-test and
> the Wilcoxon Rank Sum Test are statistically equivalent.

Yes, but not numerically: they differ by a constant (in the data, a function of the data size).

Bruce
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From josef.pktd at gmail.com Thu Feb 5 11:39:19 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 5 Feb 2009 11:39:19 -0500 Subject: [SciPy-user] help with scipy.stats.mannwhitneyu In-Reply-To: <1cd32cbb0902050736v5ae55230l215910f312562f2a@mail.gmail.com> References: <5063d0650902050337s3a4b656k71f9a5634c589589@mail.gmail.com> <1cd32cbb0902050633v91c8785s6ae993d6eb63aa01@mail.gmail.com> <498AF9E7.90200@molden.no> <1cd32cbb0902050736v5ae55230l215910f312562f2a@mail.gmail.com> Message-ID: <1cd32cbb0902050839m4e587390s31ef7b5f6267c5d4@mail.gmail.com>

On Thu, Feb 5, 2009 at 10:36 AM, wrote:
> On Thu, Feb 5, 2009 at 9:38 AM, Sturla Molden wrote:
>> On 2/5/2009 3:33 PM, josef.pktd at gmail.com wrote:
>>> wilcoxon is a similar test for paired instead of independent samples,
>>> and there the recommendation in the docstring is for N>20.
>>
>> There are two Wilcoxon tests. The signed-rank test for paired samples
>> and the rank-sum test for independent samples.
>
> According to wikipedia, Mann-Whitney-U is the Wilcoxon rank-sum test
> for independent samples, just a different name.
> > Josef >

So far: According to R:
wilcox.test(x,y)
Performs one and two sample Wilcoxon tests on vectors of data; the latter is also known as 'Mann-Whitney' test.

I tried a normal random variable example ( no ties): the test statistic returned is exactly the same as the one returned by stats.mannwhitneyu(x,y) however the p-values differ. the pvalue in stats is half of the one in R (up to 1e-17) as stated in the docstring: one-tailed p-value. In R the test statistic is the same for the two sided and the one sided tests, but the reported p-values differ. I used sample size 100.

So there is an inconsistency in the reporting in stats.mannwhitneyu, the test statistic is for the two-sided test, but the p-value is half of the two sided p-value and should be multiplied by two. I haven't checked the tie handling.

Josef

From sturla at molden.no Thu Feb 5 11:56:27 2009 From: sturla at molden.no (Sturla Molden) Date: Thu, 05 Feb 2009 17:56:27 +0100 Subject: [SciPy-user] help with scipy.stats.mannwhitneyu In-Reply-To: <1cd32cbb0902050839m4e587390s31ef7b5f6267c5d4@mail.gmail.com> References: <5063d0650902050337s3a4b656k71f9a5634c589589@mail.gmail.com> <1cd32cbb0902050633v91c8785s6ae993d6eb63aa01@mail.gmail.com> <498AF9E7.90200@molden.no> <1cd32cbb0902050736v5ae55230l215910f312562f2a@mail.gmail.com> <1cd32cbb0902050839m4e587390s31ef7b5f6267c5d4@mail.gmail.com> Message-ID: <498B1A3B.8040603@molden.no>

On 2/5/2009 5:39 PM, josef.pktd at gmail.com wrote:
> According to R:
> wilcox.test(x,y)
> Performs one and two sample Wilcoxon tests on vectors of data; the
> latter is also known as 'Mann-Whitney' test.
>
> I tried a normal random variable example ( no ties): the test
> statistic returned is exactly the same as the one returned by
> stats.mannwhitneyu(x,y) however the p-values differ. the pvalue in
> stats is half of the one in R (up to 1e-17) as stated in the
> docstring: one-tailed p-value.

I believe there is a bug in SciPy:

def mannwhitneyu(x, y):
    """Calculates a Mann-Whitney U statistic on the provided scores and
    returns the result.  Use only when the n in each condition is < 20 and
    you have 2 independent samples of ranks.  REMEMBER: Mann-Whitney U is
    significant if the u-obtained is LESS THAN or equal to the critical
    value of U.

    Returns: u-statistic, one-tailed p-value (i.e., p(z(U)))
    """
    x = asarray(x)
    y = asarray(y)
    n1 = len(x)
    n2 = len(y)
    ranked = rankdata(np.concatenate((x,y)))
    rankx = ranked[0:n1]  # get the x-ranks
    #ranky = ranked[n1:]  # the rest are y-ranks
    u1 = n1*n2 + (n1*(n1+1))/2.0 - np.sum(rankx,axis=0)  # calc U for x
    u2 = n1*n2 - u1  # remainder is U for y
    bigu = max(u1,u2)
    smallu = min(u1,u2)
    T = np.sqrt(tiecorrect(ranked))  # correction factor for tied scores
    if T == 0:
        raise ValueError, 'All numbers are identical in amannwhitneyu'
    sd = np.sqrt(T*n1*n2*(n1+n2+1)/12.0)
    z = abs((bigu-n1*n2/2.0) / sd)  # normal approximation for prob calc
    return smallu, 1.0 - zprob(z)

Take a look at the last two lines? Do you see something peculiar?

Sturla Molden

From gaedol at gmail.com Thu Feb 5 12:17:55 2009 From: gaedol at gmail.com (Marco) Date: Thu, 5 Feb 2009 18:17:55 +0100 Subject: [SciPy-user] Lowpass Filter Message-ID: 

Hi list!

Let's suppose a to be a 1D array with N elements. Basically, it's a signal of some sort.

How do I apply a low pass filter (with selected frequency and width) to this signal? How to store the resulting, filtered, signal, in a new array?

I had a look at lp2lp() in scipy.signal, but it returns, if I am right, a filter object, which then I dunno how to use to filter my data.
Any ideas or pointers? TIA, marco -- Quando sei una human pignata e la pazzo jacket si ? accorciata e non ti puoi liberare dai colpi di legno e di bastone dai petardi sul groppone Vinicio Capossela From josef.pktd at gmail.com Thu Feb 5 12:23:16 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 5 Feb 2009 12:23:16 -0500 Subject: [SciPy-user] help with scipy.stats.mannwhitneyu In-Reply-To: <498B1A3B.8040603@molden.no> References: <5063d0650902050337s3a4b656k71f9a5634c589589@mail.gmail.com> <1cd32cbb0902050633v91c8785s6ae993d6eb63aa01@mail.gmail.com> <498AF9E7.90200@molden.no> <1cd32cbb0902050736v5ae55230l215910f312562f2a@mail.gmail.com> <1cd32cbb0902050839m4e587390s31ef7b5f6267c5d4@mail.gmail.com> <498B1A3B.8040603@molden.no> Message-ID: <1cd32cbb0902050923j492f5683icbe737533159a24e@mail.gmail.com> On Thu, Feb 5, 2009 at 11:56 AM, Sturla Molden wrote: > On 2/5/2009 5:39 PM, josef.pktd at gmail.com wrote: > >> According to R: >> wilcox.test(x,y) >> Performs one and two sample Wilcoxon tests on vectors of data; the >> latter is also known as 'Mann-Whitney' test. >> >> I tried a normal random variable example ( no ties): the test >> statistic returned is exactly the same as the one returned by >> stats.mannwhitneyu(x,y) however the p-values differ. the pvalue in >> stats is half of the one in R (up to 1e-17) as stated in the >> docstring: one-tailed p-value. > > > I believe there is a bug in SciPy: > > > def mannwhitneyu(x, y): > """Calculates a Mann-Whitney U statistic on the provided scores and > returns the result. Use only when the n in each condition is < 20 and > you have 2 independent samples of ranks. REMEMBER: Mann-Whitney U is > significant if the u-obtained is LESS THAN or equal to the critical > value of U. > > Returns: u-statistic, one-tailed p-value (i.e., p(z(U))) > """ > x = asarray(x) > y = asarray(y) > n1 = len(x) > n2 = len(y) > ranked = rankdata(np.concatenate((x,y))) > rankx = ranked[0:n1] # get the x-ranks > #ranky = ranked[n1:] # the rest are y-ranks > u1 = n1*n2 + (n1*(n1+1))/2.0 - np.sum(rankx,axis=0) # calc U for x > u2 = n1*n2 - u1 # remainder is U for y > bigu = max(u1,u2) > smallu = min(u1,u2) > T = np.sqrt(tiecorrect(ranked)) # correction factor for tied scores > if T == 0: > raise ValueError, 'All numbers are identical in amannwhitneyu' > sd = np.sqrt(T*n1*n2*(n1+n2+1)/12.0) > z = abs((bigu-n1*n2/2.0) / sd) # normal approximation for prob calc > return smallu, 1.0 - zprob(z) > > > Take a look at the last two lines? Do you see something peculiar? > > Sturla Molden > you mean that it uses bigu for the p-value calculation but reports smallu as the test-statistic? I didn't try to figure out what the formula for the p-value actually is, but I'm pretty happy that we get the same result as R, except for the times 2. I looked some more at the R implementation : the main difference is that R uses by default a continuity correction "correct a logical indicating whether to apply continuity correction in the normal approximation for the p-value" >>> rwilcox=rpy.r('wilcox.test') >>> stats.mannwhitneyu(rvs1,rvs2)[1]*2 - rwilcox(rvs1,rvs2,correct = False)['p.value'] -1.5265566588595902e-016 The test statistic in R is not symmetric in its argument, although the p-values are, stats.mannwhitneyu is symmetric in statistic and p-value. 
>>> rresult = rwilcox(rvs2,rvs1) >>> rresult['statistic'] {'W': 5637.0} >>> rresult['p.value'] 0.11989439052971607 >>> rresult = rwilcox(rvs1,rvs2) >>> rresult['statistic'] {'W': 4363.0} >>> rresult['p.value'] 0.11989439052971618 So overall stats.mannwhitney, I think, looks pretty good but it could be expanded to include some of the options that R offers, and I also think we should multiply the pvalue by 2, so that the reported p-value actually corresponds to the test. Josef From sturla at molden.no Thu Feb 5 12:29:43 2009 From: sturla at molden.no (Sturla Molden) Date: Thu, 05 Feb 2009 18:29:43 +0100 Subject: [SciPy-user] help with scipy.stats.mannwhitneyu In-Reply-To: <1cd32cbb0902050923j492f5683icbe737533159a24e@mail.gmail.com> References: <5063d0650902050337s3a4b656k71f9a5634c589589@mail.gmail.com> <1cd32cbb0902050633v91c8785s6ae993d6eb63aa01@mail.gmail.com> <498AF9E7.90200@molden.no> <1cd32cbb0902050736v5ae55230l215910f312562f2a@mail.gmail.com> <1cd32cbb0902050839m4e587390s31ef7b5f6267c5d4@mail.gmail.com> <498B1A3B.8040603@molden.no> <1cd32cbb0902050923j492f5683icbe737533159a24e@mail.gmail.com> Message-ID: <498B2207.2030303@molden.no> On 2/5/2009 6:23 PM, josef.pktd at gmail.com wrote: > you mean that it uses bigu for the p-value calculation but reports > smallu as the test-statistic? Yes S.M. From arokem at berkeley.edu Thu Feb 5 12:34:48 2009 From: arokem at berkeley.edu (Ariel Rokem) Date: Thu, 5 Feb 2009 09:34:48 -0800 Subject: [SciPy-user] Lowpass Filter In-Reply-To: References: Message-ID: <43958ee60902050934q3c9f98d8r38631c1e500b5faf@mail.gmail.com> Hi - I don't know if this what you want (I don't know how to use lp2lp or scipy.signal), but one strategy that I have used is to convolve your signal with a box-car function of a length equal to the inverse of your cut-off. This is most definitely not the best filter known to man, but fwiw here is the code. For example (here I do a lowpass and then subtract the low-passed signal from the original, effectively doing a quick-and-ugly highpass) : box_car = np.ones(np.ceil(1.0/(f_c/TR))) #TR is the inverse of the sampling frequency in the fMRI signal I am analyzing, f_c is the cutoff box_car = box_car/(float(len(box_car))) print('Normalizing and detrending time series') for i in range(len(tSeries)): #Detrending #Start by applying a low-pass to the signal: #Pad the signal on each side with the initial and terminal signal value: pad_s = np.append(np.ones(len(box_car)) * tSeries[i][0], tSeries[i][:]) pad_s = np.append(pad_s, np.ones(len(box_car)) * tSeries[i][-1]) #Filter operation is a convolution with the box-car: conv_s = np.convolve(pad_s,box_car) #Extract the low pass signal (by excising the central len(tSeries) points: s_lp= conv_s[len(conv_s)/2-np.ceil(len(tSeries[i][:])/2.0):len(conv_s)/2+len(tSeries[i][:])/2] #ceil(/2.0) for cases where the tSeries has an odd number of points #Extract the high pass signal simply by subtracting the high pass signal #from the original signal: tSeries[i] = tSeries[i][:] - s_lp + np.mean(s_lp) #add mean to make sure that there are no negative values #Normalization tSeries[i] = tSeries[i]/np.mean(tSeries[i])-1 On Thu, Feb 5, 2009 at 9:17 AM, Marco wrote: > Hi list! > > Let's suppose a to be a 1D array with N elements. > Basically, it's a signal of some sort. > > How do I apply a low pass filter (with selected frequency and width) > to this signal? > How to store the resulting, filtered, signal, in a new array? 
> > I had a look at lp2lp() in scipy.signal, but it returns, if I am > right, a filter object, which then I dunno how to use to filter my > data. > > Any ideas or pointers? > > TIA, > > marco > > > > -- > > Quando sei una human pignata > e la pazzo jacket si ? accorciata > e non ti puoi liberare > dai colpi di legno e di bastone > dai petardi sul groppone > > Vinicio Capossela > _______________________________________________ > SciPy-user mailing list > SciPy-user at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-user > -------------- next part -------------- An HTML attachment was scrubbed... URL: From aisaac at american.edu Thu Feb 5 12:35:46 2009 From: aisaac at american.edu (Alan G Isaac) Date: Thu, 05 Feb 2009 12:35:46 -0500 Subject: [SciPy-user] help with scipy.stats.mannwhitneyu In-Reply-To: <498B0CC8.90007@molden.no> References: <5063d0650902050337s3a4b656k71f9a5634c589589@mail.gmail.com> <1cd32cbb0902050633v91c8785s6ae993d6eb63aa01@mail.gmail.com> <498AF9E7.90200@molden.no> <1cd32cbb0902050736v5ae55230l215910f312562f2a@mail.gmail.com> <498B0CC8.90007@molden.no> Message-ID: <498B2372.1000508@american.edu> > On 2/5/2009 4:36 PM, josef.pktd at gmail.com wrote: >> According to wikipedia, Mann-Whitney-U is the Wilcoxon rank-sum test >> for independent samples, just a different name. On 2/5/2009 10:59 AM Sturla Molden apparently wrote: > You should not trust Wikipedia. Or any other encyclopedia. But actually Wikipedia is usually pretty good on technical matters, and can easily be fixed. fwiw, Alan Isaac From josef.pktd at gmail.com Thu Feb 5 12:46:48 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 5 Feb 2009 12:46:48 -0500 Subject: [SciPy-user] help with scipy.stats.mannwhitneyu In-Reply-To: <498B2207.2030303@molden.no> References: <5063d0650902050337s3a4b656k71f9a5634c589589@mail.gmail.com> <1cd32cbb0902050633v91c8785s6ae993d6eb63aa01@mail.gmail.com> <498AF9E7.90200@molden.no> <1cd32cbb0902050736v5ae55230l215910f312562f2a@mail.gmail.com> <1cd32cbb0902050839m4e587390s31ef7b5f6267c5d4@mail.gmail.com> <498B1A3B.8040603@molden.no> <1cd32cbb0902050923j492f5683icbe737533159a24e@mail.gmail.com> <498B2207.2030303@molden.no> Message-ID: <1cd32cbb0902050946i700777cdhc920711cb393353f@mail.gmail.com> On Thu, Feb 5, 2009 at 12:29 PM, Sturla Molden wrote: > On 2/5/2009 6:23 PM, josef.pktd at gmail.com wrote: > >> you mean that it uses bigu for the p-value calculation but reports >> smallu as the test-statistic? > > Yes > Given that it works, I didn't want to spend time on this, but wikipedia again: "therefore, the absolute value of the z statistic calculated will be same whichever value of U is used." As I understand it, because the sum U1+U2 is fixed (given the sample sizes), many properties are equivalent, i.e. U1 - meanU = - (U2 - meanU) so whether bigU or smallU is used in the calculation of z doesn't matter, I have no idea why in this specific implementation both are calculated if smallU would be enough. Josef From sturla at molden.no Thu Feb 5 12:49:02 2009 From: sturla at molden.no (Sturla Molden) Date: Thu, 05 Feb 2009 18:49:02 +0100 Subject: [SciPy-user] Lowpass Filter In-Reply-To: References: Message-ID: <498B268E.7090502@molden.no> On 2/5/2009 6:17 PM, Marco wrote: > How do I apply a low pass filter (with selected frequency and width) > to this signal? What kind of lowpass filter? Single-pole? Butterworth? Bessel? Gaussian? Cheychev? Elliptic? Truncated sinc with window? What kind of window? But basically: - First obtain your filter coefficients. 
Filter design is an extensive subject; I cannot cover it here. Consult a text book. - Short FIR or IIR: apply filter to signal with scipy.signal.lfilter. - Long FIR: use numpy.fft.rfft for convolution in the Fourier plane. (You will get faster results with FFTW instead of NumPy's FFT.) S.M. From c-b at asu.edu Thu Feb 5 12:28:47 2009 From: c-b at asu.edu (Christopher Brown) Date: Thu, 05 Feb 2009 10:28:47 -0700 Subject: [SciPy-user] Lowpass Filter In-Reply-To: References: Message-ID: <498B21CF.6040105@asu.edu> Hi Marco, M> Let's suppose a to be a 1D array with N elements. M> Basically, it's a signal of some sort. M> M> How do I apply a low pass filter (with selected frequency and width) M> to this signal? M> How to store the resulting, filtered, signal, in a new array? M> M> I had a look at lp2lp() in scipy.signal, but it returns, if I am M> right, a filter object, which then I dunno how to use to filter my M> data. M> M> Any ideas or pointers? The following is a low-pass Butterworth filter cutoff = 500. fs = 44100. nyq = fs/2. filterorder = 5 b,a = scipy.signal.filter_design.butter(filterorder,cutoff/nyq) filteredsignal = scipy.signal.lfilter(b,a,signal) -- Chris From sturla at molden.no Thu Feb 5 13:03:34 2009 From: sturla at molden.no (Sturla Molden) Date: Thu, 05 Feb 2009 19:03:34 +0100 Subject: [SciPy-user] help with scipy.stats.mannwhitneyu In-Reply-To: <1cd32cbb0902050946i700777cdhc920711cb393353f@mail.gmail.com> References: <5063d0650902050337s3a4b656k71f9a5634c589589@mail.gmail.com> <1cd32cbb0902050633v91c8785s6ae993d6eb63aa01@mail.gmail.com> <498AF9E7.90200@molden.no> <1cd32cbb0902050736v5ae55230l215910f312562f2a@mail.gmail.com> <1cd32cbb0902050839m4e587390s31ef7b5f6267c5d4@mail.gmail.com> <498B1A3B.8040603@molden.no> <1cd32cbb0902050923j492f5683icbe737533159a24e@mail.gmail.com> <498B2207.2030303@molden.no> <1cd32cbb0902050946i700777cdhc920711cb393353f@mail.gmail.com> Message-ID: <498B29F6.4080508@molden.no> On 2/5/2009 6:46 PM, josef.pktd at gmail.com wrote: > so whether bigU or smallU is used in the calculation of z doesn't > matter, I have no idea why in this specific implementation both are > calculated if smallU would be enough. By the way, there is a fucntion scipy.stats.ranksums that does a Wilcoxon rank-sum test. It seems to be using a large-sample approximation, and has no correction for tied ranks. S.M. From pgmdevlist at gmail.com Thu Feb 5 13:11:55 2009 From: pgmdevlist at gmail.com (Pierre GM) Date: Thu, 5 Feb 2009 13:11:55 -0500 Subject: [SciPy-user] help with scipy.stats.mannwhitneyu In-Reply-To: <498B29F6.4080508@molden.no> References: <5063d0650902050337s3a4b656k71f9a5634c589589@mail.gmail.com> <1cd32cbb0902050633v91c8785s6ae993d6eb63aa01@mail.gmail.com> <498AF9E7.90200@molden.no> <1cd32cbb0902050736v5ae55230l215910f312562f2a@mail.gmail.com> <1cd32cbb0902050839m4e587390s31ef7b5f6267c5d4@mail.gmail.com> <498B1A3B.8040603@molden.no> <1cd32cbb0902050923j492f5683icbe737533159a24e@mail.gmail.com> <498B2207.2030303@molden.no> <1cd32cbb0902050946i700777cdhc920711cb393353f@mail.gmail.com> <498B29F6.4080508@molden.no> Message-ID: On Feb 5, 2009, at 1:03 PM, Sturla Molden wrote: > On 2/5/2009 6:46 PM, josef.pktd at gmail.com wrote: > >> so whether bigU or smallU is used in the calculation of z doesn't >> matter, I have no idea why in this specific implementation both are >> calculated if smallU would be enough. > > By the way, there is a fucntion scipy.stats.ranksums that does a > Wilcoxon rank-sum test. 
It seems to be using a large-sample > approximation, and has no correction for tied ranks. Please keep in mind that some of the tests have been reimplemented in scipy.stats.mstats to support masked/missing values in scipy.mstats and to take ties into accounts ... I trust y'all to let me know of any inconsistencies between the masked/ unmasked versions, whether in terms of signatures or assumptions. Thx a lot in advance... From josef.pktd at gmail.com Thu Feb 5 13:16:22 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 5 Feb 2009 13:16:22 -0500 Subject: [SciPy-user] help with scipy.stats.mannwhitneyu In-Reply-To: <498B29F6.4080508@molden.no> References: <5063d0650902050337s3a4b656k71f9a5634c589589@mail.gmail.com> <1cd32cbb0902050633v91c8785s6ae993d6eb63aa01@mail.gmail.com> <498AF9E7.90200@molden.no> <1cd32cbb0902050736v5ae55230l215910f312562f2a@mail.gmail.com> <1cd32cbb0902050839m4e587390s31ef7b5f6267c5d4@mail.gmail.com> <498B1A3B.8040603@molden.no> <1cd32cbb0902050923j492f5683icbe737533159a24e@mail.gmail.com> <498B2207.2030303@molden.no> <1cd32cbb0902050946i700777cdhc920711cb393353f@mail.gmail.com> <498B29F6.4080508@molden.no> Message-ID: <1cd32cbb0902051016j3eeeef55v8e7217c84e30bd2e@mail.gmail.com> On Thu, Feb 5, 2009 at 1:03 PM, Sturla Molden wrote: > On 2/5/2009 6:46 PM, josef.pktd at gmail.com wrote: > >> so whether bigU or smallU is used in the calculation of z doesn't >> matter, I have no idea why in this specific implementation both are >> calculated if smallU would be enough. > > By the way, there is a fucntion scipy.stats.ranksums that does a > Wilcoxon rank-sum test. It seems to be using a large-sample > approximation, and has no correction for tied ranks. > > S.M. > Also, in the explanation for kruskal it says it's an extension of Mann-Whitney-U to more than 2 groups for 2 groups (no ties): >>> stats.kruskal(rvs1,rvs2)[1] - stats.mannwhitneyu(rvs1,rvs2)[1]*2 -4.8572257327350599e-016 >>> stats.kruskal(rvs1,rvs2)[1] - stats.ranksums(rvs1,rvs2)[1] -4.8572257327350599e-016 >>> stats.ranksums(rvs1,rvs2)[1] - stats.mannwhitneyu(rvs1,rvs2)[1]*2 0.0 It looks like there are some redundancies or small variations in these tests. A systematic list of all tests would be pretty useful. Josef From josef.pktd at gmail.com Thu Feb 5 13:31:00 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 5 Feb 2009 13:31:00 -0500 Subject: [SciPy-user] help with scipy.stats.mannwhitneyu In-Reply-To: References: <5063d0650902050337s3a4b656k71f9a5634c589589@mail.gmail.com> <498AF9E7.90200@molden.no> <1cd32cbb0902050736v5ae55230l215910f312562f2a@mail.gmail.com> <1cd32cbb0902050839m4e587390s31ef7b5f6267c5d4@mail.gmail.com> <498B1A3B.8040603@molden.no> <1cd32cbb0902050923j492f5683icbe737533159a24e@mail.gmail.com> <498B2207.2030303@molden.no> <1cd32cbb0902050946i700777cdhc920711cb393353f@mail.gmail.com> <498B29F6.4080508@molden.no> Message-ID: <1cd32cbb0902051031o3120c70ahfd1e2030eb72e750@mail.gmail.com> On Thu, Feb 5, 2009 at 1:11 PM, Pierre GM wrote: > > On Feb 5, 2009, at 1:03 PM, Sturla Molden wrote: > >> On 2/5/2009 6:46 PM, josef.pktd at gmail.com wrote: >> >>> so whether bigU or smallU is used in the calculation of z doesn't >>> matter, I have no idea why in this specific implementation both are >>> calculated if smallU would be enough. >> >> By the way, there is a fucntion scipy.stats.ranksums that does a >> Wilcoxon rank-sum test. It seems to be using a large-sample >> approximation, and has no correction for tied ranks. 
> > > Please keep in mind that some of the tests have been reimplemented in > scipy.stats.mstats to support masked/missing values in scipy.mstats > and to take ties into accounts ... > I trust y'all to let me know of any inconsistencies between the masked/ > unmasked versions, whether in terms of signatures or assumptions. > Thx a lot in advance... > _______________________________________________ > SciPy-user mailing list > SciPy-user at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-user > a quick check looks pretty good (still example without ties) >>> stats.mstats.kruskal(rvs1,rvs2)[1] - stats.ranksums(rvs1,rvs2)[1] -4.8572257327350599e-016 >>> stats.mstats.kruskalwallis(rvs1,rvs2)[1] - stats.ranksums(rvs1,rvs2)[1] -4.8572257327350599e-016 >>> stats.mstats.mannwhitneyu(rvs1,rvs2)[1] - stats.ranksums(rvs1,rvs2)[1] 0.00029058688269312238 >>> stats.mstats.mannwhitneyu(rvs1,rvs2) (4363.0, 0.11989439052971618) >>> stats.mstats.mannwhitneyu(rvs1,rvs2)[1] - rwilcox(rvs1,rvs2,correct = False)['p.value'] 0.00029058688269296973 >>> stats.mstats.mannwhitneyu(rvs1,rvs2)[1] - rwilcox(rvs1,rvs2)['p.value'] 0.0 stats.mstats.mannwhitneyu employs continuity correction by default as in R. Just calling this, according to docstring, requires sequence, correct usage is not clear: >>> stats.mstats.compare_medians_ms(rvs1,rvs2) Traceback (most recent call last): File "", line 1, in stats.mstats.compare_medians_ms(rvs1,rvs2) File "\Programs\Python25\Lib\site-packages\scipy\stats\mstats_extras.py", line 332, in compare_medians_ms (std_1, std_2) = (mstats.stde_median(group_1, axis=axis), File "C:\Programs\Python25\lib\site-packages\scipy\stats\mstats_basic.py", line 1511, in stde_median return _stdemed_1D(data) File "C:\Programs\Python25\lib\site-packages\scipy\stats\mstats_basic.py", line 1504, in _stdemed_1D n = len(sorted) TypeError: object of type 'builtin_function_or_method' has no len() Josef From pgmdevlist at gmail.com Thu Feb 5 13:35:35 2009 From: pgmdevlist at gmail.com (Pierre GM) Date: Thu, 5 Feb 2009 13:35:35 -0500 Subject: [SciPy-user] help with scipy.stats.mannwhitneyu In-Reply-To: <1cd32cbb0902051031o3120c70ahfd1e2030eb72e750@mail.gmail.com> References: <5063d0650902050337s3a4b656k71f9a5634c589589@mail.gmail.com> <498AF9E7.90200@molden.no> <1cd32cbb0902050736v5ae55230l215910f312562f2a@mail.gmail.com> <1cd32cbb0902050839m4e587390s31ef7b5f6267c5d4@mail.gmail.com> <498B1A3B.8040603@molden.no> <1cd32cbb0902050923j492f5683icbe737533159a24e@mail.gmail.com> <498B2207.2030303@molden.no> <1cd32cbb0902050946i700777cdhc920711cb393353f@mail.gmail.com> <498B29F6.4080508@molden.no> <1cd32cbb0902051031o3120c70ahfd1e2030eb72e750@mail.gmail.com> Message-ID: On Feb 5, 2009, at 1:31 PM, josef.pktd at gmail.com wrote: > > Just calling this, according to docstring, requires sequence, correct > usage is not clear: > >>>> stats.mstats.compare_medians_ms(rvs1,rvs2) OK, can you send me a test sample (ie, the rvs1& rvs2 you used that fail, and what we should have had)? I'll try to fix that this afternoon... 
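For anyone who wants to rerun these comparisons, a seeded sketch of the kind of sample and checks used in this thread (the seed and sample sizes are arbitrary; the *2 reflects the earlier point that stats.mannwhitneyu reports a one-tailed p-value):

import numpy as np
from scipy import stats

np.random.seed(12345)                       # arbitrary seed, for reproducibility
rvs1 = stats.norm.rvs(size=100)
rvs2 = 0.25 * stats.norm.rvs(size=100)

u, p_mwu = stats.mannwhitneyu(rvs1, rvs2)   # one-tailed p-value in this thread's scipy
h, p_kw = stats.kruskal(rvs1, rvs2)         # Kruskal-Wallis with two groups
z, p_rs = stats.ranksums(rvs1, rvs2)        # Wilcoxon rank-sum, two-sided

# With no ties and no continuity correction, 2*p_mwu, p_kw and p_rs
# should agree to floating-point precision, as reported above.
print(2 * p_mwu, p_kw, p_rs)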
From josef.pktd at gmail.com Thu Feb 5 13:53:37 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 5 Feb 2009 13:53:37 -0500 Subject: [SciPy-user] help with scipy.stats.mannwhitneyu In-Reply-To: References: <5063d0650902050337s3a4b656k71f9a5634c589589@mail.gmail.com> <1cd32cbb0902050839m4e587390s31ef7b5f6267c5d4@mail.gmail.com> <498B1A3B.8040603@molden.no> <1cd32cbb0902050923j492f5683icbe737533159a24e@mail.gmail.com> <498B2207.2030303@molden.no> <1cd32cbb0902050946i700777cdhc920711cb393353f@mail.gmail.com> <498B29F6.4080508@molden.no> <1cd32cbb0902051031o3120c70ahfd1e2030eb72e750@mail.gmail.com> Message-ID: <1cd32cbb0902051053j2beb7127sbacf49611f367f2a@mail.gmail.com> On Thu, Feb 5, 2009 at 1:35 PM, Pierre GM wrote: > > On Feb 5, 2009, at 1:31 PM, josef.pktd at gmail.com wrote: > >> >> Just calling this, according to docstring, requires sequence, correct >> usage is not clear: >> >>>>> stats.mstats.compare_medians_ms(rvs1,rvs2) > > OK, can you send me a test sample (ie, the rvs1& rvs2 you used that > fail, and what we should have had)? I'll try to fix that this > afternoon... > I generated rvs1 and rvs2 (without fixed seed) rvs1 = stats.norm.rvs(size = 100) rvs2 = 0.25*stats.norm.rvs(size = 100) I didn't look at stats.mstats.compare_medians_ms specifically, but it sounded like it should do something similar to the other tests I was trying out. So, I don't know what the expected answer should be, but I would expect a p-values similar to the other non-parametric tests for equality of location. the problem is in stats.mstats_basic.stde_median. Note: it is not exported (I'm not sure how or why the imports work) and can be accessed only directly >>> stats.mstats.stde_median(rvs1,axis=0) Traceback (most recent call last): File "", line 1, in stats.mstats.stde_median(rvs1,axis=0) AttributeError: 'module' object has no attribute 'stde_median' calling it directly returns this: >>> stats.mstats_basic.stde_median(rvs1,axis=0) Traceback (most recent call last): File "", line 1, in stats.mstats_basic.stde_median(rvs1,axis=0) File "C:\Programs\Python25\lib\site-packages\scipy\stats\mstats_basic.py", line 1514, in stde_median return ma.apply_along_axis(_stdemed_1D, axis, data) File "C:\Programs\Python25\lib\site-packages\numpy\ma\extras.py", line 185, in apply_along_axis res = func1d(arr[tuple(i.tolist())], *args, **kwargs) File "C:\Programs\Python25\lib\site-packages\scipy\stats\mstats_basic.py", line 1504, in _stdemed_1D n = len(sorted) TypeError: object of type 'builtin_function_or_method' has no len() Josef From josef.pktd at gmail.com Thu Feb 5 13:58:31 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 5 Feb 2009 13:58:31 -0500 Subject: [SciPy-user] help with scipy.stats.mannwhitneyu In-Reply-To: <1cd32cbb0902051053j2beb7127sbacf49611f367f2a@mail.gmail.com> References: <5063d0650902050337s3a4b656k71f9a5634c589589@mail.gmail.com> <498B1A3B.8040603@molden.no> <1cd32cbb0902050923j492f5683icbe737533159a24e@mail.gmail.com> <498B2207.2030303@molden.no> <1cd32cbb0902050946i700777cdhc920711cb393353f@mail.gmail.com> <498B29F6.4080508@molden.no> <1cd32cbb0902051031o3120c70ahfd1e2030eb72e750@mail.gmail.com> <1cd32cbb0902051053j2beb7127sbacf49611f367f2a@mail.gmail.com> Message-ID: <1cd32cbb0902051058ldaba9bah862e3b4e3692eb2c@mail.gmail.com> On Thu, Feb 5, 2009 at 1:53 PM, wrote: > On Thu, Feb 5, 2009 at 1:35 PM, Pierre GM wrote: >> >> On Feb 5, 2009, at 1:31 PM, josef.pktd at gmail.com wrote: >> >>> >>> Just calling this, according to docstring, requires sequence, 
correct >>> usage is not clear: >>> >>>>>> stats.mstats.compare_medians_ms(rvs1,rvs2) >> >> OK, can you send me a test sample (ie, the rvs1& rvs2 you used that >> fail, and what we should have had)? I'll try to fix that this >> afternoon... >> > > I generated rvs1 and rvs2 (without fixed seed) > > rvs1 = stats.norm.rvs(size = 100) > rvs2 = 0.25*stats.norm.rvs(size = 100) > > I didn't look at stats.mstats.compare_medians_ms specifically, but it > sounded like it should do something similar to the other tests I was > trying out. So, I don't know what the expected answer should be, but I > would expect a p-values similar to the other non-parametric tests for > equality of location. > > the problem is in stats.mstats_basic.stde_median. > Note: it is not exported (I'm not sure how or why the imports work) > and can be accessed only directly > >>>> stats.mstats.stde_median(rvs1,axis=0) > Traceback (most recent call last): > File "", line 1, in > stats.mstats.stde_median(rvs1,axis=0) > AttributeError: 'module' object has no attribute 'stde_median' > > calling it directly returns this: > >>>> stats.mstats_basic.stde_median(rvs1,axis=0) > Traceback (most recent call last): > File "", line 1, in > stats.mstats_basic.stde_median(rvs1,axis=0) > File "C:\Programs\Python25\lib\site-packages\scipy\stats\mstats_basic.py", > line 1514, in stde_median > return ma.apply_along_axis(_stdemed_1D, axis, data) > File "C:\Programs\Python25\lib\site-packages\numpy\ma\extras.py", > line 185, in apply_along_axis > res = func1d(arr[tuple(i.tolist())], *args, **kwargs) > File "C:\Programs\Python25\lib\site-packages\scipy\stats\mstats_basic.py", > line 1504, in _stdemed_1D > n = len(sorted) > TypeError: object of type 'builtin_function_or_method' has no len() > doing : >>> np.source(stats.mstats_basic.stde_median) shows it's a reference by wrong name: you assign the sorted data to data, but then use "sorted" as a name def _stdemed_1D(data): data = np.sort(data.compressed()) n = len(sorted) Josef From sturla at molden.no Thu Feb 5 14:51:02 2009 From: sturla at molden.no (Sturla Molden) Date: Thu, 05 Feb 2009 20:51:02 +0100 Subject: [SciPy-user] help with scipy.stats.mannwhitneyu In-Reply-To: <498B29F6.4080508@molden.no> References: <5063d0650902050337s3a4b656k71f9a5634c589589@mail.gmail.com> <1cd32cbb0902050633v91c8785s6ae993d6eb63aa01@mail.gmail.com> <498AF9E7.90200@molden.no> <1cd32cbb0902050736v5ae55230l215910f312562f2a@mail.gmail.com> <1cd32cbb0902050839m4e587390s31ef7b5f6267c5d4@mail.gmail.com> <498B1A3B.8040603@molden.no> <1cd32cbb0902050923j492f5683icbe737533159a24e@mail.gmail.com> <498B2207.2030303@molden.no> <1cd32cbb0902050946i700777cdhc920711cb393353f@mail.gmail.com> <498B29F6.4080508@molden.no> Message-ID: <498B4326.7060207@molden.no> On 2/5/2009 7:03 PM, Sturla Molden wrote: > By the way, there is a fucntion scipy.stats.ranksums that does a > Wilcoxon rank-sum test. It seems to be using a large-sample > approximation, and has no correction for tied ranks. Here is a modification of SciPy's ranksums to allow small samples and correct for tied ranks. Sturla Molden import numpy as np import scipy import scipy.special zprob = scipy.special.ndtr def ranksums(x, y): """ Wilcoxon rank sum test Returns: W-statistic Z-statistic one-tailed p-value, asymptotic approximation one-tailed p-value, Monte Carlo approximation Corrected for ties. 
""" x,y = map(np.asarray, (x, y)) n1 = len(x) n2 = len(y) alldata = np.concatenate((x,y)) ranked = rankdata(alldata) x = ranked[:n1] y = ranked[n1:] w = np.sum(x,axis=0) def montecarlo(): shuffle = np.random.shuffle a = np.zeros(1000) shuffle(ranked) # bug in numpy: the first shuffle doesn't work for i in xrange(1000): shuffle(ranked) a[i] = np.sum(ranked[:n1],axis=0) return np.sum(a >= w) / 1000.0 def aymptotic_p(): expected = n1*(n1+n2+1) / 2.0 z = (w - expected) / np.sqrt(n1*n2*(n1+n2+1)/12.0) return 1.0 - zprob(z), z def aymptotic_p_ties(): t = [] _t = 0 for r in ranked: if r % 1: _t += 1 else: if _t: t.append(_t) _t = 0 if _t: t.append(_t) t = np.asarray(t) expected = n1*(n1+n2+1) / 2.0 tcorr = np.sum((t-1)*t*(t+1))/float((n1+n2)*(n1+n2-1)) z = (w - expected) / np.sqrt(n1*n2*(n1+n2+1-tcorr)/12.0) return 1.0 - zprob(z), z p_mc = montecarlo() if np.any(ranked % 1): p, z = aymptotic_p_ties() else: p, z = aymptotic_p() return w, z, p, p_mc def rankdata(a): a = np.ravel(a) n = len(a) svec, ivec = fastsort(a) sumranks = 0 dupcount = 0 newarray = np.zeros(n, float) for i in xrange(n): sumranks += i dupcount += 1 if i==n-1 or svec[i] != svec[i+1]: averank = sumranks / float(dupcount) + 1 for j in xrange(i-dupcount+1,i+1): newarray[ivec[j]] = averank sumranks = 0 dupcount = 0 return newarray def fastsort(a): it = np.argsort(a) as_ = a[it] return as_, it From christopher.paul.taylor at gmail.com Thu Feb 5 15:13:29 2009 From: christopher.paul.taylor at gmail.com (christopher taylor) Date: Thu, 5 Feb 2009 15:13:29 -0500 Subject: [SciPy-user] question about using speigs.ARPACK_eigs Message-ID: I'm currently working with sparse matrix of a size of roughly 65K by 65K. I'd like to compute the first 2 eigenvectors of this 65K x 65K matrix. I've been told to use speigs.ARPACK_eigs: # data_mat sizes data_mat_width=256 data_mat_height=256 #256*256 = 65536 # # i've noticed there's an assertion that will kill a call to this function if the last argument is >=4 # eigvals, eigvecs = speigs.ARPACK_eigs( data_mat.matvec, data_mat_width*data_mat_height, 2) Unfortunately, the function is only computing a result of 2x2 matrix. I need a result that's a matrix with a height of, roughly, 65536 and a width of 2. Any recommendations or tips? Thanks, ct From josef.pktd at gmail.com Thu Feb 5 15:30:07 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 5 Feb 2009 15:30:07 -0500 Subject: [SciPy-user] help with scipy.stats.mannwhitneyu In-Reply-To: <498B4326.7060207@molden.no> References: <5063d0650902050337s3a4b656k71f9a5634c589589@mail.gmail.com> <498AF9E7.90200@molden.no> <1cd32cbb0902050736v5ae55230l215910f312562f2a@mail.gmail.com> <1cd32cbb0902050839m4e587390s31ef7b5f6267c5d4@mail.gmail.com> <498B1A3B.8040603@molden.no> <1cd32cbb0902050923j492f5683icbe737533159a24e@mail.gmail.com> <498B2207.2030303@molden.no> <1cd32cbb0902050946i700777cdhc920711cb393353f@mail.gmail.com> <498B29F6.4080508@molden.no> <498B4326.7060207@molden.no> Message-ID: <1cd32cbb0902051230k2c024a0cnc25448f6c1613679@mail.gmail.com> On Thu, Feb 5, 2009 at 2:51 PM, Sturla Molden wrote: > On 2/5/2009 7:03 PM, Sturla Molden wrote: > >> By the way, there is a fucntion scipy.stats.ranksums that does a >> Wilcoxon rank-sum test. It seems to be using a large-sample >> approximation, and has no correction for tied ranks. > > > Here is a modification of SciPy's ranksums to allow small samples and > correct for tied ranks. 
> there are absolute values missing abs(z-expected), I also prefer the correction p*2 since it is a two-sided test sample size 20, 9 ties this is with R wilcox.exact, ranksums is your ranksum >>> rwilcex(rvs1[:20],4*ind10+rvs2t[:20],exact=True)['p.value'] 0.15716005595098306 >>> ranksums(rvs1[:20],4*ind10+rvs2t[:20]) #wrong tail because no abs() (357.0, -1.4336547191212172, 0.9241645900073665, 0.92800000000000005) >>> ranksums(4*ind10+rvs2t[:20],rvs1[:20]) (463.0, 1.4336547191212172, 0.075835409992633496, 0.068000000000000005) >>> ranksums(4*ind10+rvs2t[:20],rvs1[:20])[3]*2 0.186 >>> ranksums(4*ind10+rvs2t[:20],rvs1[:20])[2]*2 0.15167081998526699 >>> stats.mannwhitneyu(rvs1[:20],4*ind10+rvs2t[:20])[1]*2 0.15167081998526699 With this correction, the normal distribution based p-value in ranksums looks exactly the same as stats.mannwhitneyu. your Monte Carlo p-value differs more from R's exact result than the normal distribution based p-value. Overall, the differences in p-values look pretty small in the examples I tried out, so my guess is that a Monte-Carlo on the correct size and power of the tests will show very similar rejection rates, at critical values of 0.05 or 0.1. But I don't have time for that now. Josef From josef.pktd at gmail.com Thu Feb 5 15:54:32 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 5 Feb 2009 15:54:32 -0500 Subject: [SciPy-user] help with scipy.stats.mannwhitneyu In-Reply-To: <1cd32cbb0902051230k2c024a0cnc25448f6c1613679@mail.gmail.com> References: <5063d0650902050337s3a4b656k71f9a5634c589589@mail.gmail.com> <1cd32cbb0902050736v5ae55230l215910f312562f2a@mail.gmail.com> <1cd32cbb0902050839m4e587390s31ef7b5f6267c5d4@mail.gmail.com> <498B1A3B.8040603@molden.no> <1cd32cbb0902050923j492f5683icbe737533159a24e@mail.gmail.com> <498B2207.2030303@molden.no> <1cd32cbb0902050946i700777cdhc920711cb393353f@mail.gmail.com> <498B29F6.4080508@molden.no> <498B4326.7060207@molden.no> <1cd32cbb0902051230k2c024a0cnc25448f6c1613679@mail.gmail.com> Message-ID: <1cd32cbb0902051254g13ea0ae5yf879a5f648c8030e@mail.gmail.com> > > sample size 20, 9 ties > this is with R wilcox.exact, ranksums is your ranksum ... > > With this correction, the normal distribution based p-value in > ranksums looks exactly the same as stats.mannwhitneyu. this statement is not correct. I mixed up my variables and didn't actually have ties, now with ties, I still get essentially but not exactly the same results. Josef From pjungkun at nps.edu Thu Feb 5 17:57:52 2009 From: pjungkun at nps.edu (Patrick Jungkunz) Date: Thu, 5 Feb 2009 14:57:52 -0800 Subject: [SciPy-user] Fwd: Scipy sandbox - color.py Message-ID: <7794F558-D11F-4FFE-8441-6322FD5920CB@nps.edu> G'day out there Here is an answer about color.py I got from Robert Kern. I just wanted to forward this to the mailing list for the benefit of everyone. Thank you, Robert, for the immediate response. Patrick Begin forwarded message: > > > On Tue, Feb 3, 2009 at 18:06, Patrick Jungkunz > wrote: >> >> >> For a project I am working on using scipy, I need to convert an array >> representing an rgb image into the Lab color space. I found the >> color.py >> script in the scipy sandbox which seems to be taking care of that. >> I am >> writing you because you had been referenced as the author of that >> file. > > No, I'm just the last person to touch it. > >> I was wondering why this script is still in the sandbox and not >> integrated >> into the scipy release. Are there any issues I should be aware of >> before >> using it? 
Are there any specific setup procedures I need to follow >> in order >> to use the script. > > It's not really fully-baked. Copy it into your own code, for now. If > you're just interested in the standard transforms, I have a somewhat > cleaner version here: > > http://www.enthought.com/~rkern/cgi-bin/hgwebdir.cgi/colormap_explorer/file/tip/colormap_explorer/conversion.py > > > -- > Robert Kern > > "I have come to believe that the whole world is an enigma, a harmless > enigma that is made terrible by our own mad attempt to interpret it as > though it had an underlying truth." > -- Umberto Eco > -------------- next part -------------- An HTML attachment was scrubbed... URL: From gael.varoquaux at normalesup.org Thu Feb 5 18:19:42 2009 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Fri, 6 Feb 2009 00:19:42 +0100 Subject: [SciPy-user] shared memory machines In-Reply-To: <786d3e06228152ae2c30291b139983e4.squirrel@webmail.uio.no> References: <2AE6D153-799C-450E-8E69-CA80D12E2FF5@math.toronto.edu> <747c5db37a4e870a8e8f562a4636c6e7.squirrel@webmail.uio.no> <20090202063833.GB9627@phare.normalesup.org> <3d375d730902012251y1159737fk325a923a344f25cf@mail.gmail.com> <20090202105316.GE11955@phare.normalesup.org> <786d3e06228152ae2c30291b139983e4.squirrel@webmail.uio.no> Message-ID: <20090205231942.GB21014@phare.normalesup.org> On Mon, Feb 02, 2009 at 06:24:18PM +0100, Sturla Molden wrote: > > Am I out of my mind, and will this fail utterly? > It will work. But we should use named shared memory (which requires some C > or Cython coding), not BSD mmap as mp.Array currently does. Also we must > override how ndarrays are pickled. I just wanted to say that I am still interested in exploring this a bit deeper, but I got swamped suddenly. Besides, as Robert and you have sown, there is more than I thought to it. Cheers, Ga?l From robert.kern at gmail.com Thu Feb 5 18:23:32 2009 From: robert.kern at gmail.com (Robert Kern) Date: Thu, 5 Feb 2009 17:23:32 -0600 Subject: [SciPy-user] shared memory machines In-Reply-To: <20090205231942.GB21014@phare.normalesup.org> References: <2AE6D153-799C-450E-8E69-CA80D12E2FF5@math.toronto.edu> <747c5db37a4e870a8e8f562a4636c6e7.squirrel@webmail.uio.no> <20090202063833.GB9627@phare.normalesup.org> <3d375d730902012251y1159737fk325a923a344f25cf@mail.gmail.com> <20090202105316.GE11955@phare.normalesup.org> <786d3e06228152ae2c30291b139983e4.squirrel@webmail.uio.no> <20090205231942.GB21014@phare.normalesup.org> Message-ID: <3d375d730902051523q179b552bg11218a0b22a161b1@mail.gmail.com> On Thu, Feb 5, 2009 at 17:19, Gael Varoquaux wrote: > On Mon, Feb 02, 2009 at 06:24:18PM +0100, Sturla Molden wrote: >> > Am I out of my mind, and will this fail utterly? > >> It will work. But we should use named shared memory (which requires some C >> or Cython coding), not BSD mmap as mp.Array currently does. Also we must >> override how ndarrays are pickled. > > I just wanted to say that I am still interested in exploring this a bit > deeper, but I got swamped suddenly. Besides, as Robert and you have sown, > there is more than I thought to it. 
BTW, Philip Semanchuk, the maintainer of the aforementioned shm module, contacted Sturla and myself offlist to point out two, more up-to-date, modules which provide named shared memory on UNIX systems: http://semanchuk.com/philip/sysv_ipc/ http://semanchuk.com/philip/posix_ipc/ -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From gael.varoquaux at normalesup.org Thu Feb 5 18:41:15 2009 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Fri, 6 Feb 2009 00:41:15 +0100 Subject: [SciPy-user] shared memory machines In-Reply-To: <3d375d730902051523q179b552bg11218a0b22a161b1@mail.gmail.com> References: <2AE6D153-799C-450E-8E69-CA80D12E2FF5@math.toronto.edu> <747c5db37a4e870a8e8f562a4636c6e7.squirrel@webmail.uio.no> <20090202063833.GB9627@phare.normalesup.org> <3d375d730902012251y1159737fk325a923a344f25cf@mail.gmail.com> <20090202105316.GE11955@phare.normalesup.org> <786d3e06228152ae2c30291b139983e4.squirrel@webmail.uio.no> <20090205231942.GB21014@phare.normalesup.org> <3d375d730902051523q179b552bg11218a0b22a161b1@mail.gmail.com> Message-ID: <20090205234115.GC29684@phare.normalesup.org> On Thu, Feb 05, 2009 at 05:23:32PM -0600, Robert Kern wrote: > BTW, Philip Semanchuk, the maintainer of the aforementioned shm > module, contacted Sturla and myself offlist to point out two, more > up-to-date, modules which provide named shared memory on UNIX systems: > http://semanchuk.com/philip/sysv_ipc/ > http://semanchuk.com/philip/posix_ipc/ Interesting. I wonder how to use these. I would really like to see shared memory in numpy by itself at some point. I did not look at the code as it is GPL, from what I see. The core idea, from what I understand, would be to use the POSIX shm_open call to expose some named shared to numpy using eg from_buffer. Or can we simply make it point to the pointer of an existing array using shmat, if is is contiguous? That would avoid a copy (if contiguous). Finally, to make sure share memory works with multiprocessing, we would have to override pickling so that the pickling and unpicking are done simply by storing the name of the shared memory object or retrieving it. This is risky, because actual persistence would be destroyed. Under Window we would use CreateSharedMemory to perform the same trick using CreateFileMapping and MapViewOfFile? Sounds fun. Ga?l From josef.pktd at gmail.com Thu Feb 5 19:03:34 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 5 Feb 2009 19:03:34 -0500 Subject: [SciPy-user] help with scipy.stats.mannwhitneyu In-Reply-To: <1cd32cbb0902051254g13ea0ae5yf879a5f648c8030e@mail.gmail.com> References: <5063d0650902050337s3a4b656k71f9a5634c589589@mail.gmail.com> <1cd32cbb0902050839m4e587390s31ef7b5f6267c5d4@mail.gmail.com> <498B1A3B.8040603@molden.no> <1cd32cbb0902050923j492f5683icbe737533159a24e@mail.gmail.com> <498B2207.2030303@molden.no> <1cd32cbb0902050946i700777cdhc920711cb393353f@mail.gmail.com> <498B29F6.4080508@molden.no> <498B4326.7060207@molden.no> <1cd32cbb0902051230k2c024a0cnc25448f6c1613679@mail.gmail.com> <1cd32cbb0902051254g13ea0ae5yf879a5f648c8030e@mail.gmail.com> Message-ID: <1cd32cbb0902051603w17d9a88ai39101ddf277341ca@mail.gmail.com> On Thu, Feb 5, 2009 at 3:54 PM, wrote: >> >> sample size 20, 9 ties >> this is with R wilcox.exact, ranksums is your ranksum > ... 
>> >> With this correction, the normal distribution based p-value in >> ranksums looks exactly the same as stats.mannwhitneyu. > > this statement is not correct. > > I mixed up my variables and didn't actually have ties, now with ties, > I still get essentially but not exactly the same results. > I think there is a mistake in the tie handling of stats.mannwhitneyu In the calculation of the standard error the sqrt is taken twice. T = np.sqrt(tiecorrect(ranked)) # correction factor for tied scores if T == 0: raise ValueError, 'All numbers are identical in amannwhitneyu' sd = np.sqrt(T*n1*n2*(n1+n2+1)/12.0) I don't have the formulas for the tie correction, but from looking at the tie correction in Sturlas version of ranksums, it seems that the first sqrt shouldn't be there. Can someone with access to the correct references verify this. Josef From ellisonbg.net at gmail.com Thu Feb 5 19:34:51 2009 From: ellisonbg.net at gmail.com (Brian Granger) Date: Thu, 5 Feb 2009 16:34:51 -0800 Subject: [SciPy-user] shared memory machines In-Reply-To: <20090205234115.GC29684@phare.normalesup.org> References: <2AE6D153-799C-450E-8E69-CA80D12E2FF5@math.toronto.edu> <747c5db37a4e870a8e8f562a4636c6e7.squirrel@webmail.uio.no> <20090202063833.GB9627@phare.normalesup.org> <3d375d730902012251y1159737fk325a923a344f25cf@mail.gmail.com> <20090202105316.GE11955@phare.normalesup.org> <786d3e06228152ae2c30291b139983e4.squirrel@webmail.uio.no> <20090205231942.GB21014@phare.normalesup.org> <3d375d730902051523q179b552bg11218a0b22a161b1@mail.gmail.com> <20090205234115.GC29684@phare.normalesup.org> Message-ID: <6ce0ac130902051634i1408fddeiffd54c14f793f688@mail.gmail.com> This is quite interesting indeed. I am not familiar with this stuff at all, but I guess I have some reading to do. One important question though: Can these mechanisms be used to create shared memory amongst processes that are started in a completely independent manner. That is, they are not fork()'d. If so, then we should develop a shared memory version of numpy arrays that will work in any multiple-process setting. I am thinking multiprocessing *and* the IPython.kernel. Cheers, Brian On Thu, Feb 5, 2009 at 3:41 PM, Gael Varoquaux wrote: > On Thu, Feb 05, 2009 at 05:23:32PM -0600, Robert Kern wrote: >> BTW, Philip Semanchuk, the maintainer of the aforementioned shm >> module, contacted Sturla and myself offlist to point out two, more >> up-to-date, modules which provide named shared memory on UNIX systems: > >> http://semanchuk.com/philip/sysv_ipc/ >> http://semanchuk.com/philip/posix_ipc/ > > Interesting. I wonder how to use these. I would really like to see shared > memory in numpy by itself at some point. I did not look at the code as it > is GPL, from what I see. > > The core idea, from what I understand, would be to use the POSIX shm_open > call to expose some named shared to numpy using eg from_buffer. Or can we > simply make it point to the pointer of an existing array using shmat, if > is is contiguous? That would avoid a copy (if contiguous). > > Finally, to make sure share memory works with multiprocessing, we would > have to override pickling so that the pickling and unpicking are done > simply by storing the name of the shared memory object or retrieving it. > This is risky, because actual persistence would be destroyed. > > Under Window we would use CreateSharedMemory to perform the same trick > using CreateFileMapping and MapViewOfFile? > > Sounds fun. 
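One low-tech way to get much of what is being asked for here, with no new C code, is to back an ndarray with a file on a memory-backed filesystem; on Linux, /dev/shm is a tmpfs, so a numpy.memmap over a file there is POSIX shared memory in all but name, and it works between processes that were started completely independently. A rough sketch only (the path, dtype and shape are made up and would have to be agreed on out of band; this illustrates the pattern, not the API being proposed in the thread):

import numpy as np

SHM_PATH = "/dev/shm/scipy_shared_demo"    # hypothetical name on a Linux tmpfs
SHAPE = (1000, 1000)

# Process A: create and fill the shared block.
a = np.memmap(SHM_PATH, dtype=np.float64, mode="w+", shape=SHAPE)
a[:] = 0.0
a[0, 0] = 1.0
a.flush()

# Process B, started completely separately: attach to the same block,
# read and modify it in place, with no copy through a pipe or socket.
b = np.memmap(SHM_PATH, dtype=np.float64, mode="r+", shape=SHAPE)
b[0, 1] = 2.0                              # immediately visible to process A

Named shared memory proper (shm_open / CreateFileMapping, as in the posix_ipc and sysv_ipc modules linked above) gives the same effect with better control over naming and lifetime, which is roughly what the numpy support sketched in this thread would build on.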
> > Ga?l > _______________________________________________ > SciPy-user mailing list > SciPy-user at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-user > From philip at semanchuk.com Thu Feb 5 20:00:30 2009 From: philip at semanchuk.com (Philip Semanchuk) Date: Thu, 5 Feb 2009 20:00:30 -0500 Subject: [SciPy-user] shared memory machines Message-ID: <5AEE5578-71FA-4EF4-B305-9A29B65A827C@semanchuk.com> Brian Granger wrote: > This is quite interesting indeed. I am not familiar with this stuff > at all, but I guess I have some reading to do. One important question > though: > Can these mechanisms be used to create shared memory amongst processes > that are started in a completely independent manner. That is, they > are not fork()'d. > If so, then we should develop a shared memory version of numpy arrays > that will work in any multiple-process setting. I am thinking > multiprocessing *and* the IPython.kernel. Hi all, I'm the author of the aforementioned IPC modules and I thought I'd jump in even though I'm not a numpy guy. Yes, one can use IPC objects (Sys V or POSIX) in completely independent processes. There's a demo that comes along with both modules that demonstrates that. I guess numpy isn't GPLed? You could still download either one of the above packages and run the demo to observe the process independence. Ga?l, AFAIK shared memory is guaranteed to be contiguous. I'm making my assumption based on the fact that neither the Sys V nor POSIX API has any references to accessing different chunks of memory. It's treated as one logical block. In fact, the POSIX API for creating shared memory (shm_open) simply returns a file descriptor that one accesses as a memory mapped file: http://www.opengroup.org/onlinepubs/000095399/functions/shm_open.html HTH Philip From robert.kern at gmail.com Thu Feb 5 20:02:59 2009 From: robert.kern at gmail.com (Robert Kern) Date: Thu, 5 Feb 2009 19:02:59 -0600 Subject: [SciPy-user] shared memory machines In-Reply-To: <5AEE5578-71FA-4EF4-B305-9A29B65A827C@semanchuk.com> References: <5AEE5578-71FA-4EF4-B305-9A29B65A827C@semanchuk.com> Message-ID: <3d375d730902051702kc6b6a9eh861d78132462cc78@mail.gmail.com> On Thu, Feb 5, 2009 at 19:00, Philip Semanchuk wrote: > Brian Granger wrote: > >> This is quite interesting indeed. I am not familiar with this stuff >> at all, but I guess I have some reading to do. One important question >> though: >> Can these mechanisms be used to create shared memory amongst processes >> that are started in a completely independent manner. That is, they >> are not fork()'d. >> If so, then we should develop a shared memory version of numpy arrays >> that will work in any multiple-process setting. I am thinking >> multiprocessing *and* the IPython.kernel. > > Hi all, > I'm the author of the aforementioned IPC modules and I thought I'd > jump in even though I'm not a numpy guy. > > Yes, one can use IPC objects (Sys V or POSIX) in completely > independent processes. There's a demo that comes along with both > modules that demonstrates that. I guess numpy isn't GPLed? No, we're BSD-licensed. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." 
-- Umberto Eco From strawman at astraw.com Thu Feb 5 23:13:34 2009 From: strawman at astraw.com (Andrew Straw) Date: Thu, 05 Feb 2009 20:13:34 -0800 Subject: [SciPy-user] shared memory machines In-Reply-To: <20090205234115.GC29684@phare.normalesup.org> References: <2AE6D153-799C-450E-8E69-CA80D12E2FF5@math.toronto.edu> <747c5db37a4e870a8e8f562a4636c6e7.squirrel@webmail.uio.no> <20090202063833.GB9627@phare.normalesup.org> <3d375d730902012251y1159737fk325a923a344f25cf@mail.gmail.com> <20090202105316.GE11955@phare.normalesup.org> <786d3e06228152ae2c30291b139983e4.squirrel@webmail.uio.no> <20090205231942.GB21014@phare.normalesup.org> <3d375d730902051523q179b552bg11218a0b22a161b1@mail.gmail.com> <20090205234115.GC29684@phare.normalesup.org> Message-ID: <498BB8EE.10900@astraw.com> FWIW, I wrote some BSD licensed Pyrex code that does some shared memory stuff. I wouldn't attempt to resurrect the complete working module, but cut and paste at will: http://code.astraw.com/projects/motmot/browser/trunk/pycamiface/src/_cam_iface_shm.pyx?rev=328 (This was from a wrapper of a camera driver that used shared memory since the camera driver was very badly behaved and couldn't be trusted to run in the same process. I have since stopped using this code and wouldn't have time to get it working again, but it did open and use shared memory quite nicely on linux.) Also I found this web site very useful: http://www.ecst.csuchico.edu/~beej/guide/ipc/ Gael Varoquaux wrote: > On Thu, Feb 05, 2009 at 05:23:32PM -0600, Robert Kern wrote: >> BTW, Philip Semanchuk, the maintainer of the aforementioned shm >> module, contacted Sturla and myself offlist to point out two, more >> up-to-date, modules which provide named shared memory on UNIX systems: > >> http://semanchuk.com/philip/sysv_ipc/ >> http://semanchuk.com/philip/posix_ipc/ > > Interesting. I wonder how to use these. I would really like to see shared > memory in numpy by itself at some point. I did not look at the code as it > is GPL, from what I see. > > The core idea, from what I understand, would be to use the POSIX shm_open > call to expose some named shared to numpy using eg from_buffer. Or can we > simply make it point to the pointer of an existing array using shmat, if > is is contiguous? That would avoid a copy (if contiguous). > > Finally, to make sure share memory works with multiprocessing, we would > have to override pickling so that the pickling and unpicking are done > simply by storing the name of the shared memory object or retrieving it. > This is risky, because actual persistence would be destroyed. > > Under Window we would use CreateSharedMemory to perform the same trick > using CreateFileMapping and MapViewOfFile? > > Sounds fun. 
> > Ga?l > _______________________________________________ > SciPy-user mailing list > SciPy-user at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-user From ellisonbg.net at gmail.com Thu Feb 5 23:20:04 2009 From: ellisonbg.net at gmail.com (Brian Granger) Date: Thu, 5 Feb 2009 20:20:04 -0800 Subject: [SciPy-user] shared memory machines In-Reply-To: <498BB8EE.10900@astraw.com> References: <2AE6D153-799C-450E-8E69-CA80D12E2FF5@math.toronto.edu> <747c5db37a4e870a8e8f562a4636c6e7.squirrel@webmail.uio.no> <20090202063833.GB9627@phare.normalesup.org> <3d375d730902012251y1159737fk325a923a344f25cf@mail.gmail.com> <20090202105316.GE11955@phare.normalesup.org> <786d3e06228152ae2c30291b139983e4.squirrel@webmail.uio.no> <20090205231942.GB21014@phare.normalesup.org> <3d375d730902051523q179b552bg11218a0b22a161b1@mail.gmail.com> <20090205234115.GC29684@phare.normalesup.org> <498BB8EE.10900@astraw.com> Message-ID: <6ce0ac130902052020l403bd877h42eb8f9b90f8158e@mail.gmail.com> Thanks! On Thu, Feb 5, 2009 at 8:13 PM, Andrew Straw wrote: > FWIW, I wrote some BSD licensed Pyrex code that does some shared memory > stuff. I wouldn't attempt to resurrect the complete working module, but > cut and paste at will: > > http://code.astraw.com/projects/motmot/browser/trunk/pycamiface/src/_cam_iface_shm.pyx?rev=328 > > (This was from a wrapper of a camera driver that used shared memory > since the camera driver was very badly behaved and couldn't be trusted > to run in the same process. I have since stopped using this code and > wouldn't have time to get it working again, but it did open and use > shared memory quite nicely on linux.) > > Also I found this web site very useful: > http://www.ecst.csuchico.edu/~beej/guide/ipc/ > > Gael Varoquaux wrote: >> On Thu, Feb 05, 2009 at 05:23:32PM -0600, Robert Kern wrote: >>> BTW, Philip Semanchuk, the maintainer of the aforementioned shm >>> module, contacted Sturla and myself offlist to point out two, more >>> up-to-date, modules which provide named shared memory on UNIX systems: >> >>> http://semanchuk.com/philip/sysv_ipc/ >>> http://semanchuk.com/philip/posix_ipc/ >> >> Interesting. I wonder how to use these. I would really like to see shared >> memory in numpy by itself at some point. I did not look at the code as it >> is GPL, from what I see. >> >> The core idea, from what I understand, would be to use the POSIX shm_open >> call to expose some named shared to numpy using eg from_buffer. Or can we >> simply make it point to the pointer of an existing array using shmat, if >> is is contiguous? That would avoid a copy (if contiguous). >> >> Finally, to make sure share memory works with multiprocessing, we would >> have to override pickling so that the pickling and unpicking are done >> simply by storing the name of the shared memory object or retrieving it. >> This is risky, because actual persistence would be destroyed. >> >> Under Window we would use CreateSharedMemory to perform the same trick >> using CreateFileMapping and MapViewOfFile? >> >> Sounds fun. 
>> >> Ga?l >> _______________________________________________ >> SciPy-user mailing list >> SciPy-user at scipy.org >> http://projects.scipy.org/mailman/listinfo/scipy-user > > _______________________________________________ > SciPy-user mailing list > SciPy-user at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-user > From gael.varoquaux at normalesup.org Fri Feb 6 01:36:45 2009 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Fri, 6 Feb 2009 07:36:45 +0100 Subject: [SciPy-user] shared memory machines In-Reply-To: <6ce0ac130902051634i1408fddeiffd54c14f793f688@mail.gmail.com> References: <2AE6D153-799C-450E-8E69-CA80D12E2FF5@math.toronto.edu> <747c5db37a4e870a8e8f562a4636c6e7.squirrel@webmail.uio.no> <20090202063833.GB9627@phare.normalesup.org> <3d375d730902012251y1159737fk325a923a344f25cf@mail.gmail.com> <20090202105316.GE11955@phare.normalesup.org> <786d3e06228152ae2c30291b139983e4.squirrel@webmail.uio.no> <20090205231942.GB21014@phare.normalesup.org> <3d375d730902051523q179b552bg11218a0b22a161b1@mail.gmail.com> <20090205234115.GC29684@phare.normalesup.org> <6ce0ac130902051634i1408fddeiffd54c14f793f688@mail.gmail.com> Message-ID: <20090206063645.GA1704@phare.normalesup.org> On Thu, Feb 05, 2009 at 04:34:51PM -0800, Brian Granger wrote: > If so, then we should develop a shared memory version of numpy arrays > that will work in any multiple-process setting. I am thinking > multiprocessing *and* the IPython.kernel. I am +1 on that, obviously. I'd love to see a 'fork'-based IPython version, though :). Ga?l From starsareblueandfaraway at gmail.com Fri Feb 6 08:42:01 2009 From: starsareblueandfaraway at gmail.com (Roy H. Han) Date: Fri, 6 Feb 2009 08:42:01 -0500 Subject: [SciPy-user] SciPy-user Digest, Vol 66, Issue 9 In-Reply-To: References: Message-ID: <6a5569ec0902060542n42e27cf9r22b9d24de9cbee8d@mail.gmail.com> Thanks, Josef. This doesn't really answer my question, but thanks for your response. Date: Wed, 4 Feb 2009 12:44:27 -0500 From: josef.pktd at gmail.com Subject: Re: [SciPy-user] Mysterious kmeans() error To: SciPy Users List Message-ID: <1cd32cbb0902040944m306bbf0bia357c01d0f97fe6d at mail.gmail.com> Content-Type: text/plain; charset=ISO-8859-1 On Wed, Feb 4, 2009 at 12:28 PM, Roy H. Han wrote: > As a side comment, if I use Pycluster, then the clustering proceeds > without error. > > On Wed, Feb 4, 2009 at 11:31 AM, Roy H. Han > wrote: >> Has anyone seen this error before? I have no idea what it means. I'm >> using version 0.6.0 packaged for Fedora. >> I'm getting this error using the kmeans2() implementation in scipy.cluster.vq >> >> >> File "/mnt/windows/svn/networkPlanner/acquisition/libraries/probability_process.py", >> line 55, in grapeCluster >> assignments = scipy.cluster.vq.kmeans2(globalCluster, k=2, >> iter=iterationCountPerBurst)[1] >> File "/usr/lib64/python2.5/site-packages/scipy/cluster/vq.py", line >> 563, in kmeans2 >> clusters = init(data, k) >> File "/usr/lib64/python2.5/site-packages/scipy/cluster/vq.py", line >> 469, in _krandinit >> x = N.dot(x, N.linalg.cholesky(cov).T) + mu >> File "/usr/lib64/python2.5/site-packages/numpy/linalg/linalg.py", >> line 418, in cholesky >> Cholesky decomposition cannot be computed' >> numpy.linalg.linalg.LinAlgError: Matrix is not positive definite - >> Cholesky decomposition cannot be computed This is just a general answer, I never used scipy.cluster The error message means that the covariance matrix of your np.cov(data) is not positive definite. 
Check your data, whether there is any linear dependence, eg. look at eigenvalues of np.cov(data). If that's not the source of the error, then a cluster expert is needed. Josef From gaedol at gmail.com Fri Feb 6 08:52:44 2009 From: gaedol at gmail.com (Marco) Date: Fri, 6 Feb 2009 14:52:44 +0100 Subject: [SciPy-user] Lowpass Filter In-Reply-To: <498B21CF.6040105@asu.edu> References: <498B21CF.6040105@asu.edu> Message-ID: Thank you all for the pointers and ideas: I will try to do something, and let you know what comes out. Thanks, marco -- Quando sei una human pignata e la pazzo jacket si ? accorciata e non ti puoi liberare dai colpi di legno e di bastone dai petardi sul groppone Vinicio Capossela On Thu, Feb 5, 2009 at 6:28 PM, Christopher Brown wrote: > Hi Marco, > > M> Let's suppose a to be a 1D array with N elements. > M> Basically, it's a signal of some sort. > M> > M> How do I apply a low pass filter (with selected frequency and width) > M> to this signal? > M> How to store the resulting, filtered, signal, in a new array? > M> > M> I had a look at lp2lp() in scipy.signal, but it returns, if I am > M> right, a filter object, which then I dunno how to use to filter my > M> data. > M> > M> Any ideas or pointers? > > The following is a low-pass Butterworth filter > > cutoff = 500. > fs = 44100. > nyq = fs/2. > filterorder = 5 > > b,a = scipy.signal.filter_design.butter(filterorder,cutoff/nyq) > filteredsignal = scipy.signal.lfilter(b,a,signal) > > -- > Chris > _______________________________________________ > SciPy-user mailing list > SciPy-user at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-user > From starsareblueandfaraway at gmail.com Fri Feb 6 09:29:11 2009 From: starsareblueandfaraway at gmail.com (Roy H. Han) Date: Fri, 6 Feb 2009 09:29:11 -0500 Subject: [SciPy-user] Mysterious kmeans() error Message-ID: <6a5569ec0902060629u14b7b594vf568429ef1580375@mail.gmail.com> Thanks, Josef. It seems that it happens when one of the clusters becomes empty. Pycluster never seems to have the problem of empty clusters though. /usr/lib64/python2.5/site-packages/scipy/cluster/vq.py:477: UserWarning: One of the clusters is empty. Re-run kmean with a different initialization. warnings.warn("One of the clusters is empty. " Traceback (most recent call last): File "clusterProbabilities.py", line 88, in run(taskName, parameterByName) File "clusterProbabilities.py", line 57, in run locationGeoFrame = probability_process.cluster(targetLocationPath, probabilityPath, iterationCountPerBurst, maximumGeoDiameter, minimumGeoDiameter) File "/mnt/windows/svn/networkPlanner/acquisition/libraries/probability_process.py", line 33, in cluster windowLocations = grapeCluster(vectors, iterationCountPerBurst, maximumPixelDiameter, minimumPixelDiameter) File "/mnt/windows/svn/networkPlanner/acquisition/libraries/probability_process.py", line 66, in grapeCluster assignments = scipy.cluster.vq.kmeans2(globalCluster, k=2, iter=iterationCountPerBurst)[1] File "/usr/lib64/python2.5/site-packages/scipy/cluster/vq.py", line 563, in kmeans2 clusters = init(data, k) File "/usr/lib64/python2.5/site-packages/scipy/cluster/vq.py", line 469, in _krandinit x = N.dot(x, N.linalg.cholesky(cov).T) + mu File "/usr/lib64/python2.5/site-packages/numpy/linalg/linalg.py", line 418, in cholesky Cholesky decomposition cannot be computed' numpy.linalg.linalg.LinAlgError: Matrix is not positive definite - Cholesky decomposition cannot be computed On Fri, Feb 6, 2009 at 9:08 AM, wrote: > On Fri, Feb 6, 2009 at 8:42 AM, Roy H. 
Han > wrote: >> Thanks, Josef. This doesn't really answer my question, but thanks for >> your response. >> >> >> Date: Wed, 4 Feb 2009 12:44:27 -0500 >> From: josef.pktd at gmail.com >> Subject: Re: [SciPy-user] Mysterious kmeans() error >> To: SciPy Users List >> Message-ID: >> <1cd32cbb0902040944m306bbf0bia357c01d0f97fe6d at mail.gmail.com> >> Content-Type: text/plain; charset=ISO-8859-1 >> >> On Wed, Feb 4, 2009 at 12:28 PM, Roy H. Han >> wrote: >>> As a side comment, if I use Pycluster, then the clustering proceeds >>> without error. >>> >>> On Wed, Feb 4, 2009 at 11:31 AM, Roy H. Han >>> wrote: >>>> Has anyone seen this error before? I have no idea what it means. I'm >>>> using version 0.6.0 packaged for Fedora. >>>> I'm getting this error using the kmeans2() implementation in scipy.cluster.vq >>>> >>>> >>>> File "/mnt/windows/svn/networkPlanner/acquisition/libraries/probability_process.py", >>>> line 55, in grapeCluster >>>> assignments = scipy.cluster.vq.kmeans2(globalCluster, k=2, >>>> iter=iterationCountPerBurst)[1] >>>> File "/usr/lib64/python2.5/site-packages/scipy/cluster/vq.py", line >>>> 563, in kmeans2 >>>> clusters = init(data, k) >>>> File "/usr/lib64/python2.5/site-packages/scipy/cluster/vq.py", line >>>> 469, in _krandinit >>>> x = N.dot(x, N.linalg.cholesky(cov).T) + mu >>>> File "/usr/lib64/python2.5/site-packages/numpy/linalg/linalg.py", >>>> line 418, in cholesky >>>> Cholesky decomposition cannot be computed' >>>> numpy.linalg.linalg.LinAlgError: Matrix is not positive definite - >>>> Cholesky decomposition cannot be computed >> >> This is just a general answer, I never used scipy.cluster >> >> The error message means that the covariance matrix of your >> np.cov(data) is not positive definite. Check your data, whether there >> is any linear dependence, eg. look at eigenvalues of np.cov(data). >> >> If that's not the source of the error, then a cluster expert is needed. >> >> Josef >> > > I had looked a bit more, and I get the same error if the data has more > columns than rows. > The assumption in scipy.cluster is that columns represent random > variables and rows represent > observations. So, if the matrix is transposed then also the same > exception is raised as in your case > > Josef > > BTW: it's better to reply to individual threads than to the Digest, > since that preserves the subject line and threading. > From starsareblueandfaraway at gmail.com Fri Feb 6 09:37:23 2009 From: starsareblueandfaraway at gmail.com (Roy H. Han) Date: Fri, 6 Feb 2009 09:37:23 -0500 Subject: [SciPy-user] Mysterious kmeans() error In-Reply-To: <6a5569ec0902060629u14b7b594vf568429ef1580375@mail.gmail.com> References: <6a5569ec0902060629u14b7b594vf568429ef1580375@mail.gmail.com> Message-ID: <6a5569ec0902060637y6cd7d1ddt66da6e9becce8504@mail.gmail.com> Well I feel like there are numerical problems with scipy's kmeans2(), at least in the 0.6.0 version of scipy. I changed the code to try to ensure that no clusters were empty. Pycluster seems to be the better clustering algorithm for now. Even though the size (number of columns = 3) of each vector in the cluster is three, kmeans should still work even if one of the clusters contained a single vector (number of rows = 1). This is a bug. On Fri, Feb 6, 2009 at 9:29 AM, Roy H. Han wrote: > Thanks, Josef. > > It seems that it happens when one of the clusters becomes empty. > Pycluster never seems to have the problem of empty clusters though. 
> > > /usr/lib64/python2.5/site-packages/scipy/cluster/vq.py:477: > UserWarning: One of the clusters is empty. Re-run kmean with a > different initialization. > warnings.warn("One of the clusters is empty. " > > Traceback (most recent call last): > File "clusterProbabilities.py", line 88, in > run(taskName, parameterByName) > File "clusterProbabilities.py", line 57, in run > locationGeoFrame = probability_process.cluster(targetLocationPath, > probabilityPath, iterationCountPerBurst, maximumGeoDiameter, > minimumGeoDiameter) > File "/mnt/windows/svn/networkPlanner/acquisition/libraries/probability_process.py", > line 33, in cluster > windowLocations = grapeCluster(vectors, iterationCountPerBurst, > maximumPixelDiameter, minimumPixelDiameter) > File "/mnt/windows/svn/networkPlanner/acquisition/libraries/probability_process.py", > line 66, in grapeCluster > assignments = scipy.cluster.vq.kmeans2(globalCluster, k=2, > iter=iterationCountPerBurst)[1] > File "/usr/lib64/python2.5/site-packages/scipy/cluster/vq.py", line > 563, in kmeans2 > clusters = init(data, k) > File "/usr/lib64/python2.5/site-packages/scipy/cluster/vq.py", line > 469, in _krandinit > x = N.dot(x, N.linalg.cholesky(cov).T) + mu > File "/usr/lib64/python2.5/site-packages/numpy/linalg/linalg.py", > line 418, in cholesky > Cholesky decomposition cannot be computed' > numpy.linalg.linalg.LinAlgError: Matrix is not positive definite - > Cholesky decomposition cannot be computed > > > > On Fri, Feb 6, 2009 at 9:08 AM, wrote: >> On Fri, Feb 6, 2009 at 8:42 AM, Roy H. Han >> wrote: >>> Thanks, Josef. This doesn't really answer my question, but thanks for >>> your response. >>> >>> >>> Date: Wed, 4 Feb 2009 12:44:27 -0500 >>> From: josef.pktd at gmail.com >>> Subject: Re: [SciPy-user] Mysterious kmeans() error >>> To: SciPy Users List >>> Message-ID: >>> <1cd32cbb0902040944m306bbf0bia357c01d0f97fe6d at mail.gmail.com> >>> Content-Type: text/plain; charset=ISO-8859-1 >>> >>> On Wed, Feb 4, 2009 at 12:28 PM, Roy H. Han >>> wrote: >>>> As a side comment, if I use Pycluster, then the clustering proceeds >>>> without error. >>>> >>>> On Wed, Feb 4, 2009 at 11:31 AM, Roy H. Han >>>> wrote: >>>>> Has anyone seen this error before? I have no idea what it means. I'm >>>>> using version 0.6.0 packaged for Fedora. >>>>> I'm getting this error using the kmeans2() implementation in scipy.cluster.vq >>>>> >>>>> >>>>> File "/mnt/windows/svn/networkPlanner/acquisition/libraries/probability_process.py", >>>>> line 55, in grapeCluster >>>>> assignments = scipy.cluster.vq.kmeans2(globalCluster, k=2, >>>>> iter=iterationCountPerBurst)[1] >>>>> File "/usr/lib64/python2.5/site-packages/scipy/cluster/vq.py", line >>>>> 563, in kmeans2 >>>>> clusters = init(data, k) >>>>> File "/usr/lib64/python2.5/site-packages/scipy/cluster/vq.py", line >>>>> 469, in _krandinit >>>>> x = N.dot(x, N.linalg.cholesky(cov).T) + mu >>>>> File "/usr/lib64/python2.5/site-packages/numpy/linalg/linalg.py", >>>>> line 418, in cholesky >>>>> Cholesky decomposition cannot be computed' >>>>> numpy.linalg.linalg.LinAlgError: Matrix is not positive definite - >>>>> Cholesky decomposition cannot be computed >>> >>> This is just a general answer, I never used scipy.cluster >>> >>> The error message means that the covariance matrix of your >>> np.cov(data) is not positive definite. Check your data, whether there >>> is any linear dependence, eg. look at eigenvalues of np.cov(data). >>> >>> If that's not the source of the error, then a cluster expert is needed. 
>>> >>> Josef >>> >> >> I had looked a bit more, and I get the same error if the data has more >> columns than rows. >> The assumption in scipy.cluster is that columns represent random >> variables and rows represent >> observations. So, if the matrix is transposed then also the same >> exception is raised as in your case >> >> Josef >> >> BTW: it's better to reply to individual threads than to the Digest, >> since that preserves the subject line and threading. >> > From sturla at molden.no Fri Feb 6 10:15:05 2009 From: sturla at molden.no (Sturla Molden) Date: Fri, 06 Feb 2009 16:15:05 +0100 Subject: [SciPy-user] shared memory machines In-Reply-To: <6ce0ac130902051634i1408fddeiffd54c14f793f688@mail.gmail.com> References: <2AE6D153-799C-450E-8E69-CA80D12E2FF5@math.toronto.edu> <747c5db37a4e870a8e8f562a4636c6e7.squirrel@webmail.uio.no> <20090202063833.GB9627@phare.normalesup.org> <3d375d730902012251y1159737fk325a923a344f25cf@mail.gmail.com> <20090202105316.GE11955@phare.normalesup.org> <786d3e06228152ae2c30291b139983e4.squirrel@webmail.uio.no> <20090205231942.GB21014@phare.normalesup.org> <3d375d730902051523q179b552bg11218a0b22a161b1@mail.gmail.com> <20090205234115.GC29684@phare.normalesup.org> <6ce0ac130902051634i1408fddeiffd54c14f793f688@mail.gmail.com> Message-ID: <498C53F9.3070708@molden.no> On 2/6/2009 1:34 AM, Brian Granger wrote: > Can these mechanisms be used to create shared memory amongst processes > that are started in a completely independent manner. That is, they > are not fork()'d. Yes it can. If we know the name of the segment (an integer on Unix, a string on Windows), it can be mapped into any process. Similarly for named semaphores. This is different from the locks ans shared memory in multiprocessing, which must be shared through forking (Unix) or handle inheritance (Windows), and therefore created prior to instantiation of multiprocessing.Process. Otherwise, there is no valid handle to inherit. The question remains: should we base this on Cython or C (there is very little coding to do), or some third party extension, e.g. Philip Semanchuk's POSIX IPC and Mark Hammond's pywin32? I am thinking that at least for POSIX IPC, GPL is a severe limitation. Also we need some automatic clean up, which can only be accomplished with an extension object (that is, __dealloc__ in Cython will always be called, as opposed to __del__ in Python). In Pywin32 there is a PyHANDLE object that automatically calls CloseHandle when it is collected. But I don't think Semanchuk's POSIX IPC module will do the same. And avoiding dependencies on huge projects like pywin32 is also good. 
Sturla Molden From gael.varoquaux at normalesup.org Fri Feb 6 10:24:29 2009 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Fri, 6 Feb 2009 16:24:29 +0100 Subject: [SciPy-user] shared memory machines In-Reply-To: <498C53F9.3070708@molden.no> References: <747c5db37a4e870a8e8f562a4636c6e7.squirrel@webmail.uio.no> <20090202063833.GB9627@phare.normalesup.org> <3d375d730902012251y1159737fk325a923a344f25cf@mail.gmail.com> <20090202105316.GE11955@phare.normalesup.org> <786d3e06228152ae2c30291b139983e4.squirrel@webmail.uio.no> <20090205231942.GB21014@phare.normalesup.org> <3d375d730902051523q179b552bg11218a0b22a161b1@mail.gmail.com> <20090205234115.GC29684@phare.normalesup.org> <6ce0ac130902051634i1408fddeiffd54c14f793f688@mail.gmail.com> <498C53F9.3070708@molden.no> Message-ID: <20090206152429.GA13894@phare.normalesup.org> On Fri, Feb 06, 2009 at 04:15:05PM +0100, Sturla Molden wrote: > The question remains: should we base this on Cython or C (there is very > little coding to do), or some third party extension, e.g. Philip > Semanchuk's POSIX IPC and Mark Hammond's pywin32? I am thinking that at > least for POSIX IPC, GPL is a severe limitation. Also we need some > automatic clean up, which can only be accomplished with an extension > object (that is, __dealloc__ in Cython will always be called, as opposed > to __del__ in Python). In Pywin32 there is a PyHANDLE object that > automatically calls CloseHandle when it is collected. But I don't think > Semanchuk's POSIX IPC module will do the same. And avoiding dependencies > on huge projects like pywin32 is also good. I am all for avoiding external dependencies (especially if they are GPL). multiprocessing is in the standard library, I would like to be able to do shared memory parallel computing with only numpy and the standard library. Actually I can see a near future where some algorithms of scipy could have the option of using multiple cores (I am thinking of eg non-parametric statistics). The __dealloc__ argument is a very good one for going with Cython. In addition I really like the feeling of Cython code. And am I wrong in thinking that it would make the transition to Python3 easier? Ga?l From philip at semanchuk.com Fri Feb 6 10:26:46 2009 From: philip at semanchuk.com (Philip Semanchuk) Date: Fri, 6 Feb 2009 10:26:46 -0500 Subject: [SciPy-user] shared memory machines In-Reply-To: <498C53F9.3070708@molden.no> References: <2AE6D153-799C-450E-8E69-CA80D12E2FF5@math.toronto.edu> <747c5db37a4e870a8e8f562a4636c6e7.squirrel@webmail.uio.no> <20090202063833.GB9627@phare.normalesup.org> <3d375d730902012251y1159737fk325a923a344f25cf@mail.gmail.com> <20090202105316.GE11955@phare.normalesup.org> <786d3e06228152ae2c30291b139983e4.squirrel@webmail.uio.no> <20090205231942.GB21014@phare.normalesup.org> <3d375d730902051523q179b552bg11218a0b22a161b1@mail.gmail.com> <20090205234115.GC29684@phare.normalesup.org> <6ce0ac130902051634i1408fddeiffd54c14f793f688@mail.gmail.com> <498C53F9.3070708@molden.no> Message-ID: <577F2D0E-D3B9-4400-912B-3EB6EFD5F48C@semanchuk.com> On Feb 6, 2009, at 10:15 AM, Sturla Molden wrote: > On 2/6/2009 1:34 AM, Brian Granger wrote: > >> Can these mechanisms be used to create shared memory amongst >> processes >> that are started in a completely independent manner. That is, they >> are not fork()'d. > > Yes it can. If we know the name of the segment (an integer on Unix, a > string on Windows), it can be mapped into any process. Similarly for > named semaphores. 
A small correction -- SysV IPC objects are referred to with an integer key. POSIX IPC objects are referred to with a file system-ish name e.g. "/my_semaphore". > The question remains: should we base this on Cython or C (there is > very > little coding to do), or some third party extension, e.g. Philip > Semanchuk's POSIX IPC and Mark Hammond's pywin32? I am thinking that > at > least for POSIX IPC, GPL is a severe limitation. Also we need some > automatic clean up, which can only be accomplished with an extension > object (that is, __dealloc__ in Cython will always be called, as > opposed > to __del__ in Python). In Pywin32 there is a PyHANDLE object that > automatically calls CloseHandle when it is collected. But I don't > think > Semanchuk's POSIX IPC module will do the same. And avoiding > dependencies > on huge projects like pywin32 is also good. You are correct that posix_ipc doesn't close handles when deallocated. THis is a deliberate choice -- the documentation says that closing the handle makes the IPC object no longer available to the *process*. So if one has multiple handles to an IPC object (say, inside multiple threads), closing one would invalidate them all. But as I write this, I'm wondering if that's not just a documentation bug and something with which I ought to experiment. bye Philip From sturla at molden.no Fri Feb 6 10:38:17 2009 From: sturla at molden.no (Sturla Molden) Date: Fri, 06 Feb 2009 16:38:17 +0100 Subject: [SciPy-user] shared memory machines In-Reply-To: <577F2D0E-D3B9-4400-912B-3EB6EFD5F48C@semanchuk.com> References: <2AE6D153-799C-450E-8E69-CA80D12E2FF5@math.toronto.edu> <747c5db37a4e870a8e8f562a4636c6e7.squirrel@webmail.uio.no> <20090202063833.GB9627@phare.normalesup.org> <3d375d730902012251y1159737fk325a923a344f25cf@mail.gmail.com> <20090202105316.GE11955@phare.normalesup.org> <786d3e06228152ae2c30291b139983e4.squirrel@webmail.uio.no> <20090205231942.GB21014@phare.normalesup.org> <3d375d730902051523q179b552bg11218a0b22a161b1@mail.gmail.com> <20090205234115.GC29684@phare.normalesup.org> <6ce0ac130902051634i1408fddeiffd54c14f793f688@mail.gmail.com> <498C53F9.3070708@molden.no> <577F2D0E-D3B9-4400-912B-3EB6EFD5F48C@semanchuk.com> Message-ID: <498C5969.1040809@molden.no> On 2/6/2009 4:26 PM, Philip Semanchuk wrote: > You are correct that posix_ipc doesn't close handles when deallocated. > THis is a deliberate choice -- the documentation says that closing the > handle makes the IPC object no longer available to the *process*. So > if one has multiple handles to an IPC object (say, inside multiple > threads), closing one would invalidate them all. But as I write this, > I'm wondering if that's not just a documentation bug and something > with which I ought to experiment. I have been thinking about this as well. I am mostly familiar with Windows so excuse my terminology: We don't want an array to call CloseHandle() on a mapped segment that another array is still using. The effect would be global to the process. Thus, we would either need to maintain some sort of global reference count for all mapped shared resources, or make duplicates of the handle. On Windows there is a function called DuplicateHandle() that will do this. I am not sure about Unix. 
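Whatever the platform, the bookkeeping for the first option is simple enough to sketch in plain Python (the names are invented; in an extension module the GIL would serialize access to the table anyway):

# Module-level table of open segments, keyed by name.
_open_segments = {}   # name -> [handle, refcount]

def acquire(name, open_func):
    # open_func(name) is whatever platform call maps the segment
    entry = _open_segments.get(name)
    if entry is None:
        entry = _open_segments[name] = [open_func(name), 0]
    entry[1] += 1
    return entry[0]

def release(name, close_func):
    # close the underlying handle only when the last user lets go
    entry = _open_segments[name]
    entry[1] -= 1
    if entry[1] == 0:
        close_func(entry[0])
        del _open_segments[name]
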
Sturla Molden From sturla at molden.no Fri Feb 6 10:51:52 2009 From: sturla at molden.no (Sturla Molden) Date: Fri, 06 Feb 2009 16:51:52 +0100 Subject: [SciPy-user] shared memory machines In-Reply-To: <498C5969.1040809@molden.no> References: <2AE6D153-799C-450E-8E69-CA80D12E2FF5@math.toronto.edu> <747c5db37a4e870a8e8f562a4636c6e7.squirrel@webmail.uio.no> <20090202063833.GB9627@phare.normalesup.org> <3d375d730902012251y1159737fk325a923a344f25cf@mail.gmail.com> <20090202105316.GE11955@phare.normalesup.org> <786d3e06228152ae2c30291b139983e4.squirrel@webmail.uio.no> <20090205231942.GB21014@phare.normalesup.org> <3d375d730902051523q179b552bg11218a0b22a161b1@mail.gmail.com> <20090205234115.GC29684@phare.normalesup.org> <6ce0ac130902051634i1408fddeiffd54c14f793f688@mail.gmail.com> <498C53F9.3070708@molden.no> <577F2D0E-D3B9-4400-912B-3EB6EFD5F48C@semanchuk.com> <498C5969.1040809@molden.no> Message-ID: <498C5C98.6000108@molden.no> On 2/6/2009 4:38 PM, Sturla Molden wrote: > I have been thinking about this as well. I am mostly familiar with > Windows so excuse my terminology: We don't want an array to call > CloseHandle() on a mapped segment that another array is still using. The > effect would be global to the process. Thus, we would either need to > maintain some sort of global reference count for all mapped shared > resources, or make duplicates of the handle. On Windows there is a > function called DuplicateHandle() that will do this. I am not sure about > Unix. On Unix we could possibly use a WeakValueDictionary with name as key and handle as value. And then let the handle object close itself when it is collected. So an array could first look for an open handle in the dictionary, before trying to map a new one. And since we are doing all this in Cython, the GIL will take care of the synchronization. This would also work on Windows, but there we have DuplicateHandle as another option. S.M. From philip at semanchuk.com Fri Feb 6 10:54:42 2009 From: philip at semanchuk.com (Philip Semanchuk) Date: Fri, 6 Feb 2009 10:54:42 -0500 Subject: [SciPy-user] shared memory machines In-Reply-To: <498C5969.1040809@molden.no> References: <2AE6D153-799C-450E-8E69-CA80D12E2FF5@math.toronto.edu> <747c5db37a4e870a8e8f562a4636c6e7.squirrel@webmail.uio.no> <20090202063833.GB9627@phare.normalesup.org> <3d375d730902012251y1159737fk325a923a344f25cf@mail.gmail.com> <20090202105316.GE11955@phare.normalesup.org> <786d3e06228152ae2c30291b139983e4.squirrel@webmail.uio.no> <20090205231942.GB21014@phare.normalesup.org> <3d375d730902051523q179b552bg11218a0b22a161b1@mail.gmail.com> <20090205234115.GC29684@phare.normalesup.org> <6ce0ac130902051634i1408fddeiffd54c14f793f688@mail.gmail.com> <498C53F9.3070708@molden.no> <577F2D0E-D3B9-4400-912B-3EB6EFD5F48C@semanchuk.com> <498C5969.1040809@molden.no> Message-ID: <8FCF568D-EA63-45ED-9933-2DAB1B9714AC@semanchuk.com> On Feb 6, 2009, at 10:38 AM, Sturla Molden wrote: > On 2/6/2009 4:26 PM, Philip Semanchuk wrote: > >> You are correct that posix_ipc doesn't close handles when >> deallocated. >> THis is a deliberate choice -- the documentation says that closing >> the >> handle makes the IPC object no longer available to the *process*. So >> if one has multiple handles to an IPC object (say, inside multiple >> threads), closing one would invalidate them all. But as I write this, >> I'm wondering if that's not just a documentation bug and something >> with which I ought to experiment. > > I have been thinking about this as well. 
I am mostly familiar with > Windows so excuse my terminology: We don't want an array to call > CloseHandle() on a mapped segment that another array is still using. > The > effect would be global to the process. Thus, we would either need to > maintain some sort of global reference count for all mapped shared > resources, or make duplicates of the handle. On Windows there is a > function called DuplicateHandle() that will do this. I am not sure > about > Unix. On Unix, one can duplicate a file handle with a call to dup(). Note that the doc for shm_unlink() says this: "If one or more references to the shared memory object exist when the object is unlinked, the name shall be removed before shm_unlink() returns, but the removal of the memory object contents shall be postponed until all open and map references to the shared memory object have been removed." Furthermore (and this is where it gets tricky): "Even if the object continues to exist after the last shm_unlink(), reuse of the name shall subsequently cause shm_open() to behave as if no shared memory object of this name exists (that is, shm_open() will fail if O_CREAT is not set, or will create a new shared memory object if O_CREAT is set)." You'd have to do your testing very carefully to see if dup() really increments the kernel's reference count on a shared memory segment. From cournape at gmail.com Fri Feb 6 11:05:55 2009 From: cournape at gmail.com (David Cournapeau) Date: Sat, 7 Feb 2009 01:05:55 +0900 Subject: [SciPy-user] Mysterious kmeans() error In-Reply-To: <6a5569ec0902060637y6cd7d1ddt66da6e9becce8504@mail.gmail.com> References: <6a5569ec0902060629u14b7b594vf568429ef1580375@mail.gmail.com> <6a5569ec0902060637y6cd7d1ddt66da6e9becce8504@mail.gmail.com> Message-ID: <5b8d13220902060805j1b16d281v203201ad53c0df54@mail.gmail.com> On Fri, Feb 6, 2009 at 11:37 PM, Roy H. Han wrote: > Well I feel like there are numerical problems with scipy's kmeans2(), > at least in the 0.6.0 version of scipy. kmeans and kmeans2 are fairly low level - they will fail if you have empty cluster, indeed. > I changed the code to try to ensure that no clusters were empty. > Pycluster seems to be the better clustering algorithm for now. Maybe - I am not familiar with pycluster. > Even though the size (number of columns = 3) of each vector in the > cluster is three, kmeans should still work even if one of the clusters > contained a single vector (number of rows = 1). Strictly speaking, kmeans is undefined in that case - there are various strategies which can be implemented, like cluster splitting, etc... Generally, I agree the code is not great. David From josef.pktd at gmail.com Fri Feb 6 11:25:31 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Fri, 6 Feb 2009 11:25:31 -0500 Subject: [SciPy-user] Mysterious kmeans() error In-Reply-To: <5b8d13220902060805j1b16d281v203201ad53c0df54@mail.gmail.com> References: <6a5569ec0902060629u14b7b594vf568429ef1580375@mail.gmail.com> <6a5569ec0902060637y6cd7d1ddt66da6e9becce8504@mail.gmail.com> <5b8d13220902060805j1b16d281v203201ad53c0df54@mail.gmail.com> Message-ID: <1cd32cbb0902060825i65265e9idd359ee0a2522cea@mail.gmail.com> On Fri, Feb 6, 2009 at 11:05 AM, David Cournapeau wrote: > On Fri, Feb 6, 2009 at 11:37 PM, Roy H. Han > wrote: >> Well I feel like there are numerical problems with scipy's kmeans2(), >> at least in the 0.6.0 version of scipy. > > kmeans and kmeans2 are fairly low level - they will fail if you have > empty cluster, indeed. 
I thought that the tests test_kmeans_lost_cluster(self) verifies that empty clusters are handled. > >> I changed the code to try to ensure that no clusters were empty. >> Pycluster seems to be the better clustering algorithm for now. > > Maybe - I am not familiar with pycluster. > >> Even though the size (number of columns = 3) of each vector in the >> cluster is three, kmeans should still work even if one of the clusters >> contained a single vector (number of rows = 1). > > Strictly speaking, kmeans is undefined in that case - there are > various strategies which can be implemented, like cluster splitting, > etc... Generally, I agree the code is not great. > > David If the problem is just the cholesky decomposition in the random initialization, then it should be possible to switch to a different initialization scheme, or force a correct covariance matrix for the cholesky decomposition. Eg. replace with diagonal matrix or, ensure that cov has the right dimension and add a small diagonal array (as in Ridge regression or Tychonov penalization). Josef From sturla at molden.no Fri Feb 6 11:40:48 2009 From: sturla at molden.no (Sturla Molden) Date: Fri, 06 Feb 2009 17:40:48 +0100 Subject: [SciPy-user] shared memory machines In-Reply-To: <8FCF568D-EA63-45ED-9933-2DAB1B9714AC@semanchuk.com> References: <2AE6D153-799C-450E-8E69-CA80D12E2FF5@math.toronto.edu> <747c5db37a4e870a8e8f562a4636c6e7.squirrel@webmail.uio.no> <20090202063833.GB9627@phare.normalesup.org> <3d375d730902012251y1159737fk325a923a344f25cf@mail.gmail.com> <20090202105316.GE11955@phare.normalesup.org> <786d3e06228152ae2c30291b139983e4.squirrel@webmail.uio.no> <20090205231942.GB21014@phare.normalesup.org> <3d375d730902051523q179b552bg11218a0b22a161b1@mail.gmail.com> <20090205234115.GC29684@phare.normalesup.org> <6ce0ac130902051634i1408fddeiffd54c14f793f688@mail.gmail.com> <498C53F9.3070708@molden.no> <577F2D0E-D3B9-4400-912B-3EB6EFD5F48C@semanchuk.com> <498C5969.1040809@molden.no> <8FCF568D-EA63-45ED-9933-2DAB1B9714AC@semanchuk.com> Message-ID: <498C6810.7030702@molden.no> On 2/6/2009 4:54 PM, Philip Semanchuk wrote: > You'd have to do your testing very carefully to see if dup() really > increments the kernel's reference count on a shared memory segment. Ok, in that case it is probably better to let Python take care of the reference counting. S.M. From c-b at asu.edu Fri Feb 6 15:59:29 2009 From: c-b at asu.edu (Christopher Brown) Date: Fri, 06 Feb 2009 13:59:29 -0700 Subject: [SciPy-user] Zero crossings Message-ID: <498CA4B1.3030807@asu.edu> Hi List, What's the best way to find all zero crossings in my data? Is there something already written in scipy? -- Chris From sturla at molden.no Fri Feb 6 16:13:46 2009 From: sturla at molden.no (Sturla Molden) Date: Fri, 06 Feb 2009 22:13:46 +0100 Subject: [SciPy-user] Zero crossings In-Reply-To: <498CA4B1.3030807@asu.edu> References: <498CA4B1.3030807@asu.edu> Message-ID: <498CA80A.3000707@molden.no> On 2/6/2009 9:59 PM, Christopher Brown wrote: > Hi List, > > What's the best way to find all zero crossings in my data? Is there > something already written in scipy? > zc = numpy.where(numpy.sign(a[1:]) != numpy.sign(a[:-1])) ... or something like that. 
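For instance, on a toy sine this picks out the sign changes and even gives a crude frequency estimate (just a sketch; the 50 Hz test signal is made up):

import numpy as np

fs = 1000.0                                    # sampling rate, Hz
t = np.arange(0, 1, 1 / fs)
a = np.sin(2 * np.pi * 50 * t)                 # 50 Hz test tone

# indices i where the sign changes between a[i] and a[i+1]
zc = np.where(np.sign(a[1:]) != np.sign(a[:-1]))[0]

# two sign changes per period -> crude frequency estimate
print(0.5 * len(zc) / (t[-1] - t[0]))          # roughly 50
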
Sturla Molden From sturla at molden.no Fri Feb 6 16:22:53 2009 From: sturla at molden.no (Sturla Molden) Date: Fri, 06 Feb 2009 22:22:53 +0100 Subject: [SciPy-user] shared memory machines In-Reply-To: <498C5969.1040809@molden.no> References: <2AE6D153-799C-450E-8E69-CA80D12E2FF5@math.toronto.edu> <747c5db37a4e870a8e8f562a4636c6e7.squirrel@webmail.uio.no> <20090202063833.GB9627@phare.normalesup.org> <3d375d730902012251y1159737fk325a923a344f25cf@mail.gmail.com> <20090202105316.GE11955@phare.normalesup.org> <786d3e06228152ae2c30291b139983e4.squirrel@webmail.uio.no> <20090205231942.GB21014@phare.normalesup.org> <3d375d730902051523q179b552bg11218a0b22a161b1@mail.gmail.com> <20090205234115.GC29684@phare.normalesup.org> <6ce0ac130902051634i1408fddeiffd54c14f793f688@mail.gmail.com> <498C53F9.3070708@molden.no> <577F2D0E-D3B9-4400-912B-3EB6EFD5F48C@semanchuk.com> <498C5969.1040809@molden.no> Message-ID: <498CAA2D.10102@molden.no> Ok, so this is approximately what I had in mind for Windows. It is a named mutex and shared memory that is pickled by name (given that I read the Python manuals on pickling extension objects correctly...) It still lacks an ndarray subclass that is pickled without making a copy of the buffer, and also a malloc similar to multiprocessing. And similar Cython code has to be written for posix... But this is a start. If anyone feel like contributing, please do. S.M. -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: sharedmemory_win.pyx URL: From jean-pascal.mercier at inrialpes.fr Fri Feb 6 16:31:16 2009 From: jean-pascal.mercier at inrialpes.fr (J-Pascal Mercier) Date: Fri, 6 Feb 2009 22:31:16 +0100 Subject: [SciPy-user] Zero crossings In-Reply-To: <498CA4B1.3030807@asu.edu> References: <498CA4B1.3030807@asu.edu> Message-ID: <20090206223116.65da02fd@utopia> On Fri, 06 Feb 2009 13:59:29 -0700 Christopher Brown wrote: > Hi List, > > What's the best way to find all zero crossings in my data? Is there > something already written in scipy? > Hi, To my knowledge, there is no such function in scipy. Are your data 1D, 2D, 3D, ... ? What kind of precision you need? Do you have to find every zero-crossing? An easy solution in 1D without sub-grid accuracy would be something like : scipy.where(A[:-1] * A[1:] < 0) cheers, J-Pascal -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 197 bytes Desc: not available URL: From karl.young at ucsf.edu Fri Feb 6 17:08:01 2009 From: karl.young at ucsf.edu (Karl Young) Date: Fri, 06 Feb 2009 14:08:01 -0800 Subject: [SciPy-user] stupid array tricks In-Reply-To: <1cd32cbb0902051230k2c024a0cnc25448f6c1613679@mail.gmail.com> References: <5063d0650902050337s3a4b656k71f9a5634c589589@mail.gmail.com> <498AF9E7.90200@molden.no> <1cd32cbb0902050736v5ae55230l215910f312562f2a@mail.gmail.com> <1cd32cbb0902050839m4e587390s31ef7b5f6267c5d4@mail.gmail.com> <498B1A3B.8040603@molden.no> <1cd32cbb0902050923j492f5683icbe737533159a24e@mail.gmail.com> <498B2207.2030303@molden.no> <1cd32cbb0902050946i700777cdhc920711cb393353f@mail.gmail.com> <498B29F6.4080508@molden.no> <498B4326.7060207@molden.no> <1cd32cbb0902051230k2c024a0cnc25448f6c1613679@mail.gmail.com> Message-ID: <498CB4C1.6040302@ucsf.edu> I know there are a number of array manipulation maestros on the list and wanted to run a problem by the list that my feeble mind is foundering on. 
I have three objects, 1) an array containing an "image" (could be any dimension), 2) a mask for the image (of the same dimensions as the image), and 3) a "template" which is just a list of offset coordinates from any point in the image. Say the template has n elements, the problem is to move the template over the image and build an m x n array containing image values (at the template positions) where m is the number of image indices such that the template lies within the image and all mask values are true. I currently do this using some raveling, compressing, and length comparison but I still haven't been able to figure out how to do it without looping through the image indices (and this is sloooowwww for big multidimensional "images"). I keep feeling like there must be some clever way to concatenate the template with the image and mask so as to do this without looping but haven't been able to come up with it. I could speed it up by doing the looping in C but that doesn't seem very elegant. Any thoughts welcome. --KY From guilherme at gpfreitas.com Fri Feb 6 18:19:50 2009 From: guilherme at gpfreitas.com (Guilherme P. de Freitas) Date: Fri, 6 Feb 2009 15:19:50 -0800 Subject: [SciPy-user] Computational Economics with SciPy In-Reply-To: References: Message-ID: Actually it is being shipped already. Mine arrived today from Amazon! I'm a econ graduate student trying to do the computational assignments in Python, and these resources seem *very* helpful. Especially when none of your professors or classmates ever used Python before, so you have nobody to ask for help. I don't know what's the policy for what should and what should not be linked on SciPy's website, but I think these resources should definitely be linked there (notably the "Cookbook" section and the "Topical Software" section). Best, Guilherme On Wed, Jan 28, 2009 at 12:39 PM, Peter Skomoroch wrote: > Just stumbled across a new book by John Stachurski using scipy which will > ship later this month > > Economic Dynamics: Theory and Computation > John Stachurski > MIT Press, 2009 > http://www.amazon.com/Economic-Dynamics-Computation-John-Stachurski/dp/0262012774 > http://johnstachurski.net/book/book.html > > There are some nice tutorials using scipy here as well: > > http://johnstachurski.net/lectures/index.html > > >> Economic Dynamics: Theory and Computation is a graduate level introduction >> to deterministic and stochastic dynamics, dynamic programming and >> computational methods with economic applications. >> >> Topics >> >> Programming techniques >> Basic analysis (real analysis, metric spaces, fixed points) >> Deterministic dynamic systems >> Finite state Markov chains >> Finite state dynamic programming >> Continuous state stochastic dynamics >> Continuous state dynamic programming > > -Pete > > > > > -- > Peter N. Skomoroch > peter.skomoroch at gmail.com > http://www.datawrangling.com > http://del.icio.us/pskomoroch > > _______________________________________________ > SciPy-user mailing list > SciPy-user at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-user > > -- Guilherme P. de Freitas http://www.gpfreitas.com From robert.kern at gmail.com Fri Feb 6 18:27:26 2009 From: robert.kern at gmail.com (Robert Kern) Date: Fri, 6 Feb 2009 17:27:26 -0600 Subject: [SciPy-user] Computational Economics with SciPy In-Reply-To: References: Message-ID: <3d375d730902061527t47914da8i71427d3c65ab9a87@mail.gmail.com> On Fri, Feb 6, 2009 at 17:19, Guilherme P. de Freitas wrote: > Actually it is being shipped already. 
Mine arrived today from Amazon! > I'm a econ graduate student trying to do the computational assignments > in Python, and these resources seem *very* helpful. Especially when > none of your professors or classmates ever used Python before, so you > have nobody to ask for help. > > I don't know what's the policy for what should and what should not be > linked on SciPy's website, but I think these resources should > definitely be linked there (notably the "Cookbook" section and the > "Topical Software" section). I think a page describing the book and linking to its home page (not an Amazon link, for preference) would be good. I don't think it really fits into the Cookbook section, though. Hopefully, there will eventually be enough books to make a special category for such pages. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From cournape at gmail.com Fri Feb 6 19:14:55 2009 From: cournape at gmail.com (David Cournapeau) Date: Sat, 7 Feb 2009 09:14:55 +0900 Subject: [SciPy-user] Mysterious kmeans() error In-Reply-To: <1cd32cbb0902060825i65265e9idd359ee0a2522cea@mail.gmail.com> References: <6a5569ec0902060629u14b7b594vf568429ef1580375@mail.gmail.com> <6a5569ec0902060637y6cd7d1ddt66da6e9becce8504@mail.gmail.com> <5b8d13220902060805j1b16d281v203201ad53c0df54@mail.gmail.com> <1cd32cbb0902060825i65265e9idd359ee0a2522cea@mail.gmail.com> Message-ID: <5b8d13220902061614q685e6d2ei413d3797d1812ae0@mail.gmail.com> On Sat, Feb 7, 2009 at 1:25 AM, wrote: > On Fri, Feb 6, 2009 at 11:05 AM, David Cournapeau wrote: >> On Fri, Feb 6, 2009 at 11:37 PM, Roy H. Han >> wrote: >>> Well I feel like there are numerical problems with scipy's kmeans2(), >>> at least in the 0.6.0 version of scipy. >> >> kmeans and kmeans2 are fairly low level - they will fail if you have >> empty cluster, indeed. > > I thought that the tests test_kmeans_lost_cluster(self) verifies that > empty clusters > are handled. Actually, it tests a warning/exception is raised, instead of silently fail - so you can for example repeat the kmeans procedure with different initializations values (that's how I use kmeans in the em toolbox). But again, a better kmeans algorithm implementation would be nice - I just not sure it should be in scipy, though, David From c-b at asu.edu Fri Feb 6 19:28:41 2009 From: c-b at asu.edu (Christopher Brown) Date: Fri, 06 Feb 2009 17:28:41 -0700 Subject: [SciPy-user] Zero crossings In-Reply-To: <498CA80A.3000707@molden.no> References: <498CA4B1.3030807@asu.edu> <498CA80A.3000707@molden.no> Message-ID: <498CD5B9.7060706@asu.edu> Thanks Sturla and J-Pascal, SM> zc = numpy.where(numpy.sign(a[1:]) != numpy.sign(a[:-1])) I want to estimate f0 based on zero crossings in recorded speech, and I've got something that looks pretty good for only a few hours of work. Does anyone have any interest in this kind of thing? -- Christopher Brown, Ph.D. 
Department of Speech and Hearing Science Arizona State University From robert.kern at gmail.com Fri Feb 6 19:31:57 2009 From: robert.kern at gmail.com (Robert Kern) Date: Fri, 6 Feb 2009 18:31:57 -0600 Subject: [SciPy-user] Zero crossings In-Reply-To: <498CD5B9.7060706@asu.edu> References: <498CA4B1.3030807@asu.edu> <498CA80A.3000707@molden.no> <498CD5B9.7060706@asu.edu> Message-ID: <3d375d730902061631m5b228828hfb49309580fe5000@mail.gmail.com> On Fri, Feb 6, 2009 at 18:28, Christopher Brown wrote: > Thanks Sturla and J-Pascal, > > SM> zc = numpy.where(numpy.sign(a[1:]) != numpy.sign(a[:-1])) > > I want to estimate f0 based on zero crossings in recorded speech, and > I've got something that looks pretty good for only a few hours of work. > > Does anyone have any interest in this kind of thing? Sure! It would make a good recipe for the Cookbook. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From josef.pktd at gmail.com Fri Feb 6 19:35:08 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Fri, 6 Feb 2009 19:35:08 -0500 Subject: [SciPy-user] Mysterious kmeans() error In-Reply-To: <5b8d13220902061614q685e6d2ei413d3797d1812ae0@mail.gmail.com> References: <6a5569ec0902060629u14b7b594vf568429ef1580375@mail.gmail.com> <6a5569ec0902060637y6cd7d1ddt66da6e9becce8504@mail.gmail.com> <5b8d13220902060805j1b16d281v203201ad53c0df54@mail.gmail.com> <1cd32cbb0902060825i65265e9idd359ee0a2522cea@mail.gmail.com> <5b8d13220902061614q685e6d2ei413d3797d1812ae0@mail.gmail.com> Message-ID: <1cd32cbb0902061635l336ed7cex695e344be1e64a3e@mail.gmail.com> On Fri, Feb 6, 2009 at 7:14 PM, David Cournapeau wrote: > On Sat, Feb 7, 2009 at 1:25 AM, wrote: >> On Fri, Feb 6, 2009 at 11:05 AM, David Cournapeau wrote: >>> On Fri, Feb 6, 2009 at 11:37 PM, Roy H. Han >>> wrote: >>>> Well I feel like there are numerical problems with scipy's kmeans2(), >>>> at least in the 0.6.0 version of scipy. >>> >>> kmeans and kmeans2 are fairly low level - they will fail if you have >>> empty cluster, indeed. >> >> I thought that the tests test_kmeans_lost_cluster(self) verifies that >> empty clusters >> are handled. > > Actually, it tests a warning/exception is raised, instead of silently > fail - so you can for example repeat the kmeans procedure with > different initializations values (that's how I use kmeans in the em > toolbox). Doesn't random initialization automatically restart with different random values. When I ran the example in test_kmeans_lost_cluster, it seemed to produce reasonable results after the warning, but I didn't verify any numbers. Also the follow up error that the OP got was in the cov calculation for the random init. So it seems to me that there is a failure in reinitializing the process. (But, I only looked at the source for this part and don't know how the cluster analysis in scipy is constructed overall.) Josef > > But again, a better kmeans algorithm implementation would be nice - I > just not sure it should be in scipy, though, > > David > _______________________________________________ > SciPy-user mailing list > SciPy-user at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-user > From guilherme at gpfreitas.com Sat Feb 7 06:20:52 2009 From: guilherme at gpfreitas.com (Guilherme P. 
de Freitas) Date: Sat, 7 Feb 2009 03:20:52 -0800 Subject: [SciPy-user] Computational Economics with SciPy In-Reply-To: <3d375d730902061527t47914da8i71427d3c65ab9a87@mail.gmail.com> References: <3d375d730902061527t47914da8i71427d3c65ab9a87@mail.gmail.com> Message-ID: On Fri, Feb 6, 2009 at 3:27 PM, Robert Kern wrote: > On Fri, Feb 6, 2009 at 17:19, Guilherme P. de Freitas > wrote: >> I don't know what's the policy for what should and what should not be >> linked on SciPy's website, but I think these resources should >> definitely be linked there (notably the "Cookbook" section and the >> "Topical Software" section). > > I think a page describing the book and linking to its home page (not > an Amazon link, for preference) would be good. I don't think it really > fits into the Cookbook section, though. Hopefully, there will > eventually be enough books to make a special category for such pages. Here's the link to the book page: http://johnstachurski.net/book/book.html As for the Cookbook section, I'm sorry, I should have been more specific. In the lectures, there are "Application" lectures, like "Finite-state Optimal Growth" http://johnstachurski.net/lectures/finite_growth.html And I feel these specific application lectures could be linked under the cookbook or topical software section. They are essentially recipes. However, the applications often refer to the book, but it is not strictly necessary, if you understand the problem. Just an idea. It's just that it took me a while to find this, and there's nothing like this in the SciPy website (which is a natural place to search). From stefan at sun.ac.za Sat Feb 7 12:36:46 2009 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Sat, 7 Feb 2009 19:36:46 +0200 Subject: [SciPy-user] stupid array tricks In-Reply-To: <498CB4C1.6040302@ucsf.edu> References: <5063d0650902050337s3a4b656k71f9a5634c589589@mail.gmail.com> <1cd32cbb0902050839m4e587390s31ef7b5f6267c5d4@mail.gmail.com> <498B1A3B.8040603@molden.no> <1cd32cbb0902050923j492f5683icbe737533159a24e@mail.gmail.com> <498B2207.2030303@molden.no> <1cd32cbb0902050946i700777cdhc920711cb393353f@mail.gmail.com> <498B29F6.4080508@molden.no> <498B4326.7060207@molden.no> <1cd32cbb0902051230k2c024a0cnc25448f6c1613679@mail.gmail.com> <498CB4C1.6040302@ucsf.edu> Message-ID: <9457e7c80902070936w792c9329o114f0a67acfbda3d@mail.gmail.com> 2009/2/7 Karl Young : > I have three objects, 1) an array containing an "image" (could be any > dimension), 2) a mask for the image (of the same dimensions as the > image), and 3) a "template" which is just a list of offset coordinates > from any point in the image. You can create a strided view of the image, so that the values around each position where the filter can be applied becomes a row. Thereafter, using the indexing tricks shown at http://mentat.za.net/numpy/numpy_advanced_slides/ index the view to produce the templated values at each position. Say your template has length n, then you'd have: template of shape (1, n) rows = np.arange(m)[:, None] with shape (m, 1) When using template and rows in a fancy indexing operating, you should get an output of shape (m, n). 
Here is a simplified example: # Strided view of your image In [25]: data Out[25]: array([[ 0, 1, 2], [ 3, 4, 5], [ 6, 7, 8], [ 9, 10, 11]]) In [26]: rows Out[26]: array([[0], [1], [2], [3]]) In [27]: rows.shape Out[27]: (4, 1) In [28]: template Out[28]: array([[0, 2]]) In [29]: template.shape Out[29]: (1, 2) In [30]: data[rows, template] Out[30]: array([[ 0, 2], [ 3, 5], [ 6, 8], [ 9, 11]]) Hope that helps! Cheers St?fan From dav at alum.mit.edu Sat Feb 7 17:13:27 2009 From: dav at alum.mit.edu (Dav Clark) Date: Sat, 7 Feb 2009 14:13:27 -0800 Subject: [SciPy-user] failed easy_install on OSX Message-ID: <80AC4A04-B81C-4524-B504-B2FEA32C5AF0@alum.mit.edu> Hi, I found a small bug (more with OSX than with SciPy) but worth mentioning. If you upgrade setuptools on OS X without changing your path, for some reason /usr/bin/easy_install (system setuptools 0.6c7) will remain ahead of /usr/local/bin/easy_install (your current install). Then, if you try to do an easy_install of scipy, it fails because setuptools 0.6c7 doesn't provide the proper fcompiler attribute. Two solutions: 1) download and run setup.py manually - this will use the most recent setuptools via python or 2) Change your PATH so that /usr/local/bin comes before /usr/bin. Why this isn't the case already, I have no idea. I guess it's apples way of insulating casual users from hackers like us. I would like to put this advice here: http://www.scipy.org/Installing_SciPy/Mac_OS_X But I don't have permission. If you want to give me permission, I am DavClark on the scipy.org wiki. Cheers, Dav From robert.kern at gmail.com Sat Feb 7 17:27:33 2009 From: robert.kern at gmail.com (Robert Kern) Date: Sat, 7 Feb 2009 16:27:33 -0600 Subject: [SciPy-user] failed easy_install on OSX In-Reply-To: <80AC4A04-B81C-4524-B504-B2FEA32C5AF0@alum.mit.edu> References: <80AC4A04-B81C-4524-B504-B2FEA32C5AF0@alum.mit.edu> Message-ID: <3d375d730902071427r259ecebctd4383f9db41baf11@mail.gmail.com> On Sat, Feb 7, 2009 at 16:13, Dav Clark wrote: > Hi, > > I found a small bug (more with OSX than with SciPy) but worth > mentioning. > > If you upgrade setuptools on OS X without changing your path, for some > reason /usr/bin/easy_install (system setuptools 0.6c7) will remain > ahead of /usr/local/bin/easy_install (your current install). Then, if > you try to do an easy_install of scipy, it fails because setuptools > 0.6c7 doesn't provide the proper fcompiler attribute. No version of setuptools provides an fcompiler attribute. That's all numpy.distutils. I suspect there is a different problem going on. The system's Python comes with a 1.0.x series numpy. I think that is the root of the problem. > Two solutions: > > 1) download and run setup.py manually - this will use the most recent > setuptools via python > > or > > 2) Change your PATH so that /usr/local/bin comes before /usr/bin. Why > this isn't the case already, I have no idea. I guess it's apples way > of insulating casual users from hackers like us. I believe all of the Python binaries I am aware of (www.python.org, Activestate, and EPD) will modify your .bashrc or .bash_profile to place the appropriate bin/ path (not always /usr/local/bin/!) at the front of your $PATH. If you are using a different shell, you may have to do this manually. Additionally, the default installation location for scripts is not /usr/local/bin/ but /Library/Frameworks/Python.framework/Versions/Current/bin/, so I suspect you have modified your .pydistutilsrc file to point there. When you modify things, you are on your own. 
:-) > I would like to put this advice here: > > http://www.scipy.org/Installing_SciPy/Mac_OS_X > > But I don't have permission. If you want to give me permission, I am > DavClark on the scipy.org wiki. You have now been added to the EditorsGroup. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From dav at alum.mit.edu Sat Feb 7 19:12:24 2009 From: dav at alum.mit.edu (Dav Clark) Date: Sat, 7 Feb 2009 16:12:24 -0800 Subject: [SciPy-user] failed easy_install on OSX In-Reply-To: <3d375d730902071427r259ecebctd4383f9db41baf11@mail.gmail.com> References: <80AC4A04-B81C-4524-B504-B2FEA32C5AF0@alum.mit.edu> <3d375d730902071427r259ecebctd4383f9db41baf11@mail.gmail.com> Message-ID: On Feb 7, 2009, at 2:27 PM, Robert Kern wrote: > On Sat, Feb 7, 2009 at 16:13, Dav Clark wrote: >> Hi, >> >> I found a small bug (more with OSX than with SciPy) but worth >> mentioning. >> >> If you upgrade setuptools on OS X without changing your path, for >> some >> reason /usr/bin/easy_install (system setuptools 0.6c7) will remain >> ahead of /usr/local/bin/easy_install (your current install). Then, >> if >> you try to do an easy_install of scipy, it fails because setuptools >> 0.6c7 doesn't provide the proper fcompiler attribute. > > No version of setuptools provides an fcompiler attribute. That's all > numpy.distutils. I suspect there is a different problem going on. The > system's Python comes with a 1.0.x series numpy. I think that is the > root of the problem. > >> Two solutions: >> >> 1) download and run setup.py manually - this will use the most recent >> setuptools via python >> >> or >> >> 2) Change your PATH so that /usr/local/bin comes before /usr/bin. >> Why >> this isn't the case already, I have no idea. I guess it's apples way >> of insulating casual users from hackers like us. > > I believe all of the Python binaries I am aware of (www.python.org, > Activestate, and EPD) will modify your .bashrc or .bash_profile to > place the appropriate bin/ path (not always /usr/local/bin/!) at the > front of your $PATH. If you are using a different shell, you may have > to do this manually. Additionally, the default installation location > for scripts is not /usr/local/bin/ but > /Library/Frameworks/Python.framework/Versions/Current/bin/, so I > suspect you have modified your .pydistutilsrc file to point there. > When you modify things, you are on your own. :-) This is a super-fresh install of OS X, using the system python. I have definitely not modified the .pydistutilsrc file... that's just where Apple set things to go by default for the system python. This problem shouldn't occur for a /Library/Framework install. Cheers, Dav From dmitrey15 at ukr.net Sun Feb 8 13:55:13 2009 From: dmitrey15 at ukr.net (Dmitrey) Date: Sun, 08 Feb 2009 20:55:13 +0200 Subject: [SciPy-user] MPS files, Python code - does anyone have? Message-ID: <498F2A91.9000201@ukr.net> Hi there, does anyone have Python-written code that can read and/or write MPS files? I could write it by myself but I'm short of time (I'm busy with other things). Having the one would be very helpful for connecting more LP/MILP solvers to openopt. Regards, D. 
From gael.varoquaux at normalesup.org Sun Feb 8 19:00:46 2009 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Mon, 9 Feb 2009 01:00:46 +0100 Subject: [SciPy-user] shared memory machines In-Reply-To: <498CAA2D.10102@molden.no> References: <20090202105316.GE11955@phare.normalesup.org> <786d3e06228152ae2c30291b139983e4.squirrel@webmail.uio.no> <20090205231942.GB21014@phare.normalesup.org> <3d375d730902051523q179b552bg11218a0b22a161b1@mail.gmail.com> <20090205234115.GC29684@phare.normalesup.org> <6ce0ac130902051634i1408fddeiffd54c14f793f688@mail.gmail.com> <498C53F9.3070708@molden.no> <577F2D0E-D3B9-4400-912B-3EB6EFD5F48C@semanchuk.com> <498C5969.1040809@molden.no> <498CAA2D.10102@molden.no> Message-ID: <20090209000046.GD12866@phare.normalesup.org> On Fri, Feb 06, 2009 at 10:22:53PM +0100, Sturla Molden wrote: > > Ok, so this is approximately what I had in mind for Windows. It is a named > mutex and shared memory that is pickled by name (given that I read the > Python manuals on pickling extension objects correctly...) > > It still lacks an ndarray subclass that is pickled without making a copy of > the buffer, and also a malloc similar to multiprocessing. > > And similar Cython code has to be written for posix... OK, I've given it try, but it seems that my sheer incompetence on these matters is about to be revealed. Running the attached test code, I get a bus error. The output of test.py is: {'/c0aa50edb5a04371b8414ef16a49a4fa': (3070545920L, 409600)} Buffer created Array created 3070545920 [1] 9882 bus error python test.py I am quite clueless as to where this comes from (I can see different posibilities) and how to debug this. Once again, this is from sheer incompetence, but I have never mmaped files throught the C API, and my days of C, especially memory allocation in C, are very far. I am posting this on the mailing list hoping that someone will have an idea as to what I am doing wrong. Once this work, we can start looking at making this clean to have posix and windows implementations work together. Ga?l -------------- next part -------------- A non-text attachment was scrubbed... Name: shared_arrays.zip Type: application/x-zip-compressed Size: 9628 bytes Desc: not available URL: From philip at semanchuk.com Sun Feb 8 19:22:20 2009 From: philip at semanchuk.com (Philip Semanchuk) Date: Sun, 8 Feb 2009 19:22:20 -0500 Subject: [SciPy-user] shared memory machines In-Reply-To: <20090209000046.GD12866@phare.normalesup.org> References: <20090202105316.GE11955@phare.normalesup.org> <786d3e06228152ae2c30291b139983e4.squirrel@webmail.uio.no> <20090205231942.GB21014@phare.normalesup.org> <3d375d730902051523q179b552bg11218a0b22a161b1@mail.gmail.com> <20090205234115.GC29684@phare.normalesup.org> <6ce0ac130902051634i1408fddeiffd54c14f793f688@mail.gmail.com> <498C53F9.3070708@molden.no> <577F2D0E-D3B9-4400-912B-3EB6EFD5F48C@semanchuk.com> <498C5969.1040809@molden.no> <498CAA2D.10102@molden.no> <20090209000046.GD12866@phare.normalesup.org> Message-ID: <56B03E36-8CA9-4F34-B0C4-C38C833C8ACD@semanchuk.com> On Feb 8, 2009, at 7:00 PM, Gael Varoquaux wrote: > On Fri, Feb 06, 2009 at 10:22:53PM +0100, Sturla Molden wrote: >> >> Ok, so this is approximately what I had in mind for Windows. It is >> a named >> mutex and shared memory that is pickled by name (given that I read >> the >> Python manuals on pickling extension objects correctly...) >> >> It still lacks an ndarray subclass that is pickled without making a >> copy of >> the buffer, and also a malloc similar to multiprocessing. 
>> >> And similar Cython code has to be written for posix... > > OK, I've given it try, but it seems that my sheer incompetence on > these > matters is about to be revealed. Running the attached test code, I > get a > bus error. The output of test.py is: > > {'/c0aa50edb5a04371b8414ef16a49a4fa': (3070545920L, 409600)} > Buffer created > Array created > 3070545920 > [1] 9882 bus error python test.py > > I am quite clueless as to where this comes from (I can see different > posibilities) and how to debug this. > > Once again, this is from sheer incompetence, but I have never mmaped > files throught the C API, and my days of C, especially memory > allocation > in C, are very far. Hi Ga?l, I believe one must call ftruncate() on the file handle returned by shm_open(). Look at the example at the bottom of this page: http://www.opengroup.org/onlinepubs/000095399/functions/shm_open.html I hope this info is useful. Philip From gael.varoquaux at normalesup.org Mon Feb 9 01:15:11 2009 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Mon, 9 Feb 2009 07:15:11 +0100 Subject: [SciPy-user] shared memory machines In-Reply-To: <56B03E36-8CA9-4F34-B0C4-C38C833C8ACD@semanchuk.com> References: <20090205231942.GB21014@phare.normalesup.org> <3d375d730902051523q179b552bg11218a0b22a161b1@mail.gmail.com> <20090205234115.GC29684@phare.normalesup.org> <6ce0ac130902051634i1408fddeiffd54c14f793f688@mail.gmail.com> <498C53F9.3070708@molden.no> <577F2D0E-D3B9-4400-912B-3EB6EFD5F48C@semanchuk.com> <498C5969.1040809@molden.no> <498CAA2D.10102@molden.no> <20090209000046.GD12866@phare.normalesup.org> <56B03E36-8CA9-4F34-B0C4-C38C833C8ACD@semanchuk.com> Message-ID: <20090209061511.GB26350@phare.normalesup.org> On Sun, Feb 08, 2009 at 07:22:20PM -0500, Philip Semanchuk wrote: > Hi Ga?l, > I believe one must call ftruncate() on the file handle returned by > shm_open(). Look at the example at the bottom of this page: > http://www.opengroup.org/onlinepubs/000095399/functions/shm_open.html Hurray, that was it! The code snippet at the end of this page is very clear. Thank you for the pointer. > I hope this info is useful. It really was. Thanks a lot. I need to do a few more checks, but I believe I have a first version of some code sharing arrays by name. Ga?l From gael.varoquaux at normalesup.org Mon Feb 9 03:23:44 2009 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Mon, 9 Feb 2009 09:23:44 +0100 Subject: [SciPy-user] shared memory machines In-Reply-To: <20090209061511.GB26350@phare.normalesup.org> References: <3d375d730902051523q179b552bg11218a0b22a161b1@mail.gmail.com> <20090205234115.GC29684@phare.normalesup.org> <6ce0ac130902051634i1408fddeiffd54c14f793f688@mail.gmail.com> <498C53F9.3070708@molden.no> <577F2D0E-D3B9-4400-912B-3EB6EFD5F48C@semanchuk.com> <498C5969.1040809@molden.no> <498CAA2D.10102@molden.no> <20090209000046.GD12866@phare.normalesup.org> <56B03E36-8CA9-4F34-B0C4-C38C833C8ACD@semanchuk.com> <20090209061511.GB26350@phare.normalesup.org> Message-ID: <20090209082344.GA635@phare.normalesup.org> On Mon, Feb 09, 2009 at 07:15:11AM +0100, Gael Varoquaux wrote: > It really was. Thanks a lot. I need to do a few more checks, but I > believe I have a first version of some code sharing arrays by name. OK, I have a first working version under Unix (attached, with trivial test case). Now we need to make it so that the ndarray can be used in the mutliprocessing function call, rather than the buffer object. 
In other words we need to create an object that behaves as an ndarray, but implements a different pickling method. What do people suggest as a best approach here? Subclassing ndarray? Cheers, Ga?l -------------- next part -------------- A non-text attachment was scrubbed... Name: shared_arrays.zip Type: application/x-zip-compressed Size: 12108 bytes Desc: not available URL: From sturla at molden.no Mon Feb 9 06:20:33 2009 From: sturla at molden.no (Sturla Molden) Date: Mon, 09 Feb 2009 12:20:33 +0100 Subject: [SciPy-user] shared memory machines In-Reply-To: <20090209082344.GA635@phare.normalesup.org> References: <3d375d730902051523q179b552bg11218a0b22a161b1@mail.gmail.com> <20090205234115.GC29684@phare.normalesup.org> <6ce0ac130902051634i1408fddeiffd54c14f793f688@mail.gmail.com> <498C53F9.3070708@molden.no> <577F2D0E-D3B9-4400-912B-3EB6EFD5F48C@semanchuk.com> <498C5969.1040809@molden.no> <498CAA2D.10102@molden.no> <20090209000046.GD12866@phare.normalesup.org> <56B03E36-8CA9-4F34-B0C4-C38C833C8ACD@semanchuk.com> <20090209061511.GB26350@phare.normalesup.org> <20090209082344.GA635@phare.normalesup.org> Message-ID: <49901181.3040801@molden.no> On 2/9/2009 9:23 AM, Gael Varoquaux wrote: > What do people suggest as a best > approach here? Subclassing ndarray? I have been working on that. Basically using what Robert Kern posted a while ago and ripping out some code from multiprocessing's heap object. S.M. From gael.varoquaux at normalesup.org Mon Feb 9 06:23:47 2009 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Mon, 9 Feb 2009 12:23:47 +0100 Subject: [SciPy-user] shared memory machines In-Reply-To: <49901181.3040801@molden.no> References: <6ce0ac130902051634i1408fddeiffd54c14f793f688@mail.gmail.com> <498C53F9.3070708@molden.no> <577F2D0E-D3B9-4400-912B-3EB6EFD5F48C@semanchuk.com> <498C5969.1040809@molden.no> <498CAA2D.10102@molden.no> <20090209000046.GD12866@phare.normalesup.org> <56B03E36-8CA9-4F34-B0C4-C38C833C8ACD@semanchuk.com> <20090209061511.GB26350@phare.normalesup.org> <20090209082344.GA635@phare.normalesup.org> <49901181.3040801@molden.no> Message-ID: <20090209112347.GC32331@phare.normalesup.org> On Mon, Feb 09, 2009 at 12:20:33PM +0100, Sturla Molden wrote: > On 2/9/2009 9:23 AM, Gael Varoquaux wrote: > > What do people suggest as a best > > approach here? Subclassing ndarray? > I have been working on that. Basically using what Robert Kern posted a > while ago and ripping out some code from multiprocessing's heap object. Fantastic. I have to worry about other things for a little while (real-work related), so I won't be competing without you to find a good solution :). Cheers, Ga?l From sturla at molden.no Mon Feb 9 06:28:59 2009 From: sturla at molden.no (Sturla Molden) Date: Mon, 09 Feb 2009 12:28:59 +0100 Subject: [SciPy-user] shared memory machines In-Reply-To: <20090209082344.GA635@phare.normalesup.org> References: <3d375d730902051523q179b552bg11218a0b22a161b1@mail.gmail.com> <20090205234115.GC29684@phare.normalesup.org> <6ce0ac130902051634i1408fddeiffd54c14f793f688@mail.gmail.com> <498C53F9.3070708@molden.no> <577F2D0E-D3B9-4400-912B-3EB6EFD5F48C@semanchuk.com> <498C5969.1040809@molden.no> <498CAA2D.10102@molden.no> <20090209000046.GD12866@phare.normalesup.org> <56B03E36-8CA9-4F34-B0C4-C38C833C8ACD@semanchuk.com> <20090209061511.GB26350@phare.normalesup.org> <20090209082344.GA635@phare.normalesup.org> Message-ID: <4990137B.6030802@molden.no> On 2/9/2009 9:23 AM, Gael Varoquaux wrote: > On Mon, Feb 09, 2009 at 07:15:11AM +0100, Gael Varoquaux wrote: >> It really was. 
Thanks a lot. I need to do a few more checks, but I >> believe I have a first version of some code sharing arrays by name. > > OK, I have a first working version under Unix (attached, with trivial > test case). By the way, how is memory reclaimed under your Posix code? On Windows, a memory mapping is removed when there is no open handles to it. That is what the Handle object does (i.e. preventing a sytem wide memory leak). On System V IPC a shared segment it has to be marked for removal, i.e. there are no reference counting in the kernel as in Windows. So I was thinking out marking it for removal when the attachment count is zero. But as you have used Posix V IPC I have no idea. Just make sure it does not produce a global memory leak. S.M. From gael.varoquaux at normalesup.org Mon Feb 9 06:38:35 2009 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Mon, 9 Feb 2009 12:38:35 +0100 Subject: [SciPy-user] shared memory machines In-Reply-To: <4990137B.6030802@molden.no> References: <6ce0ac130902051634i1408fddeiffd54c14f793f688@mail.gmail.com> <498C53F9.3070708@molden.no> <577F2D0E-D3B9-4400-912B-3EB6EFD5F48C@semanchuk.com> <498C5969.1040809@molden.no> <498CAA2D.10102@molden.no> <20090209000046.GD12866@phare.normalesup.org> <56B03E36-8CA9-4F34-B0C4-C38C833C8ACD@semanchuk.com> <20090209061511.GB26350@phare.normalesup.org> <20090209082344.GA635@phare.normalesup.org> <4990137B.6030802@molden.no> Message-ID: <20090209113835.GD32331@phare.normalesup.org> On Mon, Feb 09, 2009 at 12:28:59PM +0100, Sturla Molden wrote: > On System V IPC a shared segment it has to be marked for removal, i.e. > there are no reference counting in the kernel as in Windows. So I was > thinking out marking it for removal when the attachment count is zero. > But as you have used Posix V IPC I have no idea. Just make sure it does > not produce a global memory leak. Hum, I believe you are right, and I have produced just that. This means that we probably need a shared reference counter :(. Sounds tedious to implement. Do people have any suggestions on how to implement this? I can see several possibilities: * Using multiprocessing to share the dictionnary of shared map addresses, but this induces a tight coupling with multiprocessing, and I am not sure we want this. * Sharing this dictionnary via a C structure, ie to do our own implementation of a shared state. * Add the ref count information in the shared array. For instance the first byte could be the ref count. This sounds the easiest option, but I am probably not seeing some of the problems that will arize from this approach. I think I am going to take a stab at option three, tonight or later in the week, but please, wise people of the list, give me feedback on what you think might work. 
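In outline, option three would be something like the sketch below, assuming the
first bytes of the mapped buffer are reserved for the count and that 'lock' is
some interprocess lock (a named semaphore, say). Every name here is a
placeholder, nothing below is existing code:

import struct

HEADER = struct.Struct('l')   # the count lives in the first bytes of the segment

def incref(buf, lock):
    # buf is the mmap'ed shared segment; lock is an interprocess lock (placeholder)
    with lock:
        (count,) = HEADER.unpack_from(buf, 0)
        HEADER.pack_into(buf, 0, count + 1)

def decref(buf, lock):
    with lock:
        (count,) = HEADER.unpack_from(buf, 0)
        HEADER.pack_into(buf, 0, count - 1)
        return count - 1      # whoever sees 0 here unlinks the segment

The array data itself would then start at offset HEADER.size; the lock is needed
because the read-modify-write of the counter is not atomic across processes.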
Ga?l From sturla at molden.no Mon Feb 9 06:56:56 2009 From: sturla at molden.no (Sturla Molden) Date: Mon, 09 Feb 2009 12:56:56 +0100 Subject: [SciPy-user] shared memory machines In-Reply-To: <20090209113835.GD32331@phare.normalesup.org> References: <6ce0ac130902051634i1408fddeiffd54c14f793f688@mail.gmail.com> <498C53F9.3070708@molden.no> <577F2D0E-D3B9-4400-912B-3EB6EFD5F48C@semanchuk.com> <498C5969.1040809@molden.no> <498CAA2D.10102@molden.no> <20090209000046.GD12866@phare.normalesup.org> <56B03E36-8CA9-4F34-B0C4-C38C833C8ACD@semanchuk.com> <20090209061511.GB26350@phare.normalesup.org> <20090209082344.GA635@phare.normalesup.org> <4990137B.6030802@molden.no> <20090209113835.GD32331@phare.normalesup.org> Message-ID: <49901A08.9020104@molden.no> On 2/9/2009 12:38 PM, Gael Varoquaux wrote: > This means that we probably need a shared reference counter :(. Sounds > tedious to implement. On System V, you can get the attachment count using shmctl with IPC_STAT. Then after calling shmdt, checking the count and marking for removal if it is zero: int cleanup(int shmid) { int ierr; struct shmid_ds buf; ierr = shmctl(shmid, IPC_STAT, &buf); if(ierr < 0) goto error; if (buf.shm_nattch == 0) { ierr = shmctl(shmid, IPC_RMID, NULL); if(ierr < 0) goto error; } return 0 error: return errno; } S.M. From philip at semanchuk.com Mon Feb 9 10:07:16 2009 From: philip at semanchuk.com (Philip Semanchuk) Date: Mon, 9 Feb 2009 10:07:16 -0500 Subject: [SciPy-user] shared memory machines In-Reply-To: <49901A08.9020104@molden.no> References: <6ce0ac130902051634i1408fddeiffd54c14f793f688@mail.gmail.com> <498C53F9.3070708@molden.no> <577F2D0E-D3B9-4400-912B-3EB6EFD5F48C@semanchuk.com> <498C5969.1040809@molden.no> <498CAA2D.10102@molden.no> <20090209000046.GD12866@phare.normalesup.org> <56B03E36-8CA9-4F34-B0C4-C38C833C8ACD@semanchuk.com> <20090209061511.GB26350@phare.normalesup.org> <20090209082344.GA635@phare.normalesup.org> <4990137B.6030802@molden.no> <20090209113835.GD32331@phare.normalesup.org> <49901A08.9020104@molden.no> Message-ID: On Feb 9, 2009, at 6:56 AM, Sturla Molden wrote: > On 2/9/2009 12:38 PM, Gael Varoquaux wrote: > >> This means that we probably need a shared reference counter :(. >> Sounds >> tedious to implement. > > On System V, you can get the attachment count using shmctl with > IPC_STAT. Then after calling shmdt, checking the count and marking for > removal if it is zero: > > int cleanup(int shmid) > { > int ierr; > struct shmid_ds buf; > ierr = shmctl(shmid, IPC_STAT, &buf); > if(ierr < 0) goto error; > if (buf.shm_nattch == 0) { > ierr = shmctl(shmid, IPC_RMID, NULL); > if(ierr < 0) goto error; > } > return 0 > error: > return errno; > } Unfortunately POSIX IPC doesn't report that information. Since I'm not a numpy user I'm a little lost as to how you're using the shared memory here, but I gather that it is effectively "magic" to a numpy user? i.e., he doesn't have any idea that a shared memory segment is being created on his behalf? If that's the case I don't see any way around reference counting. 
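Roughly, the "magic" under discussion is an ndarray subclass whose pickle
carries only the segment name and the metadata, so that sending it to another
process never copies the data. A minimal sketch, assuming a placeholder helper
open_shared_buffer(name) that attaches an existing named segment; views and
offsets are deliberately ignored here, although the real code would have to
deal with them:

import numpy as np

def _rebuild(name, dtype, shape):
    # Runs in the receiving process: re-attach the named segment, no data copy.
    buf = open_shared_buffer(name)                 # placeholder helper
    return SharedArray(shape, dtype=dtype, buffer=buf, name=name)

class SharedArray(np.ndarray):
    def __new__(cls, shape, dtype=float, buffer=None, name=None):
        obj = np.ndarray.__new__(cls, shape, dtype=dtype, buffer=buffer)
        obj._shm_name = name
        return obj

    def __array_finalize__(self, obj):
        self._shm_name = getattr(obj, '_shm_name', None)

    def __reduce__(self):
        # Only metadata goes into the pickle; the buffer stays in shared memory.
        return _rebuild, (self._shm_name, self.dtype, self.shape)

An instance passed through a multiprocessing.Queue would then cost only the
pickle of the name, dtype and shape rather than a copy of the whole buffer.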
From gael.varoquaux at normalesup.org Mon Feb 9 10:09:20 2009 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Mon, 9 Feb 2009 16:09:20 +0100 Subject: [SciPy-user] shared memory machines In-Reply-To: References: <498C5969.1040809@molden.no> <498CAA2D.10102@molden.no> <20090209000046.GD12866@phare.normalesup.org> <56B03E36-8CA9-4F34-B0C4-C38C833C8ACD@semanchuk.com> <20090209061511.GB26350@phare.normalesup.org> <20090209082344.GA635@phare.normalesup.org> <4990137B.6030802@molden.no> <20090209113835.GD32331@phare.normalesup.org> <49901A08.9020104@molden.no> Message-ID: <20090209150920.GC27832@phare.normalesup.org> On Mon, Feb 09, 2009 at 10:07:16AM -0500, Philip Semanchuk wrote: > Since I'm not a numpy user I'm a little lost as to how you're using > the shared memory here, but I gather that it is effectively "magic" to > a numpy user? i.e., he doesn't have any idea that a shared memory > segment is being created on his behalf? My goal would be that he shouldn't have to know, or to care. It should be as much transparent as possible. > If that's the case I don't see any way around reference counting. Thanks for your input, it is valued a lot. Ga?l From philip at semanchuk.com Mon Feb 9 10:28:44 2009 From: philip at semanchuk.com (Philip Semanchuk) Date: Mon, 9 Feb 2009 10:28:44 -0500 Subject: [SciPy-user] shared memory machines In-Reply-To: <20090209082344.GA635@phare.normalesup.org> References: <3d375d730902051523q179b552bg11218a0b22a161b1@mail.gmail.com> <20090205234115.GC29684@phare.normalesup.org> <6ce0ac130902051634i1408fddeiffd54c14f793f688@mail.gmail.com> <498C53F9.3070708@molden.no> <577F2D0E-D3B9-4400-912B-3EB6EFD5F48C@semanchuk.com> <498C5969.1040809@molden.no> <498CAA2D.10102@molden.no> <20090209000046.GD12866@phare.normalesup.org> <56B03E36-8CA9-4F34-B0C4-C38C833C8ACD@semanchuk.com> <20090209061511.GB26350@phare.normalesup.org> <20090209082344.GA635@phare.normalesup.org> Message-ID: <64DD8CB4-3089-4695-91E7-F98907954E13@semanchuk.com> On Feb 9, 2009, at 3:23 AM, Gael Varoquaux wrote: > On Mon, Feb 09, 2009 at 07:15:11AM +0100, Gael Varoquaux wrote: >> It really was. Thanks a lot. I need to do a few more checks, but I >> believe I have a first version of some code sharing arrays by name. > > OK, I have a first working version under Unix (attached, with trivial > test case). > > Now we need to make it so that the ndarray can be used in the > mutliprocessing function call, rather than the buffer object. In other > words we need to create an object that behaves as an ndarray, but > implements a different pickling method. What do people suggest as a > best > approach here? Subclassing ndarray? Ga?l, I notice that the size of the shared memory segment is set to "pages" * PAGESIZE. Who determines the value of "pages"? And what happens if the numpy object you're storing in the segment grows beyond that size? AFAIK ftruncate() can only be called *once* to resize the segment. That's true on OS X, anyway, so it's probably true elsewhere. I once wrote some code to implement a shared dict using shared memory, and this was a problem I ran into. What happens when an item grows? The solution I eventually developed was to have one shared memory segment for metadata and a collection of other shared memory segments to hold the actual data. The metadata segment stored a (pickled) free space map and if a request was made to store an item that was larger than any free space I had, I'd allocate a new segment of the appropriate size. 
Otherwise, I'd stick it in the smallest piece of free space that it would fit into in an existing segment. You can perhaps see where this is leading -- once one is tracking free space slots and so forth, one needs to think about memory compaction, too, because sooner or later items will get deleted from the dict and if nothing new is inserted all of that free space is sitting around going to waste. Also, is it consistent with your license to use code from Python itself? If so, then I have another minor suggestion. Cheers Philip From robert.kern at gmail.com Mon Feb 9 11:24:36 2009 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 9 Feb 2009 10:24:36 -0600 Subject: [SciPy-user] shared memory machines In-Reply-To: <64DD8CB4-3089-4695-91E7-F98907954E13@semanchuk.com> References: <3d375d730902051523q179b552bg11218a0b22a161b1@mail.gmail.com> <498C53F9.3070708@molden.no> <577F2D0E-D3B9-4400-912B-3EB6EFD5F48C@semanchuk.com> <498C5969.1040809@molden.no> <498CAA2D.10102@molden.no> <20090209000046.GD12866@phare.normalesup.org> <56B03E36-8CA9-4F34-B0C4-C38C833C8ACD@semanchuk.com> <20090209061511.GB26350@phare.normalesup.org> <20090209082344.GA635@phare.normalesup.org> <64DD8CB4-3089-4695-91E7-F98907954E13@semanchuk.com> Message-ID: <3d375d730902090824y39a69cc0i8551d94df4be39b7@mail.gmail.com> On Mon, Feb 9, 2009 at 09:28, Philip Semanchuk wrote: > Also, is it consistent with your license to use code from Python > itself? If so, then I have another minor suggestion. Yup. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From sturla at molden.no Mon Feb 9 11:42:36 2009 From: sturla at molden.no (Sturla Molden) Date: Mon, 09 Feb 2009 17:42:36 +0100 Subject: [SciPy-user] shared memory machines In-Reply-To: References: <6ce0ac130902051634i1408fddeiffd54c14f793f688@mail.gmail.com> <498C53F9.3070708@molden.no> <577F2D0E-D3B9-4400-912B-3EB6EFD5F48C@semanchuk.com> <498C5969.1040809@molden.no> <498CAA2D.10102@molden.no> <20090209000046.GD12866@phare.normalesup.org> <56B03E36-8CA9-4F34-B0C4-C38C833C8ACD@semanchuk.com> <20090209061511.GB26350@phare.normalesup.org> <20090209082344.GA635@phare.normalesup.org> <4990137B.6030802@molden.no> <20090209113835.GD32331@phare.normalesup.org> <49901A08.9020104@molden.no> Message-ID: <49905CFC.9010507@molden.no> On 2/9/2009 4:07 PM, Philip Semanchuk wrote: > Unfortunately POSIX IPC doesn't report that information. I'll suggest we use System V IPC instead, as it does report a ref count. Code example attached. It compiles with Cython but I have not done any testing except that. My suggestion is to spawn a thread in the creator process to monitor the attachment count for the segment, and mark it for removal when it has dropped to zero. There is a __dealloc__ in a Handle object that does the shmdt, and then Python should do the refcounting (similar to what is done for CloseHandle in Windows). We have to figure out what to do with ctrl-c. It is a source of trouble. With a daemonic GC thread it could cause a leak, with a non-daemonic GC thread it may hang forever (which is also a leak). So I opted for a daemonic GC thread. I also have a version of the Windows sharedmem with a small bugfix (I forgot to unmap the segment before closing the handle). I had to remove the mutex from the Windows code. It can be put in a separate module. We should also have a lock with a named Sys V semaphore. 
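The monitor thread described above would be little more than the following
sketch, where attach_count() and remove_segment() stand for thin wrappers
around shmctl(IPC_STAT) and shmctl(IPC_RMID) that do not exist under these
names:

import threading
import time

def _gc_monitor(shmid, interval=0.5):
    # Poll until no process has the segment attached, then mark it for removal.
    # The creator keeps its own attachment while it holds a reference, so the
    # count only reaches zero once everybody is done with the array.
    while True:
        if attach_count(shmid) == 0:       # placeholder for shmctl(IPC_STAT)
            remove_segment(shmid)          # placeholder for shmctl(IPC_RMID)
            return
        time.sleep(interval)

def start_gc(shmid):
    t = threading.Thread(target=_gc_monitor, args=(shmid,))
    t.daemon = True    # daemonic, so a ctrl-c in the creator may leak the segment
    t.start()
    return t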
> Since I'm not a numpy user I'm a little lost as to how you're using > the shared memory here, but I gather that it is effectively "magic" to > a numpy user? i.e., he doesn't have any idea that a shared memory > segment is being created on his behalf? If that's the case I don't see > any way around reference counting. We are going to use multiple processes as if they were threads. It is basically a hack to work around Python's GIL (global interpreter lock). Basically we want to create ndarray's with the same interface as before, except that they have shared memory as data. For example, import numpy a = numpy.zeros((4,1024), order='F', dtype=float) import scipy a = scipy.sharedmem.zeros((4,1024), order='F', dtype=float) should do the same, except that the latter uses shared memory. And when it is sent through a multiprocessing.Queue, only the segment name, offset, shape and dtype gets pickled. In the former case, a copy of the whole data buffer is made. Right now we are just creating the shared memory buffer to use as backend. In multiprocessing you will find an object called mp.Array. We can wrap its buffer with an ndarray, but it cannot be passes through a mp.Queue. In other words, all shared memory must be allocated in advance. And that is what we don't want. Sturla Molden -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: sharedmemory_sysv.pyx URL: -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: sharedmemory_win.pyx URL: From philip at semanchuk.com Mon Feb 9 11:48:58 2009 From: philip at semanchuk.com (Philip Semanchuk) Date: Mon, 9 Feb 2009 11:48:58 -0500 Subject: [SciPy-user] shared memory machines In-Reply-To: <3d375d730902090824y39a69cc0i8551d94df4be39b7@mail.gmail.com> References: <3d375d730902051523q179b552bg11218a0b22a161b1@mail.gmail.com> <498C53F9.3070708@molden.no> <577F2D0E-D3B9-4400-912B-3EB6EFD5F48C@semanchuk.com> <498C5969.1040809@molden.no> <498CAA2D.10102@molden.no> <20090209000046.GD12866@phare.normalesup.org> <56B03E36-8CA9-4F34-B0C4-C38C833C8ACD@semanchuk.com> <20090209061511.GB26350@phare.normalesup.org> <20090209082344.GA635@phare.normalesup.org> <64DD8CB4-3089-4695-91E7-F98907954E13@semanchuk.com> <3d375d730902090824y39a69cc0i8551d94df4be39b7@mail.gmail.com> Message-ID: <4CF29518-2439-4938-A85B-E2B4DEE68D57@semanchuk.com> On Feb 9, 2009, at 11:24 AM, Robert Kern wrote: > On Mon, Feb 9, 2009 at 09:28, Philip Semanchuk > wrote: >> Also, is it consistent with your license to use code from Python >> itself? If so, then I have another minor suggestion. > > Yup. I'm not sure how prevalent the getpagesize() API is. You might want to consider using the following code (from Python's mmapmodule.c) to get the page size. 
#ifdef MS_WINDOWS #include static int my_getpagesize(void) { SYSTEM_INFO si; GetSystemInfo(&si); return si.dwPageSize; } #endif #ifdef UNIX #include #include #if defined(HAVE_SYSCONF) && defined(_SC_PAGESIZE) static int my_getpagesize(void) { return sysconf(_SC_PAGESIZE); } #else #define my_getpagesize getpagesize #endif #endif /* UNIX */ From sturla at molden.no Mon Feb 9 11:59:42 2009 From: sturla at molden.no (Sturla Molden) Date: Mon, 09 Feb 2009 17:59:42 +0100 Subject: [SciPy-user] shared memory machines In-Reply-To: <64DD8CB4-3089-4695-91E7-F98907954E13@semanchuk.com> References: <3d375d730902051523q179b552bg11218a0b22a161b1@mail.gmail.com> <20090205234115.GC29684@phare.normalesup.org> <6ce0ac130902051634i1408fddeiffd54c14f793f688@mail.gmail.com> <498C53F9.3070708@molden.no> <577F2D0E-D3B9-4400-912B-3EB6EFD5F48C@semanchuk.com> <498C5969.1040809@molden.no> <498CAA2D.10102@molden.no> <20090209000046.GD12866@phare.normalesup.org> <56B03E36-8CA9-4F34-B0C4-C38C833C8ACD@semanchuk.com> <20090209061511.GB26350@phare.normalesup.org> <20090209082344.GA635@phare.normalesup.org> <64DD8CB4-3089-4695-91E7-F98907954E13@semanchuk.com> Message-ID: <499060FE.6020608@molden.no> On 2/9/2009 4:28 PM, Philip Semanchuk wrote: > I once wrote some code to implement a shared dict using shared memory, > and this was a problem I ran into. We should have that removed. The actual allocation will be rounded up to a multiple of the page size. So to prevent a leak we should round up before allocating and reporting the actual size. > What happens when an item grows? We don't want an array to grow or move once it has been created. But a process should be allowed to create subarray views. S.M. From simpson at math.toronto.edu Mon Feb 9 12:04:22 2009 From: simpson at math.toronto.edu (Gideon Simpson) Date: Mon, 9 Feb 2009 12:04:22 -0500 Subject: [SciPy-user] root finding in complex valued systems Message-ID: Do any of the SciPy root finding algorithms for systems of equations have native support for when the equations are complex valued? -gideon From karl.young at ucsf.edu Mon Feb 9 12:03:45 2009 From: karl.young at ucsf.edu (Young, Karl) Date: Mon, 9 Feb 2009 09:03:45 -0800 Subject: [SciPy-user] stupid array tricks References: <5063d0650902050337s3a4b656k71f9a5634c589589@mail.gmail.com><1cd32cbb0902050839m4e587390s31ef7b5f6267c5d4@mail.gmail.com><498B1A3B.8040603@molden.no><1cd32cbb0902050923j492f5683icbe737533159a24e@mail.gmail.com><498B2207.2030303@molden.no><1cd32cbb0902050946i700777cdhc920711cb393353f@mail.gmail.com><498B29F6.4080508@molden.no> <498B4326.7060207@molden.no><1cd32cbb0902051230k2c024a0cnc25448f6c1613679@mail.gmail.com><498CB4C1.6040302@ucsf.edu> <9457e7c80902070936w792c9329o114f0a67acfbda3d@mail.gmail.com> Message-ID: <9D202D4E86A4BF47BA6943ABDF21BE78058FAB8A@EXVS06.net.ucsf.edu> Thanks Stefan ! The index meister comes through again. I was sort of thinking along those lines but couldn't quite take the final step of understanding how to get the row coordinates for arbitrary filter locations. BTW, I should send you the latest version of my modification of glcom; the "profiling" that showed this part of my code to be the current bottleneck was the result of the generalized glcom (arbitrary number of co-registered images and arbitrary templates) being so fast at generating the co-occurrence matrices (for others on the list, Stefan wrote a very nice ctypes module, glcom, that generates co-occurrence matrices from an image for doing texture analysis). 
Karl Young Center for Imaging of Neurodegenerative Disease, UCSF VA Medical Center, MRS Unit (114M) Phone: (415) 221-4810 x3114 FAX: (415) 668-2864 Email: karl young at ucsf edu -----Original Message----- From: scipy-user-bounces at scipy.org on behalf of St?fan van der Walt Sent: Sat 2/7/2009 9:36 AM To: SciPy Users List Subject: Re: [SciPy-user] stupid array tricks 2009/2/7 Karl Young : > I have three objects, 1) an array containing an "image" (could be any > dimension), 2) a mask for the image (of the same dimensions as the > image), and 3) a "template" which is just a list of offset coordinates > from any point in the image. You can create a strided view of the image, so that the values around each position where the filter can be applied becomes a row. Thereafter, using the indexing tricks shown at http://mentat.za.net/numpy/numpy_advanced_slides/ index the view to produce the templated values at each position. Say your template has length n, then you'd have: template of shape (1, n) rows = np.arange(m)[:, None] with shape (m, 1) When using template and rows in a fancy indexing operating, you should get an output of shape (m, n). Here is a simplified example: # Strided view of your image In [25]: data Out[25]: array([[ 0, 1, 2], [ 3, 4, 5], [ 6, 7, 8], [ 9, 10, 11]]) In [26]: rows Out[26]: array([[0], [1], [2], [3]]) In [27]: rows.shape Out[27]: (4, 1) In [28]: template Out[28]: array([[0, 2]]) In [29]: template.shape Out[29]: (1, 2) In [30]: data[rows, template] Out[30]: array([[ 0, 2], [ 3, 5], [ 6, 8], [ 9, 11]]) Hope that helps! Cheers St?fan _______________________________________________ SciPy-user mailing list SciPy-user at scipy.org http://projects.scipy.org/mailman/listinfo/scipy-user From robert.kern at gmail.com Mon Feb 9 12:42:13 2009 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 9 Feb 2009 11:42:13 -0600 Subject: [SciPy-user] root finding in complex valued systems In-Reply-To: References: Message-ID: <3d375d730902090942q542b70c3r77f5aaf55f72f88f@mail.gmail.com> On Mon, Feb 9, 2009 at 11:04, Gideon Simpson wrote: > Do any of the SciPy root finding algorithms for systems of equations > have native support for when the equations are complex valued? Not really. You have to separate out the .real and .imag separately. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From sturla at molden.no Mon Feb 9 12:44:07 2009 From: sturla at molden.no (Sturla Molden) Date: Mon, 09 Feb 2009 18:44:07 +0100 Subject: [SciPy-user] shared memory machines In-Reply-To: <4CF29518-2439-4938-A85B-E2B4DEE68D57@semanchuk.com> References: <3d375d730902051523q179b552bg11218a0b22a161b1@mail.gmail.com> <498C53F9.3070708@molden.no> <577F2D0E-D3B9-4400-912B-3EB6EFD5F48C@semanchuk.com> <498C5969.1040809@molden.no> <498CAA2D.10102@molden.no> <20090209000046.GD12866@phare.normalesup.org> <56B03E36-8CA9-4F34-B0C4-C38C833C8ACD@semanchuk.com> <20090209061511.GB26350@phare.normalesup.org> <20090209082344.GA635@phare.normalesup.org> <64DD8CB4-3089-4695-91E7-F98907954E13@semanchuk.com> <3d375d730902090824y39a69cc0i8551d94df4be39b7@mail.gmail.com> <4CF29518-2439-4938-A85B-E2B4DEE68D57@semanchuk.com> Message-ID: <49906B67.5070109@molden.no> On 2/9/2009 5:48 PM, Philip Semanchuk wrote: > I'm not sure how prevalent the getpagesize() API is. You might want to > consider using the following code (from Python's mmapmodule.c) to get > the page size. 
I think we can just use mmap.PAGESIZE :) S.M. From philip at semanchuk.com Mon Feb 9 12:49:06 2009 From: philip at semanchuk.com (Philip Semanchuk) Date: Mon, 9 Feb 2009 12:49:06 -0500 Subject: [SciPy-user] shared memory machines In-Reply-To: <49905CFC.9010507@molden.no> References: <6ce0ac130902051634i1408fddeiffd54c14f793f688@mail.gmail.com> <498C53F9.3070708@molden.no> <577F2D0E-D3B9-4400-912B-3EB6EFD5F48C@semanchuk.com> <498C5969.1040809@molden.no> <498CAA2D.10102@molden.no> <20090209000046.GD12866@phare.normalesup.org> <56B03E36-8CA9-4F34-B0C4-C38C833C8ACD@semanchuk.com> <20090209061511.GB26350@phare.normalesup.org> <20090209082344.GA635@phare.normalesup.org> <4990137B.6030802@molden.no> <20090209113835.GD32331@phare.normalesup.org> <49901A08.9020104@molden.no> <49905CFC.9010507@molden.no> Message-ID: <450D80F1-8023-43F0-A02C-28A87919F775@semanchuk.com> On Feb 9, 2009, at 11:42 AM, Sturla Molden wrote: > On 2/9/2009 4:07 PM, Philip Semanchuk wrote: > >> Unfortunately POSIX IPC doesn't report that information. > > I'll suggest we use System V IPC instead, as it does report a ref > count. Code example attached. It compiles with Cython but I have not > done any testing except that. > > My suggestion is to spawn a thread in the creator process to monitor > the attachment count for the segment, and mark it for removal when > it has dropped to zero. There is a __dealloc__ in a Handle object > that does the shmdt, and then Python should do the refcounting > (similar to what is done for CloseHandle in Windows). If you're destroying the segment when the attach count drops to zero, why not check that immediately after the call to shmdt()? > key = ftok( (self.name), 0) ftok() should probably be avoided as it returns duplicate keys: http://nikitathespider.com/python/shm/#ftok I'd recommend using a random number generator instead. I believe a key_t is guaranteed to fit into an int, so you could generate a random number anywhere from 1 to INT_MAX, taking care not to step on the value IPC_PRIVATE (unless you want to assume that that is always #defined to 0). From sturla at molden.no Mon Feb 9 13:05:35 2009 From: sturla at molden.no (Sturla Molden) Date: Mon, 09 Feb 2009 19:05:35 +0100 Subject: [SciPy-user] shared memory machines In-Reply-To: <450D80F1-8023-43F0-A02C-28A87919F775@semanchuk.com> References: <6ce0ac130902051634i1408fddeiffd54c14f793f688@mail.gmail.com> <498C53F9.3070708@molden.no> <577F2D0E-D3B9-4400-912B-3EB6EFD5F48C@semanchuk.com> <498C5969.1040809@molden.no> <498CAA2D.10102@molden.no> <20090209000046.GD12866@phare.normalesup.org> <56B03E36-8CA9-4F34-B0C4-C38C833C8ACD@semanchuk.com> <20090209061511.GB26350@phare.normalesup.org> <20090209082344.GA635@phare.normalesup.org> <4990137B.6030802@molden.no> <20090209113835.GD32331@phare.normalesup.org> <49901A08.9020104@molden.no> <49905CFC.9010507@molden.no> <450D80F1-8023-43F0-A02C-28A87919F775@semanchuk.com> Message-ID: <4990706F.7000505@molden.no> On 2/9/2009 6:49 PM, Philip Semanchuk wrote: > If you're destroying the segment when the attach count drops to zero, > why not check that immediately after the call to shmdt()? I thought it was only the owner/creator that was allowed to do that? > ftok() should probably be avoided as it returns duplicate keys: > http://nikitathespider.com/python/shm/#ftok Oh :( In that case I could rewrite the object to pickle the shmid instead of a random name (uuid string) on System V. > I'd recommend using a random number generator instead. 
I believe a > key_t is guaranteed to fit into an int, so you could generate a random > number anywhere from 1 to INT_MAX, taking care not to step on the > value IPC_PRIVATE (unless you want to assume that that is always > #defined to 0). I am not sure how big the problem is, as I pass an uuid as filename to ftok. S.M. From anjiro at cc.gatech.edu Mon Feb 9 13:03:38 2009 From: anjiro at cc.gatech.edu (Daniel Ashbrook) Date: Mon, 09 Feb 2009 13:03:38 -0500 Subject: [SciPy-user] suppress scientific notation printing for large numbers? Message-ID: <49906FFA.2000909@cc.gatech.edu> So using set_printoptions, I can set suppress=True, and suppress printing of tiny numbers using scientific notation. However, it doesn't do anything with respect to large numbers - is there any way to force large numbers in arrays to be printed as they are? Thanks, dan From sturla at molden.no Mon Feb 9 13:24:25 2009 From: sturla at molden.no (Sturla Molden) Date: Mon, 09 Feb 2009 19:24:25 +0100 Subject: [SciPy-user] shared memory machines In-Reply-To: <49901181.3040801@molden.no> References: <3d375d730902051523q179b552bg11218a0b22a161b1@mail.gmail.com> <20090205234115.GC29684@phare.normalesup.org> <6ce0ac130902051634i1408fddeiffd54c14f793f688@mail.gmail.com> <498C53F9.3070708@molden.no> <577F2D0E-D3B9-4400-912B-3EB6EFD5F48C@semanchuk.com> <498C5969.1040809@molden.no> <498CAA2D.10102@molden.no> <20090209000046.GD12866@phare.normalesup.org> <56B03E36-8CA9-4F34-B0C4-C38C833C8ACD@semanchuk.com> <20090209061511.GB26350@phare.normalesup.org> <20090209082344.GA635@phare.normalesup.org> <49901181.3040801@molden.no> Message-ID: <499074D9.6060401@molden.no> On 2/9/2009 12:20 PM, Sturla Molden wrote: > I have been working on that. Basically using what Robert Kern posted a > while ago and ripping out some code from multiprocessing's heap object. Here is a first draft. The ftok issue is not fixed. I am not sure if Robert Kern's use of copy_reg affects ndarrays in general, or just the ones we create here. There are probably a few bugs to kill. And we need a setup script. S.M. -------------- next part -------------- A non-text attachment was scrubbed... Name: sharedmem.zip Type: application/x-zip-compressed Size: 7088 bytes Desc: not available URL: From robert.kern at gmail.com Mon Feb 9 13:26:19 2009 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 9 Feb 2009 12:26:19 -0600 Subject: [SciPy-user] suppress scientific notation printing for large numbers? In-Reply-To: <49906FFA.2000909@cc.gatech.edu> References: <49906FFA.2000909@cc.gatech.edu> Message-ID: <3d375d730902091026t51b60bf2jdcae6d5f9db03c94@mail.gmail.com> On Mon, Feb 9, 2009 at 12:03, Daniel Ashbrook wrote: > So using set_printoptions, I can set suppress=True, and suppress > printing of tiny numbers using scientific notation. However, it doesn't > do anything with respect to large numbers - is there any way to force > large numbers in arrays to be printed as they are? No. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." 
-- Umberto Eco From robert.kern at gmail.com Mon Feb 9 13:28:59 2009 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 9 Feb 2009 12:28:59 -0600 Subject: [SciPy-user] shared memory machines In-Reply-To: <499074D9.6060401@molden.no> References: <3d375d730902051523q179b552bg11218a0b22a161b1@mail.gmail.com> <577F2D0E-D3B9-4400-912B-3EB6EFD5F48C@semanchuk.com> <498C5969.1040809@molden.no> <498CAA2D.10102@molden.no> <20090209000046.GD12866@phare.normalesup.org> <56B03E36-8CA9-4F34-B0C4-C38C833C8ACD@semanchuk.com> <20090209061511.GB26350@phare.normalesup.org> <20090209082344.GA635@phare.normalesup.org> <49901181.3040801@molden.no> <499074D9.6060401@molden.no> Message-ID: <3d375d730902091028u7f00aa35i84699372c07ed94b@mail.gmail.com> On Mon, Feb 9, 2009 at 12:24, Sturla Molden wrote: > On 2/9/2009 12:20 PM, Sturla Molden wrote: > >> I have been working on that. Basically using what Robert Kern posted a >> while ago and ripping out some code from multiprocessing's heap object. > > Here is a first draft. The ftok issue is not fixed. > > I am not sure if Robert Kern's use of copy_reg affects ndarrays in general, > or just the ones we create here. It affects ndarrays in general. The reduce function should ideally be written to detect whether the ndarray is shared, or is a view eventually leading back to a shared ndarray, or is just a regular ndarray. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From philip at semanchuk.com Mon Feb 9 15:25:57 2009 From: philip at semanchuk.com (Philip Semanchuk) Date: Mon, 9 Feb 2009 15:25:57 -0500 Subject: [SciPy-user] shared memory machines In-Reply-To: <4990706F.7000505@molden.no> References: <6ce0ac130902051634i1408fddeiffd54c14f793f688@mail.gmail.com> <498C53F9.3070708@molden.no> <577F2D0E-D3B9-4400-912B-3EB6EFD5F48C@semanchuk.com> <498C5969.1040809@molden.no> <498CAA2D.10102@molden.no> <20090209000046.GD12866@phare.normalesup.org> <56B03E36-8CA9-4F34-B0C4-C38C833C8ACD@semanchuk.com> <20090209061511.GB26350@phare.normalesup.org> <20090209082344.GA635@phare.normalesup.org> <4990137B.6030802@molden.no> <20090209113835.GD32331@phare.normalesup.org> <49901A08.9020104@molden.no> <49905CFC.9010507@molden.no> <450D80F1-8023-43F0-A02C-28A87919F775@semanchuk.com> <4990706F.7000505@molden.no> Message-ID: <98CBDDB9-24E0-4769-A526-B58EACC89498@semanchuk.com> On Feb 9, 2009, at 1:05 PM, Sturla Molden wrote: > On 2/9/2009 6:49 PM, Philip Semanchuk wrote: > >> If you're destroying the segment when the attach count drops to zero, >> why not check that immediately after the call to shmdt()? > > I thought it was only the owner/creator that was allowed to do that? Yes, sort of. Sys V IPC objects are owned by users, not processes, so if user foo creates a semaphore in one process and destroys it in another, that's OK. I just verified this on OS X. Unless portions of your SciPy application will be running under different users, I think the matter of which process destroys an IPC object is irrelevant. >> ftok() should probably be avoided as it returns duplicate keys: >> http://nikitathespider.com/python/shm/#ftok > > Oh :( > > In that case I could rewrite the object to pickle the shmid instead > of a > random name (uuid string) on System V. But you need the key, not the id, to pass to shmget() to get a handle to an existing IPC object. >> I'd recommend using a random number generator instead. 
I believe a >> key_t is guaranteed to fit into an int, so you could generate a >> random >> number anywhere from 1 to INT_MAX, taking care not to step on the >> value IPC_PRIVATE (unless you want to assume that that is always >> #defined to 0). > > I am not sure how big the problem is, as I pass an uuid as filename > to ftok. I'm not sure how big the problem is either. All I know is that in my experience, ftok() returned the same key for different files in the same directory. I realized, therefore, that my code needed to handle the case where ftok() didn't generate a useful key. Since I needed a second, more reliable method of key generation, why use ftok() at all? If I were you, rather than trying to figure out how broken ftok() is (and it might be broken in different ways on different platforms), I'd just abandon it altogether. It's not as if generating a random number instead is difficult. In fact, it's easier. Instead of generating a random uuid and passing that to ftok(), eliminate the middleman and generate a random key yourself. From sturla at molden.no Mon Feb 9 16:41:59 2009 From: sturla at molden.no (Sturla Molden) Date: Mon, 9 Feb 2009 22:41:59 +0100 (CET) Subject: [SciPy-user] shared memory machines In-Reply-To: <98CBDDB9-24E0-4769-A526-B58EACC89498@semanchuk.com> References: <6ce0ac130902051634i1408fddeiffd54c14f793f688@mail.gmail.com> <498C53F9.3070708@molden.no> <577F2D0E-D3B9-4400-912B-3EB6EFD5F48C@semanchuk.com> <498C5969.1040809@molden.no> <498CAA2D.10102@molden.no> <20090209000046.GD12866@phare.normalesup.org> <56B03E36-8CA9-4F34-B0C4-C38C833C8ACD@semanchuk.com> <20090209061511.GB26350@phare.normalesup.org> <20090209082344.GA635@phare.normalesup.org> <4990137B.6030802@molden.no> <20090209113835.GD32331@phare.normalesup.org> <49901A08.9020104@molden.no> <49905CFC.9010507@molden.no> <450D80F1-8023-43F0-A02C-28A87919F775@semanchuk.com> <4990706F.7000505@molden.no> <98CBDDB9-24E0-4769-A526-B58EACC89498@semanchuk.com> Message-ID: > On Feb 9, 2009, at 1:05 PM, Sturla Molden wrote: > Yes, sort of. Sys V IPC objects are owned by users, not processes, so > if user foo creates a semaphore in one process and destroys it in > another, that's OK. I just verified this on OS X. Unless portions of > your SciPy application will be running under different users, I think > the matter of which process destroys an IPC object is irrelevant. My programs will certainly not do that (but I use Windows anyway). I have one process that spawns workers, and they run with the same user. I think that covers 99% of all use for this extension. But others may have a more complex design, so the safest method is to let the creator kill the segment. But it comes at the expense of a thread. Otherwise I could just assume the user is the same and do the check after shmdt. As I use Windows I have no personal preference here. Not using a thread avoids some of the ctrl-c issue, and it will be a bit faster. But then all processes sharing memory must run with the same user, otherwise there will be leaks and havoc. >> In that case I could rewrite the object to pickle the shmid instead >> of a >> random name (uuid string) on System V. > > But you need the key, not the id, to pass to shmget() to get a handle > to an existing IPC object. Yes, but if we use the shmid, we have the handle. So we can pass that integer from one process to another. At least that is what my old book on Linux programming describes. Since you say I need to use shmget, can I assume this method is not valid on all Unix incarnations? 
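The random-key scheme suggested above comes down to a loop like this one,
where sysv_shmget stands in for a wrapper around
shmget(key, size, IPC_CREAT | IPC_EXCL | 0600) that raises OSError on failure;
that wrapper does not exist yet and is only assumed here:

import errno
import random

INT_MAX = 2 ** 31 - 1                       # a key_t is assumed to fit in an int

def new_segment(size):
    while True:
        key = random.randint(1, INT_MAX)    # 0 == IPC_PRIVATE is never drawn
        try:
            shmid = sysv_shmget(key, size)  # placeholder wrapper, see above
        except OSError as e:
            if e.errno == errno.EEXIST:
                continue                    # key collision, draw another one
            raise                           # a real error, not a collision
        return key, shmid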
> If I were you, rather than trying to figure out how broken ftok() is > (and it might be broken in different ways on different platforms), I'd > just abandon it altogether. It's not as if generating a random number > instead is difficult. In fact, it's easier. Instead of generating a > random uuid and passing that to ftok(), eliminate the middleman and > generate a random key yourself. It has to be a unique key for the system, not just a random number. So I could try to call shmget multiple times with IPC_EXCL until it succeeds. Then I'll have to check why it failed as well. This is the first time I have found Windows to be the less annoying system. Thanks for your help. :-) Sturla Molden From sturla at molden.no Mon Feb 9 18:50:44 2009 From: sturla at molden.no (Sturla Molden) Date: Tue, 10 Feb 2009 00:50:44 +0100 (CET) Subject: [SciPy-user] shared memory machines In-Reply-To: References: <6ce0ac130902051634i1408fddeiffd54c14f793f688@mail.gmail.com> <498C53F9.3070708@molden.no> <577F2D0E-D3B9-4400-912B-3EB6EFD5F48C@semanchuk.com> <498C5969.1040809@molden.no> <498CAA2D.10102@molden.no> <20090209000046.GD12866@phare.normalesup.org> <56B03E36-8CA9-4F34-B0C4-C38C833C8ACD@semanchuk.com> <20090209061511.GB26350@phare.normalesup.org> <20090209082344.GA635@phare.normalesup.org> <4990137B.6030802@molden.no> <20090209113835.GD32331@phare.normalesup.org> <49901A08.9020104@molden.no> <49905CFC.9010507@molden.no> <450D80F1-8023-43F0-A02C-28A87919F775@semanchuk.com> <4990706F.7000505@molden.no> <98CBDDB9-24E0-4769-A526-B58EACC89498@semanchuk.com> Message-ID: <038c02f4374caf7e49b01c0203e8a6bd.squirrel@webmail.uio.no> Just a small update: I have removed ftok and done what Philip Semanchuk suggested. The System V version now uses numpy's random integer generator to create a key (and if it fails, checks errno for EEXIST). Clean up can be done using threads or without threads; keep your paws off os.setuid if you set gc_thread to False. S.M. -------------- next part -------------- A non-text attachment was scrubbed... Name: sharedmemory_sysv.pyx Type: / Size: 7996 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: sharedmemory_win.pyx Type: / Size: 5418 bytes Desc: not available URL: From matthew.brett at gmail.com Mon Feb 9 19:31:29 2009 From: matthew.brett at gmail.com (Matthew Brett) Date: Mon, 9 Feb 2009 16:31:29 -0800 Subject: [SciPy-user] scipy.org Message-ID: <1e2af89e0902091631s777226f7k17eedcee1657716c@mail.gmail.com> Hi, Could scipy.org be down again? Best, Matthew From robert.kern at gmail.com Mon Feb 9 19:37:57 2009 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 9 Feb 2009 18:37:57 -0600 Subject: [SciPy-user] scipy.org In-Reply-To: <1e2af89e0902091631s777226f7k17eedcee1657716c@mail.gmail.com> References: <1e2af89e0902091631s777226f7k17eedcee1657716c@mail.gmail.com> Message-ID: <3d375d730902091637s55ce128coc963c0bce36c8124@mail.gmail.com> On Mon, Feb 9, 2009 at 18:31, Matthew Brett wrote: > Hi, > > Could scipy.org be down again? It was. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." 
-- Umberto Eco From sturla at molden.no Mon Feb 9 20:23:09 2009 From: sturla at molden.no (Sturla Molden) Date: Tue, 10 Feb 2009 02:23:09 +0100 (CET) Subject: [SciPy-user] shared memory machines Message-ID: <51ab5afe7eb746fa60db99ef7635df9b.squirrel@webmail.uio.no> Ok, the work is basically done :) What remains is testing/debugging and a setup script. Perhaps we should move this debate to scipy-dev? I feel like I am spamming this list... Regards, Sturla Molden -------------- next part -------------- A non-text attachment was scrubbed... Name: sharedmem.zip Type: application/x-zip-compressed Size: 7117 bytes Desc: not available URL: From robert.kern at gmail.com Mon Feb 9 20:27:33 2009 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 9 Feb 2009 19:27:33 -0600 Subject: [SciPy-user] shared memory machines In-Reply-To: <51ab5afe7eb746fa60db99ef7635df9b.squirrel@webmail.uio.no> References: <51ab5afe7eb746fa60db99ef7635df9b.squirrel@webmail.uio.no> Message-ID: <3d375d730902091727x4e3c9760m6c164025d9f0aa36@mail.gmail.com> On Mon, Feb 9, 2009 at 19:23, Sturla Molden wrote: > Ok, the work is basically done :) > > What remains is testing/debugging and a setup script. > > Perhaps we should move this debate to scipy-dev? I feel like I am spamming > this list... Whatever. It would be nice, though, if you hosted the files somewhere, perhaps under source control, instead of passing them around in attachments. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From philip at semanchuk.com Tue Feb 10 00:05:42 2009 From: philip at semanchuk.com (Philip Semanchuk) Date: Tue, 10 Feb 2009 00:05:42 -0500 Subject: [SciPy-user] shared memory machines In-Reply-To: References: <6ce0ac130902051634i1408fddeiffd54c14f793f688@mail.gmail.com> <498C53F9.3070708@molden.no> <577F2D0E-D3B9-4400-912B-3EB6EFD5F48C@semanchuk.com> <498C5969.1040809@molden.no> <498CAA2D.10102@molden.no> <20090209000046.GD12866@phare.normalesup.org> <56B03E36-8CA9-4F34-B0C4-C38C833C8ACD@semanchuk.com> <20090209061511.GB26350@phare.normalesup.org> <20090209082344.GA635@phare.normalesup.org> <4990137B.6030802@molden.no> <20090209113835.GD32331@phare.normalesup.org> <49901A08.9020104@molden.no> <49905CFC.9010507@molden.no> <450D80F1-8023-43F0-A02C-28A87919F775@semanchuk.com> <4990706F.7000505@molden.no> <98CBDDB9-24E0-4769-A526-B58EACC89498@semanchuk.com> Message-ID: <92829ED2-44D4-497E-9AEF-2B49283778D7@semanchuk.com> On Feb 9, 2009, at 4:41 PM, Sturla Molden wrote: >>> In that case I could rewrite the object to pickle the shmid instead >>> of a >>> random name (uuid string) on System V. >> >> But you need the key, not the id, to pass to shmget() to get a handle >> to an existing IPC object. > > Yes, but if we use the shmid, we have the handle. So we can pass that > integer from one process to another. At least that is what my old > book on > Linux programming describes. Since you say I need to use shmget, can I > assume this method is not valid on all Unix incarnations? No, sorry, ignore what I said. I was not thinking clearly. >> If I were you, rather than trying to figure out how broken ftok() is >> (and it might be broken in different ways on different platforms), >> I'd >> just abandon it altogether. It's not as if generating a random number >> instead is difficult. In fact, it's easier. 
Instead of generating a >> random uuid and passing that to ftok(), eliminate the middleman and >> generate a random key yourself. > > It has to be a unique key for the system, not just a random number. > So I > could try to call shmget multiple times with IPC_EXCL until it > succeeds. > Then I'll have to check why it failed as well. Exactly. It's a pain in the arse but it is what must be done. > This is the first time I have found Windows to be the less annoying > system. Indeed, that's a rare occurrence. The POSIX API is better than Sys V in that there's a much larger "key" space, so large that collisions between randomly generated ids are statistically...hmmm, maybe I should watch my mouth on a statistically-oriented mailing list. =) Under POSIX, an IPC object's name is a string that starts with a slash. Under FreeBSD (the most restrictive API I've encountered), the name is limited to 14 filename-permissible characters. Subtracting the leading slash, that's space for 13 alphanumeric characters plus underscore and dot. Filenames permit 52 upper and lowercase characters plus 10 digits plus 2 punctuation characters = 64 characters. 64**13 is a lot more choices than INT_MAX. So OK, I stand by my statement. How does the Windows API resolve name/key collisions? > Thanks for your help. :-) Glad to be able to provide it. (Ni ?r v?lkomna.) Cheers Philip From gael.varoquaux at normalesup.org Tue Feb 10 01:09:38 2009 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Tue, 10 Feb 2009 07:09:38 +0100 Subject: [SciPy-user] shared memory machines In-Reply-To: <49906B67.5070109@molden.no> References: <498C5969.1040809@molden.no> <498CAA2D.10102@molden.no> <20090209000046.GD12866@phare.normalesup.org> <56B03E36-8CA9-4F34-B0C4-C38C833C8ACD@semanchuk.com> <20090209061511.GB26350@phare.normalesup.org> <20090209082344.GA635@phare.normalesup.org> <64DD8CB4-3089-4695-91E7-F98907954E13@semanchuk.com> <3d375d730902090824y39a69cc0i8551d94df4be39b7@mail.gmail.com> <4CF29518-2439-4938-A85B-E2B4DEE68D57@semanchuk.com> <49906B67.5070109@molden.no> Message-ID: <20090210060938.GB4170@phare.normalesup.org> On Mon, Feb 09, 2009 at 06:44:07PM +0100, Sturla Molden wrote: > On 2/9/2009 5:48 PM, Philip Semanchuk wrote: > > I'm not sure how prevalent the getpagesize() API is. You might want to > > consider using the following code (from Python's mmapmodule.c) to get > > the page size. > I think we can just use mmap.PAGESIZE :) Good point :). I was using getpagesize, from unistd.h. Ga?l From jbh at broad.mit.edu Tue Feb 10 07:19:25 2009 From: jbh at broad.mit.edu (John Hanks) Date: Tue, 10 Feb 2009 07:19:25 -0500 Subject: [SciPy-user] Problem with scipy.linalg and LAPACK. In-Reply-To: References: Message-ID: I'm trying to build scipy for Python 2.6.1 using gcc-4.3.3. BLAS and LAPACK both build successfully as does ATLAS. numpy and scipy find the libraries and build without any obvious problems. But when I try to use scipy.linalg, I get this error: Python 2.6.1 (r261:67515, Feb 9 2009, 17:41:40) [GCC 4.3.3] on linux2 Type "help", "copyright", "credits" or "license" for more information. 
>>> import scipy.linalg Traceback (most recent call last): File "", line 1, in File "/broad/tools/Linux/x86_64/pkgs/python_2.6.1/lib/python2.6/site-packages/scipy/linalg/__init__.py", line 8, in from basic import * File "/broad/tools/Linux/x86_64/pkgs/python_2.6.1/lib/python2.6/site-packages/scipy/linalg/basic.py", line 17, in from lapack import get_lapack_funcs File "/broad/tools/Linux/x86_64/pkgs/python_2.6.1/lib/python2.6/site-packages/scipy/linalg/lapack.py", line 17, in from scipy.linalg import flapack ImportError: /broad/tools/Linux/x86_64/pkgs/python_2.6.1/lib/python2.6/site-packages/scipy/linalg/flapack.so: undefined symbol: iladlc_ I've repeated this using variations of every set of install instructions for scipy that I can find with google with the same result each time. Any suggestions about where to look for what I've broken would be appreciated. Thanks, jbh From david at ar.media.kyoto-u.ac.jp Tue Feb 10 07:16:41 2009 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Tue, 10 Feb 2009 21:16:41 +0900 Subject: [SciPy-user] Problem with scipy.linalg and LAPACK. In-Reply-To: References: Message-ID: <49917029.3060203@ar.media.kyoto-u.ac.jp> John Hanks wrote: > I've repeated this using variations of every set of install > instructions for scipy that I can find with google with the same > result each time. Any suggestions about where to look for what I've > broken would be appreciated. > Lapack 3.2 is not supported - please use 3.1.1 or below, David From sturla at molden.no Tue Feb 10 08:01:28 2009 From: sturla at molden.no (Sturla Molden) Date: Tue, 10 Feb 2009 14:01:28 +0100 Subject: [SciPy-user] shared memory machines In-Reply-To: <92829ED2-44D4-497E-9AEF-2B49283778D7@semanchuk.com> References: <6ce0ac130902051634i1408fddeiffd54c14f793f688@mail.gmail.com> <498C53F9.3070708@molden.no> <577F2D0E-D3B9-4400-912B-3EB6EFD5F48C@semanchuk.com> <498C5969.1040809@molden.no> <498CAA2D.10102@molden.no> <20090209000046.GD12866@phare.normalesup.org> <56B03E36-8CA9-4F34-B0C4-C38C833C8ACD@semanchuk.com> <20090209061511.GB26350@phare.normalesup.org> <20090209082344.GA635@phare.normalesup.org> <4990137B.6030802@molden.no> <20090209113835.GD32331@phare.normalesup.org> <49901A08.9020104@molden.no> <49905CFC.9010507@molden.no> <450D80F1-8023-43F0-A02C-28A87919F775@semanchuk.com> <4990706F.7000505@molden.no> <98CBDDB9-24E0-4769-A526-B58EACC89498@semanchuk.com> <92829ED2-44D4-497E-9AEF-2B49283778D7@semanchuk.com> Message-ID: <49917AA8.4020603@molden.no> On 2/10/2009 6:05 AM, Philip Semanchuk wrote: > How does the Windows API resolve name/key collisions? Right. On Windows a name is a string, and an UUIDs should be unique to the system. If a name exists, CreateFileMapping fails and GetLastError returns ERROR_INVALID_HANDLE. If the object alredy exist in the process, CreateFileMapping returns a valid handle but GetLastError returns ERROR_ALREADY_EXISTS. I'll put som some tests for that to be pedantic, albeit UUIDs should be unique. 
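From Python that check can be sketched with ctypes roughly as below; this is
only a sketch, and everything other than the ERROR_ALREADY_EXISTS case is left
to WinError:

import ctypes
from ctypes import wintypes

kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)
kernel32.CreateFileMappingW.restype = wintypes.HANDLE
kernel32.CreateFileMappingW.argtypes = [wintypes.HANDLE, ctypes.c_void_p,
    wintypes.DWORD, wintypes.DWORD, wintypes.DWORD, wintypes.LPCWSTR]

INVALID_HANDLE_VALUE = wintypes.HANDLE(-1).value
PAGE_READWRITE = 0x04
ERROR_ALREADY_EXISTS = 183

def create_named_mapping(name, nbytes):
    # Pagefile-backed mapping; the second return value says whether the name
    # was already in use (GetLastError() == ERROR_ALREADY_EXISTS).
    handle = kernel32.CreateFileMappingW(INVALID_HANDLE_VALUE, None,
                                         PAGE_READWRITE, 0, nbytes, name)
    err = ctypes.get_last_error()
    if not handle:
        raise ctypes.WinError(err)
    return handle, err == ERROR_ALREADY_EXISTS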
Sturla Molden From sturla at molden.no Tue Feb 10 09:24:03 2009 From: sturla at molden.no (Sturla Molden) Date: Tue, 10 Feb 2009 15:24:03 +0100 Subject: [SciPy-user] shared memory machines In-Reply-To: <49917AA8.4020603@molden.no> References: <6ce0ac130902051634i1408fddeiffd54c14f793f688@mail.gmail.com> <498C53F9.3070708@molden.no> <577F2D0E-D3B9-4400-912B-3EB6EFD5F48C@semanchuk.com> <498C5969.1040809@molden.no> <498CAA2D.10102@molden.no> <20090209000046.GD12866@phare.normalesup.org> <56B03E36-8CA9-4F34-B0C4-C38C833C8ACD@semanchuk.com> <20090209061511.GB26350@phare.normalesup.org> <20090209082344.GA635@phare.normalesup.org> <4990137B.6030802@molden.no> <20090209113835.GD32331@phare.normalesup.org> <49901A08.9020104@molden.no> <49905CFC.9010507@molden.no> <450D80F1-8023-43F0-A02C-28A87919F775@semanchuk.com> <4990706F.7000505@molden.no> <98CBDDB9-24E0-4769-A526-B58EACC89498@semanchuk.com> <92829ED2-44D4-497E-9AEF-2B49283778D7@semanchuk.com> <49917AA8.4020603@molden.no> Message-ID: <49918E03.5090404@molden.no> The Windows version seems to be working correctly on my computers. I will put updates here: http://folk.uio.no/sturlamo/python/sharedmem.zip Testing (particularly on Unix and friends) is appreciated. If you have comments, encounter bugs, or have corrections, please send them to my email. usage: import sharedmem as shm Now shm.zeros, shm.ones, and shm.empty should work like their numpy equivalents. They are pickled and depickled by name (hidden from sight), meaning only metadata is stored in the pickle. Call .copy() if you need the pickle to contain a copy of the data as well. Unlike multiprocessing.Array, these shared memory arrays can be sent through a multiprocessing.Queue, tcp, or any other IPC you may think of. That's it. I am done spamming this list with this for now. Regards, Sturla Molden From jbh at broad.mit.edu Tue Feb 10 09:30:39 2009 From: jbh at broad.mit.edu (John Hanks) Date: Tue, 10 Feb 2009 09:30:39 -0500 Subject: [SciPy-user] Problem with scipy.linalg and LAPACK. In-Reply-To: <49917029.3060203@ar.media.kyoto-u.ac.jp> References: <49917029.3060203@ar.media.kyoto-u.ac.jp> Message-ID: On Tue, Feb 10, 2009 at 7:16 AM, David Cournapeau wrote: > John Hanks wrote: >> I've repeated this using variations of every set of install >> instructions for scipy that I can find with google with the same >> result each time. Any suggestions about where to look for what I've >> broken would be appreciated. >> > > Lapack 3.2 is not supported - please use 3.1.1 or below, > I went back and rebuilt everything from LAPACK 3.1.1 and forward and still get the same error. 
Here's my compiler settings: ATLAS (from Make.inc) ICC = gcc ICCFLAGS = $(CDEFS) -fomit-frame-pointer -mfpmath=387 -O2 -falign-loops=4 -fPIC -m64 SMC = gcc SMCFLAGS = -fomit-frame-pointer -mfpmath=387 -O2 -falign-loops=4 -fPIC -m64 DMC = gcc DMCFLAGS = -fomit-frame-pointer -mfpmath=387 -O2 -falign-loops=4 -fPIC -m64 SKC = gcc SKCFLAGS = -fomit-frame-pointer -mfpmath=387 -O2 -falign-loops=4 -fPIC -m64 DKC = gcc DKCFLAGS = -fomit-frame-pointer -mfpmath=387 -O2 -falign-loops=4 -fPIC -m64 XCC = gcc XCCFLAGS = $(CDEFS) -O -fomit-frame-pointer -fPIC -m64 F77 = gfortran F77FLAGS = -fomit-frame-pointer -mfpmath=387 -O2 -falign-loops=4 -fPIC -m64 SMAFLAGS = -fno-tree-loop-optimize DMAFLAGS = -fno-tree-loop-optimize LAPACK (from make.inc) FORTRAN = gfortran OPTS = -O2 -fPIC -m64 DRVOPTS = $(OPTS) NOOPT = -O0 -fPIC -m64 LOADER = gfortran LOADOPTS = The numpy and scipy setup finds gfortran and produces a mostly functional scipy install except for the undefined symbol error. Thanks, jbh From david at ar.media.kyoto-u.ac.jp Tue Feb 10 09:20:01 2009 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Tue, 10 Feb 2009 23:20:01 +0900 Subject: [SciPy-user] Problem with scipy.linalg and LAPACK. In-Reply-To: References: <49917029.3060203@ar.media.kyoto-u.ac.jp> Message-ID: <49918D11.6090702@ar.media.kyoto-u.ac.jp> John Hanks wrote: > > I went back and rebuilt everything from LAPACK 3.1.1 and forward and > still get the same error. > If you get the same error, you forgot to rebuild something, or there is a leftover. The ILADLC function is specific to Lapack 3.2 AFAIK. If possible, you should really use lapack as packaged by your distribution, it will be much easier cheers, David From michael.abshoff at googlemail.com Tue Feb 10 09:53:06 2009 From: michael.abshoff at googlemail.com (Michael Abshoff) Date: Tue, 10 Feb 2009 06:53:06 -0800 Subject: [SciPy-user] Problem with scipy.linalg and LAPACK. In-Reply-To: References: <49917029.3060203@ar.media.kyoto-u.ac.jp> Message-ID: <499194D2.9090704@gmail.com> John Hanks wrote: > On Tue, Feb 10, 2009 at 7:16 AM, David Cournapeau > wrote: >> John Hanks wrote: Hi John, > I went back and rebuilt everything from LAPACK 3.1.1 and forward and > still get the same error. > > Here's my compiler settings: > > ATLAS (from Make.inc) > ICC = gcc > ICCFLAGS = $(CDEFS) -fomit-frame-pointer -mfpmath=387 -O2 > -falign-loops=4 -fPIC -m64 > SMC = gcc > SMCFLAGS = -fomit-frame-pointer -mfpmath=387 -O2 -falign-loops=4 -fPIC -m64 > DMC = gcc > DMCFLAGS = -fomit-frame-pointer -mfpmath=387 -O2 -falign-loops=4 -fPIC -m64 > SKC = gcc > SKCFLAGS = -fomit-frame-pointer -mfpmath=387 -O2 -falign-loops=4 -fPIC -m64 > DKC = gcc > DKCFLAGS = -fomit-frame-pointer -mfpmath=387 -O2 -falign-loops=4 -fPIC -m64 > XCC = gcc > XCCFLAGS = $(CDEFS) -O -fomit-frame-pointer -fPIC -m64 > F77 = gfortran > F77FLAGS = -fomit-frame-pointer -mfpmath=387 -O2 -falign-loops=4 -fPIC -m64 > SMAFLAGS = -fno-tree-loop-optimize > DMAFLAGS = -fno-tree-loop-optimize > > LAPACK (from make.inc) > FORTRAN = gfortran > OPTS = -O2 -fPIC -m64 > DRVOPTS = $(OPTS) > NOOPT = -O0 -fPIC -m64 > LOADER = gfortran > LOADOPTS = > > The numpy and scipy setup finds gfortran and produces a mostly > functional scipy install except for the undefined symbol error. Hi, did you set the CFLAGS or did ATLAS pick them for you? I have seen scipy throw import errors in certain situation when I needed to set some flags for the Fortran compiler to build 64 bit code if it defaulted to a 32 bit target. 
If your toolchain produces 32 bit binaries per default there are a couple files in Scipy 0.6 that are not build cleanly via distutils, but that pick the Fortran compiler like gfortran directly and you end up attempting to link 32 bit object files into a 64 bit lib. This passes at link time, but blows up on import. > Thanks, > > jbh Cheers, Michael > _______________________________________________ > SciPy-user mailing list > SciPy-user at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-user > From cournape at gmail.com Tue Feb 10 10:43:39 2009 From: cournape at gmail.com (David Cournapeau) Date: Wed, 11 Feb 2009 00:43:39 +0900 Subject: [SciPy-user] Problem with scipy.linalg and LAPACK. In-Reply-To: <499194D2.9090704@gmail.com> References: <49917029.3060203@ar.media.kyoto-u.ac.jp> <499194D2.9090704@gmail.com> Message-ID: <5b8d13220902100743j59243327l3481e9e34ece3d0@mail.gmail.com> On Tue, Feb 10, 2009 at 11:53 PM, Michael Abshoff wrote: > If your toolchain produces 32 bit binaries per default > there are a couple files in Scipy 0.6 that are not build cleanly via > distutils, but that pick the Fortran compiler like gfortran directly and > you end up attempting to link 32 bit object files into a 64 bit lib. > This passes at link time, but blows up on import. I don't think that's the problem here: the symbol not found simply does not exist in LAPACK < 3.2. Scipy and LAPACK 3.2 do not work together AFAIK. cheers, David From michael.abshoff at googlemail.com Tue Feb 10 10:55:52 2009 From: michael.abshoff at googlemail.com (Michael Abshoff) Date: Tue, 10 Feb 2009 07:55:52 -0800 Subject: [SciPy-user] Problem with scipy.linalg and LAPACK. In-Reply-To: <5b8d13220902100743j59243327l3481e9e34ece3d0@mail.gmail.com> References: <49917029.3060203@ar.media.kyoto-u.ac.jp> <499194D2.9090704@gmail.com> <5b8d13220902100743j59243327l3481e9e34ece3d0@mail.gmail.com> Message-ID: <4991A388.1080205@gmail.com> David Cournapeau wrote: > On Tue, Feb 10, 2009 at 11:53 PM, Michael Abshoff > wrote: >> If your toolchain produces 32 bit binaries per default >> there are a couple files in Scipy 0.6 that are not build cleanly via >> distutils, but that pick the Fortran compiler like gfortran directly and >> you end up attempting to link 32 bit object files into a 64 bit lib. >> This passes at link time, but blows up on import. > > I don't think that's the problem here: the symbol not found simply > does not exist in LAPACK < 3.2. Scipy and LAPACK 3.2 do not work > together AFAIK. Yes, it was a long short, but given the statement that John retried with Lapack 3.1.1 it seems odd. I don't know if he wiped the build directory and all that, so the problem might be something simple like that. > cheers, > > David Cheers, Michael > _______________________________________________ > SciPy-user mailing list > SciPy-user at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-user > From jbh at broad.mit.edu Tue Feb 10 12:25:50 2009 From: jbh at broad.mit.edu (John Hanks) Date: Tue, 10 Feb 2009 12:25:50 -0500 Subject: [SciPy-user] Problem with scipy.linalg and LAPACK. In-Reply-To: <4991A388.1080205@gmail.com> References: <49917029.3060203@ar.media.kyoto-u.ac.jp> <499194D2.9090704@gmail.com> <5b8d13220902100743j59243327l3481e9e34ece3d0@mail.gmail.com> <4991A388.1080205@gmail.com> Message-ID: On Tue, Feb 10, 2009 at 10:55 AM, Michael Abshoff wrote: > Yes, it was a long short, but given the statement that John retried with > Lapack 3.1.1 it seems odd. 
I don't know if he wiped the build directory > and all that, so the problem might be something simple like that. > Never attribute to software what can be explained by John's incompetence. I removed all traces of the libraries, scipy and numpy source and started over fresh. After rebuilding everything with the 3.1.1 LAPACK instead of 3.2 I now have a working version. It looks like 3.2 got stuck somewhere and wouldn't go away. Thanks for your help, jbh From gael.varoquaux at normalesup.org Tue Feb 10 16:06:06 2009 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Tue, 10 Feb 2009 22:06:06 +0100 Subject: [SciPy-user] shared memory machines In-Reply-To: <51ab5afe7eb746fa60db99ef7635df9b.squirrel@webmail.uio.no> References: <51ab5afe7eb746fa60db99ef7635df9b.squirrel@webmail.uio.no> Message-ID: <20090210210606.GB9128@phare.normalesup.org> On Tue, Feb 10, 2009 at 02:23:09AM +0100, Sturla Molden wrote: > Ok, the work is basically done :) Congratulation, Sturla! I must admit that I am not too much enthousiastic about the thread+polling to do the cleaning up. I don't really understand why it is necessary. If the view of the array is the last to be decref, than 'buf.shm_nattch' should be 0, and as a result the freeing up of the memory can happen in the dealloc. Or did I miss something? Cheers, Ga?l From josegomez at gmx.net Tue Feb 10 16:36:00 2009 From: josegomez at gmx.net (Jose Luis Gomez Dans) Date: Tue, 10 Feb 2009 22:36:00 +0100 Subject: [SciPy-user] Array selection help Message-ID: <20090210213600.123400@gmx.net> Hi! Let's say I have two 2D arrays, arr1 and arr2. The elements of arr1 contain different numbers (such as labels, for example), and the elements of arr2 contain some floating point data (say, height above sea level or something like that). For each unique value in arr1, I want to work out the mean (... sum, std dev, etc) of arr2 for the overlapping region. So far, I have used the following code: #Get all the unique values in arr1 U = numpy.unique ( arr1 ) #Create a dictionary with the unique values as key, and the #locations of elements that have that value in arr1 R = dict (zip ( [U[i] for i in xrange(U.shape[0])], \ [ numpy.nonzero( arr1==U[i]) for i in xrange(U.shape[0]) ] ) ) #Now, calculate the eg mean of arr2 per arr1 "label" M = dict ( zip ( R.keys(), [ numpy.mean(arr2[R[i]]) for i in R.keys() ] ) ) # So I now have a dictionary with the unique values of arr1, and the mean # value of arr2 for those pixels. The code is fast and I was feeling rather smug and pleased with myself about it :) However, when numpy.unique( arr1 ) increases, [ numpy.nonzero( arr1==U[i]) for i in xrange(U.shape[0]) ] starts taking a long time (understandable, there are loads and loads of operations in that loop). At present, I can easily have numpy.unique ( arr1).shape[0] > 10000, so it does take a long time. Apart from looping through different values of arr1, can anyone think of an efficient way of achieving something similar to this? It doesn't have to be a dictionary as the output, an array or something else would do nicely. Thanks! jose -- Remote Sensing Unit | Env. Monitoring and Modelling Group Dept. of Geography | Dept. of Geography University College London | King's College London Gower St, London WC1E 6BT UK | Strand Campus, Strand, London WC2R 2LS UK -- Jetzt 1 Monat kostenlos! 
GMX FreeDSL - Telefonanschluss + DSL f?r nur 17,95 Euro/mtl.!* http://dsl.gmx.de/?ac=OM.AD.PD003K11308T4569a From stefan at sun.ac.za Tue Feb 10 17:14:31 2009 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Wed, 11 Feb 2009 00:14:31 +0200 Subject: [SciPy-user] Array selection help In-Reply-To: <20090210213600.123400@gmx.net> References: <20090210213600.123400@gmx.net> Message-ID: <9457e7c80902101414v56ea9319m7fbc70a13229ef9c@mail.gmail.com> Hi Jose 2009/2/10 Jose Luis Gomez Dans : > Let's say I have two 2D arrays, arr1 and arr2. The elements of arr1 contain > different numbers (such as labels, for example), and the elements of arr2 > contain some floating point data (say, height above sea level or something > like that). For each unique value in arr1, I want to work out the mean (... > sum, std dev, etc) of arr2 for the overlapping region. So far, I have used > the following code: Also take a look at scipy.ndimage, which has functions to calculate means and variances over labeled data. Cheers St?fan From josef.pktd at gmail.com Tue Feb 10 17:17:53 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Tue, 10 Feb 2009 17:17:53 -0500 Subject: [SciPy-user] Array selection help In-Reply-To: <9457e7c80902101414v56ea9319m7fbc70a13229ef9c@mail.gmail.com> References: <20090210213600.123400@gmx.net> <9457e7c80902101414v56ea9319m7fbc70a13229ef9c@mail.gmail.com> Message-ID: <1cd32cbb0902101417s1894df9h4409c2d0529108da@mail.gmail.com> I think this is also similar to a recent thread in scipy-dev. There I used a dict to build the unique indices. I don't know if this is fast for your case, since the use case was when there are only a few unique items. see thread at: http://projects.scipy.org/pipermail/scipy-dev/2009-January/010900.html Josef On 2/10/09, St?fan van der Walt wrote: > Hi Jose > > 2009/2/10 Jose Luis Gomez Dans : >> Let's say I have two 2D arrays, arr1 and arr2. The elements of arr1 >> contain >> different numbers (such as labels, for example), and the elements of arr2 >> contain some floating point data (say, height above sea level or something >> like that). For each unique value in arr1, I want to work out the mean >> (... >> sum, std dev, etc) of arr2 for the overlapping region. So far, I have used >> the following code: > > Also take a look at scipy.ndimage, which has functions to calculate > means and variances over labeled data. > > Cheers > St?fan > _______________________________________________ > SciPy-user mailing list > SciPy-user at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-user > From josegomez at gmx.net Tue Feb 10 17:56:42 2009 From: josegomez at gmx.net (Jose Luis Gomez Dans) Date: Tue, 10 Feb 2009 23:56:42 +0100 Subject: [SciPy-user] Array selection help In-Reply-To: <9457e7c80902101414v56ea9319m7fbc70a13229ef9c@mail.gmail.com> References: <20090210213600.123400@gmx.net> <9457e7c80902101414v56ea9319m7fbc70a13229ef9c@mail.gmail.com> Message-ID: <20090210225642.141110@gmx.net> Hi St?fan, > > like that). For each unique value in arr1, I want to work out the mean > > (...sum, std dev, etc) of arr2 for the overlapping region. So far, > Also take a look at scipy.ndimage, which has functions to calculate > means and variances over labeled data. Oh, this looks very nice! In case someone is looking for this, you need to propose your labelling (using labels), and then use scipy.ndimage.means() and friends with your label definitions. 
Going back to my example, my labels grid is arr1, so I just have to do, eg: for i in numpy.unique ( arr1 ): print i, scipy.ndimage.mean ( arr2, labels=arr1, index=i ) I think that solves it! Thanks! Jose From gael.varoquaux at normalesup.org Tue Feb 10 18:13:56 2009 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Wed, 11 Feb 2009 00:13:56 +0100 Subject: [SciPy-user] shared memory machines In-Reply-To: <51ab5afe7eb746fa60db99ef7635df9b.squirrel@webmail.uio.no> References: <51ab5afe7eb746fa60db99ef7635df9b.squirrel@webmail.uio.no> Message-ID: <20090210231356.GC9128@phare.normalesup.org> On Tue, Feb 10, 2009 at 02:23:09AM +0100, Sturla Molden wrote: > Ok, the work is basically done :) > What remains is testing/debugging and a setup script. I did a setup script, and I had to change a few detail because Cython was unhappy with the names of the modules (I suspect local imports happening instead of absolute ones). I had to add a __weakref__ attribute to the handle, to make it so that it can be weakref'd. Now I am stuck because shared memory allocation is not working. This boils down to the following traceback: Traceback (most recent call last): File "test.py", line 4, in a = shmem.shared_zeros(10) File "ndarray.py", line 135, in shared_zeros arr = shared_empty(shape, dtype, order) File "ndarray.py", line 126, in shared_empty wrapper = heap.BufferWrapper(nbytes) File "array_heap.py", line 168, in __init__ block = BufferWrapper._heap.malloc(size) File "array_heap.py", line 148, in malloc (arena, start, stop) = self._malloc(size) File "array_heap.py", line 70, in _malloc arena = Arena(length) File "array_heap.py", line 37, in __init__ self.buffer = SharedMemoryBuffer(size) File "sharedmemory_sysv.pyx", line 170, in sharedmemory_sysv.SharedMemoryBuffer.__init__ (sharedmemory_sysv.c:1400) raise OSError, "Failed to attach shared memory: permission denied" OSError: Failed to attach shared memory: permission denied Basically this means that the shmat on line 167 of sharedmemory_sysv.pyx is failing. I don't really know why, but I suspect this might be something stupid. I need to go to bed now, and I probably won't have time to look at that at all before thursday evening.
Maybe I will be in luck and someone more clever than me will have time to look at that in the mean time :). Cheers, Ga?l -------------- next part -------------- A non-text attachment was scrubbed... Name: sharedmem.zip Type: application/x-zip-compressed Size: 14704 bytes Desc: not available URL: From philip at semanchuk.com Tue Feb 10 18:23:13 2009 From: philip at semanchuk.com (Philip Semanchuk) Date: Tue, 10 Feb 2009 18:23:13 -0500 Subject: [SciPy-user] shared memory machines In-Reply-To: <20090210231356.GC9128@phare.normalesup.org> References: <51ab5afe7eb746fa60db99ef7635df9b.squirrel@webmail.uio.no> <20090210231356.GC9128@phare.normalesup.org> Message-ID: <047B5950-BB40-494F-9230-2C46BD138E50@semanchuk.com> On Feb 10, 2009, at 6:13 PM, Gael Varoquaux wrote: > Now I am stuck because shared memory allocation is not working. This > boils down to the following traceback: > > Traceback (most recent call last): > File "test.py", line 4, in > a = shmem.shared_zeros(10) > File "ndarray.py", line 135, in shared_zeros > arr = shared_empty(shape, dtype, order) > File "ndarray.py", line 126, in shared_empty > wrapper = heap.BufferWrapper(nbytes) > File "array_heap.py", line 168, in __init__ > block = BufferWrapper._heap.malloc(size) > File "array_heap.py", line 148, in malloc > (arena, start, stop) = self._malloc(size) > File "array_heap.py", line 70, in _malloc > arena = Arena(length) > File "array_heap.py", line 37, in __init__ > self.buffer = SharedMemoryBuffer(size) > File "sharedmemory_sysv.pyx", line 170, in > sharedmemory_sysv.SharedMemoryBuffer.__init__ (sharedmemory_sysv.c: > 1400) > raise OSError, "Failed to attach shared memory: permission denied" > OSError: Failed to attach shared memory: permission denied > > Basically this means that the shmat on line 167 of > sharedmemory_sysv.pyx > is failing. I don't really know why, but I suspect this might be > something stupid. One problem I see is that the call to shmget() specifies no permissions. The third param to shmget() should contain two sets of bitwise params OR-ed together. The first set is IPC_CREAT and IPC_EXCL, the second set is the permissions. So you might want to change line 156 to this: shmid = shmget(key, buf_size, IPC_CREAT | IPC_EXCL | 0600) or this: shmid = shmget(key, buf_size, IPC_CREAT | IPC_EXCL | 0666) http://www.opengroup.org/onlinepubs/009695399/functions/shmget.html From robert.kern at gmail.com Tue Feb 10 18:24:37 2009 From: robert.kern at gmail.com (Robert Kern) Date: Tue, 10 Feb 2009 17:24:37 -0600 Subject: [SciPy-user] shared memory machines In-Reply-To: <20090210231356.GC9128@phare.normalesup.org> References: <51ab5afe7eb746fa60db99ef7635df9b.squirrel@webmail.uio.no> <20090210231356.GC9128@phare.normalesup.org> Message-ID: <3d375d730902101524h2b96961bs93106ba7898556d1@mail.gmail.com> On Tue, Feb 10, 2009 at 17:13, Gael Varoquaux wrote: > On Tue, Feb 10, 2009 at 02:23:09AM +0100, Sturla Molden wrote: >> Ok, the work is basically done :) > >> What remains is testing/debugging and a setup script. > > I did a setup script, and I had to change a few detail because Cython was > unhappy with the names of the modules (I suspect local imports happening > instead of absolute ones). > > I had to add a __weakref__ attribute to the handle, to make it so that it > can be weakref'd. > > Now I am stuck because shared memory allocation is not working. 
This > boils down to the following traceback: > > Traceback (most recent call last): > File "test.py", line 4, in > a = shmem.shared_zeros(10) > File "ndarray.py", line 135, in shared_zeros > arr = shared_empty(shape, dtype, order) > File "ndarray.py", line 126, in shared_empty > wrapper = heap.BufferWrapper(nbytes) > File "array_heap.py", line 168, in __init__ > block = BufferWrapper._heap.malloc(size) > File "array_heap.py", line 148, in malloc > (arena, start, stop) = self._malloc(size) > File "array_heap.py", line 70, in _malloc > arena = Arena(length) > File "array_heap.py", line 37, in __init__ > self.buffer = SharedMemoryBuffer(size) > File "sharedmemory_sysv.pyx", line 170, in > sharedmemory_sysv.SharedMemoryBuffer.__init__ (sharedmemory_sysv.c:1400) > raise OSError, "Failed to attach shared memory: permission denied" > OSError: Failed to attach shared memory: permission denied I believe that was the error I kept running into when I was futzing around with this. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From stefan at sun.ac.za Wed Feb 11 02:39:21 2009 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Wed, 11 Feb 2009 09:39:21 +0200 Subject: [SciPy-user] Array selection help In-Reply-To: <20090210225642.141110@gmx.net> References: <20090210213600.123400@gmx.net> <9457e7c80902101414v56ea9319m7fbc70a13229ef9c@mail.gmail.com> <20090210225642.141110@gmx.net> Message-ID: <9457e7c80902102339wea512bcj773811ff524a2828@mail.gmail.com> 2009/2/11 Jose Luis Gomez Dans : > Going back to my example, my labels grid is arr1, so I just have to do, eg: > for i in numpy.unique ( arr1 ): > print i, scipy.ndimage.mean ( arr2, labels=arr1, index=i ) Ndimage can also do the for loop: scipy.ndimage.mean(arr2, labels=arr1, index=np.unique(arr1)) St?fan From millman at berkeley.edu Wed Feb 11 03:26:56 2009 From: millman at berkeley.edu (Jarrod Millman) Date: Wed, 11 Feb 2009 00:26:56 -0800 Subject: [SciPy-user] ANN: SciPy 0.7.0 Message-ID: I'm pleased to announce SciPy 0.7.0. SciPy is a package of tools for science and engineering for Python. It includes modules for statistics, optimization, integration, linear algebra, Fourier transforms, signal and image processing, ODE solvers, and more. This release comes sixteen months after the 0.6.0 release and contains many new features, numerous bug-fixes, improved test coverage, and better documentation. Please note that SciPy 0.7.0 requires Python 2.4 or greater (but not Python 3) and NumPy 1.2.0 or greater. For information, please see the release notes: https://sourceforge.net/project/shownotes.php?release_id=660191&group_id=27747 You can download the release from here: https://sourceforge.net/project/showfiles.php?group_id=27747&package_id=19531&release_id=660191 Thank you to everybody who contributed to this release. 
Enjoy, Jarrod Millman From gael.varoquaux at normalesup.org Wed Feb 11 03:35:04 2009 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Wed, 11 Feb 2009 09:35:04 +0100 Subject: [SciPy-user] shared memory machines In-Reply-To: <047B5950-BB40-494F-9230-2C46BD138E50@semanchuk.com> References: <51ab5afe7eb746fa60db99ef7635df9b.squirrel@webmail.uio.no> <20090210231356.GC9128@phare.normalesup.org> <047B5950-BB40-494F-9230-2C46BD138E50@semanchuk.com> Message-ID: <20090211083504.GA13047@phare.normalesup.org> On Tue, Feb 10, 2009 at 06:23:13PM -0500, Philip Semanchuk wrote: > One problem I see is that the call to shmget() specifies no > permissions. The third param to shmget() should contain two sets of > bitwise params OR-ed together. The first set is IPC_CREAT and > IPC_EXCL, the second set is the permissions. So you might want to > change line 156 to this: > shmid = shmget(key, buf_size, IPC_CREAT | IPC_EXCL | 0600) > or this: > shmid = shmget(key, buf_size, IPC_CREAT | IPC_EXCL | 0666) Indeed, Philip, that was it. Thanks a lot for your help. Ga?l From gael.varoquaux at normalesup.org Wed Feb 11 06:46:20 2009 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Wed, 11 Feb 2009 12:46:20 +0100 Subject: [SciPy-user] shared memory machines In-Reply-To: <20090210231356.GC9128@phare.normalesup.org> References: <51ab5afe7eb746fa60db99ef7635df9b.squirrel@webmail.uio.no> <20090210231356.GC9128@phare.normalesup.org> Message-ID: <20090211114620.GB19956@phare.normalesup.org> I shouldn't be working on that, but this is way more fun :). So I found a few more simple errors in the code and fixed them (code attached). The garbage collector thread lock multiprocessing. I am not sure why. I disabled it just to see what would happen. I added a few print statement to try and debug eventual memory leaks. I think I have a memory leak, judging from the different prints. I am not sure though, and I wonder if there is a good way of checking this, other than running the test code in a big loop and checking if the test box eventually dies. Valgrind does report some 'possibly lost' blocks that increase with the size of the array. Anybody has a suggestion on how to debug this? Cheers, Ga?l -------------- next part -------------- A non-text attachment was scrubbed... Name: sharedmem.zip Type: application/x-zip-compressed Size: 11588 bytes Desc: not available URL: From sturla at molden.no Wed Feb 11 07:04:59 2009 From: sturla at molden.no (Sturla Molden) Date: Wed, 11 Feb 2009 13:04:59 +0100 Subject: [SciPy-user] shared memory machines In-Reply-To: <20090210231356.GC9128@phare.normalesup.org> References: <51ab5afe7eb746fa60db99ef7635df9b.squirrel@webmail.uio.no> <20090210231356.GC9128@phare.normalesup.org> Message-ID: <4992BEEB.4030507@molden.no> On 2/11/2009 12:13 AM, Gael Varoquaux wrote: > I did a setup script, and I had to change a few detail because Cython was > unhappy with the names of the modules (I suspect local imports happening > instead of absolute ones). > > I had to add a __weakref__ attribute to the handle, to make it so that it > can be weakref'd. Thanks Gael I've noticed you used the version I posted to the list, and not the latest on the web. So there is a lot of debugging you missed. I'll do a quick merge of what you posted with mine. I inherited via a Python class to allow a weakref to a Handle. Your solution is cleaner. As for the clean-up thread: A shared segment has an owner on Linux. Only the owner or superuser can mark it for deletion. 
Someone else but the owner may be the last to detach, and then marking for deletion will fail. I think we should remove the thread and raise an exception if marking for deletion fails. We cannot completely foolproof the clean-up against use of os.setuid anyway. Sturla Molden From sturla at molden.no Wed Feb 11 07:07:49 2009 From: sturla at molden.no (Sturla Molden) Date: Wed, 11 Feb 2009 13:07:49 +0100 Subject: [SciPy-user] shared memory machines In-Reply-To: <20090211114620.GB19956@phare.normalesup.org> References: <51ab5afe7eb746fa60db99ef7635df9b.squirrel@webmail.uio.no> <20090210231356.GC9128@phare.normalesup.org> <20090211114620.GB19956@phare.normalesup.org> Message-ID: <4992BF95.6050505@molden.no> On 2/11/2009 12:46 PM, Gael Varoquaux wrote: > I shouldn't be working on that, but this is way more fun :). > > So I found a few more simple errors in the code and fixed them Here is my working Windows version from yesterday: http://folk.uio.no/sturlamo/python/sharedmem.zip Sturla Molden From sturla at molden.no Wed Feb 11 07:31:53 2009 From: sturla at molden.no (Sturla Molden) Date: Wed, 11 Feb 2009 13:31:53 +0100 Subject: [SciPy-user] shared memory machines In-Reply-To: <20090211114620.GB19956@phare.normalesup.org> References: <51ab5afe7eb746fa60db99ef7635df9b.squirrel@webmail.uio.no> <20090210231356.GC9128@phare.normalesup.org> <20090211114620.GB19956@phare.normalesup.org> Message-ID: <4992C539.7020501@molden.no> On 2/11/2009 12:46 PM, Gael Varoquaux wrote: > So I found a few more simple errors in the code and fixed them (code > attached). The garbage collector thread lock multiprocessing. def __dealloc__(SharedMemoryBuffer self): print 'Calling __dealloc__ on buffer at %s' \ % self.mapped_address #DBG self.handle.dealloc() Why do you do this? The Handle should self destruct. Anyway, this is evil and will possibly case multiprocessing to hang, as well as segfaults. Sturla From sturla at molden.no Wed Feb 11 07:41:39 2009 From: sturla at molden.no (Sturla Molden) Date: Wed, 11 Feb 2009 13:41:39 +0100 Subject: [SciPy-user] shared memory machines In-Reply-To: <20090211114620.GB19956@phare.normalesup.org> References: <51ab5afe7eb746fa60db99ef7635df9b.squirrel@webmail.uio.no> <20090210231356.GC9128@phare.normalesup.org> <20090211114620.GB19956@phare.normalesup.org> Message-ID: <4992C783.6060807@molden.no> On 2/11/2009 12:46 PM, Gael Varoquaux wrote: > So I found a few more simple errors in the code and fixed them (code > attached). Gael, I tried to merge your changes with mine. My Python code worked yesterday (albeit not the version you've debugging), so I guess it still does. The gc thread is removed. Sturla -------------- next part -------------- A non-text attachment was scrubbed... Name: sharedmem.zip Type: application/x-zip-compressed Size: 7867 bytes Desc: not available URL: From cournape at gmail.com Wed Feb 11 07:46:05 2009 From: cournape at gmail.com (David Cournapeau) Date: Wed, 11 Feb 2009 21:46:05 +0900 Subject: [SciPy-user] Numpy 1.2.1 and Scipy 0.7.0; Ubuntu packages Message-ID: <5b8d13220902110446x82e25a9ifb11bb563e468313@mail.gmail.com> Hi, I started to set up a PPA for scipy on launchpad, which enables to build ubuntu packages for various distributions/architectures. 
The link is there: https://edge.launchpad.net/~scipy/+archive/ppa So you just need to add one line to your /etc/apt/sources.list, and you will get uptodate numpy and scipy packages, cheers, David From python-ml at nn7.de Wed Feb 11 07:41:12 2009 From: python-ml at nn7.de (Soeren Sonnenburg) Date: Wed, 11 Feb 2009 13:41:12 +0100 Subject: [SciPy-user] sparse matrices again Message-ID: <1234356072.5642.16.camel@localhost> Dear all, is it somehow possible to interface to the C API of scipy's spars matrices? I know numpy does not have sparse matrix support but scipy does (at least it can be used from the python side). If it is not too unstable then I would invest some time to get some swig typemaps to connect to it. Soeren From scott.sinclair.za at gmail.com Wed Feb 11 08:13:10 2009 From: scott.sinclair.za at gmail.com (Scott Sinclair) Date: Wed, 11 Feb 2009 15:13:10 +0200 Subject: [SciPy-user] Numpy 1.2.1 and Scipy 0.7.0; Ubuntu packages In-Reply-To: <5b8d13220902110446x82e25a9ifb11bb563e468313@mail.gmail.com> References: <5b8d13220902110446x82e25a9ifb11bb563e468313@mail.gmail.com> Message-ID: <6a17e9ee0902110513x46905c03y7f6dcd1b93839406@mail.gmail.com> > 2009/2/11 David Cournapeau : > I started to set up a PPA for scipy on launchpad, which enables to > build ubuntu packages for various distributions/architectures. The > link is there: > > https://edge.launchpad.net/~scipy/+archive/ppa > > So you just need to add one line to your /etc/apt/sources.list, and > you will get uptodate numpy and scipy packages, Thanks! Cheers, Scott From faltet at pytables.org Wed Feb 11 08:20:49 2009 From: faltet at pytables.org (Francesc Alted) Date: Wed, 11 Feb 2009 14:20:49 +0100 Subject: [SciPy-user] ANN: Numexpr 1.2 released Message-ID: <200902111420.49579.faltet@pytables.org> ======================== Announcing Numexpr 1.2 ======================== Numexpr is a fast numerical expression evaluator for NumPy. With it, expressions that operate on arrays (like "3*a+4*b") are accelerated and use less memory than doing the same calculation in Python. The main feature added in this version is the support of the Intel VML library (many thanks to Gregor Thalhammer for his nice work on this!). In addition, when the VML support is on, several processors can be used in parallel (see the new `set_vml_num_threads()` function). When the VML support is on, the computation of transcendental functions (like trigonometrical, exponential, logarithmic, hyperbolic, power...) can be accelerated quite a few. Typical speed-ups when using one single core for contiguous arrays are around 3x, with peaks of 7.5x (for the pow() function). When using 2 cores the speed-ups are around 4x and 14x respectively. In case you want to know more in detail what has changed in this version, have a look at the release notes: http://code.google.com/p/numexpr/wiki/ReleaseNotes Where I can find Numexpr? ========================= The project is hosted at Google code in: http://code.google.com/p/numexpr/ And you can get the packages from PyPI as well: http://pypi.python.org/pypi How it works? ============= See: http://code.google.com/p/numexpr/wiki/Overview for a detailed description of the package. Share your experience ===================== Let us know of any bugs, suggestions, gripes, kudos, etc. you may have. Enjoy! 
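A minimal usage sketch (the arrays and the expression here are arbitrary, and set_vml_num_threads() only applies to a VML-enabled build as described above):

import numpy as np
import numexpr as ne

a = np.random.rand(1000000)
b = np.random.rand(1000000)
c = ne.evaluate("3*a + 4*b")   # same values as 3*a + 4*b, evaluated without large temporaries
# On a VML-enabled build, transcendental expressions can also use several cores:
# ne.set_vml_num_threads(2)
# d = ne.evaluate("exp(-a) * sin(b)")
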
-- Francesc Alted From gael.varoquaux at normalesup.org Wed Feb 11 08:33:06 2009 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Wed, 11 Feb 2009 14:33:06 +0100 Subject: [SciPy-user] shared memory machines In-Reply-To: <4992C539.7020501@molden.no> References: <51ab5afe7eb746fa60db99ef7635df9b.squirrel@webmail.uio.no> <20090210231356.GC9128@phare.normalesup.org> <20090211114620.GB19956@phare.normalesup.org> <4992C539.7020501@molden.no> Message-ID: <20090211133305.GC19956@phare.normalesup.org> On Wed, Feb 11, 2009 at 01:31:53PM +0100, Sturla Molden wrote: > def __dealloc__(SharedMemoryBuffer self): > print 'Calling __dealloc__ on buffer at %s' \ > % self.mapped_address #DBG > self.handle.dealloc() > Why do you do this? The Handle should self destruct. Anyway, this is > evil and will possibly case multiprocessing to hang, as well as segfaults. This was for debugging. I do not understand why my test code shows only one call to __dealloc__ (see below), and I am trying to figure out why. I fear this has more to do with Python's garbage collector. I agree this is evil. However, if I don't add this code, the __dealloc__ method of the handler does not seem get called in my example. Here is what worries me: I run this test code: ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ import ndarray as shmem import numpy as np def modify_array(ary): ary[:3] = 1 print 'Array address in sub program %s' % ary.ctypes.data from multiprocessing import Pool def main(): a = shmem.shared_zeros(10) p = Pool() print 'Array address in main program %s' % a.ctypes.data print a job = p.apply_async(modify_array, (a, )) p.close() p.join() print a main() import gc gc.collect() ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ I get the following output: ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ Array address in main program 47294723575808 [ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.] Array address in sub program 47294723575808 Calling __dealloc__ on buffer at 47294723575808 Deallocated memory at 47294723575808 [ 1. 1. 1. 0. 0. 0. 0. 0. 0. 0.] ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ The two messages about deallocation are debug prints that I inserted in the two __dealloc__ methods. It seems to me that the array 'a' in the main program has not been dellocated. I thus believe that there is a memory leak (I haven't been able to really confirm). It seems to me that the __dealloc__ method of 'a' does not get called in the main program. I have also just added print of pid (not in above example), and the two calls to __dealloc__ do happen in the child process. Finally, if I do not call explictely __dealloc__ for the handler in the dealloc of the buffer, I do not see it being called. So I am wondering if we are not being tricked by the fact that Python calls the __del__ method lazily, in particular when quitting. Maybe the solution to this problem is to add an exit hook (seems like that's what other people did when faced with this problem: http://www.python.org/search/hypermail/python-recent/0635.html, follow up is also interresting: http://www.python.org/search/hypermail/python-recent/0636.html), however this is not terribly robust. I wonder how mutliprocessing deals with this problem. By the way, I have just found a trivial bug: if I call shared_zeros with 1e5 as an argument, the code does not realise it should process this as an int. 
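A sketch of the kind of coercion meant here (illustration only, not the actual sharedmem code): accept either a sequence or a bare number, including a float such as 1e5, and turn it into a tuple of ints.

def _normalize_shape(shape):
    # Illustration only -- not the shared_empty implementation.
    try:
        shape = tuple(shape)             # already a sequence, e.g. (10, 10)
    except TypeError:
        shape = (shape,)                 # a bare number, e.g. 10 or 1e5
    return tuple(int(n) for n in shape)

# _normalize_shape(1e5) == (100000,)
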
I suggest that shared_empty also accepts floats in the 'magic' cast from numbers to tuple for the shape, as this is what numpy does. Ga?l From sturla at molden.no Wed Feb 11 08:51:38 2009 From: sturla at molden.no (Sturla Molden) Date: Wed, 11 Feb 2009 14:51:38 +0100 Subject: [SciPy-user] shared memory machines In-Reply-To: <20090211133305.GC19956@phare.normalesup.org> References: <51ab5afe7eb746fa60db99ef7635df9b.squirrel@webmail.uio.no> <20090210231356.GC9128@phare.normalesup.org> <20090211114620.GB19956@phare.normalesup.org> <4992C539.7020501@molden.no> <20090211133305.GC19956@phare.normalesup.org> Message-ID: <4992D7EA.5070404@molden.no> You need to do if __name__ == "__main__": main() in you testing code for multiprocessing to work correctly. Leaving it out is a source of mysterious errors. In fact on Windows, leaving it out creates something similar to a fork bomb. > So I am wondering if we are not being tricked by the fact that Python > calls the __del__ method lazily, in particular when quitting. Maybe the > solution to this problem is to add an exit hook (seems like that's what > other people did when faced with this problem: > http://www.python.org/search/hypermail/python-recent/0635.html, follow up > is also interresting: > http://www.python.org/search/hypermail/python-recent/0636.html), however > this is not terribly robust. I wonder how mutliprocessing deals with this > problem. multiprocessing.util.Finalize is an exit hook. That should do the clean-up in the main program. It clean up the BufferWrapper object, which owns the SharedMemoryBuffer. As long as the Heap object is destroyed, it will clean up. Try to put in some printing to see if the buffer is marked for removal. Do not use a Cython's print statement but something else (e.g. printf from stdio.h). Sturla From josegomez at gmx.net Wed Feb 11 08:56:48 2009 From: josegomez at gmx.net (Jose Luis Gomez Dans) Date: Wed, 11 Feb 2009 14:56:48 +0100 Subject: [SciPy-user] Array selection help In-Reply-To: <9457e7c80902102339wea512bcj773811ff524a2828@mail.gmail.com> References: <20090210213600.123400@gmx.net> <9457e7c80902101414v56ea9319m7fbc70a13229ef9c@mail.gmail.com> <20090210225642.141110@gmx.net> <9457e7c80902102339wea512bcj773811ff524a2828@mail.gmail.com> Message-ID: <20090211135648.67960@gmx.net> St?fan, On Wednesday 11 February 2009 07:39:21 St?fan van der Walt wrote: > scipy.ndimage.mean(arr2, labels=arr1, index=np.unique(arr1)) True. The question is, how do I get the output of your code back into my original array? Presumably, there's another function that does that quickly? Many thanks! Jose -- Remote Sensing Unit | Env. Monitoring and Modelling Group Dept. of Geography | Dept. of Geography University College London | King's College London Gower St, London WC1E 6BT UK | Strand Campus, Strand, London WC2R 2LS UK -- Psssst! Schon vom neuen GMX MultiMessenger geh?rt? 
Der kann`s mit allen: http://www.gmx.net/de/go/multimessenger01 From sturla at molden.no Wed Feb 11 08:58:24 2009 From: sturla at molden.no (Sturla Molden) Date: Wed, 11 Feb 2009 14:58:24 +0100 Subject: [SciPy-user] shared memory machines In-Reply-To: <4992D7EA.5070404@molden.no> References: <51ab5afe7eb746fa60db99ef7635df9b.squirrel@webmail.uio.no> <20090210231356.GC9128@phare.normalesup.org> <20090211114620.GB19956@phare.normalesup.org> <4992C539.7020501@molden.no> <20090211133305.GC19956@phare.normalesup.org> <4992D7EA.5070404@molden.no> Message-ID: <4992D980.2070402@molden.no> On 2/11/2009 2:51 PM, Sturla Molden wrote: > As long as the Heap object is destroyed, eh, Handle object. cdef extern from "stdio.h": void printf(char *str) cdef class Handle: """ Automatic shared segment deattachment - without this object we would need to do reference counting manually, as shmdt is global to the process. Do not instantiate this class, except from within SharedMemoryBuffer.__init__. """ cdef int shmid cdef object name cdef object cleanup cdef object __weakref__ def __init__(Handle self, shmid, name): self.shmid = shmid self.name = name def gethandle(Handle self): return int(self.shmid) def __dealloc__(Handle self): self.dealloc() def dealloc(Handle self): cdef shmid_ds buf cdef int _shmid= self.shmid cdef void *addr cdef int ierr try: ma, size = __mapped_addresses[ self.name ] addr = ( ma) ierr = shmdt(addr) if (ierr < 0): raise MemoryError, "shmdt failed." del __mapped_addresses[ self.name ] print "Deallocated memory at %s" % ma #DBG except KeyError: print __mapped_addresses #DBG print self.name #DBG print 'KeyError' #DBG #pass # this may happen and is not a problem if (shmctl(_shmid, IPC_STAT, &buf) == -1): raise OSError, \ "IPC_STAT failed, you could have a global memory leak!" if (buf.shm_nattch == 0): if( shmctl(_shmid, IPC_RMID, NULL) == -1 ): raise OSError, \ "IPC_RMID failed, you have a global memory leak!" else: printf("shared segment removed\n") S.M. From gael.varoquaux at normalesup.org Wed Feb 11 09:24:22 2009 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Wed, 11 Feb 2009 15:24:22 +0100 Subject: [SciPy-user] shared memory machines In-Reply-To: <4992D7EA.5070404@molden.no> References: <51ab5afe7eb746fa60db99ef7635df9b.squirrel@webmail.uio.no> <20090210231356.GC9128@phare.normalesup.org> <20090211114620.GB19956@phare.normalesup.org> <4992C539.7020501@molden.no> <20090211133305.GC19956@phare.normalesup.org> <4992D7EA.5070404@molden.no> Message-ID: <20090211142422.GD19956@phare.normalesup.org> On Wed, Feb 11, 2009 at 02:51:38PM +0100, Sturla Molden wrote: > As long as the Heap object is destroyed, it will clean up. Try to put in > some printing to see if the buffer is marked for removal. Do not use a > Cython's print statement but something else (e.g. printf from stdio.h). I have put in some more print statements. I have the fealing that it is not cleaned up. I am attaching my test code, and the modified sharedmemory_sysv.pyx for debug. The output is the following: ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ Array address in main program 47782588157952 (pid: 18024) [ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.] Array address in sub program 47782588157952 (pid: 18033) Calling __dealloc__ on buffer at 47782588157952, in pid 18033 Checking for deallocating of memory at 47782588157952 Not deallocating: 8 attached segments [ 1. 1. 1. 0. 0. 0. 0. 0. 0. 0.] 
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++ I do have the feeling that something is not getting garbage-collected. Ga?l -------------- next part -------------- # Written by Sturla Molden, 2009 # Released under SciPy license # ctypedef int size_t cdef extern from "errno.h": int EEXIST, errno int EACCES, errno int ENOMEM, errno cdef extern from "string.h": void memset(void *addr, int val, size_t len) void memcpy(void *trg, void *src, size_t len) cdef extern from "sys/types.h": ctypedef int key_t cdef extern from "sys/shm.h": ctypedef unsigned int shmatt_t cdef struct shmid_ds: shmatt_t shm_nattch int shmget(key_t key, size_t size, int shmflg) void *shmat(int shmid, void *shmaddr, int shmflg) int shmdt(void *shmaddr) int shmctl(int shmid, int cmd, shmid_ds *buf) nogil cdef extern from "stdio.h": void printf(char *str, ...) cdef extern from "sys/ipc.h": key_t ftok(char *path, int id) int IPC_STAT, IPC_RMID, IPC_CREAT, IPC_EXCL, IPC_PRIVATE cdef extern from "unistd.h": unsigned int sleep(unsigned int seconds) nogil import uuid import weakref import numpy import threading import os cdef object __mapped_addresses = dict() cdef object __open_handles = weakref.WeakValueDictionary() cdef class Handle: """ Automatic shared segment deattachment - without this object we would need to do reference counting manually, as shmdt is global to the process. Do not instantiate this class, except from within SharedMemoryBuffer.__init__. """ cdef int shmid cdef object name cdef object __weakref__ def __init__(Handle self, shmid, name): self.shmid = shmid self.name = name def gethandle(Handle self): return int(self.shmid) def __dealloc__(Handle self): self.dealloc() def dealloc(Handle self): cdef shmid_ds buf cdef int _shmid = self.shmid cdef void *addr cdef int ierr try: ma, size = __mapped_addresses[ self.name ] addr = ( ma) ierr = shmdt(addr) if (ierr < 0): raise MemoryError, "shmdt failed." del __mapped_addresses[ self.name ] printf("Checking for deallocating of memory at %lu\n", ma) #DBG except KeyError: print __mapped_addresses #DBG print self.name #DBG print 'KeyError' #DBG #pass if (shmctl(_shmid, IPC_STAT, &buf) == -1): raise OSError, \ "IPC_STAT failed, you could have a global memory leak!" if (buf.shm_nattch == 0): if( shmctl(_shmid, IPC_RMID, NULL) == -1 ): raise OSError, \ "IPC_RMID failed, you have a global memory leak!" else: printf("shared segment removed\n") else: printf('Not deallocating: %i attached segments\n', buf.shm_nattch) cdef class SharedMemoryBuffer: """ Windows API shared memory segment """ cdef void *mapped_address cdef object name cdef object handle cdef int shmid cdef unsigned long size def __init__(SharedMemoryBuffer self, unsigned int buf_size, name=None, unpickling=False): cdef void* mapped_address cdef long mode cdef int shmid cdef int ikey cdef key_t key lkey = 1 if IPC_PRIVATE < 0 else IPC_PRIVATE + 1 if (name is None) and (unpickling): raise TypeError, "Cannot unpickle without a kernel object name." 
elif (name is None) and not unpickling: # create a brand new shared segment while 1: self.name = numpy.random.random_integers(lkey, int(2147483646)) ikey = self.name memset( &key, 0, sizeof(key_t)) memcpy( &key, &ikey, sizeof(int)) # key_t is large enough to contain an int shmid = shmget(key, buf_size, IPC_CREAT|IPC_EXCL|0600) if (shmid < 0): if (errno != EEXIST): raise OSError, "Failed to open shared memory" else: # we have an open segment break self.handle = Handle(int(shmid), self.name) __open_handles[ self.name ] = self.handle mapped_address = shmat(shmid, NULL, 0) if (mapped_address == -1): if errno == EACCES: raise OSError, "Failed to attach shared memory: permission denied" elif errno == ENOMEM: raise OSError, "Failed to attach shared memory: insufficient memory" else: raise OSError, "Failed to attach shared memory" self.shmid = shmid self.size = buf_size self.mapped_address = mapped_address ma = int( self.mapped_address) size = int(buf_size) __mapped_addresses[ self.name ] = ma, size else: # unpickling self.name = name try: # check if this process has an open handle to # this segment already self.handle = __open_handles[ self.name ] self.shmid = self.handle.gethandle() ma, size = __mapped_addresses[ self.name ] self.mapped_address = ( ma) self.size = size except KeyError: # unpickle a segment created by another process ikey = self.name memset( &key, 0, sizeof(key_t)) memcpy( &key, &ikey, sizeof(int)) shmid = shmget(key, buf_size, 0) if (shmid < 0): raise OSError, "Failed to open shared memory" self.handle = Handle(int(shmid), name) __open_handles[ self.name ] = self.handle mapped_address = shmat(shmid, NULL, 0) if (mapped_address == -1): raise OSError, "Failed to attach shared memory" self.shmid = shmid self.size = buf_size self.mapped_address = mapped_address ma = int( self.mapped_address) size = int(buf_size) __mapped_addresses[ self.name ] = ma, size def __dealloc__(SharedMemoryBuffer self): printf('Calling __dealloc__ on buffer at %lu, in pid %i\n', #DBG self.mapped_address, #DBG os.getpid()) #DBG self.handle.dealloc() # return base address and segment size # this will be used by the heap object def getbuffer(SharedMemoryBuffer self): return int( self.mapped_address), int(self.size) # pickle def __reduce__(SharedMemoryBuffer self): return (__unpickle_shm, (self.size, self.name)) def __unpickle_shm(*args): s, n = args return SharedMemoryBuffer(s, name=n, unpickling=True) -------------- next part -------------- import ndarray as shmem import numpy as np import sys import os #a = shmem.shared_zeros(10) #print >>sys.stderr, 'Array created' #print a.ctypes.data #print a #print >>sys.stderr, 'Array printed' def modify_array(ary): ary[:3] = 1 print >>sys.stderr, 'Array address in sub program %s (pid: %s)' \ % (ary.ctypes.data, os.getpid()) from multiprocessing import Pool def main(): SIZE = 10 a = shmem.shared_zeros(SIZE) #a = np.zeros(SIZE) p = Pool() print >>sys.stderr, 'Array address in main program %s (pid: %s)' \ % (a.ctypes.data, os.getpid()) print >>sys.stderr, a job = p.apply_async(modify_array, (a, )) p.close() p.join() print >>sys.stderr, a if __name__ == '__main__': main() import gc gc.collect() From stefan at sun.ac.za Wed Feb 11 09:22:05 2009 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Wed, 11 Feb 2009 16:22:05 +0200 Subject: [SciPy-user] Array selection help In-Reply-To: <20090211135648.67960@gmx.net> References: <20090210213600.123400@gmx.net> <9457e7c80902101414v56ea9319m7fbc70a13229ef9c@mail.gmail.com> <20090210225642.141110@gmx.net> 
<9457e7c80902102339wea512bcj773811ff524a2828@mail.gmail.com> <20090211135648.67960@gmx.net> Message-ID: <9457e7c80902110622t294e1b98hf4c2612e34a16fb6@mail.gmail.com> 2009/2/11 Jose Luis Gomez Dans : > On Wednesday 11 February 2009 07:39:21 St?fan van der Walt wrote: >> scipy.ndimage.mean(arr2, labels=arr1, index=np.unique(arr1)) > > True. The question is, how do I get the output of your code back into my > original array? Presumably, there's another function that does that quickly? It is already in an array, so I'm not sure I understand. Maybe you mean out[:] = scipy.ndimage.mean(...) ? St?fan From wnbell at gmail.com Wed Feb 11 09:40:11 2009 From: wnbell at gmail.com (Nathan Bell) Date: Wed, 11 Feb 2009 09:40:11 -0500 Subject: [SciPy-user] sparse matrices again In-Reply-To: <1234356072.5642.16.camel@localhost> References: <1234356072.5642.16.camel@localhost> Message-ID: On Wed, Feb 11, 2009 at 7:41 AM, Soeren Sonnenburg wrote: > > is it somehow possible to interface to the C API of scipy's spars > matrices? I know numpy does not have sparse matrix support but scipy > does (at least it can be used from the python side). > > If it is not too unstable then I would invest some time to get some swig > typemaps to connect to it. > The interface is not guaranteed to be stable, but you can access the C++ functions that implement much of scipy.sparse through scipy.sparse.sparsetools. What do you want to do exactly? -- Nathan Bell wnbell at gmail.com http://graphics.cs.uiuc.edu/~wnbell/ From josegomez at gmx.net Wed Feb 11 10:03:15 2009 From: josegomez at gmx.net (Jose Luis Gomez Dans) Date: Wed, 11 Feb 2009 16:03:15 +0100 Subject: [SciPy-user] Array selection help In-Reply-To: <9457e7c80902110622t294e1b98hf4c2612e34a16fb6@mail.gmail.com> References: <20090210213600.123400@gmx.net> <9457e7c80902101414v56ea9319m7fbc70a13229ef9c@mail.gmail.com> <20090210225642.141110@gmx.net> <9457e7c80902102339wea512bcj773811ff524a2828@mail.gmail.com> <20090211135648.67960@gmx.net> <9457e7c80902110622t294e1b98hf4c2612e34a16fb6@mail.gmail.com> Message-ID: <20090211150315.141380@gmx.net> > >> scipy.ndimage.mean(arr2, labels=arr1, index=np.unique(arr1)) > > > > True. The question is, how do I get the output of your code back into my > > original array? Presumably, there's another function that does that > quickly? > > It is already in an array, so I'm not sure I understand. Maybe you mean > > out[:] = scipy.ndimage.mean(...) ? Sorry, I was clumsy with my wording. What I meant is how to put together the results, so that I have a 2D array where the value of each element is the value that corresponds to the mean of the corresponding label. So if arr1[100,100] = 4 (say), and after running the mean of arr2 for elements that in arr1 are labeled as 4, the mean value is 2.3, I'd like to have an array (out, out.shape == arr1.shape) where the values of elements of out that share a common label are given the mean value (2.3 for those labeled as 4 in my previous example). In essence, I want to have an array where each element is the mean value for its corresponding class. many thanks! Jose -- Jetzt 1 Monat kostenlos! 
GMX FreeDSL - Telefonanschluss + DSL f?r nur 17,95 Euro/mtl.!* http://dsl.gmx.de/?ac=OM.AD.PD003K11308T4569a From stefan at sun.ac.za Wed Feb 11 10:26:31 2009 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Wed, 11 Feb 2009 17:26:31 +0200 Subject: [SciPy-user] Array selection help In-Reply-To: <20090211150315.141380@gmx.net> References: <20090210213600.123400@gmx.net> <9457e7c80902101414v56ea9319m7fbc70a13229ef9c@mail.gmail.com> <20090210225642.141110@gmx.net> <9457e7c80902102339wea512bcj773811ff524a2828@mail.gmail.com> <20090211135648.67960@gmx.net> <9457e7c80902110622t294e1b98hf4c2612e34a16fb6@mail.gmail.com> <20090211150315.141380@gmx.net> Message-ID: <9457e7c80902110726y3d2ca16bmcd42dbccf1f9c55c@mail.gmail.com> 2009/2/11 Jose Luis Gomez Dans : >> out[:] = scipy.ndimage.mean(...) ? > > Sorry, I was clumsy with my wording. What I meant is how to put together the results, so that I have a 2D array where the value of each element is the value that corresponds to the mean of the corresponding label. So if arr1[100,100] = 4 (say), and after running the mean of arr2 for elements that in arr1 are labeled as 4, the mean value is 2.3, I'd like to have an array (out, out.shape == arr1.shape) where the values of elements of out that share a common label are given the mean value (2.3 for those labeled as 4 in my previous example). > > In essence, I want to have an array where each element is the mean value for its corresponding class. Thanks, now I understand! In that case your for-loop should be fine (I guess you won't have too many unique indices?). Cheers St?fan From josegomez at gmx.net Wed Feb 11 10:41:15 2009 From: josegomez at gmx.net (Jose Luis Gomez Dans) Date: Wed, 11 Feb 2009 16:41:15 +0100 Subject: [SciPy-user] Array selection help In-Reply-To: <9457e7c80902110726y3d2ca16bmcd42dbccf1f9c55c@mail.gmail.com> References: <20090210213600.123400@gmx.net> <9457e7c80902101414v56ea9319m7fbc70a13229ef9c@mail.gmail.com> <20090210225642.141110@gmx.net> <9457e7c80902102339wea512bcj773811ff524a2828@mail.gmail.com> <20090211135648.67960@gmx.net> <9457e7c80902110622t294e1b98hf4c2612e34a16fb6@mail.gmail.com> <20090211150315.141380@gmx.net> <9457e7c80902110726y3d2ca16bmcd42dbccf1f9c55c@mail.gmail.com> Message-ID: <20090211154115.67130@gmx.net> Hi, > > In essence, I want to have an array where each element is the mean value > for its corresponding class. > > Thanks, now I understand! In that case your for-loop should be fine > (I guess you won't have too many unique indices?). Well, there can be quite a lot of them (~10000 at least), so it does take a long while. I was just wondering whether some numpy/scipy array Jedi trick might speed it up :) jose -- Jetzt 1 Monat kostenlos! 
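One vectorised trick worth noting alongside the replies below, as a sketch that assumes the labels in arr1 are non-negative integers: numpy.bincount gives all per-label sums and counts in a single pass, with no Python loop over the ~10000 labels, and fancy indexing then broadcasts the means straight back onto the grid.

import numpy as np

labels = arr1.ravel()
values = arr2.ravel()
sums   = np.bincount(labels, weights=values)   # sum of arr2 over each label value
counts = np.bincount(labels)                   # number of pixels carrying each label value
means  = sums / np.maximum(counts, 1)          # per-label means (labels that never occur stay 0)
out    = means[arr1]                           # 2D array: every pixel replaced by its class mean
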
GMX FreeDSL - Telefonanschluss + DSL f?r nur 17,95 Euro/mtl.!* http://dsl.gmx.de/?ac=OM.AD.PD003K11308T4569a From josef.pktd at gmail.com Wed Feb 11 11:27:44 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 11 Feb 2009 11:27:44 -0500 Subject: [SciPy-user] Array selection help In-Reply-To: <20090211154115.67130@gmx.net> References: <20090210213600.123400@gmx.net> <9457e7c80902101414v56ea9319m7fbc70a13229ef9c@mail.gmail.com> <20090210225642.141110@gmx.net> <9457e7c80902102339wea512bcj773811ff524a2828@mail.gmail.com> <20090211135648.67960@gmx.net> <9457e7c80902110622t294e1b98hf4c2612e34a16fb6@mail.gmail.com> <20090211150315.141380@gmx.net> <9457e7c80902110726y3d2ca16bmcd42dbccf1f9c55c@mail.gmail.com> <20090211154115.67130@gmx.net> Message-ID: <1cd32cbb0902110827n773a6897p7b6dca3043784843@mail.gmail.com> What's your average number of observations per label? If you have only a few number of observations per label, then using the looping once through your array in python is faster, then the way you were building your dict in the initial message: Below are some timing comparisons, first line is your usage of numpy, second line is one python loop. you see that the python loop scales much better Josef (length of observation array is 2000, labels are random integers) mean observation per label 40.0 0.404751721231 0.361348718907 >>> mean observation per label 200.0 0.149529060262 0.349892234903 >>> mean observation per label 4.0 2.87190969802 0.380998981716 >>> mean observation per label 2.0 4.87971013076 0.405277207021 >>> mean observation per label 400.0 0.117748205434 0.432144029481 for len(arr1) = 100000 and 10000 labels: mean observation per label 10.0 22.9237349998 0.292642780018 Note: the return types differ, version two return plain lists as dict values ------------------------- file----------------- import numpy as np from scipy import ndimage from numpy.testing import assert_array_equal n = 10000 size = 100000 print 'mean observation per label', size/float(n) rvs= np.random.randint(n,size=size) arr1 = rvs arr2 = float(n)-rvs def usendimage(arr1,arr2): for i in np.unique(arr1): print i, ndimage.mean(arr2, labels=arr1, index=i) labelsunique = np.unique(arr1) print labelsunique print ndimage.mean(arr2, labels=arr1, index=labelsunique) def labelcoord1(arr1, arr2): #Get all the unique values in arr1 U = np.unique ( arr1 ) #Create a dictionary with the unique values as key, and the #locations of elements that have that value in arr1 R = dict (zip ( [U[i] for i in xrange(U.shape[0])], \ [ np.nonzero( arr1==U[i]) for i in xrange(U.shape[0]) ] ) ) return R # value of dict is tuple def labelcoord2(arr1, arr2): #Get all the unique values in arr1 U = np.unique ( arr1 ) #Create a dictionary with the unique values as key, and the #locations of elements that have that value in arr1 R = {} for index, row in enumerate(zip(arr1,arr2)): R.setdefault(row[0],[]).append(index) return R # value of dict is list # So I now have a dictionary with the unique values of arr1, and the mean # value of arr2 for those pixels. import timeit t=timeit.Timer("labelcoord1(arr1, arr2)", "from __main__ import *") print t.timeit(1) t=timeit.Timer("labelcoord2(arr1, arr2)", "from __main__ import *") print t.timeit(1) R1 = labelcoord1(arr1, arr2) R2 = labelcoord2(arr1, arr2) for k in sorted(R1): assert_array_equal(R1[k][0], np.array(R2[k])) On 2/11/09, Jose Luis Gomez Dans wrote: > Hi, > >> > In essence, I want to have an array where each element is the mean value >> for its corresponding class. 
>> >> Thanks, now I understand! In that case your for-loop should be fine >> (I guess you won't have too many unique indices?). > > Well, there can be quite a lot of them (~10000 at least), so it does take a > long while. I was just wondering whether some numpy/scipy array Jedi trick > might speed it up :) > > jose > -- > Jetzt 1 Monat kostenlos! GMX FreeDSL - Telefonanschluss + DSL > f?r nur 17,95 Euro/mtl.!* http://dsl.gmx.de/?ac=OM.AD.PD003K11308T4569a > _______________________________________________ > SciPy-user mailing list > SciPy-user at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-user > From josef.pktd at gmail.com Wed Feb 11 11:47:52 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 11 Feb 2009 11:47:52 -0500 Subject: [SciPy-user] Array selection help In-Reply-To: <1cd32cbb0902110827n773a6897p7b6dca3043784843@mail.gmail.com> References: <20090210213600.123400@gmx.net> <9457e7c80902101414v56ea9319m7fbc70a13229ef9c@mail.gmail.com> <20090210225642.141110@gmx.net> <9457e7c80902102339wea512bcj773811ff524a2828@mail.gmail.com> <20090211135648.67960@gmx.net> <9457e7c80902110622t294e1b98hf4c2612e34a16fb6@mail.gmail.com> <20090211150315.141380@gmx.net> <9457e7c80902110726y3d2ca16bmcd42dbccf1f9c55c@mail.gmail.com> <20090211154115.67130@gmx.net> <1cd32cbb0902110827n773a6897p7b6dca3043784843@mail.gmail.com> Message-ID: <1cd32cbb0902110847w4e234183u79ad1dbec6340c2f@mail.gmail.com> list comprehension is still a bit faster. That's about 90 times faster than your version for building the dict of indices for this case. Josef def labelcoord3(arr1, arr2): R = {} [R.setdefault(row[0],[]).append(index) for index, row in enumerate(zip(arr1,arr2))] return R mean observation per label 10.0 labelcoord2 0.374560733278 labelcoord3 0.254217505297 >>> len(R3) # number of different labels 10000 >>> len(arr1) # number of observations 100000 On 2/11/09, josef.pktd at gmail.com wrote: > What's your average number of observations per label? > > If you have only a few number of observations per label, then using > the looping once through your array in python is faster, then the way > you were building your dict in the initial message: > > Below are some timing comparisons, first line is your usage of numpy, > second line is one python loop. 
> you see that the python loop scales much better > > Josef > > > (length of observation array is 2000, labels are random integers) > > mean observation per label 40.0 > 0.404751721231 > 0.361348718907 >>>> > mean observation per label 200.0 > 0.149529060262 > 0.349892234903 >>>> > mean observation per label 4.0 > 2.87190969802 > 0.380998981716 >>>> > mean observation per label 2.0 > 4.87971013076 > 0.405277207021 >>>> > mean observation per label 400.0 > 0.117748205434 > 0.432144029481 > > for len(arr1) = 100000 and 10000 labels: > > mean observation per label 10.0 > 22.9237349998 > 0.292642780018 > > Note: the return types differ, version two return plain lists as dict > values > ------------------------- file----------------- > > > import numpy as np > from scipy import ndimage > from numpy.testing import assert_array_equal > > n = 10000 > size = 100000 > print 'mean observation per label', size/float(n) > rvs= np.random.randint(n,size=size) > arr1 = rvs > arr2 = float(n)-rvs > > def usendimage(arr1,arr2): > for i in np.unique(arr1): > print i, ndimage.mean(arr2, labels=arr1, index=i) > > labelsunique = np.unique(arr1) > print labelsunique > print ndimage.mean(arr2, labels=arr1, index=labelsunique) > > > > def labelcoord1(arr1, arr2): > #Get all the unique values in arr1 > U = np.unique ( arr1 ) > #Create a dictionary with the unique values as key, and the > #locations of elements that have that value in arr1 > R = dict (zip ( [U[i] for i in xrange(U.shape[0])], \ > [ np.nonzero( arr1==U[i]) for i in xrange(U.shape[0]) ] ) ) > return R # value of dict is tuple > > def labelcoord2(arr1, arr2): > > #Get all the unique values in arr1 > U = np.unique ( arr1 ) > #Create a dictionary with the unique values as key, and the > #locations of elements that have that value in arr1 > R = {} > for index, row in enumerate(zip(arr1,arr2)): > R.setdefault(row[0],[]).append(index) > return R # value of dict is list > > > # So I now have a dictionary with the unique values of arr1, and the > mean > # value of arr2 for those pixels. > > > > import timeit > t=timeit.Timer("labelcoord1(arr1, arr2)", "from __main__ import *") > print t.timeit(1) > t=timeit.Timer("labelcoord2(arr1, arr2)", "from __main__ import *") > print t.timeit(1) > > R1 = labelcoord1(arr1, arr2) > R2 = labelcoord2(arr1, arr2) > for k in sorted(R1): > assert_array_equal(R1[k][0], np.array(R2[k])) > > > > > On 2/11/09, Jose Luis Gomez Dans wrote: >> Hi, >> >>> > In essence, I want to have an array where each element is the mean >>> > value >>> for its corresponding class. >>> >>> Thanks, now I understand! In that case your for-loop should be fine >>> (I guess you won't have too many unique indices?). >> >> Well, there can be quite a lot of them (~10000 at least), so it does take >> a >> long while. I was just wondering whether some numpy/scipy array Jedi >> trick >> might speed it up :) >> >> jose >> -- >> Jetzt 1 Monat kostenlos! 
GMX FreeDSL - Telefonanschluss + DSL >> f?r nur 17,95 Euro/mtl.!* http://dsl.gmx.de/?ac=OM.AD.PD003K11308T4569a >> _______________________________________________ >> SciPy-user mailing list >> SciPy-user at scipy.org >> http://projects.scipy.org/mailman/listinfo/scipy-user >> > From jgomezdans at gmail.com Wed Feb 11 12:31:44 2009 From: jgomezdans at gmail.com (Jose Gomez-Dans) Date: Wed, 11 Feb 2009 17:31:44 +0000 Subject: [SciPy-user] Array selection help In-Reply-To: <1cd32cbb0902110847w4e234183u79ad1dbec6340c2f@mail.gmail.com> References: <20090210213600.123400@gmx.net> <20090210225642.141110@gmx.net> <9457e7c80902102339wea512bcj773811ff524a2828@mail.gmail.com> <20090211135648.67960@gmx.net> <9457e7c80902110622t294e1b98hf4c2612e34a16fb6@mail.gmail.com> <20090211150315.141380@gmx.net> <9457e7c80902110726y3d2ca16bmcd42dbccf1f9c55c@mail.gmail.com> <20090211154115.67130@gmx.net> <1cd32cbb0902110827n773a6897p7b6dca3043784843@mail.gmail.com> <1cd32cbb0902110847w4e234183u79ad1dbec6340c2f@mail.gmail.com> Message-ID: <91d218430902110931s769e0160o62c449fc22f8f95b@mail.gmail.com> Josef, Thanks for that 2009/2/11 > list comprehension is still a bit faster. That's about 90 times faster > than your version for building the dict of indices for this case. I still think ndimage is more obvious to use if you have images, and in my case it is fast (I didn't time it, but less than making a cup of horrible instant coffee ;p). My problem is that it takes a long time to go from a list/dictionary of class mean values (output from either St?fan's ndimage solution or your dictionary solution) back into the original 2D array. In fact, I'm thinking about using weave to achieve this, unless someone comes up with a better idea. Many thanks for your help, and for the code! Jose -- Remote Sensing Unit | Env. Monitoring and Modelling Group Dept. of Geography | Dept. of Geography University College London | King's College London Gower St, London WC1E 6BT UK | Strand Campus, Strand, London WC2R 2LS UK -------------- next part -------------- An HTML attachment was scrubbed... URL: From josef.pktd at gmail.com Wed Feb 11 13:23:42 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 11 Feb 2009 13:23:42 -0500 Subject: [SciPy-user] Array selection help In-Reply-To: <91d218430902110931s769e0160o62c449fc22f8f95b@mail.gmail.com> References: <20090210213600.123400@gmx.net> <9457e7c80902102339wea512bcj773811ff524a2828@mail.gmail.com> <20090211135648.67960@gmx.net> <9457e7c80902110622t294e1b98hf4c2612e34a16fb6@mail.gmail.com> <20090211150315.141380@gmx.net> <9457e7c80902110726y3d2ca16bmcd42dbccf1f9c55c@mail.gmail.com> <20090211154115.67130@gmx.net> <1cd32cbb0902110827n773a6897p7b6dca3043784843@mail.gmail.com> <1cd32cbb0902110847w4e234183u79ad1dbec6340c2f@mail.gmail.com> <91d218430902110931s769e0160o62c449fc22f8f95b@mail.gmail.com> Message-ID: <1cd32cbb0902111023y1958afbsa947064f0f3abd34@mail.gmail.com> Putting the array back together with means also seems to be pretty fast this way. The test for correctness takes several times longer than the array creation. I think labelmeanfilter below does what you want. 
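A side note for the image case: since the labels in this thread really come from a 2D classification image, any of the 1D recipes can be wrapped in a ravel/reshape round trip. A rough sketch of that idea -- the function name and the searchsorted lookup are my own shorthand here, not code from the thread:

import numpy as np
from scipy import ndimage

def classmean_image(class_img, value_img):
    # replace every pixel by the mean of value_img over that pixel's class
    labels = class_img.ravel()
    values = value_img.ravel()
    uniq = np.unique(labels)            # sorted unique class ids
    means = np.asarray(ndimage.mean(values, labels=labels, index=uniq))
    # position of each pixel's class id inside uniq; valid because uniq is
    # sorted and contains every label that actually occurs
    pos = np.searchsorted(uniq, labels)
    return means[pos].reshape(class_img.shape)

The reshape at the end is the only image-specific step; the rest is the same 1D machinery discussed below.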
Josef timing again for large case: >>> arr1.size 100000 >>> labelsunique.size 10000 mean observation per label 10.0 labelcoord2 0.391759008477 labelcoord3 0.253704311581 labelmeanfilter 0.446776056733 def labelmeanfilter(arr1, arr2): R = {} [R.setdefault(row[0],[]).append(index) for index, row in enumerate(zip(arr1,arr2))] labelsunique = R.keys() #np.unique(arr1) labelmeans = ndimage.mean(arr2, labels=arr1, index=labelsunique) arr3 = np.zeros(arr1.shape) for k,v in zip(labelsunique,labelmeans): arr3[R[k]] = v return arr3 def test_labelmeanfilter(arr1, arr2): arr3b = labelmeanfilter(arr1, arr2) labmeandict = dict(zip(labelsunique,labelmeans)) for orig,means in zip(arr1,arr3b): assert_array_equal(means, labmeandict[orig], repr(orig)) On 2/11/09, Jose Gomez-Dans wrote: > Josef, > > Thanks for that > > 2009/2/11 > >> list comprehension is still a bit faster. That's about 90 times faster >> than your version for building the dict of indices for this case. > > > I still think ndimage is more obvious to use if you have images, and in my > case it is fast (I didn't time it, but less than making a cup of horrible > instant coffee ;p). My problem is that it takes a long time to go from a > list/dictionary of class mean values (output from either St?fan's ndimage > solution or your dictionary solution) back into the original 2D array. > > In fact, I'm thinking about using weave to achieve this, unless someone > comes up with a better idea. > > Many thanks for your help, and for the code! > Jose > > -- > Remote Sensing Unit | Env. Monitoring and Modelling Group > Dept. of Geography | Dept. of Geography > University College London | King's College London > Gower St, London WC1E 6BT UK | Strand Campus, Strand, London WC2R 2LS UK > From rpyle at post.harvard.edu Wed Feb 11 11:13:13 2009 From: rpyle at post.harvard.edu (Robert Pyle) Date: Wed, 11 Feb 2009 11:13:13 -0500 Subject: [SciPy-user] ANN: SciPy 0.7.0 In-Reply-To: References: Message-ID: <27CB252B-AFF8-4003-8304-74581F14C0EB@post.harvard.edu> Hi, I have a dual G5 running 10.5.6. I removed scipy-0.6.0 from site- packages, downloaded scipy-0.7.0-py2.5-macosx10.5.dmg, and installed it. The installation went surprisingly fast and claimed to have succeeded, but 0.7.0 was nowhere to be found. I then downloaded the tarball and successfully (if rather more slowly) went through the installation. Is something wrong with scipy-0.7.0-py2.5-macosx10.5.dmg? Bob Pyle On Feb 11, 2009, at 3:26 AM, Jarrod Millman wrote: > I'm pleased to announce SciPy 0.7.0. From josef.pktd at gmail.com Wed Feb 11 13:46:14 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 11 Feb 2009 13:46:14 -0500 Subject: [SciPy-user] Array selection help In-Reply-To: <1cd32cbb0902111023y1958afbsa947064f0f3abd34@mail.gmail.com> References: <20090210213600.123400@gmx.net> <20090211135648.67960@gmx.net> <9457e7c80902110622t294e1b98hf4c2612e34a16fb6@mail.gmail.com> <20090211150315.141380@gmx.net> <9457e7c80902110726y3d2ca16bmcd42dbccf1f9c55c@mail.gmail.com> <20090211154115.67130@gmx.net> <1cd32cbb0902110827n773a6897p7b6dca3043784843@mail.gmail.com> <1cd32cbb0902110847w4e234183u79ad1dbec6340c2f@mail.gmail.com> <91d218430902110931s769e0160o62c449fc22f8f95b@mail.gmail.com> <1cd32cbb0902111023y1958afbsa947064f0f3abd34@mail.gmail.com> Message-ID: <1cd32cbb0902111046h6b3a0103q5c10bb77fd26aac5@mail.gmail.com> sorry, cut and paste error, the test function used globals, correction below. You can speed up a little bit more using itertools.izip instead of zip. 
(labelmeanfilter 0.377629000334) Josef def test_labelmeanfilter(arr1, arr2): arr3b = labelmeanfilter(arr1, arr2) labelsunique = np.unique(arr1) labelmeans = ndimage.mean(arr2, labels=arr1, index=labelsunique) labmeandict = dict(zip(labelsunique,labelmeans)) for orig,means in zip(arr1,arr3b): assert_array_equal(means, labmeandict[orig], repr(orig)) From strawman at astraw.com Wed Feb 11 13:54:43 2009 From: strawman at astraw.com (Andrew Straw) Date: Wed, 11 Feb 2009 10:54:43 -0800 Subject: [SciPy-user] Array selection help In-Reply-To: <1cd32cbb0902111023y1958afbsa947064f0f3abd34@mail.gmail.com> References: <20090210213600.123400@gmx.net> <9457e7c80902102339wea512bcj773811ff524a2828@mail.gmail.com> <20090211135648.67960@gmx.net> <9457e7c80902110622t294e1b98hf4c2612e34a16fb6@mail.gmail.com> <20090211150315.141380@gmx.net> <9457e7c80902110726y3d2ca16bmcd42dbccf1f9c55c@mail.gmail.com> <20090211154115.67130@gmx.net> <1cd32cbb0902110827n773a6897p7b6dca3043784843@mail.gmail.com> <1cd32cbb0902110847w4e234183u79ad1dbec6340c2f@mail.gmail.com> <91d218430902110931s769e0160o62c449fc22f8f95b@mail.gmail.com> <1cd32cbb0902111023y1958afbsa947064f0f3abd34@mail.gmail.com> Message-ID: <49931EF3.7010106@astraw.com> > def labelmeanfilter(arr1, arr2): > R = {} > [R.setdefault(row[0],[]).append(index) for index, row in > enumerate(zip(arr1,arr2))] > I think the following should produce a bit more speed (although I haven't benchmarked it) because it avoids creating len(arr1) empty lists. import collections def labelmeanfilter(arr1, arr2): R = collections.defaultdict(list) [R[row[0]].append(index) for index, row in enumerate(zip(arr1,arr2))] From josef.pktd at gmail.com Wed Feb 11 14:06:20 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 11 Feb 2009 14:06:20 -0500 Subject: [SciPy-user] Array selection help In-Reply-To: <49931EF3.7010106@astraw.com> References: <20090210213600.123400@gmx.net> <9457e7c80902110622t294e1b98hf4c2612e34a16fb6@mail.gmail.com> <20090211150315.141380@gmx.net> <9457e7c80902110726y3d2ca16bmcd42dbccf1f9c55c@mail.gmail.com> <20090211154115.67130@gmx.net> <1cd32cbb0902110827n773a6897p7b6dca3043784843@mail.gmail.com> <1cd32cbb0902110847w4e234183u79ad1dbec6340c2f@mail.gmail.com> <91d218430902110931s769e0160o62c449fc22f8f95b@mail.gmail.com> <1cd32cbb0902111023y1958afbsa947064f0f3abd34@mail.gmail.com> <49931EF3.7010106@astraw.com> Message-ID: <1cd32cbb0902111106g5dad3a58v13edb6fb1daafa52@mail.gmail.com> On Wed, Feb 11, 2009 at 1:54 PM, Andrew Straw wrote: > >> def labelmeanfilter(arr1, arr2): >> R = {} >> [R.setdefault(row[0],[]).append(index) for index, row in >> enumerate(zip(arr1,arr2))] >> > > > I think the following should produce a bit more speed (although I > haven't benchmarked it) because it avoids creating len(arr1) empty lists. > > import collections > > def labelmeanfilter(arr1, arr2): > R = collections.defaultdict(list) > [R[row[0]].append(index) for index, row in > enumerate(zip(arr1,arr2))] > Yes, its around 14% faster for the example, however, it requires python 2.5 and I am still thinking as if I were using 2.4. 
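For anyone comparing the two idioms side by side, here is a minimal self-contained sketch (the defaultdict variant needs Python 2.5+); both build the same label -> list-of-indices dict used in this thread:

import collections
import numpy as np

labels = np.random.randint(0, 100, size=1000)

# Python 2.4-compatible version: dict.setdefault
groups_a = {}
for i, lab in enumerate(labels):
    groups_a.setdefault(lab, []).append(i)

# Python 2.5+ version: defaultdict avoids the repeated setdefault calls
groups_b = collections.defaultdict(list)
for i, lab in enumerate(labels):
    groups_b[lab].append(i)

assert groups_a == dict(groups_b)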
Josef From josef.pktd at gmail.com Wed Feb 11 14:26:04 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 11 Feb 2009 14:26:04 -0500 Subject: [SciPy-user] Array selection help In-Reply-To: <1cd32cbb0902111106g5dad3a58v13edb6fb1daafa52@mail.gmail.com> References: <20090210213600.123400@gmx.net> <20090211150315.141380@gmx.net> <9457e7c80902110726y3d2ca16bmcd42dbccf1f9c55c@mail.gmail.com> <20090211154115.67130@gmx.net> <1cd32cbb0902110827n773a6897p7b6dca3043784843@mail.gmail.com> <1cd32cbb0902110847w4e234183u79ad1dbec6340c2f@mail.gmail.com> <91d218430902110931s769e0160o62c449fc22f8f95b@mail.gmail.com> <1cd32cbb0902111023y1958afbsa947064f0f3abd34@mail.gmail.com> <49931EF3.7010106@astraw.com> <1cd32cbb0902111106g5dad3a58v13edb6fb1daafa52@mail.gmail.com> Message-ID: <1cd32cbb0902111126h510fa857mcecae47165567a85@mail.gmail.com> labelmeanfilter 0.387612196522 labelmeanfilter1 0.0931486264316 #new version from itertools import izip def labelmeanfilter1(arr1, arr2): labelsunique = np.unique(arr1) labelmeans = ndimage.mean(arr2, labels=arr1, index=labelsunique) labmeandict = dict(izip(labelsunique,labelmeans)) arr3 = np.array([labmeandict[orig] for orig in arr1]) return arr3 arr3_0 = labelmeanfilter(arr1, arr2) arr3_1 = labelmeanfilter1(arr1, arr2) >>> np.all(arr3_1 == arr3_0) True >>> arr3_1.shape (100000,) I'm finished playing, it's simple and obvious. Josef From jgomezdans at gmail.com Wed Feb 11 15:00:45 2009 From: jgomezdans at gmail.com (Jose Gomez-Dans) Date: Wed, 11 Feb 2009 20:00:45 +0000 Subject: [SciPy-user] Array selection help In-Reply-To: <1cd32cbb0902111126h510fa857mcecae47165567a85@mail.gmail.com> References: <20090210213600.123400@gmx.net> <9457e7c80902110726y3d2ca16bmcd42dbccf1f9c55c@mail.gmail.com> <20090211154115.67130@gmx.net> <1cd32cbb0902110827n773a6897p7b6dca3043784843@mail.gmail.com> <1cd32cbb0902110847w4e234183u79ad1dbec6340c2f@mail.gmail.com> <91d218430902110931s769e0160o62c449fc22f8f95b@mail.gmail.com> <1cd32cbb0902111023y1958afbsa947064f0f3abd34@mail.gmail.com> <49931EF3.7010106@astraw.com> <1cd32cbb0902111106g5dad3a58v13edb6fb1daafa52@mail.gmail.com> <1cd32cbb0902111126h510fa857mcecae47165567a85@mail.gmail.com> Message-ID: <91d218430902111200v6ece97dayb85be9977d74feec@mail.gmail.com> Josef, 2009/2/11 > labelmeanfilter 0.387612196522 > labelmeanfilter1 0.0931486264316 #new version > I'm finished playing, it's simple and obvious. > Wow! That is a massive improvement on my efforts!!!Many thanks to all that helped, it's been very useful. Cheers, Jose -------------- next part -------------- An HTML attachment was scrubbed... URL: From stefan at sun.ac.za Wed Feb 11 15:03:33 2009 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Wed, 11 Feb 2009 22:03:33 +0200 Subject: [SciPy-user] Array selection help In-Reply-To: <20090211154115.67130@gmx.net> References: <20090210213600.123400@gmx.net> <9457e7c80902101414v56ea9319m7fbc70a13229ef9c@mail.gmail.com> <20090210225642.141110@gmx.net> <9457e7c80902102339wea512bcj773811ff524a2828@mail.gmail.com> <20090211135648.67960@gmx.net> <9457e7c80902110622t294e1b98hf4c2612e34a16fb6@mail.gmail.com> <20090211150315.141380@gmx.net> <9457e7c80902110726y3d2ca16bmcd42dbccf1f9c55c@mail.gmail.com> <20090211154115.67130@gmx.net> Message-ID: <9457e7c80902111203xf84ff89k6ddc06bfefab2ecf@mail.gmail.com> Hi Jose 2009/2/11 Jose Luis Gomez Dans : > Well, there can be quite a lot of them (~10000 at least), so it does take a long while. 
I was just wondering whether some numpy/scipy array Jedi trick might speed it up :) Since you have integer labels, you can make use of the following trick: In [54]: means = np.array([0.1, 0.2, 0.3]) In [55]: means[[1,1,0,1,0,0]] Out[55]: array([ 0.2, 0.2, 0.1, 0.2, 0.1, 0.1]) I implemented a solution using such a "translation table" (see attached). Regards St?fan -------------- next part -------------- A non-text attachment was scrubbed... Name: translate_labels.py Type: application/octet-stream Size: 1090 bytes Desc: not available URL: From stefan at sun.ac.za Wed Feb 11 15:08:03 2009 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Wed, 11 Feb 2009 22:08:03 +0200 Subject: [SciPy-user] Array selection help In-Reply-To: <9457e7c80902111203xf84ff89k6ddc06bfefab2ecf@mail.gmail.com> References: <20090210213600.123400@gmx.net> <9457e7c80902101414v56ea9319m7fbc70a13229ef9c@mail.gmail.com> <20090210225642.141110@gmx.net> <9457e7c80902102339wea512bcj773811ff524a2828@mail.gmail.com> <20090211135648.67960@gmx.net> <9457e7c80902110622t294e1b98hf4c2612e34a16fb6@mail.gmail.com> <20090211150315.141380@gmx.net> <9457e7c80902110726y3d2ca16bmcd42dbccf1f9c55c@mail.gmail.com> <20090211154115.67130@gmx.net> <9457e7c80902111203xf84ff89k6ddc06bfefab2ecf@mail.gmail.com> Message-ID: <9457e7c80902111208g5116ac73odcc1fdefa45c6d73@mail.gmail.com> 2009/2/11 St?fan van der Walt : > In [55]: means[[1,1,0,1,0,0]] > Out[55]: array([ 0.2, 0.2, 0.1, 0.2, 0.1, 0.1]) > > I implemented a solution using such a "translation table" (see attached). Note that, for this approach to work, the labels must progress in increments of one from 0 to N. So labels 0, 1, 2, 3 are fine, but 0, 5, 10 are not. Cheers St?fan From josef.pktd at gmail.com Wed Feb 11 15:49:29 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 11 Feb 2009 15:49:29 -0500 Subject: [SciPy-user] Array selection help In-Reply-To: <9457e7c80902111208g5116ac73odcc1fdefa45c6d73@mail.gmail.com> References: <20090210213600.123400@gmx.net> <20090210225642.141110@gmx.net> <9457e7c80902102339wea512bcj773811ff524a2828@mail.gmail.com> <20090211135648.67960@gmx.net> <9457e7c80902110622t294e1b98hf4c2612e34a16fb6@mail.gmail.com> <20090211150315.141380@gmx.net> <9457e7c80902110726y3d2ca16bmcd42dbccf1f9c55c@mail.gmail.com> <20090211154115.67130@gmx.net> <9457e7c80902111203xf84ff89k6ddc06bfefab2ecf@mail.gmail.com> <9457e7c80902111208g5116ac73odcc1fdefa45c6d73@mail.gmail.com> Message-ID: <1cd32cbb0902111249r8df5162m3cca1567d9f4f16f@mail.gmail.com> On Wed, Feb 11, 2009 at 3:08 PM, St?fan van der Walt wrote: > 2009/2/11 St?fan van der Walt : >> In [55]: means[[1,1,0,1,0,0]] >> Out[55]: array([ 0.2, 0.2, 0.1, 0.2, 0.1, 0.1]) >> >> I implemented a solution using such a "translation table" (see attached). > > Note that, for this approach to work, the labels must progress in > increments of one from 0 to N. So labels 0, 1, 2, 3 are fine, but 0, > 5, 10 are not. I just checked that ndimage can handle non-existing labels: >>> ndimage.mean(5.0-np.arange(5), labels=np.arange(1,10,2), index=np.arange(10)) [0.0, 5.0, 0.0, 4.0, 0.0, 3.0, 0.0, 2.0, 0.0, 1.0] So your translation table should work if you replace labels_unique by range(max(labels)) in ndimage.mean(...). I tried some basic example but I didn't really test it. This would work then as long as labels are positive integers. 
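np.bincount is another numpy-only route with the same restriction to nonnegative integer labels -- it is not used in this thread, just a sketch for comparison:

import numpy as np

labels = np.random.randint(0, 10000, size=100000)   # nonnegative integer labels
values = 10000.0 - labels                            # observations

counts = np.bincount(labels)                  # observations per label, length max(labels)+1
sums = np.bincount(labels, weights=values)    # per-label sums
means = sums / np.maximum(counts, 1)          # guard against labels that never occur
filled = means[labels]                        # each observation replaced by its class mean

For labels that do occur, means agrees with the ndimage.mean translation table.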
Josef example: >>> tt = ndimage.mean(5.0-np.arange(5), labels=np.arange(1,10,2), index=np.arange(10)) >>> tt [0.0, 5.0, 0.0, 4.0, 0.0, 3.0, 0.0, 2.0, 0.0, 1.0] >>> np.array(tt)[np.arange(1,10,2)] array([ 5., 4., 3., 2., 1.]) >>> From josef.pktd at gmail.com Wed Feb 11 16:31:31 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 11 Feb 2009 16:31:31 -0500 Subject: [SciPy-user] Array selection help In-Reply-To: <1cd32cbb0902111249r8df5162m3cca1567d9f4f16f@mail.gmail.com> References: <20090210213600.123400@gmx.net> <9457e7c80902102339wea512bcj773811ff524a2828@mail.gmail.com> <20090211135648.67960@gmx.net> <9457e7c80902110622t294e1b98hf4c2612e34a16fb6@mail.gmail.com> <20090211150315.141380@gmx.net> <9457e7c80902110726y3d2ca16bmcd42dbccf1f9c55c@mail.gmail.com> <20090211154115.67130@gmx.net> <9457e7c80902111203xf84ff89k6ddc06bfefab2ecf@mail.gmail.com> <9457e7c80902111208g5116ac73odcc1fdefa45c6d73@mail.gmail.com> <1cd32cbb0902111249r8df5162m3cca1567d9f4f16f@mail.gmail.com> Message-ID: <1cd32cbb0902111331s666ea70co89846f810d4fc3a0@mail.gmail.com> On Wed, Feb 11, 2009 at 3:49 PM, wrote: > On Wed, Feb 11, 2009 at 3:08 PM, St?fan van der Walt wrote: >> 2009/2/11 St?fan van der Walt : >>> In [55]: means[[1,1,0,1,0,0]] >>> Out[55]: array([ 0.2, 0.2, 0.1, 0.2, 0.1, 0.1]) >>> >>> I implemented a solution using such a "translation table" (see attached). >> >> Note that, for this approach to work, the labels must progress in >> increments of one from 0 to N. So labels 0, 1, 2, 3 are fine, but 0, >> 5, 10 are not. > > I just checked that ndimage can handle non-existing labels: translation table version, another 10 times faster makes some missing labels >>> np.unique(arr1).shape (9998,) >>> np.unique(arr1)[:10] array([ 0, 1, 2, 3, 4, 7, 8, 9, 10, 11]) labelmeanfilter 0.383765171272 labelmeanfilter1 0.0916504471937 labelmeanfilter2 0.377427047292 labelmeanfilter3 0.00886087477598 # version with translation table with missing labels >>> np.all(arr3_3 == arr3_0) True def labelmeanfilter3(arr1, arr2): # requires integer labels labelsunique = np.arange(np.max(arr1)+1) labelmeans = np.array(ndimage.mean(arr2, labels=arr1, index=labelsunique)) arr3 = labelmeans[arr1] return arr3 I think we started out with more than 20 seconds. Josef From Juergen.Herrmann at XLhost.de Wed Feb 11 17:08:15 2009 From: Juergen.Herrmann at XLhost.de (=?iso8859-1?Q?J=FCrgen_Herrmann?=) Date: Wed, 11 Feb 2009 23:08:15 +0100 (CET) Subject: [SciPy-user] speaker crossover gui app project needs help Message-ID: <7179793a32e83a5bdd275de6c1aa27f1.squirrel@xlhost.de> hi there! i'm currently coding on a python gui application, that will generate coefficients and config for brutefir, a software convolution engine ( http://www.ludd.luth.se/~torger/brutefir.html ). the application offers signal routing between different filters and should allow the design of multi-way crossovers for speakers. i'm totally new to dsp and came accross scipy, which really looks interesting to me, but i have to admit that i hardly understand what i'm doing right now. i have been playing around with signal.butter and signal.remez and integrated basic fr graph plotting for them. but i simply lack the mathematical background for going deeper atm. my wishlist for configurable filters (in order of priority) would be: - low/higpass with configurable -3db freq and slope (preferrably in db/octave) - shelving low/highpass with configurable freq. 
slope and gain - notch/gain with freq, gain and q settings - all outputs will have configurable delay and gain (already implemented) so if someone with knowledge on this stuff and some hours for discussion on this topic, feel free to chime in! a screenshot of the current working state can be found here: http://t5.by/pyjackfir/screens/screen01.png by the time i have at least one basic filter type going i will release this (gpled) project. best regards and i'm looking forward to your answers. j?rgen herrmann -- >> XLhost.de - eXperts in Linux hosting ? << XLhost.de GmbH J?rgen Herrmann, Gesch?ftsf?hrer Boelckestrasse 21, 93051 Regensburg, Germany Gesch?ftsf?hrer: Volker Geith, J?rgen Herrmann Registriert unter: HRB9918 Umsatzsteuer-Identifikationsnummer: DE245931218 Fon: +49 (0)700 XLHOSTDE [0700 95467833] Fax: +49 (0)700 XLHOSTDE [0700 95467833] WEB: http://www.XLhost.de IRC: #XLhost at irc.quakenet.org -- >> XLhost.de - eXperts in Linux hosting ? << XLhost.de GmbH J?rgen Herrmann, Gesch?ftsf?hrer Boelckestrasse 21, 93051 Regensburg, Germany Gesch?ftsf?hrer: Volker Geith, J?rgen Herrmann Registriert unter: HRB9918 Umsatzsteuer-Identifikationsnummer: DE245931218 Fon: +49 (0)700 XLHOSTDE [0700 95467833] Fax: +49 (0)700 XLHOSTDE [0700 95467833] WEB: http://www.XLhost.de IRC: #XLhost at irc.quakenet.org From timmichelsen at gmx-topmail.de Wed Feb 11 17:59:52 2009 From: timmichelsen at gmx-topmail.de (Tim Michelsen) Date: Wed, 11 Feb 2009 23:59:52 +0100 Subject: [SciPy-user] Numpy 1.2.1 and Scipy 0.7.0; Ubuntu packages In-Reply-To: <5b8d13220902110446x82e25a9ifb11bb563e468313@mail.gmail.com> References: <5b8d13220902110446x82e25a9ifb11bb563e468313@mail.gmail.com> Message-ID: > https://edge.launchpad.net/~scipy/+archive/ppa Thanks. There is also: http://linux.pythonxy.com/ubuntu/ From fperez.net at gmail.com Wed Feb 11 18:11:10 2009 From: fperez.net at gmail.com (Fernando Perez) Date: Wed, 11 Feb 2009 15:11:10 -0800 Subject: [SciPy-user] Numpy 1.2.1 and Scipy 0.7.0; Ubuntu packages In-Reply-To: <5b8d13220902110446x82e25a9ifb11bb563e468313@mail.gmail.com> References: <5b8d13220902110446x82e25a9ifb11bb563e468313@mail.gmail.com> Message-ID: On Wed, Feb 11, 2009 at 4:46 AM, David Cournapeau wrote: > Hi, > > I started to set up a PPA for scipy on launchpad, which enables to > build ubuntu packages for various distributions/architectures. The > link is there: > > https://edge.launchpad.net/~scipy/+archive/ppa Cool, thanks. Is it easy to provide also hardy packages, or does it require a lot of work on your part? Cheers, f From karl.young at ucsf.edu Wed Feb 11 18:31:08 2009 From: karl.young at ucsf.edu (Karl Young) Date: Wed, 11 Feb 2009 15:31:08 -0800 Subject: [SciPy-user] slice question In-Reply-To: References: <5b8d13220902110446x82e25a9ifb11bb563e468313@mail.gmail.com> Message-ID: <49935FBC.2080902@ucsf.edu> Sorry for the dumb question but I did search quite a bit before realizing it would only take a couple of keystrokes from an Illuminati to dispel my ignorance. I want to slice an array as: A[a:b,c:d,e:f,...] starting with two 1d arrays containing the lower and upper slice limits: B = [a,c,e,...] and C = [b,d,f,...] and I'd like to write a general expression for this given A,B,C (i.e. not specify the dimension) but I can't figure out how to turn B and C into a:b,c:d,e:f,... re indexing A - any thoughts ? 
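The building block is that a:b inside square brackets is just slice(a, b), and an index spanning several axes can be written as a tuple of slice objects -- a tiny sketch with made-up bounds:

import numpy as np

A = np.arange(2 * 3 * 4).reshape(2, 3, 4)
B = [0, 1, 1]     # lower limits, one entry per axis
C = [2, 3, 3]     # upper limits, one entry per axis

idx = tuple([slice(lo, hi) for lo, hi in zip(B, C)])
assert np.all(A[idx] == A[0:2, 1:3, 1:3])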
Thanks From robert.kern at gmail.com Wed Feb 11 18:34:35 2009 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 11 Feb 2009 17:34:35 -0600 Subject: [SciPy-user] slice question In-Reply-To: <49935FBC.2080902@ucsf.edu> References: <5b8d13220902110446x82e25a9ifb11bb563e468313@mail.gmail.com> <49935FBC.2080902@ucsf.edu> Message-ID: <3d375d730902111534s3dea3e1axf5f6bf57a6980b22@mail.gmail.com> On Wed, Feb 11, 2009 at 17:31, Karl Young wrote: > > Sorry for the dumb question but I did search quite a bit before > realizing it would only take a couple of keystrokes from an Illuminati > to dispel my ignorance. > > I want to slice an array as: > > A[a:b,c:d,e:f,...] > > starting with two 1d arrays containing the lower and upper slice limits: > > B = [a,c,e,...] and C = [b,d,f,...] > > and I'd like to write a general expression for this given A,B,C (i.e. > not specify the dimension) but I can't figure out how to turn B and C > into a:b,c:d,e:f,... re indexing A - any thoughts ? Thanks A[tuple([slice(b,c) for b,c in zip(B,C)])] -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From karl.young at ucsf.edu Wed Feb 11 18:40:05 2009 From: karl.young at ucsf.edu (Karl Young) Date: Wed, 11 Feb 2009 15:40:05 -0800 Subject: [SciPy-user] slice question In-Reply-To: <3d375d730902111534s3dea3e1axf5f6bf57a6980b22@mail.gmail.com> References: <5b8d13220902110446x82e25a9ifb11bb563e468313@mail.gmail.com> <49935FBC.2080902@ucsf.edu> <3d375d730902111534s3dea3e1axf5f6bf57a6980b22@mail.gmail.com> Message-ID: <499361D5.2090308@ucsf.edu> Thanks maestro ! (not quite as trivial as I thought it would be) > On Wed, Feb 11, 2009 at 17:31, Karl Young wrote: > >> Sorry for the dumb question but I did search quite a bit before >> realizing it would only take a couple of keystrokes from an Illuminati >> to dispel my ignorance. >> >> I want to slice an array as: >> >> A[a:b,c:d,e:f,...] >> >> starting with two 1d arrays containing the lower and upper slice limits: >> >> B = [a,c,e,...] and C = [b,d,f,...] >> >> and I'd like to write a general expression for this given A,B,C (i.e. >> not specify the dimension) but I can't figure out how to turn B and C >> into a:b,c:d,e:f,... re indexing A - any thoughts ? Thanks >> > > A[tuple([slice(b,c) for b,c in zip(B,C)])] > > From bernardo.rocha at meduni-graz.at Thu Feb 12 02:23:44 2009 From: bernardo.rocha at meduni-graz.at (Bernardo M. Rocha) Date: Thu, 12 Feb 2009 08:23:44 +0100 Subject: [SciPy-user] intersect (matlab) Message-ID: <4993CE80.20204@meduni-graz.at> Hi Guys, Is there an equivalent in scipy/numpy to the following MATLAB code??? Or is there a way to do the same and get this ia and ib? A = [1 2 3 6]; B = [1 2 3 4 6 10 20]; [c, ia, ib] = intersect(A, B); disp([c; ia; ib]) 1 2 3 6 1 2 3 4 1 2 3 5 Best regards. Bernardo M. Rocha From robert.kern at gmail.com Thu Feb 12 02:56:21 2009 From: robert.kern at gmail.com (Robert Kern) Date: Thu, 12 Feb 2009 01:56:21 -0600 Subject: [SciPy-user] intersect (matlab) In-Reply-To: <4993CE80.20204@meduni-graz.at> References: <4993CE80.20204@meduni-graz.at> Message-ID: <3d375d730902112356k65a9e993ue42337b56c32b3b@mail.gmail.com> On Thu, Feb 12, 2009 at 01:23, Bernardo M. Rocha wrote: > Hi Guys, > > Is there an equivalent in scipy/numpy to the following MATLAB code??? Or > is there a way to do the same and get this ia and ib? 
> > A = [1 2 3 6]; B = [1 2 3 4 6 10 20]; > [c, ia, ib] = intersect(A, B); > disp([c; ia; ib]) 1 2 3 6 > 1 2 3 4 > 1 2 3 5 In [40]: A = array([1, 2, 3, 6]) In [41]: B = array([1,2,3,4,6,10,20]) In [42]: c = intersect1d(A, B) In [43]: c Out[43]: array([1, 2, 3, 6]) In [46]: ma = setmember1d(A, B) In [47]: ma Out[47]: array([ True, True, True, True], dtype=bool) In [48]: ia = nonzero(ma)[0] In [49]: ia Out[49]: array([0, 1, 2, 3]) In [50]: mb = setmember1d(B, A) In [51]: mb Out[51]: array([ True, True, True, False, True, False, False], dtype=bool) In [52]: ib = nonzero(mb)[0] In [53]: ib Out[53]: array([0, 1, 2, 4]) -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From sturla at molden.no Thu Feb 12 09:53:12 2009 From: sturla at molden.no (Sturla Molden) Date: Thu, 12 Feb 2009 15:53:12 +0100 Subject: [SciPy-user] shared memory machines In-Reply-To: <20090211142422.GD19956@phare.normalesup.org> References: <51ab5afe7eb746fa60db99ef7635df9b.squirrel@webmail.uio.no> <20090210231356.GC9128@phare.normalesup.org> <20090211114620.GB19956@phare.normalesup.org> <4992C539.7020501@molden.no> <20090211133305.GC19956@phare.normalesup.org> <4992D7EA.5070404@molden.no> <20090211142422.GD19956@phare.normalesup.org> Message-ID: <499437D8.9080103@molden.no> On 2/11/2009 3:24 PM, Gael Varoquaux wrote: > I have put in some more print statements. I have the fealing that it is > not cleaned up. I am attaching my test code, and the modified > sharedmemory_sysv.pyx for debug. The output is the following: I have reproduced the same error in Windows. A suspected, it stems from using multiprocessing's Finalizer. For some reason it is never called. Also having the heap object as a class attribute of the BufferWrapper prevents clean-up with a __del__ method. Whereas having it as a global variable in the module works ok. The problem is not the Cython extension code, it is the array_heap.py module. I think we should not use a malloc at all. If a segment cannot be reused (Heap.free) until all other handles to it is closed. This is a bug in my code. The easiest solution is to remove the heap malloc all together. In that case, the allocator for small arrays will just reuse a shared segment until it is exhausted, and then discared it. Large arrays will get their own segment. Otherwise we have to do refcounting for open handles (manually in Windows) and we are back to thread-based cleanup in the creator. Sturla Molden From rmay31 at gmail.com Thu Feb 12 13:04:53 2009 From: rmay31 at gmail.com (Ryan May) Date: Thu, 12 Feb 2009 12:04:53 -0600 Subject: [SciPy-user] odeint for calculating trajectories Message-ID: Hi, Is there a good way to use scipy.integrate.odeint to calculate trajectories from an observed velocity field? I know you can do this when you have an analytic expression for dx/dt, but in this case I have a spatial grid of values for dx/dt. The only way I've come up with is to make the function passed to odeint something that will interpolate fromt the grid to the given point. Thanks, Ryan -- Ryan May Graduate Research Assistant School of Meteorology University of Oklahoma Sent from: Norman Oklahoma United States. -------------- next part -------------- An HTML attachment was scrubbed... 
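One bare-bones way to set up the interpolating callback described above is to sample the gridded field with ndimage.map_coordinates inside the function handed to odeint. Everything in the sketch below (grid, field, coordinate convention) is made up for illustration, and as the follow-ups point out, plain interpolation of the trajectory may serve better than an ODE solver here:

import numpy as np
from scipy import ndimage
from scipy.integrate import odeint

# made-up velocity field sampled on a regular grid, indexed as (row, col)
nrow, ncol = 50, 60
rows, cols = np.mgrid[0:nrow, 0:ncol]
v_row = np.sin(cols / 15.0)      # d(row)/dt on the grid
v_col = np.cos(rows / 10.0)      # d(col)/dt on the grid

def velocity(pos, t):
    # bilinear interpolation of the gridded velocities at the current point;
    # mode='nearest' keeps the field defined just outside the grid
    coords = np.array([[pos[0]], [pos[1]]])
    dr = ndimage.map_coordinates(v_row, coords, order=1, mode='nearest')[0]
    dc = ndimage.map_coordinates(v_col, coords, order=1, mode='nearest')[0]
    return [dr, dc]

t = np.linspace(0.0, 20.0, 200)
traj = odeint(velocity, [25.0, 5.0], t)   # trajectory in (row, col) coordinates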
URL: From rob.clewley at gmail.com Thu Feb 12 13:16:22 2009 From: rob.clewley at gmail.com (Rob Clewley) Date: Thu, 12 Feb 2009 13:16:22 -0500 Subject: [SciPy-user] odeint for calculating trajectories In-Reply-To: References: Message-ID: > Is there a good way to use scipy.integrate.odeint to calculate trajectories > from an observed velocity field? I know you can do this when you have an > analytic expression for dx/dt, but in this case I have a spatial grid of > values for dx/dt. The only way I've come up with is to make the function > passed to odeint something that will interpolate fromt the grid to the given > point. I don't think odeint is the right tool for this job - there is no ODE integration to do if you do not have an explicit function for the vector field. You should think of it purely as an interpolation problem. You have (t,x) values and (t, dx/dt) values, so this defines a piecewise quadratic function which has continuous *second* derivative everywhere (i.e. the trajectory smoothly agrees at your mesh points). I would use the polynomial interpolation classes that were recently added to scipy by Anne Archibald (search this list for details about it). You pass it your arrays of values and you get back a function that smoothly interpolates through your points. This is the most accurate trajectory that you can derive from this finite mesh vector-field. -Rob From peridot.faceted at gmail.com Thu Feb 12 16:28:11 2009 From: peridot.faceted at gmail.com (Anne Archibald) Date: Thu, 12 Feb 2009 16:28:11 -0500 Subject: [SciPy-user] odeint for calculating trajectories In-Reply-To: References: Message-ID: 2009/2/12 Rob Clewley : >> Is there a good way to use scipy.integrate.odeint to calculate trajectories >> from an observed velocity field? I know you can do this when you have an >> analytic expression for dx/dt, but in this case I have a spatial grid of >> values for dx/dt. The only way I've come up with is to make the function >> passed to odeint something that will interpolate fromt the grid to the given >> point. > > > I don't think odeint is the right tool for this job - there is no ODE > integration to do if you do not have an explicit function for the > vector field. You should think of it purely as an interpolation > problem. You have (t,x) values and (t, dx/dt) values, so this defines > a piecewise quadratic function which has continuous *second* > derivative everywhere (i.e. the trajectory smoothly agrees at your > mesh points). I would use the polynomial interpolation classes that > were recently added to scipy by Anne Archibald (search this list for > details about it). You pass it your arrays of values and you get back > a function that smoothly interpolates through your points. This is the > most accurate trajectory that you can derive from this finite mesh > vector-field. Put another way, odeint is full of cleverness to figure out how fast the derivative is changing, but that is of no use to you here (unless you have an extremely high-resolution, slowly-changing vector field). So an ode solver that walks along the grid is about as good as you can do. There is actually cython code to do exactly this in the scikit vectorplot, which uses it to implement line integral convolution. If you want fast and dirty trajectories, you may be interested in modifying that code. Using a piecewise polynomial will certainly give you a smoother trajectory, though. Anne From R.Springuel at umit.maine.edu Thu Feb 12 18:06:17 2009 From: R.Springuel at umit.maine.edu (R. 
Padraic Springuel) Date: Thu, 12 Feb 2009 18:06:17 -0500 Subject: [SciPy-user] isnotnan Message-ID: <4994AB69.7000304@umit.maine.edu> Is there a isnotnan function somewhere in the numpy or scipy library that functions similarly to isnan (except that the results are reversed)? -- R. Padraic Springuel Research Assistant Department of Physics and Astronomy University of Maine Bennett 309 Office Hours: By appointment only -------------- next part -------------- A non-text attachment was scrubbed... Name: smime.p7s Type: application/x-pkcs7-signature Size: 5627 bytes Desc: S/MIME Cryptographic Signature URL: From c-b at asu.edu Thu Feb 12 18:37:15 2009 From: c-b at asu.edu (Christopher Brown) Date: Thu, 12 Feb 2009 16:37:15 -0700 Subject: [SciPy-user] isnotnan In-Reply-To: <4994AB69.7000304@umit.maine.edu> References: <4994AB69.7000304@umit.maine.edu> Message-ID: <4994B2AB.3040005@asu.edu> Hi Padraic, PS> Is there a isnotnan function somewhere in the numpy or scipy library PS> that functions similarly to isnan (except that the results are PS> reversed)? I don't understand. Will 'not numpy.isnan' not work? -- Chris From pgmdevlist at gmail.com Thu Feb 12 18:37:37 2009 From: pgmdevlist at gmail.com (Pierre GM) Date: Thu, 12 Feb 2009 18:37:37 -0500 Subject: [SciPy-user] isnotnan In-Reply-To: <4994AB69.7000304@umit.maine.edu> References: <4994AB69.7000304@umit.maine.edu> Message-ID: On Feb 12, 2009, at 6:06 PM, R. Padraic Springuel wrote: > Is there a isnotnan function somewhere in the numpy or scipy library > that functions similarly to isnan (except that the results are > reversed)? Like np.logical_not(np.isnan(...)) ? From pgmdevlist at gmail.com Thu Feb 12 18:41:38 2009 From: pgmdevlist at gmail.com (Pierre GM) Date: Thu, 12 Feb 2009 18:41:38 -0500 Subject: [SciPy-user] isnotnan In-Reply-To: <4994B2AB.3040005@asu.edu> References: <4994AB69.7000304@umit.maine.edu> <4994B2AB.3040005@asu.edu> Message-ID: <52F18680-6E58-4274-9BDB-F09BB493A3E0@gmail.com> On Feb 12, 2009, at 6:37 PM, Christopher Brown wrote: > Hi Padraic, > > PS> Is there a isnotnan function somewhere in the numpy or scipy > library > PS> that functions similarly to isnan (except that the results are > PS> reversed)? > > I don't understand. Will 'not numpy.isnan' not work? Can't work: "not" works on booleans, not on arrays, and np.isnan returns a ndarray of booleans. You end up raising a ValueError exception: >>> x = np.array([1,np.nan,3.]) >>> not np.isnan(x) ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all() Just use np.logical_not. From robert.kern at gmail.com Thu Feb 12 18:47:13 2009 From: robert.kern at gmail.com (Robert Kern) Date: Thu, 12 Feb 2009 17:47:13 -0600 Subject: [SciPy-user] isnotnan In-Reply-To: <4994AB69.7000304@umit.maine.edu> References: <4994AB69.7000304@umit.maine.edu> Message-ID: <3d375d730902121547j23a54345kd87b6a8ea5a8b5ff@mail.gmail.com> On Thu, Feb 12, 2009 at 17:06, R. Padraic Springuel wrote: > Is there a isnotnan function somewhere in the numpy or scipy library that > functions similarly to isnan (except that the results are reversed)? ~isnan(x) -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." 
-- Umberto Eco From yelvergm at gmail.com Thu Feb 12 20:39:07 2009 From: yelvergm at gmail.com (yelver huang) Date: Thu, 12 Feb 2009 20:39:07 -0500 Subject: [SciPy-user] Help on installation of scipy Message-ID: Hi all, I have encountered some problems on installing scipy on Windows, though it's quite simple. There is no fault for me to import numpy from Python25, but after I install the binary file of scipy, I could not import it. The error it shows is: Warning (from warnings module): File "G:\Python25\lib\site-packages\scipy\__init__.py", line 30 UserWarning) UserWarning: Numpy 1.2.0 or above is recommended for this version of scipy (detected version 1.0.4) Traceback (most recent call last): File "", line 1, in import scipy File "G:\Python25\Lib\site-packages\scipy\__init__.py", line 75, in from numpy.testing import Tester ImportError: cannot import name Tester The version of numpy in my computer is 1.2.1, I could not understand this message. Hope someone could help me. Thanks, Tao -------------- next part -------------- An HTML attachment was scrubbed... URL: From david at ar.media.kyoto-u.ac.jp Thu Feb 12 20:33:50 2009 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Fri, 13 Feb 2009 10:33:50 +0900 Subject: [SciPy-user] Help on installation of scipy In-Reply-To: References: Message-ID: <4994CDFE.8060706@ar.media.kyoto-u.ac.jp> Hi Tao, yelver huang wrote: > Hi all, > > I have encountered some problems on installing scipy on Windows, > though it's quite simple. There is no fault for me to import numpy > from Python25, but after I install the binary file of scipy, I could > not import it. The error it shows is: First, what does the following command tell you: python -c "import numpy; print numpy.version.version; print numpy.__file__" This gives both versions and numpy package location (I suspect that you have several numpy installs, and that scipy does not pick up the one you expect). How did you install numpy and scipy ? Did you use easy_install ? cheers, David From yelvergm at gmail.com Thu Feb 12 21:21:33 2009 From: yelvergm at gmail.com (yelver huang) Date: Thu, 12 Feb 2009 21:21:33 -0500 Subject: [SciPy-user] Help on installation of scipy In-Reply-To: <4994CDFE.8060706@ar.media.kyoto-u.ac.jp> References: <4994CDFE.8060706@ar.media.kyoto-u.ac.jp> Message-ID: Hi David, Thank you so much for you help. I have execute the command you suggest and the result is as you expected: the version is 1.0.4, and the result of 'print numpy.__file__' is: 'G:\Program Files\MGLTools 1.5.2\MGLToolsPckgs\numpy\__init__.pyc' This relates to my previously installed software Autodock, so can you give me some suggestion on what I can do further in order to implement numpy and scipy? Cheers, Tao On Thu, Feb 12, 2009 at 8:33 PM, David Cournapeau < david at ar.media.kyoto-u.ac.jp> wrote: > Hi Tao, > > yelver huang wrote: > > Hi all, > > > > I have encountered some problems on installing scipy on Windows, > > though it's quite simple. There is no fault for me to import numpy > > from Python25, but after I install the binary file of scipy, I could > > not import it. The error it shows is: > > First, what does the following command tell you: > > python -c "import numpy; print numpy.version.version; print numpy.__file__" > > This gives both versions and numpy package location (I suspect that you > have several numpy installs, and that scipy does not pick up the one you > expect). How did you install numpy and scipy ? Did you use easy_install ? 
> > cheers, > > David > _______________________________________________ > SciPy-user mailing list > SciPy-user at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-user > -- Tao-wei Huang Department of Chemistry and Chemical Biology Rensselaer Polytechnic Institute Phone: 518-275-7997 -------------- next part -------------- An HTML attachment was scrubbed... URL: From david at ar.media.kyoto-u.ac.jp Thu Feb 12 21:19:53 2009 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Fri, 13 Feb 2009 11:19:53 +0900 Subject: [SciPy-user] Help on installation of scipy In-Reply-To: References: <4994CDFE.8060706@ar.media.kyoto-u.ac.jp> Message-ID: <4994D8C9.4020503@ar.media.kyoto-u.ac.jp> yelver huang wrote: > Hi David, Thank you so much for you help. I have execute the command > you suggest and the result is as you expected: the version is 1.0.4, > and the result of 'print numpy.__file__' is: 'G:\Program > Files\MGLTools 1.5.2\MGLToolsPckgs\numpy\__init__.pyc' > > This relates to my previously installed software Autodock, so can you > give me some suggestion on what I can do further in order to implement > numpy and scipy? I don't know autodock, but I would guess that it adds MGLToolsPckgs to your PYTHONPATH environment variable. Ideally, if autodock uses its own local python packages, it should keep them private - if you put C:\Python25\libs\site-packages in front of your PYTHONPATH, I am afraid it will break autodock. That may be an issue worth discussing with autodock people, David From bsouthey at gmail.com Thu Feb 12 21:38:35 2009 From: bsouthey at gmail.com (Bruce Southey) Date: Thu, 12 Feb 2009 20:38:35 -0600 Subject: [SciPy-user] isnotnan In-Reply-To: <3d375d730902121547j23a54345kd87b6a8ea5a8b5ff@mail.gmail.com> References: <4994AB69.7000304@umit.maine.edu> <3d375d730902121547j23a54345kd87b6a8ea5a8b5ff@mail.gmail.com> Message-ID: Hi, Thanks to the Doc Marathon! Perhaps numpy.isfinite() is what you want: http://docs.scipy.org/doc/numpy/reference/generated/numpy.isfinite.html It is even linked from numpy.isnan() page http://docs.scipy.org/doc/numpy/reference/generated/numpy.isnan.html Bruce On Thu, Feb 12, 2009 at 5:47 PM, Robert Kern wrote: > On Thu, Feb 12, 2009 at 17:06, R. Padraic Springuel > wrote: >> Is there a isnotnan function somewhere in the numpy or scipy library that >> functions similarly to isnan (except that the results are reversed)? > > ~isnan(x) > > -- > Robert Kern > > "I have come to believe that the whole world is an enigma, a harmless > enigma that is made terrible by our own mad attempt to interpret it as > though it had an underlying truth." > -- Umberto Eco > _______________________________________________ > SciPy-user mailing list > SciPy-user at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-user > From wesmckinn at gmail.com Thu Feb 12 21:45:00 2009 From: wesmckinn at gmail.com (Wes McKinney) Date: Thu, 12 Feb 2009 21:45:00 -0500 Subject: [SciPy-user] isnotnan In-Reply-To: References: <4994AB69.7000304@umit.maine.edu> <3d375d730902121547j23a54345kd87b6a8ea5a8b5ff@mail.gmail.com> Message-ID: isfinite(x) is faster than -isnan(x), and it also gets INFs, so definitely the way to go. On Feb 12, 2009, at 9:38 PM, Bruce Southey wrote: > Hi, > Thanks to the Doc Marathon! 
> > Perhaps numpy.isfinite() is what you want: > http://docs.scipy.org/doc/numpy/reference/generated/ > numpy.isfinite.html > > It is even linked from numpy.isnan() page > http://docs.scipy.org/doc/numpy/reference/generated/numpy.isnan.html > > Bruce > > > > On Thu, Feb 12, 2009 at 5:47 PM, Robert Kern > wrote: >> On Thu, Feb 12, 2009 at 17:06, R. Padraic Springuel >> wrote: >>> Is there a isnotnan function somewhere in the numpy or scipy >>> library that >>> functions similarly to isnan (except that the results are reversed)? >> >> ~isnan(x) >> >> -- >> Robert Kern >> >> "I have come to believe that the whole world is an enigma, a harmless >> enigma that is made terrible by our own mad attempt to interpret >> it as >> though it had an underlying truth." >> -- Umberto Eco >> _______________________________________________ >> SciPy-user mailing list >> SciPy-user at scipy.org >> http://projects.scipy.org/mailman/listinfo/scipy-user >> > _______________________________________________ > SciPy-user mailing list > SciPy-user at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-user From david at ar.media.kyoto-u.ac.jp Thu Feb 12 21:30:05 2009 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Fri, 13 Feb 2009 11:30:05 +0900 Subject: [SciPy-user] isnotnan In-Reply-To: References: <4994AB69.7000304@umit.maine.edu> <3d375d730902121547j23a54345kd87b6a8ea5a8b5ff@mail.gmail.com> Message-ID: <4994DB2D.6010004@ar.media.kyoto-u.ac.jp> Bruce Southey wrote: > Hi, > Thanks to the Doc Marathon! > > Perhaps numpy.isfinite() is what you want: > http://docs.scipy.org/doc/numpy/reference/generated/numpy.isfinite.html > isfinite may be acceptable for the OP, but !isnan is not the same as isfinite. cheers, David From yelvergm at gmail.com Thu Feb 12 21:48:11 2009 From: yelvergm at gmail.com (yelver huang) Date: Thu, 12 Feb 2009 21:48:11 -0500 Subject: [SciPy-user] Help on installation of scipy In-Reply-To: <4994D8C9.4020503@ar.media.kyoto-u.ac.jp> References: <4994CDFE.8060706@ar.media.kyoto-u.ac.jp> <4994D8C9.4020503@ar.media.kyoto-u.ac.jp> Message-ID: Hi David, Thank you for your suggestion, I have uninstall the software of Autodock, which I might not use at present. And now there is no error after I import scipy. Hopefully I can start learn scipy now. Cheers, Tao On Thu, Feb 12, 2009 at 9:19 PM, David Cournapeau < david at ar.media.kyoto-u.ac.jp> wrote: > yelver huang wrote: > > Hi David, Thank you so much for you help. I have execute the command > > you suggest and the result is as you expected: the version is 1.0.4, > > and the result of 'print numpy.__file__' is: 'G:\Program > > Files\MGLTools 1.5.2\MGLToolsPckgs\numpy\__init__.pyc' > > > > This relates to my previously installed software Autodock, so can you > > give me some suggestion on what I can do further in order to implement > > numpy and scipy? > > I don't know autodock, but I would guess that it adds MGLToolsPckgs to > your PYTHONPATH environment variable. Ideally, if autodock uses its own > local python packages, it should keep them private - if you put > C:\Python25\libs\site-packages in front of your PYTHONPATH, I am afraid > it will break autodock. 
That may be an issue worth discussing with > autodock people, > > David > _______________________________________________ > SciPy-user mailing list > SciPy-user at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-user > -- Tao-wei Huang Department of Chemistry and Chemical Biology Rensselaer Polytechnic Institute Phone: 518-275-7997 -------------- next part -------------- An HTML attachment was scrubbed... URL: From rmay31 at gmail.com Thu Feb 12 21:50:41 2009 From: rmay31 at gmail.com (Ryan May) Date: Thu, 12 Feb 2009 20:50:41 -0600 Subject: [SciPy-user] scipy.ndimage.gaussian_filter for masked data? Message-ID: Hi, I have a 2D grid of spatial data that I wanted to smooth just using a simple gaussian filter. My grid comes from observational data, so there are some points that are masked. Is there something similar to scipy.ndimage.gaussian_filter anywhere that will work with masked points? Thanks in advance, Ryan -- Ryan May Graduate Research Assistant School of Meteorology University of Oklahoma -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsouthey at gmail.com Thu Feb 12 22:11:49 2009 From: bsouthey at gmail.com (Bruce Southey) Date: Thu, 12 Feb 2009 21:11:49 -0600 Subject: [SciPy-user] isnotnan In-Reply-To: <4994DB2D.6010004@ar.media.kyoto-u.ac.jp> References: <4994AB69.7000304@umit.maine.edu> <3d375d730902121547j23a54345kd87b6a8ea5a8b5ff@mail.gmail.com> <4994DB2D.6010004@ar.media.kyoto-u.ac.jp> Message-ID: On Thu, Feb 12, 2009 at 8:30 PM, David Cournapeau wrote: > Bruce Southey wrote: >> Hi, >> Thanks to the Doc Marathon! >> >> Perhaps numpy.isfinite() is what you want: >> http://docs.scipy.org/doc/numpy/reference/generated/numpy.isfinite.html >> > > isfinite may be acceptable for the OP, but !isnan is not the same as > isfinite. > > cheers, > > David > _______________________________________________ > SciPy-user mailing list > SciPy-user at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-user > Yes, isfinite is not the opposite of isnan unless there are no positive or negative infinity elements. Bruce From josef.pktd at gmail.com Thu Feb 12 22:16:26 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 12 Feb 2009 22:16:26 -0500 Subject: [SciPy-user] incorrect variance in optimize.curvefit and leastsq Message-ID: <1cd32cbb0902121916q60a08be6ka027b5873a3239da@mail.gmail.com> I just saw the new optimize.curvefit which provides a wrapper around optimize.leastsq optimize.leastsq provides the raw covariance matrix (cov_x). As I mentioned once on the mailing list, this is not the covariance matrix of the parameter estimates. To get those, the raw covariance matrix has to be multiplied by the standard error of the residual. So, the docstring in optimize.curvefit doesn't correspond to the actual calculation. I'm preparing a test against an example from the NIST certified cases: http://www.itl.nist.gov/div898/strd/nls/data/misra1b.shtml >>> SSE=np.sum((y-yh)**2) difference in standard deviation of the parameter estimates compared to NIST: >>> np.sqrt(SSE/12.0)*np.sqrt(np.diag(pcov))[0] - 3.1643950207E+00 2.4865573906573957e-006 >>> np.sqrt(SSE/12.0)*np.sqrt(np.diag(pcov))[1] - 4.2547321834E-06 3.6275492099440408e-012 The first parameter is not exactly high precision. The second problem is that, in weighted least squares, the calculation of the standard deviation of the parameter estimates has to take the weights into account. (But I don't have the formulas right now.) 
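For the unweighted case, the rescaling being described is just the residual variance times the raw cov_x; a compact sketch applied directly to what optimize.leastsq returns (the toy model and data are made up):

import numpy as np
from scipy import optimize

x = np.linspace(0.0, 4.0, 50)
y = 2.5 * np.exp(-1.3 * x) + 0.05 * np.random.randn(50)

def residuals(p):
    return y - p[0] * np.exp(p[1] * x)

popt, cov_x, info, mesg, ier = optimize.leastsq(residuals, [1.0, -1.0],
                                                full_output=True)
# rescale the raw covariance by the residual variance to get the
# covariance of the parameter estimates
sse = np.sum(residuals(popt) ** 2)
s2 = sse / (len(y) - len(popt))
cov_beta = s2 * cov_x
perr = np.sqrt(np.diag(cov_beta))   # standard errors of the estimates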
I was looking at this to provide a general non-linear least squares class in stats. But for several calculation, the Jacobian would be necessary. optimize.leastsq only provides cov_x, but I was wondering whether the Jacobian can be calculated from the return of the minpack functions in optimize.leastsq, but I didn't have time to figure this out. Josef From c-b at asu.edu Thu Feb 12 22:22:19 2009 From: c-b at asu.edu (Christopher Brown) Date: Thu, 12 Feb 2009 20:22:19 -0700 Subject: [SciPy-user] isnotnan In-Reply-To: <4994DB2D.6010004@ar.media.kyoto-u.ac.jp> References: <4994AB69.7000304@umit.maine.edu> <3d375d730902121547j23a54345kd87b6a8ea5a8b5ff@mail.gmail.com> <4994DB2D.6010004@ar.media.kyoto-u.ac.jp> Message-ID: <4994E76B.4060906@asu.edu> David Cournapeau wrote: > isfinite may be acceptable for the OP, but !isnan is not the same as > isfinite. So, what your saying is, not is not a number is not is infinite. Got it. :) From david at ar.media.kyoto-u.ac.jp Thu Feb 12 22:22:40 2009 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Fri, 13 Feb 2009 12:22:40 +0900 Subject: [SciPy-user] isnotnan In-Reply-To: <4994E76B.4060906@asu.edu> References: <4994AB69.7000304@umit.maine.edu> <3d375d730902121547j23a54345kd87b6a8ea5a8b5ff@mail.gmail.com> <4994DB2D.6010004@ar.media.kyoto-u.ac.jp> <4994E76B.4060906@asu.edu> Message-ID: <4994E780.8090403@ar.media.kyoto-u.ac.jp> Christopher Brown wrote: > David Cournapeau wrote: > >> isfinite may be acceptable for the OP, but !isnan is not the same as >> isfinite. >> > > So, what your saying is, not is not a number is not is infinite. Got it. :) > Or more simply, that inf is a number but not finite :) David From oliphant at enthought.com Thu Feb 12 23:09:19 2009 From: oliphant at enthought.com (Travis E. Oliphant) Date: Thu, 12 Feb 2009 23:09:19 -0500 Subject: [SciPy-user] incorrect variance in optimize.curvefit and leastsq In-Reply-To: <1cd32cbb0902121916q60a08be6ka027b5873a3239da@mail.gmail.com> References: <1cd32cbb0902121916q60a08be6ka027b5873a3239da@mail.gmail.com> Message-ID: <4994F26F.4050808@enthought.com> josef.pktd at gmail.com wrote: > I just saw the new optimize.curvefit which provides a wrapper around > optimize.leastsq > > optimize.leastsq provides the raw covariance matrix (cov_x). As I > mentioned once on the mailing list, this is not the covariance matrix > of the parameter estimates. To get those, the raw covariance matrix > has to be multiplied by the standard error of the residual. So, the > docstring in optimize.curvefit doesn't correspond to the actual > calculation. > Thank you for the clarification. I had forgotten your earlier valid concerns. Help fixing the docstring is appreciated. If you can figure out how to improve the code, that is even better. I think it is good to at least report the cov, but the docstring should not mislead. > > The first parameter is not exactly high precision. > > The second problem is that, in weighted least squares, the calculation > of the standard deviation of the parameter estimates has to take the > weights into account. (But I don't have the formulas right now.) > > I was looking at this to provide a general non-linear least squares > class in stats. But for several calculation, the Jacobian would be > necessary. optimize.leastsq only provides cov_x, but I was wondering > whether the Jacobian can be calculated from the return of the minpack > functions in optimize.leastsq, but I didn't have time to figure this > out. > I'm not sure, but it might be. 
I would love to spend time on this, but don't have it. If somebody else can pick up, that would be great. -Travis -- Travis Oliphant Enthought, Inc. (512) 536-1057 (office) (512) 536-1059 (fax) http://www.enthought.com oliphant at enthought.com From oliphant at enthought.com Thu Feb 12 23:56:20 2009 From: oliphant at enthought.com (Travis E. Oliphant) Date: Thu, 12 Feb 2009 23:56:20 -0500 Subject: [SciPy-user] incorrect variance in optimize.curvefit and leastsq In-Reply-To: <4994F26F.4050808@enthought.com> References: <1cd32cbb0902121916q60a08be6ka027b5873a3239da@mail.gmail.com> <4994F26F.4050808@enthought.com> Message-ID: <4994FD74.9080305@enthought.com> Travis E. Oliphant wrote: > josef.pktd at gmail.com wrote: > >> I just saw the new optimize.curvefit which provides a wrapper around >> optimize.leastsq >> >> optimize.leastsq provides the raw covariance matrix (cov_x). As I >> mentioned once on the mailing list, this is not the covariance matrix >> of the parameter estimates. To get those, the raw covariance matrix >> has to be multiplied by the standard error of the residual. So, the >> docstring in optimize.curvefit doesn't correspond to the actual >> calculation. >> >> > Thank you for the clarification. I had forgotten your earlier valid > concerns. Help fixing the docstring is appreciated. If you can > figure out how to improve the code, that is even better. I think it is > good to at least report the cov, but the docstring should not mislead. > >> The first parameter is not exactly high precision. >> >> The second problem is that, in weighted least squares, the calculation >> of the standard deviation of the parameter estimates has to take the >> weights into account. (But I don't have the formulas right now.) >> >> I was looking at this to provide a general non-linear least squares >> class in stats. But for several calculation, the Jacobian would be >> necessary. optimize.leastsq only provides cov_x, but I was wondering >> whether the Jacobian can be calculated from the return of the minpack >> functions in optimize.leastsq, but I didn't have time to figure this >> out. >> >> > I'm not sure, but it might be. I would love to spend time on this, > but don't have it. If somebody else can pick up, that would be great. O.K. So my desire to spend time on it outweighed my wisdom, and I went ahead and looked at the reference linked-to and multipled by the necessary scale factor. I fixed the documentation in leastsq as well. A sanity check on my work would be appreciated. I divided by the sum of the weights squared for the weighted case. I'm not sure if this is correct, but it's probably close. When someone can verify the formula that would be great. Adding a check against the test case referred-to would be great. -Travis -- Travis Oliphant Enthought, Inc. (512) 536-1057 (office) (512) 536-1059 (fax) http://www.enthought.com oliphant at enthought.com From josef.pktd at gmail.com Thu Feb 12 23:57:58 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 12 Feb 2009 23:57:58 -0500 Subject: [SciPy-user] incorrect variance in optimize.curvefit and leastsq In-Reply-To: <4994F26F.4050808@enthought.com> References: <1cd32cbb0902121916q60a08be6ka027b5873a3239da@mail.gmail.com> <4994F26F.4050808@enthought.com> Message-ID: <1cd32cbb0902122057x1fb8ac51l74b6988ec98a8f51@mail.gmail.com> On Thu, Feb 12, 2009 at 11:09 PM, Travis E. 
Oliphant wrote: > josef.pktd at gmail.com wrote: >> I just saw the new optimize.curvefit which provides a wrapper around >> optimize.leastsq >> >> optimize.leastsq provides the raw covariance matrix (cov_x). As I >> mentioned once on the mailing list, this is not the covariance matrix >> of the parameter estimates. To get those, the raw covariance matrix >> has to be multiplied by the standard error of the residual. So, the >> docstring in optimize.curvefit doesn't correspond to the actual >> calculation. >> > Thank you for the clarification. I had forgotten your earlier valid > concerns. Help fixing the docstring is appreciated. If you can > figure out how to improve the code, that is even better. I think it is > good to at least report the cov, but the docstring should not mislead. >> the standard deviation of the error can be calculated and the corrected (this is written for the use from outside of curvefit): yhat = func(x,popt[0], popt[1]) # get predicted observations SSE = np.sum((y-yhat)**2) sig2 = SSE/(len(y)-len(popt)) ecov = sig2*pcov # this is the variance-covariance matrix of the parameter estimates inside curvefit, this work (before the return): err = func(popt, *args) SSE = np.sum((err)**2) sig2 = SSE / (len(ydata) - len(popt)) pcov = sig2 * pcov >> The first parameter is not exactly high precision. >> >> The second problem is that, in weighted least squares, the calculation >> of the standard deviation of the parameter estimates has to take the >> weights into account. (But I don't have the formulas right now.) >> >> I was looking at this to provide a general non-linear least squares >> class in stats. But for several calculation, the Jacobian would be >> necessary. optimize.leastsq only provides cov_x, but I was wondering >> whether the Jacobian can be calculated from the return of the minpack >> functions in optimize.leastsq, but I didn't have time to figure this >> out. >> > I'm not sure, but it might be. I would love to spend time on this, > but don't have it. If somebody else can pick up, that would be great. > > -Travis > Below are two versions of the test function, the first is against curvefit with corrected pcov, the second is a test against an uncorrected curvefit. It uses only decimal=5 so the tests don't fail Josef ---------- test against corrected version ---------- def test_curvefit(): '''test against NIST certified case at http://www.itl.nist.gov/div898/strd/nls/data/misra1b.shtml''' data = array([[ 10.07, 77.6 ], [ 14.73, 114.9 ], [ 17.94, 141.1 ], [ 23.93, 190.8 ], [ 29.61, 239.9 ], [ 35.18, 289. ], [ 40.02, 332.8 ], [ 44.82, 378.4 ], [ 50.76, 434.8 ], [ 55.05, 477.3 ], [ 61.01, 536.8 ], [ 66.4 , 593.1 ], [ 75.47, 689.1 ], [ 81.78, 760. 
]]) pstd_c = [3.1643950207E+00, 4.2547321834E-06] popt_c = [3.3799746163E+02, 3.9039091287E-04] SSE_c = 7.5464681533E-02 Rstd_c = 7.9301471998E-02 decimal = 5 #accuracy of parameter estimate and standard deviation def funct(x, b1, b2): return b1 * (1-(1+b2*x/2)**(-2)) start1 = [500, 0.0001] start2 = [300, 0.0002] for start in [start1, start2]: popt, pcov = curve_fit(funct, x, y, p0=start) pstd = np.sqrt(np.diag(pcov)) assert_almost_equal(popt, popt_c, decimal=decimal) assert_almost_equal(pstd, pstd_c, decimal=decimal) ------------------- test against current version: -------------------------------------- from numpy.testing import assert_almost_equal def test_curvefit_old(): '''test against NIST certified case at http://www.itl.nist.gov/div898/strd/nls/data/misra1b.shtml''' data = array([[ 10.07, 77.6 ], [ 14.73, 114.9 ], [ 17.94, 141.1 ], [ 23.93, 190.8 ], [ 29.61, 239.9 ], [ 35.18, 289. ], [ 40.02, 332.8 ], [ 44.82, 378.4 ], [ 50.76, 434.8 ], [ 55.05, 477.3 ], [ 61.01, 536.8 ], [ 66.4 , 593.1 ], [ 75.47, 689.1 ], [ 81.78, 760. ]]) pstd_c = [3.1643950207E+00, 4.2547321834E-06] popt_c = [3.3799746163E+02, 3.9039091287E-04] SSE_c = 7.5464681533E-02 Rstd_c = 7.9301471998E-02 decimal = 5 #accuracy of parameter estimate and standard deviation def funct(x, b1, b2): return b1 * (1-(1+b2*x/2)**(-2)) start1 = [500, 0.0001] start2 = [300, 0.0002] for start in [start1, start2]: popt, pcov = curve_fit(funct, x, y, p0=start) yest = funct(x,popt[0], popt[1]) SSE = np.sum((y-yest)**2) dof = len(y)-len(popt) #Residual standard error Rstd = np.sqrt(SSE/(len(y)-len(popt))) #parameter standard error pstd = np.sqrt(SSE/(len(y)-len(popt))*np.diag(pcov)) assert_almost_equal(popt, popt_c, decimal=decimal) assert_almost_equal(pstd, pstd_c, decimal=decimal) assert_almost_equal(SSE, SSE_c) assert_almost_equal(Rstd, Rstd_c) From oliphant at enthought.com Fri Feb 13 00:24:13 2009 From: oliphant at enthought.com (Travis E. Oliphant) Date: Fri, 13 Feb 2009 00:24:13 -0500 Subject: [SciPy-user] incorrect variance in optimize.curvefit and leastsq In-Reply-To: <1cd32cbb0902122057x1fb8ac51l74b6988ec98a8f51@mail.gmail.com> References: <1cd32cbb0902121916q60a08be6ka027b5873a3239da@mail.gmail.com> <4994F26F.4050808@enthought.com> <1cd32cbb0902122057x1fb8ac51l74b6988ec98a8f51@mail.gmail.com> Message-ID: <499503FD.3050004@enthought.com> josef.pktd at gmail.com wrote: > On Thu, Feb 12, 2009 at 11:09 PM, Travis E. Oliphant > wrote: > >> josef.pktd at gmail.com wrote: >> >>> I just saw the new optimize.curvefit which provides a wrapper around >>> optimize.leastsq >>> >>> >>> > the standard deviation of the error can be calculated and the > corrected (this is written for the use from outside of curvefit): > > yhat = func(x,popt[0], popt[1]) # get predicted observations > SSE = np.sum((y-yhat)**2) > sig2 = SSE/(len(y)-len(popt)) > ecov = sig2*pcov # this is the variance-covariance matrix of > the parameter estimates > > > inside curvefit, this work (before the return): > > err = func(popt, *args) > SSE = np.sum((err)**2) > sig2 = SSE / (len(ydata) - len(popt)) > pcov = sig2 * pcov > Thanks very much for these... I appreciate your help in getting curve_fit in better shape. These functions were essentially added to curvefit so now curve_fit can be tested against those NIST pages. Thank you also for the unit tests. A good project idea for somebody wanting to get their feet wet with SciPy would be to write unit-tests of curve_fit against some more of those NIST data-sets --- those look nice. 
A combination of loadtxt and additional parsing fu and you could automate the whole thing --- nice project for a student out there ;-) -Travis -- Travis Oliphant Enthought, Inc. (512) 536-1057 (office) (512) 536-1059 (fax) http://www.enthought.com oliphant at enthought.com From josef.pktd at gmail.com Fri Feb 13 01:08:03 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Fri, 13 Feb 2009 01:08:03 -0500 Subject: [SciPy-user] incorrect variance in optimize.curvefit and leastsq In-Reply-To: <499503FD.3050004@enthought.com> References: <1cd32cbb0902121916q60a08be6ka027b5873a3239da@mail.gmail.com> <4994F26F.4050808@enthought.com> <1cd32cbb0902122057x1fb8ac51l74b6988ec98a8f51@mail.gmail.com> <499503FD.3050004@enthought.com> Message-ID: <1cd32cbb0902122208x5abc7c91n1de14e0ab4a04231@mail.gmail.com> On Fri, Feb 13, 2009 at 12:24 AM, Travis E. Oliphant wrote: > josef.pktd at gmail.com wrote: >> On Thu, Feb 12, 2009 at 11:09 PM, Travis E. Oliphant >> wrote: >> >>> josef.pktd at gmail.com wrote: >>> >>>> I just saw the new optimize.curvefit which provides a wrapper around >>>> optimize.leastsq >>>> >>>> >>>> >> the standard deviation of the error can be calculated and the >> corrected (this is written for the use from outside of curvefit): >> >> yhat = func(x,popt[0], popt[1]) # get predicted observations >> SSE = np.sum((y-yhat)**2) >> sig2 = SSE/(len(y)-len(popt)) >> ecov = sig2*pcov # this is the variance-covariance matrix of >> the parameter estimates >> >> >> inside curvefit, this work (before the return): >> >> err = func(popt, *args) >> SSE = np.sum((err)**2) >> sig2 = SSE / (len(ydata) - len(popt)) >> pcov = sig2 * pcov I think for the weighted least squares problem the weights should go into the SSE calculation. I didn't find directly the reference, but I am somewhat confident that this is correct, from the analogy to the transformed model ynew = y*weight where weight_i = 1/sigma_i in the linear case. But it's too late today to try to check this. SSE = np.sum((err*weight)**2) >> > > Thanks very much for these... I appreciate your help in getting > curve_fit in better shape. These functions were essentially added to > curvefit so now curve_fit can be tested against those NIST pages. > Thank you also for the unit tests. You're welcome > > A good project idea for somebody wanting to get their feet wet with > SciPy would be to write unit-tests of curve_fit against some more of > those NIST data-sets --- those look nice. A combination of loadtxt and > additional parsing fu and you could automate the whole thing --- nice > project for a student out there ;-) > > -Travis > Yes, and they are also useful to test other optimization functions. Unfortunately, I didn't see any cases with several explanatory variables or with weights/heteroscedasticity. There are also certified reference cases for basic statistics and linear regression, that also would be useful. I will try to look during weekend at the weighted case a bit more. Josef From alex.liberzon at gmail.com Fri Feb 13 01:51:09 2009 From: alex.liberzon at gmail.com (Alex Liberzon) Date: Fri, 13 Feb 2009 08:51:09 +0200 Subject: [SciPy-user] odeint for calculating trajectories Message-ID: <775f17a80902122251n3e1854e9ja30a1d4388af60a9@mail.gmail.com> I think that if you have given velocity field you should use streamlines. Link one: http://www.sagenb.org/home/pub/106/ or using MayaVi: http://code.enthought.com/projects/mayavi/docs/development/html/mayavi/_sources/mlab.txt or MayaVi cookbook for graphical example. 
http://www.scipy.org/Cookbook/MayaVi/Examples HIH, Alex -- Alex Liberzon Turbulence Structure Laboratory (http://www.eng.tau.ac.il/efdl) School of Mechanical Engineering Tel Aviv University Tel: +972-3-640-8928 (office) Tel: +972-3-640-6860 (lab) E-mail: alexlib at eng.tau.ac.il -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Fri Feb 13 02:06:13 2009 From: robert.kern at gmail.com (Robert Kern) Date: Fri, 13 Feb 2009 01:06:13 -0600 Subject: [SciPy-user] incorrect variance in optimize.curvefit and leastsq In-Reply-To: <1cd32cbb0902122208x5abc7c91n1de14e0ab4a04231@mail.gmail.com> References: <1cd32cbb0902121916q60a08be6ka027b5873a3239da@mail.gmail.com> <4994F26F.4050808@enthought.com> <1cd32cbb0902122057x1fb8ac51l74b6988ec98a8f51@mail.gmail.com> <499503FD.3050004@enthought.com> <1cd32cbb0902122208x5abc7c91n1de14e0ab4a04231@mail.gmail.com> Message-ID: <3d375d730902122306k6b5a9739i6aa91e21753258f8@mail.gmail.com> On Fri, Feb 13, 2009 at 00:08, wrote: > On Fri, Feb 13, 2009 at 12:24 AM, Travis E. Oliphant > wrote: >> josef.pktd at gmail.com wrote: >>> On Thu, Feb 12, 2009 at 11:09 PM, Travis E. Oliphant >>> wrote: >>> >>>> josef.pktd at gmail.com wrote: >>>> >>>>> I just saw the new optimize.curvefit which provides a wrapper around >>>>> optimize.leastsq >>>>> >>>>> >>>>> >>> the standard deviation of the error can be calculated and the >>> corrected (this is written for the use from outside of curvefit): >>> >>> yhat = func(x,popt[0], popt[1]) # get predicted observations >>> SSE = np.sum((y-yhat)**2) >>> sig2 = SSE/(len(y)-len(popt)) >>> ecov = sig2*pcov # this is the variance-covariance matrix of >>> the parameter estimates >>> >>> >>> inside curvefit, this work (before the return): >>> >>> err = func(popt, *args) >>> SSE = np.sum((err)**2) >>> sig2 = SSE / (len(ydata) - len(popt)) >>> pcov = sig2 * pcov > > I think for the weighted least squares problem the weights should go > into the SSE calculation. I didn't find directly the reference, but I > am somewhat confident that this is correct, from the analogy to the > transformed > model ynew = y*weight where weight_i = 1/sigma_i in the linear case. > But it's too late today to try to check this. > > SSE = np.sum((err*weight)**2) Yes. Basically, you want a Chi-squared statistic for the residuals. scipy.odr does this scaling, for example. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From robert.kern at gmail.com Fri Feb 13 02:12:52 2009 From: robert.kern at gmail.com (Robert Kern) Date: Fri, 13 Feb 2009 01:12:52 -0600 Subject: [SciPy-user] Help on installation of scipy In-Reply-To: References: Message-ID: <3d375d730902122312q7a5c2b1ra2632d0a9fe1a81@mail.gmail.com> On Thu, Feb 12, 2009 at 19:39, yelver huang wrote: > Hi all, > I have encountered some problems on installing scipy on Windows, though it's > quite simple. There is no fault for me to import numpy from Python25, but > after I install the binary file of scipy, I could not import it. 
The error > it shows is: > Warning (from warnings module): > File "G:\Python25\lib\site-packages\scipy\__init__.py", line 30 > UserWarning) > UserWarning: Numpy 1.2.0 or above is recommended for this version of scipy > (detected version 1.0.4) > Traceback (most recent call last): > File "", line 1, in > import scipy > File "G:\Python25\Lib\site-packages\scipy\__init__.py", line 75, in > > from numpy.testing import Tester > ImportError: cannot import name Tester > The version of numpy in my computer is 1.2.1, I could not understand this > message. Hope someone could help me. Well, the version of numpy that is actually installed does appear to be 1.0.4, not 1.2.1. It is possible that you have old files left over from a previous installation or you have two numpy packages installed to different locations, and the old one is the one that is getting picked up. Do the following to double-check the location of the installed numpy: >>> import numpy >>> print numpy.__file__ /Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy-1.2.0rc2-py2.5-macosx-10.3-fat.egg/numpy/__init__.pyc You may need to delete this installation of numpy thoroughly and reinstall. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From gael.varoquaux at normalesup.org Fri Feb 13 03:44:26 2009 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Fri, 13 Feb 2009 09:44:26 +0100 Subject: [SciPy-user] odeint for calculating trajectories In-Reply-To: <775f17a80902122251n3e1854e9ja30a1d4388af60a9@mail.gmail.com> References: <775f17a80902122251n3e1854e9ja30a1d4388af60a9@mail.gmail.com> Message-ID: <20090213084426.GA12393@phare.normalesup.org> On Fri, Feb 13, 2009 at 08:51:09AM +0200, Alex Liberzon wrote: > I think that if you have given velocity field you should use streamlines. > Link one: [1]http://www.sagenb.org/home/pub/106/ > or using MayaVi: > [2]http://code.enthought.com/projects/mayavi/docs/development/html/mayavi/_sources/mlab.txt Maybe the best example of using Mayavi to integrate a vector field with streamlines would be: http://code.enthought.com/projects/mayavi/docs/development/html/mayavi/example_using_with_scipy.html However, this is going to be convienent only for visualization. If the OP's problem is visualization I am more than happy to answer questions on how to do this with Mayavi. Ga?l From simon.palmer at gmail.com Fri Feb 13 09:11:03 2009 From: simon.palmer at gmail.com (SimonPalmer) Date: Fri, 13 Feb 2009 06:11:03 -0800 (PST) Subject: [SciPy-user] Google App Engine Message-ID: <6028559e-55b8-425b-8677-2f94aadb2fe7@l39g2000yqn.googlegroups.com> Is it possible to get scipy/numpy running on the Google App Engine? Has anyone tried? From bernardo.rocha at meduni-graz.at Fri Feb 13 09:21:29 2009 From: bernardo.rocha at meduni-graz.at (Bernardo M. Rocha) Date: Fri, 13 Feb 2009 15:21:29 +0100 Subject: [SciPy-user] intersect (matlab) In-Reply-To: References: Message-ID: <499581E9.7090803@meduni-graz.at> Hi Robert Kern, thanks a lot for your help! Just another question, is there a way to do it in a matrix (that is, the intersection in the rows of 2 matrices)? The matlab version would be like: [c,ia,ib] = intersect(A,B,'rows') where A and B are matrices with the same number of columns. Best regards, Bernardo M. 
Rocha From sturla at molden.no Fri Feb 13 10:00:58 2009 From: sturla at molden.no (Sturla Molden) Date: Fri, 13 Feb 2009 16:00:58 +0100 Subject: [SciPy-user] shared memory machines In-Reply-To: <4992D7EA.5070404@molden.no> References: <51ab5afe7eb746fa60db99ef7635df9b.squirrel@webmail.uio.no> <20090210231356.GC9128@phare.normalesup.org> <20090211114620.GB19956@phare.normalesup.org> <4992C539.7020501@molden.no> <20090211133305.GC19956@phare.normalesup.org> <4992D7EA.5070404@molden.no> Message-ID: <49958B2A.7020001@molden.no> Have put up a new version version of the shared memory ndarrays here: http://folk.uio.no/sturlamo/python/sharedmem-feb13-2009.zip usage: import numpy import sharedmem as shm arr = shm.empty(...) arr = shm.ones(...) arr = shm.zeros(...) As for the memory leak reported by Ga?l previously: This is a bug in multiprocessing, not in our code. The offending line is 353 in multiprocessing/forking.py. It shuts down sub-processes abruptly, by using os._exit for suicide, preventing any clean-up code from executing. Change this line to "sys.exit(exitcode)" and it works as expected. The bug has been reported to Jesse Noller. Regards, Sturla Molden From josef.pktd at gmail.com Fri Feb 13 11:21:41 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Fri, 13 Feb 2009 11:21:41 -0500 Subject: [SciPy-user] incorrect variance in optimize.curvefit and leastsq In-Reply-To: <1cd32cbb0902122208x5abc7c91n1de14e0ab4a04231@mail.gmail.com> References: <1cd32cbb0902121916q60a08be6ka027b5873a3239da@mail.gmail.com> <4994F26F.4050808@enthought.com> <1cd32cbb0902122057x1fb8ac51l74b6988ec98a8f51@mail.gmail.com> <499503FD.3050004@enthought.com> <1cd32cbb0902122208x5abc7c91n1de14e0ab4a04231@mail.gmail.com> Message-ID: <1cd32cbb0902130821x1ce9dc74l4e191fbf50a179e7@mail.gmail.com> > I think for the weighted least squares problem the weights should go > into the SSE calculation. I didn't find directly the reference, but I > am somewhat confident that this is correct, from the analogy to the > transformed > model ynew = y*weight where weight_i = 1/sigma_i in the linear case. > But it's too late today to try to check this. > > SSE = np.sum((err*weight)**2) > I looked at the weighted function some more: Since the error calculation for the s_sq uses `func = _weighted_general_function` the above weighting for the SSE is automatically done. But then, in this case there is no renormalization necessary when calculation s_sq. The special casing of the weighted function should be dropped (the commented out part below) if (len(ydata) > len(p0)) and pcov is not None: s_sq = (func(popt, *args)**2).sum()/(len(ydata)-len(p0)) ## if sigma is not None: ## s_sq /= (args[-1]**2).sum() pcov = pcov * s_sq else: pcov = inf Below is the test for the weighted case, rewritten from the unweighted NIST example. Note: I multiplied y and and f(x) by the weight, which then is corrected again by the weighted least squares, and so I get the original NIST estimates back. This test passes with the above correction but fails in the current version, r5551. Josef ----------------------- def test_curvefit_weights(): '''test against NIST certified case at http://www.itl.nist.gov/div898/strd/nls/data/misra1b.shtml add weights to test curvefit with weights''' data = array([[ 10.07, 77.6 ], [ 14.73, 114.9 ], [ 17.94, 141.1 ], [ 23.93, 190.8 ], [ 29.61, 239.9 ], [ 35.18, 289. ], [ 40.02, 332.8 ], [ 44.82, 378.4 ], [ 50.76, 434.8 ], [ 55.05, 477.3 ], [ 61.01, 536.8 ], [ 66.4 , 593.1 ], [ 75.47, 689.1 ], [ 81.78, 760. 
]]) y,x = data[:,0],data[:,1] pstd_c = [3.1643950207E+00, 4.2547321834E-06] popt_c = [3.3799746163E+02, 3.9039091287E-04] SSE_c = 7.5464681533E-02 Rstd_c = 7.9301471998E-02 decimal = 5 #accuracy of parameter estimate and standard deviation w = 1.0 + np.arange(len(x)) # 1/weights = sigma print w def funct(x, b1, b2): #weighted function for y return w * b1 * (1-(1+b2*x/2)**(-2)) y = w * y #weighted y start1 = [500, 0.0001] start2 = [300, 0.0002] for start in [start1, start2]: popt, pcov = curve_fit(funct, x, y, p0=start, sigma=w) pstd = np.sqrt(np.diag(pcov)) assert_almost_equal(popt, popt_c, decimal=decimal) assert_almost_equal(pstd, pstd_c, decimal=decimal) --------------------------- From Scott.Askey at afit.edu Fri Feb 13 11:54:09 2009 From: Scott.Askey at afit.edu (Askey Scott A Capt AFIT/ENY) Date: Fri, 13 Feb 2009 11:54:09 -0500 Subject: [SciPy-user] pythonxy and Numpy 1.2.1 and Scipy 0.7.0; Ubuntu 8.10 packages Message-ID: <792700546363C941B876B9D41AF4475905930E11@MS-AFIT-03.afit.edu> If I want to run pythonxy under linux is reverting form my current ubuntu 8.10 to 8.04 my only option? How does pythonxy linux compare to pythonxy windows in terms of UI/IDE and modernity of the components? V/R Scott -------------- next part -------------- An HTML attachment was scrubbed... URL: From ellisonbg.net at gmail.com Fri Feb 13 12:05:51 2009 From: ellisonbg.net at gmail.com (Brian Granger) Date: Fri, 13 Feb 2009 09:05:51 -0800 Subject: [SciPy-user] shared memory machines In-Reply-To: <49958B2A.7020001@molden.no> References: <51ab5afe7eb746fa60db99ef7635df9b.squirrel@webmail.uio.no> <20090210231356.GC9128@phare.normalesup.org> <20090211114620.GB19956@phare.normalesup.org> <4992C539.7020501@molden.no> <20090211133305.GC19956@phare.normalesup.org> <4992D7EA.5070404@molden.no> <49958B2A.7020001@molden.no> Message-ID: <6ce0ac130902130905i670c1124l47f00d4b255c0331@mail.gmail.com> Does this work without multiprocessing? Brian On Fri, Feb 13, 2009 at 7:00 AM, Sturla Molden wrote: > > Have put up a new version version of the shared memory ndarrays here: > > http://folk.uio.no/sturlamo/python/sharedmem-feb13-2009.zip > > usage: > > import numpy > import sharedmem as shm > > arr = shm.empty(...) > arr = shm.ones(...) > arr = shm.zeros(...) > > As for the memory leak reported by Ga?l previously: This is a bug in > multiprocessing, not in our code. The offending line is 353 in > multiprocessing/forking.py. It shuts down sub-processes abruptly, by > using os._exit for suicide, preventing any clean-up code from executing. > Change this line to "sys.exit(exitcode)" and it works as expected. The > bug has been reported to Jesse Noller. > > > Regards, > > Sturla Molden > _______________________________________________ > SciPy-user mailing list > SciPy-user at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-user > From warren.weckesser at gmail.com Fri Feb 13 12:29:06 2009 From: warren.weckesser at gmail.com (Warren Weckesser) Date: Fri, 13 Feb 2009 11:29:06 -0600 Subject: [SciPy-user] pythonxy and Numpy 1.2.1 and Scipy 0.7.0; Ubuntu 8.10 packages In-Reply-To: <792700546363C941B876B9D41AF4475905930E11@MS-AFIT-03.afit.edu> References: <792700546363C941B876B9D41AF4475905930E11@MS-AFIT-03.afit.edu> Message-ID: <114880320902130929t26c8aeefv4b6e2522f4ca46bf@mail.gmail.com> Hi Scott, This sounds like a question for the Python(x,y) Discussion Group. 
You can get there from here: http://www.pythonxy.com/discussions.php Warren On Fri, Feb 13, 2009 at 10:54 AM, Askey Scott A Capt AFIT/ENY < Scott.Askey at afit.edu> wrote: > If I want to run pythonxy under linux is reverting form my current ubuntu > 8.10 to 8.04 my only option? > > How does pythonxy linux compare to pythonxy windows in terms of UI/IDE and > modernity of the components? > > > > V/R > > > > Scott > > _______________________________________________ > SciPy-user mailing list > SciPy-user at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-user > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From strawman at astraw.com Fri Feb 13 12:50:39 2009 From: strawman at astraw.com (Andrew Straw) Date: Fri, 13 Feb 2009 09:50:39 -0800 Subject: [SciPy-user] pythonxy and Numpy 1.2.1 and Scipy 0.7.0; Ubuntu 8.10 packages In-Reply-To: <792700546363C941B876B9D41AF4475905930E11@MS-AFIT-03.afit.edu> References: <792700546363C941B876B9D41AF4475905930E11@MS-AFIT-03.afit.edu> Message-ID: <4995B2EF.4020905@astraw.com> Askey Scott A Capt AFIT/ENY wrote: > > If I want to run pythonxy under linux is reverting form my current > ubuntu 8.10 to 8.04 my only option? > > How does pythonxy linux compare to pythonxy windows in terms of UI/IDE > and modernity of the components? > Out of curiosity, what is it from pythonxy that would make you want it rather than the stock packages? Or are you just thinking of it as a quick way to get latest version of the various packages onto your machine? (I do understand the issue that Ubuntu lags numpy/scipy/matplotlib releases quite a bit.) -Andrew From dwf at cs.toronto.edu Fri Feb 13 13:06:45 2009 From: dwf at cs.toronto.edu (David Warde-Farley) Date: Fri, 13 Feb 2009 13:06:45 -0500 Subject: [SciPy-user] Google App Engine In-Reply-To: <6028559e-55b8-425b-8677-2f94aadb2fe7@l39g2000yqn.googlegroups.com> References: <6028559e-55b8-425b-8677-2f94aadb2fe7@l39g2000yqn.googlegroups.com> Message-ID: On 13-Feb-09, at 9:11 AM, SimonPalmer wrote: > Is it possible to get scipy/numpy running on the Google App Engine? > Has anyone tried? I looked into it, and at present it doesn't seem to be possible: "Currently, Google App Engine allows you to write your applications with Python 2.5. For security reasons, some Python modules written in C are disabled in our system. Also, since Google App Engine doesn't support writing to disk, some libraries that support this and other functionalities are only partially enabled." http://code.google.com/appengine/kb/general.html#language If you can't get C code running (so no NumPy anyway), good luck with Fortran (which SciPy depends pretty heavily on). David From sturla at molden.no Fri Feb 13 13:51:19 2009 From: sturla at molden.no (Sturla Molden) Date: Fri, 13 Feb 2009 19:51:19 +0100 (CET) Subject: [SciPy-user] shared memory machines In-Reply-To: <6ce0ac130902130905i670c1124l47f00d4b255c0331@mail.gmail.com> References: <51ab5afe7eb746fa60db99ef7635df9b.squirrel@webmail.uio.no> <20090210231356.GC9128@phare.normalesup.org> <20090211114620.GB19956@phare.normalesup.org> <4992C539.7020501@molden.no> <20090211133305.GC19956@phare.normalesup.org> <4992D7EA.5070404@molden.no> <49958B2A.7020001@molden.no> <6ce0ac130902130905i670c1124l47f00d4b255c0331@mail.gmail.com> Message-ID: > Does this work without multiprocessing? Yes. Multiprocessing is only used for obtaining the page size. I could move that to Cython as well. S.M. 
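For reference, a couple of standard-library ways to query the page size without going through multiprocessing (a small sketch; os.sysconf is POSIX-only):

import mmap
import os

print(mmap.PAGESIZE)                # module-level constant
print(os.sysconf('SC_PAGE_SIZE'))   # POSIX only, not available on Windows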
From philip at semanchuk.com Fri Feb 13 13:54:22 2009 From: philip at semanchuk.com (Philip Semanchuk) Date: Fri, 13 Feb 2009 13:54:22 -0500 Subject: [SciPy-user] shared memory machines In-Reply-To: References: <51ab5afe7eb746fa60db99ef7635df9b.squirrel@webmail.uio.no> <20090210231356.GC9128@phare.normalesup.org> <20090211114620.GB19956@phare.normalesup.org> <4992C539.7020501@molden.no> <20090211133305.GC19956@phare.normalesup.org> <4992D7EA.5070404@molden.no> <49958B2A.7020001@molden.no> <6ce0ac130902130905i670c1124l47f00d4b255c0331@mail.gmail.com> Message-ID: <4F6B786E-63B2-4170-BBE0-BB163552CDCD@semanchuk.com> On Feb 13, 2009, at 1:51 PM, Sturla Molden wrote: > >> Does this work without multiprocessing? > > Yes. > > Multiprocessing is only used for obtaining the page size. I could move > that to Cython as well. The page size is available in the mmap module. From Scott.Askey at afit.edu Fri Feb 13 15:15:29 2009 From: Scott.Askey at afit.edu (Askey Scott A Capt AFIT/ENY) Date: Fri, 13 Feb 2009 15:15:29 -0500 Subject: [SciPy-user] Ubuntu pythonxy and Numpy 1.2.1 and Scipy 0.7.0; Message-ID: <792700546363C941B876B9D41AF4475905930ED6@MS-AFIT-03.afit.edu> #Out of curiosity, what is it from pythonxy that would make you want it rather than the stock packages? Or are you just thinking #of it as a quick way to get latest version of the various packages onto your machine? (I do understand the issue that Ubuntu #lags numpy/scipy/matplotlib releases quite a bit.) #-Andrew I wanted to install pythonxy for the IDE and to get more up to date packages such as pytable 2.1, sympy 6.3 and scipy .7. Pythonxy-linux only supports one linux ubuntu 8.04 and I am not even sure it provides a version newer than the standard ubuntu distro. Cheers Scott -------------- next part -------------- An HTML attachment was scrubbed... URL: From simon.palmer at gmail.com Fri Feb 13 15:20:29 2009 From: simon.palmer at gmail.com (Simon Palmer) Date: Fri, 13 Feb 2009 20:20:29 +0000 Subject: [SciPy-user] Google App Engine In-Reply-To: References: <6028559e-55b8-425b-8677-2f94aadb2fe7@l39g2000yqn.googlegroups.com> Message-ID: That was pretty much my conclusion too, but I thought I would ask. Do you happen to have a route into google to lobby for it? On Fri, Feb 13, 2009 at 6:06 PM, David Warde-Farley wrote: > On 13-Feb-09, at 9:11 AM, SimonPalmer wrote: > > > Is it possible to get scipy/numpy running on the Google App Engine? > > Has anyone tried? > > I looked into it, and at present it doesn't seem to be possible: > > "Currently, Google App Engine allows you to write your applications > with Python 2.5. For security reasons, some Python modules written in > C are disabled in our system. Also, since Google App Engine doesn't > support writing to disk, some libraries that support this and other > functionalities are only partially enabled." > > http://code.google.com/appengine/kb/general.html#language > > If you can't get C code running (so no NumPy anyway), good luck with > Fortran (which SciPy depends pretty heavily on). > > David > _______________________________________________ > SciPy-user mailing list > SciPy-user at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-user > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From robert.kern at gmail.com Fri Feb 13 18:34:52 2009 From: robert.kern at gmail.com (Robert Kern) Date: Fri, 13 Feb 2009 17:34:52 -0600 Subject: [SciPy-user] intersect (matlab) In-Reply-To: <499581E9.7090803@meduni-graz.at> References: <499581E9.7090803@meduni-graz.at> Message-ID: <3d375d730902131534g78a20508g237843023fb6497c@mail.gmail.com> On Fri, Feb 13, 2009 at 08:21, Bernardo M. Rocha wrote: > Hi Robert Kern, > > thanks a lot for your help! Just another question, is there a way to do it in a matrix (that is, the intersection in the rows of 2 matrices)? The matlab version would be like: > > [c,ia,ib] = intersect(A,B,'rows') > > where A and B are matrices with the same number of columns. I don't know exactly what you mean, off-hand. Can you show me an example? -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From david at ar.media.kyoto-u.ac.jp Sat Feb 14 00:28:30 2009 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Sat, 14 Feb 2009 14:28:30 +0900 Subject: [SciPy-user] Ubuntu pythonxy and Numpy 1.2.1 and Scipy 0.7.0; In-Reply-To: <792700546363C941B876B9D41AF4475905930ED6@MS-AFIT-03.afit.edu> References: <792700546363C941B876B9D41AF4475905930ED6@MS-AFIT-03.afit.edu> Message-ID: <4996567E.9020000@ar.media.kyoto-u.ac.jp> Askey Scott A Capt AFIT/ENY wrote: > > #Out of curiosity, what is it from pythonxy that would make you want > it rather than the stock packages? Or are you just thinking #of it as > a quick way to get latest version of the various packages onto your > machine? (I do understand the issue that Ubuntu #lags > numpy/scipy/matplotlib releases quite a bit.) > > > > #-Andrew > > > > I wanted to install pythonxy for the IDE and to get more up to date > packages such as pytable 2.1, sympy 6.3 and scipy .7. > > Pythonxy-linux only supports one linux ubuntu 8.04 and I am not even > sure it provides a version newer than the standard ubuntu distro. > Well, at least for numpy and scipy, we have up to date packages now for Intrepid: https://edge.launchpad.net/~scipy/+archive/ppa We will try to keep up to date, at least for intrepid (and later), David From strawman at astraw.com Sat Feb 14 04:49:39 2009 From: strawman at astraw.com (Andrew Straw) Date: Sat, 14 Feb 2009 01:49:39 -0800 Subject: [SciPy-user] Automating Matlab In-Reply-To: <49866BDF.2000809@ar.media.kyoto-u.ac.jp> References: <4984F58C.5070605@gmail.com> <3d375d730901311734o388adf56y9f3241032ed409c2@mail.gmail.com> <49866BDF.2000809@ar.media.kyoto-u.ac.jp> Message-ID: <499693B3.9090609@astraw.com> I just came across this, which looks very relevant. http://frontiersin.org/neuroinformatics/paper/10.3389/neuro.11/005.2009/ David Cournapeau wrote: > Sturla Molden wrote: >> For those who are interested, there are two ways of doing this: >> > > I think Eric talked about source code translation, that is .m to .py. > >> The most portable is to call the 'Matlab engine', which is a C and Fortran >> library for automating Matlab. This can be done using f2py or ctypes (wrap >> libeng.dll and libmx.dll). 
>> > > If you are not aware of it, there is already code for it: > > http://svn.scipy.org/svn/scikits/trunk/mlabwrap/ > > cheers, > > David > _______________________________________________ > SciPy-user mailing list > SciPy-user at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-user From gael.varoquaux at normalesup.org Sat Feb 14 05:20:43 2009 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Sat, 14 Feb 2009 11:20:43 +0100 Subject: [SciPy-user] Ubuntu pythonxy and Numpy 1.2.1 and Scipy 0.7.0; In-Reply-To: <4996567E.9020000@ar.media.kyoto-u.ac.jp> References: <792700546363C941B876B9D41AF4475905930ED6@MS-AFIT-03.afit.edu> <4996567E.9020000@ar.media.kyoto-u.ac.jp> Message-ID: <20090214102043.GA1528@phare.normalesup.org> On Sat, Feb 14, 2009 at 02:28:30PM +0900, David Cournapeau wrote: > Well, at least for numpy and scipy, we have up to date packages now for > Intrepid: > https://edge.launchpad.net/~scipy/+archive/ppa For Mayavi2, you can use: https://launchpad.net/~gael-varoquaux/+archive The packages are for hardy, but they support intrepid with no problem. Ga?l From mchandra at iitk.ac.in Sun Feb 15 03:23:55 2009 From: mchandra at iitk.ac.in (Mani chandra) Date: Sun, 15 Feb 2009 00:23:55 -0800 Subject: [SciPy-user] Multicolumn file io Message-ID: <4997D11B.7060504@iitk.ac.in> Hi, How can I can output my data in 1D arrays to a file as seperate columns? For ex, my data is like x = numpy.mgrid[0:10:10j], y = numpy.mgrid[0:10:10j], or x = numpy.arange(0, 10, 1), y = numpy.arange(0, 10, 1), and I'd like the data outputted to a file like : #x y 0 0 1 1 2 2 and so on... Thanks Mani chandra From stefan at sun.ac.za Sat Feb 14 16:44:55 2009 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Sat, 14 Feb 2009 23:44:55 +0200 Subject: [SciPy-user] Multicolumn file io In-Reply-To: <4997D11B.7060504@iitk.ac.in> References: <4997D11B.7060504@iitk.ac.in> Message-ID: <9457e7c80902141344j48c99ad5vf72d9939860dd860@mail.gmail.com> 2009/2/15 Mani chandra : > How can I can output my data in 1D arrays to a file as seperate > columns? For ex, my data is like x = numpy.mgrid[0:10:10j], y = > numpy.mgrid[0:10:10j], or x = numpy.arange(0, 10, 1), y = > numpy.arange(0, 10, 1), and I'd like the data outputted to a file like : > #x y > 0 0 > 1 1 > 2 2 > > and so on... np.savetxt('/tmp/data.txt', np.array([x, y]).T) or np.savetxt('/tmp/data.txt', np.array([x, y]).T, fmt='%.0f') St?fan From oliphant at enthought.com Sat Feb 14 17:10:15 2009 From: oliphant at enthought.com (Travis E. Oliphant) Date: Sat, 14 Feb 2009 16:10:15 -0600 Subject: [SciPy-user] incorrect variance in optimize.curvefit and leastsq In-Reply-To: <1cd32cbb0902130821x1ce9dc74l4e191fbf50a179e7@mail.gmail.com> References: <1cd32cbb0902121916q60a08be6ka027b5873a3239da@mail.gmail.com> <4994F26F.4050808@enthought.com> <1cd32cbb0902122057x1fb8ac51l74b6988ec98a8f51@mail.gmail.com> <499503FD.3050004@enthought.com> <1cd32cbb0902122208x5abc7c91n1de14e0ab4a04231@mail.gmail.com> <1cd32cbb0902130821x1ce9dc74l4e191fbf50a179e7@mail.gmail.com> Message-ID: <49974147.10305@enthought.com> josef.pktd at gmail.com wrote: >> I think for the weighted least squares problem the weights should go >> into the SSE calculation. I didn't find directly the reference, but I >> am somewhat confident that this is correct, from the analogy to the >> transformed >> model ynew = y*weight where weight_i = 1/sigma_i in the linear case. >> But it's too late today to try to check this. 
>> >> SSE = np.sum((err*weight)**2) >> >> > > I looked at the weighted function some more: > > Since the error calculation for the s_sq uses `func = > _weighted_general_function` > the above weighting for the SSE is automatically done. But then, in > this case there is no renormalization > necessary when calculation s_sq. The special casing of the weighted > function should be dropped (the commented out part below) > > if (len(ydata) > len(p0)) and pcov is not None: > s_sq = (func(popt, *args)**2).sum()/(len(ydata)-len(p0)) > ## if sigma is not None: > ## s_sq /= (args[-1]**2).sum() > pcov = pcov * s_sq > else: > pcov = inf > Thanks for figuring this out. I've updated the code in the trunk to reflect this change. I will add the test suite as soon as I get a power connection again --- unless someone does it first. Thanks for pointing out the NIST test-cases and sheparding the curve_fit function. Best regards, -Travis -- Travis Oliphant Enthought, Inc. (512) 536-1057 (office) (512) 536-1059 (fax) http://www.enthought.com oliphant at enthought.com From simpson at math.toronto.edu Sat Feb 14 18:58:40 2009 From: simpson at math.toronto.edu (Gideon Simpson) Date: Sat, 14 Feb 2009 18:58:40 -0500 Subject: [SciPy-user] sparse jacobians and nonlinear solvers Message-ID: Just wanted to check before I modified my code. If my jacobian matrices are in fact sparse, will any of the nonlinear solvers be able to take advantage of that structure if I were to build up the jacobians as a sparse data structure? -gideon From pav at iki.fi Sat Feb 14 19:10:29 2009 From: pav at iki.fi (Pauli Virtanen) Date: Sun, 15 Feb 2009 00:10:29 +0000 (UTC) Subject: [SciPy-user] sparse jacobians and nonlinear solvers References: Message-ID: Sat, 14 Feb 2009 18:58:40 -0500, Gideon Simpson wrote: > > Just wanted to check before I modified my code. > > If my jacobian matrices are in fact sparse, will any of the nonlinear > solvers be able to take advantage of that structure if I were to build > up the jacobians as a sparse data structure? Based on the documentation on docs.scipy.org, I don't think any of those currently in Scipy can do this. `fsolve` can handle banded Jacobians, though. -- Pauli Virtanen From wnbell at gmail.com Sat Feb 14 20:06:34 2009 From: wnbell at gmail.com (Nathan Bell) Date: Sat, 14 Feb 2009 20:06:34 -0500 Subject: [SciPy-user] Multicolumn file io In-Reply-To: <9457e7c80902141344j48c99ad5vf72d9939860dd860@mail.gmail.com> References: <4997D11B.7060504@iitk.ac.in> <9457e7c80902141344j48c99ad5vf72d9939860dd860@mail.gmail.com> Message-ID: On Sat, Feb 14, 2009 at 4:44 PM, St?fan van der Walt wrote: > > np.savetxt('/tmp/data.txt', np.array([x, y]).T) > > or > > np.savetxt('/tmp/data.txt', np.array([x, y]).T, fmt='%.0f') > To that I'd add that if some columns are integers and others are floats that fmt='%g' gives you the desired (compact) output even though the combined array has a floating point type. -- Nathan Bell wnbell at gmail.com http://graphics.cs.uiuc.edu/~wnbell/ From eads at soe.ucsc.edu Sun Feb 15 01:18:53 2009 From: eads at soe.ucsc.edu (Damian Eads) Date: Sat, 14 Feb 2009 22:18:53 -0800 Subject: [SciPy-user] "clustergrams"/hierarchical clustering heat maps In-Reply-To: <8E0AFD62-64B7-435F-B80F-298C702BF771@cs.toronto.edu> References: <8E0AFD62-64B7-435F-B80F-298C702BF771@cs.toronto.edu> Message-ID: <91b4b1ab0902142218v44ceccd8r10e51eea56f03c57@mail.gmail.com> Hi David, Sorry. I did not see your message until now. Several people have already inquired about heatmaps. 
I've been meaning to eventually implement support for them but since I don't work with microarray data and I'm in the midst of trying to get a paper out, it has fallen onto the back burner. As a first step, I'd need to implement support for missing attributes since this seems to be common with microarray data. As far as I know, a heatmap illustrates clustering along two axes: observation vectors and attributes. For example, suppose we're clustering patients by their genes. There is one observation vector for each patient, and one vector element per gene. Clustering observation vectors is the typical case, which is used to identify groups of similar patients. Clustering attributes (across observation vectors) is less typical but would be used to identifying groups of similar genes. The heatmap just illustrates the vectors, the color is the intensity. When clustering along a single dimension (observation vectors), no sorting is necessary, and a dendrogram is drawn along the vertical axis. The i'th row is just the observation vector corresponding to the i'th leaf node. No sorting along the attribute dimension is needed. Along two dimensions, there is a dendrogram along the horizontal axis. Now the attributes must be reordered so that the j'th column corresponds to the j'th leaf node. This is my first time describing heat maps so I apologize if this description is terse. Does it make some sense? As far as how someone implements this, it seems like it'd be pretty simple. There is a helper function called _plot_dendrogram that takes in a collection of raw dendrogram lines to be rendered on the plot. First, plot the heatmap (sorting the attributes so that the columns correspond to the ids of the leaf nodes); this can be done with imshow. Second, for the first dendrogram, call _plot_dendrogram but provide it with a shifting parameters so that the dendrogram lines are rendered to the left of the image. Third, call _plot_dendrogram again, provide a shifting parameter, but instead shift the lines downward for the attribute clustering dendrogram. I want to get to this soon but no promises. Sorry. Cheers, Damian On Mon, Feb 2, 2009 at 11:12 PM, David Warde-Farley wrote: > Hi all, > > I was recently asked to cluster some data and I know from experience > that people use these heat maps to look for patterns in multivariate > data, often with a dendrogram off to the side. This involves sorting > the rows and columns in a certain fashion, the details of which are > somewhat fuzzy to me (and, truthfully, I'm happy with it staying that > way for now). > > I notice that dendrogram plotting is available in > scipy.cluster.hierarchy, and was wondering if the something for > producing the associated sorted heat maps is available anywhere > (within SciPy or otherwise). > > Many thanks, > > David > _______________________________________________ > SciPy-user mailing list > SciPy-user at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-user > -- ----------------------------------------------------- Damian Eads Ph.D. 
Student Jack Baskin School of Engineering, UCSC E2-489 1156 High Street Machine Learning Lab Santa Cruz, CA 95064 http://www.soe.ucsc.edu/~eads From contact at pythonxy.com Sun Feb 15 06:52:03 2009 From: contact at pythonxy.com (Pierre Raybaut) Date: Sun, 15 Feb 2009 12:52:03 +0100 Subject: [SciPy-user] [ Python(x,y) ] New release : 2.1.11 Message-ID: <499801E3.3050306@pythonxy.com> Hi all, Release 2.1.11 is now available on http://www.pythonxy.com: - All-in-One Installer ("Full Edition"), - Plugin Installer -- to be downloaded with xyweb, - Update Changes history Version 2.1.11 (02-15-2009) * Added: o numexpr 1.2 - Fast evaluation of array expressions elementwise by using a vector-based virtual machine * Updated: o SciPy 0.7.0 o Console 2.0.141.8 o Enthought Tool Suite 3.1.0.4 o GDAL 1.6.0 o h5py 1.1.0 o IPython 0.9.1.7 o pylint 0.16.0 o reportlab 2.3 o SWIG 1.3.38 o VPython 5.0.3 o winpdb 1.4.4 o xy 1.0.20 o xydoc 1.0.3 * Corrected: o Issue 70: Python installation folder was asked but not changed if the "Default directories" option was not selected o Issue 71: SciTE shortcut was broken in "Python(x,y) Home" application o Issue 74: IPython syntax coloring incompatible with default white background o Issue 75: Console plugin installer: remove 'console.xml' in user home directory o Issue 80: Upgrade to SciPy 0.7.0 Regards, Pierre Raybaut From teddy.kord at googlemail.com Sun Feb 15 11:22:04 2009 From: teddy.kord at googlemail.com (Ted Kord) Date: Sun, 15 Feb 2009 08:22:04 -0800 (PST) Subject: [SciPy-user] Specific Odeint Problem Message-ID: Hi I have a specific problem for which I am unsure how to use odeint. I'll try to outline the problem in steps so please bear with me. 1. I have a set of data in a file (one column only), say resistances. 2. I have a set of 5 time-dependent ODEs to solve. 3. At each time step, a specific resistance value from the aforementioned data set must be used in the solution of the ODE set e.g dydt = 4.56 * exp(resistance)/F 4. I am able to solve the ODEs using odeint using a time range from 'a' to 'b' according to the requirements of odeint, i.e., odeint(f, y0, t) 5. However, it's proving difficult to incorporate 'at each time step', the specific resistances required. I'd appreciate any help on this issue. Thanks. Ted Kord P.S: The resistance values cannot be hard-coded in. There are 5001 values each for a specific time step. From rob.clewley at gmail.com Sun Feb 15 12:36:16 2009 From: rob.clewley at gmail.com (Rob Clewley) Date: Sun, 15 Feb 2009 12:36:16 -0500 Subject: [SciPy-user] Specific Odeint Problem In-Reply-To: References: Message-ID: Ted, > 3. At each time step, a specific resistance value from the > aforementioned data set must be used in the solution of the ODE set > e.g dydt = 4.56 * exp(resistance)/F You are supplying odeint with a python function ("f") that returns the right hand side. You need to define a second, auxiliary function that takes the time t and returns the value for the resistance. Then you can call that function from within your RHS function, so resistance in the above expression for dydt will become resistance(t). t is already an argument to the RHS function f, so you can just use that. As for the auxiliary function, you could use interp1d from scipy.interpolate, which takes the time domain as an array of values defining (potentially non-uniform length) intervals, and a corresponding 1D array of values (your resistances) and will linearly interpolate between the values. 
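A minimal sketch of that approach; the file name, time grid, constant F and the single toy equation below are placeholders, not Ted's actual system:

import numpy as np
from scipy.integrate import odeint
from scipy.interpolate import interp1d

# in practice: resist = np.loadtxt('resistances.dat')   # one column, 5001 values
resist = 0.1 + 0.01 * np.random.rand(5001)              # stand-in values
t_grid = np.linspace(0.0, 10.0, len(resist))            # times the values belong to
resistance = interp1d(t_grid, resist)                   # callable: resistance(t)

F = 96485.0                                             # placeholder constant

def f(y, t):
    # the interpolator supplies the tabulated resistance at the current time
    return 4.56 * np.exp(resistance(t)) / F

y0 = 0.0
t = np.linspace(0.0, 9.5, 191)    # kept inside the tabulated time range
sol = odeint(f, y0, t)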
I have code to do this with piecewise constant values if you *really* need them to be constant on each time interval. But I suspect you'd like these values to be smooth if possible... -Rob From teddy.kord at googlemail.com Sun Feb 15 17:30:54 2009 From: teddy.kord at googlemail.com (Ted Kord) Date: Sun, 15 Feb 2009 14:30:54 -0800 (PST) Subject: [SciPy-user] Specific Odeint Problem In-Reply-To: References: Message-ID: Thanks. Worked like a peach. Ted From eads at soe.ucsc.edu Sun Feb 15 19:22:49 2009 From: eads at soe.ucsc.edu (Damian Eads) Date: Sun, 15 Feb 2009 16:22:49 -0800 Subject: [SciPy-user] "clustergrams"/hierarchical clustering heat maps In-Reply-To: <91b4b1ab0902142218v44ceccd8r10e51eea56f03c57@mail.gmail.com> References: <8E0AFD62-64B7-435F-B80F-298C702BF771@cs.toronto.edu> <91b4b1ab0902142218v44ceccd8r10e51eea56f03c57@mail.gmail.com> Message-ID: <91b4b1ab0902151622y2fb88f9bq653bb8f4fbfc22a0@mail.gmail.com> I would like to propose a design for the heatmap function interface. A heatmap involves two hierarchical clusterings of the same data. Let's start with the dendrogram function, some of argument names are derived from m*tlab so it is already a tested design, in a sense. def dendrogram(Z, p=30, truncate_mode=None, color_threshold=None, get_leaves=True, orientation='top', labels=None, count_sort=False, distance_sort=False, show_leaf_counts=True, no_plot=False, no_labels=False, color_list=None, leaf_font_size=None, leaf_rotation=None, leaf_label_func=None, no_leaves=False, show_contracted=False, link_color_func=None) (I know others have disagreed with the use of capital variable names in the hierarchy module. I carried them over from m*tlab when I first wrote it for backwards compatability purposes. Also, MATLAB often uses them to denote matrices, as opposed to vectors. I think denoting this distinction with capitalization is somewhat helpful, but I'm sure others disagree. I don't want to get into a flame war about this. I want to talk about heat maps.) Since Z is typically used to denote a clustering in m*tlab, and there are two clusterings, we will need two clusterings Z1 and Z2. Z1 will be along the observation dimension and Z2 along the attribute dimension. The function heatmap will take in parameters for both dendrograms, which will be suffixed x and y. It is assumed the first dendrogram will be plotted along the y axis either on the left or the right. The second one, along the x-axis. def heatmap(Zx, Zy, p1=30, p2=30, color_threshold=None, get_leaves=True, orientation='left-down', labels1=None, labels2=None, count_sortx=False, count_sorty=False, distance_sortx=False, distance_sorty=False, no_plot=False, no_labelsx=False, no_labelsy=False, color_listx=None, color_listy=None, leaf_font_sizex=None, leaf_font_sizey=None, leaf_rotationx=None, leaf_rotationy=None, leaf_label_funcx=None, leaf_label_funcy=None, no_leavesx=False, no_leavesy=False, link_color_funcx=None, link_color_funcy=None) The orientation parameter can be any of 'left-down', 'right-down', 'left-up', 'right-up'; the first direction in the string specifies whether to plot the first dendrogram to the left or the right of the heat map, and the second. All contraction-related parameters have been removed since they don't really make sense for heatmaps. All data structures returned are 2 element tuples. For example, instead of returning a single list of leaf node ids as in `dendrogram`, two are returned, one per clustering (Lx, Ly). 
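As a point of reference, a small sketch (with made-up data and an arbitrary linkage method) of how the two clusterings and the leaf orderings such a heatmap function would consume can already be produced with the existing hierarchy API:

import numpy as np
from scipy.cluster.hierarchy import linkage, dendrogram
from scipy.spatial.distance import pdist

X = np.random.rand(20, 8)                        # 20 observations, 8 attributes
Zy = linkage(pdist(X), method='average')         # clustering over the rows
Zx = linkage(pdist(X.T), method='average')       # clustering over the columns

# leaf orderings that would permute the rows/columns of X under the dendrograms
row_order = dendrogram(Zy, no_plot=True)['leaves']
col_order = dendrogram(Zx, no_plot=True)['leaves']
X_ordered = X[row_order, :][:, col_order]        # what the heat map itself would show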
Since I don't really heatmaps myself, but I'd be willing to write such a function, I'd appreciate if the end users who want this feature can give me some feedback on their needs. Thank you. Cheers, Damian On Sat, Feb 14, 2009 at 10:18 PM, Damian Eads wrote: > Hi David, > > Sorry. I did not see your message until now. Several people have > already inquired about heatmaps. I've been meaning to eventually > implement support for them but since I don't work with microarray data > and I'm in the midst of trying to get a paper out, it has fallen onto > the back burner. As a first step, I'd need to implement support for > missing attributes since this seems to be common with microarray data. > > As far as I know, a heatmap illustrates clustering along two axes: > observation vectors and attributes. For example, suppose we're > clustering patients by their genes. There is one observation vector > for each patient, and one vector element per gene. Clustering > observation vectors is the typical case, which is used to identify > groups of similar patients. Clustering attributes (across observation > vectors) is less typical but would be used to identifying groups of > similar genes. > > The heatmap just illustrates the vectors, the color is the intensity. > When clustering along a single dimension (observation vectors), no > sorting is necessary, and a dendrogram is drawn along the vertical > axis. The i'th row is just the observation vector corresponding to the > i'th leaf node. No sorting along the attribute dimension is needed. > Along two dimensions, there is a dendrogram along the horizontal axis. > Now the attributes must be reordered so that the j'th column > corresponds to the j'th leaf node. > > This is my first time describing heat maps so I apologize if this > description is terse. Does it make some sense? > > As far as how someone implements this, it seems like it'd be pretty > simple. There is a helper function called _plot_dendrogram that takes > in a collection of raw dendrogram lines to be rendered on the plot. > First, plot the heatmap (sorting the attributes so that the columns > correspond to the ids of the leaf nodes); this can be done with > imshow. Second, for the first dendrogram, call _plot_dendrogram but > provide it with a shifting parameters so that the dendrogram lines are > rendered to the left of the image. Third, call _plot_dendrogram again, > provide a shifting parameter, but instead shift the lines downward for > the attribute clustering dendrogram. > > I want to get to this soon but no promises. Sorry. > > Cheers, > > Damian > > > On Mon, Feb 2, 2009 at 11:12 PM, David Warde-Farley wrote: >> Hi all, >> >> I was recently asked to cluster some data and I know from experience >> that people use these heat maps to look for patterns in multivariate >> data, often with a dendrogram off to the side. This involves sorting >> the rows and columns in a certain fashion, the details of which are >> somewhat fuzzy to me (and, truthfully, I'm happy with it staying that >> way for now). >> >> I notice that dendrogram plotting is available in >> scipy.cluster.hierarchy, and was wondering if the something for >> producing the associated sorted heat maps is available anywhere >> (within SciPy or otherwise). 
>> >> Many thanks, >> >> David >> _______________________________________________ >> SciPy-user mailing list >> SciPy-user at scipy.org >> http://projects.scipy.org/mailman/listinfo/scipy-user >> > > > > -- > ----------------------------------------------------- > Damian Eads Ph.D. Student > Jack Baskin School of Engineering, UCSC E2-489 > 1156 High Street Machine Learning Lab > Santa Cruz, CA 95064 http://www.soe.ucsc.edu/~eads > -- ----------------------------------------------------- Damian Eads Ph.D. Student Jack Baskin School of Engineering, UCSC E2-489 1156 High Street Machine Learning Lab Santa Cruz, CA 95064 http://www.soe.ucsc.edu/~eads From bernardo.rocha at meduni-graz.at Mon Feb 16 01:53:40 2009 From: bernardo.rocha at meduni-graz.at (Bernardo M. Rocha) Date: Mon, 16 Feb 2009 07:53:40 +0100 Subject: [SciPy-user] intersect (matlab) Message-ID: <49990D74.4080704@meduni-graz.at> Hi Robert Kern, I have 2 matrices of dimensions npts1 x 3 and npts2 x 3, and I would like to figure out the intersection between the rows of these matrices. If you think that the matrices are lists of points with their coordinates, I want to find out the common points to both lists. That's it. In matlab you can simply do: [c,iai,ib] = intersect(A,B,'rows') But the intersect in python only works with 1D arrays. Best regards, Bernardo M. Rocha From stefan at sun.ac.za Mon Feb 16 02:59:46 2009 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Mon, 16 Feb 2009 09:59:46 +0200 Subject: [SciPy-user] intersect (matlab) In-Reply-To: <49990D74.4080704@meduni-graz.at> References: <49990D74.4080704@meduni-graz.at> Message-ID: <9457e7c80902152359h2da44483yef32a1048c0dc9a5@mail.gmail.com> 2009/2/16 Bernardo M. Rocha : > I have 2 matrices of dimensions npts1 x 3 and npts2 x 3, and I would > like to figure out the intersection between the rows of these matrices. > If you think that the matrices are lists of points with their > coordinates, I want to find out the common points to both lists. That's > it. In matlab you can simply do: > > [c,iai,ib] = intersect(A,B,'rows') > > But the intersect in python only works with 1D arrays. Here's a place to start: # Setup some dummy data a = np.array([[1,2,3],[4,5,6],[7,8,9],[10,11,12],[13,14,15]]) b = np.array([[10, 11, 12], [1, 4, 5], [1, 2, 3], [13, 14, 15], [1, 2, 4]]) # Calculate the indices of the intersecting rows intersection = np.logical_or.reduce(np.logical_and.reduce(a == b[:, None], axis=2)) print a[intersection] Regards St?fan From cimrman3 at ntc.zcu.cz Mon Feb 16 07:07:53 2009 From: cimrman3 at ntc.zcu.cz (Robert Cimrman) Date: Mon, 16 Feb 2009 13:07:53 +0100 Subject: [SciPy-user] intersect (matlab) In-Reply-To: <49990D74.4080704@meduni-graz.at> References: <49990D74.4080704@meduni-graz.at> Message-ID: <49995719.5070007@ntc.zcu.cz> Bernardo M. Rocha wrote: > Hi Robert Kern, > > I have 2 matrices of dimensions npts1 x 3 and npts2 x 3, and I would > like to figure out the intersection between the rows of these > matrices. If you think that the matrices are lists of points with > their coordinates, I want to find out the common points to both > lists. That's it. In matlab you can simply do: > > [c,iai,ib] = intersect(A,B,'rows') > > But the intersect in python only works with 1D arrays. > > Best regards, Bernardo M. Rocha These questions appear from time to time - it would be nice to add more kwarg options to all the functions in the arraysetops module, like return_index, rows (or better, axis!), etc. 
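(As a stop-gap, a rough sketch of what a rows-aware intersect returning the
m*tlab-style index vectors might look like; intersect_rows is a made-up name,
it assumes exact matches and no duplicate rows, and it builds a full
len(A) x len(B) comparison table, so it only suits modest array sizes.)

import numpy as np

def intersect_rows(A, B):
    # match[i, j] is True when row i of A equals row j of B
    match = np.logical_and.reduce(A[:, None, :] == B[None, :, :], axis=-1)
    ia, ib = np.nonzero(match)          # ia indexes rows of A, ib rows of B
    return A[ia], ia, ib

A = np.array([[1, 2, 3], [4, 5, 6], [10, 11, 12]])
B = np.array([[10, 11, 12], [1, 2, 3], [7, 8, 9]])
c, ia, ib = intersect_rows(A, B)
print c        # [[ 1  2  3] [10 11 12]]
print ia, ib   # [0 2] [1 0]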
Unfortunately, I (the culprit of arraysetops) am too swamped by work to look at it right now. But it's on my TODO list. Best regards, r. From peter.cimermancic at gmail.com Mon Feb 16 07:16:04 2009 From: peter.cimermancic at gmail.com (=?UTF-8?Q?Peter_Cimerman=C4=8Di=C4=8D?=) Date: Mon, 16 Feb 2009 13:16:04 +0100 Subject: [SciPy-user] integrate.odeint problem Message-ID: <18d53ca60902160416y6b6b6527nc6406a1def5e1644@mail.gmail.com> Hi! Using 60 differential equations, I am trying to simulate some biological process. After running the script, I've got next message: lsoda-- at current t (=r1), mxstep (=i1) steps taken on this call before reaching tout In above message, I1 = 500 In above message, R1 = 0.857...E+01 Excess work done on this call (Perhaps wrong Dfun type). Run with full_output = 1 to get quantitative information. After running with full_output = 1, additional information was: KeyError: 0. The same equations and parameters were run in Jarnac (simulation software) and I've got correct results, assuming equations and parameters are right. What else could go wrong to produce above error message? Thank you in advance, Peter -------------- next part -------------- An HTML attachment was scrubbed... URL: From warren.weckesser at gmail.com Mon Feb 16 08:47:54 2009 From: warren.weckesser at gmail.com (Warren Weckesser) Date: Mon, 16 Feb 2009 07:47:54 -0600 Subject: [SciPy-user] integrate.odeint problem In-Reply-To: <18d53ca60902160416y6b6b6527nc6406a1def5e1644@mail.gmail.com> References: <18d53ca60902160416y6b6b6527nc6406a1def5e1644@mail.gmail.com> Message-ID: <114880320902160547x4fd15483j6585ece2f63c1ab8@mail.gmail.com> Hi Peter, Try setting the option mxstep in your call to odeint, e.g. mxstep=1000. Warren On Mon, Feb 16, 2009 at 6:16 AM, Peter Cimerman?i? < peter.cimermancic at gmail.com> wrote: > Hi! > > Using 60 differential equations, I am trying to simulate some biological > process. After running the script, I've got next message: > > lsoda-- at current t (=r1), mxstep (=i1) steps > taken on this call before reaching tout > In above message, I1 = 500 > In above message, R1 = 0.857...E+01 > Excess work done on this call (Perhaps wrong Dfun type). > Run with full_output = 1 to get quantitative information. > > After running with full_output = 1, additional information was: KeyError: > 0. > > The same equations and parameters were run in Jarnac (simulation software) > and I've got correct results, assuming equations and parameters are right. > What else could go wrong to produce above error message? > > Thank you in advance, > > Peter > > _______________________________________________ > SciPy-user mailing list > SciPy-user at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-user > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jr at sun.ac.za Mon Feb 16 08:47:52 2009 From: jr at sun.ac.za (Johann Rohwer) Date: Mon, 16 Feb 2009 15:47:52 +0200 Subject: [SciPy-user] integrate.odeint problem In-Reply-To: <18d53ca60902160416y6b6b6527nc6406a1def5e1644@mail.gmail.com> References: <18d53ca60902160416y6b6b6527nc6406a1def5e1644@mail.gmail.com> Message-ID: <200902161547.52998.jr@sun.ac.za> Hi Peter Try increasing the parameter mxstep (default is 500) in the odeint function call to a higher value (such as 1000 or 3000). We are using SciPy's odeint in our Python based systems biology simulation software PySCeS (http://pysces.sf.net) and have found quite regularly for biological models that an mxstep value of 500 is insufficient. 
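(A minimal sketch of where the mxstep keyword goes; the two-variable
right-hand side below is made up and only stands in for the real
60-equation model.)

import numpy as np
from scipy.integrate import odeint

def rhs(y, t):
    # toy system with fast and slow time scales
    return [-1000.0 * y[0] + y[1],
             1000.0 * y[0] - 2.0 * y[1]]

t = np.linspace(0.0, 10.0, 201)
y0 = [1.0, 0.0]
y, info = odeint(rhs, y0, t, mxstep=3000, full_output=True)
print info['message']      # 'Integration successful.' if all went well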
Incidentally, when PySCeS encounters this situation, it automatically increments the mxstep value and re-simulates.... In general, we have been able to get good agreement with PySCeS and Jarnac simulating the same model, so I'd be surprised if this does not fix it. Regards Johann On Monday, 16 February 2009, Peter Cimerman?i? wrote: > Hi! > > Using 60 differential equations, I am trying to simulate some > biological process. After running the script, I've got next > message: > > lsoda-- at current t (=r1), mxstep (=i1) steps > taken on this call before reaching tout > In above message, I1 = 500 > In above message, R1 = 0.857...E+01 > Excess work done on this call (Perhaps wrong Dfun type). > Run with full_output = 1 to get quantitative information. > > After running with full_output = 1, additional information was: > KeyError: 0. > > The same equations and parameters were run in Jarnac (simulation > software) and I've got correct results, assuming equations and > parameters are right. What else could go wrong to produce above > error message? > > Thank you in advance, > > Peter From josef.pktd at gmail.com Mon Feb 16 23:49:14 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Mon, 16 Feb 2009 23:49:14 -0500 Subject: [SciPy-user] improving ML estimation of distributions - example Message-ID: <1cd32cbb0902162049y56d93f7ctd2ef2e3ae3bcc894@mail.gmail.com> I was trying out some of the improvements to the maximum likelihood estimation of distribution parameters with fit: * allow some parameters to be fixed, especially location for distribution with finite upper or lower bound * get automatic "meaningful" starting values from the data instead of default (1,1,1,..) * define loglike directly if it is a shortcut, instead of np.log(pdf) I tried the changes for the gamma distribution, which has a lower bound of zero if location is at default zero: Here are some monte carlo results: * about 3 times faster for fixed location, * Mean Squared Error about half of current version for sample sizes around 100 to 200 time stats.gamma 41.375 time new gamma 13.1560001373 sample size = 200, number of iterations = 500 with fixed location with estimated location shape scale shape location scale bias [ 0.0371193 -0.10947458 -0.08416937 0.94610597 0.77749813] errvar [ 0.06138217 4.47976327 0.19587426 6.46472802 8.44292793] mse [ 0.06276001 4.49174795 0.20295874 7.35984453 9.04743127] maxabs [ 1.31184619 7.43273778 2.0578926 12.45965912 8.23351093] mad [ 0.19163682 1.65387385 0.35096994 2.13576654 2.36227371] data, rvsg, was generated with parameters [2.5, 0.0, 20] >>> stats.gamma.fit(rvsg) # unconstrained current stats version array([ 2.41650867, 1.25529111, 20.06830571]) >>> gamma.fit_fr(rvsg, 1) # unconstrained new version array([ 2.41650867, 1.25529111, 20.06830571]) >>> gamma.fit_fr(rvsg) # unconstrained new version with method of moment starting values array([ 2.41650354, 1.25536201, 20.06831229]) >>> gamma.fit_fr(rvsg, frozen=[np.nan,0.0,np.nan]) # fix location at 0 array([ 2.60144762, 19.12416175]) >>> gamma.fit_fr(rvsg, frozen=[2.5,0.0,np.nan]) # fix shape and location, estimate scale array([ 19.90018213]) >>> gamma.fit_fr(rvsg, frozen=[2.5,0.0,20]) # fix all parameters array([], dtype=float64) >>> gamma.fit_fr(rvsg, frozen=[np.nan,0.0,20]) # fix location and scale, estimate shape array([ 2.5077181]) Given the constraint estimates, we can compare the log likelihood values at constrained or unconstrained ML estimate. 
Can be used to construct likelihood ratio tests: >>> gamma.nnlf([2.5,0.0,20], rvsg) # neg. log-likelihood at true values 941.50470025907089 >>> gamma.nnlf([ 2.41650867, 1.25529111, 20.06830571], rvsg) # neg. log-likelihood at unconstrained estimates 941.30423824721584 >>> gamma.nnlf([ 2.60144762, 0.0, 19.12416175], rvsg) # neg. log-likelihood at constrained estimates 941.41052750929691 I picked the format for the frozen mask mostly because it is easy and fast to switch between reduced and full parameter vectors # select parameters that needs to be estimated: x0 = np.array(x0)[np.isnan(frmask)] # insert reduced set of estimation parameters to the full parameter vector theta = frmask.copy() theta[np.isnan(frmask)] = thetashort I appreciate comments, especially if someone has an opinion about the API of the added functionality. These changes to the fit method will be backward compatible (except when different starting value results in different estimates). Adding starting values and log-likelihood function to each individual distribution is work and will take some time, but adding partially frozen parameters to the generic fit method is relatively simple. my working file with the gamma distribution is attached. Josef -------------- next part -------------- import numpy as np from scipy import stats, special, optimize from scipy.stats import distributions class gamma_gen(distributions.rv_continuous): def _rvs(self, a): return mtrand.standard_gamma(a, self._size) def _pdf(self, x, a): return x**(a-1)*np.exp(-x)/special.gamma(a) def _loglike(self, x, a): return (a-1)*np.log(x) - x - special.gammaln(a) def _cdf(self, x, a): return special.gammainc(a, x) def _ppf(self, q, a): return special.gammaincinv(a,q) def _stats(self, a): return a, a, 2.0/np.sqrt(a), 6.0/a def _entropy(self, a): return special.psi(a)*(1-a) + 1 + special.gammaln(a) def _nnlf_(self, x, *args): # inherited version for comparison return -np.sum(np.log(self._pdf(x, *args)),axis=0) def _nnlf(self, x, *args): # overwrite ic subclass return -np.sum(self._loglike(x, *args),axis=0) def _fitstart(self, x): # method of moment estimator as starting values, not verified # with literature loc = np.min([x.min(),0]) a = 4/stats.skew(x)**2 scale = np.std(x) / np.sqrt(a) return (a, loc, scale) def nnlf_fr(self, thetash, x, frmask): # new frozen version # - sum (log pdf(x, theta),axis=0) # where theta are the parameters (including loc and scale) # try: if frmask != None: theta = frmask.copy() theta[np.isnan(frmask)] = thetash else: theta = thetash loc = theta[-2] scale = theta[-1] args = tuple(theta[:-2]) except IndexError: raise ValueError, "Not enough input arguments." if not self._argcheck(*args) or scale <= 0: return np.inf x = np.array((x-loc) / scale) cond0 = (x <= self.a) | (x >= self.b) if (np.any(cond0)): return np.inf else: N = len(x) #raise ValueError return self._nnlf(x, *args) + N*np.log(scale) def fit_fr(self, data, *args, **kwds): loc0, scale0 = map(kwds.get, ['loc', 'scale'],[0.0, 1.0]) Narg = len(args) if Narg == 0 and hasattr(self, '_fitstart'): x0 = self._fitstart(data) elif Narg > self.numargs: raise ValueError, "Too many input arguments." else: args += (1.0,)*(self.numargs-Narg) # location and scale are at the end x0 = args + (loc0, scale0) if 'frozen' in kwds: frmask = np.array(kwds['frozen']) if len(frmask) != self.numargs+2: raise ValueError, "Incorrect number of frozen arguments." 
else: # keep starting values for not frozen parameters x0 = np.array(x0)[np.isnan(frmask)] else: frmask = None #print x0 #print frmask return optimize.fmin(self.nnlf_fr, x0, args=(np.ravel(data), frmask), disp=0) gamma = gamma_gen(a=0.0,name='gamma',longname='A gamma', shapes='a',extradoc=""" Gamma distribution For a = integer, this is the Erlang distribution, and for a=1 it is the exponential distribution. gamma.pdf(x,a) = x**(a-1)*exp(-x)/gamma(a) for x >= 0, a > 0. """ ) rvsg = stats.gamma.rvs(2.5,scale=20,size=1000) print rvsg.min() print gamma.fit(rvsg) print gamma.fit_fr(rvsg, frozen=[np.nan,0.0,np.nan]) import time niter = 500 ssize = 200 t0 = time.time() result1 = [] np.random.seed(64398721) for ii in range(niter): rvsg = stats.gamma.rvs(2.5,scale=20,size=ssize) result1.append(np.hstack(stats.gamma.fit(rvsg))) t1 = time.time() result2 = [] np.random.seed(64398721) for ii in range(niter): rvsg = stats.gamma.rvs(2.5,scale=20,size=ssize) result2.append(gamma.fit_fr(rvsg, frozen=[np.nan,0.0,np.nan])) #result2.append(gamma.fit_fr(rvsg,1)) # this is equivalent to old t2 = time.time() print 'time stats.gamma', t1-t0 print 'time new gamma', t2-t1 resarr1 = np.array(result1) resarr2 = np.array(result2) resarr = np.hstack([resarr2, resarr1]) ptrue = np.array([2.5,20.0,2.5,0.0,20.0])#[:2] #ptrue = np.array([2.5,0.0,20.0,2.5,0.0,20.0]) print ' sample size = %d, number of iterations = %d' % (ssize, niter) print ' with fixed location with estimated location' print ' shape scale shape location scale' bias = np.mean((resarr - ptrue), axis=0) errvar = np.var((resarr - ptrue), axis=0) maxabs = np.max(np.abs(resarr - ptrue), axis=0) mad = np.mean(np.abs(resarr - ptrue), axis=0) mse = np.mean((resarr - ptrue)**2, axis=0) print 'bias ', bias print 'errvar', errvar print 'mse ', mse print 'maxabs', maxabs print 'mad ', mad From sahar at cmt.co.il Tue Feb 17 02:38:21 2009 From: sahar at cmt.co.il (Sahar Vilan) Date: Tue, 17 Feb 2009 09:38:21 +0200 Subject: [SciPy-user] Delete equal elements from array Message-ID: Hi, I have to exclude equal elements from an array. x1 = np.array([1, 4, 5, 5]) Is there any function to get from x1 an array with one element of each kind ( [1, 4, 5, 5] -> [1, 4, 5])? Thanks, Sahar ******************************************************************************************************* This e-mail message may contain confidential,and privileged information or data that constitute proprietary information of CMT Medical Ltd. Any review or distribution by others is strictly prohibited. If you are not the intended recipient you are hereby notified that any use of this information or data by any other person is absolutely prohibited. If you are not the intended recipient, please delete all copies. Thank You. http://www.cmt.co.il ******************************************************************************************************** ************************************************************************************ This footnote confirms that this email message has been scanned by PineApp Mail-SeCure for the presence of malicious code, vandals & computer viruses. ************************************************************************************ -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From fredmfp at gmail.com Tue Feb 17 02:56:43 2009 From: fredmfp at gmail.com (fred) Date: Tue, 17 Feb 2009 08:56:43 +0100 Subject: [SciPy-user] Delete equal elements from array In-Reply-To: References: Message-ID: <499A6DBB.8080504@gmail.com> Sahar Vilan a ?crit : > Hi, > > I have to exclude equal elements from an array. > x1 = np.array([1, 4, 5, 5]) > > Is there any function to get from x1 an array with one element of each > kind ( [1, 4, 5, 5] -> [1, 4, 5])? np.unique(x1)? Cheers, -- Fred From faltet at pytables.org Tue Feb 17 02:57:44 2009 From: faltet at pytables.org (Francesc Alted) Date: Tue, 17 Feb 2009 08:57:44 +0100 Subject: [SciPy-user] Delete equal elements from array In-Reply-To: References: Message-ID: <200902170857.44881.faltet@pytables.org> A Tuesday 17 February 2009, Sahar Vilan escrigu?: > Hi, > > I have to exclude equal elements from an array. > x1 = np.array([1, 4, 5, 5]) > > Is there any function to get from x1 an array with one element of > each kind ( [1, 4, 5, 5] -> [1, 4, 5])? Yes. Try np.unique (or np.unique1d, depending on your needs): In [2]: x1 = np.array([1, 4, 5, 5]) In [3]: np.unique(x1) Out[3]: array([1, 4, 5]) Cheers, -- Francesc Alted From sahar at cmt.co.il Tue Feb 17 03:32:10 2009 From: sahar at cmt.co.il (Sahar Vilan) Date: Tue, 17 Feb 2009 10:32:10 +0200 Subject: [SciPy-user] Delete equal elements from array In-Reply-To: <499A6DBB.8080504@gmail.com> Message-ID: Thanks -----Original Message----- From: scipy-user-bounces at scipy.org [mailto:scipy-user-bounces at scipy.org]On Behalf Of fred Sent: Tue, February 17, 2009 9:57 AM To: SciPy Users List Subject: Re: [SciPy-user] Delete equal elements from array Sahar Vilan a ?crit : > Hi, > > I have to exclude equal elements from an array. > x1 = np.array([1, 4, 5, 5]) > > Is there any function to get from x1 an array with one element of each > kind ( [1, 4, 5, 5] -> [1, 4, 5])? np.unique(x1)? Cheers, -- Fred _______________________________________________ SciPy-user mailing list SciPy-user at scipy.org http://projects.scipy.org/mailman/listinfo/scipy-user **************************************************************************** ******** This footnote confirms that this email message has been scanned by PineApp Mail-SeCure for the presence of malicious code, vandals & computer viruses. **************************************************************************** ******** No virus found in this incoming message. Checked by AVG - www.avg.com Version: 8.0.237 / Virus Database: 270.10.25/1956 - Release Date: 02/16/09 18:31:00 ******************************************************************************************************* This e-mail message may contain confidential,and privileged information or data that constitute proprietary information of CMT Medical Ltd. Any review or distribution by others is strictly prohibited. If you are not the intended recipient you are hereby notified that any use of this information or data by any other person is absolutely prohibited. If you are not the intended recipient, please delete all copies. Thank You. http://www.cmt.co.il ******************************************************************************************************** ************************************************************************************ This footnote confirms that this email message has been scanned by PineApp Mail-SeCure for the presence of malicious code, vandals & computer viruses. 
************************************************************************************ From sahar at cmt.co.il Tue Feb 17 03:32:22 2009 From: sahar at cmt.co.il (Sahar Vilan) Date: Tue, 17 Feb 2009 10:32:22 +0200 Subject: [SciPy-user] Delete equal elements from array In-Reply-To: <200902170857.44881.faltet@pytables.org> Message-ID: Thanks -----Original Message----- From: scipy-user-bounces at scipy.org [mailto:scipy-user-bounces at scipy.org]On Behalf Of Francesc Alted Sent: Tue, February 17, 2009 9:58 AM To: SciPy Users List Subject: Re: [SciPy-user] Delete equal elements from array A Tuesday 17 February 2009, Sahar Vilan escrigu?: > Hi, > > I have to exclude equal elements from an array. > x1 = np.array([1, 4, 5, 5]) > > Is there any function to get from x1 an array with one element of > each kind ( [1, 4, 5, 5] -> [1, 4, 5])? Yes. Try np.unique (or np.unique1d, depending on your needs): In [2]: x1 = np.array([1, 4, 5, 5]) In [3]: np.unique(x1) Out[3]: array([1, 4, 5]) Cheers, -- Francesc Alted _______________________________________________ SciPy-user mailing list SciPy-user at scipy.org http://projects.scipy.org/mailman/listinfo/scipy-user ************************************************************************************ This footnote confirms that this email message has been scanned by PineApp Mail-SeCure for the presence of malicious code, vandals & computer viruses. ************************************************************************************ No virus found in this incoming message. Checked by AVG - www.avg.com Version: 8.0.237 / Virus Database: 270.10.25/1956 - Release Date: 02/16/09 18:31:00 ******************************************************************************************************* This e-mail message may contain confidential,and privileged information or data that constitute proprietary information of CMT Medical Ltd. Any review or distribution by others is strictly prohibited. If you are not the intended recipient you are hereby notified that any use of this information or data by any other person is absolutely prohibited. If you are not the intended recipient, please delete all copies. Thank You. http://www.cmt.co.il ******************************************************************************************************** ************************************************************************************ This footnote confirms that this email message has been scanned by PineApp Mail-SeCure for the presence of malicious code, vandals & computer viruses. ************************************************************************************ From guilherme at gpfreitas.com Tue Feb 17 04:23:20 2009 From: guilherme at gpfreitas.com (Guilherme P. de Freitas) Date: Tue, 17 Feb 2009 01:23:20 -0800 Subject: [SciPy-user] Simple plot problem Message-ID: Hi everybody, I'm running Python 2.5.2 with EPD with Py2.5 4.0.30002 I need to plot three simpe things, in 3d, domain is x, y in [0,3] x [0,3: 1. the graph of the function f(x,y) = x**(0.3) * y**(0.7) 2. the parametric straight line (x, 2 - 2*x, 0) for x in [0,1] 3. the parametric curve (x, 2 - 2*x, f(x, 2 - 2*x)) - I thought of matplotlib, mlab (from mayavi2), SymPy and PyX. - Apparently matplotlib cannot do 3d plots. Is this correct? My version of matplotlib is 0.98. - I could not plot 2 (the parametric line) with mlab. I am not sure I understood the syntax of mlab.plot3d, and the examples everywhere are too complicated for me to understand how it works. 
I tried "mlab.plot3d([0, 1], [2, 0], [0, 0])" and it gave me a short line, very small, that was in the interior of the positive quadrant of the xy plane. Imagine the line I wanted and shrink it 80% to its midpoint, that's what I got. - I tried SymPy. I could plot everything easily. However, I could not change thickness and color of curves. Didn't find anything in the documentation. Any ideas? By the way, setting colors of objects in both SymPy and Mlab seems to complicated for simple use. How do I make the object red, for example? - PyX apparently has no 3d plotting capabilities. Is this correct? I could not find an example of a simple plot of a function f : R^2 -> R in any python tool, with explanations of how to change thickness, color, style and labels of plots. Ideally, labels should be rendered with TeX. There were all sorts of complicated examples, but nothing simple like what I needed. And no example simple enough that I could really understand what was going on for all the commands I needed. I will gladly write a short tutorial about it if there is a way to do this and I find out the solution. Now I will try GnuPlot, and if that does not work, Maple. But I'd like to do it with Python. Any help is appreciated. Thanks, -- Guilherme P. de Freitas http://www.gpfreitas.com From gael.varoquaux at normalesup.org Tue Feb 17 04:46:11 2009 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Tue, 17 Feb 2009 10:46:11 +0100 Subject: [SciPy-user] Simple plot problem In-Reply-To: References: Message-ID: <20090217094611.GC17638@phare.normalesup.org> On Tue, Feb 17, 2009 at 01:23:20AM -0800, Guilherme P. de Freitas wrote: > I need to plot three simpe things, in 3d, domain is x, y in [0,3] x > [0,3: > 1. the graph of the function f(x,y) = x**(0.3) * y**(0.7) > 2. the parametric straight line (x, 2 - 2*x, 0) for x in [0,1] > 3. the parametric curve (x, 2 - 2*x, f(x, 2 - 2*x)) Here is some code that probably does what you want. Use the keyword arguments of the plotting functions to change the different properties (such as linewidth) of the objects created): ################################################################################ import numpy as np from enthought.mayavi import mlab x, y = np.mgrid[0:3:100j, 0:3:100j] def f(x, y): return x**(0.3) * y**(0.7) mlab.surf(x, y, f) x = np.linspace(0, 1, 100) mlab.plot3d(x, 2-2*x, np.zeros_like(x)) mlab.plot3d(x, 2 - 2*x, f(x, 2 - 2*x)) mlab.show() ################################################################################ I would interested in figuring out what posed problem in the documentation, and how things could be improved. The trick is to create arrays to evaluate the functions on: x and y. For a surface, you need to create a 2D array, in other words a grid of x and y varying in the 2 directions. This is what the mgrid function does. Maybe a note about mgrid in the documentation relative to plotting surface could help? HTH, Ga?l From guilherme at gpfreitas.com Tue Feb 17 05:00:06 2009 From: guilherme at gpfreitas.com (Guilherme P. de Freitas) Date: Tue, 17 Feb 2009 02:00:06 -0800 Subject: [SciPy-user] Simple plot problem In-Reply-To: <20090217094611.GC17638@phare.normalesup.org> References: <20090217094611.GC17638@phare.normalesup.org> Message-ID: Hi Gael, Thanks for the quick reply! So, I tried your code, and I got the same problem: the parametric curves are "shrinked". I don't know if it's a bug in my system, but the parametric curves are just not right. 
You can see a picture here: http://archive.gpfreitas.com/misc/snapshot.png I had achieved this before. As I said, the problem is with the parametric curves, not with the f function. Thanks! Guilherme On Tue, Feb 17, 2009 at 1:46 AM, Gael Varoquaux wrote: > On Tue, Feb 17, 2009 at 01:23:20AM -0800, Guilherme P. de Freitas wrote: >> I need to plot three simpe things, in 3d, domain is x, y in [0,3] x >> [0,3: > >> 1. the graph of the function f(x,y) = x**(0.3) * y**(0.7) >> 2. the parametric straight line (x, 2 - 2*x, 0) for x in [0,1] >> 3. the parametric curve (x, 2 - 2*x, f(x, 2 - 2*x)) > > Here is some code that probably does what you want. Use the keyword > arguments of the plotting functions to change the different properties > (such as linewidth) of the objects created): > > ################################################################################ > import numpy as np > from enthought.mayavi import mlab > > x, y = np.mgrid[0:3:100j, 0:3:100j] > def f(x, y): > return x**(0.3) * y**(0.7) > mlab.surf(x, y, f) > > x = np.linspace(0, 1, 100) > mlab.plot3d(x, 2-2*x, np.zeros_like(x)) > mlab.plot3d(x, 2 - 2*x, f(x, 2 - 2*x)) > mlab.show() > ################################################################################ > > I would interested in figuring out what posed problem in the > documentation, and how things could be improved. The trick is to create > arrays to evaluate the functions on: x and y. For a surface, you need to > create a 2D array, in other words a grid of x and y varying in the 2 > directions. This is what the mgrid function does. Maybe a note about > mgrid in the documentation relative to plotting surface could help? > > HTH, > > Ga?l > _______________________________________________ > SciPy-user mailing list > SciPy-user at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-user > -- Guilherme P. de Freitas http://www.gpfreitas.com From Ralf_Ahlbrink at web.de Tue Feb 17 04:38:58 2009 From: Ralf_Ahlbrink at web.de (Ralf Ahlbrink) Date: Tue, 17 Feb 2009 10:38:58 +0100 Subject: [SciPy-user] Simple plot problem In-Reply-To: References: Message-ID: <200902171039.29112.Ralf_Ahlbrink@web.de> Am Dienstag, 17. Februar 2009 schrieb Guilherme P. de Freitas: > Hi everybody, > > I'm running Python 2.5.2 with EPD with Py2.5 4.0.30002 > > I need to plot three simpe things, in 3d, domain is x, y in [0,3] x > [0,3: > > 1. the graph of the function f(x,y) = x**(0.3) * y**(0.7) > 2. the parametric straight line (x, 2 - 2*x, 0) for x in [0,1] > 3. the parametric curve (x, 2 - 2*x, f(x, 2 - 2*x)) > > > - I thought of matplotlib, mlab (from mayavi2), SymPy and PyX. > > - Apparently matplotlib cannot do 3d plots. Is this correct? My > version of matplotlib is 0.98. > > - I could not plot 2 (the parametric line) with mlab. I am not sure > I understood the syntax of mlab.plot3d, and the examples > everywhere are too complicated for me to understand how it works. > I tried "mlab.plot3d([0, 1], [2, 0], [0, 0])" and it gave me a > short line, very small, that was in the interior of the positive > quadrant of the xy plane. Imagine the line I wanted and shrink it > 80% to its midpoint, that's what I got. > > - I tried SymPy. I could plot everything easily. However, I could > not change thickness and color of curves. Didn't find anything in > the documentation. Any ideas? By the way, setting colors of > objects in both SymPy and Mlab seems to complicated for simple > use. How do I make the object red, for example? > > - PyX apparently has no 3d plotting capabilities. Is this correct? 
> > > I could not find an example of a simple plot of a function f : R^2 > -> R in any python tool, with explanations of how to change > thickness, color, style and labels of plots. Ideally, labels should > be rendered with TeX. There were all sorts of complicated examples, > but nothing simple like what I needed. And no example simple enough > that I could really understand what was going on for all the > commands I needed. I will gladly write a short tutorial about it if > there is a way to do this and I find out the solution. > > Now I will try GnuPlot, and if that does not work, Maple. But I'd > like to do it with Python. Any help is appreciated. > > Thanks, Hi! You could use Mayavi (see Enthought site) for interactive 3D plotting, e.g. start (recent version of) ipython: $ ipython -pylab -wthread and in this session: In [1]: from enthought.mayavi import mlab See http://code.enthought.com/projects/mayavi/docs/development/html/mayavi/mlab.html#simple- scripting-with-mlab for examples. Regards, Ralf. -------------- next part -------------- An HTML attachment was scrubbed... URL: From guilherme at gpfreitas.com Tue Feb 17 05:10:29 2009 From: guilherme at gpfreitas.com (Guilherme P. de Freitas) Date: Tue, 17 Feb 2009 02:10:29 -0800 Subject: [SciPy-user] Simple plot problem In-Reply-To: <200902171039.29112.Ralf_Ahlbrink@web.de> References: <200902171039.29112.Ralf_Ahlbrink@web.de> Message-ID: Hi, Ralf > You could use Mayavi (see Enthought site) for interactive 3D plotting, e.g. > start (recent version of) ipython: > > $ ipython -pylab -wthread > > and in this session: > > In [1]: from enthought.mayavi import mlab > > See > http://code.enthought.com/projects/mayavi/docs/development/html/mayavi/mlab.html#simple-scripting-with-mlab > for examples. Yes, I tried this. Unfortunately, this does not solve my problem. The code I had, and the code Gael sent me give the following result: http://archive.gpfreitas.com/misc/snapshot.png See that tiny closed curve? The top part of it should be in the graph of the function (in wireframe style) and the bottom line should touch the x and the y axis. I couldn't understand why this is the case, and the examples did not help in understanding that. -- Guilherme P. de Freitas http://www.gpfreitas.com From guilherme at gpfreitas.com Tue Feb 17 05:56:08 2009 From: guilherme at gpfreitas.com (Guilherme P. de Freitas) Date: Tue, 17 Feb 2009 02:56:08 -0800 Subject: [SciPy-user] Simple plot problem In-Reply-To: <20090217094611.GC17638@phare.normalesup.org> References: <20090217094611.GC17638@phare.normalesup.org> Message-ID: Hi again, Gael. I forgot to comment on the documentation. > I would interested in figuring out what posed problem in the > documentation, and how things could be improved. The trick is to create > arrays to evaluate the functions on: x and y. For a surface, you need to > create a 2D array, in other words a grid of x and y varying in the 2 > directions. This is what the mgrid function does. Maybe a note about > mgrid in the documentation relative to plotting surface could help? The use of mgrid was not the puzzle. The puzzle was why I got the shrinked object in the plot3d. As for the documentation, it would be nice in the documentation of the plot3d function to have specified what is a valid input. As far as I know, here's the official documentation of the mlab.plot3d function: http://code.enthought.com/projects/mayavi/docs/development/html/mayavi/auto/mlab_helper_functions.html#plot3d And it just says "Draws lines between points". 
I think something like "x, y and z must be arrays of the same shape" would help a lot. I figured that out by trial and error (the error messages are informative), but it took a while, and given that I had a problem, I thought I was doing something wrong (I'm still not sure, because I still have the "shrinking" problem), and the documentation did not help in figuring out what was wrong. In part the documentation did not help be because a specification of valid input was not available, but also in part because there isn't a single *simple* example of Mlab worked out. I suppose plotting simple (x,y) |-> z function is probably not the intended use of Mlab, but given the lack of other 3d plotting alternatives in Python, it can be very useful. -- Guilherme P. de Freitas http://www.gpfreitas.com From mjakubik at ta3.sk Tue Feb 17 05:28:20 2009 From: mjakubik at ta3.sk (Marian Jakubik) Date: Tue, 17 Feb 2009 11:28:20 +0100 Subject: [SciPy-user] Simple plot problem - extended In-Reply-To: References: Message-ID: <20090217112820.609ff403@jakubik.ta3.sk> Hi Gael, the code you create "for" Ralph gives this error: *** (python:29796): Gtk-CRITICAL **: gtk_widget_set_colormap: assertion `!GTK_WIDGET_REALIZED (widget)' failed Traceback (most recent call last): File "my.py", line 12, in mlab.show() AttributeError: 'module' object has no attribute 'show' Segmentation fault *** I was looking for the solution through Google and I found one thread in [Enthought-dev] mailing list from September 2008 dealing with the problem that mlab (from enthought.mayavi import mlab) object has no attribute 'show'. But there was no solution :( I apologize (especially to Ralph) for mixing the subjects of the discussion... Thanks for your useful comments... Best, Marian From gael.varoquaux at normalesup.org Tue Feb 17 06:33:09 2009 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Tue, 17 Feb 2009 12:33:09 +0100 Subject: [SciPy-user] Simple plot problem - extended In-Reply-To: <20090217112820.609ff403@jakubik.ta3.sk> References: <20090217112820.609ff403@jakubik.ta3.sk> Message-ID: <20090217113309.GE17638@phare.normalesup.org> On Tue, Feb 17, 2009 at 11:28:20AM +0100, Marian Jakubik wrote: > Hi Gael, > the code you create "for" Ralph gives this error: > *** > (python:29796): Gtk-CRITICAL **: gtk_widget_set_colormap: assertion > `!GTK_WIDGET_REALIZED (widget)' failed Traceback (most recent call > last): File "my.py", line 12, in > mlab.show() > AttributeError: 'module' object has no attribute 'show' > Segmentation fault > *** What is your version of mayavi2? I believe you have a version older than 3.0. You should really update to 3.0 or later, see http://code.enthought.com/projects/mayavi/docs/development/html/mayavi/installation.html on how to upgrade. In particular, if you are using Ubuntu or Debian, there are specific instructions on how to get Debian packages. Ga?l From jbednar at inf.ed.ac.uk Tue Feb 17 05:51:19 2009 From: jbednar at inf.ed.ac.uk (James A. Bednar) Date: Tue, 17 Feb 2009 10:51:19 +0000 Subject: [SciPy-user] Simple plot problem In-Reply-To: References: Message-ID: <18842.38567.588551.585016@cortex.inf.ed.ac.uk> | Date: Tue, 17 Feb 2009 01:23:20 -0800 | From: "Guilherme P. de Freitas" | Subject: [SciPy-user] Simple plot problem | To: SciPy Users List | Message-ID: | Content-Type: text/plain; charset=ISO-8859-1 | | Hi everybody, | | I'm running Python 2.5.2 with EPD with Py2.5 4.0.30002 | | I need to plot three simpe things, in 3d, domain is x, y in [0,3] x | [0,3: | | 1. 
the graph of the function f(x,y) = x**(0.3) * y**(0.7) | 2. the parametric straight line (x, 2 - 2*x, 0) for x in [0,1] | 3. the parametric curve (x, 2 - 2*x, f(x, 2 - 2*x)) | | | - I thought of matplotlib, mlab (from mayavi2), SymPy and PyX. | | - Apparently matplotlib cannot do 3d plots. Is this correct? My | version of matplotlib is 0.98. | | - I could not plot 2 (the parametric line) with mlab. I am not sure | I understood the syntax of mlab.plot3d, and the examples | everywhere are too complicated for me to understand how it works. | I tried "mlab.plot3d([0, 1], [2, 0], [0, 0])" and it gave me a | short line, very small, that was in the interior of the positive | quadrant of the xy plane. Imagine the line I wanted and shrink it | 80% to its midpoint, that's what I got. | | - I tried SymPy. I could plot everything easily. However, I could | not change thickness and color of curves. Didn't find anything in | the documentation. Any ideas? By the way, setting colors of | objects in both SymPy and Mlab seems to complicated for simple | use. How do I make the object red, for example? | | - PyX apparently has no 3d plotting capabilities. Is this correct? | | | I could not find an example of a simple plot of a function f : R^2 | -> R in any python tool, with explanations of how to change | thickness, color, style and labels of plots. Ideally, labels should | be rendered with TeX. There were all sorts of complicated examples, | but nothing simple like what I needed. And no example simple enough | that I could really understand what was going on for all the | commands I needed. I will gladly write a short tutorial about it if | there is a way to do this and I find out the solution. | | Now I will try GnuPlot, and if that does not work, Maple. But I'd | like to do it with Python. Any help is appreciated. I use matplotlib-0.91.4, and I can do nice 3D wireframe plots using the code below. However, I don't think that works in newer versions, so I have stopped upgrading. The matplotlib docs tell how to change colors, etc. Jim _______________________________________________________________________________ import pylab from numpy import outer,arange,cos,sin,ones,zeros,array from matplotlib import axes3d def matrixplot3d(mat,title=None): fig = pylab.figure() ax = axes3d.Axes3D(fig) # Construct matrices for r and c values rn,cn = mat.shape c = outer(ones(rn),arange(cn*1.0)) r = outer(arange(rn*1.0),ones(cn)) ax.plot_wireframe(r,c,mat) ax.set_xlabel('R') ax.set_ylabel('C') ax.set_zlabel('Value') if title: windowtitle(title) pylab.show() matrixplot3d(array([[0.1,0.5,0.9],[0.2,0.1,0.0]])) -- The University of Edinburgh is a charitable body, registered in Scotland, with registration number SC005336. From gael.varoquaux at normalesup.org Tue Feb 17 06:38:52 2009 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Tue, 17 Feb 2009 12:38:52 +0100 Subject: [SciPy-user] Simple plot problem In-Reply-To: References: <20090217094611.GC17638@phare.normalesup.org> Message-ID: <20090217113852.GF17638@phare.normalesup.org> On Tue, Feb 17, 2009 at 02:56:08AM -0800, Guilherme P. de Freitas wrote: > In part the documentation did not help be because a specification of > valid input was not available, but also in part because there isn't > a single *simple* example of Mlab worked out. I suppose plotting > simple (x,y) |-> z function is probably not the intended use of > Mlab, but given the lack of other 3d plotting alternatives in > Python, it can be very useful. 
The plotting of a simple (x, y) -> function is definitely part of the intended use of mlab. I don't understand why you say that there isn't a single simple example, isn't the following acceptable: http://code.enthought.com/projects/mayavi/docs/development/html/mayavi/auto/mlab_helper_functions.html#enthought.mayavi.mlab.surf ? Did you not find this example. In which case we must make it clearer. Cheers, Ga?l From mjakubik at ta3.sk Tue Feb 17 06:42:59 2009 From: mjakubik at ta3.sk (Marian Jakubik) Date: Tue, 17 Feb 2009 12:42:59 +0100 Subject: [SciPy-user] Simple plot problem - extended In-Reply-To: <20090217113309.GE17638@phare.normalesup.org> References: <20090217112820.609ff403@jakubik.ta3.sk> <20090217113309.GE17638@phare.normalesup.org> Message-ID: <20090217124259.3bc24589@jakubik.ta3.sk> Hi, my version of mayavi2 is 2.2.0 from official Ubuntu repositories. I see I have to update to newer version... Thanks for your reply... Marian D?a Tue, 17 Feb 2009 12:33:09 +0100 Gael Varoquaux nap?sal: > On Tue, Feb 17, 2009 at 11:28:20AM +0100, Marian Jakubik wrote: > > Hi Gael, > > > the code you create "for" Ralph gives this error: > > > *** > > > (python:29796): Gtk-CRITICAL **: gtk_widget_set_colormap: assertion > > `!GTK_WIDGET_REALIZED (widget)' failed Traceback (most recent call > > last): File "my.py", line 12, in > > mlab.show() > > AttributeError: 'module' object has no attribute 'show' > > Segmentation fault > > > *** > > What is your version of mayavi2? I believe you have a version older than > 3.0. You should really update to 3.0 or later, see > http://code.enthought.com/projects/mayavi/docs/development/html/mayavi/installation.html > on how to upgrade. In particular, if you are using Ubuntu or Debian, > there are specific instructions on how to get Debian packages. > > Ga?l > _______________________________________________ > SciPy-user mailing list > SciPy-user at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-user From gael.varoquaux at normalesup.org Tue Feb 17 06:53:41 2009 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Tue, 17 Feb 2009 12:53:41 +0100 Subject: [SciPy-user] Simple plot problem In-Reply-To: References: <20090217094611.GC17638@phare.normalesup.org> Message-ID: <20090217115341.GG17638@phare.normalesup.org> On Tue, Feb 17, 2009 at 02:56:08AM -0800, Guilherme P. de Freitas wrote: > And it just says "Draws lines between points". I think something > like "x, y and z must be arrays of the same shape" would help a lot. > I figured that out by trial and error (the error messages are > informative) Thanks for the feedback. I changed the docs to be a bit more informative. I will indeed need to go over them again to be sure the input arguments are clear. Ga?l From gael.varoquaux at normalesup.org Tue Feb 17 06:55:07 2009 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Tue, 17 Feb 2009 12:55:07 +0100 Subject: [SciPy-user] Simple plot problem - extended In-Reply-To: <20090217124259.3bc24589@jakubik.ta3.sk> References: <20090217112820.609ff403@jakubik.ta3.sk> <20090217113309.GE17638@phare.normalesup.org> <20090217124259.3bc24589@jakubik.ta3.sk> Message-ID: <20090217115507.GH17638@phare.normalesup.org> On Tue, Feb 17, 2009 at 12:42:59PM +0100, Marian Jakubik wrote: > my version of mayavi2 is 2.2.0 from official Ubuntu repositories. > I see I have to update to newer version... That's what I guessed. The next version of Ubuntu will have the latest Mayavi2, thanks to the excellent work of the packagers. 
Due to rapid last summer, Ubuntu got a bit out of sync for this release. Cheers, Ga?l From gael.varoquaux at normalesup.org Tue Feb 17 06:57:43 2009 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Tue, 17 Feb 2009 12:57:43 +0100 Subject: [SciPy-user] Simple plot problem In-Reply-To: References: <20090217094611.GC17638@phare.normalesup.org> Message-ID: <20090217115743.GI17638@phare.normalesup.org> On Tue, Feb 17, 2009 at 02:00:06AM -0800, Guilherme P. de Freitas wrote: > Hi Gael, > Thanks for the quick reply! > So, I tried your code, and I got the same problem: the parametric > curves are "shrinked". I don't know if it's a bug in my system, but > the parametric curves are just not right. You can see a picture here: > http://archive.gpfreitas.com/misc/snapshot.png I am not sure why you are getting this. This might be because of a slight miss-feature that we corrected in a recent version of Mayavi2. What is your version of Mayavi2. I will try to get my hands on a computer where an older version of Mayavi2 to try things out. Also, did you use the exact same code as the one I sent you? Having the exact code will help me diagnose the problem. Cheers, Ga?l From guilherme at gpfreitas.com Tue Feb 17 06:58:15 2009 From: guilherme at gpfreitas.com (Guilherme P. de Freitas) Date: Tue, 17 Feb 2009 03:58:15 -0800 Subject: [SciPy-user] Simple plot problem In-Reply-To: <20090217113852.GF17638@phare.normalesup.org> References: <20090217094611.GC17638@phare.normalesup.org> <20090217113852.GF17638@phare.normalesup.org> Message-ID: > The plotting of a simple (x, y) -> function is definitely part of the > intended use of mlab. I don't understand why you say that there isn't a > single simple example, isn't the following acceptable: > http://code.enthought.com/projects/mayavi/docs/development/html/mayavi/auto/mlab_helper_functions.html#enthought.mayavi.mlab.surf ? > > Did you not find this example. In which case we must make it clearer. Sorry, I didn't express myself correctly. There should be a simple example for each function. That is just the mlab.surf. For example, in my case, I get the shrinked object in the mlab.plot3d function. As there was no explanation of what was a valid input and the example in the section was too complicated for me, I did not know if I had the wrong input or if it is a problem in the software. But, still, back to the original problem. Did you see the problem with that code? I mean, test the code you sent me on the first email. Don't you get a "shrinked" version of the parametric curve like in this link: http://archive.gpfreitas.com/misc/snapshot.png This is what I get with your code. And I got this before too, trying different approaches. I don't know why it is giving me this wrong behavior. Again, what I need is: 1. the graph of the function f(x,y) = x**(0.3) * y**(0.7), for x and y in [0,3] 2. the parametric straight line (x, 2 - 2*x, 0) for x in [0,1] 3. the parametric curve (x, 2 - 2*x, f(x, 2 - 2*x)) for x in [0,1] The code you sent me in the first email does not correctly plot the last two objects. Objects 2 and 3 (which form a closed curve) are "shrinked" (see link above). Thanks! -- Guilherme P. de Freitas http://www.gpfreitas.com From guilherme at gpfreitas.com Tue Feb 17 07:05:25 2009 From: guilherme at gpfreitas.com (Guilherme P. 
de Freitas) Date: Tue, 17 Feb 2009 04:05:25 -0800 Subject: [SciPy-user] Simple plot problem In-Reply-To: <20090217115743.GI17638@phare.normalesup.org> References: <20090217094611.GC17638@phare.normalesup.org> <20090217115743.GI17638@phare.normalesup.org> Message-ID: >> So, I tried your code, and I got the same problem: the parametric >> curves are "shrinked". I don't know if it's a bug in my system, but >> the parametric curves are just not right. You can see a picture here: > >> http://archive.gpfreitas.com/misc/snapshot.png > > I am not sure why you are getting this. This might be because of a slight > miss-feature that we corrected in a recent version of Mayavi2. What is > your version of Mayavi2. I will try to get my hands on a computer where > an older version of Mayavi2 to try things out. > > Also, did you use the exact same code as the one I sent you? Having the > exact code will help me diagnose the problem. > > Cheers, > > Ga?l Hi again. My version of Mayavi is 3.0.3. I tried your code, exactly how you sent, and I got the same shrinked object. I wanted to add a "representation='wireframe'" to the plot to see more clearly, but for some reason it did not work in your code (I still don't understand this very well). I decided to put your code for the parametric curves in my previous code (that had the function in wireframe style), and rotated it and took a snapshot. That's what you sent. The exact code that generates that picture in the link above is: ###################################################################### #!/usr/env python import numpy as np from enthought.mayavi import mlab x, y = np.mgrid[0.0:3.0:0.1, 0.0:3.0:0.1] def f(x,y): return x**(0.3) * y**(0.7) utility = mlab.surf(x, y, f, representation='wireframe') x = np.linspace(0, 1, 100) mlab.plot3d(x, 2-2*x, np.zeros_like(x)) mlab.plot3d(x, 2 - 2*x, f(x, 2 - 2*x)) mlab.axes(utility) ###################################################################### -- Guilherme P. de Freitas http://www.gpfreitas.com From aisaac at american.edu Tue Feb 17 07:39:57 2009 From: aisaac at american.edu (Alan G Isaac) Date: Tue, 17 Feb 2009 07:39:57 -0500 Subject: [SciPy-user] Simple plot problem In-Reply-To: References: Message-ID: <499AB01D.8080803@american.edu> On 2/17/2009 4:23 AM Guilherme P. de Freitas apparently wrote: > I need to plot three simpe things, in 3d, domain is x, y in [0,3] x > [0,3: > > 1. the graph of the function f(x,y) = x**(0.3) * y**(0.7) > 2. the parametric straight line (x, 2 - 2*x, 0) for x in [0,1] > 3. the parametric curve (x, 2 - 2*x, f(x, 2 - 2*x)) 1. Your parametric function specifications do not enforce your original domain restriction. 2. Gnuplot.py and PyX can also do 3d static graphs; both are light-weight. Alan Isaac From gael.varoquaux at normalesup.org Tue Feb 17 08:02:24 2009 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Tue, 17 Feb 2009 14:02:24 +0100 Subject: [SciPy-user] Simple plot problem In-Reply-To: References: <20090217094611.GC17638@phare.normalesup.org> <20090217115743.GI17638@phare.normalesup.org> Message-ID: <20090217130224.GJ17638@phare.normalesup.org> On Tue, Feb 17, 2009 at 04:05:25AM -0800, Guilherme P. de Freitas wrote: > My version of Mayavi is 3.0.3. OK, that might be in the default scaling of surf that has changed between the 3.0.3 and the 3.1.0 (I can't get my hands on an install of 3.0.3, and I don't want to downgrade the boxes at work). 
If you don't want to upgrade the version of mayavi, you can try to force the extent, as in the following code: ################################################################################ import numpy as np from enthought.mayavi import mlab x, y = np.mgrid[0:3:100j, 0:3:100j] def f(x, y): return x**(0.3) * y**(0.7) mlab.surf(x, y, f, extent=[0, 3, 0, 3, 0, f(x, y).max()]) x = np.linspace(0, 1, 100) mlab.plot3d(x, 2-2*x, np.zeros_like(x)) mlab.plot3d(x, 2 - 2*x, f(x, 2 - 2*x)) mlab.show() ################################################################################ Do note that the docs on the web are for the 3.1.0 version. There are some small differences, and you seem to have hit a version where we made a change in auto-scaling based on user experience. Please tell me if the above code works for you. Cheers, Ga?l From guilherme at gpfreitas.com Tue Feb 17 08:31:32 2009 From: guilherme at gpfreitas.com (Guilherme P. de Freitas) Date: Tue, 17 Feb 2009 05:31:32 -0800 Subject: [SciPy-user] Simple plot problem In-Reply-To: <20090217130224.GJ17638@phare.normalesup.org> References: <20090217094611.GC17638@phare.normalesup.org> <20090217115743.GI17638@phare.normalesup.org> <20090217130224.GJ17638@phare.normalesup.org> Message-ID: On Tue, Feb 17, 2009 at 5:02 AM, Gael Varoquaux wrote: > Do note that the docs on the web are for the 3.1.0 version. There are > some small differences, and you seem to have hit a version where we made > a change in auto-scaling based on user experience. I'll try to upgrade as soon as possible. Actually, I don't even know how to do this, I try to just install the newest EPD binaries on a regular basis. > Please tell me if the above code works for you. Works like a charm! Thanks! Guilherme -- Guilherme P. de Freitas http://www.gpfreitas.com From H.Zahiri at curtin.edu.au Tue Feb 17 09:29:19 2009 From: H.Zahiri at curtin.edu.au (Hani Zahiri) Date: Tue, 17 Feb 2009 23:29:19 +0900 Subject: [SciPy-user] What is the equivalent of MATLAB's "interp2(Z, ntimes)" in Scipy's "interp2d"? Message-ID: <82200558F6DE2C479D381D3000D1551C01C79443@EXMSK1.staff.ad.curtin.edu.au> Hi Folks, Would you help me to write this line from MATLAB in Python: interp2(arr,3) where 'arr' is an array with shape (2,270). I tried to use "scipy.interpolate.inetrp2d()" but it doesn't work since there is no option to do interleaving interpolates between every elements same as what MATLAB does using "interp2(Z, ntimes)" I'll be happy if you help me with this Cheers, Hani -------------- next part -------------- An HTML attachment was scrubbed... URL: From Dharhas.Pothina at twdb.state.tx.us Tue Feb 17 10:48:28 2009 From: Dharhas.Pothina at twdb.state.tx.us (Dharhas Pothina) Date: Tue, 17 Feb 2009 09:48:28 -0600 Subject: [SciPy-user] Approximate volume of an irregular triangular mesh. Message-ID: <499A87E8.63BA.009B.0@twdb.state.tx.us> Hi, I have a unstructured mesh in the form of triangular prisms. ie a irregular triangular mesh in 2D extruded in z to form prisms. The mesh is in the format: Node# XCoord YCoord Depth ... ... Element# Node1 Node2 Node3 ... ... I want to calculate the approximate volume of this mesh. The brute force way is to cycle through each triangular element and calculate the area of each triangle and multiply it by the average depth of the three nodes of the element. I was wondering if there was a simpler way maybe by just using the surface defined by the nodes and ignoring the connectivity. 
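(For reference, the brute-force sum described above vectorizes to a few numpy
lines; the sketch assumes nodes is an (nnodes, 3) array of x, y, depth and
elems an (nelems, 3) array of zero-based node indices -- subtract 1 first if
the file numbers nodes from 1.)

import numpy as np

def mesh_volume(nodes, elems):
    # corner coordinates of every triangle
    p1 = nodes[elems[:, 0], :2]
    p2 = nodes[elems[:, 1], :2]
    p3 = nodes[elems[:, 2], :2]
    # triangle areas from the cross product of two edge vectors
    areas = 0.5 * np.abs((p2[:, 0] - p1[:, 0]) * (p3[:, 1] - p1[:, 1])
                         - (p3[:, 0] - p1[:, 0]) * (p2[:, 1] - p1[:, 1]))
    # average depth of the three nodes of each element
    depths = nodes[elems, 2].mean(axis=1)
    return np.sum(areas * depths)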
thanks, - dharhas From rmay31 at gmail.com Tue Feb 17 12:43:04 2009 From: rmay31 at gmail.com (Ryan May) Date: Tue, 17 Feb 2009 11:43:04 -0600 Subject: [SciPy-user] odeint for calculating trajectories In-Reply-To: References: Message-ID: On Thu, Feb 12, 2009 at 12:16 PM, Rob Clewley wrote: > > Is there a good way to use scipy.integrate.odeint to calculate > trajectories > > from an observed velocity field? I know you can do this when you have an > > analytic expression for dx/dt, but in this case I have a spatial grid of > > values for dx/dt. The only way I've come up with is to make the function > > passed to odeint something that will interpolate fromt the grid to the > given > > point. > > > I don't think odeint is the right tool for this job - there is no ODE > integration to do if you do not have an explicit function for the > vector field. You should think of it purely as an interpolation > problem. You have (t,x) values and (t, dx/dt) values, so this defines > a piecewise quadratic function which has continuous *second* > derivative everywhere (i.e. the trajectory smoothly agrees at your > mesh points). I would use the polynomial interpolation classes that > were recently added to scipy by Anne Archibald (search this list for > details about it). You pass it your arrays of values and you get back > a function that smoothly interpolates through your points. This is the > most accurate trajectory that you can derive from this finite mesh > vector-field. > I understand the idea of the curve fitting. But I'm having trouble seeing how to take the krogh_interpolator in scipy and apply it to a 2, or better yet, 3 dimensional problem. Any pointers? Ryan -- Ryan May Graduate Research Assistant School of Meteorology University of Oklahoma -------------- next part -------------- An HTML attachment was scrubbed... URL: From wnbell at gmail.com Tue Feb 17 13:59:41 2009 From: wnbell at gmail.com (Nathan Bell) Date: Tue, 17 Feb 2009 13:59:41 -0500 Subject: [SciPy-user] Approximate volume of an irregular triangular mesh. In-Reply-To: <499A87E8.63BA.009B.0@twdb.state.tx.us> References: <499A87E8.63BA.009B.0@twdb.state.tx.us> Message-ID: On Tue, Feb 17, 2009 at 10:48 AM, Dharhas Pothina wrote: > > I want to calculate the approximate volume of this mesh. The brute force way > is to cycle through each triangular element and calculate the area of each > triangle and multiply it by the average depth of the three nodes of the element. > I was wondering if there was a simpler way maybe by just using the surface > defined by the nodes and ignoring the connectivity. > Look at "Subject 2.01: How do I find the area of a polygon?" here: http://www.faqs.org/faqs/graphics/algorithms-faq/ This requires that the edges are consistently oriented (e.g. counter clockwise around the perimeter) Interestingly, you can apply a similar trick in higher dimensions, and even compute things like the center of mass and inertia tensor. http://www.geometrictools.com/Documentation/PolyhedralMassProperties.pdf Ultimately, it boils down to the fact that you can replace volume integrals with a surface integrals using the divergence theorem. -- Nathan Bell wnbell at gmail.com http://graphics.cs.uiuc.edu/~wnbell/ From josef.pktd at gmail.com Tue Feb 17 15:07:07 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Tue, 17 Feb 2009 15:07:07 -0500 Subject: [SciPy-user] What is the equivalent of MATLAB's "interp2(Z, ntimes)" in Scipy's "interp2d"? 
In-Reply-To: <82200558F6DE2C479D381D3000D1551C01C79443@EXMSK1.staff.ad.curtin.edu.au> References: <82200558F6DE2C479D381D3000D1551C01C79443@EXMSK1.staff.ad.curtin.edu.au> Message-ID: <1cd32cbb0902171207i7bcc7439te7c92a7b29355091@mail.gmail.com> On Tue, Feb 17, 2009 at 9:29 AM, Hani Zahiri wrote: > Hi Folks, > > Would you help me to write this line from MATLAB in Python: > > interp2(arr,3) > > where 'arr' is an array with shape (2,270). > > I tried to use "scipy.interpolate.inetrp2d()" but it doesn't work since > there is no option to do interleaving interpolates between every elements > same as what MATLAB does using "interp2(Z, ntimes)" > > I'll be happy if you help me with this > > Cheers, > > Hani > I'm not completely sure what the matlab function does, but this may be the same for interleaving interpolation: (I didn't see a direct solution with scipy.interpolate) >>> z.T array([[ 1., 11.], [ 2., 12.], [ 3., 13.], [ 4., 14.], [ 5., 15.]]) >>> interp2d_interleave_recursive(z,2).T array([[ 1. , 3.5 , 6. , 8.5 , 11. ], [ 1.25, 3.75, 6.25, 8.75, 11.25], [ 1.5 , 4. , 6.5 , 9. , 11.5 ], [ 1.75, 4.25, 6.75, 9.25, 11.75], [ 2. , 4.5 , 7. , 9.5 , 12. ], [ 2.25, 4.75, 7.25, 9.75, 12.25], [ 2.5 , 5. , 7.5 , 10. , 12.5 ], [ 2.75, 5.25, 7.75, 10.25, 12.75], [ 3. , 5.5 , 8. , 10.5 , 13. ], [ 3.25, 5.75, 8.25, 10.75, 13.25], [ 3.5 , 6. , 8.5 , 11. , 13.5 ], [ 3.75, 6.25, 8.75, 11.25, 13.75], [ 4. , 6.5 , 9. , 11.5 , 14. ], [ 4.25, 6.75, 9.25, 11.75, 14.25], [ 4.5 , 7. , 9.5 , 12. , 14.5 ], [ 4.75, 7.25, 9.75, 12.25, 14.75], [ 5. , 7.5 , 10. , 12.5 , 15. ]]) >>> interp2d_interleave_recursive(z,1).T array([[ 1. , 6. , 11. ], [ 1.5, 6.5, 11.5], [ 2. , 7. , 12. ], [ 2.5, 7.5, 12.5], [ 3. , 8. , 13. ], [ 3.5, 8.5, 13.5], [ 4. , 9. , 14. ], [ 4.5, 9.5, 14.5], [ 5. , 10. , 15. ]]) Josef -------------- next part -------------- import numpy as np def interp2d_interleave(z,n): '''performs linear interpolation on a grid all points are interpolated in one step not recursively Parameters ---------- z : 2d array (M,N) n : int number of points interpolated Returns ------- zi : 2d array ((M-1)*n+M, (N-1)*n+N) original and linear interpolated values ''' frac = np.atleast_2d(np.arange(0,n+1)/(1.0+n)).T zi1 = np.kron(z[:,:-1],np.ones(len(frac))) + np.kron(np.diff(z),frac.T) zi1 = np.hstack((zi1,z[:,-1:])) zi2 = np.kron(zi1.T[:,:-1],np.ones(len(frac))) + np.kron(np.diff(zi1.T),frac.T) zi2 = np.hstack((zi2,zi1.T[:,-1:])) return zi2.T def interp2d_interleave_recursive(z,n): '''interpolates by recursively interleaving n times ''' zi = z.copy() for ii in range(1,n+1): zi = interp2d_interleave(zi,1) return zi x = np.linspace(1,5,5) y = np.linspace(11,15,5) z = np.vstack((x,y)) print z.T n = 1 print interp2d_interleave(z,n).T n = 2 print interp2d_interleave(z,n).T n = 3 zi3a = interp2d_interleave(z,n) print zi3a zi3b = interp2d_interleave_recursive(z,2) print zi3b # for linear function recursive and one-step interpolation are the same # I'm not sure about the general case print np.all(zi3a==zi3b) From williamhpurcell at gmail.com Tue Feb 17 15:25:54 2009 From: williamhpurcell at gmail.com (flyeng4) Date: Tue, 17 Feb 2009 12:25:54 -0800 (PST) Subject: [SciPy-user] Fwd: signal.lti(A,B,C,D) with D=0 In-Reply-To: References: Message-ID: <446ebe82-a298-4ab2-90a4-6289cdaff83e@f24g2000vbf.googlegroups.com> I am getting back to using signal.lti for state-space representation. This is a post and a reply that I had received a while back, but there was no follow up. 
Any thoughts on a solution for representing MIMO or SIMO systems with signal.lti, with the issue described below with ss2zpk? -Bill ---------- Forwarded message ---------- From: "Ryan Krauss" Date: Jul 14 2008, 7:41?am Subject: signal.lti(A,B,C,D) with D=0 To: SciPy-user This seems like a bug in ltisys. ?I think this line 149 ? ? type_test = A[:,0] + B[:,0] + C[0,:] + D should be 149 ? ? type_test = A[:,0] + B[:,0] + C[0,:] + D[0,:] for a multipe-input/multiple-output system with i inputs, n states, and m outputs, A should be n by n, B should be n by i, C should be m by n, and D should be m by i. (actually, because of this code a few lines above: ? ? # make MOSI from possibly MOMI system. ? ? if B.shape[-1] != 0: ? ? ? ? B = B[:,input] ? ? B.shape = (B.shape[0],1) ? ? if D.shape[-1] != 0: ? ? ? ? D = D[:,input] I guess line 149 should be 149 ? ? type_test = A[:,0] + B[:,0] + C[0,:] + D[0] ) But making this change exposes another problem: /usr/lib/python2.5/site-packages/scipy/signal/ltisys.py in __init__(self, *args, **kwords) ? ? 224 ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? self.__dict__['D'] = abcd_normalize(*args) ? ? 225 ? ? ? ? ? ? self.__dict__['zeros'], self.__dict__['poles'], \ --> 226 ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? self.__dict__['gain'] = ss2zpk(*args) ? ? 227 ? ? ? ? ? ? self.__dict__['num'], self.__dict__['den'] = ss2tf (*args) ? ? 228 ? ? ? ? ? ? self.inputs = self.B.shape[-1] /usr/lib/python2.5/site-packages/scipy/signal/ltisys.py in ss2zpk(A, B, C, D, input) ? ? 183 ? ? """ ? ? 184 ? ? Pdb().set_trace() --> 185 ? ? return tf2zpk(*ss2tf(A,B,C,D,input=input)) ? ? 186 ? ? 187 class lti(object): /usr/lib/python2.5/site-packages/scipy/signal/filter_design.py in tf2zpk(b, a) ? ? 128 ? ? k = b[0] ? ? 129 ? ? b /= b[0] --> 130 ? ? z = roots(b) ? ? 131 ? ? p = roots(a) ? ? 132 ? ? return z, p, k /usr/lib/python2.5/site-packages/numpy/lib/polynomial.py in roots(p) ? ? ?98 ? ? p = atleast_1d(p) ? ? ?99 ? ? if len(p.shape) != 1: --> 100 ? ? ? ? raise ValueError,"Input must be a rank-1 array." ? ? 101 ? ? 102 ? ? # find non-zero array entries : Input must be a rank-1 array. WARNING: Failure executing file: In [2]: %debug> /usr/lib/python2.5/site-packages/numpy/lib/ polynomial.py(100)roots() ? ? ?99 ? ? if len(p.shape) != 1: --> 100 ? ? ? ? raise ValueError,"Input must be a rank-1 array." ? ? 101 ipdb> print p [[ ?1.00000000e+00 ? 1.00000000e+00 ? 1.00000000e+00 ? 1.00000000e+00 ? ? 1.00000000e+00 ? 1.00000000e+00] ?[ ?1.81898940e-11 ?-3.06222400e+00 ?-2.79196150e+02 ?-5.87518033e+03 ? ?-1.59843721e+04 ? 1.91386789e-07] ?[ ?1.75910240e+01 ? 1.60479206e+03 ? 3.37698682e+04 ? 9.12618857e+04 ? ?-1.23219975e+04 ?-3.29828641e+04]] ss2tf seems to correctly handle MIMO (or at least SIMO systems) correctly and returns one denominator polynomial with several (m) numerator polynomials. ?But tf2zpk in filter_design.py does not seem able to handle more than siso systems (which makes sense, it is expecting a transfer function which is just a SISO system). How should this be fixed? I understand why ss2tf converts a MIMO system to SIMO - trying to represent a mulitple input, multiple output system with a transfer function has some limitations. ?I think the real offender is in the __init__ method of signal.lti: elif N == 4: ... self.__dict__['zeros'], self.__dict__['poles'], \ ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? self.__dict__['gain'] = ss2zpk (*args) For a MIMO system, what should the zpk representation be? ?For a SIMO system, I would expect a vector of poles and a matrix of zeros. 
?This seems to be in line with what ss2tf does. It seems like the init method could find the pole by passing C[0,:] and D[0,:] to ss2zpk. ?The zeros and gains could then be found by calling ss2zpk in some vectorized way or simply in a for loop for each row of C and D. ?But there may well be a more elegant solution. Any thoughts? Ryan On Sun, Jul 13, 2008 at 2:59 PM, William Purcell wrote: > I am trying to set up state space representation of a system using > signal.lti. The system has no feedforward so D = 0. I've tried the three > options listed in the code below to represent D. > The only way I can get it to work is option 2 if C has 1 row. If C has more > than 1 row it won't work. > Any thoughts? > -Bill > Code > --------------------------------------------------------------- > from numpy import ones,matrix > from scipy import signal > r = ones(3) > A = matrix([r,r,r]) > B = matrix([r]).T > C = matrix([r,r]) > #three options of D to make it 0 > #1) D=0 > #2) D = matrix([0,0]).T > #3) D = None > D = 0 > #D = None > #D = matrix([0,0]).T > Gss = signal.lti(A,B,C,D) > ----------------------------------------------------------- > Tracebacks > ----------------------------------------------------------- > Option 1 > /usr/lib/python2.5/site-packages/scipy/signal/ltisys.py in abcd_normalize(A, > B, C, D) > ? ? 101 ? ? ? ? raise ValueError, "A and C must have the same number of > columns." > ? ? 102 ? ? if MD != MC: > --> 103 ? ? ? ? raise ValueError, "C and D must have the same number of > rows." > ? ? 104 ? ? if ND != NB: > ? ? 105 ? ? ? ? raise ValueError, "B and D must have the same number of > columns." > : C and D must have the same number of rows. > WARNING: Failure executing file: > Option 2 (with C as two rows...if C is a single row I do not get this > traceback) > /usr/lib/python2.5/site-packages/scipy/signal/ltisys.py in ss2tf(A, B, C, D, > input) > ? ? 147 > ? ? 148 ? ? num_states = A.shape[0] > --> 149 ? ? type_test = A[:,0] + B[:,0] + C[0,:] + D > ? ? 150 ? ? num = numpy.zeros((nout, num_states+1), type_test.dtype) > ? ? 151 ? ? for k in range(nout): > : shape mismatch: objects cannot be broadcast > to a single shape > WARNING: Failure executing file: > Option 3 > same as 1 > _______________________________________________ > SciPy-user mailing list > SciPy-u... at scipy.org >http://projects.scipy.org/mailman/listinfo/scipy-user _______________________________________________ SciPy-user mailing list SciPy-u... at scipy.orghttp://projects.scipy.org/mailman/listinfo/scipy- user From josef.pktd at gmail.com Tue Feb 17 16:09:32 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Tue, 17 Feb 2009 16:09:32 -0500 Subject: [SciPy-user] Fwd: signal.lti(A,B,C,D) with D=0 In-Reply-To: <446ebe82-a298-4ab2-90a4-6289cdaff83e@f24g2000vbf.googlegroups.com> References: <446ebe82-a298-4ab2-90a4-6289cdaff83e@f24g2000vbf.googlegroups.com> Message-ID: <1cd32cbb0902171309n3f6f7cbevb27f56d5356c67c5@mail.gmail.com> On Tue, Feb 17, 2009 at 3:25 PM, flyeng4 wrote: > I am getting back to using signal.lti for state-space representation. > This is a post and a reply that I had received a while back, but there > was no follow up. Any thoughts on a solution for representing MIMO or > SIMO systems with signal.lti, with the issue described below with > ss2zpk? > -Bill I was looking recently into scipy.signal to see whether it can be used for time series analysis. 
After digging around in the code and trying out some examples (I didn't find any examples in the docs), I got the impression that scipy.signal is not designed to handle more than one signal (I wanted recursive filter with interactions between several time series). I think the main problem I found was, that numpy/scipy doesn't have a multivariate/multidimensional polynomial class. Somewhere when converting from a multidimensional transfer function, scipy signal uses numpy.polynomial which only works for one-dimensional polynomials. That's when I gave up. My conclusion was, that for multidimensional signals, scipy signal would need to be rewritten quite a bit. Josef From rob.clewley at gmail.com Tue Feb 17 17:40:04 2009 From: rob.clewley at gmail.com (Rob Clewley) Date: Tue, 17 Feb 2009 17:40:04 -0500 Subject: [SciPy-user] odeint for calculating trajectories In-Reply-To: References: Message-ID: > I understand the idea of the curve fitting. But I'm having trouble seeing > how to take the krogh_interpolator in scipy and apply it to a 2, or better > yet, 3 dimensional problem. Any pointers? Off the top of my head I would think you can interpolate the x(t) and y(t) parts separately, i.e. make it a parametric problem. You have the individual derivatives dx/dt and dy/dt, and when you have the two curve components interpolated you can reconstruct the curve in 2D. Does that help? -Rob From williamhpurcell at gmail.com Wed Feb 18 09:50:47 2009 From: williamhpurcell at gmail.com (William Purcell) Date: Wed, 18 Feb 2009 08:50:47 -0600 Subject: [SciPy-user] Fwd: signal.lti(A,B,C,D) with D=0 In-Reply-To: <1cd32cbb0902171309n3f6f7cbevb27f56d5356c67c5@mail.gmail.com> References: <446ebe82-a298-4ab2-90a4-6289cdaff83e@f24g2000vbf.googlegroups.com> <1cd32cbb0902171309n3f6f7cbevb27f56d5356c67c5@mail.gmail.com> Message-ID: I think that scipy.signal is set up to do what I need to do, but I am having trouble with ss2tf. Line 149 of ltisys is type_test = A[:,0] + B[:,0] + C[0,:] + D but I keep getting the error Traceback (most recent call last): File "test_lti.py", line 13, in x = scipy.signal.ss2tf(A,B,C,D) File "/usr/lib/python2.5/site-packages/scipy/signal/ltisys.py", line 153, in ss2tf type_test = A[:,0] + B[:,0] + C[0,:] + D ValueError: shape mismatch: objects cannot be broadcast to a single shape This is because A is nxn, B is nxi, C is mxn, and D is mxi (I hope I got that right). My point is that type_test slices the 'n' dimension of each matrix and D doesn't have an 'n' dimension. I think that the ' + D' needs to be removed from type_test or it needs to be padded with n-m elements for the test. I attached a test that reproduces the error. If I comment out '+ D' in ss2tf, it seems to work just fine and return what I want. One last thing, I think that signal.lti should pass an input kwarg to ss2zpk and ss2tf so that you don't have to always look at the first input (0 index). In other words, ss2zpk and ss2tf both have a input kwarg to tell which input to use and I think that signal.lti should have the same feature. Let me know your thoughts. Thanks, Bill -------------- next part -------------- A non-text attachment was scrubbed... 
Name: test_lti.py Type: text/x-python Size: 177 bytes Desc: not available URL: From josef.pktd at gmail.com Wed Feb 18 11:25:31 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 18 Feb 2009 11:25:31 -0500 Subject: [SciPy-user] Fwd: signal.lti(A,B,C,D) with D=0 In-Reply-To: References: <446ebe82-a298-4ab2-90a4-6289cdaff83e@f24g2000vbf.googlegroups.com> <1cd32cbb0902171309n3f6f7cbevb27f56d5356c67c5@mail.gmail.com> Message-ID: <1cd32cbb0902180825x47d70ab2y5a45ea25ff8cdd32@mail.gmail.com> 2009/2/18 William Purcell : > I think that scipy.signal is set up to do what I need to do, but I am > having trouble with ss2tf. Line 149 of ltisys is > > type_test = A[:,0] + B[:,0] + C[0,:] + D > > but I keep getting the error > > Traceback (most recent call last): > File "test_lti.py", line 13, in > x = scipy.signal.ss2tf(A,B,C,D) > File "/usr/lib/python2.5/site-packages/scipy/signal/ltisys.py", line > 153, in ss2tf > type_test = A[:,0] + B[:,0] + C[0,:] + D > ValueError: shape mismatch: objects cannot be broadcast to a single shape > > This is because A is nxn, B is nxi, C is mxn, and D is mxi (I hope I > got that right). My point is that type_test slices the 'n' dimension > of each matrix and D doesn't have an 'n' dimension. I think that the ' > + D' needs to be removed from type_test or it needs to be padded with > n-m elements for the test. > > I attached a test that reproduces the error. If I comment out '+ D' in > ss2tf, it seems to work just fine and return what I want. > > One last thing, I think that signal.lti should pass an input kwarg to > ss2zpk and ss2tf so that you don't have to always look at the first > input (0 index). In other words, ss2zpk and ss2tf both have a input > kwarg to tell which input to use and I think that signal.lti should > have the same feature. > > Let me know your thoughts. > > Thanks, > Bill > I ran your test script with +D commented out as you proposed. x = ss2tf(A,B,C,D) runs without raising an exception, but I didn't check whether the numbers are correct. However, trying to do the reverse operation raises different exceptions, see below. None of the lti functions have any tests in the test suite, so it is difficult for me to figure out what the expected behavior of these functions is, and it makes refactoring or rewriting of the functions a hazardous enterprise. I'm not an expert on continuous time state space modeling, but I tried out different versions, and my conclusion was that only single input, single output work correctly. I attach my trying-out script. As I see it, it is possible to use different parts of scipy.signal for multidimensional input or output, but the conversion code that relies on numpy polynomials cannot handle it. So part of the functionality could be rewritten to allow for different shapes (as you did with commenting out D in the test). Additionally, the filters in scipy signal cannot handle multiple signals, however the filters in nd.image can be used in an indirect way to have multi-dimensional filters. I looked at this more for the usage in Kalman filter, vector arma or vector ar, and without a multidimensional polynomial class and proper multi-dimensional filters, it is not possible to use scipy.signal for this. So, I started to write my own VARMA filter ( for discrete time), but without convenient conversion between the different representation as scipy.signal has for the univariate case For a beginning user and not knowing the jargon of the field, signal.lti is not very approachable. 
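For what it's worth, a minimal single-input, single-output round trip of the kind that does seem to be supported looks like this (a sketch with made-up coefficients, not taken from the attached script):

from scipy import signal

# SISO transfer function 1 / (s**2 + 2*s + 1)
num, den = [1.0], [1.0, 2.0, 1.0]

A, B, C, D = signal.tf2ss(num, den)      # to state-space form
num2, den2 = signal.ss2tf(A, B, C, D)    # and back again

sys = signal.lti(num, den)               # lti object from the same coefficients
t, y = signal.step(sys)                  # step response of the SISO system

Anything beyond this single-input, single-output case runs into the shape problems discussed above.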
A set of examples and tests would be useful to see what the functionality and the limitations is. If you try out different function in lti for multi-dimensional signals, it would be useful to have a list of functions that could be extended to the multi-dimensional case, and of functions for which this is not possible because of the underlying limitations. Since I'm not using scipy signal, I stopped looking into it. But, maybe signal.ltisys needs to be adopted by someone. Josef ----------- >>> x = ss2tf(A,B,C,D) >>> ss=signal.tf2ss(*x) Warning (from warnings module): File "C:\Josef\_progs\building\scipy\scipy-trunk-new-r5551\dist\scipy-0.8.0.dev5551.win32\Programs\Python25\Lib\site-packages\scipy\signal\filter_design.py", line 221 "results may be meaningless", BadCoefficients) BadCoefficients: Badly conditionned filter coefficients (numerator): the results may be meaningless Traceback (most recent call last): File "", line 1, in ss=signal.tf2ss(*x) File "C:\Josef\_progs\building\scipy\scipy-trunk-new-r5551\dist\scipy-0.8.0.dev5551.win32\Programs\Python25\Lib\site-packages\scipy\signal\ltisys.py", line 59, in tf2ss C = num[:,1:] - num[:,0] * den[1:] ValueError: shape mismatch: objects cannot be broadcast to a single shape >>> zpk=signal.tf2zpk(*x) Warning (from warnings module): File "C:\Josef\_progs\building\scipy\scipy-trunk-new-r5551\dist\scipy-0.8.0.dev5551.win32\Programs\Python25\Lib\site-packages\scipy\signal\filter_design.py", line 221 "results may be meaningless", BadCoefficients) BadCoefficients: Badly conditionned filter coefficients (numerator): the results may be meaningless Traceback (most recent call last): File "", line 1, in zpk=signal.tf2zpk(*x) File "C:\Josef\_progs\building\scipy\scipy-trunk-new-r5551\dist\scipy-0.8.0.dev5551.win32\Programs\Python25\Lib\site-packages\scipy\signal\filter_design.py", line 161, in tf2zpk z = roots(b) File "\Programs\Python25\Lib\site-packages\numpy\lib\polynomial.py", line 133, in roots raise ValueError,"Input must be a rank-1 array." ValueError: Input must be a rank-1 array. -------------- next part -------------- import numpy as np from scipy import signal dvec = np.array([1,2,3,4]) A1 = np.array([-dvec,[1,0,0,0],[0,1,0,0],[0,0,1,0]]) B1 = np.array([[1,0,0,0]]).T # wrong dimension, this requires D has one column B1 = np.eye(4) C1 = np.array([[1,2,3,4]]) D1 = np.zeros((1,4)) print signal.ss2tf(A1,B1,C1,D1) #same as http://en.wikipedia.org/wiki/State_space_(controls)#Canonical_realizations signal.ss2tf(*signal.tf2ss(*signal.ss2tf(A1,B1,C1,D1))) np.testing.assert_almost_equal(signal.ss2tf(*signal.tf2ss(*signal.ss2tf(A1,B1,C1,D1)))[0],signal.ss2tf(A1,B1,C1,D1)[0]) ''' dx_t = A x_t + B u_t y_t = C x_t + D u_t >>> dvec = np.array([1,2,3,4]) >>> A = np.array([-dvec,[1,0,0,0],[0,1,0,0],[0,0,1,0]]) >>> B = np.array([[1,0,0,0]]).T # wrong dimension, this requires D has one column >>> B = np.eye(4) >>> C = np.array([[1,2,3,4]]) >>> D = np.zeros((1,4)) >>> num, den = signal.ss2tf(A,B,C,D) >>> print num [[ 0. 1. 2. 3. 4.]] >>> print den [ 1. 1. 2. 3. 4.] 
>>> A1,B1,C1,D1 = signal.tf2ss(*signal.ss2tf(A,B,C,D)) >>> A1 array([[-1., -2., -3., -4.], [ 1., 0., 0., 0.], [ 0., 1., 0., 0.], [ 0., 0., 1., 0.]]) >>> B1 array([[ 1.], [ 0.], [ 0.], [ 0.]]) >>> C1 array([[ 1., 2., 3., 4.]]) >>> D1 array([ 0.]) ''' # can only have one noise variable u_t # ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ dvec = np.array([1,2,3,4]) A = np.array([-dvec,[1,0,0,0],[0,1,0,0],[0,0,1,0]]) B = np.array([[1,0,0,0]]).T # wrong dimension, this requires D has one column B = np.eye(4) B[2,1] = 1 C = np.array([[1,2,3,4]]) D = np.zeros((1,4)) print signal.ss2tf(A,B,C,D) # can only have one output variable y_t # ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ dvec = np.array([1,2,3,4]) A = np.array([-dvec,[1,0,0,0],[0,1,0,0],[0,0,1,0]]) B = np.array([[1,0,0,0]]).T # wrong dimension, this requires D has one column B = np.eye(4) B[2,1] = 1 C = np.array([[1,2,3,4],[1,0,0,0]]) D = np.zeros((2,4)) #print signal.ss2tf(A,B,C,D) #this causes ## type_test = A[:,0] + B[:,0] + C[0,:] + D ##ValueError: shape mismatch # From josef.pktd at gmail.com Wed Feb 18 13:26:44 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 18 Feb 2009 13:26:44 -0500 Subject: [SciPy-user] Fwd: signal.lti(A,B,C,D) with D=0 In-Reply-To: <1cd32cbb0902180825x47d70ab2y5a45ea25ff8cdd32@mail.gmail.com> References: <446ebe82-a298-4ab2-90a4-6289cdaff83e@f24g2000vbf.googlegroups.com> <1cd32cbb0902171309n3f6f7cbevb27f56d5356c67c5@mail.gmail.com> <1cd32cbb0902180825x47d70ab2y5a45ea25ff8cdd32@mail.gmail.com> Message-ID: <1cd32cbb0902181026qd409529g7d392c1db82ca9b2@mail.gmail.com> another thought: Obtaining multi-dimensional output can be done by stitching together several lti systems or transfer functions by looping over the output dimension (I did something similar for multi-dimensional filters with ndimage). However, multi-dimensional inputs cannot be handled this way. I didn't see any way to merge 2 independent input signals to one output signal. Josef From trottier+pylist at gmail.com Wed Feb 18 13:42:08 2009 From: trottier+pylist at gmail.com (Leo Trottier) Date: Wed, 18 Feb 2009 10:42:08 -0800 Subject: [SciPy-user] float96 displayed (incorrectly) as float64 In-Reply-To: References: Message-ID: Hi, I'd like to show off how much easier it is to work with multiple data types in numpy (as compared with matlab). It would be especially handy to show off float96 , etc. Unfortunately, this doesn't seem to work under Vista or Windows XP: ########################### Python 2.5.4 (r254:67916, Dec 23 2008, 15:10:54) [MSC v.1310 32 bit (Intel)] Type "copyright", "credits" or "license" for more information. IPython 0.9.1 -- An enhanced Interactive Python. ... IPython profile: xy # NOTE: numpy.__version__ == '1.2.1' # ... and: scipy.__version__ == '0.7.0' In [1]: a = array(1.0,dtype=float96) In [2]: print a # DISPLAYS INCORRECTLY 0.0 In [3]: print a.astype(float64) # DISPLAYS CORRECTLY -- DATA STILL THERE 1.0 In [4]: print (a*-1).astype(float64) # MULTIPLICATION APPEARS TO WORK PROPERLY -1.0 ############################# Does anyone know of a work-around? Leo -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From williamhpurcell at gmail.com Wed Feb 18 14:00:43 2009 From: williamhpurcell at gmail.com (William Purcell) Date: Wed, 18 Feb 2009 13:00:43 -0600 Subject: [SciPy-user] Fwd: signal.lti(A,B,C,D) with D=0 In-Reply-To: <1cd32cbb0902181026qd409529g7d392c1db82ca9b2@mail.gmail.com> References: <446ebe82-a298-4ab2-90a4-6289cdaff83e@f24g2000vbf.googlegroups.com> <1cd32cbb0902171309n3f6f7cbevb27f56d5356c67c5@mail.gmail.com> <1cd32cbb0902180825x47d70ab2y5a45ea25ff8cdd32@mail.gmail.com> <1cd32cbb0902181026qd409529g7d392c1db82ca9b2@mail.gmail.com> Message-ID: I was thinking the same thing. I am working on a thin wrapper over ltisys that tests the dimensions of the output and then loops if there is any stitching to be done. I was also thinking that I could loop over the inputs first and then, in a sub-routine, loop over the outputs to come up with a two-dimensional list of whatever I need. For example, if I am passing a MIMO system to signal.lti, each of the representation conversions would be in matrix/list form corresponding to each input/output relationship. Or lti could test whether it is a MIMO or even SIMO system and make a matrix/list of lti instances, each of which would have its own alternative representations through ss2zpk/ss2tf etc. (which might be the cleaner of the two alternatives). Do you think
Do you think > this would be a feature for ltisys (I don't think it would take much > time to implement), or do you think that it is hackish and ltisys > should stick to SISO systems? Are you looking at parallel SISO and SIMO systems, or did you find a way to have one output depend on two inputs? Personally, I don't like to change code without having tests, to verify that what I am doing doesn't break things and delivers what I want. Since, ltisys doesn't have tests, I wouldn't want to do much surgery on it, and I would prefer a separate wrapper, or subclassing or delegation with minimal changes to the existing code. Another idea for true MIMO system would be to extract what works for that case and write a separate MIMO package. I think using only the state space representation, it shouldn't be too difficult to get simulation and similar things working correctly. The definition and representation of transfer functions and zpk would be more difficult, and I don't know much about it in the multi-dimensional input case (My intuition is more in terms of time series analysis and I haven't looked at this very closely.) Later if everything works and is tested, merging of MIMO and SISO could be considered. These are my thoughts as a (half) innocent bystander. Josef > > On Wed, Feb 18, 2009 at 12:26 PM, wrote: >> another thought: >> >> Obtaining multi-dimensional output can be done by stitching together >> several lti systems or transfer functions by looping over the output >> dimension (I did something similar for multi-dimensional filters with >> ndimage). >> >> However, multi-dimensional inputs cannot be handled this way. I didn't >> see any way to merge 2 independent input signals to one output signal. >> >> Josef >> _______________________________________________ >> SciPy-user mailing list >> SciPy-user at scipy.org >> http://projects.scipy.org/mailman/listinfo/scipy-user >> > _______________________________________________ > SciPy-user mailing list > SciPy-user at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-user > From sturla at molden.no Wed Feb 18 18:37:32 2009 From: sturla at molden.no (Sturla Molden) Date: Thu, 19 Feb 2009 00:37:32 +0100 (CET) Subject: [SciPy-user] Parallel processing with Python Message-ID: I know this is not directly related to SciPy, but it may be of interest to some subscribers to this list. About a year ago, I posted a scheme to comp.lang.python describing how to use isolated interpreters and threads to circumvent the GIL on SMPs: http://groups.google.no/group/comp.lang.python/msg/0351c532aad97c5e?hl=no&dmode=source One interpreter per thread is how tcl work. Erlang also uses isolated threads that only communicate through messages (as opposed to shared objects). "Appdomains" are also available in the .NET framework, and in Java as "Java isolates". They are potentially very useful as multicore CPUs become abundant. They allow one process to run one independent Python interpreter on each available CPU core. In Python, "appdomains" can be created by embedding the Python interpreter multiple times in a process, and associating each interpreter with a thread. For this to work, we have to make multiple copies of the Python DLL and rename them (e.g. Python25-0.dll, Python25-1.dll, Python25-2.dll, etc.) Otherwise the dynamic loader will just return a handle to the already imported DLL. As DLLs can be accessed with ctypes, we don't even have to program a line of C to do this. 
We can start up a Python interpreter and use ctypes to embed more interpreters into it, associating each interpreter with its own thread. ctypes takes care of releasing the GIL in the parent interpreter, so calls to these sub-interpreters become asynchronous. I had a mock-up of this scheme working. Martin Löwis replied he doubted this would work, and pointed out that Python extension libraries (.pyd files) are DLLs as well. They would only be imported once, and their global states would thus crash, producing havoc: http://groups.google.no/group/comp.lang.python/msg/0a7a22910c1d5bf5?hl=no&dmode=source He was right, of course, but also wrong. In fact I had already proven him wrong by importing a DLL multiple times. If it can be done for Python25.dll, it can be done for any other DLL as well - including .pyd files - in exactly the same way. Thus what remains is to change Python's dynamic loader to use the same "copy and import" scheme. This can either be done by changing Python's C code, or (at least on Windows) by redirecting the LoadLibrary API call from kernel32.dll to a custom DLL. Both are quite easy and require minimal C coding. Thus it is quite easy to make multiple, independent Python interpreters live isolated lives in the same process. As opposed to multiple processes, they can communicate without involving any IPC. It would also be possible to design proxy objects allowing one interpreter access to an object in another. Immutable objects such as strings would be particularly easy to share. This very simple scheme should allow parallel processing with Python similar to how it's done in Erlang, without the GIL getting in our way. At least on Windows this can be done without touching the CPython source at all. I am not sure about Linux though. It may be necessary to patch the CPython source to make it work there. Sturla Molden From cournape at gmail.com Wed Feb 18 20:31:29 2009 From: cournape at gmail.com (David Cournapeau) Date: Thu, 19 Feb 2009 10:31:29 +0900 Subject: [SciPy-user] float96 displayed (incorrectly) as float64 In-Reply-To: References: Message-ID: <5b8d13220902181731t20af48c6pb58f10a8802c666b@mail.gmail.com> On Thu, Feb 19, 2009 at 3:42 AM, Leo Trottier wrote: > Hi, > > I'd like to show off how much easier it is to work with multiple data types > in numpy (as compared with matlab). It would be especially handy to show > off float96, etc. > > Unfortunately, this doesn't seem to work under Vista or Windows XP Windows does not support long double - long double is exactly the same as double on this platform. The printing bug has been fixed and will be in the upcoming numpy 1.3. David From dwf at cs.toronto.edu Wed Feb 18 21:06:54 2009 From: dwf at cs.toronto.edu (David Warde-Farley) Date: Wed, 18 Feb 2009 21:06:54 -0500 Subject: [SciPy-user] "clustergrams"/hierarchical clustering heat maps In-Reply-To: <91b4b1ab0902142218v44ceccd8r10e51eea56f03c57@mail.gmail.com> References: <8E0AFD62-64B7-435F-B80F-298C702BF771@cs.toronto.edu> <91b4b1ab0902142218v44ceccd8r10e51eea56f03c57@mail.gmail.com> Message-ID: <244098B9-8442-443F-BFF3-CC0B738402ED@cs.toronto.edu> Hi Damian, On 15-Feb-09, at 1:18 AM, Damian Eads wrote: > Hi David, > > Sorry. I did not see your message until now. Several people have > already inquired about heatmaps. I've been meaning to eventually > implement support for them but since I don't work with microarray data > and I'm in the midst of trying to get a paper out, it has fallen onto > the back burner. Not a problem, I know how it is.
> As a first step, I'd need to implement support for > missing attributes since this seems to be common with microarray data. It can be, though as far as I know, a common strategy with microarrays is to just impute missing values in one way or another. > As far as I know, a heatmap illustrates clustering along two axes: > observation vectors and attributes. For example, suppose we're > clustering patients by their genes. There is one observation vector > for each patient, and one vector element per gene. Clustering > observation vectors is the typical case, which is used to identify > groups of similar patients. Clustering attributes (across observation > vectors) is less typical but would be used to identifying groups of > similar genes. > > The heatmap just illustrates the vectors, the color is the intensity. > When clustering along a single dimension (observation vectors), no > sorting is necessary, and a dendrogram is drawn along the vertical > axis. The i'th row is just the observation vector corresponding to the > i'th leaf node. No sorting along the attribute dimension is needed. > Along two dimensions, there is a dendrogram along the horizontal axis. > Now the attributes must be reordered so that the j'th column > corresponds to the j'th leaf node. > > This is my first time describing heat maps so I apologize if this > description is terse. Does it make some sense? That corresponds with my understanding as well. Though I'm not certain that 'no sorting is needed' if we're just clustering along one dimension. Is what you mean is that the order is completely specified by the dendrogram? Because that would make sense. As far as I know there is also some heuristic for laying out both axes (since there are arbitrary ordering choices to be made, e.g. which branch to put on the left and which on the right) which makes them easier to see patterns in, my advisor name-dropped the name of the algorithm once but I'd have to ask him again. > As far as how someone implements this, it seems like it'd be pretty > simple. There is a helper function called _plot_dendrogram that takes > in a collection of raw dendrogram lines to be rendered on the plot. > First, plot the heatmap (sorting the attributes so that the columns > correspond to the ids of the leaf nodes); this can be done with > imshow. Second, for the first dendrogram, call _plot_dendrogram but > provide it with a shifting parameters so that the dendrogram lines are > rendered to the left of the image. Third, call _plot_dendrogram again, > provide a shifting parameter, but instead shift the lines downward for > the attribute clustering dendrogram. Sounds as though the "completely specified" bit above is what you meant. And it sounds as though the existing interface should be sufficient to get something going. > I want to get to this soon but no promises. Sorry. If I don't beat you to it. :) David From michael.abshoff at googlemail.com Thu Feb 19 00:35:37 2009 From: michael.abshoff at googlemail.com (Michael Abshoff) Date: Wed, 18 Feb 2009 21:35:37 -0800 Subject: [SciPy-user] float96 displayed (incorrectly) as float64 In-Reply-To: <5b8d13220902181731t20af48c6pb58f10a8802c666b@mail.gmail.com> References: <5b8d13220902181731t20af48c6pb58f10a8802c666b@mail.gmail.com> Message-ID: <499CEFA9.60608@gmail.com> David Cournapeau wrote: > On Thu, Feb 19, 2009 at 3:42 AM, Leo Trottier wrote: >> Hi, Hi, >> I'd like to show off how much easier it is to work with multiple data types >> in numpy (as compared with matlab). 
It would be especially handy to show >> off float96 , etc. >> >> Unfortunately, this doesn't seem to work under Vista or Windows XP > > Windows does not support long double - long double is exactly the same > as double on this platform. The C99 standard does not guarantee that long double is any "longer" than double. Nearly all systems, but Windows do have a long double that is either 96 or 128 bits. But it is wrong to assume that this is always the case and it is not a violation of the C99 standard. > The printing bug have been fixed and will > be in the upcoming numpy 1.3. > > David Cheers, Michael > _______________________________________________ > SciPy-user mailing list > SciPy-user at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-user > From david at ar.media.kyoto-u.ac.jp Thu Feb 19 00:33:37 2009 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Thu, 19 Feb 2009 14:33:37 +0900 Subject: [SciPy-user] float96 displayed (incorrectly) as float64 In-Reply-To: <499CEFA9.60608@gmail.com> References: <5b8d13220902181731t20af48c6pb58f10a8802c666b@mail.gmail.com> <499CEFA9.60608@gmail.com> Message-ID: <499CEF31.2000809@ar.media.kyoto-u.ac.jp> Michael Abshoff wrote: > David Cournapeau wrote: > >> On Thu, Feb 19, 2009 at 3:42 AM, Leo Trottier wrote: >> >>> Hi, >>> > > Hi, > > >>> I'd like to show off how much easier it is to work with multiple data types >>> in numpy (as compared with matlab). It would be especially handy to show >>> off float96 , etc. >>> >>> Unfortunately, this doesn't seem to work under Vista or Windows XP >>> >> Windows does not support long double - long double is exactly the same >> as double on this platform. >> > > The C99 standard does not guarantee that long double is any "longer" > than double. Nearly all systems, but Windows do have a long double that > is either 96 or 128 bits. But it is wrong to assume that this is always > the case and it is not a violation of the C99 standard. > I kept things simple, but you're right that the real problem is more complicated. For once, long double being bigger than double is not a OS problem, but a compiler + CPU problem. On windows, it is made complicated by the fact that that sizeof(long double) > sizeof(double) for gcc, even on windows, but that windows C runtime does not support this You can check that the problem is printing, not computation (example untested): import numpy as np a = np.float96(1.) print a # bogus, 0. b = 2 * a print np.double(b) # print 2. Pauli and me have spent some time to fix various formatting issues, and I also added some support to make sure long double is converted to double before any printing on windows, cheers, David From strawman at astraw.com Thu Feb 19 01:45:10 2009 From: strawman at astraw.com (Andrew Straw) Date: Wed, 18 Feb 2009 22:45:10 -0800 Subject: [SciPy-user] Parallel processing with Python In-Reply-To: References: Message-ID: <499CFFF6.9050901@astraw.com> Hi Sturla, I think this is a very interesting idea. I once ran into weird and mysterious issues with dlopen() when trying to do similar on linux but I hadn't thought of the rename-the-shared-library trick. If we invented some kind of syntax for software transactional memory (STM), we might really be playing with fire. I might give this a try on a couple problems I'm working on (which generally have more to do with making complicated stuff happen quickly -- with low latency -- than crunching tons of data). Anyhow, please keep us informed of further progress! 
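For anyone who just wants to share a numpy array between processes today, the standard-library multiprocessing route (not the multiple-interpreter scheme discussed in this thread) looks roughly like this; the function and variable names are illustrative only:

import multiprocessing as mp
import numpy as np

def fill(shared_arr, start, stop):
    # view the shared buffer as an ndarray -- no copy is made
    a = np.frombuffer(shared_arr.get_obj())
    a[start:stop] = np.arange(start, stop)

if __name__ == '__main__':
    n = 1000
    shared_arr = mp.Array('d', n)   # n doubles in shared memory, zero-initialised
    p1 = mp.Process(target=fill, args=(shared_arr, 0, n // 2))
    p2 = mp.Process(target=fill, args=(shared_arr, n // 2, n))
    p1.start(); p2.start()
    p1.join(); p2.join()
    result = np.frombuffer(shared_arr.get_obj())
    print result.sum()              # 499500.0

This shares the array data but still pays the cost of separate processes for everything else, which is precisely what the isolated-interpreter idea tries to avoid.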
-Andrew Sturla Molden wrote: > I know this is not directly related to SciPy, but it may be of interest to > some subscribers to this list. > > About a year ago, I posted a scheme to comp.lang.python describing how to > use isolated interpreters and threads to circumvent the GIL on SMPs: > > http://groups.google.no/group/comp.lang.python/msg/0351c532aad97c5e?hl=no&dmode=source > > One interpreter per thread is how tcl work. Erlang also uses isolated > threads that only communicate through messages (as opposed to shared > objects). "Appdomains" are also available in the .NET framework, and in > Java as "Java isolates". They are potentially very useful as multicore > CPUs become abundant. They allow one process to run one independent Python > interpreter on each available CPU core. > > In Python, "appdomains" can be created by embedding the Python interpreter > multiple times in a process, and associating each interpreter with a > thread. For this to work, we have to make multiple copies of the Python > DLL and rename them (e.g. Python25-0.dll, Python25-1.dll, > Python25-2.dll, etc.) Otherwise the dynamic loader will just return a > handle to the already imported DLL. As DLLs can be accessed with ctypes, > we don't even have to program a line of C to do this. we can start up a > Python interpreter and use ctypes to embed more interpreters > into it, associating each interpreter with its own thread. ctypes takes > care of releasing the GIL in the parent interpreter, so calls to these > sub-interpreters become asynchronous. I had a mock-up of this scheme > working. Martin L?wis replied he doubted this would work, and pointed out > that Python extension libraries (.pyd files) are DLLs as well. They would > only be imported once, and their global states would thus crash, thus > producing havoc: > > http://groups.google.no/group/comp.lang.python/msg/0a7a22910c1d5bf5?hl=no&dmode=source > > He was right, of course, but also wrong. In fact I had already proven him > wrong by importing a DLL multiple times. If it can be done for > Python25.dll, it can be done for any other DLL as well - including .pyd > files - in exactly the same way. Thus what remains is to change Python's > dynamic loader to use the same "copy and import" scheme. This can either > be done by changing Python's C code, or (at least on Windows) to redirect > the LoadLibrary API call from kernel32.dll to a custom DLL. Both a quite > easy and requires minimal C coding. > > Thus it is quite easy to make multiple, independent Python interpreters > live isolated lives in the same process. As opposed to multiple processes, > they can communicate without involving any IPC. It would also be possible > to design proxy objects allowing one interpreter access to an object in > another. Immutable object such as strings would be particularly easy to > share. > > This very simple scheme should allow parallel processing with Python > similar to how it's done in Erlang, without the GIL getting in our way. At > least on Windows this can be done without touching the CPython source at > all. I am not sure about Linux though. I may be necessary to patch the > CPython source to make it work there. 
> > > Sturla Molden > > _______________________________________________ > SciPy-user mailing list > SciPy-user at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-user > From dwf at cs.toronto.edu Thu Feb 19 03:15:55 2009 From: dwf at cs.toronto.edu (David Warde-Farley) Date: Thu, 19 Feb 2009 03:15:55 -0500 Subject: [SciPy-user] "clustergrams"/hierarchical clustering heat maps In-Reply-To: References: <8E0AFD62-64B7-435F-B80F-298C702BF771@cs.toronto.edu> Message-ID: <5A54FE63-4ED5-4DC7-8D2A-743B8BDD734D@cs.toronto.edu> On 3-Feb-09, at 9:43 AM, Zachary Pincus wrote: > Cluster 3 is a bit annoying to one used to scripting analyses (lots of > GUI button-pressing), but there's also a python library. Or you could > just scrutinize the output format (it barfs out a few text files) and > use your own clustering tools. TreeView then accepts these text files > and lets you manipulate the heatmap / dendrograms (e.g. flipping nodes > to get visually better results). You can then export to PS or other > formats. (The PS output is pretty clean, so you can edit in > Illustrator or whatnot easily.) Thanks for the tip. I've since downloaded and played with the both of them, I think I will try and grok the output format of Cluster 3 and see if I can't write a function to dump linkage trees from scipy.cluster.hierarchy to this (one of those?) formats. Cheers, David From peter.cimermancic at gmail.com Thu Feb 19 04:25:38 2009 From: peter.cimermancic at gmail.com (=?UTF-8?Q?Peter_Cimerman=C4=8Di=C4=8D?=) Date: Thu, 19 Feb 2009 10:25:38 +0100 Subject: [SciPy-user] integrate.odeint problem In-Reply-To: <200902161547.52998.jr@sun.ac.za> References: <18d53ca60902160416y6b6b6527nc6406a1def5e1644@mail.gmail.com> <200902161547.52998.jr@sun.ac.za> Message-ID: <18d53ca60902190125t1df3b0beya7715adbd8d75be9@mail.gmail.com> Thank you, Johann. Now it works. However, results, compared to those gotten from Jarnac, are quite different (initial slope or steady state value differ two times). Do you know, is this "normal" (since I used modified Jarnac script directly I can exclude transcriptional mistakes)? I will try with PySCes, as well. Regards, Peter 2009/2/16 Johann Rohwer > Hi Peter > > Try increasing the parameter mxstep (default is 500) in the odeint > function call to a higher value (such as 1000 or 3000). We are using > SciPy's odeint in our Python based systems biology simulation > software PySCeS (http://pysces.sf.net) and have found quite regularly > for biological models that an mxstep value of 500 is insufficient. > Incidentally, when PySCeS encounters this situation, it automatically > increments the mxstep value and re-simulates.... > > In general, we have been able to get good agreement with PySCeS and > Jarnac simulating the same model, so I'd be surprised if this does > not fix it. > > Regards > Johann > > On Monday, 16 February 2009, Peter Cimerman?i? wrote: > > Hi! > > > > Using 60 differential equations, I am trying to simulate some > > biological process. After running the script, I've got next > > message: > > > > lsoda-- at current t (=r1), mxstep (=i1) steps > > taken on this call before reaching tout > > In above message, I1 = 500 > > In above message, R1 = 0.857...E+01 > > Excess work done on this call (Perhaps wrong Dfun type). > > Run with full_output = 1 to get quantitative information. > > > > After running with full_output = 1, additional information was: > > KeyError: 0. 
> > > > The same equations and parameters were run in Jarnac (simulation > > software) and I've got correct results, assuming equations and > > parameters are right. What else could go wrong to produce above > > error message? > > > > Thank you in advance, > > > > Peter > > _______________________________________________ > SciPy-user mailing list > SciPy-user at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-user > -------------- next part -------------- An HTML attachment was scrubbed... URL: From Dharhas.Pothina at twdb.state.tx.us Thu Feb 19 12:09:10 2009 From: Dharhas.Pothina at twdb.state.tx.us (Dharhas Pothina) Date: Thu, 19 Feb 2009 11:09:10 -0600 Subject: [SciPy-user] Remove duplicate points from xyz data. Message-ID: <499D3DD60200009B0001B007@GWWEB.twdb.state.tx.us> Hi, I've been trying to use the cookbook instructions for plotting irregular data and from what I can tell the delaunay triangulation is failing because my dataset has duplicate points. I tried to follow PierreGM's technique for removing duplicate points from an archived discussion: > For example: > z=np.array([[1,1],[1,2],[1,2],[1,3]]) > zr=z.view([('a',int),('b',int)]) > zs = numpy.unique(zr).view((int,2)) And I can use this to remove duplicate x,y points but I'm unclear how to remove the corresponding z values. To be clear. I have three arrays x,y,z . I need to remove rows that have duplicate (x,y) coordinates. These rows may or may not have the same z values in them. - dharhas ps. Once I get this working is there a way to update the cookbook page to indicate that duplicate points need to be removed before it will work. From cmac at mit.edu Thu Feb 19 16:03:21 2009 From: cmac at mit.edu (Christopher MacMinn) Date: Thu, 19 Feb 2009 16:03:21 -0500 Subject: [SciPy-user] SciPy-user Digest, Vol 65, Issue 50 In-Reply-To: References: Message-ID: <95da30590902191303r6354f758ib8fcee0722aa86be@mail.gmail.com> > > On Wed, Jan 21, 2009 at 1:22 PM, Rob Clewley > wrote: > >> odeint is a wrapper for the LSODA solver in the Fortran ODEPACK > >> library. This library also includes LSODAR, which is LSODA with > >> root-finding (aka event detection). Does anyone want to take a stab at > >> wrapping LSODAR? The wrapping of LSODA with odeint provides a good > >> starting point, and an ODE solver with root-finding would be a great > >> addition to SciPy. > >> > >> Warren > > > > Ryan Gutenkunst already wrapped it while working on the SloppyCell > package. See > > > > http://osdir.com/ml/python.scientific.devel/2005-07/msg00028.html > > > > with a link there to the code. I've never tried it myself or even > > looked at it, FYI :) > > -Rob > > > > PS There's some mention of Ryan's lsodar.pyf in the trunk of scipy SVN, as > per > > > projects.scipy.org/scipy/scipy/browser/trunk/scipy/integrate/setup.py?rev=4763 > > but I don't know if it's still there. If it is, is the associated pyd > now shipped with Scipy? I haven't installed a new version for months. > -Rob Sorry for letting this drop for... err... awhile. I don't speak "developer", but I take this to mean that this functionality is available in a few other python packages (PyDSTool, CVODE via PySundials), and that there may or may not be pieces of lsodar already in scipy. As a MATLAB to python/numpy/scipy convert who really still has one foot on each log, as it were, having this 'root finding' functionality readily available in python/scipy/numpy would be a big plus. Consider this one vote for finishing the wrapping of lsodar. 
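Until LSODAR is wrapped, a crude stopgap is to integrate on a fine grid and locate the event afterwards. A sketch (the oscillator right-hand side and the event condition are made up for illustration):

import numpy as np
from scipy.integrate import odeint
from scipy.interpolate import interp1d
from scipy.optimize import brentq

def rhs(y, t):
    return [y[1], -y[0]]             # simple harmonic oscillator

t = np.linspace(0, 10, 1000)
sol = odeint(rhs, [1.0, 0.0], t)

# event: first zero crossing of y[0]; bracket it on the grid, then refine
y0 = sol[:, 0]
i = np.where(np.diff(np.sign(y0)) != 0)[0][0]
g = interp1d(t, y0, kind='cubic')
t_event = brentq(g, t[i], t[i + 1])
print t_event                        # should be close to pi/2

This is no substitute for root finding inside the integrator (the solver can step right over a sharp event), but it covers many everyday cases.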
Of course, I should not say such things without volunteering to help... I would be happy to contribute, but I don't even know where to start. Best, Chris -------------- next part -------------- An HTML attachment was scrubbed... URL: From dwf at cs.toronto.edu Thu Feb 19 17:27:26 2009 From: dwf at cs.toronto.edu (David Warde-Farley) Date: Thu, 19 Feb 2009 17:27:26 -0500 Subject: [SciPy-user] SciPy-user Digest, Vol 65, Issue 50 In-Reply-To: <95da30590902191303r6354f758ib8fcee0722aa86be@mail.gmail.com> References: <95da30590902191303r6354f758ib8fcee0722aa86be@mail.gmail.com> Message-ID: On 19-Feb-09, at 4:03 PM, Christopher MacMinn wrote: > > As a MATLAB to python/numpy/scipy convert who really still has one > foot on each log, as it were, having this 'root finding' > functionality readily available in python/scipy/numpy would be a big > plus. Consider this one vote for finishing the wrapping of lsodar. > Of course, I should not say such things without volunteering to > help... I would be happy to contribute, but I don't even know where > to start. Documentation and testing are good places. But that might necessitate the wrapping first being done ;) David From c.j.lee at tnw.utwente.nl Fri Feb 20 01:06:30 2009 From: c.j.lee at tnw.utwente.nl (Chris Lee) Date: Fri, 20 Feb 2009 07:06:30 +0100 Subject: [SciPy-user] Parallel processing with Python In-Reply-To: <499CFFF6.9050901@astraw.com> References: <499CFFF6.9050901@astraw.com> Message-ID: <6A308240-C031-4666-A36F-AEAF49FD8621@tnw.utwente.nl> You might also want to visit the folk at parallelpython who are doing something superficially similar. They pickle up the appropriate functions and data. The pp server then starts up an sh session that calls python with the pickled functions and data. The advantage of this is that it works across multiple platforms and computers so you can really scale your computational resources. The draw back is that the programmer must determine how best to make the code parallel and then call the pp server themselves. It is also limited in that data sharing is nigh on impossible. Cheers Chris On Feb 19, 2009, at 7:45 AM, Andrew Straw wrote: > Hi Sturla, I think this is a very interesting idea. I once ran into > weird and mysterious issues with dlopen() when trying to do similar on > linux but I hadn't thought of the rename-the-shared-library trick. > If we > invented some kind of syntax for software transactional memory > (STM), we > might really be playing with fire. I might give this a try on a couple > problems I'm working on (which generally have more to do with making > complicated stuff happen quickly -- with low latency -- than crunching > tons of data). Anyhow, please keep us informed of further progress! > > -Andrew > > Sturla Molden wrote: >> I know this is not directly related to SciPy, but it may be of >> interest to >> some subscribers to this list. >> >> About a year ago, I posted a scheme to comp.lang.python describing >> how to >> use isolated interpreters and threads to circumvent the GIL on SMPs: >> >> http://groups.google.no/group/comp.lang.python/msg/0351c532aad97c5e?hl=no&dmode=source >> >> One interpreter per thread is how tcl work. Erlang also uses isolated >> threads that only communicate through messages (as opposed to shared >> objects). "Appdomains" are also available in the .NET framework, >> and in >> Java as "Java isolates". They are potentially very useful as >> multicore >> CPUs become abundant. 
They allow one process to run one independent >> Python >> interpreter on each available CPU core. >> >> In Python, "appdomains" can be created by embedding the Python >> interpreter >> multiple times in a process, and associating each interpreter with a >> thread. For this to work, we have to make multiple copies of the >> Python >> DLL and rename them (e.g. Python25-0.dll, Python25-1.dll, >> Python25-2.dll, etc.) Otherwise the dynamic loader will just return a >> handle to the already imported DLL. As DLLs can be accessed with >> ctypes, >> we don't even have to program a line of C to do this. we can start >> up a >> Python interpreter and use ctypes to embed more interpreters >> into it, associating each interpreter with its own thread. ctypes >> takes >> care of releasing the GIL in the parent interpreter, so calls to >> these >> sub-interpreters become asynchronous. I had a mock-up of this scheme >> working. Martin L?wis replied he doubted this would work, and >> pointed out >> that Python extension libraries (.pyd files) are DLLs as well. They >> would >> only be imported once, and their global states would thus crash, thus >> producing havoc: >> >> http://groups.google.no/group/comp.lang.python/msg/0a7a22910c1d5bf5?hl=no&dmode=source >> >> He was right, of course, but also wrong. In fact I had already >> proven him >> wrong by importing a DLL multiple times. If it can be done for >> Python25.dll, it can be done for any other DLL as well - >> including .pyd >> files - in exactly the same way. Thus what remains is to change >> Python's >> dynamic loader to use the same "copy and import" scheme. This can >> either >> be done by changing Python's C code, or (at least on Windows) to >> redirect >> the LoadLibrary API call from kernel32.dll to a custom DLL. Both a >> quite >> easy and requires minimal C coding. >> >> Thus it is quite easy to make multiple, independent Python >> interpreters >> live isolated lives in the same process. As opposed to multiple >> processes, >> they can communicate without involving any IPC. It would also be >> possible >> to design proxy objects allowing one interpreter access to an >> object in >> another. Immutable object such as strings would be particularly >> easy to >> share. >> >> This very simple scheme should allow parallel processing with Python >> similar to how it's done in Erlang, without the GIL getting in our >> way. At >> least on Windows this can be done without touching the CPython >> source at >> all. I am not sure about Linux though. I may be necessary to patch >> the >> CPython source to make it work there. 
>> >> >> Sturla Molden >> >> _______________________________________________ >> SciPy-user mailing list >> SciPy-user at scipy.org >> http://projects.scipy.org/mailman/listinfo/scipy-user >> > > _______________________________________________ > SciPy-user mailing list > SciPy-user at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-user *************************************************** Chris Lee Laser Physics and Nonlinear Optics Group MESA+ Research Institute for Nanotechnology University of Twente Phone: ++31 (0)53 489 3968 fax: ++31 (0)53 489 1102 *************************************************** From python-ml at nn7.de Fri Feb 20 04:02:22 2009 From: python-ml at nn7.de (Soeren Sonnenburg) Date: Fri, 20 Feb 2009 10:02:22 +0100 Subject: [SciPy-user] sparse matrices again In-Reply-To: References: <1234356072.5642.16.camel@localhost> Message-ID: <1235120542.4216.15.camel@localhost> On Wed, 2009-02-11 at 09:40 -0500, Nathan Bell wrote: > On Wed, Feb 11, 2009 at 7:41 AM, Soeren Sonnenburg wrote: > > > > is it somehow possible to interface to the C API of scipy's spars > > matrices? I know numpy does not have sparse matrix support but scipy > > does (at least it can be used from the python side). > > > > If it is not too unstable then I would invest some time to get some swig > > typemaps to connect to it. > > > > The interface is not guaranteed to be stable, but you can access the > C++ functions that implement much of scipy.sparse through > scipy.sparse.sparsetools. > > What do you want to do exactly? Just a pointer to the standard ccs format (similar to what I have in octave) to be able to exchange data with shogun (www.shogun-toolbox.org). Soeren From dannoritzer at web.de Fri Feb 20 08:10:01 2009 From: dannoritzer at web.de (=?ISO-8859-1?Q?G=FCnter_Dannoritzer?=) Date: Fri, 20 Feb 2009 14:10:01 +0100 Subject: [SciPy-user] remezord() ticket #475 Message-ID: <499EABA9.9000001@web.de> Hi, I am trying to implement a FIR filter with SciPy and for the design would first like to estimate the order I will need. I noticed that there is no remezord() function like in Matlab, but it has been submitted as patch with ticket #475. However, that function is not scheduled to be included in any of the next releases. The ticket history shows that it was scheduled for release 0.6, then 0.7, and now unspecified. Unfortunately the history does not show why this has been delayed so much. Anyone knows why the function will not be included soon? For the time being I will just use the patch as an extra module, just would be good to have that function finally in the standard package. Thanks for the help. Cheers, Guenter From stefan at sun.ac.za Fri Feb 20 09:05:48 2009 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Fri, 20 Feb 2009 16:05:48 +0200 Subject: [SciPy-user] remezord() ticket #475 In-Reply-To: <499EABA9.9000001@web.de> References: <499EABA9.9000001@web.de> Message-ID: <9457e7c80902200605u4ad666bcj5abc6cd0f8334456@mail.gmail.com> Hi G?nter 2009/2/20 G?nter Dannoritzer : > I am trying to implement a FIR filter with SciPy and for the design > would first like to estimate the order I will need. I noticed that there > is no remezord() function like in Matlab, but it has been submitted as > patch with ticket #475. However, that function is not scheduled to be > included in any of the next releases. Would you like to help? 
We need to reformat the docstring according to the NumPy standard, and add tests to make sure that all those new functions do what they're supposed to. If you can provide the tests, that would help a great deal! Kind regards St?fan From scott.sinclair.za at gmail.com Fri Feb 20 09:50:15 2009 From: scott.sinclair.za at gmail.com (Scott Sinclair) Date: Fri, 20 Feb 2009 16:50:15 +0200 Subject: [SciPy-user] remezord() ticket #475 In-Reply-To: <9457e7c80902200605u4ad666bcj5abc6cd0f8334456@mail.gmail.com> References: <499EABA9.9000001@web.de> <9457e7c80902200605u4ad666bcj5abc6cd0f8334456@mail.gmail.com> Message-ID: <6a17e9ee0902200650r1b92b81eu79793f9e4aaae4c3@mail.gmail.com> > 2009/2/20 St?fan van der Walt : > Hi G?nter > > 2009/2/20 G?nter Dannoritzer : >> I am trying to implement a FIR filter with SciPy and for the design >> would first like to estimate the order I will need. I noticed that there >> is no remezord() function like in Matlab, but it has been submitted as >> patch with ticket #475. However, that function is not scheduled to be >> included in any of the next releases. > > Would you like to help? We need to reformat the docstring according > to the NumPy standard, and add tests to make sure that all those new > functions do what they're supposed to. If you can provide the tests, > that would help a great deal! There are some guidelines on writing tests here: http://projects.scipy.org/scipy/numpy/wiki/TestingGuidelines If you don't have nose installed, you could make tests that run with the standard Python module unittest (http://docs.python.org/library/unittest.html). Nose will pick these up automatically and has many extra features that make it more attractive than unittest. Cheers, Scott From wnbell at gmail.com Fri Feb 20 15:05:36 2009 From: wnbell at gmail.com (Nathan Bell) Date: Fri, 20 Feb 2009 15:05:36 -0500 Subject: [SciPy-user] sparse matrices again In-Reply-To: <1235120542.4216.15.camel@localhost> References: <1234356072.5642.16.camel@localhost> <1235120542.4216.15.camel@localhost> Message-ID: On Fri, Feb 20, 2009 at 4:02 AM, Soeren Sonnenburg wrote: >> >> What do you want to do exactly? > > Just a pointer to the standard ccs format (similar to what I have in > octave) to be able to exchange data with shogun > (www.shogun-toolbox.org). > You can access those arrays in Python: A = csc_matrix( ... ) A.indptr # pointer array A.indices # indices array A.data # nonzero values array -- Nathan Bell wnbell at gmail.com http://graphics.cs.uiuc.edu/~wnbell/ From josef.pktd at gmail.com Fri Feb 20 17:09:04 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Fri, 20 Feb 2009 17:09:04 -0500 Subject: [SciPy-user] problem with spatial.kdtree.sparse_distance_matrix Message-ID: <1cd32cbb0902201409g4e7c1fe3i51e902af89146718@mail.gmail.com> I would like to get the distance_matrix of all point in a 2d array, but it looks like kdtree cannot create a sparse distance matrix with itself. Is this intentional, a bug, or am I doing something wrong? Using a small distortion in the data works. I followed the example in the testfile (BTW: in class test_sparse_distance_matrix, M is often empty in the examples I tried, with given r=0.3). 
Josef >>> from scipy import spatial as ssp >>> r = 1 >>> xs2 = np.random.randn(4,50) >>> T1 = ssp.KDTree(xs2,leafsize=2) >>> M = T1.sparse_distance_matrix(T1, r) Traceback (most recent call last): File "", line 1, in M = T1.sparse_distance_matrix(T1, r) File "C:\Josef\_progs\building\scipy\scipy-trunk-new-r5551\dist\scipy-0.8.0.dev5551.win32\Programs\Python25\Lib\site-packages\scipy\spatial\kdtree.py", line 628, in sparse_distance_matrix other.tree, Rectangle(other.maxes, other.mins)) File "C:\Josef\_progs\building\scipy\scipy-trunk-new-r5551\dist\scipy-0.8.0.dev5551.win32\Programs\Python25\Lib\site-packages\scipy\spatial\kdtree.py", line 623, in traverse traverse(node1.less,less1,node2.less,less2) File "C:\Josef\_progs\building\scipy\scipy-trunk-new-r5551\dist\scipy-0.8.0.dev5551.win32\Programs\Python25\Lib\site-packages\scipy\spatial\kdtree.py", line 623, in traverse traverse(node1.less,less1,node2.less,less2) File "C:\Josef\_progs\building\scipy\scipy-trunk-new-r5551\dist\scipy-0.8.0.dev5551.win32\Programs\Python25\Lib\site-packages\scipy\spatial\kdtree.py", line 611, in traverse result[i,j] = d File "\Programs\Python25\Lib\site-packages\scipy\sparse\dok.py", line 222, in __setitem__ del self[(i,j)] KeyError: (0, 0) >>> T1b = ssp.KDTree(xs2.copy(),leafsize=2) >>> M = T1.sparse_distance_matrix(T1b, r) Traceback (most recent call last): File "", line 1, in M = T1.sparse_distance_matrix(T1b, r) File "C:\Josef\_progs\building\scipy\scipy-trunk-new-r5551\dist\scipy-0.8.0.dev5551.win32\Programs\Python25\Lib\site-packages\scipy\spatial\kdtree.py", line 628, in sparse_distance_matrix other.tree, Rectangle(other.maxes, other.mins)) File "C:\Josef\_progs\building\scipy\scipy-trunk-new-r5551\dist\scipy-0.8.0.dev5551.win32\Programs\Python25\Lib\site-packages\scipy\spatial\kdtree.py", line 623, in traverse traverse(node1.less,less1,node2.less,less2) File "C:\Josef\_progs\building\scipy\scipy-trunk-new-r5551\dist\scipy-0.8.0.dev5551.win32\Programs\Python25\Lib\site-packages\scipy\spatial\kdtree.py", line 623, in traverse traverse(node1.less,less1,node2.less,less2) File "C:\Josef\_progs\building\scipy\scipy-trunk-new-r5551\dist\scipy-0.8.0.dev5551.win32\Programs\Python25\Lib\site-packages\scipy\spatial\kdtree.py", line 611, in traverse result[i,j] = d File "\Programs\Python25\Lib\site-packages\scipy\sparse\dok.py", line 222, in __setitem__ del self[(i,j)] KeyError: (0, 0) using a small distortion works: >>> T1b = ssp.KDTree(xs2.copy()+1e-8,leafsize=2) >>> M = T1.sparse_distance_matrix(T1b, r) >>> len(M) 110 From josef.pktd at gmail.com Fri Feb 20 17:26:41 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Fri, 20 Feb 2009 17:26:41 -0500 Subject: [SciPy-user] problem with spatial.kdtree.sparse_distance_matrix In-Reply-To: <1cd32cbb0902201409g4e7c1fe3i51e902af89146718@mail.gmail.com> References: <1cd32cbb0902201409g4e7c1fe3i51e902af89146718@mail.gmail.com> Message-ID: <1cd32cbb0902201426y505121ban44b205af0be14c81@mail.gmail.com> On Fri, Feb 20, 2009 at 5:09 PM, wrote: > I would like to get the distance_matrix of all point in a 2d array, > but it looks like kdtree cannot create a sparse distance matrix with > itself. Is this intentional, a bug, or am I doing something wrong? > Using a small distortion in the data works. > I followed the example in the testfile (BTW: in class > test_sparse_distance_matrix, M is often empty in the examples I tried, > with given r=0.3). 
> > Josef > The problem is more general: KDTree.sparse_distance_matrix fails when there are zero distance points in the two trees, even if they are otherwise different. Example: >>> xs3 = np.random.randint(0,3,200).reshape(50,4) >>> xs4 = np.random.randint(0,3,200).reshape(50,4) >>> ds34 = ssp.distance_matrix(xs3,xs4) >>> np.min(ds34) 0.0 >>> T3 = ssp.KDTree(xs3,leafsize=2) >>> T4 = ssp.KDTree(xs4,leafsize=2) >>> M = T3.sparse_distance_matrix(T4, r) Traceback (most recent call last): File "", line 1, in M = T3.sparse_distance_matrix(T4, r) File "C:\Josef\_progs\building\scipy\scipy-trunk-new-r5551\dist\scipy-0.8.0.dev5551.win32\Programs\Python25\Lib\site-packages\scipy\spatial\kdtree.py", line 628, in sparse_distance_matrix other.tree, Rectangle(other.maxes, other.mins)) File "C:\Josef\_progs\building\scipy\scipy-trunk-new-r5551\dist\scipy-0.8.0.dev5551.win32\Programs\Python25\Lib\site-packages\scipy\spatial\kdtree.py", line 623, in traverse traverse(node1.less,less1,node2.less,less2) File "C:\Josef\_progs\building\scipy\scipy-trunk-new-r5551\dist\scipy-0.8.0.dev5551.win32\Programs\Python25\Lib\site-packages\scipy\spatial\kdtree.py", line 623, in traverse traverse(node1.less,less1,node2.less,less2) File "C:\Josef\_progs\building\scipy\scipy-trunk-new-r5551\dist\scipy-0.8.0.dev5551.win32\Programs\Python25\Lib\site-packages\scipy\spatial\kdtree.py", line 623, in traverse traverse(node1.less,less1,node2.less,less2) File "C:\Josef\_progs\building\scipy\scipy-trunk-new-r5551\dist\scipy-0.8.0.dev5551.win32\Programs\Python25\Lib\site-packages\scipy\spatial\kdtree.py", line 623, in traverse traverse(node1.less,less1,node2.less,less2) File "C:\Josef\_progs\building\scipy\scipy-trunk-new-r5551\dist\scipy-0.8.0.dev5551.win32\Programs\Python25\Lib\site-packages\scipy\spatial\kdtree.py", line 623, in traverse traverse(node1.less,less1,node2.less,less2) File "C:\Josef\_progs\building\scipy\scipy-trunk-new-r5551\dist\scipy-0.8.0.dev5551.win32\Programs\Python25\Lib\site-packages\scipy\spatial\kdtree.py", line 618, in traverse traverse(node1.less,less,node2,rect2) File "C:\Josef\_progs\building\scipy\scipy-trunk-new-r5551\dist\scipy-0.8.0.dev5551.win32\Programs\Python25\Lib\site-packages\scipy\spatial\kdtree.py", line 611, in traverse result[i,j] = d File "\Programs\Python25\Lib\site-packages\scipy\sparse\dok.py", line 222, in __setitem__ del self[(i,j)] KeyError: (7, 19) >>> From python-ml at nn7.de Sat Feb 21 11:23:55 2009 From: python-ml at nn7.de (Soeren Sonnenburg) Date: Sat, 21 Feb 2009 17:23:55 +0100 Subject: [SciPy-user] sparse matrices again In-Reply-To: References: <1234356072.5642.16.camel@localhost> <1235120542.4216.15.camel@localhost> Message-ID: <1235233435.8843.31.camel@localhost> On Fri, 2009-02-20 at 15:05 -0500, Nathan Bell wrote: > On Fri, Feb 20, 2009 at 4:02 AM, Soeren Sonnenburg wrote: > >> > >> What do you want to do exactly? > > > > Just a pointer to the standard ccs format (similar to what I have in > > octave) to be able to exchange data with shogun > > (www.shogun-toolbox.org). > > > > You can access those arrays in Python: > > A = csc_matrix( ... ) > A.indptr # pointer array > A.indices # indices array > A.data # nonzero values array I need to pass A to C++ code, so I need access to sparse arrays from C/C ++. Soeren. 
From schugschug at gmail.com Sat Feb 21 11:38:50 2009 From: schugschug at gmail.com (Eric Schug) Date: Sat, 21 Feb 2009 11:38:50 -0500 Subject: [SciPy-user] LiberMate was Re: Automating Matlab In-Reply-To: <4984F58C.5070605@gmail.com> References: <4984F58C.5070605@gmail.com> Message-ID: <49A02E1A.5050703@gmail.com> Eric Schug wrote: > Is there strong interest in automating matlab to numpy conversion? > > I have a working version of a matlab to python translator. > It allows translation of matlab scripts into numpy constructs, > supporting most of the matlab language. The parser is nearly > complete. Most of the remaining work involves providing a robust > translation. Such as > * making sure that copies on assign are done when needed. > * correct indexing a(:) becomes a.flatten(1) when on the left hand > side (lhs) of equals > and a[:] when on the right hand side > > > I've seen a few projects attempt to do this, but for one reason or > another have stopped it. > > For those interested, my new project has been uploaded sourceforge at, http://sourceforge.net/projects/libermate/ Latest version now supports simple command expressions (e.g hold on) From dmitrey15 at ukr.net Sat Feb 21 11:49:52 2009 From: dmitrey15 at ukr.net (Dmitrey) Date: Sat, 21 Feb 2009 18:49:52 +0200 Subject: [SciPy-user] LiberMate was Re: Automating Matlab In-Reply-To: <49A02E1A.5050703@gmail.com> References: <4984F58C.5070605@gmail.com> <49A02E1A.5050703@gmail.com> Message-ID: <49A030B0.5070202@ukr.net> Thank you Eric, it's very interesting, certainly it should be mentioned in scipy.org topical software section, also, I guess, it's worth mention in such mail lists as http://groups.google.com/group/comp.soft-sys.matlab, http://groups.google.com/group/comp.lang.python.announce, http://groups.google.com/group/sci.op-research, http://groups.google.com/group/sci.math.num-analysis Regards, Dmitrey Eric Schug wrote: > Eric Schug wrote: > > For those interested, my new project has been uploaded sourceforge at, > http://sourceforge.net/projects/libermate/ > > Latest version now supports simple command expressions (e.g hold on) > > From peridot.faceted at gmail.com Sat Feb 21 12:03:49 2009 From: peridot.faceted at gmail.com (Anne Archibald) Date: Sat, 21 Feb 2009 12:03:49 -0500 Subject: [SciPy-user] problem with spatial.kdtree.sparse_distance_matrix In-Reply-To: <1cd32cbb0902201426y505121ban44b205af0be14c81@mail.gmail.com> References: <1cd32cbb0902201409g4e7c1fe3i51e902af89146718@mail.gmail.com> <1cd32cbb0902201426y505121ban44b205af0be14c81@mail.gmail.com> Message-ID: 2009/2/20 : > On Fri, Feb 20, 2009 at 5:09 PM, wrote: >> I would like to get the distance_matrix of all point in a 2d array, >> but it looks like kdtree cannot create a sparse distance matrix with >> itself. Is this intentional, a bug, or am I doing something wrong? >> Using a small distortion in the data works. >> I followed the example in the testfile (BTW: in class >> test_sparse_distance_matrix, M is often empty in the examples I tried, >> with given r=0.3). >> >> Josef >> > > The problem is more general: > KDTree.sparse_distance_matrix fails when there are zero distance > points in the two trees, even if they are otherwise different. This was actually a bug in dok_matrix (setting an already-zero element to zero failed), which is now fixed in SVN. 
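(For the record, a minimal trigger seems to be something along these lines: from scipy.sparse import dok_matrix; d = dok_matrix((3, 3)); d[0, 0] = 0.0 -- assigning zero to an entry that is not stored went down the delete branch of __setitem__ and raised the KeyError you saw.)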
Anne > Example: > >>>> xs3 = np.random.randint(0,3,200).reshape(50,4) >>>> xs4 = np.random.randint(0,3,200).reshape(50,4) >>>> ds34 = ssp.distance_matrix(xs3,xs4) >>>> np.min(ds34) > 0.0 >>>> T3 = ssp.KDTree(xs3,leafsize=2) >>>> T4 = ssp.KDTree(xs4,leafsize=2) >>>> M = T3.sparse_distance_matrix(T4, r) > Traceback (most recent call last): > File "", line 1, in > M = T3.sparse_distance_matrix(T4, r) > File "C:\Josef\_progs\building\scipy\scipy-trunk-new-r5551\dist\scipy-0.8.0.dev5551.win32\Programs\Python25\Lib\site-packages\scipy\spatial\kdtree.py", > line 628, in sparse_distance_matrix > other.tree, Rectangle(other.maxes, other.mins)) > File "C:\Josef\_progs\building\scipy\scipy-trunk-new-r5551\dist\scipy-0.8.0.dev5551.win32\Programs\Python25\Lib\site-packages\scipy\spatial\kdtree.py", > line 623, in traverse > traverse(node1.less,less1,node2.less,less2) > File "C:\Josef\_progs\building\scipy\scipy-trunk-new-r5551\dist\scipy-0.8.0.dev5551.win32\Programs\Python25\Lib\site-packages\scipy\spatial\kdtree.py", > line 623, in traverse > traverse(node1.less,less1,node2.less,less2) > File "C:\Josef\_progs\building\scipy\scipy-trunk-new-r5551\dist\scipy-0.8.0.dev5551.win32\Programs\Python25\Lib\site-packages\scipy\spatial\kdtree.py", > line 623, in traverse > traverse(node1.less,less1,node2.less,less2) > File "C:\Josef\_progs\building\scipy\scipy-trunk-new-r5551\dist\scipy-0.8.0.dev5551.win32\Programs\Python25\Lib\site-packages\scipy\spatial\kdtree.py", > line 623, in traverse > traverse(node1.less,less1,node2.less,less2) > File "C:\Josef\_progs\building\scipy\scipy-trunk-new-r5551\dist\scipy-0.8.0.dev5551.win32\Programs\Python25\Lib\site-packages\scipy\spatial\kdtree.py", > line 623, in traverse > traverse(node1.less,less1,node2.less,less2) > File "C:\Josef\_progs\building\scipy\scipy-trunk-new-r5551\dist\scipy-0.8.0.dev5551.win32\Programs\Python25\Lib\site-packages\scipy\spatial\kdtree.py", > line 618, in traverse > traverse(node1.less,less,node2,rect2) > File "C:\Josef\_progs\building\scipy\scipy-trunk-new-r5551\dist\scipy-0.8.0.dev5551.win32\Programs\Python25\Lib\site-packages\scipy\spatial\kdtree.py", > line 611, in traverse > result[i,j] = d > File "\Programs\Python25\Lib\site-packages\scipy\sparse\dok.py", > line 222, in __setitem__ > del self[(i,j)] > KeyError: (7, 19) >>>> > _______________________________________________ > SciPy-user mailing list > SciPy-user at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-user > From stefan at sun.ac.za Sat Feb 21 12:19:28 2009 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Sat, 21 Feb 2009 19:19:28 +0200 Subject: [SciPy-user] sparse matrices again In-Reply-To: <1235233435.8843.31.camel@localhost> References: <1234356072.5642.16.camel@localhost> <1235120542.4216.15.camel@localhost> <1235233435.8843.31.camel@localhost> Message-ID: <9457e7c80902210919n229e6ad3i9f0de4bdde822838@mail.gmail.com> 2009/2/21 Soeren Sonnenburg : >> >> What do you want to do exactly? >> > >> > Just a pointer to the standard ccs format (similar to what I have in >> > octave) to be able to exchange data with shogun >> > (www.shogun-toolbox.org). 
The indices, data values and index pointers are stored as ndarrays, so you can access those pointer locations: x.indices.ctypes.data Regards St?fan From josef.pktd at gmail.com Sat Feb 21 12:38:09 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Sat, 21 Feb 2009 12:38:09 -0500 Subject: [SciPy-user] problem with spatial.kdtree.sparse_distance_matrix In-Reply-To: References: <1cd32cbb0902201409g4e7c1fe3i51e902af89146718@mail.gmail.com> <1cd32cbb0902201426y505121ban44b205af0be14c81@mail.gmail.com> Message-ID: <1cd32cbb0902210938w2671699fh50edc89d5f833911@mail.gmail.com> On Sat, Feb 21, 2009 at 12:03 PM, Anne Archibald wrote: > 2009/2/20 : >> On Fri, Feb 20, 2009 at 5:09 PM, wrote: >>> I would like to get the distance_matrix of all point in a 2d array, >>> but it looks like kdtree cannot create a sparse distance matrix with >>> itself. Is this intentional, a bug, or am I doing something wrong? >>> Using a small distortion in the data works. >>> I followed the example in the testfile (BTW: in class >>> test_sparse_distance_matrix, M is often empty in the examples I tried, >>> with given r=0.3). >>> >>> Josef >>> >> >> The problem is more general: >> KDTree.sparse_distance_matrix fails when there are zero distance >> points in the two trees, even if they are otherwise different. > > This was actually a bug in dok_matrix (setting an already-zero element > to zero failed), which is now fixed in SVN. > > Anne > thank you, I will try it out soon. Josef From nwagner at iam.uni-stuttgart.de Sat Feb 21 12:39:22 2009 From: nwagner at iam.uni-stuttgart.de (Nils Wagner) Date: Sat, 21 Feb 2009 18:39:22 +0100 Subject: [SciPy-user] LiberMate was Re: Automating Matlab In-Reply-To: <49A02E1A.5050703@gmail.com> References: <4984F58C.5070605@gmail.com> <49A02E1A.5050703@gmail.com> Message-ID: On Sat, 21 Feb 2009 11:38:50 -0500 Eric Schug wrote: > Eric Schug wrote: >> Is there strong interest in automating matlab to numpy >>conversion? >> >> I have a working version of a matlab to python >>translator. >> It allows translation of matlab scripts into numpy >>constructs, >> supporting most of the matlab language. The parser is >>nearly >> complete. Most of the remaining work involves providing >>a robust >> translation. Such as >> * making sure that copies on assign are done when >>needed. >> * correct indexing a(:) becomes a.flatten(1) when on >>the left hand >> side (lhs) of equals >> and a[:] when on the right hand side >> >> >> I've seen a few projects attempt to do this, but for one >>reason or >> another have stopped it. >> >> >For those interested, my new project has been uploaded >sourceforge at, > http://sourceforge.net/projects/libermate/ > > Latest version now supports simple command expressions >(e.g hold on) > > > > _______________________________________________ > SciPy-user mailing list > SciPy-user at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-user Hi Eric, You might be interested in some tests. 
Different Matlab Toolboxes are available through http://www.maths.manchester.ac.uk/~higham/mg/ Cheers, Nils From josef.pktd at gmail.com Sat Feb 21 14:28:51 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Sat, 21 Feb 2009 14:28:51 -0500 Subject: [SciPy-user] math operations on sparse matrices Message-ID: <1cd32cbb0902211128x9472a59n53e79ecfda4b9fca@mail.gmail.com> I have a (dok) sparse distance matrix and I would like to take the exponential of the distances The direct method doesn't work >>> Mexp = np.exp(-M) Traceback (most recent call last): File "", line 1, in Mexp = np.exp(-M) AttributeError: exp The following seems to work. What is the recommended way for doing these transformations? I didn't see anything in the docs. >>> T1 = ssp.KDTree(xs3[::k,:],leafsize=2) >>> M = T1.sparse_distance_matrix(T1, r) >>> Mexp = M.copy() >>> for k,v in M.items(): Mexp[k]=np.exp(-v) thanks, Josef From opossumnano at gmail.com Sat Feb 21 14:33:11 2009 From: opossumnano at gmail.com (Tiziano Zito) Date: Sat, 21 Feb 2009 20:33:11 +0100 Subject: [SciPy-user] Matlab to Python compiler was Re: LiberMate In-Reply-To: References: <4984F58C.5070605@gmail.com> <49A02E1A.5050703@gmail.com> Message-ID: <20090221193310.GA2066@localhost> You may be all interested in http://ompc.juricap.com/ it seems very promising too. tiziano On Sat 21 Feb, 18:39, Nils Wagner wrote: > On Sat, 21 Feb 2009 11:38:50 -0500 > Eric Schug wrote: > > Eric Schug wrote: > >> Is there strong interest in automating matlab to numpy > >>conversion? > >> > >> I have a working version of a matlab to python > >>translator. > >> It allows translation of matlab scripts into numpy > >>constructs, > >> supporting most of the matlab language. The parser is > >>nearly > >> complete. Most of the remaining work involves providing > >>a robust > >> translation. Such as > >> * making sure that copies on assign are done when > >>needed. > >> * correct indexing a(:) becomes a.flatten(1) when on > >>the left hand > >> side (lhs) of equals > >> and a[:] when on the right hand side > >> > >> > >> I've seen a few projects attempt to do this, but for one > >>reason or > >> another have stopped it. > >> > >> > >For those interested, my new project has been uploaded > >sourceforge at, > > http://sourceforge.net/projects/libermate/ > > > > Latest version now supports simple command expressions > >(e.g hold on) > > > > > > > > _______________________________________________ > > SciPy-user mailing list > > SciPy-user at scipy.org > > http://projects.scipy.org/mailman/listinfo/scipy-user > > Hi Eric, > > You might be interested in some tests. > > Different Matlab Toolboxes are available through > > http://www.maths.manchester.ac.uk/~higham/mg/ > > Cheers, > > Nils > > _______________________________________________ > SciPy-user mailing list > SciPy-user at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-user From shchelokovskyy at gmail.com Sat Feb 21 14:40:47 2009 From: shchelokovskyy at gmail.com (Pavlo Shchelokovskyy) Date: Sat, 21 Feb 2009 20:40:47 +0100 Subject: [SciPy-user] error estimate in stats.linregress Message-ID: Hi all, I was working with linear regression in scipy and met some problems with value of standard error of the estimate returned by scipy.stats.linregress() function. I could not compare it to similar outputs of other linear regression routines (for example in Origin), so I took a look in the source (stats.py). 
In the source it is defined as sterrest = np.sqrt((1-r*r)*ss(y) / ss(x) / df) where r is correlation coefficient, df is degrees of freedom (N-2) and ss() is sum of squares of elements. After digging through literature the only formula looking somewhat the same was found to be stderrest = np.sqrt((1-r*r)*ss(y-y.mean())/df) which gives the same result as a standard definition (in notation of the source of linregress) stderrest = np.sqrt(ss(y-slope*x-intercept)/df) but the output of linregress is different. I humbly suppose this is a bug, but maybe somebody could explain me what is it if I'm wrong... Pavlo. From dannoritzer at web.de Sat Feb 21 14:48:59 2009 From: dannoritzer at web.de (=?ISO-8859-1?Q?G=FCnter_Dannoritzer?=) Date: Sat, 21 Feb 2009 20:48:59 +0100 Subject: [SciPy-user] remezord() ticket #475 In-Reply-To: <6a17e9ee0902200650r1b92b81eu79793f9e4aaae4c3@mail.gmail.com> References: <499EABA9.9000001@web.de> <9457e7c80902200605u4ad666bcj5abc6cd0f8334456@mail.gmail.com> <6a17e9ee0902200650r1b92b81eu79793f9e4aaae4c3@mail.gmail.com> Message-ID: <49A05AAB.6070708@web.de> Scott Sinclair wrote: >> 2009/2/20 St?fan van der Walt : >> Hi G?nter ... >> Would you like to help? We need to reformat the docstring according >> to the NumPy standard, and add tests to make sure that all those new >> functions do what they're supposed to. If you can provide the tests, >> that would help a great deal! > > There are some guidelines on writing tests here: > > http://projects.scipy.org/scipy/numpy/wiki/TestingGuidelines > > If you don't have nose installed, you could make tests that run with > the standard Python module unittest > (http://docs.python.org/library/unittest.html). Nose will pick these > up automatically and has many extra features that make it more > attractive than unittest. St?fan and Scott, thanks for the information. I will give it a try to create some test cases. I wrote Lev Givon, the author of those functions an email to discuss some ideas about testing. I was not sure whether he is still subscribed to this list and wrote him direct. I will read the test guidelines and come back if I have some questions about them. Guenter From fredmfp at gmail.com Sat Feb 21 14:57:19 2009 From: fredmfp at gmail.com (fred) Date: Sat, 21 Feb 2009 20:57:19 +0100 Subject: [SciPy-user] scipy.org issue? Message-ID: <49A05C9F.4000404@gmail.com> Hi all, On http://www.scipy.org/SciPy_packages, one can see a link to Signal packages, but... nothing behind... :-( The page does not exist. What's wrong? Cheers, -- Fred From josef.pktd at gmail.com Sat Feb 21 15:11:32 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Sat, 21 Feb 2009 15:11:32 -0500 Subject: [SciPy-user] scipy.org issue? In-Reply-To: <49A05C9F.4000404@gmail.com> References: <49A05C9F.4000404@gmail.com> Message-ID: <1cd32cbb0902211211u19c0efb3h38195b48ee11d0a3@mail.gmail.com> On Sat, Feb 21, 2009 at 2:57 PM, fred wrote: > Hi all, > > On http://www.scipy.org/SciPy_packages, > one can see a link to Signal packages, but... > nothing behind... :-( > > The page does not exist. > > What's wrong? > > > Cheers, > > -- > Fred The new documentation is here: http://docs.scipy.org/doc/scipy/reference/ see tutorial for signal and reference. The documentation editor for writing new docs is here: http://docs.scipy.org/scipy/Front%20Page/ Josef From fredmfp at gmail.com Sat Feb 21 16:16:47 2009 From: fredmfp at gmail.com (fred) Date: Sat, 21 Feb 2009 22:16:47 +0100 Subject: [SciPy-user] scipy.org issue? 
In-Reply-To: <1cd32cbb0902211211u19c0efb3h38195b48ee11d0a3@mail.gmail.com> References: <49A05C9F.4000404@gmail.com> <1cd32cbb0902211211u19c0efb3h38195b48ee11d0a3@mail.gmail.com> Message-ID: <49A06F3F.8030707@gmail.com> josef.pktd at gmail.com a ?crit : > The new documentation is here: > http://docs.scipy.org/doc/scipy/reference/ see tutorial for signal > and reference. > The documentation editor for writing new docs is here: > http://docs.scipy.org/scipy/Front%20Page/ Bookmarked! ;-) Thanks. Cheers, -- Fred From stefan at sun.ac.za Sat Feb 21 18:43:41 2009 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Sun, 22 Feb 2009 01:43:41 +0200 Subject: [SciPy-user] math operations on sparse matrices In-Reply-To: <1cd32cbb0902211128x9472a59n53e79ecfda4b9fca@mail.gmail.com> References: <1cd32cbb0902211128x9472a59n53e79ecfda4b9fca@mail.gmail.com> Message-ID: <9457e7c80902211543l480bad71y1d5efbdfccbc2200@mail.gmail.com> 2009/2/21 : > I have a (dok) sparse distance matrix and I would like to take the > exponential of the distances I guess you could just as well switch to dense matrices then, since exp(0) is no longer zero. If you just want to change the non-zero values, you can use x.data = np.exp(x.data) Regards St?fan From stefan at sun.ac.za Sat Feb 21 18:47:54 2009 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Sun, 22 Feb 2009 01:47:54 +0200 Subject: [SciPy-user] remezord() ticket #475 In-Reply-To: <49A05AAB.6070708@web.de> References: <499EABA9.9000001@web.de> <9457e7c80902200605u4ad666bcj5abc6cd0f8334456@mail.gmail.com> <6a17e9ee0902200650r1b92b81eu79793f9e4aaae4c3@mail.gmail.com> <49A05AAB.6070708@web.de> Message-ID: <9457e7c80902211547k61084e83g347822127bd4027@mail.gmail.com> 2009/2/21 G?nter Dannoritzer : > St?fan and Scott, thanks for the information. I will give it a try to > create some test cases. Thanks for your help, G?nter! I look forward to integrating your tests. Regards St?fan From stefan at sun.ac.za Sat Feb 21 19:03:59 2009 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Sun, 22 Feb 2009 02:03:59 +0200 Subject: [SciPy-user] scipy.org issue? In-Reply-To: <49A05C9F.4000404@gmail.com> References: <49A05C9F.4000404@gmail.com> Message-ID: <9457e7c80902211603x4061b20bo21ddbd72ba02d9a9@mail.gmail.com> Hi Fred 2009/2/21 fred : > On http://www.scipy.org/SciPy_packages, > one can see a link to Signal packages, but... > nothing behind... :-( Thanks for this report. I've added links to the docs editor on that page, which should become permanent as soon as the server decides to respond. Cheers St?fan From hetland at tamu.edu Sat Feb 21 19:38:14 2009 From: hetland at tamu.edu (Rob Hetland) Date: Sat, 21 Feb 2009 18:38:14 -0600 Subject: [SciPy-user] Error in scipy.spatial.cKDTree Message-ID: I am getting a strange error in scipy.spatial.cKDtree: # make a sample array. fill_value = 0.5 x = np.random.rand(25, 50) x = x.clip(min=fill_value, max=inf) # Create (i, j) point arrays for good and bad data. # Bad data is marked by the fill_value, good data elsewhere. igood = np.vstack(np.where(x!=fill_value)).astype('d').T ibad = np.vstack(np.where(x==fill_value)).astype('d').T # create a tree for the bad points, the points to be filled tree = scipy.spatial.cKDTree(ibad) # get the four closest points to the bad points dist, iquery = tree.query(igood, k=4, p=2) np.any(dist == 0) I get True for the last command, which should not be. Other implementations of kdtree that I have, including regular KDTree. 
I'm not good enough at C to track the code down. -Rob ---- Rob Hetland, Associate Professor Dept. of Oceanography, Texas A&M University http://pong.tamu.edu/~rob phone: 979-458-0096, fax: 979-845-6331 From wnbell at gmail.com Sat Feb 21 20:20:21 2009 From: wnbell at gmail.com (Nathan Bell) Date: Sat, 21 Feb 2009 20:20:21 -0500 Subject: [SciPy-user] math operations on sparse matrices In-Reply-To: <1cd32cbb0902211128x9472a59n53e79ecfda4b9fca@mail.gmail.com> References: <1cd32cbb0902211128x9472a59n53e79ecfda4b9fca@mail.gmail.com> Message-ID: On Sat, Feb 21, 2009 at 2:28 PM, wrote: > I have a (dok) sparse distance matrix and I would like to take the > exponential of the distances > > The direct method doesn't work >>>> Mexp = np.exp(-M) > Traceback (most recent call last): > File "", line 1, in > Mexp = np.exp(-M) > AttributeError: exp > > The following seems to work. What is the recommended way for doing > these transformations? > I didn't see anything in the docs. > >>>> T1 = ssp.KDTree(xs3[::k,:],leafsize=2) >>>> M = T1.sparse_distance_matrix(T1, r) >>>> Mexp = M.copy() >>>> for k,v in M.items(): > Mexp[k]=np.exp(-v) > Does the following solve your problem? >>> M = M.tocsr() >>> M.data = np.exp(M.data) I suppose we could add a transform(A, fn) function to scipy.sparse to codify this sort of thing. Any suggestions? The difference here is that fn() would only be applied to the nonzero entries of A (and to the explicit zeros, if they are also present). I'd rather not encourage people to manipulate the underlying CSR/CSC representations directly, so tranform() would be a nice way to expose this. -- Nathan Bell wnbell at gmail.com http://graphics.cs.uiuc.edu/~wnbell/ From josef.pktd at gmail.com Sat Feb 21 20:33:48 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Sat, 21 Feb 2009 20:33:48 -0500 Subject: [SciPy-user] math operations on sparse matrices In-Reply-To: <9457e7c80902211543l480bad71y1d5efbdfccbc2200@mail.gmail.com> References: <1cd32cbb0902211128x9472a59n53e79ecfda4b9fca@mail.gmail.com> <9457e7c80902211543l480bad71y1d5efbdfccbc2200@mail.gmail.com> Message-ID: <1cd32cbb0902211733qbc53f44va3679667351714f1@mail.gmail.com> On Sat, Feb 21, 2009 at 6:43 PM, St?fan van der Walt wrote: > 2009/2/21 : >> I have a (dok) sparse distance matrix and I would like to take the >> exponential of the distances > > I guess you could just as well switch to dense matrices then, since > exp(0) is no longer zero. > > If you just want to change the non-zero values, you can use > > x.data = np.exp(x.data) > > Regards > St?fan That was also my first guess, however >>> M <50x50 sparse matrix of type '' with 208 stored elements in Dictionary Of Keys format> >>> M.data Traceback (most recent call last): File "", line 1, in M.data File "\Programs\Python25\Lib\site-packages\scipy\sparse\base.py", line 429, in __getattr__ AttributeError: data not found For now this seems to work pretty fast Mexp = M.copy() Mexp.update(((k,exp(-v)) for k,v in M.iteritems())) But I'm not sure I know what I'm doing. What I'm trying to do is something like OLS with a sparse X'X matrix (kernel rigdge regression). The next step are: alpha = sparse.linalg.minres(M,y) yhat = M1.matmat(alpha[0]) >From the graphical results it seems to work, but since this is my first try with scipy.sparse.linalg, I'm not sure what the methods to in detail. 
Josef From wnbell at gmail.com Sat Feb 21 22:50:29 2009 From: wnbell at gmail.com (Nathan Bell) Date: Sat, 21 Feb 2009 22:50:29 -0500 Subject: [SciPy-user] math operations on sparse matrices In-Reply-To: <1cd32cbb0902211733qbc53f44va3679667351714f1@mail.gmail.com> References: <1cd32cbb0902211128x9472a59n53e79ecfda4b9fca@mail.gmail.com> <9457e7c80902211543l480bad71y1d5efbdfccbc2200@mail.gmail.com> <1cd32cbb0902211733qbc53f44va3679667351714f1@mail.gmail.com> Message-ID: On Sat, Feb 21, 2009 at 8:33 PM, wrote: > > That was also my first guess, however > >>>> M > <50x50 sparse matrix of type '' > with 208 stored elements in Dictionary Of Keys format> >>>> M.data > Traceback (most recent call last): > File "", line 1, in > M.data > File "\Programs\Python25\Lib\site-packages\scipy\sparse\base.py", > line 429, in __getattr__ > AttributeError: data not found > Note the .tocsr() in the first step: >>> M = M.tocsr() >>> M.data = np.exp(M.data) > From the graphical results it seems to work, but since this is my > first try with scipy.sparse.linalg, I'm not sure what the methods to > in detail. You'll want to convert to CSR (or CSC) before calling those solvers anyway. CSR/CSC offer much faster matrix-vector products, the main cost in most iterative methods. -- Nathan Bell wnbell at gmail.com http://graphics.cs.uiuc.edu/~wnbell/ From peridot.faceted at gmail.com Sat Feb 21 23:07:41 2009 From: peridot.faceted at gmail.com (Anne Archibald) Date: Sat, 21 Feb 2009 23:07:41 -0500 Subject: [SciPy-user] Error in scipy.spatial.cKDTree In-Reply-To: References: Message-ID: 2009/2/21 Rob Hetland : > > I am getting a strange error in scipy.spatial.cKDtree: Oops! Fixed in SVN r5585. The error happens when the query array is not "contiguous"; the easiest way to trigger it is to do a query with a transposed array; the query coordinates will be scrambled. As a workaround, just apply np.ascontiguousarray() to any query. Anne > # make a sample array. > fill_value = 0.5 > x = np.random.rand(25, 50) > x = x.clip(min=fill_value, max=inf) > > # Create (i, j) point arrays for good and bad data. > # Bad data is marked by the fill_value, good data elsewhere. > igood = np.vstack(np.where(x!=fill_value)).astype('d').T > ibad = np.vstack(np.where(x==fill_value)).astype('d').T > > # create a tree for the bad points, the points to be filled > tree = scipy.spatial.cKDTree(ibad) > > # get the four closest points to the bad points > dist, iquery = tree.query(igood, k=4, p=2) > > np.any(dist == 0) > > > > > I get True for the last command, which should not be. Other > implementations of kdtree that I have, including regular KDTree. I'm > not good enough at C to track the code down. > > -Rob > > > > ---- > Rob Hetland, Associate Professor > Dept. 
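In your snippet below that would be, roughly:

    dist, iquery = tree.query(np.ascontiguousarray(igood), k=4, p=2)

which should give sensible distances again on 0.7.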
of Oceanography, Texas A&M University > http://pong.tamu.edu/~rob > phone: 979-458-0096, fax: 979-845-6331 > > > > _______________________________________________ > SciPy-user mailing list > SciPy-user at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-user > From josef.pktd at gmail.com Sat Feb 21 23:12:58 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Sat, 21 Feb 2009 23:12:58 -0500 Subject: [SciPy-user] math operations on sparse matrices In-Reply-To: References: <1cd32cbb0902211128x9472a59n53e79ecfda4b9fca@mail.gmail.com> <9457e7c80902211543l480bad71y1d5efbdfccbc2200@mail.gmail.com> <1cd32cbb0902211733qbc53f44va3679667351714f1@mail.gmail.com> Message-ID: <1cd32cbb0902212012t2004592ah977f485e014f7f77@mail.gmail.com> On Sat, Feb 21, 2009 at 10:50 PM, Nathan Bell wrote: > On Sat, Feb 21, 2009 at 8:33 PM, wrote: >> >> That was also my first guess, however >> >>>>> M >> <50x50 sparse matrix of type '' >> with 208 stored elements in Dictionary Of Keys format> >>>>> M.data >> Traceback (most recent call last): >> File "", line 1, in >> M.data >> File "\Programs\Python25\Lib\site-packages\scipy\sparse\base.py", >> line 429, in __getattr__ >> AttributeError: data not found >> > > > Note the .tocsr() in the first step: > >>>> M = M.tocsr() >>>> M.data = np.exp(M.data) > > >> From the graphical results it seems to work, but since this is my >> first try with scipy.sparse.linalg, I'm not sure what the methods to >> in detail. > > You'll want to convert to CSR (or CSC) before calling those solvers > anyway. CSR/CSC offer much faster matrix-vector products, the main > cost in most iterative methods. > Yes, thanks after your message, I started to compare both formats and the is a big time difference csr is much faster. I had used dok, because that's what I got from spatial. I started to increase my test size and I am working now with a distance matrix of 10000 by 10000, and the solver seems to work pretty well. Initially, I was worried about the LinearOperator because I wasn't sure what the solution means when I transform a sparse distance matrix. But the graphs look good, so, I guess, it works. With scipy.spatial and scipy.sparse it's pretty quick to write a large scale regression problem like this. Thanks, Josef From hetland at tamu.edu Sun Feb 22 00:35:12 2009 From: hetland at tamu.edu (Rob Hetland) Date: Sat, 21 Feb 2009 23:35:12 -0600 Subject: [SciPy-user] Error in scipy.spatial.cKDTree In-Reply-To: References: Message-ID: On Feb 21, 2009, at 10:07 PM, Anne Archibald wrote: > > Oops! Fixed in SVN r5585. > > The error happens when the query array is not "contiguous"; the > easiest way to trigger it is to do a query with a transposed array; > the query coordinates will be scrambled. As a workaround, just apply > np.ascontiguousarray() to any query. Thanks. I tried the pure python version in the meantime, and it was surprisingly fast for pure python .. Hopefully the c version will be even faster still (and give the right answer to boot!). For posterity, here is some code that fills in sparse arrays that should not be sparse, below, for when delaunay interpolation is overkill. (what I was working on when I found the bug). -Rob import numpy as np from scipy.spatial import cKDTree def fill_nearest(x, fill_value=0.0): '''Fill missing values in an array with an average of nearest neighbors.''' assert x.ndim == 2, 'x must be a 2D array.' # Create (i, j) point arrays for good and bad data. # Bad data is marked by the fill_value, good data elsewhere. 
igood = np.vstack(np.where(x!=fill_value)).T ibad = np.vstack(np.where(x==fill_value)).T # create a tree for the bad points, the points to be filled # ann_tree = ann.kd_tree(igood) tree = cKDTree(igood) # get the four closest points to the bad points # iquery, dist = ann_tree.search(ibad, k=4) # here, distance is squared dist, iquery = tree.query(ibad, k=4, p=2) # create a weight normalized the nearest points are weighted as 1. # points greater than one are then set to zero. weight = dist/(dist.min(axis=1)[:, newaxis] * ones_like(dist)) weight[weight > 1] = 0 # multiply the queried good points by the weight, selecting only the near # points. Divide by the number of nearest points to get average. xfill = weight * x[igood[:,0][iquery], igood[:,1][iquery]] xfill = (xfill/weight.sum(axis=1)[:, newaxis]).sum(axis=1) # place average of nearest good points, xfill, into bad point locations. x[ibad[:,0], ibad[:,1]] = xfill return x ---- Rob Hetland, Associate Professor Dept. of Oceanography, Texas A&M University http://pong.tamu.edu/~rob phone: 979-458-0096, fax: 979-845-6331 From cmutel at gmail.com Sun Feb 22 02:23:33 2009 From: cmutel at gmail.com (Christopher Mutel) Date: Sun, 22 Feb 2009 08:23:33 +0100 Subject: [SciPy-user] Question about implementation of a directed acyclic graph of formulas and variables Message-ID: <5e5978e10902212323o1f9dc4dn17b32d8c986f3440@mail.gmail.com> Hello all- I am working on a model that uses a large set of linear equations. SciPy provides a set of tools that help very much in my case (especially sparse matrix stuff), and I hope it is okay if I ask the general SciPy community for advice on a further development of my model. I am sure that some of you have already dealt with the questions that I am struggling with. I would like to replace some of the numbers used to construct my matrix with a directed acyclic graph of formulas and variables, to represent the fact that many model components are not independent of one another. This is especially useful when doing Monte Carlo analysis, where every element in the set of linear equations has an associated uncertainty distribution. In the model I am working on, the linear equations represent physical processes in the industrial economy, and its makes the model more accurate to say that, for example, the NOx production in a boiler is a function of the temperature of the boiler, or the fuel consumption of a truck is a function of the load. The alternative, which is what I do now, is assume these parameters are independently distributed. My questions are: 1. To store my graph of references, I need to choose an existing python graph implementation. Does anyone have ideas on what would be best in my specific case? I only need a graph implementation to ensure transitive closure (no circular references), and to allow a way to keep track of references so the entire graph can be easily and correctly re-calculated. NetworkX seems like tremendous overkill in this case. 2. Is there a "best" way to write a formula? Perhaps there are libraries for something like this? I was thinking of a class like: class Formula(object): formula = "foo" references = [bar1, bar2] A key point here is that the formula itself must be stored in a SQL database, and human-readable (at least to some extent). I am sure that there is someone out there who has though a lot about these types of issues, and has a decent solution. I don't think something like SymPy would work here, though of course I may be wrong. 
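To make that a bit more concrete, what I have in mind is something along these lines (only a sketch, and the names are made up):

    class Formula(object):
        def __init__(self, expression, references):
            # human-readable text, e.g. "0.9 * boiler_temperature", which is
            # what gets stored in the SQL table
            self.expression = expression
            # the Variable/Formula objects this formula depends on; these are
            # the edges of the directed acyclic graph
            self.references = references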
Respectfully yours, Chris -- ############################ Chris Mutel ?kologisches Systemdesign - Ecological Systems Design Institut f.Umweltingenieurwissenschaften - Institute for Environmental Engineering ETH Z?rich - HIF C 42 - Schafmattstr. 6 8093 Z?rich Telefon: +41 44 633 71 45 - Fax: +41 44 633 10 61 ############################ From robert.kern at gmail.com Sun Feb 22 02:46:28 2009 From: robert.kern at gmail.com (Robert Kern) Date: Sun, 22 Feb 2009 01:46:28 -0600 Subject: [SciPy-user] Question about implementation of a directed acyclic graph of formulas and variables In-Reply-To: <5e5978e10902212323o1f9dc4dn17b32d8c986f3440@mail.gmail.com> References: <5e5978e10902212323o1f9dc4dn17b32d8c986f3440@mail.gmail.com> Message-ID: <3d375d730902212346w288ce1b2pb54d0e55db5804ae@mail.gmail.com> On Sun, Feb 22, 2009 at 01:23, Christopher Mutel wrote: > Hello all- > > I am working on a model that uses a large set of linear equations. > SciPy provides a set of tools that help very much in my case > (especially sparse matrix stuff), and I hope it is okay if I ask the > general SciPy community for advice on a further development of my > model. I am sure that some of you have already dealt with the > questions that I am struggling with. > > I would like to replace some of the numbers used to construct my > matrix with a directed acyclic graph of formulas and variables, to > represent the fact that many model components are not independent of > one another. This is especially useful when doing Monte Carlo > analysis, where every element in the set of linear equations has an > associated uncertainty distribution. In the model I am working on, the > linear equations represent physical processes in the industrial > economy, and its makes the model more accurate to say that, for > example, the NOx production in a boiler is a function of the > temperature of the boiler, or the fuel consumption of a truck is a > function of the load. The alternative, which is what I do now, is > assume these parameters are independently distributed. > > My questions are: > > 1. To store my graph of references, I need to choose an existing > python graph implementation. Does anyone have ideas on what would be > best in my specific case? I only need a graph implementation to ensure > transitive closure (no circular references), and to allow a way to > keep track of references so the entire graph can be easily and > correctly re-calculated. NetworkX seems like tremendous overkill in > this case. You probably just want a simple dict mapping nodes to lists of adjacent nodes (following the direction of the arrows). Then you just need an implementation of the appropriate algorithms on top of this data structure. You might find what you need here (we had a similar problem once): https://svn.enthought.com/svn/enthought/EnthoughtBase/trunk/enthought/util/graph.py > 2. Is there a "best" way to write a formula? Perhaps there are > libraries for something like this? I was thinking of a class like: > > class Formula(object): > formula = "foo" > references = [bar1, bar2] > > A key point here is that the formula itself must be stored in a SQL > database, and human-readable (at least to some extent). I am sure that > there is someone out there who has though a lot about these types of > issues, and has a decent solution. I don't think something like SymPy > would work here, though of course I may be wrong. I think sympy is probably an excellent option for you, but I'm not entirely clear on what your formulae look like. 
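For what it's worth, a cycle check plus evaluation order over such a dict is only a handful of lines. Something like this (untested sketch; the nodes are whatever objects represent your variables and formulas):

    def evaluation_order(graph):
        # graph: dict mapping each node to the list of nodes it depends on
        order, visiting, done = [], set(), set()
        def visit(node):
            if node in done:
                return
            if node in visiting:
                raise ValueError("circular reference involving %r" % (node,))
            visiting.add(node)
            for dep in graph.get(node, []):
                visit(dep)
            visiting.remove(node)
            done.add(node)
            order.append(node)
        for node in graph:
            visit(node)
        # dependencies always come before the nodes that use them
        return order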
Remember that you can always pickle sympy expressions if they need to get stored in a SQL database. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From dmitrey15 at ukr.net Sun Feb 22 03:53:20 2009 From: dmitrey15 at ukr.net (Dmitrey) Date: Sun, 22 Feb 2009 10:53:20 +0200 Subject: [SciPy-user] Question about implementation of a directed acyclic graph of formulas and variables In-Reply-To: <5e5978e10902212323o1f9dc4dn17b32d8c986f3440@mail.gmail.com> References: <5e5978e10902212323o1f9dc4dn17b32d8c986f3440@mail.gmail.com> Message-ID: <49A11280.3050607@ukr.net> Hi Christopher, I'm working on OpenOpt oofun concept http://openopt.org/oofun it doesn't have any graphical back-end yet but I hope someday it will be binded to a one. The concept is something like MATLAB's SIMULINK (as I have mentioned, without graphical back-end yet). Also,it is intended first of all to numerical optimization (so oofuns have derivatives subfield, and in future some more will be added, like desired lb-ub bounds for each oofun) rather than SIMULINK that is keen on signal processing. There are other persons in our numerical optimization dept that are working on directed acyclic graph of formulas and variables (including SQL databases), however, unfortunately they work using C++/Rational Rose, no Python. BTW they handle even cyclic graphs (via separating cycles and handling them via non-linear systems solver, like scipy.optimize fsolve). Regards, D. P.S. unfortunately, because of some my changes oolin concept doesn't work for latest svn snapshot. Christopher Mutel wrote: > Hello all- > > I am working on a model that uses a large set of linear equations. > SciPy provides a set of tools that help very much in my case > (especially sparse matrix stuff), and I hope it is okay if I ask the > general SciPy community for advice on a further development of my > model. I am sure that some of you have already dealt with the > questions that I am struggling with. > > I would like to replace some of the numbers used to construct my > matrix with a directed acyclic graph of formulas and variables, to > represent the fact that many model components are not independent of > one another. This is especially useful when doing Monte Carlo > analysis, where every element in the set of linear equations has an > associated uncertainty distribution. In the model I am working on, the > linear equations represent physical processes in the industrial > economy, and its makes the model more accurate to say that, for > example, the NOx production in a boiler is a function of the > temperature of the boiler, or the fuel consumption of a truck is a > function of the load. The alternative, which is what I do now, is > assume these parameters are independently distributed. > > My questions are: > > 1. To store my graph of references, I need to choose an existing > python graph implementation. Does anyone have ideas on what would be > best in my specific case? I only need a graph implementation to ensure > transitive closure (no circular references), and to allow a way to > keep track of references so the entire graph can be easily and > correctly re-calculated. NetworkX seems like tremendous overkill in > this case. > > 2. Is there a "best" way to write a formula? Perhaps there are > libraries for something like this? 
I was thinking of a class like: > > class Formula(object): > formula = "foo" > references = [bar1, bar2] > > A key point here is that the formula itself must be stored in a SQL > database, and human-readable (at least to some extent). I am sure that > there is someone out there who has though a lot about these types of > issues, and has a decent solution. I don't think something like SymPy > would work here, though of course I may be wrong. > > Respectfully yours, > > Chris > > > From tritemio at gmail.com Sun Feb 22 06:48:34 2009 From: tritemio at gmail.com (Antonino Ingargiola) Date: Sun, 22 Feb 2009 12:48:34 +0100 Subject: [SciPy-user] Very slow loadmat in scipy 0.7 (regression) Message-ID: <5486cca80902220348p5bc7b15dk9cce43d73caf4961@mail.gmail.com> Hi to the list, I'm loading matlab file of a few tents of MB in python with scipy.io.loadmat. With scipy 0.6 (the stock ubuntu 8.10 version) the load takes a few seconds (2-5 sec). Now with scipy 0.7 it takes much longer, around 80 secs. I did a profile and found that the all the time is spent in GzipInputStream.__zfill method. I blindly tried to change the GzipInputStream.blocksize attribute from 16K to 256K and 1M and found that the performances become exponentially better. Here there are the profile resuts loading a 33M matlab file: *Scipy 0.7 default, BUFFER 16K* 12984 function calls (12981 primitive calls) in 140.456 CPU seconds Ordered by: internal time List reduced from 40 to 3 due to restriction <3> ncalls tottime percall cumtime percall filename:lineno(function) 27 139.250 5.157 140.304 5.196 gzipstreams.py:80(__fill) 2119 0.950 0.000 0.950 0.000 {built-in method decompress} 9 0.123 0.014 0.123 0.014 {method 'copy' of 'numpy.ndarray' objects} *BUFFER 256K* 1080 function calls (1077 primitive calls) in 9.988 CPU seconds Ordered by: internal time List reduced from 40 to 3 due to restriction <3> ncalls tottime percall cumtime percall filename:lineno(function) 27 8.870 0.329 9.833 0.364 gzipstreams.py:80(__fill) 135 0.925 0.007 0.925 0.007 {built-in method decompress} 9 0.124 0.014 0.124 0.014 {method 'copy' of 'numpy.ndarray' objects} *BUFFER 1M* 480 function calls (477 primitive calls) in 3.509 CPU seconds Ordered by: internal time List reduced from 40 to 3 due to restriction <3> ncalls tottime percall cumtime percall filename:lineno(function) 27 2.329 0.086 3.302 0.122 gzipstreams.py:80(__fill) 35 0.925 0.026 0.925 0.026 {built-in method decompress} 9 0.124 0.014 0.124 0.014 {method 'copy' of 'numpy.ndarray' objects} As you can see there is a dramatic improvement as the time passes from 140 to around 3 seconds. I think that the default value should be raised a bit (at least 256K), but as the performance hit can be so big is definitely better to have this as keyword argument directly in io.loadmat. Any comment is appreciated. - Antonio PS: the test file used for the profiling is attached. -------------- next part -------------- A non-text attachment was scrubbed... Name: test_setup.py Type: text/x-python Size: 340 bytes Desc: not available URL: From matthieu.brucher at gmail.com Sun Feb 22 06:58:43 2009 From: matthieu.brucher at gmail.com (Matthieu Brucher) Date: Sun, 22 Feb 2009 12:58:43 +0100 Subject: [SciPy-user] Very slow loadmat in scipy 0.7 (regression) In-Reply-To: <5486cca80902220348p5bc7b15dk9cce43d73caf4961@mail.gmail.com> References: <5486cca80902220348p5bc7b15dk9cce43d73caf4961@mail.gmail.com> Message-ID: Hi, This issue popped up in the scipy-dev ML and will be fixed in the future. 
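Until then, a stop-gap along the lines of what Antonino did might be to bump the buffer size before loading (untested, and the exact module path and attribute may differ in your install):

    from scipy.io.matlab import gzipstreams
    gzipstreams.GzipInputStream.blocksize = 256 * 1024  # default is 16K
    # ... then call scipy.io.loadmat as usual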
Matthieu 2009/2/22 Antonino Ingargiola : > Hi to the list, > > I'm loading matlab file of a few tents of MB in python with > scipy.io.loadmat. With scipy 0.6 (the stock ubuntu 8.10 version) the > load takes a few seconds (2-5 sec). Now with scipy 0.7 it takes much > longer, around 80 secs. > > I did a profile and found that the all the time is spent in > GzipInputStream.__zfill method. I blindly tried to change the > GzipInputStream.blocksize attribute from 16K to 256K and 1M and found > that the performances become exponentially better. Here there are the > profile resuts loading a 33M matlab file: > > *Scipy 0.7 default, BUFFER 16K* > > 12984 function calls (12981 primitive calls) in 140.456 CPU seconds > > Ordered by: internal time > List reduced from 40 to 3 due to restriction <3> > > ncalls tottime percall cumtime percall filename:lineno(function) > 27 139.250 5.157 140.304 5.196 gzipstreams.py:80(__fill) > 2119 0.950 0.000 0.950 0.000 {built-in method decompress} > 9 0.123 0.014 0.123 0.014 {method 'copy' of > 'numpy.ndarray' objects} > > > *BUFFER 256K* > > 1080 function calls (1077 primitive calls) in 9.988 CPU seconds > > Ordered by: internal time > List reduced from 40 to 3 due to restriction <3> > > ncalls tottime percall cumtime percall filename:lineno(function) > 27 8.870 0.329 9.833 0.364 gzipstreams.py:80(__fill) > 135 0.925 0.007 0.925 0.007 {built-in method decompress} > 9 0.124 0.014 0.124 0.014 {method 'copy' of > 'numpy.ndarray' objects} > > > *BUFFER 1M* > > 480 function calls (477 primitive calls) in 3.509 CPU seconds > > Ordered by: internal time > List reduced from 40 to 3 due to restriction <3> > > ncalls tottime percall cumtime percall filename:lineno(function) > 27 2.329 0.086 3.302 0.122 gzipstreams.py:80(__fill) > 35 0.925 0.026 0.925 0.026 {built-in method decompress} > 9 0.124 0.014 0.124 0.014 {method 'copy' of > 'numpy.ndarray' objects} > > > > As you can see there is a dramatic improvement as the time passes from > 140 to around 3 seconds. > > I think that the default value should be raised a bit (at least 256K), > but as the performance hit can be so big is definitely better to have > this as keyword argument directly in io.loadmat. > > Any comment is appreciated. > > - Antonio > > PS: the test file used for the profiling is attached. > > _______________________________________________ > SciPy-user mailing list > SciPy-user at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-user > > -- Information System Engineer, Ph.D. Website: http://matthieu-brucher.developpez.com/ Blogs: http://matt.eifelle.com and http://blog.developpez.com/?blog=92 LinkedIn: http://www.linkedin.com/in/matthieubrucher From fredmfp at gmail.com Sun Feb 22 06:59:32 2009 From: fredmfp at gmail.com (fred) Date: Sun, 22 Feb 2009 12:59:32 +0100 Subject: [SciPy-user] scipy.org issue? In-Reply-To: <9457e7c80902211603x4061b20bo21ddbd72ba02d9a9@mail.gmail.com> References: <49A05C9F.4000404@gmail.com> <9457e7c80902211603x4061b20bo21ddbd72ba02d9a9@mail.gmail.com> Message-ID: <49A13E24.1060701@gmail.com> St?fan van der Walt a ?crit : > Thanks for this report. I've added links to the docs editor on that > page, which should become permanent as soon as the server decides to > respond. Hmmm... :-( St?fan, I guess I get another one. If you go here, for instance: http://www.scipy.org/SciPyPackages/Ndimage you can click on the ScipyPackages page, at the top, before Ndimage. But this link does not exist, because this is the SciPy_package page that exists. 
Cheers, -- Fred From stefan at sun.ac.za Sun Feb 22 07:30:06 2009 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Sun, 22 Feb 2009 14:30:06 +0200 Subject: [SciPy-user] scipy.org issue? In-Reply-To: <49A13E24.1060701@gmail.com> References: <49A05C9F.4000404@gmail.com> <9457e7c80902211603x4061b20bo21ddbd72ba02d9a9@mail.gmail.com> <49A13E24.1060701@gmail.com> Message-ID: <9457e7c80902220430i52edab0dja76ae8f616da5217@mail.gmail.com> 2009/2/22 fred : > St?fan, I guess I get another one. > > If you go here, for instance: > > http://www.scipy.org/SciPyPackages/Ndimage > > you can click on the ScipyPackages page, at the top, before Ndimage. Thanks, Fred, I've reorganised those pages so that they show a deprecation warning and give a link to the old content. Cheers St?fan From fredmfp at gmail.com Sun Feb 22 08:20:41 2009 From: fredmfp at gmail.com (fred) Date: Sun, 22 Feb 2009 14:20:41 +0100 Subject: [SciPy-user] scipy.org issue? In-Reply-To: <9457e7c80902220430i52edab0dja76ae8f616da5217@mail.gmail.com> References: <49A05C9F.4000404@gmail.com> <9457e7c80902211603x4061b20bo21ddbd72ba02d9a9@mail.gmail.com> <49A13E24.1060701@gmail.com> <9457e7c80902220430i52edab0dja76ae8f616da5217@mail.gmail.com> Message-ID: <49A15129.1000806@gmail.com> St?fan van der Walt a ?crit : > Thanks, Fred, I've reorganised those pages so that they show a > deprecation warning and give a link to the old content. Thanks, St?fan. Cheers, -- Fred From dmitrey15 at ukr.net Sun Feb 22 11:19:30 2009 From: dmitrey15 at ukr.net (Dmitrey) Date: Sun, 22 Feb 2009 18:19:30 +0200 Subject: [SciPy-user] LiberMate was Re: Automating Matlab In-Reply-To: <49A02E1A.5050703@gmail.com> References: <4984F58C.5070605@gmail.com> <49A02E1A.5050703@gmail.com> Message-ID: <49A17B12.8090006@ukr.net> hi Eric, I'm trying to use your soft on a file from matlab fileexchage area (BTW you could place a link to your soft there), so it prints The following appear to be variables: status, A, b, lb, val, f, M, bound, options, Aeq, val0, x, x0, beq, e, ub The following appear to be functions: rec, inf, IP1, linprog, optimset So I would say "inf" is certainly not a function, it's equivalent to numpy.inf (I guess you know). BTW I think it would be a good idea to mention your software in http://www.scipy.org/NumPy_for_Matlab_Users as well as in Octave, SAGE mail lists. Regards, D. Eric Schug wrote: > For those interested, my new project has been uploaded sourceforge at, > http://sourceforge.net/projects/libermate/ > > Latest version now supports simple command expressions (e.g hold on) > > > > _______________________________________________ > SciPy-user mailing list > SciPy-user at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-user > > > > From wnbell at gmail.com Sun Feb 22 16:01:31 2009 From: wnbell at gmail.com (Nathan Bell) Date: Sun, 22 Feb 2009 16:01:31 -0500 Subject: [SciPy-user] Very slow loadmat in scipy 0.7 (regression) In-Reply-To: References: <5486cca80902220348p5bc7b15dk9cce43d73caf4961@mail.gmail.com> Message-ID: On Sun, Feb 22, 2009 at 6:58 AM, Matthieu Brucher wrote: > > This issue popped up in the scipy-dev ML and will be fixed in the future. > http://thread.gmane.org/gmane.comp.python.scientific.devel/9934/focus=9974 -- Nathan Bell wnbell at gmail.com http://graphics.cs.uiuc.edu/~wnbell/ From pinto at mit.edu Sun Feb 22 22:59:38 2009 From: pinto at mit.edu (Nicolas Pinto) Date: Sun, 22 Feb 2009 22:59:38 -0500 Subject: [SciPy-user] How to generate equivalent "random" numbers in matlab and scipy? 
Message-ID: <954ae5aa0902221959g75625c93s9f6c4352610e5ba1@mail.gmail.com> Dear all, I'd like to generate equivalent sequences of 'random' numbers in matlab and scipy, is there any way I can do that? I tried to fix the seed (see below) but it doesn't work. # scipy In [29]: random.seed(1); random.permutation(5)+1 Out[29]: array([3, 2, 5, 1, 4]) % matlab >> rand('seed', 1); randperm(5) ans = 4 3 5 2 1 Thanks for your time. Best regards, -- Nicolas Pinto Ph.D. Candidate, Brain & Computer Sciences Massachusetts Institute of Technology, USA http://web.mit.edu/pinto -------------- next part -------------- An HTML attachment was scrubbed... URL: From robert.kern at gmail.com Sun Feb 22 23:07:54 2009 From: robert.kern at gmail.com (Robert Kern) Date: Sun, 22 Feb 2009 22:07:54 -0600 Subject: [SciPy-user] How to generate equivalent "random" numbers in matlab and scipy? In-Reply-To: <954ae5aa0902221959g75625c93s9f6c4352610e5ba1@mail.gmail.com> References: <954ae5aa0902221959g75625c93s9f6c4352610e5ba1@mail.gmail.com> Message-ID: <3d375d730902222007p43c91b4fhd242b6870fbc5d48@mail.gmail.com> On Sun, Feb 22, 2009 at 21:59, Nicolas Pinto wrote: > Dear all, > > I'd like to generate equivalent sequences of 'random' numbers in matlab and > scipy, is there any way I can do that? I tried to fix the seed (see below) > but it doesn't work. > > # scipy > In [29]: random.seed(1); random.permutation(5)+1 > Out[29]: array([3, 2, 5, 1, 4]) > > % matlab >>> rand('seed', 1); randperm(5) > > ans = > > 4 3 5 2 1 A quick Google tells me that starting with Matlab 7.4, they do use the Mersenne Twister by default, so it might be possible if you can re-implement their seeding algorithm. Use RandomState.set_state() to set the MT state vector directly (this is different from seeding, which generates an MT state vector from a other kinds of inputs). While this would give you identical sequences of samples (e.g. random_sample(), randint(), etc.), there is no guarantee that the algorithms we implement on top of the fundamental PRNG are the same (e.g. permutation()). -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From tritemio at gmail.com Mon Feb 23 03:19:47 2009 From: tritemio at gmail.com (Antonino Ingargiola) Date: Mon, 23 Feb 2009 09:19:47 +0100 Subject: [SciPy-user] Very slow loadmat in scipy 0.7 (regression) In-Reply-To: References: <5486cca80902220348p5bc7b15dk9cce43d73caf4961@mail.gmail.com> Message-ID: <5486cca80902230019w5dd1f0b3ycb0fbf6a4f59c79c@mail.gmail.com> 2009/2/22 Nathan Bell : > On Sun, Feb 22, 2009 at 6:58 AM, Matthieu Brucher > wrote: >> >> This issue popped up in the scipy-dev ML and will be fixed in the future. >> > > http://thread.gmane.org/gmane.comp.python.scientific.devel/9934/focus=9974 Thanks for the link. Can you someone forward this thread to scipy-dev just to note that I modified *only* the blocksize in order to recover optimum performances. I can perform some bench if is needed, just put me in CC since I'm not a scipy-dev subscriber. 
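Coming back to the Matlab/NumPy random-number question above, a small sketch of the state copying Robert describes. This only makes two NumPy generators agree with each other; matching Matlab would additionally require re-implementing Matlab's seeding, which is not attempted here:

import numpy as np

rs1 = np.random.RandomState(1)
state = rs1.get_state()        # ('MT19937', 624-word state vector, pos, ...)
rs2 = np.random.RandomState()
rs2.set_state(state)           # rs2 now continues from exactly the same state
print rs1.random_sample(3)
print rs2.random_sample(3)     # identical to the previous line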
- Antonio From wnbell at gmail.com Mon Feb 23 05:07:19 2009 From: wnbell at gmail.com (Nathan Bell) Date: Mon, 23 Feb 2009 05:07:19 -0500 Subject: [SciPy-user] Very slow loadmat in scipy 0.7 (regression) In-Reply-To: <5486cca80902230019w5dd1f0b3ycb0fbf6a4f59c79c@mail.gmail.com> References: <5486cca80902220348p5bc7b15dk9cce43d73caf4961@mail.gmail.com> <5486cca80902230019w5dd1f0b3ycb0fbf6a4f59c79c@mail.gmail.com> Message-ID: On Mon, Feb 23, 2009 at 3:19 AM, Antonino Ingargiola wrote: > > Can you someone forward this thread to scipy-dev just to note that I > modified *only* the blocksize in order to recover optimum > performances. > > I can perform some bench if is needed, just put me in CC since I'm not > a scipy-dev subscriber. > Just FYI, the discussion has moved to this thread: http://thread.gmane.org/gmane.comp.python.scientific.devel/10010 You can reply to the thread using the "followup" option from the menu box in the upper right. -- Nathan Bell wnbell at gmail.com http://graphics.cs.uiuc.edu/~wnbell/ From sturla at molden.no Mon Feb 23 07:49:25 2009 From: sturla at molden.no (Sturla Molden) Date: Mon, 23 Feb 2009 13:49:25 +0100 Subject: [SciPy-user] How to generate equivalent "random" numbers in matlab and scipy? In-Reply-To: <3d375d730902222007p43c91b4fhd242b6870fbc5d48@mail.gmail.com> References: <954ae5aa0902221959g75625c93s9f6c4352610e5ba1@mail.gmail.com> <3d375d730902222007p43c91b4fhd242b6870fbc5d48@mail.gmail.com> Message-ID: <49A29B55.1050009@molden.no> On 2/23/2009 5:07 AM, Robert Kern wrote: > A quick Google tells me that starting with Matlab 7.4, they do use the > Mersenne Twister by default, so it might be possible if you can > re-implement their seeding algorithm. And for previous versions of Matlab, the PRNGs are: http://www.mathworks.com/moler/ncm/randtx.m http://www.mathworks.com/moler/ncm/randntx.m S.M. From bernardo.rocha at meduni-graz.at Mon Feb 23 08:48:32 2009 From: bernardo.rocha at meduni-graz.at (Bernardo M. Rocha) Date: Mon, 23 Feb 2009 14:48:32 +0100 Subject: [SciPy-user] FEM sol interpolation Message-ID: <49A2A930.5000606@meduni-graz.at> Hi Guys, I would like to know how can I interpolate some data (a FEM solution) from a coarse grid to a finer grid.... I would like to do something like (available in FEMLAB/COMSOL): u_int = postinterp(fem_sol, 'u', p_ref); Where fem_sol is my solution at some coarse grid and p_ref is my reference mesh (where I also have a fem_ref solution defined). If this is not clear, I need to compute e = fem_ref - fem_sol. Thanks in advance, Bernardo M. Rocha From josef.pktd at gmail.com Mon Feb 23 11:02:35 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Mon, 23 Feb 2009 11:02:35 -0500 Subject: [SciPy-user] LiberMate was Re: Automating Matlab In-Reply-To: <49A17B12.8090006@ukr.net> References: <4984F58C.5070605@gmail.com> <49A02E1A.5050703@gmail.com> <49A17B12.8090006@ukr.net> Message-ID: <1cd32cbb0902230802h653db600ic62014d0cb2498a@mail.gmail.com> > Eric Schug wrote: >> For those interested, my new project has been uploaded sourceforge at, >> http://sourceforge.net/projects/libermate/ >> >> Latest version now supports simple command expressions (e.g hold on) >> I tried it out on some simple matlab files. I didn't try to run the translated files yet, but two translations make the code less readable * translation of integers to floats, e.g. for array indices or for arange * multiplication of scalar with array is translated with dot, eg m: randn(n,k-1)*10 -> py: = dot(randn(n, k-1.), 10.) 
string array in matlab came out as empty numpy array m: vnames=['yvar', 'iota', 'x1 ', 'x2 ']; -> py: vnames = array(r_[]) Importing everything is a source of possible errors (e.g. pylab overwrites numpy names) from numpy import * import scipy # if available import pylab (from matlibplot) try: from pylab import * except ImportError: pass I would prefer the current standard import numpy as np import matplotlib.pylab as plt and keep the name space as part of the name, e.g. np.dot, np.exp, ... But it looks like it will save a lot of typing and editing, even though it will need careful proof reading. Thanks, Josef From dmitrey15 at ukr.net Mon Feb 23 11:19:53 2009 From: dmitrey15 at ukr.net (Dmitrey) Date: Mon, 23 Feb 2009 18:19:53 +0200 Subject: [SciPy-user] LiberMate was Re: Automating Matlab In-Reply-To: <1cd32cbb0902230802h653db600ic62014d0cb2498a@mail.gmail.com> References: <4984F58C.5070605@gmail.com> <49A02E1A.5050703@gmail.com> <49A17B12.8090006@ukr.net> <1cd32cbb0902230802h653db600ic62014d0cb2498a@mail.gmail.com> Message-ID: <49A2CCA9.90704@ukr.net> josef.pktd at gmail.com wrote: >> Eric Schug wrote: >> >>> For those interested, my new project has been uploaded sourceforge at, >>> http://sourceforge.net/projects/libermate/ >>> >>> Latest version now supports simple command expressions (e.g hold on) >>> >>> > > > from numpy import * > import scipy > I had got the same, while no scipy function has been involved in code > I would prefer the current standard > import numpy as np > and keep the name space as part of the name, e.g. np.dot, np.exp, ... > +1 Let me add some more cents: instead of round(arr) it should be used arr.round() (because Python's round doesn't work for numpy arrays) zeros(m,n) should be zeros((m,n)) ones(m,n) should be ones((m,n)) "end" should go to "-1" (now "xend-1") variables of index type (for example, obtained from ind=find(...) replaced by ind=nonzero() or ind=where()) should not be decreased in the following lines with indexing, eg arr[ind] should be used instead of arr[ind-1] for A=[B C ...] and A=[B;C] using A=hstack((B, C,...)) and A=vstack((B, C,...)) is more readable and natural than current array(r_[c_[B], c_[C]]) Regards, D. From peter.skomoroch at gmail.com Mon Feb 23 14:51:59 2009 From: peter.skomoroch at gmail.com (Peter Skomoroch) Date: Mon, 23 Feb 2009 14:51:59 -0500 Subject: [SciPy-user] Scientific packages for a distributed computing Amazon EC2 image? Message-ID: I'm collecting a wishlist of scientific and python related packages (numpy, scipy, etc) people would want installed on a Debian based Amazon EC2 machine image (AMI)for distributed computing. I'll make more information available as the machine image develops, some of these will also go into the Machetec2AMI. Several variants of the AMI should become available in the next month. Please feel free to add any packages you would want pre-installed on the following wiki page: http://scipy.org/SciPyAmazonAmi Let me know if you spot any potential license conflicts with listed software. -- Peter N. Skomoroch 617.285.8348 http://www.datawrangling.com http://delicious.com/pskomoroch http://twitter.com/peteskomoroch -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From bsouthey at gmail.com Mon Feb 23 15:01:45 2009 From: bsouthey at gmail.com (Bruce Southey) Date: Mon, 23 Feb 2009 14:01:45 -0600 Subject: [SciPy-user] error estimate in stats.linregress In-Reply-To: <3d375d730902212346w288ce1b2pb54d0e55db5804ae@mail.gmail.com> References: <5e5978e10902212323o1f9dc4dn17b32d8c986f3440@mail.gmail.com> <3d375d730902212346w288ce1b2pb54d0e55db5804ae@mail.gmail.com> Message-ID: <49A300A9.4020608@gmail.com> Hi, Yes, the formula is incorrect. The reason is that the sum of squares terms are not corrected by the means because the ss function just computes the uncorrected sum of squares. Thus the correct formula should : sterrest = np.sqrt(((1-r*r)*(ss((y-ymean))))/(df*(ss(x-xmean)))) Alternatively: sterrest = np.sqrt((1-r*r)*(ss(y)-n*ymean*ymean)/ (ss(x)-n*xmean*xmean) / df) Note the formula is derived using the definition of R-squared: The estimated variance of the slope = MSE/Sxx= ((1-R*R)*Syy)/(df*Sxx) where Syy and Sxx are the corrected sums of squares for Y and X, respectively, Regards Bruce Hi all, I was working with linear regression in scipy and met some problems with value of standard error of the estimate returned by scipy.stats.linregress() function. I could not compare it to similar outputs of other linear regression routines (for example in Origin), so I took a look in the source (stats.py). In the source it is defined as sterrest = np.sqrt((1-r*r)*ss(y) / ss(x) / df) where r is correlation coefficient, df is degrees of freedom (N-2) and ss() is sum of squares of elements. After digging through literature the only formula looking somewhat the same was found to be stderrest = np.sqrt((1-r*r)*ss(y-y.mean())/df) which gives the same result as a standard definition (in notation of the source of linregress) stderrest = np.sqrt(ss(y-slope*x-intercept)/df) but the output of linregress is different. I humbly suppose this is a bug, but maybe somebody could explain me what is it if I'm wrong... Pavlo. From josef.pktd at gmail.com Mon Feb 23 16:04:00 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Mon, 23 Feb 2009 16:04:00 -0500 Subject: [SciPy-user] error estimate in stats.linregress In-Reply-To: <49A300A9.4020608@gmail.com> References: <5e5978e10902212323o1f9dc4dn17b32d8c986f3440@mail.gmail.com> <3d375d730902212346w288ce1b2pb54d0e55db5804ae@mail.gmail.com> <49A300A9.4020608@gmail.com> Message-ID: <1cd32cbb0902231304s79948a1bj89a45d4b394fbf24@mail.gmail.com> On Mon, Feb 23, 2009 at 3:01 PM, Bruce Southey wrote: > Hi, > Yes, the formula is incorrect. The reason is that the sum of squares > terms are not corrected by the means because the ss function just > computes the uncorrected sum of squares. > > Thus the correct formula should : > sterrest = np.sqrt(((1-r*r)*(ss((y-ymean))))/(df*(ss(x-xmean)))) > Alternatively: > sterrest = np.sqrt((1-r*r)*(ss(y)-n*ymean*ymean)/ (ss(x)-n*xmean*xmean) > / df) > > Note the formula is derived using the definition of R-squared: > The estimated variance of the slope = MSE/Sxx= ((1-R*R)*Syy)/(df*Sxx) > where Syy and Sxx are the corrected sums of squares for Y and X, > respectively, > > Regards > Bruce > > Hi all, > I was working with linear regression in scipy and met some problems > with value of standard error of the estimate returned by > scipy.stats.linregress() function. I could not compare it to similar > outputs of other linear regression routines (for example in Origin), > so I took a look in the source (stats.py). 
> > In the source it is defined as > sterrest = np.sqrt((1-r*r)*ss(y) / ss(x) / df) > where r is correlation coefficient, df is degrees of freedom (N-2) and > ss() is sum of squares of elements. > > After digging through literature the only formula looking somewhat the > same was found to be > stderrest = np.sqrt((1-r*r)*ss(y-y.mean())/df) > which gives the same result as a standard definition (in notation of > the source of linregress) > stderrest = np.sqrt(ss(y-slope*x-intercept)/df) > but the output of linregress is different. > > I humbly suppose this is a bug, but maybe somebody could explain me > what is it if I'm wrong... > Pavlo. > Thank you for reporting and checking this. I fixed it in trunk, but still have to add a test. There are still small (1e-4) numerical differences to the multivariate ols version in the example I tried Josef From bsouthey at gmail.com Mon Feb 23 16:45:02 2009 From: bsouthey at gmail.com (Bruce Southey) Date: Mon, 23 Feb 2009 15:45:02 -0600 Subject: [SciPy-user] error estimate in stats.linregress In-Reply-To: <1cd32cbb0902231304s79948a1bj89a45d4b394fbf24@mail.gmail.com> References: <5e5978e10902212323o1f9dc4dn17b32d8c986f3440@mail.gmail.com> <3d375d730902212346w288ce1b2pb54d0e55db5804ae@mail.gmail.com> <49A300A9.4020608@gmail.com> <1cd32cbb0902231304s79948a1bj89a45d4b394fbf24@mail.gmail.com> Message-ID: <49A318DE.9030303@gmail.com> josef.pktd at gmail.com wrote: > On Mon, Feb 23, 2009 at 3:01 PM, Bruce Southey wrote: > >> Hi, >> Yes, the formula is incorrect. The reason is that the sum of squares >> terms are not corrected by the means because the ss function just >> computes the uncorrected sum of squares. >> >> Thus the correct formula should : >> sterrest = np.sqrt(((1-r*r)*(ss((y-ymean))))/(df*(ss(x-xmean)))) >> Alternatively: >> sterrest = np.sqrt((1-r*r)*(ss(y)-n*ymean*ymean)/ (ss(x)-n*xmean*xmean) >> / df) >> >> Note the formula is derived using the definition of R-squared: >> The estimated variance of the slope = MSE/Sxx= ((1-R*R)*Syy)/(df*Sxx) >> where Syy and Sxx are the corrected sums of squares for Y and X, >> respectively, >> >> Regards >> Bruce >> >> Hi all, >> I was working with linear regression in scipy and met some problems >> with value of standard error of the estimate returned by >> scipy.stats.linregress() function. I could not compare it to similar >> outputs of other linear regression routines (for example in Origin), >> so I took a look in the source (stats.py). >> >> In the source it is defined as >> sterrest = np.sqrt((1-r*r)*ss(y) / ss(x) / df) >> where r is correlation coefficient, df is degrees of freedom (N-2) and >> ss() is sum of squares of elements. >> >> After digging through literature the only formula looking somewhat the >> same was found to be >> stderrest = np.sqrt((1-r*r)*ss(y-y.mean())/df) >> which gives the same result as a standard definition (in notation of >> the source of linregress) >> stderrest = np.sqrt(ss(y-slope*x-intercept)/df) >> but the output of linregress is different. >> >> I humbly suppose this is a bug, but maybe somebody could explain me >> what is it if I'm wrong... >> Pavlo. >> >> > > Thank you for reporting and checking this. > > I fixed it in trunk, but still have to add a test. 
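As a quick numerical sanity check of the corrected formula, a sketch with made-up data (with the fix in trunk the stderr returned by linregress should agree with the other two values; an unpatched 0.7 install probably will not):

import numpy as np
from scipy import stats

np.random.seed(0)
x = np.random.randn(50)
y = 2.0 + 3.0*x + 0.5*np.random.randn(50)
slope, intercept, r, p, stderr = stats.linregress(x, y)

df = len(x) - 2
Sxx = np.sum((x - x.mean())**2)
Syy = np.sum((y - y.mean())**2)
resid = y - (intercept + slope*x)

se_corrected = np.sqrt((1 - r*r)*Syy/(df*Sxx))    # corrected formula
se_residual = np.sqrt(np.sum(resid**2)/(df*Sxx))  # sqrt(MSE/Sxx), textbook definition
print se_corrected, se_residual, stderr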
> There are still small (1e-4) numerical differences to the > multivariate ols version in the example I tried > > Josef > _______________________________________________ > SciPy-user mailing list > SciPy-user at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-user > Assuming that linregress is different from expected, linregress is just a scalar implementation so it is probably prone to rounding error in numerous places. If you want, I can look at the example and see. Bruce From stefan at sun.ac.za Mon Feb 23 17:08:40 2009 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Tue, 24 Feb 2009 00:08:40 +0200 Subject: [SciPy-user] error estimate in stats.linregress In-Reply-To: <1cd32cbb0902231304s79948a1bj89a45d4b394fbf24@mail.gmail.com> References: <5e5978e10902212323o1f9dc4dn17b32d8c986f3440@mail.gmail.com> <3d375d730902212346w288ce1b2pb54d0e55db5804ae@mail.gmail.com> <49A300A9.4020608@gmail.com> <1cd32cbb0902231304s79948a1bj89a45d4b394fbf24@mail.gmail.com> Message-ID: <9457e7c80902231408g3fe5421dk7fa577b8a5242a07@mail.gmail.com> 2009/2/23 : > Thank you for reporting and checking this. > > I fixed it in trunk, but still have to add a test. > There are still small (1e-4) numerical differences to the > multivariate ols version in the example I tried Thanks, Josef. I assume you are working on a test, otherwise let me know so I can write one. Cheers St?fan From josef.pktd at gmail.com Mon Feb 23 19:57:13 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Mon, 23 Feb 2009 19:57:13 -0500 Subject: [SciPy-user] error estimate in stats.linregress In-Reply-To: <49A318DE.9030303@gmail.com> References: <5e5978e10902212323o1f9dc4dn17b32d8c986f3440@mail.gmail.com> <3d375d730902212346w288ce1b2pb54d0e55db5804ae@mail.gmail.com> <49A300A9.4020608@gmail.com> <1cd32cbb0902231304s79948a1bj89a45d4b394fbf24@mail.gmail.com> <49A318DE.9030303@gmail.com> Message-ID: <1cd32cbb0902231657y21aafc81pff468ed5b7bc4421@mail.gmail.com> On Mon, Feb 23, 2009 at 4:45 PM, Bruce Southey wrote: > josef.pktd at gmail.com wrote: >> On Mon, Feb 23, 2009 at 3:01 PM, Bruce Southey wrote: >> >>> Hi, >>> Yes, the formula is incorrect. The reason is that the sum of squares >>> terms are not corrected by the means because the ss function just >>> computes the uncorrected sum of squares. >>> >>> Thus the correct formula should : >>> sterrest = np.sqrt(((1-r*r)*(ss((y-ymean))))/(df*(ss(x-xmean)))) >>> Alternatively: >>> sterrest = np.sqrt((1-r*r)*(ss(y)-n*ymean*ymean)/ (ss(x)-n*xmean*xmean) >>> / df) >>> >>> Note the formula is derived using the definition of R-squared: >>> The estimated variance of the slope = MSE/Sxx= ((1-R*R)*Syy)/(df*Sxx) >>> where Syy and Sxx are the corrected sums of squares for Y and X, >>> respectively, >>> >>> Regards >>> Bruce >>> >>> Hi all, >>> I was working with linear regression in scipy and met some problems >>> with value of standard error of the estimate returned by >>> scipy.stats.linregress() function. I could not compare it to similar >>> outputs of other linear regression routines (for example in Origin), >>> so I took a look in the source (stats.py). >>> >>> In the source it is defined as >>> sterrest = np.sqrt((1-r*r)*ss(y) / ss(x) / df) >>> where r is correlation coefficient, df is degrees of freedom (N-2) and >>> ss() is sum of squares of elements. 
>>> >>> After digging through literature the only formula looking somewhat the >>> same was found to be >>> stderrest = np.sqrt((1-r*r)*ss(y-y.mean())/df) >>> which gives the same result as a standard definition (in notation of >>> the source of linregress) >>> stderrest = np.sqrt(ss(y-slope*x-intercept)/df) >>> but the output of linregress is different. >>> >>> I humbly suppose this is a bug, but maybe somebody could explain me >>> what is it if I'm wrong... >>> Pavlo. >>> >>> >> >> Thank you for reporting and checking this. >> >> I fixed it in trunk, but still have to add a test. >> There are still small (1e-4) numerical differences to the >> multivariate ols version in the example I tried >> >> Josef >> _______________________________________________ >> SciPy-user mailing list >> SciPy-user at scipy.org >> http://projects.scipy.org/mailman/listinfo/scipy-user >> > > Assuming that linregress is different from expected, linregress is just > a scalar implementation so it is probably prone to rounding error in > numerous places. > If you want, I can look at the example and see. > > Bruce > _______________________________________________ > SciPy-user mailing list > SciPy-user at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-user > I just made up an example on the command line and compared it with my version of the cookbook ols class using pinv . Until now I haven't looked carefully at linregress because I think it should be completely replaced with a full ols regression, or at least with a call to it. I added and corrected a test. original tests only test constant and slope estimates. Josef >>> x = np.random.randn(50,2) >>> y = (x*[1,0.5]).sum(axis=1) >>> res = ols(y,x[:,0]) >>> res.b array([-0.04397404, 0.88322506]) >>> res.se array([ 0.07458703, 0.08676139]) >>> res.p array([ 5.58246087e-01, 1.40776280e-13]) >>> np.sqrt(res.R2) 0.82670557087727525 >>> stats.linregress(x[:,0],y) (0.88322505756806569, -0.043974044431550653, 0.8267055708772757, 1.408136357829305e-013, 0.086581413890270145) From simpson at math.toronto.edu Mon Feb 23 21:05:16 2009 From: simpson at math.toronto.edu (Gideon Simpson) Date: Mon, 23 Feb 2009 21:05:16 -0500 Subject: [SciPy-user] struve function Message-ID: Am I correct that the struve function, scipy.special.struve, only accept scalar arguments? -gideon From robert.kern at gmail.com Mon Feb 23 21:09:23 2009 From: robert.kern at gmail.com (Robert Kern) Date: Mon, 23 Feb 2009 20:09:23 -0600 Subject: [SciPy-user] struve function In-Reply-To: References: Message-ID: <3d375d730902231809j387008c5of134e93be7cfac51@mail.gmail.com> On Mon, Feb 23, 2009 at 20:05, Gideon Simpson wrote: > Am I correct that the struve function, scipy.special.struve, only > accept scalar arguments? No. In [3]: struve([-12., +12, -11, +11], [41., -41., 41., -41.]) Out[3]: array([ -1.15736059e-01, -1.12364370e+06, 8.33084926e-02, 6.29536643e+05]) -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." 
-- Umberto Eco From pav at iki.fi Mon Feb 23 21:18:59 2009 From: pav at iki.fi (Pauli Virtanen) Date: Tue, 24 Feb 2009 02:18:59 +0000 (UTC) Subject: [SciPy-user] struve function References: <3d375d730902231809j387008c5of134e93be7cfac51@mail.gmail.com> Message-ID: Mon, 23 Feb 2009 20:09:23 -0600, Robert Kern wrote: > On Mon, Feb 23, 2009 at 20:05, Gideon Simpson > wrote: >> Am I correct that the struve function, scipy.special.struve, only >> accept scalar arguments? > > No. > > In [3]: struve([-12., +12, -11, +11], [41., -41., 41., -41.]) Out[3]: > array([ -1.15736059e-01, -1.12364370e+06, 8.33084926e-02, > 6.29536643e+05]) If you're using the struve function, be aware of this: http://scipy.org/scipy/scipy/ticket/679 The results are incorrect for orders in range [-12, 0] and x < 20; the bug is now fixed, but it was still present in 0.7.0. -- Pauli Virtanen From schugschug at gmail.com Mon Feb 23 23:15:52 2009 From: schugschug at gmail.com (Eric Schug) Date: Mon, 23 Feb 2009 23:15:52 -0500 Subject: [SciPy-user] LiberMate was Re: Automating Matlab In-Reply-To: References: Message-ID: <49A37478.7030205@gmail.com> Thanks everyone for you comments. I will try to address them. > * translation of integers to floats, e.g. for array indices or for arange > Rational for cast if int to float. Matlab always uses float data type, for most expressions e.g. try format long e a=1 This is mostly a problem for division, integer division would be used in some cases yielding the wrong results. e.g. a=1/2 would give 0 and not 0.5 as it does in Matlab. An alternative method would be to use from __future__ import division I think the best would be to have this be a command line option, so that various translation rules could be enabled or disabled. > * multiplication of scalar with array is translated with dot, eg > m: randn(n,k-1)*10 -> py: = dot(randn(n, k-1.), 10.) Matrix multiplication in matlab is * -> dot in Numpy but with scalars should use more readable * > string array in matlab came out as empty numpy array > > m: > vnames=['yvar', > 'iota', > 'x1 ', > 'x2 ']; > -> py: > vnames = array(r_[]) > Importing everything is a source of possible errors (e.g. pylab > overwrites numpy names) > > from numpy import * > import scipy > # if available import pylab (from matlibplot) > try: > from pylab import * > except ImportError: > pass need to cross reference numpy functions. I've added the rest to tracker at SourceForge. Although, still learning to use Source forge tools. Eric. From josef.pktd at gmail.com Tue Feb 24 00:32:07 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Tue, 24 Feb 2009 00:32:07 -0500 Subject: [SciPy-user] local nonparametric regression, gauss process Message-ID: <1cd32cbb0902232132m348a7086s1d6491a70e71d6d2@mail.gmail.com> Last weekend, I was trying out spatial and sparse. I wrote some functions for kernel ridge regression, however I still have parameterization problems with the sparse version. But here is the dense version, which I tried out with 1000 training points and 2000 points in total. It's not finished but produces some nice graphs to show the local regression. If there is interest, I can add this to scipy. 
Josef -------------- next part -------------- '''Kernel Ridge Regression for local non-parametric regression''' import numpy as np from scipy import spatial as ssp from numpy.testing import assert_equal import matplotlib.pylab as plt def plt_closeall(n=10): '''close a number of open matplotlib windows''' for i in range(n): plt.close() def kernel_rbf(x,y,scale=1, **kwds): #scale = kwds.get('scale',1) dist = ssp.minkowski_distance_p(x[:,np.newaxis,:],y[np.newaxis,:,:],2) return np.exp(-0.5/scale*(dist)) def kernel_euclid(x,y,p=2, **kwds): return ssp.minkowski_distance(x[:,np.newaxis,:],y[np.newaxis,:,:],p) class GaussProcess(object): '''class to perform kernel ridge regression (gaussian process) Warning: this class is memory intensive, it creates nobs x nobs distance matrix and its inverse, where nobs is the number of rows (observations). See sparse version for larger number of observations Notes ----- Todo: * normalize multidimensional x array on demand, either by var or cov * add confidence band * automatic selection or proposal of smoothing parameters Reference --------- Rasmussen, C.E. and C.K.I. Williams, 2006, Gaussian Processes for Machine Learning, the MIT Press, www.GaussianProcess.org/gpal, chapter 2 ''' def __init__(self, x,y=None, kernel=kernel_rbf, scale=0.5, ridgecoeff = 1e-10, **kwds ): ''' Parameters ---------- x : 2d array (N,K) data array of explanatory variables, columns represent variables rows represent observations y : 2d array (N,1) (optional) endogenous variable that should be fitted or predicted can alternatively be specified as parameter to fit method kernel : function, default: kernel_rbf kernel: (x1,x2)->kernel matrix is a function that takes as parameter two column arrays and return the kernel or distance matrix scale : float (optional) smoothing parameter for the rbf kernel ridgecoeff : float (optional) coefficient that is multiplied with the identity matrix in the ridge regression Notes ----- After initialization, kernel matrix is calculated and if y is given as parameter then also the linear regression parameter and the fitted or estimated y values, yest, are calculated. yest is available as an attribute in this case. Both scale and the ridge coefficient smooth the fitted curve. 
''' self.x = x self.kernel = kernel self.scale = scale self.ridgecoeff = ridgecoeff self.distxsample = kernel(x,x,scale=scale) self.Kinv = np.linalg.inv(self.distxsample + np.eye(*self.distxsample.shape)*ridgecoeff) if not y is None: self.y = y self.yest = self.fit(y) def fit(self,y): '''fit the training explanatory variables to a sample ouput variable''' self.parest = np.dot(self.Kinv,y) yhat = np.dot(self.distxsample,self.parest) return yhat ## print ds33.shape ## ds33_2 = kernel(x,x[::k,:],scale=scale) ## dsinv = np.linalg.inv(ds33+np.eye(*distxsample.shape)*ridgecoeff) ## B = np.dot(dsinv,y[::k,:]) def predict(self,x): '''predict new y values for a given array of explanatory variables''' self.xpredict = x distxpredict = self.kernel(x,self.x,scale=self.scale) self.ypredict = np.dot(distxpredict,self.parest) return self.ypredict def plot(self, y, plt=plt ): '''some basic plots''' #todo return proper graph handles plt.figure(); plt.plot(self.x,self.y,'bo-',self.x,self.yest,'r.-') plt.title('sample (training) points') plt.figure() plt.plot(self.xpredict,y,'bo-',self.xpredict,self.ypredict,'r.-') plt.title('all points') def example1(): m,k = 500,4 upper = 6 scale=10 xs1a = np.linspace(1,upper,m)[:,np.newaxis] xs1 = xs1a*np.ones((1,4)) + 1/(1.0+np.exp(np.random.randn(m,k))) xs1 /= np.std(xs1[::k,:],0) # normalize scale, could use cov to normalize y1true = np.sum(np.sin(xs1)+np.sqrt(xs1),1)[:,np.newaxis] y1 = y1true + 0.250 * np.random.randn(m,1) stride = 2 #use only some points as trainig points e.g 2 means every 2nd gp1 = GaussProcess(xs1[::stride,:],y1[::stride,:], kernel=kernel_euclid, ridgecoeff=1e-10) yhatr1 = gp1.predict(xs1) plt.figure() plt.plot(y1true, y1,'bo',y1true, yhatr1,'r.') plt.title('euclid kernel: true y versus noisy y and estimated y') plt.figure() plt.plot(y1,'bo-',y1true,'go-',yhatr1,'r.-') plt.title('euclid kernel: true (green), noisy (blue) and estimated (red) '+ 'observations') gp2 = GaussProcess(xs1[::stride,:],y1[::stride,:], kernel=kernel_rbf, scale=scale, ridgecoeff=1e-1) yhatr2 = gp2.predict(xs1) plt.figure() plt.plot(y1true, y1,'bo',y1true, yhatr2,'r.') plt.title('rbf kernel: true versus noisy (blue) and estimated (red) observations') plt.figure() plt.plot(y1,'bo-',y1true,'go-',yhatr2,'r.-') plt.title('rbf kernel: true (green), noisy (blue) and estimated (red) '+ 'observations') #gp2.plot(y1) def example2(m=100, scale=0.01, stride=2): #m,k = 100,1 upper = 6 xs1 = np.linspace(1,upper,m)[:,np.newaxis] y1true = np.sum(np.sin(xs1**2),1)[:,np.newaxis]/xs1 y1 = y1true + 0.05*np.random.randn(m,1) ridgecoeff = 1e-10 #stride = 2 #use only some points as trainig points e.g 2 means every 2nd gp1 = GaussProcess(xs1[::stride,:],y1[::stride,:], kernel=kernel_euclid, ridgecoeff=1e-10) yhatr1 = gp1.predict(xs1) plt.figure() plt.plot(y1true, y1,'bo',y1true, yhatr1,'r.') plt.title('euclid kernel: true versus noisy (blue) and estimated (red) observations') plt.figure() plt.plot(y1,'bo-',y1true,'go-',yhatr1,'r.-') plt.title('euclid kernel: true (green), noisy (blue) and estimated (red) '+ 'observations') gp2 = GaussProcess(xs1[::stride,:],y1[::stride,:], kernel=kernel_rbf, scale=scale, ridgecoeff=1e-2) yhatr2 = gp2.predict(xs1) plt.figure() plt.plot(y1true, y1,'bo',y1true, yhatr2,'r.') plt.title('rbf kernel: true versus noisy (blue) and estimated (red) observations') plt.figure() plt.plot(y1,'bo-',y1true,'go-',yhatr2,'r.-') plt.title('rbf kernel: true (green), noisy (blue) and estimated (red) '+ 'observations') #gp2.plot(y1) example2() #example2(m=1000, scale=0.01) #example2(m=100, 
scale=0.5) # oversmoothing #example2(m=2000, scale=0.005) # this looks good for rbf, zoom in #example2(m=200, scale=0.01,stride=4) example1() plt.show() #plt_closeall() # use this to close the open figure windows From stefan at sun.ac.za Tue Feb 24 01:01:23 2009 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Tue, 24 Feb 2009 08:01:23 +0200 Subject: [SciPy-user] local nonparametric regression, gauss process In-Reply-To: <1cd32cbb0902232132m348a7086s1d6491a70e71d6d2@mail.gmail.com> References: <1cd32cbb0902232132m348a7086s1d6491a70e71d6d2@mail.gmail.com> Message-ID: <9457e7c80902232201g44eb537eqdfa069a75e61f6fe@mail.gmail.com> Hey Josef 2009/2/24 : > Last weekend, I was trying out spatial and sparse. I wrote some > functions for kernel ridge regression, however I still have > parameterization problems with the sparse version. But here is the > dense version, which I tried out with 1000 training points and 2000 > points in total. > > It's not finished but produces some nice graphs to show the local > regression. If there is interest, I can add this to scipy. Really nice graphs! This would be very useful for class! Some trivial comments about spacing in the code: - Use spaces: return ssp.minkowski_distance(x[:,np.newaxis,:],y[np.newaxis,:,:],p) becomes return ssp.minkowski_distance(x[:,np.newaxis,:], y[np.newaxis,:,:], p) According to the Python PEP I think there should be spaces inside the indexing brackets too, but that doesn't enhance readability much in this case. - Keywords do not take spaces scale=0.5, ridgecoeff = 1e-10, **kwds ): should be scale=0.5, ridgecoeff=1e-10, **kwds): Thanks again, Cheers St?fan From stefan at sun.ac.za Tue Feb 24 04:32:00 2009 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Tue, 24 Feb 2009 11:32:00 +0200 Subject: [SciPy-user] error estimate in stats.linregress In-Reply-To: <1cd32cbb0902231657y21aafc81pff468ed5b7bc4421@mail.gmail.com> References: <5e5978e10902212323o1f9dc4dn17b32d8c986f3440@mail.gmail.com> <3d375d730902212346w288ce1b2pb54d0e55db5804ae@mail.gmail.com> <49A300A9.4020608@gmail.com> <1cd32cbb0902231304s79948a1bj89a45d4b394fbf24@mail.gmail.com> <49A318DE.9030303@gmail.com> <1cd32cbb0902231657y21aafc81pff468ed5b7bc4421@mail.gmail.com> Message-ID: <9457e7c80902240132v6dd4849ase3ea86acef1fef8f@mail.gmail.com> 2009/2/24 : > I just made up an example on the command line and compared it with my > version of the cookbook ols class using pinv . Until now I haven't > looked carefully at linregress because I think it should be completely > replaced with a full ols regression, or at least with a call to it. I > added and corrected a test. original tests only test constant and > slope estimates. Great! Thanks, Josef. St?fan From tritemio at gmail.com Tue Feb 24 04:56:32 2009 From: tritemio at gmail.com (Antonino Ingargiola) Date: Tue, 24 Feb 2009 10:56:32 +0100 Subject: [SciPy-user] Very slow loadmat in scipy 0.7 (regression) In-Reply-To: References: <5486cca80902220348p5bc7b15dk9cce43d73caf4961@mail.gmail.com> <5486cca80902230019w5dd1f0b3ycb0fbf6a4f59c79c@mail.gmail.com> Message-ID: <5486cca80902240156j6f138dd2g2027d339077aa30c@mail.gmail.com> 2009/2/23 Nathan Bell : > On Mon, Feb 23, 2009 at 3:19 AM, Antonino Ingargiola wrote: >> >> Can you someone forward this thread to scipy-dev just to note that I >> modified *only* the blocksize in order to recover optimum >> performances. >> >> I can perform some bench if is needed, just put me in CC since I'm not >> a scipy-dev subscriber. 
>> > > Just FYI, the discussion has moved to this thread: > http://thread.gmane.org/gmane.comp.python.scientific.devel/10010 > > You can reply to the thread using the "followup" option from the menu > box in the upper right. Thanks a lot. ~ Antonio From bsouthey at gmail.com Tue Feb 24 09:17:30 2009 From: bsouthey at gmail.com (Bruce Southey) Date: Tue, 24 Feb 2009 08:17:30 -0600 Subject: [SciPy-user] error estimate in stats.linregress In-Reply-To: <1cd32cbb0902231657y21aafc81pff468ed5b7bc4421@mail.gmail.com> References: <5e5978e10902212323o1f9dc4dn17b32d8c986f3440@mail.gmail.com> <3d375d730902212346w288ce1b2pb54d0e55db5804ae@mail.gmail.com> <49A300A9.4020608@gmail.com> <1cd32cbb0902231304s79948a1bj89a45d4b394fbf24@mail.gmail.com> <49A318DE.9030303@gmail.com> <1cd32cbb0902231657y21aafc81pff468ed5b7bc4421@mail.gmail.com> Message-ID: <49A4017A.1080409@gmail.com> josef.pktd at gmail.com wrote: > On Mon, Feb 23, 2009 at 4:45 PM, Bruce Southey wrote: > >> josef.pktd at gmail.com wrote: >> >>> On Mon, Feb 23, 2009 at 3:01 PM, Bruce Southey wrote: >>> >>> >>>> Hi, >>>> Yes, the formula is incorrect. The reason is that the sum of squares >>>> terms are not corrected by the means because the ss function just >>>> computes the uncorrected sum of squares. >>>> >>>> Thus the correct formula should : >>>> sterrest = np.sqrt(((1-r*r)*(ss((y-ymean))))/(df*(ss(x-xmean)))) >>>> Alternatively: >>>> sterrest = np.sqrt((1-r*r)*(ss(y)-n*ymean*ymean)/ (ss(x)-n*xmean*xmean) >>>> / df) >>>> >>>> Note the formula is derived using the definition of R-squared: >>>> The estimated variance of the slope = MSE/Sxx= ((1-R*R)*Syy)/(df*Sxx) >>>> where Syy and Sxx are the corrected sums of squares for Y and X, >>>> respectively, >>>> >>>> Regards >>>> Bruce >>>> >>>> Hi all, >>>> I was working with linear regression in scipy and met some problems >>>> with value of standard error of the estimate returned by >>>> scipy.stats.linregress() function. I could not compare it to similar >>>> outputs of other linear regression routines (for example in Origin), >>>> so I took a look in the source (stats.py). >>>> >>>> In the source it is defined as >>>> sterrest = np.sqrt((1-r*r)*ss(y) / ss(x) / df) >>>> where r is correlation coefficient, df is degrees of freedom (N-2) and >>>> ss() is sum of squares of elements. >>>> >>>> After digging through literature the only formula looking somewhat the >>>> same was found to be >>>> stderrest = np.sqrt((1-r*r)*ss(y-y.mean())/df) >>>> which gives the same result as a standard definition (in notation of >>>> the source of linregress) >>>> stderrest = np.sqrt(ss(y-slope*x-intercept)/df) >>>> but the output of linregress is different. >>>> >>>> I humbly suppose this is a bug, but maybe somebody could explain me >>>> what is it if I'm wrong... >>>> Pavlo. >>>> >>>> >>>> >>> Thank you for reporting and checking this. >>> >>> I fixed it in trunk, but still have to add a test. >>> There are still small (1e-4) numerical differences to the >>> multivariate ols version in the example I tried >>> >>> Josef >>> _______________________________________________ >>> SciPy-user mailing list >>> SciPy-user at scipy.org >>> http://projects.scipy.org/mailman/listinfo/scipy-user >>> >>> >> Assuming that linregress is different from expected, linregress is just >> a scalar implementation so it is probably prone to rounding error in >> numerous places. >> If you want, I can look at the example and see. 
>> >> Bruce >> _______________________________________________ >> SciPy-user mailing list >> SciPy-user at scipy.org >> http://projects.scipy.org/mailman/listinfo/scipy-user >> >> > > I just made up an example on the command line and compared it with my > version of the cookbook ols class using pinv . Until now I haven't > looked carefully at linregress because I think it should be completely > replaced with a full ols regression, or at least with a call to it. I > added and corrected a test. original tests only test constant and > slope estimates. > > Josef > > I have not tried the ols class but for an example based on your code (plus another I had very handy) I did not see any differences between linregress and SAS glm. It would be inappropriate to replace it or have a call to other module because we would have to provide the same input and output requirements. Rather I think we just have to leave it alone until we get an acceptable regression/general linear model/ANOVA solution. Then we can just depreciate before removing it. Bruce From josef.pktd at gmail.com Tue Feb 24 11:06:15 2009 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Tue, 24 Feb 2009 11:06:15 -0500 Subject: [SciPy-user] error estimate in stats.linregress In-Reply-To: <49A4017A.1080409@gmail.com> References: <5e5978e10902212323o1f9dc4dn17b32d8c986f3440@mail.gmail.com> <3d375d730902212346w288ce1b2pb54d0e55db5804ae@mail.gmail.com> <49A300A9.4020608@gmail.com> <1cd32cbb0902231304s79948a1bj89a45d4b394fbf24@mail.gmail.com> <49A318DE.9030303@gmail.com> <1cd32cbb0902231657y21aafc81pff468ed5b7bc4421@mail.gmail.com> <49A4017A.1080409@gmail.com> Message-ID: <1cd32cbb0902240806k74a317bcj4f66be9e1104d985@mail.gmail.com> On Tue, Feb 24, 2009 at 9:17 AM, Bruce Southey wrote: > josef.pktd at gmail.com wrote: >> On Mon, Feb 23, 2009 at 4:45 PM, Bruce Southey wrote: >> >>> josef.pktd at gmail.com wrote: >>> >>>> On Mon, Feb 23, 2009 at 3:01 PM, Bruce Southey wrote: >>>> >>>> >>>>> Hi, >>>>> Yes, the formula is incorrect. The reason is that the sum of squares >>>>> terms are not corrected by the means because the ss function just >>>>> computes the uncorrected sum of squares. >>>>> >>>>> Thus the correct formula should : >>>>> sterrest = np.sqrt(((1-r*r)*(ss((y-ymean))))/(df*(ss(x-xmean)))) >>>>> Alternatively: >>>>> sterrest = np.sqrt((1-r*r)*(ss(y)-n*ymean*ymean)/ (ss(x)-n*xmean*xmean) >>>>> / df) >>>>> >>>>> Note the formula is derived using the definition of R-squared: >>>>> The estimated variance of the slope = MSE/Sxx= ((1-R*R)*Syy)/(df*Sxx) >>>>> where Syy and Sxx are the corrected sums of squares for Y and X, >>>>> respectively, >>>>> >>>>> Regards >>>>> Bruce >>>>> >>>>> Hi all, >>>>> I was working with linear regression in scipy and met some problems >>>>> with value of standard error of the estimate returned by >>>>> scipy.stats.linregress() function. I could not compare it to similar >>>>> outputs of other linear regression routines (for example in Origin), >>>>> so I took a look in the source (stats.py). >>>>> >>>>> In the source it is defined as >>>>> sterrest = np.sqrt((1-r*r)*ss(y) / ss(x) / df) >>>>> where r is correlation coefficient, df is degrees of freedom (N-2) and >>>>> ss() is sum of squares of elements. 
>>>>> >>>>> After digging through literature the only formula looking somewhat the >>>>> same was found to be >>>>> stderrest = np.sqrt((1-r*r)*ss(y-y.mean())/df) >>>>> which gives the same result as a standard definition (in notation of >>>>> the source of linregress) >>>>> stderrest = np.sqrt(ss(y-slope*x-intercept)/df) >>>>> but the output of linregress is different. >>>>> >>>>> I humbly suppose this is a bug, but maybe somebody could explain me >>>>> what is it if I'm wrong... >>>>> Pavlo. >>>>> >>>>> >>>>> >>>> Thank you for reporting and checking this. >>>> >>>> I fixed it in trunk, but still have to add a test. >>>> There are still small ?(1e-4) numerical differences to the >>>> multivariate ols version in the example I tried >>>> >>>> Josef >>>> _______________________________________________ >>>> SciPy-user mailing list >>>> SciPy-user at scipy.org >>>> http://projects.scipy.org/mailman/listinfo/scipy-user >>>> >>>> >>> Assuming that linregress is different from expected, linregress is just >>> a scalar implementation so it is probably prone to rounding error in >>> numerous places. >>> If you want, I can look at the example and see. >>> >>> Bruce >>> _______________________________________________ >>> SciPy-user mailing list >>> SciPy-user at scipy.org >>> http://projects.scipy.org/mailman/listinfo/scipy-user >>> >>> >> >> I just made up an example on the command line and compared it with my >> version of the cookbook ols class using pinv . Until now I haven't >> looked carefully at linregress because I think it should be completely >> replaced with a full ols regression, or at least with a call to it. I >> added and corrected a test. original tests only test constant and >> slope estimates. >> >> Josef >> >> > I have not tried the ols class but for an example based on your code > (plus another I had very handy) I did not see any differences between > linregress and SAS glm. I guess, I had an incomplete reload somewhere, I ran 1000 simulations of my simple example and the maximum absolute difference is 1e-15 (1e-13 in some other cases) The only difference I found is that the coefficient of determination R is calculated as the correlation coefficient between x and y and has a sign, compared to standard definition of R^squared. > > It would be inappropriate to replace it or have a call to other module > because we would have to provide the same input and output requirements. > Rather I think we just have to leave it alone until we get an acceptable > regression/general linear model/ANOVA solution. Then we can just > depreciate before removing it. > > Bruce Ok with me, I didn't think of the output requirements. And if linregress is properly tested (as it is now), we can leave it alone. The calculation could be cleaned up, e.g. use r = np.corrcoef(x,y) or the use of betai. Josef From simpson at math.toronto.edu Tue Feb 24 16:55:40 2009 From: simpson at math.toronto.edu (Gideon Simpson) Date: Tue, 24 Feb 2009 16:55:40 -0500 Subject: [SciPy-user] hankel transform Message-ID: Some time ago there was discussion of adding a Hankel Transform to SciPy. Was this ever done? -gideon From bjracine at glosten.com Tue Feb 24 19:35:09 2009 From: bjracine at glosten.com (Benjamin J. Racine) Date: Tue, 24 Feb 2009 16:35:09 -0800 Subject: [SciPy-user] Scientific packages for a distributed computing Amazon EC2 image? In-Reply-To: Message-ID: <8C2B20C4348091499673D86BF10AB6763B05797AE6@clipper.glosten.local> I put on there, but perhaps might have missed... 
cython mpi4py ETS (Enthought Tool Suite) Ben R. ________________________________ From: scipy-user-bounces at scipy.org [mailto:scipy-user-bounces at scipy.org] On Behalf Of Peter Skomoroch Sent: Monday, February 23, 2009 11:52 AM To: SciPy Developers List; SciPy Users List Subject: [SciPy-user] Scientific packages for a distributed computing Amazon EC2 image? I'm collecting a wishlist of scientific and python related packages (numpy, scipy, etc) people would want installed on a Debian based Amazon EC2 machine image (AMI) for distributed computing. I'll make more information available as the machine image develops, some of these will also go into the Machetec2 AMI. Several variants of the AMI should become available in the next month. Please feel free to add any packages you would want pre-installed on the following wiki page: http://scipy.org/SciPyAmazonAmi Let me know if you spot any potential license conflicts with listed software. -- Peter N. Skomoroch 617.285.8348 http://www.datawrangling.com http://delicious.com/pskomoroch http://twitter.com/peteskomoroch -------------- next part -------------- An HTML attachment was scrubbed... URL: From tritemio at gmail.com Wed Feb 25 03:12:37 2009 From: tritemio at gmail.com (Antonino Ingargiola) Date: Wed, 25 Feb 2009 09:12:37 +0100 Subject: [SciPy-user] In-place matrix reordering Message-ID: <5486cca80902250012o3246a175u62aa9494adb0927d@mail.gmail.com> Hi to the list, I have to "reorder" the columns of a big 2D array (a matrix) but without doing a temp copy of the whole matrix (since it is 1.5GB). Basically I would need something like this: a = arange(12).reshape(4,3) b = a[:,(2,0,1)] but without the array copy triggered by the advanced indexing. Let the (2,0,1) tuple be an arbitrary sequence previously computed. The .take method seems to do a copy too: b = a.take((2,0,1), axis=1) What I need is a "view" of "a" with an arbitrary column order. Is there a way to accomplish this task? Ciao, ~ Antonio From robert.kern at gmail.com Wed Feb 25 03:14:33 2009 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 25 Feb 2009 02:14:33 -0600 Subject: [SciPy-user] In-place matrix reordering In-Reply-To: <5486cca80902250012o3246a175u62aa9494adb0927d@mail.gmail.com> References: <5486cca80902250012o3246a175u62aa9494adb0927d@mail.gmail.com> Message-ID: <3d375d730902250014u781b670fs15757a7eb26328a2@mail.gmail.com> On Wed, Feb 25, 2009 at 02:12, Antonino Ingargiola wrote: > Hi to the list, > > I have to "reorder" the columns of a big 2D array (a matrix) but > without doing a temp copy of the whole matrix (since it is 1.5GB). > > Basically I would need something like this: > > a = arange(12).reshape(4,3) > b = a[:,(2,0,1)] > > but without the array copy triggered by the advanced indexing. Let the > (2,0,1) tuple be an arbitrary sequence previously computed. > > The .take method seems to do a copy too: > > b = a.take((2,0,1), axis=1) > > What I need is a "view" of "a" with an arbitrary column order. > > Is there a way to accomplish this task? No, sorry. numpy's memory model does not allow arbitrary views like this. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." 
-- Umberto Eco From ondrej at certik.cz Wed Feb 25 04:57:40 2009 From: ondrej at certik.cz (Ondrej Certik) Date: Wed, 25 Feb 2009 01:57:40 -0800 Subject: [SciPy-user] FEM sol interpolation In-Reply-To: <49A2A930.5000606@meduni-graz.at> References: <49A2A930.5000606@meduni-graz.at> Message-ID: <85b5c3130902250157l4f0cadb6h6548ab681d00e637@mail.gmail.com> Hi Bernardo! On Mon, Feb 23, 2009 at 5:48 AM, Bernardo M. Rocha wrote: > Hi Guys, > > I would like to know how can I interpolate some data (a FEM solution) > from a coarse grid to a finer grid.... > > I would like to do something like (available in FEMLAB/COMSOL): > > u_int = postinterp(fem_sol, 'u', p_ref); > > Where fem_sol is my solution at some coarse grid and p_ref is my > reference mesh (where I also have a fem_ref solution defined). > > If this is not clear, I need to compute e = fem_ref - fem_sol. Depending on your FEM, you need to implement projections of the solution. Which software are you developing? If you want, feel free to ask and discuss any such questions on our hp-FEM list here (CCing there): http://groups.google.com/group/hpfem or browser what we do in our group: http://hpfem.org/ we are always looking for new collaborators and people to work with. Ondrej From tritemio at gmail.com Wed Feb 25 05:31:25 2009 From: tritemio at gmail.com (Antonino Ingargiola) Date: Wed, 25 Feb 2009 11:31:25 +0100 Subject: [SciPy-user] In-place matrix reordering In-Reply-To: <3d375d730902250014u781b670fs15757a7eb26328a2@mail.gmail.com> References: <5486cca80902250012o3246a175u62aa9494adb0927d@mail.gmail.com> <3d375d730902250014u781b670fs15757a7eb26328a2@mail.gmail.com> Message-ID: <5486cca80902250231k1d4f89edkb6503911ec1fae6d@mail.gmail.com> 2009/2/25 Robert Kern : > On Wed, Feb 25, 2009 at 02:12, Antonino Ingargiola wrote: >> Hi to the list, >> >> I have to "reorder" the columns of a big 2D array (a matrix) but >> without doing a temp copy of the whole matrix (since it is 1.5GB). >> >> Basically I would need something like this: >> >> a = arange(12).reshape(4,3) >> b = a[:,(2,0,1)] >> >> but without the array copy triggered by the advanced indexing. Let the >> (2,0,1) tuple be an arbitrary sequence previously computed. >> >> The .take method seems to do a copy too: >> >> b = a.take((2,0,1), axis=1) >> >> What I need is a "view" of "a" with an arbitrary column order. >> >> Is there a way to accomplish this task? > > No, sorry. numpy's memory model does not allow arbitrary views like this. Is there any workarounds to save ram in this case? Something like using an external package or inline C (I have no idea about that). Thanks, ~ Antonio From gael.varoquaux at normalesup.org Wed Feb 25 05:43:30 2009 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Wed, 25 Feb 2009 11:43:30 +0100 Subject: [SciPy-user] In-place matrix reordering In-Reply-To: <5486cca80902250231k1d4f89edkb6503911ec1fae6d@mail.gmail.com> References: <5486cca80902250012o3246a175u62aa9494adb0927d@mail.gmail.com> <3d375d730902250014u781b670fs15757a7eb26328a2@mail.gmail.com> <5486cca80902250231k1d4f89edkb6503911ec1fae6d@mail.gmail.com> Message-ID: <20090225104330.GA27481@phare.normalesup.org> On Wed, Feb 25, 2009 at 11:31:25AM +0100, Antonino Ingargiola wrote: > Is there any workarounds to save ram in this case? Something like > using an external package or inline C (I have no idea about that). Save to a file. Delete the array, load the file using memmaping, and extract the components you are interested in to a memroy-resident array. 
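Another option, when the array is already sitting in memory (or is opened as a writable memmap), is to apply the column permutation in place, one cycle at a time, so that only a single spare column is ever allocated. A rough sketch, not tested at the 1.5 GB scale, and the helper name is made up:

import numpy as np

def permute_columns_inplace(a, order):
    # Afterwards a[:, j] holds what was originally a[:, order[j]],
    # i.e. the same result as a[:, order], using only one spare column.
    order = np.asarray(order)
    n = a.shape[1]
    done = np.zeros(n, dtype=bool)
    for start in xrange(n):
        if done[start] or order[start] == start:
            done[start] = True
            continue
        buf = a[:, start].copy()      # save the first column of the cycle
        j = start
        while order[j] != start:
            a[:, j] = a[:, order[j]]
            done[j] = True
            j = order[j]
        a[:, j] = buf                 # close the cycle
        done[j] = True

a = np.arange(12.).reshape(4, 3)
permute_columns_inplace(a, [2, 0, 1])   # same content as a[:, (2, 0, 1)]

Since the array is stored row-major, every column assignment is a strided write, so this is mainly interesting once the data is already in RAM; on a disk-backed memmap a block-wise copy to a second file is usually the better trade-off.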
Ga?l From peter.skomoroch at gmail.com Wed Feb 25 06:32:47 2009 From: peter.skomoroch at gmail.com (Peter Skomoroch) Date: Wed, 25 Feb 2009 06:32:47 -0500 Subject: [SciPy-user] scipy comparison to other packages Message-ID: There is a discussion happening here including some folks from Mathworks that people on the list might be interested in. http://anyall.org/blog/2009/02/comparison-of-data-analysis-packages-r-matlab-scipy-excel-sas-spss-stata/ -- Peter N. Skomoroch 617.285.8348 http://www.datawrangling.com http://delicious.com/pskomoroch http://twitter.com/peteskomoroch -------------- next part -------------- An HTML attachment was scrubbed... URL: From mwojc at p.lodz.pl Wed Feb 25 07:04:51 2009 From: mwojc at p.lodz.pl (Marek Wojciechowski) Date: Wed, 25 Feb 2009 13:04:51 +0100 Subject: [SciPy-user] incremental sum Message-ID: Hi! Is there in numpy/scipy a function/method which does efficiently: for i in xrange(1, len(arr)): arr[i] += arr[i-1] Greetings, -- Marek Wojciechowski From pav at iki.fi Wed Feb 25 07:17:07 2009 From: pav at iki.fi (Pauli Virtanen) Date: Wed, 25 Feb 2009 12:17:07 +0000 (UTC) Subject: [SciPy-user] incremental sum References: Message-ID: Wed, 25 Feb 2009 13:04:51 +0100, Marek Wojciechowski wrote: > Is there in numpy/scipy a function/method which does efficiently: > > for i in xrange(1, len(arr)): > arr[i] += arr[i-1] cumsum(arr, out=arr) From mwojc at p.lodz.pl Wed Feb 25 08:28:15 2009 From: mwojc at p.lodz.pl (Marek Wojciechowski) Date: Wed, 25 Feb 2009 14:28:15 +0100 Subject: [SciPy-user] incremental sum References: Message-ID: Pauli Virtanen wrote: > Wed, 25 Feb 2009 13:04:51 +0100, Marek Wojciechowski wrote: >> Is there in numpy/scipy a function/method which does efficiently: >> >> for i in xrange(1, len(arr)): >> arr[i] += arr[i-1] > > cumsum(arr, out=arr) Thanks! -- Marek Wojciechowski From tritemio at gmail.com Wed Feb 25 08:45:07 2009 From: tritemio at gmail.com (Antonino Ingargiola) Date: Wed, 25 Feb 2009 14:45:07 +0100 Subject: [SciPy-user] In-place matrix reordering In-Reply-To: <20090225104330.GA27481@phare.normalesup.org> References: <5486cca80902250012o3246a175u62aa9494adb0927d@mail.gmail.com> <3d375d730902250014u781b670fs15757a7eb26328a2@mail.gmail.com> <5486cca80902250231k1d4f89edkb6503911ec1fae6d@mail.gmail.com> <20090225104330.GA27481@phare.normalesup.org> Message-ID: <5486cca80902250545o33e528a4nd6c4674dbfd7db7a@mail.gmail.com> 2009/2/25 Gael Varoquaux : > On Wed, Feb 25, 2009 at 11:31:25AM +0100, Antonino Ingargiola wrote: >> Is there any workarounds to save ram in this case? Something like >> using an external package or inline C (I have no idea about that). > > Save to a file. Delete the array, load the file using memmaping, and > extract the components you are interested in to a memroy-resident array. I saved the matrix with numpy.save, but I have a problem during the assignment between the unordered (co) matrix and the reordered one (cor). u is a sequence of indices with u.sahpe = (500,). 
import numpy as N co = N.memmap('S_co.npy', dtype='float64', shape=(288092, 500)) # Raw data cor = N.memmap('S_co_reord.npy', dtype='float64', shape=(288092, 500), mode='w+') # Reordered data to be saved here cor[:,:] = co[:,u] Exception exceptions.AttributeError: "'memmap' object has no attribute '_mmap'" in ignored --------------------------------------------------------------------------- MemoryError Traceback (most recent call last) /home/anto/Simulazioni/matlab/mc3d/ in () MemoryError: >>> I've never used the N.memmap function so maybe I do something wrong here. Any hints? > > Gaël ~ Antonio From gael.varoquaux at normalesup.org Wed Feb 25 09:07:31 2009 From: gael.varoquaux at normalesup.org (Gael Varoquaux) Date: Wed, 25 Feb 2009 15:07:31 +0100 Subject: [SciPy-user] In-place matrix reordering In-Reply-To: <5486cca80902250545o33e528a4nd6c4674dbfd7db7a@mail.gmail.com> References: <5486cca80902250012o3246a175u62aa9494adb0927d@mail.gmail.com> <3d375d730902250014u781b670fs15757a7eb26328a2@mail.gmail.com> <5486cca80902250231k1d4f89edkb6503911ec1fae6d@mail.gmail.com> <20090225104330.GA27481@phare.normalesup.org> <5486cca80902250545o33e528a4nd6c4674dbfd7db7a@mail.gmail.com> Message-ID: <20090225140731.GA7706@phare.normalesup.org> On Wed, Feb 25, 2009 at 02:45:07PM +0100, Antonino Ingargiola wrote: > I've never used the N.memmap function so maybe I do something wrong > here. Any hints? use:: from numpy.lib import format format.open_memmap HTH, Gaël From nwagner at iam.uni-stuttgart.de Wed Feb 25 09:43:34 2009 From: nwagner at iam.uni-stuttgart.de (Nils Wagner) Date: Wed, 25 Feb 2009 15:43:34 +0100 Subject: [SciPy-user] Importing models from other FEM codes (NASTRAN, ...) Message-ID: Hi all, I am looking for non-commercial tools to import Nastran files - something similar to what is offered by the Structural Dynamics Toolbox http://www.sdtools.com/femlink.html Any pointer would be appreciated. Nils From tritemio at gmail.com Wed Feb 25 10:58:46 2009 From: tritemio at gmail.com (Antonino Ingargiola) Date: Wed, 25 Feb 2009 16:58:46 +0100 Subject: [SciPy-user] In-place matrix reordering In-Reply-To: <20090225140731.GA7706@phare.normalesup.org> References: <5486cca80902250012o3246a175u62aa9494adb0927d@mail.gmail.com> <3d375d730902250014u781b670fs15757a7eb26328a2@mail.gmail.com> <5486cca80902250231k1d4f89edkb6503911ec1fae6d@mail.gmail.com> <20090225104330.GA27481@phare.normalesup.org> <5486cca80902250545o33e528a4nd6c4674dbfd7db7a@mail.gmail.com> <20090225140731.GA7706@phare.normalesup.org> Message-ID: <5486cca80902250758h50ce836h1a78276e872b78ea@mail.gmail.com> 2009/2/25 Gael Varoquaux : > On Wed, Feb 25, 2009 at 02:45:07PM +0100, Antonino Ingargiola wrote: >> I've never used the N.memmap function so maybe I do something wrong >> here. Any hints? > > use:: > > from numpy.lib import format > format.open_memmap Is the difference between numpy.memmap and format.open_memmap only that the latter will guess the correct shape and dtype of the array? I did not find much documentation about memmap and what is the best format for storing an ndarray. I opened co and cor as mmap using format.open_memmap but the command cor[:,:] = co[:,u] gives the same error as before. I suspect that the command creates a temp variable in RAM with the size of co. I can copy the data in blocks, but the operation becomes really slow (> 10 min). Thanks for the suggestions so far.
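(A possible low-memory route, just an untested sketch: instead of building the reordered copy in one shot, apply the permutation in place by following its cycles, so only one spare column is ever held in RAM. Here `a` stands for the big array and `order` plays the role of `u`, i.e. the goal is the in-place equivalent of `a = a[:, order]`; the function name is made up for illustration. Column-by-column writes through a memmap may still be slow, but the memory footprint stays at a single column plus a small bookkeeping array.)

import numpy

def permute_columns_inplace(a, order):
    # In-place equivalent of a[:] = a[:, order], using one temporary column.
    order = numpy.asarray(order)
    done = numpy.zeros(len(order), dtype=bool)
    for start in range(len(order)):
        if done[start] or order[start] == start:
            done[start] = True
            continue
        tmp = a[:, start].copy()      # first column of the cycle gets overwritten
        j = start
        while True:
            k = order[j]              # column that must end up at position j
            done[j] = True
            if k == start:
                a[:, j] = tmp         # cycle closed: drop in the saved column
                break
            a[:, j] = a[:, k]
            j = k

a = numpy.arange(12).reshape(4, 3)
permute_columns_inplace(a, (2, 0, 1))
# a now holds what a[:, (2, 0, 1)] would have returned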
~ Antonio From daniel.wheeler2 at gmail.com Wed Feb 25 11:30:34 2009 From: daniel.wheeler2 at gmail.com (Daniel Wheeler) Date: Wed, 25 Feb 2009 11:30:34 -0500 Subject: [SciPy-user] FEM sol interpolation In-Reply-To: <49A2A930.5000606@meduni-graz.at> References: <49A2A930.5000606@meduni-graz.at> Message-ID: <80b160a0902250830y493a4623r7b3049735ace0d80@mail.gmail.com> Hi Bernardo, I have tried doing this with scipy/numpy. If you check previous postings, there has been quite a lot of discussion about this. The real problem is going from an unstructured mesh to another unstructured mesh. If the grid you are interpolating from is regular then it is a much easier problem. When both meshes are irregular then it seems like the functions available in scipy/numpy require a lot of memory (Nele*Mele, where N and M are the number of cells in each domain; I would like to be wrong about this). Anyhow, the way the problem is tackled in fipy is to use the call method to pass the set of points from one grid to a variable which knows its mesh and holds the values. The above basically calls a nearest neighbor search and then projects the value using the gradient. For general meshes we use this method for the nearest cell: The method above is heinous for memory usage and efficiency; this is something we have to work on as we want to have adaptive grids etc in fipy, but at least it works. It is generally used for interpolating to a single point or a small number of points. If the mesh is regular we do much better in terms of efficiency and memory. Cheers On Mon, Feb 23, 2009 at 8:48 AM, Bernardo M. Rocha wrote: > Hi Guys, > > I would like to know how can I interpolate some data (a FEM solution) > from a coarse grid to a finer grid.... > > I would like to do something like (available in FEMLAB/COMSOL): > > u_int = postinterp(fem_sol, 'u', p_ref); > > Where fem_sol is my solution at some coarse grid and p_ref is my > reference mesh (where I also have a fem_ref solution defined). > > If this is not clear, I need to compute e = fem_ref - fem_sol. > > Thanks in advance, > Bernardo M. Rocha > _______________________________________________ > SciPy-user mailing list > SciPy-user at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-user > -- Daniel Wheeler From stefan at sun.ac.za Wed Feb 25 12:17:09 2009 From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=) Date: Wed, 25 Feb 2009 19:17:09 +0200 Subject: [SciPy-user] FEM sol interpolation In-Reply-To: <49A2A930.5000606@meduni-graz.at> References: <49A2A930.5000606@meduni-graz.at> Message-ID: <9457e7c80902250917o442c32bfne0149c4f5d7ac863@mail.gmail.com> 2009/2/23 Bernardo M. Rocha : > I would like to know how can I interpolate some data (a FEM solution) > from a coarse grid to a finer grid.... > > I would like to do something like (available in FEMLAB/COMSOL): > > u_int = postinterp(fem_sol, 'u', p_ref); > > Where fem_sol is my solution at some coarse grid and p_ref is my > reference mesh (where I also have a fem_ref solution defined). This sounds like the perfect application for a GPU!
(I'm not a domain expert, just thinking out loud) Cheers St?fan From jeremy at jeremysanders.net Wed Feb 25 12:34:48 2009 From: jeremy at jeremysanders.net (Jeremy Sanders) Date: Wed, 25 Feb 2009 17:34:48 +0000 Subject: [SciPy-user] ANN: Veusz 1.3 - a scientific plotting package and module Message-ID: Veusz 1.3 --------- Velvet Ember Under Sky Zenith ----------------------------- http://home.gna.org/veusz/ Veusz is Copyright (C) 2003-2009 Jeremy Sanders Licenced under the GPL (version 2 or greater). Veusz is a scientific plotting package. It is written in Python, using PyQt4 for display and user-interfaces, and numpy for handling the numeric data. Veusz is designed to produce publication-ready Postscript/PDF output. The user interface aims to be simple, consistent and powerful. Veusz provides a GUI, command line, embedding and scripting interface (based on Python) to its plotting facilities. It also allows for manipulation and editing of datasets. Changes in 1.3: * Add data capture from sockets, files and external programs * Remembers previous entries in dialog boxes * Add shaded regions or lines error bar style * Plot keys can be dragged around with the mouse * New clearer scalable icons * Now requires Python >= 2.4 * minor changes - Add filename completion in several places - Remember import dialog tab selection - Use font drop-down to select font - Add icons for error bar styles - Error bar code rewritten and simplified - Add import dialog to toolbar * bug fixes: - Fix incorrect "security errors" when loading invalid documents - Fix dragging around of shapes and lines problems - Fix address of FSF in license - Fix appearance of dialog box fonts on some systems - Fix recent files menu - Fix hiding of pages and graphs Features of package: * X-Y plots (with errorbars) * Line and function plots * Contour plots * Images (with colour mappings and colorbars) * Stepped plots (for histograms) * Fitting functions to data * Stacked plots and arrays of plots * Plot keys * Plot labels * Shapes and arrows on plots * LaTeX-like formatting for text * EPS/PDF/PNG/SVG export * Scripting interface * Dataset creation/manipulation * Embed Veusz within other programs * Text, CSV and FITS importing Requirements: Python (2.4 or greater required) http://www.python.org/ Qt >= 4.3 (free edition) http://www.trolltech.com/products/qt/ PyQt >= 4.3 (SIP is required to be installed first) http://www.riverbankcomputing.co.uk/pyqt/ http://www.riverbankcomputing.co.uk/sip/ numpy >= 1.0 http://numpy.scipy.org/ Optional: Microsoft Core Fonts (recommended for nice output) http://corefonts.sourceforge.net/ PyFITS >= 1.1 (optional for FITS import) http://www.stsci.edu/resources/software_hardware/pyfits For documentation on using Veusz, see the "Documents" directory. The manual is in pdf, html and text format (generated from docbook). Issues: * Can be very slow to plot large datasets if antialiasing is enabled. Right click on graph and disable antialias to speed up output. If you enjoy using Veusz, I would love to hear from you. Please join the mailing lists at https://gna.org/mail/?group=veusz to discuss new features or if you'd like to contribute code. The latest code can always be found in the SVN repository. 
Jeremy Sanders From karl.young at ucsf.edu Wed Feb 25 13:11:02 2009 From: karl.young at ucsf.edu (Young, Karl) Date: Wed, 25 Feb 2009 10:11:02 -0800 Subject: [SciPy-user] FEM sol interpolation References: <49A2A930.5000606@meduni-graz.at> <9457e7c80902250917o442c32bfne0149c4f5d7ac863@mail.gmail.com> Message-ID: <9D202D4E86A4BF47BA6943ABDF21BE78058FABCF@EXVS06.net.ucsf.edu> >> 2009/2/23 Bernardo M. Rocha : >> I would like to know how can I interpolate some data (a FEM solution) >> from a coarse grid to a finer grid.... >> >> I would like to do something like (available in FEMLAB/COMSOL): >> >> u_int = postinterp(fem_sol, 'u', p_ref); >> >> Where fem_sol is my solution at some coarse grid and p_ref is my >> reference mesh (where I also have a fem_ref solution defined). > This sounds like the perfect application for a GPU! (I'm not a domain > expert, just thinking out loud) Speaking of which... don't know who's seen this (and have no idea what the quality of these is) but it might be of interest to folks interested in GPU computing: https://www.livemeeting.com/lrs/1100002761/Registration.aspx?pageName=nz3jz5c979v7nts0 From dmitrey15 at ukr.net Thu Feb 26 04:25:40 2009 From: dmitrey15 at ukr.net (Dmitrey) Date: Thu, 26 Feb 2009 11:25:40 +0200 Subject: [SciPy-user] [numerical optimization] Poll "what do you miss in OpenOpt" Message-ID: <49A66014.6070506@ukr.net> Hi all, I created a poll "what do you miss in OpenOpt framework", could you take participation? http://www.doodle.com/participation.html?pollId=a78g5mk9sf7dnrbe Let me remember for those ones who is not familiar with openopt - it's a free Python-written numerical optimization framework http://openopt.org Thank you in advance, Dmitrey From ross.williamson at usap.gov Thu Feb 26 05:23:27 2009 From: ross.williamson at usap.gov (Ross Williamson) Date: Thu, 26 Feb 2009 23:23:27 +1300 Subject: [SciPy-user] Convert int 64bits to float Message-ID: <49A66D9F.7050709@usap.gov> Hi everyone I need to convert a set of bits (64) held in an int64 variable to the equivalent (bit) float64. e.g. number = 5975615269021 - float(number) will not work it will just convert it to 5975615269021.00 not whatever it is as a float from the raw bits. Cheers Ross From s.mientki at ru.nl Thu Feb 26 06:33:32 2009 From: s.mientki at ru.nl (Stef Mientki) Date: Thu, 26 Feb 2009 12:33:32 +0100 Subject: [SciPy-user] LiberMate was Re: Automating Matlab In-Reply-To: <49A02E1A.5050703@gmail.com> References: <4984F58C.5070605@gmail.com> <49A02E1A.5050703@gmail.com> Message-ID: <49A67E0C.4010002@ru.nl> hi Eric, Eric Schug wrote: > Eric Schug wrote: > >> Is there strong interest in automating matlab to numpy conversion? >> >> I have a working version of a matlab to python translator. >> It allows translation of matlab scripts into numpy constructs, >> supporting most of the matlab language. The parser is nearly >> complete. Most of the remaining work involves providing a robust >> translation. Such as >> * making sure that copies on assign are done when needed. >> * correct indexing a(:) becomes a.flatten(1) when on the left hand >> side (lhs) of equals >> and a[:] when on the right hand side >> >> >> I've seen a few projects attempt to do this, but for one reason or >> another have stopped it. >> >> >> > > Such a translator would be very welcome. We just tried the translator with a simple script, attached the m-file and the hand-corrected py-file (where you can changes the corrections). These are our findings * show() command is not needed in Matlab. 
* Case-Senitive function calls are automatically translated in Matlab * Complex number i is transformed in a function 1j() * product of 2 numbers is changed in a (ugly) dot-product. * can't handle unicode e.g. "re?el" * matdiv is not found * graphs are nicer, but axis are uglier * subplot needs integer values the resulting graphs from both Matlab and Python, can be seen here (we might not have used the latest MatPlotLib). good luck and cheers, Stef -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: HPF_graph.m URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: HPF_graph.py Type: text/x-python Size: 1025 bytes Desc: not available URL: From s.mientki at ru.nl Thu Feb 26 06:47:21 2009 From: s.mientki at ru.nl (Stef Mientki) Date: Thu, 26 Feb 2009 12:47:21 +0100 Subject: [SciPy-user] LiberMate was Re: Automating Matlab In-Reply-To: <49A67E0C.4010002@ru.nl> References: <4984F58C.5070605@gmail.com> <49A02E1A.5050703@gmail.com> <49A67E0C.4010002@ru.nl> Message-ID: <49A68149.5040003@ru.nl> forgot the link > > > the resulting graphs from both Matlab and Python, can be seen here > (we might not have used the latest MatPlotLib). > http://mientki.ruhosting.nl/data_www/pylab_works/matlab.html From sgarcia at olfac.univ-lyon1.fr Thu Feb 26 06:17:20 2009 From: sgarcia at olfac.univ-lyon1.fr (Samuel GARCIA) Date: Thu, 26 Feb 2009 12:17:20 +0100 Subject: [SciPy-user] Circular Buffer Message-ID: <49A67A40.1040205@olfac.univ-lyon1.fr> HI, I am writting a circular buffer for a acuisiqtion program. The idea is simple use and derivate a numpy.array for changing one behaviour : the buffer is circular. So when the slice arrive to the end it start at the begining with the modulo size of the buffer. Here is a draft of implementation of this for 1d. Question : -is there something more clerver in line 27 to avoid concatenate and so a duplication of the array - does someones already implement this for ND ? Thanks Samuel # -*- coding: utf-8 -*- import numpy class CircularBuffer1d(): def __init__(self , shape , dtype = 'f'): self.shape = shape self.array = numpy.zeros(shape , dtype = dtype) def __getitem__(self , sl): if type(sl) == int: return self.array[sl%self.shape[0]] elif type(sl) == slice : if sl.start is None : start = 0 else : start = sl.start % self.shape[0] stop = sl.stop % self.shape[0] if stop>start : return self.array[start:stop] else : return numpy.concatenate( ( self.array[start:], self.array[: stop ]) , axis = 0) def __setitem__(self , sl, a): if type(sl) == int: self.array[sl%self.shape[0]] = a elif type(sl) == slice : if sl.start is None : start = 0 else : start = sl.start % self.shape[0] stop = sl.stop % self.shape[0] if stop>start : self.array[start:stop] = a else : self.array[start:] = a[:self.shape[0] - start] self.array[: a.shape[0]-(self.shape[0] - start) ] = a[self.shape[0] - start:] c = CircularBuffer1d( (30,) , dtype = 'f') print c.array c[14:17] = numpy.ones((3)) print c.array c[58:63] = numpy.arange(5)+1 print c.array c[14] = 14 c[15] = 15 print c.array print c[15:15] c[15:15] = numpy.arange(30) print c[0:30] -- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Samuel Garcia Laboratoire de Neurosciences Sensorielles, Comportement, Cognition. 
CNRS - UMR5020 - Universite Claude Bernard LYON 1 Equipe logistique et technique 50, avenue Tony Garnier 69366 LYON Cedex 07 FRANCE T?l : 04 37 28 74 64 Fax : 04 37 28 76 01 http://olfac.univ-lyon1.fr/unite/equipe-07/ http://neuralensemble.org/trac/OpenElectrophy ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ From sturla at molden.no Thu Feb 26 07:43:48 2009 From: sturla at molden.no (Sturla Molden) Date: Thu, 26 Feb 2009 13:43:48 +0100 Subject: [SciPy-user] Circular Buffer In-Reply-To: <49A67A40.1040205@olfac.univ-lyon1.fr> References: <49A67A40.1040205@olfac.univ-lyon1.fr> Message-ID: <49A68E84.1010105@molden.no> On 2/26/2009 12:17 PM, Samuel GARCIA wrote: > -is there something more clerver in line 27 to avoid concatenate and so > a duplication of the array You can make the allocated memory twice the size you need. When the buffer is full, you copy the latter half to the start. Let N be the capacity of the ringuffer, you could then have an append method like this (only appending to the tail is how most ringbuffers are used): def append(self, a): try: self.array[self.tail] = a self.tail += 1 except IndexError: N = self.array.shape[0] // 2 self.array[:N] = self.array[N:] self.array[N] = a self.tail = N + 1 The IndexError exception triggers an O(N) operation for every Nth call to append, which makes appends having amortized O(1) complexity. That is, they become O(1) on average. You avoid the concatenation in __getitem__ because the items are always stored contiguously. Sturla Molden > - does someones already implement this for ND ? > > Thanks > > Samuel > > # -*- coding: utf-8 -*- > > > > > import numpy > > > class CircularBuffer1d(): > def __init__(self , shape , dtype = 'f'): > self.shape = shape > self.array = numpy.zeros(shape , dtype = dtype) > > def __getitem__(self , sl): > if type(sl) == int: > return self.array[sl%self.shape[0]] And ndarray already has bounds-checks, to this is > elif type(sl) == slice : > if sl.start is None : > start = 0 > else : > start = sl.start % self.shape[0] > stop = sl.stop % self.shape[0] > if stop>start : > return self.array[start:stop] > else : > return numpy.concatenate( ( self.array[start:], > self.array[: stop ]) , axis = 0) > > def __setitem__(self , sl, a): > if type(sl) == int: > self.array[sl%self.shape[0]] = a > elif type(sl) == slice : > if sl.start is None : > start = 0 > else : > start = sl.start % self.shape[0] > > stop = sl.stop % self.shape[0] > > if stop>start : > self.array[start:stop] = a > else : > self.array[start:] = a[:self.shape[0] - start] > self.array[: a.shape[0]-(self.shape[0] - start) ] = > a[self.shape[0] - start:] > > > c = CircularBuffer1d( (30,) , dtype = 'f') > print c.array > c[14:17] = numpy.ones((3)) > print c.array > c[58:63] = numpy.arange(5)+1 > print c.array > c[14] = 14 > c[15] = 15 > print c.array > print c[15:15] > > c[15:15] = numpy.arange(30) > print c[0:30] > > > > > From ndbecker2 at gmail.com Thu Feb 26 08:23:37 2009 From: ndbecker2 at gmail.com (Neal Becker) Date: Thu, 26 Feb 2009 08:23:37 -0500 Subject: [SciPy-user] Circular Buffer References: <49A67A40.1040205@olfac.univ-lyon1.fr> <49A68E84.1010105@molden.no> Message-ID: You might like to look at the circular buffer implementation in boost. I also have my own c++ code for circular. It is an iterator adaptor. Basically, it adapts an iterator to a circular iterator using mod function. 
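For reference, a minimal sketch of the double-allocation idea Sturla describes above -- untested, scalar appends only, and the class/method names are made up for illustration, not taken from any of the posted code. Allocating twice the capacity means the newest values always sit in one contiguous block, so reading them back never needs a concatenate, and the occasional half-copy keeps appends amortized O(1):

import numpy

class RingBuffer1d(object):
    def __init__(self, capacity, dtype='f'):
        self.capacity = capacity
        # twice the needed storage, as suggested above
        self.array = numpy.zeros(2 * capacity, dtype=dtype)
        self.tail = 0                    # next write position

    def append(self, value):
        if self.tail == 2 * self.capacity:
            # storage exhausted: move the newest half to the front;
            # this O(capacity) copy happens once per `capacity` appends
            self.array[:self.capacity] = self.array[self.capacity:]
            self.tail = self.capacity
        self.array[self.tail] = value
        self.tail += 1

    def recent(self, n):
        # contiguous view of the last n values, oldest first -- no copy
        if n > min(self.tail, self.capacity):
            raise ValueError('fewer than n values stored')
        return self.array[self.tail - n:self.tail]

buf = RingBuffer1d(30)
for i in range(100):
    buf.append(i)
print buf.recent(5)    # -> [ 95.  96.  97.  98.  99.]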
From robert.kern at gmail.com Thu Feb 26 14:16:31 2009 From: robert.kern at gmail.com (Robert Kern) Date: Thu, 26 Feb 2009 13:16:31 -0600 Subject: [SciPy-user] Convert int 64bits to float In-Reply-To: <49A66D9F.7050709@usap.gov> References: <49A66D9F.7050709@usap.gov> Message-ID: <3d375d730902261116w40ee1fb2v2f75fb9a239b5f20@mail.gmail.com> On Thu, Feb 26, 2009 at 04:23, Ross Williamson wrote: > Hi everyone > > I need to convert a set of bits (64) held in an int64 variable to the > equivalent (bit) float64. > > e.g. number = 5975615269021 - float(number) will not work it will just > convert it to 5975615269021.00 not whatever it is as a float from the > raw bits. In [2]: x = int64(5975615269021) In [3]: x.view(float64) Out[3]: 2.9523462171876746e-311 -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From nwagner at iam.uni-stuttgart.de Fri Feb 27 05:09:54 2009 From: nwagner at iam.uni-stuttgart.de (Nils Wagner) Date: Fri, 27 Feb 2009 11:09:54 +0100 Subject: [SciPy-user] scipy, matlab and NASTRAN Message-ID: Hi all, I found a tool to import NASTRAN op4 files in Matlab. http://danial.org/op4/ Just now I have asked Al Danial to release his op4 tool under the BSD license. What is needed to write a wrapper for loadop4.c ? I tried to use swig. So I have created a file loadop4.i %module example %{ /* Includes the header in the wrapper code */ #include "op4.h" #include "sparse.h" %} /* Parse the header file to generate wrappers */ %include "op4.h" %include "sparse.h" swig -python loadop4.i gcc -c loadop4.c loadop4_wrap.c -I /data/home/nwagner/local/include/python2.5 In Datei, eingef?gt von loadop4.c:42: sparse.h:23:17: mex.h: Datei oder Verzeichnis nicht gefunden In file included from loadop4.c:42: sparse.h:180: Fehler: Syntaxfehler vor "mwIndex" loadop4.c:1138: Fehler: Syntaxfehler vor "mxArray" loadop4.c: In function `mexFunction': loadop4.c:1178: Fehler: ?nrhs? nicht deklariert (erste Benutzung in dieser Funktion) loadop4.c:1178: Fehler: (Jeder nicht deklarierte Bezeichner wird nur einmal aufgef?hrt loadop4.c:1178: Fehler: f?r jede Funktion in der er auftritt.) loadop4.c:1182: Fehler: ?prhs? nicht deklariert (erste Benutzung in dieser Funktion) loadop4.c:1187: Fehler: ?nlhs? nicht deklariert (erste Benutzung in dieser Funktion) loadop4.c:1193: Warnung: Zuweisung erzeugt Zeiger von Ganzzahl ohne Typkonvertierung loadop4.c:1319: Fehler: ?plhs? nicht deklariert (erste Benutzung in dieser Funktion) loadop4.c:1320: Fehler: ?mxCOMPLEX? nicht deklariert (erste Benutzung in dieser Funktion) loadop4.c:1321: Warnung: Zuweisung erzeugt Zeiger von Ganzzahl ohne Typkonvertierung loadop4.c:1322: Warnung: Zuweisung erzeugt Zeiger von Ganzzahl ohne Typkonvertierung loadop4.c:1325: Fehler: ?mxREAL? 
nicht deklariert (erste Benutzung in dieser Funktion) loadop4.c:1326: Warnung: Zuweisung erzeugt Zeiger von Ganzzahl ohne Typkonvertierung loadop4.c:1328: Warnung: Zuweisung erzeugt Zeiger von Ganzzahl ohne Typkonvertierung loadop4.c:1329: Warnung: Zuweisung erzeugt Zeiger von Ganzzahl ohne Typkonvertierung loadop4.c:1339: Warnung: Zuweisung erzeugt Zeiger von Ganzzahl ohne Typkonvertierung loadop4.c:1340: Warnung: Zuweisung erzeugt Zeiger von Ganzzahl ohne Typkonvertierung loadop4.c:1346: Warnung: Zuweisung erzeugt Zeiger von Ganzzahl ohne Typkonvertierung In file included from loadop4_wrap.c:708: op4.h:110: Fehler: Syntaxfehler vor "str_t" op4.h:122: Fehler: Syntaxfehler vor "str_t" op4.h:155: Fehler: Syntaxfehler vor "SparseMatrix" In Datei, eingef?gt von loadop4_wrap.c:709: sparse.h:23:17: mex.h: Datei oder Verzeichnis nicht gefunden In file included from loadop4_wrap.c:709: sparse.h:180: Fehler: Syntaxfehler vor "mwIndex" loadop4_wrap.c: In function `_wrap_strings_in_list': loadop4_wrap.c:2405: Fehler: ?mwIndex? nicht deklariert (erste Benutzung in dieser Funktion) loadop4_wrap.c:2405: Fehler: (Jeder nicht deklarierte Bezeichner wird nur einmal aufgef?hrt loadop4_wrap.c:2405: Fehler: f?r jede Funktion in der er auftritt.) loadop4_wrap.c:2405: Fehler: ?arg2? nicht deklariert (erste Benutzung in dieser Funktion) loadop4_wrap.c:2405: Fehler: Syntaxfehler vor ?)?-Zeichen Is it possible to remove the Matlab dependency (mex.h) ? Any pointer would be appreciated ? Thanks in advance. Nils From jeremy at jeremysanders.net Fri Feb 27 05:45:32 2009 From: jeremy at jeremysanders.net (Jeremy Sanders) Date: Fri, 27 Feb 2009 10:45:32 +0000 Subject: [SciPy-user] numpy repr issue Message-ID: Hi - I wonder whether you consider this a bug (numpy 1.2.0 x86_64): In [51]: repr(numpy.arange(10000)) Out[51]: 'array([ 0, 1, 2, ..., 9997, 9998, 9999])' I've found out that you can use numpy.set_printoptions to control this truncation behaviour, but the python documentation says: repr(object)?? Return a string containing a printable representation of an object. This is the same value yielded by conversions (reverse quotes). It is sometimes useful to be able to access this operation as an ordinary function. For many types, this function makes an attempt to return a string that would yield an object with the same value when passed to eval(), otherwise the representation is a string enclosed in angle brackets that contains the name of the type of the object together with additional information often including the name and address of the object. A class can control what this function returns for its instances by defining a __repr__() method. Numpy doesn't obey this. It returns a string you can't do repr on to get back the original data. It's not a string in angle brackets either. I've just hit a bug in pymc where truncated data is stored in its text database. It uses eval to read back the data. This truncation causes silent data loss in numpy-using applications which assume the standard python behaviour of being able to eval a repr'd object. You won't see the problem unless your data starts being greater than 1000 values. Shouldn't numpy change the default truncation so that no data are lost in repr? Jeremy From nwagner at iam.uni-stuttgart.de Fri Feb 27 07:01:26 2009 From: nwagner at iam.uni-stuttgart.de (Nils Wagner) Date: Fri, 27 Feb 2009 13:01:26 +0100 Subject: [SciPy-user] scipy, matlab and NASTRAN In-Reply-To: References: Message-ID: Hi all, Meanwhile I made some progress. 
>>> import loadop4 Traceback (most recent call last): File "", line 1, in ImportError: ./loadop4.so: undefined symbol: sp_copy_column nm loadop4.so | grep sp_copy_column U sp_copy_column 000000000000e091 t _wrap_sp_copy_column How can I resolve that problem ? Nils From sturla at molden.no Fri Feb 27 11:27:36 2009 From: sturla at molden.no (Sturla Molden) Date: Fri, 27 Feb 2009 17:27:36 +0100 Subject: [SciPy-user] Implementation of a parallel cKDTree Message-ID: <49A81478.7050807@molden.no> I have fiddled a bit with scipy.spatial.cKDTree for better performance on multicore CPUs. I have used threading.Thread instead of OpenMP, so no special compilation or compiler is required. The number of threads defaults to the number of processors if it can be determined. The performance is not much different from what I get with OpenMP. It is faster than using cKDTree with multiprocessing and shared memory. Memory handling is also improved. There are checks for NULL pointers returned by malloc or realloc. setjmp/longjmp is used for error handling if malloc or realloc fail. A memory pool is used to make sure all complex data structures are cleaned up properly. I have assumed that crt functions malloc, realloc and free are thread safe. This is usually the case. If they are not, they must be wrapped with calls to PyGILState_Ensure and PyGILState_Release. I have not done this as it could impair scalability. Regards, Sturla Molden -------------- next part -------------- A non-text attachment was scrubbed... Name: ckdtree_mt.pyx Type: / Size: 29331 bytes Desc: not available URL: From wnbell at gmail.com Fri Feb 27 13:58:59 2009 From: wnbell at gmail.com (Nathan Bell) Date: Fri, 27 Feb 2009 13:58:59 -0500 Subject: [SciPy-user] scipy, matlab and NASTRAN In-Reply-To: References: Message-ID: On Fri, Feb 27, 2009 at 7:01 AM, Nils Wagner wrote: > Hi all, > > Meanwhile I made some progress. > >>>> import loadop4 > Traceback (most recent call last): > ? File "", line 1, in > ImportError: ./loadop4.so: undefined symbol: > sp_copy_column > > nm loadop4.so | grep sp_copy_column > ? ? ? ? ? ? ? ? ?U sp_copy_column > 000000000000e091 t _wrap_sp_copy_column > > How can I resolve that problem ? > It looks like sparse.h contains function prototypes that are never defined in a .c file. You should be able to just delete them from the header files. Nils, have you considered using NumPy's IO capabilities instead of using this code? I find IO to be a place where implementing from scratch with Python + NumPy is often easier and just as fast. Plus, you end up with a .py that's easy to share with others. -- Nathan Bell wnbell at gmail.com http://graphics.cs.uiuc.edu/~wnbell/ From matthew.brett at gmail.com Fri Feb 27 14:05:10 2009 From: matthew.brett at gmail.com (Matthew Brett) Date: Fri, 27 Feb 2009 11:05:10 -0800 Subject: [SciPy-user] scipy, matlab and NASTRAN In-Reply-To: References: Message-ID: <1e2af89e0902271105t2cfffc4obdcec2ad8b9927eb@mail.gmail.com> Hi, >>>>> import loadop4 >> Traceback (most recent call last): >> ? File "", line 1, in >> ImportError: ./loadop4.so: undefined symbol: >> sp_copy_column >> >> nm loadop4.so | grep sp_copy_column >> ? ? ? ? ? ? ? ? ?U sp_copy_column >> 000000000000e091 t _wrap_sp_copy_column >> >> How can I resolve that problem ? Is the format defined anywhere? Can you let us know either way about the author's response to relicensing? 
Best, Matthew From nwagner at iam.uni-stuttgart.de Fri Feb 27 14:08:36 2009 From: nwagner at iam.uni-stuttgart.de (Nils Wagner) Date: Fri, 27 Feb 2009 20:08:36 +0100 Subject: [SciPy-user] scipy, matlab and NASTRAN In-Reply-To: <1e2af89e0902271105t2cfffc4obdcec2ad8b9927eb@mail.gmail.com> References: <1e2af89e0902271105t2cfffc4obdcec2ad8b9927eb@mail.gmail.com> Message-ID: On Fri, 27 Feb 2009 11:05:10 -0800 Matthew Brett wrote: > Hi, > >>>>>> import loadop4 >>> Traceback (most recent call last): >>> ? File "", line 1, in >>> ImportError: ./loadop4.so: undefined symbol: >>> sp_copy_column >>> >>> nm loadop4.so | grep sp_copy_column >>> ? ? ? ? ? ? ? ? ?U sp_copy_column >>> 000000000000e091 t _wrap_sp_copy_column >>> >>> How can I resolve that problem ? > > Is the format defined anywhere? Can you let us know >either way about > the author's response to relicensing? > He is forced to use GPL 2 due to the tops derivation. Cheers, Nils From nwagner at iam.uni-stuttgart.de Fri Feb 27 14:11:49 2009 From: nwagner at iam.uni-stuttgart.de (Nils Wagner) Date: Fri, 27 Feb 2009 20:11:49 +0100 Subject: [SciPy-user] scipy, matlab and NASTRAN In-Reply-To: References: <1e2af89e0902271105t2cfffc4obdcec2ad8b9927eb@mail.gmail.com> Message-ID: On Fri, 27 Feb 2009 20:08:36 +0100 "Nils Wagner" wrote: > On Fri, 27 Feb 2009 11:05:10 -0800 > Matthew Brett wrote: >> Hi, >> >>>>>>> import loadop4 >>>> Traceback (most recent call last): >>>> ? File "", line 1, in >>>> ImportError: ./loadop4.so: undefined symbol: >>>> sp_copy_column >>>> >>>> nm loadop4.so | grep sp_copy_column >>>> ? ? ? ? ? ? ? ? ?U sp_copy_column >>>> 000000000000e091 t _wrap_sp_copy_column >>>> >>>> How can I resolve that problem ? >> >> Is the format defined anywhere? Can you let us know >>either way about >> the author's response to relicensing? >> > > He is forced to use GPL 2 due to the tops derivation. > > Cheers, > > Nils I should add the link to tops for completeness. http://savannah.nongnu.org/projects/tops From matthew.brett at gmail.com Fri Feb 27 14:13:44 2009 From: matthew.brett at gmail.com (Matthew Brett) Date: Fri, 27 Feb 2009 11:13:44 -0800 Subject: [SciPy-user] scipy, matlab and NASTRAN In-Reply-To: References: <1e2af89e0902271105t2cfffc4obdcec2ad8b9927eb@mail.gmail.com> Message-ID: <1e2af89e0902271113q74209010v822931946d69700d@mail.gmail.com> Hi >> Is the format defined anywhere? ?Can you let us know >>either way about >> the author's response to relicensing? >> > > He is forced to use GPL 2 due to the tops derivation. That's a killer; do you know where the format is defined? It might be easy to implement in python. Matthew From robert.kern at gmail.com Fri Feb 27 14:19:22 2009 From: robert.kern at gmail.com (Robert Kern) Date: Fri, 27 Feb 2009 13:19:22 -0600 Subject: [SciPy-user] numpy repr issue In-Reply-To: References: Message-ID: <3d375d730902271119p45c5dfc2q8de35b3baf4e4e48@mail.gmail.com> On Fri, Feb 27, 2009 at 04:45, Jeremy Sanders wrote: > Hi - I wonder whether you consider this a bug (numpy 1.2.0 x86_64): > > In [51]: repr(numpy.arange(10000)) > Out[51]: 'array([ ? 0, ? ?1, ? ?2, ..., 9997, 9998, 9999])' > > I've found out that you can use numpy.set_printoptions to control this > truncation behaviour, but the python documentation says: > > repr(object)? > Return a string containing a printable representation of an object. This is > the same value yielded by conversions (reverse quotes). It is sometimes > useful to be able to access this operation as an ordinary function. 
For many > types, this function makes an attempt to return a string that would yield an > object with the same value when passed to eval(), otherwise the > representation is a string enclosed in angle brackets that contains the name > of the type of the object together with additional information often > including the name and address of the object. A class can control what this > function returns for its instances by defining a __repr__() method. > > Numpy doesn't obey this. It returns a string you can't do repr on to get > back the original data. It's not a string in angle brackets either. It is far from a strict commandment. There is absolutely no guarantee. The reason for the truncation by default is interactive use. Numeric used to display everything by default. If you typed the name of a very large array and pressed enter, the interpreter would try to make a full string repr from it and then display it. Since this is done in C code, you could not cancel it from the keyboard. > I've just hit a bug in pymc where truncated data is stored in its text > database. It uses eval to read back the data. PyMC should not be relying on that behavior. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From nwagner at iam.uni-stuttgart.de Fri Feb 27 14:23:04 2009 From: nwagner at iam.uni-stuttgart.de (Nils Wagner) Date: Fri, 27 Feb 2009 20:23:04 +0100 Subject: [SciPy-user] scipy, matlab and NASTRAN In-Reply-To: References: Message-ID: On Fri, 27 Feb 2009 13:58:59 -0500 Nathan Bell wrote: > On Fri, Feb 27, 2009 at 7:01 AM, Nils Wagner > wrote: >> Hi all, >> >> Meanwhile I made some progress. >> >>>>> import loadop4 >> Traceback (most recent call last): >> ? File "", line 1, in >> ImportError: ./loadop4.so: undefined symbol: >> sp_copy_column >> >> nm loadop4.so | grep sp_copy_column >> ? ? ? ? ? ? ? ? ?U sp_copy_column >> 000000000000e091 t _wrap_sp_copy_column >> >> How can I resolve that problem ? >> > > It looks like sparse.h contains function prototypes that >are never > defined in a .c file. You should be able to just delete >them from the > header files. > > Nils, have you considered using NumPy's IO capabilities >instead of > using this code? Not really. I am not familiar with C. Hence I tried swig. I find IO to be a place where >implementing from > scratch with Python + NumPy is often easier and just as >fast. Plus, > you end up with a .py that's easy to share with others. +1. Anyway the source is available at http://danial.org/op4/load_save_op4.tar.gz Nils From matthew.brett at gmail.com Fri Feb 27 14:33:54 2009 From: matthew.brett at gmail.com (Matthew Brett) Date: Fri, 27 Feb 2009 11:33:54 -0800 Subject: [SciPy-user] scipy, matlab and NASTRAN In-Reply-To: References: Message-ID: <1e2af89e0902271133p5e0f6532ub2484fd58056af9c@mail.gmail.com> Hi, > +1. > > Anyway the source is available at > http://danial.org/op4/load_save_op4.tar.gz Yes, but no use to us, because it's GPL. Best, Matthew From nwagner at iam.uni-stuttgart.de Fri Feb 27 14:44:16 2009 From: nwagner at iam.uni-stuttgart.de (Nils Wagner) Date: Fri, 27 Feb 2009 20:44:16 +0100 Subject: [SciPy-user] scipy, matlab and NASTRAN In-Reply-To: <1e2af89e0902271133p5e0f6532ub2484fd58056af9c@mail.gmail.com> References: <1e2af89e0902271133p5e0f6532ub2484fd58056af9c@mail.gmail.com> Message-ID: On Fri, 27 Feb 2009 11:33:54 -0800 Matthew Brett wrote: > Hi, > >> +1. 
>> >> Anyway the source is available at >> http://danial.org/op4/load_save_op4.tar.gz > > Yes, but no use to us, because it's GPL. > > Best, > > Matthew Maybe someone can convince Al Danial to release his code under BSD license. Nils This is what I found on the mailing list: If you are the copyright holder, and if you have not granted exclusive rights to another party, you can release you code under as many different nonexclusive licenses as you please as many times as you please. If you created the code, you are the copyright holder. If others helped create it, you need their "OK", unless they assigned their copyrights to you. So, for example, if you (the copyright holder) released your code under the GPL and now wish to release it under the new BSD license, no problem. (And, bravo!) Many people just hand out their code with a note in the header stating the author and that it is BSD licensed. If you want to be more careful, the note should direct them to an accompanying file (say, LICENSE.txt) in which you copy this template at the *bottom* of http://www.opensource.org/licenses/bsd-license.php and replace and . (And actually, this is short enough---2 paragraphs---that you can put it in every file if you prefer.) Most people nowadays leave out the no endorsement clause, which proved pointless. What does the BSD license do: it roughly says anyone can use or modify the code without conditions except keeping the copyright intact, and you incur no obligation to them or to anyone else they might distribute to. It is a good, simple license. If you leave out the "endorsement clause", as I would, you end up with a license that is essentially the same as the MIT license. MIT is my favorite (for utter simplicity), but for code destined for SciPy, the simplified BSD (i.e., no endorsement clause) is the right choice. From matthew.brett at gmail.com Fri Feb 27 14:47:59 2009 From: matthew.brett at gmail.com (Matthew Brett) Date: Fri, 27 Feb 2009 11:47:59 -0800 Subject: [SciPy-user] scipy, matlab and NASTRAN In-Reply-To: References: <1e2af89e0902271133p5e0f6532ub2484fd58056af9c@mail.gmail.com> Message-ID: <1e2af89e0902271147y64a9e7e0sa9d0589ba059e871@mail.gmail.com> Hi, > Maybe someone can convince Al Danial to release his code > under BSD license. That person would most likely be the person who most wants to use the code. Matthew From zachary.pincus at yale.edu Fri Feb 27 16:47:55 2009 From: zachary.pincus at yale.edu (Zachary Pincus) Date: Fri, 27 Feb 2009 16:47:55 -0500 Subject: [SciPy-user] Some more useful image filters: anisotropic diffusion and Canny edge-finding Message-ID: Hi all, As I've been quite happily using with Nadav's bilateral filtering code, I figured I should reciprocate and post some other useful basic image-processing algorithms that I've recently numpy-ified. Here are first-passes at Canny edge finding and Perona and Malik-style anisotropic diffusion; both are pretty simplistic and not particularly fast, but they work as advertised. Zach import numpy import scipy.ndimage as ndimage # Anisotropic Diffusion, as per Perona and Malik's paper (see section V). def _exp(image_gradient, scale): return numpy.exp(-(numpy.absolute(image_gradient)/scale)**2) def _inv(image_gradient, scale): return 1 / (1 + (numpy.absolute(image_gradient)/scale)**2) def anisotropic_diffusion(image, num_iters=10, scale=10, step_size=0.2, conduction_function=_inv): # 'step_size' is Perona and Malik's lambda parameter; scale is their 'K' parameter. 
# The 'conduction_function' is the function 'g' in the original formulation; # if this function simply returns a constant, the result is Gaussian blurring. if step_size > 0.25: raise ValueError('step_size parameter must be <= 0.25 for numerical stability.') image = image.copy() # simplistic boundary conditions -- no diffusion at the boundary central = image[1:-1, 1:-1] n = image[:-2, 1:-1] s = image[2:, 1:-1] e = image[1:-1, :-2] w = image[1:-1, 2:] directions = [s,e,w] for i in range(num_iters): di = n - central accumulator = conduction_function(di, scale)*di for direction in directions: di = direction - central accumulator += conduction_function(di, scale)*di accumulator *= step_size central += accumulator return image # Canny edge-finding, implemented as per the Wikipedia article # Note that this takes four passes through the image to do the # non-maximal suppression, whereas a c or cython loop could do # it in one. # Filter kernels for calculating the value of neighbors in several directions _N = numpy.array([[0, 1, 0], [0, 0, 0], [0, 1, 0]], dtype=bool) _NE = numpy.array([[0, 0, 1], [0, 0, 0], [1, 0, 0]], dtype=bool) _W = numpy.array([[0, 0, 0], [1, 0, 1], [0, 0, 0]], dtype=bool) _NW = numpy.array([[1, 0, 0], [0, 0, 0], [0, 0, 1]], dtype=bool) # After quantizing the angles, vertical (north-south) edges get values of 3, # northwest-southeast edges get values of 2, and so on, as below: _NE_d = 0 _W_d = 1 _NW_d = 2 _N_d = 3 def canny(image, high_threshold, low_threshold): grad_x = ndimage.sobel(image, 0) grad_y = ndimage.sobel(image, 1) grad_mag = numpy.sqrt(grad_x**2+grad_y**2) grad_angle = numpy.arctan2(grad_y, grad_x) # next, scale the angles in the range [0, 3] and then round to quantize quantized_angle = numpy.around(3 * (grad_angle + numpy.pi) / (numpy.pi * 2)) # Non-maximal suppression: an edge pixel is only good if its magnitude is # greater than its neighbors normal to the edge direction. We quantize # edge direction into four angles, so we only need to look at four # sets of neighbors NE = ndimage.maximum_filter(grad_mag, footprint=_NE) W = ndimage.maximum_filter(grad_mag, footprint=_W) NW = ndimage.maximum_filter(grad_mag, footprint=_NW) N = ndimage.maximum_filter(grad_mag, footprint=_N) thinned = (((g > W) & (quantized_angle == _N_d )) | ((g > N) & (quantized_angle == _W_d )) | ((g > NW) & (quantized_angle == _NE_d)) | ((g > NE) & (quantized_angle == _NW_d)) ) thinned_grad = thinned * grad_mag # Now, hysteresis thresholding: find seeds above a high threshold, then # expand out until we go below the low threshold high = thinned_grad > high_threshold low = thinned_grad > low_threshold canny_edges = ndimage.binary_dilation(high, iterations=-1, mask=low) return grad_mag, thinned_grad, canny_edges From zachary.pincus at yale.edu Fri Feb 27 16:51:38 2009 From: zachary.pincus at yale.edu (Zachary Pincus) Date: Fri, 27 Feb 2009 16:51:38 -0500 Subject: [SciPy-user] Some more useful image filters: anisotropic diffusion and Canny edge-finding In-Reply-To: References: Message-ID: <2BD49B0E-3F29-464D-A3B2-B58F43EB7093@yale.edu> Err... I missed a variable name-change in a few places! 
The canny function should be: def canny(image, high_threshold, low_threshold): grad_x = ndimage.sobel(image, 0) grad_y = ndimage.sobel(image, 1) grad_mag = numpy.sqrt(grad_x**2+grad_y**2) grad_angle = numpy.arctan2(grad_y, grad_x) # next, scale the angles in the range [0, 3] and then round to quantize quantized_angle = numpy.around(3 * (grad_angle + numpy.pi) / (numpy.pi * 2)) # Non-maximal suppression: an edge pixel is only good if its magnitude is # greater than its neighbors normal to the edge direction. We quantize # edge direction into four angles, so we only need to look at four # sets of neighbors NE = ndimage.maximum_filter(grad_mag, footprint=_NE) W = ndimage.maximum_filter(grad_mag, footprint=_W) NW = ndimage.maximum_filter(grad_mag, footprint=_NW) N = ndimage.maximum_filter(grad_mag, footprint=_N) thinned = (((grad_mag > W) & (quantized_angle == _N_d )) | ((grad_mag > N) & (quantized_angle == _W_d )) | ((grad_mag > NW) & (quantized_angle == _NE_d)) | ((grad_mag > NE) & (quantized_angle == _NW_d)) ) thinned_grad = thinned * grad_mag # Now, hysteresis thresholding: find seeds above a high threshold, then # expand out until we go below the low threshold high = thinned_grad > high_threshold low = thinned_grad > low_threshold canny_edges = ndimage.binary_dilation(high, iterations=-1, mask=low) return grad_mag, thinned_grad, canny_edges From zachary.pincus at yale.edu Fri Feb 27 16:57:24 2009 From: zachary.pincus at yale.edu (Zachary Pincus) Date: Fri, 27 Feb 2009 16:57:24 -0500 Subject: [SciPy-user] Some more useful image filters: anisotropic diffusion and Canny edge-finding In-Reply-To: <2BD49B0E-3F29-464D-A3B2-B58F43EB7093@yale.edu> References: <2BD49B0E-3F29-464D-A3B2-B58F43EB7093@yale.edu> Message-ID: Blarg, my pasted code text-wrapped at 70 cols, messing everything up! Sorry for the spam, but here are those functions as an attachment. Zach -------------- next part -------------- A non-text attachment was scrubbed... Name: image_filters.py Type: text/x-python-script Size: 3823 bytes Desc: not available URL: -------------- next part -------------- From Ross.Williamson at usap.gov Fri Feb 27 22:02:30 2009 From: Ross.Williamson at usap.gov (Williamson, Ross) Date: Sat, 28 Feb 2009 16:02:30 +1300 Subject: [SciPy-user] reverse an array Message-ID: Hi everyone Is there an easy way to reverse an array without converting it to a list? i.e I'm currently doing: x = array([0,1,2,3,4]) x = x.tolist() x.reverse() x = array(x) print x : [4,3,2,1,0] Thanks Ross From c-b at asu.edu Fri Feb 27 22:25:08 2009 From: c-b at asu.edu (Christopher Brown) Date: Fri, 27 Feb 2009 20:25:08 -0700 Subject: [SciPy-user] reverse an array In-Reply-To: References: Message-ID: <49A8AE94.3010807@asu.edu> Williamson, Ross wrote: > Hi everyone > > Is there an easy way to reverse an array without converting it to a list? > > i.e I'm currently doing: > > x = array([0,1,2,3,4]) > x = x.tolist() > x.reverse() > x = array(x) > > print x : [4,3,2,1,0] I found the following code snippet: >>> x[::-1] array([4, 3, 2, 1, 0]) at the following (very helpful) web address: http://mathesaurus.sourceforge.net/matlab-numpy.html -- Chris From mcohen at caltech.edu Sat Feb 28 23:53:40 2009 From: mcohen at caltech.edu (Michael Cohen) Date: Sat, 28 Feb 2009 20:53:40 -0800 Subject: [SciPy-user] Numpy/Scipy rfft transformations do not match? Message-ID: <49AA14D4.1040100@caltech.edu> Hi all, Relatively new Scipy/numpy user. 
I have been trying to switch from numpy.fft.rfft calls to scipy.fftpack.rfft calls in order to make use of fftw3, but I find that the array sizes are different. With numpy, the size of the resulting array is n/2+1 where n is the size of the original array. With scipy, the array has length n. Additionally, scipy takes twice as long to compute the fft as numpy, presumably because it is computing twice as many values. Is there a reason for this discrepancy? What do I need to call from scipy to make the calls match? Additionally, how do I check in a compiled & installed copy of scipy whether it is using fftw or its own fftpack routines? Regards, Michael
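For anyone who finds this thread later: both routines return the full one-sided spectrum, just in different layouts. numpy.fft.rfft gives n/2+1 complex values, while scipy.fftpack.rfft packs the same information into a single length-n real array -- [y(0), Re(y(1)), Im(y(1)), ..., Re(y(n/2))] for even n, according to its docstring. A rough, untested sketch of a converter between the two layouts (the function name is invented here):

import numpy
from scipy import fftpack

def packed_to_complex(r):
    # scipy.fftpack.rfft output (packed, length n) -> numpy.fft.rfft layout (n//2 + 1 complex values)
    n = len(r)
    out = numpy.empty(n // 2 + 1, dtype=complex)
    out[0] = r[0]                      # DC term is purely real
    if n % 2 == 0:
        out[1:-1] = r[1:-1:2] + 1j * r[2:-1:2]
        out[-1] = r[-1]                # Nyquist term is purely real
    else:
        out[1:] = r[1::2] + 1j * r[2::2]
    return out

x = numpy.random.rand(16)
print numpy.allclose(packed_to_complex(fftpack.rfft(x)), numpy.fft.rfft(x))   # should print True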