From drnlmuller+scipy at gmail.com Mon Dec 1 07:29:11 2008
From: drnlmuller+scipy at gmail.com (Neil Muller)
Date: Mon, 1 Dec 2008 14:29:11 +0200
Subject: [SciPy-user] fminbound vs. brent
In-Reply-To: <4F87D554-DBC5-4D6B-B2C1-E8AB2EC5E58C@math.toronto.edu>
References: <4F87D554-DBC5-4D6B-B2C1-E8AB2EC5E58C@math.toronto.edu>
Message-ID:

On Mon, Dec 1, 2008 at 2:01 AM, Gideon Simpson wrote:
> Based on the documentation, I'm a bit unclear on how fminbound and
> brent, as optimization algorithms, differ. Could someone clarify this
> for me?

brent doesn't do constrained optimisation, while fminbound does. Consider:

In [1]: import scipy.optimize as opt

In [2]: f = lambda x: x*x

In [3]: opt.brent(f, brack=(3, 4))
Out[3]: 0.0

In [4]: opt.fminbound(f, x1=3, x2=4)
Out[4]: 3.00000596086

Hope that helps.

--
Neil Muller
drnlmuller at gmail.com

From ndbecker2 at gmail.com Mon Dec 1 08:34:52 2008
From: ndbecker2 at gmail.com (Neal Becker)
Date: Mon, 01 Dec 2008 08:34:52 -0500
Subject: [SciPy-user] minimax optimization
Message-ID:

Any suggestions on techniques for minimax optimization?

From david.huard at gmail.com Mon Dec 1 12:58:59 2008
From: david.huard at gmail.com (David Huard)
Date: Mon, 1 Dec 2008 12:58:59 -0500
Subject: [SciPy-user] Is it possible to pass Fortran derived data types to Python via C and SWIG?
In-Reply-To:
References: <113e17f20811292345k7cab3263macda578df9189876@mail.gmail.com> <49325385.9090302@ar.media.kyoto-u.ac.jp>
Message-ID: <91cf711d0812010958v25d8c158va479d6900c6ed37e@mail.gmail.com>

John,

this is something I've wanted to look at. Here is what I had planned to do, so there is no guarantee that it will actually work...

The ISO_C_BINDING module is part of the 2003 standard and allows interoperability between C and Fortran (it is included in the latest gfortran compiler). It allows interoperability of Fortran derived types with C structures (with certain restrictions). For example,

use iso_c_binding
type, bind(c) :: mytype
   real(c_float) :: data
   integer(c_int) :: n
end type

is interoperable with

typedef struct {
   float data;
   int n;
} mytype;

Now, I am just guessing, but if such a module was built into a shared library, maybe it could be accessed from python using ctypes STRUCTURES.

Regards,

David

On Sun, Nov 30, 2008 at 4:38 PM, Berthold Höllmann <berthold at xn--hllmanns-n4a.de> wrote:
> Matthieu Brucher gmail.com> writes:
>
> > 2008/11/30 David Cournapeau ar.media.kyoto-u.ac.jp>:
> > > John Salvatier wrote:
> > >> I have a Fortran 90 algorithm which uses a derived data type to return
> > >> data, and I would like to make a python wrapper for this algorithm. I
> > >> understand that f2py cannot wrap derived data types; is it possible to
> > >> do so with a C interface for the Fortran algorithm and SWIG? I would
> > >> have to pass the derived data type into a C struct and then to Python.
> > >
> > > It is possible as long as you can pass the structure from fortran to C.
> > > I don't know anything about Fortran derived data types, but if it is a
> > > non trivial object (more than a set of fundamental types), I am afraid
> > > it will be difficult. Does F90 support POD data? Otherwise, you will
> > > need a scheme for marshalling your data from Fortran to C (to match
> > > exactly how the structure would look like in C at the binary level).
> > >
> > > David
> >
> > I've read an article (I don't remember where though, possibly CiSE)
> > that stated that it's really not an easy task, as each Fortran
> > compiler can do as it pleases. So depending on the compiler and the
> > Fortran standard, it can be possible, or not. So as there are no
> > guarantees, you should write a function that transforms the Fortran
> > structure into several pieces that are then passed to the C function.
> >
> > Matthieu
>
> A feasible way to achieve this would be to write a Fortran wrapper
> around your routine(s) that decomposes your derived data type to
> standard types and exposes these in the interface. Then you can compose
> the derived data type again in the wrapper and pass it to the original
> routine. ::
>
>   module geom
>     type Point
>        real :: x, y
>     end type Point
>     type Circle
>        type (Point) :: Center
>        real :: Radius
>     end type Circle
>   end module geom
>
>   subroutine test(c)
>     use geom
>     type (Circle) :: c
>     print*, c%Radius
>     print*, c%Center%X
>     print*, c%Center%Y
>   end subroutine test
>
>   subroutine w_test(x, y, r)
>     use geom
>     real :: x, y, r
>     type (Circle) :: C
>     c%Radius = r
>     c%Center%X = x
>     c%Center%Y = y
>     call test(c)
>   end subroutine w_test
>
> Wrapping w_test should be trivial using f2py
>
> Regards
> Berthold
>
> _______________________________________________
> SciPy-user mailing list
> SciPy-user at scipy.org
> http://projects.scipy.org/mailman/listinfo/scipy-user
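A minimal sketch of the Python side of David's ctypes idea, assuming the bind(c) type above is compiled into a shared library with a C-callable routine; the library name (libmytype.so) and routine name (process_mytype) are hypothetical stand-ins:

import ctypes

# field order and types must mirror the bind(c) derived type exactly
class MyType(ctypes.Structure):
    _fields_ = [('data', ctypes.c_float),
                ('n', ctypes.c_int)]

lib = ctypes.CDLL('./libmytype.so')      # hypothetical library name
rec = MyType(data=1.5, n=3)
lib.process_mytype(ctypes.byref(rec))    # hypothetical bind(c) routine
print rec.data, rec.n                    # fields updated in place by Fortran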
From ferrell at diablotech.com Mon Dec 1 13:44:52 2008
From: ferrell at diablotech.com (Robert Ferrell)
Date: Mon, 1 Dec 2008 11:44:52 -0700
Subject: [SciPy-user] scikits.timeseries
In-Reply-To: <94379D99-3429-4A6F-B3FA-8613ED16679B@gmail.com>
References: <99B5C565-967B-43AB-A978-F0F740B31FB8@gmail.com> <94379D99-3429-4A6F-B3FA-8613ED16679B@gmail.com>
Message-ID: <9FE89EE8-377E-49EA-AA9A-30D219E1D4FB@diablotech.com>

On Nov 28, 2008, at 12:03 PM, Pierre GM wrote:
> Robert:
> It's always easier to manipulate series without missing data. The trick
> I gave you earlier about computing a moving average after having
> removed the missing dates was just that, a trick. However, I'm
> confident it should work.

It does work quite well. When I plot I have a few holes in the data (at holidays), but that's about the only issue I haven't resolved.

> Unfortunately, there's no easy way to define new frequencies, and it's
> not on our todo list either. Frequencies are defined in the C part of
> the code...

How do you (or other users) use the Business frequency?

Also, I get this error when I use tsplot:

---------------------------------------------------------------------------
Traceback (most recent call last)
/Users/Shared/Develop/Financial/ in ()
/Library/Frameworks/Python.framework/Versions/2.5.2001/lib/python2.5/site-packages/scikits/timeseries/lib/plotlib.py in tsplot(self, *args, **kwargs)
   1021     # when adding a right axis (using add_yaxis), for some reason the
   1022     # x axis limits don't get properly set. This gets around the problem
-> 1023     if self.get_xlim().tolist() == [0., 1.]:
   1024         # if xlim still at default values, autoscale the axis
   1025         self.autoscale_view()
: 'tuple' object has no attribute 'tolist'

That comes up no matter what kind of data or frequency I'm using (full, valid, etc...). Is that possibly why the cursor won't give me x axis position when I mouse around?

thanks again,
-robert

From pgmdevlist at gmail.com Mon Dec 1 13:54:29 2008
From: pgmdevlist at gmail.com (Pierre GM)
Date: Mon, 1 Dec 2008 13:54:29 -0500
Subject: [SciPy-user] scikits.timeseries
In-Reply-To: <9FE89EE8-377E-49EA-AA9A-30D219E1D4FB@diablotech.com>
References: <99B5C565-967B-43AB-A978-F0F740B31FB8@gmail.com> <94379D99-3429-4A6F-B3FA-8613ED16679B@gmail.com> <9FE89EE8-377E-49EA-AA9A-30D219E1D4FB@diablotech.com>
Message-ID: <5E75474A-5987-4B7B-97F1-EA608A08C3C6@gmail.com>

On Dec 1, 2008, at 1:44 PM, Robert Ferrell wrote:
>> Unfortunately, there's no easy way to define new frequencies, and it's
>> not on our todo list either. Frequencies are defined in the C part of
>> the code...
>
> How do you (or other users) use the Business frequency?

I'll let other users answer that. I never used that frequency myself.

> Also, I get this error when I use tsplot:

Looks familiar... What version of matplotlib and scikits.timeseries are you using?

> That comes up no matter what kind of data or frequency I'm using
> (full, valid, etc...). Is that possibly why the cursor won't give me
> x axis position when I mouse around?

No. I never took the time to find out why I can't get the x axis position under the cursor either, but the two issues are unrelated: the error you see comes from an update of matplotlib that hasn't been ported yet to scikits.timeseries.
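Judging from the traceback above, newer matplotlib returns a plain tuple from get_xlim(), which has no .tolist() method. A version-tolerant form of the failing check (an untested sketch against plotlib.py, not an official scikits.timeseries fix) would coerce with list() instead:

# scikits/timeseries/lib/plotlib.py, inside tsplot()
if list(self.get_xlim()) == [0., 1.]:
    # if xlim still at default values, autoscale the axis
    self.autoscale_view()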
From h5py at alfven.org Mon Dec 1 15:09:56 2008
From: h5py at alfven.org (Andrew Collette)
Date: Mon, 01 Dec 2008 12:09:56 -0800
Subject: [SciPy-user] HDF5 for Python 1.0
Message-ID: <1228162196.6960.8.camel@tachyon-laptop>

Thought this might be of interest to the scipy crowd... Like PyTables it lets you store array data in a hierarchical format, and perform slicing and partial I/O, but it has a simpler, NumPy-oriented interface and also provides access to the majority of the HDF5 C API. However, it doesn't have the database-style indexing and query support of PyTables.

=====================================
Announcing HDF5 for Python (h5py) 1.0
=====================================

What is h5py?
-------------

HDF5 for Python (h5py) is a general-purpose Python interface to the Hierarchical Data Format library, version 5. HDF5 is a versatile, mature scientific software library designed for the fast, flexible storage of enormous amounts of data.

From a Python programmer's perspective, HDF5 provides a robust way to store data, organized by name in a tree-like fashion. You can create datasets (arrays on disk) hundreds of gigabytes in size, and perform random-access I/O on desired sections. Datasets are organized in a filesystem-like hierarchy using containers called "groups", and accessed using the traditional POSIX /path/to/resource syntax.

This is the fourth major release of h5py, and represents the end of the "unstable" (0.X.X) design phase.

Why should I use it?
--------------------

H5py provides a simple, robust read/write interface to HDF5 data from Python. Existing Python and NumPy concepts are used for the interface; for example, datasets on disk are represented by a proxy class that supports slicing, and has dtype and shape attributes. HDF5 groups are presented using a dictionary metaphor, indexed by name.

A major design goal of h5py is interoperability; you can read your existing data in HDF5 format, and create new files that any HDF5-aware program can understand. No Python-specific extensions are used; you're free to implement whatever file structure your application desires.

Almost all HDF5 features are available from Python, including things like compound datatypes (as used with NumPy recarray types), HDF5 attributes, hyperslab and point-based I/O, and more recent features in HDF 1.8 like resizable datasets and recursive iteration over entire files.

The foundation of h5py is a near-complete wrapping of the HDF5 C API. HDF5 identifiers are first-class objects which participate in Python reference counting, and expose the C API via methods. This low-level interface is also made available to Python programmers, and is exhaustively documented.

See the Quick-Start Guide for a longer introduction with code examples:

http://h5py.alfven.org/docs/guide/quick.html

Where to get it
---------------

* Main website, documentation: http://h5py.alfven.org
* Downloads, bug tracker: http://h5py.googlecode.com
* The HDF group website also contains a good introduction:
  http://www.hdfgroup.org/HDF5/doc/H5.intro.html

Requires
--------

* UNIX-like platform (Linux or Mac OS-X); Windows version is in progress.
* Python 2.5 or 2.6
* NumPy 1.0.3 or later (1.1.0 or later recommended)
* HDF5 1.6.5 or later, including 1.8. Some features only available when compiled against HDF5 1.8.
* Optionally, Cython (see cython.org) if you want to use custom install options. You'll need version 0.9.8.1.1 or later.

About this version
------------------

Version 1.0 follows version 0.3.1 as the latest public release. The major design phase (which began in May of 2008) is now over; the design of the high-level API will be supported as-is for the rest of the 1.X series, with minor enhancements. This is the first version to support Python 2.6, and the first to use Cython for the low-level interface. The license remains 3-clause BSD.

** This project is NOT affiliated with The HDF Group. **

Thanks
------

Thanks to D. Dale, E. Lawrence and others for their continued support and comments. Also thanks to the PyTables project, for inspiration and generously providing their code to the community, and to everyone at the HDF Group for creating such a useful piece of software.
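A small taste of the high-level interface described above, in the spirit of the quick-start guide (a sketch only; the file and object names are made up, and keyword details may differ slightly between versions):

import numpy as np
import h5py

f = h5py.File('example.h5', 'w')                 # new HDF5 file
dset = f.create_dataset('mydata', data=np.arange(100.0))
print dset.shape, dset.dtype                     # NumPy-style attributes
print dset[10:20]                                # slicing does partial I/O
grp = f.create_group('subgroup')                 # groups act like dictionaries
grp['more'] = np.ones((3, 3))
print f['/subgroup/more'][...]                   # POSIX-style path access
f.close()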
From ferrell at diablotech.com Mon Dec 1 15:21:52 2008
From: ferrell at diablotech.com (Robert Ferrell)
Date: Mon, 1 Dec 2008 13:21:52 -0700
Subject: [SciPy-user] scikits.timeseries
In-Reply-To: <5E75474A-5987-4B7B-97F1-EA608A08C3C6@gmail.com>
References: <99B5C565-967B-43AB-A978-F0F740B31FB8@gmail.com> <94379D99-3429-4A6F-B3FA-8613ED16679B@gmail.com> <9FE89EE8-377E-49EA-AA9A-30D219E1D4FB@diablotech.com> <5E75474A-5987-4B7B-97F1-EA608A08C3C6@gmail.com>
Message-ID: <6E8EF7CD-1431-421E-9870-1BF3C76258C8@diablotech.com>

On Dec 1, 2008, at 11:54 AM, Pierre GM wrote:
> On Dec 1, 2008, at 1:44 PM, Robert Ferrell wrote:
>>> Unfortunately, there's no easy way to define new frequencies, and it's
>>> not on our todo list either. Frequencies are defined in the C part of
>>> the code...
>>
>> How do you (or other users) use the Business frequency?
>
> I'll let other users answer that. I never used that frequency myself.
>
>> Also, I get this error when I use tsplot:
>
> Looks familiar... What version of matplotlib and scikits.timeseries
> are you using?

In [741]: matplotlib.__version__
Out[741]: '0.98.3'

In [742]: ts.__version__
Out[742]: '0.67.0.dev-r1570'

>> That comes up no matter what kind of data or frequency I'm using
>> (full, valid, etc...). Is that possibly why the cursor won't give me
>> x axis position when I mouse around?
>
> No. I never took the time to find out why I can't get the x axis
> position under the cursor either, but the two issues are unrelated:
> the error you see comes from an update of matplotlib that hasn't been
> ported yet to scikits.timeseries.

The error seems benign enough that I can ignore it.

-robert

From pgmdevlist at gmail.com Mon Dec 1 17:57:19 2008
From: pgmdevlist at gmail.com (Pierre GM)
Date: Mon, 1 Dec 2008 17:57:19 -0500
Subject: [SciPy-user] scikits.timeseries
In-Reply-To: <6E8EF7CD-1431-421E-9870-1BF3C76258C8@diablotech.com>
References: <99B5C565-967B-43AB-A978-F0F740B31FB8@gmail.com> <94379D99-3429-4A6F-B3FA-8613ED16679B@gmail.com> <9FE89EE8-377E-49EA-AA9A-30D219E1D4FB@diablotech.com> <5E75474A-5987-4B7B-97F1-EA608A08C3C6@gmail.com> <6E8EF7CD-1431-421E-9870-1BF3C76258C8@diablotech.com>
Message-ID: <7E9BA46D-1276-49BB-BD13-2F22D42A89BE@gmail.com>

Robert,
Thx a lot for reporting, I'll take a better look ASAP.

On Dec 1, 2008, at 3:21 PM, Robert Ferrell wrote:
> On Dec 1, 2008, at 11:54 AM, Pierre GM wrote:
>> On Dec 1, 2008, at 1:44 PM, Robert Ferrell wrote:
>>>> Unfortunately, there's no easy way to define new frequencies, and it's
>>>> not on our todo list either. Frequencies are defined in the C part of
>>>> the code...
>>>
>>> How do you (or other users) use the Business frequency?
>>
>> I'll let other users answer that. I never used that frequency myself.
>>
>>> Also, I get this error when I use tsplot:
>>
>> Looks familiar... What version of matplotlib and scikits.timeseries
>> are you using?
>
> In [741]: matplotlib.__version__
> Out[741]: '0.98.3'
>
> In [742]: ts.__version__
> Out[742]: '0.67.0.dev-r1570'
>
>>> That comes up no matter what kind of data or frequency I'm using
>>> (full, valid, etc...). Is that possibly why the cursor won't give me
>>> x axis position when I mouse around?
>>
>> No. I never took the time to find out why I can't get the x axis
>> position under the cursor either, but the two issues are unrelated:
>> the error you see comes from an update of matplotlib that hasn't been
>> ported yet to scikits.timeseries.
>
> The error seems benign enough that I can ignore it.
>
> -robert
>
> _______________________________________________
> SciPy-user mailing list
> SciPy-user at scipy.org
> http://projects.scipy.org/mailman/listinfo/scipy-user
From daniel.wheeler2 at gmail.com Mon Dec 1 18:53:36 2008
From: daniel.wheeler2 at gmail.com (Daniel Wheeler)
Date: Mon, 1 Dec 2008 18:53:36 -0500
Subject: [SciPy-user] weave problems, weave_imp.o no such file or directory
In-Reply-To:
References: <4871E500.6070704@olfac.univ-lyon1.fr>
Message-ID: <80b160a0812011553t1f1a312x359d460acfd3d215@mail.gmail.com>

I have a similar issue when using pythonxy (2.1.4) (python version 2.5.2) and windows. The following

import scipy
print 'scipy.__version__', scipy.__version__
print 'scipy.__path__', scipy.__path__
from scipy import weave
weave.inline('printf("hello world");', verbose=2)

returns

scipy.__version__ 0.6.0
scipy.__path__ ['C:\\Program Files\\pythonxy\\python\\lib\\site-packages\\scipy']
kw {'extra_link_args': [], 'define_macros': [], 'libraries': [], 'sources': ['C:\\Program Files\\pythonxy\\python\\lib\\site-packages\\scipy\\weave\\scxx\\weave_imp.cpp'], 'extra_compile_args': [], 'library_dirs': [], 'include_dirs': ['C:\\Program Files\\pythonxy\\python\\lib\\site-packages\\scipy\\weave', 'C:\\Program Files\\pythonxy\\python\\lib\\site-packages\\scipy\\weave\\scxx']}
running build_ext
running build_src
building extension "sc_5c84b188757e017720cf8a0a3b0555304" sources
customize Mingw32CCompiler
customize Mingw32CCompiler using build_ext
customize Mingw32CCompiler
customize Mingw32CCompiler using build_ext
building 'sc_5c84b188757e017720cf8a0a3b0555304' extension
compiling C++ sources
C compiler: g++ -mno-cygwin -O2 -Wall
compile options: '-I"C:\Program Files\pythonxy\python\lib\site-packages\scipy\weave" -I"C:\Program Files\pythonxy\python\lib\site-packages\scipy\weave\scxx" -I"C:\Program Files\pythonxy\python\lib\site-packages\numpy\core\include" -I"C:\Program Files\pythonxy\python\include" -I"C:\Program Files\pythonxy\python\PC" -c'
g++ -mno-cygwin -O2 -Wall -I"C:\Program Files\pythonxy\python\lib\site-packages\scipy\weave" -I"C:\Program Files\pythonxy\python\lib\site-packages\scipy\weave\scxx" -I"C:\Program Files\pythonxy\python\lib\site-packages\numpy\core\include" -I"C:\Program Files\pythonxy\python\include" -I"C:\Program Files\pythonxy\python\PC" -c c:\docume~1\wd15\locals~1\temp\wd15\python25_compiled\sc_5c84b188757e017720cf8a0a3b0555304.cpp -o c:\docume~1\wd15\locals~1\temp\wd15\python25_intermediate\compiler_08edc7e348e1c33f63a33ab500aef08e\Release\docume~1\wd15\locals~1\temp\wd15\python25_compiled\sc_5c84b188757e017720cf8a0a3b0555304.o
Found executable C:\MinGW\bin\g++.exe
g++ -mno-cygwin -shared c:\docume~1\wd15\locals~1\temp\wd15\python25_intermediate\compiler_08edc7e348e1c33f63a33ab500aef08e\Release\docume~1\wd15\locals~1\temp\wd15\python25_compiled\sc_5c84b188757e017720cf8a0a3b0555304.o c:\docume~1\wd15\locals~1\temp\wd15\python25_intermediate\compiler_08edc7e348e1c33f63a33ab500aef08e\Release\program files\pythonxy\python\lib\site-packages\scipy\weave\scxx\weave_imp.o -L"C:\Program Files\pythonxy\python\libs" -L"C:\Program Files\pythonxy\python\PCBuild" -lpython25 -lmsvcr71 -o c:\docume~1\wd15\locals~1\temp\wd15\python25_compiled\sc_5c84b188757e017720cf8a0a3b0555304.pyd
g++.exe: files\pythonxy\python\lib\site-packages\scipy\weave\scxx\weave_imp.o: No such file or directory

On Mon, Jul 7, 2008 at 2:31 PM, Søren Nielsen wrote:
> I have a space in my path too... So thats why it works on some computers and
> not on others...

--
Daniel Wheeler
From josef.pktd at gmail.com Mon Dec 1 19:23:22 2008
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Mon, 1 Dec 2008 19:23:22 -0500
Subject: [SciPy-user] weave problems, weave_imp.o no such file or directory
In-Reply-To: <80b160a0812011553t1f1a312x359d460acfd3d215@mail.gmail.com>
References: <4871E500.6070704@olfac.univ-lyon1.fr> <80b160a0812011553t1f1a312x359d460acfd3d215@mail.gmail.com>
Message-ID: <1cd32cbb0812011623i5385e7aes3c6bcd5bb5918d54@mail.gmail.com>

This looks like a spaces-in-path problem: your problematic path mixes short windows names and long windows names and is not in quotes ("..."):

c:\docume~1\wd15\locals~1\temp\wd15\python25_intermediate\compiler_08edc7e348e1c33f63a33ab500aef08e\Release\program files\pythonxy\python\lib\site-packages\scipy\weave\scxx\weave_imp.o

I tried the same example on my Windows XP, where neither the python nor the scipy path (which is not in the Python directory) has spaces, and it compiles without errors.

As a relatively quick fix, I would move scipy to a path without spaces: just move the directory and link to the new parent directory in easy-install.pth.

A quick look at my compilation log shows that only spaces in the scipy.weave path are relevant. The spaces in the python path shouldn't be a problem, because your link and include directories, e.g. -L"C:\Program Files\pythonxy\python\libs", all seem to be correctly quoted for the Windows command shell.

I usually avoid any paths with spaces, because getting the quoting always right is a pain with programs that don't put in the correct quotes, e.g. python's subprocess.Popen.

Josef
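For what it's worth, the list form of subprocess sidesteps most of that quoting pain, since no shell parsing happens at all (a generic sketch, not tied to weave; the path here is made up):

import subprocess

# argv as a list: spaces in paths survive without any manual quoting
p = subprocess.Popen(['g++', '-c', r'C:\Program Files\some dir\file.cpp'],
                     stdout=subprocess.PIPE, stderr=subprocess.PIPE)
out, err = p.communicate()
print p.returncode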
From jsalvatier at gmail.com Mon Dec 1 20:56:20 2008
From: jsalvatier at gmail.com (John Sal)
Date: Mon, 1 Dec 2008 17:56:20 -0800 (PST)
Subject: [SciPy-user] Is it possible to pass Fortran derived data types to Python via C and SWIG?
In-Reply-To: <91cf711d0812010958v25d8c158va479d6900c6ed37e@mail.gmail.com>
References: <113e17f20811292345k7cab3263macda578df9189876@mail.gmail.com> <49325385.9090302@ar.media.kyoto-u.ac.jp> <91cf711d0812010958v25d8c158va479d6900c6ed37e@mail.gmail.com>
Message-ID: <075c435a-bb50-4d9c-9063-df890c6c9a9c@y1g2000pra.googlegroups.com>

Thank you all for your help. I think that writing a set of subroutines that creates derived data types from arguments passed to it and vice versa is probably my best bet, but I may try Huard's solution.

On Dec 1, 9:58 am, "David Huard" wrote:
> John,
>
> this is something I've wanted to look at. Here is what I had planned to do,
> so there is no guarantee that it will actually work...
>
> The ISO_C_BINDING module is part of the 2003 standard and allows
> interoperability between C and Fortran (it is included in the latest
> gfortran compiler). It allows interoperability of Fortran derived types with
> C structures (with certain restrictions). For example,
>
> use iso_c_binding
> type, bind(c) :: mytype
>    real(c_float) :: data
>    integer(c_int) :: n
> end type
>
> is interoperable with
>
> typedef struct {
>    float data;
>    int n;
> } mytype;
>
> Now, I am just guessing, but if such a module was built into a shared
> library, maybe it could be accessed from python using ctypes STRUCTURES.
>
> Regards,
>
> David
>
> On Sun, Nov 30, 2008 at 4:38 PM, Berthold Höllmann <
> berth... at xn--hllmanns-n4a.de> wrote:
> > Matthieu Brucher gmail.com> writes:
>
> > > 2008/11/30 David Cournapeau ar.media.kyoto-u.ac.jp>:
> > > > John Salvatier wrote:
> > > >> I have a Fortran 90 algorithm which uses a derived data type to return
> > > >> data, and I would like to make a python wrapper for this algorithm. I
> > > >> understand that f2py cannot wrap derived data types; is it possible to
> > > >> do so with a C interface for the Fortran algorithm and SWIG? I would
> > > >> have to pass the derived data type into a C struct and then to Python.
>
> > > > It is possible as long as you can pass the structure from fortran to C.
> > > > I don't know anything about Fortran derived data types, but if it is a
> > > > non trivial object (more than a set of fundamental types), I am afraid
> > > > it will be difficult. Does F90 support POD data? Otherwise, you will
> > > > need a scheme for marshalling your data from Fortran to C (to match
> > > > exactly how the structure would look like in C at the binary level).
>
> > > > David
>
> > > I've read an article (I don't remember where though, possibly CiSE)
> > > that stated that it's really not an easy task, as each Fortran
> > > compiler can do as it pleases. So depending on the compiler and the
> > > Fortran standard, it can be possible, or not. So as there are no
> > > guarantees, you should write a function that transforms the Fortran
> > > structure into several pieces that are then passed to the C function.
>
> > > Matthieu
>
> > A feasible way to achieve this would be to write a Fortran wrapper
> > around your routine(s) that decomposes your derived data type to
> > standard types and exposes these in the interface. Then you can compose
> > the derived data type again in the wrapper and pass it to the original
> > routine. ::
>
> >   module geom
> >     type Point
> >        real :: x, y
> >     end type Point
> >     type Circle
> >        type (Point) :: Center
> >        real :: Radius
> >     end type Circle
> >   end module geom
> >   subroutine test(c)
> >     use geom
> >     type (Circle) :: c
> >     print*, c%Radius
> >     print*, c%Center%X
> >     print*, c%Center%Y
> >   end subroutine test
> >   subroutine w_test(x, y, r)
> >     use geom
> >     real :: x, y, r
> >     type (Circle) :: C
> >     c%Radius = r
> >     c%Center%X = x
> >     c%Center%Y = y
> >     call test(c)
> >   end subroutine w_test
>
> > Wrapping w_test should be trivial using f2py
>
> > Regards
> > Berthold
>
> > _______________________________________________
> > SciPy-user mailing list
> > SciPy-u... at scipy.org
> > http://projects.scipy.org/mailman/listinfo/scipy-user
>
> _______________________________________________
> SciPy-user mailing list
> SciPy-u... at scipy.org
> http://projects.scipy.org/mailman/listinfo/scipy-user
From jsalvatier at gmail.com Mon Dec 1 21:11:38 2008
From: jsalvatier at gmail.com (John Salvatier)
Date: Mon, 1 Dec 2008 18:11:38 -0800 (PST)
Subject: [SciPy-user] minimax optimization
In-Reply-To:
References:
Message-ID: <49bf56d9-1eb0-4e68-a099-d3ac047c0927@l33g2000pri.googlegroups.com>

Well, you can just nest two optimize.fmin calls (scipy.optimize has no fmax, so the outer maximization is done by minimizing the negative). If F is the function you want to minmax:

from scipy import optimize

def Fminimum(x):
    y0 = comeUpWithSomeStartingPoint()
    # full_output=1 so we return the minimum value, not just the minimizer
    return optimize.fmin(F, y0, args=(x,), full_output=1)[1]

Xminmax = optimize.fmin(lambda x: -Fminimum(x), x0)

I am sure there are much better algorithms for this, though.

On Dec 1, 5:34 am, Neal Becker wrote:
> Any suggestions on techniques for minimax optimization?
>
> _______________________________________________
> SciPy-user mailing list
> SciPy-u... at scipy.org
> http://projects.scipy.org/mailman/listinfo/scipy-user
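For the common discrete flavor of the problem (minimize the worst of a handful of objectives), one crude option along the same lines is to hand the pointwise max straight to a derivative-free minimizer; a small self-contained sketch (the quadratics are made-up objectives, not from the thread):

from scipy import optimize

# minimize max(f1(x), f2(x)); the max of smooth functions is
# continuous but kinked, which Nelder-Mead tolerates reasonably well
f1 = lambda x: (x - 1.0)**2
f2 = lambda x: (x + 1.0)**2
worst = lambda x: max(f1(x[0]), f2(x[0]))

xopt = optimize.fmin(worst, 0.5, disp=False)
print xopt  # close to 0.0, where the two parabolas cross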
From lepto.python at gmail.com Tue Dec 2 02:36:12 2008
From: lepto.python at gmail.com (oyster)
Date: Tue, 2 Dec 2008 15:36:12 +0800
Subject: [SciPy-user] scipy on old CPU crashes
Message-ID: <6a4f17690812012336vee84c7bw9c53477f5b811173@mail.gmail.com>

sorry, but scipy-0.7.0b1-win32-superpack-python2.4.exe and numpy-1.2.1-win32-superpack-python2.4.exe crash on my old pc too, which uses a duron 750MHz. So now I think it is not a problem with the non-sse/sse/sse2 instructions.

is there any method to find out the real reason except to compile from the source?

thanx
(I know Duron is too old, but currently I don't have the money to buy a new PC or even update it :( so bad)

[code]
from scipy.integrate import quad
print quad(lambda x:x, 1, 2) #crash soon
[/code]

> Date: Thu, 13 Nov 2008 12:41:33 +0900
> From: "David Cournapeau"
> Subject: Re: [SciPy-user] scipy on old CPU crashes
> To: "SciPy Users List"
> Message-ID:
> <5b8d13220811121941va8442f2gcb3a997874878b4b at mail.gmail.com>
> Content-Type: text/plain; charset=UTF-8
>
> On Thu, Nov 13, 2008 at 12:16 PM, oyster wrote:
> > hi, all
> > I am using an old AMD Duron CPU with Win2k, which seems does not
> > support SSE/SSE2
>
> indeed, old Duron does not support SSE IIRC.
>
> > I found that there are 3 versions in
> > numpy-1.2.1-win32-superpack-python2.5.exe (numpy-1.2.1-sse3.exe,
> > numpy-1.2.1-sse2.exe and numpy-1.2.1-nosse.exe)
>
> Yep, the superpack is just a simple wrapper around the correct
> installer, nothing fancy.
>
> > Is there a precompiled scipy that judges nosse/sse/sse2 automatically?
>
> No, but there will be for 0.7, which hopefully is only days away now.
>
> > or is there a way to
> > change ATLAS only according to my CPU?
>
> Unfortunately not without rebuilding scipy yourself. Win32 binaries
> are built by linking atlas statically.
>
> David

From textdirected at gmail.com Tue Dec 2 07:02:38 2008
From: textdirected at gmail.com (HEMMI, Shigeru)
Date: Tue, 2 Dec 2008 21:02:38 +0900
Subject: [SciPy-user] scipy-0.7.0b1... error: file 'ARPACK/FWRAPPERS/veclib_cabi_c.c' does not exist
Message-ID:

Hello, I am using Mac OS X 10.3. scipy-0.7.0b1 build failed with the message:

error: file 'ARPACK/FWRAPPERS/veclib_cabi_c.c' does not exist

Regards,
From glen.shennan at gmail.com Tue Dec 2 07:34:29 2008
From: glen.shennan at gmail.com (Glen Shennan)
Date: Tue, 2 Dec 2008 23:34:29 +1100
Subject: [SciPy-user] Building Scipy from source
Message-ID:

Hello,

I'm new to Scipy (and Linux in general) and am trying to build Scipy from source, following the directions on the official installation guide, beginning at the section titled "Building everything from source with gfortran on Ubuntu". I am running Ubuntu 8.04 (Debian, kernel version 2.6.24-22) on a dual-core AMD 64 bit machine. I can get through the building of lapack, ATLAS, UMFPACK and FFTW without problems, but I can't finish off the numpy/scipy compile and was hoping someone here could enlighten me. Numpy compiles, but I'm not sure that it is working as intended/required, and the scipy build produces a huge string of errors ending with:

01: error: 'result' undeclared (first use in this function)
build/src.linux-x86_64-2.5/scipy/sparse/linalg/dsolve/umfpack/_umfpack_wrap.c:6201: warning: implicit declaration of function 'umfpack_zl_get_numeric'
build/src.linux-x86_64-2.5/scipy/sparse/linalg/dsolve/umfpack/_umfpack_wrap.c:6202: error: expected expression before ')' token
build/src.linux-x86_64-2.5/scipy/sparse/linalg/dsolve/umfpack/_umfpack_wrap.c:6202: error: too few arguments to function 'SWIG_Python_NewPointerObj'
error: Command "gcc -pthread -fno-strict-aliasing -DNDEBUG -g -fwrapv -O2 -Wall -Wstrict-prototypes -fPIC -DSCIPY_UMFPACK_H -DSCIPY_AMD_H -DATLAS_INFO="\"3.8.2\"" -I/home/glen/scipy_build/lib/include -I/usr/lib/python2.5/site-packages/numpy/core/include -I/usr/include/python2.5 -c build/src.linux-x86_64-2.5/scipy/sparse/linalg/dsolve/umfpack/_umfpack_wrap.c -o build/temp.linux-x86_64-2.5/build/src.linux-x86_64-2.5/scipy/sparse/linalg/dsolve/umfpack/_umfpack_wrap.o" failed with exit status 1

There are a few hundred similar errors, mostly of the '01:' form, prior to this. I've managed to work through all other installation problems but cannot find any information on this sort of problem.

To build numpy and scipy I used site.cfg.example and modified it according to the instructions on the aforementioned guide, putting one site.cfg file in each of the numpy and scipy source directories (downloaded using svn from the official repositories), and then issued

sudo python setup.py build
sudo python setup.py install

in the numpy directory, ~/numpy/. This compiled numpy, but it gave many, many warnings of the "unused variable" and "function is not a prototype" sort. I then issued the same commands in the scipy source directory, ~/scipy/.

Can anyone point me to a place I might find an explanation, or more detailed instructions on how to build numpy and scipy specifically? I assume there's more information required that I haven't provided; just let me know what's needed.

Thanks in advance for any help.

Glen

From wnbell at gmail.com Tue Dec 2 07:58:48 2008
From: wnbell at gmail.com (Nathan Bell)
Date: Tue, 2 Dec 2008 07:58:48 -0500
Subject: [SciPy-user] Building Scipy from source
In-Reply-To:
References:
Message-ID:

On Tue, Dec 2, 2008 at 7:34 AM, Glen Shennan wrote:
>
> 01: error: 'result' undeclared (first use in this function)
> build/src.linux-x86_64-2.5/scipy/sparse/linalg/dsolve/umfpack/_umfpack_wrap.c:6201: warning: implicit declaration of function 'umfpack_zl_get_numeric'
> build/src.linux-x86_64-2.5/scipy/sparse/linalg/dsolve/umfpack/_umfpack_wrap.c:6202: error: expected expression before ')' token
> build/src.linux-x86_64-2.5/scipy/sparse/linalg/dsolve/umfpack/_umfpack_wrap.c:6202: error: too few arguments to function 'SWIG_Python_NewPointerObj'
> error: Command "gcc -pthread -fno-strict-aliasing -DNDEBUG -g -fwrapv -O2

This looks like a SWIG error. What version of SWIG do you have (run $ swig -version)? Does installing a more recent version ( http://www.swig.org/ ) solve the problem?
--
Nathan Bell
wnbell at gmail.com
http://graphics.cs.uiuc.edu/~wnbell/

From glen.shennan at gmail.com Tue Dec 2 09:08:10 2008
From: glen.shennan at gmail.com (Glen Shennan)
Date: Wed, 3 Dec 2008 01:08:10 +1100
Subject: [SciPy-user] Building Scipy from source
In-Reply-To:
References:
Message-ID:

I installed swig 1.3.36 and tried again; it produced the same error. In both site.cfg files I have the entries

[atlas]
atlas_libs = lapack, f77blas, cblas, atlas

[blas_opt]
libraries = f77blas, cblas, atlas

[lapack_opt]
libraries = lapack, f77blas, cblas, atlas

though the remarks in site.cfg.example say "Some other sections still exist for linking against certain optimized libraries (e.g. [atlas], [lapack_atlas]), however, they are now deprecated and should not be used." I have been trying different combinations of the above entries (removing [atlas], then removing the other two) but have not had any success with this either.

Glen

2008/12/2 Nathan Bell:
> On Tue, Dec 2, 2008 at 7:34 AM, Glen Shennan wrote:
> >
> > 01: error: 'result' undeclared (first use in this function)
> > build/src.linux-x86_64-2.5/scipy/sparse/linalg/dsolve/umfpack/_umfpack_wrap.c:6201: warning: implicit declaration of function 'umfpack_zl_get_numeric'
> > build/src.linux-x86_64-2.5/scipy/sparse/linalg/dsolve/umfpack/_umfpack_wrap.c:6202: error: expected expression before ')' token
> > build/src.linux-x86_64-2.5/scipy/sparse/linalg/dsolve/umfpack/_umfpack_wrap.c:6202: error: too few arguments to function 'SWIG_Python_NewPointerObj'
> > error: Command "gcc -pthread -fno-strict-aliasing -DNDEBUG -g -fwrapv -O2
>
> This looks like a SWIG error. What version of SWIG do you have (run
> $ swig -version)? Does installing a more recent version (
> http://www.swig.org/ ) solve the problem?
>
> --
> Nathan Bell
> wnbell at gmail.com
> http://graphics.cs.uiuc.edu/~wnbell/
>
> _______________________________________________
> SciPy-user mailing list
> SciPy-user at scipy.org
> http://projects.scipy.org/mailman/listinfo/scipy-user

From cournape at gmail.com Tue Dec 2 09:29:27 2008
From: cournape at gmail.com (David Cournapeau)
Date: Tue, 2 Dec 2008 23:29:27 +0900
Subject: [SciPy-user] Building Scipy from source
In-Reply-To:
References:
Message-ID: <5b8d13220812020629k8565f62ta37d8bf11e9c5134@mail.gmail.com>

On Tue, Dec 2, 2008 at 9:34 PM, Glen Shennan wrote:
> Hello,
>
> I'm new to Scipy (and Linux in general) and am trying to build Scipy from
> source, following the directions on the official installation guide,
> beginning at the section titled "Building everything from source with
> gfortran on Ubuntu". I am running Ubuntu 8.04 (Debian, kernel version
> 2.6.24-22) on a dual-core AMD 64 bit machine. I can get through the
> building of lapack, ATLAS, UMFPACK and FFTW without problems, but I can't
> finish off the numpy/scipy compile and was hoping someone here could
> enlighten me. Numpy compiles, but I'm not sure that it is working as
> intended/required, and the scipy build produces a huge string of errors
> ending with:

Hi Glen,

The easiest way to build both numpy and scipy is to avoid building atlas, blas and co by yourself. Those are difficult to build right. Do the following:

sudo apt-get install g77 atlas3-base-dev atlas3-base

Then remove the build directories in both numpy and scipy (to make sure you build from scratch), as well as the site.cfg files. You should then be able to build both numpy and scipy without any trouble. To avoid trouble with umfpack, you should try building scipy with the following command:

UMFPACK=None python setup.py build

Finally, although the last numpy release is fine, you should build scipy from svn instead of 0.6.0. 0.6 is more than one year old; we are about to release 0.7, so the trunk should be fairly stable (and we would be able to help you better if there is any problem compared to 0.6).

David
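Once a build like that goes through, a quick smoke test of the result (assuming the nose-based test runners of this era are installed) is simply:

python -c "import numpy; print numpy.__version__; numpy.test()"
python -c "import scipy; print scipy.__version__; scipy.test()"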
From cournape at gmail.com Tue Dec 2 09:41:12 2008
From: cournape at gmail.com (David Cournapeau)
Date: Tue, 2 Dec 2008 23:41:12 +0900
Subject: [SciPy-user] scipy-0.7.0b1... error: file 'ARPACK/FWRAPPERS/veclib_cabi_c.c' does not exist
In-Reply-To:
References:
Message-ID: <5b8d13220812020641jf677df0kb821cc1c4d8bcb4e@mail.gmail.com>

On Tue, Dec 2, 2008 at 9:02 PM, HEMMI, Shigeru wrote:
> Hello, I am using Mac OS X 10.3.
> scipy-0.7.0b1 build failed with the message:
> error: file 'ARPACK/FWRAPPERS/veclib_cabi_c.c' does not exist

Hi Shigeru,

This is a bug in the way we generated the source tarball. The problem has been fixed since then in svn, and the fix will be included in the next release candidate.

David

From cournape at gmail.com Tue Dec 2 09:52:09 2008
From: cournape at gmail.com (David Cournapeau)
Date: Tue, 2 Dec 2008 23:52:09 +0900
Subject: [SciPy-user] scipy on old CPU crashes
In-Reply-To: <6a4f17690812012336vee84c7bw9c53477f5b811173@mail.gmail.com>
References: <6a4f17690812012336vee84c7bw9c53477f5b811173@mail.gmail.com>
Message-ID: <5b8d13220812020652pd138588kf41e70ea99ba72dd@mail.gmail.com>

On Tue, Dec 2, 2008 at 4:36 PM, oyster wrote:
> sorry, but scipy-0.7.0b1-win32-superpack-python2.4.exe and
> numpy-1.2.1-win32-superpack-python2.4.exe crash on my old pc too,
> which uses a duron 750MHz. So now I think it is not a problem with
> the non-sse/sse/sse2 instructions

Ok. Just to be sure it is not a bug in the installer, can you tell me what the following commands give you?

python -c "import numpy; print numpy.show_config()"
python -c "import scipy; print scipy.show_config()"

> is there any method to find out the real reason except to compile from
> the source

It may not be easy, and you will need some tools, like a debugger, to get a backtrace. What is the error you get, exactly? Illegal instruction or something else?

David

From daniel.wheeler2 at gmail.com Tue Dec 2 09:59:42 2008
From: daniel.wheeler2 at gmail.com (Daniel Wheeler)
Date: Tue, 2 Dec 2008 09:59:42 -0500
Subject: [SciPy-user] weave problems, weave_imp.o no such file or directory
In-Reply-To: <1cd32cbb0812011623i5385e7aes3c6bcd5bb5918d54@mail.gmail.com>
References: <4871E500.6070704@olfac.univ-lyon1.fr> <80b160a0812011553t1f1a312x359d460acfd3d215@mail.gmail.com> <1cd32cbb0812011623i5385e7aes3c6bcd5bb5918d54@mail.gmail.com>
Message-ID: <80b160a0812020659n54cd1abdvdf1668bd00e9c651@mail.gmail.com>

On Mon, Dec 1, 2008 at 7:23 PM, wrote:
>
> As a relatively quick fix, I would move scipy to a path without
> spaces: just move the directory and link to the new parent directory
> in easy-install.pth.

Right. I'm reinstalling pythonxy in "C:\" rather than "C:\Program Files" just to keep everything coherent.

> I usually avoid any paths with spaces, because getting the quoting
> always right is a pain with programs that don't put in the correct
> quotes, e.g. python's subprocess.Popen.

No doubt. However, putting stuff in "C:\Program Files" seems to be the pythonxy default. Enthought python doesn't do that, which seems preferable.
Cheers -- Daniel Wheeler From anjiro at cc.gatech.edu Tue Dec 2 11:43:34 2008 From: anjiro at cc.gatech.edu (Daniel Ashbrook) Date: Tue, 02 Dec 2008 11:43:34 -0500 Subject: [SciPy-user] indices of consecutive elements Message-ID: <493565B6.1090607@cc.gatech.edu> I'm trying to figure out a way to return the indices of the start and end of a run of consecutive elements that match some condition, but only if there are more than a certain number. For example, take the array (with indices in comment for clarity): #0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 [0, 0, 1, 1, 1, 1, 0, 1, 1, 1, 0, 0, 0, 0, 0, 1, 1, 1, 1, 0, 1, 1, 0] I want to find the start and end indices of all runs of 1s with length of 4 or longer; so here the answer would be: [[2,5], [15,18]] Is there a reasonable way to do this without looping? I've been playing around with diff() and where() but without too much progress. Thanks, dan From simpson at math.toronto.edu Tue Dec 2 11:59:08 2008 From: simpson at math.toronto.edu (Gideon Simpson) Date: Tue, 2 Dec 2008 11:59:08 -0500 Subject: [SciPy-user] os x, intel compilers & mkl, and fink python In-Reply-To: References: <10D66598-1DD4-46D9-BC84-5998E06C01F5@math.toronto.edu> Message-ID: On Nov 28, 2008, at 7:07 PM, David Warde-Farley wrote: > On 28-Nov-08, at 5:38 PM, Gideon Simpson wrote: > >> Has anyone gotten the combination of OS X with a fink python >> distribution to successfully build numpy/scipy with the intel >> compilers and the mkl? If so, how'd you do it? > > > IIRC David Cournapeau has had some success building numpy with MKL on > OS X, but I doubt it was the fink distribution. Is there a reason you > prefer fink's python rather than the Python.org universal framework > build? Also, which particular python version (2.4, 2.5, 2.6? I know > fink typically has a couple). > > David > > _______________________________________________ > SciPy-user mailing list > SciPy-user at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-user I use the fink one mostly because I've already got it installed, but I'm not wedded to it. Must python and numpy/scipy all be built with the same compiler? From pgmdevlist at gmail.com Tue Dec 2 12:13:16 2008 From: pgmdevlist at gmail.com (Pierre GM) Date: Tue, 2 Dec 2008 12:13:16 -0500 Subject: [SciPy-user] indices of consecutive elements In-Reply-To: <493565B6.1090607@cc.gatech.edu> References: <493565B6.1090607@cc.gatech.edu> Message-ID: <1CF052EE-3721-4419-B9EF-794B13F03AFE@gmail.com> Daniel, I coded a generic class that does what you want. It's not optimize, but at least should get you started. Let me know if you find it useful and if you find ways to tweak it... Cheers, P. _____ class Cluster(object): """ Groups consecutive data from an array according to a clustering condition. A cluster is defined as a group of consecutive values differing by at most the increment value. Missing values are **not** handled: the input sequence must therefore be free of missing values. Parameters ---------- darray : ndarray Input data array to clusterize. increment : {float}, optional Increment between two consecutive values to group. By default, use a value of 1. operator : {function}, optional Comparison operator for the definition of clusters. By default, use :func:`numpy.less_equal`. Attributes ---------- inishape Shape of the argument array (stored for resizing). inisize Size of the argument array. uniques : sequence List of unique cluster values, as they appear in chronological order. 
slices : sequence List of the slices corresponding to each cluster of data. starts : ndarray Array of the indices at which the clusters start. clustered : list List of clustered data. Examples -------- >>> A = [0, 0, 1, 2, 2, 2, 3, 4, 3, 4, 4, 4] >>> klust = cluster(A,0) >>> [list(_) for _ in klust.clustered] [[0, 0], [1], [2, 2, 2], [3], [4], [3], [4, 4, 4]] >>> klust.uniques array([0, 1, 2, 3, 4, 3, 4]) >>> x = [ 1.8, 1.3, 2.4, 1.2, 2.5, 3.9, 1. , 3.8, 4.2, 3.3, ... 1.2, 0.2, 0.9, 2.7, 2.4, 2.8, 2.7, 4.7, 4.2, 0.4] >>> Cluster(x,1).starts array([ 0, 2, 3, 4, 5, 6, 7, 10, 11, 13, 17, 19]) >>> Cluster(x,1.5).starts array([ 0, 6, 7, 10, 13, 17, 19]) >>> Cluster(x,2.5).starts array([ 0, 6, 7, 19]) >>> Cluster(x,2.5,greater).starts array([ 0, 1, 2, 3, 4, 5, 8, 9, 10, ... 11, 12, 13, 14, 15, 16, 17, 18]) >>> y = [ 0, -1, 0, 0, 0, 1, 1, -1, -1, -1, 1, 1, 0, 0, 0, 0, 1, 1, 0, 0] >>> Cluster(y,1).starts array([ 0, 1, 2, 5, 7, 10, 12, 16, 18]) """ def __init__(self,darray,increment=1,operator=np.less_equal): """ Initializes instance. Parameters ---------- darray : ndarray Input data array to clusterize. increment : {float}, optional Increment between two consecutive values to group. By default, use a value of 1. operator : {function}, optional Comparison operator for the definition of clusters. By default, use :func:`np.less_equal` """ if hasattr(darray,'mask') and darray.mask.any(): raise ma.MAError("Masked arrays should be filled prior clustering.") else: darray = np.asanyarray(darray) n = darray.size self.inishape = darray.shape self.inisize = darray.size clustercond = 1 - operator(np.absolute(np.diff(darray.ravel())), increment) sid = np.r_[[0,], np.arange(1,n).compress(clustercond), [n,]] slobj = np.asarray([slice(i,d) for (i,d) in np.broadcast(sid[:-1],sid[1:])]) # self.uniques = darray.ravel()[sid[:-1]] self.clustered = [darray[k] for k in slobj] self.sizes = np.asarray(np.diff(sid)) self.slices = slobj self.starts = sid[:-1] def markonsize(self,operator,sizethresh): """ Creates a **mask** for the clusters that do not meet a size requirement. Thus, outputs ``False`` if the size requirement is met, ``True`` otherwise. Parameters ---------- operator : function Comparison operator sizethresh : float Requirement for the sizes of the clusters """ resmask = np.empty(self.inisize, dtype=bool) resmask[:] = True # for k in self.slices.compress(operator(self.sizes,sizethresh)): for k in self.slices[operator(self.sizes,sizethresh)]: resmask[k] = False return resmask.reshape(self.inishape) def mark_greaterthan(self,sizemin): """ Shortcut for :meth:`markonsize(greater_equal,sizemin)`. Thus, the command outputs ``False`` for clusters larger than ``sizemin``, and ``True`` for clusters smaller than ``sizemin``. Parameters ---------- sizemin : int Minimum size of the clusters. See Also -------- :meth:`markonsize` Creates a **mask** for the clusters that do not meet a size requirement. """ return self.markonsize(np.greater_equal,sizemin) def grouped_slices(self): """ Returns a dictionary with the unique values of ``self`` as keys, and a list of slices for the corresponding values. See Also -------- :meth:`~Cluster.grouped_limits` that does the same thing """ # output = dict([(k,[]) for k in np.unique1d(self.uniques)]) for (k,v) in zip(self.uniques, self.slices): output[k].append(v) return output def grouped_limits(self): """ Returns a dictionary with the unique values of ``self`` as keys, and a list of tuples (starting index, ending index) for the corresponding values. 
See Also -------- :meth:`~Cluster.grouped_slices` """ output = dict([(k,[]) for k in np.unique1d(self.uniques)]) for (k,v) in zip(self.uniques, self.slices): output[k].append((v.start, v.stop)) for k in output: output[k] = np.array(output[k]) return output _____ On Dec 2, 2008, at 11:43 AM, Daniel Ashbrook wrote: > I'm trying to figure out a way to return the indices of the start and > end of a run of consecutive elements that match some condition, but > only > if there are more than a certain number. > > For example, take the array (with indices in comment for clarity): > > #0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 > [0, 0, 1, 1, 1, 1, 0, 1, 1, 1, 0, 0, 0, 0, 0, 1, 1, 1, 1, 0, 1, 1, 0] > > I want to find the start and end indices of all runs of 1s with length > of 4 or longer; so here the answer would be: > > [[2,5], [15,18]] > > Is there a reasonable way to do this without looping? I've been > playing > around with diff() and where() but without too much progress. > > Thanks, > > > dan > _______________________________________________ > SciPy-user mailing list > SciPy-user at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-user From anjiro at cc.gatech.edu Tue Dec 2 13:37:52 2008 From: anjiro at cc.gatech.edu (Daniel Ashbrook) Date: Tue, 02 Dec 2008 13:37:52 -0500 Subject: [SciPy-user] indices of consecutive elements In-Reply-To: <1CF052EE-3721-4419-B9EF-794B13F03AFE@gmail.com> References: <493565B6.1090607@cc.gatech.edu> <1CF052EE-3721-4419-B9EF-794B13F03AFE@gmail.com> Message-ID: <49358080.5000702@cc.gatech.edu> Wow, ask and ye shall receive way more than you expected! Thanks so much, Pierre - it's what I need: a=[0, 0, 1, 1, 1, 1, 0, 1, 1, 1, 0, 0, 0, 0, 0, 1, 1, 1, 1, 0, 1, 1, 0] c = Cluster(a,0) z = zip(list(c.starts), [list(i) for i in c.clustered]) r = [(i,i+len(j)-1) for i,j in z if j[0] == 1 and len(j) >= 4] print(r) [(2, 5), (15, 18)] dan Pierre GM wrote: > Daniel, > I coded a generic class that does what you want. It's not optimize, > but at least should get you started. Let me know if you find it useful > and if you find ways to tweak it... > Cheers, > P. > > _____ > > class Cluster(object): > """ > Groups consecutive data from an array according to a clustering > condition. > A cluster is defined as a group of consecutive values differing > by at most the > increment value. > > Missing values are **not** handled: the input sequence must > therefore be free > of missing values. > > Parameters > ---------- > darray : ndarray > Input data array to clusterize. > increment : {float}, optional > Increment between two consecutive values to group. > By default, use a value of 1. > operator : {function}, optional > Comparison operator for the definition of clusters. > By default, use :func:`numpy.less_equal`. > > > Attributes > ---------- > inishape > Shape of the argument array (stored for resizing). > inisize > Size of the argument array. > uniques : sequence > List of unique cluster values, as they appear in > chronological order. > slices : sequence > List of the slices corresponding to each cluster of data. > starts : ndarray > Array of the indices at which the clusters start. > clustered : list > List of clustered data. > > > Examples > -------- > >>> A = [0, 0, 1, 2, 2, 2, 3, 4, 3, 4, 4, 4] > >>> klust = cluster(A,0) > >>> [list(_) for _ in klust.clustered] > [[0, 0], [1], [2, 2, 2], [3], [4], [3], [4, 4, 4]] > >>> klust.uniques > array([0, 1, 2, 3, 4, 3, 4]) > > >>> x = [ 1.8, 1.3, 2.4, 1.2, 2.5, 3.9, 1. , 3.8, 4.2, 3.3, > ... 
1.2, 0.2, 0.9, 2.7, 2.4, 2.8, 2.7, 4.7, 4.2, 0.4] > >>> Cluster(x,1).starts > array([ 0, 2, 3, 4, 5, 6, 7, 10, 11, 13, 17, 19]) > >>> Cluster(x,1.5).starts > array([ 0, 6, 7, 10, 13, 17, 19]) > >>> Cluster(x,2.5).starts > array([ 0, 6, 7, 19]) > >>> Cluster(x,2.5,greater).starts > array([ 0, 1, 2, 3, 4, 5, 8, 9, 10, > ... 11, 12, 13, 14, 15, 16, 17, 18]) > >>> y = [ 0, -1, 0, 0, 0, 1, 1, -1, -1, -1, 1, 1, 0, 0, 0, 0, 1, > 1, 0, 0] > >>> Cluster(y,1).starts > array([ 0, 1, 2, 5, 7, 10, 12, 16, 18]) > > """ > def __init__(self,darray,increment=1,operator=np.less_equal): > """ > Initializes instance. > > Parameters > ---------- > darray : ndarray > Input data array to clusterize. > increment : {float}, optional > Increment between two consecutive values to group. > By default, use a value of 1. > operator : {function}, optional > Comparison operator for the definition of clusters. > By default, use :func:`np.less_equal` > > """ > if hasattr(darray,'mask') and darray.mask.any(): > raise ma.MAError("Masked arrays should be filled prior > clustering.") > else: > darray = np.asanyarray(darray) > n = darray.size > self.inishape = darray.shape > self.inisize = darray.size > clustercond = 1 - > operator(np.absolute(np.diff(darray.ravel())), > increment) > sid = np.r_[[0,], np.arange(1,n).compress(clustercond), [n,]] > slobj = np.asarray([slice(i,d) > for (i,d) in > np.broadcast(sid[:-1],sid[1:])]) > # > self.uniques = darray.ravel()[sid[:-1]] > self.clustered = [darray[k] for k in slobj] > self.sizes = np.asarray(np.diff(sid)) > self.slices = slobj > self.starts = sid[:-1] > > def markonsize(self,operator,sizethresh): > """ > Creates a **mask** for the clusters that do not meet a size > requirement. > Thus, outputs ``False`` if the size requirement is met, ``True`` > otherwise. > > Parameters > ---------- > operator : function > Comparison operator > sizethresh : float > Requirement for the sizes of the clusters > > """ > resmask = np.empty(self.inisize, dtype=bool) > resmask[:] = True > # for k in self.slices.compress(operator(self.sizes,sizethresh)): > for k in self.slices[operator(self.sizes,sizethresh)]: > resmask[k] = False > return resmask.reshape(self.inishape) > > def mark_greaterthan(self,sizemin): > """ > Shortcut for :meth:`markonsize(greater_equal,sizemin)`. > Thus, the command outputs ``False`` for clusters larger than > ``sizemin``, and > ``True`` for clusters smaller than ``sizemin``. > > Parameters > ---------- > sizemin : int > Minimum size of the clusters. > > See Also > -------- > :meth:`markonsize` > Creates a **mask** for the clusters that do not meet a size > requirement. > """ > return self.markonsize(np.greater_equal,sizemin) > > def grouped_slices(self): > """ > Returns a dictionary with the unique values of ``self`` as keys, > and a list > of slices for the corresponding values. > > See Also > -------- > :meth:`~Cluster.grouped_limits` > that does the same thing > """ > # > output = dict([(k,[]) for k in np.unique1d(self.uniques)]) > for (k,v) in zip(self.uniques, self.slices): > output[k].append(v) > return output > > def grouped_limits(self): > """ > Returns a dictionary with the unique values of ``self`` as keys, > and a list > of tuples (starting index, ending index) for the corresponding > values. 
> > See Also > -------- > :meth:`~Cluster.grouped_slices` > """ > output = dict([(k,[]) for k in np.unique1d(self.uniques)]) > for (k,v) in zip(self.uniques, self.slices): > output[k].append((v.start, v.stop)) > for k in output: > output[k] = np.array(output[k]) > return output > > > _____ > > > > On Dec 2, 2008, at 11:43 AM, Daniel Ashbrook wrote: > >> I'm trying to figure out a way to return the indices of the start and >> end of a run of consecutive elements that match some condition, but >> only >> if there are more than a certain number. >> >> For example, take the array (with indices in comment for clarity): >> >> #0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 >> [0, 0, 1, 1, 1, 1, 0, 1, 1, 1, 0, 0, 0, 0, 0, 1, 1, 1, 1, 0, 1, 1, 0] >> >> I want to find the start and end indices of all runs of 1s with length >> of 4 or longer; so here the answer would be: >> >> [[2,5], [15,18]] >> >> Is there a reasonable way to do this without looping? I've been >> playing >> around with diff() and where() but without too much progress. >> >> Thanks, >> >> >> dan >> _______________________________________________ >> SciPy-user mailing list >> SciPy-user at scipy.org >> http://projects.scipy.org/mailman/listinfo/scipy-user > > _______________________________________________ > SciPy-user mailing list > SciPy-user at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-user From pgmdevlist at gmail.com Tue Dec 2 14:10:20 2008 From: pgmdevlist at gmail.com (Pierre GM) Date: Tue, 2 Dec 2008 14:10:20 -0500 Subject: [SciPy-user] indices of consecutive elements In-Reply-To: <49358080.5000702@cc.gatech.edu> References: <493565B6.1090607@cc.gatech.edu> <1CF052EE-3721-4419-B9EF-794B13F03AFE@gmail.com> <49358080.5000702@cc.gatech.edu> Message-ID: <86391AE5-7DEA-47DA-93EE-4ABEED20739B@gmail.com> On Dec 2, 2008, at 1:37 PM, Daniel Ashbrook wrote: > Wow, ask and ye shall receive way more than you expected! Thanks so > much, Pierre - it's what I need: > > a=[0, 0, 1, 1, 1, 1, 0, 1, 1, 1, 0, 0, 0, 0, 0, 1, 1, 1, 1, 0, 1, 1, > 0] > c = Cluster(a,0) > z = zip(list(c.starts), [list(i) for i in c.clustered]) > r = [(i,i+len(j)-1) for i,j in z if j[0] == 1 and len(j) >= 4] > print(r) > > [(2, 5), (15, 18)] There's simpler: >>> [(_.start,_.stop-1) for _ in c.slices[(c.sizes>=4) & (c.uniques==1)]] as c.slices is an array, you can directly select its elements that satisfy some conditions: here, we take the clusters that cover at least 4 element, provided the unique element is 1... From anjiro at cc.gatech.edu Tue Dec 2 14:12:31 2008 From: anjiro at cc.gatech.edu (Daniel Ashbrook) Date: Tue, 02 Dec 2008 14:12:31 -0500 Subject: [SciPy-user] indices of consecutive elements In-Reply-To: <86391AE5-7DEA-47DA-93EE-4ABEED20739B@gmail.com> References: <493565B6.1090607@cc.gatech.edu> <1CF052EE-3721-4419-B9EF-794B13F03AFE@gmail.com> <49358080.5000702@cc.gatech.edu> <86391AE5-7DEA-47DA-93EE-4ABEED20739B@gmail.com> Message-ID: <4935889F.8030709@cc.gatech.edu> Aaah, so much better! I'm still getting used to how arrays operate differently than python lists. That's really cool - thanks a lot! This will make my work so much easier. dan Pierre GM wrote: > There's simpler: > >>> [(_.start,_.stop-1) for _ in c.slices[(c.sizes>=4) & > (c.uniques==1)]] > > as c.slices is an array, you can directly select its elements that > satisfy some conditions: here, we take the clusters that cover at > least 4 element, provided the unique element is 1... 
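For the record, the diff()/where() combination Daniel mentioned at the start of the thread can also be pushed all the way through without any class machinery; a small self-contained sketch (not from the thread, just one way to do it):

import numpy as np

v = np.array([0, 0, 1, 1, 1, 1, 0, 1, 1, 1, 0, 0, 0, 0, 0, 1, 1, 1, 1, 0, 1, 1, 0])

# pad with zeros so runs touching either end are closed off
d = np.diff(np.concatenate(([0], v, [0])))
starts = np.where(d == 1)[0]         # index where each run of 1s begins
stops = np.where(d == -1)[0] - 1     # index where each run of 1s ends
keep = (stops - starts + 1) >= 4     # only runs of length 4 or more
print zip(starts[keep], stops[keep])   # -> [(2, 5), (15, 18)]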
From pgmdevlist at gmail.com Tue Dec 2 14:21:10 2008
From: pgmdevlist at gmail.com (Pierre GM)
Date: Tue, 2 Dec 2008 14:21:10 -0500
Subject: [SciPy-user] indices of consecutive elements
In-Reply-To: <4935889F.8030709@cc.gatech.edu>
References: <493565B6.1090607@cc.gatech.edu> <1CF052EE-3721-4419-B9EF-794B13F03AFE@gmail.com> <49358080.5000702@cc.gatech.edu> <86391AE5-7DEA-47DA-93EE-4ABEED20739B@gmail.com> <4935889F.8030709@cc.gatech.edu>
Message-ID: <7E074960-5174-4A0B-A2C2-B872B786DB5A@gmail.com>

On Dec 2, 2008, at 2:12 PM, Daniel Ashbrook wrote:
> Aaah, so much better! I'm still getting used to how arrays operate
> differently than python lists.

I know... It's so easy to select specific elements of an array that it's frustrating not to be able to do the same thing w/ lists...

> That's really cool - thanks a lot! This
> will make my work so much easier.

Glad I could help. Your feedback is needed, let me know if you run into some bugs or other problems, so that I can update my code (it's part of a set of tools I'm writing to help the analysis of climate data...).

From argriffi at ncsu.edu Tue Dec 2 14:38:07 2008
From: argriffi at ncsu.edu (alex)
Date: Tue, 02 Dec 2008 14:38:07 -0500
Subject: [SciPy-user] indices of consecutive elements
In-Reply-To: <493565B6.1090607@cc.gatech.edu>
References: <493565B6.1090607@cc.gatech.edu>
Message-ID: <49358E9F.9060703@ncsu.edu>

Here's a way that uses ufuncs.

from numpy import *

v = [0, 0, 1, 1, 1, 1, 0, 1, 1, 1, 0, 0, 0, 0, 0, 1, 1, 1, 1, 0, 1, 1, 0]
myufunc = frompyfunc(lambda a, b: (a+b)*b, 2, 1)
where(diff(myufunc.accumulate(v)) <= -4)

This gives (array([5,18]),) where 5 and 18 are the right hand indices; the left hand indices can be found similarly.

Alex

Daniel Ashbrook wrote:
> I'm trying to figure out a way to return the indices of the start and
> end of a run of consecutive elements that match some condition, but only
> if there are more than a certain number.
>
> For example, take the array (with indices in comment for clarity):
>
> #0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22
> [0, 0, 1, 1, 1, 1, 0, 1, 1, 1, 0, 0, 0, 0, 0, 1, 1, 1, 1, 0, 1, 1, 0]
>
> I want to find the start and end indices of all runs of 1s with length
> of 4 or longer; so here the answer would be:
>
> [[2,5], [15,18]]
>
> Is there a reasonable way to do this without looping? I've been playing
> around with diff() and where() but without too much progress.
>
> Thanks,
>
> dan
I can think of a fairly laborious way: looping through each day, selecting the data in that day, calculating the average, and populating a new array.

thanks

- dharhas

From pgmdevlist at gmail.com  Tue Dec  2 15:44:40 2008
From: pgmdevlist at gmail.com (Pierre GM)
Date: Tue, 2 Dec 2008 15:44:40 -0500
Subject: [SciPy-user] Calculating daily averages from a timeseries without using the timeseries package.
In-Reply-To: <49354452.63BA.009B.0@twdb.state.tx.us>
References: <49354452.63BA.009B.0@twdb.state.tx.us>
Message-ID: <5EAA9151-FFC3-403A-A486-E7AC2BDB09CE@gmail.com>

Dharhas,
Installing and starting to use scikits.timeseries will take you a
couple of hours, unless you're on windows (because I don't have access
to a windows machine and can't help you with the installation). Trying
to find a trick to solve your problem might take you as long, if not
more. If I were you, I wouldn't hesitate. But well, some ideas:
1. Revert to an array of datetime objects instead of your datenum.
2. Define some function that tests the .day of your dates, and select
the ones that match the day you want. Or, just convert your date2num
floats to int and select the ints that correspond to your day.
3. Construct a mask from the results of the previous step.
4. Apply the mask on your data and compute the average.
5. Rinse and repeat for a new day.

Yep, laborious...
With scikits.timeseries, it'd be something like that:
series = ts.fill_missing_dates(series).convert('D').mean(-1)

From Dharhas.Pothina at twdb.state.tx.us  Tue Dec  2 16:31:05 2008
From: Dharhas.Pothina at twdb.state.tx.us (Dharhas Pothina)
Date: Tue, 02 Dec 2008 15:31:05 -0600
Subject: [SciPy-user] Calculating daily averages from a timeserieswithout using the timeseries package.
In-Reply-To: <5EAA9151-FFC3-403A-A486-E7AC2BDB09CE@gmail.com>
References: <49354452.63BA.009B.0@twdb.state.tx.us> <5EAA9151-FFC3-403A-A486-E7AC2BDB09CE@gmail.com>
Message-ID: <493554B9.63BA.009B.0@twdb.state.tx.us>

Pierre,

Well I have older versions of Scipy, Numpy & Matplotlib installed through the Fedora 8 repositories. To install the timeseries package I need to build the latest versions of scipy/numpy etc. My main concern is that I have a whole slew of scripts that are doing various analyses and plots etc and from what I understand there have been some API changes in matplotlib from the version I have installed to the latest version. I'm fairly close to a deadline and I'm worried about breaking my existing scripts when installing the more recent versions of numpy/scipy/matplotlib.

Is there a way to do a parallel install so I can choose whether to use the newer versions or the old versions of numpy/scipy etc? I looked through the scipy website and couldn't find anything.

thanks,

- dharhas

>>> Pierre GM  12/2/2008 2:44 PM >>>
Dharhas,
Installing and starting to use scikits.timeseries will take you a
couple of hours, unless you're on windows (because I don't have access
to a windows machine and can't help you with the installation). Trying
to find a trick to solve your problem might take you as long, if not
more. If I were you, I wouldn't hesitate. But well, some ideas:
1. Revert to an array of datetime objects instead of your datenum.
2. Define some function that tests the .day of your dates, and select
the ones that match the day you want. Or, just convert your date2num
floats to int and select the ints that correspond to your day.
3. Construct a mask from the results of the previous step.
4. Apply the mask on your data and compute the average.
5. Rinse and repeat for a new day.
Yep, laborious...
With scikits.timeseries, it'd be something like that:
series = ts.fill_missing_dates(series).convert('D').mean(-1)

_______________________________________________
SciPy-user mailing list
SciPy-user at scipy.org
http://projects.scipy.org/mailman/listinfo/scipy-user

From pgmdevlist at gmail.com  Tue Dec  2 16:47:59 2008
From: pgmdevlist at gmail.com (Pierre GM)
Date: Tue, 2 Dec 2008 16:47:59 -0500
Subject: [SciPy-user] Calculating daily averages from a timeserieswithout using the timeseries package.
In-Reply-To: <493554B9.63BA.009B.0@twdb.state.tx.us>
References: <49354452.63BA.009B.0@twdb.state.tx.us> <5EAA9151-FFC3-403A-A486-E7AC2BDB09CE@gmail.com> <493554B9.63BA.009B.0@twdb.state.tx.us>
Message-ID: <0665C3B3-FC60-48D6-A724-6377161AB0F5@gmail.com>

On Dec 2, 2008, at 4:31 PM, Dharhas Pothina wrote:
>
>
> Is there a way to do a parallel install so I can choose whether to
> use the newer versions or the old versions of numpy/scipy etc? I
> looked through the scipy website and couldn't find anything.

You may want to try virtualenv and its wrapper. Have a look here:
http://www.doughellmann.com/articles/CompletelyDifferent-2008-05-virtualenvwrapper/index.html
That way, you can install the latest numpy/scipy/mpl in one virtual
environment without modifying your base one. Very, very, very useful.

From argriffi at ncsu.edu  Tue Dec  2 16:50:11 2008
From: argriffi at ncsu.edu (alex)
Date: Tue, 02 Dec 2008 16:50:11 -0500
Subject: [SciPy-user] indices of consecutive elements
In-Reply-To: <493565B6.1090607@cc.gatech.edu>
References: <493565B6.1090607@cc.gatech.edu>
Message-ID: <4935AD93.1040908@ncsu.edu>

Here's a way that uses convolution.

from numpy import *
v = array([0, 0, 1, 1, 1, 1, 0, 1, 1, 1, 0, 0, 0, 0, 0, 1, 1, 1, 1, 0, 1, 1, 0])
n=4
c = convolve(v, ones(n+1))[:len(v)]
where((c==n) & (v==0))

This gives (array([ 6, 19]),) where 6 and 19 are off by one from the
correct end indices. Maybe there's a more efficient way to do a
convolution with this kind of rectangular window, using the cumulative
sum for example.

Alex

Daniel Ashbrook wrote:
> I'm trying to figure out a way to return the indices of the start and
> end of a run of consecutive elements that match some condition, but only
> if there are more than a certain number.
>
> For example, take the array (with indices in comment for clarity):
>
> #0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22
> [0, 0, 1, 1, 1, 1, 0, 1, 1, 1, 0, 0, 0, 0, 0, 1, 1, 1, 1, 0, 1, 1, 0]
>
> I want to find the start and end indices of all runs of 1s with length
> of 4 or longer; so here the answer would be:
>
> [[2,5], [15,18]]
>
> Is there a reasonable way to do this without looping? I've been playing
> around with diff() and where() but without too much progress.
>
> Thanks,
>
>
> dan
> _______________________________________________
> SciPy-user mailing list
> SciPy-user at scipy.org
> http://projects.scipy.org/mailman/listinfo/scipy-user
>

From timmichelsen at gmx-topmail.de  Tue Dec  2 17:00:25 2008
From: timmichelsen at gmx-topmail.de (Tim Michelsen)
Date: Tue, 02 Dec 2008 23:00:25 +0100
Subject: [SciPy-user] calculations using the datetime information of timeseries
In-Reply-To: <39BA62A5-E5F2-4E17-A34A-1EA59F2A649B@gmail.com>
References: <39E9D479-C5F5-4AF3-A4E6-4EEFB4F1DAD6@gmail.com> <39BA62A5-E5F2-4E17-A34A-1EA59F2A649B@gmail.com>
Message-ID: <4935AFF9.2040704@gmx-topmail.de>

Hello Pierre,
this thingy to use the datetime information really bothers me now.

>>> As a wrap-up:
>>> Try to avoid looping if you can.
>> Yes, I noticed that.
>> But I couldn't find another way to pass the individual datetimes to my
>> calculation function which expects only one value at once (i.e. it
>> is not
>> designed to calculate full arrays).
>
> That might be a bottleneck. If you could modify your function so that
> it can process arrays, you should get better results. Of course, that
> depends on the actual function...
> When I asked whether you really needed datetime objects, I was
> thinking about the actual datetime.datetime objects, not about objects
> having, say, a `day` or `hour` property. If you send an example of
> function closer to your actual need, I may be able to help you more.

I prepared an example. Maybe you have some ideas on how to optimize the
code.

Please find below my commented example.

### START ###
#!/usr/bin/env python

import datetime as dt

import numpy as np
import scikits.timeseries as ts

def hoy(datetime_obj):
    """ calculate hour of year """
    mydt = datetime_obj
    year = mydt.year
    start = dt.datetime(mydt.year, 01, 01, 0)
    td = mydt - start
    seconds = td.days * 3600 * 24 + td.seconds
    hours = seconds / 3600
    return hours

def create_ts(datetime_obj):
    """ create a hourly series """
    data = np.arange(0,8760)
    startdate = ts.Date(freq='H', datetime=datetime_obj)
    series = ts.time_series(data, freq='H', start_date=startdate)
    return series

## get a datetime object
my_datetime = dt.datetime.now()

## create time series
myseries = create_ts(my_datetime)

## calculate hoy for datetime object
my_hoy = hoy(my_datetime)
print 'my_hoy:', my_hoy

## first vectorize
hoy_vect = np.vectorize(hoy)

## calculate the hoy for each hour in the series
# 1st method: working, but a workaround since the main calculation is performed
# outside the time series object!!!
array_hoy = hoy_vect(myseries.dates.tolist())
series_hoy_01 = ts.time_series(array_hoy, myseries.dates)

# 2nd method: desired but not working
#series_hoy_02 = hoy_vect(myseries.dates)
## this fails with the error message:
#
# AttributeError: 'numpy.int32' object has no attribute 'year'
# or
# AttributeError: 'int' object has no attribute 'year'

def create_dt(series):
    dt_vect = np.vectorize(dt.datetime)
    dt_ser = dt_vect(series.year, series.month, series.hour)
    return dt_ser

ser = create_dt(myseries)

series_hoy_03 = hoy_vect(dt.datetime(myseries.year, myseries.month, myseries.hour))
### END CODE ###

Thanks in advance,
Timmie

From jdh2358 at gmail.com  Tue Dec  2 17:15:55 2008
From: jdh2358 at gmail.com (John Hunter)
Date: Tue, 2 Dec 2008 16:15:55 -0600
Subject: [SciPy-user] Calculating daily averages from a timeseries without using the timeseries package.
In-Reply-To: <49354452.63BA.009B.0@twdb.state.tx.us>
References: <49354452.63BA.009B.0@twdb.state.tx.us>
Message-ID: <88e473830812021415je6a521fw5d6cf31f9e792be8@mail.gmail.com>

On Tue, Dec 2, 2008 at 2:21 PM, Dharhas Pothina wrote:
> Hi All,
>
> I have two arrays t & sal. t was created from an array of datetimes using the date2num function. The timeseries is approximately at an hourly frequency but there are days with little or no data or data at a non hourly frequency. How would I calculate the average of all salinity values on a particular day and form a new time series.
>
> t_days, sal_dailyavg
>
> I eventually plan to use the timeseries toolkit for my timeseries analysis but I'm close to the end of a project right now and don't have the time to install and learn it right now so I was hoping someone knew how to do this within numpy/scipy. I can think of a fairly laborious way: looping through each day, selecting the data in that day, calculating the average, and populating a new array.
I can think of a fairly laborious way using looping through each day and selecting the data in that day calculating the average and populating a new array. You can use some of the rec* functions in matplotlib.mlab import matplotlib.mlab as mlab import numpy as np # create a date column dates = np.array([d.date() for d in datetimes]) create a record array with the columns you need to analyze r = np.rec.fromarrays([dates, values], names='date,value']) # stats is a list of (input_name, function, output_name) stats = [('values', np.mean, 'means')] # you can gropup by one or more attrs, eg 'date', or ['year', 'month'] rsummary = mlab.rec_groupby(r, ['date'], stats) # pretty print the output print mlab.rec2txt(rsummary) From dwf at cs.toronto.edu Wed Dec 3 01:36:01 2008 From: dwf at cs.toronto.edu (David Warde-Farley) Date: Wed, 3 Dec 2008 01:36:01 -0500 Subject: [SciPy-user] os x, intel compilers & mkl, and fink python In-Reply-To: References: <10D66598-1DD4-46D9-BC84-5998E06C01F5@math.toronto.edu> Message-ID: On 2-Dec-08, at 11:59 AM, Gideon Simpson wrote: > I use the fink one mostly because I've already got it installed, but > I'm not wedded to it. Must python and numpy/scipy all be built with > the same compiler? I am not certain, but I don't think it's necessary. At any rate, I would give the build available at python.org a try, that's typically the easiest one for the list to support. Fink moves at its own pace and applies it's own set of patches which might break things. I am afraid I don't have a definitive answer to your compiler question, but I suspect it is not the case, since I've seen reference to intel-compiled numpy being used with (presumably gcc built) system python. David From cournape at gmail.com Wed Dec 3 02:06:04 2008 From: cournape at gmail.com (David Cournapeau) Date: Wed, 3 Dec 2008 16:06:04 +0900 Subject: [SciPy-user] scipy on old CPU crashes In-Reply-To: <5b8d13220812020652pd138588kf41e70ea99ba72dd@mail.gmail.com> References: <6a4f17690812012336vee84c7bw9c53477f5b811173@mail.gmail.com> <5b8d13220812020652pd138588kf41e70ea99ba72dd@mail.gmail.com> Message-ID: <5b8d13220812022306la9c4abdv279e490e895dc992@mail.gmail.com> On Tue, Dec 2, 2008 at 11:52 PM, David Cournapeau wrote: > On Tue, Dec 2, 2008 at 4:36 PM, oyster wrote: >> sorry, but scipy-0.7.0b1-win32-superpack-python2.4.exe and >> numpy-1.2.1-win32-superpack-python2.4.exe crash on my old pc too, >> which uses duron 750MHz. So now I think it is not the problem with >> non-sse/sse/sse2 instruction >> > Ok, I checked the machine code in scipy and it seems that the quadpack module (used by scipy.integrate) has a couple of SSE instructions. Code-wise, it is trivial to solve, but we may need a new numpy version for that. thanks for the report, and sorry for the trouble, David From cournape at gmail.com Wed Dec 3 02:29:04 2008 From: cournape at gmail.com (David Cournapeau) Date: Wed, 3 Dec 2008 16:29:04 +0900 Subject: [SciPy-user] os x, intel compilers & mkl, and fink python In-Reply-To: References: <10D66598-1DD4-46D9-BC84-5998E06C01F5@math.toronto.edu> Message-ID: <5b8d13220812022329i5b8a49f1ra946c77250d15075@mail.gmail.com> On Wed, Dec 3, 2008 at 1:59 AM, Gideon Simpson wrote: > > I use the fink one mostly because I've already got it installed, but > I'm not wedded to it. Must python and numpy/scipy all be built with > the same compiler? Depends on the language. 
For C, it should not matter: every C compiler on a given platform has
to be compatible to a great degree to be of any use (because the
system ABI more or less defines the C ABI).

For Fortran or, worse, C++, you'd better use the same compiler for every
package.

cheers,

David

From Dharhas.Pothina at twdb.state.tx.us  Wed Dec  3 09:23:02 2008
From: Dharhas.Pothina at twdb.state.tx.us (Dharhas Pothina)
Date: Wed, 03 Dec 2008 08:23:02 -0600
Subject: [SciPy-user] Calculating daily averages from a timeseries without using the timeseries package.
In-Reply-To: <0665C3B3-FC60-48D6-A724-6377161AB0F5@gmail.com>
References: <49354452.63BA.009B.0@twdb.state.tx.us> <5EAA9151-FFC3-403A-A486-E7AC2BDB09CE@gmail.com> <493554B9.63BA.009B.0@twdb.state.tx.us> <0665C3B3-FC60-48D6-A724-6377161AB0F5@gmail.com>
Message-ID: <493641E6.63BA.009B.0@twdb.state.tx.us>

Thank you Pierre, virtualenv looks like it will do exactly what is needed. I'll try installing it and getting up-to-date versions of everything + the timeseries scikit installed.

- dharhas

>>> Pierre GM  12/2/2008 3:47 PM >>>

On Dec 2, 2008, at 4:31 PM, Dharhas Pothina wrote:
>
>
> Is there a way to do a parallel install so I can choose whether to
> use the newer versions or the old versions of numpy/scipy etc? I
> looked through the scipy website and couldn't find anything.

You may want to try virtualenv and its wrapper. Have a look here:
http://www.doughellmann.com/articles/CompletelyDifferent-2008-05-virtualenvwrapper/index.html
That way, you can install the latest numpy/scipy/mpl in one virtual
environment without modifying your base one. Very, very, very useful.
_______________________________________________
SciPy-user mailing list
SciPy-user at scipy.org
http://projects.scipy.org/mailman/listinfo/scipy-user

From Dharhas.Pothina at twdb.state.tx.us  Wed Dec  3 09:28:48 2008
From: Dharhas.Pothina at twdb.state.tx.us (Dharhas Pothina)
Date: Wed, 03 Dec 2008 08:28:48 -0600
Subject: [SciPy-user] Calculating daily averages from a timeseries without using the timeseries package.
In-Reply-To: <88e473830812021415je6a521fw5d6cf31f9e792be8@mail.gmail.com>
References: <49354452.63BA.009B.0@twdb.state.tx.us> <88e473830812021415je6a521fw5d6cf31f9e792be8@mail.gmail.com>
Message-ID: <49364340.63BA.009B.0@twdb.state.tx.us>

Hi John,

Unfortunately it looks like the version of matplotlib (0.91.2) I have installed doesn't have the mlab.rec_groupby function. I guess I will try installing more recent versions of numpy/scipy/matplotlib using the virtualenv package Pierre mentioned.

Thanks for your help.

- dharhas

>>> "John Hunter"  12/2/2008 4:15 PM >>>

On Tue, Dec 2, 2008 at 2:21 PM, Dharhas Pothina wrote:
> Hi All,
>
> I have two arrays t & sal. t was created from an array of datetimes using the date2num function. The timeseries is approximately at an hourly frequency but there are days with little or no data or data at a non hourly frequency. How would I calculate the average of all salinity values on a particular day and form a new time series.
>
> t_days, sal_dailyavg
>
> I eventually plan to use the timeseries toolkit for my timeseries analysis but I'm close to the end of a project right now and don't have the time to install and learn it right now so I was hoping someone knew how to do this within numpy/scipy. I can think of a fairly laborious way: looping through each day, selecting the data in that day, calculating the average, and populating a new array.
You can use some of the rec* functions in matplotlib.mlab

import matplotlib.mlab as mlab
import numpy as np

# create a date column
dates = np.array([d.date() for d in datetimes])

# create a record array with the columns you need to analyze
r = np.rec.fromarrays([dates, values], names='date,value')

# stats is a list of (input_name, function, output_name)
stats = [('value', np.mean, 'means')]

# you can group by one or more attrs, eg 'date', or ['year', 'month']
rsummary = mlab.rec_groupby(r, ['date'], stats)

# pretty print the output
print mlab.rec2txt(rsummary)
_______________________________________________
SciPy-user mailing list
SciPy-user at scipy.org
http://projects.scipy.org/mailman/listinfo/scipy-user

From brannode at gmail.com  Wed Dec  3 10:43:54 2008
From: brannode at gmail.com (B Rannode)
Date: Wed, 3 Dec 2008 10:43:54 -0500
Subject: [SciPy-user] matlab regstats "tstat" equivalent
Message-ID: <3B9812C6-70B5-4E11-8CFE-2C6A89104628@gmail.com>

Hello,
I am attempting to convert a matlab script to python using scipy. The
last hurdle I have run into is with regards to matlab's regstats.
Given two 2d arrays, I am able to get betas and rsquared from lstsq
(adding the leading ones to the first matrix) and linregress (by
iterating over the first's rows). The part I am missing is getting
the tstats. I have found ttest_rel/ind. Are either of these
comparable? If so, are there examples of their use?
Thanks for your time.

From aisaac at american.edu  Wed Dec  3 11:22:20 2008
From: aisaac at american.edu (Alan G Isaac)
Date: Wed, 03 Dec 2008 11:22:20 -0500
Subject: [SciPy-user] matlab regstats "tstat" equivalent
In-Reply-To: <3B9812C6-70B5-4E11-8CFE-2C6A89104628@gmail.com>
References: <3B9812C6-70B5-4E11-8CFE-2C6A89104628@gmail.com>
Message-ID: <4936B23C.7040009@american.edu>

Perhaps the OLS class at
http://code.google.com/p/econpy/source/browse/trunk/pytrix/ls.py
provides some hints.
Alan Isaac

From Dharhas.Pothina at twdb.state.tx.us  Wed Dec  3 12:16:14 2008
From: Dharhas.Pothina at twdb.state.tx.us (Dharhas Pothina)
Date: Wed, 03 Dec 2008 11:16:14 -0600
Subject: [SciPy-user] scikits.timeseries DateArray Question
Message-ID: <49366A7D.63BA.009B.0@twdb.state.tx.us>

Hi,

Thanks to Pierre's suggestion of using the virtualenv package, I now have a working install of the scikits.timeseries package. I have some questions about constructing a timeseries from data in a file. From the file I am reading Year,Month,Day,Hour,Min,Data and I need to convert them to a timeseries. I've been looking through the documentation and the mailing list archive and I'm not sure how to create a DateArray that contains a non-uniform list of datetimes. The data is mostly at a 15 minute frequency but sometimes may not fall exactly at 00,15,30,45 mins etc and other times may not be present. All the examples I've seen involve data that is present at a fixed frequency.

i.e.

#Year Month Day Hour Min Data
2008 01 01 10 00 2.9
2008 01 01 10 15 3.2
2008 01 01 10 33 3.1
2008 01 01 12 45 3.0
2008 01 02 11 15 3.4
...

Is there a way to read these dates into a DateArray so I can create a timeseries?
thanks, - dharhas From roban at astro.columbia.edu Wed Dec 3 12:35:13 2008 From: roban at astro.columbia.edu (Roban Hultman Kramer) Date: Wed, 3 Dec 2008 12:35:13 -0500 Subject: [SciPy-user] new Kolmogorov-Smirnov test In-Reply-To: <1cd32cbb0811291346j76d15b66ud2221653de709139@mail.gmail.com> References: <1cd32cbb0811291346j76d15b66ud2221653de709139@mail.gmail.com> Message-ID: <463180e60812030935n1058b1dfi7bc03f6ec8a23a98@mail.gmail.com> Is there a correct two-sample k-s test in scipy? On Sat, Nov 29, 2008 at 4:46 PM, wrote: > Since the old scipy.stats.kstest wasn't correct, I spent quite some > time fixing and testing it. Now, I know more about the > Kolmogorov-Smirnov test, than I wanted to. > > The kstest now resembles the one in R and in matlab, giving the option > for two-sided or one-sided tests. The names of the keyword options are > a mixture of matlab and R, which I liked best. > > Since the exact distribution of the two-sided test is not available in > scipy, I use an approximation, that seems to work very well. In > several Monte Carlo studies against R, I get very close results, > especially for small p-values. (For those interested, for small > p-values, I use ksone.sf(D,n)*2; for large p-values or large n, I use > the asymptotic distribution kstwobign) > > example signature and options: > kstest(x,testdistfn.name, alternative = 'unequal', mode='approx')) > kstest(x,testdistfn.name, alternative = 'unequal', mode='asymp')) > kstest(x,testdistfn.name, alternative = 'larger')) > kstest(x,testdistfn.name, alternative = 'smaller')) > > Below is the Monte Carlo for the case, when the random variable and > the hypothesized distribution both are standard normal (with sample > size 100 and 1000 replications. Rejection rates are very close to > alpha levels. It also contains the mean absolute error MAE for the old > kstest. I also checked for mean shifted normal random variables. In > all cases that I tried, I get exactly the same rejection rates as in > R. > > For details see doc string or source. > I attach file to a separate email, to get around attachment size limit. > > I intend to put this in trunk tomorrow, review and comments are welcome. > > Josef > > > data generation distribution is norm, hypothesis is norm > ================================================== > n = 100, loc = 0.000000 scale = 1.000000, n_repl = 1000 > columns: D, pval > rows are > kstest(x,testdistfn.name, alternative = 'unequal', mode='approx')) > kstest(x,testdistfn.name, alternative = 'unequal', mode='asymp')) > kstest(x,testdistfn.name, alternative = 'larger')) > kstest(x,testdistfn.name, alternative = 'smaller')) > > Results for comparison with R: > > MAE old kstest > [[ 0.00453195 0.19152727] > [ 0.00453195 0.2101139 ] > [ 0.02002774 0.19145982] > [ 0.02880553 0.26650226]] > MAE new kstest > [[ 1.87488913e-17 1.07738517e-02] > [ 1.87488913e-17 1.91763848e-06] > [ 2.38576520e-17 8.90287843e-16] > [ 1.41312743e-17 9.92428362e-16]] > percent count absdev > 0.005 > [[ 0. 53.9] > [ 0. 0. ] > [ 0. 0. ] > [ 0. 0. ]] > percent count absdev > 0.01 > [[ 0. 24.3] > [ 0. 0. ] > [ 0. 0. ] > [ 0. 0. ]] > percent count abs percent dev > 1% > [[ 0. 51.8] > [ 0. 0. ] > [ 0. 0. ] > [ 0. 0. ]] > percent count abs percent dev > 10% > [[ 0. 0.] > [ 0. 0.] > [ 0. 0.] > [ 0. 
0.]]
> new: count rejection at 1% significance
> [ 0.01 0.008 0.009 0.014]
> R: proportion of rejection at 1% significance
> [ 0.01 0.008 0.009 0.014]
> new: proportion of rejection at 5% significance
> [ 0.054 0.048 0.048 0.06 ]
> R: proportion of rejection at 5% significance
> [ 0.054 0.048 0.048 0.06 ]
> new: proportion of rejection at 10% significance
> [ 0.108 0.096 0.095 0.109]
> R: proportion of rejection at 10% significance
> [ 0.108 0.096 0.095 0.109]
> _______________________________________________
> SciPy-user mailing list
> SciPy-user at scipy.org
> http://projects.scipy.org/mailman/listinfo/scipy-user
>

From pgmdevlist at gmail.com  Wed Dec  3 12:49:22 2008
From: pgmdevlist at gmail.com (Pierre GM)
Date: Wed, 3 Dec 2008 12:49:22 -0500
Subject: [SciPy-user] scikits.timeseries DateArray Question
In-Reply-To: <49366A7D.63BA.009B.0@twdb.state.tx.us>
References: <49366A7D.63BA.009B.0@twdb.state.tx.us>
Message-ID: 

Dharhas,
The documentation is a bit scarce indeed, and some functions are being
rewritten (eg, loadtxt). For now, here's what you can do:

* First, load your data into an array with np.loadtxt,
matplotlib.mlab.csv2rec, whatever.
>>> loaded = np.loadtxt(...)
As in your example, we'll assume that the array is 6 cols wide, the
first five being year, month, day, hour and min and the last one some
data. No missing values in any of the first 5 cols, or find a way to
fill them.
Because your data are every 15 min or so, we need to use a 'minute'
frequency (code 'T' or 'MIN'). There might be gaps in dates; that's OK
as long as the whole line is missing.
For now, let's use
>>> loaded = [(2008, 1, 1, 12, 0, 1.0), (2008, 1, 1, 12, 15, 2.0), (2008, 1, 1, 18, 0, 3.0)]

* Then, construct a DateArray from those first 5 cols. The simplest is
to rely on datetime for that:
>>> import scikits.timeseries as ts
>>> import datetime
>>> dates = ts.date_array([datetime.datetime(yy,mm,dd,hh,nn) for (yy,mm,dd,hh,nn,_) in loaded], freq='MIN')

* Now, construct your time series
>>> series = ts.time_series([_[-1] for _ in loaded], dates=dates)

Let me know how it goes.
P.

On Dec 3, 2008, at 12:16 PM, Dharhas Pothina wrote:

> Hi,
>
> Thanks to Pierre's suggestion of using the virtualenv package, I now
> have a working install of the scikits.timeseries package. I
> have some questions about constructing a timeseries from data in a
> file. From the file I am reading Year,Month,Day,Hour,Min,Data and I
> need to convert them to a timeseries. I've been looking through the
> documentation and the mailing list archive and I'm not sure how to
> create a DateArray that contains a non-uniform list of datetimes.
> The data is mostly at a 15 minute frequency but sometimes may not
> fall exactly at 00,15,30,45 mins etc and other times may not be
> present. All the examples I've seen involve data that is present at
> a fixed frequency.
>
> i.e.
>
> #Year Month Day Hour Min Data
> 2008 01 01 10 00 2.9
> 2008 01 01 10 15 3.2
> 2008 01 01 10 33 3.1
> 2008 01 01 12 45 3.0
> 2008 01 02 11 15 3.4
> ...
>
> Is there a way to read these dates into a DateArray so I can create
> a timeseries?
> > thanks, > > - dharhas > > > > > > _______________________________________________ > SciPy-user mailing list > SciPy-user at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-user From millman at berkeley.edu Wed Dec 3 14:20:25 2008 From: millman at berkeley.edu (Jarrod Millman) Date: Wed, 3 Dec 2008 11:20:25 -0800 Subject: [SciPy-user] new Kolmogorov-Smirnov test In-Reply-To: <463180e60812030935n1058b1dfi7bc03f6ec8a23a98@mail.gmail.com> References: <1cd32cbb0811291346j76d15b66ud2221653de709139@mail.gmail.com> <463180e60812030935n1058b1dfi7bc03f6ec8a23a98@mail.gmail.com> Message-ID: On Wed, Dec 3, 2008 at 9:35 AM, Roban Hultman Kramer wrote: > Is there a correct two-sample k-s test in scipy? http://www.scipy.org/scipy/scipy/changeset/5213 -- Jarrod Millman Computational Infrastructure for Research Labs 10 Giannini Hall, UC Berkeley phone: 510.643.4014 http://cirl.berkeley.edu/ From josef.pktd at gmail.com Wed Dec 3 14:24:24 2008 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 3 Dec 2008 14:24:24 -0500 Subject: [SciPy-user] new Kolmogorov-Smirnov test In-Reply-To: <463180e60812030935n1058b1dfi7bc03f6ec8a23a98@mail.gmail.com> References: <1cd32cbb0811291346j76d15b66ud2221653de709139@mail.gmail.com> <463180e60812030935n1058b1dfi7bc03f6ec8a23a98@mail.gmail.com> Message-ID: <1cd32cbb0812031124x1bb4bad5yd64122c05298cb57@mail.gmail.com> On Wed, Dec 3, 2008 at 12:35 PM, Roban Hultman Kramer wrote: > Is there a correct two-sample k-s test in scipy? > in scipy.stats.stats def ks_2samp(data1, data2): """ Computes the Kolmogorov-Smirnof statistic on 2 samples. Modified from Numerical Recipies in C, page 493. Returns KS D-value, prob. Not ufunc- like. Returns: KS D-value, p-value It uses the special.kolmogorov which is the asymptotic two-sided distribution. It "looks" ok, but there are no tests for it, and I haven't tested it either. A quick Monte Carlo with your sample size would verify how accurate it is. Josef From josef.pktd at gmail.com Wed Dec 3 14:27:18 2008 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 3 Dec 2008 14:27:18 -0500 Subject: [SciPy-user] new Kolmogorov-Smirnov test In-Reply-To: References: <1cd32cbb0811291346j76d15b66ud2221653de709139@mail.gmail.com> <463180e60812030935n1058b1dfi7bc03f6ec8a23a98@mail.gmail.com> Message-ID: <1cd32cbb0812031127m5f448167pa30cd21537d1896@mail.gmail.com> On Wed, Dec 3, 2008 at 2:20 PM, Jarrod Millman wrote: > On Wed, Dec 3, 2008 at 9:35 AM, Roban Hultman Kramer > wrote: >> Is there a correct two-sample k-s test in scipy? > > http://www.scipy.org/scipy/scipy/changeset/5213 The kstest that I fixed, is a one-sample test, comparing one sample with a theoretical distribution. Josef From matthew.brett at gmail.com Wed Dec 3 14:43:38 2008 From: matthew.brett at gmail.com (Matthew Brett) Date: Wed, 3 Dec 2008 11:43:38 -0800 Subject: [SciPy-user] new Kolmogorov-Smirnov test In-Reply-To: <1cd32cbb0812031124x1bb4bad5yd64122c05298cb57@mail.gmail.com> References: <1cd32cbb0811291346j76d15b66ud2221653de709139@mail.gmail.com> <463180e60812030935n1058b1dfi7bc03f6ec8a23a98@mail.gmail.com> <1cd32cbb0812031124x1bb4bad5yd64122c05298cb57@mail.gmail.com> Message-ID: <1e2af89e0812031143w6341f38cmdef53410ebb3adf5@mail.gmail.com> Hi, > def ks_2samp(data1, data2): > """ Computes the Kolmogorov-Smirnof statistic on 2 samples. Modified > from Numerical Recipies in C, page 493. Returns KS D-value, prob. Not > ufunc- like. Wait - really? We can't use Numerical Recipes code, it has strict and incompatible licensing... 
If it's in there it really has to come out
as fast as possible.

Matthew

From millman at berkeley.edu  Wed Dec  3 14:49:30 2008
From: millman at berkeley.edu (Jarrod Millman)
Date: Wed, 3 Dec 2008 11:49:30 -0800
Subject: [SciPy-user] new Kolmogorov-Smirnov test
In-Reply-To: <1e2af89e0812031143w6341f38cmdef53410ebb3adf5@mail.gmail.com>
References: <1cd32cbb0811291346j76d15b66ud2221653de709139@mail.gmail.com> <463180e60812030935n1058b1dfi7bc03f6ec8a23a98@mail.gmail.com> <1cd32cbb0812031124x1bb4bad5yd64122c05298cb57@mail.gmail.com> <1e2af89e0812031143w6341f38cmdef53410ebb3adf5@mail.gmail.com>
Message-ID: 

On Wed, Dec 3, 2008 at 11:43 AM, Matthew Brett wrote:
>> def ks_2samp(data1, data2):
>> """ Computes the Kolmogorov-Smirnof statistic on 2 samples. Modified
>> from Numerical Recipies in C, page 493. Returns KS D-value, prob. Not
>> ufunc- like.
>
> Wait - really? We can't use Numerical Recipes code, it has strict and
> incompatible licensing... If it's in there it really has to come out
> as fast as possible.

http://www.nr.com/licenses/redistribute.html

-- 
Jarrod Millman
Computational Infrastructure for Research Labs
10 Giannini Hall, UC Berkeley
phone: 510.643.4014
http://cirl.berkeley.edu/

From josef.pktd at gmail.com  Wed Dec  3 15:15:18 2008
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Wed, 3 Dec 2008 15:15:18 -0500
Subject: [SciPy-user] new Kolmogorov-Smirnov test
In-Reply-To: 
References: <1cd32cbb0811291346j76d15b66ud2221653de709139@mail.gmail.com> <463180e60812030935n1058b1dfi7bc03f6ec8a23a98@mail.gmail.com> <1cd32cbb0812031124x1bb4bad5yd64122c05298cb57@mail.gmail.com> <1e2af89e0812031143w6341f38cmdef53410ebb3adf5@mail.gmail.com>
Message-ID: <1cd32cbb0812031215w1b311ee9gd378d0d448e8ef86@mail.gmail.com>

On Wed, Dec 3, 2008 at 2:49 PM, Jarrod Millman wrote:
> On Wed, Dec 3, 2008 at 11:43 AM, Matthew Brett wrote:
>>> def ks_2samp(data1, data2):
>>> """ Computes the Kolmogorov-Smirnof statistic on 2 samples. Modified
>>> from Numerical Recipies in C, page 493. Returns KS D-value, prob. Not
>>> ufunc- like.
>>
>> Wait - really? We can't use Numerical Recipes code, it has strict and
>> incompatible licensing... If it's in there it really has to come out
>> as fast as possible.
>
> http://www.nr.com/licenses/redistribute.html
>
> --
> Jarrod Millman
> Computational Infrastructure for Research Labs
> 10 Giannini Hall, UC Berkeley
> phone: 510.643.4014
> http://cirl.berkeley.edu/
> _______________________________________________
> SciPy-user mailing list
> SciPy-user at scipy.org
> http://projects.scipy.org/mailman/listinfo/scipy-user
>

The algorithm is essentially one loop to calculate the distance
measure; I would assume that this simple algorithm cannot be copyright
protected, but for efficiency, it might be better anyway to come up
with a vectorized version similar to kstest.
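In outline, a vectorized version based directly on the definition would
be something like this (just a sketch of the idea, using the plain
asymptotic p-value without any small-sample correction, and not the
exact code from the attached script):

import numpy as np
from scipy import special

def ks_2samp_sketch(data1, data2):
    # empirical cdfs of both samples, evaluated on the pooled data
    data1, data2 = np.sort(data1), np.sort(data2)
    n1, n2 = len(data1), len(data2)
    data_all = np.concatenate([data1, data2])
    cdf1 = np.searchsorted(data1, data_all, side='right') / float(n1)
    cdf2 = np.searchsorted(data2, data_all, side='right') / float(n2)
    # KS statistic: largest distance between the two empirical cdfs
    d = np.absolute(cdf1 - cdf2).max()
    # asymptotic Kolmogorov distribution for the p-value
    en = np.sqrt(n1 * n2 / float(n1 + n2))
    prob = special.kolmogorov(en * d)
    return d, prob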
about correctness:
=============
A quick Monte Carlo shows that the test is pretty accurate under the
null even for small sample sizes; power to reject, if the alternative
is true, is only reasonably high in larger samples

Null correct
==================================================
Monte Carlo for K-S 2sample test (ks_2samp):
sample size = 100, 1000 replications
sample 1: normal distribution (loc=1.000000,scale=2.000000)
sample 2: normal distribution (loc=1.000000,scale=2.000000)

ks_2samp: proportion of rejection at 1% significance: 0.003
ks_2samp: proportion of rejection at 5% significance: 0.049
ks_2samp: proportion of rejection at 10% significance: 0.101

=========
Null not true:
==================================================
Monte Carlo for K-S 2sample test (ks_2samp):
sample size = 500, 1000 replications
sample 1: normal distribution (loc=0.000000,scale=1.000000)
sample 2: t distribution (dof=10, loc=0.000000,scale=1.000000)

ks_2samp: proportion of rejection at 1% significance: 0.253
ks_2samp: proportion of rejection at 5% significance: 0.71
ks_2samp: proportion of rejection at 10% significance: 0.88

Josef
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: ks2_samp_MCtest.py
URL: 

From timmichelsen at gmx-topmail.de  Wed Dec  3 16:18:24 2008
From: timmichelsen at gmx-topmail.de (Tim Michelsen)
Date: Wed, 03 Dec 2008 22:18:24 +0100
Subject: [SciPy-user] scikits.timeseries DateArray Question [timeseries documentation]
In-Reply-To: 
References: <49366A7D.63BA.009B.0@twdb.state.tx.us> 
Message-ID: 

Hello Dharhas,
welcome as a new timeseries user!
Learning this scikit will soon pay off. I have seen a huge boost in the
simplicity and usability of my analysis through the code I wrote using
the timeseries.

A special praise shall be given to the developers Pierre & Matt. By
patiently answering my "advanced python newbie" questions they really
help me to get the maximum of the numpy.ma and scikits.timeseries tool
box. This did really bring my numerical python coding forward.

> The documentation is a bit scarce indeed, and some functions are being
> rewritten (eg, loadtxt). For now, here's what you can do:
I tried to add some of my questions to the cookbook.
Here is one that may help you in the current situation:
More extensive answer -
http://www.scipy.org/Cookbook/TimeSeries/FAQ#head-9f5c8c4d4aa0de90c9851b972db8b4d8100c2d36

All other Q&A, in the form of mails sent in by other users and myself,
are at gmane:
* creating timeseries for non-conventional custom frequencies -
http://article.gmane.org/gmane.comp.python.scientific.user/15688

Search for "time series" or timeseries
http://search.gmane.org/?query=%22time+series%22+timeseries&author=&group=gmane.comp.python.scientific.user&sort=relevance&DEFAULTOP=or&xP=time%09series&xFILTERS=Gcomp.python.scientific.user---A

Some answers to feature requests may also help:
http://scipy.org/scipy/scikits/query?status=new&status=assigned&status=reopened&status=closed&component=timeseries&order=priority

@Pierre,
Are there plans to include timeseries into the scipy online doc editor?
What do you suggest if I would like to contribute examples here and
there?

> * Now, construct your time series
> >>> series = ts.time_series([_[-1] for _ in loaded], dates=dates)
after this step you'd probably want to fill the missing dates:
series_filled = series.fill_missing_dates()

=> now you can save the data to csv using reportlib from the scikit and
do other neat things.

Hope that helps.
Regards,
Timmie

From Dharhas.Pothina at twdb.state.tx.us  Wed Dec  3 16:40:01 2008
From: Dharhas.Pothina at twdb.state.tx.us (Dharhas Pothina)
Date: Wed, 03 Dec 2008 15:40:01 -0600
Subject: [SciPy-user] scikits.timeseries DateArray Question [timeseries documentation]
In-Reply-To: 
References: <49366A7D.63BA.009B.0@twdb.state.tx.us>
Message-ID: <4936A851.63BA.009B.0@twdb.state.tx.us>

Thank you Tim. I've been following this package for a while. It looks really impressive. The only thing that was holding me back was installation issues and the virtualenv stuff has fixed that.

A question. Why do I need to fill missing dates? Is it required for other things like calculating daily averages etc or is there another reason?

@Pierre & Matt. Please don't take my earlier emails as criticism about the documentation. I am extremely thankful that you have taken the time to develop this package. Seconding Tim, I would like to contribute examples/howto's based on the work I'm doing. If you have any guidance on the best way to do this, that would be great.

- dharhas

>>> Tim Michelsen  12/3/2008 3:18 PM >>>

Hello Dharhas,
welcome as a new timeseries user!
Learning this scikit will soon pay off. I have seen a huge boost in the
simplicity and usability of my analysis through the code I wrote using
the timeseries.

A special praise shall be given to the developers Pierre & Matt. By
patiently answering my "advanced python newbie" questions they really
help me to get the maximum of the numpy.ma and scikits.timeseries tool
box. This did really bring my numerical python coding forward.

> The documentation is a bit scarce indeed, and some functions are being
> rewritten (eg, loadtxt). For now, here's what you can do:
I tried to add some of my questions to the cookbook.
Here is one that may help you in the current situation:
More extensive answer -
http://www.scipy.org/Cookbook/TimeSeries/FAQ#head-9f5c8c4d4aa0de90c9851b972db8b4d8100c2d36

All other Q&A, in the form of mails sent in by other users and myself,
are at gmane:
* creating timeseries for non-conventional custom frequencies -
http://article.gmane.org/gmane.comp.python.scientific.user/15688

Search for "time series" or timeseries
http://search.gmane.org/?query=%22time+series%22+timeseries&author=&group=gmane.comp.python.scientific.user&sort=relevance&DEFAULTOP=or&xP=time%09series&xFILTERS=Gcomp.python.scientific.user---A

Some answers to feature requests may also help:
http://scipy.org/scipy/scikits/query?status=new&status=assigned&status=reopened&status=closed&component=timeseries&order=priority

@Pierre,
Are there plans to include timeseries into the scipy online doc editor?
What do you suggest if I would like to contribute examples here and
there?

> * Now, construct your time series
> >>> series = ts.time_series([_[-1] for _ in loaded], dates=dates)
after this step you'd probably want to fill the missing dates:
series_filled = series.fill_missing_dates()

=> now you can save the data to csv using reportlib from the scikit and
do other neat things.

Hope that helps.
Regards,
Timmie

_______________________________________________
SciPy-user mailing list
SciPy-user at scipy.org
http://projects.scipy.org/mailman/listinfo/scipy-user

From pgmdevlist at gmail.com  Wed Dec  3 16:43:01 2008
From: pgmdevlist at gmail.com (Pierre GM)
Date: Wed, 3 Dec 2008 16:43:01 -0500
Subject: [SciPy-user] scikits.timeseries DateArray Question [timeseries documentation]
In-Reply-To: 
References: <49366A7D.63BA.009B.0@twdb.state.tx.us> 
Message-ID: 

On Dec 3, 2008, at 4:18 PM, Tim Michelsen wrote:
>
> A special praise shall be given to the developers Pierre & Matt. By
> patiently answering my "advanced python newbie" questions they really
> help me to get the maximum of the numpy.ma and scikits.timeseries tool
> box. This did really bring my numerical python coding forward.

Wow, thanks a lot ! I should fwd this message to the-one-of-my-bosses-
who-has-the-money. He'll probably complain that I don't write enough
papers... But still, I really appreciate. Thanks again.

>
> @Pierre,
> Are there plans to include timeseries into the scipy online doc
> editor?
> What do you suggest if I would like to contribute examples here
> and
> there?

Well, Matt and I have been considering making a first official release
for a while, but we keep postponing it (I'm the one to blame).
Hopefully we should be ready for early 2009 (a mere 5 weeks away).
Then, we'll see how we can incorporate the scikits into scipy, or at
least get the docs in the scipy online doc editor. Until then, the
easiest is to contact either Matt or me offlist, so that we can take
your comments into account.

From pgmdevlist at gmail.com  Wed Dec  3 16:50:25 2008
From: pgmdevlist at gmail.com (Pierre GM)
Date: Wed, 3 Dec 2008 16:50:25 -0500
Subject: [SciPy-user] scikits.timeseries DateArray Question [timeseries documentation]
In-Reply-To: <4936A851.63BA.009B.0@twdb.state.tx.us>
References: <49366A7D.63BA.009B.0@twdb.state.tx.us> <4936A851.63BA.009B.0@twdb.state.tx.us>
Message-ID: <7AF3E296-FE68-4514-9D26-55DA2CFFC8D8@gmail.com>

On Dec 3, 2008, at 4:40 PM, Dharhas Pothina wrote:
>
> A question. Why do I need to fill missing dates? Is it required for
> other things like calculating daily averages etc or is there another
> reason?

Well, it is required in some operations, in particular conversion from
one frequency to another. If you don't get any error message about the
dates being incomplete, you're OK. If not, just use fill_missing_dates.
I recognize that it's a lot of wasted space when you have a 15min-
interval series for example, as you end up with a LOT of missing data.
Keep in mind that the package was initially designed for Matt's issues
and mine, and we both usually work with daily frequencies or lower
(monthly...).

> @Pierre & Matt. Please don't take my earlier emails as criticism about
> the documentation. I am extremely thankful that you have taken the
> time to develop this package. Seconding Tim, I would like to
> contribute examples/howto's based on the work I'm doing. If you have
> any guidance on the best way to do this, that would be great.

Oh, don't worry, we don't take it personally. We'd be delighted to
have some help with the documentation: it's always difficult to put
oneself back in the shoes of a newbie when one has been working with a
package for a while. Tutorial and how-tos would be great indeed. I'll
give you the same answer as to Tim: just drop us a line with your
material, we'll find a way to put it on the SVN and the online doc.
Thanks again for your support!
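P.S. For the archives, the fill-then-convert idiom looks like this (a toy
sketch with made-up values; ma.mean skips over the masked entries that
fill_missing_dates introduces):

import datetime
import numpy.ma as ma
import scikits.timeseries as ts

dtimes = [datetime.datetime(2008, 1, 1, 10, 0),
          datetime.datetime(2008, 1, 1, 10, 15),
          datetime.datetime(2008, 1, 1, 12, 45),
          datetime.datetime(2008, 1, 2, 11, 15)]
series = ts.time_series([2.9, 3.2, 3.0, 3.4],
                        dates=ts.date_array(dtimes, freq='MIN'))
filled = series.fill_missing_dates()       # the () matters: it's a method
daily = filled.convert('D', func=ma.mean)  # one mean per day, gaps masked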
From Dharhas.Pothina at twdb.state.tx.us  Wed Dec  3 17:18:46 2008
From: Dharhas.Pothina at twdb.state.tx.us (Dharhas Pothina)
Date: Wed, 03 Dec 2008 16:18:46 -0600
Subject: [SciPy-user] scikits.timeseries DateArray Question [timeseries documentation]
Message-ID: <4936B1660200009B000186DF@GWWEB.twdb.state.tx.us>

Hi,

Almost there I think. I'm getting the following error :

Traceback (most recent call last):
File "/home/dharhas/scripts/selfe/plot-stations_selfevsfield_wfreq.py", line 210, in 
fseries_freq = fseries.convert(freq='D', func=mean)
AttributeError: 'function' object has no attribute 'convert'

What am I doing wrong? The relevant code is below :

year, month, day, hour, minute, fdata = loadtxt(fieldfile,comments="#",usecols=(0,1,2,3,4,ndata),unpack=True)
fielddates = date2num([datetime.datetime(int(y),int(m),int(d),int(hh),int(mm),0) for y,m,d,hh,mm in zip(year,month,day,hour,minute)])
fdates = ts.date_array(fielddates,freq='MIN')
fseries = ts.time_series(fdata, dates=fdates)

# remove -999.9 nodata values for parameter
fseries[fseries==-999.9] = ma.masked
fseries = fseries.fill_missing_dates

# convert to required frequency
fseries_freq = fseries.convert(freq='D', func=mean)

>>> Pierre GM  12/03/08 3:51 PM >>>

On Dec 3, 2008, at 4:40 PM, Dharhas Pothina wrote:
>
> A question. Why do I need to fill missing dates? Is it required for
> other things like calculating daily averages etc or is there another
> reason?

Well, it is required in some operations, in particular conversion from
one frequency to another. If you don't get any error message about the
dates being incomplete, you're OK. If not, just use fill_missing_dates.
I recognize that it's a lot of wasted space when you have a 15min-
interval series for example, as you end up with a LOT of missing data.
Keep in mind that the package was initially designed for Matt's issues
and mine, and we both usually work with daily frequencies or lower
(monthly...).

> @Pierre & Matt. Please don't take my earlier emails as criticism about
> the documentation. I am extremely thankful that you have taken the
> time to develop this package. Seconding Tim, I would like to
> contribute examples/howto's based on the work I'm doing. If you have
> any guidance on the best way to do this, that would be great.

Oh, don't worry, we don't take it personally. We'd be delighted to
have some help with the documentation: it's always difficult to put
oneself back in the shoes of a newbie when one has been working with a
package for a while. Tutorial and how-tos would be great indeed. I'll
give you the same answer as to Tim: just drop us a line with your
material, we'll find a way to put it on the SVN and the online doc.
Thanks again for your support!
_______________________________________________
SciPy-user mailing list
SciPy-user at scipy.org
http://projects.scipy.org/mailman/listinfo/scipy-user

From pgmdevlist at gmail.com  Wed Dec  3 17:25:13 2008
From: pgmdevlist at gmail.com (Pierre GM)
Date: Wed, 3 Dec 2008 17:25:13 -0500
Subject: [SciPy-user] scikits.timeseries DateArray Question [timeseries documentation]
In-Reply-To: <4936B1660200009B000186DF@GWWEB.twdb.state.tx.us>
References: <4936B1660200009B000186DF@GWWEB.twdb.state.tx.us>
Message-ID: <58AFEFC5-7E85-4028-8A36-A2E3A95AF2D8@gmail.com>

On Dec 3, 2008, at 5:18 PM, Dharhas Pothina wrote:

> Hi,
>
> Almost there I think.
> I'm getting the following error :
>
> Traceback (most recent call last):
> File "/home/dharhas/scripts/selfe/plot-
> stations_selfevsfield_wfreq.py", line 210, in 
> fseries_freq = fseries.convert(freq='D', func=mean)
> AttributeError: 'function' object has no attribute 'convert'
>
> What am I doing wrong?

Well, you're obviously accessing a function instead of a TimeSeries
instance...

> # remove -999.9 nodata values for parameter
> fseries[fseries==-999.9] = ma.masked
> fseries = fseries.fill_missing_dates

Here's the culprit: you forgot to put the () after fill_missing_dates.
Therefore, fseries is a reference to the method `fill_missing_dates`,
that is, a function. Put the () and then you reference the output of
the method, which is a TimeSeries object, like you wanted.

From guilherme at gpfreitas.com  Wed Dec  3 17:34:05 2008
From: guilherme at gpfreitas.com (Guilherme P. de Freitas)
Date: Wed, 3 Dec 2008 14:34:05 -0800
Subject: [SciPy-user] Software for economic experiments
Message-ID: 

Hi all

One of our professors here at Caltech is working on regulations of
fishing activity on the Pacific coast. He needs software to run some
experiments that would help test possible regulatory designs. The
rough picture is that the subjects are fishing companies that are
assigned a fishing quota and should be allowed to, at each period,
fish, trade their fishing quotas, and participate in auctions of
fishing quotas whenever the regulator decides to have an auction. An
experiment without the trading or auctions would be a start. The
problem is: we need someone to write the software. Nobody in our
research group is a programmer, and I was wondering if this could not
be done with Python.

A still rough but more detailed description can be found at the bottom
of this email. I will provide a better description later; this is just
so you can have an idea. Let me know if you want to see the
theoretical work.

So the questions are:

1. Does anybody know if Python is a good choice of language for
creating this kind of software?
2. Can anybody point to work already done by someone in the community
that is similar to the one we need?
3. Can anyone recommend individual programmers or companies that would
be able to do this job?
4. Does anybody have an idea of how much such a job would cost?
(Please, three prices: without auctions or trading, with auctions
only, with trading only)
5. Maybe we will find someone to do the coding, but still it may be
useful to have an external "consultant" that provides support to the
coder, and informs us of the quality of the code, makes sure it is
well documented, etc. I personally think it is easy to fall into the
trap of having someone come up with a hack that works for some
situations, and that is ill documented, etc. So, questions 3 and 4
apply to this kind of professional too.
6. What else on top of the rough sketch below does one need to know to
have a good idea of how to do the job and how to price it? I intend to
define the variables and their relations soon.

I guess it is a big bonus if whoever is doing this work for us lives
in the LA/Pasadena area.

Best!

Guilherme

\\-----------------

Software for fishing experiment - Sketch
Guilherme
Last modified: Wed Dec 3 14:22:01 2008

This text presents a very rough sketch of the software needed for
running experiments related to a particular regulation of fishing
markets. Research is conducted by prof. John O. Ledyard at the
Humanities and Social Sciences (HSS) division of the California
Institute of Technology (Caltech).
No regulation
=============

First, let's quickly see the scenario without regulation. There are
fishing companies and the environment. Fishing companies are
characterized by their cost function, which associates to each amount
of fish caught (the catch) a specific monetary cost. The environment
is characterized by an initial fish stock and an evolution/transition
law that associates to each current fish stock (number of fish in the
environment) a fish stock in the next period. Fishing companies have
to decide how much to fish, and their objective is maximizing profit.
Price of fish is determined exogenously by the market.

One could think of the scenario above as one in which only one species
exists or one in which different levels of catch contain the same
proportion of different species of fish, and thus all that is relevant
in terms of price is the average price per unit (say tons) of catch.

An experiment on this scenario would be as follows: each subject in
the experiment would be a different fishing company. Then:

1. Subjects could be informed of the environmental variables: current
fish stock and transition law.
2. Subjects would be informed of fish market price.
3. Subjects would be given a fixed amount of time to decide how much
to fish (say in tons).
4. After decisions were made, profits for that period would be
computed, the population would be reduced by the sum of the catch of
each fishing company (total catch at the period), and the transition
law would then determine how much fish is available in the next
period.

This is a simple experiment: the experimenter has to be able to set:

- The initial fish stock.
- The transition law of the fish population.
- The price of fish at each period.
- The cost function of fishing companies.
- The number of periods that the experiment lasts.
- The number of subjects (maybe better to be automatically set to the
number of subjects that log in)
- Whether or not subjects know the number of fish available at each
stage.
- Whether or not the subjects know the transition law.
- What happens if subjects do not make any decisions in the time they
are given to decide how much to fish. For example, the decision could
be automatically set to fish nothing, or to fish the same as in the
previous period.
- What happens if subjects overfish, that is, if their aggregate
decision implies fishing more than what is available in the
environment. For example, each subject could be given a share
proportional to his fishing decision. That is, if in the aggregate
decisions subject A would be responsible for thirty percent of the
total catch, then he should be given thirty percent of the total
population in case there is overfishing.

Regulation
==========

Now, suppose that there is a regulator that can do the following:

- Determine a total available catch (TAC), that is, a cap on the
aggregate number of fish caught.
- Assign fishing quotas at any stage (or, at least at the first
stage). Quotas are permits to fish x amount of fish for t consecutive
periods.
- Auction fishing quotas at any stage.
- Determine when fishing companies can trade their quotas.

The main situation would be the following: the regulator determines
the TAC to protect the environment, and assigns fishing quotas in the
initial stage with the objective of attaining some economic/political
performance goals.
To account for the dynamic nature of the fishing activity, and for
possible mistakes in the initial assignment, the regulator creates
auctions of quotas at certain stages and lets fishing companies trade
their quotas at certain stages.

This experiment would run as follows (a toy code illustration of the
per-period bookkeeping follows step 5):

1. The experimenter determines:

- The initial fish stock.
- The transition law of the fish population.
- The TAC function.
- The price of fish at each period.
- The cost function of fishing companies.
- The number of periods that the experiment lasts.
- The number of subjects (maybe better to be automatically set to the
number of subjects that log in)
- At which stages auctions are going to be run.
- The rules of the auction.
- At which stages trading quotas is allowed.
- The initial allocation of fishing quotas/permits.

2. Then, the software should inform each subject about:

- The number of periods that the experiment lasts.
- His initial quota and the initial TAC.
- His cost function.
- The initial price of fish.
- Any scheduled auctions.
- Any scheduled "markets", that is, at which stages fishing companies
will be able to trade their fishing quotas.

3. At each stage, each subject should be informed of:

- His current quota.
- Current price of fish.
- Any scheduled auctions.
- Any scheduled markets.
- Number of remaining stages.

4. At each stage, each subject has to decide:

- How much to fish.
- How much to buy and sell in the market, if possible.
- His bids in the auction, if possible.

5. At each stage, each subject could (experimenter determines whether
or not) be informed of:

- Current TAC
- Other fishing companies' quotas
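To give a feel for the kind of bookkeeping involved, here is a toy
illustration in Python of one period of the simpler, unregulated
scenario. All names and functional forms below are placeholders made
up for illustration; they are not part of the actual design:

import numpy as np

price = 10.0     # exogenous fish price
stock = 1000.0   # current fish stock

def cost(catch):
    # placeholder quadratic cost function for a fishing company
    return 0.5 * catch ** 2

def transition(stock, total_catch):
    # placeholder logistic-style transition law for the fish population
    escapement = max(stock - total_catch, 0.0)
    return escapement + 0.2 * escapement * (1.0 - escapement / 2000.0)

decisions = np.array([30.0, 45.0, 25.0])  # one catch decision per subject
total = decisions.sum()
if total > stock:
    # proportional rationing if the subjects overfish
    decisions = decisions * stock / total
    total = stock
profits = price * decisions - cost(decisions)  # per-subject profits
stock = transition(stock, total)               # stock for the next period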
Would the code be free and open source (e.g., BSD license)? Is the existing auction code already FOSS licensed? What language is it in? Feel free to send a couple relevant papers, if that might help me see into your project. I have no substantial fisheries knowledge, but I am an economist with Python programming experience, and I think fisheries issues are interesting. Alan Isaac Assoc Prof Econ American University From robert.kern at gmail.com Wed Dec 3 18:07:25 2008 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 3 Dec 2008 17:07:25 -0600 Subject: [SciPy-user] Software for economic experiments In-Reply-To: References: Message-ID: <3d375d730812031507n4a0de2d8t45488aaea98b1e36@mail.gmail.com> On Wed, Dec 3, 2008 at 16:34, Guilherme P. de Freitas wrote: > Hi all > > One of our professors here at Caltech is working on regulations of > fishing activity in the Pacific coast. He needs software to run some > experiments that would help test possible regulatory designs. The > rough picture is that the subjects are fishing companies that are > assigned a fishing quota and should be allowed to, at each period, > fish, trade their fishing quotas, and participate on auctions of > fishing quotas whenever the regulator decides to have an auction. An > experiment without the trading or auctions would be a start. The > problem is: we need someone to write the software. Nobody in our > research group is a programmer, and I was wondering if this could not > be done with Python. > > A still rough but more detailed description can be found in the bottom > of this email. I will provide a better description later, this is just > so you can have an idea. Let me know if you want to see the > theoretical work. > > So the questions are: > > 1. Does anybody know if Python is a good choice of language for > creating this kind of software? Yes, quite so. > 2. Can anybody point to work already done by someone in the community > that is similar to the one we need? Caltech runs a number of such web-based Econ experiments fairly frequently. Making money from them was a common hobby when I was an undergrad (Ruddock '03). You may want to ask around. Most likely, they were written by students long-since graduated and are poorly documented, but you never know. I would be surprised if you couldn't find examples of similar scope. > 3. Can anyone recommend individual programmers or companies that would > be able to do this job? > 4. Does anybody have an idea of how much such a job would cost? > (Please, three prices: without auctions or trading, with auctions > only, with trading only) > 5. Maybe we will find someone to do the coding, but still it may be > useful to have an external "consultant" that provides support to the > coder, and informs us of the quality of the code, makes sure it is > well documented, etc. I personally think it is easy to fall in the > trap of having someone come up with a hack that works for some > situations, and that is ill documented, etc. So, questions 3 and 4 > apply to this kind of professional too. I'm not sure you'd get many takers for this, and you probably won't get your money's worth. It would be more cost-effective for you to simply hire a professional. > 6. What else on top of the rough sketch below does one need to know to > have a good idea of how to do the job and how to price it? I intend to > define the variables and its relations soon. Some examples of the transition laws, quota allocation strategies, and cost functions would probably help. 
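For instance, even something as small as the following would pin the requirements down (the logistic transition law, the quadratic cost, and all parameters here are invented for illustration, not taken from the project):

    import numpy as np

    def logistic_transition(stock, r=0.4, K=10000.0):
        # Example transition law: logistic growth toward a carrying
        # capacity K (r and K are made-up parameters).
        return stock + r * stock * (1.0 - stock / K)

    def quadratic_cost(catch, a=1.0, b=0.05):
        # Example cost function with increasing marginal cost
        # (a and b are made-up parameters).
        catch = np.asarray(catch, dtype=float)
        return a * catch + b * catch**2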
It would give the bidder a better idea of how much flexibility they will need. Similarly, descriptions of the auction software that you have available and the kinds of auctions and trading that you will need will help a lot. If you want to get new, reusable auction/trading components out of this, you will want a more thorough survey of what other groups in HSS may want, too.

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco

From josef.pktd at gmail.com  Wed Dec 3 21:53:09 2008
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Wed, 3 Dec 2008 21:53:09 -0500
Subject: [SciPy-user] new Kolmogorov-Smirnov 2-sample test not Numerical Recipes based
Message-ID: <1cd32cbb0812031853w622785ddlaa0b5d3f0d8aefd9@mail.gmail.com>

On Wed, Dec 3, 2008 at 3:15 PM,  wrote:
> On Wed, Dec 3, 2008 at 2:49 PM, Jarrod Millman wrote:
>> On Wed, Dec 3, 2008 at 11:43 AM, Matthew Brett wrote:
>>>> def ks_2samp(data1, data2):
>>>> """ Computes the Kolmogorov-Smirnof statistic on 2 samples. Modified
>>>> from Numerical Recipies in C, page 493. Returns KS D-value, prob. Not
>>>> ufunc- like.
>>>
>>> Wait - really? We can't use Numerical Recipes code, it has strict and
>>> incompatible licensing... If it's in there it really has to come out
>>> as fast as possible.
>>
>> http://www.nr.com/licenses/redistribute.html
>>

I did a 2sample kstest based on the definition using search sorted, see function below. Attached is a script file with standalone old and new versions and a Monte Carlo evaluation. I didn't do any speed comparison.

The function works only for one-dimensional data; so does the old one, but there is some initial setup for more than one dimension in the old version that I don't understand.

Main reference used (after googling): http://math.ucsd.edu/~gptesler/283/kolmogorov_smirnov_05.pdf

If there is any interest, I can replace the existing implementation with the new function

Josef

Conclusion: comparing the old implementation (based on Numerical Recipes) with the new implementation:

* if the 2 samples have the same size, the difference in the KS statistic is less than 1e-14
* if the sample sizes differ:
  * in the n1=2, n2=3 example the new version is correct, the old version is wrong
  * Monte Carlo gives essentially identical results (for sample sizes between 100
    and 1000, rejection rates for alpha = 1%, 5%, 10% sometimes differ in the 3rd decimal)

>>> data1 = np.array([1.0,2.0])
>>> data2 = np.array([1.0,2.0,3.0])
>>> ks_2samp_new(data1+0.01,data2)
(0.33333333333333337, 0.99062316386915694)
>>> ks_2samp(data1+0.01,data2)
(0.33333333333333331, 0.99062316386915694)
>>> ks_2samp_new(data1-0.01,data2)
(0.66666666666666674, 0.42490954988801982)
>>> ks_2samp(data1-0.01,data2)   # result not correct
(-0.5, 0.77962787254643151)

===============
def ks_2samp_new(data1, data2):
    """ Computes the Kolmogorov-Smirnov statistic on 2 samples.
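    The statistic is the largest absolute difference between the two
    empirical distribution functions,

        D = max over x of |ECDF1(x) - ECDF2(x)|,

    evaluated below at every point of the pooled sample via
    np.searchsorted on the two sorted samples.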
    Returns: KS D-value, p-value
    """
    # asarray and np refer to numpy; ksprob is the survival function of
    # the Kolmogorov distribution (scipy.special.kolmogorov), as used
    # elsewhere in scipy.stats.
    data1, data2 = map(asarray, (data1, data2))
    n1 = data1.shape[0]
    n2 = data2.shape[0]
    data1 = np.sort(data1)
    data2 = np.sort(data2)
    data_all = np.concatenate([data1, data2])
    # Evaluate both empirical CDFs at every point of the pooled sample.
    cdf1 = np.searchsorted(data1, data_all, side='right') / (1.0*n1)
    cdf2 = np.searchsorted(data2, data_all, side='right') / (1.0*n2)
    d = np.max(np.absolute(cdf1 - cdf2))
    # Note: d is the absolute, not the signed, distance.
    en = np.sqrt(n1*n2/float(n1+n2))
    try:
        prob = ksprob((en + 0.12 + 0.11/en)*d)
    except Exception:
        prob = 1.0
    return d, prob
====================

-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: ks2samp_rewrite_cleaned.py
URL:

From matthew.brett at gmail.com  Wed Dec 3 21:58:49 2008
From: matthew.brett at gmail.com (Matthew Brett)
Date: Wed, 3 Dec 2008 18:58:49 -0800
Subject: [SciPy-user] new Kolmogorov-Smirnov 2-sample test not Numerical Recipes based
In-Reply-To: <1cd32cbb0812031853w622785ddlaa0b5d3f0d8aefd9@mail.gmail.com>
References: <1cd32cbb0812031853w622785ddlaa0b5d3f0d8aefd9@mail.gmail.com>
Message-ID: <1e2af89e0812031858g57025313y75f6dad989c42b26@mail.gmail.com>

Hi,

> I did a 2sample kstest based on the definition using search sorted,
> see function below.

[snip]

> If there is any interest, I can replace the existing implementation
> with the new function

[snip]

> * in the n1=2, n2=3 example the new version is correct, the old version is wrong
> * Monte Carlo gives essentially identical results (for sample sizes between 100
>   and 1000, rejection rates for alpha = 1%, 5%, 10% sometimes differ in the 3rd decimal)

Given more secure license compatibility and that you think the current version is correct where the other is wrong, I think this should go in now, and into the 0.7 release. Any disagreement?

Matthew

From cournape at gmail.com  Wed Dec 3 22:15:27 2008
From: cournape at gmail.com (David Cournapeau)
Date: Thu, 4 Dec 2008 12:15:27 +0900
Subject: [SciPy-user] scipy on old CPU crashes
In-Reply-To: <5b8d13220812022306la9c4abdv279e490e895dc992@mail.gmail.com>
References: <6a4f17690812012336vee84c7bw9c53477f5b811173@mail.gmail.com> <5b8d13220812020652pd138588kf41e70ea99ba72dd@mail.gmail.com> <5b8d13220812022306la9c4abdv279e490e895dc992@mail.gmail.com>
Message-ID: <5b8d13220812031915n19b5e98fw4b9d7303d06ae95d@mail.gmail.com>

On Wed, Dec 3, 2008 at 4:06 PM, David Cournapeau wrote:
> On Tue, Dec 2, 2008 at 11:52 PM, David Cournapeau wrote:
>> On Tue, Dec 2, 2008 at 4:36 PM, oyster wrote:
>>> sorry, but scipy-0.7.0b1-win32-superpack-python2.4.exe and
>>> numpy-1.2.1-win32-superpack-python2.4.exe crash on my old pc too,
>>> which uses duron 750MHz. So now I think it is not the problem with
>>> non-sse/sse/sse2 instruction
>>>
>
> Ok, I checked the machine code in scipy and it seems that the quadpack
> module (used by scipy.integrate) has a couple of SSE instructions.
> Code-wise, it is trivial to solve, but we may need a new numpy version
> for that.

Could you try this installer?

http://www.ar.media.kyoto-u.ac.jp/members/david/archives/scipy/scipy-0.7.0.dev5213-win32-superpack-python2.4.exe

This one should hopefully contain no SSE instructions at all for old CPUs (the _quadpack.pyd contains no SSE2 instructions anymore), and should work.
David From josef.pktd at gmail.com Wed Dec 3 22:31:03 2008 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 3 Dec 2008 22:31:03 -0500 Subject: [SciPy-user] new Kolmogorov-Smirnov 2-sample test not Numerical Recipes based In-Reply-To: <1e2af89e0812031858g57025313y75f6dad989c42b26@mail.gmail.com> References: <1cd32cbb0812031853w622785ddlaa0b5d3f0d8aefd9@mail.gmail.com> <1e2af89e0812031858g57025313y75f6dad989c42b26@mail.gmail.com> Message-ID: <1cd32cbb0812031931g19fd7d8fs81fb75a103816d83@mail.gmail.com> Hi, I forgot to ask: I took the speed of convergence correction for the asymptotic KS distribution from the existing implementation. With a quick look on the internet I didn't find any reference for the correction. en = np.sqrt(n1*n2/float(n1+n2)) # same as in http://en.wikipedia.org/wiki/Kolmogorov-Smirnov_test prob = ksprob((en+0.12+0.11/en)*d) I'm curious why it is not just the sqrt(n) analog, i.e. prob = ksprob(en*d) I tried it in the Monte Carlo and ksprob(en*d) has slightly less power than ksprob((en+0.12+0.11/en)*d). Josef From robert.kern at gmail.com Wed Dec 3 23:18:09 2008 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 3 Dec 2008 22:18:09 -0600 Subject: [SciPy-user] new Kolmogorov-Smirnov 2-sample test not Numerical Recipes based In-Reply-To: <1e2af89e0812031858g57025313y75f6dad989c42b26@mail.gmail.com> References: <1cd32cbb0812031853w622785ddlaa0b5d3f0d8aefd9@mail.gmail.com> <1e2af89e0812031858g57025313y75f6dad989c42b26@mail.gmail.com> Message-ID: <3d375d730812032018v2c09d1a6w26c88b86e63dba2d@mail.gmail.com> On Wed, Dec 3, 2008 at 20:58, Matthew Brett wrote: > Hi, > >> I did a 2sample kstest based on the definition using search sorted, >> see function below. > > [snip] > >> If there is any interest, I can replace the existing implementation >> with the new function > > [snip] > >> * in n1=2, n2=3 example new version is correct, old version is wrong >> * Monte Carlo essentially identical results (sample sizes between 100 and 1000 >> rejection rates for alpha = 1%,5%,10% sometimes differ in 3rd decimal > > Given more secure license compatibility and that you think the current > version is correct where the other is wrong, I think this should go in > now, and into the 0.7 release. Any disagreement? None from me. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From cournape at gmail.com Thu Dec 4 00:16:27 2008 From: cournape at gmail.com (David Cournapeau) Date: Thu, 4 Dec 2008 14:16:27 +0900 Subject: [SciPy-user] new Kolmogorov-Smirnov 2-sample test not Numerical Recipes based In-Reply-To: <1e2af89e0812031858g57025313y75f6dad989c42b26@mail.gmail.com> References: <1cd32cbb0812031853w622785ddlaa0b5d3f0d8aefd9@mail.gmail.com> <1e2af89e0812031858g57025313y75f6dad989c42b26@mail.gmail.com> Message-ID: <5b8d13220812032116g1dc91346u979550e55a9582d1@mail.gmail.com> On Thu, Dec 4, 2008 at 11:58 AM, Matthew Brett wrote: > Hi, > >> I did a 2sample kstest based on the definition using search sorted, >> see function below. 
>
> [snip]
>
>> If there is any interest, I can replace the existing implementation
>> with the new function
>
> [snip]
>
>> * in the n1=2, n2=3 example the new version is correct, the old version is wrong
>> * Monte Carlo gives essentially identical results (for sample sizes between 100
>>   and 1000, rejection rates for alpha = 1%, 5%, 10% sometimes differ in the 3rd decimal)
>
> Given more secure license compatibility and that you think the current
> version is correct where the other is wrong, I think this should go in
> now, and into the 0.7 release. Any disagreement?

Nope, this change is fine. I won't complain this time :)

David

From guilherme at gpfreitas.com  Thu Dec 4 03:13:39 2008
From: guilherme at gpfreitas.com (Guilherme P. de Freitas)
Date: Thu, 4 Dec 2008 00:13:39 -0800
Subject: [SciPy-user] Software for economic experiments
In-Reply-To: <493710A3.4070103@american.edu>
References: <493710A3.4070103@american.edu>
Message-ID:

>> I was wondering if this could not
>> be done with Python.
>
> It certainly can be done with Python.
> I might be interested in helping.

Great! As for the language, I feel that not only is it possible to use Python, but that it is also a good choice. Do you agree? Do you suggest another language or set of languages? I know this is a Python mailing list, so we are all biased, but... anyway, it's good to check.

> Will one subject at a time interact with the software, or
> would it need to synchronously elicit responses from
> multiple human subjects? What kind of network configuration
> will you rely on in the lab?

I still have to find out about the details. See, I'm not directly involved in the project; I'm just a grad student who works with the professor on other things, but who saw the hacks they were doing to run experiments and thought "there must be a better way to do this!". That said, from talking to the grad student who is actually involved in the project, I think that at each stage (which lasts, say, 90s) subjects make their decisions (how much to fish, demand for fishing quotas, supply of fishing quotas, bids in the auction for quotas). It is not clear to me what kind of "money" the subjects would use in the auction, whether it's fake money distributed by the auctioneer or whether it comes from their profits. It should come from profits, but I have to get the details.

> What department are you in? What are the research goals?
> Is coauthorship a possibility? Would the code be free and
> open source (e.g., BSD license)? Is the existing auction
> code already FOSS licensed? What language is it in?

I'm a PhD student in the Humanities and Social Sciences division at Caltech. This research is led by Prof. John Ledyard, who is my main advisor at the moment. From what I understood, the goals are to set up new regulations for fisheries on the Pacific coast that are more efficient, environmentally "healthy", and politically viable (for example, current fishing companies will have to be given a bigger share of the pie).

As for co-authorship, again, I'm not part of the project, but I can ask the people involved about it. I can't say for sure, but I firmly believe the code will be licensed under some FOSS license. I would definitely make a lot of noise here if it did not!

The trading (general equilibrium) software is called jMarkets, and it is written in Java. Here's the website:

http://jmarkets.ssel.caltech.edu/

It is licensed under the GPL, as you can see on the website. Prof. Charles Plott is the main person behind jMarkets at the moment, from what I know.
There is also a jAuctions: http://www.hss.caltech.edu/~jkg/jAuctions.html also in Java. Prof. Jacob Goeree is the main person behind jAuctions. Anyway, I'll ask John (Ledyard) about these details too. > Feel free to send a couple relevant papers, > if that might help me see into your project. > I have no substantial fisheries knowledge, > but I am an economist with Python programming > experience, and I think fisheries issues are > interesting. Great! We'll be in touch. -- Guilherme P. de Freitas http://www.gpfreitas.com From guilherme at gpfreitas.com Thu Dec 4 03:19:01 2008 From: guilherme at gpfreitas.com (Guilherme P. de Freitas) Date: Thu, 4 Dec 2008 00:19:01 -0800 Subject: [SciPy-user] Software for economic experiments In-Reply-To: <3d375d730812031507n4a0de2d8t45488aaea98b1e36@mail.gmail.com> References: <3d375d730812031507n4a0de2d8t45488aaea98b1e36@mail.gmail.com> Message-ID: >> 2. Can anybody point to work already done by someone in the community >> that is similar to the one we need? > > Caltech runs a number of such web-based Econ experiments fairly > frequently. Making money from them was a common hobby when I was an > undergrad (Ruddock '03). You may want to ask around. Most likely, they > were written by students long-since graduated and are poorly > documented, but you never know. I would be surprised if you couldn't > find examples of similar scope. Sorry, I should have said work done *with Python*. From what I know, people have mostly used Java. >> 5. Maybe we will find someone to do the coding, but still it may be >> useful to have an external "consultant" that provides support to the >> coder, and informs us of the quality of the code, makes sure it is >> well documented, etc. I personally think it is easy to fall in the >> trap of having someone come up with a hack that works for some >> situations, and that is ill documented, etc. So, questions 3 and 4 >> apply to this kind of professional too. > > I'm not sure you'd get many takers for this, and you probably won't > get your money's worth. It would be more cost-effective for you to > simply hire a professional. I agree. I just need to find such a professional. In any case, it would be good at some point that grad students learned to do this themselves, at least those willing to learn it. >> 6. What else on top of the rough sketch below does one need to know to >> have a good idea of how to do the job and how to price it? I intend to >> define the variables and its relations soon. > > Some examples of the transition laws, quota allocation strategies, and > cost functions would probably help. It would give the bidder a better > idea of how much flexibility they will need. Similarly, descriptions > of the auction software that you have available and the kinds of > auctions and trading that you will need will help a lot. If you want > to get new, reusable auction/trading components out of this, you will > want a more thorough survey of what other groups in HSS may want, too. Thanks for the tips! -- Guilherme P. de Freitas http://www.gpfreitas.com From cimrman3 at ntc.zcu.cz Thu Dec 4 09:51:28 2008 From: cimrman3 at ntc.zcu.cz (Robert Cimrman) Date: Thu, 04 Dec 2008 15:51:28 +0100 Subject: [SciPy-user] ANN: SfePy 2008.4 Message-ID: <4937EE70.4030603@ntc.zcu.cz> I am pleased to announce the release of SfePy 2008.4. SfePy (simple finite elements in Python) is a finite element analysis software based primarily on Numpy and SciPy. 
Mailing lists, issue tracking, mercurial repository: http://sfepy.org
Home page: http://sfepy.kme.zcu.cz

Major improvements:
- framework for running parametric studies
- greatly improved support for time-dependent problems
- time derivatives of variables as term arguments
- initial conditions via ics, ic_ keywords
- live plotting using multiprocessing module
- type of term arguments determined fully at run-time
- new terms, namely piezo-coupling

Applications:
- enhanced acoustic band gaps code
- dispersion analysis (polarization angle calculation)
- applied load tensor computation
- phase velocity computation for periodic perforated media with empty holes
- improved schroedinger.py - plotting DFT iterations

For more information on this release, see http://sfepy.googlecode.com/svn/web/releases/2008.4_RELEASE_NOTES.txt

Best regards,
Robert Cimrman

From Dharhas.Pothina at twdb.state.tx.us  Thu Dec 4 10:55:55 2008
From: Dharhas.Pothina at twdb.state.tx.us (Dharhas Pothina)
Date: Thu, 04 Dec 2008 09:55:55 -0600
Subject: [SciPy-user] scikits.timeseries DateArray Question [timeseries documentation]
In-Reply-To: <58AFEFC5-7E85-4028-8A36-A2E3A95AF2D8@gmail.com>
References: <4936B1660200009B000186DF@GWWEB.twdb.state.tx.us> <58AFEFC5-7E85-4028-8A36-A2E3A95AF2D8@gmail.com>
Message-ID: <4937A92B.63BA.009B.0@twdb.state.tx.us>

Thank you, Pierre. Looks like it's working now. I have some plotting questions, but I will make a new thread for those.

- dharhas

>>> Pierre GM  12/3/2008 4:25 PM >>>
On Dec 3, 2008, at 5:18 PM, Dharhas Pothina wrote:
> Hi,
>
> Almost there I think. I'm getting the following error :
>
> Traceback (most recent call last):
>   File "/home/dharhas/scripts/selfe/plot-
> stations_selfevsfield_wfreq.py", line 210, in
>     fseries_freq = fseries.convert(freq='D', func=mean)
> AttributeError: 'function' object has no attribute 'convert'
>
> What am I doing wrong?

Well, you're obviously accessing a function instead of a TimeSeries instance...

> #remove -999.9 nodata values for parameter
> fseries[fseries==-999.9] = ma.masked
> fseries = fseries.fill_missing_dates

Here's the culprit: you forgot to put the () after fill_missing_dates. Therefore, fseries is a reference to the method `fill_missing_dates`, that is, a function. Put the () and then you reference the output of the method, which is a TimeSeries object, like you wanted.

_______________________________________________
SciPy-user mailing list
SciPy-user at scipy.org
http://projects.scipy.org/mailman/listinfo/scipy-user

From Dharhas.Pothina at twdb.state.tx.us  Thu Dec 4 11:10:18 2008
From: Dharhas.Pothina at twdb.state.tx.us (Dharhas Pothina)
Date: Thu, 04 Dec 2008 10:10:18 -0600
Subject: [SciPy-user] scikit.timeseries plotting bugs?
Message-ID: <4937AC8A0200009B00018721@GWWEB.twdb.state.tx.us>

Hi,

I'm trying to make some plots with two different timeseries on them and have come across some strange behavior. I have two timeseries: one is every 15 mins, and the other is daily averages created from the 15 minute series (fseries & fseries_freq).

If I use two commands and '.' or 'b+' as the symbol for the first series, it works, i.e.:

fsp1.tsplot(fseries, '.')
fsp1.tsplot(fseries_freq, 'r--')

If I use two commands and certain other symbols like 'b', 'b--', etc. for the first series, it only plots the second series:

fsp1.tsplot(fseries, 'b.')
fsp1.tsplot(fseries_freq, 'r--')

If I combine the two in one command, as shown in the moving average plotting example in the documentation, it only shows the second series, whatever I put as the symbol in the first series,
i.e.:

fsp2 = fig2.add_tsplot(111)
fsp2.tsplot(fseries, '.', fseries_freq, 'r--')

Am I missing something, or is this a bug? Also, the plots are fairly slow; is that just the overhead of plotting through the timeseries package?

- dharhas

Entire code below:

year, month, day, hour, minute, fdata = loadtxt(fieldfile,comments="#",usecols=(0,1,2,3,4,ndata),unpack=True)
fielddates = [datetime.datetime(int(y),int(m),int(d),int(hh),int(mm),0) for y,m,d,hh,mm in zip(year,month,day,hour,minute)]
fdates = ts.date_array(fielddates,freq='MIN')
fseries = ts.time_series(fdata, dates=fdates)

#remove -999.9 nodata values for parameter
fseries[fseries==-999.9] = ma.masked
fseries = fseries.fill_missing_dates()

#convert to required frequency
fseries_freq = fseries.convert(freq=freq, func=mean)

fig1 = tpl.tsfigure()
fsp1 = fig1.add_tsplot(111)
fsp1.tsplot(fseries, '.')
fsp1.tsplot(fseries_freq, 'r--')

fig2 = tpl.tsfigure()
fsp2 = fig2.add_tsplot(111)
fsp2.tsplot(fseries, '.', fseries_freq, 'r--')

From Dharhas.Pothina at twdb.state.tx.us  Thu Dec 4 13:00:54 2008
From: Dharhas.Pothina at twdb.state.tx.us (Dharhas Pothina)
Date: Thu, 04 Dec 2008 12:00:54 -0600
Subject: [SciPy-user] Efficient way of finding the parent element of a point in an unstructured triangular mesh.
Message-ID: <4937C676.63BA.009B.0@twdb.state.tx.us>

Hi,

I have an unstructured triangular mesh, i.e., a list of nodal locations and a list of element connectivity.

node #, x, y
1 10.0 10.0
2 10.0 12.0
...

element #, node1, node2, node3
1 4 3 5
2 1 3 7
3 2 9 6
...

where node1, node2 and node3 are the nodes that make up each triangle element.

Given a set of x,y points, I need to create a list of parent elements, i.e., for each point I need to find the triangle that contains it.

Presently, for each point I cycle through the list of elements, and for each element I calculate whether the point is inside or outside the triangle by checking whether the sum of the areas of the triangles formed by the point and each pair of nodes equals the area of the element.

This works but is inefficient, because I'm cycling through the entire list of elements for each point and calculating the areas each time. Do any of you know of a more efficient way to do this?

thanks,

- dharhas

From timmichelsen at gmx-topmail.de  Thu Dec 4 17:10:09 2008
From: timmichelsen at gmx-topmail.de (Tim Michelsen)
Date: Thu, 04 Dec 2008 23:10:09 +0100
Subject: [SciPy-user] scikit.timeseries plotting bugs?
In-Reply-To: <4937AC8A0200009B00018721@GWWEB.twdb.state.tx.us>
References: <4937AC8A0200009B00018721@GWWEB.twdb.state.tx.us>
Message-ID:

> #convert to required frequency
> fseries_freq = fseries.convert(freq=freq, func=mean)
from a quick look you seem not to have specified freq here.

see also ts.extras.guess_freq

Also I do not know whether one can create a plot from timeseries with different frequencies. With reports, this wouldn't work.

Regards!

From pgmdevlist at gmail.com  Thu Dec 4 17:19:34 2008
From: pgmdevlist at gmail.com (Pierre GM)
Date: Thu, 4 Dec 2008 17:19:34 -0500
Subject: [SciPy-user] scikit.timeseries plotting bugs?
In-Reply-To:
References: <4937AC8A0200009B00018721@GWWEB.twdb.state.tx.us>
Message-ID: <8B1564CC-DC54-4FF4-9569-EE0DEC671760@gmail.com>

Guys, I'm not MIA, I'm just trying to get mpl running on this new machine I'm playing with.

On Dec 4, 2008, at 5:10 PM, Tim Michelsen wrote:
>> #convert to required frequency
>> fseries_freq = fseries.convert(freq=freq, func=mean)
> from a quick look you seem not to have specified freq here.
>
> see also ts.extras.guess_freq
>
> Also I do not know whether one can create a plot from timeseries with
> different frequencies. With reports, this wouldn't work.
>
> Regards!
>
> _______________________________________________
> SciPy-user mailing list
> SciPy-user at scipy.org
> http://projects.scipy.org/mailman/listinfo/scipy-user

From pgmdevlist at gmail.com  Thu Dec 4 18:29:24 2008
From: pgmdevlist at gmail.com (Pierre GM)
Date: Thu, 4 Dec 2008 18:29:24 -0500
Subject: [SciPy-user] scikit.timeseries plotting bugs?
In-Reply-To: <4937AC8A0200009B00018721@GWWEB.twdb.state.tx.us>
References: <4937AC8A0200009B00018721@GWWEB.twdb.state.tx.us>
Message-ID:

OK, all is well...

On Dec 4, 2008, at 11:10 AM, Dharhas Pothina wrote:
>
> If I use two commands and '.' or 'b+' as the symbol for the first
> series, it works, i.e.:
>
> fsp1.tsplot(fseries, '.')
> fsp1.tsplot(fseries_freq, 'r--')

Yep. In that case, by the time you try to plot fseries_freq, fsp1.freq has been initialized to series.freq, and fseries_freq is automatically adjusted to match fsp1.freq.

>
> If I use two commands and certain other symbols like 'b', 'b--', etc.
> for the first series, it only plots the second series:
>
> fsp1.tsplot(fseries, 'b.')
> fsp1.tsplot(fseries_freq, 'r--')

??? Can't reproduce this one...

>
> If I combine the two in one command as shown in the moving average
> plotting example in the documentation it only shows the second
> series whatever I put as the symbol in the first series, i.e.:
>
> fsp2 = fig2.add_tsplot(111)
> fsp2.tsplot(fseries, '.', fseries_freq, 'r--')
>

Well, it shows the first series, not the second one... And yes, that can be characterized as a bug. See, when you plot everything with the same command, fsp2.freq hasn't been initialized yet. The initialization takes place at the end of the loop on the arguments. By then, fseries_freq has been plotted, but at its own frequency, viz. using a range of xs far smaller than that of fseries. So in fact, fseries_freq is plotted, but at the wrong xs, and you can't see it. We should update the frequency in real time, not at the end of the loop. Hop, something else on my todo list.

> Am I missing something, or is this a bug? Also, the plots are fairly
> slow; is that just the overhead of plotting through the timeseries
> package?

Yes, all the fancy automatic adjustment of the ticks to match your resolution comes at a price. And it's expensive...

From mattknox.ca at gmail.com  Thu Dec 4 20:50:20 2008
From: mattknox.ca at gmail.com (Matt Knox)
Date: Fri, 5 Dec 2008 01:50:20 +0000 (UTC)
Subject: [SciPy-user] scikit.timeseries plotting bugs?
References: <4937AC8A0200009B00018721@GWWEB.twdb.state.tx.us>
Message-ID:

> see also ts.extras.guess_freq

I wouldn't recommend using this function as is. It is not included in the docs right now because it isn't very robust or well tested (as opposed to the usual reasons of laziness or lack of time :P ). I don't think it works very well.

> Also I do not know whether one can create a plot from timeseries with
> different frequencies. With reports, this wouldn't work.

Plotting series of different frequencies is not supported. Apparently this doesn't raise an error right now, but it probably should. It is easy enough to convert series to different frequencies that I think it makes sense to require the user to explicitly convert the series to a common frequency before plotting rather than try to guess what the user intended.
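For example, an explicit conversion of a minute series to a common daily frequency before plotting could look like this (the data here are hypothetical, and ts.now is assumed to give the current date at the given frequency; the convert call mirrors the one used earlier in this thread):

import numpy as np
import numpy.ma as ma
import scikits.timeseries as ts
import scikits.timeseries.lib.plotlib as tpl

# Two days of minute data, reduced to daily means before plotting.
minutes = ts.time_series(np.arange(2880.), start_date=ts.now('MIN'))
daily = minutes.convert(freq='D', func=ma.mean)  # explicit common frequency

fig = tpl.tsfigure()
fsp = fig.add_tsplot(111)
fsp.tsplot(daily, 'r--')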
From mattknox.ca at gmail.com  Thu Dec 4 20:56:25 2008
From: mattknox.ca at gmail.com (Matt Knox)
Date: Fri, 5 Dec 2008 01:56:25 +0000 (UTC)
Subject: [SciPy-user] scikit.timeseries plotting bugs?
References: <4937AC8A0200009B00018721@GWWEB.twdb.state.tx.us>
Message-ID:

>> Am I missing something, or is this a bug? Also, the plots are fairly
>> slow; is that just the overhead of plotting through the timeseries
>> package?
>
> Yes, all the fancy automatic adjustment of the ticks to match your
> resolution comes at a price. And it's expensive...

It is expensive, but we were also a bit sloppy here. I just committed a change that should speed things up by about a factor of 3-4. Please test when you guys get a chance. There is probably room for further improvement, but I think the performance will generally be acceptable for most uses now.

From pgmdevlist at gmail.com  Thu Dec 4 21:01:25 2008
From: pgmdevlist at gmail.com (Pierre GM)
Date: Thu, 4 Dec 2008 21:01:25 -0500
Subject: [SciPy-user] scikit.timeseries plotting bugs?
In-Reply-To:
References: <4937AC8A0200009B00018721@GWWEB.twdb.state.tx.us>
Message-ID: <39DECBC1-DF83-43BC-95AA-B25A60CA42EF@gmail.com>

>> see also ts.extras.guess_freq
>
> I wouldn't recommend using this function as is. It is not included
> in the docs right now because it isn't very robust or well tested
> (as opposed to the usual reasons of laziness or lack of time :P ).
> I don't think it works very well.

Well, we could in theory try to come up with something, following the example of the StringConverter (viz., you start with one freq; if this doesn't work, go to the next one...), but don't expect it to happen any time soon.

>> Also I do not know whether one can create a plot from timeseries with
>> different frequencies. With reports, this wouldn't work.
>
> Plotting series of different frequencies is not supported.
> Apparently this doesn't raise an error right now, but it probably
> should. It is easy enough to convert series to different frequencies
> that I think it makes sense to require the user to explicitly convert
> the series to a common frequency before plotting rather than try to
> guess what the user intended.

That's actually not the way it was designed. In normal procedure, you create a tsplot with a time_series attached. The plot gets its frequency from this series, and all the series that are plotted are transformed to the plot frequency using their .asfreq method: in other words, you don't change the data, just the representation of the dates. It's quite convenient, and I don't see why we should raise an exception. Issue a warning, perhaps, but let it work. We need to document that.

From pgmdevlist at gmail.com  Thu Dec 4 21:02:15 2008
From: pgmdevlist at gmail.com (Pierre GM)
Date: Thu, 4 Dec 2008 21:02:15 -0500
Subject: [SciPy-user] scikit.timeseries plotting bugs?
In-Reply-To:
References: <4937AC8A0200009B00018721@GWWEB.twdb.state.tx.us>
Message-ID: <81BA1796-7290-4915-844D-EE9620D03342@gmail.com>

On Dec 4, 2008, at 8:56 PM, Matt Knox wrote:
>>
>> Yes, all the fancy automatic adjustment of the ticks to match your
>> resolution comes at a price. And it's expensive...
>
> It is expensive, but we were also a bit sloppy here. I just committed
> a change that should speed things up by about a factor of 3-4. Please
> test when you guys get a chance. There is probably room for further
> improvement, but I think the performance will generally be acceptable
> for most uses now.

OK, will do. Thx a lot btw.
From c.j.lee at tnw.utwente.nl  Fri Dec 5 04:11:35 2008
From: c.j.lee at tnw.utwente.nl (Chris Lee)
Date: Fri, 5 Dec 2008 10:11:35 +0100
Subject: [SciPy-user] development timeline for scipy and numpy
Message-ID: <831BBF3C-7053-4D44-AD1B-0EF73FF1C116@tnw.utwente.nl>

Hi All,

This is just a general enquiry on the timeline for scipy and numpy development. When these packages move to Python 3.0, they won't be 2.x compatible anymore, right? When is that move planned?

I am not actually asking for the move to be quick, because I don't want to rewrite my code. However, if it is going to be soon (and I want to take advantage of new features in scipy/numpy), then I will need to do some prep-work in learning 3.0.

I would like to either know that I can forget about it for a while, or get it scheduled.

Cheers
Chris

***************************************************
Chris Lee
Laser Physics and Nonlinear Optics Group
MESA+ Research Institute for Nanotechnology
University of Twente
Phone: ++31 (0)53 489 3968
fax: ++31 (0)53 489 1102
***************************************************
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From david at ar.media.kyoto-u.ac.jp  Fri Dec 5 04:26:57 2008
From: david at ar.media.kyoto-u.ac.jp (David Cournapeau)
Date: Fri, 05 Dec 2008 18:26:57 +0900
Subject: [SciPy-user] development timeline for scipy and numpy
In-Reply-To: <831BBF3C-7053-4D44-AD1B-0EF73FF1C116@tnw.utwente.nl>
References: <831BBF3C-7053-4D44-AD1B-0EF73FF1C116@tnw.utwente.nl>
Message-ID: <4938F3E1.2030906@ar.media.kyoto-u.ac.jp>

Chris Lee wrote:
>
> I would like to either know that I can forget about it for a while, or
> get it scheduled.

I think you can safely forget about it for a while :)

http://projects.scipy.org/pipermail/numpy-discussion/2008-December/039017.html

cheers,

David

From cimrman3 at ntc.zcu.cz  Fri Dec 5 05:09:37 2008
From: cimrman3 at ntc.zcu.cz (Robert Cimrman)
Date: Fri, 05 Dec 2008 11:09:37 +0100
Subject: [SciPy-user] Efficient way of finding the parent element of a point in an unstructured triangular mesh.
In-Reply-To: <4937C676.63BA.009B.0@twdb.state.tx.us>
References: <4937C676.63BA.009B.0@twdb.state.tx.us>
Message-ID: <4938FDE1.3030408@ntc.zcu.cz>

Hi Dharhas,

Dharhas Pothina wrote:
> Hi,
>
> I have an unstructured triangular mesh, i.e., a list of nodal locations
> and a list of element connectivity.
>
> node #, x, y
> 1 10.0 10.0
> 2 10.0 12.0
> ...
>
> element #, node1, node2, node3
> 1 4 3 5
> 2 1 3 7
> 3 2 9 6
> ...
>
> where node1, node2 and node3 are the nodes that make up each triangle
> element.
>
> Given a set of x,y points, I need to create a list of parent elements,
> i.e., for each point I need to find the triangle that contains it.
>
> Presently, for each point I cycle through the list of elements, and for
> each element I calculate whether the point is inside or outside the
> triangle by checking whether the sum of the areas of the triangles
> formed by the point and each pair of nodes equals the area of the element.
>
> This works but is inefficient, because I'm cycling through the entire
> list of elements for each point and calculating the areas each time.
> Do any of you know of a more efficient way to do this?

You could create an inverted connectivity - like you have the list of elements which point to the nodes, you would have, for each node, a list of elements the node is contained in.
something like (not tested, slow, just to get the idea):

iconn = [[] for ii in xrange( nodes.shape[0] )]
for iel, row in enumerate( elements ):
    for node in row[1:]:
        iconn[node].append( iel )

Then you would have the problem of finding the nearest neighbours in two sets of points, for which several algorithms exist (see e.g. [1], [2]). After matching the points, just look at the iconn and choose one of the elements.

cheers,
r.

[1] scipy.spatial.KDTree (in SVN version, added recently by Anne M. Archibald)
[2] http://www.cs.umd.edu/~mount/ANN/

From Dharhas.Pothina at twdb.state.tx.us  Fri Dec 5 11:11:34 2008
From: Dharhas.Pothina at twdb.state.tx.us (Dharhas Pothina)
Date: Fri, 05 Dec 2008 10:11:34 -0600
Subject: [SciPy-user] scikit.timeseries plotting bugs?
In-Reply-To:
References: <4937AC8A0200009B00018721@GWWEB.twdb.state.tx.us>
Message-ID: <4938FE56.63BA.009B.0@twdb.state.tx.us>

@ Tim :

>> fseries_freq = fseries.convert(freq=freq, func=mean)
> from a quick look you seem not to have specified freq here.

I had pasted a piece of a script. The value of freq is passed in from the command line.

@ Pierre :

>> If I use two commands and certain other symbols like 'b', 'b--', etc.
>> for the first series, it only plots the second series:
> ??? Can't reproduce this one...

I've double-checked; it's happening consistently with multiple datasets (the datasets are similar, just at different locations). How do I track down the problem? Should I send you a datafile and my short script?

@ Matt & Pierre :

So, if I understand correctly, if I plot multiple time series with different frequencies, it should use the .asfreq method to transform the other series to the first series' frequency.

I want to make sure I understand .asfreq correctly.
1) If .asfreq is converting to a higher frequency, the data stays the same and extra dates are added with no data/masked values.
2) If converting to a lower frequency, multiple data points end up on the same date.

The implication is that, for what I am trying to do, I need to put the highest frequency plot first.

> That's actually not the way it was designed. In normal procedure, you
> create a tsplot with a time_series attached. The plot gets its
> frequency from this series, and all the series that are plotted are
> transformed to the plot frequency using their .asfreq method: in other
> words, you don't change the data, just the representation of the
> dates. It's quite convenient, and I don't see why we should raise an
> exception. Issue a warning, perhaps, but let it work.
> We need to document that.

Another question: how do I set the x limits? In my plot, the two timeseries go from 12/15/05 - 04/01/07 and from 01/01/06 - 12/31/06. If I do not specify the x limits, the x ticks and labels look very nice and well spaced. However, if I try to limit the plot to a date range by using fsp.set_xlim(startdate,enddate), where startdate and enddate are datetimes, it works, but the major/minor ticks don't look correct and the labeling looks bad.
thanks,

- dharhas

From Dharhas.Pothina at twdb.state.tx.us  Fri Dec 5 11:40:28 2008
From: Dharhas.Pothina at twdb.state.tx.us (Dharhas Pothina)
Date: Fri, 05 Dec 2008 10:40:28 -0600
Subject: [SciPy-user] Efficient way of finding the parent element of a point in an unstructured triangular mesh
In-Reply-To: <4938FDE1.3030408@ntc.zcu.cz>
References: <4937C676.63BA.009B.0@twdb.state.tx.us> <4938FDE1.3030408@ntc.zcu.cz>
Message-ID: <4939051C.63BA.009B.0@twdb.state.tx.us>

Hi Robert,

Ok, so if I understood you correctly, what I would be doing is using the nearest neighbour algorithm to find the closest node to the point, and then looping through the 5 or 6 elements that have that node as a vertex, checking for each of those elements whether the point is inside.

From the sound of it, as long as the ANN search is fairly efficient, I might be able to speed the process up quite a bit.

I'll probably try your second reference; I'm not too keen on installing the SVN version.

thanks,

- dharhas

>>> Robert Cimrman  12/5/2008 4:09 AM >>>
Hi Dharhas,

Dharhas Pothina wrote:
> Hi,
>
> I have an unstructured triangular mesh, i.e., a list of nodal locations
> and a list of element connectivity.
>
> node #, x, y
> 1 10.0 10.0
> 2 10.0 12.0
> ...
>
> element #, node1, node2, node3
> 1 4 3 5
> 2 1 3 7
> 3 2 9 6
> ...
>
> where node1, node2 and node3 are the nodes that make up each triangle
> element.
>
> Given a set of x,y points, I need to create a list of parent elements,
> i.e., for each point I need to find the triangle that contains it.
>
> Presently, for each point I cycle through the list of elements, and for
> each element I calculate whether the point is inside or outside the
> triangle by checking whether the sum of the areas of the triangles
> formed by the point and each pair of nodes equals the area of the element.
>
> This works but is inefficient, because I'm cycling through the entire
> list of elements for each point and calculating the areas each time.
> Do any of you know of a more efficient way to do this?

You could create an inverted connectivity - like you have the list of elements which point to the nodes, you would have, for each node, a list of elements the node is contained in.

something like (not tested, slow, just to get the idea):

iconn = [[] for ii in xrange( nodes.shape[0] )]
for iel, row in enumerate( elements ):
    for node in row[1:]:
        iconn[node].append( iel )

Then you would have the problem of finding the nearest neighbours in two sets of points, for which several algorithms exist (see e.g. [1], [2]). After matching the points, just look at the iconn and choose one of the elements.

cheers,
r.

[1] scipy.spatial.KDTree (in SVN version, added recently by Anne M. Archibald)
[2] http://www.cs.umd.edu/~mount/ANN/

_______________________________________________
SciPy-user mailing list
SciPy-user at scipy.org
http://projects.scipy.org/mailman/listinfo/scipy-user

From Dharhas.Pothina at twdb.state.tx.us  Fri Dec 5 11:41:52 2008
From: Dharhas.Pothina at twdb.state.tx.us (Dharhas Pothina)
Date: Fri, 05 Dec 2008 10:41:52 -0600
Subject: [SciPy-user] scikit.timeseries plotting bugs?
In-Reply-To: <4938FE56.63BA.009B.0@twdb.state.tx.us>
References: <4937AC8A0200009B00018721@GWWEB.twdb.state.tx.us> <4938FE56.63BA.009B.0@twdb.state.tx.us>
Message-ID: <49390570.63BA.009B.0@twdb.state.tx.us>

Ok, I've narrowed it down. It has nothing to do with plotting two timeseries. The problem is plotting any of the original timeseries I read from files. Once I convert them to daily means, the plotting works fine.
With the original timeseries, whether it plots or not seems to depend on what plotting symbol I use.

- dharhas

>> If I use two commands and certain other symbols like 'b', 'b--', etc.
>> for the first series, it only plots the second series:
> ??? Can't reproduce this one...

I've double-checked; it's happening consistently with multiple datasets (the datasets are similar, just at different locations). How do I track down the problem? Should I send you a datafile and my short script?

From Dharhas.Pothina at twdb.state.tx.us  Fri Dec 5 12:26:10 2008
From: Dharhas.Pothina at twdb.state.tx.us (Dharhas Pothina)
Date: Fri, 05 Dec 2008 11:26:10 -0600
Subject: [SciPy-user] scikit.timeseries plotting bugs?
Message-ID: <49390FD20200009B00018804@GWWEB.twdb.state.tx.us>

Hi,

I'm attaching an ASCII datafile called JDM3_short.txt and the commands I typed in ipython to reproduce the problem. The first figure works; the second has the correct axes but is blank.

In [2]: import sys
In [3]: import subprocess
In [4]: from numpy import *
In [5]: from pylab import *
In [6]: import datetime
In [7]: import scikits.timeseries as ts
In [8]: import scikits.timeseries.lib.plotlib as tpl
In [9]: import numpy.ma as ma
In [10]: fieldfile='JDM3_short.txt'
In [11]: year, month, day, hour, minute, fdata = loadtxt(fieldfile,comments="#",usecols=(0,1,2,3,4,8),unpack=True)
In [12]: fielddates = [datetime.datetime(int(y),int(m),int(d),int(hh),int(mm),0) for y,m,d,hh,mm in zip(year,month,day,hour,minute)]
In [13]: fdates = ts.date_array(fielddates,freq='MIN')
In [14]: fseries = ts.time_series(fdata, dates=fdates)
In [15]: #remove -999.9 nodata values for parameter
In [16]: fseries[fseries==-999.9] = ma.masked
In [17]:
In [18]: fseries = fseries.fill_missing_dates()
In [19]: fig = tpl.tsfigure()
In [20]: fsp = fig.add_tsplot(111)
In [21]: fsp.tsplot(fseries, '.')
Out[21]: []
In [22]: fig1 = tpl.tsfigure()
In [23]: fsp1 = fig1.add_tsplot(111)
In [24]: fsp1.tsplot(fseries, 'b', label='data')
Out[24]: []
In [25]: show()

- dharhas

>>> "Dharhas Pothina"  12/05/08 10:42 AM >>>
Ok, I've narrowed it down. It has nothing to do with plotting two timeseries. The problem is plotting any of the original timeseries I read from files. Once I convert them to daily means, the plotting works fine.

With the original timeseries, whether it plots or not seems to depend on what plotting symbol I use.

- dharhas

>> If I use two commands and certain other symbols like 'b', 'b--', etc.
>> for the first series, it only plots the second series:
> ??? Can't reproduce this one...

I've double-checked; it's happening consistently with multiple datasets (the datasets are similar, just at different locations). How do I track down the problem? Should I send you a datafile and my short script?

_______________________________________________
SciPy-user mailing list
SciPy-user at scipy.org
http://projects.scipy.org/mailman/listinfo/scipy-user
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: JDM3_short.txt
URL:

From pgmdevlist at gmail.com  Fri Dec 5 12:34:29 2008
From: pgmdevlist at gmail.com (Pierre GM)
Date: Fri, 5 Dec 2008 12:34:29 -0500
Subject: [SciPy-user] scikit.timeseries plotting bugs?
In-Reply-To: <4938FE56.63BA.009B.0@twdb.state.tx.us>
References: <4937AC8A0200009B00018721@GWWEB.twdb.state.tx.us> <4938FE56.63BA.009B.0@twdb.state.tx.us>
Message-ID:

On Dec 5, 2008, at 11:11 AM, Dharhas Pothina wrote:
> @ Matt & Pierre :
>
> So, if I understand correctly, if I plot multiple time series with
> different frequencies, it should use the .asfreq method to transform
> the other series to the first series' frequency.

That's the generic idea. Or, you plot a first series, then plot the others: the others will be modified w/ .asfreq.

> I want to make sure I understand .asfreq correctly.
> 1) If .asfreq is converting to a higher frequency, the data stays the
> same and extra dates are added with no data/masked values.

Well, not exactly: in that case, the dates are internally transformed to the new frequency; there's no addition of dates.

>
> 2) If converting to a lower frequency, multiple data points end up on
> the same date.

Yes.

> The implication is that, for what I am trying to do, I need to put
> the highest frequency plot first.
>

Yes.

>> That's actually not the way it was designed. In normal procedure, you
>> create a tsplot with a time_series attached. The plot gets its
>> frequency from this series, and all the series that are plotted are
>> transformed to the plot frequency using their .asfreq method: in
>> other words, you don't change the data, just the representation of
>> the dates. It's quite convenient, and I don't see why we should
>> raise an exception. Issue a warning, perhaps, but let it work.
>> We need to document that.
>
> Another question: how do I set the x limits? In my plot, the two
> timeseries go from 12/15/05 - 04/01/07 and from 01/01/06 - 12/31/06.
> If I do not specify the x limits, the x ticks and labels look very
> nice and well spaced. However, if I try to limit the plot to a date
> range by using fsp.set_xlim(startdate,enddate), where startdate and
> enddate are datetimes, it works, but the major/minor ticks don't look
> correct and the labeling looks bad.

Don't use set_xlim on the plot, unless you understand how it works. Integers are associated with each date of a time_series, depending on its frequency. You need to set the xlims to integers that are compatible with your series. The easiest by far is to use the `set_datelimits` method of your plot.

From tgrav at mac.com  Fri Dec 5 12:52:45 2008
From: tgrav at mac.com (Tommy Grav)
Date: Fri, 05 Dec 2008 12:52:45 -0500
Subject: [SciPy-user] minpack2.so not proper arch in 0.7.0b1 mac release binary?
Message-ID: <42AD7D52-E49D-469C-9420-996494377F55@mac.com>

I installed the Mac OS X 0.7.0b1 binary from the scipy.org site, but I run into this error:

[skathi:myCode/Python/pyMPC] drtgrav% python -u comet_obs.py
File /Users/drtgrav/Work/myLibrary/Catalogs/binEphemLong.405 has been opened
Traceback (most recent call last):
  File "comet_obs.py", line 6, in
    from scipy.optimize import leastsq
  File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/scipy/optimize/__init__.py", line 7, in
    from optimize import *
  File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/scipy/optimize/optimize.py", line 28, in
    import linesearch
  File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/scipy/optimize/linesearch.py", line 3, in
    from scipy.optimize import minpack2
ImportError: dlopen(/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/scipy/optimize/minpack2.so, 2): no suitable image found.
Did find:
/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/scipy/optimize/minpack2.so: mach-o, but wrong architecture
[skathi:myCode/Python/pyMPC] drtgrav%

Anyone know how to solve this problem?

Cheers
Tommy

From heliocbortolon at gmail.com  Fri Dec 5 12:55:57 2008
From: heliocbortolon at gmail.com (Maggoo)
Date: Fri, 5 Dec 2008 09:55:57 -0800 (PST)
Subject: [SciPy-user] Efficient way of finding the parent element of a point in an unstructured triangular mesh
In-Reply-To: <4939051C.63BA.009B.0@twdb.state.tx.us>
References: <4937C676.63BA.009B.0@twdb.state.tx.us> <4938FDE1.3030408@ntc.zcu.cz> <4939051C.63BA.009B.0@twdb.state.tx.us>
Message-ID:

Dear Dharhas Pothina,

I think this is a common task in Computational Geometry.

1. Embed your unstructured grid into an ordinary uniform grid.
2. The grid being uniform, the position of the point gives the indices of the horizontal and vertical positions of the square it pertains to:

xidx = int( point.x / grid.dx )
yidx = int( point.y / grid.dy )

3. You will have to make a connection matrix of the nodes in your unstructured grid to the squares in the structured one, and relate each of the elements that the nodes are part of to the cited square, i.e.: nodes k1,...,kN are in the square S. These nodes are vertices of the elements e1,...,eN. So S is connected to e1,...,eN.
4. If the point is in S, you search only through the elements e1,...,eN.

I hope this gives you some starting point.

On 5 dez, 14:40, "Dharhas Pothina" wrote:
> Hi Robert,
>
> Ok, so if I understood you correctly, what I would be doing is using the nearest neighbour algorithm to find the closest node to the point, and then looping through the 5 or 6 elements that have that node as a vertex, checking for each of those elements whether the point is inside.
>
> From the sound of it, as long as the ANN search is fairly efficient, I might be able to speed the process up quite a bit.
>
> I'll probably try your second reference; I'm not too keen on installing the SVN version.
>
> thanks,
>
> - dharhas
>
> >>> Robert Cimrman  12/5/2008 4:09 AM >>>
>
> Hi Dharhas,
>
> Dharhas Pothina wrote:
> > Hi,
>
> > I have an unstructured triangular mesh, i.e., a list of nodal locations
> > and a list of element connectivity.
>
> > node #, x, y
> > 1 10.0 10.0
> > 2 10.0 12.0
> > ...
>
> > element #, node1, node2, node3
> > 1 4 3 5
> > 2 1 3 7
> > 3 2 9 6
> > ...
>
> > where node1, node2 and node3 are the nodes that make up each triangle
> > element.
>
> > Given a set of x,y points, I need to create a list of parent elements,
> > i.e., for each point I need to find the triangle that contains it.
>
> > Presently, for each point I cycle through the list of elements, and for
> > each element I calculate whether the point is inside or outside the
> > triangle by checking whether the sum of the areas of the triangles
> > formed by the point and each pair of nodes equals the area of the element.
>
> > This works but is inefficient, because I'm cycling through the entire
> > list of elements for each point and calculating the areas each time.
> > Do any of you know of a more efficient way to do this?
>
> You could create an inverted connectivity - like you have the list of
> elements which point to the nodes, you would have, for each node, a list
> of elements the node is contained in.
>
> something like (not tested, slow, just to get the idea):
>
> iconn = [[] for ii in xrange( nodes.shape[0] )]
> for iel, row in enumerate( elements ):
>     for node in row[1:]:
>         iconn[node].append( iel )
>
> Then you would have the problem of finding the nearest neighbours in two
> sets of points, for which several algorithms exist (see e.g. [1], [2]).
> After matching the points, just look at the iconn and choose one of the
> elements.
>
> cheers,
> r.
>
> [1] scipy.spatial.KDTree (in SVN version, added recently by Anne M.
> Archibald)
> [2] http://www.cs.umd.edu/~mount/ANN/
>
> _______________________________________________
> SciPy-user mailing list
> SciPy-user at scipy.org
> http://projects.scipy.org/mailman/listinfo/scipy-user
>
> _______________________________________________
> SciPy-user mailing list
> SciPy-user at scipy.org
> http://projects.scipy.org/mailman/listinfo/scipy-user

From Dharhas.Pothina at twdb.state.tx.us  Fri Dec 5 13:07:37 2008
From: Dharhas.Pothina at twdb.state.tx.us (Dharhas Pothina)
Date: Fri, 05 Dec 2008 12:07:37 -0600
Subject: [SciPy-user] Efficient way of finding the parent element of a point in an unstructured triangular mesh
In-Reply-To:
References: <4937C676.63BA.009B.0@twdb.state.tx.us> <4938FDE1.3030408@ntc.zcu.cz> <4939051C.63BA.009B.0@twdb.state.tx.us>
Message-ID: <49391989.63BA.009B.0@twdb.state.tx.us>

Dear Maggoo,

This is an interesting algorithm; do you have a name or reference for it? One question I have is how you would decide what to use for the size of the uniform grid cells (i.e., dx & dy). I can see that a larger cell size would involve searching through more elements, but a cell size which is too small would have the problem of cells that don't contain any nodes.

- dharhas

>>> Maggoo  12/5/2008 11:55 AM >>>
Dear Dharhas Pothina,

I think this is a common task in Computational Geometry.

1. Embed your unstructured grid into an ordinary uniform grid.
2. The grid being uniform, the position of the point gives the indices of the horizontal and vertical positions of the square it pertains to:

xidx = int( point.x / grid.dx )
yidx = int( point.y / grid.dy )

3. You will have to make a connection matrix of the nodes in your unstructured grid to the squares in the structured one, and relate each of the elements that the nodes are part of to the cited square, i.e.: nodes k1,...,kN are in the square S. These nodes are vertices of the elements e1,...,eN. So S is connected to e1,...,eN.
4. If the point is in S, you search only through the elements e1,...,eN.

I hope this gives you some starting point.

On 5 dez, 14:40, "Dharhas Pothina" wrote:
> Hi Robert,
>
> Ok, so if I understood you correctly, what I would be doing is using the nearest neighbour algorithm to find the closest node to the point, and then looping through the 5 or 6 elements that have that node as a vertex, checking for each of those elements whether the point is inside.
>
> From the sound of it, as long as the ANN search is fairly efficient, I might be able to speed the process up quite a bit.
>
> I'll probably try your second reference; I'm not too keen on installing the SVN version.
>
> thanks,
>
> - dharhas
>
> >>> Robert Cimrman  12/5/2008 4:09 AM >>>
>
> Hi Dharhas,
>
> Dharhas Pothina wrote:
> > Hi,
>
> > I have an unstructured triangular mesh, i.e., a list of nodal locations
> > and a list of element connectivity.
>
> > node #, x, y
> > 1 10.0 10.0
> > 2 10.0 12.0
> > ...
>
> > element #, node1, node2, node3
> > 1 4 3 5
> > 2 1 3 7
> > 3 2 9 6
> > ...
>
> > where node1, node2 and node3 are the nodes that make up each triangle
> > element.
>
> > Given a set of x,y points, I need to create a list of parent elements,
> > i.e. for each point I need to find the triangle that contains it.
> >
> > Presently for each point I cycle through the list of elements and for
> > each element calculate whether the point is inside or outside the
> > triangle by checking if the sum of the areas of the triangles formed by
> > the point and each pair of nodes is the same as the area of the element.
> >
> > This works but is inefficient because I'm cycling through the entire
> > list of elements for each point and calculating the areas each time.
> > Do any of you know of a more efficient way to do this?
>
> You could create an inverted connectivity - like you have the list of
> elements which point to the nodes, you would have, for each node, a list
> of elements the node is contained in.
>
> something like (not tested, slow, just to get the idea):
>
> iconn = [[] for ii in xrange( nodes.shape[0] )]
> for iel, row in enumerate( elements ):
>     for node in row[1:]:
>         iconn[node].append( iel )
>
> Then you would have the problem of finding the nearest neighbours in two
> sets of points, for which several algorithms exist (see e.g. [1], [2]).
> After matching the points just look at the iconn and choose one of the
> elements.
>
> cheers,
> r.
>
> [1] scipy.spatial.KDTree (in SVN version, added recently by Anne M. Archibald)
> [2] http://www.cs.umd.edu/~mount/ANN/
>
> _______________________________________________
> SciPy-user mailing list
> SciPy-u... at scipy.org
> http://projects.scipy.org/mailman/listinfo/scipy-user

_______________________________________________
SciPy-user mailing list
SciPy-user at scipy.org
http://projects.scipy.org/mailman/listinfo/scipy-user

From robert.kern at gmail.com Fri Dec 5 14:13:51 2008
From: robert.kern at gmail.com (Robert Kern)
Date: Fri, 5 Dec 2008 13:13:51 -0600
Subject: [SciPy-user] minpack2.so not proper arch in 0.7.0b1 mac release binary?
In-Reply-To: <42AD7D52-E49D-469C-9420-996494377F55@mac.com>
References: <42AD7D52-E49D-469C-9420-996494377F55@mac.com>
Message-ID: <3d375d730812051113h5d2a79e5r4ac92818aad1683c@mail.gmail.com>

On Fri, Dec 5, 2008 at 11:52, Tommy Grav wrote:
> I installed the Mac OS X 0.7.0b1 binary from the scipy.org site, but I
> run into this error
>
> [skathi:myCode/Python/pyMPC] drtgrav% python -u comet_obs.py
> File /Users/drtgrav/Work/myLibrary/Catalogs/binEphemLong.405 has been opened
> Traceback (most recent call last):
>   File "comet_obs.py", line 6, in
>     from scipy.optimize import leastsq
>   File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/scipy/optimize/__init__.py", line 7, in
>     from optimize import *
>   File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/scipy/optimize/optimize.py", line 28, in
>     import linesearch
>   File "/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/scipy/optimize/linesearch.py", line 3, in
>     from scipy.optimize import minpack2
> ImportError: dlopen(/Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/scipy/optimize/minpack2.so, 2): no suitable image found. Did find:
> /Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/scipy/optimize/minpack2.so: mach-o, but wrong architecture
> [skathi:myCode/Python/pyMPC] drtgrav%
>
> Anyone know how to solve this problem?
It's not easy to build a Universal binary with Fortran extension modules like scipy. The packager of this binary did not; the binaries are for Intel Macs only.

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco

From tgrav at mac.com Fri Dec 5 14:22:59 2008
From: tgrav at mac.com (Tommy Grav)
Date: Fri, 05 Dec 2008 14:22:59 -0500
Subject: [SciPy-user] minpack2.so not proper arch in 0.7.0b1 mac release binary?
References:
Message-ID: <7EA65D69-A33B-476A-A3D4-466A7BCEF88F@mac.com>

On Dec 5, 2008, at 2:13 PM, Robert Kern wrote:
>
> It's not easy to build a Universal binary with Fortran extension
> modules like scipy. The packager of this binary did not; the binaries
> are for Intel Macs only.

http://sourceforge.net/project/showfiles.php?group_id=27747&package_id=19531
needs to be changed then, as it lists the dmg binary as a universal. I was able to compile the most recent svn however, so I am generally good :)

Cheers
Tommy

From aarchiba at physics.mcgill.ca Fri Dec 5 14:30:02 2008
From: aarchiba at physics.mcgill.ca (Anne Archibald)
Date: Fri, 5 Dec 2008 14:30:02 -0500
Subject: [SciPy-user] Efficient way of finding the parent element ofapoint in an unstructured triangular
In-Reply-To: <49391989.63BA.009B.0@twdb.state.tx.us>
References: <4937C676.63BA.009B.0@twdb.state.tx.us> <4938FDE1.3030408@ntc.zcu.cz> <4939051C.63BA.009B.0@twdb.state.tx.us> <49391989.63BA.009B.0@twdb.state.tx.us>
Message-ID:

2008/12/5 Dharhas Pothina :
> This is an interesting algorithm, do you have a name or reference for it? One
> question I have is how you would decide what to use for the size of the uniform
> grid cells (i.e. dx & dy). I can see that a larger cell size would involve
> searching through more elements, but a cell size which is too small would have
> the problem of cells that don't contain any nodes.

Actually, I think the idea is, for each cell, you make a list of triangles that meet the cell. Thus for any point, if you look up the cell containing that point you get a list of triangles that might contain it. That way the only time you get an empty list is when your point is not in any triangle.

More generally, you can apply this idea to any spatial data structure, including kd-trees. Grids have the advantage that they are simple to implement and quite fast, but deciding on the grid size can be tricky, and they can suffer badly from the "teapot in a stadium" problem (where a few big triangles force big grid cells, but many of the triangles are small and contained in a single cell).

I would recommend using a kd-tree, though I'm afraid the ones in scipy.spatial will not be of much use to you. I should point out that, partial as I am to the nearest-neighbor code, it's not as simple to use for your problem as it seems: the three vertices of the triangle containing a point need not be the nearest points in the mesh. In fact there can be as many other mesh points as you like nearer to your query point than the corners of its triangle.

Anne

From Dharhas.Pothina at twdb.state.tx.us Fri Dec 5 14:44:26 2008
From: Dharhas.Pothina at twdb.state.tx.us (Dharhas Pothina)
Date: Fri, 05 Dec 2008 13:44:26 -0600
Subject: [SciPy-user] scikit.timeseries plotting bugs?
Message-ID: <4939303A0200009B00018821@GWWEB.twdb.state.tx.us>

Pierre,

> Don't use set_xlim on the plot, unless you understand how it works.
> Integers are associated with each date of a time_series, depending on
> its frequency. You need to set the xlims to integers that are
> compatible with your series.
> The easiest by far is to use the `set_datelimits` method of your plot.

I tried your suggestion, and if you look at the attached JDM2_2.gif you can see that the minor ticks don't align with the major ticks and for some reason 2006 appears twice. JDM2_1.gif is the same plot without using set_datelimits().

code:

startdate = ts.Date(freq='D',year=startdate.year,month=startdate.month,day=startdate.day)
enddate = ts.Date(freq='D',year=enddate.year,month=enddate.month,day=enddate.day)
...
fig = tpl.tsfigure()
fsp = fig.add_tsplot(111)
fsp.tsplot(fseries_freq, 'r',label='data')
fsp.tsplot(mseries_freq, 'b',label='model')
fsp.set_ylim(0,35)
fsp.set_datelimits(start_date=startdate,end_date=enddate)

- dharhas

-------------- next part --------------
A non-text attachment was scrubbed...
Name: JDM2_1.gif
Type: image/gif
Size: 22737 bytes
Desc: CompuServe GIF graphic
URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: JDM2_2.gif
Type: image/gif
Size: 23637 bytes
Desc: CompuServe GIF graphic
URL:

From Dharhas.Pothina at twdb.state.tx.us Fri Dec 5 15:28:10 2008
From: Dharhas.Pothina at twdb.state.tx.us (Dharhas Pothina)
Date: Fri, 05 Dec 2008 14:28:10 -0600
Subject: [SciPy-user] Efficient way of finding the parent elementofapoint in an unstructured triangular
In-Reply-To:
References: <4937C676.63BA.009B.0@twdb.state.tx.us> <4938FDE1.3030408@ntc.zcu.cz> <4939051C.63BA.009B.0@twdb.state.tx.us> <49391989.63BA.009B.0@twdb.state.tx.us>
Message-ID: <49393A7A.63BA.009B.0@twdb.state.tx.us>

> it's not as simple to use for your problem as it seems: the three
> vertices of the triangle containing a point need not be the nearest
> points in the mesh.

The mesh is conforming, has certain quality constraints, and the size of the triangles changes gradually, so the nearest neighbour approach will probably work most of the time; but after sketching some diagrams I can think of cases in which it will not find the parent element.

> Actually, I think the idea is, for each cell, you make a list of
> triangles that meet the cell. Thus for any point, if you look up the
> cell containing that point you get a list of triangles that might

I think to make this list I would need to calculate which triangle edges intersect the cell edges. Not sure how expensive that will be. A second way would be to make a list of triangle nodes within a cell and then generate a list of elements that have any of those nodes as vertices. If I did the second, I'd have to make the cells large enough to ensure that at least one node was present in each cell (i.e. cellsize > max triangle size).

Looks like this general idea is called a structured auxiliary mesh algorithm. I found a paper that summarizes some methods:

http://www.mie.utoronto.ca/labs/bsl/pubs/confpapers/miccai2003_geometricsearch.pdf

- dharhas

>>> "Anne Archibald" 12/5/2008 1:30 PM >>>
2008/12/5 Dharhas Pothina :
> This is an interesting algorithm, do you have a name or reference for it? One
> question I have is how you would decide what to use for the size of the uniform
> grid cells (i.e. dx & dy). I can see that a larger cell size would involve
> searching through more elements, but a cell size which is too small would have
> the problem of cells that don't contain any nodes.

Actually, I think the idea is, for each cell, you make a list of triangles that meet the cell.
Thus for any point, if you look up the cell containing that point you get a list of triangles that might contain it. That way the only time you get an empty list is when your point is not in any triangle.

More generally, you can apply this idea to any spatial data structure, including kd-trees. Grids have the advantage that they are simple to implement and quite fast, but deciding on the grid size can be tricky, and they can suffer badly from the "teapot in a stadium" problem (where a few big triangles force big grid cells, but many of the triangles are small and contained in a single cell).

I would recommend using a kd-tree, though I'm afraid the ones in scipy.spatial will not be of much use to you. I should point out that, partial as I am to the nearest-neighbor code, it's not as simple to use for your problem as it seems: the three vertices of the triangle containing a point need not be the nearest points in the mesh. In fact there can be as many other mesh points as you like nearer to your query point than the corners of its triangle.

Anne

_______________________________________________
SciPy-user mailing list
SciPy-user at scipy.org
http://projects.scipy.org/mailman/listinfo/scipy-user

From aarchiba at physics.mcgill.ca Fri Dec 5 15:49:55 2008
From: aarchiba at physics.mcgill.ca (Anne Archibald)
Date: Fri, 5 Dec 2008 15:49:55 -0500
Subject: [SciPy-user] Efficient way of finding the parent elementofapoint in an unstructured triangular
In-Reply-To: <49393A7A.63BA.009B.0@twdb.state.tx.us>
References: <4937C676.63BA.009B.0@twdb.state.tx.us> <4938FDE1.3030408@ntc.zcu.cz> <4939051C.63BA.009B.0@twdb.state.tx.us> <49391989.63BA.009B.0@twdb.state.tx.us> <49393A7A.63BA.009B.0@twdb.state.tx.us>
Message-ID:

2008/12/5 Dharhas Pothina :
>> Actually, I think the idea is, for each cell, you make a list of
>> triangles that meet the cell. Thus for any point, if you look up the
>> cell containing that point you get a list of triangles that might
>
> I think to make this list I would need to calculate which triangle edges
> intersect the cell edges. Not sure how expensive that will be. A second way
> would be to make a list of triangle nodes within a cell and then generate a
> list of elements that have any of those nodes as vertices. If I did the second,
> I'd have to make the cells large enough to ensure that at least one node was
> present in each cell (i.e. cellsize > max triangle size). Looks like this
> general idea is called a structured auxiliary mesh algorithm.

If you build a kd-tree, it's not too hard to design a recursive algorithm:

build_kdtree(triangles):
    if len(triangles) < smallenough:
        return Leaf(triangles)
    choose a cut dimension i and a cut value x[i]
    less = [t for t in triangles if some vertex of t has coordinate[i] <= x[i]]
    greater = [t for t in triangles if some vertex of t has coordinate[i] >= x[i]]
    return KDTree(i, x[i], build_kdtree(less), build_kdtree(greater))

Looking up a point is then easy: walk down the tree until you hit a leaf, and test the point's membership in each triangle in that leaf. It's probably not an ideal data structure, since many triangles may cross the cut plane and therefore be in both subsets, but for reasonable meshes it's probably fine.

If you want to use a grid, you can test whether a grid cell meets a triangle reasonably easily: they meet if and only if either some grid corner is inside the triangle or some triangle corner is in the grid cell. If you are feeling devious you can possibly even use OpenGL hardware acceleration to determine which grid cells a triangle falls into.
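To make that concrete, here is a rough, untested Python sketch of the whole bucketing scheme (all names are illustrative; it uses each triangle's bounding box as a conservative stand-in for the exact corner tests above, so a cell may receive a few extra candidate triangles but never misses one, and it assumes non-negative coordinates so that int() truncation acts as floor):

import numpy as np

def point_in_triangle(p, tri):
    # sign of the cross product (b - a) x (p - a) for each directed edge;
    # all non-negative or all non-positive means p is inside (or on an edge)
    s = []
    for k in range(3):
        a, b = tri[k], tri[(k + 1) % 3]
        s.append((b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0]))
    return all(v >= 0 for v in s) or all(v <= 0 for v in s)

def build_cell_lists(tris, dx, dy):
    # map cell index (i, j) -> list of candidate triangle indices; the
    # bounding box errs on the side of extra candidates (an exact test
    # would do the corner checks plus edge/cell-crossing checks)
    cells = {}
    for k, tri in enumerate(tris):
        i0, i1 = int(tri[:, 0].min() / dx), int(tri[:, 0].max() / dx)
        j0, j1 = int(tri[:, 1].min() / dy), int(tri[:, 1].max() / dy)
        for i in range(i0, i1 + 1):
            for j in range(j0, j1 + 1):
                cells.setdefault((i, j), []).append(k)
    return cells

def find_parent(p, tris, cells, dx, dy):
    # look up the point's cell, then test only the candidates listed there
    for k in cells.get((int(p[0] / dx), int(p[1] / dy)), []):
        if point_in_triangle(p, tris[k]):
            return k
    return None

# two triangles tiling the unit square; (0.2, 0.2) lies in the first
tris = np.array([[[0., 0.], [1., 0.], [0., 1.]],
                 [[1., 0.], [1., 1.], [0., 1.]]])
cells = build_cell_lists(tris, 0.5, 0.5)
print(find_parent((0.2, 0.2), tris, cells, 0.5, 0.5))   # -> 0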
Anne

From Dharhas.Pothina at twdb.state.tx.us Fri Dec 5 16:15:24 2008
From: Dharhas.Pothina at twdb.state.tx.us (Dharhas Pothina)
Date: Fri, 05 Dec 2008 15:15:24 -0600
Subject: [SciPy-user] Efficient way of finding the parentelementofapoint in an unstructured triangular
In-Reply-To:
References: <4937C676.63BA.009B.0@twdb.state.tx.us> <4938FDE1.3030408@ntc.zcu.cz> <4939051C.63BA.009B.0@twdb.state.tx.us> <49391989.63BA.009B.0@twdb.state.tx.us> <49393A7A.63BA.009B.0@twdb.state.tx.us>
Message-ID: <4939458C.63BA.009B.0@twdb.state.tx.us>

>>> "Anne Archibald" 12/5/2008 2:49 PM >>>
> 2008/12/5 Dharhas Pothina :
>
> If you build a kd-tree, it's not too hard to design a recursive algorithm:

I have no idea how a kd-tree works. I'm going to have to read up on it before I can follow your algorithm below.

> If you want to use a grid, you can test whether a grid cell meets a
> triangle reasonably easily: they meet if and only if either some grid
> corner is inside the triangle or some triangle corner is in the grid
> cell. If you are feeling devious you can possibly even use OpenGL
> hardware acceleration to determine which grid cells a triangle falls
> into.

Seems like generating this list is going to be pretty expensive. Unless I'm missing something, for each cell I need to cycle through each triangle using 7 point_in_poly checks. It will be useful when I am doing lots of lookups, which I may need in the future, but presently I'm just looking for about 15 points, so I'm cycling through all the triangles 15 times and doing 1 point_in_poly lookup for each triangle. It may be useful if I pre-generate the list for a given grid and save it to a file.

Thanks for your ideas, they have really helped me think this through.

- dharhas

From fritz.peter.maas at googlemail.com Sat Dec 6 09:46:59 2008
From: fritz.peter.maas at googlemail.com (Peter Maas)
Date: Sat, 06 Dec 2008 15:46:59 +0100
Subject: [SciPy-user] Problemds building NumPy with Python 2.6 / MinGW 3.4.5 on Windows XP SP3
In-Reply-To:
References:
Message-ID: <493A9063.2010203@googlemail.com>

Hi,

I cannot build NumPy 1.2.1 with the aforementioned environment. It works with Python 2.5 but with Python 2.6 I get:

> python setup.py build -c mingw32
[lots of lines omitted]
running build_src
building py_modules sources
building extension "numpy.core.multiarray" sources
Generating build\src.win32-2.6\numpy\core\include/numpy\config.h
error: None

Has anybody experienced this error and knows a remedy? Thanks for your advice.

--
Regards, Peter

Peter Maas, Aachen, Germany

From david at ar.media.kyoto-u.ac.jp Sat Dec 6 09:51:47 2008
From: david at ar.media.kyoto-u.ac.jp (David Cournapeau)
Date: Sat, 06 Dec 2008 23:51:47 +0900
Subject: [SciPy-user] Problemds building NumPy with Python 2.6 / MinGW 3.4.5 on Windows XP SP3
In-Reply-To: <493A9063.2010203@googlemail.com>
References: <493A9063.2010203@googlemail.com>
Message-ID: <493A9183.5030906@ar.media.kyoto-u.ac.jp>

Peter Maas wrote:
> Hi,
>
> I cannot build NumPy 1.2.1 with the aforementioned environment.
> It works with Python 2.5 but with Python 2.6 I get:

Hi Peter,

Unfortunately, you need to build numpy from svn if you wish to build numpy on python 2.6 with mingw ATM.
No release of numpy contains the necessary changes to build with python 2.6 yet cheers, David From fritz.peter.maas at googlemail.com Sat Dec 6 12:03:58 2008 From: fritz.peter.maas at googlemail.com (Peter Maas) Date: Sat, 06 Dec 2008 18:03:58 +0100 Subject: [SciPy-user] Problemds building NumPy with Python 2.6 / MinGW 3.4.5 on Windows XP SP3 In-Reply-To: <493A9183.5030906@ar.media.kyoto-u.ac.jp> References: <493A9063.2010203@googlemail.com> <493A9183.5030906@ar.media.kyoto-u.ac.jp> Message-ID: <493AB07E.5060201@googlemail.com> David Cournapeau schrieb: > Unfortunately, you need to build numpy from svn if you wish to build > numpy on python 2.6 with mingw ATM. No release of numpy contains the > necessary changes to build with python 2.6 yet Hi David, thanks for your advice. I just checked out numpy svn co http://scipy.org/svn/numpy/trunk numpy and repeated the build command python setup.py build -c mingw32 The result: [...] running build_src building py_modules sources creating build creating build\src.win32-2.6 creating build\src.win32-2.6\numpy creating build\src.win32-2.6\numpy\distutils building extension "numpy.core.multiarray" sources creating build\src.win32-2.6\numpy\core Generating build\src.win32-2.6\numpy\core\include/numpy\config.h C:\home\peter\projekte\scipy\archiv\svn-6139\numpy\numpy\distutils\command\config.py:35: DeprecationWarning: +++++++++++++++++++++++++++++++++++++++++++++++++ Usage of try_run is deprecated: please do not use it anymore, and avoid configuration checks involving running executable on the target machine. +++++++++++++++++++++++++++++++++++++++++++++++++ DeprecationWarning) error: None Something has changed but the build still fails. Any idea which parts of numpy conflict with Python2.6? -- Regards, Peter. Peter Maas, Aachen, Germany From david at ar.media.kyoto-u.ac.jp Sat Dec 6 11:58:14 2008 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Sun, 07 Dec 2008 01:58:14 +0900 Subject: [SciPy-user] Problemds building NumPy with Python 2.6 / MinGW 3.4.5 on Windows XP SP3 In-Reply-To: <493AB07E.5060201@googlemail.com> References: <493A9063.2010203@googlemail.com> <493A9183.5030906@ar.media.kyoto-u.ac.jp> <493AB07E.5060201@googlemail.com> Message-ID: <493AAF26.8010802@ar.media.kyoto-u.ac.jp> Peter Maas wrote: > David Cournapeau schrieb: > >> Unfortunately, you need to build numpy from svn if you wish to build >> numpy on python 2.6 with mingw ATM. No release of numpy contains the >> necessary changes to build with python 2.6 yet >> > > Hi David, > > thanks for your advice. I just checked out numpy > > svn co http://scipy.org/svn/numpy/trunk numpy > > and repeated the build command > > python setup.py build -c mingw32 > > Could you give the full build log, after having made sure to remove the build directory ? Because I have never seen this error: None, and without more context, it is difficult to see what could be wrong. 
cheers, David From fritz.peter.maas at googlemail.com Sat Dec 6 13:03:47 2008 From: fritz.peter.maas at googlemail.com (Peter Maas) Date: Sat, 06 Dec 2008 19:03:47 +0100 Subject: [SciPy-user] Problemds building NumPy with Python 2.6 / MinGW 3.4.5 on Windows XP SP3 In-Reply-To: <493AAF26.8010802@ar.media.kyoto-u.ac.jp> References: <493A9063.2010203@googlemail.com> <493A9183.5030906@ar.media.kyoto-u.ac.jp> <493AB07E.5060201@googlemail.com> <493AAF26.8010802@ar.media.kyoto-u.ac.jp> Message-ID: <493ABE83.8020709@googlemail.com> David Cournapeau schrieb: > Could you give the full build log, after having made sure to remove the > build directory ? Because I have never seen this error: None, and > without more context, it is difficult to see what could be wrong. Hi David, here it is: C:\home\peter\projekte\scipy\archiv\svn-6139\numpy>python setup.py build -c mingw32 Running from numpy source directory. non-existing path in 'numpy\\distutils': 'site.cfg' F2PY Version 2_6139 blas_opt_info: blas_mkl_info: libraries mkl,vml,guide not found in c:\python26\lib libraries mkl,vml,guide not found in C:\ libraries mkl,vml,guide not found in c:\python26\libs NOT AVAILABLE atlas_blas_threads_info: Setting PTATLAS=ATLAS libraries ptf77blas,ptcblas,atlas not found in c:\python26\lib libraries ptf77blas,ptcblas,atlas not found in C:\ libraries ptf77blas,ptcblas,atlas not found in c:\python26\libs NOT AVAILABLE atlas_blas_info: libraries f77blas,cblas,atlas not found in c:\python26\lib libraries f77blas,cblas,atlas not found in C:\ libraries f77blas,cblas,atlas not found in c:\python26\libs NOT AVAILABLE C:\home\peter\projekte\scipy\archiv\svn-6139\numpy\numpy\distutils\system_info.py:1345: UserWarning: Atlas (http://math-atlas.sourceforge.net/) libraries not found. Directories to search for the libraries can be specified in the numpy/distutils/site.cfg file (section [atlas]) or by setting the ATLAS environment variable. warnings.warn(AtlasNotFoundError.__doc__) blas_info: libraries blas not found in c:\python26\lib libraries blas not found in C:\ libraries blas not found in c:\python26\libs NOT AVAILABLE C:\home\peter\projekte\scipy\archiv\svn-6139\numpy\numpy\distutils\system_info.py:1354: UserWarning: Blas (http://www.netlib.org/blas/) libraries not found. Directories to search for the libraries can be specified in the numpy/distutils/site.cfg file (section [blas]) or by setting the BLAS environment variable. warnings.warn(BlasNotFoundError.__doc__) blas_src_info: NOT AVAILABLE C:\home\peter\projekte\scipy\archiv\svn-6139\numpy\numpy\distutils\system_info.py:1357: UserWarning: Blas (http://www.netlib.org/blas/) sources not found. Directories to search for the sources can be specified in the numpy/distutils/site.cfg file (section [blas_src]) or by setting the BLAS_SRC environment variable. 
warnings.warn(BlasSrcNotFoundError.__doc__) NOT AVAILABLE lapack_opt_info: lapack_mkl_info: mkl_info: libraries mkl,vml,guide not found in c:\python26\lib libraries mkl,vml,guide not found in C:\ libraries mkl,vml,guide not found in c:\python26\libs NOT AVAILABLE NOT AVAILABLE atlas_threads_info: Setting PTATLAS=ATLAS libraries ptf77blas,ptcblas,atlas not found in c:\python26\lib libraries lapack_atlas not found in c:\python26\lib libraries ptf77blas,ptcblas,atlas not found in C:\ libraries lapack_atlas not found in C:\ libraries ptf77blas,ptcblas,atlas not found in c:\python26\libs libraries lapack_atlas not found in c:\python26\libs numpy.distutils.system_info.atlas_threads_info NOT AVAILABLE atlas_info: libraries f77blas,cblas,atlas not found in c:\python26\lib libraries lapack_atlas not found in c:\python26\lib libraries f77blas,cblas,atlas not found in C:\ libraries lapack_atlas not found in C:\ libraries f77blas,cblas,atlas not found in c:\python26\libs libraries lapack_atlas not found in c:\python26\libs numpy.distutils.system_info.atlas_info NOT AVAILABLE C:\home\peter\projekte\scipy\archiv\svn-6139\numpy\numpy\distutils\system_info.py:1252: UserWarning: Atlas (http://math-atlas.sourceforge.net/) libraries not found. Directories to search for the libraries can be specified in the numpy/distutils/site.cfg file (section [atlas]) or by setting the ATLAS environment variable. warnings.warn(AtlasNotFoundError.__doc__) lapack_info: libraries lapack not found in c:\python26\lib libraries lapack not found in C:\ libraries lapack not found in c:\python26\libs NOT AVAILABLE C:\home\peter\projekte\scipy\archiv\svn-6139\numpy\numpy\distutils\system_info.py:1263: UserWarning: Lapack (http://www.netlib.org/lapack/) libraries not found. Directories to search for the libraries can be specified in the numpy/distutils/site.cfg file (section [lapack]) or by setting the LAPACK environment variable. warnings.warn(LapackNotFoundError.__doc__) lapack_src_info: NOT AVAILABLE C:\home\peter\projekte\scipy\archiv\svn-6139\numpy\numpy\distutils\system_info.py:1266: UserWarning: Lapack (http://www.netlib.org/lapack/) sources not found. Directories to search for the sources can be specified in the numpy/distutils/site.cfg file (section [lapack_src]) or by setting the LAPACK_SRC environment variable. warnings.warn(LapackSrcNotFoundError.__doc__) NOT AVAILABLE running build running config_cc unifing config_cc, config, build_clib, build_ext, build commands --compiler options running config_fc unifing config_fc, config, build_clib, build_ext, build commands --fcompiler options running build_src building py_modules sources creating build creating build\src.win32-2.6 creating build\src.win32-2.6\numpy creating build\src.win32-2.6\numpy\distutils building extension "numpy.core.multiarray" sources creating build\src.win32-2.6\numpy\core Generating build\src.win32-2.6\numpy\core\include/numpy\config.h C:\home\peter\projekte\scipy\archiv\svn-6139\numpy\numpy\distutils\command\config.py:35: DeprecationWarning: +++++++++++++++++++++++++++++++++++++++++++++++++ Usage of try_run is deprecated: please do not use it anymore, and avoid configuration checks involving running executable on the target machine. +++++++++++++++++++++++++++++++++++++++++++++++++ DeprecationWarning) error: None C:\home\peter\projekte\scipy\archiv\svn-6139\numpy> -- Regards, Peter. 
Peter Maas, Aachen, Germany From mattknox.ca at gmail.com Sat Dec 6 14:08:41 2008 From: mattknox.ca at gmail.com (Matt Knox) Date: Sat, 6 Dec 2008 19:08:41 +0000 (UTC) Subject: [SciPy-user] scikit.timeseries plotting bugs? References: <4939303A0200009B00018821@GWWEB.twdb.state.tx.us> Message-ID: > ... for some reason 2006 appears twice. The problem with the year showing up twice should be fixed in svn now. I've fixed a few issues with plotting recently so please test your plots with the new code and let me know if you encounter any problems. > I tried your suggestion and if you look at the attached JDM2_2.gif, you can > see that the minor ticks don't align with the major ticks Could you elaborate on what you mean by this? The only problem I see in your two graphs is the year showing up twice. - Matt From david at ar.media.kyoto-u.ac.jp Sun Dec 7 00:01:39 2008 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Sun, 07 Dec 2008 14:01:39 +0900 Subject: [SciPy-user] Problemds building NumPy with Python 2.6 / MinGW 3.4.5 on Windows XP SP3 In-Reply-To: <493ABE83.8020709@googlemail.com> References: <493A9063.2010203@googlemail.com> <493A9183.5030906@ar.media.kyoto-u.ac.jp> <493AB07E.5060201@googlemail.com> <493AAF26.8010802@ar.media.kyoto-u.ac.jp> <493ABE83.8020709@googlemail.com> Message-ID: <493B58B3.80509@ar.media.kyoto-u.ac.jp> Peter Maas wrote: > David Cournapeau schrieb: > >> Could you give the full build log, after having made sure to remove the >> build directory ? Because I have never seen this error: None, and >> without more context, it is difficult to see what could be wrong. >> > > Hi David, > The underlying problem is that you don't have mingw in your PATH (you should be able to run gcc from the command line). I don't know yet why with python 2.6 the error is not explicit (with 2.5, you get a failure later, and you know it is because gcc.exe is not found). Adding mingw\bin into your PATH should solve the problem. I can't spend much time on this ATM, but for the record, the problem is in distutils.command.config. David From glen.shennan at gmail.com Sun Dec 7 01:59:13 2008 From: glen.shennan at gmail.com (Glen Shennan) Date: Sun, 7 Dec 2008 17:59:13 +1100 Subject: [SciPy-user] Building Scipy from source In-Reply-To: <5b8d13220812020629k8565f62ta37d8bf11e9c5134@mail.gmail.com> References: <5b8d13220812020629k8565f62ta37d8bf11e9c5134@mail.gmail.com> Message-ID: Thanks very much for that. I was hoping to build everything from source mainly for the learning experience. I thought I got most of the way through it but apparently not. One day. :) 2008/12/3 David Cournapeau > On Tue, Dec 2, 2008 at 9:34 PM, Glen Shennan > wrote: > > Hello, > > > > I'm new to Scipy (and Linux in general) and am trying to build Scipy from > > source, following the directions on the official installation guide > > beginning at the section titled "Building everything from source with > > gfortran on Ubuntu". I am running Ubuntu 8.04 (Debian, kernel version > > 2.6.24-22) on a dual-core AMD 64 bit machine. I can get through the > > building of lapack, ATLAS, UMFPACK, FFTW without problems but I can't > finish > > off the numpy/scipy compile and was hoping someone here could enlighten > me. > > Numpy compiles but I'm not sure that it is working as intended/required > and > > the scipy build produces a huge string of errors ending with: > > Hi Glen, > > The easiest way to build both numpy and scipy is to avoid building > atlas, blas and co by yourself. Those are difficult to build right. 
>
> Do the following:
>
> sudo apt-get install g77 atlas3-base-dev atlas3-base
>
> Then remove the build directories in both numpy and scipy (to make
> sure you build from scratch), as well as the site.cfg files. You
> should then be able to build both numpy and scipy without any trouble.
> To avoid trouble with umfpack, you should try building scipy with the
> following command:
>
> UMFPACK=None python setup.py build
>
> Finally, although the last numpy release is fine, you should build scipy
> from svn instead of 0.6.0. 0.6 is more than one year old; we are about
> to release 0.7, so the trunk should be fairly stable (and we would be
> able to help you better if there is any problem compared to 0.6).
>
> David
> _______________________________________________
> SciPy-user mailing list
> SciPy-user at scipy.org
> http://projects.scipy.org/mailman/listinfo/scipy-user

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From dineshbvadhia at hotmail.com Sun Dec 7 10:27:22 2008
From: dineshbvadhia at hotmail.com (Dinesh B Vadhia)
Date: Sun, 7 Dec 2008 07:27:22 -0800
Subject: [SciPy-user] New Numpy and Scipy Beta Releases
Message-ID:

I missed the announcements for the recent Numpy 1.2.1 and Scipy 0.7.0b1 releases, but do these work with Python 2.6 or still only Python 2.5? Thanks.

Dinesh

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From chaichenets at int.uni-karlsruhe.de Sun Dec 7 14:39:28 2008
From: chaichenets at int.uni-karlsruhe.de (Leonid Chaichenets)
Date: Sun, 7 Dec 2008 20:39:28 +0100
Subject: [SciPy-user] high precision bessel's functions
Message-ID: <20081207193928.GA26946@int.uni-karlsruhe.de>

Hello,

I'm looking for a high precision (more than float/double) implementation of the hankel2 function.

My actual problem is that I need to evaluate hankel2 of a relatively high order (7.5) on a short path in the complex plane. Most of the precision is lost, as the values are of high magnitude (10^6) and vary in the 6th decimal place only. The roundoff errors then propagate through the calculation and become significant.

Does anyone know an implementation of hankel2 with more precision? Can maybe the scaled Bessel functions (scipy.special.hankel2e) be used for that (unfortunately I couldn't find enough documentation on them)?

--
Thanks a lot,
Leonid Chaichenets.

From haase at msg.ucsf.edu Sun Dec 7 16:49:42 2008
From: haase at msg.ucsf.edu (Sebastian Haase)
Date: Sun, 7 Dec 2008 22:49:42 +0100
Subject: [SciPy-user] from scipy import * -- was: [Scipy-tickets] #635: update the old scipy tutorial and re-release it
Message-ID:

Hi all,

I just got the mentioned scipy ticket emailed, which triggered this email:

Isn't the official guideline always "don't use >>> from ... import *"?
Why are tutorials often written with introductory sentences like this:

"""Throughout this tutorial it is assumed that the user has imported
all of the names defined in the SciPy top-level namespace using the
command

>>> from scipy import *
"""

???

I would argue that for two reasons this should be changed:
1) Especially tutorials should lead by example and show any newcomer how to do it "right" - and therefore start with """>>> import scipy as sp""" instead.
2) It would be much easier to spot which commands / types / functions come from scipy and which are standard python built-ins or from the py standard lib.

It would be nice to hear if others also feel (somewhat) strongly about this ....
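For illustration, a tutorial snippet in the namespaced style might read (the np alias and the toy objective function below are just the common convention and an example of my own, nothing official):

import numpy as np
from scipy import optimize

x = np.linspace(0.0, 1.0, 101)            # clearly a numpy call
f = lambda t: (t - 0.3) ** 2
xmin = optimize.fminbound(f, 0.0, 1.0)    # clearly scipy.optimize

Every name carries its origin with it, and nothing shadows the built-ins.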
Cheers,
Sebastian Haase

Forwarded conversation
Subject: Re: [Scipy-tickets] [SciPy] #635: update the old scipy tutorial and re-release it
------------------------

From: SciPy
Date: Sun, Nov 30, 2008 at 2:11 AM
To:
Cc: scipy-tickets at scipy.org

#635: update the old scipy tutorial and re-release it
-------------------------+--------------------------------------------------
 Reporter: AndrewStraw   | Owner: somebody
 Type: enhancement       | Status: new
 Priority: low           | Milestone: 0.7.0
 Component: Other        | Version:
 Severity: normal        | Resolution:
 Keywords:               |
-------------------------+--------------------------------------------------
Changes (by pv):

 * milestone: => 0.7.0

Comment:

This is nearly done; cf. http://docs.scipy.org/scipy/docs/scipy-docs/tutorial/index.rst/

--
Ticket URL:
SciPy
SciPy is open-source software for mathematics, science, and engineering.

_______________________________________________
Scipy-tickets mailing list
Scipy-tickets at scipy.org
http://projects.scipy.org/mailman/listinfo/scipy-tickets

----------
From: SciPy
Date: Sun, Dec 7, 2008 at 6:21 PM
To:
Cc: scipy-tickets at scipy.org

 Type: enhancement       | Status: closed
 Severity: normal        | Resolution: fixed
 Keywords:               |
-------------------------+--------------------------------------------------
Changes (by pv):

 * status: new => closed
 * resolution: => fixed

Comment:

The tutorial is now up-to-date.

--
Ticket URL:

From gael.varoquaux at normalesup.org Sun Dec 7 16:51:51 2008
From: gael.varoquaux at normalesup.org (Gael Varoquaux)
Date: Sun, 7 Dec 2008 22:51:51 +0100
Subject: [SciPy-user] from scipy import * -- was: [Scipy-tickets] #635: update the old scipy tutorial and re-release it
In-Reply-To:
References:
Message-ID: <20081207215151.GB22858@phare.normalesup.org>

On Sun, Dec 07, 2008 at 10:49:42PM +0100, Sebastian Haase wrote:
> It would be nice to hear if others also feel (somewhat) strongly about this ....

I am +1 on that.

Gaël

From millman at berkeley.edu Sun Dec 7 16:57:58 2008
From: millman at berkeley.edu (Jarrod Millman)
Date: Sun, 7 Dec 2008 13:57:58 -0800
Subject: [SciPy-user] from scipy import * -- was: [Scipy-tickets] #635: update the old scipy tutorial and re-release it
In-Reply-To:
References:
Message-ID:

On Sun, Dec 7, 2008 at 1:49 PM, Sebastian Haase wrote:
> Isn't the official guideline always "don't use >>> from ... import *"?
> Why are tutorials often written with introductory sentences like this:
> """Throughout this tutorial it is assumed that the user has imported
> all of the names defined in the SciPy top-level namespace using the
> command
> >>> from scipy import *
> """

Yes, it should (and will) be changed to:

import scipy as sp

The text was converted from the old guide and it needs to be cleaned up and improved. Please feel free to update it. The more people who contribute, the faster it will get done.

Thanks,

--
Jarrod Millman
Computational Infrastructure for Research Labs
10 Giannini Hall, UC Berkeley
phone: 510.643.4014
http://cirl.berkeley.edu/

From robert.kern at gmail.com Sun Dec 7 17:19:01 2008
From: robert.kern at gmail.com (Robert Kern)
Date: Sun, 7 Dec 2008 16:19:01 -0600
Subject: [SciPy-user] high precision bessel's functions
In-Reply-To: <20081207193928.GA26946@int.uni-karlsruhe.de>
References: <20081207193928.GA26946@int.uni-karlsruhe.de>
Message-ID: <3d375d730812071419y36d9ef2ep8e87e4678258595e@mail.gmail.com>

On Sun, Dec 7, 2008 at 13:39, Leonid Chaichenets wrote:
> Hello,
>
> I'm looking for a high precision (more than float/double) implementation of
> the hankel2 function.
>
> My actual problem is that I need to evaluate hankel2 of a relatively high
> order (7.5) on a short path in the complex plane. Most of the precision is
> lost, as the values are of high magnitude (10^6) and vary in the 6th decimal
> place only. The roundoff errors then propagate through the calculation and
> become significant.
>
> Does anyone know an implementation of hankel2 with more precision? Can maybe
> the scaled Bessel functions (scipy.special.hankel2e) be used for that
> (unfortunately I couldn't find enough documentation on them)?

mpmath has Bessel functions. You should be able to construct the Hankel functions from those.

http://code.google.com/p/mpmath/

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco

From chaichenets at int.uni-karlsruhe.de Sun Dec 7 18:31:59 2008
From: chaichenets at int.uni-karlsruhe.de (Leonid Chaichenets)
Date: Mon, 8 Dec 2008 00:31:59 +0100
Subject: [SciPy-user] [SOLVED] high precision bessel's functions
In-Reply-To: <3d375d730812071419y36d9ef2ep8e87e4678258595e@mail.gmail.com>
References: <20081207193928.GA26946@int.uni-karlsruhe.de> <3d375d730812071419y36d9ef2ep8e87e4678258595e@mail.gmail.com>
Message-ID: <20081207233159.GA27260@int.uni-karlsruhe.de>

Hi Robert,

On Sun, Dec 07, 2008 at 04:19:01PM -0600, Robert Kern wrote:
>> Does anyone know an implementation of hankel2 with more precision?
> mpmath has Bessel functions. You should be able to construct the
> Hankel functions from those.
>
> http://code.google.com/p/mpmath/

Thanks a lot, this saved my day!

--
Best Regards,
Leonid Chaichenets.

From nwagner at iam.uni-stuttgart.de Mon Dec 8 02:55:30 2008
From: nwagner at iam.uni-stuttgart.de (Nils Wagner)
Date: Mon, 08 Dec 2008 08:55:30 +0100
Subject: [SciPy-user] Excel macros
Message-ID:

Hi all,

Is it possible to run Excel macros from Python in linux?

Nils

http://www.lexicon.net/sjmachin/xlrd.htm

From Dharhas.Pothina at twdb.state.tx.us Mon Dec 8 09:00:48 2008
From: Dharhas.Pothina at twdb.state.tx.us (Dharhas Pothina)
Date: Mon, 08 Dec 2008 08:00:48 -0600
Subject: [SciPy-user] scikit.timeseries plotting bugs?
Message-ID: <493CD4300200009B00018891@GWWEB.twdb.state.tx.us>

> The problem with the year showing up twice should be fixed in svn now. I've
> fixed a few issues with plotting recently so please test your plots with the new
> code and let me know if you encounter any problems.

I tried installing the svn version and I got the following error:

...
...
File "/home/dharhas/.virtualenvs/ts/lib64/python2.5/site-packages/numpy/distutils/misc_util.py", line 781, in _get_configuration_from_setup_py
    config = setup_module.configuration(*args)
File "/home/dharhas/downloads/software/scipy/timeseries/setup.py", line 81, in configuration
File "/home/dharhas/.virtualenvs/ts/lib64/python2.5/site-packages/numpy/distutils/misc_util.py", line 1239, in add_scripts
    dist.scripts.extend(scripts)
AttributeError: 'NoneType' object has no attribute 'extend'

Do I need to remove the earlier version first?

>> I tried your suggestion and if you look at the attached JDM2_2.gif, you can
>> see that the minor ticks don't align with the major ticks
>
> Could you elaborate on what you mean by this? The only problem I see in your two
> graphs is the year showing up twice.

Well I guess my question is what do the minor ticks represent.
If they represent weeks, they look OK; but if they are subdivisions of the major ticks, then they should have the same spacing between any two major ticks.

- dharhas

_______________________________________________
SciPy-user mailing list
SciPy-user at scipy.org
http://projects.scipy.org/mailman/listinfo/scipy-user

From jeff.lyon at cox.net Mon Dec 8 10:28:01 2008
From: jeff.lyon at cox.net (Jeff Lyon)
Date: Mon, 8 Dec 2008 08:28:01 -0700
Subject: [SciPy-user] Excel macros
In-Reply-To:
References:
Message-ID: <847518E5-4184-4195-B610-346F1991894F@cox.net>

>
>> Is it possible to run Excel macros from Python in linux?

Does Microsoft port Excel to linux? I found that OpenOffice supports python scripting in their framework; maybe that's a good place to start.

http://framework.openoffice.org/scripting/index.html

Regards,

Jeff Lyon

From tonyyu at MIT.EDU Mon Dec 8 11:47:05 2008
From: tonyyu at MIT.EDU (Tony S Yu)
Date: Mon, 8 Dec 2008 11:47:05 -0500
Subject: [SciPy-user] easy_install problem for scipy 0.7b1 on OS X 10.5.5
Message-ID: <7F1A7252-1F69-4122-A9F5-E0EE09E90A14@mit.edu>

I tried to install scipy 0.7b1 using easy_install and received the following error:

$ error: Setup script exited with error: file 'ARPACK/FWRAPPERS/veclib_cabi_c.c' does not exist

Any ideas?

Thanks in advance,
-Tony

OS X 10.5.5
Python 2.5.1
Numpy 1.2.1

From tonyyu at MIT.EDU Mon Dec 8 12:08:03 2008
From: tonyyu at MIT.EDU (Tony S Yu)
Date: Mon, 8 Dec 2008 12:08:03 -0500
Subject: [SciPy-user] easy_install problem for scipy 0.7b1 on OS X 10.5.5
In-Reply-To:
References:
Message-ID:

>
> Date: Mon, 8 Dec 2008 11:47:05 -0500
> From: Tony S Yu
> Subject: [SciPy-user] easy_install problem for scipy 0.7b1 on OS X 10.5.5
> To: scipy-user at scipy.org
>
> I tried to install scipy 0.7b1 using easy_install and received the
> following error:
>
> $ error: Setup script exited with error: file 'ARPACK/FWRAPPERS/veclib_cabi_c.c' does not exist

Sorry, I didn't see this email earlier:
http://projects.scipy.org/pipermail/scipy-user/2008-December/018899.html

From rmay31 at gmail.com Mon Dec 8 12:08:08 2008
From: rmay31 at gmail.com (Ryan May)
Date: Mon, 08 Dec 2008 11:08:08 -0600
Subject: [SciPy-user] from scipy import * -- was: [Scipy-tickets] #635: update the old scipy tutorial and re-release it
In-Reply-To:
References:
Message-ID: <493D5478.4010102@gmail.com>

Sebastian Haase wrote:
> It would be nice to hear if others also feel (somewhat) strongly about this ....

+1

The best way to break bad habits is to never learn them in the first place.
Ryan

--
Ryan May
Graduate Research Assistant
School of Meteorology
University of Oklahoma

_______________________________________________
SciPy-user mailing list
SciPy-user at scipy.org
http://projects.scipy.org/mailman/listinfo/scipy-user

From mattknox.ca at gmail.com Mon Dec 8 12:41:42 2008
From: mattknox.ca at gmail.com (Matt Knox)
Date: Mon, 8 Dec 2008 17:41:42 +0000 (UTC)
Subject: [SciPy-user] scikit.timeseries plotting bugs?
References: <493CD4300200009B00018891@GWWEB.twdb.state.tx.us>
Message-ID:

> I tried installing the svn version and I got the following error:
>
> ...
>
> Do I need to remove the earlier version first?

In general, installing new versions of python packages without removing the old version first can cause unexpected behaviour. There are a lot of problems posted on the numpy/scipy mailing list which turn out to be caused by someone installing over top of an old version. Let me know if you still have problems after cleaning out the old version.

> Well I guess my question is what do the minor ticks represent. If they
> represent weeks, they look OK; but if they are subdivisions of the major
> ticks, then they should have the same spacing between any two major ticks.

They represent weeks in the case of your first graph. Note that it is not possible to subdivide the major ticks exactly evenly, because the major ticks themselves are not evenly spaced (depends on the number of days in the month, etc). The minor ticks are normally placed at one frequency higher than the major ticks (e.g. if the major ticks are at one-month intervals, the minor ticks are at one-week intervals, etc).

- Matt

From Dharhas.Pothina at twdb.state.tx.us Mon Dec 8 13:19:23 2008
From: Dharhas.Pothina at twdb.state.tx.us (Dharhas Pothina)
Date: Mon, 08 Dec 2008 12:19:23 -0600
Subject: [SciPy-user] scikit.timeseries plotting bugs?
In-Reply-To:
References: <493CD4300200009B00018891@GWWEB.twdb.state.tx.us>
Message-ID: <493D10CB.63BA.009B.0@twdb.state.tx.us>

How do you clean out an old version? Do you delete a directory or do you have to run an uninstall script?

> In general, installing new versions of python packages without removing the
> old version first can cause unexpected behaviour. There are a lot of problems
> posted on the numpy/scipy mailing list which turn out to be caused by someone
> installing over top of an old version. Let me know if you still have problems
> after cleaning out the old version.

From fredrik.johansson at gmail.com Mon Dec 8 16:05:55 2008
From: fredrik.johansson at gmail.com (Fredrik Johansson)
Date: Mon, 8 Dec 2008 22:05:55 +0100
Subject: [SciPy-user] [SOLVED] high precision bessel's functions
Message-ID: <3d0cebfb0812081305n7186151r4efc65f708e0f4d@mail.gmail.com>

Leonid Chaichenets wrote:
> Does anyone know an implementation of hankel2 with more precision? Can maybe
> the scaled Bessel functions (scipy.special.hankel2e) be used for that
> (unfortunately I couldn't find enough documentation on them)?

Robert Kern wrote:
> mpmath has Bessel functions. You should be able to construct the
> Hankel functions from those.

Hi, I'm the main author of mpmath, and though this problem has already been solved (thanks Robert), I thought I'd drop a comment. Unfortunately mpmath only had the Bessel J function, and though you can compute the other Bessel functions from it, it requires some trickery when the order is an integer (though that shouldn't be a problem in this case since the order was explicitly stated to be a half-integer).
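For non-integer order that construction is only a few lines; a sketch with mpmath (besselj and mp.dps as in recent versions of the library, and the precision, order and argument below are purely illustrative):

from mpmath import mp, mpc, besselj, cos, sin, pi

mp.dps = 50   # work with 50 significant digits

def hankel2_from_j(nu, z):
    # for non-integer nu: Y_nu = (J_nu*cos(nu*pi) - J_{-nu}) / sin(nu*pi)
    jnu = besselj(nu, z)
    ynu = (jnu * cos(nu * pi) - besselj(-nu, z)) / sin(nu * pi)
    # and H2_nu = J_nu - i*Y_nu
    return jnu - mpc(0, 1) * ynu

print(hankel2_from_j(mp.mpf('7.5'), mpc(10, 0.01)))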
Since this isn't the first time someone asked for Bessel functions in mpmath (or rather, asked for Bessel functions in a more general setting and was pointed to mpmath), I've now implemented the Bessel I, Y, K and Hankel H1/H2 functions, and in a way that hopefully avoids the major numerical issues. You can get it by checking out the SVN version. Documentation is here:
http://mpmath.googlecode.com/svn/trunk/doc/build/functions/hypergeometric.html

Leonid, I'd be interested to know if this implementation of the Hankel function works for your problem. If it's not too complicated, I'd like to add your calculation (or a simplified version thereof) to the test suite or as a documentation example. It's always nice with real-world tests.

Fredrik

From mattknox.ca at gmail.com Mon Dec 8 16:57:08 2008
From: mattknox.ca at gmail.com (Matt Knox)
Date: Mon, 8 Dec 2008 21:57:08 +0000 (UTC)
Subject: [SciPy-user] scikit.timeseries plotting bugs?
References: <493CD4300200009B00018891@GWWEB.twdb.state.tx.us> <493D10CB.63BA.009B.0@twdb.state.tx.us>
Message-ID:

> How do you clean out an old version? Do you delete a directory or do you
> have to run an uninstall script?

I'm not really sure what the best way to do that on linux is (I'm a windows user). I don't believe there is any kind of uninstall script though; I think you have to manually delete the files. If you need more help with uninstalling stuff, I would post a new message without the phrase "timeseries" in it, because a lot of people probably aren't bothering to read this thread if they don't use time series.

- Matt

From pgmdevlist at gmail.com Mon Dec 8 17:29:27 2008
From: pgmdevlist at gmail.com (Pierre GM)
Date: Mon, 8 Dec 2008 17:29:27 -0500
Subject: [SciPy-user] scikit.timeseries plotting bugs?
In-Reply-To:
References: <493CD4300200009B00018891@GWWEB.twdb.state.tx.us> <493D10CB.63BA.009B.0@twdb.state.tx.us>
Message-ID:

On Dec 8, 2008, at 4:57 PM, Matt Knox wrote:
>> How do you clean out an old version? Do you delete a directory or
>> do you have to run an uninstall script?

AFAIK, there's usually no uninstall script coming with setuptools. What I usually do is rename the package I want to delete (by putting a _ in front or at the end), install the new version, then delete the old one. That's usually OK as long as no executables are installed anywhere. That definitely works with scikits.timeseries. A cool thing with virtualenv is that you can create a new virtualenv and install a new version of the package without erasing any old one...

From ndbecker2 at gmail.com Mon Dec 8 20:12:23 2008
From: ndbecker2 at gmail.com (Neal Becker)
Date: Mon, 08 Dec 2008 20:12:23 -0500
Subject: [SciPy-user] [SOLVED] high precision bessel's functions
References: <3d0cebfb0812081305n7186151r4efc65f708e0f4d@mail.gmail.com>
Message-ID:

Fredrik Johansson wrote:
[...]
>
> Since this isn't the first time someone asked for Bessel functions in
> mpmath (or rather, asked for Bessel functions in a more general
> setting and was pointed to mpmath) I've now implemented the Bessel I,
> Y, K and Hankel H1/H2 functions, and in a way that hopefully avoids
> the major numerical issues. You can get it by checking out the SVN
> version. Documentation is here:
> http://mpmath.googlecode.com/svn/trunk/doc/build/functions/hypergeometric.html
>
> Leonid, I'd be interested to know if this implementation of the Hankel
If it's not too complicated, I'd like > to add your calculation (or a simplified version thereof) to the test > suite or as a documentation example. It's always nice with real-world > tests. > > Fredrik Thanks! From millman at berkeley.edu Tue Dec 9 05:11:42 2008 From: millman at berkeley.edu (Jarrod Millman) Date: Tue, 9 Dec 2008 02:11:42 -0800 Subject: [SciPy-user] Please help prepare the 0.7 release notes Message-ID: We are almost ready for 0.7.0rc1 (we just need to sort out the Numerical Recipes issues and I haven't had time to look though them yet). So I wanted to ask once more for help with preparing the release notes: http://projects.scipy.org/scipy/scipy/browser/trunk/doc/release/0.7.0-notes.rst There have been numerous improvements and changes. As always I would appreciate any feedback about mistakes or omissions. It would also be nice to know how many tests were in the last release and how many are there now. Highlighting major bug fixes or pointing out know issues would be very useful. I would also like to ask if anyone would be interested in stepping forward to work on something like Andrew Kuchling's "What's New in Python ....": http://docs.python.org/whatsnew/2.6.html This would be a great area to contribute. The release notes provide visibility for our developers' immense contributions of time and effort. They help provide an atmosphere of momentum, maturity, and excitement to a project. It is also a great service to users who haven't been following the trunk closely as well as other developer's who have missed what is happening in other areas of the code. It is also becomes a nice historical artifact for the future. It would be great if someone wanted to contribute in this way. Ideally, I would like to have someone who be interested in doing this for several releases of scipy and numpy. Such a person could develop a standard template for this and write some scripts to gather specific statistics (e.g., how many lines of code have changed, how many unit tests were added, what is the test coverage, what is the docstring coverage, who were the top contributors, who has increased their code contributions the most, how many new developers, etc.) Just a thought. Figure it won't happen, if I don't ask. Thanks, -- Jarrod Millman Computational Infrastructure for Research Labs 10 Giannini Hall, UC Berkeley phone: 510.643.4014 http://cirl.berkeley.edu/ From almar.klein at gmail.com Tue Dec 9 05:47:16 2008 From: almar.klein at gmail.com (Almar Klein) Date: Tue, 9 Dec 2008 11:47:16 +0100 Subject: [SciPy-user] from scipy import * -- was: [Scipy-tickets] #635:update the old scipy tutorial and In-Reply-To: <493D04DE.63BA.009B.0@twdb.state.tx.us> References: <493D5478.4010102@gmail.com> <493D04DE.63BA.009B.0@twdb.state.tx.us> Message-ID: Hi, I moved from Matlab to python about half a year ago and I love it. Scipy is the reason why this is possible at all for me. But it always annoyed me when I read "from scipy import *" or "from pylab import *". As it says in the python ZEN: "Namespaces are one honking great idea -- let's do more of those!", and not less... A big +1 Cheers, Almar I love scipy and 2008/12/8 Dharhas Pothina > > + 1 > > I started using python/numpy/scipy in the past year and am trying to break > the habit of using 'from scipy import *' that I picked up while trying out > the examples and tutorials. I already have a fair number of scripts that I > need to convert from when I didn't know better. 
> > - dharhas > > >>> Ryan May 12/8/2008 11:08 AM >>> > Sebastian Haase wrote: > > It would be nice to hear if others also feel (somewhat) strongly about > this .... > > +1 > > The best way to break bad habits is to never learn them in the first place. > > Ryan > > -- > Ryan May > Graduate Research Assistant > School of Meteorology > University of Oklahoma > _______________________________________________ > SciPy-user mailing list > SciPy-user at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-user > > _______________________________________________ > SciPy-user mailing list > SciPy-user at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-user > -------------- next part -------------- An HTML attachment was scrubbed... URL: From timmichelsen at gmx-topmail.de Tue Dec 9 16:27:56 2008 From: timmichelsen at gmx-topmail.de (Tim Michelsen) Date: Tue, 09 Dec 2008 22:27:56 +0100 Subject: [SciPy-user] scikit.timeseries plotting bugs? In-Reply-To: References: <493CD4300200009B00018891@GWWEB.twdb.state.tx.us> <493D10CB.63BA.009B.0@twdb.state.tx.us> Message-ID: Hi, since you seem to be on a Linux machine I recommend using checkinstall: http://jhcore.com/2008/06/18/pymedia-on-ubuntu-hardy-heron/ http://wiki.ubuntuusers.de/Programme_kompilieren => German, but I think you can read the code insets. If you have problems come back to me. I successfully created a deb for timseries. Regards, Timmie From eads at soe.ucsc.edu Tue Dec 9 17:34:47 2008 From: eads at soe.ucsc.edu (Damian Eads) Date: Tue, 9 Dec 2008 15:34:47 -0700 Subject: [SciPy-user] [Numpy-discussion] Support for sparse matrix in Distance function (and clustering)? In-Reply-To: <91b4b1ab0812091432l4306c1bep6a20370e1e3615f6@mail.gmail.com> References: <993136.52361.qm@web50410.mail.re2.yahoo.com> <91b4b1ab0812091432l4306c1bep6a20370e1e3615f6@mail.gmail.com> Message-ID: <91b4b1ab0812091434j16eb9cf1sd36220d6e48376f2@mail.gmail.com> Btw, I didn't notice this before I clicked "Send" but this question is more appropriate for the SciPy users list. Please make sure you direct your reply there. On Tue, Dec 9, 2008 at 3:32 PM, Damian Eads wrote: > Hi, > > Can you be more specific? Do you need sparse matrices to represent > observation vectors because they are sparse? Or do you need sparse > matrices to represent distance matrices because most vectors you are > clustering are similar while a few are dissimilar? > > The clustering code is written mostly in C and does not support sparse > matrices. However, this should not matter because most of the > clustering code does not look at the raw observation vectors > themselves, just the distances passed as a distance matrix. > > Damian > > On Tue, Dec 9, 2008 at 1:28 PM, Bab Tei wrote: >> Hi >> Does the distance function in spatial package support sparse matrix? >> regards >> >> >> >> _______________________________________________ >> Numpy-discussion mailing list >> Numpy-discussion at scipy.org >> http://projects.scipy.org/mailman/listinfo/numpy-discussion > > ----------------------------------------------------- > Damian Eads Ph.D. 
Student
> Jack Baskin School of Engineering, UCSC E2-489
> 1156 High Street Machine Learning Lab
> Santa Cruz, CA 95064 http://www.soe.ucsc.edu/~eads
>

From fritz.peter.maas at googlemail.com Tue Dec 9 18:14:46 2008
From: fritz.peter.maas at googlemail.com (Peter Maas)
Date: Wed, 10 Dec 2008 00:14:46 +0100
Subject: [SciPy-user] Problems building NumPy with Python 2.6 / MinGW 3.4.5 on Windows XP SP3
In-Reply-To: <493B58B3.80509@ar.media.kyoto-u.ac.jp>
References: <493A9063.2010203@googlemail.com> <493A9183.5030906@ar.media.kyoto-u.ac.jp> <493AB07E.5060201@googlemail.com> <493AAF26.8010802@ar.media.kyoto-u.ac.jp> <493ABE83.8020709@googlemail.com> <493B58B3.80509@ar.media.kyoto-u.ac.jp>
Message-ID: <493EFBE6.1050202@googlemail.com>

David Cournapeau schrieb:
> The underlying problem is that you don't have mingw in your PATH (you
> should be able to run gcc from the command line).

Hi David,

If I open a cmd window and type "gcc" it responds "gcc: no input files".
My PATH looks like ...;C:\MinGW\bin;... And with python 2.5 the build
works!

> I can't spend much time on this ATM, but for the record, the problem is
> in distutils.command.config.

Thanks a lot for the time you already spent! And thanks for the hint
although I don't have much hope that I'm able to find the bug. I'll
stick with Python 2.5 in the meantime.

--
Regards, Peter.

Peter Maas, Aachen, Germany

From david at ar.media.kyoto-u.ac.jp Wed Dec 10 01:05:08 2008
From: david at ar.media.kyoto-u.ac.jp (David Cournapeau)
Date: Wed, 10 Dec 2008 15:05:08 +0900
Subject: [SciPy-user] Problems building NumPy with Python 2.6 / MinGW 3.4.5 on Windows XP SP3
In-Reply-To: <493EFBE6.1050202@googlemail.com>
References: <493A9063.2010203@googlemail.com> <493A9183.5030906@ar.media.kyoto-u.ac.jp> <493AB07E.5060201@googlemail.com> <493AAF26.8010802@ar.media.kyoto-u.ac.jp> <493ABE83.8020709@googlemail.com> <493B58B3.80509@ar.media.kyoto-u.ac.jp> <493EFBE6.1050202@googlemail.com>
Message-ID: <493F5C14.9010201@ar.media.kyoto-u.ac.jp>

Peter Maas wrote:
> Hi David,
>
> If I open a cmd window and type "gcc" it responds "gcc: no input files".
> My PATH looks like ...;C:\MinGW\bin;... And with python 2.5 the build
> works!
>

Which version of gcc are you using ?

David

From nwagner at iam.uni-stuttgart.de Wed Dec 10 07:11:26 2008
From: nwagner at iam.uni-stuttgart.de (Nils Wagner)
Date: Wed, 10 Dec 2008 13:11:26 +0100
Subject: [SciPy-user] reading and writing data in Excel files
Message-ID: 

Hi all,

I am looking for pythonic ways of reading and writing data
in Excel files.

Just now I found pyExcelerator. Are there other python
tools ?

Any pointer would be appreciated.

Thanks in advance
Nils

From jeremy at jeremysanders.net Wed Dec 10 07:26:01 2008
From: jeremy at jeremysanders.net (Jeremy Sanders)
Date: Wed, 10 Dec 2008 12:26:01 +0000
Subject: [SciPy-user] numpy indexing with arrays
Message-ID: 

Hi - I've been trying to think of a way of doing this in numpy. Is it
possible without an explicit loop?

I have an array of values

vals = [1.1, 2.2, 3.3...]

I also have a 2D array of indices into this vals array:

img = [[1,2,3],
       [2,3,4],
       ..]

Can I make another image with values from vals as indexed by the
indices in img? i.e.,

img = [[1.1, 2.2, 3.3],
       [2.2, 3.3, 4.4],
       ...]
Thanks

Jeremy

From discerptor at gmail.com Wed Dec 10 07:24:24 2008
From: discerptor at gmail.com (Joshua Lippai)
Date: Wed, 10 Dec 2008 04:24:24 -0800
Subject: [SciPy-user] reading and writing data in Excel files
In-Reply-To: 
References: 
Message-ID: <9911419a0812100424h70ca5ab9l37b55a4cf72fca2d@mail.gmail.com>

With PyExcelerator installed, you can use the Excel tools in the
matplotlib toolkits

http://matplotlib.sourceforge.net/users/toolkits.html

Using them, you can read in Excel files as recarrays and write
recarrays to Excel files. It's fairly well-documented through
docstrings.

Josh

On Wed, Dec 10, 2008 at 4:11 AM, Nils Wagner wrote:
> Hi all,
>
> I am looking for pythonic ways of reading and writing data
> in Excel files.
>
> Just now I found pyExcelerator. Are there other python
> tools ?
>
> Any pointer would be appreciated.
>
> Thanks in advance
> Nils
> _______________________________________________
> SciPy-user mailing list
> SciPy-user at scipy.org
> http://projects.scipy.org/mailman/listinfo/scipy-user
>

From pav at iki.fi Wed Dec 10 07:32:24 2008
From: pav at iki.fi (Pauli Virtanen)
Date: Wed, 10 Dec 2008 12:32:24 +0000 (UTC)
Subject: [SciPy-user] numpy indexing with arrays
References: 
Message-ID: 

Wed, 10 Dec 2008 12:26:01 +0000, Jeremy Sanders wrote:
> Hi - I've been trying to think of a way of doing this in numpy. Is it
> possible without an explicit loop?
>
> I have an array of values
>
> vals = [1.1, 2.2, 3.3...]
>
> I also have a 2D array of indices into this vals array:
>
> img = [[1,2,3],
>        [2,3,4],
>        ..]
>
> Can I make another image with values from vals as indexed by the
> indices in img? i.e.,
>
> img = [[1.1, 2.2, 3.3],
>        [2.2, 3.3, 4.4],
>        ...]

Yes,

>>> import numpy as np
>>> vals = np.array([1.1, 2.2, 3.3, 4.4])
>>> img = np.array([[0,1,2],[1,2,3]])
>>> vals[img]
array([[ 1.1, 2.2, 3.3],
       [ 2.2, 3.3, 4.4]])

See

http://docs.scipy.org/doc/numpy/user/basics.indexing.html
http://docs.scipy.org/doc/numpy/reference/arrays.indexing.html

From nwagner at iam.uni-stuttgart.de Wed Dec 10 08:11:15 2008
From: nwagner at iam.uni-stuttgart.de (Nils Wagner)
Date: Wed, 10 Dec 2008 14:11:15 +0100
Subject: [SciPy-user] reading and writing data in Excel files
In-Reply-To: <9911419a0812100424h70ca5ab9l37b55a4cf72fca2d@mail.gmail.com>
References: <9911419a0812100424h70ca5ab9l37b55a4cf72fca2d@mail.gmail.com>
Message-ID: 

On Wed, 10 Dec 2008 04:24:24 -0800
"Joshua Lippai" wrote:
> With PyExcelerator installed, you can use the Excel tools in the
> matplotlib toolkits
>
> http://matplotlib.sourceforge.net/users/toolkits.html
>
> Using them, you can read in Excel files as recarrays and write
> recarrays to Excel files. It's fairly well-documented through
> docstrings.
>
> Josh
>

Hi Josh,

Thank you for your prompt response !
I found an example in matplotlib/examples/pylab_examples
loadrec.py

Is it possible to color cells depending on the entry ?

Nils

From dave.hirschfeld at gmail.com Wed Dec 10 09:23:48 2008
From: dave.hirschfeld at gmail.com (Dave Hirschfeld)
Date: Wed, 10 Dec 2008 14:23:48 +0000 (UTC)
Subject: [SciPy-user] scikits.timeseries
References: <99B5C565-967B-43AB-A978-F0F740B31FB8@gmail.com> 
Message-ID: 

Pierre GM gmail.com> writes:
>
>
> On Nov 27, 2008, at 11:23 AM, Robert Ferrell wrote:
>
> > 2. I've noticed that 'business frequency' includes holidays, and that
> > can create holes in what are actually complete data sets. For
> > instance, Sep 01, 2008 was a holiday in the US (Labor Day).
> > Yes, the moniker "business days" is a bit deceptive, as it refers
> only to days that are not Saturday or Sunday. It'd be too tricky for
> us to implement holidays, as it'd vary from one place to another (no
> such things as Thanksgiving in Europe, for example...).

Hi Pierre & Matt,

I'm finding the timeseries package very useful but I've also run into the same
holidays issue as Robert. I was wondering if a solution of allowing the user
to specify the holidays (cf. Excel's NETWORKDAYS function) would be feasible?

In the following example the user is able to change the function in the
descriptor which would allow him/her to specify the holidays in their
particular part of the world. I don't claim that this is the best way to do it,
but I was wondering if such a scheme could be made to work in the wider context
of the timeseries package?

Cheers,
Dave

from scikits.timeseries import Date, DateArray, date_array

class _isbusinessday(object):
    def __init__(self, func):
        assert callable(func)
        self.func = func
    def __get__(self, obj, objtype):
        return self.func(obj)
    def __set__(self, obj, func):
        assert callable(func)
        self.func = func
#

class BusinessDateArray(DateArray):
    isbusinessday = _isbusinessday(lambda x: x.weekday < 5)
    def __init__(self,*args,**kwargs):
        super(BusinessDateArray, self).__init__(*args,**kwargs)
#

dates = date_array(start_date=Date('D','01-Jan-2008'),length=100)
dates = BusinessDateArray(dates=dates)
print dates.isbusinessday

From jeremy at jeremysanders.net Wed Dec 10 09:46:43 2008
From: jeremy at jeremysanders.net (Jeremy Sanders)
Date: Wed, 10 Dec 2008 14:46:43 +0000
Subject: [SciPy-user] numpy indexing with arrays
References: 
Message-ID: 

Pauli Virtanen wrote:

>>>> import numpy as np
>>>> vals = np.array([1.1, 2.2, 3.3, 4.4])
>>>> img = np.array([[0,1,2],[1,2,3]])
>>>> vals[img]
> array([[ 1.1, 2.2, 3.3],
>        [ 2.2, 3.3, 4.4]])
>
> See
>
> http://docs.scipy.org/doc/numpy/user/basics.indexing.html
> http://docs.scipy.org/doc/numpy/reference/arrays.indexing.html

Thanks - I thought I had tried this and it had failed.

Jeremy

From Dharhas.Pothina at twdb.state.tx.us Wed Dec 10 10:49:29 2008
From: Dharhas.Pothina at twdb.state.tx.us (Dharhas Pothina)
Date: Wed, 10 Dec 2008 09:49:29 -0600
Subject: [SciPy-user] scikit.timeseries plotting bugs?
In-Reply-To: 
References: <493CD4300200009B00018891@GWWEB.twdb.state.tx.us> <493D10CB.63BA.009B.0@twdb.state.tx.us>
Message-ID: <493F90A9.63BA.009B.0@twdb.state.tx.us>

ok I fixed the install issues. It was a problem with numpy/scipy & the
blas/lapack libraries. Tim, thanks for the suggestion but your solution
seems to be for debian/ubuntu. I'm using Fedora.

- dharhas

>>> Tim Michelsen 12/9/2008 3:27 PM >>>
Hi,
since you seem to be on a Linux machine I recommend using checkinstall:
http://jhcore.com/2008/06/18/pymedia-on-ubuntu-hardy-heron/
http://wiki.ubuntuusers.de/Programme_kompilieren
=> German, but I think you can read the code insets.

If you have problems come back to me. I successfully created a deb for
timeseries.
Regards,
Timmie

_______________________________________________
SciPy-user mailing list
SciPy-user at scipy.org
http://projects.scipy.org/mailman/listinfo/scipy-user

From pgmdevlist at gmail.com Wed Dec 10 12:08:34 2008
From: pgmdevlist at gmail.com (Pierre GM)
Date: Wed, 10 Dec 2008 12:08:34 -0500
Subject: [SciPy-user] scikits.timeseries
In-Reply-To: 
References: <99B5C565-967B-43AB-A978-F0F740B31FB8@gmail.com> 
Message-ID: 

Dave,

On Dec 10, 2008, at 9:23 AM, Dave Hirschfeld wrote:
> Pierre GM gmail.com> writes:
>> On Nov 27, 2008, at 11:23 AM, Robert Ferrell wrote:
>>
>>> 2. I've noticed that 'business frequency' includes holidays, and
>>> that
>>> can create holes in what are actually complete data sets. For
>>> instance, Sep 01, 2008 was a holiday in the US (Labor Day).
>>
>> Yes, the moniker "business days" is a bit deceptive, as it refers
>> only to days that are not Saturday or Sunday. It'd be too tricky for
>> us to implement holidays, as it'd vary from one place to another (no
>> such things as Thanksgiving in Europe, for example...).
>
> Hi Pierre & Matt,
> I'm finding the timeseries package very useful but I've also run
> into the same
> holidays issue as Robert. I was wondering if a solution of allowing
> the user
> to specify the holidays (cf. Excel's NETWORKDAYS function) would be
> feasible?

Yes and no.
No : there's no plan for any user-defined frequency yet, if ever.
The whole machinery is in C, and it would be *very* tricky for us to
implement such a feature. Besides, this 'OpenBusinessDate' frequency
is far too local to be developed on a large scale.
Yes : This said, there should be a way to take holidays into account,
at a small scale. I'm thinking out loud here: Say we come up with a
list of holidays for a given period of time. We could use that to
mask specific dates on a series with Business/WeekDay frequency. That
way, conversion and statistics would still work seamlessly, we'd just
be working with masked data. However, we'd still have some problems.
A basic one would be to find the value in the series that falls 3
business days after some date: we could start adding 3 to the initial
date (in WeekDay frequency), but then we would have to check whether
there were some missing data during these 3 days (a vacation), and
adjust the result accordingly. Doable, but not straightforward.

> In the following example the user is able to change the function in
> the
> descriptor which would allow him/her to specify the holidays in their
> particular part of the world. I don't claim that this is the best
> way to do it,
> but I was wondering if such a scheme could be made to work in the
> wider context
> of the timeseries package?

We'd be more than happy to incorporate a good subclass of DateArray
that takes holidays into account, whether through your scheme or the
one I just suggested, and addresses some of the issues I listed above
(find the business day that falls 3 days from now). I don't have time
to do it myself, I don't think Matt has either, so we'll rely on
users to come up with a solution.
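For illustration, a rough, untested sketch of the masking idea (the
holiday list below is made up, and I'm assuming that indexing a series
with a Date object works the way I remember):

import numpy as np
import numpy.ma as ma
import scikits.timeseries as ts

# a business-day (Mon-Fri) series covering 2008
dates = ts.date_array(start_date=ts.Date('B', '01-Jan-2008'), length=262)
series = ts.time_series(np.arange(262, dtype=float), dates=dates)

# hypothetical, user-supplied list of local holidays
holidays = [ts.Date('B', '01-Sep-2008'), ts.Date('B', '25-Dec-2008')]
for holiday in holidays:
    series[holiday] = ma.masked   # stats/conversion now treat it as missing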
> from scikits.timeseries import Date, DateArray, date_array
>
> class _isbusinessday(object):
>     def __init__(self, func):
>         assert callable(func)
>         self.func = func
>     def __get__(self, obj, objtype):
>         return self.func(obj)
>     def __set__(self, obj, func):
>         assert callable(func)
>         self.func = func
> #
>
> class BusinessDateArray(DateArray):
>     isbusinessday = _isbusinessday(lambda x: x.weekday < 5)
>     def __init__(self,*args,**kwargs):
>         super(BusinessDateArray, self).__init__(*args,**kwargs)
> #
>
> dates = date_array(start_date=Date('D','01-Jan-2008'),length=100)
> dates = BusinessDateArray(dates=dates)
> print dates.isbusinessday
>
>
>
> _______________________________________________
> SciPy-user mailing list
> SciPy-user at scipy.org
> http://projects.scipy.org/mailman/listinfo/scipy-user

From Dharhas.Pothina at twdb.state.tx.us Wed Dec 10 12:32:26 2008
From: Dharhas.Pothina at twdb.state.tx.us (Dharhas Pothina)
Date: Wed, 10 Dec 2008 11:32:26 -0600
Subject: [SciPy-user] Still having plotting issue with latest svn scikits.timeseries
References: <49390FD20200009B00018804@GWWEB.twdb.state.tx.us>
Message-ID: <493FA8CA.63BA.009B.0@twdb.state.tx.us>

Hi,

I pulled this into a separate thread since the earlier one had too many
different things in it.

When I read a datafile in and try to plot it, whether it plots
correctly or not depends on the symbols I use in the plot. I think,
but I haven't exhaustively checked, that the problem occurs for
symbols that require some sort of interpolation, i.e. lines, dashed
lines, etc. In these cases the axes are correct but the plot is blank.
Symbols like dots or pluses seem to work fine.

If I convert the timeseries using the .convert or .asfreq methods the
plots work without a problem.

Since I can plot after converting using .asfreq this isn't too much of
a problem, but I was wondering if there is a bug or whether a warning
should be thrown. The dataset and commands I used are below.

thanks,

- dharhas

>>> "Dharhas Pothina" 12/5/2008 11:26 AM >>>
Hi,

I'm attaching an ASCII datafile called JDM3_short.txt and the commands
I typed in ipython to reproduce the problem. The first figure works;
the second has the correct axes but is blank.

In [2]: import sys
In [3]: import subprocess
In [4]: from numpy import *
In [5]: from pylab import *
In [6]: import datetime
In [7]: import scikits.timeseries as ts
In [8]: import scikits.timeseries.lib.plotlib as tpl
In [9]: import numpy.ma as ma
In [10]: fieldfile='JDM3_short.txt'
In [11]: year, month, day, hour, minute, fdata = loadtxt(fieldfile,comments="#",usecols=(0,1,2,3,4,8),unpack=True)
In [12]: fielddates = [datetime.datetime(int(y),int(m),int(d),int(hh),int(mm),0) for y,m,d,hh,mm in zip(year,month,day,hour,minute)]
In [13]: fdates = ts.date_array(fielddates,freq='MIN')
In [14]: fseries = ts.time_series(fdata, dates=fdates)
In [15]: #remove -999.9 nodata values of parameter
In [16]: fseries[fseries==-999.9] = ma.masked
In [17]:
In [18]: fseries = fseries.fill_missing_dates()
In [19]: fig = tpl.tsfigure()
In [20]: fsp = fig.add_tsplot(111)
In [21]: fsp.tsplot(fseries, '.')
Out[21]: []
In [22]: fig1 = tpl.tsfigure()
In [23]: fsp1 = fig1.add_tsplot(111)
In [24]: fsp1.tsplot(fseries, 'b', label='data')
Out[24]: []
In [25]: show()

- dharhas

>>> "Dharhas Pothina" 12/05/08 10:42 AM >>>

Ok I've narrowed it down. It has nothing to do with plotting two
timeseries. The problem is plotting any of the original timeseries I
read from files. Once I convert them to daily means the plotting works
fine.
With the original timeseries whether it plots or not seems to depend
on what plotting symbol I use.

- dharhas

>> If I use two commands and certain other symbols like 'b', 'b--' etc
>> for the first series, it only plots the second series :
> ??? can't reproduce this one...

I've double checked, it's happening consistently with multiple
datasets (the datasets are similar, just at different locations). How
do I track down the problem? Should I send you a datafile and my short
script?

-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: JDM3_short.txt
URL: 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Part.002
Type: application/octet-stream
Size: 151 bytes
Desc: not available
URL: 

From peter.skomoroch at gmail.com Wed Dec 10 12:54:29 2008
From: peter.skomoroch at gmail.com (Peter Skomoroch)
Date: Wed, 10 Dec 2008 12:54:29 -0500
Subject: [SciPy-user] fastest way to populate sparse matrix?
Message-ID: 

What is the fastest way to replace non-zero elements of a sparse
matrix with corresponding elements from a product of dense matrices,
without the memory overhead of computing the entire dense matrix
product?

The code below demonstrates the way I am doing it now: looping through
the nonzero elements in the sparse matrix, and forming the
corresponding row-column product from the dense matrices. It uses
the sparse module from the latest scipy trunk.

-Pete

#!/usr/bin/env python
# encoding: utf-8
"""
sparse_fill.py

uses latest sparse package from scipy svn
"""

import sys
import os
import time
from numpy import *
from numpy.random import random
import scipy.sparse as sparse
from scipy import io
import urllib
import tarfile

def main():
    # number of rows 1,916
    # number of columns 1,916
    # nonzeros 195,985

    print "loading matrix..."
    f = urllib.urlopen("http://www.cise.ufl.edu/research/sparse/MM/JGD_CAG/CAG_mat1916.tar.gz")
    # write in binary mode; opening the file in text mode and using
    # print would corrupt the gzip archive
    tar_download = open('CAG_mat1916.tar.gz','wb')
    tar_download.write(f.read())
    tar_download.close()
    tar = tarfile.open("CAG_mat1916.tar.gz", "r:gz")
    tar.extractall()

    # V is a sparse matrix, with around 5 % of entries populated
    V = io.mmread("CAG_mat1916/CAG_mat1916.mtx").tocsr()
    n,m = V.shape
    r = 200
    eps = 1e-9
    nonzeros = [tuple(ij) for ij in transpose(V.nonzero())]
    W = (random([n,r])).astype(float32)
    H = random([r,m]).astype(float32)

    #################################################
    # This block needs to be executed many times...
    # Fill non-zero elements of a sparse matrix with
    # corresponding elements from dense matrix product.
    # We don't want to compute the full matrix product so
    # we can save memory
    print "filling..."
    t0 = time.time()
    V_approx = sparse.lil_matrix((n,m), dtype=float32)
    for i,j in nonzeros:
        V_approx[i,j] = dot(W[i,:],H[:,j])
    print "time to fill:", time.time() - t0
    W_factor = (V*H.T + eps) / (V_approx*H.T + eps)
    W = W_factor*W
    ####################################################

if __name__ == '__main__':
    main()

--
Peter N.
Skomoroch
peter.skomoroch at gmail.com
http://www.datawrangling.com
http://del.icio.us/pskomoroch

From pgmdevlist at gmail.com Wed Dec 10 13:02:41 2008
From: pgmdevlist at gmail.com (Pierre GM)
Date: Wed, 10 Dec 2008 13:02:41 -0500
Subject: [SciPy-user] Still having plotting issue with latest svn scikits.timeseries
In-Reply-To: <493FA8CA.63BA.009B.0@twdb.state.tx.us>
References: <49390FD20200009B00018804@GWWEB.twdb.state.tx.us> <493FA8CA.63BA.009B.0@twdb.state.tx.us>
Message-ID: 

On Dec 10, 2008, at 12:32 PM, Dharhas Pothina wrote:
>
> When I read a datafile in and try to plot it, whether it plots
> correctly or not depends on the symbols I use in the plot.

I'm very, very surprised. Most likely, some dates get messed up and
don't show up where you expect them. The plotlib routines only deal
with where the data should be, not how it is plotted.
* When you create a tsplot, you need to specify an associated series
with the 'series' keyword. This series will control the behavior of
the axes.
OK, let me get a tad deeper here: Remember that a DateArray is
basically nothing but an int ndarray with some extra candy. The
frequency parameter controls how the dates are converted to integers,
and vice-versa. For example, today (12/10/2008) is associated with
733386 if you use a daily frequency (that's the corresponding
proleptic Gregorian day), with 24096 if you use a monthly frequency,
or simply 2008 if you use an annual frequency.
Plotting a series with tsplot just amounts to plotting a standard
masked array with a given set of integers as x coordinates (the
dates). If you don't give a series when you create a tsplot, the
first series to be plotted will be associated with the plot. Its
frequency will dictate how dates are converted. So, if you plot
series #1 with a daily frequency and then series #2 with a monthly
frequency on the same graph, some conversion should happen to make
sure that the dates of #2 are actually expressed in the same frame as
series #1.
Normally, that happens in the background when you plot series one
after the other, using one command for each series. If you try to
plot several series at once using a single command, the conversion
will happen provided that a frequency was already associated with the
plot.

> If I convert the timeseries using the .convert or .asfreq methods
> the plots work without a problem.

Which is a strong hint that the dates get messed up before you plot.
Are you trying to plot several series with different frequencies ?

> Since I can plot after converting using .asfreq this isn't too much
> of a problem but I was wondering if there is a bug or whether a
> warning should be thrown.

A bug is always possible, but I would tend to think it's a misuse of
the function on your side here. Then, of course, the doc could be
clearer.

> The dataset and commands I used are below.

OK, thx for the dataset, but please, don't copy-paste directly from
ipython. Just send us the commands, without a prompt (the In[...]),
in a script if you prefer. Right now, I don't have time to spend on
removing these extra characters to check what's going on.

> thanks,
>
> - dharhas
>
>
>>>> "Dharhas Pothina" 12/5/2008
>>>> 11:26 AM >>>
> Hi,
>
> I'm attaching an ASCII datafile called JDM3_short.txt and the
> commands I typed in ipython to reproduce the problem. The first
> figure works; the second has the correct axes but is blank.
>
> In [2]: import sys
> In [3]: import subprocess
> In [4]: from numpy import *
> In [5]: from pylab import *
> In [6]: import datetime
> In [7]: import scikits.timeseries as ts
> In [8]: import scikits.timeseries.lib.plotlib as tpl
> In [9]: import numpy.ma as ma
> In [10]: fieldfile='JDM3_short.txt'
> In [11]: year, month, day, hour, minute, fdata =
> loadtxt(fieldfile,comments="#",usecols=(0,1,2,3,4,8),unpack=True)
> In [12]: fielddates =
> [datetime.datetime(int(y),int(m),int(d),int(hh),int(mm),0) for
> y,m,d,hh,mm in zip(year,month,day,hour,minute)]
> In [13]: fdates = ts.date_array(fielddates,freq='MIN')
> In [14]: fseries = ts.time_series(fdata, dates=fdates)
> In [15]: #remove -999.9 nodata values of parameter
> In [16]: fseries[fseries==-999.9] = ma.masked
> In [17]:
> In [18]: fseries = fseries.fill_missing_dates()
> In [19]: fig = tpl.tsfigure()
> In [20]: fsp = fig.add_tsplot(111)
> In [21]: fsp.tsplot(fseries, '.')
> Out[21]: []
> In [22]: fig1 = tpl.tsfigure()
> In [23]: fsp1 = fig1.add_tsplot(111)
> In [24]: fsp1.tsplot(fseries, 'b', label='data')
> Out[24]: []
> In [25]: show()
>
> - dharhas
>
>>>> "Dharhas Pothina" 12/05/08
>>>> 10:42 AM >>>
>
> Ok I've narrowed it down. It has nothing to do with plotting two
> timeseries. The problem is plotting any of the original
> timeseries I read from files. Once I convert them to daily means the
> plotting works fine.
>
> With the original timeseries whether it plots or not seems to depend
> on what plotting symbol I use.
>
> - dharhas
>
>>> If I use two commands and certain other symbols like 'b', 'b--' etc
>>> for the first series, it only plots the second series :
>> ??? can't reproduce this one...
>
> I've double checked, it's happening consistently with multiple
> datasets (the datasets are similar, just at different locations). How
> do I track down the problem? Should I send you a datafile and my
> short script?
>
> _______________________________________________
> SciPy-user mailing list
> SciPy-user at scipy.org
> http://projects.scipy.org/mailman/listinfo/scipy-user

From wnbell at gmail.com Wed Dec 10 13:46:38 2008
From: wnbell at gmail.com (Nathan Bell)
Date: Wed, 10 Dec 2008 13:46:38 -0500
Subject: [SciPy-user] fastest way to populate sparse matrix?
In-Reply-To: 
References: 
Message-ID: 

On Wed, Dec 10, 2008 at 12:54 PM, Peter Skomoroch wrote:
> What is the fastest way to replace non-zero elements of a sparse
> matrix with corresponding elements from a product of dense matrices,
> without the memory overhead of computing the entire dense matrix
> product?
>
> The code below demonstrates the way I am doing it now: looping through
> the nonzero elements in the sparse matrix, and forming the
> corresponding row-column product from the dense matrices. It uses
> the sparse module from the latest scipy trunk.
>

The fastest way to construct a sparse matrix is using the COO format
as discussed here:
http://www.scipy.org/SciPyPackages/Sparse#head-be8a0be5d0e44c4d59550d64fb0173508073c36e

Using COO instead of LIL should be considerably faster.

--
Nathan Bell wnbell at gmail.com
http://graphics.cs.uiuc.edu/~wnbell/

From Dharhas.Pothina at twdb.state.tx.us Wed Dec 10 13:55:22 2008
From: Dharhas.Pothina at twdb.state.tx.us (Dharhas Pothina)
Date: Wed, 10 Dec 2008 12:55:22 -0600
Subject: [SciPy-user] Still having plotting issue with latest svn scikits.timeseries
Message-ID: <493FBC3B0200009B000189FA@GWWEB.twdb.state.tx.us>

Hi Pierre,

The culprit seems to be the .fill_missing_dates() method.
If I comment that out my plots are fine and if I use it I have the
problems I mentioned.

> Since I can plot after converting using .asfreq this isn't too much

Disregard my comment about .asfreq working; I'm unable to reproduce it
now. .convert only works when converting to a lower frequency.

> Which is a strong hint that the dates get messed up before you plot.
> Are you trying to plot several series with different frequencies ?

I am plotting a single series.

> A bug is always possible, but I would tend to think it's a misuse of
> the function on your side here. Then, of course, the doc could be
> clearer.

Understood. Just wanted to clarify which it is.

> OK, thx for the dataset, but please, don't copy-paste directly from
> ipython. Just send us the commands, without a prompt (the In[...]), in
> a script if you prefer. Right now, I don't have time to spend on

Sorry about that. I've attached a script called test.py and the data
JDM3_short.txt

- dharhas

-------------- next part --------------
A non-text attachment was scrubbed...
Name: test.py
Type: application/octet-stream
Size: 828 bytes
Desc: not available
URL: 
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: JDM3_short.txt
URL: 

From pgmdevlist at gmail.com Wed Dec 10 14:32:02 2008
From: pgmdevlist at gmail.com (Pierre GM)
Date: Wed, 10 Dec 2008 14:32:02 -0500
Subject: [SciPy-user] Still having plotting issue with latest svn scikits.timeseries
In-Reply-To: <493FBC3B0200009B000189FA@GWWEB.twdb.state.tx.us>
References: <493FBC3B0200009B000189FA@GWWEB.twdb.state.tx.us>
Message-ID: 

On Dec 10, 2008, at 1:55 PM, Dharhas Pothina wrote:
> Hi Pierre,
>
> The culprit seems to be the .fill_missing_dates() method. If I
> comment that out my plots are fine and if I use it I have the
> problems I mentioned.

Should have thought about it earlier. When you use .fill_missing_dates
on your data, you introduce a lot of missing values. matplotlib
doesn't know how to connect those missing values with lines, so it
doesn't plot the lines. However, it plots the dots alright. The pb is
thus a limitation of matplotlib, not of timeseries (relief), and no,
there's no workaround.

>
>> Since I can plot after converting using .asfreq this isn't too much
>
> Disregard my comment about .asfreq working; I'm unable to reproduce
> it now. .convert only works when converting to a lower frequency.

convert transforms a series w/ one frequency to another series w/
another frequency. From lower to higher frequency, it outputs by
default a 2D array, which is not what you want. Check the docstring.
asfreq just converts the dates from one frequency to another. The data
is left unchanged. That's what you want. Here again, please refer to
the documentation.

>
>> OK, thx for the dataset, but please, don't copy-paste directly from
>> ipython. Just send us the commands, without a prompt (the In[...]),
>> in
>> a script if you prefer. Right now, I don't have time to spend on
>
> Sorry about that. I've attached a script called test.py and the data
> JDM3_short.txt

OK, thx a lot. So, yes, that was only a pb of masked data.

From peter.skomoroch at gmail.com Wed Dec 10 16:18:59 2008
From: peter.skomoroch at gmail.com (Peter Skomoroch)
Date: Wed, 10 Dec 2008 16:18:59 -0500
Subject: [SciPy-user] fastest way to populate sparse matrix?
In-Reply-To: 
References: 
Message-ID: 

Nathan,

Thanks for the pointer, I had missed that wiki page.
Using coo_matrix as you suggest gives over a 5x speedup:

time to fill lil_matrix: 10.4162230492
total time to fill coo_matrix: 1.82639312744

The bottleneck now seems to be this for-loop, which takes the majority
of the remaining time (1.82258105278 seconds):

for index, (i,j) in enumerate(nonzero_indices):
    data[index] = dot(W[i,:],H[:,j])

Is there a better approach for this assignment block?

-Pete

New code:

#!/usr/bin/env python
# encoding: utf-8
"""
sparse_fill.py

uses latest sparse package from scipy svn
"""

import sys
import os
import time
from numpy import *
from numpy.random import random
import scipy.sparse as sparse
from scipy import io
import urllib
import tarfile

def lil_fill(V):
    print "Filling lil_matrix"
    n,m = V.shape
    r = 200
    eps = 1e-9
    nonzeros = [tuple(ij) for ij in transpose(V.nonzero())]
    W = (random([n,r])).astype(float32)
    H = random([r,m]).astype(float32)
    print "filling..."
    t0 = time.time()
    V_approx = sparse.lil_matrix((n,m), dtype=float32)
    for i,j in nonzeros:
        V_approx[i,j] = dot(W[i,:],H[:,j])
    print "time to fill:", time.time() - t0
    W_factor = (V*H.T + eps) / (V_approx*H.T + eps)
    W = W_factor*W
    ####################################################
    print "done...\n\n"

def coo_fill(V):
    print "Filling coo_matrix"
    n,m = V.shape
    r = 200
    eps = 1e-9
    nonzeros = V.nonzero()
    nonzero_indices = [tuple(ij) for ij in transpose(nonzeros)]
    L = len(nonzeros[0])
    W = (random([n,r])).astype(float32)
    H = random([r,m]).astype(float32)
    print "filling..."
    t0 = time.time()
    row = nonzeros[0]  # row indices go here
    col = nonzeros[1]  # column indices go here
    data = zeros(L)
    for index, (i,j) in enumerate(nonzero_indices):
        data[index] = dot(W[i,:],H[:,j])
    print "data assignment done, filling matrix", time.time() - t0
    #data = ones(L)  # data values go here
    V_approx = sparse.coo_matrix((data,(row,col)), shape=(n,m))
    print "total time to fill coo_matrix:", time.time() - t0
    W_factor = (V*H.T + eps) / (V_approx*H.T + eps)
    W = W_factor*W
    ####################################################
    print "done...\n\n"

def main():
    # number of rows 1,916
    # number of columns 1,916
    # nonzeros 195,985

    print "loading matrix..."
    f = urllib.urlopen("http://www.cise.ufl.edu/research/sparse/MM/JGD_CAG/CAG_mat1916.tar.gz")
    # write in binary mode; opening the file in text mode and using
    # print would corrupt the gzip archive
    tar_download = open('CAG_mat1916.tar.gz','wb')
    tar_download.write(f.read())
    tar_download.close()
    tar = tarfile.open("CAG_mat1916.tar.gz", "r:gz")
    tar.extractall()

    # V is a sparse matrix, with around 5 % of entries populated
    V = io.mmread("CAG_mat1916/CAG_mat1916.mtx").tocsr()

    lil_fill(V)
    coo_fill(V)

if __name__ == '__main__':
    main()

On Wed, Dec 10, 2008 at 1:46 PM, Nathan Bell wrote:
> On Wed, Dec 10, 2008 at 12:54 PM, Peter Skomoroch
> wrote:
>> What is the fastest way to replace non-zero elements of a sparse
>> matrix with corresponding elements from a product of dense matrices,
>> without the memory overhead of computing the entire dense matrix
>> product?
>>
>> The code below demonstrates the way I am doing it now: looping through
>> the nonzero elements in the sparse matrix, and forming the
>> corresponding row-column product from the dense matrices. It uses
>> the sparse module from the latest scipy trunk.
>>
>
> The fastest way to construct a sparse matrix is using the COO format
> as discussed here:
> http://www.scipy.org/SciPyPackages/Sparse#head-be8a0be5d0e44c4d59550d64fb0173508073c36e
>
> Using COO instead of LIL should be considerably faster.
>
> --
> Nathan Bell wnbell at gmail.com
> http://graphics.cs.uiuc.edu/~wnbell/
> _______________________________________________
> SciPy-user mailing list
> SciPy-user at scipy.org
> http://projects.scipy.org/mailman/listinfo/scipy-user
>

--
Peter N. Skomoroch
peter.skomoroch at gmail.com
http://www.datawrangling.com
http://del.icio.us/pskomoroch

From fritz.peter.maas at googlemail.com Wed Dec 10 16:20:56 2008
From: fritz.peter.maas at googlemail.com (Peter Maas)
Date: Wed, 10 Dec 2008 22:20:56 +0100
Subject: [SciPy-user] Problems building NumPy with Python 2.6 / MinGW 3.4.5 on Windows XP SP3
In-Reply-To: <493F5C14.9010201@ar.media.kyoto-u.ac.jp>
References: <493A9063.2010203@googlemail.com> <493A9183.5030906@ar.media.kyoto-u.ac.jp> <493AB07E.5060201@googlemail.com> <493AAF26.8010802@ar.media.kyoto-u.ac.jp> <493ABE83.8020709@googlemail.com> <493B58B3.80509@ar.media.kyoto-u.ac.jp> <493EFBE6.1050202@googlemail.com> <493F5C14.9010201@ar.media.kyoto-u.ac.jp>
Message-ID: <494032B8.90403@googlemail.com>

David Cournapeau schrieb:
> Which version of gcc are you using ?

C:\>gcc -v
Reading specs from C:/MinGW/bin/../lib/gcc/mingw32/3.4.5/specs
Configured with: ../gcc-3.4.5-20060117-3/configure --with-gcc --with-gnu-ld
--with-gnu-as --host=mingw32 --target=mingw32 --prefix=/mingw --enable-threads
--disable-nls --enable-languages=c,c++,f77,ada,objc,java
--disable-win32-registry --disable-shared --enable-sjlj-exceptions
--enable-libgcj --disable-java-awt --without-x --enable-java-gc=boehm
--disable-libgcj-debug --enable-interpreter --enable-hash-synchronization
--enable-libstdcxx-debug
Thread model: win32
gcc version 3.4.5 (mingw-vista special r3)

--
Regards, Peter

Peter Maas, Aachen, Germany

From Dharhas.Pothina at twdb.state.tx.us Wed Dec 10 16:27:30 2008
From: Dharhas.Pothina at twdb.state.tx.us (Dharhas Pothina)
Date: Wed, 10 Dec 2008 15:27:30 -0600
Subject: [SciPy-user] Still having plotting issue with latest svn scikits.timeseries
In-Reply-To: 
References: <493FBC3B0200009B000189FA@GWWEB.twdb.state.tx.us> 
Message-ID: <493FDFE2.63BA.009B.0@twdb.state.tx.us>

> Should have thought about it earlier. When you use .fill_missing_dates
> on your data, you introduce a lot of missing values. matplotlib
> doesn't know how to connect those missing values with lines, so it
> doesn't plot the lines. However, it plots the dots alright. The pb is
> thus a limitation of matplotlib, not of timeseries (relief), and no,
> there's no workaround.

I'm glad we got this sorted out. Thanks Pierre, this toolkit is great,
saving me a lot of time already.

- dharhas

From wnbell at gmail.com Wed Dec 10 16:46:51 2008
From: wnbell at gmail.com (Nathan Bell)
Date: Wed, 10 Dec 2008 16:46:51 -0500
Subject: [SciPy-user] fastest way to populate sparse matrix?
In-Reply-To: 
References: 
Message-ID: 

On Wed, Dec 10, 2008 at 4:18 PM, Peter Skomoroch wrote:
> Nathan,
>
> Thanks for the pointer, I had missed that wiki page.

It's fairly recent, so don't feel bad :)

>
> The bottleneck now seems to be this for-loop, which takes the majority
> of the remaining time (1.82258105278 seconds):
>
> for index, (i,j) in enumerate(nonzero_indices):
>     data[index] = dot(W[i,:],H[:,j])
>
> Is there a better approach for this assignment block?
>

You could vectorize the loop:

W = random([n,r]).astype(float32)
H = random([m,r]).astype(float32)  # note, shape is (m,r)

I,J = V.nonzero()
X = (W[I,:] * H[J,:]).sum(axis=1)
V_approx = sparse.coo_matrix((X,(I,J)), shape=(n,m))


If memory usage of the above is too costly, you could use the same
approach, but on fixed-sized chunks of the arrays.

--
Nathan Bell wnbell at gmail.com
http://graphics.cs.uiuc.edu/~wnbell/

From peter.skomoroch at gmail.com Wed Dec 10 18:56:38 2008
From: peter.skomoroch at gmail.com (Peter Skomoroch)
Date: Wed, 10 Dec 2008 18:56:38 -0500
Subject: [SciPy-user] fastest way to populate sparse matrix?
In-Reply-To: 
References: 
Message-ID: 

Hmmm, surprisingly the vectorized version seems to take longer:

Original method:

Filling coo_matrix
filling...
data assignment done, filling matrix 1.84783697128
total time to fill coo_matrix: 1.85190200806
done...

Vectorized:

Filling coo_matrix
filling...
data assignment done, filling matrix 3.22157812119
total time to fill coo_matrix: 3.2216091156
done...

On Wed, Dec 10, 2008 at 4:46 PM, Nathan Bell wrote:
> On Wed, Dec 10, 2008 at 4:18 PM, Peter Skomoroch
> wrote:
>> Nathan,
>>
>> Thanks for the pointer, I had missed that wiki page.
>
> It's fairly recent, so don't feel bad :)
>
>>
>> The bottleneck now seems to be this for-loop, which takes the majority
>> of the remaining time (1.82258105278 seconds):
>>
>> for index, (i,j) in enumerate(nonzero_indices):
>>     data[index] = dot(W[i,:],H[:,j])
>>
>> Is there a better approach for this assignment block?
>>
>
> You could vectorize the loop:
>
> W = random([n,r]).astype(float32)
> H = random([m,r]).astype(float32)  # note, shape is (m,r)
>
> I,J = V.nonzero()
> X = (W[I,:] * H[J,:]).sum(axis=1)
> V_approx = sparse.coo_matrix((X,(I,J)), shape=(n,m))
>
>
> If memory usage of the above is too costly, you could use the same
> approach, but on fixed-sized chunks of the arrays.
>
> --
> Nathan Bell wnbell at gmail.com
> http://graphics.cs.uiuc.edu/~wnbell/
> _______________________________________________
> SciPy-user mailing list
> SciPy-user at scipy.org
> http://projects.scipy.org/mailman/listinfo/scipy-user
>

--
Peter N. Skomoroch
peter.skomoroch at gmail.com
http://www.datawrangling.com
http://del.icio.us/pskomoroch

From strang at nmr.mgh.harvard.edu Wed Dec 10 19:51:27 2008
From: strang at nmr.mgh.harvard.edu (Gary Strangman)
Date: Wed, 10 Dec 2008 19:51:27 -0500 (EST)
Subject: [SciPy-user] Rotate volume data and regrid?
In-Reply-To: 
References: 
Message-ID: 

Hi scipy-experts,

I have a 3D array (180x200x200 elements) that represents a 3D volume in
space. I want to rotate the data in this array by an arbitrary amount
around an arbitrary point inside the volume, then re-grid the result back
into the original voxels. Does anyone know of a scipythonic or numpythonic
function, package, or even an algorithm to achieve this?

-best
Gary

From pgmdevlist at gmail.com Wed Dec 10 20:15:39 2008
From: pgmdevlist at gmail.com (Pierre GM)
Date: Wed, 10 Dec 2008 20:15:39 -0500
Subject: [SciPy-user] Still having plotting issue with latest svn scikits.timeseries
In-Reply-To: <493FDFE2.63BA.009B.0@twdb.state.tx.us>
References: <493FBC3B0200009B000189FA@GWWEB.twdb.state.tx.us> <493FDFE2.63BA.009B.0@twdb.state.tx.us>
Message-ID: 

On Dec 10, 2008, at 4:27 PM, Dharhas Pothina wrote:
>
>> Should have thought about it earlier. When you
>> use .fill_missing_dates
>> on your data, you introduce a lot of missing values. matplotlib
>> doesn't know how to connect those missing values with lines, so it
>> doesn't plot the lines. However, it plots the dots alright. The pb is
>> thus a limitation of matplotlib, not of timeseries (relief), and no,
>> there's no workaround.
>
> I'm glad we got this sorted out. Thanks Pierre, this toolkit is
> great, saving me a lot of time already.

OK, just to clarify a point: matplotlib has no problem with masked
values: it just ignores them. The problem we have with your dataset is
that the non-masked values are never consecutive, and matplotlib
doesn't know how to connect 2 points separated by one or more masked
values. And it's a good thing, if you think about it.
A solution therefore consists in using markers (dot, square, whatever)
in conjunction with the lines. Another consists in plotting a
compressed array, where the missing values are suppressed.

From david at ar.media.kyoto-u.ac.jp Wed Dec 10 23:11:15 2008
From: david at ar.media.kyoto-u.ac.jp (David Cournapeau)
Date: Thu, 11 Dec 2008 13:11:15 +0900
Subject: [SciPy-user] Problems building NumPy with Python 2.6 / MinGW 3.4.5 on Windows XP SP3
In-Reply-To: <494032B8.90403@googlemail.com>
References: <493A9063.2010203@googlemail.com> <493A9183.5030906@ar.media.kyoto-u.ac.jp> <493AB07E.5060201@googlemail.com> <493AAF26.8010802@ar.media.kyoto-u.ac.jp> <493ABE83.8020709@googlemail.com> <493B58B3.80509@ar.media.kyoto-u.ac.jp> <493EFBE6.1050202@googlemail.com> <493F5C14.9010201@ar.media.kyoto-u.ac.jp> <494032B8.90403@googlemail.com>
Message-ID: <494092E3.50502@ar.media.kyoto-u.ac.jp>

Peter Maas wrote:
> David Cournapeau schrieb:
>
>> Which version of gcc are you using ?
>>
>
> C:\>gcc -v
> Reading specs from C:/MinGW/bin/../lib/gcc/mingw32/3.4.5/specs
> Configured with: ../gcc-3.4.5-20060117-3/configure --with-gcc --with-gnu-ld
> --with-gnu-as --host=mingw32 --target=mingw32 --prefix=/mingw --enable-threads
> --disable-nls --enable-languages=c,c++,f77,ada,objc,java
> --disable-win32-registry --disable-shared --enable-sjlj-exceptions
> --enable-libgcj --disable-java-awt --without-x --enable-java-gc=boehm
> --disable-libgcj-debug --enable-interpreter --enable-hash-synchronization
> --enable-libstdcxx-debug
> Thread model: win32
> gcc version 3.4.5 (mingw-vista special r3)
>

Ok, that's not the alpha version. Unfortunately, I have no idea what
could be different in your setup from mine. Could you open a ticket on
numpy trac, so we don't forget this ? Since I can not work on it now, I
am afraid I will forget about it otherwise,

David

From gael.varoquaux at normalesup.org Thu Dec 11 00:59:48 2008
From: gael.varoquaux at normalesup.org (Gael Varoquaux)
Date: Thu, 11 Dec 2008 06:59:48 +0100
Subject: [SciPy-user] Rotate volume data and regrid?
In-Reply-To: 
References: 
Message-ID: <20081211055948.GA21281@phare.normalesup.org>

On Wed, Dec 10, 2008 at 07:51:27PM -0500, Gary Strangman wrote:
> Hi scipy-experts,

> I have a 3D array (180x200x200 elements) that represents a 3D volume in
> space. I want to rotate the data in this array by an arbitrary amount
> around an arbitrary point inside the volume, then re-grid the result back
> into the original voxels. Does anyone know of a scipythonic or numpythonic
> function, package, or even an algorithm to achieve this?
You will probably find what you need in one of the functions listed on
http://docs.scipy.org/scipy/docs/scipy.ndimage.interpolation/#scipy-ndimage-interpolation

Gaël

From dave.hirschfeld at gmail.com Thu Dec 11 04:36:18 2008
From: dave.hirschfeld at gmail.com (Dave Hirschfeld)
Date: Thu, 11 Dec 2008 09:36:18 +0000 (UTC)
Subject: [SciPy-user] scikits.timeseries
References: <99B5C565-967B-43AB-A978-F0F740B31FB8@gmail.com> 
Message-ID: 

Pierre GM gmail.com> writes:
>
> Dave,
>
> On Dec 10, 2008, at 9:23 AM, Dave Hirschfeld wrote:
>
> > I was wondering if a solution of allowing
> > the user
> > to specify the holidays (cf. Excel's NETWORKDAYS function) would be
> > feasible?
>
> Yes and no.
> No : there's no plan for any user-defined frequency yet, if ever.
> The whole machinery is in C, and it would be *very* tricky for us to
> implement such a feature.
> Yes : This said, there should be a way to take holidays into account,
> at a small scale.

I was afraid it could be difficult to incorporate in a more general
setting. Thanks for the quick reply.

-Dave

From strang at nmr.mgh.harvard.edu Thu Dec 11 07:30:04 2008
From: strang at nmr.mgh.harvard.edu (Gary Strangman)
Date: Thu, 11 Dec 2008 07:30:04 -0500 (EST)
Subject: [SciPy-user] Rotate volume data and regrid?
In-Reply-To: <20081211055948.GA21281@phare.normalesup.org>
References: <20081211055948.GA21281@phare.normalesup.org>
Message-ID: 

>> Hi scipy-experts,
>
>> I have a 3D array (180x200x200 elements) that represents a 3D volume in
>> space. I want to rotate the data in this array by an arbitrary amount
>> around an arbitrary point inside the volume, then re-grid the result back
>> into the original voxels. Does anyone know of a scipythonic or numpythonic
>> function, package, or even an algorithm to achieve this?
>
> You will probably find what you need in one of the functions listed on
> http://docs.scipy.org/scipy/docs/scipy.ndimage.interpolation/#scipy-ndimage-interpolation

Not only had I forgotten about ndimage, but had no recollection of all
the nifty things in there. Thanks for saving me a week of
wheel-reinvention!

G

From shchelokovskyy at gmail.com Thu Dec 11 08:09:14 2008
From: shchelokovskyy at gmail.com (Pavlo Shchelokovskyy)
Date: Thu, 11 Dec 2008 14:09:14 +0100
Subject: [SciPy-user] negative values in diagonal of covariance matrix
Message-ID: 

Hi all,

I'm a moderately new user of scipy, trying to make some curve-fitting
with it.
I wanted to use the cov matrix output of leastsq to estimate errors of
fitted parameters, but stumbled upon a strange discrepancy (for one
particular dataset):

On Linux (Fedora 8), using Python 2.5.1, numpy 1.2.0 and scipy 0.6.0

>>>> from scipy import *
>>>> from scipy import optimize
>>>> y = asarray([217, 182, 162, 170, 255])
>>>> x = linspace(0, y.size - 1, y.size)
>>>> gauss = lambda p, x: p[0] + p[1] * exp(-(x-p[2])**2/(2*p[3]**2))
>>>> errfunc = lambda p, x, y: y - gauss(p, x)
>>>> pinit = (max(y), -y.ptp(), argmin(y), y.size/4)
>>>> fit,cov,info,mesg,success = optimize.leastsq(errfunc, pinit, args=(x,y), full_output = 1)
>>>> fit
array([ 3.01602865e+05, -3.01444487e+05, 1.83283239e+00, 8.87273494e+01])
>>>> cov
array([[ -2.27903048e+16, 2.27903047e+16, -4.72378743e+05, -3.35454486e+12],
       [ 2.27903047e+16, -2.27903046e+16, 4.72378733e+05, 3.35454485e+12],
       [ -4.72378950e+05, 4.72378947e+05, 6.38886491e-05, -6.95317302e+01],
       [ -3.35454486e+12, 3.35454485e+12, -6.95316881e+01, -4.93761332e+08]])

I know that the fit is very bad (I can reject it afterwards in my
procedure), but what bothers me the most are the negative numbers on
the diagonal of the covariance matrix - as far as I understand, there
shouldn't be any... especially when the same script run on Windows XP,
using Python 2.5.2, numpy 1.2.0 and scipy 0.6.0 gives no such values

>>>> fit
array([ 3.58426106e+05, -3.58267727e+05, 1.83283417e+00, 9.67300286e+01])
>>>> cov
array([[ 6.40320190e+14, -6.40320188e+14, 4.24641077e+04, 8.64500286e+10],
       [ -6.40320188e+14, 6.40320186e+14, -4.24641083e+04, -8.64500284e+10],
       [ 4.24641250e+04, -4.24641094e+04, 7.64931901e-05, 5.73152924e+00],
       [ 8.64500286e+10, -8.64500284e+10, 5.73152989e+00, 1.16716728e+07]])

Is it my error or misunderstanding somewhere, or is it really a bug in
the Linux implementation?

Thanks in advance,

From cournape at gmail.com Thu Dec 11 08:41:10 2008
From: cournape at gmail.com (David Cournapeau)
Date: Thu, 11 Dec 2008 22:41:10 +0900
Subject: [SciPy-user] negative values in diagonal of covariance matrix
In-Reply-To: 
References: 
Message-ID: <5b8d13220812110541u738c7660oc2fdff9eab469811@mail.gmail.com>

On Thu, Dec 11, 2008 at 10:09 PM, Pavlo Shchelokovskyy wrote:
> Hi all,
>
> I'm a moderately new user of scipy, trying to make some curve-fitting
> with it. I wanted to use the cov matrix output of leastsq to estimate
> errors of fitted parameters, but stumbled upon a strange discrepancy
> (for one particular dataset):
>
> On Linux (Fedora 8), using Python 2.5.1, numpy 1.2.0 and scipy 0.6.0
>
>>>>> from scipy import *
>>>>> from scipy import optimize
>>>>> y = asarray([217, 182, 162, 170, 255])
>>>>> x = linspace(0, y.size - 1, y.size)
>>>>> gauss = lambda p, x: p[0] + p[1] * exp(-(x-p[2])**2/(2*p[3]**2))
>>>>> errfunc = lambda p, x, y: y - gauss(p, x)
>>>>> pinit = (max(y), -y.ptp(), argmin(y), y.size/4)
>>>>> fit,cov,info,mesg,success = optimize.leastsq(errfunc, pinit, args=(x,y), full_output = 1)
>>>>> fit
> array([ 3.01602865e+05, -3.01444487e+05, 1.83283239e+00, 8.87273494e+01])
>>>>> cov
> array([[ -2.27903048e+16, 2.27903047e+16, -4.72378743e+05, -3.35454486e+12],
>        [ 2.27903047e+16, -2.27903046e+16, 4.72378733e+05, 3.35454485e+12],
>        [ -4.72378950e+05, 4.72378947e+05, 6.38886491e-05, -6.95317302e+01],
>        [ -3.35454486e+12, 3.35454485e+12, -6.95316881e+01, -4.93761332e+08]])
>
The fact that the values on windows are positive are just an accident and an implementation detail. > Is it my error or misunderstanding somewhere, or is it really a bug in > Linux implementation? It is likely that both platforms do not use the same implementation - and compilers version/options differences could explain the difference for the same code source. Again, this is not surprising - and not a bug - if the methods are outside their validity range (it is of course a totally different matter if you are in the expected domain of the algorithms: in this case, a correct implementation and installation of numpy/scipy should hopefully only differ in the same range as machine precision). cheers, David From beckers at orn.mpg.de Thu Dec 11 08:55:48 2008 From: beckers at orn.mpg.de (Gabriel Beckers) Date: Thu, 11 Dec 2008 14:55:48 +0100 Subject: [SciPy-user] fftw and scipy 0.7 Message-ID: <1229003748.13644.16.camel@gabriel-desktop> Hi, I see that in the new scipy (0.7) only fftpack (NETLIB) remains for fft. I understand the reasons for this, but for my application (filtering gigabytes of neurophysiological data) fft speed differences really lead to very significant speed differences. What would the best way of using fftw, now that the wrappers are gone. Cython? Weave? I haven't used either of these yet, but would look into them if someone could confirm that that would be a good idea in order to solve my problem. I vaguely remember having seen an example somewhere, but I can't find it anymore. Best, Gabriel From ndbecker2 at gmail.com Thu Dec 11 09:11:06 2008 From: ndbecker2 at gmail.com (Neal Becker) Date: Thu, 11 Dec 2008 09:11:06 -0500 Subject: [SciPy-user] fftw and scipy 0.7 References: <1229003748.13644.16.camel@gabriel-desktop> Message-ID: Gabriel Beckers wrote: > Hi, > > I see that in the new scipy (0.7) only fftpack (NETLIB) remains for fft. > I understand the reasons for this, but for my application (filtering > gigabytes of neurophysiological data) fft speed differences really lead > to very significant speed differences. > > What would the best way of using fftw, now that the wrappers are gone. > Cython? Weave? I haven't used either of these yet, but would look into > them if someone could confirm that that would be a good idea in order to > solve my problem. I vaguely remember having seen an example somewhere, > but I can't find it anymore. > > Best, > Gabriel I have a boost::python wrapper From cournape at gmail.com Thu Dec 11 09:23:46 2008 From: cournape at gmail.com (David Cournapeau) Date: Thu, 11 Dec 2008 23:23:46 +0900 Subject: [SciPy-user] fftw and scipy 0.7 In-Reply-To: <1229003748.13644.16.camel@gabriel-desktop> References: <1229003748.13644.16.camel@gabriel-desktop> Message-ID: <5b8d13220812110623h3cd421d1qd2ef4785303c51e2@mail.gmail.com> On Thu, Dec 11, 2008 at 10:55 PM, Gabriel Beckers wrote: > Hi, > > I see that in the new scipy (0.7) only fftpack (NETLIB) remains for fft. > I understand the reasons for this, but for my application (filtering > gigabytes of neurophysiological data) fft speed differences really lead > to very significant speed differences. What it the dimensions of your fourier transforms ? The use of fftw in scipy was far from optimal for various reasons, so I am a bit surprised about the very significant part. > > What would the best way of using fftw, now that the wrappers are gone. Reusing the removed code from scipy into your own package would be one solution. 
Ideally, I should have put this code into a scikit, but this would have required some work to be in an acceptable shape. If you are willing to spend some time on it, I would certainly welcome a good wrapper around fftw put into a scikit. > Cython? Cython would be a bit difficult because it lacks complex support (more exactly, it cannot translate yet into C99 complex numbers; it should be possible to circumvent the problems). If you only need real to real transforms, I think cython would be a quick way to do it. That's certainly how I would do it if I had to. If you really care about speed, you should tweak you allocation scheme to force aligned allocation: it will give you a factor 2 speed increase in many cases - it is difficult to do in a cross platform manner, but trivial in a particular case/platform (since fftw has such an allocator). cheers, David From bsouthey at gmail.com Thu Dec 11 09:53:21 2008 From: bsouthey at gmail.com (Bruce Southey) Date: Thu, 11 Dec 2008 08:53:21 -0600 Subject: [SciPy-user] negative values in diagonal of covariance matrix In-Reply-To: References: Message-ID: <49412961.4090901@gmail.com> Pavlo Shchelokovskyy wrote: > Hi all, > > I'm a moderately new user of scipy, trying to make some curve-fitting > with it. I wanted to use cov matrix output of leastsq to estimate > errors of fitted parameters, but stumbled upon strange discrepancy > (for one particular dataset): > > On Linux (Fedora 8), using Python 2.5.1, numpy 1.2.0 and scipy 0.6.0 > > >>>> from scipy import * >>>> from scipy import optimize >>>> y = asarray([217, 182, 162, 170, 255]) >>>> x = linspace(0, y.size - 1, y.size) >>>> gauss = lambda p, x: p[0] + p[1] * exp(-(x-p[2])**2/(2*p[3]**2)) >>>> errfunc = lambda p, x, y: y - gauss(p, x) >>>> pinit = (max(y), -y.ptp(), argmin(y), y.size/4) >>>> fit,cov,info,mesg,success = optimize.leastsq(errfunc, pinit, args=(x,y), full_output = 1) >>>> fit >>>> > array([ 3.01602865e+05, -3.01444487e+05, 1.83283239e+00, 8.87273494e+01]) > >>>> cov >>>> > array([[ -2.27903048e+16, 2.27903047e+16, -4.72378743e+05, -3.35454486e+12], > [ 2.27903047e+16, -2.27903046e+16, 4.72378733e+05, 3.35454485e+12], > [ -4.72378950e+05, 4.72378947e+05, 6.38886491e-05, -6.95317302e+01], > [ -3.35454486e+12, 3.35454485e+12, -6.95316881e+01, -4.93761332e+08]]) > > I know that the fit is very bad (I can reject it afterwards in my > procedure), but what bothers me the most are the negative numbers on > the diagonal of the covariance matrix - as far as I understand, there > shouldn't be any... especially when the same script run on Windows XP, > using Python 2.5.2, numpy 1.2.0 and scipy 0.6.0 gives no such values > >>>> fit >>>> > > array([ 3.58426106e+05, -3.58267727e+05, 1.83283417e+00, 9.67300286e+01]) > > >>>> cov >>>> > > array([[ 6.40320190e+14, -6.40320188e+14, 4.24641077e+04, 8.64500286e+10], > > [ -6.40320188e+14, 6.40320186e+14, -4.24641083e+04, -8.64500284e+10], > > [ 4.24641250e+04, -4.24641094e+04, 7.64931901e-05, 5.73152924e+00], > > [ 8.64500286e+10, -8.64500284e+10, 5.73152989e+00, 1.16716728e+07]]) > > Is it my error or misunderstanding somewhere, or is it really a bug in > Linux implementation? 
> Thanks in advance,
> _______________________________________________
> SciPy-user mailing list
> SciPy-user at scipy.org
> http://projects.scipy.org/mailman/listinfo/scipy-user
>

Actually you do not have sufficient data points to estimate this model
because you have four parameters and only five observations, resulting
in no degrees of freedom for the error (if you allow correction for
the mean). I do not know scipy.optimize but I am very doubtful that
the model even converges correctly (a solution does not mean
convergence). If the model has not converged then everything else is
usually invalid (especially when those parameters depend on the
solution).

Bruce

From josef.pktd at gmail.com Thu Dec 11 10:57:37 2008
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Thu, 11 Dec 2008 10:57:37 -0500
Subject: [SciPy-user] negative values in diagonal of covariance matrix
In-Reply-To: <49412961.4090901@gmail.com>
References: <49412961.4090901@gmail.com>
Message-ID: <1cd32cbb0812110757i1070ce5at4d4843d453091045@mail.gmail.com>

>>
> Actually you do not have sufficient data points to estimate this model
> because you have four parameters and only five observations, resulting in
> no degrees of freedom for the error (if you allow correction for the
> mean). I do not know scipy.optimize but I am very doubtful that the
> model even converges correctly (a solution does not mean convergence).
> If the model has not converged then everything else is usually invalid
> (especially when those parameters depend on the solution).
>

Since the constant is included as one of the four parameters, I don't
think that any degrees of freedom are lost.

But the negative diagonal elements mean that the underlying Hessian or
its approximation is not positive-definite. The inverse of a real
symmetric positive definite matrix should have positive diagonal
elements.

So, I'm also pretty sure that the optimization did not converge to a
minimum, so I would either redefine the optimization problem or choose
a more robust optimization algorithm.

Josef

From j.anderson at hull.ac.uk Thu Dec 11 10:52:38 2008
From: j.anderson at hull.ac.uk (Joseph Anderson)
Date: Thu, 11 Dec 2008 15:52:38 +0000
Subject: [SciPy-user] Python 3.0?
In-Reply-To: <5b8d13220812110623h3cd421d1qd2ef4785303c51e2@mail.gmail.com>
Message-ID: 

Hello All,

I've just noticed Python 3.0 final is now available. Curious if there
are any expectations or thoughts about SciPy and Python 3.0?
My best,
Jo

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Dr Joseph Anderson
Lecturer in Music

School of Arts and New Media
University of Hull, Scarborough Campus,
Scarborough, North Yorkshire, YO11 3AZ, UK

T: +44.(0)1723.357341 T: +44.(0)1723.357370 F: +44.(0)1723.350815
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

*****************************************************************************************
To view the terms under which this email is distributed, please go to http://www.hull.ac.uk/legal/email_disclaimer.html
*****************************************************************************************

From beckers at orn.mpg.de  Thu Dec 11 11:31:24 2008
From: beckers at orn.mpg.de (Gabriel Beckers)
Date: Thu, 11 Dec 2008 17:31:24 +0100
Subject: [SciPy-user] fftw and scipy 0.7
In-Reply-To: <5b8d13220812110623h3cd421d1qd2ef4785303c51e2@mail.gmail.com>
References: <1229003748.13644.16.camel@gabriel-desktop> <5b8d13220812110623h3cd421d1qd2ef4785303c51e2@mail.gmail.com>
Message-ID: <1229013084.14968.16.camel@gabriel-desktop>

thanks David,

On Thu, 2008-12-11 at 23:23 +0900, David Cournapeau wrote:
> What are the dimensions of your fourier transforms ? The use of fftw in
> scipy was far from optimal for various reasons, so I am a bit
> surprised about the very significant part.

The dimension varies, but I must immediately admit that I didn't make
any comparisons between the different fft libraries from within scipy.
I just used fftw a lot before, in plain C programs, and then it was very
fast and gave significant speed increases. I was assuming that it would
do the same if I could use it from within scipy. But apparently that may
not be the case.

Perhaps I should try the boost::python wrapper that Neal mentions, since
it already exists, and see if I get an improvement that is worth the
trouble or not.

Cheers,
Gabriel

From discerptor at gmail.com  Thu Dec 11 12:12:01 2008
From: discerptor at gmail.com (Joshua Lippai)
Date: Thu, 11 Dec 2008 09:12:01 -0800
Subject: [SciPy-user] Python 3.0?
In-Reply-To:
References: <5b8d13220812110623h3cd421d1qd2ef4785303c51e2@mail.gmail.com>
Message-ID: <9911419a0812110912u1c9a2ce5nd0e643d5d3038d5a@mail.gmail.com>

NumPy is going to have to move over to Python 3.0 before the whole of
SciPy can, and as far as I know that's not going to happen until early
2009. I'd hold off on Python 3.x for the time being.

Josh

On Thu, Dec 11, 2008 at 7:52 AM, Joseph Anderson wrote:
> Hello All,
>
> I've just noticed Python 3.0 final is now available. Curious if there are
> any expectations or thoughts about SciPy and Python 3.0?
> My best,
> Jo
>
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> Dr Joseph Anderson
> Lecturer in Music
>
> School of Arts and New Media
> University of Hull, Scarborough Campus,
> Scarborough, North Yorkshire, YO11 3AZ, UK
>
> T: +44.(0)1723.357341 T: +44.(0)1723.357370 F: +44.(0)1723.350815
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>
> *****************************************************************************************
> To view the terms under which this email is distributed, please go to http://www.hull.ac.uk/legal/email_disclaimer.html
> *****************************************************************************************
> _______________________________________________
> SciPy-user mailing list
> SciPy-user at scipy.org
> http://projects.scipy.org/mailman/listinfo/scipy-user

From yuniyunigis at yahoo.com  Thu Dec 11 13:00:13 2008
From: yuniyunigis at yahoo.com (Yuni Kim)
Date: Thu, 11 Dec 2008 10:00:13 -0800 (PST)
Subject: [SciPy-user] Density plot
Message-ID: <194531.80296.qm@web110308.mail.gq1.yahoo.com>

Hi,

Can anyone let me know how to make a density plot, more specifically a
gaussian-smoothed version of a histogram? I mean, it's not the plot
itself, but to get the maximum count (or highest peak) of the density
distribution.

In S plus and R, there is the "density" function which I can use to get
the max x and y in the probability frequency distribution.

I was told that, in scipy, there is a kernel density estimator for
non-parametric density estimation, and that it fits the "Gaussian
smoothed version of the histogram" when used with Gaussian kernels. So
far I haven't succeeded in finding the function.

I would really appreciate it if anyone can help in any way!

thank you!
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From zachary.pincus at yale.edu  Thu Dec 11 13:57:56 2008
From: zachary.pincus at yale.edu (Zachary Pincus)
Date: Thu, 11 Dec 2008 13:57:56 -0500
Subject: [SciPy-user] Density plot
In-Reply-To: <194531.80296.qm@web110308.mail.gq1.yahoo.com>
References: <194531.80296.qm@web110308.mail.gq1.yahoo.com>
Message-ID:

I use scipy.stats.kde.gaussian_kde for similar purposes.

Zach

On Dec 11, 2008, at 1:00 PM, Yuni Kim wrote:
> Hi,
>
> Can anyone let me know how to make a density plot, more specifically a
> gaussian-smoothed version of a histogram? I mean, it's not the plot
> itself, but to get the maximum count (or
> highest peak) of the density distribution.
>
> In S plus and R, there is the "density" function which I can use to
> get the max x and y in the probability frequency distribution.
>
> I was told that, in scipy, there is a kernel density estimator for
> non-parametric density estimation, and fit the
> "Gaussian smoothed version of the histogram" when used with Gaussian
> kernels. So far I haven't succeeded in finding the function.
>
> I really appreciate if anyone can help in any way!
>
> thank you!
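To make that concrete, a minimal sketch of finding the peak of the
smoothed density (the sample array here is made up for illustration):

    import numpy as np
    from scipy import stats

    data = np.random.normal(2.0, 0.5, 1000)   # hypothetical 1-D sample
    kde = stats.kde.gaussian_kde(data)
    grid = np.linspace(data.min(), data.max(), 512)
    density = kde(grid)                       # smoothed density on the grid
    peak_x = grid[density.argmax()]           # x of the highest peak
    peak_y = density.max()                    # density value at the peak

Evaluating the estimator on a fine grid and taking the argmax is the
simplest route; for a sharper answer one could instead maximize the
density with an optimizer.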
>
> _______________________________________________
> SciPy-user mailing list
> SciPy-user at scipy.org
> http://projects.scipy.org/mailman/listinfo/scipy-user

From ellisonbg.net at gmail.com  Thu Dec 11 14:05:23 2008
From: ellisonbg.net at gmail.com (Brian Granger)
Date: Thu, 11 Dec 2008 11:05:23 -0800
Subject: [SciPy-user] fftw and scipy 0.7
In-Reply-To: <5b8d13220812110623h3cd421d1qd2ef4785303c51e2@mail.gmail.com>
References: <1229003748.13644.16.camel@gabriel-desktop> <5b8d13220812110623h3cd421d1qd2ef4785303c51e2@mail.gmail.com>
Message-ID: <6ce0ac130812111105l524cc05elc466700a1ef66a18@mail.gmail.com>

We use Cython to call the parallel version of FFTW and it's worked fine.

> Cython would be a bit difficult because it lacks complex support (more
> exactly, it cannot translate yet into C99 complex numbers; it should
> be possible to circumvent the problems). If you only need real to real
> transforms, I think cython would be a quick way to do it. That's
> certainly how I would do it if I had to.

This was not an issue for us. We just created our complex arrays as
numpy arrays and then passed the memory pointers into FFTW and it went
fine. But, we weren't building the actual arrays in cython, so we didn't
run into the lack of C99 complex numbers.

Here is our parallel FFTW wrapper for Cython:

http://bazaar.launchpad.net/~ipython-dev/ipython/ipythondistarray/files/35?file_id=fft-20080411221448-7w9p8hckcswj4ymh-16

Just a warning. This code has bugs in it and doesn't yet have a test
suite, so I don't recommend copying this verbatim. But, it at least
shows that it can be done. Depending on what exactly you want to do, you
could get Cython+FFTW working in a *very* short amount of time.

Cheers,

Brian

From pav at iki.fi  Thu Dec 11 15:06:15 2008
From: pav at iki.fi (Pauli Virtanen)
Date: Thu, 11 Dec 2008 20:06:15 +0000 (UTC)
Subject: [SciPy-user] negative values in diagonal of covariance matrix
References: <49412961.4090901@gmail.com> <1cd32cbb0812110757i1070ce5at4d4843d453091045@mail.gmail.com>
Message-ID:

Thu, 11 Dec 2008 10:57:37 -0500, josef.pktd wrote:
[clip]
> But the negative diagonal elements mean that the underlying Hessian or
> its approximation is not positive-definite. The inverse of a real
> symmetric positive definite matrix should have positive diagonal
> elements.
>
> So I'm also pretty sure that the optimization did not converge to a
> minimum, so I would either redefine the optimization problem or choose a
> more robust optimization algorithm.

Ideally, the algorithm should be robust enough to know that it didn't
converge in this case, and signal failure (success > 4). leastsq is a
wrapper to MINPACK's LMDER, so this is not completely straightforward to
debug.

-- 
Pauli Virtanen

From simpson at math.toronto.edu  Thu Dec 11 19:49:50 2008
From: simpson at math.toronto.edu (Gideon Simpson)
Date: Thu, 11 Dec 2008 19:49:50 -0500
Subject: [SciPy-user] complex odes
Message-ID:

Does odeint not support the integration of complex valued ODEs?

-gideon

From zane at ideotrope.org  Thu Dec 11 20:08:43 2008
From: zane at ideotrope.org (Zane Selvans)
Date: Thu, 11 Dec 2008 17:08:43 -0800
Subject: [SciPy-user] Numpy floating point comparisons
In-Reply-To:
References:
Message-ID:

Okay, let's try that again...

I'm sure that this problem has come up before, and caused headaches, and
hopefully someone can just point me at the appropriate documentation...
but, I'm having issues doing floating point value comparisons...
Specifically, I have two arrays of floating point values A and B, which
have different lengths, and I want to do something semantically
equivalent to:

X = B[where(A in B[:,0])]

But the array set operations of course don't do well when the values
being compared are floating point, so:

X = B[where(numpy.lib.arraysetops.setmember1d(B[:,0],A))]

works unreliably.

Is there a way to do this quickly, that also works for floating point
values?

-- 
Zane Selvans
Amateur Earthling
zane at ideotrope.org
303/815-6866
http://zaneselvans.org
PGP Key: 55E0815F
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From zane at ideotrope.org  Thu Dec 11 19:58:03 2008
From: zane at ideotrope.org (Zane Selvans)
Date: Thu, 11 Dec 2008 16:58:03 -0800
Subject: [SciPy-user] Numpy floating point comparisons
Message-ID:

Hello,

I'm sure that this problem has come up before, and caused headaches, and
hopefully someone can just point me at the appropriate documentation...
but, I'm having issues doing floating point value comparisons...

Specifically, I have two arrays of floating point values A and B, which
have different lengths, and I want to extract all the X = A[where(A in B)

-- 
Zane Selvans
Amateur Earthling
zane at ideotrope.org
303/815-6866
http://zaneselvans.org
PGP Key: 55E0815F
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From david at ar.media.kyoto-u.ac.jp  Thu Dec 11 23:49:55 2008
From: david at ar.media.kyoto-u.ac.jp (David Cournapeau)
Date: Fri, 12 Dec 2008 13:49:55 +0900
Subject: [SciPy-user] negative values in diagonal of covariance matrix
In-Reply-To:
References: <49412961.4090901@gmail.com> <1cd32cbb0812110757i1070ce5at4d4843d453091045@mail.gmail.com>
Message-ID: <4941ED73.5090302@ar.media.kyoto-u.ac.jp>

Pauli Virtanen wrote:
>
> Ideally, the algorithm should be robust enough to know that it didn't
> converge in this case, and signal failure (success > 4). leastsq is a
> wrapper to MINPACK's LMDER, so this is not completely straightforward to
> debug.
>

Do you mean the python wrapper misses some diagnostic information
available in fortran ? Otherwise, would it make sense to check the ier
value from fortran and at least generate a warning about failed
convergence ?

David

From millman at berkeley.edu  Fri Dec 12 00:24:08 2008
From: millman at berkeley.edu (Jarrod Millman)
Date: Thu, 11 Dec 2008 21:24:08 -0800
Subject: [SciPy-user] Python 3.0?
In-Reply-To: <9911419a0812110912u1c9a2ce5nd0e643d5d3038d5a@mail.gmail.com>
References: <5b8d13220812110623h3cd421d1qd2ef4785303c51e2@mail.gmail.com> <9911419a0812110912u1c9a2ce5nd0e643d5d3038d5a@mail.gmail.com>
Message-ID:

On Thu, Dec 11, 2008 at 9:12 AM, Joshua Lippai wrote:
> NumPy is going to have to move over to Python 3.0 before the whole of
> SciPy can, and as far as I know that's not going to happen until early
> 2009. I'd hold off on Python 3.x for the time being.

It definitely will not happen in early 2009. Early 2010 would be a
better estimate. We are aiming to fully support Python 2.6 in early
2009.
-- 
Jarrod Millman
Computational Infrastructure for Research Labs
10 Giannini Hall, UC Berkeley
phone: 510.643.4014
http://cirl.berkeley.edu/

From josef.pktd at gmail.com  Fri Dec 12 00:52:11 2008
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Fri, 12 Dec 2008 00:52:11 -0500
Subject: [SciPy-user] negative values in diagonal of covariance matrix
In-Reply-To: <4941ED73.5090302@ar.media.kyoto-u.ac.jp>
References: <49412961.4090901@gmail.com> <1cd32cbb0812110757i1070ce5at4d4843d453091045@mail.gmail.com> <4941ED73.5090302@ar.media.kyoto-u.ac.jp>
Message-ID: <1cd32cbb0812112152g16abfa5ah9be53e01cf2a31ad@mail.gmail.com>

On Thu, Dec 11, 2008 at 11:49 PM, David Cournapeau wrote:
> Pauli Virtanen wrote:
>>
>> Ideally, the algorithm should be robust enough to know that it didn't
>> converge in this case, and signal failure (success > 4). leastsq is a
>> wrapper to MINPACK's LMDER, so this is not completely straightforward to
>> debug.
>>
>
> Do you mean the python wrapper misses some diagnostic information
> available in fortran ? Otherwise, would it make sense to check the ier
> value from fortran and at least generate a warning about failed
> convergence ?
>

I played around for some time with this problem, and at the original
initial values I did slightly better than the original windows version
with no negative diagonal elements, but when I changed the starting
values the results did not look very nice; more often than not I also
got negative diagonal elements and negative eigenvalues of the cov
matrix.

I don't know minpack, but from the description in optimize.leastsq it
looks like all the diagnostics and intermediate results are returned.
What I meant by non-convergence was that the problem is not well behaved
enough for the optimizer. I don't know what the underlying algorithm is,
but, if the implied Hessian is not positive definite, then the
optimization might get stuck on a saddle point, or no longer find a
direction of improvement. (Also, to me the limit on the maximum number
of function evaluations seems a bit low, but trying out different
options did not change anything for the bad results.)

From the description, it seems that optimize.leastsq does not work
directly with the Hessian but with the jacobian and other vectors and
matrices that I don't understand without looking more closely. But the
principle should still apply.

I haven't done this in a long time, but I was minimizing likelihood
functions (in Gauss) where I knew that in many cases the Hessian was
singular. Numerically, I then frequently got negative eigenvalues for
the Hessian, and most of the standard optimizers just got stuck,
especially bfgs and those that work with a locally quadratic
approximation, since they assume it's a locally convex problem. I ended
up writing my own version of the optimizer that forced the Hessian to be
strictly positive definite. This was quite some time ago, and now it
looks like there are more robust estimators available (optimization
scikit).

So, I don't think you can blame it on the optimizers if you throw a
problem at them for which they are not designed. Checking convergence
for different starting values and verifying the curvature of the
objective function when the problem is not "nice" is up to the
researcher (I thought).
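One way to do such a curvature check by hand is a crude central-difference
Hessian; this is only a sketch (not careful numerics), with `f` a scalar
objective function and `xhat` the candidate minimum, both hypothetical:

    import numpy as np

    def approx_hessian(f, xhat, h=1e-5):
        # central finite differences; crude, but enough for a sanity check
        xhat = np.asarray(xhat, dtype=float)
        n = xhat.size
        H = np.empty((n, n))
        for i in range(n):
            for j in range(n):
                ei = np.zeros(n); ei[i] = h
                ej = np.zeros(n); ej[j] = h
                H[i, j] = (f(xhat + ei + ej) - f(xhat + ei - ej)
                           - f(xhat - ei + ej) + f(xhat - ei - ej)) / (4.0 * h * h)
        return H

    # at a true local minimum, all eigenvalues should be positive:
    # np.linalg.eigvalsh(approx_hessian(f, xhat))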
Josef

From matthieu.brucher at gmail.com  Fri Dec 12 01:41:55 2008
From: matthieu.brucher at gmail.com (Matthieu Brucher)
Date: Fri, 12 Dec 2008 07:41:55 +0100
Subject: [SciPy-user] Numpy floating point comparisons
In-Reply-To:
References:
Message-ID:

Hi,

I don't think you can do better than make a new difference array, get
its absolute value, compare it to some small value and use the result.
This is what you would do in any language, so try it ;)

Matthieu

2008/12/12 Zane Selvans :
> Okay, let's try that again...
> I'm sure that this problem has come up before, and caused headaches, and
> hopefully someone can just point me at the appropriate documentation... but,
> I'm having issues doing floating point value comparisons...
> Specifically, I have two arrays of floating point values A and B, which have
> different lengths, and I want to do something semantically equivalent to:
> X = B[where(A in B[:,0])]
> But the array set operations of course don't do well when the values being
> compared are floating point, so:
> X = B[where(numpy.lib.arraysetops.setmember1d(B[:,0],A))]
> works unreliably.
> Is there a way to do this quickly, that also works for floating point
> values?
> --
> Zane Selvans
> Amateur Earthling
> zane at ideotrope.org
> 303/815-6866
> http://zaneselvans.org
> PGP Key: 55E0815F
>
> _______________________________________________
> SciPy-user mailing list
> SciPy-user at scipy.org
> http://projects.scipy.org/mailman/listinfo/scipy-user
>

-- 
Information System Engineer, Ph.D.
Website: http://matthieu-brucher.developpez.com/
Blogs: http://matt.eifelle.com and http://blog.developpez.com/?blog=92
LinkedIn: http://www.linkedin.com/in/matthieubrucher

From nwagner at iam.uni-stuttgart.de  Fri Dec 12 02:10:13 2008
From: nwagner at iam.uni-stuttgart.de (Nils Wagner)
Date: Fri, 12 Dec 2008 08:10:13 +0100
Subject: [SciPy-user] complex odes
In-Reply-To:
References:
Message-ID:

On Thu, 11 Dec 2008 19:49:50 -0500 Gideon Simpson wrote:
> Does odeint not support the integration of complex valued ODEs?
>
> -gideon
>
> _______________________________________________
> SciPy-user mailing list
> SciPy-user at scipy.org
> http://projects.scipy.org/mailman/listinfo/scipy-user

ode supports zvode

>>> from scipy.integrate.ode import zvode

Nils

From pav at iki.fi  Fri Dec 12 04:25:06 2008
From: pav at iki.fi (Pauli Virtanen)
Date: Fri, 12 Dec 2008 09:25:06 +0000 (UTC)
Subject: [SciPy-user] negative values in diagonal of covariance matrix
References: <49412961.4090901@gmail.com> <1cd32cbb0812110757i1070ce5at4d4843d453091045@mail.gmail.com> <4941ED73.5090302@ar.media.kyoto-u.ac.jp>
Message-ID:

Fri, 12 Dec 2008 13:49:55 +0900, David Cournapeau wrote:
> Do you mean the python wrapper misses some diagnostic information
> available in fortran ? Otherwise, would it make sense to check the ier
> value from fortran and at least generate a warning about failed
> convergence ?

The Python wrapper to minpack.lmder is quite thin, I'd expect any
problems to be in the MINPACK code itself (which is actually LMDIF and
not LMDER as I wrote earlier). The algorithm itself is something like
Levenberg-Marquardt with a trust region.

Unless the problem is in the minpack.leastsq code that forms the cov_x
matrix from return values of LMDIF:

    perm = take(eye(n),retval[1]['ipvt']-1,0)
    r = triu(transpose(retval[1]['fjac'])[:n,:])
    R = dot(r, perm)
    cov_x = inv(dot(transpose(R),R))

I don't have the time to check this now, though.
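For reference, cov_x as built above approximates (J'J)**(-1); to turn it
into a parameter covariance it still has to be scaled by the residual
variance. A minimal sketch, with a made-up linear model and synthetic
data:

    import numpy as np
    from scipy import optimize

    x = np.linspace(0, 10, 50)
    y = 3.0 * x + 1.0 + 0.1 * np.random.randn(50)     # synthetic data
    residuals = lambda p: y - (p[0] * x + p[1])

    p, cov_x, info, mesg, ier = optimize.leastsq(residuals, [1.0, 0.0],
                                                 full_output=True)
    dof = len(info['fvec']) - len(p)
    s_sq = (info['fvec'] ** 2).sum() / dof            # residual variance
    pcov = cov_x * s_sq                               # parameter covariance
    perr = np.sqrt(np.diag(pcov))                     # 1-sigma errors

If the diagonal of pcov comes out negative, as in the problem that
started this thread, the solution is suspect.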
-- 
Pauli Virtanen

From pav at iki.fi  Fri Dec 12 04:27:39 2008
From: pav at iki.fi (Pauli Virtanen)
Date: Fri, 12 Dec 2008 09:27:39 +0000 (UTC)
Subject: [SciPy-user] complex odes
References:
Message-ID:

Fri, 12 Dec 2008 08:10:13 +0100, Nils Wagner wrote:
[clip]
> ode supports zvode
>
>>>> from scipy.integrate.ode import zvode

Actually, the zvode class is not supposed to be used directly. Instead,
see

>>> from scipy.integrate import ode
>>> help(ode)

From j.anderson at hull.ac.uk  Fri Dec 12 07:59:12 2008
From: j.anderson at hull.ac.uk (Joseph Anderson)
Date: Fri, 12 Dec 2008 12:59:12 +0000
Subject: [SciPy-user] Python 3.0?
In-Reply-To:
Message-ID:

Thanks for the update. I'd already made the mistake of rolling to Python
2.6, finding I was just a bit too ambitious, then rolling back to 2.5.

My best,

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Dr Joseph Anderson
Lecturer in Music

School of Arts and New Media
University of Hull, Scarborough Campus,
Scarborough, North Yorkshire, YO11 3AZ, UK

T: +44.(0)1723.357341 T: +44.(0)1723.357370 F: +44.(0)1723.350815
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

On 12/12/2008 5:24 am, "Jarrod Millman" wrote:
> On Thu, Dec 11, 2008 at 9:12 AM, Joshua Lippai wrote:
>> NumPy is going to have to move over to Python 3.0 before the whole of
>> SciPy can, and as far as I know that's not going to happen until early
>> 2009. I'd hold off on Python 3.x for the time being.
>
> It definitely will not happen in early 2009. Early 2010 would be a
> better estimate. We are aiming to fully support Python 2.6 in early
> 2009.

*****************************************************************************************
To view the terms under which this email is distributed, please go to http://www.hull.ac.uk/legal/email_disclaimer.html
*****************************************************************************************

From bastian.weber at gmx-topmail.de  Sat Dec 13 10:57:15 2008
From: bastian.weber at gmx-topmail.de (Bastian Weber)
Date: Sat, 13 Dec 2008 16:57:15 +0100
Subject: [SciPy-user] null space of a matrix
Message-ID: <4943DB5B.9010708@gmx-topmail.de>

Hello to the list!

In Summer of 2005 there was a short discussion on how to compute the
null space of a given matrix. The starting message can be found here:

http://projects.scipy.org/pipermail/scipy-user/2005-June/004645.html

Then Ryan summarized the discussion in this short function:

> def null(A, eps=1e-15):
>     u, s, vh = scipy.linalg.svd(A)
>     null_mask = (s <= eps)
>     null_space = scipy.compress(null_mask, vh, axis=0)
>     return scipy.transpose(null_space)

Applying this to an n by m matrix with n <> m did not work. I think it
is probably because s contains only the nonzero singular values. So I
extended the code a little bit such that len(s) now = m, with m the
bigger dim of A.
import scipy
import scipy.linalg

def null(A, eps=1e-15):
    """ computes the null space of the real matrix A """
    n, m = scipy.shape(A)
    if n > m :
        # more rows than columns: work with the transpose instead
        return scipy.transpose(null(scipy.transpose(A), eps))
    u, s, vh = scipy.linalg.svd(A)
    s = scipy.append(s, scipy.zeros(m))[0:m]
    null_mask = (s <= eps)
    null_space = scipy.compress(null_mask, vh, axis=0)
    return scipy.transpose(null_space)

# example:
K = scipy.matrix(scipy.rand(3, 7))  # matrix type to have convenient multiplication later
G = K.transpose()

# results should be "almost" zero
print K*null(K)
print null(G)*G

As my knowledge of linear algebra is more or less at the average level
of engineering students, I don't know whether this extended code is
correct or useful. Just in case, feel free to use it.

Any comments are very welcome.

Bastian.

From david at ar.media.kyoto-u.ac.jp  Sun Dec 14 08:11:23 2008
From: david at ar.media.kyoto-u.ac.jp (David Cournapeau)
Date: Sun, 14 Dec 2008 22:11:23 +0900
Subject: [SciPy-user] Audiolab 0.10
Message-ID: <494505FB.7090909@ar.media.kyoto-u.ac.jp>

Hi,

A quick email to announce the 0.10 release of audiolab. This is a major
update of 0.9 (the ctypes code was replaced by cython, and should be
more reliable):

NEW FEATURES since 0.9

* CoreAudio backend for play function: this means mac os X users can
  now output numpy array data directly on their output device
* Sndfile and Format class: both sndfile and formatinfo are obsoleted
  by those classes. They are more compliant with python conventions,
  and should be easier to use. The old sndfile and formatinfo are still
  available for convenience, but they are thin wrappers around the
  Sndfile and Format classes. You are advised to switch.
* available_file_formats and available_encodings: those functions can
  be used to query the supported file formats and encodings. The old
  functions were fundamentally broken and never worked: those should
  work whatever sndfile version you are using.
* New formats supported by the dev version of sndfile (1.0.18) are
  supported if built against those recent versions: ogg, in particular,
  is fully supported, both through the Sndfile class and the matlab API
  (oggread and oggwrite)

Win32 binary installers and mac os x .dmg (with libsndfile
1.0.18pre24h) are available, so people don't have to build their own
(statically linked). Of course, source tarballs are also available on
pypi.

cheers,

David

From stef.mientki at gmail.com  Sun Dec 14 09:49:14 2008
From: stef.mientki at gmail.com (Stef Mientki)
Date: Sun, 14 Dec 2008 15:49:14 +0100
Subject: [SciPy-user] Audiolab 0.10
In-Reply-To: <494505FB.7090909@ar.media.kyoto-u.ac.jp>
References: <494505FB.7090909@ar.media.kyoto-u.ac.jp>
Message-ID: <49451CEA.3070804@gmail.com>

David Cournapeau wrote:
> Hi,
>
> A quick email to announce the 0.10 release of audiolab. This is a
> major update of 0.9 (the ctypes code was replaced by cython, and should
> be more reliable):
>

0.10 / 0.9 = 0.11

major update ?
;-)

cheers,
Stef

From josef.pktd at gmail.com  Sun Dec 14 12:16:50 2008
From: josef.pktd at gmail.com (josef.pktd at gmail.com)
Date: Sun, 14 Dec 2008 12:16:50 -0500
Subject: [SciPy-user] negative values in diagonal of covariance matrix
In-Reply-To:
References: <49412961.4090901@gmail.com> <1cd32cbb0812110757i1070ce5at4d4843d453091045@mail.gmail.com> <4941ED73.5090302@ar.media.kyoto-u.ac.jp>
Message-ID: <1cd32cbb0812140916o1fd7f688p5111a1ae70b88ea8@mail.gmail.com>

On Fri, Dec 12, 2008 at 4:25 AM, Pauli Virtanen wrote:
> Fri, 12 Dec 2008 13:49:55 +0900, David Cournapeau wrote:
>> Do you mean the python wrapper misses some diagnostic information
>> available in fortran ? Otherwise, would it make sense to check the ier
>> value from fortran and at least generate a warning about failed
>> convergence ?
>
> The Python wrapper to minpack.lmder is quite thin, I'd expect any
> problems to be in the MINPACK code itself (which is actually LMDIF and
> not LMDER as I wrote earlier). The algorithm itself is something like
> Levenberg-Marquardt with a trust region.
>
> Unless the problem is in the minpack.leastsq code that forms the cov_x
> matrix from return values of LMDIF:
>
>     perm = take(eye(n),retval[1]['ipvt']-1,0)
>     r = triu(transpose(retval[1]['fjac'])[:n,:])
>     R = dot(r, perm)
>     cov_x = inv(dot(transpose(R),R))
>
> I don't have the time to check this now, though.
>
> --
> Pauli Virtanen
>
> _______________________________________________
> SciPy-user mailing list
> SciPy-user at scipy.org
> http://projects.scipy.org/mailman/listinfo/scipy-user
>

I compared the covariance that is returned by optimize.leastsq with the
covariance from the ols estimation using scipy.linalg.leastsq, with the
linear-in-parameters model from the test suite.

The covariance in optimize.leastsq is in this case equal to (X'X)**(-1),
i.e. linalg.inv(dot(self.xx.T,self.xx)), which is not equal to the
covariance of the estimated parameters, which is SSE/dof * (X'X)**(-1).
So at least the docstring of optimize.leastsq is ambiguous or
misleading, but except for the missing scale factor, the returned
numbers are essentially the same, in my example:

tls.inv_xx - optls.res_full[1]
array([[ -1.76561548e-11,   2.08572459e-10,  -3.03803875e-10],
       [  2.08572459e-10,  -2.37702299e-09,   3.39151827e-09],
       [ -3.03803876e-10,   3.39151827e-09,  -4.22841509e-09]])

estimated parameters are also ok:

>>> res_full[0] - tls.p_ols.T
array([[  1.01658158e-07,  -1.46458449e-06,   3.41746076e-06]])

I'm a bit rusty on non-linear least squares, and didn't check for
non-linear cases. Numpy/scipy seems to be missing a numerical hessian
calculation to verify non-linear optimization results (at least I didn't
find any).

For the original problem of this thread, only optimize.fmin is able to
get a reasonable estimate, and, as I expected, optimize.fmin_bfgs is
doing even worse than optimize.leastsq.

Josef

From david at ar.media.kyoto-u.ac.jp  Sun Dec 14 21:34:42 2008
From: david at ar.media.kyoto-u.ac.jp (David Cournapeau)
Date: Mon, 15 Dec 2008 11:34:42 +0900
Subject: [SciPy-user] Audiolab 0.10
In-Reply-To: <49451CEA.3070804@gmail.com>
References: <494505FB.7090909@ar.media.kyoto-u.ac.jp> <49451CEA.3070804@gmail.com>
Message-ID: <4945C242.9090804@ar.media.kyoto-u.ac.jp>

Stef Mientki wrote:
> David Cournapeau wrote:
>
>> Hi,
>>
>> A quick email to announce the 0.10 release of audiolab. This is a
>> major update of 0.9 (the ctypes code was replaced by cython, and should
>> be more reliable):
>>
>>
> 0.10 / 0.9 = 0.11
>
> major update ?
> ;-)
>

You can call it audiolab 2008 or audiolab 2, if you prefer :)

cheers,

David

From stefan at sun.ac.za  Mon Dec 15 10:02:52 2008
From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=)
Date: Mon, 15 Dec 2008 17:02:52 +0200
Subject: [SciPy-user] ANN: SciKits Portal
Message-ID: <9457e7c80812150702y20c71d79jc091de0bdf38d9bc@mail.gmail.com>

Dear SciPy Community

In recent discussions it came to light that SciKits, our equivalent of
toolboxes or packages, fulfil an important role in the SciPy life-cycle.
First, it is a staging ground for code to be included in SciPy, where
code can mature in view of community feedback. Second, it provides a
home for packages with licenses or scope different to that of the main
SciPy distribution.

David Cournapeau and I thought that a SciKits portal, similar in spirit
to the GNU/R Package Repository [1] or the mloss.org site [2], would be
a good way to market SciKits and to promote them for general use by the
community. The Division of Applied Mathematics at Stellenbosch
University donated some funds, allowing us to hire Janto Dreijer to
bring this idea to fruition. The resulting work-in-progress can be found
here:

http://scikits.appspot.com

Features include:

- Aggregation of information on SciKits, based on PyPi entries
- Short, permanent URLs to refer to packages, e.g.
  http://scikits.appspot.com/example or http://scikits.appspot.com/ann
  (the domain should become scikits.scipy.org).
- Automatically generated (and templated) installation instructions
- ReStructuredText rendering of package descriptions as provided by
  authors (currently disabled)
- Searchable package listing (weighted algorithm to be implemented)

Please note that the current text snippets are just place-holders, and
that they can be changed by any person with the appropriate permissions.

We would like to invite the community to give feedback, so that we can
improve the framework further and make a useful contribution.

Thank you for reading, and happy hacking!
St?fan

[1] http://cran.r-project.org/web/packages/
[2] http://mloss.org/software/language/python/

From ondrej at certik.cz  Mon Dec 15 10:16:26 2008
From: ondrej at certik.cz (Ondrej Certik)
Date: Mon, 15 Dec 2008 16:16:26 +0100
Subject: [SciPy-user] ANN: SciKits Portal
In-Reply-To: <9457e7c80812150702y20c71d79jc091de0bdf38d9bc@mail.gmail.com>
References: <9457e7c80812150702y20c71d79jc091de0bdf38d9bc@mail.gmail.com>
Message-ID: <85b5c3130812150716u431f2d5jcb85ff3423984ad7@mail.gmail.com>

On Mon, Dec 15, 2008 at 4:02 PM, St?fan van der Walt wrote:
> Dear SciPy Community
>
> In recent discussions it came to light that SciKits, our equivalent of
> toolboxes or packages, fulfil an important role in the SciPy
> life-cycle. First, it is a staging ground for code to be included in
> SciPy, where code can mature in view of community feedback. Second,
> it provides a home for packages with licenses or scope different to
> that of the main SciPy distribution.
>
> David Cournapeau and I thought that a SciKits portal, similar in
> spirit to the GNU/R Package Repository [1] or the mloss.org site [2],
> would be a good way to market SciKits and to promote them for general
> use by the community. The Division of Applied Mathematics at
> Stellenbosch University donated some funds, allowing us to hire Janto
> Dreijer to bring this idea to fruition.
> The resulting
> work-in-progress can be found here:
>
> http://scikits.appspot.com
>
> Features include:
>
> - Aggregation of information on SciKits, based on PyPi entries
> - Short, permanent URLs to refer to packages, e.g.
>   http://scikits.appspot.com/example or http://scikits.appspot.com/ann
>   (the domain should become scikits.scipy.org).
> - Automatically generated (and templated) installation instructions
> - ReStructuredText rendering of package descriptions as provided by
>   authors (currently disabled)
> - Searchable package listing (weighted algorithm to be implemented)
>
> Please note that the current text snippets are just place-holders, and
> that they can be changed by any person with the appropriate
> permissions.
>
> We would like to invite the community to give feedback, so that we can
> improve the framework further and make a useful contribution.
>
> Thank you for reading, and happy hacking!

Excellent job!

Is the source code of the portal available somewhere, so that people can
send patches to it? How can one add a new scikit to it?

Thanks,
Ondrej

From gael.varoquaux at normalesup.org  Mon Dec 15 11:11:12 2008
From: gael.varoquaux at normalesup.org (Gael Varoquaux)
Date: Mon, 15 Dec 2008 17:11:12 +0100
Subject: [SciPy-user] ANN: SciKits Portal
In-Reply-To: <9457e7c80812150702y20c71d79jc091de0bdf38d9bc@mail.gmail.com>
References: <9457e7c80812150702y20c71d79jc091de0bdf38d9bc@mail.gmail.com>
Message-ID: <20081215161112.GA8099@phare.normalesup.org>

On Mon, Dec 15, 2008 at 05:02:52PM +0200, St?fan van der Walt wrote:
> David Cournapeau and I thought that a SciKits portal, similar in
> spirit to the GNU/R Package Repository [1] or the mloss.org site [2],
> would be a good way to market SciKits and to promote them for general
> use by the community. The Division of Applied Mathematics at
> Stellenbosch University donated some funds, allowing us to hire Janto
> Dreijer to bring this idea to fruition. The resulting
> work-in-progress can be found here:

> http://scikits.appspot.com

Fantastic! We really need these efforts.

One small remark: could we have a visual style, and marketing, that
clearly shows on a first look that this is part of the scipy ecosystem?
We need to establish the community and the brand. We also need to make
it easier for people to figure out what packages go with what.

By visual style, or branding, I have something really light in mind,
e.g. a logo.

Ga?l

From stefan at sun.ac.za  Mon Dec 15 12:10:30 2008
From: stefan at sun.ac.za (=?ISO-8859-1?Q?St=E9fan_van_der_Walt?=)
Date: Mon, 15 Dec 2008 19:10:30 +0200
Subject: [SciPy-user] ANN: SciKits Portal
In-Reply-To: <20081215161112.GA8099@phare.normalesup.org>
References: <9457e7c80812150702y20c71d79jc091de0bdf38d9bc@mail.gmail.com> <20081215161112.GA8099@phare.normalesup.org>
Message-ID: <9457e7c80812150910m4b26126cnde63dcbc87a8b183@mail.gmail.com>

2008/12/15 Gael Varoquaux :
> Fantastic! We really need these efforts.
>
> One small remark: could we have a visual style, and marketing, that
> clearly shows on a first look that this is part of the scipy ecosystem?
> We need to establish the community and the brand. We also need to make
> it easier for people to figure out what packages go with what.
>
> By visual style, or branding, I have something really light in mind,
> e.g. a logo.

Thanks, good point. We'll add a SciPy logo.

Cheers
St?fan

From dmitrey15 at ukr.net  Mon Dec 15 17:44:12 2008
From: dmitrey15 at ukr.net (Dmitrey)
Date: Tue, 16 Dec 2008 00:44:12 +0200
Subject: [SciPy-user] [optimization] openopt v.
0.21
Message-ID: <4946DDBC.7040606@ukr.net>

Hi all,

openopt v. 0.21 has been released. However, the old svn location is no
longer valid (my chiefs have forced me to move my code to another
location). All details are here:

http://openopt.blogspot.com/2008/12/openopt-release-021.html

Let me also invite you to the new forum on numerical optimization and
related free and open source software:

http://forum.openopt.org/

Regards,
Dmitrey.

From wdj at usna.edu  Mon Dec 15 23:39:39 2008
From: wdj at usna.edu (David Joyner)
Date: Mon, 15 Dec 2008 23:39:39 -0500 (EST)
Subject: [SciPy-user] exponential integral behaviour near -20
Message-ID: <20081215233939.AOI89750@mp2.nettest.usna.edu>

Hi:

A user reported a bug on the sage-support list (Sage, www.sagemath.org,
includes scipy) for the exponential integral. This seems to illustrate
the idea:

>>> import scipy.special, math
>>> -scipy.special.exp1(-complex(19.9999990))
(25615628.4058083-3.1415926535897931j)
>>> -scipy.special.exp1(-complex(19.9999991))
(25615624.582086116+3.1415926535897931j)

Any ideas on what would cause this?

- David Joyner

From robert.kern at gmail.com  Tue Dec 16 00:02:30 2008
From: robert.kern at gmail.com (Robert Kern)
Date: Mon, 15 Dec 2008 23:02:30 -0600
Subject: [SciPy-user] exponential integral behaviour near -20
In-Reply-To: <20081215233939.AOI89750@mp2.nettest.usna.edu>
References: <20081215233939.AOI89750@mp2.nettest.usna.edu>
Message-ID: <3d375d730812152102t284ad8efi8e8f8c8d74b9c3a9@mail.gmail.com>

On Mon, Dec 15, 2008 at 22:39, David Joyner wrote:
> Hi:
>
> A user reported a bug on the sage-support list (Sage, www.sagemath.org, includes scipy) for the exponential
> integral. This seems to illustrate the idea:
>
>>>> import scipy.special, math
>>>> -scipy.special.exp1(-complex(19.9999990))
> (25615628.4058083-3.1415926535897931j)
>>>> -scipy.special.exp1(-complex(19.9999991))
> (25615624.582086116+3.1415926535897931j)
>
> Any ideas on what would cause this?

I cannot replicate this on OS X with a recent(ish) SVN scipy compiled
with gfortran 4.2.1. This function is implemented in FORTRAN, so the
FORTRAN compiler used may be relevant. Can you give us more information
about the platform, scipy version, and FORTRAN compiler?

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
 -- Umberto Eco

From david at ar.media.kyoto-u.ac.jp  Mon Dec 15 23:58:05 2008
From: david at ar.media.kyoto-u.ac.jp (David Cournapeau)
Date: Tue, 16 Dec 2008 13:58:05 +0900
Subject: [SciPy-user] exponential integral behaviour near -20
In-Reply-To: <3d375d730812152102t284ad8efi8e8f8c8d74b9c3a9@mail.gmail.com>
References: <20081215233939.AOI89750@mp2.nettest.usna.edu> <3d375d730812152102t284ad8efi8e8f8c8d74b9c3a9@mail.gmail.com>
Message-ID: <4947355D.20607@ar.media.kyoto-u.ac.jp>

Robert Kern wrote:
>
> I cannot replicate this on OS X with a recent(ish) SVN scipy compiled
> with gfortran 4.2.1. This function is implemented in FORTRAN, so the
> FORTRAN compiler used may be relevant.

Isn't exp1 implemented from exp1n in cephes ?

> Can you give us more
> information about the platform, scipy version, and FORTRAN compiler?
>

FWIW, I can replicate the reported behavior on Linux (x86 - gfortran
4.3.2 + recent SVN for both numpy and scipy as well).
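A quick way to see where the jump happens is to scan values on either
side of |z| = 20; a small probe script (just a sketch):

    import numpy as np
    import scipy.special as sp

    # scan across |z| = 20; the imaginary part should not jump sign
    for xv in np.linspace(19.999998, 20.000002, 9):
        print(-sp.exp1(-complex(xv)))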
cheers,

David

From robert.kern at gmail.com  Tue Dec 16 00:24:59 2008
From: robert.kern at gmail.com (Robert Kern)
Date: Mon, 15 Dec 2008 23:24:59 -0600
Subject: [SciPy-user] exponential integral behaviour near -20
In-Reply-To: <4947355D.20607@ar.media.kyoto-u.ac.jp>
References: <20081215233939.AOI89750@mp2.nettest.usna.edu> <3d375d730812152102t284ad8efi8e8f8c8d74b9c3a9@mail.gmail.com> <4947355D.20607@ar.media.kyoto-u.ac.jp>
Message-ID: <3d375d730812152124u64cd8c9j568b79b0ff6bdb4e@mail.gmail.com>

On Mon, Dec 15, 2008 at 22:58, David Cournapeau wrote:
> Robert Kern wrote:
>>
>> I cannot replicate this on OS X with a recent(ish) SVN scipy compiled
>> with gfortran 4.2.1. This function is implemented in FORTRAN, so the
>> FORTRAN compiler used may be relevant.
>
> Isn't exp1 implemented from exp1n in cephes ?

No, there is no exp1n (or exp1m) actually defined in our Cephes sources.
SPECFUN's E1Z (for the complex input) is the implementation that is
used.

>> Can you give us more
>> information about the platform, scipy version, and FORTRAN compiler?
>
> FWIW, I can replicate the reported behavior on Linux (x86 - gfortran
> 4.3.2 + recent SVN for both numpy and scipy as well).

Hmm. I dunno. Perhaps gfortran's runtime changed a branch cut for
CDLOG() between 4.2 and 4.3. That would explain the +/- pi*I in the
output.

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
 -- Umberto Eco

From wdj at usna.edu  Tue Dec 16 00:34:53 2008
From: wdj at usna.edu (David Joyner)
Date: Tue, 16 Dec 2008 00:34:53 -0500 (EST)
Subject: [SciPy-user] exponential integral behaviour near -20
Message-ID: <20081216003453.AOI90330@mp2.nettest.usna.edu>

---- Original message ----
>Date: Mon, 15 Dec 2008 23:02:30 -0600
>From: "Robert Kern"
>Subject: Re: [SciPy-user] exponential integral behaviour near -20
>To: "SciPy Users List"
>
>On Mon, Dec 15, 2008 at 22:39, David Joyner wrote:
>> Hi:
>>
>> A user reported a bug on the sage-support list (Sage, www.sagemath.org, includes scipy) for the exponential
>> integral. This seems to illustrate the idea:
>>
>>>>> import scipy.special, math
>>>>> -scipy.special.exp1(-complex(19.9999990))
>> (25615628.4058083-3.1415926535897931j)
>>>>> -scipy.special.exp1(-complex(19.9999991))
>> (25615624.582086116+3.1415926535897931j)
>>
>> Any ideas on what would cause this?
>
>I cannot replicate this on OS X with a recent(ish) SVN scipy compiled
>with gfortran 4.2.1. This function is implemented in FORTRAN, so the
>FORTRAN compiler used may be relevant. Can you give us more
>information about the platform, scipy version, and FORTRAN compiler?
>

I think all the version information is here (I ran scipy from within
sage using sage -python):
http://www.sagemath.org/packages/standard/
It was run on a machine with a phenom chip running amd64 ubuntu 8.10.

>--
>Robert Kern
>
>"I have come to believe that the whole world is an enigma, a harmless
>enigma that is made terrible by our own mad attempt to interpret it as
>though it had an underlying truth."
> -- Umberto Eco
>_______________________________________________
>SciPy-user mailing list
>SciPy-user at scipy.org
>http://projects.scipy.org/mailman/listinfo/scipy-user

From robert.kern at gmail.com  Tue Dec 16 00:43:40 2008
From: robert.kern at gmail.com (Robert Kern)
Date: Mon, 15 Dec 2008 23:43:40 -0600
Subject: [SciPy-user] exponential integral behaviour near -20
In-Reply-To: <20081216003453.AOI90330@mp2.nettest.usna.edu>
References: <20081216003453.AOI90330@mp2.nettest.usna.edu>
Message-ID: <3d375d730812152143s181a251du159d9cefb88c0dc6@mail.gmail.com>

On Mon, Dec 15, 2008 at 23:34, David Joyner wrote:
>
> ---- Original message ----
>>Date: Mon, 15 Dec 2008 23:02:30 -0600
>>From: "Robert Kern"
>>Subject: Re: [SciPy-user] exponential integral behaviour near -20
>>To: "SciPy Users List"
>>
>>On Mon, Dec 15, 2008 at 22:39, David Joyner wrote:
>>> Hi:
>>>
>>> A user reported a bug on the sage-support list (Sage, www.sagemath.org, includes scipy) for the exponential
>>> integral. This seems to illustrate the idea:
>>>
>>>>>> import scipy.special, math
>>>>>> -scipy.special.exp1(-complex(19.9999990))
>>> (25615628.4058083-3.1415926535897931j)
>>>>>> -scipy.special.exp1(-complex(19.9999991))
>>> (25615624.582086116+3.1415926535897931j)
>>>
>>> Any ideas on what would cause this?
>>
>>I cannot replicate this on OS X with a recent(ish) SVN scipy compiled
>>with gfortran 4.2.1. This function is implemented in FORTRAN, so the
>>FORTRAN compiler used may be relevant. Can you give us more
>>information about the platform, scipy version, and FORTRAN compiler?
>
> I think all the version information is here (I ran scipy
> from within sage using sage -python):
> http://www.sagemath.org/packages/standard/
> It was run on a machine with a phenom chip
> running amd64 ubuntu 8.10.

Please show me "g95 --version".

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
 -- Umberto Eco

From david at ar.media.kyoto-u.ac.jp  Tue Dec 16 00:41:36 2008
From: david at ar.media.kyoto-u.ac.jp (David Cournapeau)
Date: Tue, 16 Dec 2008 14:41:36 +0900
Subject: [SciPy-user] exponential integral behaviour near -20
In-Reply-To: <3d375d730812152124u64cd8c9j568b79b0ff6bdb4e@mail.gmail.com>
References: <20081215233939.AOI89750@mp2.nettest.usna.edu> <3d375d730812152102t284ad8efi8e8f8c8d74b9c3a9@mail.gmail.com> <4947355D.20607@ar.media.kyoto-u.ac.jp> <3d375d730812152124u64cd8c9j568b79b0ff6bdb4e@mail.gmail.com>
Message-ID: <49473F90.1020200@ar.media.kyoto-u.ac.jp>

Robert Kern wrote:
>
> Hmm. I dunno. Perhaps gfortran's runtime changed a branch cut for
> CDLOG() between 4.2 and 4.3. That would explain the +/- pi*I in the
> output.

Actually, on my machine, the first goes through the code path with
CDLOG, and the other one with the code path using CDEXP.
I used a simple fortran program:

C **********************************
      SUBROUTINE E1Z(Z,CE1)
C
C ====================================================
C     Purpose: Compute complex exponential integral E1(z)
C     Input :  z   --- Argument of E1(z)
C     Output:  CE1 --- E1(z)
C ====================================================
C
      IMPLICIT COMPLEX*16 (C,Z)
      IMPLICIT DOUBLE PRECISION (D-H,O-Y)
      PI=3.141592653589793D0
      EL=0.5772156649015328D0
      X=DBLE(Z)
      A0=CDABS(Z)
      IF (A0.EQ.0.0D0) THEN
         CE1=(1.0D+300,0.0D0)
      ELSE IF (A0.LE.10.0.OR.X.LT.0.0.AND.A0.LT.20.0) THEN
         CE1=(1.0D0,0.0D0)
         CR=(1.0D0,0.0D0)
         DO 10 K=1,150
            CR=-CR*K*Z/(K+1.0D0)**2
            CE1=CE1+CR
            IF (CDABS(CR).LE.CDABS(CE1)*1.0D-15) GO TO 15
10       CONTINUE
15       CE1=-EL-CDLOG(Z)+Z*CE1
      ELSE
         CT0=(0.0D0,0.0D0)
         DO 20 K=120,1,-1
            CT0=K/(1.0D0+K/(Z+CT0))
20       CONTINUE
         CT=1.0D0/(Z+CT0)
         CE1=CDEXP(-Z)*CT
         IF (X.LE.0.0.AND.DIMAG(Z).EQ.0.0) CE1=CE1-PI*(0.0D0,1.0D0)
      ENDIF
      RETURN
      END

      program main
      complex*16 a, b
      complex*16 ra, rb

      a = (-19.9999990D0, 0D0)
      b = (-19.9999991D0, 0D0)
      call E1Z(a, ra)
      call E1Z(b, rb)
      print *, DIMAG(ra), DIMAG(rb)
      end

In that case, I have the same values for the imaginary parts of ra and
rb (I have both gfortran 4.3 and 4.2, and tried both, no difference,
even with the optimization flags -O3 -funroll-loops as used by
distutils). I noticed that both values use different code paths, though
(the one using CDLOG for a, the one using CDEXP for b).

I don't know why there is a discrepancy between this fortran program and
scipy - can the C/F transition cause this ?

David

From david at ar.media.kyoto-u.ac.jp  Tue Dec 16 00:43:50 2008
From: david at ar.media.kyoto-u.ac.jp (David Cournapeau)
Date: Tue, 16 Dec 2008 14:43:50 +0900
Subject: [SciPy-user] exponential integral behaviour near -20
In-Reply-To: <49473F90.1020200@ar.media.kyoto-u.ac.jp>
References: <20081215233939.AOI89750@mp2.nettest.usna.edu> <3d375d730812152102t284ad8efi8e8f8c8d74b9c3a9@mail.gmail.com> <4947355D.20607@ar.media.kyoto-u.ac.jp> <3d375d730812152124u64cd8c9j568b79b0ff6bdb4e@mail.gmail.com> <49473F90.1020200@ar.media.kyoto-u.ac.jp>
Message-ID: <49474016.6090209@ar.media.kyoto-u.ac.jp>

David Cournapeau wrote:
> Robert Kern wrote:
>
>> Hmm. I dunno. Perhaps gfortran's runtime changed a branch cut for
>> CDLOG() between 4.2 and 4.3. That would explain the +/- pi*I in the
>> output.
>>
>
> Actually, on my machine, the first goes through the code path with
> CDLOG, and the other one with the code path using CDEXP. I used a simple
> fortran program:

Grr, thunderbird screwed the indentation, I attached the file instead,

David
-------------- next part --------------
A non-text attachment was scrubbed...
Name: yop.f
Type: text/x-fortran
Size: 1426 bytes
Desc: not available
URL:

From robert.kern at gmail.com  Tue Dec 16 01:20:44 2008
From: robert.kern at gmail.com (Robert Kern)
Date: Tue, 16 Dec 2008 00:20:44 -0600
Subject: [SciPy-user] exponential integral behaviour near -20
In-Reply-To: <49473F90.1020200@ar.media.kyoto-u.ac.jp>
References: <20081215233939.AOI89750@mp2.nettest.usna.edu> <3d375d730812152102t284ad8efi8e8f8c8d74b9c3a9@mail.gmail.com> <4947355D.20607@ar.media.kyoto-u.ac.jp> <3d375d730812152124u64cd8c9j568b79b0ff6bdb4e@mail.gmail.com> <49473F90.1020200@ar.media.kyoto-u.ac.jp>
Message-ID: <3d375d730812152220j1a1b5afcl257ec698f2ddc2d@mail.gmail.com>

On Mon, Dec 15, 2008 at 23:41, David Cournapeau wrote:
> Robert Kern wrote:
>>
>> Hmm. I dunno. Perhaps gfortran's runtime changed a branch cut for
>> CDLOG() between 4.2 and 4.3. That would explain the +/- pi*I in the
>> output.
>
> Actually, on my machine, the first goes through the code path with
> CDLOG, and the other one with the code path using CDEXP.

Ah, I think I found it using this clue. It's a bug in SPECFUN. The
"IMPLICIT DOUBLE PRECISION" statement is missing "A" so A0 is REAL
rather than DOUBLE. Fixing that makes both of them go through the same
code path. Can you change the line to this:

      IMPLICIT DOUBLE PRECISION (A,D-H,O-Y)

in your specfun.f file, and rebuild scipy?

> In that case, I have the same values for the imaginary parts of ra and
> rb (I have both gfortran 4.3 and 4.2, and tried both, no difference,
> even with the optimization flags -O3 -funroll-loops as used by
> distutils). I noticed that both values use different code paths, though
> (the one using CDLOG for a, the one using CDEXP for b).
>
> I don't know why there is a discrepancy between this fortran program and
> scipy - can the C/F transition cause this ?

I don't think so.

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
 -- Umberto Eco

From millman at berkeley.edu  Tue Dec 16 01:33:33 2008
From: millman at berkeley.edu (Jarrod Millman)
Date: Mon, 15 Dec 2008 22:33:33 -0800
Subject: [SciPy-user] ANN: SciKits Portal
In-Reply-To: <9457e7c80812150702y20c71d79jc091de0bdf38d9bc@mail.gmail.com>
References: <9457e7c80812150702y20c71d79jc091de0bdf38d9bc@mail.gmail.com>
Message-ID:

On Mon, Dec 15, 2008 at 7:02 AM, St?fan van der Walt wrote:
> In recent discussions it came to light that SciKits, our equivalent of
> toolboxes or packages, fulfil an important role in the SciPy
> life-cycle. First, it is a staging ground for code to be included in
> SciPy, where code can mature in view of community feedback. Second,
> it provides a home for packages with licenses or scope different to
> that of the main SciPy distribution.

Much thanks! This will hopefully be a great contribution to making
scikits a more useful and exciting area for the scipy community.

-- 
Jarrod Millman
Computational Infrastructure for Research Labs
10 Giannini Hall, UC Berkeley
phone: 510.643.4014
http://cirl.berkeley.edu/

From david at ar.media.kyoto-u.ac.jp  Tue Dec 16 03:34:53 2008
From: david at ar.media.kyoto-u.ac.jp (David Cournapeau)
Date: Tue, 16 Dec 2008 17:34:53 +0900
Subject: [SciPy-user] exponential integral behaviour near -20
In-Reply-To: <3d375d730812152220j1a1b5afcl257ec698f2ddc2d@mail.gmail.com>
References: <20081215233939.AOI89750@mp2.nettest.usna.edu> <3d375d730812152102t284ad8efi8e8f8c8d74b9c3a9@mail.gmail.com> <4947355D.20607@ar.media.kyoto-u.ac.jp> <3d375d730812152124u64cd8c9j568b79b0ff6bdb4e@mail.gmail.com> <49473F90.1020200@ar.media.kyoto-u.ac.jp> <3d375d730812152220j1a1b5afcl257ec698f2ddc2d@mail.gmail.com>
Message-ID: <4947682D.6010100@ar.media.kyoto-u.ac.jp>

Robert Kern wrote:
>
> Ah, I think I found it using this clue. It's a bug in SPECFUN. The
> "IMPLICIT DOUBLE PRECISION" statement is missing "A" so A0 is REAL
> rather than DOUBLE. Fixing that makes both of them go through the same
> code path. Can you change the line to this:
>
>       IMPLICIT DOUBLE PRECISION (A,D-H,O-Y)
>
> in your specfun.f file, and rebuild scipy?
>

Sorry for the delay: you're right, this seems to fix the problem, at
least for me.
The example now gives me:

(25615628.4058-3.14159265359j)
(25615630.8316-3.14159265359j)

cheers,

David

From eads at soe.ucsc.edu  Tue Dec 16 04:13:35 2008
From: eads at soe.ucsc.edu (Damian Eads)
Date: Tue, 16 Dec 2008 02:13:35 -0700
Subject: [SciPy-user] ANN: SciKits Portal
In-Reply-To: <9457e7c80812150702y20c71d79jc091de0bdf38d9bc@mail.gmail.com>
References: <9457e7c80812150702y20c71d79jc091de0bdf38d9bc@mail.gmail.com>
Message-ID: <91b4b1ab0812160113g51feba17se61bbfee8b9d5dd1@mail.gmail.com>

Hi Stefan,

This is very nice work and deeply needed. A portal such as this will
give SciKits a more professional presentation, and will build confidence
in our prospective users. It will also make SciKits more accessible,
easy-to-install and standardized, putting less burden on users. Janto
Dreijer is off to a great start.

A few comments:

1. Are you going to put the source used to generate the general pages
(e.g. About SciKits) in SVN? This way we can all collectively wordsmith
so that the text on the SciKits web site is the best it can be.

2. Replace "Scipy" and "scipy" with "SciPy" when referring to the SciPy
tool suite. Use "scipy" (all lowercase) in a fixed-width font when
referring to the Python package name "scipy".

3. I would include a quick, one paragraph blurb on the main page
describing the purpose of the website. There should be no mention about
reasons why a package is a SciKit and not part of the main SciPy because
most users won't care. Here's a rough idea.

"Welcome to the SciKits Repository, a searchable index of optional,
add-on toolkits (called SciKits) for the SciPy platform. SciKits share
the common ``scikits`` namespace but are independently managed and
licensed under terms of the owner's choosing. SciKits cover a broad
spectrum of application domains including financial computation, audio
processing, geosciences, computer vision, engineering, machine learning,
medical computing, and bioinformatics. Each SciKit has a detail page
with a description, its licensing terms, installation instructions, and
documentation."

4. Will the front page include a chronologically sorted list of the
latest releases of SciKits, similar to mloss, freshmeat, PyPi, or
SourceForge?

5. You mention that you have a Google Web App that scans the SciKits
trunk. Perhaps as SciKits grows, having all the SciKits in one SVN
repository will become unmanageable? Also, are there plans to have an
administrative interface for developers to edit information for their
packages and announce releases? Will the web app provide all the
flexibility we need in this regard?

6. The "About" page should start with a motivation paragraph.

"SciPy has emerged as a common platform for developing scientific
applications in Python. Since its inception, a large number of packages
have been created by a growing and diverse community of developers. Many
of these packages cannot be included in SciPy because either they are
too specialized, they are subject to licensing terms deemed incompatible
with SciPy, they have too many dependencies, or they do not yet meet the
standards of consistency or maturity for inclusion. A strong need has
arisen for better infrastructure to organize this growing corpus of
third-party SciPy packages into a cohesive framework. The SciKits
Repository has been founded to address this need."

7. A "reason" cannot use a namespace. Packages use namespaces. :-)

8. The page is a bit too oversectioned. "About the implementation"
should not be a subsection of "About the listing".
I would even argue that users might be turned away by the section "About
the implementation" because few are going to care how the underlying
index is implemented. The motivation "The goal is to introduce people to
scikits packages that are relevant to them and get them to where they
need to be fast" is obvious. This ideal is universal to all software
directory listings and search engines, i.e. infer the user's information
needs and provide information consistent with those needs.

9. Perhaps have two sections, Users and Developers? The text below is
some half-baked text I wrote that we could use as a starting point.

Users
-----

The SciKits Repository is a searchable, multi-level index of SciKit
packages. All SciKits share a common, top-level Python package namespace
called ``scikits``. Each SciKit has a detail page with a description,
licensing terms, download and installation instructions, links to
documentation, information on SVN access, and instructions on running
its regression tests (if available).

Most SciKits are available for download on the Python Package Index
(PyPi), see http://pypi.python.org. Some SciKits are packaged on Ubuntu,
Fedora, Debian, Red Hat, and other operating systems. See the detail
information for more information on choices available for installing a
particular SciKit.

Note that SciKits are not all offered under the same licensing terms.
Some SciKits are offered under the terms of the BSD or LGPL while others
are copyleft, distributed under the GPL or other licenses.

Developers
----------

The SciPy community encourages developers to use the SciKits
infrastructure to maintain cohesiveness. By publishing your package as a
SciKit, interested users can more easily access it in a single place. By
following the optional guidelines established by the SciKits community,
consistency is improved, making SciKit packages easier to use and
strengthening the user base.

Developers can create a new SciKit by submitting a new Project Request
[link]. A project name, description, desired Python package name, and
licensing terms are provided. Once approved, a project detail page is
created and the ``scikits`` Python package name is reserved for the
project's use. Package maintainers can log in to announce releases,
change information on their project's detail page, or set permissions of
their developers.

If needed, developers can also request SVN hosting for their packages.
Using version control provided by SourceForge, Google Code, Savannah, or
another software hosting site is also acceptable.

Developers are encouraged to write their function and class
documentation using ReStructuredText. ReStructuredText has been adopted
by a wide variety of popular Python packages including SciPy itself. A
brief tutorial on writing documentation in ReStructuredText is available
here.

To maintain consistency, templates are provided for common development
tasks.

* [link: Build Scripts] contains build script templates for basic
  building and making C extensions. Instructions are provided on how to
  create easy_installer scripts and Windows installers.
* [link: ReStructuredText]: instructions on writing API documentation in
  ReStructuredText format.
* [link: Documentation] templates for creating API documentation,
  tutorials, and user's guides for Sphinx.

-----

Again, nice work. I look forward to seeing how the SciKits website
evolves.
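P.S. As a rough illustration of the ReStructuredText documentation style
meant in the Developers section (a hypothetical function, following the
NumPy docstring layout):

    import numpy as np

    def smooth(x, window=5):
        """Smooth a 1-D array with a simple moving average.

        Parameters
        ----------
        x : ndarray
            Input signal.
        window : int, optional
            Width of the averaging window (default 5).

        Returns
        -------
        ndarray
            The smoothed signal, same length as ``x``.
        """
        kernel = np.ones(window) / float(window)
        return np.convolve(x, kernel, mode='same')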
Cheers, Damian On Mon, Dec 15, 2008 at 8:02 AM, Stéfan van der Walt wrote: > Dear SciPy Community > > In recent discussions it came to light that SciKits, our equivalent of > toolboxes or packages, fulfil an important role in the SciPy > life-cycle. First, they are a staging ground for code to be included in > SciPy, where code can mature in view of community feedback. Second, > they provide a home for packages with licenses or scope different to > that of the main SciPy distribution. > > David Cournapeau and I thought that a SciKits portal, similar in > spirit to the GNU/R Package Repository [1] or the mloss.org site [2], > would be a good way to market SciKits and to promote them for general > use by the community. The Division of Applied Mathematics at > Stellenbosch University donated some funds, allowing us to hire Janto > Dreijer to bring this idea to fruition. The resulting > work-in-progress can be found here: > > http://scikits.appspot.com > > Features include: > > - Aggregation of information on SciKits, based on PyPI entries > - Short, permanent URLs to refer to packages, e.g. > http://scikits.appspot.com/example or http://scikits.appspot.com/ann > (the domain should become scikits.scipy.org). > - Automatically generated (and templated) installation instructions > - ReStructuredText rendering of package descriptions as provided by > authors (currently disabled) > - Searchable package listing (weighted algorithm to be implemented) > > Please note that the current text snippets are just place-holders, and > that they can be changed by any person with the appropriate > permissions. > > We would like to invite the community to give feedback, so that we can > improve the framework further and make a useful contribution. > > Thank you for reading, and happy hacking! > Stéfan > > [1] http://cran.r-project.org/web/packages/ > [2] http://mloss.org/software/language/python/ > _______________________________________________ > SciPy-user mailing list > SciPy-user at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-user > -- ----------------------------------------------------- Damian Eads Ph.D. Student Jack Baskin School of Engineering, UCSC E2-489 1156 High Street Machine Learning Lab Santa Cruz, CA 95064 http://www.soe.ucsc.edu/~eads From cimrman3 at ntc.zcu.cz Tue Dec 16 04:55:59 2008 From: cimrman3 at ntc.zcu.cz (Robert Cimrman) Date: Tue, 16 Dec 2008 10:55:59 +0100 Subject: [SciPy-user] ANN: SciKits Portal In-Reply-To: <9457e7c80812150702y20c71d79jc091de0bdf38d9bc@mail.gmail.com> References: <9457e7c80812150702y20c71d79jc091de0bdf38d9bc@mail.gmail.com> Message-ID: <49477B2F.4090807@ntc.zcu.cz> Stéfan van der Walt wrote: > Dear SciPy Community > > In recent discussions it came to light that SciKits, our equivalent of > toolboxes or packages, fulfil an important role in the SciPy > life-cycle. First, they are a staging ground for code to be included in > SciPy, where code can mature in view of community feedback. Second, > they provide a home for packages with licenses or scope different to > that of the main SciPy distribution. > > David Cournapeau and I thought that a SciKits portal, similar in > spirit to the GNU/R Package Repository [1] or the mloss.org site [2], > would be a good way to market SciKits and to promote them for general > use by the community. The Division of Applied Mathematics at > Stellenbosch University donated some funds, allowing us to hire Janto > Dreijer to bring this idea to fruition.
The resulting > work-in-progress can be found here: > > http://scikits.appspot.com > > Features include: > > - Aggregation of information on SciKits, based on PyPI entries > - Short, permanent URLs to refer to packages, e.g. > http://scikits.appspot.com/example or http://scikits.appspot.com/ann > (the domain should become scikits.scipy.org). > - Automatically generated (and templated) installation instructions > - ReStructuredText rendering of package descriptions as provided by > authors (currently disabled) > - Searchable package listing (weighted algorithm to be implemented) > > Please note that the current text snippets are just place-holders, and > that they can be changed by any person with the appropriate > permissions. > > We would like to invite the community to give feedback, so that we can > improve the framework further and make a useful contribution. > > Thank you for reading, and happy hacking! Great job! Do you plan to add a link to the scipy.org FrontPage? There is no mention of SciKits there. thank you, r. From millman at berkeley.edu Tue Dec 16 05:01:56 2008 From: millman at berkeley.edu (Jarrod Millman) Date: Tue, 16 Dec 2008 02:01:56 -0800 Subject: [SciPy-user] ANN: SciKits Portal In-Reply-To: <49477B2F.4090807@ntc.zcu.cz> References: <9457e7c80812150702y20c71d79jc091de0bdf38d9bc@mail.gmail.com> <49477B2F.4090807@ntc.zcu.cz> Message-ID: On Tue, Dec 16, 2008 at 1:55 AM, Robert Cimrman wrote: > Great job! Do you plan to add a link to the scipy.org FrontPage? There > is no mention of SciKits there. Absolutely. Once they have gotten the kinks knocked out and have had some feedback about the look and feel, we should publicize it more widely and put a link to it on the main scipy page. Thanks, -- Jarrod Millman Computational Infrastructure for Research Labs 10 Giannini Hall, UC Berkeley phone: 510.643.4014 http://cirl.berkeley.edu/ From david at ar.media.kyoto-u.ac.jp Tue Dec 16 04:55:20 2008 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Tue, 16 Dec 2008 18:55:20 +0900 Subject: [SciPy-user] ANN: SciKits Portal In-Reply-To: <91b4b1ab0812160113g51feba17se61bbfee8b9d5dd1@mail.gmail.com> References: <9457e7c80812150702y20c71d79jc091de0bdf38d9bc@mail.gmail.com> <91b4b1ab0812160113g51feba17se61bbfee8b9d5dd1@mail.gmail.com> Message-ID: <49477B08.50407@ar.media.kyoto-u.ac.jp> Hi Damian, Damian Eads wrote: > 1. Are you going to put the source used to generate the general pages > (e.g. About SciKits) in SVN? This way we can all collectively > wordsmith so that the text on the SciKits web site is the best it can be. > > 2. Replace "Scipy" and "scipy" with "SciPy" when referring to the > SciPy tool suite. Use "scipy" (all lowercase) in a fixed-width font > when referring to the Python package name "scipy". > I think it would be nice to have this noted somewhere for everything around scipy, actually. > 3. I would include a quick, one-paragraph blurb on the main page > describing the purpose of the website. There should be no mention > of the reasons why a package is a SciKit and not part of the main SciPy > because most users won't care. Here's a rough idea. > > "Welcome to the SciKits Repository, a searchable index of optional, > add-on toolkits (called SciKits) for the SciPy platform. SciKits share > the common ``scikits`` namespace but are independently managed and > licensed under terms of the owner's choosing.
SciKits cover a broad > spectrum of application domains including financial computation, audio > processing, geosciences, computer vision, engineering, machine > learning, medical computing, and bioinformatics. Each SciKit has a detail > page with its description, licensing terms, installation > instructions, and documentation." > I think this is a nice description. Maybe we could also mention that they are our own set of toolboxes à la R? > 4. Will the front page include a chronologically sorted list of the > latest releases of SciKits, similar to mloss, freshmeat, PyPI, or > SourceForge? > One thing we discussed with Stefan was the possibility of adding an RSS feed entry every time a new release of any scikit is added; would that fit what you had in mind? > 5. You mention that you have a Google Web App that scans the SciKits > trunk. Perhaps as SciKits grows, having all the SciKits in one SVN > repository will become unmanageable? I don't think that the trunk is scanned, only PyPI is? I agree one SVN repository is not sustainable if many packages are created (if only from a purely access-rights POV), but I don't think there is any constraint on that from the webapp POV. But Stéfan and Janto are the ones who did all the work, they would know better. Ideally, what would be nice for the developer corner is a build farm which could generate binary packages automatically; this would not work for anything complicated, but would be enough for packages without external dependencies, which is most of the current scikits: most scipy/numpy developers do not use Windows, for example, but a lot of (most?) users do, so a simple Windows buildbot to generate binary installers would be a good plus, I believe. I have some other ideas about this, and once my PhD duties are over, I hope to work on that. cheers, David From millman at berkeley.edu Tue Dec 16 05:32:07 2008 From: millman at berkeley.edu (Jarrod Millman) Date: Tue, 16 Dec 2008 02:32:07 -0800 Subject: [SciPy-user] ANN: SciKits Portal In-Reply-To: <49477B08.50407@ar.media.kyoto-u.ac.jp> References: <9457e7c80812150702y20c71d79jc091de0bdf38d9bc@mail.gmail.com> <91b4b1ab0812160113g51feba17se61bbfee8b9d5dd1@mail.gmail.com> <49477B08.50407@ar.media.kyoto-u.ac.jp> Message-ID: On Tue, Dec 16, 2008 at 1:55 AM, David Cournapeau wrote: >> 3. I would include a quick, one-paragraph blurb on the main page >> describing the purpose of the website. There should be no mention >> of the reasons why a package is a SciKit and not part of the main SciPy >> because most users won't care. Here's a rough idea. >> >> "Welcome to the SciKits Repository, a searchable index of optional, >> add-on toolkits (called SciKits) for the SciPy platform. SciKits share >> the common ``scikits`` namespace but are independently managed and >> licensed under terms of the owner's choosing. SciKits cover a broad >> spectrum of application domains including financial computation, audio >> processing, geosciences, computer vision, engineering, machine >> learning, medical computing, and bioinformatics. Each SciKit has a detail >> page with its description, licensing terms, installation >> instructions, and documentation." >> > > I think this is a nice description. Maybe we could also mention that they are > our own set of toolboxes à la R? Damian's paragraph blurb looks good; I would make one change to the licensing terms to make it clear that all SciKits are open source.
Rather than stating that "independently managed and licensed under terms of the owner's choosing" it should say something like "independently managed and licensed under an OSI-approved open source license (http://www.opensource.org/)." From wdj at usna.edu Tue Dec 16 06:51:59 2008 From: wdj at usna.edu (David Joyner) Date: Tue, 16 Dec 2008 06:51:59 -0500 (EST) Subject: [SciPy-user] exponential integral behaviour near -20 Message-ID: <20081216065159.AOI94047@mp2.nettest.usna.edu> Thanks very much Robert! ---- Original message ---- >Date: Tue, 16 Dec 2008 00:20:44 -0600 >From: "Robert Kern" >Subject: Re: [SciPy-user] exponential integral behaviour near -20 >To: "SciPy Users List" > >On Mon, Dec 15, 2008 at 23:41, David Cournapeau > wrote: >> Robert Kern wrote: >>> >>> Hmm. I dunno. Perhaps gfortran's runtime changed a branch cut for >>> CDLOG() between 4.2 and 4.3. That would explain the +/- pi*I in the >>> output. >> >> Actually, on my machine, the first goes through the code path with >> CDLOG, and the other one with the code path using CDEXP. > >Ah, I think I found it using this clue. It's a bug in SPECFUN. The >"IMPLICIT DOUBLE PRECISION" statement is missing "A", so A0 is REAL >rather than DOUBLE. Fixing that makes both of them go through the same >code path. Can you change the line to this: > > IMPLICIT DOUBLE PRECISION (A,D-H,O-Y) > >in your specfun.f file, and rebuild scipy? > >> In that case, I have the same values for ra and rb complex parts (I have >> both gfortran 4.3 and 4.2, and tried both, no difference, even with >> optimization flags -O3 -funroll-loops as used by distutils). I noticed >> that both values use different code paths, though (the one using CDLOG >> for a, the one using CDEXP for b). >> >> I don't know why there is a discrepancy between this fortran program and >> scipy - can the C/F transition cause this ? > >I don't think so. > >-- >Robert Kern > >"I have come to believe that the whole world is an enigma, a harmless >enigma that is made terrible by our own mad attempt to interpret it as >though it had an underlying truth." > -- Umberto Eco >_______________________________________________ >SciPy-user mailing list >SciPy-user at scipy.org >http://projects.scipy.org/mailman/listinfo/scipy-user From jh at physics.ucf.edu Tue Dec 16 09:46:50 2008 From: jh at physics.ucf.edu (jh at physics.ucf.edu) Date: Tue, 16 Dec 2008 09:46:50 -0500 Subject: [SciPy-user] ANN: SciKits Portal In-Reply-To: (scipy-user-request@scipy.org) References: Message-ID: I've added a section called New Code under the Source Code heading of http://scipy.org/Developer_Zone that describes (in the vaguest terms) SciKits and the process of bringing new code into SciPy. Mainly I wanted to get the URL to the SciKits site somewhere not-prominent on the site so that we don't have to go delving through our old email to find it before it goes live. I think this should be OK since you've already announced it on the mailing list. Right now it says the site is up for comment but is not yet taking new code. Stefan & Co., feel free to put in your names, Stellenbosch U., etc., or not, and edit otherwise as you see fit.
--jh-- From lepto.python at gmail.com Wed Dec 17 01:30:15 2008 From: lepto.python at gmail.com (oyster) Date: Wed, 17 Dec 2008 14:30:15 +0800 Subject: [SciPy-user] scipy on old CPU crashes (David Cournapeau) Message-ID: <6a4f17690812162230l1947c316sf8cd062bd564670a@mail.gmail.com> > Date: Thu, 4 Dec 2008 12:15:27 +0900 > From: "David Cournapeau" > On Wed, Dec 3, 2008 at 4:06 PM, David Cournapeau wrote: > > On Tue, Dec 2, 2008 at 11:52 PM, David Cournapeau wrote: > >> On Tue, Dec 2, 2008 at 4:36 PM, oyster wrote: > >>> sorry, but scipy-0.7.0b1-win32-superpack-python2.4.exe and > >>> numpy-1.2.1-win32-superpack-python2.4.exe crash on my old pc too, > >>> which uses duron 750MHz. So now I think it is not the problem with > >>> non-sse/sse/sse2 instruction > >>> > >> > > > > Ok, I checked the machine code in scipy and it seems that the quadpack > > module (used by scipy.integrate) has a couple of SSE instructions. > > Code-wise, it is trivial to solve, but we may need a new numpy version > > for that. > > > > Could you try this installer ? > > http://www.ar.media.kyoto-u.ac.jp/members/david/archives/scipy/scipy-0.7.0.dev5213-win32-superpack-python2.4.exe > > This one should hopefully contains no SSE instructions at all for old > CPU (the _quadpack.pyd contains no SSE2 instructions anymore), and > should work. > > David I am back and this build works on my old DURON. Thank you very much and I hope the next official release has the correct build too ^-^ [code] >>> i.quad(lambda x:x,1,3) (4.0, 4.4408920985006262e-014) [/code] CU Lee June From david at ar.media.kyoto-u.ac.jp Wed Dec 17 02:20:26 2008 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Wed, 17 Dec 2008 16:20:26 +0900 Subject: [SciPy-user] scipy on old CPU crashes (David Cournapeau) In-Reply-To: <6a4f17690812162230l1947c316sf8cd062bd564670a@mail.gmail.com> References: <6a4f17690812162230l1947c316sf8cd062bd564670a@mail.gmail.com> Message-ID: <4948A83A.70006@ar.media.kyoto-u.ac.jp> oyster wrote: > I am back > and this build works on my old DURON. Thank you very much and I hope > the next official release has the correct build too ^-^ > Yes, it will, David From alexander.borghgraef.rma at gmail.com Wed Dec 17 04:48:02 2008 From: alexander.borghgraef.rma at gmail.com (Alexander Borghgraef) Date: Wed, 17 Dec 2008 10:48:02 +0100 Subject: [SciPy-user] Indexing question Message-ID: <9e8c52a20812170148s7e54398bn39fd786864c8cf6d@mail.gmail.com> Hi all, I'm doing some image processing work in scipy, and I have a question regarding indexing 3d arrays. My work is on particle filters (for a simple example see the cookbook entry I wrote: http://www.scipy.org/Cookbook/ParticleFilter), which involves evaluating a list of points (coordinates) in an image (2d array of shape (height, width)). In my code I represent the list of n points x as an array of shape (2, n) and I get the corresponding array of n image intensity values by writing: features = im[tuple(x)] Now I would like to extend the code to deal with vector images of shape (2, height, width), but I'm not sure on how to do the evaluation of coordinates here, which should in this case return a list of vectors in the form of an (2, n) array. I tried: features = im[:, tuple(x.T)] but that was obviously naive and doesn't work. An old-fashioned solution would be: features = [] for point in x.T: features.append(im[:, point[0], point[1]]) features = array(features) But clearly I would prefer to work with the indexing features of numpy. 
The following works, but it's a pretty ugly hack, and it's not extendable to length-m vectors: features = vstack( ( im[ tuple( vstack( (zeros(n, int), x) ) ) ], im[ tuple( vstack( (ones(n, int), x) ) ) ] ) ) So, can any of you more knowledgeable regarding numpy indexing point me towards a more elegant solution? -- Alex Borghgraef From stefan at sun.ac.za Wed Dec 17 09:27:33 2008 From: stefan at sun.ac.za (Stéfan van der Walt) Date: Wed, 17 Dec 2008 16:27:33 +0200 Subject: [SciPy-user] ANN: SciKits Portal In-Reply-To: <91b4b1ab0812160113g51feba17se61bbfee8b9d5dd1@mail.gmail.com> References: <9457e7c80812150702y20c71d79jc091de0bdf38d9bc@mail.gmail.com> <91b4b1ab0812160113g51feba17se61bbfee8b9d5dd1@mail.gmail.com> Message-ID: <9457e7c80812170627k525bae05o9592772ca4a55d7a@mail.gmail.com> Hi Damian Thanks for the thorough feedback! 2008/12/16 Damian Eads : > 1. Are you going to put the source used to generate the general pages > (e.g. About SciKits) in SVN? This way we can all collectively > wordsmith so that the text on the SciKits web site is the best it can be. The code is now hosted at http://code.google.com/p/scikits-index; we'll include a link at the bottom of the page. > 3. I would include a quick, one-paragraph blurb on the main page > describing the purpose of the website. There should be no mention > of the reasons why a package is a SciKit and not part of the main SciPy > because most users won't care. Here's a rough idea. We are working on making all text editable, wiki-style. Send me your Gmail login, then I can add access for you. > "Welcome to the SciKits Repository, a searchable index of optional, > add-on toolkits (called SciKits) for the SciPy platform. SciKits share > the common ``scikits`` namespace but are independently managed and > licensed under terms of the owner's choosing. SciKits cover a broad > spectrum of application domains including financial computation, audio > processing, geosciences, computer vision, engineering, machine > learning, medical computing, and bioinformatics. Each SciKit has a detail > page with its description, licensing terms, installation > instructions, and documentation." I added some version of your text. You'll soon be able to update it yourself. My main goal is to keep the front page as simple as possible -- I want the text to remain short and to-the-point, so that people won't be intimidated by SciKits. > 4. Will the front page include a chronologically sorted list of the > latest releases of SciKits, similar to mloss, freshmeat, PyPI, or > SourceForge? Will be done this week. > 5. You mention that you have a Google Web App that scans the SciKits > trunk. Perhaps as SciKits grows, having all the SciKits in one SVN > repository will become unmanageable? Also, are there plans to have an > administrative interface for developers to edit information for their > packages and announce releases? Will the web app provide all the > flexibility we need in this regard? The SciKits can be hosted wherever people want. We just scan the PyPI index for info. You therefore update your package by editing setup.py and running the appropriate PyPI registration command.
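For example, assuming a scikit with a conventional setuptools-based setup.py (the package name and metadata below are made up for illustration), the workflow is roughly::

    # setup.py -- illustrative sketch only
    from setuptools import setup

    setup(name='scikits.example',
          version='0.1',
          description='An example scikit',
          license='BSD',
          packages=['scikits', 'scikits.example'],
          namespace_packages=['scikits'],
          )

and then

    python setup.py register sdist upload

creates or refreshes the PyPI entry that we scan.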
Regards, Stéfan From amcmorl at gmail.com Wed Dec 17 10:52:57 2008 From: amcmorl at gmail.com (Angus McMorland) Date: Wed, 17 Dec 2008 10:52:57 -0500 Subject: [SciPy-user] Indexing question In-Reply-To: <9e8c52a20812170148s7e54398bn39fd786864c8cf6d@mail.gmail.com> References: <9e8c52a20812170148s7e54398bn39fd786864c8cf6d@mail.gmail.com> Message-ID: Hi Alex, 2008/12/17 Alexander Borghgraef : > Hi all, > > I'm doing some image processing work in scipy, and I have a question > regarding indexing 3d arrays. My work is on particle filters (for a > simple example see the cookbook entry I wrote: > http://www.scipy.org/Cookbook/ParticleFilter), which involves > evaluating a list of points (coordinates) in an image (2d array of > shape (height, width)). In my code I represent the list of n points x > as an array of shape (2, n) and I get the corresponding array of n > image intensity values by writing: > > features = im[tuple(x)] > > Now I would like to extend the code to deal with vector images of > shape (2, height, width), but I'm not sure on how to do the evaluation > of coordinates here, which should in this case return a list of > vectors in the form of an (2, n) array. I tried: > > features = im[:, tuple(x.T)] > > but that was obviously naive and doesn't work. An old-fashioned > solution would be: > > features = [] > for point in x.T: > features.append(im[:, point[0], point[1]]) > features = array(features) > > But clearly I would prefer to work with the indexing features of > numpy. The following works, but it's a pretty ugly hack, and it's not > extendable to length-m vectors: > > features = vstack( ( im[ tuple( vstack( (zeros(n, int), x) ) ) ], > im[ tuple( vstack( (ones(n, int), x) ) ) ] ) ) > > So, can any of you more knowledgeable regarding numpy indexing point > me towards a more elegant solution? > > -- > Alex Borghgraef There are two tricks that will help you out here: (1) indexing with a list is treated differently to indexing with an array (2) indexing across the first consecutive dimensions and extracting latter ones is easier than extracting earlier dimensions together based on indices in latter dimensions So, this works: import numpy as np sz_x = 100 sz_y = 120 n_pts = 20 n_vec_dims = 2 # make a fake vector image im = np.random.random(size=(n_vec_dims, sz_x, sz_y)) # make some fake indices to extract id_x = np.random.random_integers(size=n_pts, low=0, high=sz_x - 1) id_y = np.random.random_integers(size=n_pts, low=0, high=sz_y - 1) ids = np.concatenate((id_x[:,None], id_y[:,None]), axis=1) # here's the important bit... pts = im.transpose(1,2,0)[list(ids.T)] # and a check that it's working print np.all(pts[0] == im[:,id_x[0], id_y[0]]) I hope that's what you were after, Angus. -- AJC McMorland Post-doctoral research fellow Neurobiology, University of Pittsburgh From gaedol at gmail.com Wed Dec 17 11:53:41 2008 From: gaedol at gmail.com (Marco) Date: Wed, 17 Dec 2008 17:53:41 +0100 Subject: [SciPy-user] stupid question about indexes Message-ID: Hi all, I have a stupid question about indexes: I have an N-by-M array. I want to look at row 0, do operations on its elements, then move to row 1, do operations on its elements, and so on... How should I set the boundaries for a for loop? (as of now I have been working only with square matrices, and there I used range(len()), but here range(len()) differs from one axis to the other!!)
TIA, m From gaedol at gmail.com Wed Dec 17 11:59:33 2008 From: gaedol at gmail.com (Marco) Date: Wed, 17 Dec 2008 17:59:33 +0100 Subject: [SciPy-user] stupid question about indexes In-Reply-To: References: Message-ID: Nevermind, just done it. :) sorry, m On Wed, Dec 17, 2008 at 5:53 PM, Marco wrote: > Hi all, > > I have a stupid question about indexes: I have an N-by-M array. > I want to look at row 0, do operations on its elements, then move to > row 1, do operations on its elements, and so on... > > How should I set the boundaries for a for loop? (as of now I have > been working only with square matrices, and there I used range(len()), > but here range(len()) differs from one axis to the other!!) > > TIA, > > m > From timmichelsen at gmx-topmail.de Wed Dec 17 19:58:40 2008 From: timmichelsen at gmx-topmail.de (Tim Michelsen) Date: Thu, 18 Dec 2008 01:58:40 +0100 Subject: [SciPy-user] information on statistical functions Message-ID: Hello, I observed that there are 2 standard deviation functions in the scipy/numpy modules: Numpy: http://docs.scipy.org/doc/numpy/reference/generated/numpy.std.html#numpy.std Scipy: http://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.std.html#scipy.stats.std What is the difference? There is no formula included within the docstrings. I suppose that np.std() is for the whole population and scipy.std is designed for a smaller sample in the population. Is that true? Are there any functions for calculating the mean bias error (MBE)? I am looking for formula 3 in http://en.wikipedia.org/wiki/Mean_squared_error#Examples The function mbe seems to implement it here: http://cerea.enpc.fr/polyphemus/doc/atmopy/public/atmopy.stat.measure-module.html#mbe I also found an implementation of root mean square error (RMSE), as function rms, in: http://www2-pcmdi.llnl.gov/cdat/source/api-reference/genutil.statistics.html Unfortunately, cdat cannot be installed natively on Windows. Thanks in advance, Timmie From josef.pktd at gmail.com Wed Dec 17 20:53:53 2008 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 17 Dec 2008 20:53:53 -0500 Subject: [SciPy-user] information on statistical functions In-Reply-To: References: Message-ID: <1cd32cbb0812171753i59591902kf65d8159e689fd73@mail.gmail.com> On Wed, Dec 17, 2008 at 7:58 PM, Tim Michelsen wrote: > Hello, > I observed that there are 2 standard deviation functions in the > scipy/numpy modules: > > Numpy: > http://docs.scipy.org/doc/numpy/reference/generated/numpy.std.html#numpy.std > > Scipy: > http://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.std.html#scipy.stats.std > > What is the difference? > There is no formula included within the docstrings. > > I suppose that np.std() is for the whole population and scipy.std is > designed for a smaller sample in the population. > Is that true? The difference between the population (numpy) and sample (scipy.stats) variance and standard deviation is whether the estimator is biased, i.e. 1/n, or not, i.e. 1/(n-1). Look at the description in the source http://docs.scipy.org/scipy/source/scipy/dist/lib64/python2.4/site-packages/scipy/stats/stats.py#1359 for the deprecation warning. See also the distinction in your Wikipedia reference for biased versus unbiased. > > Are there any functions for calculating the mean bias error (MBE)?
> > I am looking for formula 3 in > http://en.wikipedia.org/wiki/Mean_squared_error#Examples I'm not sure what your use case is, but in the referenced 3rd line, the MSE is the theoretical MSE of the estimator and it is not calculated from the sample. Overall, these are one-liners in any matrix/array package. For example, when I do a Monte Carlo for an estimator theta_hat, where theta is the true parameter (a scalar constant) and theta_hat is the array of estimates from the different runs, then the RMSE is just RMSE = np.sqrt(np.sum((theta_hat - theta)**2) / float(n)) For the first Wikipedia example: the MSE for observed Y_i compared to predicted Y_hat_i is just MSE = np.sum((Y - Y_hat)**2) / float(n) and the MBE from your link is just the mean of the signed errors, MBE = np.sum(Y_hat - Y) / float(n) Josef From robert.kern at gmail.com Wed Dec 17 21:03:27 2008 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 17 Dec 2008 20:03:27 -0600 Subject: [SciPy-user] information on statistical functions In-Reply-To: <1cd32cbb0812171753i59591902kf65d8159e689fd73@mail.gmail.com> References: <1cd32cbb0812171753i59591902kf65d8159e689fd73@mail.gmail.com> Message-ID: <3d375d730812171803x265205bax53e00cd25252ebbf@mail.gmail.com> On Wed, Dec 17, 2008 at 19:53, wrote: > On Wed, Dec 17, 2008 at 7:58 PM, Tim Michelsen > wrote: >> Hello, >> I observed that there are 2 standard deviation functions in the >> scipy/numpy modules: >> >> Numpy: >> http://docs.scipy.org/doc/numpy/reference/generated/numpy.std.html#numpy.std >> >> Scipy: >> http://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.std.html#scipy.stats.std >> >> What is the difference? >> There is no formula included within the docstrings. >> >> I suppose that np.std() is for the whole population and scipy.std is >> designed for a smaller sample in the population. >> Is that true? > > The difference between the population (numpy) and sample (scipy.stats) > variance and standard deviation is whether the estimator is > biased, i.e. 1/n, or not, i.e. 1/(n-1). It's a shame that the "biased/unbiased" terminology still survives in the numpy.std() docstring. It's really quite wrong. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From josef.pktd at gmail.com Wed Dec 17 21:32:47 2008 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 17 Dec 2008 21:32:47 -0500 Subject: [SciPy-user] information on statistical functions In-Reply-To: <3d375d730812171803x265205bax53e00cd25252ebbf@mail.gmail.com> References: <1cd32cbb0812171753i59591902kf65d8159e689fd73@mail.gmail.com> <3d375d730812171803x265205bax53e00cd25252ebbf@mail.gmail.com> Message-ID: <1cd32cbb0812171832y535e275fj19331e2d4d065b12@mail.gmail.com> On Wed, Dec 17, 2008 at 9:03 PM, Robert Kern wrote: > On Wed, Dec 17, 2008 at 19:53, wrote: >> On Wed, Dec 17, 2008 at 7:58 PM, Tim Michelsen >> wrote: >>> Hello, >>> I observed that there are 2 standard deviation functions in the >>> scipy/numpy modules: >>> >>> Numpy: >>> http://docs.scipy.org/doc/numpy/reference/generated/numpy.std.html#numpy.std >>> >>> Scipy: >>> http://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.std.html#scipy.stats.std >>> >>> What is the difference? >>> There is no formula included within the docstrings. >>> >>> I suppose that np.std() is for the whole population and scipy.std is >>> designed for a smaller sample in the population. >>> Is that true?
>> >> The difference between the population (numpy) and sample (scipy.stats) >> variance and standard deviation is whether the estimator is >> biased, i.e. 1/n, or not, i.e. 1/(n-1). > > It's a shame that the "biased/unbiased" terminology still survives in > the numpy.std() docstring. It's really quite wrong. > I find talking about biased versus unbiased estimators much clearer than the population-sample distinction, and degrees of freedom might be more descriptive, but its meaning, I guess, relies on knowing about the (asymptotic) distribution of the estimator, which I always forget and have to look up. Josef From robert.kern at gmail.com Wed Dec 17 22:11:11 2008 From: robert.kern at gmail.com (Robert Kern) Date: Wed, 17 Dec 2008 21:11:11 -0600 Subject: [SciPy-user] information on statistical functions In-Reply-To: <1cd32cbb0812171832y535e275fj19331e2d4d065b12@mail.gmail.com> References: <1cd32cbb0812171753i59591902kf65d8159e689fd73@mail.gmail.com> <3d375d730812171803x265205bax53e00cd25252ebbf@mail.gmail.com> <1cd32cbb0812171832y535e275fj19331e2d4d065b12@mail.gmail.com> Message-ID: <3d375d730812171911m4f71ff6btcf9a9cff581254a4@mail.gmail.com> On Wed, Dec 17, 2008 at 20:32, wrote: > On Wed, Dec 17, 2008 at 9:03 PM, Robert Kern wrote: >> On Wed, Dec 17, 2008 at 19:53, wrote: >>> On Wed, Dec 17, 2008 at 7:58 PM, Tim Michelsen >>> wrote: >>>> Hello, >>>> I observed that there are 2 standard deviation functions in the >>>> scipy/numpy modules: >>>> >>>> Numpy: >>>> http://docs.scipy.org/doc/numpy/reference/generated/numpy.std.html#numpy.std >>>> >>>> Scipy: >>>> http://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.std.html#scipy.stats.std >>>> >>>> What is the difference? >>>> There is no formula included within the docstrings. >>>> >>>> I suppose that np.std() is for the whole population and scipy.std is >>>> designed for a smaller sample in the population. >>>> Is that true? >>> >>> The difference between the population (numpy) and sample (scipy.stats) >>> variance and standard deviation is whether the estimator is >>> biased, i.e. 1/n, or not, i.e. 1/(n-1). >> >> It's a shame that the "biased/unbiased" terminology still survives in >> the numpy.std() docstring. It's really quite wrong. >> > > I find talking about biased versus unbiased estimators much clearer > than the population-sample distinction, and degrees of freedom might > be more descriptive, but its meaning, I guess, relies on knowing about > the (asymptotic) distribution of the estimator, which I always forget > and have to look up. The problem is that the "unbiased" estimate for the standard deviation is *not* the square root of the "unbiased" estimate for the variance. The latter is what numpy.std(x, ddof=1) calculates, not the former. This problem arises because of a pretty narrow concept of "error" that gets misapplied to the variance estimator. The usual "error" that gets used is the arithmetic difference between the estimate and the true value of the parameter (p_est - p_true). For parameters like means, this is usually fine, but for so-called scale parameters like variance, it's quite inappropriate. For example, the arithmetic error between a true value of 1.0 (in whatever units) and an estimate of 2.0 is the same as that between 101.0 and 102.0. When you drop a square root into that formula, you don't get the same answers out when you seek the estimator that sets the bias to 0.
That way, 1.0 and 2.0 would be the same distance from each other as 100.0 and 200.0. Using this measure of error to define bias, the unbiased estimate of the standard deviation actually is the square root of the unbiased estimate of the variance, too, thanks to the magic of logarithms. Unfortunately for those who want to call the (n-1) version "unbiased", the unbiased estimator (for normal distributions, at least) uses (n-2). Oops! Other distributions have different optimal denominators: heavier-tailed distributions tend towards (n-3), finite-support distributions tend towards (n-1). But of course, bias is not the only thing to be concerned about. The bias is just the arithmetic average of the errors. If you want to minimize the total spread of the errors sum(abs(err)), too, that's another story. With the arithmetic error metric, the unbiased estimator of the variance uses (n-1) while the estimator with the smallest total error uses (n). With the log-ratio error metric, the unbiased estimator is the same as the one that minimizes the total error. Happy days! I also find the population/sample distinction to be bogus, too. IIRC, there are even some sources which switch around the meanings, too. In any case, the docstring should have a section saying, "If you are looking for what is called the "unbiased" or "sample" estimate of variance, use ddof=1." Those terms are widely, if incorrectly, used, so we should mention them. I just find it disheartening that the terms are used without qualification. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From josef.pktd at gmail.com Wed Dec 17 23:26:02 2008 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Wed, 17 Dec 2008 23:26:02 -0500 Subject: [SciPy-user] information on statistical functions In-Reply-To: <3d375d730812171911m4f71ff6btcf9a9cff581254a4@mail.gmail.com> References: <1cd32cbb0812171753i59591902kf65d8159e689fd73@mail.gmail.com> <3d375d730812171803x265205bax53e00cd25252ebbf@mail.gmail.com> <1cd32cbb0812171832y535e275fj19331e2d4d065b12@mail.gmail.com> <3d375d730812171911m4f71ff6btcf9a9cff581254a4@mail.gmail.com> Message-ID: <1cd32cbb0812172026n7c68a672n8da6e9e7ecad5151@mail.gmail.com> On Wed, Dec 17, 2008 at 10:11 PM, Robert Kern wrote: > > The problem is that the "unbiased" estimate for the standard deviation > is *not* the square root of the "unbiased" estimate for the variance. > The latter is what numpy.std(x, ddof=1) calculates, not the former. > This problem arises because of a pretty narrow concept of "error" that > gets misapplied to the variance estimator. The usual "error" that gets > used is the arithmetic difference between the estimate and the true > value of the parameter (p_est - p_true). For parameters like means, > this is usually fine, but for so-called scale parameters like > variance, it's quite inappropriate. For example, the arithmetic error > between a true value of 1.0 (in whatever units) and an estimate of 2.0 > is the same as that between 101.0 and 102.0. When you drop a square > root into that formula, you don't get the same answers out when you > seek the estimator that sets the bias to 0. Old habits lead me astray, in response to your previous email, I checked the docs for variance not for standard deviation. I never looked at the statistical properties of estimators of the standard deviation only those of variance estimators. 
I learned the unbiased estimator as a contrast to the maximum likelihood estimator for the variance. So, thank you for the clarification; I guess it is the same story with scale parameter estimation when fitting distributions. > > Rather, a much more appropriate error measure for variance would be > the log-ratio: log(p_est/p_true). That way, 1.0 and 2.0 would be the > same distance from each other as 100.0 and 200.0. Using this measure > of error to define bias, the unbiased estimate of the standard > deviation actually is the square root of the unbiased estimate of the > variance, too, thanks to the magic of logarithms. Unfortunately for > those who want to call the (n-1) version "unbiased", the unbiased > estimator (for normal distributions, at least) uses (n-2). Oops! Other > distributions have different optimal denominators: heavier-tailed > distributions tend towards (n-3), finite-support distributions tend > towards (n-1). > > But of course, bias is not the only thing to be concerned about. The > bias is just the arithmetic average of the errors. If you want to > minimize the total spread of the errors sum(abs(err)), too, that's > another story. With the arithmetic error metric, the unbiased > estimator of the variance uses (n-1) while the estimator with the > smallest total error uses (n). With the log-ratio error metric, the > unbiased estimator is the same as the one that minimizes the total > error. Happy days! Agreed, but this then leads to the discussion of what the appropriate loss function for the estimator is, which for a standard statistical presentation would go too far. Fortunately, for large samples, n versus n-1 or n-2 doesn't matter, and asymptotically we are all the same, at least in nice cases. > > I also find the population/sample distinction to be bogus, too. IIRC, > there are even some sources which switch around the meanings, too. In > any case, the docstring should have a section saying, "If you are > looking for what is called the "unbiased" or "sample" estimate of > variance, use ddof=1." Those terms are widely, if incorrectly, used, > so we should mention them. I just find it disheartening that the terms > are used without qualification. > I compared the doc strings for np.var and np.std. np.std is very brief on biasedness, and I think the sentence in np.var is relatively neutral; neither doc string mentions the word unbiased. Your reference to "unbiased" and "sample" estimators might be useful for users who are not so familiar with the statistical background. After your explanation, I would think it would be better to drop any reference to biasedness in np.std. On the other hand, the scipy doc strings still need a lot of cleaning and writing. Josef From david at ar.media.kyoto-u.ac.jp Wed Dec 17 23:21:55 2008 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Thu, 18 Dec 2008 13:21:55 +0900 Subject: [SciPy-user] information on statistical functions In-Reply-To: <1cd32cbb0812172026n7c68a672n8da6e9e7ecad5151@mail.gmail.com> References: <1cd32cbb0812171753i59591902kf65d8159e689fd73@mail.gmail.com> <3d375d730812171803x265205bax53e00cd25252ebbf@mail.gmail.com> <1cd32cbb0812171832y535e275fj19331e2d4d065b12@mail.gmail.com> <3d375d730812171911m4f71ff6btcf9a9cff581254a4@mail.gmail.com> <1cd32cbb0812172026n7c68a672n8da6e9e7ecad5151@mail.gmail.com> Message-ID: <4949CFE3.8030508@ar.media.kyoto-u.ac.jp> josef.pktd at gmail.com wrote: > > I compared the doc strings for np.var and np.std.
np.std is very brief > on biasedness, and I think the sentence in np.var is relatively > neutral; neither doc string mentions the word unbiased. Your reference > to "unbiased" and "sample" estimators might be useful for users who > are not so familiar with the statistical background. After your > explanation, I would think it would be better to drop any reference to > biasedness in np.std. On the other hand, the scipy doc strings still > need a lot of cleaning and writing. > Those functions are deprecated in 0.7, so this should not be too much of a concern. David From timmichelsen at gmx-topmail.de Thu Dec 18 05:10:23 2008 From: timmichelsen at gmx-topmail.de (Timmie) Date: Thu, 18 Dec 2008 10:10:23 +0000 (UTC) Subject: [SciPy-user] information on statistical functions References: <1cd32cbb0812171753i59591902kf65d8159e689fd73@mail.gmail.com> <3d375d730812171803x265205bax53e00cd25252ebbf@mail.gmail.com> <1cd32cbb0812171832y535e275fj19331e2d4d065b12@mail.gmail.com> <3d375d730812171911m4f71ff6btcf9a9cff581254a4@mail.gmail.com> Message-ID: Hello, > I also find the population/sample distinction to be bogus, too. IIRC, > there are even some sources which switch around the meanings, too. In > any case, the docstring should have a section saying, "If you are > looking for what is called the "unbiased" or "sample" estimate of > variance, use ddof=1." Those terms are widely, if incorrectly, used, > so we should mention them. I just find it disheartening that the terms > are used without qualification. First, let me state that I learned statistics in German and hoped to find the right translation term. For some terms, other languages may even not have a suitable translation. Second, I would suggest including the actual formulas in such disputed docstrings. I have seen that depending on the area of work, people tend to use correct scientific terminology or other terms (science vs. engineering). The sphinxext of matplotlib offers a very convenient way to include maths formulas. These would make it clear. In Excel (or Calc), the functions have different names but formulas remain. Thanks for the explanations. Kind regards, Timmie From alexander.borghgraef.rma at gmail.com Thu Dec 18 05:46:47 2008 From: alexander.borghgraef.rma at gmail.com (Alexander Borghgraef) Date: Thu, 18 Dec 2008 11:46:47 +0100 Subject: [SciPy-user] Indexing question In-Reply-To: References: <9e8c52a20812170148s7e54398bn39fd786864c8cf6d@mail.gmail.com> Message-ID: <9e8c52a20812180246w246c3facrf73a73cdb172fda0@mail.gmail.com> On Wed, Dec 17, 2008 at 4:52 PM, Angus McMorland wrote: > There are two tricks that will help you out here: > > (1) indexing with a list is treated differently to indexing with an array I tend to tuple my coordinate arrays. Seems to work also with your example. Does this function the same way, or are there hidden differences? > (2) indexing across the first consecutive dimensions and extracting > latter ones is easier than extracting earlier dimensions together > based on indices in latter dimensions Ok. So basically it's best to implement vector images with the coordinate dimensions first and the vector dimension (RGB, gradient, optical flow, hyperspectral data, whatever...) last ((x, y, v), IOW), to save on transposing all the time. Good to know. Apparently it'd be easier to implement the coordinate array in a (2, n) shape as well. OTOH, I mainly chose (n, 2) because it's more readable in print statements, and it allows for "for x in coordinates" statements. Preferences, I guess. > So, this works: Indeed it does.
Thanks! -- Alex Borghgraef From robert.kern at gmail.com Thu Dec 18 06:02:31 2008 From: robert.kern at gmail.com (Robert Kern) Date: Thu, 18 Dec 2008 05:02:31 -0600 Subject: [SciPy-user] information on statistical functions In-Reply-To: References: <1cd32cbb0812171753i59591902kf65d8159e689fd73@mail.gmail.com> <3d375d730812171803x265205bax53e00cd25252ebbf@mail.gmail.com> <1cd32cbb0812171832y535e275fj19331e2d4d065b12@mail.gmail.com> <3d375d730812171911m4f71ff6btcf9a9cff581254a4@mail.gmail.com> Message-ID: <3d375d730812180302mae48c0amf0ec31f52e905d70@mail.gmail.com> On Thu, Dec 18, 2008 at 04:10, Timmie wrote: > Hello, > >> I also find the population/sample distinction to be bogus, too. IIRC, >> there are even some sources which switch around the meanings, too. In >> any case, the docstring should have a section saying, "If you are >> looking for what is called the "unbiased" or "sample" estimate of >> variance, use ddof=1." Those terms are widely, if incorrectly, used, >> so we should mention them. I just find it disheartening that the terms >> are used without qualification. > First, let me state that I learned statistics in German and hoped to find > the right translation term. > For some terms, other languages may even not have a suitable translation. The terms are commonly used in English the same way that you are using them. I just happen to disagree with the common practice. > Second, I would suggest including the actual formulas in > such disputed docstrings. The development version does: http://docs.scipy.org/numpy/docs/numpy.core.fromnumeric.std/ (before I edited it just now, even). -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From timmichelsen at gmx-topmail.de Thu Dec 18 07:20:51 2008 From: timmichelsen at gmx-topmail.de (Timmie) Date: Thu, 18 Dec 2008 12:20:51 +0000 (UTC) Subject: [SciPy-user] information on statistical functions References: <1cd32cbb0812171753i59591902kf65d8159e689fd73@mail.gmail.com> Message-ID: > biased, i.e. 1/n, or not, i.e. 1/(n-1). Look at the description in the source > http://docs.scipy.org/scipy/source/scipy/dist/lib64/python2.4/site-packages/scipy/stats/stats.py#1359 > for the deprecation warning. Why do we not link scipy.stats.std to numpy.std? scipy.stats.std = numpy.std? Or are there problems with old code depending on it? From timmichelsen at gmx-topmail.de Thu Dec 18 07:34:31 2008 From: timmichelsen at gmx-topmail.de (Timmie) Date: Thu, 18 Dec 2008 12:34:31 +0000 (UTC) Subject: [SciPy-user] show missing dates in a scikits.timeseries.time_series Message-ID: Hello, from http://pytseries.sourceforge.net/core/TimeSeries.html#operations-on-timeseries >>> mlist_1 = ['2005-%02i' % i for i in range(1,10)] >>> mlist_1 += ['2006-%02i' % i for i in range(2,13)] >>> mdata_1 = np.arange(len(mlist_1)) >>> mser_1 = ts.time_series(mdata_1, mlist_1, freq='M') >>> mser_1.has_missing_dates() True How can I print out the missing dates? In this case, the months "Oct 2005" to "Jan 2006".
Thanks in advance, Timmie From timmichelsen at gmx-topmail.de Thu Dec 18 07:39:41 2008 From: timmichelsen at gmx-topmail.de (Timmie) Date: Thu, 18 Dec 2008 12:39:41 +0000 (UTC) Subject: [SciPy-user] information on statistical functions References: <1cd32cbb0812171753i59591902kf65d8159e689fd73@mail.gmail.com> <3d375d730812171803x265205bax53e00cd25252ebbf@mail.gmail.com> <1cd32cbb0812171832y535e275fj19331e2d4d065b12@mail.gmail.com> <3d375d730812171911m4f71ff6btcf9a9cff581254a4@mail.gmail.com> <3d375d730812180302mae48c0amf0ec31f52e905d70@mail.gmail.com> Message-ID: > Second, I would suggest including the actual formulas in > > such disputed docstrings. > > The development version does: > > http://docs.scipy.org/numpy/docs/numpy.core.fromnumeric.std/ > > (before I edited it just now, even). Yes, but they are not rendered as in here: http://matplotlib.sourceforge.net/devel/documenting_mpl.html#formatting What do you think? From timmichelsen at gmx-topmail.de Thu Dec 18 07:58:08 2008 From: timmichelsen at gmx-topmail.de (Timmie) Date: Thu, 18 Dec 2008 12:58:08 +0000 (UTC) Subject: [SciPy-user] information on statistical functions References: <1cd32cbb0812171753i59591902kf65d8159e689fd73@mail.gmail.com> Message-ID: Timmie gmx-topmail.de> writes: > > > biased, i.e. 1/n, or not, i.e. 1/(n-1). Look at the description in the source > > > http://docs.scipy.org/scipy/source/scipy/dist/lib64/python2.4/site-packages/scipy/stats/stats.py#1359 > > for the deprecation warning. > Why do we not link scipy.stats.std to numpy.std? > > scipy.stats.std = numpy.std? > > Or are there problems with old code depending on it? It seems there were earlier reports on this: http://thread.gmane.org/gmane.comp.python.scientific.user/13677/focus=13679 The question seems to be: which statistical functions shall be catered for in which library? From david at ar.media.kyoto-u.ac.jp Thu Dec 18 08:00:17 2008 From: david at ar.media.kyoto-u.ac.jp (David Cournapeau) Date: Thu, 18 Dec 2008 22:00:17 +0900 Subject: [SciPy-user] information on statistical functions In-Reply-To: References: <1cd32cbb0812171753i59591902kf65d8159e689fd73@mail.gmail.com> Message-ID: <494A4961.4060001@ar.media.kyoto-u.ac.jp> Timmie wrote: > Why do we not link scipy.stats.std to numpy.std? > > scipy.stats.std = numpy.std? > > Or are there problems with old code depending on it? > It will be done, but gradually, because the two APIs do not match. David From stefan at sun.ac.za Thu Dec 18 08:18:41 2008 From: stefan at sun.ac.za (Stéfan van der Walt) Date: Thu, 18 Dec 2008 15:18:41 +0200 Subject: [SciPy-user] information on statistical functions In-Reply-To: <3d375d730812180302mae48c0amf0ec31f52e905d70@mail.gmail.com> References: <1cd32cbb0812171753i59591902kf65d8159e689fd73@mail.gmail.com> <3d375d730812171803x265205bax53e00cd25252ebbf@mail.gmail.com> <1cd32cbb0812171832y535e275fj19331e2d4d065b12@mail.gmail.com> <3d375d730812171911m4f71ff6btcf9a9cff581254a4@mail.gmail.com> <3d375d730812180302mae48c0amf0ec31f52e905d70@mail.gmail.com> Message-ID: <9457e7c80812180518t2d16993aw3b7c7cd93b49c2e9@mail.gmail.com> Hi Robert 2008/12/18 Robert Kern : > The terms are commonly used in English the same way that you are using > them. I just happen to disagree with the common practice. I am fond of your explanation of "bias" (I've read it a couple of times now). Would you consider writing it up somewhere? There are so many common misconceptions around this topic that such a document would be a valuable resource.
I saw the term again earlier today in some of Stein's papers (Estimation with Quadratic Loss and Estimation of the Mean of a Multivariate Normal Distribution), and was reminded of this thread. Regards, Stéfan From amcmorl at gmail.com Thu Dec 18 09:29:11 2008 From: amcmorl at gmail.com (Angus McMorland) Date: Thu, 18 Dec 2008 09:29:11 -0500 Subject: [SciPy-user] Indexing question In-Reply-To: <9e8c52a20812180246w246c3facrf73a73cdb172fda0@mail.gmail.com> References: <9e8c52a20812170148s7e54398bn39fd786864c8cf6d@mail.gmail.com> <9e8c52a20812180246w246c3facrf73a73cdb172fda0@mail.gmail.com> Message-ID: Hi Alex, 2008/12/18 Alexander Borghgraef : > On Wed, Dec 17, 2008 at 4:52 PM, Angus McMorland wrote: >> There are two tricks that will help you out here: >> >> (1) indexing with a list is treated differently to indexing with an array > > I tend to tuple my coordinate arrays. Seems to work also with your > example. Does this function the same way, or are there hidden > differences? As far as I understand, these are treated the same way: a tuple or a list of index arrays both trigger the same 'advanced indexing' rules, which apply whenever ndarrays are used as indices. Angus. -- AJC McMorland Post-doctoral research fellow Neurobiology, University of Pittsburgh From timmichelsen at gmx-topmail.de Thu Dec 18 10:29:23 2008 From: timmichelsen at gmx-topmail.de (Timmie) Date: Thu, 18 Dec 2008 15:29:23 +0000 (UTC) Subject: [SciPy-user] incompatible sizes when correlating two timeseries Message-ID: Hello, I am trying to correlate two timeseries. I do not understand why I get an error for incompatible sizes. Both series have the same frequency. In [54]: x = series_01 In [55]: y = series_02 In [56]: diff = y.size-x.size In [57]: diff Out[57]: 0 In [58]: x.shape Out[58]: (15360,) In [59]: y.shape Out[59]: (15360,) In [60]: np.correlate(x, y) --------------------------------------------------------------------------- TimeSeriesCompatibilityError Traceback (most recent call last) D:\python\test.py in () ----> 1 2 3 4 5 C:\Programme\pythonxy\python\lib\site-packages\numpy\core\numeric.pyc in correlate(a, v, mode) 513 """ 514 mode = _mode_from_name(mode) --> 515 return multiarray.correlate(a,v,mode) 516 517 C:\Programme\pythonxy\python\lib\site-packages\scikits\timeseries\tseries.pyc in __array_finalize__(self, obj) 459 def __array_finalize__(self,obj): 460 self._varshape = getattr(obj, '_varshape', ()) --> 461 MaskedArray.__array_finalize__(self, obj) 462 463 def _update_from(self, obj): C:\Programme\pythonxy\python\lib\site-packages\numpy\ma\core.pyc in __array_finalize__(self, obj) 1383 """ 1384 # Get main attributes .........
-> 1385 self._update_from(obj) 1386 if isinstance(obj, ndarray): 1387 odtype = obj.dtype C:\Programme\pythonxy\python\lib\site-packages\scikits\timeseries\tseries.pyc in _update_from(self, obj) 466 # Only update the dates if we don't have any 467 if not getattr(_dates, 'size', 0): --> 468 self.__setdates__(newdates) 469 MaskedArray._update_from(self, obj) 470 C:\Programme\pythonxy\python\lib\site-packages\scikits\timeseries\tseries.pyc in __setdates__(self, value) 662 if not varshape: 663 # We may be using the default: retry --> 664 varshape = self._varshape = get_varshape(self, value) 665 # Get the data length (independently of the nb of variables) 666 dsize = self.size // int(np.prod(varshape)) C:\Programme\pythonxy\python\lib\site-packages\scikits\timeseries\tseries.pyc in get_varshape(data, dates) 260 # More dates than data: not good 261 if (dates.size > data.size) or (data.ndim == 1): --> 262 raise TimeSeriesCompatibilityError(*err_args) 263 #.................... 264 dcumulshape = np.cumprod(dshape).tolist() TimeSeriesCompatibilityError: Incompatible sizes! (data: (1,) <> dates: (15360,)) Please give me a hint here. Timmie From josef.pktd at gmail.com Thu Dec 18 10:43:40 2008 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 18 Dec 2008 10:43:40 -0500 Subject: [SciPy-user] information on statistical functions In-Reply-To: <9457e7c80812180518t2d16993aw3b7c7cd93b49c2e9@mail.gmail.com> References: <1cd32cbb0812171753i59591902kf65d8159e689fd73@mail.gmail.com> <3d375d730812171803x265205bax53e00cd25252ebbf@mail.gmail.com> <1cd32cbb0812171832y535e275fj19331e2d4d065b12@mail.gmail.com> <3d375d730812171911m4f71ff6btcf9a9cff581254a4@mail.gmail.com> <3d375d730812180302mae48c0amf0ec31f52e905d70@mail.gmail.com> <9457e7c80812180518t2d16993aw3b7c7cd93b49c2e9@mail.gmail.com> Message-ID: <1cd32cbb0812180743jc2aa70bn60e37bc23c92f2a8@mail.gmail.com> I find the new doc string for np.var a bit misleading: "The mean is normally calculated as x.sum() / N, where N = len(x). If, however, ddof is specified, the divisor N - ddof is used instead." This seems to refer to the divisor of the mean, not of the variance. I need to sign up for doc editing rights. Josef From jh at physics.ucf.edu Thu Dec 18 11:31:54 2008 From: jh at physics.ucf.edu (jh at physics.ucf.edu) Date: Thu, 18 Dec 2008 11:31:54 -0500 Subject: [SciPy-user] information on statistical functions In-Reply-To: (scipy-user-request@scipy.org) References: Message-ID: Robert et al., I agree that the formulae implemented should go in all the stats docstrings. Most users of np.std() and friends will not be aware of the distinctions you raise here, but they should be. Could you add a warning sentence in the relevant docstrings stating that these functions are often misapplied, that their proper application is not always obvious, and that readers should consult appropriate references if not already familiar with concepts xxx, yyy, and zzz (pick some)? The sentence could point to references given either in that docstring or (for scipy functions) in the scipy.stats docstring. The references should include something reputable and stable online (like Wikipedia, if it qualifies in this case). Obviously, np docstrings should not point to references given only in scipy docs, though they can refer to routines in scipy or elsewhere as alternatives. A discussion based on Robert's explanation (quoted earlier in this thread) could also go on the web site somewhere, like in Cookbook, and be referenced in the docstrings.
I'm more of a customer for this text than one who could write it, but if you don't have time to do it yourself, could you provide the sentence and references? A doc volunteer can then go put it in the relevant routine docs. Thanks, --jh-- Date: Wed, 17 Dec 2008 21:11:11 -0600 From: "Robert Kern" Subject: Re: [SciPy-user] information on statistical functions To: "SciPy Users List" Message-ID: <3d375d730812171911m4f71ff6btcf9a9cff581254a4 at mail.gmail.com> Content-Type: text/plain; charset=UTF-8 On Wed, Dec 17, 2008 at 20:32, wrote: > On Wed, Dec 17, 2008 at 9:03 PM, Robert Kern wrote: >> On Wed, Dec 17, 2008 at 19:53, wrote: >>> On Wed, Dec 17, 2008 at 7:58 PM, Tim Michelsen >>> wrote: >>>> Hello, >>>> I observed that there are 2 standard deviation functions in the >>>> scipy/numpy modules: >>>> >>>> Numpy: >>>> http://docs.scipy.org/doc/numpy/reference/generated/numpy.std.html#numpy.std >>>> >>>> Scipy: >>>> http://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.std.html#scipy.stats.std >>>> >>>> What is the difference? >>>> There is no formula included within the docstrings. >>>> >>>> I suppose that np.std() is for the whole population and scipy.std is >>>> designed for a smaller sample in the population. >>>> Is that true? >>> >>> difference between population (numpy) and sample (scipy.stats) >>> variance and standard deviation is whether the the estimator is >>> biased, i.e. 1/n, or not, i.e. 1/(n-1). >> >> It's a shame that the "biased/unbiased" terminology still survives in >> the numpy.std() docstring. It's really quite wrong. >> > > I find talking about biased versus unbiased estimator much clearer > than the population - sample distinction, and degrees of freedom might > be more descriptive but its meaning, I guess, relies on knowing about > the (asymptotic) distribution of the estimator, which I always forget > and have to look up. The problem is that the "unbiased" estimate for the standard deviation is *not* the square root of the "unbiased" estimate for the variance. The latter is what numpy.std(x, ddof=1) calculates, not the former. This problem arises because of a pretty narrow concept of "error" that gets misapplied to the variance estimator. The usual "error" that gets used is the arithmetic difference between the estimate and the true value of the parameter (p_est - p_true). For parameters like means, this is usually fine, but for so-called scale parameters like variance, it's quite inappropriate. For example, the arithmetic error between a true value of 1.0 (in whatever units) and an estimate of 2.0 is the same as that between 101.0 and 102.0. When you drop a square root into that formula, you don't get the same answers out when you seek the estimator that sets the bias to 0. Rather, a much more appropriate error measure for variance would be the log-ratio: log(p_est/p_true). That way, 1.0 and 2.0 would be the same distance from each other as 100.0 and 200.0. Using this measure of error to define bias, the unbiased estimate of the standard deviation actually is the square root of the unbiased estimate of the variance, too, thanks to the magic of logarithms. Unfortunately for those who want to call the (n-1) version "unbiased", the unbiased estimator (for normal distributions, at least) uses (n-2). Oops! Other distributions have different optimal denominators: heavier-tailed distributions tend towards (n-3), finite-support distributions tend towards (n-1). But of course, bias is not the only thing to be concerned about. 
The bias is just the arithmetic average of the errors. If you want to minimize the total spread of the errors sum(abs(err)), too, that's another story. With the arithmetic error metric, the unbiased estimator of the variance uses (n-1) while the estimator with the smallest total error uses (n). With the log-ratio error metric, the unbiased estimator is the same as the one that minimizes the total error. Happy days! I also find the population/sample distinction to be bogus, too. IIRC, there are even some sources which switch around the meanings, too. In any case, the docstring should have a section saying, "If you are looking for what is called the "unbiased" or "sample" estimate of variance, use ddof=1." Those terms are widely, if incorrectly, used, so we should mention them. I just find it disheartening that the terms are used without qualification. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From bastian.weber at gmx-topmail.de Thu Dec 18 12:15:35 2008 From: bastian.weber at gmx-topmail.de (Bastian Weber) Date: Thu, 18 Dec 2008 18:15:35 +0100 Subject: [SciPy-user] incompatible sizes when correlating two timeseries In-Reply-To: References: Message-ID: <494A8537.2010706@gmx-topmail.de> Hi Timmie, I think the error is somewhere hidden in your data. Try to reproduce the error with much smaller time series, or with a part of yours. Say size=(10,) or even less, and then represent them as string to examine the problem. At least np.correlate(z, z) should work with z=x or z=y, does it not? Bastian. Timmie wrote: > Hello, > I try to correlate two timeseries. > > I don not understand, why I get an error for incompatile size. > > both have the same frequencies. > > > In [54]: x = series_01 > > In [55]: y = series_02 > > In [56]: diff = y.size-x.size > > In [57]: diff > Out[57]: 0 > > In [58]: x.shape > Out[58]: (15360,) > > In [59]: y.shape > Out[59]: (15360,) > > In [60]: np.correlate(x, y) > --------------------------------------------------------------------------- > TimeSeriesCompatibilityError Traceback (most recent call last) > > D:\python\test.py > in () > ----> 1 > 2 > 3 > 4 > 5 > > ... From mattknox.ca at gmail.com Thu Dec 18 13:33:32 2008 From: mattknox.ca at gmail.com (Matt Knox) Date: Thu, 18 Dec 2008 18:33:32 +0000 (UTC) Subject: [SciPy-user] incompatible sizes when correlating two timeseries References: Message-ID: Timmie gmx-topmail.de> writes: > > Hello, > I try to correlate two timeseries. > > I don not understand, why I get an error for incompatile size. I would say this is a bug. Although I am not 100% certain the cause of it at the moment. I think it happens when the correlate function tries to create a new TimeSeries to store the result in and somehow the dates of the input TimeSeries get passed along to create the resulting TimeSeries (which will be of size 1). A simple work around for now is to just call np.correlate on the underlying raw array (using the .data attribute of the TimeSeries). Note that np.correlate will NOT work properly with MaskedArray's that contain masked value. In general you should assume functions from the top level numpy namespace will not work properly with masked values. Pierre, I think we should probably up-cast the TimeSeries to a plain MaskedArray when _update_from is called with dates of a different size than the data. 
I'm sure other functions in numpy crash on TimeSeries objects for the same reason. What do you think? - Matt From josef.pktd at gmail.com Thu Dec 18 13:40:39 2008 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 18 Dec 2008 13:40:39 -0500 Subject: [SciPy-user] scipy.stats.samplevar, samplestd and stderr and sem - redundant? Message-ID: <1cd32cbb0812181040x18b03d17g30a4990537b57539@mail.gmail.com> Following up on the discussion on statistical functions, I checked again scipy stats: There are two more functions: samplevar, samplestd, which are just variance and standard deviation with hardcoded denominator n (confusing terminology to call them sample...). I think they should also be deprecated and removed. stderr and sem do not have a very clear description, and they both give the same results, essentially sqrt( sum( (deviations)**2 )/(n-1.0)/n ) From the description I would infer that they use different denominators, but that is not the case. I'm not sure what the correct denominators are for this case, but one of the two functions could be deleted; I like `sem` - "standard error of mean" - better than the generic `stderr`. Josef From timmichelsen at gmx-topmail.de Thu Dec 18 14:30:05 2008 From: timmichelsen at gmx-topmail.de (Timmie) Date: Thu, 18 Dec 2008 19:30:05 +0000 (UTC) Subject: [SciPy-user] information on statistical functions References: Message-ID: Hello, although there were many explanations, I am still trying to find the right correlation function. When I import my data in excel or openoffice calc and apply the correlation function (=correlate) there, I get a totally different result (+0.9) than with np.correlate (5e+23). I am not sure whether a) I am still using the wrong function in numpy/scipy b) the excel calculations are wrong. But I am pretty sure that they are not, because Openoffice Calc gives the same results. A hint is very welcome here. There was an earlier discussion here: Regression report like in Excel - http://article.gmane.org/gmane.comp.python.scientific.user/9537 But since the sandbox has gone... 
> > Thanks in advance, > Timmie > > _______________________________________________ > SciPy-user mailing list > SciPy-user at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-user > If you are looking for the correlation coefficient and covariance, then the function is numpy.corrcoef(x, y=None, rowvar=1, bias=0). np.correlate does not calculate a correlation coefficient (a short illustration follows further below). Currently there are no convenience wrappers for statistical models included in scipy, but they will come back and get included once they are cleaned up enough. For ols, there is a good example in the scipy cookbook that calculates many regression diagnostics and prints a useful summary. Josef From josef.pktd at gmail.com Thu Dec 18 15:04:10 2008 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Thu, 18 Dec 2008 15:04:10 -0500 Subject: [SciPy-user] incompatible sizes when correlating two timeseries In-Reply-To: References: Message-ID: <1cd32cbb0812181204m770b626fieed0d419c40fc016@mail.gmail.com> On Thu, Dec 18, 2008 at 1:33 PM, Matt Knox wrote: > Timmie gmx-topmail.de> writes: > >> >> Hello, >> I try to correlate two timeseries. >> >> I do not understand why I get an error for incompatible sizes. > > I would say this is a bug. Although I am not 100% certain the cause of it at the > moment. I think it happens when the correlate function tries to create a new > TimeSeries to store the result in and somehow the dates of the input TimeSeries > get passed along to create the resulting TimeSeries (which will be of size 1). > > A simple work around for now is to just call np.correlate on the underlying raw > array (using the .data attribute of the TimeSeries). Note that np.correlate will > NOT work properly with MaskedArray's that contain masked values. In general you > should assume functions from the top level numpy namespace will not work > properly with masked values. Tim, if you need them, there are some statistical functions that work for masked arrays in scipy.stats.mstats. They are not yet included in the new docs. But you can see what is available with

import scipy.stats
dir(scipy.stats.mstats)

I don't know how well they work with TimeSeries. Josef From robert.kern at gmail.com Thu Dec 18 15:36:50 2008 From: robert.kern at gmail.com (Robert Kern) Date: Thu, 18 Dec 2008 14:36:50 -0600 Subject: [SciPy-user] information on statistical functions In-Reply-To: <1cd32cbb0812180743jc2aa70bn60e37bc23c92f2a8@mail.gmail.com> References: <1cd32cbb0812171753i59591902kf65d8159e689fd73@mail.gmail.com> <3d375d730812171803x265205bax53e00cd25252ebbf@mail.gmail.com> <1cd32cbb0812171832y535e275fj19331e2d4d065b12@mail.gmail.com> <3d375d730812171911m4f71ff6btcf9a9cff581254a4@mail.gmail.com> <3d375d730812180302mae48c0amf0ec31f52e905d70@mail.gmail.com> <9457e7c80812180518t2d16993aw3b7c7cd93b49c2e9@mail.gmail.com> <1cd32cbb0812180743jc2aa70bn60e37bc23c92f2a8@mail.gmail.com> Message-ID: <3d375d730812181236u6620160crb868d1c30e379435@mail.gmail.com> On Thu, Dec 18, 2008 at 09:43, wrote: > I find the new doc string for np.var a bit misleading: > > "The mean is normally calculated as x.sum() / N, where N = len(x). If, > however, ddof is specified, the divisor N - ddof is used instead." > > This seems to refer to the divisor of the mean not of the variance. Yes, the "mean" as used in the variance formula given in the prior sentence. I don't really like the phrasing much, either, but I inherited it from what was already there. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco
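To make the np.corrcoef suggestion above concrete, a minimal session (made-up data, easy to verify by hand):

>>> import numpy as np
>>> x = np.arange(10.)
>>> y = 2*x + 1
>>> np.corrcoef(x, y)[0, 1]   # Pearson r, what a spreadsheet's CORREL returns
1.0
>>> np.correlate(x, y)        # un-normalized sliding dot product
array([ 615.])

np.correlate with its default mode just returns sum(x*y), an un-normalized dot product, which is why a spreadsheet correlation (+0.9) and np.correlate (5e+23) can disagree so wildly on the same data.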
From fritz.peter.maas at googlemail.com Thu Dec 18 16:15:22 2008 From: fritz.peter.maas at googlemail.com (Peter Maas) Date: Thu, 18 Dec 2008 22:15:22 +0100 Subject: [SciPy-user] Problems building NumPy with Python 2.6 / MinGW 3.4.5 on Windows XP SP3 In-Reply-To: <494092E3.50502@ar.media.kyoto-u.ac.jp> References: <493A9063.2010203@googlemail.com> <493A9183.5030906@ar.media.kyoto-u.ac.jp> <493AB07E.5060201@googlemail.com> <493AAF26.8010802@ar.media.kyoto-u.ac.jp> <493ABE83.8020709@googlemail.com> <493B58B3.80509@ar.media.kyoto-u.ac.jp> <493EFBE6.1050202@googlemail.com> <493F5C14.9010201@ar.media.kyoto-u.ac.jp> <494032B8.90403@googlemail.com> <494092E3.50502@ar.media.kyoto-u.ac.jp> Message-ID: <494ABD6A.8090103@googlemail.com> David Cournapeau schrieb: > Ok, that's not the alpha version. Unfortunately, I have no idea on what > could be different in your setup from mine. Could you open a ticket on > numpy trac, so we don't forget this ? Since I can not work on it now, I > am afraid I will forget about it otherwise, Hi David, Ticket is http://scipy.org/scipy/numpy/ticket/970. Thanks for your help. Regards, Peter. Peter Maas, Aachen, Germany. From Dharhas.Pothina at twdb.state.tx.us Fri Dec 19 09:07:47 2008 From: Dharhas.Pothina at twdb.state.tx.us (Dharhas Pothina) Date: Fri, 19 Dec 2008 08:07:47 -0600 Subject: [SciPy-user] Save column & row headers to csv file with something like savetxt? Message-ID: <494B5653.63BA.009B.0@twdb.state.tx.us> Hi, I have a 2D array of results that my script has calculated. I want to save them as a csv file so I can use them in excel or a report later. Is there a way to put comment lines, column & row headers in the text file? I had a look through the documentation but didn't find anything obvious. What I would like to do is save an ascii file something like below:

location,max,min
jdm1,6,2
jdm2,4.3,1.7
mcf1,7.7,2.2
...

Is there already a simple way to do this? Another i/o question is that I keep getting a warning that fread is deprecated. I am using it to read binary fortran data, what is the appropriate function to use now? thanks, - dharhas From jr at sun.ac.za Fri Dec 19 10:10:17 2008 From: jr at sun.ac.za (Johann Rohwer) Date: Fri, 19 Dec 2008 16:10:17 +0100 Subject: Save column & row headers to csv file with something like savetxt? In-Reply-To: <494B5653.63BA.009B.0@twdb.state.tx.us> References: <494B5653.63BA.009B.0@twdb.state.tx.us> Message-ID: <200812191610.17833.jr@sun.ac.za> Hi Dharhas You can use xlwt to write to an Excel spreadsheet directly, there is no need to go via a csv file. Then you can specify what exactly should go into every cell, e.g.

import xlwt
wb=xlwt.Workbook()
ws=wb.add_sheet('results')
ws.write(0,0,'location')
ws.write(0,1,'max')
# data in array mydata
for i in range(mydata.shape[0]):
    for j in range(mydata.shape[1]):
        ws.write(i+1,j+1,mydata[i,j])
wb.save("myresults.xls")

etc.etc. HTH Johann On Friday, 19 December 2008, Dharhas Pothina wrote: > Hi, > > I have a 2D array of results that my script has calculated. I want > to save them as a csv file so I can use them in excel or a report > later. Is there a way to put comment lines, column & row headers in > the text file? I had a look through the documentation but didn't > find anything obvious. 
> > What I would like to do is save an ascii file something like below: > > location,max,min > jdm1,6,2 > jdm2,4.3,1.7 > mcf1,7.7,2.2 > ... > > Is there already a simple way to do this? > > Another i/o question is that I keep getting a warning that fread is > deprecated. I am using it to read binary fortran data, what is the > appropriate function to use now? > > thanks, > > - dharhas From aisaac at american.edu Fri Dec 19 10:53:52 2008 From: aisaac at american.edu (Alan G Isaac) Date: Fri, 19 Dec 2008 10:53:52 -0500 Subject: [SciPy-user] Save column & row headers to csv file with something like savetxt? In-Reply-To: <494B5653.63BA.009B.0@twdb.state.tx.us> References: <494B5653.63BA.009B.0@twdb.state.tx.us> Message-ID: <494BC390.4070709@american.edu> On 12/19/2008 9:07 AM Dharhas Pothina apparently wrote: > I have a 2D array of results that my script has calculated. > I want to save them as a csv file so I can use them in excel or a report later. You can use the `as_csv` method of SimpleTable: http://code.google.com/p/econpy/source/browse/trunk/utilities/text.py Alan Isaac From sturla at molden.no Fri Dec 19 12:23:53 2008 From: sturla at molden.no (Sturla Molden) Date: Fri, 19 Dec 2008 18:23:53 +0100 Subject: [SciPy-user] information on statistical functions In-Reply-To: <3d375d730812180302mae48c0amf0ec31f52e905d70@mail.gmail.com> References: <1cd32cbb0812171753i59591902kf65d8159e689fd73@mail.gmail.com> <3d375d730812171803x265205bax53e00cd25252ebbf@mail.gmail.com> <1cd32cbb0812171832y535e275fj19331e2d4d065b12@mail.gmail.com> <3d375d730812171911m4f71ff6btcf9a9cff581254a4@mail.gmail.com> <3d375d730812180302mae48c0amf0ec31f52e905d70@mail.gmail.com> Message-ID: <494BD8A9.9060005@molden.no> On 12/18/2008 12:02 PM, Robert Kern wrote: > The terms are commonly used in English the same way that you are using > them. I just happen to disagree with the common practice. I agree with this. Also: "The problem is that the "unbiased" estimate for the standard deviation is *not* the square root of the "unbiased" estimate for the variance. The latter is what numpy.std(x, ddof=1) calculates, not the former." An unbiased variance estimate is what people usually want. But 9 out of 10 practitioners think they need an unbiased standard deviation, and they think they get it from normalizing by N-1. They do the "right thing" just because their Stat 101 text tells them to, or because SPSS or MINITAB is doing it by default. Erroneous use of statistics due to mathematical incompetence is a major contribution to bad science. Perhaps it is better if the docstring just specifies that ddof=1 normalizes by N-1, whereas ddof=0 normalizes by N? Sturla Molden From bastian.weber at gmx-topmail.de Fri Dec 19 12:31:33 2008 From: bastian.weber at gmx-topmail.de (Bastian Weber) Date: Fri, 19 Dec 2008 18:31:33 +0100 Subject: [SciPy-user] array += matrix leads to ValueError Message-ID: <494BDA75.7000401@gmx-topmail.de> I just stumbled across the following behaviour:

from scipy import *
a=array([0,1])
m=matrix([0,1])
# this works:
a=a+m
a
#[Out]# matrix([[0, 2]])
a=array([0,1])
# this works too
m+=a
m
#[Out]# matrix([[0, 2]])
# however, this does not work:
a+=m

I get: ValueError: invalid return array shape Can anyone explain to me why it is implemented this way? I mean, it is not that intuitive. 
Bastian From robert.kern at gmail.com Fri Dec 19 12:51:21 2008 From: robert.kern at gmail.com (Robert Kern) Date: Fri, 19 Dec 2008 11:51:21 -0600 Subject: [SciPy-user] information on statistical functions In-Reply-To: <494BD8A9.9060005@molden.no> References: <1cd32cbb0812171753i59591902kf65d8159e689fd73@mail.gmail.com> <3d375d730812171803x265205bax53e00cd25252ebbf@mail.gmail.com> <1cd32cbb0812171832y535e275fj19331e2d4d065b12@mail.gmail.com> <3d375d730812171911m4f71ff6btcf9a9cff581254a4@mail.gmail.com> <3d375d730812180302mae48c0amf0ec31f52e905d70@mail.gmail.com> <494BD8A9.9060005@molden.no> Message-ID: <3d375d730812190951v6483f6a3wae45698442a35b65@mail.gmail.com> On Fri, Dec 19, 2008 at 11:23, Sturla Molden wrote: > On 12/18/2008 12:02 PM, Robert Kern wrote: > >> The terms are commonly used in English the same way that you are using >> them. I just happen to disagree with the common practice. > > I agree with this. Also: > > "The problem is that the "unbiased" estimate for the standard deviation > is *not* the square root of the "unbiased" estimate for the variance. > The latter is what numpy.std(x, ddof=1) calculates, not the former." > > An unbiased variance estimate is what people usually want. But 9 out of > 10 practitioners think they need an unbiased standard deviation, and > they think they get it from normalizing by N-1. They do the "right > thing" just because their Stat 101 text tell them to, or because SPSS or > MINITAB is doing it by default. Erroneous use of statistics due to > mathematical incompetence is a major contribution to bad science. > > Perhaps it is better if the docstring just specifies that ddof=1 > normalizes by N-1, whereas ddof=0 normalizes by N? How does the current version strike you? http://docs.scipy.org/numpy/docs/numpy.core.fromnumeric.std/ http://docs.scipy.org/numpy/docs/numpy.core.fromnumeric.var/ -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From sturla at molden.no Fri Dec 19 13:05:37 2008 From: sturla at molden.no (Sturla Molden) Date: Fri, 19 Dec 2008 19:05:37 +0100 Subject: [SciPy-user] information on statistical functions In-Reply-To: <3d375d730812190951v6483f6a3wae45698442a35b65@mail.gmail.com> References: <1cd32cbb0812171753i59591902kf65d8159e689fd73@mail.gmail.com> <3d375d730812171803x265205bax53e00cd25252ebbf@mail.gmail.com> <1cd32cbb0812171832y535e275fj19331e2d4d065b12@mail.gmail.com> <3d375d730812171911m4f71ff6btcf9a9cff581254a4@mail.gmail.com> <3d375d730812180302mae48c0amf0ec31f52e905d70@mail.gmail.com> <494BD8A9.9060005@molden.no> <3d375d730812190951v6483f6a3wae45698442a35b65@mail.gmail.com> Message-ID: <494BE271.9010608@molden.no> On 12/19/2008 6:51 PM, Robert Kern wrote: > How does the current version strike you? > > http://docs.scipy.org/numpy/docs/numpy.core.fromnumeric.std/ > http://docs.scipy.org/numpy/docs/numpy.core.fromnumeric.var/ It looks accurate. :) Also it mentions that ddof=0 gives the ML estimate, which is often overlooked. A warning about what ddof=1 may/will do to the standard error of the variance would also be useful. Estimating the variance unbiased can be equivalent of throwing away a substantial portion of the data; which in turn may translate to a lot of lost investment in work and money. 
Sturla Molden From pav at iki.fi Fri Dec 19 13:08:28 2008 From: pav at iki.fi (Pauli Virtanen) Date: Fri, 19 Dec 2008 18:08:28 +0000 (UTC) Subject: [SciPy-user] array += matrix leads to ValueError References: <494BDA75.7000401@gmx-topmail.de> Message-ID: Fri, 19 Dec 2008 18:31:33 +0100, Bastian Weber wrote: > I just stumbled across the following behaviour: > > >>> from scipy import * > >>> a=array([0,1]) > >>> m=matrix([0,1]) > >>> a=a+m > >>> m+=a > >>> a+=m > : invalid return array shape > > Can anyone explain me why it is implemented this way? I mean, it is not > that intuitive. In-place operations require that the result is broadcastable [1] to the same size as the left-hand side. a.shape == (2,) m.shape == (1,2) (a+m).shape == (1,2) (2,) is broadcastable to (1,2), but (1,2) is not broadcastable to (2,). Hence, m += a is possible, but a += m is not. .. [1] http://docs.scipy.org/doc/numpy/reference/ufuncs.html#broadcasting -- Pauli Virtanen From josef.pktd at gmail.com Fri Dec 19 13:09:59 2008 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Fri, 19 Dec 2008 13:09:59 -0500 Subject: [SciPy-user] information on statistical functions In-Reply-To: <3d375d730812181236u6620160crb868d1c30e379435@mail.gmail.com> References: <1cd32cbb0812171753i59591902kf65d8159e689fd73@mail.gmail.com> <3d375d730812171803x265205bax53e00cd25252ebbf@mail.gmail.com> <1cd32cbb0812171832y535e275fj19331e2d4d065b12@mail.gmail.com> <3d375d730812171911m4f71ff6btcf9a9cff581254a4@mail.gmail.com> <3d375d730812180302mae48c0amf0ec31f52e905d70@mail.gmail.com> <9457e7c80812180518t2d16993aw3b7c7cd93b49c2e9@mail.gmail.com> <1cd32cbb0812180743jc2aa70bn60e37bc23c92f2a8@mail.gmail.com> <3d375d730812181236u6620160crb868d1c30e379435@mail.gmail.com> Message-ID: <1cd32cbb0812191009m7081cb7cx2ed203f5d2cb25d1@mail.gmail.com> On Thu, Dec 18, 2008 at 3:36 PM, Robert Kern wrote: > On Thu, Dec 18, 2008 at 09:43, wrote: >> I find the new doc string for np.var a bit misleading: >> >> "The mean is normally calculated as x.sum() / N, where N = len(x). If, >> however, ddof is specified, the divisor N - ddof is used instead." >> >> This seems to refer to the divisor of the mean not of the variance. > > Yes, the "mean" as used in the variance formula given in the prior > sentence. I don't really like the phrasing much, either, but I > inherited it from what was already there. > The problem is that "mean" shows up twice in the previous sentence. What about replacing mean in the sentence "The mean is normally calculated as x.sum() / N, where N = len(x)." by "average", which would unambiguously refer to the mean(average) of the squared deviations in the previous sentence. An english question: is divisor a synonym for denominator in common use? When I looked it up in the dictionary it didn't seem to be the case. Josef From contact at pythonxy.com Fri Dec 19 13:15:22 2008 From: contact at pythonxy.com (Pierre Raybaut) Date: Fri, 19 Dec 2008 19:15:22 +0100 Subject: [SciPy-user] Poll - Python Modules for Scientists Message-ID: <629b08a40812191015l436177a1ud7a131301dd7f732@mail.gmail.com> Hi all, I've just created a Doodle poll on "Python Modules for Scientists". The goal is simple: to estimate the rank and levels of popularity of Python modules in the scientific community. 
(I'm sure that a lot of you will be interested to know about the results) Click on the following link to participate (you don't need to subscribe to anything or to mention your full name): http://www.doodle.com/participation.html?pollId=8mwqx63qecyem2np Thanks for your participation and for spreading the word! Cheers, Pierre From josef.pktd at gmail.com Fri Dec 19 13:20:20 2008 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Fri, 19 Dec 2008 13:20:20 -0500 Subject: [SciPy-user] information on statistical functions In-Reply-To: <494BE271.9010608@molden.no> References: <1cd32cbb0812171753i59591902kf65d8159e689fd73@mail.gmail.com> <3d375d730812171803x265205bax53e00cd25252ebbf@mail.gmail.com> <1cd32cbb0812171832y535e275fj19331e2d4d065b12@mail.gmail.com> <3d375d730812171911m4f71ff6btcf9a9cff581254a4@mail.gmail.com> <3d375d730812180302mae48c0amf0ec31f52e905d70@mail.gmail.com> <494BD8A9.9060005@molden.no> <3d375d730812190951v6483f6a3wae45698442a35b65@mail.gmail.com> <494BE271.9010608@molden.no> Message-ID: <1cd32cbb0812191020g1deacf51s81f709c831f4e55c@mail.gmail.com> On Fri, Dec 19, 2008 at 1:05 PM, Sturla Molden wrote: > On 12/19/2008 6:51 PM, Robert Kern wrote: > >> How does the current version strike you? >> >> http://docs.scipy.org/numpy/docs/numpy.core.fromnumeric.std/ >> http://docs.scipy.org/numpy/docs/numpy.core.fromnumeric.var/ > > It looks accurate. :) > > Also it mentions that ddof=0 gives the ML estimate, which is often > overlooked. > > A warning about what ddof=1 may/will do to the standard error of the > variance would also be useful. Estimating the variance unbiased can be > equivalent of throwing away a substantial portion of the data; which in > turn may translate to a lot of lost investment in work and money. > Why would you be throwing away data if you use a different normalization? I think the only serious point about the degrees of freedom correction is when using the distribution of the estimator, e.g. for testing, and there the ddof is given by the statistical theory. Whether an estimate for the variance or standard deviation in a report is normalized by N or N-1 doesn't really matter, given the randomness of the statistical problem; at least I never checked what normalization the author used. Josef From pgmdevlist at gmail.com Fri Dec 19 14:27:10 2008 From: pgmdevlist at gmail.com (Pierre GM) Date: Fri, 19 Dec 2008 14:27:10 -0500 Subject: [SciPy-user] incompatible sizes when correlating two timeseries In-Reply-To: <1cd32cbb0812181204m770b626fieed0d419c40fc016@mail.gmail.com> References: <1cd32cbb0812181204m770b626fieed0d419c40fc016@mail.gmail.com> Message-ID: <03370640-36C9-4C84-B85C-11A71E9C3B08@gmail.com> > > import scipy.stats > dir(scipy.stats.mstats) > > I don't know how well they work with TimeSeries. If you don't care about the dates, just use the functions on the .series attribute of your timeseries. I need to check how well mstats supports TimeSeries, I probably won't have time before next year, though... From pgmdevlist at gmail.com Fri Dec 19 14:35:04 2008 From: pgmdevlist at gmail.com (Pierre GM) Date: Fri, 19 Dec 2008 14:35:04 -0500 Subject: [SciPy-user] incompatible sizes when correlating two timeseries In-Reply-To: References: Message-ID: <2EA409CF-2E5B-4526-8A1D-3E9BD552CB28@gmail.com> On Dec 18, 2008, at 1:33 PM, Matt Knox wrote: > Timmie gmx-topmail.de> writes: > >> >> Hello, >> I try to correlate two timeseries. >> >> I do not understand why I get an error for incompatible sizes. > > I would say this is a bug. 
Although I am not 100% certain the cause > of it at the > moment. I think it happens when the correlate function tries to > create a new > TimeSeries to store the result in and somehow the dates of the input > TimeSeries > get passed along to create the resulting TimeSeries (which will be > of size 1). > np.correlate(x,y) returns a 1D array of size 1. Because x and y are TimeSeries, it tries to create a new series, but don't know what to do w/ the dates, so it chokes. > A simple work around for now is to just call np.correlate on the > underlying raw > array (using the .data attribute of the TimeSeries). Note that > np.correlate will > NOT work properly with MaskedArray's that contain masked value. In > general you > should assume functions from the top level numpy namespace will not > work > properly with masked values. Indeed: when manipulating time series, if you don't need to keep track of the dates, just drop them by using .series (better than .data, as 'masked values shouldn't be trusted anyway'(TM)...) > Pierre, I think we should probably up-cast the TimeSeries to a plain > MaskedArray > when _update_from is called with dates of a different size than the > data. I'm > sure other functions in numpy crash on TimeSeries objects for the > same reason. > What do you think? Right now, nothing. I need to see a better example. Here, yes, we could drop the dates and return a MaskedArray. In other cases, there's a legitimate reason for trying to output a TimeSeries. From pgmdevlist at gmail.com Fri Dec 19 14:39:30 2008 From: pgmdevlist at gmail.com (Pierre GM) Date: Fri, 19 Dec 2008 14:39:30 -0500 Subject: [SciPy-user] Save column & row headers to csv file with something like savetxt? In-Reply-To: <494B5653.63BA.009B.0@twdb.state.tx.us> References: <494B5653.63BA.009B.0@twdb.state.tx.us> Message-ID: Dharhas, Try scikits.timeseries.lib.reportlib and let Matt and I know if you find it useful and how we can improve it... On Dec 19, 2008, at 9:07 AM, Dharhas Pothina wrote: > Hi, > > I have a 2D array of results that my script has calculated. I want > to save them as a csv file so I can use them in excel or a report > later. Is there a way to put comment lines, column & row headers in > the text file? I had a look through the documentation but didn't > find anything obvious. > > What I would like to do is save a ascii file something like below: > > location,max,min > jdm1,6,2 > jdm2,4.3,1.7 > mcf1,7.7,2.2 > ... > > Is there already a simple way to do this? > > Another i/o question is that I keep getting a warning that fread is > depreciated. I am using it to read binary fortran data, what is the > appropriate function to use now? > > thanks, > > - dharhas > > _______________________________________________ > SciPy-user mailing list > SciPy-user at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-user From Dharhas.Pothina at twdb.state.tx.us Fri Dec 19 14:46:38 2008 From: Dharhas.Pothina at twdb.state.tx.us (Dharhas Pothina) Date: Fri, 19 Dec 2008 13:46:38 -0600 Subject: [SciPy-user] Save column & row headers to csv file withsomething like savetxt? In-Reply-To: <494BC390.4070709@american.edu> References: <494B5653.63BA.009B.0@twdb.state.tx.us> <494BC390.4070709@american.edu> Message-ID: <494BA5BE.63BA.009B.0@twdb.state.tx.us> Thank you for the suggestion Alan. I'm going to try scikits.timeseries.lib.reportlib that Pierre suggested first since I already have scikits.timeseries installed. 
- dharhas >>> Alan G Isaac 12/19/2008 9:53 AM >>> On 12/19/2008 9:07 AM Dharhas Pothina apparently wrote: > I have a 2D array of results that my script has calculated. > I want to save them as a csv file so I can use them in excel or a report later. You can use the `as_csv` method of SimpleTable: http://code.google.com/p/econpy/source/browse/trunk/utilities/text.py Alan Isaac _______________________________________________ SciPy-user mailing list SciPy-user at scipy.org http://projects.scipy.org/mailman/listinfo/scipy-user From aisaac at american.edu Fri Dec 19 15:07:56 2008 From: aisaac at american.edu (Alan G Isaac) Date: Fri, 19 Dec 2008 15:07:56 -0500 Subject: [SciPy-user] Save column & row headers to csv file withsomething like savetxt? In-Reply-To: <494BA5BE.63BA.009B.0@twdb.state.tx.us> References: <494B5653.63BA.009B.0@twdb.state.tx.us> <494BC390.4070709@american.edu> <494BA5BE.63BA.009B.0@twdb.state.tx.us> Message-ID: <494BFF1C.2050105@american.edu> On 12/19/2008 2:46 PM Dharhas Pothina apparently wrote: > I'm going to try scikits.timeseries.lib.reportlib that Pierre suggested first since I already have scikits.timeseries installed. Sounds good. Please post your experience. Just fyi, text.py is self contained. (No need to "install"; just put the file in your working directory.) http://econpy.googlecode.com/svn/trunk/utilities/text.py Sample use:: mydata = [[11,12],[21,22]] myhdrs = "Column 1", "Column 2" mystubs = "Row 1", "Row 2" tbl = SimpleTable(mydata, myhdrs, mystubs) print( tbl.as_csv() ) tbl = SimpleTable(mydata, myhdrs, mystubs, title="Title") print( tbl.as_text() ) print( tbl.as_latex_tabular() ) Alan From Dharhas.Pothina at twdb.state.tx.us Fri Dec 19 15:18:00 2008 From: Dharhas.Pothina at twdb.state.tx.us (Dharhas Pothina) Date: Fri, 19 Dec 2008 14:18:00 -0600 Subject: [SciPy-user] Save column & row headers to csv file withsomething like savetxt? In-Reply-To: References: <494B5653.63BA.009B.0@twdb.state.tx.us> Message-ID: <494BAD18.63BA.009B.0@twdb.state.tx.us> Pierre, It looks like scikits.timeseries.lib.reportlib is designed to work on timeseries objects (duh!). Looking at the examples this will be great for some of the other things I'm doing. What I'm presently trying to save are not the timeseries objects though, they are summary statistics of many model runs. I tried passing ordinary numpy arrays to it just to see if it would work. While it did create a report object I got an error message when trying to display it. I also constructed fake timeseries out of each variable and set the datefmt='' to not print the dates column. This sorta worked. I have an extra comma in the first column after the blank date. - dharhas >>> Pierre GM 12/19/2008 1:39 PM >>> Dharhas, Try scikits.timeseries.lib.reportlib and let Matt and I know if you find it useful and how we can improve it... On Dec 19, 2008, at 9:07 AM, Dharhas Pothina wrote: > Hi, > > I have a 2D array of results that my script has calculated. I want > to save them as a csv file so I can use them in excel or a report > later. Is there a way to put comment lines, column & row headers in > the text file? I had a look through the documentation but didn't > find anything obvious. > > What I would like to do is save a ascii file something like below: > > location,max,min > jdm1,6,2 > jdm2,4.3,1.7 > mcf1,7.7,2.2 > ... > > Is there already a simple way to do this? > > Another i/o question is that I keep getting a warning that fread is > depreciated. 
I am using it to read binary fortran data, what is the > appropriate function to use now? > > thanks, > > - dharhas > > _______________________________________________ > SciPy-user mailing list > SciPy-user at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-user _______________________________________________ SciPy-user mailing list SciPy-user at scipy.org http://projects.scipy.org/mailman/listinfo/scipy-user From bastian.weber at gmx-topmail.de Fri Dec 19 18:29:18 2008 From: bastian.weber at gmx-topmail.de (Bastian Weber) Date: Sat, 20 Dec 2008 00:29:18 +0100 Subject: [SciPy-user] array += matrix leads to ValueError In-Reply-To: References: <494BDA75.7000401@gmx-topmail.de> Message-ID: <494C2E4E.8030301@gmx-topmail.de> >> ... >> Can anyone explain me why it is implemented this way? I mean, it is not >> that intuitive. > > In-place operations require that the result is broadcastable [1] to the > same size as the left-hand side. > OK, thanks for making that clear. Although I still dont think it is very intuitive, at least I understand the reason. Bastian From discerptor at gmail.com Sat Dec 20 04:56:14 2008 From: discerptor at gmail.com (Joshua Lippai) Date: Sat, 20 Dec 2008 01:56:14 -0800 Subject: [SciPy-user] Test bug with Python 2.5.3 on Mac OS X Message-ID: <9911419a0812200156q2c2091a9j7b99276717b2a31c@mail.gmail.com> Hello all, I decided to give the new bugfix release of Python 2.5 a whirl, and even after doing a fresh build and install of both NumPy and SciPy, scipy.test() runs into a snag and gives me a Bus Error, crashing Python. I built and installed a framework build using GCC 4.2.1, as I successfully did with Python 2.5.2 before this. Below is the test's output: Running unit tests for scipy NumPy version 1.3.0.dev6174 NumPy is installed in /Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy SciPy version 0.7.0.dev5280 SciPy is installed in /Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/scipy Python version 2.5.3 (r253:67855, Dec 20 2008, 01:13:31) [GCC 4.2.1 (Apple Inc. build 5566)] nose version 0.10.4 ..............[[ 2. 5. 138. 2.] [ 3. 4. 219. 2.] [ 0. 7. 255. 3.] [ 1. 8. 268. 4.] [ 6. 9. 295. 6.]] [[ 2. 5. 138. 2.] [ 3. 4. 219. 2.] [ 0. 7. 255. 3.] [ 1. 8. 268. 4.] [ 6. 9. 295. 6.]] ..0.0 .3.33333332492e-06 .3.32912455292e-06 .3.32912455292e-06 .1.09653997882e-07 .1.98734938506e-07 .4.46252451203e-06 .1.01277823406e-06 .0.0 .3.33333335334e-06 .3.33333335334e-06 .3.33333335334e-06 ............................................(array([53, 55, 56]), array([2, 3, 1])) (array([53, 55, 56]), array([2, 3, 1])) [2 2 2 2 2 2 2 2 2 2 3 3 3 3 3 3 3 3 3 3 1 1 1 1 1 1 1 1 1 1] ..........7.54931229197e-08 .7.54931229197e-08 ...1.09653997882e-07 ..7.54931229197e-08 .........(29, 4) .(29, 4) .(29, 4) .(29, 4) .(29, 4) ...(1, 4) .(29, 4) .(29, 4) .(29, 4) .(29, 4) .(29, 4) ...(1, 4) .(29, 4) .(29, 4) .(29, 4) .(29, 4) .(29, 4) ...(1, 4) .(29, 4) .(29, 4) .(29, 4) .(29, 4) .(29, 4) ...(1, 4) .(29, 4) .(29, 4) .(29, 4) .(29, 4) .(29, 4) ...(1, 4) ........[[ 3 6 138] [ 4 5 219] [ 1 8 255] [ 2 9 268] [ 7 10 295]] [[ 3. 6. 138.] [ 4. 5. 219.] [ 1. 8. 255.] [ 2. 9. 268.] [ 7. 10. 
295.]] ............................................................................../Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/scipy/interpolate/fitpack2.py:498: UserWarning: The coefficients of the spline returned have been computed as the minimal norm least-squares solution of a (numerically) rank deficient system (deficiency=7). If deficiency is large, the results may be inaccurate. Deficiency may strongly depend on the value of eps. warnings.warn(message) ...../Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/scipy/interpolate/fitpack2.py:439: UserWarning: The required storage space exceeds the available storage space: nxest or nyest too small, or s too small. The weighted least-squares spline corresponds to the current set of knots. warnings.warn(message) ...........................................K..K.......................................................................................................................................Warning: 1000000 bytes requested, 20 bytes read. ./Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/lib/utils.py:108: DeprecationWarning: write_array is deprecated warnings.warn(str1, DeprecationWarning) /Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/lib/utils.py:108: DeprecationWarning: read_array is deprecated warnings.warn(str1, DeprecationWarning) ....................../Library/Frameworks/Python.framework/Versions/2.5/lib/python2.5/site-packages/numpy/lib/utils.py:108: DeprecationWarning: npfile is deprecated warnings.warn(str1, DeprecationWarning) Bus error Can anyone else reproduce this, or is this some oddity in my install? I know NumPy now has a failure in testing its arcsinh(-2j) against cmath's result due to what I assume is a change to cmath in 2.5.3 to match that in 2.6+, but that doesn't seem to have anything to do with this problem, which looks like some weird memory issue. Any ideas? Thanks in advance. Josh From contact at pythonxy.com Sat Dec 20 05:02:39 2008 From: contact at pythonxy.com (Pierre Raybaut) Date: Sat, 20 Dec 2008 11:02:39 +0100 Subject: [SciPy-user] Poll - Python Modules for Scientists In-Reply-To: <629b08a40812191015l436177a1ud7a131301dd7f732@mail.gmail.com> References: <629b08a40812191015l436177a1ud7a131301dd7f732@mail.gmail.com> Message-ID: <494CC2BF.4060602@pythonxy.com> I've just update the poll, adding ~25 more modules: PySQLite SQLAlchemy Elixir comtypes nltk xlutils xlrd xlwt PyX FiPy h5py gmpy pyproj mpi4py mpmath NeuroTools paida pdfminer pycuda pydot pyepix Pyevolve pygraphviz gpib-devices PyVISA If you have already participated and if you are using one of the modules mentioned above, note that you may change your participation. http://www.doodle.com/participation.html?pollId=8mwqx63qecyem2np Please do not hesitate to forward this link to other engineers/scientists. Thanks, Pierre Pierre Raybaut wrote : > Hi all, > > I've just created a Doodle poll on "Python Modules for Scientists". > The goal is simple: to estimate the rank and levels of popularity of > Python modules in the scientific community. > (I'm sure that a lot of you will be interested to about the results) > > Click on the following link to participate (you don't need to > subscribe to anything or to mention your full name): > http://www.doodle.com/participation.html?pollId=8mwqx63qecyem2np > > Thanks for your participation and for spreading the word! 
> > Cheers, > Pierre From jpaul74 at gmail.com Sat Dec 20 08:32:53 2008 From: jpaul74 at gmail.com (Giampaolo Cuoghi) Date: Sat, 20 Dec 2008 14:32:53 +0100 Subject: [SciPy-user] determinant of a sparse matrix Message-ID: <494CF405.8000105@gmail.com> Hi everybody, I'm using scipy for quantum mechanical computation, in particular for solving sparse linear systems Ax=b. In a particular problem I do not have to solve the linear system Ax=b (with A a sparse matrix) but only compute the determinant of A, and compare it with zero. Is it possible, in scipy, to compute the determinant of a sparse matrix? I've tried to use: linalg.det(A.todense()) but this function uses a lot of memory. Thanks. Giampaolo Cuoghi Physics Department, University of Modena (Italy) From jpaul74 at gmail.com Sat Dec 20 13:03:02 2008 From: jpaul74 at gmail.com (Giampaolo Cuoghi) Date: Sat, 20 Dec 2008 19:03:02 +0100 Subject: [SciPy-user] problem using umfpack.lu() Message-ID: <494D3356.8010409@gmail.com> Hi all, I'm trying to solve the problem of computing the determinant of a sparse matrix, as I've written in the previous mail. I find that the LU decomposition can be useful for this task but when I use umfpack.lu(A) for A a complex sparse matrix, this error appears: "matrix must have float64 values" If A is a real matrix no error appears. Can someone help me? Thanks. Giampaolo Cuoghi Physics Department, University of Modena (Italy) From wnbell at gmail.com Sun Dec 21 07:46:23 2008 From: wnbell at gmail.com (Nathan Bell) Date: Sun, 21 Dec 2008 07:46:23 -0500 Subject: [SciPy-user] problem using umfpack.lu() In-Reply-To: <494D3356.8010409@gmail.com> References: <494D3356.8010409@gmail.com> Message-ID: On Sat, Dec 20, 2008 at 1:03 PM, Giampaolo Cuoghi wrote: > > I'm trying to solve the problem of computing the determinant of a sparse > matrix, as I've written in the previous mail. > > I find that the LU decomposition can be useful for this task but when I > use umfpack.lu(A) for A a complex sparse matrix, this error appears: > > "matrix must have float64 values" > > If A is a real matrix no error appears. > > Can someone help me? > Giampaolo, you should convert A to the float64 type first: umfpack.lu(A.astype('float64')) -- Nathan Bell wnbell at gmail.com http://graphics.cs.uiuc.edu/~wnbell/ From wnbell at gmail.com Sun Dec 21 07:47:33 2008 From: wnbell at gmail.com (Nathan Bell) Date: Sun, 21 Dec 2008 07:47:33 -0500 Subject: [SciPy-user] determinant of a sparse matrix In-Reply-To: <494CF405.8000105@gmail.com> References: <494CF405.8000105@gmail.com> Message-ID: On Sat, Dec 20, 2008 at 8:32 AM, Giampaolo Cuoghi wrote: > Hi everybody, > > I'm using scipy for quantum mechanical computation, in particular for > solving sparse linear systems Ax=b. > > In a particular problem I do not have to solve the linear system Ax=b (with > A a sparse matrix) but only compute the determinant of A, and compare it > with zero. > > Is it possible, in scipy, to compute the determinant of a sparse > matrix? > > I've tried to use: linalg.det(A.todense()) but this function uses a lot > of memory. > There is no direct support for sparse determinants. Your use of the sparse LU decomposition is probably the best approach (see the sketch after this message). -- Nathan Bell wnbell at gmail.com http://graphics.cs.uiuc.edu/~wnbell/
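For reference, the arithmetic behind the LU route: with a factorization A = P L U where L has a unit diagonal, det(A) = det(P) * det(L) * det(U) = +-prod_i U[i,i], since det(L) = 1 and det(P) is just the sign of the permutation. A small dense sketch of the identity (the same product over the diagonal of U applies to the factors of a sparse LU; when only the comparison with zero matters, accumulate sum_i log|U[i,i]| and the sign separately to avoid overflow and underflow):

>>> import numpy as np
>>> from scipy import linalg
>>> A = np.array([[2., 1.], [4., 3.]])
>>> p, l, u = linalg.lu(A)                 # A = P L U, unit diagonal on L
>>> np.prod(np.diag(u)) * linalg.det(p)    # det(P) is +-1
2.0
>>> linalg.det(A)                          # agrees with the direct computation
2.0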
From timmichelsen at gmx-topmail.de Sun Dec 21 10:50:59 2008 From: timmichelsen at gmx-topmail.de (Timmie) Date: Sun, 21 Dec 2008 15:50:59 +0000 (UTC) Subject: [SciPy-user] Save column & row headers to csv file with something like savetxt? References: <494B5653.63BA.009B.0@twdb.state.tx.us> Message-ID: Hello Dharhas, > location,max,min > jdm1,6,2 > jdm2,4.3,1.7 > mcf1,7.7,2.2 > ... > > Is there already a simple way to do this? > > Another i/o question is that I keep getting a warning that fread is deprecated. I am using it to read binary > fortran data, what is the appropriate function to use now? I also looked for something like this but never found it. Then I posted a suggestion for the improvement of http://thread.gmane.org/gmane.comp.python.numeric.general/23418 Although I have been neglecting the use of structured arrays, these would represent a convenient way to save data including headers. If there is a function in scipy/numpy which can save a structured array to a file and write the field keywords as column headers, please tell me. Thanks and regards, Timmie From josef.pktd at gmail.com Sun Dec 21 11:14:10 2008 From: josef.pktd at gmail.com (josef.pktd at gmail.com) Date: Sun, 21 Dec 2008 11:14:10 -0500 Subject: [SciPy-user] Save column & row headers to csv file with something like savetxt? In-Reply-To: References: <494B5653.63BA.009B.0@twdb.state.tx.us> Message-ID: <1cd32cbb0812210814ifbf52ddka2344404942475@mail.gmail.com> On Sun, Dec 21, 2008 at 10:50 AM, Timmie wrote: > Hello Dharhas, > >> location,max,min >> jdm1,6,2 >> jdm2,4.3,1.7 >> mcf1,7.7,2.2 >> ... >> >> Is there already a simple way to do this? >> >> Another i/o question is that I keep getting a warning that fread is > deprecated. I am using it to read binary >> fortran data, what is the appropriate function to use now? > > I also looked for something like this but never found it. > Then I posted a suggestion for the improvement of > http://thread.gmane.org/gmane.comp.python.numeric.general/23418 > > Although I have been neglecting the use of structured arrays, these would > represent a convenient way to save data including headers. > > If there is a function in scipy/numpy which can save a structured array to a > file and write the field keywords as column headers, please tell me. > > Thanks and regards, > Timmie > > _______________________________________________ > SciPy-user mailing list > SciPy-user at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-user > 
On Dec 21, 2008, at 11:14 AM, josef.pktd at gmail.com wrote: > On Sun, Dec 21, 2008 at 10:50 AM, Timmie topmail.de> wrote: >> Hello Dharhas, >> >>> location,max,min >>> jdm1,6,2 >>> jdm2,4.3,1.7 >>> mcf1,7.7,2.2 >>> ... >>> >>> Is there already a simple way to do this? >>> >>> Another i/o question is that I keep getting a warning that fread is >> depreciated. I am using it to read binary >>> fortran data, what is the appropriate function to use now? >> >> I also looked for something like this but never found it. >> Then I posted a suggestion for the improvement of >> http://thread.gmane.org/gmane.comp.python.numeric.general/23418 >> >> Although I have been neglecting the use of structured arrays, these >> would >> represents a convenient way to save data including headers. >> >> If there is a function in scipy/numpy which can save a structured >> array to a >> file and writing the the field keywords as column column headers, >> please tell me. >> >> Thanks and regards, >> Timmie >> >> _______________________________________________ >> SciPy-user mailing list >> SciPy-user at scipy.org >> http://projects.scipy.org/mailman/listinfo/scipy-user >> > > There is also > http://matplotlib.sourceforge.net/api/mlab_api.html#matplotlib.mlab.rec2txt > and > http://matplotlib.sourceforge.net/api/mlab_api.html#matplotlib.mlab.rec2csv > > (I never used them) > Josef > _______________________________________________ > SciPy-user mailing list > SciPy-user at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-user From aisaac at american.edu Sun Dec 21 16:15:16 2008 From: aisaac at american.edu (Alan G Isaac) Date: Sun, 21 Dec 2008 16:15:16 -0500 Subject: [SciPy-user] Save column & row headers to csv file with something like savetxt? In-Reply-To: References: <494B5653.63BA.009B.0@twdb.state.tx.us> <1cd32cbb0812210814ifbf52ddka2344404942475@mail.gmail.com> Message-ID: <494EB1E4.3050702@american.edu> SimpleTable now supports column specific formatting. Example below. fwiw, Alan Isaac >>> mydata = [[11,12],[21,22]] #can be 2d array >>> myheaders = "Column 1", "Column 2" >>> mystubs = "Row 1", "Row 2" >>> tbl = text.SimpleTable(mydata, myheaders, mystubs, title="Title") >>> print tbl Title ======================= Column 1 Column 2 ----------------------- Row 1 11 12 Row 2 21 22 ----------------------- >>> # set column specific data formatting ... tbl = text.SimpleTable(mydata, myheaders, mystubs, ... fmt={'data_fmt':["%3.2f","%d"]}) >>> print tbl ======================= Column 1 Column 2 ----------------------- Row 1 11.00 12 Row 2 21.00 22 ----------------------- >>> print tbl.as_csv() ,"Column 1","Column 2" "Row 1",11.00,12 "Row 2",21.00,22 From jpaul74 at gmail.com Sun Dec 21 17:00:57 2008 From: jpaul74 at gmail.com (Giampaolo Cuoghi) Date: Sun, 21 Dec 2008 22:00:57 +0000 (UTC) Subject: [SciPy-user] problem using umfpack.lu() References: <494D3356.8010409@gmail.com> Message-ID: Nathan Bell gmail.com> writes: > > On Sat, Dec 20, 2008 at 1:03 PM, Giampaolo Cuoghi gmail.com> wrote: > > > > I'm trying to solve the problem of compute the determinant of a sparse > > matrix, as I've written in the previous mail. > > > > I find that the LU decomposition can be useful for this task but when I > > use umfpack.lu(A) for A complex sparse matrix, this error appears: > > > > "matrix must have float64 values" > > > > If A is a real matrix no error appears. > > > > Can someone help me? 
> > > > Giampaolo, you should convert A to the float64 type first: > > umfpack.lu(A.astype('float64')) > Thanks Nathan, but my matrix A is complex so with A.astype('float64') the imaginary part disappears and if I use A.astype('complex64') or A.astype('complex128') the previous error appears: "matrix must have float64 values". It seems that the lu decomposition doesn't work for complex matrices. From ady.sethi at gmail.com Mon Dec 22 00:10:09 2008 From: ady.sethi at gmail.com (Aditya Sethi) Date: Mon, 22 Dec 2008 00:10:09 -0500 Subject: [SciPy-user] Need Help: Linear Programming Message-ID: Hi SciPy community: I have recently started using SciPy and NumPy and have absolutely fallen in love with the vast libraries. It has made my job much easier using Interpolation and Linear Algebra from the SciPy modules. Hopefully in the future, I can contribute to the SciPy community. I have been stuck in finding a linear programming solver via the Simplex method or any other widely used method. I searched further and found OpenOpt in SciKits but am reluctant to use this as it is not under the control of the SciPy community. But my friend assures me that since SciPy has a lot more complex Non-linear programming solvers, there must be something for Linear Programming in there but I am not able to find it. Any help will be greatly appreciated. Thanks for your help. Aditya Sethi From robert.kern at gmail.com Mon Dec 22 00:14:11 2008 From: robert.kern at gmail.com (Robert Kern) Date: Sun, 21 Dec 2008 23:14:11 -0600 Subject: [SciPy-user] Need Help: Linear Programming In-Reply-To: References: Message-ID: <3d375d730812212114n6704dcf7j34ac32e666105d74@mail.gmail.com> On Sun, Dec 21, 2008 at 23:10, Aditya Sethi wrote: > Hi SciPy community: > > I have recently started using SciPy and NumPy and have absolutely fallen in > love with the vast libraries. It has made my job much easier using > Interpolation and Linear Algebra from the SciPy modules. Hopefully in the > future, I can contribute to the SciPy community. > > I have been stuck in finding a linear programming solver via the Simplex > method or any other widely used method. I searched further and found OpenOpt > in SciKits but am reluctant to use this as it is not under the control of > the SciPy community. You shouldn't be reluctant. It's a fine piece of software in its own right. I don't think it actually has an LP solver inside it, though; I believe it uses external solvers. > But my friend assures me that since SciPy has a lot more complex Non-linear > programming solvers, there must be something for Linear Programming in there > but I am not able to find it. No, we have no LP solvers in scipy. -- Robert Kern "I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth." -- Umberto Eco From hoytak at cs.ubc.ca Mon Dec 22 00:29:14 2008 From: hoytak at cs.ubc.ca (Hoyt Koepke) Date: Sun, 21 Dec 2008 21:29:14 -0800 Subject: [SciPy-user] Need Help: Linear Programming In-Reply-To: References: Message-ID: <4db580fd0812212129x2338cb0h8ba90ae6b4dd8b1@mail.gmail.com> Hello Aditya, If all you need is an LP solver, I have had a lot of luck with lp_solve 5.5 (http://lpsolve.sourceforge.net/5.5/). It's a mature package that I use heavily in my research and it works very, very well. It has a python wrapper, which isn't bad and is quite full featured. 
For what I'm doing, though, I've written a (simpler) cython interface
that works directly with numpy arrays and is great to use in conjunction
with numpy/scipy. The documentation is currently rather sparse, but if
you're interested I'll work on that a bit and send it to you. Note that
it's LGPL, not BSD like numpy/scipy.

-- Hoyt

On Sun, Dec 21, 2008 at 9:10 PM, Aditya Sethi wrote:
> Hi SciPy community:
>
> I have recently started using SciPy and NumPy and have absolutely fallen in
> love with the vast libraries. It has made my job much easier using
> Interpolation and Linear Algebra from the SciPy modules. Hopefully in the
> future, I can contribute to the SciPy community.
>
> I have been stuck in finding a linear programming solver via the Simplex
> method or any other widely used method. I searched further and found OpenOpt
> in SciKits but am reluctant to use this as it is not under the control of
> the SciPy community.
>
> But my friend assures me that since SciPy has a lot of more complex
> non-linear programming solvers, there must be something for Linear
> Programming in there, but I am not able to find it.
>
> Any help will be greatly appreciated.
>
> Thanks for your help.
> Aditya Sethi

-- 
++++++++++++++++++++++++++++++++++++++++++++++++
+ Hoyt Koepke
+ University of Washington Department of Statistics
+ http://www.stat.washington.edu/~hoytak/
+ hoytak at gmail.com
++++++++++++++++++++++++++++++++++++++++++

From robert.kern at gmail.com Mon Dec 22 00:31:41 2008
From: robert.kern at gmail.com (Robert Kern)
Date: Sun, 21 Dec 2008 23:31:41 -0600
Subject: [SciPy-user] Need Help: Linear Programming
In-Reply-To: <4db580fd0812212129x2338cb0h8ba90ae6b4dd8b1@mail.gmail.com>
References: <4db580fd0812212129x2338cb0h8ba90ae6b4dd8b1@mail.gmail.com>
Message-ID: <3d375d730812212131i3229e2d7m382a8de19fb12b91@mail.gmail.com>

On Sun, Dec 21, 2008 at 23:29, Hoyt Koepke wrote:
> Hello Aditya,
>
> If all you need is an LP solver, I have had a lot of luck with
> lp_solve 5.5 (http://lpsolve.sourceforge.net/5.5/). It's a mature
> package that I use heavily in my research and it works very, very
> well. It has a python wrapper, which isn't bad and is quite full
> featured. For what I'm doing, though, I've written a (simpler) cython
> interface that works directly with numpy arrays and is great to use in
> conjunction with numpy/scipy. The documentation is currently rather
> sparse, but if you're interested I'll work on that a bit and send it
> to you. Note that it's LGPL, not BSD like numpy/scipy.

I'd love to see your Cython wrapper. Is it public somewhere?

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
-- Umberto Eco

From ady.sethi at gmail.com Mon Dec 22 00:37:48 2008
From: ady.sethi at gmail.com (Aditya Sethi)
Date: Mon, 22 Dec 2008 00:37:48 -0500
Subject: [SciPy-user] Need Help: Linear Programming
In-Reply-To: <4db580fd0812212129x2338cb0h8ba90ae6b4dd8b1@mail.gmail.com>
References: <4db580fd0812212129x2338cb0h8ba90ae6b4dd8b1@mail.gmail.com>
Message-ID: 

I would try using lp_solve 5.5 and get back to you with the results. And I
would definitely love to see the cython wrapper myself. Thanks for the
prompt reply.
On Mon, Dec 22, 2008 at 12:29 AM, Hoyt Koepke wrote:
> Hello Aditya,
>
> If all you need is an LP solver, I have had a lot of luck with
> lp_solve 5.5 (http://lpsolve.sourceforge.net/5.5/). It's a mature
> package that I use heavily in my research and it works very, very
> well. It has a python wrapper, which isn't bad and is quite full
> featured. For what I'm doing, though, I've written a (simpler) cython
> interface that works directly with numpy arrays and is great to use in
> conjunction with numpy/scipy. The documentation is currently rather
> sparse, but if you're interested I'll work on that a bit and send it
> to you. Note that it's LGPL, not BSD like numpy/scipy.
>
> -- Hoyt
>
> On Sun, Dec 21, 2008 at 9:10 PM, Aditya Sethi wrote:
> > Hi SciPy community:
> >
> > I have recently started using SciPy and NumPy and have absolutely fallen in
> > love with the vast libraries.

From ady.sethi at gmail.com Mon Dec 22 00:43:53 2008
From: ady.sethi at gmail.com (Aditya Sethi)
Date: Mon, 22 Dec 2008 00:43:53 -0500
Subject: [SciPy-user] Need Help: Linear Programming
In-Reply-To: <3d375d730812212114n6704dcf7j34ac32e666105d74@mail.gmail.com>
References: <3d375d730812212114n6704dcf7j34ac32e666105d74@mail.gmail.com>
Message-ID: 

Thanks Robert for the prompt reply.
I would use the LP Solver Hoyt mentioned and try the options in OpenOpt.
But it surprises me that there are a number of non-linear optimization
solvers, but no LP solvers. How come?
I am new to SciPy and hope you don't mind my questions, which might seem
naive to someone who has been involved in the development of SciPy.

On Mon, Dec 22, 2008 at 12:14 AM, Robert Kern wrote:
> On Sun, Dec 21, 2008 at 23:10, Aditya Sethi wrote:
> > Hi SciPy community:
> >
> > I have recently started using SciPy and NumPy and have absolutely fallen in
> > love with the vast libraries.
> > It has made my job much easier using
> > Interpolation and Linear Algebra from the SciPy modules. Hopefully in the
> > future, I can contribute to the SciPy community.
> >
> > I have been stuck in finding a linear programming solver via the Simplex
> > method or any other widely used method. I searched further and found OpenOpt
> > in SciKits but am reluctant to use this as it is not under the control of
> > the SciPy community.
>
> You shouldn't be reluctant. It's a fine piece of software in its own
> right. I don't think it actually has an LP solver inside it, though; I
> believe it uses external solvers.
>
> > But my friend assures me that since SciPy has a lot of more complex
> > non-linear programming solvers, there must be something for Linear
> > Programming in there, but I am not able to find it.
>
> No, we have no LP solvers in scipy.
>
> --
> Robert Kern

From robert.kern at gmail.com Mon Dec 22 00:51:16 2008
From: robert.kern at gmail.com (Robert Kern)
Date: Sun, 21 Dec 2008 23:51:16 -0600
Subject: [SciPy-user] Need Help: Linear Programming
References: <3d375d730812212114n6704dcf7j34ac32e666105d74@mail.gmail.com>
Message-ID: <3d375d730812212151u13c1e471y8801176c344d1183@mail.gmail.com>

On Sun, Dec 21, 2008 at 23:43, Aditya Sethi wrote:
> Thanks Robert for the prompt reply.
> I would use the LP Solver Hoyt mentioned and try the options in OpenOpt.
> But it surprises me that there are a number of non-linear optimization
> solvers, but no LP solvers. How come?

LP solvers are much harder to write. Well, "good" implementations of
an LP solver are more complicated than "good" implementations of a
standard nonlinear local optimizer. The comparison isn't really
apples-to-apples, though. A good LP solver can handle much higher
dimensionality than a good nonlinear optimizer will deal with. But
because of this, a typical LP *problem* will also have many more
dimensions, too.

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
-- Umberto Eco

From kamran.husain at aramco.com Mon Dec 22 01:34:46 2008
From: kamran.husain at aramco.com (Husain, Kamran B)
Date: Mon, 22 Dec 2008 09:34:46 +0300
Subject: [SciPy-user] Building on solaris
Message-ID: 

Hi!

Could someone guide me on how to build Scipy (0.7) on a Solaris machine?

Operating system info: 64bit SunOS v5.9 Generic_122300-12 sun4u sparc
SUNW,Sun-Fire-V890

I have python 2.5.2 and numpy 1.3.x installed and running with no problems.
Scipy is a bit difficult to build. The scipy build requires BLAS and fails
with the following error:

--- SNIP ---
Blas (http://www.netlib.org/blas/) libraries not found.
Directories to search for the libraries can be specified in the
numpy/distutils/site.cfg file (section [blas]) or by setting
the BLAS environment variable.
--- END OF SNIP ---

Compiling the blas libraries asked for another package (LAPACK), which
incidentally did not compile either. [I really miss Linux now.]
I don't know whether I am pursuing a red herring at this point and maybe
there is another, cleaner way to do this. Has anyone had any experience
with this configuration?

Sincerely
Kamran Husain
kamran dot husain at aramco dot com

From kamran.husain at aramco.com Mon Dec 22 07:00:28 2008
From: kamran.husain at aramco.com (Husain, Kamran B)
Date: Mon, 22 Dec 2008 15:00:28 +0300
Subject: [SciPy-user] fmincon equivalent in scipy.
Message-ID: 

Does Scipy have an equivalent function to matlab's fmincon function?

I have some constraints in the form of Aeq and Beq, along with a function
f(x) to minimize on boundaries [(xlower,ylower), .. ]
The idea is to redo the work done in matlab as a python executable.

I have tried fmin_cobyla, which soon became unmanageable as the size of X
grew, and it found a minimum other than the one found by matlab.
Then I tried fmin_l_bfgs_b, which, although it gave closer results to
matlab, did not give acceptable results either.

Could someone please tell me if there is a better way to do this in Scipy?

Kamran

From nwagner at iam.uni-stuttgart.de Mon Dec 22 08:42:19 2008
From: nwagner at iam.uni-stuttgart.de (Nils Wagner)
Date: Mon, 22 Dec 2008 14:42:19 +0100
Subject: [SciPy-user] fmincon equivalent in scipy.
Message-ID: 

On Mon, 22 Dec 2008 15:00:28 +0300 "Husain, Kamran B" wrote:
> Does Scipy have an equivalent function to matlab's fmincon function?
>
> I have some constraints in the form of Aeq and Beq, along with a function
> f(x) to minimize on boundaries [(xlower,ylower), ..
> ]
> The idea is to redo the work done in matlab as a python executable.
>
> I have tried fmin_cobyla, which soon became unmanageable as the size of X
> grew, and it found a minimum other than the one found by matlab.
> Then I tried fmin_l_bfgs_b, which, although it gave closer results to
> matlab, did not give acceptable results either.
>
> Could someone please tell me if there is a better way to do this in Scipy?
>
> Kamran

Have you tried openopt ?

Nils

From cimrman3 at ntc.zcu.cz Mon Dec 22 09:10:11 2008
From: cimrman3 at ntc.zcu.cz (Robert Cimrman)
Date: Mon, 22 Dec 2008 15:10:11 +0100
Subject: [SciPy-user] problem using umfpack.lu()
References: <494D3356.8010409@gmail.com>
Message-ID: <494F9FC3.7060309@ntc.zcu.cz>

Giampaolo Cuoghi wrote:
> Nathan Bell gmail.com> writes:
>
>> On Sat, Dec 20, 2008 at 1:03 PM, Giampaolo Cuoghi gmail.com> wrote:
>>> I'm trying to solve the problem of computing the determinant of a
>>> sparse matrix, as I've written in the previous mail.
>>>
>>> I find that the LU decomposition can be useful for this task, but when
>>> I use umfpack.lu(A) for A a complex sparse matrix, this error appears:
>>>
>>> "matrix must have float64 values"
>>>
>>> If A is a real matrix no error appears.
>>>
>>> Can someone help me?
>>
>> Giampaolo, you should convert A to the float64 type first:
>>
>> umfpack.lu(A.astype('float64'))
>
> Thanks Nathan, but my matrix A is complex, so with A.astype('float64') the
> imaginary part disappears, and if I use A.astype('complex64') or
> A.astype('complex128') the previous error appears: "matrix must have
> float64 values". It seems that the lu decomposition doesn't work for
> complex matrices.

Hi Giampaolo,

Umfpack has several families of routines for single and double precisions
and real or complex matrices - these can be set upon construction of the
umfpack context. Try something like:

    _family = {dtype('float64') : 'di',
               dtype('complex128') : 'zi'}

    family = _family[A.dtype]
    umfpack = UmfpackContext(family=family)
    umfpack.numeric(self.A)
    ...

r.

From robfalck at gmail.com Mon Dec 22 09:11:06 2008
From: robfalck at gmail.com (Rob Falck)
Date: Mon, 22 Dec 2008 09:11:06 -0500
Subject: [SciPy-user] fmincon equivalent in scipy.
Message-ID: 

fmin_slsqp provides constrained optimization based on Dieter Kraft's
Sequential Least Squares optimization routine. It's in 0.7 and also
available via OpenOpt.
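For example, a rough sketch of an fmincon-like setup with one linear
equality constraint Aeq*x = beq plus box bounds (untested; the objective
and the names Aeq/beq are invented for the illustration):

import numpy as np
from scipy.optimize import fmin_slsqp

# toy objective: f(x) = (x0 - 1)**2 + (x1 - 2)**2
f = lambda x: (x[0] - 1.0)**2 + (x[1] - 2.0)**2

# linear equality constraint: x0 + x1 == 1, expressed as Aeq*x - beq == 0
Aeq = np.array([[1.0, 1.0]])
beq = np.array([1.0])
f_eqcons = lambda x: np.dot(Aeq, x) - beq

# simple box bounds, like fmincon's lb/ub
bounds = [(0.0, 1.0), (0.0, 1.0)]

x0 = np.array([0.5, 0.5])
xopt = fmin_slsqp(f, x0, f_eqcons=f_eqcons, bounds=bounds, iprint=2)
print xopt   # should converge to [0., 1.]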
On Mon, Dec 22, 2008 at 8:42 AM, Nils Wagner wrote:
> On Mon, 22 Dec 2008 15:00:28 +0300 "Husain, Kamran B" wrote:
> > Does Scipy have an equivalent function to matlab's fmincon function?
> >
> > I have some constraints in the form of Aeq and Beq, along with a
> > function f(x) to minimize on boundaries [(xlower,ylower), .. ]
> > The idea is to redo the work done in matlab as a python executable.
> >
> > I have tried fmin_cobyla, which soon became unmanageable as the size
> > of X grew, and it found a minimum other than the one found by matlab.
> > Then I tried fmin_l_bfgs_b, which, although it gave closer results to
> > matlab, did not give acceptable results either.
> >
> > Could someone please tell me if there is a better way to do this in
> > Scipy?
> >
> > Kamran
>
> Have you tried openopt ?
>
> Nils

-- 
- Rob Falck

From afraser at lanl.gov Mon Dec 22 11:06:08 2008
From: afraser at lanl.gov (Andy Fraser)
Date: Mon, 22 Dec 2008 09:06:08 -0700
Subject: [SciPy-user] determinant of a sparse matrix
In-Reply-To: <494CF405.8000105@gmail.com> (Giampaolo Cuoghi's message of "Sat, 20 Dec 2008 14:32:53 +0100")
References: <494CF405.8000105@gmail.com>
Message-ID: <87myeochzj.fsf@lanl.gov>

>>>>> "GC" == Giampaolo Cuoghi writes:

    GC> In a particular problem I don't have to solve the linear
    GC> system Ax=b (with A a sparse matrix) but only to compute the
    GC> determinant of A, and compare it with zero.

    GC> Is it possible, in scipy, to compute the determinant of a
    GC> sparse matrix?

I use scipy.sparse.linalg.lobpcg to find the smallest eigenvalue of a
sparse matrix. If you only want to know if your determinant is
different from zero, and you know something about the eigenvalues
(for example they are real and not negative), you might be able to get
what you want out of lobpcg.
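Something along these lines (a bare sketch with a made-up symmetric test
matrix; untested, and helper names like sparse.rand may vary by scipy
version):

import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import lobpcg

n = 1000
# a random sparse symmetric matrix, shifted to keep it positive definite
B = sp.rand(n, n, density=0.01, format='csr')
A = (B + B.T) * 0.5 + 10.0 * sp.identity(n)

# one random starting block vector -> one (smallest) eigenvalue
X = np.random.rand(n, 1)
w, v = lobpcg(A, X, largest=False, tol=1e-8)
print w   # smallest eigenvalue > 0 implies det(A) != 0 here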
Andy

From Dharhas.Pothina at twdb.state.tx.us Mon Dec 22 11:50:35 2008
From: Dharhas.Pothina at twdb.state.tx.us (Dharhas Pothina)
Date: Mon, 22 Dec 2008 10:50:35 -0600
Subject: [SciPy-user] Save column & row headers to csv file with something like savetxt?
Message-ID: <494F70FB0200009B00018FEC@GWWEB.twdb.state.tx.us>

Alan,

Your example below is basically what I need to do, but it doesn't seem to
work correctly. Only the first 3 of my 5 columns are being output.

Output using SimpleTable:

,"2003_C142","JulSep03_CS142","AS03_C142"
"SL",2.53231931381,3.94970820455,2.79171928058
"WL",0.664414140959,0.640946023274,0.353328693939
"CS",0.709166978019,0.682718096976,0.367865305274

Output using np.savetxt:

2.53,3.95,2.79,3.16,2.70
0.66,0.64,0.35,0.97,0.24
0.71,0.68,0.37,0.85,0.18

My code is below. Please let me know if I am using SimpleTable incorrectly.

thanks,

- dharhas

Code:
-----------------------------
table = np.zeros((stations.size,3,5))
headers = "2003_C142","JulSep03_CS142","AS03_C142","AS03_C284","AS03_LDY142"
rows = "SL","WL","CS"
for i in range(stations.size):
    site = stations[i]
    # 2003
    table[i,0,0] = np.mean(base.data[:,i,0,0]) - np.mean(SL_2003_142.data[:,i,0,0])
    table[i,1,0] = np.mean(base.data[:,i,0,0]) - np.mean(WL_2003_142.data[:,i,0,0])
    table[i,2,0] = np.mean(base.data[:,i,0,0]) - np.mean(CS_2003_142.data[:,i,0,0])
    # 2003 Jul-Sept
    table[i,0,1] = np.mean(base.data[4337:6545,i,0,0]) - np.mean(SL_2003_142.data[4337:6545,i,0,0])
    table[i,1,1] = np.mean(base.data[4337:6545,i,0,0]) - np.mean(WL_2003_142.data[4337:6545,i,0,0])
    table[i,2,1] = np.mean(base.data[4337:6545,i,0,0]) - np.mean(CS_2003_142.data[4337:6545,i,0,0])
    table[i,0,2] = np.mean(baseAS.data[:,i,0,0]) - np.mean(SL_AS03_142.data[:,i,0,0])
    table[i,1,2] = np.mean(baseAS.data[:,i,0,0]) - np.mean(WL_AS03_142.data[:,i,0,0])
    table[i,2,2] = np.mean(baseAS.data[:,i,0,0]) - np.mean(CS_AS03_142.data[:,i,0,0])
    table[i,0,3] = np.mean(baseAS.data[:,i,0,0]) - np.mean(SL_AS03_284.data[:,i,0,0])
    table[i,1,3] = np.mean(baseAS.data[:,i,0,0]) - np.mean(WL_AS03_284.data[:,i,0,0])
    table[i,2,3] = np.mean(baseAS.data[:,i,0,0]) - np.mean(CS_AS03_284.data[:,i,0,0])
    table[i,0,4] = np.mean(baseAS.data[:,i,0,0]) - np.mean(SL_AS03_LDY.data[:,i,0,0])
    table[i,1,4] = np.mean(baseAS.data[:,i,0,0]) - np.mean(WL_AS03_LDY.data[:,i,0,0])
    table[i,2,4] = np.mean(baseAS.data[:,i,0,0]) - np.mean(CS_AS03_LDY.data[:,i,0,0])
    tbl = text.SimpleTable(table[i,:,:], headers, rows)
    fh = open(site + '_means_table.csv','w')
    fh.write(tbl.as_csv())
    np.savetxt(site + '_means_table.txt', table[i,:,:], delimiter=',', fmt='%4.2f')
----------------------------------

>>> Alan G Isaac 12/21/08 3:15 PM >>>
SimpleTable now supports column specific formatting. Example below.
fwiw, Alan Isaac

>>> mydata = [[11,12],[21,22]]  # can be a 2d array
>>> myheaders = "Column 1", "Column 2"
>>> mystubs = "Row 1", "Row 2"
>>> tbl = text.SimpleTable(mydata, myheaders, mystubs, title="Title")
>>> print tbl
Title
=======================
      Column 1 Column 2
-----------------------
Row 1       11       12
Row 2       21       22
-----------------------
>>> # set column specific data formatting
... tbl = text.SimpleTable(mydata, myheaders, mystubs,
...                        fmt={'data_fmt':["%3.2f","%d"]})
>>> print tbl
=======================
      Column 1 Column 2
-----------------------
Row 1    11.00       12
Row 2    21.00       22
-----------------------
>>> print tbl.as_csv()
,"Column 1","Column 2"
"Row 1",11.00,12
"Row 2",21.00,22

From Dharhas.Pothina at twdb.state.tx.us Mon Dec 22 11:55:34 2008
From: Dharhas.Pothina at twdb.state.tx.us (Dharhas Pothina)
Date: Mon, 22 Dec 2008 10:55:34 -0600
Subject: [SciPy-user] Save column & row headers to csv file with something like savetxt?
Message-ID: <494F72260200009B00018FF0@GWWEB.twdb.state.tx.us>

+1.
I agree with Timmie that the ability to add a simple header, row names and
comment lines at the beginning/end of a file would be a powerful addition
to savetxt(). When dealing with lots of generated data files it is
extremely important to be able to add some basic metadata to the file so
that what it contains is easily understandable and usable.
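A workaround that should do in the meantime is to write the header lines
yourself and hand savetxt() an already-open file handle (an untested
sketch; file and column names invented):

import numpy as np

data = np.random.rand(3, 5)
fh = open('jdm_means.csv', 'w')
fh.write('# generated 2008-12-22 from the 2003 model runs\n')  # comment line
fh.write('location,max,min,mean,std\n')                        # column headers
np.savetxt(fh, data, delimiter=',', fmt='%4.2f')
fh.close()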
- dharhas

>>> Timmie 12/21/08 9:52 AM >>>
Hello Dharhas,

> location,max,min
> jdm1,6,2
> jdm2,4.3,1.7
> mcf1,7.7,2.2
> ...
>
> Is there already a simple way to do this?
>
> Another i/o question is that I keep getting a warning that fread is
> deprecated. I am using it to read binary fortran data; what is the
> appropriate function to use now?

I also looked for something like this but never found it.
Then I posted a suggestion for the improvement of savetxt:
http://thread.gmane.org/gmane.comp.python.numeric.general/23418

Although I have been neglecting the use of structured arrays, these would
represent a convenient way to save data including headers.

If there is a function in scipy/numpy which can save a structured array to
a file, writing the field keywords as column headers, please tell me.

Thanks and regards,
Timmie

From Dharhas.Pothina at twdb.state.tx.us Mon Dec 22 11:59:30 2008
From: Dharhas.Pothina at twdb.state.tx.us (Dharhas Pothina)
Date: Mon, 22 Dec 2008 10:59:30 -0600
Subject: [SciPy-user] Save column & row headers to csv file with something like savetxt?
Message-ID: <494F73130200009B00018FF4@GWWEB.twdb.state.tx.us>

Will you be working on savetxt or rec2csv? If rec2csv, do you have a link
to any documentation on how to use records in matplotlib? I had a look at
the rec2txt and rec2csv links in the documentation but am still not
entirely sure how I would convert my table to records and then save them.

- dharhas

>>> Pierre GM 12/21/08 1:10 PM >>>
All,
You remember how I tried to improve loadtxt by incorporating some elements
of matplotlib.mlab.csv2rec? We could probably do the same thing for
rec2csv, using elements of reportlib. I'll try to work on that in the next
few weeks...

On Dec 21, 2008, at 11:14 AM, josef.pktd at gmail.com wrote:
> On Sun, Dec 21, 2008 at 10:50 AM, Timmie topmail.de> wrote:
>> Hello Dharhas,
>>
>>> location,max,min
>>> jdm1,6,2
>>> jdm2,4.3,1.7
>>> mcf1,7.7,2.2
>>> ...
>>>
>>> Is there already a simple way to do this?
>>>
>>> Another i/o question is that I keep getting a warning that fread is
>>> deprecated. I am using it to read binary fortran data; what is the
>>> appropriate function to use now?
>>
>> I also looked for something like this but never found it.
>> Then I posted a suggestion for the improvement of savetxt:
>> http://thread.gmane.org/gmane.comp.python.numeric.general/23418
>>
>> Although I have been neglecting the use of structured arrays, these
>> would represent a convenient way to save data including headers.
>>
>> If there is a function in scipy/numpy which can save a structured
>> array to a file, writing the field keywords as column headers,
>> please tell me.
>>
>> Thanks and regards,
>> Timmie
>
> There is also
> http://matplotlib.sourceforge.net/api/mlab_api.html#matplotlib.mlab.rec2txt
> and
> http://matplotlib.sourceforge.net/api/mlab_api.html#matplotlib.mlab.rec2csv
>
> (I never used them)
> Josef

From pgmdevlist at gmail.com Mon Dec 22 12:03:47 2008
From: pgmdevlist at gmail.com (Pierre GM)
Date: Mon, 22 Dec 2008 12:03:47 -0500
Subject: [SciPy-user] Save column & row headers to csv file with something like savetxt?
In-Reply-To: <494F73130200009B00018FF4@GWWEB.twdb.state.tx.us>
References: <494F73130200009B00018FF4@GWWEB.twdb.state.tx.us>
Message-ID: <556617EC-2D4D-430A-99B9-415E4EE0385C@gmail.com>

On Dec 22, 2008, at 11:59 AM, Dharhas Pothina wrote:
> Will you be working on savetxt or rec2csv?

Both, following the path of genloadtxt.

> If rec2csv, do you have a link to any documentation on how to use
> records in matplotlib?
> I had a look at the rec2txt and rec2csv links in the documentation but
> am still not entirely sure how I would convert my table to records and
> then save them.

Mmh, no doc in mind at the moment. For the conversion of your table to a
flexible array and back, that depends on your table. Send a simple example
and I'll see what I can do.

From aisaac at american.edu Mon Dec 22 12:27:08 2008
From: aisaac at american.edu (Alan G Isaac)
Date: Mon, 22 Dec 2008 12:27:08 -0500
Subject: [SciPy-user] Save column & row headers to csv file with something like savetxt?
In-Reply-To: <494F70FB0200009B00018FEC@GWWEB.twdb.state.tx.us>
References: <494F70FB0200009B00018FEC@GWWEB.twdb.state.tx.us>
Message-ID: <494FCDEC.8040300@american.edu>

On 12/22/2008 11:50 AM Dharhas Pothina apparently wrote:
> Only the first 3 of my 5 columns are being output.

This is fixed in revision 117. (Thanks!)

To get the formatting you appear to want, add formatting info like this:

tbl = text.SimpleTable(your_data, your_headers, your_stubs,
                       fmt=dict(data_fmt="%4.2f"))

hth,
Alan

From Dharhas.Pothina at twdb.state.tx.us Mon Dec 22 12:36:37 2008
From: Dharhas.Pothina at twdb.state.tx.us (Dharhas Pothina)
Date: Mon, 22 Dec 2008 11:36:37 -0600
Subject: [SciPy-user] Save column & row headers to csv file with something like savetxt?
Message-ID: <494F7BC50200009B00019080@GWWEB.twdb.state.tx.us>

Pierre,

Here is an example table:

import numpy as np
table = np.array([[2.53,3.95,2.79,3.16,2.70],
                  [0.66,0.64,0.35,0.97,0.24],
                  [0.71,0.68,0.37,0.85,0.18]])
headers = ["2003_C142","JulSep03_CS142","AS03_C142","AS03_C284","AS03_LDY142"]
rows = "SL","WL","CS"

What I would like to output is something along the lines of:

,2003_C142,JulSep03_CS142,AS03_C142,AS03_C284,AS03_LDY142
SL,2.53,3.95,2.79,3.16,2.70
WL,0.66,0.64,0.35,0.97,0.24
CS,0.71,0.68,0.37,0.85,0.18

- dharhas

>>> Pierre GM 12/22/08 11:04 AM >>>
On Dec 22, 2008, at 11:59 AM, Dharhas Pothina wrote:
> Will you be working on savetxt or rec2csv?

Both, following the path of genloadtxt.

> If rec2csv, do you have a link to any documentation on how to use
> records in matplotlib?
> > _______________________________________________ > SciPy-user mailing list > SciPy-user at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-user > > _______________________________________________ > SciPy-user mailing list > SciPy-user at scipy.org > http://projects.scipy.org/mailman/listinfo/scipy-user From aisaac at american.edu Mon Dec 22 12:58:09 2008 From: aisaac at american.edu (Alan G Isaac) Date: Mon, 22 Dec 2008 12:58:09 -0500 Subject: [SciPy-user] Save column & row headersto csv file with something like savetxt? In-Reply-To: <494F7C3D0200009B0001908E@GWWEB.twdb.state.tx.us> References: <494F7C3D0200009B0001908E@GWWEB.twdb.state.tx.us> Message-ID: <494FD531.5060601@american.edu> On 12/22/2008 12:38 PM Dharhas Pothina apparently wrote: > How do I not print quotations around each header and row field Well you can alter you formatting info to be: table_fmt = dict( header_fmt = '%s', stub_fmt = '%s',data_fmt="%4.2f") But keeping the quotes around strings conforms to most CSV implementations and is generally safer. Alan PS I added HTML output, fwiw. From Dharhas.Pothina at twdb.state.tx.us Mon Dec 22 14:38:41 2008 From: Dharhas.Pothina at twdb.state.tx.us (Dharhas Pothina) Date: Mon, 22 Dec 2008 13:38:41 -0600 Subject: [SciPy-user] Save column & row headersto csv file withsomething like savetxt? Message-ID: <494F98610200009B0001918C@GWWEB.twdb.state.tx.us> thanks. >>> Alan G Isaac 12/22/08 11:59 AM >>> On 12/22/2008 12:38 PM Dharhas Pothina apparently wrote: > How do I not print quotations around each header and row field Well you can alter you formatting info to be: table_fmt = dict( header_fmt = '%s', stub_fmt = '%s',data_fmt="%4.2f") But keeping the quotes around strings conforms to most CSV implementations and is generally safer. Alan PS I added HTML output, fwiw. _______________________________________________ SciPy-user mailing list SciPy-user at scipy.org http://projects.scipy.org/mailman/listinfo/scipy-user From Dharhas.Pothina at twdb.state.tx.us Mon Dec 22 14:39:16 2008 From: Dharhas.Pothina at twdb.state.tx.us (Dharhas Pothina) Date: Mon, 22 Dec 2008 13:39:16 -0600 Subject: [SciPy-user] Save column & row headers to csvfile withsomethinglike savetxt? Message-ID: <494F98840200009B00019190@GWWEB.twdb.state.tx.us> thanks. I can live withe the extra 'rows'. I'll try this out tomorrow. - dharhas >>> Pierre GM 12/22/08 11:57 AM >>> dtype = [('rows','|S3')] + [(head, float) for head in headers] result = np.array([tuple([rowname]+row.tolist()) for (rowname,row) in zip(rows, table)], dtype=dtype) Sorry, looks like you gonna be stuck w/ an extra 'rows' entry in your header. On Dec 22, 2008, at 12:36 PM, Dharhas Pothina wrote: > Pierre, > > Here is an example table : > > import numpy as np > table = np.array([[2.53,3.95,2.79,3.16,2.70], > [0.66,0.64,0.35,0.97,0.24],[0.71,0.68,0.37,0.85,0.18]]) > headers = > ["2003_C142","JulSep03_CS142","AS03_C142","AS03_C284","AS03_LDY142"] > rows = "SL","WL","CS" > > > What I would like to output is something along the lines of > ,2003_C142, JulSep03_CS142, AS03_C142, AS03_C284 ,AS03_LDY142 > SL ,2.53,3.95,2.79,3.16,2.70 > WL ,0.66,0.64,0.35,0.97,0.24 > CS ,0.71,0.68,0.37,0.85,0.18 > > - dharhas > > >>>> Pierre GM 12/22/08 11:04 AM >>> > > On Dec 22, 2008, at 11:59 AM, Dharhas Pothina wrote: > >> Will you be working on savetxt or rec2csv? > > Both, following the path of genloadtxt. > > >> If rec2csv, do you have alink to any documentation on how to use >> records in matplotlib. 
On Dec 22, 2008, at 12:36 PM, Dharhas Pothina wrote:
> Pierre,
>
> Here is an example table:
>
> import numpy as np
> table = np.array([[2.53,3.95,2.79,3.16,2.70],
>                   [0.66,0.64,0.35,0.97,0.24],
>                   [0.71,0.68,0.37,0.85,0.18]])
> headers = ["2003_C142","JulSep03_CS142","AS03_C142","AS03_C284","AS03_LDY142"]
> rows = "SL","WL","CS"
>
> What I would like to output is something along the lines of:
>
> ,2003_C142,JulSep03_CS142,AS03_C142,AS03_C284,AS03_LDY142
> SL,2.53,3.95,2.79,3.16,2.70
> WL,0.66,0.64,0.35,0.97,0.24
> CS,0.71,0.68,0.37,0.85,0.18
>
> - dharhas

From aisaac at american.edu Mon Dec 22 12:58:09 2008
From: aisaac at american.edu (Alan G Isaac)
Date: Mon, 22 Dec 2008 12:58:09 -0500
Subject: [SciPy-user] Save column & row headers to csv file with something like savetxt?
In-Reply-To: <494F7C3D0200009B0001908E@GWWEB.twdb.state.tx.us>
References: <494F7C3D0200009B0001908E@GWWEB.twdb.state.tx.us>
Message-ID: <494FD531.5060601@american.edu>

On 12/22/2008 12:38 PM Dharhas Pothina apparently wrote:
> How do I not print quotations around each header and row field

Well you can alter your formatting info to be:

table_fmt = dict(header_fmt='%s', stub_fmt='%s', data_fmt="%4.2f")

But keeping the quotes around strings conforms to most CSV
implementations and is generally safer.

Alan

PS I added HTML output, fwiw.

From Dharhas.Pothina at twdb.state.tx.us Mon Dec 22 14:38:41 2008
From: Dharhas.Pothina at twdb.state.tx.us (Dharhas Pothina)
Date: Mon, 22 Dec 2008 13:38:41 -0600
Subject: [SciPy-user] Save column & row headers to csv file with something like savetxt?
Message-ID: <494F98610200009B0001918C@GWWEB.twdb.state.tx.us>

thanks.

>>> Alan G Isaac 12/22/08 11:59 AM >>>
On 12/22/2008 12:38 PM Dharhas Pothina apparently wrote:
> How do I not print quotations around each header and row field

Well you can alter your formatting info to be:

table_fmt = dict(header_fmt='%s', stub_fmt='%s', data_fmt="%4.2f")

But keeping the quotes around strings conforms to most CSV
implementations and is generally safer.

Alan

PS I added HTML output, fwiw.

From Dharhas.Pothina at twdb.state.tx.us Mon Dec 22 14:39:16 2008
From: Dharhas.Pothina at twdb.state.tx.us (Dharhas Pothina)
Date: Mon, 22 Dec 2008 13:39:16 -0600
Subject: [SciPy-user] Save column & row headers to csv file with something like savetxt?
Message-ID: <494F98840200009B00019190@GWWEB.twdb.state.tx.us>

thanks. I can live with the extra 'rows'. I'll try this out tomorrow.

- dharhas

>>> Pierre GM 12/22/08 11:57 AM >>>
dtype = [('rows','|S3')] + [(head, float) for head in headers]
result = np.array([tuple([rowname]+row.tolist())
                   for (rowname,row) in zip(rows, table)], dtype=dtype)

Sorry, looks like you're gonna be stuck w/ an extra 'rows' entry in your
header.

From Dharhas.Pothina at twdb.state.tx.us Mon Dec 22 14:43:50 2008
From: Dharhas.Pothina at twdb.state.tx.us (Dharhas Pothina)
Date: Mon, 22 Dec 2008 13:43:50 -0600
Subject: [SciPy-user] scikits.timeseries : bottom of year cut off plot when saving
Message-ID: <494F99960200009B00019196@GWWEB.twdb.state.tx.us>

Hi,

This is probably a simple fix but I'm not sure how to do it. If you look
at the attached figure, the bottom few pixels of '2003' are cut off when
using savefig to save the figure as a png.

My code is below:

fig = tpl.tsfigure()
fsp = fig.add_tsplot(111)
fsp.tsplot(baseAS[:,i,0,0].convert(freq=freq,func=np.mean), 'r', label='No Siphons')
fsp.tsplot(SL_AS03_142[:,i,0,0].convert(freq=freq,func=np.mean), 'b', label='Star Lake Siphon')
fsp.tsplot(WL_AS03_142[:,i,0,0].convert(freq=freq,func=np.mean), 'g', label='Willow Lake Siphon')
fsp.tsplot(CS_AS03_142[:,i,0,0].convert(freq=freq,func=np.mean), 'c', label='Control Structure Siphon')
fsp.set_ylim(0,35)
fsp.grid()
fsp.set_title(site + ': Siphon Flow = 1.42 m^3/s')
fsp.set_ylabel('Salinity (ppt)')
py.legend()
plotname = 'plots/' + site + '_AS03_Salinity_Scenarios_Const142'
py.savefig(plotname + '_freq-' + freq + '.png', dpi=300, format='png')

- dharhas

Dharhas Pothina
Hydrologist, Bays & Estuaries Section
Surface Water Resources Division
Texas Water Development Board
1700 North Congress Ave.
P.O. Box 13231
Austin, TX 78711-3231

Tel: (512) 936-0818
Fax: (512) 936-0816

dharhas.pothina at twdb.state.tx.us
www.twdb.state.tx.us

[attachment: test.gif, image/gif, 22091 bytes]

From pgmdevlist at gmail.com Mon Dec 22 15:06:39 2008
From: pgmdevlist at gmail.com (Pierre GM)
Date: Mon, 22 Dec 2008 15:06:39 -0500
Subject: [SciPy-user] scikits.timeseries : bottom of year cut off plot when saving
In-Reply-To: <494F99960200009B00019196@GWWEB.twdb.state.tx.us>
References: <494F99960200009B00019196@GWWEB.twdb.state.tx.us>
Message-ID: <41C0263E-F34D-4BD0-A3A7-7D3BAC91D8BD@gmail.com>

Dharhas,
Try to adjust the position of the axes with fig.subplots_adjust.
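For example, before savefig (the exact amount is a guess; tweak it until
the label fits):

fig.subplots_adjust(bottom=0.15)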
On Dec 22, 2008, at 2:43 PM, Dharhas Pothina wrote:
> Hi,
>
> This is probably a simple fix but I'm not sure how to do it. If you
> look at the attached figure, the bottom few pixels of '2003' are cut
> off when using savefig to save the figure as a png.
>
> My code is below:
>
> fig = tpl.tsfigure()
> fsp = fig.add_tsplot(111)
> fsp.tsplot(baseAS[:,i,0,0].convert(freq=freq,func=np.mean), 'r', label='No Siphons')
> fsp.tsplot(SL_AS03_142[:,i,0,0].convert(freq=freq,func=np.mean), 'b', label='Star Lake Siphon')
> fsp.tsplot(WL_AS03_142[:,i,0,0].convert(freq=freq,func=np.mean), 'g', label='Willow Lake Siphon')
> fsp.tsplot(CS_AS03_142[:,i,0,0].convert(freq=freq,func=np.mean), 'c', label='Control Structure Siphon')
> fsp.set_ylim(0,35)
> fsp.grid()
> fsp.set_title(site + ': Siphon Flow = 1.42 m^3/s')
> fsp.set_ylabel('Salinity (ppt)')
> py.legend()
> plotname = 'plots/' + site + '_AS03_Salinity_Scenarios_Const142'
> py.savefig(plotname + '_freq-' + freq + '.png', dpi=300, format='png')
>
> - dharhas

From vanforeest at gmail.com Mon Dec 22 17:48:43 2008
From: vanforeest at gmail.com (nicky van foreest)
Date: Mon, 22 Dec 2008 23:48:43 +0100
Subject: [SciPy-user] Need Help: Linear Programming
In-Reply-To: <3d375d730812212151u13c1e471y8801176c344d1183@mail.gmail.com>
References: <3d375d730812212114n6704dcf7j34ac32e666105d74@mail.gmail.com> <3d375d730812212151u13c1e471y8801176c344d1183@mail.gmail.com>
Message-ID: 

Hi,

I use pulp.py as an interface to produce a file that can be read by XPRESS
(a commercial package) to solve my LP problems. Pulp can also produce
output for GLPK (which is free), CPLEX, and, perhaps, Symphony (which is
also free), but I did not try the latter. Pulp is, at least for my
purposes, really useful. It allows me to specify the cost function and the
constraints in an efficient and readable way.
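For instance, a toy model looks more or less like this (from memory, so
treat it as a sketch and check the pulp documentation for the exact names):

from pulp import LpProblem, LpVariable, LpMinimize

prob = LpProblem("toy", LpMinimize)
x = LpVariable("x", lowBound=0)
y = LpVariable("y", lowBound=0)

prob += 2*x + 3*y        # the cost function
prob += x + y >= 10      # a constraint
prob += x - y <= 2       # another constraint

prob.writeLP("toy.lp")   # hand this file to the external solver
# or, with a solver installed:
# prob.solve()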
bye

Nicky

2008/12/22 Robert Kern :
> On Sun, Dec 21, 2008 at 23:43, Aditya Sethi wrote:
>> Thanks Robert for the prompt reply.
>> I would use the LP Solver Hoyt mentioned and try the options in OpenOpt.
>> But it surprises me that there are a number of non-linear optimization
>> solvers, but no LP solvers. How come?
>
> LP solvers are much harder to write. Well, "good" implementations of
> an LP solver are more complicated than "good" implementations of a
> standard nonlinear local optimizer. The comparison isn't really
> apples-to-apples, though. A good LP solver can handle much higher
> dimensionality than a good nonlinear optimizer will deal with. But
> because of this, a typical LP *problem* will also have many more
> dimensions, too.
>
> --
> Robert Kern

From kamran.husain at aramco.com Mon Dec 22 23:10:40 2008
From: kamran.husain at aramco.com (Husain, Kamran B)
Date: Tue, 23 Dec 2008 07:10:40 +0300
Subject: [SciPy-user] fmincon equivalent in scipy.
Message-ID: 

Thank you. I will try it out today ;-)

Kamran

________________________________
From: scipy-user-bounces at scipy.org [mailto:scipy-user-bounces at scipy.org] On Behalf Of Rob Falck
Sent: Monday, December 22, 2008 5:11 PM
To: SciPy Users List
Subject: Re: [SciPy-user] fmincon equivalent in scipy.

fmin_slsqp provides constrained optimization based on Dieter Kraft's
Sequential Least Squares optimization routine. It's in 0.7 and also
available via OpenOpt.

On Mon, Dec 22, 2008 at 8:42 AM, Nils Wagner wrote:
> On Mon, 22 Dec 2008 15:00:28 +0300 "Husain, Kamran B" wrote:
> > Does Scipy have an equivalent function to matlab's fmincon function?
> >
> > I have some constraints in the form of Aeq and Beq, along with a
> > function f(x) to minimize on boundaries [(xlower,ylower), .. ]
> > The idea is to redo the work done in matlab as a python executable.
> >
> > I have tried fmin_cobyla, which soon became unmanageable as the size
> > of X grew, and it found a minimum other than the one found by matlab.
> > Then I tried fmin_l_bfgs_b, which, although it gave closer results to
> > matlab, did not give acceptable results either.
> >
> > Could someone please tell me if there is a better way to do this in
> > Scipy?
> >
> > Kamran
>
> Have you tried openopt ?
>
> Nils

--
- Rob Falck
From ady.sethi at gmail.com Mon Dec 22 23:31:21 2008
From: ady.sethi at gmail.com (Aditya Sethi)
Date: Mon, 22 Dec 2008 23:31:21 -0500
Subject: [SciPy-user] Need Help: Linear Programming
References: <3d375d730812212114n6704dcf7j34ac32e666105d74@mail.gmail.com> <3d375d730812212151u13c1e471y8801176c344d1183@mail.gmail.com>
Message-ID: 

Hi Nick

I will try pulp.py. Sounds like it does what I need to do. I am currently
using cvxopt and it's working. In addition I will be looking at Hoyt's
cython files. Thanks for your reply!! I will check out the differences
between pulp and cvxopt.

Aditya

On Mon, Dec 22, 2008 at 5:48 PM, nicky van foreest wrote:
> Hi,
>
> I use pulp.py as an interface to produce a file that can be read by
> XPRESS (a commercial package) to solve my LP problems. Pulp can also
> produce output for GLPK (which is free), CPLEX, and, perhaps, Symphony
> (which is also free), but I did not try the latter. Pulp is, at least
> for my purposes, really useful. It allows me to specify the cost
> function and the constraints in an efficient and readable way.
>
> bye
>
> Nicky

From vanforeest at gmail.com Tue Dec 23 09:15:32 2008
From: vanforeest at gmail.com (nicky van foreest)
Date: Tue, 23 Dec 2008 15:15:32 +0100
Subject: [SciPy-user] Need Help: Linear Programming
References: <3d375d730812212114n6704dcf7j34ac32e666105d74@mail.gmail.com> <3d375d730812212151u13c1e471y8801176c344d1183@mail.gmail.com>
Message-ID: 

Hi Aditya,

Please keep me informed. I had pulp working within a few minutes, even
though I am not a good python programmer. Hopefully it works for you too.

bye

Nicky

2008/12/23 Aditya Sethi :
> Hi Nick
>
> I will try pulp.py. Sounds like it does what I need to do. I am currently
> using cvxopt and it's working. In addition I will be looking at Hoyt's
> cython files. Thanks for your reply!! I will check out the differences
> between pulp and cvxopt.
>
> Aditya
>
> On Mon, Dec 22, 2008 at 5:48 PM, nicky van foreest wrote:
>> Hi,
>>
>> I use pulp.py as an interface to produce a file that can be read by
>> XPRESS (a commercial package) to solve my LP problems.
>> Pulp can also produce output for GLPK (which is free), CPLEX, and,
>> perhaps, Symphony (which is also free), but I did not try the latter.
>> Pulp is, at least for my purposes, really useful. It allows me to
>> specify the cost function and the constraints in an efficient and
>> readable way.
>>
>> bye
>>
>> Nicky

From jpaul74 at gmail.com Tue Dec 23 09:40:02 2008
From: jpaul74 at gmail.com (Giampaolo Cuoghi)
Date: Tue, 23 Dec 2008 14:40:02 +0000 (UTC)
Subject: [SciPy-user] problem using umfpack.lu()
References: <494D3356.8010409@gmail.com> <494F9FC3.7060309@ntc.zcu.cz>
Message-ID: 

Robert Cimrman ntc.zcu.cz> writes:
>
> Hi Giampaolo,
>
> Umfpack has several families of routines for single and double precisions
> and real or complex matrices - these can be set upon construction of the
> umfpack context. Try something like:
>
>     _family = {dtype('float64') : 'di',
>                dtype('complex128') : 'zi'}
>
>     family = _family[A.dtype]
>     umfpack = UmfpackContext(family=family)
>     umfpack.numeric(self.A)
>     ...
>
> r.

Thanks a lot, Robert... now it all works well! I was using the "standard
context", which uses float64 as the default; now I have set the family to
'zi' to treat complex matrices.
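For the archive, the complete sequence that works for me now looks roughly
like this (a sketch; the exact import location of the umfpack wrapper may
differ between scipy versions, so adjust to your install):

import numpy as np
import scipy.sparse as sp
from scipy.linsolve.umfpack import UmfpackContext

# a small complex sparse test matrix (csc format, complex128 values)
A = sp.csc_matrix(np.array([[1+2j, 0], [0, 3-1j]], dtype='complex128'))

umfpack = UmfpackContext(family='zi')  # 'zi' = double precision, complex
umfpack.numeric(A)
lu_result = umfpack.lu(A)  # no more "matrix must have float64 values"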
Giampaolo

From cimrman3 at ntc.zcu.cz Tue Dec 23 09:44:25 2008
From: cimrman3 at ntc.zcu.cz (Robert Cimrman)
Date: Tue, 23 Dec 2008 15:44:25 +0100
Subject: [SciPy-user] problem using umfpack.lu()
References: <494D3356.8010409@gmail.com> <494F9FC3.7060309@ntc.zcu.cz>
Message-ID: <4950F949.4050102@ntc.zcu.cz>

Giampaolo Cuoghi wrote:
> Robert Cimrman ntc.zcu.cz> writes:
>> Hi Giampaolo,
>>
>> Umfpack has several families of routines for single and double precisions
>> and real or complex matrices - these can be set upon construction of the
>> umfpack context.
>
> Thanks a lot, Robert... now it all works well! I was using the "standard
> context", which uses float64 as the default; now I have set the family to
> 'zi' to treat complex matrices.

You are welcome, HTH,
r.

From jpaul74 at gmail.com Tue Dec 23 09:44:28 2008
From: jpaul74 at gmail.com (Giampaolo Cuoghi)
Date: Tue, 23 Dec 2008 14:44:28 +0000 (UTC)
Subject: [SciPy-user] determinant of a sparse matrix
References: <494CF405.8000105@gmail.com> <87myeochzj.fsf@lanl.gov>
Message-ID: 

Andy Fraser lanl.gov> writes:
>
> I use scipy.sparse.linalg.lobpcg to find the smallest eigenvalue of a
> sparse matrix. If you only want to know if your determinant is
> different from zero, and you know something about the eigenvalues
> (for example they are real and not negative), you might be able to get
> what you want out of lobpcg.
>
> Andy

Hi Andy,

Is lobpcg implemented only in scipy 0.7? Does it work only for symmetric
real and Hermitian complex matrices? I'm using scipy 0.6 and my matrix is
an unsymmetric complex sparse one.

Giampaolo

From afraser at lanl.gov Tue Dec 23 12:14:02 2008
From: afraser at lanl.gov (Andy Fraser)
Date: Tue, 23 Dec 2008 10:14:02 -0700
Subject: [SciPy-user] determinant of a sparse matrix
In-Reply-To: (Giampaolo Cuoghi's message of "Tue, 23 Dec 2008 14:44:28 +0000 (UTC)")
References: <494CF405.8000105@gmail.com> <87myeochzj.fsf@lanl.gov>
Message-ID: <87skoebyqt.fsf@lanl.gov>

>>>>> "G" == Giampaolo Cuoghi writes:

    G> Is lobpcg implemented only in scipy 0.7? Does it work only for
    G> symmetric real and Hermitian complex matrices? I'm using scipy
    G> 0.6 and my matrix is an unsymmetric complex sparse one.

Then my suggestion will not help. For my application, I want to know if a
proposed covariance function (real and symmetric) is positive definite
and, if not, how close it is. I use lobpcg, which I discovered at

http://docs.scipy.org/doc/scipy/reference/sparse.linalg.html

Since I use Intrepid Ubuntu, I had to build numpy and scipy from svn.

Andy

From contact at pythonxy.com Tue Dec 23 15:54:05 2008
From: contact at pythonxy.com (Pierre Raybaut)
Date: Tue, 23 Dec 2008 21:54:05 +0100
Subject: [SciPy-user] [ Python(x,y) ] New release : 2.1.6
Message-ID: <49514FED.8040100@pythonxy.com>

Hi all,

Release 2.1.6 is now available on http://www.pythonxy.com:
- All-in-One Installer ("Full Edition"),
- Plugin Installer -- to be downloaded with xyweb,
- Update (soon available)

Important new feature: After installing Python(x,y), you may run the
installer again (All-in-One or Plugin Installer) to add/remove plugins.
From jpaul74 at gmail.com Tue Dec 23 09:44:28 2008
From: jpaul74 at gmail.com (Giampaolo Cuoghi)
Date: Tue, 23 Dec 2008 14:44:28 +0000 (UTC)
Subject: [SciPy-user] determinant of a sparse matrix
References: <494CF405.8000105@gmail.com> <87myeochzj.fsf@lanl.gov>
Message-ID:

Andy Fraser lanl.gov> writes:

>
> >>>>> "GC" == Giampaolo Cuoghi gmail.com> writes:
>
>     GC> In a particular problem I don't have to solve the linear system
>     GC> Ax=b (with A=sparse matrix) but only compute the determinant
>     GC> of A, and compare it with zero.
>
>     GC> Is it possible, in scipy, to compute the determinant
>     GC> of a sparse matrix?
>
> I use scipy.sparse.linalg.lobpcg to find the smallest eigenvalue of a
> sparse matrix. If you only want to know if your determinant is
> different from zero, and you know something about the eigenvalues
> (for example they are real and not negative) you might be able to get
> what you want out of lobpcg.
>
> Andy
>

Hi Andy,
Is lobpcg implemented only in scipy 0.7? Does it work only for symmetric
real and Hermitian complex matrices?
I'm using scipy 0.6 and my matrix is an unsymmetric complex sparse one.

Giampaolo

From afraser at lanl.gov Tue Dec 23 12:14:02 2008
From: afraser at lanl.gov (Andy Fraser)
Date: Tue, 23 Dec 2008 10:14:02 -0700
Subject: [SciPy-user] determinant of a sparse matrix
In-Reply-To: (Giampaolo Cuoghi's message of "Tue\, 23 Dec 2008 14\:44\:28 +0000 \(UTC\)")
References: <494CF405.8000105@gmail.com> <87myeochzj.fsf@lanl.gov>
Message-ID: <87skoebyqt.fsf@lanl.gov>

>>>>> "G" == Giampaolo Cuoghi writes:

    G> Andy Fraser lanl.gov> writes:
    >>
    >> >>>>> "GC" == Giampaolo Cuoghi gmail.com> writes:
    >>
    GC> In a particular problem I don't have to solve the linear system
    GC> Ax=b (with A=sparse matrix) but only compute the determinant
    GC> of A, and compare it with zero.
    >>
    GC> Is it possible, in scipy, to compute the determinant
    GC> of a sparse matrix?
    >>
    >> I use scipy.sparse.linalg.lobpcg to find the smallest
    >> eigenvalue of a sparse matrix. If you only want to know if
    >> your determinant is different from zero, and you know something
    >> about the eigenvalues (for example they are real and not
    >> negative) you might be able to get what you want out of lobpcg.
    >>
    >> Andy
    >>

    G> Hi Andy,
    G> Is lobpcg implemented only in scipy 0.7? Does it work only for
    G> symmetric real and Hermitian complex matrices?
    G> I'm using scipy 0.6 and my matrix is an unsymmetric complex
    G> sparse one.

Then my suggestion will not help.

For my application, I want to know if a proposed covariance function
(real and symmetric) is positive definite and if not, how close it is.

I use lobpcg, which I discovered at
http://docs.scipy.org/doc/scipy/reference/sparse.linalg.html

Since I use intrepid ubuntu, I had to build numpy and scipy from svn.

Andy
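Concretely, Andy's positive-definiteness check looks something like the
sketch below. The shifted-Laplacian test matrix, the tolerance and the
iteration cap are invented for illustration, and the keyword names
follow the current scipy.sparse.linalg.lobpcg:

    import numpy as np
    import scipy.sparse as sp
    from scipy.sparse.linalg import lobpcg

    n = 200
    # a real symmetric test matrix: a 1-D Laplacian, shifted so that it
    # is positive definite
    A = sp.diags([-1.0, 2.1, -1.0], [-1, 0, 1], shape=(n, n), format='csr')

    X = np.random.rand(n, 1)  # random initial guess for one eigenvector
    w, V = lobpcg(A, X, largest=False, tol=1e-8, maxiter=500)

    # the smallest eigenvalue: > 0 means positive definite, and its
    # distance from zero says how comfortably so
    print(w[0])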
From contact at pythonxy.com Tue Dec 23 15:54:05 2008
From: contact at pythonxy.com (Pierre Raybaut)
Date: Tue, 23 Dec 2008 21:54:05 +0100
Subject: [SciPy-user] [ Python(x,y) ] New release : 2.1.6
Message-ID: <49514FED.8040100@pythonxy.com>

Hi all,

Release 2.1.6 is now available on http://www.pythonxy.com:
- All-in-One Installer ("Full Edition"),
- Plugin Installer -- to be downloaded with xyweb,
- Update (soon available)

Important new feature: After installing Python(x,y), you may run the
installer again (All-in-One or Plugin Installer) to add/remove plugins.

Changes history

Version 2.1.6 (12-23-2008)

  * Added:
      o pylint 0.15.2 - Logilab code analysis module: analyzes Python
        source code looking for bugs and signs of poor quality
      o NetworkX 0.99 - Creation, manipulation, and study of the
        structure, dynamics, and functions of complex networks
      o PyQwt3D 0.1.6 - 3D plotting library (a set of Python bindings
        for the Qwt3D library)
      o PyOpenGL 3.0.0.8 - Cross-platform Python binding to OpenGL and
        related APIs
      o cvxopt 1.1 - Convex optimization library
      o winpdb 1.4.2 - Enhanced Python debugger, 20 times faster than pdb
      o PyGTK 2.12.1 - GUI toolkit based on GTK+
      o h5py 1.0.1 - General-purpose Python interface to HDF5 files
        (unlike PyTables, h5py provides direct access to the full HDF5
        C library)
      o pydicom 0.9.1 - Pure Python package for working with DICOM files
      o Start Menu shortcut: compile (optimize) installed modules
        (*.py -> *.pyo)
  * Updated:
      o NumPy 1.2.1.1 (added: a lot of documentation)
      o SciPy 0.6.0.3 (added: a lot of documentation)
      o matplotlib 0.98.5.2
      o Enthought Tool Suite 3.1.0
      o Cython 0.10.3
      o IPython 0.9.1.5
      o SymPy 0.6.3
      o PySQlite 2.5.1
      o PyTables 2.1
      o Sphinx 0.5.1
      o xy 1.0.15
      o MinGW 3.4.5.5
      o Pydev 1.4.1
      o StartExplorer 0.4.0 (Eclipse plugin)
      o Notepad++ 5.1.2
      o Console 2.0.141.4
      o The following update is relevant only for a new install of
        Python(x,y) (there is absolutely no need to update your current
        install):
          o Eclipse 3.4.1.2
  * Corrected:
      o Issues 39, 44, 45

Regards,
Pierre Raybaut

From olfa.mraihi at yahoo.fr Wed Dec 24 06:02:13 2008
From: olfa.mraihi at yahoo.fr (olfa mraihi)
Date: Wed, 24 Dec 2008 11:02:13 +0000 (GMT)
Subject: [SciPy-user] Is it possible with scipy?
Message-ID: <570681.79661.qm@web26104.mail.ukl.yahoo.com>

Hello Scipy community,
I want to know if Scipy could deal with symbolic arrays and lists (by
symbolic I mean without specifying the concrete contents of the list or
array).
For example I want to solve a system of equations containing lists and
arrays like this:
solve(x+Sum[A[k],k=i..N]==y+Sum[B[k],k=m..N],
      j-Length[C]==l-Length[D],
      z/(c^i)==t/(c^h),
      u+1==2*v-3*w,
      v==f(f(w)))
(here A and B are arrays; C and D are lists; x,y,z,t,j,l,i,h,u,v,w are
variables that could be of type integer or real, c is a constant and f
is a function.)

Thank you very much.
Yours faithfully,
Olfa MRAIHI

From cohen at lpta.in2p3.fr Wed Dec 24 08:58:28 2008
From: cohen at lpta.in2p3.fr (Cohen-Tanugi Johann)
Date: Wed, 24 Dec 2008 14:58:28 +0100
Subject: [SciPy-user] Is it possible with scipy?
In-Reply-To: <570681.79661.qm@web26104.mail.ukl.yahoo.com>
References: <570681.79661.qm@web26104.mail.ukl.yahoo.com>
Message-ID: <49524004.3060106@lpta.in2p3.fr>

hi,
You should probably give us a more detailed example of what you want to
do, or perhaps simplify your example, but in any case scipy is numerical
software, not symbolic. Have a look at sympy or maxima for symbolic
computation.
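For the fully specified part of your system, sympy can already do
something. A minimal sketch, with the last two equations adapted from
your message (f is given a concrete definition here purely for
illustration, and the symbolic-length sums and list lengths are beyond
what solve() handles):

    from sympy import symbols, Eq, solve

    u, v, w = symbols('u v w')
    f = lambda t: t + 1  # an invented stand-in for your function f

    sol = solve([Eq(u + 1, 2*v - 3*w), Eq(v, f(f(w)))], [u, v])
    print(sol)  # u and v expressed in terms of the free symbol w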
HTH,
johann

olfa mraihi wrote:
> Hello Scipy community,
> I want to know if Scipy could deal with symbolic arrays and lists (by
> symbolic I mean without specifying the concrete contents of the list
> or array).
> For example I want to solve a system of equations containing lists and
> arrays like this:
> solve(x+Sum[A[k],k=i..N]==y+Sum[B[k],k=m..N],
>       j-Length[C]==l-Length[D],
>       z/(c^i)==t/(c^h),
>       u+1==2*v-3*w,
>       v==f(f(w)))
> (here A and B are arrays; C and D are lists; x,y,z,t,j,l,i,h,u,v,w are
> variables that could be of type integer or real, c is a constant and f
> is a function.)
>
> Thank you very much.
> Yours faithfully,
> Olfa MRAIHI
>
>
> ------------------------------------------------------------------------
>
> _______________________________________________
> SciPy-user mailing list
> SciPy-user at scipy.org
> http://projects.scipy.org/mailman/listinfo/scipy-user
>

From lou_boog2000 at yahoo.com Wed Dec 24 11:14:08 2008
From: lou_boog2000 at yahoo.com (Lou Pecora)
Date: Wed, 24 Dec 2008 08:14:08 -0800 (PST)
Subject: [SciPy-user] Is it possible with scipy?
In-Reply-To: <49524004.3060106@lpta.in2p3.fr>
Message-ID: <351288.8146.qm@web34405.mail.mud.yahoo.com>

Check out SAGE. It is a conglomeration of symbolic processing packages
with a Python interface. I haven't used it, but it looks pretty
powerful. Should be easy to install. Google SAGE and Python.

-- Lou Pecora, my views are my own.

From marko.loparic at gmail.com Wed Dec 24 17:23:53 2008
From: marko.loparic at gmail.com (Marko Loparic)
Date: Wed, 24 Dec 2008 23:23:53 +0100
Subject: [SciPy-user] looking for a consultant for the design of an interface for our mathematical models
Message-ID:

Hi,

I am looking for someone who could help us to design (perhaps also to
implement) a user interface (GUI + repository of data) for our
mathematical models.

I work for a company in the energy sector. Currently in our department
we have 5 different mathematical models using different GUIs and Excel
hacks to let users feed in the data and get results. We would like
to have a single, powerful, user-friendly interface for all those
models. We need the help of an experienced and inventive software
designer to help us choose the technology to use and to make the
design (possibly also the implementation) of the tool. Of course we
would like to reuse existing tools whenever possible.

We propose to pay for one or two days of consultancy during which we
will describe our needs and discuss the possible design choices.
Depending on the conclusions we reach, we can work further together.
We are located near Brussels.

Usage of Python is not a requirement, but it is a natural choice since
it is the main language we use.

Any indication of how I can find the right person for the job would be
of great help!

Thank you very much!
Marko

From dwf at cs.toronto.edu Thu Dec 25 01:48:31 2008
From: dwf at cs.toronto.edu (David Warde-Farley)
Date: Thu, 25 Dec 2008 01:48:31 -0500
Subject: [SciPy-user] looking for a consultant for the design of an interface for our mathematical models
In-Reply-To:
References:
Message-ID: <49976A2F-A634-4F55-B831-BFA698105D99@cs.toronto.edu>

Hi,

Have you spoken to Enthought? They are a company deeply involved with
NumPy and SciPy, financing much of their development, and are as such
eminently qualified for this sort of work.

See http://enthought.com/services/software-development.php

David

On 24-Dec-08, at 5:23 PM, Marko Loparic wrote:

> Hi,
>
> I am looking for someone who could help us to design (perhaps also to
> implement) a user interface (GUI + repository of data) for our
> mathematical models.
>
> I work for a company in the energy sector. Currently in our department
> we have 5 different mathematical models using different GUIs and Excel
> hacks to let users feed in the data and get results. We would like
> to have a single, powerful, user-friendly interface for all those
> models. We need the help of an experienced and inventive software
> designer to help us choose the technology to use and to make the
> design (possibly also the implementation) of the tool. Of course we
> would like to reuse existing tools whenever possible.
>
> We propose to pay for one or two days of consultancy during which we
> will describe our needs and discuss the possible design choices.
> Depending on the conclusions we reach, we can work further together.
> We are located near Brussels.
>
> Usage of Python is not a requirement, but it is a natural choice since
> it is the main language we use.
>
> Any indication of how I can find the right person for the job would be
> of great help!
>
> Thank you very much!
> Marko
> _______________________________________________
> SciPy-user mailing list
> SciPy-user at scipy.org
> http://projects.scipy.org/mailman/listinfo/scipy-user

From marko.loparic at gmail.com Thu Dec 25 05:06:29 2008
From: marko.loparic at gmail.com (Marko Loparic)
Date: Thu, 25 Dec 2008 11:06:29 +0100
Subject: [SciPy-user] looking for a consultant for the design of an interface for our mathematical models
In-Reply-To: <49976A2F-A634-4F55-B831-BFA698105D99@cs.toronto.edu>
References: <49976A2F-A634-4F55-B831-BFA698105D99@cs.toronto.edu>
Message-ID:

Hi David,

It looks like a perfect company for us, although they are not in
Europe. I didn't know of them. Thanks a lot!

We will probably be interested in getting more than one opinion, so
other suggestions would also be very helpful.

Thanks!
Marko

2008/12/25 David Warde-Farley :
> Hi,
>
> Have you spoken to Enthought? They are a company deeply involved with
> NumPy and SciPy, financing much of their development, and are as such
> eminently qualified for this sort of work.
>
> See http://enthought.com/services/software-development.php
>
> David
>
> On 24-Dec-08, at 5:23 PM, Marko Loparic wrote:
>
>> Hi,
>>
>> I am looking for someone who could help us to design (perhaps also to
>> implement) a user interface (GUI + repository of data) for our
>> mathematical models.
>>
>> I work for a company in the energy sector. Currently in our department
>> we have 5 different mathematical models using different GUIs and Excel
>> hacks to let users feed in the data and get results. We would like
>> to have a single, powerful, user-friendly interface for all those
>> models. We need the help of an experienced and inventive software
>> designer to help us choose the technology to use and to make the
>> design (possibly also the implementation) of the tool. Of course we
>> would like to reuse existing tools whenever possible.
>>
>> We propose to pay for one or two days of consultancy during which we
>> will describe our needs and discuss the possible design choices.
>> Depending on the conclusions we reach, we can work further together.
>> We are located near Brussels.
>>
>> Usage of Python is not a requirement, but it is a natural choice since
>> it is the main language we use.
>>
>> Any indication of how I can find the right person for the job would be
>> of great help!
>>
>> Thank you very much!
>> Marko

From contact at pythonxy.com Mon Dec 29 09:12:14 2008
From: contact at pythonxy.com (Pierre Raybaut)
Date: Mon, 29 Dec 2008 15:12:14 +0100
Subject: [SciPy-user] [ Python(x,y) ] New release : 2.1.7
Message-ID: <4958DABE.4040300@pythonxy.com>

Hi all,

Release 2.1.7 is now available on http://www.pythonxy.com:
- All-in-One Installer ("Full Edition"),
- Plugin Installer -- to be downloaded with xyweb,
- Update

New important feature (see below): the main installer's 'package manager
mode', introduced with Python(x,y) 2.1.6, now lets you update your
Python(x,y) installation directly with the main installer (or, as
before, using the update installer or by directly installing the
plugins to be updated).

Changes history

Version 2.1.7 (12-29-2008)

  * Added:
      o PyVISA 1.3 - Control all kinds of measurement equipment through
        various buses (GPIB, RS232, USB)
      o Python(x,y) installer - 'package manager mode': now allows the
        user to update installed plugins (and, as before, to
        install/uninstall plugins) - compatible with all 2.1.x releases
        of Python(x,y)
  * Updated:
      o Console 2.0.141.5
      o Winpdb 1.4.2.1
      o pylint 0.15.2.1 - pylint is now disabled by default
  * Corrected:
      o Python(x,y) Plugin Installer: MinGW and SWIG directory
        customization pages were not shown

Regards,
Pierre Raybaut